UNIVERSITY OF LIMERICK (OLLSCOIL LUIMNIGH)
College of Informatics and Electronics

END OF SEMESTER ASSESSMENT PAPER

MODULE CODE: MS4327
MODULE TITLE: Optimisation
SEMESTER: Spring 2007
DURATION OF EXAMINATION: 2 1/2 hours
LECTURER: Dr. J. Kinsella
EXTERNAL EXAMINER: Prof. J. King
PERCENTAGE OF TOTAL MARKS: 80%

INSTRUCTIONS TO CANDIDATES: Answer four questions correctly for full marks (80%). See the Appendix at the end of the paper for some useful results.

1. (a) Given the one-dimensional "line" function

        φ(α) = 2 − 2α + 2α² − (1/2)α³,

       show that the strong Wolfe conditions (Appendix B) hold (approximately) in the interval [0.45, 1]. Take c₁ = 0.1 and c₂ = 0.25. (15 marks)

   (b) The function φ(α) has a local minimum at α = 2/3. Is this consistent with the above result? (1 mark)

   (c) For any line function φ, define

        Φ(α) = φ(α) − (φ(0) + c₁ α φ′(0))

       and show that, when 0 < c₁ < c₂ < 1:
       (i) for any α, Φ(α) ≤ 0 is equivalent to the first Wolfe condition (11a) holding at α;
       (ii) for any α, Φ′(α) = 0 implies that the second (weak) Wolfe condition (11b) holds at α. (2 marks)

   (d) Use the results from (c) to show that if the first Wolfe condition fails for some α₀ > 0 then the interval (0, α₀) contains values of α that satisfy the strong Wolfe conditions. (Assume that φ′(0) < 0.) Illustrate your proof with a sketch graph of Φ. (5 marks)

   (e) Explain briefly the significance of this result for step-size selection algorithms. (1 mark)

2. The Dogleg Trust Region method is described in Appendix C.

   (a) First show that, for a positive definite matrix B, the Cauchy–Schwarz inequality

        (uᵀv)² ≤ (uᵀu)(vᵀv)                                        (1)

       implies that

        (pᵀBp)(pᵀB⁻¹p) ≥ (pᵀp)²   for any vector p.                (2)

       (6 marks)

   (b) Show that if B is positive definite then (you may assume the result from part (a)):
       (i) ‖p̃(τ)‖ is an increasing function of τ, and
       (ii) m(p̃(τ)) is a decreasing function of τ. (10 marks)
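The monotonicity claims in Question 2(b) can be checked numerically. The sketch below is not part of the paper: the positive definite matrix B and gradient g are illustrative assumptions, and the code simply samples the dogleg path p̃(τ) of Appendix C along τ ∈ [0, 2].

```python
import numpy as np

# Illustrative data (assumed, not from the paper): a 2x2 positive
# definite B, a gradient g, and f0 = 0 in the model (13).
B = np.array([[3.0, 1.0], [1.0, 2.0]])
g = np.array([1.0, -2.0])

pU = -(g @ g) / (g @ B @ g) * g   # steepest-descent minimiser, eq. (14)
pB = -np.linalg.solve(B, g)       # full (Newton) step, pB = -B^{-1} g

def p_tilde(tau):
    """Dogleg path (15): straight to pU for tau <= 1, then on to pB."""
    if tau <= 1.0:
        return tau * pU
    return pU + (tau - 1.0) * (pB - pU)

def m(p):
    """Quadratic model (13) with f0 = 0."""
    return g @ p + 0.5 * p @ B @ p

taus = np.linspace(0.0, 2.0, 201)
norms = [np.linalg.norm(p_tilde(t)) for t in taus]
models = [m(p_tilde(t)) for t in taus]

# ||p~(tau)|| should be non-decreasing, m(p~(tau)) non-increasing.
assert all(a <= b + 1e-12 for a, b in zip(norms, norms[1:]))
assert all(a >= b - 1e-12 for a, b in zip(models, models[1:]))
print("||p~|| increases and m decreases along the dogleg path")
```

The two assertions are exactly parts (i) and (ii) of Question 2(b), checked on a grid rather than proved.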
   (c) Explain why the path p̃(τ) intersects the trust-region boundary ‖p‖ = ∆ at exactly one point if ‖p̃(2)‖ ≡ ‖pB‖ ≥ ∆, and nowhere otherwise. (1 mark)

   (d) What value should be selected for p if ‖pB‖ ≤ ∆? (1 mark)

   (e) In the case where the vector p is chosen to be on the boundary, explain carefully how the appropriate value of the parameter τ is chosen. (2 marks)

   (f) Finally, if pU = (0.5, 1)ᵀ, pB = (1.5, 1)ᵀ and ∆ = 1.5, find the appropriate value of τ. (3 marks)

3. The Fletcher–Reeves (FR) version of the non-linear conjugate gradient algorithm (Alg. 1) is given in Appendix D.

   (a) Show that the function h(t) = (2t − 1)/(1 − t) is increasing on 0 ≤ t ≤ 1/2 and conclude that −1 ≤ h(t) < 0 on [0, 1/2). (3 marks)

   (b) Suppose that the algorithm is implemented with a step length αₖ that satisfies the strong Wolfe conditions with 0 < c₂ < 1/2. Use the result of (a) to show that the method generates descent directions pₖ that satisfy the following inequalities:

        −1/(1 − c₂) ≤ pₖᵀgₖ/‖gₖ‖² ≤ (2c₂ − 1)/(1 − c₂),

       for all k = 0, 1, . . . . (14 marks)

   (c) Explain the significance of the result. (1 mark)

   (d) Explain carefully why the method "sticks" if a "bad" value of pₖ (satisfying cos θₖ ≡ −pₖᵀgₖ/(‖pₖ‖ ‖gₖ‖) ≪ 1) is generated at iteration k. (6 marks)

   (e) What is the significance of this result for the method? (1 mark)

4. The BFGS method updates an estimate Hₖ of the Inverse Hessian in the form:

        Hₖ₊₁ = (I − γₖ sₖ yₖᵀ) Hₖ (I − γₖ yₖ sₖᵀ) + γₖ sₖ sₖᵀ.        (3)

   Here sₖ = xₖ₊₁ − xₖ, yₖ = ∇f(xₖ₊₁) − ∇f(xₖ) and γₖ = 1/(sₖᵀyₖ).

   (a) Derive this update rule by solving the problem:

        min over H of ‖H − Hₖ‖                                     (3a)
        subject to H = Hᵀ, H yₖ = sₖ.                              (3b)

       Use the "weighted Frobenius norm" ‖A‖_W ≡ ‖W^(1/2) A W^(1/2)‖_F, where ‖C‖_F² ≡ Σᵢ Σⱼ Cᵢⱼ². The weight W is an unspecified positive definite symmetric matrix satisfying W sₖ = yₖ. (18 marks)

   (b) Explain briefly how you could use the BFGS method to generate search directions.
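As a sketch of one possible implementation (the quadratic test function, the closed-form exact step, and all numerical data here are illustrative assumptions, not part of the paper), the update (3) together with the search direction pₖ = −Hₖ∇fₖ can be coded as:

```python
import numpy as np

def bfgs_update(H, s, y):
    """Inverse-Hessian update (3):
    H+ = (I - gamma s y^T) H (I - gamma y s^T) + gamma s s^T,
    with gamma = 1/(s^T y)."""
    gamma = 1.0 / (s @ y)
    I = np.eye(len(s))
    V = I - gamma * np.outer(s, y)
    return V @ H @ V.T + gamma * np.outer(s, s)

# Illustrative quadratic f(x) = 1/2 x^T A x, so grad f(x) = A x and the
# minimiser is x = 0.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
grad = lambda x: A @ x

x = np.array([1.0, 1.0])
H = np.eye(2)                          # initial inverse-Hessian estimate
for _ in range(50):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:
        break
    p = -H @ g                         # BFGS search direction
    alpha = -(g @ p) / (p @ A @ p)     # exact step for this quadratic;
                                       # in general use a Wolfe line search
    x_new = x + alpha * p
    s, y = x_new - x, grad(x_new) - grad(x)
    if s @ y > 1e-12:                  # curvature condition s^T y > 0
        H = bfgs_update(H, s, y)
    x = x_new

print("minimiser estimate:", x)        # close to (0, 0)
```

The guard s ᵀy > 0 is what keeps each Hₖ positive definite, the point of part (c) below.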
   (c) Show that (when the inverse Hessian approximations are generated using the BFGS algorithm) if Hₖ is positive definite, then the next iterate Hₖ₊₁ is also positive definite. (2 marks)

   (d) Explain the significance of this result. (1 mark)

5. (a) Consider the constrained minimisation problem:

        min over x ∈ R² of f(x) = (x₁ − a)² + (x₂ − b)² + x₁x₂
        subject to 0 ≤ xᵢ ≤ 1, for i = 1, 2.

       (i) Show that the unconstrained minimum of f is at x = (2/3)(2a − b, 2b − a). (3 marks)
       (ii) Write the first-order necessary (KKT) conditions for this problem. (See Appendix E.) (4 marks)
       (iii) When a = 7/4 and b = 5/4, show that the unconstrained minimum of f is at x = (3/2, 1/2), an infeasible point. (2 marks)
       (iv) Show that, because of the location of the unconstrained minimum, either one or two of the constraints x₁ ≤ 1, x₂ ≥ 0 and x₂ ≤ 1 must be active at the optimal point. Explain why the solution must be either (A) at the vertex (1, 1), (B) on the edge x₁ = 1, or (C) at the vertex (1, 0); use a sketch to illustrate your argument. (4 marks)
       (v) Solve the KKT equations to find the solution to the constrained problem for these values of a and b; check the three cases A, B and C separately. (8 marks)

   (b) Consider the problem

        min over x ∈ R² of f(x) = −2x₁ + x₂
        subject to (1 − x₁)³ − x₂ ≥ 0,                              (4)
                   x₂ + 0.25x₁² − 1 ≥ 0.

       (i) Show that the LICQ applies at x* = (0, 1) (the gradients of the active constraints are linearly independent). (3 marks)
       (ii) Show that the KKT conditions are satisfied at x*. (2 marks)

6. Consider the equality-constrained Quadratic Program (Q.P.):

        min over x of q(x) = (1/2) xᵀGx + xᵀd,                     (5a)
        subject to aᵢᵀx = bᵢ, i = 1, . . . , m.                    (5b)
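The LICQ and KKT claims of Question 5(b) can be cross-checked numerically. This sketch (not part of the paper) evaluates the constraint values and gradients at x* = (0, 1) and solves the stationarity condition for the multipliers:

```python
import numpy as np

# Question 5(b): f(x) = -2*x1 + x2 with constraints
#   c1(x) = (1 - x1)**3 - x2      >= 0
#   c2(x) = x2 + 0.25*x1**2 - 1   >= 0
x1, x2 = 0.0, 1.0                                # the point x* = (0, 1)

c1 = (1 - x1) ** 3 - x2
c2 = x2 + 0.25 * x1 ** 2 - 1
assert abs(c1) < 1e-12 and abs(c2) < 1e-12       # both constraints active

grad_f  = np.array([-2.0, 1.0])
grad_c1 = np.array([-3 * (1 - x1) ** 2, -1.0])   # = (-3, -1) at x*
grad_c2 = np.array([0.5 * x1, 1.0])              # = ( 0,  1) at x*

# LICQ: gradients of the active constraints are linearly independent.
Jac = np.column_stack([grad_c1, grad_c2])
assert np.linalg.matrix_rank(Jac) == 2

# KKT stationarity: grad f = lambda1 * grad c1 + lambda2 * grad c2.
lam = np.linalg.solve(Jac, grad_f)
print("multipliers:", lam)                       # (2/3, 5/3), both >= 0
assert np.all(lam >= 0)
```

Non-negative multipliers with both constraints active is precisely the KKT certificate asked for in part (ii).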
   (a) Writing the set of m equality constraints (5b) as the matrix equation Ax − b = 0, show that if x* is a local minimum then the KKT conditions (Appendix E) require that there must be a vector λ* of Lagrange multipliers such that the following system of equations is satisfied:

        [ G  −Aᵀ ] [ x* ]   [ −d ]
        [ A   0  ] [ λ* ] = [  b ]                                  (6)

       (3 marks)

   (b) If we write x* = x₀ + p, where x₀ is any estimate of the solution and p the required step to the solution, show that (6) can be re-written as:

        [ G  Aᵀ ] [ −p ]   [ g ]
        [ A  0  ] [ λ* ] = [ c ]                                    (7)

       where c = Ax₀ − b, g = Gx₀ + d (the gradient of q at x₀) and p = x* − x₀. (1 mark)

   (c) Let A have full row rank and let Z be the n × (n − m) matrix whose columns form a basis for the null space of A (AZ = 0). Show that, if the "reduced-Hessian" matrix ZᵀGZ is positive definite, the KKT matrix

        K = [ G  Aᵀ ]
            [ A  0  ]                                               (8)

       is non-singular. (8 marks)

   (d) Briefly explain the significance of this result. (1 mark)

   (e) Show that this block matrix equation can be reduced to the problem of solving

        AG⁻¹Aᵀ λ* = AG⁻¹g − c                                       (9)

       for λ* and then solving

        −Gp + Aᵀλ* = g                                             (10)

       for p. (2 marks)

   (f) Given q(x) = 2x² + 2xy + 2y² − x + y and the inequality constraint 2x + 3y ≥ 1, solve the QP using the method you derived in (e). (First check whether the unconstrained solution is feasible; if it is, you are finished; if not, the inequality constraint must be active at the optimal point.) (10 marks)

Appendix of Results

A  The Wolfe conditions for the step length α in a line search require that

        f(xₖ + αpₖ) ≤ f(xₖ) + c₁ α pₖᵀg(xₖ),                      (10a)
        pₖᵀg(xₖ + αpₖ) ≥ c₂ pₖᵀg(xₖ),                             (10b)

   where g(x) ≡ ∇f(x) and 0 < c₁ < c₂ < 1. The strong Wolfe conditions replace (10b) by

        |pₖᵀg(xₖ + αpₖ)| ≤ c₂ |pₖᵀg(xₖ)|.                          (11)

B  In terms of a "line" function φ(α) ≡ f(x + αp), the Wolfe conditions for the step length α in a line search require that

        φ(α) ≤ φ(0) + c₁ α φ′(0),                                 (11a)
        φ′(α) ≥ c₂ φ′(0),                                         (11b)

   where 0 < c₁ < c₂ < 1.
   The strong Wolfe conditions replace (11b) by

        |φ′(α)| ≤ c₂ |φ′(0)|.                                      (12)

C  The Trust Region Method is based on the problem:

        min over p ∈ Rⁿ of m(p) ≡ f₀ + gᵀp + (1/2) pᵀBp,
        such that ‖p‖ ≤ ∆,                                         (13)

   where f₀ is a fixed scalar, g a fixed vector in Rⁿ, B a fixed n × n matrix and ∆ a fixed positive scalar. The "dogleg" method finds an approximate solution to (13) by replacing the (unknown) curved trajectory for p*(∆) with a path consisting of two line segments. The first line segment runs from the starting point to the unconstrained minimiser along the steepest descent direction, defined by

        pU = − (gᵀg)/(gᵀBg) g,                                     (14)

   while the second line segment runs from pU to pB ≡ −B⁻¹g. We can define the trajectory as a path p̃(τ) parameterised by τ as follows:

        p̃(τ) = τ pU,                      0 ≤ τ ≤ 1,
        p̃(τ) = pU + (τ − 1)(pB − pU),     1 ≤ τ ≤ 2.               (15)

D  Algorithm 1 (FR-CGM)
   begin
       Given x₀; set r₀ ← ∇f₀, p₀ ← −r₀, k ← 0;
       while rₖ ≠ 0 do
           αₖ ← result of line search along pₖ;
           xₖ₊₁ ← xₖ + αₖ pₖ;
           rₖ₊₁ ← ∇fₖ₊₁;
           β^FRₖ₊₁ ← (rₖ₊₁ᵀ rₖ₊₁)/(rₖᵀ rₖ) ≡ ‖rₖ₊₁‖²/‖rₖ‖²;
           pₖ₊₁ ← −rₖ₊₁ + β^FRₖ₊₁ pₖ;
           k ← k + 1;
       end (while)
   end

E  The first-order KKT necessary conditions for a point x* with optimal multipliers λ* to be a local solution of a constrained minimisation problem are as follows:

   Suppose that x* is a local solution of a constrained minimisation problem and that the LICQ holds at x*. Then there is a Lagrange multiplier vector λ*, with components λᵢ*, i ∈ E ∪ I, such that the following conditions are satisfied at (x*, λ*):

        ∇ₓL(x*, λ*) = 0,                                          (16a)
        cᵢ(x*) = 0,        for all i ∈ E,                         (16b)
        cᵢ(x*) ≥ 0,        for all i ∈ I,                         (16c)
        λᵢ* ≥ 0,           for all i ∈ I,                         (16d)
        λᵢ* cᵢ(x*) = 0,    for all i ∈ E ∪ I.                     (16e)
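Algorithm 1 translates nearly line for line into code. In the sketch below (not part of the paper) the test problem is an assumed convex quadratic, so the exact minimising step is available in closed form and stands in for the strong Wolfe line search the algorithm prescribes:

```python
import numpy as np

# Illustrative test problem: f(x) = 1/2 x^T A x - b^T x, so that
# grad f(x) = A x - b, with A symmetric positive definite.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
grad = lambda x: A @ x - b

x = np.zeros(3)
r = grad(x)                 # r0 <- grad f(x0)
p = -r                      # p0 <- -r0
k = 0
while np.linalg.norm(r) > 1e-10 and k < 100:
    # Exact minimising step for this quadratic; Algorithm 1 uses a
    # (strong Wolfe) line search here in general.
    alpha = -(r @ p) / (p @ A @ p)
    x = x + alpha * p
    r_new = grad(x)
    beta = (r_new @ r_new) / (r @ r)   # Fletcher-Reeves: ||r_{k+1}||^2 / ||r_k||^2
    p = -r_new + beta * p
    r = r_new
    k += 1

print("iterations:", k)     # at most n steps for an n-dimensional quadratic
```

With exact line searches on a quadratic, this reduces to linear conjugate gradients and terminates in at most n iterations, which is why the strong Wolfe bounds of Question 3(b) matter precisely when f is not quadratic.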