EECS227C - Homework II
February 18, 2014

This assignment is due at the beginning of class on March 4.

1. Let $f$ be a convex, continuously differentiable function with $L$-Lipschitz derivatives and bounded level sets. That is, suppose that for any $\alpha$, the set $\Omega_\alpha = \{x : f(x) \le \alpha\}$ is bounded. Prove that gradient descent with backtracking line search converges for all such $f$. Provide as tight an upper bound on the rate of convergence as you can.

2. Consider the momentum method
$$x^{(k+1)} = x^{(k)} - \alpha_k \nabla f(x^{(k)}) + \beta_k \left(x^{(k)} - x^{(k-1)}\right),$$
where $\alpha_k$ and $\beta_k$ are chosen to minimize $f$ at each iteration. Assume that $x_0 = x_{-1}$. Show that if $f(x) = \frac{1}{2} x^T A x - b^T x$, then this method is equivalent to the conjugate gradient method.

3. Let $\ell$ and $L$ be constants satisfying $0 < \ell \le L$. Define
$$\alpha := \frac{4}{(\sqrt{L} + \sqrt{\ell})^2}, \qquad \beta := \frac{\sqrt{L} - \sqrt{\ell}}{\sqrt{L} + \sqrt{\ell}},$$
and the matrix
$$T(\lambda) = \begin{bmatrix} 1 + \beta - \alpha\lambda & -\beta \\ 1 & 0 \end{bmatrix}.$$
Assume throughout that $\lambda \in (\ell, L)$.

(a) Show that $T(\lambda)$ has two complex eigenvalues $\gamma_1$ and $\gamma_2$, each with magnitude $\sqrt{\beta}$. (Note that $\det T(\lambda) = \beta$, so a complex-conjugate pair must each have modulus $\sqrt{\beta}$.)

(b) Bound the condition number of the complex similarity transform $S(\lambda)$ which satisfies
$$S(\lambda)^{-1} T(\lambda) S(\lambda) = \begin{bmatrix} \gamma_1 & 0 \\ 0 & \gamma_2 \end{bmatrix}.$$
That is, compute $\max_\lambda \|S(\lambda)\| \, \|S(\lambda)^{-1}\|$.

4. Recall that in the Heavy Ball Method, we tried to force the matrix
$$T = \begin{bmatrix} (1+\beta)I - \alpha A & -\beta I \\ I & 0 \end{bmatrix}$$
to have as small a spectral radius as possible when the eigenvalues of $A$ were in the interval $(\ell, L)$. What is the smallest norm of $T$ possible? That is, compute
$$\min_{\alpha > 0,\ 1 > \beta > 0} \ \max_{\ell I \preceq A \preceq L I} \|T\|.$$
What values of $\alpha$ and $\beta$ achieve the minimum?

5. Let $f$ be a convex, continuously differentiable function with $L$-Lipschitz derivatives. Prove that Nesterov's method converges for all such $f$ when $\alpha_k = 1/L$ and $\beta_k = \frac{k-1}{k+2}$. Provide as tight an upper bound on the rate of convergence as you can.

6. Apply Newton's method with a constant stepsize to minimize the function $f : \mathbb{R}^n \to \mathbb{R}$ given by
$$f(x) := \tfrac{1}{3} \left( \sum_{i=1}^n x_i^2 \right)^{3/2}.$$
Identify the range of stepsizes for which this method converges. Show that for any stepsize within this range, the iterates converge linearly to $x_{\mathrm{opt}} = 0$.
Explain why the method does not converge quadratically to the optimal solution.

7. Derive closed-form expressions for the following quasi-Newton methods:

(a) $H^{(k+1)}$ a diagonal matrix: $H^{(k+1)} = \mathrm{diag}(h_1, \ldots, h_n)$ for all $k$. Minimize the distance to the last iterate in the operator norm subject to $s^{(k)} = H^{(k+1)} y^{(k)}$.

(b) $H^{(k+1)}$ a rank-one update: $H^{(k+1)} = H^{(k)} + v v^T$. Ensure $s^{(k)} = H^{(k+1)} y^{(k)}$.

In both cases, determine whether $H^{(k)}$ is positive definite for all $k$. Explain why or why not.
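For numerical experimentation alongside Problem 1, a minimal sketch of gradient descent with backtracking (Armijo) line search. The sufficient-decrease constant `c`, shrink factor `rho`, and the quadratic test problem below are illustrative assumptions, not part of the assignment:

```python
import numpy as np

def gradient_descent_backtracking(f, grad, x0, c=1e-4, rho=0.5,
                                  t0=1.0, tol=1e-8, max_iter=1000):
    """Gradient descent with Armijo backtracking line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        t = t0
        # Shrink t until the Armijo sufficient-decrease condition holds:
        # f(x - t g) <= f(x) - c * t * ||g||^2.
        while f(x - t * g) > f(x) - c * t * (g @ g):
            t *= rho
        x = x - t * g
    return x

# Illustrative strongly convex quadratic: f(x) = 1/2 x^T A x - b^T x,
# whose minimizer solves A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

x_star = gradient_descent_backtracking(f, grad, np.zeros(2))
```

The backtracking loop always terminates: for small enough $t$, smoothness of $f$ guarantees the Armijo condition with any $c < 1$.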
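To experiment with Problem 6, a minimal sketch of Newton's method with a constant stepsize on $f(x) = \tfrac{1}{3}(\sum_i x_i^2)^{3/2}$; the gradient and Hessian in the docstring follow by direct differentiation, while the stepsize, dimension, and starting point are arbitrary choices for illustration:

```python
import numpy as np

def newton_constant_step(x0, t=1.0, iters=20):
    """Newton's method with constant stepsize t on f(x) = (1/3)*||x||^3.

    For this f: grad f(x) = ||x|| x  and  hess f(x) = ||x|| I + x x^T / ||x||.
    """
    x = np.asarray(x0, dtype=float)
    history = [np.linalg.norm(x)]
    for _ in range(iters):
        r = np.linalg.norm(x)
        if r == 0:
            break
        g = r * x
        H = r * np.eye(len(x)) + np.outer(x, x) / r
        x = x - t * np.linalg.solve(H, g)  # damped Newton update
        history.append(np.linalg.norm(x))
    return x, history

x_final, hist = newton_constant_step([3.0, -4.0], t=1.0)
# Successive ratios ||x_{k+1}|| / ||x_k|| settle at a constant strictly
# between 0 and 1: the hallmark of linear (not quadratic) convergence.
ratios = [b / a for a, b in zip(hist, hist[1:]) if a > 0]
```

Plotting `hist` on a log scale makes the constant contraction factor easy to see, and varying `t` shows where the iteration stops converging.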