APPM 4720/5720 Problem Set 2 Solutions

This assignment is due at the start of class on Wednesday, February 19th. Minimal credit will be given for incomplete solutions or solutions that do not provide details on how the solution is found. Problems marked with a $\star$ are required only for students enrolled in APPM 5720. You may discuss the problems with your classmates, but all work (analysis and code) must be your own.

1. (Watkins 2.2.15) We know that a matrix $A$ is singular if and only if $\det(A) = 0$. We might expect that a very small determinant indicates a poorly conditioned matrix. This turns out not to be true, and you will prove it in this exercise.

(a) Let $\alpha$ be a positive real number and consider the matrix
\[
A = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix}.
\]
Show that for any induced matrix norm $\|A\| = \alpha$, $\|A^{-1}\| = 1/\alpha$, and $\kappa(A) = 1$, while $\det(A) = \alpha^2$.

Solution: Since the determinant of a diagonal matrix is the product of its diagonal entries, it is immediate that $\det(A) = \alpha^2$. Furthermore, since $A = \alpha I$,
\[
\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{x \neq 0} \frac{\|\alpha I x\|}{\|x\|} = \max_{x \neq 0} \frac{\alpha \|x\|}{\|x\|} = \alpha.
\]
Since $A^{-1} = \alpha^{-1} I$, the same argument gives
\[
\|A^{-1}\| = \max_{x \neq 0} \frac{\|A^{-1} x\|}{\|x\|} = \max_{x \neq 0} \frac{\|\alpha^{-1} I x\|}{\|x\|} = \max_{x \neq 0} \frac{\alpha^{-1} \|x\|}{\|x\|} = \frac{1}{\alpha}.
\]
Then
\[
\kappa(A) = \|A\|\,\|A^{-1}\| = \alpha \cdot \frac{1}{\alpha} = 1.
\]

(b) More generally, given any nonsingular matrix $A$, discuss the condition number and determinant of $\alpha A$, where $\alpha$ is any positive real number.

Solution: For positive $\alpha$ and $A \in \mathbb{R}^{n \times n}$ we have $\det(\alpha A) = \alpha^n \det(A)$, so the determinant can be made arbitrarily large (or arbitrarily small) simply by scaling the matrix. On the other hand, for any induced matrix norm we have $\|\alpha A\| = \alpha \|A\|$ and $\|(\alpha A)^{-1}\| = \alpha^{-1} \|A^{-1}\|$, so
\[
\kappa(\alpha A) = \kappa(A),
\]
which shows that multiplying $A$ by a scalar does not affect its condition number. The main take-away is that the determinant of a matrix has very little to do with its conditioning: matrices with large determinants can be perfectly well conditioned, and matrices with small determinants can be terribly ill-conditioned.

2. (Watkins 2.5.9) Let $\hat{x}$ be an approximation to the solution of $Ax = b$, and let $r = b - A\hat{x}$. Define $\delta A \in \mathbb{R}^{n \times n}$ by $\delta A = \alpha r \hat{x}^T$, where $\alpha = \|\hat{x}\|_2^{-2}$.

(a) Show that $\hat{x}$ is the exact solution of $(A + \delta A)\hat{x} = b$.

Solution: We have
\[
(A + \delta A)\hat{x} = \left(A + \alpha r \hat{x}^T\right)\hat{x} = A\hat{x} + r\,\frac{\hat{x}^T \hat{x}}{\hat{x}^T \hat{x}} = A\hat{x} + r = A\hat{x} + b - A\hat{x} = b.
\]

(b) Show that $\|\delta A\|_2 = \|r\|_2 / \|\hat{x}\|_2$ and
\[
\frac{\|\delta A\|_2}{\|A\|_2} = \frac{\|r\|_2}{\|A\|_2 \|\hat{x}\|_2}.
\]
Thus if $\|r\|_2$ is tiny relative to $\|A\|_2 \|\hat{x}\|_2$, then the algorithm (whichever algorithm was used) is backward stable for this problem.

Solution: We have
\[
\|\delta A\|_2 = \max_{\|y\|_2 = 1} \|\delta A\, y\|_2 = \max_{\|y\|_2 = 1} \|\alpha r \hat{x}^T y\|_2 = \max_{\|y\|_2 = 1} \frac{|\hat{x}^T y|}{\hat{x}^T \hat{x}}\,\|r\|_2.
\]
We are free to select $y$ to maximize this quantity. Since this is the Euclidean norm, $|\hat{x}^T y|$ is maximized when $y$ is the unit vector in the same direction as $\hat{x}$, so we pick $y = \hat{x}/\|\hat{x}\|_2$, which gives
\[
\|\delta A\|_2 = \frac{\|\hat{x}\|_2}{\|\hat{x}\|_2^2}\,\|r\|_2 = \frac{\|r\|_2}{\|\hat{x}\|_2}.
\]
Dividing both sides by $\|A\|_2$ then gives
\[
\frac{\|\delta A\|_2}{\|A\|_2} = \frac{\|r\|_2}{\|A\|_2 \|\hat{x}\|_2}.
\]

3. (Watkins 2.5.9) In this exercise you will assess the backward stability of Gaussian elimination by calculating the error in the LU decomposition, $E = LU - A$. Write a MATLAB/OCTAVE/PYTHON program that computes the LU decomposition of $A$ both with and without partial pivoting.

(a) Calculate the LU decomposition of random matrices without pivoting for several choices of $n$ (e.g. $n = 40, 80, 160$), and note $\|L\|_\infty$, $\|U\|_\infty$, and the norm of the backward error $\|E\|_\infty = \|LU - A\|_\infty$. On the same matrices compute the LU decomposition with partial pivoting and calculate the same quantities. Discuss the effectiveness of both methods, in terms of stability, and comment on the effect of partial pivoting.
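Solution: One way to implement the requested program is the following Python/NumPy sketch (the problem permits MATLAB, Octave, or Python; the function names and experiment structure here are illustrative, and scipy.linalg.lu supplies the partially pivoted factorization, which SciPy returns in the form $A = PLU$):

```python
import numpy as np
from scipy.linalg import lu  # partially pivoted LU: returns P, L, U with A = P @ L @ U


def lu_no_pivot(A):
    """Gaussian elimination without pivoting: unit lower triangular L and upper triangular U, A ~ L @ U."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]   # multiplier; huge if the pivot U[k, k] is tiny
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, np.triu(U)


def experiment(n, trials=100, tiny_a11=None):
    """Mean inf-norms of L, U, and the backward error, without and with partial pivoting."""
    stats = np.zeros((2, 3))
    for _ in range(trials):
        A = np.random.randn(n, n)
        if tiny_a11 is not None:
            A[0, 0] = tiny_a11            # force a tiny (1,1) entry for part (b)
        L, U = lu_no_pivot(A)
        stats[0] += [np.linalg.norm(L, np.inf), np.linalg.norm(U, np.inf),
                     np.linalg.norm(L @ U - A, np.inf)]
        P, L, U = lu(A)
        stats[1] += [np.linalg.norm(L, np.inf), np.linalg.norm(U, np.inf),
                     np.linalg.norm(P @ L @ U - A, np.inf)]
    return stats / trials


for n in (40, 80, 160):
    print(n, experiment(n))                   # part (a): random matrices
    print(n, experiment(n, tiny_a11=1e-13))   # part (b): tiny first pivot
```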
The tables below show the mean values of the quantities of interest over 100 random matrices, first for LU without pivoting and then for LU with partial pivoting. Notice that without pivoting the norms of $L$ and $U$ can grow quite large, while with pivoting $\|L\|_\infty$ remains small since all entries of $L$ are at most 1 in magnitude. While the error between $A$ and the computed $LU$ stays fairly small even without pivoting, the error is much smaller with pivoting.

Without pivoting:

  $n$     $\|L\|_\infty$    $\|U\|_\infty$          $\|LU - A\|_\infty$
  40      357.3             5764.1                  $1.32 \times 10^{-12}$
  80      2501.1            44885                   $3.75 \times 10^{-11}$
  160     1315.3            $1.39 \times 10^{5}$    $2.89 \times 10^{-11}$

With partial pivoting:

  $n$     $\|L\|_\infty$    $\|U\|_\infty$    $\|LU - A\|_\infty$
  40      15.3              69.3              $6.62 \times 10^{-15}$
  80      29.1              189.2             $2.09 \times 10^{-14}$
  160     53.3              521.9             $6.25 \times 10^{-14}$

(b) To demonstrate the weakness of the LU decomposition without pivoting, give it a matrix for which one of the pivots is guaranteed to be small. The easiest way to do this is to use matrices whose $(1,1)$ entry is tiny. Repeat the experiments from (a) using matrices for which $a_{11}$ is tiny.

Solution: Here we perform the same tests as above, except this time we set $a_{11} = 1 \times 10^{-13}$ for each matrix. Again the results without and with pivoting are shown in the first and second tables, respectively. Here we see just how badly LU without pivoting can behave on matrices with a tiny pivot: the entries of both $L$ and $U$ become extremely large, and the error in the decomposition is quite bad. With pivoting, however, the entries of $L$ and $U$ remain fairly small and the decomposition is very accurate.

Without pivoting ($a_{11} = 10^{-13}$):

  $n$     $\|L\|_\infty$          $\|U\|_\infty$          $\|LU - A\|_\infty$
  40      $2.41 \times 10^{13}$   $2.38 \times 10^{14}$   0.0512
  80      $2.60 \times 10^{13}$   $5.39 \times 10^{14}$   0.1085
  160     $2.89 \times 10^{13}$   $8.93 \times 10^{14}$   0.2435

With partial pivoting ($a_{11} = 10^{-13}$):

  $n$     $\|L\|_\infty$    $\|U\|_\infty$    $\|LU - A\|_\infty$
  40      15.9              68.9              $6.67 \times 10^{-15}$
  80      28.7              186.2             $2.07 \times 10^{-14}$
  160     52.3              517.8              $6.27 \times 10^{-14}$

4. (Watkins 3.2.46) The following steps lead to a proof of the uniqueness of the QR decomposition.

(a) Suppose $B \in \mathbb{R}^{n \times n}$ is both orthogonal and upper triangular. Prove that $B$ must be a diagonal matrix whose main-diagonal entries are $\pm 1$.

Solution: Writing $B$ in component form and enforcing the conditions that it is orthogonal ($B^T B = I$) and upper triangular gives
\[
\begin{pmatrix}
b_{11} &        &        &        &        \\
b_{12} & b_{22} &        &        &        \\
b_{13} & b_{23} & b_{33} &        &        \\
\vdots & \vdots & \vdots & \ddots &        \\
b_{1n} & b_{2n} & b_{3n} & \cdots & b_{nn}
\end{pmatrix}
\begin{pmatrix}
b_{11} & b_{12} & b_{13} & \cdots & b_{1n} \\
       & b_{22} & b_{23} & \cdots & b_{2n} \\
       &        & b_{33} & \cdots & b_{3n} \\
       &        &        & \ddots & \vdots \\
       &        &        &        & b_{nn}
\end{pmatrix}
=
\begin{pmatrix}
1 &   &        &   \\
  & 1 &        &   \\
  &   & \ddots &   \\
  &   &        & 1
\end{pmatrix}.
\]
Let us form the first row of the product $B^T B$ one entry at a time. Taking the inner product of the first row of $B^T$ with the first column of $B$ gives
\[
b_{11}^2 = 1 \;\Rightarrow\; b_{11} = \pm 1.
\]
Then, taking the inner product of the first row of $B^T$ with the $j$th column of $B$ for $j \geq 2$ gives
\[
b_{11} b_{1j} = 0 \;\Rightarrow\; b_{1j} = 0 \quad \text{for } j = 2, \ldots, n.
\]
Having determined the entries in the first row of $B$ (and the first column of $B^T$), we now have
\[
\begin{pmatrix}
\pm 1  &        &        &        &        \\
0      & b_{22} &        &        &        \\
0      & b_{23} & b_{33} &        &        \\
\vdots & \vdots & \vdots & \ddots &        \\
0      & b_{2n} & b_{3n} & \cdots & b_{nn}
\end{pmatrix}
\begin{pmatrix}
\pm 1 & 0      & 0      & \cdots & 0      \\
      & b_{22} & b_{23} & \cdots & b_{2n} \\
      &        & b_{33} & \cdots & b_{3n} \\
      &        &        & \ddots & \vdots \\
      &        &        &        & b_{nn}
\end{pmatrix}
=
\begin{pmatrix}
1 &   &        &   \\
  & 1 &        &   \\
  &   & \ddots &   \\
  &   &        & 1
\end{pmatrix}.
\]
From this equation it is clear that the upper triangular matrix $B$ is orthogonal if and only if $\hat{B}^T \hat{B} = I_{n-1}$, where $\hat{B} = B_{2:n,2:n}$. We can prove the result by applying the same argument as above recursively to $\hat{B}$.

(b) Suppose $Q_1 R_1 = Q_2 R_2$, where $Q_1$ and $Q_2$ are both orthogonal, and $R_1$ and $R_2$ are both upper triangular and nonsingular. Show that there is a diagonal matrix $D$ with main-diagonal entries $\pm 1$ such that $R_2 = D R_1$ and $Q_1 = Q_2 D$.

Solution: Start from $Q_1 R_1 = Q_2 R_2$. Since $Q_2$ is orthogonal and $R_1$ is nonsingular, we can multiply both sides of the equation on the left by $Q_2^T$ and on the right by $R_1^{-1}$ to obtain
\[
D := Q_2^T Q_1 = R_2 R_1^{-1}.
\]
It is easy to check that the inverse of an upper triangular matrix is upper triangular and that the product of two upper triangular matrices is upper triangular. It is also easy to verify that $Q_2^T Q_1$ is orthogonal. Thus $D$ is both orthogonal and upper triangular, so by part (a), $D$ is diagonal with diagonal entries $\pm 1$. It remains to show that $R_2 = D R_1$ and $Q_1 = Q_2 D$. We have
\[
D R_1 = R_2 R_1^{-1} R_1 = R_2
\quad \text{and} \quad
Q_2 D = Q_2 Q_2^T Q_1 = Q_1,
\]
as desired.

(c) Suppose that $A \in \mathbb{R}^{n \times n}$ is nonsingular and there exist orthogonal $Q \in \mathbb{R}^{n \times n}$ and upper triangular $R \in \mathbb{R}^{n \times n}$ with positive main-diagonal entries such that $A = QR$. Use parts (a) and (b) to prove that this decomposition is unique.

Solution: Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and assume there exist orthogonal $Q \in \mathbb{R}^{n \times n}$ and upper triangular $R \in \mathbb{R}^{n \times n}$ with positive main-diagonal entries such that $A = QR$. Note that since the main-diagonal entries of $R$ are nonzero, $R$ is nonsingular. Assume that $A$ can also be written as $A = \hat{Q}\hat{R}$, where $\hat{Q}$ and $\hat{R}$ satisfy the same requirements. Then $QR = \hat{Q}\hat{R}$, and by part (b) there exists a diagonal matrix $D$ with main-diagonal entries $\pm 1$ such that $\hat{R} = DR$. Since both $R$ and $\hat{R}$ have positive main-diagonal entries, the only such $D$ is the identity matrix. Then
\[
\hat{R} = R \quad \text{and} \quad \hat{Q} = Q,
\]
and the decomposition is unique.
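The uniqueness result is also easy to sanity-check numerically. The following Python/NumPy sketch (illustrative only; it compares NumPy's Householder-based QR against a hand-written classical Gram-Schmidt factorization, after fixing signs so that $R$ has a positive diagonal) factors a random nonsingular matrix two different ways and confirms that the factors agree:

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR (adequate for a well-conditioned sanity check)."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)   # positive by construction
        Q[:, j] = v / R[j, j]
    return Q, R

def positive_diag(Q, R):
    """Flip signs so R has a positive diagonal: QR = (QD)(DR) with D = diag(+-1)."""
    D = np.sign(np.diag(R))
    return Q * D, D[:, None] * R

np.random.seed(0)
A = np.random.randn(6, 6)                  # almost surely nonsingular
Q1, R1 = positive_diag(*np.linalg.qr(A))   # Householder-based QR
Q2, R2 = positive_diag(*gram_schmidt_qr(A))
print(np.allclose(Q1, Q2), np.allclose(R1, R2))   # expect: True True (up to rounding)
```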
5. (Watkins 3.2.65) Some of the most important transforming matrices are rank-one updates of the identity matrix (including Householder reflectors).

(a) Recall that the rank of a matrix is equal to the number of linearly independent columns. Prove that $A \in \mathbb{R}^{m \times m}$ has rank one if and only if there exist nonzero vectors $u, v \in \mathbb{R}^m$ such that $A = u v^T$. To what extent is there flexibility in the choice of $u$ and $v$?

Solution: ($\Leftarrow$) Assume that there exist nonzero vectors $u$ and $v$ in $\mathbb{R}^m$ such that $A = u v^T$. Then the $i$th column of $A$ has the form $\alpha u$ with $\alpha = v_i$. Since every column of $A$ is a scalar multiple of the same nonzero vector, and at least one of these multiples is nonzero (because $v \neq 0$), it must be the case that $\operatorname{rank}(A) = 1$.

($\Rightarrow$) Assume that $\operatorname{rank}(A) = 1$ and let $A_j$ denote the $j$th column of $A$. Since $A$ is rank one, $\dim(\operatorname{span}\{A_j\}) = 1$. We can then construct (non-unique) nonzero vectors $u$ and $v$ such that $A = u v^T$. For instance, assuming for simplicity that $a_{11} \neq 0$, we can let $u = A_1$ and define $v$ by
\[
v_j = a_{1j}/a_{11}, \qquad j = 1, \ldots, m.
\]
We have some flexibility in choosing $u$ and $v$: we can choose $u$ to be any nonzero vector in $\mathcal{R}(A)$, and once $u$ is fixed the entries of $v$ are uniquely determined by $v_j = a_{1j}/u_1$ for $j = 1, \ldots, m$.

(b) A matrix of the form $G = I - u v^T$ is called a rank-one update of the identity, for obvious reasons. Prove that if $G$ is singular, then $Gx = 0$ if and only if $x$ is a multiple of $u$. Prove that $G$ is singular if and only if $v^T u = 1$.

Solution: Let $G = I - u v^T$ be singular. Then there exists some $x \neq 0$ such that $Gx = 0$, and
\[
Gx = 0 \;\Leftrightarrow\; \left(I - u v^T\right)x = 0 \;\Leftrightarrow\; x = u\, v^T x \;\Leftrightarrow\; x = \left(v^T x\right) u = \alpha u,
\]
so every null vector of $G$ is a multiple of $u$. Conversely, once we show below that singularity forces $v^T u = 1$, it follows that $G(\alpha u) = \alpha\left(u - u\, v^T u\right) = 0$, so every multiple of $u$ lies in the null space.

($\Rightarrow$) Let $G = I - u v^T$ be singular. We showed above that any vector in the null space of $G$ must have the form $\alpha u$ with $\alpha \neq 0$. Then
\[
G(\alpha u) = 0 \;\Leftrightarrow\; \alpha u - \alpha u\left(v^T u\right) = 0 \;\Leftrightarrow\; u = u\left(v^T u\right) \;\Leftrightarrow\; v^T u = 1.
\]

($\Leftarrow$) Assume that $v^T u = 1$. Then
\[
Gu = u - u\left(v^T u\right) = u - u = 0,
\]
and since $u \neq 0$, $G$ is singular.
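These facts are easy to confirm numerically. A small Python/NumPy check (illustrative; the vectors are random, and $w$ is rescaled so that $v^T u = 1$ exactly in the singular case) is:

```python
import numpy as np

np.random.seed(1)
m = 5
u = np.random.randn(m)
w = np.random.randn(m)

# Scale w so that v^T u = 1: G should then be singular with null vector u.
v = w / (w @ u)
G = np.eye(m) - np.outer(u, v)
print(np.linalg.matrix_rank(G), np.allclose(G @ u, 0))   # expect: m - 1, True

# A generic v (with v^T u != 1) gives a nonsingular G.
v2 = np.random.randn(m)
G2 = np.eye(m) - np.outer(u, v2)
print(v2 @ u, np.linalg.matrix_rank(G2))                  # expect: a value != 1, and rank m
```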
(c) Show that if $G = I - u v^T$ is nonsingular, then $G^{-1}$ has the form $I - \beta u v^T$. Give a formula for $\beta$.

Solution: We prove this by construction, i.e. by actually finding the desired $\beta$. We require
\[
I = \left(I - u v^T\right)\left(I - \beta u v^T\right) = I - \beta u v^T - u v^T + \beta\, u v^T u v^T = I - \beta u v^T - u v^T + \beta\left(v^T u\right) u v^T.
\]
This implies that
\[
\left(\beta + 1 - \beta\, v^T u\right) u v^T = 0,
\]
and since $u v^T \neq 0$, the scalar coefficient must vanish. Solving $\beta + 1 - \beta\, v^T u = 0$ for $\beta$ gives
\[
\beta = \frac{1}{v^T u - 1}.
\]
We showed in part (b) that if $G$ is nonsingular then $v^T u \neq 1$, so $\beta$ is well defined. Thus $G^{-1} = I - \beta u v^T$ with $\beta$ as given above.

(d) Prove that if $G = I - u v^T$ is orthogonal, then $u$ and $v$ are proportional, i.e. $v = \rho u$ for some nonzero scalar $\rho$. Show that if $\|u\|_2 = 1$, then $\rho = 2$.

Solution: If $G = I - u v^T$ is orthogonal, then $G^T = G^{-1}$. We know the form of $G^{-1}$ from part (c), so
\[
G^T = G^{-1} \;\Leftrightarrow\; I - v u^T = I - \beta u v^T \;\Leftrightarrow\; v u^T = \beta u v^T.
\]
If $G^T = G^{-1}$ then $G^T x = G^{-1} x$ for all $x$. Taking $x = u$ gives
\[
v\left(u^T u\right) = \beta u\left(v^T u\right) \;\Leftrightarrow\; v = \beta\,\frac{v^T u}{u^T u}\, u = \rho u, \qquad \text{where } \rho = \beta\,\frac{v^T u}{u^T u}.
\]
Now assume $\|u\|_2 = 1$. Writing $v = \rho u$, the condition $G^T G = I$ becomes
\[
\left(I - \rho u u^T\right)^T\left(I - \rho u u^T\right) = I \;\Rightarrow\; I - 2\rho\, u u^T + \rho^2 \|u\|_2^2\, u u^T = I.
\]
Since $u u^T \neq 0$ and $\|u\|_2 = 1$, the scalar coefficient must satisfy $\rho^2 - 2\rho = 0$, i.e. $\rho = 0$ or $\rho = 2$. Since $\rho \neq 0$ (otherwise $v = 0$), we conclude that $\rho = 2$.

(e) Show that if $G = I - u v^T$ and $W \in \mathbb{R}^{m \times n}$, then $GW$ can be computed in about $4mn$ flops if the arithmetic is done in an efficient way.

Solution: Notice that we can write $GW = \left(I - u v^T\right)W = W - u\left(v^T W\right)$. We first compute $x^T = v^T W$, which requires taking the inner product of $v$ with each of the $n$ columns of $W$. Each inner product costs approximately $2m$ flops, and there are $n$ of them, for a total of $2mn$ flops. We then form the $m \times n$ outer-product matrix $u x^T$, which costs $mn$ flops. Finally we subtract $W - u x^T$ for another $mn$ flops, for a grand total of $4mn$ flops.
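In code, the efficient ordering from part (e) simply means never forming $G$ explicitly. A short Python/NumPy sketch (illustrative) comparing the naive product, which costs about $2m^2 n$ flops, with the rank-one-update form, which costs about $4mn$ flops, is:

```python
import numpy as np

def apply_G_naive(u, v, W):
    """Form G = I - u v^T explicitly and multiply: about 2*m^2*n flops for the product alone."""
    m = u.shape[0]
    G = np.eye(m) - np.outer(u, v)
    return G @ W

def apply_G_fast(u, v, W):
    """Use GW = W - u (v^T W): roughly 2mn + mn + mn = 4mn flops."""
    x = v @ W                     # row vector v^T W, ~2mn flops
    return W - np.outer(u, x)     # outer product plus subtraction, ~2mn flops

m, n = 200, 300
u, v = np.random.randn(m), np.random.randn(m)
W = np.random.randn(m, n)
print(np.allclose(apply_G_naive(u, v, W), apply_G_fast(u, v, W)))   # expect: True
```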