
APPM 4720/5720
Problem Set 2 Solutions
This assignment is due at the start of class on Wednesday, February 19th. Minimal credit will be given for
incomplete solutions or solutions that do not provide details on how the solution is found. Problems marked
with a ⋆ are required only for students enrolled in APPM 5720. You may discuss the problems with your
classmates, but all work (analysis and code) must be your own.
1. (Watkins 2.2.15) We know that a matrix A is singular if and only if det (A) = 0. We might expect
that a very small determinant indicates a poorly conditioned matrix. This turns out not to be true,
and you will prove it in this exercise.
(a) Let α be a positive real number and consider the matrix
$$A = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix}.$$
Show that for any induced matrix norm $\|A\| = \alpha$, $\|A^{-1}\| = 1/\alpha$, and $\kappa(A) = 1$, while $\det(A) = \alpha^2$.
Solution:
Since the determinant of a diagonal matrix is the product of its diagonal elements, it is trivial to
see that $\det(A) = \alpha^2$.
Furthermore, we have
$$\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{x \neq 0} \frac{\|\alpha I x\|}{\|x\|} = \max_{x \neq 0} \frac{\alpha\|x\|}{\|x\|} = \alpha.$$
Since $A^{-1} = \alpha^{-1} I$ we have
$$\|A^{-1}\| = \max_{x \neq 0} \frac{\|A^{-1}x\|}{\|x\|} = \max_{x \neq 0} \frac{\|\alpha^{-1} I x\|}{\|x\|} = \max_{x \neq 0} \frac{\alpha^{-1}\|x\|}{\|x\|} = \frac{1}{\alpha}.$$
Then
$$\kappa(A) = \|A\|\,\|A^{-1}\| = \alpha \cdot \frac{1}{\alpha} = 1.$$
(b) More generally, given any nonsingular matrix A, discuss the condition number and determinant
of αA, where α is any positive real number.
Solution: For positive $\alpha$ we have $\det(\alpha A) = \alpha^n \det(A)$, which means we can make the determinant
of a matrix arbitrarily large by multiplying it by a large scalar.
On the other hand, for any induced matrix norm we have
$$\|\alpha A\| = \alpha\|A\| \quad \text{and} \quad \|(\alpha A)^{-1}\| = \alpha^{-1}\|A^{-1}\| \qquad\Rightarrow\qquad \kappa(\alpha A) = \kappa(A),$$
which implies that multiplying $A$ by a scalar does not affect its condition number. The main
take-away from this is that the determinant of a matrix has very little to do with its conditioning. Matrices with large determinants can be perfectly well conditioned, and matrices with small
determinants can be terribly ill-conditioned.
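This observation is easy to confirm numerically. The following short Python snippet (an illustration only; the use of numpy is an assumption, since no particular language is required here) scales a random matrix and prints its determinant and 2-norm condition number:

```python
import numpy as np

# Scaling A changes the determinant (by alpha^n) but leaves the condition number unchanged.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
for alpha in (1.0, 1e3, 1e-3):
    B = alpha * A
    print(f"alpha = {alpha:g}: det = {np.linalg.det(B):.3e}, cond_2 = {np.linalg.cond(B, 2):.3e}")
```

The determinant varies over many orders of magnitude while the condition number stays fixed.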
2. (Watkins 2.5.9) Let $\hat{x}$ be an approximation to the solution of $Ax = b$, and let $r = b - A\hat{x}$. Define
$\delta A \in \mathbb{R}^{n\times n}$ by $\delta A = \alpha r\hat{x}^T$, where $\alpha = \|\hat{x}\|_2^{-2}$.
(a) Show that $\hat{x}$ is the exact solution of $(A + \delta A)\hat{x} = b$.
Solution:
We have
$$(A + \delta A)\hat{x} = \left(A + \alpha r\hat{x}^T\right)\hat{x} = A\hat{x} + r\,\frac{\hat{x}^T\hat{x}}{\hat{x}^T\hat{x}} = A\hat{x} + r = A\hat{x} + b - A\hat{x} = b.$$
(b) Show that $\|\delta A\|_2 = \|r\|_2/\|\hat{x}\|_2$ and
$$\frac{\|\delta A\|_2}{\|A\|_2} = \frac{\|r\|_2}{\|A\|_2\,\|\hat{x}\|_2}.$$
Thus if $\|r\|_2$ is tiny relative to $\|A\|_2\,\|\hat{x}\|_2$, then the algorithm (whichever algorithm was used) is
backward stable for this problem.
Solution:
We have
$$\|\delta A\|_2 = \max_{\|y\|_2 = 1} \|\delta A\, y\|_2 = \max_{\|y\|_2 = 1} \|\alpha r\hat{x}^T y\|_2 = \max_{\|y\|_2 = 1} \frac{|\hat{x}^T y|}{\hat{x}^T\hat{x}}\,\|r\|_2.$$
We are only free to select $y$ to maximize this quantity. Notice that since this is the Euclidean
norm, $\hat{x}^T y$ is maximized when $y$ is a unit vector in the same direction as $\hat{x}$. Thus we
pick $y = \hat{x}/\|\hat{x}\|_2$, which gives
$$\|\delta A\|_2 = \frac{\|\hat{x}\|_2}{\|\hat{x}\|_2^2}\,\|r\|_2 = \frac{\|r\|_2}{\|\hat{x}\|_2}.$$
It is then trivial to see that
$$\frac{\|\delta A\|_2}{\|A\|_2} = \frac{\|r\|_2}{\|A\|_2\,\|\hat{x}\|_2}.$$
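Both claims are easy to verify numerically. The following Python snippet (an illustration only; numpy and the variable names are assumptions) builds $\delta A$ from a perturbed solution and checks parts (a) and (b):

```python
import numpy as np

# Check: with dA = r xhat^T / ||xhat||_2^2, the approximation xhat solves
# (A + dA) xhat = b exactly, and ||dA||_2 = ||r||_2 / ||xhat||_2.
rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
xhat = np.linalg.solve(A, b) + 1e-8 * rng.standard_normal(n)  # "computed" solution with error

r = b - A @ xhat
dA = np.outer(r, xhat) / (xhat @ xhat)

print(np.allclose((A + dA) @ xhat, b))                        # part (a): True
print(np.isclose(np.linalg.norm(dA, 2),
                 np.linalg.norm(r) / np.linalg.norm(xhat)))   # part (b): True
```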
3. (Watkins 2.5.9) In this exercise you will assess the backward stability of Gaussian elimination by
calculating the error in the LU decomposition: $E = LU - A$. Write a MATLAB/OCTAVE/PYTHON
program that computes the LU decomposition of $A$ both with and without partial pivoting.
(a) Calculate the LU decomposition of random matrices without pivoting for several choices of $n$ (e.g.
$n = 40, 80, 160$), and note $\|L\|_\infty$, $\|U\|_\infty$, and the norm of the backward error $\|E\|_\infty = \|LU - A\|_\infty$.
On the same matrices compute the LU decomposition with partial pivoting and calculate the same
quantities. Discuss the effectiveness of both methods in terms of stability, and comment on
the effect of partial pivoting.
Solution:
The following tables show the mean values of the quantities of interest over 100 random matrices.
The first table shows the results for LU with no pivoting; the second shows the
results with partial pivoting. Notice that without pivoting the norms of $L$ and $U$ can grow quite
large, while with pivoting $\|L\|_\infty$ remains small since all entries in $L$ are smaller than or equal
to 1 in magnitude. While the error between $A$ and the computed $LU$ stays fairly small even without pivoting,
the error is much smaller with pivoting.
Without pivoting:

  n     $\|L\|_\infty$    $\|U\|_\infty$         $\|LU - A\|_\infty$
  40    357.3             5764.1                 $1.32 \times 10^{-12}$
  80    2501.1            44885                  $3.75 \times 10^{-11}$
  160   1315.3            $1.39 \times 10^{5}$   $2.89 \times 10^{-11}$

With partial pivoting:

  n     $\|L\|_\infty$    $\|U\|_\infty$    $\|LU - A\|_\infty$
  40    15.3              69.3              $6.62 \times 10^{-15}$
  80    29.1              189.2             $2.09 \times 10^{-14}$
  160   53.3              521.9             $6.25 \times 10^{-14}$
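A sketch of one possible implementation of this experiment in Python is given below. It is an illustration only: the use of numpy and scipy is an assumption (any of the allowed languages would do), the helper names `lu_no_pivot` and `trial` are made up, and with partial pivoting the backward error is measured as $\|PLU - A\|_\infty$.

```python
import numpy as np
from scipy.linalg import lu

def lu_no_pivot(A):
    """Gaussian elimination without pivoting: returns L, U with A = L U (may be unstable)."""
    U = A.copy().astype(float)
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        L[k+1:, k] = U[k+1:, k] / U[k, k]              # multipliers; fails if the pivot is zero
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])  # eliminate below the pivot
    return L, np.triu(U)

def trial(n, tiny_a11=False, trials=100):
    """Average ||L||, ||U||, ||LU - A|| (infinity norms) over random test matrices."""
    norm = lambda M: np.linalg.norm(M, np.inf)
    stats = np.zeros((2, 3))   # row 0: no pivoting, row 1: partial pivoting
    for _ in range(trials):
        A = np.random.randn(n, n)
        if tiny_a11:
            A[0, 0] = 1e-13                      # force a tiny first pivot (part (b))
        L, U = lu_no_pivot(A)
        P, Lp, Up = lu(A)                        # scipy's LU with partial pivoting: A = P L U
        stats[0] += [norm(L),  norm(U),  norm(L @ U - A)]
        stats[1] += [norm(Lp), norm(Up), norm(P @ Lp @ Up - A)]
    return stats / trials

for n in (40, 80, 160):
    print(n, trial(n))
```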
(b) To demonstrate the weakness of the LU decomposition without pivoting, give it a matrix for which
one of the pivots is guaranteed to be small. The easiest way to do this is to use matrices whose
$(1, 1)$ entry is tiny. Repeat the experiments from (a) using matrices for which $a_{11}$ is tiny.
Solution:
Here we perform identical tests as above, except this time we set $a_{11} = 1 \times 10^{-13}$ for each
matrix. Again, the first and second tables show the results without and with pivoting,
respectively. Here we see just how bad LU without pivoting can be on matrices with small diagonal
elements. The entries in both $L$ and $U$ without pivoting are extremely large, and the error in the
decomposition is quite bad. With pivoting, however, the entries of $L$ and $U$ remain fairly
small, and the decomposition is very accurate.
Without pivoting:

  n     $\|L\|_\infty$          $\|U\|_\infty$          $\|LU - A\|_\infty$
  40    $2.41 \times 10^{13}$   $2.38 \times 10^{14}$   0.0512
  80    $2.60 \times 10^{13}$   $5.39 \times 10^{14}$   0.1085
  160   $2.89 \times 10^{13}$   $8.93 \times 10^{14}$   0.2435

With partial pivoting:

  n     $\|L\|_\infty$    $\|U\|_\infty$    $\|LU - A\|_\infty$
  40    15.9              68.9              $6.67 \times 10^{-15}$
  80    28.7              186.2             $2.07 \times 10^{-14}$
  160   52.3              517.8             $6.27 \times 10^{-14}$
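These results can be reproduced with the sketch given after part (a) by enabling its (hypothetical) `tiny_a11` flag, e.g.:

```python
for n in (40, 80, 160):
    print(n, trial(n, tiny_a11=True))   # sets a11 = 1e-13 in every test matrix
```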
4. (Watkins 3.2.46) The following steps lead to a proof of the uniqueness of the QR decomposition.
(a) Suppose $B \in \mathbb{R}^{n\times n}$ is both orthogonal and upper triangular. Prove that $B$ must be a diagonal
matrix whose main-diagonal entries are $\pm 1$.
Solution:
If we write $B$ in component form and enforce the conditions that it is orthogonal and upper
triangular we have
$$\begin{pmatrix}
b_{11} & & & & \\
b_{12} & b_{22} & & & \\
b_{13} & b_{23} & b_{33} & & \\
\vdots & \vdots & \vdots & \ddots & \\
b_{1n} & b_{2n} & b_{3n} & \cdots & b_{nn}
\end{pmatrix}
\begin{pmatrix}
b_{11} & b_{12} & b_{13} & \cdots & b_{1n} \\
 & b_{22} & b_{23} & \cdots & b_{2n} \\
 & & b_{33} & \cdots & b_{3n} \\
 & & & \ddots & \vdots \\
 & & & & b_{nn}
\end{pmatrix}
=
\begin{pmatrix}
1 & & & \\
 & 1 & & \\
 & & \ddots & \\
 & & & 1
\end{pmatrix}.$$
Let's form the first row of the product $B^TB$ one entry at a time. Taking the inner product of the
first row of $B^T$ with the first column of $B$ we have
$$b_{11}^2 = 1 \qquad\Rightarrow\qquad b_{11} = \pm 1.$$
Then, taking the inner product of the first row of $B^T$ with the $j$th column of $B$ for $j \geq 2$ we have
$$b_{11}b_{1j} = 0 \qquad\Rightarrow\qquad b_{1j} = 0 \quad \text{for } j = 2, \ldots, n.$$
Having determined the entries in the first row of $B$ (and the first column of $B^T$) we now have
$$\begin{pmatrix}
\pm 1 & 0 & 0 & \cdots & 0 \\
0 & b_{22} & & & \\
0 & b_{23} & b_{33} & & \\
\vdots & \vdots & \vdots & \ddots & \\
0 & b_{2n} & b_{3n} & \cdots & b_{nn}
\end{pmatrix}
\begin{pmatrix}
\pm 1 & 0 & 0 & \cdots & 0 \\
 & b_{22} & b_{23} & \cdots & b_{2n} \\
 & & b_{33} & \cdots & b_{3n} \\
 & & & \ddots & \vdots \\
 & & & & b_{nn}
\end{pmatrix}
=
\begin{pmatrix}
1 & & & \\
 & 1 & & \\
 & & \ddots & \\
 & & & 1
\end{pmatrix}.$$
From this equation it is clear that the upper triangular matrix $B$ is orthogonal iff $\hat{B}^T\hat{B} = I_{n-1}$, where
$\hat{B} = B_{2:n,2:n}$. We can prove the result by applying the same argument as above recursively to $\hat{B}$.
(b) Suppose $Q_1R_1 = Q_2R_2$, where $Q_1$ and $Q_2$ are both orthogonal, and $R_1$ and $R_2$ are both upper
triangular and nonsingular. Show that there is a diagonal matrix $D$ with main-diagonal entries
$\pm 1$ such that $R_2 = DR_1$ and $Q_1 = Q_2D$.
Solution:
Note that $Q_1R_1 = Q_2R_2$. Since $Q_2$ is orthogonal and $R_1$ is nonsingular, we can multiply both
sides of the equation on the left by $Q_2^T$ and on the right by $R_1^{-1}$ to obtain
$$D := Q_2^TQ_1 = R_2R_1^{-1}.$$
It is easy to check that the inverse of an upper triangular matrix is also upper triangular, and that the
product of two upper triangular matrices is upper triangular. It is also easy to verify that $Q_2^TQ_1$
is orthogonal. Then $D$ is a matrix that is both orthogonal and upper triangular, so by part
(a) $D$ is diagonal with diagonal entries $\pm 1$.
It remains to be shown that the matrices satisfy the relations $R_2 = DR_1$ and $Q_1 = Q_2D$. We have
$$DR_1 = R_2R_1^{-1}R_1 = R_2 \qquad \text{and} \qquad Q_2D = Q_2Q_2^TQ_1 = Q_1,$$
as desired.
(c) Suppose that $A \in \mathbb{R}^{n\times n}$ is nonsingular and there exist orthogonal $Q \in \mathbb{R}^{n\times n}$ and upper triangular
$R \in \mathbb{R}^{n\times n}$ with positive main-diagonal entries such that $A = QR$. Use parts (a) and (b) to prove
that this decomposition is unique.
Solution: Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and assume there exist orthogonal $Q \in \mathbb{R}^{n\times n}$ and upper
triangular $R \in \mathbb{R}^{n\times n}$ with positive main-diagonal entries such that $A = QR$. Note also that since
the main-diagonal entries of $R$ are nonzero, $R$ is nonsingular.
Assume that $A$ can also be written as $A = \hat{Q}\hat{R}$, where $\hat{Q}$ and $\hat{R}$ satisfy the same requirements as
above. Then we have $QR = \hat{Q}\hat{R}$. From part (b) there exists a diagonal matrix $D$ with main-diagonal
entries $\pm 1$ such that $\hat{R} = DR$. If both $R$ and $\hat{R}$ have positive main-diagonal entries then the only
$D$ for which this works is the identity matrix. Then
$$\hat{R} = R \qquad \text{and} \qquad \hat{Q} = Q,$$
and the decomposition is unique.
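This uniqueness result is also what allows a computed QR factorization to be normalized in practice. The following Python snippet (an illustration only; numpy is an assumption) takes the factorization returned by `numpy.linalg.qr`, which does not guarantee a positive diagonal in $R$, and applies the sign matrix $D$ from part (b):

```python
import numpy as np

# Normalize a QR factorization so that diag(R) > 0; by part (c) the result is unique.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))            # random matrix (nonsingular with probability 1)

Q, R = np.linalg.qr(A)                     # signs of diag(R) are not guaranteed
D = np.diag(np.sign(np.diag(R)))           # diagonal matrix with entries +/-1 (part (b))
Q, R = Q @ D, D @ R                        # since D^2 = I, A = Q R still holds

print(np.allclose(A, Q @ R), np.all(np.diag(R) > 0))   # True True
```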
5. (Watkins 3.2.65) Some of the most important transforming matrices are rank-one updates of the
identity matrix (including Householder reflectors).
(a) Recall that the rank of a matrix is equal to the number of linearly independent columns. Prove
that $A \in \mathbb{R}^{m\times m}$ has rank one if and only if there exist nonzero vectors $u, v \in \mathbb{R}^m$ such that
$A = uv^T$. To what extent is there flexibility in the choice of $u$ and $v$?
Solution:
($\Leftarrow$) Assume that there exist nonzero vectors $u$ and $v$ in $\mathbb{R}^m$ such that $A = uv^T$. Then the $j$th column
of $A$ has the form $v_ju$. Since each column of $A$ can be written as a scalar
multiple of the same vector, it must be the case that $\operatorname{rank}(A) = 1$.
($\Rightarrow$) Assume that $\operatorname{rank}(A) = 1$ and let $A_j$ denote the $j$th column of $A$. Since $A$ is rank 1 it must
be the case that $\dim(\operatorname{span}\{A_1, \ldots, A_m\}) = 1$. We can easily construct (nonunique) nonzero vectors $u$ and
$v$ such that $A = uv^T$. For instance, if $a_{11} \neq 0$ we can let $u = A_1$ and define $v$ by
$$v_j = a_{1j}/a_{11} \qquad \text{for } j = 1, \ldots, m$$
(if $a_{11} = 0$, the same construction works with any nonzero entry of $A$ as the reference).
We have some small amount of flexibility in choosing $u$ and $v$. We can choose $u$ to be any nonzero vector
in $\mathcal{R}(A)$. Once $u$ is fixed, the entries of $v$ are uniquely determined, e.g. by
$$v_j = a_{ij}/u_i \qquad \text{for } j = 1, \ldots, m,$$
using any index $i$ with $u_i \neq 0$.
(b) A matrix of the form $G = I - uv^T$ is called a rank-one update of the identity for obvious reasons.
Prove that if $G$ is singular, then $Gx = 0$ if and only if $x$ is a multiple of $u$. Prove that $G$ is singular
if and only if $v^Tu = 1$.
Solution:
Let $G = I - uv^T$ be singular. Then there exists some $x \neq 0$ such that $Gx = 0$, and we have
$$Gx = 0 \;\Leftrightarrow\; (I - uv^T)x = 0 \;\Leftrightarrow\; x = uv^Tx \;\Leftrightarrow\; x = (v^Tx)\,u = \alpha u,$$
so every vector in the null space of $G$ is a multiple of $u$.
($\Rightarrow$) Let $G = I - uv^T$ be singular. We showed above that any vector in the null space of $G$ must
have the form $\alpha u$ with $\alpha \neq 0$. Then we have
$$G(\alpha u) = 0 \;\Leftrightarrow\; \alpha u - \alpha u(v^Tu) = 0 \;\Leftrightarrow\; u = u(v^Tu) \;\Leftrightarrow\; v^Tu = 1,$$
where the last step uses $u \neq 0$.
($\Leftarrow$) Assume that $v^Tu = 1$. Then we have
$$Gu = u - u(v^Tu) = u - u = 0,$$
and $G$ is singular.
(c) Show that if $G = I - uv^T$ is nonsingular, then $G^{-1}$ has the form $I - \beta uv^T$. Give a formula for $\beta$.
Solution:
This one we'll prove by construction (by actually finding the desired $\beta$). We have
$$I = \left(I - uv^T\right)\left(I - \beta uv^T\right) = I - \beta uv^T - uv^T + \beta uv^Tuv^T = I - \beta uv^T - uv^T + \beta\left(v^Tu\right)uv^T.$$
This implies that
$$\left(\beta + 1 - \beta\,v^Tu\right)uv^T = 0.$$
Solving for the $\beta$ that makes the scalar coefficient zero we find
$$\beta = \frac{1}{v^Tu - 1}.$$
We showed in part (b) that if $G$ is nonsingular then $v^Tu \neq 1$, so $\beta$ is well-defined. Thus $G^{-1} =
I - \beta uv^T$ where $\beta$ is as given above.
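The formula is easy to sanity-check numerically. The following Python snippet (an illustration only; numpy is an assumption) builds a random rank-one update of the identity and verifies that $I - \beta uv^T$ with $\beta = 1/(v^Tu - 1)$ inverts it:

```python
import numpy as np

# Verify G^{-1} = I - beta * u v^T with beta = 1/(v^T u - 1).
rng = np.random.default_rng(1)
m = 6
u = rng.standard_normal(m)
v = rng.standard_normal(m)
assert not np.isclose(v @ u, 1.0)          # G is nonsingular iff v^T u != 1 (part (b))

G = np.eye(m) - np.outer(u, v)
beta = 1.0 / (v @ u - 1.0)
G_inv = np.eye(m) - beta * np.outer(u, v)

print(np.allclose(G @ G_inv, np.eye(m)))   # True
```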
(d) Prove that if $G = I - uv^T$ is orthogonal, then $u$ and $v$ are proportional, i.e. $v = \rho u$ for some
nonzero scalar $\rho$. Show that if $\|u\|_2 = 1$, then $\rho = 2$.
Solution:
If $G = I - uv^T$ is orthogonal then we have $G^T = G^{-1}$. We know the form of $G^{-1}$ from part (c),
so we have
$$G^T = G^{-1} \;\Leftrightarrow\; I - vu^T = I - \beta uv^T \;\Leftrightarrow\; vu^T = \beta uv^T.$$
If $G^T = G^{-1}$ then $G^Tx = G^{-1}x$ for all $x$. If we take $x = u$ then we have
$$vu^Tu = \beta uv^Tu \qquad\Leftrightarrow\qquad v = \beta\,\frac{v^Tu}{u^Tu}\,u = \rho u, \qquad \text{where } \rho = \beta\,\frac{v^Tu}{u^Tu}.$$
Now assume that $\|u\|_2 = 1$. Substituting $v = \rho u$ into $G^TG = I$ gives
$$G^TG = I \;\Rightarrow\; \left(I - uv^T\right)^T\left(I - uv^T\right) = I \;\Rightarrow\; \left(I - \rho uu^T\right)^T\left(I - \rho uu^T\right) = I \;\Rightarrow\; I - 2\rho uu^T + \rho^2\|u\|_2^2\,uu^T = I.$$
Solving the resulting quadratic equation in the coefficient of $uu^T$ gives $\rho = 0$ or $\rho = 2$. Since $v \neq 0$
we know $\rho \neq 0$, and we conclude that $\rho = 2$.
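With $\|u\|_2 = 1$ and $v = 2u$, the matrix $G = I - 2uu^T$ is exactly a Householder reflector, which ties back to the remark at the start of this problem. A quick numerical check (an illustration only; numpy is an assumption):

```python
import numpy as np

# With ||u||_2 = 1 and v = 2u, G = I - 2 u u^T is orthogonal (a Householder reflector).
rng = np.random.default_rng(2)
u = rng.standard_normal(7)
u /= np.linalg.norm(u)                     # normalize so that ||u||_2 = 1

G = np.eye(7) - 2.0 * np.outer(u, u)
print(np.allclose(G.T @ G, np.eye(7)))     # True: G^T G = I
```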
(e) Show that if $G = I - uv^T$ and $W \in \mathbb{R}^{m\times n}$, then $GW$ can be computed in about $4mn$ flops if the
arithmetic is done in an efficient way.
Solution:
Notice that we can write $GW = \left(I - uv^T\right)W = W - u\left(v^TW\right)$. We first compute $x^T = v^TW$,
which requires taking the inner product of $v$ with the $n$ columns of $W$. Each inner product costs
approximately $2m$ flops, and there are $n$ of them, for a total of $2mn$ flops. We then form the $m \times n$
outer-product matrix $ux^T$, which costs $mn$ flops. Finally we subtract $W - ux^T$ for another $mn$
flops, for a grand total of $4mn$ flops.
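A minimal Python sketch of this grouping (an illustration only; numpy and the helper name `apply_rank_one_update` are assumptions) compares it against forming $G$ explicitly, which would cost on the order of $m^2n$ flops to apply:

```python
import numpy as np

def apply_rank_one_update(u, v, W):
    """Compute (I - u v^T) W in ~4mn flops by grouping as W - u (v^T W)."""
    x = v @ W                    # v^T W: n inner products, ~2mn flops
    return W - np.outer(u, x)    # outer product plus subtraction, ~2mn flops

rng = np.random.default_rng(3)
m, n = 400, 300
u, v = rng.standard_normal(m), rng.standard_normal(m)
W = rng.standard_normal((m, n))

G = np.eye(m) - np.outer(u, v)                               # explicit G: much more expensive
print(np.allclose(apply_rank_one_update(u, v, W), G @ W))    # True
```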