Notation & Review of some linear algebra concepts Iterative methods : Notation and a brief background • Mathematical background: matrices, inner products and norms • linear systems of equations ä The set of all linear combinations of a set of vectors G = {a1, a2, . . . , aq } of Rn is a vector subspace called the linear span of G. Notation span(G), span {a1, a2, . . . , aq } ä If the ai’s are linearly independent, then each vector of span{G} admits a unique expression as a linear combination of the ai’s. The set G is then called a basis ä Recall: A matrix represents a linear mapping between two vector spaces of finite dimension n and m. • Iterative processes 11-2 Transposition: If A ∈ Rm×n then its transpose is a matrix C ∈ Rn×m with entries cij = aji, i = 1, . . . , n, j = 1, . . . , m Notation : AT . Transpose Conjugate: for complex matrices, the transpose con¯T = AT . jugate matrix denoted by AH is more relevant: AH = A ä We consider now only square matrices (m = n). ä Spectral radius = The maximum modulus of the eigenvalues Csci 8314 – March 5, 2014 ä Trace of A = sum of diagonal elements of A. P tr(A) = ni=1 aii. ä tr(A) = sum of all the eigenvalues of A counted with their multiplicities. ä Recall that det(A) = product of all the eigenvalues of A counted with their multiplicities. Example: : Trace, spectral radius, and determinant of the matrix: ρ(A) = maxλ∈λ(A) |λ|. 2 1 A= . 3 0 ä Recall: limk→∞ Ak = 0 iff ρ(A) < 1. 11-3 Csci 8314 – March 5, 2014 11-4 Csci 8314 – March 5, 2014 Range and null space ä Range: Types of matrices (square) Ran(A) = {Ax | x ∈ Rn} ä Null Space: Null(A) = {x ∈ Rn | Ax = 0 } ä Range = linear span of the columns of A ä Rank of a matrix rank(A) = dim(Ran(A)) ä rank (A) = the number of linearly independent columns of (A) = the number of linearly independent rows of A. ä A is of full rank if rank(A) = min{m, n}. Otherwise it is rank-deficient. Rank+Nullity theorem for an m × n matrix: • Symmetric matrices: AT = A. • Hermitian matrices: AH = A. • Skew-symmetric matrices: AT = −A. • Skew-Hermitian matrices: AH = −A. • Normal matrices: AH A = AAH . • Nonnegative matrices: aij ≥ 0, i, j = 1, . . . , n (similar definition for nonpositive, positive, and negative matrices). • Unitary matrices: QH Q = I. Note: if Q is unitary then Q−1 = QH . dim(Ran(A)) + dim(N ull(A)) = n Csci 8314 – March 5, 2014 11-5 Vector norms Inner products and Norms ä Inner product of 2 vectors x and y in Rn: x1y1 + x2y2 + · · · + xnyn in Rn Notation: (x, y) or y T x ä For complex vectors (x, y) = x1y¯1 + x2y¯2 + · · · + xny¯n in Cn Norms are needed to measure lengths of vectors and closeness of two vectors. Examples of use: Estimate convergence rate of an iterative method; Estimate the error of an approximation to a given solution; ... ä A vector norm on a vector space X is a real-valued function on X, which satisfies the following three conditions: 1. kxk ≥ 0, Note: (x, y) = y H x An important property: Given A ∈ Cm×n then (Ax, y) = (x, AH y) ∀ x ∈ Cn, ∀y ∈ Cm 11-7 Csci 8314 – March 5, 2014 11-6 Csci 8314 – March 5, 2014 ∀ x ∈ X, 2. kαxk = |α|kxk, and kxk = 0 iff x = 0. ∀ x ∈ X, 3. kx + yk ≤ kxk + kyk, ∀α ∈ C. ∀ x, y ∈ X. ä 3. is called the triangle inequality. 11-8 Csci 8314 – March 5, 2014 Example: Euclidean norm on X = Cn, kxk2 = (x, x)1/2 = Convergence of vector sequences p |x1|2 + |x2|2 + . . . + |xn|2 ä Most common vector norms in numerical linear algebra: A sequence of vectors x(k), k = 1, . . . , ∞ converges to a vector x with respect to the norm k.k if, by definition, lim kx(k) − xk = 0 k→∞ kxk1 = |x1| + |x2| + · · · + |xn|, 1/2 kxk2 = |x1|2 + |x2|2 + · · · + |xn|2 , kxk∞ = max |xi|. i=1,...,n ä Important point: because all norms in Rn are equivalent, the convergence of x(k) w.r.t. a given norm implies convergence w.r.t. any other norm. ä Notation: ä The Cauchy-Schwartz inequality (important) is: |(x, y)| ≤ kxk2kyk2. 11-9 Csci 8314 – March 5, 2014 limk→∞ x(k) = x (k) ä Note: x(k) converges to x iff each component xi converges to the corresoponding component xi of x of x(k) Csci 8314 – March 5, 2014 11-10 ä Given a matrix A in Cm×n, define the set of matrix norms Matrix norms ä Can define matrix norms by considering m × n matrices as vectors in Rmn. These norms satisfy the usual properties of vector norms, i.e., 1. kAk ≥ 0, ∀ A ∈ Cm×n, and kAk = 0 iff A = 0 2. kαAk = |α|kAk, ∀ A ∈ Cm×n, ∀ α ∈ C 3. kA + Bk ≤ kAk + kBk, ∀ A, B ∈ Cm×n. kAkp = max n x∈C , x6=0 kAxkp kxkp . ä These norms satisfy the usual properties of vector norms (see previous page). ä The matrix norm k.kp is induced by the vector norm k.kp. ä Again, important cases are for p = 1, 2, ∞. ä However, these will lack (in general) the right properties for composition of operators (product of matrices). ä The case of k.k2 yields the Frobenius norm of matrices. 11-11 Csci 8314 – March 5, 2014 11-12 Csci 8314 – March 5, 2014 Consistency Important equalities: ä A fundamental property is consistency kABkp ≤ kAkpkBkp. kAk1 = max j=1,...,n ä Consequence: kAk kp ≤ kAkkp kAk∞ = max m X i=1 n X i=1,...,m ä Ak converges to zero if any of its p-norms is < 1 |aij |, |aij |, j=1 1/2 1/2 kAk2 = ρ(AH A) = ρ(AAH ) , 1/2 H H 1/2 kAkF = T r(A A) = T r(AA ) . ä The Frobenius norm is defined by P 1/2 Pm n 2 kAkF = . j=1 i=1 |aij | ä Same as the 2-norm of the column vector in Cmn consisting of all the columns (respectively rows) of A. ä This norm is also consistent [but not induced from a vector norm] Csci 8314 – March 5, 2014 11-13 A few properties of Symmetric Positive Definite matrices Positive-Definite Matrices ä Diagonal entries of A are positive. More generally, ... ä A real matrix is said to be positive definite if ä Diagonal block A(k : l, k : l), (k < l), is SPD (Au, u) > 0 for all u 6= 0 u ∈ R n ä Let A be a real positive definite matrix. Then there is a scalar α > 0 such that (Au, u) ≥ αkuk22, ä Consider now the case of Symmetric Positive Definite (SPD) matrices. ä Consequence 1: A is nonsingular ä Consequence 2: the eigenvalues of A are (real) positive 11-15 Csci 8314 – March 5, 2014 11-14 Csci 8314 – March 5, 2014 ä For any n × k matrix X of rank k, the matrix X T AX is SPD. ä The mapping : x, y → (x, y)A ≡ (Ax, y) is a proper inner product on Rn. The associated norm, denoted by k.kA, is called the energy norm: √ kxkA = (Ax, x)1/2 = xT Ax ä A admits the Cholesky factorization A = LLT where L is lower triangular 11-16 Csci 8314 – March 5, 2014 Iterative processes for linear systems Basic relaxation techniques In contrast with “direct” methods (Gaussian Elimination) iterative methods compute a sequence of approximations x(k) to the solution x. Ideally, a good approximation is obtained in a few iterations of the process. Convergence measured by some norm. • Relaxation methods: Jacobi, Gauss-Seidel, SOR • Basic convergence results Questions which arise: • Optimal relaxation parameter for SOR ä Why not use a direct method [always works!] • See Chapter 4 of text for details. ä Under which condition (s) will the method converge? ä When to stop? ä Can we estimate costs? 11-17 Csci 8314 – March 5, 2014 ä Gauss-Seidel iteration can be expressed as: Basic relaxation schemes ä Relaxation schemes: methods that modify one component of current approximation at a time ä Based on the decomposition A = D − E − F with: D = diag(A), −E = strict lower part of A and −F = its strict upper part. −F D −E (D − E)x(k+1) = F x(k) + b Can also define a backward Gauss-Seidel Iteration: (D − F )x(k+1) = Ex(k) + b and a Symmetric Gauss-Seidel Iteration: forward sweep followed by backward sweep. Over-relaxation is based on the decomposition: ωA = (D − ωE) − (ωF + (1 − ω)D) Gauss-Seidel iteration for solving Ax = b: ä corrects j-th component of current approximate solution, to zero the j − th component of residual for j = 1, 2, · · · , n. 11-19 Csci 8314 – March 5, 2014 → successive overrelaxation, (SOR): (D − ωE)x(k+1) = [ωF + (1 − ω)D]x(k) + ωb 11-20 Csci 8314 – March 5, 2014 Iteration matrices General convergence result Jacobi, Gauss-Seidel, SOR, & SSOR iterations are of the form x(k+1) = M x(k) + f Consider the iteration: x(k+1) = Gx(k) + f (1) Assume that ρ(G) < 1. Then I − G is non-singular and G has a fixed point. Iteration converges to a fixed point for any f and x(0). MJ ac = D −1(E + F ) = I − D −1A MGS = (D − E)−1F = I − (D − E)−1A MSOR = (D − ωE)−1(ωF + (1 − ω)D) = I − (ω −1D − E)−1A MSSOR = I − ω(2 − ω)(D − ωF )−1D(D − ωE)−1A (2) If iteration converges for any f and x(0) then ρ(G) < 1. Example: Richardson’s iteration x(k+1) = x(k) + α(b − Ax(k)) - Assume Λ(A) ⊂ R. When does the iteration converge? 11-21 Csci 8314 – March 5, 2014 A few well-known results ä Jacobi and Gauss-Seidel converge for diagonal dominant matrices, i.e., matrices such that P |aii| > j6=i |aij |, i = 1, · · · , n ä SOR converges for 0 < ω < 2 for SPD matrices ä The optimal ω is known in theory for an important class of matrices called 2-cyclic matrices or matrices with property A. Csci 8314 – March 5, 2014 11-22 ä A matrix has property A if it can be (symmetrically) permuted into a 2 × 2 block matrix whose diagonal blocks are diagonal. P AP T D1 E = E T D2 ä Let A be a matrix which has property A. Then the eigenvalues λ of the SOR iteration matrix and the eigenvalues µ of the Jacobi iteration matrix are related by (λ + ω − 1)2 = λω 2µ2 ä The optimal ω for matrices with property A is given by ωopt = 2 p 1 + 1 − ρ(B)2 where B is the Jacobi iteration matrix. 11-23 Csci 8314 – March 5, 2014 11-24 Csci 8314 – March 5, 2014 An observation Introduction to Preconditioning ä The iteration x(k+1) = M x(k) + f is attempting to solve (I − M )x = f . Since M is of the form M = I − P −1A this system can be rewritten as P −1Ax = P −1b where for SSOR, we have PSSOR = (D − ωE)D −1(D − ωF ) referred to as the SSOR ‘preconditioning’ matrix. In other words: Relaxation iter. 11-25 ⇐⇒ Preconditioned Fixed Point Iter. Csci 8314 – March 5, 2014
© Copyright 2025 ExpyDoc