A brief discussion of the mathematical structure of the eigenvalue problem is necessary to fix notation and introduce ideas that lead to an understanding of the behavior, strengths and limitations of the algorithm. In this discussion, the real and complex number fields are denoted by $\mathbf{R}$ and $\mathbf{C}$ respectively. The standard $n$-dimensional real and complex vectors are denoted by $\mathbf{R}^n$ and $\mathbf{C}^n$, and the symbols $\mathbf{R}^{m \times n}$ and $\mathbf{C}^{m \times n}$ will denote the real and complex matrices with $m$ rows and $n$ columns. Scalars are denoted by lower case Greek letters, vectors are denoted by lower case Latin letters and matrices by capital Latin letters. The transpose of a matrix $A$ is denoted by $A^T$ and the conjugate-transpose by $A^H$. The symbol $\|\cdot\|$ will denote the Euclidean or 2-norm of a vector. The standard basis for $\mathbf{C}^n$ is denoted by the set $\{e_j\}_{j=1}^{n}$.
The set of numbers $\sigma(A) \equiv \{\lambda \in \mathbf{C} : \det(A - \lambda I) = 0\}$ is called the spectrum of $A$. The elements of this discrete set are the eigenvalues of $A$ and they may be characterized as the $n$ roots of the characteristic polynomial $p_A(\lambda) \equiv \det(\lambda I - A)$. Corresponding to each distinct eigenvalue $\lambda \in \sigma(A)$ is at least one nonzero vector $x$ such that $Ax = x\lambda$. This vector is called a right eigenvector of $A$ corresponding to the eigenvalue $\lambda$. The pair $(x, \lambda)$ is called an eigenpair. A nonzero vector $y$ such that $y^H A = \lambda y^H$ is called a left eigenvector. The multiplicity $n_a(\lambda)$ of $\lambda$ as a root of the characteristic polynomial is the algebraic multiplicity and the dimension $n_g(\lambda)$ of $\mathrm{Null}(A - \lambda I)$ is the geometric multiplicity of $\lambda$. The matrix $A$ is defective if $n_g(\lambda) < n_a(\lambda)$ for some eigenvalue $\lambda$ and otherwise it is non-defective. The eigenvalue $\lambda$ is simple if $n_a(\lambda) = 1$.
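As a concrete numerical check of these definitions (a minimal NumPy sketch; the test matrix is an arbitrary choice, not from the original text), the right eigenpair relation $Ax = x\lambda$ can be verified directly:

    import numpy as np

    # Arbitrary small test matrix.
    A = np.array([[4.0, 1.0, 0.0],
                  [0.0, 3.0, 2.0],
                  [1.0, 0.0, 2.0]])

    # numpy.linalg.eig returns eigenvalues lam and right eigenvectors (columns of X).
    lam, X = np.linalg.eig(A)

    # Each column x_j satisfies A x_j = x_j * lam_j up to roundoff.
    for j in range(A.shape[0]):
        print(np.linalg.norm(A @ X[:, j] - X[:, j] * lam[j]))

    # Left eigenvectors are eigenvectors of A^H (here A^T, since A is real):
    # y^H A = lam y^H  is equivalent to  A^H y = conj(lam) y.
    mu, Y = np.linalg.eig(A.conj().T)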
A subspace $\mathcal{S}$ of $\mathbf{C}^n$ is called an invariant subspace of $A$ if $A\mathcal{S} \subset \mathcal{S}$. It is straightforward to show that if $A \in \mathbf{C}^{n \times n}$, $X \in \mathbf{C}^{n \times k}$, and $B \in \mathbf{C}^{k \times k}$ satisfy
$$ AX = XB, $$
then $\mathcal{S} \equiv \mathrm{Range}(X)$ is an invariant subspace of $A$. Moreover, if $X$ has full column rank $k$ then the columns of $X$ form a basis for this subspace and $\sigma(B) \subset \sigma(A)$. If, in addition, $k = n$ then $\sigma(B) = \sigma(A)$ and $A$ is said to be similar to $B$ under the similarity transformation $X$. $A$ is diagonalizable if it is similar to a diagonal matrix, and this property is equivalent to $A$ being non-defective.
Invariant subspaces are central to the methods developed here. Invariant subspaces generalize the notion of eigenvectors. If $Ax = x\lambda$ with $x \ne 0$, then $A(x\gamma) = (x\gamma)\lambda$ for any scalar $\gamma$, and $\mathcal{S} = \mathrm{Span}\{x\}$ is a one dimensional invariant subspace of $A$. More generally, if $Ax_j = x_j\lambda_j$ for $1 \le j \le k$ and we put $X = [x_1, x_2, \ldots, x_k]$, then $\mathcal{S} = \mathrm{Range}(X)$ is an invariant subspace of $A$ and indeed $AX = XB$ where $B = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k)$. If $X = QR$ where $Q^H Q = I_k$ and $R$ is upper triangular (the standard QR-factorization), then $AQ = QH$ where $H = RBR^{-1}$ is an upper triangular matrix with the eigenvalues $\lambda_j$ on its diagonal. The columns of $Q$ provide an orthonormal basis for the invariant subspace $\mathcal{S}$.
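A short NumPy sketch of this construction (illustrative only; the matrix and the choice of $k$ are arbitrary): a set of $k$ eigenvectors gives $AX = XB$, and the QR factorization of $X$ gives an orthonormal basis $Q$ with $AQ = QH$ and $H$ upper triangular.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 6, 3
    A = rng.standard_normal((n, n))

    # Take k eigenvectors of A; X satisfies A X = X B with B = diag(lam[:k]).
    lam, V = np.linalg.eig(A)
    X = V[:, :k]
    B = np.diag(lam[:k])
    print(np.linalg.norm(A @ X - X @ B))          # ~ machine precision

    # Orthonormalize the basis: X = Q R gives A Q = Q H with H = R B R^{-1}.
    Q, R = np.linalg.qr(X)
    H = R @ B @ np.linalg.inv(R)
    print(np.linalg.norm(A @ Q - Q @ H))          # ~ machine precision
    print(np.linalg.norm(np.tril(H, -1)))         # H is upper triangular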
A large condition number of $X$, and hence of $R$, indicates that these eigenvalues and this invariant subspace are sensitive to perturbations in $A$ (such as those introduced by roundoff error in a finite precision computation). But this is not the entire story. Separation of eigenvalues and angles between invariant subspaces also come into play. In the symmetric (Hermitian) case, invariant subspaces corresponding to distinct eigenvalues are orthogonal to each other and completely decouple the action of the matrix (as an operator on $\mathbf{C}^n$). In the non-symmetric case, such a decoupling is generally not possible. The nature of the coupling is completely described by the Jordan canonical form, but this form is usually extremely sensitive to perturbations and hence unsuitable as the basis for a computational algorithm.
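The sensitivity of a defective matrix is easy to demonstrate numerically (an illustrative NumPy sketch; the block size and perturbation are arbitrary choices): the eigenvalues of an $m \times m$ Jordan block move by roughly $\epsilon^{1/m}$ under a perturbation of size $\epsilon$.

    import numpy as np

    m = 5
    eps = 1e-10

    # J is a single Jordan block with eigenvalue 0 (algebraic multiplicity m,
    # geometric multiplicity 1), i.e. a maximally defective matrix.
    J = np.diag(np.ones(m - 1), 1)

    # Perturb a single entry by eps.
    Jp = J.copy()
    Jp[m - 1, 0] = eps

    # The exact eigenvalues of J are all 0; the perturbed eigenvalues have
    # magnitude roughly eps**(1/m) ~ 1e-2, far larger than eps itself.
    print(np.abs(np.linalg.eigvals(Jp)))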
The Schur decomposition [42] does provide a means to express this coupling and provides a foundation for the development of stable numerical algorithms. In particular, the implicitly shifted QR algorithm computes a Schur decomposition. Schur's result states that every square matrix is unitarily similar to an upper triangular matrix. In other words, for any linear operator on , there is a unitary basis in which the operator has an upper triangular matrix representation. The following result may be found in [16].
Theorem (Schur Decomposition). Let $A \in \mathbf{C}^{n \times n}$. Then there is a unitary matrix $Q$ and an upper triangular matrix $R$ such that
$$ AQ = QR. $$
A Schur decomposition is not unique. The eigenvalues of $A$ may appear on the diagonal of $R$ in any specified order. Thus, for any specified set of $k$ eigenvalues of $A$, there is a Schur decomposition such that these $k$ eigenvalues appear as diagonal elements of the leading principal $k \times k$ submatrix $R_k$ of the upper triangular matrix $R$. If $Q_k$ denotes the leading $k$ columns of the corresponding unitary matrix $Q$, then
$$ A Q_k = Q_k R_k. \qquad (1) $$
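A brief SciPy sketch of this partial Schur relation (illustrative; the matrix and the sorting criterion are arbitrary choices, and the callable form of the `sort` argument is used to select which eigenvalues appear in the leading block):

    import numpy as np
    from scipy.linalg import schur

    rng = np.random.default_rng(1)
    A = rng.standard_normal((6, 6))

    # Complex Schur form A = Q R Q^H, reordered so that eigenvalues with
    # positive real part appear first on the diagonal of R; k is their count.
    R, Q, k = schur(A, output='complex', sort=lambda lam: lam.real > 0.0)

    # The leading k columns of Q span an invariant subspace: A Q_k = Q_k R_k.
    Qk = Q[:, :k]
    Rk = R[:k, :k]
    print(k, np.linalg.norm(A @ Qk - Qk @ Rk))    # residual ~ machine precision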
The fundamental structure of Hermitian and normal matrices is easily derived from the Schur decomposition.
Lemma. A matrix $A$ is normal ($AA^H = A^H A$) if and only if $A = Q \Lambda Q^H$ with $Q$ unitary and $\Lambda$ diagonal. Moreover, $A$ is Hermitian ($A = A^H$) if and only if $A = Q \Lambda Q^H$ with $Q$ unitary and $\Lambda$ diagonal with real entries. In either case, the diagonal entries of $\Lambda$ are the eigenvalues of $A$ and the columns of $Q$ are the corresponding eigenvectors.
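As a quick numerical check (a minimal NumPy sketch with an arbitrary Hermitian test matrix), `numpy.linalg.eigh` returns exactly this factorization for Hermitian input: real eigenvalues and a unitary matrix of eigenvectors.

    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
    A = B + B.conj().T          # A is Hermitian by construction

    # eigh returns the (real) diagonal of Lambda and the unitary Q.
    lam, Q = np.linalg.eigh(A)

    print(np.linalg.norm(A - Q @ np.diag(lam) @ Q.conj().T))   # A = Q Lambda Q^H
    print(np.linalg.norm(Q.conj().T @ Q - np.eye(5)))          # Q is unitary
    print(lam)                                                 # real eigenvalues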
For purposes of algorithmic development this structure is fundamental. In fact, the well known Implicitly Shifted QR-Algorithm [14,15] is designed to produce a sequence of unitary similarity transformations that iteratively reduce $A$ to upper triangular form. This algorithm begins with an initial unitary similarity transformation $V^H A V = H$ of $A$ to the condensed form $AV = VH$, where $H$ is an upper Hessenberg matrix and $V^H V = I$. An upper Hessenberg matrix has zero elements below its first subdiagonal, so it is almost upper triangular. When $A$ is Hermitian, $H$ is a real symmetric tridiagonal matrix, in which case all the elements are zero except for those on the main, sub- and super-diagonals.
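This initial reduction is available directly in SciPy (a short illustrative sketch; the test matrix is arbitrary): `scipy.linalg.hessenberg` with `calc_q=True` returns $H$ and the unitary $V$ with $A = V H V^H$, i.e. $AV = VH$.

    import numpy as np
    from scipy.linalg import hessenberg

    rng = np.random.default_rng(3)
    A = rng.standard_normal((6, 6))

    # Reduce A to upper Hessenberg form: A = V H V^H with V unitary.
    H, V = hessenberg(A, calc_q=True)

    print(np.linalg.norm(A @ V - V @ H))            # A V = V H up to roundoff
    print(np.linalg.norm(np.tril(H, -2)))           # zero below the first subdiagonal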
Input: $(A, V, H)$ with $AV = VH$, $V^H V = I$, $H$ upper Hessenberg;
For $j = 1, 2, 3, \ldots$ until convergence,
   (a1.1) Select a shift $\mu \leftarrow \mu_j$;
   (a1.2) Factor $[Q, R] = \mathrm{qr}(H - \mu I)$;
   (a1.3) $H \leftarrow Q^H H Q$;  $V \leftarrow V Q$;
End_For
The QR-iteration is shown in Figure 4.2. The QR factorization of $H - \mu I$ is given by the unitary matrix $Q$ and upper triangular $R$. It is easy to see that $H$ remains unitarily similar to $A$ throughout the course of this iteration. The iteration is continued until the subdiagonal elements of $H$ converge to zero, i.e. until a Schur decomposition has been (approximately) obtained. In the standard implicitly shifted QR-iteration, the unitary matrix $Q$ is never actually formed.
It is computed indirectly as a product of Givens or
Householder transformations through a ``bulge chase'' process.
The elegant details of an efficient and stable implementation would be too much
of a digression here. They may be found in [16,28,48].
The convergence behavior of this iteration is fascinating.
The columns of $V$ converge to Schur
vectors at various rates. These rates are fundamentally linked to
the simple power method and its rapidly convergent variant,
inverse iteration. See [47] for further information and references.
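To make the preceding discussion concrete, here is an explicitly shifted QR-iteration in NumPy/SciPy (an illustrative sketch only: it forms $Q$ explicitly, uses the trailing diagonal entry as the shift, and deflates one eigenvalue at a time; a production implicitly shifted code, as noted above, instead chases a bulge with Givens or Householder transformations, and this simple shift strategy is not guaranteed to converge for every matrix):

    import numpy as np
    from scipy.linalg import hessenberg

    def shifted_qr(A, tol=1e-10, maxiter=1000):
        """Explicitly shifted QR-iteration with simple deflation (illustrative only)."""
        n = A.shape[0]
        H, V = hessenberg(A.astype(complex), calc_q=True)   # A V = V H, H upper Hessenberg
        m = n                                                # size of the active block
        for _ in range(maxiter):
            if m <= 1:
                break
            # Deflate when the trailing subdiagonal entry of the active block is tiny.
            if abs(H[m - 1, m - 2]) < tol * (abs(H[m - 1, m - 1]) + abs(H[m - 2, m - 2])):
                H[m - 1, m - 2] = 0.0
                m -= 1
                continue
            # Shift with the trailing diagonal entry of the active block.
            mu = H[m - 1, m - 1]
            Q, R = np.linalg.qr(H[:m, :m] - mu * np.eye(m))
            H[:m, :m] = Q.conj().T @ H[:m, :m] @ Q           # H <- Q^H H Q on the active block
            H[:m, m:] = Q.conj().T @ H[:m, m:]               # keep the full matrix similar to A
            V[:, :m] = V[:, :m] @ Q                          # accumulate the Schur basis
        return H, V

    rng = np.random.default_rng(4)
    A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
    H, V = shifted_qr(A)
    print(np.linalg.norm(A @ V - V @ H))          # A V = V H up to roundoff
    print(np.linalg.norm(np.tril(H, -1)))         # subdiagonal ~ 0: (approximate) Schur form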
Despite the extremely fast rate of convergence and the efficient use of storage, the implicitly shifted QR method is not suitable for large scale problems, and it has proven to be extremely difficult to parallelize. Large scale problems are typically sparse or structured, so that a matrix-vector product may be computed with time and storage proportional to $n$ rather than $n^2$. A method based upon full similarity transformations quickly destroys this structure, and storage and operation counts per iteration become order $n^2$. Hence, there is considerable motivation for methods that only require matrix-vector products with the original matrix $A$.
The power method provides a simple means to find the dominant eigenvalue (the one of largest magnitude) of a matrix without performing matrix factorizations or dense similarity transformations. It has the drawbacks that only one eigenpair may be found and that convergence may be slow or non-existent. Deflation schemes and/or block variants may be employed to overcome the first problem, and spectral transformations may be used to accelerate convergence and focus on selected eigenvalues. However, there is another very elegant way to extract additional eigenvalue information from the power method sequence.
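A minimal NumPy sketch of the power method described here (illustrative only; the function name, normalization, and convergence test are one common choice among several):

    import numpy as np

    def power_method(A, tol=1e-10, maxiter=5000, seed=0):
        """Power method: approximate the dominant eigenpair of A using only matrix-vector products."""
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(A.shape[0])
        v /= np.linalg.norm(v)
        lam = 0.0
        for _ in range(maxiter):
            w = A @ v                       # one matrix-vector product per iteration
            lam = v @ w                     # Rayleigh quotient eigenvalue estimate
            if np.linalg.norm(w - lam * v) <= tol * abs(lam):
                break
            v = w / np.linalg.norm(w)       # normalize to avoid overflow/underflow
        return lam, v

    # Example: dominant eigenvalue of a random symmetric matrix.
    rng = np.random.default_rng(5)
    B = rng.standard_normal((100, 100))
    A = B + B.T
    lam, v = power_method(A)
    evals = np.linalg.eigvalsh(A)
    print(lam, evals[np.argmax(np.abs(evals))])   # should agree closely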