A brief discussion of the mathematical structure of the eigenvalue problem
is necessary to fix notation and introduce ideas that
lead to an understanding of the behavior, strengths and limitations of the
algorithm. In this discussion,
the real and complex number fields are denoted by $\mathbb{R}$ and $\mathbb{C}$, respectively.
The standard $n$-dimensional real and complex vector spaces are denoted by
$\mathbb{R}^n$ and $\mathbb{C}^n$, and the symbols $\mathbb{R}^{m \times n}$ and $\mathbb{C}^{m \times n}$
will denote the real and complex
matrices with $m$ rows and $n$ columns.
Scalars are denoted by lower case Greek letters, vectors are denoted by lower
case Latin letters and matrices by capital Latin letters. The transpose of
a matrix $A$ is denoted by $A^T$ and the conjugate-transpose by $A^H$. The symbol $\|\cdot\|$ will denote the Euclidean or 2-norm of a vector.
The standard basis for $\mathbb{C}^n$ is denoted by the set $\{e_1, e_2, \ldots, e_n\}$.
The set of numbers
$$ \sigma(A) \equiv \{ \lambda \in \mathbb{C} \,:\, \det(A - \lambda I) = 0 \} $$
is called the spectrum of $A$.
The elements of this discrete set are the eigenvalues
of $A$,
and they
may be characterized as the $n$ roots of the characteristic polynomial
$p_A(\lambda) \equiv \det(\lambda I - A)$. Corresponding
to each distinct eigenvalue $\lambda \in \sigma(A)$ is at least one
nonzero vector $x$ such that $Ax = x\lambda$. This vector is
called a right eigenvector
of $A$
corresponding to
the eigenvalue $\lambda$. The pair $(x, \lambda)$ is called
an eigenpair. A nonzero vector $y$ such that $y^H A = \lambda y^H$
is called a left
eigenvector.
The multiplicity $n_a(\lambda)$ of $\lambda$ as a root
of the characteristic polynomial is the algebraic multiplicity,
and the dimension $n_g(\lambda)$ of $\mathrm{Null}(A - \lambda I)$
is the geometric multiplicity of $\lambda$.
The matrix $A$
is defective if $n_g(\lambda) < n_a(\lambda)$
for some eigenvalue $\lambda$, and otherwise it is
non-defective. The eigenvalue $\lambda$ is
simple if $n_a(\lambda) = 1$.
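As a small illustration of these definitions, consider the $2 \times 2$ Jordan block
$$ J = \begin{pmatrix} \mu & 1 \\ 0 & \mu \end{pmatrix}, \qquad p_J(\lambda) = (\lambda - \mu)^2 . $$
The single eigenvalue $\mu$ has algebraic multiplicity $n_a(\mu) = 2$, while
$\mathrm{Null}(J - \mu I) = \mathrm{Span}\{e_1\}$ is one dimensional, so
$n_g(\mu) = 1 < n_a(\mu)$ and $J$ is defective.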
A subspace $\mathcal{S}$ of $\mathbb{C}^n$ is called an invariant subspace
of $A$ if $A\mathcal{S} \subset \mathcal{S}$. It is straightforward to show
if $A \in \mathbb{C}^{n \times n}$, $X \in \mathbb{C}^{n \times k}$, and $B \in \mathbb{C}^{k \times k}$ satisfy
$$ AX = XB, $$
then $\mathcal{S} \equiv \mathrm{Range}(X)$ is an invariant
subspace of $A$. Moreover, if $X$ has full column rank $k$ then
the columns of $X$ form a basis for this subspace and
$\sigma(B) \subset \sigma(A)$. If, in addition, $k = n$,
then $\sigma(B) = \sigma(A)$ and $A$
is said to be similar to $B$ under
the similarity transformation $X$: $A = XBX^{-1}$.
$A$ is diagonalizable if it is similar to a diagonal matrix,
and this property is equivalent to $A$
being non-defective.
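Spelled out, the invariance claim above is verified in one line: any element of $\mathrm{Range}(X)$ has the form $Xv$, and
$$ A(Xv) = (AX)v = (XB)v = X(Bv) \in \mathrm{Range}(X), $$
so $\mathcal{S} = \mathrm{Range}(X)$ is invariant under $A$. Moreover, if $Bv = v\lambda$ with $v \neq 0$ and $X$ has full column rank, then $x \equiv Xv \neq 0$ and $Ax = XBv = x\lambda$, which gives $\sigma(B) \subset \sigma(A)$.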
Invariant subspaces are central to the methods developed here. Invariant
subspaces generalize the notion of eigenvectors.
If $Ax = x\lambda$ with $x \neq 0$, then $\mathcal{S} = \mathrm{Span}\{x\}$ is a one
dimensional invariant
subspace of $A$. More generally, if $Ax_j = x_j \lambda_j$
for $1 \le j \le k$ and we put $X = [x_1, x_2, \ldots, x_k]$, then $\mathcal{S} = \mathrm{Range}(X)$ is an invariant subspace
of $A$ and indeed $AX = XB$,
where $B = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k)$.
If $X = QR$, where $Q$ is unitary and $R$ is upper
triangular (the standard QR factorization), then $AQ = QH$,
where $H = RBR^{-1}$ is an upper triangular matrix
with the eigenvalues $\lambda_j$ on its diagonal. The columns of $Q$
provide an orthonormal basis for the invariant subspace $\mathcal{S}$.
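The following small NumPy sketch (an added illustration with an arbitrary random test matrix, not code from the original) checks these relations numerically: it takes $k$ computed eigenpairs, forms $X$ and $B$, QR-factors $X$, and verifies that $AX = XB$ and $AQ = QH$ with $H = RBR^{-1}$ upper triangular.
\begin{verbatim}
import numpy as np

# Illustration only: a random test matrix and k = 3 of its eigenpairs.
rng = np.random.default_rng(0)
n, k = 8, 3
A = rng.standard_normal((n, n))

lam, W = np.linalg.eig(A)           # eigenvalues and right eigenvectors
X = W[:, :k]                        # X = [x_1, ..., x_k]
B = np.diag(lam[:k])                # B = diag(lambda_1, ..., lambda_k)

Q, R = np.linalg.qr(X)              # X = QR, Q with orthonormal columns
H = R @ B @ np.linalg.inv(R)        # H = R B R^{-1} (upper triangular)

print(np.linalg.norm(A @ X - X @ B))    # ~ 0 :  AX = XB
print(np.linalg.norm(A @ Q - Q @ H))    # ~ 0 :  AQ = QH
print(np.linalg.norm(np.tril(H, -1)))   # ~ 0 :  H is upper triangular
\end{verbatim}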
A large condition number
of $X$, and
hence of $R$,
will indicate that these eigenvalues and this invariant subspace are sensitive
to perturbations in $A$ (such as those introduced by roundoff error in
a finite precision computation). But this is not the entire story.
Separation of eigenvalues and angles between invariant subspaces
also come into play. In the symmetric (Hermitian) case, invariant subspaces
corresponding to distinct eigenvalues are orthogonal to each other and
completely decouple the action of the matrix (as an operator on $\mathbb{C}^n$).
In the non-symmetric case, such
a decoupling is generally not possible. The nature of the coupling
is completely described by the Jordan canonical form but this form
is usually extremely sensitive to perturbations and hence unsuitable
as the basis for a computational algorithm.
The Schur decomposition [42] provides a means to express this coupling and a foundation for the development of stable numerical algorithms. In particular, the implicitly shifted QR algorithm computes a Schur decomposition. Schur's result states that every square matrix is unitarily similar to an upper triangular matrix. In other words, for any linear operator on $\mathbb{C}^n$, there is an orthonormal basis in which the operator has an upper triangular matrix representation. The following result may be found in [16].
Theorem (Schur Decomposition)  Let $A \in \mathbb{C}^{n \times n}$. Then there is a unitary matrix $Q$
and an upper triangular matrix $R$ such that
$$ AQ = QR . $$
A Schur decomposition is not unique. The eigenvalues of $A$ may appear on the diagonal of $R$ in any specified order. Thus, for
any specified set of $k$ eigenvalues of $A$,
there is a Schur decomposition
such that these $k$ eigenvalues appear as diagonal elements of
the leading principal $k \times k$ submatrix $R_k$ of the upper triangular matrix $R$. If $Q_k$ denotes the leading $k$ columns of the corresponding
unitary matrix $Q$, then
$$ A Q_k = Q_k R_k . \qquad (1) $$
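As an added illustration (not part of the original text), the following SciPy sketch computes a complex Schur decomposition with one particular eigenvalue ordering, here placing the eigenvalues outside the unit circle first, and checks relation (1) for the leading $k$ columns; the test matrix and the sorting criterion are arbitrary choices made only for the example.
\begin{verbatim}
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))

# Complex Schur form A Q = Q R with the eigenvalues of modulus > 1 ordered
# first ('ouc' = outside the unit circle); k counts those eigenvalues.
R, Q, k = schur(A, output='complex', sort='ouc')

Qk, Rk = Q[:, :k], R[:k, :k]        # leading k Schur vectors and k-by-k block

print(np.linalg.norm(A @ Q - Q @ R))      # ~ 0 : full Schur relation
print(np.linalg.norm(A @ Qk - Qk @ Rk))   # ~ 0 : partial relation (1)
\end{verbatim}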
The fundamental structure of Hermitian and normal matrices is easily derived from the Schur decomposition.
Lemma
A matrix $A$ is normal ($AA^H = A^H A$)
if and only if $A = Q \Lambda Q^H$ with $Q$ unitary
and $\Lambda$ diagonal.
Moreover, $A$ is Hermitian ($A = A^H$)
if and only if $\Lambda$ is diagonal with real entries.
In either case, the diagonal entries of $\Lambda$ are the eigenvalues of $A$
and the columns of $Q$ are the corresponding
eigenvectors.
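A short numerical check of the Hermitian case (again an added illustration, using a random Hermitian test matrix) is that the computed Schur form is diagonal with real entries, up to roundoff:
\begin{verbatim}
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
A = (M + M.conj().T) / 2                          # Hermitian test matrix

T, Q = schur(A, output='complex')                 # A = Q T Q^H
print(np.linalg.norm(T - np.diag(np.diag(T))))    # ~ 0 : T numerically diagonal
print(np.linalg.norm(np.diag(T).imag))            # ~ 0 : real diagonal entries
\end{verbatim}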
For purposes of algorithmic development this structure is fundamental.
In fact, the well known Implicitly Shifted QR-Algorithm [14,15]
is designed to produce a sequence of unitary similarity
transformations
that iteratively reduce $A$ to upper triangular form. This algorithm begins
with an initial unitary similarity transformation of $A$
to the
condensed
form $AV = VH$, where $V^H V = I$ and $H$ is an upper Hessenberg matrix.
An upper Hessenberg matrix has zero elements below
its main subdiagonal, so it is almost upper triangular.
When $A$
is Hermitian, $H$ is a real symmetric tridiagonal matrix,
in which case all the elements
are zero except for those on the main, sub- and super-diagonals.
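Schematically, for $n = 5$ the two condensed forms have the nonzero patterns
$$
\begin{pmatrix}
\times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times \\
  0    & \times & \times & \times & \times \\
  0    &   0    & \times & \times & \times \\
  0    &   0    &   0    & \times & \times
\end{pmatrix}
\quad \text{(upper Hessenberg)},
\qquad
\begin{pmatrix}
\times & \times &   0    &   0    &   0    \\
\times & \times & \times &   0    &   0    \\
  0    & \times & \times & \times &   0    \\
  0    &   0    & \times & \times & \times \\
  0    &   0    &   0    & \times & \times
\end{pmatrix}
\quad \text{(tridiagonal)}.
$$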
Input: ($A$, $V$, $H$) with $AV = VH$, $V^H V = I$, $H$ upper Hessenberg;
For $j = 1, 2, 3, \ldots$ until convergence,
   (a1.1) Select a shift $\mu \leftarrow \mu_j$ ;
   (a1.2) Factor $[Q, R] = \mathrm{qr}(H - \mu I)$ ;
   (a1.3) $H \leftarrow Q^H H Q$ ; $V \leftarrow V Q$ ;
End_For
The QR-iteration is shown in Figure 4.2.
The QR factorization
of $H - \mu I$ is given by the unitary matrix $Q$ and upper
triangular $R$. It is easy to see that $A$ is unitarily similar to $H$
throughout the course of this iteration. The iteration is continued until
the subdiagonal elements of $H$ converge to zero, i.e. until a Schur
decomposition has been (approximately) obtained. In the standard implicitly
shifted QR-iteration, the unitary matrix $Q$ is never actually formed.
It is computed indirectly as a product of Givens or
Householder transformations through a ``bulge chase'' process.
The elegant details of an efficient and stable implementation would be too much
of a digression here. They may be found in [16,28,48].
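For concreteness, here is a dense NumPy sketch in the spirit of Figure 4.2 (an added illustration, not ARPACK or LAPACK code): it performs the initial Hessenberg reduction, then repeats the select-shift, factor, and update cycle on the active leading block, using the simple Rayleigh-quotient shift, forming $Q$ explicitly rather than performing the implicit bulge chase, and deflating whenever a subdiagonal entry becomes negligible.
\begin{verbatim}
import numpy as np
from scipy.linalg import hessenberg

def shifted_qr(A, tol=1e-12, maxiter=1000):
    """Explicitly shifted QR iteration (didactic sketch of Figure 4.2).

    Returns (H, V) with A V ~= V H, V unitary and H numerically upper
    triangular; a production code uses the implicit bulge chase instead.
    """
    H, V = hessenberg(np.asarray(A, dtype=complex), calc_q=True)  # A V = V H
    n = H.shape[0]
    for m in range(n, 1, -1):                 # active leading block H[:m, :m]
        for _ in range(maxiter):
            if abs(H[m-1, m-2]) <= tol * (abs(H[m-1, m-1]) + abs(H[m-2, m-2])):
                H[m-1, m-2] = 0.0             # deflate a converged eigenvalue
                break
            mu = H[m-1, m-1]                  # (a1.1) select a shift
            Q, R = np.linalg.qr(H[:m, :m] - mu * np.eye(m))  # (a1.2) factor
            H[:m, :m] = Q.conj().T @ H[:m, :m] @ Q           # (a1.3) H <- Q^H H Q
            H[:m, m:] = Q.conj().T @ H[:m, m:]               # keep full H consistent
            V[:, :m] = V[:, :m] @ Q                          #         V <- V Q
    return H, V

# Complex test matrix; real matrices with complex conjugate eigenvalue pairs
# would require the double-shift strategy discussed in the references.
rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
H, V = shifted_qr(A)
print(np.linalg.norm(A @ V - V @ H))          # ~ 0 : unitary similarity maintained
print(np.linalg.norm(np.tril(H, -1)))         # ~ 0 : subdiagonals driven to zero
\end{verbatim}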
The convergence behavior of this iteration is fascinating.
The columns of $V$ converge to Schur
vectors at various rates. These rates are fundamentally linked to
the simple power method and its rapidly convergent variant,
inverse iteration. See [47] for further information and references.
Despite the extremely fast rate of convergence and the efficient use of
storage, the implicitly shifted QR method
is not suitable for large
scale problems and it has proven to be extremely difficult to parallelize.
Large scale problems are typically sparse or structured,
so that a matrix-vector product may be
computed with time and storage proportional to $n$ rather than $n^2$.
A method based upon full similarity transformations quickly destroys
this structure. Storage and operation counts per iteration become order $n^2$.
Hence, there is considerable motivation for methods that only
require matrix-vector products with the original $A$.
The power method provides a simple means to find the dominant eigenvalue (of largest magnitude) of a matrix without performing matrix factorizations or dense similarity transformations. It has the drawback that only one eigenpair may be found and that convergence may be slow or non-existent. Deflation schemes and/or block variants may be employed to overcome the first problem, and spectral transformations may be used to accelerate convergence and focus on selected eigenvalues. However, there is another very elegant way to extract additional eigenvalue information from the power method sequence.
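Before turning to that idea, a minimal sketch of the power method itself (an added illustration; the normalization, Rayleigh-quotient estimate, and stopping test are one simple set of choices) shows that only matrix-vector products with $A$ are needed, so $A$ may be kept in any sparse or structured form:
\begin{verbatim}
import numpy as np
import scipy.sparse as sp

def power_method(A, tol=1e-8, maxiter=1000, seed=0):
    """Power method: approximate dominant eigenpair (lam, v) of A.

    Only the matrix-vector product A @ v is required, so A may be any
    sparse matrix or operator; one matvec is performed per iteration.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(maxiter):
        w = A @ v                          # the only use of A
        lam = np.vdot(v, w)                # Rayleigh-quotient eigenvalue estimate
        if np.linalg.norm(w - lam * v) <= tol * abs(lam):
            break
        v = w / np.linalg.norm(w)
    return lam, v

# Example: sparse symmetric test matrix with a deliberately well separated
# dominant eigenvalue (a random sparse part plus a large diagonal spike).
n = 2000
R = sp.random(n, n, density=5.0 / n, random_state=1, format='csr')
A = (R + R.T) * 0.5 + sp.diags(np.r_[20.0, np.zeros(n - 1)])
lam, v = power_method(A)
print(lam, np.linalg.norm(A @ v - lam * v))
\end{verbatim}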