A brief discussion of the mathematical structure of the eigenvalue problem
is necessary to fix notation and introduce ideas that
lead to an understanding of the behavior, strengths and limitations of the
algorithm. In this discussion,
the real and complex number fields are denoted by $\mathbb{R}$ and $\mathbb{C}$, respectively.
The standard $n$-dimensional real and complex vector spaces are denoted by
$\mathbb{R}^n$ and $\mathbb{C}^n$, and the symbols $\mathbb{R}^{m \times n}$ and $\mathbb{C}^{m \times n}$
will denote the real and complex
matrices with $m$ rows and $n$ columns.
Scalars are denoted by lower case Greek letters, vectors are denoted by lower
case Latin letters and matrices by capital Latin letters. The transpose of
a matrix $A$ is denoted by $A^T$ and the conjugate-transpose by $A^H$. The symbol $\|\cdot\|$ will denote the Euclidean or 2-norm of a vector.
The standard basis for $\mathbb{C}^n$ is denoted by the set $\{e_1, e_2, \ldots, e_n\}$.
The set of numbers
$$ \sigma(A) \equiv \{ \lambda \in \mathbb{C} \,:\, \det(A - \lambda I) = 0 \} $$
is called the spectrum of $A$.
The elements of this discrete set are the eigenvalues
of $A$,
and they
may be characterized as the $n$ roots of the characteristic polynomial
$p_A(\lambda) \equiv \det(\lambda I - A)$. Corresponding
to each distinct eigenvalue $\lambda \in \sigma(A)$ is at least one
nonzero vector $x$ such that $Ax = x\lambda$. This vector is
called a right eigenvector
of $A$
corresponding to
the eigenvalue $\lambda$. The pair $(x, \lambda)$ is called
an eigenpair. A nonzero vector $y$ such that $y^H A = \lambda y^H$
is called a left
eigenvector.
The multiplicity $n_a(\lambda)$ of $\lambda$ as a root
of the characteristic polynomial is the algebraic multiplicity,
and the dimension $n_g(\lambda)$ of $\mathrm{Null}(A - \lambda I)$
is the geometric multiplicity of $\lambda$.
The matrix $A$
is defective if $n_g(\lambda) < n_a(\lambda)$
for some eigenvalue $\lambda$, and otherwise it is
non-defective. The eigenvalue $\lambda$ is
simple if $n_a(\lambda) = 1$.
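As a small illustration of these definitions, consider the $2 \times 2$ Jordan block
$$ J = \begin{pmatrix} \mu & 1 \\ 0 & \mu \end{pmatrix}, \qquad p_J(\lambda) = (\lambda - \mu)^2 . $$
The single eigenvalue $\mu$ has algebraic multiplicity $n_a(\mu) = 2$, while
$\mathrm{Null}(J - \mu I) = \mathrm{Span}\{e_1\}$ is one dimensional, so
$n_g(\mu) = 1 < n_a(\mu)$ and $J$ is defective.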
A subspace $\mathcal{S}$ of $\mathbb{C}^n$ is called an invariant subspace
of $A$ if $A\mathcal{S} \subset \mathcal{S}$. It is straightforward to show
if $A \in \mathbb{C}^{n \times n}$, $X \in \mathbb{C}^{n \times k}$, and $B \in \mathbb{C}^{k \times k}$ satisfy
$$ AX = XB, $$
then $\mathcal{S} \equiv \mathrm{Range}(X)$ is an invariant
subspace of $A$. Moreover, if $X$ has full column rank $k$ then
the columns of $X$ form a basis for this subspace and
$\sigma(B) \subset \sigma(A)$. If, in addition, $k = n$,
then $\sigma(B) = \sigma(A)$ and $A$
is said to be similar to $B$ under
the similarity transformation $X$: $A = XBX^{-1}$.
$A$ is diagonalizable if it is similar to a diagonal matrix,
and this property is equivalent to $A$
being non-defective.
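Spelled out, the invariance claim above is verified in one line: any element of $\mathrm{Range}(X)$ has the form $Xv$, and
$$ A(Xv) = (AX)v = (XB)v = X(Bv) \in \mathrm{Range}(X), $$
so $\mathcal{S} = \mathrm{Range}(X)$ is invariant under $A$. Moreover, if $Bv = v\lambda$ with $v \neq 0$ and $X$ has full column rank, then $x \equiv Xv \neq 0$ and $Ax = XBv = x\lambda$, which gives $\sigma(B) \subset \sigma(A)$.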
Invariant subspaces are central to the methods developed here. Invariant
subspaces generalize the notion of eigenvectors.
If $Ax = x\lambda$ with $x \neq 0$, then $\mathcal{S} = \mathrm{Span}\{x\}$ is a one
dimensional invariant
subspace of $A$. More generally, if $Ax_j = x_j \lambda_j$
for $1 \le j \le k$ and we put $X = [x_1, x_2, \ldots, x_k]$, then $\mathcal{S} = \mathrm{Range}(X)$ is an invariant subspace
of $A$ and indeed $AX = XB$,
where $B = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k)$.
If $X = QR$, where $Q$ is unitary and $R$ is upper
triangular (the standard QR factorization), then $AQ = QH$,
where $H = RBR^{-1}$ is an upper triangular matrix
with the eigenvalues $\lambda_j$ on its diagonal. The columns of $Q$
provide an orthonormal basis for the invariant subspace $\mathcal{S}$.
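The following small NumPy sketch (an added illustration with an arbitrary random test matrix, not code from the original) checks these relations numerically: it takes $k$ computed eigenpairs, forms $X$ and $B$, QR-factors $X$, and verifies that $AX = XB$ and $AQ = QH$ with $H = RBR^{-1}$ upper triangular.
\begin{verbatim}
import numpy as np

# Illustration only: a random test matrix and k = 3 of its eigenpairs.
rng = np.random.default_rng(0)
n, k = 8, 3
A = rng.standard_normal((n, n))

lam, W = np.linalg.eig(A)           # eigenvalues and right eigenvectors
X = W[:, :k]                        # X = [x_1, ..., x_k]
B = np.diag(lam[:k])                # B = diag(lambda_1, ..., lambda_k)

Q, R = np.linalg.qr(X)              # X = QR, Q with orthonormal columns
H = R @ B @ np.linalg.inv(R)        # H = R B R^{-1} (upper triangular)

print(np.linalg.norm(A @ X - X @ B))    # ~ 0 :  AX = XB
print(np.linalg.norm(A @ Q - Q @ H))    # ~ 0 :  AQ = QH
print(np.linalg.norm(np.tril(H, -1)))   # ~ 0 :  H is upper triangular
\end{verbatim}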
A large condition number
of $X$, and
hence of $R$,
will indicate that these eigenvalues and this invariant subspace are sensitive
to perturbations in $A$ (such as those introduced by roundoff error in
a finite precision computation). But this is not the entire story.
Separation of eigenvalues and angles between invariant subspaces
also come into play. In the symmetric (Hermitian) case, invariant subspaces
corresponding to distinct eigenvalues are orthogonal to each other and
completely decouple the action of the matrix (as an operator on $\mathbb{C}^n$).
In the non-symmetric case, such
a decoupling is generally not possible. The nature of the coupling
is completely described by the Jordan canonical form but this form
is usually extremely sensitive to perturbations and hence unsuitable
as the basis for a computational algorithm.
The Schur decomposition [42] provides a means to express this coupling and a foundation for the development of stable numerical algorithms. In particular, the implicitly shifted QR algorithm computes a Schur decomposition. Schur's result states that every square matrix is unitarily similar to an upper triangular matrix. In other words, for any linear operator on $\mathbb{C}^n$, there is an orthonormal basis in which the operator has an upper triangular matrix representation. The following result may be found in [16].
Theorem (Schur Decomposition)  Let $A \in \mathbb{C}^{n \times n}$. Then there is a unitary matrix $Q$
and an upper triangular matrix $R$ such that
$$ AQ = QR . $$
A Schur decomposition is not unique. The eigenvalues of $A$ may appear on the diagonal of $R$ in any specified order. Thus, for
any specified set of $k$ eigenvalues of $A$,
there is a Schur decomposition
such that these $k$ eigenvalues appear as diagonal elements of
the leading principal $k \times k$ submatrix $R_k$ of the upper triangular matrix $R$. If $Q_k$ denotes the leading $k$ columns of the corresponding
unitary matrix $Q$, then
$$ A Q_k = Q_k R_k . \qquad (1) $$
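As an added illustration (not part of the original text), the following SciPy sketch computes a complex Schur decomposition with one particular eigenvalue ordering, here placing the eigenvalues outside the unit circle first, and checks relation (1) for the leading $k$ columns; the test matrix and the sorting criterion are arbitrary choices made only for the example.
\begin{verbatim}
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))

# Complex Schur form A Q = Q R with the eigenvalues of modulus > 1 ordered
# first ('ouc' = outside the unit circle); k counts those eigenvalues.
R, Q, k = schur(A, output='complex', sort='ouc')

Qk, Rk = Q[:, :k], R[:k, :k]        # leading k Schur vectors and k-by-k block

print(np.linalg.norm(A @ Q - Q @ R))      # ~ 0 : full Schur relation
print(np.linalg.norm(A @ Qk - Qk @ Rk))   # ~ 0 : partial relation (1)
\end{verbatim}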
The fundamental structure of Hermitian and normal matrices is easily derived from the Schur decomposition.
Lemma
A matrix $A$ is normal ($AA^H = A^H A$)
if and only if $A = Q \Lambda Q^H$ with $Q$ unitary
and $\Lambda$ diagonal.
Moreover, $A$ is Hermitian ($A = A^H$)
if and only if $\Lambda$ is diagonal with real entries.
In either case, the diagonal entries of $\Lambda$ are the eigenvalues of $A$
and the columns of $Q$ are the corresponding
eigenvectors.
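A short numerical check of the Hermitian case (again an added illustration, using a random Hermitian test matrix) is that the computed Schur form is diagonal with real entries, up to roundoff:
\begin{verbatim}
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
A = (M + M.conj().T) / 2                          # Hermitian test matrix

T, Q = schur(A, output='complex')                 # A = Q T Q^H
print(np.linalg.norm(T - np.diag(np.diag(T))))    # ~ 0 : T numerically diagonal
print(np.linalg.norm(np.diag(T).imag))            # ~ 0 : real diagonal entries
\end{verbatim}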
For purposes of algorithmic development this structure is fundamental.
In fact, the well known Implicitly Shifted QR-Algorithm [14,15]
is designed to produce a sequence of unitary similarity
transformations
that iteratively reduce $A$ to upper triangular form. This algorithm begins
with an initial unitary similarity transformation of $A$
to the
condensed
form $AV = VH$, where $V^H V = I$ and $H$ is an upper Hessenberg matrix.
An upper Hessenberg matrix has zero elements below
its main subdiagonal, so it is almost upper triangular.
When $A$
is Hermitian, $H$ is a real symmetric tridiagonal matrix,
in which case all the elements
are zero except for those on the main, sub- and super-diagonals.
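Schematically, for $n = 5$ the two condensed forms have the nonzero patterns
$$
\begin{pmatrix}
\times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times \\
  0    & \times & \times & \times & \times \\
  0    &   0    & \times & \times & \times \\
  0    &   0    &   0    & \times & \times
\end{pmatrix}
\quad \text{(upper Hessenberg)},
\qquad
\begin{pmatrix}
\times & \times &   0    &   0    &   0    \\
\times & \times & \times &   0    &   0    \\
  0    & \times & \times & \times &   0    \\
  0    &   0    & \times & \times & \times \\
  0    &   0    &   0    & \times & \times
\end{pmatrix}
\quad \text{(tridiagonal)}.
$$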
Input: ($A$, $V$, $H$) with $AV = VH$, $V^H V = I$, $H$ upper Hessenberg;
For $j = 1, 2, 3, \ldots$ until convergence,
   (a1.1) Select a shift $\mu \leftarrow \mu_j$ ;
   (a1.2) Factor $[Q, R] = \mathrm{qr}(H - \mu I)$ ;
   (a1.3) $H \leftarrow Q^H H Q$ ; $V \leftarrow V Q$ ;
End_For
The QR-iteration is shown in Figure 4.2.
The QR factorization
of $H - \mu I$ is given by the unitary matrix $Q$ and upper
triangular $R$. It is easy to see that $A$ is unitarily similar to $H$
throughout the course of this iteration. The iteration is continued until
the subdiagonal elements of $H$ converge to zero, i.e. until a Schur
decomposition has been (approximately) obtained. In the standard implicitly
shifted QR-iteration, the unitary matrix $Q$ is never actually formed.
It is computed indirectly as a product of Givens or
Householder transformations through a ``bulge chase'' process.
The elegant details of an efficient and stable implementation would be too much
of a digression here. They may be found in [16,28,48].
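For concreteness, here is a dense NumPy sketch in the spirit of Figure 4.2 (an added illustration, not ARPACK or LAPACK code): it performs the initial Hessenberg reduction, then repeats the select-shift, factor, and update cycle on the active leading block, using the simple Rayleigh-quotient shift, forming $Q$ explicitly rather than performing the implicit bulge chase, and deflating whenever a subdiagonal entry becomes negligible.
\begin{verbatim}
import numpy as np
from scipy.linalg import hessenberg

def shifted_qr(A, tol=1e-12, maxiter=1000):
    """Explicitly shifted QR iteration (didactic sketch of Figure 4.2).

    Returns (H, V) with A V ~= V H, V unitary and H numerically upper
    triangular; a production code uses the implicit bulge chase instead.
    """
    H, V = hessenberg(np.asarray(A, dtype=complex), calc_q=True)  # A V = V H
    n = H.shape[0]
    for m in range(n, 1, -1):                 # active leading block H[:m, :m]
        for _ in range(maxiter):
            if abs(H[m-1, m-2]) <= tol * (abs(H[m-1, m-1]) + abs(H[m-2, m-2])):
                H[m-1, m-2] = 0.0             # deflate a converged eigenvalue
                break
            mu = H[m-1, m-1]                  # (a1.1) select a shift
            Q, R = np.linalg.qr(H[:m, :m] - mu * np.eye(m))  # (a1.2) factor
            H[:m, :m] = Q.conj().T @ H[:m, :m] @ Q           # (a1.3) H <- Q^H H Q
            H[:m, m:] = Q.conj().T @ H[:m, m:]               # keep full H consistent
            V[:, :m] = V[:, :m] @ Q                          #         V <- V Q
    return H, V

# Complex test matrix; real matrices with complex conjugate eigenvalue pairs
# would require the double-shift strategy discussed in the references.
rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
H, V = shifted_qr(A)
print(np.linalg.norm(A @ V - V @ H))          # ~ 0 : unitary similarity maintained
print(np.linalg.norm(np.tril(H, -1)))         # ~ 0 : subdiagonals driven to zero
\end{verbatim}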
The convergence behavior of this iteration is fascinating.
The columns of $V$ converge to Schur
vectors at various rates. These rates are fundamentally linked to
the simple power method and its rapidly convergent variant,
inverse iteration. See [47] for further information and references.
Despite the extremely fast rate of convergence and the efficient use of
storage, the implicitly shifted QR method
is not suitable for large
scale problems and it has proven to be extremely difficult to parallelize.
Large scale problems are typically sparse or structured,
so that a matrix-vector product may be
computed with time and storage proportional to $n$ rather than $n^2$.
A method based upon full similarity transformations quickly destroys
this structure. Storage and operation counts per iteration become order $n^2$.
Hence, there is considerable motivation for methods that only
require matrix-vector products with the original $A$.
The power method provides a simple means to find the dominant eigenvalue (of largest magnitude) of a matrix without performing matrix factorizations or dense similarity transformations. It has the drawback that only one eigenpair may be found and that convergence may be slow or non-existent. Deflation schemes and/or block variants may be employed to overcome the first problem, and spectral transformations may be used to accelerate convergence and focus on selected eigenvalues. However, there is another very elegant way to extract additional eigenvalue information from the power method sequence.
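Before turning to that idea, a minimal sketch of the power method itself (an added illustration; the normalization, Rayleigh-quotient estimate, and stopping test are one simple set of choices) shows that only matrix-vector products with $A$ are needed, so $A$ may be kept in any sparse or structured form:
\begin{verbatim}
import numpy as np
import scipy.sparse as sp

def power_method(A, tol=1e-8, maxiter=1000, seed=0):
    """Power method: approximate dominant eigenpair (lam, v) of A.

    Only the matrix-vector product A @ v is required, so A may be any
    sparse matrix or operator; one matvec is performed per iteration.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(maxiter):
        w = A @ v                          # the only use of A
        lam = np.vdot(v, w)                # Rayleigh-quotient eigenvalue estimate
        if np.linalg.norm(w - lam * v) <= tol * abs(lam):
            break
        v = w / np.linalg.norm(w)
    return lam, v

# Example: sparse symmetric test matrix with a deliberately well separated
# dominant eigenvalue (a random sparse part plus a large diagonal spike).
n = 2000
R = sp.random(n, n, density=5.0 / n, random_state=1, format='csr')
A = (R + R.T) * 0.5 + sp.diags(np.r_[20.0, np.zeros(n - 1)])
lam, v = power_method(A)
print(lam, np.linalg.norm(A @ v - lam * v))
\end{verbatim}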