8.11 Approximation of Eigenvalues
INTRODUCTION
Recall that to find the eigenvalues of a matrix A we must find the roots of the polynomial equation p(λ) = det(A − λI) = 0. If A is a large matrix, the computations involved in obtaining this characteristic equation can be overwhelming. Moreover, even if we can find the exact characteristic equation, it is likely that we would have to use a numerical procedure to approximate its roots. There are alternative numerical procedures for approximating eigenvalues and the corresponding eigenvectors. The procedure that we shall consider in this section deals with matrices that possess a dominant eigenvalue.
A Definition
A dominant eigenvalue of a square matrix A is one whose absolute value is greater than the absolute value of each of the remaining eigenvalues. We formalize this idea in the next definition.
DEFINITION 8.11.1 Dominant Eigenvalue
Let λ1, λ2, . . . , λk, . . . , λn denote the eigenvalues of an n × n matrix A. The eigenvalue λk is said to be the dominant eigenvalue of A if
|λk| > |λi|, i = 1, 2, . . . , n, but i ≠ k.
An eigenvector corresponding to λk is called a dominant eigenvector of A.
In Example 2 of Section 8.8, we saw that the eigenvalues of the matrix

$$A = \begin{pmatrix} 1 & 2 & 1 \\ 6 & -1 & 0 \\ -1 & -2 & -1 \end{pmatrix}$$

are λ1 = 0, λ2 = −4, and λ3 = 3. Since |−4| > |0| and |−4| > |3|, we see that λ2 = −4 is the dominant eigenvalue of A.
EXAMPLE 1 Matrices with No Dominant Eigenvalue
(a) A 2 × 2 matrix A with eigenvalues λ1 = −2 and λ2 = 2 has no dominant eigenvalue, since |λ1| = |λ2| = 2.
(b) A 3 × 3 matrix A with eigenvalues λ1 = 2, λ2 = λ3 = 5 likewise has no dominant eigenvalue: although |5| > |2|, the eigenvalue 5 is repeated, and so the strict inequality in Definition 8.11.1 cannot hold for every i ≠ k. ≡
Power Method
Suppose the n × n matrix A has a dominant eigenvalue λ1. The iterative technique for approximating a corresponding dominant eigenvector is due to the Austrian-born mathematician Richard von Mises (1883–1953) and is called the power method. The basic idea of this procedure is first to compute an approximation to a dominant eigenvector by means of the sequence
$$X_m = AX_{m-1}, \quad m = 1, 2, 3, \dots \tag{1}$$
where X0 represents a nonzero n × 1 vector that is an initial guess or approximation for the eigenvector we seek. Iterating (1) gives
$$X_1 = AX_0, \quad X_2 = AX_1 = A^2X_0, \quad X_3 = AX_2 = A^3X_0, \quad \dots, \quad X_m = A^mX_0. \tag{2}$$
Under certain circumstances, for large values of m the vector defined by $X_m = A^mX_0$ is an approximation to a dominant eigenvector. To see this, let us make some further assumptions about the matrix A. Let us assume that the eigenvalues of A are such that
|λ1| > |λ2| ≥ |λ3| ≥ · · · ≥ |λn|
and that the corresponding n eigenvectors K1, K2, . . . , Kn are linearly independent. Because of this last assumption, K1, K2, . . . , Kn can serve as a basis for Rⁿ (see Section 7.6). Thus, for any nonzero n × 1 vector X0, constants c1, c2, . . . , cn can be found such that
$$X_0 = c_1K_1 + c_2K_2 + \cdots + c_nK_n. \tag{3}$$
We shall also assume X0 is chosen so that c1 ≠ 0. Multiplying (3) by A then gives
$$AX_0 = c_1AK_1 + c_2AK_2 + \cdots + c_nAK_n.$$
Since AK1 = λ1K1, AK2 = λ2K2, . . . , AKn = λnKn, the last line becomes
$$AX_0 = c_1\lambda_1K_1 + c_2\lambda_2K_2 + \cdots + c_n\lambda_nK_n. \tag{4}$$
Multiplying (4) by A gives

$$A^2X_0 = c_1\lambda_1AK_1 + c_2\lambda_2AK_2 + \cdots + c_n\lambda_nAK_n = c_1\lambda_1^2K_1 + c_2\lambda_2^2K_2 + \cdots + c_n\lambda_n^2K_n.$$

Continuing in this manner, we find that

$$A^mX_0 = c_1\lambda_1^mK_1 + c_2\lambda_2^mK_2 + \cdots + c_n\lambda_n^mK_n \tag{5}$$

$$\phantom{A^mX_0} = \lambda_1^m\left[c_1K_1 + c_2\left(\frac{\lambda_2}{\lambda_1}\right)^{\!m}K_2 + \cdots + c_n\left(\frac{\lambda_n}{\lambda_1}\right)^{\!m}K_n\right]. \tag{6}$$
Since |λ1| > |λi| for i = 2, 3, . . . , n, we have |λi/λ1| < 1 and consequently $\lim_{m\to\infty}(\lambda_i/\lambda_1)^m = 0$. Therefore, as m → ∞, we see from (6) that
$$A^mX_0 \approx c_1\lambda_1^mK_1. \tag{7}$$
Since a nonzero constant multiple of an eigenvector is an eigenvector, it follows from (7) that for large values of m and under all the assumptions that were made, the n × 1 matrix $X_m = A^mX_0$ is an approximation to a dominant eigenvector associated with the dominant eigenvalue λ1. The rate at which this method converges depends on the quotient λ2/λ1: if |λ2/λ1| is very small, then convergence is fast, whereas if |λ2/λ1| is close to one, convergence is slow. Of course, this information is not as useful as it seems, because we generally do not know the eigenvalues in advance.
It remains, then, to approximate the dominant eigenvalue itself. This can be done by means of the inner product. If K is an eigenvector of a matrix A corresponding to the eigenvalue λ, then AK = λK, and so AK · K = λ(K · K). Since AK · K and K · K are scalars, we can solve this last equation for λ:

$$\lambda = \frac{AK\cdot K}{K\cdot K}.$$
Hence, if $X_m = A^mX_0$ is an approximation to a dominant eigenvector obtained by the iteration of (1), then the dominant eigenvalue λ1 can be approximated by the quotient

$$\lambda_1 \approx \frac{AX_m\cdot X_m}{X_m\cdot X_m}. \tag{8}$$
The quotient in (8) is sometimes referred to as the Rayleigh quotient.
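Although these computations are easily done by hand for small matrices, the iteration (1) together with the Rayleigh quotient (8) is also simple to program. The following is a minimal sketch in Python with NumPy; the function name, the iteration count, and the reuse of the matrix from Example 2 below are our own illustrative choices, not prescriptions from the text.

```python
import numpy as np

def power_method(A, x0, m):
    """Iterate X_k = A X_(k-1) as in (1), then estimate the dominant
    eigenvalue with the Rayleigh quotient (8)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(m):
        x = A @ x                   # one step of (1)
    Ax = A @ x
    lam = (Ax @ x) / (x @ x)        # Rayleigh quotient (8)
    return lam, x

A = np.array([[4.0, 2.0],
              [3.0, -1.0]])         # the matrix of Example 2 below
lam, x = power_method(A, [1.0, 1.0], 7)
print(lam)   # about 5.0006
print(x)     # [89304. 44588.], a multiple of a dominant eigenvector
```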
EXAMPLE 2 Using the Power Method
Use the power method to approximate the dominant eigenvalue and a corresponding dominant eigenvector of

$$A = \begin{pmatrix} 4 & 2 \\ 3 & -1 \end{pmatrix}.$$
SOLUTION
Since we do not know the eigenvalues and eigenvectors, we may as well take

$$X_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

Now the first two terms of the sequence of vectors defined by (1) are

$$X_1 = AX_0 = \begin{pmatrix} 6 \\ 2 \end{pmatrix}, \qquad X_2 = AX_1 = \begin{pmatrix} 28 \\ 16 \end{pmatrix}.$$
The next five vectors obtained in this manner are given in the following table:

 m       3             4             5               6                7
 Xm   (144, 68)^T  (712, 364)^T  (3576, 1772)^T  (17848, 8956)^T  (89304, 44588)^T
At this point it may not appear that we are getting anywhere, since the entries of the vectors in the table appear to be growing without bound. But bear in mind that (7) indicates we are obtaining a constant multiple of a vector. If the power method converges, then by factoring the entry with the largest absolute value from Xm (for a large value of m), we should obtain a reasonable approximation to a dominant eigenvector. From the table,
$$X_7 = 89{,}304\begin{pmatrix} 1 \\ 0.4993 \end{pmatrix}. \tag{9}$$

It appears then that the vectors are approaching scalar multiples of

$$\begin{pmatrix} 1 \\ 0.5 \end{pmatrix}.$$
We now use (8) to approximate the dominant eigenvalue λ1. First we have

$$AX_7 = \begin{pmatrix} 4 & 2 \\ 3 & -1 \end{pmatrix}\begin{pmatrix} 89{,}304 \\ 44{,}588 \end{pmatrix} = \begin{pmatrix} 446{,}392 \\ 223{,}324 \end{pmatrix},$$

so that AX7 · X7 = 49,822,161,680 and X7 · X7 = 9,963,294,160. Finally, we have

$$\lambda_1 \approx \frac{AX_7\cdot X_7}{X_7\cdot X_7} = \frac{49{,}822{,}161{,}680}{9{,}963{,}294{,}160} = 5.0006.$$
The reader should use the procedure of Section 8.8 to verify that the eigenvalues and corresponding eigenvectors of A are λ1 = 5, λ2 = −2, K1 = (2, 1)^T, and K2 = (1, −3)^T. ≡
Scaling
As we have just seen, iteration of (1) often results in vectors whose entries become very large in absolute value. Large numbers can, of course, cause a problem if a computer is used to carry out a great number of iterations. The result in (9) suggests that one way around this difficulty is to use a scaled-down vector at each step of the iteration. To do this, we simply multiply the vector AX0 by the reciprocal of its entry of largest absolute value. The resulting vector, whose entries now have absolute value less than or equal to one, we label X1. We repeat the process with the vector AX1 to obtain the scaled-down vector X2, and so on.
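The scaled iteration is equally easy to program. Here is a brief Python/NumPy sketch in the same spirit as the earlier one (again, the names and the reuse of the Example 2 matrix are our own choices); dividing by the entry of largest absolute value keeps every entry of each iterate between −1 and 1:

```python
import numpy as np

def power_method_scaled(A, x0, m):
    """Power method with scaling: divide each iterate by its entry
    of largest absolute value, then apply the Rayleigh quotient (8)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(m):
        x = A @ x
        x = x / x[np.argmax(np.abs(x))]   # scaled-down vector
    Ax = A @ x
    return (Ax @ x) / (x @ x), x

A = np.array([[4.0, 2.0],
              [3.0, -1.0]])
lam, x = power_method_scaled(A, [1.0, 1.0], 7)
print(lam)   # about 5.0006
print(x)     # about [1.  0.4993], cf. Example 3
```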
EXAMPLE 3 Example 2 Revisited
Repeat the iterations of Example 2 using scaled-down vectors.
SOLUTION
From

$$AX_0 = \begin{pmatrix} 6 \\ 2 \end{pmatrix} \quad\text{we define}\quad X_1 = \frac{1}{6}\begin{pmatrix} 6 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0.3333 \end{pmatrix}.$$

From

$$AX_1 = \begin{pmatrix} 4.6667 \\ 2.6667 \end{pmatrix} \quad\text{we define}\quad X_2 = \frac{1}{4.6667}\begin{pmatrix} 4.6667 \\ 2.6667 \end{pmatrix} = \begin{pmatrix} 1 \\ 0.5714 \end{pmatrix}.$$
We continue in this manner to construct the following table:

 m        3              4              5              6              7
 Xm   (1, 0.4722)^T  (1, 0.5112)^T  (1, 0.4955)^T  (1, 0.5018)^T  (1, 0.4993)^T
In contrast to the table in Example 2, it is apparent from this table that the vectors are approaching (1, 0.5)^T. ≡
Method of Deflation
After we have found the dominant eigenvalue λ1 of a matrix A it may still be necessary to find nondominant eigenvalues. The procedure we shall consider next is a modification of the power method and is called the method of deflation. We will limit the discussion to the case where A is a symmetric matrix.
Suppose λ1 and K1 are, respectively, the dominant eigenvalue and a corresponding normalized eigenvector* (that is, ||K1|| = 1) of a symmetric matrix A. Furthermore, suppose the eigenvalues of A are such that
|λ1| > |λ2| > |λ3| ≥ . . . ≥ |λn|.
It can be proved that the matrix

$$B = A - \lambda_1K_1K_1^{T} \tag{10}$$
has eigenvalues 0, λ2, λ3, . . . , λn and that eigenvectors of B are also eigenvectors of A. Note that λ2 is now the dominant eigenvalue of B. We apply the power method to B to approximate λ2 and a corresponding eigenvector.
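Under these assumptions, (10) is a single rank-one update of A and is easy to form numerically. A minimal sketch (the function name is our own; power_method_scaled refers to the earlier sketch):

```python
import numpy as np

def deflate(A, lam1, k1):
    """Form B = A - lam1 * K1 K1^T as in (10).
    A is assumed symmetric; K1 is normalized here for safety."""
    k1 = np.asarray(k1, dtype=float)
    k1 = k1 / np.linalg.norm(k1)        # ensure ||K1|| = 1
    return A - lam1 * np.outer(k1, k1)
```

Applying power_method_scaled from the earlier sketch to B then yields an approximation to λ2 and a corresponding eigenvector.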
EXAMPLE 4 Using the Method of Deflation
Use the method of deflation to approximate the eigenvalues of a 3 × 3 symmetric matrix A.
SOLUTION
We begin by using the power method with scaling to find the dominant eigenvalue and a corresponding eigenvector of A. Choosing an initial nonzero vector X0, we compute the scaled vectors as in Example 3.
The scaled vectors X3 to X10 are given in the following table:
Utilizing X10 and (8), we find
$$\lambda_1 \approx \frac{AX_{10}\cdot X_{10}}{X_{10}\cdot X_{10}} = 2.9997.$$
It appears that the dominant eigenvalue is λ1 = 3, with a corresponding eigenvector K given by the limiting scaled vector.
Our next task is to construct the matrix B defined by (10), using λ1 = 3 and the normalized eigenvector K1 = K/||K||.
We now use the power method with scaling to find the dominant eigenvalue of B. With the same initial vector X0, the results are summarized in the following table:
Utilizing X7 and (8), we find
$$\lambda_2 \approx \frac{BX_7\cdot X_7}{X_7\cdot X_7} = -1.9996.$$
From these computations it seems apparent that the dominant eigenvalue of B is λ2 = −2, with a corresponding eigenvector K given by the limiting scaled vector.
To find the last eigenvalue of A, we repeat the deflation process to find the dominant eigenvalue and a corresponding eigenvector of the matrix

$$C = B - \lambda_2K_2K_2^{T},$$

where K2 is a normalized eigenvector of B corresponding to λ2. The student is encouraged to verify that λ3 = 1. ≡
Example 4 is somewhat artificial since eigenvalues of a matrix need not be “nice” numbers such as 3, −2, and 1. Moreover, we used the exact values of the dominant eigenvalues λ1 and λ2 in the formation of the matrices B and C. In practice, of course, we may have to be content with working with approximations to the dominant eigenvalue λ1 and a corresponding normalized dominant eigenvector K1 of A. If these approximations are used in (10), an error is introduced in the computation of the matrix B, and so more errors may be introduced in the computation of its dominant eigenvalue λ2 and dominant eigenvector K2. If λ2 and K2 are now used to construct the matrix C, it seems reasonable to conclude that errors are being compounded. In other words, the method of deflation can become increasingly inaccurate as more eigenvalues are computed.
Inverse Power Method
In some applied problems we are more interested in approximating the eigenvalue of a matrix A of smallest absolute value than in the dominant eigenvalue. If A is nonsingular, then its eigenvalues are all nonzero, and if λ1, λ2, . . . , λn are the eigenvalues of A, then 1/λ1, 1/λ2, . . . , 1/λn are the eigenvalues of $A^{-1}$. Now if the eigenvalues of A can be arranged in the order
|λ1| ≥ |λ2| ≥ |λ3| ≥ . . . ≥ |λn−1| > |λn|,
then we see that 1/λn is the dominant eigenvalue of $A^{-1}$. By applying the power method to $A^{-1}$, we approximate its dominant eigenvalue and, by taking the reciprocal, find the eigenvalue of A of least magnitude. This is called the inverse power method. See Problems 11–13 in Exercises 8.11.
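A sketch of the inverse power method in the same style as the earlier ones. For the small matrices of this section we may simply form $A^{-1}$; for larger problems one would instead factor A once and solve AY = X at each step rather than inverting:

```python
import numpy as np

def inverse_power_method(A, x0, m):
    """Apply the scaled power method to A^(-1); the reciprocal of its
    dominant eigenvalue 1/lambda_n is the eigenvalue of A of least
    magnitude."""
    Ainv = np.linalg.inv(A)
    x = np.asarray(x0, dtype=float)
    for _ in range(m):
        x = Ainv @ x
        x = x / x[np.argmax(np.abs(x))]
    y = Ainv @ x
    mu = (y @ x) / (x @ x)   # dominant eigenvalue of A^(-1)
    return 1.0 / mu          # eigenvalue of A of least magnitude
```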
Review (6) and Theorems 8.8.2 and 8.8.3 on pages 430–431.
8.11 Exercises Answers to selected odd-numbered problems begin on page ANS-20.
To the Instructor/Student: A calculator with matrix capabilities or a CAS would be useful in the following problems.
Each matrix in Problems 1–10 has a dominant eigenvalue.
In Problems 1 and 2, use the power method as illustrated in Example 3 to find the dominant eigenvalue and a corresponding dominant eigenvector of the given matrix.
In Problems 3–6, use the power method with scaling to find the dominant eigenvalue and a corresponding eigenvector of the given matrix.
In Problems 7–10, use the method of deflation to find the eigenvalues of the given matrix.
In Problems 11 and 12, use the inverse power method to find the eigenvalue of least magnitude for the given matrix.
13. In Example 4 of Section 3.9 we saw that the deflection curve of a thin column under an applied load P was defined by the boundary-value problem

$$EI\frac{d^2y}{dx^2} + Py = 0, \quad y(0) = 0, \quad y(L) = 0.$$
In this problem we show how to apply matrix techniques to compute the smallest critical load.
Let the interval [0, L] be divided into n subintervals of length h = L/n, and let $x_i = ih$, i = 0, 1, . . . , n. For small values of h it follows from (6) of Section 6.5 that

$$y''(x_i) \approx \frac{y_{i+1} - 2y_i + y_{i-1}}{h^2},$$

where $y_i = y(x_i)$.
(a) Show that the differential equation can be replaced by the difference equation

$$EI(y_{i+1} - 2y_i + y_{i-1}) + Ph^2y_i = 0, \quad i = 1, 2, \dots, n-1.$$
(b) Show that for n = 4 the difference equation in part (a) yields the system of linear equations

$$\begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \frac{PL^2}{16EI}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}.$$

Note that this system has the form of the eigenvalue problem AY = λY, where λ = PL²/(16EI).
(c) Find $A^{-1}$.
(d) Use the inverse power method to find, to two decimal places, the eigenvalue of A of least magnitude.
(e) Use the result of part (d) to compute the approximate smallest critical load. Compare your answer with that given in Section 3.9.
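For parts (c) and (d), a short program can carry the inverse power method through. The following self-contained Python/NumPy sketch (our own illustrative code, following the sketches given earlier in the section) sets up the matrix from part (b) and estimates its smallest eigenvalue:

```python
import numpy as np

# The tridiagonal matrix A from part (b) of Problem 13 (n = 4)
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

Ainv = np.linalg.inv(A)
x = np.ones(3)
for _ in range(50):                 # scaled power method applied to A^(-1)
    x = Ainv @ x
    x = x / x[np.argmax(np.abs(x))]
y = Ainv @ x
lam = (x @ x) / (y @ x)             # eigenvalue of A of least magnitude
print(lam)
# With lam = P L^2 / (16 EI), the approximate smallest critical load is
# P = 16 * lam * EI / L^2, to be compared with the exact pi^2 EI / L^2.
```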
14. Suppose the column in Problem 13 is tapered so that the moment of inertia I of a cross section varies linearly from $I(0) = I_0 = 0.002$ to $I(L) = I_L = 0.001$.
(a) Use the difference equation in part (a) of Problem 13 with n = 4 to set up a system of equations analogous to that given in part (b).
(b) Proceed as in Problem 13 to find an approximation to the smallest critical load.
Computer Lab Assignment
15. In Section 8.9 we saw how to compute a power $A^m$ of an n × n matrix A. Consult the documentation for the CAS you have on hand for the command to compute the power $A^m$. (In Mathematica the command is MatrixPower[A, m].) The matrix A possesses a dominant eigenvalue.
(a) Use a CAS to compute $A^{10}$.
(b) Now use (2), $X_m = A^mX_0$, with m = 10 and an initial nonzero vector X0, to compute $X_{10}$. In the same manner compute $X_{12}$. Then proceed as in (9) to find the approximate dominant eigenvector K.
(c) If K is an eigenvector of A, then AK = λK. Use this definition and the result in part (b) to find the dominant eigenvalue.
*See Example 3 in Section 8.10.