8.12 Diagonalization
INTRODUCTION
In Chapter 10 we shall see that eigenvalues, eigenvectors, orthogonal matrices, and the topic of the present section, diagonalization, are important tools in the solution of systems of linear first-order differential equations. The basic question that we shall consider in this section is:
For an n × n matrix A, can we find an n × n nonsingular matrix P such that P−1AP = D is a diagonal matrix?
A Special Notation
We begin by introducing a shorthand notation for the product of two n × n matrices. This notation will be useful in proving the principal theorem of this section. To illustrate, suppose A and B are 2 × 2 matrices. Then

$$AB = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}. \quad (1)$$

If we write the columns of the matrix B as the vectors X1 = (b11, b21)^T and X2 = (b12, b22)^T, then column 1 and column 2 of the product (1) can be expressed as the products AX1 and AX2. That is,

$$AB = (AX_1 \;\; AX_2).$$
In general, for two n × n matrices
AB = A(X1 X2 . . . Xn) = (AX1 AX2 . . . AXn), (2)
where X1, X2, . . . , Xn are the columns of B.
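This column-by-column view of the product is easy to check numerically. The sketch below uses made-up 2 × 2 matrices (none of the entries come from the text) and verifies that column j of AB equals AXj, where Xj is column j of B:

```python
# Identity (2) sketched: the j-th column of AB is A times the j-th column
# of B.  Matrices are lists of rows; A and B are arbitrary examples.

def matmul(A, B):
    """Plain matrix product of two square matrices (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def column(M, j):
    """Column j of M as a flat list."""
    return [row[j] for row in M]

def matvec(A, x):
    """Product of A with a column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
AB = matmul(A, B)

# Column j of AB equals A X_j, where X_j is column j of B.
for j in range(2):
    assert column(AB, j) == matvec(A, column(B, j))
```

This is exactly the observation that makes the equation AP = PD below readable one column at a time.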
Diagonalizable Matrix
If an n × n nonsingular matrix P can be found so that P−1AP = D is a diagonal matrix, then we say that the n × n matrix A can be diagonalized, or is diagonalizable, and that P diagonalizes A.
To discover how to diagonalize a matrix, let us assume for the sake of discussion that A is a 3 × 3 diagonalizable matrix. Then there exists a 3 × 3 nonsingular matrix P such that P−1AP = D or AP = PD, where D is a diagonal matrix

$$D = \begin{pmatrix} d_{11} & 0 & 0 \\ 0 & d_{22} & 0 \\ 0 & 0 & d_{33} \end{pmatrix}.$$
If P1, P2, and P3 denote the columns of P, then it follows from (2) that the equation AP = PD is the same as
(AP1 AP2 AP3) = (d11P1 d22P2 d33P3)
or AP1 = d11P1, AP2 = d22P2, AP3 = d33P3.
But by Definition 8.8.1 we see that d11, d22, and d33 are eigenvalues of A associated with the eigenvectors P1, P2, and P3. These eigenvectors are linearly independent, since P was assumed to be nonsingular.
We have just discovered, in a particular case, that if A is diagonalizable, then the columns of the diagonalizing matrix P consist of linearly independent eigenvectors of A. Since we wish to diagonalize a matrix, we are really concerned with the validity of the converse of the last sentence. In other words, if we can find n linearly independent eigenvectors of an n × n matrix A and form an n × n matrix P whose columns consist of these eigenvectors, then does P diagonalize A? The answer is yes and will be proved in the next theorem.
THEOREM 8.12.1 Sufficient Condition for Diagonalizability
If an n × n matrix A has n linearly independent eigenvectors K1, K2, . . . , Kn, then A is diagonalizable.
PROOF:
We shall prove the theorem in the case when A is a 3 × 3 matrix. Let K1, K2, and K3 be linearly independent eigenvectors corresponding to eigenvalues λ1, λ2, and λ3; that is,
AK1 = λ1K1, AK2 = λ2K2, and AK3 = λ3K3. (3)
Next form the 3 × 3 matrix P with column vectors K1, K2, and K3: P = (K1 K2 K3). P is nonsingular since, by hypothesis, the eigenvectors are linearly independent. Now using (2) and (3), we can write the product AP as

$$AP = (AK_1 \;\; AK_2 \;\; AK_3) = (\lambda_1 K_1 \;\; \lambda_2 K_2 \;\; \lambda_3 K_3) = (K_1 \;\; K_2 \;\; K_3)\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} = PD.$$

Multiplying the last equation on the left by P−1 then gives P−1AP = D. ≡
Note carefully in the proof of Theorem 8.12.1 that the entries in the diagonalized matrix are the eigenvalues of A and the order in which these numbers appear on the diagonal of D corresponds to the order in which the eigenvectors are used as columns in the matrix P.
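Theorem 8.12.1 and the remark above can be illustrated with a small computation. The matrix A below is a made-up example, not one from the text; it has eigenvalues 1 and 3 with eigenvectors K1 = (1, −1)^T and K2 = (1, 1)^T, so P = (K1 K2) should yield P−1AP = diag(1, 3), with the eigenvalues in the same order as the columns of P:

```python
# Sketch of Theorem 8.12.1 on a made-up matrix.  A = [[2, 1], [1, 2]] has
# A K1 = 1*K1 for K1 = (1, -1) and A K2 = 3*K2 for K2 = (1, 1).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 1], [1, 2]]
P = [[1, 1],              # columns are K1 = (1, -1) and K2 = (1, 1)
     [-1, 1]]
P_inv = [[0.5, -0.5],     # inverse of P, written out by hand
         [0.5, 0.5]]

D = matmul(P_inv, matmul(A, P))
assert D == [[1.0, 0.0], [0.0, 3.0]]   # diag(1, 3): eigenvalues, in column order
```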
In view of the motivational discussion preceding Theorem 8.12.1, we can state the general result:
THEOREM 8.12.2 Criterion for Diagonalizability
An n × n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors.
We saw in Section 8.8 that an n × n matrix A has n linearly independent eigenvectors whenever it possesses n distinct eigenvalues.
THEOREM 8.12.3 Sufficient Condition for Diagonalizability
If an n × n matrix A has n distinct eigenvalues, it is diagonalizable.
EXAMPLE 1 Diagonalizing a Matrix
Diagonalize A = if possible.
SOLUTION
First we find the eigenvalues of A. The characteristic equation is det(A − λI) = λ² − 5λ + 4 = (λ − 1)(λ − 4) = 0. The eigenvalues are λ1 = 1 and λ2 = 4. Since the eigenvalues are distinct, we know from Theorem 8.12.3 that A is diagonalizable.
Next, the eigenvectors of A corresponding to λ1 = 1 and λ2 = 4 are, respectively, K1 = and K2 = .
Using these vectors as columns, we find that the nonsingular matrix P that diagonalizes A is
P = (K1 K2) = .
Now P−1 = , and so carrying out the multiplication gives P−1AP = D. ≡
In Example 1, had we reversed the columns in P, that is, P = (K2 K1), then the diagonal matrix would have been

$$D = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}.$$
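The effect of reordering the columns of P can be checked on a made-up matrix (not the one from Example 1). A = [[2, 1], [1, 2]] has eigenvalues 1 and 3 with eigenvectors K1 = (1, −1)^T and K2 = (1, 1)^T; using P = (K2 K1) puts the eigenvalues on the diagonal in the order 3, 1:

```python
# Reversing the columns of P reverses the order of the eigenvalues in D.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 1], [1, 2]]
P2 = [[1, 1],              # columns are K2 = (1, 1), then K1 = (1, -1)
      [1, -1]]
P2_inv = [[0.5, 0.5],      # inverse of P2, written out by hand
          [0.5, -0.5]]

D = matmul(P2_inv, matmul(A, P2))
assert D == [[3.0, 0.0], [0.0, 1.0]]   # now diag(3, 1)
```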
EXAMPLE 2 Diagonalizing a Matrix
Consider the matrix A = . We saw in Example 2 of Section 8.8 that the eigenvalues and corresponding eigenvectors are λ1 = 0, λ2 = −4, λ3 = 3, K1 = , K2 = , K3 = .
Since the eigenvalues are distinct, A is diagonalizable. We form the matrix
P = (K1 K2 K3) = .
Matching the eigenvalues with the order in which the eigenvectors appear in P, we know that the diagonal matrix will be

$$D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 0 & 3 \end{pmatrix}.$$
Now from either of the methods of Section 8.6 we find P−1 = , and so P−1AP = D. ≡
The condition that an n × n matrix A have n distinct eigenvalues is sufficient—that is, a guarantee—that A is diagonalizable. The condition that there be n distinct eigenvalues is not a necessary condition for the diagonalization of A. In other words, if the matrix A does not have n distinct eigenvalues, then it may or may not be diagonalizable.
A matrix with repeated eigenvalues could be diagonalizable.
EXAMPLE 3 A Matrix That Is Not Diagonalizable
In Example 3 of Section 8.8 we saw that the matrix A = has a repeated eigenvalue λ1 = λ2 = 5. Correspondingly, we were able to find only a single eigenvector K1 = . We conclude from Theorem 8.12.2 that A is not diagonalizable. ≡
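The failure mode in Example 3 can be sketched on a made-up matrix with the same defect (this is not the matrix of the example). A = [[5, 1], [0, 5]] has the repeated eigenvalue 5, but (A − 5I)K = 0 forces the second entry of K to be zero, so every eigenvector is a multiple of (1, 0)^T and no two eigenvectors are linearly independent:

```python
# A matrix with a repeated eigenvalue and only a one-dimensional eigenspace.

A = [[5, 1], [0, 5]]
lam = 5
B = [[A[0][0] - lam, A[0][1]],     # B = A - 5I = [[0, 1], [0, 0]]
     [A[1][0], A[1][1] - lam]]

# B K = 0 reads: k2 = 0 (first row), 0 = 0 (second row), so the
# eigenspace is spanned by the single vector K = (1, 0).
assert B == [[0, 1], [0, 0]]
k = [1, 0]
assert [sum(b * ki for b, ki in zip(row, k)) for row in B] == [0, 0]
```

By Theorem 8.12.2 such a matrix cannot be diagonalized.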
EXAMPLE 4 Repeated Eigenvalues Yet Diagonalizable
The eigenvalues of the matrix A = are λ1 = −1 and λ2 = λ3 = 1.
For λ1 = −1 we find K1 = . For the repeated eigenvalue λ2 = λ3 = 1, Gauss–Jordan elimination gives

$$\left(\begin{array}{ccc|c} 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).$$

From the last matrix we see that k1 − k2 = 0. Since k3 is not determined from the last matrix, we can choose its value arbitrarily. The choice k2 = 1 gives k1 = 1. If we then pick k3 = 0, we get the eigenvector

$$K_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}.$$
The alternative choice k2 = 0 gives k1 = 0. If k3 = 1, we get another eigenvector corresponding to λ2 = λ3 = 1:

$$K_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
Since the eigenvectors K1, K2, and K3 are linearly independent, a matrix that diagonalizes A is
P = .
Matching the eigenvalues with the eigenvectors in P, we have P−1AP = D, where

$$D = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$ ≡
Symmetric Matrices
An n × n symmetric matrix A with real entries can always be diagonalized. This is a consequence of the fact that we can always find n linearly independent eigenvectors for such a matrix. Moreover, since we can find n mutually orthogonal eigenvectors, we can use an orthogonal matrix P to diagonalize A. In this case A is said to be orthogonally diagonalizable.
THEOREM 8.12.4 Criterion for Orthogonal Diagonalizability
An n × n matrix A can be orthogonally diagonalized if and only if A is symmetric.
PARTIAL PROOF:
We shall prove the necessity part (that is, the “only if” part) of the theorem. Assume an n × n matrix A is orthogonally diagonalizable. Then there exists an orthogonal matrix P such that P−1AP = D or A = PDP−1. Since P is orthogonal, P−1 = PT and consequently A = PDPT. But from (i) and (iii) of Theorem 8.1.2 and the fact that a diagonal matrix is symmetric, we have
AT = (PDPT)T = (PT)TDTPT = PDPT = A.
Thus, A is symmetric. ≡
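Theorem 8.12.4 can be illustrated on a made-up symmetric matrix (not one from the text). A = [[2, 1], [1, 2]] has orthogonal eigenvectors (1, −1)^T and (1, 1)^T; normalizing them gives an orthogonal P, for which P−1 = PT and PTAP is the diagonal matrix of eigenvalues:

```python
import math

# Orthogonal diagonalization of a made-up symmetric matrix.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    return [list(col) for col in zip(*M)]

s = 1 / math.sqrt(2)
A = [[2, 1], [1, 2]]
P = [[s, s],           # columns: (1, -1)/sqrt(2) and (1, 1)/sqrt(2)
     [-s, s]]

PtP = matmul(transpose(P), P)
D = matmul(transpose(P), matmul(A, P))

for i in range(2):
    for j in range(2):
        # P^T P = I confirms P is orthogonal (so P^-1 = P^T) ...
        assert abs(PtP[i][j] - (1 if i == j else 0)) < 1e-12
        # ... and P^T A P = diag(1, 3), the eigenvalues in column order.
        assert abs(D[i][j] - ((1, 3)[i] if i == j else 0)) < 1e-12
```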
EXAMPLE 5 Diagonalizing a Symmetric Matrix
Consider the symmetric matrix A = . We saw in Example 4 of Section 8.8 that the eigenvalues and corresponding eigenvectors are λ1 = 11, λ2 = λ3 = 8, and

$$K_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},\quad K_2 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix},\quad K_3 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}.$$
See the Remarks at the end of this section.
The eigenvectors K1, K2, and K3 are linearly independent, but note that they are not mutually orthogonal, since K2 and K3, the eigenvectors corresponding to the repeated eigenvalue λ2 = λ3 = 8, are not orthogonal. For λ2 = λ3 = 8, we found the eigenvectors from the Gauss–Jordan elimination

$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right),$$

which implies that k1 + k2 + k3 = 0. Since two of the variables are arbitrary, we selected k2 = 1, k3 = 0 to obtain K2, and k2 = 0, k3 = 1 to obtain K3. Now if instead we choose k2 = 1, k3 = 1 and then k2 = 1, k3 = −1, we obtain, respectively, two entirely different but orthogonal eigenvectors:

$$K_2 = \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix} \quad\text{and}\quad K_3 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}.$$
Thus, a new set of mutually orthogonal eigenvectors is

$$K_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},\quad K_2 = \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix},\quad K_3 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}.$$

Multiplying these vectors, in turn, by the reciprocals of the norms ||K1|| = √3, ||K2|| = √6, and ||K3|| = √2, we obtain the orthonormal set

$$\frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},\quad \frac{1}{\sqrt{6}}\begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix},\quad \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}.$$
We then use these vectors as columns to construct an orthogonal matrix that diagonalizes A:

$$P = \begin{pmatrix} 1/\sqrt{3} & -2/\sqrt{6} & 0 \\ 1/\sqrt{3} & 1/\sqrt{6} & 1/\sqrt{2} \\ 1/\sqrt{3} & 1/\sqrt{6} & -1/\sqrt{2} \end{pmatrix}.$$
The diagonal matrix whose entries are the eigenvalues of A corresponding to the order in which the eigenvectors appear in P is then

$$D = \begin{pmatrix} 11 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 8 \end{pmatrix}.$$
This is verified by carrying out the multiplication PTAP = D. ≡
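The computation in Example 5 can be replayed numerically. The eigendata given there (λ1 = 11 with eigenvector proportional to (1, 1, 1)^T, and λ2 = λ3 = 8 on the plane k1 + k2 + k3 = 0) determines the symmetric matrix; assuming it is A = [[9, 1, 1], [1, 9, 1], [1, 1, 9]] (the unique symmetric matrix with this eigendata), the orthogonal P built from the normalized eigenvectors gives PTAP = diag(11, 8, 8):

```python
import math

# Verify P^T A P = diag(11, 8, 8) for the (assumed) matrix of Example 5.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[9, 1, 1], [1, 9, 1], [1, 1, 9]]
cols = [[1, 1, 1], [-2, 1, 1], [0, 1, -1]]        # mutually orthogonal eigenvectors
norms = [math.sqrt(3), math.sqrt(6), math.sqrt(2)]
P = [[cols[j][i] / norms[j] for j in range(3)] for i in range(3)]

D = matmul(transpose(P), matmul(A, P))

eigs = [11, 8, 8]
for i in range(3):
    for j in range(3):
        assert abs(D[i][j] - (eigs[i] if i == j else 0)) < 1e-9
```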
Quadratic Forms
An algebraic expression of the form

ax² + bxy + cy²   (4)

is said to be a quadratic form. If we let X = (x, y)^T, then (4) can be written as the matrix product

$$X^TAX = (x \;\; y)\begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \quad (5)$$

Observe that the matrix in (5) is symmetric.
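This identity is easy to spot-check: for any coefficients a, b, c and any point (x, y), the form ax² + bxy + cy² agrees with XTAX for the symmetric matrix with diagonal entries a, c and off-diagonal entries b/2. The sample coefficients below are arbitrary:

```python
# Numerical spot-check of identity (5).

def quad_form(a, b, c, x, y):
    """The quadratic form a x^2 + b x y + c y^2."""
    return a * x**2 + b * x * y + c * y**2

def xt_a_x(a, b, c, x, y):
    """X^T A X with X = (x, y) and A = [[a, b/2], [b/2, c]]."""
    A = [[a, b / 2], [b / 2, c]]
    Ax = [A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y]
    return x * Ax[0] + y * Ax[1]

for (a, b, c, x, y) in [(2, 4, -1, 1.0, 2.0), (5, -2, 5, -3.0, 0.5)]:
    assert abs(quad_form(a, b, c, x, y) - xt_a_x(a, b, c, x, y)) < 1e-12
```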
In calculus you may have seen that an appropriate rotation of axes enables us to eliminate the xy-term in an equation

ax² + bxy + cy² + dx + ey + f = 0.
As the next example will illustrate, we can eliminate the xy-term by means of an orthogonal matrix and diagonalization rather than by using trigonometry.
EXAMPLE 6 Identifying a Conic Section
Identify the conic section whose equation is 2x² + 4xy − y² = 1.
SOLUTION
From (5) we can write the given equation as

$$(x \;\; y)\begin{pmatrix} 2 & 2 \\ 2 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 1 \quad\text{or}\quad X^TAX = 1, \quad (6)$$

where

$$A = \begin{pmatrix} 2 & 2 \\ 2 & -1 \end{pmatrix} \quad\text{and}\quad X = \begin{pmatrix} x \\ y \end{pmatrix}.$$

Now the eigenvalues and corresponding eigenvectors of A are found to be

$$\lambda_1 = -2,\quad \lambda_2 = 3,\quad K_1 = \begin{pmatrix} 1 \\ -2 \end{pmatrix},\quad K_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.$$
Observe that K1 and K2 are orthogonal. Moreover, ||K1|| = ||K2|| = √5, and so the vectors

$$\frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ -2 \end{pmatrix} \quad\text{and}\quad \frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ 1 \end{pmatrix}$$

are orthonormal. Hence, the matrix

$$P = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}$$

is orthogonal. If we define the change of variables X = PX′, where X′ = (X, Y)^T, then the quadratic form 2x² + 4xy − y² can be written
XTAX = (X′)T PTAPX′ = (X′)T(PTAP)X′.
Since P orthogonally diagonalizes the symmetric matrix A, the last equation is the same as
XTAX = (X′)T DX′. (7)
Using (7), we see that (6) becomes
$$(X \;\; Y)\begin{pmatrix} -2 & 0 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} X \\ Y \end{pmatrix} = 1 \quad\text{or}\quad -2X^2 + 3Y^2 = 1.$$
This last equation is recognized as the standard form of a hyperbola. The xy-coordinates of the eigenvectors are (1, −2) and (2, 1). Using the substitution X = PX′ in the form X′ = P−1X = PTX, we find that the XY-coordinates of these two points are (√5, 0) and (0, √5), respectively. From this we conclude that the X-axis and Y-axis are as shown in FIGURE 8.12.1. The eigenvectors, shown in red in the figure, lie along the new axes. The X- and Y-axes are called the principal axes of the conic. ≡
FIGURE 8.12.1 X- and Y-axes in Example 6
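The diagonalization in Example 6 can be verified numerically with A = [[2, 2], [2, −1]] and P = (1/√5)[[1, 2], [−2, 1]] as above: PTAP should be diag(−2, 3), and a point satisfying −2X² + 3Y² = 1 should map under X = PX′ to a point on the original conic 2x² + 4xy − y² = 1:

```python
import math

# Verify the principal-axes computation of Example 6.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    return [list(col) for col in zip(*M)]

s = 1 / math.sqrt(5)
A = [[2, 2], [2, -1]]
P = [[s, 2 * s],
     [-2 * s, s]]

D = matmul(transpose(P), matmul(A, P))
for i in range(2):
    for j in range(2):
        assert abs(D[i][j] - ((-2, 3)[i] if i == j else 0)) < 1e-12

# (X, Y) = (0, 1/sqrt(3)) lies on -2X^2 + 3Y^2 = 1; map back to xy-coordinates.
X, Y = 0.0, 1 / math.sqrt(3)
x = P[0][0] * X + P[0][1] * Y
y = P[1][0] * X + P[1][1] * Y
assert abs(2 * x**2 + 4 * x * y - y**2 - 1) < 1e-12   # on the original conic
```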
REMARKS
The matrix A in Example 5 is symmetric, and as such eigenvectors corresponding to distinct eigenvalues are orthogonal. In the third line of the example, note that K1, an eigenvector for λ1 = 11, is orthogonal to both K2 and K3. The eigenvectors

$$K_2 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \quad\text{and}\quad K_3 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$$

corresponding to λ2 = λ3 = 8 are not orthogonal. As an alternative to searching for orthogonal eigenvectors of this repeated eigenvalue by performing Gauss–Jordan elimination a second time, we could simply apply the Gram–Schmidt orthogonalization process and transform the set {K2, K3} into an orthogonal set. See Section 7.7 and Example 4 in Section 8.10.
8.12 Exercises Answers to selected odd-numbered problems begin on page ANS-20.
In Problems 1–20, determine whether the given matrix A is diagonalizable. If so, find the matrix P that diagonalizes A and the diagonal matrix D such that D = P−1AP.
In Problems 21–30, the given matrix A is symmetric. Find an orthogonal matrix P that diagonalizes A and the diagonal matrix D such that D = PTAP.
In Problems 31–34, use the procedure illustrated in Example 6 to identify the given conic section. Graph.
- 5x² − 2xy + 5y² = 24
- 13x² − 10xy + 13y² = 288
- −3x² + 8xy + 3y² = 20
- 16x² + 24xy + 9y² − 3x + 4y = 0
- Find a 2 × 2 matrix A that has eigenvalues λ1 = 2 and λ2 = 3 and corresponding eigenvectors K1 = and K2 = .
- Find a 3 × 3 symmetric matrix that has eigenvalues λ1 = 1, λ2 = 3, and λ3 = 5 and corresponding eigenvectors K1 = , K2 = , and K3 = .
- If A is an n × n diagonalizable matrix, then D = P−1AP, where D is a diagonal matrix. Show that if m is a positive integer, then Am = PDmP−1.
- The mth power of the diagonal matrix

$$D = \begin{pmatrix} d_{11} & 0 & \cdots & 0 \\ 0 & d_{22} & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & d_{nn} \end{pmatrix} \quad\text{is}\quad D^m = \begin{pmatrix} d_{11}^m & 0 & \cdots & 0 \\ 0 & d_{22}^m & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & d_{nn}^m \end{pmatrix}.$$

Use this result to compute
In Problems 39 and 40, use the results of Problems 37 and 38 to find the indicated power of the given matrix.
- A = , A5
- A = , A10
- Suppose A is a nonsingular diagonalizable matrix. Show that A−1 is also diagonalizable.
- Suppose A is a diagonalizable matrix. Is the matrix P unique?
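The identity Am = PDmP−1 of Problem 37 can be checked numerically on a made-up matrix (none of the entries come from Problems 39 or 40):

```python
# A^5 computed two ways: via P D^5 P^-1 and by repeated multiplication.
# A = [[2, 1], [1, 2]] satisfies P^-1 A P = diag(1, 3) for P = (K1 K2)
# with K1 = (1, -1) and K2 = (1, 1).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 1], [1, 2]]
P = [[1, 1], [-1, 1]]             # columns are eigenvectors of A
P_inv = [[0.5, -0.5], [0.5, 0.5]]
D5 = [[1 ** 5, 0], [0, 3 ** 5]]   # D^5: just the 5th powers of the diagonal

A5_via_D = matmul(P, matmul(D5, P_inv))

A5_direct = A
for _ in range(4):                # A^5 by repeated multiplication
    A5_direct = matmul(A5_direct, A)

assert A5_via_D == [[122.0, 121.0], [121.0, 122.0]]
assert A5_direct == [[122, 121], [121, 122]]
```

Diagonalizing once and powering D entrywise is far cheaper than repeated matrix multiplication for large m.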