Section 4.1 Eigenvalues and Eigenvectors
We jump right into the definition, which you have probably seen previously in your first course in linear algebra.
Definition 4.1.1.
Let \(A\) be an \(n\times n\) matrix. A number \(\lambda\) is called an eigenvalue of \(A\) if there exists a nonzero vector \(\xx\) such that
\begin{equation*}
A\xx = \lambda\xx\text{.}
\end{equation*}
Any such vector \(\xx\) is called an eigenvector associated to the eigenvalue \(\lambda\text{.}\)
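To see the definition in action, here is a minimal SymPy check (the \(2\times 2\) matrix and the vector below are our own small example, not taken from the text) confirming that \(A\xx = \lambda\xx\) for a particular pair:

import sympy as sp

A = sp.Matrix([[2, 1],
               [0, 3]])
x = sp.Matrix([1, 1])    # candidate eigenvector
lam = 3                  # candidate eigenvalue

print(A * x)             # Matrix([[3], [3]]), which is 3*x
print(A * x == lam * x)  # True, so 3 is an eigenvalue of A with eigenvector (1, 1)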
Exercise 4.1.4.
For the matrix
\(A = \bbm -1\amp 0\amp 3\\1\amp -1\amp 0\\1\amp 0\amp 1\ebm\text{,}\) match each vector in the first list with the corresponding eigenvalue in the second. (For typographical reasons, column vectors have been transposed.)
Vectors:
- \(\bbm -3\amp 3\amp 1\ebm^T\)
- \(\bbm 0\amp 1\amp 0\ebm^T\)
- \(\bbm 3\amp 1\amp 3\ebm^T\)
- \(\bbm 1\amp 1\amp 1\ebm^T\)
Eigenvalues:
- \(-2\)
- \(-1\)
- \(2\)
- Not an eigenvector
Note that if \(\xx\) is an eigenvector of the matrix \(A\) corresponding to the eigenvalue \(\lambda\text{,}\) then we have
\begin{equation}
(A-\lambda I_n)\xx=\zer\text{,}\tag{4.1.1}
\end{equation}
where \(I_n\) denotes the \(n\times n\) identity matrix. Thus, if \(\lambda\) is an eigenvalue of \(A\text{,}\) any corresponding eigenvector is an element of \(\nll(A-\lambda I_n)\text{.}\)
Definition 4.1.5.
For any real number \(\lambda\) and \(n\times n\) matrix \(A\text{,}\) we define the eigenspace \(E_\lambda(A)\) by
\begin{equation*}
E_\lambda(A) = \nll (A-\lambda I_n)\text{.}
\end{equation*}
Since we know that the null space of any matrix is a subspace, it follows that eigenspaces are subspaces of
\(\R^n\text{.}\)
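Since an eigenspace is just a null space, SymPy's nullspace method can compute it directly. A brief sketch, again with the same made-up \(2\times 2\) matrix as above:

import sympy as sp

A = sp.Matrix([[2, 1],
               [0, 3]])
lam = 3

# E_lambda(A) is the null space of A - lambda*I
basis = (A - lam * sp.eye(2)).nullspace()
print(basis)   # [Matrix([[1], [1]])]: E_3(A) is spanned by (1, 1)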
Note that
\(E_\lambda(A)\) can be defined for any real number
\(\lambda\text{,}\) whether or not
\(\lambda\) is an eigenvalue. However, the eigenvalues of
\(A\) are distinguished by the property that there is a
nonzero solution to
(4.1.1). Furthermore, we know that
(4.1.1) can only have nontrivial solutions if the matrix
\(A-\lambda I_n\) is not invertible. We also know that
\(A-\lambda I_n\) is non-invertible if and only if
\(\det (A-\lambda I_n) = 0\text{.}\) This gives us the following theorem.
Theorem 4.1.6.
The following are equivalent for any \(n\times n\) matrix \(A\) and real number \(\lambda\text{:}\)
- \(\lambda\) is an eigenvalue of \(A\text{.}\)
- \(\displaystyle E_\lambda(A)\neq \{\zer\}\)
- \(\displaystyle \det(A-\lambda I_n) = 0\)
Strategy.
To prove a theorem involving a "the following are equivalent" statement, a good strategy is to show that the first implies the second, the second implies the third, and the third implies the first. The ideas needed for the proof are given in the paragraph preceding the theorem. See if you can turn them into a formal proof.
The polynomial \(c_A(x)=\det(xI_n -A)\) is called the characteristic polynomial of \(A\text{.}\) (Note that \(\det(x I_n-A) = (-1)^n\det(A-x I_n)\text{.}\) We choose this order so that the coefficient of \(x^n\) is always 1.) The equation
\begin{equation}
\det(xI_n - A) = 0\tag{4.1.2}
\end{equation}
is called the characteristic equation of \(A\text{.}\) The solutions to this equation are precisely the eigenvalues of \(A\text{.}\)
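In SymPy, the characteristic polynomial is available through the charpoly method, and solving the characteristic equation recovers the eigenvalues. A small sketch, using the same made-up matrix as above:

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[2, 1],
               [0, 3]])

p = A.charpoly(x)                 # the characteristic polynomial det(xI - A)
print(p.as_expr())                # x**2 - 5*x + 6
print(sp.solve(p.as_expr(), x))   # [2, 3]: the eigenvalues of A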
Exercise 4.1.8.
In order for certain properties of a matrix
\(A\) to be satisfied, the eigenvalues of
\(A\) need to have particular values. Match each property of a matrix \(A\) in the first list with the corresponding information about the eigenvalues of \(A\) in the second. Be sure that you can justify your answers with a suitable proof.
Properties:
- \(A\) is invertible
- \(A^k=0\) for some integer \(k\geq 2\)
- \(A=A^{-1}\)
- \(A^2=A\)
- \(A^3=A\)
Eigenvalue information:
- \(0\) is not an eigenvalue of \(A\)
- \(0\) is the only eigenvalue of \(A\)
- \(1\) and \(-1\) are the only eigenvalues of \(A\)
- \(0\) and \(1\) are the only eigenvalues of \(A\)
- \(0\text{,}\) \(1\text{,}\) and \(-1\) are the eigenvalues of \(A\)
Recall that a matrix
\(B\) is said to be
similar to a matrix
\(A\) if there exists an invertible matrix
\(P\) such that
\(B = P^{-1}AP\text{.}\) Much of what follows concerns the question of whether or not a given
\(n\times n\) matrix
\(A\) is
diagonalizable.
Definition 4.1.9.
An
\(n\times n\) matrix
\(A\) is said to be
diagonalizable if
\(A\) is similar to a diagonal matrix.
The following results will frequently be useful.
Theorem 4.1.10.
The relation \(A\sim B\) if and only if \(A\) is similar to \(B\) is an equivalence relation. Moreover, if \(A\sim B\text{,}\) then:
- \(\displaystyle \det A = \det B\)
- \(\displaystyle \tr A = \tr B\)
- \(\displaystyle c_A(x) = c_B(x)\)
In other words, \(A\) and \(B\) have the same determinant, trace, and characteristic polynomial (and thus, the same eigenvalues).
Proof.
The first two follow directly from properties of the determinant and trace. For the last, note that if \(B = P^{-1}AP\text{,}\) then
\begin{equation*}
P^{-1}(xI_n-A)P = P^{-1}(xI_n)P-P^{-1}AP = xI_n - B\text{,}
\end{equation*}
so \(xI_n-B\sim xI_n-A\text{,}\) and therefore \(\det(xI_n-B)=\det(xI_n-A)\text{.}\)
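As a quick sanity check of Theorem 4.1.10, we can conjugate a matrix by any invertible \(P\) (the one below is an arbitrary choice of ours) and confirm in SymPy that the trace, determinant, and characteristic polynomial are unchanged:

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]])
P = sp.Matrix([[1, 2, 0],
               [0, 1, 1],
               [1, 0, 1]])   # any invertible matrix will do

B = P.inv() * A * P           # B is similar to A
print(A.trace() == B.trace())           # True
print(A.det() == B.det())               # True
print(A.charpoly(x) == B.charpoly(x))   # True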
Example 4.1.11.
Determine the eigenvalues and eigenvectors of
\(A = \bbm 0\amp 1\amp 1\\1\amp 0\amp 1\\1\amp 1\amp 0\ebm\text{.}\)
Solution.
We begin with the characteristic polynomial. We have
\begin{align*}
\det(xI_n - A) \amp =\det\bbm x \amp -1\amp -1\\-1\amp x \amp -1\\-1\amp -1\amp x\ebm\\
\amp = x \begin{vmatrix}x \amp -1\\-1\amp x\end{vmatrix}
+1\begin{vmatrix}-1\amp -1\\-1\amp x\end{vmatrix}
-1\begin{vmatrix}-1\amp x\\-1\amp -1\end{vmatrix}\\
\amp = x(x^2-1)+(-x-1)-(1+x)\\
\amp = x(x-1)(x+1)-2(x+1)\\
\amp = (x+1)[x^2-x-2]\\
\amp = (x+1)^2(x-2)\text{.}
\end{align*}
The roots of the characteristic polynomial are our eigenvalues, so we have
\(\lambda_1=-1\) and
\(\lambda_2=2\text{.}\) Note that the first eigenvalue comes from a repeated root. This is typically where things get interesting. If an eigenvalue does not come from a repeated root, then there will only be one (independent) eigenvector that corresponds to it. (That is,
\(\dim E_\lambda(A)=1\text{.}\)) If an eigenvalue is repeated, it could have more than one eigenvector, but this is not guaranteed.
We find that \(A-(-1)I_n = \bbm 1\amp 1\amp 1\\1\amp 1\amp 1\\1\amp 1\amp 1\ebm\text{,}\) which has reduced row-echelon form \(\bbm 1\amp 1\amp 1\\0\amp 0\amp 0\\0\amp 0\amp 0\ebm\text{.}\) Solving for the nullspace, we find that there are two independent eigenvectors:
\begin{equation*}
\xx_{1,1}=\bbm 1\\-1\\0\ebm, \quad \text{ and } \quad \xx_{1,2}=\bbm 1\\0\\-1\ebm\text{,}
\end{equation*}
so
\begin{equation*}
E_{-1}(A) = \spn\left\{\bbm 1\\-1\\0\ebm, \bbm 1\\0\\-1\ebm\right\}\text{.}
\end{equation*}
For the second eigenvalue, we have \(A-2I_n = \bbm -2\amp 1\amp 1\\1\amp -2\amp 1\\1\amp 1\amp -2\ebm\text{,}\) which has reduced row-echelon form \(\bbm 1\amp 0\amp -1\\0\amp 1\amp -1\\0\amp 0\amp 0\ebm\text{.}\) An eigenvector in this case is given by
\begin{equation*}
\xx_2 = \bbm 1\\1\\1\ebm\text{.}
\end{equation*}
In general, if the characteristic polynomial can be factored as
\begin{equation*}
c_A(x)=(x-\lambda)^mq(x)\text{,}
\end{equation*}
where \(q(x)\) is not divisible by \(x-\lambda\text{,}\) then we say that \(\lambda\) is an eigenvalue of multiplicity \(m\text{.}\) In the example above, \(\lambda_1=-1\) has multiplicity 2, and \(\lambda_2=2\) has multiplicity 1.
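In SymPy, the multiplicities can be read off by factoring the characteristic polynomial; for the matrix of Example 4.1.11:

import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]])

# Factoring c_A(x) displays the multiplicity of each eigenvalue
print(sp.factor(A.charpoly(x).as_expr()))   # (x - 2)*(x + 1)**2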
The eigenvects command in SymPy takes a square matrix as input, and outputs a list of tuples, one for each eigenvalue. Each tuple has the form (eigenvalue, multiplicity, eigenvectors). Using SymPy to solve Example 4.1.11 looks as follows:
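import sympy as sp

A = sp.Matrix([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]])

# Each entry of the output is a tuple (eigenvalue, multiplicity, eigenvectors)
for val, mult, vects in A.eigenvects():
    print(val, mult, vects)

# Note: the basis vectors SymPy chooses for each eigenspace may differ from
# those found above by a scalar multiple, but they span the same eigenspaces.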
An important result about multiplicity is the following.
Theorem 4.1.12.
Let
\(\lambda\) be an eigenvalue of
\(A\) of multiplicity
\(m\text{.}\) Then
\(\dim E_\lambda(A)\leq m\text{.}\)
To prove Theorem 4.1.12, we need the following lemma, which we've borrowed from Section 5.5 of Nicholson's textbook.
Lemma 4.1.13.
Let \(\{\xx_1,\ldots, \xx_k\}\) be a set of linearly independent eigenvectors of a matrix \(A\text{,}\) with corresponding eigenvalues \(\lambda_1,\ldots, \lambda_k\) (not necessarily distinct). Extend this set to a basis \(\{\xx_1,\ldots, \xx_k,\xx_{k+1},\ldots, \xx_n\}\text{,}\) and let \(P=\bbm \xx_1\amp \cdots \amp \xx_n\ebm\) be the matrix whose columns are the basis vectors. (Note that \(P\) is necessarily invertible.) Then
\begin{equation*}
P^{-1}AP = \bbm \diag(\lambda_1,\ldots, \lambda_k) \amp B\\0\amp A_1\ebm\text{,}
\end{equation*}
where \(B\) has size \(k\times (n-k)\text{,}\) and \(A_1\) has size \((n-k)\times (n-k)\text{.}\)
Proof.
We have
\begin{align*}
P^{-1}AP \amp = P^{-1}A\bbm \xx_1\amp \cdots \amp \xx_n\ebm\\
\amp =\bbm (P^{-1}A)\xx_1\amp \cdots \amp (P^{-1}A)\xx_n\ebm\text{.}
\end{align*}
For \(1\leq i\leq k\text{,}\) we have
\begin{equation*}
(P^{-1}A)(\xx_i) = P^{-1}(A\xx_i) = P^{-1}(\lambda_i\xx_i)=\lambda_i(P^{-1}\xx_i)\text{.}
\end{equation*}
But \(P^{-1}\xx_i\) is the \(i\)th column of \(P^{-1}P = I_n\text{,}\) so for \(1\leq i\leq k\text{,}\) the \(i\)th column of \(P^{-1}AP\) is \(\lambda_i\) times the \(i\)th standard basis vector. These columns give the upper-left block \(\diag(\lambda_1,\ldots,\lambda_k)\) and the zero block beneath it, which proves the result.
We can use
Lemma 4.1.13 to prove that
\(\dim E_\lambda(A)\leq m\) as follows. Suppose
\(\{\xx_1,\ldots, \xx_k\}\) is a basis for
\(E_\lambda(A)\text{.}\) Then this is a linearly independent set of eigenvectors, so our lemma guarantees the existence of a matrix
\(P\) such that
\begin{equation*}
P^{-1}AP = \bbm \lambda I_k \amp B\\0\amp A_1\ebm\text{.}
\end{equation*}
Let \(\tilde{A}=P^{-1}AP\text{.}\) On the one hand, since \(\tilde{A}\sim A\text{,}\) we have \(c_A(x)=c_{\tilde{A}}(x)\text{.}\) On the other hand,
\begin{equation*}
\det(xI_n-\tilde{A}) = \det\bbm (x-\lambda)I_k \amp -B\\0 \amp xI_{n-k}-A_1\ebm = (x-\lambda)^k\det(xI_{n-k}-A_1)\text{.}
\end{equation*}
This shows that \(c_A(x)\) is divisible by \((x-\lambda)^k\text{.}\) Since \(m\) is the largest integer such that \(c_A(x)\) is divisible by \((x-\lambda)^m\text{,}\) we must have \(\dim E_\lambda(A)=k\leq m\text{.}\)
Another important result is the following. The proof is a bit tricky: it requires mathematical induction, and a couple of clever observations.
Theorem 4.1.14.
Let
\(\vv_1,\ldots, \vv_k\) be eigenvectors corresponding to distinct eigenvalues
\(\lambda_1,\ldots, \lambda_k\) of a matrix
\(A\text{.}\) Then
\(\{\vv_1,\ldots, \vv_k\}\) is linearly independent.
Proof.
The proof is by induction on the number
\(k\) of distinct eigenvalues. Since eigenvectors are nonzero, any set consisting of a single eigenvector
\(\vv_1\) is independent. Suppose, then, that a set of eigenvectors corresponding to
\(k-1\) distinct eigenvalues is independent, and let
\(\vv_1,\ldots, \vv_k\) be eigenvectors corresponding to distinct eigenvalues
\(\lambda_1,\ldots, \lambda_k\text{.}\)
Consider the equation
\begin{equation*}
c_1\vv_1+c_2\vv_2+\cdots +c_k\vv_k=\zer\text{,}
\end{equation*}
for scalars \(c_1,\ldots, c_k\text{.}\) Multiplying both sides by the matrix \(A\text{,}\) we have
\begin{align}
A(c_1\vv_1+c_2\vv_2+\cdots +c_k\vv_k) \amp = A\zer\tag{4.1.3}\\
c_1A\vv_1+c_2A\vv_2+\cdots + c_kA\vv_k \amp = \zer\tag{4.1.4}\\
c_1\lambda_1\vv_1+c_2\lambda_2\vv_2+\cdots + c_k\lambda_k\vv_k \amp =\zer\text{.}\tag{4.1.5}
\end{align}
On the other hand, we can also multiply both sides by the eigenvalue \(\lambda_1\text{,}\) giving
\begin{equation}
\zer = c_1\lambda_1\vv_1 + c_2\lambda_1\vv_2+\cdots + c_k\lambda_1\vv_k\text{.}\tag{4.1.6}
\end{equation}
Subtracting (4.1.6) from (4.1.5), the \(\vv_1\) terms cancel, and we are left with
\begin{equation*}
c_2(\lambda_2-\lambda_1)\vv_2+\cdots + c_k(\lambda_k-\lambda_1)\vv_k=\zer\text{.}
\end{equation*}
By hypothesis, the set
\(\{\vv_2,\ldots, \vv_k\}\) of
\(k-1\) eigenvectors is linearly independent. We know that
\(\lambda_j-\lambda_1\neq 0\) for
\(j=2,\ldots, k\text{,}\) since the eigenvalues are all distinct. Therefore, the only way this linear combination can equal zero is if
\(c_2=0,\ldots, c_k=0\text{.}\) This leaves us with
\(c_1\vv_1=\zer\text{,}\) but
\(\vv_1\neq \zer\text{,}\) so
\(c_1=0\) as well.
Theorem 4.1.14 tells us that vectors from different eigenspaces are independent. In particular, a union of bases from each eigenspace will be an independent set. Therefore,
Theorem 4.1.12 provides an initial criterion for diagonalization: if the dimension of each eigenspace
\(E_\lambda(A)\) is equal to the multiplicity of
\(\lambda\text{,}\) then
\(A\) is diagonalizable.
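For instance, assembling the eigenvectors found in Example 4.1.11 into the columns of a matrix \(P\text{,}\) we can verify in SymPy that \(P^{-1}AP\) is diagonal:

import sympy as sp

A = sp.Matrix([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]])

# Columns of P are the eigenvectors found in Example 4.1.11
P = sp.Matrix([[1, 1, 1],
               [-1, 0, 1],
               [0, -1, 1]])

print(P.inv() * A * P)   # the diagonal matrix diag(-1, -1, 2)

# SymPy can also construct such a P automatically:
P2, D2 = A.diagonalize()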
Our focus in the next section will be on diagonalization of
symmetric matrices, and soon we will see that for such matrices, eigenvectors corresponding to different eigenvalues are not just independent, but
orthogonal.