Kernel and Image

Section 2.2 Kernel and Image

Given any linear transformation \(T:V\to W\) we can associate two important sets: the kernel of \(T\) (also known as the nullspace), and the image of \(T\) (also known as the range).

🔗

Definition 2.2.1.

Let \(T:V\to W\) be a linear transformation. The kernel of \(T\text{,}\) denoted \(\ker T\text{,}\) is defined by

\begin{equation*} \ker T = \{\vv\in V \,|\, T(\vv)=\zer\}\text{.} \end{equation*}

The image of \(T\text{,}\) denoted \(\im T\text{,}\) is defined by

\begin{equation*} \im T = \{T(\vv) \,|\, \vv\in V\}\text{.} \end{equation*}

🔗

Remark 2.2.2.

Caution: the kernel is the set of vectors that \(T\) sends to zero. Saying that \(\vv\in \ker T\) does not mean that \(\vv=\zer\text{;}\) it means that \(T(\vv)=\zer\text{.}\) Although it’s true that \(T(\zer)=\zer\) for any linear transformation, the kernel can contain vectors other than the zero vector.

🔗

In fact, as we’ll see in Theorem 2.2.10 below, the case where the kernel contains only the zero vector is an important special case.

🔗

Remark 2.2.3. How to use these definitions.

As you read through the theorems and examples in this section, think carefully about how the definitions of kernel and image are used.

🔗

For a linear transformation \(T:V\to W\text{:}\)

If you assume \(\vv\in\ker T\text{:}\) you are asserting that \(T(\vv)=\zer\text{.}\) Similarly, to prove \(\vv\in \ker T\text{,}\) you must show that \(T(\vv)=\zer\text{.}\)
🔗

🔗
If your hypothesis is that \(U=\ker T\) for some subspace \(U\subseteq V\text{,}\) you can assume that \(T(\uu)=\zer\) for any \(\uu\in U\text{.}\)
🔗

🔗
If you need to prove that \(U=\ker T\) for some subspace \(U\text{,}\) then you need to prove that if \(\uu\in U\text{,}\) then \(T(\uu)=\zer\text{,}\) and if \(T(\uu)=\zer\text{,}\) then \(\uu\in U\text{.}\)
🔗

🔗
If you assume \(\ww\in\im T\text{:}\) you are asserting that there exists some \(\vv\in V\text{,}\) such that \(T(\vv)=\ww\text{,}\) and to prove that \(\ww\in \im T\text{,}\) you must find some \(\vv\in V\) such that \(T(\vv)=\ww\text{.}\)
🔗

🔗
If your hypothesis is that \(U=\im T\) for some subspace \(U\subseteq W\text{,}\) then you are assuming that for every \(\uu\in U\text{,}\) there is some \(\vv\in V\) such that \(T(\vv)=\uu\text{.}\)
🔗

🔗
If you need to prove that \(\im T=U\) for some subspace \(U\text{,}\) then you need to show that for every \(\uu\in U\text{,}\) there is some \(\vv\in V\) with \(T(\vv)=\uu\text{,}\) and that \(T(\vv)\in U\) for every \(\vv\in V\text{.}\)
🔗

🔗

🔗

Theorem 2.2.4.

For any linear transformation \(T:V\to W\text{,}\)

\(\ker T\) is a subspace of \(V\text{.}\)
🔗

🔗
\(\im T\) is a subspace of \(W\text{.}\)
🔗

🔗

🔗

Strategy.

Both parts of the proof rely on the Subspace Test. So for each set, we first need to explain why it contains the zero vector. Next comes closure under addition: assume you have to elements of the set, then use the definition to explain what that means.

🔗

Now you have to show that the sum of those elements belongs to the set as well. It’s fairly safe to assume that this is going to involve the addition property of a linear transformation!

🔗

Scalar multiplication is handled similarly, but using the scalar multiplication property of \(T\text{.}\)

🔗

Proof.

To show that \(\ker T\) is a subspace, first note that \(\zer\in \ker T\text{,}\) since \(T(\zer)=\zer\) for any linear transformation \(T\text{.}\) Now, suppose that \(\vv,\ww\in \ker T\text{;}\) this means that \(T(\vv)=\zer\) and \(T(\ww)=0\text{,}\) and therefore,

\begin{equation*} T(\vv+\ww)=T(\vv)+T(\ww)=\zer+\zer=\zer\text{.} \end{equation*}

Similarly, for any scalar \(c\text{,}\) if \(\vv\in \ker T\) then \(T(\vv)=\zer\text{,}\) so

\begin{equation*} T(c\vv)=cT(\vv)=c\zer=\zer\text{.} \end{equation*}

By the subspace test, \(\ker T\) is a subspace.

🔗

🔗
Again, since \(T(\zer)=\zer\text{,}\) we see that \(\zer\in \im T\text{.}\) (That is, \(T(\zer_V)=\zer_W\text{,}\) so \(\zer_W\in \im T\text{.}\)) Now suppose that \(\ww_1,\ww_2\in \im T\text{.}\) This means that there exist \(\vv_1,\vv_2\in V\) such that \(T(\vv_1)=\ww_1\) and \(T(\vv_2)=\ww_2\text{.}\) It follows that

\begin{equation*} \ww_1+\ww_2 = T(\vv_1)+T(\vv_2) = T(\vv_1+\vv_2)\text{,} \end{equation*}

so \(\ww_1+\ww_2\in \im T\text{,}\) since it’s the image of \(\vv_1+\vv_2\text{.}\) Similarly, if \(c\) is any scalar and \(\ww=T(\vv)\in\im T\text{,}\) then

\begin{equation*} c\ww=cT(\vv)=T(c\vv)\text{,} \end{equation*}

so \(c\ww\in \im T\text{.}\)

🔗

🔗

🔗

Example 2.2.5. Null space and column space.

A familiar setting that you may already have encountered in a previous linear algebra course (or Example 2.1.4) is that of a matrix transformation. Let \(A\) be an \(m\times n\) matrix. Then we can define \(T:\R^n\to \R^m\) by \(T(\xx)=A\xx\text{,}\) where elements of \(\R^n,\R^m\) are considered as column vectors. We then have

\begin{equation*} \ker T = \nll(A) = \{\xx\in \R^n \,|\, A\xx=\zer\} \end{equation*}

and

\begin{equation*} \im T = \csp(A) = \{A\xx\,|\, \xx\in \R^n\}\text{,} \end{equation*}

where \(\csp(A)\) denotes the column space of \(A\text{.}\) Recall further that if we write \(A\) in terms of its columns as

\begin{equation*} A = \bbm C_1 \amp C_2 \amp \cdots \amp C_n\ebm \end{equation*}

and a vector \(\xx\in \R^n\) as \(\xx=\bbm x_1\\x_2\\\vdots \\x_n\ebm\text{,}\) then

\begin{equation*} A\xx = x_1C_1+x_2C_2+\cdots +x_nC_n\text{.} \end{equation*}

Thus, any element of \(\csp(A)\) is a linear combination of its columns, explaining the name column space.

🔗

Determining \(\nll(A)\) and \(\csp(A)\) for a given matrix \(A\) is, unsurprisingly, a matter of reducing \(A\) to row-echelon form. Finding \(\nll(A)\) comes down to describing the set of all solutions to the homogeneous system \(A\xx=\zer\text{.}\) Finding \(\csp(A)\) relies on the following theorem.

🔗

Theorem 2.2.6.

Let \(A\) be an \(m\times n\) matrix with columns \(C_1,C_2,\ldots, C_n\text{.}\) If the reduced row-echelon form of \(A\) has leading ones in columns \(j_1,j_2,\ldots, j_k\text{,}\) then

\begin{equation*} \{C_{j_1},C_{j_2},\ldots, C_{j_k}\} \end{equation*}

is a basis for \(\csp(A)\text{.}\)

🔗

For a proof of this result, see Section 5.4 of Linear Algebra with Applications, by Keith Nicholson. The proof is fairly long and technical, so we omit it here.

🔗

Example 2.2.7.

Consider the linear transformation \(T:\R^4\to \R^3\) defined by the matrix

\begin{equation*} A = \bbm 1 \amp 3 \amp 0 \amp -2\\ -2 \amp -1 \amp 2 \amp 0\\ 1 \amp 8 \amp 2 \amp -6\ebm\text{.} \end{equation*}

Let’s determine the RREF of \(A\text{:}\)

🔗

We see that there are leading ones in the first and second column. Therefore, \(\csp(A) = \im(T) = \spn\left\{\bbm 1\\-2\\1\ebm, \bbm 3\\-1\\8\ebm\right\}\text{.}\) Indeed, note that

\begin{equation*} \bbm 0\\2\\2\ebm = -\frac65\bbm 1\\-2\\1\ebm + \frac25\bbm 3\\-1\\8\ebm \end{equation*}

and

\begin{equation*} \bbm -2\\0\\-6\ebm = \frac25\bbm 1\\-2\\1\ebm -\frac45\bbm 3\\-1\\8\ebm\text{,} \end{equation*}

so that indeed, the third and fourth columns are in the span of the first and second.

🔗

Furthermore, we can determine the nullspace: if \(A\xx=\zer\) where \(\xx=\bbm x_1\\x_2\\x_3\\x_4\ebm\text{,}\) then we must have

\begin{align*} x_1 \amp =\frac65 x_3-\frac25 x_4\\ x_2 \amp =-\frac25 x_3+\frac 45 x_4\text{,} \end{align*}

\begin{equation*} \xx = \bbm \frac65x_3-\frac25x_4\\ -\frac25x_3+\frac45x_4\\x_3\\x_4\ebm = \frac{x_3}{5}\bbm 6\\-2\\5\\0\ebm + \frac{x_4}{5}\bbm -2\\4\\0\\5\ebm\text{.} \end{equation*}

It follows that a basis for \(\nll(A)=\ker T\) is \(\left\{\bbm 6\\-2\\5\\0\ebm, \bbm -2\\4\\0\\5\ebm\right\}\text{.}\)

🔗

Remark 2.2.8.

The SymPy library for Python has built-in functions for computing nullspace and column space. But it’s probably worth your while to know how to determine these from the RREF of a matrix, without additional help from the computer. That said, let’s see how the computer’s output compares to what we found in Example 2.2.7.

🔗

Note that the output from the computer simply states the basis for each space. Of course, for computational purposes, this is typically good enough.

🔗

An important result that comes out while trying to show that the “pivot columns” of a matrix (the ones that end up with leading ones in the RREF) are a basis for the column space is that the column rank (defined as the dimension of \(\csp(A)\)) and the row rank (the dimension of the space spanned by the rows of \(A\)) are equal. One can therefore speak unambiguously about the rank of a matrix \(A\text{,}\) and it is just as it’s defined in a first course in linear algebra: the number of leading ones in the RREF of \(A\text{.}\)

🔗

For a general linear transformation, we can’t necessarily speak in terms of rows and columns, but if \(T:V\to W\) is linear, and either \(V\) or \(W\) is finite-dimensional, then we can define the rank of \(T\) as follows.

🔗

Definition 2.2.9.

Let \(T:V\to W\) be a linear transformation. Then the rank of \(T\) is defined by

\begin{equation*} \operatorname{rank} T = \dim \im T\text{,} \end{equation*}

and the nullity of \(T\) is defined by

\begin{equation*} \operatorname{nullity} T = \dim \ker T\text{,} \end{equation*}

provided that the kernel and image of \(T\) are finite-dimensional.

🔗

Note that if \(W\) is finite-dimensional, then so is \(\im T\text{,}\) since it’s a subspace of \(W\text{.}\) On the other hand, if \(V\) is finite-dimensional, then we can find a basis \(\{\vv_1,\ldots, \vv_n\}\) of \(V\text{,}\) and the set \(\{T(\vv_1),\ldots, T(\vv_n)\}\) will span \(\im T\text{,}\) so again the image is finite-dimensional, so the rank of \(T\) is finite. It is possible for either the rank or the nullity of a transformation to be infinite.

🔗

Recall that a function \(f:A\to B\) is injective (or one-to-one) if for any \(x_1,x_2\in A\text{,}\) \(f(x_1)=f(x_2)\) implies that \(x_1=x_2\text{.}\) (In other words, no two different inputs give the same output.) We say that \(f\) is surjective (or onto) if \(f(A)=B\text{.}\) (That is, if the range of \(f\) is the entire codomain \(B\text{.}\)) These properties are important considerations for the discussion of inverse functions.

🔗

For a linear transformation \(T\text{,}\) the property of surjectivity is tied to \(\im T\) by definition: \(T:V\to W\) is onto if \(\im T = W\text{.}\) What might not be immediately obvious is that the kernel tells us if a linear transformation is injective.

🔗

Theorem 2.2.10.

Let \(T:V\to W\) be a linear transformation. Then \(T\) is injective if and only if \(\ker T = \{\zer\}\text{.}\)

🔗

Strategy.

We have an “if and only if” statement, so we have to make sure to consider both directions. The basic idea is this: we know that \(\zer\) is always in the kernel, so if the kernel contains any other vector \(\vv\text{,}\) we would have \(T(\vv)=T(\zer)=\zer\text{,}\) and \(T\) would not be injective.

🔗

There is one trick to keep in mind: the statement \(T(\vv_1)=T(\vv_2)\) is equivalent to \(T(\vv_1)-T(\vv_2)=\zer\text{,}\) and since \(T\) is linear, \(T(\vv_1)-T(\vv_2)=T(\vv_1-\vv_2)\text{.}\)

🔗

Proof.

Suppose \(T\) is injective, and let \(\vv\in \ker T\text{.}\) Then \(T(\vv)=\zer\text{.}\) On the other hand, we know that \(T(\zer)=\zer=T(\vv)\text{.}\) Since \(T\) is injective, we must have \(\vv=\zer\text{.}\) Conversely, suppose that \(\ker T = \{0\}\) and that \(T(\vv_1)=T(\vv_2)\) for some \(\vv_1,\vv_2\in V\text{.}\) Then

\begin{equation*} \zer = T(\vv_1)-T(\vv_2) = T(\vv_1-\vv_2)\text{,} \end{equation*}

so \(\vv_1-\vv_2\in \ker T\text{.}\) Therefore, we must have \(\vv_1-\vv_2=\zer\text{,}\) so \(\vv_1=\vv_2\text{,}\) and it follows that \(T\) is injective.

🔗

Exercise 2.2.11.

Rearrange the blocks below to produce a valid proof of the following statement:

🔗

If \(T:V\to W\) is injective and \(\{\vv_1,\vv_2,\ldots, \vv_n\}\) is linearly independent in \(V\text{,}\) then \(\{T(\vv_1),T(\vv_2),\ldots, T(\vv_n)\}\) is linearly independent in \(W\text{.}\)

🔗

Suppose \(T:V\to W\) is injective and \(\{\vv_1,\ldots, \vv_n\}\subseteq V\) is independent.🔗

---
Assume that \(c_1T(\vv_1)+\cdots + c_nT(\vv_n)=\zer\text{,}\) for some scalars \(c_1,c_2,\ldots, c_n\text{.}\)🔗

---

Since \(T\) is linear,

\begin{align*}
\zer \amp = c_1T(\vv_1)+\cdots + c_nT(\vv_n)\\
\amp =T(c_1\vv_1+ \ldots + c_n\vv_n)\text{.}
\end{align*}

🔗

---
Therefore, \(c_1\vv_1+\cdots + c_n\vv_n\in \ker T\text{.}\)🔗

---
Since \(T\) is injective, \(\ker T = \{\zer\}\text{.}\)🔗

---
Therefore, \(c_1\vv_1+\cdots + c_n\vv_n=\zer\text{.}\)🔗

---
Since \(\{\vv_1,\ldots, \vv_n\}\) is independent, we must have \(c_1=0,\ldots, c_n=0\text{.}\)🔗

---
It follows that \(\{T(\vv_1),\ldots, T(\vv_n)\}\) is linearly independent.🔗

Hint.

Remember that your goal is to show that \(\{T(\vv_1),\ldots, T(\vv_n)\}\) is linearly independent, so after you assume your hypotheses, you should begin by setting a linear combination of these vectors equal to zero.

🔗

Exercise 2.2.12.

Rearrange the blocks below to produce a valid proof of the following statement:

🔗

If \(T:V\to W\) is surjective and \(V=\spn\{\vv_1,\ldots, \vv_n\}\text{,}\) then \(W = \spn\{T(\vv_1),\ldots, T(\vv_n)\}\text{.}\)

🔗

Suppose \(T\) is surjective, and \(\spn\{\vv_1,\ldots, \vv_n\}=V\text{.}\)🔗

---
Let \(\ww\in W\) be any vector.🔗

---
Since \(T\) is a surjection, there is some \(\vv\in V\) such that \(T(\vv)=\ww\text{.}\)🔗

---
Since \(V=\spn\{\vv_1,\ldots, \vv_n\}\) and \(\vv\in V\text{,}\) there are scalars \(c_1,\ldots, c_n\) such that \(\vv=c_1\vv_1+\cdots +c_n\vv_n\text{.}\)🔗

---

Since \(T\) is linear,

\begin{align*}
\ww \amp = T(\vv) \\
\amp = T(c_1\vv_1+\cdots +c_n\vv_n)\\
\amp = c_1T(\vv_1)+\cdots +c_nT(\vv_n)\text{,}
\end{align*}

so \(\ww\in \spn\{T(\vv_1),\ldots, T(\vv_n)\}\text{.}\)

🔗

---
Therefore, \(W\subseteq \spn\{T(\vv_1),\ldots, T(\vv_n)\}\text{,}\) and since \(\spn\{T(\vv_1),\ldots, T(\vv_n)\}\subseteq W\text{,}\) we have \(W=\spn\{T(\vv_1),\ldots, T(\vv_n)\}\text{.}\)🔗

Hint.

You need to show that any \(\ww\in W\) can be written as a linear combination of the \(T(\vv_i)\text{.}\) Because \(T\) is surjective, you know that \(\ww=T(\vv)\) for some \(\vv\in V\text{.}\)

🔗

Remark 2.2.13.

For the case of a matrix transformation \(T_A:\R^n\to \R^m\text{,}\) notice that \(\ker T_A\) is simply the set of all solutions to \(A\xx=\zer\text{,}\) while \(\im T_A\) is the set of all \(\yy\in\R^m\) for which \(A\xx=\yy\) has a solution.

🔗

Recall from the discussion preceding Definition 2.2.9 that \(\rank A = \dim \csp(A) = \dim \im T_A\text{.}\) It follows that \(T_A\) is surjective if and only if \(\rank A = m\text{.}\) On the other hand, \(T_A\) is injective if and only if \(\rank A = n\text{,}\) because we know that the system \(A\xx=\zer\) has a unique solution if and only if each column of \(A\) contains a leading one.

🔗

This has some interesting consequences. If \(m=n\) (that is, if \(A\) is square), then each increase in \(\dim \nll(A)\) produces a corresponding decrease in \(\dim \csp(A)\text{,}\) since both correspond to the “loss” of a leading one. Moreover, if \(\rank A = n\text{,}\) then \(T_A\) is both injective and surjective. Recall that a function is invertible if and only if it is both injective and surjective. It should come as no surprise that invertibility of \(T_A\) (as a function) is equivalent to invertibility of \(A\) (as a matrix).

🔗

Also, note that if \(m \lt n\text{,}\) then \(\rank A\leq m \lt n\text{,}\) so \(T_A\) could be surjective, but can’t possibly be injective. On the other hand, if \(m\gt n\text{,}\) then \(\rank A\leq n \lt m\text{,}\) so \(T_A\) could be injective, but can’t possibly be surjective. These results generalize to linear transformations between any finite-dimensional vector spaces. The first step towards this is the following theorem, which is sometimes known as the Fundamental Theorem of Linear Transformations.

🔗

Theorem 2.2.14. Dimension Theorem.

Let \(T:V\to W\) be any linear transformation such that \(\ker T\) and \(\im T\) are finite-dimensional. Then \(V\) is finite-dimensional, and

\begin{equation*} \dim V = \dim \ker T + \dim \im T\text{.} \end{equation*}

🔗

Proof.

The trick with this proof is that we aren’t assuming \(V\) is finite-dimensional, so we can’t start with a basis of \(V\text{.}\) But we do know that \(\im T\) is finite-dimensional, so we start with a basis \(\{\ww_1,\ldots, \ww_m\}\) of \(\im T\text{.}\) Of course, every vector in \(\im T\) is the image of some vector in \(V\text{,}\) so we can write \(\ww_i =T(\vv_i)\text{,}\) where \(\vv_i\in V\text{,}\) for \(i=1,2,\ldots, m\text{.}\)

🔗

Since \(\{T(\vv_1),\ldots, T(\vv_m)\}\) is a basis, it is linearly independent. The results of Exercise 2.1.12 tell us that the set \(\{\vv_1,\ldots, \vv_m\}\) must therefore be independent.

🔗

We now introduce a basis \(\{\uu_1,\ldots, \uu_n\}\) of \(\ker T\text{,}\) which we also know to be finite-dimensional. If we can show that the set \(\{\uu_1,\ldots, \uu_n,\vv_1,\ldots, \vv_m\}\) is a basis for \(V\text{,}\) we’d be done, since the number of vectors in this basis is \(\dim\ker T + \dim \im T\text{.}\) We must therefore show that this set is independent, and that it spans \(V\text{.}\)

🔗

To see that it’s independent, suppose that

\begin{equation*} a_1\uu_1+\cdots + a_n\uu_n+b_1\vv_1+\cdots +b_m\vv_m=\zer\text{.} \end{equation*}

Applying \(T\) to this equation, and noting that \(T(\uu_i)=\zer\) for each \(i\text{,}\) by definition of the \(\uu_i\text{,}\) we get

\begin{equation*} b_1T(\vv_1)+\cdots +b_mT(\vv_m)=\zer\text{.} \end{equation*}

We assumed that the vectors \(T(\vv_i)\) were independent, so all the \(b_i\) must be zero. But then we get

\begin{equation*} a_1\uu_1+\cdots +a_n\uu_n=\zer\text{,} \end{equation*}

and since the \(\uu_i\) are independent, all the \(a_i\) must be zero.

🔗

To see that these vectors span, choose any \(\xx\in V\text{.}\) Since \(T(\xx)\in \im T\text{,}\) there exist scalars \(c_1,\ldots, c_m\) such that

\begin{equation} T(\xx)=c_1T(\vv_1)+\cdots +c_mT(\vv_m)\text{.}\tag{2.2.1} \end{equation}

We’d like to be able to conclude from this that \(\xx=c_1\vv_1+\cdots +c_m\vv_m\text{,}\) but this would be false, unless \(T\) was known to be injective (which it isn’t). Failure to be injective involves the kernel -- how do we bring that into the picture?

🔗

The trick is to realize that the reason we might have \(\xx\neq c_1\vv_1+\cdots +c_m\vv_m\) is that we’re off by something in the kernel. Indeed, (2.2.1) can be re-written as

\begin{equation*} T(\xx-c_1\vv_1-\cdots -c_m\vv_m) = \zer\text{,} \end{equation*}

so \(\xx-c_1\vv_1-\cdots -c_m\vv_m\in\ker T\text{.}\) But we have a basis for \(\ker T\text{,}\) so we can write

\begin{equation*} \xx-c_1\vv_1-\cdots -c_m\vv_m=t_1\uu_1+\cdots +t_n\uu_n \end{equation*}

for some scalars \(t_1,\ldots, t_n\text{,}\) and this can be rearanged to give

\begin{equation*} \xx=t_1\uu_1+\cdots +t_n\uu_n+c_1\vv_1+\cdots + c_m\vv_m\text{,} \end{equation*}

which completes the proof.

🔗

This is sometimes known as the Rank-Nullity Theorem, since it can be stated in the form

\begin{equation*} \dim V = \rank T + \operatorname{nullity} T\text{.} \end{equation*}

We will see that this result is frequently useful for providing simple arguments that establish otherwise difficult results. A basic situation where the theorem is useful is as follows: we are given \(T:V\to W\text{,}\) where the dimensions of \(V\) and \(W\) are known. Since \(\im T\) is a subspace of \(W\text{,}\) we know from Theorem 1.7.15 that \(T\) is onto if and only if \(\dim \im T = \dim W\text{.}\) In many cases it is easier to compute \(\ker T\) than it is \(\im T\text{,}\) and the Dimension Theorem lets us determine \(\dim\im T\) if we know \(\dim V\) and \(\dim \ker T\text{.}\)

🔗

Exercise 2.2.15.

Select all statements below that are true:

🔗

If \(\vv\in \ker T\text{,}\) then \(\vv=\zer\text{.}\)
🔗
Remember that \(\vv\in \ker T\) implies \(T(\vv)=\zer\text{;}\) this does not necessarily mean \(\vv=\zer\text{.}\)
If \(T:\R^4\to\R^6\) is injective, then it is surjective.
🔗
By the dimension theorem, the dimension of \(\im T\) cannot be greater than 4, so \(T\) can never be surjective.
If \(T:\R^4\to P_3(\R)\) is injective, then it is surjective.
🔗
Correct! If \(T\) is injective, then \(\dim\ker T=0\text{,}\) so by the dimension theorem, \(\dim \im T=\dim \R^4=4\text{.}\) Since \(\dim P_3(\R)=4\) as well, \(\im T=P_3(\R)\text{.}\)
It is possible to have an injective linear transformation \(T:\R^4\to\R^3\text{.}\)
🔗
The maximum dimension of \(\im T\) is 3, so the minimum dimension of \(\ker T\) is 1.
If \(T:V\to W\) is surjective, then \(\dim V\geq \dim W\text{.}\)
🔗
Correct! If \(T\) is surjective, then \(\im T=W\text{,}\) so \(\dim V = \dim \ker T +\dim \im T = \dim \ker T+\dim W\geq \dim W\text{.}\)

🔗

A useful consequence of this result is that if we know \(V\) is finite-dimensional, we can order any basis such that the first vectors in the list are a basis of \(\ker T\text{,}\) and the images of the remaining vectors produce a basis of \(\im T\text{.}\)

🔗

Another consequence of the dimension theorem is that we must have

\begin{equation*} \dim \ker T\leq \dim V \quad \text{ and } \quad \dim \im T\leq \dim V\text{.} \end{equation*}

Of course, we must also have \(\dim\im T\leq \dim W\text{,}\) since \(\im T\) is a subspace of \(W\text{.}\) In the case of a matrix transformation \(T_A\text{,}\) this means that the rank of \(T_A\) is at most the minimum of \(\dim V\) and \(\dim W\text{.}\) This once again has consequences for the existence and uniqueness of solutions for linear systems with the coefficient matrix \(A\text{.}\)

🔗

Exercises Exercises

1.

Knowing that the kernel and image of an operator are subspaces gives us a convenient way to define certain subspaces. Here is a nice example of this.

🔗

Let \(T:M_{nn}\to M_{nn}\) be defined by \(T(A)=A-A^T\text{.}\) Show that:

🔗

(a)

\(T\) is a linear map.

🔗

Hint.

You can use the fact that the transpose is linear: \((A+B)^T=A^T+B^T\) and \((cA)^T=cA^T\text{.}\)

🔗

(b)

\(\ker T\) is equal to the set of all symmetric matrices.

🔗

Hint.

A matrix is symmetric if \(A^T=A\text{,}\) or in other words, \(A-A^T=0\text{.}\)

🔗

(c)

\(\im T\) is equal to the set of all skew-symmetric matrices.

🔗

Hint.

A matrix is skew-symmetric if \(A^T=-A\text{.}\)

🔗

2.

Let

\begin{equation*} A={\left[\begin{array}{ccc} 1 \amp 4 \amp 4\cr -1 \amp -2 \amp -3\cr 0 \amp -2 \amp -1\cr -1 \amp -2 \amp -3 \end{array}\right]}\text{.} \end{equation*}

🔗

Find a basis for the image of \(A\) (or, equivalently, for the linear transformation \(T(x)=Ax\)).

🔗

3.

Let

\begin{equation*} A = {\left[\begin{array}{ccc} -4 \amp -1 \amp 4\cr -1 \amp 2 \amp 1\cr -1 \amp 2 \amp 1\cr -8 \amp -2 \amp 8 \end{array}\right]}. \end{equation*}

Find dimensions of the kernel and image of \(T(\vec{x}) = A \vec{x}\text{.}\)

🔗

4.

Let

\begin{equation*} A = {\left[\begin{array}{cccc} 1 \amp 2 \amp 2 \amp -5\cr 0 \amp 1 \amp -2 \amp -2\cr 3 \amp 9 \amp 0 \amp -21 \end{array}\right]}. \end{equation*}

Find a vector \(\vec{w}\) in \({\mathbb R}^3\) that is not in the image of the transformation \(\vec{x}\mapsto A\vec{x}\text{.}\)

🔗

5.

Suppose \(A \in M_{2,3}(\mathbb{R})\) is a matrix and

\begin{equation*} A \mathbf{e_1} = {\left[\begin{array}{c} -3\cr -2 \end{array}\right]}, A \mathbf{e_2} = {\left[\begin{array}{c} -3\cr -1 \end{array}\right]}, \text{ and } A \mathbf{e_3} = {\left[\begin{array}{c} -3\cr 0 \end{array}\right]}. \end{equation*}

🔗

What is \(A( -5 \mathbf{e_1} - 3 \mathbf{e_2} - 4 \mathbf{e_3})\text{?}\)
🔗

🔗
Find the matrix for the linear transformation \(f\) (relative to the standard basis in the domain and codomain). That is, find the matrix \(A\) such that \(f(\mathbf{x}) = A \mathbf{x}\text{.}\)
🔗

🔗
Find a formula for the linear transformation \(f\text{.}\)
🔗

🔗
Find bases (i.e., maximal independent sets) for the kernel and image of \(f\text{.}\)
🔗

🔗
The linear transformation \(f\) is:
🔗
- injective
  🔗
  
  🔗
- surjective
  🔗
  
  🔗
- bijective
  🔗
  
  🔗
- none of these
  🔗
  
  🔗
🔗
🔗

🔗

6.

Suppose \(f :\mathbb{R}^2 \to \mathbb{R}^3\) is the function defined by \(\displaystyle{ f(x,y) = {\left[\begin{array}{c} 5x+3y\cr -4y+3x\cr -3x\cr \end{array}\right]}. }\)

🔗

What is \(f( 2, -5 )\text{?}\) Enter your answer as a coordinate vector of the form \(\langle 1,2,3\rangle\text{.}\)
🔗

🔗
If \(f\) is a linear transformation, find the matrix \(A\) such that \(f(\mathbf{x}) = A \mathbf{x}\text{.}\)
🔗

🔗
Find bases (i.e., minimal spanning sets) for the kernel and image of \(f\text{.}\)
🔗

🔗
The linear transformation \(f\) is:
🔗
- injective
  🔗
  
  🔗
- surjective
  🔗
  
  🔗
- bijective
  🔗
  
  🔗
- none of these
  🔗
  
  🔗
🔗
🔗

🔗

7.

Let \(f : \mathbb{R}^{2} \to \mathbb{R}^{2}\) be the linear transformation defined by \(f(x,y) = \langle {5x+y}, {3x-3y} \rangle\text{.}\)

🔗

Find the matrix of the linear transformation \(f\text{.}\)
🔗

🔗
The linear transformation \(f\) is
🔗
- injective
  🔗
  
  🔗
- surjective
  🔗
  
  🔗
- bijective
  🔗
  
  🔗
- none of these
  🔗
  
  🔗
🔗
🔗
If \(f\) is bijective, find the matrix of its inverse.
🔗

🔗

🔗

8.

Let \(f : \mathbb{R}^{2} \to \mathbb{R}^{3}\) be the linear transformation determined by

\begin{equation*} f\begin{pmatrix}1 \\ 0\end{pmatrix}=\begin{pmatrix}-4 \\ -3 \\ -1 \end{pmatrix} \quad f\begin{pmatrix}0 \\ 1\end{pmatrix}=\begin{pmatrix}-3 \\ 5 \\ -2 \end{pmatrix} \end{equation*}

🔗

Find \(\displaystyle{f\begin{pmatrix}8 \\ -9\end{pmatrix}}\text{.}\)
🔗

🔗
Find the matrix of the linear transformation \(f\text{.}\)
🔗

🔗
The linear transformation \(f\) is
🔗
- injective
  🔗
  
  🔗
- surjective
  🔗
  
  🔗
- bijective
  🔗
  
  🔗
- none of these
  🔗
  
  🔗
🔗
🔗

🔗

9.

Let \(f : \mathbb{R}^{2} \to \mathbb{R}^{3}\) be the linear transformation determined by

\begin{equation*} f(\vec{e}_1)=\begin{pmatrix}-9 \\ 12 \\ 3 \end{pmatrix}, \quad f(\vec{e}_2)=\begin{pmatrix}-15 \\ 20 \\ 5 \end{pmatrix}. \end{equation*}

🔗

Find \(\displaystyle{f\begin{pmatrix}4 \\ -1\end{pmatrix}}\)
🔗

🔗
Find the matrix of the linear transformation \(f\text{.}\)
🔗

🔗
The linear transformation \(f\) is
🔗
- injective
  🔗
  
  🔗
- surjective
  🔗
  
  🔗
- bijective
  🔗
  
  🔗
- none of these
  🔗
  
  🔗
🔗
🔗

🔗

10.

A linear transformation \(T:{\mathbb R}^3 \rightarrow {\mathbb R}^2\) whose matrix is

\begin{equation*} {\left[\begin{array}{ccc} -1 \amp 1 \amp 1\cr -3 \amp 3 \amp 0+k\cr \end{array}\right]} \end{equation*}

is onto if and only if \(k \neq\).

🔗

11.

Let \(L:\mathbb{R}^3 \to \mathbb{R}^3\) be the linear operator defined by

🔗

\begin{equation*} L(x) = \begin{bmatrix}-12 x_1 - 24 x_2 + 12 x_3\\-12 x_2\\-36 x_1 + 36 x_3\end{bmatrix} \end{equation*}

🔗

(a) Find the dimension of the range of \(L\text{:}\)

🔗

(b) Find the dimension of the kernel of \(L\text{:}\)

🔗

(c) Let \(S\) be the subspace of \(\mathbb{R}^3\) spanned by \(11 e_1\) and \(24 e_2 + 24 e_3\text{.}\) Find the dimension of \(L(S)\text{:}\)

🔗

12.

Let \(\mathcal{P}_n\) be the vector space of all polynomials of degree \(n\) or less in the variable \(x\text{.}\) Let \(D^2 : \mathcal{P}_4 \to \mathcal{P}_2\) be the linear transformation that takes a polynomial to its second derivative. That is, \(D^2(p(x)) = p''(x)\) for any polynomial \(p(x)\) of degree \(4\) or less.

🔗

Find a basis for the kernel of \(D^2\text{.}\)
🔗

🔗
Find a basis for the image of \(D^2\text{.}\)
🔗

🔗

🔗

13.

The following exercise is both challenging, and useful. A solution can be found at the end of the book, but do your best to come up with a proof before looking at the solution.

🔗

Let \(V\) and \(W\) be finite-dimensional vector spaces. Prove the following:

🔗

(a)

\(\dim V\leq \dim W\) if and only if there exists an injection \(T:V\to W\text{.}\)

🔗

Hint.

You’re dealing with an “if and only if” statement, so be sure to prove both directions. One direction follows immediately from the dimension theorem.

🔗

What makes the other direction harder is that you need to prove an existence statement. To show that a transformation with the required property exists, you’re going to need to construct it! To do so, try defining your transformation in terms of a basis.

🔗

(b)

\(\dim V\geq \dim W\) if and only if there exists a surjection \(T:V\to W\text{.}\)

🔗

Hint.

The hint from the previous part also applies here!

🔗

You have attempted of activities on this page.

🔗

Prev Top Next