The trick with this proof is that we aren’t assuming \(V\) is finite-dimensional, so we can’t start with a basis of \(V\text{.}\) But we do know that \(\im T\) is finite-dimensional, so we start with a basis \(\{\ww_1,\ldots, \ww_m\}\) of \(\im T\text{.}\) Of course, every vector in \(\im T\) is the image of some vector in \(V\text{,}\) so we can write \(\ww_i =T(\vv_i)\text{,}\) where \(\vv_i\in V\text{,}\) for \(i=1,2,\ldots, m\text{.}\)
Since \(\{T(\vv_1),\ldots, T(\vv_m)\}\) is a basis, it is linearly independent. The results of Exercise 2.1.12 tell us that the set \(\{\vv_1,\ldots, \vv_m\}\) must therefore be independent.
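One way to see this directly (a short sketch included for convenience, not quoted from the exercise) is to suppose that \(c_1\vv_1+\cdots +c_m\vv_m=\zer\) for some scalars \(c_1,\ldots, c_m\text{.}\) Applying \(T\) and using linearity gives
\begin{equation*}
c_1T(\vv_1)+\cdots +c_mT(\vv_m)=T(\zer)=\zer\text{,}
\end{equation*}
and since the vectors \(T(\vv_i)\) are independent, each \(c_i\) must be zero.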
We now introduce a basis \(\{\uu_1,\ldots, \uu_n\}\) of \(\ker T\text{,}\) which we also know to be finite-dimensional. If we can show that the set \(\{\uu_1,\ldots, \uu_n,\vv_1,\ldots, \vv_m\}\) is a basis for \(V\text{,}\) we’d be done, since the number of vectors in this basis is \(\dim\ker T + \dim \im T\text{.}\) We must therefore show that this set is independent, and that it spans \(V\text{.}\)
To see that it’s independent, suppose that
\begin{equation*}
a_1\uu_1+\cdots + a_n\uu_n+b_1\vv_1+\cdots +b_m\vv_m=\zer\text{.}
\end{equation*}
Applying \(T\) to this equation, and noting that \(T(\uu_i)=\zer\) for each \(i\text{,}\) by definition of the \(\uu_i\text{,}\) we get
\begin{equation*}
b_1T(\vv_1)+\cdots +b_mT(\vv_m)=\zer\text{.}
\end{equation*}
The vectors \(T(\vv_i)\) form a basis of \(\im T\) and are therefore independent, so all the \(b_i\) must be zero. But then we get
\begin{equation*}
a_1\uu_1+\cdots +a_n\uu_n=\zer\text{,}
\end{equation*}
and since the \(\uu_i\) are independent, all the \(a_i\) must be zero.
To see that these vectors span, choose any \(\xx\in V\text{.}\) Since \(T(\xx)\in \im T\text{,}\) there exist scalars \(c_1,\ldots, c_m\) such that
\begin{equation}
T(\xx)=c_1T(\vv_1)+\cdots +c_mT(\vv_m)\text{.}\tag{2.2.1}
\end{equation}
We’d like to be able to conclude from this that \(\xx=c_1\vv_1+\cdots +c_m\vv_m\text{,}\) but this need not be true unless \(T\) is known to be injective (which it isn’t). Failure to be injective involves the kernel, so how do we bring that into the picture?
The trick is to realize that the reason we might have \(\xx\neq c_1\vv_1+\cdots +c_m\vv_m\) is that we’re off by something in the kernel. Indeed, (2.2.1) can be re-written as
\begin{equation*}
T(\xx-c_1\vv_1-\cdots -c_m\vv_m) = \zer\text{,}
\end{equation*}
so \(\xx-c_1\vv_1-\cdots -c_m\vv_m\in\ker T\text{.}\) But we have a basis for \(\ker T\text{,}\) so we can write
\begin{equation*}
\xx-c_1\vv_1-\cdots -c_m\vv_m=t_1\uu_1+\cdots +t_n\uu_n
\end{equation*}
for some scalars \(t_1,\ldots, t_n\text{,}\) and this can be rearranged to give
\begin{equation*}
\xx=t_1\uu_1+\cdots +t_n\uu_n+c_1\vv_1+\cdots + c_m\vv_m\text{,}
\end{equation*}
which completes the proof.
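To see the construction in action, here is a small illustration; the map and vectors below are chosen for this example and are not part of the proof. Take \(V=\mathbb{R}^3\) and let \(T:\mathbb{R}^3\to\mathbb{R}^2\) be given by \(T(x,y,z)=(x+y,y+z)\text{.}\) Here \(V\) happens to be finite-dimensional, which the proof does not require, but it keeps the arithmetic easy to check. The image is all of \(\mathbb{R}^2\text{,}\) with basis \(\ww_1=(1,0)\text{,}\) \(\ww_2=(0,1)\text{,}\) and we may choose \(\vv_1=(1,0,0)\) and \(\vv_2=(0,0,1)\text{,}\) since \(T(\vv_1)=\ww_1\) and \(T(\vv_2)=\ww_2\text{.}\) The kernel consists of the vectors with \(x=-y\) and \(z=-y\text{,}\) so it has basis \(\uu_1=(1,-1,1)\text{.}\)

Now take \(\xx=(2,3,5)\text{.}\) Then \(T(\xx)=(5,8)=5T(\vv_1)+8T(\vv_2)\text{,}\) but \(\xx\neq 5\vv_1+8\vv_2=(5,0,8)\text{.}\) The difference lies in the kernel:
\begin{equation*}
\xx-5\vv_1-8\vv_2=(-3,3,-3)=-3\uu_1\text{,}
\end{equation*}
and rearranging gives
\begin{equation*}
\xx=-3\uu_1+5\vv_1+8\vv_2\text{,}
\end{equation*}
exactly as in the spanning argument. In this example \(\dim\ker T+\dim\im T=1+2=3=\dim V\text{.}\)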