Chapter 3 Linear Mappings and Their Matrices

3.1 Linear Mappings

3.1.1.

Prove that $T : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ is linear if and only if it satisfies (3.1) and (3.2). (It may help to rewrite (3.1) with the symbols $x_1$ and $x_2$ in place of $x$ and $y$ . Then prove one direction by showing that (3.1) and (3.2) are implied by the defining condition for linearity, and prove the other direction by using induction to show that (3.1) and (3.2) imply the defining condition. Note that as pointed out in the text, one direction of this argument has a bit more substance than the other.)

Proof:

$\Leftarrow$ (conditions (3.1) and (3.2) imply the defining condition)

Use induction on the number $k$ of terms. When $k = 1$ ,

$T(α_1 x_1) = α_1 T(x_1)$

This is (3.2).

Now assume that the case $k$ holds; we check the case $k + 1$ .

$\begin{align*} T(\sum_{i = 1}^{k+1} α_i x_i) &= T((\sum_{i = 1}^{k} α_i x_i) + α_{k+1} x_{k+1}) \\ &= T(\sum_{i = 1}^{k} α_i x_i) + T(α_{k+1} x_{k+1}) \qquad \text{Use (3.1)}\\ &= T(\sum_{i = 1}^{k} α_i x_i) + α_{k+1} T(x_{k+1}) \qquad \text{Use (3.2)}\\ &= \sum_{i = 1}^{k}α_iT(x_i) + α_{k+1} T(x_{k+1}) \qquad \text{Use induction}\\ &= \sum_{i = 1}^{k+1}α_iT(x_i)\\ \end{align*}$

$\Rightarrow$ (the defining condition implies (3.1) and (3.2))

To prove (3.1), set $k = 2, α_1 = α_2 = 1$ .

To prove (3.2), set $k = 1$ .

$\square$

3.1.2.

Suppose that $T : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ is linear. Show that $T(0_n) = 0_m$ . (An intrinsic argument is nicer.)

Proof:

Note that $0 v = 0$ for every vector $v$ in $\mathbb{R}^n$ or in $\mathbb{R}^m$ . So, for any $v \in \mathbb{R}^n$ ,

$T(0_n) = T(0 v) = 0 T(v) = 0_m$

$\square$

3.1.3

Fix a vector $a ∈ \mathbb{R}^n$ . Show that the mapping $T : \mathbb{R}^n \longrightarrow \mathbb{R}$ given by $T(x) = ⟨a,x⟩$ is linear, and that $T(e_j) = a_j$ for $j = 1,...,n$ .

Proof:

(3.1) $T(x + y) = ⟨a,x+y⟩ = ⟨a,x⟩ + ⟨a,y⟩ = T(x) + T(y)$

(3.2) $T(αx) = ⟨a,αx⟩ = α ⟨a,x⟩ = αT(x)$

$T(e_j) = ⟨a,e_j⟩ = a_j$ follows from the definition of the inner product, since $e_j$ has a $1$ in position $j$ and $0$ elsewhere.

$\square$

3.1.4

Find the linear mapping $T : \mathbb{R}^3 \longrightarrow \mathbb{R}$ such that $T(0,1,1) = 1$ , $T(1,0,1) = 2$ , and $T(1,1,0) = 3$ .

Solution

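Every linear mapping $T : \mathbb{R}^3 \longrightarrow \mathbb{R}$ has the form $T(x) = ⟨a,x⟩$ with $a = (a_1, a_2, a_3)$ and $a_j = T(e_j)$ . The three conditions give the system

$\begin{cases} 0a_1 + 1a_2 + 1a_3 = 1 \\ 1a_1 + 0a_2 + 1a_3 = 2 \\ 1a_1 + 1a_2 + 0a_3 = 3 \\ \end{cases}$

Adding the three equations gives $a_1 + a_2 + a_3 = 3$ , and subtracting each equation in turn from this gives $a_1 = 2$ , $a_2 = 1$ , $a_3 = 0$ . So $T(x) = ⟨a,x⟩$ with $a = (2, 1, 0)$ , that is, $T(x_1, x_2, x_3) = 2x_1 + x_2$ .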

$\square$
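The system above can also be solved mechanically. The following is a minimal sketch, not part of the original solution: the helper name `solve` is an assumption made here, and it runs Gauss-Jordan elimination with exact rational arithmetic from Python's standard library, assuming the system has a unique solution.

```python
from fractions import Fraction

def solve(aug):
    """Gauss-Jordan elimination on an augmented matrix [M | b] with a unique solution."""
    rows = [[Fraction(v) for v in row] for row in aug]
    n = len(rows)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the pivot equals 1.
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [v - f * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Rows encode T(0,1,1) = 1, T(1,0,1) = 2, T(1,1,0) = 3 for T(x) = <a, x>.
a = solve([[0, 1, 1, 1],
           [1, 0, 1, 2],
           [1, 1, 0, 3]])
print(a)
```

The returned list holds the coefficients $a_1, a_2, a_3$ as exact fractions, so it can be compared directly with the hand computation.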

3.1.5.

Complete the proof of the componentwise nature of linearity.

Proof:

For $α \in \mathbb{R}, x \in \mathbb{R}^n$

If $T_1, \cdots, T_m$ are linear, then

$\begin{aligned} T(αx) &= (T_1(αx), \cdots, T_m(αx)) \\ &= (αT_1(x), \cdots, αT_m(x)) \\ &= α (T_1(x), \cdots, T_m(x)) \\ &= α T(x) \end{aligned}$

On the other hand, if $T$ is linear,

Then

$\begin{split} (T_1(\alpha x), \cdots, T_m(\alpha x)) &= T(αx) \\ &= α T(x) \\ &= α (T_1(x), \cdots, T_m(x)) \\ &= (αT_1(x), \cdots, αT_m(x)) \\ \end{split}$

So $T_i(αx) = α T_i(x)$

$\square$

3.1.6.

Carry out the matrix-by-vector multiplications

Solution:

(a) $[1,3,6]$

(b) $(ax+by, cx+dy, ex+fy)$

(c) $(x_1 y_1, \cdots, x_n y_n)$

(d) $(0,0,0)$

$\square$

3.1.7.

Prove that the identity mapping $\text{id} : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ is linear. What is its matrix? Explain.

Proof

(3.1) $\text{id}(x+y) = x+y = \text{id}(x) + \text{id}(y)$

(3.2) $\text{id}(αx) = αx = α \text{id}(x)$

Column $j$ of the matrix is $\text{id}(e_j) = e_j$ , so the matrix is the $n \times n$ identity matrix $I_n$ :

$\begin{bmatrix} 1 &0 &\cdots &0 \\ 0 &1 &\cdots &0 \\ \vdots &\vdots &\ddots &\vdots \\ 0 &0 &\cdots &1 \\ \end{bmatrix}$

$\square$

3.1.8.

Let $θ$ denote a fixed but generic angle. Argue geometrically that the mapping $R : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ given by counterclockwise rotation by $θ$ is linear, and then find its matrix.

Proof:

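Geometrically, rotation is rigid: it carries the parallelogram with sides $x$ and $y$ to the parallelogram with sides $R(x)$ and $R(y)$ , so it carries the diagonal $x + y$ to the diagonal $R(x) + R(y)$ , giving (3.1). Likewise it carries the scaled vector $αx$ to the correspondingly scaled vector $αR(x)$ , giving (3.2).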

Consider

$R(e_1) = (\cos θ, \sin θ), \quad R(e_2) = (-\sin θ, \cos θ)$

These are the columns of the matrix, so the matrix is

$\begin{bmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \\ \end{bmatrix}$

$\square$

3.1.9.

Show that the mapping $Q : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ given by reflection through the x-axis is linear. Find its matrix.

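Proof:

Reflection through the $x$ -axis is $Q(x, y) = (x, -y)$ . Each component is a linear function of $(x, y)$ , so $Q$ is linear by the componentwise nature of linearity.

$Q(e_1) = (1, 0), \quad Q(e_2) = (0, -1)$

Then the matrix is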

$\begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix}$

$\square$

3.1.10.

Show that the mapping $P : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ given by orthogonal projection onto the diagonal line $x = y$ is linear. Find its matrix. (See Exercise 2.2.15.)

Proof:

We can use vector $d = (1,1)$ to represent the diagonal line $x = y$ .

Then for $x, y \in \mathbb{R}^2$

$P(x) = \frac{⟨d,x⟩}{|d|^2} d, \qquad P(y) = \frac{⟨d,y⟩}{|d|^2} d$

$P(x+y) = \frac{⟨d,x+y⟩}{|d|^2} d = \frac{⟨d,x⟩}{|d|^2} d + \frac{⟨d,y⟩}{|d|^2} d = P(x) + P(y)$

Also

$P(αx) = \frac{⟨d,αx⟩}{|d|^2} d = α\frac{⟨d,x⟩}{|d|^2} d = α P(x)$

So $P$ is linear.

Since $⟨d,e_1⟩ = ⟨d,e_2⟩ = 1$ and $|d|^2 = 2$ ,

$P(e_1) = \frac{⟨d,e_1⟩}{|d|^2} d = (1/2, 1/2), \quad P(e_2) = \frac{⟨d,e_2⟩}{|d|^2} d = (1/2, 1/2)$

Then the matrix is

$\begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \\ \end{bmatrix}$

$\square$
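As an illustration, the projection formula $P(x) = \frac{⟨d,x⟩}{|d|^2} d$ can be evaluated exactly on $e_1$ and $e_2$ to read off the columns of the matrix. This is only a sketch; the function name `project_onto` and the test vector are assumptions introduced here.

```python
from fractions import Fraction

def project_onto(d, x):
    """Orthogonal projection of x onto the line spanned by d: (<d,x>/|d|^2) d."""
    scale = Fraction(sum(di * xi for di, xi in zip(d, x)),
                     sum(di * di for di in d))
    return [scale * di for di in d]

d = [1, 1]
e1, e2 = [1, 0], [0, 1]

# The columns of the matrix of P are P(e1) and P(e2).
col1, col2 = project_onto(d, e1), project_onto(d, e2)
P = [[col1[0], col2[0]],
     [col1[1], col2[1]]]

# Projecting twice should change nothing: P(P(x)) = P(x).
x = [3, -5]
once = project_onto(d, x)
twice = project_onto(d, once)
print(P, once, twice)
```

A projection satisfies $P \circ P = P$ , which is why comparing `once` and `twice` is a cheap sanity check on the matrix entries.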

3.1.11.

Draw the graph of a generic linear mapping from $\mathbb{R}^2$ to $\mathbb{R}^3$ .

Solution: skip.

3.1.12.

Continue the proof of Proposition 3.1.8 by proving the other three statements about $S + T$ and $aS$ satisfying (3.1) and (3.2).

Proof:

(3.2) for $S+T$ .

$(S+T)(αx) = S(αx) + T(αx) = αS(x) + αT(x) = α (S(x) + T(x)) = α (S+T)(x)$

Next, (3.1) for $aS$ .

$\begin{align*} (aS)(x+y) &= a(S(x + y)) & \quad \text{definition of the } aS \\ &= a(S(x) + S(y)) & \quad \text{S is linear} \\ &= aS(x) + aS(y) & \quad \text{Vector space axioms D2} \\ &= (aS)(x) + (aS)(y) & \quad \text{definition of the } aS \\ \end{align*}$

Next, (3.2) for $aS$ .

$\begin{aligned} (aS)(αx) &= a(S(αx)) & \quad \text{definition of the } aS \\ &= a(αS(x)) & \quad \text{S is linear} \\ &= α(aS(x)) & \quad \mathbb{R} \text{ is a field} \\ &= α((aS)(x)) & \quad \text{definition of the } aS \\ \end{aligned}$

$\square$

3.1.13.

If $S \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ and $T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^n)$ , show that $S \circ T : \mathbb{R}^p \longrightarrow \mathbb{R}^m$ lies in $\mathcal{L}(\mathbb{R}^p, \mathbb{R}^m)$ .

Proof:

(3.1)

$\begin{align*} (S \circ T)(x+y) &= S(T(x+y)) \\ &= S(T(x) + T(y)) \\ &= S(T(x)) + S(T(y)) \\ &= (S \circ T)(x) + (S \circ T)(y) \\ \end{align*}$

Next (3.2)

$\begin{align*} (S \circ T)(αx) &= S(T(αx)) \\ &= S(αT(x)) \\ &= α(S(T(x))) \\ &= α((S \circ T)(x)) \\ \end{align*}$

So $S \circ T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^m)$ .

3.1.14

(a) Let $S \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ . Its transpose is the mapping

$S^T : \mathbb{R}^m \longrightarrow \mathbb{R}^n$

defined by the characterizing condition

$⟨x,S^T(y)⟩ = ⟨S(x),y⟩ \qquad \text{for all } x \in \mathbb{R}^n \text{ and } y \in \mathbb{R}^m.$

Granting that indeed a unique such $S^T$ exists, use the characterizing condition to show that

$S^T(y+y') = S^T(y) + S^T(y') \qquad \text{for all } y, y' \in \mathbb{R}^m$

by showing that

$⟨x,S^T(y+y')⟩ = ⟨x,S^T(y)+S^T(y')⟩ \qquad \text{for all } x \in \mathbb{R}^n \text{ and } y, y' \in \mathbb{R}^m.$

Proof:

$\begin{align*} ⟨x,S^T(y+y')⟩ &= ⟨S(x),y+y'⟩ \\ &= ⟨S(x),y⟩ + ⟨S(x),y'⟩ \\ &= ⟨x,S^T(y)⟩ + ⟨x,S^T(y')⟩ \\ &= ⟨x,S^T(y) + S^T(y')⟩ \\ \end{align*}$

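Because $x$ is arbitrary, we may take $x = S^T(y+y') - S^T(y) - S^T(y')$ , which shows that this difference has inner product $0$ with itself and hence is $0$ . So we have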

$S^T(y+y') = S^T(y) + S^T(y') \qquad \text{for all } y, y' \in \mathbb{R}^m$

$\square$

(b) Keeping S from part (a), now further introduce $T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^n)$ , so that also $S \circ T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^m)$ . Show that the transpose of the composition is the composition of the transposes in reverse order,

$(S ◦T)^T = T^T ◦S^T,$

by showing that

$⟨x,(S ◦T)^T(z)⟩ = ⟨x, (T^T ◦S^T)(z)⟩$

Proof:

$\begin{align*} ⟨x,(S ◦T)^T(z)⟩ &= ⟨(S◦T)(x), z⟩ \\ &= ⟨S(T(x)), z⟩ \\ &= ⟨T(x), S^T(z)⟩ \\ &= ⟨x, T^T (S^T(z))⟩ \\ &= ⟨x, (T^T ◦ S^T)(z)⟩ \\ \end{align*}$

$\square$

3.1.15

A mapping $f : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ is called aﬃne if it has the form $f(x) = T(x) + b$ , where $T ∈ \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ and $b ∈ \mathbb{R}^m$ . State precisely and prove: the composition of aﬃne mappings is aﬃne.

Proof:

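Precise statement: if $f : \mathbb{R}^p \longrightarrow \mathbb{R}^n$ , $f(x) = T(x) + b$ , and $g : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ , $g(y) = S(y) + c$ , are affine, with $T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^n)$ , $b \in \mathbb{R}^n$ , $S \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ , and $c \in \mathbb{R}^m$ , then

$g \circ f : \mathbb{R}^p \longrightarrow \mathbb{R}^m \quad \text{ is affine}.$

Indeed,

$\begin{align*} g(f(x)) &= g(T(x) + b) \\ &= S(T(x) + b) + c \\ &= S(T(x)) + S(b) + c &\quad S \text{ is linear} \\ &= (S \circ T)(x) + (S(b) + c) \\ \end{align*}$

Here $S \circ T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^m)$ by 3.1.13 and $S(b) + c \in \mathbb{R}^m$ , so $g \circ f$ is affine.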

$\square$
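A small numerical check can make the shape of the composite concrete: its linear part should be $S \circ T$ and its translation part should be $S(b) + c$ . The sketch below uses made-up matrices and vectors (all of the data and the helper names `matvec`, `matmul`, `affine` are assumptions chosen here for illustration).

```python
def matvec(M, v):
    """Matrix-by-vector product with M given as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix product of A (p x q) and B (q x r), both lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def affine(M, b):
    """Return the affine mapping x -> M x + b."""
    return lambda x: [y + bi for y, bi in zip(matvec(M, x), b)]

# Hypothetical data: f(x) = T x + b from R^2 to R^3, g(y) = S y + c from R^3 to R^2.
T = [[1, 2], [0, 1], [3, -1]]
b = [1, 0, 2]
S = [[2, 0, 1], [1, -1, 0]]
c = [5, -3]
f, g = affine(T, b), affine(S, c)

# Predicted composite: linear part S T, translation part S b + c.
h = affine(matmul(S, T), [u + v for u, v in zip(matvec(S, b), c)])

x = [4, -7]
print(g(f(x)), h(x))
```

Agreement at a few sample points does not prove the statement, but it does catch sign or ordering mistakes in the translation term.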

3.1.16

Let $T : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ be a linear mapping. Note that since $T$ is continuous and since the absolute value function on $\mathbb{R}^m$ is continuous, the composite function

$|T| : \mathbb{R}^n \longrightarrow \mathbb{R}$

is continuous.

(a) Let $S= \{x ∈ \mathbb{R}^n : |x| = 1\}$ . Explain why $S$ is a compact subset of $\mathbb{R}^n$ . Explain why it follows that $|T|$ takes a maximum value $c$ on $S$ .

Proof: $S$ is bounded, since every point of $S$ has norm $1$ . So we just need to show that $S$ is closed.

Use the sequential characterization of closed sets: let $\{x_ν\}$ be a sequence in $S$ converging to some $p \in \mathbb{R}^n$ .

Since $|\cdot|$ is a continuous function and $|x_ν| = 1$ for every $ν$ ,

$|p| = \lim_{\nu \to \infty} |x_ν| = 1$

So $p \in S$ , and hence $S$ is closed. Being closed and bounded, $S$ is compact.

Then based on Theorem 2.4.15 (Extreme value theorem), $|T|$ takes a maximum value $c$ on $S$ .

$\square$

(b) Show that $|T(x)| ≤ c|x|$ for all $x ∈ \mathbb{R}^n$ . This result is the linear magnification boundedness lemma. We will use it in Chapter 4.

Proof:

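If $x = 0$ then $T(x) = 0$ and the inequality holds. Otherwise write $x = |x|\frac{x}{|x|}$ , so that

$\begin{aligned} |T(x)| &= \left| T(|x|\frac{x}{|x|}) \right| \\ &= |x| \left| T(\frac{x}{|x|}) \right| &\qquad T\text{ is linear} \\ &\leq c|x| &\qquad \left| \frac{x}{|x|} \right| = 1 \end{aligned}$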

$\square$

3.1.17.

Let $T : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ be a linear mapping.

(a) Explain why the set $D = \{x ∈ \mathbb{R}^n : |x| = 1\}$ is compact.

Proof: See 3.1.16 (a)

(b) Use part (a) of this exercise and part (b) of the preceding exercise to explain why therefore the set $\{|T(x)| : x ∈ D\}$ has a maximum. This maximum is called the norm of $T$ and is denoted $\left\| T \right\|$ .

Proof:

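See 3.1.16 (a): $|T|$ is continuous and $D$ is compact, so by the extreme value theorem $|T|$ takes a maximum value on $D$ .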

(c) Explain why $\left\| T \right\|$ is the smallest value $K$ that satisfies the condition from part (b) of the preceding exercise, $|T(x)| ≤ K|x|$ for all $x ∈ \mathbb{R}^n$ .

Proof:

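First, we know from 3.1.16 (b) that $|T(x)| ≤ \left\| T \right\| |x|$ for all $x$ , so $K = \left\| T \right\|$ satisfies the condition.

Next, we can find a point $c \in D$ such that

$|T(c)| = \left\| T \right\| = \left\| T \right\| |c|$

Any $K$ that satisfies the condition must therefore satisfy $\left\| T \right\| = |T(c)| \leq K|c| = K$ , so no smaller value works.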

$\square$

(d) Show that for every $S,T ∈ \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ and every $a ∈ \mathbb{R}$ ,

$\left\| S + T \right\| \leq \left\| S \right\| + \left\| T \right\| \qquad \text{and} \qquad \left\| aT \right\| = |a|\left\| T \right\|$

Define a distance function

$d: \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m) \times \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m) \longrightarrow \mathbb{R}, \quad d(S, T) = \left\| S - T \right\|$

Show that this function satisfies the distance properties of Theorem 2.2.8.

Proof:

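Choose $c \in D$ with $|(S+T)(c)| = \left\| S+T \right\|$ . Then

$\begin{aligned} \left\| S+T \right\| &= |(S+T)(c)| \\ &= |S(c) + T(c)| \\ &\leq |S(c)| + |T(c)| \\ &\leq \left\| S \right\| + \left\| T \right\| \end{aligned}$

For the second claim, $|(aT)(x)| = |a| |T(x)|$ for every $x \in D$ , so the maximum of $|(aT)(x)|$ over $D$ is $|a|$ times the maximum of $|T(x)|$ over $D$ :

$\left\| aT \right\| = \max_{x \in D} |a| |T(x)| = |a| \max_{x \in D} |T(x)| = |a| \left\| T \right\|$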

$\square$

(D1) $d(S, T) \geq 0$ .

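Since $|(S - T)(x)| \geq 0$ for every $x$ , we have $\left\| S-T \right\| \geq 0$ .

$d(S,T) = 0$ if and only if $S = T$ : if $S = T$ then $S - T$ is the zero mapping and $\left\| S-T \right\| = 0$ ; conversely, if $\left\| S-T \right\| = 0$ then $|(S-T)(x)| \leq 0 \cdot |x| = 0$ for all $x$ , so $S(x) = T(x)$ for all $x$ .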

$\square$

(D2) $d(S,T) = d(T,S)$

Because

$|(S-T)(x)| = |(T-S)(x)|$

for every $x \in D$ , the two maxima over $D$ agree.

$\square$

(D3) $d(R,T) ≤ d(R,S)+d(S,T)$

Choose $c \in D$ with $|(R-T)(c)| = \left\| R - T \right\|$ . Then

$\begin{aligned} d(R,T) = \left\| R - T \right\| &= |(R-T)(c)| \\ &= |R(c) - T(c)| \\ &= |(R(c)-S(c)) + (S(c)-T(c))| \\ &\leq |R(c)-S(c)| + |S(c)-T(c)| \\ &= |(R-S)(c)| + |(S-T)(c)| \\ &\leq \left\| R-S \right\| + \left\| S-T \right\| \\ &= d(R,S)+d(S,T) \end{aligned}$

$\square$

(e) Show that for every $S \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ and every $T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^n)$ , $\left\| ST \right\| \leq \left\| S \right\| \left\| T \right\|$ .

Proof:

Here $ST$ denotes $S \circ T$ . Choose $c \in \mathbb{R}^p$ with $|c| = 1$ such that

$\left\| ST \right\| = |S(T(c))|$

Let $x = T(c) \in \mathbb{R}^n$ .

$|S(x)| \leq \left\| S \right\| |x| \\ |x| = |T(c)| \leq \left\| T \right\| |c| = \left\| T \right\|$

Therefore

$\begin{align*} \left\| ST \right\| &= |S(T(c))| \\ &= |S(x)| \\ &\leq \left\| S \right\| |x| \\ &\leq \left\| S \right\| \left\| T \right\| \\ \end{align*}$

$\square$

3.2 Operations on Matrices

3.2.1.

Justify Definition 3.2.2 of scalar multiplication of matrices.

Solution:

Let $A$ be the matrix of $S$ . The $j$ th column of the matrix of $αS$ is

$(αS)(e_j) = α (S(e_j)) = α \times (j\text{th column of } A)$

So it's reasonable to define $αA = [αa_{ij}]_{m \times n}$ .

$\square$

3.2.2.

Carry out the matrix multiplications.

Solution: skipped.

3.2.3

Prove more of Proposition 3.2.5, that $A(B+C) = AB+AC$ , $(αA)B= A(αB)$ , and $I_mA= A$ for suitable matrices $A,B,C$ and any scalar $α$ .

Proof:

Because the matrix of a composition is the product of the matrices, it suffices to verify the corresponding identities for linear mappings $R$ , $S$ , $T$ of suitable dimensions.

$\begin{align*} (R ◦ (S + T))(x) &= R((S+T)(x)) &\quad \text{by definition of } R ◦ (S + T) \\ &= R(S(x)+T(x)) &\quad \text{by definition of } S + T \\ &= R(S(x)) + R(T(x)) &\quad R \text{ is linear} \\ &= (R ◦ S)(x) + (R ◦ T)(x) &\quad \text{by definition of } R ◦ S \text{ and } R ◦ T \\ \end{align*}$

$\begin{align*} ((αS) ◦ T)(x) &= (αS)(T(x)) &\quad \text{by definition of } (αS) ◦ T \\ &= α (S(T(x))) &\quad \text{by definition of } αS \\ &= S(αT(x)) &\quad S \text{ is linear} \\ &= S((αT)(x)) &\quad \text{by definition of } αT \\ &= (S◦(αT))(x) &\quad \text{by definition of } S◦(αT) \\ \end{align*}$

$\begin{align*} (\text{id} ◦ S) (x) &= \text{id}(S(x)) &\quad \text{by definition of } \text{id} ◦ S \\ &= S(x) &\quad \text{by definition of id} \\ \end{align*}$

3.2.4.

(If you have not yet worked Exercise 3.1.14 then do so before working this exercise.)

Let $A = [a_{ij}]_{m \times n} \in M_{m, n}(\mathbb{R})$ be the matrix of $S \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ . Its transpose $A^T \in M_{n, m}(\mathbb{R})$ is the matrix of the transpose mapping $S^T$ . Since $S$ and $S^T$ act respectively as multiplication by $A$ and $A^T$ , the characterizing property of $S^T$ from Exercise 3.1.14 gives

$⟨x,A^T y⟩ = ⟨Ax,y⟩ \quad \text{for all } x \in \mathbb{R}^n \text{ and } y \in \mathbb{R}^m.$

Make specific choices of $x$ and $y$ to show that the transpose $A^T \in M_{n, m}(\mathbb{R})$ is obtained by flipping $A$ about its northwest–southeast diagonal; that is, show that the $(i,j)$ th entry of $A^T$ is $a_{ji}$ .

It follows that the rows of $A^T$ are the columns of $A$ , and the columns of $A^T$ are the rows of $A$ .

Proof:

Let $A = [a_{ij}]_{m \times n}$ and $A^T = [b_{ij}]_{n \times m}$ .

Given any $1 \leq i \leq n$ and $1 \leq j \leq m$ , let $x = e_i \in \mathbb{R}^n$ and $y = e_j \in \mathbb{R}^m$ .

Then $A^T y = A^T e_j$ is the $j$ th column of $A^T$ , so $⟨x,A^T y⟩ = b_{ij}$ .

On the other hand, $Ax = Ae_i$ is the $i$ th column of $A$ , so $⟨Ax,y⟩ = a_{ji}$ .

So $b_{ij} = a_{ji}$ , as claimed.

(Similarly, let $B \in M_{n, p}(\mathbb{R})$ be the matrix of $T \in \mathcal{L}(\mathbb{R}^p, \mathbb{R}^n)$ so that $B^T$ is the matrix of $T^T$ . Because matrix multiplication is compatible with linear mapping composition, we know immediately from Exercise 3.1.14(b), with no reference to the concrete description of the matrix transposes $A^T$ and $B^T$ in terms of the original matrices $A$ and $B$ , that the transpose of the product is the product of the transposes in reverse order,

$(AB)^T = B^T A^T \quad \text{for all } A \in M_{m, n}(\mathbb{R}) \text{ and } B \in M_{n, p}(\mathbb{R}).$

That is, by characterizing the transpose mapping in Exercise 3.1.14, we easily derived the construction of the transpose matrix here and obtained the formula for the product of transpose matrices with no reference to their construction.)

$\square$

3.2.5.

The trace of a square matrix $A \in M_{n}(\mathbb{R})$ is the sum of its diagonal elements,

$\text{tr}(A) = \sum_{i = 1}^{n} a_{ii}$

Show that

$\text{tr}(AB) = \text{tr}(BA), \quad A,B \in M_{n}(\mathbb{R})$

Proof:

Write $(AB)_{ii}$ for the $(i,i)$ th entry of $AB$ . Then

$\begin{aligned} \text{tr}(AB) &= \sum_{i = 1}^{n} (AB)_{ii} \\ &= \sum_{i = 1}^{n} \left( \sum_{j = 1}^{n} a_{ij} b_{ji} \right) \\ &= \sum_{j = 1}^{n} \left( \sum_{i = 1}^{n} b_{ji} a_{ij} \right) \\ &= \sum_{j = 1}^{n} (BA)_{jj} \\ &= \text{tr}(BA) \end{aligned}$

$\square$
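For a concrete instance, the sketch below compares $\text{tr}(AB)$ and $\text{tr}(BA)$ for two arbitrary integer matrices that do not commute. The matrices and the helper names `matmul` and `trace` are assumptions chosen here purely for illustration.

```python
def matmul(A, B):
    """Matrix product of A (p x q) and B (q x r), both lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 0],
     [3, -1, 4],
     [0, 5, 2]]
B = [[2, 0, 1],
     [1, 1, -3],
     [4, 2, 0]]

AB, BA = matmul(A, B), matmul(B, A)
print(AB != BA, trace(AB), trace(BA))
```

The products `AB` and `BA` differ as matrices, yet their traces agree, which is exactly what the double-sum argument predicts.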

3.2.6

For every matrix $A \in M_{m, n}(\mathbb{R})$ and column vector $a \in \mathbb{R}^m$ , define the aﬃne mapping (cf. Exercise 3.1.15)

$\text{Aff}_{A, a} : \mathbb{R}^n \longrightarrow \mathbb{R}^m$

by the rule $\text{Aff}_{A, a}(x) = Ax + a$ for all $x \in \mathbb{R}^n$ , viewing $x$ as a column vector.

(a) Explain why every aﬃne mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$ takes this form.

Proof:

An affine mapping is this form: $f(x) = T(x) + b$ , $T \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ and $b \in \mathbb{R}^m$ .

Since $T \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ , we can find a matrix $A \in M_{m, n}(\mathbb{R})$ such that $T(x) = Ax$ .

So $f(x) = Ax + b = \text{Aff}_{A, b}(x)$ .

$\square$

(b) Given such $A$ and $a$ , define the matrix $A' \in M_{{m+1}, {n+1}}(\mathbb{R})$ to be

$A' = \begin{bmatrix} A & a \\ \bold{0}_n & 1 \\ \end{bmatrix}.$

Show that for all $x ∈ \mathbb{R}^n$ ,

$A' \begin{bmatrix} x \\ 1 \\ \end{bmatrix} = \begin{bmatrix} \text{Aff}_{A, a}(x) \\ 1 \end{bmatrix}.$

Thus, aﬃne mappings, like linear mappings, behave as matrix-by-vector multiplications but where the vectors are the usual input and output vectors augmented with an extra “1” at the bottom.

Proof:

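Let $A_j$ denote the $j$ th column of $A$ . For the standard basis vectors $e_j$ of $\mathbb{R}^{n+1}$ with $j \leq n$ ,

$A' e_j = \begin{bmatrix} A_j \\ 0 \\ \end{bmatrix}, \qquad A' e_{n+1} = \begin{bmatrix} a \\ 1 \\ \end{bmatrix}$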

Let $x' = \begin{bmatrix} x \\ 1 \\ \end{bmatrix}$

Then

$x' = \sum_{j = 1}^{n} x_j e_j + e_{n+1}$

Then

$\begin{aligned} A'x' &= \sum_{j = 1}^{n} x_j A' e_j + A' e_{n+1} \\ &= \sum_{j = 1}^{n} x_j \begin{bmatrix} A_j \\ 0 \\ \end{bmatrix} + \begin{bmatrix} a \\ 1 \\ \end{bmatrix} \\ &= \begin{bmatrix} \sum_{j = 1}^{n} x_j A_j + a \\ 1 \\ \end{bmatrix} \\ &= \begin{bmatrix} Ax + a \\ 1 \\ \end{bmatrix} = \begin{bmatrix} \text{Aff}_{A, a}(x) \\ 1 \end{bmatrix} \end{aligned}$

$\square$

(c) The aﬃne mapping $\text{Aff}_{B, b} : \mathbb{R}^p \longrightarrow \mathbb{R}^n$ determined by $B \in M_{n, p}(\mathbb{R})$ and $b \in \mathbb{R}^n$ has matrix

$B' = \begin{bmatrix} B & b \\ \bold{0}_p & 1 \\ \end{bmatrix}$

Show that

$\text{Aff}_{A, a} \circ \text{Aff}_{B, b} : \mathbb{R}^p \longrightarrow \mathbb{R}^m$ has matrix $A'B'$ . That is, matrix multiplication is compatible with composition of aﬃne mappings.

Proof:

From part (b), applied to $B'$ , for all $x \in \mathbb{R}^p$ ,

$B' \begin{bmatrix} x \\ 1 \end{bmatrix} = \begin{bmatrix} \text{Aff}_{B, b}(x) \\ 1 \end{bmatrix}$

Then

$\begin{align*} (A'B') \begin{bmatrix} x \\ 1 \end{bmatrix} &= A' \left( B' \begin{bmatrix} x \\ 1 \end{bmatrix} \right) &\quad \text{associativity of matrix multiplication} \\ &= A' \begin{bmatrix} \text{Aff}_{B, b}(x) \\ 1 \end{bmatrix} &\quad \text{by (b) applied to } B' \\ &= \begin{bmatrix} \text{Aff}_{A, a} ( \text{Aff}_{B, b}(x) ) \\ 1 \\ \end{bmatrix} &\quad \text{again by (b)} \\ &= \begin{bmatrix} (\text{Aff}_{A, a} \circ \text{Aff}_{B, b})(x) \\ 1 \end{bmatrix} &\quad \text{definition of composition} \end{align*}$

$\square$
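The block structure behind this argument can be checked on a small example: the product $A'B'$ should have $AB$ in its upper left block and $Ab + a$ in its last column. The sketch below is illustrative only; the sample data and the helper names `matmul` and `augment` are assumptions introduced here.

```python
def matmul(A, B):
    """Matrix product of A (p x q) and B (q x r), both lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def augment(M, v):
    """Build the block matrix [[M, v], [0, 1]] from a matrix M and a vector v."""
    rows = [list(row) + [vi] for row, vi in zip(M, v)]
    rows.append([0] * len(M[0]) + [1])
    return rows

# Hypothetical data: Aff_{A,a} maps R^3 to R^2, Aff_{B,b} maps R^2 to R^3.
A, a = [[2, 0, 1], [1, -1, 0]], [5, -3]
B, b = [[1, 2], [0, 1], [3, -1]], [1, 0, 2]

product = matmul(augment(A, a), augment(B, b))

# Expected block form of A'B': linear part AB, translation part Ab + a.
AB = matmul(A, B)
Ab_plus_a = [sum(r * bi for r, bi in zip(row, b)) + ai for row, ai in zip(A, a)]
expected = augment(AB, Ab_plus_a)
print(product == expected)
```

This matches the formula $\text{Aff}_{A,a}(\text{Aff}_{B,b}(x)) = ABx + (Ab + a)$ obtained by expanding the composition directly.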

3.2.7.

The exponential of a square matrix $A$ is the infinite matrix sum

$e^A = I + A + \frac{1}{2!} A^2 + \frac{1}{3!} A^3 + \cdots$

Compute the exponentials of the following matrices.

Solution:

(a) $A = [λ]$

$e^A = [e^λ]$

$\square$

(b)

$A = \begin{bmatrix} λ & 1 \\ 0 & λ \\ \end{bmatrix}$

$A^2 = \begin{bmatrix} λ & 1 \\ 0 & λ \\ \end{bmatrix} \begin{bmatrix} λ & 1 \\ 0 & λ \\ \end{bmatrix} = \begin{bmatrix} λ^2 & 2λ \\ 0 & λ^2 \\ \end{bmatrix}$

$A^3 = \begin{bmatrix} λ^2 & 2λ \\ 0 & λ^2 \\ \end{bmatrix} \begin{bmatrix} λ & 1 \\ 0 & λ \\ \end{bmatrix} \\ = \begin{bmatrix} λ^3 & 3λ^2 \\ 0 & λ^3 \\ \end{bmatrix}$

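In general the diagonal entries of $A^k$ are $λ^k$ and the $(1,2)$ entry is $kλ^{k-1}$ . Since $\sum_{k \geq 0} λ^k/k! = e^λ$ and $\sum_{k \geq 1} kλ^{k-1}/k! = \sum_{k \geq 1} λ^{k-1}/(k-1)! = e^λ$ , we get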

$e^A = \begin{bmatrix} e^λ & e^λ \\ 0 & e^λ \\ \end{bmatrix}$

$\square$

(c)

$A = \begin{bmatrix} λ & 1 & 0 \\ 0 & λ & 1 \\ 0 & 0 & λ \\ \end{bmatrix}$

$A^2 = \begin{bmatrix} λ & 1 & 0 \\ 0 & λ & 1 \\ 0 & 0 & λ \\ \end{bmatrix} \begin{bmatrix} λ & 1 & 0 \\ 0 & λ & 1 \\ 0 & 0 & λ \\ \end{bmatrix} \\ = \begin{bmatrix} λ^2 & 2λ & 1 \\ 0 & λ^2 & 2λ \\ 0 & 0 & λ^2 \\ \end{bmatrix}$

$A^3 = \begin{bmatrix} λ^2 & 2λ & 1 \\ 0 & λ^2 & 2λ \\ 0 & 0 & λ^2 \\ \end{bmatrix} \begin{bmatrix} λ & 1 & 0 \\ 0 & λ & 1 \\ 0 & 0 & λ \\ \end{bmatrix} \\ = \begin{bmatrix} λ^3 & 3λ^2 & 3λ \\ 0 & λ^3 & 3λ^2 \\ 0 & 0 & λ^3 \\ \end{bmatrix}$

$A^4 = \begin{bmatrix} λ^3 & 3λ^2 & 3λ \\ 0 & λ^3 & 3λ^2 \\ 0 & 0 & λ^3 \\ \end{bmatrix} \begin{bmatrix} λ & 1 & 0 \\ 0 & λ & 1 \\ 0 & 0 & λ \\ \end{bmatrix} \\ = \begin{bmatrix} λ^4 & 4λ^3 & 6λ^2 \\ 0 & λ^4 & 4λ^3 \\ 0 & 0 & λ^4 \\ \end{bmatrix}$

So

$e^A = \begin{bmatrix} e^λ & e^λ & e^λ/2 \\ 0 & e^λ & e^λ \\ 0 & 0 & e^λ \\ \end{bmatrix}$

$\square$

(d)

$A = \begin{bmatrix} λ & 1 & 0 & 0 \\ 0 & λ & 1 & 0 \\ 0 & 0 & λ & 1 \\ 0 & 0 & 0 & λ \\ \end{bmatrix}$

$A^2 = \begin{bmatrix} λ & 1 & 0 & 0 \\ 0 & λ & 1 & 0 \\ 0 & 0 & λ & 1 \\ 0 & 0 & 0 & λ \\ \end{bmatrix} \begin{bmatrix} λ & 1 & 0 & 0 \\ 0 & λ & 1 & 0 \\ 0 & 0 & λ & 1 \\ 0 & 0 & 0 & λ \\ \end{bmatrix} \\ = \begin{bmatrix} λ^2 & 2λ & 1 & 0 \\ 0 & λ^2 & 2λ & 1 \\ 0 & 0 & λ^2 & 2λ \\ 0 & 0 & 0 & λ^2 \\ \end{bmatrix}$

So

$\frac{1}{2!} A^2 = \begin{bmatrix} \frac{1}{2!} λ^2 & λ & \frac{1}{2} & 0 \\ 0 & \frac{1}{2!} λ^2 & λ & \frac{1}{2} \\ 0 & 0 & \frac{1}{2!} λ^2 & λ \\ 0 & 0 & 0 & \frac{1}{2!} λ^2 \\ \end{bmatrix}$

Then

$A^3 = \begin{bmatrix} λ^2 & 2λ & 1 & 0 \\ 0 & λ^2 & 2λ & 1 \\ 0 & 0 & λ^2 & 2λ \\ 0 & 0 & 0 & λ^2 \\ \end{bmatrix} \begin{bmatrix} λ & 1 & 0 & 0 \\ 0 & λ & 1 & 0 \\ 0 & 0 & λ & 1 \\ 0 & 0 & 0 & λ \\ \end{bmatrix} \\ = \begin{bmatrix} λ^3 & 3λ^2 & 3λ & 1 \\ 0 & λ^3 & 3λ^2 & 3λ \\ 0 & 0 & λ^3 & 3λ^2 \\ 0 & 0 & 0 & λ^3 \\ \end{bmatrix}$

So

$\frac{1}{3!} A^3 = \begin{bmatrix} \frac{1}{3!} λ^3 & \frac{1}{2!} λ^2 & \frac{1}{2} λ & \frac{1}{3!} \\ 0 & \frac{1}{3!} λ^3 & \frac{1}{2!} λ^2 & \frac{1}{2} λ \\ 0 & 0 & \frac{1}{3!} λ^3 & \frac{1}{2!} λ^2 \\ 0 & 0 & 0 & \frac{1}{3!} λ^3 \\ \end{bmatrix}$

Then

$A^4 = \begin{bmatrix} λ^3 & 3λ^2 & 3λ & 1 \\ 0 & λ^3 & 3λ^2 & 3λ \\ 0 & 0 & λ^3 & 3λ^2 \\ 0 & 0 & 0 & λ^3 \\ \end{bmatrix} \begin{bmatrix} λ & 1 & 0 & 0 \\ 0 & λ & 1 & 0 \\ 0 & 0 & λ & 1 \\ 0 & 0 & 0 & λ \\ \end{bmatrix} \\ = \begin{bmatrix} λ^4 & 4 λ^3 & 6 λ^2 & 4 λ \\ 0 & λ^4 & 4 λ^3 & 6 λ^2 \\ 0 & 0 & λ^4 & 4 λ^3 \\ 0 & 0 & 0 & λ^4 \\ \end{bmatrix} \\$

So

$\frac{1}{4!} A^4 = \begin{bmatrix} (1/4!)λ^4 & (1/3!) λ^3 & (1/2)(1/2!) λ^2 & (1/3!) λ \\ 0 & (1/4!)λ^4 & (1/3!) λ^3 & (1/2)(1/2!) λ^2 \\ 0 & 0 & (1/4!)λ^4 & (1/3!) λ^3 \\ 0 & 0 & 0 & (1/4!)λ^4 \\ \end{bmatrix} \\$

Then finally

$e^A = \begin{bmatrix} e^λ & e^λ & \frac{1}{2!} e^λ & \frac{1}{3!} e^λ \\ 0 & e^λ & e^λ & \frac{1}{2!} e^λ \\ 0 & 0 & e^λ & e^λ \\ 0 & 0 & 0 & e^λ \\ \end{bmatrix} \\$

$\square$
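The closed form in part (d) can be compared against partial sums of the defining series. The sketch below is a numerical illustration under stated assumptions: the value $λ = 0.5$ , the number of terms, and the helper name `expm_series` are all choices made here, and floating-point partial sums only approximate the infinite sum.

```python
import math

def matmul(A, B):
    """Matrix product of A (p x q) and B (q x r), both lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_series(A, terms=30):
    """Partial sum I + A + A^2/2! + ... with the given number of terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # starts as I
    term = [row[:] for row in result]                               # holds A^k / k!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

lam = 0.5
J = [[lam, 1, 0, 0],
     [0, lam, 1, 0],
     [0, 0, lam, 1],
     [0, 0, 0, lam]]
E = expm_series(J)

# Closed form from part (d): entry (i, j) is e^lam / (j - i)! for j >= i, else 0.
closed = [[math.exp(lam) / math.factorial(j - i) if j >= i else 0.0
           for j in range(4)] for i in range(4)]
max_error = max(abs(E[i][j] - closed[i][j]) for i in range(4) for j in range(4))
print(max_error)
```

The same pattern explains parts (a) through (c): the $k$ th superdiagonal of $e^A$ is $e^λ/k!$ .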

3.2.8.

(a) Let $a,b,d$ be real numbers with $ad = 1$ . Show that

$\begin{bmatrix} a & b \\ 0 & d \\ \end{bmatrix} = \begin{bmatrix} 1 & ab \\ 0 & 1 \\ \end{bmatrix} \begin{bmatrix} a & 0 \\ 0 & d \\ \end{bmatrix}$

Proof: Multiplying out the right side gives $\begin{bmatrix} a & abd \\ 0 & d \\ \end{bmatrix}$ , and $abd = b$ because $ad = 1$ .

(b)

Let $a,b,c,d$ be real numbers with $c \neq 0$ and $ad−bc = 1$ . Show that


$\square$

3.3 The Inverse of a Linear Mapping

3.3.2.

Finish the proof of Proposition 3.3.2.

Proof:

To prove (b), write $r_1, \cdots, r_m$ for the rows of $M$ . Each row $j \neq i$ of $S_{i,a} M$ is the row $r_j$ of $M$ , and row $i$ of $S_{i,a} M$ is $a r_i$ .

$\square$

3.3.4.

Finish the proof of Lemma 3.3.3, part (1).

Proof:

We need to prove

$S_{i,a}^{-1} = S_{i,a^{-1}}$

Note that $S_{i,a}$ is the identity matrix $I_m$ with its $i$ th row multiplied by $a$ , and multiplying it from the left by $S_{i,a^{-1}}$ scales that row by $a^{-1}$ , returning it to $(0,\cdots , 1, \cdots, 0)$ . So $S_{i,a^{-1}} S_{i,a} = I_m$ .

We also need to prove

$T_{i;j}^{-1} = T_{i;j}$

Note that $T_{i;j}$ is the identity matrix $I_m$ with its $i$ th and $j$ th rows swapped, and multiplying it from the left by $T_{i;j}$ swaps those rows again, giving back $I_m$ .

3.3.5.

What is the eﬀect of right multiplying the $m×n$ matrix $M$ by an $n×n$ matrix $R_{i;j,a}, S_{i,a}, T_{i;j}$ ?

Solution:

We can directly carry out the matrix multiplication and the effect is described in 3.3.6.

3.3.6.

Recall the transpose of a matrix $M$ (cf. Exercise 3.2.4), denoted $M^T$ . Prove $R_{i;j,a}^T = R_{j;i,a}$ ; $S_{i,a}^T = S_{i,a}$ ; $T_{i;j}^T = T_{i;j}$ .

Use these results and the formula $(AB)^T = B^TA^T$ to redo the previous problem.

Proof:

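Let $A = [a_{ij}]_{m \times n}, A^T = [b_{ij}]_{n \times m}$ . Then we know that $a_{ij} = b_{ji}$ . The matrix $R_{i;j,a}$ is the identity with an extra entry $a$ in position $(i,j)$ , so transposing moves that entry to position $(j,i)$ and gives $R_{j;i,a}$ . The matrix $S_{i,a}$ is diagonal, hence equal to its transpose. The matrix $T_{i;j}$ has its only off-diagonal entries, both equal to $1$ , in positions $(i,j)$ and $(j,i)$ , so it is also equal to its transpose.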

For the second part

$(A R_{i;j,a})^T = R_{i;j,a}^T A^T = R_{j;i,a} A^T$

So the effect is to add $a$ times the $i$ th column of $A$ to the $j$ th column.

$(A S_{i,a})^T = S_{i,a}^T A^T = S_{i,a} A^T$

So the effect is to multiply the $i$ th column of $A$ by $a$ .

Similarly, $(A T_{i;j})^T = T_{i;j} A^T$ , so the effect is to swap the $i$ th and $j$ th columns of $A$ .

3.3.7.

Are the following matrices echelon? For each matrix $M$ , solve the equation $Mx= 0$ .

Solution:

(a) No, the last column is neither a new column nor an old column.

(b) (c) yes.

(d) No. The 2nd/3rd column should be swapped with the 1st row.

(e) The 3rd row should be subtracted from the 2nd row.

(f) The 1st and 2nd row should be swapped.

3.3.8.

For each matrix $A$ solve the equation $Ax= 0$ .

(a)

$\begin{bmatrix} -1 & 1 & 4 \\ 1 & 3 & 8 \\ 1 & 2 & 5 \\ \end{bmatrix}$

Solution:

Multiply by $R_{2;1,1}$ we got

$\begin{bmatrix} -1 & 1 & 4 \\ 0 & 4 & 12 \\ 1 & 2 & 5 \\ \end{bmatrix}$

Multiply by $R_{3;1,1}$ we got

$\begin{bmatrix} -1 & 1 & 4 \\ 0 & 4 & 12 \\ 0 & 3 & 9 \\ \end{bmatrix}$

Multiply by $S_{1,-1}$ we got

$\begin{bmatrix} 1 & -1 & -4 \\ 0 & 4 & 12 \\ 0 & 3 & 9 \\ \end{bmatrix}$

Multiply by $R_{1;3,1/3}$ we got

$\begin{bmatrix} 1 & 0 & -1 \\ 0 & 4 & 12 \\ 0 & 3 & 9 \\ \end{bmatrix}$

Multiply by $R_{2;3,-4/3}$ we got

$\begin{bmatrix} 1 & 0 & -1 \\ 0 & 0 & 0 \\ 0 & 3 & 9 \\ \end{bmatrix}$

Multiply by $T_{2;3}$ we got

$\begin{bmatrix} 1 & 0 & -1 \\ 0 & 3 & 9 \\ 0 & 0 & 0 \\ \end{bmatrix}$

Multiply by $S_{2,1/3}$ we got

$\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \\ \end{bmatrix}$

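The echelon form says $x_1 = x_3$ and $x_2 = -3x_3$ with $x_3$ free, so the solutions are the multiples $x = t(1, -3, 1)$ , $t \in \mathbb{R}$ .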

$\square$
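The row reduction above can be replayed by multiplying the elementary matrices in order. The sketch below assumes the convention that this solution appears to use, namely that left multiplication by $R_{i;j,a}$ adds $a$ times row $j$ to row $i$ ; the function names `recombine`, `scale`, and `transposition` are hypothetical helpers introduced here.

```python
from fractions import Fraction as F

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def recombine(n, i, j, a):
    """R_{i;j,a}: left multiplication adds a times row j to row i (1-based)."""
    M = identity(n)
    M[i - 1][j - 1] = F(a)
    return M

def scale(n, i, a):
    """S_{i,a}: left multiplication multiplies row i by a (1-based)."""
    M = identity(n)
    M[i - 1][i - 1] = F(a)
    return M

def transposition(n, i, j):
    """T_{i;j}: left multiplication swaps rows i and j (1-based)."""
    M = identity(n)
    M[i - 1], M[j - 1] = M[j - 1], M[i - 1]
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[-1, 1, 4], [1, 3, 8], [1, 2, 5]]
steps = [recombine(3, 2, 1, 1), recombine(3, 3, 1, 1), scale(3, 1, -1),
         recombine(3, 1, 3, F(1, 3)), recombine(3, 2, 3, F(-4, 3)),
         transposition(3, 2, 3), scale(3, 2, F(1, 3))]

E = A
for step in steps:
    E = matmul(step, E)  # apply each elementary matrix on the left, in order

# Check that x = (1, -3, 1) really solves Ax = 0.
x = [1, -3, 1]
residual = [sum(r * xi for r, xi in zip(row, x)) for row in A]
print(E, residual)
```

Because each elementary matrix is invertible, $Ax = 0$ and $Ex = 0$ have the same solutions, which is why reading the answer off the echelon form is legitimate.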

3.3.9.

Balance the chemical equation

$\text{Ca} + \text{H}_3 \text{PO}_4 \rightarrow \text{Ca}_3 \text{P}_2 \text{O}_8 + \text{H}_2$

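Solution

Seek coefficients with $x_1 \text{Ca} + x_2 \text{H}_3 \text{PO}_4 \rightarrow x_3 \text{Ca}_3 \text{P}_2 \text{O}_8 + x_4 \text{H}_2$ . Counting atoms of Ca, H, P, and O in turn gives $Mx = 0$ , where

$M = \begin{bmatrix} 1 & 0 & -3 & 0 \\ 0 & 3 & 0 & -2 \\ 0 & 1 & -2 & 0 \\ 0 & 4 & -8 & 0 \\ \end{bmatrix}$

The rows say $x_1 = 3x_3$ , $3x_2 = 2x_4$ , and $x_2 = 2x_3$ (twice). Taking $x_3 = 1$ gives $x = (3, 2, 1, 3)$ as a solution.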

$3 \text{Ca} + 2\text{H}_3 \text{PO}_4 \rightarrow 1 \text{Ca}_3 \text{P}_2 \text{O}_8 + 3 \text{H}_2$

$\square$
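As a sanity check, the atom counts on both sides of the balanced equation can be tallied directly. This is only an illustrative sketch; the dictionary layout and the function name `total` are assumptions made here.

```python
# Atom counts per molecule, in the order (Ca, H, P, O).
reactants = {"Ca": (1, 0, 0, 0), "H3PO4": (0, 3, 1, 4)}
products = {"Ca3P2O8": (3, 0, 2, 8), "H2": (0, 2, 0, 0)}

# Coefficients from the solution (3, 2, 1, 3).
coeff = {"Ca": 3, "H3PO4": 2, "Ca3P2O8": 1, "H2": 3}

def total(side):
    """Total number of atoms of each element on one side of the equation."""
    return tuple(sum(coeff[name] * atoms[k] for name, atoms in side.items())
                 for k in range(4))

left, right = total(reactants), total(products)
print(left, right)
```

Both sides carry 3 Ca, 6 H, 2 P, and 8 O atoms, so the equation is balanced.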

3.3.10.

Prove by induction that the only square echelon matrix with all new columns is the identity matrix.

Proof:

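Base case: a $1 \times 1$ echelon matrix whose only column is new is $[1] = I_1$ .

Assume the claim is true for $n \times n$ matrices, i.e. $E_n = I_n$ .

Consider an $(n+1) \times (n+1)$ echelon matrix $E_{n+1}$ with all new columns. The $n+1$ leading $1$ 's occupy the $n+1$ rows in order, so the leading $1$ of the last column lies in the last row. Since the last column is new, its other entries are $0$ , and since that $1$ is the first nonzero entry of the last row, the rest of the last row is $0$ . So

$E_{n+1} = \begin{bmatrix} A & 0 \\ 0 & 1 \\ \end{bmatrix}$

where $A$ is an $n \times n$ echelon matrix. If some column of $A$ were not a new column, then it would not contain a leading $1$ . Then the same column of $E_{n+1}$ would not contain a leading $1$ either, so it would not be a new column of $E_{n+1}$ . This is a contradiction.

So every column of $A$ is a new column, and since its size is $n \times n$ , the induction hypothesis gives $A = I_n$ . So $E_{n+1} = I_{n+1}$ .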

$\square$

3.3.11.

Are the following matrices invertible? Find the inverse when possible, and then check your answer.

Solution:

We check (b)

$B = [P | I] = \\ \begin{bmatrix} 2 & 5 & -1 & 1 & 0 & 0 \\ 4 & -1 & 2 & 0 & 1 & 0 \\ 6 & 4 & 1 & 0 & 0 & 1 \\ \end{bmatrix}$

Apply $R_{2;1,-2}$

$\begin{bmatrix} 2 & 5 & -1 & 1 & 0 & 0 \\ 0 & -11 & 4 & -2 & 1 & 0 \\ 6 & 4 & 1 & 0 & 0 & 1 \\ \end{bmatrix}$

Apply $R_{3;1,-3}$

$\begin{bmatrix} 2 & 5 & -1 & 1 & 0 & 0 \\ 0 & -11 & 4 & -2 & 1 & 0 \\ 0 & -11 & 4 & -3 & 0 & 1 \\ \end{bmatrix}$

Apply $R_{3;2,-1}$

$\begin{bmatrix} 2 & 5 & -1 & 1 & 0 & 0 \\ 0 & -11 & 4 & -2 & 1 & 0 \\ 0 & 0 & 0 & -1 & -1 & 1 \\ \end{bmatrix}$

The left half now has a zero row, so $P$ cannot be row reduced to $I$ . So $P$ is not invertible.

$\square$
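The right half of the zero row, $(-1, -1, 1)$ , records which combination of the rows of $P$ vanishes. A minimal check of that reading, with variable names chosen here for illustration:

```python
P = [[2, 5, -1],
     [4, -1, 2],
     [6, 4, 1]]

# Coefficients read off from the right half of the zero row.
coeffs = [-1, -1, 1]

# Form the combination -row1 - row2 + row3, column by column.
combo = [sum(c * row[j] for c, row in zip(coeffs, P)) for j in range(3)]
print(combo)
```

A nontrivial combination of the rows equal to zero means the rows are linearly dependent, which is another way to see that the matrix has no inverse.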

3.3.12.

The matrix $A$ is called lower triangular if $a_{ij} = 0$ whenever $i < j$ . If $A$ is a lower triangular square matrix with all diagonal entries equal to $1$ , show that $A$ is invertible and $A^{−1}$ takes the same form.

Proof:

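Row reduce the augmented matrix $[A \mid I_n]$ . Left-multiply by $R_{2;1,-a_{2,1}}, \cdots, R_{n;1,-a_{n,1}}$ to turn the first column into a new column; call the resulting left half $A_1$ and the right half $B_1$ . Each of these operations adds a multiple of row $1$ to a lower row, so $A_1$ and $B_1$ are still lower triangular with all diagonal entries equal to $1$ .

Similarly, left-multiply by $R_{3;2,-a_{3,2}}, \cdots, R_{n;2,-a_{n,2}}$ to get $A_2$ and $B_2$ , which are again lower triangular with all diagonal entries equal to $1$ , because each operation adds a multiple of row $2$ to a lower row.

Continuing in this way, the left half eventually becomes $I_n$ , so $A$ is invertible, and the right half at that stage is $A^{-1}$ , which is lower triangular with all diagonal entries equal to $1$ .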

$\square$

3.3.13.

This exercise refers back to the Gram–Schmidt exercise in Chapter 2. That exercise expresses the relation between the vectors $\{x'_j\}$ and the vectors $\{x_j\}$ formally as $x' = Ax$ , where $x'$ is a column vector whose entries are the vectors $x'_1, \cdots, x'_n$ , $x$ is the corresponding column vector of $x_j$ ’s, and $A$ is an $n×n$ lower triangular matrix.

Show that each $x_j$ has the form

$x_j = a'_{j1} x'_1 + \cdots + a'_{j,j-1} x'_{j-1} + x'_j$

and thus every linear combination of the original $\{x_j\}$ is also a linear combination of the new $\{x'_j\}$ .

Proof:

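Since $A$ is a lower triangular matrix with all diagonal entries equal to $1$ (each $x'_j$ is $x_j$ minus a combination of earlier vectors), it follows from 3.3.12 that $A$ is invertible and that $A^{-1} = [a'_{ij}]$ is also lower triangular with all diagonal entries equal to $1$ , so that

$x = A^{-1} x'$

Expanding the $j$ th row gives

$x_j = a'_{j1} x'_1 + \cdots + a'_{j,j-1} x'_{j-1} + x'_j$

Substituting these expressions into any linear combination of the $\{x_j\}$ rewrites it as a linear combination of the $\{x'_j\}$ .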

$\square$

3.5 The Determinant: Characterizing Properties and Their Consequences

3.5.1

Solution:

Assume $x = (x_1, \cdots, x_n), y = (y_1, \cdots, y_n)$ , then

$x = \sum_{i = 1}^{n} x_i e_i, \qquad y = \sum_{j = 1}^{n} y_j e_j$

Then

$\begin{align*} \text{ip}(x,y) &= \text{ip}( \sum_{i = 1}^{n} x_i e_i, \sum_{j = 1}^{n} y_j e_j ) \\ &= \sum_{i = 1}^{n} \sum_{j = 1}^{n} x_i y_j \text{ip} (e_i, e_j) \\ &= \sum_{i = 1}^{n} \sum_{j = 1}^{n} x_i y_j \delta_{ij} \\ &= \sum_{i = 1}^{n} x_i y_i \\ &= ⟨x,y⟩ \end{align*}$

$\square$

3.5.2

Proof:

$(i, j)(i, k) = (j, k)(i, j) \\ (i, j)(j, k) = (j, k)(i, k) \\ (i, j)(k, l) = (k, l)(i, j) \\$

If it's the last 3 cases, then only the last pair-exchange involves the $i$ th slot. So the element $i$ will be swapped to another location.

So it won't be the ordered set.

$\square$

3.5.4.

This exercise shows that $\det(A^T) = \det(A)$ for every square matrix $A$ .

(b) If $E$ is a square echelon matrix then either $E= I$ or the bottom row of $E$ is $0$ . In either case, show that $\det(E^T) = \det(E)$ . (For the case $E \neq I$ , we know that $E$ is not invertible. What is $E^T e_n$ , and what does this say about the invertibility of $E^T$ ?)

3.6 The Determinant: Uniqueness and Existence

3.6.11

Consider the following $n×n$ matrix based on Pascal’s triangle:

$A = \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 3 & 4 & \cdots & n \\ 1 & 3 & 6 & 10 & \cdots & \frac{n(n+1)}{2} \\ 1 & 4 & 10 & 20 & \cdots & \cdot \\ \vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\ 1 & n & \frac{n(n+1)}{2} & \cdot & \cdots & \cdot \end{bmatrix}$

Find $\det(A)$ . (Hint: Row and column reduce.)

Solution:

Consider $n = 4$ , then

$A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 3 & 6 & 10 \\ 1 & 4 & 10 & 20 \\ \end{bmatrix}$

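Subtracting one row from another is a recombine operation, which does not change the determinant. Subtract row $3$ from row $4$ , row $2$ from row $3$ , and row $1$ from row $2$ :

$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 3 & 6 \\ 0 & 1 & 4 & 10 \\ \end{bmatrix}$

Then subtract row $3$ from row $4$ and row $2$ from row $3$ :

$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 1 & 4 \\ \end{bmatrix}$

Then subtract row $3$ from row $4$ :

$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}$

This matrix is upper triangular with all diagonal entries equal to $1$ , so $\det A = 1$ .

The same reduction works for every $n$ : by Pascal's rule, subtracting each row from the row below it shifts the affected rows one place to the right, so repeating the rounds ends at an upper triangular matrix with all diagonal entries equal to $1$ . So $\det A = 1$ for every $n$ .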

$\square$
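The claim for general $n$ can be spot-checked by computing the determinant exactly for several sizes. The sketch below is illustrative: it assumes the matrix entries are the binomial coefficients $\binom{i+j}{i}$ with $0$ -based indices (which reproduces the $4 \times 4$ case above), and the names `pascal` and `det` are helpers introduced here.

```python
from fractions import Fraction
from math import comb

def pascal(n):
    """Symmetric Pascal matrix: entry (i, j) is C(i + j, i), 0-based indices."""
    return [[comb(i + j, i) for j in range(n)] for i in range(n)]

def det(M):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    n = len(rows)
    sign, result = 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero column below the diagonal means det = 0
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            sign = -sign        # a row swap flips the sign
        result *= rows[col][col]
        for r in range(col + 1, n):
            f = rows[r][col] / rows[col][col]
            rows[r] = [v - f * w for v, w in zip(rows[r], rows[col])]
    return sign * result

dets = [det(pascal(n)) for n in range(1, 8)]
print(dets)
```

A finite check like this is evidence rather than proof, but it agrees with the row-reduction argument for each size tried.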

3.8 Geometry of the Determinant: Volume

3.8.1.

(a) This section states that the image of a union is the union of the images. More specifically, let $A$ and $B$ be any sets, let $f : A \rightarrow B$ be any mapping, and let $A_1,...,A_N$ be any subsets of $A$ . Show that

$f(\bigcup_{i=1}^N A_i) = \bigcup_{i=1}^N f(A_i).$

Proof

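Assume $b \in f(\bigcup_{i=1}^N A_i)$ . Then there is some $a \in \bigcup_{i=1}^N A_i$ such that $f(a) = b$ .

Since $a \in \bigcup_{i=1}^N A_i$ , we have $a \in A_i$ for some $i$ , so $b = f(a) \in f(A_i)$ , and hence $b \in \bigcup_{i=1}^N f(A_i)$ .

Conversely, if $b \in \bigcup_{i=1}^N f(A_i)$ , then $b \in f(A_i)$ for some $i$ , so $b = f(a)$ for some $a \in A_i \subset \bigcup_{i=1}^N A_i$ , and hence $b \in f(\bigcup_{i=1}^N A_i)$ .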

$\square$
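A small finite example can make the statement concrete. This is only an illustrative sketch: the mapping `square` and the three subsets are arbitrary choices, not taken from the exercise.

```python
def image(f, subset):
    """Image of a finite set under the mapping f."""
    return {f(a) for a in subset}


def square(x):
    return x * x


subsets = [{-2, -1}, {0, 1}, {1, 2, 3}]

# Left side: image of the union.
lhs = image(square, set().union(*subsets))
# Right side: union of the images.
rhs = set().union(*(image(square, s) for s in subsets))

print(lhs == rhs, sorted(lhs))  # True [0, 1, 4, 9]
```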

3.8.3.

Describe the geometric effect of multiplying by the $3 × 3$ elementary matrices $R_{2;3,1}$ , $R_{3;1,2}$ , and $S_{2,−3}$ .

Solution:

$R_{2;3,1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \\ \end{bmatrix}$

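It maps $(x,y,z)$ to $(x, y+z, z)$: a shear parallel to the $y$-axis that moves each point by its $z$-coordinate. The unit cube becomes a parallelepiped of the same volume.

$R_{3;1,2} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \\ \end{bmatrix}$

It maps $(x,y,z)$ to $(x, y, z+2x)$: a shear parallel to the $z$-axis that moves each point by twice its $x$-coordinate. Again the unit cube becomes a parallelepiped of the same volume.

$S_{2,-3} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}$

It maps $(x,y,z)$ to $(x, -3y, z)$: it stretches by a factor of $3$ in the $y$-direction and reflects through the $xz$-plane. The unit cube becomes a $1 × 3 × 1$ rectangular box.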

3.8.4.

(a) Express the matrix $\begin{bmatrix} 0 & -1 \\ 1 & 0 \\ \end{bmatrix}$ as a product of recombine and scale matrices (you may not need both types).

Solution:

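Start from the identity matrix and add the first row to the second row.

$\begin{bmatrix} 1 & 0 \\ 1 & 1 \\ \end{bmatrix}$

Then multiply the second row by $-1$ and add it to the first row.

$\begin{bmatrix} 0 & -1 \\ 1 & 1 \\ \end{bmatrix}$

Finally, add the first row to the second row again.

$\begin{bmatrix} 0 & -1 \\ 1 & 0 \\ \end{bmatrix}$

Each step is a left multiplication by a recombine matrix, so the matrix equals $R_{2;1,1} R_{1;2,-1} R_{2;1,1}$, and no scale matrix is needed.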

$\square$
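The three row operations can be checked by multiplying out the corresponding recombine matrices. This is a minimal sketch; the variable and function names are hypothetical, and it assumes the usual convention that a row operation acts by left multiplication, so the first step is the rightmost factor.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


add_row1_to_row2 = [[1, 0], [1, 1]]      # row 2 += row 1
sub_row2_from_row1 = [[1, -1], [0, 1]]   # row 1 -= row 2

# Steps in order: add, subtract, add (rightmost factor is applied first).
product = matmul(add_row1_to_row2,
                 matmul(sub_row2_from_row1, add_row1_to_row2))
print(product)  # [[0, -1], [1, 0]]
```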

(b) Use part (a) to describe counterclockwise rotation of the plane through the angle $π/2$ as a composition of shears and scales.

Solution:

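Each row operation in part (a) is a shear, so counterclockwise rotation through $π/2$ is the composition of three shears, with no scale needed: first $(x,y) \mapsto (x, x+y)$, then $(x,y) \mapsto (x-y, y)$, then $(x,y) \mapsto (x, x+y)$ again.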

$\square$

3.8.5.

Describe counterclockwise rotation of the plane through the angle $θ$ (where $\cos θ \neq 0$ and $\sin θ \neq 0$ ) as a composition of shears and scales.

Solution:

The matrix is

$A = \begin{bmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \\ \end{bmatrix}$

We start with

$E = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix}$

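Multiply the 2nd row by $\cos θ$, then add $\sin θ$ times the 1st row to the 2nd row. This gives

$\begin{bmatrix} 1 & 0 \\ \sin θ & \cos θ \\ \end{bmatrix}$

Then add $-\sin θ$ times the 2nd row to the 1st row. Since $1 - \sin^2 θ = \cos^2 θ$, we get

$\begin{bmatrix} \cos^2 θ & - \cos θ \sin θ \\ \sin θ & \cos θ \\ \end{bmatrix}$

Finally, multiply the 1st row by $1/\cos θ$ (possible because $\cos θ \neq 0$) to get $A$. In terms of elementary matrices, $A = S_{1,1/\cos θ} R_{1;2,-\sin θ} R_{2;1,\sin θ} S_{2,\cos θ}$, a composition of two scales and two shears.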

$\square$
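The four steps can be verified numerically for a sample angle. This is a sketch under the assumption that each row operation acts by left multiplication; the function name `rotation_from_shears_and_scales` is hypothetical.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rotation_from_shears_and_scales(theta):
    """Build the rotation matrix from two scales and two shears (cos(theta) != 0)."""
    c, s = math.cos(theta), math.sin(theta)
    scale_row2 = [[1, 0], [0, c]]       # row 2 *= cos(theta)
    shear_row2 = [[1, 0], [s, 1]]       # row 2 += sin(theta) * row 1
    shear_row1 = [[1, -s], [0, 1]]      # row 1 -= sin(theta) * row 2
    scale_row1 = [[1 / c, 0], [0, 1]]   # row 1 *= 1 / cos(theta)
    result = scale_row2
    for step in (shear_row2, shear_row1, scale_row1):
        result = matmul(step, result)   # later steps multiply on the left
    return result


theta = math.pi / 3
print(rotation_from_shears_and_scales(theta))
print([[math.cos(theta), -math.sin(theta)],
       [math.sin(theta), math.cos(theta)]])
```

The two printed matrices should agree up to floating-point rounding.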

3.8.6.

In $\mathbb{R}^3$ , describe the linear mapping that takes $e_1$ to $e_2$ , $e_2$ to $e_3$ , and $e_3$ to $e_1$ as a composition of shears, scales, and transpositions.

Solution:

$\mathcal{M}(T) = \begin{bmatrix} 0 & 0 & 1\\ 1 & 0 & 0\\ 0 & 1 & 0\\ \end{bmatrix}$

So two transpositions suffice. With

$T_1 = \begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1\\ \end{bmatrix}, T_2 = \begin{bmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0\\ \end{bmatrix}$

we have $\mathcal{M}(T) = T_2 T_1$: first exchange $e_1$ and $e_2$, then exchange $e_1$ and $e_3$.

$\square$
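The order of the two transpositions matters, which a quick check on the standard basis makes visible. This is a minimal sketch with hypothetical names; it applies $T_1$ first and $T_2$ second.

```python
def apply(m, v):
    """Matrix-vector product, with m given as a list of rows."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


t1 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]  # exchanges e1 and e2
t2 = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]  # exchanges e1 and e3
e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

# Apply t1 first, then t2.
images = [apply(t2, apply(t1, e)) for e in (e1, e2, e3)]
print(images)  # [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

The images are $e_2$, $e_3$, $e_1$ as required. Applying the transpositions in the opposite order gives the inverse mapping instead.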

3.8.7.

Let $\mathcal{P}$ be the parallelogram in $\mathbb{R}^2$ spanned by $(a,c)$ and $(b,d)$ . Calculate directly that

$\bigg| \det \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} \bigg| = \text{area } \mathcal{P}$

Proof:

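Let $v_1 = (a,c), v_2 = (b,d)$, let $\theta$ be the angle between them, and let $A$ denote the area of $\mathcal{P}$. (If either vector is zero, both sides are $0$.) Taking $v_1$ as the base, the height is $|v_2| \sin \theta$, so $A = |v_1| |v_2| \sin \theta$ and

$\begin{align*} A^2 &= |v_1|^2 |v_2|^2 \sin ^2 \theta \\ &= |v_1|^2 |v_2|^2 (1 - \cos ^2 \theta ) \\ &= |v_1|^2 |v_2|^2 \left(1 - \frac{⟨v_1,v_2⟩^2}{|v_1|^2 |v_2|^2} \right) \\ &= (a^2 + c^2)(b^2 + d^2) - (ab + cd)^2 \\ &= (ad)^2 + (cb)^2 - 2 abcd \\ &= (ad - bc)^2 \\ &= \bigg| \det \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} \bigg|^2 \end{align*}$

Taking square roots gives the result.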

$\square$

3.8.8.

This exercise shows directly that $|\det| =$ volume in $\mathbb{R}^3$ . Let $\mathcal{P}$ be the parallelepiped in $\mathbb{R}^3$ spanned by $v_1, v_2, v_3$ , let $\mathcal{P}'$ be spanned by the vectors $v_1', v_2', v_3'$ obtained from performing the Gram–Schmidt process on the $v_j$ 's, let $A \in M_{3}(\mathbb{R})$ have rows $v_1, v_2, v_3$ , and let $A' \in M_{3}(\mathbb{R})$ have rows $v'_1, v'_2, v'_3$ .

(a) Explain why $\det A' = \det A$ .

Proof:

$\begin{align*} v_1' &= v_1 \\ v_2' &= v_2 - (v_2)_{(\parallel v_1')} = v_2 - \frac{⟨v_1',v_2⟩}{|v_1'|^2} v_1' \\ v_3' &= v_3 - (v_3)_{(\parallel v_2')} - (v_3)_{(\parallel v_1')} = v_3 - \frac{⟨v_1',v_3⟩}{|v_1'|^2} v_1' - \frac{⟨v_2',v_3⟩}{|v_2'|^2} v_2' \end{align*}$

So $A'$ is obtained from $A$ by three recombine row operations: one to replace $v_2$ by $v_2'$, and two to replace $v_3$ by $v_3'$. Recombine operations do not change the determinant, therefore $\det A' = \det A$.

$\square$

(b) Give a plausible geometric argument that vol $\mathcal{P}'$ = vol $\mathcal{P}$ .

Proof:

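Consider the parallelogram spanned by $v_1, v_2$. Replacing $v_2$ by $v_2'$ slides one side parallel to $v_1$, which keeps the base and the height, so the rectangle spanned by $v_1', v_2'$ has the same area. Hence the parallelepiped spanned by $v_1', v_2', v_3$ has the same volume as $\mathcal{P}$.

Similarly, $v_3'$ differs from $v_3$ by a vector in the plane of $v_1', v_2'$, so its height above that base is the same as the height of the parallelepiped spanned by $v_1', v_2', v_3'$.

Therefore vol $\mathcal{P}'$ = vol $\mathcal{P}$.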

$\square$

(c) Show that

$A' A'^t = \begin{bmatrix} |v_1'|^2 & 0 & 0 \\ 0 & |v_2'|^2 & 0 \\ 0 & 0 & |v_3'|^2 \\ \end{bmatrix}$

Explain why therefore $|\det A'| = \text{vol} P'$ . It follows from parts (a) and (b) that $|\det A| = \text{vol} P$ .

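Proof:

The $(i,j)$ entry of $A' A'^t$ is the inner product of rows $i$ and $j$ of $A'$, and the $v_j'$ are mutually orthogonal, so

$\begin{align*} A' A'^t &= \begin{bmatrix} v_1' \\ v_2' \\ v_3' \\ \end{bmatrix} [v_1'^t, v_2'^t, v_3'^t] \\ &= \begin{bmatrix} ⟨v_1',v_1'⟩ & ⟨v_1',v_2'⟩ & ⟨v_1',v_3'⟩ \\ ⟨v_2',v_1'⟩ & ⟨v_2',v_2'⟩ & ⟨v_2',v_3'⟩ \\ ⟨v_3',v_1'⟩ & ⟨v_3',v_2'⟩ & ⟨v_3',v_3'⟩ \\ \end{bmatrix} \\ &= \begin{bmatrix} |v_1'|^2 & 0 & 0 \\ 0 & |v_2'|^2 & 0 \\ 0 & 0 & |v_3'|^2 \\ \end{bmatrix} \end{align*}$

Exercise 3.5.4 shows $\det A'^t = \det A'$, so $(\det A')^2 = \det(A' A'^t) = |v_1'|^2 |v_2'|^2 |v_3'|^2$.

On the other hand, $\mathcal{P}'$ is a rectangular box with side lengths $|v_1'|, |v_2'|, |v_3'|$, so

$(\text{vol } \mathcal{P}')^2 = |v_1'|^2 |v_2'|^2 |v_3'|^2$

Hence $|\det A'| = \text{vol } \mathcal{P}'$, and by parts (a) and (b), $|\det A| = |\det A'| = \text{vol } \mathcal{P}' = \text{vol } \mathcal{P}$.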

$\square$
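Parts (a) and (c) can be checked on a concrete example with exact arithmetic. This is a minimal sketch: the three vectors are an arbitrary linearly independent choice, and the helper names `gram_schmidt` and `det3` are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(vectors):
    """Orthogonalize (without normalizing) a list of independent vectors."""
    result = []
    for v in vectors:
        w = [Fraction(x) for x in v]
        for u in result:
            coeff = dot(u, v) / dot(u, u)  # component of v along u
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        result.append(w)
    return result


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]   # rows v1, v2, v3
A_prime = gram_schmidt(A)               # rows v1', v2', v3'

# Gram matrix A' A'^t: entry (i, j) is <v_i', v_j'>.
gram = [[dot(u, v) for v in A_prime] for u in A_prime]

print(det3(A), det3(A_prime))                  # equal determinants
print(gram)                                    # diagonal matrix
print(gram[0][0] * gram[1][1] * gram[2][2])    # equals det(A)^2
```

For this choice both determinants are $-2$ and the product of the squared lengths is $4$.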

3.10 The Cross Product, Lines, and Planes in $\mathbb{R}^3$

3.10.5.

(a) Let $U,V ∈ M_n(\mathbb{R})$ be skew-symmetric, meaning that $U^T=−U$ and similarly for $V$ , where $U^T$ is the transpose of $U$ (Exercise 3.2.4). Show that $aU$ is skew-symmetric for every $a ∈ \mathbb{R}$ , and that $U+V$ is skew-symmetric. Thus the skew-symmetric matrices form a vector space. Show furthermore that the Lie bracket $[U,V] = UV−VU$ is skew-symmetric. One can optionally check that although the Lie bracket product is not in general associative, it instead satisfies the Jacobi identity,

$[U,[V,W]]+[V,[W,U]]+[W,[U,V]] = 0$

Proof:

$(aU)^T = a \cdot U^T = a \cdot (-U) = -(aU)$

$(U+V)^T = U^T + V^T = (-U) + (-V) = - (U+V)$

$\begin{align*} (UV−VU)^T &= (UV)^T - (VU)^T = V^T U^T - U^T V^T \\ &= (-V) (-U) - (-U) (-V) \\ &= VU - UV \\ &= - (UV - VU) \end{align*}$

For the Jacobi identity, expand each term:

$\begin{align*} [U,[V,W]] &= U(VW - WV) - (VW - WV)U = UVW - UWV - VWU + WVU \\ [V,[W,U]] &= V(WU - UW) - (WU - UW)V = VWU - VUW - WUV + UWV \\ [W,[U,V]] &= W(UV - VU) - (UV - VU)W = WUV - WVU - UVW + VUW \end{align*}$

Each of the six products $UVW, UWV, VWU, VUW, WUV, WVU$ appears once with a plus sign and once with a minus sign, so

$[U,[V,W]]+[V,[W,U]]+[W,[U,V]] = 0$
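Both the skew-symmetry of the bracket and the Jacobi identity can be spot-checked on random integer matrices. This is only a sketch, not a proof: the helper names are hypothetical, and the matrix size, entry range, and random seed are arbitrary choices.

```python
import random


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def bracket(u, v):
    """Lie bracket [U, V] = UV - VU."""
    return sub(matmul(u, v), matmul(v, u))


def is_skew(m):
    """True if the transpose of m equals -m."""
    transpose = [list(row) for row in zip(*m)]
    return transpose == [[-x for x in row] for row in m]


def random_skew(n, rng):
    """Random n x n skew-symmetric matrix with small integer entries."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = rng.randint(-5, 5)
            m[j][i] = -m[i][j]
    return m


rng = random.Random(0)
u, v, w = (random_skew(3, rng) for _ in range(3))

jacobi = add(add(bracket(u, bracket(v, w)), bracket(v, bracket(w, u))),
             bracket(w, bracket(u, v)))
print(is_skew(bracket(u, v)), jacobi)
```

The bracket should be reported as skew-symmetric and `jacobi` should be the zero matrix.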

3.10.11.

Show that $ℓ(p,d)$ and $ℓ(p',d')$ intersect if and only if the linear equation $Dt= Δp$ is solvable, where $D ∈ M_{3, 2}(\mathbb{R})$ has columns $d$ and $d'$ , $t$ is the column vector $\begin{bmatrix} t_1 \\ t_2 \\ \end{bmatrix}$ , and $Δp = p'-p$ .

For what points $p$ and $p'$ do $ℓ(p,(1,2,2))$ and $ℓ(p',(2,−1,4))$ intersect?

Proof:

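$\Rightarrow$

If the lines intersect, then $p + t d = p' + t' d'$ for some $t, t' ∈ \mathbb{R}$. Rearranging,

$t d - t' d' = p' - p = Δp \quad \Rightarrow \quad \begin{bmatrix} d & d' \end{bmatrix} \begin{bmatrix} t \\ -t' \\ \end{bmatrix} = Δp$

so $Dt = Δp$ is solved by $t_1 = t$, $t_2 = -t'$.

$\Leftarrow$

If $Dt = Δp$ is solvable, then

$\begin{bmatrix} d & d' \end{bmatrix} \begin{bmatrix} t_1 \\ t_2 \\ \end{bmatrix} = Δp \quad \Rightarrow \quad t_1 d + t_2 d' = p' - p \quad \Rightarrow \quad p + t_1 d = p' - t_2 d'$

and this common point lies on both lines.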

$\square$
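For the second question, one possible approach (a sketch, not part of the proof above) is to note that $Dt = Δp$ is solvable exactly when $Δp$ lies in the span of $d$ and $d'$. Since $d = (1,2,2)$ and $d' = (2,−1,4)$ are not parallel, this is equivalent to $⟨Δp, d \times d'⟩ = 0$. The helper names below are hypothetical, and integer coordinates are assumed so that the test against zero is exact.

```python
def cross(u, v):
    """Cross product in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


d, d_prime = (1, 2, 2), (2, -1, 4)
normal = cross(d, d_prime)
print(normal)  # (10, 0, -5)


def lines_intersect(p, p_prime):
    """True if l(p, d) and l(p', d') meet, for integer points p, p'."""
    delta_p = tuple(b - a for a, b in zip(p, p_prime))
    return dot(delta_p, normal) == 0


print(lines_intersect((0, 0, 0), (1, 5, 2)))  # True
print(lines_intersect((0, 0, 0), (1, 0, 0)))  # False
```

With $d \times d' = (10, 0, −5)$, the condition reads $10 Δp_1 − 5 Δp_3 = 0$, that is, the third coordinate of $Δp$ is twice the first.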

3.10.12.

Use vector geometry to show that the distance from the point $q$ to the line $ℓ(p,d)$ is

$\frac{|(q-p) \times d|}{|d|}$

(Hint: what is the area of the parallelogram spanned by $q−p$ and $d$ ?) Find the distance from the point $(3,4,5)$ to the line $ℓ((1,1,1),(1,2,3))$ .

Proof:

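Drop a perpendicular from $q$ to the line; its length $h$ is the distance we want. The parallelogram spanned by $q−p$ and $d$ has base $|d|$ and height $h$, so its area is $|d| h$. On the other hand, its area is

$|(q-p) \times d|$

Equating the two expressions, the distance is

$h = \frac{ |(q-p) \times d| }{|d|}$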

For the numerical part, $q = (3,4,5), p = (1,1,1), d = (1,2,3)$, so

$(q - p) = (2,3,4)$

$(2,3,4) \times (1,2,3) = (1, -2, 1)$

So the distance is

$\frac{ |(1, -2, 1)| }{ |(1,2,3)| } = \frac{\sqrt{6}}{\sqrt{14}} = \sqrt{ \frac{3}{7} }$

$\square$
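The numerical answer can be double-checked in two ways: with the cross-product formula, and by locating the foot of the perpendicular directly. This is a minimal sketch with hypothetical function names.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def distance_by_cross_product(q, p, d):
    """Distance from q to the line l(p, d) as |(q - p) x d| / |d|."""
    diff = tuple(qi - pi for qi, pi in zip(q, p))
    return norm(cross(diff, d)) / norm(d)


def distance_by_projection(q, p, d):
    """Same distance, via the foot of the perpendicular p + t d."""
    diff = tuple(qi - pi for qi, pi in zip(q, p))
    t = dot(diff, d) / dot(d, d)
    foot = tuple(pi + t * di for pi, di in zip(p, d))
    return norm(tuple(qi - fi for qi, fi in zip(q, foot)))


q, p, d = (3, 4, 5), (1, 1, 1), (1, 2, 3)
print(distance_by_cross_product(q, p, d))
print(distance_by_projection(q, p, d))
print(math.sqrt(3 / 7))
```

All three printed values should agree up to rounding.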

3.10.13.

Show that the time of nearest approach of two particles whose positions are $s(t) = p + tv$ , $\tilde{s}(t) = \tilde{p} + t \tilde{v}$ is $t= −⟨Δp,Δv⟩/|Δv|^2$ . (You may assume that the particles are at their nearest approach when the difference of their velocities is orthogonal to the difference of their positions.)

Proof:

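Write $Δp = \tilde{p} - p$ and $Δv = \tilde{v} - v$, and assume $Δv \neq 0$. At time $t$ the difference of the positions is $\tilde{s}(t) - s(t) = Δp + tΔv$, and the difference of the velocities is $Δv$. At the nearest approach these are orthogonal:

$\begin{align*} ⟨Δp+tΔv,Δv⟩ = 0 &\Rightarrow ⟨Δp,Δv⟩ + t ⟨Δv,Δv⟩ = 0 \\ &\Rightarrow t = −⟨Δp,Δv⟩/|Δv|^2 \end{align*}$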

$\square$
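A concrete example illustrates the formula. This is a sketch with made-up data: the positions and velocities are arbitrary integer choices, and the function names are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def nearest_approach_time(p, v, p_tilde, v_tilde):
    """t = -<dp, dv> / |dv|^2, assuming the velocities differ."""
    dp = [b - a for a, b in zip(p, p_tilde)]
    dv = [b - a for a, b in zip(v, v_tilde)]
    return Fraction(-dot(dp, dv), dot(dv, dv))


def squared_distance(p, v, p_tilde, v_tilde, t):
    """Squared distance between the two particles at time t."""
    diff = [(b + t * vb) - (a + t * va)
            for a, va, b, vb in zip(p, v, p_tilde, v_tilde)]
    return dot(diff, diff)


p, v = (0, 0, 0), (1, 0, 0)
p_tilde, v_tilde = (4, 3, 0), (-1, 0, 0)

t_star = nearest_approach_time(p, v, p_tilde, v_tilde)
print(t_star)                                            # 2
print(squared_distance(p, v, p_tilde, v_tilde, t_star))  # 9
```

Here the particles pass each other at $t = 2$ at distance $3$, and the squared distance is larger at any other time.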

3.10.14.

Write the equation of the plane through $(1,2,3)$ with normal direction $(1,1,1)$ .

Solution:

$⟨(x-1,y-2,z-3),(1,1,1)⟩ = 0 \quad \Rightarrow \quad (x-1) + (y-2) + (z-3) = 0 \quad \Rightarrow \quad x + y + z = 6$

$\square$

3.10.15.

Where does the plane $x/a+y/b+z/c = 1$ intersect each axis?

Solution:

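Setting $y = z = 0$ gives $x = a$, and similarly for the other two axes. So the plane meets the $x$-, $y$-, and $z$-axes at $(a, 0, 0)$, $(0,b,0)$, and $(0,0,c)$ respectively.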

$\square$

3.10.16.

Specify the plane containing the point $p$ and spanned by directions $d$ and $d'$ . Specify the plane containing the three points $p$ , $q$ , and $r$ .

Solution:

A normal (orthogonal) vector to the first plane is $n = d \times d'$, assuming $d$ and $d'$ are not parallel. So the plane is the set of points $s$ with $⟨s-p, d \times d'⟩ = 0$.

For the plane through $p$, $q$, and $r$, let $d = q - p, d' = r - p$. Then the plane is the set of points $s$ satisfying

$⟨s - p, (q - p) \times (r - p)⟩ = 0$

$\square$
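As an illustration of the three-point case, the normal and the resulting equation can be computed for sample points. This is a sketch with an arbitrary choice of $p, q, r$; the function name `plane_through` is hypothetical.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def plane_through(p, q, r):
    """Return (n, c) such that the plane is {s : <n, s> = c}."""
    d = tuple(qi - pi for qi, pi in zip(q, p))
    d_prime = tuple(ri - pi for ri, pi in zip(r, p))
    n = cross(d, d_prime)
    return n, dot(n, p)   # <s - p, n> = 0 is the same as <n, s> = <n, p>


n, c = plane_through((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(n, c)  # (1, 1, 1) 1
```

The result is the plane $x + y + z = 1$, which indeed contains all three points.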

3.10.17.

Use vector geometry to show that the distance from the point $q$ to the plane $P(p,n)$ is

$\frac{ |⟨q-p,n⟩| }{|n|}$

(Hint: Resolve $q−p$ into components parallel and normal to $n$ .) Find the distance from the point $(3,4,5)$ to the plane $P((1,1,1),(1,2,3))$ .

Solution:

From exercise 2.2.15,

$(q-p)_{\parallel n} = \frac{⟨n, q-p⟩}{|n|^2}n$

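Since $n$ is normal to the plane and $p$ lies on the plane, the distance from $q$ to the plane is the length of this component:

$|(q-p)_{\parallel n}| = \frac{|⟨n, q-p⟩|}{|n|^2}|n| = \frac{|⟨n, q-p⟩|}{|n|}$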

$\square$
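The numerical part of the exercise can be evaluated directly from this formula. This is a minimal sketch; the function name `distance_point_to_plane` is hypothetical.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def distance_point_to_plane(q, p, n):
    """Distance from q to the plane P(p, n) as |<q - p, n>| / |n|."""
    diff = tuple(qi - pi for qi, pi in zip(q, p))
    return abs(dot(diff, n)) / math.sqrt(dot(n, n))


print(distance_point_to_plane((3, 4, 5), (1, 1, 1), (1, 2, 3)))
```

Here $⟨q-p, n⟩ = ⟨(2,3,4),(1,2,3)⟩ = 20$ and $|n| = \sqrt{14}$, so the distance is $20/\sqrt{14} \approx 5.345$.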
