Chapter 3 Linear Mappings and Their Matrices

3.1 Linear Mappings

3.2 Operations on Matrices

The matrices for the linear mapping

Should be denoted

the jth column of

Definition 3.2.1 (Matrix addition).

If and then .

Definition 3.2.2 (Scalar-by-matrix multiplication).

If and then .

Proposition 3.2.3 ( forms a vector space).

The set of matrices forms a vector space over .

Composition

If

are linear then their composition.

is linear as well. Suppose that and respectively have matrices

Then the composition has a matrix in that is naturally defined as the matrix-by-matrix product

the order of multiplication being chosen for consistency with the composition. Under this specification,

Definition 3.2.4 (Matrix multiplication).

Given two matrices

such that has as many columns as has rows, their product,

has for its th entry (for every ) the inner product of the th row of and the th column of . In symbols,

$$ (AB)_{ij} = ⟨\text{th row of }, \text{th column of }⟩, $$

or, at the level of individual entries, if and then

$$ \text{(A times B)’s jth column equals A times (B’s jth column),} \ \text{ith row of (A times B) equals (ith row of A) times B. } $$

Proposition 3.2.5 (Properties of matrix multiplication).

Matrix multiplication is associative:

Matrix multiplication distributes over matrix addition:

Scalar multiplication passes through matrix multiplication:

The identity matrix is a multiplicative identity,

Proof:

The right way to prove these is intrinsic, by recalling that addition, scalar multiplication, and multiplication of matrices precisely mirror addition, scalar multiplication, and composition of mappings.

$$ where } & \ &= S(T(U(x))) &\quad \text{by definition of } & \ &= S((T ◦U)(x)) &\quad \text{by definition of } & \ &= (S ◦(T ◦U))(x) &\quad \text{by definition of where .} & \\ \end{align*} $$

Alternatively, one can verify the equalities elementwise by manipulating sums.

The author finds this method as grim as it is gratuitous: the coordinates work because they must, but their presence only clutters the argument.

3.3 The Inverse of a Linear Mapping

Given a linear mapping , does it have an inverse? That is, is there a mapping such that

Note the inverse , if it exists, must be unique,

If the inverse exists then it too is linear.

Assume , then we can find , such that .

Then

$$ is linear} \ &= (T \circ S)(x_1 + x_2) &\quad \text{Definition of composition} \ &= x_1 + x_2 &\quad \text{ inverts } \\ &= T(y_1) + T(y_2) &\quad \text{above discussion} \\ \end{align*} $$

If the equation has a nonzero solution then has no inverse.

We have a contradiction.

Definition 3.3.1 (Elementary matrices).

  • Recombine matrix: (Here the a sits in the (i,j)th position, the diagonal entries are and all other entries are 0. The is above the diagonal as shown only when ; otherwise it is below.)
  • Scale matrix: Here the a sits in the ith diagonal position, all other diagonal entries are , and all other entries are .
  • Transposition matrix: Here the diagonal entries are 1 except the ith and jth, the (i,j)th and (j,i)th entries are 1, and all other entries are 0.

Proposition 3.3.2 (Effects of the elementary matrices).

Let be an matrix; call its rows . Then:

(1) The matrix has the same rows as except that its ith row is .

(2) The matrix has the same rows as except that its ith row is .

(3) The matrix has the same rows as except that its ith row is and its jth row is .

Lemma 3.3.3 (Invertibility of products of the elementary matrices).

Products of elementary matrices are invertible. More specifically: (1) The elementary matrices are invertible by other elementary matrices. Specifically,

(2) If the m×m matrices and are invertible by and , then the product matrix is invertible by . (Note the order reversal.)

(3) Every product of elementary matrices is invertible by another such product, specifically the product of the inverses of the original matrices, but taken in reverse order.

Proposition 3.3.4 (Persistence of solution).

Let be an matrix and let be a product of elementary matrices. Then the equations

are satisfied by the same vectors in .

Definition 3.3.5 (Echelon matrix).

A matrix is called echelon if it has the form

Here the 's are arbitrary entries, and all entries below the stairway are . Thus each row’s first nonzero entry is a , each row’s leading is farther right than that of the row above it, each leading has a column of ’s above it, and any rows of ’s are at the bottom.

Theorem 3.3.6 (Matrices reduce to echelon form).

Every matrix row reduces to a unique echelon matrix .

In an echelon matrix , the columns with leading 's are called new columns, and all others are old columns.

Theorem 3.3.7 (Invertibility and echelon form for matrices).

A nonsquare matrix is never invertible. A square matrix is invertible if and only if its echelon form is the identity matrix.

Proposition 3.3.8 (Matrix inversion algorithm).

Given , set up the matrix

in . Carry out row operations on this matrix to reduce the left side to echelon form. If the left side reduces to In then is invertible and the right side is . If the left side doesn’t reduce to In then is not invertible.

Theorem 3.3.9 (Echelon criterion for invertibility).

The linear mapping is invertible only when and its matrix has echelon matrix , in which case its inverse is the linear mapping with matrix .

3.5 The Determinant: Characterizing Properties and Their Consequences

  • The goal is to define a function that takes such a matrix, with its entries, and returns a single number.

  • For every square matrix , the scalar should contain as much algebraic and geometric information about the matrix as possible.

This context nicely demonstrates a pedagogical principle already mentioned in Section 3.1: characterizing a mathematical object illuminates its construction and its use. Rather than beginning with a definition of the determinant, we will stipulate a few natural behaviors for it, and then we will eventually see that

  • there is a function with these behaviors (existence),
  • there is only one such function (uniqueness), and, most importantly,
  • these behaviors, rather than the definition, further show how the function works (consequences).

If has rows , write for .

The advantage of this viewpoint is that now we can impose conditions on the determinant, using language already at our disposal in a natural way. Specifically, we make three requirements:

(1) The determinant is multilinear, meaning that it is linear as a function of each of its vector variables when the rest are held fixed. That is, for all vectors and every scalar ,

Then

(2) The determinant is skew-symmetric as a function of its vector vari- ables, meaning that exchanging any two inputs changes the sign of the determinant,

(Here .) Consequently, the determinant is also alternating, meaning that if two inputs and are equal then .

(3) The determinant is normalized, meaning that the standard basis has determinant ,

Theorem 3.5.1 (Existence and uniqueness of the determinant).

One, and only one, multilinear skew-symmetric normalized function from the n-fold product of to exists. This function is the determinant,

Furthermore, all multilinear skew-symmetric functions from the n-fold product of to are scalar multiples of of the determinant. That is, every multilinear skew-symmetric function

is

The parity of a rearrangement

Independently of the determinant, every rearrangement of n objects has a well-defined parity, meaning that for every rearrangement of the objects, either all sequences of pairwise exchanges that put the objects back in order have even length or all such sequences have odd length.

Theorem 3.5.2 (The determinant is multiplicative).

For all matrices ,

In particular, if is invertible then the determinant of the matrix inverse is the scalar inverse of the determinant,

Proof:

Let be fixed. Consider the function

So we have

We first show is multilinear.

Next we show is skew symmetric.

So ,

So .

Also , then .

3.6 The Determinant: Characterizing Properties and Their Consequences

Definition 3.6.1 (Permutation).

A permutation of is a vector

whose entries are , each appearing once, in any order. An inversion in the permutation is a pair of entries with the larger one to the left.

The sign of the permutation , written , is −1 raised to the number of inversions in . The set of permutations of is denoted .

Every multilinear function (if it exists at all) must satisfy

If is also alternating then for every , then

If any two subscripts agree.

Thus we may sum only over permutations,

Compute

Consider any permutation . Suppose that contains an inversion, i.e., two elements are out of order. Then necessarily two elements in adjacent slots are out of order.

To see this, assume , then either (adjacent slots found) or . In the latter case, is closer than by . We can continue this process until find adjacent slots.

After finding the adjacent slots, we can swap them and reduce the # of inversions by .

Repeating this process until the permutation has no remaining inversions shows that

That is, a possible formula for a multilinear skew-symmetric function is

Where .

The previous display gives the unique possible formula for a multilinear skew-symmetric normalized function because every method of rearranging into order must produce the same factor .

This is because if two rearrangings produce different factor and , then that means we can find a sequence with odd steps such that

Which is not possible as proved in exercise 3.5.2.

Definition 3.6.2 (Determinant).

The determinant function,

is defined as follows. For every with entries

Proposition 3.6.3 (Properties of the determinant).

(1) The determinant is linear as a function of each row of .

Proof: Let be a matrix with

Then

(2) The determinant is skew-symmetric as a function of the rows of .

Proof:

Let Suppose that rows and are exchanged. The resulting matrix has

For each permutation , define a companion permutation by exchanging the th and st entries,

Thus , and for all other . each we have the relation (Exercise 3.6.6).

Then we have

$$ \begin{align*} \det (B)

&= \sum_{\pi \in S_n} (-1)^π b_{1 \pi(1)} \cdots b_{k \pi(k)} b_{(k+1) \pi(k+1)} \cdots b_{n \pi(n)} \

&= \sum_{\pi \in S_n} (-1)^π a_{1 \pi(1)} \cdots a_{(k+1) \pi(k)} a_{k \pi(k+1)} \cdots a_{n \pi(n)} \

&= \sum_{\pi \in S_n} (-1)^π a_{1 π'(1)} \cdots a_{(k+1) π'(k+1)} a_{k π'(k)} \cdots a_{n π'(n)} \

&= -\sum_{π \in S_n} (-1)^{π'} a_{1 π'(1)} \cdots a_{(k+1) π'(k+1)} a_{k π'(k)} \cdots a_{n π'(n)} \

&= -\sum_{π' \in S_n} (-1)^{π'} a_{1 π'(1)} \cdots a_{(k+1) π'(k+1)} a_{k π'(k)} \cdots a_{n π'(n)} \

&= - \det (A)

\end{align*}

$$

To exchange rows and in where , we can do the following exchanges:

Note there are exchanges in total, which is an odd number.

(3) The determinant is normalized.

Proof:

Consider

Note that if , then . So the only item left from above summation is

Therefore, the determinant is normalized.

So a unique determinant function with the stipulated behavior exists. And we have seen that every multilinear skew-symmetric function must be a scalar multiple of the determinant. (This is discussed in the secton) before 3.6.2.

Since the determinant is multilinear and skew- symmetric, so are its scalar multiples. This fact was shown in Exercise 3.5.3.

Proposition 3.6.4 (Determinant algorithm).

Given , use row and column operations—recombines, scales, transpositions—to reduce to a triangular matrix . Then is times the reciprocal of each scale factor and times−1 for each transposition.

3.9 Geometry of the Determinant: Orientation

We have . Arrange them in column to get .

defines a linear map from to

We can view in two ways:

  1. It's the linear combination: .
  2. It's the coefficients of and note .

We have following equivlent conditions

  • is a basis of
  • can be uniquely represented by them
  • for some unique .
  • is invertible