Matrix Multiplication

Section 3 Matrix Multiplication

Next we are going to take a look at matrix multiplication. This operation does not follow our initial intuition: it is not performed entry-wise like matrix addition. As with matrix addition, we first need to check whether two matrices are compatible for matrix multiplication.

Subsection 3.1 Compatibility

Unlike addition, the check compares the number of columns of the matrix on the left in the product with the number of rows of the matrix on the right in the product. Let's break this down a bit more. Consider,

\begin{equation*} A = \begin{bmatrix} 1 & 2 \\ 0 & 5 \\ 2 & -9 \end{bmatrix} \textrm{ and matrix } B = \begin{bmatrix} 0 & -2 & 8 \\ 10 & -2 & -4 \end{bmatrix} \end{equation*}

Now if we wanted to find \(AB \) we would need to check to make sure that this product is possible. To do this we observe that in the expression \(AB \text{,}\) matrix \(A \) is on the "left" in the written product, so we note the number of columns that it has (in this case 2). Then we observe that \(B \) is on the "right" in the written product, so we note the number of rows that it has (in this case 2). Since these two numbers match, we say that the two matrices are compatible for matrix multiplication with \(A \) on the left and \(B \) on the right.
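The same check can be sketched in Python by comparing the shapes that NumPy stores for each array. The helper name `can_multiply` below is made up for this illustration and is not part of NumPy.

```python
import numpy as np

def can_multiply(left, right):
    """Hypothetical helper: True when columns of left equal rows of right."""
    return left.shape[1] == right.shape[0]

# The matrices A and B from the example above.
A = np.array([[1, 2], [0, 5], [2, -9]])      # 3 rows, 2 columns
B = np.array([[0, -2, 8], [10, -2, -4]])     # 2 rows, 3 columns

# .shape is a (rows, columns) tuple.
print(A.shape, B.shape)      # (3, 2) (2, 3)

# A has 2 columns and B has 2 rows, so AB is possible.
print(can_multiply(A, B))    # True
```

Here `left.shape[1]` is the number of columns of the left matrix and `right.shape[0]` is the number of rows of the right matrix, which are exactly the two numbers compared above.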

Checkpoint 3.1.

Are the following two matrices compatible for matrix multiplication as \(AB \text{?}\) What about \(BA \text{?}\)

\begin{equation*} A = \begin{bmatrix} 1 & 2 & 2 \\ 0 & 5 & 1 \\ 2 & -9 & 3 \end{bmatrix} \textrm{ and matrix } B = \begin{bmatrix} 0 & -2 & 8 \\ 10 & -2 & -4 \end{bmatrix} \end{equation*}

Solution

\(AB \) can't be performed since the number of columns of \(A \) does not equal the number of rows of \(B \text{.}\)

Hint 1

\(A \) has 3 columns and \(B \) has only 2 rows.

\(BA \) can be performed since the number of columns of \(B \) does equal the number of rows of \(A \text{.}\)

Hint 2

\(B \) has 3 columns and \(A \) has 3 rows.

This compatibility check is really important when working with matrices. In machine learning, weight matrices often need to interact with one another through matrix multiplication, so keeping an eye on matrix sizes is a must as the matrices are designed and used.

Let's take a look at how Python reports a compatibility problem. (Hint, this will look a lot like what we saw in the last Python code-block example.)

Matrix compatibility check in Python for Checkpoint 3.1

# Include the numpy module. 
import numpy as np

# Assign matrix A and matrix B from Checkpoint 3.1
A = np.array([[1, 2, 2], [0, 5, 1], [2, -9, 3]])
B = np.array([[0, -2, 8], [10, -2, -4]])

# Compute the product BA and assign it to C
C = np.dot(B, A)

# This will print C. (The product BA)
print(C)

# Try to compute the product AB
D = np.dot(A, B)

# This would print D if it could be computed. 
print(D)

The output of the above code is the printed product \(BA \text{,}\) followed by an error raised at the line that attempts \(AB \text{.}\)

Since the product that does work was printed before the incompatible product was attempted, we see the printed product first, and then the error. Take a close look at the error message when you run this. Once again, this is a good error to key in on, as it will come up whenever there are shape (dimension) alignment issues.
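If we would rather have the program keep running when a product is incompatible, one option is to catch the error. The sketch below assumes that NumPy signals the shape problem with a `ValueError`; the variable name `product_ok` is made up for this illustration.

```python
import numpy as np

# The matrices A and B from Checkpoint 3.1.
A = np.array([[1, 2, 2], [0, 5, 1], [2, -9, 3]])   # 3 rows, 3 columns
B = np.array([[0, -2, 8], [10, -2, -4]])           # 2 rows, 3 columns

try:
    # A has 3 columns but B has only 2 rows, so this line fails.
    D = np.dot(A, B)
    product_ok = True
except ValueError as err:
    product_ok = False
    print("AB is not defined for shapes", A.shape, "and", B.shape)
    print("NumPy reported:", err)
```

Printing `err` shows the same message that appears at the end of the uncaught error, so it is still worth reading closely.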

Subsection 3.2 The Algorithm

Up to this point we have talked about the vocabulary of matrix multiplication as well as how to check if it can be performed. Now let's explore how the product is computed. Let's consider again the matrices,

\begin{equation*} A = \begin{bmatrix} 1 & 2 \\ 0 & 5 \\ 2 & -9 \end{bmatrix} \textrm{ and matrix } B = \begin{bmatrix} 0 & -2 & 8 \\ 10 & -2 & -4 \end{bmatrix} \end{equation*}

Above we stated that \(AB \) can be computed, and we did the column = row compatibility check. Now we will showcase how this product is computed. To perform the product \(AB \text{,}\) we are going to use the dot product on the rows of \(A \) and the columns of \(B \text{.}\)

Definition 3.2. Dot Product.

The dot product is an operation between two vectors with the same number of entries (in our case they will be the rows of the "left" matrix and the columns of the "right" matrix), the result of which is a scalar. So if \(\mathbf{a} \) and \(\mathbf{b} \) are vectors with \(n \) entries each, then,

\begin{equation*} \mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^n \mathbf{a}_i\mathbf{b}_i = \mathbf{a}_1\mathbf{b}_1 + \mathbf{a}_2\mathbf{b}_2 + \mathbf{a}_3\mathbf{b}_3 + ... + \mathbf{a}_n\mathbf{b}_n \end{equation*}

where \(\mathbf{a}_i\mathbf{b}_i \) is the ordinary product of the \(i \)-th entries \(\mathbf{a}_i \) and \(\mathbf{b}_i \text{.}\)
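To make the definition concrete, here is a minimal sketch that computes a dot product with an explicit loop and compares it with `np.dot`. The function name `dot_product` is made up for this illustration; the two vectors are row 1 of \(A \) and column 1 of \(B \) from the matrices above.

```python
import numpy as np

def dot_product(a, b):
    """Illustrative dot product: multiply matching entries, then add them up."""
    if len(a) != len(b):
        raise ValueError("both vectors need the same number of entries")
    total = 0
    for a_i, b_i in zip(a, b):
        total += a_i * b_i
    return total

row = [1, 2]     # row 1 of A
col = [0, 10]    # column 1 of B

# 1(0) + 2(10) = 20
print(dot_product(row, col))   # 20

# NumPy's built-in dot product gives the same scalar.
print(np.dot(row, col))        # 20
```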

So, to find \(AB \) given,

\begin{equation*} A = \begin{bmatrix} 1 & 2 \\ 0 & 5 \\ 2 & -9 \end{bmatrix} \textrm{ and matrix } B = \begin{bmatrix} 0 & -2 & 8 \\ 10 & -2 & -4 \end{bmatrix} \end{equation*}

we will dot row 1 of \(A \) with each of the columns of \(B \text{.}\) The fact that we are using row 1 of \(A \) means that we will be computing the first row in the product matrix. When we switch to row 2 of \(A \) we will be computing the second row of the product matrix, and so on.

So,

\begin{equation*} AB = \begin{bmatrix} \mathbf{1} & \mathbf{2} \\ \mathbf{0} & \mathbf{5} \\ \mathbf{2} & \mathbf{-9} \end{bmatrix} \begin{bmatrix} 0 & -2 & 8 \\ 10 & -2 & -4 \end{bmatrix} = \begin{bmatrix} \mathbf{1}(0) + \mathbf{2}(10) & \mathbf{1}(-2) + \mathbf{2}(-2) & \mathbf{1}(8)+\mathbf{2}(-4) \\ \mathbf{0}(0)+\mathbf{5}(10) & \mathbf{0}(-2)+\mathbf{5}(-2) & \mathbf{0}(8)+\mathbf{5}(-4) \\ \mathbf{2}(0)+\mathbf{-9}(10) & \mathbf{2}(-2)+\mathbf{-9}(-2) & \mathbf{2}(8)+\mathbf{-9}(-4) \end{bmatrix} \end{equation*}

Throughout all of this we need to be thinking, "row by column".
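The "row by column" process can be sketched as two nested loops: one over the rows of the left matrix and one over the columns of the right matrix. This is only an illustration of the idea, not how NumPy computes products internally, and the function name `matmul_rows_by_columns` is made up for this example.

```python
import numpy as np

def matmul_rows_by_columns(left, right):
    """Illustrative product: entry (i, j) is row i of left dotted with column j of right."""
    n_rows, n_inner = left.shape
    n_inner_right, n_cols = right.shape
    if n_inner != n_inner_right:
        raise ValueError("columns of left must equal rows of right")
    product = np.zeros((n_rows, n_cols), dtype=np.result_type(left, right))
    for i in range(n_rows):          # pick a row of the left matrix
        for j in range(n_cols):      # pick a column of the right matrix
            product[i, j] = np.dot(left[i, :], right[:, j])
    return product

A = np.array([[1, 2], [0, 5], [2, -9]])
B = np.array([[0, -2, 8], [10, -2, -4]])

AB = matmul_rows_by_columns(A, B)
print(AB)
# Rows of AB: [20, -6, 0], [50, -10, -20], [-90, 14, 52]

# The loop version agrees with NumPy's own product.
print(np.array_equal(AB, np.dot(A, B)))   # True
```

Evaluating the sums in the worked example above gives the same nine entries, for instance \(1(0) + 2(10) = 20 \) in row 1, column 1.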

Checkpoint 3.3.

Use the above example to find the product \(CD \text{,}\) given,

\begin{equation*} C = \begin{bmatrix} 9 & 1 \\ 2 & -4 \\ 2 & 7 \end{bmatrix} \textrm{ and matrix } D = \begin{bmatrix} 0 & 1 & 3 \\ 1 & -2 & 6 \end{bmatrix} \end{equation*}

Solution

\begin{equation*} CD = \begin{bmatrix} \mathbf{9} & \mathbf{1} \\ \mathbf{2} & \mathbf{-4} \\ \mathbf{2} & \mathbf{7} \end{bmatrix} \begin{bmatrix} 0 & 1 & 3 \\ 1 & -2 & 6 \end{bmatrix} = \end{equation*}

\begin{equation*} \begin{bmatrix} \mathbf{9}(0) + \mathbf{1}(1) & \mathbf{9}(1) + \mathbf{1}(-2) & \mathbf{9}(3)+\mathbf{1}(6) \\ \mathbf{2}(0)+\mathbf{-4}(1) & \mathbf{2}(1)+\mathbf{-4}(-2) & \mathbf{2}(3)+\mathbf{-4}(6) \\ \mathbf{2}(0)+\mathbf{7}(1) & \mathbf{2}(1)+\mathbf{7}(-2) & \mathbf{2}(3)+\mathbf{7}(6) \end{bmatrix} = \end{equation*}

\begin{equation*} \begin{bmatrix} 1 & 7 & 33 \\ -4 & 10 & -18 \\ 7 & -12 & 48 \end{bmatrix} \end{equation*}

Hint

Note that the size of the product matrix is the number of rows of \(C \) by the number of columns of \(D \) (here, 3 by 3). This underscores the importance of knowing the sizes of the matrices as well as the order in which they are being multiplied.

Matrix multiplication in Python for Checkpoint 3.3

# Include the numpy module. 
import numpy as np

# Assign matrix C and matrix D from Checkpoint 3.3
C = np.array([[9, 1], [2, -4], [2, 7]])
D = np.array([[0, 1, 3], [1, -2, 6]])

# Compute the product of C and D and assign it to M
M = np.dot(C, D)

# This will print M
print(M)

Notice: The syntax to multiply matrices is np.dot(A, B), not *. As we will see in the next section, * is for a different operation. Also, the order in which the matrices appear in the parentheses (from left to right) is the order of the product.
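To see why the order inside the parentheses matters, the sketch below multiplies the matrices from Checkpoint 3.3 in both orders and compares the shapes of the results. It also uses Python's `@` operator, which is not used elsewhere in this section; for two-dimensional arrays like these it gives the same result as `np.dot`.

```python
import numpy as np

C = np.array([[9, 1], [2, -4], [2, 7]])    # 3 rows, 2 columns
D = np.array([[0, 1, 3], [1, -2, 6]])      # 2 rows, 3 columns

# C on the left, D on the right: (3 x 2)(2 x 3) gives a 3 x 3 matrix.
M = np.dot(C, D)
print(M.shape)                    # (3, 3)

# D on the left, C on the right: (2 x 3)(3 x 2) gives a 2 x 2 matrix.
N = np.dot(D, C)
print(N.shape)                    # (2, 2)

# The @ operator is another way to write the same matrix product.
M_at = C @ D
print(np.array_equal(M, M_at))    # True
```

Both orders happen to be compatible here, but the two products do not even have the same size, so swapping the arguments changes the answer.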

Subsection 3.3 The Hadamard Product

We started this section by indicating that matrix multiplication does not have the entry by entry process that we see with matrix addition. It should be noted that there is, in fact, another product that does work this way.

Definition 3.4. The Hadamard Product.

The Hadamard product is another (different) matrix product that is computed by entry-wise multiplication, that is,

\begin{equation*} \begin{bmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{bmatrix} \circ \begin{bmatrix} b_{11} & b_{12} & b_{13}\\ b_{21} & b_{22} & b_{23}\\ b_{31} & b_{32} & b_{33} \end{bmatrix} = \begin{bmatrix} a_{11}\, b_{11} & a_{12}\, b_{12} & a_{13}\, b_{13}\\ a_{21}\, b_{21} & a_{22}\, b_{22} & a_{23}\, b_{23}\\ a_{31}\, b_{31} & a_{32}\, b_{32} & a_{33}\, b_{33} \end{bmatrix}. \end{equation*}

This product should be thought of as an entirely different operation from matrix multiplication. With that in mind, the compatibility check for the Hadamard product is that the two matrices in the product must be the same size (same number of rows and same number of columns).
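One caution when carrying this check over to Python: NumPy's `*` follows its broadcasting rules, so it will sometimes multiply arrays of different shapes without raising an error. The sketch below wraps `*` in a helper that enforces the same-size rule first. The name `hadamard` and the matrices `P`, `Q`, and `R` are made up for this illustration.

```python
import numpy as np

def hadamard(left, right):
    """Illustrative Hadamard product that insists on identical shapes."""
    if left.shape != right.shape:
        raise ValueError(f"shapes {left.shape} and {right.shape} differ")
    return left * right

P = np.array([[1, 2, 3], [4, 5, 6]])          # 2 rows, 3 columns
Q = np.array([[10, 20, 30], [40, 50, 60]])    # 2 rows, 3 columns
R = np.array([[1, 2, 3]])                     # 1 row, 3 columns

# Same size, so the entry-wise product is defined.
H = hadamard(P, Q)
print(H)    # rows: [10, 40, 90] and [160, 250, 360]

# Plain * broadcasts R across both rows of P instead of raising an error.
print((P * R).shape)    # (2, 3)

# The strict helper rejects the mismatched sizes.
try:
    hadamard(P, R)
    strict_rejected = False
except ValueError as err:
    strict_rejected = True
    print("Not a Hadamard product:", err)
```

Broadcasting is useful in its own right, but it is a different operation from the Hadamard product defined above, so checking the shapes explicitly can prevent silent mistakes.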

Checkpoint 3.5.

Find the Hadamard product of \(A \) and \(B \) where,

\begin{equation*} A = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \textrm{ and matrix } B = \begin{bmatrix} 0 & -2 \\ 10 & -2 \end{bmatrix} \end{equation*}

Solution

\(A \circ B = \begin{bmatrix} 0 & -4 \\ 30 & -10 \end{bmatrix} \)

The Hadamard Product in Python for Checkpoint 3.5

# Include the numpy module. 
import numpy as np

# Assign matrix A and matrix B from Checkpoint 3.5
A = np.array([[1, 2], [3, 5]])
B = np.array([[0, -2], [10, -2]])

# Compute the Hadamard product of A and B and assign it to M
M = A * B

# print M
print(M)

Next we are going to take a look at general matrix equations. This will bring some context to the matrix product, and put down some foundation to build on moving forward.