Notebook

I'm taking Stanford's online machine learning course through Coursera. The second week has a good overview of linear algebra and matrix operations. The instructor provides a useful slide deck in which he explains the basics. I'm going to go through this PDF and implement the linear algebra using NumPy. I have NumPy 1.6 installed with Python 2.7.5.

(Update: The ML course PDF has been taken down, but this person has a good summary of the same PDF.)

In [1]:
from __future__ import print_function, division, unicode_literals, absolute_import
import numpy as np


Create a matrix

Let's try creating the matrix he shows in slides 2 and 3. The basic form for creating arrays is the np.array() function: a = np.array(). Inside the parentheses, nest square brackets, and in those brackets put comma-separated lists of elements in more square brackets. Each inner pair of square brackets represents a row; make sure the rows all have the same number of elements. So to create a 3x2 matrix, use the following:

In [2]:
a = np.array([[1402,191],[1371,821],[949,1437]])
print(a)

[[1402  191]
[1371  821]
[ 949 1437]]



Matrix Elements

Next he talks about accessing individual matrix elements, so a11 (row 1, column 1) should access 1402:

In [3]:
a[1,1]

Out[3]:
821


Instead of returning the element in the first row and first column, it gave us the element in the second row and second column. This is because NumPy uses 0-indexing (start at 0) instead of 1-indexing (start at 1). So to get the element in the first row and first column we index from 0:

In [4]:
a[0,0]

Out[4]:
1402
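Since indexing starts at 0, a single index also pulls out a whole row, and negative indices count from the end. A quick sketch using the same matrix a:

```python
import numpy as np

a = np.array([[1402, 191], [1371, 821], [949, 1437]])

# A single index selects an entire row (0-indexed)
first_row = a[0]   # array([1402,  191])

# Negative indices count back from the end
last_row = a[-1]   # array([ 949, 1437])
```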


Next let's create two 3x2 matrices and add them together. First, instantiate the matrices:

In [5]:
b = np.array([[1,0],[2,5],[3,1]])
c = np.array([[4,0.5],[2,5],[0,1]])
print(b)

[[1 0]
[2 5]
[3 1]]


In [6]:
print(c)

[[ 4.   0.5]
[ 2.   5. ]
[ 0.   1. ]]



Add b and c and see what happens:

In [7]:
b+c

Out[7]:
array([[  5. ,   0.5],
[  4. ,  10. ],
[  3. ,   2. ]])


As you can see, NumPy correctly performed an element-wise addition.

Scalar Multiplication

Next, multiply a scalar by a 3x2 matrix. We'll use matrix b from above:

In [8]:
3 * b

Out[8]:
array([[ 3,  0],
[ 6, 15],
[ 9,  3]])


Again, NumPy correctly multiplied each element of the matrix by 3. Note that scalar multiplication can be done in either order (i.e. scalar × matrix = matrix × scalar).
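As a quick sanity check on that claim, np.array_equal compares two arrays in full. A minimal sketch using b from above:

```python
import numpy as np

b = np.array([[1, 0], [2, 5], [3, 1]])

# Scalar multiplication commutes: 3*b and b*3 produce the same array
same = np.array_equal(3 * b, b * 3)
print(same)  # True
```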

Division is just multiplication by a fraction:

In [9]:
b / 4

Out[9]:
array([[ 0.25,  0.  ],
[ 0.5 ,  1.25],
[ 0.75,  0.25]])
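The float results above depend on the `from __future__ import division` at the top of the notebook; without it, Python 2 would floor-divide the integer array. The explicit floor-division operator `//` shows the difference (a sketch using b from above):

```python
import numpy as np

b = np.array([[1, 0], [2, 5], [3, 1]])

# True division yields floats
print(b / 4)    # e.g. 0.25, 1.25, ...

# Floor division truncates and keeps the integer dtype
print(b // 4)   # [[0 0], [0 1], [0 0]]
```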


Combination of Operands

Order of operations is important. In this slide the instructor sets up three 3x1 vectors and provides an example in which he multiplies by a scalar, then adds, subtracts, and divides. Let NumPy do it and see what happens:

In [10]:
e = np.array([[1],[4],[2]])
f = np.array([[0],[0],[5]])
g = np.array([[3],[0],[2]])
3 * e + f - g / 3

Out[10]:
array([[  2.        ],
[ 12.        ],
[ 10.33333333]])


As before, NumPy produces the same answer as the instructor found by doing it by hand.

Matrix-Vector Multiplication

We can multiply a matrix by a vector as long as the number of columns of the matrix is the same as the number of rows of the vector. In other words, the matrix must be as wide as the vector is long.

In [12]:
h = np.array([[1,3],[4,0],[2,1]]) # 3x2
i = np.array([[1],[5]]) # 2x1
h * i

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
1 h = np.array([[1,3],[4,0],[2,1]]) # 3x2
2 i = np.array([[1],[5]]) # 2x1
----> 3 h * i

ValueError: operands could not be broadcast together with shapes (3,2) (2,1) 

My multiplication operation didn't work, and the error helpfully gave me the shapes of the arrays involved. This is because the '*' operator performs element-wise multiplication, which requires the arrays to have the same shape (or shapes compatible under NumPy's broadcasting rules); hence the error message showing the mismatched shapes. What we want here is matrix multiplication, i.e. the dot product.
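Element-wise multiplication does work when the shapes are compatible under broadcasting; for instance, a length-2 row vector stretches across each row of a 3x2 matrix. A sketch (not from the slides):

```python
import numpy as np

h = np.array([[1, 3], [4, 0], [2, 1]])  # 3x2
row = np.array([10, 100])               # shape (2,), broadcasts across rows

# Each row of h is multiplied element-wise by `row`
print(h * row)
# [[ 10 300]
#  [ 40   0]
#  [ 20 100]]
```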

Dot Product

To multiply two matrices, use the np.dot() function (also available as the array method dot()):

In [13]:
h = np.array([[1,3],[4,0],[2,1]]) # 3x2
i = np.array([[1],[5]]) # 2x1
np.dot(h,i)

Out[13]:
array([[16],
[ 4],
[ 7]])


That gives us the correct answer, according to the slides. A longer example:

In [14]:
j = np.array([[1,2,1,5],[0,3,0,4],[-1,-2,0,0]])
k = np.array([[1],[3],[2],[1]])
np.dot(j,k)

Out[14]:
array([[14],
[13],
[-7]])


This one checks out with the slides as well.

Matrix-Matrix Multiplication

As far as NumPy is concerned, matrix-matrix multiplication is just like matrix-vector multiplication.

In [15]:
l = np.array([[1,3,2],[4,0,1]])
m = np.array([[1,3],[0,1],[5,2]])
np.dot(l,m)

Out[15]:
array([[11, 10],
[ 9, 14]])


He goes on to give one more example with a pair of 2x2 matrices:

In [16]:
n = np.array([[1,3],[2,5]]) # 2x2
o = np.array([[0,1],[3,2]]) # 2x2
np.dot(n,o)

Out[16]:
array([[ 9,  7],
[15, 12]])


Next he provides some context by using the example of home prices. He sets up a 4x2 and 2x3 matrix and multiplies them to quickly come up with price predictions:

In [17]:
p = np.array([[1,2104],[1,1416],[1,1534],[1,852]]) # 4x2
q = np.array([[-40,200,-150],[0.25,0.1,0.4]]) # 2x3
np.dot(p,q)

Out[17]:
array([[ 486. ,  410.4,  691.6],
[ 314. ,  341.6,  416.4],
[ 343.5,  353.4,  463.6],
[ 173. ,  285.2,  190.8]])


Matrix Multiplication Properties

Show that matrix multiplication is not commutative:

In [18]:
A = np.array([[1,1],[0,0]]) # 2x2
B = np.array([[0,0],[2,0]]) # 2x2
np.dot(A,B)

Out[18]:
array([[2, 0],
[0, 0]])

In [19]:
np.dot(B,A)

Out[19]:
array([[0, 0],
[2, 2]])


To test this another way I compared the two products for equality and found a neat element-wise comparison:

In [20]:
np.dot(A, B) == np.dot(B, A)

Out[20]:
array([[False,  True],
[False, False]], dtype=bool)
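For a single yes/no answer instead of an element-wise one, np.array_equal compares the whole arrays at once. A sketch using A and B from above:

```python
import numpy as np

A = np.array([[1, 1], [0, 0]])
B = np.array([[0, 0], [2, 0]])

# Matrix multiplication is not commutative: AB != BA
print(np.array_equal(np.dot(A, B), np.dot(B, A)))  # False
```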


To show that matrix multiplication is associative, create another 2x2 array C and multiply the three matrices in different groupings, keeping the order the same, to show that the result is always the same:

In [21]:
A = np.array([[1,1],[0,0]]) # 2x2
B = np.array([[0,0],[2,0]]) # 2x2
C = np.array([[1,3],[0,2]]) # 2x2
np.dot(A, np.dot(B,C))

Out[21]:
array([[2, 6],
[0, 0]])

In [22]:
np.dot(np.dot(A,B),C)

Out[22]:
array([[2, 6],
[0, 0]])


Identity Matrix

NumPy comes with a built-in function for producing an identity matrix. Just pass it the dimension (number of rows or columns) as the argument. Optionally tell it to output elements as integers in order to clean up the output:

In [23]:
np.identity(3)

Out[23]:
array([[ 1.,  0.,  0.],
[ 0.,  1.,  0.],
[ 0.,  0.,  1.]])

In [24]:
np.identity(3, dtype=int)

Out[24]:
array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])


Show that for any matrix A, AI = IA = A. Create a 3x3 matrix A to test with:

In [25]:
A = np.array([[4,2,1],[4,8,3],[1,1,0]]) # 3x3
I = np.identity(3, dtype=int)
np.dot(A,I)

Out[25]:
array([[4, 2, 1],
[4, 8, 3],
[1, 1, 0]])

In [26]:
A

Out[26]:
array([[4, 2, 1],
[4, 8, 3],
[1, 1, 0]])

In [27]:
np.dot(A,I) == np.dot(I,A)

Out[27]:
array([[ True,  True,  True],
[ True,  True,  True],
[ True,  True,  True]], dtype=bool)

In [28]:
np.dot(A,I) == A

Out[28]:
array([[ True,  True,  True],
[ True,  True,  True],
[ True,  True,  True]], dtype=bool)


Inverse and Transpose

Show that if A is an m×m matrix, and if it has an inverse, then A(A^-1) = (A^-1)A = I. To do this, we can use the same 3x3 matrix A from above:

In [29]:
A = np.array([[4,2,1],[4,8,3],[1,1,0]]) # 3x3
A

Out[29]:
array([[4, 2, 1],
[4, 8, 3],
[1, 1, 0]])

In [30]:
np.linalg.inv(A)

Out[30]:
array([[ 0.3, -0.1,  0.2],
[-0.3,  0.1,  0.8],
[ 0.4,  0.2, -2.4]])


Now show that A(A^-1) = (A^-1)A = I:

In [31]:
np.dot(A, np.linalg.inv(A))

Out[31]:
array([[  1.00000000e+00,  -2.77555756e-17,   0.00000000e+00],
[ -2.22044605e-16,   1.00000000e+00,  -4.44089210e-16],
[ -5.55111512e-17,  -1.38777878e-17,   1.00000000e+00]])

In [32]:
np.dot(np.linalg.inv(A), A)

Out[32]:
array([[  1.00000000e+00,   0.00000000e+00,  -5.55111512e-17],
[ -2.22044605e-16,   1.00000000e+00,  -5.55111512e-17],
[  0.00000000e+00,   0.00000000e+00,   1.00000000e+00]])
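The off-diagonal entries above are floating-point residue rather than exact zeros, so an exact `==` comparison against the identity would fail; np.allclose checks equality within a small tolerance instead. A sketch using A from above:

```python
import numpy as np

A = np.array([[4, 2, 1], [4, 8, 3], [1, 1, 0]])

# allclose treats values within a tiny tolerance as equal,
# absorbing the ~1e-16 rounding noise in the product
print(np.allclose(np.dot(A, np.linalg.inv(A)), np.identity(3)))  # True
```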


Note that this was meant to return the identity matrix, but the floating-point operations returned not exact zeros, only numbers very close to zero. Note also that a matrix of all zeros has no inverse:

In [33]:
C = np.array([[0,0],[0,0]])
C

Out[33]:
array([[0, 0],
[0, 0]])

In [34]:
np.linalg.inv(C)

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-34-a6d66d9f99af> in <module>()
----> 1 np.linalg.inv(C)

/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.pyc in inv(a)
518     signature = 'D->D' if isComplexType(t) else 'd->d'
519     extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
--> 520     ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
521     return wrap(ainv.astype(result_t))
522

/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.pyc in _raise_linalgerror_singular(err, flag)
88
89 def _raise_linalgerror_singular(err, flag):
---> 90     raise LinAlgError("Singular matrix")
91
92 def _raise_linalgerror_nonposdef(err, flag):

LinAlgError: Singular matrix

The error message agrees with the machine learning instructor, who calls these special matrices "singular" or "degenerate."
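A matrix is singular exactly when its determinant is zero, so np.linalg.det offers a way to check before attempting the inverse. A sketch using the zero matrix C and the invertible A from above:

```python
import numpy as np

C = np.array([[0, 0], [0, 0]])
A = np.array([[4, 2, 1], [4, 8, 3], [1, 1, 0]])

# det == 0 means the matrix is singular and has no inverse
print(np.linalg.det(C))  # 0.0
print(np.linalg.det(A))  # approximately -10, so A is invertible
```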

Matrix Transpose

Lastly, show some matrix transposition, in which rows become columns and columns become rows (the element at row i, column j moves to row j, column i):

In [35]:
A = np.array([[1,2,0],[3,5,9]])
A

Out[35]:
array([[1, 2, 0],
[3, 5, 9]])

In [36]:
A.transpose()

Out[36]:
array([[1, 3],
[2, 5],
[0, 9]])


If we name the transposed array 'B', then we can show that the (i,j)th element of A is the (j,i)th element of B:

In [37]:
B = A.transpose()
A[0,2] == B[2,0]

Out[37]:
True

In [38]:
A[1,2] == B[2,1]

Out[38]:
True
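NumPy also exposes the transpose as the `.T` attribute, a common shorthand for `transpose()`. A sketch using A from above:

```python
import numpy as np

A = np.array([[1, 2, 0], [3, 5, 9]])

# .T is equivalent to A.transpose(); both return a transposed
# view of the same data rather than a copy
print(np.array_equal(A.T, A.transpose()))  # True
print(A.T.shape)  # (3, 2)
```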


Conclusion

We used NumPy and its linalg module to go through the linear algebra review slides from Coursera's Machine Learning course.