18 In-Class Assignment: Least Squares Fit (LSF)#

image showing a 3D vector projected onto a 2D plane

0. Final Exam Review Survey#

The two class periods during the week of 4/24-4/28 will be used to review for the final exam. There will not be any PCAs for next week, and there are no graded ICAs next week either, but instructors will likely have some sort of review activity for you to work on in class. To help instructors decide what material to focus on during class next week, please fill out the following survey by this Friday 4/21 https://forms.office.com/r/XfSSM3mQnt.


1. LSF Pre-class Review#

  • 19–LSF_pre-class-assignment.ipynb


2. Finding the best solution in an overdetermined system#

Let Ax=y be a system of m linear equations in n variables. A least squares solution of Ax=y is an solution x^ in Rn such that:

minx^yAx^.

Note we substitute y for our typical variable b here because we will use b later to represent the intercept to a line and we want to try and avoid confusion in notation. It also consistent with the picture above.

In other words, x^ is a value of x for which Ax is as close as possible to y. From previous lectures, we know this to be true if the vector $yAx^isorthogonal(perpendicular)tothecolumnspaceofA$.

We also know that the dot product is zero if two vectors are orthogonal. So we have
$a(Axy)=0,forallvectorsainthecolumnspacesofA$.

The columns of A span the column space of A. Denote the columns of A as $A=[a1,,an].Thenwehavea1(Axy)=0,a2(Axy)=0an(Axy)=0.ItisthesameastakingthetransposeofAanddoingamatrixmultiply:A(Axy)=0.$

That is:

$AAx=Ay$

The above equation is called the least squares solution to the original equation Ax=y. The matrix AA is symmetric and invertable. Then solving for x^ can be calculated as follows:

x=(AA)1Ay

The matrix (AA)1A is also called the left inverse.

Example: A researcher has conducted experiments of a particular Hormone dosage in a population of rats. The table shows the number of fatalities at each dosage level tested. Determine the least squares line and use it to predict the number of rat fatalities at hormone dosage of 22.

Hormone level

20

25

30

35

40

45

50

Fatalities

101

115

92

64

60

50

49

%matplotlib inline
import matplotlib.pylab as plt
import numpy as np
import sympy as sym
import time
from answercheck import checkanswer
sym.init_printing(use_unicode=True)
H = [20,25,30,35,40,45,50]
f = [101,115, 92,64,60,50,49]

plt.scatter(H,f)
plt.xlabel('Hormone Level')
plt.ylabel('Fatalities')

We want to determine a linear model that is expressed by the following equation

f=a+bH,

to approximate the connection between Hormone dosage (H) and Fatalities f. That is, we want to find a (y-intercept) and b (slope) for this line. First we define the variable x=[ab] as the column vector that needs to be solved. Then, we need to convert the overdetermined system of equations $a+bH0=f0a+bH1=f1a+bH6=f6intomatrixformAx=y$.

DO THIS: Rewrite the system of equations in matrix form Ax=y by defining your numpy matrices A and y using the data from above:

# put your code here
checkanswer.matrix(A,'50278283443c1d502428e602f0d363db');
checkanswer.matrix(y, '37e2ed57a1516fb4a17eb2a3e9e99d2d');

QUESTION: Calculate the square matrix AA (Call it AtA) and the modified right hand side vector as Ay (Call it Aty):

#Put your code here
# AtA=
# Aty=
checkanswer.matrix(AtA,'3f579409e55e9cc409bc727c35e09c74');
checkanswer.matrix(Aty,'1dec77018852144a8b90600161c8bff6');

QUESTION: Find the least squares solution by solving AAx=Ay for x.

# Put your code here
# x=
checkanswer.matrix(x,'3213b988975659ed46496a07542eff33');

QUESTION: Given the solution above, define the two scalars y-intercept a and slope b.

#put your code here
#a=
#b=
checkanswer.float(a,'87e53cb122536f53f434cdbccf0aca94');
checkanswer.float(b,'f7dfa5a5ed0f65f0e0a87f29efd9cd74');

The following code will plot the original data and the linear model whose coefficients we found by performing a least squares fit.

H = [20,25,30,35,40,45,50]
f = [101,115, 92,64,60,50,49]
plt.scatter(H,f)

H2 = np.linspace(np.min(H), np.max(H))

f2 = a + b*H2

plt.plot(H2, f2)

QUESTION: Now, let’s fit a quadratic model instead of a linear model to our data, i.e. $a+bH+cH2=f$

To do this, the vector of unknowns is now x=[abc], and the overdetermined system of equations is $a+bH0+cH02=f0a+bH1+cH12=f1a+bH6+cH62=f6$

Again, put this system of equations into matrix form Ax=y, solve AAx=Ay to find the least squares solution for x, and then extract the coefficients a, b, and c for your model.

# put your code here
checkanswer.float(a,'8acc26f9f9da4edfa44529a811ebb8a9')
checkanswer.float(b,'b1e317387458d5471bd7cec94e4959fa')
checkanswer.float(c,'451fd8dc5a793f8483160363b42fa8cd')

QUESTION: Now, plot the original data along with the curve for the quadratic model f=a+bH+cH2 whose coefficients we found by performing the least squares fit.

# put your code here

QUESTION: Now, see if you can generalize your code above to fit a degree-d polynomial model $x0+x1H+x2H2++xdHd=fforsomecoefficientsx_0,x_1,x_2,\ldots,x_d,where3 \le d \le 6$. Again, you should first put the system of equations into matrix form, then find the least squares solution, and finally make a plot showing the original data and the curve for the polynomial model.

# put your code here

d = 3 # Set d to be 3, 4, 5, or 6

# Form a len(H)x(d+1) matrix A whose entries are A[i,j] = H[i]**j

# Solve Ax = y using least squares

# Plot the original data points along with the curve for the polynomial model

QUESTION: The interactive function below allows you to adjust the degree of the polynomial model as well as the limits of the x-axis in the plot. Play with the interactive function below by adjusting the degree of the least-squares fit approximation and extending the x_min and x_max parameters. Do you think that an eighth-order polynomial is a good model for this dataset? Why or why not?

from ipywidgets import interact, fixed
import ipywidgets as widgets

@interact(x=fixed(H), y=fixed(f), degree=widgets.IntSlider(min=1, max=8, step=1, value=8), x_min=widgets.IntSlider(min=min(H)-10, max=min(H), step=1, value=min(H)), x_max=widgets.IntSlider(min=max(H), max=max(H)+10, step=1, value=max(H)))
def graphPolyN(x, y, x_min, x_max, degree):
    p = np.polyfit(x, y, degree)
    f = np.poly1d(p)
    
    x_pred = np.linspace(x_min, x_max, 1000)
    y_pred = f(x_pred)
    
    plt.scatter(x, y, color="red")
    plt.plot(x_pred, y_pred)
    

Put your answer to the above question here


3. Pseudoinverse#

In this class we often talk about solving problems of the form:

Ax=b

Currently we have determined that this problem becomes very nice when the n×n matrix A has an inverse. We can easily multiply each side by the inverse:

A1Ax=A1b

Since A1A=I the solution for x is simply:

x=A1b

Now, let us consider a a more general problem where the m×n matrix A is not square, i.e. mn and its rank r maybe less than m and/or n. In this case we want to find a Pseudoinverse (which we denote as A+) which acts like an inverse for a non-square matrix. In other words we want to find an A+ for A such that:

A+AI

Assuming we can find the n×m matrix A+, we should then be able to solve for x as follows:

Ax=b
A+Ax=A+b
xA+b

How do we know there is a Psudoinverse#

Assuming the general case of a m×n matrix A where its rank r maybe less than m and/or n (rm and rn). We can conclude the following about the fundamental spaces of A:

  • The rowspace of A is in Rn with dimension r

  • The columnspace of A is in Rm also with dimension r.

  • The nullspace of A is in Rn with dimension nr

  • The nullspace of AT is in Rm with dimension mr.

Because the rowspace of A and the column space A have the same dimension then A is a the one-to-one mapping from the row space to the columnspace. In other words:

  • For any x in the rowspace, we have that Ax is one point in the columnspace. If x is another point in the row space different from x, we have AxAx (The mapping is one-to-one).

  • For any y in the columnspace, we can find x in the rowspace such that Ax=y (The mapping is onto).

The above is not really a proof but hopefully there is sufficient information to convince yourself that this is true.

How to compute pseudoinverse#

We want to find the n×m matrix that maps from columnspace to the rowspace of A, and x=A+Ax, if x is in the rowspace.

  • Let’s apply SVD on A: $A=UΣV,whereUisam\times mmatrix,V^\topisan\times nmatrix,and\Sigmaisadiagonalm\times nmatrix.WecandecomposethematricesasA=[U1U2][Σ1000][V1V2].HereU_1isofm\times r,U_2isofm\times (m-r),\Sigma_1isofr\times r,V_1^\topisofr\times n,andV_2^\topisof(n-r)\times n$.

    • The columnspace of U1 is the columnspace of A, and columnspace of U2 is the nullspace of A.

    • The rowspace of V1 is the rowspace of A, and rowspae of V2 is the nullspace of A.

  • If x is in the rowspace of A, we have that V2x=0. We have Ax=U1Σ1V1x.

    • If we define a matrix B=V1Σ11U1, we have that BAx=V1Σ11U1U1Σ1V1x=V1V1x. That is BAx=x is x is in the rowspace of A.

  • The matrix B is the pseudoinverse of matrix A. $A+=V1Σ11U1A+=[V1V2][Σ11000][U1U2].$

Example 1: Let $A=[101123]weknowthatr=m=2andn=3$.

A = np.matrix([[1,0,1],[1,2,3]])

TODO: Calculate the pseudoinverse A+ of A using the numpy.linalg function pinv:

# put your code here
# Apinv=
checkanswer.matrix(Apinv,'7724dc0d898980cb5a80dac73bfa7aa7');

DO THIS: Compute AA+ and A+A

#put your code here

QUESTION: If x is in the nullspace of A what is the effect of A+Ax?

Put your answer to the above question here

QUESTION: If x is in the rowspace of A what is the effect of A+Ax?

Put your answer to the above question here

Left inverse is pseudoinverse#

We can compute the left inverse of A if r=nm. In this case, we may have more rows than columns, and the matrix A has full column rank.

In this case, the SVD of A is $A=UΣV.HereUisofm\times n,\Sigmaisofn\times nandnonsingular,V^\topisofn\times n.ThepseudoinverseofAisA+=VΣ1U$

The left inverse of A is $(AA)1A=(VΣUUΣV)1VΣU=V(ΣΣ)1VVΣU=VΣ1U=A+$

Example 2: Let $A=[110213]weknowthatr=n=2andm=3$.

A = np.matrix([[1,1],[0,2],[1,3]])
A

DO THIS: Calculate the pseudoinverse A+ of A.

#Put your answer here
# pInv_A = 
checkanswer.matrix(pInv_A,'fda9d63c9e6cdc7bdb2acc57c4bdf490');

DO THIS: Calculate the left inverse of A, and verify that it is the same as A+.

#Put your anaswer here
# leftInv_A = 
checkanswer.matrix(leftInv_A,'fda9d63c9e6cdc7bdb2acc57c4bdf490');

Written by Dr. Dirk Colbry with interactive code David Yonkers, Michigan State University. Some modifications by Dr. Santhosh Karnik, Michigan State University Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.