18 Pre-Class Assignment: Least Squares Fit (Regression)#

Goals for today’s pre-class assignment#

  1. Least Squares Fit

  2. Linear Regression

  3. One-to-one and Inverse transform

  4. Inverse of a Matrix


1. Least Squares Fit#

Review Chapter 13 (pp. 225-239) of the Boyd textbook.

In the first part of this course, we try to solve the system of linear equations \(Ax=b\), where \(A\) is an \(m\times n\) matrix and \(b\) is a column vector.

There are three possible outcomes: a unique solution, no solution, or infinitely many solutions. (Review the earlier material if you are not familiar with when each of these outcomes happens.)

When \(m<n\), we call the matrix \(A\) underdetermined, because the system cannot have a unique solution. When \(m>n\), we call the matrix \(A\) overdetermined, because the system is likely to have no solution.

However, even when there is no solution or there are infinitely many solutions, we may still need to find a best \(x\). For this we use a technique called least squares fit (LSF). Least squares fit finds the \(x\) for which \(\|Ax-b\|\) is smallest (i.e., we try to minimize the estimation error).

  • When there is no solution, we want to find \(x\) such that \(Ax-b\) is small (that is, we want \(\|Ax-b\|\) to be small).

  • If the null space of \(A\) is just \(\{0\}\), we can find a unique \(x\) that attains the smallest \(\|Ax-b\|\).

    • If there is a unique solution \(x^*\) for \(Ax=b\), then \(x^*\) is the optimal \(x\) to obtain the smallest \(\|Ax-b\|\), which is 0.

    • Because the null space of \(A\) is just \(\{0\}\), you cannot have infinitely many solutions for \(Ax=b\).

  • If the null space of \(A\) is not just \(\{0\}\), we know that we can always add a nonzero vector \(x_0\) from the null space of \(A\) to a best \(x^*\), and \(\|A(x^*+x_0)-b\|=\|Ax^*-b\|\). Therefore, when there are multiple best solutions, we choose the \(x\) that lies in the row space of \(A\), which is unique. (A numeric sketch follows this list.)
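
The sketch below (a minimal example using NumPy, with a made-up overdetermined system) shows how such a best \(x\) is found in practice; np.linalg.lstsq minimizes \(\|Ax-b\|\), and for a rank-deficient \(A\) it returns the minimum-norm solution, which lies in the row space of \(A\).

import numpy as np

# A made-up overdetermined system: 3 equations, 2 unknowns (no exact solution)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.5])

# lstsq returns the x minimizing ||Ax - b||; for a rank-deficient A it
# returns the minimum-norm solution, which lies in the row space of A
x, residual, rank, sv = np.linalg.lstsq(A, b, rcond=None)

print("best x:", x)
print("||Ax - b||:", np.linalg.norm(A @ x - b))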

QUESTION 1: Let \(A=\begin{bmatrix}1\\2\end{bmatrix}\) and \(b=\begin{bmatrix}1.5\\2\end{bmatrix}\). Find the best \(x\) such that \(\|Ax-b\|\) has the smallest value.

Put your answer to the above question here.

QUESTION 2: Compute \((A^\top A)^{-1}A^\top b\).

Put your answer to the above question here.


2. Linear Regression#

Watch the following video on using least squares to do linear regression.

# Embed the lecture video on least squares and linear regression
from IPython.display import YouTubeVideo
YouTubeVideo("Lx6CfgKVIuE", width=640, height=360, cc_load_policy=True)
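
As a complement to the video, here is a minimal sketch of linear regression by least squares, with made-up data points and the line modeled as \(y = c_0 + c_1 x\).

import numpy as np

# Made-up data points (illustrative only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# Design matrix for the line y = c0 + c1*x: a column of ones and a column of x
A = np.column_stack([np.ones_like(x), x])

# Least squares fit: find c minimizing ||Ac - y||
c, *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"intercept c0 = {c[0]:.3f}, slope c1 = {c[1]:.3f}")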

QUESTION 3: How can you tell whether a fit is good or bad?

Put your answer to the above question here.


3. One-to-one and Inverse transform#

Read Section 4.9 of the textbook if you are not familiar with this part.

Definition: A transformation \(T:U\mapsto V\) is said to be one-to-one if each element in the range is the image of just one element in the domain. That is, for two elements \(x\) and \(y\) in \(U\), \(T(x)=T(y)\) happens only when \(x=y\).

Theorem: Let \(T:U\mapsto V\) be a one-to-one linear transformation. If \(\{u_1,\dots,u_n\}\) is linearly independent in \(U\), then \(\{T(u_1),\dots,T(u_n)\}\) is linearly independent in \(V\).
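
A small numeric illustration of this theorem (the matrix and vectors below are made up; a full-column-rank matrix stands in for a one-to-one linear transformation).

import numpy as np

# A 3x2 matrix with full column rank represents a one-to-one map from R^2 to R^3
T = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Two linearly independent vectors in R^2
u1 = np.array([1.0, 0.0])
u2 = np.array([1.0, 1.0])

# Stack the images T(u1), T(u2) as columns and check independence via the rank
images = np.column_stack([T @ u1, T @ u2])
print(np.linalg.matrix_rank(images))  # 2, so the images are linearly independent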

Definition: A linear transformation \(T:U\mapsto V\) is said to be invertible if there exists a transformation \(S:V\mapsto U\) such that \(S(T(u))=u\) and \(T(S(v))=v\) for any \(u\) in \(U\) and any \(v\) in \(V\).
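
A quick numeric check of this definition, assuming \(T\) is represented by an invertible matrix (made up below) and \(S\) by its matrix inverse.

import numpy as np

# Represent T: R^2 -> R^2 by an invertible matrix, and S by its inverse
T = np.array([[2.0, 1.0],
              [1.0, 1.0]])
S = np.linalg.inv(T)

u = np.array([3.0, -1.0])
v = np.array([0.5, 2.0])

# Verify S(T(u)) = u and T(S(v)) = v, as in the definition above
print(np.allclose(S @ (T @ u), u))  # True
print(np.allclose(T @ (S @ v), v))  # True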

QUESTION 4: If linear transformation \(T:U\mapsto V\) is invertible, and the dimension of \(U\) is 2, what is the dimension of \(V\)? Why?

Put your answer to the above question here.


4. Inverse of a Matrix#

  • Recall the four fundamental subspaces of an \(m\times n\) matrix \(A\):

    • The rowspace and nullspace of \(A\) in \(R^n\)

    • The columnspace and the nullspace of \(A^\top\) in \(R^m\)

  • The two-sided inverse gives us the following: \(AA^{-1}=I=A^{-1}A\)

  • For this we need \(r = m = n\), where \(r\) is the rank of the matrix.

  • For a left-inverse, we have the following

    • Full column rank, with \(r = n \leq m\) (but possibly more rows)

    • The nullspace contains just the zero vector (columns are independent)

    • The rows might not all be independent

    • We thus have either no solution or exactly one solution to \(Ax=b\).

    • \(A^\top \) will now also have full row rank

    • From \((A^\top A)^{-1}A^\top A = I\) follows the fact that \((A^\top A)^{-1}A^\top\) is a left-sided inverse

    • Note that \((A^\top A)^{-1}A^\top\) is an \(n\times m\) matrix and \(A\) is of size \(m\times n\); their product \((A^\top A)^{-1}A^\top A\) is the \(n\times n\) identity matrix.

    • The matrix \(A(A^\top A)^{-1}A^\top\) is an \(m\times m\) matrix, BUT \(A(A^\top A)^{-1}A^\top\neq I\) if \(m\neq n\). The matrix \(A(A^\top A)^{-1}A^\top\) is the projection matrix onto the column space of \(A\). (See the sketch after this list.)
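
A minimal NumPy sketch of the facts in this list, using a made-up \(3\times 2\) matrix \(A\) with full column rank.

import numpy as np

# A tall matrix with full column rank (r = n = 2 < m = 3)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])

left_inv = np.linalg.inv(A.T @ A) @ A.T  # (A^T A)^{-1} A^T, an n x m matrix

# Left inverse: (A^T A)^{-1} A^T A = I (n x n)
print(np.allclose(left_inv @ A, np.eye(2)))  # True

# A (A^T A)^{-1} A^T is m x m but is NOT the identity when m != n;
# it is the projection onto the column space of A (idempotent: P @ P = P)
P = A @ left_inv
print(np.allclose(P, np.eye(3)))  # False
print(np.allclose(P @ P, P))      # True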

QUESTION 5: What is the projection matrix that projects any vector onto the subspace spanned by \([1,2]^\top\)? (That is, what matrix gives the same result as projecting any point onto the vector \([1,2]^\top\)?)

Put your answer to the above question here.

QUESTION 6: If \(m=n\), is the left inverse the same as the inverse?

Put your answer to the above question here.

Theorem: For a matrix \(A\) with \(r=n<m\), the column space of \(A\) has dimension \(r(=n)\), and the linear transform \(A: R^n\mapsto R^m\) is one-to-one. In addition, the linear transformation \(A\) from \(R^n\) to the column space of \(A\) is one-to-one and onto (meaning that for any element in the column space of \(A\), we can find an \(x\) in \(R^n\) such that it equals \(Ax\)). Then the left inverse of \(A\) is a one-to-one mapping from the column space of \(A\) to \(R^n\), and it can be considered an inverse transform of \(A\).
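
To illustrate the theorem, here is a small sketch (reusing the made-up \(A\) from the sketch above) showing that the left inverse recovers \(x\) from its image \(Ax\) in the column space.

import numpy as np

# The same tall, full-column-rank A as above (r = n = 2 < m = 3)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
left_inv = np.linalg.inv(A.T @ A) @ A.T

# For any x in R^n, the left inverse undoes A: left_inv(A x) = x
x = np.array([2.0, -3.0])
print(np.allclose(left_inv @ (A @ x), x))  # True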


Written by Dr. Dirk Colbry, Michigan State University
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



Congratulations, we’re done!#