Ch 3.1: More Linear Regression

Lecture 5 - CMSE 381

Michigan State University
Dept of Computational Mathematics, Science & Engineering

Fri, Jan 23, 2026

Announcements

Last time:

Started 3.1 - Simple linear regression (least squares)

Announcements:

Homework #1 Due Sun, Jan 25
Homework #2 Due Sun, Feb 8

Covered in this lecture

Confidence interval, hypothesis test, and p-value for coefficient estimates
Residual standard error (RSE)
R squared

Setup

Predict $Y$ on a single predictor variable $X$

Y \approx β_{0} + β_{1} X

Here "$\approx$" reads "is approximately modeled as".

Given $(x_{1}, y_{1}), \dots, (x_{n}, y_{n})$
Let $ŷ_{i} = {\hat{β}}_{0} + {\hat{β}}_{1} x_{i}$ be prediction for $Y$ on $i$ th value of $X$ .
$e_{i} = y_{i} - ŷ_{i}$ is the $i$ th residual

[Figure: Scatter plot with linear fit and residuals.]

Least squares criterion: RSS

[Figure: Left: contour plot of the residual sum of squares in the $β_{0}$-$β_{1}$ plane. Right: three-dimensional plot of RSS versus $β_{0}$ and $β_{1}$.]

Residual sum of squares RSS is

\begin{aligned} 𝑅𝑆𝑆 & = e_{1}^{2} + \dots + e_{n}^{2} \\ & = \sum_{i = 1}^{n} {(y_{i} - {\hat{β}}_{0} - {\hat{β}}_{1} x_{i})}^{2} \end{aligned}

Least squares criterion

Find $β_{0}$ and $β_{1}$ that minimize the RSS.

\begin{aligned} {\hat{β}}_{1} & = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}} \\ {\hat{β}}_{0} & = \bar{y} - {\hat{β}}_{1} \bar{x} \end{aligned}
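The closed-form estimates above can be sketched in a few lines of plain Python. The five data points are made up for illustration (they are not the Advertising data), and the helper names `least_squares` and `rss` are hypothetical.

```python
# Toy data, made up for illustration (not the Advertising data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def least_squares(x, y):
    """Return (beta0_hat, beta1_hat) from the closed-form formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


def rss(x, y, beta0, beta1):
    """Residual sum of squares for the line beta0 + beta1 * x."""
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))


beta0_hat, beta1_hat = least_squares(x, y)
print(f"beta0_hat = {beta0_hat:.4f}, beta1_hat = {beta1_hat:.4f}")
print(f"RSS at the estimates = {rss(x, y, beta0_hat, beta1_hat):.4f}")
# Moving either coefficient away from its estimate should increase the RSS.
print(f"RSS with slope + 0.1 = {rss(x, y, beta0_hat, beta1_hat + 0.1):.4f}")
print(f"RSS with intercept + 0.1 = {rss(x, y, beta0_hat + 0.1, beta1_hat):.4f}")
```

Comparing the last three printed values gives a quick check that the estimates sit at the bottom of the RSS surface.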

Section 2

Assessing Coefficient Estimate Accuracy

Bias in estimation: Analogy with mean

Assume a true value $μ^{*}$
An estimate from training data $\hat{μ}$
The estimate is unbiased if $E (\hat{μ}) = μ^{*}$

Sample mean is unbiased for population mean:

E (\hat{μ}) = E (\frac{1}{n} \sum_{i} X_{i}) = μ^{*}

The variance estimate that divides by $n$ is biased:

E ({\hat{σ}}^{2}) = E [\frac{1}{n} \sum_{i} {(X_{i} - \bar{X})}^{2}] = \frac{n - 1}{n} σ^{2} \neq σ^{2}
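One way to see the bias exactly, without simulation noise, is to average the estimator over every possible sample from a small population. The sketch below assumes a hypothetical population (one roll of a fair six-sided die) and samples of size 3 drawn independently; the function name `var_divide_by` is made up for this example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: one roll of a fair six-sided die.
values = [Fraction(v) for v in range(1, 7)]
mu = sum(values) / len(values)
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)  # true variance

n = 3  # sample size


def var_divide_by(sample, denom):
    """Sum of squared deviations from the sample mean, divided by denom."""
    x_bar = sum(sample) / len(sample)
    return sum((xi - x_bar) ** 2 for xi in sample) / denom


# All 6**n samples are equally likely, so averaging over them
# gives the exact expectation of each estimator.
samples = list(product(values, repeat=n))
expected_biased = sum(var_divide_by(s, n) for s in samples) / len(samples)
expected_unbiased = sum(var_divide_by(s, n - 1) for s in samples) / len(samples)

print("true variance        :", sigma2)
print("E[1/n estimator]     :", expected_biased)
print("E[1/(n-1) estimator] :", expected_unbiased)
```

The expectation of the divide-by-$n$ estimator comes out smaller than the true variance by the factor $(n - 1)/n$, while dividing by $n - 1$ recovers it.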

Linear regression is unbiased

[Figure: Scatter plot of simulated data with a linear relationship plus noise, with the true line and the least squares fit overlaid.]

[Figure: 10 linear fits for the same data with 10 different noise realizations.]
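The repeated-fit picture can be imitated numerically. This is a minimal sketch under assumed settings: the true line ($β_{0} = 2$, $β_{1} = 3$), the noise level, the design points, and the number of repetitions are all chosen here for illustration and do not come from the lecture notebook.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed "true" line and noise level (illustrative values).
BETA0_TRUE, BETA1_TRUE, SIGMA = 2.0, 3.0, 1.0
x = [float(i) for i in range(20)]


def fit(x, y):
    """Least squares estimates (beta0_hat, beta1_hat) for one data set."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta1_hat = s_xy / s_xx
    return y_bar - beta1_hat * x_bar, beta1_hat


intercepts, slopes = [], []
for _ in range(1000):
    # Same x values every time, fresh noise realization each time.
    y = [BETA0_TRUE + BETA1_TRUE * xi + random.gauss(0.0, SIGMA) for xi in x]
    b0, b1 = fit(x, y)
    intercepts.append(b0)
    slopes.append(b1)

print(f"mean of intercept estimates: {statistics.mean(intercepts):.3f}")
print(f"mean of slope estimates:     {statistics.mean(slopes):.3f}")
```

Individual fits scatter around the true line, but their average should land close to the true coefficients, which is what unbiasedness means here.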

Variance in estimation: Continuing analogy with mean

True value $μ^{*}$
Estimate from training data $\hat{μ}$
Variance of sample mean $Var (\hat{μ}) = SE {(\hat{μ})}^{2} = \frac{σ^{2}}{n}$

Variance of linear regression estimates

Variance of linear regression estimates:
$\begin{aligned} SE {({\hat{β}}_{0})}^{2} & = σ^{2} [\frac{1}{n} + \frac{{\bar{x}}^{2}}{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}}] \\ SE {({\hat{β}}_{1})}^{2} & = \frac{σ^{2}}{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}} \end{aligned}$
where $σ^{2} = Var (𝜀)$
Residual standard error is an estimate of $σ$

𝑅𝑆𝐸 = \sqrt{𝑅𝑆𝑆 ∕ (n - 2)}
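As a rough sketch of how these formulas fit together, the code below computes the RSE and plugs it in for $σ$ in the two standard error formulas. The data are made-up toy values (not the Advertising data), and all variable names are chosen for this example.

```python
import math

# Toy data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Least squares estimates.
beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Residual sum of squares and residual standard error (estimate of sigma).
rss = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
rse = math.sqrt(rss / (n - 2))

# Standard errors, with the unknown sigma replaced by the RSE.
se_beta0 = rse * math.sqrt(1 / n + x_bar ** 2 / s_xx)
se_beta1 = rse / math.sqrt(s_xx)

print(f"RSE       = {rse:.4f}")
print(f"SE(beta0) = {se_beta0:.4f}")
print(f"SE(beta1) = {se_beta1:.4f}")
```

Note that $SE({\hat{β}}_{1})$ shrinks as the $x_{i}$ spread out, since the denominator $\sum_{i} {(x_{i} - \bar{x})}^{2}$ grows.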

Coding group work

Run the section titled “Simulating data”

Confidence Interval

The 95% confidence interval for $β_{1}$ approximately takes the form

{\hat{β}}_{1} \pm 2 \cdot SE ({\hat{β}}_{1})

Interpretation:

If we repeatedly draw samples and compute ${\hat{β}}_{1}$ and its interval from each one, approximately 95% of the intervals

[{\hat{β}}_{1} - 2 \cdot SE ({\hat{β}}_{1}), {\hat{β}}_{1} + 2 \cdot SE ({\hat{β}}_{1})]

will contain the true value of $β_{1}$.
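The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under assumed settings (true line, noise level, design points, and number of repetitions are all illustrative choices): each repetition draws a new data set, builds the interval ${\hat{β}}_{1} \pm 2 \cdot SE ({\hat{β}}_{1})$ with $σ$ estimated by the RSE, and records whether the true slope falls inside.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

# Assumed "true" line and noise level (illustrative values).
BETA0_TRUE, BETA1_TRUE, SIGMA = 2.0, 3.0, 1.0
x = [float(i) for i in range(20)]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)


def slope_interval(y):
    """Approximate 95% interval for beta1: estimate +/- 2 standard errors."""
    y_bar = sum(y) / n
    beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    rss = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
    rse = math.sqrt(rss / (n - 2))
    se_beta1 = rse / math.sqrt(s_xx)
    return beta1_hat - 2 * se_beta1, beta1_hat + 2 * se_beta1


n_repeats = 2000
hits = 0
for _ in range(n_repeats):
    y = [BETA0_TRUE + BETA1_TRUE * xi + random.gauss(0.0, SIGMA) for xi in x]
    lower, upper = slope_interval(y)
    if lower <= BETA1_TRUE <= upper:
        hits += 1

coverage = hits / n_repeats
print(f"fraction of intervals containing the true slope: {coverage:.3f}")
```

With only 20 points the multiplier 2 is slightly smaller than the exact t quantile (about 2.10 for 18 degrees of freedom), so the observed coverage may come out a little under 95%.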

CI in Advertising data

[Figure: Scatter plot with linear fit and residuals.]

For the advertising data set, the 95% CIs are:

$β_{1}$: $[0.042, 0.053]$
$β_{0}$: $[6.130, 7.935]$

Hypothesis testing

Null hypothesis $H_{0}$: There is no relationship between $X$ and $Y$ ($β_{1} = 0$)
Alternative hypothesis $H_{a}$: There is some relationship between $X$ and $Y$ ($β_{1} \neq 0$)

Test statistic and p-value

Test statistic:

t = \frac{{\hat{β}}_{1} - 0}{SE ({\hat{β}}_{1})}

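If there is no relationship between $X$ and $Y$ (that is, $β_{1} = 0$), $t$ follows a t-distribution with $n - 2$ degrees of freedom. The p-value is the probability, under that assumption, of observing a value at least as extreme as $|t|$.

[Figure: Student t-distribution for four different values of $ν$.]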
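A minimal sketch of the test on made-up toy data follows. To stay within the standard library, the two-sided p-value is obtained by numerically integrating the t density with Simpson's rule; this is a stand-in chosen for the example, and in practice a library routine such as `scipy.stats.t.sf` would typically be used instead. The function names `t_pdf` and `two_sided_p_value` are hypothetical.

```python
import math

# Toy data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
rss = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
se_beta1 = math.sqrt(rss / (n - 2)) / math.sqrt(s_xx)

# Test statistic for H0: beta1 = 0.
t_stat = (beta1_hat - 0) / se_beta1


def t_pdf(t, nu):
    """Density of the Student t-distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)


def two_sided_p_value(t_value, nu, steps=20000):
    """P(|T| >= |t_value|) for T ~ t_nu, using Simpson's rule (steps is even)."""
    a = abs(t_value)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    central_mass = 2 * (h / 3) * total  # density is symmetric about 0
    return max(0.0, 1.0 - central_mass)


p_value = two_sided_p_value(t_stat, n - 2)
print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
print(f"two-sided p-value = {p_value:.2e}")
```

A small p-value means a slope this far from zero would be unlikely if $β_{1}$ were really 0, so the null hypothesis is rejected.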

Assessing the accuracy of the model: RSE

Residual standard error (RSE):

\begin{aligned} 𝑅𝑆𝐸 & = \sqrt{\frac{1}{n - 2} 𝑅𝑆𝑆} \\ & = \sqrt{\frac{1}{n - 2} \sum_{i} {(y_{i} - ŷ_{i})}^{2}} \end{aligned}

Assessing the accuracy of the model: $R^{2}$

R squared:

R^{2} = \frac{𝑇𝑆𝑆 - 𝑅𝑆𝑆}{𝑇𝑆𝑆} = 1 - \frac{𝑅𝑆𝑆}{𝑇𝑆𝑆}

where total sum of squares is

𝑇𝑆𝑆 = \sum_{i} {(y_{i} - \bar{y})}^{2}
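The two sums of squares and $R^{2}$ can be computed directly from a fitted line. The sketch below uses made-up toy data (not the Advertising data), so the resulting value says nothing about the course data set.

```python
# Toy data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
y_hat = [beta0_hat + beta1_hat * xi for xi in x]

# RSS: variation left over after the fit. TSS: total variation in y.
rss = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))
tss = sum((yi - y_bar) ** 2 for yi in y)

r_squared = 1 - rss / tss
print(f"RSS = {rss:.4f}, TSS = {tss:.4f}, R^2 = {r_squared:.4f}")
```

Because the toy points lie almost exactly on a line, RSS is tiny relative to TSS and $R^{2}$ comes out close to 1.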

Advertising example

[Figure: Sales versus TV advertising, scatter plot with linear fit line. $R^{2} = 0.61$]
[Figure: Sales versus radio advertising, scatter plot with linear fit line. $R^{2} = 0.33$]
[Figure: Sales versus newspaper advertising, scatter plot with linear fit line. $R^{2} = 0.05$]

Coding group work

Run the section titled “Assessing Coefficient Estimate Accuracy”

Next time

Next time:

Multiple linear regression.

[Figure: Screenshot of the course schedule from lectures 1 to 10.]
