blackblack
picture picture

Ch 3.1: More Linear Regression

Lecture 5 - CMSE 381
Michigan State University
::
Dept of Computational Mathematics, Science /span> Engineering
Fri, Jan 23, 2026
Announcements

Last time:

Announcements:

Covered in this lecture

Section 1 picture picture

Last time
Setup

Y β0 + β1X
  • ” .... “is approximately modeled as”
  • scatter plot with linear fit and
residuals

    Least squares criterion: RSS

    left: contour plots of residual sum of squares in
beta0-beta1 plane. Right: Three dimensional plot of RSS
versus beta0 and beta1.
    Residual sum of squares RSS is 𝑅𝑆𝑆 = e12 + + e n2 = i(yi β^0 β^1xi)2
    picture
    Least squares criterion
    picture picture
    Find β0 and β1 that minimize the RSS.
    β^1 = i=1n(xi x¯)(yi y¯) i=1n(xi x¯)2 β^0 = y¯ β^1x¯

    Section 2 picture picture

    Assessing Coefficient Estimate Accuracy
    Bias in estimation Analogy with mean

    E(μ^) = E (1 n iXi) = μ
  • Standard variance estimate is biased

    E(σ^2) = E [1 n i(Xi X¯)2] σ2
  • Linear regression is unbiased

    Scatter plot of simulated
data with linear relationship
plus noise with overalid true
line relationship and least
squres linear fit.

    10 linear fits for the same
data with 10 different noise
realizations.

    Variance in estimation Continuing analogy with mean

    Variance of linear regression estimates

    𝑅𝑆𝐸 = 𝑅𝑆𝑆(n 2)
    Coding group work

    Run the section titled “Simulating data”
    Confidence Interval

    The 95% confidence interval for β1 approximately takes the form
    β^1 ± 2 SE(β^1)
    Interpretation:

    There is approximately a 95% chance that the interval

    [β^1 2 SE(β^1),β^1 + 2 SE(β^1)]

    will contain β1 where we repeatedly approximate β^1 using repeated samples.

    CI in Advertising data

    scatter plot with linear fit and residuals
    For the advertising data set, the 95% CIs are:
    Hypothesis testing

    Test statistic and p-value

    Test statistic:
    t = β^1 0 SE(β^1)

    t-distribution with n 2 degrees of freedom student t-distribution
for four different values of nu.

    Advertising example

    linear fit table for the sales versus TV advertising in 1000
dollars data. scatter plot with linear fit and residuals
    Assessing the accuracy of the module: RSE

    Residual standard error (RSE): 𝑅𝑆𝐸 = 1 n 2𝑅𝑆𝑆 = 1 n 2 i(yi ŷi)2
    Assessing the accuracy of the module: R2

    R squared: R2 = 𝑇𝑆𝑆 𝑅𝑆𝑆 𝑇𝑆𝑆 = 1 𝑅𝑆𝑆 𝑇𝑆𝑆

    where total sum of squares is

    𝑇𝑆𝑆 = i(yi y¯)2
    Advertising example

    Sales versus TV advertising scatter plot with linear fit line Sales versus radio advertising scatter
plot with linear fit line Sales versus newspaper advertising scatter plot with linear fit line

    R2 = 0.61 R2 = 0.33 R2 = 0.05
    Coding group work

    Run the section titled “Assessing Coefficient Estimate Accuracy”
    Next time

    Next time:

    Screenshot of the course schedule from
lectures 1 to 10.

    Announcements