
Ch 6.3: PCR

Lecture 20 - CMSE 381
Michigan State University
Dept of Computational Mathematics, Science & Engineering
Wed, Mar 11, 2026
Announcements

Last time:

This lecture:

Announcements:

Course schedule for weeks 11-20, listing topics like Logistic Regression, PCA, and key dates including Midterm and Spring Break.

Section 1

Previously…
Shrinkage

Find β to minimize

RSS = \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2

subject to:

Least Squares:
No constraints

Ridge:
\sum_{j=1}^{p} \beta_j^2 \le s

The Lasso:
\sum_{j=1}^{p} |\beta_j| \le s
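In practice these constrained problems are usually fit in their penalized (Lagrangian) form, where a larger penalty α plays the role of a smaller budget s. A minimal scikit-learn sketch on made-up, standardized toy data (not the course's data):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                                   # toy standardized predictors
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(size=100)

# Larger alpha corresponds to a smaller budget s in the constraint form.
ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print("ridge coefficients:", ridge.coef_)   # shrunk toward zero, none exactly zero
print("lasso coefficients:", lasso.coef_)   # some coefficients set exactly to zero
```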
Two plots showing standardized ridge regression coefficients shrinking toward zero as λ increases (left) or as the norm ratio decreases (right).

Two plots showing standardized Lasso regression coefficients shrinking to exactly zero as λ increases (left) or as the norm ratio decreases (right).

Linear transformation of predictors

Original Predictors:
X_1, \cdots, X_p

New Predictors:
Z_1, \cdots, Z_M

Z_m = \sum_{j=1}^{p} \phi_{jm} X_j
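Since each Z_m is just a linear combination of the columns of X, the whole transformation is one matrix product. A small numpy sketch, assuming the loadings φ_{jm} are stored in a hypothetical p × M matrix Phi (toy data, not the course's):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))      # n = 50 observations of p = 4 predictors (toy data)
Phi = rng.normal(size=(4, 2))     # hypothetical loadings phi_{jm}, shape p x M with M = 2

Xc = X - X.mean(axis=0)           # center each predictor first
Z = Xc @ Phi                      # Z[i, m] = sum_j phi_{jm} * Xc[i, j]
print(Z.shape)                    # (50, 2): the M new predictors Z_1, ..., Z_M
```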
The goal:
PCA - First PC

Scatter plots of 2D data with projection lines at 45, 90, 135, and 180 degrees, each paired with a histogram of the data projected onto that line.
Projection onto first PC

Scatter plot of Ad Spending versus Population with a first principal component line and
dashed lines showing data projections.

Z_1 = 0.839 \cdot (\mathtt{pop} - \overline{\mathtt{pop}}) + 0.544 \cdot (\mathtt{ad} - \overline{\mathtt{ad}})
Drawing points in PC space

Two plots illustrating PCA: data projected onto the first principal component line (left) and the
resulting PC scores (right).
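A hedged sketch of how one might reproduce this kind of picture with scikit-learn: PCA on a two-column data matrix returns the loading vector (the analogue of the (0.839, 0.544) direction above, possibly with the sign flipped), and fit_transform returns the PC scores plotted on the right. The pop/ad values below are made up, not the Advertising data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# Toy stand-in for the (population, ad spending) data.
pop = rng.normal(40, 10, size=100)
ad = 0.6 * pop + rng.normal(0, 4, size=100)
X = np.column_stack([pop, ad])

pca = PCA(n_components=1)
scores = pca.fit_transform(X)     # Z_1 for each observation (the PC scores)
phi = pca.components_[0]          # loading vector (phi_11, phi_21); sign is arbitrary

print("loadings:", phi)                       # direction of maximal variance
print("first few scores:", scores[:5].ravel())
```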

What will you learn from this lecture?

Section 2

Principal Components Regression
So you've found your PCA coefficients

Now what?
What are we assuming?
Interpretation of PCR coefficients

Original Predictors:
X_1, \cdots, X_p

New Predictors:
Z_1, \cdots, Z_M

Z_m = \sum_{j=1}^{p} \phi_{jm} X_j

Learned model:

y = \theta_0 + \theta_1 Z_1 + \cdots + \theta_M Z_M
Picking M

Two plots showing standardized PCR coefficients (left) and
cross-validation MSE (right) versus the number of principal
components.
Do PCR with the Hitters data
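A minimal PCR sketch in scikit-learn: standardize, compute principal components, regress on the first M of them, and pick M by cross-validated MSE as in the plot above. The X and y below are synthetic stand-ins; in the lab, substitute the prepared Hitters design matrix and Salary response:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data; replace with the cleaned Hitters predictors and Salary.
rng = np.random.default_rng(381)
X = rng.normal(size=(200, 19))
y = X[:, :3] @ np.array([4.0, -2.0, 1.0]) + rng.normal(size=200)

pcr = Pipeline([
    ("scale", StandardScaler()),   # standardize predictors before PCA
    ("pca", PCA()),                # build components Z_1, ..., Z_M
    ("ols", LinearRegression()),   # least squares on the first M components
])

# Pick M by 10-fold cross-validated MSE.
grid = GridSearchCV(
    pcr,
    param_grid={"pca__n_components": range(1, X.shape[1] + 1)},
    scoring="neg_mean_squared_error",
    cv=10,
)
grid.fit(X, y)
print("chosen M:", grid.best_params_["pca__n_components"])
print("CV MSE:", -grid.best_score_)
```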

Bias-Variance Trade-off

Example with simulated data: n = 50 observations of p = 45 predictors.

Y is a function of all predictors: plot of training and test MSE versus number of components, illustrating the bias-variance trade-off.

Y is a function of 2 predictors: plot of squared bias, variance, and test MSE versus number of components, demonstrating the bias-variance decomposition.
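A rough way to reproduce the first scenario (not the book's exact simulation): generate n = 50 training observations of p = 45 predictors where Y depends on all of them, then track test MSE as the number of components M grows.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
n, p = 50, 45
beta = rng.normal(size=p)                       # Y depends on all predictors

X_train, X_test = rng.normal(size=(n, p)), rng.normal(size=(1000, p))
y_train = X_train @ beta + rng.normal(size=n)
y_test = X_test @ beta + rng.normal(size=1000)

for M in (1, 5, 15, 30, 45):
    pcr = Pipeline([("pca", PCA(n_components=M)), ("ols", LinearRegression())])
    pcr.fit(X_train, y_train)
    mse = np.mean((pcr.predict(X_test) - y_test) ** 2)
    print(f"M = {M:2d}: test MSE = {mse:.2f}")   # bias falls, variance rises with M
```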
Comparison to results on shrinkage

Y is a function of all predictors: plot of training and test MSE versus number of components (PCR), alongside two plots showing MSE, squared bias, and variance versus λ (left) and training R² (right), illustrating the bias-variance trade-off.

Y is a function of 2 predictors: plot of squared bias, variance, and test MSE versus number of components (PCR), alongside two plots showing MSE, squared bias, and variance for the Lasso versus λ (left) and training R² (right).
Properties of PCR

TL;DR

PCR

Scatter plot of Ad Spending versus Population
with a first principal component line and
dashed lines showing data projections.

Next time

Course schedule for weeks 11-20, listing topics like Logistic Regression, PCA, and key dates including
Midterm and Spring Break.