
Ch 6.1: Subset Selection

Lecture 16 - CMSE 381
Michigan State University
Dept of Computational Mathematics, Science & Engineering
Mon, Feb 23, 2026
Announcements

Last time

Covered in this lecture

Announcements:

Screenshot of the course schedule for lectures 11 to 20.
What should you learn from this lecture?

Section 1

Previously on linear regression ...
The problem of many features (p) relative to samples (n)

Up to now, we’ve focused on the standard linear model Y = β0 + β1X1 + ⋯ + βpXp + ε and done least squares estimation.
Prediction accuracy

Model Interpretability

Section 2

Best Subset Selection
Go through each combination of variables exhaustively (exhausting?)

All subsets of 4 variables (2⁴ = 16)
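As a quick sanity check on the count, a few lines of Python with itertools enumerate all 2⁴ = 16 subsets (the variable names X1–X4 are just the labels from our example):

```python
from itertools import combinations

# Enumerate every subset of p = 4 predictors, including the empty model M_0.
variables = ["X1", "X2", "X3", "X4"]
subsets = [combo
           for k in range(len(variables) + 1)
           for combo in combinations(variables, k)]

print(len(subsets))  # 2**4 = 16
```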

One way of breaking this up

Algorithm diagram showing one way to break up the subset
selection procedure into steps for choosing models based on
training and testing scores.
Calculate by hand

We train a model using four variables, X1, X2, X3, X4. We’re interested in selecting a subset of the variables to use. The following table shows the mean squared error (MSE) computed for the model learned using each possible subset of variables.

Table showing mean squared error values
for models fit using all possible subsets of
four variables in a subset selection
example.


What subset of variables is found for each of the sets M0, M1, M2, M3, M4 when using best subset selection?

picture

What subset of variables is returned using best subset selection?

Extra work space if it helps

Table showing mean squared error values for models fit using all possible subsets of four variables in a subset selection example.
Code to do this
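A minimal sketch of best subset selection, assuming scikit-learn and synthetic data (the data-generating process is made up for illustration, and models of the same size are compared by training MSE; in practice the final choice among M0, …, Mp would use cross-validation, Cp, AIC, BIC, or adjusted R²):

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic data (an assumption for illustration): only X1 and X3 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.1, size=100)

p = X.shape[1]
best = {}  # best[k] = (column indices, training MSE), i.e. the model M_k
for k in range(1, p + 1):
    for combo in combinations(range(p), k):
        cols = list(combo)
        model = LinearRegression().fit(X[:, cols], y)
        mse = mean_squared_error(y, model.predict(X[:, cols]))
        if k not in best or mse < best[k][1]:
            best[k] = (combo, mse)

for k, (combo, mse) in sorted(best.items()):
    print(f"M_{k}: variables {combo}, training MSE {mse:.4f}")
```

Note the cost: the inner loops fit all 2^p − 1 nonempty models, which is exactly why the next section looks for cheaper alternatives.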

Section 3

Forward Selection
What’s the problem with best subset selection?

Forward Stepwise Selection

Algorithm diagram showing the forward stepwise selection
procedure for building a model by adding variables one at a
time.
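The procedure in the diagram can be sketched as follows (a sketch under assumptions: synthetic data, and training MSE used for the within-step comparisons; only p models are fit per step instead of all remaining subsets):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic data (an assumption for illustration): only X1 and X3 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.1, size=100)

selected, remaining = [], list(range(X.shape[1]))
while remaining:
    # Greedy step: add the one variable whose inclusion most reduces training MSE.
    mse_of = {}
    for j in remaining:
        cols = selected + [j]
        model = LinearRegression().fit(X[:, cols], y)
        mse_of[j] = mean_squared_error(y, model.predict(X[:, cols]))
    best_j = min(mse_of, key=mse_of.get)
    selected.append(best_j)
    remaining.remove(best_j)
    print(f"M_{len(selected)}: {tuple(selected)}, MSE {mse_of[best_j]:.4f}")
```

This fits 1 + p(p + 1)/2 models in total rather than 2^p, the trade-off discussed in the pros and cons below.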
An example for Forward Stepwise Selection

Group work: work through the same example by hand, this time using forward stepwise selection.

We train a model using four variables, X1, X2, X3, X4. We’re interested in selecting a subset of the variables to use. The following table shows the mean squared error and the R2 value computed for the model learned using each possible subset of variables.

Table showing mean squared error values
for models fit using all possible subsets of
four variables in a subset selection
example.


What subset of variables is found for each of the sets M0, M1, M2, M3, M4 when using forward selection?


What subset of variables is returned using forward stepwise selection?

Extra work space if it helps

Table showing mean squared error values for models fit using all possible subsets of four variables in a subset selection example.
Pros and Cons of Forward Stepwise

Pros:
Cons:

Section 4

Backward Selection
Backward stepwise selection

Algorithm diagram showing the backward stepwise selection
procedure for building a model by removing variables one at a
time.
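The backward procedure is the mirror image of the forward sketch (same assumptions: synthetic data, training MSE for the within-step comparisons; note it requires n > p so the full model can be fit at the start):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic data (an assumption for illustration): only X1 and X3 matter.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.1, size=100)

selected = list(range(X.shape[1]))  # start from the full model M_p
while len(selected) > 1:
    # Greedy step: drop the variable whose removal increases training MSE least.
    mse_without = {}
    for j in selected:
        cols = [c for c in selected if c != j]
        model = LinearRegression().fit(X[:, cols], y)
        mse_without[j] = mean_squared_error(y, model.predict(X[:, cols]))
    drop_j = min(mse_without, key=mse_without.get)
    selected.remove(drop_j)
    print(f"M_{len(selected)}: {tuple(selected)}, MSE {mse_without[drop_j]:.4f}")
```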
Pros and Cons of Backward Stepwise

Pros:
Cons:
TL;DR

Algorithm diagram showing one way to break up the
subset selection procedure into steps for choosing
models based on training and testing scores.
Next time

Screenshot of the course schedule for lectures 11 to 20.