
Ch 5.1.4-5: More Cross-Validation

Lecture 14 - CMSE 381
Michigan State University
Dept of Computational Mathematics, Science & Engineering
Wed, Feb 18, 2026
Announcements

Last time:

This lecture:

Screenshot of the course schedule for lectures 11 to 20.

Section 1

k-fold CV
Approximations of Test Error

Validation Set: Figures showing five different validation-set splits, with the validation-set approach shown for comparison.
LOOCV: Figures showing successive leave-one-out cross-validation splits; the Auto data example shows a fixed result with no randomness across repeated runs.
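The "no randomness" property of LOOCV can be checked directly: since every observation is left out exactly once, the estimate involves no random splitting. A minimal sketch, using synthetic data as a stand-in for the Auto example (the data-generating line here is an assumption for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Synthetic stand-in for the Auto data (mpg vs. horsepower).
rng = np.random.default_rng(0)
x = rng.uniform(40, 230, size=100)
y = 40 - 0.15 * x + rng.normal(scale=4, size=100)
X = x.reshape(-1, 1)

model = LinearRegression()
# LOOCV: n folds of size 1; each fold's "MSE" is one squared error.
scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                         scoring="neg_mean_squared_error")
loocv_mse = -scores.mean()
```

Re-running the `cross_val_score` call gives exactly the same `loocv_mse`, unlike a validation-set split, which changes with the random partition.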
K-fold CV: Figures showing the five splits of five-fold cross-validation; second figure from the Auto data example.
Definition of k-fold CV

Figures showing the five splits of five-fold cross-validation. Return
$$\mathrm{CV}_{(k)} = \frac{1}{k}\sum_{i=1}^{k} \mathrm{MSE}_i$$
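The definition above translates directly into a loop over folds: fit on $k-1$ folds, record the MSE on the held-out fold, and average. A sketch assuming scikit-learn and synthetic data (the data-generating model is an illustrative assumption):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(120, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=1.0, size=120)

k = 5
fold_mses = []
for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    fold_mses.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))

# CV_(k) = (1/k) * sum_i MSE_i
cv_k = np.mean(fold_mses)
```

Note that unlike LOOCV, the result here depends on the random assignment of observations to folds (controlled by `random_state`).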
Comparison with simulated data: Ex 3

Plot of simulated nonlinear data
with fitted curves of different
flexibility and corresponding
error behavior.
Plot comparing true test
MSE, LOOCV estimate, and
10-fold cross-validation
estimate, showing very
similar results.
Comparison with simulated data: Ex 1

Plot of simulated nonlinear data
with fitted curves of different
flexibility and corresponding
training and test error behavior.
Plot comparing true test
MSE, LOOCV estimate, and
10-fold cross-validation
estimate, showing that the
cross-validation estimates
underestimate the true test
MSE.
Comparison with simulated data: Ex 2

Plot of simulated close-to-linear
data with fitted curves of
different flexibility and
corresponding training and test
error behavior.
Plot comparing true test
MSE, LOOCV estimate,
and 10-fold cross-validation
estimate, showing close
agreement at low flexibility
and overestimation at
higher flexibility.
Takeaways from the examples

Bias-Variance Tradeoff: Bias

$$E\big(y_0 - \hat{f}(x_0)\big)^2 = \mathrm{Var}\big(\hat{f}(x_0)\big) + \big[\mathrm{Bias}\big(\hat{f}(x_0)\big)\big]^2 + \mathrm{Var}(\varepsilon)$$
Bias-Variance Tradeoff: Variance

$$E\big(y_0 - \hat{f}(x_0)\big)^2 = \mathrm{Var}\big(\hat{f}(x_0)\big) + \big[\mathrm{Bias}\big(\hat{f}(x_0)\big)\big]^2 + \mathrm{Var}(\varepsilon)$$
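The decomposition can be verified numerically: simulate many training sets from a known $f$, fit a (deliberately underfit) linear model each time, and compare the average squared error at a point $x_0$ against variance plus squared bias plus noise. Everything here (the true $f$, $x_0$, noise level) is an illustrative assumption:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Monte-Carlo check of the bias-variance decomposition at one point x0.
rng = np.random.default_rng(4)
f = lambda x: x ** 2          # assumed true function
x0, sigma = 1.5, 0.5          # evaluation point and noise sd (assumptions)

preds, errs = [], []
for _ in range(2000):
    x = rng.uniform(0, 2, size=50)
    y = f(x) + rng.normal(scale=sigma, size=50)
    m = LinearRegression().fit(x.reshape(-1, 1), y)  # underfits x^2 on purpose
    p = m.predict([[x0]])[0]
    preds.append(p)
    # fresh noisy observation y0 at x0 for the left-hand side
    errs.append((f(x0) + rng.normal(scale=sigma) - p) ** 2)

lhs = np.mean(errs)                           # E(y0 - f_hat(x0))^2
var = np.var(preds)                           # Var(f_hat(x0))
bias2 = (np.mean(preds) - f(x0)) ** 2         # Bias(f_hat(x0))^2
rhs = var + bias2 + sigma ** 2                # decomposition
```

Up to Monte-Carlo noise, `lhs` and `rhs` agree, with the bias term dominating because the linear model cannot capture the curvature of $x^2$.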
In short: Validation vs Test


Real-world example: Chekroud et al., Science 383, 164–167 (2024)

Figure from a real-world example in Chekroud et al., Science 383, 164–167 (2024).
Real-world example: Chekroud et al., Science 383, 164–167 (2024)

Second figure from a real-world example in Chekroud et al., Science 383, 164–167 (2024).

Section 2

Using K-Fold CV on Polynomial Linear Regression
Polynomial regression

Replace the linear model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

with

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_d x_i^d + \varepsilon_i$$
Faking linear regression into doing our work for us
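The trick is to build the columns $x, x^2, \ldots, x^d$ as new features; the model is nonlinear in $x$ but still linear in the coefficients, so ordinary least squares fits it. A sketch assuming scikit-learn, with a synthetic cubic example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=(80, 1))
y = 1.0 + 2.0 * x[:, 0] - 0.5 * x[:, 0] ** 3 + rng.normal(scale=0.5, size=80)

# Degree-3 polynomial: columns x, x^2, x^3 turn the problem back into
# ordinary linear regression in beta_0, ..., beta_d.
d = 3
X_poly = PolynomialFeatures(degree=d, include_bias=False).fit_transform(x)
fit = LinearRegression().fit(X_poly, y)   # one coefficient per power of x
```

`include_bias=False` drops the constant column because `LinearRegression` already fits an intercept $\beta_0$.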

Coding - Build a plot for train/test scores vs flexibility
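One possible sketch of this exercise: sweep the polynomial degree (the flexibility), record train and test MSE at each degree, and plot both curves. The data-generating function and degree range are assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
x = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(x[:, 0]) + rng.normal(scale=0.3, size=200)
X_tr, X_te, y_tr, y_te = train_test_split(x, y, test_size=0.5, random_state=0)

degrees = range(1, 11)
train_mses, test_mses = [], []
for d in degrees:
    poly = PolynomialFeatures(degree=d, include_bias=False)
    model = LinearRegression().fit(poly.fit_transform(X_tr), y_tr)
    train_mses.append(mean_squared_error(y_tr, model.predict(poly.transform(X_tr))))
    test_mses.append(mean_squared_error(y_te, model.predict(poly.transform(X_te))))

plt.plot(degrees, train_mses, marker="o", label="train MSE")
plt.plot(degrees, test_mses, marker="o", label="test MSE")
plt.xlabel("polynomial degree (flexibility)")
plt.ylabel("MSE")
plt.legend()
plt.savefig("flexibility.png")
```

The expected picture: train MSE keeps dropping with degree, while test MSE traces the familiar U-shape from the bias-variance tradeoff.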

Next time

Screenshot of the course schedule for lectures 11 to 20.