CMSE 381 Study Guide


Here we summarize the detailed learning goals in terms of a series of questions that you should be able to answer at the end of this course.

(Work in progress… Last update 10/6/2025)

(Required readings: chapter sections 5.1 … 10 pages)

Why do you need validation methods?
What is the difference between validation scores and true test errors?
What are the three most basic validation methods?
For each of these methods,
- How do you calculate the validation score? You should be able to describe the procedures verbally and mathematically.
- How do you calculate the validation score in Python?
- What are the advantages and disadvantages of using this method compared to the other two methods? You should be able to describe them in terms of computational cost and the bias/variance of the estimate.
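
As a minimal sketch of the "in Python" question above, the block below computes a validation score three ways. It assumes the three methods meant here are the validation set approach, leave-one-out cross-validation (LOOCV), and k-fold cross-validation, and it assumes scikit-learn is the library in use; this guide does not name either. The data are synthetic and the variable names are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import (
    KFold,
    LeaveOneOut,
    cross_val_score,
    train_test_split,
)

# Synthetic data (hypothetical): y is linear in x plus noise with variance 1.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=1.0, size=100)

model = LinearRegression()

# 1) Validation set approach: one random split, one fit, one score.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.5, random_state=0
)
model.fit(X_train, y_train)
val_mse = mean_squared_error(y_val, model.predict(X_val))

# 2) LOOCV: n fits, each scored on the single held-out observation.
loo_scores = cross_val_score(
    model, X, y, cv=LeaveOneOut(), scoring="neg_mean_squared_error"
)
loocv_mse = -loo_scores.mean()

# 3) k-fold CV (k = 5 here): k fits, each scored on one held-out fold.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
kfold_scores = cross_val_score(
    model, X, y, cv=kfold, scoring="neg_mean_squared_error"
)
kfold_mse = -kfold_scores.mean()

print(f"validation set MSE: {val_mse:.3f}")
print(f"LOOCV MSE:          {loocv_mse:.3f}")
print(f"5-fold CV MSE:      {kfold_mse:.3f}")
```

scikit-learn reports errors as negative scores (higher is better), so the sign is flipped to recover the mean squared error. Note the computational cost visible in the code: one fit for the validation set approach, 5 fits for 5-fold CV, and 100 fits for LOOCV.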
How do you use validation methods to select the appropriate meta-parameter?
- Given a plot of validation score as a function of model flexibility, which depends on a specific meta-parameter of the model, you should be able to choose the right parameter by reading the plot, and provide an explanation based on the principle of bias-variance tradeoff.
- Given a dataset, you should be able to generate this plot yourself in Python for any type of model covered in this class.
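
One possible way to produce the data behind such a plot is sketched below. It assumes polynomial degree as the meta-parameter controlling flexibility, 10-fold CV as the validation method, and scikit-learn as the library; all three are illustrative choices, not requirements stated in this guide.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (hypothetical): a nonlinear signal plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

degrees = list(range(1, 11))  # candidate values of the meta-parameter
cv = KFold(n_splits=10, shuffle=True, random_state=1)

cv_mse = []
for d in degrees:
    # A more flexible model for each larger degree.
    model = make_pipeline(
        PolynomialFeatures(degree=d, include_bias=False),
        LinearRegression(),
    )
    scores = cross_val_score(
        model, X, y, cv=cv, scoring="neg_mean_squared_error"
    )
    cv_mse.append(-scores.mean())

# The degree with the lowest estimated test error.
best_degree = degrees[int(np.argmin(cv_mse))]
print("CV MSE by degree:", np.round(cv_mse, 3))
print("selected degree:", best_degree)

# To draw the curve (assuming matplotlib is installed):
# import matplotlib.pyplot as plt
# plt.plot(degrees, cv_mse, marker="o")
# plt.xlabel("polynomial degree (flexibility)")
# plt.ylabel("10-fold CV MSE")
# plt.show()
```

The same loop works for other models: swap the pipeline for the model of interest and loop over its own meta-parameter instead of the degree.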

(Required readings: chapter section 6.1 … 9 pages)

Why shouldn’t you have as many predictors in your model as possible?
What are the three basic methods that can be used to select the appropriate predictors (feature selection)?
What are the steps/procedures in the algorithm for each of these methods?
How do you implement these procedures by hand, given the training and CV test errors for each combination of predictors?
What are the pros and cons of each of these feature selection methods? When can you use them, and when can you not?
How do you implement the feature selection algorithm in Python?
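
The sketch below writes out one of these algorithms, forward stepwise selection, as an explicit loop. It assumes the usual two-stage recipe: at each model size the predictor to add is chosen by training RSS, and the final model size is chosen by cross-validated error. The helper names `train_rss` and `cv_mse` are hypothetical, the data are synthetic, and scikit-learn is an assumed library choice. The null model with no predictors is omitted for brevity.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data (hypothetical): only predictors 0 and 3 affect y.
rng = np.random.default_rng(2)
n, p = 200, 6
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=1.0, size=n)

cv = KFold(n_splits=5, shuffle=True, random_state=2)


def train_rss(columns):
    """Training residual sum of squares for a least squares fit on `columns`."""
    fit = LinearRegression().fit(X[:, columns], y)
    return float(np.sum((y - fit.predict(X[:, columns])) ** 2))


def cv_mse(columns):
    """5-fold CV estimate of test MSE for a least squares fit on `columns`."""
    scores = cross_val_score(
        LinearRegression(), X[:, columns], y,
        cv=cv, scoring="neg_mean_squared_error",
    )
    return float(-scores.mean())


selected, remaining = [], list(range(p))
path = []  # one (predictor set, CV MSE) entry per model size

while remaining:
    # Stage 1: among models with one more predictor, keep the lowest training RSS.
    best_j = min(remaining, key=lambda j: train_rss(selected + [j]))
    selected.append(best_j)
    remaining.remove(best_j)
    path.append((list(selected), cv_mse(selected)))

# Stage 2: compare the best model of each size by CV error.
best_columns, best_score = min(path, key=lambda entry: entry[1])

for columns, score in path:
    print(columns, round(score, 3))
print("selected predictors:", sorted(best_columns))
```

Training RSS can only be used to compare models of the same size, since it always decreases as predictors are added; that is why the comparison across sizes uses the CV error instead. Backward stepwise selection follows the same pattern, starting from all predictors and removing one at a time.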

(Required readings: chapter section 6.2.1-6.2.2 … 12 pages)

What is regularization? Why do we need it?
What are the two basic types of regularization methods? How are they implemented mathematically in linear regression? Why are they also called shrinkage methods?
How do you fit these two basic types of regularized linear regression models in Python?
How do you control the model flexibility & bias-variance tradeoff when using regularization?
How do you find the right amount of regularization using cross-validation? How do you do this in Python?
What additional precautions do you need to take when using regularization (compared to least squares)?
What are the advantages of regularization compared to least squares?
What are the advantages of regularization compared to subset selection?
What are the advantages of one type of shrinkage method over the other? When do you choose one over the other?
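
A minimal sketch for the Python questions in this part follows. It assumes the two regularization methods are ridge regression and the lasso, and that scikit-learn is the library in use; neither is stated in this guide. In scikit-learn the penalty strength is the `alpha` parameter. The predictors are standardized inside the pipeline so that the penalty treats them equally and the scaling is re-estimated within each CV fold. The data are synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical): 10 predictors on very different scales,
# of which only the first two affect y.
rng = np.random.default_rng(3)
n, p = 100, 10
Z = rng.normal(size=(n, p))
y = 3.0 * Z[:, 0] - 2.0 * Z[:, 1] + rng.normal(scale=1.0, size=n)
X = Z * rng.uniform(0.1, 10.0, size=p)

alphas = np.logspace(-3, 2, 30)  # candidate penalty strengths
cv = KFold(n_splits=5, shuffle=True, random_state=3)

results = {}
for name, estimator in [("ridge", Ridge()), ("lasso", Lasso(max_iter=10000))]:
    # make_pipeline names each step after its lowercased class name.
    pipe = make_pipeline(StandardScaler(), estimator)
    search = GridSearchCV(
        pipe, {f"{name}__alpha": alphas},
        cv=cv, scoring="neg_mean_squared_error",
    )
    search.fit(X, y)
    results[name] = {
        "alpha": search.best_params_[f"{name}__alpha"],
        "cv_mse": -search.best_score_,
        "coef": search.best_estimator_[-1].coef_,
    }

for name, res in results.items():
    n_zero = int(np.sum(res["coef"] == 0))
    print(f"{name}: alpha={res['alpha']:.4g}, CV MSE={res['cv_mse']:.3f}, "
          f"coefficients exactly zero: {n_zero}")
```

Larger `alpha` means more shrinkage and a less flexible model. The coefficients reported here are on the standardized scale, which is one of the precautions to keep in mind when interpreting them.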

(Required readings: chapter section 6.3 … 9 pages)

How do you create new variables as linear combinations of the original predictors?
Why do we need Principal Component Analysis (PCA)? What is the main purpose of using it?
What is a principal component (PC)?
- What does the first PC maximize? You should be able to explain this both geometrically in a plot and mathematically.
- Similarly, what do the subsequent PCs maximize?
How do you compute the PCs in Python, given a dataset?
- How do you project data points on each PC? You should also be able to plot the data points in the PC space.
- How do you find out how much variance each PC explains?
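
The three questions above can be sketched as follows, assuming scikit-learn's `PCA` class and standardized predictors (both are assumptions, not requirements stated in this guide). The data are synthetic, with two strongly correlated columns and one independent column.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical): columns 0 and 1 share a latent factor.
rng = np.random.default_rng(4)
n = 300
latent = rng.normal(size=n)
X = np.column_stack([
    latent + 0.1 * rng.normal(size=n),
    latent + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
])

# Standardize so that no predictor dominates because of its units.
X_std = StandardScaler().fit_transform(X)

pca = PCA()
scores = pca.fit_transform(X_std)          # data projected onto the PCs, shape (n, 3)
loadings = pca.components_                 # row k is the direction of PC k+1
explained = pca.explained_variance_ratio_  # fraction of variance per PC

# The projection is just the centered data times the loading vectors.
manual_scores = (X_std - pca.mean_) @ loadings.T

print("loadings (rows are PCs):")
print(np.round(loadings, 3))
print("variance explained:", np.round(explained, 3))
print("cumulative:", np.round(np.cumsum(explained), 3))
```

To plot the data in PC space, a scatter plot of `scores[:, 0]` against `scores[:, 1]` is enough. The sign of each loading vector is arbitrary, so a PC may come out flipped relative to another implementation.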
Why do you want to use the PCs in regression models?
- What assumptions do you have to make for it to be a good idea to use principal component regression (PCR)?
- Or conversely, what is a typical scenario in which PCR is a bad choice?
How do you implement PCR in Python?
How do you interpret the model coefficients when using PCR?
How do you choose the number of PCs to use in PCR?
- Given a figure of, e.g., cross-validation score as a function of the number of PCs, you should be able to choose appropriately and justify your choice in terms of the bias-variance tradeoff.
- You should be able to generate such figures in Python.
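
One way to implement PCR and compute the CV curve over the number of PCs is sketched below. It assumes a scikit-learn pipeline of standardization, PCA, and least squares, tuned with 10-fold CV; these are illustrative choices rather than a prescribed method. The data are synthetic, built so that two latent factors drive both the predictors and the response.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical): 8 predictors generated from 2 latent factors.
rng = np.random.default_rng(5)
n, p = 200, 8
factors = rng.normal(size=(n, 2))
mixing = rng.normal(size=(2, p))
X = factors @ mixing + 0.3 * rng.normal(size=(n, p))
y = 2.0 * factors[:, 0] - 1.0 * factors[:, 1] + rng.normal(scale=0.5, size=n)

# PCR = standardize, project onto the first M PCs, then least squares on the scores.
pcr = make_pipeline(StandardScaler(), PCA(), LinearRegression())

n_components = list(range(1, p + 1))
cv = KFold(n_splits=10, shuffle=True, random_state=5)
search = GridSearchCV(
    pcr, {"pca__n_components": n_components},
    cv=cv, scoring="neg_mean_squared_error",
)
search.fit(X, y)

cv_mse = -search.cv_results_["mean_test_score"]  # one value per number of PCs
best_m = search.best_params_["pca__n_components"]

for m, err in zip(n_components, cv_mse):
    print(f"M={m}: CV MSE={err:.3f}")
print("selected number of PCs:", best_m)

# To draw the figure (assuming matplotlib is installed):
# import matplotlib.pyplot as plt
# plt.plot(n_components, cv_mse, marker="o")
# plt.xlabel("number of PCs"); plt.ylabel("10-fold CV MSE"); plt.show()
```

With all 8 PCs, PCR on standardized predictors gives the same fit as ordinary least squares, so the right end of the curve is the unregularized baseline. Because PCA sits inside the pipeline, the components are recomputed on each training fold, which keeps the validation folds out of the fitting step.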
What are the relationships and differences between PCR and the feature selection and regularization methods that you learned in this part of the course?