Jupyter Notebook
Lecture 3 - Mean Squared Error
CMSE 381 - Fall 2024
Sept 4, 2024
This notebook has some code to go along with lecture 3 on Mean Squared Error.
# As always, we start with our favorite standard imports.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
Info about the data set
Also hosted on the Data Sets page of our class website.
Auto: Auto Data Set
Description
Gas mileage, horsepower, and other information for 392 vehicles.
Format
A data frame with 392 observations on the following 9 variables.
mpg
: miles per gallon
cylinders
: Number of cylinders between 4 and 8
displacement
: Engine displacement (cu. inches)
horsepower
: Engine horsepower
weight
: Vehicle weight (lbs.)
acceleration
: Time to accelerate from 0 to 60 mph (sec.)
year
: Model year (modulo 100)
origin
: Origin of car (1. American, 2. European, 3. Japanese)
name
: Vehicle name
The original data contained 408 observations, but 16 observations with missing values were removed.
Source
This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University. The dataset was used in the 1983 American Statistical Association Exposition.
# First, we're going to do all the data loading and cleanup we figured out last time.
auto = pd.read_csv('../../DataSets/Auto.csv')
auto = auto.replace('?', np.nan)
auto = auto.dropna()
auto.horsepower = auto.horsepower.astype('int')
auto.shape
I want to just predict acceleration using horsepower.
✅ Do this: Make a scatter plot of acceleration (the output variable) vs horsepower (the input variable). Does it look like there’s a relationship between the two variables?
# Your code here.
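One possible way to set up the scatter plot is sketched below. The helper name `plot_acceleration_vs_horsepower` is made up for this illustration and is not part of the lecture materials. It assumes it is handed a data frame with `horsepower` and `acceleration` columns, such as the cleaned `auto` frame loaded above.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_acceleration_vs_horsepower(df: pd.DataFrame):
    """Scatter acceleration (y-axis) against horsepower (x-axis).

    Hypothetical helper for illustration; expects columns named
    'horsepower' and 'acceleration'.
    """
    fig, ax = plt.subplots()
    # Slight transparency makes overlapping points easier to see.
    ax.scatter(df["horsepower"], df["acceleration"], alpha=0.6)
    ax.set_xlabel("horsepower")
    ax.set_ylabel("acceleration (sec. from 0 to 60 mph)")
    ax.set_title("Acceleration vs. horsepower")
    return ax
```

In the notebook, calling `plot_acceleration_vs_horsepower(auto)` should draw the plot for the cleaned data.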
I’ve decided to use the model \( \hat {f}(\texttt{horsepower}) = 23-0.05 \cdot \texttt{horsepower} \)
✅ Do this: Make a pandas Series with entries \(\hat f(\texttt{horsepower})\) for each entry in auto.horsepower.
# Your code here
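As a rough sketch of one approach, arithmetic on a pandas Series is applied elementwise, so the model can be written as a single expression. The function name `predict_acceleration` and the toy horsepower values are invented for illustration; they are not taken from the Auto data set.

```python
import pandas as pd


def predict_acceleration(horsepower, intercept=23.0, slope=-0.05):
    """Evaluate f_hat(horsepower) = intercept + slope * horsepower elementwise.

    Defaults match the model stated in the text: 23 - 0.05 * horsepower.
    """
    return intercept + slope * horsepower


# Toy values for illustration only; NOT taken from the Auto data set.
toy_horsepower = pd.Series([100, 150, 200], name="horsepower")
toy_predictions = predict_acceleration(toy_horsepower)
```

In the notebook, `predict_acceleration(auto.horsepower)` should give a Series with one prediction per vehicle, sharing the index of `auto`.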
✅ Do this: Using the series you just built, calculate the mean squared error,
\( MSE = \frac{1}{n} \sum_{i=1}^n (y_i - \hat y_i)^2. \)
# Your code here
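The formula above translates fairly directly into code: subtract, square, then average. The sketch below is only one way to write it. The name `mean_squared_error` is chosen here for illustration, and the toy numbers are made up rather than drawn from the Auto data.

```python
import numpy as np
import pandas as pd


def mean_squared_error(y_true, y_pred):
    """Return (1/n) * sum((y_i - y_hat_i)^2).

    Inputs are converted to plain arrays so the subtraction is positional;
    subtracting two pandas Series directly aligns them by index label instead.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)


# Toy values for illustration only; NOT taken from the Auto data set.
toy_y = pd.Series([17.0, 16.0, 12.0])       # observed values
toy_y_hat = pd.Series([18.0, 15.5, 13.0])   # model predictions
toy_mse = mean_squared_error(toy_y, toy_y_hat)
```

For these toy numbers the residuals are -1, 0.5, and -1, so the squared errors are 1, 0.25, and 1, and their mean is 0.75.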
Have some spare time? Can you mess around with the coefficients in your model to decrease the MSE?
# Your code here
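Rather than adjusting the coefficients by hand, one option is an ordinary least-squares line fit, which by construction picks the intercept and slope that minimize the sum of squared errors (and hence the MSE) on the data it is given. The sketch below uses `np.polyfit` for this. It is a swapped-in technique for comparison, not necessarily what the lecture intends, and the helper names and toy values are invented for illustration.

```python
import numpy as np


def mse_for_line(x, y, intercept, slope):
    """MSE of the line y_hat = intercept + slope * x on the data (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((y - (intercept + slope * x)) ** 2)


def fit_line_least_squares(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # For deg=1, polyfit returns coefficients highest power first: [slope, intercept].
    slope, intercept = np.polyfit(x, y, deg=1)
    return intercept, slope


# Toy values for illustration only; NOT taken from the Auto data set.
toy_x = [100, 150, 200]
toy_y = [17.0, 16.0, 12.0]

baseline_mse = mse_for_line(toy_x, toy_y, intercept=23.0, slope=-0.05)
best_intercept, best_slope = fit_line_least_squares(toy_x, toy_y)
fitted_mse = mse_for_line(toy_x, toy_y, best_intercept, best_slope)
```

On any data set, `fitted_mse` should be no larger than the MSE of a hand-picked line evaluated on that same data, which makes it a useful benchmark for the manual experiments.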
Congratulations, we’re done!
Written by Dr. Liz Munch, Michigan State University
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.