Practice with: Pandas
This notebook is meant as a review of some of the more salient features of Pandas. To be clear, we will not be focusing on reading in data; rather, this notebook focuses on using Pandas to select and analyze data.
We’ll be using a data set of measurements recorded over time by weather stations around the US (weather.csv). We are going to explore the average temperature in a few different states in different years.
Part 0: Importing Packages and Reading in the Data
As a first step, let’s import the necessary packages and read in the data. Then, we’ll use .head() to take a first look at the data.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
weather_data = pd.read_csv('weather.csv')
weather_data.head()
NOTE: When you are having trouble figuring out the exact column headers in a data file, you can call the pandas method .keys() on the dataframe to see the exact strings!
weather_data.keys()
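As a small illustration on a made-up table (the column names below are invented for this example and are not the real weather.csv headers), .keys() and the .columns attribute report the same labels, and printing each label with repr() makes stray spaces easy to spot:

```python
import pandas as pd

# Toy table with hypothetical column names (NOT the real weather.csv headers)
toy = pd.DataFrame({"Station State": ["Michigan"], " Temp": [41.0]})

# .keys() and .columns both give the Index of column labels
print(toy.keys())
print(list(toy.columns))

# repr() shows each label exactly, including leading or trailing whitespace
for name in toy.columns:
    print(repr(name))
```

Copying a label from this output, rather than retyping it, avoids most "KeyError" surprises when selecting a column.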
Part 1: Analyzing a Single State
1.0 Filtering down to one year
You may have noticed that the dataframe contains weather station data from both 2016 and 2017. Let’s start by masking our dataframe to include only data from 2016.
# Filter the data with a mask
year_mask = weather_data['Date_Year']==2016
year_data = weather_data[year_mask]
year_data
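To see what the mask itself looks like, here is a minimal sketch on a made-up table (the column names "year" and "temp" and all values are assumptions for illustration only). A mask is a boolean Series, and two conditions can be combined with & as long as each one is wrapped in parentheses:

```python
import pandas as pd

# Made-up data; "year" and "temp" are hypothetical column names
toy = pd.DataFrame({
    "year": [2016, 2016, 2017, 2017],
    "temp": [30.0, 55.0, 28.0, 60.0],
})

mask = toy["year"] == 2016   # boolean Series, one True/False per row
print(mask.sum())            # number of rows that pass the filter
subset = toy[mask]           # keeps only the rows where mask is True

# Combine conditions with & (and) or | (or); each condition needs parentheses
warm_2016 = toy[(toy["year"] == 2016) & (toy["temp"] > 50)]
print(warm_2016)
```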
1.1
✅ Pick a single state (from the column named Station State). Create a new dataframe that contains only the data for that state.
# Write your code here
1.2
Our ultimate goal is to see what the average temperature is for each month in our state’s data. Using another mask, select one month from the Date_Month column. Then calculate the average temperature of the Data_Temperature column for that month.
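The mask-then-average pattern can be sketched on a tiny made-up table first (the column names "month" and "temp" and the values are assumptions, not the notebook's data). The same idea should carry over once you substitute your own dataframe and column names:

```python
import pandas as pd

# Made-up data; "month" and "temp" are hypothetical column names
toy = pd.DataFrame({
    "month": [1, 1, 2, 2],
    "temp": [20.0, 30.0, 40.0, 50.0],
})

# Step 1: mask down to a single month
january = toy[toy["month"] == 1]

# Step 2: average one column of the masked dataframe
january_mean = january["temp"].mean()
print(january_mean)  # 25.0

# The same calculation in one step, using .loc[row_mask, column]
january_mean_loc = toy.loc[toy["month"] == 1, "temp"].mean()
```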
# Write your code here
1.3
✅ Now repeat the process above (for your state), but for the other 11 months. You can do this by making a new mask and new masked dataframe for each month, but try to do it in as compact a way as possible! Using a loop or a function would be a good way to approach it!
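One possible shape for a compact solution, shown on made-up data (the helper name monthly_means, the column names, and the values are all hypothetical): a function that loops over the months present and reuses the same mask-and-average step. For comparison, pandas' groupby gives the same numbers in one line.

```python
import pandas as pd

# Made-up data; "month" and "temp" are hypothetical column names
toy = pd.DataFrame({
    "month": [1, 1, 2, 2, 3],
    "temp": [20.0, 30.0, 40.0, 50.0, 60.0],
})

def monthly_means(df, month_col="month", temp_col="temp"):
    """Return a list of (month, mean temperature) pairs, one per month present."""
    results = []
    for month in sorted(df[month_col].unique()):
        month_rows = df[df[month_col] == month]           # mask for this month
        results.append((month, month_rows[temp_col].mean()))
    return results

loop_means = monthly_means(toy)
print(loop_means)

# Alternative: let pandas do the grouping; the result is indexed by month
grouped_means = toy.groupby("month")["temp"].mean()
print(grouped_means)
```

The loop version makes each step visible, which helps when debugging; the groupby version is shorter once the logic is clear.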
# Write your code here
1.4
✅ Create a plot that visualizes the monthly average temperature for your state. Make sure to include all of the appropriate details in your plot!
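As a reminder of what "appropriate details" usually means, here is a minimal sketch with made-up numbers (the values, the label text, and the temperature unit are assumptions; check the units in your own data). It shows axis labels, a title, tick marks at each month, and a legend:

```python
import matplotlib.pyplot as plt

# Made-up values for four months; replace with your own monthly averages
months = [1, 2, 3, 4]
avg_temps = [24.5, 27.0, 36.2, 48.9]

fig, ax = plt.subplots()
ax.plot(months, avg_temps, marker="o", label="Example state")
ax.set_xlabel("Month")
ax.set_ylabel("Average temperature (°F)")   # unit assumed for this example
ax.set_title("Monthly average temperature (made-up data)")
ax.set_xticks(months)                       # one tick per month
ax.legend()
```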
# put your answer here
1.5
✅ What do you notice about the average temperature across the months? Does it make sense to you, given the climate of the state you chose?
2.0 Challenge Problem
✅ Now, we want to expand the above procedure to additional states. Bringing together all of your knowledge of loops, functions, pandas, and matplotlib, repeat the process you used for one state for several other states. Try to do it using loops and/or functions to make your code compact and easier to debug! This is more challenging than what should be on a quiz but will let you practice a lot of what you have learned up to this point!
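One way to organize this, sketched on made-up data (the state labels "A" and "B", the column names, and the helper state_monthly_means are all hypothetical), is to put the per-state work in a function and call it inside a loop that also adds one line per state to a shared plot:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data; "state", "month", and "temp" are hypothetical column names
toy = pd.DataFrame({
    "state": ["A", "A", "A", "B", "B", "B"],
    "month": [1, 1, 2, 1, 2, 2],
    "temp": [20.0, 30.0, 40.0, 60.0, 70.0, 80.0],
})

def state_monthly_means(df, state):
    """Mask df down to one state, then average temp within each month."""
    state_rows = df[df["state"] == state]
    return state_rows.groupby("month")["temp"].mean()

fig, ax = plt.subplots()
all_means = {}
for state in ["A", "B"]:
    means = state_monthly_means(toy, state)
    all_means[state] = means
    ax.plot(means.index, means.values, marker="o", label=state)

ax.set_xlabel("Month")
ax.set_ylabel("Average temperature")
ax.set_title("Monthly average temperature by state (made-up data)")
ax.legend()
```

Keeping the calculation in its own function means a bug only has to be fixed in one place, no matter how many states are in the list.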
# Write your code here
Prepared by the Department of Computational Mathematics, Science and Engineering at MSU