Homework 4: Compartmental Models + Data Visualization (Fall 2025)
Assignment instructions
Work through the following assignment, making sure to follow all of the directions and answer all of the questions.
This assignment is due at 11:59pm on Friday, November 21, 2025.
It should be uploaded into D2L Homework #4. Submission instructions can be found at the end of the notebook.
Table of Contents
Part 0. Academic Integrity Statement (2 points)
Part 1. Understanding compartmental models (23 points)
Part 2. Simulating differential equations (20 points)
Part 3. Interpreting results of simulations (15 points)
Part 4. Data visualization (27 points)
Part 5. Data interpretation (20 points)
Total points = 107.
Part 0. Academic integrity statement (2 points)
In the markdown cell below, paste your personal academic integrity statement. By including this statement, you are confirming that you are submitting this as your own work and not that of someone else.
✎ Put your personal academic integrity statement here.
Part 1. Understanding compartmental models (23 total points)
There are many different cycles that can be observed in nature. Some examples you might be familiar with include the water cycle and the carbon cycle. For this problem, you will investigate the nitrogen cycle.
Below is a diagram about the nitrogen cycle (source: Wikipedia):
Now, someone wrote down a compartmental model for the nitrogen cycle:
You will need to conceptually analyze this model and write down the model as ordinary differential equations (ODEs).
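As a reminder of how a compartment diagram translates into equations, consider a hypothetical two-compartment system (not the nitrogen model above) in which material flows from compartment \(A\) to compartment \(B\) at rate \(k_1 A\) and returns from \(B\) to \(A\) at rate \(k_2 B\). Each arrow contributes a loss term to the compartment it leaves and an equal gain term to the compartment it enters. The symbols \(A\), \(B\), \(k_1\), and \(k_2\) are illustrative assumptions:

```latex
\begin{align}
\frac{dA}{dt} &= -k_1 A + k_2 B \\
\frac{dB}{dt} &= k_1 A - k_2 B
\end{align}
```

The two right-hand sides sum to zero, so the total \(A + B\) is conserved in this sketch. The same check is a useful sanity test for any closed compartmental model.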
✅ Part 1.1 (5 points)
Describe in the cell below: how many compartments there are in the model, and what each of the compartments stands for.
✎ Put your answer here
✅ Part 1.2 (8 points)
For each of the four pathways or arrows in the model, give an interpretation for what actual process might be represented.
✎ Put your answer here
✅ Part 1.3 (4 points)
Name at least two assumptions or simplifications that the compartmental model makes with respect to the more detailed schematic above.
✎ Put your answer here
✅ Part 1.4 (6 points)
In the cell below, write down the ordinary differential equations associated with this compartmental model.
✎ Put your answer here
Part 2. Solving differential equations numerically (20 total points)
In previous assignments, you have used solve_ivp to compute the numeric solution of the logistic model of the growth of a single population. Recall that the logistic population model is described by the following differential equation:
\begin{equation} \frac{dP}{dt} = kP\Big(1-\frac{P}{C}\Big), \end{equation}
where \(P =\) population, \(k =\) growth rate, and \(C =\) the carrying capacity.
Example code for computing the solution for \(P_0 = 0.1\) billion (initial population), \(k=1\), and \(C = 1\) billion is provided below:
# example code to compute a numeric solution of the logistic model
import numpy as np
from scipy.integrate import solve_ivp
# define the derivative (k = 1 and C = 1, so dP/dt = P*(1 - P))
def logistic(time, current_state):
    p = current_state
    dpdt = p*(1-p)
    return dpdt
# compute numeric solution
initial_p = [0.1]
time = np.linspace(0,10,50)
result = solve_ivp(logistic, (0,20), initial_p, t_eval = time)
# unpack solution (row 0 of result.y holds the only state variable)
numerical_p = result.y[0,:]
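One way to gain confidence in a numerical solution is to compare it with a known exact solution. The logistic equation has the closed form \(P(t) = C / \big(1 + (C/P_0 - 1)e^{-kt}\big)\). The sketch below repeats the example with \(k = 1\), \(C = 1\), \(P_0 = 0.1\) and measures the largest gap between the two. The tighter tolerances (rtol, atol) and the names exact_p and max_error are choices made for this illustration, not part of the assignment.

```python
import numpy as np
from scipy.integrate import solve_ivp


def logistic(time, current_state):
    # dP/dt = k*P*(1 - P/C) with k = 1 and C = 1
    p = current_state
    return p * (1 - p)


p0 = 0.1
time = np.linspace(0, 10, 50)

# tighter tolerances than the defaults, chosen for this comparison only
result = solve_ivp(logistic, (0, 10), [p0], t_eval=time, rtol=1e-8, atol=1e-10)
numerical_p = result.y[0, :]

# closed-form logistic solution for k = 1 and C = 1
exact_p = 1.0 / (1.0 + (1.0 / p0 - 1.0) * np.exp(-time))

max_error = np.max(np.abs(numerical_p - exact_p))
print(f"largest absolute difference: {max_error:.2e}")
```

No closed form is available for most multi-species models, so this kind of check is mainly useful for validating the workflow on a simple case first.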
Cyclic 3-Species (Rock-Paper-Scissors) Ecosystem Model
Now you wonder: what happens if three different species interact in a cyclic way - each one outcompetes a second, the second outcompetes a third, and the third outcompetes the first. (This is the ecological analogue of Rock-Paper-Scissors (RPS).)
Such cyclic systems are common in nature (for example, among certain bacterial strains or lizard morphs) and can produce rich dynamics including coexistence, oscillations, and heteroclinic cycles.
A simple nondimensional model for three interacting species \(u(t)\), \(v(t)\), and \(w(t)\) is:
\begin{align} \frac{du}{dt} &=u(1-u-av) &(1)\\ \frac{dv}{dt} &=v(1-v-bw) &(2)\\ \frac{dw}{dt} &=w(1-w-cu). &(3) \end{align}
Solve this RPS system using solve_ivp, assuming \(a = b = c = 2\) and the following initial conditions:
\(u(0) = 0.6\)
\(v(0) = 0.2\)
\(w(0) = 0.5\) in units of thousands.
Evolve the model for 40 decades (400 years) using a timestep of \(\Delta t = 0.01\) decades.
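A time step of \(\Delta t = 0.01\) over 40 decades corresponds to 4000 intervals, or 4001 output times including \(t = 0\). A minimal sketch of building that grid follows; the variable names are illustrative. Note that solve_ivp chooses its own internal step sizes, so an array passed as t_eval only controls where the solution is reported.

```python
import numpy as np

t_final = 40.0  # decades
dt = 0.01       # decades

# number of intervals, rounded to guard against floating-point error
n_steps = int(round(t_final / dt))

# n_steps + 1 points so that both end points are included
time = np.linspace(0.0, t_final, n_steps + 1)

print(len(time), time[1] - time[0])
```

Using np.linspace with an explicit point count avoids the end-point ambiguity that np.arange can show with non-integer steps.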
Notes
Each species grows logistically when alone (carrying capacity scaled to 1).
Cross terms such as \(a v\) represent harmful competition from another species.
When \(a=b=c\), the system is symmetric; when the parameters differ, asymmetry can lead to new behaviors.
Biological Example
Three bacterial strains (A, B, C) compete for the same nutrient:
A produces a compound that inhibits C,
C inhibits B,
B inhibits A.
This forms a cyclic dominance network similar to Rock-Paper-Scissors.
Your Task
Implement and analyze the three-species cyclic Lotka-Volterra model above.
You can adapt your code from the logistic model.
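When the state has more than one variable, the derivative function receives the whole state as a single array and must return one derivative per variable, in the same order. A minimal sketch of that format is shown here for a hypothetical damped oscillator, which is deliberately not the RPS model; the function name oscillator and the damping value 0.5 are assumptions made for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp


def oscillator(time, current_state, damping):
    # unpack the state vector into named variables
    x, y = current_state
    dxdt = y
    dydt = -x - damping * y
    # return the derivatives in the same order as the state
    return [dxdt, dydt]


initial_state = [1.0, 0.0]
time = np.linspace(0, 10, 101)

# extra parameters of the derivative function are passed through args
result = solve_ivp(oscillator, (0, 10), initial_state, t_eval=time, args=(0.5,))

# each row of result.y holds one state variable
x = result.y[0, :]
y = result.y[1, :]
```

Passing parameters through args keeps them out of the function body, which makes it easy to rerun the same model with different values.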
✅ Part 2.1 (8 points)
Define the derivative function for the three-species competitive Lotka-Volterra model in the cell below (to be used as an input later for solve_ivp – so pay attention to the format).
# write your function here
✅ Part 2.2 (6 points)
Using solve_ivp, compute the numeric solution of the RPS model with the parameters \(a=2.0\), \(b=2.0\), \(c=2.0\), the initial conditions \([u_0,v_0,w_0]= [0.6,0.2,0.5]\), and a final time of 40. Unpack the result you get from solve_ivp into separate variables u, v, and w.
# put your code here
✅ Part 2.3 (4 points)
Plot the solutions of \(u\), \(v\), and \(w\) as a function of time in the cell below. Be sure to add appropriate axis labels and legends.
# put your code here
✅ Part 2.4 (2 points)
When \(a=b=c\), we call the model a symmetric model. Look at the plot above to see what happens as time goes on. Can the three species, i.e. \(u\), \(v\), and \(w\), coexist, or will one species die out?
✎ Put your answer here
Part 3. Interpreting model behavior for different parameters and initial conditions (15 total points)
In Part 2, you explored the three-species competitive Lotka-Volterra model in a symmetric scenario, i.e. no species had a competitive advantage because the parameters satisfied \(a=b=c\).
In this part, you will be shown solutions of the model under several asymmetric competition scenarios, all with the same initial conditions. You will be asked to interpret these plots.
✅ Part 3.1 (5 points) Asymmetric competition case (\(a<b=c\))
The figure below shows the numerically solved population growth dynamics of the three species \(u\), \(v\) and \(w\):
The initial population was \(u_0 = 0.6\), \(v_0 =0.2\), and \(w_0 = 0.5\). The parameters were \(a = 1\), \(b=c=2\). In the cell below, answer the following questions:
Based on the plot, which species was more successful at the beginning? Which species was more successful at the end? Here success is measured by population size.
Based on the plot, which species was more competitive?
Connecting your visual interpretations to the model parameters \(a, b,\) and \(c\), explain how these three parameters determine which species is strongly competitive and which species is weakly competitive.
✎ Put your answer here
✅ Part 3.2 (4 points) Asymmetric competition case (\(b <a=c\))
Now we changed the parameters so that \(a=c=2\) and \(b=1\), with the initial conditions remaining the same. We obtained the following figure:
In the cell below, answer the following questions:
How is this figure different from the one in Part 3.1?
How are the changes in parameters \(a,b\) and \(c\) related to changes in the figure?
✎ Put your answer here
✅ Part 3.3 (6 points) Asymmetric competition case (\(c<a=b\))
We changed the parameters again to \(a=b=2\) and \(c=1\), with the initial conditions remaining the same (\([u_0,v_0,w_0] = [0.6,0.2,0.5]\)). We obtained the following figure:

Then, we used a different set of parameters with \(a = 2.0\), \(b = 1.5\), and \(c = 1.0\), while the initial population levels stayed the same (\([u_0,v_0,w_0] = [0.6,0.2,0.5]\)). We obtained the figure below:
Compare the two figures, and answer the following questions in the cell below:
In what aspects are these two figures different in terms of the final outcome?
What differences in the model caused the differences in the figures?
Why would the short-term dynamics of the two figures look different, even though \(c\) is still the smallest competition parameter (\(c < b < a\))?
✎ Put your answer here
Part 4: Visualization (27 points)
In Parts 4 and 5 of this homework, you will look at some data about flight delays in the United States. Some information about the dataset is provided below.
Original dataset: Flight Delay and Cancellation Dataset (2019-2023) from Kaggle
Created by Patrick Zelazko
Source: U.S. Bureau of Transportation Statistics (BTS)
Coverage: January 2019 - August 2023
Original size: 3 million flight records
Dataset processing for this homework: The original dataset has been preprocessed for educational use:
Downsampled from 3M rows to ~15k rows using stratified sampling (preserves temporal patterns)
Aggregated to monthly level with computed statistics
Enhanced with cancellation cause percentages from BTS cancellation codes
This preprocessing reduces file size and computation time while maintaining the ability to analyze seasonal patterns and trends.
Dataset Used in This Notebook
Use the cell below to load in the flights_monthly_ready.csv file. This includes the following columns:
ym, year, month, flights - temporal information and flight counts
dep_delayed, arr_delayed, dep_delay_rate, arr_delay_rate - delay metrics
dep_delay_hours_mean, dep_delay_hours_std - departure delay statistics
cancelled_total, cancel_carrier_pct, cancel_weather_pct, cancel_nas_pct, cancel_security_pct - cancellation causes
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
DATA_PATH = "flights_monthly_ready.csv" # change if needed
m = pd.read_csv(DATA_PATH, parse_dates=["ym"])
m.head()
✅ Part 4.1 (4 points) Calculate the monthly departure delay rate and arrival delay rate from counts and verify against the columns.
dep_delay_rate = 100 * dep_delayed / flights
arr_delay_rate = 100 * arr_delayed / flights
Print ym, dep_delay_rate, arr_delay_rate for the first 12 months.
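The recompute-and-compare pattern can be sketched on a tiny made-up table. The numbers below are invented for illustration and are not taken from flights_monthly_ready.csv; only the column names follow the dataset description. np.allclose is used rather than == because stored rates may differ from recomputed ones by floating-point rounding.

```python
import numpy as np
import pandas as pd

# toy monthly table with made-up numbers (not the homework data)
toy = pd.DataFrame({
    "ym": pd.to_datetime(["2019-01-01", "2019-02-01", "2019-03-01"]),
    "flights": [200, 250, 400],
    "dep_delayed": [40, 50, 100],
    "dep_delay_rate": [20.0, 20.0, 25.0],
})

# recompute the rate (in percent) from the raw counts
computed = 100 * toy["dep_delayed"] / toy["flights"]

# compare with the stored column, allowing for rounding error
matches = np.allclose(computed, toy["dep_delay_rate"])

print(toy[["ym"]].assign(computed=computed).head(12))
print("matches stored column:", matches)
```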
✎ Put your answer here (1-2 sentences): Do your computed rates match the columns?
✅ Part 4.2 (6 points) Visualize the monthly departure delay rate and arrival delay rate as line plots over time on the same figure.
✎ Put your answer here (1 sentence): Which month is worst for each rate?
✅ Part 4.3 (6 points) Visualize the average and standard deviation of departure delay hours for every month.
Because the monthly file contains aggregated statistics (dep_delay_hours_mean, dep_delay_hours_std) rather than raw per-flight values, use a mean +/- std error-bar chart (a traditional boxplot would require raw values).
How to create an error bar chart:
Use plt.errorbar() to show mean values with error bars representing standard deviation:
plt.errorbar(x, y, yerr=error, fmt='o')
Parameters:
x: x-axis values (e.g., dates/months)
y: y-axis values (e.g., mean values)
yerr: error bars (e.g., standard deviation)
fmt='o': format string - 'o' creates points/markers; can also use 'o-' for points connected by lines
The error bars extend vertically from (mean - std) to (mean + std), showing the variability in the data.
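A minimal, self-contained sketch of such a chart on synthetic values follows. The six means and standard deviations are invented for illustration, and capsize and figsize are optional styling choices rather than requirements.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# synthetic monthly statistics (made-up values, not the homework data)
months = pd.date_range("2019-01-01", periods=6, freq="MS")
mean_hours = np.array([0.20, 0.25, 0.22, 0.30, 0.35, 0.28])
std_hours = np.array([0.05, 0.08, 0.06, 0.10, 0.12, 0.07])

fig, ax = plt.subplots(figsize=(8, 4))

# markers at the means, vertical bars spanning mean - std to mean + std
bars = ax.errorbar(months, mean_hours, yerr=std_hours, fmt="o-", capsize=3)

ax.set_xlabel("Month")
ax.set_ylabel("Departure delay (hours)")
ax.set_title("Mean departure delay +/- 1 std (synthetic data)")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```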
✎ Put your answer here (1 sentence): Which months have the largest typical delays and variability?
✅ Part 4.4 (8 points) Cancellation cause percentages by month - make a stacked area (or stacked bars) using:
cancel_carrier_pct, cancel_weather_pct, cancel_nas_pct, cancel_security_pct.
These are the shares of cancellations by cause (DOT/BTS: A=Carrier, B=Weather, C=NAS, D=Security).
How to create a stacked area chart:
Use plt.stackplot() to show how different categories contribute to a total over time:
plt.stackplot(x, y1, y2, y3, ..., labels=['Category 1', 'Category 2', ...])
Parameters:
x: x-axis values (e.g., dates/months) - shared by all categories
y1, y2, y3, ...: y-axis values for each category (one array per category)
labels: list of category names for the legend
Key Features:
Each area is stacked on top of the previous one
The height of each colored band represents that category’s value
The total height at any point shows the sum of all categories
Perfect for showing proportions or compositions that add up to 100%
Reading the chart:
Bottom area = first category (e.g., y1)
Each band above = next category stacked on top
Top edge = cumulative total (should be ~100% for percentages)
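The following sketch builds a stacked area chart from four synthetic percentage series that sum to 100 in every month. All values are invented for illustration; only the cause names follow the dataset description.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# synthetic cause shares in percent (made-up values, not the homework data)
months = pd.date_range("2019-01-01", periods=4, freq="MS")
carrier = np.array([40.0, 35.0, 50.0, 45.0])
weather = np.array([45.0, 50.0, 20.0, 15.0])
nas = np.array([14.0, 14.0, 29.0, 38.0])
security = np.array([1.0, 1.0, 1.0, 2.0])

# sanity check: the shares should add up to about 100 in each month
total = carrier + weather + nas + security

fig, ax = plt.subplots(figsize=(8, 4))

# the first series is drawn at the bottom, the others are stacked above it
layers = ax.stackplot(
    months, carrier, weather, nas, security,
    labels=["Carrier", "Weather", "NAS", "Security"],
)

ax.set_ylim(0, 100)
ax.set_xlabel("Month")
ax.set_ylabel("Share of cancellations (%)")
ax.legend(loc="upper left")
fig.autofmt_xdate()
```

Checking the column sum before plotting makes it easier to spot months where the shares do not add up to 100%.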
✎ Put your answer here (2 sentences): Which causes tend to rise in winter vs. summer?
✅ Part 4.5 (3 points) Compare the plot you made above in Part 4.4 with the plot shown below. Identify one thing the first plot does better, and identify one thing the second plot does better.

✎ Put your answer here (2 sentences)
Part 5: Interpretation (20 points)
✅ Part 5.1 (4 points) Best vs. worst months for departure and arrival delay rates.
Identify the highest and lowest months for each rate and quantify the values and difference (in percentage points).
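One way to locate extreme rows is with the pandas methods idxmax and idxmin, which return the index label of the largest and smallest value in a column. The sketch below uses a three-row toy table with invented rates; since the rates are already in percent, their difference is in percentage points.

```python
import pandas as pd

# toy table with made-up rates in percent (not the homework data)
toy = pd.DataFrame({
    "ym": pd.to_datetime(["2019-01-01", "2019-02-01", "2019-03-01"]),
    "dep_delay_rate": [18.0, 26.5, 21.0],
})

# rows holding the highest and lowest rate
worst = toy.loc[toy["dep_delay_rate"].idxmax()]
best = toy.loc[toy["dep_delay_rate"].idxmin()]

# difference of two percentages, expressed in percentage points
gap = worst["dep_delay_rate"] - best["dep_delay_rate"]

print("highest:", worst["ym"].strftime("%Y-%m"), worst["dep_delay_rate"])
print("lowest: ", best["ym"].strftime("%Y-%m"), best["dep_delay_rate"])
print("gap (percentage points):", gap)
```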
✎ Put your answer here (2-3 sentences).
✅ Part 5.2 (4 points) Do departure and arrival delay rates move together?
Compute the correlation between dep_delay_rate and arr_delay_rate and describe the strength in one sentence.
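The Pearson correlation between two columns can be obtained with the pandas Series.corr method. A small sketch on invented values follows; the resulting coefficient lies between -1 and 1, with values near 1 indicating that the two series rise and fall together.

```python
import pandas as pd

# toy rates in percent (made-up values, not the homework data)
toy = pd.DataFrame({
    "dep_delay_rate": [18.0, 20.0, 25.0, 22.0, 30.0],
    "arr_delay_rate": [17.0, 21.0, 24.0, 23.0, 29.0],
})

# Pearson correlation coefficient (the default method of Series.corr)
r = toy["dep_delay_rate"].corr(toy["arr_delay_rate"])

print(f"correlation: {r:.3f}")
```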
✎ Put your answer here (1-2 sentences).
✅ Part 5.3 (6 points) How large and how variable are departure delays?
Using dep_delay_hours_mean and dep_delay_hours_std, list the top two months with the largest mean and the largest std; interpret briefly.
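Ranking rows by a column is conveniently done with DataFrame.nlargest, which returns the n rows with the largest values in the named column. The sketch below applies it to a toy table with invented statistics.

```python
import pandas as pd

# toy monthly statistics (made-up values, not the homework data)
toy = pd.DataFrame({
    "ym": ["2019-01", "2019-02", "2019-03", "2019-04"],
    "dep_delay_hours_mean": [0.20, 0.35, 0.28, 0.31],
    "dep_delay_hours_std": [0.05, 0.09, 0.12, 0.07],
})

# two rows with the largest mean, then two rows with the largest std
top_mean = toy.nlargest(2, "dep_delay_hours_mean")
top_std = toy.nlargest(2, "dep_delay_hours_std")

print(top_mean[["ym", "dep_delay_hours_mean"]])
print(top_std[["ym", "dep_delay_hours_std"]])
```

The two rankings need not agree: a month can have a moderate mean delay but a large spread, which is why the question asks for both.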
✎ Put your answer here (2-3 sentences).
✅ Part 5.4 (6 points) Winter vs Summer cancellation causes comparison.
Compare cancellation causes between a typical winter month and a typical summer month:
Find the month with the highest cancel_weather_pct (likely winter)
Find the month with the lowest cancel_weather_pct (likely summer)
For each of these two months, display:
The month (ym)
All four cancellation cause percentages: cancel_carrier_pct, cancel_weather_pct, cancel_nas_pct, cancel_security_pct
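Selecting two specific rows and a fixed set of columns can be sketched with .loc, passing a list of index labels and a list of column names. The toy values below are invented for illustration; only the column names follow the dataset description.

```python
import pandas as pd

cause_cols = [
    "cancel_carrier_pct",
    "cancel_weather_pct",
    "cancel_nas_pct",
    "cancel_security_pct",
]

# toy table with made-up shares in percent (not the homework data)
toy = pd.DataFrame({
    "ym": ["2019-01", "2019-04", "2019-07"],
    "cancel_carrier_pct": [35.0, 45.0, 50.0],
    "cancel_weather_pct": [50.0, 25.0, 15.0],
    "cancel_nas_pct": [14.0, 29.0, 33.0],
    "cancel_security_pct": [1.0, 1.0, 2.0],
})

# index labels of the months with the highest and lowest weather share
hi = toy["cancel_weather_pct"].idxmax()
lo = toy["cancel_weather_pct"].idxmin()

# show both rows side by side with the month and all four causes
comparison = toy.loc[[hi, lo], ["ym"] + cause_cols]
print(comparison)
```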
Then describe which cause(s) are higher in winter vs summer and give one plausible explanation for the weather pattern.
✎ Put your answer here (2-3 sentences).
What to include in your answer:
State the two months you found and their weather cancellation percentages
Compare the causes: Which cause(s) are higher in the winter month? Which are higher in the summer month?
Explain why weather cancellations differ: Why would weather cause more cancellations in winter than summer? (Think about: snow, ice, storms, visibility, extreme temperatures, etc.)
Example structure: “The month with highest weather cancellations was [month] with [X]% weather-related cancellations, while [month] had only [Y]% weather cancellations. In the winter month, [cause] was the dominant factor at [Z]%, compared to the summer month where [cause] was higher at [W]%. This pattern makes sense because winter weather conditions such as [specific weather phenomenon] are more likely to ground flights than typical summer conditions.”
Congratulations, you’re done!
Submit this assignment by uploading your notebook to the course Desire2Learn web page. Go to the “Homework” folder, find the appropriate submission link, and upload everything there. Make sure your name is on it!
© Copyright 2025, Department of Computational Mathematics, Science and Engineering at Michigan State University