CMSE 495

Michigan State University Data Science Capstone.

1. Case Study: Data Not Available

The capstone project was to determine if “justice is blind” by looking at legislation data from judges in Michigan. Students were given links to a website with a database and were able to successfully build a web scraper. Unfortunately, the data did not include key information (race and gender) needed to answer any of the research questions asked by the community partner. Some students felt the project “failed,” were frustrated, didn’t know how to proceed, and lost motivation.

  1. What could be done to recover from missing this key data?
  2. What could the team do to pivot?
  3. What could be added to your team charter that may help avoid this type of problem?

2. Case Study: Visit from the CEO

The team’s community partner was a very large corporation, and the students had the opportunity mid-semester to give a presentation to the CEO. This was a big deal, and the community partner contact asked the students to give a “practice” presentation a month before the CEO visit. The students were already stressed because this course is a lot of work, and neither of these presentations was required as part of the course milestones. The students figured that these presentations wouldn’t count toward their actual grade and delayed working on the practice presentation until the last minute. The practice went poorly, and the community partner contact was so unhappy with the students’ work that they cancelled the CEO visit. The students felt relieved that they didn’t have to talk to the CEO and continued on with the project. The students never told the instructor about the presentation requests, so the instructor was blindsided when the community partner contact complained about the students’ poor performance.

  1. What was the professional thing to do in this situation?
  2. What could the team have done differently to address their concerns about “extra” work?
  3. What could be added to your team charter that may help avoid this type of problem?

3. Case Study: Micromanaged and Misaligned

A student team was paired with a community partner for a semester-long consulting project. The partner was enthusiastic and met with the team every week, which initially seemed like a great sign. However, the meetings quickly turned into weekly task assignments given by the partner to the students. Instead of collaborating on a semester-long plan, the partner gave the students new instructions each week, often changing direction or adding new tasks without considering the team’s overall timeline or course deliverables.

The students tried to balance the partner’s weekly requests with the structured milestones required by the course. This led to long hours, duplicated work, and growing frustration. The team felt like they were constantly reacting instead of planning. They eventually reached out to the instructor for help.

After the instructor intervened, the situation improved slightly, but the partner began to express dissatisfaction with the students, accusing them of being unmotivated and unprofessional. The relationship became strained, and the team struggled to maintain a positive connection with the community partner while still meeting course expectations.

In the end, the team delivered a solid final product, but the experience left them feeling discouraged and unsure how to handle similar dynamics in the future.

Discussion Questions

  1. What are the risks of allowing a community partner to drive the project week-by-week?
  2. How could the team have communicated their course requirements and project timeline more effectively?
  3. What strategies could the team use to balance partner expectations with academic deliverables?
  4. What could be added to your team charter to help set boundaries and clarify roles early in the project?
  5. How can students maintain professionalism when a partner relationship becomes strained or unproductive?

4. Case Study: Team Conflict

Two weeks before the final project was due, the instructor received a long email from a student complaining about another team member who they felt was not contributing sufficiently to the project. The email was extremely detailed and documented various interactions that raised concerns within the team throughout the semester. However, this email was the first time that anyone told the instructor there were problems. The student wanted the team member to get a lower grade than everyone else, and wanted the instructor to intervene and tell that team member that they were not meeting expectations.

  1. What responsibilities do team members have for setting clear expectations and for holding each other accountable? How could the team have addressed concerns earlier in the semester?
  2. What is the appropriate role for the instructor when there are conflicts within a team?
  3. What could be added to your team charter that may help avoid this type of problem?

5. Case Study: Data Overload

The scope of this capstone project was to find freely available data to answer a set of research questions about “Smart Cities.” The problem was that there were too many data sources (for example, Wi-Fi networks, types and locations of businesses, and city websites), and none of them was organized with specific information that would answer any of the research questions directly. Cleaning and organizing the data by hand would have required many hours of manual labor, and even after that work was done there was no clear indication that the data would be sufficient to answer the research questions. Students became overwhelmed with the options and had no clear path forward. Unfortunately, the project’s community partners were not data experts and could not guide the team, who lost a lot of time continuing their search for the “perfect” dataset.

  1. What are some things the team could do to move forward on the project?
  2. What could be added to your team charter that may help avoid this type of problem?

6. Case Study: What is the best accuracy?

A capstone team was assigned a classification project (i.e., a project where they are given labeled data and need to use machine learning to train a classifier). After a discussion with the community partner, there was consensus that at least 85% accuracy was a reasonable target for what they needed.

The team got into trouble when they kept trying different algorithms with different hyperparameters and couldn’t get their accuracy above 72%. The team took a long time to tell the community partner there was a problem, and when they did, the community partner was very upset at such a terrible result and told the team to keep trying.

  1. What was the fundamental technical problem?
  2. Where were the failures of communication?
  3. What could be done to better manage expectations?
  4. What could be added to your team charter that may help avoid this type of problem?
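
One piece of context that can help a conversation like this is how the model compares with a trivial baseline, not only with the agreed target. The snippet below is a minimal sketch of that comparison. The 72% and 85% figures come from the case study; the 60/40 class balance and the function names are assumptions made up for illustration, since the case does not describe the data.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a 'model' that always predicts the most common label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def summarize(model_acc, baseline_acc, target_acc):
    """Put a model's accuracy in context for a status update."""
    return {
        "lift_over_baseline": model_acc - baseline_acc,
        "gap_to_target": target_acc - model_acc,
    }


# Hypothetical label distribution (not from the case study).
labels = ["yes"] * 60 + ["no"] * 40
baseline = majority_baseline_accuracy(labels)

# 0.72 and 0.85 are the figures quoted in the case study.
report = summarize(model_acc=0.72, baseline_acc=baseline, target_acc=0.85)
print(f"baseline accuracy:  {baseline:.2f}")
print(f"lift over baseline: {report['lift_over_baseline']:.2f}")
print(f"gap to target:      {report['gap_to_target']:.2f}")
```

Reporting numbers like these early, along with what has already been tried, gives the partner something concrete to react to before the gap becomes a surprise.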

7. Case Study: Not Enough Training Data

A community partner wanted to analyze high-dimensional 3D data of materials and build a regression model that connected the data to mechanical measurements. The project materials included a paper from the 1990s describing a similar study that relied on many expensive manual measurements (i.e., no 3D data) of materials and used simple linear correlation.

The community partner wanted to take advantage of automatic data gathering with 3D scanning technology and then use deep learning methods to build an automated model similar to the one in the 1990s paper. Unfortunately, we didn’t know until late in the semester that they only had seven (7) 3D models with 7 labeled points.

  1. What is the “curse of dimensionality” and how does it relate to this problem?
  2. What is the “rule of 10” and how does it relate to this problem?
  3. What could be done to better manage the community partner’s expectations?
  4. What could be done to recover from the lack of data?
  5. What could be added to your team charter that may help avoid this type of problem?
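
As a starting point for the “rule of 10” question, the arithmetic can be sketched in a few lines. This assumes one common reading of the rule of thumb (roughly ten labeled observations per predictor in a regression model). The seven samples come from the case study; the feature count of 1,000 is a hypothetical stand-in, since the case does not say how many features the 3D data produced.

```python
def min_samples_rule_of_ten(n_features, per_feature=10):
    """Labeled samples suggested by the rule of thumb for n_features predictors."""
    return n_features * per_feature


def max_features_rule_of_ten(n_samples, per_feature=10):
    """Predictors the rule of thumb would support with n_samples labeled points."""
    return n_samples // per_feature


n_samples = 7      # from the case study
n_features = 1000  # hypothetical; not stated in the case study

print(max_features_rule_of_ten(n_samples))   # 0
print(min_samples_rule_of_ten(n_features))   # 10000
```

Under this reading, seven labeled points do not support even a single predictor, which is worth knowing before any model is chosen.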

8. Case Study: Working Alone

Members of a team came to the instructors asking for help with a fellow teammate. The teammate was given the task of building a front-end GUI for their project. The problem wasn’t that the teammate was slacking in their role. In fact, you could say that the opposite was true: the teammate was working hard on the GUI, and the design looked impressive. The problem was that the teammate was not sharing everything the others needed to get the code working on their own machines and just wanted to work alone. The team was worried the GUI was getting too complex and that the teammate was not looping any of the other teammates into the design. Because the rest of the team didn’t understand how the GUI worked, they didn’t feel confident that their parts of the project (the back end) were going to fit in or even work with the front end. The teammate didn’t seem too concerned and just wanted to work independently from the rest of the team. The team decided to just let the teammate work alone and, behind the teammate’s back, built a second GUI as a backup. In the end, they submitted both solutions, which didn’t really work together.

  1. Does building a backup GUI seem like the right response to a rogue teammate?
  2. How should the instructor grade a working project from a team that ostracized one of their teammates?
  3. What are some other ways the team could have addressed their concerns with their teammate?
  4. What could be added to your team charter that may help avoid this type of problem?

9. Case Study: NDA Delays and Data Drought

A student team was excited to begin work on a data-driven project with a community partner. The project involved analyzing proprietary customer behavior data to help the partner improve their service delivery. However, early in the semester, the team learned that the university and the community partner had not yet finalized a Non-Disclosure Agreement (NDA).

The NDA was necessary before the partner could legally share any data. Weeks passed as legal teams on both sides negotiated terms. The students grew increasingly frustrated. Without the data, they felt stuck and unsure how to proceed. They spent time waiting, checking in with the partner, and worrying about falling behind. Eventually, the NDA was signed—but by then, the semester was halfway over.

Despite the delay, the team managed to deliver a final report, but it lacked depth and polish. In hindsight, they realized there were many things they could have done earlier to prepare for the data and make better use of their time.

Discussion Questions

  1. What are some productive activities the team could have done while waiting for the data?
  2. How could the team have better communicated with the community partner and instructors during the delay?
  3. What are some ways to prototype or simulate data when real data is unavailable?
  4. What could be added to your team charter to help manage expectations and timelines when legal or administrative delays occur?
  5. How might the team have used this time to better understand the domain, define metrics, or explore related literature?
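
For the question about prototyping without real data, one option is to generate placeholder records with the expected shape so that loading, cleaning, and plotting code can be written and tested ahead of time. The sketch below is only an illustration: the column names (`customer_id`, `visits`, `avg_spend`, `churned`) and value ranges are invented, and a real team would replace them with the schema agreed with the partner.

```python
import random


def make_synthetic_customers(n, seed=0):
    """Generate n fake customer records with a hypothetical schema.

    A fixed seed makes the fake data reproducible across teammates.
    """
    rng = random.Random(seed)
    rows = []
    for customer_id in range(n):
        rows.append({
            "customer_id": customer_id,
            "visits": rng.randint(0, 30),                    # visits per month
            "avg_spend": round(rng.uniform(5.0, 200.0), 2),  # currency units
            "churned": rng.random() < 0.2,                   # about 20% churn
        })
    return rows


# Print a few records to check the shape before building the pipeline.
for row in make_synthetic_customers(3):
    print(row)
```

Results computed on data like this say nothing about the real customers, but the pipeline built around it can be reused once the NDA is signed and the real data arrives.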

10. Case Study: Divergent Paths in Model Selection

A student team was tasked with building a classification model to support a community partner’s decision-making process. Early in the project, the team realized that there were many possible machine learning models to choose from. Uncertain about which model would perform best, they made a strategic decision: each team member would independently explore a different model. This approach encouraged exploration and individual initiative. However, as the semester progressed, the team struggled to maintain cohesion. Each student treated their model as a standalone project, using different preprocessing techniques, evaluation metrics, and reporting styles. By the time of the final presentation, the team had five separate solutions to the same problem, but no unified comparison or synthesis.

The final report reflected this fragmentation. Rather than presenting a cohesive narrative or a comparative analysis of model performance, the report consisted of five loosely connected sections. The team had not agreed on common evaluation criteria, making it impossible to determine which model was most effective. The community partner was left without clear guidance, and the team missed an opportunity to demonstrate collaborative data science practices.

Discussion Questions

  1. What were the benefits and drawbacks of the team’s “divide and conquer” approach to model selection?
  2. How could the team have structured their work to allow for both exploration and integration of results?
  3. What strategies could have helped the team align on evaluation metrics and reporting standards early in the project?
  4. How might the team have used regular check-ins or shared documentation to maintain cohesion throughout the semester?
  5. What could be added to your team charter to help ensure students were communicating on the project?
  6. If you were a member of this team, what steps would you take to ensure the final presentation and report told a unified story?
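
One way to keep independent exploration comparable is to agree on a small shared evaluation harness: a single fixed train/test split and a single metric that every model is run through. The sketch below shows the idea using only the Python standard library. The function names, the seed, the toy data, and the two stand-in “models” are all hypothetical; a real team would plug in their own models and whichever metric they agreed on with the partner.

```python
import random
from collections import Counter


def shared_split(rows, test_fraction=0.25, seed=495):
    """Return (train, test); the fixed seed gives every teammate the same split."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate(models, rows):
    """Score every model on the same split with the same metric.

    models maps a name to fit(train), which returns a predict(x) function.
    """
    train, test = shared_split(rows)
    y_true = [y for _, y in test]
    results = {}
    for name, fit in models.items():
        predict = fit(train)
        results[name] = accuracy(y_true, [predict(x) for x, _ in test])
    return results


# Two toy stand-ins for the teammates' models.
def fit_majority(train):
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def fit_threshold(train):
    mean_x = sum(x for x, _ in train) / len(train)
    return lambda x: int(x > mean_x)


# Toy data: (feature, label) pairs, invented for illustration.
rows = [(x, int(x > 10)) for x in range(20)]
results = evaluate({"majority": fit_majority, "threshold": fit_threshold}, rows)
for name, score in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

Because every model goes through the same `evaluate` call, the final report can show one comparison table instead of five separately defined results.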

11. Case Study: Uneven Skills, Uneven Progress

A student team was assigned a predictive modeling project for a community partner. While all members had some Python experience, their familiarity with Git and machine learning varied significantly. One student had prior experience with version control and model development, while others were still learning how to use Git effectively and were new to concepts like hyperparameter tuning and model evaluation.

The experienced student took on most of the technical work, including managing the Git repository and building the core model. Other team members contributed to documentation and presentation design but felt disconnected from the technical process. The team didn’t establish shared workflows or take time to build common understanding, which led to uneven contributions and limited collaboration.

During the final presentation, only one student could explain the modeling pipeline and Git workflow. The others struggled to engage with technical questions, and the final report reflected a lack of shared ownership. The project met its goals, but the team missed an opportunity for inclusive learning and skill development.

Discussion Questions

  1. How could the team have identified and addressed skill gaps early in the project?
  2. What strategies could help ensure that all team members are engaged in technical work, even if their skills differ?
  3. How might peer mentoring or skill-sharing sessions have improved the team’s experience and final product?
  4. What role should instructors or mentors play in supporting teams with uneven skill levels?
  5. If you were the technically skilled student, how would you balance contributing to the project with supporting your teammates’ learning?
  6. What could be added to your team charter to help ensure students learn and contribute equally to the project?

12. Case Study: Tool or Platform Failure

A student team was excited to use a cutting-edge large language model (LLM) for their community partner project. The model promised advanced capabilities for text classification and summarization, and two students strongly advocated for it, citing its popularity and recent breakthroughs. Since the tool required a paid license, the team submitted a short proposal to the instructor, which was approved in about a week.

Once they gained access, the team began integrating the model into their workflow. However, after a few weeks of experimentation, they discovered that the model’s API was more limited than expected. It didn’t support key features they had assumed were available, and adapting it to their domain-specific dataset proved difficult. The team had invested heavily in this tool and didn’t have a backup plan.

As the semester progressed, they struggled to pivot. The final presentation focused more on the challenges they faced than on actionable results. The community partner appreciated their effort but was left without a usable solution. The team reflected that their decision had been driven more by excitement than by careful evaluation.

Discussion Questions

  1. What could the team have done to mitigate the risk of relying on a tool they hadn’t fully tested?
  2. How might early prototyping with alternative tools have helped the team stay on track?
  3. What role did hype and novelty play in the team’s decision-making?
  4. How should teams balance innovation with practicality when selecting tools?
  5. What could be added to your team charter to help avoid groupthink?

13. Case Study: Meticulous Planning

A student team was assigned a semester-long data science project with a community partner. The team was on top of the course milestones, completing some of them early and even taking breaks to celebrate their progress. They had strong communication, divided assigned tasks fairly, and supported each other throughout the semester.

However, the team treated the course milestones as their primary roadmap and didn’t develop their own internal timeline. When they reached the Minimum Viable Product (MVP) milestone, their submission was extremely weak and consisted more of a rough sketch than a usable prototype. They hadn’t realized how much work would be needed after the MVP to refine, validate, and prepare the final deliverables.

As the semester neared its end, they found themselves short on time. Despite their strong collaboration and steady progress, the final product felt rushed. The community partner received a working solution, but the team knew it could have been more impactful with better planning. They had done everything “right” on paper, but hadn’t taken full ownership of the project timeline.

Discussion Questions

  1. Why isn’t following course milestones enough to ensure a successful project outcome?
  2. What could this team have done differently to plan ahead while still using the milestones as a guide?
  3. How can teams build in buffer time for unexpected challenges or refinement?
  4. What are the risks of taking breaks too early in a project, even if you’re ahead of schedule?
  5. What could be added to your team charter to help avoid this problem?

Written by Dr. Dirk Colbry, Michigan State University
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.