CMSE 495

Michigan State University Data Science Capstone.

View the Project on GitHub msu-cmse-courses/cmse495-SS24

In-Class Assignment: Tutorial Review

All projects you will find “in the real world” require you to learn something. Knowing how to learn something new is a key learning goal of this class. To help with the many skills you may need for your project, this capstone course maintains a collection of student-generated “tutorials”, which can be found in the following repository:

We will be working on these during Friday classes for the rest of the semester. In today’s assignment we would like you to just review what has been done. There are three major goals for this review.

  1. Learn how to better use git to interact with code.
  2. Learn what content is available in this repository that may help you with your projects.
  3. Identify ways the repository can be used to help future students.

Your team’s task for the day is to go through one of the tutorials (assigned to each team by the instructors), find bugs, issues, or possible improvements, and then submit an “issue” to the git repository.

Agenda (80 Minutes)


1. Group Review of the Data Tools Tutorial Demo (DTTD)

The following is a list of teams and their assigned tutorials from previous semesters.

| Team | Tutorials |
| --- | --- |
| CEPI - Anomaly Detection | GAMA_AutoML_Tutorial.ipynb, FFmpegDemo.ipynb |
| City of Grand Rapids - Social Impact | pcatutorial.ipynb, FuzzyWuzzy.ipynb |
| Ford - Defect Prediction | Video-Image-Data-Tutorial, Seaborn_Tutorial_DTTD.ipynb |
| HAP - Synthetic Data Generation | Classification.ipynb, Auto_Cropping_Image_Tutorial |
| HFH - Revenue Cycle Prediction | Streamlit, AudioDataTutorial.ipynb |
| ICER - User Data Analytics | tpot_tutorial.ipynb, GoogleSheetsTutorial.ipynb |
| Intramotev - Automated Video Data Labeling for Autonomous Trains | GoogleSheetsTutorial1.ipynb, BeautifulSoup.ipynb |
| Kellanova - Point of Sale Analysis | GUI_Tutorial.ipynb, MorphologicalOperators_Tutorial |
| QSIDE - SToPA: MultiTown Data Analysis | image_thresholding_tutorial, censusdata_package_tutorial |
| Techsmith - Healthy and Engaged User Data Exploration | Create_a_python_package.ipynb, DTTD_PowerBI_Tutorial.ipynb |
| Tribal Start Program - Tribal Early Childhood Research Data | PyTorch_tutorial.ipynb, DTTD_Python_Package.ipynb |
| TwoSix - LLM to Graphs | DTTD_Tutorial_Widgets-D2LAPITeam.ipynb, socail_media_scrapper |

Your group is expected to review all of the tutorials. However, today we will start with just this small set. As a group do the following:

  1. Clone the DTTD repository.
  2. Follow the directions and get your tutorial “working”.
  3. Add “issues” to the git issue list for all things that need to be improved in the tutorial. Make sure the issue is well written and clearly states the file/tutorial that needs fixing. Every student should add an issue.
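
The first two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the repository URL is not given on this page, so `REPO_URL` is a placeholder you must replace, and `clone_command`, `clone_repo`, and `find_notebooks` are hypothetical helper names made up for this sketch. It also assumes `git` is installed and available on your PATH.

```python
import subprocess
from pathlib import Path

# Placeholder (assumption): this page does not state the repository URL.
# Replace it with the DTTD repository address provided by your instructors.
REPO_URL = "<DTTD-repository-URL>"


def clone_command(repo_url: str, dest: str) -> list[str]:
    """Build the `git clone` command as an argument list (nothing is run here)."""
    return ["git", "clone", repo_url, dest]


def clone_repo(repo_url: str, dest: str) -> None:
    """Run `git clone`; raises CalledProcessError if git reports a failure."""
    subprocess.run(clone_command(repo_url, dest), check=True)


def find_notebooks(repo_dir: str, name_fragment: str) -> list[Path]:
    """Return notebooks under repo_dir whose file name contains name_fragment."""
    fragment = name_fragment.lower()
    return sorted(
        p for p in Path(repo_dir).rglob("*.ipynb") if fragment in p.name.lower()
    )
```

Once `REPO_URL` is filled in, calling `clone_repo(REPO_URL, "DTTD")` and then `find_notebooks("DTTD", "seaborn")` would list any notebook in the clone whose name contains "seaborn". Tutorials that are folders rather than `.ipynb` files would need to be located by hand.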

Notes for making Git Issues

GitLab and GitHub have a simple mechanism for reporting “issues” inside the repository. Go to the DataTools Tutorial Demo and click on the “issues” button at the top. Once there, you can read through the current issues and make new ones by pressing the green “new issues” button on the top right. When creating an issue, include plenty of detail: be very specific about which file has the problem and what you know needs to be done to fix it. Please consider the following when submitting an issue:
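
One way to keep an issue specific is to draft it from a small template before pasting it into the issue form. The sketch below is illustrative only: `IssueDraft` and its fields are hypothetical names invented here, not part of GitHub, GitLab, or the course repository, and the layout of the body is just one reasonable choice.

```python
from dataclasses import dataclass


@dataclass
class IssueDraft:
    """Hypothetical container for the details a useful issue should include."""

    tutorial_file: str   # file or folder that has the problem
    summary: str         # one-line description of the problem
    steps: list[str]     # what you did before the problem appeared
    suggested_fix: str   # what you think needs to change

    def title(self) -> str:
        # Lead with the file name so the issue list is easy to scan.
        return f"[{self.tutorial_file}] {self.summary}"

    def body(self) -> str:
        # Number the reproduction steps so a reader can follow them in order.
        numbered = "\n".join(
            f"{i}. {step}" for i, step in enumerate(self.steps, start=1)
        )
        return (
            f"File: {self.tutorial_file}\n\n"
            f"Problem:\n{self.summary}\n\n"
            f"Steps to reproduce:\n{numbered}\n\n"
            f"Suggested fix:\n{self.suggested_fix}\n"
        )
```

For example, with made-up details, `IssueDraft("ExampleTutorial.ipynb", "Install instructions missing", ["Open the notebook", "Run the first cell"], "Add a list of required packages")` gives a title and body that name the file, describe the problem, and propose a fix, which you can then copy into the issue form.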


Getting Credit for this assignment

Each member of the team should author at least one git issue. More is better, but help each other out and try to write good-quality issues that have substance and are not redundant or just filler. There is always something that is missing or needs improvement.

NOTE: I realize we are using a lot of jargon. This is normal when you start a new job. Please research anything you don’t understand and talk with your team. Come to your instructor with questions if you can’t figure out something together.

Written by Dr. Dirk Colbry, Michigan State University
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.