This is the webpage for CMSE495 Data Science Capstone Course. These materials are provided as an Open Educational Resource (OER). Instructors interested in using these classroom resources should reach out to Dirk Colbry (colbrydi@msu.edu) who can provide all the materials and instructor notes.
Every project you will find “in the real world” requires you to learn something new. Knowing how to learn something new is a key goal of this class. To help with this learning, this capstone course maintains a collection of student-generated “tutorials”, which can be found in the following repository:
We will be working on these for the rest of the semester. In today’s assignment we would like you to just review what has been done. There are three major goals for this review.
The goal for today’s project is quite simple. Go through one of the tutorials (assigned to each team by the instructors), find bugs, issues or possible improvements, and then submit an “issue” to the git repository.
The following is a list of teams and their assigned tutorials from previous semesters.
Team | Tutorial |
---|---|
Boeing - Defect Prediction | GAMA_AutoML_Tutorial.ipynb |
CEPI - Anomaly Detection | GUI_Tutorial.ipynb |
D2L - Instructor API | GoogleSheetsTutorial.ipynb |
Henry Ford Hospital - Image Segmentation | Video-Image-Data-Tutorial |
Intramotev - Autonomous Vehicles | tpot_tutorial.ipynb |
Kellogg - Deduction Classification | Zotero_Instructions.ipynb |
Kinesiology - 40 yard dash | AudioDataTutorial.ipynb |
NC3 - Community Capital | tpot_tutorial.ipynb |
Neogen - Pesticide Analysis | Auto-SKLearn_AutoML |
Olson - Campaign finance Data | social_media_scraper |
QSIDE - Human trafficking | censusdata_package_tutorial |
QSIDE - Justfare Toolbox | GUI_Tutorial.ipynb |
Your group is expected to review all of the tutorials. However, today we will start with just one. As a group do the following:
GitLab and GitHub have a simple mechanism for reporting “issues” inside a repository. Go to the DataTools Tutorial Demo and click on the “issues” button at the top. Once there, you can read through the current issues and make new ones by pressing the green “new issues” button on the top right. When creating an issue, put in a lot of detail: be very specific about which file has the problem and what you know needs to be done to fix it.
Bad issues are things like “this is confusing” or “needs more”.
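As a contrast to the vague examples above, here is a minimal sketch of one way to draft a detailed issue before pasting it into the issue form. The function `format_issue` and its field names are hypothetical and not part of the course materials, and the parenthesized values are placeholders to be replaced with your own findings.

```python
def format_issue(tutorial_file, location, problem, suggested_fix):
    """Return (title, body) text for a detailed issue report.

    Hypothetical helper: the field names are only one possible layout.
    """
    fields = {
        "tutorial_file": tutorial_file,
        "location": location,
        "problem": problem,
        "suggested_fix": suggested_fix,
    }
    # Refuse empty fields so a report like "this is confusing" with no
    # file or proposed fix cannot be produced by accident.
    for name, value in fields.items():
        if not value.strip():
            raise ValueError(f"{name} must not be empty")

    # Use the first line of the problem description as a short title.
    title = f"{tutorial_file}: {problem.splitlines()[0]}"
    body = "\n".join([
        f"**File:** {tutorial_file}",
        f"**Where:** {location}",
        "",
        "**Problem**",
        problem,
        "",
        "**Suggested fix**",
        suggested_fix,
    ])
    return title, body


# Placeholder values only; fill these in from your own review.
title, body = format_issue(
    tutorial_file="tpot_tutorial.ipynb",
    location="(cell or section where the problem appears)",
    problem="(what you ran, what you expected, what happened instead)",
    suggested_fix="(what you know needs to be done to fix it)",
)
print(title)
print(body)
```

The printed title and body can be copied into the issue form by hand. The point of the sketch is only that every issue names a file, a location, a problem, and a proposed fix.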
Each member of the team should author at least one git issue. More is better, but help each other out and aim for good-quality issues that have substance and are not redundant or just filler. There is always something that is missing or needs improvement.
NOTE: I realize we are using a lot of jargon. This is normal when you start a new job. Please research anything you don’t understand and talk with your team. Come to your instructor with questions if you can’t figure out something together.
Written by Dr. Dirk Colbry, Michigan State University
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.