CMSE401 Project Proposal#

In this milestone project you will write a project proposal and submit it using a git repository that you create and share with your instructor. This repository will be used for the remainder of the course to turn in all project-related files.

  1. Project overview

  2. Project Proposal

  3. Setting up your GIT repository

  4. Turning in your Project


1. Project overview#

Before you begin, you need to do some research about the software and problems you want to explore. If you are having trouble, please meet with your instructor, either during office hours or by appointment.

Class projects will include three key milestones:

  • Milestone 1: Project Proposals - (This assignment) An overview of what you would like to do for your project: what software you will explore, what science problem you will solve, and what outcomes you expect. This milestone also includes creating a git repository.

  • Milestone 2: Software Exploration - You will explore software NOT covered in this course. Software may include a library, programming language, hardware, scientific code, etc. The project will require you to learn about the software, get it installed and working on the HPC (or equivalent), and write up a short tutorial with examples detailed enough that your peers can get it up and running. See 21p-PROJECT_Part1 for more information.

  • Milestone 3: Benchmark and Optimization - You will identify an existing scientific/engineering problem that currently runs slowly. You will either write a new program from scratch or modify an existing code to run in parallel. You may also modify an existing parallel code to run faster. You may, but are not required to, use the software you explored in Milestone 2 of the project. See 0415-PROJECT_Part2 for more information.

Software ideas for Milestone 1#

OpenFlow
BLAST
PETSc
OpenMM
MKL
BLAS
FFTW
Kokkos
Charm++
Python Numba
FPGA
Tensorflow
GPU
OpenACC
OpenCV
MakeFlow
Hadoop
Torch
Twister
Condor
XSEDE

2. Project Proposal#

Please write a 1-page proposal about what you plan to do for your project. If you do not have any ideas you should go talk with your instructor. The proposal should consist of the following:

  • Project title

  • Motivating picture

  • Abstract - summary of why you picked your project and what it is going to cover

  • Schedule - Weekly schedule of what you plan on doing (so you don’t fall behind)

  • Software Exploration - Which application did you pick out for your software exploration

  • Benchmark and optimization - What problem are you trying to make go faster

The following is a basic template you can use for your project proposal. Your instructor highly recommends using either Markdown or Jupyter notebooks when writing your proposal.

—- START TEMPLATE —-

Project Title#

By “Your name”

✅ Replace the following with a picture that “defines” your project. This could be software logo, an expected outcome of your project, or a graphical representation of the research area.

Simple Icon of a camera. This is just a place holder for your image

Image from: http://simpleicon.com/


Abstract#

✅ Provide a short paragraph about the domain you are going to cover in your project (research area, sports, economics, etc). Explain why you picked this domain (i.e. what is your motivation). Explain how computation is used in this domain. Give a short description of the software area you want to explore, what hardware you are going to use during this exploration, what you hope to benchmark, what you hope to speed up, and what will count as a successful outcome.


Schedule#

✅ Provide a schedule for completing all of your milestones on time. Include what you plan to work on each week from now until the projects are due:

  • Sunday February 5 - Project Proposal Milestone Due

  • Week of February

  • Week of February

  • Week of February

  • Sunday February 26 - Project Part 1 Due

  • Week of March

  • Week of March

  • Week of March

  • Week of March

  • Week of April

  • Week of April

  • April 15 - Final Project due


Part 1 Software Exploration#

✅ Provide a more detailed paragraph about the software you are going to review for the first part of the assignment. Again, this will vary based on the project, but provide any links (references) and information on where you plan to start. Also describe what you expect as the outcome of this step. Will you have a list of instructions for running the code on the HPC, or maybe a folder in a format compatible with getexample?


Part 2 Benchmark and Optimization#

✅ Explain what code you are going to benchmark. Do you know if it can already be run in parallel? If so, what types of things are you going to look for to optimize the throughput of the software? How will you evaluate the software to know if the project is successful? What do you hope/expect to deliver as a successful outcome in your final report? For example, “I plan to implement OpenMP and demonstrate a 20x speedup in the code” or “I plan to identify a bottleneck in the code and demonstrate a 4x speedup of the slow section.”

—- END TEMPLATE —-
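
A speedup target such as the 4x or 20x examples in the template's Part 2 section is usually supported by timing the code before and after a change. The following is a minimal sketch of that measurement, not a required method. The two `sum_of_squares` functions are hypothetical stand-ins for a "slow" and an "optimized" version of a kernel, and `time_call` and `speedup` are helper names invented for this illustration.

```python
import time


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over several runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds, optimized_seconds):
    """Speedup factor: how many times faster the optimized version is."""
    return baseline_seconds / optimized_seconds


# Hypothetical stand-in for the original (slow) code.
def slow_sum_of_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


# Hypothetical stand-in for the optimized code (closed-form sum of 0^2..(n-1)^2).
def fast_sum_of_squares(n):
    return (n - 1) * n * (2 * n - 1) // 6


n = 200_000
# Check that both versions agree before comparing their speed.
assert slow_sum_of_squares(n) == fast_sum_of_squares(n)

t_slow = time_call(slow_sum_of_squares, n)
t_fast = time_call(fast_sum_of_squares, n)
print(f"baseline: {t_slow:.6f} s, optimized: {t_fast:.6f} s")
if t_fast > 0:
    print(f"speedup: {speedup(t_slow, t_fast):.1f}x")
```

Taking the best of several runs reduces noise from other activity on the machine. Checking that both versions produce the same answer comes first, since a faster program that gives different results is not a successful optimization.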


3. Setting up your GIT repository#

To submit your proposal you are going to create a project folder, commit your proposal to the folder, and share your git repository with your instructor. The following videos from the Getting to know git (Tutorial) may be helpful when setting up your repository.

# Git init introduction https://www.youtube.com/playlist?list=PLqPfbT7gwVP_AlE6HeDQUJsG4nUbGyeh3
from IPython.display import YouTubeVideo
YouTubeVideo("IAAv4DjYYUA",width=640,height=360, cc_load_policy=True)

The following are instructions for how to use the more general GitHub:

# Initializing a repository using GitHub
from IPython.display import YouTubeVideo
YouTubeVideo("dpeHlFm8SYU",width=640,height=360, cc_load_policy=True)

As you update and change the files in your repository you will need to push those changes to GitHub. The following instructions walk you through this process:

# Git Add Commit https://www.youtube.com/playlist?list=PLqPfbT7gwVP_AlE6HeDQUJsG4nUbGyeh3

from IPython.display import YouTubeVideo
YouTubeVideo("GTM-h5xX2Lk",width=640,height=360, cc_load_policy=True)

What not to include (building a .gitignore file)#

The first thing we want to teach is that not everything should go into a git repository; we do not want to bloat our repository with unwanted files. A git repository works best with text files that represent “source” code, not compiled or generated code. Here are some basic guidelines for what not to include:

  • .ipynb_checkpoints - These folders are generated when you run Jupyter notebooks. They are “temporary” folders that will change each time you run your notebook and should not be included in your repository.

  • __pycache__ - Similar to the Jupyter checkpoint folders, these folders are often generated when running Python scripts and should not be included in your repository.

  • Other “Temporary” files - Temporary files are generated by all types of software and often start with a special character such as the dot (.) or the tilde (~). For example, many text editors generate temporary files to save a document in case of a program crash. Do not include temporary files in your repository.

  • Compiled Code - Programs written in languages such as C and Fortran must be compiled to an executable in order to run on your computer. These compiled files are not editable and should be left out of your repository. Instead, it is better to include instructions for compiling the source code as part of your repository.

  • Program Output - Do not include any program output in your repository (unless for very specific reasons such as documentation, testing, or figures in your final report). Assume that any output that can be generated by the source code should not be included with the source code (it is redundant).

A good rule of thumb is that if you did not generate the file and/or do not know what it is you probably do NOT want to include it in your repository.

WARNING: Do not blindly add all files to your repository with the * (star) syntax. This is bad practice. For example, do NOT do the following:

git add * #THIS IS BAD!!!!
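
Instead of adding everything, one option is to review which files match the guidelines above before staging anything. The following is a minimal sketch of such a review, under stated assumptions: `find_unwanted` is a helper name invented here, and the suffix list (editor backups ending in `~`, `.pyc`, `.o`, `.swp`) is only an illustrative guess at common temporary and compiled files, not a complete list.

```python
import os
import tempfile

# Folder names from the guidelines above.
UNWANTED_DIRS = {".ipynb_checkpoints", "__pycache__"}
# Assumed examples of temporary or compiled file suffixes; adjust for your project.
UNWANTED_SUFFIXES = ("~", ".pyc", ".o", ".swp")


def find_unwanted(root="."):
    """Return sorted relative paths under root that probably should not be committed."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own metadata folder.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in list(dirnames):
            if name in UNWANTED_DIRS:
                flagged.append(os.path.relpath(os.path.join(dirpath, name), root))
                dirnames.remove(name)  # no need to look inside a flagged folder
        for name in filenames:
            if name.endswith(UNWANTED_SUFFIXES):
                flagged.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(flagged)


# Small demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, "__pycache__"))
    for name in ("analysis.py", "notes.txt~"):
        open(os.path.join(demo, name), "w").close()
    print(find_unwanted(demo))
```

After reviewing the list, files can be staged by name (for example `git add analysis.py`, where the file name is hypothetical) rather than with a wildcard.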

Other files to avoid#

In addition to the above files, it is good to avoid any type of “Binary” file (with a few exceptions). As stated earlier, git works best with text files so it can easily track changes. Some example binary files to avoid include:

  • Large data files - Although it is good to include a few example inputs to your software, avoid including entire datasets. It is best to store these files someplace else.

  • Non-text formats - Word, Excel, or PowerPoint documents should be avoided. These tend to change each time they are opened even if the core text does not change. It is better to use a text-based alternative.

Note: one exception to the above rules is image files (e.g. jpg or png) that are used in markdown files or in the documentation. It is typically okay to include these since they tend to get added only once and do not change much as the project evolves.

The .gitignore (typically read “dot git ignore”) is a text file that contains a list of patterns, similar to shell wildcards (we will learn more about these later), that specify names of files we do not want to include in a git repository.

.gitignore file#

The .gitignore (read “dot git ignore”) file is used to help keep unwanted files out of your project. Each line of the .gitignore file is a filename or pattern you want git to ignore. For example, based on what we said above, a good place to start on your .gitignore file would be the following two lines:

.ipynb_checkpoints
__pycache__

What should go into a .gitignore depends a lot on the type of project. However, you don’t need to invent these from scratch. For example, you could copy the .gitignore file from the course repository or find one on the internet. If you are using GitHub, it can also create a .gitignore file for you when you create the repository if you select the Python template.
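
As a rough illustration of how such a file could be generated and sanity-checked, here is a minimal sketch. The pattern list is only an assumed starting point, and `write_gitignore` and `roughly_ignored` are helper names invented for this example. The check uses Python's `fnmatch` module, which only approximates git's matching rules (it ignores directory-only patterns, negation, and nested paths), so treat it as a quick test of simple names rather than a replacement for git itself.

```python
import fnmatch
from pathlib import Path

# Assumed starter patterns; extend or trim for your own project.
STARTER_PATTERNS = [
    ".ipynb_checkpoints/",
    "__pycache__/",
    "*.pyc",
    "*~",
]


def write_gitignore(folder, patterns=STARTER_PATTERNS):
    """Write the patterns to folder/.gitignore (overwrites an existing file)."""
    path = Path(folder) / ".gitignore"
    path.write_text("\n".join(patterns) + "\n", encoding="utf-8")
    return path


def roughly_ignored(name, patterns=STARTER_PATTERNS):
    """Approximate check of one file or folder name against the patterns."""
    return any(fnmatch.fnmatchcase(name, p.rstrip("/")) for p in patterns)


for candidate in ("__pycache__", "module.pyc", "report.md~", "analysis.py"):
    print(f"{candidate:15} ignored: {roughly_ignored(candidate)}")
```

Calling `write_gitignore(".")` from the top folder of a repository would create the file there. It is left out of the demonstration above so that running the sketch does not overwrite an existing .gitignore.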

#Hidden files https://www.youtube.com/playlist?list=PLqPfbT7gwVP_AlE6HeDQUJsG4nUbGyeh3

from IPython.display import YouTubeVideo
YouTubeVideo("kzI-mPSY8y4",width=640,height=360, cc_load_policy=True)

Avoid Spaces in file names#

When you name the files and folders inside a repository, it is important that your names DO NOT include spaces. Although all modern computers have ways to accept names with spaces, do not use them. Instead use underscores (_) or CamelCase (no spaces and a capital letter at the beginning of each word in the name). Avoiding spaces in your names will ALWAYS save time in the long run.
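
One way to check a project folder against this rule is sketched below. It only reports names and suggests replacements; it does not rename anything. The helper names `to_camel_case` and `names_with_spaces` are invented for this illustration.

```python
import os


def to_camel_case(name):
    """Remove spaces and capitalize the first letter of each word."""
    return "".join(word[:1].upper() + word[1:] for word in name.split())


def names_with_spaces(root="."):
    """Return sorted (relative_path, underscore_suggestion) pairs for names containing spaces."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own metadata folder.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in dirnames + filenames:
            if " " in name:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                results.append((rel, name.replace(" ", "_")))
    return sorted(results)


print(to_camel_case("my data file.csv"))
for path, suggestion in names_with_spaces("."):
    print(f"{path!r} -> consider {suggestion!r}")
```

Reporting instead of renaming is a deliberate choice in this sketch: files already tracked by git are better renamed with `git mv` so that the history follows the file.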


Always Use Relative Paths#

In your code there are two basic ways to specify the location of a folder on your computer: relative paths and absolute paths. A relative path is a path starting from your current directory, and an absolute path is a path starting from your computer’s “root” directory.

  • Relative paths typically start with a single dot (.), representing the current directory, or two dots (..), representing the current directory’s parent folder.

  • Absolute paths typically start at the global root directory (/) on a Linux or Mac machine or with a drive label (e.g. C:) on a Windows machine.

ALWAYS use relative paths in your git repository. This ensures that others will be able to use your software if they download it onto their computer. For example:

Good: ./data/  or ../data/ is a relative path to a child directory or sibling directory called data. 
Bad (not acceptable): C:/research/data or /mnt/home/data are absolute paths to a data directory
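
The good and bad examples above can be checked programmatically. The following is a minimal sketch using Python's standard `pathlib` module; `is_relative` is a helper name invented here, and `example_input.csv` is a hypothetical file name. The check tests both Linux/Mac and Windows conventions so that it gives the same answer on any machine.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath


def is_relative(path_text):
    """True when the path is absolute on neither Linux/Mac nor Windows."""
    return not (
        PurePosixPath(path_text).is_absolute()
        or PureWindowsPath(path_text).is_absolute()
    )


for example in ("./data/", "../data/", "C:/research/data", "/mnt/home/data"):
    label = "relative (good)" if is_relative(example) else "absolute (avoid)"
    print(f"{example:20} {label}")

# Building a relative path in code; the file name is hypothetical.
data_file = Path("data") / "example_input.csv"
print(data_file)
```

Building paths with `Path(...) / ...` also avoids hard-coding a forward slash or backslash, which helps the same code run on different operating systems.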

Jupyter notebook files in git repositories#

It turns out that Jupyter notebook files and git repositories work very poorly together. Jupyter notebook files are a unique combination of source and program-generated information. So every time you run a Jupyter file it can add output cells, which makes git think you changed something important. In many cases it is just a few numbers or some output text. When you run the git status command, it can look like Jupyter notebook files have changed even when the source has not.

A good rule of thumb is to clear all of the output before committing any changes to Jupyter notebook files:

  • Open the jupyter notebook file

  • Select “Restart Kernel and Clear All Outputs” from the menu (the exact wording varies between Jupyter versions)

  • Save the notebook file.

  • Do your “git add” and “git commit” commands
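
The manual steps above can also be approximated in code. The following is a minimal sketch, assuming the notebook uses the common version 4 file format, where the file is JSON with a top-level `cells` list and each code cell has `outputs` and `execution_count` fields. The function names are invented for this illustration, and the menu option described above remains the simplest route.

```python
import json


def clear_outputs(notebook):
    """Remove outputs and execution counts from a notebook dictionary in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path):
    """Load a .ipynb file, clear its outputs, and write it back to the same path."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")


# Tiny in-memory example with one code cell and one markdown cell.
example = {
    "cells": [
        {"cell_type": "code", "source": ["1 + 1"], "execution_count": 3,
         "outputs": [{"output_type": "execute_result"}], "metadata": {}},
        {"cell_type": "markdown", "source": ["Some notes"], "metadata": {}},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
clear_outputs(example)
print(example["cells"][0]["outputs"], example["cells"][0]["execution_count"])
```

Only code cells are touched; markdown cells and the source of every cell are left unchanged, so the cleared notebook still runs the same way.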

The following video goes through why we have to treat Jupyter notebooks this way:


# Jupyter vs. Git -- https://www.youtube.com/playlist?list=PLqPfbT7gwVP_AlE6HeDQUJsG4nUbGyeh3
from IPython.display import YouTubeVideo
YouTubeVideo("79hW_TzLos8",width=640,height=360, cc_load_policy=True)

4. Turning in your Project#

To turn in your git repository, you just need to give the instructors and classmates permission to clone the repository and provide the full git clone command. Please use the following form to submit this information.

Or if you prefer, you can use the Direct Link

from IPython.display import HTML
HTML(
"""
<iframe 
	src="https://forms.microsoft.com/Pages/ResponsePage.aspx?id=MHEXIi9k2UGSEXQjetVofbihPqVa-WtNjOGYhCwpOgRUMlhQQlY0WE04SzUyRVhWR1NDRFRZSzdOMS4u" 
	width="100%" 
	height="1100px" 
	frameborder="0" 
	marginheight="0" 
	marginwidth="0">
	Loading...
</iframe>
"""
)

5. Rubric#

Below is the grading rubric that will be used to assess your project proposals. Please use the following as a guide when preparing your proposal:

(50 points) Proposal
    - (5 points) Is there a project title?
    - (5 points) Is there a motivating picture included and referenced?
    - (10 points) Is the abstract well written and easy to understand?
    - (5 points) Are there due dates and milestones in the schedule?
    - (10 points) Are the goals for Part1 clear from the proposal, including links and references?
    - (10 points) Are the goals for Part2 clear from the proposal, including links and references?
    - (5 points) Is the proposal written as an ipynb or md file?

(50 points) Correctly setting up your git repository
    - (10 points) Was everything turned in on time?
    - (10 points) Is there a gitignore file (does it work)?
    - (10 points) Are the permissions to the git repository set up correctly?
    - (5 points) Does the project use correct Filenames?
    - (5 points) Does the project have any temporary or hidden files that should not be included?
    - (10 points) Were all directions followed?
    
100 points total

Congratulations, you are done!#

Make sure your report is added to your GitHub repository as well as D2L. You should also make sure that you have added the instructors as collaborators on your project repository so that we can access your project.

Written by Dr. Dirk Colbry, Michigan State University (Updated by Dr. Nathan Haut in Spring 2025)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.