
CMSE401 Project Proposal

In this milestone project you will write a project proposal and submit it using a git repository that you create and share with your instructor. This repository will be used for the remainder of the course to turn in all project-related files.

  1. Project overview
  2. Project Proposal
  3. Setting up your GIT repository
  4. Turning in your Project

1. Project overview

Before you begin, you need to do some research about the software and problems you want to explore. Your instructor has a lot of experience helping students find projects that are fun, interesting, and meet all of the class requirements. Please meet with your instructor, either during office hours or by appointment.

Class projects will include three key milestones:

Software ideas for Milestone 1

OpenFlow
BLAST
PETSc
OpenMM
MKL
BLAS
FFTW
Kokkos
Charm++
Python Numba
FPGA
TensorFlow
GPU
OpenACC
OpenCV
MakeFlow
Hadoop
Torch
Twister
Condor
XSEDE

2. Project Proposal

Please write a 1-page proposal about what you plan to do for your project. If you do not have any ideas you should go talk with your instructor. The proposal should consist of the following:

The following is a basic template you can use for your project proposal. Your instructor highly recommends using either Markdown or Jupyter notebooks when writing your proposal.

---- START TEMPLATE ----

Project Title

By "Your name"

✅ Replace the following with a picture that "defines" your project. This could be a software logo, an expected outcome of your project, or a graphical representation of the research area.

Simple icon of a camera. This is just a placeholder for your image.

Image from: http://simpleicon.com/


Abstract

✅ Provide a short paragraph about the domain you are going to cover in your project (research area, sports, economics, etc.). Explain why you picked this domain (i.e., what is your motivation). Explain how computation is used in this domain. Give a short description of the software you want to explore, what hardware you are going to use during this exploration, what you hope to benchmark, what you hope to speed up, and what a successful outcome will be.


Schedule

✅ Provide a schedule for completing all of your milestones on time. Include what you plan to work on each week from now until the projects are due:


Part 1 Software Exploration

✅ Provide a more detailed paragraph about the software you are going to review for the first part of the assignment. Again, this will vary based on the project, but provide any links (references) and information on where you plan to start. Also describe what you expect as the outcome of this step. Will you have a list of instructions for running the code on the HPC, or maybe a folder in a format compatible with getexample?


Part 2 Benchmark and Optimization

✅ Explain what code you are going to benchmark. Do you know if it can already be run in parallel? If so, what types of things are you going to look for to optimize the throughput of the software? How will you evaluate the software to know if the project is successful? What do you hope/expect to deliver as a successful outcome in your final report? For example, "I plan to implement OpenMP and demonstrate a 20x speedup in the code" or "I plan to identify a bottleneck in the code and demonstrate a 4x speedup of the slow section."

---- END TEMPLATE ----
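
A speedup goal like the ones in Part 2 of the template is usually reported as the baseline run time divided by the optimized run time. The following is a minimal Python sketch of how such a number could be measured. The helper names (best_time, speedup) and the two toy functions are hypothetical and are only there for illustration; your own project would time its real code instead.

```python
import time


def best_time(func, *args, repeats=5):
    """Return the fastest wall-clock time (in seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def speedup(baseline_seconds, optimized_seconds):
    """Speedup factor: how many times faster the optimized version is."""
    return baseline_seconds / optimized_seconds


def slow_sum_of_squares(n):
    """Toy baseline (hypothetical): add up i*i for i = 0 .. n-1 in a loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n):
    """Toy optimized version: closed form for the same sum."""
    return (n - 1) * n * (2 * n - 1) // 6


if __name__ == "__main__":
    n = 1_000_000
    # Always confirm both versions give the same answer before timing them.
    assert slow_sum_of_squares(n) == fast_sum_of_squares(n)
    t_slow = best_time(slow_sum_of_squares, n)
    t_fast = best_time(fast_sum_of_squares, n)
    print(f"baseline:  {t_slow:.6f} s")
    print(f"optimized: {t_fast:.6f} s")
    print(f"speedup:   {speedup(t_slow, t_fast):.1f}x")
```

Taking the best of several runs reduces noise from other activity on the machine. The measured speedup will differ from system to system, so report the hardware you used along with the number.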


3. Setting up your GIT repository

To submit your proposal you are going to create a project folder, commit your proposal to the folder, and share your git repository with your instructor. The following videos from the Getting to know git (Tutorial) may be helpful when setting up your repository.

The following video gives instructions specifically for how to use the MSU GitLab:

The following are instructions for how to use the more general GitHub:

As you update and change the files in your repository you will need to push those changes to GitLab or GitHub. The following instructions walk you through this process:


What not to include (building a .gitignore file)

The first thing we want to teach is that not everything should go into a git repository, i.e., we do not want to bloat our repository with unwanted files. A git repository works best with text files that represent "source" code and not compiled or generated code. Here are some basic guidelines of what not to include:

A good rule of thumb is that if you did not generate the file and/or do not know what it is you probably do NOT want to include it in your repository.

WARNING: do not blindly add all files to your repository with the * (star) syntax. This is bad practice. For example, do NOT do the following:

git add * #THIS IS BAD!!!!

Other files to avoid

In addition to the above files, it is good to avoid any type of "binary" file (with a few exceptions). As stated earlier, git works best with text files so it can easily track changes. Some example binary files to avoid include:

Note: one exception to the above rules is image files (e.g., jpg or png) that are used in Markdown files or in the documentation. It is typically okay to include these since they tend to get added only once and do not change much as the project evolves.

The .gitignore (typically read "dot git ignore") is a text file that contains a list of patterns that specify names of files we do not want to include in a git repository. These patterns use shell-style wildcards such as *, which are simpler than the regular expressions we will learn about later.

.gitignore file

The .gitignore (read "dot git ignore") file is used to help keep unwanted files out of your project. Each line of the .gitignore file is a filename or pattern you want git to ignore. For example, based on what we said above, a good place to start on your .gitignore file would be the following two lines:

.ipynb_checkpoints
__pycache__

What should go into a .gitignore depends a lot on the type of project. However, you don't need to invent these from scratch. For example, you could just copy the .gitignore file from the course repository or find one on the internet. If you are using GitHub, it can also add a Python .gitignore template for you when you create a new repository.
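
To get a feel for how ignore patterns behave, here is a rough Python sketch that checks whether a path would be caught by a short list of patterns. It is only an approximation: it treats each pattern as a shell-style wildcard and compares it against every folder and file name in the path, while real git supports more rules (for example, negation with ! and patterns containing slashes). The pattern list covers Jupyter checkpoint folders and Python bytecode caches, plus *.o as an assumed example of a wildcard for compiled object files. The function name is_ignored is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Starter patterns: Jupyter checkpoint folders, Python bytecode caches,
# and (as an illustrative assumption) compiled object files.
PATTERNS = [".ipynb_checkpoints", "__pycache__", "*.o"]


def is_ignored(path, patterns=PATTERNS):
    """Approximate check: does any component of `path` match a pattern?

    This is NOT git's full matching algorithm. It only mimics the common
    case of a pattern without slashes matching a name at any depth.
    """
    parts = PurePosixPath(path).parts
    return any(
        fnmatchcase(part, pattern) for part in parts for pattern in patterns
    )


if __name__ == "__main__":
    examples = [
        "proposal.md",
        "notebooks/.ipynb_checkpoints/proposal-checkpoint.ipynb",
        "src/__pycache__/main.cpython-311.pyc",
        "build/main.o",
    ]
    for example in examples:
        status = "ignored" if is_ignored(example) else "tracked"
        print(f"{status:8s} {example}")
```

For a real repository, the command git check-ignore -v followed by a path asks git itself which pattern (if any) matches, which is more reliable than any approximation.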


Avoid Spaces in file names

When you name the files and folders inside of a repository, it is important that the names DO NOT include spaces. Although all modern computers have ways to accept names with spaces, do not use them. Instead use underscores (_) or CamelCase (no spaces and a capital letter at the beginning of each word in the name). Avoiding spaces in your names will ALWAYS save time in the long run.
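
As a quick check before you commit, a small script can list any names that contain spaces. The following is a minimal sketch, assuming it is run from the top folder of your repository; the function names has_space and suggest_name are hypothetical. It only prints suggestions and does not rename anything.

```python
from pathlib import Path


def has_space(path):
    """True if any folder or file name in the path contains a space."""
    return any(" " in part for part in Path(path).parts)


def suggest_name(name):
    """Replace each run of whitespace in a name with a single underscore."""
    return "_".join(name.split())


if __name__ == "__main__":
    # Walk everything below the current folder, skipping git's own data.
    for path in sorted(Path(".").rglob("*")):
        if ".git" in path.parts:
            continue
        if " " in path.name:
            print(f"{path}  ->  {suggest_name(path.name)}")
```

If a file is already tracked by git, renaming it with git mv keeps the history attached to the new name.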


Always Use Relative Paths

In your code there are two basic ways to specify the location of a folder inside your computer: relative paths and absolute paths. A relative path is a path starting from your current directory, and an absolute path is a path starting from your computer's "root" directory.

ALWAYS use relative paths in your git repository. This ensures that others will be able to use your software if they download it onto their computer. For example:

Good: ./data/ or ../data/ are relative paths to a child directory or sibling directory called data.
Bad (not acceptable): C:/research/data or /mnt/home/data are absolute paths to a data directory
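
The difference between the good and bad examples can be checked in Python with the standard pathlib module. The sketch below is one possible way to flag absolute paths, whether they are written in Windows or Linux style; the function name is_absolute_anywhere and the file name results.csv are hypothetical.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath


def is_absolute_anywhere(path_text):
    """True if the text is an absolute path on either Linux/macOS or Windows."""
    return (
        PurePosixPath(path_text).is_absolute()
        or PureWindowsPath(path_text).is_absolute()
    )


if __name__ == "__main__":
    for text in ["./data/", "../data/", "C:/research/data", "/mnt/home/data"]:
        verdict = "bad (absolute)" if is_absolute_anywhere(text) else "good (relative)"
        print(f"{text:20s} {verdict}")

    # Building a relative path in your own code: this works on any computer
    # as long as the data folder sits next to where the code is run.
    data_file = Path("data") / "results.csv"
    print(data_file)
```

Both path flavors are tested because a path such as C:/research/data only looks absolute to Windows, while /mnt/home/data only looks absolute to Linux and macOS.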

Jupyter notebook files in git repositories

It turns out that Jupyter notebook files and git repositories work very poorly together. Jupyter notebook files are a unique combination of source and program-generated information. So, every time you run a Jupyter notebook it can add output cells, which makes git think you changed something important. In many cases it is just a few numbers or some output text. When you run the git status command, it looks like the Jupyter notebook files have changed even when the source has not.

A good rule of thumb is to clear all of the output cells before committing any changes to Jupyter notebook files.
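
Clearing output can be done from the notebook menu, but it can also be scripted. A notebook file is JSON text, and each code cell stores its results in an "outputs" list and a run counter in "execution_count". The following is a minimal sketch that empties both, assuming the common version 4 notebook format; the function names and the file name proposal.ipynb are hypothetical.

```python
import copy
import json


def clear_outputs(notebook):
    """Return a copy of a notebook (as a dict) with all code output removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def clear_outputs_in_file(filename):
    """Read a notebook file, strip its outputs, and write it back in place."""
    with open(filename, "r", encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(filename, "w", encoding="utf-8") as handle:
        json.dump(clear_outputs(notebook), handle, indent=1)
        handle.write("\n")


if __name__ == "__main__":
    # Hypothetical file name; replace with your own notebook.
    clear_outputs_in_file("proposal.ipynb")
```

Make sure the notebook is closed in Jupyter before running a script like this, otherwise Jupyter may overwrite the cleaned file the next time it saves.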

The following video goes through why we have to treat Jupyter notebooks this way:

Direct Link


4. Turning in your Project

In order to turn in your GIT repository you just need to give the instructors and classmates the permissions to clone the repository and provide the full git command. Please use the following form to submit this information.

Or if you prefer, you can use the Direct Link


5. Rubric

I am not a big fan of giving students detailed rubrics like this one because I think it is important for students to be able to determine on their own what is "good" work and what is not. However, not having a rubric is also not really fair because it can be difficult to know what an instructor will expect from you; this is especially true the first time something is turned in. Please use the following as a guide. I reserve the right to change my grading slightly based on the submissions and effort:

(50 points) Proposal
    - (5 points) Is there a project title?
    - (5 points) Is there a motivating picture included and referenced?
    - (10 points) Is the abstract well written and easy to understand?
    - (5 points) Are there due dates and milestones in the schedule?
    - (10 points) Are the goals for Part 1 clear from the proposal, including links and references?
    - (10 points) Are the goals for Part 2 clear from the proposal, including links and references?
    - (5 points) Is the proposal written in an ipynb or md file?

(50 points) Correctly setting up your git repository
    - (10 points) Was everything turned in on time?
    - (10 points) Is there a .gitignore file (does it work)?
    - (10 points) Are the permissions to the git repository set up correctly?
    - (5 points) Does the project use correct filenames?
    - (5 points) Does the project have any temporary or hidden files that should not be included?
    - (10 points) Were all directions followed?

100 points total

Congratulations, you are done!

Your instructor can download your git repository using the link you provided in the above Google form.

Written by Dr. Dirk Colbry, Michigan State University Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.