Homework 02: Projects Pre-Proposal

✅ Put your name here.

Project Pre-Proposal

This week you will write a preproposal for your project, between one and two pages long. Several other students will review your preproposal and give you feedback so that you can write a much more detailed and credible full proposal. In turn, you will review several of your classmates' preproposals and provide feedback on them.

You will turn this in to D2L in the form of a PDF document. Feel free to use your favorite method for writing: \(\LaTeX\), Google Docs, Markdown, etc., but save the result as a PDF.

Task: Write a preproposal that contains these sections:

  • Title: (2 points) A descriptive title that captures your goal and the method you plan to use.

  • Abstract: (4 points) A 4-6 sentence abstract that summarizes all of the points in the following bullets.

  • Goal: (4 points) What is the goal of your project? Give context for the problem, explain why it is interesting, and describe what you will do that is new and why that is interesting and important. State very clearly what the desired outcome is.

  • Solution: (4 points) How can ML help achieve your goal and/or solve your problem? Describe other ways that one could solve the problem (or achieve your goal) and why you think ML is a better approach.

  • Data: (4 points) What data will you use? What is the state of the data (what cleaning and/or transformations are needed)? What makes you think the data will allow you to achieve your goal?

  • Algorithms: (4 points) What machine learning algorithms will you use? You will need to use between three and six algorithms in your project, much like we compared six classifiers. If you plan to use deep neural networks, you must compare with other methods so that you can demonstrate the power and convergence of your chosen method. This should include both completely different algorithms (e.g., decision trees versus logistic regression) and various forms of regularization (of which we will cover much more). The point is that you can’t know the power of a method if you don’t compare it to other methods. (Who knows, maybe guessing is better than your ML algorithm!)

  • Machine Learning Options: (4 points) What things can you vary in your choice of algorithms? For example, are there hyperparameters you can tune? Are there ways to choose/engineer better features?

  • Validation/Testing: (4 points) We haven’t covered this in detail yet, but we have alluded to the basic notion of checking your result by holding back some data for testing - the T in GEPT. Discuss how you think you will do this. For example, do you have enough data? How will you know that you are not overfitting?

  • Tools: (2 points) What tools will you use to make your poster? PPT? Keynote? LaTeX? Are there any issues you can anticipate? What computing resources will you need? GPUs? TPUs? Will you use animation/movies in your presentation?

Each section should be at least one paragraph with clear answers to the questions.
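To make the Algorithms, Machine Learning Options, and Validation/Testing items more concrete, here is a minimal sketch of the kind of comparison they describe. It is only an illustration under stated assumptions: the built-in scikit-learn breast cancer dataset stands in for your own data, and the particular models, the 25% test split, and the `max_depth=3` setting are arbitrary choices, not requirements of the assignment. It holds back a test set, includes a "guessing" baseline, and reports training and test accuracy side by side so that a large gap between the two can flag possible overfitting.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (an assumption for illustration); replace with your own data.
X, y = load_breast_cancer(return_X_y=True)

# Hold back 25% of the data for testing; stratify to keep class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A guessing baseline, two different algorithms, and one hyperparameter variation.
models = {
    "guessing (most frequent class)": DummyClassifier(strategy="most_frequent"),
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision tree (unrestricted)": DecisionTreeClassifier(random_state=0),
    "decision tree (max_depth=3)": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Fit each model on the training set only, then score on both sets.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    scores[name] = (train_acc, test_acc)
    print(f"{name:32s} train={train_acc:.3f}  test={test_acc:.3f}")
```

If a model scores much higher on the training set than on the test set, that is one sign it may be overfitting; if it barely beats the guessing baseline, the data may not contain the signal you are hoping for.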

NOTE: One important question to keep in mind: did you design something too easy or too hard? Too easy would be that you upload a dataset, run it through three lines of sklearn and call it a day. Too hard would be that what you are looking for is barely in your data and you need a month on GPUs before you get your first result; or, there is a lot of overhead in getting the data itself in a form that you can use.

HINT: Here is a useful trick. While you are writing your preproposal, start doing your project. For example, create a Jupyter notebook (or whatever tool you use), start writing a little code, read in your data, look at the dictionary, make a Seaborn plot. Just doing these very simple steps will make things less vague and abstract in your mind, and may immediately reveal where your first problems are going to occur. When you write your proposal in two weeks, you will want to have a few of these plots made, so now is the time to be sure that there are no surprises waiting for you. There is no harm in throwing all of this away if you do find a surprise - better to do that now than in a month.

Deadline: Friday, October 27 at 11:59PM

Total points: 32

© Copyright 2023, Department of Computational Mathematics, Science and Engineering at Michigan State University.