CMSE 802 Software Setup Guide#

This is a course in applied machine learning, and we assume you have a working computer with Anaconda and Python 3.11.x installed. To get there, you need to set up a number of things before the course starts.

Python setup#

In this class, you will do a lot of coding, all of it in Python. We will exclusively use Python 3.11.x. If you have used Python before and already have it on your laptop, upgrade to this version before class starts. Because of the weekly team coding projects, you will be at a serious disadvantage if you are not using the same version as your classmates. We will use Jupyter notebooks (in JupyterLab) for all assignments, so be sure that you are able to create and save notebooks.

We recommend the Anaconda distribution. If you don’t have Anaconda installed already, follow the instructions below to install the Anaconda distribution of Python on your computer. Even if you already have a version of Python installed on your machine, we encourage you to go through this installation process, as the assignments will assume that you are working with the same versions of the Anaconda Python packages that the instructors are using. If you already have Anaconda installed, we encourage you to update all of the Python packages (you may need to look up how to do this).
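One quick way to confirm that an environment matches the course requirement is to check the interpreter version from inside Python. The sketch below is only an illustration: the helper names `is_course_python` and `has_jupyterlab` are hypothetical and not part of any course material, and it assumes that "3.11.x" means any patch release of Python 3.11.

```python
import importlib.util
import sys

REQUIRED = (3, 11)  # major and minor version stated in this guide


def is_course_python(version_info=sys.version_info):
    """Return True if the interpreter is some patch release of Python 3.11."""
    return tuple(version_info[:2]) == REQUIRED


def has_jupyterlab():
    """Return True if the jupyterlab package can be found in this environment."""
    return importlib.util.find_spec("jupyterlab") is not None


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Matches 3.11.x:", is_course_python())
    print("JupyterLab found:", has_jupyterlab())
```

Running this in the environment you plan to use for class (for example, in a notebook cell or a terminal) shows at a glance whether an upgrade is likely needed.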

MSU’s JupyterHub Interface#

From time to time, you might run into issues with your computer. When this happens, you can use the web-hosted JupyterHub server managed by MSU. It creates a virtual environment that allows you to run simple commands and host Jupyter notebooks.

This service should only be used as a temporary fix if Jupyter stops working on your computer. To make sure that you have access to this backup option, follow the directions below. Note that there are extra steps involved: you must upload your Jupyter notebooks to JupyterHub and, in the case of homework assignments, download them again in order to turn them in on D2L. This option also requires an internet connection, so it will often be easier to run things locally on your computer.

Instructions for connecting to the engineering JupyterHub server:
Every student enrolled in this class will be given an engineering computing account. If this is your first time using your Engineering account, you will need to activate the account by going to the following website:

https://www.egr.msu.edu/decs/myaccount/?page=activate

Enter your MSU NetID. The initial password is your APID followed by an @ (example: A12345678@). You then set a new password that meets the requirements listed on the page and verify it. Finally, agree to the terms and select Activate.

Once your account is activated, you can access the classroom JupyterHub server using the following instructions:

  1. Open up a web browser and go to the following URL: https://jupyterhub.egr.msu.edu

  2. Type your engineering login name. This will be your MSU NetID.

  3. Type your engineering password.

If everything is working properly, you will see the main “Files” window in the Jupyter interface.

If you ever end up working on your assignments using JupyterHub, the remaining directions should serve as a reference for how you can go about uploading and downloading Jupyter notebooks and turning them in.

MSU’s High Performance Computing Cluster (HPCC)#

In case you need a lot of computational power for your project, you can use MSU’s High-Performance Computing Cluster (HPCC). The instructor has created a research space dedicated to this class, called CMSE_802_SS25_S001, that you can use for your project.

Instructions to log in: To SSH into the MSU High-Performance Computing Cluster (HPCC), follow these steps:

  1. Open a terminal on your local machine.

  2. Use the following command to initiate an SSH connection:

         ssh <your_netid>@hpcc.msu.edu

     Replace <your_netid> with your MSU NetID.

  3. Enter your MSU NetID password when prompted.

  4. Once logged in, you will be connected to the HPCC and can start using its resources for your project.

  5. Once you are logged into the HPCC, navigate to the research space by running the following command:

         cd /mnt/research/CMSE_802_SS25_S001

To create a directory with your last name in the research space CMSE_802_SS25_S001, follow these steps:

  6. Use the following command to create a directory with your last name:

         mkdir LastName

     Replace LastName with your actual last name.

  7. Verify that the directory has been created by running the following command:

         ls

     You should see your newly created directory listed.

You have now successfully created a directory with your last name in the research space CMSE_802_SS25_S001 on the HPCC.
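For those who prefer to script it, the directory creation and check (steps 6 and 7) can also be sketched in Python, run on the HPCC after logging in. This is a minimal sketch, not an official procedure: the function names `make_student_dir` and `list_dirs` are hypothetical, and it assumes your account has write permission in the research space.

```python
from pathlib import Path

# Research space path taken from this guide.
RESEARCH_SPACE = Path("/mnt/research/CMSE_802_SS25_S001")


def make_student_dir(last_name, base=RESEARCH_SPACE):
    """Create base/last_name (like `mkdir LastName`) and return its path."""
    target = Path(base) / last_name
    # exist_ok=True makes the call safe to repeat if the directory already exists.
    target.mkdir(exist_ok=True)
    return target


def list_dirs(base=RESEARCH_SPACE):
    """Return the sorted names of the subdirectories of base (a filtered `ls`)."""
    return sorted(p.name for p in Path(base).iterdir() if p.is_dir())


# Example, to be run on the HPCC (replace LastName with your actual last name):
#   created = make_student_dir("LastName")
#   print(created.name in list_dirs())
```

The `base` parameter exists only so the functions can be tried against a scratch directory before pointing them at the shared research space.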

HPCC Documentation: For more information on using the HPCC and accessing its resources, refer to the official documentation at https://docs.icer.msu.edu/. It provides detailed instructions and guidelines, whether you need additional computational power for your project or want to explore advanced computing capabilities. Familiarize yourself with it to get the most out of the HPCC.

D2L Courses: For more information on how to use the HPCC through Desire2Learn (D2L) courses, visit https://icer.msu.edu/Education-and-Events/Desire2Learn. This resource provides detailed instructions and tutorials on how to access and use the HPCC resources.

OnDemand on MSU HPCC#

In addition to accessing the MSU High-Performance Computing Cluster (HPCC) through SSH, you can also use OnDemand to interact with the cluster. OnDemand provides a web-based interface that allows you to submit and manage jobs, access files, and monitor job progress. This can be particularly useful if you prefer a graphical interface or if you need to access the cluster from a remote location.

To use OnDemand on MSU HPCC, follow the instructions provided in the OnDemand documentation. The documentation provides detailed information on how to set up and use OnDemand, including instructions for logging in, navigating the interface, and submitting jobs.

With OnDemand, you can use the HPCC and run your computational tasks through this graphical interface.

Note: It is the student’s responsibility to learn how to use the HPCC and OnDemand. While the instructor can provide assistance with certain issues, technical problems or inquiries should be directed to the official support channels. For technical issues related to the HPCC or OnDemand, contact the ICER (Institute for Cyber-Enabled Research) support team at https://contact.icer.msu.edu/contact.