In-Class Assignment: Collaborating with Git
Day 16
CMSE 202
✅ Put your name here
✅ Put your group member names here
Learning goals for today
By the end of class you should be able to:
Create and share a GitHub repository with others
Push commits you’ve made to a repo and pull commits made by others by using the git command line tools
Create a new branch in a repo to work on a new feature
Merge changes from one branch into another as a way to share your work with others
Agenda for today:
Review of pre-class assignment: thinking about collaboration and branch management in git
Exploring the version history and moving between versions (time permitting)
Assignment instructions
Work through the notebook with your group, making sure to write all necessary code and answer any questions. Do your best to finish the assignment, but it’s OK if you don’t make it to the end.
This assignment is due by the end of class, and should be uploaded into the appropriate “In-Class assignments” submission folder. Submission instructions can be found at the end of the notebook.
1. Review of pre-class assignment: thinking about collaboration and branch management in git
Were there any specific issues that came up for you with the pre-class assignment?
Let’s take a moment to highlight some key concepts. Discuss with your table the following prompts and write down a brief definition of each of these concepts.
If you don’t feel like you have good working definitions yet, try doing a quick internet search to see if you can find a definition that makes sense to you.
✅ Question 1: Discuss what we are referring to when we talk about “branches” in git.
✎ Do This - Write your discussion notes here.
✅ Question 2: How do you “checkout” a branch in git and what happens when you do?
✎ Do This - Write your discussion notes here.
✅ Question 3: What does it mean to “merge changes” in git?
✎ Do This - Write your discussion notes here.
2: Forking a repository and collaborating with others
Sometimes, when we find a useful new repository that we want to work with, one of the best things we can do is to “fork” the repository. This creates a new version of the repository with all of the previous version history, but transfers ownership of the new repository to us. It’s basically a fancy way of making a copy of the repository that we control, but if we ever want to pull in new changes from the original repository, we can still do that!
For this part of the activity you’re going to make a new fork of a preexisting repository and then practice collaborating on making changes to this new fork with your group.
2.1 Forking the repository and sharing it among the group
First, in order to collaborate with your team, you will need to make a repository that your whole group can edit and push to. In order to do this, your group will first fork a pre-existing repository and then make sure that everyone in the group has access to this new fork.
For this assignment the “leader” of your group will be the one that “forks” the repository.
✅ Do This: With your group, designate someone as “group leader” for this activity. The leader of your group should be someone who has the ability to share their screen with the rest of the group and previously requested and set up their GitHub “Student Pack.”
The leader must then do the following while sharing their screen so that everyone can follow along:
Visit GitHub and make sure they are signed in. (You don’t have to share your screen for this part if you’re worried about accidentally sharing your password.)
After logging in, double check that everyone in the group can see their screen.
Visit msu-cmse-courses/CMSE202_Git_Started and “Fork” the repository by clicking the “Fork” button in the upper right region of the repository page.
Once the fork is complete, you should be redirected to the new repository. Go to “Settings” and click on the “Manage access” tab on the left side.
Invite everyone in the group to be a collaborator with “Write” privileges.
Once everyone has access, everyone in the group must clone this repository into their ~/CMSE202/repositories folder and then navigate inside of the cloned repository.
✅ Question 4: What is the url you used to clone this repository?
✎ Do This - Erase the contents of this cell and replace it with your answer to the above question! (double-click on this text to edit this cell, and hit shift+enter to save the text)
When you are finished with this part everyone should have a local copy of the repo.
2.3 Working on the same file in a repository
Perhaps one of the most powerful uses of Git is the ability to work on the same file as others and the ability to merge all the work together. However, this process comes with some extra baggage and can be a bit challenging at times, if not done carefully. The safest way to do this sort of collaborative work is to get into the habit of using git branches when adding new code or features to a repository.
In order to make this a little easier here is a general formula you can follow when working on a shared repository to ensure success.
Before you embark on any significant code development or repository changes, create a new branch. Call that branch something relevant to the work that you’re doing so that you remember what you’re working on.
If the file you are working on and plan to update in the repository is a .ipynb (Jupyter notebook) file, you should always try to clear the output from that file before you commit your changes. This keeps the output content from being tracked by git; that content can be large, may include output images, and clutters the code base. To clear the output, do the following: In the notebook, click on “Kernel” -> “Restart & Clear Output”. Wait for that to finish (it shouldn’t take long), then save the file (this is important).
Once you’ve added and committed your files to your branch (which you can double-check with git status), check to see if there have been any new changes on the main branch. If there haven’t been any changes to the main branch since you made all of your changes, you’re in good shape! You can go ahead and push your new branch and submit it as a pull request to the main branch. If you find that changes have been made to the main branch, it is good to bring those changes into your new branch first and deal with any possible merge conflicts. If you’re sure that all of your changes have been safely committed to your branch, you can check whether main has changed by doing:
git checkout main # this will jump you back to the main branch
git pull # this should pull all new main changes into the main branch
If you do find that the main repository has been updated since you last pulled changes when you run the check above, you now need to merge those changes into your new branch. This is where things can get a little dicey, so be prepared to be patient with yourself as you get used to this process. Once you’ve pulled the changes from the remote repository into the main branch on your machine, you can merge them into your branch by doing the following:
git checkout [BRANCHNAME]
git merge main
At this point, if you’re lucky, git will successfully complete an “auto-merge”, let you know that it has done so, and then you can push your new branch. But, depending on what has changed, you might run into a merge conflict. If so, you need to look at which file(s) have the conflict and try to resolve it. This can get tricky, so if you run into merge conflicts, ask your group members or your instructors for help. The upside is, since you committed all of your files before you did this, you can always go back to where you started! If you can’t figure out how to fix the merge conflict and just want to abort the merge so that you can try again at some point later, you can do:
git merge --abort
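The formula above can be tried out safely before you touch your group's repository. The following is a minimal sketch, assuming git is installed and that you run it in a terminal; it builds a throwaway repository in a temporary folder, so the file names (notes.txt, part1.txt, part2.txt), the branch name part-1, and the "Demo Student" identity are all made up for illustration. It first shows a merge that git can complete automatically, then forces a conflict and backs out of it with git merge --abort.

```shell
# Throwaway sandbox: nothing here touches your real repositories.
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
git config user.name "Demo Student"        # hypothetical identity, local to the sandbox
git config user.email "demo@example.com"

# First commit on main.
echo "shared line" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch -M main                          # make sure the branch is named main

# Create a feature branch and commit some work on it.
git checkout -q -b part-1
echo "my part" > part1.txt
git add part1.txt
git commit -q -m "Add part 1"

# Meanwhile main moves ahead (standing in for a teammate's merged work).
git checkout -q main
echo "teammate's part" > part2.txt
git add part2.txt
git commit -q -m "Add part 2"

# Merge main into the feature branch. Different files changed, so git auto-merges.
git checkout -q part-1
git merge -q --no-edit main

# Now force a conflict: both branches change the same line of notes.txt.
echo "part-1 version" > notes.txt
git commit -q -am "Edit notes on part-1"
git checkout -q main
echo "main version" > notes.txt
git commit -q -am "Edit notes on main"
git checkout -q part-1

if git merge --no-edit main > /dev/null 2>&1; then
    merge_result="merged cleanly"
else
    merge_result="conflict"
    git merge --abort                       # return to the last commit on part-1
fi
echo "Merge result: $merge_result"
git status --short                          # prints nothing if the abort left a clean tree
```

Because everything was committed before each merge, aborting loses nothing: notes.txt goes back to the version committed on part-1, and the earlier clean merge is still in the branch history.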
Let’s give it a shot, shall we?
✅ Do This: Make sure you’re in the CMSE202_Git_Started repository you forked and cloned above, then do the following:
Open the Great_Lakes_and_Grand_River.ipynb notebook and, as a group, decide who will complete which “PART”. There may be more parts than there are members of your group, so start with just one part per person. If you finish your part and others are still working, you can move on to an unclaimed part. It is crucial that each person does their own part, as this will help divide up the work and avoid merge conflicts.
Once you know which part you are working on, create a new branch and name that branch part-#, where the “#” is replaced with the part number you’re going to work on. Use the following commands:
git branch [BRANCHNAME] # using your appropriate "part-#" branch name
git checkout [BRANCHNAME] # git branch only creates the branch, so switch to it before you start working
Work on your part of the notebook and complete the task. Once you’ve successfully completed the task, clear all of the output and save the notebook. If you run into issues completing your part, talk with your group. After all, you’re part of a collaboration!
One potentially new concept is how to access a file that is not in the same directory as the notebook you are working in. For example, this repository has all data files in a separate folder titled “data” rather than having all csv files in the same place as Great_Lakes_and_Grand_River.ipynb, so reading a csv file by its name alone causes a “No such file or directory” error. You could just move the data files (with the mv command on the terminal!), but there is a faster and easier way! Instead, you can alter the code inside your pandas function that reads in the data. Because the csv files live in the data folder, you need to specify that path, ‘data/csv file name’, so that your computer knows to look in the data folder for the file, not the current directory. No more error!
Add and commit your changes to your new branch.
Switch back to the main branch with git checkout main and do a git pull to see if anything has changed in the original repository. If the main branch has changed, you’ll want to follow the steps above to merge those changes into your new branch. Again, if you run into issues, ask your group or your instructor for help.
If there were no changes to the main branch, or you managed to merge in any changes that were there, you’re going to push your new branch to the GitHub repository. You can do so with:
git push origin [BRANCHNAME] # using your appropriate "part-#" branch name
Once you’ve pushed your branch, go to your repository on GitHub and confirm that your new branch has appeared by clicking on the “branches” label that should be hanging out above the list of repository files.
Once you’re on the branches page for YOUR repository, try issuing a new pull request and notify your group that your pull request is available for review.
IMPORTANT NOTE: when submitting a pull request, make sure you are submitting it to your group member’s forked repository, not the original repository. You want the base repository to be your group member’s forked repository (not msu-cmse-courses/CMSE202_Git_Started). If you don’t confirm this, the pull request will be sent to the original repository, and your group mates won’t see your pull request in the forked repository for the next step.
From here, someone in your group will need to merge your pull request (yes, you could do it yourself, but this is not common practice!). You should kindly ask if someone would be willing to take a moment to review and merge your pull request. After they’ve done so, see if they need help finishing their part or if anyone else could use your help as well. If someone else’s pull request is merged before yours, GitHub may tell you that you have merge conflicts that have to be resolved before the pull request can be merged. If this happens, you’ll need to go through the process of pulling new changes from the main branch and resolving the conflicts.
A note about minimizing merge conflicts! Sometimes, this process can be easier if group members take turns submitting pull requests. If you’re running into issues, let one group member pull, push their work, and then submit a pull request. Then another group member follows the same process, etc.
Once your branch is successfully merged in, try pulling it into the main branch on your machine. You should also be able to pull in the changes from others eventually as well. If there’s extra time, feel free to create a new branch and tackle one of the remaining parts.
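The branch, commit, and push steps above can be rehearsed without touching GitHub. Here is a rough sketch, assuming git is installed; a local bare repository (group_fork.git) stands in for your group's fork, and the file names, the part-1 branch, and the "Demo Student" identity are hypothetical. On the real assignment, origin would point at the fork's GitHub URL and the pull request itself is still opened in the web interface.

```shell
# Throwaway sandbox with a local bare repository playing the role of the GitHub fork.
sandbox="$(mktemp -d)"
git init -q --bare "$sandbox/group_fork.git"

# Your working copy, connected to the stand-in remote.
git init -q "$sandbox/work"
cd "$sandbox/work"
git config user.name "Demo Student"        # hypothetical identity, local to the sandbox
git config user.email "demo@example.com"
git remote add origin "$sandbox/group_fork.git"

echo "Great Lakes notes" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main
git push -q origin main

# Create the branch, switch to it, and commit your part.
git branch part-1
git checkout -q part-1
echo "Part 1 done" > part1.txt
git add part1.txt
git commit -q -m "Complete part 1"

# Check whether main changed on the remote before pushing your branch.
git checkout -q main
git pull -q origin main
git checkout -q part-1

# Publish the branch so teammates can see it and review a pull request.
git push -q origin part-1
git ls-remote --heads origin                # lists the branches that now exist on the remote
```

The last command should list both main and part-1, which is the command-line equivalent of checking the “branches” page on GitHub.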
✅ Question 6: What part(s) of the notebook did you complete? If you ran into any issues completing the part(s) you were responsible for, make note of that here.
✎ Do This - Erase the contents of this cell and replace it with your answer to the above question! (double-click on this text to edit this cell, and hit shift+enter to save the text)
✅ Question 7: Once everyone finished their parts and all of the pull requests were merged, how many commits were there in your group repository?
✎ Do This - Erase the contents of this cell and replace it with your answer to the above question! (double-click on this text to edit this cell, and hit shift+enter to save the text)
🛑 Do not move on to the next section until everyone is done. Once everyone is done, everyone should do a git pull on the CMSE202_Git_Started forked repository to make sure they have all of the changes.
3: Exploring the version history and moving between versions (time permitting)
3.1 Forking the repository (again)
For the next part of this assignment, everyone except the group leader (who made the original fork of the course repository) will make a new fork of the repository your group has been working on, not of the original course repository. Make a fork of your group’s CMSE202_Git_Started repository that you were just working on so that you now have your own personal copy. Since the leader already has a copy of that version, they can just use that repository.
Once you’ve “forked” your group repository and made your own, you should clone that repository into your ~/CMSE202/repositories directory as instructed here:
Everyone, including the leader, should clone a fresh copy of the CMSE202_Git_Started repository that they are the primary owner of. When you git clone the repository into your ~/CMSE202/repositories folder, you should give it a slightly different name. It is suggested you slightly modify the folder name by adding your name or initials as a suffix/tag on the end. Something like so:
git clone [URL] CMSE202_Git_Started_YOURNAME
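If you have not used the second argument of git clone before, this small sketch shows what it does. It assumes git is installed and uses a local folder named source as a stand-in for your fork's URL; the README.md file and the "Demo Student" identity are invented for the demo. With your real fork you would replace the local path with the URL GitHub shows under the “Code” button.

```shell
# Throwaway sandbox: a small local repository stands in for your fork on GitHub.
sandbox="$(mktemp -d)"
git init -q "$sandbox/source"
git -C "$sandbox/source" config user.name "Demo Student"     # hypothetical identity
git -C "$sandbox/source" config user.email "demo@example.com"
echo "Git Started" > "$sandbox/source/README.md"
git -C "$sandbox/source" add README.md
git -C "$sandbox/source" commit -q -m "Initial commit"

# The optional second argument to git clone sets the name of the new folder.
cd "$sandbox"
git clone -q "$sandbox/source" CMSE202_Git_Started_YOURNAME

cd CMSE202_Git_Started_YOURNAME
git remote -v           # shows where this copy was cloned from
git log --oneline       # the full history came along with the clone
```

Only the local folder name changes; the remote called origin still points at the repository you cloned from.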
✅ Question 8: What is the exact command you used to clone your version of the forked repository?
✎ Do This - Erase the contents of this cell and replace it with your answer to the above question! (double-click on this text to edit this cell, and hit shift+enter to save the text)
3.2 Using version history and version control to move around in “time”
Another feature that makes Git such a powerful tool is its ability to keep track of all the changes the repository goes through from one commit to the next. This allows for any user to go backwards and forwards in time and be able to see the repository as it stood at each commit.
3.2.1:
We will be utilizing the version history and version control to “fix” the Grand River section of the Great_Lakes_and_Grand_River.ipynb notebook. However, make sure that you are now in your newly forked repository and editing that version of the notebook; this is important. You’ll probably want to close down any previously opened versions to ensure you are working on the right one.
✅ Do This: Open the notebook and see if you can spot where the error is (Hint: Look at where it gets the data). Discuss this with your group.
✅ Question 9: What bug is there? What lines of code appear to be the problem?
✎ Do This - Erase the contents of this cell and replace it with your answer to the above question! (double-click on this text to edit this cell, and hit shift+enter to save the text)
3.2.2:
One of your collaborators informs you that they recall that things were working before the commit labeled “Updated the Grand River of notebook to pull Grand River Data from the web.” Use git log to confirm that this commit occurred in your version of the repository! Prior to that commit, that section of the notebook was fully functioning.
✅ Do this: “Close and halt” the Jupyter notebook (make sure you’ve completely exited from it). Then use git log to find the commit that happened before the “Updated the Grand River to pull Grand River Data from the web” commit. Use this command to travel back to that commit:
git checkout [HASHNUMBER]
The HASHNUMBER is the number that follows the word “commit” in the git log.
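To see what “travelling back” looks like before doing it on the real notebook, here is a small sketch under stated assumptions: git is installed, and the file notebook.txt, its contents, the commit messages, and the "Demo Student" identity are all invented for a throwaway repository. It looks up a commit by its message, steps to the commit just before it, and then returns to main.

```shell
# Throwaway sandbox with three commits to move between.
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
git config user.name "Demo Student"        # hypothetical identity, local to the sandbox
git config user.email "demo@example.com"

echo "read local file" > notebook.txt
git add notebook.txt
git commit -q -m "Plot river data from a local file"
git branch -M main
echo "read from web" > notebook.txt
git commit -q -am "Updated notebook to pull data from the web"
echo "tidy labels" >> notebook.txt
git commit -q -am "Tidy up plot labels"

git log --oneline                           # short hashes with commit messages, newest first

# Look up the suspect commit by its message, then take its parent (the ^ suffix).
suspect="$(git log --format=%H --grep="pull data from the web" -n 1)"
previous="$(git rev-parse "$suspect^")"

# Travel back: the working folder now shows the files as they were at that commit.
git checkout -q "$previous"
old_contents="$(cat notebook.txt)"
echo "At the older commit the file says: $old_contents"

# Return to the present.
git checkout -q main
```

While an older commit is checked out, git describes the state as a “detached HEAD”. That is expected here: you are only looking around, and git checkout main brings you back.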
✅ Question 10: What is the hash number and commit message for that commit?
✎ Do This - Erase the contents of this cell and replace it with your answer to the above question! (double-click on this text to edit this cell, and hit shift+enter to save the text)
✅ Question 11: Re-open the notebook and review that section. What appears to be different about it? Where does the plot pull its data from? What is the name of the file? You can also view the commits in the GitHub web interface for your repository. Try looking at this commit and the two commits that come after it to see what was being changed. Comment on what you were able to figure out about these changes.
✎ Do This - Erase the contents of this cell and replace it with your answer to the above question! (double-click on this text to edit this cell, and hit shift+enter to save the text)
3.2.3:
Having done a little digging, you are now ready to “fix” the section. In order to do this, you can follow a general formula that should lead you to success:
“Close and halt” the notebook again to make sure it’s closed. Then make sure you jump back to the current commit of the repository. You can do this by running the command git checkout main.
Before you make any edits, let’s be extra cautious and make a copy of the current notebook using a new, slightly different name (e.g., Great_Lakes_and_Grand_River_YOURNAME.ipynb).
If you can remember what needed to be changed in your new copy of the notebook, go ahead and do that. If you can’t remember what needed to be changed, you can go back to that older, correct version of the original notebook using git checkout [HASHNUMBER] Great_Lakes_and_Grand_River.ipynb to revert the notebook back to its older self. You can use that old notebook as a reference to edit your new copy (Great_Lakes_and_Grand_River_YOURNAME.ipynb). Alternatively, you can avoid using the git checkout command and look at the old version and the edits on the GitHub interface.
Once you’ve fixed things in your new copy, return the original notebook to its most recent state by doing git checkout [HASHNUMBER] Great_Lakes_and_Grand_River.ipynb, but this time use the HASHNUMBER for the most recent commit (which you can find using git log).
Check to see if your new version of the notebook works as intended. You might still run into an error, especially if any important files are missing from the repository. The upside is that even if a file is missing, if it existed in the past, it can be brought back! If you need to bring a file back that was deleted in the past, you can do the following:
git checkout [HASHNUMBER] [MISSINGFILE] # "MISSINGFILE" should be the name of the file you wish to restore
This should make the file appear and then you can re-commit it to your repository!
At this point, ideally, you will have the original (broken) version of the notebook, the fixed copy you just made, and restored versions of any files that went missing.
Now you have two options: you can add and commit your copy of the notebook, or you can apply your fix to the original and rm your copy. The choice is yours!
To finish up, make sure your final version of the repository has been pushed to GitHub.
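The file-restoring step in the formula above can be practiced in isolation. This is a sketch rather than the exact fix for the assignment: it assumes git is installed, and the file data/river.csv, its contents, and the "Demo Student" identity are made up for a throwaway repository. It finds the commit that deleted a file, checks the file out from the commit just before that, and re-commits it.

```shell
# Throwaway sandbox in which a data file gets deleted and then restored.
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q
git config user.name "Demo Student"        # hypothetical identity, local to the sandbox
git config user.email "demo@example.com"

mkdir data
echo "date,flow" > data/river.csv           # hypothetical data file
git add data/river.csv
git commit -q -m "Add river data"
git branch -M main
git rm -q data/river.csv
git commit -q -m "Remove river data"

# The most recent commit that touched the file is the one that deleted it.
deleted_in="$(git rev-list -n 1 HEAD -- data/river.csv)"

# Its parent (the ^ suffix) still has the file, so check the file out from there.
git checkout "$deleted_in^" -- data/river.csv

git status --short                          # the restored file shows up as staged
git commit -q -m "Restore river data"
```

Checking out a single file this way leaves you on your current branch; only that file is brought back from the older commit, already staged and ready to be committed again.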
✅ Do this: Using the above as a guide, fix the Grand_River section of the notebook.
Once you are done, make sure to turn in your final version of the notebook you fixed along with this notebook on D2L.
Congratulations, you’re done with your in-class assignment!
Now, you just need to submit this assignment by uploading it to the course Desire2Learn web page for today’s in-class assignment submission folder. (Don’t forget to add all of the appropriate names in the first cell).
© Copyright 2024, Department of Computational Mathematics, Science and Engineering at Michigan State University