CMSE401 Quiz Instructions

This quiz is designed to take approximately 20 minutes to complete (you will be given 50 minutes).

Please read the following instructions before starting the quiz.

This is an open Internet quiz. Feel free to use anything on the Internet with one important exception...

  • DO NOT communicate live with other people during the quiz (either verbally or online). The goal here is to find answers to problems as you would in the real world.

You will be given 50 minutes to complete this quiz. Use your time wisely.

HINTS:

  • Neatness and grammar are important. We will ignore all notes or code we cannot read or understand.
  • Read the entire quiz from beginning to end before starting. Not all questions are equal in points vs. time so plan your time accordingly.
  • Some of the information provided may be a distraction. Do not assume you need to understand everything written to answer the questions.
  • Spaces for answers are provided. Delete the prompting text such as "Put your answer to the above question here" and replace it with your answer. Do not leave the prompting text with your answer.
  • Do not assume that the answer must be in the same format as the cell provided. Feel free to change the cell formatting (e.g., markdown to code, and vice versa) or add additional cells as needed to provide your answer.
  • When a question asks for an answer "in your own words" it is still okay to search the Internet for the answer as a reminder. However, we would like you to do more than cut and paste. Make the answer your own.
  • If you get stuck, try not to leave an answer blank. It is better to include some notes or stub functions that show your thinking process so we can give you partial credit.
  • Always provide links to any references you find helpful.
  • Feel free to delete the provided check marks (✅) as a way to keep track of which questions you have successfully completed.

Honor Code

I agree to neither give nor receive any help on this quiz from other people. I also understand that providing answers to questions on this quiz to other students is an academic misconduct violation, as is live communication or receiving answers to questions on this quiz from other people. It is important to me to be a person of integrity and that means that ALL ANSWERS on this quiz are my answers.

DO THIS: Include your name in the line below to acknowledge the above statement:

Put your name here.


Scientific Image Analysis

ImageJ logo of a microscope. Motivating the reason behind ImageJ as a scientific tool

Logo from the ImageJ website

ImageJ is a software package developed with funding from the National Institutes of Health (NIH). It is a well-established tool written in Java with decades of plugins and options to help researchers measure data inside of images (mostly medical images). For this quiz you will explore how to run large batches of ImageJ jobs on the HPCC.

Question 1: (10 points) What module command do you use to load ImageJ on the HPCC?

Put your answer here

Question 2: (10 points) Which versions of ImageJ are installed on the HPCC? Type the command you used to figure this out.

Put your answer here


For the following questions a researcher is trying to manually interact with the ImageJ Graphical User Interface (GUI) on the HPCC using the following command:

java -jar $EBROOTIMAGEJ/ij.jar

However, when they run the command on dev-intel18 through the OnDemand server they get the following output:

$ java -jar $EBROOTIMAGEJ/ij.jar
Exception in thread "main" java.awt.HeadlessException: 
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
        at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:204)
        at java.awt.Window.<init>(Window.java:536)
        at java.awt.Frame.<init>(Frame.java:420)
        at ij.ImageJ.<init>(ImageJ.java:143)
        at ij.ImageJ.main(ImageJ.java:703)

Question 3: (20 points) Explain this error and describe one way you can fix or get around the problem and get the ImageJ GUI up and running on the HPCC. Make the instructions short but detailed enough for a researcher who is new to the HPCC. You are encouraged to provide links to websites as references to help answer your question. HINT: test your answers on the actual HPCC to make sure they work.

Put your answer here


We can also run ImageJ with a "macro" that uses a custom language developed only for ImageJ. Here is an example:

print("Inverting Image");
name = getArgument;
if (name=="") exit ("No argument!");
setBatchMode(true);

//Set PATH
var patho="./";

//Open the input file and invert it
print("Opening File");
open(patho+name);

print("Inverting Image");
run("Invert");

print("Saving Inverted File");
saveAs("Png", patho+"Inv_"+name);

The above macro (named macro.ijm) can be run on input file input1.png using the following command:

java -jar $EBROOTIMAGEJ/ij.jar -batch ./macro.ijm input1.png



As we can tell from the error in the previous questions, one of the problems with ImageJ is that it needs a graphical user interface which requires a connected display. Unfortunately, if we try to run the macro using the scheduler in "batch" mode, we would get a similar error because none of the compute nodes have displays attached either.

One way to get around this problem is to use a "fake" display or "virtual" display. In Linux there is often a program called "the X11 virtual frame buffer" (aka Xvfb).
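Before relying on a virtual display, it can help to check what the current shell actually has available. The following is a minimal sketch, not part of the original quiz: it only reports whether the DISPLAY variable is set and whether an Xvfb executable can be found on the PATH. The variable names display_state and xvfb_state are made up for illustration.

```shell
#!/bin/bash
# Report whether a display is configured in the current shell.
if [ -n "${DISPLAY:-}" ]; then
    display_state="set to ${DISPLAY}"
else
    display_state="not set"
fi
echo "DISPLAY is ${display_state}"

# Report whether the Xvfb program can be found on the PATH.
if command -v Xvfb >/dev/null 2>&1; then
    xvfb_state="available"
else
    xvfb_state="not found"
fi
echo "Xvfb is ${xvfb_state}"
```

The output will differ between a login session with X11 forwarding, a development node, and a compute node, so it is worth running in each place you plan to use ImageJ.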

The following job script can be used with ImageJ and Xvfb to run a batch job on the HPCC:

#!/bin/bash
#SBATCH --mem=4gb
#SBATCH --time=00:10:00
#SBATCH -n 1
#SBATCH -c 1

module load Java
module load ImageJ

#Remove left over xvfb lock files
rm -rf /tmp/.X11-unix
rm -rf /tmp/.X1-lock   # lock file for display :1

##### Specify the display, start the Xvfb server, and save the process ID.
export DISPLAY=":1"
Xvfb $DISPLAY -auth /dev/null &
XVFB_PID=$!

#Give system time to spin up X11 display (Probably not needed)
sleep 5

####
#Run ImageJ script
java -jar $EBROOTIMAGEJ/ij.jar -batch ./macro.ijm input1.png


##### Stop the Xvfb server and remove the temporary lock files it created (if they don't remove themselves).
kill -9 $XVFB_PID
rm -rf /tmp/.X11-unix
rm -rf /tmp/.X1-lock   # lock file for display :1
####
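The job script above follows a common shell pattern: start a helper process in the background, save its process ID from $!, do the main work, then stop the helper. The sketch below tries that pattern in isolation, with sleep standing in for Xvfb so it can run on any machine. The names HELPER_PID and helper_status are hypothetical and chosen only for this illustration.

```shell
#!/bin/bash
# Stand-in for the Xvfb line: a long-running process started in the background.
sleep 300 &
# $! holds the process ID of the most recently started background process.
HELPER_PID=$!

# The main work (for example, the ImageJ command) would run here
# while the helper keeps running.
echo "Helper is running with PID ${HELPER_PID}"

# Stop the helper, then wait for it so no stray process is left behind.
kill "${HELPER_PID}"
helper_status=0
wait "${HELPER_PID}" 2>/dev/null || helper_status=$?
echo "Helper exited with status ${helper_status}"
```

This sketch uses the default kill signal (SIGTERM) instead of the kill -9 in the job script, which gives the helper a chance to clean up after itself.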

Question 4: (20 points) If we remove the "&" at the end of the Xvfb line what will happen if we submit this script to the cluster? (explain why)

Put your answer to the above question here


Now let us assume we have a directory filled with image files we want to process using our macro.ijm in ImageJ. These files are located in your current directory with the following names:

input1.png
input2.png
input3.png
...
input300.png

We could use the following simple bash script to loop over all of the png files in the current directory and run ImageJ on each file:

for file in *.png
do
    java -jar $EBROOTIMAGEJ/ij.jar -batch ./macro.ijm "${file}"
done
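A loop like this can be previewed before committing hours of compute time. The following sketch is an assumption-laden dry run, not part of the original quiz: it creates three empty stand-in files in a scratch directory (the file names and the workdir and count variables are hypothetical), then prints the command that would run for each file instead of running ImageJ.

```shell
#!/bin/bash
# Create a scratch directory with a few empty stand-in files (hypothetical names).
workdir=$(mktemp -d)
touch "${workdir}/input1.png" "${workdir}/input2.png" "${workdir}/input3.png"
cd "${workdir}" || exit 1

count=0
for file in *.png
do
    # If nothing matched, the pattern is left as the literal "*.png"; skip it.
    [ -e "${file}" ] || continue
    # Print the command instead of running it.
    echo "java -jar \$EBROOTIMAGEJ/ij.jar -batch ./macro.ijm ${file}"
    count=$((count + 1))
done
echo "Would process ${count} file(s)"

# Leave and remove the scratch directory.
cd - >/dev/null || exit 1
rm -rf "${workdir}"
```

Replacing the echo line with the real java command turns the preview back into the working loop.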

However, let's pretend that the time to process each file is 12 minutes and 31 seconds.

Question 5: (10 points) How long (in seconds) will it take to run all 300 files using this loop?

# put your answer to the above question here.

If you think about it, the order of the loop iterations does not matter, so the loop would be really easy to run in parallel. This type of problem is called pleasantly parallel (also known as embarrassingly parallel). The idea is we can just process a different file on a different computer using a SLURM job array.

This type of workflow is also often called "unrolling a loop". Let's assume that we want to unroll the above loop and run it as a job array on the cluster. First, you would need to add the following resource request to the top of your script:

#SBATCH --array=1-300

This --array request will tell SLURM to run 300 otherwise identical jobs. The only differences are that each job may run on a different node of the cluster and each is given a unique array task ID number (the numbers 1-300) in an environment variable called SLURM_ARRAY_TASK_ID.
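To see the task ID mechanism on its own, the sketch below is a small array job that only reports which task it is. It is an illustration under stated assumptions, not the quiz solution: the 1-3 range and the task_id variable are made up, and the script falls back to 0 when run outside SLURM, where SLURM_ARRAY_TASK_ID is not defined.

```shell
#!/bin/bash
#SBATCH --time=00:01:00
#SBATCH --mem=100mb
#SBATCH --array=1-3

# SLURM sets SLURM_ARRAY_TASK_ID for each array task.
# Outside SLURM the variable is unset, so fall back to 0 for local testing.
task_id="${SLURM_ARRAY_TASK_ID:-0}"

echo "Running array task ${task_id} on host $(hostname)"
```

If this were saved under a hypothetical name such as array_demo.sh and submitted with sbatch, SLURM would run it once per task ID, and each run would see a different value of task_id.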

Question 6: (20 points) Modify the following java command (which would be inside your job script) so that it uses the SLURM_ARRAY_TASK_ID environment variable to select a different file name for each task in the array (instead of all jobs trying to use the input1.png file).

Modify this code

java -jar $EBROOTIMAGEJ/ij.jar -batch ./macro.ijm input1.png

Question 7: (10 points) What is the shortest possible time (in seconds) in which we could run the same job using our job array?

Put your answer to the above question here.

Congratulations

You are done with your quiz. Please save the file and upload the Jupyter notebook to the D2L dropbox.

Written by Dr. Dirk Colbry, Michigan State University
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.