Running a parallel MATLAB job on Flux

This will (hopefully) be a quick guide for students to start running parallel MATLAB jobs on the High Performance Computing (HPC) cluster Flux. This can be especially useful for:

  • Parameter sweeps. Spread parameter values over nodes!
  • Monte Carlo simulations. Spread Monte Carlo trials over nodes!

We’ll show the process for an example related to Wigner’s Semicircle Law, a very cool result from Random Matrix Theory. Let’s jump in!

Remark: There are many ways to set this up - we’ll just focus on one here.

Goal

We want to explore the average histogram of eigenvalues for the real symmetric random matrix

$$ \begin{align} X &= \frac{1}{2}\left(Y + Y^T\right) & Y_{ij} &\overset{iid}{\sim} \mathcal{N}(0,1) \end{align} $$

More specifically, we want to

  1. Generate many instances of the random matrix $X$.
  2. Compute the eigenvalues for each instance.
  3. Make a histogram of the eigenvalues collected from all the instances.

Flux allows us to spread this work over nodes. :)
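
If it helps to see the whole computation in one place first, here is a minimal serial MATLAB sketch of steps 1-3 (the filename is just a suggestion; the 100-by-100 size and 9 trials mirror what we’ll run on Flux below). Each loop iteration is exactly the piece of work we’ll hand to one node.

% serial_demo.m (serial sketch of the whole computation)

E = [];                      % eigenvalues collected from all instances
for trial = 1:9
    Y = randn(100);          % 100-by-100 matrix of iid N(0,1) entries
    X = 1/2*(Y + Y');        % symmetrize to form an instance of X
    E = [E eig(X)];          % append this instance's eigenvalues (one column per instance)
end
hist(E(:));                  % histogram of the eigenvalues from all instances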

Step 0: Pre-requisites

You need a few things before we start:

  1. MToken. See the instructions provided by ITS!
  2. Flux user account. Request one through the online signup!
  3. Flux allocation access. Ask your advisor for this. Your user account needs to be granted access to their allocation, and you need to know the name of their allocation.

Step 1: Write the simulation program

We’ll have each node generate one instance of $X$ and compute its eigenvalues. Here’s a MATLAB function to do that!

% simulation.m

function simulation(ARRAYID)

% Parallel Configuration
rng(ARRAYID);                                 % seed depends on the Array ID
OUTFILE = sprintf('data/SIM%g.mat',ARRAYID);  % one output file per Array ID

if exist(OUTFILE,'file') ~= 0
    fprintf('File %s already exists! Simulation %g skipped.\n',OUTFILE,ARRAYID);
    return
end

% Run simulation
Y = randn(100);      % 100-by-100 matrix of iid N(0,1) entries
X = 1/2*(Y+Y');      % symmetrize to form an instance of X
e = eig(X);          % eigenvalues of this instance

% Save outputs
save(OUTFILE);

end
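
Before involving Flux at all, you can sanity-check the function locally in MATLAB. Here’s a quick sketch, assuming you run it from the directory containing simulation.m (the argument 1 plays the role of the Array ID explained next):

if ~exist('data','dir'), mkdir('data'); end   % output folder expected by simulation.m
simulation(1);                                % pretend we are Array ID 1
load('data/SIM1.mat','e');                    % pull out the saved eigenvalues
hist(e);                                      % eigenvalue histogram for this one instance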

When we submit this to Flux, we’ll provide a list of “Array IDs”. For each Array ID, Flux will allocate a node to us and run our MATLAB function on it with that ID as input.

Note that we

  • seed the random number generator using ARRAYID so that nodes don’t all generate the same random numbers,
  • save the results in an output file named after ARRAYID, and
  • check whether the output file already exists. If Flux goes down before all the nodes finish, we can resubmit the job without wasting time redoing the runs in the array that already completed.

Remark: For parameter sweeps, ARRAYID is a great way to select the parameters to use for each node.
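
For instance (purely illustrative; the parameter grids below are made up and separate from the eigenvalue experiment), you could let each Array ID index one combination of two swept parameters:

% sweep_example.m (illustration only)

function sweep_example(ARRAYID)

sigmas = [0.1 0.5 1.0];     % hypothetical values for parameter 1
sizes  = [50 100 200];      % hypothetical values for parameter 2

% Map Array IDs 1,...,9 to the 3-by-3 grid of (sigma, n) combinations
[i,j] = ind2sub([length(sigmas) length(sizes)], ARRAYID);
sigma = sigmas(i); n = sizes(j);

rng(ARRAYID);
Y = sigma*randn(n); X = 1/2*(Y+Y'); e = eig(X);
save(sprintf('data/SWEEP%g.mat',ARRAYID));

end

The Array ID list in the PBS script (see Step 2) should then match the total number of parameter combinations, e.g., #PBS -t 1-9 for this 3-by-3 grid.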

Step 2: Write the PBS script

A PBS script describes the job we want to run so that Flux can schedule and run it.

# script.pbs

## PBS Directives (Configuration)
# Job Description and Messaging (i.e., notifications)
#PBS -N EigRand
#PBS -M [your email here]
#PBS -m abe

# Account Information
#PBS -A [allocation name here]
#PBS -l qos=flux
#PBS -q flux

# Requested Resources and Environment
#PBS -l nodes=1:ppn=1,pmem=1gb
#PBS -l walltime=15:00
#PBS -V

# Job array (1 to 9 with at most 3 running at once)
#PBS -t 1-9%3

# Location for log files (stdout and stderr)
#PBS -o logs/
#PBS -e logs/

## Script
cd $PBS_O_WORKDIR
matlab -nodisplay -r "simulation($PBS_ARRAYID)"

The PBS script is actually just a normal bash script with “PBS Directives” at the top. The script gets run on each node with access to some special environment variables like:

  • $PBS_O_WORKDIR: the directory we submitted the job from
  • $PBS_ARRAYID: the Array ID assigned to that node

Note that this script is what runs our MATLAB function above with the Array ID as input.

Each PBS directive starts with #PBS and tells Flux about our job:

  • #PBS -N EigRand sets the name of the job.
  • #PBS -M [your email here] sets the email you want to use for messages from Flux.
  • #PBS -m abe configures Flux to email you when each Array ID aborts, begins and ends.
  • #PBS -A [allocation name here] sets the allocation you are using.
  • #PBS -l qos=flux sets the quality of service (this should be flux unless told otherwise).
  • #PBS -q flux sets the queue. It generally matches the allocation name suffix (e.g., an allocation called default_flux would use the flux queue).
  • #PBS -l nodes=1:ppn=1,pmem=1gb requests (roughly speaking) that each Array ID get 1 node with 1 processor per node and 1 GB of physical memory.
  • #PBS -l walltime=15:00 requests 15 minutes for each Array ID to complete. Once this time is up, Flux kills our program even if it’s still running.
  • #PBS -V tells Flux to copy the environment variables from where we submit the job to each node. This is important because we’ll load MATLAB into our path before submitting, and that change needs to carry over to all the nodes.
  • #PBS -t 1-9%3 sets the list of Array IDs to be 1,2,…,9. It also tells Flux to only run 3 Array IDs at a time (that way you don’t hog all the nodes available in the allocation!).
  • #PBS -o logs/ and #PBS -e logs/ tell Flux where to store the stdout and stderr streams from each run.

Step 3: Submit the job

We now have all the files we need ready! Time to upload them to Flux and submit the job.

Uploading to Flux

Upload simulation.m and script.pbs to your directory in /scratch using the transfer server flux-xfer.engin.umich.edu.

Don’t know what your directory in /scratch is? Ask your advisor. It’s likely something like /scratch/[allocation name here]/[your uniquename here].

Submitting the job

Sign in to the login server flux-login.engin.umich.edu and run

cd [your directory on scratch here]
mkdir data/ logs/
module load matlab/2015a
qsub script.pbs

Note: This must be done from the university network (i.e., you’ll need to be on campus, VPN in, or go through another on-campus server first), and you’ll need to use your MToken to authenticate.

These commands

  1. move us into the directory where we put our files
  2. create directories for the output files
  3. add MATLAB 2015a to the path environment variable
  4. submit the job to Flux

You will see an output like

$ qsub script.pbs 
18319311[].nyx.arc-ts.umich.edu

The number (18319311 in this example) is the Job ID.

Keeping track of the job

You’ll receive an email from Flux when each Array ID begins, ends, or aborts, thanks to the PBS directive #PBS -m abe. To check the current status, run the following command on the login server.

qstat -t -au [your uniquename here]

You’ll see something like this.

$ qstat -t -au dahong

nyx.arc-ts.umich.edu: 
                                                                                  Req'd       Req'd       Elap
Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory      Time    S   Time
----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------
18319311[1].nyx.arc-ts  dahong      flux     EigRand-1           --      1      1       1gb  00:15:00 Q       -- 
18319311[2].nyx.arc-ts  dahong      flux     EigRand-2           --      1      1       1gb  00:15:00 Q       -- 
18319311[3].nyx.arc-ts  dahong      flux     EigRand-3           --      1      1       1gb  00:15:00 Q       -- 
18319311[4].nyx.arc-ts  dahong      flux     EigRand-4           --      1      1       1gb  00:15:00 H       -- 
18319311[5].nyx.arc-ts  dahong      flux     EigRand-5           --      1      1       1gb  00:15:00 H       -- 
18319311[6].nyx.arc-ts  dahong      flux     EigRand-6           --      1      1       1gb  00:15:00 H       -- 
18319311[7].nyx.arc-ts  dahong      flux     EigRand-7           --      1      1       1gb  00:15:00 H       -- 
18319311[8].nyx.arc-ts  dahong      flux     EigRand-8           --      1      1       1gb  00:15:00 H       -- 
18319311[9].nyx.arc-ts  dahong      flux     EigRand-9           --      1      1       1gb  00:15:00 H       -- 

The S column shows the state of each Array ID: Held (H), Queued (Q), Running (R), Exiting (E), or Completed (C). Note that the first 3 are Queued and the rest are Held because we told Flux to run only 3 Array IDs at a time.

After some time, the Flux scheduler starts running some of the Array IDs. Here is the output after one Array ID has Completed, two are Running, and a new one has been Queued in the freed slot. Note that the Elap Time column shows how long the currently running Array IDs have been running.

$ qstat -t -au dahong

nyx.arc-ts.umich.edu: 
                                                                                  Req'd       Req'd       Elap
Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory      Time    S   Time
----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------
18319311[1].nyx.arc-ts  dahong      flux     EigRand-1         65929     1      1       1gb  00:15:00 C       -- 
18319311[2].nyx.arc-ts  dahong      flux     EigRand-2         27082     1      1       1gb  00:15:00 R  00:00:25
18319311[3].nyx.arc-ts  dahong      flux     EigRand-3        112572     1      1       1gb  00:15:00 R  00:00:25
18319311[4].nyx.arc-ts  dahong      flux     EigRand-4           --      1      1       1gb  00:15:00 Q       -- 
18319311[5].nyx.arc-ts  dahong      flux     EigRand-5           --      1      1       1gb  00:15:00 H       -- 
18319311[6].nyx.arc-ts  dahong      flux     EigRand-6           --      1      1       1gb  00:15:00 H       -- 
18319311[7].nyx.arc-ts  dahong      flux     EigRand-7           --      1      1       1gb  00:15:00 H       -- 
18319311[8].nyx.arc-ts  dahong      flux     EigRand-8           --      1      1       1gb  00:15:00 H       -- 
18319311[9].nyx.arc-ts  dahong      flux     EigRand-9           --      1      1       1gb  00:15:00 H       -- 

Once they are all done, the output looks like this.

$ qstat -t -au dahong

nyx.arc-ts.umich.edu: 
                                                                                  Req'd       Req'd       Elap
Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory      Time    S   Time
----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------
18319311[1].nyx.arc-ts  dahong      flux     EigRand-1         65929     1      1       1gb  00:15:00 C       -- 
18319311[2].nyx.arc-ts  dahong      flux     EigRand-2         27082     1      1       1gb  00:15:00 C       -- 
18319311[3].nyx.arc-ts  dahong      flux     EigRand-3        112572     1      1       1gb  00:15:00 C       -- 
18319311[4].nyx.arc-ts  dahong      flux     EigRand-4         66196     1      1       1gb  00:15:00 C       -- 
18319311[5].nyx.arc-ts  dahong      flux     EigRand-5         66353     1      1       1gb  00:15:00 C       -- 
18319311[6].nyx.arc-ts  dahong      flux     EigRand-6        121724     1      1       1gb  00:15:00 C       -- 
18319311[7].nyx.arc-ts  dahong      flux     EigRand-7         27418     1      1       1gb  00:15:00 C       -- 
18319311[8].nyx.arc-ts  dahong      flux     EigRand-8         66509     1      1       1gb  00:15:00 C       -- 
18319311[9].nyx.arc-ts  dahong      flux     EigRand-9        121903     1      1       1gb  00:15:00 C       -- 

Step 4: Download output files and merge the results

Once all Array IDs have completed, download the directories containing the output files, again using the transfer server flux-xfer.engin.umich.edu.

The directory data/ should contain files named SIM1.mat, … , SIM9.mat. These are the files saved by the MATLAB function.

The directory logs/ should contain files named something like

  • EigRand.o18319311-1, … , EigRand.o18319311-9; these are the standard output (stdout) streams from the Array IDs.
  • EigRand.e18319311-1, … , EigRand.e18319311-9; these are the standard error (stderr) streams from the Array IDs.

Now we need to merge the results from the many files (one for each Array ID) into a single data file. A good way is to write a short MATLAB program like this.

% merge.m

function merge(ARRAYID_LIST)

E = [];                       % eigenvalues collected from all runs
for i = 1:length(ARRAYID_LIST)
    OUTFILE = sprintf('data/SIM%g.mat',ARRAYID_LIST(i));
    load(OUTFILE);            % loads the saved variables, including the eigenvalue vector e

    E = [E e];                % append this run's eigenvalues as a new column
end

save('data/SIM-Merged.mat','ARRAYID_LIST','E');

end
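
A side note on load(OUTFILE): since simulation.m called save with no variable list, each .mat file holds the whole workspace (ARRAYID, OUTFILE, Y, X and e), and load pulls all of it into merge’s workspace. That’s harmless here, but if you’d rather grab only what you need, a variant of the loop body (same result) is:

OUTFILE = sprintf('data/SIM%g.mat',ARRAYID_LIST(i));
S = load(OUTFILE,'e');    % load only the eigenvalue vector, into a struct
E = [E S.e];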

In MATLAB, run this program on the Array ID list we used (1, 2, …, 9) with the command

merge(1:9);

This will generate a new file data/SIM-Merged.mat that we can use to make our histogram as follows

load('data/SIM-Merged.mat');
hist(E(:)); xlabel('Eigenvalues'); ylabel('Histogram');
title('Average Eigenvalue Distribution');

After you run this, you should get (something like) the following histogram

[Figure: histogram of the eigenvalues collected from all the instances]

Turns out the histogram is (close to) a semicircle, as was predicted! :)
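
If you want to compare against the theory, you can overlay the limiting semicircle density on a density-normalized histogram. For this setup (100-by-100 matrices with off-diagonal variance 1/2), my back-of-the-envelope calculation puts the support at roughly [-sqrt(200), sqrt(200)] with density sqrt(200 - x^2)/(100*pi); treat the constants in this sketch as something to double-check rather than a definitive formula.

% Overlay an (approximate) theoretical semicircle on a normalized histogram
load('data/SIM-Merged.mat');
n = 100;                                        % matrix size used in simulation.m
[counts,centers] = hist(E(:),50);
binwidth = centers(2) - centers(1);
bar(centers, counts/(sum(counts)*binwidth));    % empirical eigenvalue density
hold on;
x = linspace(-sqrt(2*n), sqrt(2*n), 200);
plot(x, sqrt(max(2*n - x.^2, 0))/(pi*n), 'r', 'LineWidth', 2);   % candidate semicircle density
hold off;
xlabel('Eigenvalues'); ylabel('Density'); legend('Empirical','Semicircle');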

Conclusion

To run a different parallel MATLAB job, replace the body of simulation.m with the code you want each Array ID to run and adjust script.pbs to match. In particular, remember to change the job name, the requested resources (especially the physical memory and walltime), and the Array ID list.

Handy commands

Here’s an evolving list of some handy commands you can issue from the login server!

To get a live/streaming view of the status of your jobs (refreshes every ~2 secs), use watch.

watch "qstat -t -au [your uniquename here]"

If the list is too long (e.g., because of previous jobs that haven’t been cleared yet), use tail.

watch "qstat -t -au [your uniquename here] | tail -n [number of entries]"