# Lecture 11: Odyssey!!!

Date: 10/05/2017, Thursday

You are expected to finish 8+1 tiny tasks. They will help you get prepared for the final project!

Related resources:

- Ryans’s Linux tutorial
- *Intro-to-Odssey-S17_am111.pdf* on Canvas
- Odyssey quickstart guide
- MATLAB on Odyssey
- Parallel MATLAB on Odyssey

## Task 1: Command line on your laptop

### Preparation

Read the Session 4 notes, especially Ryans’s tutorial, if you didn’t come to Monday’s session.

After reading Chapter 1 to Chapter 5, you should at least know the following Linux commands:

- `ls`
- `pwd`
- `mkdir`
- `cd`
- `mv`
- `rm` and `rm -rf`
- `cp` and `cp -r`
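To practise, the commands above can be strung together in a scratch directory. This is only a sketch: the directory and file names (`project`, `a.m`, and so on) are made up for illustration, and `mktemp -d` is used so that `rm -rf` cannot touch anything important.

```shell
# Work inside a throwaway directory (all names below are illustrative)
workdir=$(mktemp -d)
cd "$workdir"
pwd                        # show where we are

mkdir project              # make a new directory
cd project                 # move into it
echo "disp('hi')" > a.m    # create a small file to play with
ls                         # list files: a.m

cp a.m b.m                 # copy a file
mv b.m c.m                 # rename (move) the copy
rm a.m                     # delete the original
cd ..                      # go back up one level

cp -r project project_backup   # copy a whole directory
rm -rf project                 # delete a directory and everything in it
ls                             # only project_backup is left
```

Be careful with `rm -rf` outside a scratch directory: there is no trash can on the command line.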

If you choose *vi/vim* as your text editor, read Chapter 6. Then you should at least know the following *vim* commands:

- `i`
- `esc`
- `:wq`
- `:q!`

Find your own tutorial if you choose other text editors.

### Writing code in terminal

**Task**: Use *vim* or another command-line text editor to create a MATLAB
file *hello.m* with the content `disp('hello world!')`.

We use *vim* as an example.

First, create a text file by

```
vim hello.m
```

(If *hello.m* already exists, then it will just open that file)

Inside *vim*, type `i` to enter *Insert Mode*.

Then type the code as usual. For example

```
disp('hello world!')
```

After writing the content, press `esc` to go back to *Command Mode*.

Finally, type `:wq` to save and quit *vim*.

Again, read Chapter 6 for more *vim* usage!

**Tips**: You can check the content of *hello.m* with a graphical editor. On
Mac, you can use `open ./` to open Finder, and then open the
*hello.m* that you’ve just created. On Odyssey (see Task 2), there’s no
graphical editor, so you will also use *vim* to check the file content.
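For a quick look at a short file without opening an editor, `cat` prints it to the terminal. The sketch below is an aside rather than part of the task: it writes *hello.m* with a shell here-document (an alternative to typing it in *vim*) and then prints it back. Note that it overwrites any *hello.m* in the current directory.

```shell
# Write hello.m in one go; the quoted 'EOF' keeps the text exactly as typed
cat > hello.m <<'EOF'
disp('hello world!')
EOF

# Print the file to check its content
cat hello.m
```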

### Running MATLAB interactively in terminal

**Windows users can jump to Task 2 because I am not sure if the
following stuff would work.**

Find the MATLAB executable path on your laptop. On Mac it should be something like

```
/Applications/MATLAB_R2017a.app/bin/matlab
```

Running the above command will open the traditional graphical version of MATLAB.

To only use the command line, add 3 options:

```
/Applications/MATLAB_R2017a.app/bin/matlab -nojvm -nosplash -nodesktop
```

Play with this command line version of MATLAB for a while. Type `exit` to quit.

### Set shortcut

If you are tired of typing this long command, you can set an alias:

```
alias matlab='/Applications/MATLAB_R2017a.app/bin/matlab'
```

Then you can simply type `matlab` to launch the program. However, this
shortcut will go away if you close the terminal. To make it permanent,
add the above command to a configuration file in your home directory called
*~/.bash_profile*. You can edit it with *vim*, for example:

```
vim ~/.bash_profile
```
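If you would rather not open an editor, the alias line can also be appended from the shell. The sketch below uses a hypothetical helper, `add_line_once`, that appends a line only when it is not already in the file, so running it twice does not create duplicates. It is demonstrated on a temporary file; pointing it at *~/.bash_profile* is left to you.

```shell
# Hypothetical helper: append a line to a file unless it is already there
add_line_once() {
    line="$1"
    file="$2"
    touch "$file"                                   # create the file if needed
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

alias_line="alias matlab='/Applications/MATLAB_R2017a.app/bin/matlab'"

# Try it on a scratch file first
demo=$(mktemp)
add_line_once "$alias_line" "$demo"
add_line_once "$alias_line" "$demo"    # second call changes nothing
cat "$demo"

# For real use (not run here): add_line_once "$alias_line" ~/.bash_profile
```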

### Running MATLAB scripts in terminal

`cd` to the directory where you saved the *hello.m* file. You can
execute it by launching MATLAB and then typing `hello` at the MATLAB prompt:

```
matlab -nojvm -nosplash -nodesktop
hello
```

Or you can use `-r` to combine the two commands:

```
matlab -nojvm -nosplash -nodesktop -r hello
```

If you didn’t set the shortcut, the full command would be

```
/Applications/MATLAB_R2017a.app/bin/matlab -nojvm -nosplash -nodesktop -r hello
```
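Two details are easy to trip over here: `-r` takes the script name without the `.m` extension, and MATLAB stays open after the script finishes unless you also tell it to `exit`. A minimal sketch that handles both is shown below. The function name `run_m` is made up, and it assumes a `matlab` executable on your `PATH` (an alias typed in a terminal is generally not visible inside a shell script); if none is found, it only prints the command it would have run.

```shell
# Hypothetical helper: run a MATLAB script from the shell and quit afterwards
run_m() {
    script="${1%.m}"    # -r expects the script name without the .m extension
    if command -v matlab >/dev/null 2>&1; then
        matlab -nojvm -nosplash -nodesktop -r "$script; exit"
    else
        # MATLAB is not on the PATH here, so only show what would be run
        echo "would run: matlab -nojvm -nosplash -nodesktop -r \"$script; exit\""
    fi
}

run_m hello.m
```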

(I actually prefer this command line version to the complicated graphic version!)

## Task 2: Command line on Odyssey

### Login

Log in to Odyssey by

```
ssh am111uXXXX@login.rc.fas.harvard.edu
```

Check the Odyssey website if you have any trouble.

**Tips**: You can open multiple terminals and log in to Odyssey from each, if one
is not enough for you.

### File transfer

#### Use scp

You can transfer files with the built-in `scp` (secure copy) command.
**Make sure you are running this command on your laptop, not on
Odyssey.**

From your laptop to Odyssey (first figure out your Odyssey home directory
path with `pwd`):

```
scp local_file_path username@login.rc.fas.harvard.edu:/path_shown_by_pwd_on_Odyssey
```

**Try to transfer *hello.m* that you wrote in Task 1 to Odyssey!** You
will be asked to enter your password again.

Transferring from Odyssey to your laptop just reverses the arguments:

```
scp username@login.rc.fas.harvard.edu:/file_path_on_odyssey local_file_path
```

Use `scp -r` to transfer a directory (similar to `cp -r`).
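It can help to put the account name and remote path in shell variables so that the two directions look symmetric. The sketch below is a dry run: each command is prefixed with `echo`, so it only prints what would be executed. The user name and remote path are the placeholders used above, and the directory name `results` is invented for the example.

```shell
# Placeholders: replace with your own account and the path printed by pwd on Odyssey
user="am111uXXXX"
host="login.rc.fas.harvard.edu"
remote_dir="/path_shown_by_pwd_on_Odyssey"

# Laptop -> Odyssey: copy a single file
echo scp hello.m "$user@$host:$remote_dir/"

# Odyssey -> laptop: copy a whole directory into the current directory
echo scp -r "$user@$host:$remote_dir/results" .
```

Remove the leading `echo` to perform the transfer for real.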

## Task 3: MATLAB on Odyssey

### Load MATLAB

Load MATLAB by

```
module load matlab
```

(If you get an error, run `source new-modules.sh` and try again.)

It loads the latest version by default. You can check the version with
`which`:

```
[username]$ which matlab
alias matlab='matlab -singleCompThread'
/n/sw/matlab-R2017a/bin/matlab
```

Or you can load a specific version

```
module load matlab/R2017a-fasrc01
```

Use this RC portal to find available software and the corresponding loading command. Search for MATLAB. How many different versions do you see?

### Run MATLAB

After loading MATLAB, you can run it the same way as on your laptop:

```
matlab -nojvm -nosplash -nodesktop
```

The 3 options are crucial because there’s no graphical user interface on Odyssey.

Play with it, and type `exit` to quit.

Run *hello.m* with `matlab -nojvm -nosplash -nodesktop -r hello`.

## Task 4: Interactive Job on Odyssey

After logging into Odyssey, you are on a *home node* with very few
computational resources. For any serious computing work you need to
switch to a *compute node*. The easiest way is to do this interactively
(more about interactive mode):

```
srun -t 0-0:30 -c 4 -N 1 --pty -p interact /bin/bash
```

Here we request 30 minutes of computing time (`-t 0-0:30`) on 4 CPUs
(`-c 4`), on a single computer (`-N 1`), using interactive mode
(`--pty` and `/bin/bash`).
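The `-t` value uses Slurm's `days-hours:minutes` format, which is easy to misread. As a small illustration (not something you need in order to submit jobs), the sketch below splits `0-0:30` into its parts with plain shell string operations and converts it to minutes. It only handles this one format, and it assumes the parts have no leading zeros other than a bare `0`.

```shell
t="0-0:30"            # same value as the -t option above

days=${t%%-*}         # text before the dash   -> 0
rest=${t#*-}          # text after the dash    -> 0:30
hours=${rest%%:*}     # text before the colon  -> 0
minutes=${rest#*:}    # text after the colon   -> 30

total=$((days * 24 * 60 + hours * 60 + minutes))
echo "$t is $total minutes"
```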

**Warning: Don’t request too many CPUs! This will make you wait for much
longer.**

`-p interact` only means you are requesting CPUs on the *interactive
partition*, but doesn’t mean that you want it to run interactively. The
following command starts interactive mode on the *general partition*
(more about partition).

```
srun -t 0-0:30 -c 4 -N 1 --pty -p general /bin/bash
```

**Then repeat what you’ve done in Task 3.**
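After `srun` returns a prompt it is not always obvious whether you have actually landed on a compute node. Slurm sets the environment variable `SLURM_JOB_ID` inside a job, so a quick check along the following lines can tell the two situations apart. This is a sketch, and the wording of the messages is made up.

```shell
# SLURM_JOB_ID is set by Slurm inside a job and is empty otherwise
if [ -n "${SLURM_JOB_ID:-}" ]; then
    where="inside Slurm job $SLURM_JOB_ID (compute node)"
else
    where="not inside a Slurm job (home node or laptop)"
fi

# uname -n prints the name of the machine you are on
echo "$(uname -n): $where"
```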

## Task 5: Batch Job on Odyssey

If your job runs for hours or even days, you can submit it as a *batch
job*, so you don’t need to keep your terminal open all the time. You are
allowed to log out and go away while the job is running.

Create a file called *runscript.sh* with the following content (you can
use *vim* to create such a text file):

```
#!/bin/bash
#SBATCH -J Matlabjob1
#SBATCH -p general
#SBATCH -c 1 # single CPU
#SBATCH -t 00:05:00
#SBATCH --mem=400M # memory
#SBATCH -o %j.o # output filename
#SBATCH -e %j.e # error filename
## LOAD SOFTWARE ENV ##
source new-modules.sh
module purge
module load matlab/R2017a-fasrc01
## EXECUTE CODE ##
matlab -nojvm -nodisplay -nosplash -r hello
```

It just puts the options you’ve used in Task 4 into a text file.

Make sure *runscript.sh* is in the same directory as *hello.m*, then
execute

```
sbatch runscript.sh
```

Use `sacct` to check job status. You should get some output files once
it is finished. (more about submitting and monitoring jobs)
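`sbatch` replies with a line of the form `Submitted batch job <id>`, and `%j` in the `-o` and `-e` options of *runscript.sh* is replaced by that id. The sketch below shows how the output file names follow from the reply. The job number is invented, so on your machine the loop will simply report that the files are missing.

```shell
msg="Submitted batch job 12345678"   # example reply; the job number is invented
jobid=${msg##* }                     # keep the text after the last space
echo "stdout file: $jobid.o"
echo "stderr file: $jobid.e"

# Print each file if it exists and is non-empty
for f in "$jobid.o" "$jobid.e"; do
    if [ -s "$f" ]; then
        echo "== $f =="
        cat "$f"
    else
        echo "$f is missing or empty"
    fi
done
```

An empty `.e` file is good news: it means nothing was written to the error stream.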

**Tips: always test your code in interactive mode before submitting a
batch job!**

## Task 6: Use MATLAB-parallel on your laptop

Make sure you’ve installed the Parallel Computing Toolbox. To start the command
line version in parallel mode, remove the `-nojvm` option, because the parallel
pool needs the Java virtual machine.
(The original graphical version works as usual.)

```
matlab -nosplash -nodesktop
```

Initialize parallel mode by

```
parpool('local', 2)
```

```
Starting parallel pool (parpool) using the 'local' profile ...
connected to 2 workers.
ans =
Pool with properties:
Connected: true
NumWorkers: 2
Cluster: local
AttachedFiles: {}
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
```

Then run this script several times to make sure you get a speed-up from
the parallel for-loop (`parfor`):

```
n = 1e9;

% Serial loop
X = 0;
tic
for i = 1:n
    X = X + 1;
end
T = toc;
fprintf('serial time: %f; result: %d \n',T,X)

% Parallel loop: X is a reduction variable, so parfor combines the workers' partial sums
X = 0;
tic
parfor i = 1:n
    X = X + 1;
end
T = toc;
fprintf('parallel time: %f; result: %d \n',T,X)

```
serial time: 2.724932; result: 1000000000
parallel time: 1.748450; result: 1000000000
```

**Tips**: For the command line version of MATLAB, save the code as
*parallel_timing.m*, and then execute `parallel_timing` inside MATLAB.

Finally, quit the parallel mode

```
delete(gcp)
```

## Task 7: Use MATLAB-parallel on Odyssey interactive mode

Repeat what you’ve done in Task 6, but on Odyssey. This might **not** be
as straightforward as you expected!

You need to request enough memory for the Parallel Computing Toolbox:

```
srun -t 0-0:30 -c 4 -N 1 --mem-per-cpu 4000 --pty -p interact /bin/bash
```
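Note that `--mem-per-cpu` is per CPU, and Slurm reads a bare number as megabytes, so the total request grows with `-c`. A one-line check of the arithmetic for the command above (the variable names are just for illustration):

```shell
cpus=4                 # from -c 4
mem_per_cpu_mb=4000    # from --mem-per-cpu 4000 (megabytes)

total_mb=$((cpus * mem_per_cpu_mb))
echo "total memory requested: $total_mb MB"
```

So this job asks for 16000 MB in total, which is worth keeping in mind before raising either number.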

The environment variable *SLURM_CPUS_PER_TASK* tells you how many CPUs
are available:

```
echo $SLURM_CPUS_PER_TASK
4
```
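Outside a Slurm job (on your laptop, for example), or when `-c` was not given, this variable is not set and `echo` prints an empty line. If you reuse the same commands in both places, a fallback value avoids surprises. A minimal sketch, where the variable name `ncpu` and the default of 1 are arbitrary choices:

```shell
# Use the Slurm value when it exists, otherwise fall back to 1
ncpu="${SLURM_CPUS_PER_TASK:-1}"
echo "CPUs available to this job: $ncpu"
```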

For parallel support, you need to call `matlab-default` instead of
`matlab` to launch the program, as described here. (Recall from Task 3 that
`matlab` on Odyssey is an alias for `matlab -singleCompThread`.)

```
module load matlab
matlab-default -nosplash -nodesktop
```

Inside MATLAB, you can again check the number of CPUs by

```
getenv('SLURM_CPUS_PER_TASK')
ans = '4'
```

Initialize parallel mode by (this is a general code for any number of CPUs)

```
parpool('local', str2num(getenv('SLURM_CPUS_PER_TASK')) )
```

The initialization might take several minutes on Odyssey. Eventually you should see something like

```
ans =
Pool with properties:
Connected: true
NumWorkers: 4
Cluster: local
AttachedFiles: {}
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
```

Then, execute the *parallel_timing.m* script from Task 6. You should see
a speed-up like this:

```
>> parallel_timing
serial time: 12.228084; result: 1000000000
parallel time: 2.667366; result: 1000000000
```
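The speed-up is simply the serial time divided by the parallel time. The shell cannot do floating-point arithmetic by itself, so the sketch below hands the division to `awk`. The function name `speedup` is made up.

```shell
# Hypothetical helper: speed-up = serial time / parallel time
speedup() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'
}

speedup 12.228084 2.667366    # timings from the Odyssey run above
```

For these timings it prints `4.58`. Your own numbers will differ from run to run.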

## Task 8: MATLAB-parallel as batch Job

Slightly modify the script *parallel_timing.m* from Task 6. Call it
*parallel_timing_batch.m* this time.

```
% Start one worker per CPU granted by Slurm
parpool('local', str2num(getenv('SLURM_CPUS_PER_TASK')))
n = 1e9;

% Serial loop
X = 0;
tic
for i = 1:n
    X = X + 1;
end
T = toc;
fprintf('serial time: %f; result: %d \n',T,X)

% First parallel loop
X = 0;
tic
parfor i = 1:n
    X = X + 1;
end
T = toc;
fprintf('parallel time: %f; result: %d \n',T,X)

% Second, identical parallel loop
X = 0;
tic
parfor i = 1:n
    X = X + 1;
end
T = toc;
fprintf('parallel time: %f; result: %d \n',T,X)

% Shut down the pool
delete(gcp)

Then, change the *runscript.sh* from Task 5 accordingly:

```
#!/bin/bash
#SBATCH -J timing
#SBATCH -o timing.out
#SBATCH -e timing.err
#SBATCH -N 1
#SBATCH -c 4
#SBATCH -t 0-00:20
#SBATCH -p general
#SBATCH --mem-per-cpu 8000
source new-modules.sh
module load matlab
srun -n 1 -c 4 matlab-default -nosplash -nodesktop -r parallel_timing_batch
```

Submit this job. It will take many minutes to finish. Do you get the expected speed-up?

In timing.out, you should see something like

```
ans =
Pool with properties:
Connected: true
NumWorkers: 4
Cluster: local
AttachedFiles: {}
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
serial time: 7.635188; result: 1000000000
parallel time: 5.901599; result: 1000000000
parallel time: 3.516169; result: 1000000000
Parallel pool using the 'local' profile is shutting down.
```
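To compare the runs without reading the whole log, the timing lines can be picked out and turned into speed-ups with `awk`. The sketch below works on a temporary file that stands in for *timing.out* and contains only the three timing lines shown above. On Odyssey you would point `awk` at the real *timing.out* instead.

```shell
# Stand-in for timing.out, using the three timing lines shown above
log=$(mktemp)
cat > "$log" <<'EOF'
serial time: 7.635188; result: 1000000000
parallel time: 5.901599; result: 1000000000
parallel time: 3.516169; result: 1000000000
EOF

# $3 is the time followed by ";"; adding 0 makes awk keep only the number
report=$(awk '/^serial time:/   { s = $3 + 0 }
              /^parallel time:/ { printf "parallel run %d: speed-up %.2f\n", ++n, s / ($3 + 0) }' "$log")
echo "$report"
```

With these numbers the first `parfor` gives a speed-up of 1.29 and the second 2.17.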

Explain why the second `parfor` is faster than the first `parfor`.

**Tips**: Using a batch job for this kind of small computation is
definitely overkill, as queuing and initializing would take much
longer than the actual computation. You will probably use the interactive
mode much more often in this class.

## Bonus task: make your terminal prettier

Open *~/.bash_profile* (for example `vim ~/.bash_profile`) and add the
following lines.

For Mac

```
export CLICOLOR=1
export LSCOLORS=ExFxBxDxCxegedabagacad
```

For Linux (Odyssey)

```
alias ls="ls --color=auto"
```

Type `source ~/.bash_profile` or relaunch the terminal. Notice any
difference?
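If you like to keep one *~/.bash_profile* for both machines, the two variants can be selected automatically from the output of `uname -s`, which prints `Darwin` on a Mac and `Linux` on Odyssey. This is just one possible arrangement, using the same settings as above.

```shell
# Pick the colour settings that match the operating system
case "$(uname -s)" in
    Darwin)
        # Mac: the BSD ls reads these two variables
        export CLICOLOR=1
        export LSCOLORS=ExFxBxDxCxegedabagacad
        ;;
    Linux)
        # Linux (Odyssey): the GNU ls takes a --color option
        alias ls="ls --color=auto"
        ;;
esac
```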