# intro-to-slurm

## Creating a job

A job consists of two parts: resource requests and job steps. Resource requests specify the number of CPUs, the expected duration, the amount of RAM or disk space, and so on. Job steps describe the tasks that must be done, i.e. the software that must be run.

The typical way of creating a job is to write a submission script. A submission script is a shell script, e.g. a Bash script, whose comments, if they are prefixed with #SBATCH, are understood by Slurm as parameters describing resource requests and other submission options. You can get the complete list of parameters from the sbatch manpage (man sbatch).

**Important**

The #SBATCH directives must appear at the top of the submission file, before any other line except the very first one, which should be the shebang (e.g. #!/bin/bash).

The script itself is a job step. Other job steps are created with the **srun** command.

For instance, the following script, hypothetically named submit.sh,

```
#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=res.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100

srun hostname
srun sleep 60
```

would request one CPU for 10 minutes, along with 100 MB of RAM, in the default queue. When started, the job runs a first job step, srun hostname, which launches the UNIX command hostname on the node on which the requested CPU was allocated. Then a second job step starts the sleep command. Note that the --job-name parameter gives a meaningful name to the job and the --output parameter defines the file to which the output of the job is written.
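
Incidentally, the --output argument accepts the filename patterns documented in the sbatch manpage; a common refinement is to include the job ID in the file name so that successive runs do not overwrite each other:

```
#SBATCH --job-name=test
#SBATCH --output=res_%j.txt
```

Here %j is replaced by the job ID attributed by Slurm at submission time.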

Once the submission script is written properly, you need to submit it to Slurm through the sbatch command, which, upon success, responds with the job ID attributed to the job. (The dollar sign below is the shell prompt.)

```
$ sbatch submit.sh
Submitted batch job 99999999
```

The job then enters the queue in the PENDING state. Once resources become available and the job has the highest priority, an allocation is created for it and it moves to the RUNNING state. If the job completes correctly, it goes to the COMPLETED state; otherwise, it is set to the FAILED state.
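
You can follow the job through these states with the squeue command while it is pending or running, and with the sacct command once it has finished; 99999999 stands for the job ID returned by sbatch:

```
$ squeue -j 99999999
$ sacct -j 99999999
```

squeue shows, among other things, a state column (ST) where PD means pending and R means running; sacct reports accounting information, including the final state.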

Interestingly, you can get near-realtime information about your running job (memory consumption, etc.) with the sstat command, by running sstat -j jobid. You can select what you want sstat to output with the --format parameter. Refer to the manpage for more information (man sstat).
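
For instance, to display the maximum resident memory and the average CPU time of the steps of job jobid (a placeholder for the actual job ID), you could select the JobID, MaxRSS and AveCPU fields:

```
$ sstat --format=JobID,MaxRSS,AveCPU -j jobid
```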

Upon completion, the output file contains the result of the commands run in the script file. In the above example, you can display it with the cat res.txt command.

This example illustrates a serial job, which runs on a single CPU on a single node. It does not take advantage of multi-processor nodes or of the multiple compute nodes available in a cluster. The next sections explain how to create parallel jobs.

# Going parallel

There are several ways a parallel job, one whose tasks are run simultaneously, can be created:

* by running a multi-process program (SPMD paradigm, e.g. with MPI)
* by running a multithreaded program (shared memory paradigm, e.g. with OpenMP or pthreads)
* by running several instances of a single-threaded program (so-called embarrassingly parallel paradigm or a job array)
* by running one master program controlling several slave programs (master/slave paradigm)

In the Slurm context, a task is to be understood as a process. So a multi-process program is made of several tasks. By contrast, a multithreaded program is composed of only one task, which uses several CPUs.

Tasks are requested/created with the --ntasks option, while CPUs, for the multithreaded programs, are requested with the --cpus-per-task option. Tasks cannot be split across several compute nodes, so requesting several CPUs with the --cpus-per-task option will ensure all CPUs are allocated on the same compute node. By contrast, requesting the same amount of CPUs with the --ntasks option may lead to several CPUs being allocated on several, distinct, compute nodes.
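
As an illustrative sketch of the multithreaded case, a submission script for a program built with OpenMP (hypothetically named hello_omp) could look like the following; --cpus-per-task guarantees the four CPUs end up on the same node:

```
#!/bin/bash
#
#SBATCH --job-name=test_omp
#SBATCH --output=res_omp.txt
#
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100

# One task, four CPUs: tell OpenMP how many threads it may spawn
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./hello_omp
```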

## More submission script examples

Here are some quick sample submission scripts. For more detailed information, make sure to have a look at the Slurm FAQ and to follow our training sessions. There is also an interactive Script Generation Wizard you can use to help you create submission scripts.

**Message passing example (MPI)**

```
#!/bin/bash
#
#SBATCH --job-name=test_mpi
#SBATCH --output=res_mpi.txt
#
#SBATCH --ntasks=4
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100

module load OpenMPI
srun hello.mpi
```

Request four cores on the cluster for 10 minutes, using 100 MB of RAM per core. Assuming hello.mpi was compiled with MPI support, srun will create four instances of it, on the nodes allocated by Slurm.

You can try the above example by downloading the example hello world program from Wikipedia (name it for instance wiki_mpi_example.c), and compiling it with

```
module load openmpi
mpicc wiki_mpi_example.c -o hello.mpi
```

The res_mpi.txt file should contain something like

```
0: We have 4 processors
0: Hello 1! Processor 1 reporting for duty
0: Hello 2! Processor 2 reporting for duty
0: Hello 3! Processor 3 reporting for duty
```
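
**Job array example (embarrassingly parallel)**

The embarrassingly parallel paradigm mentioned above can be sketched with a job array. The script below is a hypothetical example: it assumes a program process_file.sh that takes an input file name, and processes input_1.txt through input_8.txt in eight independent jobs.

```
#!/bin/bash
#
#SBATCH --job-name=test_array
#SBATCH --output=res_%a.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH --array=1-8

# Both %a above and $SLURM_ARRAY_TASK_ID below expand to the array index (1..8)
srun ./process_file.sh input_${SLURM_ARRAY_TASK_ID}.txt
```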