Disclaimer: This tutorial has been designed to be run on the IFB Core Cluster or the ABiMS Cluster, both part of the IFB NNCR Cluster. However, apart from the "Software environment" part, the rest applies to any SLURM cluster.
This tutorial aims to present the basic workflow for analysing data on a remote SLURM HPC cluster infrastructure.
Note: at some point, you will have to complete your knowledge with other documentation.
During this tutorial, you will have to launch some commands in a terminal.
This is a terminal:
$ # This is a comment that will not be executed
$ program "This is a command. Don't type the $"
This is the result of the command
The $ is your terminal prompt.
You will have to replace:
- your_login by your login (ex: cnorris)
- your_project by the name of the project you requested

We will establish a connection between your computer and the login node using the SSH (Secure Shell) protocol and the ssh program.
Open a terminal or an alternative (ex: MobaXterm). The connection parameters are:
- Program: ssh
- Host: core.cluster.france-bioinformatique.fr or slurm0.sb-roscoff.fr
- Login: your_login
- Port: 22

Use the ssh program to establish a secure connection with the targeted login node from the terminal:
$ # For the IFB Core Cluster:
$ ssh -Y your_login@core.cluster.france-bioinformatique.fr
$ # For the ABiMS Cluster:
$ ssh -Y your_login@slurm0.sb-roscoff.fr
your_login@slurm0.sb-roscoff.fr's password:
Tip: you will then be prompted to enter your password (beware: at the password prompt, the characters you type are not printed on the screen, for obvious security reasons).
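Optional tip (a minimal sketch, not part of the official IFB instructions): if you don't want to type your password at every connection, you can set up an SSH key pair from your local computer. The key type and host below are just examples, adapt them to your case.
$ # On your LOCAL computer: generate a key pair (accept the defaults or set a passphrase)
$ ssh-keygen -t ed25519
$ # Copy the public key to the cluster (you will be asked for your password one last time)
$ ssh-copy-id your_login@core.cluster.france-bioinformatique.fr
$ # From now on, ssh should connect without prompting for your password
$ ssh -Y your_login@core.cluster.france-bioinformatique.fr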
There are two ways to navigate within a tree of directories:
- Absolute paths: they always start from the root, /
- Relative paths: they do not start with / but directly with the subfolder name (ex: fastq/run1), or with .. to step back one level (ex: ../fastq/run1, ../../../)

-> For this tutorial, we will mainly use absolute paths.
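As a quick illustration of both (a hedged sketch: the fastq/run1 subfolder is just an example, it does not exist in your project yet):
$ # Absolute path: works from anywhere, always starts with /
$ cd /shared/projects/your_project
$ # Relative path: interpreted from the current directory
$ cd fastq/run1
$ # Step back two levels with ..
$ cd ../..
$ # We are back in /shared/projects/your_project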
You can display the current directory with pwd and change the directory with cd:
$ # Display your current directory
$ pwd
/shared/home/your_login
$ # Move to another directory
$ cd /shared/bank
$ # Display your new current directory
$ pwd
/shared/bank
$ # Move to your project directory
$ cd /shared/projects/your_project
$ # Display your current directory
$ pwd
/shared/projects/your_project
$ cd /shared/bank
$ # List the current directory
$ ls
accession2taxid lachancea_kluyveri rosa_chinensis
arabidopsis_thaliana mus_musculus saccharomyces_cerevisiae
bos_taurus nicotiana_tabacum uniprot
canis_lupus_familiaris nr uniprot_swissprot
danio_rerio nt uniref50
homo_sapiens refseq uniref90
$ # Or list directories somewhere on the filesystem
$ ls /shared/bank/uniprot_swissprot/current/
blast diamond fasta flat mapping mmseqs
$ ls /shared/bank/uniprot_swissprot/current/fasta/
uniprot_swissprot.fsa
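For example, you can peek at the beginning of the Swiss-Prot FASTA file listed above (a small, harmless command; the exact file name may differ on your cluster):
$ # Display the first 4 lines of the reference bank file
$ head -n 4 /shared/bank/uniprot_swissprot/current/fasta/uniprot_swissprot.fsa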
You can create directories with mkdir. We will create the following directory tree in your project directory:
.
└── tuto_slurm
├── 01_fastq
└── 02_qualitycontrol
(Computer scientists suck at botany, for them the root is at the top :/)
$ cd /shared/projects/your_project
$ pwd
/shared/projects/your_project
$ ls # So far there is nothing in your project directory
$ mkdir tuto_slurm
$ ls # We can check that the directory has been created
tuto_slurm
$ cd tuto_slurm # Oh! a relative path
$ pwd
/shared/projects/your_project/tuto_slurm
$ mkdir 01_fastq 02_qualitycontrol
$ ls
01_fastq 02_qualitycontrol
$ cd /shared/projects/your_project
$ tree # tree will help you to display a nice tree
.
└── tuto_slurm
├── 01_fastq
└── 02_qualitycontrol
3 directories, 0 files
# you can also use: tree -d
# to display only directory structure
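Note that the same tree could have been created in a single command with the -p option of mkdir, which creates the parent directories as needed (a sketch of an equivalent alternative to the steps above):
$ cd /shared/projects/your_project
$ # -p creates tuto_slurm and its subdirectories in one go
$ mkdir -p tuto_slurm/01_fastq tuto_slurm/02_qualitycontrol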
You can either fetch data:
- directly from the internet (ex: a public repository) with wget
- from your personal computer with an SFTP client

In this part, we will choose the first solution and fetch a file with wget from Zenodo, a public repository.
The usage of an SFTP client is explained in the section "Transfer: get back your results on your personal computer". It's just the reverse!
For this tutorial, we will borrow a FASTQ file provided by the excellent Galaxy Training Network.
$ cd /shared/projects/your_project/tuto_slurm/01_fastq
$ wget https://zenodo.org/record/61771/files/GSM461178_untreat_paired_subset_1.fastq
$ ls
GSM461178_untreat_paired_subset_1.fastq
$ ls -lh # Two options of ls that will, among other things, give us the size of our file: 20MB. "l" for long format and "h" for human readable
total 20M
-rw-r--r-- 1 your_login root 20M Nov 6 07:33 GSM461178_untreat_paired_subset_1.fastq
At the IFB, the cluster administrators install all the tools required by the users. To access a tool, you need to load it into your environment using a dedicated application called module.
Let's load the software environment for FastQC, a quality control tool.
$ # List all the softwares and versions available
$ module avail
abyss/2.2.1 emboss/6.6.0 mirdeep2/2.0.1.2 rseqc/2.6.4
adxv/1.9.14 enabrowsertools/1.5.4 mixcr/2.1.10 rstudio-server/1.2.5042
alientrimmer/0.4.1 ensembl-vep/98.2 mmseqs2/8-fac81 salmon/0.11.3
anvio/6.1 epa-ng/0.3.6 mmseqs2/8.fac81 salmon/0.14.1
anvio/6.2 epic2/0.0.41 mmseqs2/10-6d92c salmon/0.14.2
$ # List the different versions of one software
$ module avail fastqc
fastqc/0.11.5 fastqc/0.11.7 fastqc/0.11.8 fastqc/0.11.9
$ # We can check that the fastqc application isn't available by default
$ fastqc --version
-bash: fastqc: command not found
$ # Load the module for fastqc version 0.11.9
$ module load fastqc/0.11.9
$ # Check the availability and the version
$ fastqc --version
FastQC v0.11.9
$ # List loaded modules
$ module list
Currently Loaded Modulefiles:
1) fastqc/0.11.9
Note that the module load command is only effective for your current terminal session. You have to load the module in each new session and at the beginning of your sbatch scripts (cf. below).
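Conversely, you can clean your environment with module unload or module purge (standard Environment Modules commands, shown here as a sketch):
$ # Unload a single module
$ module unload fastqc/0.11.9
$ # Or unload every loaded module at once
$ module purge
$ # The tool is no longer available
$ fastqc --version
-bash: fastqc: command not found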
At the IFB, the scientific software environments are provided as Conda environments or Singularity containers. To offer the same interface for both Conda and Singularity technologies, the IFB NNCR Cluster provides an abstraction layer with Environment Modules.
A compute node is built from the usual computer components: CPUs (cores), memory (RAM) and storage. A HPC/SLURM infrastructure is composed of login nodes (where you connect and prepare your work), compute nodes (where the jobs actually run) and a shared storage accessible from every node.
The typical sequence: you connect to a login node, you submit your job with srun or sbatch, and SLURM dispatches it to one of the compute nodes.
The resources you need for your job can be set using options:
- --cpus-per-task: the number of CPUs per task; not mandatory, implicitly --cpus-per-task 1 by default
- --mem: the amount of memory; not mandatory, implicitly --mem 2GB by default
- --partition: the partition to submit to; not mandatory, implicitly --partition fast by default
There are 2 main partitions:
- fast: for jobs that can run within 24 hours
- long: for jobs that can run within 30 days
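For example, here is how these options can be combined (a hedged sketch: the values are arbitrary and my_command is a placeholder for your own program):
$ # Ask for 4 CPUs and 8GB of RAM on the long partition
$ srun --cpus-per-task 4 --mem 8GB --partition long my_command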
srun vs sbatch
- srun: the job runs interactively, attached to your terminal. ⚠️ The job is killed if the terminal is closed or the network is cut off.
- sbatch: the job runs in the background from a script. Better for reproducibility because it's self-documented.
srun suits short jobs because, indeed, the job is killed if the terminal is closed or the network is cut off. Classic examples are file [de]compression (ex: tar, gzip...), file parsing (ex: sort, grep, awk, sed...), etc.
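For instance, here are a couple of such small jobs run through srun on the FASTQ file downloaded earlier (a hedged sketch; -k, which keeps the original file, requires a recent gzip):
$ cd /shared/projects/your_project/tuto_slurm/01_fastq
$ # Count the lines of the FASTQ file on a compute node
$ srun wc -l GSM461178_untreat_paired_subset_1.fastq
$ # Compress a copy of the file (-k keeps the original)
$ srun gzip -k GSM461178_untreat_paired_subset_1.fastq
Now let's run our real example, FastQC, through srun: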
$ cd /shared/projects/your_project/tuto_slurm/02_qualitycontrol/
$ # Creation of a dedicated folder for srun
$ mkdir srun
$ cd srun
$ # Load the module for fastqc if it wasn't done yet
$ module load fastqc/0.11.9
$ srun fastqc /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq -o .
Started analysis of GSM461178_untreat_paired_subset_1.fastq
Approx 5% complete for GSM461178_untreat_paired_subset_1.fastq
Approx 10% complete for GSM461178_untreat_paired_subset_1.fastq
[...]
Approx 95% complete for GSM461178_untreat_paired_subset_1.fastq
Approx 100% complete for GSM461178_untreat_paired_subset_1.fastq
Analysis complete for GSM461178_untreat_paired_subset_1.fastq
$ # We can check the files produced
$ ls
GSM461178_untreat_paired_subset_1_fastqc.html GSM461178_untreat_paired_subset_1_fastqc.zip
⚠️ Note that if you omit the srun
command, the job will run on the login node. It's bad!
Let's break down this command:
srun fastqc /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq -o .
- srun: we ask SLURM to distribute our job on one of the compute nodes
- fastqc: the software we want to use
- /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq: our input file
- -o: the FastQC option to indicate the output directory (otherwise, FastQC will create the output files within the fastq directory)
- .: a relative path in Linux that designates the current directory (where you are when you launch the job)
- -o .: we ask FastQC to create its output files in the current directory

With an absolute path, the command would be written as follows:
srun fastqc /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq -o /shared/projects/your_project/tuto_slurm/02_qualitycontrol/srun/
Note that implicitly, 2GB of RAM and 1 CPU are reserved. You can set these parameters explicitly and, if needed, request more:
srun --cpus-per-task 1 --mem 2GB fastqc /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq -o .
sbatch will launch the jobs in the background. In addition to your results, SLURM will create 2 files containing the standard output and the standard error streams. The advantage of using sbatch is that you can close your terminal during the job execution.
The counterpart is that you need to write a script file that will contain your command lines and the sbatch parameters.
There are different ways to create a script file on a remote server:
- vi, nano, emacs: those tools are fully integrated within the terminal, non-graphical and therefore not easy to handle for beginners
- gedit: a graphical text editor (requires ssh -Y). Can be rather slow depending on the network connection.

We will use the gedit solution for this part of the tutorial. But the usage of an SFTP client is explained in the part "Transfer: get back your results on your personal computer".
1. Open another terminal, because gedit displays a lot of annoying warnings. Don't forget the -Y option for graphical forwarding.
ssh -Y your_login@slurm0.sb-roscoff.fr
2. Open gedit
mkdir /shared/projects/your_project/tuto_slurm/scripts/
gedit /shared/projects/your_project/tuto_slurm/scripts/fastqc.sbatch &
gedit should open a file named fastqc.sbatch in a detached window. The & in bash will put gedit in the background and release the terminal so you can type other commands.
3. Write your script within gedit
#!/bin/bash
module load fastqc/0.11.9
srun fastqc /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq -o .
Note that implicitly, 2GB of RAM and 1 CPU are reserved: you can modify these parameters and request additional memory
#!/bin/bash
#SBATCH --cpus-per-task 1
#SBATCH --mem 4GB
module load fastqc/0.11.9
srun fastqc /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq -o .
- #!/bin/bash is the shebang. It indicates to SLURM the language used in the script, in our case the bash language.
- #SBATCH lines allow you to give parameters to sbatch, such as: CPUs, memory, email, ...

Save your fastqc.sbatch file by clicking the SAVE button in gedit.
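For information, #SBATCH accepts many more directives. Here is a sketch of a slightly richer header (the job name, output file names and email address are examples to adapt; %j is replaced by the job ID):
#!/bin/bash
#SBATCH --job-name fastqc_tuto
#SBATCH --cpus-per-task 1
#SBATCH --mem 4GB
#SBATCH --partition fast
#SBATCH --output fastqc_%j.out
#SBATCH --error fastqc_%j.err
#SBATCH --mail-type END,FAIL
#SBATCH --mail-user your.email@example.org

module load fastqc/0.11.9
srun fastqc /shared/projects/your_project/tuto_slurm/01_fastq/GSM461178_untreat_paired_subset_1.fastq -o .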
Now, back in our first terminal, we will launch the job using sbatch:
$ cd /shared/projects/your_project/tuto_slurm/02_qualitycontrol/
$ mkdir sbatch
$ cd sbatch
$ sbatch /shared/projects/your_project/tuto_slurm/scripts/fastqc.sbatch
Submitted batch job 203739
PD (pending): maybe your job will have to wait for available resources.
$ squeue -u your_login
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
203739 fast fastqc.+ your_login PD 0:00 1 (Resources)
R (running): at some point, the job will run on one of the compute nodes.
$ squeue -u your_login
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
203739 fast fastqc.+ your_login R 5:00 1 cpu-node-23
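Once the job has completed, the standard output and error files mentioned above appear in the directory from which you launched sbatch. Their exact names depend on the cluster configuration; a common default is slurm-<jobID>.out, hence this sketch:
$ ls
$ # Inspect the SLURM log file (adapt the name to what ls actually shows)
$ cat slurm-203739.out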
Possibly, some jobs will end in the FAILED state; one of the reasons is that the job consumed more memory than reserved.
It can be checked by comparing the memory requested (ReqMem) and the memory used (MaxVMSize).
sacct --format=JobID,JobName,User,Submit,ReqCPUS,ReqMem,Start,NodeList,State,CPUTime,MaxVMSize%15 -j 203739
JobID JobName User Submit ReqCPUS ReqMem Start NodeList State CPUTime MaxVMSize
------------ ---------- ----------- ------------------- ------- ------ ------------------- -------- ------ ----------- -----------
203739 fastqc.sb+ your_login 2020-09-02T22:06:31 1 2Gn 2020-11-03T23:32:38 n97 FAILED 26-12:25:00
203739.batch batch 2020-09-03T23:32:38 2 2Gn 2020-11-03T23:32:38 n97 FAILED 26-12:25:00 2279915K
In this case, the job consumed at some point 2.2GB (MaxVMSize=2279915K). You should increase the reservation with --mem 4GB!
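As a hedged variant, sacct can also report MaxRSS (the peak resident memory actually used) and the elapsed time, which are often easier to interpret:
$ sacct --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,State -j 203739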
To cancel a job, simply use the scancel command with the jobID(s):
scancel 218672
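scancel also accepts a few handy options, for example (a sketch):
$ # Cancel several jobs at once
$ scancel 218672 218673
$ # Cancel all YOUR jobs (be careful!)
$ scancel -u your_login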
To transfer files back and forth between a remote server and your local personal computer, we need an FTP/SFTP client.
For this tutorial, we will use FileZilla because it's the only cross-platform one. It's also the most complex, so the other ones will be easy to handle.
The connection parameters are:
- Host: core.cluster.france-bioinformatique.fr or slurm0.sb-roscoff.fr
- Login: your_login and your password
- Port: 22, to use the SFTP protocol

The interface is rather busy, with logs, a lot of panels... But don't be afraid:
To reach your results on the remote side, you can either:
- paste the path given by pwd in the input "Site distant / Remote site" (ex: /shared/projects/your_project/tuto_slurm/02_qualitycontrol/sbatch)
- or navigate in the remote tree: /shared -> projects -> your_project -> [...] -> sbatch
You just need to drag and drop the file between the "Local panel" and the "Remote panel".
It's the same mechanism to get and to push data, depending on whether you drag a file from or to the "Remote panel".
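If you prefer the command line, the same transfer can be done with scp from your LOCAL computer (a hedged sketch; adapt the host and paths to your case, my_local_file.txt is a made-up example):
$ # Get the FastQC report back on your local computer
$ scp your_login@core.cluster.france-bioinformatique.fr:/shared/projects/your_project/tuto_slurm/02_qualitycontrol/sbatch/GSM461178_untreat_paired_subset_1_fastqc.html .
$ # Or push a local file to the cluster
$ scp my_local_file.txt your_login@core.cluster.france-bioinformatique.fr:/shared/projects/your_project/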
Congrats, you have launched your first job on a HPC cluster and got the results back on your own computer!