@@ -31,9 +31,11 @@ similar to many other distributed computing systems:

 ### Preparing the packages

-The easiest way to install the packages is using a single-machine interactive job. On the access node of your HPC, run:
+The easiest way to install the packages is using a single-machine interactive
+job. On the access node of your HPC, run this command to get a 60-minute
+interactive session:
 ```sh
-srun --pty -n1 -c1 -t60 --mem 1G /bin/bash
+srun --pty -t60 /bin/bash -
 ```

 When the shell opens (the prompt should change), you can load the Julia module,
@@ -43,7 +45,7 @@ usually with a command such as this:
 module load lang/Julia/1.3.0
 ```

-(You may consult `module avail` for other possible Julia versions.)
+(You may consult `module spider julia` for other possible Julia versions.)

 After that, start `julia` and press `]` to open the packaging prompt (you
 should see `(v1.3) pkg>` instead of `julia>`). There you can download and
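For reference, the same packages can also be installed from the plain `julia>` prompt with the Pkg API; this is a minimal sketch equivalent to typing `add ClusterManagers DistributedData` at the `pkg>` prompt (the package list is assumed from the `run-analysis.jl` script shown further below):

```julia
# Pkg is Julia's built-in package manager
using Pkg

# Distributed ships with Julia itself, so only the other two packages
# need to be installed from the registry
Pkg.add(["ClusterManagers", "DistributedData"])
```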
@@ -65,14 +67,16 @@ Julia and the interactive Slurm job shell.

 ### Slurm batch script

-An example Slurm batch script is listed below -- save it as
-`run-analysis.batch` to your Slurm access node, in a directory that is shared
-with the workers (usually a "scratch" directory; try `cd $SCRATCH`).
+An example Slurm batch script
+([download](https://github.com/LCSB-BioCore/DistributedData.jl/blob/master/docs/slurm-example/run-analysis.batch))
+is listed below -- save it as `run-analysis.batch` to your Slurm access node,
+in a directory that is shared with the workers (usually a "scratch" directory;
+try `cd $SCRATCH`).
 ```sh
 #!/bin/bash -l
 #SBATCH -n 128
 #SBATCH -c 1
-#SBATCH -t 60
+#SBATCH -t 10
 #SBATCH --mem-per-cpu 4G
 #SBATCH -J MyDistributedJob

@@ -85,7 +89,7 @@ The parameters in the script have this meaning, in order:
 - the batch spawns 128 "tasks" (i.e., 128 separate processes)
 - each task uses 1 CPU (you may want more CPUs if you work with actual
   threads and shared memory)
-- the whole batch takes maximum 60 minutes
+- the whole batch takes at most 10 minutes
 - each CPU (in our case, each process) will be allocated 4 gigabytes of RAM
 - the job will be visible in the queue as `MyDistributedJob`
 - it will load the Julia 1.3.0 module on the workers, so that the `julia` executable is
@@ -95,12 +99,13 @@ The parameters in the script have this meaning, in order:

 ### Julia script

-The `run-analysis.jl` may look as follows:
+The `run-analysis.jl` script
+([download](https://github.com/LCSB-BioCore/DistributedData.jl/blob/master/docs/slurm-example/run-analysis.jl))
+may look as follows:
 ```julia
 using Distributed, ClusterManagers, DistributedData

 # read the number of available workers from the environment and start the worker processes
-
 n_workers = parse(Int, ENV["SLURM_NTASKS"])
 addprocs_slurm(n_workers, topology = :master_worker)
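 # (:master_worker topology makes each worker connect only to the master
 # process instead of opening all-to-all connections among the workers)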
 
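The diff skips over the body of the script at this point; that is where the actual DistributedData workload goes. A purely hypothetical sketch of such a middle section follows (it continues the snippet above, so the `using` line is already in scope; the `save_at`/`dmapreduce` calls and the file name are illustrative assumptions, with `totalResult` and `f` matching the visible tail below):

```julia
# build a random dataset on each worker, stored remotely under the name :x
# (a quoted expression is evaluated on the worker itself, so the data is
# created there instead of being serialized from the master)
foreach(fetch, [save_at(w, :x, :(rand(1_000_000))) for w in workers()])

# map each worker's dataset to a partial sum, then fold the partials with +
totalResult = dmapreduce(:x, sum, +)

# open the output file for the result (the filename is illustrative)
f = open("result.txt", "w")
```

With this layout, the datasets themselves stay on the workers and only the small per-worker partial results travel back over the network.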
@@ -119,7 +124,8 @@ println(f, totalResult)
 close(f)
 ```

-Finally, you can execute the whole thing with `sbatch`:
+Finally, you can start the whole thing with the `sbatch` command executed on
+the access node:
 ```sh
 sbatch run-analysis.batch
 ```
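After submission, you can watch the job in the queue with `squeue`; by default, Slurm collects everything the batch script prints (including the output of the Julia workers) into a file named `slurm-<jobID>.out` in the directory you submitted from.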