Commit ed096ee

minor fixes, make the sample scripts downloadable
1 parent bd0d185 commit ed096ee

3 files changed: 46 additions & 11 deletions

docs/slurm-example/run-analysis.batch

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+#!/bin/bash -l
+#SBATCH -n 128
+#SBATCH -c 1
+#SBATCH -t 60
+#SBATCH --mem-per-cpu 4G
+#SBATCH -J MyDistributedJob
+
+module load lang/Julia/1.3.0
+
+julia run-analysis.jl

docs/slurm-example/run-analysis.jl

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+using Distributed, ClusterManagers, DistributedData
+
+# read the number of available workers from environment and start the worker processes
+n_workers = parse(Int, ENV["SLURM_NTASKS"])
+addprocs_slurm(n_workers, topology = :master_worker)
+
+# load the required packages on all workers
+@everywhere using DistributedData
+
+# generate a random dataset on all workers
+dataset = dtransform((), _ -> randn(10000, 10000), workers(), :myData)
+
+# for demonstration, sum the whole dataset
+totalResult = dmapreduce(dataset, sum, +)
+
+# do not forget to save the results!
+f = open("result.txt", "w")
+println(f, totalResult)
+close(f)
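
(The script above is Slurm-specific only in the `addprocs_slurm` call; as a minimal sketch, assuming you just want to smoke-test the same pipeline on a single machine, plain `addprocs` can be swapped in:)

```julia
using Distributed, DistributedData

# spawn 4 local worker processes instead of Slurm-allocated ones
addprocs(4)
@everywhere using DistributedData

# same pattern as the committed script, with a smaller matrix for a fast run
dataset = dtransform((), _ -> randn(1000, 1000), workers(), :myData)
println(dmapreduce(dataset, sum, +))

# release the workers when done
rmprocs(workers())
```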

docs/src/slurm.md

Lines changed: 17 additions & 11 deletions
@@ -31,9 +31,11 @@ similar as with many other distributed computing systems:
 
 ### Preparing the packages
 
-The easiest way to install the packages is using a single-machine interactive job. On the access node of your HPC, run:
+The easiest way to install the packages is using a single-machine interactive
+job. On the access node of your HPC, run this command to give you a 60-minute
+interactive session:
 ```sh
-srun --pty -n1 -c1 -t60 --mem 1G /bin/bash
+srun --pty -t60 /bin/bash -
 ```
 
 When the shell opens (the prompt should change), you can load the Julia module,
@@ -43,7 +45,7 @@ usually with a command such as this:
 module load lang/Julia/1.3.0
 ```
 
-(You may consult `module avail` for other possible Julia versions.)
+(You may consult `module spider julia` for other possible Julia versions.)
 
 After that, start `julia` and press `]` to open the packaging prompt (you
 should see `(v1.3) pkg>` instead of `julia>`). There you can download and
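
(A hedged sketch of that installation step: instead of the `pkg>` prompt, the same packages can be added non-interactively with the standard Pkg API; `Distributed` itself ships with Julia and needs no installation.)

```julia
using Pkg

# the two registered packages the examples rely on
Pkg.add(["ClusterManagers", "DistributedData"])
```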
@@ -65,14 +67,16 @@ Julia and the interactive Slurm job shell.
 
 ### Slurm batch script
 
-An example Slurm batch script is listed below -- save it as
-`run-analysis.batch` to your Slurm access node, in a directory that is shared
-with the workers (usually a "scratch" directory; try `cd $SCRATCH`).
+An example Slurm batch script
+([download](https://github.com/LCSB-BioCore/DistributedData.jl/blob/master/docs/slurm-example/run-analysis.batch))
+is listed below -- save it as `run-analysis.batch` to your Slurm access node,
+in a directory that is shared with the workers (usually a "scratch" directory;
+try `cd $SCRATCH`).
 ```sh
 #!/bin/bash -l
 #SBATCH -n 128
 #SBATCH -c 1
-#SBATCH -t 60
+#SBATCH -t 10
 #SBATCH --mem-per-cpu 4G
 #SBATCH -J MyDistributedJob
 
@@ -85,7 +89,7 @@ The parameters in the script have this meaning, in order:
 - the batch spawns 128 "tasks" (ie. spawning 128 separate processes)
 - each task uses 1 CPU (you may want more CPUs if you work with actual
   threads and shared memory)
-- the whole batch takes maximum 60 minutes
+- the whole batch takes maximum 10 minutes
 - each CPU (in our case each process) will be allocated 4 gigabytes of RAM
 - the job will be visible in the queue as `MyDistributedJob`
 - it will load Julia 1.3.0 module on the workers, so that `julia` executable is
@@ -95,12 +99,13 @@ The parameters in the script have this meaning, in order:
 
 ### Julia script
 
-The `run-analysis.jl` may look as follows:
+The `run-analysis.jl`
+([download](https://github.com/LCSB-BioCore/DistributedData.jl/blob/master/docs/slurm-example/run-analysis.jl))
+may look as follows:
 ```julia
 using Distributed, ClusterManagers, DistributedData
 
 # read the number of available workers from environment and start the worker processes
-
 n_workers = parse(Int, ENV["SLURM_NTASKS"])
 addprocs_slurm(n_workers, topology = :master_worker)
 
@@ -119,7 +124,8 @@ println(f, totalResult)
 close(f)
 ```
 
-Finally, you can execute the whole thing with `sbatch`:
+Finally, you can start the whole thing with the `sbatch` command executed on
+the access node:
 ```sh
 sbatch run-analysis.batch
 ```
