@@ -63,11 +63,11 @@ Depending on the package, this may take a while, but should be done in under a
  minute for most existing packages. Finally, press `Ctrl+D` twice to exit both
  Julia and the interactive Slurm job shell.

- ### Batch script
+ ### Slurm batch script

- An example Slurm batch script is here, save it as `run-analysis.batch` to your
- Slurm access node, in a directory that is shared with the workers (usually
- a subdirectory of `/scratch`):
+ An example Slurm batch script is listed below -- save it as
+ `run-analysis.batch` to your Slurm access node, in a directory that is shared
+ with the workers (usually a "scratch" directory; try `cd $SCRATCH`).
  ```sh
  #!/bin/bash -l
  #SBATCH -n 128
@@ -81,9 +81,9 @@ module load lang/Julia/1.3.0
  julia run-analysis.jl
  ```

- The parameters are, in order:
- - using 128 "tasks" (ie. spawning 128 separate processes)
- - each process uses 1 CPU (you may want more CPUs if you work with actual
+ The parameters in the script have the following meaning, in order:
+ - the batch spawns 128 "tasks" (i.e. 128 separate processes)
+ - each task uses 1 CPU (you may want more CPUs if you work with actual
    threads and shared memory)
  - the whole batch takes maximum 60 minutes
  - each CPU (in our case each process) will be allocated 4 gigabytes of RAM
@@ -94,6 +94,7 @@ The parameters are, in order:
  - finally, it will run the Julia script `run-analysis.jl`

  ### Julia script
+
  The `run-analysis.jl` may look as follows:
  ```julia
  using Distributed, ClusterManagers, DistributedData
@@ -123,6 +124,8 @@ Finally, you can execute the whole thing with `sbatch`:
  sbatch run-analysis.batch
  ```

+ ### Collecting the results
+
  After your tasks get queued, executed and finished successfully, you may see
  the result in `result.txt`. In the meantime, you can entertain yourself by
  watching `squeue`, to see e.g. the expected execution time of your batch.
@@ -153,8 +156,8 @@ job0017.out job0036.out job0055.out job0074.out job0093.out job0112.out sl
  job0018.out job0037.out job0056.out job0075.out job0094.out job0113.out
  ```

- The files `jobXXXX.out` contain the information collected from individual
+ The files `job*.out` contain the information collected from individual
  workers' standard outputs, such as the output of `println` or `@info`. For
- complicated programs, this is the easiest way to get out debugging information,
- and a simple but informative way to collect benchmarking output (using e.g.
- `@time`).
+ complicated programs, this is the easiest way to extract debugging
+ information, and a simple but often sufficient way to collect benchmarking
+ output (using e.g. `@time`).
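
As a rough illustration of collecting benchmarking output from the `job*.out` files described above, the timing lines could be gathered with a small shell helper. This is only a sketch: the function name `collect_timings` is hypothetical (it is not part of this change), and it assumes the timings were printed with `@time`, whose output lines contain the word "seconds".

```sh
#!/bin/bash
# Sketch only: gather timing lines from the per-worker output files.
# Assumption: the files are named job*.out and the timings come from
# Julia's @time, which prints lines containing the word "seconds".
# The name collect_timings is hypothetical.
collect_timings() {
    local dir="${1:-.}"   # directory holding the job*.out files (default: cwd)
    # -H prefixes every match with its file name, so each timing stays
    # attributable to a worker; "|| true" keeps the function from failing
    # when nothing matches or no job*.out files exist.
    grep -H "seconds" "$dir"/job*.out 2>/dev/null || true
}
```

Calling `collect_timings .` in the directory where the batch ran would then print one `file:line` pair per recorded timing.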