GPUhack example run
Peter Willendrup edited this page Jan 13, 2020 · 3 revisions
On an interactive node (e.g. one obtained via the voltash command), the binary executes normally:
hpclogin2(username) $ voltash
... wait a bit for interactive node
nodename(username) $ ./BNL_H8.out -n1e8 lambda=2.36 -dmy_dir
Instrument: BNL_H8 on n-62-20-1.
Monochromator : DM = 3.3539
A1 = 20.60, A2 = 41.20 (deg)
Ki = 2.662 Angs-1 Energy = 14.69 meV
Velocity = 1676 m/s, lambda = 2.36 Angs
[BNL_H8] Initialize
Save [BNL_H8]
Detector: D0_Source_I=0.00316469 D0_Source_ERR=1.39388e-06 D0_Source_N=5.15476e+06 "D0_Source.psd"
Detector: D1_SC1_Out_I=0.0204115 D1_SC1_Out_ERR=3.53868e-06 D1_SC1_Out_N=3.3459e+07 "D1_SC1_Out.psd"
Detector: D2_A4_I=0.0137503 D2_A4_ERR=2.90472e-06 D2_A4_N=2.2573e+07 "D2_A4.psd"
Detector: D4_SC2_In_I=0.00154987 D4_SC2_In_ERR=9.75286e-07 D4_SC2_In_N=2.56212e+06 "D4_SC2_In.psd"
Detector: D5_SC2_Out_I=0.00121189 D5_SC2_Out_ERR=8.62348e-07 D5_SC2_Out_N=2.01959e+06 "D5_SC2_Out.psd"
Detector: D7_SC3_In_I=1.24521e-07 D7_SC3_In_ERR=2.24855e-10 D7_SC3_In_N=336749 "D7_SC3_In.psd"
Detector: D8_SC3_Out_I=2.38721e-08 D8_SC3_Out_ERR=9.90139e-11 D8_SC3_Out_N=63914 "D8_SC3_Out.psd"
Detector: D10_SC4_In_I=1.29521e-09 D10_SC4_In_ERR=2.31474e-11 D10_SC4_In_N=3443 "D10_SC4_In.psd"
Detector: He3H_I=1.0102e-09 He3H_ERR=2.04559e-11 He3H_N=2679 "He3.psd"
Finally [BNL_H8: my_dir]. Time: 431400 [h]
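Each Detector: line in the output above follows the same pattern: intensity (_I), error (_ERR), ray count (_N) and the output file name in quotes. As a minimal sketch, assuming only that pattern, a small shell function can turn those lines into a compact table. The function name parse_detectors is made up for this illustration and is not part of McStas.

```shell
# parse_detectors (hypothetical helper): read run output on stdin and print
# "file intensity error counts" for every line starting with "Detector:".
# All other lines (Instrument:, Save, Finally, ...) are silently dropped.
parse_detectors() {
    sed -n 's/^Detector: [^ ]*_I=\([^ ]*\) [^ ]*_ERR=\([^ ]*\) [^ ]*_N=\([^ ]*\) "\(.*\)"$/\4 \1 \2 \3/p'
}

# Demo on two lines copied from the run above.
# On a node one would instead pipe the binary's output into the function:
#   ./BNL_H8.out -n1e8 lambda=2.36 -dmy_dir | parse_detectors
printf '%s\n' \
    'Save [BNL_H8]' \
    'Detector: D0_Source_I=0.00316469 D0_Source_ERR=1.39388e-06 D0_Source_N=5.15476e+06 "D0_Source.psd"' \
    | parse_detectors
# prints: D0_Source.psd 0.00316469 1.39388e-06 5.15476e+06
```

Having the values in columns makes it easier to compare the monitors of two runs side by side, for example with diff or paste.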
Measure the wall clock time by prefixing the command with time:
nodename(username) $ time ./BNL_H8.out -n1e8 lambda=2.36 -dmy_dir2
Instrument: BNL_H8 on n-62-20-1.
Monochromator : DM = 3.3539
A1 = 20.60, A2 = 41.20 (deg)
Ki = 2.662 Angs-1 Energy = 14.69 meV
Velocity = 1676 m/s, lambda = 2.36 Angs
[BNL_H8] Initialize
Save [BNL_H8]
Detector: D0_Source_I=0.00316628 D0_Source_ERR=1.39424e-06 D0_Source_N=5.15736e+06 "D0_Source.psd"
Detector: D1_SC1_Out_I=0.0204134 D1_SC1_Out_ERR=3.53884e-06 D1_SC1_Out_N=3.34622e+07 "D1_SC1_Out.psd"
Detector: D2_A4_I=0.0137538 D2_A4_ERR=2.90509e-06 D2_A4_N=2.2579e+07 "D2_A4.psd"
Detector: D4_SC2_In_I=0.00155039 D4_SC2_In_ERR=9.75448e-07 D4_SC2_In_N=2.56339e+06 "D4_SC2_In.psd"
Detector: D5_SC2_Out_I=0.00121155 D5_SC2_Out_ERR=8.62226e-07 D5_SC2_Out_N=2.0193e+06 "D5_SC2_Out.psd"
Detector: D7_SC3_In_I=1.24184e-07 D7_SC3_In_ERR=2.2455e-10 D7_SC3_In_N=335859 "D7_SC3_In.psd"
Detector: D8_SC3_Out_I=2.39021e-08 D8_SC3_Out_ERR=9.91055e-11 D8_SC3_Out_N=63976 "D8_SC3_Out.psd"
Detector: D10_SC4_In_I=1.26222e-09 D10_SC4_In_ERR=2.28589e-11 D10_SC4_In_N=3354 "D10_SC4_In.psd"
Detector: He3H_I=9.70538e-10 He3H_ERR=2.00144e-11 He3H_N=2591 "He3.psd"
Finally [BNL_H8: my_dir2]. Time: 431400 [h]
real 0m1.589s
user 0m0.552s
sys 0m0.587s
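The "real" figure can be turned into a rough throughput number by dividing the requested ray count (-n1e8) by the elapsed seconds. The sketch below does that arithmetic with awk. The helper name rays_per_second is invented for this example, and the result is only an approximate figure, since the wall clock time also covers start-up and writing the output files.

```shell
# rays_per_second N SECONDS (hypothetical helper): print the number of
# simulated rays per second of wall clock time, to three significant digits.
rays_per_second() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.3g\n", n / t }'
}

# Values taken from the timed run above: -n1e8 and real 0m1.589s.
rays_per_second 1e8 1.589
# prints: 6.29e+07
```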
Get profiling output by running the binary under nsys profile:
hostname(username) $ nsys profile ./BNL_H8.out -n1e8 lambda=2.36 -dmy_dir3
**** collection configuration ****
force-overwrite = false
stop-on-exit = true
export_sqlite = false
stats = false
capture-range = none
stop-on-range-end = false
delay = 0 seconds
duration = 0 seconds
kill = signal number 15
inherit-environment = true
show-output = true
wait = all
application command = ./BNL_H8.out
application arguments = -n1e8 lambda=2.36 -dmy_dir3
application working directory = /zhome/68/5/1000184454/test
NVTX profiler range trigger =
NVTX profiler domain trigger =
environment variables:
application configuration
sample_cpu = true
backtrace_method = lbr
trace_cublas = false
trace_cuda = true
trace_cudnn = false
trace_nvtx = true
trace_mpi = false
trace_openacc = false
trace_vulkan = false
trace_opengl = true
trace_osrt = true
osrt-threshold = 0 nanoseconds
cudabacktrace = false
cudabacktrace-threshold = 0 nanoseconds
trace-fork-before-exec = false
profile_processes = tree
system configuration
Beta: ftrace events:
ftrace-keep-user-config = false
trace-GPU-context-switch = false
Collecting data...
Instrument: BNL_H8 on n-62-20-1.
Monochromator : DM = 3.3539
A1 = 20.60, A2 = 41.20 (deg)
Ki = 2.662 Angs-1 Energy = 14.69 meV
Velocity = 1676 m/s, lambda = 2.36 Angs
[BNL_H8] Initialize
Save [BNL_H8]
Detector: D0_Source_I=0.00316671 D0_Source_ERR=1.39433e-06 D0_Source_N=5.15805e+06 "D0_Source.psd"
Detector: D1_SC1_Out_I=0.0204119 D1_SC1_Out_ERR=3.53871e-06 D1_SC1_Out_N=3.34595e+07 "D1_SC1_Out.psd"
Detector: D2_A4_I=0.0137486 D2_A4_ERR=2.90454e-06 D2_A4_N=2.25704e+07 "D2_A4.psd"
Detector: D4_SC2_In_I=0.00154838 D4_SC2_In_ERR=9.74817e-07 D4_SC2_In_N=2.5598e+06 "D4_SC2_In.psd"
Detector: D5_SC2_Out_I=0.0012103 D5_SC2_Out_ERR=8.6178e-07 D5_SC2_Out_N=2.01721e+06 "D5_SC2_Out.psd"
Detector: D7_SC3_In_I=1.24163e-07 D7_SC3_In_ERR=2.24566e-10 D7_SC3_In_N=335377 "D7_SC3_In.psd"
Detector: D8_SC3_Out_I=2.37247e-08 D8_SC3_Out_ERR=9.86484e-11 D8_SC3_Out_N=63495 "D8_SC3_Out.psd"
Detector: D10_SC4_In_I=1.21353e-09 D10_SC4_In_ERR=2.24508e-11 D10_SC4_In_N=3198 "D10_SC4_In.psd"
Detector: He3H_I=9.35548e-10 He3H_ERR=1.97262e-11 He3H_N=2464 "He3.psd"
Finally [BNL_H8: my_dir3]. Time: 431400 [h]
Generating the /zhome/68/5/1000184454/test/report1.qdstrm file.
Saving diagnostics...
Saving qdstrm file to disk...
Finished saving file.
Importing [===============================================================100%]
Saved report file to "/zhome/68/5/1000184454/test/report1.qdrep"
The report data is easiest to inspect by installing NVIDIA's Nsight Systems on your own machine and copying the report file back with scp. (Peter has downloaded the current version for the different OSes.)
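As a sketch of the copy step, the helper below only prints the scp command that would be run from your own machine, so it can be checked before use. The function name fetch_report_cmd, the user name and the host name login.example.org are placeholders for this illustration. Only the report path is taken from the nsys output above.

```shell
# fetch_report_cmd USER HOST REMOTE_PATH (hypothetical helper): print the scp
# command that copies the report into the current directory. Nothing is
# executed, so the command can be inspected first.
fetch_report_cmd() {
    printf 'scp %s@%s:%s .\n' "$1" "$2" "$3"
}

# "username" and "login.example.org" are placeholders, not real account details.
fetch_report_cmd username login.example.org /zhome/68/5/1000184454/test/report1.qdrep
# prints: scp username@login.example.org:/zhome/68/5/1000184454/test/report1.qdrep .
```

Once the printed command looks right, it can be pasted into a terminal on the local machine and the resulting report1.qdrep opened in the Nsight Systems GUI.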