@@ -5,7 +5,13 @@ The base component of Open XDMoD uses the job accounting logs from the HPC
5
5
resource manager as the data source. We are also going to install the optional Job Performance Module. This
6
6
allows Open XDMoD to also display performance data for HPC jobs.
7
7
8
+
The asciinema recordings are not meant to be used on their own; they are intended for use in a "live" demonstration.
9
+
Command-line demos with a light theme are meant to be watched. Demos with a dark theme are interactive.
10
+
11
+
`VIM` is used to edit files in this tutorial. If you prefer a different editor, please install it on the xdmod container.
12
+
8
13
## Submit some jobs to the cluster
14
+
9
15
Before we install and configure XDMoD we are going to submit
10
16
some HPC jobs to the cluster. These jobs will run while we go through
11
17
the install and then we will be able to view the job information
@@ -20,7 +26,7 @@ Run the provided script that submits several jobs to the cluster. These jobs
20
26
run as multiple different users with different job sizes and durations. The
21
27
purpose of this is to generate data to display in Open XDMoD. This, of course,
22
28
would not be required on a production deployment. This script should be run
23
-
as the hpcadmin user as it uses `sudo` to submit jobs as different cluster:
29
+
as the hpcadmin user as it uses `sudo` to submit jobs as different cluster
24
30
users.
25
31
```bash
26
32
submit_jobs.sh
@@ -38,13 +44,14 @@ The Open XDMoD software is installed via RPMs. The majority of the software depe
38
44
are automatically installed via RPM. However, the `phantomjs` software
39
45
that Open XDMoD uses for its image export must be installed separately.
40
46
41
-
Open XDMoD provides an interactive configuration script that performs the
42
-
database initialization and generates configuration files. This script
43
-
handles the basic setup.
44
-
45
-
The `hpc-toolset-tutorial/xdmod/install.sh` script contains the step-by-step
47
+
The [`hpc-toolset-tutorial/xdmod/install.sh`](https://github.com/ubccr/hpc-toolset-tutorial/blob/master/xdmod/install.sh) script contains the step-by-step
@@ -62,39 +69,235 @@ The following information is needed by Open XDMoD:
62
69
Optionally:
63
70
64
71
- An image file containing the HPC center logo
72
+
- The width of the HPC center logo
65
73
66
74
Also the following technical information:
67
75
68
76
- The public url of Open XDMoD
69
77
- Paths to installed dependencies (phantomjs)
70
78
- MySQL connection information
79
+
- Host
80
+
- Port
81
+
- Admin Username
82
+
- Admin Password
83
+
- DB Username
84
+
- DB Password
85
+
86
+
If you are installing the Job Performance module (as we are in this tutorial):
87
+
- mongoDB connection information
88
+
89
+
### Prerequisites used in this Tutorial
90
+
91
+
- Name of the organization: `Tutorial` abbreviation: `hpcts`
92
+
- Information for each HPC resource
93
+
- Name: `hpc`
94
+
- Number of compute nodes: `2`
95
+
- Number of cores: `2`
96
+
- Timezone: `UTC`
97
+
- Whether it runs shared jobs: `no`
98
+
- An image file containing the HPC center logo: `/srv/xdmod/small-logo.png`
99
+
- The width of the HPC center logo: `354`
100
+
- The public url of Open XDMoD: `https://localhost:4443`
101
+
- Paths to installed dependencies (phantomjs): `detected defaults`
102
+
- MySQL connection information
103
+
- Host: `mysql`
104
+
- Port: `3306`
105
+
- Admin Username: `root`
106
+
- Admin Password: (leave blank)
107
+
- DB Username: `xdmodapp`
108
+
- DB Password: `ofbatgorWep0`
109
+
- MongoDB connection information: `mongodb://xdmod:xsZ0LpZstneBpijLy7@mongodb:27017/supremm?authSource=admin`
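
To make the parts of the MongoDB connection string easier to see, here is a minimal sketch that splits a URI of this shape using plain shell parameter expansion. The function name `parse_mongo_uri` is hypothetical and not part of Open XDMoD; it assumes the URI always has the form `mongodb://user:password@host:port/database?options`.

```bash
# Split a MongoDB URI of the form
#   mongodb://user:password@host:port/database?options
# into its parts. The password is deliberately not printed.
parse_mongo_uri() {
    uri=$1
    rest=${uri#mongodb://}      # drop the scheme
    creds=${rest%%@*}           # user:password
    hostpart=${rest#*@}         # host:port/database?options
    hostport=${hostpart%%/*}    # host:port
    path=${hostpart#*/}         # database?options
    user=${creds%%:*}
    host=${hostport%%:*}
    port=${hostport##*:}
    database=${path%%\?*}
    options=${path#*\?}
    echo "user=$user"
    echo "host=$host"
    echo "port=$port"
    echo "database=$database"
    echo "options=$options"
}

# The URI used in this tutorial
parse_mongo_uri 'mongodb://xdmod:xsZ0LpZstneBpijLy7@mongodb:27017/supremm?authSource=admin'
```

For the tutorial URI this reports the `xdmod` user, the `mongodb` host on port `27017`, the `supremm` database, and the `authSource=admin` option.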
110
+
111
+
### Basic Configuration
112
+
Open XDMoD provides an interactive configuration script that performs the
113
+
database initialization and generates configuration files. This script
114
+
handles the basic setup.
115
+
116
+
The [`hpc-toolset-tutorial/xdmod/entrypoint.sh`](https://github.com/ubccr/hpc-toolset-tutorial/blob/master/xdmod/entrypoint.sh) script automates this process.
The `xdmod-setup` script is used for the basic setup of Open XDMoD. The script includes options to configure the Open XDMoD database, set up the admin user account, and configure resources.
137
+
Open XDMoD's [Configuration](https://open.xdmod.org/configuration.html#location-of-configuration-files) files can be modified directly when more advanced customization is needed.
138
+
139
+
*Have a heterogeneous cluster?* You could modify `/etc/xdmod/resource_specs.json` and set the PPN to the average number of processors per node.
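
As a rough illustration of the averaging, the sketch below computes a node-weighted average of processors per node with `awk`. The node counts are made up for the example, the function name `average_ppn` is hypothetical, and rounding to the nearest whole number is an assumption rather than an Open XDMoD requirement.

```bash
# Node-weighted average of processors per node (PPN).
# Reads lines of the form "<node_count> <cores_per_node>" on stdin.
average_ppn() {
    awk '{ nodes += $1; cores += $1 * $2 }
         END { if (nodes > 0) printf "%d\n", int(cores / nodes + 0.5) }'
}

# Hypothetical cluster: 10 nodes with 16 cores and 5 nodes with 32 cores.
# (10*16 + 5*32) / 15 = 21.33, which rounds to 21.
printf '10 16\n5 32\n' | average_ppn   # prints 21
```

The resulting number is what you would enter as the PPN for that resource in `/etc/xdmod/resource_specs.json`.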
140
+
141
+
#### Hierarchy
142
+
143
+
Open XDMoD supports a three-level hierarchy.
144
+
In this tutorial we use a hierarchy configuration that is typical of the organizational structure in a university.
[Job Performance](https://supremm.xdmod.org) data: for the open source release we aim to support [Performance Co-Pilot (PCP)](https://pcp.io).
159
+
We chose PCP because it is included by default in CentOS / Red Hat.
160
+
In XSEDE we use tacc_stats and PCP (depending on the resource provider). We have also used LDMS and Cray RUR, and are aware of groups using Ganglia too.
161
+
162
+
PCP has been [installed](https://github.com/ubccr/hpc-toolset-tutorial/blob/master/slurm/install.sh#L80-L87) and configured on the compute nodes.
163
+
This tutorial uses a cut-down list of PCP metrics from the recommended metrics for a production HPC system.
164
+
This shorter list is suitable for running inside the docker demo. On a
165
+
real HPC system the data collection should be set up following the
166
+
[PCP Data collection](https://supremm.xdmod.org/supremm-compute-pcp.html#configuration-templates) guide.
167
+
168
+
The file used in this demo can be viewed here: https://github.com/ubccr/hpc-toolset-tutorial/blob/master/slurm/pmlogger-supremm.config#L56-L59
169
+
170
+
VERY IMPORTANT - Don't start the configuration of the Job Performance module until job data has been ingested into Open XDMoD.
171
+
The Job Performance setup relies on the accounting data from the Jobs realm in Open XDMoD.
172
+
This ingestion was done during the tutorial setup and will be done again later in the tutorial.
> What actually gets data into the Open XDMoD portal. For core xdmod this is part of ingestion. Job Performance has a separate script.
195
+
196
+
This tutorial provides a script [`shred-ingest-aggregate-all.sh`](https://github.com/ubccr/hpc-toolset-tutorial/blob/master/xdmod/scripts/shred-ingest-aggregate-all.sh)
197
+
that does this. In a typical setup this would run as a cron job, scheduled at a time that best suits the HPC system.
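
The following is only a rough sketch of the shred, ingest, and aggregate sequence, not the contents of the tutorial's script. It assumes the resource is named `hpc` (as in this tutorial) and that the standard Open XDMoD and Job Performance commands are on the `PATH`. The `run` and `pipeline` helpers and the `DRY_RUN` variable are hypothetical; by default the sketch only prints the commands it would run.

```bash
# DRY_RUN=1 (the default here) prints each command instead of running it.
# Set DRY_RUN=0 on a host where the Open XDMoD commands are installed.
DRY_RUN=${DRY_RUN:-1}
RESOURCE=${RESOURCE:-hpc}   # resource name used in this tutorial

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pipeline() {
    # Core Open XDMoD: shred the Slurm accounting data, then ingest it.
    run xdmod-slurm-helper -r "$RESOURCE" &&
    run xdmod-ingestor &&
    # Job Performance: index the PCP archives, summarize jobs, aggregate.
    run indexarchives.py -a &&
    run summarize_jobs.py &&
    run aggregate_supremm.sh
}

pipeline
```

Chaining the steps with `&&` stops the sequence at the first failure, which matters because the Job Performance steps depend on the accounting data already being ingested.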
198
+
199
+
Run this now on the `xdmod` container.
200
+
201
+
Log in to the frontend via SSH as user `hpcadmin` with password `ilovelinux`:
This is going to produce A LOT of output. Each of these commands has flags that will turn this off; for the purpose of this tutorial they have not been silenced.
-`[WARNING] Autoperiod library not found, TimeseriesPatterns plugins will not do period analysis`
90
-
- The autoperiod code is used for detecting period I/O patterns in the parallel filesystem traffic. (not needed in the tutorial configuration)
91
-
TODO: User Dashboard?
92
-
93
-
## Open XDMoD Functionality
94
-
TODO: User
95
-
TODO: PI
96
-
TODO: Center
97
-
TODO: Basic Admin
224
+
- The autoperiod code is used for detecting periodic I/O patterns in the parallel filesystem traffic. (not needed in the tutorial configuration)
225
+
226
+
227
+
## User / PI Names
228
+
229
+
The resource manager logs contain the system usernames of the users that submitted jobs.
230
+
To display the full names in Open XDMoD you must provide a data file that contains the
231
+
full name of each user for each system username. This file is in a `csv` format.
232
+
233
+

234
+
235
+
This has not been automated for this tutorial. We don't want you to fall asleep!
236
+
237
+
Log in to the frontend via SSH as user `hpcadmin` with password `ilovelinux`:
238
+
```bash
239
+
ssh -p6222 hpcadmin@localhost
240
+
```
241
+
242
+
Create a file with the contents below:
243
+
The file must be readable by the `xdmod` user; for this demo it will be
244
+
created in `/var/tmp`.
245
+
246
+
```bash
247
+
vim /var/tmp/names.csv
248
+
```
249
+
250
+
The first column should include the user name or group name used by your resource manager, the second column is the user’s first name and the third column is the user’s last name.
251
+
(Feel free to change the first and last names.)
252
+
253
+
```csv
254
+
cgray,Carl,Gray
255
+
sfoster,Stephanie,Foster
256
+
csimmons,Charles,Simmons
257
+
astewart,Andrea,Stewart
258
+
hpcadmin,HPC,Administrators
259
+
```
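
Before importing, it can help to sanity-check the file. The sketch below is a hypothetical helper (`check_names_csv` is not an Open XDMoD command) that flags lines which do not have exactly three non-empty fields, or whose fields start or end with whitespace. Treating stray whitespace as an error is an assumption made here for tidiness.

```bash
# Report lines that are not "username,first name,last name" with three
# non-empty fields and no leading or trailing whitespace in any field.
# Exits non-zero if any problem is found.
check_names_csv() {
    awk -F',' '
        NF != 3 {
            printf "line %d: expected 3 fields, got %d\n", NR, NF
            bad = 1
            next
        }
        {
            for (i = 1; i <= 3; i++) {
                if ($i == "" || $i ~ /^[ \t]|[ \t]$/) {
                    printf "line %d: field %d is empty or has stray whitespace\n", NR, i
                    bad = 1
                }
            }
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example with made-up data: the second line has a space after a comma.
tmp=$(mktemp)
printf 'cgray,Carl,Gray\nsfoster,Stephanie, Foster\n' > "$tmp"
check_names_csv "$tmp" || echo "fix the file before importing"
rm -f "$tmp"
```

In this tutorial you would run `check_names_csv /var/tmp/names.csv` after saving the file.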
260
+
261
+
Now this needs to be imported into xdmod with the command [`xdmod-import-csv`](https://open.xdmod.org/commands.html#xdmod-import-csv)