
Commit f502735

more updates
1 parent c78a9e3 commit f502735

1 file changed

Lines changed: 78 additions & 64 deletions


xdmod/README.md

@@ -6,6 +6,7 @@ resource manager as the data source. We are also going to install the optional
66
allows Open XDMoD to also display performance data for HPC jobs.
77

88
The asciinema recordings are not meant to be used on their own; they are intended for use in a "live" demonstration.
9+
Command-line demos shown in a light color theme are meant to be watched; those in a dark theme are interactive.
910

1011
`VIM` is used to edit files in this tutorial. If you prefer a different editor, please install it on the xdmod container.
1112

@@ -82,6 +83,9 @@ Also the following technical information:
8283
- DB Username
8384
- DB Password
8485

86+
If you are installing the Job Performance module (as we are in this tutorial):
87+
- mongoDB connection information
88+
8589
### Prerequisites used in this Tutorial
8690

8791
- Name of the organization: `Tutorial` abbreviation: `hpcts`
@@ -102,6 +106,7 @@ Also the following technical information:
102106
- Admin Password: ` leave blank `
103107
- DB Username: `xdmodapp`
104108
- DB Password: `ofbatgorWep0`
109+
- mongoDB connection information `mongodb://xdmod:xsZ0LpZstneBpijLy7@mongodb:27017/supremm?authSource=admin`
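
A connection string of this form packs several values into one line. The sketch below uses only the Python standard library and is not part of Open XDMoD; it splits the tutorial's URI into its components so each one can be checked before it is entered. The helper name `describe_mongo_uri` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def describe_mongo_uri(uri):
    """Split a mongodb:// URI into its individual components."""
    parts = urlsplit(uri)
    options = parse_qs(parts.query)
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # The path holds the database name, prefixed with "/".
        "database": parts.path.lstrip("/"),
        # authSource names the database used for authentication, if given.
        "auth_source": options.get("authSource", [None])[0],
    }


# The connection string used in this tutorial.
info = describe_mongo_uri(
    "mongodb://xdmod:xsZ0LpZstneBpijLy7@mongodb:27017/supremm?authSource=admin"
)
for key, value in info.items():
    print(f"{key}: {value}")
```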
105110

106111
### Basic Configuration
107112
Open XDMoD provides an interactive configuration script that performs the
@@ -118,24 +123,20 @@ General Setup:
118123
[![asciicast](https://asciinema.org/a/349236.svg)](https://asciinema.org/a/349236)
119124

120125
Database Setup:
121-
[![asciicast](https://asciinema.org/a/349237.svg)](https://asciinema.org/349237)
126+
[![asciicast](https://asciinema.org/a/349237.svg)](https://asciinema.org/a/349237)
122127

123128
Organization Setup:
124-
[![asciicast](https://asciinema.org/a/349238.svg)](https://asciinema.org/349238)
129+
[![asciicast](https://asciinema.org/a/349238.svg)](https://asciinema.org/a/349238)
125130

126131
Resource Setup:
127-
[![asciicast](https://asciinema.org/a/349240.svg)](https://asciinema.org/349240)
132+
[![asciicast](https://asciinema.org/a/349240.svg)](https://asciinema.org/a/349240)
128133

129-
#### What about a Heterogeneous cluster?
134+
#### Advanced configuration
130135

131136
The `xdmod-setup` script is used for the basic setup of Open XDMoD. The script includes options to configure the Open XDMoD database, setup the admin user account and configure resources.
132-
Open XDMoD's [Configuration](https://open.xdmod.org/configuration.html#location-of-configuration-files) files.
133-
134-
135-
*Have a heterogeneous cluster?* You would need to modify `/etc/xdmod/resource_specs.json`:
136-
137-
[![asciicast](https://asciinema.org/a/349249.svg)](https://asciinema.org/349249)
137+
Open XDMoD's [Configuration](https://open.xdmod.org/configuration.html#location-of-configuration-files) files can be modified directly when more advanced customization is needed.
138138

139+
*Have a heterogeneous cluster?* You could modify `/etc/xdmod/resource_specs.json` and set the PPN to the average number of processors per node.
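
As a rough illustration of what "average number of processors per node" means here, the sketch below computes a node-weighted average from a list of node groups. The function name `average_ppn` and the node counts are hypothetical and are not taken from this tutorial's cluster; whether to round a fractional result before putting it in `resource_specs.json` is left as a judgment call.

```python
def average_ppn(node_groups):
    """Return total cores divided by total nodes.

    node_groups is a list of (node_count, cores_per_node) pairs,
    one pair per hardware type in the cluster.
    """
    total_nodes = sum(count for count, _ in node_groups)
    total_cores = sum(count * cores for count, cores in node_groups)
    if total_nodes == 0:
        raise ValueError("at least one node is required")
    return total_cores / total_nodes


# Hypothetical cluster: 10 nodes with 16 cores and 30 nodes with 32 cores.
example = average_ppn([(10, 16), (30, 32)])
print(example)  # 28.0
```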
139140

140141
#### Hierarchy
141142

@@ -146,63 +147,19 @@ Decanal Unit -> Department -> PI Group
146147

147148
Reference: [Hierarchy Guide](https://open.xdmod.org/hierarchy.html)
148149

149-
#### User / PI Names
150-
151-
The resource manager logs contain the system usernames of the users that submitted jobs.
152-
To display the full names in Open XDMoD you must provide a data file that contains the
153-
full name of each user for each system username. This file is in a `csv` format.
154-
155-
![Group By User(names not importe)](./tutorial-screenshots/usernames.png)
156-
157-
This has not been automated for this tutorial. We dont want you to fall asleep!
158-
159-
Create a file with the contents below:
160-
161-
The file needs to be able to be read by the `xdmod` user, for this demo it will be
162-
created in /var/tmp
163-
164-
```bash
165-
vim /var/tmp/names.csv
166-
```
167-
168-
The first column should include the user name or group name used by your resource manager, the second column is the user’s first name and the third column is the user’s last name.
169-
(Feel free to change the First and Last names)
170-
171-
```csv
172-
cgray,Carl,Gray
173-
sfoster,Stephanie,Foster
174-
csimmons,Charles,Simmons
175-
astewart,Andrea,Stewart
176-
hpcadmin,HPC, Administrators
177-
```
178-
179-
Now this needs to be imported into xdmod with the command [`xdmod-import-csv`](https://open.xdmod.org/commands.html#xdmod-import-csv)
180-
181-
```bash
182-
sudo su - xdmod xdmod-import-csv -t names -i /var/tmp/names.csv
183-
```
184-
185-
186-
![Group By User](./tutorial-screenshots/fullnames.png)
187-
188-
Reference: [User/PI Names Guide](https://open.xdmod.org/user-names.html)
189-
190-
xdmod-import-csv -t names:
191-
[![asciicast](https://asciinema.org/a/349325.svg)](https://asciinema.org/349325)
192-
193-
194-
195150
## Open XDMoD Job Performance
151+
196152
The Job Performance module is optional, but highly recommended.
197153

198154
![Job Performance Dataflow](./tutorial-screenshots/admin-job-performance-dataflow.png)
199155

200156
### Job Performance Configuration
201157

202-
This tutorial uses the [Job Performance](https://supremm.xdmod.org) module for Open XDMoD. The Job Performance module displays job performance metric based on data collected on the HPC compute nodes.
203-
204-
The example in this tutorial uses [Performance Co-Pilot (PCP)](https://pcp.io) as the performance data collection software. This must be [installed](https://github.com/ubccr/hpc-toolset-tutorial/blob/master/slurm/install.sh#L80-L87) and configured on the compute nodes.
158+
[Job Performance](https://supremm.xdmod.org) data: for the open source release we'll try to provide support for [Performance Co-Pilot (PCP)](https://pcp.io).
159+
We chose PCP because it is included by default in CentOS / Red Hat.
160+
In XSEDE we use tacc_stats and PCP (depending on the resource provider). We have also used LDMS and Cray RUR, and are aware of groups using Ganglia too.
205161

162+
PCP has been [installed](https://github.com/ubccr/hpc-toolset-tutorial/blob/master/slurm/install.sh#L80-L87) and configured on the compute nodes.
206163
This tutorial uses a cut-down list of PCP metrics from the recommended metrics for a production HPC system.
207164
This shorter list is suitable for running inside the docker demo. On a
208165
real HPC system the data collection should be setup following the
@@ -226,6 +183,7 @@ Job summarization (SUPReMM) configuration:
226183

227184
Shredding
228185
> Load logs from a scheduler (SLURM in this tutorial) and put them into the Open XDMoD databases.
186+
> see [Open XDMoD](https://open.xdmod.org/) for notes on SGE/Grid Engine, Univa Grid Engine, PBS/TORQUE, LSF
229187
> Reference: [Shredder Guide](https://open.xdmod.org/shredder.html)
230188
231189
Ingestion
@@ -265,15 +223,61 @@ This is going to produce A LOT of output. Each of these commands have flags tha
265223
- `[WARNING] Autoperiod library not found, TimeseriesPatterns plugins will not do period analysis`
266224
- The autoperiod code is used for detecting period I/O patterns in the parallel filesystem traffic. (not needed in the tutorial configuration)
267225

268-
## Open XDMoD Functionality
269226

270-
TODO: User Dashboard
227+
## User / PI Names
228+
229+
The resource manager logs contain the system usernames of the users that submitted jobs.
230+
To display the full names in Open XDMoD you must provide a data file that contains the
231+
full name of each user for each system username. This file is in a `csv` format.
271232

272-
TODO: User
233+
![Group By User (names not imported)](./tutorial-screenshots/usernames.png)
273234

274-
TODO: PI
235+
This has not been automated for this tutorial. We don't want you to fall asleep!
236+
237+
Log in to the frontend via SSH as user `hpcadmin` with password `ilovelinux`:
238+
```bash
239+
ssh -p6222 hpcadmin@localhost
240+
```
241+
242+
Create a file with the contents below:
243+
The file must be readable by the `xdmod` user; for this demo it will be
244+
created in /var/tmp
245+
246+
```bash
247+
vim /var/tmp/names.csv
248+
```
249+
250+
The first column should include the user name or group name used by your resource manager, the second column is the user’s first name and the third column is the user’s last name.
251+
(Feel free to change the First and Last names)
252+
253+
```csv
254+
cgray,Carl,Gray
255+
sfoster,Stephanie,Foster
256+
csimmons,Charles,Simmons
257+
astewart,Andrea,Stewart
258+
hpcadmin,HPC, Administrators
259+
```
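
Before importing, it can help to confirm that every row has the three columns described above. The sketch below is a standalone check using Python's standard `csv` module; it is not an Open XDMoD tool, and the helper name `check_names_rows` is hypothetical. It also strips the stray space after a comma, as in the `hpcadmin` row.

```python
import csv
import io


def check_names_rows(text):
    """Return (rows, problems) for CSV text of username,first,last."""
    rows, problems = [], []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for lineno, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        fields = [field.strip() for field in row]
        if len(fields) != 3 or not all(fields):
            problems.append((lineno, row))
        else:
            rows.append(tuple(fields))
    return rows, problems


sample = """cgray,Carl,Gray
sfoster,Stephanie,Foster
csimmons,Charles,Simmons
astewart,Andrea,Stewart
hpcadmin,HPC, Administrators
"""
rows, problems = check_names_rows(sample)
print(len(rows), "valid rows;", len(problems), "problem rows")
```

To check the real file, read `/var/tmp/names.csv` and pass its contents to the same function.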
260+
261+
Now this needs to be imported into xdmod with the command [`xdmod-import-csv`](https://open.xdmod.org/commands.html#xdmod-import-csv)
262+
263+
```bash
264+
sudo -u xdmod xdmod-import-csv -t names -i /var/tmp/names.csv
265+
```
266+
267+
Then we need to re-ingest and aggregate the data:
268+
269+
```bash
270+
sudo -u xdmod /srv/xdmod/scripts/shred-ingest-aggregate-all.sh
271+
```
272+
![Group By User](./tutorial-screenshots/fullnames.png)
273+
274+
Reference: [User/PI Names Guide](https://open.xdmod.org/user-names.html)
275+
276+
xdmod-import-csv -t names:
277+
[![asciicast](https://asciinema.org/a/349325.svg)](https://asciinema.org/a/349325)
278+
279+
## Open XDMoD Functionality (Interactive Demo)
275280

276-
TODO: Center
277281

278282
### Administration
279283

@@ -285,6 +289,16 @@ Admin Dashboard:
285289

286290
![Admin Dashboard](./tutorial-screenshots/admin-dashboard.png)
287291

292+
### End User
293+
294+
Let's actually use Open XDMoD now.
295+
296+
User:
297+
298+
PI:
299+
300+
Center: Staff
301+
288302
## Tutorial Navigation
289303
[Next - OnDemand](../ondemand/README.md)
290304
[Previous Step - ColdFront](../coldfront/README.md)
