
Commit eb93f32

Update README.md
1 parent 5e2925b commit eb93f32

1 file changed: README.md

Lines changed: 5 additions & 5 deletions
@@ -252,7 +252,7 @@ The preliminary support for visualization during training process are provided a
 
 ## MegaDPP
 
-## Environment Configuration
+### Environment Configuration
 
 - The following is the pod configuration.
 
@@ -295,9 +295,9 @@ cd megatron/shm_tensor_new_rdma_pre_alloc
 pip install -e .
 ```
 
-## Run
+### Run
 
-### Dataset Preparation
+#### Dataset Preparation
 
 The dataset preparation step follows largely from the Megatron framework.
 
@@ -340,7 +340,7 @@ python ../tools/preprocess_data.py \
 
 For other models, please refer to `nvidia/megatron` for the corresponding datasets.
 
-### Single Node Distributed Training
+#### Single Node Distributed Training
 To run distributed training on a single node, go to the project root directory and run
 
 ```bash
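
The hunk header above cuts off the `preprocess_data.py` invocation. For orientation, a typical Megatron-style preprocessing call is sketched below; since the diff elides the README's actual flags, the file names and tokenizer settings here are illustrative assumptions, not the repository's exact command.

```bash
# Sketch of a Megatron-style preprocessing run (assumed flags, not the
# README's elided command). Converts loose JSON into the .bin/.idx
# binary format that the training scripts consume.
python ../tools/preprocess_data.py \
    --input my-corpus.json \
    --output-prefix my-gpt2 \
    --vocab-file gpt2-vocab.json \
    --merge-file gpt2-merges.txt \
    --tokenizer-type GPT2BPETokenizer \
    --append-eod \
    --workers 8
```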
@@ -379,7 +379,7 @@ bash examples/<model>/<train_file>.sh
 ```
 or write a file similar to `run_{single,master,worker}_<model>.sh` that sets up configurations and runs the shell under `examples/`
 
-### Multinode Distributed Training
+#### Multinode Distributed Training
 To run distributed training on multiple nodes, go to the root directory. First run
 
 ```bash
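
The hunk above references per-model wrapper scripts (`run_{single,master,worker}_<model>.sh`) without showing their contents. A minimal sketch of what such a wrapper might contain, assuming the usual `torch.distributed` rendezvous variables and a hypothetical `examples/gpt/train_gpt.sh` target, is:

```bash
#!/usr/bin/env bash
# Hypothetical run_master_<model>.sh-style wrapper; the real scripts in the
# repository may export different variables.
export MASTER_ADDR=10.0.0.1   # assumed address of the master pod
export MASTER_PORT=6000       # assumed rendezvous port
export NNODES=2               # total number of nodes in the job
export NODE_RANK=0            # 0 on the master, 1..NNODES-1 on workers
export GPUS_PER_NODE=8

# Delegate to the per-model training script under examples/, as the README describes.
bash examples/gpt/train_gpt.sh
```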
