- Pretraining a GPT-2 model on the FineWebEDU dataset.
- Trained for 1 epoch using distributed data parallel (DDP) in a Kaggle notebook.
- Training started with the dataset files in a random order.
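The random file order noted above can be made reproducible with a seeded shuffle of the shard list. The sketch below is a minimal illustration, not the released training code; `shuffled_shards` and the shard filenames are hypothetical.

```python
import random

def shuffled_shards(shard_paths, seed=42):
    # Hypothetical helper: return the dataset shard files in a
    # deterministic random order, leaving the input list untouched.
    order = list(shard_paths)
    random.Random(seed).shuffle(order)
    return order

# Example: shuffle 8 hypothetical FineWebEDU shard files.
shards = [f"shard_{i:02d}.bin" for i in range(8)]
epoch_order = shuffled_shards(shards, seed=0)
```

Seeding the shuffle keeps the order identical across DDP workers, so every process walks the shards in the same sequence.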
Model repository: sathishkumar67/GPT2-Pretraining_Finewebedu10B