sathishkumar67/GPT2-Pretraining_Finewebedu10B

  • Pretraining the GPT-2 model on the FineWeb-Edu dataset.
  • Trained with distributed data parallel (DDP) in a Kaggle notebook for 1 epoch.
  • Training used a random order of the dataset's shard files.
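The combination described above (DDP across ranks, random file order) can be sketched as a per-rank shard schedule: every rank shuffles the shard list with the same seed, then takes every `world_size`-th file. This is a minimal illustrative sketch, not the repository's actual code; the shard filenames and the `shard_schedule` helper are assumptions.

```python
import random

def shard_schedule(files, world_size, rank, seed=0):
    """Shuffle dataset shard files into a random order, then assign
    every world_size-th file to this rank (a common DDP sharding
    pattern). Names here are illustrative, not from the repo."""
    order = list(files)
    rng = random.Random(seed)  # identical seed on every rank -> consistent global order
    rng.shuffle(order)
    return order[rank::world_size]

# Hypothetical FineWeb-Edu shard names; the actual file layout may differ.
files = [f"finewebedu_shard_{i:03d}.bin" for i in range(8)]
print(shard_schedule(files, world_size=2, rank=0))
print(shard_schedule(files, world_size=2, rank=1))
```

Seeding every rank identically keeps the shuffled order consistent across processes, so the per-rank slices are disjoint and together cover the whole dataset exactly once per epoch.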

About

Pretraining the 124M-parameter GPT-2 model
