[megatron] support megatron fp4 #9330
Code Review
This pull request introduces FP4 precision parameters (fp4_format, fp4_recipe, and fp4_param_gather) to the MegatronArguments class and updates the documentation. Reviewers identified typos in the documentation where 'FP8' was used instead of 'FP4' and suggested formatting improvements in the English version to ensure consistency with the FP8 section.
/gemini review
Code Review
This pull request introduces support for FP4 precision by adding new command-line parameters (fp4_format, fp4_recipe, fp4_param_gather) and updating the argument parsing and utility functions. Reviewers suggested increasing the padding alignment for FP4 to 32 elements to ensure compatibility with Blackwell+ architecture requirements and recommended explicitly stating default values in the documentation for both Chinese and English versions.
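The padding suggestion amounts to rounding a dimension up to the next multiple of the alignment. A minimal sketch of that arithmetic follows. The helper name `pad_to_multiple` is hypothetical and does not come from this PR; only the alignment value of 32 is taken from the review summary above.

```python
def pad_to_multiple(size: int, multiple: int = 32) -> int:
    """Round `size` up to the nearest multiple of `multiple`.

    Hypothetical helper for illustration only; the default of 32 mirrors
    the FP4 alignment suggested in the review, not code from the PR.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Ceiling division using integer arithmetic, then scale back up.
    return -(-size // multiple) * multiple


if __name__ == "__main__":
    for n in (0, 1, 64, 100):
        print(n, "->", pad_to_multiple(n))
```

Under this sketch a length of 100 would be padded to 128 with a 32-element alignment, versus 112 with a 16-element alignment, which is the kind of difference the reviewers' suggestion would introduce.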
No description provided.