About training setting of Llama3-8B #33

@HXuan-Wang

Description

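Thank you very much for open-sourcing such interesting and useful work. While reproducing the quantization results for the Llama models, I found that the Llama2 results can be reproduced, but the Llama3 results cannot. I tried training Llama3-8B with the Llama2 training settings and got the following results:
(main_block_ap.py 39): INFO wikitext2 perplexity: 17.49
(main_block_ap.py 39): INFO c4 perplexity: 20.56
(main_block_ap.py 58): INFO Average Acc: 52.33%
These results are far from those reported in the paper, and no training configuration for Llama3 has been provided. Could you share it?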
