Hi, does this series of models have a quantized version like NVFP4 or FP8? #23

@bi1101

Hi, while quantizing the model is simple enough, I'm wondering whether MOSS's fork of SGLang supports quantization formats like NVFP4 and FP8 for efficient deployment of the model?

Thanks in advance for the info.
