Hi, while quantizing the model is simple enough, I'm wondering whether MOSS's fork of SGLang supports quantization formats such as NVFP4 and FP8 for efficient deployment of the model?
Thanks in advance for the info.