We are starting to look into using tensorizer with vLLM, but didn't see any specific support for LoRA adapters. We use a rank of up to 128 with adapters trained on Llama 3 70B, which results in a ~3.5 GB safetensors file per adapter. Is there support for serializing and deploying these with tensorizer?