llm-d
Here are 12 public repositories matching this topic...
Kubernetes controllers for fast model actuation using vLLM sleep/wake and launcher-based model swapping
Updated Apr 4, 2026 - Go
Accelerate reproducible inference experiments for large language models with LLM-D! This lab automates the setup of a complete evaluation environment on OpenShift/OKD: GPU worker pools, core operators, observability, traffic control, and ready-to-run example workloads.
Updated Apr 2, 2026 - Python
Open-source software for PCIe card-based hardware AI accelerators, supporting both inference and training use cases
Updated Feb 21, 2026 - Python
RecursiveCharacterTextSplitter and context cache with llm-d
Updated Jan 29, 2026 - Python
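The repo above pairs a RecursiveCharacterTextSplitter with llm-d. As context, here is a minimal self-contained sketch of the recursive splitting idea that class is named after: try coarse separators (paragraphs, then lines, then words) first, and only recurse to finer ones for pieces that still exceed the chunk size. The function name and defaults here are illustrative, not the repo's actual code; the real `RecursiveCharacterTextSplitter` ships with LangChain.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters,
    preferring the coarsest separator that keeps chunks small enough."""
    if len(text) <= chunk_size:
        return [text]
    sep = separators[0]
    rest = separators[1:] if len(separators) > 1 else separators
    if sep == "":
        # No separators left: fall back to hard character cuts.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = current + sep + part if current else part
        if len(candidate) <= chunk_size:
            # Greedily pack parts into the current chunk.
            current = candidate
        else:
            if current:
                chunks.append(current)
            if len(part) > chunk_size:
                # This part alone is too big: recurse with finer separators.
                chunks.extend(recursive_split(part, chunk_size, rest))
                current = ""
            else:
                current = part
    if current:
        chunks.append(current)
    return chunks
```

For example, `recursive_split("aaa bbb ccc", 7)` splits on spaces and packs greedily, yielding `["aaa bbb", "ccc"]`. The chunks produced this way tend to respect natural text boundaries, which is why this splitter is a common default for preparing LLM context windows.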