AgentDish directory

llm-serving

Accepted listings with this tag.

Listing	Category	Score	Trend	Checked
#290 LLM inference at scale: An open-source handbook for production LLM serving and inference at scale, covering GPU fundamentals, KV cache, batching, quantization, speculative decoding, and engines like vLLM, SGLang, and TensorRT-LLM.	Developer Tools / AI Infrastructure	84	↓ -6	14 days ago