AgentDish directory
llm-serving
Accepted listings with this tag.
| Listing | Category | Score | Trend | Checked | |
|---|---|---|---|---|---|
| #290 LLM inference at scale. An open-source handbook for production LLM serving and inference at scale, covering GPU fundamentals, KV cache, batching, quantization, speculative decoding, and engines like vLLM, SGLang, and TensorRT-LLM. | Developer Tools / AI Infrastructure | 84 | ↓ -6 | 14 days ago | Details |