AgentDish directory
vLLM
Accepted listings with this tag.
| Rank | Listing | Description | Category | Score | Trend | Checked |
|---|---|---|---|---|---|---|
| #41 | AutoRound | AutoRound is an open-source quantization toolkit for LLMs and VLMs, focused on high-accuracy low-bit inference across CPU, XPU, CUDA, and multiple deployment backends. | Developer Tools / AI Infrastructure | 89 | ↓ -17 | 28 days ago |
| #45 | tiny-vllm | Open-source C++ and CUDA LLM inference engine inspired by vLLM, with a teaching-focused course that walks through model serving, batching, KV cache, and attention kernels. | Developer Tools / AI Inference / LLM Serving | 88 | ↓ -3 | 3 days ago |
| | | Google Developers Blog post about integrating DFlash, a diffusion-style speculative decoding framework, into the vLLM TPU ecosystem to improve LLM serving speed on TPU v5p. | Developer Tools / Code Assistant | 78 | ↓ -86 | 27 days ago |
| #483 | vLLM-Compile | A public slide deck about vLLM-compile, a project focused on bringing compiler optimizations to LLM inference and speeding up torch.compile for vLLM workflows. | Developer Tools / Code Assistant | 72 | ↓ -13 | 27 days ago |