AgentDish directory

vLLM

Accepted listings with this tag.

Listing Category Score Trend Checked
#41 ↓ -17
AutoRound

AutoRound is an open-source quantization toolkit for LLMs and VLMs, focused on high-accuracy low-bit inference across CPU, XPU, CUDA, and multiple deployment backends.

Developer Tools / AI Infrastructure 89 ↓ -17 28 days ago
#45 ↓ -3
tiny-vllm

Open-source C++ and CUDA LLM inference engine inspired by vLLM, with a teaching-focused course that walks through model serving, batching, KV cache, and attention kernels.

Developer Tools / AI Inference / LLM Serving 88 ↓ -3 3 days ago

Google Developers Blog post about integrating DFlash, a diffusion-style speculative decoding framework, into the vLLM TPU ecosystem to improve LLM serving speed on TPU v5p.

Developer Tools / Code Assistant 78 ↓ -86 27 days ago
#483 ↓ -13
vLLM-Compile

A public slide deck about vLLM-compile, a project focused on bringing compiler optimizations to LLM inference and speeding up torch.compile for vLLM workflows.

Developer Tools / Code Assistant 72 ↓ -13 27 days ago