Research / AI/ML Paper

Cassandra: Enabling Reasoning LLMs at Edge via Self-Speculative Decoding

arXiv paper on a self-speculative decoding framework for speeding up reasoning LLM inference on edge hardware, with hardware co-design and reported speedups.

AI tool, AI/ML Paper, Research, edge inference, hardware acceleration, llm, reasoning models, speculative decoding

Why it was accepted

The page clearly describes an AI/ML research contribution focused on LLM inference acceleration. The abstract gives enough evidence for a useful public listing: it states the problem, the proposed method, the hardware co-design angle, and reported benchmark gains.

Weakness

The listing points only to the arXiv abstract page, so visitors cannot check implementation details, code availability, or datasets, or judge whether the approach is reproducible outside the paper.

Review status

Last evaluated 49 days ago. Current rank #1002. Holding steady in the rankings.

Related listings

#86 Below the Fold — A New York Times X-Ray Dashboard

Research / Data Visualization

An interactive dashboard that analyzes New York Times coverage since 2000 using the NYT Archive API, with views for reporters, beats, sections, subjects, geography, obituaries, and corrections.

↑ +274 73 days ago

#163 CAD-Bench

Research / Knowledge Work

An open benchmark and leaderboard for AI CAD agents, with 308 prompts across 20 categories and layered scoring for geometry, engineering, manufacturability, and cognition.

↓ -3 69 days ago

#243 Benchmarking Inference Engines on Agentic Workloads

Research / Knowledge Work

A research article from Applied Compute on how agentic, tool-using workloads differ from traditional LLM benchmarks, with production observations, workload profiles, and an open-source harness for replaying traces.

↓ -74 72 days ago

#268 UnderstandDocs

Research / Knowledge Work

UnderstandDocs is a document analysis tool that summarizes pasted text or uploaded files, flags risks, extracts important dates, and simplifies dense language. The page shows a working analysis form, supported file types, a privacy/no-storage claim, and an example output for a tenancy agreement.

↑ +2 9 days ago