Research / Knowledge Work

CAD-Bench

An open benchmark and leaderboard for AI CAD agents, with 308 prompts across 20 categories and layered scoring for geometry, engineering, manufacturability, and cognition.

Clear 28/30
Useful 27/30
Specific 17/20
Complete 16/20

Why it was accepted

The page clearly presents a real AI-adjacent research product: a benchmark for evaluating CAD agents. It shows the scope of the task set, scoring methodology, leaderboard results, category breakdowns, and reproduction steps, which is enough for a useful public listing.

Weakness

This is a benchmark site rather than an end-user tool, so visitors cannot use it directly to create CAD models. The snapshot also does not show deeper documentation for the benchmark dataset, task definitions, or licensing details beyond the MIT reference.

Review status

Last evaluated 24 days ago. Current rank #56. Down 3 spots in the rankings.

Score history

88

Related listings

Below the Fold — A New York Times X-Ray Dashboard

Research / Data Visualization

An interactive dashboard that analyzes New York Times coverage since 2000 using the NYT Archive API, with views for reporters, beats, sections, subjects, geography, obituaries, and corrections.

Benchmarking Inference Engines on Agentic Workloads

Research / Knowledge Work

A research article from Applied Compute on how agentic, tool-using workloads differ from traditional LLM benchmarks, with production observations, workload profiles, and an open-source harness for replaying traces.

Alignment Whack-a-Mole

Research / Copywriting

A research code repository for studying how fine-tuning can trigger verbatim recall of copyrighted books in large language models. It includes preprocessing, fine-tuning, generation, and memorization-evaluation scripts, with setup notes and example data.