AgentDish directory

Research

Accepted listings with this tag.

Listing Category Score Trend Checked

An interactive dashboard that analyzes New York Times coverage since 2000 using the NYT Archive API, with views for reporters, beats, sections, subjects, geography, obituaries, and corrections.

Research / Data Visualization 89 ↑ +111 27 days ago
#56 ↓ -3
CAD-Bench

An open benchmark and leaderboard for AI CAD agents, with 308 prompts across 20 categories and layered scoring for geometry, engineering, manufacturability, and cognition.

Research / Knowledge Work 88 ↓ -3 24 days ago

A research article from Applied Compute on how agentic, tool-using workloads differ from traditional LLM benchmarks, with production observations, workload profiles, and an open-source harness for replaying traces.

Research / Knowledge Work 87 ↓ -34 27 days ago

arXiv paper describing QUEST, an open family of deep research agents from 2B to 35B parameters, plus a synthetic-task training recipe and released models, data, and scripts.

Research / AI Agents 83 ↓ -3 7 days ago
#258 ↓ -3
wwwatch

A daily AI intelligence journal for builders, covering notable model, tooling, and release updates in a short sourced digest.

Research / Knowledge Work 83 ↓ -3 11 days ago
#262 ↓ -3
Physics AI

Physics AI is a physics homework and study tool that solves problems from photos or typed prompts, with step-by-step explanations, tutor mode, and visual breakdowns for diagrams and vectors.

Research / Knowledge Work 83 ↓ -3 12 days ago

A research report on the current MCP ecosystem, with live crawl numbers, verification rates, category breakdowns, and examples of both strong and weak MCP-positive sites.

Research / AI research 83 ↑ +40 27 days ago
#318 ↑ +53
ShadowBrokers

AI-powered trade signal product for retail traders that turns financial news into ranked trade plans with entries, stops, targets, and tracked accuracy.

Research / Knowledge Work 82 ↑ +53 27 days ago

Agora-1 is a multi-agent world model from Odyssey that simulates shared real-time environments for up to four participants, human or AI, with a focus on gaming, robotics, reinforcement learning, and foundation model research.

AI Research / World Models 78 ↑ +6 14 days ago

PaperProfit explains an AI-assisted stock evaluation approach that combines fundamentals, technical signals, and qualitative analysis from transcripts and SEC filings into a weighted score.

Research / Knowledge Work 77 → 0 15 hours ago

A research write-up on detecting AI agents through process differences in CAPTCHA and related cognitive tasks. It outlines the CogCAPTCHA30 approach, reports human-vs-model differences, and connects the findings to Roundtable’s Proof of Human product.

Research / Knowledge Work 77 → 0 3 days ago

arXiv paper on a self-speculative decoding framework for speeding up reasoning LLM inference on edge hardware, with hardware co-design and reported speedups.

Research / AI/ML Paper 77 → 0 4 days ago

A GitHub research project documenting a long-form, multi-model analysis of LLM behavior across Claude, Gemini, ChatGPT, and Grok. The repo includes an executive summary, screenplay, technical white paper, and archive of logs and chat records.

AI Research / LLM Evaluation & Analysis 75 → 0 7 days ago

A GitHub research project that measures how gpt-4.1 responds when asked to pick a random number between 1 and 100, using 10,000 API calls and comparing the results to a uniform baseline.

AI Research / Model Behavior Analysis 74 ↓ -1 8 days ago

An open-source experiment that adds a small zero-initialized overlay layer to a frozen GPT-2 so its behavior can be adjusted at inference time without retraining the base model.

AI Developer Tool / Model Adaptation / Adapters 74 ↓ -1 27 days ago

A research page and preprint about using code as the runtime layer for agent systems, with a taxonomy of harness interfaces, harness mechanisms, and scaling patterns for multi-agent workflows.

Developer Tools / Code Assistant 71 → 0 12 days ago