AgentDish directory

ai-evaluation

Accepted listings with this tag.

#400 LLM INQUISITOR

A GitHub repository that proposes a practical methodology for evaluating how AI systems behave during real work, with quick-start, practitioner, and methodology guides included.

Category: Developer Tools / AI Evaluation
Score: 78
Trend: ↑ +6
Checked: 13 days ago

#438 Agent Eval

A GitHub repo for evaluating agentic AI pipeline systems, with guidance for defining metrics, building eval cases, running repeatable tests, and tracking regressions.

Category: Developer Tools / Copywriting
Score: 77
Trend: ↑ +31
Checked: 27 days ago