AgentDish directory

ai-evaluation

Accepted listings with this tag.

Listing	Category	Score	Trend	Checked
#961 LLM INQUISITOR: A GitHub repository that proposes a practical methodology for evaluating how AI systems behave during real work, with quick-start, practitioner, and methodology guides included.	Developer Tools / AI Evaluation	78	↑ +6	59 days ago
#1014 Agent Eval: A GitHub repo for evaluating agentic AI pipeline systems, with guidance for defining metrics, building eval cases, running repeatable tests, and tracking regressions.	Developer Tools / Copywriting	77	↑ +75	73 days ago