AgentDish directory
ai-evaluation
Accepted listings with this tag.
| Listing | Category | Score | Trend | Checked | |
|---|---|---|---|---|---|
| #400 LLM INQUISITOR: A GitHub repository that proposes a practical methodology for evaluating how AI systems behave during real work, with quick-start, practitioner, and methodology guides included. | Developer Tools / AI Evaluation | 78 | ↑ +6 | 13 days ago | Details |
| #438 Agent Eval: A GitHub repo for evaluating agentic AI pipeline systems, with guidance for defining metrics, building eval cases, running repeatable tests, and tracking regressions. | Developer Tools / Copywriting | 77 | ↑ +31 | 27 days ago | Details |