Pondera is a lightweight, YAML-first framework to evaluate AI models and agents with pluggable runners and an LLM-as-a-judge.
Updated Oct 23, 2025 - Python
AI agents that classify and rank Chrome bookmarks with Gemini and pydantic-ai.
Analyze Claude Code session logs and generate efficiency reports, cost diagnostics, and actionable recommendations. This project reads local JSONL session logs, computes deterministic efficiency signals, and can optionally add local LLM recommendations using Ollama.