🦀 PinchBench

Real-world benchmarks for AI coding agents

PinchBench measures how well LLMs perform as the brain of an OpenClaw agent. Instead of synthetic tests, we throw real tasks at agents: scheduling meetings, writing code, triaging email, researching topics, and managing files.


Repositories

Repo | Description
skill | Benchmark runner and task definitions (run it yourself)
leaderboard | The pinchbench.com leaderboard frontend

Run the Benchmark

git clone https://github.com/pinchbench/skill.git
cd skill
./scripts/run.sh --model anthropic/claude-sonnet-4

Results are uploaded to the public leaderboard. Get started →
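
To compare several models in one go, the single-model command above can be wrapped in a small driver. The following is a minimal sketch, not part of PinchBench itself: it assumes only the `./scripts/run.sh --model <id>` invocation shown above, and the helper names (`build_command`, `run_models`) and the second model ID are hypothetical placeholders. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# Script path and --model flag are taken from the commands above;
# everything else in this sketch is an illustrative assumption.
RUN_SCRIPT = "./scripts/run.sh"


def build_command(model: str) -> list[str]:
    """Return the argv for one benchmark run against the given model ID."""
    if not model.strip():
        raise ValueError("model ID must be non-empty")
    return [RUN_SCRIPT, "--model", model]


def run_models(models, repo_dir=".", dry_run=True):
    """Run (or, in dry-run mode, only print) one invocation per model.

    Returns a list of (model, return_code) pairs; return_code is None
    when dry_run is True because nothing was executed.
    """
    results = []
    for model in models:
        cmd = build_command(model)
        if dry_run:
            print(shlex.join(cmd))
            results.append((model, None))
            continue
        # check=False so one failing model does not abort the whole batch.
        completed = subprocess.run(cmd, cwd=repo_dir, check=False)
        results.append((model, completed.returncode))
    return results


if __name__ == "__main__":
    # "example/another-model" is a hypothetical placeholder, not a real model ID.
    run_models(["anthropic/claude-sonnet-4", "example/another-model"])
```

Run it from the cloned `skill` directory, or pass `repo_dir` to point at it. Setting `dry_run=False` executes each command in turn, so results would be uploaded as described above.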


Claw-some AI agent testing. Made with 🦀 by the humans at https://kilo.ai 🦞

Popular repositories

  1. skill Public

    PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

    Python, 401 stars, 38 forks

  2. leaderboard Public

    PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

    TypeScript, 14 stars, 6 forks

  3. api Public

    TypeScript, 1 star, 3 forks

  4. .github Public

    PinchBench organization profile and community health files

  5. scripts Public

    Shell

Repositories

  • scripts Public
    Shell 0 0 0 2 Updated Mar 11, 2026
  • leaderboard Public

    PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

    TypeScript 14 6 2 0 Updated Mar 11, 2026
  • api Public
    TypeScript 1 MIT 3 0 0 Updated Mar 10, 2026
  • skill Public

    PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

    Python 401 MIT 38 6 9 Updated Mar 10, 2026
  • .github Public

    PinchBench organization profile and community health files

    0 0 0 0 Updated Mar 8, 2026
