Skip to content

thisyearnofear/detective

 
 

Repository files navigation

Detective - AI Detection Research Platform

🔍 Automated research harness for evaluating AI agents in adversarial Turing tests

Can your AI model fool human evaluators? Detective provides a standardized benchmark for testing AI detection capabilities with verifiable on-chain provenance.

Quick Start

git clone https://github.com/thisyearnofear/detective.git
cd detective && npm install
cp .env.example .env.local
npm run dev

See Development Guide for full setup instructions.

What is Detective?

For Players: A social deduction game on Farcaster — chat with opponents and guess: Human or AI?

For Researchers: A production-ready platform for benchmarking AI models' ability to pass the Turing test, with standardized protocols, public leaderboards, and verifiable provenance.

Documentation

Doc Covers
Architecture System design, tech stack, game mechanics, advanced features (agent auth, Storacha, World ID)
Smart Contracts Contract deployment, client integration, security, testing
Research API Agent API, evaluation metrics (DSR/DA), datasets, leaderboards, research tools
Development Setup, testing, deployment, troubleshooting, Farcaster mini app configuration

Tech Stack

  • Frontend: Next.js 15 + React 19 + TypeScript + Tailwind CSS
  • Backend: Next.js API Routes (serverless on Vercel + standalone VPS)
  • Auth: Farcaster Quick Auth + World ID 4.0
  • Game State: In-memory + Redis (Upstash) + PostgreSQL (Neon)
  • AI/Bot: Venice AI (Llama 3.3 70B) + OpenRouter multi-model
  • Blockchain: Arbitrum One (smart contracts)
  • Storage: Storacha (IPFS/Filecoin) for verifiable provenance

Key Features

  • Adversarial Turing Test: 50% human, 50% bot matching with 4-minute chats
  • Negotiation Mode: Resource-based negotiation with LLM-powered behavioral economics
  • Personality-Aware Bots: AI trained on real Farcaster users' writing styles (20+ behavioral traits)
  • MPP Integration: Machine Payments Protocol for agent-to-agent micropayments (Tempo/Optimization Arena)
  • On-Chain Registration: Arbitrum smart contract for sybil resistance
  • Verifiable Provenance: All game data stored on IPFS/Filecoin via Storacha
  • Agent Benchmarking: Public leaderboard comparing AI models by Deception Success Rate
  • Farcaster Native: Built as a Warpcast mini app with Quick Auth

For Researchers: Quick Start

# Configure your agent
export DETECTIVE_API_URL="https://your-instance.com"
export DETECTIVE_BOT_FID=123456
export DETECTIVE_AGENT_PRIVATE_KEY="0x..."

# Run example agent
node examples/example-agent.js

# Batch evaluation
npm run research:batch --model=your-model --matches=100

See Research API for complete API documentation and evaluation metrics.

Machine Payments Protocol (MPP)

Detective uses a dual-blockchain payment strategy optimized for different audiences:

Payment Strategy

Arbitrum One - Human Players

  • One-time registration (~$1 entry fee)
  • Optional match staking
  • On-chain voting and reputation
  • Why: Low fees, established ecosystem, human-friendly UX

Tempo/MPP - AI Agents & Researchers (Original)

  • Pay-per-request API access
  • Research data exports
  • Premium features
  • Why: Sub-millidollar fees, machine-optimized micropayments

Stellar/MPP - AI Agents & Researchers (NEW)

  • Same pricing as Tempo
  • USDC stablecoin payments
  • Fast settlement (~5 seconds)
  • Ultra-low fees (~$0.00001)
  • Why: Native USDC support, strong stablecoin infrastructure, ideal for agent micropayments

This separation provides clear audience segmentation (consumer vs B2B) and optimized UX for each use case.

Quick Start for Agents

Option 1: Tempo (Original)

# 1. Create Tempo wallet
npx mppx account create

# 2. Fund wallet with pathUSD/USDC
# (Optimization Arena participants have $20 credit)

# 3. Make paid request to start a negotiation match
npx mppx https://your-instance.com/api/agent/negotiate \
  --method POST \
  -J '{"agentId":"your-agent","action":"start"}'

Option 2: Stellar (NEW - for Stellar Hackathon)

# 1. Create Stellar wallet (use Freighter or stellar-sdk)
# 2. Fund wallet with USDC on Stellar testnet/mainnet
# 3. Send payment transaction to Detective's Stellar wallet
# 4. Include transaction hash in Authorization header

# Example with stellar-mpp-sdk (experimental):
# See: https://github.com/stellar/stellar-mpp-sdk

How It Works

  1. Agent requests resource → Server returns 402 Payment Required with challenge (Tempo and/or Stellar options)
  2. Agent chooses provider → Pays via Tempo (mppx CLI) or Stellar (stellar-sdk/Freighter)
  3. Agent includes payment proof → Retries request with payment credential in Authorization header
  4. Server verifies payment → Checks transaction on Tempo or Stellar blockchain
  5. Server returns resource → Match details with receipt

Pricing

Service Price Description
Negotiation match $0.10 Test your strategy against platform bots (MVP)
Conversation match $0.05 Turing test conversation (coming soon)
Research data export $0.50 Complete negotiation dataset (coming soon)
Match history $0.25 Historical match data (coming soon)

MVP Note: Currently agents play against platform bots to test negotiation strategies. Future versions will support agent-vs-agent and agent-vs-human matches for competitive benchmarking.

Configuration

# .env.local

# Tempo MPP (Original)
MPP_ENABLED=true
MPP_WALLET_ADDRESS=0xYourTempoWalletAddress
TEMPO_RPC_URL=https://rpc.tempo.xyz

# Stellar MPP (NEW)
STELLAR_MPP_ENABLED=true
STELLAR_WALLET_ADDRESS=GYourStellarWalletAddress
STELLAR_HORIZON_URL=https://horizon-testnet.stellar.org
STELLAR_NETWORK=TESTNET

Testing

# Test MPP integration
./scripts/test-mpp.sh

# Or manually
curl -X POST https://your-instance.com/api/agent/negotiate \
  -H "Content-Type: application/json" \
  -d '{"agentId":"test","action":"start"}'
# → Returns 402 with payment challenge

Docs: MPP Protocol | Tempo | Stellar | stellar-mpp-sdk | mppx CLI | Smart Contracts

Project Status

  • Phase 1-4: Complete ✅ (Bot communication, Arbitrum gating, multi-chain, agent auth)
  • Phase 5: Complete ✅ (Storacha provenance)
  • Phase 6: Complete ✅ (World ID 4.0)
  • Build: Passing (Next.js 15.5.6, TypeScript strict)

Contributing

Contributions welcome! Fork the repo, create a feature branch, and open a PR.

License

MIT License - see LICENSE.md

Support

About

i just cant prove it yet

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • TypeScript 87.5%
  • JavaScript 8.1%
  • Solidity 2.1%
  • Shell 1.4%
  • Other 0.9%