dotnet · JanKrivanek · Mar 6, 2026
diff --git a/.github/agents/learn-from-pr.md b/.github/agents/learn-from-pr.md
@@ -0,0 +1,31 @@
+---
+name: learn-from-pr
+description: "Analyzes completed PRs for lessons learned from agent behavior. Use after any PR with agent involvement to identify what worked, what failed, and what to improve in instruction files, skills, or documentation."
+---
+
+# Learn From PR Agent
+
+Analyzes a completed PR, extracts lessons, and **applies improvements** to the repo's AI infrastructure.
+
+## Workflow
+
+1. **Invoke learn-from-pr skill** to analyze the PR and get recommendations
+2. **Present recommendations** to the user for approval
+3. **Apply approved changes** to instruction files, skills, or documentation
+4. **Commit** with descriptive message
+
+## Where to Apply Changes
+
+| Recommendation Category | Target File |
+|------------------------|-------------|
+| instruction-file | `.github/instructions/*.instructions.md` |
+| copilot-instructions | `.github/copilot-instructions.md` |
+| skill | `.github/skills/*/SKILL.md` |
+| agent | `.github/agents/*.md` |
+| code-comment | Source files |
+
+## Rules
+
+- Always get user approval before applying changes
+- Make minimal, surgical edits
+- Don't remove existing valid instructions — add alongside
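Review note on `learn-from-pr.md`: the "Where to Apply Changes" table is effectively a lookup from recommendation category to path pattern. A minimal sketch of that lookup follows, assuming a hypothetical `target_for_category` shell function that is not part of this change. The patterns are copied from the table and deliberately left unexpanded.

```shell
# Hypothetical helper (not part of this change): map a recommendation
# category from learn-from-pr.md to the path pattern it should edit.
target_for_category() {
  case "$1" in
    instruction-file)     echo '.github/instructions/*.instructions.md' ;;
    copilot-instructions) echo '.github/copilot-instructions.md' ;;
    skill)                echo '.github/skills/*/SKILL.md' ;;
    agent)                echo '.github/agents/*.md' ;;
    code-comment)         echo 'source files' ;;
    *)                    echo "unknown category: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown categories with a nonzero status keeps a mistyped category from silently mapping to nothing.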
diff --git a/.github/agents/pr.md b/.github/agents/pr.md
@@ -0,0 +1,126 @@
+---
+name: pr
+description: "Sequential 4-phase PR workflow: Pre-Flight, Gate, Fix (multi-model), Report. Phases MUST complete in order."
+---
+
+# PR Agent
+
+End-to-end agent that takes a GitHub issue from investigation through to a completed PR.
+
+## Workflow Overview
+
+This file covers **Phases 1-2** (Pre-Flight → Gate).
+
+After Gate passes, read `.github/agents/pr/post-gate.md` for **Phases 3-4** (multi-model Fix → Report).
+
+```
+┌──────────────────────────────┐     ┌────────────────────────────────────────┐
+│  THIS FILE: pr.md            │     │  pr/post-gate.md                       │
+│                              │     │                                        │
+│  1. Pre-Flight → 2. Gate     │ ──► │  3. Fix (multi-model) → 4. Report      │
+│                    ⛔         │     │                                        │
+│               MUST PASS      │     │  (Only read after Gate ✅ PASSED)      │
+└──────────────────────────────┘     └────────────────────────────────────────┘
+```
+
+**Read `.github/agents/pr/SHARED-RULES.md` for rules that apply across all phases**, including multi-model configuration.
+
+---
+
+## Critical Rules
+
+- ❌ Never commit directly to `main`. Always create a feature branch.
+- ❌ Never stop and ask the user during autonomous execution — use best judgment to continue.
+- ❌ Never mark a phase ✅ with pending fields remaining.
+- Phase 3 uses a multi-model exploration workflow. See `post-gate.md` after Gate passes.
+
+---
+
+## PRE-FLIGHT: Context Gathering (Phase 1)
+
+> **SCOPE**: Document only. No code analysis. No fix opinions. No running tests.
+
+### What TO Do
+
+- Read issue description and comments
+- Note platforms/areas affected
+- Identify files changed (if PR exists)
+- Document disagreements and edge cases from comments
+
+### What NOT To Do
+
+| ❌ Do NOT | What it is | When to do it |
+|-----------|-----|---------------|
+| Research git history | Root cause analysis | Phase 3: Fix |
+| Look at implementation code | Understanding the bug | Phase 3: Fix |
+| Design or implement fixes | Solution design | Phase 3: Fix |
+| Run tests | Verification | Phase 2: Gate |
+
+### Steps
+
+**If starting from a PR:**
+```bash
+gh pr view XXXXX --json title,body,url,author,labels,files
+gh pr diff XXXXX
+gh issue view ISSUE_NUMBER --json title,body,comments
+```
+
+**If starting from an Issue:**
+```bash
+gh issue view XXXXX --json title,body,comments,labels
+```
+
+---
+
+## GATE: Verify Tests Catch the Issue (Phase 2)
+
+> **SCOPE**: Verify that tests exist and that they fail without the fix and pass with it (for PRs), or reproduce the bug (for issues).
+
+**⛔ This phase MUST pass before continuing.**
+
+### Step 1: Check if Tests Exist
+
+```bash
+# For PRs — check changed files for test files
+gh pr view XXXXX --json files --jq '.files[].path' | grep -iE "test"
+
+# For issues — search for tests
+find . -name "*Tests.cs" -o -name "*Test.cs" | head -10
+```
+
+**If NO tests exist** → Let the user know. They can use `write-tests-agent` to create them.
+
+### Step 2: Run Verification
+
+```bash
+./build.sh
+dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj
+```
+
+For PRs with a fix, ideally verify both directions (invoke `verify-tests-fail` skill):
+1. Tests FAIL without fix ← proves tests catch the bug
+2. Tests PASS with fix ← proves fix works
+
+### Complete Gate
+
+- ✅ **PASSED**: Tests fail without fix, pass with fix → Read `pr/post-gate.md` for Phases 3-4
+- ❌ **FAILED**: Tests don't catch the bug → Request changes from PR author
+
+---
+
+## ⛔ STOP HERE
+
+**If Gate `✅ PASSED`** → Read `.github/agents/pr/post-gate.md` to continue with Phases 3-4.
+
+**If Gate `❌ FAILED`** → Stop. Request changes from the PR author to fix the tests.
+
+---
+
+## Commands
+
+| Action | Command |
+|--------|---------|
+| Build | `./build.sh` |
+| Test | `dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj` |
+| Format | `dotnet format Microsoft.ML.sln --no-restore` |
+| CI Status | Invoke `pr-build-status` skill |
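Review note on `pr.md`: the Gate outcome depends on two test runs, one without the fix and one with it. A minimal sketch of that decision follows, assuming a hypothetical `gate_verdict` helper (not part of this change) that receives the two `dotnet test` exit codes. It also assumes a nonzero exit code means failing tests; a build error would break that assumption, so the verdict is a hint rather than proof.

```shell
# Hypothetical helper (not part of this change): turn two test-run exit
# codes into a Gate verdict.
#   $1 = exit code of the test run WITHOUT the fix applied
#   $2 = exit code of the test run WITH the fix applied
gate_verdict() {
  without_fix=$1
  with_fix=$2
  if [ "$without_fix" -ne 0 ] && [ "$with_fix" -eq 0 ]; then
    echo "PASSED"   # tests catch the bug and the fix resolves it
  elif [ "$without_fix" -eq 0 ]; then
    echo "FAILED: tests pass even without the fix, so they do not catch the bug"
  else
    echo "FAILED: tests still fail with the fix applied"
  fi
}
```

For example, `gate_verdict 1 0` prints `PASSED`, while `gate_verdict 0 0` reports that the tests do not catch the bug.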
diff --git a/.github/agents/pr/SHARED-RULES.md b/.github/agents/pr/SHARED-RULES.md
@@ -0,0 +1,72 @@
+# PR Agent: Shared Rules
+
+Rules that apply across all PR agent phases. Referenced by `pr.md` and `post-gate.md`.
+
+---
+
+## Multi-Model Configuration
+
+Phase 3 uses these AI models for try-fix exploration (run **SEQUENTIALLY**):
+
+| Order | Model |
+|-------|-------|
+| 1 | `claude-sonnet-4-5` |
+| 2 | `gpt-4.1` |
+| 3 | `gemini-2.5-pro` |
+
+**Note:** The `model` parameter is passed to the `task` tool's agent invocation. Each model runs try-fix independently.
+
+**⚠️ SEQUENTIAL ONLY**: try-fix runs modify the same files and use the same build/test environment. Never run in parallel.
+
+### Recommended Default Models
+
+If no specific models are configured, use a diverse set across providers:
+
+| Order | Model | Why |
+|-------|-------|-----|
+| 1 | `claude-sonnet-4-5` | Strong code reasoning |
+| 2 | `gpt-4.1` | Fast, different perspective |
+| 3 | `gemini-2.5-pro` | Different training data |
+
+Adjust based on available models and budget. Adding models from different providers increases fix diversity and the chance of finding a better solution, at the cost of one extra sequential try-fix run per model.
+
+---
+
+## Phase Completion Protocol
+
+**Before changing ANY phase status to ✅ COMPLETE:**
+
+1. Review the phase checklist
+2. Verify all required items are addressed
+3. Then mark the phase as ✅ COMPLETE
+
+**Rule:** Status ✅ means "work complete and verified", not "I finished thinking about it."
+
+---
+
+## Handling Environment Blockers
+
+If you encounter a blocker that prevents completing a phase:
+
+1. **Try ONE retry** (install missing tool, rebuild, etc.)
+2. **If still blocked after one retry**, skip the blocked phase and continue
+3. **Document what was skipped and why** in the Report phase
+4. **Always prefer continuing with partial results** over stopping completely
+
+| Blocker Type | Max Retries | Then Do |
+|--------------|-------------|---------|
+| Missing tool/dependency | 1 install attempt | Skip phase, continue |
+| Server errors (500, timeout) | 1 retry | Skip phase, continue |
+| Build failures in try-fix | 2 attempts | Skip remaining models, proceed to Report |
+| Configuration issues | 1 fix attempt | Skip phase, continue |
+
+---
+
+## No Direct Git State Changes
+
+The agent should not run git commands that change branch state during PR review. Use read-only commands:
+
+- ✅ `gh pr diff`, `gh pr view`, `gh issue view`
+- ❌ `git checkout`, `git switch`, `git stash`, `git reset`
+
+Exception: `git checkout HEAD -- .` and `git clean -fd` are allowed for cleanup between try-fix attempts.
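Review note on `SHARED-RULES.md`: the blocker policy (one retry, then skip and keep going) can be expressed as a small wrapper. A minimal sketch follows, assuming a hypothetical `run_with_one_retry` function that is not part of this change. The status words it prints are illustrative, not an existing convention in this repo.

```shell
# Hypothetical wrapper (not part of this change): run a command, retry it
# once on failure, and report "skipped" instead of aborting if the retry
# also fails. The caller records skipped phases for the Report phase.
run_with_one_retry() {
  if "$@"; then
    echo "ok"
  elif "$@"; then
    echo "ok-after-retry"
  else
    echo "skipped"   # document what was skipped and why; do not stop
  fi
}
```

The wrapped command's own output shares stdout with the status word, so in practice its output would need to be redirected to a log file before the status is captured.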
diff --git a/.github/agents/pr/post-gate.md b/.github/agents/pr/post-gate.md
@@ -0,0 +1,157 @@
+# PR Agent: Post-Gate Phases (3-4)
+
+**⚠️ PREREQUISITE: Only read this file after 🚦 Gate shows `✅ PASSED`.**
+
+If Gate is not passed, go back to `.github/agents/pr.md` and complete Phases 1-2 first.
+
+---
+
+## Workflow Overview
+
+| Phase | Name | What Happens |
+|-------|------|--------------|
+| 3 | **Fix** | Invoke `try-fix` skill with multiple models to explore independent fix alternatives, then compare with PR's fix |
+| 4 | **Report** | Deliver result (approve PR, request changes, or create new PR) |
+
+**All rules from `.github/agents/pr/SHARED-RULES.md` apply**, including multi-model configuration.
+
+---
+
+## 🔧 FIX: Multi-Model Exploration (Phase 3)
+
+> **SCOPE**: Explore independent fix alternatives using `try-fix` skill across multiple AI models, compare with PR's fix, select the best approach.
+
+### Why Multi-Model?
+
+Each AI model has different strengths: one may spot a root cause another misses, or propose a simpler fix. Running try-fix with 3 models sequentially broadens the set of candidate fixes and increases the chance of finding one that is better than the PR's.
+
+### 🚨 CRITICAL: try-fix is Independent of PR's Fix
+
+**The PR's fix has already been validated by Gate.** Phase 3 is NOT re-testing the PR's fix — it's exploring whether a better alternative exists.
+
+**Do NOT let the PR's fix influence your thinking.** Generate ideas as if you hadn't seen the PR.
+
+### Step 1: Run try-fix with Each Model (Round 1)
+
+Run the `try-fix` skill **3 times sequentially**, once with each model (see `SHARED-RULES.md` for model list).
+
+**⚠️ SEQUENTIAL ONLY**: try-fix runs modify the same files and use the same build/test environment. Never run in parallel.
+
+**For each model**, invoke as a task agent with the specified model:
+
+```
+Invoke the try-fix skill for PR #XXXXX:
+- problem: [Description of the bug — what's broken and expected behavior]
+- test_command: dotnet test test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj
+- target_files: [files likely affected]
+
+Generate ONE independent fix idea without consulting the PR's fix. Afterwards, compare against the PR's fix and confirm your approach is DIFFERENT.
+```
+
+**Wait for each to complete before starting the next.**
+
+**🧹 MANDATORY: Clean up between attempts.** After each try-fix completes (pass or fail):
+
+```bash
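+# Restore all tracked files to HEAD (run from the repository root; `.` only covers the current directory)
+git checkout HEAD -- .
+
+# Remove untracked files and directories added by the previous attempt (ignored files are kept)
+git clean -fd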
+```
+
+### Step 2: Cross-Pollination (Round 2+)
+
+After Round 1, share each model's results with the others and ask for new ideas.
+
+**For each model**, invoke again with this context:
+
+```
+Here are the fix attempts from Round 1:
+[List each model's approach and result]
+
+Given what worked and what didn't, propose a NEW fix idea that:
+- Is DIFFERENT from all attempts above
+- Learns from the failures (avoid the same mistakes)
+- Combines insights from passing fixes if applicable
+
+If you genuinely have no new idea, respond "NO NEW IDEAS" — don't force a bad attempt.
+```
+
+**Exhaustion criteria**: Cross-pollination is exhausted when ALL models respond "NO NEW IDEAS" via actual invocation (not assumed).
+
+### Step 3: Select Best Fix
+
+Build a comparison table of all candidates:
+
+```markdown
+### Fix Candidates
+| # | Model | Approach | Result | Files Changed | Notes |
+|---|-------|----------|--------|---------------|-------|
+| 1 | claude-sonnet-4-5 | [approach] | ✅/❌ | `file.cs` | [why] |
+| 2 | gpt-4.1 | [approach] | ✅/❌ | `file.cs` | [why] |
+| PR | PR author | [approach] | ✅ (Gate) | `file.cs` | Original |
+```
+
+**Selection criteria** (in order):
+1. Tests pass
+2. Minimal changes (fewer files, fewer lines)
+3. Root cause fix (not symptom suppression)
+4. Code quality and maintainability
+
+---
+
+## 📋 REPORT: Deliver Result (Phase 4)
+
+### If Starting from PR — Write Review
+
+| Scenario | Recommendation |
+|----------|---------------|
+| PR's fix was selected | ✅ **APPROVE** — PR's approach is correct/optimal |
+| Alternative fix was better | ⚠️ **REQUEST CHANGES** — suggest the better approach |
+| PR's fix failed tests | ⚠️ **REQUEST CHANGES** — fix doesn't work |
+
+Run `pr-finalize` skill to verify PR title/description match implementation.
+
+### If Starting from Issue — Create PR
+
+Present the selected fix to the user:
+
+```markdown
+I've implemented the fix for issue #XXXXX:
+- **Selected fix**: Candidate #N — [approach]
+- **Files changed**: [list]
+- **Other candidates considered**: [brief summary]
+
+Please review the changes and create a PR when ready.
+```
+
+### Report Format
+
+```markdown
+## Final Recommendation: APPROVE / REQUEST CHANGES
+
+### Summary
+[Brief summary of the review]
+
+### Fix Exploration
+[How many models tried, how many passed, which was selected and why]
+
+### Root Cause
+[Root cause analysis]
+
+### Fix Quality
+[Assessment of the selected fix]
+```
+
+---
+
+## Common Mistakes
+
+- ❌ **Looking at PR's fix before generating ideas** — Generate independently first
+- ❌ **Re-testing the PR's fix in try-fix** — Gate already validated it
+- ❌ **Skipping models in Round 1** — All models must run before cross-pollination
+- ❌ **Running try-fix in parallel** — SEQUENTIAL ONLY
+- ❌ **Declaring exhaustion prematurely** — All models must confirm "NO NEW IDEAS"
+- ❌ **Not cleaning up between attempts** — Always restore working directory
+- ❌ **Selecting a failing fix** — Only select from passing candidates
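Review note on `post-gate.md`: the first two selection criteria (tests pass, minimal changes) are mechanical and can be sketched as a filter plus a sort. The example below assumes a hypothetical `select_candidate` function and a hypothetical pipe-separated input format (`id|result|files_changed|lines_changed`); neither is part of this change. Criteria 3 and 4 (root cause, code quality) still need judgment and are not modeled.

```shell
# Hypothetical helper (not part of this change): pick a candidate using the
# two mechanical criteria only: tests pass, then fewest files changed, then
# fewest lines changed.
# Input on stdin, one candidate per line: id|result|files_changed|lines_changed
select_candidate() {
  awk -F'|' '$2 == "pass"' |        # keep passing candidates only
    sort -t'|' -k3,3n -k4,4n |      # fewest files first, then fewest lines
    head -n 1 |
    cut -d'|' -f1                   # print the winning candidate id
}
```

A usage example with made-up numbers: `printf '%s\n' '1|pass|3|40' '2|fail|1|5' 'PR|pass|1|12' | select_candidate` prints `PR`, because candidate 2 fails its tests and the PR touches fewer files than candidate 1. If no candidate passes, the function prints nothing.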