feat(ci): add duplicate issue detection and auto-close bot #22034
Add a Python script that detects duplicate issues using title similarity (difflib.SequenceMatcher) and closes them via the gh CLI.

Two-tier system:
- 0.6 threshold: informational comment via existing wow-actions step
- 0.85 threshold: auto-close with comment, label, and not_planned reason

Includes a workflow_dispatch workflow for one-time batch scans and integrates auto-close into the existing check_duplicate_issues workflow for newly opened issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
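The two-tier idea can be illustrated with a minimal sketch. This is not the script from this PR: the normalization rules and the names `normalize_title`, `similarity`, and `classify` are assumptions made for illustration; only `difflib.SequenceMatcher` and the 0.6 / 0.85 thresholds come from the description above.

```python
import re
from difflib import SequenceMatcher

# Thresholds taken from the PR description.
INFO_THRESHOLD = 0.60   # informational comment tier
CLOSE_THRESHOLD = 0.85  # auto-close tier


def normalize_title(title: str) -> str:
    """Hypothetical normalization: lowercase, drop a leading "[Tag]:" prefix,
    and collapse punctuation and whitespace into single spaces."""
    title = title.lower()
    title = re.sub(r"^\s*\[[^\]]*\]\s*:?\s*", "", title)
    title = re.sub(r"[^a-z0-9]+", " ", title)
    return title.strip()


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized titles."""
    na, nb = normalize_title(a), normalize_title(b)
    if not na or not nb:
        # SequenceMatcher reports 1.0 for two empty strings; treat as no match.
        return 0.0
    return SequenceMatcher(None, na, nb).ratio()


def classify(score: float) -> str:
    """Map a similarity score onto the two tiers."""
    if score >= CLOSE_THRESHOLD:
        return "auto-close"
    if score >= INFO_THRESHOLD:
        return "comment"
    return "none"
```

For example, `classify(similarity("[Bug]: Streaming fails with Azure", "[bug] streaming fails with azure"))` falls in the auto-close tier because both titles normalize to the same string.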
Greptile Summary
Adds a two-tier duplicate issue detection system: the existing
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| .github/scripts/close_duplicate_issues.py | New script for duplicate issue detection using difflib.SequenceMatcher. Contains dead code in fetch_open_issues (lines 44-47 overwritten by 48-49). Core logic is sound — threshold-based matching with dry-run safety. Minor issues only. |
| .github/workflows/check_duplicate_issues.yml | Extends existing workflow with auto-close steps gated on opened events. Missing actions/setup-python step — relies on system Python for PEP 604 syntax (`str |
| .github/workflows/scan_duplicate_issues.yml | New workflow_dispatch workflow for batch scanning. Properly sets up Python 3.11. Minor concern: ${{ inputs.threshold }} directly interpolated in shell (low risk since write-access gated). |
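One common way to address the shell-interpolation concern in the last row is to pass the workflow input through an environment variable and validate it in Python before use. The sketch below assumes a hypothetical `THRESHOLD` environment variable and a hypothetical `parse_threshold` helper; neither is confirmed to exist in this PR.

```python
import os


def parse_threshold(raw, default=0.85):
    """Parse a similarity threshold from untrusted text.

    Returns `default` when the value is missing or blank, and raises
    ValueError for anything that is not a number in [0, 1].
    """
    if raw is None or raw.strip() == "":
        return default
    value = float(raw)  # raises ValueError for input such as "0.85; echo pwned"
    if not 0.0 <= value <= 1.0:
        # Also rejects nan and inf, which fail the range comparison.
        raise ValueError(f"threshold must be between 0 and 1, got {raw!r}")
    return value


if __name__ == "__main__":
    # THRESHOLD is an assumed variable name, set from the workflow input via `env:`.
    print(parse_threshold(os.environ.get("THRESHOLD")))
```

Because the value reaches the script as data rather than as part of the shell command line, a malformed input fails validation instead of being executed.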
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Issue Opened] --> B[wow-actions/potential-duplicates\nthreshold: 0.60]
B --> C{Similarity >= 0.60?}
C -->|Yes| D[Add 'potential-duplicate' label\n+ informational comment]
C -->|No| E[No action]
D --> F[Checkout close script]
E --> F
F --> G[Run close_duplicate_issues.py\nthreshold: 0.85]
G --> H[Fetch ALL open issues via gh api]
H --> I[Normalize titles & compare\nusing SequenceMatcher]
I --> J{Similarity >= 0.85?}
J -->|Yes| K[Add comment + 'duplicate' label\nClose with 'not_planned' reason]
J -->|No| L[No further action]
M[Manual Trigger\nworkflow_dispatch] --> N[Fetch ALL open issues]
N --> O[Compare every issue\nagainst older issues]
O --> P{Similarity >= threshold?}
P -->|Yes & close=true| Q[Close as duplicate]
P -->|Yes & close=false| R[Dry-run log only]
P -->|No| S[Skip]
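The manual-trigger branch of the flowchart (compare every issue against older issues, then close or log) could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this PR: `find_duplicates`, `plan_commands`, and `run` are hypothetical names, a lower issue number is assumed to mean an older issue, titles are only lowercased rather than fully normalized, and the comment text is invented. The `gh issue edit --add-label` and `gh issue close --reason "not planned" --comment` invocations are existing GitHub CLI commands.

```python
import subprocess
from difflib import SequenceMatcher


def find_duplicates(issues, threshold=0.85):
    """Return (duplicate_number, original_number, score) tuples.

    `issues` is a list of dicts with "number" and "title" keys. Each issue
    is compared only against issues with a lower number (assumed older).
    """
    ordered = sorted(issues, key=lambda issue: issue["number"])
    matches = []
    for idx, issue in enumerate(ordered):
        best = None
        for older in ordered[:idx]:
            score = SequenceMatcher(
                None, issue["title"].lower(), older["title"].lower()
            ).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (older["number"], score)
        if best is not None:
            matches.append((issue["number"], best[0], best[1]))
    return matches


def plan_commands(repo, matches):
    """Build the gh CLI invocations for each match without running them."""
    commands = []
    for number, original, _score in matches:
        commands.append(["gh", "issue", "edit", str(number), "--repo", repo,
                         "--add-label", "duplicate"])
        commands.append(["gh", "issue", "close", str(number), "--repo", repo,
                         "--reason", "not planned",
                         "--comment", f"Closing as a duplicate of #{original}."])
    return commands


def run(repo, issues, threshold=0.85, close=False):
    """Execute the plan when close=True; otherwise only log it (dry run)."""
    commands = plan_commands(repo, find_duplicates(issues, threshold))
    for cmd in commands:
        if close:
            subprocess.run(cmd, check=True)
        else:
            print("dry-run:", " ".join(cmd))
    return commands
```

Keeping command construction separate from execution makes the dry-run path easy to inspect before anything is closed. Note that the pairwise comparison is quadratic in the number of open issues.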
Last reviewed commit: db3d61f
Review
1. Does this PR fix the issue it describes?
Uses
2. Has this issue already been solved elsewhere?
3. Are there other PRs addressing the same problem?
4. Are there other issues this potentially closes?
✅ LGTM: conservative threshold (0.85) ensures no false positives. Good CI addition.
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Summary
- Adds a Python script (`.github/scripts/close_duplicate_issues.py`) that detects duplicate issues using `difflib.SequenceMatcher` on normalized titles and closes them via the `gh` CLI
- Adds a `workflow_dispatch` workflow (`scan_duplicate_issues.yml`) for one-time batch scans of all open issues
- Extends the existing `check_duplicate_issues.yml` workflow to auto-close newly opened issues that are high-confidence duplicates

Two-tier duplicate detection
- 0.6 threshold: informational comment and `potential-duplicate` label (existing wow-actions step)
- 0.85 threshold: auto-close with comment, `duplicate` label, and `not_planned` state reason

Dry-run results against 830 open issues
Test plan
- `python3 .github/scripts/close_duplicate_issues.py --repo BerriAI/litellm --scan --threshold 0.85`
- `gh workflow run scan_duplicate_issues.yml` with `close: false` to verify in CI
- Run with `close: true` to close the 7 confirmed duplicates

🤖 Generated with Claude Code