Add Discussion Task Miner workflow #224
Merged: pelikhan merged 2 commits into `main` from `daily-repo-goals/discussion-task-miner-20260301-fd8c3ec1aec706c4` on Mar 3, 2026 (+303 −0)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,67 @@ | ||
| # 🔍 Discussion Task Miner | ||
|
|
||
| > For an overview of all available workflows, see the [main README](../README.md). | ||
|
|
||
| **Automatically extract actionable tasks from GitHub Discussions and create trackable issues** | ||
|
|
||
| The [Discussion Task Miner workflow](../workflows/discussion-task-miner.md?plain=1) runs daily to scan recent GitHub Discussions for actionable improvement opportunities. It identifies concrete, well-scoped tasks and converts them into GitHub issues (up to 5 per run), bridging the gap between discussion insights and tracked work items. | ||
|
|
||
| ## Installation | ||
|
|
||
| ```bash | ||
| # Install the 'gh aw' extension | ||
| gh extension install github/gh-aw | ||
|
|
||
| # Add the workflow to your repository | ||
| gh aw add-wizard githubnext/agentics/discussion-task-miner | ||
| ``` | ||
|
|
||
| The `add-wizard` command walks you through adding the workflow to your repository. | ||
|
|
||
| ## How It Works | ||
|
|
||
| ```mermaid | ||
| graph LR | ||
| A[Scan Recent Discussions] --> B[Extract Action Items] | ||
| B --> C[Filter & Prioritize] | ||
| C --> D{High Value?} | ||
| D -->|Yes| E[Create GitHub Issue] | ||
| D -->|No| F[Skip] | ||
| E --> G[Update Memory] | ||
| F --> G | ||
| ``` | ||
|
|
||
| The workflow reads discussions from the last 7 days, analyzes their content for recommendations, action items, and improvement suggestions, then converts the top findings into focused, actionable GitHub issues. It uses cache-memory to avoid re-processing the same discussions across runs. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| GitHub Discussions must be enabled for your repository. The workflow works best in repositories that generate discussion content from other agentic workflows (such as analysis reports, quality audits, or review summaries), though it can also mine any human-authored discussions containing improvement suggestions. | ||
|
|
||
| ## Examples | ||
|
|
||
| Based on usage in the gh-aw repository, the workflow reached a **57% merge rate** (60 of 105 proposed PRs merged through a discussion → issue → PR causal chain). It shows how insights buried in discussions can be surfaced as trackable work. A verified example chain: [Discussion #13934](https://github.com/github/gh-aw/discussions/13934) → [Issue #14084](https://github.com/github/gh-aw/issues/14084) → [PR #14129](https://github.com/github/gh-aw/pull/14129). | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Configuration | ||
|
|
||
| The workflow is configured to: | ||
| - Run daily | ||
| - Create max 5 issues per run | ||
| - Auto-expire issues after 1 day if not addressed | ||
| - Use cache-memory to track processed discussions and avoid duplicates | ||
|
|
||
| To customize which types of tasks to extract, edit the "Focus Areas" and "Task Extraction Criteria" sections in the workflow file. After editing, run `gh aw compile` to update the workflow and commit all changes to the default branch. | ||
|
|
||
| ### Pairing with Other Workflows | ||
|
|
||
| This workflow pairs especially well with other analysis workflows that post findings as discussions: | ||
| - [Daily Accessibility Review](daily-accessibility-review.md) | ||
| - [Daily Adhoc QA](daily-qa.md) | ||
| - [Daily Malicious Code Scan](daily-malicious-code-scan.md) | ||
| - [Daily Performance Improver](daily-perf-improver.md) | ||
|
|
||
| ## Learn More | ||
|
|
||
| - [GitHub Agentic Workflows Documentation](https://github.github.io/gh-aw/) | ||
| - [Blog: Agentic Workflow Campaigns & Multi-Phase Workflows](https://github.github.io/gh-aw/blog/2026-01-13-meet-the-workflows-campaigns/) |
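
The README describes reading discussions from the last 7 days and skipping ones already recorded in memory. The workflow itself is an agent prompt and contains no such code, so the following is only a minimal sketch of that selection logic. The function names and the `id` / `updated_at` field names are assumptions made for illustration, and the 30-item cap mirrors the "20-30 most recent discussions" guidance in the workflow file.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_recent(discussions, processed_ids, now, window_days=7, limit=30):
    """Return unprocessed discussions updated within the window, newest first.

    `discussions` is a list of dicts with hypothetical `id` and `updated_at`
    keys; `processed_ids` is a set of ids already recorded in memory.
    """
    cutoff = now - timedelta(days=window_days)
    fresh = [
        d for d in discussions
        if d["id"] not in processed_ids and parse_ts(d["updated_at"]) >= cutoff
    ]
    fresh.sort(key=lambda d: parse_ts(d["updated_at"]), reverse=True)
    return fresh[:limit]
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, keeps the window reproducible when testing against fixed data.
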
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,235 @@ | ||
| --- | ||
| name: Discussion Task Miner | ||
| description: Scans recent GitHub Discussions to extract actionable improvement tasks and create trackable GitHub issues | ||
| on: | ||
| schedule: daily | ||
| workflow_dispatch: | ||
|
|
||
| permissions: | ||
| contents: read | ||
| discussions: read | ||
| issues: read | ||
| pull-requests: read | ||
|
|
||
| tracker-id: discussion-task-miner | ||
| timeout-minutes: 20 | ||
| engine: copilot | ||
| strict: true | ||
|
|
||
| network: | ||
| allowed: | ||
| - defaults | ||
|
|
||
| safe-outputs: | ||
| create-issue: | ||
| title-prefix: "[task-miner] " | ||
| labels: [automated-analysis] | ||
| max: 5 | ||
| group: true | ||
| expires: 1d | ||
| messages: | ||
| footer: "> 🔍 *Task mining by [{workflow_name}]({run_url})*" | ||
| run-started: "🔍 Discussion Task Miner starting! [{workflow_name}]({run_url}) is scanning discussions for actionable tasks..." | ||
| run-success: "✅ Task mining complete! [{workflow_name}]({run_url}) has identified actionable tasks from recent discussions. 📊" | ||
| run-failure: "⚠️ Task mining interrupted! [{workflow_name}]({run_url}) {status}. Please review the logs..." | ||
|
|
||
| tools: | ||
| cache-memory: true | ||
| github: | ||
| lockdown: true | ||
| toolsets: [default, discussions] | ||
| bash: | ||
| - "jq *" | ||
| - "cat *" | ||
| - "date *" | ||
|
|
||
| imports: | ||
| - shared/reporting.md | ||
| --- | ||
|
|
||
| # Discussion Task Miner | ||
|
|
||
| You are a task mining agent that analyzes recent GitHub Discussions to discover actionable improvement opportunities. | ||
|
|
||
| ## Mission | ||
|
|
||
| Scan recent GitHub Discussions to identify and extract specific, actionable tasks that improve the repository. Convert these discoveries into trackable GitHub issues. | ||
|
|
||
| ## Objectives | ||
|
|
||
| 1. **Mine Discussions**: Analyze recent discussions (last 7 days) | ||
| 2. **Extract Tasks**: Identify concrete, actionable improvements | ||
| 3. **Create Issues**: Convert high-value tasks into GitHub issues | ||
| 4. **Track Progress**: Maintain memory of processed discussions to avoid duplicates | ||
|
|
||
| ## Task Extraction Criteria | ||
|
|
||
| Focus on extracting tasks that meet **ALL** these criteria: | ||
|
|
||
| ### Quality Criteria | ||
| - ✅ **Specific**: Task has clear scope and acceptance criteria | ||
| - ✅ **Actionable**: Can be completed by a developer or AI agent | ||
| - ✅ **Valuable**: Improves the repository in a meaningful way | ||
| - ✅ **Scoped**: Can be completed in 1-3 days of work | ||
| - ✅ **Independent**: Doesn't require completing other tasks first | ||
|
|
||
| ### Focus Areas | ||
| - **Code Quality**: Simplify complex code, reduce duplication, improve structure | ||
| - **Testing**: Add missing tests, improve test coverage, fix flaky tests | ||
| - **Documentation**: Add or improve documentation, examples, guides | ||
| - **Performance**: Optimize slow operations, reduce resource usage | ||
| - **Security**: Fix vulnerabilities, improve security practices | ||
| - **Maintainability**: Improve code organization, naming, patterns | ||
| - **Technical Debt**: Address TODOs, deprecated APIs, workarounds | ||
| - **Tooling**: Improve linters, formatters, build scripts, CI/CD | ||
|
|
||
| ### Exclude These | ||
| - ❌ Vague suggestions without clear scope ("improve code") | ||
| - ❌ Already tracked in existing issues | ||
| - ❌ Feature requests or new functionality | ||
| - ❌ Bug reports (those go through normal bug triage) | ||
| - ❌ Tasks requiring architectural decisions | ||
| - ❌ Tasks requiring human judgment or business decisions | ||
|
|
||
| ## Workflow Steps | ||
|
|
||
| ### Step 1: Load Memory | ||
|
|
||
| Check cache-memory for previously processed discussions. The cache memory stores a JSON object with this structure: | ||
|
|
||
| ```json | ||
| { | ||
| "last_run": "2026-03-01", | ||
| "discussions_processed": [ | ||
| {"id": 1234, "title": "...", "processed_at": "2026-03-01T10:00:00Z"} | ||
| ], | ||
| "extracted_tasks": [ | ||
| { | ||
| "source_discussion": 1234, | ||
| "issue_number": 5678, | ||
| "title": "...", | ||
| "created_at": "2026-03-01T10:00:00Z", | ||
| "status": "created" | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| This helps avoid re-processing the same discussions and creating duplicate issues. | ||
|
|
||
| ### Step 2: Query Recent Discussions | ||
|
|
||
| Use GitHub tools to fetch recent discussions from the last 7 days. Look for discussions with titles or content that contain actionable insights, such as: | ||
| - Analysis reports and audit findings | ||
| - Code review observations | ||
| - Performance or quality assessments | ||
| - Recommendations sections in any discussion | ||
| - Any discussion mentioning "should", "could", "improve", "fix", "refactor", "add" | ||
|
|
||
| Limit to the 20-30 most recent discussions for efficiency. | ||
|
|
||
| ### Step 3: Analyze Discussion Content | ||
|
|
||
| For each discussion, extract the full content including: | ||
| - Title and body | ||
| - All comments | ||
| - Sections such as: | ||
| - "Recommendations" | ||
| - "Action Items" | ||
| - "Improvements Needed" | ||
| - "Issues Found" | ||
| - "Technical Debt" | ||
| - "Refactoring Opportunities" | ||
| - "TODOs" or "Next Steps" | ||
|
|
||
| **Analysis approach:** | ||
| 1. Read the discussion content carefully | ||
| 2. Identify mentions of concrete improvement opportunities | ||
| 3. Extract specific tasks with clear descriptions | ||
| 4. Note the file paths, components, or areas mentioned | ||
| 5. Assess impact and feasibility | ||
|
|
||
| ### Step 4: Filter and Prioritize Tasks | ||
|
|
||
| From all identified tasks, select the **top 3-5 highest-value tasks** based on: | ||
| 1. **Impact**: How much does this improve the repository? | ||
| 2. **Effort**: Is it achievable in 1-3 days? | ||
| 3. **Clarity**: Is the task well-defined? | ||
| 4. **Uniqueness**: Has an issue already been created for this? | ||
|
|
||
| **Deduplication:** | ||
| - Check `discussions_processed` in cache-memory to avoid re-extracting from the same discussion | ||
| - Check `extracted_tasks` in cache-memory to avoid creating duplicate issues | ||
| - Search existing GitHub issues to ensure the task isn't already tracked | ||
|
|
||
| ### Step 5: Create GitHub Issues | ||
|
|
||
| For each selected task, use the `create-issue` safe output with a clear title and body. Format issues to include: | ||
|
|
||
| - **Description**: What needs to be done and why | ||
| - **Suggested Changes**: Specific actions to take | ||
| - **Files Affected**: Relevant files or components (if known) | ||
| - **Success Criteria**: How to know when done | ||
| - **Source**: Link to the source discussion | ||
| - **Priority**: High/Medium/Low | ||
|
|
||
| **Issue formatting guidelines:** | ||
| - Use clear, descriptive titles (50-80 characters) | ||
| - Include acceptance criteria | ||
| - Link back to source discussion | ||
| - Add appropriate priority (High/Medium/Low) | ||
|
|
||
| ### Step 6: Update Memory | ||
|
|
||
| Save progress to cache-memory using the JSON structure: | ||
|
|
||
| ```json | ||
| { | ||
| "last_run": "<today's date>", | ||
| "discussions_processed": [ | ||
| {"id": 1234, "title": "...", "processed_at": "<timestamp>"} | ||
| ], | ||
| "extracted_tasks": [ | ||
| { | ||
| "source_discussion": 1234, | ||
| "issue_number": 5678, | ||
| "title": "...", | ||
| "created_at": "<timestamp>", | ||
| "status": "created" | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| Merge with the existing cache-memory data to preserve historical tracking of processed discussions and extracted tasks. | ||
|
|
||
| ## Output Requirements | ||
|
|
||
| ### Issue Creation | ||
| - Create **at most 5 issues** per run (typically the top 3-5 tasks) | ||
| - Each issue expires after 1 day if not addressed | ||
| - All issues tagged with `automated-analysis` | ||
| - Issues include clear acceptance criteria | ||
|
|
||
| ### Memory Tracking | ||
| - Always update cache-memory after each run to avoid duplicates | ||
| - Maintain extracted tasks in cache-memory for historical tracking | ||
|
|
||
| ### Quality Standards | ||
| - Only create issues for high-value, actionable tasks | ||
| - Ensure each issue is specific and well-scoped | ||
| - Link back to source discussions for context | ||
|
|
||
| ## Important Notes | ||
|
|
||
| - **Be selective** - only the highest-value tasks make the cut | ||
| - **Avoid duplicates** - check memory and existing issues before creating | ||
| - **Clear scope** - tasks should be completable in 1-3 days | ||
| - **Actionable** - someone should be able to start immediately | ||
| - **Source attribution** - always link to the original discussion | ||
|
|
||
| **Important**: If no discussions are found or no actionable tasks are identified, you **MUST** call the `noop` safe-output tool with a brief explanation. | ||
|
|
||
| ```json | ||
| {"noop": {"message": "No action needed: [brief explanation of what was analyzed and why no tasks were extracted]"}} | ||
| ``` | ||
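
The filtering, deduplication, and `noop` fallback described in the workflow prompt can be sketched roughly as follows. This is an illustration under stated assumptions, not what the agent actually executes: the `score` field and the title-based duplicate check are hypothetical, while the cap of 5 and the `noop` message shape come from the workflow file above.

```python
import json

MAX_ISSUES = 5  # mirrors `max: 5` under safe-outputs.create-issue


def pick_tasks(candidates, memory, max_issues=MAX_ISSUES):
    """Drop tasks whose title was already extracted, then keep the top scorers.

    `candidates` is a list of dicts with `title` and a hypothetical numeric
    `score`; `memory` is the cache-memory object with an `extracted_tasks` list.
    """
    seen = {t["title"].strip().lower() for t in memory.get("extracted_tasks", [])}
    unique = []
    for task in candidates:
        key = task["title"].strip().lower()
        if key not in seen:
            seen.add(key)  # also guards against duplicates within this run
            unique.append(task)
    unique.sort(key=lambda t: t["score"], reverse=True)
    return unique[:max_issues]


def noop_payload(reason):
    """Build the `noop` safe-output message for runs that create no issues."""
    return json.dumps({"noop": {"message": f"No action needed: {reason}"}})
```

Matching on normalized titles is a deliberately crude stand-in for the prompt's "search existing GitHub issues" step, where the agent is expected to judge similarity itself.
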
@copilot update memory layout for cache-memory
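
For reference, the "merge with the existing cache-memory data" step in the workflow prompt could look something like the sketch below. It assumes the single JSON object shown in the workflow file (`last_run`, `discussions_processed`, `extracted_tasks`). The function name `merge_memory` and the choice to key entries by `id` and `issue_number` are assumptions for illustration, not part of this PR.

```python
def merge_memory(existing, update):
    """Merge one run's results into the stored cache-memory object.

    Entries are keyed by their natural identifier, so a re-processed
    discussion or issue overwrites its earlier entry instead of duplicating it.
    """
    merged = {"last_run": update.get("last_run", existing.get("last_run"))}
    fields = (("discussions_processed", "id"), ("extracted_tasks", "issue_number"))
    for field, key in fields:
        by_key = {}
        for entry in existing.get(field, []) + update.get(field, []):
            by_key[entry[key]] = entry  # later entries win
        merged[field] = list(by_key.values())
    return merged
```

Because Python dicts preserve insertion order, older entries keep their position and new ones are appended, which keeps the history readable when the file is inspected by hand.
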