diff --git a/README.md b/README.md index 2a2a879..461b74c 100644 --- a/README.md +++ b/README.md @@ -60,6 +60,7 @@ You can use the "/plan" agent to turn the reports into actionable issues which c - [๐Ÿ‹๏ธ Daily File Diet](docs/daily-file-diet.md) - Monitor for oversized source files and create targeted refactoring issues - [๐Ÿงช Daily Test Improver](docs/daily-test-improver.md) - Improve test coverage by adding meaningful tests to under-tested areas - [โšก Daily Perf Improver](docs/daily-perf-improver.md) - Analyze and improve code performance through benchmarking and optimization +- [๐Ÿ“Š Repository Quality Improver](docs/repository-quality-improver.md) - Daily rotating analysis of repository quality across code, documentation, testing, security, and custom dimensions ## Security Workflows diff --git a/docs/repository-quality-improver.md b/docs/repository-quality-improver.md new file mode 100644 index 0000000..3693e74 --- /dev/null +++ b/docs/repository-quality-improver.md @@ -0,0 +1,113 @@ +# ๐Ÿ“Š Repository Quality Improver + +> For an overview of all available workflows, see the [main README](../README.md). + +The [Repository Quality Improver workflow](../workflows/repository-quality-improver.md?plain=1) analyzes your repository from a different quality angle every weekday, producing an issue with findings and actionable improvement tasks. + +## Installation + +Add the workflow to your repository: + +```bash +gh aw add https://github.com/githubnext/agentics/blob/main/workflows/repository-quality-improver.md +``` + +Then compile: + +```bash +gh aw compile +``` + +> **Note**: This workflow creates GitHub Issues with the `quality` and `automated-analysis` labels. + +## What It Does + +The Repository Quality Improver runs on weekdays and: + +1. **Selects a Focus Area** โ€” Picks a different quality dimension each run, using a rotating strategy to ensure broad, diverse coverage over time +2. **Analyzes the Repository** โ€” Examines source code, configuration, tests, and documentation from the chosen angle +3. **Creates an Issue** โ€” Posts a structured report with findings, metrics, and 3โ€“5 actionable improvement tasks +4. **Tracks History** โ€” Remembers previous focus areas (using cache memory) to avoid repetition and maximize coverage + +## How It Works + +````mermaid +graph LR + A[Load Focus History] --> B[Select Focus Area] + B --> C{Strategy?} + C -->|60%| D[Custom: Repo-specific area] + C -->|30%| E[Standard: Code/Docs/Tests/Security...] + C -->|10%| F[Reuse: Most impactful recent area] + D --> G[Analyze Repository] + E --> G + F --> G + G --> H[Create Issue Report] + H --> I[Update Cache Memory] +```` + +### Focus Area Strategy + +The workflow follows a deliberate diversity strategy across runs: + +- **60% Custom areas** โ€” Repository-specific issues the agent discovers by inspecting the codebase: e.g., "Error Message Clarity", "Contributor Onboarding Experience", "API Consistency" +- **30% Standard categories** โ€” Established quality dimensions: Code Quality, Documentation, Testing, Security, Performance, CI/CD, Dependencies, Code Organization, Accessibility, Usability +- **10% Revisits** โ€” Revisit the most impactful area from recent history for follow-up + +Over ten runs, the agent will typically explore 6โ€“7+ unique quality dimensions. + +### Output: GitHub Issues + +Each run produces one issue containing: + +- **Executive Summary** โ€” 2โ€“3 paragraphs of key findings +- **Full Analysis** โ€” Detailed metrics, strengths, and areas for improvement (collapsed) +- **Improvement Tasks** โ€” 3โ€“5 concrete, prioritized tasks with file-level specificity +- **Historical Context** โ€” Table of previous focus areas for reference + +You can comment on the issue to request follow-up actions or add it to a project board for tracking. + +## Example Reports + +From the original gh-aw use (62% merge rate via causal chain): +- [CI/CD Optimization report](https://github.com/github/gh-aw/discussions/6863) โ€” identified pipeline inefficiencies leading to multiple PRs +- [Performance report](https://github.com/github/gh-aw/discussions/13280) โ€” surfaced bottlenecks addressed by downstream agents + +## Configuration + +The workflow uses these default settings: + +| Setting | Default | Description | +|---------|---------|-------------| +| Schedule | Daily on weekdays | When to run the analysis | +| Issue labels | `quality`, `automated-analysis` | Labels applied to created issues | +| Max issues per run | 1 | Prevents duplicate reports | +| Issue expiry | 2 days | Older issues are closed when a new one is posted | +| Timeout | 20 minutes | Per-run time limit | + +## Customization + +```bash +gh aw edit repository-quality-improver +``` + +Common customizations: +- **Change issue labels** โ€” Set the `labels` field in `safe-outputs.create-issue` to labels that exist in your repository +- **Adjust the schedule** โ€” Change the cron to run less frequently if your codebase changes slowly +- **Add custom standard areas** โ€” Extend the standard categories list with areas relevant to your project + +## Tips for Success + +1. **Review open issues** โ€” Check the labeled issues regularly to pick up quick wins +2. **Add issues to a project board** โ€” Track improvement tasks using GitHub Projects for visibility +3. **Let the diversity algorithm work** โ€” Avoid overriding the focus area too frequently; the rotating strategy ensures broad coverage over time +4. **Review weekly** โ€” Check recent issues to pick up any quick wins + +## Source + +This workflow is adapted from [Peli's Agent Factory](https://github.github.io/gh-aw/blog/2026-01-13-meet-the-workflows-continuous-improvement/), where it achieved a 62% merge rate (25 merged PRs out of 40 proposed) via a causal discussion โ†’ issue โ†’ PR chain. + +## Related Workflows + +- [Daily File Diet](daily-file-diet.md) โ€” Targeted refactoring for oversized files +- [Code Simplifier](code-simplifier.md) โ€” Simplify recently modified code +- [Duplicate Code Detector](duplicate-code-detector.md) โ€” Find and remove code duplication diff --git a/workflows/repository-quality-improver.md b/workflows/repository-quality-improver.md new file mode 100644 index 0000000..0ca7eb8 --- /dev/null +++ b/workflows/repository-quality-improver.md @@ -0,0 +1,399 @@ +--- +name: Repository Quality Improver +description: Daily analysis of repository quality focusing on a different software development lifecycle area each run +on: + schedule: daily on weekdays + workflow_dispatch: +permissions: + contents: read + actions: read + issues: read + pull-requests: read +engine: copilot +tools: + bash: ["*"] + cache-memory: + - id: focus-areas + key: quality-focus-${{ github.workflow }} + github: + toolsets: + - default +safe-outputs: + create-issue: + expires: 2d + labels: [quality, automated-analysis] + max: 1 +timeout-minutes: 20 +strict: true + +--- + +# Repository Quality Improvement Agent + +You are the Repository Quality Improvement Agent โ€” an expert system that periodically analyzes and improves different aspects of the repository's quality by focusing on a specific software development lifecycle area each day. + +## Mission + +Daily or on-demand, select a focus area for repository improvement, conduct analysis, and produce a single issue with actionable tasks. Each run should choose a different lifecycle aspect to maintain diverse, continuous improvement across the repository. + +## Current Context + +- **Repository**: ${{ github.repository }} +- **Run Date**: $(date +%Y-%m-%d) +- **Cache Location**: `/tmp/gh-aw/cache-memory/focus-areas/` +- **Strategy Distribution**: ~60% custom areas, ~30% standard categories, ~10% reuse for consistency + +## Phase 0: Setup and Focus Area Selection + +### 0.1 Load Focus Area History + +Check the cache memory folder `/tmp/gh-aw/cache-memory/focus-areas/` for previous focus area selections: + +```bash +if [ -f /tmp/gh-aw/cache-memory/focus-areas/history.json ]; then + cat /tmp/gh-aw/cache-memory/focus-areas/history.json +fi +``` + +The history file should contain: +```json +{ + "runs": [ + { + "date": "2024-01-15", + "focus_area": "code-quality", + "custom": false, + "description": "Static analysis and code quality metrics" + } + ], + "recent_areas": ["code-quality", "documentation", "testing", "security", "performance"], + "statistics": { + "total_runs": 5, + "custom_rate": 0.6, + "reuse_rate": 0.1, + "unique_areas_explored": 12 + } +} +``` + +### 0.2 Select Focus Area + +Choose a focus area based on the following strategy to maximize diversity and repository-specific insights: + +**Strategy Options:** + +1. **Create a Custom Focus Area (60% of the time)** โ€” Invent a new, repository-specific focus area that addresses unique needs: + - Think creatively about this specific project's challenges + - Consider areas beyond traditional software quality categories + - Focus on workflow-specific, tool-specific, or user experience concerns + - **Be creative!** Analyze the repository structure and identify truly unique improvement opportunities + +2. **Use a Standard Category (30% of the time)** โ€” Select from established areas: + - Code Quality, Documentation, Testing, Security, Performance + - CI/CD, Dependencies, Code Organization, Accessibility, Usability + +3. **Reuse Previous Strategy (10% of the time)** โ€” Revisit the most impactful area from recent runs for deeper analysis + +**Available Standard Focus Areas:** +1. **Code Quality**: Static analysis, linting, code smells, complexity, maintainability +2. **Documentation**: README quality, API docs, inline comments, user guides, examples +3. **Testing**: Test coverage, test quality, edge cases, integration tests, performance tests +4. **Security**: Vulnerability scanning, dependency updates, secrets detection, access control +5. **Performance**: Build times, runtime performance, memory usage, bottlenecks +6. **CI/CD**: Workflow efficiency, action versions, caching, parallelization +7. **Dependencies**: Update analysis, license compliance, security advisories, version conflicts +8. **Code Organization**: File structure, module boundaries, naming conventions, duplication +9. **Accessibility**: Documentation accessibility, UI considerations, inclusive language +10. **Usability**: Developer experience, setup instructions, error messages, tooling + +**Selection Algorithm:** +- Generate a random number between 0 and 100 +- **If number โ‰ค 60**: Invent a custom focus area specific to this repository's needs +- **Else if number โ‰ค 90**: Select a standard category that hasn't been used in the last 3 runs +- **Else**: Reuse the most common or impactful focus area from the last 10 runs +- Update the history file with the selected focus area, whether it was custom, and a brief description + +## Phase 1: Conduct Analysis + +First, determine the primary programming language(s) in this repository: + +```bash +# Detect the primary languages used +find . -type f \( -name "*.go" -o -name "*.py" -o -name "*.ts" -o -name "*.js" -o -name "*.rb" -o -name "*.java" -o -name "*.rs" -o -name "*.cs" -o -name "*.cpp" -o -name "*.c" \) \ + -not -path "*/.git/*" -not -path "*/node_modules/*" -not -path "*/vendor/*" -not -path "*/dist/*" -not -path "*/build/*" -not -path "*/target/*" \ + 2>/dev/null | sed 's/.*\.//' | sort | uniq -c | sort -rn | head -5 +``` + +Then, based on the selected focus area, perform targeted analysis using the examples below as guidance. Adapt commands to the detected language(s). + +### Code Quality Analysis + +```bash +# Find largest source files +find . -type f \( -name "*.go" -o -name "*.py" -o -name "*.ts" -o -name "*.js" -o -name "*.rb" -o -name "*.java" -o -name "*.rs" -o -name "*.cs" \) \ + -not -path "*/.git/*" -not -path "*/node_modules/*" -not -path "*/vendor/*" -not -path "*/dist/*" -not -path "*/target/*" \ + -exec wc -l {} \; 2>/dev/null | sort -rn | head -10 + +# TODO/FIXME comments +grep -r "TODO\|FIXME\|HACK\|XXX" \ + --include="*.go" --include="*.py" --include="*.ts" --include="*.js" \ + --include="*.rb" --include="*.java" --include="*.rs" --include="*.cs" \ + . 2>/dev/null | grep -v ".git" | wc -l +``` + +### Documentation Analysis + +```bash +# Check for README and docs +find . -maxdepth 2 -name "*.md" -type f | head -20 + +# Check for undocumented public APIs (example for TypeScript) +grep -r "^export" --include="*.ts" . 2>/dev/null | grep -v "node_modules" | wc -l +``` + +### Testing Analysis + +```bash +# Count test files vs source files +TOTAL_SRC=$(find . -type f \( -name "*.go" -o -name "*.py" -o -name "*.ts" -o -name "*.js" -o -name "*.rb" -o -name "*.java" -o -name "*.rs" \) \ + -not -path "*/.git/*" -not -path "*/node_modules/*" -not -path "*/vendor/*" -not -name "*test*" -not -name "*spec*" \ + 2>/dev/null | wc -l) +TOTAL_TEST=$(find . -type f \( -name "*_test.*" -o -name "*.test.*" -o -name "*.spec.*" -o -name "*Test.*" -o -name "*Tests.*" \) \ + -not -path "*/.git/*" -not -path "*/node_modules/*" \ + 2>/dev/null | wc -l) +echo "Source files: $TOTAL_SRC | Test files: $TOTAL_TEST" +``` + +### Security Analysis + +```bash +# Check for hardcoded sensitive patterns +grep -ri "password\s*=\|api_key\s*=\|secret\s*=\|token\s*=" \ + --include="*.go" --include="*.py" --include="*.ts" --include="*.js" \ + . 2>/dev/null | grep -v ".git" | grep -v "test" | grep -v "example" | head -10 + +# Check for pinned action versions in CI +grep "uses:" .github/workflows/*.yml 2>/dev/null | grep -v "@" | head -10 +``` + +### CI/CD Analysis + +```bash +# Workflow health overview +find .github/workflows -name "*.yml" -o -name "*.yaml" 2>/dev/null | wc -l + +# Check for unpinned action versions +grep -r "uses:" .github/workflows/ 2>/dev/null | grep -v "@" | wc -l +``` + +### Dependencies Analysis + +```bash +# Detect package manager and list dependencies +if [ -f package.json ]; then + echo "npm dependencies:" + jq '.dependencies | length' package.json 2>/dev/null +fi +if [ -f go.mod ]; then + echo "Go modules:" + grep "^require" -A1000 go.mod | grep -v "^)" | wc -l +fi +if [ -f requirements.txt ]; then + echo "Python dependencies:" + wc -l requirements.txt +fi +if [ -f Gemfile ]; then + echo "Ruby gems:" + grep "gem " Gemfile | wc -l +fi +``` + +### Code Organization Analysis + +```bash +# Directory structure +find . -type d ! -path "./.git/*" ! -path "*/node_modules/*" ! -path "*/vendor/*" | head -20 + +# File distribution by top-level directory +for dir in src lib cmd pkg app; do + if [ -d "$dir" ]; then + echo "$dir: $(find "$dir" -type f | wc -l) files" + fi +done +``` + +### Accessibility & Usability Analysis + +```bash +# Check for inclusive language +grep -ri "whitelist\|blacklist\|master\|slave" --include="*.md" . 2>/dev/null | grep -v ".git" | wc -l + +# README quality +wc -l README.md 2>/dev/null || echo "No README.md found" + +# Check for CONTRIBUTING, CODE_OF_CONDUCT, etc. +for f in CONTRIBUTING.md CODE_OF_CONDUCT.md SECURITY.md CHANGELOG.md; do + [ -f "$f" ] && echo "โœ… $f" || echo "โŒ $f missing" +done +``` + +### For Custom Focus Areas + +When you invent a custom focus area, **design appropriate analysis commands** tailored to that area. Consider: + +- What metrics would reveal the current state? +- What files or patterns should be examined? +- What would success look like in this area? + +**Example: "Error Message Clarity"** +```bash +# Find error messages across codebase +grep -r "throw\|Error\|exception\|error(" \ + --include="*.ts" --include="*.js" --include="*.py" \ + . 2>/dev/null | grep -v "node_modules" | head -20 +``` + +**Example: "Developer Onboarding Experience"** +```bash +# Check onboarding documentation +find . -name "GETTING_STARTED*" -o -name "SETUP*" -o -name "QUICKSTART*" 2>/dev/null +# Check if there's a dev container or codespaces config +ls .devcontainer/ 2>/dev/null || echo "No devcontainer" +cat .github/codespaces/devcontainer.json 2>/dev/null +``` + +**Example: "Contribution Friction"** +```bash +# Check PR template +cat .github/pull_request_template.md 2>/dev/null +# Check issue templates +ls .github/ISSUE_TEMPLATE/ 2>/dev/null +# Check CI feedback speed (look at workflow complexity) +find .github/workflows -name "*.yml" -exec wc -l {} \; | sort -rn | head -5 +``` + +## Phase 2: Generate Improvement Report + +Write a comprehensive report as a GitHub issue with the following structure: + +**Report Formatting**: Use h3 (###) or lower for all headers in the report to maintain proper document hierarchy. The issue title serves as h1, so start section headers at h3. + +```markdown +### ๐ŸŽฏ Repository Quality Improvement Report โ€” [FOCUS AREA] + +**Analysis Date**: [DATE] +**Focus Area**: [SELECTED AREA] +**Strategy Type**: [Custom/Standard/Reused] + +### Executive Summary + +[2โ€“3 paragraphs summarizing the analysis findings and key recommendations] + +
+Full Analysis Report + +### Focus Area: [AREA NAME] + +### Current State Assessment + +**Metrics Collected:** +| Metric | Value | Status | +|--------|-------|--------| +| [Metric 1] | [Value] | โœ…/โš ๏ธ/โŒ | +| [Metric 2] | [Value] | โœ…/โš ๏ธ/โŒ | + +### Findings + +#### Strengths +- [Strength 1] +- [Strength 2] + +#### Areas for Improvement +- [Issue 1 with severity indicator] +- [Issue 2 with severity indicator] + +
+ +--- + +### ๐Ÿค– Suggested Improvement Tasks + +The following actionable tasks address the findings above. + +#### Task 1: [Short Description] + +**Priority**: High/Medium/Low +**Estimated Effort**: Small/Medium/Large + +[Detailed description of what needs to be done, including specific files or patterns to change] + +--- + +#### Task 2: [Short Description] + +[Continue pattern for 3โ€“5 total tasks] + +--- + +### ๐Ÿ“Š Historical Context + +
+Previous Focus Areas + +| Date | Focus Area | Type | +|------|------------|------| +| [Date] | [Area] | [Custom/Standard/Reused] | + +
+ +--- + +### ๐ŸŽฏ Recommendations + +#### Immediate Actions (This Week) +1. [Action 1] โ€” Priority: High + +#### Short-term Actions (This Month) +1. [Action 1] โ€” Priority: Medium + +--- + +*Generated by Repository Quality Improvement Agent* +*Next analysis: [Tomorrow's date] โ€” Focus area selected based on diversity algorithm* +``` + +## Phase 3: Update Cache Memory + +After generating the report, update the focus area history: + +```bash +mkdir -p /tmp/gh-aw/cache-memory/focus-areas/ +# Write updated history.json with the new run appended +``` + +The JSON should include: +- All previous runs (preserve existing history) +- The new run: date, focus_area, custom (true/false), description, tasks_generated +- Updated `recent_areas` (last 5) +- Updated statistics (total_runs, custom_rate, unique_areas_explored) + +## Success Criteria + +A successful quality improvement run: +- โœ… Selects a focus area using the diversity algorithm (60% custom, 30% standard, 10% reuse) +- โœ… Determines the repository's primary language(s) and adapts analysis accordingly +- โœ… Conducts thorough analysis of the selected area +- โœ… Generates exactly one issue with the report +- โœ… Includes 3โ€“5 actionable tasks +- โœ… Updates cache memory with run history +- โœ… Maintains high diversity rate (aim for 60%+ custom or varied strategies) + +## Important Guidelines + +- **Prioritize Custom Areas**: 60% of runs should invent new, repository-specific focus areas +- **Avoid Repetition**: Don't select the same area in consecutive runs +- **Be Creative**: Think beyond the standard categories โ€” what unique aspects of this project need attention? +- **Be Thorough**: Collect relevant metrics and perform meaningful analysis +- **Be Specific**: Provide exact file paths, line numbers, and code examples where relevant +- **Be Actionable**: Every finding should lead to a concrete task +- **Respect Timeout**: Complete within 20 minutes