Implement agentic link following for contributing guidelines #60

Merged
slavingia merged 1 commit into main from devin/1755717081-agentic-link-following on Aug 20, 2025

@devin-ai-integration
Contributor

Implement agentic link following for contributing guidelines

Summary

Enhanced the loadContributingGuidelines function to parse markdown links in CONTRIBUTING.md files and recursively fetch referenced documents to provide comprehensive context for AI analysis. The bot can now follow links like [README](README.md) to aggregate content from multiple related documents.

Key Features:

  • Recursive markdown link parsing with regex /\[([^\]]+)\]\(([^)]+)\)/g
  • URL resolution for relative and absolute GitHub links to raw.githubusercontent.com
  • Depth limiting (max 3 levels) and visited-URL tracking to bound recursion and prevent infinite loops on circular links
  • Enhanced caching system with composite keys including depth
  • Graceful error handling for failed network requests
  • Content aggregation with clear document separation for AI context
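
As a rough illustration of the link-parsing feature, the sketch below applies the regex listed above to a markdown string. Only the regex and the `extractMarkdownLinks` name come from this PR; the `MarkdownLink` shape and the function signature are assumptions made for this example and may differ from route.ts.

```typescript
// Hypothetical shape for one parsed link; not taken from route.ts.
interface MarkdownLink {
  text: string;
  url: string;
}

// Regex quoted in the PR description: matches [text](url).
const MARKDOWN_LINK_PATTERN = /\[([^\]]+)\]\(([^)]+)\)/g;

function extractMarkdownLinks(markdown: string): MarkdownLink[] {
  // Build a fresh RegExp per call so the global flag's lastIndex is never shared.
  const pattern = new RegExp(MARKDOWN_LINK_PATTERN.source, "g");
  const links: MarkdownLink[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    links.push({ text: match[1], url: match[2].trim() });
  }
  return links;
}

const sample = "See the [README](README.md) and the [style guide](docs/STYLE.md#naming).";
console.log(extractMarkdownLinks(sample));
```

Note that this pattern also matches image syntax (`![alt](src)`) and keeps any link title inside the captured URL, so a caller would likely need to filter or normalize the results before fetching.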

Files Modified:

  • app/api/webhook/route.ts: Enhanced loadContributingGuidelines with 4 new helper functions
  • tests/webhook.test.ts: Added comprehensive test suite for link following functionality

Review & Testing Checklist for Human

  • Test with real webhook data: Create a PR in a repository that has CONTRIBUTING.md with links to other markdown files and verify the bot processes all linked content
  • Verify network error handling: Test behavior when linked files return 404 or network requests fail (should gracefully degrade)
  • Check recursion safety: Verify that depth limiting and visited URL tracking prevent infinite loops, especially with circular references
  • Validate URL resolution: Test various GitHub URL formats (relative paths, /blob/, /tree/, external links) to ensure proper handling
  • Confirm caching behavior: Verify that cache invalidation works correctly with the new composite cache keys and doesn't serve stale aggregated content
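
To make the URL formats in the checklist concrete, here is a minimal sketch of how resolution to raw.githubusercontent.com could work. The `resolveGitHubUrl` name appears in the diagram, but the `RepoContext` type, the signature, and every rule below (same-repository only, `/tree/` and external links skipped, the first path segment after `/blob/` treated as the ref) are assumptions for illustration, not the merged implementation.

```typescript
// Hypothetical context describing the document that contains the link.
interface RepoContext {
  owner: string;
  repo: string;
  ref: string; // branch, tag, or commit SHA
  currentPath: string; // e.g. "docs/CONTRIBUTING.md"
}

const RAW_HOST = "https://raw.githubusercontent.com";

// Returns a raw-content URL, or null when the link should not be followed.
function resolveGitHubUrl(link: string, ctx: RepoContext): string | null {
  const target = link.split("#")[0].trim();
  if (target === "") return null; // pure in-page anchor such as "#setup"

  const repoRoot = `${RAW_HOST}/${ctx.owner}/${ctx.repo}/`;
  const refRoot = `${repoRoot}${ctx.ref}/`;

  if (/^[a-z][a-z0-9+.-]*:/i.test(target)) {
    // Absolute URL with a scheme (https:, mailto:, ...).
    let parsed: URL;
    try {
      parsed = new URL(target);
    } catch {
      return null;
    }
    if (parsed.protocol !== "https:") return null;
    if (parsed.hostname === "raw.githubusercontent.com") {
      return parsed.href.startsWith(repoRoot) ? parsed.href : null;
    }
    if (parsed.hostname !== "github.com") return null; // external site
    const [owner, repo, kind, ref, ...rest] = parsed.pathname.split("/").filter(Boolean);
    if (owner !== ctx.owner || repo !== ctx.repo) return null; // other repository
    // Only file views are fetchable; /tree/ points at a directory.
    // Assumption: the ref has no slash in it (branch names like "feature/x" break this).
    if (kind !== "blob" || !ref || rest.length === 0) return null;
    return `${repoRoot}${ref}/${rest.join("/")}`;
  }

  // Relative link: a leading "/" is resolved from the repository root,
  // anything else from the directory of the current document.
  const rootRelative = target.startsWith("/");
  const resolved = new URL(
    rootRelative ? target.slice(1) : target,
    rootRelative ? refRoot : refRoot + ctx.currentPath,
  ).href;
  // Reject paths that climb out of the repository with "../".
  return resolved.startsWith(refRoot) ? resolved : null;
}

const ctx: RepoContext = { owner: "acme", repo: "widgets", ref: "main", currentPath: "CONTRIBUTING.md" };
console.log(resolveGitHubUrl("README.md", ctx));
console.log(resolveGitHubUrl("https://github.com/acme/widgets/tree/main/docs", ctx));
```

Returning `null` instead of throwing keeps the caller simple: unresolvable links are skipped and the remaining documents can still be aggregated.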

Recommended End-to-End Test Plan:

  1. Set up a test repository with CONTRIBUTING.md that links to README.md and CODE_OF_CONDUCT.md
  2. Create a PR with guideline violations in that repository
  3. Trigger webhook and verify bot comment references content from all linked documents
  4. Monitor response times and error logs for any issues with link fetching
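
For the graceful-degradation behavior that the checklist and step 4 refer to, a fetch helper might look roughly like the sketch below. The `fetchUrlContent` name comes from the diagram; the injected `fetchImpl` parameter, the `FetchLike` type, the 5-second timeout, and the log messages are assumptions added here so the example can run without network access.

```typescript
// Minimal structural type so a fake can stand in for the global fetch (assumption).
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Resolves to the body text, or null on any HTTP or network failure.
async function fetchUrlContent(
  url: string,
  fetchImpl: FetchLike,
  timeoutMs: number = 5000,
): Promise<string | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) {
      // Non-2xx (for example 404): log and degrade instead of throwing.
      console.warn(`Skipping ${url}: HTTP ${response.status}`);
      return null;
    }
    return await response.text();
  } catch (error) {
    // Network failure or timeout abort.
    console.warn(`Skipping ${url}: ${String(error)}`);
    return null;
  } finally {
    clearTimeout(timer);
  }
}

// Example with an in-memory fake that always answers 404.
const alwaysMissing: FetchLike = async () => ({ ok: false, status: 404, text: async () => "" });
fetchUrlContent("https://raw.githubusercontent.com/acme/widgets/main/MISSING.md", alwaysMissing)
  .then((body) => console.log(body));
```

In a real handler the global `fetch` could be passed as `fetchImpl`; the warnings give the error logs something to monitor during the end-to-end run.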

Diagram

%%{ init : { "theme" : "default" }}%%
flowchart TD
    PR["GitHub PR Event"] --> Webhook["app/api/webhook/route.ts"]
    Webhook --> LoadGuidelines["loadContributingGuidelines()"]
    LoadGuidelines --> ExtractLinks["extractMarkdownLinks()"]
    ExtractLinks --> ResolveURL["resolveGitHubUrl()"]
    ResolveURL --> FetchContent["fetchUrlContent()"]
    FetchContent --> ProcessLinked["processLinkedContent()"]
    ProcessLinked --> Cache["Enhanced Caching"]
    Cache --> AIPrompt["Updated AI System Prompt"]
    AIPrompt --> Response["Bot Response"]
    
    Tests["tests/webhook.test.ts"] --> LoadGuidelines
    
    LoadGuidelines:::major-edit
    ExtractLinks:::major-edit
    ResolveURL:::major-edit
    FetchContent:::major-edit
    ProcessLinked:::major-edit
    Tests:::major-edit
    AIPrompt:::minor-edit
    Cache:::minor-edit
    Webhook:::context
    PR:::context
    Response:::context

    subgraph Legend
        L1["Major Edit"]:::major-edit
        L2["Minor Edit"]:::minor-edit  
        L3["Context/No Edit"]:::context
    end


    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF
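
The flow in the diagram can be sketched as a depth-limited traversal with a visited set. This is an illustrative outline only: `collectLinkedDocuments`, the injected `fetchText` and `resolveLink` callbacks, and the convention that the root document is depth 0 with links followed through depth 3 are all assumptions, not the code merged in route.ts.

```typescript
// Injected dependencies (hypothetical) so the traversal can be exercised in memory.
type FetchText = (url: string) => Promise<string | null>;
type LinkResolver = (link: string, fromUrl: string) => string | null;

interface CollectedDocument {
  url: string;
  depth: number;
  content: string;
}

const MAX_DEPTH = 3; // "max 3 levels" from the PR; root counted as depth 0 here
const LINK_PATTERN = /\[([^\]]+)\]\(([^)]+)\)/g;

async function collectLinkedDocuments(
  startUrl: string,
  fetchText: FetchText,
  resolveLink: LinkResolver,
  maxDepth: number = MAX_DEPTH,
): Promise<CollectedDocument[]> {
  const visited = new Set<string>();
  const collected: CollectedDocument[] = [];

  async function visit(url: string, depth: number): Promise<void> {
    // Depth limit bounds the recursion; the visited set breaks cycles.
    if (depth > maxDepth || visited.has(url)) return;
    visited.add(url);

    const content = await fetchText(url);
    if (content === null) return; // failed fetch: skip it, keep the rest

    collected.push({ url, depth, content });

    // Local RegExp per document so recursive calls never share lastIndex.
    const pattern = new RegExp(LINK_PATTERN.source, "g");
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(content)) !== null) {
      const next = resolveLink(match[2], url);
      if (next !== null) await visit(next, depth + 1);
    }
  }

  await visit(startUrl, 0);
  return collected;
}

// In-memory example with a circular reference between two documents.
const pages: Record<string, string> = {
  "CONTRIBUTING.md": "Read the [README](README.md).",
  "README.md": "Back to [contributing](CONTRIBUTING.md).",
};
collectLinkedDocuments(
  "CONTRIBUTING.md",
  async (url) => (url in pages ? pages[url] : null),
  (link) => link,
).then((docs) => console.log(docs.map((d) => `${d.depth}: ${d.url}`)));
```

Links are followed sequentially here to keep the ordering deterministic; fetching siblings in parallel would be faster but makes the visited-set bookkeeping slightly more delicate.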

Notes

  • Successfully tested with real jacquez repository data (CONTRIBUTING.md → README.md link following)
  • All existing tests pass (38/38) and linting passes with no errors
  • Maintains backward compatibility with existing loadContributingGuidelines calls
  • Implementation follows existing code patterns and error handling conventions

Link to Devin session: https://app.devin.ai/sessions/ec0555de1c364310b2477f92be708c3f
Requested by: Sahil Lavingia (@slavingia)

- Add recursive markdown link parsing and following
- Support relative and absolute GitHub URLs with depth limiting (max 3 levels)
- Implement URL resolution to raw.githubusercontent.com for direct content fetching
- Extend caching system with composite keys for aggregated content
- Update AI prompt to handle content from multiple linked documents
- Add comprehensive test coverage for link extraction and URL resolution
- Maintain backward compatibility with existing function signature
- Add graceful error handling and detailed logging for debugging
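
One way the composite cache keys mentioned above could be structured is sketched below. The key format, the time-to-live expiry, and the `compositeCacheKey` and `createTtlCache` names are hypothetical; the PR only states that keys are composite and include depth.

```typescript
interface CacheEntry {
  value: string;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical key format: repository identity plus the depth used for aggregation,
// so content aggregated at one depth is never served for another.
function compositeCacheKey(owner: string, repo: string, maxDepth: number): string {
  return `${owner}/${repo}:depth=${maxDepth}`;
}

// Small TTL cache; the clock is injectable so expiry can be tested deterministically.
function createTtlCache(ttlMs: number, now: () => number = () => Date.now()) {
  const entries = new Map<string, CacheEntry>();
  return {
    get(key: string): string | undefined {
      const entry = entries.get(key);
      if (entry === undefined) return undefined;
      if (entry.expiresAt <= now()) {
        entries.delete(key); // expired: drop it so stale content is not returned
        return undefined;
      }
      return entry.value;
    },
    set(key: string, value: string): void {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}

const cache = createTtlCache(5 * 60 * 1000);
cache.set(compositeCacheKey("acme", "widgets", 3), "aggregated guidelines text");
console.log(cache.get(compositeCacheKey("acme", "widgets", 3)));
```

Including the depth in the key trades some duplicate storage for correctness: a shallow aggregation cannot be mistaken for a full one.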

The enhanced loadContributingGuidelines function now:
- Parses markdown links using regex /\[([^\]]+)\]\(([^)]+)\)/g
- Follows GitHub links within the same repository
- Aggregates content from linked documents with clear separation
- Prevents infinite loops with depth limiting and visited URL tracking
- Caches aggregated results for performance optimization
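
The "clear separation" of aggregated content could be as simple as a labeled header per document, as in this sketch. The `FetchedDocument` type, the `aggregateDocuments` name, and the `=== Document: ... ===` separator are assumptions chosen for illustration; the actual separator used in route.ts is not shown in this PR description.

```typescript
// Hypothetical shape for a fetched document and its origin.
interface FetchedDocument {
  source: string; // file path or URL the content came from
  content: string;
}

// Joins documents into one string, each preceded by a header naming its source.
function aggregateDocuments(documents: FetchedDocument[]): string {
  return documents
    .map((doc) => `=== Document: ${doc.source} ===\n${doc.content.trim()}`)
    .join("\n\n");
}

console.log(
  aggregateDocuments([
    { source: "CONTRIBUTING.md", content: "Be kind.\n" },
    { source: "README.md", content: "Setup steps." },
  ]),
);
```

Naming the source in each header lets the AI prompt attribute a guideline to the document it came from.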

Tested with real data from jacquez repository CONTRIBUTING.md -> README.md link following.

Co-Authored-By: Sahil Lavingia <sahil@gumroad.com>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@vercel

vercel bot commented Aug 20, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| jacquez | Ready | Preview | Comment | Aug 20, 2025 7:19pm |

@slavingia slavingia merged commit 5a88098 into main Aug 20, 2025
4 checks passed
@slavingia slavingia deleted the devin/1755717081-agentic-link-following branch August 20, 2025 19:20