Example: https://docs.moonbeam.network/ai-resources/ai-resources/
The Transformation Process
- Content Extraction
Scrape or export content from existing documentation websites
Extract the raw text, removing HTML formatting, navigation elements, and other UI components
Preserve the core informational content while stripping away the presentation layer
- Metadata Addition
Add structured headers with title and description
Include source URLs for reference
Add categorization tags (for example "Basics", "Ethereum Toolkit", "Precompiles")
Create topic classifications to help with search and organization
- Content Restructuring
Convert to markdown or plain text format
Standardize formatting across all documents
Ensure consistent heading structures
Remove redundant navigation text and boilerplate
- AI Optimization
Structure content for better LLM consumption
Add contextual tags and categories
Create clear document boundaries
Optimize for semantic search and retrieval
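
The four steps above can be sketched end to end with the Python standard library alone. This is a minimal illustration under stated assumptions, not the pipeline any particular site uses: the list of tags treated as UI chrome, the `---` header layout, the field names (`title`, `description`, `url`, `categories`), and the `=====` boundary marker are all hypothetical choices made for the example.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as UI chrome rather than documentation.
# This list is an assumption; a real site may need a different one.
SKIP_TAGS = {"nav", "header", "footer", "script", "style", "aside"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCK_TAGS = {"p", "li", "h1", "h2", "h3"}


class DocTextExtractor(HTMLParser):
    """Collects text blocks from HTML, skipping navigation and other chrome."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.blocks = []     # finished text blocks, in document order
        self.prefix = ""     # markdown heading prefix for the current block
        self.buffer = []     # text fragments of the current block

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS and not self.skip_depth:
            self.flush()
            self.prefix = HEADINGS.get(tag, "")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCK_TAGS and not self.skip_depth:
            self.flush()

    def handle_data(self, data):
        if not self.skip_depth:
            self.buffer.append(data)

    def flush(self):
        # Join inline fragments and collapse runs of whitespace.
        text = " ".join("".join(self.buffer).split())
        if text:
            self.blocks.append(self.prefix + text)
        self.buffer, self.prefix = [], ""


def html_to_text(html):
    """Content extraction and restructuring: HTML in, markdown-like text out."""
    parser = DocTextExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text
    return "\n\n".join(parser.blocks)


def build_document(title, description, source_url, categories, body):
    """Metadata addition: prepend a structured header (hypothetical layout)."""
    header = [
        "---",
        f"title: {title}",
        f"description: {description}",
        f"url: {source_url}",
        f"categories: {', '.join(categories)}",
        "---",
    ]
    return "\n".join(header) + "\n\n" + body + "\n"


# Hypothetical marker that gives each document a clear boundary in a bundle.
DOC_SEPARATOR = "\n=====\n\n"


def bundle(documents):
    """Join processed documents into a single file for LLM consumption."""
    return DOC_SEPARATOR.join(documents)
```

A usage example with made-up page content and a placeholder URL:

```python
page = (
    "<html><body><nav><a href='/'>Home</a></nav>"
    "<h1>Precompiles</h1><p>Precompiles are <code>built-in</code> contracts.</p>"
    "<footer>Edit this page</footer></body></html>"
)
doc = build_document(
    title="Precompiles",
    description="Overview of precompiled contracts",
    source_url="https://docs.example.com/builders/precompiles/",
    categories=["Basics", "Precompiles"],
    body=html_to_text(page),
)
print(bundle([doc]))
```

Keeping the header as plain `key: value` lines means the same file can be read by a person, split on the boundary marker by a script, or filtered by category before being handed to a retrieval index.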
Tools and Approaches
Automated Solutions:
Web scraping tools (Scrapy, BeautifulSoup)
Documentation crawlers
CMS export functions
API-based content extraction
Manual Curation:
Content review and cleanup
Category assignment
Quality assurance
Link validation
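
Part of the link validation step can be automated offline. The sketch below is one possible approach, assuming the processed documents are markdown and that the set of valid page paths is already known from the crawl; the function name, the `known_paths` set, and the `docs.example.com` host are hypothetical. It only checks root-relative links and absolute links into the docs site, and it does not resolve relative links such as `../page/` or contact external sites.

```python
import re
from urllib.parse import urlparse

# Matches inline markdown links of the form [text](target).
MD_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def find_broken_links(markdown, known_paths, site_host):
    """Return link targets that point into the docs site but match no known page."""
    broken = []
    for _text, target in MD_LINK.findall(markdown):
        parsed = urlparse(target)
        if parsed.scheme in ("http", "https"):
            if parsed.netloc != site_host:
                continue  # external link: out of scope for this offline check
        elif parsed.scheme or not parsed.path.startswith("/"):
            continue  # mailto:, "#anchor", and relative links are skipped
        # Normalize so "/a/b/" and "/a/b" compare equal.
        path = parsed.path.rstrip("/") or "/"
        if path not in known_paths:
            broken.append(target)
    return broken


sample = (
    "See [precompiles](/builders/precompiles/), [old](/builders/removed/), "
    "[ext](https://example.org/x), [top](#top), "
    "[abs](https://docs.example.com/missing)."
)
known = {"/builders/precompiles"}
print(find_broken_links(sample, known, "docs.example.com"))
# prints ['/builders/removed/', 'https://docs.example.com/missing']
```

Anything this check reports still needs a manual look, since a missing path can mean a renamed page rather than a dead link.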