A curated & production-ready collection of tools, frameworks, APIs, proxies, and n8n workflows for B2B web scraping, lead enrichment, data verification, and CRM automation. Built and maintained by Lead Orchestra.

Lead-Orchestra/awesome-b2b-leads

Awesome Web Scraping for B2B Leads

Curated by Lead Orchestra – https://leadorchestra.com


A fully curated, SEO-optimized list of the best tools, frameworks, APIs, workflows, and services for
B2B lead scraping, enrichment, automation, and CRM-ready data pipelines, maintained by Lead Orchestra.


📌 What Is Lead Orchestra?

Lead Orchestra is a complete B2B lead scraping & automation platform that orchestrates:

  • Web scraping at scale
  • Undetectable browser automation
  • Data enrichment (email, company, social, intent)
  • Lead verification & deduplication
  • n8n / Make.com automation workflows
  • CRM export (HubSpot, Salesforce, Pipedrive, GoHighLevel, Deal Scale)

Learn more → https://leadorchestra.com

This GitHub repository supports the project by cataloguing the best-in-class tools used in modern lead generation pipelines.


🧭 Table of Contents


🕸️ Web Scraping Frameworks

High-performance, scalable frameworks for scraping B2B data:

Python

JavaScript / TypeScript

No-Code Scraping Tools


🧭 Headless Browser & Automation Tools

Use these for undetectable scraping, dynamic content, infinite scroll, and JS-heavy websites.

  • Playwright – https://playwright.dev
    Multi-browser (Chromium, WebKit, Firefox) automation, best anti-bot resistance.
  • Puppeteer – https://pptr.dev
    Chrome-only automation for scraping & testing.
  • Selenium – https://www.selenium.dev
    Classic browser automation, supports multiple languages.
  • Apify Actors – https://apify.com
    Cloud headless browser environment with rotation, retries, storage.

🧬 B2B Lead Enrichment APIs

Turn raw scraped data into sales-ready enriched profiles.

Top Enrichment Providers


📫 Email Verification Services

Ensure deliverability & reduce bounce rates.
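
Before sending addresses to a paid verification service, it can help to drop values that are obviously not email addresses. The following is a minimal sketch of such a local pre-filter, not a deliverability check: the pattern is deliberately loose, and the function names `looks_like_email` and `split_for_verification` are hypothetical, not part of any listed service.

```python
import re

# Deliberately loose shape check: exactly one "@", no whitespace,
# and at least one dot in the domain part. This does NOT prove the
# mailbox exists; it only filters out obvious junk.
_EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value: str) -> bool:
    """Return True if the trimmed value has the basic shape of an email."""
    return bool(_EMAIL_SHAPE.match(value.strip()))


def split_for_verification(values):
    """Split raw values into (candidates, rejected).

    Candidates are trimmed and lowercased so they are ready to be sent
    to a real verification service; rejected values are kept as-is for review.
    """
    candidates, rejected = [], []
    for value in values:
        if looks_like_email(value):
            candidates.append(value.strip().lower())
        else:
            rejected.append(value)
    return candidates, rejected
```

Only the candidates would then go to a verification provider, which is what actually determines deliverability.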


🌐 Proxy & Anti-Bot Providers

Necessary for large-scale scraping without blocks.
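
To illustrate what proxy rotation means in practice, here is a minimal sketch of a round-robin pool that stops handing out proxies reported as blocked. The class name `ProxyPool` and the proxy URLs in the usage note are hypothetical; commercial providers typically handle rotation on their side, so treat this as an illustration of the idea rather than any provider's API.

```python
class ProxyPool:
    """Round-robin proxy pool that drops proxies reported as blocked."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self._position = 0

    def next_proxy(self) -> str:
        """Return the next proxy in rotation, or raise if none remain."""
        if not self._proxies:
            raise RuntimeError("no healthy proxies left")
        proxy = self._proxies[self._position % len(self._proxies)]
        self._position += 1
        return proxy

    def mark_blocked(self, proxy: str) -> None:
        """Remove a proxy from rotation after a block or ban response."""
        if proxy in self._proxies:
            self._proxies.remove(proxy)
```

A scraper would call `next_proxy()` before each request (for example with placeholder entries such as `http://proxy-a.example:8080`) and call `mark_blocked()` when a target responds with a block page or a repeated 403/429.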


🔄 n8n Workflows & Automation Nodes

Ready-to-use n8n workflow templates for B2B lead automation, sourced from awesome-n8n-templates.

📧 Email & Lead Processing

Gmail & Email Automation

  • Auto-label incoming Gmail messages with AI – Automatically labels incoming Gmail messages using AI. Retrieves message content, suggests labels like Partnership or Inquiry, and assigns them for better organization. Template
  • Compose reply draft in Gmail with OpenAI Assistant – Generates draft replies in Gmail using OpenAI. Triggers on new emails, extracts content, and creates a suggested reply draft. Template
  • Analyze & Sort Suspicious Email Contents with ChatGPT – Analyzes suspicious emails using ChatGPT, classifies them, and can generate screenshots for review. Template
  • A Very Simple "Human in the Loop" Email Response System Using AI and IMAP – Implements a workflow for human-in-the-loop email responses. Uses IMAP to fetch emails, summarizes content with AI, and drafts professional replies for review. Template
  • Auto Categorise Outlook Emails with AI – Automatically categorizes Outlook emails using AI models. Moves messages to folders and assigns categories based on content. Template

📊 Data Management & Enrichment

Google Drive & Google Sheets

  • Qualify new leads in Google Sheets via OpenAI's GPT-4 – Uses OpenAI's GPT-4 to analyze and qualify new leads entered into a Google Sheet, helping sales teams prioritize their outreach. Template
  • Chat with a Google Sheet using AI – Allows users to interact with and query data within a Google Sheet using natural language via an AI model. Template
  • Summarize Google Sheets form feedback via OpenAI's GPT-4 – Summarizes feedback collected through Google Forms and stored in Google Sheets using OpenAI's GPT-4. Template
  • Summarize the New Documents from Google Drive and Save Summary in Google Sheet – Monitors Google Drive for new documents, summarizes their content using AI, and saves these summaries into a Google Sheet. Template

Database & Storage

  • Chat with Postgresql Database – Enables an AI assistant to chat with a PostgreSQL database, allowing users to query and retrieve data using natural language. Template
  • Generate SQL queries from schema only - AI-powered – Uses AI to generate SQL queries based on a given database schema. Template
  • Talk to your SQLite database with a LangChain AI Agent – Allows users to interact with a SQLite database using a LangChain AI agent. Template

🤖 AI-Powered Lead Processing

OpenAI & LLMs

  • AI-Driven Lead Management and Inquiry Automation with ERPNext & n8n – Lead management automation workflow. Template
  • AI Data Extraction with Dynamic Prompts and Airtable – AI-driven data extraction with Airtable integration. Template
  • AI-Powered Email Automation for Business: Summarize & Respond with RAG – Email automation with summarization and response capabilities. Template
  • AI agent that can scrape webpages – AI agent for web scraping tasks. Template

Airtable Integration

  • AI Agent to chat with Airtable and analyze data – Creates an AI agent that can chat with Airtable, analyze data, and perform queries based on user requests. Template
  • Handling Job Application Submissions with AI and n8n Forms – Automates the handling of job application submissions by extracting information from resumes (PDFs) using AI. Template

📝 Forms & Lead Capture

  • Conversational Interviews with AI Agents and n8n Forms – Implements AI-powered conversational interviews using n8n Forms for interactive data collection. Template
  • Email Subscription Service with n8n Forms, Airtable and AI – Manages email subscriptions with n8n Forms, stores data in Airtable, and uses AI for processing. Template
  • Qualifying Appointment Requests with AI & n8n Forms – Uses AI to qualify and process appointment requests submitted through n8n Forms. Template

💬 Communication & Notifications

Slack Integration

  • AI-Powered Information Monitoring with OpenAI, Google Sheets, Jina AI and Slack – Monitors RSS feeds, summarizes articles with OpenAI and Jina AI, classifies them, and sends formatted notifications to Slack. Template
  • Customer Support Channel and Ticketing System with Slack and Linear – Automates customer support by querying Slack for messages with a ticket emoji, deciding if a new Linear ticket is needed. Template
  • Enrich Pipedrive's Organization Data with OpenAI GPT-4o & Notify it in Slack – Enriches Pipedrive organization data by scraping website content, using OpenAI GPT-4o to generate a summary, and notifying a Slack channel. Template

🔍 Research & Data Analysis

AI Research & RAG

  • Ultimate Scraper Workflow for n8n – A comprehensive scraping workflow for n8n to extract data from various sources. Template
  • Scrape and summarize webpages with AI – Scrapes and summarizes content from webpages using AI. Template
  • Host Your Own AI Deep Research Agent with n8n, Apify and OpenAI o3 – Sets up a self-hosted AI deep research agent using n8n, Apify, and OpenAI. Template
  • Automate Competitor Research with Exa.ai, Notion and AI Agents – Builds a competitor research agent using Exa.ai to find similar companies. AI agents then scour the internet for company overviews, product offerings, and customer reviews. Template

🔌 Popular n8n Community Nodes

Essential community nodes for B2B lead automation, ranked by monthly downloads.

Browser Automation & Web Scraping

Communication & Messaging

AI, LLM & Voice

API & Cloud Integrations

Data Processing & Utilities

File & PDF Manipulation

📚 Resources


🏗️ Example B2B Lead Pipeline

A real-world, production-ready pipeline:


1. Scrape → Playwright / Crawlee
2. Store Raw Data → n8n / DB / Sheets
3. Enrich Lead → Clearbit, Apollo, Clay
4. Verify Email → NeverBounce
5. Clean & Deduplicate → CRM Query / Hash Matching
6. Export to CRM → HubSpot / Salesforce / Pipedrive
7. Trigger Outreach → Deal Scale / GHL / Apollo

This is the exact architecture Lead Orchestra uses for daily B2B lead generation.
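
The hash-matching part of step 5 can be sketched as follows. This is an illustrative sketch under stated assumptions, not Lead Orchestra's implementation: the `Lead` fields, the choice of the normalized email as the dedup key, and the `known_keys` set (standing in for keys fetched from a CRM query) are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Lead:
    """A scraped lead; the field names are assumptions for illustration."""
    name: str
    email: str
    company: str


def lead_key(lead: Lead) -> str:
    """Hash the normalized email so duplicates match despite case or spacing."""
    normalized = lead.email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(
    leads: Iterable[Lead],
    known_keys: Optional[Iterable[str]] = None,
) -> List[Lead]:
    """Keep the first lead per key, skipping keys already known to the CRM."""
    seen = set(known_keys or ())
    unique = []
    for lead in leads:
        key = lead_key(lead)
        if key not in seen:
            seen.add(key)
            unique.append(lead)
    return unique
```

Storing only the hash rather than the raw email keeps the lookup set compact and makes it easy to compare against keys exported from a CRM. Keying on email alone is a simplification; a real pipeline might also match on company domain or a profile URL.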


🤝 Contributing

We welcome contributions:

  1. Fork this repo
  2. Add your tool/resource
  3. Submit a PR
  4. Follow formatting, keep quality high

See CONTRIBUTING.md for details.


📄 License

MIT License – free to use and distribute.


🔍 SEO Keywords & Topics

This README is intentionally optimized for ranking in searches related to:

  • b2b lead scraping
  • web scraping tools
  • lead enrichment APIs
  • browser automation scraping
  • scraping with Playwright
  • n8n lead workflows
  • CRM lead automation
  • email finding & verification
  • proxy rotation & anti-bot systems
  • sales prospecting automation

🔗 Links & Resources
