
feat(source-intercom): add configurable num_workers for sync throughput#74067

Open
Lucas Leadbetter (lleadbet) wants to merge 6 commits into master from devin/1772125053-intercom-concurrency-level


@lleadbet Lucas Leadbetter (lleadbet) commented Feb 26, 2026

What

Adds a user-configurable num_workers option to the Intercom source connector that controls the number of concurrent worker threads. Without it, the CDK defaults to 2 worker threads, which severely underutilizes the Intercom API rate limit (10k–50k requests/min depending on workspace) and causes syncs of large datasets to take weeks. The conversation_parts stream is affected most because it uses an N+1 substream pattern (one API call per conversation).

Motivated by a customer sync taking >2 weeks despite having Intercom bump their rate limit to 50k/min.
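
For reviewers who want a feel for the numbers, here is a rough back-of-the-envelope model of the N+1 pattern. It is a sketch under stated assumptions, not a measurement: `estimate_sync_hours` is a hypothetical helper, each worker is assumed to issue requests serially with a fixed latency, and retries and backoff are ignored.

```python
def estimate_sync_hours(
    num_conversations: int,
    seconds_per_request: float,
    num_workers: int,
    rate_limit_per_min: int,
) -> float:
    """Hypothetical estimate of wall-clock hours for an N+1 substream sync.

    Assumes one API call per conversation and that each worker sends its
    requests one after another. Ignores retries, backoff, and pagination.
    """
    # Throughput the workers can generate on their own.
    worker_rps = num_workers / seconds_per_request
    # Throughput the API rate limit allows.
    limit_rps = rate_limit_per_min / 60
    # The sync runs at whichever ceiling is lower.
    effective_rps = min(worker_rps, limit_rps)
    return num_conversations / effective_rps / 3600


if __name__ == "__main__":
    # Illustrative inputs only: 1M conversations, 0.5 s per request, 10k/min limit.
    for workers in (2, 25):
        hours = estimate_sync_hours(1_000_000, 0.5, workers, 10_000)
        print(f"{workers} workers: {hours:.1f} h")
```

With these made-up inputs, 2 workers come out to roughly 69 hours and 25 workers to roughly 5.6 hours; in both cases the worker count, not the rate limit, is the bottleneck.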

How

  1. Added a top-level concurrency_level block to manifest.yaml that reads from config:
    • default_concurrency: "{{ config.get('num_workers', 2) }}" — defaults to 2 (preserving existing behavior)
    • max_concurrency: 25
  2. Added a num_workers optional integer field to the connector spec (default: 2, min: 1, max: 25, airbyte_hidden: true)
  3. Version bumped to 0.13.16-rc.3
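
As a rough illustration of the behavior the two manifest settings are meant to produce, the sketch below resolves a worker count from a config dict. `resolve_num_workers` is a hypothetical helper, not a CDK function. It assumes the interpolated value may arrive as a string and that values above max_concurrency are clamped; the CDK may handle either case differently.

```python
from typing import Any, Mapping

DEFAULT_NUM_WORKERS = 2  # default from the spec in this PR
MAX_CONCURRENCY = 25     # max_concurrency from the manifest in this PR


def resolve_num_workers(config: Mapping[str, Any]) -> int:
    """Hypothetical helper: return the worker count implied by a config."""
    raw = config.get("num_workers", DEFAULT_NUM_WORKERS)
    # A rendered template is text, so accept "10" as well as 10.
    try:
        value = int(raw)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"num_workers must be an integer, got {raw!r}") from exc
    if value < 1:
        raise ValueError("num_workers must be at least 1")
    # Assumption: values above the cap are clamped rather than rejected.
    return min(value, MAX_CONCURRENCY)


if __name__ == "__main__":
    print(resolve_num_workers({}))                     # falls back to the default
    print(resolve_num_workers({"num_workers": "10"}))  # string input is coerced
    print(resolve_num_workers({"num_workers": 100}))   # capped at MAX_CONCURRENCY
```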

No Python code changes. The existing custom ErrorHandlerWithRateLimiter in components.py already provides proactive per-request backoff based on X-RateLimit-Remaining headers for most streams.

Updates since last revision

  • Changed from a hardcoded default_concurrency: 10 to a config-driven num_workers field that defaults to 2. This preserves existing behavior for all current users while allowing users with higher rate limits to opt in to higher concurrency.
  • Marked num_workers as airbyte_hidden: true and moved it to the last position in the spec, so it does not appear in the standard connector UI.
  • Fixed version to 0.13.16-rc.3 (next RC after master's 0.13.16-rc.2) to comply with progressive rollout versioning requirements.

Review guide

  1. manifest.yaml — adds the concurrency_level block between streams and spec, and adds num_workers (hidden) to the spec properties as the last field
  2. metadata.yaml — version bump to 0.13.16-rc.3
  3. docs/integrations/sources/intercom.md — changelog entry
  4. Worth reviewing alongside components.py (unchanged) — the custom IntercomRateLimiter sleeps per-thread based on shared rate limit headers. With higher concurrency, multiple threads may read stale remaining-capacity values and collectively overshoot, causing more 429s. The CDK's default retry/backoff should handle this, but it's worth considering whether the proactive rate limiter needs adjustment for higher concurrency.
  5. Note that conversation_parts and companies streams do not use the custom rate limiter — they use DefaultErrorHandler and rely solely on 429 retry/backoff, so they will be the most aggressive under higher concurrency.
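
To make the stale-read concern in item 4 concrete, here is a toy model of the worst case. It is not the logic in components.py: `requests_over_budget` is a hypothetical function that assumes every worker reads the same remaining-capacity value before any of them sends, and that each then sends exactly one request.

```python
def requests_over_budget(remaining: int, num_workers: int) -> int:
    """Toy model: requests sent beyond capacity from one stale snapshot.

    Every worker is assumed to see the same `remaining` value and to send
    one request if that value is positive.
    """
    if remaining <= 0:
        # All workers see no capacity and back off, so nothing is sent.
        return 0
    sent = num_workers
    # Anything sent beyond the true remaining capacity would be a 429.
    return max(0, sent - remaining)


if __name__ == "__main__":
    for workers in (2, 25):
        over = requests_over_budget(remaining=5, num_workers=workers)
        print(f"{workers} workers, 5 remaining: {over} over budget")
```

Under this model the default of 2 workers never overshoots a snapshot of 5 remaining requests, while 25 workers overshoot by 20, which is the kind of burst the 429 retry path would then have to absorb.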

Suggested human review checklist

  • Verify that config.get('num_workers', 2) Jinja2 interpolation works correctly with ConcurrencyLevel in the CDK (string-to-int coercion)
  • Consider whether max_concurrency: 25 is appropriate given Intercom's default 10k/min rate limit
  • Confirm that airbyte_hidden: true achieves the desired UX (hidden from standard UI, settable via API or internal tooling)

User Impact

  • No change for existing users — default of 2 preserves current behavior
  • Users with higher Intercom rate limits can set num_workers to 10–25 to significantly improve sync throughput
  • The setting is hidden from the standard UI; it must be configured via API or internal tooling
  • Syncs may see slightly more 429 retry cycles under high concurrency, but net throughput should still be much higher

Can this PR be safely reverted and rolled back?

  • YES 💚

Link to Devin run: https://app.devin.ai/sessions/60341a71384b4c03ab3c54a0f9ee4d2d
Requested by: Lucas Leadbetter (@lleadbet)



Co-Authored-By: lucas.leadbetter@airbyte.io <lucas.leadbetter@gmail.com>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.


PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • 🛠️ Quick Fixes
    • /format-fix - Fixes most formatting issues.
    • /bump-version - Bumps connector versions, scraping changelog description from the PR title.
  • ❇️ AI Testing and Review (internal link: AI-SDLC Docs):
    • /ai-prove-fix - Runs prerelease readiness checks, including testing against customer connections.
    • /ai-canary-prerelease - Rolls out prerelease to 5-10 connections for canary testing.
    • /ai-review - AI-powered PR review for connector safety and quality gates.
  • 🚀 Connector Releases:
    • /publish-connectors-prerelease - Publishes pre-release connector builds (tagged as {version}-preview.{git-sha}) for all modified connectors in the PR.
    • /bump-progressive-rollout-version - Bumps connector version with an RC suffix (2.16.10-rc.1) for progressive rollouts (enableProgressiveRollout: true).
      • Example: /bump-progressive-rollout-version changelog="Add new feature for progressive rollout"
  • ☕️ JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
    • /bump-bulk-cdk-version bump=patch changelog='foo' - Bump the Bulk CDK's version. bump can be major/minor/patch.
  • 🐍 Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.
  • ⚙️ Admin commands:
    • /force-merge reason="<REASON>" - Force merges the PR using admin privileges, bypassing CI checks. Requires a reason.
      Example: /force-merge reason="CI is flaky, tests pass locally"

devin-ai-integration bot and others added 2 commits February 26, 2026 17:01
Co-Authored-By: lucas.leadbetter@airbyte.io <lucas.leadbetter@gmail.com>
…orkers with default=2

Co-Authored-By: lucas.leadbetter@airbyte.io <lucas.leadbetter@gmail.com>
@devin-ai-integration devin-ai-integration bot changed the title from "feat(source-intercom): add concurrency_level to increase sync throughput" to "feat(source-intercom): add configurable num_workers for sync throughput" on Feb 26, 2026
…tion

Co-Authored-By: lucas.leadbetter@airbyte.io <lucas.leadbetter@gmail.com>
@lleadbet Lucas Leadbetter (lleadbet) marked this pull request as ready for review February 26, 2026 17:05
@github-actions

github-actions bot commented Feb 26, 2026

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-r78d35fjb-airbyte-growth.vercel.app

Built with commit 470c08d.
This pull request is being automatically deployed with vercel-action

@github-actions

github-actions bot commented Feb 26, 2026

source-intercom Connector Test Results

13 tests: 9 passed ✅, 4 skipped 💤, 0 failed ❌ (2 suites, 2 files, 12s ⏱️)

Results for commit 470c08d.

♻️ This comment has been updated with latest results.

Co-Authored-By: lucas.leadbetter@airbyte.io <lucas.leadbetter@gmail.com>
@lleadbet

Lucas Leadbetter (lleadbet) commented Feb 26, 2026

/ai-review

PR AI Review Started

Evaluating connector PR for safety and quality.
View workflow run
AI PR Review starting...

Reviewing PR for connector safety and quality.
View playbook

Devin AI session created successfully!

@devin-ai-integration

🔍 AI PR Review in progress — gathering evidence and evaluating gates. Will post the full report shortly.

Session: https://app.devin.ai/sessions/bbe9a0c047b64e6d8fca244e59c13df6

…sive rollout

Co-Authored-By: lucas.leadbetter@airbyte.io <lucas.leadbetter@gmail.com>
@devin-ai-integration

AI PR Review Report

Review Action: NO ACTION (INCONCLUSIVE)

| Gate | Status |
| --- | --- |
| PR Hygiene | PASS |
| Code Hygiene | PASS |
| Code Security | PASS |
| Per-Record Performance | PASS |
| Breaking Dependencies | PASS |
| Backwards Compatibility | PASS |
| Forwards Compatibility | PASS |
| Behavioral Changes | PASS |
| Out-of-Scope Changes | PASS |
| CI Checks | PASS |
| Live / E2E Tests | UNKNOWN |

📋 PR Details & Eligibility

Connector & PR Info

Connector(s): source-intercom
PR: #74067
HEAD SHA: eb609e7228a5c0e9440619dbbf2636fedc89fe38
Session: https://app.devin.ai/sessions/bbe9a0c047b64e6d8fca244e59c13df6

Auto-Approve Eligibility

Eligible: No
Category: not-eligible
Reason: PR adds functional changes (new concurrency_level block in manifest.yaml that controls CDK concurrency behavior). This goes beyond docs-only, additive-spec-only, or patch/minor dependency changes.

Review Action Details

NO ACTION (INCONCLUSIVE) — The Live / E2E Tests gate is UNKNOWN because validation evidence could not be verified (MCP prod database unavailable — Cloud SQL Proxy not running). All other gates pass. Human review is recommended to verify that the concurrency changes work as expected in a live environment. No PR review submitted.

Note: This bot can approve PRs when all gates pass AND the PR is eligible for auto-approval (docs-only, additive spec changes, patch/minor dependency bumps, or comment/whitespace-only changes). PRs with other types of changes require human review even if all gates pass.

🔍 Gate Evaluation Details

Gate-by-Gate Analysis

| Gate | Status | Enforced? | Details |
| --- | --- | --- | --- |
| PR Hygiene | PASS | Yes | PR description present with What/How/User Impact/Review Guide/Revert safety. Changelog updated. Version bumped. |
| Code Hygiene | PASS | WARNING | All 13 tests pass (9 passed, 4 skipped). Lint passes. No Python code changes; all changes are declarative YAML config. |
| Code Security | PASS | Yes | No auth/credential patterns, secrets, tokens, or security-sensitive keywords in functional diff lines. |
| Per-Record Performance | PASS | WARNING | No changes to record processing loops. No Python code modified. Concurrency config is a top-level CDK setting. |
| Breaking Dependencies | PASS | WARNING | Version bump 0.13.16-rc.2 to 0.13.17-rc.1. No dependency changes (no pyproject.toml modified). |
| Backwards Compatibility | PASS | Blocks Auto-Approve | num_workers field is optional with default: 2 (preserving existing behavior). Field is airbyte_hidden: true. concurrency_level uses config.get('num_workers', 2) fallback. No fields removed/renamed. No stream changes. |
| Forwards Compatibility | PASS | Blocks Auto-Approve | Rolling back removes the concurrency_level block and num_workers spec property. Since num_workers is optional with a default, rollback is safe. No state format changes. |
| Behavioral Changes | PASS | Blocks Auto-Approve | No rate_limit, retry, timeout, backoff, sleep, or error_handler keywords in functional diff lines. Default concurrency of 2 preserves existing behavior. Users must explicitly opt in to higher concurrency. |
| Out-of-Scope Changes | PASS | Skip | All changes within airbyte-integrations/connectors/source-intercom/ and docs/. |
| CI Checks | PASS | Yes | Core checks pass: Connector CI Checks Summary passed, Lint source-intercom Connector passed, Test source-intercom Connector passed (13 tests, 9 pass, 4 skipped). source-intercom Pre-Release Checks failed but is excluded from CI Checks gate per policy. |
| Live / E2E Tests | UNKNOWN | Yes | No validation labels found. MCP prod database verification unavailable (Cloud SQL Proxy not running). Cannot verify whether pre-release was published or live connections were tested. |

Pre-Release Checks Failure Details (informational, excluded from CI Checks)

The source-intercom Pre-Release Checks failed for two reasons:

  1. Changelog mismatch: Changelog entry is for version 0.13.17 but metadata has 0.13.17-rc.1
  2. Version increment check: Master version is 0.13.16-rc.2, current is 0.13.17-rc.1 — RC versions should only differ in the prerelease part

These are pre-release validation issues and do not affect the CI Checks gate, but may need to be resolved before publishing.

📚 Evidence Consulted

Evidence

  • Changed files: 3 files (+28, -1)
    • airbyte-integrations/connectors/source-intercom/manifest.yaml — adds concurrency_level block and num_workers spec property
    • airbyte-integrations/connectors/source-intercom/metadata.yaml — version bump 0.13.16-rc.2 to 0.13.17-rc.1
    • docs/integrations/sources/intercom.md — changelog entry
  • CI checks: 34 pass, 1 fail (pre-release, excluded), 2 pending (Vercel/docs, non-core), 7 skipped
  • PR labels: Not explicitly checked via API; no validation labels visible in PR conversation
  • PR description: Present and detailed
  • Existing bot reviews: None found for this HEAD SHA
❓ How to Respond

Providing Context or Justification

You can add explanations that the bot will see on the next review:

Option 1: PR Description (recommended)
Add a section to your PR description:

## AI PR Review Justification

### Live / E2E Tests
[Your explanation here — e.g., why live testing is not required for this change, or evidence that it was tested]

Option 2: PR Comment
Add a comment starting with:

AI PR Review Justification:
[Your explanation here]

After adding your response, re-run /ai-review to have the bot evaluate it.

Note: For the Live / E2E Tests gate, a sufficient justification explaining why validation is not required for this specific change can lead to PASS. For example: the change preserves default behavior (num_workers defaults to 2), the field is hidden (airbyte_hidden: true), and no existing connections will be affected without explicit user action.



@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

