Upgrade mongodb-source-v2 debezium version from 2.6.2 to 3.0.1 #68156
Richard Gao (RScicomp)
started this conversation in
Connector Ideas and Features
MongoDB Source v2: Buffer Lock Queue Issues and Debezium Upgrade Request
Hello! We're currently running Airbyte v1.7.1 with the latest mongodb-source-v2 connector (2.0.4) and are seeing intermittent buffer lock queue warnings in our pipelines. These warnings usually resolve themselves without causing serious issues, but they spike significantly when jobs fail. In addition, we see large spikes in CPU usage even when there is little data to process.
Issue Background:
Our logs show warnings like:
The job then fails with:
Some streams either received an INCOMPLETE stream status, or did not receive a stream status at all: io.airbyte.commons.exceptions.TransientErrorException: Some streams were unsuccessful due to a source error. See logs for details.
This leads us to believe that the buffer lock queue warnings may be contributing to pipeline failures.
Potential Solution:
We discovered that Debezium addressed these buffer lock queue issues in PR #5692, which was included in Debezium 2.7.1+.
Questions:
Is a Debezium upgrade to 3.0.1+ planned? We noticed that PR #61370 attempted to upgrade to 2.7.1 but was never merged. In addition, the PostgreSQL source connector already uses 3.0.1, so there is some precedent for this in other connectors:
Postgres:
airbyte/airbyte-integrations/connectors/source-postgres/build.gradle
Line 31 in 569a6f4
How would the resume token format change be handled? Starting with Debezium 2.7.1, resume tokens are stored in Base64 format (commit 774edca) instead of the previous hex string format. This seems like it could be a migration challenge for existing connectors.
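To make the migration concern concrete, here is a minimal sketch of converting a stored token between the two encodings. It assumes the old state holds the raw token bytes as a plain hex string and the new format is standard Base64 of those same bytes; we have not verified this against the Debezium commit, so the actual encoding may differ. The function names and the sample token are hypothetical and are not part of Airbyte or Debezium.

```python
import base64


def hex_token_to_base64(hex_token: str) -> str:
    """Re-encode a hex resume token as Base64 (assumed new format)."""
    # bytes.fromhex raises ValueError on odd length or non-hex characters,
    # which surfaces a corrupt or unexpected saved state early.
    raw = bytes.fromhex(hex_token)
    return base64.b64encode(raw).decode("ascii")


def base64_token_to_hex(b64_token: str) -> str:
    """Reverse conversion, e.g. for rolling back to the older format."""
    # validate=True rejects characters outside the Base64 alphabet.
    raw = base64.b64decode(b64_token, validate=True)
    # Upper-case output is an assumption about how the hex form is stored.
    return raw.hex().upper()


if __name__ == "__main__":
    old_state = "826300"  # hypothetical, shortened token for illustration
    new_state = hex_token_to_base64(old_state)
    # The round trip should return the original hex string.
    print(new_state, base64_token_to_hex(new_state) == old_state)
```

One caveat with this approach: a hex string can also be a syntactically valid Base64 string, so a migration probably cannot detect the format from the token value alone and would need an explicit version marker in the saved state.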
Impact:
The buffer lock issues appear to correlate with job failures and may be contributing to pipeline instability. An upgrade could potentially resolve these reliability concerns.
Would appreciate any insights on the upgrade timeline and approach for handling the resume token format migration!
Thanks!