
fix(source-snowflake): normalize numeric type aliases to prevent silent stream drops #74066

Draft

devin-ai-integration[bot] wants to merge 4 commits into master from devin/1772118359-fix-snowflake-number-type-mismatch

Conversation


devin-ai-integration bot commented Feb 26, 2026

What

Resolves https://github.com/airbytehq/oncall/issues/11452 and #74064:
Snowflake source syncs complete with 0 records emitted due to a type mismatch between catalog discovery and the read phase. The Snowflake JDBC driver returns different type name strings for the same NUMBER(38,0) column depending on which metadata API is used ("NUMBER" via DatabaseMetaData.getColumns() vs "INTEGER" via ResultSetMetaData). The CDK's StateManagerFactory.toStream() detects this as a FieldTypeMismatch and silently drops the entire stream.

How

In SnowflakeSourceOperations, all Snowflake numeric type aliases (NUMBER, DECIMAL, NUMERIC, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT) are now routed through a single numericType(scale) function that determines the JdbcFieldType based on the column's scale value rather than the type name string:

  • scale > 0 → BigDecimalFieldType (LeafAirbyteSchemaType.NUMBER)
  • scale == 0 or null → BigIntegerFieldType (LeafAirbyteSchemaType.INTEGER)

This ensures consistent type mapping regardless of which JDBC metadata API returns which type name string.
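The dispatch described above can be sketched as follows. This is a simplified illustration, not the connector's actual code: MappedType stands in for the CDK's BigIntegerFieldType / BigDecimalFieldType, and the function bodies are assumptions based on the PR description.

```kotlin
// Placeholder for the CDK's field-type hierarchy (illustrative only).
enum class MappedType { INTEGER, NUMBER }

// All nine Snowflake numeric aliases the PR routes through one code path.
val NUMERIC_ALIASES = setOf(
    "NUMBER", "DECIMAL", "NUMERIC", "INT", "INTEGER",
    "BIGINT", "SMALLINT", "TINYINT", "BYTEINT",
)

// Dispatch on the column's scale, never on the type name string.
fun numericType(scale: Int?): MappedType =
    if (scale != null && scale > 0) MappedType.NUMBER else MappedType.INTEGER

// Any numeric alias resolves via numericType; other types are out of scope here.
fun leafType(typeName: String, scale: Int?): MappedType? =
    if (typeName.uppercase() in NUMERIC_ALIASES) numericType(scale) else null

fun main() {
    // NUMBER(38,0): getColumns() reports "NUMBER", ResultSetMetaData reports
    // "INTEGER" — with scale-based dispatch both resolve identically, so
    // toStream() sees no FieldTypeMismatch.
    check(leafType("NUMBER", 0) == leafType("INTEGER", 0))
    check(NUMERIC_ALIASES.all { leafType(it, 0) == MappedType.INTEGER })
    check(NUMERIC_ALIASES.all { leafType(it, 2) == MappedType.NUMBER })
}
```

Because the type name string no longer influences the result for numeric aliases, the two metadata APIs can disagree on the name without producing divergent catalog and read-phase schemas.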

Version bumped to 1.0.9 with changelog entry.

Review guide

  1. SnowflakeSourceOperations.kt — Core fix. Review the leafType() and numericType() functions. All numeric type name aliases now share a single code path that dispatches on scale instead of type name string.
  2. SnowflakeSourceOperationsTest.kt — New unit tests. Parameterized tests verify all 9 numeric aliases produce the same type at scale=0, scale>0, and scale=null. Includes a test for the exact reported bug scenario (NUMBER vs INTEGER for the same column).
  3. metadata.yaml / snowflake.md — Version bump 1.0.8 → 1.0.9 and changelog entry.

Human review checklist

  • SMALLINT / TINYINT / BYTEINT accessor change: Previously mapped to ShortFieldType / ByteFieldType (using ShortAccessor / ByteAccessor), now mapped to BigIntegerFieldType (using BigDecimalAccessor). The Airbyte schema type is INTEGER in all cases, but the JDBC getter changed. Verify this doesn't cause issues reading small integer values.
  • No real Snowflake integration test: Fix was validated with unit tests only; no live Snowflake connection was used. Consider running regression tests against a real Snowflake instance before release.
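For the accessor-change item above, a JVM-only sanity sketch (no JDBC involved) showing that routing small-integer values through a BigDecimal conversion is lossless. This only demonstrates the arithmetic round trip; it does not verify the driver's getBigDecimal() behavior, which still warrants the live check noted above.

```kotlin
import java.math.BigDecimal
import java.math.BigInteger

fun main() {
    // Boundary values for BYTEINT/TINYINT (byte) and SMALLINT (short) ranges,
    // read back through a BigDecimal path as the new accessor would.
    val samples = listOf(
        Byte.MIN_VALUE.toLong(), Byte.MAX_VALUE.toLong(),
        Short.MIN_VALUE.toLong(), Short.MAX_VALUE.toLong(), 0L,
    )
    for (v in samples) {
        // toBigIntegerExact() throws if a fractional part is present, so a
        // zero-scale value either survives unchanged or fails loudly.
        val roundTripped: BigInteger = BigDecimal.valueOf(v).toBigIntegerExact()
        check(roundTripped == BigInteger.valueOf(v))
    }
}
```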

User Impact

Positive:

  • Users with NUMBER(38,0) columns (or other zero-scale numeric types) will no longer experience silent sync failures with 0 records emitted.
  • Syncs that were previously failing silently will now succeed and emit data.

Potential schema change:

  • Existing catalogs with NUMBER(p,0) columns discovered as NUMBER type will now discover as INTEGER type on the next schema refresh. This is a schema change but corrects the inconsistency that was causing failures. Users may need to refresh their schemas and potentially reset affected streams.

Can this PR be safely reverted and rolled back?

  • YES 💚

Reverting would restore the previous behavior where type mismatches cause silent stream drops, but no data corruption or state issues would occur.


Devin session
Requested by: bot_apk (apk@cognition.ai)

…nt stream drops

Snowflake JDBC driver can return different type name strings (e.g. NUMBER
vs INTEGER) for the same NUMBER(38,0) column depending on whether metadata
is queried via DatabaseMetaData.getColumns() or ResultSetMetaData. This
caused StateManagerFactory.toStream() to detect a FieldTypeMismatch and
silently drop entire streams, resulting in syncs completing with 0 records.

Fix: Normalize all Snowflake numeric type aliases (NUMBER, DECIMAL,
NUMERIC, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT) to a
consistent FieldType based on scale: scale=0 -> BigIntegerFieldType
(INTEGER), scale>0 -> BigDecimalFieldType (NUMBER).

Resolves: #74064
Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • 🛠️ Quick Fixes
    • /format-fix - Fixes most formatting issues.
    • /bump-version - Bumps connector versions, scraping changelog description from the PR title.
  • ❇️ AI Testing and Review (internal link: AI-SDLC Docs):
    • /ai-prove-fix - Runs prerelease readiness checks, including testing against customer connections.
    • /ai-canary-prerelease - Rolls out prerelease to 5-10 connections for canary testing.
    • /ai-review - AI-powered PR review for connector safety and quality gates.
  • 🚀 Connector Releases:
    • /publish-connectors-prerelease - Publishes pre-release connector builds (tagged as {version}-preview.{git-sha}) for all modified connectors in the PR.
    • /bump-progressive-rollout-version - Bumps connector version with an RC suffix (2.16.10-rc.1) for progressive rollouts (enableProgressiveRollout: true).
      • Example: /bump-progressive-rollout-version changelog="Add new feature for progressive rollout"
  • ☕️ JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
    • /bump-bulk-cdk-version bump=patch changelog='foo' - Bump the Bulk CDK's version. bump can be major/minor/patch.
  • 🐍 Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.
  • ⚙️ Admin commands:
    • /force-merge reason="<REASON>" - Force merges the PR using admin privileges, bypassing CI checks. Requires a reason.
      Example: /force-merge reason="CI is flaky, tests pass locally"


github-actions bot commented Feb 26, 2026

source-snowflake Connector Test Results

0 tests, 0 suites, 0 files — 0 ✅, 0 💤, 0 ❌ (0s ⏱️)

Results for commit 5cd356f.

♻️ This comment has been updated with latest results.


github-actions bot commented Feb 26, 2026

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-cctndrryv-airbyte-growth.vercel.app

Built with commit 5cd356f.
This pull request is being automatically deployed with vercel-action

@devin-ai-integration

↪️ Triggering /ai-prove-fix as the next triage step under the Hands-Free AI Triage Project.

Reason: Draft PR normalizes numeric type aliases to prevent silent stream drops when Snowflake JDBC returns inconsistent type names for NUMBER(38,0) columns.
https://github.com/airbytehq/oncall/issues/11452

Devin session


octavia-bot bot commented Feb 27, 2026

🔍 AI Prove Fix session starting... Running readiness checks and testing against customer connections. View playbook

Devin AI session created successfully!


devin-ai-integration bot commented Feb 27, 2026

Fix Validation Evidence

Outcome: Could not Run Tests

Evidence Summary

The connector could not be built from this PR branch. All three build attempts (2 pre-release publishes, 1 regression test) failed at compileKotlin with Unresolved reference errors for CDK classes (ConfigErrorException, StreamIdentifier, KotlinLogging, Jsons, OpaqueStateValue, etc.). The branch needs to be rebased on master to pick up CDK dependency changes.

Static analysis of the fix is positive — the type normalization logic is correct and directly addresses the root cause.

Next Steps
  1. Rebase this PR branch on master to resolve CDK dependency compilation errors.
  2. Re-run /ai-prove-fix after the branch builds successfully.
  3. Regression tests and live connection testing will proceed once the image can be built.

Connector & PR Details

Connector: source-snowflake
PR: #74066
Pre-release Version Tested: 1.0.9-preview.5cd356f (failed to build)
Detailed Results: https://github.com/airbytehq/oncall/issues/11452#issuecomment-3972826337

Evidence Plan

Proving Criteria

A sync on a connection with NUMBER(38,0) columns completes successfully and emits records. Regression tests show no behavioral changes for non-affected type mappings.

Disproving Criteria

Regression tests fail, or a live sync still emits 0 records after applying the fix, or new errors appear.

Cases Attempted

  • Regression tests (comparison mode) — Build failed (workflow)
  • Pre-release publish (attempt 1) — Build failed (workflow)
  • Pre-release publish (attempt 2) — Build failed (workflow)

Pre-flight Checks
  • Viability: Fix normalizes all 9 numeric type aliases through numericType(scale) based on scale rather than type name string
  • Safety: No malicious code or dangerous patterns
  • Breaking Change: No breaking changes detected
  • Reversibility: Patch bump 1.0.8 → 1.0.9, no state/config format changes
Detailed Evidence Log

Build Failure Root Cause:
All builds fail at :airbyte-integrations:connectors:source-snowflake:compileKotlin with Unresolved reference errors for CDK classes. The PR branch is stale relative to master and needs rebasing.

Failed build artifacts:

Note: Connection IDs and detailed logs are recorded in the linked private issue.


Devin session

@github-actions

Pre-release Connector Publish Started

Publishing pre-release build for connector source-snowflake.
PR: #74066

Pre-release versions will be tagged as {version}-preview.5cd356f
and are available for version pinning via the scoped_configuration API.

View workflow run

@github-actions

Pre-release Connector Publish Started

Publishing pre-release build for connector source-snowflake.
PR: #74066

Pre-release versions will be tagged as {version}-preview.5cd356f
and are available for version pinning via the scoped_configuration API.

View workflow run
