Add support for compressed file formats (gzip/zip) in source-gcs connector
#73292
Lucas Leadbetter (lleadbet) started this conversation in Connector Ideas and Features
Replies: 0 comments
Current Limitation
The source-gcs connector currently doesn't support gzip or zip compressed files as a selectable file format option. This means users cannot directly read compressed files from GCS buckets, even though compressing stored files is a common storage optimization. This capability already exists in the low-code Connector Builder as an option for responses, so it should be enabled for the source-gcs source as well.
Proposed Enhancement
Add support for compression formats (gzip, zip, bzip2) in the file format settings, with the ability to specify the nested file format within the compressed archive.
Example Use Case
Many organizations store CSV files compressed as .csv.gz or .csv.zip to reduce storage costs and transfer times. The connector should support configurations that pair a compression type with the format of the file inside it.
This would allow the connector to:
1. Decompress the gzip/zip file
2. Parse the inner content using the specified format parser (CSV, JSON, Avro, Parquet, etc.)
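The two steps above can be sketched roughly as follows. This is only an illustration using the Python standard library, not the connector's actual code: the function names and the config keys "compression" and "inner_format" are hypothetical, and only CSV and JSON Lines are parsed here (Avro and Parquet would need third-party libraries).

```python
import bz2
import csv
import gzip
import io
import json
import zipfile
from typing import Any, Dict, Iterator, List


def decompress(data: bytes, compression: str) -> Iterator[bytes]:
    """Yield the raw bytes of every file contained in the compressed payload."""
    if compression == "gzip":
        yield gzip.decompress(data)
    elif compression == "bzip2":
        yield bz2.decompress(data)
    elif compression == "zip":
        # A zip archive may hold several members; yield each one, skipping directories.
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if not info.is_dir():
                    yield archive.read(info)
    else:
        raise ValueError(f"unsupported compression: {compression}")


def parse(raw: bytes, inner_format: str) -> List[Dict[str, Any]]:
    """Parse decompressed bytes with the parser chosen for the inner format."""
    text = raw.decode("utf-8")
    if inner_format == "csv":
        return list(csv.DictReader(io.StringIO(text, newline="")))
    if inner_format == "jsonl":
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"unsupported inner format: {inner_format}")


def read_records(data: bytes, config: Dict[str, str]) -> List[Dict[str, Any]]:
    """Step 1: decompress. Step 2: parse each inner file. Config keys are hypothetical."""
    records: List[Dict[str, Any]] = []
    for raw in decompress(data, config["compression"]):
        records.extend(parse(raw, config["inner_format"]))
    return records


if __name__ == "__main__":
    # Simulate a .csv.gz object fetched from a bucket.
    payload = gzip.compress(b"id,name\n1,a\n2,b\n")
    print(read_records(payload, {"compression": "gzip", "inner_format": "csv"}))
```

Keeping decompression separate from parsing means any supported inner format could be combined with any supported compression type without extra code per combination.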