feat(connectors): Delta Lake Sink Connector#2889
feat(connectors): Delta Lake Sink Connector#2889kriti-sc wants to merge 29 commits intoapache:masterfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2889 +/- ##
============================================
+ Coverage 72.71% 72.85% +0.14%
Complexity 943 943
============================================
Files 1117 1121 +4
Lines 96285 97008 +723
Branches 73485 74225 +740
============================================
+ Hits 70014 70680 +666
- Misses 23725 23760 +35
- Partials 2546 2568 +22
🚀 New features to boost your workflow:
|
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
|
|
||
| pub(crate) fn build_storage_options( | ||
| config: &DeltaSinkConfig, | ||
| ) -> Result<HashMap<String, String>, Error> { |
There was a problem hiding this comment.
Maybe instead of returning a hash map, we could have the dedicated structs (per storage provider) with optional fields - and then either an enum could be returned or a single struct with all available storages as optional fields?
There was a problem hiding this comment.
Valid point, however this change would require the config to use nested structs. Because the mechanism to convert env vars into the config (function apply_plugin_config_env_overrides in runtime) does not support nested structs.
It would be very good to support nested structs in config, but that change doesn't belong in this PR. I propose adding a TODO and following up with a PR that adds support for nested structs + changing Delta Sink config to use nested structs.
Let me know your thoughts @spetz
| ), | ||
| ]; | ||
| let storage_options = HashMap::from([ | ||
| ("AWS_ACCESS_KEY_ID".into(), MINIO_ACCESS_KEY.into()), |
There was a problem hiding this comment.
Same here (based on my previous suggestion) - we would use the dedicated option struct, and then simply serialize it.
Which issue does this PR close?
Closes #1852
Rationale
Delta Lake is a data analytics engine, and very popular in modern streaming analytics architectures.
What changed?
Introduces a Delta Lake Sink Connector that enables writing data from Iggy to Delta Lake.
The Delta Lake writing logic is heavily inspired by the kafka-delta-ingest project, to have a proven starting ground for writing to Delta Lake.
Local Execution
user_id: String, user_type: u8, email: String, source: String, state: String, created_at: DateTime<Utc>, message: Stringusing sample data producer.AI Usage
If AI tools were used, please answer: