Hello @lucaslinhares, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request adds a consumer to send messages to a data lake. It includes changes to the poetry.lock and pyproject.toml files to update dependencies, and adds new files temba/msgs/consumers/__init__.py, temba/msgs/consumers/msg_consumer.py, and temba/msgs/handle.py to implement the consumer logic.
Highlights
- Dependency Updates: The poetry.lock and pyproject.toml files are updated to include new dependencies such as exceptiongroup, grpcio, grpcio-health-checking, grpcio-tools, iniconfig, jinja2, markupsafe, moto, pluggy, protobuf, pytest, responses, tomli, weni-datalake-sdk, werkzeug, and xmltodict. Versions of existing dependencies are also updated.
- New Consumer Implementation: The temba/msgs/consumers/msg_consumer.py file implements a new consumer, MsgConsumer, which inherits from EDAConsumer. This consumer parses JSON messages and sends the data to a data lake using the weni_datalake_sdk.
- Consumer Registration: The temba/msgs/handle.py file registers the new MsgConsumer to consume messages from the flows.msgs queue.
Changelog
- poetry.lock
  - Adds new dependencies: exceptiongroup, grpcio, grpcio-health-checking, grpcio-tools, iniconfig, jinja2, markupsafe, moto, pluggy, protobuf, pytest, responses, tomli, weni-datalake-sdk, werkzeug, and xmltodict.
  - Updates versions of existing dependencies.
  - Updates content hash.
- pyproject.toml
  - Updates version of weni-rp-apps from 2.9.0 to 2.9.1a1.
  - Adds moto and weni-datalake-sdk as dependencies.
  - Updates version of responses to be >=0.13.0.
- temba/msgs/consumers/__init__.py
  - Adds import statement for MsgConsumer.
- temba/msgs/consumers/msg_consumer.py
  - Implements MsgConsumer class that inherits from EDAConsumer.
  - Defines consume method to parse JSON messages and send data to a data lake.
- temba/msgs/handle.py
  - Registers MsgConsumer to consume messages from the flows.msgs queue.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A consumer awakes, with purpose defined,
To ingest the data, of every kind.
From queues it shall listen, with diligent ear,
To send to the lake, banishing fear.
With SDK in hand, and paths to explore,
The data will land, forevermore.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request introduces a consumer to send messages to a data lake. The changes involve adding new dependencies, updating existing ones, and creating new files for the consumer logic. Overall, the changes seem reasonable, but there are a few points that need attention.
Summary of Findings
- Dependency Updates: Several dependencies have been updated, including grpcio and protobuf. It's crucial to ensure that these updates are compatible with the existing codebase and that any breaking changes are addressed.
- New Dependencies: New dependencies like exceptiongroup, grpcio-health-checking, moto, and weni-datalake-sdk have been added. It's important to verify the necessity and security of these dependencies.
- Error Handling: The MsgConsumer includes basic error handling using sentry_sdk. Consider adding more robust error handling and logging to ensure message delivery and data integrity.
Merge Readiness
The pull request introduces significant changes, including dependency updates and new consumer logic. While the changes seem generally well-structured, it's crucial to address the identified issues before merging. Specifically, the dependency updates need thorough verification, and the error handling in MsgConsumer should be enhanced. I am unable to approve this pull request, and recommend that it not be merged until these issues are addressed (at a minimum), and that others review and approve this code before merging.
    except Exception as exception:
        capture_exception(exception)
        message.channel.basic_reject(message.delivery_tag, requeue=False)
        print(f"[MsgConsumer] - Message rejected by: {exception}")
The except block captures all exceptions. While this prevents the consumer from crashing, it might mask specific errors that should be handled differently. Consider catching more specific exception types and handling them accordingly. For example, you might want to retry sending the message if it fails due to a temporary network issue, but reject it immediately if it fails due to invalid data.
Also, consider adding more context to the exception message, such as the message ID or other relevant identifiers.
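The retry-versus-reject distinction described above can be pictured with a small, self-contained sketch. Everything named here is hypothetical and only for illustration: `TransientError`, `decide_action`, and the `send` callable are stand-ins, not part of the PR or of the weni-datalake-sdk API.

```python
import json


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a network timeout)."""


def decide_action(body: bytes, send) -> str:
    """Return "ack", "requeue", or "reject" for one message body.

    `send` is a stand-in for the data lake call, not the SDK's real API.
    """
    try:
        data = json.loads(body)
    except ValueError:
        # JSONDecodeError and UnicodeDecodeError both subclass ValueError.
        # Invalid data will never parse, so retrying is pointless.
        return "reject"
    try:
        send(data)
    except (TransientError, ConnectionError, TimeoutError):
        # Temporary failure: the message can go back on the queue.
        return "requeue"
    except Exception:
        # Unknown failure: reject without requeue to avoid an endless loop.
        return "reject"
    return "ack"
```

In the consumer, "requeue" would presumably map to `basic_reject(..., requeue=True)` and "reject" to the existing `requeue=False` call. Requeueing with no retry limit can loop forever on a persistent outage, so a real implementation would likely need a cap or a dead-letter queue.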
    except JSONDecodeError as e:
        capture_exception(e)
        message.channel.basic_reject(message.delivery_tag, requeue=False)
        logger.error(f"[MsgConsumer] - JSONDecodeError: {e} for message with delivery_tag: {message.delivery_tag}")
    except Exception as exception:
        capture_exception(exception)
        message.channel.basic_reject(message.delivery_tag, requeue=False)
        logger.exception(f"[MsgConsumer] - Message rejected due to an unexpected error: {exception} for message with delivery_tag: {message.delivery_tag}")

    def consume(self, message: amqp.Message):  # pragma: no cover
        print(f"[MsgConsumer] - Consuming a message. Body: {message.body}")
        try:
Consider adding more context to the print statement, such as the message ID or other relevant identifiers, to aid in debugging and monitoring.
Also, it might be beneficial to log the message body using a structured logging approach (e.g., using logger.info with extra context) instead of a simple print statement.
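As a rough sketch of what structured logging with extra context looks like using only the standard library: the logger name and the `log_consume` helper below are made up for illustration and do not come from the PR.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("msg_consumer_sketch")


def log_consume(delivery_tag, body: bytes) -> None:
    # Keys in `extra` become attributes on the LogRecord. They show up in
    # output only if a formatter or handler reads them explicitly.
    logger.info(
        "[MsgConsumer] - Consuming message with delivery_tag: %s",
        delivery_tag,
        extra={"message_body": body.decode("utf-8", errors="replace")},
    )
```

One detail worth knowing: keys in `extra` must not collide with built-in LogRecord attributes. A key named `message`, for example, raises a KeyError, which is why a distinct name such as `message_body` is used.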
    logger.info(f"[MsgConsumer] - Consuming message with delivery_tag: {message.delivery_tag}", extra={"message_body": message.body})

    data = dict(body)
    send_data(MsgPath, data)
It's generally a good practice to log the data being sent to the data lake, especially in production environments, to facilitate debugging and auditing. Consider adding a log statement before calling send_data.
    logger.debug(f"[MsgConsumer] - Sending data to datalake: {data}")
    send_data(MsgPath, data)
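A possible variation on the suggestion above, assuming payloads can be large: passing the data as a %-style argument instead of an f-string defers formatting until the record is actually emitted. The `send_with_log` wrapper and its `send` parameter are hypothetical stand-ins for the real `send_data` call.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("datalake_sketch")


def send_with_log(send, path, data: dict) -> None:
    # With %-style arguments the payload is turned into a string only when
    # a record is emitted, so nothing is formatted while DEBUG is disabled.
    logger.debug("[MsgConsumer] - Sending data to datalake: %s", data)
    send(path, data)
```

An f-string, by contrast, builds the full message before `logger.debug` is even called, whatever the configured log level.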
Codecov Report
❌ Patch coverage is
Additional details and impacted files

    @@            Coverage Diff             @@
    ##             main     #536      +/-   ##
    ==========================================
    - Coverage   99.05%   99.04%   -0.02%
    ==========================================
      Files         574      576       +2
      Lines       29616    29627      +11
    ==========================================
    + Hits        29336    29343       +7
    - Misses        280      284       +4

☔ View full report in Codecov by Sentry.
…add-weni-datalake-sdk-lib



No description provided.