Optimize transducer observation pagination#612
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes the /observation/transducer-groundwater-level listing path by paginating TransducerObservation records without joining QC blocks in the base query, then resolving the correct TransducerObservationBlock only for the page’s items (scoped to the same well). It also adds a composite index to support the new access pattern and introduces a regression test for overlapping blocks across wells.
Changes:
- Refactors
get_transducer_observations()to paginate observations first, then bulk-fetch and attach matching QC blocks for returned page items. - Adds a composite DB index on
(deployment_id, parameter_id, observation_datetime)for faster filtered lookups and ordering. - Adds a regression test ensuring block selection is constrained to the same
Thing(well), even when other wells have overlapping blocks.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
services/observation_helper.py |
Reworks transducer pagination to avoid QC-block joins in the base query and resolves blocks post-page. |
tests/test_observation.py |
Adds coverage to ensure blocks are selected only from the observation’s own well. |
db/transducer.py |
Declares the new composite index on the transducer_observation table. |
alembic/versions/p9c0d1e2f3a4_add_transducer_observation_deployment_index.py |
Creates/drops the composite index via migration. |
Comments suppressed due to low confidence (1)
services/observation_helper.py:77
- The paginated
queryselects rawTransducerObservationrows but the response transformer can drop observations that don't have a matchingTransducerObservationBlock. Withfastapi-pagination,totalis computed from the SQL query (before transformation), so this can produce pages wheretotalcounts observations that never appear initems(and potentially empty pages). Consider constraining the SQL query to only observations that have a matching block (e.g., via anEXISTSpredicate correlated on deployment→thing_id, parameter_id, and observation_datetime), or otherwise ensure the transformer returns one item per row sototalstays consistent withitems.
query = select(TransducerObservation)
if start_time:
query = query.where(TransducerObservation.observation_datetime >= start_time)
if end_time:
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c6ae9947b1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if matching_block is None: | ||
| continue |
There was a problem hiding this comment.
Keep blockless observations out of the paginated result set
If a page contains transducer observations that do not currently fall inside any TransducerObservationBlock (for example, newly ingested data before QC blocks are created), this branch drops them after paginate() has already counted and sliced select(TransducerObservation). That makes total/pages inconsistent with items, and page 1 can be empty even though later pages still return data. The previous SQL join filtered these rows before pagination, so this is a regression in get_transducer_observations.
Useful? React with 👍 / 👎.
Summary
Testing