Add gRPC client and worker connection resiliency #135

Open
berndverst wants to merge 28 commits into main from grpc-resiliency-pr708-impl
berndverst (Member) commented Apr 24, 2026

Summary

  • Add public gRPC resiliency option types and shared transport helpers, then wire them through core and Azure Managed constructors.
  • Harden worker, sync client, and async client connection recovery to better survive silent disconnects, transport failures, and channel recreation scenarios.
  • Add focused regression coverage plus user-facing changelog and design/plan updates for the new resiliency behavior.
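As a rough illustration of what a validated resiliency option type could look like, here is a minimal sketch. The class name `ExampleResiliencyOptions` and every field name are hypothetical and chosen only for illustration; they are not taken from this PR and do not describe the SDK's actual `GrpcWorkerResiliencyOptions` or `GrpcClientResiliencyOptions` API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleResiliencyOptions:
    """Hypothetical option type; field names are illustrative, not the SDK's."""

    initial_backoff_seconds: float = 1.0
    max_backoff_seconds: float = 30.0
    failures_before_channel_recreate: int = 3

    def __post_init__(self) -> None:
        # Reject invalid values at construction time so that misconfiguration
        # surfaces immediately rather than during a reconnect attempt.
        if self.initial_backoff_seconds <= 0:
            raise ValueError("initial_backoff_seconds must be positive")
        if self.max_backoff_seconds < self.initial_backoff_seconds:
            raise ValueError(
                "max_backoff_seconds must be >= initial_backoff_seconds"
            )
        if self.failures_before_channel_recreate < 1:
            raise ValueError(
                "failures_before_channel_recreate must be at least 1"
            )
```

Validating in `__post_init__` keeps the option object immutable and always valid once constructed, so a constructor that receives it can forward it without re-checking.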

Test Plan

  • Focused pytest run covering gRPC resiliency, worker resiliency, the worker concurrency loop, client resiliency, and Azure Managed wrapper wiring.
  • flake8 on all changed source and test files.

Bernd Verst and others added 26 commits April 23, 2026 17:19
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extend worker resiliency coverage with an end-to-end silent-disconnect recovery test and an explicit reconnect backoff assertion.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 24, 2026 09:35
Comment thread tests/durabletask/test_client.py Fixed
Comment thread tests/durabletask/test_client.py Fixed
Comment thread durabletask/client.py Fixed
Comment thread durabletask/worker.py Fixed
Copilot AI (Contributor) left a comment

Pull request overview

Adds first-class gRPC connection resiliency to the Durable Task Python SDK (core durabletask/) and threads the new configuration through Azure Managed wrappers, with extensive regression tests and design docs.

Changes:

  • Introduces GrpcWorkerResiliencyOptions / GrpcClientResiliencyOptions and shared internal resiliency helpers (backoff, failure tracking, transport-failure classification).
  • Updates worker stream loop and sync/async clients to detect transport-shaped failures and safely recreate/retire SDK-owned channels while preserving caller-owned channel semantics.
  • Adds comprehensive unit tests (core + azuremanaged) plus docs/specs and changelog entries.
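The three helper concepts named above (backoff, failure tracking, transport-failure classification) can be sketched roughly as follows. This is not the code in `durabletask/internal/grpc_resiliency.py`: the names `Status`, `is_transport_failure`, `backoff_delay`, and `ConsecutiveFailureTracker` are hypothetical, the `Status` enum is a local stand-in so the sketch runs without `grpcio`, and the choice of which status codes count as transport-shaped is an assumption.

```python
import random
from enum import Enum


class Status(Enum):
    """Local stand-in for a few gRPC status codes (no grpcio dependency)."""

    UNAVAILABLE = "UNAVAILABLE"
    DEADLINE_EXCEEDED = "DEADLINE_EXCEEDED"
    NOT_FOUND = "NOT_FOUND"
    INVALID_ARGUMENT = "INVALID_ARGUMENT"


# Assumption: which codes are treated as transport-shaped is a guess here.
TRANSPORT_CODES = frozenset({Status.UNAVAILABLE, Status.DEADLINE_EXCEEDED})


def is_transport_failure(code: Status) -> bool:
    """Return True for failures that suggest a broken connection."""
    return code in TRANSPORT_CODES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng=random.random) -> float:
    """Full-jitter exponential backoff: rng() scaled by min(cap, base * 2**attempt)."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


class ConsecutiveFailureTracker:
    """Counts consecutive failures and signals when a threshold is reached."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.count = 0

    def record_failure(self) -> bool:
        """Return True once the threshold is reached, then reset the count."""
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0
            return True
        return False

    def record_success(self) -> None:
        """Any success breaks the streak of consecutive failures."""
        self.count = 0
```

In a reconnect loop built on this sketch, only failures for which `is_transport_failure` returns true would be passed to `record_failure()`, and a `True` result would be the cue to recreate an SDK-owned channel before sleeping for `backoff_delay(attempt)` seconds.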

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| durabletask/worker.py | Implements worker stream monitoring, backoff, failure tracking, and safe SDK-owned channel retirement with in-flight tracking. |
| durabletask/client.py | Adds sync/async unary invocation wrappers and SDK-owned channel recreation + retirement handling. |
| durabletask/grpc_options.py | Adds public resiliency option dataclasses with validation. |
| durabletask/internal/grpc_resiliency.py | Adds shared backoff, FailureTracker, and transport-failure classification helpers. |
| durabletask-azuremanaged/durabletask/azuremanaged/client.py | Forwards client resiliency options through Azure Managed client wrappers. |
| durabletask-azuremanaged/durabletask/azuremanaged/worker.py | Forwards worker resiliency options through Azure Managed worker wrapper. |
| tests/durabletask/test_worker_resiliency.py | New worker resiliency tests (silent disconnect, graceful close, recreation thresholds, in-flight close deferral). |
| tests/durabletask/test_grpc_resiliency.py | New tests for option validation, backoff, FailureTracker, and transport-failure classification. |
| tests/durabletask/test_client.py | Adds sync/async client channel recreation/retirement tests and wrapper verification. |
| tests/durabletask/test_worker_concurrency_loop.py | Updates tests to call prepare_for_run() before reusing the worker manager. |
| tests/durabletask/test_worker_concurrency_loop_async.py | Updates async loop tests to call prepare_for_run() before reusing the worker manager. |
| tests/durabletask-azuremanaged/test_azuremanaged_grpc_resiliency.py | New tests validating Azure Managed wrapper pass-through of resiliency options. |
| CHANGELOG.md | Documents new resiliency options and behavior changes in core SDK. |
| durabletask-azuremanaged/CHANGELOG.md | Documents pass-through resiliency options in Azure Managed package. |
| docs/superpowers/specs/2026-04-23-grpc-resiliency-design.md | Adds design spec for resiliency behavior and public API. |
| docs/superpowers/plans/2026-04-23-grpc-resiliency.md | Adds implementation plan document for the work. |
| .gitignore | Ignores .worktrees/ and normalizes coverage.lcov entry formatting. |
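The "channel retirement with in-flight tracking" and "in-flight close deferral" entries describe a pattern that might look roughly like the sketch below: a retired channel is closed only after the last outstanding call finishes. The `RetirableChannel` class and its method names are hypothetical, and `close_fn` stands in for whatever actually closes the underlying channel; none of this is taken from the PR's implementation.

```python
import threading


class RetirableChannel:
    """Hypothetical wrapper: close is deferred until in-flight calls finish."""

    def __init__(self, close_fn) -> None:
        self._close_fn = close_fn  # stand-in for the real channel close
        self._lock = threading.Lock()
        self._in_flight = 0
        self._retired = False
        self._closed = False

    def begin_call(self) -> None:
        """Mark the start of a call that uses this channel."""
        with self._lock:
            self._in_flight += 1

    def end_call(self) -> None:
        """Mark the end of a call; close if retired and now idle."""
        with self._lock:
            self._in_flight -= 1
            should_close = self._should_close_locked()
        if should_close:
            self._close_fn()

    def retire(self) -> None:
        """Request closure; it happens now if idle, otherwise after the last call."""
        with self._lock:
            self._retired = True
            should_close = self._should_close_locked()
        if should_close:
            self._close_fn()

    def _should_close_locked(self) -> bool:
        # Caller must hold the lock. Flips _closed so close_fn runs only once.
        if self._retired and self._in_flight == 0 and not self._closed:
            self._closed = True
            return True
        return False
```

Calling `close_fn` outside the lock is a deliberate choice in this sketch: it avoids holding the lock while a potentially slow close runs, and the `_closed` flag still guarantees the close happens at most once.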

Comment thread durabletask/internal/grpc_resiliency.py Outdated
Comment thread docs/superpowers/plans/2026-04-23-grpc-resiliency.md Outdated
Comment thread durabletask/worker.py Outdated
Comment thread durabletask/client.py Outdated
Comment thread durabletask/client.py Outdated