Merged
4 tasks
bokelley added a commit that referenced this pull request Apr 30, 2026
…design
Round-4 review pass synthesizes (a) the TS team's review of the parallel
@adcp/client port (PR #1005, EmmaLouise2018), (b) the TS team's
decisioning-platform-python-port-v2.md RFC, and (c) Yahoo's ask for
typed framework-owned state threading on RequestContext.
Guiding principle ported from the TS port: "make it impossible for an
implementer to screw up via typing." Python can't match TS's
compile-time RequiredPlatformsFor<S> gate, but per-method typed
surfaces, runtime validate_platform fail-fast, and Protocol structural
matching close most of the gap.
Highlights:
- D15 NEW: typed RequestContext sub-readers (state + resolve).
  - StateReader (sync): find_by_object, find_proposal_by_id,
    governance_context, workflow_steps. Lets platforms read prior
    workflow context without re-querying their own DB.
  - ResourceResolver (async): property_list, collection_list,
    creative_format. Framework-mediated cache + validation.
  - Surface ships in v6.0 with no-op stub backings; impls fill in
    for v6.1 (same gating as TS side). Locks the typed contract so
    adopters write the right shape from day one.
- Round-4 changelog covers 8 cross-language items applied:
  - D14 enum coverage (Emma #6)
  - D7+serve() prod gate on InMemoryTaskRegistry (Emma #8)
  - Dispatch AdcpError projection consistency (Emma #10)
  - D6 sync-handoff register-before-cleanup race (Emma #11)
  - validate_platform catches validator throws (Emma #16)
  - Per-server status-change bus, not module-level singleton (Emma #17)
  - AdcpError ACCOUNT_NOT_FOUND semantic narrowing (Emma #18)
  - CI lint: examples can't reach into src/ (Emma #5)
- Bugs structurally avoided in our hybrid SalesResult[T] design
  documented (Emma #2, #3, #13, design concern #14); worth calling
  out in foundation PR description; the framework-design choice gets
  the credit.
- File plan additions: state.py, resolve.py, context.py extensions for
  D15; four new test files for Round-4 regressions. Foundation PR
  total grew from ~2475 to ~2965 lines.
- Items deferred to follow-up PRs: ErrorCode Literal codegen (Emma #19),
  workflow-step/proposal/governance backing store (D15 v6.1),
  tasks/get wire surface.
- TS-only items (no Python equivalent) explicitly enumerated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
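
For readers skimming the D15 item, here is a rough sketch of how an adopter method might read from the two sub-readers. It is not the code in this PR: the method names come from the commit message, but the parameter names, return types, the simplified `RequestContext`, and the `EmptyState` / `FakeResolver` test doubles are all assumptions made for illustration.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Any, Optional, Protocol


class StateReader(Protocol):
    """Sync reads of prior workflow state. Signatures are assumed."""

    def find_by_object(self, object_type: str, object_id: str) -> list[Any]: ...
    def find_proposal_by_id(self, proposal_id: str) -> Optional[Any]: ...
    def governance_context(self) -> Optional[str]: ...
    def workflow_steps(self) -> list[Any]: ...


class ResourceResolver(Protocol):
    """Async framework-mediated fetches. Signatures are assumed."""

    async def property_list(self, list_id: str) -> Any: ...
    async def collection_list(self, list_id: str) -> Any: ...
    async def creative_format(self, format_id: str) -> Any: ...


@dataclass
class RequestContext:
    """Simplified stand-in; the real class carries more fields."""

    state: StateReader
    resolve: ResourceResolver


class EmptyState:
    """Hypothetical test double returning type-correct empty values."""

    def find_by_object(self, object_type: str, object_id: str) -> list[Any]:
        return []

    def find_proposal_by_id(self, proposal_id: str) -> Optional[Any]:
        return None

    def governance_context(self) -> Optional[str]:
        return None

    def workflow_steps(self) -> list[Any]:
        return []


class FakeResolver:
    """Hypothetical test double returning plain dicts."""

    async def property_list(self, list_id: str) -> Any:
        return {"id": list_id, "properties": []}

    async def collection_list(self, list_id: str) -> Any:
        return {"id": list_id, "collections": []}

    async def creative_format(self, format_id: str) -> Any:
        return {"id": format_id}


async def get_products(ctx: RequestContext, property_list_id: str) -> dict[str, Any]:
    # Sync read: prior workflow steps, no adopter-side DB query.
    steps = ctx.state.workflow_steps()
    # Async read: framework-mediated fetch.
    plist = await ctx.resolve.property_list(property_list_id)
    return {"prior_steps": len(steps), "property_list": plist["id"]}


result = asyncio.run(get_products(RequestContext(EmptyState(), FakeResolver()), "pl_1"))
```

Both test doubles satisfy the Protocols structurally, with no inheritance, which is the property the "Protocol structural matching" wording relies on.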
bokelley added a commit that referenced this pull request Apr 30, 2026
…sign) (#316)

* feat(decisioning): foundation skeleton - types, accounts, Protocol, platform

Lays the v6.0 DecisioningPlatform foundation inside the existing ``adcp`` package at ``adcp.decisioning.*``. Pure types + Protocol + reference account-store impls; no dispatch adapter yet (wire-up to ``adcp.server.serve`` ships in the next commit on this PR).

Modules:

* ``adcp.decisioning.types``: TaskHandoff (``__slots__``-only marker with type-identity dispatch; rejects subclasses), Account[TMeta], AdcpError (wire-shaped structured error distinct from ``adcp.exceptions.ADCPError``), MaybeAsync / SalesResult named aliases (TypeAliasType-based, mypy-clean for generic parameterization on 3.10-3.12 via typing_extensions)
* ``adcp.decisioning.context``: RequestContext[TMeta] subclasses ``adcp.server.ToolContext`` so the existing framework's idempotency middleware, observability hooks, and A2A executor consume it unchanged while adopter Protocol methods read the typed ``account: Account[TMeta]`` directly. AuthInfo dataclass for verified-principal threading.
* ``adcp.decisioning.accounts``: AccountStore Protocol + three reference impls (``SingletonAccounts``, ``ExplicitAccounts``, ``FromAuthAccounts``). SingletonAccounts synthesizes per-principal IDs (``f"{base}:{principal}"``) so the buyer-to-buyer cache-leak regression from the foundation audit is closed at the reference-impl layer (regression test asserts).
* ``adcp.decisioning.platform``: DecisioningPlatform base class + DecisioningCapabilities dataclass. Adopters subclass and declare ``capabilities`` + ``accounts`` + per-specialism methods directly on the class; the dispatch adapter (next commit) discovers methods via hasattr at server boot.
* ``adcp.decisioning.specialisms.sales``: SalesPlatform Protocol covering all 9 ``sales-*`` specialisms under one unified hybrid shape. Full method signatures with per-method docstrings declaring which specialism gates each (so ``validate_platform`` at boot matches what the docstrings claim). Wire-type imports under ``TYPE_CHECKING`` to keep Protocol-only loads lightweight.

19 unit tests covering: TaskHandoff identity dispatch (subclass-rejection regression), AdcpError wire projection, Account default shape, SingletonAccounts per-principal scoping (buyer-to-buyer leak regression), ExplicitAccounts/FromAuthAccounts resolver shapes, AccountStore Protocol structural matching, DecisioningPlatform subclass attribute contract.

All tests pass; mypy clean; black + ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(decisioning): dispatch-adapter design (post-6-reviewer-pass)

Design doc capturing 14 locked decisions for the upcoming dispatch adapter, codegen pipeline, task_registry stub, and serve() wrapper, plus the decision to split the framework-shared handler-registration seam into a separate prep PR.

Refined across 6 reviewer passes:

- Round 1 (initial design): agentic-product-architect, python-expert
- Round 2 (post codegen + framing additions): agentic-product-architect, python-expert, dx-expert, code-reviewer

Authoritative reference for the foundation PR.
Documents:

* D1 codegen: reads per-specialism Protocols (not _HANDLER_TOOLS), arg-projection for wire-shape mismatches, fail-fast on missing Pydantic types, prescriptive header, ruff format post-emit
* D2 context mutation: extends ToolContext via context_factory, middleware mutates in place (framework supports; replacement doesn't compose)
* D3 method discovery: reuses framework's _is_method_overridden
* D4 register_handler_tools: adds an advertised_tools class attr + __init_subclass__ auto-registration + boot-time UserWarning; framed as PlatformHandler enabler, NOT general framework feature (no adopter evidence motivates the broader framing); split as a prep PR
* D5 sync-method dispatch: explicit ThreadPoolExecutor + explicit contextvars.copy_context (run_in_executor doesn't auto-snapshot)
* D6 TaskHandoff routing: async via create_task (snapshots contextvars for free); sync via run_in_executor + explicit copy. Awaitable-returning sync callables explicitly unsupported
* D7 TaskRegistry: Protocol shape pinned with per-method contract docstrings; in-memory stub ships in foundation
* D8 dual public API: adcp.decisioning.serve wrapper + seam
* D9 caller_identity = account.id: semantic shift documented; metadata["adcp_decisioning.auth_principal"] retains raw principal
* D10 idempotency ordering: wrapper builds correctly, runtime assert dropped (had a slice bug; document invariant instead)
* D11 __init_subclass__ validator: fails class-definition without capabilities/accounts; BaseModel MRO conflict noted
* D12 get_adcp_capabilities: synthesized from platform.capabilities
* D13 vertical-slice example + integration test
* D14 _invoke_platform_method contract pinned; REQUIRED_METHODS_PER_SPECIALISM.get tolerates unknown specialisms (forward-compat with v6.1+ specs)

File plan splits into 2 PRs:

- Prep PR: ~175 lines (framework handler-registration seam)
- Foundation PR: ~2100 lines (adcp.decisioning.* + 1500 already committed in 4a2f8aae)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(decisioning): apply round-3 user feedback to dispatch design

Round-3 review on PR #316 surfaced eight items, all resolved in-place on D1 / D5 / D9 / D13 / D14 plus added cross-tenant + arg-projection regression tests in the file plan.

Highlights:

- D9: cache scope key composed as (account_store qualname, account.id) for structural cross-store isolation instead of relying on adopter Account.id-uniqueness discipline. RequestContext.auth_principal added as a typed attribute (caller_identity now correctly names the cache scope key, not the auth principal).
- D14: unknown specialisms emit UserWarning at boot (not DEBUG) so typos like sales-non-guarateed surface in CI without breaking v6.1+ forward-compat tolerance.
- D1: drift error message names the regen command verbatim; arg-projection emits explicit kwargs (not **unpack) so Pydantic field renames trip a NameError at codegen time.
- D5: serve() exposes executor= / thread_pool_size= knobs (mutually exclusive) with a documented default of min(32, cpu+4) and thread_name_prefix; framework owns lifecycle for default pools, operator owns lifecycle for BYO.
- D13: examples split into hello_seller.py (sync) and hello_seller_async_handoff.py (hybrid + AdcpError round-trip).
- File plan: added test_decisioning_task_registry_cross_tenant.py hostile-probe regression and test_hello_seller_async_handoff_integration.py; extended dispatch test to cover composite caller_identity, auth_principal, UserWarning, kwargs path. Foundation total ~2475 lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(decisioning): apply round-4 cross-language feedback to dispatch design

Round-4 review pass synthesizes (a) the TS team's review of the parallel @adcp/client port (PR #1005, EmmaLouise2018), (b) the TS team's decisioning-platform-python-port-v2.md RFC, and (c) Yahoo's ask for typed framework-owned state threading on RequestContext.
Guiding principle ported from the TS port: "make it impossible for an implementer to screw up via typing." Python can't match TS's compile-time RequiredPlatformsFor<S> gate, but per-method typed surfaces, runtime validate_platform fail-fast, and Protocol structural matching close most of the gap. Highlights: - D15 NEW: typed RequestContext sub-readers (state + resolve). - StateReader (sync) — find_by_object, find_proposal_by_id, governance_context, workflow_steps. Lets platforms read prior workflow context without re-querying their own DB. - ResourceResolver (async) — property_list, collection_list, creative_format. Framework-mediated cache + validation. - Surface ships in v6.0 with no-op stub backings; impls fill in for v6.1 (same gating as TS side). Locks the typed contract so adopters write the right shape from day one. - Round-4 changelog covers 8 cross-language items applied: - D14 enum coverage (Emma #6) - D7+serve() prod gate on InMemoryTaskRegistry (Emma #8) - Dispatch AdcpError projection consistency (Emma #10) - D6 sync-handoff register-before-cleanup race (Emma #11) - validate_platform catches validator throws (Emma #16) - Per-server status-change bus, not module-level singleton (Emma #17) - AdcpError ACCOUNT_NOT_FOUND semantic narrowing (Emma #18) - CI lint: examples can't reach into src/ (Emma #5) - Bugs structurally avoided in our hybrid SalesResult[T] design documented (Emma #2, #3, #13, design concern #14) — worth calling out in foundation PR description; the framework-design choice gets the credit. - File plan additions: state.py, resolve.py, context.py extensions for D15; four new test files for Round-4 regressions. Foundation PR total grew from ~2475 to ~2965 lines. - Items deferred to follow-up PRs: ErrorCode Literal codegen (Emma #19), workflow-step/proposal/governance backing store (D15 v6.1), tasks/get wire surface. - TS-only items (no Python equivalent) explicitly enumerated. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(decisioning): pin framework-only RequestContext construction Round-4 follow-up: D15 documents that adopter code receives a RequestContext from the dispatch hydration helper on every request, never constructs one directly. Mirrors the TS port's to-context.ts:buildRequestContext contract. - D15 + RequestContext docstring add the @internal-construction note: direct construction is for tests only; adopters needing to modify context use dataclasses.replace. - Hydration helper _build_request_context in dispatch.py is the one production path; _NotYetWiredStateReader / _NotYetWiredResolver defaults exist solely so test fixtures and examples can construct a RequestContext without the framework. - Silent divergence between framework path and ad-hoc adopter construction is exactly the failure mode the typing-driven safety principle is supposed to prevent (no auth_principal plumbing, no v6.1 backing store hand-off). 19 decisioning unit tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(decisioning): tighten D15 stub posture, governance gate, types Round-4 review on D15 surfaced five concerns; all addressed in-place on D15 plus a tightenings subsection in the round-4 changelog. - **Stub asymmetry resolved.** Both StateReader and ResourceResolver stubs emit a one-time UserWarning per method on first call. state.* still returns type-correct empty values (empty workflow-steps IS legitimate for fresh tenants); resolve.* still raises (an empty PropertyList is divergence the framework cannot silently paper over). Asymmetry now justified per-reader. - **governance_context() fail-fast.** Added capabilities.governance_aware: bool = False. validate_platform raises AdcpError at server boot if any governance-* specialism is claimed without a real StateReader wired AND no opt-in. Framework refuses to ship silent governance-gate skipping. 
Defaults False; non-governance flows untouched. - **Type-stability table added.** Lock all D15-referenced types in v6.0, not just the Protocols. Account, AuthInfo, Proposal, PropertyList, CollectionList, Format, FormatReferenceStructuredObject already in adcp.types.generated_poc; WorkflowStep, WorkflowObjectType, GovernanceContextJWS framework-internal in adcp.decisioning.state, shipped foundation-stable. - **creative_format(revalidate: bool = False).** Pinned in the Protocol contract so adopters with freshness needs aren't stuck on the impl's cache TTL. Cache TTL becomes impl detail; revalidate=True is the opt-out at the Protocol level. - **ADCP_ENV reuse.** Replaces free-form ADCP_ENV=production reference with the existing SDK helper at src/adcp/validation/client_hooks.py:68 (case-insensitive ADCP_ENV in {"prod", "production"}). One prod-detection mechanism. Test additions in test_decisioning_context_state_resolve.py (~150 lines): one-time UserWarning regression, governance opt-in fail-fast, revalidate parameter contract. Foundation PR total grew from ~2475 to ~2510 lines. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(decisioning): D15 typed RequestContext sub-readers (state, resolve) Adds the typed framework-owned sub-readers Yahoo asked for and the TS team's adcp-client PR #1005 already ships. Surface lands in v6.0; backing impls fill in for v6.1 — adopters write platform method bodies that read ctx.state.* and ctx.resolve.* against the real contract from day one rather than refactoring when v6.1 lands. Mirrors the TS-side `to-context.ts:buildRequestContext` shape 1:1: account, state (sync workflow-state reads), resolve (async framework-mediated fetches with cache + validation), auth_principal, handoff_to_task. Cross-language adopters get the same fields. What lands: - adcp/decisioning/state.py: StateReader Protocol + WorkflowStep frozen dataclass + WorkflowObjectType Literal + GovernanceContextJWS NewType + Proposal re-exported. 
_NotYetWiredStateReader v6.0 stub returns empty values + emits one-time UserWarning per method per process. - adcp/decisioning/resolve.py: ResourceResolver Protocol with property_list, collection_list, creative_format(revalidate=False). revalidate kwarg pinned in the Protocol contract — cache TTL is impl detail. _NotYetWiredResolver v6.0 stub raises NotImplementedError with design-doc anchor (#d15) on every method. Asymmetry vs. state stub justified per-reader: empty workflow list IS legitimate for fresh tenants; empty PropertyList is divergence the framework can't silently paper over. - adcp/decisioning/context.py: state, resolve, auth_principal fields on RequestContext with stub defaults via field(default_factory=...). - adcp/decisioning/platform.py: DecisioningCapabilities.governance_aware bool flag + GOVERNANCE_SPECIALISMS frozenset. Foundation-PR validate_platform reads these to fail-fast at server boot when a governance-* specialism is claimed without the opt-in. - adcp/decisioning/__init__.py: re-exports all D15 types. - adcp/types/__init__.py: surfaces FormatReferenceStructuredObject (already in _generated.py but missing from public surface). Snapshot regenerated. - tests/test_decisioning_context_state_resolve.py: 22 tests covering Protocol matching, structural custom impls, all four state stub methods (empty + warn-once + independent per-method), resolve stub raises with anchor + revalidate parameter contract enforced for both False/True, RequestContext defaults, dataclasses.replace test-double substitution, governance_aware default + opt-in, GOVERNANCE_SPECIALISMS pinned. Foundation tests: 39 passing (+22 from Stage 2). Full suite: 2417 passed, 17 skipped, 1 xfailed. ruff + mypy clean on touched files. Stage 2 of the foundation PR. Stage 3 (codegen + dispatch + serve) can start once prep PR #318 merges. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(decisioning): apply D15 review feedback (P1 polish) Python-expert review on commit b4b1616 (D15) flagged six items, all P1 polish: - **governance_context warning text fixed**: previous text claimed "different values once wired" — misleading for non-governance flows where None IS the v6.1 answer. Special-cased the warning to explain that the fail-fast lands at server boot for governance adopters, and None is correct for non-governance flows. - **Removed __module__ = __name__ no-op** in state.py — module-scope function definitions already have __module__ set. - **Protocol structural-match caveats documented**: - StateReader docstring: isinstance() matches by attribute name only; return types (including NewType GovernanceContextJWS) and signatures are mypy-only enforcement. - ResourceResolver docstring: isinstance() doesn't check coroutinehood — sync method named property_list passes structural check, fails at await time. Use mypy. - **PropertyList alias pinned**: contract comment + regression test (test_property_list_alias_pinned_to_reference) tripwires future spec rev that introduces a distinct resolved-list type — drift becomes visible at CI time, not deploy time. - **governance_aware fails-fast docstring softened**: this commit ships the contract; Stage 3 dispatch lands the actual fail-fast. Docstring now reads "Stage 3 dispatch will fail-fast" rather than promising current behavior. - **Cross-instance warn-once test added**: confirms the module-level _STATE_STUB_WARNED set carries across stub instances (per process per method, not per request). Three new tests: - test_state_stub_warned_once_is_cross_instance - test_state_stub_governance_context_warning_text - test_property_list_alias_pinned_to_reference Test count: 40 (+3) in test_decisioning_context_state_resolve.py. Full suite: 2420 passed, 17 skipped, 1 xfailed. 
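
The structural-match caveat documented in this commit is easy to reproduce in isolation. The sketch below uses a cut-down, hypothetical `ResourceResolver` with a single method and an assumed signature; it only demonstrates that `isinstance()` against a `runtime_checkable` Protocol checks attribute names, not coroutinehood.

```python
import asyncio
import inspect
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ResourceResolver(Protocol):
    # Cut-down Protocol for the demo; the signature is assumed.
    async def property_list(self, list_id: str) -> Any: ...


class SyncImpostor:
    # Same method name, but sync instead of async.
    def property_list(self, list_id: str) -> Any:
        return {"id": list_id}


impostor = SyncImpostor()

# Passes: runtime_checkable only verifies the attribute exists.
passes_structural_check = isinstance(impostor, ResourceResolver)

# An explicit check does see the difference.
is_really_async = inspect.iscoroutinefunction(impostor.property_list)


async def use(resolver: Any) -> Any:
    # Awaiting the plain dict returned by the sync method raises TypeError.
    return await resolver.property_list("pl_1")


try:
    asyncio.run(use(impostor))
    await_error = None
except TypeError as exc:
    await_error = exc
```

A static checker such as mypy would reject `SyncImpostor` where a `ResourceResolver` is expected, which is why the docstrings point adopters at mypy rather than at `isinstance()`.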
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(decisioning): TaskRegistry Protocol + InMemoryTaskRegistry stub Stage 3 first piece. Foundational — no deps on dispatch.py or serve.py yet; those layers consume this Protocol next. What lands: - adcp/decisioning/task_registry.py: TaskRegistry runtime_checkable Protocol with per-method contract docstrings (D7). Cross-tenant safety pinned: get(task_id, expected_account_id=) MUST return None on mismatch. InMemoryTaskRegistry v6.0 reference impl (asyncio.Lock-guarded dict). Idempotent on equal terminal payloads; raises on mismatched re-completion. TaskHandoffContext (id + update + heartbeat). TaskRecord frozen-shape dataclass. Production-mode gate documented (Stage 3 serve.py wiring will refuse InMemoryTaskRegistry in ADCP_ENV in {prod, production} without ADCP_DECISIONING_ALLOW_INMEMORY_TASKS=1). - adcp/decisioning/__init__.py: re-exports. - tests/test_decisioning_task_registry.py: 22 tests covering Protocol structural matching (concrete + duck-typed), full lifecycle (issue/update_progress/complete/fail/get), idempotency on equal terminal payloads, raise on mismatch, concurrent issue() unique ids, update_progress on unknown task is silent no-op, TaskHandoffContext.update swallows registry errors, TaskHandoffContext.heartbeat is v6.0 no-op. - tests/test_decisioning_task_registry_cross_tenant.py: 8 tests covering the security boundary at every state (submitted / working / completed / failed) — cross-tenant probe returns None; same-tenant read still works; empty-string account_id is mismatch; substring/prefix not enough — exact equality required; unknown task_id + cross-tenant probe both return None. 70 decisioning tests pass total (+30 from this commit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(decisioning): dispatch layer — validate_platform + invoke + handoff Stage 3 second piece. 
Builds on task_registry.py (commit e961adc) to ship the dispatch seam that ties RequestContext hydration, account resolution, executor lifecycle, AdcpError projection, and TaskHandoff lifecycle together. - adcp/decisioning/dispatch.py: * REQUIRED_METHODS_PER_SPECIALISM (sales-* 9 specialisms; pinned by contract test). * validate_platform(platform) — server-boot fail-fast with governance opt-in security gate (D15 round-4) and forward-compatible unknown-specialism UserWarning (D14 round-3). Validator throws caught + projected to INVALID_REQUEST so boot never crashes (Emma #16). * compose_caller_identity(account, store) — composite key per round-3 D9 (structural cross-store isolation). * _build_request_context — hydration helper mirroring TS to-context.ts:buildRequestContext. Stub state/resolve when not supplied; v6.1 backing impls plug in via kwargs. * _invoke_platform_method — async-vs-sync detection (asyncio, not inspect — partial-unwrap drift), sync runs on executor with explicit contextvars snapshot (D6). TaskHandoff returns flow through _project_handoff. Non-AdcpError exceptions wrap to INTERNAL_ERROR with __cause__ preserved. * _project_handoff — registry.issue → Submitted envelope → background fn (asyncio.create_task or run_in_executor) → registry.complete/fail. - tests/test_decisioning_dispatch.py: 27 tests covering every surface (validate_platform happy + 7 failure paths; compose_caller_identity composite + isolation; _build_request_context hydration variants; _invoke_platform_method async/sync/contextvars/ errors/arg-projector; _project_handoff envelope/lifecycle). Foundation tests: 97 (+27). Full suite: 2477 passed, 17 skipped, 1 xfailed. ruff + mypy clean. Stage 3 next: codegen handler.py + serve.py wrapper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(decisioning): PlatformHandler — wire-shape shims for SalesPlatform Stage 3 third piece. 
Builds on dispatch.py + task_registry.py to ship the wire-shape shim layer the framework's typed-handler dispatch routes wire requests through. - adcp/decisioning/handler.py: PlatformHandler(ADCPHandler[ToolContext]) — codegen target (hand-written for v6.0 alpha; codegen drift test Stage 4 follow-up). Constructor takes DecisioningPlatform + executor + registry; optional state_reader + resource_resolver kwargs plumb through to _build_request_context. advertised_tools: ClassVar[set[str]] declares all 9 sales-* tools. Auto-registers via __init_subclass__ once prep PR #318 merges and foundation rebases. Per-tool typed shims: resolve account → build RequestContext → invoke platform method → return typed response (or AdcpError flows through verbatim). update_media_buy uses arg-projection (D1 — Python signature is media_buy_id+patch+ctx vs wire shape having both at top level). list_creative_formats + provide_performance_feedback have no 'account' field on wire — shim passes None, adopter store handles via 'singleton' or 'derived' resolution. AccountReference handling: tolerant of both Pydantic instance (typical wire path) and raw dict (test fixtures, custom dispatch). Liskov narrowing: param types narrow from base ADCPHandler's Pydantic | dict union to just Pydantic — endorsed by docs/handler-authoring.md typed-dispatch pattern. Per-method # type: ignore[override] documents the intentional narrowing. - tests/test_decisioning_handler.py: 12 tests covering routing — advertised_tools, get_products (account+auth+error+wrap), create_media_buy sync + handoff, update_media_buy arg-projection, sync_creatives, list_creative_formats, async account resolver, dict-shaped auth_info re-coercion. Foundation tests: 109 (+12). Full suite: 2489 passed, 17 skipped, 1 xfailed. ruff + mypy clean. Stage 3 next: serve.py wrapper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(decisioning): serve.py wrapper — public adopter surface Stage 3 final piece. 
Two public entry points wire the foundation layers together:

- create_adcp_server_from_platform(platform, ...) → (handler, executor, registry) 3-tuple. Adopters wanting full control over MCP/A2A wiring use this seam.
- serve(platform, ...) → one-call wrapper that builds the handler and starts the MCP server via adcp.server.serve. Most adopters use this. Forwards host/port/transport/etc. via **serve_kwargs.

Wires per the dispatch design doc:

- D5 ThreadPoolExecutor configurability:
  * executor= (BYO operator-vetted pool; operator owns lifecycle)
  * thread_pool_size= (size the framework-allocated default)
  * default min(32, cpu+4) with thread_name_prefix="adcp-decisioning-"
  * executor= and thread_pool_size= are mutually exclusive
- Emma #8 production-mode gate on InMemoryTaskRegistry:
  * Reads ADCP_ENV (case-insensitive {"prod", "production"}, same convention as adcp.validation.client_hooks._default_response_mode)
  * Refuses to start in production with InMemoryTaskRegistry unless ADCP_DECISIONING_ALLOW_INMEMORY_TASKS=1 explicitly set
  * Custom durable registry bypasses the gate
- D15 state_reader / resource_resolver kwargs plumbed through to PlatformHandler.
- validate_platform called before handler construction; failure surfaces as AdcpError to the caller.

24 tests in test_decisioning_serve.py covering all the above scenarios.

Foundation tests: 133 (+24). Full suite: 2513 passed, 17 skipped, 1 xfailed. ruff + mypy clean.

Stage 3 complete. Stage 4 next: examples/hello_seller.py + integration tests + ruff lint rule banning examples reaching into src/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(decisioning): Stage-3 review P0 fixes - wiring, GC, durability marker

Three independent reviewer passes (code-reviewer, security-reviewer, python-expert) flagged six P0/security blockers in Stage 3. All fixed; full suite 2519 passed, mypy + ruff clean.

P0 fixes:

1. compose_caller_identity wired into _build_request_context. Was exported + tested but never called from the dispatch path, so D9 round-3 cross-store cache isolation did not exist at runtime. _build_request_context now accepts store= and sets ctx.caller_identity to the composite key. handler.py passes self._platform.accounts on every dispatch.
2. Background _run() task strong-referenced. asyncio.create_task only weak-refs; under GC pressure tasks vanish before completion, leaving registry stuck in 'submitted' forever. Tracked in module-level _BACKGROUND_HANDOFF_TASKS set with add_done_callback cleanup. Documented Python footgun.
3. Production gate uses is_durable marker, not isinstance. The isinstance(registry, InMemoryTaskRegistry) check was bypassable by duck-typed re-implementations AND fired incorrectly on safe instrumentation subclasses. New TaskRegistry Protocol declares is_durable: ClassVar[bool]; InMemoryTaskRegistry sets False. Subclasses inherit False (gate fires); custom durable impls set True explicitly. Safe-by-default.
4. Empty/<unset> account_id rejected. AccountStore returning Account(id="") or default Account(id="<unset>") silently collapsed every empty-id tenant into one cache scope class, a cross-tenant data leak. Both compose_caller_identity AND InMemoryTaskRegistry.issue now reject empty/whitespace/<unset> fail-fast.
5. compose_caller_identity uses module + qualname. __qualname__ alone collides for two MyStore classes in different packages. Now composes f"{module}.{qualname}:{account.id}".
6. _project_handoff contextvars comment corrected. Comment claimed asyncio.create_task auto-snapshots; it inherits, not snapshots. Updated to explain the inherit-by-reference semantics and why it's the right behavior here.
Test additions: - test_compose_caller_identity_uses_module_qualname_and_account_id (replaces qualname-only test) - test_compose_caller_identity_rejects_empty_account_id - test_build_request_context_uses_composite_key_when_store_supplied (the load-bearing wiring regression) - test_handoff_background_task_is_strong_referenced - test_create_passes_in_production_with_custom_durable_registry (updated to use is_durable marker) - test_create_raises_when_inmemory_subclass_used_in_production (subclass-bypass regression) - test_create_raises_when_duck_typed_non_durable_used_in_production (safe-by-default regression) - test_in_memory_task_registry_is_not_durable Foundation tests: 145 (+12). Full suite: 2519 passed, 17 skipped, 1 xfailed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(decisioning): Stage-3 review P1 fixes — singletons, drift detection, logging Batch 2 of Stage-3 reviewer feedback. Three focused improvements landing on top of the P0 fixes (commit c2b0407): 1. Module-level singleton stubs for the v6.0 stub state/resolve readers. Per-RequestContext allocation bought nothing — the warned-once set is module-level and the docstring promises "per process per method, not per request". Singleton matches the contract and eliminates per-request stub churn. 2. arg_projector signature-drift detection. When an adopter renames a kwargs-projected param (e.g., update_media_buy's `patch` → `update_data`), the framework's kwargs-unpack hits TypeError. Previously projected to bare INTERNAL_ERROR with no hint. Now projected to INVALID_REQUEST with the projected-kwargs and method name in the message — adopters fix the signature without log archaeology. Fall-through TypeError (non-projector path) still wraps to INTERNAL_ERROR. 3. TaskHandoffContext.update logging. 
The swallow-on-registry-error contract is preserved (transient writes must not abort the handoff fn), but now logs at WARNING with traceback + task_id so transient failures aren't silently invisible to operators. Test additions: - test_invoke_arg_projector_signature_drift_projects_invalid_request - test_handoff_context_update_swallows_registry_errors strengthened to assert the new WARNING log - test_default_state_reader_is_module_singleton - test_default_resolver_is_module_singleton - test_request_context_default_factories_share_singleton Foundation tests: 149 (+4). Full suite: 2523 passed, 17 skipped, 1 xfailed. ruff + mypy clean. P2 items (full hex task_id, design-doc WeakValueDictionary mention, TaskRecord.error vs adcp_error spec field, async-detection docstring alignment) deferred to Stage-4 follow-up — they don't block correctness or security. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(decisioning): hello_seller examples + integration tests Stage 4 — vertical-slice examples that demonstrate the v6.0 DecisioningPlatform from a single screen, plus integration tests that exercise the full dispatch path end-to-end. - examples/hello_seller.py: smallest possible sales-non-guaranteed adopter. Five sync methods (the spec-required core) + AdcpError raise on empty packages. Validates against the framework's full validate_platform surface (specialism method coverage, AccountStore wiring, composite caller_identity). - examples/hello_seller_async_handoff.py: hybrid platform demonstrating all three return shapes of create_media_buy in one body — sync success / AdcpError raise (correctable rejection) / TaskHandoff (HITL trafficker review with progress updates). 
- tests/test_hello_seller_integration.py: 7 tests covering the sync dispatch path: typed Pydantic request → resolved account → typed response, AdcpError correctable rejection, account_resolution threading two principals, composite caller_identity wiring (D9 round-3), advertised_tools class attribute pinned.
- tests/test_hello_seller_async_handoff_integration.py: 5 tests covering the hybrid path: sync arm returns success envelope without task_id, AdcpError arm raises with full to_wire() envelope, handoff arm returns wire Submitted envelope synchronously and async registry persists the terminal artifact, progress updates from the handoff fn visible via registry, handoff fn AdcpError persists via registry.fail.
Foundation tests: 161 (+12). Full suite: 2535 passed, 17 skipped, 1 xfailed. ruff + mypy + black clean.
Stage 4 complete except for the codegen drift test (deferred to follow-up PR per design doc, Stage 4 file plan note).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(decisioning): final review fixes: typo fail-fast + namespace hygiene
Final cross-cutting reviewer pass (code-reviewer + dx-expert + adtech-product-expert). Four highest-leverage items addressed; broader product feedback (URL-path account resolution, webhook-on-terminal-state, ErrorCode StrEnum) explicitly deferred to v6.1 per design doc roadmap.
Fixes:
1. **DX P0: typo specialism slugs raise instead of warn.** Adopter typing "sales-non-guarateed" (missing 'n') previously got a UserWarning + 0 tools advertised: the server boots, then silently 404s every buyer call. Adopters running `python hello_seller.py` never see warnings on stderr's default filter. Now: difflib close-match (cutoff 0.7) raises AdcpError("INVALID_REQUEST") with "Did you mean..." hint AND structured details for tooling. Truly novel slugs (no close match) still get the soft UserWarning for forward-compat with v6.x+ specs.
2. **Code P1: don't leak task_id namespace into media_buy_id.** The hybrid example's _async_trafficker_review returned media_buy_id=f"mb_reviewed_{task_ctx.id}"; adopters copying this produce buyer-visible cross-namespace confusion. Switched to a fresh uuid prefix; integration test asserts task_id is NOT a substring of media_buy_id.
3. **Code P1: TaskHandoffContext.update suppression documented in example.** The handoff fn docstring now explicitly notes that registry write failures are logged at WARNING and suppressed.
4. **Code P1: logger placement in task_registry.py.** Moved `logger = logging.getLogger(__name__)` below the import block per PEP 8 (was placed mid-imports as a convenience).
Test additions:
- test_validate_platform_raises_on_typo_specialism
- test_validate_platform_warns_on_novel_specialism (renamed from warns_on_unknown; preserves the forward-compat path)
Foundation tests: 162 (+1). Full suite: 2536 passed. ruff + mypy + black clean.
Items intentionally deferred to v6.1 / follow-up PRs per design doc roadmap:
- Product P0: URL-path AccountStore mode for salesagent integration
- Product P0: webhook-on-terminal-state for HITL polling avoidance
- Product P0: idempotency middleware integration with composite caller_identity
- Product P1: update_media_buy hybrid (gated on adcp#3392)
- DX P1: ErrorCode StrEnum codegen (deferred follow-up)
- DX P1: SalesResult union split (API change, defer)
- DX P0: hello_seller.py size. The file is 210 lines because sales-non-guaranteed requires 5 methods. Docstring accurate; rename / smaller-specialism alternative deferred.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(examples,server): cherry-pick storyboard CI fixes from PR #321
Foundation branch (PR #316) is failing the storyboard CI on examples/seller_agent.py for the same reasons as PR #321: the seed_product defaults, format_ids agent_url normalization, TERMS_REJECTED measurement-terms gate, and context-echo on comply_test_controller.
Until #321 lands on main, the foundation branch inherits the same broken state.
Cherry-picks the four squashed commits from ``bokelley/storyboard-seed-product-complete``:
- examples/seller_agent.py: seed_product non-empty defaults (publisher_properties minItems:1, format_ids[].agent_url, reporting_capabilities.available_reporting_frequencies); format_ids agent_url normalization; targeting_overlay / creative_assignments / creatives round-trip on create + update; TERMS_REJECTED gate covering vendor mismatch / variance < 10 / unsupported windows, with seller-vendor and common windows accepted; defensive non-dict measurement_terms coercion.
- src/adcp/server/test_controller.py: dispatcher echoes the wire ``context`` field onto every comply_test_controller response per the comply-test-controller-response schema. Storyboards thread state across steps via $context.<field> resolution; without echo the create_media_buy_async track fails on force_arm_submitted → create_media_buy_submitted handoff.
- tests/test_seller_agent_storyboard.py: 18 storyboard regression tests (seed_product schema-shape, fixture-fields-not-overwritten, format_ids edge cases, TERMS_REJECTED variants for vendor/variance/window/threshold, targeting_overlay round-trip, defensive coercion).
- tests/test_test_controller_context.py: 3 new tests covering wire context echo dispatch behavior.
Foundation tests: 162 (decisioning) + 33 (storyboard) + everything else. Full suite: 2558 passed, 17 skipped, 1 xfailed.
This commit duplicates work in PR #321; when that PR merges to main, the foundation branch's rebase will drop these commits cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(decisioning): round-5 Emma review: spec drift + governance + wire shape
Three P0 blockers + four P1 review items from a fresh read of the foundation PR against the canonical spec at schemas/cache/enums/specialism.json.
P0:
- Drop sales-streaming-tv, sales-exchange, sales-retail-media from REQUIRED_METHODS_PER_SPECIALISM; none of them exist in the spec enum. Add SPEC_SPECIALISM_ENUM constant mirroring the on-disk enum, with a unit test that pins it to the schema cache so out-of-band drift surfaces in CI. Typo detection now runs against the full spec enum, not just the v6.0 enforced subset; an unknown slug that matches a real spec slug we don't yet enforce method coverage for emits a distinct "spec-recognized but unenforced" UserWarning (separate from the "novel" forward-compat warning).
- Add governance-aware-seller to GOVERNANCE_SPECIALISMS. Without it, a seller agent claiming the slug skipped the governance-aware/StateReader fail-fast: a silent governance-gate bypass.
- Drop task_type from the synchronous Submitted wire envelope per schemas/cache/core/protocol-envelope.json. The field stays on TaskRecord (tasks/get reads it) but the wire never carries the Python method name.
P1:
- InMemoryTaskRegistry.update_progress: terminal-state guard. A straggler progress write against a completed/failed task no longer resurrects a "working" appearance for tasks/get readers holding the prior terminal state.
- ExplicitAccounts: drop the unsupported "auth-info available for scope checks" claim from the docstring; `del auth_info` actually discards it. Adopters needing principal-vs-account scope checks implement AccountStore directly.
- TaskRegistry production-mode gate: distinguish "is_durable marker absent" (programmer error, fails fast in any env) from "marker present and False in prod" (deployment misconfig). Without the split, a duck-typed registry without the marker would surface a misleading "non-durable refused" error.
- handler.py: clarify that the cast() lines are static-typing hints, not runtime validation. Adopters returning plain dicts that match the wire shape are supported by the framework's transport layer.
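The terminal-state guard in the P1 list above can be illustrated with a minimal sketch. This is not the framework's InMemoryTaskRegistry: the class names, the status strings, and the boolean return value are hypothetical stand-ins chosen for illustration, assuming a registry that tracks one status per task.

```python
from dataclasses import dataclass, field

# Hypothetical status names; the real registry's states may differ.
TERMINAL_STATES = frozenset({"completed", "failed"})


@dataclass
class SketchTaskRecord:
    """Illustrative record, not the framework's TaskRecord."""
    task_id: str
    status: str = "working"
    progress: dict = field(default_factory=dict)


class SketchTaskRegistry:
    """Toy in-memory registry showing a terminal-state guard."""

    def __init__(self) -> None:
        self._tasks = {}

    def create(self, task_id: str) -> SketchTaskRecord:
        record = SketchTaskRecord(task_id=task_id)
        self._tasks[task_id] = record
        return record

    def complete(self, task_id: str) -> None:
        self._tasks[task_id].status = "completed"

    def get(self, task_id: str) -> SketchTaskRecord:
        return self._tasks[task_id]

    def update_progress(self, task_id: str, progress: dict) -> bool:
        record = self._tasks[task_id]
        if record.status in TERMINAL_STATES:
            # Straggler write after completion or failure: drop it so the
            # task never appears to be "working" again.
            return False
        record.status = "working"
        record.progress = dict(progress)
        return True
```

A late progress write after `complete()` is ignored here, so a reader polling the record keeps seeing the terminal status and the last progress payload written before it.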
Cleanup: prune docstring references to the dropped fake specialism slugs in specialisms/sales.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
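The typo fail-fast described in the commit messages above (close match raises, truly novel slug only warns) could be sketched roughly as follows. Only `difflib.get_close_matches` and the 0.7 cutoff come from the commit text. `KNOWN_SLUGS` is an illustrative subset rather than the spec enum, and `SlugError` and `check_specialism` are hypothetical stand-ins for the framework's AdcpError("INVALID_REQUEST") and validate_platform.

```python
import difflib
import warnings

# Illustrative subset only; the real enum lives in the spec schema cache.
KNOWN_SLUGS = ("sales-non-guaranteed", "governance-aware-seller")


class SlugError(ValueError):
    """Hypothetical stand-in for AdcpError("INVALID_REQUEST")."""

    def __init__(self, message: str, suggestions: list) -> None:
        super().__init__(message)
        # Structured details a tool could read without parsing the message.
        self.suggestions = suggestions


def check_specialism(slug: str, known=KNOWN_SLUGS, cutoff: float = 0.7) -> bool:
    """Return True for a known slug, raise on a likely typo, warn otherwise."""
    if slug in known:
        return True
    close = difflib.get_close_matches(slug, known, n=3, cutoff=cutoff)
    if close:
        # A near miss is almost certainly a typo, so fail fast.
        raise SlugError(
            f"Unknown specialism {slug!r}. Did you mean {close[0]!r}?", close
        )
    # No close match: treat as a possibly newer slug and only warn.
    warnings.warn(
        f"Unrecognized specialism {slug!r}; no method coverage enforced.",
        UserWarning,
        stacklevel=2,
    )
    return False
```

Under these assumptions, `check_specialism("sales-non-guarateed")` raises with `sales-non-guaranteed` as the first suggestion, while a slug with no near neighbor falls through to the warning path.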
🤖 I have created a release beep boop
1.0.2 (2025-11-06)
Bug Fixes
This PR was generated with Release Please. See documentation.