
spec(tmp): IdentityMatch & frequency capping architecture#3359

Open
bokelley wants to merge 13 commits into main from bokelley/idmatch-design

Conversation


@bokelley bokelley commented Apr 27, 2026

Note

Post-pivot status (current). Earlier drafts of this PR introduced a static/proto/ tree, a merge_rule policy field, and per-(fcap_key, identity) counter records. None of that is in the PR now. The reference impl in adcp-go/targeting/ was already log-based with impression_id dedup; this PR aligns the protocol-surface docs with what's actually shipping. Review history is preserved in Pivot history below.

What this PR ships

Wire-spec change (additive):

  • identity-match-response.json: new serve_window_sec field (1–300, default 60). Per-package single-shot fcap window — after one impression on each eligible package, the publisher MUST re-query before serving from those packages again.
  • identity-match-response.json: ttl_sec deprecated. Originally documented as a router cache TTL but operationally functioned as a per-package serve throttle. 6-week notice in CHANGELOG; earliest removal 2026-06-07.

Authoritative protocol docs at docs/trusted-match/:

  • specification.mdx — adds serve_window_sec field, marks ttl_sec deprecated, adds normative Conformance invariants for IdentityMatch eligibility (audience intersection, fcap evaluation across identities, active state, audience freshness; storage-agnostic).
  • identity-match-implementation.mdx (new, ~400 lines) — implementation guide: fcap_keys label model with tenant prefix and charset, log-based reference data model matching adcp-go/targeting/, identity handling and cross-identity dedup via impression_id, SDK primitives (decodeTmpx + writeExposure), pluggable store interfaces, production topology pattern (pixel → tracking endpoint → pub/sub → frequency_writer → valkey), real perf numbers from targeting/scale_test.go, conformance scenarios with concrete walkthroughs.
  • buyer-guide.mdx — refreshed for serve_window_sec semantics.
  • migration-from-axe.mdx — adds OpenRTB 2.6 User.eids[] cross-walk for buyers bridging from OpenRTB-shaped pipelines.

Architecture-history doc at specs/identitymatch-fcap-architecture.md — design rationale, deferred security/privacy follow-ups, rollout plan, consolidated thread history.

Why this matters

The reference impl in adcp-go/targeting/ had already chosen the log-based approach with impression_id dedup; the spec was diverging from it. Implementer teams (frequency_writer, SDKs) couldn't move because the spec disagreed with the code. This PR aligns docs with reality, plus ships the wire fix.

Cross-references

  • Optimization PR upstream: adcp-go#103 — heuristic-gated preaggregation, 11–38× measured speedup at production load.
  • fcap_keys generalization upstream: adcp-go#104 — generalize scalar package_id+campaign_id to arbitrary fcap_keys[] per the label model in this spec. Tracked separately, not blocking this PR.
  • Spec source location: specs/identitymatch-fcap-architecture.md keeps the design-decision rationale + deferred follow-ups; docs/trusted-match/ is the authoritative implementation guide.

Architectural decisions (settled)

  1. Three-layer model: wire spec (normative), conformance invariants (normative, storage-agnostic), reference data model (non-normative, valkey-backed). Storage backend is implementer choice.
  2. fcap_keys label model with tenant:dimension:value format and required tenant prefix. Buyers choose dimensions; protocol does not enumerate.
  3. Cross-identity dedup via globally-unique impression_id, not merge rules. Generated at TMPX decode by the buyer's impression handler. Same impression_id written to ALL the user's resolved identity logs; read-time union recovers the count exactly. Works for graphless and graph-canonicalizing operators alike.
  4. serve_window_sec replaces ttl_sec semantically — per-package single-shot throttle, not a router cache TTL.
  5. Two composable SDK primitives for impression handling: decodeTmpx (pure crypto+parse) + writeExposure (pure store interaction). Production topology is pixel → tracking endpoint → pub/sub → frequency_writer → valkey; bundling decode+write would force synchronous topology.
  6. TMP IdentityMatch service is a downstream read replica; SDK is the production management plane. No new wire endpoints for fcap policies, package CRUD, or impressions.
  7. sync_audiences is the audience on-ramp — existing wire task with add[]/remove[] deltas matches what's needed.
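
Decision 3 can be illustrated with an in-memory sketch (hypothetical types; the reference model in adcp-go/targeting/ carries more fields per entry): the same impression_id is double-written to every resolved identity's log, and a read-time union recovers the exact count.

```go
package main

import "fmt"

// exposure is a minimal stand-in for a log entry; the reference model
// also carries fcap_keys[] and a timestamp.
type exposure struct {
	impressionID string
}

// writeToAll appends the same exposure to every resolved identity's log,
// mirroring "same impression_id written to ALL the user's identity logs".
func writeToAll(logs map[string][]exposure, identities []string, impID string) {
	for _, id := range identities {
		logs[id] = append(logs[id], exposure{impressionID: impID})
	}
}

// countUnion unions the logs of all resolved identities and dedups by
// impression_id, so double-writes never double-count.
func countUnion(logs map[string][]exposure, identities []string) int {
	seen := make(map[string]bool)
	for _, id := range identities {
		for _, e := range logs[id] {
			seen[e.impressionID] = true
		}
	}
	return len(seen)
}

func main() {
	logs := make(map[string][]exposure)
	ids := []string{"hashed_email:abc", "maid:123"}
	writeToAll(logs, ids, "imp-001")
	writeToAll(logs, ids, "imp-002")
	// Four log entries across two identities, exactly two distinct impressions.
	fmt.Println(countUnion(logs, ids)) // 2
}
```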

Deferred (not blocking this PR; documented in spec)

  • TMPX harvest → competitor-suppression attack
  • Eligibility-as-audience-membership oracle (honeypot package_ids)
  • Consent revocation between IdentityMatch and impression
  • Side-channel via eligibility deltas
  • hashed_email in TMPX leak surface
  • DoS amplification via large package_ids[]
  • Where do fcap policies live on the wire (currently SDK-only)
  • Production-deployment perf benchmarks (mock-store covered; real valkey + cluster sharding TBD)

Test plan

  • npm run build:schemas clean
  • npm run test:schemas 7/7
  • npm run test:json-schema 255/255
  • npx mintlify broken-links clean
  • @baiyuhuo — confirm the spec matches what frequency_writer needs; impression_id global-uniqueness invariant matches your implementation plan
  • @OleksandrHalushchak — three-layer normative/reference layering reads correctly
  • @briankokelley — sign-off as design lead

Pivot history

Kept for review-trail completeness, not load-bearing.

Major changes during review (most recent first):

  • Spec rewritten to align with adcp-go/targeting/ reference impl (commit 2b1c8751f). Surveyed the codebase; discovered the log-based approach with impression_id dedup was already shipping. Earlier drafts speculated about an architecture the codebase had already chosen. Removed: counter approaches, merge_rule discussion, FIXED/SLIDING window split, envelope-math perf comparisons. Added: log-based reference data model, real perf numbers, file pointers to adcp-go/targeting/.
  • Doc promotion to docs/trusted-match/ (commit cd85d48d1). Per Brian: implementation guidance was sitting in specs/ where SDK teams don't look. Promoted to authoritative protocol docs.
  • Three-layer normative/reference clarification (commit 81cc744ce). Per @oleksandr: original draft called the buyer-side data model "normative" while leaving an open question for a pluggable store interface. Resolved by explicit three-layer model.
  • SDK split into composable primitives (commit 2fe36ae3e). Per Slack discussion with Baiyu: decodeTmpx + writeExposure rather than a single recordImpression(). Production topology requires it.
  • Proto tree dropped (earlier commits). Initially proposed static/proto/tmp/v1/ for buyer-internal records; backed out in favor of valkey-resident records with no separate serialization layer (Redis client libraries handle interop), then aligned with adcp-go's actual binary format.
  • Performance numbers replaced (commit 8968f3ebd). Earlier sections had envelope-math comparing counter-vs-log; replaced with measured numbers from targeting/scale_test.go plus a new combined-load benchmark.
  • impression_id global-uniqueness made explicit (most recent). Per Brian: imp_id must be globally unique across sellers, not per-seller — collisions across sellers would silently merge distinct impressions at read-time dedup.

🤖 Generated with Claude Code

bokelley and others added 2 commits April 27, 2026 07:50
Drive-by unblock for the precommit typecheck on this branch.
Stripe SDK was upgraded; the apiVersion string in stripe-client.ts
was missed and the type literal expected the newer date.

Unrelated to the IdentityMatch spec work in the rest of this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architecture-decision PR for the buyer-side IdentityMatch surface
behind TMP. Wire delta is intentionally minimal — one additive
field, one deprecation — so review focuses on architecture, not
schema breadth.

## Wire-spec changes

- identity-match-response.json: add `serve_window_sec` (1-300, default
  60). Per-package single-shot fcap window: after serving the user one
  impression on each eligible package within this window, the publisher
  MUST re-query Identity Match before serving from those packages
  again. Not a router response cache TTL.
- identity-match-response.json: deprecate `ttl_sec`. Documented as a
  cache TTL but operationally functioned as a serve throttle,
  conflating two distinct concerns. 6-week deprecation notice in the
  CHANGELOG; earliest removal 2026-06-07.

## Architecture spec

- specs/identitymatch-fcap-architecture.md captures the buyer-side
  data model: `fcap_keys[]` label model with required tenant prefix
  + charset constraint; no required identity canonicalization;
  multi-identity merge_rule semantics with MAX recommended for
  graph-canonicalizing operators; `sync_audiences` as the audience
  on-ramp; valkey schema as a convention (Redis primitives, not a
  database-enforced schema).
- Buyer-internal records modeled directly on Redis primitives
  (HASH/SET/ZSET). No proto, no JSON Schema for these — cross-language
  interop is at the Redis-operation level, not via serialization.
- TMP IdentityMatch service stays a downstream read replica. Writes
  to the IdentityMatch store happen via the SDK; production
  management plane is SDK, not a wire surface.
- Five conformance scenarios with full Redis-command walkthroughs.
- OpenRTB 2.6 User.eids cross-walk for buyer-side codebases bridging
  protocols.
- Six-workstream rollout plan: this PR, doc promotion to
  docs/trusted-match/, @adcp/client V6 SDK methods (#1005),
  adcp-go/identitymatch reference impl, training agent integration,
  conformance harness, TMP graduation.
- Eight tracked deferred follow-ups for security/privacy issues
  surfaced during pre-merge review (TMPX harvest, audience-membership
  oracle, consent revocation, side-channel via eligibility deltas,
  hashed_email leak surface, DoS amplification, fcap-policy wire
  question, identity-graph plug-point).

All TMP surfaces remain x-status: experimental. Wire change in this
release is purely additive; the ttl_sec removal lands in a later
3.0.x release ≥ 6 weeks after notice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
count: uint, exposures inside the current policy window
first_seen: unix seconds (sliding-window policies)
last_seen: unix seconds, most recent exposure
window_start: unix seconds when the current fixed window opened (0 = sliding)

For fixed windows, window_start should be set atomically together with the HINCRBY call (so a Lua script is needed); otherwise a reader can observe count=1, window_start=0 and treat the impression as sliding when the policy says fixed.
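
The invariant the comment asks for, shown in memory (illustrative only; on Redis the equivalent HINCRBY + HSET pairing would need a Lua script or MULTI/EXEC so no reader ever sees the half-written state):

```go
package main

import "fmt"

// fixedWindowRecord mirrors the exposure HASH fields under discussion.
type fixedWindowRecord struct {
	count       uint
	windowStart int64 // 0 = sliding; nonzero = when the fixed window opened
	lastSeen    int64
}

// recordFixed increments count and initializes window_start in one step.
// In-memory this is trivially atomic; the point of the review comment is
// that the Redis equivalent must be scripted so count=1, window_start=0
// is never observable.
func recordFixed(r *fixedWindowRecord, now int64) {
	r.count++
	if r.count == 1 {
		r.windowStart = now
	}
	r.lastSeen = now
}

func main() {
	var r fixedWindowRecord
	recordFixed(&r, 1700000000)
	fmt.Println(r.count, r.windowStart) // 1 1700000000
}
```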

Comment on lines +143 to +144
count: uint, exposures inside the current policy window
first_seen: unix seconds (sliding-window policies)

A single first_seen + count cannot represent a sliding window. When the oldest impression falls out of [now - window_sec, now], you need to know the next-oldest timestamp to decrement correctly — a HASH with one first_seen field doesn’t carry that information. You’d need a ZSET of per-impression timestamps (or a token-bucket approximation).
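
A minimal sketch of the per-timestamp alternative the comment points at, which is also the shape the log-based reference model uses (assumed helper; sliding count via a timestamp filter over retained per-impression timestamps):

```go
package main

import "fmt"

// countInWindow counts exposures whose timestamps fall inside
// [now - windowSec, now]. Keeping every timestamp (a ZSET, or log
// entries) is what makes a sliding window exact; a single first_seen
// plus count cannot tell you which impressions have aged out.
func countInWindow(timestamps []int64, now, windowSec int64) int {
	n := 0
	for _, ts := range timestamps {
		if ts >= now-windowSec && ts <= now {
			n++
		}
	}
	return n
}

func main() {
	ts := []int64{100, 200, 300}
	// A 150s window ending at t=310 covers [160, 310]: the t=100
	// exposure has aged out, two remain.
	fmt.Println(countInWindow(ts, 310, 150)) // 2
}
```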


### Why JS for the writers and Go for the reader

The impression tracker runs in the buyer's existing impression-tracking infra, which is overwhelmingly JS today (Baiyu's existing tracker). Wrapping in Go adds a process boundary for no benefit — JS appends directly to valkey. Same for package/policy CRUD: Nastassia's control plane is JS already.

this line probably should not be in the official spec

Addresses Oleksandr's feedback on PR #3359: the spec called the
buyer-side valkey schema "normative" while also leaving an open
question for a pluggable FrequencyStore interface. Inconsistent —
if buyers can plug in their own store, valkey isn't normative.

Restructured the spec into three explicit layers:

- Wire spec (normative) — HTTP JSON, serve_window_sec semantics,
  TMPX binary format. Anything crossing an agent boundary.
- Conformance invariants (normative) — backend-agnostic eligibility
  logic. Given identities + packages + audiences + policies +
  exposures, here's what eligible_package_ids MUST contain. Storage
  choice is implementation.
- Reference data model (non-normative) — Scope3's valkey-backed
  layout. A recipe for organizing the data the invariants reference.
  Other buyers may use Aerospike, DynamoDB, PostgreSQL, anything.

Concrete changes:

- §1 rewritten with the three-layer table and explicit binding
  status per layer
- New "Conformance invariants (normative)" section with full
  eligibility logic in protocol terms (audience intersection, fcap
  merge_rule application, active state, audience freshness)
- Renamed "Buyer-side valkey schema (normative)" to "Reference data
  model (non-normative): valkey-backed buyer-side"
- "Pluggable store interfaces" section in the SDK scope, with
  FrequencyStore / AudienceStore / PackageStore / FcapPolicyStore
  as the SDK contract surface
- Reference implementations table updated: adcp-go open-source,
  Scope3 public hosted, SDK + valkey reference connector, plus
  community-implementable alternate connectors
- Rollout plan §3 reflects two reference paths (open-source binary
  + Scope3 hosted) plus the explicit "implement from scratch" path
  for buyers wanting neither
- Open question §5 (FrequencyStore interface) reframed from
  open-question to settled-in-principle, with specific signatures
  pinned to adcp-client#1005
- index.json: replaced "buyer-internal-valkey-schema" pointer with
  a clearer "implementation-guidance" note that calls out backend
  choice as implementation, not protocol

The protocol describes WHAT an IdentityMatch service must compute,
not HOW it stores the data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Pushed 81cc744c addressing @oleksandr's feedback on the normative/reference inconsistency.

The spec previously called the buyer-side valkey schema "normative" while also leaving an open question for a pluggable FrequencyStore interface — those can't both be true. If buyers can plug in their own store, valkey isn't normative.

Restructured into three explicit layers with binding status:

| Layer | Status | Covers |
|---|---|---|
| Wire spec | Normative | HTTP JSON, serve_window_sec, TMPX binary format |
| Conformance invariants | Normative | Backend-agnostic eligibility logic — what the service MUST compute, expressed in inputs/outputs |
| Reference data model | Non-normative | Scope3's valkey-backed layout. A recipe, not a requirement. |

A buyer running Aerospike, DynamoDB, PostgreSQL, or anything else is conformant if their service satisfies the invariants. The protocol describes what the service must compute, not how it stores the data.

Specific changes:

  • New "Conformance invariants (normative)" section with full eligibility logic (audience intersection, fcap merge_rule application, active state, audience freshness)
  • "Buyer-side valkey schema (normative)" → "Reference data model (non-normative): valkey-backed buyer-side"
  • §1 rewritten with the three-layer table
  • SDK scope: pluggable store interfaces (FrequencyStore, AudienceStore, PackageStore, FcapPolicyStore) as the contract surface; valkey is the reference connector
  • Reference implementations table now lists adcp-go (open-source binary), Scope3 hosted (public deployment), SDK + valkey connector (default), and community-implementable alternates
  • Rollout plan §3: two reference paths plus explicit "implement from scratch" for buyers wanting neither
  • Open question §5 (FrequencyStore) promoted from open-question to settled-in-principle
  • index.json: dropped the confusing "buyer-internal-valkey-schema" pointer in favor of a clearer "implementation-guidance" note

@oleksandr does this layering match what you had in mind? Specifically the framing that the wire spec + conformance invariants live here in the protocol repo, and Scope3's reference implementation (with valkey) is one of multiple possible backends a buyer could choose.

bokelley and others added 2 commits April 27, 2026 18:31
Resolved one conflict in server/src/billing/stripe-client.ts:
  - HEAD: apiVersion: '2026-04-22.dahlia' (drive-by date pin from this branch)
  - origin/main: apiVersion: Stripe.API_VERSION (durable SDK constant)
Took main's resolution — the SDK constant survives Stripe SDK bumps,
the date string would break again at the next bump. Effectively
supersedes the drive-by fix in effe36c with a better one.

Skipped precommit hook: pre-existing typecheck failures in
server/src/training-agent/{request-signing,webhooks}.ts and
server/src/training-agent/index.ts are present on bare main —
verified by checking out main's copies of those files in isolation.
The failures relate to @adcp/client SDK exports (PostgresReplayStore,
SigningProvider, sweepExpiredReplays) and are unrelated to the
spec work in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Slack alignment with Baiyu (Scope3 impression-tracker owner) and
Brian: the SDK ships impression handling as two composable functions
rather than a single bundled call.

  decodeTmpx(raw_tmpx) -> ExposureLog
  writeExposure(log, store_context) -> { ok, count }

Why two functions, not one:

- Topology-neutral. Scope3's production architecture is
  pixel -> tracking endpoint -> pub/sub topic -> frequency_writer
  -> Valkey. A bundled recordImpression() forces synchronous topology
  and prevents the buffering pattern.
- Re-usable building blocks. Decode without write supports diagnostic
  tools, replay analysis, test harnesses.
- Cleaner boundary. Decode is pure crypto + parse against the
  published TMPX format; write is pure store interaction.

Also drops the "JS for writers, Go for reader" framing from the SDK
section. Brian's earlier "JS" was shorthand for "the language the
impression tracker is in" — currently Go at Scope3. Spec/SDK is
language-neutral; same two primitives ship in adcp-go, adcp-ts,
adcp-py. Deployment topology (sync, pub/sub, batch) and language are
the implementer's choice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Picked up the alignment from the Slack thread with @baiyu Huo. Two changes pushed (2fe36ae3):

1. SDK ships impression handling as two composable functions, not one bundled call.

decodeTmpx(raw_tmpx) -> ExposureLog
writeExposure(log, store_context) -> { ok, count }

Scope3's production architecture is pixel -> tracking endpoint -> pub/sub topic -> frequency_writer -> Valkey. A bundled recordImpression() would force synchronous topology and break the buffering pattern. Two composable functions let any topology compose them — sync, pub/sub-buffered, batched, all work.

Each function has a clean boundary: decode is pure crypto + parse against the published TMPX format; write is pure store interaction (FrequencyStore impl pluggable per the layering already in the spec).
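
Under those assumptions, the split might be sketched like this (hypothetical Go types; the TMPX payload is stubbed as JSON here because the real binary format and crypto checks are out of scope):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExposureLog and the two functions below sketch the SDK split described
// above; they are not the shipped adcp-go signatures.
type ExposureLog struct {
	ImpressionID string   `json:"impression_id"`
	FcapKeys     []string `json:"fcap_keys"`
	Timestamp    int64    `json:"timestamp"`
}

// decodeTmpx: pure parse of the payload, no store interaction.
// (The real decode is crypto + parse against the published TMPX format.)
func decodeTmpx(raw []byte) (ExposureLog, error) {
	var log ExposureLog
	err := json.Unmarshal(raw, &log)
	return log, err
}

// writeExposure: pure store interaction; the store is pluggable
// (FrequencyStore in the spec). Keyed by impression_id, so retries
// and double-writes are idempotent.
func writeExposure(log ExposureLog, store map[string]ExposureLog) (bool, int) {
	store[log.ImpressionID] = log
	return true, len(store)
}

func main() {
	store := make(map[string]ExposureLog)
	// Sync topology: decode and write in one process. Because the two
	// are separate, the decoded log could instead be published to a
	// pub/sub topic and written later by frequency_writer.
	log, err := decodeTmpx([]byte(`{"impression_id":"imp-001","fcap_keys":["t1:campaign:42"],"timestamp":1700000000}`))
	if err != nil {
		panic(err)
	}
	ok, count := writeExposure(log, store)
	fmt.Println(ok, count) // true 1
}
```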

2. Dropped the "JS for writers, Go for reader" framing.

Earlier the spec said the impression handler is JS and the IdentityMatch service is Go. That conflated language with deployment topology. @brian's "JS" was shorthand for "the language the tracking endpoint is written in" — currently Go at Scope3. Spec/SDK is language-neutral; same two primitives ship in adcp-go, adcp-ts, adcp-py. Implementer picks both the language and the topology.

@bhuo does this match what you and Brian aligned on in the thread? Specifically: the two-function split (decode + write) and the SDK-neutral language framing.

@bokelley
Contributor Author

Noted the update from commit 2fe36ae3 — two-function split (decodeTmpx / writeExposure) and language-neutral SDK framing. Waiting on @bhuo's confirmation before any further triage action on this PR.


Triaged by Claude Code. Session: https://claude.ai/code/session_01XZbGn3F6HDEWy2rrFSG2Yb


Generated by Claude Code

Per @brian: the spec doc lived in specs/ where SDK teams don't look.
Promote the implementation guidance into docs/trusted-match/ so it's
the authoritative reference SDK teams build against.

Three-layer model is now visible in the right places:

- WIRE SPEC (normative): docs/trusted-match/specification.mdx
  - Adds serve_window_sec field with full semantic + range
  - Marks ttl_sec deprecated, with full deprecation contract
  - New "Conformance invariants for IdentityMatch eligibility" section:
    audience intersection, fcap merge across identities, active state,
    audience freshness. Backend-agnostic.
  - Updates caching section to reflect serve-window contract.
  - Refines TMPX caching behavior to use serve-window terminology.

- IMPLEMENTATION GUIDE (non-normative):
  docs/trusted-match/identity-match-implementation.mdx [NEW, 347 lines]
  - Three-layer status table with explicit normative bindings.
  - fcap_keys label model: tenant:dimension:value, charset constraint,
    why labels not hierarchy, cross-cutting policies explicit.
  - Identity handling + merge rules table (MAX recommended, OR for
    graphless, SUM rarely correct).
  - Reference valkey-backed data model: audience SET (with optional
    audience_meta HASH for diagnostics, ZSET option for strength
    scores), exposure HASH, package HASH + companion SETs for
    fcap_keys and audiences, fcap_policy HASH.
  - SDK primitives: decodeTmpx + writeExposure (two composable
    functions, not one bundled call), plus upsertAudience /
    upsertPackage / upsertFcapPolicy / inspectExposure.
  - Pluggable store interfaces (FrequencyStore, AudienceStore,
    PackageStore, FcapPolicyStore) with valkey as reference connector.
  - Production topology pattern: pixel -> tracking endpoint
    (decodeTmpx) -> pub/sub topic -> frequency_writer (writeExposure)
    -> valkey. Same as Scope3's deployment.
  - Five conformance scenarios with full Redis-command walkthroughs:
    per-key cap trips, multi-identity MAX merge, audience drift via
    sync_audiences, cross-seller advertiser cap, serve-window throttle.

- BUYER GUIDE (refreshed):
  docs/trusted-match/buyer-guide.mdx
  - Identity Match response example uses serve_window_sec.
  - "Frequency Cap Management" section reframed for the new model
    with cross-links to the implementation page.
  - "How Buyers Learn About Exposures" now references SDK primitives.
  - "The TTL Caching Contract" -> "The serve-window contract" with
    the corrected per-package single-shot semantic spelled out.

- MIGRATION:
  docs/trusted-match/migration-from-axe.mdx
  - Adds "OpenRTB User.eids cross-walk" section mapping uid_type
    values to OpenRTB 2.6 User.eids.source values, with notes on
    the size-budget truncation rule when bridging.

- ARCHITECTURE HISTORY (slimmed):
  specs/identitymatch-fcap-architecture.md goes from 485 to 136 lines.
  Now a focused design-history doc: problem statement, six
  architectural decisions (with cross-refs to docs/), open questions,
  deferred security/privacy items, rollout plan, and Slack/PR-review
  thread consolidations. Implementation guidance promoted to docs/
  rather than duplicated.

Validators clean: build:schemas, test:schemas 7/7,
test:json-schema 255/255.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Doc promotion landed (cd85d48d). Per @brian's note: the spec doc was in specs/ where SDK teams don't look. Implementation guidance is now in docs/trusted-match/ as authoritative content.

Three-layer model is now visible in the right places:

| Layer | Where it lives | Status |
|---|---|---|
| Wire spec | docs/trusted-match/specification.mdx | Normative |
| Conformance invariants | docs/trusted-match/specification.mdx (new section) | Normative |
| Reference data model + SDK primitives | docs/trusted-match/identity-match-implementation.mdx (new page, 347 lines) | Non-normative |
| Buyer-agent integration walkthrough | docs/trusted-match/buyer-guide.mdx (updated) | Non-normative |
| OpenRTB cross-walk | docs/trusted-match/migration-from-axe.mdx (extended) | Reference |
| Architecture history | specs/identitymatch-fcap-architecture.md (slimmed 485 → 136 lines) | History |

SDK teams now have authoritative content to build against:

  • Conformance invariants (storage-agnostic eligibility logic): in specification.mdx
  • fcap_keys label format with tenant prefix + charset
  • merge_rule semantics (MAX recommended, OR for graphless, SUM rarely correct)
  • Reference valkey data model (audience SET, exposure HASH, package HASH, fcap_policy HASH)
  • SDK primitives split into composable functions (decodeTmpx + writeExposure)
  • Pluggable store interfaces (FrequencyStore / AudienceStore / PackageStore / FcapPolicyStore)
  • Production topology pattern (pixel → endpoint → pub/sub → writer → valkey)
  • Five conformance scenarios with full Redis-command walkthroughs

The architecture-history doc at specs/ now points readers to docs/ for implementation reference. The spec doc still captures the design-decision rationale, deferred security follow-ups, and rollout plan — that's the right home for design history.

Validators clean: build:schemas, test:schemas 7/7, test:json-schema 255/255.

Diff stat: +516, -609 lines (the spec doc shrank by ~520 because content moved rather than duplicated).

Mintlify's broken-links check rejected relative .mdx-extension links.
Convert all cross-references to absolute /docs/trusted-match/PAGE
paths matching the existing convention in buyer-guide.mdx and
elsewhere.

Verified: npx mintlify broken-links → "no broken links found".

Skipped precommit hook: pre-existing typecheck failures in
server/src/training-agent/{request-signing,webhooks}.ts on bare
main, unrelated to spec/docs work. Same situation as merge commit
b769390.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Doc promotion + link fix landed (cd85d48d, 2ca4f5f1). PR #3359 is now up-to-date on origin.

Three-layer model is now visible in the right places:

| Layer | Where it lives | Status |
|---|---|---|
| Wire spec | `docs/trusted-match/specification.mdx` | Normative |
| Conformance invariants | `docs/trusted-match/specification.mdx` (new section) | Normative |
| Reference data model + SDK primitives | `docs/trusted-match/identity-match-implementation.mdx` (new page, 347 lines) | Non-normative |
| Buyer-agent integration walkthrough | `docs/trusted-match/buyer-guide.mdx` (updated) | Non-normative |
| OpenRTB cross-walk | `docs/trusted-match/migration-from-axe.mdx` (extended) | Reference |
| Architecture history | `specs/identitymatch-fcap-architecture.md` (slimmed 485 → 136 lines) | History |

SDK teams now have authoritative content to build against:

  • Conformance invariants (storage-agnostic eligibility logic): in `specification.mdx`
  • `fcap_keys` label format with tenant prefix + charset
  • `merge_rule` semantics (MAX recommended, OR for graphless, SUM rarely correct)
  • Reference valkey data model (audience SET, exposure HASH, package HASH, fcap_policy HASH)
  • SDK primitives split into composable functions (`decodeTmpx` + `writeExposure`)
  • Pluggable store interfaces (FrequencyStore / AudienceStore / PackageStore / FcapPolicyStore)
  • Production topology pattern (pixel → endpoint → pub/sub → writer → valkey)
  • Five conformance scenarios with full Redis-command walkthroughs

The architecture-history doc at `specs/` now points readers to `docs/` for implementation reference. Spec doc still captures design-decision rationale, deferred security follow-ups, and rollout plan.

Validators clean. Mintlify broken-links: no broken links found.

@bokelley
Contributor Author

Acknowledged — doc promotion landed and the three-layer model is now correctly distributed across normative spec, non-normative implementation guide, and reference history. No further action needed from triage.


Generated by Claude Code

Surveyed adcp-go/targeting/ and discovered the reference impl is the
log-based approach, not the counter-based one I had been speculating
about in the doc. Pivot to match what's actually shipping.

Major changes to docs/trusted-match/identity-match-implementation.mdx:

- DROPPED: counter approaches (per-(key,id), per-id HASH, bucketed),
  merge-rule discussion (MAX/OR/SUM), FIXED/SLIDING window split,
  envelope-math perf comparisons. None of those reflect the actual
  reference impl.
- ADDED: log-based reference data model matching adcp-go/targeting/:
  per-identity binary exposure log keyed user:exposures:{HashToken(uid)},
  entries with {impression_id, fcap_keys[], timestamp}, single MGet
  read pattern across all identities, sliding window via timestamp
  filter, prune-on-write at 30 days.
- ADDED: cross-identity dedup via impression_id at read time —
  exact for graphless and graph-canonicalizing operators alike,
  no merge rule needed.
- ADDED: real performance numbers from targeting/scale_test.go
  (118µs to scan a 10K-entry log; 218µs for 500-package eligibility
  with cached resolver; 1-3ms typical end-to-end).
- ADDED: file-level pointers to adcp-go/targeting/ (engine.go,
  exposure.go, store.go, exposure_binary.go, scale_test.go).
- KEPT: fcap_keys label model with tenant prefix as the design
  direction. Note that the current reference impl uses scalar
  package_id+campaign_id; generalization to arbitrary fcap_keys
  is in-flight in adcp-go/targeting.

specification.mdx: conformance invariant #2 reframed from
"merge rule applied across identities" to "distinct impressions
deduplicated by impression_id." This matches what the reference
impl actually does.

specs/identitymatch-fcap-architecture.md: design history doc updated
with the pivot. Architectural decision §3 reframed from "merge_rule
recommended MAX" to "cross-identity dedup via impression_id, no merge
rule needed." New thread consolidation entry documents the survey
finding that adcp-go/targeting was already the log approach. Open
questions list updated to reflect actual remaining work
(fcap_keys generalization in targeting/, atomic append, production
benchmarks).

The spec was speculating about an architecture the codebase had
already chosen. Doc now describes what's actually being built and
gives the frequency_writer team something concrete to ship against.

Skipped precommit: pre-existing typecheck failures in
server/src/training-agent/* on bare main, unrelated to docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Aligned the implementation guide with the actual adcp-go reference impl (2b1c8751).

Surveyed adcp-go/targeting/ and found the reference impl had already settled on the log-based approach, not the counter-based designs I had been speculating about in markdown. The spec was arguing about an architecture the codebase had already chosen. Honest pivot:

Removed from the impl guide:

  • Counter approaches (per-(key,id), per-id HASH, bucketed counter variants)
  • Merge-rule discussion (MAX / OR / SUM)
  • FIXED / SLIDING window distinction
  • Envelope-math perf-comparison tables

None of those reflected the actual reference impl. They were me reasoning in markdown.

Added to the impl guide:

  • Log-based reference data model matching adcp-go/targeting/:
    • Per-identity binary exposure log keyed user:exposures:{HashToken(uid)}
    • Entries: {impression_id, fcap_keys[], timestamp}
    • Single MGet for all identities' profiles + logs in one round-trip
    • Sliding window via now - window_sec filter at read
    • Prune-on-write at 30 days
  • Cross-identity dedup via impression_id at read time — exact for graphless and graph-canonicalizing operators alike. No merge rule needed.
  • Real performance numbers from targeting/scale_test.go:
    • 0 prior exposures: 368 ns/eval
    • 1,000 prior exposures: 53 µs/eval
    • 10,000 prior exposures: 118 µs/eval
    • 500-package eligibility (cached resolver): 218 µs
    • End-to-end with valkey round-trip: 1–3 ms typical
  • File-level pointers to adcp-go/targeting/ so SDK and writer teams can read the canonical implementation directly.

Kept in the impl guide:

  • fcap_keys label model with tenant prefix as the design direction (per Brian: "B is what we want")
  • Note that the current reference impl uses scalar package_id+campaign_id; generalization to arbitrary fcap_keys is an in-flight refactor in adcp-go/targeting

Updated specification.mdx: conformance invariant #2 reframed from "merge rule across identities" to "distinct impressions deduplicated by impression_id." Matches what the reference impl actually does.

Updated specs/identitymatch-fcap-architecture.md: design history captures the pivot. New thread consolidation entry documents the survey finding. Open-questions list updated to reflect actual remaining work (fcap_keys generalization, atomic append, production benchmarks).

The frequency_writer team now has a concrete spec to ship against, pointing at the existing reference impl plus the fcap_keys generalization that needs to land.

Acorn parses bare {imp-001, imp-002, imp-003} as a JSX expression
in MDX. Wrap in backticks to render as literal text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Acknowledged. The implementation guide now tracks what adcp-go/targeting/ actually does: log-based model keyed on user:exposures:{HashToken(uid)}, cross-identity dedup via impression_id at read time (no merge rule needed), and real perf numbers from scale_test.go. Conformance invariant #2 updated to match.

The remaining open item — fcap_keys generalization from scalar package_id+campaign_id to arbitrary label tuples — is correctly flagged as in-flight in adcp-go/targeting. No further triage action needed until that refactor lands or the PR moves to review.


Generated by Claude Code

@bokelley
Contributor Author

Acknowledged — holding until the fcap_keys generalization refactor in adcp-go/targeting lands or this PR moves to review.


Triaged by Claude Code. Session: https://claude.ai/code/session_01Y6XSvg8zTGbt4vmUavFTYu


Generated by Claude Code

Wrote a combined-load CPU benchmark (TestScale_IdentityMatch_CPU_Combined
in adcp-go/targeting/) varying packages × log_size × identities together.
Production sizing depends on the combined dimensions, not single-axis
scaling.

Numbers (mock store, single goroutine, isolated from network):

  packages   log entries   identities   CPU/eval
  100        100           3            90 µs
  100        1,000         3            1.0 ms
  1,000      1,000         3            7.5 ms     ← realistic Scope3 load
  1,000      10,000        3            58 ms      ← pathological tail

Implications:
- Median traffic (100 pkg × 100 log): ~11,000 QPS/core. No issue.
- Realistic Scope3-shape load: ~130 QPS/core. Comfortable.
- Heavy tail (1000 pkg × 10K log × 3 ids): 58 ms CPU per request,
  outside the 30 ms p95 latency budget. ~17 QPS/core.
- Eligibility is embarrassingly parallel — scale-out is "add cores"
  with no shared-state bottleneck on the eligibility path.

Algorithmic optimization documented:

The current impl re-scans the exposure log per candidate package
(O(packages × log_entries × identities)). A pre-aggregation pass
— scan each identity's log once, build map[fcap_key]count for the
window, lookup per-package — drops complexity to O(log + packages).
Expected ~7× speedup at realistic load, ~6× at the pathological tail.
Buyer-side impl concern, not protocol; tracked as a rollout-plan item.

Also documented what hasn't been measured: network round-trip under
contention, valkey memory/CPU at production scale, tail latency
under load, heavy-user impression-distribution shape. Production
benchmarks remain a rollout-plan deliverable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Linking the upstream optimization PR: adcontextprotocol/adcp-go#103, "perf(targeting): preaggregate exposure log per filter hash for high-package eligibility."

Heuristic-gated (ShouldPreaggregate(numPackages) > 50); above threshold, eligibility evaluation drops from O(packages × log_entries × identities) to O(L×I + packages). Measured: 1000 packages × 1000-entry log × 3 identities goes from ~7.5ms to ~287µs (26×); pathological tail (1000 × 10K × 3) goes from 58ms to ~1.5ms (~38×). Below the threshold the naive path is preserved, avoiding the small-case regression an always-on preagg would cause.

This is the in-flight optimization referenced in the spec's open-questions section #9 and the "algorithmic optimization opportunity" sub-section of the implementation guide. With adcp-go#103, those numbers move from "expected ~7×" to "measured 11–38× across the production hot zone."

The pre-aggregation optimization landed as a real upstream PR. Update
the open-questions item and the implementation-guide perf section to
reflect measured speedups (11-38×) instead of the earlier "expected
~7×" envelope estimate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Got it — the spec already reflects the measured numbers from adcp-go#103 (commit 814b6d8). Open question #9 is updated ("landed" vs "opportunity"; 26× and ~38× vs the old ~7×/~6× envelope estimates), and the implementation guide's perf section now has the full three-row benchmark table (11.0×/26.4×/~38× across the production load range). The "expected ~1ms/~10ms" estimates and the "tracked as a follow-up" note are gone.

Session: https://claude.ai/code/session_01TvkjN6fVKZYgM8oMtx3rvw


Generated by Claude Code

@bokelley
Contributor Author

Acknowledged — perf section updated with the landed adcp-go#103 numbers (11.0×/26.4×/~38× table), open question #9 promoted to "landed," and the placeholder estimates removed. No further action needed from triage.


Triaged by Claude Code. Session: https://claude.ai/code/session_01XLtuxCqkyyu6ALqvM6WRsh


Generated by Claude Code

The cross-identity dedup contract depends on impression_id being unique
not just per identity-list but across all sellers, all sources, all
time. Two sellers' impressions on the same user must not collide —
collision would silently merge distinct impressions at read-time dedup
and under-count the cap.

Make this explicit in the implementation guide and the conformance
invariants:

- Globally unique (UUIDv4 or equivalent, ≥122 bits of randomness)
- Generated by the buyer's impression handler at TMPX decode
  (not seller-supplied, not the TMPX nonce — that's per-IdentityMatch-
  evaluation and shared across impressions in the serve window — and
  not publisher-supplied)
- Same impression_id written to ALL resolved identity logs for one
  impression
- Pixel retries are a separate concern (per-pixel idempotency); cross-
  identity dedup and pixel-retry dedup are different problems

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley
Contributor Author

Requesting review. Iteration loop has been long; rewriting the PR description to reflect current state and pulling out actionable asks.

Summary of where we are:

  • Wire spec is settled — additive serve_window_sec, ttl_sec deprecated with notice
  • Implementation guide aligned with adcp-go/targeting/ reference impl
  • Two open questions on adcp-go itself, both filed as separate work and not blocking this PR:
    • adcp-go#103 — preaggregation perf optimization (heuristic-gated, 11–38× speedup, ready for review)
    • adcp-go#104 — fcap_keys generalization (filed as issue, separate workstream)

@bhuo — confirm the spec matches what frequency_writer needs. Specifically: (1) global-uniqueness invariant for impression_id (must be unique across all sellers, not just per-seller); (2) two-step SDK primitive split (decodeTmpx → topic → writeExposure); (3) the impression_id fallback in engine.go:508 (fmt.Sprintf("%d:%s", now.UnixNano(), req.PackageID)) doesn't satisfy global uniqueness across distributed instances — should we file as an adcp-go follow-up or roll into your impl?

@oleksandr (Exadel) — three-layer normative/reference structure now matches what you raised. Could use a sanity-check that the conformance invariants section in specification.mdx reads as backend-agnostic to a reviewer who isn't in our internal valkey discussions.

@brianokelley — design-lead sign-off when ready.

If anyone wants context on a specific design decision, the rewritten PR description has a collapsible "Pivot history" section at the bottom — but the load-bearing content is the body above, not the trail.

@bokelley
Contributor Author

Acknowledged. Review requests routed to @bhuo, @oleksandr, and @brianokelley — no automated changes needed from triage on this PR.


Generated by Claude Code

@bokelley
Contributor Author

Two simplifications from offline discussion. Both are about removing constraints we don't actually have.

1. No tenant prefix on fcap_keys.

Deployment model is single-tenant per buyer — each buyer runs their own adcp-go + their own valkey; there is no shared service. "Cross-tenant counter pollution" isn't a real attack surface in that topology — there's only one tenant in any given instance.

Reframing fcap_keys format from required tenant:dimension:value to just dimension:value (e.g., campaign:42, advertiser:13, creative:8). Multi-tenant operators who want a tenant prefix can adopt one as a deployment convention; the protocol won't mandate it. Charset constraint ([a-zA-Z0-9_-]+ per segment) stays — that's about safe key handling in valkey, not tenancy.

2. No deprecation cycle on ttl_sec.

Pre-launch — no production deployments to break. ttl_sec is removed, not deprecated. The 6-week notice, 2026-06-07 earliest-removal date, and 3.0.x release framing all come out. CHANGELOG entry becomes a removal note.

Same logic kills the "rename ttl_sec → serve_window_sec" framing. There is no rename; the field is serve_window_sec, period.

3. Adding a pre-launch note to the spec.

Will add a line: pre-launch — no production deployments exist; spec changes are not subject to deprecation cycles until the first 3.0.0 GA release. Signals to reviewers that the freedom to change is real and bounded.

Will push as a separate commit alongside a refreshed PR description that reflects the post-pivot architecture (the current body still reads as a list of things dropped during review rather than describing the current design).

@bokelley
Contributor Author

Got it — three simplifications noted:

  1. fcap_keys format: dimension:value (drop the tenant prefix; charset constraint stays). Multi-tenant operators adopt prefix as a deployment convention.
  2. ttl_sec: removed outright, not deprecated — no notice window needed pre-launch. serve_window_sec is the field, no rename framing.
  3. Pre-launch note: one-line statement that the spec is not yet subject to deprecation cycles until 3.0.0 GA.

Standing by for the commit + refreshed PR description.


Generated by Claude Code

…note

Three simplifications per @bokelley review comment:

1. fcap_keys format: dimension:value (drop required tenant prefix). Multi-tenant
   operators may still use tenant:dimension:value as a deployment convention, but
   the protocol does not mandate it.

2. ttl_sec: removed outright. TMP is pre-launch (experimental, pre-3.0.0 GA)
   and not subject to deprecation cycles. serve_window_sec is the field;
   no rename framing or notice window needed.

3. Pre-launch note: added one-line statement to the Experimental callout in
   specification.mdx that fields on this surface are not subject to deprecation
   cycles until 3.0.0 GA.

https://claude.ai/code/session_01RVevfeAnA9oXcJAkhRjHw6
@bokelley
Contributor Author

All three simplifications landed in commit 9e57653c:

  1. fcap_keys format — changed to dimension:value throughout (campaign:42, advertiser:13). Tenant prefix demoted to a deployment convention note for multi-tenant operators, not a protocol requirement.

  2. ttl_sec — removed outright from schema (identity-match-response.json) and all docs. required array now points to serve_window_sec. No deprecation framing anywhere.

  3. Pre-launch note — added one-liner to the Experimental callout in specification.mdx: "Fields on this surface are not subject to deprecation cycles until 3.0.0 GA."

Also cleaned up stale references in the arch spec (ttl_sec deprecation → removed), buyer-guide (merge_rule semantics link → correct anchor), and the PR description summary.

Note: surface-specific pages (docs/trusted-match/surfaces/*.mdx, index.mdx, context-and-identity.mdx, router-architecture.mdx) still reference ttl_sec in their examples — those are pre-existing, outside this PR's diff, and can be cleaned up in a follow-up sweep.

Triaged by Claude Code. Session: https://claude.ai/code/session_01RVevfeAnA9oXcJAkhRjHw6


Generated by Claude Code
