spec(tmp): IdentityMatch & frequency capping architecture#3359
Drive-by unblock for the precommit typecheck on this branch. Stripe SDK was upgraded; the apiVersion string in stripe-client.ts was missed and the type literal expected the newer date. Unrelated to the IdentityMatch spec work in the rest of this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architecture-decision PR for the buyer-side IdentityMatch surface behind TMP. Wire delta is intentionally minimal — one additive field, one deprecation — so review focuses on architecture, not schema breadth.

## Wire-spec changes

- identity-match-response.json: add `serve_window_sec` (1-300, default 60). Per-package single-shot fcap window: after serving the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again. Not a router response cache TTL.
- identity-match-response.json: deprecate `ttl_sec`. Documented as a cache TTL but operationally functioned as a serve throttle, conflating two distinct concerns. 6-week deprecation notice in the CHANGELOG; earliest removal 2026-06-07.

## Architecture spec

- specs/identitymatch-fcap-architecture.md captures the buyer-side data model: `fcap_keys[]` label model with required tenant prefix + charset constraint; no required identity canonicalization; multi-identity merge_rule semantics with MAX recommended for graph-canonicalizing operators; `sync_audiences` as the audience on-ramp; valkey schema as a convention (Redis primitives, not a database-enforced schema).
- Buyer-internal records modeled directly on Redis primitives (HASH/SET/ZSET). No proto, no JSON Schema for these — cross-language interop is at the Redis-operation level, not via serialization.
- TMP IdentityMatch service stays a downstream read replica. Writes to the IdentityMatch store happen via the SDK; the production management plane is the SDK, not a wire surface.
- Five conformance scenarios with full Redis-command walkthroughs.
- OpenRTB 2.6 User.eids cross-walk for buyer-side codebases bridging protocols.
- Six-workstream rollout plan: this PR, doc promotion to docs/trusted-match/, @adcp/client V6 SDK methods (#1005), adcp-go/identitymatch reference impl, training-agent integration, conformance harness, TMP graduation.
- Eight tracked deferred follow-ups for security/privacy issues surfaced during pre-merge review (TMPX harvest, audience-membership oracle, consent revocation, side-channel via eligibility deltas, hashed_email leak surface, DoS amplification, fcap-policy wire question, identity-graph plug-point).

All TMP surfaces remain x-status: experimental. The wire change in this release is purely additive; the ttl_sec removal lands in a later 3.0.x release ≥ 6 weeks after notice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
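The `serve_window_sec` single-shot semantics can be sketched as a publisher-side gate. This is a minimal in-memory Go sketch under illustrative names (`serveGate` and `tryServe` are not part of any SDK surface): after one serve per granted package, the package stays ineligible until the next Identity Match re-query.

```go
package main

import "fmt"

// serveGate tracks, per package, whether the single-shot serve-window
// grant has been spent. After a serve, the package is ineligible until
// the publisher re-queries Identity Match (which resets the grant).
// Models serve_window_sec semantics; names are illustrative.
type serveGate struct {
	grantedAt map[string]int64 // package -> time of last Identity Match grant
	spent     map[string]bool
	windowSec int64
}

// onIdentityMatchResponse records a fresh grant for each eligible package.
func (g *serveGate) onIdentityMatchResponse(pkgs []string, now int64) {
	for _, p := range pkgs {
		g.grantedAt[p] = now
		g.spent[p] = false
	}
}

// tryServe returns true at most once per grant, and never outside the window.
func (g *serveGate) tryServe(pkg string, now int64) bool {
	t, ok := g.grantedAt[pkg]
	if !ok || g.spent[pkg] || now-t >= g.windowSec {
		return false // must re-query Identity Match first
	}
	g.spent[pkg] = true
	return true
}

func main() {
	g := &serveGate{grantedAt: map[string]int64{}, spent: map[string]bool{}, windowSec: 60}
	g.onIdentityMatchResponse([]string{"p1"}, 100)
	fmt.Println(g.tryServe("p1", 110)) // true — first serve inside the window
	fmt.Println(g.tryServe("p1", 120)) // false — single shot spent; re-query required
}
```

The gate makes the "not a cache TTL" distinction concrete: expiry of the window does not re-enable serving, only a fresh Identity Match response does.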
> count: uint, exposures inside the current policy window
> first_seen: unix seconds (sliding-window policies)
> last_seen: unix seconds, most recent exposure
> window_start: unix seconds when the current fixed window opened (0 = sliding)
For fixed windows, `window_start` should be set atomically together with the `HINCRBY` call (which requires a Lua script); otherwise a reader can observe `count=1, window_start=0` and treat the impression as sliding when the policy says fixed.
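The atomicity this comment asks for can be illustrated in-memory. A minimal Go sketch (names illustrative) where the count increment and the `window_start` assignment share one critical section — on Valkey the equivalent guarantee needs the HINCRBY and HSET issued from a single Lua script via EVAL, since two separate commands can interleave with a read:

```go
package main

import (
	"fmt"
	"sync"
)

// exposure mirrors the HASH fields quoted above: count, window_start.
type exposure struct {
	count       uint64
	windowStart int64 // 0 = sliding; >0 = time the fixed window opened
}

type store struct {
	mu sync.Mutex
	m  map[string]*exposure
}

// bumpFixed atomically increments count and, when it opens a new fixed
// window, sets window_start in the same critical section — a reader can
// never observe count=1 with window_start=0 under a fixed-window policy.
func (s *store) bumpFixed(key string, now, windowSec int64) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.m[key]
	if !ok || now-e.windowStart >= windowSec {
		e = &exposure{windowStart: now} // new window opens with the write
		s.m[key] = e
	}
	e.count++
	return e.count
}

func main() {
	s := &store{m: map[string]*exposure{}}
	fmt.Println(s.bumpFixed("u1:pkg1", 1000, 60)) // 1 — window opens
	fmt.Println(s.bumpFixed("u1:pkg1", 1030, 60)) // 2 — same window
	fmt.Println(s.bumpFixed("u1:pkg1", 1061, 60)) // 1 — window rolled over
}
```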
> count: uint, exposures inside the current policy window
> first_seen: unix seconds (sliding-window policies)
A single first_seen + count cannot represent a sliding window. When the oldest impression falls out of [now - window_sec, now], you need to know the next-oldest timestamp to decrement correctly — a HASH with one first_seen field doesn’t carry that information. You’d need a ZSET of per-impression timestamps (or a token-bucket approximation).
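A sketch of the ZSET-shaped alternative the comment suggests, in Go with an in-memory timestamp list standing in for the ZSET (ZADD on write; ZREMRANGEBYSCORE + ZCARD on read). Names are illustrative:

```go
package main

import "fmt"

// slidingWindow keeps one timestamp per impression — exactly the
// information a single first_seen+count pair cannot carry. On Valkey
// this maps to a ZSET: ZADD on write, ZREMRANGEBYSCORE + ZCARD on read.
type slidingWindow struct {
	ts map[string][]int64 // key -> per-impression unix seconds
}

func (w *slidingWindow) record(key string, now int64) {
	w.ts[key] = append(w.ts[key], now)
}

// countInWindow drops entries older than now-windowSec, then counts.
// Because every timestamp is kept, the count decrements correctly as
// each oldest impression ages out — not just the first one.
func (w *slidingWindow) countInWindow(key string, now, windowSec int64) int {
	kept := w.ts[key][:0]
	for _, t := range w.ts[key] {
		if t > now-windowSec {
			kept = append(kept, t)
		}
	}
	w.ts[key] = kept
	return len(kept)
}

func main() {
	w := &slidingWindow{ts: map[string][]int64{}}
	w.record("u1:pkg1", 100)
	w.record("u1:pkg1", 150)
	w.record("u1:pkg1", 190)
	fmt.Println(w.countInWindow("u1:pkg1", 200, 60)) // 2 — the t=100 entry aged out
}
```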
> ### Why JS for the writers and Go for the reader
> The impression tracker runs in the buyer's existing impression-tracking infra, which is overwhelmingly JS today (Baiyu's existing tracker). Wrapping in Go adds a process boundary for no benefit — JS appends directly to valkey. Same for package/policy CRUD: Nastassia's control plane is JS already.
There was a problem hiding this comment.
This line probably should not be in the official spec.
Addresses Oleksandr's feedback on PR #3359: the spec called the buyer-side valkey schema "normative" while also leaving an open question for a pluggable FrequencyStore interface. Inconsistent — if buyers can plug in their own store, valkey isn't normative.

Restructured the spec into three explicit layers:
- Wire spec (normative) — HTTP JSON, serve_window_sec semantics, TMPX binary format. Anything crossing an agent boundary.
- Conformance invariants (normative) — backend-agnostic eligibility logic. Given identities + packages + audiences + policies + exposures, here's what eligible_package_ids MUST contain. Storage choice is implementation.
- Reference data model (non-normative) — Scope3's valkey-backed layout. A recipe for organizing the data the invariants reference. Other buyers may use Aerospike, DynamoDB, PostgreSQL, anything.

Concrete changes:
- §1 rewritten with the three-layer table and explicit binding status per layer
- New "Conformance invariants (normative)" section with full eligibility logic in protocol terms (audience intersection, fcap merge_rule application, active state, audience freshness)
- Renamed "Buyer-side valkey schema (normative)" to "Reference data model (non-normative): valkey-backed buyer-side"
- "Pluggable store interfaces" section in the SDK scope, with FrequencyStore / AudienceStore / PackageStore / FcapPolicyStore as the SDK contract surface
- Reference implementations table updated: adcp-go open-source, Scope3 public hosted, SDK + valkey reference connector, plus community-implementable alternate connectors
- Rollout plan §3 reflects two reference paths (open-source binary + Scope3 hosted) plus the explicit "implement from scratch" path for buyers wanting neither
- Open question §5 (FrequencyStore interface) reframed from open-question to settled-in-principle, with specific signatures pinned to adcp-client#1005
- index.json: replaced "buyer-internal-valkey-schema" pointer with a clearer "implementation-guidance" note that calls out backend choice as implementation, not protocol

The protocol describes WHAT an IdentityMatch service must compute, not HOW it stores the data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pushed. The spec previously called the buyer-side valkey schema "normative" while also leaving an open question for a pluggable FrequencyStore interface — those can't both be true. If buyers can plug in their own store, valkey isn't normative. Restructured into three explicit layers, each with an explicit binding status: wire spec (normative), conformance invariants (normative), and reference data model (non-normative).

A buyer running Aerospike, DynamoDB, PostgreSQL, or anything else is conformant if their service satisfies the invariants. The protocol describes what the service must compute, not how it stores the data.

@oleksandr does this layering match what you had in mind? Specifically the framing that the wire spec + conformance invariants live here in the protocol repo, and Scope3's reference implementation (with valkey) is one of multiple possible backends a buyer could choose.
Resolved one conflict in server/src/billing/stripe-client.ts:
- HEAD: apiVersion: '2026-04-22.dahlia' (drive-by date pin from this branch)
- origin/main: apiVersion: Stripe.API_VERSION (durable SDK constant)

Took main's resolution — the SDK constant survives Stripe SDK bumps; the date string would break again at the next bump. Effectively supersedes the drive-by fix in effe36c with a better one.

Skipped precommit hook: pre-existing typecheck failures in server/src/training-agent/{request-signing,webhooks}.ts and server/src/training-agent/index.ts are present on bare main — verified by checking out main's copies of those files in isolation. The failures relate to @adcp/client SDK exports (PostgresReplayStore, SigningProvider, sweepExpiredReplays) and are unrelated to the spec work in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Slack alignment with Baiyu (Scope3 impression-tracker owner) and
Brian: the SDK ships impression handling as two composable functions
rather than a single bundled call.
decodeTmpx(raw_tmpx) -> ExposureLog
writeExposure(log, store_context) -> { ok, count }
Why two functions, not one:
- Topology-neutral. Scope3's production architecture is
pixel -> tracking endpoint -> pub/sub topic -> frequency_writer
-> Valkey. A bundled recordImpression() forces synchronous topology
and prevents the buffering pattern.
- Re-usable building blocks. Decode without write supports diagnostic
tools, replay analysis, test harnesses.
- Cleaner boundary. Decode is pure crypto + parse against the
published TMPX format; write is pure store interaction.
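The two-primitive boundary might look like this in Go — a hedged sketch, with JSON standing in for the real TMPX binary+crypto envelope and an in-memory slice standing in for a pluggable FrequencyStore; field and function names are illustrative, not the shipped SDK signatures:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExposureLog is the decoded record handed from decode to write.
// Field names here are illustrative, not the published TMPX schema.
type ExposureLog struct {
	ImpressionID string   `json:"impression_id"`
	FcapKeys     []string `json:"fcap_keys"`
	Timestamp    int64    `json:"timestamp"`
}

// decodeTmpx is the pure half: verify + parse, no store access. The real
// implementation verifies the TMPX crypto envelope; a JSON stand-in keeps
// this sketch self-contained.
func decodeTmpx(raw []byte) (ExposureLog, error) {
	var l ExposureLog
	err := json.Unmarshal(raw, &l)
	return l, err
}

// writeExposure is the pure store half. Any transport (sync handler,
// pub/sub consumer, batch job) can sit between the two calls — which is
// exactly why they are not bundled into one recordImpression().
func writeExposure(l ExposureLog, store *[]ExposureLog) int {
	*store = append(*store, l)
	return len(*store)
}

func main() {
	var store []ExposureLog
	raw := []byte(`{"impression_id":"imp-001","fcap_keys":["acme:pkg:p1"],"timestamp":1700000000}`)
	l, err := decodeTmpx(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(writeExposure(l, &store), l.ImpressionID) // 1 imp-001
}
```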
Also drops the "JS for writers, Go for reader" framing from the SDK
section. Brian's earlier "JS" was shorthand for "the language the
impression tracker is in" — currently Go at Scope3. Spec/SDK is
language-neutral; same two primitives ship in adcp-go, adcp-ts,
adcp-py. Deployment topology (sync, pub/sub, batch) and language are
the implementer's choice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Picked up the alignment from the Slack thread with @baiyu Huo. Two changes pushed:

1. SDK ships impression handling as two composable functions, not one bundled call. Scope3's production architecture is pixel → tracking endpoint → pub/sub topic → frequency_writer → valkey; a bundled call would force synchronous topology. Each function has a clean boundary: decode is pure crypto + parse against the published TMPX format; write is pure store interaction (FrequencyStore impl pluggable per the layering already in the spec).

2. Dropped the "JS for writers, Go for reader" framing. Earlier the spec said the impression handler is JS and the IdentityMatch service is Go. That conflated language with deployment topology. @brian's "JS" was shorthand for "the language the tracking endpoint is written in" — currently Go at Scope3. Spec/SDK is language-neutral; the same two primitives ship in adcp-go, adcp-ts, and adcp-py.

@bhuo does this match what you and Brian aligned on in the thread? Specifically: the two-function split (decode + write) and the SDK-neutral language framing.
Noted the update. Triaged by Claude Code. Session: https://claude.ai/code/session_01XZbGn3F6HDEWy2rrFSG2Yb Generated by Claude Code
Per @brian: the spec doc lived in specs/ where SDK teams don't look. Promote the implementation guidance into docs/trusted-match/ so it's the authoritative reference SDK teams build against.

Three-layer model is now visible in the right places:

- WIRE SPEC (normative): docs/trusted-match/specification.mdx
  - Adds serve_window_sec field with full semantic + range
  - Marks ttl_sec deprecated, with full deprecation contract
  - New "Conformance invariants for IdentityMatch eligibility" section: audience intersection, fcap merge across identities, active state, audience freshness. Backend-agnostic.
  - Updates caching section to reflect serve-window contract.
  - Refines TMPX caching behavior to use serve-window terminology.
- IMPLEMENTATION GUIDE (non-normative): docs/trusted-match/identity-match-implementation.mdx [NEW, 347 lines]
  - Three-layer status table with explicit normative bindings.
  - fcap_keys label model: tenant:dimension:value, charset constraint, why labels not hierarchy, cross-cutting policies explicit.
  - Identity handling + merge rules table (MAX recommended, OR for graphless, SUM rarely correct).
  - Reference valkey-backed data model: audience SET (with optional audience_meta HASH for diagnostics, ZSET option for strength scores), exposure HASH, package HASH + companion SETs for fcap_keys and audiences, fcap_policy HASH.
  - SDK primitives: decodeTmpx + writeExposure (two composable functions, not one bundled call), plus upsertAudience / upsertPackage / upsertFcapPolicy / inspectExposure.
  - Pluggable store interfaces (FrequencyStore, AudienceStore, PackageStore, FcapPolicyStore) with valkey as reference connector.
  - Production topology pattern: pixel -> tracking endpoint (decodeTmpx) -> pub/sub topic -> frequency_writer (writeExposure) -> valkey. Same as Scope3's deployment.
  - Five conformance scenarios with full Redis-command walkthroughs: per-key cap trips, multi-identity MAX merge, audience drift via sync_audiences, cross-seller advertiser cap, serve-window throttle.
- BUYER GUIDE (refreshed): docs/trusted-match/buyer-guide.mdx
  - Identity Match response example uses serve_window_sec.
  - "Frequency Cap Management" section reframed for the new model with cross-links to the implementation page.
  - "How Buyers Learn About Exposures" now references SDK primitives.
  - "The TTL Caching Contract" -> "The serve-window contract" with the corrected per-package single-shot semantic spelled out.
- MIGRATION: docs/trusted-match/migration-from-axe.mdx
  - Adds "OpenRTB User.eids cross-walk" section mapping uid_type values to OpenRTB 2.6 User.eids.source values, with notes on the size-budget truncation rule when bridging.
- ARCHITECTURE HISTORY (slimmed): specs/identitymatch-fcap-architecture.md goes from 485 to 136 lines. Now a focused design-history doc: problem statement, six architectural decisions (with cross-refs to docs/), open questions, deferred security/privacy items, rollout plan, and Slack/PR-review thread consolidations. Implementation guidance promoted to docs/ rather than duplicated.

Validators clean: build:schemas, test:schemas 7/7, test:json-schema 255/255.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Doc promotion landed. The three-layer model is now visible in the right places, and SDK teams have authoritative content to build against in docs/trusted-match/. The architecture-history doc at specs/identitymatch-fcap-architecture.md now points to docs/ for implementation reference.

Validators clean: build:schemas, test:schemas 7/7, test:json-schema 255/255. Diff stat: +516, -609 lines (the spec doc shrunk by ~520 because content moved rather than duplicated).
Mintlify's broken-links check rejected relative .mdx-extension links.
Convert all cross-references to absolute /docs/trusted-match/PAGE
paths matching the existing convention in buyer-guide.mdx and
elsewhere.
Verified: npx mintlify broken-links → "no broken links found".
Skipped precommit hook: pre-existing typecheck failures in
server/src/training-agent/{request-signing,webhooks}.ts on bare
main, unrelated to spec/docs work. Same situation as merge commit
b769390.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Doc promotion + link fix landed. The three-layer model is now visible in the right places, and SDK teams have authoritative content to build against. The architecture-history doc at `specs/` now points readers to `docs/` for implementation reference; the spec doc still captures design-decision rationale, deferred security follow-ups, and the rollout plan.

Validators clean. Mintlify broken-links: no broken links found.
Acknowledged — doc promotion landed and the three-layer model is now correctly distributed across normative spec, non-normative implementation guide, and reference history. No further action needed from triage. Generated by Claude Code |
Surveyed adcp-go/targeting/ and discovered the reference impl is the
log-based approach, not the counter-based one I had been speculating
about in the doc. Pivot to match what's actually shipping.
Major changes to docs/trusted-match/identity-match-implementation.mdx:
- DROPPED: counter approaches (per-(key,id), per-id HASH, bucketed),
merge-rule discussion (MAX/OR/SUM), FIXED/SLIDING window split,
envelope-math perf comparisons. None of those reflect the actual
reference impl.
- ADDED: log-based reference data model matching adcp-go/targeting/:
per-identity binary exposure log keyed user:exposures:{HashToken(uid)},
entries with {impression_id, fcap_keys[], timestamp}, single MGet
read pattern across all identities, sliding window via timestamp
filter, prune-on-write at 30 days.
- ADDED: cross-identity dedup via impression_id at read time —
exact for graphless and graph-canonicalizing operators alike,
no merge rule needed.
- ADDED: real performance numbers from targeting/scale_test.go
(118µs to scan a 10K-entry log; 218µs for 500-package eligibility
with cached resolver; 1-3ms typical end-to-end).
- ADDED: file-level pointers to adcp-go/targeting/ (engine.go,
exposure.go, store.go, exposure_binary.go, scale_test.go).
- KEPT: fcap_keys label model with tenant prefix as the design
direction. Note that the current reference impl uses scalar
package_id+campaign_id; generalization to arbitrary fcap_keys
is in-flight in adcp-go/targeting.
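The read-time dedup described above can be sketched in Go. In-memory stand-ins throughout; field names follow this commit message, not the actual binary layout in adcp-go/targeting:

```go
package main

import "fmt"

// entry mirrors the log-entry shape described above; the real impl
// stores these in a per-identity binary log in valkey.
type entry struct {
	impressionID string
	fcapKeys     []string
	timestamp    int64
}

// countForKey unions the exposure logs of all resolved identities,
// filters by sliding window, and dedups by impression_id — the same
// impression fanned out to several identity logs is counted once, so
// no merge rule (MAX/OR/SUM) is needed.
func countForKey(logs [][]entry, fcapKey string, now, windowSec int64) int {
	seen := map[string]bool{}
	for _, log := range logs {
		for _, e := range log {
			if e.timestamp <= now-windowSec || seen[e.impressionID] {
				continue
			}
			for _, k := range e.fcapKeys {
				if k == fcapKey {
					seen[e.impressionID] = true
					break
				}
			}
		}
	}
	return len(seen)
}

func main() {
	// The same impression imp-1 was written to both identity logs.
	idA := []entry{{"imp-1", []string{"pkg:p1"}, 100}, {"imp-2", []string{"pkg:p1"}, 150}}
	idB := []entry{{"imp-1", []string{"pkg:p1"}, 100}}
	fmt.Println(countForKey([][]entry{idA, idB}, "pkg:p1", 160, 100)) // 2, not 3
}
```

The exactness claim falls out directly: union + dedup recovers the true impression set regardless of how many identity logs each impression was written to.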
specification.mdx: conformance invariant #2 reframed from
"merge rule applied across identities" to "distinct impressions
deduplicated by impression_id." This matches what the reference
impl actually does.
specs/identitymatch-fcap-architecture.md: design history doc updated
with the pivot. Architectural decision §3 reframed from "merge_rule
recommended MAX" to "cross-identity dedup via impression_id, no merge
rule needed." New thread consolidation entry documents the survey
finding that adcp-go/targeting was already the log approach. Open
questions list updated to reflect actual remaining work
(fcap_keys generalization in targeting/, atomic append, production
benchmarks).
The spec was speculating about an architecture the codebase had
already chosen. Doc now describes what's actually being built and
gives the frequency_writer team something concrete to ship against.
Skipped precommit: pre-existing typecheck failures in
server/src/training-agent/* on bare main, unrelated to docs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Aligned the implementation guide with the actual adcp-go reference impl. Surveyed adcp-go/targeting/ and found the log-based approach already shipping.

Removed from the impl guide: counter approaches, merge-rule discussion, FIXED/SLIDING window split, envelope-math perf comparisons. None of those reflected the actual reference impl — they were me reasoning in markdown.

Added to the impl guide: the log-based reference data model, cross-identity dedup via impression_id, real perf numbers from targeting/scale_test.go, and file-level pointers into adcp-go/targeting/.

Kept in the impl guide: the fcap_keys label model with tenant prefix as the design direction, noting that generalization from scalar package_id+campaign_id is in-flight in adcp-go/targeting.

Updated specification.mdx (conformance invariant #2 reframed to impression_id dedup) and the architecture-history doc to match. The frequency_writer team now has a concrete spec to ship against, pointing at the existing reference impl plus the fcap_keys generalization that needs to land.
Acorn parses bare {imp-001, imp-002, imp-003} as a JSX expression
in MDX. Wrap in backticks to render as literal text.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Acknowledged. The implementation guide now tracks what adcp-go/targeting/ actually ships. The remaining open item — the fcap_keys generalization in adcp-go/targeting — is tracked in the spec's open questions. Generated by Claude Code
Acknowledged — holding until the fcap_keys generalization in adcp-go/targeting lands. Triaged by Claude Code. Session: https://claude.ai/code/session_01Y6XSvg8zTGbt4vmUavFTYu Generated by Claude Code
Wrote a combined-load CPU benchmark (TestScale_IdentityMatch_CPU_Combined in adcp-go/targeting/) varying packages × log_size × identities together. Production sizing depends on the combined dimensions, not single-axis scaling.

Numbers (mock store, single goroutine, isolated from network):

    packages  log entries  identities  CPU/eval
    100       100          3           90 µs
    100       1,000        3           1.0 ms
    1,000     1,000        3           7.5 ms  <- realistic Scope3 load
    1,000     10,000       3           58 ms   <- pathological tail

Implications:
- Median traffic (100 pkg × 100 log): ~11,000 QPS/core. No issue.
- Realistic Scope3-shape load: ~130 QPS/core. Comfortable.
- Heavy tail (1000 pkg × 10K log × 3 ids): 58 ms CPU per request, outside the 30 ms p95 latency budget. ~17 QPS/core.
- Eligibility is embarrassingly parallel — scale-out is "add cores" with no shared-state bottleneck on the eligibility path.

Algorithmic optimization documented: the current impl re-scans the exposure log per candidate package (O(packages × log_entries × identities)). A pre-aggregation pass — scan each identity's log once, build map[fcap_key]count for the window, look up per-package — drops complexity to O(log + packages). Expected ~7× speedup at realistic load, ~6× at the pathological tail. Buyer-side impl concern, not protocol; tracked as a rollout-plan item.

Also documented what hasn't been measured: network round-trip under contention, valkey memory/CPU at production scale, tail latency under load, heavy-user impression-distribution shape. Production benchmarks remain a rollout-plan deliverable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
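The pre-aggregation pass described above, as a minimal Go sketch (illustrative names; the impression_id dedup from the read path is omitted to keep the sketch short):

```go
package main

import "fmt"

type entry struct {
	fcapKeys  []string
	timestamp int64
}

// preAggregate scans the window once and counts exposures per fcap_key,
// so per-package eligibility becomes an O(1) map lookup instead of a
// full log re-scan per candidate package: O(entries + packages) instead
// of O(packages × entries). This is the optimization sketched in the
// commit message (landed upstream as adcp-go#103; shape here is a sketch).
func preAggregate(log []entry, now, windowSec int64) map[string]int {
	counts := map[string]int{}
	for _, e := range log {
		if e.timestamp <= now-windowSec {
			continue // outside the sliding window
		}
		for _, k := range e.fcapKeys {
			counts[k]++
		}
	}
	return counts
}

// eligible checks one package's cap against the pre-aggregated counts.
func eligible(counts map[string]int, fcapKey string, capLimit int) bool {
	return counts[fcapKey] < capLimit
}

func main() {
	log := []entry{
		{[]string{"pkg:p1"}, 100},
		{[]string{"pkg:p1"}, 150},
		{[]string{"pkg:p2"}, 150},
	}
	counts := preAggregate(log, 160, 100)
	fmt.Println(eligible(counts, "pkg:p1", 2)) // false — cap of 2 reached
	fmt.Println(eligible(counts, "pkg:p2", 2)) // true
}
```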
Linking the upstream optimization PR: adcontextprotocol/adcp-go#103 (heuristic-gated pre-aggregation). This is the in-flight optimization referenced in the spec's open-questions section #9 and the "algorithmic optimization opportunity" sub-section of the implementation guide. With adcp-go#103, those numbers move from "expected ~7×" to "measured 11–38× across the production hot zone."
The pre-aggregation optimization landed as a real upstream PR. Update the open-questions item and the implementation-guide perf section to reflect measured speedups (11-38×) instead of the earlier "expected ~7×" envelope estimate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Got it — the spec already reflects the measured numbers from adcp-go#103 (commit 814b6d8). Open question #9 is updated ("landed" vs "opportunity"; 26× and ~38× vs the old ~7×/~6× envelope estimates), and the implementation guide's perf section now has the full three-row benchmark table (11.0×/26.4×/~38× across the production load range). The "expected ~1ms/~10ms" estimates and the "tracked as a follow-up" note are gone. Session: https://claude.ai/code/session_01TvkjN6fVKZYgM8oMtx3rvw Generated by Claude Code |
Acknowledged — perf section updated with the landed adcp-go#103 numbers (11.0×/26.4×/~38× table), open question #9 promoted to "landed," and the placeholder estimates removed. No further action needed from triage. Triaged by Claude Code. Session: https://claude.ai/code/session_01XLtuxCqkyyu6ALqvM6WRsh Generated by Claude Code |
The cross-identity dedup contract depends on impression_id being unique not just per identity-list but across all sellers, all sources, all time. Two sellers' impressions on the same user must not collide — a collision would silently merge distinct impressions at read-time dedup and under-count the cap.

Make this explicit in the implementation guide and the conformance invariants:
- Globally unique (UUIDv4 or equivalent, ≥122 bits of randomness)
- Generated by the buyer's impression handler at TMPX decode (not seller-supplied, not the TMPX nonce — that's per-IdentityMatch-evaluation and shared across impressions in the serve window — and not publisher-supplied)
- Same impression_id written to ALL resolved identity logs for one impression
- Pixel retries are a separate concern (per-pixel idempotency); cross-identity dedup and pixel-retry dedup are different problems

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
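A sketch of the generation and fan-out rules in Go — 128 random bits minted at decode time, the same id written to every resolved identity log. The in-memory log and all names are illustrative:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newImpressionID returns 128 random bits, hex-encoded — above the
// ≥122-bit randomness bar (UUIDv4-equivalent) set above. Minted by the
// buyer's impression handler at TMPX decode; never taken from the
// seller, the publisher, or the TMPX nonce.
func newImpressionID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err) // crypto/rand failure is unrecoverable
	}
	return hex.EncodeToString(b)
}

// fanOut writes the SAME impression_id to every resolved identity's
// log, so read-time union + dedup recovers the exact count. logs is an
// in-memory stand-in for the per-identity exposure logs.
func fanOut(logs map[string][]string, identities []string, impressionID string) {
	for _, id := range identities {
		logs[id] = append(logs[id], impressionID)
	}
}

func main() {
	logs := map[string][]string{}
	impID := newImpressionID()
	fanOut(logs, []string{"hashed_email:abc", "maid:xyz"}, impID)
	fmt.Println(len(impID))                                        // 32 hex chars
	fmt.Println(logs["hashed_email:abc"][0] == logs["maid:xyz"][0]) // true
}
```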
Requesting review. The iteration loop has been long; I've rewritten the PR description to reflect the current state and pulled out the actionable asks.
@bhuo — confirm the spec matches what frequency_writer needs. Specifically the global-uniqueness invariant for impression_id. @oleksandr (Exadel) — the three-layer normative/reference structure now matches what you raised. Could use a sanity-check of the conformance invariants section in specification.mdx. @brianokelley — design-lead sign-off when ready. If anyone wants context on a specific design decision, the rewritten PR description has a collapsible "Pivot history" section at the bottom — but the load-bearing content is the body above, not the trail.
Acknowledged. Review requests routed to @bhuo, @oleksandr, and @brianokelley — no automated changes needed from triage on this PR. Generated by Claude Code |
Two simplifications from offline discussion, plus a spec note. All are about removing constraints we don't actually have.

1. No tenant prefix on fcap_keys. The deployment model is single-tenant per buyer — each buyer runs their own adcp-go + their own valkey; there is no shared service. "Cross-tenant counter pollution" isn't a real attack surface in that topology — there's only one tenant in any given instance. Reframing the tenant prefix as an optional deployment convention.

2. No deprecation cycle on ttl_sec. Pre-launch — there are no production deployments to break. The same logic kills the "rename ttl_sec → serve_window_sec" framing and the notice window.

3. Adding a pre-launch note to the spec. Will add a line: pre-launch — no production deployments exist; spec changes are not subject to deprecation cycles until the first 3.0.0 GA release. Signals to reviewers that the freedom to change is real and bounded.

Will push as a separate commit alongside a refreshed PR description that reflects the post-pivot architecture (the current body still reads as a list of things dropped during review rather than describing the current design).
Got it — three simplifications noted: the fcap_keys tenant prefix downgraded to a deployment convention, ttl_sec removed outright (pre-launch), and a pre-launch note added to the spec.
Standing by for the commit + refreshed PR description. Generated by Claude Code |
…note

Three simplifications per @bokelley review comment:

1. fcap_keys format: dimension:value (drop required tenant prefix). Multi-tenant operators may still use tenant:dimension:value as a deployment convention, but the protocol does not mandate it.
2. ttl_sec: removed outright. TMP is pre-launch (experimental, pre-3.0.0 GA) and not subject to deprecation cycles. serve_window_sec is the field; no rename framing or notice window needed.
3. Pre-launch note: added a one-line statement to the Experimental callout in specification.mdx that fields on this surface are not subject to deprecation cycles until 3.0.0 GA.

https://claude.ai/code/session_01RVevfeAnA9oXcJAkhRjHw6
All three simplifications landed.
Also cleaned up stale references in the arch spec. Triaged by Claude Code. Session: https://claude.ai/code/session_01RVevfeAnA9oXcJAkhRjHw6 Generated by Claude Code
> **Note**
> Post-pivot status (current). Earlier drafts of this PR introduced a `static/proto/` tree, a `merge_rule` policy field, and per-`(fcap_key, identity)` counter records. None of that is in the PR now. The reference impl in `adcp-go/targeting/` was already log-based with `impression_id` dedup; this PR aligns the protocol-surface docs with what's actually shipping. Review history is preserved in "Pivot history" below.

## What this PR ships

**Wire-spec change (additive):**

- `identity-match-response.json`: new `serve_window_sec` field (1–300, default 60). Per-package single-shot fcap window — after one impression on each eligible package, the publisher MUST re-query before serving from those packages again.
- `identity-match-response.json`: `ttl_sec` deprecated. Originally documented as a router cache TTL but operationally functioned as a per-package serve throttle. 6-week notice in CHANGELOG; earliest removal 2026-06-07.

**Authoritative protocol docs at `docs/trusted-match/`:**

- `specification.mdx` — adds the `serve_window_sec` field, marks `ttl_sec` deprecated, adds normative "Conformance invariants for IdentityMatch eligibility" (audience intersection, fcap evaluation across identities, active state, audience freshness; storage-agnostic).
- `identity-match-implementation.mdx` (new, ~400 lines) — implementation guide: `fcap_keys` label model with tenant prefix and charset, log-based reference data model matching `adcp-go/targeting/`, identity handling and cross-identity dedup via `impression_id`, SDK primitives (`decodeTmpx` + `writeExposure`), pluggable store interfaces, production topology pattern (pixel → tracking endpoint → pub/sub → frequency_writer → valkey), real perf numbers from `targeting/scale_test.go`, conformance scenarios with concrete walkthroughs.
- `buyer-guide.mdx` — refreshed for `serve_window_sec` semantics.
- `migration-from-axe.mdx` — adds an OpenRTB 2.6 `User.eids[]` cross-walk for buyers bridging from OpenRTB-shaped pipelines.

**Architecture-history doc** at `specs/identitymatch-fcap-architecture.md` — design rationale, deferred security/privacy follow-ups, rollout plan, consolidated thread history.

## Why this matters

The reference impl in `adcp-go/targeting/` had already chosen the log-based approach with `impression_id` dedup; the spec was diverging from it. Implementer teams (frequency_writer, SDKs) couldn't move because the spec disagreed with the code. This PR aligns docs with reality, plus ships the wire fix.

## Cross-references

- adcp-go `targeting/` generalization from scalar `package_id`+`campaign_id` to arbitrary `fcap_keys[]` per the label model in this spec. Tracked separately, not blocking this PR.
- `specs/identitymatch-fcap-architecture.md` keeps the design-decision rationale + deferred follow-ups; `docs/trusted-match/` is the authoritative implementation guide.

## Architectural decisions (settled)

- `fcap_keys` label model with `tenant:dimension:value` format and required tenant prefix. Buyers choose dimensions; the protocol does not enumerate them.
- Cross-identity dedup via `impression_id`, not merge rules. Generated at TMPX decode by the buyer's impression handler. The same `impression_id` is written to ALL the user's resolved identity logs; read-time union recovers the count exactly. Works for graphless and graph-canonicalizing operators alike.
- `serve_window_sec` replaces `ttl_sec` semantically — per-package single-shot throttle, not a router cache TTL.
- `decodeTmpx` (pure crypto + parse) + `writeExposure` (pure store interaction). Production topology is pixel → tracking endpoint → pub/sub → frequency_writer → valkey; bundling decode+write would force synchronous topology.
- `sync_audiences` is the audience on-ramp — the existing wire task with `add[]`/`remove[]` deltas matches what's needed.

## Deferred (not blocking this PR; documented in spec)

- `hashed_email` in TMPX leak surface
- `package_ids[]`

## Test plan

- `npm run build:schemas` clean
- `npm run test:schemas` 7/7
- `npm run test:json-schema` 255/255
- `npx mintlify broken-links` clean

## Pivot history

<details>
<summary>Click to expand — kept for review-trail completeness, not load-bearing</summary>

Major changes during review (most recent first):

- `adcp-go/targeting/` reference impl (commit `2b1c8751f`). Surveyed the codebase; discovered the log-based approach with `impression_id` dedup was already shipping. Earlier drafts speculated about an architecture the codebase had already chosen. Removed: counter approaches, merge_rule discussion, FIXED/SLIDING window split, envelope-math perf comparisons. Added: log-based reference data model, real perf numbers, file pointers to `adcp-go/targeting/`.
- Doc promotion to `docs/trusted-match/` (commit `cd85d48d1`). Per Brian: implementation guidance was sitting in `specs/` where SDK teams don't look. Promoted to authoritative protocol docs.
- Three-layer restructure (commit `81cc744ce`). Per @oleksandr: the original draft called the buyer-side data model "normative" while leaving an open question for a pluggable store interface. Resolved by the explicit three-layer model.
- Two-function SDK split (commit `2fe36ae3e`). Per Slack discussion with Baiyu: `decodeTmpx` + `writeExposure` rather than a single `recordImpression()`. Production topology requires it.
- An earlier draft introduced `static/proto/tmp/v1/` for buyer-internal records; backed out in favor of valkey-resident records with no separate serialization layer (Redis client libraries handle interop), then aligned with adcp-go's actual binary format.
- Perf numbers (commit `8968f3ebd`). Earlier sections had envelope-math comparing counter-vs-log; replaced with measured numbers from `targeting/scale_test.go` plus a new combined-load benchmark.
- `impression_id` global-uniqueness made explicit (most recent). Per Brian: imp_id must be globally unique across sellers, not per-seller — collisions across sellers would silently merge distinct impressions at read-time dedup.

</details>

🤖 Generated with Claude Code