perf(targeting): preaggregate exposure log per filter hash for high-package eligibility#103
Open
perf(targeting): preaggregate exposure log per filter hash for high-package eligibility#103
Conversation
…ackage eligibility
Eligibility evaluation re-scanned the exposure log per candidate package
in CheckFrequencyRulesMultiLog and LatestExposureMultiLog, giving
O(packages × log_entries × identities) CPU. At realistic Scope3-shape
load (1000 candidate packages × 1000-entry log × 3 identities) this
was ~7.5ms CPU per request; pathological tail (1000 × 10K × 3) hit
58ms — outside the 30ms latency budget.
Pre-bucket the user's exposure log entries by filter hash (campaign and
package) once per request, plus precompute per-package latest timestamp.
Per-package eligibility check then walks only the matching bucket
instead of the full log. Build cost O(L × I), per-package check
O(matches-per-filter), independent of candidate-package count.
Heuristic-gated: ShouldPreaggregate(numPackages) > 50. Below that
threshold the map-build allocation overhead dominates — at 10 packages
× 10K-entry log × 3 identities, the build cost more than triples
per-request CPU. Above ~50 packages, preagg amortizes — at 1000 × 1000
× 3, ~26× speedup; at 1000 × 10K × 3, ~40× speedup. The phase
transition is sharp enough that a simple package-count check beats more
elaborate heuristics.
The two-path engine code is justified by avoiding a real regression on
small-package requests, not just complexity-for-complexity's-sake. An
intermediate draft removed the threshold to collapse to one path; the
resulting ~700µs regression on 10pkg × 10K × 3ids was material enough
to walk back.
Measured speedups vs baseline (TestScale_IdentityMatch_CPU_Combined):
packages log_size ids before after speedup
10 10000 3 1,299 µs 898 µs 1.4× (naive path)
1000 100 3 784 µs 71 µs 11.0× (preagg path)
1000 1000 3 7,566 µs 287 µs 26.4× (preagg path)
1000 10000 3 57,861 µs 1,500 µs ~38× (preagg, pathological tail)
Full targeting test suite passes; behavior is bit-identical between
the multi-log and aggregated paths (same dedup, same window filter,
same MaxCount short-circuit).
Adds:
- targeting/exposure_aggregate.go — PreaggregatedExposures type +
BuildPreaggregatedExposures +
CheckFrequencyRulesAggregated +
LatestExposureAggregated +
ShouldPreaggregate
- targeting/exposure_aggregate_test.go — TestPreaggregate_Crossover
(documents the empirical
naive-vs-preagg crossover at
~50 packages)
- targeting/cpu_combined_test.go — TestScale_IdentityMatch_CPU_Combined
Modifies:
- targeting/engine.go EvaluateIdentityResolved — gates between naive
and preagg paths,
uses preagg for
campaign + package fcap
+ intent score
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 28, 2026
| // TestPreaggregate_Crossover measures naive vs preaggregated frequency-cap | ||
| // evaluation across the (packages × log_entries × identities) matrix to | ||
| // determine where the heuristic threshold should sit. | ||
| func TestPreaggregate_Crossover(t *testing.T) { |
Collaborator
There was a problem hiding this comment.
this doesn't really test anything, so should not be run together with the real tests. (should utilize t.Skip())
Collaborator
There was a problem hiding this comment.
or ideally this would be a Benchmark, not Test
| @@ -0,0 +1,84 @@ | |||
| package targeting | |||
|
|
|||
Collaborator
There was a problem hiding this comment.
this needs a test for CheckFrequencyRulesMultiLog(...) == CheckFrequencyRulesAggregated(BuildPreaggregatedExposures(...), ...) for identical inputs
| // (candidate packages per request) × (exposure log entries per identity) × | ||
| // (identities per request). All numbers are isolated from network I/O via | ||
| // the mock store, so they represent in-process CPU only. | ||
| func TestScale_IdentityMatch_CPU_Combined(t *testing.T) { |
Collaborator
There was a problem hiding this comment.
same as TestPreaggregate_Crossover - no real tests, should be skipped or Benchmark
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Eligibility evaluation in
EvaluateIdentityResolvedre-scanned the user's exposure log per candidate package viaCheckFrequencyRulesMultiLogandLatestExposureMultiLog, givingO(packages × log_entries × identities)CPU per request. At realistic Scope3-shape load (1000 candidate packages × 1000-entry log × 3 identities) this was ~7.5ms CPU per request; the pathological tail (1000 × 10K × 3) hit ~58ms — outside the 30ms p95 latency budget called out in the TMP spec.This PR pre-buckets the user's exposure log entries by filter hash (campaign and package) once per request, plus precomputes per-package latest timestamp for intent score. Per-package eligibility check then walks only the matching bucket instead of the full log. Build cost
O(L × I), per-package checkO(matches-per-filter), independent of candidate-package count.Heuristic-gated, not always-on
ShouldPreaggregate(numPackages) > 50decides which path runs. Below that threshold, the map-build allocation overhead dominates — at 10 packages × 10K-entry log × 3 identities, the build cost more than triples per-request CPU (~1.25ms vs ~408µs naive). Above ~50 packages, preagg amortizes — at 1000 × 1000 × 3, ~26× speedup; at 1000 × 10K × 3, ~40× speedup.The phase transition is sharp enough that a simple package-count check beats more elaborate heuristics. The two-path engine code is justified by avoiding a real measured regression on small-package requests.
Measured speedups
vs. main, from
TestScale_IdentityMatch_CPU_Combined, in-memory mock store, single goroutine:The pathological-tail case drops from 58ms to 1.5ms — comfortably within the latency budget.
Bit-identical behavior
Same dedup (impression hash), same window filter, same MaxCount short-circuit. The naive
CheckFrequencyRulesMultiLogandLatestExposureMultiLogfunctions remain inexposure_binary.goas public API with their existing tests; the engine just calls into the aggregated path when the threshold is exceeded.Test plan
go test ./targeting/passesgo test ./targeting/ -run TestPreaggregate_Crossovershows the threshold-driven crossover empiricallygo test ./targeting/ -run TestScale_IdentityMatch_CPU_Combinedconfirms speedupsRelated: adcontextprotocol/adcp#3359 — IdentityMatch architecture spec that surfaced this perf concern.
🤖 Generated with Claude Code