
Proposal: Semantic fidelity as a core governance capability (3.1) #3363

@mikulbhatt

Description


Summary

The AdCP governance protocol validates budget authority, brand safety, and regulatory compliance before a media buy executes. It does not validate whether the seller correctly understood what the buyer asked for. This proposal adds two interconnected capabilities to the campaign governance specification for 3.1:

  1. Campaign intent declaration — A new campaign_intent field on sync_plans that gives the governance agent a structured baseline to verify against
  2. Semantic fidelity — A first-class check_governance finding category that verifies the seller's interpretation matches the buyer's intent

These are not extensions. They are proposed additions to the core protocol, completing the governance model by making semantic decisions as verifiable as budget and brand safety decisions already are.

Note: Taxonomy declaration — how sellers advertise which classification systems they support — is proposed in a companion issue. See #3362, "Proposal: Taxonomy declaration as a core capability (3.1)".

From human negotiation to machine negotiation

When humans negotiate media deals, ambiguity is resolved conversationally. A buyer says "outdoor enthusiasts" and the seller asks "do you mean hikers or general sports fans?" over Slack. "New Mexico" gets clarified in a phone call — "the state, not the country." "Premium publishers" gets defined in an IO kickoff meeting. Intent lives in the relationship between two people who share context.

When agents negotiate, there is no Slack exchange. There is no phone call. There is no shared history of what "premium" meant on the last three campaigns. The buyer agent submits a brief. The seller agent interprets it. If the interpretation is wrong, the money moves anyway — because the protocol has no mechanism to verify that what the seller understood is what the buyer meant.

This is the fundamental shift from human-mediated to agent-mediated transactions. In the human world, semantic alignment was informal and continuous. In the agent world, it must be formal and verifiable. The protocol must replace the conversations that used to resolve ambiguity.

That is what semantic fidelity provides: a structured mechanism for the buyer to declare intent, and for governance to verify that the seller's interpretation matches — before a dollar moves.

The problem: four green checks on a broken campaign

Scenario A — $200K to the wrong country (illustrative)

A state tourism board hires Pinnacle Agency to run a $200K digital campaign. The buyer agent submits a brief: "Launch campaign for New Mexico tourism board targeting adventure travelers across premium travel and outdoor publishers."

The seller agent at StreamHaus interprets "New Mexico" as a geo-target: Mexico. It builds a campaign plan targeting Mexican travel publishers with Spanish-language creative slots. The plan looks solid — good CPMs, reputable publishers, strong reach projections.

The governance agent runs check_governance:

| Check | Result | Detail |
| --- | --- | --- |
| Budget authority | Pass | $200K within plan limit |
| Brand safety | Pass | All selected publishers on approved list |
| Regulatory compliance | Pass | Mexico targeting meets LFPDPPP requirements |
| Creative provenance | Pass | All creatives carry required metadata |

Every check passes. The governance agent returns approved. $200K flows to Mexican publishers. The New Mexico tourism board gets zero impressions in their actual market. Their summer hiking season campaign runs in Cancun.

Six months later, the tourism board's procurement team pulls the audit trail. Every governance finding is green. Nobody can point to where the interpretation went wrong because the protocol never recorded how "New Mexico" was resolved. The data shows what was bought, not what was meant.

Scenario B — Two audiences become one (illustrative)

Acme Outdoor is launching the Trail Pro 3000. The buyer agent submits a brief targeting two distinct audiences: outdoor recreation enthusiasts (hiking, camping, trail running) and extreme sports fans (rock climbing, white water rafting, backcountry skiing). The brief says: "outdoor enthusiasts and adventure sports fans, 25-44, US."

The seller has one behavioral segment that covers both: "Sports Fans." Both buyer concepts map to the same segment. The governance agent checks brand safety (pass), budget (pass), compliance (pass). The campaign runs.

Performance is poor. CTR is 0.08% against a 0.15% category benchmark. Post-campaign analysis reveals why: the "Sports Fans" segment included NFL fans, NBA fans, and fantasy sports players — audiences with zero interest in trail running shoes. The buyer's two distinct audience concepts collapsed into a single broad segment at the taxonomy mapping step, and the campaign inherited the noise.

The buyer asks the seller: "How did you map our targeting?" The seller says: "We matched to our closest behavioral segment." The buyer asks: "Can you show me the mapping?" There is nothing to show. The protocol recorded the transaction, not the interpretation. The $50K campaign underperformed because of a taxonomy mapping decision that no system captured.

Scenario C — "Premium" means three different things (illustrative)

A luxury brand runs campaigns through three sellers using the same brief: "premium lifestyle publishers, above-the-fold display, affluent professionals 30-55."

  • Seller A maps "premium" to top 10 properties by editorial quality score
  • Seller B maps "premium" to top 10 by CPM (most expensive = most premium)
  • Seller C maps "premium" to properties offering premium ad formats (rich media, takeover units)

Same brief. Three different interpretations. All defensible. Governance approves all three — "premium" isn't a protocol-level concept, it's a natural-language term each seller resolves independently.

The brand gets wildly inconsistent delivery and can't diagnose why performance varies 4x across sellers, because the protocol doesn't record how each seller interpreted the same word.

Why the current protocol can't catch this

The governance protocol validates what it can see. It can see budget amounts, publisher domains, regulatory jurisdictions, and creative metadata. It cannot see how the seller interpreted the brief or whether the seller's classification systems can represent the buyer's intent.

AdCP 3.0 is not starting from zero. The governance protocol already evaluates six finding categories: budget_authority, strategic_alignment, bias_fairness, regulatory_compliance, seller_verification, and brand_policy. The strategic_alignment category is the closest existing precedent — it validates whether the proposed action aligns with the campaign plan's strategy. But strategic_alignment operates at the plan level (is this buy consistent with the plan's objectives?), not at the interpretation level (did the seller understand what the buyer meant by "outdoor enthusiasts"?).

The gap is not in governance infrastructure — the governance agent already has the structural position, the finding categories, and the escalation patterns. The gap is in inputs: no structured intent from the buyer, no interpretation declaration from the seller, and no taxonomy context to evaluate the mapping.

| Governance capability | What it validates | What it misses |
| --- | --- | --- |
| budget_authority | Spend within authorized limits | Nothing — budget is numeric |
| brand_policy | Publisher on approved/blocked lists | Nothing — lists are explicit |
| regulatory_compliance | Targeting meets jurisdiction rules | Nothing — rules are codified |
| creative_provenance | Creatives carry required metadata | Nothing — metadata is binary |
| semantic_fidelity (proposed) | Does not exist today | Brief interpretation, taxonomy mapping, term resolution |

The first four categories work because they validate structured data against structured rules. Semantic fidelity is different — it validates interpretation against intent.

The gap in the three-party model

| Party | Existing role | Gap |
| --- | --- | --- |
| Orchestrator (buyer) | Proposes campaign plans, declares budget | Does NOT declare structured intent — brief is natural language |
| Governance agent | Validates plans against policies | CANNOT verify semantic fidelity — no intent baseline to verify against |
| Seller | Fulfills media buys, reports delivery | Does NOT declare how it interpreted the brief |

The governance agent is structurally capable of performing semantic verification — it already sits between buyer and seller, evaluating proposed actions against policies. But it lacks two inputs: the buyer's structured intent and the seller's declared interpretation.

Proposed additions to the 3.1 specification

1. Campaign intent declaration on sync_plans

A new campaign_intent field on the sync_plans request. sync_plans already exists as a governance task (experimental in 3.0). Adding campaign_intent as a field on the existing sync_plans request is an additive change to an experimental surface, which is allowed with 6 weeks' notice per the experimental status contract. This gives the governance agent a structured baseline to verify against:

{
  "plan_id": "acme-q2-trail-pro",
  "objectives": "Q2 Trail Pro 3000 launch across outdoor and adventure publishers",
  "budget": { "total": 50000, "currency": "USD" },
  "campaign_intent": {
    "audience_concepts": [
      {
        "term": "outdoor_recreation",
        "description": "hiking, camping, trail running",
        "distinct_from": ["general_sports", "extreme_sports"],
        "priority": "primary"
      },
      {
        "term": "extreme_sports",
        "description": "rock climbing, white water rafting, backcountry skiing",
        "distinct_from": ["outdoor_recreation", "general_sports"],
        "priority": "primary"
      }
    ],
    "geographic_intent": {
      "regions": ["US"],
      "disambiguation": [
        { "term": "New Mexico", "means": "US state (NM)", "not": "country of Mexico" }
      ]
    },
    "publisher_quality": {
      "definition": "editorial quality and audience relevance",
      "not": ["CPM tier", "ad format availability"]
    }
  }
}

The campaign_intent field is structured enough for machine verification but readable enough for human review. The distinct_from arrays explicitly declare which concepts the buyer considers different — giving the governance agent a testable assertion. The disambiguation array handles the "New Mexico" class of problems by making intent unambiguous at declaration time.
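The distinct_from assertion is mechanically testable without any AI. A minimal sketch of the structural comparison a governance agent could run; the seller_mapping input shape (buyer term to seller segment) is a hypothetical illustration, not part of the spec:

```python
from collections import defaultdict

def find_collapses(campaign_intent: dict, seller_mapping: dict) -> list:
    """Flag buyer terms declared distinct_from each other that the
    seller resolved to the same segment (many-to-one collapse)."""
    concepts = {c["term"]: c for c in campaign_intent["audience_concepts"]}
    by_segment = defaultdict(list)
    for term, segment in seller_mapping.items():
        by_segment[segment].append(term)
    collapses = []
    for segment, terms in by_segment.items():
        # A shared segment is only a problem if the buyer asserted
        # the collapsed terms are distinct audiences.
        violated = any(
            b in concepts.get(a, {}).get("distinct_from", [])
            for a in terms for b in terms
        )
        if violated:
            collapses.append({
                "buyer_terms": sorted(terms),
                "seller_segment": segment,
                "issue": "many_to_one_collapse",
            })
    return collapses

intent = {"audience_concepts": [
    {"term": "outdoor_recreation", "distinct_from": ["extreme_sports"]},
    {"term": "extreme_sports", "distinct_from": ["outdoor_recreation"]},
]}
mapping = {"outdoor_recreation": "Sports Fans", "extreme_sports": "Sports Fans"}
print(find_collapses(intent, mapping))
```

This is the Scenario B failure caught before execution: two terms the buyer declared distinct landing in one segment produces a flaggable finding, with no semantic analysis required.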

2. Semantic fidelity as a core finding category

semantic_fidelity becomes a specified category_id in check_governance findings — alongside budget_authority, brand_policy, and regulatory_compliance. Note that category_id values in AdCP are implementation-defined strings — the spec defines common ones but does not restrict to an enum. Adding semantic_fidelity as a new category_id follows the existing extensibility pattern.

Today (3.0) — no semantic verification:

{
  "findings": [
    { "category_id": "budget_authority", "severity": "must", "explanation": "Within limit", "confidence": 1.0 },
    { "category_id": "brand_policy", "severity": "must", "explanation": "Publishers approved", "confidence": 1.0 }
  ]
}

Proposed (3.1) — with semantic verification:

{
  "findings": [
    { "category_id": "budget_authority", "severity": "must", "explanation": "Within limit", "confidence": 1.0 },
    { "category_id": "brand_policy", "severity": "must", "explanation": "Publishers approved", "confidence": 1.0 },
    {
      "category_id": "semantic_fidelity",
      "severity": "should",
      "explanation": "Buyer terms 'outdoor_recreation' and 'extreme_sports' both resolved to seller segment 'Sports Fans'. Buyer declared these as distinct audiences (distinct_from: each other). Seller's IAB Audience Taxonomy 1.1 coverage is partial — hierarchy depth 2 does not distinguish sub-categories within 'Sports & Recreation'.",
      "confidence": 0.85,
      "details": {
        "intent_source": "campaign_intent.audience_concepts",
        "mismatches": [
          {
            "buyer_terms": ["outdoor_recreation", "extreme_sports"],
            "seller_segment": "Sports Fans",
            "issue": "many_to_one_collapse",
            "cause": "seller_taxonomy_insufficient_depth",
            "seller_taxonomy": "iab-audience-1.1",
            "seller_hierarchy_depth": 2
          }
        ],
        "geographic_verification": {
          "status": "passed",
          "buyer_declared": "US (country code)",
          "seller_interpreted": "US"
        }
      }
    }
  ]
}

Notice how the taxonomy context enriches the finding. The governance agent doesn't just say "two terms collapsed" — it explains why: the seller's IAB Audience Taxonomy 1.1 only has two hierarchy levels, which structurally cannot distinguish sub-categories within "Sports & Recreation." This is the difference between "the seller made a bad mapping" and "the seller's taxonomy can't represent what the buyer asked for." Both are important; taxonomy visibility (see companion issue #3362) makes the distinction possible.
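Geographic verification against the disambiguation array is similarly mechanical. A sketch under the same caveat: the seller_geo_targets list is an assumed input shape, not a spec field:

```python
def verify_geo(geographic_intent: dict, seller_geo_targets: list) -> dict:
    """Check the seller's resolved geo targets against the buyer's
    declared disambiguations (the 'New Mexico' class of problem)."""
    violations = []
    for rule in geographic_intent.get("disambiguation", []):
        for target in seller_geo_targets:
            if target == rule["not"]:
                violations.append({
                    "term": rule["term"],
                    "buyer_meant": rule["means"],
                    "seller_resolved": target,
                })
    return {
        "status": "failed" if violations else "passed",
        "violations": violations,
    }

geo_intent = {
    "regions": ["US"],
    "disambiguation": [
        {"term": "New Mexico", "means": "US state (NM)", "not": "country of Mexico"}
    ],
}
print(verify_geo(geo_intent, ["country of Mexico"]))
```

A real implementation would normalize both sides to geo codes rather than compare strings; the point is that a declared disambiguation gives governance something concrete to test.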

How the two pieces work together

Buyer                          Seller                         Governance Agent
  |                              |                                |
  |-- get_adcp_capabilities ---->|                                |
  |<-- capabilities -------------|                                |
  |    (incl. supported_taxonomies — see companion issue)         |
  |                              |                                |
  |-- sync_plans --------------->|-----(plan forwarded)---------->|
  |   (campaign_intent)          |                                |
  |                              |                                |
  |-- get_products ------------->|                                |
  |<-- products[] ---------------|                                |
  |                              |                                |
  |-- check_governance ------------------------------------------>|
  |   (proposed buy)             |  [governance checks:           |
  |                              |   - budget_authority           |
  |                              |   - brand_policy               |
  |                              |   - regulatory_compliance      |
  |                              |   - semantic_fidelity (NEW)    |
  |                              |   uses: campaign_intent        |
  |                              |   uses: taxonomy context]      |
  |<-- findings[] ------------------------------------------------|

3. Escalation follows existing patterns

Semantic misalignment triggers the same escalation flow as budget authority violations:

  • severity: "must" findings (geographic misinterpretation — the New Mexico scenario) -> governance agent holds the check async -> routes to human reviewer -> approve/deny with conditions
  • severity: "should" findings (taxonomy precision loss — the outdoor enthusiasts scenario) -> advisory, logged in audit trail, buyer decides whether to proceed
  • severity: "may" findings (interpretation drift across sellers — the "premium" scenario) -> informational, available for post-campaign analysis

No new escalation machinery needed. The existing async hold -> human review -> resolution pattern works unchanged.
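Since the flows already exist, the routing itself reduces to a small dispatch on severity. The flow names below are illustrative labels for the existing escalation paths, not spec identifiers:

```python
def route_finding(finding: dict) -> str:
    """Route a semantic_fidelity finding onto the existing
    escalation flows, keyed purely on severity."""
    severity = finding["severity"]
    if severity == "must":
        # e.g. geographic misinterpretation: hold async, human decides
        return "hold_for_human_review"
    if severity == "should":
        # e.g. taxonomy precision loss: logged, buyer decides
        return "advisory_to_buyer"
    # "may": interpretation drift, post-campaign analysis only
    return "log_informational"

print(route_finding({"category_id": "semantic_fidelity", "severity": "must"}))
```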

4. Crawl, walk, run

Following the governance protocol's established adoption pattern:

  • Crawl (audit mode): campaign_intent is optional. Governance evaluates semantic fidelity where inputs are available, always returns approved. Findings are logged for review. Missing inputs produce a "may" finding: "No campaign intent declared — semantic verification skipped." Taxonomy context from the companion proposal, where available, enriches findings.
  • Walk (advisory mode): campaign_intent is expected for campaigns above a configurable budget threshold. Governance returns real should findings. Both sides tune their declarations based on real data.
  • Run (enforce mode): campaign_intent is required. must-severity semantic misalignments block execution. Geographic misinterpretation is a hard stop. Taxonomy precision loss severity is configurable per-plan.

Callers never change their integration code across modes — the governance agent's internal configuration controls enforcement level.
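A sketch of how that internal configuration might gate the decision. The mode names mirror the crawl/walk/run stages above; the decision strings are hypothetical, not spec values:

```python
def decide(findings: list, mode: str) -> str:
    """Turn check_governance findings into a decision under the
    crawl/walk/run enforcement modes. The mode lives in the governance
    agent's own configuration; callers never see or set it."""
    blocking = [
        f for f in findings
        if f["category_id"] == "semantic_fidelity" and f["severity"] == "must"
    ]
    if mode == "audit":      # crawl: evaluate, log, always approve
        return "approved"
    if mode == "advisory":   # walk: surface findings, never block
        return "approved_with_findings" if blocking else "approved"
    if mode == "enforce":    # run: must-severity misalignment hard-stops
        return "held_for_review" if blocking else "approved"
    raise ValueError(f"unknown mode: {mode}")

geo_miss = {"category_id": "semantic_fidelity", "severity": "must"}
print(decide([geo_miss], "audit"), decide([geo_miss], "enforce"))
```

The same findings list produces different decisions purely by configuration, which is what lets integrations stay unchanged across adoption stages.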

Regulatory context

  • EU AI Act (Article 50) — Transparency obligations for AI systems in advertising decisions
  • GDPR (Article 22) — Rights related to automated decision-making and profiling
  • EU DSA (Article 26) — Transparency in online advertising targeting
  • California SB 942 — Transparency requirements for AI-generated content and decisions

Semantic verification creates protocol-level infrastructure for compliance. When a regulator asks "how did the AI interpret this campaign brief?", the governance audit trail provides a structured answer — including how intent was declared and where interpretation diverged. Combined with taxonomy declaration (companion issue), the trail also records what classification systems were in play.

Stakeholder considerations

| Stakeholder | Benefit | Concern |
| --- | --- | --- |
| Buyer agent / DSP | Can prove specificity when sellers underdeliver | Who generates campaign_intent from natural-language briefs? If the buyer's LLM produces it, who validates that interpretation? |
| Seller agent / SSP | Clear signal of what the buyer actually wants | Sellers must demonstrate how they interpreted intent; interpretation quality becomes visible |
| Governance agent | Gets the structured inputs it needs to verify | Requires semantic analysis capabilities; not all governance implementations are AI-powered — many are rules-based |
| Brand safety provider | Semantic fidelity prevents brand-unsafe placements that pass keyword checks | Brand safety taxonomies (GARM, IAS, DV) should be included in taxonomy declarations (see companion issue) |
| 3P orchestrator | Can compare interpretation quality across sellers for the same brief | Must assemble intent declarations across multiple briefs and sellers |
| Advertiser procurement | Structured evidence of interpretation quality, not just outcome metrics | Audit trail needs human-readable rendering, not just API responses |

Verification levels

The proposal should not assume governance agents have AI capabilities. Verification should be defined in levels:

  • Level 1 — Structural comparison: Buyer declared two concepts as distinct, seller collapsed them into one segment -> flag. Achievable by any rules-based system.
  • Level 2 — Taxonomy-aware: Check whether the seller's declared taxonomy could represent the buyer's distinction based on hierarchy depth and coverage. Requires taxonomy metadata (see companion issue) but not AI.
  • Level 3 — Semantic analysis: Evaluate whether the seller's mapping was semantically reasonable given the taxonomy structure. Requires NLP or embedding capabilities.

The 3.1 spec should require Level 1, recommend Level 2, and allow Level 3. This ensures the lowest adoption bar is achievable by all governance implementations.
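Level 2 needs only declared taxonomy metadata, no NLP. A sketch assuming a hierarchy_depth field of the kind the companion taxonomy-declaration issue would supply (the field name is an assumption):

```python
def taxonomy_can_distinguish(required_depth: int, taxonomy: dict) -> bool:
    """Level-2 check: can the seller's declared taxonomy structurally
    represent the buyer's distinction? hierarchy_depth is an assumed
    field from the seller's taxonomy declaration (companion issue)."""
    return taxonomy.get("hierarchy_depth", 0) >= required_depth

# Distinguishing sub-categories within "Sports & Recreation" needs a
# third hierarchy level; a depth-2 taxonomy structurally cannot do it.
seller_taxonomy = {"id": "iab-audience-1.1", "hierarchy_depth": 2}
print(taxonomy_can_distinguish(3, seller_taxonomy))  # -> False
```

This is how a finding can distinguish "the seller made a bad mapping" from "the seller's taxonomy can't represent what the buyer asked for."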

Open questions for the working group

  1. Scope for 3.1: Should semantic verification cover all brief elements (audience, geo, context, creative, publisher quality) or start with a subset? Recommendation: audience and geographic intent for 3.1, expand in 3.2.
  2. Intent declaration generation: Should the spec allow governance agents to generate campaign_intent from the natural-language brief as a service, rather than requiring the buyer to produce it? This lowers the adoption burden for buyer agents.
  3. Default severity: Should semantic misalignment default to must (blocking) or should (advisory)? Should severity be configurable per intent field (e.g., geographic = must, audience precision = should)?
  4. Human-readable audit: get_plan_audit_logs should return semantic fidelity findings in a format that supports human-readable rendering for advertiser procurement teams, not just machine consumption. What format?
  5. Backward compatibility: In crawl mode, missing campaign_intent produces an informational finding. Is this sufficient, or should 3.1 define a migration timeline to required status?

Working group context and prior decisions

This proposal is grounded in governance working group discussions and protocol decisions already underway:

Governance WG Slack discussion (April 2025): The working group established that governance requirements will be agreed between buyer and seller within the Policy Registry framework and applicable regulations. The consensus: auditable logs are recommended for self-protection, while what gets shared with counterparties is open to interpretation where proprietary considerations apply. The practical design direction: support both internal and shareable views.

Brian O'Kelley's response confirmed the architectural foundation this proposal builds on:

  • get_plan_audit_logs already captures budget tracking, validation history, and compliance summary grouped by governance_context — experimental in 3.0 with 6-week change notice. This means campaign_intent as an additive field on the experimental sync_plans surface follows the established modification contract.
  • The Policy Registry's effective_date mechanism supports the crawl→walk→run adoption path — semantic fidelity can be informational (effective_date set in the future) before it enforces, exactly as the working group recommended for "minimal restrictions initially."
  • The hard invariant that bespoke policies only add restrictions, never relax them, ensures that entities with higher compliance thresholds can layer stricter semantic fidelity requirements without undermining the baseline.

Issues and PRs already resolved:

Production validation: Yahoo has run live agentic campaigns with a major holding company partner in which the exact scenarios described in this proposal occurred in production. One example: a healthcare campaign targeting "Bay Area" where the agent narrowed to the San Francisco DMA only, missing Sacramento, Modesto, and other Northern California DMAs, resulting in insufficient avails and an unfulfilled buy. An ontology with proper geo decomposition would have caught this at the semantic fidelity check, before a dollar moved.

Industry collaboration: Yahoo is co-leading the semantic matching and context graph workstreams within AdCP, and partnering with Google on the open-source BigQuery Agent Analytics SDK where the same semantic verification patterns — entity resolution, adaptive similarity scoring, cross-encoder reranking — are being applied to agentic systems at large. The verification levels proposed here (rules-based → taxonomy-aware → AI-powered) map directly to the SDK's Extract → Retrieve → Rerank pipeline architecture.


Relationship to other tracks

  • Taxonomy declaration: Separately tracked. Taxonomy declaration defines how sellers advertise which classification systems, alignment methods, and hierarchy depths they support — so buyers and governance agents can assess compatibility before a campaign begins. Taxonomy context enriches semantic fidelity findings by explaining why a mismatch occurred (mapping error vs. structural limitation). See: Proposal: Taxonomy declaration as a core capability (3.1).
  • Decision provenance: Separately tracked. Decision provenance is seller-facing — governance validates that sellers followed a defensible evaluation process. Semantic fidelity is buyer-facing — governance validates that interpretation matches intent. Taxonomy declaration serves both: it contextualizes semantic fidelity findings and provides structure for decision provenance attestations. See: Proposal: Decision provenance and lineage as core governance capabilities (3.1).

Prior work

The feat/semantic-governance-extensions branch on mikulbhatt/adcp contains prototype schemas for ext.semantic_alignment, ext.decision_trace, and ext.semantic_capabilities. Working group feedback recommended promoting these from extensions to core protocol capabilities. This proposal incorporates that feedback — semantic_alignment becomes the semantic_fidelity finding category and campaign_intent is a new first-class field rather than an extension namespace. The semantic_capabilities extension is now proposed as supported_taxonomies on capabilities in the companion taxonomy declaration issue.


Related: see also #3362 (Proposal: Taxonomy declaration as a core capability, 3.1) | #3364 (Proposal: Decision provenance and lineage as core governance capabilities, 3.1) | #3365 (Proposal: AdCP Reference Media Ontology — a shared vocabulary for agentic advertising, 3.1)
