
[DRAFT] FEAT: Benchmark Scenario#1662

Draft
ValbuenaVC wants to merge 7 commits into microsoft:main from ValbuenaVC:benchmark

Conversation

@ValbuenaVC
Contributor

Description

Adds a benchmarking scenario to PyRIT to compare the performance between adversarial targets. This is currently a draft PR and there are several design conflicts to resolve before opening for review.

  1. The largest design tension is that get_strategy_class doesn't work with the factory pattern for scenario strategy generation, because the scenario instance changes the scenario strategy for benchmarks. The working solution is to intercept the lifecycle at several points in the scenario (_build_benchmark_strategy => _prepare_strategies => _get_atomic_attacks). This works but is brittle: callers like registries see a "blank" version of the strategy, while at runtime the strategy is fully populated with live adversarial targets.
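A minimal sketch of the tension described above, with illustrative names (not PyRIT's actual API): a registry calls a classmethod before any instance exists, so it can only ever see the placeholder strategy, while the real strategy enum is rebuilt per instance from the live targets.

```python
from enum import Enum


class StaticScenario:
    # Registries call this before any instance exists, so the strategy
    # must be known at class-definition time.
    @classmethod
    def get_strategy_class(cls):
        return cls._strategy_class


class BenchmarkScenario(StaticScenario):
    # The benchmark's members depend on the adversarial targets passed to
    # __init__, so the class-level view is necessarily "blank".
    _strategy_class = Enum("BenchmarkStrategy", {"PLACEHOLDER": "placeholder"})

    def __init__(self, adversarial_targets):
        # Instance-time rebuild: one strategy member per live target.
        self._strategy_class = Enum(
            "BenchmarkStrategy",
            {name.upper().replace("-", "_"): name for name in adversarial_targets},
        )


registry_view = BenchmarkScenario.get_strategy_class()  # the "blank" view
runtime_view = BenchmarkScenario(["gpt-4o", "llama-3"])._strategy_class
```

The brittleness follows directly: any code path that consults the class-level view gets an answer that disagrees with the instance-level one.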

  2. We explicitly filter out non-adversarial attack strategies using a list of attack names in _build_benchmark_strategy, but this is also brittle. We have options for adding richer tagging. A cheap intervention could be to check if adversarial_target is an attribute of that attack type. Another could be to use TargetCapabilities and add an is_adversarial tag, which could pass through the attack to the caller in the scenario. But as-is we're just keeping a literal list of attacks we know have adversarial targets.
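The "cheap intervention" from point 2 could look something like the duck-typing check below. The attack class names are illustrative stand-ins, not a claim about PyRIT's actual class hierarchy.

```python
class PromptSendingAttack:
    pass  # no adversarial target involved


class CrescendoAttack:
    adversarial_target = None  # populated with a live target at runtime


def is_adversarial(attack_cls: type) -> bool:
    # hasattr-based duck typing: brittle in its own way (any stray attribute
    # named adversarial_target is a false positive), but it avoids maintaining
    # a hard-coded list of attack names.
    return hasattr(attack_cls, "adversarial_target")
```

A richer TargetCapabilities-style tag would make the intent explicit instead of relying on an attribute name.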

  3. The original requirements asked to grab list[PromptChatTarget] in the constructor. The issue with this is that targets don't know they're adversarial, so we need to label them with a human-readable name. model_name isn't guaranteed and similar fields don't exist on the target, so we fall back on the identifier. Not a great design in my opinion, and inferring the model name from a private attribute is also a yellow flag. We could change the constructor to grab dict[str, PromptChatTarget] where str is a human-readable name, but that's less ergonomic.
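The labeling fallback described in point 3 can be sketched like this. All names here are hypothetical; the real fallback path in the PR reads the target's identifier rather than the class name.

```python
def label_targets(targets):
    """Map each target to a human-readable label (illustrative fallback)."""
    labeled = {}
    for target in targets:
        # model_name isn't guaranteed on every PromptChatTarget subclass...
        name = getattr(target, "model_name", None)
        if not name:
            # ...so fall back on an identifier-derived label (class name here).
            name = type(target).__name__
        labeled[name] = target
    return labeled


class OpenAIChatTarget:  # hypothetical target that exposes a model name
    model_name = "gpt-4o"


class MysteryTarget:  # hypothetical target that doesn't
    pass


labels = label_targets([OpenAIChatTarget(), MysteryTarget()])
```

The dict[str, PromptChatTarget] alternative would simply make the caller supply the keys of this mapping up front, trading ergonomics for explicitness.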

  4. There's explicitly no CLI support, and there can't be because of the get_strategy_class issue. This will have downstream implications for the GUI that I'd like to fix.

  5. Scenarios are designed to be plug-and-play. Do we need a list of default adversarial targets?

  6. _build_benchmark_strategy is a huge function and should be refactored: it returns a tuple of length 3 and does too much, but I'm not sure how to refactor it while keeping it similar to rapid response.
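One possible shape for the refactor in point 6: bundle the three return values in a small dataclass so each builder step can be split into its own helper. Field names mirror the tuple in the function signature quoted below, but the class itself is purely illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class BenchmarkStrategyBundle:
    """Illustrative bundle for the three values the builder currently returns."""

    strategy_class: type  # the dynamically built ScenarioStrategy enum
    descriptions: dict = field(default_factory=dict)  # member name -> description
    attack_specs: list = field(default_factory=list)  # AttackTechniqueSpec entries


bundle = BenchmarkStrategyBundle(strategy_class=object)
```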

  7. TBD on if this should get an integration test in this PR.

Tests and Documentation

Added tests/unit/scenario/test_benchmark.py.

Victor Valbuena added 2 commits April 23, 2026 17:33
@ValbuenaVC ValbuenaVC changed the title Benchmark [DRAFT] FEAT: Benchmark Scenario Apr 27, 2026
```python
    adversarial_models: list[PromptChatTarget] | None = None,
) -> tuple[type[ScenarioStrategy], dict[str, str], list[AttackTechniqueSpec]]:
    """
    Build the Benchmark strategy class dynamically from SCENARIO_TECHNIQUES.
```
Contributor


I think we can replace these at the factory level and simplify things a bunch. I'm going to take a stab.

Contributor


#1664

There might be ways to simplify so we don't need to overwrite _get_atomic_attacks_async either, but for now I think something like this would be good.

The fundamental architectural difference: this PR treats models as a strategy dimension (permuting them into enum members), requiring two different strategy classes and a _prepare_strategies override to reconcile them.

#1664 treats models as a runtime parameter (looping at create-time), keeping the strategy axis purely about technique selection — which is what it was designed for.
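The contrast between the two designs can be made concrete with a toy example. Technique and model names below are illustrative.

```python
from itertools import product

techniques = ["crescendo", "red_teaming"]  # illustrative technique axis
models = ["gpt-4o", "llama-3"]             # illustrative adversarial models

# This PR: models are folded into the strategy axis, so the enum is permuted
# and grows multiplicatively with every model added.
permuted_members = [f"{t}_{m}" for t, m in product(techniques, models)]

# #1664: the strategy axis stays techniques-only; models are looped over at
# create-time when the atomic attacks are built.
strategy_members = list(techniques)
attack_specs = [(t, m) for t in techniques for m in models]
```

Both produce the same technique-model pairs; the difference is which layer owns the model dimension, and therefore whether the strategy enum stays stable.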

Comment thread pyrit/scenario/scenarios/benchmark/benchmark.py Outdated

```python
if adversarial_models:
    permuted_specs = []
    for model in adversarial_models:
```
Contributor


Are model names definitely unique? Just thinking: if we have two models with the same name, we have a slight issue currently. E.g., with two "gpt-4o" model names we end up with two identical technique names that resolve, so the second model would get overwritten without any warning or error. Maybe we add a suffix to ensure unique names, or we check for model label collisions early and raise a warning so it's not silent?
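A sketch of the suffixing idea suggested above: disambiguate duplicate labels up front instead of letting the second model silently overwrite the first. Names are illustrative.

```python
from collections import Counter


def dedupe_model_labels(names):
    """Append a numeric suffix to any label that appears more than once."""
    counts = Counter(names)
    seen = Counter()
    out = []
    for name in names:
        if counts[name] > 1:
            seen[name] += 1
            out.append(f"{name}-{seen[name]}")  # e.g. gpt-4o-1, gpt-4o-2
        else:
            out.append(name)
    return out


labels = dedupe_model_labels(["gpt-4o", "gpt-4o", "llama-3"])
```

Raising early on collisions instead of suffixing would also work; the key point is that the duplication stops being silent.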

Contributor


(oh, Rich's suggestion might remove this issue)
