rwyn ("arwin") means run what you need.
rwyn is a stage-aware planner and executor for change-driven verification. Given a code change, it determines which repository requirements are plausibly at risk, gathers and weighs the relevant evidence, constructs the smallest practical plan it can justify for the current stage of the code lifecycle, and executes that plan automatically.
The goal is simple: get the confidence you need with the least unnecessary work.
Install rwyn:
curl -fsSL https://get.rwyn.dev/install.sh | sh
brew install rwyn
cargo install rwyn
Initialize a repository:
cd your-repo
rwyn init
rwyn run --stage save
rwyn plan --stage merge
rwyn explain
For contributors or local development from source:
cargo install --path .
Or build a release binary locally:
cargo build --release
./target/release/rwyn --help
Set up a repository in five steps:
- install the CLI
- run rwyn init
- review .rwyn/config.yaml
- run rwyn doctor
- run rwyn run --stage save
rwyn init creates the initial repository model:
- .rwyn/config.yaml
- an initial set of stages
- an initial set of steps
- obvious repository structure and toolchain assumptions
- suggested CI wiring
A minimal config looks like:
requirements:
  - id: tests-pass
    description: TypeScript tests pass
stages:
  save:
    default_confidence: medium
  commit:
    default_confidence: high
  merge:
    default_confidence: certain
steps:
  - id: test
    kind: test
    command: bun test
    inputs:
      - "src/**/*.ts"
    satisfies:
      - tests-pass
Bootstrap is the heavy-agent phase of the loop described in How The Model Is Built And Improved; after the first session, the same skill drives lighter ongoing iteration.
If you are using Claude Code or Codex, the best initial setup flow is:
- ask the agent to inspect the repo and scaffold .rwyn/config.yaml
- have it add declarative plugins for obvious repo-specific structure
- have it run rwyn doctor
- have it run rwyn plan --stage save
- have it explain any surprising selections
The agent uses the rwyn skill or plugin surface for setup. Repository truth lives in config and plugins, not in prompts.
The model improves as the repository gives rwyn more information:
- a minimal working config
- declared prerequisites and hard relationships
- dynamic evidence such as coverage
- plugins for hidden structure
- gaps, replay, and compare for ongoing refinement
Practical habits that move the model forward:
- keep steps narrow and scopeable
- declare obvious prerequisites and hard relationships explicitly
- collect coverage so test scoping and confidence improve
- model generated artifacts and hidden dependencies with plugins
- treat repeated expensive early-stage work as a sign that the repo needs a cheaper signal
Coverage tells rwyn what code a step actually exercises, which sharpens scoping and confidence beyond what declared and static evidence can give.
Use this loop to keep dynamic evidence current:
rwyn coverage status
rwyn coverage refresh
rwyn coverage collect --kind bun-typescript --step bun-test
Useful evidence is:
- incremental
- scope-aware
- fresh enough to trust
- shared across local runs, CI, and agents when possible
Add Plugins When The Repo Has Hidden Structure
Plugins capture repository truth that is real but not obvious from plain file layout:
- generated-artifact relationships
- hidden dependency edges
- interface-to-implementation links
- path-derived scopes
- repository-specific structure that affects relevance or confidence
Most teams still treat verification as tribal knowledge.
Developers learn rules like "if you touch this area, run these tests." CI pipelines encode partial logic in scattered configs and scripts. Agents miss important repo-specific checks, or overrun by falling back to "run everything."
rwyn exists to replace that folklore with a repository model. Instead of teaching every human and every agent what to run for every kind of change, the repository declares what it cares about once, and rwyn plans and executes from that model everywhere.
rwyn is built around a small set of concepts:
- requirement: A property the repository wants to hold, such as formatting being correct, generated artifacts being current, relevant builds succeeding, or relevant tests passing.
- step: An executable action that provides evidence about, verifies, satisfies, or helps satisfy one or more requirements.
- evidence: The raw information rwyn uses to decide what is relevant, which steps are useful, and when a plan is sufficient.
- plan: A stage-specific decision about which steps to run, in what order, at what scope, for which requirements.
- stage: A repo-defined lifecycle checkpoint with a default confidence target for relevant requirements.
- confidence: A global concept applied per requirement. Different requirements do not redefine what confidence means; they differ in what evidence is needed to reach it.
The model is many-to-many:
- a requirement may be supported by multiple steps
- a step may support multiple requirements
- some steps fully satisfy a requirement
- some steps provide only partial evidence
That lets the repository express realities like:
- a formatter satisfying a formatting requirement
- a non-mutating formatting step verifying the same requirement
- a narrow unit test providing partial evidence about a broader integration risk
- a generation step satisfying an artifact-freshness requirement that later verification depends on
The same logical requirement may admit different operational strategies at different stages. The repository model defines those choices explicitly.
rwyn is fundamentally an evidence system.
For a given change, rwyn first asks which requirements have non-zero plausible risk. Then it asks which steps provide the best next evidence for those requirements at the current stage. Planning stops when every relevant requirement reaches its effective confidence target.
This means rwyn distinguishes between two phases:
- selection: Which requirements are plausibly in play for this change?
- planning: Which steps should run now so each relevant requirement reaches the confidence needed for this stage?
A plan gathers enough evidence so that every relevant requirement reaches its target with the least unnecessary work. The planning question is:
what is the cheapest evidence I can gather now that reduces the chance of later-stage failure enough for this stage?
If a slow step is genuinely necessary at an early stage, rwyn runs it. Repeated expensive early-stage work is diagnostic: the repo is missing a cheaper earlier signal for that risk.
Evidence remains inspectable. rwyn can explain:
- why a requirement is relevant
- why a step is useful
- why the selected plan is sufficient
Those are three distinct layers of evidence:
- requirement evidence for relevance
- step evidence for usefulness
- plan evidence for sufficiency
Relevance is computed from a stack of evidence sources, with stronger evidence preferred before weaker evidence:
- declared repository knowledge
- static structural evidence
- semantic or AST-level evidence
- dynamic execution evidence such as coverage or traces
- historical empirical evidence
- heuristics and priors last
Freshness, scope, reliability, cost, contradiction, and recency are all inputs to the calculation.
Confidence is the probability, for a given requirement, that the selected subset of relevant checks catches what the full set of relevant checks would catch, measured against observed outcomes.
A target of 0.75 for a requirement means: calibrated against observed history, the selected subset is expected to catch the same set of failures the full set would catch with at least 0.75 probability. 1 - 0.75 = 0.25 is the acceptable probability that a failure surfaces later.
Confidence is tracked per relevant requirement, not as one global score for the whole change.
For each change:
- rwyn identifies the requirements with non-zero plausible risk.
- Each relevant requirement gets a confidence estimate from the evidence the planner has, the priors it carries, and the calibration accumulated from prior runs.
- Candidate steps are evaluated by how much useful evidence they provide relative to cost.
- rwyn keeps selecting steps until every relevant requirement reaches its effective confidence target.
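The per-change loop above can be sketched as a greedy selection. This is an illustration only, not rwyn's actual planner: the Step type, the lift values, and the rule for combining evidence (treating each step's lift as independent, so confidence becomes 1 - (1 - c)(1 - lift)) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """Hypothetical candidate step: a cost and a lift per requirement id."""
    id: str
    cost: float
    lift: dict  # requirement id -> lift in [0, 1]


def combine(confidence, lift):
    # Assumed combination rule: independent evidence (noisy-OR).
    # It never lowers confidence, so accumulation stays monotonic.
    return 1.0 - (1.0 - confidence) * (1.0 - lift)


def plan(steps, targets, priors=None):
    """Greedily pick steps until every requirement reaches its target.

    targets maps requirement id -> confidence target for the stage.
    Returns (selected step ids, confidence reached per requirement).
    """
    confidence = {req: (priors or {}).get(req, 0.0) for req in targets}
    remaining = list(steps)
    selected = []

    def gain(step):
        # Useful evidence counts only progress toward unmet targets.
        total = 0.0
        for req, target in targets.items():
            current = confidence[req]
            if current >= target:
                continue
            after = combine(current, step.lift.get(req, 0.0))
            total += min(after, target) - current
        return total

    while any(confidence[req] < target for req, target in targets.items()):
        scored = [(gain(s) / s.cost, s) for s in remaining if gain(s) > 0]
        if not scored:
            break  # no remaining candidate adds evidence; targets stay unmet
        _, best = max(scored, key=lambda pair: pair[0])
        remaining.remove(best)
        selected.append(best.id)
        for req in targets:
            confidence[req] = combine(confidence[req], best.lift.get(req, 0.0))
    return selected, confidence
```

A greedy gain-per-cost choice is only an approximation of the smallest sufficient plan; it is used here because it keeps the stopping rule (every relevant requirement at or above its target) easy to see.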
The stage ladder is a probability budget spread across the lifecycle: early-stage checks accept a higher probability of missed failures because later stages re-verify at higher targets.
Confidence targets inherit cleanly:
stage default -> requirement override
A stage supplies a default confidence target for the requirements relevant at that lifecycle point, declared with the default_confidence: field in .rwyn/config.yaml. Every stage must declare one; missing defaults are surfaced by rwyn doctor.
Confidence is configured on one global scale. Repositories can use either named labels or numeric values, and both resolve to the same underlying targets.
The built-in confidence labels map to:
| Label | Numeric target |
|---|---|
| low | 0.25 |
| medium | 0.50 |
| high | 0.75 |
| very_high | 0.90 |
| certain | 1.00 |
Numeric values use the same 0.00 to 1.00 scale. For example, confidence: 0.85 sets a stricter target than high and a looser target than very_high.
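The label mapping and the inheritance rule (stage default -> requirement override) can be sketched as follows. The label values come from the table above; the function names to_numeric and effective_target are hypothetical and are not part of rwyn.

```python
# Built-in labels and their numeric targets, as listed in the table.
LABELS = {
    "low": 0.25,
    "medium": 0.50,
    "high": 0.75,
    "very_high": 0.90,
    "certain": 1.00,
}


def to_numeric(value):
    """Resolve a label or a number to a target on the 0.00 to 1.00 scale."""
    if isinstance(value, str):
        if value not in LABELS:
            raise ValueError(f"unknown confidence label: {value!r}")
        return LABELS[value]
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise ValueError(f"confidence must be within 0.00..1.00, got {number}")
    return number


def effective_target(stage_default, requirement_override=None):
    """Use the requirement override when present, else the stage default."""
    chosen = stage_default if requirement_override is None else requirement_override
    return to_numeric(chosen)
```

For example, effective_target("medium") resolves to 0.50, while effective_target("medium", "certain") resolves to 1.00 because the requirement override replaces the stage default.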
Within a single planning pass, confidence accumulation is monotonic: adding valid evidence only maintains or increases confidence for a requirement.
Calibration is empirical. In a fresh repo, targets are reached using declared evidence and priors; as run history accumulates, calibration sharpens. The planner reports per requirement whether its confidence number is calibrated against history or still relying on priors.
The planner is one artifact of rwyn's verification model. Building and maintaining that model is the rest of the system.
The model is built and improved in three modes that share the same machinery:
- Bootstrap. Turn a fresh repo into a usable model in one good agent session. Sources include programmatic analysis (file structure, AST, language detection), dynamic evidence (coverage when collected), and AI-elicited declared knowledge (the strongest tier in the evidence stack). The only evidence source legitimately missing at this point is historical outcomes; everything else can be in place from the first session.
- Iteration. When rwyn misses a failure or pays too much for confidence, the diagnostic surface (gaps, explain, replay) describes the gap as honestly as it can: clean attribution where the data supports it, candidate causes where it does not, and explicit "I cannot tell" where it cannot. The agent reads that report, proposes a model change, validates with replay, and commits.
- Calibration. Background sharpening of probability estimates from accumulated runs. The planner's predictions become more honest as outcomes flow back into the model.
rwyn combines declared semantics with empirical evidence.
Users declare what they know for sure:
- explicit requirement and step relationships
- prerequisites
- obvious full-satisfaction cases
- stage configuration
- scope rules
- repo-specific structure
Everything else is learned empirically over time:
- how predictive a step really is for a requirement
- which early steps substitute well for broader later steps
- how much confidence a scoped run really buys
- which failure surfaces are under-modeled
- where the repo is missing a cheaper earlier signal
Declared configuration remains authoritative for planning and execution. When observed outcomes repeatedly contradict declared assumptions, rwyn surfaces the divergence through warnings, reports, and recommendations.
When a step fails at a later stage, rwyn looks backward to find the earlier stages where the same step was a candidate but was skipped. The miss type comes from why it was skipped:
- Selection miss. The relevance gate filtered the step out wrongly. Fix: tighten relevance.
- Weight miss. The step was a candidate, but the planner believed another step substituted for it. Fix: adjust evidence weights.
- Set miss. The step was not a candidate for the relevant requirement at the earlier stage. Fix: declare or learn the link.
- Link miss. The change-to-step relationship was not modeled at the earlier stage. Fix: add a plugin or declared edge.
Failures where no current step would have caught the problem (a novel failure mode, an unmodeled risk) are a different gap class — "no earlier signal exists for this failure type" — and surface separately, not as miss attribution.
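The four miss types can be read as a decision over why the step was skipped earlier. The sketch below is one possible reading, not rwyn's attribution logic: the EarlierDecision fields and the order in which the checks run are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EarlierDecision:
    """Hypothetical summary of how an earlier stage treated the failing step."""
    link_modeled: bool               # change-to-step relationship was modeled
    candidate_for_requirement: bool  # step was a candidate for the requirement
    passed_relevance_gate: bool      # relevance gate kept the step
    substituted_by: Optional[str]    # step the planner believed covered it


def classify_miss(decision):
    """Return (miss type, suggested fix) for one skipped earlier-stage decision."""
    if not decision.link_modeled:
        return "link miss", "add a plugin or declared edge"
    if not decision.candidate_for_requirement:
        return "set miss", "declare or learn the link"
    if not decision.passed_relevance_gate:
        return "selection miss", "tighten relevance"
    if decision.substituted_by is not None:
        return "weight miss", "adjust evidence weights"
    # Nothing in the record explains the skip; report that honestly.
    return "unattributed", None
```

The final branch mirrors the explicit "I cannot tell" case: when the record does not explain the skip, the sketch does not guess.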
rwyn produces diagnostics and accepts model changes. The orchestration of "read gap → propose change → validate → commit" lives in the skill. The bundled Claude Code and Codex skills are reference drivers; the loop they implement is one example among many.
JSON outputs (rwyn gaps --json, rwyn explain --json, rwyn plan --json) are the public APIs the loop writes against. Their schemas are stable across versions.
For each change, rwyn produces a run record — the durable artifact that powers replay, compare, gaps, and calibration over time.
A run record contains:
- identity: change ref (commit, diff, or range), stage, environment, timestamp, rwyn version, model state hash
- plan: the selected steps, the candidate steps that were skipped, the per-requirement confidence reached, scopes
- decisions: for each candidate step, why it was selected or skipped (the data that powers explain and miss attribution)
- outcomes: per executed step, pass/fail, duration, exit code, captured evidence (coverage paths, traces)
- provenance: source of the record (a local rwyn run, a CI run, or an external ingest)
Plans are proposals before execution and records after. The same object survives both phases, so intent, outcomes, and later attribution all reference the same artifact.
By default, run records live locally in .rwyn/runs/ as JSON files, one per run, and that directory is gitignored. The schema is stable across versions and admits external sources, so any record — local, CI, future hosted — can be ingested by any environment.
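To make the shape of a run record concrete, here is a minimal sketch that groups the fields listed above and round-trips them through JSON. Every field name is hypothetical: the real schema is whatever rwyn writes to .rwyn/runs/, and this example does not claim to match it.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunRecord:
    """Hypothetical run record; field names are illustrative only."""
    # identity
    change_ref: str
    stage: str
    environment: str
    timestamp: str
    rwyn_version: str
    model_state_hash: str
    # plan
    selected_steps: list = field(default_factory=list)
    skipped_steps: list = field(default_factory=list)
    confidence_reached: dict = field(default_factory=dict)
    scopes: dict = field(default_factory=dict)
    # decisions: step id -> why it was selected or skipped
    decisions: dict = field(default_factory=dict)
    # outcomes: step id -> pass/fail, duration, exit code, captured evidence
    outcomes: dict = field(default_factory=dict)
    # provenance: local run, CI run, or external ingest
    provenance: str = "local"


def to_json(record):
    """Serialize a record to stable, human-readable JSON."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def from_json(text):
    """Rebuild a record from JSON produced by to_json."""
    return RunRecord(**json.loads(text))
```

Keeping the plan, decisions, and outcomes in one object is what lets the same artifact serve as a proposal before execution and a record after it.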
The engine ships two primitives:
- rwyn export runs: write records out for transport (CI artifacts, archival, manual sharing)
- rwyn ingest runs <path>: bring records from elsewhere into the local model
Local↔remote sync is orchestrated by the skill. A typical flow: CI uploads .rwyn/runs/ as a build artifact at the end of a stage; the skill, on git pull or session start, downloads new artifacts and ingests them.
An opt-in runs_storage: git_branch mode stores records on a parallel branch like rwyn/runs. The tradeoffs (repo bloat, paths and outcomes in git history, a tool writing to a branch) are why it is opt-in.
rwyn works like this:
- Model the repository.
- Map a change onto that model.
- Select requirements with non-zero plausible risk.
- Evaluate candidate steps as evidence.
- Build the smallest practical sufficient plan for the stage.
- Execute that plan with ordering, prerequisites, and environment contracts preserved.
- Record outcomes and feed them back into future planning.
During development:
rwyn run --stage save
rwyn run --stage commit
When work is pushed remotely:
rwyn run --stage push
Before or during integration:
rwyn run --stage merge
When the result surprises you:
rwyn explain
rwyn gaps
When the repository model needs work:
rwyn doctor
rwyn gaps
rwyn is stage-aware, but stages are repo-defined lifecycle checkpoints, not platform nouns like "PR" or "merge queue".
A stage provides:
- a default confidence target for relevant requirements
- a lifecycle marker that steps reference to declare when they apply
The planner's objective is already cheapest sufficient evidence, so cost lives in the planner, not in stage configuration. Which steps run when is decided by step-level stage applicability.
The default stage vocabulary is:
- save
- commit
- push
- merge
- post_merge
- release
These are examples, not a universal lifecycle. Repos define the stages that match how they actually work, including names like:
- nightly
- staging
- hotfix
- perf
- security
- deploy
Stages are also flexible enough to support immediate local or operational goals, such as keeping the workspace healthy, validating post-merge behavior, or preparing release artifacts.
The command surface is small and role-oriented.
Most commands operate on the same core inputs:
- stage: Which lifecycle checkpoint you are planning for, such as save, commit, push, merge, or a repo-defined custom stage.
- change: The change under consideration. By default this is the current local diff, but it can also be a base/head range, a commit, a pushed change, or an explicit diff artifact.
- scope overrides: Optional narrowing or explicit step selection when a user wants to override automatic planning.
- output mode: Human-readable explanation by default, with machine-readable output available for CI, agents, and tooling.
Most commands use flags in the shape of:
--stage <stage>
--base <rev>
--head <rev>
--change <change-ref>
--step <step-id>
--scope <scope>
--json
Bootstrap rwyn in a repository.
init initializes the repository model and gets the repo to a usable baseline quickly.
rwyn init is responsible for:
- detecting languages, tools, and common repo patterns
- inferring an initial set of requirements and steps
- creating .rwyn/config.yaml
- suggesting stage defaults
- suggesting CI wiring
- optionally scaffolding plugins for common repo-specific structure
Examples:
rwyn init
rwyn init --yes
rwyn init --stage-defaults save,commit,push,merge
Validate installation, repo model, tools, environment, evidence state, and integrations.
doctor is the trust and diagnosis command. It answers questions like:
- is rwyn installed correctly?
- did the repo load the configuration I expect?
- are required tools available?
- are required environment contracts satisfied?
- is the repository model stale or broken?
- is coverage or other evidence missing or obviously inconsistent?
Examples:
rwyn doctor
rwyn doctor --json
rwyn doctor --stage merge
Build or refresh repository structure and derived evidence indexes.
build refreshes the repository model itself. In a mature setup it may happen automatically when needed; the explicit command is for debugging, CI bootstrap, and large repo changes.
Examples:
rwyn build
rwyn build --full
rwyn build --refresh
Plan and execute the right steps for the current change and stage.
run is the primary command and the one most day-to-day use lives in.
run is responsible for:
- selecting relevant requirements
- evaluating candidate steps as evidence
- building a sufficient plan for the requested stage
- executing the selected steps in the right order
- recording results for replay, comparison, analytics, and learning
Examples:
rwyn run --stage save
rwyn run --stage commit
rwyn run --stage merge
rwyn run --stage merge --json
run also supports explicit user intent when needed:
rwyn run --stage save --step rust-test
rwyn run --stage commit --scope src/foo.ts
rwyn run --stage merge --change origin/main...HEAD
Show the selected plan without executing it.
plan shows:
- what would run
- why it would run
- what is being scoped
- what prerequisites would be pulled in
- what confidence targets are driving the decision
Examples:
rwyn plan --stage save
rwyn plan --stage merge
rwyn plan --stage merge --json
rwyn plan --stage merge --change origin/main...HEAD
Explain a single planning decision.
explain operates on one decision at a time — the most recent plan, or a specific target like a file, requirement, or step. It answers:
- why a requirement is relevant
- why a step was selected
- why a scope was chosen
- why a broader or cheaper alternative was not chosen
- why the final plan is sufficient for the stage
For model-wide introspection — where the model itself is wrong, weak, or contradicted by observed outcomes — use gaps.
Examples:
rwyn explain
rwyn explain path/to/file.ts
rwyn explain --step integration-tests
rwyn explain --requirement formatting
Manage coverage and related dynamic execution evidence.
Coverage is one evidence source among many. These commands let the repo inspect, refresh, collect, and ingest coverage without treating it as the whole system.
Examples:
rwyn coverage status
rwyn coverage refresh
rwyn coverage collect --kind bun-typescript --step bun-test
rwyn coverage ingest path/to/lcov.info
Ingest external evidence or historical results.
This command family brings externally generated evidence into rwyn's model, including coverage, execution reports, CI artifacts, and learned priors.
Examples:
rwyn ingest coverage path/to/lcov.info
rwyn ingest runs path/to/run-records/
rwyn ingest evidence path/to/report.json
Re-evaluate historical changes against the current model.
replay answers: if the current planner had existed in the past, what would it have chosen, and what would it have missed?
This matters for:
- validating model changes
- measuring recall
- understanding regressions
- improving trust before changing policy
Examples:
rwyn replay
rwyn replay --stage merge
rwyn replay --since 30d
Compare behavior across stages, environments, or time.
compare helps answer questions like:
- what changed between local and CI behavior?
- what changed after a policy update?
- why does merge run more than commit here?
- where do plans diverge in ways that matter?
Examples:
rwyn compare --group change
rwyn compare --stage commit --stage merge
rwyn compare --environment local --environment ci
Surface where the model itself is wrong, weak, or contradicted.
Where explain introspects a single decision, gaps introspects the model against ground truth and accumulated outcomes. It surfaces two classes of gaps:
- correctness gaps: Missing early signals, contradicted declarations, under-modeled requirements, weak evidence paths.
- efficiency gaps: Expensive early-stage work, broad steps that need narrower scopes, missing cheaper proxies, repeated unnecessary evidence gathering.
Calibration of evidence weights from observed outcomes happens automatically as runs accumulate; gaps is how that calibration surfaces.
Examples:
rwyn gaps
rwyn gaps --stage commit
rwyn gaps --kind efficiency
rwyn gaps --json
Inspect, validate, and edit effective configuration.
This command family answers questions like:
- what config is actually in effect?
- where did this setting come from?
- how are stage defaults resolving?
- what does this requirement or step currently look like?
Examples:
rwyn config show
rwyn config show --effective
rwyn config explain stages.merge
rwyn config validate
Manage declarative repository-model extensions.
Plugins define repository-specific structure and evidence logic in the repo model.
Examples:
rwyn plugin list
rwyn plugin validate
rwyn plugin scaffold relation
Scaffold, inspect, and validate CI integration.
Examples:
rwyn ci init github-actions
rwyn ci init circleci
rwyn ci doctor
rwyn ci show
The primary config surface is .rwyn/config.yaml.
It describes:
- requirements
- steps
- stages
- plugins
- runtime paths
- evidence and learning policy
Split files are fine for larger repos, but the default experience is one obvious entry point.
Example:
graph: .rwyn/graph.json
coverage_data: .rwyn/coverage-data
runs_dir: .rwyn/runs
requirements:
  - id: rust-tests-pass
    description: Rust unit and integration tests pass
  - id: typescript-tests-pass
    description: TypeScript tests pass
  - id: bindings-current
    description: Generated Go bindings match the Solidity sources
plugins:
  - id: solidity-interface-link
    type: relation
    from: "interfaces/**/*.sol"
    to: "src/**/*.sol"
    edge: imports
    match_rule:
      by: normalized_basename
      from_strip_prefix: I
  - id: solidity-bindings
    type: generate
    from: "src/**/*.sol"
    to: "bindings/**/*.go"
    match_rule:
      by: normalized_basename
stages:
  save:
    default_confidence: medium
  commit:
    default_confidence: high
  merge:
    default_confidence: certain
steps:
  - id: rust-test
    name: Rust tests
    kind: test
    language: rust
    command: cargo test --all-targets --all-features
    tools: [cargo]
    inputs:
      - "src/**/*.rs"
    satisfies:
      - rust-tests-pass
  - id: bun-test
    name: Bun tests
    kind: test
    language: typescript
    command: bun test
    scopeable: true
    scope_flag: ""
    scope_type: test_paths
    tools: [bun]
    inputs:
      - "src/**/*.ts"
      - "src/**/*.tsx"
    coverage:
      kind: bun-typescript
      pass_scopes: true
    satisfies:
      - typescript-tests-pass
Explicit CLI flags still override config when needed.
Requirements are first-class declared objects. Each one names a property the repository wants to hold; steps reference requirements to declare what they provide evidence for.
A requirement describes:
- identity (id, optional human-readable name)
- description
- optional confidence override (replaces the stage default for this requirement when relevant)
Example:
requirements:
  - id: rust-tests-pass
    description: Rust unit and integration tests pass
  - id: security-checks-pass
    description: All critical security checks pass
    confidence: certain # always certain, regardless of stage default
  - id: bindings-current
    description: Generated Go bindings match the Solidity sources
Steps reference requirements by id, with relationship strength:
- satisfies: the step's success fully addresses the requirement
- evidence_for: the step is candidate evidence for the requirement; its lift is learned from outcomes
steps:
  - id: cargo-fmt-check
    satisfies:
      - formatting-clean
  - id: rust-test
    satisfies:
      - rust-tests-pass
    evidence_for:
      - bindings-current # rust tests indirectly exercise generated bindings
evidence_for contributes zero confidence until the planner has enough observed outcomes to calibrate the lift. The declaration marks the step as candidate evidence: when it runs (because it satisfies something else, or because it is cheap), its outcomes accumulate against the requirement and a learned weight emerges over time.
A mutating step (a formatter applying fixes) and a non-mutating step (a formatter in check mode) are two different steps. Each can declare stage applicability — stages: [list] to limit to specific stages, exclude_stages: [list] to remove specific ones, or neither to apply at every stage. The planner picks from stage-eligible steps:
steps:
  - id: cargo-fmt
    kind: format
    mutating: true
    stages: [save, commit]
    satisfies:
      - formatting-clean
  - id: cargo-fmt-check
    kind: format
    stages: [merge, push]
    satisfies:
      - formatting-clean
Mutation is a step property recorded on the step itself; behavior across stages is controlled by which step is listed where.
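Stage applicability as described above can be sketched as a small filter over step declarations. The helper names applies_at and eligible_steps are hypothetical, and the handling of a step that declares both stages and exclude_stages (both constraints applied) is an assumption, since the text does not say how the two combine.

```python
def applies_at(step, stage):
    """Return True when a step is eligible at the given stage."""
    allowed = step.get("stages")             # allowlist, or None when absent
    excluded = step.get("exclude_stages", [])  # blocklist, empty when absent
    if allowed is not None and stage not in allowed:
        return False
    return stage not in excluded


def eligible_steps(steps, stage):
    """List the ids of the steps the planner may pick from at this stage."""
    return [step["id"] for step in steps if applies_at(step, stage)]


# The two formatter steps from the config above, plus one hypothetical
# step with no stage fields, which therefore applies at every stage.
STEPS = [
    {"id": "cargo-fmt", "mutating": True, "stages": ["save", "commit"]},
    {"id": "cargo-fmt-check", "stages": ["merge", "push"]},
    {"id": "always-on"},
]
```

With these declarations, eligible_steps(STEPS, "save") keeps cargo-fmt and always-on, while eligible_steps(STEPS, "merge") keeps cargo-fmt-check and always-on.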
A step describes:
- identity and kind
- command
- inputs and outputs
- explicit prerequisites for non-file dependencies (requires:)
- which requirements it satisfies or provides evidence_for
- stage applicability (stages: to allowlist, exclude_stages: to blocklist; defaults to all stages)
- whether it mutates (mutating: true)
- whether and how it can be scoped
- toolchain requirements
- required environment variables
- optional evidence collectors such as coverage
Explicit step invocation uses the normal planner and executor, so prerequisites, layering, and evidence rules still apply.
Examples:
rwyn plan --step rust-test
rwyn plan --step bun-test --scope src/foo.test.ts
rwyn plan --step lint --step test --step-scope test=src/foo.ts
rwyn run always executes; rwyn plan never does. They share the same arg shape, so any preview is the same invocation with plan instead of run.
Step ordering is derived from declared inputs and outputs by default. If step B's inputs include a path that step A's outputs produce — directly, or via a generate-type plugin relationship — the planner runs A before B without anything explicit.
For dependencies that are not file-based (a service that must be running, a setup script that exports environment, a remote resource that must be initialized), declare them explicitly with requires::
steps:
  - id: db-migrate
    kind: setup
    command: ./scripts/migrate.sh
    stages: [save, commit, merge]
  - id: integration-test
    kind: test
    command: bun test --integration
    requires: [db-migrate]
    inputs:
      - "src/**/*.ts"
    satisfies:
      - integration-tests-pass
The planner combines implicit (file-derived) and explicit (requires:) ordering into a single dependency graph and executes steps in valid topological order. Cycles are surfaced by rwyn doctor.
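Combining implicit and explicit ordering into one graph can be sketched with Python's standard graphlib module. This is an illustration under simplifying assumptions, not rwyn's implementation: steps are plain dicts, outputs are concrete paths, inputs are globs matched with fnmatch (whose wildcard rules are looser than a real glob engine), and the gen-bindings and go-test steps are hypothetical.

```python
from fnmatch import fnmatch
from graphlib import CycleError, TopologicalSorter


def execution_order(steps):
    """Order steps so producers and required steps run first.

    Each step is a dict with "id" and optional "inputs" (globs),
    "outputs" (paths), and "requires" (step ids).
    """
    # Explicit edges: each step depends on everything it requires.
    graph = {step["id"]: set(step.get("requires", [])) for step in steps}
    for consumer in steps:
        for producer in steps:
            if producer["id"] == consumer["id"]:
                continue
            # Implicit edge: a producer output matches a consumer input glob.
            if any(
                fnmatch(path, pattern)
                for path in producer.get("outputs", [])
                for pattern in consumer.get("inputs", [])
            ):
                graph[consumer["id"]].add(producer["id"])
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # The second element of args holds the nodes forming the cycle.
        raise ValueError(f"dependency cycle: {err.args[1]}") from err


STEPS = [
    {"id": "integration-test", "requires": ["db-migrate"], "inputs": ["src/**/*.ts"]},
    {"id": "db-migrate"},
    {"id": "go-test", "inputs": ["bindings/**/*.go"]},
    {"id": "gen-bindings", "outputs": ["bindings/token/token.go"]},
]
```

Calling execution_order(STEPS) places db-migrate before integration-test (explicit edge) and gen-bindings before go-test (file-derived edge); two steps that require each other raise an error instead of producing an order.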
Steps can also declare environment contracts:
steps:
  - id: slice-v5
    name: Slice adapter v5
    kind: test
    language: typescript
    command: bun run scripts/integration-run.ts --adapter v5
    tools: [bun]
    required_env:
      - FELDERA_API_URL
      - FELDERA_API_TOKEN
rwyn integrates with existing CI systems:
- CI remains the execution substrate
- rwyn becomes the planner and executor
- local development, agents, and CI all use the same verification model
rwyn works when adopted entirely locally. When both local development and CI are routed through it, plans, evidence, and outcomes reinforce each other over time.
A CI setup looks like:
- name: Install rwyn
  run: curl -fsSL https://get.rwyn.dev/install.sh | sh
- name: Run merge-stage verification
  run: rwyn run --stage merge
CI bootstrap commands look like:
rwyn ci init github-actions
rwyn ci init circleci
rwyn ci doctor
rwyn treats coverage as one dynamic execution signal among many, used for scoping and confidence updates.
Coverage and other evidence are:
- incremental
- scope-aware
- freshness-aware
- reusable across local runs, CI, and agents
Examples:
rwyn coverage status
rwyn coverage refresh
rwyn coverage collect --kind bun-typescript --step bun-test
Executed plans also produce normalized run records that feed replay, compare, and gaps. Calibration of evidence weights from those records happens automatically; the loop that uses them to improve the model over time is described in How The Model Is Built And Improved.
Repo-specific structure lives in declarative repository knowledge: hidden dependency relationships, generated-artifact relationships, path-to-scope derivation, and any repository-specific structure that affects relevance or confidence.
The plugin DSL is extensible. New types can be added as the engine learns new repository patterns; the existing types remain stable.
Declares an edge between two sets of files. When a file in from: changes, files matched in to: are treated as semantically affected, and the planner uses the edge during relevance computation. The edge: label is a free-form string that surfaces in explain output ("touched via interface link") but does not drive planning logic — the planner cares that an edge exists, not what it is named.
- id: solidity-interface-link
  type: relation
  from: "interfaces/**/*.sol"
  to: "src/**/*.sol"
  edge: imports
  match_rule:
    by: normalized_basename
    from_strip_prefix: I
Declares that files in from: produce files in to:. Two effects: a generator step runs before any step that consumes the output, and changes to from: invalidate the freshness of the corresponding to: files until regenerated.
- id: solidity-bindings
  type: generate
  from: "src/**/*.sol"
  to: "bindings/**/*.go"
  match_rule:
    by: normalized_basename
Derives an execution scope for a scopeable step from a change. When changed files match from:, the named target_step:'s scope becomes the matching to: paths. This lets the planner narrow a broad step to the part of the repo a change actually affects, instead of running it across everything.
- id: typescript-tests-by-module
  type: scope
  target_step: bun-test
  from: "src/**/*.ts"
  to: "tests/**/*.test.ts"
  match_rule:
    by: normalized_basename
How from: and to: glob matches are paired. Three modes ship today; more may be added as the engine grows.
| Mode | Behavior | Notes |
|---|---|---|
| normalized_basename | Match by filename, stripped of optional prefix/suffix | Use from_strip_prefix / from_strip_suffix to normalize before comparison |
| directory_path | Match by directory path | Useful for "src/X/* maps to tests/X/*" style mappings |
| regex | Capture groups in from:, substitution in to: | Most flexible escape hatch; use when neither basename nor directory matching fits |
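As an illustration of the normalized_basename mode, the sketch below pairs files by filename after stripping an optional prefix or suffix. It is a guess at the behavior, not rwyn's matcher: "normalized" is assumed to mean the basename without its final extension, and the helper names and file paths are hypothetical.

```python
from pathlib import PurePosixPath


def normalize(path, strip_prefix="", strip_suffix=""):
    """Reduce a path to its basename without extension, minus optional affixes."""
    name = PurePosixPath(path).stem
    if strip_prefix and name.startswith(strip_prefix):
        name = name[len(strip_prefix):]
    if strip_suffix and name.endswith(strip_suffix):
        name = name[: -len(strip_suffix)]
    return name


def pair_by_basename(from_paths, to_paths, from_strip_prefix="", from_strip_suffix=""):
    """Pair each from-path with the to-paths sharing its normalized basename."""
    by_name = {}
    for path in to_paths:
        by_name.setdefault(normalize(path), []).append(path)
    return {
        path: by_name.get(normalize(path, from_strip_prefix, from_strip_suffix), [])
        for path in from_paths
    }


PAIRS = pair_by_basename(
    ["interfaces/IToken.sol", "interfaces/IVault.sol"],
    ["src/Token.sol", "src/core/Vault.sol", "src/Other.sol"],
    from_strip_prefix="I",
)
```

Here interfaces/IToken.sol pairs with src/Token.sol and interfaces/IVault.sol pairs with src/core/Vault.sol, which is the interface-to-implementation link the solidity-interface-link plugin declares.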
The goal is to keep repository truth in the repository model itself.
This repo includes an official Claude Code plugin and marketplace layout:
- marketplace: .claude-plugin/marketplace.json
- plugin manifest: plugins/rwyn/.claude-plugin/plugin.json
Local testing:
claude --plugin-dir ./plugins/rwyn
Public install after adding this repo as a marketplace:
claude plugin marketplace add smartcontracts/rwyn
claude plugin install rwyn@rwyn-plugins
This repo also includes a Codex plugin scaffold:
- marketplace entry: .agents/plugins/marketplace.json
- plugin manifest: plugins/rwyn/.codex-plugin/plugin.json
The bundled skill content lives under plugins/rwyn/skills/.
These skills are reference drivers for the loop described in How The Model Is Built And Improved. They demonstrate one good bootstrap-and-iteration flow against rwyn's diagnostic surface; users can replace them with their own.
Notable bundled skills:
- rwyn: Operate and debug an existing rwyn workflow.
- setup: Inspect a repo, scaffold .rwyn/config.yaml, and add declarative transforms.
- doctor: Diagnose a repo's rwyn setup and verification surface.
- select: Explain and inspect the chosen plan for a change.
- plan: Preview what rwyn would execute without running it.
- explain: Explain why a file or change selected a given plan item.
There is a parity harness at scripts/benchmark-parity.sh.
It compares rwyn against a legacy selector on a commit corpus and reports:
- selected item count
- missing selections vs legacy
- extra selections vs legacy
- per-commit runtime
Run the core checks locally:
cargo fmt
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-targets --all-features
MIT, see LICENSE.