Introduce `kosli evaluate` by tooky · Pull Request #671 · kosli-dev/cli

tooky · 2026-02-28T16:02:20Z

Why: Customers are duplicating attestation types to work around the lack of evaluation logic — creating separate types per environment just to encode different pass/fail rules. Three independent customers (JB, Deutsche Bank, Norsk Tipping) have hit this, and two more (Blackstone, NatWest) are building toward it. This is the most consistent product gap in our pipeline.

Objective: Add `kosli evaluate`, a CLI command that applies a Rego policy to trail data and returns a structured pass/fail decision. This separates what you collect (attestation type) from how you judge it (evaluation), and is the first step toward controls as a first-class product concept.

How this was built

This PR is also a demonstration of elephant carpaccio + TDD with Claude Code. The entire feature was built as a conversation — 48 commits, each one a single red-green-refactor step. Slices were kept thin enough to review independently, and the commit list reads bottom-to-top as a narrative of how the feature grew. Later commits came from reviewing the branch against Beck's Rules of Simple Design. The branch is intentionally unrebased so you can follow the progression.

Capabilities

`kosli evaluate trail <name>`: evaluates a single trail against a Rego policy
`kosli evaluate trails <name>...`: evaluates multiple trails in one policy call
`--output json|table`: structured audit output (`json`) or human-readable output (`table`, the default)
`--show-input`: includes the policy input in JSON output for debugging
`--attestations`: filters which attestations reach the policy (plain `name` for trail-level, `artifact.name` for artifact-level)
Exit code 0/1 reflects the policy decision (0 for allow, 1 for deny), designed for CI/CD gates
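
To illustrate the exit-code contract, here is a minimal sketch of how a pipeline step might gate on it. It is not code from this PR: the `gate` helper is a hypothetical name, and it assumes exit code 1 always means "policy denied". The CLI may also exit non-zero for other failures, which this sketch does not distinguish. Stand-in `sh` commands are used so the example runs without the `kosli` binary.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// gate runs a command and interprets its exit code the way a CI step would:
// 0 means allowed, 1 means denied, anything else is reported as an error.
// (Hypothetical helper; assumes exit code 1 is only used for a policy denial.)
func gate(name string, args ...string) (bool, error) {
	err := exec.Command(name, args...).Run()
	if err == nil {
		return true, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) && exitErr.ExitCode() == 1 {
		return false, nil
	}
	return false, fmt.Errorf("running %s: %w", name, err)
}

func main() {
	// In a real pipeline the call would look something like:
	//   gate("kosli", "evaluate", "trail", "--policy", "pr-approved.rego",
	//        "--org", "kosli", "--flow", "server", "<trail-name>")
	for _, script := range []string{"exit 0", "exit 1"} {
		allowed, err := gate("sh", "-c", script)
		fmt.Println(script, "-> allowed:", allowed, "err:", err)
	}
}
```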

Example

Validate all PRs are approved:

package policy

import rego.v1

default allow = false

violations contains msg if {
    some trail in input.trails
    some pr in trail.compliance_status.attestations_statuses["pull-request"].pull_requests
    count(pr.approvers) == 0
    msg := sprintf("trail '%v': pull-request %v has no approvers", [trail.name, pr.url])
}

allow if {
    count(violations) == 0
}

Kosli Server:

$ kosli evaluate trails \
  --policy pr-approved.rego \
  --org kosli \
  --flow server \
  --attestations pull-request \
  c643b06bf2efaa8f35d4da54c9e34a34a28bd251 \
  bd8254c58d20826df7248772cedf523f715516b6 \
  012cb304aab50bc4a3cc96fba7840ff29ea4d19e \
  a49a603c04b73c58d18909aace2f13e98892089f \
  9373bda52a51550b8ecb2236ed94cb88aa6e3a98
RESULT:  ALLOWED

CyberDojo Dashboard:

$ kosli evaluate trails \
  --policy tmp/pr-approved.rego \
  --org cyber-dojo \
  --flow dashboard-ci \
  9978a1ca82c273a68afaa85fc37dd60d1e394f84 \
  b334d371eb85c9a5c811776de1b65fb80b52d952 \
  5abd63aa1d64af7be5b5900af974dc73ae425bd6 \
  cb3ec71f5ce1103779009abaf4e8f8a3ed97d813
RESULT:      DENIED
VIOLATIONS:  trail '5abd63aa1d64af7be5b5900af974dc73ae425bd6': pull-request https://github.com/cyber-dojo/dashboard/pull/342 has no approvers
             trail '9978a1ca82c273a68afaa85fc37dd60d1e394f84': pull-request https://github.com/cyber-dojo/dashboard/pull/344 has no approvers
             trail 'b334d371eb85c9a5c811776de1b65fb80b52d952': pull-request https://github.com/cyber-dojo/dashboard/pull/343 has no approvers
             trail 'cb3ec71f5ce1103779009abaf4e8f8a3ed97d813': pull-request https://github.com/cyber-dojo/dashboard/pull/341 has no approvers
Error: policy denied: [trail '5abd63aa1d64af7be5b5900af974dc73ae425bd6': pull-request https://github.com/cyber-dojo/dashboard/pull/342 has no approvers trail '9978a1ca82c273a68afaa85fc37dd60d1e394f84': pull-request https://github.com/cyber-dojo/dashboard/pull/344 has no approvers trail 'b334d371eb85c9a5c811776de1b65fb80b52d952': pull-request https://github.com/cyber-dojo/dashboard/pull/343 has no approvers trail 'cb3ec71f5ce1103779009abaf4e8f8a3ed97d813': pull-request https://github.com/cyber-dojo/dashboard/pull/341 has no approvers]

Architecture

internal/evaluate/ — OPA Rego engine + trail data transforms (array-to-map, rehydration from detail API, filtering)
cmd/kosli/evaluateHelpers.go — shared options, flag registration, fetch+enrich pipeline, output dispatch
cmd/kosli/evaluateTrail.go / evaluateTrails.go — thin command wrappers
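
The array-to-map transform is what lets a policy write `attestations_statuses["pull-request"]` instead of searching a list. The following is a minimal sketch of that idea for trail-level attestations only, not the code in `internal/evaluate/`. The function names are hypothetical, the `is_compliant` field is illustrative, and the `attestation_name` key is taken from the commit messages.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyByName converts a list of attestation objects into a map keyed by each
// entry's "attestation_name". Entries without a string name are skipped.
func keyByName(statuses []any) map[string]any {
	byName := make(map[string]any, len(statuses))
	for _, s := range statuses {
		entry, ok := s.(map[string]any)
		if !ok {
			continue
		}
		name, ok := entry["attestation_name"].(string)
		if !ok {
			continue
		}
		byName[name] = entry
	}
	return byName
}

// transformTrail rewrites trail.compliance_status.attestations_statuses from
// an array to a map, in place. Trails without that shape are left unchanged.
func transformTrail(trail map[string]any) {
	cs, ok := trail["compliance_status"].(map[string]any)
	if !ok {
		return
	}
	if statuses, ok := cs["attestations_statuses"].([]any); ok {
		cs["attestations_statuses"] = keyByName(statuses)
	}
}

func main() {
	// Illustrative payload; the real API response has more fields.
	raw := `{"name":"t1","compliance_status":{"attestations_statuses":[
		{"attestation_name":"pull-request","is_compliant":true},
		{"is_compliant":false}]}}`
	var trail map[string]any
	if err := json.Unmarshal([]byte(raw), &trail); err != nil {
		panic(err)
	}
	transformTrail(trail)
	out, _ := json.Marshal(trail)
	fmt.Println(string(out))
}
```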

Also included

Upgraded golang.org/x/net to v0.51.0 for CVE-2026-27141

Slice 1 of kosli evaluate trail - adds the evaluate parent command and evaluate trail subcommand that fetches a trail from the API and wraps the response in {"trail": ...} JSON output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds internal/evaluate with Evaluate() function that validates and evaluates Rego policies. Validates package name is 'policy', requires an 'allow' rule, collects violations on deny. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds --policy flag to evaluate trail command. Reads a .rego file, evaluates it against the trail input using OPA, exits 0 on allow and 1 on deny. Uses Rego v1 syntax. Policy must use package policy and declare an allow rule. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…trail Slice 5c complete: FilterAttestations filters trail-level (plain name) and artifact-level (dot-qualified) attestations before rehydration, saving API calls for excluded attestations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
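
As an illustration of the plain versus dot-qualified distinction, here is a minimal sketch of parsing `--attestations` values and applying them. It is not the PR's `FilterAttestations`: the `selection` type and both functions are hypothetical, the attestation maps are simplified, and the sketch assumes the split happens on the first dot. The names `backend`, `unit-test`, `lint` and `snyk` are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// selection is a hypothetical parsed form of the --attestations values.
type selection struct {
	trail    map[string]bool            // plain names: trail-level attestations
	artifact map[string]map[string]bool // artifact name -> attestation names
}

// parseSelection splits each value on its first dot: "name" selects a
// trail-level attestation, "artifact.name" an artifact-level one.
func parseSelection(values []string) selection {
	sel := selection{
		trail:    map[string]bool{},
		artifact: map[string]map[string]bool{},
	}
	for _, v := range values {
		artifact, name, qualified := strings.Cut(v, ".")
		if !qualified {
			sel.trail[v] = true
			continue
		}
		if sel.artifact[artifact] == nil {
			sel.artifact[artifact] = map[string]bool{}
		}
		sel.artifact[artifact][name] = true
	}
	return sel
}

// keepNamed returns only the attestations whose name is in wanted.
func keepNamed(attestations map[string]any, wanted map[string]bool) map[string]any {
	kept := map[string]any{}
	for name, a := range attestations {
		if wanted[name] {
			kept[name] = a
		}
	}
	return kept
}

func main() {
	sel := parseSelection([]string{"pull-request", "backend.unit-test"})
	trailLevel := map[string]any{"pull-request": "...", "snyk": "..."}
	backend := map[string]any{"unit-test": "...", "lint": "..."}
	fmt.Println(keepNamed(trailLevel, sel.trail))
	fmt.Println(keepNamed(backend, sel.artifact["backend"]))
}
```

Filtering before rehydration, as the commit message notes, means detail lookups are only made for attestations the policy will actually see.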

Adds `kosli evaluate trails TRAIL-NAME [TRAIL-NAME...]` command that fetches multiple trails, wraps them in {"trails": [...]} input shape, and evaluates against a Rego policy. Supports --policy, --output, and --show-input flags with the same semantics as evaluate trail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
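
The two input shapes explain why a policy for `evaluate trails` iterates over `input.trails`, while one for `evaluate trail` would read `input.trail`. A minimal sketch of the wrapping, with hypothetical function names and only a `name` field per trail:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// singleTrailInput builds the {"trail": ...} shape described for `evaluate trail`.
func singleTrailInput(trail map[string]any) map[string]any {
	return map[string]any{"trail": trail}
}

// multiTrailInput builds the {"trails": [...]} shape described for `evaluate trails`.
func multiTrailInput(trails []map[string]any) map[string]any {
	if trails == nil {
		trails = []map[string]any{} // serialise as [] rather than null
	}
	return map[string]any{"trails": trails}
}

func main() {
	one, _ := json.Marshal(singleTrailInput(map[string]any{"name": "a"}))
	many, _ := json.Marshal(multiTrailInput([]map[string]any{{"name": "a"}, {"name": "b"}}))
	fmt.Println(string(one))
	fmt.Println(string(many))
}
```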

Adds rehydration (CollectAttestationIDs + RehydrateTrail) and --attestations filtering to evaluate trails, matching the same enrichment pipeline as evaluate trail. Each trail is independently transformed, filtered, and rehydrated before being collected into the trails array. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
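
A minimal sketch of the collect-then-merge behaviour the commit messages describe for rehydration: null IDs are skipped, detail fields are merged in, existing fields are not overwritten, and attestations without a matching detail are left alone. This is not the PR's `CollectAttestationIDs` or `RehydrateTrail`: it works on a flattened map of attestations, and the `attestation_id` key and `is_compliant` field are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// idField is a hypothetical key name for an attestation's identifier.
const idField = "attestation_id"

// collectIDs gathers the non-empty string IDs of all attestations, sorted
// for stable output. Null or missing IDs are skipped.
func collectIDs(attestations map[string]map[string]any) []string {
	ids := []string{}
	for _, a := range attestations {
		if id, ok := a[idField].(string); ok && id != "" {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

// rehydrate merges detail fields into each attestation with a matching ID.
// Fields already present on the attestation are never overwritten.
func rehydrate(attestations, details map[string]map[string]any) {
	for _, a := range attestations {
		id, ok := a[idField].(string)
		if !ok {
			continue
		}
		for k, v := range details[id] { // a nil or missing detail adds nothing
			if _, exists := a[k]; !exists {
				a[k] = v
			}
		}
	}
}

func main() {
	attestations := map[string]map[string]any{
		"pull-request": {idField: "a1", "is_compliant": true},
		"snyk":         {idField: nil},
	}
	fmt.Println(collectIDs(attestations))

	// In the CLI the details would come from the detail API; here they are inline.
	details := map[string]map[string]any{
		"a1": {"is_compliant": false, "pull_requests": []any{}},
	}
	rehydrate(attestations, details)
	fmt.Println(attestations["pull-request"])
}
```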

Remove the no-policy code path (raw JSON dump) since there's no use case for it without evaluation. Add "policy" to RequireFlags, remove early-return blocks, update tests accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move fetchAndEnrichTrail, evaluateAndPrintResult, and printEvaluateInput into evaluateHelpers.go. Both evaluate trail and evaluate trails now delegate to these shared functions, eliminating duplicated code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace manual if/else output format branching with the output.FormattedPrint dispatcher pattern used by other commands like getTrail. Adds printEvaluateResultAsJson and printEvaluateResultAsTable format functions. Removes the manual --output validation from both run methods, letting FormattedPrint handle unsupported formats consistently. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use assert for leaf value checks in TestTransformTrail, consistent with the other test functions in the file. Keep require only for the IsType guard that validates the type before subsequent access. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tooky and others added 30 commits February 27, 2026 17:12

Setup Claude to work in tiny steps

6aa74aa

Mark Slice 2 complete, Slice 3 active in TODO.md

57fae74

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --format json with allow-all policy prints JSON with allow true

59a5a70

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --format json with deny-all policy prints JSON with allow false and violations

95f81b3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --format text with allow-all policy prints human-readable allowed text

cc46763

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --format text with deny-all policy prints denied text with violations

c43c183

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: no --format flag defaults to text output

057c835

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --format json without --policy prints trail JSON unchanged

1e9ea53

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --format with invalid value returns descriptive error

c03c101

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: fix lint errors in text output, mark Slice 3 complete

f7c7550

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --show-input with --format json includes input in JSON output

fac65a8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --show-input with deny-all includes input alongside allow and violations

fbabaf9

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --show-input with --format text prints input JSON after result

78ba427

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: --show-input without --policy prints trail JSON unchanged

825d4ae

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mark Slice 4 complete, Slice 5 active in TODO.md

61daae1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: TransformTrail nil input returns nil

7335507

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: trail-level attestations_statuses array converts to map

b1e238f

Tests: non-map passthrough, no compliance_status passthrough, empty array to empty map, single attestation keyed by name. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: artifact-level attestations_statuses also converted to maps

21630f5

Tests: multiple trail attestations, artifact-level, both levels, multiple artifacts, entries without attestation_name skipped. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: integration tests verify attestations_statuses transformed to maps

aad23b2

Wire TransformTrail into evaluateTrail after JSON parse. Tests verify trail-level and artifact-level maps, plus Rego policy access by name. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mark Slice 5a complete, Slice 5b active in TODO.md

f2fab53

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: CollectAttestationIDs nil input returns empty slice

df42b1d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: CollectAttestationIDs collects ID from trail-level attestation

d026549

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: CollectAttestationIDs collects IDs from artifact-level attestation

206c443

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: CollectAttestationIDs skips null IDs and collects from both levels

224a736

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: RehydrateTrail nil details map leaves trail unchanged

f17e0cd

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: RehydrateTrail merges detail fields into trail-level attestation

eb4e18a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: RehydrateTrail merges into artifact-level and does not overwrite existing fields

e0edd8a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tooky and others added 21 commits February 28, 2026 07:08

green: RehydrateTrail attestation with no matching detail is left unchanged

305e72c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: integration tests verify attestations rehydrated with detail data

49e2e17

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mark Slice 5b complete, Slice 5c active in TODO.md

2251e21

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: FilterAttestations with plain name keeps only matching trail-level attestation

b38c6e8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: all FilterAttestations unit tests pass (plain, dot-qualified, mixed, no-match)

4234094

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: replace --format with --output/-o on evaluate trail

8270020

Aligns evaluate trail with the codebase convention of --output table|json (with -o shorthand, defaulting to "table") instead of --format text|json. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mark Slice 7 complete in TODO.md

46e32a8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: extract commonEvaluateOptions shared struct

7c82fa0

Both evaluateTrailOptions and evaluateTrailsOptions now embed commonEvaluateOptions instead of duplicating the same five fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: add debug logging for swallowed errors in attestation detail fetching

de31af6

Log HTTP and JSON parse errors at debug level when fetching attestation details during trail enrichment, instead of silently continuing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: make --output table produce tabular output with tabFormattedPrint

68762fd

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: extract addFlags method to DRY up evaluate command flag registration

6988552

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: extract walkTrailAttestations to DRY up tree traversal in transform.go

d59e1a8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

green: rename TestEvaluateTrailAttestationTransform to TestEvaluateTrailEnrichment

5cf5a36

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: upgrade golang.org/x/net to v0.51.0 for CVE-2026-27141

880cf9c

Fixes SNYK-GOLANG-GOLANGORGXNETHTTP2-15363313 — missing nil check in http2 causing server panic on malformed frames. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tooky wants to merge 51 commits into main from introduce-kosli-evaluate