Performance metrics aggregation for workflow execution #7

@JaimeStill

Migrated from tailored-agentic-units/tau-orchestrate#3

Summary

Collect and aggregate timing and resource metrics across workflow execution.

Requirements

  • Node execution duration (start/complete timestamps already in events)
  • Workflow total duration
  • Parallel worker utilization (active workers, queue depth)
  • Per-step latency distribution across chain executions
  • Error rates and categorization
  • Aggregation into summary statistics (min, max, mean, p50, p95, p99)

Design Constraints

  • Zero-overhead default: advanced features are opt-in; NoOpObserver behavior must remain zero-cost
  • Registry compatible: new observers must integrate with the existing string-based registry pattern
  • No core package coupling: consider a separate package (e.g., orchestrate/metrics) to avoid adding dependencies to the core orchestrate/observability package

Acceptance Criteria

  • Performance metrics can be aggregated into summary statistics without external tooling
  • All new observer implementations pass through the existing registry pattern
  • No performance regression for existing NoOpObserver and SlogObserver usage
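One way the no-regression criterion might be guarded is an allocation check on the no-op path, using `testing.AllocsPerRun` from the standard library. This is a sketch under assumptions: `Event`, `Observer`, and `NoOpObserver` are local stand-ins for the core types, `allocsPerEvent` is a hypothetical helper, and allocation count is only a proxy for "zero-cost", not a full benchmark.

```go
package main

import (
	"context"
	"fmt"
	"testing"
	"time"
)

// Stand-ins for the core observability types (assumed shapes).
type Event struct {
	Type      string
	Timestamp time.Time
	Source    string
}

type Observer interface {
	OnEvent(ctx context.Context, e Event)
}

type NoOpObserver struct{}

func (NoOpObserver) OnEvent(context.Context, Event) {}

// allocsPerEvent reports the average number of heap allocations made
// by a single OnEvent call on obs. The event is built once, outside
// the measured function, so only the dispatch itself is measured.
func allocsPerEvent(obs Observer) float64 {
	ctx := context.Background()
	ev := Event{Type: "node.complete", Timestamp: time.Now(), Source: "bench"}
	return testing.AllocsPerRun(1000, func() {
		obs.OnEvent(ctx, ev)
	})
}

func main() {
	fmt.Printf("NoOpObserver allocs per event: %.0f\n", allocsPerEvent(NoOpObserver{}))
}
```

In a real suite this check would more naturally sit in a `_test.go` file next to `go test -bench` benchmarks for the existing observers, so timing as well as allocations can be compared before and after the change.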

Metadata

Labels

feature (New functionality), orchestrate (orchestrate subsystem)

    Status

    Todo
