Migrated from tailored-agentic-units/tau-orchestrate#5
Summary
The following design questions need resolution through discussion before implementation of the advanced observability features begins.
Open Questions
- Scope boundary: Should OpenTelemetry integration live in orchestrate/otel or in a separate supplemental module to avoid adding OTel as a transitive dependency for the entire kernel module?
- Trace context propagation: Should trace IDs be carried in context.Context (Go standard), in Event.Data, or in a new field on Event?
- Metrics observer vs. trace observer: Should metrics aggregation and trace correlation be separate observer implementations, or a single combined observer?
- Confidence scoring: The original roadmap mentioned confidence scoring utilities. Is this an observability concern (recording confidence in events) or a workflow concern (routing based on confidence thresholds)? If the latter, it may belong in orchestrate/workflows rather than orchestrate/observability.
- Event enrichment: Should the existing Event.Data map be formalized with typed keys/values, or should it remain map[string]any for flexibility?
- Sampling: For high-throughput workflows, should observers support event sampling to reduce overhead?
Expected Outcome
Document a decision for each question to guide implementation of the feature issues.