chore(agent-data-plane): update to tokio 1.52.1 and re-enable eager driver handoff #1411
Binary Size Analysis (Agent Data Plane)
Target: c4787c3 (baseline) vs f504e3a (comparison)
| Module | File Size | Symbols |
|---|---|---|
| figment | -112.85 KiB | 657 |
| core | +22.49 KiB | 17284 |
| piecemeal | +13.16 KiB | 42 |
| anon.9939619bfe8bfb192b56a1b7864558c4.15.llvm.12248130204864774867 | -12.09 KiB | 1 |
| anon.3b479cc647c83ba29012d8c7926b85aa.15.llvm.3139051186277400116 | +12.09 KiB | 1 |
| http_body_util | -11.89 KiB | 237 |
| saluki_common::task::instrument | -9.54 KiB | 36 |
| datadog_protos::trace_piecemeal_include::datadog | -8.47 KiB | 20 |
| [sections] | -7.96 KiB | 7 |
| memory_accounting::allocator::Tracked | +7.19 KiB | 77 |
| tokio | +6.74 KiB | 5897 |
| saluki_components::common::datadog | +5.86 KiB | 352 |
| http | +4.32 KiB | 643 |
| agent_data_plane::components::tag_filterlist | -4.23 KiB | 46 |
| anon.9939619bfe8bfb192b56a1b7864558c4.14.llvm.12248130204864774867 | -4.10 KiB | 1 |
| anon.3b479cc647c83ba29012d8c7926b85aa.14.llvm.3139051186277400116 | +4.10 KiB | 1 |
| bytes | +3.88 KiB | 97 |
| saluki_io::net::util | +3.77 KiB | 155 |
| serde_json | +3.74 KiB | 238 |
| tower | -3.66 KiB | 450 |
Detailed Symbol Changes
FILE SIZE VM SIZE
-------------- --------------
[NEW] +161Ki [NEW] +161Ki agent_data_plane::cli::run::handle_run_command::_{{closure}}::hc4c50a51e079975e
[NEW] +65.2Ki [NEW] +65.0Ki saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::hbcfe0e1b546e6f87
[NEW] +63.8Ki [NEW] +63.6Ki agent_data_plane::run_inner::_{{closure}}::h7b15982b493fc84c
[NEW] +57.9Ki [NEW] +57.7Ki saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::h63a4a43c250569b3
[NEW] +57.7Ki [NEW] +57.5Ki agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::h183970306d93bed9
[NEW] +48.2Ki [NEW] +47.9Ki _<saluki_components::transforms::apm_stats::ApmStats as saluki_core::components::transforms::Transform>::run::_{{closure}}::h14dba9cde45cd586
[NEW] +44.0Ki [NEW] +43.9Ki _<figment::value::de::ConfiguredValueDe<I> as serde_core::de::Deserializer>::deserialize_struct::h7355bcfc505155e9
[NEW] +41.8Ki [NEW] +41.6Ki saluki_config::ConfigurationLoader::with_default_secrets_resolution::_{{closure}}::h654dc887d6638e21
[NEW] +39.3Ki [NEW] +39.2Ki saluki_env::workload::providers::remote_agent::build_collector::_{{closure}}::hd014a85cff2a5c71
[NEW] +38.6Ki [NEW] +38.4Ki saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::hea93225365fe4fc6
[DEL] -39.0Ki [DEL] -38.8Ki saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::h5434ef7a955f5c5e
[DEL] -39.7Ki [DEL] -39.5Ki saluki_env::workload::providers::remote_agent::build_collector::_{{closure}}::h28ff2d07ad18c51b
[DEL] -41.7Ki [DEL] -41.5Ki saluki_config::ConfigurationLoader::with_default_secrets_resolution::_{{closure}}::h84d68ef3446e85bb
[DEL] -44.0Ki [DEL] -43.9Ki _<figment::value::de::ConfiguredValueDe<I> as serde_core::de::Deserializer>::deserialize_struct::h445c89aa25f43822
[DEL] -48.1Ki [DEL] -47.9Ki _<saluki_components::transforms::apm_stats::ApmStats as saluki_core::components::transforms::Transform>::run::_{{closure}}::h3ea5d2d9c7bad8a2
[DEL] -57.7Ki [DEL] -57.5Ki agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::h4c208bdbbcd89e05
[DEL] -57.9Ki [DEL] -57.7Ki saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::h922a0f6fad9ed58b
[DEL] -64.5Ki [DEL] -64.3Ki agent_data_plane::run_inner::_{{closure}}::h3e37159b0c10a48e
[DEL] -65.2Ki [DEL] -65.0Ki saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::h49f3d4c7a4032ff0
-0.4% -89.4Ki -0.4% -80.4Ki [57344 Others]
[DEL] -162Ki [DEL] -162Ki agent_data_plane::cli::run::handle_run_command::_{{closure}}::h967bbbe4648074fc
-0.2% -91.8Ki -0.3% -82.8Ki TOTAL
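Many rows in the symbol diff appear once as [NEW] and once as [DEL] with the same path and a different trailing hash, which suggests the function was recompiled rather than genuinely added or removed. The following is a minimal sketch, not part of the PR, of how such pairs could be netted out. The helper names `strip_hash` and `net_deltas` are hypothetical, and it assumes the trailing `::h` plus 16 hex digits is the Rust legacy-mangling hash. The input sizes are the File Size values from three of the rows above.

```rust
use std::collections::BTreeMap;

/// Strip a trailing hash suffix such as `::h967bbbe4648074fc`.
/// Assumption: the suffix is `::h` followed by exactly 16 hex digits.
fn strip_hash(symbol: &str) -> &str {
    if let Some(idx) = symbol.rfind("::h") {
        let hash = &symbol[idx + 3..];
        if hash.len() == 16 && hash.bytes().all(|b| b.is_ascii_hexdigit()) {
            return &symbol[..idx];
        }
    }
    symbol
}

/// Sum size deltas (in KiB) per symbol path, ignoring the hash suffix.
fn net_deltas<'a>(rows: &[(&'a str, f64)]) -> BTreeMap<&'a str, f64> {
    let mut net = BTreeMap::new();
    for &(symbol, delta_kib) in rows {
        *net.entry(strip_hash(symbol)).or_insert(0.0) += delta_kib;
    }
    net
}

fn main() {
    // File Size deltas copied from the symbol diff above.
    let rows = [
        ("agent_data_plane::cli::run::handle_run_command::_{{closure}}::hc4c50a51e079975e", 161.0),
        ("agent_data_plane::cli::run::handle_run_command::_{{closure}}::h967bbbe4648074fc", -162.0),
        ("agent_data_plane::run_inner::_{{closure}}::h7b15982b493fc84c", 63.8),
        ("agent_data_plane::run_inner::_{{closure}}::h3e37159b0c10a48e", -64.5),
        ("saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::hbcfe0e1b546e6f87", 65.2),
        ("saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::h49f3d4c7a4032ff0", -65.2),
    ];
    for (name, delta) in &net_deltas(&rows) {
        println!("{name}: {delta:+.1} KiB");
    }
}
```

Netting by path makes it easier to see that the large [NEW] and [DEL] entries mostly cancel, which is consistent with the small TOTAL line.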
Regression Detector (Agent Data Plane)
Regression Detector Results
Run ID: bfc98bb3-d149-4ed6-9e1c-f6072b0c365e
Baseline: c4787c3
Optimization Goals: ✅ No significant changes detected
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ❌ | otlp_ingest_logs_5mb_memory | memory utilization | +9.44 | [+8.90, +9.97] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_cpu | % cpu utilization | +1.49 | [-3.30, +6.28] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_throughput | ingress throughput | +0.02 | [-0.11, +0.14] | 1 | (metrics) (profiles) (logs) |
Fine details of change detection per experiment
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ❌ | otlp_ingest_logs_5mb_memory | memory utilization | +9.44 | [+8.90, +9.97] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_10mb_3k_contexts_cpu | % cpu utilization | +7.06 | [-25.61, +39.74] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_metrics_5mb_memory | memory utilization | +2.50 | [+2.33, +2.66] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_500mb_3k_contexts_throughput | ingress throughput | +1.84 | [+1.72, +1.96] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_cpu | % cpu utilization | +1.49 | [-3.30, +6.28] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_5mb_cpu | % cpu utilization | +1.08 | [-1.12, +3.29] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_metrics_5mb_cpu | % cpu utilization | +0.66 | [-6.49, +7.81] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_idle | memory utilization | +0.47 | [+0.44, +0.51] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_1mb_3k_contexts_memory | memory utilization | +0.42 | [+0.28, +0.57] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_5mb_memory | memory utilization | +0.31 | [+0.15, +0.48] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_10mb_3k_contexts_memory | memory utilization | +0.28 | [+0.13, +0.43] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_medium | memory utilization | +0.25 | [+0.08, +0.42] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_500mb_3k_contexts_memory | memory utilization | +0.22 | [+0.07, +0.37] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_500mb_3k_contexts_cpu | % cpu utilization | +0.13 | [-1.39, +1.64] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_filtering_5mb_memory | memory utilization | +0.10 | [-0.15, +0.35] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_5mb_throughput | ingress throughput | +0.08 | [+0.01, +0.15] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_512kb_3k_contexts_memory | memory utilization | +0.07 | [-0.08, +0.22] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_transform_5mb_throughput | ingress throughput | +0.07 | [-0.00, +0.14] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_logs_5mb_throughput | ingress throughput | +0.02 | [-0.11, +0.14] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_transform_5mb_cpu | % cpu utilization | +0.01 | [-1.96, +1.98] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_10mb_3k_contexts_throughput | ingress throughput | +0.00 | [-0.14, +0.15] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_1mb_3k_contexts_throughput | ingress throughput | +0.00 | [-0.06, +0.06] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_512kb_3k_contexts_throughput | ingress throughput | -0.00 | [-0.05, +0.05] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_metrics_5mb_throughput | ingress throughput | -0.00 | [-0.19, +0.19] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_100mb_3k_contexts_throughput | ingress throughput | -0.01 | [-0.05, +0.02] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_low | memory utilization | -0.01 | [-0.18, +0.15] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_filtering_5mb_throughput | ingress throughput | -0.06 | [-0.13, +0.01] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_transform_5mb_memory | memory utilization | -0.11 | [-0.27, +0.06] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_100mb_3k_contexts_memory | memory utilization | -0.19 | [-0.34, -0.04] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_ultraheavy | memory utilization | -0.29 | [-0.41, -0.17] | 1 | (metrics) (profiles) (logs) |
| ➖ | quality_gates_rss_dsd_heavy | memory utilization | -0.55 | [-0.68, -0.41] | 1 | (metrics) (profiles) (logs) |
| ➖ | otlp_ingest_traces_ottl_filtering_5mb_cpu | % cpu utilization | -1.17 | [-3.58, +1.24] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_512kb_3k_contexts_cpu | % cpu utilization | -1.24 | [-58.70, +56.22] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_100mb_3k_contexts_cpu | % cpu utilization | -1.54 | [-7.51, +4.42] | 1 | (metrics) (profiles) (logs) |
| ➖ | dsd_uds_1mb_3k_contexts_cpu | % cpu utilization | -2.45 | [-55.58, +50.68] | 1 | (metrics) (profiles) (logs) |
Bounds Checks: ✅ Passed
| perf | experiment | bounds_check_name | replicates_passed | observed_value | links |
|---|---|---|---|---|---|
| ✅ | quality_gates_rss_dsd_heavy | memory_usage | 10/10 | 117.91MiB ≤ 140MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_dsd_low | memory_usage | 10/10 | 39.64MiB ≤ 50MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_dsd_medium | memory_usage | 10/10 | 61.47MiB ≤ 75MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | 174.55MiB ≤ 200MiB | (metrics) (profiles) (logs) |
| ✅ | quality_gates_rss_idle | memory_usage | 10/10 | 26.92MiB ≤ 40MiB | (metrics) (profiles) (logs) |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we flag a change in performance as a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
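The three criteria above can be sketched as a single predicate. This is an illustrative reading of the explanation, not the regression detector's actual code. The `Experiment` struct and its field names are hypothetical, and the `erratic` values are assumptions because the tables do not show that flag.

```rust
/// One row of the regression detector table (field names are illustrative).
#[derive(Debug, Clone, Copy)]
struct Experiment {
    delta_mean_pct: f64,
    ci_low: f64,
    ci_high: f64,
    erratic: bool,
}

/// Effect size tolerance quoted in the explanation: |Δ mean %| ≥ 5.00%.
const EFFECT_SIZE_TOLERANCE_PCT: f64 = 5.0;

/// True only when all three criteria hold.
fn is_regression(e: &Experiment) -> bool {
    let big_enough = e.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE_PCT;
    let ci_excludes_zero = e.ci_low > 0.0 || e.ci_high < 0.0;
    big_enough && ci_excludes_zero && !e.erratic
}

fn main() {
    // Values copied from the results table; `erratic: false` is assumed.
    let rows = [
        (
            "otlp_ingest_logs_5mb_memory",
            Experiment { delta_mean_pct: 9.44, ci_low: 8.90, ci_high: 9.97, erratic: false },
        ),
        (
            "dsd_uds_10mb_3k_contexts_cpu",
            Experiment { delta_mean_pct: 7.06, ci_low: -25.61, ci_high: 39.74, erratic: false },
        ),
        (
            "otlp_ingest_metrics_5mb_memory",
            Experiment { delta_mean_pct: 2.50, ci_low: 2.33, ci_high: 2.66, erratic: false },
        ),
    ];
    for (name, e) in &rows {
        println!("{name}: regression = {}", is_regression(e));
    }
}
```

The second row shows why a large mean alone is not enough: its interval spans zero. The third row has a tight interval but falls under the 5.00% tolerance.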
…ges) Version bump only — no changes to runtime builder configuration. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
d48114f to 27b19e2
…river handoff) Enables enable_alt_timer() on all runtime builders but does not enable enable_eager_driver_handoff(). Isolates alt timer as a variable for CI. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…(no alt timer) Enables enable_eager_driver_handoff() on all runtime builders but does not enable enable_alt_timer(). Isolates eager driver handoff as a variable for CI. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…config changes 1.52.x shows regressions across all configurations. Trying 1.51.1 (the current LTS line, supported until Mar 2027) with no runtime builder changes to isolate whether the issue is in the version itself. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…est) Keeps tokio at 1.50.0 but bumps mio from 1.1.1 to 1.2.0 (the version pulled in by tokio 1.51). If the regression appears here, mio 1.2.0 is the culprit. If clean, the regression is in tokio 1.51's scheduler changes (specifically #7431: steal tasks from the LIFO slot). Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…stack Adds a bench group that spins up a real tonic gRPC server on a loopback TCP socket and drives it with a client, exercising the full path:
epoll -> TCP -> HTTP/2 (h2) -> tonic -> re-encode/re-decode -> MPSC -> translate
This mirrors the actual otlp_ingest_traces SMP test more closely than the in-memory async_pipeline benchmark, and will show differences caused by mio/tokio I/O driver changes between versions.
Two variants:
- sequential: one request at a time (closest to Lading single-connection model)
- concurrent: 100 requests multiplexed over the same HTTP/2 connection
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Adds test/bench/ infrastructure for measuring ADP's OTLP trace ingest throughput locally using millstone against a live ADP binary:
- test/bench/otlp-ingest.sh: main benchmark script
- test/bench/adp-otlp.yaml: minimal ADP config (OTLP, standalone mode)
- test/bench/millstone-otlp.yaml: millstone config matching the SMP test corpus
- test/bench/intake-blackhole.py: fake HTTP intake to absorb forwarded traces
Also adds Makefile targets:
- make bench-otlp-ingest
- make bench-otlp-ingest-compare TOKIO_A=X TOKIO_B=Y
Reproduces a consistent throughput regression on tokio 1.51 vs 1.50 (small locally, amplifies on Linux CI with more cores due to LIFO slot stealing). Uses --profile release to match the SMP regression test build profile.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…mpare macOS requires `sed -i ''`; Linux requires `sed -i` without an empty string argument. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…l -f pkill -f <pattern> self-matches the shell process running the command (because <pattern> appears in the shell's argv), killing the SSH session. Use fuser -k <port>/tcp instead, which kills by port binding and is safe. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Single command to reproduce the tokio 1.51 OTLP ingest throughput regression from PR #7431 (steal tasks from the LIFO slot).
make bench-tokio-regression
Builds ADP and millstone, runs 3 warmup + 10 measured millstone rounds against each tokio version, and prints a clean throughput comparison with regression percentage.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
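The comparison described in this commit (discard warmup rounds, average the measured rounds, report a percentage) can be sketched as follows. This is an assumption about the arithmetic, not the script's actual implementation. The function names are hypothetical and the throughput figures are made-up placeholders, not measurements from this PR.

```rust
/// Arithmetic mean, or None for an empty slice.
fn mean(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        None
    } else {
        Some(xs.iter().sum::<f64>() / xs.len() as f64)
    }
}

/// Drop the first `warmup` rounds and average the rest.
/// Returns None if no measured rounds remain.
fn measured_mean(rounds: &[f64], warmup: usize) -> Option<f64> {
    mean(rounds.get(warmup..)?)
}

/// Relative change of `comparison` against `baseline`, in percent.
/// Negative values mean the comparison variant is slower.
fn percent_change(baseline: f64, comparison: f64) -> f64 {
    (comparison - baseline) / baseline * 100.0
}

fn main() {
    const WARMUP_ROUNDS: usize = 3;

    // Placeholder throughput per round (arbitrary units): 3 warmup + 10 measured.
    let mut tokio_a = vec![80.0, 90.0, 95.0];
    tokio_a.extend([100.0; 10]);
    let mut tokio_b = vec![70.0, 85.0, 90.0];
    tokio_b.extend([97.0; 10]);

    let a = measured_mean(&tokio_a, WARMUP_ROUNDS).expect("no measured rounds for A");
    let b = measured_mean(&tokio_b, WARMUP_ROUNDS).expect("no measured rounds for B");
    println!("A mean: {a:.2}, B mean: {b:.2}, change: {:+.2}%", percent_change(a, b));
}
```

Excluding warmup rounds keeps startup effects such as cold caches and connection setup out of the mean, so the reported percentage reflects steady-state throughput.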
Summary
`enable_eager_driver_handoff()` and `enable_alt_timer()` in all three runtime builder sites
Test plan
🤖 Generated with Claude Code