[Code Health] Unify profiling abstractions across perf, dump tensor, and PMU #641

@ChaoZheng109

Category

Technical Debt (cleanup, refactor)

Component

Other (please specify in description)

Description

This is a cross-cutting code health issue across SceneTest, common worker ABI, runtime structs, and platform diagnostics collectors.

The user-facing profiling capability already comprises three distinct features: perf swimlane export, tensor dump, and PMU. However, the front-end still uses "profiling" to mean perf only. --enable-profiling / enable_profiling drives perf snapshots, perf output directory handling, and swimlane conversion, while dump tensor and PMU are modeled as separate one-off flags. The terminology is therefore inconsistent: at the product level profiling is the umbrella concept, but in the current API/CLI it effectively means perf.

Perf-specific plumbing also leaks into generic runtime layers. Generic worker/runtime ABI and runtime structs carry perf-named fields such as enable_profiling, perf_data_base, perf_records_addr, and enable_profiling_flag. By contrast, dump tensor and PMU are closer to platform-owned collectors. This makes the boundary between common runtime and platform diagnostics inconsistent, and perf ends up polluting runtime internals.

In addition, perf, dump tensor, and PMU duplicate a large amount of lifecycle logic: config propagation, feature-flag publication, per-core/per-thread buffer allocation, AICPU init, host-side collection/export, artifact naming, and cleanup. These paths should be normalized behind a shared diagnostics/profiling abstraction instead of evolving as three parallel implementations.
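To make the duplicated lifecycle concrete, here is a minimal sketch of what a shared collector contract could look like. It is written in Python for brevity even though the real collectors live in C++ platform code. Every name in it (Collector, InMemoryCollector, run_with_collectors, the .bin artifact suffix) is hypothetical and does not exist in the repository; it only illustrates the init / collect / finalize shape that perf, dump tensor, and PMU currently each implement on their own.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, List, Sequence


class Collector(ABC):
    """Hypothetical lifecycle contract shared by perf, dump tensor, and PMU."""

    name: str

    @abstractmethod
    def init(self, num_cores: int) -> None:
        """Allocate per-core buffers."""

    @abstractmethod
    def collect(self) -> bytes:
        """Copy collected data back to the host side."""

    @abstractmethod
    def finalize(self) -> None:
        """Release buffers."""


class InMemoryCollector(Collector):
    """Toy collector that stands in for a real platform collector."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.buffers: List[bytearray] = []

    def init(self, num_cores: int) -> None:
        self.buffers = [bytearray() for _ in range(num_cores)]

    def collect(self) -> bytes:
        return b"".join(bytes(buf) for buf in self.buffers)

    def finalize(self) -> None:
        self.buffers = []


def run_with_collectors(
    collectors: Sequence[Collector],
    num_cores: int,
    out_dir: Path,
    workload: Callable[[], None],
) -> List[Path]:
    """Drive every collector through one shared lifecycle and naming policy."""
    started: List[Collector] = []
    artifacts: List[Path] = []
    try:
        for collector in collectors:
            collector.init(num_cores)
            started.append(collector)
        workload()
        for collector in started:
            # Single artifact naming policy: <out_dir>/<collector name>.bin
            path = out_dir / f"{collector.name}.bin"
            path.write_bytes(collector.collect())
            artifacts.append(path)
    finally:
        # Clean up in reverse order, even if the workload or export failed.
        for collector in reversed(started):
            collector.finalize()
    return artifacts
```

With a contract like this, config propagation, buffer allocation, export, and cleanup are written once in the driver, and each feature only supplies its own buffer layout and data format.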

Observed at commit 89003b5fccf9160bb35c48779c8d20e938aa70dc.

Related: #510

Location

  • simpler_setup/scene_test.py:657-691
  • simpler_setup/scene_test.py:859-867
  • simpler_setup/scene_test.py:1156-1166
  • simpler_setup/scene_test.py:1223-1225
  • simpler_setup/scene_test.py:1288-1297
  • simpler_setup/scene_test.py:1394-1397
  • src/common/task_interface/chip_call_config.h:21-26
  • src/common/worker/pto_runtime_c_api.h:75-98
  • src/common/worker/chip_worker.cpp:245-248
  • src/common/hierarchical/worker_manager.cpp:168-178
  • src/a5/runtime/host_build_graph/runtime/runtime.h:104-118
  • src/a5/runtime/host_build_graph/runtime/runtime.h:211-213
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/runtime.h:86-111
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/runtime.h:179-187
  • src/a5/platform/src/host/performance_collector.cpp:57-157
  • src/a5/platform/src/host/tensor_dump_collector.cpp:45-156
  • src/a5/platform/src/aicpu/performance_collector_aicpu.cpp:40-118
  • src/a5/platform/src/aicpu/performance_collector_aicpu.cpp:132-181
  • src/a5/platform/src/aicpu/tensor_dump_aicpu.cpp:36-57
  • src/a2a3/platform/sim/host/device_runner.cpp:312-376
  • src/a2a3/platform/onboard/host/device_runner.cpp:522-603
  • docs/testing.md:73-117
  • docs/task-flow.md:30-32
  • docs/task-flow.md:185-190
  • docs/profiling-name-map.md:132-163

Proposed Fix

  • Introduce a first-class umbrella config for diagnostics/profiling with explicit sub-features (perf, dump_tensor, pmu) instead of overloading enable_profiling to mean perf only.
  • At the CLI/API layer, make perf explicit. If backward compatibility is required, keep --enable-profiling / enable_profiling only as a compatibility alias to the perf sub-feature and document the deprecation path.
  • Move perf-specific state and memory layout ownership out of generic runtime naming. Generic runtime/common ABI should carry only feature-agnostic diagnostics hooks or flags; perf collector pointers and buffer layout should stay in platform diagnostics components, aligned with dump tensor and PMU.
  • Extract shared lifecycle logic across perf, dump tensor, and PMU into reusable helpers or components: feature flag encoding/publication, collector init/finalize contract, host/device buffer allocation and copy-back pattern, artifact naming policy, and SceneTest post-processing/export hooks.
  • Update docs so profiling is consistently the umbrella term and perf refers only to the swimlane/perf data path.
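
The first two bullets could look roughly like the sketch below. It assumes a hypothetical repeatable --profile option and hypothetical ProfilingConfig / ProfilingFeature types; none of these exist in the repository today. Only --enable-profiling is taken from the current CLI, and it is kept here purely as a deprecated alias for the perf sub-feature. The bit values are illustrative, not the actual encoding used by the runtime.

```python
import argparse
import warnings
from dataclasses import dataclass
from enum import IntFlag
from typing import List, Optional


class ProfilingFeature(IntFlag):
    """Hypothetical bitmask for publishing enabled sub-features."""

    NONE = 0
    PERF = 1
    DUMP_TENSOR = 2
    PMU = 4


@dataclass(frozen=True)
class ProfilingConfig:
    """Hypothetical umbrella config with explicit sub-features."""

    perf: bool = False
    dump_tensor: bool = False
    pmu: bool = False

    def to_flags(self) -> ProfilingFeature:
        flags = ProfilingFeature.NONE
        if self.perf:
            flags |= ProfilingFeature.PERF
        if self.dump_tensor:
            flags |= ProfilingFeature.DUMP_TENSOR
        if self.pmu:
            flags |= ProfilingFeature.PMU
        return flags


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Assumed new option: may be repeated, one sub-feature per use.
    parser.add_argument(
        "--profile",
        action="append",
        choices=["perf", "dump_tensor", "pmu"],
        help="enable a profiling sub-feature (repeatable)",
    )
    # Existing flag, kept only as a compatibility alias for perf.
    parser.add_argument(
        "--enable-profiling",
        action="store_true",
        help="deprecated alias for --profile perf",
    )
    return parser


def parse_profiling(argv: Optional[List[str]] = None) -> ProfilingConfig:
    args = build_parser().parse_args(argv)
    features = set(args.profile or [])
    if args.enable_profiling:
        warnings.warn(
            "--enable-profiling is deprecated; use --profile perf",
            DeprecationWarning,
            stacklevel=2,
        )
        features.add("perf")
    return ProfilingConfig(
        perf="perf" in features,
        dump_tensor="dump_tensor" in features,
        pmu="pmu" in features,
    )
```

For example, parse_profiling(["--profile", "pmu", "--enable-profiling"]) would enable both PMU and perf while warning about the deprecated flag. Generic runtime layers would then only need the combined flags value, not perf-specific fields.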

Priority

Medium (minor risk, should fix in next few releases)

Metadata

Assignees

Labels

code health: Technical debt, robustness, code quality

Type

No type

Projects

Status

In Progress
