feat(hotspot): Add hotspot-bpf to retina shell #2209

Draft
SRodi wants to merge 5 commits into microsoft:main from SRodi:srodi/hotspot-bpf

@SRodi SRodi commented Apr 20, 2026

Description

This PR adds hotspot-bpf to the retina shell image, making it available as an interactive diagnostic tool via kubectl retina shell.

hotspot-bpf is an eBPF-based performance triage tool that correlates CPU time, scheduler contention, page-fault pressure, and RSS growth in a single terminal view. Instead of showing raw numbers, it automatically classifies processes into actionable diagnoses: OOM risk, CPU-bound, mem-thrashing, starved, noisy neighbor, or OK.

What it detects

| Diagnosis | Meaning |
|---|---|
| OOM risk | RSS growing monotonically + high page-fault rate |
| CPU-bound | Saturating a CPU core with no memory pressure |
| Mem-thrashing | Costly page faults or high fault volume with low CPU |
| Starved | Frequently preempted, getting little CPU |
| Noisy neighbor | Preempting others while consuming significant CPU |
| OK | No anomaly detected |
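
For reviewers unfamiliar with the tool, the table can be read as an ordered set of rules. The sketch below is illustrative only: the `classify` function, its five inputs, the numeric thresholds, and the rule order are all assumptions made for this example, not hotspot-bpf's actual heuristics.

```shell
# Illustrative only: thresholds and rule order are hypothetical and are
# NOT taken from hotspot-bpf.
# Usage: classify CPU_PCT FAULTS_PER_SEC RSS_GROWING(0|1) PREEMPTED_PER_SEC PREEMPTING_PER_SEC
classify() {
  cpu=$1 faults=$2 rss_growing=$3 preempted=$4 preempting=$5

  if [ "$rss_growing" -eq 1 ] && [ "$faults" -ge 1000 ]; then
    echo "OOM risk"          # RSS keeps growing and fault rate is high
  elif [ "$preempting" -ge 100 ] && [ "$cpu" -ge 50 ]; then
    echo "Noisy neighbor"    # preempts others while using significant CPU
  elif [ "$cpu" -ge 90 ] && [ "$faults" -lt 100 ]; then
    echo "CPU-bound"         # saturates a core, no memory pressure
  elif [ "$faults" -ge 1000 ] && [ "$cpu" -lt 20 ]; then
    echo "Mem-thrashing"     # high fault volume with low CPU
  elif [ "$preempted" -ge 100 ] && [ "$cpu" -lt 20 ]; then
    echo "Starved"           # frequently preempted, little CPU
  else
    echo "OK"
  fi
}
```

For example, `classify 95 10 0 0 0` falls through to the CPU-bound rule, while a process with low CPU and a high fault rate but stable RSS lands on Mem-thrashing.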

Why add this to retina shell?

  • Complements existing tools: fills the gap between top/htop (no root-cause labels) and perf (requires deep expertise) by providing instant triage
  • Zero runtime dependencies: single statically linked Go binary with embedded CO-RE eBPF programs, so no kernel headers are needed at runtime
  • Works across kernel versions: uses BPF CO-RE (Compile Once, Run Everywhere) with BTF relocations, compatible with any kernel ≥ 5.5 with BTF enabled
  • Follows existing patterns: installed the same way as pwru, i.e. downloaded from GitHub releases and placed in /usr/local/bin

Changes

  • shell/Dockerfile: Added hotspot-bpf v0.1.0 binary download and installation (amd64 only, matching current release availability)
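
Because the release is amd64 only, a build of the shell image for any other architecture would need to skip the download rather than fail or install an unusable binary. A minimal sketch of such a guard follows; the helper name `hotspot_supported_arch` is hypothetical, and it assumes the build step can see the target architecture (for example via Docker BuildKit's `TARGETARCH` build argument). It is not the code in this PR's Dockerfile.

```shell
# Hypothetical helper: succeeds only for architectures that the
# hotspot-bpf v0.1.0 release covers (amd64 / x86_64, per this PR).
hotspot_supported_arch() {
  case "$1" in
    amd64|x86_64) return 0 ;;
    *)            return 1 ;;
  esac
}
```

In a RUN step this could wrap the download, e.g. `if hotspot_supported_arch "$TARGETARCH"; then ...; fi`, so other architectures still build without the binary.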

Usage

# Start a retina shell on a node
kubectl retina shell node001

# Run hotspot inside the shell
hotspot -interval 5s -topk 5

# Filter by cgroup (e.g. a specific container)
hotspot -interval 5s -cgroup-filter my-container

Testing

# Build the shell image locally
docker build -t retina-shell:dev -f shell/Dockerfile .

# Verify binary is installed
docker run --rm retina-shell:dev ls -la /usr/local/bin/hotspot

# Test with eBPF (requires privileged + BTF-enabled kernel)
docker run --rm -it --privileged --pid=host \
  -v /sys/kernel:/sys/kernel:ro \
  retina-shell:dev hotspot -interval 5s -topk 5

Requirements

  • Host kernel ≥ 5.5 with BTF (/sys/kernel/btf/vmlinux must exist)
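  • Privileged container, or CAP_BPF + CAP_PERFMON (already provided by retina shell); these two capabilities exist only on kernel ≥ 5.8, so kernels 5.5 to 5.7 need a privileged container or CAP_SYS_ADMIN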
  • x86_64 architecture (arm64 support not yet available in hotspot-bpf)
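
The kernel, BTF, and architecture requirements can be checked on a node before launching the tool. The following is a rough preflight sketch, not part of this PR: `kernel_at_least` and `preflight` are hypothetical helper names, and the script assumes a POSIX shell with `uname` available.

```shell
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds if RELEASE (e.g. the output of `uname -r`) is >= MAJOR.MINOR.
kernel_at_least() {
  release=$1 want_major=$2 want_minor=$3
  major=${release%%.*}          # "5.15.0-1052-azure" -> "5"
  rest=${release#*.}            # -> "15.0-1052-azure"
  minor=${rest%%[!0-9]*}        # -> "15"
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# preflight [BTF_PATH]
# Reports every unmet requirement on stderr; returns non-zero if any failed.
preflight() {
  btf_path=${1:-/sys/kernel/btf/vmlinux}
  failed=0
  kernel_at_least "$(uname -r)" 5 5 ||
    { echo "kernel $(uname -r) is older than 5.5" >&2; failed=1; }
  [ -e "$btf_path" ] ||
    { echo "missing $btf_path (kernel built without BTF?)" >&2; failed=1; }
  case "$(uname -m)" in
    x86_64) ;;
    *) echo "unsupported architecture $(uname -m)" >&2; failed=1 ;;
  esac
  return "$failed"
}
```

Running `preflight && hotspot -interval 5s -topk 5` inside the shell would then start the tool only when all three checks pass. Capabilities are not checked here.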

Related Issue

n/a

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Screenshots (if applicable) or Testing Completed

[Screenshot attached to the original PR; image not reproduced here]


@SRodi SRodi self-assigned this Apr 20, 2026
@SRodi SRodi requested a review from a team as a code owner April 20, 2026 12:06
@SRodi SRodi requested review from matmerr and snguyen64 April 20, 2026 12:06
@SRodi SRodi marked this pull request as draft April 21, 2026 07:43
@SRodi SRodi force-pushed the srodi/hotspot-bpf branch from 1f9fbb4 to 3685fea Compare April 21, 2026 15:31