HIVE-3127: Add benchmark harness for resource/remoteclient optimization #2877

dlom wants to merge 1 commit into openshift:master
Conversation
The clustersync controller and others that depend on pkg/resource (Helper) and pkg/remoteclient have been repeatedly flagged by downstream users for:

- Slow reconciliation times
- Memory leaks during large SyncSet applications
- Excessive API server load (unnecessary round trips and discovery calls)

These packages are legacy code with significant behavioral dependencies from downstream consumers. We cannot rewrite them, but targeted optimizations (50-100 LOC changes) in specific hot paths could significantly improve performance without altering behavior.

Each benchmark runs against a real kube-apiserver (via envtest) and reports custom metrics for round-trip count and byte transfer, in addition to standard allocation and wall-time measurements.

Some preliminary findings:

- CreateOrUpdate SteadyState allocates ~79MB/op for a no-op -- 88x more than Apply for the same 2 round trips. Something isn't being cached properly.
- NewHelper construction costs 78MB and 205K allocs -- the dominant cost in most benchmarks. Caching the OpenAPI schema and REST mapper across operations would cut this dramatically.
- First Apply per package pays 2 extra discovery round trips (5 vs 3) due to disk-cache cold start. Pre-warming or sharing the discovery cache would eliminate the penalty.
- Hibernation takes 1 full second despite only 13 round trips and 1.3MB allocs. This is wall-clock bound, likely connection/TLS overhead in the BuildKubeClient path.

Assisted-by: Claude Opus 4.6
@dlom: This pull request references HIVE-3127, which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dlom

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
📝 Walkthrough

This PR introduces comprehensive benchmarking infrastructure for Hive, including envtest setup utilities, benchmark orchestration harnesses, HTTP round-trip tracking, and benchmark test suites for the `pkg/resource` and `pkg/remoteclient` packages.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes. This review requires careful attention to the benchmark infrastructure design.

🚥 Pre-merge checks: ✅ 3 passed.
```go
// BuilderOption configures optional behavior on a Builder.
type BuilderOption func(*builderOptions)
```
NTR: This new BuilderOption (and associated machinery) is the only new code in production. I tried to make this as non-invasive as possible: all the call sites that construct new Builders are unchanged, and this code no-ops in prod.
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##           master    #2877      +/-   ##
==========================================
- Coverage   50.31%   49.80%    -0.51%
==========================================
  Files         280      288        +8
  Lines       34314    34666      +352
==========================================
+ Hits        17264    17266        +2
- Misses      15689    16037      +348
- Partials     1361     1363        +2
```
Actionable comments posted: 8
🧹 Nitpick comments (3)
test/benchutil/generators.go (1)
13-35: Consider de-duplicating payload/namespace defaults.
`GenerateConfigMap` and `GenerateSecret` repeat the same padding pattern and hardcoded namespace, which is a small maintainability hotspot.

♻️ Optional cleanup diff
```diff
 package benchutil

 import (
 	"strings"
@@
 	hivev1 "github.com/openshift/hive/apis/hive/v1"
 )

+const defaultNamespace = "default"
+
+func withPayload(base map[string]string, dataSize int) map[string]string {
+	if dataSize > 0 {
+		base["payload"] = strings.Repeat("x", dataSize)
+	}
+	return base
+}
+
 // GenerateConfigMap creates a ConfigMap with optional padding payload (0 = minimal).
 func GenerateConfigMap(name string, dataSize int) *corev1.ConfigMap {
-	data := map[string]string{"key": "value"}
-	if dataSize > 0 {
-		data["payload"] = strings.Repeat("x", dataSize)
-	}
+	data := withPayload(map[string]string{"key": "value"}, dataSize)
 	return &corev1.ConfigMap{
-		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: "default"},
+		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: defaultNamespace},
 		Data: data,
 	}
 }
@@
 func GenerateSecret(name string, dataSize int) *corev1.Secret {
-	data := map[string]string{"password": "secret"}
-	if dataSize > 0 {
-		data["payload"] = strings.Repeat("x", dataSize)
-	}
+	data := withPayload(map[string]string{"password": "secret"}, dataSize)
 	return &corev1.Secret{
-		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: "default"},
+		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: defaultNamespace},
 		StringData: data,
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/benchutil/generators.go` around lines 13-35, GenerateConfigMap and GenerateSecret duplicate the padding payload logic and hardcoded namespace; extract a small helper (e.g., buildPayloadMap(payloadSize int, base map[string]string) map[string]string) to return the base map plus optional "payload" and introduce a single defaultNamespace constant (e.g., defaultNamespace = "default") and/or a helper for ObjectMeta (e.g., newDefaultMeta(name string) metav1.ObjectMeta), then update GenerateConfigMap to call buildPayloadMap for Data and newDefaultMeta for ObjectMeta and update GenerateSecret to call the same buildPayloadMap for StringData and reuse newDefaultMeta; keep the function signatures the same.

test/benchutil/namespace.go (2)
35-37: Add a timeout to namespace creation calls.

At Line 35, `context.Background()` can hang benchmarks indefinitely if the API server stalls.

💡 Proposed fix
```diff
 import (
 	"context"
 	"fmt"
 	"sync"
 	"sync/atomic"
 	"testing"
+	"time"
@@
-	_, err := benchClientset.CoreV1().Namespaces().Create(context.Background(), &corev1.Namespace{
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+	_, err := benchClientset.CoreV1().Namespaces().Create(ctx, &corev1.Namespace{
 		ObjectMeta: metav1.ObjectMeta{Name: name},
 	}, metav1.CreateOptions{})
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/benchutil/namespace.go` around lines 35 - 37, Replace the use of context.Background() in the Namespace creation call (benchClientset.CoreV1().Namespaces().Create) with a cancellable context that has a deadline—use context.WithTimeout to create ctx, defer cancel(), and pass that ctx to Create so the call times out instead of hanging indefinitely; choose an appropriate timeout value for benchmarks and ensure cancel() is deferred immediately after creating the context to avoid leaks.
24-33: `sync.Once` ignores later `cfg` values.

At Line 26, the first config wins permanently; later calls with different `cfg` values will still use the original clientset.

💡 Guard against config mismatch
```diff
 var (
 	benchNSCounter  atomic.Int64
 	benchClientset  *kubernetes.Clientset
+	benchClientHost string
 	benchClientOnce sync.Once
 )
@@
 	benchClientOnce.Do(func() {
 		cs, err := kubernetes.NewForConfig(cfg)
 		if err != nil {
 			b.Fatalf("failed to create clientset: %v", err)
 		}
 		benchClientset = cs
+		benchClientHost = cfg.Host
 	})
+	if cfg.Host != benchClientHost {
+		b.Fatalf("BenchNamespace called with different config host: first=%s current=%s", benchClientHost, cfg.Host)
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/benchutil/namespace.go` around lines 24 - 33, BenchNamespace currently uses benchClientOnce so the first cfg wins forever; replace the sync.Once pattern with a small guarded init that allows different cfg values: add a package-level mutex (e.g., benchClientMu) and a stored lastCfg (e.g., benchClientCfg *rest.Config), then in BenchNamespace acquire the mutex, if benchClientset is nil or the new cfg differs from benchClientCfg (compare a stable identifier like cfg.Host or use reflect.DeepEqual), create a new kubernetes client via kubernetes.NewForConfig(cfg), assign benchClientset and benchClientCfg, and release the mutex; remove benchClientOnce usage so subsequent calls with different cfg will recreate the clientset.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@hack/setup-envtest.sh`:
- Line 17: The ENVTEST_PATH assignment is capturing stderr into the variable
(ENVTEST_PATH=$(setup-envtest use "$K8S_VERSION" -p path 2>&1)), which pollutes
the path used later for KUBEBUILDER_ASSETS; remove the redirection so only
stdout is captured (ENVTEST_PATH=$(setup-envtest use "$K8S_VERSION" -p path)).
Locate the ENVTEST_PATH assignment and the subsequent export of
KUBEBUILDER_ASSETS and ensure no stderr is merged into ENVTEST_PATH; if you need
error output, redirect errors separately or check the command exit status and
log errors to stderr rather than into ENVTEST_PATH.
In `@Makefile`:
- Around line 393-397: Add the missing .PHONY declarations for the Makefile
targets "clean" and "test" to avoid checkmake warnings; similar to the existing
.PHONY: setup-envtest line, update the .PHONY list (or add a new .PHONY line) to
include clean and test so the Makefile treats those targets as phony rather than
files.
In `@test/benchmark/controller_sim_benchmark_test.go`:
- Around line 278-317: The benchmark leaves created MachineSets behind, causing
MachineSetList to grow across iterations; modify the Reconcile function to clean
up the created resource after the iteration by deleting the MachineSet you
created (use the toCreate or workerName with
remoteClient.Delete(context.Background(), toCreate) or remoteClient.Delete by
object key), or alternatively ensure fully isolated state per iteration by
rebuilding a fresh namespace/state before each iteration (use
s.NewBuilder().Build per-iteration and tear down afterwards). Target the
Reconcile closure, the toCreate variable and workerName, and add explicit
cleanup (remoteClient.Delete) or per-iteration state teardown to avoid
accumulating MachineSets between runs.
- Around line 435-442: The benchmark BenchmarkControllerReconcileUnreachable
currently calls brc.NewBuilder().UsePrimaryAPIURL().Build() which performs
network I/O and will fail the entire benchmark via b.Fatalf when the endpoint is
unreachable; change the setup to simulate an unreachable API by configuring the
builder with an APIURLOverride to a non-routable address (or otherwise inject an
unreachable URL) and do not call b.Fatalf on the Build() error—instead treat the
connection error as expected (e.g., verify the error type/contains expected text
and continue to exercise the reconcile path or skip the build failure), or
alternatively separate client construction from reconciliation measurement so
the reconcile logic is benchmarked independently of Build() failures; update
BenchmarkControllerReconcileUnreachable and the brc.NewBuilder() usage
accordingly.
In `@test/benchutil/client.go`:
- Around line 61-73: The newClient function creates an httpClient for
apiutil.NewDynamicRESTMapper but does not pass that same client into client.New,
causing a second HTTP client to be created and splitting connection/TLS state;
update newClient so the client.Options passed to client.New includes HTTPClient:
httpClient (in addition to Scheme and Mapper) so the same *http.Client is reused
by apiutil.NewDynamicRESTMapper and client.New.
In `@test/benchutil/harness.go`:
- Around line 27-33: The benchmark misreports FirstApply when SteadyState is
true but NewObjects returns a fixed set (so iterations after 0 are already
steady-state); update run (the function that orchestrates benchmark phases) to
ensure true "first-apply" behavior by either requiring per-iteration unique
objects or by recreating the environment and object slice for each iteration
during the first-apply phase: detect when SteadyState is true and NewObjects
returns a fixed-length slice (i.e., NewObjects does not depend on i/unique
naming), and in that case invoke Setup and NewObjects inside the b.N loop (or
otherwise reset the BenchEnv and re-create objects each iteration) before
calling Reconcile(b, state, objects, i); use the existing symbols SteadyState,
NewObjects, Setup, Reconcile and run to locate and implement this change so
FirstApply truly applies fresh objects for every iteration.
In `@test/benchutil/namespace.go`:
- Around line 26-31: Replace the abrupt os.Exit(1) in the benchClientOnce
initialization with a benchmark-friendly failure: when
kubernetes.NewForConfig(cfg) returns an error, call the benchmark's b.Fatalf (or
pass the testing.B into the setup) instead of fmt.Fprintf/os.Exit so the
benchmark harness reports the failure correctly; update the initialization flow
around benchClientOnce and the NewForConfig error handling to propagate or call
b.Fatalf in the error branch.
---
Nitpick comments:
In `@test/benchutil/generators.go`:
- Around line 13-35: GenerateConfigMap and GenerateSecret duplicate the padding
payload logic and hardcoded namespace; extract a small helper (e.g.,
buildPayloadMap(payloadSize int, base map[string]string) map[string]string) to
return the base map plus optional "payload" and introduce a single
defaultNamespace constant (e.g., defaultNamespace = "default") and/or a helper
for ObjectMeta (e.g., newDefaultMeta(name string) metav1.ObjectMeta), then
update GenerateConfigMap to call buildPayloadMap for Data and newDefaultMeta for
ObjectMeta and update GenerateSecret to call the same buildPayloadMap for
StringData and reuse newDefaultMeta; keep function signatures the same.
In `@test/benchutil/namespace.go`:
- Around line 35-37: Replace the use of context.Background() in the Namespace
creation call (benchClientset.CoreV1().Namespaces().Create) with a cancellable
context that has a deadline—use context.WithTimeout to create ctx, defer
cancel(), and pass that ctx to Create so the call times out instead of hanging
indefinitely; choose an appropriate timeout value for benchmarks and ensure
cancel() is deferred immediately after creating the context to avoid leaks.
- Around line 24-33: BenchNamespace currently uses benchClientOnce so the first
cfg wins forever; replace the sync.Once pattern with a small guarded init that
allows different cfg values: add a package-level mutex (e.g., benchClientMu) and
a stored lastCfg (e.g., benchClientCfg *rest.Config), then in BenchNamespace
acquire the mutex, if benchClientset is nil or the new cfg differs from
benchClientCfg (compare a stable identifier like cfg.Host or use
reflect.DeepEqual), create a new kubernetes client via
kubernetes.NewForConfig(cfg), assign benchClientset and benchClientCfg, and
release the mutex; remove benchClientOnce usage so subsequent calls with
different cfg will recreate the clientset.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f92bb83f-09df-47df-9e6b-5b69f051a7bb
📒 Files selected for processing (60)
- Makefile
- go.mod
- hack/benchmark-comparison.sh
- hack/setup-envtest.sh
- pkg/dependencymagnet/doc.go
- pkg/remoteclient/benchmark_test.go
- pkg/remoteclient/kubeconfig.go
- pkg/remoteclient/remoteclient.go
- pkg/remoteclient/remoteclient_benchmark_test.go
- pkg/resource/benchmark_test.go
- pkg/resource/helper_benchmark_test.go
- test/benchmark/README.md
- test/benchmark/benchmark_test.go
- test/benchmark/controller_sim_benchmark_test.go
- test/benchutil/benchenv.go
- test/benchutil/client.go
- test/benchutil/doc.go
- test/benchutil/envtest.go
- test/benchutil/generators.go
- test/benchutil/harness.go
- test/benchutil/helpers.go
- test/benchutil/namespace.go
- test/benchutil/roundtrip.go
- vendor/modules.txt
- vendor/sigs.k8s.io/controller-runtime/pkg/envtest/crd.go
- vendor/sigs.k8s.io/controller-runtime/pkg/envtest/doc.go
- vendor/sigs.k8s.io/controller-runtime/pkg/envtest/helper.go
- vendor/sigs.k8s.io/controller-runtime/pkg/envtest/server.go
- vendor/sigs.k8s.io/controller-runtime/pkg/envtest/webhook.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/flock/doc.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/flock/errors.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/flock/flock_other.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/flock/flock_unix.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/addr/manager.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/certs/tinyca.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/controlplane/apiserver.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/controlplane/auth.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/controlplane/etcd.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/controlplane/kubectl.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/controlplane/plane.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/process/arguments.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/process/bin_path_finder.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/process/procattr_other.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/process/procattr_unix.go
- vendor/sigs.k8s.io/controller-runtime/pkg/internal/testing/process/process.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/LICENSE
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/README.md
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/env/env.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/env/exit.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/env/helpers.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/main.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/remote/client.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/remote/http_client.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/remote/read_body.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/store/helpers.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/store/store.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/versions/parse.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/versions/platform.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/versions/version.go
- vendor/sigs.k8s.io/controller-runtime/tools/setup-envtest/workflows/workflows.go
```sh
echo "using k8s version ${K8S_VERSION}"

echo "Fetching binaries (on first run this may take some time)"
ENVTEST_PATH=$(setup-envtest use "$K8S_VERSION" -p path 2>&1)
```
🧩 Analysis chain

🏁 Scripts executed:

```sh
# Find and read the target file
find . -name "setup-envtest.sh" -type f

# Read the file to see context around line 17
cat -n hack/setup-envtest.sh | head -40

# Check how ENVTEST_PATH is used in the file
grep -n "ENVTEST_PATH" hack/setup-envtest.sh

# Check for KUBEBUILDER_ASSETS references
grep -n "KUBEBUILDER_ASSETS" hack/setup-envtest.sh
```
Remove 2>&1 from ENVTEST_PATH capture to avoid stderr pollution.
At Line 17, 2>&1 captures both stderr and stdout into the variable. Any progress or error output from setup-envtest will be mixed into ENVTEST_PATH, yielding an invalid path when exported as KUBEBUILDER_ASSETS on Line 21.
💡 Proposed fix

```diff
-ENVTEST_PATH=$(setup-envtest use "$K8S_VERSION" -p path 2>&1)
+ENVTEST_PATH="$(setup-envtest use "$K8S_VERSION" -p path)"
+if [[ ! -d "$ENVTEST_PATH" ]]; then
+  echo "setup-envtest returned a non-directory path: $ENVTEST_PATH" >&2
+  exit 1
+fi
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-ENVTEST_PATH=$(setup-envtest use "$K8S_VERSION" -p path 2>&1)
+ENVTEST_PATH="$(setup-envtest use "$K8S_VERSION" -p path)"
+if [[ ! -d "$ENVTEST_PATH" ]]; then
+  echo "setup-envtest returned a non-directory path: $ENVTEST_PATH" >&2
+  exit 1
+fi
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@hack/setup-envtest.sh` at line 17, The ENVTEST_PATH assignment is capturing
stderr into the variable (ENVTEST_PATH=$(setup-envtest use "$K8S_VERSION" -p
path 2>&1)), which pollutes the path used later for KUBEBUILDER_ASSETS; remove
the redirection so only stdout is captured (ENVTEST_PATH=$(setup-envtest use
"$K8S_VERSION" -p path)). Locate the ENVTEST_PATH assignment and the subsequent
export of KUBEBUILDER_ASSETS and ensure no stderr is merged into ENVTEST_PATH;
if you need error output, redirect errors separately or check the command exit
status and log errors to stderr rather than into ENVTEST_PATH.
```make
.PHONY: setup-envtest
# Downloads envtest binaries and prints the required export KUBEBUILDER_ASSETS=... command.
# Run the printed export command (or add it to your shell rc) before running benchmarks.
setup-envtest:
	hack/setup-envtest.sh
```
Add missing .PHONY declarations for clean and test.

At Line 393, static analysis reports missing required phony targets, which can cause avoidable checkmake warnings.
🔧 Suggested fix

```diff
-.PHONY: all
+.PHONY: all clean test
```

🧰 Tools
🪛 checkmake (0.2.2)
[warning] 393-393: Missing required phony target "clean"
(minphony)
[warning] 393-393: Missing required phony target "test"
(minphony)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Makefile` around lines 393 - 397, Add the missing .PHONY declarations for the
Makefile targets "clean" and "test" to avoid checkmake warnings; similar to the
existing .PHONY: setup-envtest line, update the .PHONY list (or add a new .PHONY
line) to include clean and test so the Makefile treats those targets as phony
rather than files.
```go
Reconcile: func(b *testing.B, s *firstApplyState, _ []client.Object, i int) {
	remoteClient, err := s.NewBuilder().Build()
	if err != nil {
		b.Fatalf("Build failed: %v", err)
	}
	workerName := fmt.Sprintf("worker-%d", i)

	ms := &machinev1beta1.MachineSet{}
	if err := remoteClient.Get(context.Background(), client.ObjectKey{
		Namespace: s.ns, Name: workerName,
	}, ms); err != nil && !apierrors.IsNotFound(err) {
		b.Fatalf("Get MachineSet failed: %v", err)
	}

	machineList := &machinev1beta1.MachineList{}
	if err := remoteClient.List(context.Background(), machineList, client.InNamespace(s.ns)); err != nil {
		b.Fatalf("List Machines failed: %v", err)
	}

	machineSetList := &machinev1beta1.MachineSetList{}
	if err := remoteClient.List(context.Background(), machineSetList, client.InNamespace(s.ns)); err != nil {
		b.Fatalf("List MachineSets failed: %v", err)
	}

	replicas := int32(3)
	toCreate := &machinev1beta1.MachineSet{
		ObjectMeta: metav1.ObjectMeta{
			Name:      workerName,
			Namespace: s.ns,
		},
		Spec: machinev1beta1.MachineSetSpec{
			Replicas: &replicas,
			Selector: metav1.LabelSelector{
				MatchLabels: map[string]string{"machine.openshift.io/cluster-api-machineset": workerName},
			},
		},
	}
	if err := remoteClient.Create(context.Background(), toCreate); err != nil {
		b.Fatalf("Create failed: %v", err)
	}
```
FirstApply keeps inflating the MachineSetList response.
Each iteration creates a new worker-%d MachineSet and leaves it behind, but the benchmark still lists all MachineSets in the namespace before the create. That makes later iterations more expensive than earlier ones, so the reported per-op numbers depend on b.N instead of a stable first-apply workload. Recreate isolated state per iteration, or remove the created MachineSet before the next loop.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/benchmark/controller_sim_benchmark_test.go` around lines 278 - 317, The
benchmark leaves created MachineSets behind, causing MachineSetList to grow
across iterations; modify the Reconcile function to clean up the created
resource after the iteration by deleting the MachineSet you created (use the
toCreate or workerName with remoteClient.Delete(context.Background(), toCreate)
or remoteClient.Delete by object key), or alternatively ensure fully isolated
state per iteration by rebuilding a fresh namespace/state before each iteration
(use s.NewBuilder().Build per-iteration and tear down afterwards). Target the
Reconcile closure, the toCreate variable and workerName, and add explicit
cleanup (remoteClient.Delete) or per-iteration state teardown to avoid
accumulating MachineSets between runs.
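The cleanup the comment asks for keeps per-iteration work constant. A toy stdlib-only sketch of the pattern (the `store` type and `reconcileOnce` function are illustrative stand-ins; in the real benchmark the delete would go through `remoteClient.Delete`):

```go
package main

import "fmt"

// store stands in for the remote namespace's MachineSet collection.
type store map[string]struct{}

// reconcileOnce models one benchmark iteration: it lists existing objects,
// creates a new one, and deletes it again so the next iteration sees the
// same state. Without the delete, List cost would grow with b.N.
func reconcileOnce(s store, i int) int {
	listed := len(s) // per-op List work; stays constant only with cleanup

	name := fmt.Sprintf("worker-%d", i)
	s[name] = struct{}{}  // Create
	defer delete(s, name) // cleanup, analogous to remoteClient.Delete

	return listed
}
```

With the deferred delete, `reconcileOnce` sees an empty "cluster" on every call; dropping it would make each iteration list one more leftover object than the last, which is exactly the b.N-dependent skew the review flags.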
```go
func BenchmarkControllerReconcileUnreachable(b *testing.B) {
	benchutil.ControllerHarness[*benchutil.BenchRemoteClient]{
		Setup: benchutil.SetupRemoteClient,
		Reconcile: func(b *testing.B, brc *benchutil.BenchRemoteClient, _ []client.Object, _ int) {
			if _, err := brc.NewBuilder().UsePrimaryAPIURL().Build(); err != nil {
				b.Fatalf("Build failed: %v", err)
			}
		},
```
🧩 Analysis chain

🏁 Scripts executed:

```sh
#!/bin/bash
set -euo pipefail
# Verify whether the remoteclient builder does any network I/O before the first request.
rg -nP -C6 --type go 'func\s+\([^)]*\)\s+UsePrimaryAPIURL\s*\(' test/benchutil pkg/remoteclient
rg -nP -C10 --type go 'func\s+\([^)]*\)\s+Build\s*\(' test/benchutil pkg/remoteclient
rg -nP -C10 --type go 'func\s+\([^)]*\)\s+RESTConfig\s*\(' test/benchutil pkg/remoteclient
rg -nP -C10 --type go 'func\s+\([^)]*\)\s+BuildKubeClient\s*\(' test/benchutil pkg/remoteclient

# Check the benchmark setup and BenchRemoteClient
rg -nA20 'type BenchRemoteClient' test/benchutil
rg -nA15 'func SetupRemoteClient' test/benchutil
rg -nB5 'BenchmarkControllerReconcileUnreachable' test/benchmark/controller_sim_benchmark_test.go

# Look at ControllerHarness structure and how it invokes Reconcile
rg -nA30 'type ControllerHarness' test/benchutil
rg -nA50 'func.*Run.*ControllerHarness' test/benchutil/harness.go

# Get more of the run function to see how Reconcile is invoked
sed -n '47,80p' test/benchutil/harness.go

# Check how NewBuilder works and what URLs are configured
rg -nB10 -A10 'func.*NewBuilder' test/benchutil
rg -nB5 -A15 'type BenchEnv' test/benchutil

# Check how primaryURL and URLs are used in remoteclient
rg -nB5 -A15 'primaryURL|secondaryURL|urlToUse' pkg/remoteclient/remoteclient.go | head -60

# Check how the URLs are constructed in RESTConfig and what they represent
rg -nB10 -A20 'func.*unadulteratedRESTConfig' pkg/remoteclient/remoteclient.go
rg -nB5 -A15 'urlToUse.*==' pkg/remoteclient/remoteclient.go

# Check what's in the BenchEnv's ClusterDeployment
rg -nB5 -A30 'func NewBenchEnv' test/benchutil

# Continue reading NewBenchEnv to see the ClusterDeployment setup
sed -n '37,130p' test/benchutil/benchenv.go

# Look at the full BenchmarkControllerReconcileUnreachable code to understand intent
sed -n '433,450p' test/benchmark/controller_sim_benchmark_test.go

# Look at other benchmark patterns to understand the intent and compare
sed -n '380,445p' test/benchmark/controller_sim_benchmark_test.go

# Check if there's an APIURLOverride set anywhere in the BenchEnv setup
rg -n 'APIURLOverride|APIServerIPOverride' test/benchutil
# Also check if the benchmark is trying to test an unreachable endpoint
rg -nB2 -A2 'unreachable\|Unreachable' test/benchmark/controller_sim_benchmark_test.go
```
Build() performs network I/O immediately and will fail on an unreachable endpoint, aborting the benchmark instead of measuring reconciliation behavior.
Build() is not lazy—it performs reachability verification via discovery.NewDiscoveryClientForConfig() and restmapper.GetAPIGroupResources(), so any connection error occurs during Build(), not later. When Build() fails for an unreachable endpoint, b.Fatalf() aborts the benchmark run rather than allowing measurement of reconciliation behavior under degraded conditions. Additionally, the benchmark's comment claims to "model unreachable," but the setup does not configure an unreachable endpoint (no APIURLOverride is set), so the benchmark connects to the actual working envtest server.
To benchmark reconciliation against an unreachable endpoint, the benchmark should either set an APIURLOverride to a non-existent address and treat the resulting connection error as expected (avoiding b.Fatalf()), or restructure to measure reconciliation logic separately from client-building failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/benchmark/controller_sim_benchmark_test.go` around lines 435 - 442, The
benchmark BenchmarkControllerReconcileUnreachable currently calls
brc.NewBuilder().UsePrimaryAPIURL().Build() which performs network I/O and will
fail the entire benchmark via b.Fatalf when the endpoint is unreachable; change
the setup to simulate an unreachable API by configuring the builder with an
APIURLOverride to a non-routable address (or otherwise inject an unreachable
URL) and do not call b.Fatalf on the Build() error—instead treat the connection
error as expected (e.g., verify the error type/contains expected text and
continue to exercise the reconcile path or skip the build failure), or
alternatively separate client construction from reconciliation measurement so
the reconcile logic is benchmarked independently of Build() failures; update
BenchmarkControllerReconcileUnreachable and the brc.NewBuilder() usage
accordingly.
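The restructuring suggested above, pointing the client at a non-routable address and treating the connection error as the expected outcome rather than a fatal one, can be sketched with plain `net` dialing. The `203.0.113.1` address is from the TEST-NET-3 range reserved for documentation (RFC 5737); the port and timeout values are arbitrary illustrations, not the benchmark's actual configuration:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialUnreachable connects to an address that should never answer and
// returns the connection error instead of treating it as fatal, so a
// benchmark can measure the failure path rather than aborting on it.
func dialUnreachable() error {
	// 203.0.113.1 (TEST-NET-3) is reserved for documentation and not
	// routed on the public internet.
	conn, err := net.DialTimeout("tcp", "203.0.113.1:6443", 200*time.Millisecond)
	if err != nil {
		return err // expected outcome for the "unreachable" scenario
	}
	conn.Close()
	return fmt.Errorf("unexpectedly reached endpoint")
}
```

In the benchmark, the analogous move would be checking that `Build()` returns an error of the expected kind and counting that as a successful iteration, instead of calling `b.Fatalf`.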
```go
func newClient(cfg *rest.Config) (client.Client, error) {
	httpClient, err := rest.HTTPClientFor(cfg)
	if err != nil {
		return nil, fmt.Errorf("failed to create HTTP client: %w", err)
	}
	mapper, err := apiutil.NewDynamicRESTMapper(cfg, httpClient)
	if err != nil {
		return nil, fmt.Errorf("failed to create REST mapper: %w", err)
	}
	return client.New(cfg, client.Options{
		Scheme: hivescheme.GetScheme(),
		Mapper: mapper,
	})
```
🧩 Analysis chain

🏁 Script executed:

```sh
# First, let's find the client package and understand the structure
find . -type f -name "*.go" | grep -E "(client|benchmark)" | head -20

# Look for the client.Options definition
rg "type Options struct" -A 15 --type go

# Check the client.New function signature
rg "func New\(" -B 2 -A 10 --type go | head -50

# Check the actual file content
head -100 test/benchutil/client.go

# Find and examine client.New function
rg "func New\(" vendor/sigs.k8s.io/controller-runtime/pkg/client/ -A 20 --type go

# Get more of the newClient implementation
rg "func newClient\(config \*rest.Config, options Options\)" vendor/sigs.k8s.io/controller-runtime/pkg/client/client.go -A 50 --type go
```
Reuse the same http.Client in client.New.
The mapper uses the provided httpClient, but client.New creates a second HTTP client since HTTPClient is not set in the options. This splits connection pooling and TLS state, skewing benchmark results for cold-start wall time.
🔧 Suggested change

```diff
 return client.New(cfg, client.Options{
 	Scheme: hivescheme.GetScheme(),
 	Mapper: mapper,
+	HTTPClient: httpClient,
 })
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func newClient(cfg *rest.Config) (client.Client, error) {
	httpClient, err := rest.HTTPClientFor(cfg)
	if err != nil {
		return nil, fmt.Errorf("failed to create HTTP client: %w", err)
	}
	mapper, err := apiutil.NewDynamicRESTMapper(cfg, httpClient)
	if err != nil {
		return nil, fmt.Errorf("failed to create REST mapper: %w", err)
	}
	return client.New(cfg, client.Options{
		Scheme:     hivescheme.GetScheme(),
		Mapper:     mapper,
		HTTPClient: httpClient,
	})
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/benchutil/client.go` around lines 61 - 73, The newClient function
creates an httpClient for apiutil.NewDynamicRESTMapper but does not pass that
same client into client.New, causing a second HTTP client to be created and
splitting connection/TLS state; update newClient so the client.Options passed to
client.New includes HTTPClient: httpClient (in addition to Scheme and Mapper) so
the same *http.Client is reused by apiutil.NewDynamicRESTMapper and client.New.
```go
if os.Getenv("KUBEBUILDER_ASSETS") == "" {
	fmt.Fprintf(os.Stderr, `
ERROR: KUBEBUILDER_ASSETS environment variable not set

Benchmarks require envtest binaries (kube-apiserver, etcd).

To set up:
  $ make setup-envtest

Then export the path shown, or run:
  $ export KUBEBUILDER_ASSETS="$(hack/setup-envtest.sh | tail -1 | cut -d'=' -f2 | tr -d '"')"

See test/benchmark/README.md for details.
`)
	os.Exit(1)
```
🧩 Analysis chain

🏁 Script executed:

```sh
# First, let's look at the file structure and locate the file
find . -name "envtest.go" -path "*/benchutil/*" 2>/dev/null | head -20

# Get an overview of the file
wc -l test/benchutil/envtest.go

# Read the full file to understand the context
cat -n test/benchutil/envtest.go
```
Don't call os.Exit from the shared envtest bootstrap.
If testEnv.Start() succeeds (line 71) and a later step fails (lines 76-80), these exits bypass the normal cleanup path and leave the envtest child processes behind. Please change the function signature to return an error, then have callers fail via testing.TB methods instead.
This applies to all os.Exit calls within EnsureEnvTest and hiveCRDPaths (lines 42-57, 62-80, 96-105).
```go
NewObjects func(b *testing.B, ns string) []client.Object // optional; creates objects to reconcile
Setup      func(b *testing.B, env *BenchEnv) S           // creates per-benchmark state
// Reconcile runs one iteration. objects is the slice returned by NewObjects (nil if NewObjects is nil).
// When NewObjects returns b.N uniquely-named objects (e.g. via SingleTemplate), use objects[i].
// When NewObjects returns a fixed set to apply every iteration, iterate over the whole slice and ignore i.
Reconcile   func(b *testing.B, state S, objects []client.Object, i int)
SteadyState bool // when true, also run with pre-seeded objects
```
SteadyState plus a fixed object set makes FirstApply numbers misleading.
run(b, false) still reuses one env and one object slice across the full b.N loop. For benchmarks that return fixed-name objects and ignore i, only iteration 0 is a real first apply; the remaining b.N-1 iterations are already steady-state. That currently skews the FirstApply results in BenchmarkControllerReconcileClusterSync*. Either require per-iteration unique objects for first-apply runs (like SingleTemplate) or recreate/reset the env between iterations in the first-apply phase.
Also applies to: 39-41, 47-68
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/benchutil/harness.go` around lines 27 - 33, The benchmark misreports
FirstApply when SteadyState is true but NewObjects returns a fixed set (so
iterations after 0 are already steady-state); update run (the function that
orchestrates benchmark phases) to ensure true "first-apply" behavior by either
requiring per-iteration unique objects or by recreating the environment and
object slice for each iteration during the first-apply phase: detect when
SteadyState is true and NewObjects returns a fixed-length slice (i.e.,
NewObjects does not depend on i/unique naming), and in that case invoke Setup
and NewObjects inside the b.N loop (or otherwise reset the BenchEnv and
re-create objects each iteration) before calling Reconcile(b, state, objects,
i); use the existing symbols SteadyState, NewObjects, Setup, Reconcile and run
to locate and implement this change so FirstApply truly applies fresh objects
for every iteration.
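One way to implement the per-iteration-uniqueness option is sketched below, reduced to object names so it is self-contained (the real hook returns `[]client.Object`; the `generator` type and helper name are made up for this sketch):

```go
package main

import "fmt"

// generator mirrors the harness's NewObjects hook, reduced to object names.
type generator func(n int) []string

// uniquePerIteration wraps a fixed-set generator so that iteration i gets
// freshly named objects, keeping every first-apply iteration a true first
// apply (the same idea SingleTemplate uses with objects[i]).
func uniquePerIteration(gen generator, i int) []string {
	names := gen(1)
	out := make([]string, len(names))
	for j, name := range names {
		out[j] = fmt.Sprintf("%s-%d", name, i)
	}
	return out
}

func main() {
	fixed := func(int) []string { return []string{"apply-cm", "apply-secret"} }
	fmt.Println(uniquePerIteration(fixed, 0)) // [apply-cm-0 apply-secret-0]
	fmt.Println(uniquePerIteration(fixed, 1)) // [apply-cm-1 apply-secret-1]
}
```

The alternative (recreating the env between iterations) keeps object names stable but pays envtest startup cost inside the measured loop, so renaming is likely the cheaper fix.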
```go
benchClientOnce.Do(func() {
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		fmt.Fprintf(os.Stderr, "failed to create clientset: %v\n", err)
		os.Exit(1)
	}
```
🧩 Analysis chain

🏁 Script executed:

```sh
cat -n test/benchutil/namespace.go
```
Avoid os.Exit(1) in benchmark code paths.
At line 30, this exits the entire test binary abruptly and bypasses normal benchmark failure handling; use b.Fatalf() instead to properly fail the benchmark.
💡 Proposed fix

```diff
 import (
 	"context"
 	"fmt"
-	"os"
 	"sync"
 	"sync/atomic"
 	"testing"
@@
 	benchClientOnce.Do(func() {
 		cs, err := kubernetes.NewForConfig(cfg)
 		if err != nil {
-			fmt.Fprintf(os.Stderr, "failed to create clientset: %v\n", err)
-			os.Exit(1)
+			b.Fatalf("failed to create clientset: %v", err)
 		}
 		benchClientset = cs
 	})
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/benchutil/namespace.go` around lines 26 - 31, Replace the abrupt
os.Exit(1) in the benchClientOnce initialization with a benchmark-friendly
failure: when kubernetes.NewForConfig(cfg) returns an error, call the
benchmark's b.Fatalf (or pass the testing.B into the setup) instead of
fmt.Fprintf/os.Exit so the benchmark harness reports the failure correctly;
update the initialization flow around benchClientOnce and the NewForConfig error
handling to propagate or call b.Fatalf in the error branch.
@dlom: The following tests failed, say

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
2uasimojo left a comment
This is really nice, Mark. Most of my comments are in the form of wishes :)
LMK when you're ready for another pass.
```sh
# Ensure KUBEBUILDER_ASSETS is set
if [ -z "$KUBEBUILDER_ASSETS" ]; then
  echo "Error: KUBEBUILDER_ASSETS environment variable not set"
  echo ""
  echo "Run: make setup-envtest"
  echo "Then: export KUBEBUILDER_ASSETS=\"\$(hack/setup-envtest.sh | tail -1 | cut -d'=' -f2 | tr -d '\"')\""
  exit 1
fi
```
What would be really cool is if the benchmark could optionally be run against an actual cluster. I've been trying to track down where/how KUBEBUILDER_ASSETS is actually consumed to spoof the cluster via envtest, but have so far been unsuccessful.
```go
if err != nil {
	return nil, err
}
if b.opts.transportWrapper != nil {
```
This seems like it should be DRYable if, instead of func(*builderOptions), our BuilderOptions (possibly renamed to RESTConfigOptions) was func(*whatever_type_cfg_is). Unless that type is different in the different calls?
```go
{"ConfigMap", benchutil.GenerateConfigMap("apply-cm", 0)},
{"Secret", benchutil.GenerateSecret("apply-secret", 0)},
{"Deployment", benchutil.GenerateDeployment("apply-deploy")},
{"ServiceAccount", benchutil.GenerateServiceAccount("apply-sa")},
```
I wonder if it would be instructive to include some OpenShift-specific types?
```go
{"ConfigMap", benchutil.GenerateConfigMap("rt-cm", 0)},
{"Secret", benchutil.GenerateSecret("rt-secret", 0)},
{"ServiceAccount", benchutil.GenerateServiceAccount("rt-sa")},
```
Curious why the list of resources isn't the same across the board.
In fact, seems like we could DRY the list via a wrapper that accepts the string prefix (here rt).
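The suggested wrapper could look like the sketch below, reduced to names so it is self-contained (in the real suite the second field would be a `client.Object` from the `benchutil` generators, and whether every benchmark wants all four kinds is an open question from the thread):

```go
package main

import "fmt"

// benchObject pairs a kind with a generated name; illustrative stand-in
// for the (string, client.Object) pairs used in the benchmarks.
type benchObject struct {
	Kind string
	Name string
}

// standardObjects is one shared resource list, parameterized by the
// per-benchmark name prefix ("apply", "rt", ...), so each benchmark
// no longer repeats its own copy of the table.
func standardObjects(prefix string) []benchObject {
	return []benchObject{
		{"ConfigMap", prefix + "-cm"},
		{"Secret", prefix + "-secret"},
		{"Deployment", prefix + "-deploy"},
		{"ServiceAccount", prefix + "-sa"},
	}
}

func main() {
	for _, o := range standardObjects("rt") {
		fmt.Printf("%s %s\n", o.Kind, o.Name)
	}
}
```

Benchmarks that intentionally want a subset (e.g. no Deployment) could filter the shared list rather than defining their own.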
```go
{"100B", 100},
{"1KB", 1024},
{"10KB", 10 * 1024},
{"100KB", 100 * 1024},
```
similar DRY comment for sizes
```
Key insight: Round-trip count changes indicate behavioral changes.
If your optimization changes round-trip count, it's not just a performance
improvement -- it's a logic change that needs careful review.
```
Perhaps the comparison script could flag these visually somehow.
```
Benchmarks run against a real kube-apiserver + etcd (via controller-runtime's envtest),
not mocks, giving accurate round-trip and byte measurements. Each benchmark uses
```
Yeah, no. envtest is not a "real" KAS. I'm still not clear whether it has object storage, but at a minimum, it's local, so the round-trip times are certainly not "accurate". I suppose it might be set up with sleeps to simulate network latency/throughput? Anyway, suggest softening this language a bit.
As mentioned earlier, a goal would be the ability to run the suite against an actual cluster, so that we could get accurate RT measurements.
(Later) Oh, perhaps you meant a count of the number of round trips?
```go
// NewBenchEnv creates a fully initialized benchmark environment.
func NewBenchEnv(b *testing.B) *BenchEnv {
	b.Helper()
	cfg := EnsureEnvTest(b)
```
Yeah, yeah, so here, couldn't we optionally load up a real KUBECONFIG instead? And the rest would Just Work™?
```go
func hiveCRDPaths() []string {
	out, err := exec.Command("go", "list", "-m", "-f", "{{.Dir}}", "github.com/openshift/hive").Output()
	if err != nil {
		fmt.Fprintf(os.Stderr, "failed to locate module root: %v\n", err)
		os.Exit(1)
	}
	root := strings.TrimSpace(string(out))
	crdDir := filepath.Join(root, "config", "crds")
	if _, err := os.Stat(crdDir); err != nil {
		fmt.Fprintf(os.Stderr, "CRD directory not found at %s: %v\n", crdDir, err)
		os.Exit(1)
	}
	return []string{crdDir}
}
```
This seems unnecessarily obfuscated. It's not really adding anything to just having the list of dirs hardcoded in a const at the top.
```
k8s.io/utils v0.0.0-20251002143259-bc988d571ff4
sigs.k8s.io/cluster-api-provider-azure v1.21.1-0.20250929163617-2c4eaa611a39
sigs.k8s.io/controller-runtime v0.22.3
sigs.k8s.io/controller-runtime/tools/setup-envtest v0.0.0-20240927101401-4381fa0aeee4
```
Odd that this didn't produce a go.sum delta?
xref: HIVE-3127
The clustersync controller and others that depend on pkg/resource (Helper) and pkg/remoteclient have been repeatedly flagged by downstream users for:

- Slow reconciliation times
- Memory leaks during large SyncSet applications
- Excessive API server load (unnecessary round trips and discovery calls)
These packages are legacy code with significant behavioral dependencies from downstream consumers. We cannot rewrite them, but targeted optimizations (50-100 LOC changes) in specific hot paths could significantly improve performance without altering behavior.
Each benchmark runs against a real kube-apiserver (via envtest) and reports custom metrics for round-trip count and byte transfer (in addition to standard allocation and wall-time measurements)
Some preliminary findings:

- CreateOrUpdate SteadyState allocates ~79MB/op for a no-op -- 88x more than Apply for the same 2 roundtrips. Something isn't being cached properly.
- NewHelper construction costs 78MB and 205K allocs -- the dominant cost in most benchmarks. Caching the OpenAPI schema and REST mapper across operations would cut this dramatically.
- First Apply per package pays 2 extra discovery roundtrips (5 vs 3) due to disk-cache cold start. Pre-warming or sharing the discovery cache would eliminate the penalty.
- Hibernation takes 1 full second despite only 13 roundtrips and 1.3MB allocs. This is wall-clock bound, likely connection/TLS overhead in the BuildKubeClient path.
Assisted-by: Claude Opus 4.6 (LLM wrote the original suite of benchmarks automatically, I have refined them extensively to meet my quality standards)
/assign @2uasimojo