OCPNODE-3880: Add criocredentialproviderconfig event handler#5487
OCPNODE-3880: Add criocredentialproviderconfig event handler#5487QiWang19 wants to merge 1 commit intoopenshift:mainfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
/test all |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: QiWang19 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
865fa47 to
8a072e4
Compare
|
/test all |
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
8a072e4 to
75dfbfd
Compare
|
/test all |
75dfbfd to
511fde6
Compare
|
/test all |
511fde6 to
0ff0fe8
Compare
|
/test all |
0ff0fe8 to
5d5008b
Compare
|
/test all |
5d5008b to
7d79743
Compare
147fbf2 to
3027d6a
Compare
|
/test all |
3027d6a to
9a477a8
Compare
|
/test all |
9a477a8 to
58fa6bc
Compare
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
|
||
| for _, pool := range mcpPools { | ||
| applied := true | ||
| role := pool.Name |
There was a problem hiding this comment.
seems only used once. do we need?
| klog.Warningf("error updating CRIOCredentialProviderConfig status: %v", statusUpdateErr) | ||
| } | ||
| // Want to return the actual error received from the sync function | ||
| return err |
There was a problem hiding this comment.
I don't love this pattern but i'm sure there's precedent. seems better to have this function not take nor return an error and then return in caller rather than double returning with no modification
There was a problem hiding this comment.
Fixed. This Sync status function focuses on status update, let the caller return the actual err.
0227c1a to
4be43bd
Compare
saschagrunert
left a comment
There was a problem hiding this comment.
Solid start, the controller plumbing follows existing patterns well and the handleQueueErr refactor is a nice cleanup. Left some inline comments.
| } | ||
|
|
||
| if err := retry.RetryOnConflict(updateBackoff, func() error { | ||
| if crioCredentialProviderConfig != nil { |
There was a problem hiding this comment.
Always true at this point since the function already returned early for IsNotFound and the disabled case. Remove the check and flatten the retry body.
| crioCredentialProviderConfig, err = ctrl.criocpLister.Get("cluster") | ||
| if errors.IsNotFound(err) { | ||
| klog.V(2).Infof("CRIOCredentialProviderConfig 'cluster' does not exist or has been deleted") | ||
| return nil |
There was a problem hiding this comment.
The enhancement requires cleanup of generated MachineConfigs when the CR is deleted. The 97-<pool>-generated-credentialproviderconfig MCs are left orphaned here. On downgrade, nodes would still reference a crio-credential-provider binary that may no longer exist. This should iterate pools and delete the managed MCs.
| case apicfgv1.AzurePlatformType: | ||
| credProviderConfigPath = fmt.Sprintf(credProviderConfigPathFormat, "acr") | ||
| default: | ||
| return nil, "", fmt.Errorf("unsupported platform type: %s", cc.Spec.Infra.Status.PlatformStatus.Type) |
There was a problem hiding this comment.
This hard-errors on any platform that isn't AWS/GCP/Azure. The upstream crio-credential-provider works standalone: its reference credential-provider-config.yml has only the crio-credential-provider entry with no platform-specific provider. For bare-metal, vSphere, OpenStack, etc., there's no pre-existing credential provider config to merge into, but you could generate a standalone config with just the crio-credential-provider entry. As-is, the feature is broken on non-cloud platforms even though the enhancement says it applies to standalone OCP and SNO.
| // Get ControllerConfig | ||
| controllerConfig, err := ctrl.ccLister.Get(ctrlcommon.ControllerConfigName) | ||
| if err != nil { | ||
| return fmt.Errorf("could not get ControllerConfig %w", err) |
There was a problem hiding this comment.
Nit: missing colon before %w. Should be "could not get ControllerConfig: %w".
|
|
||
| // matchImages is not expected to be empty here as the caller should skip calling this function if there are no images | ||
| images := []string{} | ||
| for image := range matchImages { |
There was a problem hiding this comment.
Map iteration order in Go is non-deterministic. Each sync could produce different YAML, causing unnecessary MC updates and node rollouts (kubelet restarts). Add sort.Strings(images) after building the slice.
| } | ||
|
|
||
| type credentialProviderConfigWithVersion struct { | ||
| APIVersion string `json:"apiVersion"` |
There was a problem hiding this comment.
The enhancement requires: "validate the matchImages field to ensure no conflicts with existing non-CRI-O credential providers. ignores conflicting registries and adds logs to CR's status condition." No validation exists. If a user specifies images overlapping with the platform provider (e.g. *.dkr.ecr.*.amazonaws.com), both providers match the same images, causing undefined kubelet behavior.
| return nil, fmt.Errorf("could not decode CRIOCredentialProvider config for role %s: %w", role, err) | ||
| } | ||
|
|
||
| matchImages := make(map[string]bool) |
There was a problem hiding this comment.
Converts slice to map[string]bool only for updateCredentialProviderConfig to convert it back to a slice. The map is the source of the non-determinism bug (see other comment on helpers.go). Pass the slice directly and deduplicate with a sorted slice if needed.
| fmt.Sprintf("Error: %v", err), | ||
| ) | ||
| } | ||
| if len(args) > 0 { |
There was a problem hiding this comment.
When err == nil and len(args) > 0, condition is never assigned (the first branch requires len(args) == 0, the second requires err != nil), so condition.Message will nil-pointer panic. This if len(args) > 0 block should be nested inside if err != nil.
| } | ||
|
|
||
| func TestCrioCredentialProviderConfigCreate(t *testing.T) { | ||
| for _, platform := range []apicfgv1.PlatformType{apicfgv1.AWSPlatformType} { |
There was a problem hiding this comment.
Only tests AWS. GCP and Azure should be covered given the platform switch. More importantly, a non-cloud platform test would catch the hard-error issue in generateOriginalCredentialProviderConfig.
| } | ||
| } | ||
|
|
||
| func TestCrioCredentialProviderConfigUpdate(t *testing.T) { |
There was a problem hiding this comment.
No test for CR deletion. The enhancement and the downgrade strategy require that generated MachineConfigs are cleaned up when the CR is removed.
There was a problem hiding this comment.
During API review, this CRD was explicitly constrained to a singleton and OpenShift managed resource, we do not support user delete/recreate, and it is not required to remove this singleton as part of a downgrade flow.
The enhancement should be updated to clarify that.
Instead of testing CR deletion, added TestCrioCredentialProviderConfigCreateEmpty to verify that shipping a CR with an empty spec correctly handles the no-op scenario.
c09c4eb to
ce0b3cb
Compare
6229834 to
89ae4bf
Compare
| }() | ||
|
|
||
| if !ctrl.criocpEnabled() || !ctrl.addedCRIOCPObservers { | ||
| klog.V(2).Infof("CRIOCredentialProviderConfig observer not added, skipping sync") |
There was a problem hiding this comment.
The owner reference does not set Controller: true, so GC won't cascade-delete the managed MCs. ContainerRuntimeConfig handles this with finalizers and cascadeDelete. Consider doing the same here: iterate pools, delete the 97-<pool>-generated-credentialproviderconfig MCs, then return.
There was a problem hiding this comment.
This controller only reconciles the singleton CRIOCredentialProviderConfig named cluster, and cluster objects are OpenShift managed and not user deletable, enforced in openshift-apiserver. Because there is no supported delete path for this CR, we do not have an orphaned MC in this case.
| sort.Strings(conflictList) | ||
|
|
||
| if len(matchImages) > 0 { | ||
| for img := range matchImages { |
There was a problem hiding this comment.
When all images overlap, matchImages is empty and the else branch re-adds every newImage unfiltered, contradicting the overlap status reported to the user.
| for img := range matchImages { | |
| if len(matchImages) > 0 { | |
| for img := range matchImages { | |
| images = append(images, img) | |
| } | |
| } else if len(overlappedEntries) == 0 { | |
| // No existing providers to conflict with (e.g. non-cloud platform) | |
| for _, img := range newImages { | |
| images = append(images, string(img)) | |
| } | |
| } |
| ) | ||
| } else { | ||
| condition = apihelpers.NewCondition( | ||
| apicfgv1alpha1.ConditionTypeMachineConfigRendered, |
There was a problem hiding this comment.
The partially-applied call site passes the Reason constant as args[0], which gets used as the fmt.Sprintf format string here. Since the Reason has no % verbs, the actual message and overlappedEntries are silently discarded. The condition Reason also stays ReasonMachineConfigRenderingSucceeded instead of ReasonConfigurationPartiallyApplied. Consider splitting Reason and Message into separate parameters.
89ae4bf to
0d2b089
Compare
WalkthroughAdds CRIOCredentialProviderConfig support: cluster RBAC, controller observers/queues and status handling, rendering and Ignition generation (including kubelet drop-in and flag wiring), constants and default policies, tests, and CRD registration for the test environment. Changes
Sequence Diagram(s)sequenceDiagram
participant API as "K8s API\n(CRIOCredentialProviderConfig)"
participant Controller as "machine-config-controller\n(container-runtime-config)"
participant Renderer as "Ignition Generator\n(merge & drop-in)"
participant MCO as "MachineConfig/ MCO"
participant Node as "Node kubelet\n(systemd)"
API->>Controller: watch/create/update CRIOCredentialProviderConfig
Controller->>Controller: enqueue & process work item
Controller->>Renderer: generate platform-aware provider config
Renderer->>Controller: return merged Ignition + drop-in unit
Controller->>MCO: create/update MachineConfig with rendered Ignition
MCO->>Node: apply MachineConfig (write files, add unit)
Node->>Node: systemctl daemon-reload && systemctl restart kubelet
Controller->>API: update CRIO CP status (ObservedGeneration, Conditions)
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (2)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
1168-1172:⚠️ Potential issue | 🟠 MajorDelete the managed MachineConfigs when the singleton CR disappears.
The
IsNotFoundpath returns immediately, so the existing97-<pool>-generated-credentialproviderconfigMachineConfigs survive CR deletion. That leaves stale kubelet credential-provider config on the pool until something else cleans it up.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 1168 - 1172, The IsNotFound branch for ctrl.criocpLister.Get("cluster") currently returns immediately and leaves generated MachineConfigs behind; instead, when the singleton CR is missing, enumerate and delete the managed MachineConfigs (those matching the generated name pattern "97-<pool>-generated-credentialproviderconfig") using the MachineConfig lister/client (e.g., ctrl.mcLister or ctrl.machineConfigClient/MachineConfigs().Delete), handling NotFound errors gracefully and logging failures; update the errors.IsNotFound branch in the function around ctrl.criocpLister.Get("cluster") to perform this cleanup before returning.pkg/controller/container-runtime-config/helpers.go (1)
1354-1366:⚠️ Potential issue | 🟠 MajorPrepend
crio-credential-providerinstead of appending it.Provider precedence is order-sensitive here. Appending the new entry makes the CRI-O plugin lowest priority, which can change which provider wins for nested or overlapping matches.
Suggested fix
- credProviderConfigObject.Providers = append(credProviderConfigObject.Providers, newProvider) + credProviderConfigObject.Providers = append( + []*credentialProviderWithTag{newProvider}, + credProviderConfigObject.Providers..., + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/helpers.go` around lines 1354 - 1366, The credential provider for CRI-O is being appended which makes crio-credential-provider lowest priority; change the logic that adds the new credentialProviderWithTag (crioCredentialProviderName) so it is prepended to credProviderConfigObject.Providers (i.e., insert newProvider at index 0) rather than using append, preserving all existing providers after it to ensure correct precedence for matching.
🧹 Nitpick comments (1)
install/0000_80_machine-config_00_rbac.yaml (1)
149-178: Consider addingibm-cloud-managedannotation for consistency.Other RBAC resources in this file include the
include.release.openshift.io/ibm-cloud-managed: "true"annotation. If the CRIO credential provider feature should be available on IBM Cloud Managed deployments, this annotation should be added to both the ClusterRole and ClusterRoleBinding.Proposed fix to add ibm-cloud-managed annotation
apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: node-credential-providers annotations: + include.release.openshift.io/ibm-cloud-managed: "true" include.release.openshift.io/self-managed-high-availability: "true" include.release.openshift.io/single-node-developer: "true"apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: node-credential-providers-binding annotations: + include.release.openshift.io/ibm-cloud-managed: "true" include.release.openshift.io/self-managed-high-availability: "true" include.release.openshift.io/single-node-developer: "true"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 149 - 178, Add the missing IBM Cloud managed annotation to the ClusterRole and ClusterRoleBinding by inserting include.release.openshift.io/ibm-cloud-managed: "true" under metadata.annotations for the ClusterRole named node-credential-providers and the ClusterRoleBinding named node-credential-providers-binding so their annotations match other RBAC entries and ensure the CRIO credential provider is marked for IBM Cloud Managed deployments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 156-162: The RBAC rule grants the custom verb
request-serviceaccounts-token-audience to resources: ["*"], which is too broad;
update the rules for serviceaccounts so the
request-serviceaccounts-token-audience verb is only applied to resources:
["serviceaccounts"] (consolidate the two serviceaccount rules so the
"serviceaccounts" resource has verbs
"get","list","request-serviceaccounts-token-audience")—locate the rules block
containing resources: ["*"] and the serviceaccounts rule and adjust accordingly.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 651-654: The current verifyCRIOCredentialProviderConfigContents
function short-circuits when criocp.Spec.MatchImages is empty and returns before
running the empty-spec assertions; change the logic so that when
len(criocp.Spec.MatchImages) == 0 you call the empty-MC assertion
(expectMCNilContent or equivalent) and still run any drop-in/file checks from
the verifyOpts (criocpConfigVerifyOptions) before returning; in short, do not
simply return immediately—invoke expectMCNilContent(mcName) and execute the
drop-in verification paths in verifyOpts when MatchImages is empty, then return.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-566: The current logic calls queue.Forget(key) before
queue.AddAfter(...), which prevents honoring maxRetries and causes permanent
1-minute resyncs; change the retry handling in this error branch to check the
retry count (use queue.NumRequeues(key) or the controller's maxRetries variable)
and only call queue.AddAfter or queue.AddRateLimited when retries < maxRetries,
otherwise call queue.Forget(key) and stop requeuing; update the block around
utilruntime.HandleError(err), klog.V(2).Infof(...), queue.Forget, and
queue.AddAfter to implement this conditional retry logic using the existing
maxRetries, resourceType, key, and err symbols.
- Around line 1235-1249: The code fetches an informer-cached object via
ctrl.criocpLister.Get("cluster") and then mutates it in-place
(meta.SetStatusCondition) which can corrupt the shared cache; fix by
deep-copying the returned object immediately (e.g., call DeepCopy on the result
of criocpLister.Get inside the RetryOnConflict closure), perform the
ObservedGeneration update and meta.SetStatusCondition on the copy, and pass that
copy to
ctrl.configClient.ConfigV1alpha1().CRIOCredentialProviderConfigs().UpdateStatus;
keep the use of wrapErrorWithCRIOCredentialProviderConfigCondition,
retry.RetryOnConflict, and UpdateStatus but ensure mutations occur only on the
DeepCopy.
In `@pkg/controller/container-runtime-config/helpers.go`:
- Around line 1309-1327: The current loop resets skip per provider causing an
image to be considered non-conflicting if it doesn't overlap a later provider;
change the logic to detect conflicts per image across all providers by iterating
newImages outermost and then checking every provider in
credProviderConfigObject.Providers (skipping crioCredentialProviderName) and
using runtimeutils.ScopeIsNestedInsideScope to mark overlappedEntries[img]=true
and set skip=true then break out of the provider loop; only after checking all
providers (or breaking on first overlap) add img to matchImages if skip is
false. Ensure you reference and update the same variables: newImages,
matchImages, overlappedEntries and preserve skipping of
crioCredentialProviderName.
In `@test/framework/envtest.go`:
- Around line 119-121: The shared envtest helpers are not handling the
config.openshift.io/v1alpha1 CRIOCredentialProviderConfig type; update
NewTestEnv and the helper functions CheckCleanEnvironment, CleanEnvironment, and
CreateObjects to include CRIOCredentialProviderConfig in the CRD/object lists
they install, delete, and create. Specifically, add the CRD file entries (the
three zz_generated.crd-manifests paths) to the set NewTestEnv installs, ensure
CheckCleanEnvironment includes
config.openshift.io/v1alpha1.CRIOCredentialProviderConfig when scanning for
leftover resources, make CleanEnvironment delete any
CRIOCredentialProviderConfig instances it finds, and allow CreateObjects to
accept/seed CRIOCredentialProviderConfig objects so tests can create them via
the common helper.
---
Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1168-1172: The IsNotFound branch for
ctrl.criocpLister.Get("cluster") currently returns immediately and leaves
generated MachineConfigs behind; instead, when the singleton CR is missing,
enumerate and delete the managed MachineConfigs (those matching the generated
name pattern "97-<pool>-generated-credentialproviderconfig") using the
MachineConfig lister/client (e.g., ctrl.mcLister or
ctrl.machineConfigClient/MachineConfigs().Delete), handling NotFound errors
gracefully and logging failures; update the errors.IsNotFound branch in the
function around ctrl.criocpLister.Get("cluster") to perform this cleanup before
returning.
In `@pkg/controller/container-runtime-config/helpers.go`:
- Around line 1354-1366: The credential provider for CRI-O is being appended
which makes crio-credential-provider lowest priority; change the logic that adds
the new credentialProviderWithTag (crioCredentialProviderName) so it is
prepended to credProviderConfigObject.Providers (i.e., insert newProvider at
index 0) rather than using append, preserving all existing providers after it to
ensure correct precedence for matching.
---
Nitpick comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 149-178: Add the missing IBM Cloud managed annotation to the
ClusterRole and ClusterRoleBinding by inserting
include.release.openshift.io/ibm-cloud-managed: "true" under
metadata.annotations for the ClusterRole named node-credential-providers and the
ClusterRoleBinding named node-credential-providers-binding so their annotations
match other RBAC entries and ensure the CRIO credential provider is marked for
IBM Cloud Managed deployments.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e40c7900-9f33-4929-8e5c-f255d95f064f
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
| rules: | ||
| - apiGroups: [""] | ||
| resources: ["serviceaccounts"] | ||
| verbs: ["get", "list"] | ||
| - apiGroups: [""] | ||
| resources: ["*"] | ||
| verbs: ["request-serviceaccounts-token-audience"] |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
OpenShift request-serviceaccounts-token-audience RBAC verb
💡 Result:
In OpenShift/Kubernetes RBAC, request-serviceaccounts-token-audience is itself the RBAC verb string (it’s a non-standard/custom verb) that you place under rules[].verbs—typically alongside get / list on the serviceaccounts resource. Examples in the wild grant it like this: verbs: ["request-serviceaccounts-token-audience"] (often with get,list) on resources: ["serviceaccounts"]. [1], [2]
If what you actually need is permission to call the upstream TokenRequest endpoint (.../serviceaccounts/<name>/token), the RBAC permission is the standard Kubernetes pattern: verb create on the token subresource (resource: serviceaccounts/token). [3]
Example ClusterRole rule (custom verb):
rules:
- apiGroups: [""]
resources: ["serviceaccounts"]
verbs: ["get","list","request-serviceaccounts-token-audience"]Example ClusterRole rule (TokenRequest API):
rules:
- apiGroups: [""]
resources: ["serviceaccounts/token"]
verbs: ["create"]Sources: [1] [2] [3]
Fix overly broad resource scope for request-serviceaccounts-token-audience verb.
The verb request-serviceaccounts-token-audience is a valid OpenShift custom RBAC verb, but it's currently applied to resources: ["*"], granting it on all resources. According to OpenShift patterns, this verb should be restricted to resources: ["serviceaccounts"] only. The second rule should be:
Suggested fix
- apiGroups: [""]
resources: ["serviceaccounts"]
verbs: ["request-serviceaccounts-token-audience"]This consolidates both rules into a single rule on the serviceaccounts resource.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@install/0000_80_machine-config_00_rbac.yaml` around lines 156 - 162, The RBAC
rule grants the custom verb request-serviceaccounts-token-audience to resources:
["*"], which is too broad; update the rules for serviceaccounts so the
request-serviceaccounts-token-audience verb is only applied to resources:
["serviceaccounts"] (consolidate the two serviceaccount rules so the
"serviceaccounts" resource has verbs
"get","list","request-serviceaccounts-token-audience")—locate the rules block
containing resources: ["*"] and the serviceaccounts rule and adjust accordingly.
| func (f *fixture) verifyCRIOCredentialProviderConfigContents(t *testing.T, mcName string, criocp *apicfgv1alpha1.CRIOCredentialProviderConfig, verifyOpts criocpConfigVerifyOptions) { | ||
| if len(criocp.Spec.MatchImages) == 0 { | ||
| return | ||
| } |
There was a problem hiding this comment.
Don't short-circuit the empty-spec assertions.
This returns before expectMCNilContent or any drop-in checks run, so TestCrioCredentialProviderConfigCreateEmpty currently passes without validating the generated MachineConfig at all.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`
around lines 651 - 654, The current verifyCRIOCredentialProviderConfigContents
function short-circuits when criocp.Spec.MatchImages is empty and returns before
running the empty-spec assertions; change the logic so that when
len(criocp.Spec.MatchImages) == 0 you call the empty-MC assertion
(expectMCNilContent or equivalent) and still run any drop-in/file checks from
the verifyOpts (criocpConfigVerifyOptions) before returning; in short, do not
simply return immediately—invoke expectMCNilContent(mcName) and execute the
drop-in verification paths in verifyOpts when MatchImages is empty, then return.
pkg/controller/container-runtime-config/container_runtime_config_controller.go
Show resolved
Hide resolved
pkg/controller/container-runtime-config/container_runtime_config_controller.go
Show resolved
Hide resolved
| filepath.Join("..", "..", "vendor", "github.com", "openshift", "api", "config", "v1alpha1", "zz_generated.crd-manifests", "0000_10_config-operator_01_criocredentialproviderconfigs-CustomNoUpgrade.crd.yaml"), | ||
| filepath.Join("..", "..", "vendor", "github.com", "openshift", "api", "config", "v1alpha1", "zz_generated.crd-manifests", "0000_10_config-operator_01_criocredentialproviderconfigs-DevPreviewNoUpgrade.crd.yaml"), | ||
| filepath.Join("..", "..", "vendor", "github.com", "openshift", "api", "config", "v1alpha1", "zz_generated.crd-manifests", "0000_10_config-operator_01_criocredentialproviderconfigs-TechPreviewNoUpgrade.crd.yaml"), |
There was a problem hiding this comment.
Wire CRIOCredentialProviderConfig through the shared envtest helpers.
NewTestEnv now installs this CRD, but CheckCleanEnvironment, CleanEnvironment, and CreateObjects still don't list/delete/create config.openshift.io/v1alpha1.CRIOCredentialProviderConfig. Any envtest that exercises this type can leak state across cases and can't seed the object through the common helper.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/framework/envtest.go` around lines 119 - 121, The shared envtest helpers
are not handling the config.openshift.io/v1alpha1 CRIOCredentialProviderConfig
type; update NewTestEnv and the helper functions CheckCleanEnvironment,
CleanEnvironment, and CreateObjects to include CRIOCredentialProviderConfig in
the CRD/object lists they install, delete, and create. Specifically, add the CRD
file entries (the three zz_generated.crd-manifests paths) to the set NewTestEnv
installs, ensure CheckCleanEnvironment includes
config.openshift.io/v1alpha1.CRIOCredentialProviderConfig when scanning for
leftover resources, make CleanEnvironment delete any
CRIOCredentialProviderConfig instances it finds, and allow CreateObjects to
accept/seed CRIOCredentialProviderConfig objects so tests can create them via
the common helper.
0d2b089 to
9b97140
Compare
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
install/0000_80_machine-config_00_rbac.yaml (1)
160-162:⚠️ Potential issue | 🟠 MajorScope
request-serviceaccounts-token-audiencetoserviceaccountsonly.This rule still grants the custom verb across
resources: ["*"], which is broader than the credential-provider flow needs. Please fold it into the existingserviceaccountsrule instead of granting it cluster-wide on every core resource.Suggested fix
- apiGroups: [""] resources: ["serviceaccounts"] - verbs: ["get", "list"] - - apiGroups: [""] - resources: ["*"] - verbs: ["request-serviceaccounts-token-audience"] + verbs: ["get", "list", "request-serviceaccounts-token-audience"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 160 - 162, The manifest currently grants the custom verb "request-serviceaccounts-token-audience" to resources: ["*"] under apiGroups: [""], which is too broad; remove that standalone rule and instead add "request-serviceaccounts-token-audience" to the existing serviceaccounts rule (the rule that has apiGroups: [""], resources: ["serviceaccounts"] or resources: ["serviceaccounts", ...]) so the verb is scoped only to serviceaccounts; ensure you remove the resources: ["*"] entry and update the serviceaccounts rule's verbs array to include "request-serviceaccounts-token-audience" without duplicating other verbs.pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
562-565:⚠️ Potential issue | 🟠 Major
maxRetriesstill never becomes terminal here.After
Forget(key), re-adding the item withAddAfterturns persistent failures into an endless 1-minute retry loop for every queue using this helper. That defeats the contract implied bymaxRetries.Suggested fix
utilruntime.HandleError(err) klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err) queue.Forget(key) - queue.AddAfter(key, 1*time.Minute)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The code calls queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute) which resets retry state and defeats the intended maxRetries behavior; instead, check the retry count (use queue.NumRequeues(key) or the workqueue.RateLimitingInterface helper) against maxRetries and only re-enqueue with AddAfter/AddRateLimited when the count is below maxRetries, otherwise call queue.Forget(key) and stop requeuing (log terminal failure). Update the error-handling block around utilruntime.HandleError and klog.V(2).Infof to implement this conditional requeue logic using the existing maxRetries variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1224-1228: The status update is only executed when the reconcile
actually applied changes (applied == true), so no-op reconciles don't advance
CRIOCredentialProviderConfig status; move or add a call to
syncCRIOCredentialProviderConfigStatusOnly(...) so it runs even when applied is
false (i.e., after the applied conditional), ensuring you invoke
syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeMachineConfigRendered,
apicfgv1alpha1.ReasonMachineConfigRenderingSucceeded) on successful
reconciliation/no-op so ObservedGeneration and the rendered condition are
advanced regardless of whether MachineConfigs were modified.
---
Duplicate comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 160-162: The manifest currently grants the custom verb
"request-serviceaccounts-token-audience" to resources: ["*"] under apiGroups:
[""], which is too broad; remove that standalone rule and instead add
"request-serviceaccounts-token-audience" to the existing serviceaccounts rule
(the rule that has apiGroups: [""], resources: ["serviceaccounts"] or resources:
["serviceaccounts", ...]) so the verb is scoped only to serviceaccounts; ensure
you remove the resources: ["*"] entry and update the serviceaccounts rule's
verbs array to include "request-serviceaccounts-token-audience" without
duplicating other verbs.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The code calls queue.Forget(key) and then
queue.AddAfter(key, 1*time.Minute) which resets retry state and defeats the
intended maxRetries behavior; instead, check the retry count (use
queue.NumRequeues(key) or the workqueue.RateLimitingInterface helper) against
maxRetries and only re-enqueue with AddAfter/AddRateLimited when the count is
below maxRetries, otherwise call queue.Forget(key) and stop requeuing (log
terminal failure). Update the error-handling block around
utilruntime.HandleError and klog.V(2).Infof to implement this conditional
requeue logic using the existing maxRetries variable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5e364b1d-5e56-4d04-922d-2361caed86f3
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (6)
- manifests/machineconfigcontroller/clusterrole.yaml
- pkg/controller/container-runtime-config/helpers_test.go
- pkg/daemon/constants/constants.go
- test/framework/envtest.go
- pkg/apihelpers/apihelpers.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
pkg/controller/container-runtime-config/container_runtime_config_controller.go
Show resolved
Hide resolved
9b97140 to
dc39060
Compare
|
@QiWang19: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
562-565:⚠️ Potential issue | 🟠 MajorDon't requeue from the terminal retry path.
After
maxRetries, this still doesForgetand thenAddAfter, so broken objects never actually drop out of the queue and will retry forever every minute.Suggested fix
utilruntime.HandleError(err) klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err) queue.Forget(key) - queue.AddAfter(key, 1*time.Minute)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The terminal retry path currently calls utilruntime.HandleError(err), logs via klog.V(2).Infof, then calls queue.Forget(key) and queue.AddAfter(key, 1*time.Minute) causing infinite requeues; update the terminal path so after reaching maxRetries you call utilruntime.HandleError(err) and queue.Forget(key) (or just Forget and return) but do NOT call queue.AddAfter; locate the logic around maxRetries handling in container_runtime_config_controller.go where queue.Forget and queue.AddAfter are invoked and remove the AddAfter call so failed objects are not requeued forever.pkg/controller/container-runtime-config/container_runtime_config_controller_test.go (1)
659-664:⚠️ Potential issue | 🟡 MinorDon't short-circuit the empty-spec verification path.
This still returns before any later assertions run, so
TestCrioCredentialProviderConfigCreateEmptycan pass while silently masking mismatches incriocpConfigVerifyOptionsfor the non-cloud path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go` around lines 659 - 664, The test currently returns early when verifyOpts.expectMCNilContent is true, which skips later assertions and can mask mismatches for TestCrioCredentialProviderConfigCreateEmpty; change the block in the test that checks verifyOpts.expectMCNilContent so it only asserts that ignCfg.Storage.Files is empty (when expectMCNilContent is true) and do not return from the test—remove the early return (or scope it so only the specific check is skipped) so the remaining criocpConfigVerifyOptions-related assertions still execute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 685-692: The current loop sets allMatch to true when any
criocp.Spec.MatchImages entry is found in provider.MatchImages, so a partial
match passes; change the logic so allMatch reflects that every expected image is
present: initialize allMatch = true before the loop over
criocp.Spec.MatchImages, and for each mi check assert.Contains(t,
provider.MatchImages, string(mi)); if the check fails set allMatch = false and
break (or return) immediately. Use the same symbols (allMatch,
criocp.Spec.MatchImages, provider.MatchImages) so the test correctly requires
every expected image to be present.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1208-1210: When handling CRIOCredentialProviderConfig overlaps in
syncCRIOCredentialProviderConfigStatusOnly, also clear the stale
ConditionTypeValidated/ReasonConfigurationPartiallyApplied when overlaps are
resolved: if len(overlappedEntries) == 0 call
syncCRIOCredentialProviderConfigStatusOnly with a success/validated condition
(e.g., apicfgv1alpha1.ConditionTypeValidated, ReasonConfigurationApplied or
similar) and a message indicating rendering succeeded so the previous
ConfigurationPartiallyApplied condition is removed; ensure you update the same
condition type (ConditionTypeValidated) rather than only setting it when
overlaps exist.
In `@pkg/controller/container-runtime-config/helpers.go`:
- Line 1383: The parameter name in function generateDropinUnitCredProviderConfig
is misspelled as generticCredProviderConfigPath; rename it to
genericCredProviderConfigPath in the function signature and update all internal
references and callers to use the corrected identifier so the symbol matches
expected spelling (ensure generateDropinUnitCredProviderConfig and any places
that call it compile after the rename).
---
Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 659-664: The test currently returns early when
verifyOpts.expectMCNilContent is true, which skips later assertions and can mask
mismatches for TestCrioCredentialProviderConfigCreateEmpty; change the block in
the test that checks verifyOpts.expectMCNilContent so it only asserts that
ignCfg.Storage.Files is empty (when expectMCNilContent is true) and do not
return from the test—remove the early return (or scope it so only the specific
check is skipped) so the remaining criocpConfigVerifyOptions-related assertions
still execute.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The terminal retry path currently calls
utilruntime.HandleError(err), logs via klog.V(2).Infof, then calls
queue.Forget(key) and queue.AddAfter(key, 1*time.Minute) causing infinite
requeues; update the terminal path so after reaching maxRetries you call
utilruntime.HandleError(err) and queue.Forget(key) (or just Forget and return)
but do NOT call queue.AddAfter; locate the logic around maxRetries handling in
container_runtime_config_controller.go where queue.Forget and queue.AddAfter are
invoked and remove the AddAfter call so failed objects are not requeued forever.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b802971-f882-404a-af1a-cf29785423f4
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (8)
- pkg/daemon/constants/constants.go
- templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
- install/0000_80_machine-config_00_rbac.yaml
- test/framework/envtest.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
Outdated
Show resolved
Hide resolved
| if len(overlappedEntries) > 0 { | ||
| ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, "CRIOCredentialProviderConfig has one or multiple entries that overlap with the original credential provider config. Skip rendering entries: %v.", overlappedEntries) | ||
| } |
There was a problem hiding this comment.
Clear the stale Validated=False condition after overlaps are fixed.
This only updates ConditionTypeValidated when overlaps exist. If a user removes the conflicting entries later, the old ConfigurationPartiallyApplied condition is left behind indefinitely even though rendering now succeeds.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`
around lines 1208 - 1210, When handling CRIOCredentialProviderConfig overlaps in
syncCRIOCredentialProviderConfigStatusOnly, also clear the stale
ConditionTypeValidated/ReasonConfigurationPartiallyApplied when overlaps are
resolved: if len(overlappedEntries) == 0 call
syncCRIOCredentialProviderConfigStatusOnly with a success/validated condition
(e.g., apicfgv1alpha1.ConditionTypeValidated, ReasonConfigurationApplied or
similar) and a message indicating rendering succeeded so the previous
ConfigurationPartiallyApplied condition is removed; ensure you update the same
condition type (ConditionTypeValidated) rather than only setting it when
overlaps exist.
Signed-off-by: Qi Wang <qiwan@redhat.com>
dc39060 to
d1eef7d
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (3)
install/0000_80_machine-config_00_rbac.yaml (1)
156-162:⚠️ Potential issue | 🟠 MajorScope
request-serviceaccounts-token-audiencetoserviceaccountsonly.Granting this custom verb on
resources: ["*"]givessystem:nodesthat capability across every core resource. The new credential-provider flow only needs it onserviceaccounts, so this should be merged into the preceding rule instead of expanding the resource scope cluster-wide.Suggested fix
rules: - apiGroups: [""] resources: ["serviceaccounts"] - verbs: ["get", "list"] - - apiGroups: [""] - resources: ["*"] - verbs: ["request-serviceaccounts-token-audience"] + verbs: ["get", "list", "request-serviceaccounts-token-audience"]In OpenShift RBAC, should the custom verb `request-serviceaccounts-token-audience` be granted on `resources: ["serviceaccounts"]` rather than `resources: ["*"]`?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 156 - 162, The second rule grants the custom verb "request-serviceaccounts-token-audience" across all core resources; narrow its scope by moving that verb into the prior rule that already targets apiGroups: [""] and resources: ["serviceaccounts"] so the combined rule lists verbs: ["get","list","request-serviceaccounts-token-audience"] (i.e., remove the separate rule with resources: ["*"] and ensure only serviceaccounts retain the custom verb).pkg/controller/container-runtime-config/container_runtime_config_controller.go (2)
549-565:⚠️ Potential issue | 🟠 MajorStop requeueing once
maxRetriesis exhausted.
queue.Forget(key)followed byqueue.AddAfter(key, 1*time.Minute)turns permanent failures into an endless 1-minute retry loop, so invalid objects never actually drop out of the queue despite the controller contract above.Suggested fix
utilruntime.HandleError(err) klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err) queue.Forget(key) - queue.AddAfter(key, 1*time.Minute)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 549 - 565, The current handleQueueErr implementation requeues keys after maxRetries by calling queue.Forget(key) then queue.AddAfter(key, 1*time.Minute), creating an endless retry loop; modify the logic in Controller.handleQueueErr so that when queue.NumRequeues(key) >= maxRetries it only calls queue.Forget(key) (and utilruntime.HandleError(err) / klog as present) and does not call queue.AddAfter(key, 1*time.Minute) so the item is dropped permanently instead of being requeued.
1208-1228:⚠️ Potential issue | 🟠 MajorClear the stale
Validated=Falsecondition when overlaps disappear.The validated condition is only updated inside the overlap branch. If a user fixes the conflicting
matchImageslater, this reconcile only writesMachineConfigRendered=True, so the oldConditionTypeValidated/ partial-applied status remains stuck on the CR. UpdateConditionTypeValidatedon the non-overlap path too someta.SetStatusConditioncan replace the stale failure.Suggested fix
if len(overlappedEntries) > 0 { ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, "CRIOCredentialProviderConfig has one or multiple entries that overlap with the original credential provider config. Skip rendering entries: %v.", overlappedEntries) + } else { + ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, "") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 1208 - 1228, The code only calls syncCRIOCredentialProviderConfigStatusOnly(..., apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, ...) inside the "overlap" branch, leaving a stale Validated=False when overlaps are later resolved; modify the non-overlap path (the branch that proceeds to call syncIgnitionConfig and returns nil/applied) to explicitly clear or set ConditionTypeValidated to a success state (e.g., call syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationValidated, "validated") or similar) so meta.SetStatusCondition replaces the previous failure—add this call just before returning nil or after successful MachineConfig rendering in the code paths that use syncIgnitionConfig and when applied is true, referencing syncCRIOCredentialProviderConfigStatusOnly and ConditionTypeValidated to locate the insertion point.
🧹 Nitpick comments (1)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
67-71: Reuse the exported credential-provider path constants here.These literals now duplicate
pkg/daemon/constants/constants.go, so a later path change can silently desync controller rendering from the daemon-side policy wiring. Please build these fromconstants.KubernetesCredentialProvidersDirandconstants.KubeletCrioImageCredProviderConfPathinstead of hardcoding them again.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 67 - 71, Replace the hardcoded credential provider path constants in this file: instead of defining genericCredProviderConfigPath and kubeletCrioImageCredProviderConfPath as string literals, build them from the exported constants in pkg/daemon/constants: use constants.KubernetesCredentialProvidersDir to construct the generic credential provider config filename (generic-credential-provider.yaml) and use constants.KubeletCrioImageCredProviderConfPath for the kubelet crio image credential provider config; update the declarations where builtInLabelKey and these two path constants are defined so the controller reuses constants.KubernetesCredentialProvidersDir and constants.KubeletCrioImageCredProviderConfPath rather than duplicating the literal paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 156-162: The second rule grants the custom verb
"request-serviceaccounts-token-audience" across all core resources; narrow its
scope by moving that verb into the prior rule that already targets apiGroups:
[""] and resources: ["serviceaccounts"] so the combined rule lists verbs:
["get","list","request-serviceaccounts-token-audience"] (i.e., remove the
separate rule with resources: ["*"] and ensure only serviceaccounts retain the
custom verb).
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 549-565: The current handleQueueErr implementation requeues keys
after maxRetries by calling queue.Forget(key) then queue.AddAfter(key,
1*time.Minute), creating an endless retry loop; modify the logic in
Controller.handleQueueErr so that when queue.NumRequeues(key) >= maxRetries it
only calls queue.Forget(key) (and utilruntime.HandleError(err) / klog as
present) and does not call queue.AddAfter(key, 1*time.Minute) so the item is
dropped permanently instead of being requeued.
- Around line 1208-1228: The code only calls
syncCRIOCredentialProviderConfigStatusOnly(...,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationPartiallyApplied, ...) inside the "overlap"
branch, leaving a stale Validated=False when overlaps are later resolved; modify
the non-overlap path (the branch that proceeds to call syncIgnitionConfig and
returns nil/applied) to explicitly clear or set ConditionTypeValidated to a
success state (e.g., call syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationValidated, "validated") or similar) so
meta.SetStatusCondition replaces the previous failure—add this call just before
returning nil or after successful MachineConfig rendering in the code paths that
use syncIgnitionConfig and when applied is true, referencing
syncCRIOCredentialProviderConfigStatusOnly and ConditionTypeValidated to locate
the insertion point.
---
Nitpick comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 67-71: Replace the hardcoded credential provider path constants in
this file: instead of defining genericCredProviderConfigPath and
kubeletCrioImageCredProviderConfPath as string literals, build them from the
exported constants in pkg/daemon/constants: use
constants.KubernetesCredentialProvidersDir to construct the generic credential
provider config filename (generic-credential-provider.yaml) and use
constants.KubeletCrioImageCredProviderConfPath for the kubelet crio image
credential provider config; update the declarations where builtInLabelKey and
these two path constants are defined so the controller reuses
constants.KubernetesCredentialProvidersDir and
constants.KubeletCrioImageCredProviderConfPath rather than duplicating the
literal paths.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 049b026b-d651-43da-9a25-0d4750052dd9
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (5)
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
- test/framework/envtest.go
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
- What I did
Implement criocredentialprovierconfig that is used by
crio-credential-providerplugin to fetch private mirror image pull secrets from the secret object.The handler creates
97-pool-generated-credentialproviderconfigto rollout configurations to file/etc/kubernetes/credential-providers/[platform]-credential-provider.yaml.workflow: https://github.com/openshift/enhancements/blob/master/enhancements/api-review/criocredentialproviderconfig-for-namespace-scoped-mirror-authentication.md#workflow-description
- How to verify it
clusterCRIOCredentialProviderConfig resource, file updated with a new sectionname: crio-credential-providernamespace:mynamespacecontainers.imageis from mirrorsourceregistryjournalctl _COMM=crio-credentialon the scheduled node- Description for the changelog
Summary by CodeRabbit
New Features
Chores / Defaults
Security & Permissions
Tests