OCPNODE-3880: Add criocredentialproviderconfig event handler#5487
OCPNODE-3880: Add criocredentialproviderconfig event handler#5487QiWang19 wants to merge 1 commit intoopenshift:mainfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
/test all |
865fa47 to
8a072e4
Compare
|
/test all |
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
8a072e4 to
75dfbfd
Compare
|
/test all |
75dfbfd to
511fde6
Compare
|
/test all |
511fde6 to
0ff0fe8
Compare
|
/test all |
0ff0fe8 to
5d5008b
Compare
|
/test all |
5d5008b to
7d79743
Compare
147fbf2 to
3027d6a
Compare
|
/test all |
3027d6a to
9a477a8
Compare
|
/test all |
9a477a8 to
58fa6bc
Compare
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test all |
| }() | ||
|
|
||
| if !ctrl.criocpEnabled() || !ctrl.addedCRIOCPObservers { | ||
| klog.V(2).Infof("CRIOCredentialProviderConfig observer not added, skipping sync") |
There was a problem hiding this comment.
The owner reference does not set Controller: true, so GC won't cascade-delete the managed MCs. ContainerRuntimeConfig handles this with finalizers and cascadeDelete. Consider doing the same here: iterate pools, delete the 97-<pool>-generated-credentialproviderconfig MCs, then return.
There was a problem hiding this comment.
This controller only reconciles the singleton CRIOCredentialProviderConfig named cluster, and cluster objects are OpenShift managed and not user deletable, enforced in openshift-apiserver. Because there is no supported delete path for this CR, we do not have an orphaned MC in this case.
| sort.Strings(conflictList) | ||
|
|
||
| if len(matchImages) > 0 { | ||
| for img := range matchImages { |
There was a problem hiding this comment.
When all images overlap, matchImages is empty and the else branch re-adds every newImage unfiltered, contradicting the overlap status reported to the user.
| for img := range matchImages { | |
| if len(matchImages) > 0 { | |
| for img := range matchImages { | |
| images = append(images, img) | |
| } | |
| } else if len(overlappedEntries) == 0 { | |
| // No existing providers to conflict with (e.g. non-cloud platform) | |
| for _, img := range newImages { | |
| images = append(images, string(img)) | |
| } | |
| } |
| ) | ||
| } else { | ||
| condition = apihelpers.NewCondition( | ||
| apicfgv1alpha1.ConditionTypeMachineConfigRendered, |
There was a problem hiding this comment.
The partially-applied call site passes the Reason constant as args[0], which gets used as the fmt.Sprintf format string here. Since the Reason has no % verbs, the actual message and overlappedEntries are silently discarded. The condition Reason also stays ReasonMachineConfigRenderingSucceeded instead of ReasonConfigurationPartiallyApplied. Consider splitting Reason and Message into separate parameters.
89ae4bf to
0d2b089
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds CRIOCredentialProviderConfig support: cluster RBAC, controller observers/queues and status handling, rendering and Ignition generation (including kubelet drop-in and flag wiring), constants and default policies, tests, and CRD registration for the test environment. Changes
Sequence Diagram(s)sequenceDiagram
participant API as "K8s API\n(CRIOCredentialProviderConfig)"
participant Controller as "machine-config-controller\n(container-runtime-config)"
participant Renderer as "Ignition Generator\n(merge & drop-in)"
participant MCO as "MachineConfig / MCO"
participant Node as "Node kubelet\n(systemd)"
API->>Controller: watch/create/update CRIOCredentialProviderConfig
Controller->>Controller: enqueue & process work item
Controller->>Renderer: generate platform-aware credential provider config
Renderer->>Controller: return merged Ignition + drop-in unit
Controller->>MCO: create/update MachineConfig with rendered Ignition
MCO->>Node: apply MachineConfig (write files, add unit)
Node->>Node: systemctl daemon-reload && systemctl restart kubelet
Controller->>API: update CRIO CP status (ObservedGeneration, Conditions)
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (2)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
1168-1172:⚠️ Potential issue | 🟠 MajorDelete the managed MachineConfigs when the singleton CR disappears.
The
IsNotFoundpath returns immediately, so the existing97-<pool>-generated-credentialproviderconfigMachineConfigs survive CR deletion. That leaves stale kubelet credential-provider config on the pool until something else cleans it up.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 1168 - 1172, The IsNotFound branch for ctrl.criocpLister.Get("cluster") currently returns immediately and leaves generated MachineConfigs behind; instead, when the singleton CR is missing, enumerate and delete the managed MachineConfigs (those matching the generated name pattern "97-<pool>-generated-credentialproviderconfig") using the MachineConfig lister/client (e.g., ctrl.mcLister or ctrl.machineConfigClient/MachineConfigs().Delete), handling NotFound errors gracefully and logging failures; update the errors.IsNotFound branch in the function around ctrl.criocpLister.Get("cluster") to perform this cleanup before returning.pkg/controller/container-runtime-config/helpers.go (1)
1354-1366:⚠️ Potential issue | 🟠 MajorPrepend
crio-credential-providerinstead of appending it.Provider precedence is order-sensitive here. Appending the new entry makes the CRI-O plugin lowest priority, which can change which provider wins for nested or overlapping matches.
Suggested fix
- credProviderConfigObject.Providers = append(credProviderConfigObject.Providers, newProvider) + credProviderConfigObject.Providers = append( + []*credentialProviderWithTag{newProvider}, + credProviderConfigObject.Providers..., + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/helpers.go` around lines 1354 - 1366, The credential provider for CRI-O is being appended which makes crio-credential-provider lowest priority; change the logic that adds the new credentialProviderWithTag (crioCredentialProviderName) so it is prepended to credProviderConfigObject.Providers (i.e., insert newProvider at index 0) rather than using append, preserving all existing providers after it to ensure correct precedence for matching.
🧹 Nitpick comments (1)
install/0000_80_machine-config_00_rbac.yaml (1)
149-178: Consider addingibm-cloud-managedannotation for consistency.Other RBAC resources in this file include the
include.release.openshift.io/ibm-cloud-managed: "true"annotation. If the CRIO credential provider feature should be available on IBM Cloud Managed deployments, this annotation should be added to both the ClusterRole and ClusterRoleBinding.Proposed fix to add ibm-cloud-managed annotation
apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: node-credential-providers annotations: + include.release.openshift.io/ibm-cloud-managed: "true" include.release.openshift.io/self-managed-high-availability: "true" include.release.openshift.io/single-node-developer: "true"apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: node-credential-providers-binding annotations: + include.release.openshift.io/ibm-cloud-managed: "true" include.release.openshift.io/self-managed-high-availability: "true" include.release.openshift.io/single-node-developer: "true"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 149 - 178, Add the missing IBM Cloud managed annotation to the ClusterRole and ClusterRoleBinding by inserting include.release.openshift.io/ibm-cloud-managed: "true" under metadata.annotations for the ClusterRole named node-credential-providers and the ClusterRoleBinding named node-credential-providers-binding so their annotations match other RBAC entries and ensure the CRIO credential provider is marked for IBM Cloud Managed deployments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 156-162: The RBAC rule grants the custom verb
request-serviceaccounts-token-audience to resources: ["*"], which is too broad;
update the rules for serviceaccounts so the
request-serviceaccounts-token-audience verb is only applied to resources:
["serviceaccounts"] (consolidate the two serviceaccount rules so the
"serviceaccounts" resource has verbs
"get","list","request-serviceaccounts-token-audience")—locate the rules block
containing resources: ["*"] and the serviceaccounts rule and adjust accordingly.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 651-654: The current verifyCRIOCredentialProviderConfigContents
function short-circuits when criocp.Spec.MatchImages is empty and returns before
running the empty-spec assertions; change the logic so that when
len(criocp.Spec.MatchImages) == 0 you call the empty-MC assertion
(expectMCNilContent or equivalent) and still run any drop-in/file checks from
the verifyOpts (criocpConfigVerifyOptions) before returning; in short, do not
simply return immediately—invoke expectMCNilContent(mcName) and execute the
drop-in verification paths in verifyOpts when MatchImages is empty, then return.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-566: The current logic calls queue.Forget(key) before
queue.AddAfter(...), which prevents honoring maxRetries and causes permanent
1-minute resyncs; change the retry handling in this error branch to check the
retry count (use queue.NumRequeues(key) or the controller's maxRetries variable)
and only call queue.AddAfter or queue.AddRateLimited when retries < maxRetries,
otherwise call queue.Forget(key) and stop requeuing; update the block around
utilruntime.HandleError(err), klog.V(2).Infof(...), queue.Forget, and
queue.AddAfter to implement this conditional retry logic using the existing
maxRetries, resourceType, key, and err symbols.
- Around line 1235-1249: The code fetches an informer-cached object via
ctrl.criocpLister.Get("cluster") and then mutates it in-place
(meta.SetStatusCondition) which can corrupt the shared cache; fix by
deep-copying the returned object immediately (e.g., call DeepCopy on the result
of criocpLister.Get inside the RetryOnConflict closure), perform the
ObservedGeneration update and meta.SetStatusCondition on the copy, and pass that
copy to
ctrl.configClient.ConfigV1alpha1().CRIOCredentialProviderConfigs().UpdateStatus;
keep the use of wrapErrorWithCRIOCredentialProviderConfigCondition,
retry.RetryOnConflict, and UpdateStatus but ensure mutations occur only on the
DeepCopy.
In `@pkg/controller/container-runtime-config/helpers.go`:
- Around line 1309-1327: The current loop resets skip per provider causing an
image to be considered non-conflicting if it doesn't overlap a later provider;
change the logic to detect conflicts per image across all providers by iterating
newImages outermost and then checking every provider in
credProviderConfigObject.Providers (skipping crioCredentialProviderName) and
using runtimeutils.ScopeIsNestedInsideScope to mark overlappedEntries[img]=true
and set skip=true then break out of the provider loop; only after checking all
providers (or breaking on first overlap) add img to matchImages if skip is
false. Ensure you reference and update the same variables: newImages,
matchImages, overlappedEntries and preserve skipping of
crioCredentialProviderName.
In `@test/framework/envtest.go`:
- Around line 119-121: The shared envtest helpers are not handling the
config.openshift.io/v1alpha1 CRIOCredentialProviderConfig type; update
NewTestEnv and the helper functions CheckCleanEnvironment, CleanEnvironment, and
CreateObjects to include CRIOCredentialProviderConfig in the CRD/object lists
they install, delete, and create. Specifically, add the CRD file entries (the
three zz_generated.crd-manifests paths) to the set NewTestEnv installs, ensure
CheckCleanEnvironment includes
config.openshift.io/v1alpha1.CRIOCredentialProviderConfig when scanning for
leftover resources, make CleanEnvironment delete any
CRIOCredentialProviderConfig instances it finds, and allow CreateObjects to
accept/seed CRIOCredentialProviderConfig objects so tests can create them via
the common helper.
---
Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1168-1172: The IsNotFound branch for
ctrl.criocpLister.Get("cluster") currently returns immediately and leaves
generated MachineConfigs behind; instead, when the singleton CR is missing,
enumerate and delete the managed MachineConfigs (those matching the generated
name pattern "97-<pool>-generated-credentialproviderconfig") using the
MachineConfig lister/client (e.g., ctrl.mcLister or
ctrl.machineConfigClient/MachineConfigs().Delete), handling NotFound errors
gracefully and logging failures; update the errors.IsNotFound branch in the
function around ctrl.criocpLister.Get("cluster") to perform this cleanup before
returning.
In `@pkg/controller/container-runtime-config/helpers.go`:
- Around line 1354-1366: The credential provider for CRI-O is being appended
which makes crio-credential-provider lowest priority; change the logic that adds
the new credentialProviderWithTag (crioCredentialProviderName) so it is
prepended to credProviderConfigObject.Providers (i.e., insert newProvider at
index 0) rather than using append, preserving all existing providers after it to
ensure correct precedence for matching.
---
Nitpick comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 149-178: Add the missing IBM Cloud managed annotation to the
ClusterRole and ClusterRoleBinding by inserting
include.release.openshift.io/ibm-cloud-managed: "true" under
metadata.annotations for the ClusterRole named node-credential-providers and the
ClusterRoleBinding named node-credential-providers-binding so their annotations
match other RBAC entries and ensure the CRIO credential provider is marked for
IBM Cloud Managed deployments.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e40c7900-9f33-4929-8e5c-f255d95f064f
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
| rules: | ||
| - apiGroups: [""] | ||
| resources: ["serviceaccounts"] | ||
| verbs: ["get", "list"] | ||
| - apiGroups: [""] | ||
| resources: ["*"] | ||
| verbs: ["request-serviceaccounts-token-audience"] |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
OpenShift request-serviceaccounts-token-audience RBAC verb
💡 Result:
In OpenShift/Kubernetes RBAC, request-serviceaccounts-token-audience is itself the RBAC verb string (it’s a non-standard/custom verb) that you place under rules[].verbs—typically alongside get / list on the serviceaccounts resource. Examples in the wild grant it like this: verbs: ["request-serviceaccounts-token-audience"] (often with get,list) on resources: ["serviceaccounts"]. [1], [2]
If what you actually need is permission to call the upstream TokenRequest endpoint (.../serviceaccounts/<name>/token), the RBAC permission is the standard Kubernetes pattern: verb create on the token subresource (resource: serviceaccounts/token). [3]
Example ClusterRole rule (custom verb):
rules:
- apiGroups: [""]
resources: ["serviceaccounts"]
verbs: ["get","list","request-serviceaccounts-token-audience"]Example ClusterRole rule (TokenRequest API):
rules:
- apiGroups: [""]
resources: ["serviceaccounts/token"]
verbs: ["create"]Sources: [1] [2] [3]
Fix overly broad resource scope for request-serviceaccounts-token-audience verb.
The verb request-serviceaccounts-token-audience is a valid OpenShift custom RBAC verb, but it's currently applied to resources: ["*"], granting it on all resources. According to OpenShift patterns, this verb should be restricted to resources: ["serviceaccounts"] only. The second rule should be:
Suggested fix
- apiGroups: [""]
resources: ["serviceaccounts"]
verbs: ["request-serviceaccounts-token-audience"]This consolidates both rules into a single rule on the serviceaccounts resource.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@install/0000_80_machine-config_00_rbac.yaml` around lines 156 - 162, The RBAC
rule grants the custom verb request-serviceaccounts-token-audience to resources:
["*"], which is too broad; update the rules for serviceaccounts so the
request-serviceaccounts-token-audience verb is only applied to resources:
["serviceaccounts"] (consolidate the two serviceaccount rules so the
"serviceaccounts" resource has verbs
"get","list","request-serviceaccounts-token-audience")—locate the rules block
containing resources: ["*"] and the serviceaccounts rule and adjust accordingly.
| func (f *fixture) verifyCRIOCredentialProviderConfigContents(t *testing.T, mcName string, criocp *apicfgv1alpha1.CRIOCredentialProviderConfig, verifyOpts criocpConfigVerifyOptions) { | ||
| if len(criocp.Spec.MatchImages) == 0 { | ||
| return | ||
| } |
There was a problem hiding this comment.
Don't short-circuit the empty-spec assertions.
This returns before expectMCNilContent or any drop-in checks run, so TestCrioCredentialProviderConfigCreateEmpty currently passes without validating the generated MachineConfig at all.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`
around lines 651 - 654, The current verifyCRIOCredentialProviderConfigContents
function short-circuits when criocp.Spec.MatchImages is empty and returns before
running the empty-spec assertions; change the logic so that when
len(criocp.Spec.MatchImages) == 0 you call the empty-MC assertion
(expectMCNilContent or equivalent) and still run any drop-in/file checks from
the verifyOpts (criocpConfigVerifyOptions) before returning; in short, do not
simply return immediately—invoke expectMCNilContent(mcName) and execute the
drop-in verification paths in verifyOpts when MatchImages is empty, then return.
pkg/controller/container-runtime-config/container_runtime_config_controller.go
Show resolved
Hide resolved
pkg/controller/container-runtime-config/container_runtime_config_controller.go
Show resolved
Hide resolved
| filepath.Join("..", "..", "vendor", "github.com", "openshift", "api", "config", "v1alpha1", "zz_generated.crd-manifests", "0000_10_config-operator_01_criocredentialproviderconfigs-CustomNoUpgrade.crd.yaml"), | ||
| filepath.Join("..", "..", "vendor", "github.com", "openshift", "api", "config", "v1alpha1", "zz_generated.crd-manifests", "0000_10_config-operator_01_criocredentialproviderconfigs-DevPreviewNoUpgrade.crd.yaml"), | ||
| filepath.Join("..", "..", "vendor", "github.com", "openshift", "api", "config", "v1alpha1", "zz_generated.crd-manifests", "0000_10_config-operator_01_criocredentialproviderconfigs-TechPreviewNoUpgrade.crd.yaml"), |
There was a problem hiding this comment.
Wire CRIOCredentialProviderConfig through the shared envtest helpers.
NewTestEnv now installs this CRD, but CheckCleanEnvironment, CleanEnvironment, and CreateObjects still don't list/delete/create config.openshift.io/v1alpha1.CRIOCredentialProviderConfig. Any envtest that exercises this type can leak state across cases and can't seed the object through the common helper.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/framework/envtest.go` around lines 119 - 121, The shared envtest helpers
are not handling the config.openshift.io/v1alpha1 CRIOCredentialProviderConfig
type; update NewTestEnv and the helper functions CheckCleanEnvironment,
CleanEnvironment, and CreateObjects to include CRIOCredentialProviderConfig in
the CRD/object lists they install, delete, and create. Specifically, add the CRD
file entries (the three zz_generated.crd-manifests paths) to the set NewTestEnv
installs, ensure CheckCleanEnvironment includes
config.openshift.io/v1alpha1.CRIOCredentialProviderConfig when scanning for
leftover resources, make CleanEnvironment delete any
CRIOCredentialProviderConfig instances it finds, and allow CreateObjects to
accept/seed CRIOCredentialProviderConfig objects so tests can create them via
the common helper.
0d2b089 to
9b97140
Compare
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
install/0000_80_machine-config_00_rbac.yaml (1)
160-162:⚠️ Potential issue | 🟠 MajorScope
request-serviceaccounts-token-audiencetoserviceaccountsonly.This rule still grants the custom verb across
resources: ["*"], which is broader than the credential-provider flow needs. Please fold it into the existingserviceaccountsrule instead of granting it cluster-wide on every core resource.Suggested fix
- apiGroups: [""] resources: ["serviceaccounts"] - verbs: ["get", "list"] - - apiGroups: [""] - resources: ["*"] - verbs: ["request-serviceaccounts-token-audience"] + verbs: ["get", "list", "request-serviceaccounts-token-audience"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 160 - 162, The manifest currently grants the custom verb "request-serviceaccounts-token-audience" to resources: ["*"] under apiGroups: [""], which is too broad; remove that standalone rule and instead add "request-serviceaccounts-token-audience" to the existing serviceaccounts rule (the rule that has apiGroups: [""], resources: ["serviceaccounts"] or resources: ["serviceaccounts", ...]) so the verb is scoped only to serviceaccounts; ensure you remove the resources: ["*"] entry and update the serviceaccounts rule's verbs array to include "request-serviceaccounts-token-audience" without duplicating other verbs.pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
562-565:⚠️ Potential issue | 🟠 Major
maxRetriesstill never becomes terminal here.After
Forget(key), re-adding the item withAddAfterturns persistent failures into an endless 1-minute retry loop for every queue using this helper. That defeats the contract implied bymaxRetries.Suggested fix
utilruntime.HandleError(err) klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err) queue.Forget(key) - queue.AddAfter(key, 1*time.Minute)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The code calls queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute) which resets retry state and defeats the intended maxRetries behavior; instead, check the retry count (use queue.NumRequeues(key) or the workqueue.RateLimitingInterface helper) against maxRetries and only re-enqueue with AddAfter/AddRateLimited when the count is below maxRetries, otherwise call queue.Forget(key) and stop requeuing (log terminal failure). Update the error-handling block around utilruntime.HandleError and klog.V(2).Infof to implement this conditional requeue logic using the existing maxRetries variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1224-1228: The status update is only executed when the reconcile
actually applied changes (applied == true), so no-op reconciles don't advance
CRIOCredentialProviderConfig status; move or add a call to
syncCRIOCredentialProviderConfigStatusOnly(...) so it runs even when applied is
false (i.e., after the applied conditional), ensuring you invoke
syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeMachineConfigRendered,
apicfgv1alpha1.ReasonMachineConfigRenderingSucceeded) on successful
reconciliation/no-op so ObservedGeneration and the rendered condition are
advanced regardless of whether MachineConfigs were modified.
---
Duplicate comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 160-162: The manifest currently grants the custom verb
"request-serviceaccounts-token-audience" to resources: ["*"] under apiGroups:
[""], which is too broad; remove that standalone rule and instead add
"request-serviceaccounts-token-audience" to the existing serviceaccounts rule
(the rule that has apiGroups: [""], resources: ["serviceaccounts"] or resources:
["serviceaccounts", ...]) so the verb is scoped only to serviceaccounts; ensure
you remove the resources: ["*"] entry and update the serviceaccounts rule's
verbs array to include "request-serviceaccounts-token-audience" without
duplicating other verbs.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The code calls queue.Forget(key) and then
queue.AddAfter(key, 1*time.Minute) which resets retry state and defeats the
intended maxRetries behavior; instead, check the retry count (use
queue.NumRequeues(key) or the workqueue.RateLimitingInterface helper) against
maxRetries and only re-enqueue with AddAfter/AddRateLimited when the count is
below maxRetries, otherwise call queue.Forget(key) and stop requeuing (log
terminal failure). Update the error-handling block around
utilruntime.HandleError and klog.V(2).Infof to implement this conditional
requeue logic using the existing maxRetries variable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5e364b1d-5e56-4d04-922d-2361caed86f3
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (6)
- manifests/machineconfigcontroller/clusterrole.yaml
- pkg/controller/container-runtime-config/helpers_test.go
- pkg/daemon/constants/constants.go
- test/framework/envtest.go
- pkg/apihelpers/apihelpers.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
pkg/controller/container-runtime-config/container_runtime_config_controller.go
Show resolved
Hide resolved
9b97140 to
dc39060
Compare
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
562-565:⚠️ Potential issue | 🟠 MajorDon't requeue from the terminal retry path.
After
maxRetries, this still doesForgetand thenAddAfter, so broken objects never actually drop out of the queue and will retry forever every minute.Suggested fix
utilruntime.HandleError(err) klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err) queue.Forget(key) - queue.AddAfter(key, 1*time.Minute)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The terminal retry path currently calls utilruntime.HandleError(err), logs via klog.V(2).Infof, then calls queue.Forget(key) and queue.AddAfter(key, 1*time.Minute) causing infinite requeues; update the terminal path so after reaching maxRetries you call utilruntime.HandleError(err) and queue.Forget(key) (or just Forget and return) but do NOT call queue.AddAfter; locate the logic around maxRetries handling in container_runtime_config_controller.go where queue.Forget and queue.AddAfter are invoked and remove the AddAfter call so failed objects are not requeued forever.pkg/controller/container-runtime-config/container_runtime_config_controller_test.go (1)
659-664:⚠️ Potential issue | 🟡 MinorDon't short-circuit the empty-spec verification path.
This still returns before any later assertions run, so
TestCrioCredentialProviderConfigCreateEmptycan pass while silently masking mismatches incriocpConfigVerifyOptionsfor the non-cloud path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go` around lines 659 - 664, The test currently returns early when verifyOpts.expectMCNilContent is true, which skips later assertions and can mask mismatches for TestCrioCredentialProviderConfigCreateEmpty; change the block in the test that checks verifyOpts.expectMCNilContent so it only asserts that ignCfg.Storage.Files is empty (when expectMCNilContent is true) and do not return from the test—remove the early return (or scope it so only the specific check is skipped) so the remaining criocpConfigVerifyOptions-related assertions still execute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 685-692: The current loop sets allMatch to true when any
criocp.Spec.MatchImages entry is found in provider.MatchImages, so a partial
match passes; change the logic so allMatch reflects that every expected image is
present: initialize allMatch = true before the loop over
criocp.Spec.MatchImages, and for each mi check assert.Contains(t,
provider.MatchImages, string(mi)); if the check fails set allMatch = false and
break (or return) immediately. Use the same symbols (allMatch,
criocp.Spec.MatchImages, provider.MatchImages) so the test correctly requires
every expected image to be present.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1208-1210: When handling CRIOCredentialProviderConfig overlaps in
syncCRIOCredentialProviderConfigStatusOnly, also clear the stale
ConditionTypeValidated/ReasonConfigurationPartiallyApplied when overlaps are
resolved: if len(overlappedEntries) == 0 call
syncCRIOCredentialProviderConfigStatusOnly with a success/validated condition
(e.g., apicfgv1alpha1.ConditionTypeValidated, ReasonConfigurationApplied or
similar) and a message indicating rendering succeeded so the previous
ConfigurationPartiallyApplied condition is removed; ensure you update the same
condition type (ConditionTypeValidated) rather than only setting it when
overlaps exist.
In `@pkg/controller/container-runtime-config/helpers.go`:
- Line 1383: The parameter name in function generateDropinUnitCredProviderConfig
is misspelled as generticCredProviderConfigPath; rename it to
genericCredProviderConfigPath in the function signature and update all internal
references and callers to use the corrected identifier so the symbol matches
expected spelling (ensure generateDropinUnitCredProviderConfig and any places
that call it compile after the rename).
---
Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 659-664: The test currently returns early when
verifyOpts.expectMCNilContent is true, which skips later assertions and can mask
mismatches for TestCrioCredentialProviderConfigCreateEmpty; change the block in
the test that checks verifyOpts.expectMCNilContent so it only asserts that
ignCfg.Storage.Files is empty (when expectMCNilContent is true) and do not
return from the test—remove the early return (or scope it so only the specific
check is skipped) so the remaining criocpConfigVerifyOptions-related assertions
still execute.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The terminal retry path currently calls
utilruntime.HandleError(err), logs via klog.V(2).Infof, then calls
queue.Forget(key) and queue.AddAfter(key, 1*time.Minute) causing infinite
requeues; update the terminal path so after reaching maxRetries you call
utilruntime.HandleError(err) and queue.Forget(key) (or just Forget and return)
but do NOT call queue.AddAfter; locate the logic around maxRetries handling in
container_runtime_config_controller.go where queue.Forget and queue.AddAfter are
invoked and remove the AddAfter call so failed objects are not requeued forever.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b802971-f882-404a-af1a-cf29785423f4
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (8)
- pkg/daemon/constants/constants.go
- templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
- install/0000_80_machine-config_00_rbac.yaml
- test/framework/envtest.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
Outdated
Show resolved
Hide resolved
pkg/controller/container-runtime-config/container_runtime_config_controller.go
Show resolved
Hide resolved
dc39060 to
d1eef7d
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (3)
install/0000_80_machine-config_00_rbac.yaml (1)
156-162:⚠️ Potential issue | 🟠 MajorScope
request-serviceaccounts-token-audiencetoserviceaccountsonly.Granting this custom verb on
resources: ["*"]givessystem:nodesthat capability across every core resource. The new credential-provider flow only needs it onserviceaccounts, so this should be merged into the preceding rule instead of expanding the resource scope cluster-wide.Suggested fix
rules: - apiGroups: [""] resources: ["serviceaccounts"] - verbs: ["get", "list"] - - apiGroups: [""] - resources: ["*"] - verbs: ["request-serviceaccounts-token-audience"] + verbs: ["get", "list", "request-serviceaccounts-token-audience"]In OpenShift RBAC, should the custom verb `request-serviceaccounts-token-audience` be granted on `resources: ["serviceaccounts"]` rather than `resources: ["*"]`?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 156 - 162, The second rule grants the custom verb "request-serviceaccounts-token-audience" across all core resources; narrow its scope by moving that verb into the prior rule that already targets apiGroups: [""] and resources: ["serviceaccounts"] so the combined rule lists verbs: ["get","list","request-serviceaccounts-token-audience"] (i.e., remove the separate rule with resources: ["*"] and ensure only serviceaccounts retain the custom verb).pkg/controller/container-runtime-config/container_runtime_config_controller.go (2)
549-565:⚠️ Potential issue | 🟠 MajorStop requeueing once
maxRetriesis exhausted.
queue.Forget(key)followed byqueue.AddAfter(key, 1*time.Minute)turns permanent failures into an endless 1-minute retry loop, so invalid objects never actually drop out of the queue despite the controller contract above.Suggested fix
utilruntime.HandleError(err) klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err) queue.Forget(key) - queue.AddAfter(key, 1*time.Minute)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 549 - 565, The current handleQueueErr implementation requeues keys after maxRetries by calling queue.Forget(key) then queue.AddAfter(key, 1*time.Minute), creating an endless retry loop; modify the logic in Controller.handleQueueErr so that when queue.NumRequeues(key) >= maxRetries it only calls queue.Forget(key) (and utilruntime.HandleError(err) / klog as present) and does not call queue.AddAfter(key, 1*time.Minute) so the item is dropped permanently instead of being requeued.
1208-1228:⚠️ Potential issue | 🟠 MajorClear the stale
Validated=Falsecondition when overlaps disappear.The validated condition is only updated inside the overlap branch. If a user fixes the conflicting
matchImageslater, this reconcile only writesMachineConfigRendered=True, so the oldConditionTypeValidated/ partial-applied status remains stuck on the CR. UpdateConditionTypeValidatedon the non-overlap path too someta.SetStatusConditioncan replace the stale failure.Suggested fix
if len(overlappedEntries) > 0 { ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, "CRIOCredentialProviderConfig has one or multiple entries that overlap with the original credential provider config. Skip rendering entries: %v.", overlappedEntries) + } else { + ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, "") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 1208 - 1228, The code only calls syncCRIOCredentialProviderConfigStatusOnly(..., apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, ...) inside the "overlap" branch, leaving a stale Validated=False when overlaps are later resolved; modify the non-overlap path (the branch that proceeds to call syncIgnitionConfig and returns nil/applied) to explicitly clear or set ConditionTypeValidated to a success state (e.g., call syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationValidated, "validated") or similar) so meta.SetStatusCondition replaces the previous failure—add this call just before returning nil or after successful MachineConfig rendering in the code paths that use syncIgnitionConfig and when applied is true, referencing syncCRIOCredentialProviderConfigStatusOnly and ConditionTypeValidated to locate the insertion point.
🧹 Nitpick comments (1)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
67-71: Reuse the exported credential-provider path constants here.These literals now duplicate
pkg/daemon/constants/constants.go, so a later path change can silently desync controller rendering from the daemon-side policy wiring. Please build these fromconstants.KubernetesCredentialProvidersDirandconstants.KubeletCrioImageCredProviderConfPathinstead of hardcoding them again.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 67 - 71, Replace the hardcoded credential provider path constants in this file: instead of defining genericCredProviderConfigPath and kubeletCrioImageCredProviderConfPath as string literals, build them from the exported constants in pkg/daemon/constants: use constants.KubernetesCredentialProvidersDir to construct the generic credential provider config filename (generic-credential-provider.yaml) and use constants.KubeletCrioImageCredProviderConfPath for the kubelet crio image credential provider config; update the declarations where builtInLabelKey and these two path constants are defined so the controller reuses constants.KubernetesCredentialProvidersDir and constants.KubeletCrioImageCredProviderConfPath rather than duplicating the literal paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@install/0000_80_machine-config_00_rbac.yaml`:
- Around line 156-162: The second rule grants the custom verb
"request-serviceaccounts-token-audience" across all core resources; narrow its
scope by moving that verb into the prior rule that already targets apiGroups:
[""] and resources: ["serviceaccounts"] so the combined rule lists verbs:
["get","list","request-serviceaccounts-token-audience"] (i.e., remove the
separate rule with resources: ["*"] and ensure only serviceaccounts retain the
custom verb).
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 549-565: The current handleQueueErr implementation requeues keys
after maxRetries by calling queue.Forget(key) then queue.AddAfter(key,
1*time.Minute), creating an endless retry loop; modify the logic in
Controller.handleQueueErr so that when queue.NumRequeues(key) >= maxRetries it
only calls queue.Forget(key) (and utilruntime.HandleError(err) / klog as
present) and does not call queue.AddAfter(key, 1*time.Minute) so the item is
dropped permanently instead of being requeued.
- Around line 1208-1228: The code only calls
syncCRIOCredentialProviderConfigStatusOnly(...,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationPartiallyApplied, ...) inside the "overlap"
branch, leaving a stale Validated=False when overlaps are later resolved; modify
the non-overlap path (the branch that proceeds to call syncIgnitionConfig and
returns nil/applied) to explicitly clear or set ConditionTypeValidated to a
success state (e.g., call syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated,
apicfgv1alpha1.ReasonConfigurationValidated, "validated") or similar) so
meta.SetStatusCondition replaces the previous failure—add this call just before
returning nil or after successful MachineConfig rendering in the code paths that
use syncIgnitionConfig and when applied is true, referencing
syncCRIOCredentialProviderConfigStatusOnly and ConditionTypeValidated to locate
the insertion point.
---
Nitpick comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 67-71: Replace the hardcoded credential provider path constants in
this file: instead of defining genericCredProviderConfigPath and
kubeletCrioImageCredProviderConfPath as string literals, build them from the
exported constants in pkg/daemon/constants: use
constants.KubernetesCredentialProvidersDir to construct the generic credential
provider config filename (generic-credential-provider.yaml) and use
constants.KubeletCrioImageCredProviderConfPath for the kubelet crio image
credential provider config; update the declarations where builtInLabelKey and
these two path constants are defined so the controller reuses
constants.KubernetesCredentialProvidersDir and
constants.KubeletCrioImageCredProviderConfPath rather than duplicating the
literal paths.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 049b026b-d651-43da-9a25-0d4750052dd9
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (5)
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
- test/framework/envtest.go
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
|
/test verify |
d1eef7d to
5d64533
Compare
|
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
♻️ Duplicate comments (4)
test/framework/envtest.go (1)
119-121:⚠️ Potential issue | 🟡 MinorAlso wire this CRD through the shared envtest helpers.
NewTestEnvnow installsCRIOCredentialProviderConfig, butCheckCleanEnvironment,CleanEnvironment, andCreateObjectsstill don't list/delete/create that type. Any envtest using the common helpers can leak these objects across cases and can't seed them through the shared setup path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/framework/envtest.go` around lines 119 - 121, NewTestEnv installs CRIOCredentialProviderConfig but the shared helpers CheckCleanEnvironment, CleanEnvironment, and CreateObjects don't handle that type; update those helpers to include CRIOCredentialProviderConfig in their lists/operations so envtest correctly creates, verifies absence, and deletes these objects. Specifically, add the CRIOCredentialProviderConfig GVK/Kind into the type lists or switch cases used by CreateObjects, ensure CheckCleanEnvironment also checks for zero instances of CRIOCredentialProviderConfig, and have CleanEnvironment remove any CRIOCredentialProviderConfig resources found so tests don't leak them between cases.pkg/controller/container-runtime-config/container_runtime_config_controller_test.go (1)
658-663:⚠️ Potential issue | 🟡 MinorDon't short-circuit the empty-spec verification path.
expectMCNilContentcurrently means “no files at all” and returns before the drop-in assertions run. That makesTestCrioCredentialProviderConfigCreateEmptyskip validation of the non-cloud drop-in case entirely, and it would also reject a valid drop-in-only MachineConfig.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go` around lines 658 - 663, The test short-circuits when verifyOpts.expectMCNilContent is true, returning immediately after checking ignCfg.Storage.Files and thus skipping the subsequent drop-in assertions (affecting TestCrioCredentialProviderConfigCreateEmpty and similar cases); change the logic in the verification block so that you still assert len(ignCfg.Storage.Files) == 0 when verifyOpts.expectMCNilContent is true but do not return—allow execution to continue to the drop-in assertions (the code referencing verifyOpts.expectMCNilContent and ignCfg.Storage.Files should be updated to remove the early return and only perform the emptiness check).pkg/controller/container-runtime-config/container_runtime_config_controller.go (2)
562-565:⚠️ Potential issue | 🟠 MajorDon't requeue from the terminal branch.
queue.Forget(key)followed byqueue.AddAfter(key, 1*time.Minute)makesmaxRetriesnon-terminal again. Persistent failures will keep coming back forever instead of actually dropping out of the queue.Suggested fix
utilruntime.HandleError(err) klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err) queue.Forget(key) - queue.AddAfter(key, 1*time.Minute) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The terminal/error branch currently calls queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute), which prevents retries from becoming terminal; remove the requeue call so terminal errors are dropped: in the block handling utilruntime.HandleError(err) and logging via klog.V(2).Infof, keep queue.Forget(key) (and return if applicable) but remove queue.AddAfter(key, 1*time.Minute) so the item truly stops being requeued.
1208-1228:⚠️ Potential issue | 🟠 MajorClear
ConditionTypeValidatedafter conflicts are fixed.This only writes the validated condition on the overlap path. Once the conflicting
matchImagesare removed, the oldValidated=False/ConfigurationPartiallyAppliedcondition is left behind even though the next reconcile succeeds. Set the same condition type back to a success state in the no-overlap path as well.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 1208 - 1228, The validated condition is only set when overlaps exist; after overlaps are resolved you must clear/reset that ConditionTypeValidated to success so old Validated=False / ConfigurationPartiallyApplied isn't left behind. In the code path around overlappedEntries and before calling ctrl.syncIgnitionConfig (referencing overlappedEntries, ctrl.syncCRIOCredentialProviderConfigStatusOnly and syncIgnitionConfig), add a call to ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, <appropriate success reason constant e.g. apicfgv1alpha1.ReasonConfigurationValidated or a matching success reason>, "CRIOCredentialProviderConfig validated and no overlapping entries") so the validated condition is marked successful when no overlaps are present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 658-663: The test short-circuits when
verifyOpts.expectMCNilContent is true, returning immediately after checking
ignCfg.Storage.Files and thus skipping the subsequent drop-in assertions
(affecting TestCrioCredentialProviderConfigCreateEmpty and similar cases);
change the logic in the verification block so that you still assert
len(ignCfg.Storage.Files) == 0 when verifyOpts.expectMCNilContent is true but do
not return—allow execution to continue to the drop-in assertions (the code
referencing verifyOpts.expectMCNilContent and ignCfg.Storage.Files should be
updated to remove the early return and only perform the emptiness check).
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The terminal/error branch currently calls
queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute), which prevents
retries from becoming terminal; remove the requeue call so terminal errors are
dropped: in the block handling utilruntime.HandleError(err) and logging via
klog.V(2).Infof, keep queue.Forget(key) (and return if applicable) but remove
queue.AddAfter(key, 1*time.Minute) so the item truly stops being requeued.
- Around line 1208-1228: The validated condition is only set when overlaps
exist; after overlaps are resolved you must clear/reset that
ConditionTypeValidated to success so old Validated=False /
ConfigurationPartiallyApplied isn't left behind. In the code path around
overlappedEntries and before calling ctrl.syncIgnitionConfig (referencing
overlappedEntries, ctrl.syncCRIOCredentialProviderConfigStatusOnly and
syncIgnitionConfig), add a call to
ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated, <appropriate success reason constant e.g.
apicfgv1alpha1.ReasonConfigurationValidated or a matching success reason>,
"CRIOCredentialProviderConfig validated and no overlapping entries") so the
validated condition is marked successful when no overlaps are present.
In `@test/framework/envtest.go`:
- Around line 119-121: NewTestEnv installs CRIOCredentialProviderConfig but the
shared helpers CheckCleanEnvironment, CleanEnvironment, and CreateObjects don't
handle that type; update those helpers to include CRIOCredentialProviderConfig
in their lists/operations so envtest correctly creates, verifies absence, and
deletes these objects. Specifically, add the CRIOCredentialProviderConfig
GVK/Kind into the type lists or switch cases used by CreateObjects, ensure
CheckCleanEnvironment also checks for zero instances of
CRIOCredentialProviderConfig, and have CleanEnvironment remove any
CRIOCredentialProviderConfig resources found so tests don't leak them between
cases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 41d65ca0-f596-478f-a2ac-0897802b2836
📒 Files selected for processing (13)
install/0000_80_machine-config_00_rbac.yamlmanifests/machineconfigcontroller/clusterrole.yamlpkg/apihelpers/apihelpers.gopkg/controller/container-runtime-config/container_runtime_config_controller.gopkg/controller/container-runtime-config/container_runtime_config_controller_test.gopkg/controller/container-runtime-config/helpers.gopkg/controller/container-runtime-config/helpers_test.gopkg/daemon/constants/constants.gotemplates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yamltemplates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yamltemplates/master/01-master-kubelet/on-prem/units/kubelet.service.yamltemplates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yamltest/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (5)
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
- install/0000_80_machine-config_00_rbac.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- manifests/machineconfigcontroller/clusterrole.yaml
5d64533 to
137ce18
Compare
|
@QiWang19: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/verified by @QiWang19 |
|
@QiWang19: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Signed-off-by: Qi Wang <qiwan@redhat.com>
137ce18 to
c8f5763
Compare
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: haircommander, QiWang19 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
- What I did
Implement criocredentialprovierconfig that is used by
crio-credential-providerplugin to fetch private mirror image pull secrets from the secret object.The handler creates
97-pool-generated-credentialproviderconfigto rollout configurations to file/etc/kubernetes/credential-providers/[platform]-credential-provider.yaml.workflow: https://github.com/openshift/enhancements/blob/master/enhancements/api-review/criocredentialproviderconfig-for-namespace-scoped-mirror-authentication.md#workflow-description
- How to verify it
clusterCRIOCredentialProviderConfig resource, file updated with a new sectionname: crio-credential-providernamespace:mynamespacecontainers.imageis from mirrorsourceregistryjournalctl _COMM=crio-credentialon the scheduled node- Description for the changelog
Summary by CodeRabbit
New Features
Chores / Defaults
Security & Permissions
Tests