Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Device Plugin daemonset keep restarting #805

Conversation

armandpicard
Copy link

Define an ordering of the NodeSelectorRequirement in the daemonset to avoid the daemonset to be unstable when policies come in different order.

Fixes #804

Define an ordering of the NodeSelectorRequirement in the daemonset to
avoid the daemonset to be unstable when policies come in different
order.

Fixes k8snetworkplumbingwg#804
Copy link

github-actions bot commented Nov 8, 2024

Thanks for your PR,
To run vendors CIs, Maintainers can use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs, Maintainers can use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

controllers/helper.go Outdated Show resolved Hide resolved
@coveralls
Copy link

Pull Request Test Coverage Report for Build 11767732959

Details

  • 15 of 16 (93.75%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.09%) to 45.543%

Changes Missing Coverage Covered Lines Changed/Added Lines %
controllers/helper.go 15 16 93.75%
Totals Coverage Status
Change from base Build 11613650922: 0.09%
Covered Lines: 6811
Relevant Lines: 14955

💛 - Coveralls

@ykulazhenkov
Copy link
Collaborator

ykulazhenkov commented Nov 11, 2024

Hi, @armandpicard. Thanks for the PR.

I think these sorting [1,2] in the controllers code should keep the order stable without the changes from this PR.
[1] https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/controllers/sriovoperatorconfig_controller.go#L131

[2] https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/controllers/sriovnetworknodepolicy_controller.go#L127

Note: ByPriority uses Priority + Name of the CR

func (a ByPriority) Less(i, j int) bool {

The 91cf86c (the one that introduced [1]) was added pretty recently.

I think the issue with the unstable order of the selectors should not exist in the master branch.
Which version of the operator are you using?

@armandpicard
Copy link
Author

Hi @ykulazhenkov.
I'm not 100% sure of the exact version we use as in our case srio-network-operator is deployed by mellanox network-operator(24.7.0). By the look of the bindata in the container it is at least 8 mouth old so it doesn't have the fix.
I'll close the MR and the Issue as sriov-network-operator from master was able to get the daemonset stable with our SrioNetworkNodePolicy's

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Device Plugin daemonset keep restarting
4 participants