From 24bef3299096efc89be155d202c2c39229faa8db Mon Sep 17 00:00:00 2001
From: amaslennikov
Date: Wed, 6 Mar 2024 13:28:29 +0300
Subject: [PATCH] Add 'Getting Started with Kubernetes' article

Add 'Lifecycle Management' doc

Signed-off-by: amaslennikov
---
 docs/common/vars.rst                |    2 +
 docs/getting-started-kubernetes.rst | 1256 ++++++++++++++++++++++++++-
 docs/index.rst                      |    2 +-
 docs/life-cycle-management.rst      |  476 ++++++++++
 docs/life-cycle-managment.rst       |   22 -
 5 files changed, 1733 insertions(+), 25 deletions(-)
 create mode 100644 docs/life-cycle-management.rst
 delete mode 100644 docs/life-cycle-managment.rst

diff --git a/docs/common/vars.rst b/docs/common/vars.rst
index 71806f2..d15ae26 100644
--- a/docs/common/vars.rst
+++ b/docs/common/vars.rst
@@ -15,3 +15,5 @@
 .. |sriovnetop-ib-sriov-cni-image-tag| replace:: v1.0.3
 .. |sriovnetop-sriov-device-plugin-image-tag| replace:: v3.6.2
 .. |node-feature-discovery-version| replace:: v0.13.2
+.. |helm-chart-version| replace:: v24.1.0
+.. |network-operator-version| replace:: 24.1.0
diff --git a/docs/getting-started-kubernetes.rst b/docs/getting-started-kubernetes.rst
index 84cc2e7..a2a55eb 100644
--- a/docs/getting-started-kubernetes.rst
+++ b/docs/getting-started-kubernetes.rst
@@ -15,8 +15,1260 @@
    limitations under the License.
 
 .. headings # #, * *, =, -, ^, "
 
-
+.. include:: ./common/vars.rst
 *******************************
 Getting Started with Kubernetes
-*******************************
\ No newline at end of file
+*******************************
+
+.. contents:: On this page
+   :depth: 4
+   :local:
+   :backlinks: none
+
+=================================
+Network Operator Deployment Guide
+=================================
+.. _here: ./release-notes.html
+.. _Kubernetes CRDs: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
+.. _Operator SDK: https://github.com/operator-framework/operator-sdk
+.. _GPU-Operator: https://github.com/NVIDIA/gpu-operator
+.. warning:: The Network Operator Release Notes chapter is available here_.
+
+NVIDIA Network Operator leverages `Kubernetes CRDs`_ and `Operator SDK`_ to manage networking-related components in order to enable fast networking, RDMA and GPUDirect for workloads in a Kubernetes cluster. The Network Operator works in conjunction with the GPU-Operator_ to enable GPU-Direct RDMA on compatible systems.
+
+The goal of the Network Operator is to manage the networking-related components, while enabling execution of RDMA and GPUDirect RDMA workloads in a Kubernetes cluster. This includes:
+
+* NVIDIA Networking drivers to enable advanced features
+* Kubernetes device plugins to provide hardware resources required for an accelerated network
+* Kubernetes secondary network components for network-intensive workloads
+
+=========================================================
+Network Operator Deployment on Vanilla Kubernetes Cluster
+=========================================================
+.. _Network-Operator Project Sources: https://github.com/Mellanox/network-operator#nicclusterpolicy-crd
+.. warning:: It is recommended to have dedicated control plane nodes for Vanilla Kubernetes deployments with NVIDIA Network Operator.
+
+The default installation via Helm, as described below, deploys the Network Operator and the related CRDs. An additional step is then required to create a NicClusterPolicy custom resource with the configuration that is desired for the cluster.
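+
+As an illustration only, a minimal NicClusterPolicy that deploys just the RDMA shared device plugin could look like the sketch below. The image, repository, version, resource name and interface name are placeholders and must be adapted to your environment; in the deployment examples later on this page, an equivalent resource is rendered by the Helm chart itself when ``deployCR: true`` is set.
+
+.. code-block:: yaml
+
+   # Illustrative sketch only - component image, version and interface names are assumptions
+   apiVersion: mellanox.com/v1alpha1
+   kind: NicClusterPolicy
+   metadata:
+     name: nic-cluster-policy
+   spec:
+     rdmaSharedDevicePlugin:
+       image: k8s-rdma-shared-dev-plugin
+       repository: ghcr.io/mellanox
+       version: v1.4.0
+       config: |
+         {
+           "configList": [
+             {
+               "resourceName": "rdma_shared_device_a",
+               "rdmaHcaMax": 63,
+               "selectors": {
+                 "ifNames": ["ens1f0"]
+               }
+             }
+           ]
+         }
+
+Such a resource can be applied with ``kubectl apply -f nicclusterpolicy.yaml`` and inspected afterwards with ``kubectl get nicclusterpolicies.mellanox.com``.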
+
+For more information on the NicClusterPolicy custom resource, please refer to the `Network-Operator Project Sources`_.
+
+The provided Helm chart contains various parameters to facilitate the creation of a NicClusterPolicy custom resource upon deployment.
+
+.. warning:: Each Network Operator release has a set of default version values for the various components it deploys. It is recommended not to change these values. Testing and validation were performed with these values, and there is no guarantee of interoperability nor correctness when different versions are used.
+
+.. code-block:: bash
+   :caption: Add NVIDIA NGC Helm repository
+
+   helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
+
+.. code-block:: bash
+   :caption: Update helm repositories
+
+   helm repo update
+
+Install Network Operator from the NVIDIA NGC chart using the default values:
+
+.. parsed-literal::
+
+   helm install network-operator nvidia/network-operator \
+     -n nvidia-network-operator \
+     --create-namespace \
+     --version |helm-chart-version| \
+     --wait
+
+.. code-block:: bash
+   :caption: View deployed resources
+
+   kubectl -n nvidia-network-operator get pods
+
+Install the Network Operator from the NVIDIA NGC chart using custom values:
+
+.. warning:: Since several parameters should be provided when creating custom resources during operator deployment, it is recommended to use a configuration file. While it is possible to override the parameters via CLI, we recommend avoiding the use of CLI arguments in favor of a configuration file.
+
+.. parsed-literal::
+
+   helm show values nvidia/network-operator --version |helm-chart-version| > values.yaml
+
+.. parsed-literal::
+
+   helm install network-operator nvidia/network-operator \
+     -n nvidia-network-operator \
+     --create-namespace \
+     --version |helm-chart-version| \
+     -f ./values.yaml \
+     --wait
+
+===================
+Deployment Examples
+===================
+
+.. warning:: Since several parameters should be provided when creating custom resources during operator deployment, it is recommended to use a configuration file. While it is possible to override the parameters via CLI, we recommend avoiding the use of CLI arguments in favor of a configuration file.
+
+Below are deployment examples, in which the ``values.yaml`` file is provided to Helm during the installation of the Network Operator. This is achieved by running:
+
+.. code-block:: bash
+
+   helm install network-operator nvidia/network-operator -n nvidia-network-operator --create-namespace --wait -f ./values.yaml
+
+----------------------------------------------------------
+Network Operator Deployment with RDMA Shared Device Plugin
+----------------------------------------------------------
+
+Network Operator deployment with the default version of the OFED driver and a single RDMA resource mapped to the ens1f0 netdev:
+
+``values.yaml`` configuration file for such a deployment:
+
+.. code-block:: yaml
+
+   nfd:
+     enabled: true
+   sriovNetworkOperator:
+     enabled: false
+   # NicClusterPolicy CR values:
+   deployCR: true
+   ofedDriver:
+     deploy: true
+
+   rdmaSharedDevicePlugin:
+     deploy: true
+     resources:
+       - name: rdma_shared_device_a
+         ifNames: [ens1f0]
+
+   sriovDevicePlugin:
+     deploy: false
+
+--------------------------------------------------------------------------------
+Network Operator Deployment with Multiple Resources in RDMA Shared Device Plugin
+--------------------------------------------------------------------------------
+
+Network Operator deployment with the default version of OFED and an RDMA device plugin with two RDMA resources. The first is mapped to ens1f0 and ens1f1, and the second is mapped to ens2f0 and ens2f1.
+
+``values.yaml`` configuration file for such a deployment:
+
+.. code-block:: yaml
+
+   nfd:
+     enabled: true
+   sriovNetworkOperator:
+     enabled: false
+   # NicClusterPolicy CR values:
+   deployCR: true
+   ofedDriver:
+     deploy: true
+   rdmaSharedDevicePlugin:
+     deploy: true
+     resources:
+       - name: rdma_shared_device_a
+         ifNames: [ens1f0, ens1f1]
+       - name: rdma_shared_device_b
+         ifNames: [ens2f0, ens2f1]
+
+   sriovDevicePlugin:
+     deploy: false
+
+----------------------------------------------------
+Network Operator Deployment with a Secondary Network
+----------------------------------------------------
+
+Network Operator deployment with:
+
+* RDMA shared device plugin
+* Secondary network
+* Multus CNI
+* Container-networking-plugins CNI plugins
+* Whereabouts IPAM CNI Plugin
+
+``values.yaml``:
+
+.. code-block:: yaml
+
+   nfd:
+     enabled: true
+   sriovNetworkOperator:
+     enabled: false
+   # NicClusterPolicy CR values:
+   deployCR: true
+   ofedDriver:
+     deploy: false
+
+   rdmaSharedDevicePlugin:
+     deploy: true
+     resources:
+       - name: rdma_shared_device_a
+         ifNames: [ens1f0]
+
+   secondaryNetwork:
+     deploy: true
+     multus:
+       deploy: true
+     cniPlugins:
+       deploy: true
+     ipamPlugin:
+       deploy: true
+
+--------------------------------------------
+Network Operator Deployment with NVIDIA-IPAM
+--------------------------------------------
+
+Network Operator deployment with:
+
+* RDMA shared device plugin
+* Secondary network
+* Multus CNI
+* Container-networking-plugins CNI plugins
+* NVIDIA-IPAM CNI Plugin
+
+``values.yaml``:
+
+.. code-block:: yaml
+
+   nfd:
+     enabled: true
+   sriovNetworkOperator:
+     enabled: false
+   # NicClusterPolicy CR values:
+   deployCR: true
+   ofedDriver:
+     deploy: false
+
+   rdmaSharedDevicePlugin:
+     deploy: true
+     resources:
+       - name: rdma_shared_device_a
+         ifNames: [ens1f0]
+
+   secondaryNetwork:
+     deploy: true
+     multus:
+       deploy: true
+     cniPlugins:
+       deploy: true
+     ipamPlugin:
+       deploy: true
+
+To create an NV-IPAM IPPool, apply:
+
+.. code-block:: yaml
+
+   apiVersion: nv-ipam.nvidia.com/v1alpha1
+   kind: IPPool
+   metadata:
+     name: my-pool
+     namespace: nvidia-network-operator
+   spec:
+     subnet: 192.168.0.0/24
+     perNodeBlockSize: 100
+     gateway: 192.168.0.1
+
+Example of a MacvlanNetwork that uses NVIDIA-IPAM:
+
+.. 
code-block:: yaml + + apiVersion: mellanox.com/v1alpha1 + kind: MacvlanNetwork + metadata: + name: example-macvlannetwork + spec: + networkNamespace: "default" + master: "ens2f0" + mode: "bridge" + mtu: 1500 + ipam: | + { + "type": "nv-ipam", + "poolName": "my-pool" + } + +------------------------------------------------------ +Network Operator Deployment with a Host Device Network +------------------------------------------------------ + +Network Operator deployment with: + +* SR-IOV device plugin, single SR-IOV resource pool +* Secondary network +* Multus CNI +* Container-networking-plugins CNI plugins +* Whereabouts IPAM CNI plugin + +In this mode, the Network Operator could be deployed on virtualized deployments as well. It supports both Ethernet and InfiniBand modes. From the Network Operator perspective, there is no difference between the deployment procedures. To work on a VM (virtual machine), the PCI passthrough must be configured for SR-IOV devices. The Network Operator works both with VF (Virtual Function) and PF (Physical Function) inside the VMs. + +.. warning:: If the Host Device Network is used without the MLNX_OFED driver, the following packages should be installed: + + * the linux-generic package on Ubuntu hosts + * the kernel-modules-extra package on the RedHat-based hosts + +``values.yaml``: + +.. code-block:: yaml + + nfd: + enabled: true + sriovNetworkOperator: + enabled: false + # NicClusterPolicy CR values: + deployCR: true + ofedDriver: + deploy: false + + rdmaSharedDevicePlugin: + deploy: false + + sriovDevicePlugin: + deploy: true + resources: + - name: hostdev + vendors: [15b3] + secondaryNetwork: + deploy: true + multus: + deploy: true + cniPlugins: + deploy: true + ipamPlugin: + deploy: true + +Following the deployment, the network operator should be configured, and K8s networking should be deployed to use it in pod configuration. + +The ``host-device-net.yaml`` configuration file for such a deployment: + +.. code-block:: yaml + + apiVersion: mellanox.com/v1alpha1 + kind: HostDeviceNetwork + metadata: + name: hostdev-net + spec: + networkNamespace: "default" + resourceName: "nvidia.com/hostdev" + ipam: | + { + "type": "whereabouts", + "datastore": "kubernetes", + "kubernetes": { + "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig" + }, + "range": "192.168.3.225/28", + "exclude": [ + "192.168.3.229/30", + "192.168.3.236/32" + ], + "log_file": "/var/log/whereabouts.log", + "log_level": "info" + } + +The ``host-device-net-ocp.yaml`` configuration file for such a deployment in the OpenShift Platform: + +.. code-block:: yaml + + apiVersion: mellanox.com/v1alpha1 + kind: HostDeviceNetwork + metadata: + name: hostdev-net + spec: + networkNamespace: "default" + resourceName: "nvidia.com/hostdev" + ipam: | + { + "type": "whereabouts", + "range": "192.168.3.225/28", + "exclude": [ + "192.168.3.229/30", + "192.168.3.236/32" + ] + } + +The ``pod.yaml`` configuration file for such a deployment: + +.. 
code-block:: yaml + + apiVersion: v1 + kind: Pod + metadata: + name: hostdev-test-pod + annotations: + k8s.v1.cni.cncf.io/networks: hostdev-net + spec: + restartPolicy: OnFailure + containers: + - image: + name: mofed-test-ctr + securityContext: + capabilities: + add: [ "IPC_LOCK" ] + resources: + requests: + nvidia.com/hostdev: 1 + limits: + nvidia.com/hostdev: 1 + command: + - sh + - -c + - sleep inf + +-------------------------------------------------------------------------- +Network Operator Deployment with a Host Device Network and Macvlan Network +-------------------------------------------------------------------------- + +In this combined deployment, different NVIDIA NICs are used for RDMA Shared Device Plugin and SR-IOV Network Device Plugin in order to work with a Host Device Network or a Macvlan Network on different NICs. It is impossible to combine different networking types on the same NICs. The same principle should be applied for other networking combinations. + +``values.yaml``: + +.. code-block:: yaml + + nfd: + enabled: true + + # NicClusterPolicy CR values: + deployCR: true + + ofedDriver: + deploy: false + + rdmaSharedDevicePlugin: + deploy: true + resources: + - name: rdma_shared_device_a + linkTypes: [ether] + + sriovDevicePlugin: + deploy: true + resources: + - name: hostdev + linkTypes: [“infiniband”] + + secondaryNetwork: + deploy: true + multus: + deploy: true + cniPlugins: + deploy: true + ipamPlugin: + deploy: true + +For pods and network configuration examples please refer to the corresponding sections: Network Operator Deployment with the RDMA Shared Device Plugin and Network Operator Deployment with a Host Device Network. + +---------------------------------------------------------------------- +Network Operator Deployment with an IP over InfiniBand (IPoIB) Network +---------------------------------------------------------------------- + +Network Operator deployment with: + +* RDMA shared device plugin +* Secondary network +* Multus CNI +* IPoIB CNI +* Whereabouts IPAM CNI plugin + +In this mode, the Network Operator could be deployed on virtualized deployments as well. It supports both Ethernet and InfiniBand modes. From the Network Operator perspective, there is no difference between the deployment procedures. To work on a VM (virtual machine), the PCI passthrough must be configured for SR-IOV devices. The Network Operator works both with VF (Virtual Function) and PF (Physical Function) inside the VMs. + +``values.yaml``: + +.. code-block:: yaml + + nfd: + enabled: true + sriovNetworkOperator: + enabled: false + # NicClusterPolicy CR values: + deployCR: true + ofedDriver: + deploy: true + + rdmaSharedDevicePlugin: + deploy: true + resources: + - name: rdma_shared_device_a + ifNames: [ibs1f0] + + secondaryNetwork: + deploy: true + multus: + deploy: true + ipoib: + deploy: true + ipamPlugin: + deploy: true + +Following the deployment, the network operator should be configured, and K8s networking deployed to use it in the pod configuration. + +The ``ipoib-net.yaml`` configuration file for such a deployment: + +.. 
code-block:: yaml
+
+   apiVersion: mellanox.com/v1alpha1
+   kind: IPoIBNetwork
+   metadata:
+     name: example-ipoibnetwork
+   spec:
+     networkNamespace: "default"
+     master: "ibs1f0"
+     ipam: |
+       {
+         "type": "whereabouts",
+         "datastore": "kubernetes",
+         "kubernetes": {
+           "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
+         },
+         "range": "192.168.5.225/28",
+         "exclude": [
+           "192.168.6.229/30",
+           "192.168.6.236/32"
+         ],
+         "log_file" : "/var/log/whereabouts.log",
+         "log_level" : "info",
+         "gateway": "192.168.6.1"
+       }
+
+The ``ipoib-net-ocp.yaml`` configuration file for such a deployment in the OpenShift Platform:
+
+.. code-block:: yaml
+
+   apiVersion: mellanox.com/v1alpha1
+   kind: IPoIBNetwork
+   metadata:
+     name: example-ipoibnetwork
+   spec:
+     networkNamespace: "default"
+     master: "ibs1f0"
+     ipam: |
+       {
+         "type": "whereabouts",
+         "range": "192.168.5.225/28",
+         "exclude": [
+           "192.168.6.229/30",
+           "192.168.6.236/32"
+         ]
+       }
+
+The ``pod.yaml`` configuration file for such a deployment:
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Pod
+   metadata:
+     name: ipoib-test-pod
+     annotations:
+       k8s.v1.cni.cncf.io/networks: example-ipoibnetwork
+   spec:
+     restartPolicy: OnFailure
+     containers:
+       - image:
+         name: mofed-test-ctr
+         securityContext:
+           capabilities:
+             add: [ "IPC_LOCK" ]
+         resources:
+           requests:
+             rdma/rdma_shared_device_a: 1
+           limits:
+             rdma/rdma_shared_device_a: 1
+         command:
+           - sh
+           - -c
+           - sleep inf
+
+---------------------------------------------------
+Network Operator Deployment for GPUDirect Workloads
+---------------------------------------------------
+
+GPUDirect requires the following:
+
+* MLNX_OFED v5.5-1.0.3.2 or newer
+* GPU Operator v1.9.0 or newer
+* NVIDIA GPU and driver supporting GPUDirect, e.g. Quadro RTX 6000/8000 or NVIDIA T4/NVIDIA V100/NVIDIA A100
+
+``values.yaml`` example:
+
+.. code-block:: yaml
+
+   nfd:
+     enabled: true
+   sriovNetworkOperator:
+     enabled: false
+   # NicClusterPolicy CR values:
+   ofedDriver:
+     deploy: true
+   deployCR: true
+
+   sriovDevicePlugin:
+     deploy: true
+     resources:
+       - name: hostdev
+         vendors: [15b3]
+
+   secondaryNetwork:
+     deploy: true
+     multus:
+       deploy: true
+     cniPlugins:
+       deploy: true
+     ipamPlugin:
+       deploy: true
+
+``host-device-net.yaml``:
+
+.. code-block:: yaml
+
+   apiVersion: mellanox.com/v1alpha1
+   kind: HostDeviceNetwork
+   metadata:
+     name: hostdevice-net
+   spec:
+     networkNamespace: "default"
+     resourceName: "hostdev"
+     ipam: |
+       {
+         "type": "whereabouts",
+         "datastore": "kubernetes",
+         "kubernetes": {
+           "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
+         },
+         "range": "192.168.3.225/28",
+         "exclude": [
+           "192.168.3.229/30",
+           "192.168.3.236/32"
+         ],
+         "log_file" : "/var/log/whereabouts.log",
+         "log_level" : "info"
+       }
+
+The ``host-device-net-ocp.yaml`` configuration file for such a deployment in the OpenShift Platform:
+
+.. code-block:: yaml
+
+   apiVersion: mellanox.com/v1alpha1
+   kind: HostDeviceNetwork
+   metadata:
+     name: hostdevice-net
+   spec:
+     networkNamespace: "default"
+     resourceName: "hostdev"
+     ipam: |
+       {
+         "type": "whereabouts",
+         "range": "192.168.3.225/28",
+         "exclude": [
+           "192.168.3.229/30",
+           "192.168.3.236/32"
+         ]
+       }
+
+``host-net-gpudirect-pod.yaml``
+
+.. 
code-block:: yaml + + apiVersion: v1 + kind: Pod + metadata: + name: testpod1 + annotations: + k8s.v1.cni.cncf.io/networks: hostdevice-net + spec: + containers: + - name: appcntr1 + image: + imagePullPolicy: IfNotPresent + securityContext: + capabilities: + add: ["IPC_LOCK"] + command: + - sh + - -c + - sleep inf + resources: + requests: + nvidia.com/hostdev: '1' + nvidia.com/gpu: '1' + limits: + nvidia.com/hostdev: '1' + nvidia.com/gpu: '1' + +------------------------------------------------- +Network Operator Deployment in SR-IOV Legacy Mode +------------------------------------------------- + +.. _Project Documentation: https://github.com/k8snetworkplumbingwg/sriov-network-operator/ +.. warning:: The SR-IOV Network Operator will be deployed with the default configuration. You can override these settings using a CLI argument, or the ‘sriov-network-operator’ section in the values.yaml file. For more information, refer to the `Project Documentation`_. +.. warning:: This deployment mode supports SR-IOV in legacy mode. + +``values.yaml`` configuration for such a deployment: + +.. code-block:: yaml + + nfd: + enabled: true + sriovNetworkOperator: + enabled: true + + # NicClusterPolicy CR values: + deployCR: true + ofedDriver: + deploy: true + rdmaSharedDevicePlugin: + deploy: false + sriovDevicePlugin: + deploy: false + + secondaryNetwork: + deploy: true + multus: + deploy: true + cniPlugins: + deploy: true + ipamPlugin: + deploy: true + +Following the deployment, the Network Operator should be configured, and sriovnetwork node policy and K8s networking should be deployed. + +The ``sriovnetwork-node-policy.yaml`` configuration file for such a deployment: + +.. code-block:: yaml + + apiVersion: sriovnetwork.openshift.io/v1 + kind: SriovNetworkNodePolicy + metadata: + name: policy-1 + namespace: nvidia-network-operator + spec: + deviceType: netdevice + mtu: 1500 + nicSelector: + vendor: "15b3" + pfNames: ["ens2f0"] + nodeSelector: + feature.node.kubernetes.io/pci-15b3.present: "true" + numVfs: 8 + priority: 90 + isRdma: true + resourceName: sriov_resource + +The ``sriovnetwork.yaml`` configuration file for such a deployment: + +.. code-block:: yaml + + apiVersion: sriovnetwork.openshift.io/v1 + kind: SriovNetwork + metadata: + name: "example-sriov-network" + namespace: nvidia-network-operator + spec: + vlan: 0 + networkNamespace: "default" + resourceName: "sriov_resource" + ipam: |- + { + "datastore": "kubernetes", + "kubernetes": { + "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig" + }, + "log_file": "/tmp/whereabouts.log", + "log_level": "debug", + "type": "whereabouts", + "range": "192.168.101.0/24" + } + +.. warning:: The ens2f0 network interface name has been chosen from the following command output: ``kubectl -n nvidia-network-operator get sriovnetworknodestates.sriovnetwork.openshift.io -o yaml``. + +.. code-block:: yaml + + ... + + status: + interfaces: + - deviceID: 101d + driver: mlx5_core + linkSpeed: 100000 Mb/s + linkType: ETH + mac: 0c:42:a1:2b:74:ae + mtu: 1500 + name: ens2f0 + pciAddress: "0000:07:00.0" + totalvfs: 8 + vendor: 15b3 + - deviceID: 101d + driver: mlx5_core + linkType: ETH + mac: 0c:42:a1:2b:74:af + mtu: 1500 + name: ens2f1 + pciAddress: "0000:07:00.1" + totalvfs: 8 + vendor: 15b3 + + ... + +Wait for all required pods to be spawned: + +.. 
code-block:: bash
+
+   # kubectl get pod -n nvidia-network-operator | grep sriov
+   network-operator-sriov-network-operator-544c8dbbb9-vzkmc   1/1   Running   0   5d
+   sriov-device-plugin-vwpzn                                   1/1   Running   0   2d6h
+   sriov-network-config-daemon-qv467                           3/3   Running   0   5d
+   # kubectl get pod -n nvidia-network-operator
+   NAME                                                        READY   STATUS    RESTARTS   AGE
+   cni-plugins-ds-kbvnm                                        1/1     Running   0          5d
+   cni-plugins-ds-pcllg                                        1/1     Running   0          5d
+   kube-multus-ds-5j6ns                                        1/1     Running   0          5d
+   kube-multus-ds-mxgvl                                        1/1     Running   0          5d
+   mofed-ubuntu20.04-ds-2zzf4                                  1/1     Running   0          5d
+   mofed-ubuntu20.04-ds-rfnsw                                  1/1     Running   0          5d
+   whereabouts-nw7hn                                           1/1     Running   0          5d
+   whereabouts-zvhrv                                           1/1     Running   0          5d
+   ...
+
+The ``pod.yaml`` configuration file for such a deployment:
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Pod
+   metadata:
+     name: testpod1
+     annotations:
+       k8s.v1.cni.cncf.io/networks: example-sriov-network
+   spec:
+     containers:
+       - name: appcntr1
+         image:
+         imagePullPolicy: IfNotPresent
+         securityContext:
+           capabilities:
+             add: ["IPC_LOCK"]
+         resources:
+           requests:
+             nvidia.com/sriov_resource: '1'
+           limits:
+             nvidia.com/sriov_resource: '1'
+         command:
+           - sh
+           - -c
+           - sleep inf
+
+---------------------------------------------------------------------------
+SR-IOV Network Operator Deployment – Parallel Node Configuration for SR-IOV
+---------------------------------------------------------------------------
+
+.. warning:: This is a Tech Preview feature, which is supported only for Vanilla Kubernetes deployments with SR-IOV Network Operator.
+
+To apply SriovNetworkNodePolicy on several nodes in parallel, specify the ``maxParallelNodeConfiguration`` option in the SriovOperatorConfig CRD:
+
+.. code-block:: bash
+
+   kubectl patch sriovoperatorconfigs.sriovnetwork.openshift.io -n network-operator default --patch '{ "spec": { "maxParallelNodeConfiguration": 0 } }' --type='merge'
+
+--------------------------------------------------------------------------
+SR-IOV Network Operator Deployment – Parallel NIC Configuration for SR-IOV
+--------------------------------------------------------------------------
+
+.. warning:: This is a Tech Preview feature, which is supported only for Vanilla Kubernetes deployments with SR-IOV Network Operator.
+
+To configure several NICs on the same node in parallel, enable the ``parallelNicConfig`` feature gate via the ``featureGates`` option in the SriovOperatorConfig CRD:
+
+.. code-block:: bash
+
+   kubectl patch sriovoperatorconfigs.sriovnetwork.openshift.io -n network-operator default --patch '{ "spec": { "featureGates": { "parallelNicConfig": true } } }' --type='merge'
+
+----------------------------------------------------------------------
+SR-IOV Network Operator Deployment – SR-IOV Using the systemd Service
+----------------------------------------------------------------------
+
+To enable systemd SR-IOV configuration mode, specify the ``configurationMode`` option in the SriovOperatorConfig CRD:
+
+.. code-block:: bash
+
+   kubectl patch sriovoperatorconfigs.sriovnetwork.openshift.io -n network-operator default --patch '{ "spec": { "configurationMode": "systemd"} }' --type='merge'
+
+-------------------------------------------------------------
+Network Operator Deployment with an SR-IOV InfiniBand Network
+-------------------------------------------------------------
+
+Network Operator deployment with InfiniBand network requires the following:
+
+* MLNX_OFED and OpenSM running. OpenSM runs on top of the MLNX_OFED stack, so both the driver and the subnet manager should come from the same installation. Note that partitions that are configured by OpenSM should specify ``defmember=full`` to enable the SR-IOV functionality over InfiniBand. For more details, please refer to `this article`_.
+* InfiniBand device – Both the host device and the switch ports must be enabled in InfiniBand mode.
+* The rdma-core package should be installed when an inbox driver is used.
+
+``values.yaml``
+
+.. code-block:: yaml
+
+   nfd:
+     enabled: true
+   sriovNetworkOperator:
+     enabled: true
+
+   # NicClusterPolicy CR values:
+   deployCR: true
+   ofedDriver:
+     deploy: true
+   rdmaSharedDevicePlugin:
+     deploy: false
+   sriovDevicePlugin:
+     deploy: false
+
+   secondaryNetwork:
+     deploy: true
+     multus:
+       deploy: true
+     cniPlugins:
+       deploy: true
+     ipamPlugin:
+       deploy: true
+
+``sriov-ib-network-node-policy.yaml``
+
+.. code-block:: yaml
+
+   apiVersion: sriovnetwork.openshift.io/v1
+   kind: SriovNetworkNodePolicy
+   metadata:
+     name: infiniband-sriov
+     namespace: nvidia-network-operator
+   spec:
+     deviceType: netdevice
+     mtu: 1500
+     nodeSelector:
+       feature.node.kubernetes.io/pci-15b3.present: "true"
+     nicSelector:
+       vendor: "15b3"
+     linkType: infiniband
+     isRdma: true
+     numVfs: 8
+     priority: 90
+     resourceName: mlnxnics
+
+``sriov-ib-network.yaml``
+
+.. code-block:: yaml
+
+   apiVersion: sriovnetwork.openshift.io/v1
+   kind: SriovIBNetwork
+   metadata:
+     name: example-sriov-ib-network
+     namespace: nvidia-network-operator
+   spec:
+     ipam: |
+       {
+         "type": "whereabouts",
+         "datastore": "kubernetes",
+         "kubernetes": {
+           "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
+         },
+         "range": "192.168.5.225/28",
+         "exclude": [
+           "192.168.5.229/30",
+           "192.168.5.236/32"
+         ],
+         "log_file": "/var/log/whereabouts.log",
+         "log_level": "info"
+       }
+     resourceName: mlnxnics
+     linkState: enable
+     networkNamespace: default
+
+``sriov-ib-network-pod.yaml``
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Pod
+   metadata:
+     name: test-sriov-ib-pod
+     annotations:
+       k8s.v1.cni.cncf.io/networks: example-sriov-ib-network
+   spec:
+     containers:
+       - name: test-sriov-ib-pod
+         image: centos/tools
+         imagePullPolicy: IfNotPresent
+         command:
+           - sh
+           - -c
+           - sleep inf
+         securityContext:
+           capabilities:
+             add: [ "IPC_LOCK" ]
+         resources:
+           requests:
+             nvidia.com/mlnxnics: "1"
+           limits:
+             nvidia.com/mlnxnics: "1"
+
+----------------------------------------------------------------------------------
+Network Operator Deployment with an SR-IOV InfiniBand Network with PKey Management
+----------------------------------------------------------------------------------
+
+.. _this article: https://docs.mellanox.com/display/MLNXOFEDv51258060/OpenSM
+.. _the project documentation: https://docs.nvidia.com/networking/display/UFMEnterpriseUMv652
+
+Network Operator deployment with InfiniBand network requires the following:
+
+* MLNX_OFED and OpenSM running. OpenSM runs on top of the MLNX_OFED stack, so both the driver and the subnet manager should come from the same installation. Note that partitions that are configured by OpenSM should specify ``defmember=full`` to enable the SR-IOV functionality over InfiniBand. For more details, please refer to `this article`_.
+* NVIDIA UFM running on top of OpenSM. For more details, please refer to `the project documentation`_.
+* InfiniBand device – Both the host device and the switch ports must be enabled in InfiniBand mode.
+* The rdma-core package should be installed when an inbox driver is used.
+
+Current limitations:
+
+* Only a single PKey can be configured per workload pod.
+* When a single instance of NVIDIA UFM is used with several K8s clusters, different PKey GUID pools should be configured for each cluster.
+
+.. warning:: The ``ib-kubernetes-ufm-secret`` should be created before the NicClusterPolicy.
+
+``ufm-secret.yaml``
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Secret
+   metadata:
+     name: ib-kubernetes-ufm-secret
+     namespace: nvidia-network-operator
+   stringData:
+     UFM_USERNAME: "admin"
+     UFM_PASSWORD: "123456"
+     UFM_ADDRESS: "ufm-host"
+     UFM_HTTP_SCHEMA: ""
+     UFM_PORT: ""
+   data:
+     UFM_CERTIFICATE: ""
+
+``values.yaml``
+
+.. code-block:: yaml
+
+   nfd:
+     enabled: true
+   sriovNetworkOperator:
+     enabled: true
+     resourcePrefix: "nvidia.com"
+
+   # NicClusterPolicy CR values:
+   deployCR: true
+   ofedDriver:
+     deploy: true
+   rdmaSharedDevicePlugin:
+     deploy: false
+   sriovDevicePlugin:
+     deploy: false
+   ibKubernetes:
+     deploy: true
+     periodicUpdateSeconds: 5
+     pKeyGUIDPoolRangeStart: "02:00:00:00:00:00:00:00"
+     pKeyGUIDPoolRangeEnd: "02:FF:FF:FF:FF:FF:FF:FF"
+     ufmSecret: ufm-secret
+
+   secondaryNetwork:
+     deploy: true
+     multus:
+       deploy: true
+     cniPlugins:
+       deploy: true
+     ipamPlugin:
+       deploy: true
+
+Wait for MLNX_OFED to install and apply the following CRs:
+
+``sriov-ib-network-node-policy.yaml``
+
+.. code-block:: yaml
+
+   apiVersion: sriovnetwork.openshift.io/v1
+   kind: SriovNetworkNodePolicy
+   metadata:
+     name: infiniband-sriov
+     namespace: nvidia-network-operator
+   spec:
+     deviceType: netdevice
+     mtu: 1500
+     nodeSelector:
+       feature.node.kubernetes.io/pci-15b3.present: "true"
+     nicSelector:
+       vendor: "15b3"
+     linkType: ib
+     isRdma: true
+     numVfs: 8
+     priority: 90
+     resourceName: mlnxnics
+
+``sriov-ib-network.yaml``
+
+.. code-block:: yaml
+
+   apiVersion: "k8s.cni.cncf.io/v1"
+   kind: NetworkAttachmentDefinition
+   metadata:
+     name: ib-sriov-network
+     annotations:
+       k8s.v1.cni.cncf.io/resourceName: nvidia.com/mlnxnics
+   spec:
+     config: '{
+       "type": "ib-sriov",
+       "cniVersion": "0.3.1",
+       "name": "ib-sriov-network",
+       "pkey": "0x6",
+       "link_state": "enable",
+       "ibKubernetesEnabled": true,
+       "ipam": {
+         "type": "whereabouts",
+         "datastore": "kubernetes",
+         "kubernetes": {
+           "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
+         },
+         "range": "10.56.217.0/24",
+         "log_file" : "/var/log/whereabouts.log",
+         "log_level" : "info"
+       }
+     }'
+
+``sriov-ib-network-pod.yaml``
+
+.. code-block:: yaml
+
+   apiVersion: v1
+   kind: Pod
+   metadata:
+     name: test-sriov-ib-pod
+     annotations:
+       k8s.v1.cni.cncf.io/networks: ib-sriov-network
+   spec:
+     containers:
+       - name: test-sriov-ib-pod
+         image: centos/tools
+         imagePullPolicy: IfNotPresent
+         command:
+           - sh
+           - -c
+           - sleep inf
+         securityContext:
+           capabilities:
+             add: [ "IPC_LOCK" ]
+         resources:
+           requests:
+             nvidia.com/mlnxnics: "1"
+           limits:
+             nvidia.com/mlnxnics: "1"
+
+--------------------------------------------------------------------
+Network Operator Deployment for DPDK Workloads with NicClusterPolicy
+--------------------------------------------------------------------
+
+.. _HUGEPAGE: http://manpages.ubuntu.com/manpages/focal/man8/hugeadm.8.html
+
+This deployment mode supports DPDK applications. In order to run DPDK applications, HUGEPAGE_ should be configured on the required K8s worker nodes. By default, the inbox operating system driver is used. For cases with specific requirements, the OFED container should be deployed.
+
+Network Operator deployment with:
+
+* Host Device Network
+* DPDK pod
+
+``nicclusterpolicy.yaml``
+
+.. 
parsed-literal:: + + apiVersion: mellanox.com/v1alpha1 + kind: NicClusterPolicy + metadata: + name: nic-cluster-policy + spec: + ofedDriver: + image: doca-driver + repository: nvcr.io/nvidia/mellanox + version: |mofed-version| + sriovDevicePlugin: + image: sriov-network-device-plugin + repository: ghcr.io/k8snetworkplumbingwg + version: |sriov-device-plugin-version| + config: | + { + "resourceList": [ + { + "resourcePrefix": "nvidia.com", + "resourceName": "rdma_host_dev", + "selectors": { + "vendors": ["15b3"], + "devices": ["1018"], + "drivers": ["mlx5_core"] + } + } + ] + } + secondaryNetwork: + cniPlugins: + image: plugins + repository: ghcr.io/k8snetworkplumbingwg + version: |cni-plugins-version|-amd64 + ipamPlugin: + image: whereabouts + repository: ghcr.io/k8snetworkplumbingwg + version: |whereabouts-version|-amd64 + multus: + image: multus-cni + repository: ghcr.io/k8snetworkplumbingwg + version: |multus-version| + +``host-device-net.yaml`` + +.. code-block:: yaml + + apiVersion: mellanox.com/v1alpha1 + kind: HostDeviceNetwork + metadata: + name: example-hostdev-net + spec: + networkNamespace: "default" + resourceName: "rdma_host_dev" + ipam: | + { + "type": "whereabouts", + "datastore": "kubernetes", + "kubernetes": { + "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig" + }, + "range": "192.168.3.225/28", + "exclude": [ + "192.168.3.229/30", + "192.168.3.236/32" + ], + "log_file" : "/var/log/whereabouts.log", + "log_level" : "info" + } + +``pod.yaml`` + +.. code-block:: yaml + + apiVersion: v1 + kind: Pod + metadata: + name: testpod1 + annotations: + k8s.v1.cni.cncf.io/networks: example-hostdev-net + spec: + containers: + - name: appcntr1 + image: + imagePullPolicy: IfNotPresent + securityContext: + capabilities: + add: ["IPC_LOCK"] + volumeMounts: + - mountPath: /dev/hugepages + name: hugepage + resources: + requests: + memory: 1Gi + hugepages-1Gi: 2Gi + nvidia.com/rdma_host_dev: '1' + command: [ "/bin/bash", "-c", "--" ] + args: [ "while true; do sleep 300000; done;" ] + volumes: + - name: hugepage + emptyDir: + medium: HugePages \ No newline at end of file diff --git a/docs/index.rst b/docs/index.rst index 343db07..41f2dec 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -27,7 +27,7 @@ Getting Started with Kubernetes Getting Started with Red Hat OpenShift Customization Options and CRDs - Life Cycle Management + Life Cycle Management Advanced Configurations diff --git a/docs/life-cycle-management.rst b/docs/life-cycle-management.rst new file mode 100644 index 0000000..e6e3099 --- /dev/null +++ b/docs/life-cycle-management.rst @@ -0,0 +1,476 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " +.. include:: ./common/vars.rst + +********************* +Life Cycle Management +********************* + +.. 
contents:: On this page + :depth: 4 + :local: + :backlinks: none + +============================= +Ensuring Deployment Readiness +============================= + +Once the Network Operator is deployed, and a NicClusterPolicy resource is created, the operator will reconcile the state of the cluster until it reaches the desired state, as defined in the resource. + +Alignment of the cluster to the defined policy can be verified in the custom resource status. + +a "Ready" state indicates that the required components were deployed, and that the policy is applied on the cluster. + +--------------------------------------------------- +Status Field Example of a NICClusterPolicy Instance +--------------------------------------------------- + +Get the NicClusterPolicy status: + +.. code-block:: bash + + kubectl get -n network-operator nicclusterpolicies.mellanox.com nic-cluster-policy -o yaml + +.. code-block:: bash + + status: + appliedStates: + - name: state-pod-security-policy + state: ignore + - name: state-multus-cni + state: ready + - name: state-container-networking-plugins + state: ignore + - name: state-ipoib-cni + state: ignore + - name: state-whereabouts-cni + state: ready + - name: state-OFED + state: ready + - name: state-SRIOV-device-plugin + state: ignore + - name: state-RDMA-device-plugin + state: ready + - name: state-NV-Peer + state: ignore + - name: state-ib-kubernetes + state: ignore + - name: state-nv-ipam-cni + state: ready + state: ready + +.. note:: An "Ignore" state indicates that the sub-state was not defined in the custom resource, and thus, it is ignored. + +======================== +Network Operator Upgrade +======================== + +Before upgrading to Network Operator v1.0 or newer with SR-IOV Network Operator enabled, the following manual actions are required: + +.. code-block:: bash + + $ kubectl -n nvidia-network-operator scale deployment network-operator-sriov-network-operator --replicas 0 + + $ kubectl -n nvidia-network-operator delete sriovnetworknodepolicies.sriovnetwork.openshift.io default + +The network operator provides limited upgrade capabilities, which require additional manual actions if a containerized OFED driver is used. Future releases of the network operator will provide an automatic upgrade flow for the containerized driver. + +Since Helm does not support auto-upgrade of existing CRDs, the user must follow a two-step process to upgrade the network-operator release: + +* Upgrade the CRD to the latest version +* Apply Helm chart update + +---------------------------- +Downloading a New Helm Chart +---------------------------- + +To obtain new releases, run: + +.. parsed-literal:: + + # Download Helm chart + $ helm fetch \https://helm.ngc.nvidia.com/nvidia/charts/network-operator-|network-operator-version|.tgz + $ ls network-operator-\*.tgz | xargs -n 1 tar xf + + +------------------------------------- +Upgrading CRDs for a Specific Release +------------------------------------- + +It is possible to retrieve updated CRDs from the Helm chart or from the release branch on GitHub. The example below shows how to upgrade CRDs from the downloaded chart. + +.. code-block:: bash + + $ kubectl apply \ + -f network-operator/crds \ + -f network-operator/charts/sriov-network-operator/crds + +--------------------------------------------- +Preparing the Helm Values for the New Release +--------------------------------------------- + +Edit the values-.yaml file as required for your cluster. 
The network operator has some limitations as to which updates in the NicClusterPolicy it can handle automatically. If the configuration for the new release is different from the current configuration in the deployed release, some additional manual actions may be required. + +Known limitations: + +* If component configuration was removed from the NicClusterPolicy, manual clean up of the component's resources (DaemonSets, ConfigMaps, etc.) may be required. +* If the configuration for devicePlugin changed without image upgrade, manual restart of the devicePlugin may be required. + +These limitations will be addressed in future releases. + +.. warning:: Changes that were made directly in the NicClusterPolicy CR (e.g. with kubectl edit) will be overwritten by the Helm upgrade due to the `force` flag. + +------------------------------ +Applying the Helm Chart Update +------------------------------ + +To apply the Helm chart update, run: + +.. code-block:: bash + + $ helm upgrade -n nvidia-network-operator network-operator nvidia/network-operator --version= -f values-.yaml --force + +.. warning:: The --devel option is required if you wish to use the Beta release. + +-------------------------- +OFED Driver Manual Upgrade +-------------------------- + +################################################ +Restarting Pods with a Containerized OFED Driver +################################################ + +.. warning:: This operation is required only if containerized OFED is in use. + +When a containerized OFED driver is reloaded on the node, all pods that use a secondary network based on NVIDIA NICs will lose network interface in their containers. To prevent outage, remove all pods that use a secondary network from the node before you reload the driver pod on it. + +The Helm upgrade command will only upgrade the DaemonSet spec of the OFED driver to point to the new driver version. The OFED driver's DaemonSet will not automatically restart pods with the driver on the nodes, as it uses "OnDelete" updateStrategy. The old OFED version will still run on the node until you explicitly remove the driver pod or reboot the node: + +.. code-block:: bash + + $ kubectl delete pod -l app=mofed- -n nvidia-network-operator + +It is possible to remove all pods with secondary networks from all cluster nodes, and then restart the OFED pods on all nodes at once. + +The alternative option is to perform an upgrade in a rolling manner to reduce the impact of the driver upgrade on the cluster. The driver pod restart can be done on each node individually. In this case, pods with secondary networks should be removed from the single node only. There is no need to stop pods on all nodes. + +For each node, follow these steps to reload the driver on the node: + +1. Remove pods with a secondary network from the node. +2. Restart the OFED driver pod. +3. Return the pods with a secondary network to the node. + +When the OFED driver is ready, proceed with the same steps for other nodes. + +#################################################### +Removing Pods with a Secondary Network from the Node +#################################################### + +To remove pods with a secondary network from the node with node drain, run the following command: + +.. code-block:: bash + + $ kubectl drain --pod-selector= + +.. warning:: Replace with -l "network.nvidia.com/operator.mofed.wait=false" if you wish to drain all nodes at once. 
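+
+For convenience, the per-node flow described above can be scripted. The following is a rough, illustrative sketch only: the node name and the workload pod selector are placeholders, and the ``app=mofed-ubuntu20.04`` label is taken from the Ubuntu 20.04 example on this page and must be adapted to your cluster.
+
+.. code-block:: bash
+
+   # Illustrative per-node rolling reload of the containerized OFED driver.
+   # NODE and POD_SELECTOR are placeholders - adjust them to your environment.
+   NODE=worker-1
+   POD_SELECTOR="app=my-rdma-workload"
+
+   # 1. Remove pods with a secondary network from the node (this also cordons it)
+   kubectl drain "${NODE}" --pod-selector="${POD_SELECTOR}"
+
+   # 2. Restart the OFED driver pod running on that node
+   kubectl delete pod -n nvidia-network-operator -l app=mofed-ubuntu20.04 \
+       --field-selector spec.nodeName="${NODE}"
+
+   # 3. Wait for the OFED driver pods to be Ready again, then make the node schedulable
+   kubectl wait pod -n nvidia-network-operator -l app=mofed-ubuntu20.04 \
+       --for=condition=Ready --timeout=10m
+   kubectl uncordon "${NODE}"
+
+When the driver pod is Ready again, the same sequence can be repeated on the next node.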
+ +############################## +Restarting the OFED Driver Pod +############################## + +Find the OFED driver pod name for the node: + +.. code-block:: bash + + $ kubectl get pod -l app=mofed- -o wide -A + +Example for Ubuntu 20.04: + +.. code-block:: bash + + kubectl get pod -l app=mofed-ubuntu20.04 -o wide -A + +########################################## +Deleting the OFED Driver Pod from the Node +########################################## + +To delete the OFED driver pod from the node, run: + +.. code-block:: bash + + $ kubectl delete pod -n + +.. warning:: Replace with -l app=mofed-ubuntu20.04 if you wish to remove OFED pods on all nodes at once. + +A new version of the OFED pod will automatically start. + +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Returning Pods with a Secondary Network to the Node +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +After the OFED pod is ready on the node, you can make the node schedulable again. + +The command below will uncordon (remove node.kubernetes.io/unschedulable:NoSchedule taint) the node, and return the pods to it: + +.. code-block:: bash + + $ kubectl uncordon -l "network.nvidia.com/operator.mofed.wait=false" + +----------------------------- +Automatic OFED Driver Upgrade +----------------------------- + +To enable automatic OFED upgrade, define the UpgradePolicy section for the ofedDriver in the NicClusterPolicy spec, and change the OFED version. + +``nicclusterpolicy.yaml``: + +.. parsed-literal:: + + apiVersion: mellanox.com/v1alpha1 + kind: NicClusterPolicy + metadata: + name: nic-cluster-policy + namespace: nvidia-network-operator + spec: + ofedDriver: + image: doca-driver + repository: nvcr.io/nvidia/mellanox + version: |mofed-version| + upgradePolicy: + # autoUpgrade is a global switch for automatic upgrade feature + # if set to false all other options are ignored + autoUpgrade: true + # maxParallelUpgrades indicates how many nodes can be upgraded in parallel + # 0 means no limit, all nodes will be upgraded in parallel + maxParallelUpgrades: 0 + # cordon and drain (if enabled) a node before loading the driver on it + safeLoad: false + # describes the configuration for waiting on job completions + waitForCompletion: + # specifies a label selector for the pods to wait for completion + podSelector: "app=myapp" + # specify the length of time in seconds to wait before giving up for workload to finish, zero means infinite + # if not specified, the default is 300 seconds + timeoutSeconds: 300 + # describes configuration for node drain during automatic upgrade + drain: + # allow node draining during upgrade + enable: true + # allow force draining + force: false + # specify a label selector to filter pods on the node that need to be drained + podSelector: "" + # specify the length of time in seconds to wait before giving up drain, zero means infinite + # if not specified, the default is 300 seconds + timeoutSeconds: 300 + # specify if should continue even if there are pods using emptyDir + deleteEmptyDir: false + +Apply NicClusterPolicy CRD: + +.. code-block:: bash + + $ kubectl apply -f nicclusterpolicy.yaml + +.. warning:: To be able to drain nodes, make sure to fill the PodDisruptionBudget field for all the pods that use it. On some clusters (e.g. Openshift), many pods use PodDisruptionBudget, which makes draining multiple nodes at once impossible. 
Since evicting several pods that are controlled by the same deployment or replica set, violates their PodDisruptionBudget, those pods are not evicted and in drain failure. + + To perform a driver upgrade, the network-operator must evict pods that are using network resources. Therefore, in order to ensure that the network-operator is evicting only the required pods, the upgradePolicy.drain.podSelector field must be configured. + +################### +Node Upgrade States +################### + +The status upgrade of each node is reflected in its nvidia.com/ofed-driver-upgrade-state label . This label can have the following values: + +.. list-table:: + :header-rows: 1 + + * - Name + - Description + * - Unknown (empty) + - The node has this state when the upgrade flow is disabled or the node has not been processed yet. + * - ``upgrade-done`` + - Set when OFED POD is up-to-date and running on the node, the node is schedulable. + * - ``upgrade-required`` + - Set when OFED POD on the node is not up-to-date and requires upgrade. No actions are performed at this stage. + * - ``cordon-required`` + - Set when the node needs to be made unschedulable in preparation for driver upgrade. + * - ``wait-for-jobs-required`` + - Set on the node when waiting is required for jobs to complete until the given timeout. + * - ``drain-required`` + - Set when the node is scheduled for drain. After the drain, the state is changed either to pod-restart-required or upgrade-failed. + * - ``pod-restart-required`` + - Set when the OFED POD on the node is scheduled for restart. After the restart, the state is changed to uncordon-required. + * - ``uncordon-required`` + - Set when OFED POD on the node is up-to-date and has "Ready" status. After uncordone, the state is changed to upgrade-done + * - ``upgrade-failed`` + - Set when the upgrade on the node has failed. Manual interaction is required at this stage. See Troubleshooting section for more details. + +.. warning:: Depending on your cluster workloads and pod Disruption Budget, set the following values for auto upgrade: + + .. parsed-literal:: + + apiVersion: mellanox.com/v1alpha1 + kind: NicClusterPolicy + metadata: + name: nic-cluster-policy + namespace: nvidia-network-operator + spec: + ofedDriver: + image: doca-driver + repository: nvcr.io/nvidia/mellanox + version: |mofed-version| + upgradePolicy: + autoUpgrade: true + maxParallelUpgrades: 1 + drain: + enable: true + force: false + deleteEmptyDir: true + podSelector: "" + +################### +Safe Driver Loading +################### + +.. warning:: The state of this feature can be controlled with the ofedDriver.upgradePolicy.safeLoad option. + +Upon node startup, the OFED container takes some time to compile and load the driver. During that time, workloads might get scheduled on that node. When OFED is loaded, all existing PODs that use NVIDIA NICs will lose their network interfaces. Some such PODs might silently fail or hang. To avoid this situation, before the OFED container is loaded, the node should get cordoned and drained to ensure all workloads are rescheduled. The node should be un-cordoned when the driver is ready on it. + +The safe driver loading feature is implemented as a part of the upgrade flow, meaning safe driver loading is a special scenario of the upgrade procedure, where we upgrade from the inbox driver to the containerized OFED. + +When this feature is enabled, the initial OFED driver rollout on the large cluster can take a while. 
To speed up the rollout, the initial deployment can be done with the safe driver loading feature disabled, and this feature can be enabled later by updating the NicClusterPolicy CRD. + +^^^^^^^^^^^^^^^ +Troubleshooting +^^^^^^^^^^^^^^^ + +.. list-table:: + :header-rows: 1 + + * - Issue + - Required Action + * - The node is in upgrade-failed state. + - * Drain the node manually by running kubectl drain --ignore-daemonsets. + * Delete the MLNX_OFED pod on the node manually, by running the following command: ``kubectl delete pod -n `kubectl get pods --A --field-selector spec.nodeName= -l nvidia.com/ofed-driver --no-headers | awk '{print $1 " "$2}'```. + + **NOTE:** If the "Safe driver loading" feature is enabled, you may also need to remove the ``nvidia.com/ofed-driver-upgrade.driver-wait-for-safe-load`` annotation from the node object to unblock the loading of the driver + ``kubectl annotate node nvidia.com/ofed-driver-upgrade.driver-wait-for-safe-load-`` + + * Wait for the node to complete the upgrade. + + * - The updated MLNX_OFED pod failed to start/ a new version of MLNX_OFED cannot be installed on the node. + - Manually delete the pod by using ``kubectl delete -n ``. + If following the restart the pod still fails, change the MLNX_OFED version in the NicClusterPolicy to the previous version or to another working version. + +================================= +Uninstalling the Network Operator +================================= + +------------------------------------------------------------- +Uninstalling Network Operator on a Vanilla Kubernetes Cluster +------------------------------------------------------------- + +Uninstall the Network Operator: + +.. code-block:: bash + + helm uninstall network-operator -n network-operator + +You should now see all the pods being deleted: + +.. code-block:: bash + + kubectl get pods -n network-operator + +Make sure that the CRDs created during the operator installation have been removed: + +.. code-block:: bash + + kubectl get nicclusterpolicies.mellanox.com + No resources found + +--------------------------------------------------------- +Uninstalling the Network Operator on an OpenShift Cluster +--------------------------------------------------------- + +.. _Red Hat OpenShift Container Platform Documentation: https://docs.openshift.com/container-platform/4.10/operators/admin/olm-deleting-operators-from-cluster.html + +From the console: + +In the OpenShift Container Platform web console side menu, select **Operators >Installed Operators**, search for the **NVIDIA Network Operator**, and click on it. + +On the right side of the **Operator Details** page, select **Uninstall Operator** from the **Actions** drop-down menu. + +For additional information, see the `Red Hat OpenShift Container Platform Documentation`_. + +From the CLI: + + * Check the current version of the Network Operator in the currentCSV field: + + .. code-block:: bash + + oc get subscription -n nvidia-network-operator nvidia-network-operator -o yaml | grep currentCSV + + Example output: + + .. code-block:: bash + + currentCSV: nvidia-network-operator.v24.1.0 + * Delete the subscription: + + .. code-block:: bash + + oc delete subscription -n nvidia-network-operator nvidia-network-operator + + Example output: + + .. code-block:: bash + + subscription.operators.coreos.com "nvidia-network-operator" deleted + + * Delete the CSV using the currentCSV value from the previous step: + + .. 
code-block:: bash
+
+      oc delete clusterserviceversion -n nvidia-network-operator nvidia-network-operator.v24.1.0
+
+   Example output:
+
+   .. code-block:: bash
+
+      clusterserviceversion.operators.coreos.com "nvidia-network-operator.v24.1.0" deleted
+
+The SR-IOV Network Operator uninstallation procedure is described in this document. For additional information, see the `Red Hat OpenShift Container Platform Documentation`_.
+
+----------------
+Additional Steps
+----------------
+
+.. warning:: In OCP, uninstalling an operator does not remove its managed resources, including CRDs and CRs. To remove them, you must manually delete the Operator CRDs following the operator uninstallation.
+
+Delete the Network Operator CRDs:
+
+.. code-block:: bash
+
+   oc delete crds hostdevicenetworks.mellanox.com macvlannetworks.mellanox.com nicclusterpolicies.mellanox.com
+
+===========================
+NicClusterPolicy CRD Update
+===========================
+If a manual update of the NicClusterPolicy affects the device plugin configuration (e.g. the NIC selectors), a manual restart of the device plugin pods is required.
\ No newline at end of file
diff --git a/docs/life-cycle-managment.rst b/docs/life-cycle-managment.rst
deleted file mode 100644
index 7026b2f..0000000
--- a/docs/life-cycle-managment.rst
+++ /dev/null
@@ -1,22 +0,0 @@
-.. license-header
-   SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-   SPDX-License-Identifier: Apache-2.0
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
-.. headings # #, * *, =, -, ^, "
-
-
-*********************
-Life Cycle Management
-*********************
\ No newline at end of file