diff --git a/docs/customizations/helm.rst b/docs/customizations/helm.rst
index 0c85358..2f10d53 100644
--- a/docs/customizations/helm.rst
+++ b/docs/customizations/helm.rst
@@ -224,9 +224,9 @@ For example:
            memory: "300Mi"


-================
-MLNX_OFED Driver
-================
+===================
+ NVIDIA DOCA Driver
+===================

 .. list-table::
    :header-rows: 1
@@ -238,19 +238,19 @@ MLNX_OFED Driver
    * - ofedDriver.deploy
      - Bool
      - false
-     - Deploy the MLNX_OFED driver container
+     - Deploy the NVIDIA DOCA Driver container
    * - ofedDriver.repository
      - String
      - nvcr.io/nvidia/mellanox
-     - MLNX_OFED driver image repository
+     - NVIDIA DOCA Driver image repository
    * - ofedDriver.image
      - String
      - doca-driver
-     - MLNX_OFED driver image name
+     - NVIDIA DOCA Driver image name
    * - ofedDriver.version
      - String
      - |mofed-version|
-     - MLNX_OFED driver version
+     - NVIDIA DOCA Driver version
    * - ofedDriver.initContainer.enable
      - Bool
      - true
@@ -282,7 +282,7 @@ MLNX_OFED Driver
    * - ofedDriver.imagePullSecrets
      - List
      - []
-     - An optional list of references to secrets to use for pulling any of the MLNX_OFED driver images
+     - An optional list of references to secrets to use for pulling any of the NVIDIA DOCA Driver images
    * - ofedDriver.env
      - List
      - []
@@ -290,27 +290,27 @@ MLNX_OFED Driver
    * - ofedDriver.startupProbe.initialDelaySeconds
      - Int
      - 10
-     - MLNX_OFED startup probe initial delay
+     - NVIDIA DOCA Driver startup probe initial delay
    * - ofedDriver.startupProbe.periodSeconds
      - Int
      - 20
-     - MLNX_OFED startup probe interval
+     - NVIDIA DOCA Driver startup probe interval
    * - ofedDriver.livenessProbe.initialDelaySeconds
      - Int
      - 30
-     - MLNX_OFED liveness probe initial delay
+     - NVIDIA DOCA Driver liveness probe initial delay
    * - ofedDriver.livenessProbe.periodSeconds
      - Int
      - 30
-     - MLNX_OFED liveness probe interval
+     - NVIDIA DOCA Driver liveness probe interval
    * - ofedDriver.readinessProbe.initialDelaySeconds
      - Int
      - 10
-     - MLNX_OFED readiness probe initial delay
+     - NVIDIA DOCA Driver readiness probe initial delay
    * - ofedDriver.readinessProbe.periodSeconds
      - Int
      - 30
-     - MLNX_OFED readiness probe interval
+     - NVIDIA DOCA Driver readiness probe interval
    * - ofedDriver.upgradePolicy.autoUpgrade
      - Bool
      - true
@@ -360,11 +360,11 @@ MLNX_OFED Driver
      - false
      - Fail Mellanox OFED deployment if precompiled OFED driver container image does not exists

-======================================
-MLNX_OFED Driver Environment Variables
-======================================
+========================================
+NVIDIA DOCA Driver Environment Variables
+========================================

-The following are special environment variables supported by the MLNX_OFED container to configure its behavior:
+The following are special environment variables supported by the NVIDIA DOCA Driver container to configure its behavior:

 .. list-table::
    :header-rows: 1
@@ -378,7 +378,7 @@ The following are special environment variables supported by the MLNX_OFED conta
      - Create an udev rule to preserve "old-style" path based netdev names e.g enp3s0f0
    * - UNLOAD_STORAGE_MODULES
      - "false"
-     - | Unload host storage modules prior to loading MLNX_OFED modules:
+     - | Unload host storage modules prior to loading NVIDIA DOCA Driver modules:
        | * ib_isert
        | * nvme_rdma
        | * nvmet_rdma
@@ -387,12 +387,12 @@ The following are special environment variables supported by the MLNX_OFED conta
        | * ib_srpt
    * - ENABLE_NFSRDMA
      - "false"
-     - Enable loading of NFS related storage modules from a MLNX_OFED container
+     - Enable loading of NFS related storage modules from an NVIDIA DOCA Driver container
    * - RESTORE_DRIVER_ON_POD_TERMINATION
      - "true"
      - Restore host drivers when a container

-In addition, it is possible to specify any environment variables to be exposed to the MLNX_OFED container, such as the standard "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY".
+In addition, it is possible to specify any environment variables to be exposed to the NVIDIA DOCA Driver container, such as the standard "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY".

 .. warning::
    CREATE_IFNAMES_UDEV is set automatically by the Network Operator, depending on the Operating System of the worker nodes in the cluster (the cluster is assumed to be homogenous).
diff --git a/docs/files/RHEL_Dockerfile b/docs/files/RHEL_Dockerfile
index 5b1d772..06e0da9 100644
--- a/docs/files/RHEL_Dockerfile
+++ b/docs/files/RHEL_Dockerfile
@@ -86,7 +86,7 @@ ARG D_KERNEL_VER
 ARG OFED_SRC_LOCAL_DIR

 RUN set -x && \
-# MOFED installation requirements
+# NVIDIA DOCA Driver installation requirements
     dnf install -y autoconf gcc make rpm-build

 # Build driver
@@ -123,7 +123,7 @@ RUN set -x && \
     ./mlnx-tools-*.rpm

 RUN set -x && \
-# MOFED functional requirements
+# NVIDIA DOCA Driver functional requirements
     dnf install -y pciutils hostname udev ethtool \
 # Container functional requirements
     jq iproute kmod procps-ng udev
diff --git a/docs/getting-started-kubernetes.rst b/docs/getting-started-kubernetes.rst
index 15ffab7..c891c96 100644
--- a/docs/getting-started-kubernetes.rst
+++ b/docs/getting-started-kubernetes.rst
@@ -581,7 +581,7 @@ Network Operator Deployment for GPUDirect Workloads

 GPUDirect requires the following:

-* MLNX_OFED v5.5-1.0.3.2 or newer
+* NVIDIA DOCA Driver v5.5-1.0.3.2 or newer
 * GPU Operator v1.9.0 or newer
 * NVIDIA GPU and driver supporting GPUDirect e.g Quadro RTX 6000/8000 or NVIDIA T4/NVIDIA V100/NVIDIA A100

@@ -962,7 +962,7 @@ Network Operator Deployment with an SR-IOV InfiniBand Network

 Network Operator deployment with InfiniBand network requires the following:

-* MLNX_OFED and OpenSM running. OpenSM runs on top of the MLNX_OFED stack, so both the driver and the subnet manager should come from the same installation. Note that partitions that are configured by OpenSM should specify defmember=full to enable the SR-IOV functionality over InfiniBand. For more details, please refer to `this article `.
+* NVIDIA DOCA Driver and OpenSM running. OpenSM runs on top of the NVIDIA DOCA Driver stack, so both the driver and the subnet manager should come from the same installation. Note that partitions that are configured by OpenSM should specify defmember=full to enable the SR-IOV functionality over InfiniBand. For more details, please refer to `this article `.
 * InfiniBand device – Both the host device and switch ports must be enabled in InfiniBand mode.
 * rdma-core package should be installed when an inbox driver is used.

@@ -1081,7 +1081,7 @@ Network Operator Deployment with an SR-IOV InfiniBand Network with PKey Manageme

 Network Operator deployment with InfiniBand network requires the following:

-* MLNX_OFED and OpenSM running. OpenSM runs on top of the MLNX_OFED stack, so both the driver and the subnet manager should come from the same installation. Note that partitions that are configured by OpenSM should specify defmember=full to enable the SR-IOV functionality over InfiniBand. For more details, please refer to `this article`_.
+* NVIDIA DOCA Driver and OpenSM running. OpenSM runs on top of the NVIDIA DOCA Driver stack, so both the driver and the subnet manager should come from the same installation. Note that partitions that are configured by OpenSM should specify defmember=full to enable the SR-IOV functionality over InfiniBand. For more details, please refer to `this article`_.
 * NVIDIA UFM running on top of OpenSM. For more details, please refer to `the project documentation`_.
 * InfiniBand device – Both the host device and the switch ports must be enabled in InfiniBand mode.
 * rdma-core package should be installed when an inbox driver is used.
@@ -1145,7 +1145,7 @@ Current limitations:
      ipamPlugin:
        deploy: true

-Wait for MLNX_OFED to install and apply the following CRs:
+Wait for NVIDIA DOCA Driver to install and apply the following CRs:

 ``sriov-ib-network-node-policy.yaml``

diff --git a/docs/getting-started-openshift.rst b/docs/getting-started-openshift.rst
index e235496..c557c9c 100644
--- a/docs/getting-started-openshift.rst
+++ b/docs/getting-started-openshift.rst
@@ -108,7 +108,7 @@ If you are planning to use SR-IOV, follow these `instructions
-    name: mofed-test-ctr
+    name: doca-test-ctr
     securityContext:
       capabilities:
         add: [ "IPC_LOCK" ]
diff --git a/docs/life-cycle-management.rst b/docs/life-cycle-management.rst
index 0434b21..fa7de4b 100644
--- a/docs/life-cycle-management.rst
+++ b/docs/life-cycle-management.rst
@@ -367,16 +367,16 @@ Troubleshooting
      - Required Action
    * - The node is in upgrade-failed state.
      - * Drain the node manually by running kubectl drain --ignore-daemonsets.
+       * Delete the NVIDIA DOCA Driver pod on the node manually, by running the following command: ``kubectl delete pod -n `kubectl get pods -A --field-selector spec.nodeName= -l nvidia.com/ofed-driver --no-headers | awk '{print $1 " "$2}'```.
        **NOTE:** If the "Safe driver loading" feature is enabled, you may also need to remove the ``nvidia.com/ofed-driver-upgrade.driver-wait-for-safe-load`` annotation from the node object to unblock the loading of the driver ``kubectl annotate node nvidia.com/ofed-driver-upgrade.driver-wait-for-safe-load-``
        * Wait for the node to complete the upgrade.
-   * - The updated MLNX_OFED pod failed to start/ a new version of MLNX_OFED cannot be installed on the node.
+   * - The updated NVIDIA DOCA Driver pod failed to start/ a new version of NVIDIA DOCA Driver cannot be installed on the node.
      - Manually delete the pod by using ``kubectl delete -n ``.
-       If following the restart the pod still fails, change the MLNX_OFED version in the NicClusterPolicy to the previous version or to another working version.
+       If following the restart the pod still fails, change the NVIDIA DOCA Driver version in the NicClusterPolicy to the previous version or to another working version.

 =================================
 Uninstalling the Network Operator
diff --git a/docs/platform-support.rst b/docs/platform-support.rst
index 60aa4d7..f63e45e 100644
--- a/docs/platform-support.rst
+++ b/docs/platform-support.rst
@@ -60,7 +60,7 @@ The following component versions are deployed by the Network Operator:
    * - Node Feature Discovery
      - |node-feature-discovery-version|
      - Optionally deployed. May already be present in the cluster with proper configuration.
-   * - NVIDIA MLNX_OFED driver container
+   * - NVIDIA DOCA Driver container
      - |mofed-version|
      -
    * - k8s-rdma-shared-device-plugin