diff --git a/docs/device-plugin.rst b/docs/device-plugin.rst
deleted file mode 100644
index 8f85fb0..0000000
--- a/docs/device-plugin.rst
+++ /dev/null
@@ -1,40 +0,0 @@
-.. license-header
-   SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-   SPDX-License-Identifier: Apache-2.0
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
-.. headings # #, * *, =, -, ^, "
-
-
-*************
-Device Plugin
-*************
-
-Kubernetes provides a device plugin framework that can be used to advertise system hardware resources to the Kubelet.
-More information about the device plugin framework can be found at https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/.
-
-This document presents configuration with the following device plugins:
-
-.. list-table::
-   :header-rows: 1
-
-   * - Device Plugin
-     - Project
-   * - SR-IOV network device plugin
-     - https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin
-   * - RDMA shared device plugin
-     - https://github.com/Mellanox/k8s-rdma-shared-dev-plugin
-
-- SR-IOV network device plugin - a device plugin for discovering and advertising the SR-IOV virtual functions (VFs) that are available on a Kubernetes host.
-- RDMA shared device plugin - a device plugin for sharing RDMA devices between PODs running on the same host.
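-For reference, below is a minimal sketch of an SR-IOV network device plugin ``config.json``. The resource prefix/name, device ID and driver selector are illustrative assumptions (``101e`` is a ConnectX-6 Dx VF device ID) and must be adjusted to the actual deployment:
-
-.. code-block:: json
-
-   {
-     "resourceList": [
-       {
-         "resourcePrefix": "nvidia.com",
-         "resourceName": "mlnx_sriov_netdevice",
-         "selectors": {
-           "vendors": ["15b3"],
-           "devices": ["101e"],
-           "drivers": ["mlx5_core"]
-         }
-       }
-     ]
-   }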
diff --git a/docs/index.rst b/docs/index.rst
index 41f2dec..3cdd535 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -30,33 +30,4 @@
    Life Cycle Management
    Advanced Configurations
 
-
-.. toctree::
-   :caption: Multi-network POD
-   :titlesonly:
-   :hidden:
-
-   Multi-network POD <multi-network-pod>
-
-.. toctree::
-   :caption: Device Plugin
-   :titlesonly:
-   :hidden:
-
-   Device Plugin <device-plugin>
-
-.. toctree::
-   :caption: K8s on Bare Metal - Ethernet
-   :titlesonly:
-   :hidden:
-
-   K8s on Bare Metal - Ethernet <k8s-baremetal-ethernet>
-
-.. toctree::
-   :caption: Kubernetes Performance Tuning
-   :titlesonly:
-   :hidden:
-
-   Kubernetes Performance Tuning <kubernetes-perfomance>
-
 .. include:: overview.rst
diff --git a/docs/k8s-baremetal-ethernet.rst b/docs/k8s-baremetal-ethernet.rst
deleted file mode 100644
index fc9018c..0000000
--- a/docs/k8s-baremetal-ethernet.rst
+++ /dev/null
@@ -1,339 +0,0 @@
-.. license-header
-   SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-   SPDX-License-Identifier: Apache-2.0
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
-.. headings # #, * *, =, -, ^, "
-
-
-****************************
-K8s on Bare Metal - Ethernet
-****************************
-
-.. contents:: On this page
-   :depth: 2
-   :local:
-   :backlinks: none
-
-
-This chapter describes Kubernetes solutions running on bare-metal hosts with NVIDIA's Ethernet NIC family.
-
-=============================
-Operating System Requirements
-=============================
-
-NVIDIA drivers should be installed as part of the operating system.
-
-========================
-Kubernetes Prerequisites
-========================
-
-Install Kubernetes Version 1.18 or newer. You may use the following references to install Kubernetes with deployment tools:
-
-- `Bootstrapping clusters with kubeadm <https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/>`_
-- `Installing Kubernetes with Kubespray <https://kubernetes.io/docs/setup/production-environment/tools/kubespray/>`_
-
-It is recommended to use Kubernetes Version 1.18 with the following features enabled. This ensures the best NUMA alignment between the NIC PCIe device and the CPU, and makes better use of SR-IOV performance:
-
-- `CPU Manager <https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/>`_ - with the static CPU manager policy
-- `Topology Manager <https://kubernetes.io/docs/tasks/administer-cluster/topology-manager/>`_ - with the single NUMA node policy
-
-Examples of how to configure the CPU and Topology managers can be found in the :doc:`Kubernetes Performance Tuning <kubernetes-perfomance>` section.
-
-==========================================
-Enabling SR-IOV Networking with Kubernetes
-==========================================
-
-This chapter describes the setup and configuration procedures of legacy SR-IOV with the SR-IOV Device Plugin and SR-IOV CNI.
-
-Single Root IO Virtualization (SR-IOV) is a technology that allows a physical PCIe device to present itself multiple times through the PCIe bus. This technology enables multiple virtual instances of the device with separate resources. These virtual functions can then be provisioned separately. Each VF is an additional device connected to the Physical Function. It shares the same resources with the Physical Function, and its number of ports equals that of the Physical Function.
-
-The following diagram represents the POD networking interfaces with legacy SR-IOV as a secondary network.
-
-.. image:: images/sriovpod.png
-
-----------------------------------------------
-Supported Network Interface Cards and Firmware
-----------------------------------------------
-
-NVIDIA Networking supports the following Network Interface Cards and their corresponding firmware versions in Kubernetes:
-
-.. list-table::
-   :header-rows: 1
-
-   * - Network Interface Card
-     - Firmware Version
-   * - ConnectX®-6 Dx
-     - 22.28.2006
-   * - ConnectX®-6
-     - 20.28.2006
-   * - ConnectX®-5
-     - 16.28.2006
-
-------------------------------------------------
-Enabling SR-IOV Virtual Functions in Legacy Mode
-------------------------------------------------
-
-SR-IOV legacy mode supports standard network devices and RDMA over Converged Ethernet (RoCE)-enabled network devices.
-To enable SR-IOV virtual functions in legacy mode, follow the instructions detailed in this `link `_.
-
---------------------
-RoCE Namespace Aware
---------------------
-
-Prior to Kernel Version 5.3.0, all RDMA devices were visible in all network namespaces.
-Kernel Version 5.3.0 and NVIDIA OFED Version 4.7 introduced network namespace isolation of RDMA devices.
-When the RDMA system is set to exclusive, this feature ensures that an RDMA device is bound to a particular network namespace and visible only to it.
-To learn how to enable RoCE namespace awareness by using the RDMA CNI, see `here `_.
-
-1. Set the RDMA system to "exclusive". This should be done at the host preparation stage:
-
-.. code-block:: bash
-
-   rdma system set netns exclusive
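-You can verify the setting before scheduling RDMA workloads (a minimal sketch; the exact output depends on the iproute2 version):
-
-.. code-block:: bash
-
-   rdma system show
-   # expected to report: netns exclusive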
-2. Deploy the RDMA CNI:
-
-.. code-block:: bash
-
-   kubectl apply -f https://raw.githubusercontent.com/Mellanox/rdma-cni/v1.0.0/deployment/rdma-cni-daemonset.yaml
-
-3. Update the SR-IOV network CRD with the RDMA CNI as a chained plugin:
-
-.. code-block:: yaml
-
-   apiVersion: "k8s.cni.cncf.io/v1"
-   kind: NetworkAttachmentDefinition
-   metadata:
-     name: sriov-net
-     annotations:
-       k8s.v1.cni.cncf.io/resourceName: nvidia.com/mlnx_sriov_netdevice
-   spec:
-     config: '{
-       "cniVersion": "0.3.1",
-       "name": "sriov-network",
-       "plugins": [
-         {
-           "type": "sriov",
-           "ipam": {
-             "type": "host-local",
-             "subnet": "10.56.217.0/24",
-             "routes": [
-               {
-                 "dst": "0.0.0.0/0"
-               }
-             ],
-             "gateway": "10.56.217.1"
-           }
-         },
-         {
-           "type": "rdma"
-         }
-       ]
-     }'
-
-.. _Creating SR-IOV with RoCE POD:
-
-Example of a RoCE-enabled POD with an SR-IOV resource:
-
-.. code-block:: yaml
-
-   apiVersion: v1
-   kind: Pod
-   metadata:
-     name: testpod1
-     annotations:
-       k8s.v1.cni.cncf.io/networks: sriov-net
-   spec:
-     containers:
-     - name: appcntr1
-       image: <image>
-       imagePullPolicy: IfNotPresent
-       securityContext:
-         capabilities:
-           add: ["IPC_LOCK"]
-       command: [ "/bin/bash", "-c", "--" ]
-       args: [ "while true; do sleep 300000; done;" ]
-       resources:
-         requests:
-           nvidia.com/mlnx_sriov_netdevice: '1'
-         limits:
-           nvidia.com/mlnx_sriov_netdevice: '1'
-
-The ``<image>`` should contain RDMA user space libraries (e.g. rdma-core) that are compatible with the host kernel.
-
-Deploy the SR-IOV RoCE POD:
-
-.. code-block:: bash
-
-   kubectl create -f sriov-roce-pod.yaml
-
----------------------------------
-RoCE with Connection Manager (CM)
----------------------------------
-
-Some RDMA applications use RDMA CM to establish connections across the network.
-Due to a kernel limitation, NVIDIA NICs require pre-allocated MACs for all VFs in the deployment if an RDMA workload wishes to utilize RDMA CM to establish connections.
-
-To do that, run:
-
-.. code-block:: bash
-
-   ip link set <pf_netdev> vf <vf_index> mac <mac_address>
-   echo <vf_pci_address> > /sys/bus/pci/drivers/mlx5_core/unbind
-   echo <vf_pci_address> > /sys/bus/pci/drivers/mlx5_core/bind
-
-This populates the VF's node and port GUIDs, which are required for RDMA CM to establish connections.
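-These per-VF commands can be scripted. Below is a minimal sketch that assigns locally administered MACs to every VF of a PF and rebinds them; the PF name ``ens1f0`` and the MAC pattern are assumptions to adapt per node:
-
-.. code-block:: bash
-
-   PF=ens1f0  # assumed PF netdev name
-   NUM_VFS=$(cat /sys/class/net/${PF}/device/sriov_numvfs)
-   for i in $(seq 0 $((NUM_VFS - 1))); do
-       # Locally administered, deterministic per-VF MAC (illustrative pattern)
-       ip link set ${PF} vf ${i} mac $(printf '02:00:00:00:00:%02x' "${i}")
-       # Rebind the VF so mlx5_core picks up the MAC and regenerates node/port GUIDs
-       VF_PCI=$(basename $(readlink /sys/class/net/${PF}/device/virtfn${i}))
-       echo ${VF_PCI} > /sys/bus/pci/drivers/mlx5_core/unbind
-       echo ${VF_PCI} > /sys/bus/pci/drivers/mlx5_core/bind
-   done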
--------------------
-RoCE with GPUDirect
--------------------
-
-GPUDirect allows network adapters and storage drives to directly read and write to/from GPU memory, thereby eliminating unnecessary memory copies, decreasing CPU overhead and reducing latency. These actions result in significant performance improvements.
-
-GPUDirect requires the following:
-
-- MOFED 5.5-1.0.3.2 or above
-- The `nvidia-peermem` kernel module, loaded by GPU Operator v1.9.0
-- An NVIDIA GPU and driver supporting GPUDirect, e.g. Quadro RTX 6000/8000 or Tesla T4/Tesla V100/Tesla A100
-
-The RoCE POD should be deployed as described in `Creating SR-IOV with RoCE POD`_.
-
-----
-DPDK
-----
-
-SR-IOV DPDK support is configured similarly to the SR-IOV (legacy) configuration. This section describes the differences.
-
-1. Create the `sriov-dpdk-pod.yaml` file:
-
-.. code-block:: yaml
-
-   apiVersion: v1
-   kind: Pod
-   metadata:
-     name: testpod1
-     annotations:
-       k8s.v1.cni.cncf.io/networks: sriov-net
-   spec:
-     containers:
-     - name: appcntr1
-       image: <image>
-       imagePullPolicy: IfNotPresent
-       securityContext:
-         capabilities:
-           add: ["IPC_LOCK"]
-       volumeMounts:
-       - mountPath: /dev/hugepages
-         name: hugepage
-       command: [ "/bin/bash", "-c", "--" ]
-       args: [ "while true; do sleep 300000; done;" ]
-       resources:
-         requests:
-           memory: 1Gi
-           hugepages-1Gi: 2Gi
-           nvidia.com/mlnx_sriov_netdevice: '1'
-         limits:
-           hugepages-1Gi: 2Gi
-           nvidia.com/mlnx_sriov_netdevice: '1'
-     volumes:
-     - name: hugepage
-       emptyDir:
-         medium: HugePages
-
-The ``<image>`` should contain DPDK and RDMA user space libraries (e.g. rdma-core) that are compatible with the host kernel and with each other.
-
-- CRI-O Version 1.17 and above requires adding `NET_RAW` to the capabilities (for other runtimes, `NET_RAW` is the default).
-- For DPDK to work with physical addresses (PA) on Linux >= 4.0, add `SYS_ADMIN` to the capabilities.
-- DPDK applications that configure the device, such as MTU, MAC and link state, require adding `NET_ADMIN`.
-
-Deploy the SR-IOV DPDK POD:
-
-.. code-block:: bash
-
-   kubectl create -f sriov-dpdk-pod.yaml
-
-===========
-OVS Offload
-===========
-
-The ASAP2 solution combines the performance and efficiency of server/storage networking hardware with the flexibility of virtual switching software. ASAP2 offers up to 10 times better performance than non-offloaded OVS solutions, delivering software-defined networks with the highest total infrastructure efficiency, deployment flexibility and operational simplicity. Starting from NVIDIA® ConnectX®-5 NICs, NVIDIA supports accelerated virtual switching in server NIC hardware through the ASAP2 feature. While accelerating the data plane, ASAP2 keeps the SDN control plane intact, thus staying completely transparent to applications, maintaining flexibility and ease of deployment.
-
---------------------------------
-OVN Kubernetes CNI with ConnectX
---------------------------------
-
-To enable the OVN Kubernetes CNI with ConnectX, see `OVN Kubernetes CNI with OVS offload `_.
-
-------
-Antrea
-------
-
-For Antrea CNI configuration instructions, see `Antrea CNI with OVS Offload `_.
-
-================
-RoCE Shared Mode
-================
-
-RoCE shared mode allows RDMA devices to be shared between PODs on the same host. This configuration can work with the macvlan or ipvlan CNI.
-
------------------------
-Kubernetes Prerequisite
------------------------
-
-Install Kubernetes Version 1.16 or above. You may use the following references when installing Kubernetes with deployment tools:
-
-- `Bootstrapping Clusters with Kubeadm <https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/>`_
-- `Installing Kubernetes with Kubespray <https://kubernetes.io/docs/setup/production-environment/tools/kubespray/>`_
-
-----------------------------------
-Deploying the Shared Device Plugin
-----------------------------------
-
-Create the `rdma-shared.yaml` ConfigMap for the shared device plugin (the upstream DaemonSet expects it to be named `rdma-devices` in the `kube-system` namespace):
-
-.. code-block:: yaml
-
-   apiVersion: v1
-   kind: ConfigMap
-   metadata:
-     name: rdma-devices
-     namespace: kube-system
-   data:
-     config.json: |
-       {
-         "configList": [
-           {
-             "resourceName": "roce_shared_devices",
-             "rdmaHcaMax": 1000,
-             "selectors": {
-               "vendors": ["15b3"],
-               "deviceIDs": ["1017"]
-             }
-           }
-         ]
-       }
-
-.. code-block:: bash
-
-   kubectl create -f rdma-shared.yaml
-   kubectl create -f https://raw.githubusercontent.com/Mellanox/k8s-rdma-shared-dev-plugin/master/images/k8s-rdma-shared-dev-plugin-ds.yaml
-
-For advanced macvlan CNI configuration, see the following `instructions `_. A minimal network attachment sketch follows the list below.
-
-Supported IPAM (IP Address Management) plugins:
-
-- host-local
-- dhcp
-- static
-- whereabouts
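-For reference, below is a minimal macvlan network attachment sketch using the `whereabouts` IPAM plugin; the network name, master interface `ens1f0` and address range are illustrative assumptions:
-
-.. code-block:: yaml
-
-   apiVersion: "k8s.cni.cncf.io/v1"
-   kind: NetworkAttachmentDefinition
-   metadata:
-     name: roce-shared-net
-   spec:
-     config: '{
-       "cniVersion": "0.3.1",
-       "type": "macvlan",
-       "master": "ens1f0",
-       "ipam": {
-         "type": "whereabouts",
-         "range": "192.168.2.0/24"
-       }
-     }'
-
-A POD can then reference this network through the `k8s.v1.cni.cncf.io/networks: roce-shared-net` annotation while requesting the shared RDMA resource (e.g. `rdma/roce_shared_devices`, assuming the plugin's default `rdma` resource prefix).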
diff --git a/docs/kubernetes-perfomance.rst b/docs/kubernetes-perfomance.rst
deleted file mode 100644
index 1eee8e7..0000000
--- a/docs/kubernetes-perfomance.rst
+++ /dev/null
@@ -1,58 +0,0 @@
-.. license-header
-   SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-   SPDX-License-Identifier: Apache-2.0
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
-.. headings # #, * *, =, -, ^, "
-
-
-*****************************
-Kubernetes Performance Tuning
-*****************************
-
-This section provides a configuration example of Kubernetes performance tuning for SR-IOV.
-
-The machine in this example includes the following CPUs:
-
-.. code-block::
-
-   numactl --hardware
-   available: 2 nodes (0-1)
-   node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
-   node 0 size: 31990 MB
-   node 0 free: 25314 MB
-   node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
-   node 1 size: 32237 MB
-   node 1 free: 27135 MB
-   node distances:
-   node   0   1
-     0:  10  21
-     1:  21  10
-
-Three CPUs (6-8) are reserved on NUMA node 1 for the system and for Kubernetes. Edit `/var/lib/kubelet/config.yaml`:
-
-.. code-block:: yaml
-
-   cpuManagerPolicy: static
-   reservedSystemCPUs: 6-8
-   topologyManagerPolicy: single-numa-node
-
-The `isolcpus` kernel boot parameter isolates the listed CPUs from the kernel scheduler. This ensures that user-space processes are not scheduled by the kernel onto these CPUs. In this example, CPUs 0-5 and 9-23 are isolated:
-
-.. code-block::
-
-   isolcpus=0,1,2,3,4,5,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
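-A common way to apply this parameter persistently is through the GRUB configuration (a sketch assuming a GRUB-based distribution; file paths and the regeneration command vary between distributions):
-
-.. code-block:: bash
-
-   # /etc/default/grub -- append isolcpus to the existing kernel command line
-   GRUB_CMDLINE_LINUX="... isolcpus=0-5,9-23"
-
-   # Regenerate the GRUB configuration and reboot for the change to take effect
-   grub2-mkconfig -o /boot/grub2/grub.cfg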
-.. warning::
-   If `reservedSystemCPUs` or `cpuManagerPolicy` is changed, the `/var/lib/kubelet/cpu_manager_state` file should be deleted and the `kubelet` restarted.
diff --git a/docs/multi-network-pod.rst b/docs/multi-network-pod.rst
deleted file mode 100644
index d078ae5..0000000
--- a/docs/multi-network-pod.rst
+++ /dev/null
@@ -1,46 +0,0 @@
-.. license-header
-   SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-   SPDX-License-Identifier: Apache-2.0
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
-.. headings # #, * *, =, -, ^, "
-
-
-*****************
-Multi Network POD
-*****************
-
-By default, Kubernetes allows for a single network (the primary network) to be connected to a POD.
-Kubernetes `network attachment definition custom resources `_ enhance this capability and allow users to attach multiple networks to a POD: a primary network, which carries all Kubernetes services, and one or more secondary networks, which are typically used for high performance.
-The cluster network CNI plugin (primary plugin) satisfies Kubernetes' networking requirements.
-
-Below is a list of well-known cluster network CNI providers:
-
-.. list-table::
-   :header-rows: 1
-
-   * - CNI Provider
-     - Project
-   * - Calico
-     - https://github.com/projectcalico/calico
-   * - Flannel
-     - https://github.com/flannel-io/flannel
-   * - Canal
-     - https://github.com/projectcalico/canal
-   * - ovn-kubernetes
-     - https://github.com/ovn-org/ovn-kubernetes
-
-The `Multus CNI plugin `_ enables attaching multiple network interfaces to PODs. Multus acts as a "meta-plugin": a CNI plugin that can call multiple other CNI plugins.
-
-.. image:: images/multus.png
\ No newline at end of file
diff --git a/docs/release-notes.rst b/docs/release-notes.rst
index 602ba62..c7ff289 100644
--- a/docs/release-notes.rst
+++ b/docs/release-notes.rst
@@ -38,8 +38,9 @@ Changes and New Features
      - Description
    * - 24.4.0
      - | - Added support for OpenShift Container Platform v4.15.
-       | - Added support for Ubuntu 22.04 with Upstream K8s on ARM platforms (NVIDIA IGX Orin) as a GA feature.
-       | - Added support for Ubuntu 22.04 with Upstream K8s on ARM platforms (NVIDIA Grace) as a Tech Preview feature.
+       | - Added support for Ubuntu 22.04.
+       | - Added support for NVIDIA Grace-based ARM platforms with Ubuntu 22.04 and Upstream K8s as a Tech Preview feature.
+       | - Added support for NVIDIA IGX Orin-based ARM platforms with Ubuntu 22.04 and Upstream K8s as a GA feature.
        | - Added support for Precompiled DOCA Driver containers for Ubuntu 22.04.
        | - Added support for Switchdev SR-IOV mode with SR-IOV Network Operator and OVS CNI as a Tech Preview feature.
        | - Added support for DOCA Telemetry Service (DTS) integration to expose network telemetry and NIC metrics in K8s.