This document describes how to run Kube-OVN with OVS-DPDK.
- Kubernetes >= 1.11
- Docker >= 1.12.6
- OS: CentOS 7.5/7.6/7.7, Ubuntu 16.04/18.04
- 1GB Hugepages on the host
- On the host, modify the file /etc/default/grub
- Append the following to the setting GRUB_CMDLINE_LINUX:
default_hugepagesz=1GB hugepagesz=1G hugepages=X
Where X is the number of 1GB hugepages you wish to create on your system. Your use cases determine how many hugepages are required, and the available system memory determines the maximum possible.
- Update Grub:
- On legacy boot systems run:
grub2-mkconfig -o /boot/grub2/grub.cfg
- On EFI boot systems run:
grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
NOTE: This file path is an example from a CentOS system; it will differ on other distributions.
- Reboot the system
- To confirm that the hugepages are configured, run:
grep Huge /proc/meminfo
Example Output:
AnonHugePages: 2105344 kB
HugePages_Total: 32
HugePages_Free: 30
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
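The check above can be condensed into a single line of output. The following is a minimal sketch, not part of Kube-OVN: the function name `hugepage_summary` is hypothetical, and it assumes the `/proc/meminfo` field names shown in the example output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize the hugepage counters from a meminfo-style file.
# Defaults to /proc/meminfo; a different file can be passed for testing.
hugepage_summary() {
  local meminfo=${1:-/proc/meminfo}
  awk '
    /^HugePages_Total:/ { total = $2 }     # number of hugepages in the pool
    /^HugePages_Free:/  { free = $2 }      # hugepages not yet allocated
    /^Hugepagesize:/    { size_kb = $2 }   # size of one hugepage in kB
    END {
      # 1048576 kB = 1 GiB, so this converts the pool size to GiB
      printf "total=%d free=%d size_kB=%d total_GiB=%d\n",
             total, free, size_kb, total * size_kb / 1048576
    }
  ' "$meminfo"
}

# Usage on a configured host:
#   hugepage_summary
```

If `total` is 0 or `size_kB` is not 1048576 after the reboot, the kernel command line was most likely not updated.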
- Download the installation script:
wget https://raw.githubusercontent.com/alauda/kube-ovn/release-1.2/dist/images/install.sh
- Use vim (or another editor) to edit the script variables to meet your requirements:
REGISTRY="index.alauda.cn/alaudak8s"
NAMESPACE="kube-system" # The ns to deploy kube-ovn
POD_CIDR="10.16.0.0/16" # Do NOT overlap with NODE/SVC/JOIN CIDR
SVC_CIDR="10.96.0.0/12" # Do NOT overlap with NODE/POD/JOIN CIDR
JOIN_CIDR="100.64.0.0/16" # Do NOT overlap with NODE/POD/SVC CIDR
LABEL="node-role.kubernetes.io/master" # The node label to deploy OVN DB
IFACE="" # The NIC used for the container network; if empty, the NIC of the default route is used
VERSION="v1.1.0"
- Run the installation script, making sure to include the flag --with-dpdk= followed by the required DPDK version.
bash install.sh --with-dpdk=19.11
Note: The currently supported version is DPDK 19.11.
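The script variables require that POD_CIDR, SVC_CIDR and JOIN_CIDR do not overlap. As a rough pre-flight check, the sketch below compares IPv4 CIDR blocks pairwise before the installer is run. The function names `ip_to_int` and `cidrs_overlap` are hypothetical helpers, not part of install.sh, and the sketch assumes well-formed IPv4 CIDRs without leading zeros. It does not know your node network, so that range has to be added to the list by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  local -a o
  read -r -a o <<< "$1"
  echo $(( (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3] ))
}

# Hypothetical helper: exit 0 when two IPv4 CIDR blocks share an address.
# Two blocks overlap exactly when they agree on the shorter of the two prefixes.
cidrs_overlap() {
  local a_ip=${1%/*} a_len=${1#*/} b_ip=${2%/*} b_len=${2#*/}
  local len=$(( a_len < b_len ? a_len : b_len ))
  local mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$a_ip") & mask ) == ( $(ip_to_int "$b_ip") & mask ) ))
}

# Values copied from the script variables above.
POD_CIDR="10.16.0.0/16"
SVC_CIDR="10.96.0.0/12"
JOIN_CIDR="100.64.0.0/16"

for pair in "$POD_CIDR $SVC_CIDR" "$POD_CIDR $JOIN_CIDR" "$SVC_CIDR $JOIN_CIDR"; do
  read -r a b <<< "$pair"
  if cidrs_overlap "$a" "$b"; then
    echo "overlap: $a and $b"
  fi
done
```

With the default values the loop prints nothing, because none of the three ranges overlap.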
The DPDK-enabled vhost-user sockets provided by OVS-DPDK are not suitable for use as the default network of a Kubernetes pod. We must retain the OVS (kernel) interface provided by Kube-OVN, and the DPDK socket(s) must be requested as additional interface(s).
To attach multiple network interfaces to a pod we can use the Multus-CNI plugin. To install Multus, follow the Multus quick start guide. During installation, Multus should detect that Kube-OVN has already been installed as the default Kubernetes network plugin and will automatically configure itself so that Kube-OVN continues to be the default network plugin for all pods.
Note: Multus determines the existing default network as the lexicographically (alphabetically) first configuration file in the /etc/cni/net.d directory. If another plugin has the lexicographically first config file at this location, it will be considered the default network. Rename configuration files accordingly before Multus installation.
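To see which configuration file Multus would treat as the default network, the directory can be listed in lexicographic order before installing Multus. This is a minimal sketch: the function name `first_cni_conf` is hypothetical, and it assumes that only files ending in .conf or .conflist at the top level of the directory are considered.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lexicographically first CNI config file.
# Assumption: only top-level *.conf and *.conflist files count as CNI configs.
first_cni_conf() {
  local dir=${1:-/etc/cni/net.d}
  find "$dir" -maxdepth 1 -type f \( -name '*.conf' -o -name '*.conflist' \) \
    | LC_ALL=C sort \
    | head -n 1
}

# Usage on a node, before installing Multus:
#   first_cni_conf
```

If the printed file does not belong to Kube-OVN, rename the configuration files so that the Kube-OVN one sorts first.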
With Multus installed, additional network interfaces can now be requested within a pod spec.
There is now a containerized instance of OVS-DPDK running on the node. Kube-OVN can provide all of its regular (kernel) functionality. Multus is in place to enable pods to request the additional OVS-DPDK interfaces. However, OVS-DPDK does not provide regular netdev interfaces, but vhost-user sockets. These sockets cannot be attached to a pod in the usual manner, where the netdev is moved to the pod network namespace; they must be mounted into the pod instead. Kube-OVN (at least currently) does not have this socket-mounting ability. For this functionality we can use the Userspace CNI Network Plugin.
Note: These steps assume Go has already been installed, and the GOPATH env var has been set.
go get github.com/intel/userspace-cni-network-plugin
cd $GOPATH/src/github.com/intel/userspace-cni-network-plugin
make clean
make install
make
cp userspace/userspace /opt/cni/bin
A NetworkAttachmentDefinition is used to represent the network attachments. In this case we need a NAD to represent the network interfaces provided by Userspace CNI, i.e. the OVS-DPDK interfaces. It will then be possible to request this network attachment within a pod spec and Multus will attach these to the pod as secondary interfaces in addition to the preconfigured default network, i.e. the Kube-OVN provided OVS (Kernel) interfaces.
Create the NetworkAttachmentDefinition
cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ovs-dpdk-br0
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "userspace",
      "name": "ovs-dpdk-br0",
      "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
      "logFile": "/var/log/userspace-ovs-dpdk-br0.log",
      "logLevel": "debug",
      "host": {
        "engine": "ovs-dpdk",
        "iftype": "vhostuser",
        "netType": "bridge",
        "vhost": {
          "mode": "server"
        },
        "bridge": {
          "bridgeName": "br0"
        }
      },
      "container": {
        "engine": "ovs-dpdk",
        "iftype": "vhostuser",
        "netType": "interface",
        "vhost": {
          "mode": "client"
        }
      }
    }
EOF
It should now be possible to request the Userspace CNI provided interfaces as annotations within a pod spec. The example below requests two OVS-DPDK interfaces; these are in addition to the default network.
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: ovs-dpdk-br0, ovs-dpdk-br0
Userspace-CNI is intended to run in an environment where OVS-DPDK is installed directly on the host, rather than in a container. Userspace-CNI makes calls to OVS-DPDK using an application called ovs-vsctl. With a containerized OVS-DPDK, this application is no longer available on the host. The following is a workaround to take ovs-vsctl calls made from the host and direct them to the appropriate Kube-OVN container running OVS-DPDK.
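cat <<'EOF' > /usr/local/bin/ovs-vsctl
#!/bin/bash
# Find the Kube-OVN container that runs OVS-DPDK (skip the pause container).
ovsCont=$(docker ps | grep kube-ovn | grep ovs-ovn | grep -v pause | awk '{print $1}')
# Forward all arguments unchanged; quoting preserves arguments that contain spaces.
docker exec "$ovsCont" ovs-vsctl "$@"
EOF
chmod +x /usr/local/bin/ovs-vsctl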
CPU masking is not necessary, but some advanced users may wish to use this feature in OVS-DPDK. When starting OVS-DPDK, ovs-vsctl can be used to configure a CPU mask. This should be combined with something like CPU-Manager-for-Kubernetes (CMK). Configuring such a setup is complex and specific to each system, so it is out of the scope of this document. Please consult the OVS-DPDK and CMK documentation.
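For readers unfamiliar with CPU masks: a mask is a hexadecimal number in which bit N is set when core N is selected. The sketch below only illustrates that arithmetic; the function name `cpu_mask` is hypothetical, and which cores to select remains specific to your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a hexadecimal CPU mask from a list of core numbers.
# Bit N of the mask is set for every core N given as an argument.
# Bash integers are 64-bit, so this sketch is only valid for cores 0 to 62.
cpu_mask() {
  local mask=0 core
  for core in "$@"; do
    mask=$(( mask | (1 << core) ))
  done
  printf '0x%X\n' "$mask"
}

# Usage: select cores 2 and 3.
#   cpu_mask 2 3
# The result could then be passed to OVS, for example:
#   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask="$(cpu_mask 2 3)"
```

Cores 2 and 3 give the mask 0xC, which is the value used for the -c option of the TestPMD command at the end of this document.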
The following example runs a Kubernetes pod from a DPDK-enabled Docker image.
Create the Dockerfile and name it Dockerfile.dpdk:
FROM centos:8
ENV DPDK_VERSION=19.11.1
ENV DPDK_TARGET=x86_64-native-linuxapp-gcc
ENV DPDK_DIR=/usr/src/dpdk-stable-${DPDK_VERSION}
RUN dnf groupinstall -y 'Development Tools'
RUN dnf install -y wget numactl-devel
RUN cd /usr/src/ && \
    wget http://fast.dpdk.org/rel/dpdk-${DPDK_VERSION}.tar.xz && \
    tar xf dpdk-${DPDK_VERSION}.tar.xz && \
    rm -f dpdk-${DPDK_VERSION}.tar.xz && \
    cd ${DPDK_DIR} && \
    sed -i s/CONFIG_RTE_EAL_IGB_UIO=y/CONFIG_RTE_EAL_IGB_UIO=n/ config/common_linux && \
    sed -i s/CONFIG_RTE_LIBRTE_KNI=y/CONFIG_RTE_LIBRTE_KNI=n/ config/common_linux && \
    sed -i s/CONFIG_RTE_KNI_KMOD=y/CONFIG_RTE_KNI_KMOD=n/ config/common_linux && \
    make install T=${DPDK_TARGET} DESTDIR=install
Build the Docker image and tag it as dpdk:19.11. This build will take some time.
docker build -t dpdk:19.11 -f Dockerfile.dpdk .
Create the pod spec and name it pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  generateName: testpmd-dpdk-
  annotations:
    k8s.v1.cni.cncf.io/networks: ovs-dpdk-br0, ovs-dpdk-br0
spec:
  tolerations:
  - operator: "Exists"
    key: cmk
  containers:
  - name: testpmd-dpdk
    image: dpdk:19.11
    resources:
      requests:
        hugepages-1Gi: 2Gi
        memory: 2Gi
      limits:
        hugepages-1Gi: 2Gi
        memory: 2Gi
    command: ["tail", "-f", "/dev/null"]
    volumeMounts:
    - mountPath: /hugepages
      name: hugepages
    - mountPath: /vhu
      name: vhu
    securityContext:
      privileged: true
      runAsUser: 0
  volumes:
  - name: hugepages
    emptyDir:
      medium: HugePages
  - name: vhu
    hostPath:
      path: /var/run/openvswitch
  securityContext:
    runAsUser: 0
  restartPolicy: Never
Run the pod.
kubectl create -f pod.yaml
The pod will be created with a kernel OVS interface, provided by Kube-OVN, as the default network. In addition, two secondary interfaces will be available within the pod as socket files located under /vhu/.
To run TestPMD:
testpmd -m 1024 -c 0xC --file-prefix=testpmd_ --vdev=net_virtio_user0,path=<path-to-socket-file1> --vdev=net_virtio_user1,path=<path-to-socket-file2> --no-pci -- --no-lsc-interrupt --auto-start --tx-first --stats-period 1
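The two --vdev arguments in the command above follow a fixed pattern: one net_virtio_user device per socket file, numbered from 0. The sketch below generates them from a list of socket paths. The function name `build_vdev_args` is hypothetical, and the socket paths must be supplied by hand, because /vhu is a mount of /var/run/openvswitch and may contain other sockets that are not vhost-user interfaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one --vdev argument per socket path.
# Devices are numbered net_virtio_user0, net_virtio_user1, ... in argument order.
build_vdev_args() {
  local i=0 sock
  local -a args=()
  for sock in "$@"; do
    args+=("--vdev=net_virtio_user${i},path=${sock}")
    i=$((i + 1))
  done
  printf '%s\n' "${args[*]}"
}

# Usage inside the pod, with the two socket paths filled in:
#   testpmd -m 1024 -c 0xC --file-prefix=testpmd_ \
#     $(build_vdev_args <path-to-socket-file1> <path-to-socket-file2>) \
#     --no-pci -- --no-lsc-interrupt --auto-start --tx-first --stats-period 1
```

The command substitution is left unquoted on purpose so that each --vdev argument is passed to testpmd as a separate word; this assumes the socket paths contain no spaces.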