DPDK (Data Plane Development Kit) is a set of open-source high-performance packet processing libraries and user space drivers.

oVirt support for DPDK was introduced in 2017, and is now enhanced in terms of deployment via Ansible and usage via Open Virtual Network.

While still experimental, OVN-DPDK in oVirt is now available in version 4.2.

What's new?

Ansible DPDK host setup

Host configuration for DPDK usage is now automated using Ansible. This primarly includes:

  • Hugepages configuration – hugepage size and quantity in the kernel.
  • CPU partitioning.
  • Binding NICs to userspace drivers.
  • OVS-DPDK related configuration (initialization, socket memory, pmd thread core binding, etc).

The role is installed via Ansible galaxy:

# ansible-galaxy install oVirt.dpdk-setup

An example playbook:

- hosts: dpdk_host_0
  vars:
    pci_drivers:
      "0000:02:00.1": "vfio-pci"
      "0000:02:00.2": "igb"
      "0000:02:00.3": ""
    configure_kernel: true
    bind_drivers: true
    set_ovs_dpdk: false
  roles:
    - ovirt-ansible-dpdk-setup

The role is controlled by 3 boolean variables (all set to true by default) and a dictionary of devices and their drivers:

  • configure_kernel – determines whether the kernel should be configured for DPDK usage (hugepages, CPU partitioning). WARNING: When set to true it is very likely to trigger a reboot of the host, unless all required configuration is already pre-set.
  • bind_drivers – determines whether the role should bind devices to their specified drivers.
  • set_ovs_dpdk – determines whether the role should initialize and configure OVS for DPDK usage.
  • pci_drivers – a dictionary of the PCI addresses of network interface cards as keys and drivers as values. An empty string represents the default Linux driver for the specified device. Non-DPDK compatible drivers may be set; However, NICs with such drivers will not be configured for OVS-DPDK usage and will not have their CPU isolated in multiple NUMA environments. PCI addresses can be retrieved via lshw -class network -businfo.

Additionally, there are several optional performance related variables:

  • pmd_threads_count – determines the number of PMD threads per NIC (default: 1).
  • nr_2mb_hugepages – determines the number of 2MB hugepages, if 2MB hugepages are to be used (default: 1024).
  • nr_1gb_hugepages – determines the number of 1GB hugepages, if 1GB hugepages are to be used (default: 4).
  • use_1gb_hugepages – determines whether 1GB hugepages should be used, where 1GB hugepages are supported (default: true).

For additional information, refer to the role's repository.

OVN Integration

DPDK in oVirt now leverages the capabilities of OVN in the form of an OVN localnet. This allows seamless connectivity across OVN networks, benefiting from OVN's software defined routers, switches, security groups, and ACL's.

For more information about OVN localnet integration in oVirt, refer to oVirt's Provider Physical Network RFE.

Usage in oVirt

  1. Install oVirt.dpdk-setup Ansible role via ansible-galaxy:

     # ansible-galaxy install oVirt.dpdk-setup
    
  2. Execute the role as described above.

  3. Unless the host was already configured, the host will restart for kernel changes to be applied. After reboot, the following values are changed (if they are not, you may have incompatible hardware):

    • IOMMU: /sys/devices/virtual/iommu/<DMAR device>/devices/<PCI address> should exist. DMAR devices are typically annotated by dmar0, dmar1, and so forth.

    • hugepages: grep Huge /proc/meminfo should reflect the desired state as defined in the Ansible playbook (i.e. Hugepagesize and HugePages_Total)

    • CPU partitioning: based on the devices that are to be used with a DPDK compatible driver, CPUs should be partitioned separately between each NUMA node. Refer to lscpu to see live CPU NUMA separation information, e.g.: NUMA node1 CPU(s): 8-15

      Note: NICs using userspace drivers do not appear in the kernel; ip commands won't list devices using userspace drivers. in oVirt, such devices appear as dpdk0, dpdk1, and so forth.

  4. Create an oVirt network phys-net-demo on your data center and attach it to your cluster.

  5. Attach phys-net-demo to DPDK NICs accross your cluster.

  6. Create an external OVN network ext-net-demo, setting phys-net-demo as its physical network. This configuration between ext-net-demo and phys-net-demo will enable traffic between phys-net-demo (which DPDK's physical port is part of) and the rest of the ports present in OVN's ext-net-demo network.

  7. ext-net-demo vNic's may now be added to virtual machines. These vNics now share L2 connectivity with other remote ports in ext-net-demo via DPDK's NIC (and other local ports internally).

  8. Temporarily, SElinux needs to be set to permissive. This is due to an open bug where SElinux blocks the creation of a QEMU socket. Once the bug is resolved, this step should be skipped.

  9. Host to VM packets are transmitted and received on buffers allocated on shared hugepages memory. As such, virtual machines using DPDK based NIC's need to enable hugepage sharing; this is done by running VM's with custom properties:

    • Review engine custom properties:
      engine-config -g UserDefinedVMProperties
    • Add engine custom properties for hugepages support:
      engine-config -s "UserDefinedVMProperties=hugepages=^.*$;hugepages_shared=(true|false)
    • Restart engine for changes to take effect:
      systemctl restart ovirt-engine
    • In the VM run once menu dialog, add the following custom properties:
      hugepages_sharedtrue.
      hugepages – the number of hugepages to share.