Overview

In a data center environment, there are multiple ways to orchestrate resources. One way is to use containers with Kubernetes as the orchestrator. To use Alveo™ data center accelerator cards in such an environment, see /content/xilinx/en/developer/articles/using-alveo-in-a-kubernetes-environment.html for more information and deployment examples.

Another way of orchestrating resources is through virtualization, such as using a Kernel-based Virtual Machine (KVM). For more information, see https://www.linux-kvm.org. This article shows how to deploy Alveo data center accelerator cards in a KVM environment.


Local KVM Setup

This article is not meant to be an in-depth tutorial on how to install and configure KVM. This section only provides a short overview of how to provision your server.

Note: This article uses Ubuntu 18.04 LTS on a Dell R740 server as the host environment. For this tutorial, KVM was installed and configured following the guide available at https://fabianlee.org/2018/08/27/kvm-bare-metal-virtualization-on-ubuntu-with-kvm/.

Installing KVM

To install KVM, follow these steps:

1.       Install the KVM and assorted tools:

    sudo apt-get install qemu-system-x86 qemu-kvm qemu libvirt-bin virt-manager virtinst bridge-utils cpu-checker virt-viewer

2.      Use kvm-ok to check that KVM was installed and that the CPU has VT-x virtualization enabled.

    $ sudo kvm-ok

INFO: /dev/kvm exists
KVM acceleration can be used

Note: If you get the following message, enable VT-x at the BIOS level.

    INFO: /dev/kvm does not exist
HINT:   sudo modprobe kvm_intel
INFO: Your CPU supports KVM extensions
INFO: KVM (vmx) is disabled by your BIOS
HINT: Enter your BIOS setup and enable Virtualization Technology (VT), and then hard poweroff/poweron your system
KVM acceleration can NOT be used

3.       Run the virt-host-validate utility to perform a full set of checks of your host's virtualization capabilities and KVM readiness.

    $ virt-host-validate
QEMU: Checking if device /dev/kvm is accessible			: PASS
QEMU: Checking if device /dev/vhost-net exists			: PASS
QEMU: Checking if device /dev/net/tun exists				: PASS
QEMU: Checking for cgroup 'memory' controller support		: PASS
QEMU: Checking for cgroup 'memory' controller mount-point	: PASS
QEMU: Checking for cgroup 'cpu' controller support			: PASS
QEMU: Checking for cgroup 'cpu' controller mount-point		: PASS
QEMU: Checking for cgroup 'cpuacct' controller support		: PASS
QEMU: Checking for cgroup 'cpuacct' controller mount-point	: PASS
QEMU: Checking for cgroup 'cpuset' controller support		: PASS
QEMU: Checking for cgroup 'cpuset' controller mount-point	: PASS
QEMU: Checking for cgroup 'devices' controller support		: PASS
QEMU: Checking for cgroup 'devices' controller mount-point	: PASS
QEMU: Checking for cgroup 'blkio' controller support		: PASS
QEMU: Checking for cgroup 'blkio' controller mount-point		: PASS
QEMU: Checking for device assignment IOMMU support			: PASS
LXC: Checking for Linux >= 2.6.26					: PASS
LXC: Checking for namespace ipc						: PASS
LXC: Checking for namespace mnt						: PASS
LXC: Checking for namespace pid						: PASS
LXC: Checking for namespace uts						: PASS
LXC: Checking for namespace net						: PASS
LXC: Checking for namespace user						: PASS
LXC: Checking for cgroup 'memory' controller support		: PASS
LXC: Checking for cgroup 'memory' controller mount-point		: PASS
LXC: Checking for cgroup 'cpu' controller support			: PASS
LXC: Checking for cgroup 'cpu' controller mount-point		: PASS
LXC: Checking for cgroup 'cpuacct' controller support		: PASS
LXC: Checking for cgroup 'cpuacct' controller mount-point	: PASS
LXC: Checking for cgroup 'cpuset' controller support		: PASS
LXC: Checking for cgroup 'cpuset' controller mount-point		: PASS
LXC: Checking for cgroup 'devices' controller support		: PASS
LXC: Checking for cgroup 'devices' controller mount-point	: PASS
LXC: Checking for cgroup 'blkio' controller support		: PASS
LXC: Checking for cgroup 'blkio' controller mount-point		: PASS
LXC: Checking if device /sys/fs/fuse/connections exists		: PASS

4.       If you do not see PASS for each test, fix your setup. On Ubuntu 18.04 LTS, the IOMMU is not enabled by default, so that test initially fails. To enable the IOMMU, follow these steps:

a.       Modify the /etc/default/grub file:

sudo vim /etc/default/grub

b.       Add/modify the GRUB_CMDLINE_LINUX entry:

GRUB_CMDLINE_LINUX="intel_iommu=on"

c.       Apply the change by regenerating the GRUB configuration:

sudo update-grub

d.       Reboot the machine.
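
After the reboot, you can optionally confirm that the IOMMU is active before re-running virt-host-validate; a minimal check on an Intel host:

    # The kernel command line should now contain the IOMMU flag
    grep intel_iommu=on /proc/cmdline

    # With the IOMMU active, the kernel populates the IOMMU groups
    ls /sys/kernel/iommu_groups/ | head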

Adding a User to libvirt Groups

To allow the current user to manage the guest VM, you can add yourself to all of the libvirt groups (e.g. libvirt, libvirt-qemu) and the kvm group.

    cat /etc/group | grep libvirt | awk -F':' {'print $1'} | xargs -n1 sudo adduser $USER

# add user to kvm group also 
sudo adduser $USER kvm

# relogin, then show group membership 
exec su -l $USER
id | grep libvirt

Group membership changes require you to log back in. If the id command does not show your libvirt* group membership, log out and log back in, or run exec su -l $USER as shown above.

Configuring the Network

By default, a KVM creates a virtual switch that shows up as a host interface named virbr0 using 192.168.122.0/24.

For information about virtual switches, see https://wiki.libvirt.org/page/VirtualNetworking.

Figure: Libvirt’s default network configuration (courtesy of libvirt.org)

This interface should be visible from the host using the following ip command:

    ~$ ip addr show virbr0
12: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000 
link/ether 52:54:00:34:3e:8f brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0 
valid_lft forever preferred_lft forever
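
You can also inspect the default libvirt network itself with virsh; for example:

    # List the libvirt networks and dump the XML definition of the default NAT network
    virsh net-list --all
    virsh net-dumpxml default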


Creating a Basic VM Using the virt-install Command

To test, you need an OS boot image. Because the host is running Ubuntu, download the ISO for the Ubuntu 18.04 network installer. This file is only about 76 MB, so it is perfect for testing. When the download completes, you should have a local file named ~/Downloads/mini.iso.

1.       Use the following command to list the virtual machines running on your host:

    # list VMs 
virsh list

2.       Because you have not yet created a virtual machine, this should return an empty list. Create your first guest VM with 1 vCPU and 1 GB of RAM, using the default virbr0 NAT network and the default storage pool.

    $ virt-install --virt-type=kvm --name=ukvm1404 --ram 1024 --vcpus=1 --hvm --cdrom ~/faas/mini.iso --network network=default --disk pool=default,size=20,bus=virtio,format=qcow2 --noautoconsole
WARNING  No operating system detected, VM performance may suffer. Specify an OS with --os-variant for optimal results.
 
Starting install...
Allocating 'ukvm1404.qcow2'                                |       20 GB  00:00:00    
Domain installation still in progress. You can reconnect to
the console to complete the installation process.
 
# open console to VM
$ virt-viewer ukvm1404

3.       virt-viewer opens a window for the guest OS. Click the mouse in the window and press <ENTER> to see the initial Ubuntu network install screen.

Note: If you want to delete this guest OS completely, close the GUI window opened by virt-viewer, and then use the following commands:

    virsh destroy ukvm1404 
virsh undefine ukvm1404
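
Note that virsh undefine by itself leaves the disk image that was allocated in the default pool. If you want to remove the storage as well, virsh provides a flag for that; a sketch, assuming the VM was created as above:

    # Stop the guest (if running) and remove its definition together with its storage volumes
    virsh destroy ukvm1404
    virsh undefine ukvm1404 --remove-all-storage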

Testing from the GUI

The virt-viewer utility opens a basic window to the guest OS. Notice that it does not give you any control besides sending keys. If you want a full GUI for managing KVM guests, use virt-manager, as described at https://virt-manager.org/.

virt-manager

To install and start virt-manager, run the following commands:

    sudo apt-get install qemu-system virt-manager 
virt-manager

virt-manager provides a convenient interface for creating and managing guest VMs. Any guest OS that you create from the CLI using virt-install also appears in the virt-manager list.


Configuring an Alveo Device Passthrough

After you have created one or multiple VMs, assign Alveo data center accelerator cards to them. For more information on different use models, see https://xilinx.github.io/XRT/master/html/security.html#deployment-models.

1.       Download and install XRT using the following commands:

    wget https://www.xilinx.com/bin/public/openDownload?filename=xrt_201920.2.3.1301_18.04-xrt.deb 
sudo apt install ./xrt_201920.2.3.1301_18.04-xrt.deb

2.       Reboot your system after the installation is complete.

3.       To list the Alveo data center accelerator cards installed on your system, run the following XRT command. In this example, two cards are installed:

    $ /opt/xilinx/xrt/bin/xbmgmt scan
*0000:af:00.0 xilinx_u280-es1_xdma_201910_1(ts=0x5d1c391d) mgmt(inst=44800)
*0000:3b:00.0 xilinx_u200_xdma_201830_2(ts=0x5d1211e8) mgmt(inst=15104)
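
In addition to xbmgmt scan, you can optionally check that the XRT kernel drivers have claimed the two physical functions of each card on the host; a quick look (xclmgmt and xocl are the module names used by standard XRT packages):

    # xclmgmt binds to the management function (.0), xocl to the user function (.1)
    lsmod | grep -E 'xclmgmt|xocl'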

Using PCIe Passthrough

Alveo data center accelerator cards expose two physical functions on the PCIe® interface. To give the guest OS full control of a device, you must pass through both functions. This can be done using the virsh command.
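
To find the PCIe bus number that goes into the XML files below (3b in this example), you can list the Xilinx functions on the host using the Xilinx vendor ID 10ee and confirm that they belong to an IOMMU group, which passthrough requires. A quick check; adjust the bus number to your system:

    # Each card appears as <bus>:00.0 (management) and <bus>:00.1 (user)
    lspci -d 10ee:

    # Confirm the two functions are members of an IOMMU group (example bus 3b)
    find /sys/kernel/iommu_groups/ -name '*3b:00*'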

1.       Create two files, one for each of the physical functions. As an example, the following files can be used to pass through the U200 card:

    $ cat pass-mgmt.xml
   <hostdev mode='subsystem' type='pci' managed='yes'>
     <source>
	<address domain='0x0000' bus='0x3b' slot='0x00' function='0x0'/>
     </source>
     <address type='pci' domain='0x0000' bus='0x00' function='0x0' multifunction='on'/>
   </hostdev>
$ cat pass-user.xml
   <hostdev mode='subsystem' type='pci' managed='yes'>
     <source>
       <address domain='0x0000' bus='0x3b' slot='0x00' function='0x1'/>
     </source>
     <address type='pci' domain='0x0000' bus='0x00' function='0x1'/>
   </hostdev>

With these files, virsh can be used to pass through the U200 card to an existing VM. There are multiple ways to do this, but with the process described in this section, the devices can be used in the guest OS only after a reboot.

2.       Attach the U200 to the VM in your system. In the following example, a VM named centos7.5 is shown. 

    $ virsh attach-device centos7.5 --file pass-user.xml --config 
Device attached successfully

$ virsh attach-device centos7.5 --file pass-mgmt.xml --config 
Device attached successfully

If the correct XRT and shell are installed in the VM when it boots, you should be able to use the U200 directly (a quick way to verify this from both the host and the guest is sketched after the next step).


3.       In the same way you attached the devices, you can use the same XML files to detach them.

    $ virsh detach-device centos7.5 --file pass-user.xml --config 
Device detached successfully

$ virsh detach-device centos7.5 --file pass-mgmt.xml --config 
Device detached successfully
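
As mentioned in step 2, after the devices are attached and the VM has been booted (and before they are detached again), you can verify the passthrough from both sides. A minimal check, assuming XRT and the matching shell are also installed inside the guest:

    # On the host: confirm the hostdev entries are part of the domain definition
    virsh dumpxml centos7.5 | grep -A 4 '<hostdev'

    # Inside the guest: the two functions should appear on the virtual PCIe bus ...
    lspci -d 10ee:

    # ... and the card should be visible to XRT
    /opt/xilinx/xrt/bin/xbutil scan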

Scripts to Attach and Detach Alveo Cards

To make it more convenient to attach and detach Alveo data center accelerator cards to different VMs, you can create a script to automate the task. The script uses the following helper XML files:

    $ cat pass-user.xml_base
  <hostdev mode='subsystem' type='pci' managed='yes'>
     <source>
       <address domain='0x0000' bus='0x$DEV' slot='0x00' function='0x1'/>
     </source>
     <address type='pci' domain='0x0000' bus='0x00' function='0x1'/>
</hostdev>

$ cat pass-mgmt.xml_base
  <hostdev mode='subsystem' type='pci' managed='yes'>
     <source>
       <address domain='0x0000' bus='0x$DEV' slot='0x00' function='0x0'/>
     </source>
     <address type='pci' domain='0x0000' bus='0x00' function='0x0' multifunction='on'/>
  </hostdev>
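
The templates rely on envsubst replacing the $DEV placeholder with the PCIe bus number of the card. A quick way to see the substitution on its own, assuming a card on bus 3b:

    # Render the user-function template for bus 0x3b (the result is written to stdout)
    DEV=3b envsubst < pass-user.xml_base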

The following script, attach.sh, uses the two helper XML files:

    #!/bin/bash

if [ $# -ne 2 ]; then
  echo "Usage: $0 <OS name> <pcie slot>"
  echo "For example: $0 centos7.5 af"
  echo
  echo "OS list:"
  virsh list --all
  echo
  echo "========================================================"
  echo
  echo "Xilinx devices:"
  [ -f /opt/xilinx/xrt/bin/xbmgmt ] && /opt/xilinx/xrt/bin/xbmgmt scan || lspci -d 10ee:
  echo
  echo
  echo "========================================================"
  echo
  for dev in $(virsh list --all --name); do
    devices=$(virsh dumpxml $dev | grep '<hostdev' -A 5 | grep "function='0x0'" | grep -v "type" | tr -s ' ' | cut -d' ' -f4 | cut -d= -f2 | awk '{print substr($0,4,2);}')
    if [[ ! -z "$devices" ]]; then
      echo "Attached host devices in $dev:"
      echo $devices
      echo
    fi
  done
  exit -1
fi

export OS=$1 
export DEV=$2

envsubst < pass-user.xml_base > pass-user-$DEV-$OS.xml 
envsubst < pass-mgmt.xml_base > pass-mgmt-$DEV-$OS.xml

CMD=$(basename $0) 
COMMAND=${CMD%.sh}
virsh $COMMAND-device $OS --file pass-mgmt-$DEV-$OS.xml --config 
virsh $COMMAND-device $OS --file pass-user-$DEV-$OS.xml --config

rm pass-mgmt-$DEV-$OS.xml pass-user-$DEV-$OS.xml
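
A note on the last lines of the script: the virsh sub-command (attach-device or detach-device) is derived from the name the script is invoked under, using the ${CMD%.sh} suffix removal. A minimal illustration of that expansion:

    # Strip the .sh suffix and append -device to form the virsh sub-command
    CMD=attach.sh
    echo "${CMD%.sh}-device"    # prints: attach-device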

To create detach.sh, make a symbolic link pointing to attach.sh:

    $ ln -s attach.sh detach.sh
$ ls -l attach.sh detach.sh
-rwxrwxr-x 1 alveo alveo 1159 Mar 20 10:02 attach.sh
lrwxrwxrwx 1 alveo alveo 9 Mar 20 12:15 detach.sh -> attach.sh

When executed without any arguments, the script outputs the existing domains (VMs), the Alveo data center accelerator cards installed in the system, and which card is attached to which VM. For example:

    Usage: ./attach.sh <OS name> <pcie slot> 
For example: ./attach.sh centos7.5 af

OS list:
Id		Name			State
----------------------------------------------------
-	     centos7.5		shut off
-	     ubuntu18.04		shut off

========================================================

Xilinx devices:
*0000:af:00.0 xilinx_u280-es1_xdma_201910_1(ts=0x5d1c391d) mgmt(inst=44800)
*0000:3b:00.0 xilinx_u200_xdma_201830_2(ts=0x5d1211e8) mgmt(inst=15104)

========================================================

Attached host devices in centos7.5: 
3b
af

Attached host devices in ubuntu18.04:

In the example shown above, there are two VMs (centos7.5 and ubuntu18.04) installed on the system, but neither of them is running. There are two Alveo data center accelerator cards installed in the server (a U200 in slot 0000:3b:00.0 and a U280 in slot 0000:af:00.0), and both are attached to the centos7.5 VM.

To attach or detach an Alveo data center accelerator card, use the following command:

    attach.sh <OS name> <pcie slot>

Note: Devices can be attached to multiple VMs, but these VMs cannot run simultaneously.

    $ ./attach.sh ubuntu18.04 3b 
Device attached successfully

Device attached successfully

$ ./attach.sh
Usage: ./attach.sh <OS name> <pcie slot> 
For example: ./attach.sh centos7.5 af

OS list:
Id		Name			State
----------------------------------------------------
-	     centos7.5		shut off
-	     ubuntu18.04		shut off

========================================================

Xilinx devices:
*0000:af:00.0 xilinx_u280-es1_xdma_201910_1(ts=0x5d1c391d) mgmt(inst=44800)
*0000:3b:00.0 xilinx_u200_xdma_201830_2(ts=0x5d1211e8) mgmt(inst=15104)

========================================================

Attached host devices in centos7.5: 
3b
af

Attached host devices in ubuntu18.04: 
3b

Likewise, to detach a device, use the following command: 

    detach.sh <OS name> <pcie slot>

The U200 can be detached from the centos7.5 VM as shown:

    $ ./detach.sh centos7.5 3b 
Device detached successfully

Device detached successfully

$ ./attach.sh
Usage: ./attach.sh <OS name> <pcie slot> 
For example: ./attach.sh centos7.5 af

OS list:
Id		Name			State
----------------------------------------------------
-	     centos7.5		shut off
-	     ubuntu18.04		shut off

========================================================

Xilinx devices:
*0000:af:00.0 xilinx_u280-es1_xdma_201910_1(ts=0x5d1c391d) mgmt(inst=44800)
*0000:3b:00.0 xilinx_u200_xdma_201830_2(ts=0x5d1211e8) mgmt(inst=15104)

========================================================

Attached host devices in centos7.5: 
af

Attached host devices in ubuntu18.04: 
3b

Auto-completion support for these scripts can be added by creating a file, autocomplete_faas.sh, containing the following code:

    _faas()
{
    local cur prev hosts devices suggestions device_suggestions 
    COMPREPLY=()
    cur="${COMP_WORDS[COMP_CWORD]}" 
    prev="${COMP_WORDS[COMP_CWORD-1]}"
    hosts="$(virsh list --all --name)"
    devices="$(lspci -d 10ee: | grep \\.0 | awk '{print substr($0,0,2);}' | tr '\n' ' ' | head -c -1)"

    case $COMP_CWORD in 
    1)
         COMPREPLY=( $(compgen -W "${hosts}" -- ${cur}) ) 
         return 0
         ;;
    2)
        if [ "${COMP_WORDS[0]}" == "./detach.sh" ]; then
            # only return attached devices
            devices=$(virsh dumpxml $prev | grep '<hostdev' -A 5 | grep "function='0x0'" | grep -v "type" | tr -s ' ' | cut -d' ' -f4 | cut -d= -f2 | awk '{print substr($0,4,2);}')
        fi
        suggestions=( $(compgen -W "${devices}" -- ${cur}) )
        ;;
    esac
    if [ "${#suggestions[@]}" == "1" ] || [ ! -f /opt/xilinx/xrt/bin/xbutil ] ; then 
           COMPREPLY=("${suggestions[@]}")
    else
           # more than one suggestion resolved,
           # respond with the full device suggestions 
           declare -a device_suggestions
           for ((dev=0;dev<${#suggestions[@]};dev++)); do
               #device_suggestions="$device_suggestions\n$dev $(/opt/xilinx/xrt/bin/xbutil scan | grep":$dev:")"
               device_suggestions+=("${suggestions[$dev]}-->$(/opt/xilinx/xrt/bin/xbutil scan | grep ":${suggestions[$dev]}:" | xargs echo -n)")
           done 
           COMPREPLY=("${device_suggestions[@]}")
    fi

}
complete -F _faas ./attach.sh 
complete -F _faas ./detach.sh

Then source this file:

    source ./autocomplete_faas.sh

You can then use TAB auto-completion for both the VM name and the device.
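
If you want the completion to be available in every new shell, you can also source the file from your bash startup file; a sketch, assuming the script stays in its current directory:

    # Load the completion definitions automatically in new interactive shells
    echo "source $(pwd)/autocomplete_faas.sh" >> ~/.bashrc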


Conclusion

This article showed how to use Alveo data center accelerator cards in a KVM environment. Individual Alveo cards can be attached to and detached from VMs dynamically, and can be used in these different VMs as if they were running natively on the server.


About Kester Aernoudt

Kester Aernoudt received his master's degree in Computer Science from the University of Ghent in 2002. He then started as a Research Engineer in the Technology Center of Barco, where he worked on a wide range of processing platforms such as microcontrollers, DSPs, embedded processors, FPGAs, multi-core CPUs, and GPUs. In 2011 he joined Xilinx, where he now works as a Processor Specialist covering Europe and Israel, supporting customers and colleagues on embedded processors and x86 acceleration, specifically targeting Zynq devices and Alveo cards.