Running an OpenGL application on a GPU-accelerated Nova Instance (Part 1)

By Guillaume on 03/02/2014 11:41:00

Have you ever dreamed of running graphics applications in OpenStack? Now it's possible!

About a month ago, we published a blueprint [1] to enable support for PCI passthrough in the XenAPI driver of Nova [2]. Our primary objective was to enable GPU-accelerated instances, but we nonetheless scoped the blueprint with the intent to support "any" kind of PCI device. Since then, we have published two patches [3] [4] that you can readily try using the trunk version of Nova. This work couldn't have been done without the help of the OpenStack and Xen communities.

In part 1 of this post, I will walk through step-by-step instructions showing how to boot a Nova instance that has direct access to a GPU under Xen virtualization. In our particular setup we used an Nvidia GRID K2 graphics card, but it should work equally well for other Nvidia GPUs such as the K520 or M2070Q, which we also booted successfully in our lab.

First you need a working devstack in a domU. To do this, install XenServer 6.2 on the machine that has the GPU, boot a domU running Ubuntu Saucy (other distributions should work as well), and install an all-in-one devstack in it. When you boot dom0, you need to prepare the devices for PCI passthrough. You do this by adding "pciback.hide=(87:00.0)(88:00.0)" to the dom0 Linux kernel command line, which assigns the pciback driver to the devices with BDF 87:00.0 and 88:00.0. More information about PCI passthrough with Xen is available on the Xen wiki [5].
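As a sketch of that dom0 preparation (the BDFs 87:00.0 and 88:00.0 match our host; yours will differ, and the exact boot configuration file depends on your XenServer setup), the steps look like this:

```shell
# In dom0: find the BDF addresses of the Nvidia GPUs (10de is Nvidia's vendor ID)
lspci -D -d 10de:

# Append to the dom0 kernel command line (on XenServer 6.2 this is typically
# the "append" line loading vmlinuz in /boot/extlinux.conf), then reboot:
#   pciback.hide=(87:00.0)(88:00.0)

# After rebooting, verify that the devices are assignable for passthrough
xl pci-assignable-list
```

If `xl pci-assignable-list` prints your two BDFs, pciback owns the devices and they are ready to be handed to a guest.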

The next step is to download the code for PCI passthrough support:

  # cd /opt/stack/nova
  # git review -d 67125

This will download the two patches that are needed and check out the corresponding git branch. Before restarting the Nova services, you need to configure the Nova scheduler and the compute node for PCI passthrough. For further information, check the wiki [6].

On the compute node you need to select which devices are eligible for passthrough; in our case we added the K2 cards. You do this by whitelisting those devices in /etc/nova/nova.conf:

  # cat /etc/nova/nova.conf
  ...
  pci_passthrough_whitelist = [{"vendor_id":"10de","product_id":"11bf"}]
  ...

The vendor ID and product ID of the K2 GPU are 10de and 11bf respectively. Next, configure the scheduler as follows:

  # cat /etc/nova/nova.conf
  ...
  pci_alias={"vendor_id":"10de","product_id":"11bf","name":"k2"}
  scheduler_driver = nova.scheduler.filter_scheduler.FilterScheduler
  scheduler_available_filters=nova.scheduler.filters.all_filters
  scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
  scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,PciPassthroughFilter
  ...

The pci_alias is used to match the extra specs of a flavor with the selected PCI devices. Hence, you need to set an extra spec on the flavor that will be associated with the PCI devices you want to attach:

  # nova flavor-key  m1.small set "pci_passthrough:alias"="k2:1"
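Rather than modifying the stock m1.small flavor, you could also create a dedicated GPU flavor (the name and sizing below are just an example, not from our setup):

```shell
# Create a flavor: <name> <id> <ram MB> <disk GB> <vcpus>
nova flavor-create gpu.small auto 2048 20 2

# Request one PCI device matching the "k2" alias per instance
nova flavor-key gpu.small set "pci_passthrough:alias"="k2:1"
```

The "k2:1" value means one device matching the alias; "k2:2" would request both GPUs for a single instance.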

Last but not least, you need to copy the plugin files from /opt/stack/nova/plugins/xenserver/xenapi/etc/xapi.d/ of your devstack installation into the /etc/xapi.d/plugins/ directory of dom0. Overlooking this step will most likely result in plugin errors.
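Assuming dom0 is reachable over SSH as root (the host name below is a placeholder for your dom0 address), the copy can be done from the devstack domU like this:

```shell
# Copy the XenAPI plugins from the devstack tree into dom0's plugin directory
# (replace xenserver-dom0 with the address of your dom0)
scp /opt/stack/nova/plugins/xenserver/xenapi/etc/xapi.d/plugins/* \
    root@xenserver-dom0:/etc/xapi.d/plugins/

# The plugins must be executable for xapi to run them
ssh root@xenserver-dom0 'chmod +x /etc/xapi.d/plugins/*'
```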

Restart the Nova services. On your n-cpu screen you should see your PCI resources reported as available, as shown below:

    2014-01-30 19:20:48.340 DEBUG nova.compute.resource_tracker [-] Hypervisor: assignable PCI devices: [{"status": "available", "dev_id": "pci_87:00.0", "product_id": "11bf", "dev_type": "type-PCI", "vendor_id": "10de", "label": "label_10de_11bf", "address": "87:00.0"}, {"status": "available", "dev_id": "pci_88:00.0", "product_id": "11bf", "dev_type": "type-PCI", "vendor_id": "10de", "label": "label_10de_11bf", "address": "88:00.0"}] from (pid=10444) _report_hypervisor_resource_view /opt/stack/nova/nova/compute/resource_tracker.py:429

If that is not the case, check that your nova.conf is configured as described above.

Now, when you boot an instance using the flavor m1.small, one K2 will be attached to it. Note that the resource tracker keeps track of the PCI devices attached to your instances, so creating a new GPU-accelerated instance will return an error once those resources are exhausted on all the compute nodes.

Now, everything should be ready to boot a GPU-accelerated instance:

  # nova boot --flavor m1.small --image centos6 --key-name mykey testvm1
  xlcloud@devstackvm1:~$ nova list
  +--------------+---------+--------+------------+-------------+--------------------+
  | ID           | Name    | Status | Task State | Power State | Networks           |
  +--------------+---------+--------+------------+-------------+--------------------+
  | 92f4...f081a | testvm1 | ACTIVE | -          | Running     | private=10.11.12.2 |
  +--------------+---------+--------+------------+-------------+--------------------+

Log into your instance to check which PCI devices are available:

  xlcloud@devstackvm1:~$ ssh  -l cloud-user 10.11.12.2
  Last login: Thu Jan 30 18:26:32 2014 from 10.11.12.1
  [cloud-user@testvm1 ~]$ lspci 
  00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
  00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
  00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
  00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
  00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01)
  00:02.0 VGA compatible controller: Cirrus Logic GD 5446
  00:03.0 SCSI storage controller: XenSource, Inc. Xen Platform Device (rev 01)
  00:05.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K2] (rev a1)

As you can see in the output above, my instance has the K2 GPU attached. The next step, running an actual graphics application (since in the end that's what we want to do), requires installing your graphics card manufacturer's driver in the GPU-accelerated instance (in this case, the Nvidia driver for the K2).
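On a CentOS 6 guest this boils down to installing a toolchain and the headers for the running kernel, then running Nvidia's .run installer. The commands below are a sketch: the driver version matches the 331.38 shown in the nvidia-smi output further down, but the download URL follows Nvidia's usual naming pattern and may have moved.

```shell
# Inside the instance: compiler and kernel headers needed to build the module
sudo yum install -y gcc make kernel-devel-$(uname -r)

# Fetch and run the Nvidia installer (version 331.38, as used in this setup)
wget http://us.download.nvidia.com/XFree86/Linux-x86_64/331.38/NVIDIA-Linux-x86_64-331.38.run
sudo sh NVIDIA-Linux-x86_64-331.38.run --silent
```

Once the installer finishes, nvidia-smi should report the GRID K2: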

  [cloud-user@testvm1 ~]$ nvidia-smi 
  Mon Feb  3 08:51:42 2014       
  +------------------------------------------------------+                       
  | NVIDIA-SMI 331.38     Driver Version: 331.38         |                       
  |-------------------------------+----------------------+----------------------+
  | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
  | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
  |===============================+======================+======================|
  |   0  GRID K2             Off  | 0000:00:05.0     Off |                  Off |
  | N/A   32C    P0    37W / 117W |      9MiB /  4095MiB |      0%      Default |
  +-------------------------------+----------------------+----------------------+
                                                                                   
  +-----------------------------------------------------------------------------+
  | Compute processes:                                               GPU Memory |
  |  GPU       PID  Process name                                     Usage      |
  |=============================================================================|
  |  No running compute processes found                                         |
  +-----------------------------------------------------------------------------+

Okay, that's probably enough for today. In Part 2 of this post, I will show you how to set up a GPU-accelerated Nova instance to run an OpenGL application such as the Unigine benchmark [7].

Guillaume Thouvenin, XLcloud R&D


This wiki is licensed under a Creative Commons 2.0 license