Hot-plugging USB into a Xen VM from a script

Xen Panda

I have been running a hybrid Linux/Windows box for some time now, using Xen.  Xen is a bare-metal hypervisor with many powerful features, including the ability to map physical hardware (say, a graphics card) to guest VMs.  I have struggled with getting USB to work seamlessly, however.  My hardware prevents me from mapping a physical USB device to the guest, so I needed a different solution.

Xen allows emulated USB controllers to be mapped to the guest, and this works well.  One important feature is missing, however: it is not possible to automatically connect a device to the guest when it is plugged in.

A lot of research and some diving into the code eventually brought me to a solution.  Using the udev interface in Linux and the low-level hypervisor interfaces, I can hot-plug devices at run-time.  This post describes how I got there.

The source code discussed in this article can be found, in full, at https://github.com/stephen-czetty/xen-auto-usb.

The problem

It is possible to hot-plug a USB device into the guest via the command-line, but the process is rife with issues.  For the guest to see the device, you have to plug it in and manually run a command to connect it.  Devices are also addressed by the physical port they are plugged into.  For one-offs this isn’t so bad, since you can connect the device once and forget it.  For hot-plugging, this quickly gets tedious.

If you unplug the device, the guest doesn’t notice, and this can cause errors, even crashes.  If you plug the device back into a different USB port, it won’t be reconnected, because it is addressed by physical location.  I wanted a solution where I could plug in any USB device and have it immediately be seen in the guest, and when it was removed, it would go away.

Researching the solution

So I googled, but didn’t find many viable solutions.  Then I looked at the source code for the tool set to see how it implemented the hot plug functionality.  I discovered that there is a protocol (QMP) that is used via a UNIX-domain socket on the host.  By connecting to that, I could hand-craft commands that instructed the guest to add or remove a device from its emulated USB controller.

Documentation on QMP is thin, and figuring this out required a lot of experimentation and reverse engineering.  I crashed my guest on several occasions during this process, so this is not for the faint of heart.

QMP boils down to a JSON-based protocol.  When initially connecting, you must send it a handshake message:

{
	"execute": "qmp_capabilities"
}

After this handshake, you can send commands.  Important ones for the purposes of this application are device_add, device_del, qom-get, and qom-list.  The first two add and remove devices, as is evident.  The last two are lower-level commands that get details about the run-time configuration of the VM.
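As a rough illustration of the handshake and command exchange, here is a minimal Python sketch.  It is not the code from the repository.  The helper names (read_message, execute, handshake, connect) are made up for this example, and it assumes each QMP message arrives as one line of JSON and that the server sends a greeting containing a "QMP" key before it accepts commands.

```python
import json
import socket


def read_message(reader):
    """Read one line-delimited JSON message from the QMP stream."""
    line = reader.readline()
    if not line:
        raise ConnectionError("QMP socket closed")
    return json.loads(line)


def execute(sock, reader, command, arguments=None):
    """Send one command and return the value of its "return" field."""
    message = {"execute": command}
    if arguments is not None:
        message["arguments"] = arguments
    sock.sendall(json.dumps(message).encode("utf-8") + b"\n")
    while True:
        reply = read_message(reader)
        if "event" in reply:
            # Asynchronous events may be interleaved with replies; skip them.
            continue
        if "error" in reply:
            raise RuntimeError(reply["error"].get("desc", "QMP error"))
        return reply["return"]


def handshake(sock):
    """Consume the greeting and send qmp_capabilities on a connected socket."""
    reader = sock.makefile("r", encoding="utf-8")
    greeting = read_message(reader)
    if "QMP" not in greeting:
        raise RuntimeError("expected a QMP greeting")
    execute(sock, reader, "qmp_capabilities")
    return reader


def connect(path):
    """Open the UNIX-domain socket at `path` and perform the handshake."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock, handshake(sock)
```

The socket path depends on the setup.  With the libxl tool stack it is commonly something like /var/run/xen/qmp-libxl-<domid>, but treat that as an assumption and check your own host.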

Examples

Add a device

{
	"execute": "device_add",
	"arguments": {
		"id": "xenusb-0-1",
		"driver": "usb-host",
		"bus": "xenusb-0.0",
		"port": 1,
		"hostbus": 1,
		"hostaddr": 1
	}
}

This command tells the host to attach the first device on bus 1 to the VM.  When a device is attached to the host, dmesg will report the device as (1-1).  These values are used for hostbus and hostaddr.  The device is connected to virtual controller 0 ("bus": "xenusb-0.0"), on port 1.  The id parameter can be anything, but I chose values that match what the command-line tools expect.

Remove a device

{
	"execute": "device_del",
	"arguments": {
		"id": "xenusb-0-1"
	}
}

This command removes a device from the guest.  All that is required is the id.
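The add and remove messages are regular enough to generate.  The following Python sketch builds them as plain dictionaries, using the xenusb-<controller>-<port> naming from the examples above.  The function names usb_host_add and usb_host_del are hypothetical and chosen only for illustration.

```python
import json


def usb_host_add(controller, port, hostbus, hostaddr):
    """Build a device_add message attaching a host USB device to the guest."""
    return {
        "execute": "device_add",
        "arguments": {
            "id": f"xenusb-{controller}-{port}",
            "driver": "usb-host",
            "bus": f"xenusb-{controller}.0",
            "port": port,
            "hostbus": hostbus,
            "hostaddr": hostaddr,
        },
    }


def usb_host_del(controller, port):
    """Build the matching device_del message; only the id is needed."""
    return {
        "execute": "device_del",
        "arguments": {"id": f"xenusb-{controller}-{port}"},
    }


if __name__ == "__main__":
    # Print the same message as the "Add a device" example.
    print(json.dumps(usb_host_add(0, 1, 1, 1), indent=2))
```

Deriving the id from the controller and port keeps the add and remove sides in step, since device_del has nothing else to go on.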

Create a controller

{
	"execute": "device_add",
	"arguments": {
		"id": "xenusb-0",
		"driver": "nec-usb-xhci",
		"p2": "15",
		"p3": "15"
	}
}

This creates an emulated controller on the guest.  The equivalent device_del will remove it, but it crashed Windows when I tried it.  p2 and p3 specify the number of ports; 15 is the maximum supported.

The driver nec-usb-xhci creates a USB 3.0 controller.  It is also possible to create 1.1 or 2.0 controllers by using piix3-usb-uhci or usb-ehci, respectively, leaving off the p2 and p3 arguments.  (Xen uses a default port count for these controllers.)
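The choice of driver and the optional port counts can be captured in a small helper.  This is only a sketch: the name controller_add and the usb_version parameter are invented for the example, and the port counts are passed as strings because that is how they appear in the message above.

```python
# Driver names as given in the text, keyed by USB major version.
CONTROLLER_DRIVERS = {
    1: "piix3-usb-uhci",  # USB 1.1
    2: "usb-ehci",        # USB 2.0
    3: "nec-usb-xhci",    # USB 3.0
}

MAX_PORTS = 15  # maximum port count stated in the text


def controller_add(index, usb_version=3, ports=MAX_PORTS):
    """Build a device_add message that creates emulated controller `index`."""
    if usb_version not in CONTROLLER_DRIVERS:
        raise ValueError(f"unsupported USB version: {usb_version}")
    arguments = {
        "id": f"xenusb-{index}",
        "driver": CONTROLLER_DRIVERS[usb_version],
    }
    if usb_version == 3:
        # Only the xHCI controller takes explicit port counts here.
        if not 1 <= ports <= MAX_PORTS:
            raise ValueError(f"ports must be between 1 and {MAX_PORTS}")
        arguments["p2"] = str(ports)
        arguments["p3"] = str(ports)
    return {"execute": "device_add", "arguments": arguments}
```

Calling controller_add(0) reproduces the message shown above, while controller_add(1, usb_version=2) leaves off p2 and p3.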

List existing devices

{
	"execute": "qom-list",
	"arguments": {
		"path": "xenusb-0.0"
	}
}

This uses the relatively low-level command qom-list to return all of the devices attached to controller 0.  The return looks like:

{
	"return": [{
		"type": "link<usb-host>",
		"name": "child[0]"
	},
	{
		"type": "link<usb-host>",
		"name": "child[1]"
	}]
}

Nice, right?  child[0] is so helpful.  You need to get the specifics of the device in order to work with it, as detailed below.

Get device specifics

{
	"execute": "qom-get",
	"arguments": {
		"path": "xenusb-0.0",
		"property": "child[0]"
	}
}

This gets the details of the device.  It returns:

{
	"return": "/machine/peripheral/xenusb-2-3"
}

We then need to do further calls to qom-get using the returned path to retrieve the port, hostbus, and hostaddr values:

{
	"execute": "qom-get",
	"arguments": {
		"path": "/machine/peripheral/xenusb-2-3",
		"property": "port"
	}
}

Etc.
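Put together, the lookup is a qom-list call followed by several qom-get calls per device.  The sketch below walks through that sequence in Python.  It assumes a caller-supplied execute(command, arguments) function that sends one QMP command and returns the contents of its "return" field.  That function and the name list_attached_devices are hypothetical, not part of any library.

```python
def list_attached_devices(execute, controller=0):
    """Collect path, port, hostbus and hostaddr for each device on a controller."""
    bus = f"xenusb-{controller}.0"
    devices = []
    for entry in execute("qom-list", {"path": bus}):
        name = entry["name"]
        if not name.startswith("child["):
            # Skip any entries that are not attached devices.
            continue
        # Resolve child[N] to the device's full object path.
        path = execute("qom-get", {"path": bus, "property": name})
        details = {"path": path}
        # One further qom-get per property of interest.
        for prop in ("port", "hostbus", "hostaddr"):
            details[prop] = execute("qom-get", {"path": path, "property": prop})
        devices.append(details)
    return devices
```

For a controller with two devices this costs nine round trips (one qom-list plus four qom-get calls per device), which is the "lot of steps" in practice.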

As you can see from the examples above, QMP is not exactly simple to use, nor is it entirely self-consistent.  (Note the underscores in device_add and device_del and the dashes in qom-list and qom-get.)  It is easy to parse and generate, but there are a lot of steps to run through to get the necessary data.

Monitoring the device

At this point, I had a manual way to do what the tools already did.  The next step was to figure out how to automate it.  I reasoned that if I had a root device that was dedicated to the guest VM, I could listen for events on that device and automatically route all connections to the guest.  So I grabbed an extra PCIe USB card I had lying around, threw it into the machine and got to work.

After some research, I found a Python library called pyudev.  It is not truly asynchronous, so I programmed it to poll with a timeout of zero (so the call never blocks), and added an asyncio.sleep(1.0) call to only poll for events once a second.  This means that there is a possible delay of a second after plugging in a device, but I’m pretty sure it takes Windows much longer than that to initialize it anyway.

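# Module-level imports used by this method.
import asyncio
import pyudev

async def monitor_devices(self) -> None:
    # Listen for kernel udev events, restricted to the USB subsystem.
    monitor = pyudev.Monitor.from_netlink(self.__context)
    monitor.filter_by('usb')

    while True:
        # poll(0) returns immediately, with None if no event is pending.
        device = monitor.poll(0)
        if device is None:
            # Nothing to do: yield to the event loop and retry in a second.
            await asyncio.sleep(1.0)
            continue

        # Wrap the pyudev device in the script's own Device class.
        device = Device(device)
        self.__options.print_very_verbose('{0.action} on {0.device_path}'.format(device))
        if device.action == "add":
            if self.__is_a_device_we_care_about(device):
                await self.device_added.fire(device)
        elif device.action == "remove":
            # Always report removals, even for devices we never attached.
            await self.device_removed.fire(device)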

When an interesting device is added, the device_added event is fired.  The code always fires the device_removed event, to cover devices that the script may have missed.
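The device_added and device_removed objects behave like simple asynchronous events: handlers are registered up front and awaited in turn when fire is called.  The class below is a guess at a minimal shape for such an event, written for illustration.  AsyncEvent and subscribe are hypothetical names and may not match the repository.

```python
import asyncio


class AsyncEvent:
    """A tiny event object whose handlers are coroutine functions."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        """Register a coroutine function to be awaited on each fire()."""
        self._handlers.append(handler)

    async def fire(self, *args):
        """Await every registered handler, in registration order."""
        # Iterate over a copy so handlers may subscribe others while firing.
        for handler in list(self._handlers):
            await handler(*args)


async def demo():
    attached = []

    async def on_added(device):
        # A real handler would send device_add over QMP here.
        attached.append(device)

    device_added = AsyncEvent()
    device_added.subscribe(on_added)
    await device_added.fire("1-1")
    return attached


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Awaiting handlers one at a time keeps QMP commands from overlapping on the socket, at the cost of a slow handler delaying the next poll.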

There is a lot of plumbing and methods for registering hubs and specific devices to watch for, but this loop is the meat.  Combined with QMP commands, I was able to hot-plug and hot-unplug USB devices in a guest operating system.

Additional Features

I’ve added a lot of functionality to the script since I first conceived it.  I’m not going into the details here, but in its current incarnation, it can:

  • Hot-plug and unplug USB devices connected to specific (configurable) hubs
  • Hot-plug and unplug USB devices specified by their Vendor:Device IDs (on any hub)
  • Create a new virtual controller if the first one fills up
  • Watch for VM reboots and reconnect when the VM is back up
  • Read its settings from a YAML config file
  • Run setuid root through a wrapper written in C (instead of sudo)

On the Roadmap

  • Run as a daemon
  • Refactoring/cleanup
  • Watch optical drives
  • Rotating logs
  • Limit number of controllers created
  • External control (Web-API)
  • Multiple VM support
