This article is part of a series.

In the cloud computing tutorial we start VMs with QEMU dynamically, but I rely on the DHCP server in my router to assign IP addresses. This is a problem because every time I want to SSH into a server, I have to look up the current IP lease for the VM in my router’s web interface. In this part of the tutorial series we will look at possible solutions to retrieve the guest IP address of a QEMU VM and implement one of them.

Possibilities to Detect IP Address of QEMU VM

First, let’s have a look at some possibilities to detect the IP address of a QEMU VM. We will see three different possibilities, each with its own advantages and disadvantages.

DHCP Leases File

One approach would be similar to what I am currently doing: We can use our own DHCP server and look up the mapping in the lease file of the DHCP server. For example, my OpenWRT router writes its IPv4 leases to /tmp/dhcp.leases by default:

1607742671 fe:24:38:01:4d:74 192.168.2.122 minikloud ff:b5:5e:67:ff:00:02:00:00:ab:11:47:63:71:bd:14:4e:64:00

These fields mean:

  • Lease expiry as UNIX timestamp
  • MAC address of the VM
  • IP address
  • Hostname
  • Client-ID, a unique ID for the client

Side Note: You might remember that during the second tutorial I mentioned that all Ubuntu VMs receive the same IP address. This seems to be due to the client ID. The router correctly receives different MAC addresses, but Ubuntu 18.04 does not seem to use the MAC address for calculation of the client ID, but instead the hostname (which is the same for all my VMs).

During the creation of guest VMs we always set a randomly generated MAC address. Since the leases file contains both the MAC address and the IP address, we can easily find out the IP addresses of QEMU guests by parsing the lease file into a lookup dictionary.
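The lookup dictionary could be built roughly like this. This is only a minimal sketch: the function name parse_leases is made up for illustration, and it assumes the field order shown above (expiry, MAC address, IP address, hostname, client ID).

```python
from typing import Dict


def parse_leases(lease_text: str) -> Dict[str, str]:
    """Map MAC address -> IP address from lease lines in the format above."""
    mac_to_ip = {}
    for line in lease_text.splitlines():
        fields = line.split()
        # Assumed field order: expiry, MAC, IP, hostname, client ID
        if len(fields) < 3:
            continue  # skip empty or truncated lines
        mac_to_ip[fields[1].lower()] = fields[2]
    return mac_to_ip


# The sample line from the leases file shown above
sample = (
    '1607742671 fe:24:38:01:4d:74 192.168.2.122 minikloud '
    'ff:b5:5e:67:ff:00:02:00:00:ab:11:47:63:71:bd:14:4e:64:00\n'
)
leases = parse_leases(sample)
print(leases['fe:24:38:01:4d:74'])
```

Since the file lives on the DHCP server, the host would still need some way to read it (for example by copying it over), which is not shown here.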

Querying ARP Table

Another approach that works even without a DHCP server is querying the ARP table on the host machine. This can be done for example with ip neigh:

192.168.2.122 dev br0 lladdr fe:24:38:01:4d:74 STALE

While this is probably the simplest method to implement, it has the disadvantage that it only works if the VM is in the ARP cache. This would be the case if there is either some form of communication between the VM and the host or the VM guest uses ARP announcements. To my understanding this should usually be the case, but I wouldn’t want to fully rely on it. It still might happen that some guest does not send an ARP announcement.

If there are no announcements, the host could also loop through all possible IPs and try to establish communication - that way it will receive the MAC addresses for all connected IP addresses and the ARP cache will be filled. However, this is only possible for small subnet ranges.
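Parsing the ip neigh output into a lookup table could be sketched as follows. The function name parse_ip_neigh is hypothetical, and the second sample line (an entry without a resolved MAC address) is made up to show how such entries are skipped.

```python
from typing import Dict, List


def parse_ip_neigh(output: str) -> Dict[str, List[str]]:
    """Map MAC address -> list of IP addresses from `ip neigh` output."""
    mac_to_ips: Dict[str, List[str]] = {}
    for line in output.splitlines():
        fields = line.split()
        # Entries without a resolved MAC address have no lladdr field
        if 'lladdr' not in fields:
            continue
        position = fields.index('lladdr')
        if position + 1 >= len(fields):
            continue  # malformed line, no value after lladdr
        mac = fields[position + 1].lower()
        mac_to_ips.setdefault(mac, []).append(fields[0])
    return mac_to_ips


sample_output = (
    '192.168.2.122 dev br0 lladdr fe:24:38:01:4d:74 STALE\n'
    '192.168.2.50 dev br0 FAILED\n'  # hypothetical unresolved entry
)
neighbours = parse_ip_neigh(sample_output)
print(neighbours.get('fe:24:38:01:4d:74', []))
```

On the host, the input could come from running ip neigh through subprocess.run with capture_output=True and text=True. A MAC address maps to a list because the same guest may show up with several addresses.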

QEMU Guest Agent

Another possibility is using the QEMU Guest Agent. This is a program that is installed inside the guest VM and can be used for communication between the host and the guest. Unlike the QEMU monitor it can provide capabilities that depend on the actual operating system of the guest, e.g. file system operations, network interfaces, shutdown without ACPI, and some more.

On Ubuntu we can install the Guest Agent with sudo apt install qemu-guest-agent.

The documentation on how to start up an interface to the QEMU Guest Agent from the command line is quite confusing. It states that this might change in QEMU 0.16, but I am currently on version 5.1.0 of QEMU. I could not get the method to work that, according to the docs, might work as of QEMU 0.16. However, the method specified for QEMU 0.15 works well. There were already discussions about this 8 years ago.

Using the method from QEMU 0.15 we get a command line call like this:

qemu-system-x86_64 -m 4096 -accel kvm \
    -hda ubuntu.qcow2 \
    -device virtio-net-pci,netdev=pubnet,mac=fe:24:38:01:4d:74 \
    -netdev vde,id=pubnet,sock=/tmp/vde.ctl \
    -qmp unix:/tmp/aetherscale-qmp-ubuntu.sock,server,nowait \
    -chardev socket,path=/tmp/aetherscale-qga-ubuntu.sock,server,nowait,id=qga0 \
    -device virtio-serial \
    -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0

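In this case org.qemu.guest_agent.0 is the name of the virtio serial port. Inside a Linux guest it shows up as the device file /dev/virtio-ports/org.qemu.guest_agent.0. According to the docs, this is the default, so it should probably work in most cases.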

The communication channel for the Guest Agent protocol is not as advanced as the one for QMP. Unlike QMP it does not have a capabilities negotiation phase. It does not support multiple connected clients at the same time, either. Thus, it’s best to start communication by synchronizing with the agent to ensure that the agent is now serving requests from this connection:

{"execute":"guest-sync", "arguments":{"id": 1234}}
{"return": 1234}
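A client that reads responses line by line could decide whether it is in sync along these lines. This is a sketch rather than the code used later in this tutorial: the helper name wait_for_sync is made up, and it assumes that anything arriving before the echoed ID (stale responses, error messages, partial JSON) can simply be ignored.

```python
import json
from typing import Iterable


def wait_for_sync(lines: Iterable[str], sync_id: int) -> bool:
    """Return True once a response echoes sync_id, ignoring earlier lines."""
    for line in lines:
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # partial or garbled data, e.g. from an earlier session
        if isinstance(message, dict) and message.get('return') == sync_id:
            return True
    return False


# Simulated responses: a truncated line, an error, then the echoed ID
responses = [
    '{"retu',
    '{"error": {"class": "GenericError", "desc": "example error"}}',
    '{"return": 1234}',
]
print(wait_for_sync(responses, 1234))
```

In a real client the lines would come from the socket, and a read timeout would end the loop if the agent never answers.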

We can now fetch the network information with the command guest-network-get-interfaces. The response arrives as a single line; I added the line breaks for better readability.

{"execute": "guest-network-get-interfaces"}
{"return": [
    {
        "name": "lo",
        "ip-addresses": [
            {"ip-address-type": "ipv4", "ip-address": "127.0.0.1", "prefix": 8},
            {"ip-address-type": "ipv6", "ip-address": "::1", "prefix": 128}
        ],
        "statistics": {
            "tx-packets": 84,
            "tx-errs": 0,
            "rx-bytes": 7204,
            "rx-dropped": 0,
            "rx-packets": 84,
            "rx-errs": 0,
            "tx-bytes": 7204,
            "tx-dropped": 0
        },
        "hardware-address": "00:00:00:00:00:00"
    }, {
        "name": "ens3",
        "ip-addresses": [{
            "ip-address-type": "ipv4",
            "ip-address": "192.168.2.122",
            "prefix": 24
        }, {
            "ip-address-type": "ipv6",
            "ip-address": "2001:db8::21f",
            "prefix": 128
        }, {
            "ip-address-type": "ipv6",
            "ip-address": "2001:db8::fc24:38ff:fe01:4d74",
            "prefix": 64
        }, {
            "ip-address-type": "ipv6",
            "ip-address": "fe80::fc24:38ff:fe01:4d74",
            "prefix": 64
        }],
        "statistics": {
            "tx-packets": 31,
            "tx-errs": 0,
            "rx-bytes": 2558,
            "rx-dropped": 0,
            "rx-packets": 22,
            "rx-errs": 0,
            "tx-bytes": 2943,
            "tx-dropped": 0
        },
        "hardware-address": "fe:24:38:01:4d:74"
    }
]}

Here we can see that the MAC address fe:24:38:01:4d:74 has the IPv4 address 192.168.2.122 and the IPv6 addresses 2001:db8::21f and 2001:db8::fc24:38ff:fe01:4d74. The former IPv6 address was assigned via DHCPv6 by my router, the latter is an auto-generated address from the MAC address using the prefix announced by the router.
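To illustrate how the auto-generated address relates to the MAC address, here is a small sketch. It assumes the guest derives the interface identifier with the modified EUI-64 scheme, which is consistent with the addresses in the response above. Guests that use privacy or stable-privacy addresses will produce different values. The function name slaac_address is made up.

```python
import ipaddress


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Build the modified EUI-64 address for a MAC inside a /64 prefix."""
    octets = bytearray(int(part, 16) for part in mac.split(':'))
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe between the two halves of the MAC address
    interface_id = bytes(octets[:3]) + b'\xff\xfe' + bytes(octets[3:])
    network = ipaddress.IPv6Network(prefix)
    # Indexing a network returns network address + offset
    return network[int.from_bytes(interface_id, 'big')]


mac = 'fe:24:38:01:4d:74'
print(slaac_address('2001:db8::/64', mac))
print(slaac_address('fe80::/64', mac))
```

The same derivation with the fe80::/64 prefix gives the link-local address from the response.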

Implementation

Since I have full control over my guest VMs I will implement the method using the QEMU Guest Agent. I will implement it using a generic class interface so that it’s no problem to add other IP retrieval methods later. The QEMU guest agent protocol is very similar to QMP which means that we can re-use the class QemuMonitor from a previous tutorial with minor modifications. This tutorial is based on the code at commit 090cdee.

Let’s change this class first. The QEMU Guest Agent protocol does not start with capabilities negotiation. Instead, we should ideally start by syncing with the server. This is done by sending the guest-sync command with a randomly generated ID. The server will respond with this ID if it is in sync. Additionally, the QEMU Guest Agent might not respond at all, so we will implement timeout behaviour. It is also recommended to precede the guest-sync command with a 0xFF byte to ensure that the server flushes partial JSON messages from previous connections.

The fully changed class looks like this. We have two different initialization methods, _initialize_qmp and _initialize_guest_agent, and the selected protocol decides which one runs. execute has been extended to accept additional command arguments, which works with both QMP and the Guest Agent protocol.

import enum
import logging
import json
from pathlib import Path
import random
import socket
from typing import Any, Dict, Optional


class QemuException(Exception):
    pass


class QemuProtocol(enum.Enum):
    QMP = enum.auto()
    QGA = enum.auto()


class QemuMonitor:
    # TODO: Improve QMP communication, spec is here:
    # https://github.com/qemu/qemu/blob/master/docs/interop/qmp-spec.txt
    def __init__(
            self, socket_file: Path, protocol: QemuProtocol,
            timeout: Optional[float] = None):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(str(socket_file))
        self.f = self.sock.makefile('rw')
        self.protocol = protocol

        if timeout:
            self.sock.settimeout(timeout)

        # Initialize connection immediately
        self._initialize()

    def execute(
            self, command: str,
            arguments: Optional[Dict[str, Any]] = None) -> Any:
        message = {'execute': command}
        if arguments:
            message['arguments'] = arguments

        json_line = json.dumps(message) + '\r\n'
        logging.debug(f'Sending message to QEMU: {json_line}')
        self.sock.sendall(json_line.encode('utf-8'))
        return json.loads(self.readline())

    def _initialize(self):
        if self.protocol == QemuProtocol.QMP:
            self._initialize_qmp()
        elif self.protocol == QemuProtocol.QGA:
            self._initialize_guest_agent()
        else:
            raise ValueError('Unknown QemuProtocol')

    def _initialize_qmp(self):
        # Read the capabilities
        self.f.readline()

        # Acknowledge the QMP capability negotiation
        self.execute('qmp_capabilities')

    def _initialize_guest_agent(self):
        # make the server flush partial JSON from previous connections
        prepend_byte = b'\xff'
        self.sock.sendall(prepend_byte)

        rand_int = random.randint(100000, 1000000)
        self.execute('guest-sync', {'id': rand_int})

        return json.loads(self.readline())

    def readline(self) -> Any:
        try:
            logging.debug('Waiting for message from QEMU')
            data = self.f.readline()
            logging.debug(f'Received message from QEMU: {data}')
            return data
        except socket.timeout:
            raise QemuException(
                'Could not communicate with QEMU, is QMP server or GA running?')

We will display the IP address of VMs on a call to list-vms. This is probably not the best place for it in the future (at least not without caching), but it’s good enough for testing. Responses should look like this:

[{
   "vm-id": "123456",
   "ip-addresses": ["192.168.0.2", "2001:0db8:85a3:0000:0000:8a2e:0370:7334"]
}, {
   "vm-id": "234567",
   "ip-addresses": ["192.168.0.3", "2001:0db8:85a3:0000:0000:8a2e:0370:7335"]
}]

The class to actually fetch the IP address is then basically wrapper code around the QemuMonitor:

class GuestAgentIpAddress:
    def __init__(self, socket_file: Path, timeout: float = 1):
        self.comm_channel = QemuMonitor(socket_file, QemuProtocol.QGA, timeout)

    def fetch_ip_addresses(self):
        resp = self.comm_channel.execute('guest-network-get-interfaces')
        return self._parse_ips_from_response(resp)

    def _parse_ips_from_response(self, response):
        ips = []

        try:
            for interface in response['return']:
                for address in interface['ip-addresses']:
                    ips.append(address['ip-address'])

            return ips
        except KeyError:
            return []

With this class it’s now quite simple to get the list of IP addresses in a call to list-vms. We need a new function qemu_socket_guest_agent that generates the path to the Guest Agent socket for us. When a new VM is created, QEMU will listen on this socket. On calls to list-vms we will open a connection to this socket and query the IP addresses:

def qemu_socket_guest_agent(vm_id: str) -> Path:
    return Path(f'/tmp/aetherscale-qga-{vm_id}.sock')


def list_vms(_: Dict[str, Any]) -> List[Dict[str, Any]]:
    vms = []

    for proc in psutil.process_iter(['pid', 'name']):
        if proc.name().startswith('vm-'):
            vm_id = proc.name()[3:]

            socket_file = qemu_socket_guest_agent(vm_id)
            hint = None
            ip_addresses = []
            try:
                fetcher = qemu.GuestAgentIpAddress(socket_file)
                ip_addresses = fetcher.fetch_ip_addresses()
            except qemu.QemuException:
                hint = 'Could not retrieve IP address for guest'

            msg = {
                'vm-id': vm_id,
                'ip-addresses': ip_addresses,
            }
            if hint:
                msg['hint'] = hint

            vms.append(msg)

    return vms

The changed section in create_qemu_systemd_unit with the new options for the Guest Agent looks like this:

    qga_monitor_path = qemu_socket_guest_agent(qemu_config.vm_id)
    qga_chardev_quoted = shlex.quote(
        f'socket,path={qga_monitor_path},server,nowait,id=qga0')

    command = f'qemu-system-x86_64 -m 4096 -accel kvm -hda {hda_quoted} ' \
        f'-device {device_quoted} -netdev {netdev_quoted} ' \
        f'-name {name_quoted} ' \
        '-nographic ' \
        f'-qmp {socket_quoted} ' \
        f'-chardev {qga_chardev_quoted} ' \
        '-device virtio-serial ' \
        '-device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0'

Let’s try this out. Don’t forget to install and activate the QEMU Guest Agent in your base image before trying this. If the Guest Agent is not running inside the VM, you will see the hint that the IP address could not be retrieved from the guest.

$ aetherscale-cli create-vm --image ubuntu_base
[{'execution-info': {'status': 'success'},
'response': {'status': 'starting', 'vm-id': 'rszsauli'}}]

$ aetherscale-cli list-vms
[{'execution-info': {'status': 'success'},
'response': [{'vm-id': 'rszsauli', 'ip-addresses': ['127.0.0.1', '::1',
'192.168.2.219', '2001:db8:1234::21f', '2001:db8:1234:0:1c0f:bff:fef6:f30',
'fe80::1c0f:bff:fef6:f30']}]}]
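The list above also contains loopback and link-local addresses, which are not useful for connecting to the guest via SSH. One possible way to drop them uses Python’s ipaddress module. This is an optional sketch and not part of the code at the commit below; the function name routable_addresses is made up.

```python
import ipaddress
from typing import List


def routable_addresses(addresses: List[str]) -> List[str]:
    """Drop loopback and link-local addresses, keep everything else."""
    result = []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        if ip.is_loopback or ip.is_link_local:
            continue
        result.append(text)
    return result


# The addresses reported by list-vms above
reported = [
    '127.0.0.1', '::1', '192.168.2.219', '2001:db8:1234::21f',
    '2001:db8:1234:0:1c0f:bff:fef6:f30', 'fe80::1c0f:bff:fef6:f30',
]
print(routable_addresses(reported))
```

Such a filter could be applied to the result of fetch_ip_addresses before the addresses are added to the response.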

Now we can retrieve all IP addresses from our guest machines. In the next tutorial we will do a small interlude and use the current state of the project to automatically create VMs with some services running.

You can get the full code at the end of this tutorial at commit cea5d87.
