Persistent QEMU Instances with systemd
If you’re running QEMU instances without libvirt, you have the problem that no daemon manages your instances: if your host reboots, the VMs will not be re-created. In this part of my cloud computing tutorial I will use systemd to create QEMU VMs that are managed by a daemon (in this case systemd). If you haven’t followed along with my tutorial series, you can just read the systemd info and skip the Python parts.
One limitation of the systemd approach is that we should not start VMs with a graphical user interface this way. It might work, but it’s not recommended. For the cloud computing tutorial this only limits debugging, because on a real cloud computing provider we would not want to see a VM window on our host anyway. For GUI instances, customers would connect with a remote desktop utility.
Up to now I have used the graphical output of QEMU; now we will use SSH to access our machines. For IP address assignment I will rely on the DHCP server of my router. When we set up private VPNs for a collection of VMs we will have to run our own DHCP server to assign IP addresses.
Luckily, the Ubuntu base image I created earlier already starts an SSH server in the background, so there is nothing more to prepare. If you used another image without an SSH server, install one and enable it as a service in your base image.
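On a Debian or Ubuntu base image this could look as follows inside the VM (an assumption on my side; package and service names differ on other distributions):

$ sudo apt install openssh-server
$ sudo systemctl enable ssh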
A systemd unit file for QEMU
What would a systemd unit file for QEMU look like? It’s quite simple actually. I’m using user-mode systemd, but of course you can also use system-wide systemd with User= and Group= directives in the [Service] section (QEMU instances should not be run as root):
[Unit]
Description=QEMU VM Operating System X
[Service]
ExecStart=qemu-system-x86_64 -m 4096 -accel kvm -nographic ...
[Install]
WantedBy=default.target
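Assuming you save this file as, say, ~/.config/systemd/user/my-vm.service (a hypothetical name; the directory is systemd’s default location for user units), you can then manage the VM like any other user service:

$ systemctl --user daemon-reload
$ systemctl --user enable --now my-vm.service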
Just extend the qemu command with all options that you need for your exact use case. The important option for usage with systemd is -nographic, because otherwise QEMU will try to open a window on the window manager. This would fail if you don’t set the right environment variables and is not recommended anyway. If you need to start a VM with a GUI window, use the autostart method of your window manager.
Return the VM ID on startup
In the Python code we will perform one small change before we actually start our VMs as systemd services.
Up to now, when we started a VM we got a window and could check the status there. Now, if we want to query the status of a VM, we’ll have to ask systemd. For this we need an identifier for our machine, e.g. the VM ID. At the moment the VM ID is not returned, though.
In the client, we will change response_expected for start-vm to True:
# ...
elif args.subparser_name == 'start-vm':
    response_expected = True
    data = {
        'command': 'start-vm',
        'options': {
            'image': args.image,
        }
    }
# ...
And in computing.py we will set a response value after the VM was started. The response status will be starting, because to the user the VM might still seem to be in the starting stage and possibly not responsive to SSH for a while. We will think about a clean lifecycle naming scheme later, but I think it could be something like starting, running, crashed and stopped, probably with some open issues remaining regarding crashed and stopped.
# ...
print(f'Started VM "{vm_id}"')

response = {
    'status': 'starting',
    'vm-id': vm_id,
}
With these two small changes we will now see the ID of a started VM in the terminal:
$ aetherscale-cli start-vm --image ubuntu_base
[{'status': 'starting', 'vm-id': 'wiuakczz'}]
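As an aside, the lifecycle naming scheme mentioned above could eventually be modelled as a small enum. This is only a sketch of the idea and not yet part of the tutorial code:

from enum import Enum


class VmStatus(Enum):
    STARTING = 'starting'
    RUNNING = 'running'
    CRASHED = 'crashed'
    STOPPED = 'stopped'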
Starting a VM
To start a new VM we will create a systemd unit file and then start and enable the service. This can be achieved with the functions copy_systemd_unit, start_systemd_unit and enable_systemd_unit which we already created previously. These actions will replace the current call to subprocess.Popen. Additionally, we will create a new function create_qemu_systemd_unit that creates a new systemd unit for a VM.
Let’s first create a data class to pass all required QEMU startup options between functions.
from dataclasses import dataclass
from pathlib import Path


@dataclass
class QemuStartupConfig:
    vm_id: str
    hda_image: Path
    mac_addr: str
    vde_folder: Path
vm_id contains the ID of our VM, hda_image is the full path to the image for this VM (which uses the base image as a backing image), mac_addr of course is the MAC address for the network interface and vde_folder is the path to the previously set up VDE network.
At the same time I removed all locations in the code that distinguish between TAP networking and VDE networking. I will only support VDE networking in the future. In case we need TAP networking again, we can refer to older git commits.
With this data structure available, let’s create a function create_qemu_systemd_unit that writes the systemd unit file.
import os
import shlex
import tempfile


def create_qemu_systemd_unit(
        unit_name: str, qemu_config: QemuStartupConfig):
    hda_quoted = shlex.quote(str(qemu_config.hda_image.absolute()))
    device_quoted = shlex.quote(
        f'virtio-net-pci,netdev=pubnet,mac={qemu_config.mac_addr}')
    netdev_quoted = shlex.quote(
        f'vde,id=pubnet,sock={str(qemu_config.vde_folder)}')
    name_quoted = shlex.quote(
        f'qemu-vm-{qemu_config.vm_id},process=vm-{qemu_config.vm_id}')

    command = f'qemu-system-x86_64 -m 4096 -accel kvm -hda {hda_quoted} ' \
        f'-device {device_quoted} -netdev {netdev_quoted} ' \
        f'-name {name_quoted} ' \
        '-nographic'

    with tempfile.NamedTemporaryFile(mode='w+t', delete=False) as f:
        f.write('[Unit]\n')
        f.write(f'Description=aetherscale VM {qemu_config.vm_id}\n')
        f.write('\n')
        f.write('[Service]\n')
        f.write(f'ExecStart={command}\n')
        f.write('\n')
        f.write('[Install]\n')
        f.write('WantedBy=default.target\n')

    execution.copy_systemd_unit(Path(f.name), unit_name)
    os.remove(f.name)
First, I quote all arguments containing input that might require quoting. It would be quite bad if an attacker could execute commands on our host. Then I write a pretty simple systemd unit file with the full QEMU command and the VM ID in the unit description.
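To see why the quoting matters, here is a quick demonstration of shlex.quote with a malicious, attacker-controlled value (the image path is made up for this example):

>>> import shlex
>>> shlex.quote('image.qcow2; rm -rf ~')
"'image.qcow2; rm -rf ~'"

The whole value becomes a single shell word, so the semicolon can no longer terminate the QEMU command and start a second one.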
With this function prepared, all we have to do is change the actual startup of the VM inside callback to use this function and start_systemd_unit instead of calling subprocess.Popen. Since we will need the name of the systemd unit at different places throughout the code, we will also create a function to compute it.
def systemd_unit_name_for_vm(vm_id: str) -> str:
    return f'aetherscale-vm-{vm_id}.service'
def callback(ch, method, properties, body):
    # ...
    mac_addr = interfaces.create_mac_address()
    print(f'Assigning MAC address "{mac_addr}" to VM "{vm_id}"')

    qemu_config = QemuStartupConfig(
        vm_id=vm_id,
        hda_image=user_image,
        mac_addr=mac_addr,
        vde_folder=Path(VDE_FOLDER))
    unit_name = systemd_unit_name_for_vm(vm_id)
    create_qemu_systemd_unit(unit_name, qemu_config)
    execution.start_systemd_unit(unit_name)

    print(f'Started VM "{vm_id}"')
    response = {
        'status': 'starting',
        'vm-id': vm_id,
    }
    # ...
With this change, run the server with the command aetherscale and start a machine from the command line:
$ aetherscale-cli start-vm --image ubuntu_base
[{'status': 'starting', 'vm-id': 'wiuakczz'}]
$ systemctl status --user aetherscale-vm-wiuakczz.service
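Because of -nographic, QEMU multiplexes the serial console and the QEMU monitor onto standard output, which systemd captures in the journal. Depending on the guest’s console configuration, you can follow that output for the unit from the example above with:

$ journalctl --user -u aetherscale-vm-wiuakczz.service -f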
Stopping a VM
We can also stop our instances with systemd. When systemd stops a process it sends a SIGTERM signal, followed by SIGKILL if the process does not stop. This does not mean that the operating system inside the VM is allowed to shut down cleanly, but it’s a clean stop command for QEMU.
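If you want to give QEMU more (or less) time before systemd escalates to SIGKILL, you can tune the timeout in the [Service] section of the unit file. This directive is optional and not used in the tutorial:

[Service]
TimeoutStopSec=30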
It’s also possible to specify a custom command to execute on stop with ExecStop. This can be useful for performing a clean VM shutdown through the QEMU monitor, but we will only do this in a later tutorial.
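As a rough sketch of where this is heading: if the VM were started with a monitor socket (e.g. -monitor unix:/tmp/my-vm.sock,server,nowait, with a made-up socket path), an ExecStop line could send a clean powerdown request through that socket. This is only an illustration, not the solution we will actually build:

[Service]
ExecStop=/bin/sh -c 'echo system_powerdown | socat - UNIX-CONNECT:/tmp/my-vm.sock'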
To implement this, we will create three new functions stop_systemd_unit, disable_systemd_unit and delete_systemd_unit:
def systemd_unit_path(unit_name: str) -> Path:
    systemd_unit_dir = Path.home() / '.config/systemd/user'
    return systemd_unit_dir / unit_name


def delete_systemd_unit(unit_name: str):
    systemd_unit_path(unit_name).unlink(missing_ok=True)


def stop_systemd_unit(unit_name: str) -> bool:
    return run_command_chain([
        ['systemctl', '--user', 'stop', unit_name],
    ])


def disable_systemd_unit(unit_name: str) -> bool:
    return run_command_chain([
        ['systemctl', '--user', 'disable', unit_name],
    ])
With this, we can adjust our code to stop a VM:
def callback(ch, method, properties, body):
    # ...
    elif data['command'] == 'stop-vm':
        try:
            vm_id = data['options']['vm-id']
        except KeyError:
            print('VM ID not specified', file=sys.stderr)
            return

        unit_name = systemd_unit_name_for_vm(vm_id)

        is_running = execution.systemctl_is_running(unit_name)
        if is_running:
            execution.disable_systemd_unit(unit_name)
            execution.stop_systemd_unit(unit_name)

            response = {
                'status': 'killed',
                'vm-id': vm_id,
            }
        else:
            response = {
                'status': 'error',
                'reason': f'VM "{vm_id}" does not exist',
            }
If you’ve followed closely you might notice that this chain of actions will delete the systemd unit file, but keep the VM’s image file around. This is undesirable: either we want to fully delete the VM, in which case we don’t need its image anymore, or we want to be able to restart it, in which case we can keep the systemd file as well.
Cleaning up the CLI commands and messages
Up to now there was only a distinction between starting and stopping a VM, but now that we can restart stopped VMs we should distinguish between more actions:
- Create a new VM
- (Re-)start a stopped VM
- Stop a running VM
- Delete a VM
So, let’s clean up our current VM messages and create a new set of messages for VM management:
Creating a new VM:

{
    "command": "create-vm",
    "options": {
        "image": "some-image-name"
    }
}

Stopping a running VM:

{
    "command": "stop-vm",
    "options": {
        "vm-id": "id-of-the-vm"
    }
}

Re-starting a stopped VM:

{
    "command": "start-vm",
    "options": {
        "vm-id": "id-of-the-vm"
    }
}

Deleting a VM:

{
    "command": "delete-vm",
    "options": {
        "vm-id": "id-of-the-vm"
    }
}
Now that we know which messages we want to support we can implement the logic. While we’re at it, let’s change the callback function to only include the actual message handling logic:
from typing import Any, Callable, Dict


def callback(ch, method, properties, body):
    command_fn: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
        'list-vms': list_vms,
        'create-vm': create_vm,
        'start-vm': start_vm,
        'stop-vm': stop_vm,
        'delete-vm': delete_vm,
    }

    message = body.decode('utf-8')
    logging.debug('Received message: ' + message)

    data = json.loads(message)

    try:
        command = data['command']
    except KeyError:
        logging.error('No "command" specified in message')
        return

    try:
        fn = command_fn[command]
    except KeyError:
        logging.error(f'Invalid command "{command}" specified')
        return

    options = data.get('options', {})
    try:
        response = fn(options)
        # if the function returned a response without raising,
        # set its execution status to success
        resp_message = {
            'execution-info': {
                'status': 'success'
            },
            'response': response,
        }
    except Exception as e:
        resp_message = {
            'execution-info': {
                'status': 'error',
                # TODO: Only output message if it is an exception generated by us
                'reason': str(e),
            }
        }

    ch.basic_ack(delivery_tag=method.delivery_tag)

    if properties.reply_to:
        ch.basic_publish(
            exchange='',
            routing_key=properties.reply_to,
            properties=pika.BasicProperties(
                correlation_id=properties.correlation_id
            ),
            body=json.dumps(resp_message))
This uses a dictionary mapping from the command names to functions. If a valid command was specified, the associated function will be called. Each function has to take an options dictionary as input (even if it does not use it) and return a JSON-serializable response.
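For illustration, a minimal command function conforming to this signature could look like the following. The body is made up; the real list_vms of the project queries systemd for the running units:

from typing import Any, Dict, List


def list_vms(options: Dict[str, Any]) -> List[str]:
    # options is part of the interface even though this command ignores it
    return ['vm-wiuakczz']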
Now we can also move the code for VM creation into its own function. While we’re at it, we will replace all print statements with proper logging, and instead of returning on errors we will raise exceptions.
If a command raises an exception, this means that an error has happened. In this case we will write it to a sub-structure called execution-info, which holds all meta-information about the execution. The actual response will go into a sub-structure called response.
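On the wire, a successful response then looks like this (matching the example output further below):

{
    "execution-info": {"status": "success"},
    "response": {"status": "starting", "vm-id": "wiuakczz"}
}

An error response only carries the execution-info part with status error and a reason.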
def create_vm(options: Dict[str, Any]) -> Dict[str, str]:
    vm_id = ''.join(
        random.choice(string.ascii_lowercase) for _ in range(8))
    logging.info(f'Starting VM "{vm_id}"')

    try:
        image_name = os.path.basename(options['image'])
    except KeyError:
        raise ValueError('Image not specified')

    try:
        user_image = create_user_image(vm_id, image_name)
    except (OSError, QemuException):
        raise

    mac_addr = interfaces.create_mac_address()
    logging.debug(f'Assigning MAC address "{mac_addr}" to VM "{vm_id}"')

    qemu_config = QemuStartupConfig(
        vm_id=vm_id,
        hda_image=user_image,
        mac_addr=mac_addr,
        vde_folder=Path(VDE_FOLDER))
    unit_name = systemd_unit_name_for_vm(vm_id)
    create_qemu_systemd_unit(unit_name, qemu_config)
    execution.start_systemd_unit(unit_name)

    logging.info(f'Started VM "{vm_id}"')
    return {
        'status': 'starting',
        'vm-id': vm_id,
    }
Inside the stop_vm code we will change the behaviour a bit. If the VM does not exist at all, we will raise an exception. If it was already stopped, we will return the instance status nonetheless, but with an additional hint that it was already stopped. This hint could be displayed to the user and can be ignored by automatic scripts that only wanted to stop the VM (which did succeed, just maybe another process did the same thing at the same time).
To be able to distinguish between these situations we need a new function systemd_unit_exists to check whether a unit exists. Since we install all VM units ourselves, we know which directory they are in and just have to check for file existence.
def systemd_unit_exists(unit_name: str) -> bool:
    return systemd_unit_path(unit_name).is_file()
Calling that function, we can distinguish between a unit that exists but is not running and one that does not exist at all.
def stop_vm(options: Dict[str, Any]) -> Dict[str, str]:
    try:
        vm_id = options['vm-id']
    except KeyError:
        raise ValueError('VM ID not specified')

    unit_name = systemd_unit_name_for_vm(vm_id)

    if not execution.systemd_unit_exists(unit_name):
        raise RuntimeError('VM does not exist')
    elif not execution.systemctl_is_running(unit_name):
        response = {
            'status': 'killed',
            'vm-id': vm_id,
            'hint': f'VM "{vm_id}" was not running',
        }
    else:
        execution.disable_systemd_unit(unit_name)
        execution.stop_systemd_unit(unit_name)

        response = {
            'status': 'killed',
            'vm-id': vm_id,
        }

    return response
Next up is the start_vm function. To start a VM that exists but was previously stopped, we just have to enable and start the systemd service. Again, we will return a hint if the user tried to start an already started VM. It might be more reasonable to return the status running in this case, because the VM is already running. But since it’s also possible that another process started the VM one second ago and it’s still not available, we will just use starting as the response status. Returning the actual VM status can be done once we have proper status management.
def start_vm(options: Dict[str, Any]) -> Dict[str, str]:
    try:
        vm_id = options['vm-id']
    except KeyError:
        raise ValueError('VM ID not specified')

    unit_name = systemd_unit_name_for_vm(vm_id)

    if not execution.systemd_unit_exists(unit_name):
        raise RuntimeError('VM does not exist')
    elif execution.systemctl_is_running(unit_name):
        response = {
            'status': 'starting',
            'vm-id': vm_id,
            'hint': f'VM "{vm_id}" was already started',
        }
    else:
        execution.start_systemd_unit(unit_name)
        execution.enable_systemd_unit(unit_name)

        response = {
            'status': 'starting',
            'vm-id': vm_id,
        }

    return response
The last one is deletion of a VM. This is a combination of stopping the VM (de-registering it from systemd) and deleting all its resources. Up to now the resources are only the VM image and the systemd unit file. In order not to duplicate code, we will call stop_vm, ignore the return message and then clean up the resources.
We will also create a new function user_image_path to construct the target path for a user image from a VM ID. It doesn’t really do much, but it’s required in two different locations and we will probably change the path at some point in the future.
def user_image_path(vm_id: str) -> Path:
    return USER_IMAGE_FOLDER / f'{vm_id}.qcow2'


def delete_vm(options: Dict[str, Any]) -> Dict[str, str]:
    try:
        vm_id = options['vm-id']
    except KeyError:
        raise ValueError('VM ID not specified')

    stop_vm(options)

    unit_name = systemd_unit_name_for_vm(vm_id)
    user_image = user_image_path(vm_id)

    execution.delete_systemd_unit(unit_name)
    user_image.unlink()

    return {
        'status': 'deleted',
        'vm-id': vm_id,
    }
Finally, we also have to adjust the command line interface in client.py. As start-vm, stop-vm and delete-vm all take the same arguments, I will handle them in one block of code.
def main():
    parser = argparse.ArgumentParser(
        description='Manage aetherscale instances')
    subparsers = parser.add_subparsers(dest='subparser_name')

    create_vm_parser = subparsers.add_parser('create-vm')
    create_vm_parser.add_argument(
        '--image', help='Name of the image to create a VM from',
        required=True)
    start_vm_parser = subparsers.add_parser('start-vm')
    start_vm_parser.add_argument(
        '--vm-id', dest='vm_id', help='ID of the VM to start', required=True)
    stop_vm_parser = subparsers.add_parser('stop-vm')
    stop_vm_parser.add_argument(
        '--vm-id', dest='vm_id', help='ID of the VM to stop', required=True)
    delete_vm_parser = subparsers.add_parser('delete-vm')
    delete_vm_parser.add_argument(
        '--vm-id', dest='vm_id', help='ID of the VM to delete', required=True)
    subparsers.add_parser('list-vms')

    args = parser.parse_args()

    if args.subparser_name == 'list-vms':
        response_expected = True
        data = {
            'command': 'list-vms',
        }
    elif args.subparser_name == 'create-vm':
        response_expected = True
        data = {
            'command': 'create-vm',
            'options': {
                'image': args.image,
            }
        }
    elif args.subparser_name in ['start-vm', 'stop-vm', 'delete-vm']:
        response_expected = True
        data = {
            'command': args.subparser_name,
            'options': {
                'vm-id': args.vm_id,
            }
        }
    else:
        parser.print_usage()
        sys.exit(1)

    try:
        with ServerCommunication() as c:
            result = c.send_msg(data, response_expected)
            print(result)
    except pika.exceptions.AMQPConnectionError:
        print('Could not connect to AMQP broker. Is it running?',
              file=sys.stderr)
Time to have fun
Now let’s play around with this a bit.
$ aetherscale-cli create-vm --image ubuntu_base
[{'execution-info': {'status': 'success'}, 'response': {'status': 'starting', 'vm-id': 'ygjpppvk'}}]
$ aetherscale-cli list-vms
[{'execution-info': {'status': 'success'}, 'response': ['vm-ygjpppvk']}]
$ aetherscale-cli create-vm --image ubuntu_base
[{'execution-info': {'status': 'success'}, 'response': {'status': 'starting', 'vm-id': 'cxzafamg'}}]
$ systemctl --user status "aetherscale-vm-*" | grep -B 2 Active
● aetherscale-vm-cxzafamg.service - aetherscale VM cxzafamg
Loaded: loaded (/home/user/.config/systemd/user/aetherscale-vm-cxzafamg.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2020-12-06 12:00:34 CET; 2min 38s ago
--
● aetherscale-vm-ygjpppvk.service - aetherscale VM ygjpppvk
Loaded: loaded (/home/user/.config/systemd/user/aetherscale-vm-ygjpppvk.service; disabled; vendor preset: enabled)
Active: active (running) since Sun 2020-12-06 11:51:35 CET; 11min ago
$ aetherscale-cli stop-vm --vm-id cxzafamg
[{'execution-info': {'status': 'success'}, 'response': {'status': 'killed', 'vm-id': 'cxzafamg'}}]
$ aetherscale-cli list-vms
[{'execution-info': {'status': 'success'}, 'response': ['vm-ygjpppvk']}]
$ aetherscale-cli start-vm --vm-id cxzafamg
[{'execution-info': {'status': 'success'}, 'response': {'status': 'starting', 'vm-id': 'cxzafamg'}}]
$ aetherscale-cli stop-vm --vm-id ygjpppvk
[{'execution-info': {'status': 'success'}, 'response': {'status': 'killed', 'vm-id': 'ygjpppvk'}}]
$ aetherscale-cli delete-vm --vm-id ygjpppvk
[{'execution-info': {'status': 'success'}, 'response': {'status': 'deleted', 'vm-id': 'ygjpppvk'}}]
$ aetherscale-cli delete-vm --vm-id cxzafamg
[{'execution-info': {'status': 'success'}, 'response': {'status': 'deleted', 'vm-id': 'cxzafamg'}}]
list-vms only lists running VMs. It would be much nicer if it listed all VMs with their current status. Feel free to implement this if you want; a possible starting point is sketched below.
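A minimal sketch of such a listing, assuming the unit naming scheme and user unit directory from above (the helper name list_vms_with_status is made up for this illustration):

import subprocess
from pathlib import Path
from typing import Dict, List


def list_vms_with_status() -> List[Dict[str, str]]:
    unit_dir = Path.home() / '.config/systemd/user'

    vms = []
    for unit_file in sorted(unit_dir.glob('aetherscale-vm-*.service')):
        vm_id = unit_file.stem[len('aetherscale-vm-'):]
        # systemctl is-active prints e.g. "active", "inactive" or "failed"
        result = subprocess.run(
            ['systemctl', '--user', 'is-active', unit_file.name],
            capture_output=True, text=True)
        vms.append({'vm-id': vm_id, 'status': result.stdout.strip()})

    return vms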
There is one more open issue that can easily be solved: enabled systemd user services do not start automatically on bootup, but only when the user logs in. To run long-running services under a standard user account, one has to enable lingering for that account.
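Lingering is enabled with loginctl; run this as root (or via sudo) and substitute the account that runs the VMs:

$ sudo loginctl enable-linger your-user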