In this tutorial I will show how to automate the installation of Archlinux into a newly created qcow2 image using Qemu. I will present two different methods: one uses a chroot environment on an existing Linux host system, and the other performs the installation from a standard Arch Live CD booted inside Qemu.

For many applications it probably makes more sense to create a base image once and then copy it for each new VM, but some people might have a use case for my approach. For example, you might have so many different base configurations for your customers that maintaining a base image for each of them would be too much effort. Then again, this is also a situation in which automation with Ansible/Puppet/Chef could be more useful. Always think about what you need.

Creating Qemu image in a chroot environment on the host

Let’s start with the chroot approach, because it is the one that is simpler to script. This approach uses a special Archlinux bootstrap image into which we can chroot and then execute all required installation commands there. Installation of the bootloader is the toughest part in this installation scheme, because without customization bootloaders often expect that you’re in the target environment. But we’ll start with the simple steps.

First we download the required Archlinux bootstrap archive, create an empty raw image (because raw images can be mounted easily) and set up a filesystem inside the image. We can convert the raw image to qcow2 after the installation.

src=http://mirror.netcologne.de/archlinux/iso/2020.05.01/archlinux-bootstrap-2020.05.01-x86_64.tar.gz
archive=/tmp/archlinux-bootstrap-2020.05.01-x86_64.tar.gz
image=/tmp/image.raw
mountpoint=/tmp/arch

if [[ ! -f $archive ]]; then
    wget $src -O $archive
fi

mkdir -p $mountpoint

qemu-img create -f raw $image 20G
# Attach the image to an available loopback device, exposing the inner
# partition structure of the image (as e.g. /dev/loop0p1)
loop=$(sudo losetup --show -f -P $image)
# Create a partition and a file system on the device
sudo parted $loop mklabel msdos
sudo parted -a optimal $loop mkpart primary 0% 100%
sudo parted $loop set 1 boot on
loopp=${loop}p1
sudo mkfs.ext4 $loopp
# Extract archlinux-bootstrap into the image's filesystem
sudo mount $loopp $mountpoint
sudo tar xf $archive -C $mountpoint --strip-components 1

Now we can chroot into the bootstrap system and execute all required commands there. For the sake of readability I will display all commands as standard shell commands again, but in the real automation script this won’t work. arch-chroot will create an interactive shell and wait for user input instead of continuing script execution. Thus, we have to send the script as a bash command to arch-chroot for full automation (you can see this in the full script at the end of the article).

sudo $mountpoint/bin/arch-chroot $mountpoint /bin/bash
# Select any server you want to have, or better multiple servers
echo 'Server = http://ftp.uni-bayreuth.de/linux/archlinux/$repo/os/$arch' \
    >> /etc/pacman.d/mirrorlist

pacman-key --init
pacman-key --populate archlinux

pacman -Syu --noconfirm
# dhcpcd will be required for standard networking in Qemu
# syslinux will be our bootloader
pacman -S --noconfirm base linux linux-firmware mkinitcpio dhcpcd syslinux
systemctl enable dhcpcd

# Standard Archlinux Setup
ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime
hwclock --systohc
echo en_US.UTF-8 UTF-8 >> /etc/locale.gen
locale-gen
echo LANG=en_US.UTF-8 > /etc/locale.conf
echo arch-qemu > /etc/hostname
echo -e '127.0.0.1  localhost\n::1  localhost' >> /etc/hosts
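
Because the automated version pipes these commands into arch-chroot through a heredoc, it matters which shell expands which variable. The following is a minimal sketch of that mechanism and not part of the installation itself: plain bash stands in for arch-chroot so it can be tried without root, and the value of loopp is a made-up example. Unescaped variables are filled in by the outer (host) shell, while everything escaped with a backslash is left for the inner shell.

```shell
# Stand-in for the host-side variable from the script above (made-up value).
loopp=/dev/loop0p1

# Plain bash replaces "sudo $mountpoint/bin/arch-chroot $mountpoint /bin/bash".
# Line 1 of the heredoc: escaped, so the inner shell runs the command substitution.
# Line 2: $loopp is unescaped, so the outer shell substitutes it before bash starts.
# Line 3: \$inner is escaped, so the inner shell expands its own variable.
output=$(bash <<EOL
inner=\$(echo computed-inside)
echo "outer: $loopp"
echo "inner: \$inner"
EOL
)

echo "$output"
```

This is why the full script at the end writes \$(blkid ...) and \$uuid with a backslash but leaves $loopp and $loop unescaped.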

So far our setup steps are quite standard. Next we have to create an initramfs without autodetect, because autodetect would detect the modules of the current host system, which might not work in the Qemu VM (in my case it did not). Additionally, mkinitcpio tries to autodetect the current kernel version to load the modules, but my host kernel and the VM kernel (chroot) differed. Thus, I had to read the latest installed modules from /lib/modules.

# Create an initramfs without autodetect, because this breaks with the
# combination host/chroot/qemu
linux_version=$(ls /lib/modules/ | sort -V | tail -n 1)
mkinitcpio -c /etc/mkinitcpio.conf -S autodetect --kernel $linux_version \
    -g /boot/initramfs-linux-custom.img
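
To see why the version is picked with sort -V instead of a plain sort, here is a small sketch that uses a temporary directory as a stand-in for /lib/modules. The two version names are made up for illustration; sort -V is the version sort from GNU coreutils.

```shell
# Hypothetical stand-in for /lib/modules with two kernel versions installed.
modules_dir=$(mktemp -d)
mkdir "$modules_dir/5.6.9-arch1-1" "$modules_dir/5.6.11-arch1-1"

# Plain sorting compares character by character, so "9" sorts after "1".
lexical=$(ls "$modules_dir" | sort | tail -n 1)
# Version sort compares the numeric components as numbers, so 11 beats 9.
by_version=$(ls "$modules_dir" | sort -V | tail -n 1)

echo "plain sort picks:   $lexical"
echo "version sort picks: $by_version"

rm -r "$modules_dir"
```

In a freshly extracted bootstrap system there is usually only one directory, so the difference only matters once several kernel versions are present.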

And finally we can set up Syslinux. As I want to use the Qemu VM with -nographic, I will set up Syslinux with support for the serial console. You can find more information about -nographic and the serial console in the second part of this article, because it is a requirement for the second installation routine. If you don’t need the -nographic option, you can leave out some of the adjustments. It’s important to make sure that Syslinux and the MBR get installed to the correct device (i.e. the loopback device, which maps to our image file). Thus, we cannot use syslinux-install_update.

# Set up syslinux ($loop and $loopp are the host-side variables defined earlier)
mkdir /boot/syslinux
cp /usr/lib/syslinux/bios/*.c32 /boot/syslinux/
extlinux --install --device $loopp /boot/syslinux
dd bs=440 count=1 conv=notrunc if=/usr/lib/syslinux/bios/mbr.bin of=$loop
# Customize syslinux config
uuid=$(blkid -o value -s UUID $loopp)
sed -i '1i SERIAL 0 115200' /boot/syslinux/syslinux.cfg
sed -i "s/APPEND root=\/dev\/sda3/APPEND console=tty0 console=ttyS0,115200 root=UUID=$uuid/g" \
    /boot/syslinux/syslinux.cfg
sed -i "s/INITRD ..\/initramfs-linux.img/INITRD ..\/initramfs-linux-custom.img/" \
    /boot/syslinux/syslinux.cfg
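
The sed substitutions can be tried on a copy before they touch the image. The sketch below applies the same three edits to a shortened, hypothetical excerpt of syslinux.cfg with a made-up UUID; I assume here that the default config contains the APPEND and INITRD lines targeted above. It uses | as the sed delimiter, which is only a readability choice, and relies on GNU sed for -i and the one-line form of the i command.

```shell
# Hypothetical excerpt of a default syslinux.cfg; the real file is longer.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
DEFAULT arch
LABEL arch
    LINUX ../vmlinuz-linux
    APPEND root=/dev/sda3 rw
    INITRD ../initramfs-linux.img
EOF

# Made-up UUID; the real script reads it with blkid.
uuid=0a1b2c3d-1111-2222-3333-444455556666

# Enable the serial console in the boot menu itself.
sed -i '1i SERIAL 0 115200' "$cfg"
# Point the kernel at the serial console and at the root filesystem by UUID.
sed -i "s|APPEND root=/dev/sda3|APPEND console=tty0 console=ttyS0,115200 root=UUID=$uuid|g" "$cfg"
# Boot with the custom initramfs built without autodetect.
sed -i "s|INITRD ../initramfs-linux.img|INITRD ../initramfs-linux-custom.img|" "$cfg"

result=$(cat "$cfg")
echo "$result"
rm "$cfg"
```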

Finally, we can unmount the device, free the loopback device again and convert the image to qcow2:

sudo umount $mountpoint
sudo losetup -d $loop
qemu-img convert -f raw -O qcow2 $image /tmp/image.qcow2
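
In an automated run, a failing step in the middle would leave the image mounted and the loopback device attached. One possible way to handle that is a cleanup function registered with trap. This is only a sketch, and it assumes that the script records its progress in two helper variables (mounted and loop as a progress marker are conventions chosen for this example):

```shell
mountpoint=/tmp/arch
loop=""        # set from "losetup --show" once the device is attached
mounted=""     # set to 1 right after "mount" succeeds

# Undo the setup in reverse order; each step is skipped if it never happened.
# The variables are cleared afterwards so a second call does nothing.
cleanup() {
    if [ -n "$mounted" ]; then sudo umount "$mountpoint"; mounted=""; fi
    if [ -n "$loop" ]; then sudo losetup -d "$loop"; loop=""; fi
}

# Run cleanup whenever the script exits, whether it succeeded or not.
trap cleanup EXIT
```

With this in place, the explicit umount and losetup -d could be replaced by a call to cleanup before converting the image.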

This approach results in a ready-to-use Archlinux image that can be started with Qemu. Further adjustments for a specific project are just one step away now.

qemu-system-x86_64 -cpu host -enable-kvm -m 2048 -smp 2 -drive file=/tmp/image.qcow2 -nographic

The boot menu looks a bit broken, but afterwards the display is fine.

Creating Qemu image inside a Qemu VM with a Live-CD

Another approach is to use an Archlinux Live CD and boot the live CD inside a Qemu VM with an empty image attached as block device. This, however, will require automation of an interactive application, because e.g. the boot menu requires user interaction.

For automation of interactive applications we can use Expect. Expect can control a program even if this program asks the user several questions and would usually require human interaction. In order to make Expect work with Qemu we first have to tell Qemu not to start the guest in a new window, but to behave like a standard application with stdin and stdout. This can be done with the -nographic flag of Qemu, but then Qemu will expect the output to be written to the serial port.

This means we will have to boot Archlinux with special settings for the serial console. The Archlinux wiki explains that this can be done by hitting <TAB> on the desired boot menu entry and then appending console=ttyS0,38400 to the boot options. Archlinux will then write its output to the serial port.

Now that we know how to make Qemu behave like a standard application without graphical output and how to connect Archlinux output to Qemu, we can work on automation with Expect. The basic workflow of Expect scripts is to spawn an application, expect some output on its stdout and respond with input (send). There might be more advanced workflows, but for our use case this is enough. Let’s start with booting into Archlinux with the serial console settings enabled.

First we have to run Qemu. The bootloader will then present us with a list of boot options, of which the right one is already selected. As stated by the Archlinux wiki, we then have to press <TAB>. We will be shown the kernel flags that are already set and have to append our own. This results in the following actions:

  • Start the Qemu VM
  • Wait for the Arch Linux boot menu to appear, then press <TAB>
  • Wait for the kernel flags to appear, then enter the additional flags and press Enter (Carriage Return)

#!/usr/bin/expect -f

spawn qemu-system-x86_64 -cdrom archlinux-2020.05.01-x86_64.iso -cpu host -enable-kvm -m 2048 -smp 2 -drive file=arch-blog.qcow2,format=qcow2 -nographic

expect "*Boot Arch Linux (x86_64)*"
send -- "\t"

expect "*archiso.img*"
send -- " console=ttyS0,38400\r"

The boot menu contains quite a lot of output, so I only filter for partial matches using wildcard searches. I am not sure whether it’s safe to have a wildcard at the end of an expect expression, since this means there is more output to come, but so far it works well. The documentation discourages this and recommends always including the last character in the expect statement (e.g. for prompts usually a space character).

After the boot is finished, it’s basically a standard back and forth between expecting an output (mostly the shell prompt) and sending a command. Since I rely on fdisk during my Arch installation, there is also a bit of real usage of expect where we wait for some questions from fdisk and answer them. So, let’s log in to Archlinux and set up our partition. I created a variable that holds the search string for the prompt so that I don’t have to repeat it all the time:

set prompt "*@archiso*~*#* "
expect "archiso login: "
send -- "root\r"
expect $prompt
send -- "fdisk /dev/sda\r"
expect "Command (m for help): "
send -- "n\r"
expect "Select (default p): "
send -- "p\r"
expect "Partition number (1-4, default 1): "
send -- "1\r"
expect "First sector*: "
send -- "\r"
expect "Last sector*: "
send -- "\r"
expect "Command (m for help): "
send -- "a\r"
expect "Command (m for help): "
send -- "w\r"
expect $prompt
send -- "mkfs.ext4 /dev/sda1\r"
expect $prompt
send -- "mount /dev/sda1 /mnt\r"

The Archlinux output contains a lot of control sequences for colours in terminals, thus my prompt contains a lot of wildcards. I’m thinking about using a wrapper script that removes all control sequences from the output before expect retrieves the output, but for now it works like this.
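
As a rough idea of what the filtering part of such a wrapper could look like, here is a sketch of a shell function that strips colour codes with sed. It only handles CSI sequences (ESC [ ... letter), the sample prompt is made up, and I have not wired it into the expect script: sed works line by line, so a prompt without a trailing newline would be held back until more output arrives, and that would need to be solved first.

```shell
# Remove ANSI CSI sequences (ESC [ parameters final-letter), which covers
# colour codes. Other escape types (e.g. OSC title sequences) are not handled.
strip_ansi() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;?]*[A-Za-z]//g"
}

# Made-up coloured prompt, similar in spirit to what the live system prints.
cleaned=$(printf '\033[1;31mroot\033[0m@archiso ~ # ' | strip_ansi)
printf '%s\n' "$cleaned"
```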

Now that we have the hard disk set up, we can install the base Arch system. During installation we will switch into a chroot environment, which means that the prompt will change slightly, so we use another prompt variable.

set chroot_prompt "*root@archiso* "
expect $prompt
send -- "pacstrap /mnt base linux linux-firmware\r"
expect $prompt
send -- "genfstab -U /mnt >> /mnt/etc/fstab\r"
expect $prompt
send -- "arch-chroot /mnt\r"
expect $chroot_prompt
send -- "ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime\r"
expect $chroot_prompt
send -- "hwclock --systohc\r"
expect $chroot_prompt
send -- "echo en_US.UTF-8 UTF-8 >> /etc/locale.gen\r"
expect $chroot_prompt
send -- "locale-gen\r"
expect $chroot_prompt
send -- "echo LANG=en_US.UTF-8 > /etc/locale.conf\r"
expect $chroot_prompt
send -- "echo arch-qemu > /etc/hostname\r"
expect $chroot_prompt
send -- "echo -e '127.0.0.1  localhost\\n::1  localhost' >> /etc/hosts\r"
expect $chroot_prompt
send -- "mkinitcpio -P\r"
expect $chroot_prompt
send -- "passwd\r"
expect "New password: "
send -- "root\r"
expect "Retype new password: "
send -- "root\r"
expect $chroot_prompt

Finally, we have to choose a bootloader and adjust it so that serial port output stays enabled when we reboot into the installed system. The expect eof at the end of the full script seemed to be quite important, at least during my tests: without it the MBR was not persisted correctly into the image file (maybe because expect did not wait for the shutdown to complete and thus the machine was terminated ungracefully).

send -- "pacman -S --noconfirm syslinux\r"
expect $chroot_prompt
send -- "syslinux-install_update -i -a -m\r"
expect $chroot_prompt
send -- "cat /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt
send -- "sed -i '1i SERIAL 0 115200' /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt
send -- "sed -i 's/APPEND root=\\/dev\\/sda3/APPEND console=tty0 console=ttyS0,115200 root=\\/dev\\/sda1/g' /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt
send -- "cat /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt

Full Scripts

Chroot Approach

src=http://mirror.netcologne.de/archlinux/iso/2020.05.01/archlinux-bootstrap-2020.05.01-x86_64.tar.gz
archive=/tmp/archlinux-bootstrap-2020.05.01-x86_64.tar.gz
image=/tmp/image.raw
mountpoint=/tmp/arch

if [[ ! -f $archive ]]; then
    wget $src -O $archive
fi

mkdir -p $mountpoint

qemu-img create -f raw $image 20G
loop=$(sudo losetup --show -f -P $image)
sudo parted $loop mklabel msdos
sudo parted -a optimal $loop mkpart primary 0% 100%
sudo parted $loop set 1 boot on
loopp=${loop}p1
sudo mkfs.ext4 $loopp
sudo mount $loopp $mountpoint
sudo tar xf $archive -C $mountpoint --strip-components 1

sudo $mountpoint/bin/arch-chroot $mountpoint /bin/bash <<EOL
set -v

echo 'Server = http://ftp.uni-bayreuth.de/linux/archlinux/\$repo/os/\$arch' >> /etc/pacman.d/mirrorlist

pacman-key --init
pacman-key --populate archlinux

pacman -Syu --noconfirm
pacman -S --noconfirm base linux linux-firmware mkinitcpio dhcpcd syslinux
systemctl enable dhcpcd

# Standard Archlinux Setup
ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime
hwclock --systohc
echo en_US.UTF-8 UTF-8 >> /etc/locale.gen
locale-gen
echo LANG=en_US.UTF-8 > /etc/locale.conf
echo arch-qemu > /etc/hostname
echo -e '127.0.0.1  localhost\n::1  localhost' >> /etc/hosts

# Create an initramfs without autodetect, because this breaks with the
# combination host/chroot/qemu
linux_version=\$(ls /lib/modules/ | sort -V | tail -n 1)
mkinitcpio -c /etc/mkinitcpio.conf -S autodetect --kernel \$linux_version -g /boot/initramfs-linux-custom.img

# Setup syslinux
mkdir /boot/syslinux
cp /usr/lib/syslinux/bios/*.c32 /boot/syslinux/
extlinux --install --device $loopp /boot/syslinux
dd bs=440 count=1 conv=notrunc if=/usr/lib/syslinux/bios/mbr.bin of=$loop
# Customize syslinux config
uuid=\$(blkid -o value -s UUID $loopp)
sed -i '1i SERIAL 0 115200' /boot/syslinux/syslinux.cfg
sed -i "s/APPEND root=\/dev\/sda3/APPEND console=tty0 console=ttyS0,115200 root=UUID=\$uuid/g" /boot/syslinux/syslinux.cfg
sed -i "s/INITRD ..\/initramfs-linux.img/INITRD ..\/initramfs-linux-custom.img/" /boot/syslinux/syslinux.cfg
EOL

sudo umount $mountpoint
sudo losetup -d $loop
qemu-img convert -f raw -O qcow2 $image /tmp/image.qcow2

Live CD inside Qemu Approach

#!/usr/bin/expect -f

set prompt "*@archiso*~*#* "
set chroot_prompt "*root@archiso* "
set timeout -1
spawn qemu-system-x86_64 -cdrom /data/media/distros/linux/archlinux-2020.05.01-x86_64.iso -cpu host -enable-kvm -m 2048 -smp 2 -drive file=test-img.raw,format=raw -nographic
match_max 100000
expect "*Boot Arch Linux (x86_64)*"
send -- "\t"
expect "*archiso.img*"
send -- " console=ttyS0,38400\r"

expect "archiso login: "
send -- "root\r"
expect $prompt
send -- "fdisk /dev/sda\r"
expect "Command (m for help): "
send -- "n\r"
expect "Select (default p): "
send -- "p\r"
expect "Partition number (1-4, default 1): "
send -- "\r"
expect "First sector*: "
send -- "\r"
expect "Last sector*: "
send -- "\r"
expect "Command (m for help): "
send -- "a\r"
expect "Command (m for help): "
send -- "w\r"
expect $prompt
send -- "mkfs.ext4 /dev/sda1\r"
expect $prompt
send -- "mount /dev/sda1 /mnt\r"
expect $prompt
send -- "pacstrap /mnt base linux linux-firmware\r"
expect $prompt
send -- "genfstab -U /mnt >> /mnt/etc/fstab\r"
expect $prompt
send -- "arch-chroot /mnt\r"
expect $chroot_prompt
send -- "ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime\r"
expect $chroot_prompt
send -- "hwclock --systohc\r"
expect $chroot_prompt
send -- "echo en_US.UTF-8 UTF-8 >> /etc/locale.gen\r"
expect $chroot_prompt
send -- "locale-gen\r"
expect $chroot_prompt
send -- "echo LANG=en_US.UTF-8 > /etc/locale.conf\r"
expect $chroot_prompt
send -- "echo arch-qemu > /etc/hostname\r"
expect $chroot_prompt
send -- "echo -e '127.0.0.1  localhost\\n::1  localhost' >> /etc/hosts\r"
expect $chroot_prompt
send -- "mkinitcpio -P\r"
expect $chroot_prompt
send -- "pacman -S --noconfirm syslinux\r"
expect $chroot_prompt
send -- "syslinux-install_update -i -a -m\r"
#expect $chroot_prompt
#send -- "dd bs=440 count=1 conv=notrunc if=/usr/lib/syslinux/bios/mbr.bin of=/dev/sda\r"
expect $chroot_prompt
send -- "cat /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt
send -- "sed -i '1i SERIAL 0 115200' /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt
send -- "sed -i 's/APPEND root=\\/dev\\/sda3/APPEND console=tty0 console=ttyS0,115200 root=\\/dev\\/sda1/g' /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt
send -- "cat /boot/syslinux/syslinux.cfg\r"
expect $chroot_prompt
send -- "passwd\r"
expect "New password: "
send -- "root\r"
expect "Retype new password: "
send -- "root\r"
expect $chroot_prompt
send -- "exit\r"
expect $prompt
send -- "shutdown -h now\r"
expect eof

I do not maintain a comments section. If you have any questions or comments regarding my posts, please do not hesitate to send me an e-mail to blog@stefan-koch.name.