virsh list hangs because of qemu-system-i386 defunct so libvirtd has to be restarted

Bug #1887592 reported by Oscar Alias
This bug affects 1 person
Affects: libvirt (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

1) The release of Ubuntu you are using, via 'lsb_release -rd' or System -> About Ubuntu

choix:~$ lsb_release -rd
Description: Ubuntu 20.04 LTS
Release: 20.04

2) The version of the package you are using, via 'apt-cache policy pkgname'

choix:~$ dpkg-query -S /lib/systemd/system/libvirtd.service
libvirt-daemon-system: /lib/systemd/system/libvirtd.service

choix:~$ dpkg -l libvirt-daemon-system
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=====================-================-============-==================================
ii libvirt-daemon-system 6.0.0-0ubuntu8.1 amd64 Libvirt daemon configuration files

3) What you expected to happen

Have a manageable and usable libvirtd service after every boot of Ubuntu.

4) What happened instead

After installing libvirt and rebooting Ubuntu, every virsh, virt-install, or virt-manager command hung.

5) Diagnostic

libvirtd has to be restarted before programs like virsh can interact with the service again.

Checking the process list, I found that qemu-system is defunct 'qemu-system-i386 defunct'.
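For anyone who wants to repeat this check, here is a minimal sketch of how the defunct process can be spotted. The helper name `find_defunct_qemu` is made up for illustration; it only filters `ps` output in the `pid stat comm` layout for zombie entries (state `Z`) whose command name starts with `qemu-system`.

```shell
# Hypothetical helper: read "pid stat comm" lines on stdin and print the
# PID and command name of every zombie (state Z) qemu-system process.
find_defunct_qemu() {
    awk '$2 ~ /^Z/ && $3 ~ /^qemu-system/ { print $1, $3 }'
}

# Usage (ps truncates comm to 15 characters, so only the prefix is matched):
#   ps -eo pid=,stat=,comm= | find_defunct_qemu
```

Any line it prints is a qemu process that has exited but has not yet been reaped by its parent.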

If I restart libvirtd, it becomes responsive, but after I reboot the machine libvirtd hangs again.
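A hung virsh call blocks the terminal, so it can help to run it under a time limit. The sketch below assumes the `timeout` utility from GNU coreutils, which exits with status 124 when it has to kill the command. The helper name `check_hang` and the 10-second limit in the usage line are arbitrary choices for illustration.

```shell
# Hypothetical helper: run a command with a time limit (first argument,
# in seconds) and report whether it had to be killed.
check_hang() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 124 ]; then
        echo "hung"
    else
        echo "returned ($status)"
    fi
}

# Usage:
#   check_hang 10 virsh list --all
```

If this prints `hung` after a fresh boot and `returned (0)` after `sudo systemctl restart libvirtd`, the behaviour matches the one described in this report.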

I enabled debug logging for libvirtd in /etc/libvirt/libvirtd.conf:

log_level = 1
log_filters="1:qemu"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

The log showed that qemu could not find the kvm device when libvirtd invoked it during service startup:

qemu-system-x86_64: failed to initialize KVM: No such file or directory.
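The error suggests the device node was not there yet when libvirtd probed qemu. A minimal sketch for checking the node by hand follows. The helper name `kvm_device_ready` is hypothetical, and `/dev/kvm` is assumed to be the usual location of the device.

```shell
# Hypothetical helper: report whether a KVM device node exists.
# The path argument is optional and defaults to /dev/kvm (assumed location).
kvm_device_ready() {
    dev="${1:-/dev/kvm}"
    if [ -c "$dev" ]; then
        echo "present: $dev"
    else
        echo "missing: $dev"
        return 1
    fi
}

# Usage:
#   kvm_device_ready
```

Checking this once the system is fully booted only shows the final state. The failure in this report happened during startup, before the node existed.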

I could not reproduce this on a laptop, but it happens every single time on my PC with a Ryzen 3700X, an X570 chipset, an NVMe drive, and 32 GB of RAM.

6) Resolution

There is a race condition between qemu-kvm.service and libvirtd.service: the libvirtd debug log showed that qemu could not find the kvm device, which is created by qemu-kvm.service.

Therefore the solution was to create a drop-in for the libvirtd.service to add an After key as follows:

```ini
# Drop-in created with: sudo systemctl edit libvirtd
# (systemd saves it as /etc/systemd/system/libvirtd.service.d/override.conf)
[Unit]
After=qemu-kvm.service
```

This makes `libvirtd` wait until `qemu-kvm` has finished starting before it starts itself.
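One way to confirm that the ordering is in effect is to look at the `After=` property that systemd reports for the unit. This is a sketch only, and the helper name `lists_after` is made up. It reads the output of `systemctl show -p After` on stdin and succeeds if the given unit is listed.

```shell
# Hypothetical helper: succeed if unit name $1 appears in an "After=..."
# line read from stdin (the format printed by `systemctl show -p After`).
lists_after() {
    grep '^After=' | tr '= ' '\n\n' | grep -qxF "$1"
}

# Usage:
#   systemctl show -p After libvirtd.service | lists_after qemu-kvm.service \
#       && echo "libvirtd is ordered after qemu-kvm"
```

`systemctl cat libvirtd` should also show the drop-in appended below the packaged unit file.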

Christian Ehrhardt  (paelzer) wrote :

Sounds reasonable, thanks for the detailed pre-work.

I'll take a look at putting this into Ubuntu once I'm back from PTO.

I was recently helping someone on askubuntu who faced the same, or at least a similar, issue. Are you by any chance that someone?

Changed in libvirt (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
tags: added: libvirt-20.10
Oscar Alias (oaliasb) wrote :

Thanks for attending to this.

I am not the guy from askubuntu, but I hope this helps solve that situation as well.

Paride Legovini (paride)
Changed in libvirt (Ubuntu):
status: Triaged → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 6.6.0-1ubuntu2

---------------
libvirt (6.6.0-1ubuntu2) groovy; urgency=medium

  * d/p/u/lp-1892826-Revert-m4-virt-xdr-rewrite-XDR-check.patch: avoid clashes
    between libtirpc and glibc that break libvirt-lxc (LP: #1892826)
  * d/p/ubuntu-aa/lp-1892736-apparmor-allow-libvirtd-to-call-virtiofsd.patch:
    allow libvirt to control virtiofsd (LP: #1892736)

libvirt (6.6.0-1ubuntu1) groovy; urgency=medium

  * Merge with Debian 6.6.0-1 from experimental
    Among many other new features and fixes this includes fixes for:
    (LP: #1874647) - Stale libvirt cache leads to VM startup failures
    (LP: #1869796) - bad ordering and dependent restarts of services/sockets
    Remaining changes:
    - d/p/ubuntu-aa/lp-1847361-load-versioned-module.patch: allow loading
      versioned modules after qemu package upgrades (LP 1847361)
    - libvirt-uri.sh: Automatically switch default libvirt URI for users
      via user profile (xen URI on dom0, qemu:///system otherwise)
    - Disable libssh2 support (universe dependency)
    - Disable firewalld support (universe dependency)
    - Set qemu-group to kvm (for compat with older ubuntu)
    - Additional apport package-hook
    - Autostart default bridged network (As upstream does, but not Debian).
      In addition to just enabling it our solution provides:
      + do not autostart if subnet is already taken (e.g. in guests).
      + iterate some alternative subnets before giving up
    - d/p/ubuntu/Allow-libvirt-group-to-access-the-socket.patch: This is
      the group based access to libvirt functions as it was used in Ubuntu
      for quite long.
      + d/p/ubuntu/daemon-augeas-fix-expected.patch fix some related tests
        due to the group access change.
      + d/libvirt-daemon-system.postinst: add users in sudo to the libvirt
        group.
    - ubuntu/parallel-shutdown.patch: set parallel shutdown by default.
    - Update README.Debian with Ubuntu changes
    - d/p/ubuntu/ubuntu_machine_type.patch: accept ubuntu types as pci440fx
    - fix autopkgtests
      + d/t/control, d/t/smoke-qemu-session: fixup smoke-qemu-session by making
        vmlinuz available and accessible (Debian bug 848314)
      + d/t/control: fix smoke-qemu-session by ensuring the service will run
        installing libvirt-daemon-system
      + d/t/smoke-lxc: fix smoke-lxc by ignoring potential issues on destroy as
        long as the following undefine succeeds
      + d/t/smoke-lxc: use systemd instead of sysV to restart the service
    - dnsmasq related enhancements
      + run dnsmasq as libvirt-dnsmasq (LP: 1743718)
      + d/libvirt-daemon-system.postinst: add libvirt-dnsmasq user and group
      + d/libvirt-daemon-system.postrm: remove libvirt-dnsmasq user and group
        on purge
      + d/p/ubuntu/dnsmasq-as-priv-user: write dnsmasq config with user
        libvirt-dnsmasq and adapt the self tests to expect that config
      + d/libvirt-daemon-system.postinst: fix old libvirt-dnsmasq users group
      + Add dnsmasq configuration to work with system wide dnsmasq-base
    - debian/rules: disable the netcf backend. (LP: 1764314)
    - debian/patches/ubuntu/ovmf_paths.patch...

Changed in libvirt (Ubuntu):
status: In Progress → Fix Released