Upgrade of qemu binaries causes running instances not able to dynamically load modules

Bug #1847361 reported by Billy Olsen
40
This bug affects 4 people
Affects Status Importance Assigned to Milestone
Ubuntu Cloud Archive
Fix Released
Undecided
Unassigned
Stein
Fix Released
Medium
Unassigned
libvirt (Ubuntu)
Fix Released
Low
Unassigned
Bionic
Fix Released
Undecided
Christian Ehrhardt 
Eoan
Fix Released
Undecided
Christian Ehrhardt 
qemu (Ubuntu)
Fix Released
Wishlist
Unassigned
Bionic
Fix Released
Undecided
Christian Ehrhardt 
Eoan
Fix Released
Undecided
Christian Ehrhardt 

Bug Description

[Impact]

 * An infrequent but annoying issue is QEMUs problem to not be able to
   hot-add capabilities IF since starting the instance qemu has been
   upgraded. This is due to qemu modules only working with exactly the
   same build.

 * We brought changes upstream that allow the packaging to keep the old
   files around and make qemu look after them as a fallback.

[Test Case]

 I:
 * $ apt install uvtool-libvirt
   $ uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=amd64 label=daily release=bionic
   $ uvt-kvm create --password ubuntu lateload arch=amd64 release=bionic label=daily

cat > curldisk.xml << EOF
  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol="http" name="ubuntu/dists/bionic-updates/main/installer-amd64/current/images/netboot/mini.iso">
            <host name="archive.ubuntu.com" port="80"/>
    </source>
    <target dev='vdc' bus='virtio'/>
    <readonly/>
  </disk>
EOF

# Here up or downgrade the installed packages, even a minor
# version or a rebuild of the same version
# Instead if you prefer (easier) you can run
  $ apt install --reinstall qemu-*

Next check if they appeared (action of the maintainer scripts)
in the /var/run/qemu/<version> directory

# And then rm/mv the original .so files of qemu-block-extra
# Trying to load a .so now would after an upgrade fail as the old qemu can't load the build id

$ virsh attach-device lateload curldisk.xml
Reported issue happens on attach:
root@b:~# virsh attach-device lateload cdrom-curl.xml
error: Failed to attach device from cdrom-curl.xml
error: internal error: unable to execute QEMU command 'device_add': Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk2'

In the log we can see:
Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so

One can also check files mapped into a process and we should see the /var/run/.. path being used now.

 II:
 * As it had issues in the first iteration of the fix worth a
   try is also the use of an environment var for an extra path:
   $ QEMU_MODULE_DIR="/tmp/" qemu-system-x86_64 -cdrom localhost::/foo

[Regression Potential]

 I:
 * libvirt just allows a few more paths to be read from in the apparmor
   isolation that is usually safe unless these paths are considered
   sensitive. But /var/run/qemu is new, /var/run in general not meant
   for permanent or secure data and as always if people want to ramp up
   isolation they can always add deny rules to the local overrides.

 II:
 * the qemu change has two components.
   In qemu code it looks for another path if the former ones failed.
   I see no issues there yet, but can imagine that odd versions might
   make it access odd paths which would then be denied by apparmor or
   just don't exist. But that is no different than the former built-in
   paths it tries, so nothing bad should happen.
   The code change to the maintainer scripts has to backup the files.
   If that goes wrong upgrades could be broken, but so far no tests have
   shown issues.

[Other Info]

 * To really use the functionality users will need the new qemu AND the
   new libvirt that are uploaded for this bug.
   But it felt wrong to add versioned dependencies from qemu->libvirt
   (that is the semantically correct direction) also conflicts/breaks
   might cause issues in many places that want to control these. OTOH
   while the fix is great for some installations the majority of users
   won't care and therefore be happy if extra dependencies are not
   causing any oddity on apt upgrade. Therefore no versioned
   dependencies were added intentionally.

---

[Feature Freeze Exception]

Hi,
this is IMHO a just a bugfix. But since it involves some bigger changes I wanted to be on the safe side and get an ack by the release Team.

Problem:
- on upgrade qemu processes are left running as they
  represent a guest VM
- later trying to add features e.g. ceph disk hot-add will
  need to load .so files e.g. from qemu-block-extra package
- those modules can on ly be loaded from the same build, but those are
  gone after upgrade

Solution:
- If qemu fails to load from its usual paths it will
  now also look in /var/run/<version/
- package upgrade code will place the .so's there
- things will be cleaned on reboot which is much simpler
  and error-proof than trying to detect which versions
  binaries are running
- libvirt has a change to allow just reading and
  mapping from that path (apparmor)

@Release team it would be great if you would agree to this being safe for an FFe.

--- initial report ---

Upgrading qemu binaries causes the on-disk versions to change, but the in-memory running instances still attempt to dynamically load a library which matches its same version. This can cause running instances to fail actions like hotplugging devices. This can be alleviated by migrating the instance to a new host or restarting the instance, however in cloud type environments there may be instances that cannot be migrated (sriov, etc) or the cloud operator does not have permission to reboot.

This may be resolvable for many situations by changing the packaging to keep older versions of qemu libraries around on disk (similar to how the kernel package keeps older kernel versions around).

--- solution options (WIP) ---

For a packaging solution we would need:
- qemu-block-extra / qemu-system-gui binary packages would need
  sort of a -$buildid in the name. That could be the version
  string (sanitized for package name)
- /usr/lib/x86_64-linux-gnu/qemu/*.so would need a -$buildid
- loading of modules from qemu would need to consider $buildid
  when creating module names.
  util/module.c in module_load_one / module_load_file
  It already searches in multiple dirs, maybe it could insert
  the $buildid there
- We'd need a way of detecting running versions of qemu binaries
  and only make them uninstallable once the binaries are all
  gone. I have not seen something like that in apt yet (kernel
  is easy in comparison as only one can be loaded at a time).

ALTERNATIVES:
- disable loadable module support
- add an option to load all modules in advance (unlikely to be
  liked upstream) and not desirable for many setups using qemu
  (especially not as default)
- add an option to load a module (e.g via QMP/HMP) which would
  allow an admin
  to decide doing so for the few setups that benefit.
  - that could down the road then even get a libvirt interface
    for easier consumption

Heads up - None of the above would be SRUable

--- mitigation options ---

- live migrate for upgrades
  - prohibited by SR-IOV usage
  - Tech to get SR-IOV migratable is coming (e.g. via net_failover,
    bonding in DPDK, ...)
- load the modules you need in advance
  - Note: lacking an explicit "load module" command makes this
    slightly odd for now
  - but using iscsi or ceph is never spontaneous, a deployment
    has or hasn't the setup to use those
  - Create a single small read-only node and attach this to each guest,
    that will load the driver and render you immune to the issue. While
    more clunky, this isn't so much different than how it would be
    with an explicit "load module" command.
    Actually the target doesn't have to exist it can fail to attach
    and still achieves what is needed comment #17 has an example.

Related branches

tags: added: server-triage-discuss
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Interesting bug, thanks Billy for the report.

Packaging of qemu (and actually many other applications as well) splits its capabilities into a main binaries and .so files (or similar tech, even am extra python module if you want). That is for flexibility of attack surface (less active code), size management (install footprint), dependency management (some might be optional or less supported), ...
Some of these technologies can load these modules "late" and this is what happens here.

Qemu will not load all its .so files that are installed immediately, but when needed.
That is what happens when

I personally don't think packaging it with versioned binary packages like a lib or the kernel is the right way (as more packages are design-wise affected by the same and wouldn't we want to switch all of them then), but I don't have a great alternative yet either - this clearly needs some discussions about the way to go with it.

We might need to analyze which of these can load late and maybe provide an option to load them early or automatically before upgrades? OTOH that seems a very big and undetected change in behavior if e.g. reducing attack surface was your intention.

As I said, very interesting bug ...

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The current list of extra .so files that we have are:

## Package: .so files ##
qemu-system-gui: audio-pa.so, ui-gtk.so
qemu-block-extra: block-curl.so, block-iscsi.so, block-rbd.so

I don't think that there is an action that would load audio or ui late.
So they should be safe against the reporte issue.

On the other hand block-* has the way of hotplugging the first disk of some kind that will load the .so file then. So I'd assume this is how you have hit this issue?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

@Billy - a few questions to complete the picture here:
1. Did this only fail on a major e.g. qemu 2.5 -> 3.1 upgrade or does it in your experience insist on the exact build ID so e.g. 2.5-1ubuntu1 -> 2.5-2ubuntu1 ? Depending on that answer the versioned package name becomes harder/easier - if you don't know please tell me then I will have to do some tests to confirm.
2. Most decisions to load-or-not are made on the initial instantiation, but as you say hotplug of e.g. new storage types might be an issue. What was your actual example to hit this - what did you hotplug (XML of guest and hotplug device if you have it still)

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Download full text (5.6 KiB)

Using proposed and UCA I did some experiments in Bionic:

qemu-block-extra:
     1:3.1+dfsg-2ubuntu3.2~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/stein/main amd64
     1:2.11+dfsg-1ubuntu7.19 500
        500 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     1:2.11+dfsg-1ubuntu7.18 500
        500 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages

I should be able to switch between those.

Also there is a trick to do all this without a complex ceph/issci setup.

cat > curldisk.xml << EOF
  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol="http" name="ubuntu/dists/bionic-updates/main/installer-amd64/current/images/netboot/mini.iso">
            <host name="archive.ubuntu.com" port="80"/>
    </source>
    <target dev='vdc' bus='virtio'/>
    <readonly/>
  </disk>
EOF

$ virsh attach-device lateload curldisk.xml

Among many (many!) dependent libs from the system this also will late-load the qemu curl .so file.

root@b:~# diff libsloaded.pre libsloaded.post
4a5,29
> qemu-syst 18362 libvirt-qemu mem REG 0,64 26936 2603 /lib/x86_64-linux-gnu/libnss_dns-2.27.so
> qemu-syst 18362 libvirt-qemu mem REG 0,64 39208 2529 /lib/x86_64-linux-gnu/libcrypt-2.27.so
> qemu-syst 18362 libvirt-qemu mem REG 0,64 1082648 10066 /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6
> qemu-syst 18362 libvirt-qemu mem REG 0,64 300888 3424 /usr/lib/x86_64-linux-gnu/libhx509.so.5.0.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 60400 3418 /usr/lib/x86_64-linux-gnu/libheimbase.so.1.0.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 165880 10090 /usr/lib/x86_64-linux-gnu/libwind.so.0.0.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 88680 10058 /usr/lib/x86_64-linux-gnu/libroken.so.18.1.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 217560 3416 /usr/lib/x86_64-linux-gnu/libhcrypto.so.4.1.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 661696 9860 /usr/lib/x86_64-linux-gnu/libasn1.so.8.0.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 573464 9987 /usr/lib/x86_64-linux-gnu/libkrb5.so.26.0.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 35360 3420 /usr/lib/x86_64-linux-gnu/libheimntlm.so.0.1.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 14256 2573 /lib/x86_64-linux-gnu/libkeyutils.so.1.5
> qemu-syst 18362 libvirt-qemu mem REG 0,64 265712 3410 /usr/lib/x86_64-linux-gnu/libgssapi.so.3.0.0
> qemu-syst 18362 libvirt-qemu mem REG 0,64 43616 9991 /usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1
> qemu-syst 18362 libvirt-qemu mem REG 0,64 14248 308068 /lib/x86_64-linux-gnu/libcom_err.so.2.1
> qemu-syst 18362 libvirt-qemu mem REG 0,64 199104 9985 /usr/lib/x86_64-linux-gnu/libk5crypto.so.3.1
> qemu-syst...

Read more...

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Up/Downgrade with a command like this:
v=1:2.11+dfsg-1ubuntu7.18; apt install qemu-block-extra=$v qemu-system-common=$v qemu-system-x86=$v qemu-utils=$v

qemu-running: 1:2.11+dfsg-1ubuntu7.19
lib installed: 1:2.11+dfsg-1ubuntu7.18

Reported issue happens on attach:
root@b:~# virsh attach-device lateload cdrom-curl.xml
error: Failed to attach device from cdrom-curl.xml
error: internal error: unable to execute QEMU command 'device_add': Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk2'

In the log we can see:
Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
Note: only modules from the same build can be loaded.

So it is not a full ABI/API like a normal lib with an soname, it really is an internal .so with hard dependency on the real buildid.

In the design [1] it was mentioned that there could be options to load modules later via HMP/QMP.
In there also are some reasoning why to modularize and why to load things late.

[1]: https://wiki.qemu.org/Features/Modules

@Billy - comment #4 and #5 answer the questions I had to you above - I was too thrilled and tested it myself :-)

Changed in qemu (Ubuntu):
status: New → Confirmed
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

While mentioned, QMP loading of modules never came to life it seems [1].

[1]: http://manpages.ubuntu.com/manpages/eoan/man7/qemu-qmp-ref.7.html

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The code around module_load_one is mapped to a few helper names:

 60 #define block_module_load_one(lib) module_load_one("block-", lib)
 61 #define ui_module_load_one(lib) module_load_one("ui-", lib)
 62 #define audio_module_load_one(lib) module_load_one("audio-", lib)

Currently there seems to be no invocation of those other than attaching such a type of device.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Known mitigation options at this point:
- live migrate for upgrades
  - prohibited by SR-IOV usage
  - Tech to get SR-IOV migratable is coming (e.g. via net_failover, bonding in DPDK, ...)
- load the modules you need in advance
  - Note: lacking an explicit "load module" command makes this slightly odd for now
  - but using iscsi or ceph is never spontaneous, a deployment has or hasn't the
    setup to use those
  - Create a single small read-only node and attach this to each guest, that will
    load the driver and render you immune to the issue. While more clunky, this isn't
    so much different than how it would be with an explicit "load module" command.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Just theorizing for now ...

For a packaging solution we would need:
- qemu-block-extra / qemu-system-gui binary packages would need
  sort of a -$buildid in the name. That could be the version
  string (sanitized for package name)
- /usr/lib/x86_64-linux-gnu/qemu/*.so would need a -$buildid
- loading of modules from qemu would need to consider $buildid
  when creating module names.
  util/module.c in module_load_one / module_load_file
  It already searches in multiple dirs, maybe it could insert
  the $buildid there
- We'd need a way of detecting running versions of qemu binaries
  and only make them uninstallable once the binaries are all
  gone. I have not seen something like that in apt yet (kernel
  is easy in comparison as only one can be loaded at a time).

ALTERNATIVES:
- disable loadable module support
- add an option to load all modules in advance (unlikely to be
  liked upstream) and not desirable for many setups using qemu
  (especially not as default)
- add an option to load a module (e.g via QMP/HMP) which would
  allow an admin
  to decide doing so for the few setups that benefit.
  - that could down the road then even get a libvirt interface
    for easier consumption

Heads up - None of the above would be SRUable

Changed in qemu (Ubuntu):
importance: Undecided → Wishlist
status: Confirmed → Triaged
description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I have added testcase, solution options and mitigations to the description, there we can update them as new things come up int the discussion. That helps other people joining this bug to read what things are about in just the description.

Revision history for this message
Dan Streetman (ddstreet) wrote :

re: version matching, when I looked at this last week it seemed that qemu sets up a build stamp, which is then used to create its DSO stamp fun (cause this is so fun ;)

I wonder if the dso abi could be firmed up and made consistent, at least for major versions, so that the loader could be a bit more flexible about loading drivers from other builds.

ddstreet@thorin:~/repos/qemu/qemu$ git grep CONFIG_STAMP
configure: echo "CONFIG_STAMP=_$( (echo $qemu_version; echo $pkgversion; cat $0) | $shacmd - | cut -f1 -d\ )" >> $config_host_mak
include/qemu/module.h:#define DSO_STAMP_FUN glue(qemu_stamp, CONFIG_STAMP)
ddstreet@thorin:~/repos/qemu/qemu$ git grep DSO_STAMP_FUN
include/qemu/module.h:#define DSO_STAMP_FUN glue(qemu_stamp, CONFIG_STAMP)
include/qemu/module.h:#define DSO_STAMP_FUN_STR stringify(DSO_STAMP_FUN)
include/qemu/module.h:void DSO_STAMP_FUN(void);
module-common.c:void DSO_STAMP_FUN(void)
util/module.c: if (!g_module_symbol(g_module, DSO_STAMP_FUN_STR, (gpointer *)&sym)) {

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI discussed on KVM Forum and acknowledged that it is an issue.

Discussion went to the ML then [1].

This essentially becomes an upstream-devel item to create some late-loading command as discussed there. But the impact and severity is rather low as the general thought is more like "yes an issue, but see (e)(f) on [2].

[1]: [1]: https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg00005.html
[2]: https://lists.gnu.org/archive/html/qemu-devel/2019-11/msg00029.html

tags: removed: server-triage-discuss
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

@Dannf - we also discussed flexibility, it really is not meant to be even remotely stable interface. So it should only load the very same buildid and nothing else.

I agree that making it more generous accepting other builds "might" work but not at all reliable.
I think allowing to load (or autoload) the modules before upgrades is the right way.

But as I said a low-prio upstream development item :-/
(As an admin could resolve it as part of the upgrade planning by using it in advance)

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

correction s/dannf/ddstreet/ :-)

Revision history for this message
Louis Bouchard (louis) wrote :

Hello,

Nice to see so many known and friendly faces around this bug. Let me outline a *real life* issue that we're facing following this issue.

We've put block storage in private beta a while ago (https://www.scaleway.com/fr/betas/#block-storage) based on RBD storage.

The same platform experienced another QEMU issue (LPl #1837869) kindly fixed and SRU by @paelzer) that we deployed on the same platform this week. So all the instances currently running that have NOT used RBD based storage during the BETA but will still be running when we go live and want to use RBD storage when it goes GA will experience this bug unless they reboot their instance.

We're looking into a workaround solution to alleviate this situation but I thought I'd share the context to give a some kind of "real life" experience :-)

Kind regards,

Louis (aka caribou) ;-)

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

If "restricting features after upgrade" is not an option until we came up upstream with a preferred approach to fix this the workarounds are:
- migrate away and back for upgrades (general practice, but sometimes impossible e.g. due to passthrough)
- the .so by using the functionality before upgrade (e.g. attach and detach a rbd device).
  I guess there could even be a valid config but invalid target which would achieve the same
  without the guest noticing.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Yeah, the workaround can be a disk that "fails" which helps to be not guest detectable:

Check libs before the workaround:
$ sudo lsof -p 9952 | awk '{print $9}' | grep '\.so' | sort | uniq > pre.libs

Then attach a XML that is valid for libvirt, but will fail to attach a disk (after loading the .so files)
  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol="http" name="does-not-exist">
            <host name="localhost" port="80"/>
    </source>
    <target dev='vdc' bus='virtio'/>
    <readonly/>
  </disk>

You'll get a fail (intentionally) like:
$ virsh attach-device guest1 non-disk.xml
error: Failed to attach device from on-disk.xml
error: internal error: unable to execute QEMU command 'device_add': Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk2'

Check after this:
$ sudo lsof -p 9952 | awk '{print $9}' | grep '\.so' | sort | uniq > post.libs

And we see all of them loaded:
$ diff -Naur pre.libs post.libs | grep '^+'
+++ post.libs 2019-11-18 07:01:55.461495910 +0100
+/lib/x86_64-linux-gnu/libcom_err.so.2.1
+/lib/x86_64-linux-gnu/libcrypt-2.27.so
+/lib/x86_64-linux-gnu/libkeyutils.so.1.5
+/usr/lib/x86_64-linux-gnu/libasn1.so.8.0.0
+/usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
+/usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2
+/usr/lib/x86_64-linux-gnu/libgssapi.so.3.0.0
+/usr/lib/x86_64-linux-gnu/libhcrypto.so.4.1.0
+/usr/lib/x86_64-linux-gnu/libheimbase.so.1.0.0
+/usr/lib/x86_64-linux-gnu/libheimntlm.so.0.1.0
+/usr/lib/x86_64-linux-gnu/libhx509.so.5.0.0
+/usr/lib/x86_64-linux-gnu/libk5crypto.so.3.1
+/usr/lib/x86_64-linux-gnu/libkrb5.so.26.0.0
+/usr/lib/x86_64-linux-gnu/libkrb5.so.3.3
+/usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1
+/usr/lib/x86_64-linux-gnu/liblber-2.4.so.2.10.8
+/usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2.10.8
+/usr/lib/x86_64-linux-gnu/libnghttp2.so.14.15.2
+/usr/lib/x86_64-linux-gnu/libpsl.so.5.2.0
+/usr/lib/x86_64-linux-gnu/libroken.so.18.1.0
+/usr/lib/x86_64-linux-gnu/librtmp.so.1
+/usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6
+/usr/lib/x86_64-linux-gnu/libwind.so.0.0.0
+/usr/lib/x86_64-linux-gnu/qemu/block-curl.so

description: updated
Revision history for this message
Dan Streetman (ddstreet) wrote :

Hello @louis (or @caribou) good to see you again! :)

I had a quick thought on this; maybe we could add a search path to our qemu, to look not only in the normal /usr/lib/x86_64-linux-gnu/qemu for its modules, but also in /run/qemu/$QEMU_VERSION_STAMP/modules/ (or something similar under /run). Then, qemu startup could simply copy its modules into this dir when any guest starts. If qemu is upgraded, it could fallback to this dir for its modules if the current on-disk modules are the wrong version. The modules in /run would disappear across reboots, so they wouldn't be a permanent leak of disk space.

We could even add a post-uninstall script to the package to check if there are any running qemu instances with version matching the specific package being uninstalled, and remove the /run/qemu/$VERSION dir/modules if nothing is running that would need them.

That approach might even be applicable upstream, though without tie-in to the specific packaging (i.e. rpm, deb, etc...) the /run modules would remain until a reboot.

Revision history for this message
Dan Streetman (ddstreet) wrote :

I should have read the ML thread first, looks like you mention this approach already there, although I think using /run gives an advantage of automatically clearing old modules out over reboot. In any case, might be worth carrying a patch for our own qemu, even if it doesn't make sense to handle it upstream.

tags: added: server-next
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Download full text (3.9 KiB)

In the discussion so far the opinions mostley leaned to "restrict it per policy" of some sort.
But I was continuing to wonder if we could not help users with a rather minimal change to avoid getting those issues.

This should really not be awkward, everyone e.g. wants to avoid building special packages - a version per build in the package name to be coinstallable looks bad. At the same time the dependencies would be odd as well, as you want them uninstallable, but only after e.g. a reboot.

Therefore as mentioned before the probably easiest way out is to use /tmp or even better /var/run as a temporary place which also will be auto-cleaned on a host reboot.

Considering that packaging has scripts that run on removal, and distributions could opt-in by placing the "old" modules under /var/run/...//$buildid/*.so that way qemu only needs to look for that path as fallback (should be a small change).

The full path to /usr/lib/x86_64-linux-gnu/qemu/ usually is "drwxr-xr-x root root".
There the modules are currently stored.
The dir /run (and below) is the same so the seucrity impact of "but people could place .so files there" isn't different to today where people that can get root permissions can modify files in the current path qemu looks for loading shared objects.
CC: lets add some security people still to give this an extra thought (TODO add Seth/Mdeslaur)

The DIR could even be set via an environment variable:
197 search_dir = getenv("QEMU_MODULE_DIR");
198 if (search_dir != NULL) {
199 dirs[n_dirs++] = g_strdup_printf("%s", search_dir);
200 }

So if we could set this ENV properly by default we could even go without a change to qemu.
Currently the ENV of a qemu process is:
$ cat /proc/3077/environ
LC_ALL=CPATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
HOME=/var/lib/libvirt/qemu/domain-3-focal2
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-3-focal2/.local/share
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-3-focal2/.cache
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-3-focal2/.config
QEMU_AUDIO_DRV=spiceroot

So the path isn't used already.
But the environment is cleared and constructed by libvirt, we can't easily "modify" it.
=> https://libvirt.org/drvqemu.html#qemucommand

That is a bit of a set-back for a packaging-only change.
But it provides a nicer workaround as users could run with a custom:
  <qemu:commandline>
    <qemu:env name='QEMU_ENV' value='/my/old/so/path'/>
  </qemu:commandline>
And copy the .so files before upgrade to /my/old/so/path.
Never the less this isn't good for multiple upgrades in a row and it isn't good to automate it for the user.

We'd need an ID that packaging (cleanup scripts) and qemu (module load path) know both about.
The packaging knows a distro specific version like 1:4.2-3ubuntu1 while the binary only knows the DSO_STAMP_FUN_STR which is a sha hashed qemu_version+pkg_version.

For example in 1:2.11+dfsg-1ubuntu7.23 that for me was
0000000000001d10 <qemu_module_dummy@@Base-0xf0>:
    1d10: 48 8d 3d 19 01 00 00 lea 0x119(%rip),%rdi # 1e30...

Read more...

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Summary of the above - try an approach to a solution that does:
- prerm to copy to /var/run/qemu/<version>
- qemu patch to fall back loading from /var/run/qemu/<version>

I have a VERY preliminary build at:
https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3961/+packages
And preliminary branches for my experiments:
https://git.launchpad.net/~paelzer/ubuntu/+source/qemu/log/?h=bug-1847361-miss-old-so-on-upgrade-UBUNTU
https://git.launchpad.net/~paelzer/ubuntu/+source/qemu/log/?h=bug-1847361-miss-old-so-on-upgrade-UPSTREAM

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Generated prerm code looks (at a first glimpse) as I'd expected it to be.
    remove|upgrade|deconfigure)
        # retain .so files for still running qemu instances in /var/run
        mkdir -p /var/run/qemu/1_4_2-3ubuntu2~ppa3
        cp /usr/lib/x86_64-linux-gnu/qemu/ui-gtk.so /usr/lib/x86_64-linux-gnu/qemu/audio-*.so /var/run/qemu/1_4_2-3ubuntu2~ppa3/
    ;;

On build time it will adapt:
- the arch to save
- the version path path.
- we are not going to consider co-installed architectures for this
  - the qemu load mechanism will block loading the wrong ones
  - it will not crash
  - just not gaining the benefit of this change for a non-real use case is ok)

Note: the first version had the code to save the files of qemu-system-gui in both packages, but is enough to test.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I see a drawback, people could want to "apt remove" a package to stop its functionality.
And with the fix for this bug this will only be true after a reboot.

The conceptual way out of this as much as possible IMHO is to:
- trigger doing the backup to /var/run only on upgrade but not on remove
- have "purge" and "remove" delete the /var/run/qemu/#files# completely
  - we can't remove the dir as we could have e.g. gui and block packages handled differently

This isn't perfect there could be special-cases constructed to make this not work.
But again - it won't break just not get the benefit of "working on upgrade" and OTOH we want to keep the promise that "remove/purge" will make things stop to work/exist.

Note: "deconfigure" in this case matches "upgrade"

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Note: Leaving states like failed-upgrade and abort-upgrade without removing the files, as people can fix-up from there without hitting the saving function again. And it was not meant to be a "removal" so keep the files in place in these cases.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

After package transition I see:
/var/run/qemu/1_4_2-3ubuntu2~ppa3/:
total 96
drwxr-xr-x 2 root root 80 Mar 3 07:37 ./
drwxr-xr-x 3 root root 60 Mar 3 07:37 ../
-rw-r--r-- 1 root root 22768 Mar 3 07:37 audio-pa.so
-rw-r--r-- 1 root root 71784 Mar 3 07:37 ui-gtk.so

Matching the former:
root@f:~# md5sum /var/run/qemu/1_4_2-3ubuntu2~ppa3/*
3c231f9afc5f4451146953e2d373a601 /var/run/qemu/1_4_2-3ubuntu2~ppa3/audio-pa.so
069a1c8f4aaca598bceab5c2f6510bdf /var/run/qemu/1_4_2-3ubuntu2~ppa3/ui-gtk.so

root@f:~# md5sum /usr/lib/x86_64-linux-gnu/qemu/* /var/run/qemu/*
3c231f9afc5f4451146953e2d373a601 /usr/lib/x86_64-linux-gnu/qemu/audio-pa.so
069a1c8f4aaca598bceab5c2f6510bdf /usr/lib/x86_64-linux-gnu/qemu/ui-gtk.so

Ok that part seems to work, now lets check if qemu tries to load from the right paths.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

## Test ##

1. I have a qemu with the experimental patch applied
# dpkg -l qemu-system-x86
 1:4.2-3ubuntu2~ppa3

2. prep disk xml to load curl.so
# cat > curldisk.xml << EOF
  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol="http" name="ubuntu/dists/bionic-updates/main/installer-amd64/current/images/netboot/mini.iso">
            <host name="archive.ubuntu.com" port="80"/>
    </source>
    <target dev='vdc' bus='virtio'/>
    <readonly/>
  </disk>
EOF

3. check not being loaded
# lsof -p $(pidof qemu-system-x86_64) | grep block
<nothing>

4. load curl.so
# virsh attach-device focal2 curldisk.xml

5. check to be loaded now
# lsof -p $(pidof qemu-system-x86_64) | grep block
qemu-syst 19837 libvirt-qemu mem REG 0,83 34408 386797 /usr/lib/x86_64-linux-gnu/qemu/block-curl.so

1-5 above worked well as-is and my debug code spew out in the log:
DEBUG: trying to load module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so

Ok, lets do the same again but remove the qemu-system-block-extra package to have the .so only in /var/run.

The qemu version in the binary matches:
# qemu-system-x86_64 --version
QEMU emulator version 4.2.0 (Debian 1:4.2-3ubuntu2~ppa3)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

The files matched (but the /usr/lib path is removed before the experiment:
root@f:~# md5sum /var/run/qemu/1_4_2-3ubuntu2~ppa3/*curl* /usr/lib/x86_64-linux-gnu/qemu/*curl*
3d00fe34bddac3caeec9a4be91d84bae /var/run/qemu/1_4_2-3ubuntu2~ppa3/block-curl.so
3d00fe34bddac3caeec9a4be91d84bae /usr/lib/x86_64-linux-gnu/qemu/orig.block-curl.so.orig

Nothing loaded in advance:
# lsof -p $(pidof qemu-system-x86_64) | grep block
<empty>

Uh there is some serious wrong replacement happening :-)
DEBUG: trying to load module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
DEBUG: trying to load module: /usr/bin/../block-curl.so
DEBUG: trying to load module: /usr/bin/block-curl.so
DEBUG: trying to load module: /var/run/qemu/________:_._______________/block-curl.so

So this
Debian 1:4.2-3ubuntu2~ppa3
became
________:_._______________
That replaced the inverse of what I wanted!
Was that function using an allowed set ... ?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Yeah ok it is inverted logic, ok then modify it to that ...
"For each character in string , if the character is not in valid_chars , replaces the character with substitutor."

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Updates:
- (hopefully) fixed the replacement logic
- adapted d/rules to the new logic
- added all the special handling of remove/purge mentioned above
- fixup of some minor issues and adding comments

-> New build qemu_4.2-3ubuntu2~ppa4 in PPA (just started to build)
-> force pushed both branches that I referenced with the new content

PPA: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3961/+packages
Branches:
- https://git.launchpad.net/~paelzer/ubuntu/+source/qemu/log/?h=bug-1847361-miss-old-so-on-upgrade-UBUNTU
- https://git.launchpad.net/~paelzer/ubuntu/+source/qemu/log/?h=bug-1847361-miss-old-so-on-upgrade-UPSTREAM

Testing will continue in a few hours once the new build is complete

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI: to test with just one version things like the following will trigger the prerm logic (not sure if that is still true with remove/upgrade being split as it will be on the next build)
$ apt install --reinstall qemu-system-gui

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Still works:
$ apt install --reinstall qemu-block-extra
(this is like an upgrade to itself)

files match
root@f:~# md5sum /var/run/qemu/1_4.2-3ubuntu2~ppa4_/* /usr/lib/x86_64-linux-gnu/qemu/block*
e5fae910ca4c3c726ae0491a2cdab507 /var/run/qemu/1_4.2-3ubuntu2~ppa4_/block-curl.so
f780837ec4678eb783baecfab43fa2c1 /var/run/qemu/1_4.2-3ubuntu2~ppa4_/block-iscsi.so
34795eba24a8445621328562e0d953f2 /var/run/qemu/1_4.2-3ubuntu2~ppa4_/block-rbd.so
399dfc7f7912506ff96c3b96ed90a6eb /var/run/qemu/1_4.2-3ubuntu2~ppa4_/block-ssh.so
e5fae910ca4c3c726ae0491a2cdab507 /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
f780837ec4678eb783baecfab43fa2c1 /usr/lib/x86_64-linux-gnu/qemu/block-iscsi.so
34795eba24a8445621328562e0d953f2 /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so
399dfc7f7912506ff96c3b96ed90a6eb /usr/lib/x86_64-linux-gnu/qemu/block-ssh.so

Removing the original one to force fallback:
mv /usr/lib/x86_64-linux-gnu/qemu/block-curl.so /usr/lib/x86_64-linux-gnu/qemu/orig.block-curl.so.orig

root@f:~# lsof -p $(pidof qemu-system-x86_64) | grep block
<nothing>

Loading:
DEBUG: trying to load module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
DEBUG: trying to load module: /usr/bin/../block-curl.so
DEBUG: trying to load module: /usr/bin/block-curl.so
DEBUG: trying to load module: /var/run/qemu/Debian_1_4.2-3ubuntu2~ppa4/block-curl.so

Two differences left:
1. the binary has the "Debian_" prefix. that is from Debian rules (silly) but easy to fix
   --with-pkgversion="Debian $(DEB_VERSION)"
   => Prepending that in our templating as well
2. The effective dir has a suffix _
     /var/run/qemu/1_4.2-3ubuntu2~ppa4_
   That is from PKGVERSION substitution, since we use the inverted set it replaces the EOL
   Using printf will not add that and behave in d/rules like the binary

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Now I hit an issue that I expected:

DEBUG: trying to load module: /var/run/qemu/Debian_1_4.2-3ubuntu2~ppa4/block-curl.so
Failed to open module: /var/run/qemu/Debian_1_4.2-3ubuntu2~ppa4/block-curl.so: cannot open shared object file: Permission denied

Which is due to apparmove:
[302376.960953] audit: type=1400 audit(1583238035.059:439): apparmor="DENIED" operation="open" namespace="root//lxd-f_<var-snap-lxd-common-lxd>" profile="libvirt-2bef989e-6d28-45c8-b101-3959de1db2b3" name="/run/qemu/Debian_1_4.2-3ubuntu2~ppa4/block-curl.so" pid=6958 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=64055 ouid=0

I'm on the brink of letting that blocked by default and people would
=> less comfortable, but effectively making the change not even a bit less secure until bigger deployments who care opt in (also this can be decided later on).
Adding a libvirt task for it ...

Changed in libvirt (Ubuntu):
status: New → Triaged
importance: Undecided → Wishlist
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The bad part on the apparmor block is that it is usually hard to change/reload per-guest profiles but that would be needed.
But well one thing at a time, for now adding an override:

in /etc/apparmor.d/local/abstractions/libvirt-qemu :
# let qemu load old shared objects after upgrades (LP: #1847361)
/{var/,}run/qemu/*/*.so mr,

With that it worked well.
I'll retest later with my ~ppa5 build and then it should be ready for an upstream discussion.

Changed in qemu (Ubuntu):
importance: Wishlist → Low
importance: Low → Wishlist
Changed in libvirt (Ubuntu):
importance: Wishlist → Low
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

New dir created by maintainer scripts:
drwxr-xr-x 2 root root 120 Mar 3 16:20 Debian_1_4.2-3ubuntu2~ppa5/

And loading from there after an upgrade works just fine.
DEBUG: trying to load module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
DEBUG: trying to load module: /usr/bin/../block-curl.so
DEBUG: trying to load module: /usr/bin/block-curl.so
DEBUG: trying to load module: /var/run/qemu/Debian_1_4.2-3ubuntu2~ppa5/block-curl.so

I'm locked up in meetings now, but this is good for upstream submission ... probably tomorrow.

Changed in qemu (Ubuntu):
status: Triaged → In Progress
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The patch is small but upstream doesn't get to it atm as it isn't on the high-prio list of anyone.
I'll continue to ping there but we should start to prep and review the packaging changes to be ready. Eventually we might even go forward and use that in Ubuntu without getting upstream.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Related Merge Proposals:
- qemu: https://code.launchpad.net/~paelzer/ubuntu/+source/qemu/+git/qemu/+merge/380467
- libvirt: https://code.launchpad.net/~paelzer/ubuntu/+source/libvirt/+git/libvirt/+merge/380469

I have already mailed Ubuntu-security on this in the past and now added a review slot for them on the MPs to feel safe for the "load .so from other place". I think it is as safe as the normal apth in /usr/lib/... but they are the experts to decide.

Finally I have converted this (in the description) to also be an FFe and subscribed ubuntu-release to check and ack on it.

description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Submitted the v2 of qemu patch to the ML.
PPA and MPs are updated regularly.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

v2 now got a:
Reviewed-by: Daniel P. Berrangé <address@hidden>
at: https://lists.nongnu.org/archive/html/qemu-devel/2020-03/msg02749.html

Lets give this a few more days.

While that is waiting hopefully the release team has some time to sign off this upload for the FFe.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Queued upstream by Paolo, I think we can go on with the uplaod as soon as we get the FFe Ack

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Now that it is queued in qemu I submitted the libvirt-apparmor portion to libvirt upstream as well.
Having already the ack of Jamie on the Ubunut MP means to me we are fine to add that in "our libvirt" already.

Upstream submission:
https://www.redhat.com/archives/libvir-list/2020-March/msg00486.html

Also pushed an updated qemu MP with e.g. changelog updates and better commit splits (same content).

Revision history for this message
Iain Lane (laney) wrote :

+1 from the release team.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thanks Laney, uploaded to Focal

Changed in qemu (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 6.0.0-0ubuntu5

---------------
libvirt (6.0.0-0ubuntu5) focal; urgency=medium

  * d/p/ubuntu-aa/lp-1847361-load-versioned-module.patch: allow loading
    versioned modules after qemu package upgrades (LP: #1847361)

 -- Christian Ehrhardt <email address hidden> Tue, 10 Mar 2020 08:58:04 +0100

Changed in libvirt (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:4.2-3ubuntu2

---------------
qemu (1:4.2-3ubuntu2) focal; urgency=medium

  * allow qemu to load old modules post upgrade (LP: #1847361)
    - d/p/ubuntu/lp-1847361-modules-load-upgrade.patch: to fallback module
      load to a versioned path
    - d/qemu-block-extra.*.in, d/qemu-system-gui.*.in: save shared objects on
      upgrade
    - d/rules: generate maintainer scripts matching package version on build
    - d/rules: enable --enable-module-upgrades where --enable-modules is set
  * d/p/ubuntu/lp-1847361-vhost-correctly-turn-on-VIRTIO_F_IOMMU_PLATFORM.patch:
    avoid unnecessary IOTLB transactions (LP: #1866207)

 -- Christian Ehrhardt <email address hidden> Mon, 02 Mar 2020 15:21:27 +0100

Changed in qemu (Ubuntu):
status: Fix Committed → Fix Released
Changed in qemu (Ubuntu Bionic):
assignee: nobody → Christian Ehrhardt  (paelzer)
Changed in qemu (Ubuntu Bionic):
status: New → Triaged
Changed in qemu (Ubuntu Eoan):
status: New → Triaged
Changed in libvirt (Ubuntu Eoan):
status: New → Triaged
Changed in libvirt (Ubuntu Bionic):
status: New → Triaged
Changed in qemu (Ubuntu Eoan):
assignee: nobody → Christian Ehrhardt  (paelzer)
Changed in libvirt (Ubuntu Eoan):
assignee: nobody → Christian Ehrhardt  (paelzer)
Changed in libvirt (Ubuntu Bionic):
assignee: nobody → Christian Ehrhardt  (paelzer)
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

TODO:
include fix for bug 1871830 once finished

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
description: updated
description: updated
description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

MPs complete, pre tests complete - uploaded to Eoan/Bionic -unapproved for SRU team Review.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Pinged in #ubuntu-release as we'd have subsequent fixes by mdeslaur starting to queue up.

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Billy, or anyone else affected,

Accepted qemu into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:4.0+dfsg-0ubuntu9.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in qemu (Ubuntu Eoan):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-eoan
Changed in libvirt (Ubuntu Eoan):
status: Triaged → Fix Committed
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Billy, or anyone else affected,

Accepted libvirt into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libvirt/5.4.0-0ubuntu5.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Billy, or anyone else affected,

Accepted qemu into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:2.11+dfsg-1ubuntu7.24 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in qemu (Ubuntu Bionic):
status: Triaged → Fix Committed
tags: added: verification-needed-bionic
Changed in libvirt (Ubuntu Bionic):
status: Triaged → Fix Committed
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Billy, or anyone else affected,

Accepted libvirt into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/libvirt/4.0.0-1ubuntu8.16 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:4.0+dfsg-0ubuntu9.5)

All autopkgtests for the newly accepted qemu (1:4.0+dfsg-0ubuntu9.5) for eoan have finished running.
The following regressions have been reported in tests triggered by the package:

systemd/242-7ubuntu3.7 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/eoan/update_excuses.html#qemu

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The systemd test on arm is just flaky and I retried.

More concerning is that qemu in bionic failed to build which it clearly did in the PPAs before.
Something must be different or been lost in between, checking that...

block/file-posix.o -MF block/file-posix.d -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g -c -o block/file-posix.o /<<BUILDDIR>>/qemu-2.11+dfsg/block/file-posix.c
In file included from /<<BUILDDIR>>/qemu-2.11+dfsg/include/qemu/hbitmap.h:15:0,
                 from /<<BUILDDIR>>/qemu-2.11+dfsg/include/block/dirty-bitmap.h:5,
                 from /<<BUILDDIR>>/qemu-2.11+dfsg/include/block/block.h:9,
                 from /<<BUILDDIR>>/qemu-2.11+dfsg/include/block/block_int.h:28,
                 from /<<BUILDDIR>>/qemu-2.11+dfsg/block/file-posix.c:28:
/usr/include/linux/swab.h: In function ‘__swab’:
/<<BUILDDIR>>/qemu-2.11+dfsg/include/qemu/bitops.h:20:34: warning: "sizeof" is not defined, evaluates to 0 [-Wundef]
 #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
                                  ^
/<<BUILDDIR>>/qemu-2.11+dfsg/include/qemu/bitops.h:20:41: error: missing binary operator before token "("
 #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)

The issue is the same on all architectures.
Unfortunately the PPA I could use for comparison [1] is cleaned up by now, but we know it built back then and also we are not directly touching any of the files in the error path.

It is reproducible in a local sbuild, the former (current) bionic version qemu_2.11+dfsg-1ubuntu7.23 fails the same way, so as I assumed it is a change in some other package that made it break in the meantime.

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4012

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Simplified the breaking call a bit and got:
gcc -Iblock -I. -I/build/qemu-D0lqny/qemu-2.11+dfsg -I/build/qemu-D0lqny/qemu-2.11+dfsg/include -pthread -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -D_GNU_SOURCE -Wall -Wundef -o file-posix.o ../block/file-posix.c

From there I got down to:
test.c:
#define BITS_PER_LONG (sizeof (unsigned long))
#include <linux/cdrom.h>
int main() {
   return BITS_PER_LONG;
}

$ gcc -Wall -Wundef -o test.o test.c
/usr/include/linux/swab.h: In function ‘__swab’:
test.c:1:34: warning: "sizeof" is not defined, evaluates to 0 [-Wundef]
 #define BITS_PER_LONG (sizeof (unsigned long))
                                  ^
test.c:1:41: error: missing binary operator before token "("
 #define BITS_PER_LONG (sizeof (unsigned long))
                                         ^

Root cause is the missing sizeof which makes BITS_PER_LONG and odd define.
With that defined including linux/cdrom.h breaks.
Dropping the cdrom include avoids it, as much as not defining BITS_PER_LONG (defining it to any reasonable number works are well).

A build with -E shows that BITS_PER_LONG becomes the actual text as it replaces the one in the return to this:
   return (sizeof (unsigned long));

That understanding lets me further simplify the test:
#define FOO (sizeof (unsigned long))
int main() {
#if FOO == 8
   return FOO;
#endif
}

$ gcc -Wall -Wundef -o test.o test.c
test.c: In function ‘main’:
test.c:1:24: warning: "sizeof" is not defined, evaluates to 0 [-Wundef]
 #define FOO (sizeof (unsigned long))
                        ^
test.c:4:5: note: in expansion of macro ‘FOO’
 #if FOO == 8
     ^~~
test.c:1:31: error: missing binary operator before token "("
 #define FOO (sizeof (unsigned long))
                               ^
test.c:4:5: note: in expansion of macro ‘FOO’
 #if FOO == 8
     ^~~

But that has no external dependencies anymore other than the compiler.
Why/where is sizeof missing.

To be clear, this works and returns 8
int main() {
   return (sizeof (unsigned long));

It is only the usage in the definition of the preprocessor var that fails.

But this fails since forever from Xenial to Focal, I must still miss some other change that actually makes this happen (or was mitigating it in the past).

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

It is odd, the simplified error seems to exist for all times.

But the old build had the same code in place and didn't trigger the issue [1]
Never the less re-building 2.11+dfsg-1ubuntu7.23 nowadays triggers the same bug.

The question really became, what did prevent the issue from surfacing a month ago (and still does in Eoan where the build with the same code in qemu still works).

It doesn't seem to be ./configure results as the cc call arguments didn't change at all - [2] matches [1] in that.

[1]: https://launchpadlibrarian.net/464979008/buildlog_ubuntu-bionic-amd64.qemu_1%3A2.11+dfsg-1ubuntu7.23_BUILDING.txt.gz
[2]: https://launchpadlibrarian.net/478385568/buildlog_ubuntu-bionic-amd64.qemu_1%3A2.11+dfsg-1ubuntu7.24_BUILDING.txt.gz

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I was going into a bionic VM and bulding the former built .23 version.
In that environment I can compile file-posix.c.

The simplified call works in that VM:
$ gcc -Iblock -I. -I/home/ubuntu/qemu-2.11+dfsg -I/home/ubuntu/qemu-2.11+dfsg/include -pthread -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -D_GNU_SOURCE -Wall -Wundef -c -o file-posix.o ../block/file-posix.c 2>&1

While the same in e.g. a sbuild-chroot does not!

Running the compile with -E shows the usage of BITPER_LONG in e.g. test_bit
code:
    return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
good case -E:
    return 1UL & (addr[((nr) / (sizeof (unsigned long) * 8))] >> (nr & ((sizeof (unsigned long) * 8)-1)));
bad case -E doesn't get to that, as it breaks in the preprocessor stage with the known errors.

I aligned packages by purging all that the VM had on top of the build env.
I did a clean and a new build run to be sure, but still it builds in the VM but not in the chroot.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

It was suggested to use -save-temps (thanks danpb) and comparing the -i files.

The following is from:
$ gcc -Iblock -I. -I/build/qemu-D0lqny/qemu-2.11+dfsg -I/build/qemu-D0lqny/qemu-2.11+dfsg/include -pthread -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -D_GNU_SOURCE -Wall -Wundef -ftrack-macro-expansion=8 -c -CC -save-temps -o file-posix.o ../block/file-posix.c

Attaching the two files shows a lot of differences, but without knowing what to look for it can be hard to just "read" them for an issue.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

It also was found that the compiler versions changed:
 good build 7.4.0-1ubuntu1~18.04.1
  bad build 7.5.0-3ubuntu1~18.04

Since it is easier available I first tried the one in 18.04-release.

v="7.3.0-16ubuntu3";
v2="4:7.3.0-3ubuntu2";
apt install cpp-7=$v g++-7=$v gcc-7=$v gcc-7-base=$v libasan4=$v libcilkrts5=$v libgcc-7-dev=$v libstdc++-7-dev=$v libubsan0=$v cpp=$v2 g++=$v2 gcc=$v2

I further tried gcc-8 8.4.0-1ubuntu1~18.04

But with all those versiosn the issue still happens.

In addition I ran make clean, did a reconfigure with the same arguments and built again - still the same in schroot breaks missing the sizeof while in the VM it works.

P.S. for later tests I went back to the latest version in Bionic

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Coming back to -save-temps I found a way to make them more similar.
I had ran one with -CC (do not remove comments), uploading a new file.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

After replacing the build dir prefix in both
 build/qemu-D0lqny -> builddir
 home/ubuntu -> builddir
this is even more readable

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

There are a few offsets changed which do not matter.
What is left is:

$ diff -Naur file-posix.i.good file-posix.i.bad
--- file-posix.i.good 2020-05-06 13:42:04.936368109 +0200
+++ file-posix.i.bad 2020-05-06 13:46:23.573406190 +0200
@@ -69470,6 +69470,9 @@
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 # 6 "/usr/include/linux/swab.h" 2 3 4

+# 1 "/usr/include/x86_64-linux-gnu/asm/bitsperlong.h" 1 3 4
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+# 8 "/usr/include/linux/swab.h" 2 3 4
 # 1 "/usr/include/x86_64-linux-gnu/asm/swab.h" 1 3 4
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */

-# 135 "/usr/include/linux/swab.h" 3 4
+# 136 "/usr/include/linux/swab.h" 3 4
+static __inline __attribute__ ((__always_inline__)) unsigned long __swab(const unsigned long y)
+{
+
+
+
+ return (__builtin_constant_p((__u32)(y)) ? ((__u32)( (((__u32)(y) & (__u32)0x000000ffUL) << 24) | (((__u32)(y) & (__u32)0x0000ff00UL) << 8) | (((__u32)(y) & (__u32)0x00ff0000UL) >> 8) | (((__u32)(y) & (__u32)0xff000000UL) >> 24))) : __fswab32(y));
+
+}
+
 /**
  * __swahw32 - return a word-swapped 32-bit value
  * @x: value to wordswap

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Good and bad system run the same
abe91699d25bc563c77de9ff5d716fb5 /usr/include/x86_64-linux-gnu/asm/bitsperlong.h

But they differ in
c6fb5761f50f398f66acd3b6cf3e2bcc /usr/include/linux/swab.h (bad)
1b3e767d11860a5eee41c8ea87b60c03 /usr/include/linux/swab.h (good)

Since this is very close to our bug lets drill that down.

Package:
$ dpkg -S /usr/include/linux/swab.h
linux-libc-dev:amd64: /usr/include/linux/swab.h

good: 4.15.0-99.100
bad: 4.15.0-100.101

4.15.0-100.101 is from -proposed which is used in new builds.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

And here we have the trigger:

$ diff -Naur swab.h.4.15.0-99.100.good swab.h.4.15.0-100.101.bad
--- swab.h.4.15.0-99.100.good 2020-05-06 13:56:28.755885666 +0200
+++ swab.h.4.15.0-100.101.bad 2020-05-06 13:55:39.191681069 +0200
@@ -4,6 +4,7 @@

 #include <linux/types.h>

+#include <asm/bitsperlong.h>
 #include <asm/swab.h>

 /*
@@ -132,6 +133,15 @@
        __fswab64(x))
 #endif

+static __always_inline unsigned long __swab(const unsigned long y)
+{
+#if BITS_PER_LONG == 64
+ return __swab64(y);
+#else /* BITS_PER_LONG == 32 */
+ return __swab32(y);
+#endif
+}
+
 /**
  * __swahw32 - return a word-swapped 32-bit value
  * @x: value to wordswap

That means the linux-libc-dev package being part of the proposed new 4.15 kernel in Bionic will break at least qemu and maybe others.

The problem is that it includes <asm/bitsperlong.h> which defines:
 # define __BITS_PER_LONG 64

But then uses BITS_PER_LONG (missing the leading underscores).
Due to that it will in the qemu case use what qemu has defined and break.
But even worse in other cases maybe use the wrong swab function.

Broken by [1]
commit 2385a55f64a65baf6594f37bfa018e2797dcb8c7 (sha from ubuntu kernel, upstream d5767057c9a)
Author: Yury Norov <email address hidden>
Date: Thu Jan 30 22:16:40 2020 -0800

    uapi: rename ext2_swab() to swab() and share globally in swab.h

Fixed by [2] (but missing in our proposed kernel)
commit 467d12f5c7842896d2de3ced74e4147ee29e97c8
Author: Christian Borntraeger <email address hidden>
Date: Thu Feb 20 20:04:03 2020 -0800

    include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for swap

This fix also is in 4.14 stable kernel as ffd115f2dca955ce0782e801d488ecfaccde421f.
That should be the closest for our kernel.

@Kernel - Please consider NOT to release 4.15.0-100.101 as-is, it needs this fix.
Getting this fixed effectively gates any qemu update in Bionic (and maybe other things as well).

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I have contacted the kernel team, depending on the outcome of the discussion we can decide here how to continue with qemu.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

This is now gated by bug 1877123, once that is at least in Bionic-proposed we can re-build the qemu in Bionic-proposed.

Revision history for this message
Philippe Mathieu-Daudé (philmd) wrote :
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Phillipe,
I already identified that patch and (as mentioned in the comment above) requested a backport of that patch in bug 1877123 already.
But thanks for your ping - better finding it twice than not knowing at all what is blocking this.

Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:2.11+dfsg-1ubuntu7.24)

All autopkgtests for the newly accepted qemu (1:2.11+dfsg-1ubuntu7.24) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

systemd/237-3ubuntu10.40 (i386)
vagrant-mutate/1.2.0-3 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#qemu

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Download full text (8.0 KiB)

Reinstall into saved .so files in versioned directory:

root@b:~# apt install --reinstall qemu-block-extra qemu-kvm qemu-system-common qemu-system-x86 qemu-utils
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 5 reinstalled, 0 to remove and 0 not upgraded.
Need to get 6792 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 qemu-system-common amd64 1:2.11+dfsg-1ubuntu7.24 [672 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 qemu-block-extra amd64 1:2.11+dfsg-1ubuntu7.24 [40.2 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 qemu-kvm amd64 1:2.11+dfsg-1ubuntu7.24 [13.2 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 qemu-system-x86 amd64 1:2.11+dfsg-1ubuntu7.24 [5197 kB]
Get:5 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 qemu-utils amd64 1:2.11+dfsg-1ubuntu7.24 [869 kB]
Fetched 6792 kB in 3s (2170 kB/s)
(Reading database ... 37924 files and directories currently installed.)
Preparing to unpack .../qemu-system-common_1%3a2.11+dfsg-1ubuntu7.24_amd64.deb ...
Unpacking qemu-system-common (1:2.11+dfsg-1ubuntu7.24) over (1:2.11+dfsg-1ubuntu7.24) ...
Preparing to unpack .../qemu-block-extra_1%3a2.11+dfsg-1ubuntu7.24_amd64.deb ...
Unpacking qemu-block-extra:amd64 (1:2.11+dfsg-1ubuntu7.24) over (1:2.11+dfsg-1ubuntu7.24) ...
Preparing to unpack .../qemu-kvm_1%3a2.11+dfsg-1ubuntu7.24_amd64.deb ...
Unpacking qemu-kvm (1:2.11+dfsg-1ubuntu7.24) over (1:2.11+dfsg-1ubuntu7.24) ...
Preparing to unpack .../qemu-system-x86_1%3a2.11+dfsg-1ubuntu7.24_amd64.deb ...
Unpacking qemu-system-x86 (1:2.11+dfsg-1ubuntu7.24) over (1:2.11+dfsg-1ubuntu7.24) ...
Preparing to unpack .../qemu-utils_1%3a2.11+dfsg-1ubuntu7.24_amd64.deb ...
Unpacking qemu-utils (1:2.11+dfsg-1ubuntu7.24) over (1:2.11+dfsg-1ubuntu7.24) ...
Setting up qemu-block-extra:amd64 (1:2.11+dfsg-1ubuntu7.24) ...
Setting up qemu-utils (1:2.11+dfsg-1ubuntu7.24) ...
Setting up qemu-system-common (1:2.11+dfsg-1ubuntu7.24) ...
Setting up qemu-system-x86 (1:2.11+dfsg-1ubuntu7.24) ...
Setting up qemu-kvm (1:2.11+dfsg-1ubuntu7.24) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
root@b:~# ll /var/run/qemu/Debian_1_2.11+dfsg-1ubuntu7.24/
total 92
drwxr-xr-x 2 root root 100 May 14 05:45 ./
drwxr-xr-x 3 root root 60 May 14 05:45 ../
-rw-r--r-- 1 root root 25600 May 14 05:45 block-curl.so
-rw-r--r-- 1 root root 36008 May 14 05:45 block-iscsi.so
-rw-r--r-- 1 root root 27712 May 14 05:45 block-rbd.so

Same for Eoan:
root@e:~# ll /var/run/qemu/Debian_1_4.0+dfsg-0ubuntu9.5/
total 120
drwxr-xr-x 2 root root 100 May 14 05:45 ./
drwxr-xr-x 3 root root 60 May 14 05:45 ../
-rw-r--r-- 1 root root 34280 May 14 05:45 block-curl.so
-rw-r--r-- 1 root root 49256 May 14 05:45 block-iscsi.so
-rw-r--r-- 1 root root 31752 May 14 05:45 block-rbd.so

 rename -v 's/\.so/.so.unavailable/' /usr/lib/x86_64-linux-gnu/qemu/*
/usr/lib/x86_64-linux-gnu/qemu/block-curl.so renamed as /usr/lib/x86_64-linux-gnu/qemu/block-curl.so.unavailable
/usr/lib/x86_64-linux-gnu/qemu/block-iscsi.s...

Read more...

tags: added: verification-done-eoan
removed: verification-needed-eoan
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The config is enabled in the right build, but I'm not seeing the new lines passed in the debugger, could be just optimization but it is suspicious.

Poking at the build dir we also see:
  qemu-build/config-host.mak:58:CONFIG_MODULE_UPGRADES=y

If we render the preprocessor step into a file all looks good still:
    char *version_dir;
    char *dirs[4];
...
    version_dir = g_strcanon(g_strdup("(Debian 1:2.11+dfsg-1ubuntu7.24)"),
                             "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz" "0123456789" "+-.~",
                             '_');
    dirs[i++] = g_strdup_printf("/var/run/qemu/%s", version_dir);

That only is in when the config option was considered correctly.

Some twists later I got the debugging corrected.
Now I see
(gdb) p version_dir
$14 = 0x55bfc6f31880 "_Debian_1_2.11+dfsg-1ubuntu7.24_"
Which ends up as:
$22 = {0x55bfc5bccfa0 "/usr/lib/x86_64-linux-gnu/qemu", 0x55bfc5dc4710 "/usr/bin/..", 0x55bfc5cf1840 "/usr/bin", 0x55bfc5ea98d0 "/var/run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.24_"}

Do mind the leading and trailing underscores.

That is due to the version in Bionic still having:
config-host.mak:52:PKGVERSION= (Debian 1:2.11+dfsg-1ubuntu7.24)
qemu-version.h:1:#define QEMU_PKGVERSION "(Debian 1:2.11+dfsg-1ubuntu7.24)"

And with that we get as expected:
7f70d74d6000-7f70d74d7000 rw-p 00008000 00:4d 204 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.24_/block-iscsi.so
and
7f70d6ea0000-7f70d6ea1000 r--p 00004000 00:4d 203 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.24_/block-curl.so

So the backport needs to adapt in the maintscripts to reflect that older behavior of having the enclosing brackets as part of the version string.
Now that the issue is understood the fix is trivial - I'll do a test run of a PPA and upload to -unapproved later on.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Thanks for the verification Christian! I will release the eoan version and wait for further info regarding the bionic one.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 5.4.0-0ubuntu5.3

---------------
libvirt (5.4.0-0ubuntu5.3) eoan; urgency=medium

  * d/p/ubuntu-aa/lp-1847361-load-versioned-module.patch: allow loading
    versioned modules after qemu package upgrades (LP: #1847361)

 -- Christian Ehrhardt <email address hidden> Thu, 09 Apr 2020 08:28:33 +0200

Changed in libvirt (Ubuntu Eoan):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:4.0+dfsg-0ubuntu9.5

---------------
qemu (1:4.0+dfsg-0ubuntu9.5) eoan; urgency=medium

  * allow qemu to load old modules post upgrade (LP: #1847361)
    - d/p/ubuntu/lp-1847361-modules-load-upgrade.patch: to fallback module
      load to a versioned path
    - d/qemu-block-extra.*.in, d/qemu-system-gui.*.in: save shared objects on
      upgrade
    - d/rules: generate maintainer scripts matching package version on build
    - d/rules: enable --enable-module-upgrades where --enable-modules is set

 -- Christian Ehrhardt <email address hidden> Mon, 02 Mar 2020 15:21:27 +0100

Changed in qemu (Ubuntu Eoan):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for libvirt has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The new dir by maintainer scripts now is:
  /var/run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/
That matches the version as internally stord by that qemu version.

Now loading works as expected - after using device_attach with the original files moved away.
root@b:~# pidof qemu-system-x86_64; cat /proc/$(pidof qemu-system-x86_64)/maps | grep -e rbd -e ceph -e curl -e iscsi
116371
7f14253cd000-7f14253f1000 r-xp 00000000 00:43 43986 /usr/lib/x86_64-linux-gnu/libiscsi.so.7.0.2
7f14253f1000-7f14255f0000 ---p 00024000 00:43 43986 /usr/lib/x86_64-linux-gnu/libiscsi.so.7.0.2
7f14255f0000-7f14255f1000 r--p 00023000 00:43 43986 /usr/lib/x86_64-linux-gnu/libiscsi.so.7.0.2
7f14255f1000-7f14255f2000 rw-p 00024000 00:43 43986 /usr/lib/x86_64-linux-gnu/libiscsi.so.7.0.2
7f14255f2000-7f14255fa000 r-xp 00000000 00:4d 208 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-iscsi.so
7f14255fa000-7f14257f9000 ---p 00008000 00:4d 208 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-iscsi.so
7f14257f9000-7f14257fa000 r--p 00007000 00:4d 208 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-iscsi.so
7f14257fa000-7f14257fb000 rw-p 00008000 00:4d 208 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-iscsi.so
7f147cdd7000-7f147ce51000 r-xp 00000000 00:43 9918 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f147ce51000-7f147d050000 ---p 0007a000 00:43 9918 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f147d050000-7f147d053000 r--p 00079000 00:43 9918 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f147d053000-7f147d054000 rw-p 0007c000 00:43 9918 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f147d054000-7f147d058000 r-xp 00000000 00:4d 205 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-curl.so
7f147d058000-7f147d258000 ---p 00004000 00:4d 205 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-curl.so
7f147d258000-7f147d259000 r--p 00004000 00:4d 205 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-curl.so
7f147d259000-7f147d25a000 rw-p 00005000 00:4d 205 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25~ppa1_/block-curl.so

I'll upload this qemu fixup to bionic-unapproved for a quick re-review and retest (it builds quite some time)

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Billy, or anyone else affected,

Accepted qemu into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:2.11+dfsg-1ubuntu7.25 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Recheck - 1:2.11+dfsg-1ubuntu7.23 affected as expected.

upgrade to 1:2.11+dfsg-1ubuntu7.25 worked

On re-install/upgrade this dir is created
/var/run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25_/

After the .so files have been made unavailable it can still load the hotplug disk.
# virsh attach-device lateload curldisk.xml
Device attached successfully

And we see the fallback .so being loaded:
root@b:~# pidof qemu-system-x86_64; cat /proc/$(pidof qemu-system-x86_64)/maps | grep -e rbd -e ceph -e curl -e iscsi
15149
7f652157e000-7f65215f8000 r-xp 00000000 00:35 9991 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f65215f8000-7f65217f7000 ---p 0007a000 00:35 9991 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f65217f7000-7f65217fa000 r--p 00079000 00:35 9991 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f65217fa000-7f65217fb000 rw-p 0007c000 00:35 9991 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7f65300db000-7f65300df000 r-xp 00000000 00:4c 200 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25_/block-curl.so
7f65300df000-7f65302df000 ---p 00004000 00:4c 200 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25_/block-curl.so
7f65302df000-7f65302e0000 r--p 00004000 00:4c 200 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25_/block-curl.so
7f65302e0000-7f65302e1000 rw-p 00005000 00:4c 200 /run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25_/block-curl.so

With an extra hop, but verified.

tags: added: verification-done verification-done-bionic
removed: verification-needed verification-needed-bionic
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Note: the qemu tests implicitly test the libvirt change (appamor to allow to load these new paths) so the verification is true for both packages in both releases.

Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:2.11+dfsg-1ubuntu7.25)

All autopkgtests for the newly accepted qemu (1:2.11+dfsg-1ubuntu7.25) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

vagrant-mutate/1.2.0-3 (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#qemu

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

This is a known flaky test, I'll retry it.
Everything else seems to be good already.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 4.0.0-1ubuntu8.16

---------------
libvirt (4.0.0-1ubuntu8.16) bionic; urgency=medium

  * d/p/ubuntu-aa/lp-1847361-load-versioned-module.patch: allow loading
    versioned modules after qemu package upgrades (LP: #1847361)

 -- Christian Ehrhardt <email address hidden> Thu, 09 Apr 2020 08:28:33 +0200

Changed in libvirt (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.11+dfsg-1ubuntu7.25

---------------
qemu (1:2.11+dfsg-1ubuntu7.25) bionic; urgency=medium

  * d/rules: match how 2.11 stores PKGVERSION (LP: 1847361)

qemu (1:2.11+dfsg-1ubuntu7.24) bionic; urgency=medium

  * allow qemu to load old modules post upgrade (LP: #1847361)
    - d/p/ubuntu/lp-1847361-modules-load-upgrade.patch: to fallback module
      load to a versioned path
    - d/qemu-block-extra.*.in: save shared objects on upgrade
    - d/rules: generate maintainer scripts matching package version on build
    - d/rules: enable --enable-module-upgrades where --enable-modules is set

 -- Christian Ehrhardt <email address hidden> Thu, 14 May 2020 10:02:30 +0200

Changed in qemu (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Ante Karamatić (ivoks) wrote :

OpenSack Stein (and coresponding qemu package) are still supported in Ubuntu Cloud Archive and therefore should also have this fixed.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Ante: Updates from the supported Ubuntu releases are usually automatically backported to the corresponding UCA pockets. Are you seeing that not happen with this bug?

Changed in cloud-archive:
status: New → Incomplete
Revision history for this message
James Page (james-page) wrote :

Stein sources from Disco which I think went EOL before this SRU so the stein UCA would not have received the same update as bionic or bionic/train (which sources from eoan).

Revision history for this message
Corey Bryant (corey.bryant) wrote :

Disco is EOL so it probably didn't get patched and if that's the case then stein needs patching directly.

Changed in cloud-archive:
status: Incomplete → Confirmed
Revision history for this message
Corey Bryant (corey.bryant) wrote :

@Ante, I triaged as medium only because it looks like there are workarounds. Regardless we'll get to this soon for stein.

Changed in cloud-archive:
status: Confirmed → Fix Released
Revision history for this message
Dan Streetman (ddstreet) wrote :

@vtapia is preparing the patches for qemu in Stein for this bug and bug 1848497

Revision history for this message
Victor Tapia (vtapia) wrote :
Revision history for this message
Victor Tapia (vtapia) wrote :

I've backported Christian's patches to stein (see attached debdiffs) and after testing the described reproducer I can confirm that they fix the issue. Feel free to test them and let me know if there's any issue.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Victor,
debdiffs with the backports LGTM

Revision history for this message
Victor Tapia (vtapia) wrote :
Revision history for this message
Victor Tapia (vtapia) wrote :

Updated the qemu debdiff with the backported fix for bug 1848497

Revision history for this message
Corey Bryant (corey.bryant) wrote : Please test proposed package

Hello Billy, or anyone else affected,

Accepted libvirt into stein-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:stein-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-stein-needed to verification-stein-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-stein-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-stein-needed
Revision history for this message
Corey Bryant (corey.bryant) wrote :

Hello Billy, or anyone else affected,

Accepted qemu into stein-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:stein-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-stein-needed to verification-stein-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-stein-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Revision history for this message
Victor Tapia (vtapia) wrote :

=== Verification ===

1. Start qemu using the proposed package:

$ sudo qemu-system-x86_64 -machine none -S -nographic -monitor stdio -serial null
QEMU 3.1.0 monitor - type 'help' for more information
(qemu) info version
3.1.0Debian 1:3.1+dfsg-2ubuntu3.7~cloud1

2. Run "apt install --reinstall" so the prerm scripts copy the modules from /usr/lib/x86_64-linux-gnu/qemu/ to /var/run/qemu/$VERSION

$ md5sum /var/run/qemu/Debian_1_3.1+dfsg-2ubuntu3.7~cloud1/block-curl.so /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
9424706c3ea3f1b3845fd3defbf6879c /var/run/qemu/Debian_1_3.1+dfsg-2ubuntu3.7~cloud1/block-curl.so
9424706c3ea3f1b3845fd3defbf6879c /usr/lib/x86_64-linux-gnu/qemu/block-curl.so

3. Remove the module from /usr/lib/x86_64-linux-gnu/qemu/

$ sudo rm /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
$ ll /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
ls: cannot access '/usr/lib/x86_64-linux-gnu/qemu/block-curl.so': No such file or directory

4. Add a curl device so it pulls the module

(qemu) drive_add 0 readonly=on,file=http://archive.ubuntu.com/ubuntu/dists/bionic/main/installer-amd64/current/images/netboot/mini.iso
OK

5. Confirm it's falling back to /var/run/qemu/$VERSION when /usr/lib/x86_64-linux-gnu/qemu/ does not work

$ pidof qemu-system-x86_64; sudo cat /proc/$(pidof qemu-system-x86_64)/maps | grep -e curl
8687
7fad1ee33000-7fad1eead000 r-xp 00000000 08:02 3937904 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7fad1eead000-7fad1f0ac000 ---p 0007a000 08:02 3937904 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7fad1f0ac000-7fad1f0af000 r--p 00079000 08:02 3937904 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7fad1f0af000-7fad1f0b0000 rw-p 0007c000 08:02 3937904 /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.5.0
7fad1f0b0000-7fad1f0b5000 r-xp 00000000 00:16 993 /run/qemu/Debian_1_3.1+dfsg-2ubuntu3.7~cloud1/block-curl.so
7fad1f0b5000-7fad1f2b4000 ---p 00005000 00:16 993 /run/qemu/Debian_1_3.1+dfsg-2ubuntu3.7~cloud1/block-curl.so
7fad1f2b4000-7fad1f2b5000 r--p 00004000 00:16 993 /run/qemu/Debian_1_3.1+dfsg-2ubuntu3.7~cloud1/block-curl.so
7fad1f2b5000-7fad1f2b6000 rw-p 00005000 00:16 993 /run/qemu/Debian_1_3.1+dfsg-2ubuntu3.7~cloud1/block-curl.so

** Note: if /run is mounted as noexec, step 4 will fail with the following message:

(qemu) drive_add 0 readonly=on,file=http://archive.ubuntu.com/ubuntu/dists/bionic/main/installer-amd64/current/images/netboot/mini.iso
Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
Note: only modules from the same build can be loaded.
Failed to open module: /var/run/qemu/_Debian_1_2.11+dfsg-1ubuntu7.25_/block-curl.so: failed to map segment from shared object
Unknown protocol 'http'

This affects all fixed releases (B/E+), and the workaround is to remount the /run fs without noexec (sudo mount -o remount,exec /run)

Victor Tapia (vtapia)
tags: added: verification-stein-done
removed: verification-stein-needed
Revision history for this message
Corey Bryant (corey.bryant) wrote : Update Released

The verification of the Stable Release Update for libvirt has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package libvirt - 5.0.0-1ubuntu2.6~cloud1
---------------

 libvirt (5.0.0-1ubuntu2.6~cloud1) bionic-stein; urgency=medium
 .
   * d/p/ubuntu-aa/lp-1847361-load-versioned-module.patch: allow loading
     versioned modules after qemu package upgrades (LP: #1847361)

Revision history for this message
Corey Bryant (corey.bryant) wrote :

The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package qemu - 1:3.1+dfsg-2ubuntu3.7~cloud1
---------------

 qemu (1:3.1+dfsg-2ubuntu3.7~cloud1) bionic-stein; urgency=medium
 .
   * d/p/ubuntu/lp-1848497-virtio-balloon-fix-QEMU-4.0-config-size-migration-in.patch:
     fix migration issue from qemu <4.0 when using virtio-balloon (LP: #1848497)
   * allow qemu to load old modules post upgrade (LP: #1847361)
     - d/p/ubuntu/lp-1847361-modules-load-upgrade.patch: to fallback module
       load to a versioned path
     - d/qemu-block-extra.*.in, d/qemu-system-gui.*.in: save shared objects on
       upgrade
     - d/rules: generate maintainer scripts matching package version on build
     - d/rules: enable --enable-module-upgrades where --enable-modules is set

Revision history for this message
Trent Lloyd (lathiat) wrote :

Note: This patch has related regressions in Hirsute due to the version number containing a space:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1906245
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1905377

Seems the patch is temporarily dropped will need to ensure we don't totally lose the fix

Revision history for this message
Christian Ehrhardt  (paelzer) wrote : Re: [Bug 1847361] Re: Upgrade of qemu binaries causes running instances not able to dynamically load modules

On Tue, Dec 1, 2020 at 4:50 AM Trent Lloyd <email address hidden> wrote:
>
> Note: This patch has related regressions in Hirsute due to the version number containing a space:
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1906245
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1905377
>
> Seems the patch is temporarily dropped will need to ensure we don't
> totally lose the fix

The listed regressions do not affect anything prior to Hirsute in any way.
We only changed the way it is deployed in maintainer scripts and the
fix for the hickup we had along the way is getting into Hirsute as of
today.
TL;DR: It isn't lost and wasn't intended to be dropped.

More interesting in this context might be the thoughts I'm on together
with Victor Tapia to eliminate the issues around /run being mounted as
noexec.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI - While generally working in either containers (other defaults) or when admins took care (mount options) there is https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1913421 which is a continuation of this to improve the mount options in the path used to have these module loads not be blocked by "noexec" mount option.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.