race condition between vhost_net_stop and CHR_EVENT_CLOSED on shutdown crashes qemu (fix regression)

Bug #1829380 reported by Dan Streetman
This bug affects 1 person
Affects                 Status         Importance   Assigned to     Milestone
Ubuntu Cloud Archive    Invalid        Undecided    Unassigned
  Mitaka                Fix Released   Medium       Unassigned
  Ocata                 Fix Released   Medium       Unassigned
qemu (Ubuntu)           Invalid        Undecided    Unassigned
  Xenial                Fix Released   Medium       Dan Streetman

Bug Description

[impact]

this bug is to track re-uploading the fix for bug 1823458 plus a patch to fix a regression it introduced.

instead of copying the details from bug 1823458, please see that bug for impact and testcase.

[test case]

see bug 1823458 for the original bug test case

for the regression test case from bug 1829245, reboot the guest; after the reboot the guest is unable to use the interface (a couple of reboots may be needed to reproduce it, if the interface still works after the first reboot).
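
The regression test case above can be scripted roughly as follows. This is only a sketch, not the procedure used for verification in this bug: it assumes the guest is managed through libvirt (`virsh reboot`) and that a ping from the host is an adequate check that the guest can use its interface. The helper names, the settle time, and any domain name or address passed in are hypothetical.

```python
import subprocess
import time
from typing import Callable, Optional


def reboots_until_network_loss(
    reboot: Callable[[], None],
    network_ok: Callable[[], bool],
    max_reboots: int = 10,
) -> Optional[int]:
    """Reboot the guest up to max_reboots times.

    Returns the 1-based number of the reboot after which network_ok()
    first returned False, or None if networking survived every reboot.
    """
    for attempt in range(1, max_reboots + 1):
        reboot()
        if not network_ok():
            return attempt
    return None


def virsh_reboot(domain: str, settle_seconds: float = 60.0) -> None:
    """Assumption: the guest is a libvirt domain; wait a fixed time for it to boot."""
    subprocess.run(["virsh", "reboot", domain], check=True)
    time.sleep(settle_seconds)


def ping_ok(address: str) -> bool:
    """Assumption: the guest answers ICMP echo when its interface works."""
    result = subprocess.run(
        ["ping", "-c", "3", "-W", "2", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A run against a hypothetical guest might look like `reboots_until_network_loss(lambda: virsh_reboot("guest1"), lambda: ping_ok("192.0.2.10"))`. With an affected qemu this would be expected to return a small number; with the fixed package it should return None.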

[regression potential]

see bug 1823458

additionally, the regression from the last fix is in bug 1829245.

the change is in code that handles guest networking, so it still has the regression potential of causing guest networking to fail.


Revision history for this message
Dan Streetman (ddstreet) wrote :

marking 'invalid' for newer qemu; this is fixed in a different way in newer qemu but the change is too large to backport; the patches for xenial (and mitaka/ocata) are a minimal workaround.

Changed in qemu (Ubuntu Xenial):
assignee: nobody → Dan Streetman (ddstreet)
importance: Undecided → Medium
status: New → In Progress
Changed in qemu (Ubuntu):
status: New → Invalid
Revision history for this message
Dan Streetman (ddstreet) wrote :
Changed in cloud-archive:
status: New → Invalid
Revision history for this message
Corey Bryant (corey.bryant) wrote : Please test proposed package

Hello Dan, or anyone else affected,

Accepted qemu into ocata-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:ocata-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ocata-needed to verification-ocata-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ocata-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-ocata-needed
Revision history for this message
Corey Bryant (corey.bryant) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package qemu - 1:2.8+dfsg-3ubuntu2.9~cloud6
---------------

 qemu (1:2.8+dfsg-3ubuntu2.9~cloud6) xenial-ocata; urgency=medium
 .
   * SECURITY UPDATE: Add support for exposing md-clear functionality
     to guests
     - d/p/ubuntu/enable-md-clear.patch
     - CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091
   * SECURITY UPDATE: heap overflow when loading device tree blob
     - d/p/ubuntu/CVE-2018-20815.patch: specify how large the buffer to
       copy the device tree blob into is.
     - CVE-2018-20815
   * SECURITY UPDATE: information leak in SLiRP
     - d/p/ubuntu/CVE-2019-9824.patch: check sscanf result when
       emulating ident.
     - CVE-2019-9824
   * d/p/ubuntu/fix-virtio-net-regression.patch: Fix regression of failing qemu
     guest networking (LP: #1829380).

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I'm currently prepping some things for Xenial's virtualization stack anyway.
I'll make this fixup (+ re-establishing the original fix) part of it.

Revision history for this message
Dan Streetman (ddstreet) wrote :

ocata:

I was able to reproduce the regression by rebooting the guest - I could not reproduce it any other way, by attaching/removing interface(s), setting their link up/down from the host side or from the guest side, adding/removing them to a bridge on the host, etc. On reboot, with qemu 1:2.8+dfsg-3ubuntu2.9~cloud5.1, after 2 reboots I was able to reproduce the guest being unable to use the interface. I stopped the guest, upgraded to 1:2.8+dfsg-3ubuntu2.9~cloud6, and repeated the test, and the guest had no problems even after 10 reboots.

description: updated
tags: added: verification-ocata-done
removed: verification-ocata-needed
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi,
I wanted to mention that my regression testing of this code from a PPA has not found any issues.
The qemu already in -unapproved is identical to what I proposed and what we checked, so let's keep it there as is to be accepted.

Dan Streetman (ddstreet)
tags: added: sts
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Dan, or anyone else affected,

Accepted qemu into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:2.5+dfsg-5ubuntu10.40 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in qemu (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-xenial
Revision history for this message
Dan Streetman (ddstreet) wrote :

With qemu version 1:2.5+dfsg-5ubuntu10.37, I rebooted a guest with a normal network interface, and after the 2nd reboot it lost networking. Then upgraded to 1:2.5+dfsg-5ubuntu10.40, stopped and started guest, and rebooted it 10 times without any networking problems.

tags: added: verification-done verification-done-xenial
removed: verification-needed verification-needed-xenial
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.5+dfsg-5ubuntu10.40

---------------
qemu (1:2.5+dfsg-5ubuntu10.40) xenial; urgency=medium

  * Restore patches that caused regression
    - d/p/lp1823458/add-VirtIONet-vhost_stopped-flag-to-prevent-multiple.patch
    - d/p/lp1823458/do-not-call-vhost_net_cleanup-on-running-net-from-ch.patch
  * Fix regression introduced by above patches (LP: #1829380)
    - d/p/lp1829380.patch

  [ Rafael David Tinoco ]
  * d/p/lp1828288/target-i386-Set-AMD-alias-bits-after-filtering-CPUID.patch
    - Fix issues with CPUID_EXT2_AMD_ALIASES allowing guests using
      cpu passthrough to boot. (LP: #1828288)

 -- Dan Streetman <email address hidden> Thu, 16 May 2019 14:29:56 -0400

Changed in qemu (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Corey Bryant (corey.bryant) wrote :

Hello Dan, or anyone else affected,

Accepted qemu into mitaka-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:mitaka-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-mitaka-needed to verification-mitaka-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-mitaka-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-mitaka-needed
Revision history for this message
Corey Bryant (corey.bryant) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package qemu - 1:2.5+dfsg-5ubuntu10.40~cloud0
---------------

 qemu (1:2.5+dfsg-5ubuntu10.40~cloud0) trusty-mitaka; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 qemu (1:2.5+dfsg-5ubuntu10.40) xenial; urgency=medium
 .
   * Restore patches that caused regression
     - d/p/lp1823458/add-VirtIONet-vhost_stopped-flag-to-prevent-multiple.patch
     - d/p/lp1823458/do-not-call-vhost_net_cleanup-on-running-net-from-ch.patch
   * Fix regression introduced by above patches (LP: #1829380)
     - d/p/lp1829380.patch
 .
   [ Rafael David Tinoco ]
   * d/p/lp1828288/target-i386-Set-AMD-alias-bits-after-filtering-CPUID.patch
     - Fix issues with CPUID_EXT2_AMD_ALIASES allowing guests using
       cpu passthrough to boot. (LP: #1828288)

Dan Streetman (ddstreet)
tags: added: verification-mitaka-done
removed: verification-mitaka-needed