Merge ~rafaeldtinoco/ubuntu/+source/qemu:lp1805256-bionic into ubuntu/+source/qemu:ubuntu/bionic-devel

Proposed by Rafael David Tinoco
Status: Merged
Approved by: Rafael David Tinoco
Approved revision: 4972897a218dbed489d24bd9c6446241b771be1f
Merged at revision: 4972897a218dbed489d24bd9c6446241b771be1f
Proposed branch: ~rafaeldtinoco/ubuntu/+source/qemu:lp1805256-bionic
Merge into: ubuntu/+source/qemu:ubuntu/bionic-devel
Diff against target: 585 lines (+533/-0)
8 files modified
debian/changelog (+12/-0)
debian/patches/series (+6/-0)
debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch (+116/-0)
debian/patches/ubuntu/lp-1805256-aio-posix-Assert-that-aio_poll_is-always-called.patch (+41/-0)
debian/patches/ubuntu/lp-1805256-aio-rename-aio_context_in_iothread.patch (+62/-0)
debian/patches/ubuntu/lp-1805256-aio-wait_delegate_poll_aioctx_bql.patch (+107/-0)
debian/patches/ubuntu/lp-1805256-async_use_explicit_mem_barriers.patch (+164/-0)
debian/patches/ubuntu/lp-1805256-dont-count-ctx-not-as-progress.patch (+25/-0)
Reviewer Review Type Date Requested Status
Christian Ehrhardt  (community) Approve
Rafael David Tinoco (community) Approve
Ike Panhc (community) Approve
dann frazier (community) Needs Fixing
Canonical Server Pending
Review via email: mp+383566@code.launchpad.net
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

@paelzer,

The same thing I said in:

https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/qemu/+git/qemu/+merge/383530/comments/1006812

applies here as well. All the other cherry-picks and/or backports were needed to make the SRU smoother (and easier to maintain later); I added [] notes to each of the patches I felt needed a note from me to you.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

I'm finishing up small build fixes related to implicit declarations; I will give an OK here soon.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Okay, after finishing up the declarations issue I hit a new build problem:

https://lkml.org/lkml/2020/2/13/1428

for Ubuntu Bionic:

/usr/include/linux/swab.h: In function ‘__swab’:
/home/rafaeldtinoco/work/sources/ubuntu/qemu/include/qemu/bitops.h:20:34: error: "sizeof" is not defined, evaluates to 0 [-Werror=undef]
 #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
                                  ^
/home/rafaeldtinoco/work/sources/ubuntu/qemu/include/qemu/bitops.h:20:41: error: missing binary operator before token "("
 #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)

Working on it

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

I've already spoken to you about the previous comment. You pointed me to:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1877123

So we're good there.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

This looks well prepared. 7.24 will eventually also get CVEs added, and the kernel bug that makes it FTBFS needs to be fixed. So +1 for now, and the steps from here:
1. wait for bug 1877123 (kernel team)
2. test current SRU (my job)
3. release my SRU with security fixes (mdeslaur)
4. rebase this MP and the others (rafael)
5. hand over to the SRU Team to accept it

review: Approve
Revision history for this message
dann frazier (dannf) wrote :

I self-built the package from the PPA w/ -proposed disabled to avoid the build failure. Unfortunately, this fails testing on a Cavium Sabre (ThunderX2):

ubuntu@starbuck:~$ while :; do i=$((i + 1)); echo "=== $i ==="; qemu-img convert -p -f qcow2 -O qcow2 bionic-server-cloudimg-arm64.img foo.img; done
=== 1 ===
    (100.00/100%)
=== 2 ===
    (100.00/100%)
=== 3 ===
    (100.00/100%)
=== 4 ===
    (100.00/100%)
=== 5 ===
    (100.00/100%)
=== 6 ===
    (100.00/100%)
=== 7 ===
    (100.00/100%)
=== 8 ===
    (100.00/100%)
=== 9 ===
    (100.00/100%)
=== 10 ===
    (100.00/100%)
=== 11 ===
    (100.00/100%)
=== 12 ===
    (100.00/100%)
=== 13 ===
    (100.00/100%)
=== 14 ===
    (100.00/100%)
=== 15 ===
    (0.00/100%)
<HANG>

review: Needs Fixing
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

@dannf,

Have you tested the other ones as well? Is the bionic version the only one hanging? My intent with this question is to check whether this was intermittent enough to be seen in bionic just by coincidence (the frequency is not *that* high, and by accident it happened with the bionic backport) OR whether it definitely does not happen in the other backports, only with this one.

@paelzer,

Should I get a dump and analyse it with ELF+DWARF symbols? I can update upstream on the issue if needed.

Revision history for this message
dann frazier (dannf) wrote :

@rafaeldtinoco - I tested focal (groovy by proxy) and eoan, and those worked fine. See my +1s in those merge proposals. bionic is the only one hanging for me. That reflects my own "clean-room" backport testing as well - my eoan/focal backports were clean cherry picks that worked fine. My bionic one however required some additional backporting, and that still hit hangs. However, Ike's backport worked fine for me.

Revision history for this message
dann frazier (dannf) wrote :

@rafaeldtinoco: The relevant difference between your backport and Ike's is below. After applying this patch, my reproducer no longer fails. This undoes a change that came in w/ the extra backport of debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch.

Index: qemu-2.11+dfsg/util/aio-posix.c
===================================================================
--- qemu-2.11+dfsg.orig/util/aio-posix.c
+++ qemu-2.11+dfsg/util/aio-posix.c
@@ -647,7 +647,6 @@ bool aio_poll(AioContext *ctx, bool bloc
     if (blocking) {
         /* Finish the poll before clearing the flag. */
         atomic_store_release(&ctx->notify_me, atomic_read(&ctx->notify_me) - 2);
- aio_notify_accept(ctx);
     }

     /* Adjust polling time */
@@ -691,6 +690,8 @@ bool aio_poll(AioContext *ctx, bool bloc
         }
     }

+ aio_notify_accept(ctx);
+
     /* if we have any readable fds, dispatch event */
     if (ret > 0) {
         for (i = 0; i < npfd; i++) {

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Hey Dann,

This, per se, is not a change backed by an upstream change (not an entire one, at least). The diff you brought is part of a bigger commit:

----
Message-Id: <email address hidden>
Signed-off-by: Fam Zheng <email address hidden>
Signed-off-by: Rafael David Tinoco <email address hidden>

Author: Paolo Bonzini <email address hidden>
Origin: upstream, https://github.com/qemu/qemu/commit/b37548fcd
Bug: https://launchpad.net/bugs/1805256
Bug-Ubuntu: https://launchpad.net/bugs/1805256
Reviewed-By: Rafael David Tinoco <email address hidden>
Last-Update: 2020-05-07
----

also added, by me, to the Bionic SRU, just like all further SRUs, with the following backport note:

    [Rafael David Tinoco]

    I have cherry-picked this fix because of the AIO primitives fix
    dependency, nevertheless, it is good to notice it fixes the case
    above as well. Per se it would be difficult to SRU this, but,
    since we're stressing AIO mechanism with the other Aarch64 needed
    fix, this one fits the purpose as well.

and the filename:

debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch

containing:

https://paste.ubuntu.com/p/hFb3kzJx3f/

and the *most important* part of the code change you referred to.

----

The thing is.. changes to the AIO code can make the issue appear or disappear and/or make it more or less intermittent than it was, depending on how the cache line is being used by the binary object during its execution (especially in the case where the 2 variables that need syncing share the same cache line <- which is rarely the case).

It's hard to consider something fixed because of a change in binaries not backed by upstream, in this case (remember when we added a variable local to a function and caused the issue to temporarily disappear, because the 2 addresses that had to be synced ended up on different cache lines?).

Looks like there are no available TX2 machines for me to play with: I provisioned one and after a bit of work there (couldn't reproduce) I realized it wasn't a TX2 =\. I'll keep it there so if I can get a TX2 this week I can just move my files from one machine to another.

----

Current Status in my POV:

- Either we fixed the issue and the patch above is causing the same regression for the version we have in Bionic (which seems less likely)

- Or the issue is still there but being reproduced in a few builds only (like the one we have in my PPA) and not in others (like the one you had without the patch above).

----

I'd like to have access to a TX2 machine so I can test the same build with and without
lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch in Bionic, check whether I can reproduce the issue, and also do some objdumps of qemu.

It could be that we did not actually fix any version and we can simply only reproduce the issue in Bionic, for some reason. Then I would have to get back to Paolo and show that, using a dump analysis to back us up.

Thoughts ?

Revision history for this message
dann frazier (dannf) wrote :

[ + Ike ]

On Tue, May 19, 2020 at 9:00 PM Rafael David Tinoco
<email address hidden> wrote:
>
> Hey Dann,
>
> This, per se, is not a change backed by an upstream change (not an entire one, at least). The diff you brought is part of a bigger commit:
>
> ----
> Message-Id: <email address hidden>
> Signed-off-by: Fam Zheng <email address hidden>
> Signed-off-by: Rafael David Tinoco <email address hidden>
>
> Author: Paolo Bonzini <email address hidden>
> Origin: upstream, https://github.com/qemu/qemu/commit/b37548fcd
> Bug: https://launchpad.net/bugs/1805256
> Bug-Ubuntu: https://launchpad.net/bugs/1805256
> Reviewed-By: Rafael David Tinoco <email address hidden>
> Last-Update: 2020-05-07
> ----
>
> also added, by me, to the Bionic SRU, just like all further SRUs, with the following backport note:
>
> [Rafael David Tinoco]
>
> I have cherry-picked this fix because of the AIO primitives fix
> dependency, nevertheless, it is good to notice it fixes the case
> above as well. Per se it would be difficult to SRU this, but,
> since we're stressing AIO mechanism with the other Aarch64 needed
> fix, this one fits the purpose as well.
>
> and the filename:
>
> debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch
>
> containing:
>
> https://paste.ubuntu.com/p/hFb3kzJx3f/
>
> and the *most important* part of the code change you referred to.

Right, I noticed and mentioned that. In fact, when I did a clean room
backport myself, I pulled the same change in with the same result.

> ----
>
> The thing is.. changes to the AIO code can make the issue appear or disappear and/or make it more or less intermittent than it was, depending on how the cache line is being used by the binary object during its execution (especially in the case where the 2 variables that need syncing share the same cache line <- which is rarely the case).
>

To be clear, I'm not suggesting we blindly drop that change w/o
understanding why we should.

>
> It's hard to consider something fixed because of a change in binaries not backed by upstream, in this case (remember when we added a variable local to a function and caused the issue to temporarily disappear, because the 2 addresses that had to be synced ended up on different cache lines?).
>
>
> Looks like there are no available TX2 machines for me to play with: I provisioned one and after a bit of work there (couldn't reproduce) I realized it wasn't a TX2 =\. I'll keep it there so if I can get a TX2 this week I can just move my files from one machine to another.
>

I've released one I was using (apollo), feel free to grab. You can
also use any Hi1620 chip - filter for MAAS tags soc-thunderx2 and
soc-hi1620, respectively.

>
> Current Status in my POV:
>
> - Either we fixed the issue and the patch above is causing the same regression for the version we have in Bionic (which seems less likely)
>
> - Or the issue is still there but being reproduced in a few builds only (like the one we have in my PPA) and not in others (like the one you had without the patch above).
>
> ----
>
> I'd like to have access to a TX2 machine so I can test the same build with a...


Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

>> debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch
>>
>> containing:
>>
>> https://paste.ubuntu.com/p/hFb3kzJx3f/
>>
>> and the *most important* part of the code change you referred to.
> Right, I noticed and mentioned that. In fact, when I did a clean room
> backport myself, I pulled the same change in with the same result.

Great! We're synced then!

>> ----
>>
>> The thing is.. changes to the AIO code can make the issue appear or disappear and/or make it more or less intermittent than it was, depending on how the cache line is being used by the binary object during its execution (especially in the case where the 2 variables that need syncing share the same cache line <- which is rarely the case).
> To be clear, I'm not suggesting we blindly drop that change w/o
> understanding why we should.

Ah gotcha. Sorry for misunderstanding.

> I've released one I was using (apollo), feel free to grab. You can
> also use any Hi1620 chip - filter for MAAS tags soc-thunderx2 and
> soc-hi1620, respectively.

Will do!

>> It could be that we did not actually fix any version and we can simply only reproduce the issue in Bionic, for some reason. Then I would have to get back to Paolo and show that, using a dump analysis to back us up.
> We could easily reproduce in eoan and focal before backporting the
> upstream fix. My hypothesis is that something changed after bionic
> that made lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch
> safe/correct to apply.

Yes! I thought the same! I'm just afraid that we might only be thinking
that because the builds (each new major/minor version) are not 100%
reproducible and could be generating different objects with regard to the
main cache-line issue. But I share your opinion on the hypothesis!

Revision history for this message
dann frazier (dannf) wrote :

I think you may need the following commit as well:
 70232b5253 aio-posix: Don't count ctx->notifier as progress when polling

I added that to your package and started a test - so far > 100 iterations OK. But I won't be confident until at least 500.

Revision history for this message
dann frazier (dannf) wrote :

At 2214 iterations and still going, so I'm confident :)

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Very nice indeed! Nice finding!

I have not replied to you before because I unfortunately had a massive I/O issue with my NVMe card #). Let me know if you're confident enough so I can include that patch in the official SRU, please!

Revision history for this message
dann frazier (dannf) wrote :

On Fri, May 22, 2020 at 6:56 PM Rafael David Tinoco
<email address hidden> wrote:
>
> Very nice indeed! Nice finding!
>
> I have not replied to you before because I unfortunately had a massive I/O issue with my NVMe card #). Let me know if you're confident enough so I can include that patch in the official SRU, please!

+1 from my POV. It made it to 3093 successful tests, which is when my
connection dropped. btw, the way I found it was to bisect, applying
the set of patches you selected at each stage (those of which weren't
already applied at the chosen commit).

Revision history for this message
dann frazier (dannf) wrote :

fyi, there are merge conflict markers in this version

review: Needs Fixing
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Hey Dann, will sort them out..

Actually, I stopped in the middle of my previous attempt - thus the leftovers - and discussed this case with @cpaelzer. It turns out that the commit you mention, as I predicted, just added entropy - likely flushing local/remote cache lines - in between two primitive/barrier manipulations inside the AIO synchronization code.

That explains why I did not see it before: it does not play a role in the locking primitives. With that in mind, it is very likely that the real original issue has not been fixed and is only being seen with the current Bionic binary and code base.

After speaking to cpaelzer: I'll try to create a small piece of code that mimics the issue and see if we can reproduce it at will, to demonstrate it to both the HW designers and Paolo. This effort will be time-boxed within some of my scheduled hours.

Meanwhile, after fixing the markers, @cpaelzer will likely accept this SRU as is because it won't hurt anyway.. BUT we *could*, eventually, face the same issue again. Be aware :o).

As a last resort, we can also just add some padding between the 2 locking variables and make sure they are never, ever, loaded into the same cache line <- that would mitigate the original issue for sure.
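For illustration, a minimal sketch of that padding idea (hypothetical struct and field names, not the real AioContext layout):

/* Hypothetical example only - not QEMU code. Force the two
 * synchronization variables onto different cache lines so that
 * writes from different CPUs never invalidate each other's line
 * (128 bytes covers the larger line sizes seen on arm64). */
#define EXAMPLE_CACHELINE_SIZE 128

typedef struct ExampleCtx {
    unsigned notify_me __attribute__((aligned(EXAMPLE_CACHELINE_SIZE)));
    unsigned notified  __attribute__((aligned(EXAMPLE_CACHELINE_SIZE)));
} ExampleCtx;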

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

BTW, the leftovers were there just because I had not rebased onto the newer bionic versions. I'm pushing a rebase onto the latest bionic-devel version.

Revision history for this message
dann frazier (dannf) wrote :

On Tue, May 26, 2020 at 11:26 AM Rafael David Tinoco
<email address hidden> wrote:
>
> Hey Dann, will sort them out..
>
> Actually, I stopped in the middle of my previous attempt - thus the leftovers - and discussed this case with @cpaelzer. It turns out that the commit you mention, as I predicted, just added entropy - likely flushing local/remote cache lines - in between two primitive/barrier manipulations inside the AIO synchronization code.

Could it just be that there are 2 different issues here?

A) One that impacts all releases, which is fixed by "async: use
explicit memory barriers"

B) One that only impacts bionic, which requires both "aio-posix: Don't
count ctx->notifier as progress when polling" and "aio: Do
aio_notify_accept only during blocking aio_poll"? These 2 patches were
submitted together to fix an IOThread hang:
  https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg00897.html

I'll see if I can collect a backtrace w/ the fix for A) and not B) to
see how it compares to the one we've attributed to B).

  -dann

Revision history for this message
dann frazier (dannf) wrote :

Here's a backtrace using the current bionic PPA package, which includes:
  async: use explicit memory barriers
  aio: Do aio_notify_accept only during blocking aio_poll
But does *not* include:
  aio-posix: Don't count ctx->notifier as progress when polling

(gdb) thread apply all bt

Thread 2 (Thread 0xffff9d787ff0 (LWP 70291)):
#0 0x0000ffff9dc022cc in __GI___sigtimedwait (set=<optimized out>, set@entry=0xaaaaebac2a20, info=info@entry=0xffff9d787688, timeout=0xffffdd0d8430, timeout@entry=0x0) at ../sysdeps/unix/sysv/linux/sigtimedwait.c:42
#1 0x0000ffff9dd39fac in __sigwait (set=set@entry=0xaaaaebac2a20, sig=sig@entry=0xffff9d787764) at ../sysdeps/unix/sysv/linux/sigwait.c:28
#2 0x0000aaaac2311d70 in sigwait_compat (opaque=0xaaaaebac2a20) at ./util/compatfd.c:36
#3 0x0000ffff9dd2f088 in start_thread (arg=0xffffdd0d91bf) at pthread_create.c:463
#4 0x0000ffff9dc9f4ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 1 (Thread 0xffff9d789010 (LWP 70290)):
#0 0x0000aaaac230e36c in aio_bh_poll (ctx=ctx@entry=0xaaaaebac4990) at ./util/async.c:120
#1 0x0000aaaac2311818 in aio_poll (ctx=ctx@entry=0xaaaaebac4990, blocking=<optimized out>) at ./util/aio-posix.c:703
#2 0x0000aaaac229b2f0 in bdrv_prwv_co (child=child@entry=0xaaaaebace2f0, offset=offset@entry=196608, qiov=qiov@entry=0xffffdd0d8ea0, is_write=is_write@entry=false, flags=flags@entry=0) at ./block/io.c:656
#3 0x0000aaaac229b654 in bdrv_preadv (qiov=0xffffdd0d8ea0, offset=196608, child=0xaaaaebace2f0) at ./block/io.c:764
#4 bdrv_pread (child=0xaaaaebace2f0, offset=196608, buf=<optimized out>, bytes=<optimized out>) at ./block/io.c:785
#5 0x0000aaaac226c514 in qcow2_do_open (bs=0xaaaaebaf3f90, options=0xaaaaebaea4e0, flags=25090, errp=0xffffdd0d9070) at ./block/qcow2.c:1355
#6 0x0000aaaac224d834 in bdrv_open_driver (bs=bs@entry=0xaaaaebaf3f90, drv=0xaaaac2375458 <bdrv_qcow2>, drv@entry=0xaaaaebae8240, node_name=<optimized out>, options=options@entry=0xaaaaebaea4e0, open_flags=25090, errp=0xffffdd0d9130, errp@entry=0x4202) at ./block.c:1146
#7 0x0000aaaac224eb90 in bdrv_open_common (errp=0x4202, options=0xaaaaebaea4e0, file=0xaaaaebaf33e0, bs=0xaaaaebaf3f90) at ./block.c:1404
#8 bdrv_open_inherit (filename=<optimized out>, filename@entry=0xffffdd0da239 "foo.img", reference=reference@entry=0x0, options=0xaaaaebaea4e0, options@entry=0xaaaaebae8240, flags=<optimized out>, flags@entry=16898, parent=parent@entry=0x0, child_role=child_role@entry=0x0, errp=errp@entry=0xffffdd0d9200) at ./block.c:2628
#9 0x0000aaaac224f9d8 in bdrv_open (filename=filename@entry=0xffffdd0da239 "foo.img", reference=reference@entry=0x0, options=options@entry=0xaaaaebae8240, flags=flags@entry=16898, errp=errp@entry=0xffffdd0d9200) at ./block.c:2710
#10 0x0000aaaac228b3c4 in blk_new_open (filename=0xffffdd0da239 "foo.img", reference=0x0, options=0xaaaaebae8240, flags=16898, errp=0xffffdd0d9200) at ./block/block-backend.c:323
#11 0x0000aaaac22415b0 in img_open_file (filename=0xffffdd0da239 "foo.img", options=0xaaaaebae8240, fmt=<optimized out>, flags=16898, writethrough=false, force_share=false, quiet=<optimized out>) at ./qemu-img.c:315
#12 0x0000aaaac224437c in img...


Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

> Could it just be that there are 2 different issues here?
>
> A) One that impacts all releases, which is fixed by "async: use
> explicit memory barriers"
>
> B) One that only impacts bionic, which requires both "aio-posix: Don't
> count ctx->notifier as progress when polling" and "aio: Do
> aio_notify_accept only during blocking aio_poll"? These 2 patches were
> submitted together to fix an IOThread hang:
> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg00897.html
>
> I'll see if I can collect a backtrace w/ the fix for A) and not B) to
> see how it compares to the one we've attributed to B).

A more plausible scenario, discussed with @cpaelzer as well, would be
that the patch we just placed in the patchset has added entropy in
between 2 lock acquires/releases. This way, just like adding variables in
between the 2 lock variables, we are mitigating the issue.

Considering the initial root cause:

- 2 locking variables adjacent in address space
- thus sharing the same cache line
- when being manipulated by 2 different CPUs
- each CPU with its own copy of the cache line
- the CPUs have to invalidate each other's cache lines

We have the following mitigations:

- replacing locking primitives by full mutex implementation
- removing variables from the same cache line
- changing entropy by reducing/postponing lock reads/writes
  - this is the one at play here

So I think it's more of the same... when I explained this to @cpaelzer, he
felt that this was also plausible. The only way to confirm would be to
isolate everything else and stick with a small reproducer.
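To make that concrete, here is a minimal sketch of what such an isolated reproducer could look like - it is the classic store-buffering litmus test using plain pthreads and C11 atomics, with names borrowed from the QEMU pattern (notify_me/scheduled). It is not QEMU code, and the commented-out fences correspond to the idea behind "async: use explicit memory barriers":

/* sb_litmus.c - build with: gcc -O2 -pthread sb_litmus.c
 * Hypothetical reproducer sketch, not QEMU code. */
#include <stdatomic.h>
#include <stdio.h>
#include <pthread.h>

static atomic_int notify_me;     /* poller: "I am about to sleep" */
static atomic_int scheduled;     /* notifier: "work is pending"   */
static int poller_saw_work;
static int notifier_saw_sleeper;

static void *poller(void *arg)
{
    (void)arg;
    atomic_store_explicit(&notify_me, 1, memory_order_relaxed);
    /* atomic_thread_fence(memory_order_seq_cst);   <- the fix */
    poller_saw_work = atomic_load_explicit(&scheduled, memory_order_relaxed);
    return NULL;
}

static void *notifier(void *arg)
{
    (void)arg;
    atomic_store_explicit(&scheduled, 1, memory_order_relaxed);
    /* atomic_thread_fence(memory_order_seq_cst);   <- the fix */
    notifier_saw_sleeper = atomic_load_explicit(&notify_me, memory_order_relaxed);
    return NULL;
}

int main(void)
{
    long hangs = 0;

    for (long i = 0; i < 1000000; i++) {
        pthread_t a, b;

        atomic_store(&notify_me, 0);
        atomic_store(&scheduled, 0);
        pthread_create(&a, NULL, poller, NULL);
        pthread_create(&b, NULL, notifier, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);

        /* Would-be hang: the poller decided to sleep (saw no work) AND
         * the notifier skipped the wakeup (saw nobody about to sleep). */
        if (!poller_saw_work && !notifier_saw_sleeper) {
            hangs++;
        }
    }
    printf("would-be hangs: %ld / 1000000\n", hangs);
    return 0;
}

On a weakly ordered host (aarch64) the counter should be non-zero with the fences commented out and drop to zero once they are enabled; the hit rate will of course be much lower than in the real qemu-img case, because the race window per iteration is tiny.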

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Your backtrace looks weird to me..

If I just unapply:

+ - aio-posix: Don't count ctx->notifier as progress when polling

I have this as instruction pointer at the time of the dump:

        if (bh->deleted) { -> line 120
            deleted = true;
        }

inside a bigger loop, stepping through existing "bottom halves" (scheduled work in the form of callbacks).

It does not look like the original issue:

- we don't have multiple threads running for the locking variables
  - sine qua non condition for the race to happen
- do we have a single-threaded environment?
  - are you specifying only 1 coroutine? (this was our first mitigation idea)
- it does not look like it is a "hang"
  - we do not appear to be waiting on a lock anywhere

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

I pinged you on IRC, maybe we can have a hangout/meet session to help with the back and forth.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Okay, as we spoke in the hangout, wrapping up:

- Paolo patches backported to bionic likely fixed the original issue
- ... and the issue in all other versions Eoan, Focal, Groovy

- This "new" issue that appeared in Bionic in unrelated with the original one
- That explains why having patch:
  "- aio-posix: Don't count ctx->notifier as progress when polling"
  fixes "the issue" but it was a "loop forever inside RCU foreach macro"
- Original issue was not reproduced any longer

Bionic is good to go with all the patches:

* d/p/ubuntu/lp-1805256*: Fixes for QEMU on aarch64 ARM hosts
  - aio: rename aio_context_in_iothread() to in_aio_context_home_thread()
  - aio: Do aio_notify_accept only during blocking aio_poll
  - aio-posix: Assert that aio_poll() is always called in home thread
  - async: use explicit memory barriers (LP: #1805256)
  - aio-wait: delegate polling of main AioContext if BQL not held
  - aio-posix: Don't count ctx->notifier as progress when polling

All patches but the last one fix the original issue (LP: #1805256), and the last patch fixes an issue observed by your tests with those backports (perhaps something that existed before, perhaps not, but we are sure it fixes the issue).

With this summary we're good to go for the SRU @cpaelzer.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thanks for all the discussion and tests already!

This one is already rebased for the CVE-2019-20382/CVE-2020-1983 updates.

The code change is a bit bigger than on >=Eoan; as discussed in the past, I think it is ok that way to get things to apply and work as expected. Also, e.g. the assert (fix #3) will help in case assumptions are not true on Bionic's qemu (things should be ok, but if not, better a crash than silent corruption).

Re-read the patches; LGTM (still).

Can you update the PPA (https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1805256/) with this new version, please?
Also this time the former FTBFS in Bionic is resolved, so it will actually build.
I'll then run a regression test against all four versions and, if ok, sponsor it into the -unapproved queue after the groovy upload completes.

review: Needs Information
Revision history for this message
Ike Panhc (ikepanhc) wrote :

Thanks for picking up those patches from upstream. I built qemu-img on a 96-core arm64 system and it passed 100 runs of qemu-img convert. The qemu-img in bionic-updates would hang within fewer than 10 runs.

review: Approve
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

@paelzer:

Done!

review: Approve
Revision history for this message
Ike Panhc (ikepanhc) wrote :

An overnight run (>7000 qemu-img convert iterations) did not hang on any run.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thanks for the rebase; the non-case-specific regression testing is a few thousand tests in and still passing.
Approving the MP now.
If no issue shows up late, I'll sponsor this to -unapproved later on.

review: Approve
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Regression tests completed - sponsoring once groovy is complete

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Groovy is completed, SRU sponsored into -unapproved

To ssh://git.launchpad.net/~usd-import-team/ubuntu/+source/qemu
 * [new tag] upload/1%2.11+dfsg-1ubuntu7.27 -> upload/1%2.11+dfsg-1ubuntu7.27

Some issues, as discussed with rbasak a while ago, with the new ui/keycodemapdb/..git
Not sure if the tag will ever be able to match in Bionic :-/

But eventually
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading qemu_2.11+dfsg-1ubuntu7.27.dsc: done.
  Uploading qemu_2.11+dfsg-1ubuntu7.27.debian.tar.xz: done.
  Uploading qemu_2.11+dfsg-1ubuntu7.27_source.buildinfo: done.
  Uploading qemu_2.11+dfsg-1ubuntu7.27_source.changes: done.
Successfully uploaded packages.

Preview Diff

1diff --git a/debian/changelog b/debian/changelog
2index 59b5a48..2db0143 100644
3--- a/debian/changelog
4+++ b/debian/changelog
5@@ -1,3 +1,15 @@
6+qemu (1:2.11+dfsg-1ubuntu7.27) bionic; urgency=medium
7+
8+ * d/p/ubuntu/lp-1805256*: Fixes for QEMU on aarch64 ARM hosts
9+ - aio: rename aio_context_in_iothread() to in_aio_context_home_thread()
10+ - aio: Do aio_notify_accept only during blocking aio_poll
11+ - aio-posix: Assert that aio_poll() is always called in home thread
12+ - async: use explicit memory barriers (LP: #1805256)
13+ - aio-wait: delegate polling of main AioContext if BQL not held
14+ - aio-posix: Don't count ctx->notifier as progress when polling
15+
16+ -- Rafael David Tinoco <rafaeldtinoco@ubuntu.com> Tue, 26 May 2020 17:39:21 +0000
17+
18 qemu (1:2.11+dfsg-1ubuntu7.26) bionic-security; urgency=medium
19
20 * SECURITY UPDATE: memory leak in zrle_compress_data
21diff --git a/debian/patches/series b/debian/patches/series
22index 2c238f0..cc5f1c0 100644
23--- a/debian/patches/series
24+++ b/debian/patches/series
25@@ -145,3 +145,9 @@ CVE-2020-8608-2.patch
26 ubuntu/lp-1847361-modules-load-upgrade.patch
27 CVE-2019-20382.patch
28 CVE-2020-1983.patch
29+ubuntu/lp-1805256-aio-rename-aio_context_in_iothread.patch
30+ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch
31+ubuntu/lp-1805256-aio-posix-Assert-that-aio_poll_is-always-called.patch
32+ubuntu/lp-1805256-async_use_explicit_mem_barriers.patch
33+ubuntu/lp-1805256-aio-wait_delegate_poll_aioctx_bql.patch
34+ubuntu/lp-1805256-dont-count-ctx-not-as-progress.patch
35diff --git a/debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch b/debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch
36new file mode 100644
37index 0000000..7c54e10
38--- /dev/null
39+++ b/debian/patches/ubuntu/lp-1805256-aio-Do-aio_notify_accept-only-during-blocking.patch
40@@ -0,0 +1,116 @@
41+Description: aio: Do aio_notify_accept only during blocking aio_poll
42+
43+An aio_notify() pairs with an aio_notify_accept(). The former should
44+happen in the main thread or a vCPU thread, and the latter should be
45+done in the IOThread.
46+
47+There is one rare case that the main thread or vCPU thread may "steal"
48+the aio_notify() event just raised by itself, in bdrv_set_aio_context()
49+[1]. The sequence is like this:
50+
51+ main thread IO Thread
52+ ===============================================================
53+ bdrv_drained_begin()
54+ aio_disable_external(ctx)
55+ aio_poll(ctx, true)
56+ ctx->notify_me += 2
57+ ...
58+ bdrv_drained_end()
59+ ...
60+ aio_notify()
61+ ...
62+ bdrv_set_aio_context()
63+ aio_poll(ctx, false)
64+[1] aio_notify_accept(ctx)
65+ ppoll() /* Hang! */
66+
67+[1] is problematic. It will clear the ctx->notifier event so that
68+the blocked ppoll() will not return.
69+
70+(For the curious, this bug was noticed when booting a number of VMs
71+simultaneously in RHV. One or two of the VMs will hit this race
72+condition, making the VIRTIO device unresponsive to I/O commands. When
73+it hangs, Seabios is busy waiting for a read request to complete (read
74+MBR), right after initializing the virtio-blk-pci device, using 100%
75+guest CPU. See also https://bugzilla.redhat.com/show_bug.cgi?id=1562750
76+for the original bug analysis.)
77+
78+aio_notify() only injects an event when ctx->notify_me is set,
79+correspondingly aio_notify_accept() is only useful when ctx->notify_me
80+_was_ set. Move the call to it into the "blocking" branch. This will
81+effectively skip [1] and fix the hang.
82+
83+Furthermore, blocking aio_poll is only allowed on home thread
84+(in_aio_context_home_thread), because otherwise two blocking
85+aio_poll()'s can steal each other's ctx->notifier event and cause
86+hanging just like described above.
87+
88+ [Rafael David Tinoco]
89+
90+ I have cherry-picked this fix because of the AIO primitives fix
91+ dependency, nevertheless, it is good to notice it fixes the case
92+ above as well. Per se it would be difficult to SRU this, but,
93+ since we're stressing AIO mechanism with the other Aarch64 needed
94+ fix, this one fits the purpose as well.
95+
96+Cc: qemu-stable@nongnu.org
97+Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
98+Signed-off-by: Fam Zheng <famz@redhat.com>
99+Message-Id: <20180809132259.18402-3-famz@redhat.com>
100+Signed-off-by: Fam Zheng <famz@redhat.com>
101+Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
102+
103+Author: Paolo Bonzini <pbonzini@redhat.com>
104+Origin: upstream, https://github.com/qemu/qemu/commit/b37548fcd
105+Bug: https://launchpad.net/bugs/1805256
106+Bug-Ubuntu: https://launchpad.net/bugs/1805256
107+Reviewed-By: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
108+Last-Update: 2020-05-07
109+---
110+ util/aio-posix.c | 4 ++--
111+ util/aio-win32.c | 3 ++-
112+ 2 files changed, 4 insertions(+), 3 deletions(-)
113+
114+--- a/util/aio-posix.c
115++++ b/util/aio-posix.c
116+@@ -590,6 +590,7 @@
117+ * so disable the optimization now.
118+ */
119+ if (blocking) {
120++ assert(in_aio_context_home_thread(ctx));
121+ atomic_add(&ctx->notify_me, 2);
122+ }
123+
124+@@ -632,6 +633,7 @@
125+
126+ if (blocking) {
127+ atomic_sub(&ctx->notify_me, 2);
128++ aio_notify_accept(ctx);
129+ }
130+
131+ /* Adjust polling time */
132+@@ -675,8 +677,6 @@
133+ }
134+ }
135+
136+- aio_notify_accept(ctx);
137+-
138+ /* if we have any readable fds, dispatch event */
139+ if (ret > 0) {
140+ for (i = 0; i < npfd; i++) {
141+--- a/util/aio-win32.c
142++++ b/util/aio-win32.c
143+@@ -373,11 +373,12 @@
144+ ret = WaitForMultipleObjects(count, events, FALSE, timeout);
145+ if (blocking) {
146+ assert(first);
147++ assert(in_aio_context_home_thread(ctx));
148+ atomic_sub(&ctx->notify_me, 2);
149++ aio_notify_accept(ctx);
150+ }
151+
152+ if (first) {
153+- aio_notify_accept(ctx);
154+ progress |= aio_bh_poll(ctx);
155+ first = false;
156+ }
157diff --git a/debian/patches/ubuntu/lp-1805256-aio-posix-Assert-that-aio_poll_is-always-called.patch b/debian/patches/ubuntu/lp-1805256-aio-posix-Assert-that-aio_poll_is-always-called.patch
158new file mode 100644
159index 0000000..d9dadce
160--- /dev/null
161+++ b/debian/patches/ubuntu/lp-1805256-aio-posix-Assert-that-aio_poll_is-always-called.patch
162@@ -0,0 +1,41 @@
163+Description: aio-posix: Assert that aio_poll() is always called in home
164+ thread
165+
166+aio_poll() has an existing assertion that the function is only called
167+from the AioContext's home thread if blocking is allowed.
168+
169+This is not enough, some handlers make assumptions about the thread they
170+run in. Extend the assertion to non-blocking calls, too.
171+
172+Signed-off-by: Kevin Wolf <kwolf@redhat.com>
173+Reviewed-by: Eric Blake <eblake@redhat.com>
174+
175+Author: Kevin Wolf <kwolf@redhat.com>
176+Origin: upstream, https://github.com/qemu/qemu/commit/0dc165c1bd
177+Bug: https://launchpad.net/bugs/1805256
178+Bug-Ubuntu: https://launchpad.net/bugs/1805256
179+Reviewed-By: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
180+Last-Update: 2020-05-07
181+---
182+ util/aio-posix.c | 3 ++-
183+ 1 file changed, 2 insertions(+), 1 deletion(-)
184+
185+--- a/util/aio-posix.c
186++++ b/util/aio-posix.c
187+@@ -582,6 +582,8 @@
188+ int64_t timeout;
189+ int64_t start = 0;
190+
191++ assert(in_aio_context_home_thread(ctx));
192++
193+ /* aio_notify can avoid the expensive event_notifier_set if
194+ * everything (file descriptors, bottom halves, timers) will
195+ * be re-evaluated before the next blocking poll(). This is
196+@@ -590,7 +592,6 @@
197+ * so disable the optimization now.
198+ */
199+ if (blocking) {
200+- assert(in_aio_context_home_thread(ctx));
201+ atomic_add(&ctx->notify_me, 2);
202+ }
203+
204diff --git a/debian/patches/ubuntu/lp-1805256-aio-rename-aio_context_in_iothread.patch b/debian/patches/ubuntu/lp-1805256-aio-rename-aio_context_in_iothread.patch
205new file mode 100644
206index 0000000..acd53ff
207--- /dev/null
208+++ b/debian/patches/ubuntu/lp-1805256-aio-rename-aio_context_in_iothread.patch
209@@ -0,0 +1,62 @@
210+Description: rename aio_context_in_iothread() to
211+ in_aio_context_home_thread()
212+
213+The name aio_context_in_iothread() is misleading because it also returns
214+true when called on the main AioContext from the main loop thread, which
215+is not an IOThread.
216+
217+This patch renames it to in_aio_context_home_thread() and expands the
218+doc comment to make the semantics clearer.
219+
220+ [Rafael David Tinoco]
221+
222+ Cherry-picking this patch so maintenance is easier. It does not
223+ do anything else than renaming aio_context_in_iothread() in
224+ BDRV_POLL_WHILE macro.
225+
226+Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
227+Reviewed-by: Eric Blake <eblake@redhat.com>
228+Signed-off-by: Kevin Wolf <kwolf@redhat.com>
229+Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
230+
231+Author: Stefan Hajnoczi <stefanha@redhat.com>
232+Origin: upstream, https://github.com/qemu/qemu/commit/d2b63ba8dd
233+Bug: https://launchpad.net/bugs/1805256
234+Bug-Ubuntu: https://launchpad.net/bugs/1805256
235+Reviewed-By: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
236+Last-Update: 2020-05-07
237+---
238+ include/block/aio.h | 7 +++++--
239+ include/block/block.h | 2 +-
240+ 2 files changed, 6 insertions(+), 3 deletions(-)
241+
242+--- a/include/block/aio.h
243++++ b/include/block/aio.h
244+@@ -534,11 +534,14 @@
245+ AioContext *qemu_get_current_aio_context(void);
246+
247+ /**
248++ * in_aio_context_home_thread:
249+ * @ctx: the aio context
250+ *
251+- * Return whether we are running in the I/O thread that manages @ctx.
252++ * Return whether we are running in the thread that normally runs @ctx. Note
253++ * that acquiring/releasing ctx does not affect the outcome, each AioContext
254++ * still only has one home thread that is responsible for running it.
255+ */
256+-static inline bool aio_context_in_iothread(AioContext *ctx)
257++static inline bool in_aio_context_home_thread(AioContext *ctx)
258+ {
259+ return ctx == qemu_get_current_aio_context();
260+ }
261+--- a/include/block/block.h
262++++ b/include/block/block.h
263+@@ -385,7 +385,7 @@
264+ bool busy_ = true; \
265+ BlockDriverState *bs_ = (bs); \
266+ AioContext *ctx_ = bdrv_get_aio_context(bs_); \
267+- if (aio_context_in_iothread(ctx_)) { \
268++ if (in_aio_context_home_thread(ctx_)) { \
269+ while ((cond) || busy_) { \
270+ busy_ = aio_poll(ctx_, (cond)); \
271+ waited_ |= !!(cond) | busy_; \
272diff --git a/debian/patches/ubuntu/lp-1805256-aio-wait_delegate_poll_aioctx_bql.patch b/debian/patches/ubuntu/lp-1805256-aio-wait_delegate_poll_aioctx_bql.patch
273new file mode 100644
274index 0000000..7d4fd97
275--- /dev/null
276+++ b/debian/patches/ubuntu/lp-1805256-aio-wait_delegate_poll_aioctx_bql.patch
277@@ -0,0 +1,107 @@
278+Description: aio-wait: delegate polling of main AioContext if BQL not
279+ held
280+
281+Any thread that is not a iothread returns NULL for qemu_get_current_aio_context().
282+As a result, it would also return true for
283+in_aio_context_home_thread(qemu_get_aio_context()), causing
284+AIO_WAIT_WHILE to invoke aio_poll() directly. This is incorrect
285+if the BQL is not held, because aio_poll() does not expect to
286+run concurrently from multiple threads, and it can actually
287+happen when savevm writes to the vmstate file from the
288+migration thread.
289+
290+Therefore, restrict in_aio_context_home_thread to return true
291+for the main AioContext only if the BQL is held.
292+
293+The function is moved to aio-wait.h because it is mostly used
294+there and to avoid a circular reference between main-loop.h
295+and block/aio.h.
296+
297+ [Rafael David Tinoco]
298+
299+ This is a backport because it does not contain the changes made
300+ by commit 7719f3c968 ("block: extract AIO_WAIT_WHILE() from
301+ BlockDriverState"), which is a big refactor of the BDRV_POLL_WHILE()
302+ macro, AND it is definitely NOT needed for the purpose of fixing the
303+ issue present in current in_aio_context_home_thread(), object of
304+ this change.
305+
306+Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
307+Message-Id: <20200407140746.8041-5-pbonzini@redhat.com>
308+Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
309+Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
310+
311+Author: Paolo Bonzini <pbonzini@redhat.com>
312+Origin: backport, https://github.com/qemu/qemu/commit/3c18a92dc4
313+Bug: https://launchpad.net/bugs/1805256
314+Bug-Ubuntu: https://launchpad.net/bugs/1805256
315+Reviewed-By: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
316+Last-Update: 2020-05-07
317+---
318+ include/block/aio.h | 13 -------------
319+ include/qemu/main-loop.h | 21 +++++++++++++++++++++
320+ util/aio-posix.c | 1 +
321+ 3 files changed, 22 insertions(+), 13 deletions(-)
322+
323+--- a/include/block/aio.h
324++++ b/include/block/aio.h
325+@@ -534,19 +534,6 @@
326+ AioContext *qemu_get_current_aio_context(void);
327+
328+ /**
329+- * in_aio_context_home_thread:
330+- * @ctx: the aio context
331+- *
332+- * Return whether we are running in the thread that normally runs @ctx. Note
333+- * that acquiring/releasing ctx does not affect the outcome, each AioContext
334+- * still only has one home thread that is responsible for running it.
335+- */
336+-static inline bool in_aio_context_home_thread(AioContext *ctx)
337+-{
338+- return ctx == qemu_get_current_aio_context();
339+-}
340+-
341+-/**
342+ * aio_context_setup:
343+ * @ctx: the aio context
344+ *
345+--- a/include/qemu/main-loop.h
346++++ b/include/qemu/main-loop.h
347+@@ -279,6 +279,27 @@
348+ */
349+ void qemu_mutex_unlock_iothread(void);
350+
351++/**
352++ * in_aio_context_home_thread:
353++ * @ctx: the aio context
354++ *
355++ * Return whether we are running in the thread that normally runs @ctx. Note
356++ * that acquiring/releasing ctx does not affect the outcome, each AioContext
357++ * still only has one home thread that is responsible for running it.
358++ */
359++static inline bool in_aio_context_home_thread(AioContext *ctx)
360++{
361++ if (ctx == qemu_get_current_aio_context()) {
362++ return true;
363++ }
364++
365++ if (ctx == qemu_get_aio_context()) {
366++ return qemu_mutex_iothread_locked();
367++ } else {
368++ return false;
369++ }
370++}
371++
372+ /* internal interfaces */
373+
374+ void qemu_fd_register(int fd);
375+--- a/util/aio-posix.c
376++++ b/util/aio-posix.c
377+@@ -19,6 +19,7 @@
378+ #include "qemu/rcu_queue.h"
379+ #include "qemu/sockets.h"
380+ #include "qemu/cutils.h"
381++#include "qemu/main-loop.h"
382+ #include "trace.h"
383+ #ifdef CONFIG_EPOLL_CREATE1
384+ #include <sys/epoll.h>
385diff --git a/debian/patches/ubuntu/lp-1805256-async_use_explicit_mem_barriers.patch b/debian/patches/ubuntu/lp-1805256-async_use_explicit_mem_barriers.patch
386new file mode 100644
387index 0000000..a352d49
388--- /dev/null
389+++ b/debian/patches/ubuntu/lp-1805256-async_use_explicit_mem_barriers.patch
390@@ -0,0 +1,164 @@
391+Description: async: use explicit memory barriers
392+
393+When using C11 atomics, non-seqcst reads and writes do not participate
394+in the total order of seqcst operations. In util/async.c and util/aio-posix.c,
395+in particular, the pattern that we use
396+
397+ write ctx->notify_me write bh->scheduled
398+ read bh->scheduled read ctx->notify_me
399+ if !bh->scheduled, sleep if ctx->notify_me, notify
400+
401+needs to use seqcst operations for both the write and the read. In
402+general this is something that we do not want, because there can be
403+many sources that are polled in addition to bottom halves. The
404+alternative is to place a seqcst memory barrier between the write
405+and the read. This also comes with a disadvantage, in that the
406+memory barrier is implicit on strongly-ordered architectures and
407+it wastes a few dozen clock cycles.
408+
409+Fortunately, ctx->notify_me is never written concurrently by two
410+threads, so we can assert that and relax the writes to ctx->notify_me.
411+The resulting solution works and performs well on both aarch64 and x86.
412+
413+Note that the atomic_set/atomic_read combination is not an atomic
414+read-modify-write, and therefore it is even weaker than C11 ATOMIC_RELAXED;
415+on x86, ATOMIC_RELAXED compiles to a locked operation.
416+
417+Analyzed-by: Ying Fang <fangying1@huawei.com>
418+Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
419+Tested-by: Ying Fang <fangying1@huawei.com>
420+Message-Id: <20200407140746.8041-6-pbonzini@redhat.com>
421+Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
422+
423+Author: Paolo Bonzini <pbonzini@redhat.com>
424+Origin: upstream, https://github.com/qemu/qemu/commit/5710a3e09f
425+Bug: https://launchpad.net/bugs/1805256
426+Bug-Ubuntu: https://launchpad.net/bugs/1805256
427+Reviewed-By: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
428+Last-Update: 2020-05-07
429+---
430+ util/aio-posix.c | 16 ++++++++++++++--
431+ util/aio-win32.c | 17 ++++++++++++++---
432+ util/async.c | 16 ++++++++++++----
433+ 3 files changed, 40 insertions(+), 9 deletions(-)
434+
435+--- a/util/aio-posix.c
436++++ b/util/aio-posix.c
437+@@ -582,6 +582,11 @@
438+ int64_t timeout;
439+ int64_t start = 0;
440+
441++ /*
442++ * There cannot be two concurrent aio_poll calls for the same AioContext (or
443++ * an aio_poll concurrent with a GSource prepare/check/dispatch callback).
444++ * We rely on this below to avoid slow locked accesses to ctx->notify_me.
445++ */
446+ assert(in_aio_context_home_thread(ctx));
447+
448+ /* aio_notify can avoid the expensive event_notifier_set if
449+@@ -592,7 +597,13 @@
450+ * so disable the optimization now.
451+ */
452+ if (blocking) {
453+- atomic_add(&ctx->notify_me, 2);
454++ atomic_set(&ctx->notify_me, atomic_read(&ctx->notify_me) + 2);
455++ /*
456++ * Write ctx->notify_me before computing the timeout
457++ * (reading bottom half flags, etc.). Pairs with
458++ * smp_mb in aio_notify().
459++ */
460++ smp_mb();
461+ }
462+
463+ qemu_lockcnt_inc(&ctx->list_lock);
464+@@ -633,7 +644,8 @@
465+ }
466+
467+ if (blocking) {
468+- atomic_sub(&ctx->notify_me, 2);
469++ /* Finish the poll before clearing the flag. */
470++ atomic_store_release(&ctx->notify_me, atomic_read(&ctx->notify_me) - 2);
471+ aio_notify_accept(ctx);
472+ }
473+
474+--- a/util/aio-win32.c
475++++ b/util/aio-win32.c
476+@@ -330,6 +330,12 @@
477+ int count;
478+ int timeout;
479+
480++ /*
481++ * There cannot be two concurrent aio_poll calls for the same AioContext (or
482++ * an aio_poll concurrent with a GSource prepare/check/dispatch callback).
483++ * We rely on this below to avoid slow locked accesses to ctx->notify_me.
484++ */
485++ assert(in_aio_context_home_thread(ctx));
486+ progress = false;
487+
488+ /* aio_notify can avoid the expensive event_notifier_set if
489+@@ -340,7 +346,13 @@
490+ * so disable the optimization now.
491+ */
492+ if (blocking) {
493+- atomic_add(&ctx->notify_me, 2);
494++ atomic_set(&ctx->notify_me, atomic_read(&ctx->notify_me) + 2);
495++ /*
496++ * Write ctx->notify_me before computing the timeout
497++ * (reading bottom half flags, etc.). Pairs with
498++ * smp_mb in aio_notify().
499++ */
500++ smp_mb();
501+ }
502+
503+ qemu_lockcnt_inc(&ctx->list_lock);
504+@@ -373,8 +385,7 @@
505+ ret = WaitForMultipleObjects(count, events, FALSE, timeout);
506+ if (blocking) {
507+ assert(first);
508+- assert(in_aio_context_home_thread(ctx));
509+- atomic_sub(&ctx->notify_me, 2);
510++ atomic_store_release(&ctx->notify_me, atomic_read(&ctx->notify_me) - 2);
511+ aio_notify_accept(ctx);
512+ }
513+
514+--- a/util/async.c
515++++ b/util/async.c
516+@@ -221,7 +221,14 @@
517+ {
518+ AioContext *ctx = (AioContext *) source;
519+
520+- atomic_or(&ctx->notify_me, 1);
521++ atomic_set(&ctx->notify_me, atomic_read(&ctx->notify_me) | 1);
522++
523++ /*
524++ * Write ctx->notify_me before computing the timeout
525++ * (reading bottom half flags, etc.). Pairs with
526++ * smp_mb in aio_notify().
527++ */
528++ smp_mb();
529+
530+ /* We assume there is no timeout already supplied */
531+ *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
532+@@ -239,7 +246,8 @@
533+ AioContext *ctx = (AioContext *) source;
534+ QEMUBH *bh;
535+
536+- atomic_and(&ctx->notify_me, ~1);
537++ /* Finish computing the timeout before clearing the flag. */
538++ atomic_store_release(&ctx->notify_me, atomic_read(&ctx->notify_me) & ~1);
539+ aio_notify_accept(ctx);
540+
541+ for (bh = ctx->first_bh; bh; bh = bh->next) {
542+@@ -335,10 +343,10 @@
543+ void aio_notify(AioContext *ctx)
544+ {
545+ /* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
546+- * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
547++ * with smp_mb in aio_ctx_prepare or aio_poll.
548+ */
549+ smp_mb();
550+- if (ctx->notify_me) {
551++ if (atomic_read(&ctx->notify_me)) {
552+ event_notifier_set(&ctx->notifier);
553+ atomic_mb_set(&ctx->notified, true);
554+ }
555diff --git a/debian/patches/ubuntu/lp-1805256-dont-count-ctx-not-as-progress.patch b/debian/patches/ubuntu/lp-1805256-dont-count-ctx-not-as-progress.patch
556new file mode 100644
557index 0000000..b4ab591
558--- /dev/null
559+++ b/debian/patches/ubuntu/lp-1805256-dont-count-ctx-not-as-progress.patch
560@@ -0,0 +1,25 @@
561+Description: Don't count ctx->notifier as progress when polling
562+
563+The same logic exists in fd polling. This change is especially important
564+to avoid busy loop once we limit aio_notify_accept() to blocking
565+aio_poll().
566+
567+Author: Fam Zheng <famz@redhat.com>
568+Origin: upstream, https://github.com/qemu/qemu/commit/70232b5253
569+Bug: https://launchpad.net/bugs/1805256
570+Bug-Ubuntu: https://launchpad.net/bugs/1805256
571+Reviewed-By: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
572+Last-Update: 2020-05-25
573+
574+--- qemu-2.11+dfsg.orig/util/aio-posix.c
575++++ qemu-2.11+dfsg/util/aio-posix.c
576+@@ -495,7 +495,8 @@ static bool run_poll_handlers_once(AioCo
577+ QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
578+ if (!node->deleted && node->io_poll &&
579+ aio_node_check(ctx, node->is_external) &&
580+- node->io_poll(node->opaque)) {
581++ node->io_poll(node->opaque) &&
582++ node->opaque != &ctx->notifier) {
583+ progress = true;
584+ }
585+
