Merge ~sergiodj/ubuntu/+source/qemu:tcg-crash-mantic into ubuntu/+source/qemu:ubuntu/mantic-devel

Proposed by Sergio Durigan Junior
Status: Merged
Approved by: git-ubuntu bot
Approved revision: not available
Merge reported by: Andreas Hasenack
Merged at revision: 289556433dc6174924c419625887e129e69b3f5f
Proposed branch: ~sergiodj/ubuntu/+source/qemu:tcg-crash-mantic
Merge into: ubuntu/+source/qemu:ubuntu/mantic-devel
Diff against target: 789 lines (+755/-0)
5 files modified
debian/changelog (+7/-0)
debian/patches/series (+3/-0)
debian/patches/ubuntu/lp-2051965-accel-tcg-Always-lock-pages-before-translation.patch (+621/-0)
debian/patches/ubuntu/lp-2051965-accel-tcg-Clear-tcg_ctx-gen_tb-on-buffer-overflow.patch (+34/-0)
debian/patches/ubuntu/lp-2051965-accel-tcg-Split-out-cpu_exec_longjmp_cleanup.patch (+90/-0)
Reviewer Review Type Date Requested Status
git-ubuntu bot Approve
Christian Ehrhardt  (community) Approve
Canonical Server Reporter Pending
Review via email: mp+460476@code.launchpad.net

Description of the change

This MP fixes QEMU bug #2051965 on Mantic.

This is a tricky bug; you can take a look at the Test Plan to see how to reproduce it. There will be SRUs for Jammy and Mantic; the former will be more complex to review, and the latter will be more straightforward (although not without its complexities). As of this writing, the Jammy SRU is not ready yet because I'm still working on backporting the needed patches.

I'm particularly interested in a double-check of the 3 patches being backported to fix the bug. Please let me know if you spot any problems or missing patches.

PPA: https://launchpad.net/~sergiodj/+archive/ubuntu/qemu-bug2051965/+packages

dep8 results will be posted soon.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Some trivial findings first:

- patches are in debian/patches/ubuntu/, but the changelog entry says "- d/p/lp..."; it should be "- d/p/u/lp..."
- some places mention kernel "6.13" while it is "6.3"
- avoid some ambiguities by stating clearly that kernel >=6.3 is meant to be the one running in the guest; otherwise some might think "so what, 6.3 isn't supported in jammy/mantic anyway" or similar

On the patches:
- lp-2051965-accel-tcg-Split-out-cpu_exec_longjmp_cleanup.patch
  That seems fine; it just moves identical code into a shared function.
  The assert should not matter, and in the actual binary this is most likely inlined and thereby behaves as before.
- lp-2051965-accel-tcg-Always-lock-pages-before-translation.patch
  Wow, this one is busier.
  Conceptually it is OK: at a high level of abstraction, more locking is slower but safer.
  It didn't help that the diff was hard to read because of so much code movement,
  but in the end the new code is easier to follow, with just one kind of locking instead of four.
  I spent some time in a rabbit hole reading this, but it is either OK or too complex for me
  to see the issue.
- lp-2051965-accel-tcg-Clear-tcg_ctx-gen_tb-on-buffer-overflow.patch
  That one is easy again; it clears state that the previous patch could leave behind.
- I found no other fixups than the one you already spotted

I can see why you said the Jammy backport might be more complex; there are probably plenty of changes to that code. My gut says you'd be OK backporting quite a few more changes (instead of only adapting those three patches) if you need to, but that is up to you as you see fit.

+1 on this one with the minor things fixed.

For the "where issues occur" in the SRU statement, the most important bit is that this seems purely in tcg code, so we can assume that HW-virt will not be affected ever.

review: Approve
Revision history for this message
git-ubuntu bot (git-ubuntu-bot) wrote :

Approvers: sergiodj, paelzer
Uploaders: sergiodj, paelzer
MP auto-approved

review: Approve
Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

On Wednesday, February 14 2024, Christian Ehrhardt  wrote:

> Review: Approve

Thanks for the review, Christian.

> Some trivial findings first:
>
> - patches are in debian/patches/ubuntu/, but the changelog entry says "- d/p/lp..."; it should be "- d/p/u/lp..."

Ah, indeed. Thanks, I'll fix it.

> - some places mention kernel "6.13" while it is "6.3"

Doh. Fixed.

> - avoid some ambiguities by stating clearly that kernel >=6.3 is meant to be the one running in the guest; otherwise some might think "so what, 6.3 isn't supported in jammy/mantic anyway" or similar

Done.

> I can see why you said the Jammy backport might be more complex; there
> are probably plenty of changes to that code. My gut says you'd be OK
> backporting quite a few more changes (instead of only adapting those
> three patches) if you need to, but that is up to you as you see fit.

Yeah, I've been collecting the "adjacent" patches that will need to be
backported on Jammy. I think it should be fine at the end of the day.

> +1 on this one with the minor things fixed.

Thanks. I'll wait until I have the Jammy MP ready before I upload this
one.

> For the "where issues occur" in the SRU statement, the most important bit is that this seems purely in tcg code, so we can assume that HW-virt will not be affected ever.

ACK. I'm putting off writing the rest of the SRU template until I have
a clear picture of what the Jammy MP will be.

Cheers,

--
Sergio
GPG key ID: E92F D0B3 6B14 F1F4 D8E0 EB2F 106D A1C8 C3CB BF14

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

On Wednesday, February 14 2024, Sergio Durigan Junior wrote:

>> For the "where issues occur" in the SRU statement, the most important bit is that this seems purely in tcg code, so we can assume that HW-virt will not be affected ever.
>
> ACK. I'm putting off writing the rest of the SRU template until I have
> a clear picture of what the Jammy MP will be.

The Jammy fix will take a bit longer, so I'm going ahead and uploading
this one. The reporter is affected by the Mantic bug, so hopefully this
will unblock him while I figure out the Jammy fix.

Uploaded:

$ dput qemu_8.0.4+dfsg-1ubuntu3.23.10.3_source.changes
Trying to upload package to ubuntu
Checking signature on .changes
gpg: /home/sergio/work/qemu/qemu_8.0.4+dfsg-1ubuntu3.23.10.3_source.changes: Valid signature from 106DA1C8C3CBBF14
Checking signature on .dsc
gpg: /home/sergio/work/qemu/qemu_8.0.4+dfsg-1ubuntu3.23.10.3.dsc: Valid signature from 106DA1C8C3CBBF14
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading qemu_8.0.4+dfsg-1ubuntu3.23.10.3.dsc: done.
  Uploading qemu_8.0.4+dfsg-1ubuntu3.23.10.3.debian.tar.xz: done.
  Uploading qemu_8.0.4+dfsg-1ubuntu3.23.10.3_source.buildinfo: done.
  Uploading qemu_8.0.4+dfsg-1ubuntu3.23.10.3_source.changes: done.
Successfully uploaded packages.

--
Sergio
GPG key ID: E92F D0B3 6B14 F1F4 D8E0 EB2F 106D A1C8 C3CB BF14

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

This was released already, marking as merged.

Preview Diff

1diff --git a/debian/changelog b/debian/changelog
2index fe1973e..e055d38 100644
3--- a/debian/changelog
4+++ b/debian/changelog
5@@ -1,3 +1,10 @@
6+qemu (1:8.0.4+dfsg-1ubuntu3.23.10.3) mantic; urgency=medium
7+
8+ * d/p/u/lp-2051965-*.patch: Fix QEMU crash when using TCG acceleration
9+ with guest Linux kernel >= 6.3. (LP: #2051965)
10+
11+ -- Sergio Durigan Junior <sergio.durigan@canonical.com> Tue, 13 Feb 2024 18:26:17 -0500
12+
13 qemu (1:8.0.4+dfsg-1ubuntu3.23.10.2) mantic-security; urgency=medium
14
15 * SECURITY UPDATE: OOB read in RDMA device
16diff --git a/debian/patches/series b/debian/patches/series
17index 75c6b02..553651c 100644
18--- a/debian/patches/series
19+++ b/debian/patches/series
20@@ -36,3 +36,6 @@ CVE-2023-40360.patch
21 CVE-2023-4135.patch
22 CVE-2023-42467.patch
23 CVE-2023-5088.patch
24+ubuntu/lp-2051965-accel-tcg-Split-out-cpu_exec_longjmp_cleanup.patch
25+ubuntu/lp-2051965-accel-tcg-Always-lock-pages-before-translation.patch
26+ubuntu/lp-2051965-accel-tcg-Clear-tcg_ctx-gen_tb-on-buffer-overflow.patch
27diff --git a/debian/patches/ubuntu/lp-2051965-accel-tcg-Always-lock-pages-before-translation.patch b/debian/patches/ubuntu/lp-2051965-accel-tcg-Always-lock-pages-before-translation.patch
28new file mode 100644
29index 0000000..2d567b0
30--- /dev/null
31+++ b/debian/patches/ubuntu/lp-2051965-accel-tcg-Always-lock-pages-before-translation.patch
32@@ -0,0 +1,621 @@
33+From deba78709ae8ce103e2248413857747f804cd1ef Mon Sep 17 00:00:00 2001
34+From: Richard Henderson <richard.henderson@linaro.org>
35+Date: Thu, 6 Jul 2023 17:55:48 +0100
36+Subject: [PATCH] accel/tcg: Always lock pages before translation
37+
38+We had done this for user-mode by invoking page_protect
39+within the translator loop. Extend this to handle system
40+mode as well. Move page locking out of tb_link_page.
41+
42+Reported-by: Liren Wei <lrwei@bupt.edu.cn>
43+Reported-by: Richard W.M. Jones <rjones@redhat.com>
44+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
45+Tested-by: Richard W.M. Jones <rjones@redhat.com>
46+
47+Origin: backport, https://gitlab.com/qemu-project/qemu/commit/deba78709a
48+Bug-Ubuntu: https://bugs.launchpad.net/bugs/2051965
49+Last-Update: 2024-02-13
50+
51+---
52+ accel/tcg/cpu-exec.c | 20 ++++
53+ accel/tcg/internal.h | 30 ++++-
54+ accel/tcg/tb-maint.c | 244 ++++++++++++++++++++------------------
55+ accel/tcg/translate-all.c | 43 ++++++-
56+ accel/tcg/translator.c | 34 ++++--
57+ 5 files changed, 237 insertions(+), 134 deletions(-)
58+
59+--- qemu-tcg-crash-mantic.orig/accel/tcg/cpu-exec.c 2024-02-13 18:18:45.276155938 -0500
60++++ qemu-tcg-crash-mantic/accel/tcg/cpu-exec.c 2024-02-13 18:18:45.276155938 -0500
61+@@ -539,6 +539,26 @@ static void cpu_exec_longjmp_cleanup(CPU
62+ if (have_mmap_lock()) {
63+ mmap_unlock();
64+ }
65++#else
66++ /*
67++ * For softmmu, a tlb_fill fault during translation will land here,
68++ * and we need to release any page locks held. In system mode we
69++ * have one tcg_ctx per thread, so we know it was this cpu doing
70++ * the translation.
71++ *
72++ * Alternative 1: Install a cleanup to be called via an exception
73++ * handling safe longjmp. It seems plausible that all our hosts
74++ * support such a thing. We'd have to properly register unwind info
75++ * for the JIT for EH, rather that just for GDB.
76++ *
77++ * Alternative 2: Set and restore cpu->jmp_env in tb_gen_code to
78++ * capture the cpu_loop_exit longjmp, perform the cleanup, and
79++ * jump again to arrive here.
80++ */
81++ if (tcg_ctx->gen_tb) {
82++ tb_unlock_pages(tcg_ctx->gen_tb);
83++ tcg_ctx->gen_tb = NULL;
84++ }
85+ #endif
86+ if (qemu_mutex_iothread_locked()) {
87+ qemu_mutex_unlock_iothread();
88+--- qemu-tcg-crash-mantic.orig/accel/tcg/internal.h 2024-02-13 18:18:45.276155938 -0500
89++++ qemu-tcg-crash-mantic/accel/tcg/internal.h 2024-02-13 18:18:45.276155938 -0500
90+@@ -10,6 +10,7 @@
91+ #define ACCEL_TCG_INTERNAL_H
92+
93+ #include "exec/exec-all.h"
94++#include "exec/translate-all.h"
95+
96+ /*
97+ * Access to the various translations structures need to be serialised
98+@@ -35,6 +36,32 @@ static inline void page_table_config_ini
99+ void page_table_config_init(void);
100+ #endif
101+
102++#ifdef CONFIG_USER_ONLY
103++/*
104++ * For user-only, page_protect sets the page read-only.
105++ * Since most execution is already on read-only pages, and we'd need to
106++ * account for other TBs on the same page, defer undoing any page protection
107++ * until we receive the write fault.
108++ */
109++static inline void tb_lock_page0(tb_page_addr_t p0)
110++{
111++ page_protect(p0);
112++}
113++
114++static inline void tb_lock_page1(tb_page_addr_t p0, tb_page_addr_t p1)
115++{
116++ page_protect(p1);
117++}
118++
119++static inline void tb_unlock_page1(tb_page_addr_t p0, tb_page_addr_t p1) { }
120++static inline void tb_unlock_pages(TranslationBlock *tb) { }
121++#else
122++void tb_lock_page0(tb_page_addr_t);
123++void tb_lock_page1(tb_page_addr_t, tb_page_addr_t);
124++void tb_unlock_page1(tb_page_addr_t, tb_page_addr_t);
125++void tb_unlock_pages(TranslationBlock *);
126++#endif
127++
128+ #ifdef CONFIG_SOFTMMU
129+ void tb_invalidate_phys_range_fast(ram_addr_t ram_addr,
130+ unsigned size,
131+@@ -48,8 +75,7 @@ TranslationBlock *tb_gen_code(CPUState *
132+ void page_init(void);
133+ void tb_htable_init(void);
134+ void tb_reset_jump(TranslationBlock *tb, int n);
135+-TranslationBlock *tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
136+- tb_page_addr_t phys_page2);
137++TranslationBlock *tb_link_page(TranslationBlock *tb);
138+ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc);
139+ void cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
140+ uintptr_t host_pc);
141+--- qemu-tcg-crash-mantic.orig/accel/tcg/tb-maint.c 2024-02-13 18:18:45.276155938 -0500
142++++ qemu-tcg-crash-mantic/accel/tcg/tb-maint.c 2024-02-13 18:20:49.096346316 -0500
143+@@ -71,17 +71,7 @@ typedef struct PageDesc PageDesc;
144+ */
145+ #define assert_page_locked(pd) tcg_debug_assert(have_mmap_lock())
146+
147+-static inline void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
148+- PageDesc **ret_p2, tb_page_addr_t phys2,
149+- bool alloc)
150+-{
151+- *ret_p1 = NULL;
152+- *ret_p2 = NULL;
153+-}
154+-
155+-static inline void page_unlock(PageDesc *pd) { }
156+-static inline void page_lock_tb(const TranslationBlock *tb) { }
157+-static inline void page_unlock_tb(const TranslationBlock *tb) { }
158++static inline void tb_lock_pages(const TranslationBlock *tb) { }
159+
160+ /*
161+ * For user-only, since we are protecting all of memory with a single lock,
162+@@ -97,7 +87,7 @@ static void tb_remove_all(void)
163+ }
164+
165+ /* Call with mmap_lock held. */
166+-static void tb_record(TranslationBlock *tb, PageDesc *p1, PageDesc *p2)
167++static void tb_record(TranslationBlock *tb)
168+ {
169+ target_ulong addr;
170+ int flags;
171+@@ -392,12 +382,108 @@ static void page_lock(PageDesc *pd)
172+ qemu_spin_lock(&pd->lock);
173+ }
174+
175++/* Like qemu_spin_trylock, returns false on success */
176++static bool page_trylock(PageDesc *pd)
177++{
178++ bool busy = qemu_spin_trylock(&pd->lock);
179++ if (!busy) {
180++ page_lock__debug(pd);
181++ }
182++ return busy;
183++}
184++
185+ static void page_unlock(PageDesc *pd)
186+ {
187+ qemu_spin_unlock(&pd->lock);
188+ page_unlock__debug(pd);
189+ }
190+
191++void tb_lock_page0(tb_page_addr_t paddr)
192++{
193++ page_lock(page_find_alloc(paddr >> TARGET_PAGE_BITS, true));
194++}
195++
196++void tb_lock_page1(tb_page_addr_t paddr0, tb_page_addr_t paddr1)
197++{
198++ tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
199++ tb_page_addr_t pindex1 = paddr1 >> TARGET_PAGE_BITS;
200++ PageDesc *pd0, *pd1;
201++
202++ if (pindex0 == pindex1) {
203++ /* Identical pages, and the first page is already locked. */
204++ return;
205++ }
206++
207++ pd1 = page_find_alloc(pindex1, true);
208++ if (pindex0 < pindex1) {
209++ /* Correct locking order, we may block. */
210++ page_lock(pd1);
211++ return;
212++ }
213++
214++ /* Incorrect locking order, we cannot block lest we deadlock. */
215++ if (!page_trylock(pd1)) {
216++ return;
217++ }
218++
219++ /*
220++ * Drop the lock on page0 and get both page locks in the right order.
221++ * Restart translation via longjmp.
222++ */
223++ pd0 = page_find_alloc(pindex0, false);
224++ page_unlock(pd0);
225++ page_lock(pd1);
226++ page_lock(pd0);
227++ siglongjmp(tcg_ctx->jmp_trans, -3);
228++}
229++
230++void tb_unlock_page1(tb_page_addr_t paddr0, tb_page_addr_t paddr1)
231++{
232++ tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
233++ tb_page_addr_t pindex1 = paddr1 >> TARGET_PAGE_BITS;
234++
235++ if (pindex0 != pindex1) {
236++ page_unlock(page_find_alloc(pindex1, false));
237++ }
238++}
239++
240++static void tb_lock_pages(TranslationBlock *tb)
241++{
242++ tb_page_addr_t paddr0 = tb_page_addr0(tb);
243++ tb_page_addr_t paddr1 = tb_page_addr1(tb);
244++ tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
245++ tb_page_addr_t pindex1 = paddr1 >> TARGET_PAGE_BITS;
246++
247++ if (unlikely(paddr0 == -1)) {
248++ return;
249++ }
250++ if (unlikely(paddr1 != -1) && pindex0 != pindex1) {
251++ if (pindex0 < pindex1) {
252++ page_lock(page_find_alloc(pindex0, true));
253++ page_lock(page_find_alloc(pindex1, true));
254++ return;
255++ }
256++ page_lock(page_find_alloc(pindex1, true));
257++ }
258++ page_lock(page_find_alloc(pindex0, true));
259++}
260++
261++void tb_unlock_pages(TranslationBlock *tb)
262++{
263++ tb_page_addr_t paddr0 = tb_page_addr0(tb);
264++ tb_page_addr_t paddr1 = tb_page_addr1(tb);
265++ tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
266++ tb_page_addr_t pindex1 = paddr1 >> TARGET_PAGE_BITS;
267++
268++ if (unlikely(paddr0 == -1)) {
269++ return;
270++ }
271++ if (unlikely(paddr1 != -1) && pindex0 != pindex1) {
272++ page_unlock(page_find_alloc(pindex1, false));
273++ }
274++ page_unlock(page_find_alloc(pindex0, false));
275++}
276++
277+ static inline struct page_entry *
278+ page_entry_new(PageDesc *pd, tb_page_addr_t index)
279+ {
280+@@ -421,13 +507,10 @@ static void page_entry_destroy(gpointer
281+ /* returns false on success */
282+ static bool page_entry_trylock(struct page_entry *pe)
283+ {
284+- bool busy;
285+-
286+- busy = qemu_spin_trylock(&pe->pd->lock);
287++ bool busy = page_trylock(pe->pd);
288+ if (!busy) {
289+ g_assert(!pe->locked);
290+ pe->locked = true;
291+- page_lock__debug(pe->pd);
292+ }
293+ return busy;
294+ }
295+@@ -605,8 +688,7 @@ static void tb_remove_all(void)
296+ * Add the tb in the target page and protect it if necessary.
297+ * Called with @p->lock held.
298+ */
299+-static inline void tb_page_add(PageDesc *p, TranslationBlock *tb,
300+- unsigned int n)
301++static void tb_page_add(PageDesc *p, TranslationBlock *tb, unsigned int n)
302+ {
303+ bool page_already_protected;
304+
305+@@ -626,15 +708,21 @@ static inline void tb_page_add(PageDesc
306+ }
307+ }
308+
309+-static void tb_record(TranslationBlock *tb, PageDesc *p1, PageDesc *p2)
310++static void tb_record(TranslationBlock *tb)
311+ {
312+- tb_page_add(p1, tb, 0);
313+- if (unlikely(p2)) {
314+- tb_page_add(p2, tb, 1);
315++ tb_page_addr_t paddr0 = tb_page_addr0(tb);
316++ tb_page_addr_t paddr1 = tb_page_addr1(tb);
317++ tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
318++ tb_page_addr_t pindex1 = paddr0 >> TARGET_PAGE_BITS;
319++
320++ assert(paddr0 != -1);
321++ if (unlikely(paddr1 != -1) && pindex0 != pindex1) {
322++ tb_page_add(page_find_alloc(pindex1, false), tb, 1);
323+ }
324++ tb_page_add(page_find_alloc(pindex0, false), tb, 0);
325+ }
326+
327+-static inline void tb_page_remove(PageDesc *pd, TranslationBlock *tb)
328++static void tb_page_remove(PageDesc *pd, TranslationBlock *tb)
329+ {
330+ TranslationBlock *tb1;
331+ uintptr_t *pprev;
332+@@ -654,74 +742,16 @@ static inline void tb_page_remove(PageDe
333+
334+ static void tb_remove(TranslationBlock *tb)
335+ {
336+- PageDesc *pd;
337+-
338+- pd = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
339+- tb_page_remove(pd, tb);
340+- if (unlikely(tb->page_addr[1] != -1)) {
341+- pd = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
342+- tb_page_remove(pd, tb);
343+- }
344+-}
345+-
346+-static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
347+- PageDesc **ret_p2, tb_page_addr_t phys2, bool alloc)
348+-{
349+- PageDesc *p1, *p2;
350+- tb_page_addr_t page1;
351+- tb_page_addr_t page2;
352+-
353+- assert_memory_lock();
354+- g_assert(phys1 != -1);
355+-
356+- page1 = phys1 >> TARGET_PAGE_BITS;
357+- page2 = phys2 >> TARGET_PAGE_BITS;
358+-
359+- p1 = page_find_alloc(page1, alloc);
360+- if (ret_p1) {
361+- *ret_p1 = p1;
362+- }
363+- if (likely(phys2 == -1)) {
364+- page_lock(p1);
365+- return;
366+- } else if (page1 == page2) {
367+- page_lock(p1);
368+- if (ret_p2) {
369+- *ret_p2 = p1;
370+- }
371+- return;
372+- }
373+- p2 = page_find_alloc(page2, alloc);
374+- if (ret_p2) {
375+- *ret_p2 = p2;
376+- }
377+- if (page1 < page2) {
378+- page_lock(p1);
379+- page_lock(p2);
380+- } else {
381+- page_lock(p2);
382+- page_lock(p1);
383+- }
384+-}
385+-
386+-/* lock the page(s) of a TB in the correct acquisition order */
387+-static void page_lock_tb(const TranslationBlock *tb)
388+-{
389+- page_lock_pair(NULL, tb_page_addr0(tb), NULL, tb_page_addr1(tb), false);
390+-}
391+-
392+-static void page_unlock_tb(const TranslationBlock *tb)
393+-{
394+- PageDesc *p1 = page_find(tb_page_addr0(tb) >> TARGET_PAGE_BITS);
395+-
396+- page_unlock(p1);
397+- if (unlikely(tb_page_addr1(tb) != -1)) {
398+- PageDesc *p2 = page_find(tb_page_addr1(tb) >> TARGET_PAGE_BITS);
399+-
400+- if (p2 != p1) {
401+- page_unlock(p2);
402+- }
403++ tb_page_addr_t paddr0 = tb_page_addr0(tb);
404++ tb_page_addr_t paddr1 = tb_page_addr1(tb);
405++ tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
406++ tb_page_addr_t pindex1 = paddr0 >> TARGET_PAGE_BITS;
407++
408++ assert(paddr0 != -1);
409++ if (unlikely(paddr1 != -1) && pindex0 != pindex1) {
410++ tb_page_remove(page_find_alloc(pindex1, false), tb);
411+ }
412++ tb_page_remove(page_find_alloc(pindex0, false), tb);
413+ }
414+ #endif /* CONFIG_USER_ONLY */
415+
416+@@ -926,18 +956,16 @@ static void tb_phys_invalidate__locked(T
417+ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
418+ {
419+ if (page_addr == -1 && tb_page_addr0(tb) != -1) {
420+- page_lock_tb(tb);
421++ tb_lock_pages(tb);
422+ do_tb_phys_invalidate(tb, true);
423+- page_unlock_tb(tb);
424++ tb_unlock_pages(tb);
425+ } else {
426+ do_tb_phys_invalidate(tb, false);
427+ }
428+ }
429+
430+ /*
431+- * Add a new TB and link it to the physical page tables. phys_page2 is
432+- * (-1) to indicate that only one page contains the TB.
433+- *
434++ * Add a new TB and link it to the physical page tables.
435+ * Called with mmap_lock held for user-mode emulation.
436+ *
437+ * Returns a pointer @tb, or a pointer to an existing TB that matches @tb.
438+@@ -945,43 +973,29 @@ void tb_phys_invalidate(TranslationBlock
439+ * for the same block of guest code that @tb corresponds to. In that case,
440+ * the caller should discard the original @tb, and use instead the returned TB.
441+ */
442+-TranslationBlock *tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
443+- tb_page_addr_t phys_page2)
444++TranslationBlock *tb_link_page(TranslationBlock *tb)
445+ {
446+- PageDesc *p;
447+- PageDesc *p2 = NULL;
448+ void *existing_tb = NULL;
449+ uint32_t h;
450+
451+ assert_memory_lock();
452+ tcg_debug_assert(!(tb->cflags & CF_INVALID));
453+
454+- /*
455+- * Add the TB to the page list, acquiring first the pages's locks.
456+- * We keep the locks held until after inserting the TB in the hash table,
457+- * so that if the insertion fails we know for sure that the TBs are still
458+- * in the page descriptors.
459+- * Note that inserting into the hash table first isn't an option, since
460+- * we can only insert TBs that are fully initialized.
461+- */
462+- page_lock_pair(&p, phys_pc, &p2, phys_page2, true);
463+- tb_record(tb, p, p2);
464++ tb_record(tb);
465+
466+ /* add in the hash table */
467+- h = tb_hash_func(phys_pc, (tb->cflags & CF_PCREL ? 0 : tb->pc),
468++ h = tb_hash_func(tb_page_addr0(tb), (tb->cflags & CF_PCREL ? 0 : tb->pc),
469+ tb->flags, tb->cflags, tb->trace_vcpu_dstate);
470+ qht_insert(&tb_ctx.htable, tb, h, &existing_tb);
471+
472+ /* remove TB from the page(s) if we couldn't insert it */
473+ if (unlikely(existing_tb)) {
474+ tb_remove(tb);
475+- tb = existing_tb;
476++ tb_unlock_pages(tb);
477++ return existing_tb;
478+ }
479+
480+- if (p2 && p2 != p) {
481+- page_unlock(p2);
482+- }
483+- page_unlock(p);
484++ tb_unlock_pages(tb);
485+ return tb;
486+ }
487+
488+--- qemu-tcg-crash-mantic.orig/accel/tcg/translate-all.c 2024-02-13 18:18:45.276155938 -0500
489++++ qemu-tcg-crash-mantic/accel/tcg/translate-all.c 2024-02-13 18:22:38.708335591 -0500
490+@@ -304,7 +304,7 @@ TranslationBlock *tb_gen_code(CPUState *
491+ {
492+ CPUArchState *env = cpu->env_ptr;
493+ TranslationBlock *tb, *existing_tb;
494+- tb_page_addr_t phys_pc;
495++ tb_page_addr_t phys_pc, phys_p2;
496+ tcg_insn_unit *gen_code_buf;
497+ int gen_code_size, search_size, max_insns;
498+ #ifdef CONFIG_PROFILER
499+@@ -330,6 +330,7 @@ TranslationBlock *tb_gen_code(CPUState *
500+ QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
501+
502+ buffer_overflow:
503++ assert_no_pages_locked();
504+ tb = tcg_tb_alloc(tcg_ctx);
505+ if (unlikely(!tb)) {
506+ /* flush must be done */
507+@@ -351,8 +352,12 @@ TranslationBlock *tb_gen_code(CPUState *
508+ tb->trace_vcpu_dstate = *cpu->trace_dstate;
509+ tb_set_page_addr0(tb, phys_pc);
510+ tb_set_page_addr1(tb, -1);
511++ if (phys_pc != -1) {
512++ tb_lock_page0(phys_pc);
513++ }
514++
515+ tcg_ctx->gen_tb = tb;
516+- tb_overflow:
517++ restart_translate:
518+
519+ #ifdef CONFIG_PROFILER
520+ /* includes aborted translations because of exceptions */
521+@@ -378,6 +383,7 @@ TranslationBlock *tb_gen_code(CPUState *
522+ qemu_log_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT,
523+ "Restarting code generation for "
524+ "code_gen_buffer overflow\n");
525++ tb_unlock_pages(tb);
526+ goto buffer_overflow;
527+
528+ case -2:
529+@@ -396,14 +402,39 @@ TranslationBlock *tb_gen_code(CPUState *
530+ "Restarting code generation with "
531+ "smaller translation block (max %d insns)\n",
532+ max_insns);
533+- goto tb_overflow;
534++
535++ /*
536++ * The half-sized TB may not cross pages.
537++ * TODO: Fix all targets that cross pages except with
538++ * the first insn, at which point this can't be reached.
539++ */
540++ phys_p2 = tb_page_addr1(tb);
541++ if (unlikely(phys_p2 != -1)) {
542++ tb_unlock_page1(phys_pc, phys_p2);
543++ tb_set_page_addr1(tb, -1);
544++ }
545++ goto restart_translate;
546++
547++ case -3:
548++ /*
549++ * We had a page lock ordering problem. In order to avoid
550++ * deadlock we had to drop the lock on page0, which means
551++ * that everything we translated so far is compromised.
552++ * Restart with locks held on both pages.
553++ */
554++ qemu_log_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT,
555++ "Restarting code generation with re-locked pages");
556++ goto restart_translate;
557+
558+ default:
559+ g_assert_not_reached();
560+ }
561+ }
562++ tcg_ctx->gen_tb = NULL;
563++
564+ search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
565+ if (unlikely(search_size < 0)) {
566++ tb_unlock_pages(tb);
567+ goto buffer_overflow;
568+ }
569+ tb->tc.size = gen_code_size;
570+@@ -522,6 +553,7 @@ TranslationBlock *tb_gen_code(CPUState *
571+ * before attempting to link to other TBs or add to the lookup table.
572+ */
573+ if (tb_page_addr0(tb) == -1) {
574++ assert_no_pages_locked();
575+ return tb;
576+ }
577+
578+@@ -536,7 +568,9 @@ TranslationBlock *tb_gen_code(CPUState *
579+ * No explicit memory barrier is required -- tb_link_page() makes the
580+ * TB visible in a consistent state.
581+ */
582+- existing_tb = tb_link_page(tb, tb_page_addr0(tb), tb_page_addr1(tb));
583++ existing_tb = tb_link_page(tb);
584++ assert_no_pages_locked();
585++
586+ /* if the TB already exists, discard what we just translated */
587+ if (unlikely(existing_tb != tb)) {
588+ uintptr_t orig_aligned = (uintptr_t)gen_code_buf;
589+--- qemu-tcg-crash-mantic.orig/accel/tcg/translator.c 2024-02-13 18:18:45.276155938 -0500
590++++ qemu-tcg-crash-mantic/accel/tcg/translator.c 2024-02-13 18:23:08.420308318 -0500
591+@@ -17,6 +17,7 @@
592+ #include "exec/translator.h"
593+ #include "exec/plugin-gen.h"
594+ #include "exec/replay-core.h"
595++#include "internal.h"
596+
597+ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
598+ {
599+@@ -47,10 +48,6 @@ void translator_loop(CPUState *cpu, Tran
600+ db->host_addr[0] = host_pc;
601+ db->host_addr[1] = NULL;
602+
603+-#ifdef CONFIG_USER_ONLY
604+- page_protect(pc);
605+-#endif
606+-
607+ ops->init_disas_context(db, cpu);
608+ tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
609+
610+@@ -158,22 +155,36 @@ static void *translator_access(CPUArchSt
611+ host = db->host_addr[1];
612+ base = TARGET_PAGE_ALIGN(db->pc_first);
613+ if (host == NULL) {
614+- tb_page_addr_t phys_page =
615+- get_page_addr_code_hostp(env, base, &db->host_addr[1]);
616++ tb_page_addr_t page0, old_page1, new_page1;
617++
618++ new_page1 = get_page_addr_code_hostp(env, base, &db->host_addr[1]);
619+
620+ /*
621+ * If the second page is MMIO, treat as if the first page
622+ * was MMIO as well, so that we do not cache the TB.
623+ */
624+- if (unlikely(phys_page == -1)) {
625++ if (unlikely(new_page1 == -1)) {
626++ tb_unlock_pages(tb);
627+ tb_set_page_addr0(tb, -1);
628+ return NULL;
629+ }
630+
631+- tb_set_page_addr1(tb, phys_page);
632+-#ifdef CONFIG_USER_ONLY
633+- page_protect(end);
634+-#endif
635++ /*
636++ * If this is not the first time around, and page1 matches,
637++ * then we already have the page locked. Alternately, we're
638++ * not doing anything to prevent the PTE from changing, so
639++ * we might wind up with a different page, requiring us to
640++ * re-do the locking.
641++ */
642++ old_page1 = tb_page_addr1(tb);
643++ if (likely(new_page1 != old_page1)) {
644++ page0 = tb_page_addr0(tb);
645++ if (unlikely(old_page1 != -1)) {
646++ tb_unlock_page1(page0, old_page1);
647++ }
648++ tb_set_page_addr1(tb, new_page1);
649++ tb_lock_page1(page0, new_page1);
650++ }
651+ host = db->host_addr[1];
652+ }
653+
654diff --git a/debian/patches/ubuntu/lp-2051965-accel-tcg-Clear-tcg_ctx-gen_tb-on-buffer-overflow.patch b/debian/patches/ubuntu/lp-2051965-accel-tcg-Clear-tcg_ctx-gen_tb-on-buffer-overflow.patch
655new file mode 100644
656index 0000000..bf6b2a0
657--- /dev/null
658+++ b/debian/patches/ubuntu/lp-2051965-accel-tcg-Clear-tcg_ctx-gen_tb-on-buffer-overflow.patch
659@@ -0,0 +1,34 @@
660+From ad17868eb162a5466d8ad43e5ccb428776403308 Mon Sep 17 00:00:00 2001
661+From: Richard Henderson <richard.henderson@linaro.org>
662+Date: Wed, 26 Jul 2023 12:58:08 -0700
663+Subject: [PATCH] accel/tcg: Clear tcg_ctx->gen_tb on buffer overflow
664+
665+On overflow of code_gen_buffer, we unlock the guest pages we had been
666+translating, but failed to clear gen_tb. On restart, if we cannot
667+allocate a TB, we exit to the main loop to perform the flush of all
668+TBs as soon as possible. With garbage in gen_tb, we hit an assert:
669+
670+../src/accel/tcg/tb-maint.c:348:page_unlock__debug: \
671+ assertion failed: (page_is_locked(pd))
672+
673+Fixes: deba78709ae8 ("accel/tcg: Always lock pages before translation")
674+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
675+
676+Origin: upstream, https://gitlab.com/qemu-project/qemu/commit/ad17868eb1
677+Bug-Ubuntu: https://bugs.launchpad.net/bugs/2051965
678+Last-Update: 2024-02-13
679+
680+---
681+ accel/tcg/translate-all.c | 1 +
682+ 1 file changed, 1 insertion(+)
683+
684+--- qemu-tcg-crash-mantic.orig/accel/tcg/translate-all.c 2024-02-13 18:24:31.688184016 -0500
685++++ qemu-tcg-crash-mantic/accel/tcg/translate-all.c 2024-02-13 18:24:31.684184024 -0500
686+@@ -384,6 +384,7 @@ TranslationBlock *tb_gen_code(CPUState *
687+ "Restarting code generation for "
688+ "code_gen_buffer overflow\n");
689+ tb_unlock_pages(tb);
690++ tcg_ctx->gen_tb = NULL;
691+ goto buffer_overflow;
692+
693+ case -2:
694diff --git a/debian/patches/ubuntu/lp-2051965-accel-tcg-Split-out-cpu_exec_longjmp_cleanup.patch b/debian/patches/ubuntu/lp-2051965-accel-tcg-Split-out-cpu_exec_longjmp_cleanup.patch
695new file mode 100644
696index 0000000..9398854
697--- /dev/null
698+++ b/debian/patches/ubuntu/lp-2051965-accel-tcg-Split-out-cpu_exec_longjmp_cleanup.patch
699@@ -0,0 +1,90 @@
700+From cb62bd15e14e304617d250158b77d0deb032f032 Mon Sep 17 00:00:00 2001
701+From: Richard Henderson <richard.henderson@linaro.org>
702+Date: Thu, 6 Jul 2023 08:45:13 +0100
703+Subject: [PATCH] accel/tcg: Split out cpu_exec_longjmp_cleanup
704+MIME-Version: 1.0
705+Content-Type: text/plain; charset=UTF-8
706+Content-Transfer-Encoding: 8bit
707+
708+Share the setjmp cleanup between cpu_exec_step_atomic
709+and cpu_exec_setjmp.
710+
711+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
712+Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
713+Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
714+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
715+
716+Origin: backport, https://gitlab.com/qemu-project/qemu/commit/cb62bd15e1
717+Bug-Ubuntu: https://bugs.launchpad.net/bugs/2051965
718+Last-Update: 2024-02-13
719+
720+---
721+ accel/tcg/cpu-exec.c | 43 +++++++++++++++++++------------------------
722+ 1 file changed, 19 insertions(+), 24 deletions(-)
723+
724+--- qemu-tcg-crash-mantic.orig/accel/tcg/cpu-exec.c 2024-02-13 19:51:45.182071910 -0500
725++++ qemu-tcg-crash-mantic/accel/tcg/cpu-exec.c 2024-02-13 19:54:41.056353084 -0500
726+@@ -529,6 +529,23 @@ static void cpu_exec_exit(CPUState *cpu)
727+ }
728+ }
729+
730++static void cpu_exec_longjmp_cleanup(CPUState *cpu)
731++{
732++ /* Non-buggy compilers preserve this; assert the correct value. */
733++ g_assert(cpu == current_cpu);
734++
735++#ifndef CONFIG_SOFTMMU
736++ clear_helper_retaddr();
737++ if (have_mmap_lock()) {
738++ mmap_unlock();
739++ }
740++#endif
741++ if (qemu_mutex_iothread_locked()) {
742++ qemu_mutex_unlock_iothread();
743++ }
744++ assert_no_pages_locked();
745++}
746++
747+ void cpu_exec_step_atomic(CPUState *cpu)
748+ {
749+ CPUArchState *env = cpu->env_ptr;
750+@@ -570,16 +587,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
751+ cpu_tb_exec(cpu, tb, &tb_exit);
752+ cpu_exec_exit(cpu);
753+ } else {
754+-#ifndef CONFIG_SOFTMMU
755+- clear_helper_retaddr();
756+- if (have_mmap_lock()) {
757+- mmap_unlock();
758+- }
759+-#endif
760+- if (qemu_mutex_iothread_locked()) {
761+- qemu_mutex_unlock_iothread();
762+- }
763+- assert_no_pages_locked();
764++ cpu_exec_longjmp_cleanup(cpu);
765+ }
766+
767+ /*
768+@@ -1024,20 +1032,7 @@ static int cpu_exec_setjmp(CPUState *cpu
769+ {
770+ /* Prepare setjmp context for exception handling. */
771+ if (unlikely(sigsetjmp(cpu->jmp_env, 0) != 0)) {
772+- /* Non-buggy compilers preserve this; assert the correct value. */
773+- g_assert(cpu == current_cpu);
774+-
775+-#ifndef CONFIG_SOFTMMU
776+- clear_helper_retaddr();
777+- if (have_mmap_lock()) {
778+- mmap_unlock();
779+- }
780+-#endif
781+- if (qemu_mutex_iothread_locked()) {
782+- qemu_mutex_unlock_iothread();
783+- }
784+-
785+- assert_no_pages_locked();
786++ cpu_exec_longjmp_cleanup(cpu);
787+ }
788+
789+ return cpu_exec_loop(cpu, sc);
