Merge ~p-pisati/britney/+git/hints-ubuntu:devel-skiptest-meta into ~ubuntu-release/britney/+git/hints-ubuntu:devel

Proposed by Paolo Pisati
Status: Merged
Merged at revision: cabb95655f8d54a2d406fe140a07992582aca370
Proposed branch: ~p-pisati/britney/+git/hints-ubuntu:devel-skiptest-meta
Merge into: ~ubuntu-release/britney/+git/hints-ubuntu:devel
Diff against target: 17 lines (+9/-0)
1 file modified
ubuntu-release (+9/-0)
Reviewer                    Review Type  Date Requested  Status
Steve Langasek                                           Approve
Timo Aaltonen (community)                                Approve
Brian Murray                                             Pending
Review via email: mp+461043@code.launchpad.net

Commit message

hint two flaky tests (dracut/amd64) and (initramfs-tools/s390x) while everything else looks sound

Benjamin Drung (bdrung) wrote :

Have you proof that these tests are flaky and not just starting to fail with linux-meta/6.8.0-11.11+1?

Paolo Pisati (p-pisati) wrote :

Yes, all the previous uploads were fine, and 11.11+1 is a no-change respin to fix some dependencies in -meta and -lrm; the kernel binary (i.e. 11.11) didn't change at all.

Timo Aaltonen (tjaalton) :
review: Approve
Brian Murray (brian-murray) wrote (last edit ):

> Yes, all the previous uploads were fine, and 11.11+1 is a no-change respin to
> fix some dependencies in -meta and -lrm; the kernel binary (i.e. 11.11)
> didn't change at all.

Generally speaking for these types of merge proposals we prefer to be able to inspect the logs ourselves. Is there something you can point us to?

Paolo Pisati (p-pisati) wrote (last edit ):
Graham Inggs (ginggs) wrote :

Both dracut/amd64 and initramfs-tools/s390x passed eventually, so no need for a hint for these.

However, all the linux-meta* packages seem to regress glibc/2.39-0ubuntu1, which has just migrated.

I have just retried the glibc tests against linux-meta/6.8.0-11.11+1 with additional triggers linux-signed/6.8.0-11.11 and linux/6.8.0-11.11.

Steve Langasek (vorlon) wrote :

These appear to be real regressions in the glibc tests with the new kernel. The following tests have moved from 'UNSUPPORTED' to 'FAIL':
debug/tst-fortify-syslog
elf/tst-dlopen-self-container
elf/tst-dlopen-tlsmodid-container
elf/tst-glibc-hwcaps-2-cache
elf/tst-glibc-hwcaps-cache
elf/tst-glibc-hwcaps-prepend-cache

And the tests show an error such as:

 3052s error: test-container.c:1136: could not create a private mount namespace

We should not simply let the glibc autopkgtests regress here as that is a bad regression in test coverage. It needs to get sorted out - there at least needs to be agreement that it's a bug on the glibc side and that the glibc tests will need to be fixed, before we ignore the failure.
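
The move from 'UNSUPPORTED' to 'FAIL' described above can be checked mechanically by comparing two test logs. The following is only a minimal sketch: it assumes result lines of the form `STATUS: dir/test-name`, optionally prefixed by an autopkgtest timestamp such as `2770s`, as seen in the excerpts in this thread. The function names and both sample logs are hypothetical and made up for illustration; they are not real test output.

```python
import re

# Matches result lines such as "2770s FAIL: stdlib/tst-system".
# The leading "<seconds>s " timestamp is treated as optional.
RESULT_RE = re.compile(
    r"^\s*(?:\d+s\s+)?(PASS|FAIL|XFAIL|XPASS|UNSUPPORTED):\s+(\S+)\s*$"
)


def parse_results(log_text):
    """Map each test name to the last status reported for it."""
    results = {}
    for line in log_text.splitlines():
        match = RESULT_RE.match(line)
        if match:
            status, name = match.groups()
            results[name] = status
    return results


def status_changes(old_log, new_log, old_status="UNSUPPORTED", new_status="FAIL"):
    """Return sorted test names that went from old_status to new_status."""
    old = parse_results(old_log)
    new = parse_results(new_log)
    return sorted(
        name
        for name, status in new.items()
        if status == new_status and old.get(name) == old_status
    )


# Hypothetical sample logs, for illustration only.
OLD_LOG = """\
100s UNSUPPORTED: debug/tst-fortify-syslog
100s UNSUPPORTED: elf/tst-glibc-hwcaps-cache
100s PASS: stdlib/tst-system
"""

NEW_LOG = """\
200s FAIL: debug/tst-fortify-syslog
200s FAIL: elf/tst-glibc-hwcaps-cache
200s error: test-container.c:1136: could not create a private mount namespace
200s PASS: stdlib/tst-system
"""

if __name__ == "__main__":
    for name in status_changes(OLD_LOG, NEW_LOG):
        print(name)
```

Lines that are not result lines, such as the `error:` line, are ignored, so the comparison only reflects per-test status.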

review: Needs Information
Paolo Pisati (p-pisati) wrote :

We think it's LP: #2046844, and JJ is looking into it.

In the meantime, I updated the hint to cover all kernels blocked by failing glibc autopkgtests.

Paolo Pisati (p-pisati) wrote :

Citing schopin (see comment #43 on LP: #2046844):

"We had a mitigation for this in glibc, but the latest change from simply denying the unshare() call to allowing it but then denying anything requiring capabilities *presumably* broke the glibc test suite again. I'm only basing this on looking at the test logs, as I'm temporarily unable to run autopkgtests locally and am lacking the time to fix it.

2 classes of errors:

2770s FAIL: stdlib/tst-system
2770s original exit status 1
2770s error: test-container.c:1136: could not create a private mount namespace

That one is clearly userns-related, as it's due to a failing mount() call right after unshare()

2770s FAIL: sunrpc/tst-svc_register
2770s original exit status 1
2770s error: xwrite.c:32: write of 12 bytes failed after 0: Operation not permitted
2770s error: 1 test failures

I can't tell for sure what this one is about since this is your basic write() call and I don't have a stack trace at hand, but the EPERM would suggest that it's related.

I think a first fix would be to amend the test script to disable the userns restriction entirely for the duration of the tests (using 'needs-sudo'), while I'll still need to patch the test suite eventually to handle this new failure mode gracefully and simply ignore the tests, akin to https://sourceware.org/pipermail/libc-alpha/2024-February/154754.html"
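
To tell the two behaviours apart on a given machine (unshare() denied outright versus unshare() allowed but the following mount() denied), a small probe along these lines might help. This is a rough sketch, not what the glibc test suite does verbatim: it assumes Linux, calls libc through ctypes in a forked child so the parent's namespaces are untouched, and the function name and result strings are invented for illustration. The constants are the usual Linux values for CLONE_NEWNS, CLONE_NEWUSER, MS_REC and MS_PRIVATE.

```python
import ctypes
import errno
import os

# Linux constants (from <sched.h> and <sys/mount.h>).
CLONE_NEWNS = 0x00020000
CLONE_NEWUSER = 0x10000000
MS_REC = 0x4000
MS_PRIVATE = 1 << 18


def _probe_in_child():
    """Try unshare() followed by a private remount of "/"."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.unshare.argtypes = [ctypes.c_int]
    libc.mount.argtypes = [
        ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
        ctypes.c_ulong, ctypes.c_void_p,
    ]
    if libc.unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0:
        return "unshare-failed:" + errno.errorcode.get(ctypes.get_errno(), "?")
    if libc.mount(b"none", b"/", None, MS_REC | MS_PRIVATE, None) != 0:
        return "mount-failed:" + errno.errorcode.get(ctypes.get_errno(), "?")
    return "ok"


def probe_private_mount_namespace():
    """Return 'ok', 'unshare-failed:<errno>', 'mount-failed:<errno>' or 'probe-error:<why>'."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: run the probe, report through the pipe, never return.
        os.close(read_fd)
        try:
            result = _probe_in_child()
        except Exception as exc:  # e.g. symbol missing on a non-Linux libc
            result = "probe-error:" + type(exc).__name__
        try:
            os.write(write_fd, result.encode())
        finally:
            os._exit(0)
    # Parent: collect the child's report.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        report = pipe.read()
    os.waitpid(pid, 0)
    return report or "probe-error:no-report"


if __name__ == "__main__":
    print(probe_private_mount_namespace())
```

A result starting with `mount-failed` would match the first class of error quoted above, where unshare() succeeds and the mount() right after it is refused. The output depends entirely on the kernel and security policy of the machine it runs on.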

Simon Chopin (schopin) wrote :

From the glibc side, I agree that it's something that needs to be fixed in glibc rather than in the kernel and I'll be working on it in the coming weeks. Better to hint it for now.

Steve Langasek (vorlon) :
review: Approve

Preview Diff

diff --git a/ubuntu-release b/ubuntu-release
index 8e605b4..5cf80cd 100644
--- a/ubuntu-release
+++ b/ubuntu-release
@@ -36,3 +36,12 @@ force-badtest nvidia-graphics-drivers-390/all/armhf
 
 # buggy test on s390x, will be skipped in next upload (LP: #2052794)
 force-badtest glib2.0/2.79.1-1/s390x
+
+# hint AA regression affecting glibc autopkgtests and stalling all kernels
+# in -proposed, see LP: #2046844
+force-skiptest linux-meta/6.8.0-11.11+1
+force-skiptest linux-meta-aws/6.8.0-1001.1
+force-skiptest linux-meta-azure/6.8.0-1001.1
+force-skiptest linux-meta-gcp/6.8.0-1002.2
+force-skiptest linux-meta-lowlatency/6.8.0-7.7.1
+force-skiptest linux-meta-oracle/6.8.0-1001.1
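
For anyone wanting to sanity-check a hints file like the one in this diff, the lines can be read with a few lines of Python. This is a minimal sketch and not britney's own parser: it assumes one `<hint> <source>/<version>[/<arch>]` entry per line with `#` comments, which is all that appears in the diff above. The `Hint` type and `parse_hints` function are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Hint(NamedTuple):
    kind: str            # e.g. "force-skiptest" or "force-badtest"
    source: str          # source package name
    version: str         # version the hint applies to
    arch: Optional[str]  # architecture, if the hint names one


def parse_hints(text: str) -> List[Hint]:
    """Parse hint lines, skipping blank lines and '#' comments."""
    hints = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        kind, _, target = line.partition(" ")
        parts = target.strip().split("/")
        if len(parts) < 2:
            continue  # not in the assumed source/version[/arch] shape
        arch = parts[2] if len(parts) > 2 else None
        hints.append(Hint(kind, parts[0], parts[1], arch))
    return hints


# Lines copied from the diff in this merge proposal.
SAMPLE = """\
# buggy test on s390x, will be skipped in next upload (LP: #2052794)
force-badtest glib2.0/2.79.1-1/s390x

# hint AA regression affecting glibc autopkgtests and stalling all kernels
# in -proposed, see LP: #2046844
force-skiptest linux-meta/6.8.0-11.11+1
force-skiptest linux-meta-aws/6.8.0-1001.1
force-skiptest linux-meta-azure/6.8.0-1001.1
force-skiptest linux-meta-gcp/6.8.0-1002.2
force-skiptest linux-meta-lowlatency/6.8.0-7.7.1
force-skiptest linux-meta-oracle/6.8.0-1001.1
"""

if __name__ == "__main__":
    for hint in parse_hints(SAMPLE):
        print(hint)
```

In this sample the force-skiptest entries carry no architecture component, while the force-badtest entry names one (s390x).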
