Merge ~ubuntu-release/britney/+git/hints-ubuntu:lunar-badtest-samba-ppc64el-slow-io into ~ubuntu-release/britney/+git/hints-ubuntu:devel

Proposed by Andreas Hasenack
Status: Merged
Merged at revision: 48e5a28f17278cc81f85bdc651d2a74b84ae16f5
Proposed branch: ~ubuntu-release/britney/+git/hints-ubuntu:lunar-badtest-samba-ppc64el-slow-io
Merge into: ~ubuntu-release/britney/+git/hints-ubuntu:devel
Diff against target: 13 lines (+5/-0)
1 file modified
ubuntu-release (+5/-0)
Reviewer         Review Type    Date Requested    Status
Brian Murray                                      Approve
Utkarsh Gupta                                     Approve
Review via email: mp+439411@code.launchpad.net

Description of the change

See https://bugs.launchpad.net/ubuntu/+source/samba/+bug/2012432 and https://github.com/lxc/lxd/issues/11501

On ppc64el, a "lxc launch" command is consistently failing[1] with:

Error: Failed instance creation: write unix @->/var/snap/lxd/common/lxd/unix.socket: i/o timeout

I prepared a package with more lxd debugging and ran the DEP8 tests from a PPA on the autopkgtest infrastructure for both amd64 and ppc64el. On amd64 the test passed[3], and the image download there was 10x faster than on ppc64el, hinting at a network bottleneck. On ppc64el it failed, and the log now has more details about the failure.

lxd has a hardcoded 30s timeout on some operations, and unpacking+starting a container is one of those. Unpacking the image on the ppc64el DEP8 vm took about 60s[2], and that failed the operation:

Mar 22 13:16:45 autopkgtest lxd.daemon[4292]: time="2023-03-22T13:16:45Z" level=info msg="Image unpack started" imageFile=/var/snap/lxd/common/lxd/images/6e24b9324ad0e87a7ff8387d4d7bbea6478342056b1c083565650b6e7f002e89 volName=member-server
Mar 22 13:16:46 autopkgtest lxd.daemon[4292]: time="2023-03-22T13:16:46Z" level=debug msg="Updated metadata for operation" class=task description="Creating instance" operation=371f9ac1-7113-47f9-895b-2cdce855ab3c project=default
Mar 22 13:17:34 autopkgtest lxd.daemon[4292]: time="2023-03-22T13:17:34Z" level=debug msg="Event listener server handler stopped" listener=5b1087ea-c641-4cda-8f5b-087e4e8f06bb local=/var/snap/lxd/common/lxd/unix.socket remote=@
Mar 22 13:17:35 autopkgtest lxd.daemon[4292]: time="2023-03-22T13:17:34Z" level=debug msg="Instance operation lock finished" action=create err="Instance \"create\" operation timed out after 30s" instance=member-server project=default reusable=false
...
Mar 22 13:17:44 autopkgtest lxd.daemon[4292]: time="2023-03-22T13:17:44Z" level=info msg="Image unpack stopped" imageFile=/var/snap/lxd/common/lxd/images/6e24b9324ad0e87a7ff8387d4d7bbea6478342056b1c083565650b6e7f002e89 volName=member-server

The 30s timeout was hit before the image was fully unpacked.
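
The timestamps in the log excerpt above make the gap concrete. As a small sketch (the helper name elapsed_seconds is only illustrative and not part of lxd), the unpack duration can be computed from the "Image unpack started" and "Image unpack stopped" entries and compared with the 30s limit:

```python
from datetime import datetime

# The hardcoded lxd limit reported in the "operation timed out after 30s" entry.
TIMEOUT_SECONDS = 30


def elapsed_seconds(start, stop):
    """Return the number of seconds between two lxd log timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    delta = datetime.strptime(stop, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the "Image unpack started/stopped" log entries.
unpack = elapsed_seconds("2023-03-22T13:16:45Z", "2023-03-22T13:17:44Z")
print(unpack, unpack > TIMEOUT_SECONDS)  # prints "59.0 True"
```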

LXD upstream might consider changing this timeout (see the lxd issue linked above), but in the meantime there aren't many options for this test on ppc64el until the bottleneck is resolved.

What I can consider for further uploads, if there is no short-term performance fix, is to change how the container is launched: use "lxc init" and ignore errors from it, poll "lxc operation list" until the container is created, and then issue "lxc start".
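
A minimal sketch of that idea, written in Python rather than in the test's own language, might look like the following. It is not what the samba DEP8 test does today. The function name, the poll interval, and the retry limit are all hypothetical. It also assumes, without having verified it, that a pending create shows up in the "lxc operation list" output with the "Creating instance" description seen in the log above, and that it disappears once the container exists.

```python
import subprocess
import time


def launch_with_polling(image, name, run=subprocess.run, sleep=time.sleep,
                        poll_interval=5, max_polls=60):
    """Create and start a container without relying on "lxc launch"."""
    # "lxc init" may hit the 30s i/o timeout, so its exit status is ignored.
    run(["lxc", "init", image, name], check=False)

    for _ in range(max_polls):
        result = run(["lxc", "operation", "list"],
                     check=False, capture_output=True, text=True)
        # ASSUMPTION: a create that is still running is listed with this
        # description; once it is gone, the container is taken to exist.
        if result.returncode == 0 and "Creating instance" not in result.stdout:
            break
        sleep(poll_interval)
    else:
        raise TimeoutError(f"{name} was not created after {max_polls} polls")

    # Only start the container once the create operation has gone away.
    run(["lxc", "start", name], check=True)
```

The run and sleep parameters exist only so the flow can be exercised without a real lxd. In the test itself, the same three steps would more likely be a few lines of shell.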

In the meantime, I am requesting that this ppc64el failure be ignored for the current samba upload.

1. https://autopkgtest.ubuntu.com/packages/s/samba/lunar/ppc64el
2. https://autopkgtest.ubuntu.com/results/autopkgtest-lunar-ahasenack-samba-dep8-troubleshooting/lunar/ppc64el/s/samba/20230322_131759_d1066@/log.gz (search for "13:16:45"; two log entries with that timestamp contain the relevant info)
3. https://autopkgtest.ubuntu.com/results/autopkgtest-lunar-ahasenack-samba-dep8-troubleshooting/lunar/amd64/s/samba/20230322_133523_34033@/log.gz

Utkarsh Gupta (utkarsh) :
review: Approve
Brian Murray (brian-murray) :
review: Approve

Preview Diff

diff --git a/ubuntu-release b/ubuntu-release
index 8119aba..df4db39 100644
--- a/ubuntu-release
+++ b/ubuntu-release
@@ -41,3 +41,8 @@ force-badtest boost1.81/blacklisted/arm64
 # Not installable on i386, allow back in
 force-badtest dm-writeboost/2.2.16-0.1/i386
 force-badtest ruby-omniauth-google-oauth2/1.1.1-2/i386
+
+# Might be a temporary issue on ppc64el, but it's been going on for a few days and
+# blocking the migration. Tests pass in other arches.
+# See LP: #2012432 and https://github.com/lxc/lxd/issues/11501
+force-skiptest samba/2:4.17.5+dfsg-2ubuntu3/ppc64el
