systemd-logind must be restarted every ~1000 SSH logins to prevent a ~25 second delay

Bug #1591411 reported by Mike Pontillo
This bug affects 67 people
Affects            Status        Importance  Assigned to     Milestone
D-Bus              Fix Released  Medium
systemd            Fix Released  Unknown
dbus (Ubuntu)      Fix Released  Medium      Łukasz Zemczak
  Xenial           Fix Released  Medium      Łukasz Zemczak
  Yakkety          Won't Fix     Medium      Łukasz Zemczak
systemd (Ubuntu)   Fix Released  Medium      Unassigned
  Xenial           Invalid       Medium      Unassigned
  Yakkety          Invalid       Undecided   Unassigned

Bug Description

[Impact]

The bug affects multiple users and introduces a user-visible delay (~25 seconds) on SSH connections after a large number of sessions have been processed. This has a serious impact on big systems and servers running our software.

The currently proposed fix is actually a safe workaround for the bug as proposed by the dbus upstream. The workaround makes uid 0 immune to the pending_fd_timeout limit that kicks in and causes the original issue.

[Test Case]

lxc launch ubuntu:x test
lxc exec test -- login -f ubuntu
ssh-import-id <whatever>

Then run a script as follows (passing in ubuntu@<container-ip>):

while [ 1 ]; do
    (time ssh $1 "echo OK > /dev/null") 2>&1 | grep ^real >> log
done

Then check the log file for any SSH sessions that take 25+ seconds to complete.

Multiple instances of the same script can be used at the same time.
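
A minimal sketch of running several copies of the loop at once, assuming bash (the timing relies on the bash `time` keyword, as the loop above does). The function name `flood`, the `SSH_CMD` override, and the per-worker `log.N` file names are illustrative and not part of the original test case.

```shell
#!/usr/bin/env bash
# Sketch: run several copies of the timing loop in parallel.
# flood, SSH_CMD and the log.N file names are hypothetical, not from the report.

# flood TARGET [WORKERS] [LOGINS_PER_WORKER]
flood() {
    local target="$1" workers="${2:-4}" count="${3:-1000}"
    local ssh_cmd="${SSH_CMD:-ssh}"   # override to dry-run without a real host
    local w i
    for ((w = 1; w <= workers; w++)); do
        (
            for ((i = 0; i < count; i++)); do
                # Same measurement as the single loop, one log file per worker.
                { time "$ssh_cmd" "$target" "echo OK > /dev/null"; } 2>&1 |
                    grep '^real' >> "log.$w"
            done
        ) &
    done
    wait   # block until every worker has finished
}
```

For example, `flood ubuntu@<container-ip> 8 2000` would start eight workers of 2000 logins each and write log.1 through log.8 in the current directory.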

[Regression Potential]

The fix has a rather low regression potential as the workaround is a very small change only affecting one particular case - handling of uid 0. The fix has been tested by multiple users and has been around in zesty for a while, with multiple people involved in reviewing the change. It's also a change that has been proposed by upstream.

[Original Description]

I noticed on a system that accepts large numbers of SSH connections that after a while, SSH sessions were taking ~25 seconds to complete.

Looking in /var/log/auth.log, systemd-logind starts failing with the following:

Jun 10 23:55:28 test sshd[3666]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
Jun 10 23:55:28 test systemd-logind[105]: New session c1052 of user ubuntu.
Jun 10 23:55:28 test systemd-logind[105]: Failed to abandon session scope: Transport endpoint is not connected
Jun 10 23:55:28 test sshd[3666]: pam_systemd(sshd:session): Failed to create session: Message recipient disconnected from message bus without replying

I reproduced this in an LXD container by doing something like:

lxc launch ubuntu:x test
lxc exec test -- login -f ubuntu
ssh-import-id <whatever>

Then ran a script as follows (passing in ubuntu@<container-ip>):

while [ 1 ]; do
    (time ssh $1 "echo OK > /dev/null") 2>&1 | grep ^real >> log
done

In my case, after 1052 logins, the 1053rd and thereafter were taking 25+ seconds to complete. Here are some snippets from the log file:

$ cat log | grep 0m0 | wc -l
1052

$ cat log | grep 0m25 | wc -l
4

$ tail -5 log
real 0m0.222s
real 0m25.232s
real 0m25.235s
real 0m25.236s
real 0m25.239s
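
The grep counts above match on string prefixes (0m0, 0m25). A hedged alternative is to parse the durations and compare them against a threshold. The function name `count_slow` and the 20-second default are assumptions chosen for illustration.

```shell
# Sketch: count logins at or above a threshold in a log of "real 0mS.SSSs" lines.
# count_slow and its default threshold are hypothetical helpers.

# count_slow LOGFILE [THRESHOLD_SECONDS]  ->  prints "<total> <slow>"
count_slow() {
    awk -v limit="${2:-20}" '
        $1 == "real" {
            # $2 looks like 0m25.232s: minutes before the "m", seconds after.
            split($2, t, "m")
            secs = t[1] * 60 + t[2]   # the trailing "s" is ignored in arithmetic
            total++
            if (secs >= limit) slow++
        }
        END { printf "%d %d\n", total, slow }
    ' "$1"
}
```

Run as `count_slow log`; the first number is the total of timed logins and the second is how many met or exceeded the threshold.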

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: systemd 229-4ubuntu5
ProcVersionSignature: Ubuntu 4.4.0-22.40-generic 4.4.8
Uname: Linux 4.4.0-22-generic x86_64
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
Date: Sat Jun 11 00:09:34 2016
MachineType: Notebook W230SS
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-22-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash
SourcePackage: systemd
SystemdDelta:
 [EXTENDED] /lib/systemd/system/rc-local.service → /lib/systemd/system/rc-local.service.d/debian.conf
 [EXTENDED] /lib/systemd/system/systemd-timesyncd.service → /lib/systemd/system/systemd-timesyncd.service.d/disable-with-time-daemon.conf

 2 overridden configuration files found.
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/15/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 4.6.5
dmi.board.asset.tag: Tag 12345
dmi.board.name: W230SS
dmi.board.vendor: Notebook
dmi.board.version: Not Applicable
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 9
dmi.chassis.vendor: Notebook
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr4.6.5:bd04/15/2014:svnNotebook:pnW230SS:pvrNotApplicable:rvnNotebook:rnW230SS:rvrNotApplicable:cvnNotebook:ct9:cvrN/A:
dmi.product.name: W230SS
dmi.product.version: Not Applicable
dmi.sys.vendor: Notebook

Revision history for this message
Mike Pontillo (mpontillo) wrote :

Here's my (sad) workaround:

$ sudo crontab -l
1,11,21,31,41,51 * * * * service systemd-logind restart
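
The cron entry restarts systemd-logind unconditionally. A hedged variation is to restart only when the failure message quoted in the description shows up near the end of the auth log. This is a heuristic sketch: the function name `logind_failing` and the 200-line window are assumptions, and old failure lines still inside the window would trigger another restart.

```shell
# Sketch: detect the pam_systemd failure in the most recent auth.log lines.
# logind_failing and the 200-line window are hypothetical choices.

# logind_failing [LOGFILE] [LINES]  ->  exit status 0 if the failure is present
logind_failing() {
    tail -n "${2:-200}" "${1:-/var/log/auth.log}" |
        grep -qF 'pam_systemd(sshd:session): Failed to create session'
}

# Possible use from root's crontab or a wrapper script:
#   logind_failing && service systemd-logind restart
```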

description: updated
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status: New → Confirmed
Christian Reis (kiko)
tags: added: cdo-qa-blocker
Changed in systemd (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Stan Hu (stanhu) wrote :

There are two patches that relate to this problem.

systemd: https://github.com/systemd/systemd/pull/3191

dbus: https://bugs.freedesktop.org/show_bug.cgi?id=95264

The systemd patch was merged in https://github.com/systemd/systemd/commit/5ab42bc85a11a5250dcdf8e86291d3da90aa84bd and released in systemd v230 and v231. Ubuntu appears to be on 229-4ubuntu7.

The dbus patch was merged in master but may not be released yet.

Revision history for this message
Stan Hu (stanhu) wrote :

This is biting us and our customers quite a bit.

More relevant links: https://github.com/systemd/systemd/issues/1961

dbus patch: https://bugs.freedesktop.org/attachment.cgi?id=123493

Revision history for this message
Pablo Carranza (pcarranza) wrote :

Hi, for back reference you can see our progress in the GitLab.com issue: https://gitlab.com/gitlab-com/infrastructure/issues/290

You can see a graph of behavior before and after the dbus update and the host reboot here: https://gitlab.com/gitlab-com/infrastructure/issues/290#note_13607928

I think there is a lot of value in getting this into Ubuntu LTS because it will solve the lockups for SSH session creation under heavy load.

You can check realtime behavior here: http://monitor.gitlab.net/dashboard/db/gitlab-status

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Pablo and Stan!

I could try sponsoring the patches that you need for fixing these issues. Just so I know the solution story: to get this issue fixed both systemd and dbus need their patches, yes?

Changed in systemd (Ubuntu):
assignee: nobody → Łukasz Zemczak (sil2100)
Changed in systemd (Ubuntu Xenial):
status: New → Confirmed
importance: Undecided → Medium
Changed in systemd (Ubuntu):
assignee: Łukasz Zemczak (sil2100) → nobody
status: Confirmed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

I backported the 230 fix for systemd to the xenial version of the package and started a test-build in my PPA. The patch didn't apply cleanly so I had to do some manual interventions - also, the changeset is pretty big itself. Since I am not a systemd maintainer I would prefer Martin to take a look at the patch before proceeding, so for now I am attaching the debdiff here for further review (along with links to the PPA, in case the package builds fine as I basically just did the dput).

But I also heard that possibly only the dbus fix might be sufficient. In case anyone confirms that I will prepare a new dbus distro-release for yakkety and xenial SRU.

PPA used for testing: https://launchpad.net/~sil2100/+archive/ubuntu/ppa

Thank you for your patience.

Revision history for this message
Mike Pontillo (mpontillo) wrote :

I would like to thank the GitLab team for their excellent work triaging this issue (and getting a patch ready). Very nice work. I tested the version of dbus in Stan Hu's PPA here:

https://launchpad.net/~stanhu/+archive/ubuntu/dbus

After updating dbus to the version in this PPA, I ran my "ssh to a container" test (which I used as a test case to reproduce the bug when filing this report), and also ran it on another test system that was experiencing this issue with a real-world use case.

This time, I was able to SSH into the system several thousand times, and everything worked fine.

Next I turned it up to eleven by running eight continuous-SSH scripts in a loop. In a minute or two, it fell over and went back to the 25-second delay behavior. So while the behavior is *much* improved with the dbus patch, there are still lingering issues, and I think we should consider patching systemd as well (in addition to triaging further to determine if there is a larger design flaw that can be fixed separately).

I think it's worth patching dbus alone as a first step. I will test Łukasz's systemd PPA to see if that further improves things.

Thanks again to everyone in the community who helped pull together a fix!

Revision history for this message
Mike Pontillo (mpontillo) wrote :

Actually, please disregard the portion of my previous comment where I suspected we should consider patching systemd as well. I no longer think that is necessary. (My test case was flawed.) After correcting the issue (ensuring I was running with the properly-updated dbus fix), I was able to run eight parallel continuous-SSH scripts against the LXC with the fixed dbus (without the systemd patch).

ubuntu@test:~$ uptime
 20:13:55 up 4 min, 1 user, load average: 5.87, 3.70, 2.01

Over 10,000 sessions so far, and no issues. Ship it!

tags: added: patch
Revision history for this message
Stan Hu (stanhu) wrote :

That's correct: only the dbus patch is absolutely necessary, but since the patch wasn't merged yet to dbus I am still not 100% sure if it is considered ready for prime time. It seems to work for us.

As Łukasz observed, the systemd patch is a lot more extensive. Even though it was merged to master, we were only going to attempt to backport the fix if it were absolutely necessary. It doesn't look like it was in this case, so we did not attempt to do it.

Revision history for this message
Pablo Carranza (pcarranza) wrote :

Thanks for sponsoring Mike!

Revision history for this message
Pablo Carranza (pcarranza) wrote :

And Łukasz :)

Revision history for this message
Stan Hu (stanhu) wrote :

FYI, the dbus patches were written by Lennart Poettering and submitted here for review: https://bugs.freedesktop.org/show_bug.cgi?id=95263#c13

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Ok, with everyone confirming that the systemd patch is not required, I am closing the systemd part of the bug as 'Invalid' - let's only concentrate on the dbus part here. That being said, I would not like to release a new patch for dbus downstream if the patch hasn't been fully reviewed and approved upstream.

In this case I would propose to wait a bit and see if a finalized patch will be available.

tags: removed: patch
Changed in systemd (Ubuntu Xenial):
status: Confirmed → Invalid
Changed in dbus (Ubuntu):
status: New → Confirmed
Changed in dbus (Ubuntu Xenial):
status: New → Confirmed
tags: added: patch
Revision history for this message
Simon McVittie (smcv) wrote :

> I am still not 100% sure if it is considered ready for prime time

As far as I can tell, Lennart's proposed patch on fd.o #95263 would reintroduce CVE-2014-3637 (fd.o #80559), a denial of service security vulnerability.

Changed in dbus (Ubuntu Xenial):
importance: Undecided → Medium
Changed in dbus (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Ivan Kozik (ludios) wrote :

Some hints for using Ubuntu 16.04 machines that can't be rebooted to work around this bug:

1) You can keep your SSH logins a secret from systemd-logind by adding `UsePAM no` to /etc/ssh/sshd_config; this will avoid the ~25 second delay.

2) `UsePAM no` requires unlocked accounts (passwd -u) with a password set, even if you are only using publickey authentication.

3) You can use `AuthenticationMethods publickey` to prevent login with the passwords set for those accounts.

4) `su` also uses PAM and therefore informs systemd-logind and hangs for ~25 seconds, but in some cases `ssh user@localhost` can work as a replacement for `su`. There doesn't seem to be a way to configure `su` to not use PAM.

5) If you were relying on PAM to set a ulimit -n (nofile) using /etc/security/limits.conf, you can add something like `LimitNOFILE=131072` to the [Service] section in /etc/systemd/system/sshd.service, then `systemctl daemon-reload && systemctl restart sshd`
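
For hint 5, a hedged alternative to editing the unit file directly is a systemd drop-in. The sketch below only writes the fragment. The function name `write_nofile_dropin`, the file name `limits.conf`, and the drop-in directory used in the usage comment are assumptions; the unit may be named differently on a given system.

```shell
# Sketch: write a drop-in that sets LimitNOFILE for a service.
# write_nofile_dropin and limits.conf are hypothetical names.

# write_nofile_dropin DROPIN_DIR [LIMIT]
write_nofile_dropin() {
    mkdir -p "$1" || return 1
    printf '[Service]\nLimitNOFILE=%s\n' "${2:-131072}" > "$1/limits.conf"
}

# Possible use as root, assuming the unit is called sshd.service:
#   write_nofile_dropin /etc/systemd/system/sshd.service.d 131072
#   systemctl daemon-reload && systemctl restart sshd
```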

Revision history for this message
Laurent Bigonville (bigon) wrote :

I would really advise AGAINST setting "UsePAM" to "no"

This will cause other issues like not properly killing the user applications if the machine is rebooted/shutdown (like https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=751636)

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

In https://launchpad.net/~sil2100/+archive/ubuntu/ppa I did a test-build of dbus with one of the other WIP proposed patches from https://bugs.freedesktop.org/show_bug.cgi?id=95263 if anyone would like to test those and see if they help. Might be good if we gave some additional feedback to the original bug on the patchsets.

Revision history for this message
Ivan Kozik (ludios) wrote :

I don't think there's any change with the dbus from https://launchpad.net/~sil2100/+archive/ubuntu/ppa. I tried it twice (across reboots) and systemd-logind still breaks after a few minutes of flooding the system with SSH logins.

`loginctl list-sessions` also still seems to grow without cleaning up ~10% of old sessions, but that's probably another bug entirely.

Revision history for this message
Jared Biel (jared-biel) wrote :

We have some servers that host git mirrors accessed via SSH that are affected by this. Restarting systemd-logind via cron every 5 minutes works for us (no full system restart necessary.)

Revision history for this message
Ken Baker (bakerkj) wrote :

Are there any updates on fixing this in xenial/16.04?

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Sadly upstream still hasn't agreed on a concrete fix, as the one that was confirmed as working actually reintroduced a security vulnerability. We tried one of the other proposed fixes but it didn't seem to help. I'll try to push a new dbus version with another of the proposed fixes, but there has been no notable movement on the original upstream bug for a long time [1].

[1] https://bugs.freedesktop.org/show_bug.cgi?id=95263

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

I have prepared another xenial dbus package in my PPA containing the second WIP proposed fix from the upstream bug [1]. If you could try reproducing the issue using dbus 1.10.6-1ubuntu3.1~test2 from this PPA, I would be grateful:

https://launchpad.net/~sil2100/+archive/ubuntu/ppa

Same as with the previous package, there is no guarantee this will help. It's one of the changes proposed by the upstream developers to make the situation better, and they would very much welcome some feedback.

Revision history for this message
Ivan Kozik (ludios) wrote :

It works fine with the new test package in your PPA. I ran an SSH login flood for 10 minutes and didn't see systemd-logind fall over. I then purged the PPA and confirmed it was still broken without the test package (it dies after about 2 minutes / 5000 logins).

Revision history for this message
Finn Herpich (galeon) wrote :

I can also confirm that Łukasz's latest version fixes the bug for me.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Excellent, let me forward these comments to the upstream bug. We'll still wait for a few more people to test it out with this change applied and then try to release it to the latest series + back-porting to xenial at least. Of course we can do that instantly once upstream accepts the patch, but I'm sure they'll like some real world feedback as well.

Thanks for testing!

Changed in dbus (Ubuntu Xenial):
assignee: nobody → Łukasz Zemczak (sil2100)
Changed in dbus (Ubuntu):
assignee: nobody → Łukasz Zemczak (sil2100)
Revision history for this message
autostatic (autostatic) wrote :

Tested on an OpenStack 16.04 instance with ten copies of Mike Pontillo's while loop hammering it in parallel. With dbus 1.10.6-1ubuntu3 I quickly got logins taking about 25 seconds; after installing the dbus package from the sil2100 PPA I could no longer reproduce the issue.

Jeremy

Revision history for this message
Ben Parafina (benp) wrote :

Tested with Ubuntu Xenial: with Mike Pontillo's while loop running 16x in parallel, I couldn't reproduce the issue after installing the modified dbus package.

Changed in dbus (Ubuntu Yakkety):
status: New → Confirmed
Changed in systemd (Ubuntu Yakkety):
status: New → Invalid
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

I think I will just go forward and start preparing the release of dbus with this fix in zesty and then backporting it to yakkety and xenial. Upstream didn't seem to officially review the fix or provide any feedback on our test results, but the fix is high-priority enough to consider including it anyway. I will of course get someone to review all this, but I suppose we'll be pushing upstream about it separately.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dbus - 1.10.10-1ubuntu2

---------------
dbus (1.10.10-1ubuntu2) zesty; urgency=medium

  * debian/patches/make-uid-0-immune-to-timeout.patch:
    - Add a test patch proposed by Simon McVittie upstream to fix bug
      LP: #1591411.

 -- Łukasz 'sil2100' Zemczak <email address hidden> Tue, 11 Oct 2016 20:12:43 +0200

Changed in dbus (Ubuntu):
status: Confirmed → Fix Released
Changed in dbus (Ubuntu Yakkety):
assignee: nobody → Łukasz Zemczak (sil2100)
importance: Undecided → Medium
Revision history for this message
Simon McVittie (smcv) wrote :

I have not been able to reproduce this on a Debian (jessie or sid) or Ubuntu (xenial) virtual machine prepared according to the instructions in autopkgtest-virt-qemu(1), even after reducing the pending_fd_timeout limit from 150000 (2.5 minutes) to 150 (150ms) with this configuration in /etc/dbus-1/system-local.conf:

<busconfig>
  <limit name="pending_fd_timeout">150</limit>
</busconfig>
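
When repeating this experiment with different values, the override file can be generated rather than edited by hand. This is a sketch only: the function name `write_fd_timeout_conf` is hypothetical, and the dbus daemon still has to pick up the new configuration before the value takes effect.

```shell
# Sketch: write a system-local.conf that overrides pending_fd_timeout.
# write_fd_timeout_conf is a hypothetical helper; the value is in milliseconds.

# write_fd_timeout_conf CONF_FILE TIMEOUT_MS
write_fd_timeout_conf() {
    cat > "$1" <<EOF
<busconfig>
  <limit name="pending_fd_timeout">$2</limit>
</busconfig>
EOF
}

# Possible use as root:
#   write_fd_timeout_conf /etc/dbus-1/system-local.conf 150
```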

This is with 4 parallel loops repeatedly logging in via ssh, currently at around 280 logins each.

Is there something special that is needed in the OS image to exhibit this failure mode?

Revision history for this message
Simon McVittie (smcv) wrote :

If you can reproduce this issue and you have an expendable machine or container to test it on, I have some more ideas on Bug #95263.

Revision history for this message
Simon McVittie (smcv) wrote :

... er, that should be, I have some more ideas for testing on <https://bugs.freedesktop.org/show_bug.cgi?id=95263>.

Revision history for this message
Shay (shay++) wrote :

Will the fix, 1.10.10-1ubuntu2, enter 16.04 LTS?

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

@Shay: yes, we will prepare an SRU to the currently supported series soon - but please note that the current 'fix' is, in fact, just a workaround. But it works.

@Simon: I can try finding some people who can reproduce this easily, prepare a patched-up version of dbus with both the proposed fixes, and ask them to run tests on it. It would be really cool if a real fix could be found this way. I'll take care of this this week and send feedback here and to the upstream bug.

Thanks again for looking into this further Simon!

Revision history for this message
Pablo Carranza (pcarranza) wrote : Re: [Bug 1591411] Re: systemd-logind must be restarted every ~1000 SSH logins to prevent a ~25 second delay

We can try the proposed patched version in one worker at gitlab. It will
get stressed quite fast

On Nov 14, 2016 13:25, "Łukasz Zemczak" <email address hidden> wrote:

> @Shay: yes, we will prepare an SRU to the currently supported series
> soon - but please note that the current 'fix' is, in fact, just a
> workaround. But it works.
>
> @Simon: I can try finding some people that could reproduce this easily,
> prepare patched-up version of dbus with both the proposed fixes and ask
> them to run tests on them. Would be really cool if a real fix could be
> found this way. I'll take care of this this week and send feedback here
> and the upstream bug.
>
> Thanks again for looking into this further Simon!

description: updated
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

While the workaround is being prepared to get SRUed to the stable releases, I prepared the dbus packages with the two patches Simon proposed for testing.

https://launchpad.net/~sil2100/+archive/ubuntu/ppa

Could anyone that was able to reproduce the original issue install the dbus packages from the above PPA and re-try the tests to see if the issue is reproducible? The following packages have the workaround reverted and the two requested patches applied. I prepared both xenial and zesty packages in the PPA for testing purposes.

Thanks!

description: updated
Revision history for this message
Jan Hübner (jan-huebner) wrote :

I just had the chance to add the PPA and install the package. Rebooted the machine and will report. Ubuntu 16.04.1 LTS.

Changed in dbus (Ubuntu Xenial):
status: Confirmed → In Progress
Changed in dbus (Ubuntu Yakkety):
status: Confirmed → In Progress
Revision history for this message
Stan Hu (stanhu) wrote :

Lukasz, could we get an updated release for Ubuntu 16.04 (xenial)? We're finding that the latest kernel updates are overwriting our custom dbus packages, and we would prefer to have an official release soon. Thank you!

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Stan! The dbus packages for xenial and yakkety were uploaded a long time ago (on the 25th of November actually) but are still sitting in the UNAPPROVED queue for each release. I'll poke the SRU team to take a look at those as soon as possible. I didn't expect this to take so long.

In the meantime, could someone please test the new packages with the new 'proper' proposed fix (as per my PPA above)? This would help upstream in getting rid of the issue without using just this workaround.

Revision history for this message
Kai-Heng Feng (kaihengfeng) wrote :

Hi Lukasz,

I hit this issue after running `systemctl suspend` several hundred times.
I can confirm that the dbus package with 'proper fix' in the PPA fixes the issue for my case.

Revision history for this message
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Mike, or anyone else affected,

Accepted dbus into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/dbus/1.10.6-1ubuntu3.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in dbus (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

Hello Mike, or anyone else affected,

Accepted dbus into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/dbus/1.10.10-1ubuntu1.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in dbus (Ubuntu Yakkety):
status: In Progress → Fix Committed
Revision history for this message
Laurent Perrin (perrin-laurent) wrote :

Hello,

I just tested the package from Xenial proposed repository and it solved the issue for us.

Here is the version of the package dbus: 1.10.6-1ubuntu3.2

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

@Kai-Heng
Thanks for giving those a spin, that's really good news! Would be good if we got a few more people testing those - this way we could revert the workaround and get the real fix released both upstream and downstream.

@Laurent
Thank you for testing the xenial package!

Are there any yakkety users around that could potentially give the dbus package a try as well? We could then switch the bug to verification-done and get both of them released into the updates pocket.

Revision history for this message
Dave Chiluk (chiluk) wrote :

@Lukasz
Looking good so far. Appears resolved with 1.10.6-1ubuntu3.2.

Thanks,
Dave.

tags: added: verification-done
removed: verification-needed
tags: added: verification-done-xenial
removed: verification-done
Revision history for this message
Keith Pawson (keith-nbvk) wrote :

Hello

We are running Gitlab on 16.04 LTS and had this issue which was really causing a lot of frustrations. I was waiting for the update to go through for about a month now and decided to apply this updated version. After applying it via proposed repository it has solved the issue for us as well.

This must be affecting many others out there as well.

Anyway, thank you so much for making this available and hopefully they will release it soon.

Cheers
Keith

Revision history for this message
Florian Bogner (florian-bogner) wrote :

Hi there,

applying the given DBUS package from PROPOSED on 16.04 LTS as described above solved the issue for us as well.

Thanks for all the work involved.

Best regards,

Florian

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

The packages are still stuck in -proposed for xenial as there seems to be an issue with another patch that was released alongside of our fix. Until that's resolved we can't really do much...

In the meantime, is there anyone with a yakkety device that could test the yakkety-proposed packages as well?

Revision history for this message
Jan Hübner (jan-huebner) wrote :

Oh come on, I monitor machines via SSH remote execution of scripts - Icinga logs in every minute. You might get the picture in regard to this bug. We've been waiting for six months now. How should anyone take Ubuntu seriously anymore? *sadface*

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

This is taking much longer than expected; it was supposed to have migrated a long time ago. All because of that additional fix that got attached by another developer, which caused a regression somewhere in another component *sigh*. I re-uploaded a new dbus to xenial-proposed with the other fix reverted (so only having our logind workaround). I will now make sure that this one migrates ASAP (trying to get it in now).

Revision history for this message
Jan Hübner (jan-huebner) wrote :

Thanks for your quick reply. I don't want to sound harsh or be the guy who is always complaining, but this is just so annoying. I can't test packages on production servers or run "testing" packages on them, even if known good, when it's company policy to only run stable/official packages (especially when it's a LTS release).

And be honest: how much more basic from a user/admin perspective than "SSH is working" can it get?

So: sorry for my "outburst" - I felt better afterwards ;) - and obviously something is moving again :)

Kind regards,
Jan

Revision history for this message
Dave Chiluk (chiluk) wrote :

@jan-huebner

Please educate yourself about the stable release process and development process
https://wiki.ubuntu.com/StableReleaseUpdates
https://wiki.ubuntu.com/UbuntuDevelopment

A regression was discovered in another component. This is the reason for the delay. This is very uncommon, but also the entire reason for the SRU process.

If you are capable of contributing in a development manner, I will gladly mentor you or help find a mentor for you. Contributing solutions is the best way to help speed up fixes that you care about in the open-source world.

Since it sounds as though you are using Ubuntu in production for a mission-critical application, perhaps you'd consider financially supporting the project by purchasing a support contract, or donating when you download.
https://buy.ubuntu.com/

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dbus - 1.10.6-1ubuntu3.3

---------------
dbus (1.10.6-1ubuntu3.3) xenial; urgency=medium

  * debian/dbus.user-session.upstart:
    - Temporarily revert latest changes as those seem to cause issues in the
      unity8 session on touch (LP: #1654241).

 -- Łukasz 'sil2100' Zemczak <email address hidden> Thu, 12 Jan 2017 19:01:21 +0100

Changed in dbus (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

A re-poke - could anyone with a yakkety system test the new dbus package in -proposed? I would like it to make its way to yakkety-proposed as well if possible. Thanks!

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

I've tried to reproduce this on a yakkety cloud instance as well as with lxc for some time, but sadly with no success. So I'm not sure whether this is still present in yakkety at all.

Changed in dbus:
importance: Unknown → Medium
status: Unknown → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Can anyone else reproduce the issue on yakkety as-is and test the -proposed dbus packages? They have been stuck in -proposed for too long.

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Change of SRU verification policy

As part of a recent change in the Stable Release Update verification policy, we would like to inform you that for a bug to be considered verified for a given release, a verification-done-$RELEASE tag needs to be added to the bug, where $RELEASE is the name of the series in which the package was tested (e.g. verification-done-xenial). Please note that the global 'verification-done' tag can no longer be used for this purpose.

Thank you!

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Since no one has been able to verify the yakkety version of this SRU for so long, I am dropping the dbus version with the fix for this bug from yakkety-proposed.

Changed in dbus (Ubuntu Yakkety):
status: Fix Committed → Won't Fix
Revision history for this message
Drew Freiberger (afreiberger) wrote :

Can we get this backported to trusty?

tags: added: canonical-bootstack canonical-is
Revision history for this message
Drew Freiberger (afreiberger) wrote :

Trusty versions of packages affected (I see there is a systemd update 229-4ubuntu19. Does this include the backported fixes from v230/v231 mentioned in comment #4?):

ii dbus 1.10.6-1ubuntu3.1 amd64 simple interprocess messaging system (daemon and utilities)
ii libdbus-1-3:amd64 1.10.6-1ubuntu3.1 amd64 simple interprocess messaging system (library)
ii libdbus-glib-1-2:amd64 0.106-1 amd64 simple interprocess messaging system (GLib-based shared library)
ii python3-dbus 1.2.0-3 amd64 simple interprocess messaging system (Python 3 interface)
ii libpam-systemd:amd64 229-4ubuntu12 amd64 system and service manager - PAM module
ii libsystemd0:amd64 229-4ubuntu12 amd64 systemd utility library
ii python3-systemd 231-2build1 amd64 Python 3 bindings for systemd
ii systemd 229-4ubuntu12 amd64 system and service manager
ii systemd-sysv 229-4ubuntu12 amd64 system and service manager - SysV links

Revision history for this message
William Van Hevelingen (blkperl) wrote :

This bug does not appear to be resolved on Xenial as we are seeing scope file leakage causing systemctl to hang.

We are running the version of dbus that contains the fix

# dpkg -s dbus | grep Version
Version: 1.10.6-1ubuntu3.3
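
For anyone who wants to script the same version check across several hosts, here is a minimal sketch that compares a version string against the fixed dbus version named in this thread (1.10.6-1ubuntu3.3). The function name version_at_least is hypothetical; dpkg --compare-versions is the standard dpkg comparison. As the report above shows, having the fixed version installed does not by itself rule out the symptoms.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if version $1 is greater than or
# equal to version $2, using Debian version ordering.
version_at_least() {
    dpkg --compare-versions "$1" ge "$2"
}
```

It could be used as: version_at_least "$(dpkg-query -W -f='${Version}' dbus)" 1.10.6-1ubuntu3.3 && echo "dbus includes the workaround"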

Changed in systemd:
status: Unknown → Fix Released
Revision history for this message
Nick Adams (h-nick-n) wrote :

This still seems to be an issue. Running latest Bionic.

# dpkg -s dbus | grep Version
Version: 1.12.2-1ubuntu1.1

Revision history for this message
Ioanna Alifieraki (joalif) wrote :

@Nick

What symptoms do you observe? Delays when connecting via SSH?
Could you please share your reproducer or describe the circumstances in which you hit the bug?
Do you see any leaking scopes (ls -ld /run/systemd/transient/session-*.scope*)?
Lastly, what version of systemd are you running?

Thanks.

Revision history for this message
Nick Adams (h-nick-n) wrote :

@Ioanna

~25 second delay when connecting via SSH

Happens most often during ansible playbook runs, wherein a single ssh connection is established, but multiple new sessions are created within that connection. Usually only happens on the 3rd or 4th run after a reboot. There are hundreds of leftover scopes after an ansible playbook run and subsequent disconnection:

# ls -ld /run/systemd/transient/session-*.scope* | wc -l
344

# systemd --version
systemd 237
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid
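
To track the leftover scopes over time rather than counting by hand, something like the following sketch may help. It assumes the scope unit files live in /run/systemd/transient as shown above; the function name count_session_scopes is hypothetical. Unlike the ls pattern above, it matches only session-*.scope files and not the accompanying .scope.d entries, so the number can be lower than the ls count.

```shell
#!/bin/sh
# Hypothetical helper: print the number of session-*.scope files in a
# directory (defaults to the transient unit directory quoted in this thread).
count_session_scopes() {
    dir="${1:-/run/systemd/transient}"
    count=0
    for f in "$dir"/session-*.scope; do
        # An unmatched glob stays literal, so only count paths that exist.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Comparing the result of count_session_scopes with the output of "loginctl list-sessions --no-legend | wc -l" gives a rough idea of how many scopes have outlived their sessions.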
