systemd-logind can get hung in cg_enumerate_tasks

Bug #1301882 reported by Iain Lane
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
systemd (Ubuntu)
Fix Released
High
Stéphane Graber

Bug Description

While looking into some leftover processes after session end, I discovered that logind can get into a hung state. Here's roughly what I did (sorry that I don't have exact steps)

1. Log in to unity, log out
2. ssh in
3. loginctl
4. loginctl session-status <your closing session>
5. log in, log out
6. loginctl session-status <your new closing session>
7. log in
8. loginctl (systemd-logind is hung at this point)

bt of loginctl while it's hung showing that it's going to timeout waiting for a reply from logind over dbus

(gdb) bt
#0 0x00007ffff6e67fa0 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007ffff757c4b8 in socket_do_iteration (transport=0x6183d0, flags=6, timeout_milliseconds=<optimised out>)
    at ../../dbus/dbus-transport-socket.c:1125
#2 0x00007ffff757b3ff in _dbus_transport_do_iteration (transport=0x6183d0, flags=1, flags@entry=6,
    timeout_milliseconds=timeout_milliseconds@entry=25000) at ../../dbus/dbus-transport.c:976
#3 0x00007ffff75659dc in _dbus_connection_do_iteration_unlocked (connection=connection@entry=0x618ad0,
    pending=pending@entry=0x618320, flags=flags@entry=6, timeout_milliseconds=timeout_milliseconds@entry=25000)
    at ../../dbus/dbus-connection.c:1234
#4 0x00007ffff7566389 in _dbus_connection_block_pending_call (pending=0x618320)
    at ../../dbus/dbus-connection.c:2415
#5 0x00007ffff757572a in dbus_pending_call_block (pending=<optimised out>) at ../../dbus/dbus-pending-call.c:748
#6 0x00007ffff756694d in dbus_connection_send_with_reply_and_block (connection=connection@entry=0x618ad0,
    message=message@entry=0x619000, timeout_milliseconds=timeout_milliseconds@entry=-1,
    error=error@entry=0x7fffffffd610) at ../../dbus/dbus-connection.c:3530
#7 0x000000000040ebf4 in bus_method_call_with_reply (bus=bus@entry=0x618ad0,
    destination=destination@entry=0x40eeb1 "org.freedesktop.login1",
    path=path@entry=0x40ee99 "/org/freedesktop/login1",
    interface=interface@entry=0x40f478 "org.freedesktop.login1.Manager",
    method=method@entry=0x40ef3a "ListSessions", return_reply=return_reply@entry=0x7fffffffd738,
    return_error=return_error@entry=0x0, first_arg_type=first_arg_type@entry=0)
    at ../src/shared/dbus-common.c:1375
#8 0x0000000000404426 in list_sessions (bus=0x618ad0, args=<optimised out>, n=<optimised out>)
    at ../src/login/loginctl.c:82
#9 0x0000000000403b09 in loginctl_main (error=0x7fffffffd890, argv=0x7fffffffd9c8, argc=<optimised out>,
    bus=0x618ad0) at ../src/login/loginctl.c:1543
#10 main (argc=<optimised out>, argv=0x7fffffffd9c8) at ../src/login/loginctl.c:1574

and here's the bt of systemd-logind

(gdb) bt full
#0 0x00007f75545d6fa0 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1 0x00007f7554ada4b8 in socket_do_iteration (transport=0x2c87000, flags=7, timeout_milliseconds=<optimised out>) at ../../dbus/dbus-transport-socket.c:1125
        socket_transport = 0x2c87000
        poll_fd = {
          fd = 1041,
          events = 1,
          revents = 0
        }
        poll_res = <optimised out>
        poll_timeout = <optimised out>
#2 0x00007f7554ad93ff in _dbus_transport_do_iteration (transport=0x2c87000, flags=1, flags@entry=7, timeout_milliseconds=timeout_milliseconds@entry=-1) at ../../dbus/dbus-transport.c:976
No locals.
#3 0x00007f7554ac39dc in _dbus_connection_do_iteration_unlocked (connection=connection@entry=0x2c87470, pending=pending@entry=0x0, flags=flags@entry=7, timeout_milliseconds=timeout_milliseconds@entry=-1) at ../../dbus/dbus-connection.c:1234
No locals.
#4 0x00007f7554ac3b84 in _dbus_connection_flush_unlocked (connection=0x2c87470) at ../../dbus/dbus-connection.c:3574
No locals.
#5 0x00007f7554ac4331 in _dbus_connection_block_pending_call (pending=0x2c63930) at ../../dbus/dbus-connection.c:2382
        start_tv_sec = 140141908468009
        start_tv_usec = 46691440
        tv_sec = 0
        tv_usec = 140735977390056
        status = <optimised out>
        connection = 0x2c87470
        client_serial = <optimised out>
        timeout = <optimised out>
        timeout_milliseconds = <optimised out>
        elapsed_milliseconds = <optimised out>
#6 0x00007f7554ad372a in dbus_pending_call_block (pending=<optimised out>) at ../../dbus/dbus-pending-call.c:748
        __FUNCTION__ = "dbus_pending_call_block"
#7 0x00007f7554ac494d in dbus_connection_send_with_reply_and_block (connection=0x2c87470, message=0x2c86d60, timeout_milliseconds=-1, error=0x7fffa5f08030) at ../../dbus/dbus-connection.c:3530
        reply = <optimised out>
        pending = 0x2c63930
        __FUNCTION__ = "dbus_connection_send_with_reply_and_block"
#8 0x00007f7555125a85 in cgmanager_ping_sync () from /lib/x86_64-linux-gnu/libcgmanager.so.0
No symbol table info available.
#9 0x0000000000422285 in cgm_dbus_connect () at ../src/shared/cgmanager.c:68
        dbus_error = {
          name = 0x0,
          message = 0x0,
          dummy1 = 1,
          dummy2 = 0,
          dummy3 = 0,
          dummy4 = 0,
          dummy5 = 0,
          padding1 = 0x285
        }
        connection = <optimised out>
        __func__ = "cgm_dbus_connect"
#10 0x00000000004205fc in cg_enumerate_tasks (controller=0x425b4a "name=systemd", path=0x214e3b0 "/user/1000.user/c2.session", _f=_f@entry=0x7fffa5f08158) at ../src/shared/cgroup-util.c:117
        fs = 0x0
        f = <optimised out>
        r = <optimised out>
        value = 0x0
        template = 0x0
        fd = <optimised out>
        __PRETTY_FUNCTION__ = "cg_enumerate_tasks"
#11 0x00000000004207da in cg_is_empty (controller=<optimised out>, path=<optimised out>, ignore_self=false) at ../src/shared/cgroup-util.c:1090
        f = 0x0
        pid = 0
        self_pid = <optimised out>
        found = false
        r = <optimised out>
#12 0x0000000000420932 in cg_is_empty_recursive (controller=controller@entry=0x425b4a "name=systemd", path=0x214e3b0 "/user/1000.user/c2.session", ignore_self=ignore_self@entry=false) at ../src/shared/cgroup-util.c:1136
        d = 0x0
        fn = 0x0
        r = <optimised out>
        children = 0x7f7554ac3de5 <dbus_connection_unref+101>
        p1 = <optimised out>
        __PRETTY_FUNCTION__ = "cg_is_empty_recursive"
#13 0x000000000040fbb9 in session_check_gc (s=s@entry=0x2134f10, drop_not_started=drop_not_started@entry=true) at ../src/login/logind-session.c:976
        r = <optimised out>
        __PRETTY_FUNCTION__ = "session_check_gc"
#14 0x00000000004085e6 in manager_gc (m=m@entry=0x2067010, drop_not_started=drop_not_started@entry=true) at ../src/login/logind.c:1503
        seat = <optimised out>
        session = 0x2134f10
        user = <optimised out>
        i = 0xffffffffffffffff
        __PRETTY_FUNCTION__ = "manager_gc"
#15 0x00000000004093c5 in manager_run (m=m@entry=0x2067010) at ../src/login/logind.c:1728
        event = {
          events = 4294967295,
          data = {
            ptr = 0xffffffff,
            fd = -1,
            u32 = 4294967295,
            u64 = 4294967295
          }
        }
        n = <optimised out>
        msec = -1
        __PRETTY_FUNCTION__ = "manager_run"
        __func__ = "manager_run"
#16 0x00000000004061cc in main (argc=<optimised out>, argv=<optimised out>) at ../src/login/logind.c:1867
        m = 0x2067010
        r = 0
        __func__ = "main"

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: systemd-services 204-5ubuntu15
ProcVersionSignature: Ubuntu 3.13.0-22.44-generic 3.13.8
Uname: Linux 3.13.0-22-generic x86_64
ApportVersion: 2.14-0ubuntu1
Architecture: amd64
CurrentDesktop: Unity
Date: Thu Apr 3 12:04:11 2014
InstallationDate: Installed on 2012-10-07 (542 days ago)
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Beta amd64 (20121007)
SourcePackage: systemd
UpgradeStatus: Upgraded to trusty on 2013-05-07 (331 days ago)

Revision history for this message
Iain Lane (laney) wrote :
summary: - systemd-logind is hung in cg_enumerate_tasks
+ systemd-logind can get hung in cg_enumerate_tasks
Martin Pitt (pitti)
Changed in systemd (Ubuntu):
assignee: nobody → Stéphane Graber (stgraber)
Revision history for this message
Stéphane Graber (stgraber) wrote :

Looks like it's actually stuck in cgmanager_ping_sync during the connection, so apparently the cgmanager daemon is somehow stuck...

Revision history for this message
Iain Lane (laney) wrote :
Revision history for this message
James Hunt (jamesodhunt) wrote :

I saw cgmanager spinning at >100% cpu earlier and just managed to strace before my system overheated and powered off. Didn't catch the problematic call, but the errno was EMFILE. Interesting since bug 1300663 shows the same value from upstart. I wonder if this might be a kernel issue?

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Easily reproduced just logging in and out a few times.

Changed in systemd (Ubuntu):
importance: Undecided → High
status: New → Confirmed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 204-5ubuntu16

---------------
systemd (204-5ubuntu16) trusty; urgency=medium

  [ Stéphane Graber ]
  * Never call normalize_controller when controll is NULL as this
    triggers an assert... (LP: #1301846)

  [ Serge Hallyn ]
  * d/p/add-cgmanager-support: make sure connections are closed (LP: #1301882)
 -- Stephane Graber <email address hidden> Thu, 03 Apr 2014 13:53:09 -0400

Changed in systemd (Ubuntu):
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.