systemd-logind: memory leaks on session's connections (trusty-only)

Bug #1750013 reported by Guilherme G. Piccoli
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
systemd (Ubuntu)
Fix Released
Medium
Guilherme G. Piccoli
Trusty
Fix Released
Medium
Guilherme G. Piccoli
Xenial
Fix Released
Medium
Guilherme G. Piccoli
Artful
Fix Released
Medium
Guilherme G. Piccoli
Bionic
Fix Released
Medium
Guilherme G. Piccoli

Bug Description

Below the SRU request form. Please refer to the Original Description to a more comprehensive explanation of the problem observed.

[Impact]

 * systemd-logind tool is leaking memory at each session connected. The
 issues happens in systemd from Trusty (14.04) only.

 * Three issues observed:
  - systemd-logind is leaking entire sessions, i.e, the sessions are not
    feeed after they're closed. In order to fix that, we proactively add
    the sessions to systemd garbage collector (gc) when they are closed.
    Also, part of the fix is to make cgmanager package a dependency. Refer
    to comment #1 to a more thorough explanation of the issue and the fix.

  - a small memory leak was observed in the session creation logic of
    systemd-logind. The fix for that is the addition of an appropriate
    free() call. Refer to comment #2 to more details on the issue and fix.

  - another small memory leak was observed in the cgmanager glue code of
    systemd-logind - this code is only present in this specific Ubuntu
    release of the package, due to necessary compatibility layer with
    upstart init system. The fix is to properly call free() in 2
    functions. Refer to comment #3 to a deep exposition of the issue and
    the fix.

[Test Case]

 * The basic test-case is to run the following loop from a remote machine:
   while true; do ssh <hostname-target> "whoami"; done

 * It's possible to watch the increase in memory consumption from
   "systemd-logind" process in the target machine. One can use the
   "ps uax" command to verify the RSS of the process, or count its
   anonymous pages from /proc/<logind_pid>/smaps.

[Regression Potential]

 * Since the fixes are small and not intrusive, the potential for
   regressions are low. More regression considerations on comments #1, #2
   and #3 for each fix.

 * A potential small regressson is performance-wise, since now we add
   sessions to garbage collector proactively.

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :
Download full text (3.9 KiB)

After some debug, it was observed that the only path in which a session
is freed (other than in user/seat free or manager free) is in garbage
collecting, specifically in the function manager_gc().
It means a closing session should be somehow added to gc in order gc has
a chance to validate if the session is ready to go, and then performs
the clean-up.

To include a session in gc, the function session_add_to_gc_queue() should
be called. This function is currently called from 3 paths:

* from session_stop();
* from manager startup routines;
* and from the empty cgroup notification handler.

Since we are not using systemd as init system in Trusty, the empty cgroup
notifier is not reliable. We rely on cgmanager to perform cgroup
interaction on Trusty, and its release-agent also does not notify when
cgroups are empty. Finally, commit
efdb023 ("core: unified cgroup hierarchy support")
made a refactor in cgroup management and it claims for the first time,
cgroup empty notifications are reliable, meaning until this commit we
certainly couldn't rely on it.

The startup path of manager is executed only in logind start; session_stop()
could execute from Terminate dbus events (like loginctl terminate-session),
and from user/seat stopping routines. Meaning there's no way a session is
added to gc queue except the cgroup empty notification (if it worked) in
this version of systemd.

The simple test-case mentioned in the first comment leaks entire sessions
and logind consumes an increasing amount of memory.
--

In order to prevent that, I wrote a small patch that proactively puts a
session in gc on session_remove_fifo(). This function is always called in
the event of Release a session - when the ssh ends, for example.
This way, we guarantee closing sessions have a chance to be cleared.

But also, it was noticed that kernel may take a small time window to
clear the tasks' cgroup tracking for the session - hence, the gc
might fail in freeing the session, so the patch adds a check on gc too
in order to re-add the session on gc in case session is on closing state.
This way stopping sessions are always freed and no leak is observed.

The patch:

diff --git a/src/login/logind-session.c b/src/login/logind-session.c
index 662273b..1a7389a 100644
--- a/src/login/logind-session.c
+++ b/src/login/logind-session.c
@@ -951,6 +952,7 @@ void session_remove_fifo(Session *s) {
                 free(s->fifo_path);
                 s->fifo_path = NULL;
         }
+ session_add_to_gc_queue(s);
 }

 int session_check_gc(Session *s, bool drop_not_started) {
diff --git a/src/login/logind.c b/src/login/logind.c
index 333a779..0b9e165 100644
--- a/src/login/logind.c
+++ b/src/login/logind.c
@@ -1495,6 +1495,8 @@ void manager_gc(Manager *m, bool drop_not_started) {
                 if (session_check_gc(session, drop_not_started) == 0) {
                         session_stop(session);
                         session_free(session);
+ } else if (session_get_state(session) == SESSION_CLOSING) {
+ session_add_to_gc_queue(session);
                 }
         }

This diverges from upstream since we have refactors in systemd code,
specially...

Read more...

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

It was noticed that during the logind manager logic for creating a session
based on dbus event, the sessions's initialization procedure will allocate
2 structures related to cgroup management of a session: controllers and
reset_controllers.

Both these structs are filled with cgroup controllers (when they exist)
in bus_parse_strv_iter(), which will always allocate at least one of each
to be used as a parent for a "string"-based list of elements abstraction.

Currently, the code does not free reset_controllers, so we have a memory
leak at each session creation, which shows up in a valgrind output like
this:

2,696 bytes in 337 blocks are definitely lost in loss record 99 of 101
at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x434F58: malloc_multiply (util.h:564)
by 0x4368DA: bus_parse_strv_iter (dbus-common.c:905)
by 0x40AE9D: bus_manager_create_session (logind-dbus.c:498)
by 0x40EB39: manager_message_handler (logind-dbus.c:1752)
by 0x5ED8E85: ??? (in /lib/x86_64-linux-gnu/libdbus-1.so.3.7.6)
by 0x5ECBA20: dbus_connection_dispatch (in /lib/x86_64-linux-gnu/libdbus-1.so.3.7.6)
by 0x409E21: manager_run (logind.c:1729)
by 0x40A27C: main (logind.c:1860)

Notice this was after 337 sessions.
--

I wrote a small patch that addresses the issue by just adding the
necessary strv_free(). It diverges from upstream systemd since
reset_controllers entity was removed from the code after the refactor
on cgroup's handling made by commit
fb6becb ("logind: port over to use scopes+slices for all cgroup stuff").

The strv_free() call has no known risk and potential double-frees aren't
expected since strv_free() validates the struct and its "children"
before de-allocate the resources.

The patch:

diff --git a/src/login/logind-session.c b/src/login/logind-session.c
index 662273b..1a7389a 100644
--- a/src/login/logind-session.c
+++ b/src/login/logind-session.c
@@ -93,6 +93,7 @@ void session_free(Session *s) {

         free(s->cgroup_path);
         strv_free(s->controllers);
+ strv_free(s->reset_controllers);

         free(s->tty);
         free(s->display);

After testing with both patches already mentioned and cgmanager, there
is still a (really) small leak to be investigated.

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

As already briefly mentioned, Trusty systemd's package has
additional patches to customize systemd - specially, we have the addition
of glue code to make Trusty's systemd work with cgmanager. This is
necessary due to the choice of upstart as init manager - it prevents
regular cgroup handling from systemd perspective, so cgmanager is used
in order to assist cgroup management on systemd code.

This glue code was used in some cgroup's routines when cgmanager is active
and running. In particular, it's used on systemd's cg_enumerate_tasks()
and cg_enumerate_processes() routines. In these cases, there's a memory
leak caused by a "string" allocation in cgm_get() that isn't freed on
cg_enumerate_tasks() / cg_enumerate_processes() - it shows up in Valgrind
like this:

58 bytes in 58 blocks are definitely lost in loss record 60 of 113
at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x638EB49: strdup (strdup.c:42)
by 0x422A90: cgm_get (cgmanager.c:148)
by 0x420AC6: cg_enumerate_tasks (cgroup-util.c:128)
by 0x420C89: cg_is_empty (cgroup-util.c:1129)
by 0x420DEA: cg_is_empty_recursive (cgroup-util.c:1175)
by 0x40FC48: session_check_gc (logind-session.c:979)
by 0x408694: manager_gc (logind.c:1503)
by 0x4094B8: manager_run (logind.c:1740)
by 0x40622B: main (logind.c:1868)
--

I wrote a small patch that fixes the leak by freeing the allocated pointer
after its use. It diverges from upstream since systemd upstream has no
cgmanager glue code, which was added specially for Trusty's need of
systemd packages without having it as init manager.

The free() calls are clear and simple, they're provided by a GCC attribute
that calls a free() whenever the scope of the pointer is not reachable
anymore. No side effects are expected, the patch frees the pointer in all
failure paths also since it's using the GCC attribute of auto-calling
free(). No double-frees are possible since the free() calls happens only
after the pointer was validated as has been successfully allocated.

The patch:

diff --git a/src/shared/cgroup-util.c b/src/shared/cgroup-util.c
index 6c07a49..ab50ad4 100644
--- a/src/shared/cgroup-util.c
+++ b/src/shared/cgroup-util.c
@@ -44,7 +44,7 @@ int cg_enumerate_processes(const char *controller, const char *path, FILE **_f)
         _cleanup_free_ char *fs = NULL;
         FILE *f;
         int r;
- char *value = NULL;
+ _cleanup_free_ char *value = NULL;
         char *template = NULL;
         int fd;

@@ -111,7 +111,7 @@ int cg_enumerate_tasks(const char *controller, const char *path, FILE **_f) {
         _cleanup_free_ char *fs = NULL;
         FILE *f;
         int r;
- char *value = NULL;
+ _cleanup_free_ char *value = NULL;
         char *template = NULL;
         int fd;

The patches above were tested during 4 days running the test-case, with
cgmanager package installed. More than 658000 sessions were started and
finished without memory leaks.

Changed in systemd (Ubuntu):
importance: Undecided → Medium
status: New → In Progress
Changed in systemd (Ubuntu Trusty):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
importance: Undecided → Medium
status: New → In Progress
Changed in systemd (Ubuntu Xenial):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
importance: Undecided → Medium
status: New → Fix Released
Changed in systemd (Ubuntu Bionic):
status: In Progress → Fix Released
Changed in systemd (Ubuntu Artful):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
importance: Undecided → Medium
status: New → Fix Released
description: updated
tags: added: sts-sru-needed
Revision history for this message
Dan Streetman (ddstreet) wrote :

This does not look right to me:

@@ -951,6 +952,7 @@
                 free(s->fifo_path);
                 s->fifo_path = NULL;
         }
+ session_add_to_gc_queue(s);
 }

 int session_check_gc(Session *s, bool drop_not_started) {

that puts the session on the manager's gc queue, however this function (session_remove_fifo) is called from session_free(), which is called from the manager's gc queue handler...can you explain why this is needed here?

The other parts of the patch look good, thanks!

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Dan, thanks for looking at the patch and for your question!

I tried to explain this a bit in comment #1, not too much detailed though:

"[...]closing session should be somehow added to gc [...] and then (the gc) performs
the clean-up.

To include a session in gc, the function session_add_to_gc_queue() should
be called.

[...] (this) patch proactively puts a session in gc on session_remove_fifo().
This function is always called in the event of Release a session - when the ssh ends, for example.
[...]"

Being a bit more specific: when SSH ends, for example, a Release event is sent
through dbus and systemd-logind captures it, in the function manager_message_handler().

From there, the function session_remove_fifo() is called. That point is our "bootstrap"
to add the closing session on gc, our first addition there. But unfortunately due to
an unpredictable timing of cgroup become empty for that session, the gc might fail
to remove the session right in that moment. So, we need to re-add the closing session to
gc from...within the gc! In my experiments, 2 or 3 re-additions are enough to get the session removed.

It wouldn't be necessary to do these 2 steps above IF the cgroup empty notification
was working fine. In that case, after the cgroup for the closing session become empty,
a notification handler would be triggered in the systemd-logind and that handler would
add the session to gc, and then gc would clean it up.
But, since in Trusty we have upstart as our init system (and systemd is highly patched to co-exist in this fashion) and cgroup empty notifications were broken in systemd (until a complete refactor of cgroup handling in systemd), our approach here is the less expensive one.

In summary, the above 2 steps [adding the session to gc in session_remove_fifo() and re-adding it from within gc] are the most important parts of this patch - without them we leak entire sessions and eventually the logind process gets OOM'ed.

Thanks,

Guilherme

Revision history for this message
Dan Streetman (ddstreet) wrote :

> when SSH ends, for example, a Release event is sent
> through dbus and systemd-logind captures it, in the function manager_message_handler().
>
> From there, the function session_remove_fifo() is called. That point is our "bootstrap"
> to add the closing session on gc

ok i see that path, dbus Release -> remove fifo, however i don't think it's appropriate to put the add_to_gc into remove_fifo; because, remove_fifo is also called from session_free()..consider this code:

        session_remove_fifo(s);

        free(s->state_file);
        free(s);
}

so...that call to session_remove_fifo() previously only removed the fifo. but with your patch, it's also adding 's' back onto the gc queue...but then immediately freeing s!!! When the gc queue is processed, this code:

        while ((session = m->session_gc_queue)) {
                LIST_REMOVE(Session, gc_queue, m->session_gc_queue, session);
                session->in_gc_queue = false;

will dereference 's' ('session' in the gc handler), accessing freed memory.

Since you're trying to use the dbus Release notification to add the session to the gc, wouldn't it be better to simply update the dbus Release handler to remove the fifo *and* add the session to the gc, from there? not modify the remove_fifo function?

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Dan, really well-observed! Thanks for the suggestion, it makes total sense. I'll update the approach: instead of using session_remove_fifo() to add the session on GC, I'll use the DBus handler itself.

After testing, I'll submit a debdiff_v2 here.
Thanks,

Guilherme

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Dan, please let me know if the debdiff v2 above addresses your concern and if it's properly following your suggestion. I tested the modification and it's working fine.

Thanks,

Guilherme

Revision history for this message
Dan Streetman (ddstreet) wrote :

That looks good.

My only comment now is, I see you've switched cgmanager from Suggests: to Depends:...is that *really* needed to fix a memleak here? i.e., if you uninstall cgmanager and re-run your memleak tests, does mem still leak? While changing it to a Depends: isn't impossible, I'd prefer to do that only if there is a specific reason to do so that can't be easily fixed some other way.

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Thanks Dan.

About cgmanager: if we don't have it installed, it does not affect memory of systemd-logind tool per se. What happens, IMHO, is even more severe: we don't free/de-allocate sysfs cgroup paths for sessions. So, in my tests _without_ cgmanager, after 8000 SSH sessions to my target machine, I found 8000 folders with some files each at /sys/fs/cgroup/systemd/user/<my_user>/ .

Although it doesn't directly affects systemd-logind memory footprint, it certainly affects system memory overall. So, in my consideration (given cgmanager has even code glue added to systemd on Trusty), I'd suggest to make it a dependency, not sure why it isn't anyway - for me, this cgroup folders' leak is clearly a bug.

Revision history for this message
Dan Streetman (ddstreet) wrote :

Excellent, thanks - that is perfect justification to change it to Depends:, I'll get the upload going soon.

Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Guilherme, or anyone else affected,

Accepted systemd into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/204-5ubuntu20.27 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-trusty to verification-done-trusty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-trusty. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in systemd (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-trusty
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Thanks Dan and Brian.

Just tested: after 1h and more than 5500 SSH sessions, no leaks at all were observed.
(As a comparison, testing the "old" version 20.26 I got 1.8 MB of leak in half an hour!)

Cheers,

Guilherme

tags: added: verification-done verification-done-trusty
removed: sts-sru-needed verification-needed verification-needed-trusty
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

We had a potential regression reported after some users installed this package from -proposed.
The reports starts on comment #17 in LP: #1303649 .

Summary: users are reporting high CPU loads from both systemd-logind and cgmanager processes, as well as delays in logins. I'm investigating to reproduce and understand the issue.

tags: added: regression-proposed
Revision history for this message
Stéphane Graber (stgraber) wrote :

Removed this SRU from -proposed.

Two reasons for this:
 1) The mentioned regression
 2) We should NEVER introduce a new dependency to a package through an SRU.
    Doing so will prevent any user that doesn't dist-upgrade (uses upgrade/safe-upgrade instead) from getting the new dependency, instead holding back the package forever. This would then completely prevent us from say pushing a security upload of systemd to these users.

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Thanks Stéphane, the 2nd point is a reasonable consideration.
Regardless if I could provide a fix to cgmanager for the regression,
certainly it won't be added again as a dependency.

Cheers,

Guilherme

Revision history for this message
Mauro Franzoni (mauf) wrote :

I am also experiencing cpu overload for cgmanager and (after disabling it in grub's default with "cgroup_disable=cpu") systemd-logind as described in #16.
I see nothing strange in the logs, but if I can be of any help by providing any information please just ask me.
Cheers,
mauf

Revision history for this message
Mauro Franzoni (mauf) wrote :

my trace for cgmanager

Linux www 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:02:07 UTC 2018 i686 athlon i686 GNU/Linux

Revision history for this message
Mauro Franzoni (mauf) wrote :

...and for systemd-login

(cgmanager and systemd-login were using both around 49-50% of CPU)

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Thank you Mauro! One thing that worth to take a look is that you're using kernel 3.13, and this could be related to the high CPU utilization issue you're observing.

We're changing the approach of this fix to not rely on cgmanager anymore. The CPU utilization issue is however another bug that needs investigation - I'll use the LP #1303649 for that.

Cheers,

Guilherme

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Dan, this is the v3 of the patch. I bumped the version to 20.28 since my proposed 20.27 caused the regression aforementioned.

For this version, I removed the dependency of cgmanager, along with the code that added closing sessions to garbage collector. Happens that a similar code is present on systemd-logind already, and for some reason (which is still a mystery for me) I was seeing a session leak in the tool. My hypothesis is that I built one version of the tool without the bunch of debian patches, and so that portion of code wasn't present. Then, I started working in a fix, but in the end, the fix is useless since a similar one is already there.

Anyway, the patch now just avoid some small memory leaks in both session's free path and cgmanager glue code. It was tested in both kernels 3.13 and 4.14 (with and without cgmanager), I didn't observe leaks anymore nor high CPU utilization.

Thanks,

Guilherme

tags: added: sts-sru-needed
removed: regression-proposed verification-done verification-done-trusty
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Hello Guilherme, or anyone else affected,

Accepted systemd into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/204-5ubuntu20.28 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-trusty to verification-done-trusty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-trusty. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed verification-needed-trusty
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Łukasz, thanks for the heads-up.

I just tested systemd from -proposed, version 204-5ubuntu20.28 and it fixes the issue reported in this LP.

The test consists in doing multiple SSHs in a loop (better explained in the description). I've run the test with and without cgmanager installed, in kernels 4.4.0-119 and 3.13.0-145, no memory leaks observed.

Important to notice, I wasn't able to observe any constant increase in CPU utilization from both binaries (logind and cgm); I kept monitoring and didn't notice anything abnormal.

I'll mark this as verified.
Cheers,

Guilherme

tags: added: verification-done verification-done-trusty
removed: sts-sru-needed verification-needed verification-needed-trusty
Revision history for this message
Dan Streetman (ddstreet) wrote :

Note, the 'linux-*' autopkgtest failures in pending-sru for this pkg are caused by bug 1723272, a known bug that can be ignored as it's unrelated to this sru.

Revision history for this message
Dan Streetman (ddstreet) wrote :

> caused by bug 1723272

sorry, i meant bug 1723223

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

As mentioned by Dan in the above comments, some failures in autopkgtest like:

autopkgtest [14:44:48]: test ubuntu-regression-suite: [-----------------------
Source Package Version: 4.4.0-1017.17
Running Kernel Version: 3.13.0-145.194
ERROR: running version does not match source package

Are "expected" due to LP #1723223.

Other than that, the proposed package passed in all other tests.
Thanks,

Guilherme

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 204-5ubuntu20.28

---------------
systemd (204-5ubuntu20.28) trusty; urgency=medium

  * logind: fix memleaks in session's free path and cgmanager glue code
    (LP: #1750013)

 -- <email address hidden> (Guilherme G. Piccoli) Tue, 03 Apr 2018 13:38:08 +0000

Changed in systemd (Ubuntu Trusty):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.