autofs5 eats the cpu if you have large groups

Bug #591100 reported by Joel Ebel
This bug affects 1 person
Affects           Status         Importance   Assigned to   Milestone
autofs5 (Ubuntu)  Fix Released   Undecided    Unassigned
Lucid             Fix Released   Undecided    Unassigned

Bug Description

Binary package hint: autofs5

The issue is in lib/mounts.c, set_tsd_user_vars - editing out the boring bits, it looks like this:

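grplen = sysconf(_SC_GETGR_R_SIZE_MAX);
tmplen = grplen;
while (1) {
        char *tmp = realloc(gr_tmp, tmplen + 1);
        gr_tmp = tmp;   /* NULL check elided */
        status = getgrgid_r(gid, pgr, gr_tmp, tmplen, ppgr);
        if (status != ERANGE)
                break;
        tmplen += grplen;   /* linear growth: one small step per retry */
}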

It's trying to get the members of the user's primary group, but it doesn't know how big a buffer to allocate, so it retries with a larger buffer until the entry fits. The buffer grows by only 1024 bytes per attempt, however, so it takes several hundred iterations to reach a big enough size.

This shouldn't rely on _SC_GETGR_R_SIZE_MAX to give a reasonable increment. See http://<email address hidden>/msg40443.html for some discussion about whether the value of _SC_GETGR_R_SIZE_MAX should really be that low; it seems that Debian decided it should be, and that the man page was wrong.

I've verified that bumping the increment value by 1000x fixes the issue, and stat'ing non-existent homedirs is now instantaneous.
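
The arithmetic behind the slowdown can be sketched with a small simulation. This is not autofs code: `linear_attempts` is a hypothetical helper, it performs no group lookup, and it assumes the buffer starts at one increment and that the group entry needs roughly 750 KiB (an illustrative figure based on the iteration count reported in this bug).

```c
#include <stddef.h>

/*
 * Count how many lookup attempts a linearly growing buffer needs
 * before it reaches `required` bytes. Simulation only: no call to
 * getgrgid_r() is made. `step` is assumed to be both the starting
 * size and the per-retry increment.
 */
static unsigned long linear_attempts(size_t required, size_t step)
{
    size_t len = step;
    unsigned long attempts = 1;

    if (step == 0)
        return 0;               /* a zero step would never terminate */

    while (len < required) {    /* stands in for status == ERANGE */
        len += step;
        attempts++;
    }
    return attempts;
}
```

Under those assumptions, `linear_attempts(750 * 1024, 1024)` returns 750, while multiplying the step by 1000 brings it down to a single attempt.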

Joel Ebel (jbebel)
tags: added: glucid
Joel Ebel (jbebel) wrote :

One option would be to double tmplen each pass. That would make it take, in my case, 10 tries, rather than ~750.

so:

- tmplen += grplen;
+ tmplen *= 2;
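
For reference, a standalone version of the doubling approach might look like the sketch below. It is not the patch itself: `lookup_group_doubling` and its parameters are hypothetical names, the 1024-byte fallback is an assumption for systems where `sysconf` returns -1, and the error handling is simplified.

```c
#include <errno.h>
#include <grp.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Look up `gid`, doubling the scratch buffer whenever getgrgid_r()
 * reports ERANGE. On success returns 0; *buf_out then holds the
 * buffer backing the strings in *grp (the caller frees it) and
 * *found is non-zero if an entry exists. On failure returns an
 * errno value and *buf_out is NULL.
 */
static int lookup_group_doubling(gid_t gid, struct group *grp,
                                 char **buf_out, int *found)
{
    long hint = sysconf(_SC_GETGR_R_SIZE_MAX);
    size_t len = hint > 0 ? (size_t)hint : 1024;  /* hint may be -1 */
    char *buf = NULL;
    struct group *result = NULL;
    int status;

    *buf_out = NULL;
    *found = 0;

    for (;;) {
        char *tmp = realloc(buf, len);
        if (tmp == NULL) {
            free(buf);
            return ENOMEM;
        }
        buf = tmp;

        status = getgrgid_r(gid, grp, buf, len, &result);
        if (status != ERANGE)
            break;
        if (len > SIZE_MAX / 2) {   /* refuse to overflow size_t */
            free(buf);
            return ENOMEM;
        }
        len *= 2;                   /* geometric growth */
    }

    if (status != 0) {
        free(buf);
        return status;
    }
    *buf_out = buf;
    *found = (result != NULL);
    return 0;
}
```

A caller would pass something like `getgid()` plus a `struct group` to fill in, check `found` before reading `gr_mem`, and `free()` the returned buffer when done.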

Joel Ebel (jbebel) wrote :
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package autofs5 - 5.0.5-0ubuntu2

---------------
autofs5 (5.0.5-0ubuntu2) maverick; urgency=low

  [Joel Ebel]
  * debian/patches/16group_buffer_size.patch: Increase group buffer size
    geometrically rather than linearly when its found to be small.
    (LP: #591100)

  [Chuck Short]
  * debian/control: Fix conflict resolution. (LP: #520601)
 -- Chuck Short <email address hidden> Wed, 30 Jun 2010 08:06:45 -0400

Changed in autofs5 (Ubuntu):
status: New → Fix Released
Joel Ebel (jbebel) wrote :

The group size patch would be very helpful for us to have included in Lucid.

Joel Ebel (jbebel) wrote :

SRU team, as mentioned, this bug causes significant delays in automounting when your primary group is large (thousands of users). The bug has been addressed by increasing the group buffer size geometrically rather than linearly. If the primary group is small, there will be no change. The patch I originally provided still applies to the version in Lucid.

TEST CASE: With a primary group that has thousands of members, attempt to access something in an automount-managed directory.
Expected result: The access succeeds or fails quickly.
Prior to this fix, it takes several seconds.

There are no expected regressions from this patch.

Martin Pitt (pitti) wrote :

SRU ack, please upload.

Benjamin Drung (bdrung) wrote :

uploaded the attached one to lucid-proposed

Changed in autofs5 (Ubuntu Lucid):
status: New → Fix Committed
Jonathan Riddell (jr) wrote : Please test proposed package

Accepted into lucid-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Joel Ebel (jbebel) wrote :

We have been testing this exact patch (with the exception of subtle changelog variations, verified by debdiff) for 2 months now on thousands of machines. It greatly improves autofs performance in a large group environment. I've briefly tested the package uploaded to proposed to verify that it still works and improves performance.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package autofs5 - 5.0.4-3.1ubuntu5.1

---------------
autofs5 (5.0.4-3.1ubuntu5.1) lucid-proposed; urgency=low

  * Increase group buffer size geometrically rather than linearly when it is
    found to be too small (LP: #591100).
 -- Joel Ebel <email address hidden> Tue, 17 Aug 2010 10:44:24 +0200

Changed in autofs5 (Ubuntu Lucid):
status: Fix Committed → Fix Released
Robie Basak (racb) wrote :

Did this patch get sent upstream? We're still having to maintain the Ubuntu delta here.

Andreas Hasenack (ahasenack) wrote :

I just emailed the patch to the autofs mailing list, let's see what they say.

Andreas Hasenack (ahasenack) wrote :

Found a link to the mailing list archive: https://www.spinics.net/lists/autofs/msg02115.html
