pam wedges, preventing user login, when homedir is openafs mounted and unreadable

Bug #113586 reported by Marty Vona
Affects: pam (Ubuntu)
Status: Expired
Importance: Low
Assigned to: Unassigned

Bug Description

I've been trying to track down a problem with my new Ubuntu 7.04 install for a week or two. The issue manifests when I come in to work each morning. At that point my workstation is typically showing a screensaver; the screensaver is running and the machine is not hung. However, when I try to unlock the screensaver and return to my user session, the unlock usually wedges.

The real bummer is that although I can switch to another VT and try to log in there, that *also* wedges if I try to log in as myself.

Logging in as root on a VT does work, but if as root I try to su to my normal user, that too wedges. Stracing such an attempt finally led me to a hint: the strace showed that the process wedged exactly while trying to stat64 .pam_environment in my homedir. Boo. Likely my homedir is not readable at this point, as my AFS tokens may very well have expired.
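
For anyone wanting to check for this kind of hang without wedging their own shell, here is a minimal sketch in Python. It runs the stat in a child process and gives up after a timeout. The function name `stat_hangs` and the 5 second default are illustrative choices, not anything from pam or OpenAFS.

```python
import subprocess
import sys


def stat_hangs(path, timeout=5.0):
    """Return True if stat() on path does not finish within timeout seconds."""
    # Run the stat in a child process so a wedged filesystem cannot block us.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.kill()  # best effort; a process stuck in the kernel may linger
        return True
    # The child finished (successfully or with an error), so stat did not hang.
    return False
```

A call such as `stat_hangs("/home/someuser/.pam_environment")` (hypothetical path) would be run as root. A missing or unreadable file that fails quickly counts as "not hung" here, since only the timeout is treated as a hang.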

This appears to be due to a recent patch in pam, "Patch 100 (renumbered from 060): Look at ~/.pam_environment too". I couldn't find user (admin) documentation for this code, but reading the source I see that there should be an option user_read_env which if set to zero should prevent PAM from trying to read the offending file. It appears that the default configuration did NOT have this set, so it was trying to read the file. And wedging. I adjusted every call to pam_env.so in /etc/pam.d/* to have user_read_env=0.
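
To double-check that no pam_env.so entry was missed, something like the following Python sketch could help. It scans the text of a PAM config file and reports pam_env.so lines that lack the option. The option spelling `user_read_env=0` is taken from my reading of the source above; the function name and the sample config are made up for illustration.

```python
def pam_env_lines_missing_option(config_text, option="user_read_env=0"):
    """Return (line_number, line) pairs for pam_env.so entries lacking option."""
    missing = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        if not any(field.endswith("pam_env.so") for field in fields):
            continue  # not a pam_env.so entry
        if option not in fields:
            missing.append((number, line))
    return missing


# Hypothetical excerpt of a file under /etc/pam.d (assumed contents).
sample = """\
# illustrative PAM service file
session required pam_env.so
session required pam_env.so user_read_env=0
session required pam_unix.so
"""

for number, line in pam_env_lines_missing_option(sample):
    print(f"line {number}: {line}")
```

In practice one would read each file under /etc/pam.d and pass its contents to the function; the sample string just stands in for a real file.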

I'm typing this before I've actually waited 24 hours to verify that the problem is solved (remember, it typically only manifests when I arrive in the morning...). I'll append to the bug report if I come in tomorrow and the problem persists.

Kees Cook (kees) wrote :

pam (0.99.7.1-4ubuntu1~ppa1) gutsy; urgency=low

  * Resynchronise with Debian (LP: #43169, #14505, #80431). Remaining changes:
    - debian/control, debian/local/common-session{,md5sums}: use
      libpam-foreground for session management.
    - debian/rules: install unix_chkpwd setgid shadow instead of setuid root.
      The nis package handles overriding this as necessary.
    - debian/libpam-modules.postinst: Add PATH to /etc/environment if it's not
      present there or in /etc/security/pam_env.conf.
    - debian/patches-applied/ubuntu-fix_standard_types: Use standard u_int8_t
      type rather than __u8.
    - debian/patches-applied/ubuntu-rlimit_nice_correction: Explicitly
      initialise RLIMIT_NICE rather than relying on the kernel limits. Bound
      RLIMIT_NICE from below as well as from above. Fix off-by-one error when
      converting RLIMIT_NICE to the range of values used by the kernel.
      (Originally patch 101; converted to quilt.)
  * Dropped:
    - debian/rules: bashism fixes (merged upstream).
    - debian/control: Conflict on ancient nis (expired with Breezy).
    - debian/libpam-runtime.postinst: check for ancient pam (expired with
      Breezy).
    - debian/patches-applied/ubuntu-user_defined_environment: Look at
      ~/.pam_environment too, with the same format as
      /etc/security/pam_env.conf. (Originally patch 100; converted to quilt.)
      Left out of "series" for now (LP: #113586).

pam (0.99.7.1-4) unstable; urgency=low

  * libpam0g.postinst, libpam0g.templates: gdm doesn't need to be restarted
    to fix the library skew, only reloaded; special-case this daemon in the
    postinst and remove the mention of it from the debconf template, also
    tightening the language of the debconf template in the process.
    Closes: #440074.
  * Add courier-authdaemon to the list of services that need to be
    restarted; thanks to Micah Anderson for reporting.
  * New patch pam_env_ignore_garbage.patch: fix pam_env to really skip over
    garbage lines in /etc/environment and log an error, instead of failing
    with an obscure error; and ignore any PAM_BAD_ITEM values returned
    by pam_putenv(), since this is the expected error return when trying
    to delete a non-existent var. Closes: #439984.
  * Yet another thinko in hurd_no_setfsuid and in
    029_pam_limits_capabilities; this code should really be Hurd-safe at
    last...
  * getline() returns -1 on EOF, not 0; check this appropriately, to fix
    an infinite loop in pam_rhosts_auth. Thanks to Stephan Springl
    <email address hidden> for the fix. Closes: #440019.
  * Use ${misc:Depends} for libpam0g, so we get a proper dependency on
    debconf.
  * 019_pam_listfile_quiet: per discussion with upstream, don't suppress
    errors about missing files or files with wrong permissions; these are
    real errors that should not be buried.
  * Drop the remainder of 061_pam_issue_double_free, not required for the
    original bugfix.
  * Drop patch 064_pam_unix_cracklib_dictpath, which is not needed now that
    we define CRACKLIB_DICTS in debian/rules.
  * Drop patch 063_paswd_segv, superseded by a different upstream fix
  * Split 047_pam_limits_chr...

Changed in pam:
status: New → Fix Released

Kees Cook (kees) wrote :

Bleh. PPA upload caused this to auto-close. :(

Changed in pam:
assignee: nobody → keescook
status: Fix Released → In Progress

Björn Torkelsson (torkel) wrote :

Move or create an (empty) .pam_environment in a directory that is readable without any AFS tokens, and then create a link to it in your home directory.
That is common practice for files that need to be read before you have any tokens.

Marty Vona (vona) wrote :

Yes, I am aware of the symlink workaround, and no, it does not solve the problem, which is pretty surprising. Details below.

It also appears that my evaluation and workaround above were incorrect, or at least incomplete.

And it looks like I never mentioned all the details of the setup, sorry. The user homedir is a symlink into AFS space. Authentication is Kerberos 5, and pam_openafs_session is used to get AFS tokens.

This issue is still unsolved for me: my machine is still preventing me from logging in if there is already an ongoing login session which has lost AFS tokens. This is the real emphasis of this bug report: a login session which has lost AFS tokens not only is borked itself, of course, but somehow also manages to bork all other attempts to log in as that user, whether from a different VT, via ssh, or by su.

I've minimized the annoyance by setting my Kerberos tickets to the maximum renewable lifetime. So now I only need to deal with this mess once every week or so.

Other than rebooting, I've found two ways out of the wedge. One is to log in as root (which does work) and then kill all processes of the offending user. After they've died, new logins as that user work again. The other way out is to temporarily rename the user's homedir. Then logins as that user fail to find the homedir at all, but at least they don't hang. This doesn't really help much...
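
For the first way out, a rough Python sketch of finding the offending user's processes is below. It is Linux only and uses the ownership of each /proc entry as a proxy for the process owner, which is an assumption that holds for ordinary processes. The function name `pids_owned_by` is made up for illustration, and the sketch only lists PIDs; it does not kill anything.

```python
import os


def pids_owned_by(uid, proc_root="/proc"):
    """Return sorted PIDs whose /proc entry is owned by uid (Linux only)."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            owner = os.stat(os.path.join(proc_root, name)).st_uid
        except OSError:
            continue  # the process exited while we were scanning
        if owner == uid:
            pids.append(int(name))
    return sorted(pids)
```

As root, one could then send a signal to each returned PID with `os.kill`. Note that this only stats entries under /proc, not anything in the unreadable homedir.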

I have tried many times to strace attempts to su to the user from a root login to determine where the hang is occurring. In all cases I have observed so far, it appears to hang while trying to read some dotfile in the homedir, which is of course unreadable. However, I have now made all these dotfiles symlinks to files on a local filesystem, but the login attempts *still* hang trying to read the symlinks. I find this fairly bizarre, and the behavior is not even consistently repeatable: I am pretty sure that the hangs don't always occur on the same dotfile.

Björn Torkelsson (torkel) wrote :

Just to rule pam_env out: does it work if you comment out pam_env?

We are running a similar setup, with K5 and openafs, and I have never noticed any problems logging in when the AFS tokens have expired. Not even if I wait a week.

Which pam modules are you using besides pam-openafs-session? How are they configured?
What does your /etc/krb5.conf look like?

Have you checked /var/log/auth.log for any hints why you can't log in more than once?

Marty Vona (vona) wrote :

Björn - thanks for the thoughts & help on this. I've been traveling a lot and unfortunately don't have time to work on this issue for the foreseeable future. I'll probably revisit it when I upgrade my systems to Ubuntu 7.10, which will hopefully be sometime in the next year. For now I'm just trying not to let my AFS tokens expire, and using the workaround methods above if they do.

Kees Cook (kees)
Changed in pam:
assignee: keescook → nobody
importance: Undecided → Low
status: In Progress → Confirmed

Thomas Hotz (thotz-deactivatedaccount) wrote :

So is this bug now closed?

Changed in pam (Ubuntu):
status: Confirmed → Incomplete

Marty Vona (vona) wrote :

I didn't change the status. However, I changed institutions and no longer use AFS, so this doesn't affect me any more and I'm not set up to work on it either. I am not sure if it is even still an issue.

Best of luck!

Changed in pam (Ubuntu):
status: Incomplete → Confirmed
status: Confirmed → Incomplete

Launchpad Janitor (janitor) wrote :

[Expired for pam (Ubuntu) because there has been no activity for 60 days.]

Changed in pam (Ubuntu):
status: Incomplete → Expired