ecryptfs private directory randomly unmounts

Bug #358573 reported by Jamie Strandboge
This bug affects 6 people
Affects                  Status        Importance  Assigned to      Milestone
eCryptfs                 Fix Released  High        Unassigned
ecryptfs-utils (Ubuntu)  Fix Released  High        Dustin Kirkland

Bug Description

Binary package hint: ecryptfs-utils

This seems to be bug #259293, but I am filing a new one as I haven't seen this in some time. Twice in the last week my ~/Private directory unmounted. Both times /tmp/ecryptfs-<user>-Private was '0'. Both times, the symlink in the unmounted ~/Private was not present. I have a cron job that runs every 10 minutes that I can see in syslog:

Apr 9 11:15:04 hostname CRON[22771]: Mount of private directory return code [0]

It could be that bug #259293 is simply 'mostly' fixed and I coincidentally hit this twice in the last week, or it could be a new bug (I don't know).

ProblemType: Bug
Architecture: amd64
DistroRelease: Ubuntu 9.04
Package: ecryptfs-utils 73-0ubuntu2
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: ecryptfs-utils
Uname: Linux 2.6.28-11-generic x86_64


Dustin Kirkland  (kirkland) wrote :

Hi Jamie-

Thanks (again) for the report. I really thought we had this sorted out with the vastly improved mount counters ...

I'm marking "high" priority, but leaving "new", as I haven't been able to confirm it.

Please try to keep me updated on this issue. I'm not really sure how else to debug this race condition.

Also, I'm really interested if there's anyone else out there experiencing this issue...

:-Dustin

Changed in ecryptfs-utils (Ubuntu):
assignee: nobody → Dustin Kirkland (kirkland)
importance: Undecided → High
Jamie Strandboge (jdstrand) wrote :

I am going to mark this as Confirmed, since it happened to me again today on an up to date Jaunty. I do not have a reproducer yet, but I can say that I tend to see this when I am not at the keyboard and cron jobs run through the night.

Changed in ecryptfs-utils (Ubuntu):
status: New → Confirmed
Changed in ecryptfs:
importance: Undecided → High
status: New → Confirmed
Lowell Alleman (lowell-alleman) wrote :

I've been having the same problems as well. I have my Firefox profile located under the ~/Private folder, and Firefox has crashed mysteriously multiple times per day over the last week. I finally figured out what was happening: sure enough, after each crash, ~/Private was no longer mounted.

I've seen this happen as little as 15 minutes after running ecryptfs-mount-private. Other times the mount stays up for a long time, even across suspend/resume cycles (which I thought could have been the problem, but that does not appear to be the case).

Side note: I used encfs in earlier releases of Ubuntu (prior to when encrypted "Private" folders were offered as an out-of-the-box feature). I stored my Firefox profile on there with no unmounting issues like this (normally you had the exact opposite issue, where it was difficult to get encfs to unmount after a certain timeout period). In any case, with Firefox running, it seems pretty likely that it would be holding files open.

I've looked around but haven't found any log messages indicating what/when/why this is happening. (ecryptfs doesn't seem to do much logging, but I haven't looked into the options.)

Ulf Rompe (rompe) wrote :

Seeing the same problem here, I just wrote a little script to analyze how often the unmounting happens. It checks the Private dir every minute, and if it appears to be unmounted, it prints a timestamped message and remounts it. If somebody wants to try the script, just install moreutils, download the attached monitor_private_dir.sh and run it in a shell.

Output on my machine, after starting the script yesterday at 07:58:

ur@joda:~> bin/monitor_private_dir.sh
Mai 14 07:58:08 ~/Private was unmounted.
Mai 14 08:40:08 ~/Private was unmounted.
Mai 14 09:40:08 ~/Private was unmounted.
Mai 14 10:00:08 ~/Private was unmounted.
Mai 14 12:20:09 ~/Private was unmounted.
Mai 14 13:00:09 ~/Private was unmounted.
Mai 14 13:30:09 ~/Private was unmounted.
Mai 14 14:00:09 ~/Private was unmounted.
Mai 14 15:00:09 ~/Private was unmounted.
Mai 14 18:00:09 ~/Private was unmounted.
Mai 14 20:30:10 ~/Private was unmounted.
Mai 14 21:00:10 ~/Private was unmounted.
Mai 14 22:00:10 ~/Private was unmounted.
Mai 14 23:30:10 ~/Private was unmounted.
Mai 15 00:40:10 ~/Private was unmounted.
Mai 15 01:00:10 ~/Private was unmounted.
Mai 15 01:30:10 ~/Private was unmounted.
Mai 15 03:00:11 ~/Private was unmounted.

So, the unmount seems to take place at nearly random times, except that it only happens on full tens of minutes. This gives the impression that it could be cronjob related. I will do some more investigation in this direction. Maybe in the meantime somebody else who is affected by this bug wants to run the script to see if he gets equivalent results?
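
One way to check the "full tens of minutes" observation is to tally the minute field of each logged event. The following is a minimal sketch, assuming the monitor output format shown above (month, day, HH:MM:SS, message); the function name unmounts_by_minute is made up for illustration and is not part of the attached script.

```sh
#!/bin/sh
# Tally unmount events by minute-of-hour from monitor output on stdin.
# Assumes lines of the form: "<month> <day> HH:MM:SS <message>".
unmounts_by_minute() {
    awk '{ split($3, t, ":"); count[t[2]]++ }
         END { for (m in count) print m, count[m] }' | sort
}
```

Fed with saved monitor output (for example `unmounts_by_minute < private.log`, where private.log is a hypothetical file name), it prints one "minute count" pair per line, so a cluster on 00, 10, 20, ... stands out immediately.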

Dustin Kirkland  (kirkland) wrote : Re: [Bug 358573] Re: ecryptfs private directory randomly unmounts

Hi Ulf-

This problem is absolutely cronjob related.

Do you have any cronjobs running? If so, could you elaborate on what
these cronjobs are doing?

Also, approximately how many ssh and desktop sessions do you have open
as this user when the directory unmounts?

Are you encrypting all of $HOME or just $HOME/Private?

:-Dustin

Ulf Rompe (rompe) wrote :

I'm encrypting ~/Private only. There are tons of cronjobs running on this machine, several of them as my user. The unmount takes place every now and then, but not with every run of a job. Maybe I have to run more than one job at once for this to happen? But then again, I think Cron serializes jobs in order to not start everything at once.

Looking at the similarities of the times of unmounting in the syslog, I can spot these cronjobs to be running every time it happens:

rss2email as my user - should be harmless
fetch-exchange.py as my user - fetches mail from an exchange server and feeds it to procmail
/usr/sbin/update-motd as root

To my knowledge, none of these jobs should be accessing ~/Private in any way. Plus, most of the times they run, my ~/Private stays mounted.

I'm still trying to gain some enlightenment by studying the logs. In addition I will now adjust the start times of my jobs in the hope of separating the jobs by time.

Jamie Strandboge (jdstrand) wrote :

It isn't surprising that ~/Private unmounts but encrypted $HOME does not-- there are all kinds of applications that are accessing files in $HOME that would prevent an umount of an encrypted $HOME directory.

Ulf Rompe (rompe) wrote :

I can now confirm that the unmounting only happens at times when two cronjobs are run as my user. None of the jobs is supposed to do anything in ~/Private. The bug doesn't happen every time, so this seems like a race condition.

Jamie, I don't know if it matters, but the Private dir is also getting unmounted when it is the cwd of several shells and I have files in there opened with an editor.

I will now try to reproduce the bug with simpler cronjobs like "ls ~ >/dev/null".

Ulf Rompe (rompe) wrote :

That worked out well. I have added these jobs to my crontab:

*/1 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null

Therefore every two minutes both jobs are run, while every other minute only one of them gets started.
Output from monitor_private_dir.sh since then:

Mai 19 11:44:27 ~/Private was unmounted.
Mai 19 12:28:27 ~/Private was unmounted.
Mai 19 12:32:27 ~/Private was unmounted.
Mai 19 12:33:27 ~/Private was unmounted.
Mai 19 12:36:27 ~/Private was unmounted.
Mai 19 12:42:27 ~/Private was unmounted.
Mai 19 12:46:27 ~/Private was unmounted.
Mai 19 12:52:27 ~/Private was unmounted.
Mai 19 12:54:27 ~/Private was unmounted.
Mai 19 13:00:27 ~/Private was unmounted.

At 12:33 another regular cronjob was running, so that odd-minute hit doesn't count. All the other hits fall on even minutes, which supports the theory that two arbitrary cronjobs starting at the same time raise the chances of provoking the unmount of ~/Private.

Are you able to reproduce the bug with this test case (adding the two lines to your crontab and interactively running my script)?
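
The even/odd split can be counted mechanically rather than by eye. This is only a sketch, assuming the same monitor output format as above; the function name unmounts_by_parity is hypothetical.

```sh
#!/bin/sh
# Count unmount events that fell on even vs. odd minutes.
# Assumes monitor output lines of the form "<month> <day> HH:MM:SS <message>".
unmounts_by_parity() {
    awk '{ split($3, t, ":"); if ((t[2] + 0) % 2 == 0) even++; else odd++ }
         END { printf "even=%d odd=%d\n", even, odd }'
}
```

On the ten lines of output listed above it reports even=9 odd=1, the single odd hit being the 12:33 one.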

Jamie Strandboge (jdstrand) wrote :

This is still a problem on Karmic. Here is an updated reproducer for Karmic. Create monitor_private_dir.sh:

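#!/bin/sh
# Poll once a second; if the ecryptfs mount is gone, log it and remount.
set -e
private="$HOME/.Private"
while true; do
 # The lower directory shows up in mount output while ~/Private is mounted.
 if ! mount | grep -qF "$private"; then
  # ts (from moreutils) prefixes the message with a timestamp.
  echo "$private was unmounted." | ts
  ecryptfs-mount-private
 fi
 sleep 1
done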

Then add to the user's crontab:
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/1 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null
*/2 * * * * /bin/ls >/dev/null

Then run the monitor:
$ ./bin/monitor_private_dir.sh
Jul 17 07:08:11 /home/jamie/.Private was unmounted.
Jul 17 07:09:07 /home/jamie/.Private was unmounted.

It might take a couple minutes to trigger, but it does trigger fairly easily. I bet increasing the number of concurrent cron jobs will make it hit even faster. This was also done with an encrypted private directory, not encrypted private $HOME, but would theoretically work there too.
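
To try a larger number of concurrent jobs without pasting lines by hand, the crontab entries can be generated. A minimal sketch, assuming the same harmless /bin/ls job as in the reproducer; the helper name gen_reproducer_crontab is invented for this example.

```sh
#!/bin/sh
# Print N identical every-minute crontab entries (N defaults to 20).
# The job is the harmless /bin/ls used in the reproducer above.
gen_reproducer_crontab() {
    n=${1:-20}
    i=0
    while [ "$i" -lt "$n" ]; do
        echo '*/1 * * * * /bin/ls >/dev/null'
        i=$((i + 1))
    done
}
```

Something like `( crontab -l 2>/dev/null; gen_reproducer_crontab 50 ) | crontab -` should append the entries to the current user's crontab on cron implementations that accept "-" for standard input; remember to remove them again after testing.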

Max Bowsher (maxb) wrote :

/me has a flash of inspiration....

cron starts all the jobs approximately simultaneously, calling through pam to start sessions for each..... for each job, pam_ecryptfs.so increments the refcount file in /tmp/ - and the various invocations of pam_ecryptfs.so race with each other, in a textbook classic "lost updates" race condition.

When the jobs terminate, they've usually spread out enough in time that you are less likely to get the same loss of counter decrements.

Hence, more decrements than increments actually take effect, and the refcount hits zero.
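
The lost-update pattern described here can be shown with a plain counter file. The sketch below is not the ecryptfs-utils code: it uses a scratch file in place of the refcount file in /tmp and assumes flock(1) from util-linux is installed; all function names are made up for illustration.

```sh
#!/bin/sh
# Racy read-modify-write: two callers can read the same value and both
# write value+1, so one increment is lost.
incr_unlocked() {
    n=$(cat "$1")
    echo $((n + 1)) > "$1"
}

# Same update, serialized with an exclusive lock on a separate lock file.
incr_locked() {
    (
        flock -x 9
        n=$(cat "$1")
        echo $((n + 1)) > "$1"
    ) 9>"$1.lock"
}

# Start $2 concurrent callers of function $1 against counter file $3,
# wait for them all, and print the final counter value.
run_concurrent() {
    echo 0 > "$3"
    i=0
    while [ "$i" -lt "$2" ]; do
        "$1" "$3" &
        i=$((i + 1))
    done
    wait
    cat "$3"
}
```

Running `run_concurrent incr_locked 20 /tmp/demo-counter` should print 20 every time, whereas `run_concurrent incr_unlocked 20 /tmp/demo-counter` may print a smaller number that varies from run to run. The same reasoning applies to decrements, which is how a counter can reach zero while sessions are still open.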

Dustin Kirkland  (kirkland) wrote :

Reproducer script confirmed. This did demonstrate the problem for me.

Max, good call.

I'm working on improving the locking now...

:-Dustin

Changed in ecryptfs:
status: Confirmed → Fix Committed
Changed in ecryptfs-utils (Ubuntu):
status: Confirmed → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ecryptfs-utils - 76-0ubuntu1

---------------
ecryptfs-utils (76-0ubuntu1) karmic; urgency=low

  [ Dustin Kirkland ]
  * src/utils/ecryptfs-setup-swap: switch from vol_id to blkid,
    LP: #376486
  * debian/ecryptfs-utils.postinst, src/utils/ecryptfs-setup-private:
    don't echo mount passphrase if running in bootstrap mode; prune
    potential leakages from install log, LP: #383650
  * SECURITY UPDATE: mount passphrase recorded in install log (LP: #383650).
    - debian/ecryptfs-utils.postinst: prune private information from
      installer log
    - src/utils/ecryptfs-setup-private: don't echo passphrase if running in
      bootstrap mode
    - CVE-2009-1296
  * src/utils/ecryptfs-setup-private: make some of the lanuage more readable,
    (thanks, anrxc)
  * README, configure.ac, debian/control, debian/rules,
    doc/sourceforge_webpage/README, src/libecryptfs-swig/libecryptfs.py,
    src/libecryptfs-swig/libecryptfs_wrap.c,
    src/libecryptfs/key_management.c, src/libecryptfs/libecryptfs.pc.in,
    src/libecryptfs/main.c, src/pam_ecryptfs/Makefile.am,
    src/utils/manager.c, src/utils/mount.ecryptfs.c: move build from gcrypt
    to nss (this change has been pending for some time)
  * src/utils/ecryptfs-dot-private: dropped, was too hacky
  * ecryptfs-mount-private.1, ecryptfs-setup-private.1: align the
    documentation and implementation of the wrapping-independent feature,
    LP: #383746
  * src/utils/ecryptfs-umount-private: use keyctl list @u, since keyctl show
    stopped working, LP: #400484, #395082
  * src/utils/mount.ecryptfs_private.c: fix counter file locking; solves
    a longstanding bug about "random" umount caused by cronjobs, LP: #358573

  [ Michal Hlavinka (edits by Dustin Kirkland) ]
  * doc/manpage/ecryptfs-mount-private.1,
    doc/manpage/ecryptfs-rewrite-file.1,
    doc/manpage/ecryptfs-setup-private.1, doc/manpage/ecryptfs.7,
    doc/manpage/mount.ecryptfs_private.1,
    doc/manpage/umount.ecryptfs_private.1: documentation updated to note
    possible ecryptfs group membership requirements; Fix ecrypfs.7 man
    page and key_mod_openssl's error message; fix typo
  * src/libecryptfs/decision_graph.c: put a finite limit (5 tries) on
    interactive input; fix memory leaks when asking questions
  * src/libecryptfs/module_mgr.c: Don't error out with EINVAL when
    verbosity=0 and some options are missing.
  * src/utils/umount.ecryptfs.c: no error for missing key when removing it
  * src/libecryptfs-swig/libecryptfs.i: fix compile werror, cast char*
  * src/utils/ecryptfs_add_passphrase.c: fix/test/use return codes;
    return nonzero for --fnek when not supported but used
  * src/include/ecryptfs.h, src/key_mod/ecryptfs_key_mod_openssl.c,
    src/libecryptfs/module_mgr.c: refuse mounting with too small rsa
    key (key_mod_openssl)
  * src/utils/ecryptfs_insert_wrapped_passphrase_into_keyring.c: fix return
    codes
  * src/utils/ecryptfs-rewrite-file: polish output
  * src/libecryptfs/key_management.c: inform about full keyring; insert fnek
    sig into keyring if fnek support check fails; don't fail if key already
    exists in keyring
  * src/utils/ecryptfs-setup-private: if th...


Changed in ecryptfs-utils (Ubuntu):
status: Fix Committed → Fix Released
Changed in ecryptfs:
status: Fix Committed → Fix Released
DarrenShare (darren-moorstreet) wrote :

Hi,

I've just started encountering this bug in the last couple of days - I have a fully encrypted /home directory. Even before finding this bug report I narrowed the cause down to one of three things, one of which was that I'd recently added three new hourly cron jobs.

It's nice to know you've fixed it. Are there any plans on backporting it to the version of ecryptfs-utils in Ubuntu 9.04 (73-0ubuntu6.1) or will I have to wait to upgrade to Karmic for this to be fixed? Will staggering the start of the cron jobs help (e.g. running the first one at 1 minute past the hour, the second at 2 minutes past the hour etc.)?

Thanks.

Max Bowsher (maxb) wrote :

I can't speak for plans on backporting, but staggering the cronjobs so cron never starts two jobs at the same time should mostly work around the issue.
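
As an illustration of the staggering workaround, hourly entries can be generated so that no two share a start minute. This is a rough sketch; the helper name stagger_hourly and the job paths in the usage note are hypothetical.

```sh
#!/bin/sh
# Read one command per line on stdin and emit hourly crontab entries,
# each starting one minute after the previous one (minute 1, 2, 3, ...).
stagger_hourly() {
    minute=1
    while IFS= read -r cmd; do
        printf '%d * * * * %s\n' "$minute" "$cmd"
        minute=$((minute + 1))
    done
}
```

For example, `printf '%s\n' /path/to/job-a /path/to/job-b | stagger_hourly` prints "1 * * * * /path/to/job-a" and "2 * * * * /path/to/job-b". The sketch does no range checking, so it is only meaningful for up to 59 commands.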

Dustin Kirkland  (kirkland) wrote :

Max is right -- that should help work around the issue.

I'd rather not do the backport, unless it's really, really, really
necessary. We're getting pretty close to Karmic's release, and I'm
much happier with that ecryptfs-utils than Jaunty's ;-)

:-Dustin

DarrenShare (darren-moorstreet) wrote :

Fair enough. Thank you both for the comments.

Dustin Kirkland  (kirkland) wrote :

I'll ask jdstrand for a second opinion... You suffered from this,
Jamie. What do you think? SRU-worthy?

:-Dustin

Jamie Strandboge (jdstrand) wrote :

While annoying, IMHO I don't feel it meets the SRU criteria of being a "high impact bug". See https://wiki.ubuntu.com/StableReleaseUpdates for details.

Luchostein (luchostein) wrote :

@kirkland: It is happening again, but on Kubuntu Precise 12.04.2 LTS. I had to remove ~/.ecryptfs/auto-umount to work around the problem, but the race condition under simultaneous cronjobs still happens.

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 12.04.2 LTS
Release: 12.04
Codename: precise

$ dpkg-query -l|grep ecrypt
ii ecryptfs-utils 96-0ubuntu3.1 ecryptfs cryptographic filesystem (utilities)
ii libecryptfs0 96-0ubuntu3.1 ecryptfs cryptographic filesystem (library)

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz
stepping : 9
microcode : 0x15
cpu MHz : 1200.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips : 5980.52
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz
stepping : 9
microcode : 0x15
cpu MHz : 1200.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 2
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc ...
