plaintext -> crypted file name mapping tool

Bug #799157 reported by Dustin Kirkland 
This bug affects 2 people
affects: eCryptfs
status: Fix Released
importance: Wishlist
assigned to: Dustin Kirkland

Bug Description

I am rsync'ing my encrypted home directory to backup storage. In the process, I need to exclude several files and directories (using rsync's --exclude option).
So far I am MANUALLY finding the encrypted filename corresponding to the directory I want to exclude. This works but is time-consuming every time I need to change the list of excluded directories. Besides, it is hard to grasp which (plaintext) directories are being excluded by simply looking at the rsync command-line.

I have just crafted a shell script, ecryptfs-filename-plain2encrypted, that does the job. It relies on the fact that an encrypted file and its plaintext counterpart report the same i-node number.
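The i-node observation can be checked by hand. The following is a minimal sketch, assuming GNU `stat` is available (the script below uses `ls -i` instead); `inode_of` and `same_inode` are hypothetical helper names, not part of ecryptfs-utils.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ecryptfs-utils); assume GNU coreutils stat.

# Print the i-node number of a file or directory.
inode_of() {
    stat -c %i -- "$1"
}

# Succeed if both paths exist and report the same i-node number.
# Returns 2 if either path cannot be examined.
same_inode() {
    a=$(inode_of "$1") || return 2
    b=$(inode_of "$2") || return 2
    [ "$a" = "$b" ]
}
```

On an eCryptfs mount, calling `same_inode` with a plaintext path and its counterpart under the encrypted root would be expected to succeed. Note that the helper compares i-node numbers only, not devices.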

Feel free to distribute it, post it on your website, or provide comments on or improvements to this script.

Regards,

Sergio Mena

PS Here is the script:

<code>

#!/bin/sh

# Utility script to map plaintext and encrypted filenames
# in an eCryptfs directory
# Sergio Mena. 2011-06-18

#Default values
encryptedroot="$(dirname "$HOME")/.ecryptfs/$(basename "$HOME")/.Private"
plaintextroot="$HOME"

usage()
{
cat << EOF
usage: $0 [options] filename

This script prints the ecryptfs counterpart filename (including path) of the plaintext filename \
passed as argument. Note that the script does not use PWD/CWD to locate the filename. Filename \
is a path to the target file/directory, relative to the plaintext root. Likewise, the resulting \
filename includes the path relative to the encrypted root.

OPTIONS:
   -h show this message
   -e path path to encrypted root path (default: $encryptedroot)
   -p path path to plaintext root path (default: $plaintextroot)
   -s swap root paths, so that the mapping runs in the opposite direction (i.e., from \
encrypted filename to plaintext).
EOF
}

reverse=0
while getopts "he:p:s" OPTION; do
    case $OPTION in
        h)
            usage
            exit 0
            ;;
        e)
            encryptedroot="$OPTARG"
            ;;
        p)
            plaintextroot="$OPTARG"
            ;;
        s)
            reverse=1
            ;;
        ?)
            usage >&2
            exit 1
            ;;
    esac
done

shift $((OPTIND - 1))

[ -z "$1" ] &&\
    echo "$0: No filename provided" >&2 &&\
    usage >&2 &&\
    exit 2

[ $reverse -eq 1 ] &&\
    aux="${encryptedroot}" &&\
    encryptedroot="${plaintextroot}" &&\
    plaintextroot="${aux}"

currentencryptedpath=
currentplaintextpath=
rest="$1"

while true; do
    # Quote ${rest} so that names containing spaces or glob characters survive
    nextplaintextdir=`printf '%s\n' "${rest}" | sed 's/\/.*$//'`
    rest=`printf '%s\n' "${rest}" | sed 's/^[^\/]*\/*//'`
    currentplaintextpath=${currentplaintextpath}/${nextplaintextdir}
    [ ! -e "${plaintextroot}/${currentplaintextpath}" ] &&\
        echo "$0: cannot access $1: No such file or directory" >&2 &&\
        exit 1
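    inode=`ls -aid "${plaintextroot}/${currentplaintextpath}" | awk '{print $1}' `
    # Match the i-node column exactly; a plain grep would also match substrings
    # (e.g. i-node 12 inside 3123) or digits that occur in a file name.
    # Stripping the i-node column keeps names that contain spaces intact.
    nextencrypteddir=`ls -ai "${encryptedroot}/${currentencryptedpath}" | \
                      awk -v ino="${inode}" '$1 == ino { sub(/^ *[0-9]+ /, ""); print }'`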
    [ -z "$nextencrypteddir" ] &&\
        echo "$0: Hmmm strange, no encrypted file/dir corresponds to plaintext file/dir" >&2 &&\
        exit 2
    currentencryptedpath="${currentencryptedpath}/${nextencrypteddir}"
    [ -z "$rest" ] &&\
        ( echo "${currentencryptedpath}" | sed 's/^\///' ) &&\
        exit 0
done

</code>

Dustin Kirkland  (kirkland) wrote :

This is good stuff, Sergio!

Could you do two things, such that I can include this in ecryptfs-utils properly...
 1) Could you please attach the file to this bug report
 2) Could you please add a GPLv2 copyright header to the top of your script as comments?

You can find it in /usr/share/common-licenses/GPL-2:

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year> <name of author>

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program; if not, write to the Free Software Foundation, Inc.,
    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Thanks!

Changed in ecryptfs:
importance: Undecided → Wishlist
status: New → Triaged
assignee: nobody → Dustin Kirkland (kirkland)
Sergio Mena (sergio-mena) wrote :

Thanks Dustin. I've done as you suggested and I'm attaching the result.

As you seem to be interested in including the script in future releases, I've done a bit of stress testing to gain some confidence that the script won't fall apart at the first corner case. These are the commands I ran on my home directory to stress test the script:

* Firstly, quite gentle: find . -print0 | xargs -0 -L 1 ecryptfs-filename-plain2encrypted 2> errors.txt
* Then, a bit more involved: find . -print0 | xargs -0 -L 1 -I TOK1234 bash -c $'enc=`ecryptfs-filename-plain2encrypted \'TOK1234\' 2>> errors2.txt`; pla=`ecryptfs-filename-plain2encrypted -s "$enc" 2>> errors3.txt`; echo "$pla"; [ \'TOK1234\' = "$pla" ] || echo \'TOK1234\' "does not match $pla" >> errors4.txt '

The second (long) command won't work when the filename contains single quotes, but the script under test DOES work in those cases.
Another oddity I have found with the second command is that the directory .gvfs yields i-node 1. I'm not a GNOME expert, so I don't know how to handle this case. The script currently complains that .gvfs does not have a counterpart in the encrypted root.

Thanks and regards,

Sergio Mena

Sergio Mena (sergio-mena) wrote :

Attaching the script...

Tyler Hicks (tyhicks) wrote :

Hi Sergio - Thanks for the contribution.

Inode numbers are a great way to correlate upper and lower inodes. However, they're just a quick and dirty way to transform a plaintext filename to an encrypted filename (or vice-versa). The reason is that multiple dentries may point to a single inode.

Therefore, this script could give inaccurate results and I'm afraid users could shoot themselves in the foot by not understanding that subtle detail.

It is useful (I use this same technique at times), so maybe we can ship it in the source tree, but not install it into /usr/bin.

Also, the script is much too complicated for what it does. All you need is a stat call to get the inode number and then find (with the -inum argument) to find the corresponding inode in the lower filesystem. I think Dustin is working on a simpler version which does something along these lines.
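The stat-plus-find idea can be sketched roughly as follows. This is only an illustration of the technique, not the version Dustin is working on; `find_by_inode` is a hypothetical name, and GNU `stat` is assumed.

```shell
#!/bin/sh
# Sketch only: find_by_inode is a hypothetical name; assumes GNU stat.

# Usage: find_by_inode FILE SEARCH_ROOT
# Prints every path under SEARCH_ROOT whose i-node number equals FILE's.
find_by_inode() {
    inum=$(stat -c %i -- "$1") || return 1
    find "$2" -inum "$inum"
}
```

Called with a plaintext file and the lower (encrypted) root, this would print the matching lower path. If several directory entries share the i-node, it prints all of them, which is the ambiguity mentioned above.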

Sergio Mena (sergio-mena) wrote :

Hi Tyler,

I agree with you on the problems that my script has with hard links: the name returned by the script may not be the correct one (although the contents ARE the same, and can be read and written without any problem). The only problem I see is that if you erase the directory entry, you might end up erasing the wrong one.

In short: I agree that using the i-node is a quick and dirty way to solve the problem (but that's the way I went about it, not having any background knowledge of how filename encryption works in ecryptfs). So, I understand your concerns that some users might be disappointed by the side effects it involves.

What I am afraid I don't agree with is the "stat" and "find" solution that you suggest: this solution greatly underperforms, as find will search THE WHOLE directory structure. And if you limit the depth of the search, your script will start looking more and more like mine :-)

Thanks

Sergio

Tyler Hicks (tyhicks) wrote :

Sergio Mena <email address hidden> wrote:
> What I am afraid I don't agree with is the "stat" and "find" solution
> that you suggest: this solution greatly underperforms

With your approach, a user would have to walk the tree manually with
this script. For example, say they're looking for foo/bar/baz, they'd
have to run this tool to find foo, then again to find bar, then once
more to find baz. That is a pain to have to do.

This is a handy tool that probably won't be run very often (not daily,
probably not even weekly) so I think performance gives way to ease of
use in this case.

Sergio Mena (sergio-mena) wrote :

Tyler,

The "while" loop in my script is doing what you describe. No need to do it manually.

Taking your example, if you type:

ecryptfs-filename-plain2encrypted foo/bar/baz

You will get the full path in the lower filesystem (relative to its root)... i.e., the script will spit something like:

ECRYPTFS_FNEK_ENCRYPTED.blabla/ECRYPTFS_FNEK_ENCRYPTED.bleble/ECRYPTFS_FNEK_ENCRYPTED.blibli

As for the problem you spotted with hard links, it would only happen if the hard links are located in the same directory. So the attached patch makes the script fail safely, rather than giving incorrect results.
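The fail-safe behaviour described above could look roughly like this. The sketch is not the attached patch; `lookup_in_dir` is a hypothetical name, and a `find` that supports `-mindepth`/`-maxdepth` is assumed.

```shell
#!/bin/sh
# Sketch only, not the attached patch: lookup_in_dir is a hypothetical name.
# Assumes a find that supports -mindepth/-maxdepth (e.g. GNU find).

# Usage: lookup_in_dir DIR INODE
# Prints the single entry directly inside DIR that has the given i-node number.
# Returns 1 if nothing matches and 2 if several entries share the i-node.
lookup_in_dir() {
    matches=$(find "$1" -mindepth 1 -maxdepth 1 -inum "$2")
    [ -n "$matches" ] || return 1
    count=$(printf '%s\n' "$matches" | wc -l)
    if [ "$count" -gt 1 ]; then
        echo "lookup_in_dir: i-node $2 has $count names in $1" >&2
        return 2
    fi
    basename -- "$matches"
}
```

Counting matches by line means that a file name containing a newline would be miscounted; the sketch does not handle that case.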

Dustin Kirkland  (kirkland) wrote :

I have reworked the script a bit differently.

I've simplified the name to "ecryptfs-find", and allowed it to take either an encrypted, or a plaintext filename. It then does its best to guess the direction you're trying to map, and find the right mountpoint. None of these are perfect, but if it's right 98% of the time, I'm quite happy ;-)

I've retained your name as an author, Sergio.

I'd be interested to see what you think. I've committed it to lp:ecryptfs, though I haven't released yet.

Dustin Kirkland  (kirkland) wrote :
Changed in ecryptfs:
status: Triaged → Fix Committed
Sergio Mena (sergio-mena) wrote :

Hi Dustin,

Thanks for keeping me as an author, I appreciate it.
I like your version of the script, especially the clever way in which you figure out the mount points to explore (as opposed to my variables for the two "roots", defaulting to $HOME and /home/.ecryptfs/$USERNAME).

The heuristic you use (the ECRYPTFS_FNEK_ENCRYPTED prefix) seems also very reasonable to me.
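A direction guess based on that prefix might be sketched like this. It is an illustration only, not the code in ecryptfs-find; `guess_direction` is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: guess_direction is a hypothetical name, not code from ecryptfs-find.

# Usage: guess_direction PATH
# Prints "encrypted" if the last path component starts with the
# ECRYPTFS_FNEK_ENCRYPTED. prefix, and "plaintext" otherwise.
guess_direction() {
    name=${1%/}          # drop one trailing slash, if any
    name=${name##*/}     # keep only the last path component
    case $name in
        ECRYPTFS_FNEK_ENCRYPTED.*) echo encrypted ;;
        *)                         echo plaintext ;;
    esac
}
```

A plaintext file that happens to be named with the same prefix would be misclassified, which is why this is a heuristic.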

Finally, I still have a performance concern about doing a "find" for the inode throughout entire mounted filesystems. I think the idea laid out in my version, which looks up the inodes of the directories along the path, is worth it (e.g., for aa/bb/cc/dd.txt, we look up aa first, then bb in aa, then cc in bb and finally dd.txt in cc). An algorithmic cost of O(log n) as opposed to O(n) is a huge performance difference. If this doesn't convince you, let's stop here and keep the script the way you wrote it. If, on the other hand, you think the line <code>find "$m/" -inum "$inum"</code> towards the end of the script could be rewritten to be more efficient, I'll be very happy to try it out.

One more thing. From Tyler's comments, I figured out that, to obtain the i-node,
stat --printf %i "$1"
...is maybe a more elegant way than...
ls -aid "$1" | awk '{print $1}'

Cheers,

Sergio

Dustin Kirkland  (kirkland) wrote : Re: [Bug 799157] Re: plaintext -> crypted file name mapping tool

Sergio,

Thanks for the comments.

Perhaps stat is more elegant. I'll review Tyler's comments and take a
look at stat.

I also agree with your logic about the performance of the find. It's
quite quick here for me, though I am on an SSD and a rather powerful
laptop.

find has an option, -maxdepth, which could benefit us here. We could
do a bit of simple math on the target, determine how deep the
file/directory name should be, and add that to the find argument.
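The depth calculation could be sketched as below. This is an assumption about how the idea might be implemented, not the released code; `path_depth` and `find_at_depth` are hypothetical names, and GNU `stat` plus a `find` with `-mindepth`/`-maxdepth` are assumed.

```shell
#!/bin/sh
# Sketch only: path_depth and find_at_depth are hypothetical names.
# Assumes GNU stat and a find that supports -mindepth/-maxdepth.

# Usage: path_depth RELATIVE_PATH
# Prints the number of real components, e.g. 3 for foo/bar/baz.
# Empty components and "." are ignored; ".." is not handled.
path_depth() {
    printf '%s\n' "$1" | awk -F/ '{
        n = 0
        for (i = 1; i <= NF; i++)
            if ($i != "" && $i != ".") n++
        print n
    }'
}

# Usage: find_at_depth UPPER_ROOT LOWER_ROOT RELATIVE_PATH
# Searches LOWER_ROOT for the i-node of UPPER_ROOT/RELATIVE_PATH, looking
# only at the depth where the counterpart is expected to live.
find_at_depth() {
    depth=$(path_depth "$3")
    inum=$(stat -c %i -- "$1/$3") || return 1
    find "$2" -mindepth "$depth" -maxdepth "$depth" -inum "$inum"
}
```

Bounding the depth stops find from descending below the target level, but it still visits every directory above that level, so it is cheaper than a full scan without being a per-component lookup.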

Would you care to bzr branch lp:ecryptfs, make those two changes (stat
and maxdepth) and send a patch or a merge proposal?

--
:-Dustin

Dustin Kirkland  (kirkland) wrote :

Released in ecryptfs-utils 89

Changed in ecryptfs:
status: Fix Committed → Fix Released