UBUNTU: SAUCE: overlayfs: Propogate nosuid from lower and upper mounts
An overlayfs mount using an upper or lower directory from a
nosuid filesystem bypasses this restriction. Change this so
that if any lower or upper directory is nosuid at mount time the
overlayfs superblock is marked nosuid. This requires some
additions at the vfs level since nosuid currently only applies to
mounts, so a SB_I_NOSUID flag is added along with a helper
function to check a path for nosuid in both the mount and the
superblock.
UBUNTU: SAUCE: overlayfs: Be more careful about copying up sxid files
When an overlayfs filesystem's lowerdir is on a nosuid filesystem
but the upperdir is not, it's possible to copy up an sxid file or
stick directory into upperdir without changing the mode by
opening the file rw in the overlayfs mount without writing to it.
This makes it possible to bypass the nosuid restriction on the
lowerdir mount.
It's a bad idea in general to let the mounter copy up a sxid file
if the mounter wouldn't have had permission to create the sxid
file in the first place. Therefore change ovl_set_xattr to
exclude these bits when initially setting the mode, then set the
full mode after setting the user for the inode. This allows copy
up for non-sxid files to work as before but causes copy up to
fail for the cases where the user could not have created the sxid
inode in upperdir.
UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs
The original mounter had CAP_SYS_ADMIN in the user namespace
where the mount happened, and the vfs has validated that the user
has permission to do the requested operation. This is sufficient
for allowing the kernel to write these specific xattrs, so bypass
the permission checks for these xattrs.
UBUNTU: SAUCE: overlayfs: Use mounter's credentials instead of selectively raising caps
When overlayfs needs to perform internal operations which require
privileges the current user may not have, it does so by
selectively raising the required capabilities in the current set
of credentials. If the current process is in a user namespace
this doesn't always work, as operations such as setting
privileged xattrs often requires privileges in init_user_ns.
These operations really ought to be permitted, based on a couple
of facts:
1. The vfs has already verified that the current process is
allowed to perform the requested operation on the overlayfs
super block, and overlayfs has verified that the operation is
permitted in upperdir.
2. The original mounter of the overlayfs super block was
privileged enough to perform the internal overlayfs
operations required to satisfy the user's request in
upperdir.
On the other hand, if the filesystem is mounted from a user
namespace and then accessed in init_user_ns the credentials taken
will exceed those of the mounter. This could result in performing
operations that the user could not do otherwise, such as creating
files which are sxid for another user or group. Both of these
issues can be prevented by using the mounter's credentials when
performing privileged overlayfs-internal operations.
Add a new internal interface, ovl_prepare_creds(), which returns
a new set of credentials for performing privileged internal
operations that is identical to the mounter's creds. Use this
internal interface instead of using prepare_creds() and
selectively raising the needed capabilities.
This interface returns a new set of credentials which is an exact
copy of another set. Also update prepare_kernel_cred() to use
this function instead of duplicating code.