Commit graph

47 commits

Author SHA1 Message Date
Vivek Goyal
7eaa995c75 ovl: warn instead of error if d_type is not supported
commit e7c0b5991dd1be7b6f6dc2b54a15a0f47b64b007 upstream.

overlay needs underlying fs to support d_type. Recently I put in a
patch in to detect this condition and started failing mount if
underlying fs did not support d_type.

But this breaks existing configurations over kernel upgrade. Those who
are running docker (partially broken configuration) with xfs not
supporting d_type, are surprised that after kernel upgrade docker does
not run anymore.

https://github.com/docker/docker/issues/22937#issuecomment-229881315

So instead of erroring out, detect broken configuration and warn
about it. This should allow existing docker setups to continue
working after kernel upgrade.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 45aebeaf4f67 ("ovl: Ensure upper filesystem supports d_type")
Cc: <stable@vger.kernel.org> 4.6
Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-28 07:23:43 +02:00
Vivek Goyal
0f9a6d88cd ovl: Do d_type check only if work dir creation was successful
commit 21765194cecf2e4514ad75244df459f188140a0f upstream.

d_type check requires successful creation of workdir as iterates
through work dir and expects work dir to be present in it. If that's
not the case, this check will always return d_type not supported even
if underlying filesystem might be supporting it.

So don't do this check if work dir creation failed in previous step.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-28 07:23:43 +02:00
Vivek Goyal
d5e678942d ovl: Ensure upper filesystem supports d_type
commit 45aebeaf4f67468f76bedf62923a576a519a9b68 upstream.

In some instances xfs has been created with ftype=0 and there if a file
on lower fs is removed, overlay leaves a whiteout in upper fs but that
whiteout does not get filtered out and is visible to overlayfs users.

And reason it does not get filtered out because upper filesystem does
not report file type of whiteout as DT_CHR during iterate_dir().

So it seems to be a requirement that upper filesystem support d_type for
overlayfs to work properly. Do this check during mount and fail if d_type
is not supported.

Suggested-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-28 07:23:43 +02:00
Miklos Szeredi
2f949da9c0 ovl: fix workdir creation
commit e1ff3dd1ae52cef5b5373c8cc4ad949c2c25a71c upstream.

Workdir creation fails in latest kernel.

Fix by allowing EOPNOTSUPP as a valid return value from
vfs_removexattr(XATTR_NAME_POSIX_ACL_*).  Upper filesystem may not support
ACL and still be perfectly able to support overlayfs.

Reported-by: Martin Ziegler <ziegler@uni-freiburg.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: c11b9fdd6a61 ("ovl: remove posix_acl_default from workdir")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:52 +02:00
Miklos Szeredi
d57a6c7480 ovl: remove posix_acl_default from workdir
commit c11b9fdd6a612f376a5e886505f1c54c16d8c380 upstream.

Clear out posix acl xattrs on workdir and also reset the mode after
creation so that an inherited sgid bit is cleared.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15 08:27:52 +02:00
Miklos Szeredi
54c4ddcbab ovl: disallow overlayfs as upperdir
commit 76bc8e2843b66f8205026365966b49ec6da39ae7 upstream.

This does not work and does not make sense.  So instead of fixing it
(probably not hard) just disallow.

Reported-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-20 18:09:19 +02:00
Miklos Szeredi
c452dfc332 fs: add file_dentry()
commit d101a125954eae1d397adda94ca6319485a50493 upstream.

This series fixes bugs in nfs and ext4 due to 4bacc9c923 ("overlayfs:
Make f_path always point to the overlay and f_inode to the underlay").

Regular files opened on overlayfs will result in the file being opened on
the underlying filesystem, while f_path points to the overlayfs
mount/dentry.

This confuses filesystems which get the dentry from struct file and assume
it's theirs.

Add a new helper, file_dentry() [*], to get the filesystem's own dentry
from the file.  This checks file->f_path.dentry->d_flags against
DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
set (this is the common, non-overlayfs case).

In the uncommon case it will call into overlayfs's ->d_real() to get the
underlying dentry, matching file_inode(file).

The reason we need to check against the inode is that if the file is copied
up while being open, d_real() would return the upper dentry, while the open
file comes from the lower dentry.

[*] If possible, it's better simply to use file_inode() instead.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Daniel Axtens <dja@axtens.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-20 15:42:12 +09:00
Konstantin Khlebnikov
455fadecd1 ovl: fix working on distributed fs as lower layer
commit b5891cfab08fe3144a616e8e734df7749fb3b7d0 upstream.

This adds missing .d_select_inode into alternative dentry_operations.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Fixes: 7c03b5d45b ("ovl: allow distributed fs as lower layer")
Fixes: 4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Reviewed-by: Nikolay Borisov <kernel@kyup.com>
Tested-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-16 08:42:59 -07:00
Konstantin Khlebnikov
e25cdcefed ovl: ignore lower entries when checking purity of non-directory entries
commit 45d11738969633ec07ca35d75d486bf2d8918df6 upstream.

After rename file dentry still holds reference to lower dentry from
previous location. This doesn't matter for data access because data comes
from upper dentry. But this stale lower dentry taints dentry at new
location and turns it into non-pure upper. Such file leaves visible
whiteout entry after remove in directory which shouldn't have whiteouts at
all.

Overlayfs already tracks pureness of file location in oe->opaque.  This
patch just uses that for detecting actual path type.

Comment from Vivek Goyal's patch:

Here are the details of the problem. Do following.

$ mkdir upper lower work merged upper/dir/
$ touch lower/test
$ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=
work merged
$ mv merged/test merged/dir/
$ rm merged/dir/test
$ ls -l merged/dir/
/usr/bin/ls: cannot access merged/dir/test: No such file or directory
total 0
c????????? ? ? ? ?            ? test

Basic problem seems to be that once a file has been unlinked, a whiteout
has been left behind which was not needed and hence it becomes visible.

Whiteout is visible because parent dir is of not type MERGE, hence
od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
passes on iterate handling directly to underlying fs. Underlying fs does
not know/filter whiteouts so it becomes visible to user.

Why did we leave a whiteout to begin with when we should not have.
ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
whiteout if file is pure upper. In this case file is not found to be pure
upper hence whiteout is left.

So why file was not PURE_UPPER in this case? I think because dentry is
still carrying some leftover state which was valid before rename. For
example, od->numlower was set to 1 as it was a lower file. After rename,
this state is not valid anymore as there is no such file in lower.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Viktor Stanchev <me@viktorstanchev.com>
Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-16 08:42:59 -07:00
Miklos Szeredi
8373f6590f ovl: setattr: check permissions before copy-up
commit cf9a6784f7c1b5ee2b9159a1246e327c331c5697 upstream.

Without this copy-up of a file can be forced, even without actually being
allowed to do anything on the file.

[Arnd Bergmann] include <linux/pagemap.h> for PAGE_CACHE_SIZE (used by
MAX_LFS_FILESIZE definition).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25 12:01:24 -08:00
Miklos Szeredi
7193e80296 ovl: root: copy attr
commit ed06e069775ad9236087594a1c1667367e983fb5 upstream.

We copy i_uid and i_gid of underlying inode into overlayfs inode.  Except
for the root inode.

Fix this omission.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25 12:01:24 -08:00
Linus Torvalds
4bb0fb57f3 Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs bug fixes from Miklos Szeredi:
 "This contains fixes for bugs that appeared in earlier kernels (all are
  marked for -stable)"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: free lower_mnt array in ovl_put_super
  ovl: free stack of paths in ovl_fill_super
  ovl: fix open in stacked overlay
  ovl: fix dentry reference leak
  ovl: use O_LARGEFILE in ovl_copy_up()
2015-10-31 14:49:19 -07:00
Konstantin Khlebnikov
5ffdbe8bf1 ovl: free lower_mnt array in ovl_put_super
This fixes memory leak after umount.

Kmemleak report:

unreferenced object 0xffff8800ba791010 (size 8):
  comm "mount", pid 2394, jiffies 4294996294 (age 53.920s)
  hex dump (first 8 bytes):
    20 1c 13 02 00 88 ff ff                           .......
  backtrace:
    [<ffffffff811f8cd4>] create_object+0x124/0x2c0
    [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
    [<ffffffff811dffe6>] __kmalloc+0x106/0x340
    [<ffffffffa0152bfc>] ovl_fill_super+0x55c/0x9b0 [overlay]
    [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
    [<ffffffffa0152118>] ovl_mount+0x18/0x20 [overlay]
    [<ffffffff81201ab3>] mount_fs+0x43/0x170
    [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
    [<ffffffff812233ad>] do_mount+0x22d/0xdf0
    [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
    [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Fixes: dd662667e6 ("ovl: add mutli-layer infrastructure")
Cc: <stable@vger.kernel.org> # v4.0+
2015-10-12 17:11:44 +02:00
Konstantin Khlebnikov
0f95502ad8 ovl: free stack of paths in ovl_fill_super
This fixes small memory leak after mount.

Kmemleak report:

unreferenced object 0xffff88003683fe00 (size 16):
  comm "mount", pid 2029, jiffies 4294909563 (age 33.380s)
  hex dump (first 16 bytes):
    20 27 1f bb 00 88 ff ff 40 4b 0f 36 02 88 ff ff   '......@K.6....
  backtrace:
    [<ffffffff811f8cd4>] create_object+0x124/0x2c0
    [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
    [<ffffffff811dffe6>] __kmalloc+0x106/0x340
    [<ffffffffa01b7a29>] ovl_fill_super+0x389/0x9a0 [overlay]
    [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
    [<ffffffffa01b7118>] ovl_mount+0x18/0x20 [overlay]
    [<ffffffff81201ab3>] mount_fs+0x43/0x170
    [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
    [<ffffffff812233ad>] do_mount+0x22d/0xdf0
    [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
    [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Fixes: a78d9f0d5d ("ovl: support multiple lower layers")
Cc: <stable@vger.kernel.org> # v4.0+
2015-10-12 17:11:43 +02:00
Kees Cook
a068acf2ee fs: create and use seq_show_option for escaping
Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g.  new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else.  This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.

Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
of "sudo" is something more sneaky:

  $ BASE="ovl"
  $ MNT="$BASE/mnt"
  $ LOW="$BASE/lower"
  $ UP="$BASE/upper"
  $ WORK="$BASE/work/ 0 0
  none /proc fuse.pwn user_id=1000"
  $ mkdir -p "$LOW" "$UP" "$WORK"
  $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
  $ cat /proc/mounts
  none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
  none /proc fuse.pwn user_id=1000 0 0
  $ fusermount -u /proc
  $ cat /proc/mounts
  cat: /proc/mounts: No such file or directory

This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed.  Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.

[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Linus Torvalds
1dc51b8288 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted VFS fixes and related cleanups (IMO the most interesting in
  that part are f_path-related things and Eric's descriptor-related
  stuff).  UFS regression fixes (it got broken last cycle).  9P fixes.
  fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups".  The
  file_table locking rule change by Eric Dumazet is a rather big and
  fundamental update even if the patch isn't huge.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
  9p: cope with bogus responses from server in p9_client_{read,write}
  p9_client_write(): avoid double p9_free_req()
  9p: forgetting to cancel request on interrupted zero-copy RPC
  dax: bdev_direct_access() may sleep
  block: Add support for DAX reads/writes to block devices
  dax: Use copy_from_iter_nocache
  dax: Add block size note to documentation
  fs/file.c: __fget() and dup2() atomicity rules
  fs/file.c: don't acquire files->file_lock in fd_install()
  fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
  vfs: avoid creation of inode number 0 in get_next_ino
  namei: make set_root_rcu() return void
  make simple_positive() public
  ufs: use dir_pages instead of ufs_dir_pages()
  pagemap.h: move dir_pages() over there
  remove the pointless include of lglock.h
  fs: cleanup slight list_entry abuse
  xfs: Correctly lock inode when removing suid and file capabilities
  fs: Call security_ops->inode_killpriv on truncate
  fs: Provide function telling whether file_remove_privs() will do anything
  ...
2015-07-04 19:36:06 -07:00
Miklos Szeredi
7c03b5d45b ovl: allow distributed fs as lower layer
Allow filesystems with .d_revalidate as lower layer(s), but not as upper
layer.

For local filesystems the rule was that modifications on the layers
directly while being part of the overlay results in undefined behavior.

This can easily be extended to distributed filesystems: we assume the tree
used as lower layer is static, which means ->d_revalidate() should always
return "1".  If that is not the case, return -ESTALE, don't try to work
around the modification.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-06-22 13:53:48 +02:00
Miklos Szeredi
a6f15d9a75 ovl: don't traverse automount points
NFS and other distributed filesystems may place automount points in the
tree.  Previoulsy overlayfs refused to mount such filesystems types (based
on the existence of the .d_automount callback), even if the actual export
didn't have any automount points.

It cannot be determined in advance whether the filesystem has automount
points or not.  The solution is to allow fs with .d_automount but refuse to
traverse any automount points encountered.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-06-22 13:53:48 +02:00
David Howells
4bacc9c923 overlayfs: Make f_path always point to the overlay and f_inode to the underlay
Make file->f_path always point to the overlay dentry so that the path in
/proc/pid/fd is correct and to ensure that label-based LSMs have access to the
overlay as well as the underlay (path-based LSMs probably don't need it).

Using my union testsuite to set things up, before the patch I see:

	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
	...
	lr-x------. 1 root root 64 Jun  5 14:38 5 -> /a/foo107
	[root@andromeda union-testsuite]# stat /mnt/a/foo107
	...
	Device: 23h/35d Inode: 13381       Links: 1
	...
	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
	...
	Device: 23h/35d Inode: 13381       Links: 1
	...

After the patch:

	[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
	[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
	...
	lr-x------. 1 root root 64 Jun  5 14:22 5 -> /mnt/a/foo107
	[root@andromeda union-testsuite]# stat /mnt/a/foo107
	...
	Device: 23h/35d Inode: 40346       Links: 1
	...
	[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
	...
	Device: 23h/35d Inode: 40346       Links: 1
	...

Note the change in where /proc/$$/fd/5 points to in the ls command.  It was
pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
(which is correct).

The inode accessed, however, is the lower layer.  The union layer is on device
25h/37d and the upper layer on 24h/36d.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-06-19 03:19:32 -04:00
Miklos Szeredi
cc6f67bcaf ovl: mount read-only if workdir can't be created
OpenWRT folks reported that overlayfs fails to mount if upper fs is full,
because workdir can't be created.  Wordir creation can fail for various
other reasons too.

There's no reason that the mount itself should fail, overlayfs can work
fine without a workdir, as long as the overlay isn't modified.

So mount it read-only and don't allow remounting read-write.

Add a couple of WARN_ON()s for the impossible case of workdir being used
despite being read-only.

Reported-by: Bastian Bittorf <bittorf@bluebottle.com> 
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: <stable@vger.kernel.org> # v3.18+
2015-05-19 14:30:12 +02:00
hujianyang
71cbad7e69 ovl: upper fs should not be R/O
After importing multi-lower layer support, users could mount a r/o
partition as the left most lowerdir instead of using it as upperdir.
And a r/o upperdir may cause an error like

	overlayfs: failed to create directory ./workdir/work

during mount.

This patch check the *s_flags* of upper fs and return an error if
it is a r/o partition. The checking of *upper_mnt->mnt_sb->s_flags*
can be removed now.

This patch also remove

	/* FIXME: workdir is not needed for a R/O mount */

from ovl_fill_super() because:

1) for upper fs r/o case
Setting a r/o partition as upper is prevented, no need to care about
workdir in this case.

2) for "mount overlay -o ro" with a r/w upper fs case
Users could remount overlayfs to r/w in this case, so workdir should
not be omitted.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-03-18 10:29:48 +01:00
hujianyang
6be4506e34 ovl: check lowerdir amount for non-upper mount
Recently multi-lower layer mount support allow upperdir and workdir
to be omitted, then cause overlayfs can be mount with only one
lowerdir directory. This action make no sense and have potential risk.

This patch check the total number of lower directories to prevent
mounting overlayfs with only one directory.

Also, an error message is added to indicate lower directories exceed
OVL_MAX_STACK limit.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-03-18 10:29:47 +01:00
hujianyang
bead55ef77 ovl: print error message for invalid mount options
Overlayfs should print an error message if an incorrect mount option
is caught like other filesystems.

After this patch, improper option input could be clearly known.

Reported-by: Fabian Sturm <fabian.sturm@aduu.de>
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-03-18 10:29:47 +01:00
Seunghun Lee
3cdf6fe910 ovl: Prevent rw remount when it should be ro mount
Overlayfs should be mounted read-only when upper-fs is read-only or nonexistent.
But now it can be remounted read-write and this can cause kernel panic.
So we should prevent read-write remount when the above situation happens.

Signed-off-by: Seunghun Lee <waydi1@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-01-08 14:47:21 +01:00
hujianyang
a425c037f3 ovl: Fix opaque regression in ovl_lookup
Current multi-layer support overlayfs has a regression in
.lookup(). If there is a directory in upperdir and a regular
file has same name in lowerdir in a merged directory, lower
file is hidden and upper directory is set to opaque in former
case. But it is changed in present code.

In lowerdir lookup path, if a found inode is not directory,
the type checking of previous inode is missing. This inode
will be copied to the lowerstack of ovl_entry directly.

That will lead to several wrong conditions, for example,
the reading of the directory in upperdir may return an error
like:

   ls: reading directory .: Not a directory

This patch makes the lowerdir lookup path check the opaque
for non-directory file too.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-01-08 14:47:20 +01:00
hujianyang
2f83fd8c28 ovl: Fix kernel panic while mounting overlayfs
The function ovl_fill_super() in recently multi-layer support
version will incorrectly return 0 at error handling path and
then cause kernel panic.

This failure can be reproduced by mounting a overlayfs with
upperdir and workdir in different mounts.

And also, If the memory allocation of *lower_mnt* fail, this
function may return an zero either.

This patch fix this problem by setting *err* to proper error
number before jumping to error handling path.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-01-08 14:47:20 +01:00
hujianyang
cead89bb08 ovl: Use macros to present ovl_xattr
This patch adds two macros:

OVL_XATTR_PRE_NAME and OVL_XATTR_PRE_LEN

to present ovl_xattr name prefix and its length. Also, a
new macro OVL_XATTR_OPAQUE is introduced to replace old
*ovl_opaque_xattr*.

Fix the length of "trusted.overlay." to *16*.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:52 +01:00
hujianyang
1ba38725a3 ovl: Cleanup redundant blank lines
This patch removes redundant blanks lines in overlayfs.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:52 +01:00
Miklos Szeredi
a78d9f0d5d ovl: support multiple lower layers
Allow "lowerdir=" option to contain multiple lower directories separated by
a colon (e.g. "lowerdir=/bin:/usr/bin").  Colon characters in filenames can
be escaped with a backslash.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:52 +01:00
Miklos Szeredi
53a08cb9b8 ovl: make upperdir optional
Make "upperdir=" mount option optional.  If "upperdir=" is not given, then
the "workdir=" option is also optional (and ignored if given).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:51 +01:00
Miklos Szeredi
ab508822ca ovl: improve mount helpers
Move common checks into ovl_mount_dir() helper.

Create helper for looking up lower directories.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:49 +01:00
Miklos Szeredi
3b7a9a249a ovl: mount: change order of initialization
Move allocation of root entry above to where it's needed.

Move initializations related to upperdir and workdir near each other.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:48 +01:00
Miklos Szeredi
4ebc581828 ovl: allow statfs if no upper layer
Handle "no upper layer" case in statfs.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:46 +01:00
Miklos Szeredi
09e10322b7 ovl: lookup ENAMETOOLONG on lower means ENOENT
"Suppose you have in one of the lower layers a filesystem with
->lookup()-enforced upper limit on name length.  Pretty much every local fs
has one, but... they are not all equal.  255 characters is the common upper
limit, but e.g. jffs2 stops at 254, minixfs upper limit is somewhere from
14 to 60, depending upon version, etc.  You are doing a lookup for
something that is present in upper layer, but happens to be too long for
one of the lower layers.  Too bad - ENAMETOOLONG for you..."

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:45 +01:00
Miklos Szeredi
3e01cee3b9 ovl: check whiteout on lowest layer as well
Not checking whiteouts on lowest layer was an optimization (there's nothing
to white out there), but it could result in inconsitent behavior when a
layer previously used as upper/middle is later used as lowest. 

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:45 +01:00
Miklos Szeredi
3d3c6b8939 ovl: multi-layer lookup
Look up dentry in all relevant layers.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:44 +01:00
Miklos Szeredi
9d7459d834 ovl: multi-layer readdir
If multiple lower layers exist, merge them as well in readdir according to
the same rules as merging upper with lower.  I.e. take whiteouts and opaque
directories into account on all but the lowers layer.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:44 +01:00
Miklos Szeredi
5ef88da56a ovl: helper to iterate layers
Add helper to iterate through all the layers, starting from the upper layer
(if exists) and continuing down through the lower layers.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:43 +01:00
Miklos Szeredi
dd662667e6 ovl: add mutli-layer infrastructure
Add multiple lower layers to 'struct ovl_fs' and 'struct ovl_entry'.

ovl_entry will have an array of paths, instead of just the dentry.  This
allows a compact array containing just the layers which exist at current
point in the tree (which is expected to be a small number for the majority
of dentries).

The number of layers is not limited by this infrastructure.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:43 +01:00
Miklos Szeredi
1afaba1ecb ovl: make path-type a bitmap
OVL_PATH_PURE_UPPER -> __OVL_PATH_UPPER | __OVL_PATH_PURE
OVL_PATH_UPPER      -> __OVL_PATH_UPPER
OVL_PATH_MERGE      -> __OVL_PATH_UPPER | __OVL_PATH_MERGE
OVL_PATH_LOWER      -> 0

Multiple R/O layers will allow __OVL_PATH_MERGE without __OVL_PATH_UPPER.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:42 +01:00
Miklos Szeredi
71d509280f ovl: use lockless_dereference() for upperdentry
Don't open code lockless_dereference() in ovl_upperdentry_dereference().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-11-20 16:40:01 +01:00
Miklos Szeredi
91c7794713 ovl: allow filenames with comma
Allow option separator (comma) to be escaped with backslash.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-11-20 16:40:00 +01:00
Miklos Szeredi
ef94b1864d ovl: rename filesystem type to "overlay"
Some distributions carry an "old" format of overlayfs while mainline has a
"new" format.

The distros will possibly want to keep the old overlayfs alongside the new
for compatibility reasons.

To make it possible to differentiate the two versions change the name of
the new one from "overlayfs" to "overlay".

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Andy Whitcroft <apw@canonical.com>
2014-11-20 16:39:59 +01:00
Miklos Szeredi
69c433ed2e fs: limit filesystem stacking depth
Add a simple read-only counter to super_block that indicates how deep this
is in the stack of filesystems.  Previously ecryptfs was the only stackable
filesystem and it explicitly disallowed multiple layers of itself.

Overlayfs, however, can be stacked recursively and also may be stacked
on top of ecryptfs or vice versa.

To limit the kernel stack usage we must limit the depth of the
filesystem stack.  Initially the limit is set to 2.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-10-24 00:14:39 +02:00
Erez Zadok
f45827e841 overlayfs: implement show_options
This is useful because of the stacking nature of overlayfs.  Users like to
find out (via /proc/mounts) which lower/upper directory were used at mount
time.

AV: even failing ovl_parse_opt() could've done some kstrdup()
AV: failure of ovl_alloc_entry() should end up with ENOMEM, not EINVAL

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-10-24 00:14:38 +02:00
Andy Whitcroft
cc2596392a overlayfs: add statfs support
Add support for statfs to the overlayfs filesystem.  As the upper layer
is the target of all write operations assume that the space in that
filesystem is the space in the overlayfs.  There will be some inaccuracy as
overwriting a file will copy it up and consume space we were not expecting,
but it is better than nothing.

Use the upper layer dentry and mount from the overlayfs root inode,
passing the statfs call to that filesystem.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-10-24 00:14:38 +02:00
Miklos Szeredi
e9be9d5e76 overlay filesystem
Overlayfs allows one, usually read-write, directory tree to be
overlaid onto another, read-only directory tree.  All modifications
go to the upper, writable layer.

This type of mechanism is most often used for live CDs but there's a
wide variety of other uses.

The implementation differs from other "union filesystem"
implementations in that after a file is opened all operations go
directly to the underlying, lower or upper, filesystems.  This
simplifies the implementation and allows native performance in these
cases.

The dentry tree is duplicated from the underlying filesystems, this
enables fast cached lookups without adding special support into the
VFS.  This uses slightly more memory than union mounts, but dentries
are relatively small.

Currently inodes are duplicated as well, but it is a possible
optimization to share inodes for non-directories.

Opening non directories results in the open forwarded to the
underlying filesystem.  This makes the behavior very similar to union
mounts (with the same limitations vs. fchmod/fchown on O_RDONLY file
descriptors).

Usage:

  mount -t overlayfs overlayfs -olowerdir=/lower,upperdir=/upper/upper,workdir=/upper/work /overlay

The following cotributions have been folded into this patch:

Neil Brown <neilb@suse.de>:
 - minimal remount support
 - use correct seek function for directories
 - initialise is_real before use
 - rename ovl_fill_cache to ovl_dir_read

Felix Fietkau <nbd@openwrt.org>:
 - fix a deadlock in ovl_dir_read_merged
 - fix a deadlock in ovl_remove_whiteouts

Erez Zadok <ezk@fsl.cs.sunysb.edu>
 - fix cleanup after WARN_ON

Sedat Dilek <sedat.dilek@googlemail.com>
 - fix up permission to confirm to new API

Robin Dong <hao.bigrat@gmail.com>
 - fix possible leak in ovl_new_inode
 - create new inode in ovl_link

Andy Whitcroft <apw@canonical.com>
 - switch to __inode_permission()
 - copy up i_uid/i_gid from the underlying inode

AV:
 - ovl_copy_up_locked() - dput(ERR_PTR(...)) on two failure exits
 - ovl_clear_empty() - one failure exit forgetting to do unlock_rename(),
   lack of check for udir being the parent of upper, dropping and regaining
   the lock on udir (which would require _another_ check for parent being
   right).
 - bogus d_drop() in copyup and rename [fix from your mail]
 - copyup/remove and copyup/rename races [fix from your mail]
 - ovl_dir_fsync() leaving ERR_PTR() in ->realfile
 - ovl_entry_free() is pointless - it's just a kfree_rcu()
 - fold ovl_do_lookup() into ovl_lookup()
 - manually assigning ->d_op is wrong.  Just use ->s_d_op.
 [patches picked from Miklos]:
 * copyup/remove and copyup/rename races
 * bogus d_drop() in copyup and rename

Also thanks to the following people for testing and reporting bugs:

  Jordi Pujol <jordipujolp@gmail.com>
  Andy Whitcroft <apw@canonical.com>
  Michal Suchanek <hramrach@centrum.cz>
  Felix Fietkau <nbd@openwrt.org>
  Erez Zadok <ezk@fsl.cs.sunysb.edu>
  Randy Dunlap <rdunlap@xenotime.net>

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-10-24 00:14:38 +02:00