Commit graph

43124 commits

Author SHA1 Message Date
Theodore Ts'o
6ad4196f19 ext4: return EROFS if device is r/o and journal replay is needed
commit 4753d8a24d4588657bc0a4cd66d4e282dff15c8c upstream.

If the file system requires journal recovery, and the device is
read-ony, return EROFS to the mount system call.  This allows xfstests
generic/050 to pass.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:27 +01:00
Theodore Ts'o
9a79248c08 ext4: preserve the needs_recovery flag when the journal is aborted
commit 97abd7d4b5d9c48ec15c425485f054e1c15e591b upstream.

If the journal is aborted, the needs_recovery feature flag should not
be removed.  Otherwise, it's the journal might not get replayed and
this could lead to more data getting lost.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:27 +01:00
Theodore Ts'o
6ec4583e9b ext4: fix inline data error paths
commit eb5efbcb762aee4b454b04f7115f73ccbcf8f0ef upstream.

The write_end() function must always unlock the page and drop its ref
count, even on an error.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:27 +01:00
Jan Kara
9d636818db ext4: fix data corruption in data=journal mode
commit 3b136499e906460919f0d21a49db1aaccf0ae963 upstream.

ext4_journalled_write_end() did not propely handle all the cases when
generic_perform_write() did not copy all the data into the target page
and could mark buffers with uninitialized contents as uptodate and dirty
leading to possible data corruption (which would be quickly fixed by
generic_perform_write() retrying the write but still). Fix the problem
by carefully handling the case when the page that is written to is not
uptodate.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:27 +01:00
Jan Kara
8774c73cf6 ext4: trim allocation requests to group size
commit cd648b8a8fd5071d232242d5ee7ee3c0815776af upstream.

If filesystem groups are artifically small (using parameter -g to
mkfs.ext4), ext4_mb_normalize_request() can result in a request that is
larger than a block group. Trim the request size to not confuse
allocation code.

Reported-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:27 +01:00
Roman Pen
a3068b3e80 ext4: do not polute the extents cache while shifting extents
commit 03e916fa8b5577d85471452a3d0c5738aa658dae upstream.

Inside ext4_ext_shift_extents() function ext4_find_extent() is called
without EXT4_EX_NOCACHE flag, which should prevent cache population.

This leads to oudated offsets in the extents tree and wrong blocks
afterwards.

Patch fixes the problem providing EXT4_EX_NOCACHE flag for each
ext4_find_extents() call inside ext4_ext_shift_extents function.

Fixes: 331573febb
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:27 +01:00
Roman Pen
3daefdae5f ext4: Include forgotten start block on fallocate insert range
commit 2a9b8cba62c0741109c33a2be700ff3d7703a7c2 upstream.

While doing 'insert range' start block should be also shifted right.
The bug can be easily reproduced by the following test:

    ptr = malloc(4096);
    assert(ptr);

    fd = open("./ext4.file", O_CREAT | O_TRUNC | O_RDWR, 0600);
    assert(fd >= 0);

    rc = fallocate(fd, 0, 0, 8192);
    assert(rc == 0);
    for (i = 0; i < 2048; i++)
            *((unsigned short *)ptr + i) = 0xbeef;
    rc = pwrite(fd, ptr, 4096, 0);
    assert(rc == 4096);
    rc = pwrite(fd, ptr, 4096, 4096);
    assert(rc == 4096);

    for (block = 2; block < 1000; block++) {
            rc = fallocate(fd, FALLOC_FL_INSERT_RANGE, 4096, 4096);
            assert(rc == 0);

            for (i = 0; i < 2048; i++)
                    *((unsigned short *)ptr + i) = block;

            rc = pwrite(fd, ptr, 4096, 4096);
            assert(rc == 4096);
    }

Because start block is not included in the range the hole appears at
the wrong offset (just after the desired offset) and the following
pwrite() overwrites already existent block, keeping hole untouched.

Simple way to verify wrong behaviour is to check zeroed blocks after
the test:

   $ hexdump ./ext4.file | grep '0000 0000'

The root cause of the bug is a wrong range (start, stop], where start
should be inclusive, i.e. [start, stop].

This patch fixes the problem by including start into the range.  But
not to break left shift (range collapse) stop points to the beginning
of the a block, not to the end.

The other not obvious change is an iterator check on validness in a
main loop.  Because iterator is unsigned the following corner case
should be considered with care: insert a block at 0 offset, when stop
variables overflows and never becomes less than start, which is 0.
To handle this special case iterator is set to NULL to indicate that
end of the loop is reached.

Fixes: 331573febb
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:27 +01:00
Theodore Ts'o
973f40f368 jbd2: don't leak modified metadata buffers on an aborted journal
commit e112666b4959b25a8552d63bc564e1059be703e8 upstream.

If the journal has been aborted, we shouldn't mark the underlying
buffer head as dirty, since that will cause the metadata block to get
modified.  And if the journal has been aborted, we shouldn't allow
this since it will almost certainly lead to a corrupted file system.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12 06:37:26 +01:00
Todd Kjos
837de638dc Merge branch 'upstream-linux-4.4.y' into android-4.4 2017-03-02 13:53:48 -08:00
Daniel Rosenberg
98c307e68c ANDROID: sdcardfs: support direct-IO (DIO) operations
This comes from the wrapfs patch
2e346c83b26e Wrapfs: support direct-IO (DIO) operations

Signed-off-by: Li Mengyang <li.mengyang@stonybrook.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 34133558
Change-Id: I3fd779c510ab70d56b1d918f99c20421b524cdc4
2017-02-24 17:51:49 -08:00
Daniel Rosenberg
166f168a48 ANDROID: sdcardfs: implement vm_ops->page_mkwrite
This comes from the wrapfs patch
3dfec0ffe5e2 Wrapfs: implement vm_ops->page_mkwrite

Some file systems (e.g., ext4) require it.  Reported by Ted Ts'o.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 34133558
Change-Id: I1a389b2422c654a6d3046bb8ec3e20511aebfa8e
2017-02-24 17:50:43 -08:00
Todd Kjos
f048326ea4 Merge branch 'android-4.4.y' into android-4.4 2017-02-23 11:37:18 -08:00
Sahitya Tummala
d7b146c692 fuse: fix use after free issue in fuse_dev_do_read()
commit 6ba4d2722d06960102c981322035239cd66f7316 upstream.

There is a potential race between fuse_dev_do_write()
and request_wait_answer() contexts as shown below:

TASK 1:
__fuse_request_send():
  |--spin_lock(&fiq->waitq.lock);
  |--queue_request();
  |--spin_unlock(&fiq->waitq.lock);
  |--request_wait_answer():
       |--if (test_bit(FR_SENT, &req->flags))
       <gets pre-empted after it is validated true>
                                   TASK 2:
                                   fuse_dev_do_write():
                                     |--clears bit FR_SENT,
                                     |--request_end():
                                        |--sets bit FR_FINISHED
                                        |--spin_lock(&fiq->waitq.lock);
                                        |--list_del_init(&req->intr_entry);
                                        |--spin_unlock(&fiq->waitq.lock);
                                        |--fuse_put_request();
       |--queue_interrupt();
       <request gets queued to interrupts list>
            |--wake_up_locked(&fiq->waitq);
       |--wait_event_freezable();
       <as FR_FINISHED is set, it returns and then
       the caller frees this request>

Now, the next fuse_dev_do_read(), see interrupts list is not empty
and then calls fuse_read_interrupt() which tries to access the request
which is already free'd and gets the below crash:

[11432.401266] Unable to handle kernel paging request at virtual address
6b6b6b6b6b6b6b6b
...
[11432.418518] Kernel BUG at ffffff80083720e0
[11432.456168] PC is at __list_del_entry+0x6c/0xc4
[11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474
...
[11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4
[11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474
[11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78
[11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8
[11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108
[11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94

As FR_FINISHED bit is set before deleting the intr_entry with input
queue lock in request completion path, do the testing of this flag and
queueing atomically with the same lock in queue_interrupt().

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: fd22d62ed0 ("fuse: no fc->lock for iqueue parts")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-23 17:43:09 +01:00
Miklos Szeredi
f9400118b0 vfs: fix uninitialized flags in splice_to_pipe()
commit 5a81e6a171cdbd1fa8bc1fdd80c23d3d71816fac upstream.

Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the
unused part of the pipe ring buffer.  Previously splice_to_pipe() left
the flags value alone, which could result in incorrect behavior.

Uninitialized flags appears to have been there from the introduction of
the splice syscall.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-23 17:43:09 +01:00
Daniel Rosenberg
880c68e41d ANDROID: sdcardfs: Don't bother deleting freelist
There is no point deleting entries from dlist, as
that is a temporary list on the stack from which
contains only entries that are being deleted.

Not all code paths set up dlist, so those that
don't were performing invalid accesses in
hash_del_rcu. As an additional means to prevent
any other issue, we null out the list entries when
we allocate from the cache.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 35666680
Change-Id: Ibb1e28c08c3a600c29418d39ba1c0f3db3bf31e5
2017-02-22 15:47:47 -08:00
Daniel Rosenberg
1914b2934b ANDROID: sdcardfs: Add missing path_put
"ANDROID: sdcardfs: Add GID Derivation to sdcardfs" introduced
an unbalanced pat_get, leading to storage space not being freed
after deleting a file until rebooting. This adds the missing path_put.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 34691169
Change-Id: Ia7ef97ec2eca2c555cc06b235715635afc87940e
2017-02-16 19:14:22 -08:00
Dmitry Shmidt
232c28fe23 Merge remote-tracking branch 'common/android-4.4' into android-4.4.y
Change-Id: Icf907f5067fb6da5935ab0d3271df54b8d5df405
2017-02-15 18:02:55 -08:00
Daniel Rosenberg
56026a89e6 ANDROID: sdcardfs: Fix incorrect hash
This adds back the hash calculation removed as part of
the previous patch, as it is in fact necessary.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug: 35307857
Change-Id: Ie607332bcf2c5d2efdf924e4060ef3f576bf25dc
2017-02-15 15:01:11 -08:00
Michael Halcrow
8b75db9857 ANDROID: ext4 crypto: Disables zeroing on truncation when there's no key
When performing orphan cleanup on mount, ext4 may truncate pages.
Truncation as currently implemented may require the encryption key for
partial zeroing, and the key isn't necessarily available on mount.
Since the userspace tools don't perform the partial zeroing operation
anyway, let's just skip doing that in the kernel.

This patch fixes a BUG_ON() oops.

Bug: 35209576
Change-Id: I2527a3f8d2c57d2de5df03fda69ee397f76095d7
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
2017-02-14 18:34:04 +00:00
Eric Biggers
a425a70b26 ANDROID: ext4: add a non-reversible key derivation method
Add a new per-file key derivation method to ext4 encryption defined as:

    derived_key[0:127]   = AES-256-ENCRYPT(master_key[0:255], nonce)
    derived_key[128:255] = AES-256-ENCRYPT(master_key[0:255], nonce ^ 0x01)
    derived_key[256:383] = AES-256-ENCRYPT(master_key[256:511], nonce)
    derived_key[384:511] = AES-256-ENCRYPT(master_key[256:511], nonce ^ 0x01)

... where the derived key and master key are both 512 bits, the nonce is
128 bits, AES-256-ENCRYPT takes the arguments (key, plaintext), and
'nonce ^ 0x01' denotes flipping the low order bit of the last byte.

The existing key derivation method is
'derived_key = AES-128-ECB-ENCRYPT(key=nonce, plaintext=master_key)'.
We want to make this change because currently, given a derived key you
can easily compute the master key by computing
'AES-128-ECB-DECRYPT(key=nonce, ciphertext=derived_key)'.

This was formerly OK because the previous threat model assumed that the
master key and derived keys are equally hard to obtain by an attacker.
However, we are looking to move the master key into secure hardware in
some cases, so we want to make sure that an attacker with access to a
derived key cannot compute the master key.

We are doing this instead of increasing the nonce to 512 bits because
it's important that the per-file xattr fit in the inode itself.  By
default, inodes are 256 bytes, and on Android we're already pretty close
to that limit. If we increase the nonce size, we end up allocating a new
filesystem block for each and every encrypted file, which has a
substantial performance and disk utilization impact.

Another option considered was to use the HMAC-SHA512 of the nonce, keyed
by the master key.  However this would be a little less performant,
would be less extensible to other key sizes and MAC algorithms, and
would pull in a dependency (security-wise and code-wise) on SHA-512.

Due to the use of "aes" rather than "ecb(aes)" in the implementation,
the new key derivation method is actually about twice as fast as the old
one, though the old one could be optimized similarly as well.

This patch makes the new key derivation method be used whenever HEH is
used to encrypt filenames.  Although these two features are logically
independent, it was decided to bundle them together for now.  Note that
neither feature is upstream yet, and it cannot be guaranteed that the
on-disk format won't change if/when these features are upstreamed.  For
this reason, and as noted in the previous patch, the features are both
behind a special mode number for now.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Change-Id: Iee4113f57e59dc8c0b7dc5238d7003c83defb986
2017-02-10 20:09:27 +00:00
Eric Biggers
3e0dd6ec69 ANDROID: ext4: allow encrypting filenames using HEH algorithm
Update ext4 encryption to allow filenames to be encrypted using the
Hash-Encrypt-Hash (HEH) block cipher mode of operation, which is
believed to be more secure than CBC, particularly within the constant
initialization vector (IV) constraint of filename encryption.  Notably,
HEH avoids the "common prefix" problem of CBC.  Both algorithms use
AES-256 as the underlying block cipher and take a 256-bit key.

We assign mode number 126 to HEH, just below 127
(EXT4_ENCRYPTION_MODE_PRIVATE) which in some kernels is reserved for
inline encryption on MSM chipsets.  Note that these modes are not yet
upstream, which is why these numbers are being used; it's preferable to
avoid collisions with modes that may be added upstream.  Also, although
HEH is not hardware-specific, we aren't currently reserving mode number
5 for HEH upstream, since for now we are tying HEH to the new key
derivation method which might become an independent flag upstream, and
there's also a chance that details of HEH will change after it gets
wider review.

Bug: 32975945
Signed-off-by: Eric Biggers <ebiggers@google.com>
Change-Id: I81418709d47da0e0ac607ae3f91088063c2d5dd4
2017-02-10 20:09:20 +00:00
Mohan Srinivasan
d854b68890 ANDROID: Refactor fs readpage/write tracepoints.
Refactor the fs readpage/write tracepoints to move the
inode->path lookup outside the tracepoint code, and pass a pointer
to the path into the tracepoint code instead. This is necessary
because the tracepoint code runs non-preemptible. Thanks to
Trilok Soni for catching this in 4.4.

Change-Id: I7486c5947918d155a30c61d6b9cd5027cf8fbe15
Signed-off-by: Mohan Srinivasan <srmohan@google.com>
2017-02-10 19:09:14 +00:00
Dmitry Shmidt
5edfa05a10 This is the 4.4.48 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlicFCgACgkQONu9yGCS
 aT4TLg//QVqQvdkxyy0lKQfOxmo4RSErmpFstgkvuVgucGh6Akvh8OV9hHJKabjK
 RUn3BNASoWfQF+G1vn7EQWcTGDgJhF/P39DvMu3zvpRbSYMMeX7og9iDnoNn2WtG
 l89l+5YfQG7Y8eJWj1mnTW2ul9pUxJFg4j2rjmcLhfgKPvJPCn+cpU2XKUxpj7gM
 yd/nbVuQlMFW6qfEES1W1RbDEOQ1KWJgdupsMEgodRxb/dlg8KldBQFmv1fGcrA6
 5jFqWzsQQ7AyfMWIRDBm9mJlHuvdoGCEGkyTbsZoSyuN72/cyfPSfTZPInpi09bb
 l0sod1nzcZsuQVJzaQHTKlvpMEduIDQVxy2/pNW/pKnGAS++fkK+uJCsu0mz+6+8
 zntaPdVoboiwwoK5dgP27vgWpYpw2QoCpPqWno7NIVNZfUcWWng3NS49goN+ytvY
 m1i1ih4KU1bMqMrT0qZugQwHHqaE9IJ8xyDMdXc86cMH1ylTo8ZnOOyGxRKKLOW1
 nVs4aQT2i7E9yQ8TjVJplLxtU3t/Q3D1qqPr5U70XJyEgT5X4/V0mXJaRRWXAzXP
 2IBJOLznqwbwuIHV8ocp7i76qtpVqbJkpMx2NhB0tFP0XjffqpZvv0v8aBTAdBS2
 060nyG8fZad6L++tWVODt7nd7gkD4NN/I8BqD0XzXx6zbOJexqA=
 =GUZe
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.48' into android-4.4.y

This is the 4.4.48 stable release
2017-02-09 10:59:15 -08:00
Rabin Vincent
920bba1092 cifs: initialize file_info_lock
commit 81ddd8c0c5e1cb41184d66567140cb48c53eb3d1 upstream.

Reviewed-by: Jeff Layton <jlayton@redhat.com>

file_info_lock is not initalized in initiate_cifs_search(), leading to the
following splat after a simple "mount.cifs ... dir && ls dir/":

 BUG: spinlock bad magic on CPU#0, ls/486
  lock: 0xffff880009301110, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
 CPU: 0 PID: 486 Comm: ls Not tainted 4.9.0 #27
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  ffffc900042f3db0 ffffffff81327533 0000000000000000 ffff880009301110
  ffffc900042f3dd0 ffffffff810baf75 ffff880009301110 ffffffff817ae077
  ffffc900042f3df0 ffffffff810baff6 ffff880009301110 ffff880008d69900
 Call Trace:
  [<ffffffff81327533>] dump_stack+0x65/0x92
  [<ffffffff810baf75>] spin_dump+0x85/0xe0
  [<ffffffff810baff6>] spin_bug+0x26/0x30
  [<ffffffff810bb159>] do_raw_spin_lock+0xe9/0x130
  [<ffffffff8159ad2f>] _raw_spin_lock+0x1f/0x30
  [<ffffffff8127e50d>] cifs_closedir+0x4d/0x100
  [<ffffffff81181cfd>] __fput+0x5d/0x160
  [<ffffffff81181e3e>] ____fput+0xe/0x10
  [<ffffffff8109410e>] task_work_run+0x7e/0xa0
  [<ffffffff81002512>] exit_to_usermode_loop+0x92/0xa0
  [<ffffffff810026f9>] syscall_return_slowpath+0x49/0x50
  [<ffffffff8159b484>] entry_SYSCALL_64_fastpath+0xa7/0xa9

Fixes: 3afca265b5f53a0 ("Clarify locking of cifs file and tcon structures and make more granular")
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09 08:02:45 +01:00
Kinglong Mee
2b4e56fde9 NFSD: Fix a null reference case in find_or_create_lock_stateid()
commit d19fb70dd68c4e960e2ac09b0b9c79dfdeefa726 upstream.

nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid().

If nfsd doesn't go through init_lock_stateid() and put stateid at end,
there is a NULL reference to .sc_free when calling nfs4_put_stid(ns).

This patch let the nfs4_stid.sc_free assignment to nfs4_alloc_stid().

Fixes: 356a95ece7 "nfsd: clean up races in lock stateid searching..."
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09 08:02:45 +01:00
Eryu Guan
e21a3cad35 ext4: validate s_first_meta_bg at mount time
commit 3a4b77cd47bb837b8557595ec7425f281f2ca1fe upstream.

Ralf Spenneberg reported that he hit a kernel crash when mounting a
modified ext4 image. And it turns out that kernel crashed when
calculating fs overhead (ext4_calculate_overhead()), this is because
the image has very large s_first_meta_bg (debug code shows it's
842150400), and ext4 overruns the memory in count_overhead() when
setting bitmap buffer, which is PAGE_SIZE.

ext4_calculate_overhead():
  buf = get_zeroed_page(GFP_NOFS);  <=== PAGE_SIZE buffer
  blks = count_overhead(sb, i, buf);

count_overhead():
  for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400
          ext4_set_bit(EXT4_B2C(sbi, s++), buf);   <=== buffer overrun
          count++;
  }

This can be reproduced easily for me by this script:

  #!/bin/bash
  rm -f fs.img
  mkdir -p /mnt/ext4
  fallocate -l 16M fs.img
  mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img
  debugfs -w -R "ssv first_meta_bg 842150400" fs.img
  mount -o loop fs.img /mnt/ext4

Fix it by validating s_first_meta_bg first at mount time, and
refusing to mount if its value exceeds the largest possible meta_bg
number.

Reported-by: Ralf Spenneberg <ralf@os-t.de>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09 08:02:44 +01:00
Adrien Schildknecht
d9aa8ddc51 Squashfs: optimize reading uncompressed data
When dealing with uncompressed data, there is no need to read a whole
block (default 128K) to get the desired page: the pages are
independent from each others.

This patch change the readpages logic so that reading uncompressed
data only read the number of pages advised by the readahead algorithm.

Moreover, if the page actor contains holes (i.e. pages that are already
up-to-date), squashfs skips the buffer_head associated to those pages.

This patch greatly improve the performance of random reads for
uncompressed files because squashfs only read what is needed. It also
reduces the number of unnecessary reads.

Signed-off-by: Adrien Schildknecht <adriens@google.com>
Change-Id: I1850150fbf4b45c9dd138d88409fea1ab44054c0
2017-02-09 04:04:18 +00:00
Adrien Schildknecht
5e9c466d6e Squashfs: implement .readpages()
Squashfs does not implement .readpages(), so the kernel just repeatedly
calls .readpage().

The readpages function tries to pack as much pages as possible in the
same page actor so that only 1 read request is issued.

Now that the read requests are asynchronous, the kernel can truly
prefetch pages using its readahead algorithm.

Signed-off-by: Adrien Schildknecht <adriens@google.com>
Change-Id: Ice70e029dc24526f61e4e5a1a902588be2212498
2017-02-09 04:04:09 +00:00
Adrien Schildknecht
c9994560db Squashfs: replace buffer_head with BIO
The 'll_rw_block' has been deprecated and BIO is now the basic container
for block I/O within the kernel.

Switching to BIO offers 2 advantages:
  1/ It removes synchronous wait for the up-to-date buffers: SquashFS
     now deals with decompressions/copies asynchronously.
     Implementing an asynchronous mechanism to read data is needed to
     efficiently implement .readpages().
  2/ Prior to this patch, merging the read requests entirely depends on
     the IO scheduler. SquashFS has more information than the IO
     scheduler about what could be merged. Moreover, merging the reads
     at the FS level means that we rely less on the IO scheduler.

Signed-off-by: Adrien Schildknecht <adriens@google.com>
Change-Id: I775d2e11f017476e1899518ab52d9d0a8a0bce28
2017-02-09 04:04:02 +00:00
Adrien Schildknecht
417aca479b Squashfs: refactor page_actor
This patch essentially does 3 things:
  1/ Always use an array of page to store the data instead of a mix of
     buffers and pages.
  2/ It is now possible to have 'holes' in a page actor, i.e. NULL
     pages in the array.
     When reading a block (default 128K), squashfs tries to grab all
     the pages covering this block. If a single page is up-to-date or
     locked, it falls back to using an intermediate buffer to do the
     read and then copy the pages in the actor. Allowing holes in the
     page actor remove the need for this intermediate buffer.
  3/ Refactor the wrappers to share code that deals with page actors.

Signed-off-by: Adrien Schildknecht <adriens@google.com>
Change-Id: I98128bed5d518cf31b67e788a85b275e9a323bec
2017-02-09 04:03:50 +00:00
Adrien Schildknecht
50dcddba6c Squashfs: remove the FILE_CACHE option
FILE_DIRECT is working fine and offers faster results and lower memory
footprint.

Removing FILE_CACHE makes our life easier because we don't have to
maintain 2 differents function that does the same thing.

Signed-off-by: Adrien Schildknecht <adriens@google.com>
Change-Id: I6689ba74d0042c222a806f9edc539995e8e04c6b
2017-02-09 04:01:39 +00:00
Cong Wang
dc2ad0661d FROMLIST: 9p: fix a potential acl leak
(https://lkml.org/lkml/2016/12/13/579)

posix_acl_update_mode() could possibly clear 'acl', if so
we leak the memory pointed by 'acl'. Save this pointer
before calling posix_acl_update_mode() and release the memory
if 'acl' really gets cleared.

Reported-by: Mark Salyzyn <salyzyn@android.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Greg Kurz <groug@kaod.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Bug: 32458736
Change-Id: Ia78da401e6fd1bfd569653bd2cd0ebd3f9c737a0
2017-02-07 15:21:25 +00:00
Jan Kara
49b60d4aa9 BACKPORT: posix_acl: Clear SGID bit when setting file permissions
(cherry pick from commit 073931017b49d9458aa351605b43a7e34598caef)

When file permissions are modified via chmod(2) and the user is not in
the owning group or capable of CAP_FSETID, the setgid bit is cleared in
inode_change_ok().  Setting a POSIX ACL via setxattr(2) sets the file
permissions as well as the new ACL, but doesn't clear the setgid bit in
a similar way; this allows to bypass the check in chmod(2).  Fix that.

NB: We did not resolve the ACL leak in this CL, require additional
    upstream fix.

References: CVE-2016-7097
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Bug: 32458736
Change-Id: I19591ad452cc825ac282b3cfd2daaa72aa9a1ac1
2017-02-07 15:21:07 +00:00
Daniel Rosenberg
9ce149a458 ANDROID: sdcardfs: Switch strcasecmp for internal call
This moves our uses of strcasecmp over to an internal call so we can
easily change implementations later if we so desire. Additionally,
we leverage qstr's where appropriate to save time on comparisons.

Change-Id: I32fdc4fd0cd3b7b735dcfd82f60a2516fd8272a5
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-02-02 15:17:29 -08:00
Daniel Rosenberg
7191add87c ANDROID: sdcardfs: switch to full_name_hash and qstr
Use the kernel's string hash function instead of rolling
our own. Additionally, save a bit of calculation by using
the qstr struct in place of strings.

Change-Id: I0bbeb5ec2a9233f40135ad632e6f22c30ffa95c1
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-02-02 15:17:28 -08:00
Daniel Rosenberg
461da83e67 ANDROID: sdcardfs: Add GID Derivation to sdcardfs
This changes sdcardfs to modify the user and group in the
underlying filesystem depending on its usage. Ownership is
set by Android user, and package, as well as if the file is
under obb or cache. Other files can be labeled by extension.
Those values are set via the configfs interace.

To add an entry,
mkdir -p [configfs root]/sdcardfs/extensions/[gid]/[ext]

Bug: 34262585
Change-Id: I4e030ce84f094a678376349b1a96923e5076a0f4
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-02-02 15:17:27 -08:00
Daniel Rosenberg
d46684aa42 ANDROID: sdcardfs: Remove redundant operation
We call get_derived_permission_new unconditionally, so we don't need
to call update_derived_permission_lock, which does the same thing.

Change-Id: I0748100828c6af806da807241a33bf42be614935
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-02-02 15:17:22 -08:00
Daniel Rosenberg
09c77d6230 ANDROID: sdcardfs: add support for user permission isolation
This allows you to hide the existence of a package from
a user by adding them to an exclude list. If a user
creates that package's folder and is on the exclude list,
they will not see that package's id.

Bug: 34542611
Change-Id: I9eb82e0bf2457d7eb81ee56153b9c7d2f6646323
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-02-02 15:05:39 -08:00
Daniel Rosenberg
d9300a998d ANDROID: sdcardfs: Refactor configfs interface
This refactors the configfs code to be more easily extended.
It will allow additional files to be added easily.

Bug: 34542611
Bug: 34262585
Change-Id: I73c9b0ae5ca7eb27f4ebef3e6807f088b512d539
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-02-02 15:05:39 -08:00
Daniel Rosenberg
3e4f5484dc ANDROID: sdcardfs: Allow non-owners to touch
This modifies the permission checks in setattr to
allow for non-owners to modify the timestamp of
files to things other than the current time.
This still requires write access, as enforced by
the permission call, but relaxes the requirement
that the caller must be the owner, allowing those
with group permissions to change it as well.

Bug: 11118565
Change-Id: Ied31f0cce2797675c7ef179eeb4e088185adcbad
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-02-02 15:05:39 -08:00
Dmitry Shmidt
c8da41f0dc This is the 4.4.46 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAliRjsUACgkQONu9yGCS
 aT53lxAAzSDKqsc4eiGDuRW6A+hWUveObsOsAldQz8PLIhIEh/NiQPNsYisuWA+3
 K5nOwSU4E7LiFYemN/9wGGctq5aPtC7nWyDe43Xdeek0B3keu/KSqOhbOBywuAGr
 pdfEPcyIzgbzgIrygn5g8RUNCcgNkw5IYgmaiz2RCqNnjeQItAQQdm37svz/XDGt
 D6i0MaOHEkGfk3Z3ty4PJWSa/+Gd61M4OYTWiWWCY9gf1CCjQA59RlGhynm6bj+C
 Zq58tsnEzL8bOz9PDHoXKyrLc43mLhn7Q4Nhxs+rT00Yl7yxE/aFm9vd0EMIP3/V
 HUVobz4RmVyopvJPKqcPlD063BAYEm9BpxxXhc3tJNP3AaCjJodvKGkYKFKWPfG5
 6h0agCRCfeCS81y8JKeccw0tVBsD8ChuzO95tSHNulCWk/pN/7OD52b0jiuCMTFs
 lato8AKn2Ygtip/MWkK43yiuPd4qYjtFBSKx6x5we+Vj1HZz2DVjM8prkIqkfzM9
 FpgoyLOPAKXKbECzyURngLsXCckneS96ErxL/a/L/0/2u8Zv4RR9y9jcE38eAhZv
 IGhq8txZXnHD1Qyd6u0nbTeaLDswN06P1stsIUX8fR3qUa5QnlrhG9LzY9wihxlk
 oloLWuhIsUussgO07xgD/9NM8V7goN/DkQ+MCTShXxsNqb87WY0=
 =pPNg
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.46' into android-4.4.y

This is the 4.4.46 stable release
2017-02-01 12:55:09 -08:00
Benjamin Coddington
edef1086bf NFSv4.0: always send mode in SETATTR after EXCLUSIVE4
commit a430607b2ef7c3be090f88c71cfcb1b3988aa7c0 upstream.

Some nfsv4.0 servers may return a mode for the verifier following an open
with EXCLUSIVE4 createmode, but this does not mean the client should skip
setting the mode in the following SETATTR.  It should only do that for
EXCLUSIVE4_1 or UNGAURDED createmode.

Fixes: 5334c5bdac ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-01 08:30:53 +01:00
Guenter Roeck
4c7fc336f6 ANDROID: fs: Export vfs_rmdir2
allmodconfig builds fail with

ERROR: "vfs_rmdir2" undefined!

Export the missing function.

Change-Id: I983d327e59fd34e0484f3c54d925e97d3905c19c
Fixes: f9cb61dcb0 ("ANDROID: sdcardfs: User new permission2 functions")
Signed-off-by: Guenter Roeck <groeck@chromium.org>
2017-01-30 18:00:11 -08:00
Guenter Roeck
1cd3d34714 ANDROID: fs: Export free_fs_struct and set_fs_pwd
allmodconfig builds fail with:

ERROR: "free_fs_struct" undefined!
ERROR: "set_fs_pwd" undefined!

Export the missing symbols.

Change-Id: I4877ead19d7e7f0c93d4c4cad5681364284323aa
Fixes: 0ec03f8457 ("ANDROID: sdcardfs: override umask on mkdir and create")
Signed-off-by: Guenter Roeck <groeck@chromium.org>
2017-01-30 17:59:59 -08:00
Daniel Rosenberg
b5858221c1 ANDROID: mnt: remount should propagate to slaves of slaves
propagate_remount was not accounting for the slave mounts
of other slave mounts, leading to some namespaces not
recieving the remount information.

bug:33731928
Change-Id: Idc9e8c2ed126a4143229fc23f10a959c2d0a3854
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-26 15:53:31 -08:00
Daniel Rosenberg
e33aa348ee ANDROID: sdcardfs: Switch ->d_inode to d_inode()
Change-Id: I12375cc2d6e82fb8adf0319be971f335f8d7a312
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-26 15:53:31 -08:00
Daniel Rosenberg
5cb5648935 ANDROID: sdcardfs: Fix locking issue with permision fix up
Don't use lookup_one_len so we can grab the spinlock that
protects d_subdirs.

Bug: 30954918
Change-Id: I0c6a393252db7beb467e0d563739a3a14e1b5115
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-26 15:53:31 -08:00
Daniel Rosenberg
4d70f73115 ANDROID: sdcardfs: Use per mount permissions
This switches sdcardfs over to using permission2.
Instead of mounting several sdcardfs instances onto
the same underlaying directory, you bind mount a
single mount several times, and remount with the
options you want. These are stored in the private
mount data, allowing you to maintain the same tree,
but have different permissions for different mount
points.

Warning functions have been added for permission,
as it should never be called, and the correct
behavior is unclear.

Change-Id: I841b1d70ec60cf2b866fa48edeb74a0b0f8334f5
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-26 15:53:30 -08:00
Daniel Rosenberg
6b6e896b0b ANDROID: sdcardfs: Add gid and mask to private mount data
Adds support for mount2, remount2, and the functions
to allocate/clone/copy the private data

The next patch will switch over to actually using it.

Change-Id: I8a43da26021d33401f655f0b2784ead161c575e3
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-26 15:53:30 -08:00
Daniel Rosenberg
f9cb61dcb0 ANDROID: sdcardfs: User new permission2 functions
Change-Id: Ic7e0fb8fdcebb31e657b079fe02ac834c4a50db9
Signed-off-by: Daniel Rosenberg <drosen@google.com>
2017-01-26 15:53:30 -08:00