android_kernel_oneplus_msm8998/fs
Hugh Dickins 4b35943067 mm: larger stack guard gap, between vmas
commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.

Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[wt: backport to 4.11: adjust context]
[wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide]
[wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
Signed-off-by: Willy Tarreau <w@1wt.eu>
[gkh: minor build fixes for 4.4]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-26 07:13:11 +02:00
..
9p 9p: fix a potential acl leak 2017-05-14 13:32:54 +02:00
adfs
affs affs: fix remount failure when there are no options changed 2016-06-07 18:14:32 -07:00
afs
autofs4 autofs: use dentry flags to block walks during expire 2016-09-30 10:18:37 +02:00
befs
bfs
btrfs btrfs: fix memory leak in update_space_info failure path 2017-06-14 13:16:24 +02:00
cachefiles FS-Cache: Add missing initialization of ret in cachefiles_write_page() 2015-11-16 20:38:43 -05:00
ceph fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
cifs Call echo service immediately after socket reconnect 2017-06-17 06:39:35 +02:00
coda fs/coda: fix readlink buffer overflow 2015-09-10 13:29:01 -07:00
configfs configfs: Fix race between create_link and configfs_rmdir 2017-06-26 07:13:08 +02:00
cramfs
debugfs debugfs: Make automount point inodes permanently empty 2016-05-04 14:48:41 -07:00
devpts devpts: clean up interface to pty drivers 2016-08-16 09:30:49 +02:00
dlm dlm: free workqueues after the connections 2016-10-22 12:26:56 +02:00
ecryptfs ecryptfs: fix handling of directory opening 2016-09-15 08:27:47 +02:00
efivarfs efi: Make efivarfs entries immutable by default 2016-03-03 15:07:09 -08:00
efs
exofs osd fs: __r4w_get_page rely on PageUptodate for uptodate 2015-12-12 10:15:34 -08:00
exportfs
ext2 posix_acl: Clear SGID bit when setting file permissions 2016-10-31 04:13:58 -06:00
ext4 fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
f2fs fscrypt: avoid collisions when presenting long encrypted filenames 2017-05-25 14:30:11 +02:00
fat fat: fix using uninitialized fields of fat_inode/fsinfo_inode 2017-03-15 09:57:15 +08:00
freevxfs
fscache FS-Cache: Initialise stores_lock in netfs cookie 2017-06-17 06:39:37 +02:00
fuse fuse: add missing FR_FORCE 2017-03-12 06:37:28 +01:00
gfs2 gfs2: avoid uninitialized variable warning 2017-04-30 05:49:28 +02:00
hfs hfs: fix B-tree corruption after insertion at position 0 2015-09-10 13:29:01 -07:00
hfsplus posix_acl: Clear SGID bit when setting file permissions 2016-10-31 04:13:58 -06:00
hostfs hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common() 2016-09-30 10:18:39 +02:00
hpfs hpfs: implement the show_options method 2016-06-01 12:15:54 -07:00
hugetlbfs mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
isofs isofs: Do not return EACCES for unknown filesystems 2016-10-28 03:01:34 -04:00
jbd2 jbd2: don't leak modified metadata buffers on an aborted journal 2017-03-12 06:37:26 +01:00
jffs2 posix_acl: Clear SGID bit when setting file permissions 2016-10-31 04:13:58 -06:00
jfs fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
kernfs kernfs: don't depend on d_find_any_alias() when generating notifications 2016-09-24 10:07:36 +02:00
lockd Mainly smaller bugfixes and cleanup. We're still finding some bugs from 2015-11-11 20:11:28 -08:00
logfs mm, fs: introduce mapping_gfp_constraint() 2015-11-06 17:50:42 -08:00
minix
ncpfs ncpfs: fix a braino in OOM handling in ncp_fill_cache() 2016-03-16 08:42:59 -07:00
nfs nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED" 2017-06-17 06:39:38 +02:00
nfs_common
nfsd fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
nilfs2 fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
nls
notify fanotify: fix list corruption in fanotify_get_response() 2016-09-30 10:18:37 +02:00
ntfs mm, fs: introduce mapping_gfp_constraint() 2015-11-06 17:50:42 -08:00
ocfs2 fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
omfs
openpromfs
overlayfs ovl: fsync after copy-up 2016-11-10 16:36:34 +01:00
proc mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
pstore pstore/ram: Use memcpy_fromio() to save old buffer 2016-10-28 03:01:27 -04:00
qnx4
qnx6
quota quota: Fix possible GPF due to uninitialised pointers 2016-04-12 09:08:56 -07:00
ramfs mm, fs: obey gfp_mapping for add_to_page_cache() 2015-10-16 11:42:28 -07:00
reiserfs fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
romfs romfs: use different way to generate fsid for BLOCK or MTD 2017-06-17 06:39:38 +02:00
squashfs squashfs: xattr simplifications 2015-11-13 20:34:33 -05:00
sysfs sysfs: be careful of error returns from ops->show() 2017-04-12 12:38:33 +02:00
sysv fix sysvfs symlinks 2015-11-23 21:11:08 -05:00
tracefs tracefs: Fix refcount imbalance in start_creating() 2015-11-04 22:13:45 -05:00
ubifs ubifs: Fix journal replay wrt. xattr nodes 2017-01-26 08:23:48 +01:00
udf fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
ufs ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path 2017-06-14 13:16:24 +02:00
xfs Make __xfs_xattr_put_listen preperly report errors. 2017-06-14 13:16:27 +02:00
aio.c aio: mark AIO pseudo-fs noexec 2016-10-07 15:23:47 +02:00
anon_inodes.c
attr.c vfs: move permission checking into notify_change() for utimes(NULL) 2016-10-22 12:26:56 +02:00
bad_inode.c
binfmt_aout.c
binfmt_elf.c Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-11-11 09:45:24 -08:00
binfmt_elf_fdpic.c libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c fs/block_dev: always invalidate cleancache in invalidate_bdev() 2017-05-20 14:27:01 +02:00
buffer.c fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
char_dev.c
compat.c
compat_binfmt_elf.c
compat_ioctl.c i2c-dev: Fix typo in ioctl name reference 2015-10-23 23:26:43 +02:00
coredump.c coredump: fix unfreezable coredumping task 2016-11-18 10:48:34 +01:00
dax.c dax: disable pmd mappings 2015-11-16 23:54:45 -08:00
dcache.c mnt: Protect the mountpoint hashtable with mount_lock 2017-01-19 20:17:21 +01:00
dcookies.c
direct-io.c fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
drop_caches.c
eventfd.c
eventpoll.c
exec.c exec: Ensure mm->user_ns contains the execed files 2017-01-06 11:16:14 +01:00
fcntl.c
fhandle.c fs/coredump: prevent fsuid=0 dumps into user-controlled directories 2016-04-12 09:08:58 -07:00
file.c vfs: clear remainder of 'full_fds_bits' in dup_fd() 2015-11-05 23:05:32 -08:00
file_table.c
filesystems.c
fs-writeback.c writeback, cgroup: fix use of the wrong bdi_writeback which mismatches the inode 2016-04-12 09:09:04 -07:00
fs_pin.c
fs_struct.c
inode.c vfs: fix deadlock in file_remove_privs() on overlayfs 2016-08-10 11:49:30 +02:00
internal.h
ioctl.c
Kconfig dax: disable pmd mappings 2015-11-16 23:54:45 -08:00
Kconfig.binfmt
libfs.c
locks.c locks: use file_inode() 2016-08-10 11:49:27 +02:00
Makefile ext4: promote ext4 over ext2 in the default probe order 2015-10-15 10:33:21 -04:00
mbcache.c
mount.h mnt: Add a per mount namespace limit on the number of mounts 2017-04-30 05:49:28 +02:00
mpage.c fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
namei.c fs: Check for invalid i_uid in may_follow_link() 2016-09-15 08:27:49 +02:00
namespace.c mnt: Add a per mount namespace limit on the number of mounts 2017-04-30 05:49:28 +02:00
no-block.c
nsfs.c fs/seq_file: convert int seq_vprint/seq_printf/etc... returns to void 2015-09-11 15:21:34 -07:00
open.c vfs: add vfs_select_inode() helper 2016-05-18 17:06:48 -07:00
pipe.c pipe: limit the per-user amount of pages allocated in pipes 2016-06-07 18:14:35 -07:00
pnode.c mnt: Add a per mount namespace limit on the number of mounts 2017-04-30 05:49:28 +02:00
pnode.h mnt: Add a per mount namespace limit on the number of mounts 2017-04-30 05:49:28 +02:00
posix_acl.c tmpfs: clear S_ISGID when setting posix ACLs 2017-01-26 08:23:47 +01:00
proc_namespace.c vfs: show_vfsstat: do not ignore errors from show_devname method 2016-04-12 09:08:55 -07:00
read_write.c
readdir.c
select.c
seq_file.c fs/seq_file: fix out-of-bounds read 2016-09-07 08:32:43 +02:00
signalfd.c
splice.c vfs: fix uninitialized flags in splice_to_pipe() 2017-02-23 17:43:09 +01:00
stack.c
stat.c ufs: restore maintaining ->i_blocks 2017-06-14 13:16:24 +02:00
statfs.c
super.c fs/super.c: fix race between freeze_super() and thaw_super() 2016-10-28 03:01:32 -04:00
sync.c fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE writeback 2015-11-06 17:50:42 -08:00
timerfd.c timerfd: Protect the might cancel mechanism proper 2017-05-08 07:46:01 +02:00
userfaultfd.c userfaultfd: don't block on the last VM updates at exit time 2016-03-16 08:43:01 -07:00
utimes.c vfs: move permission checking into notify_change() for utimes(NULL) 2016-10-22 12:26:56 +02:00
xattr.c fs/xattr.c: zero out memory copied to userspace in getxattr 2017-05-20 14:27:01 +02:00