android_kernel_oneplus_msm8998/fs
Mike Kravetz bf8474c648 hugetlb: use same fault hash key for shared and private mappings
commit 1b426bac66e6cc83c9f2d92b96e4e72acf43419a upstream.

hugetlb uses a fault mutex hash table to prevent page faults of the
same pages concurrently.  The key for shared and private mappings is
different.  Shared keys off address_space and file index.  Private keys
off mm and virtual address.  Consider a private mappings of a populated
hugetlbfs file.  A fault will map the page from the file and if needed
do a COW to map a writable page.

Hugetlbfs hole punch uses the fault mutex to prevent mappings of file
pages.  It uses the address_space file index key.  However, private
mappings will use a different key and could race with this code to map
the file page.  This causes problems (BUG) for the page cache remove
code as it expects the page to be unmapped.  A sample stack is:

page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
kernel BUG at mm/filemap.c:169!
...
RIP: 0010:unaccount_page_cache_page+0x1b8/0x200
...
Call Trace:
__delete_from_page_cache+0x39/0x220
delete_from_page_cache+0x45/0x70
remove_inode_hugepages+0x13c/0x380
? __add_to_page_cache_locked+0x162/0x380
hugetlbfs_fallocate+0x403/0x540
? _cond_resched+0x15/0x30
? __inode_security_revalidate+0x5d/0x70
? selinux_file_permission+0x100/0x130
vfs_fallocate+0x13f/0x270
ksys_fallocate+0x3c/0x80
__x64_sys_fallocate+0x1a/0x20
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9

There seems to be another potential COW issue/race with this approach
of different private and shared keys as noted in commit 8382d914eb
("mm, hugetlb: improve page-fault scalability").

Since every hugetlb mapping (even anon and private) is actually a file
mapping, just use the address_space index key for all mappings.  This
results in potentially more hash collisions.  However, this should not
be the common case.

Link: http://lkml.kernel.org/r/20190328234704.27083-3-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20190412165235.t4sscoujczfhuiyt@linux-r8p5
Fixes: b5cec28d36 ("hugetlbfs: truncate_hugepages() takes a range of pages")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11 12:23:52 +02:00
..
9p 9p locks: add mount option for lock retry interval 2019-04-27 09:33:58 +02:00
adfs
affs affs_lookup(): close a race with affs_remove_link() 2018-05-30 07:48:51 +02:00
afs afs: Fix afs_kill_pages() 2017-12-20 10:04:56 +01:00
autofs4 autofs: fix error return in autofs_fill_super() 2019-03-23 08:44:27 +01:00
befs
bfs bfs: add sanity check at bfs_fill_super() 2018-12-01 09:46:33 +01:00
btrfs btrfs: sysfs: don't leak memory when failing add fsid 2019-06-11 12:23:52 +02:00
cachefiles fscache, cachefiles: remove redundant variable 'cache' 2018-12-17 21:55:12 +01:00
ceph ceph: flush dirty inodes before proceeding with remount 2019-06-11 12:23:46 +02:00
cifs cifs: fix strcat buffer overflow and reduce raciness in smb21_set_oplock_level() 2019-06-11 12:23:45 +02:00
coda coda: fix 'kernel memory exposure attempt' in fsync 2017-11-24 08:32:25 +01:00
configfs configfs: replace strncpy with memcpy 2018-11-21 09:27:44 +01:00
cramfs Cramfs: fix abad comparison when wrap-arounds occur 2018-11-21 09:27:37 +01:00
debugfs debugfs: fix use-after-free on symlink traversal 2019-05-16 19:45:01 +02:00
devpts
dlm dlm: Don't swamp the CPU with callbacks queued during recovery 2019-02-20 10:13:04 +01:00
ecryptfs do d_instantiate/unlock_new_inode combinations safely 2018-05-30 07:48:52 +02:00
efivarfs
efs
exofs fs/exofs: fix potential memory leak in mount option parsing 2018-11-27 16:08:00 +01:00
exportfs exportfs: do not read dentry after free 2018-12-17 21:55:10 +01:00
ext2 ext2: Fix underflow in ext2_max_size() 2019-03-23 08:44:36 +01:00
ext4 ext4: do not delete unlinked inode from orphan list on failed truncate 2019-06-11 12:23:50 +02:00
f2fs f2fs: fix to do sanity check with current segment number 2019-04-27 09:33:58 +02:00
fat fs/fat/fatent.c: add cond_resched() to fat_count_free_clusters() 2018-11-10 07:41:40 -08:00
freevxfs
fscache fscache: fix race between enablement and dropping of object 2018-12-17 21:55:11 +01:00
fuse fuse: honor RLIMIT_FSIZE in fuse_file_fallocate 2019-06-11 12:23:46 +02:00
gfs2 gfs2: Fix sign extension bug in gfs2_update_stats 2019-06-11 12:23:51 +02:00
hfs hfs: do not free node before using 2018-12-17 21:55:12 +01:00
hfsplus hfsplus: do not free node before using 2018-12-17 21:55:12 +01:00
hostfs hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common() 2016-09-30 10:18:39 +02:00
hpfs
hugetlbfs hugetlb: use same fault hash key for shared and private mappings 2019-06-11 12:23:52 +02:00
isofs isofs: fix timestamps beyond 2027 2017-11-30 08:37:20 +00:00
jbd2 jbd2: fix compile warning when using JBUFFER_TRACE 2019-03-23 08:44:37 +01:00
jffs2 jffs2: fix use-after-free on symlink traversal 2019-05-16 19:45:01 +02:00
jfs jfs: Fix inconsistency between memory allocation and ea_buf->max_size 2018-08-09 12:19:28 +02:00
kernfs kernfs: Replace strncpy with memcpy 2018-12-13 09:21:29 +01:00
lockd lockd: fix access beyond unterminated strings in prints 2018-11-21 09:27:36 +01:00
logfs
minix
ncpfs ncpfs: fix build warning of strncpy 2019-03-23 08:44:21 +01:00
nfs NFS4: Fix v4.0 client state corruption when mount 2019-06-11 12:23:45 +02:00
nfs_common lockd: fix "list_add double add" caused by legacy signal interface 2018-02-03 17:04:28 +01:00
nfsd nfsd: Don't release the callback slot unless it was actually held 2019-05-16 19:44:44 +02:00
nilfs2 do d_instantiate/unlock_new_inode combinations safely 2018-05-30 07:48:52 +02:00
nls
notify fanotify: fix logic of events on child 2018-04-24 09:32:11 +02:00
ntfs
ocfs2 ocfs2: fix ocfs2 read inode data panic in ocfs2_iget 2019-06-11 12:23:37 +02:00
omfs
openpromfs
overlayfs ovl: fix uid/gid when creating over whiteout 2019-04-27 09:33:59 +02:00
proc fs/proc/proc_sysctl.c: Fix a NULL pointer dereference 2019-05-16 19:44:51 +02:00
pstore pstore/ram: Do not treat empty buffers as valid 2019-01-26 09:42:53 +01:00
qnx4
qnx6
quota fs/quota: Fix spectre gadget in do_quotactl 2018-09-09 20:04:36 +02:00
ramfs
reiserfs reiserfs: propagate errors from fill_with_dentries() properly 2018-11-27 16:08:00 +01:00
romfs romfs: use different way to generate fsid for BLOCK or MTD 2017-06-17 06:39:38 +02:00
squashfs squashfs: more metadata hardenings 2018-08-06 16:24:42 +02:00
sysfs scsi: sysfs: Introduce sysfs_{un,}break_active_protection() 2018-09-05 09:18:40 +02:00
sysv sysv: return 'err' instead of 0 in __sysv_write_inode 2018-12-17 21:55:09 +01:00
tracefs
ubifs ubifs: Check for name being NULL while mounting 2018-10-13 09:11:34 +02:00
udf udf: Fix crash on IO error during truncate 2019-04-03 06:23:14 +02:00
ufs ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour 2019-06-11 12:23:49 +02:00
xfs xfs: don't fail when converting shortform attr to long form during ATTR_REPLACE 2019-01-26 09:42:52 +01:00
aio.c aio: fix spectre gadget in lookup_ioctx 2018-12-21 14:09:50 +01:00
anon_inodes.c
attr.c vfs: move permission checking into notify_change() for utimes(NULL) 2016-10-22 12:26:56 +02:00
bad_inode.c
binfmt_aout.c
binfmt_elf.c binfmt_elf: switch to new creds when switching to new mm 2019-04-27 09:33:53 +02:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c fs/binfmt_misc.c: do not allow offset overflow 2018-07-03 11:21:26 +02:00
binfmt_script.c Revert "exec: load_script: don't blindly truncate shebang string" 2019-02-20 10:13:20 +01:00
block_dev.c fs/block_dev: always invalidate cleancache in invalidate_bdev() 2017-05-20 14:27:01 +02:00
buffer.c fs: fix guard_bio_eod to check for real EOD errors 2019-04-27 09:33:49 +02:00
char_dev.c
compat.c
compat_binfmt_elf.c binfmt_elf: compat: avoid unused function warning 2018-02-25 11:03:51 +01:00
compat_ioctl.c fs: compat: Remove warning from COMPATIBLE_IOCTL 2018-04-08 11:51:57 +02:00
coredump.c coredump: Ensure proper size of sparse core files 2017-07-05 14:37:20 +02:00
dax.c
dcache.c Hang/soft lockup in d_invalidate with simultaneous calls 2019-04-03 06:23:19 +02:00
dcookies.c
direct-io.c direct-io: Prevent NULL pointer access in submit_page_section 2017-10-18 09:20:42 +02:00
drop_caches.c fs/drop_caches.c: avoid softlockups in drop_pagecache_sb() 2019-03-23 08:44:26 +01:00
eventfd.c
eventpoll.c fs/epoll: drop ovflist branch prediction 2019-02-20 10:13:14 +01:00
exec.c mm: replace get_user_pages() write/force parameters with gup_flags 2018-12-17 21:55:16 +01:00
fcntl.c fs/fcntl: f_setown, avoid undefined behaviour 2018-01-31 12:06:11 +01:00
fhandle.c
file.c fs/file.c: initialize init_files.resize_wait 2019-04-27 09:33:49 +02:00
file_table.c
filesystems.c
fs-writeback.c fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount 2019-06-11 12:23:41 +02:00
fs_pin.c
fs_struct.c
inode.c writeback: initialize inode members that track writeback history 2019-04-03 06:23:21 +02:00
internal.h
ioctl.c
Kconfig
Kconfig.binfmt
libfs.c
locks.c locks: don't check for race with close when setting OFD lock 2018-01-17 09:35:27 +01:00
Makefile
mbcache.c
mount.h mnt: In propgate_umount handle visiting mounts in any order 2017-07-21 07:44:57 +02:00
mpage.c fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
namei.c namei: allow restricted O_CREAT of FIFOs and regular files 2018-12-01 09:46:41 +01:00
namespace.c mount: Prevent MNT_DETACH from disconnecting locked mounts 2018-11-21 09:27:44 +01:00
no-block.c
nsfs.c nsfs: mark dentry with DCACHE_RCUACCESS 2018-02-16 20:09:43 +01:00
open.c fs: completely ignore unknown open flags 2017-07-15 11:57:44 +02:00
pipe.c pipe: cap initial pipe capacity according to pipe-max-size limit 2018-05-26 08:48:51 +02:00
pnode.c mnt: Make propagate_umount less slow for overlapping mount propagation trees 2017-07-21 07:44:58 +02:00
pnode.h mnt: Add a per mount namespace limit on the number of mounts 2017-04-30 05:49:28 +02:00
posix_acl.c tmpfs: clear S_ISGID when setting posix ACLs 2017-01-26 08:23:47 +01:00
proc_namespace.c
read_write.c fs: add the fsnotify call to vfs_iter_write 2019-02-06 19:43:06 +01:00
readdir.c
select.c fs/select: add vmalloc fallback for select(2) 2018-01-31 12:06:09 +01:00
seq_file.c Make file credentials available to the seqfile interfaces 2017-08-06 19:19:42 -07:00
signalfd.c
splice.c vfs: fix uninitialized flags in splice_to_pipe() 2017-02-23 17:43:09 +01:00
stack.c
stat.c ufs: restore maintaining ->i_blocks 2017-06-14 13:16:24 +02:00
statfs.c
super.c fs: don't scan the inode cache before SB_BORN is set 2019-02-06 19:43:08 +01:00
sync.c
timerfd.c timerfd: Protect the might cancel mechanism proper 2017-05-08 07:46:01 +02:00
userfaultfd.c userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE 2017-12-20 10:04:53 +01:00
utimes.c vfs: move permission checking into notify_change() for utimes(NULL) 2016-10-22 12:26:56 +02:00
xattr.c getxattr: use correct xattr length 2018-09-09 20:04:36 +02:00