android_kernel_oneplus_msm8998/mm
Johannes Weiner 299991298b mm: memcontrol: do not recurse in direct reclaim
commit 89a2848381b5fcd9c4d9c0cd97680e3b28730e31 upstream.

On 4.0, we saw a stack corruption from a page fault entering direct
memory cgroup reclaim, calling into btrfs_releasepage(), which then
tried to allocate an extent and recursed back into a kmem charge ad
nauseam:

  [...]
  btrfs_releasepage+0x2c/0x30
  try_to_release_page+0x32/0x50
  shrink_page_list+0x6da/0x7a0
  shrink_inactive_list+0x1e5/0x510
  shrink_lruvec+0x605/0x7f0
  shrink_zone+0xee/0x320
  do_try_to_free_pages+0x174/0x440
  try_to_free_mem_cgroup_pages+0xa7/0x130
  try_charge+0x17b/0x830
  memcg_charge_kmem+0x40/0x80
  new_slab+0x2d9/0x5a0
  __slab_alloc+0x2fd/0x44f
  kmem_cache_alloc+0x193/0x1e0
  alloc_extent_state+0x21/0xc0
  __clear_extent_bit+0x2b5/0x400
  try_release_extent_mapping+0x1a3/0x220
  __btrfs_releasepage+0x31/0x70
  btrfs_releasepage+0x2c/0x30
  try_to_release_page+0x32/0x50
  shrink_page_list+0x6da/0x7a0
  shrink_inactive_list+0x1e5/0x510
  shrink_lruvec+0x605/0x7f0
  shrink_zone+0xee/0x320
  do_try_to_free_pages+0x174/0x440
  try_to_free_mem_cgroup_pages+0xa7/0x130
  try_charge+0x17b/0x830
  mem_cgroup_try_charge+0x65/0x1c0
  handle_mm_fault+0x117f/0x1510
  __do_page_fault+0x177/0x420
  do_page_fault+0xc/0x10
  page_fault+0x22/0x30

On later kernels, kmem charging is opt-in rather than opt-out, and that
particular kmem allocation in btrfs_releasepage() is no longer being
charged and won't recurse and overrun the stack anymore.

But it's not impossible for an accounted allocation to happen from the
memcg direct reclaim context, and we needed to reproduce this crash many
times before we even got a useful stack trace out of it.

Like other direct reclaimers, mark tasks in memcg reclaim PF_MEMALLOC to
avoid recursing into any other form of direct reclaim.  Then let
recursive charges from PF_MEMALLOC contexts bypass the cgroup limit.

Link: http://lkml.kernel.org/r/20161025141050.GA13019@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10 16:36:32 +01:00
..
kasan kasan: fix kmemleak false-positive in kasan_module_alloc() 2015-11-20 16:17:32 -08:00
backing-dev.c block: fix bdi vs gendisk lifetime mismatch 2016-08-20 18:09:24 +02:00
balloon_compaction.c virtio_balloon: fix race between migration and ballooning 2016-03-03 15:07:18 -08:00
bootmem.c bootmem: avoid freeing to bootmem after bootmem is done 2015-09-08 15:35:28 -07:00
cleancache.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
cma.c mm/cma.c: suppress warning 2015-11-05 19:34:48 -08:00
cma.h mm: cma: mark cma_bitmap_maxno() inline in header 2015-08-14 15:56:32 -07:00
cma_debug.c mm/cma_debug: correct size input to bitmap function 2015-07-17 16:39:54 -07:00
compaction.c mm, compaction: prevent VM_BUG_ON when terminating freeing scanner 2016-08-10 11:49:25 +02:00
debug-pagealloc.c
debug.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
dmapool.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
early_ioremap.c mm/early_ioremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
fadvise.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
failslab.c mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM 2015-11-06 17:50:42 -08:00
filemap.c mm: filemap: fix mapping->nrpages double accounting in fuse 2016-10-28 03:01:34 -04:00
frame_vector.c mm: fix docbook comment for get_vaddr_frames() 2015-11-05 19:34:48 -08:00
frontswap.c frontswap: allow multiple backends 2015-06-24 17:49:45 -07:00
gup.c mm: remove gup_flags FOLL_WRITE games from __get_user_pages() 2016-10-20 10:00:47 +02:00
highmem.c
huge_memory.c mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check 2016-05-04 14:48:49 -07:00
hugetlb.c mm/hugetlb: fix memory offline with hugepage size > memory block size 2016-10-31 04:13:58 -06:00
hugetlb_cgroup.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
hwpoison-inject.c hwpoison: use page_cgroup_ino for filtering by memcg 2015-09-10 13:29:01 -07:00
init-mm.c
internal.h mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask 2016-08-10 11:49:25 +02:00
interval_tree.c
Kconfig mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm/kmemleak.c: remove unneeded initialization of object to NULL 2015-11-05 19:34:48 -08:00
ksm.c mm,ksm: fix endless looping in allocating memory when ksm enable 2016-10-07 15:23:40 +02:00
list_lru.c mm/list_lru.c: avoid error-path NULL pointer deref 2016-11-10 16:36:32 +01:00
maccess.c mm/maccess.c: actually return -EFAULT from strncpy_from_unsafe 2015-11-05 19:34:48 -08:00
madvise.c mm: madvise allow remove operation for hugetlbfs 2015-09-08 15:35:28 -07:00
Makefile media updates for v4.3-rc1 2015-09-11 16:42:39 -07:00
memblock.c mm/memblock: make memblock_remove_range() static 2015-11-05 19:34:48 -08:00
memcontrol.c mm: memcontrol: do not recurse in direct reclaim 2016-11-10 16:36:32 +01:00
memory-failure.c mm: soft-offline: check return value in second __get_any_page() call 2016-02-25 12:01:21 -08:00
memory.c numa: fix /proc/<pid>/numa_maps for THP 2016-05-04 14:48:49 -07:00
memory_hotplug.c mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone() 2015-12-29 17:45:49 -08:00
mempolicy.c mm: rename alloc_pages_exact_node() to __alloc_pages_node() 2015-09-08 15:35:28 -07:00
mempool.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
memtest.c memtest: remove unused header files 2015-09-08 15:35:28 -07:00
migrate.c mm: Export migrate_page_move_mapping and migrate_page_copy 2016-07-27 09:47:31 -07:00
mincore.c mm/mincore: use offset_in_page macro 2015-11-05 19:34:48 -08:00
mlock.c mm: fix mlock accouting 2016-02-25 12:01:21 -08:00
mm_init.c mm: meminit: remove mminit_verify_page_links 2015-06-30 19:44:56 -07:00
mmap.c mm: fix regression in remap_file_pages() emulation 2016-02-25 12:01:21 -08:00
mmu_context.c
mmu_notifier.c mmu-notifier: add clear_young callback 2015-09-10 13:29:01 -07:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx 2015-09-04 16:54:41 -07:00
mremap.c mm/mremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
msync.c mm/msync: use offset_in_page macro 2015-11-05 19:34:48 -08:00
nobootmem.c mm: page_alloc: pass PFN to __free_pages_bootmem 2015-06-30 19:44:55 -07:00
nommu.c mm/nommu.c: drop unlikely inside BUG_ON() 2015-11-05 19:34:48 -08:00
oom_kill.c mm/oom_kill.c: avoid attempting to kill init sharing same memory 2015-12-12 10:15:34 -08:00
page-writeback.c writeback: use higher precision calculation in domain_dirty_limits() 2016-07-27 09:47:29 -07:00
page_alloc.c mm, meminit: ensure node is online before checking whether pages are uninitialised 2016-08-10 11:49:25 +02:00
page_counter.c mm: page_counter: let page_counter_try_charge() return bool 2015-11-05 19:34:48 -08:00
page_ext.c mm: introduce idle page tracking 2015-09-10 13:29:01 -07:00
page_idle.c mm: introduce idle page tracking 2015-09-10 13:29:01 -07:00
page_io.c fs: use helper bio_add_page() instead of open coding on bi_io_vec 2015-08-13 12:32:00 -06:00
page_isolation.c mm: fix invalid node in alloc_migrate_target() 2016-04-20 15:41:53 +09:00
page_owner.c mm/page_owner: set correct gfp_mask on page_owner 2015-07-17 16:39:54 -07:00
pagewalk.c mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers 2015-03-25 16:20:30 -07:00
percpu-km.c
percpu-vm.c
percpu.c percpu: fix synchronization between synchronous map extension and chunk destruction 2016-07-27 09:47:33 -07:00
pgtable-generic.c mm,thp: khugepaged: call pte flush at the time of collapse 2016-02-25 12:01:23 -08:00
process_vm_access.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 12:01:16 -08:00
quicklist.c
readahead.c mm, fs: introduce mapping_gfp_constraint() 2015-11-06 17:50:42 -08:00
rmap.c mm: page migration use migration entry for swapcache too 2015-11-05 19:34:48 -08:00
shmem.c tmpfs: fix regression hang in fallocate undo 2016-07-27 09:47:40 -07:00
slab.c slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slab.h slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slab_common.c mm: memcontrol: fix cgroup creation failure after many small jobs 2016-08-16 09:30:51 +02:00
slob.c slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slub.c slub: clean up code for kmem cgroup support to kmem_cache_free_bulk 2016-05-04 14:48:49 -07:00
sparse-vmemmap.c
sparse.c
swap.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
swap_cgroup.c
swap_state.c mm: swap: zswap: maybe_preload & refactoring 2015-09-08 15:35:28 -07:00
swapfile.c mm: /proc/pid/smaps:: show proportional swap share of the mapping 2015-09-08 15:35:28 -07:00
truncate.c memcg: add per cgroup dirty page accounting 2015-06-02 08:33:33 -06:00
userfaultfd.c userfaultfd: avoid mmap_sem read recursion in mcopy_atomic 2015-09-04 16:54:41 -07:00
util.c proc: revert /proc/<pid>/maps [stack:TID] annotation 2016-09-15 08:27:46 +02:00
vmacache.c mm/vmacache: inline vmacache_valid_mm() 2015-11-05 19:34:48 -08:00
vmalloc.c mm: vmalloc: don't remove inexistent guard hole in remove_vm_area() 2015-11-20 16:17:32 -08:00
vmpressure.c
vmscan.c mm: memcontrol: do not recurse in direct reclaim 2016-11-10 16:36:32 +01:00
vmstat.c vmstat: allocate vmstat_wq before it is used 2016-01-08 23:47:54 -08:00
workingset.c mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page() 2016-10-28 03:01:34 -04:00
zbud.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zpool.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zsmalloc.c zsmalloc: fix zs_can_compact() integer overflow 2016-05-18 17:06:44 -07:00
zswap.c mm/zswap: provide unique zpool name 2016-05-11 11:21:14 +02:00