android_kernel_oneplus_msm8998/mm
Minchan Kim fdd6848191 mm: migrate: support non-lru movable page migration
We have allowed migration for only LRU pages until now and it was enough
to make high-order pages.  But recently, embedded system(e.g., webOS,
android) uses lots of non-movable pages(e.g., zram, GPU memory) so we
have seen several reports about troubles of small high-order allocation.
For fixing the problem, there were several efforts (e,g,.  enhance
compaction algorithm, SLUB fallback to 0-order page, reserved memory,
vmalloc and so on) but if there are lots of non-movable pages in system,
their solutions are void in the long run.

So, this patch is to support facility to change non-movable pages with
movable.  For the feature, this patch introduces functions related to
migration to address_space_operations as well as some page flags.

If a driver want to make own pages movable, it should define three
functions which are function pointers of struct
address_space_operations.

1. bool (*isolate_page) (struct page *page, isolate_mode_t mode);

What VM expects on isolate_page function of driver is to return *true*
if driver isolates page successfully.  On returing true, VM marks the
page as PG_isolated so concurrent isolation in several CPUs skip the
page for isolation.  If a driver cannot isolate the page, it should
return *false*.

Once page is successfully isolated, VM uses page.lru fields so driver
shouldn't expect to preserve values in that fields.

2. int (*migratepage) (struct address_space *mapping,
		struct page *newpage, struct page *oldpage, enum migrate_mode);

After isolation, VM calls migratepage of driver with isolated page.  The
function of migratepage is to move content of the old page to new page
and set up fields of struct page newpage.  Keep in mind that you should
indicate to the VM the oldpage is no longer movable via
__ClearPageMovable() under page_lock if you migrated the oldpage
successfully and returns 0.  If driver cannot migrate the page at the
moment, driver can return -EAGAIN.  On -EAGAIN, VM will retry page
migration in a short time because VM interprets -EAGAIN as "temporal
migration failure".  On returning any error except -EAGAIN, VM will give
up the page migration without retrying in this time.

Driver shouldn't touch page.lru field VM using in the functions.

3. void (*putback_page)(struct page *);

If migration fails on isolated page, VM should return the isolated page
to the driver so VM calls driver's putback_page with migration failed
page.  In this function, driver should put the isolated page back to the
own data structure.

4. non-lru movable page flags

There are two page flags for supporting non-lru movable page.

* PG_movable

Driver should use the below function to make page movable under
page_lock.

	void __SetPageMovable(struct page *page, struct address_space *mapping)

It needs argument of address_space for registering migration family
functions which will be called by VM.  Exactly speaking, PG_movable is
not a real flag of struct page.  Rather than, VM reuses page->mapping's
lower bits to represent it.

	#define PAGE_MAPPING_MOVABLE 0x2
	page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;

so driver shouldn't access page->mapping directly.  Instead, driver
should use page_mapping which mask off the low two bits of page->mapping
so it can get right struct address_space.

For testing of non-lru movable page, VM supports __PageMovable function.
However, it doesn't guarantee to identify non-lru movable page because
page->mapping field is unified with other variables in struct page.  As
well, if driver releases the page after isolation by VM, page->mapping
doesn't have stable value although it has PAGE_MAPPING_MOVABLE (Look at
__ClearPageMovable).  But __PageMovable is cheap to catch whether page
is LRU or non-lru movable once the page has been isolated.  Because LRU
pages never can have PAGE_MAPPING_MOVABLE in page->mapping.  It is also
good for just peeking to test non-lru movable pages before more
expensive checking with lock_page in pfn scanning to select victim.

For guaranteeing non-lru movable page, VM provides PageMovable function.
Unlike __PageMovable, PageMovable functions validates page->mapping and
mapping->a_ops->isolate_page under lock_page.  The lock_page prevents
sudden destroying of page->mapping.

Driver using __SetPageMovable should clear the flag via
__ClearMovablePage under page_lock before the releasing the page.

* PG_isolated

To prevent concurrent isolation among several CPUs, VM marks isolated
page as PG_isolated under lock_page.  So if a CPU encounters PG_isolated
non-lru movable page, it can skip it.  Driver doesn't need to manipulate
the flag because VM will set/clear it automatically.  Keep in mind that
if driver sees PG_isolated page, it means the page have been isolated by
VM so it shouldn't touch page.lru field.  PG_isolated is alias with
PG_reclaim flag so driver shouldn't use the flag for own purpose.

[opensource.ganesh@gmail.com: mm/compaction: remove local variable is_lru]
  Link: http://lkml.kernel.org/r/20160618014841.GA7422@leo-test
Link: http://lkml.kernel.org/r/1464736881-24886-3-git-send-email-minchan@kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: John Einar Reitan <john.reitan@foss.arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: bda807d4445414e8e77da704f116bb0880fe0c76
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I03380d927fed84c7464bd5f7c4405bef6b265b69
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2017-02-21 12:38:28 +05:30
..
kasan UBSAN: run-time undefined behavior sanity checker 2016-03-22 11:09:57 -07:00
backing-dev.c block: fix bdi vs gendisk lifetime mismatch 2016-08-20 18:09:24 +02:00
balloon_compaction.c virtio_balloon: fix race between migration and ballooning 2016-03-03 15:07:18 -08:00
bootmem.c mm: Remove __init annotations from free_bootmem_late 2016-03-22 11:03:23 -07:00
cleancache.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
cma.c mm: cma: check the max limit for cma allocation 2016-11-16 05:53:10 -08:00
cma.h mm: cma: mark cma_bitmap_maxno() inline in header 2015-08-14 15:56:32 -07:00
cma_debug.c mm/cma_debug: correct size input to bitmap function 2015-07-17 16:39:54 -07:00
compaction.c mm: migrate: support non-lru movable page migration 2017-02-21 12:38:28 +05:30
debug-pagealloc.c debug-pagealloc: Panic on pagealloc corruption 2016-10-28 10:55:48 +05:30
debug.c mm: add WasActive page flag 2016-05-31 15:27:11 -07:00
dmapool.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
early_ioremap.c mm/early_ioremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
fadvise.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
failslab.c mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM 2015-11-06 17:50:42 -08:00
filemap.c mm: vmstat: add pageoutclean 2016-03-25 16:03:59 -07:00
frame_vector.c mm: fix docbook comment for get_vaddr_frames() 2015-11-05 19:34:48 -08:00
frontswap.c frontswap: allow multiple backends 2015-06-24 17:49:45 -07:00
gup.c mm: remove gup_flags FOLL_WRITE games from __get_user_pages() 2016-12-09 17:43:44 -08:00
highmem.c mm/highmem: make kmap cache coloring aware 2014-08-06 18:01:22 -07:00
huge_memory.c Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4" 2016-08-26 14:34:05 -07:00
hugetlb.c hugetlb: fix nr_pmds accounting with shared page tables 2016-09-07 08:32:35 +02:00
hugetlb_cgroup.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
hwpoison-inject.c hwpoison: use page_cgroup_ino for filtering by memcg 2015-09-10 13:29:01 -07:00
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm, kswapd: replace kswapd compaction with waking up kcompactd 2017-02-21 12:37:21 +05:30
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
Kconfig mm: Support address range reclaim 2016-06-22 14:44:20 -07:00
Kconfig.debug mm/debug_pagealloc: ask users for default setting of debug_pagealloc 2016-04-29 14:40:10 -07:00
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c mm/kmemleak-test.c: use pr_fmt for logging 2014-06-06 16:08:18 -07:00
kmemleak.c kmemleak : Make kmemleak_stack_scan optional using config 2016-03-22 11:03:47 -07:00
ksm.c mm: migrate: support non-lru movable page migration 2017-02-21 12:38:28 +05:30
list_lru.c memcg: simplify and inline __mem_cgroup_from_kmem 2015-11-05 19:34:48 -08:00
maccess.c x86: remove more uaccess_32.h complexity 2016-08-27 11:23:38 +08:00
madvise.c mm: add a field to store names for private anonymous memory 2016-02-16 13:54:13 -08:00
Makefile Merge branch 'v4.4-16.09-android-tmp' into lsk-v4.4-16.09-android 2016-12-16 13:52:17 -08:00
memblock.c mm/memblock: disable local irqs while late memblock changes 2016-05-31 15:26:50 -07:00
memcontrol.c Merge "Merge branch 'v4.4-16.09-android-tmp' into lsk-v4.4-16.09-android" 2016-12-19 17:04:51 -08:00
memory-failure.c mm: Enhance per process reclaim to consider shared pages 2016-06-22 14:43:57 -07:00
memory.c Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4 2016-10-21 18:00:55 -07:00
memory_hotplug.c mm, memory hotplug: small cleanup in online_pages() 2017-02-21 12:37:10 +05:30
mempolicy.c mm: add a field to store names for private anonymous memory 2016-02-16 13:54:13 -08:00
mempool.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
memtest.c lib: memtest: Add MEMTEST_ENABLE_DEFAULT option 2016-05-05 15:05:52 -07:00
migrate.c mm: migrate: support non-lru movable page migration 2017-02-21 12:38:28 +05:30
mincore.c mm/mincore: use offset_in_page macro 2015-11-05 19:34:48 -08:00
mlock.c Merge remote-tracking branch 'lsk-44/linux-linaro-lsk-v4.4' into 44rc2 2016-03-23 20:51:00 -07:00
mm_init.c mm: meminit: remove mminit_verify_page_links 2015-06-30 19:44:56 -07:00
mmap.c Revert "arm64: Add support for app specific settings" 2016-08-15 15:08:53 -07:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mmu-notifier: add clear_young callback 2015-09-10 13:29:01 -07:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c mm: add a field to store names for private anonymous memory 2016-02-16 13:54:13 -08:00
mremap.c mm/mremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
msync.c mm/msync: use offset_in_page macro 2015-11-05 19:34:48 -08:00
nobootmem.c mm: Remove __init annotations from free_bootmem_late 2016-03-22 11:03:23 -07:00
nommu.c mm/nommu.c: drop unlikely inside BUG_ON() 2015-11-05 19:34:48 -08:00
oom_kill.c mm, oom: make dump_tasks public 2016-03-22 11:03:30 -07:00
page-writeback.c Merge remote-tracking branch 'msm4.4/tmp-da9a92f' into msm-4.4 2016-10-28 10:48:35 -07:00
page_alloc.c mm: migrate: support non-lru movable page migration 2017-02-21 12:38:28 +05:30
page_counter.c mm: page_counter: let page_counter_try_charge() return bool 2015-11-05 19:34:48 -08:00
page_ext.c mm: introduce idle page tracking 2015-09-10 13:29:01 -07:00
page_idle.c mm: introduce idle page tracking 2015-09-10 13:29:01 -07:00
page_io.c fs: use helper bio_add_page() instead of open coding on bi_io_vec 2015-08-13 12:32:00 -06:00
page_isolation.c mm: Inform KASAN when allocating pages during isolation 2016-11-23 17:18:03 -08:00
page_owner.c mm/page_owner: ask users about default setting of PAGE_OWNER 2016-05-03 15:53:55 -07:00
pagewalk.c mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers 2015-03-25 16:20:30 -07:00
percpu-km.c percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated 2014-09-02 14:46:05 -04:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c percpu: fix synchronization between synchronous map extension and chunk destruction 2016-07-27 09:47:33 -07:00
pgtable-generic.c mm,thp: khugepaged: call pte flush at the time of collapse 2016-02-25 12:01:23 -08:00
process_reclaim.c lowmemorykiller: Introduce sysfs node for ALMK and PPR adj threshold 2016-12-17 16:16:17 +05:30
process_vm_access.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 12:01:16 -08:00
quicklist.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
readahead.c mm: change initial readahead window size calculation 2016-03-22 11:03:39 -07:00
rmap.c mm: Enhance per process reclaim to consider shared pages 2016-06-22 14:43:57 -07:00
shmem.c Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android 2016-07-29 21:38:37 +01:00
showmem.c mm: showmem: make the notifiers atomic 2016-03-22 11:03:55 -07:00
slab.c mm: SLAB hardened usercopy support 2016-08-27 11:23:38 +08:00
slab.h slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slab_common.c mm: memcontrol: fix cgroup creation failure after many small jobs 2016-08-16 09:30:51 +02:00
slob.c slab/slub: adjust kmem_cache_alloc_bulk API 2015-11-22 11:58:44 -08:00
slub.c Merge branch 'v4.4-16.09-android-tmp' into lsk-v4.4-16.09-android 2016-12-16 13:52:17 -08:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap.c mm: make compound_head() robust 2015-11-06 17:50:42 -08:00
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_ratio.c mm: swap_ratio: bail out if there aren't any other swap device 2016-05-31 15:23:38 -07:00
swap_state.c lowmemorykiller: Don't count swap cache pages twice 2016-04-13 11:11:01 -07:00
swapfile.c mm: swap: swap ratio support 2016-03-23 21:19:04 -07:00
truncate.c mm + fs: extends support for cache dropping 2016-03-23 21:24:12 -07:00
usercopy.c UPSTREAM: usercopy: remove page-spanning test for now 2016-09-14 14:44:29 +05:30
userfaultfd.c userfaultfd: avoid mmap_sem read recursion in mcopy_atomic 2015-09-04 16:54:41 -07:00
util.c mm: migrate: support non-lru movable page migration 2017-02-21 12:38:28 +05:30
vmacache.c mm/vmacache: inline vmacache_valid_mm() 2015-11-05 19:34:48 -08:00
vmalloc.c mm: Update is_vmalloc_addr to account for vmalloc savings 2016-03-22 11:03:59 -07:00
vmpressure.c mm: vmpressure: make vmpressure window variable 2016-12-16 20:25:45 +05:30
vmscan.c mm: wake kcompactd before kswapd's short sleep 2017-02-21 12:37:39 +05:30
vmstat.c mm, compaction: introduce kcompactd 2017-02-21 12:36:57 +05:30
workingset.c list_lru: add helpers to isolate items 2015-02-12 18:54:10 -08:00
zbud.c mm: zbud: fix the locking scenarios with zcache 2016-08-25 11:49:45 +05:30
zcache.c mm: zcache: fix merge issues 2016-06-07 11:57:44 -07:00
zpool.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zsmalloc.c Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4" 2016-08-26 14:34:05 -07:00
zswap.c Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4" 2016-08-26 14:34:05 -07:00