Commit graph

291 commits

Author SHA1 Message Date
Yisheng Xie
c9a9116e8d mm/migration: make isolate_movable_page() return int type
Patch series "HWPOISON: soft offlining for non-lru movable page", v6.

After Minchan's commit bda807d44454 ("mm: migrate: support non-lru
movable page migration"), some type of non-lru page like zsmalloc and
virtio-balloon page also support migration.

Therefore, we can:

1) soft offlining no-lru movable pages, which means when memory
   corrected errors occur on a non-lru movable page, we can stop to use
   it by migrating data onto another page and disable the original
   (maybe half-broken) one.

2) enable memory hotplug for non-lru movable pages, i.e. we may offline
   blocks, which include such pages, by using non-lru page migration.

This patchset is heavily dependent on non-lru movable page migration.

This patch (of 4):

Change the return type of isolate_movable_page() from bool to int.  It
will return 0 when isolate movable page successfully, and return -EBUSY
when it isolates failed.

There is no functional change within this patch but prepare for later
patch.

[xieyisheng1@huawei.com: v6]
  Link: http://lkml.kernel.org/r/1486108770-630-2-git-send-email-xieyisheng1@huawei.com
Link: http://lkml.kernel.org/r/1485867981-16037-2-git-send-email-ysxie@foxmail.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 9e5bcd610ffcedf5e485e78a72762810b25c7f25
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I0d921d0a4e202397ce2a37909c67aaafd86fb465
Signed-off-by: Arun KS <arunks@codeaurora.org>
2018-02-22 17:51:25 +05:30
Blagovest Kolenichev
03f50f905f Merge android-4.4@610af85 (v4.4.85) into msm-4.4
* refs/heads/tmp-610af85
  Linux 4.4.85
  ACPI / APEI: Add missing synchronize_rcu() on NOTIFY_SCI removal
  ACPI: ioapic: Clear on-stack resource before using it
  ntb_transport: fix bug calculating num_qps_mw
  ntb_transport: fix qp count bug
  ASoC: rsnd: don't call update callback if it was NULL
  ASoC: rsnd: ssi: 24bit data needs right-aligned settings
  ASoC: rsnd: Add missing initialization of ADG req_rate
  ASoC: rsnd: avoid pointless loop in rsnd_mod_interrupt()
  ASoC: rsnd: disable SRC.out only when stop timing
  ASoC: simple-card: don't fail if sysclk setting is not supported
  staging: rtl8188eu: add RNX-N150NUB support
  iio: hid-sensor-trigger: Fix the race with user space powering up sensors
  iio: imu: adis16480: Fix acceleration scale factor for adis16480
  ANDROID: binder: fix proc->tsk check.
  binder: Use wake up hint for synchronous transactions.
  binder: use group leader instead of open thread
  Bluetooth: bnep: fix possible might sleep error in bnep_session
  Bluetooth: cmtp: fix possible might sleep error in cmtp_session
  Bluetooth: hidp: fix possible might sleep error in hidp_session_thread
  perf/core: Fix group {cpu,task} validation
  nfsd: Limit end of page list when decoding NFSv4 WRITE
  cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
  cifs: Fix df output for users with quota limits
  tracing: Fix freeing of filter in create_filter() when set_str is false
  drm: rcar-du: Fix H/V sync signal polarity configuration
  drm: rcar-du: Fix display timing controller parameter
  drm: rcar-du: Fix crash in encoder failure error path
  drm: rcar-du: lvds: Rename PLLEN bit to PLLON
  drm: rcar-du: lvds: Fix PLL frequency-related configuration
  drm/atomic: If the atomic check fails, return its value first
  drm: Release driver tracking before making the object available again
  i2c: designware: Fix system suspend
  ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
  ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
  ALSA: core: Fix unexpected error at replacing user TLV
  Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
  Input: trackpoint - add new trackpoint firmware ID
  mei: me: add lewisburg device ids
  mei: me: add broxton pci device ids
  net_sched: fix order of queue length updates in qdisc_replace()
  net: sched: fix NULL pointer dereference when action calls some targets
  irda: do not leak initialized list.dev to userspace
  tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
  ipv6: repair fib6 tree in failure case
  ipv6: reset fn->rr_ptr when replacing route
  tipc: fix use-after-free
  sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
  ipv4: better IP_MAX_MTU enforcement
  net_sched/sfq: update hierarchical backlog when drop packet
  ipv4: fix NULL dereference in free_fib_info_rcu()
  dccp: defer ccid_hc_tx_delete() at dismantle time
  dccp: purge write queue in dccp_destroy_sock()
  af_key: do not use GFP_KERNEL in atomic contexts
  ANDROID: NFC: st21nfca: Fix memory OOB and leak issues in connectivity events handler
  Linux 4.4.84
  usb: qmi_wwan: add D-Link DWM-222 device ID
  usb: optimize acpi companion search for usb port devices
  perf/x86: Fix LBR related crashes on Intel Atom
  pids: make task_tgid_nr_ns() safe
  Sanitize 'move_pages()' permission checks
  irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
  irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
  x86/asm/64: Clear AC on NMI entries
  xen: fix bio vec merging
  mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
  mm/mempolicy: fix use after free when calling get_mempolicy
  ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices
  ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset
  ALSA: seq: 2nd attempt at fixing race creating a queue
  Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB
  Input: elan_i2c - add ELAN0608 to the ACPI table
  crypto: x86/sha1 - Fix reads beyond the number of blocks passed
  parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo
  audit: Fix use after free in audit_remove_watch_rule()
  netfilter: nf_ct_ext: fix possible panic after nf_ct_extend_unregister
  ANDROID: check dir value of xfrm_userpolicy_id
  ANDROID: NFC: Fix possible memory corruption when handling SHDLC I-Frame commands
  ANDROID: nfc: fdp: Fix possible buffer overflow in WCS4000 NFC driver
  ANDROID: NFC: st21nfca: Fix out of bounds kernel access when handling ATR_REQ
  UPSTREAM: usb: dwc3: gadget: don't send extra ZLP
  BACKPORT: usb: dwc3: gadget: handle request->zero
  ANDROID: usb: gadget: assign no-op request complete callbacks
  ANDROID: usb: gadget: configfs: fix null ptr in android_disconnect
  ANDROID: uid_sys_stats: Fix implicit declaration of get_cmdline()
  uid_sys_stats: log task io with a debug flag
  Linux 4.4.83
  pinctrl: samsung: Remove bogus irq_[un]mask from resource management
  pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
  pnfs/blocklayout: require 64-bit sector_t
  iio: adc: vf610_adc: Fix VALT selection value for REFSEL bits
  usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume
  usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter
  usb: core: unlink urbs from the tail of the endpoint's urb_list
  USB: Check for dropped connection before switching to full speed
  uas: Add US_FL_IGNORE_RESIDUE for Initio Corporation INIC-3069
  iio: light: tsl2563: use correct event code
  iio: accel: bmc150: Always restore device to normal mode after suspend-resume
  staging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read
  USB: hcd: Mark secondary HCD as dead if the primary one died
  usb: musb: fix tx fifo flush handling again
  USB: serial: pl2303: add new ATEN device id
  USB: serial: cp210x: add support for Qivicon USB ZigBee dongle
  USB: serial: option: add D-Link DWM-222 device ID
  nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
  fuse: initialize the flock flag in fuse_file on allocation
  iscsi-target: Fix iscsi_np reset hung task during parallel delete
  iscsi-target: fix memory leak in iscsit_setup_text_cmd()
  mm: ratelimit PFNs busy info message
  cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()
  ANDROID: Use sk_uid to replace uid get from socket file
  UPSTREAM: arm64: smp: Prevent raw_smp_processor_id() recursion
  UPSTREAM: arm64: restore get_current() optimisation
  ANDROID: arm64: Fix a copy-paste error in prior init_thread_info build fix

Conflicts:
	drivers/misc/Kconfig
	drivers/usb/dwc3/gadget.c
	include/linux/sched.h
	mm/migrate.c
	net/netfilter/xt_qtaguid.c

Change-Id: I3a0107fcb5c7455114b316426c9d669bb871acd1
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2017-09-04 17:20:09 -07:00
Linus Torvalds
46d51a26ef Sanitize 'move_pages()' permission checks
commit 197e7e521384a23b9e585178f3f11c9fa08274b9 upstream.

The 'move_paghes()' system call was introduced long long ago with the
same permission checks as for sending a signal (except using
CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability).

That turns out to not be a great choice - while the system call really
only moves physical page allocations around (and you need other
capabilities to do a lot of it), you can check the return value to map
out some the virtual address choices and defeat ASLR of a binary that
still shares your uid.

So change the access checks to the more common 'ptrace_may_access()'
model instead.

This tightens the access checks for the uid, and also effectively
changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that
anybody really _uses_ this legacy system call any more (we hav ebetter
NUMA placement models these days), so I expect nobody to notice.

Famous last words.

Reported-by: Otto Ebeling <otto.ebeling@iki.fi>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-24 17:02:36 -07:00
Vlastimil Babka
f788d97308 mm, page_owner: track and print last migrate reason
During migration, page_owner info is now copied with the rest of the
page, so the stacktrace leading to free page allocation during migration
is overwritten.  For debugging purposes, it might be however useful to
know that the page has been migrated since its initial allocation.  This
might happen many times during the lifetime for different reasons and
fully tracking this, especially with stacktraces would incur extra
memory costs.  As a compromise, store and print the migrate_reason of
the last migration that occurred to the page.  This is enough to
distinguish compaction, numa balancing etc.

Example page_owner entry after the patch:

  Page allocated via order 0, mask 0x24200ca(GFP_HIGHUSER_MOVABLE)
  PFN 628753 type Movable Block 1228 type Movable Flags 0x1fffff80040030(dirty|lru|swapbacked)
   [<ffffffff811682c4>] __alloc_pages_nodemask+0x134/0x230
   [<ffffffff811b6325>] alloc_pages_vma+0xb5/0x250
   [<ffffffff81177491>] shmem_alloc_page+0x61/0x90
   [<ffffffff8117a438>] shmem_getpage_gfp+0x678/0x960
   [<ffffffff8117c2b9>] shmem_fallocate+0x329/0x440
   [<ffffffff811de600>] vfs_fallocate+0x140/0x230
   [<ffffffff811df434>] SyS_fallocate+0x44/0x70
   [<ffffffff8158cc2e>] entry_SYSCALL_64_fastpath+0x12/0x71
  Page has been migrated, last migrate reason: compaction

Change-Id: I9c93f9f91fa71feaea1505d80ee56caf8daf5562
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 7cd12b4abfd2f8f42414c520bbd051a5b7dc7a8c
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
2017-07-07 15:12:10 +05:30
Vlastimil Babka
e91e9112cb mm, page_owner: copy page owner info during migration
The page_owner mechanism stores gfp_flags of an allocation and stack
trace that lead to it.  During page migration, the original information
is practically replaced by the allocation of free page as the migration
target.  Arguably this is less useful and might lead to all the
page_owner info for migratable pages gradually converge towards
compaction or numa balancing migrations.  It has also lead to
inaccuracies such as one fixed by commit e2cfc91120 ("mm/page_owner:
set correct gfp_mask on page_owner").

This patch thus introduces copying the page_owner info during migration.
However, since the fact that the page has been migrated from its
original place might be useful for debugging, the next patch will
introduce a way to track that information as well.

Change-Id: I4eb94be5fb2c93bbf165edb9f2a80091b5c8d7b1
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: d435edca928805074dae005ab9a42d9fa60fc702
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
2017-07-07 15:12:09 +05:30
Minchan Kim
ad386b38ee mm: balloon: use general non-lru movable page feature
Now, VM has a feature to migrate non-lru movable pages so balloon
doesn't need custom migration hooks in migrate.c and compaction.c.

Instead, this patch implements the page->mapping->a_ops->
{isolate|migrate|putback} functions.

With that, we could remove hooks for ballooning in general migration
functions and make balloon compaction simple.

[akpm@linux-foundation.org: compaction.h requires that the includer first include node.h]
Link: http://lkml.kernel.org/r/1464736881-24886-4-git-send-email-minchan@kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: b1123ea6d3b3da25af5c8a9d843bd07ab63213f4
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: Ibf5bc7ffcbfd31e01946729dc5366baa79d7cf5e
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2017-02-21 12:38:41 +05:30
Minchan Kim
fdd6848191 mm: migrate: support non-lru movable page migration
We have allowed migration for only LRU pages until now and it was enough
to make high-order pages.  But recently, embedded system(e.g., webOS,
android) uses lots of non-movable pages(e.g., zram, GPU memory) so we
have seen several reports about troubles of small high-order allocation.
For fixing the problem, there were several efforts (e,g,.  enhance
compaction algorithm, SLUB fallback to 0-order page, reserved memory,
vmalloc and so on) but if there are lots of non-movable pages in system,
their solutions are void in the long run.

So, this patch is to support facility to change non-movable pages with
movable.  For the feature, this patch introduces functions related to
migration to address_space_operations as well as some page flags.

If a driver want to make own pages movable, it should define three
functions which are function pointers of struct
address_space_operations.

1. bool (*isolate_page) (struct page *page, isolate_mode_t mode);

What VM expects on isolate_page function of driver is to return *true*
if driver isolates page successfully.  On returing true, VM marks the
page as PG_isolated so concurrent isolation in several CPUs skip the
page for isolation.  If a driver cannot isolate the page, it should
return *false*.

Once page is successfully isolated, VM uses page.lru fields so driver
shouldn't expect to preserve values in that fields.

2. int (*migratepage) (struct address_space *mapping,
		struct page *newpage, struct page *oldpage, enum migrate_mode);

After isolation, VM calls migratepage of driver with isolated page.  The
function of migratepage is to move content of the old page to new page
and set up fields of struct page newpage.  Keep in mind that you should
indicate to the VM the oldpage is no longer movable via
__ClearPageMovable() under page_lock if you migrated the oldpage
successfully and returns 0.  If driver cannot migrate the page at the
moment, driver can return -EAGAIN.  On -EAGAIN, VM will retry page
migration in a short time because VM interprets -EAGAIN as "temporal
migration failure".  On returning any error except -EAGAIN, VM will give
up the page migration without retrying in this time.

Driver shouldn't touch page.lru field VM using in the functions.

3. void (*putback_page)(struct page *);

If migration fails on isolated page, VM should return the isolated page
to the driver so VM calls driver's putback_page with migration failed
page.  In this function, driver should put the isolated page back to the
own data structure.

4. non-lru movable page flags

There are two page flags for supporting non-lru movable page.

* PG_movable

Driver should use the below function to make page movable under
page_lock.

	void __SetPageMovable(struct page *page, struct address_space *mapping)

It needs argument of address_space for registering migration family
functions which will be called by VM.  Exactly speaking, PG_movable is
not a real flag of struct page.  Rather than, VM reuses page->mapping's
lower bits to represent it.

	#define PAGE_MAPPING_MOVABLE 0x2
	page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;

so driver shouldn't access page->mapping directly.  Instead, driver
should use page_mapping which mask off the low two bits of page->mapping
so it can get right struct address_space.

For testing of non-lru movable page, VM supports __PageMovable function.
However, it doesn't guarantee to identify non-lru movable page because
page->mapping field is unified with other variables in struct page.  As
well, if driver releases the page after isolation by VM, page->mapping
doesn't have stable value although it has PAGE_MAPPING_MOVABLE (Look at
__ClearPageMovable).  But __PageMovable is cheap to catch whether page
is LRU or non-lru movable once the page has been isolated.  Because LRU
pages never can have PAGE_MAPPING_MOVABLE in page->mapping.  It is also
good for just peeking to test non-lru movable pages before more
expensive checking with lock_page in pfn scanning to select victim.

For guaranteeing non-lru movable page, VM provides PageMovable function.
Unlike __PageMovable, PageMovable functions validates page->mapping and
mapping->a_ops->isolate_page under lock_page.  The lock_page prevents
sudden destroying of page->mapping.

Driver using __SetPageMovable should clear the flag via
__ClearMovablePage under page_lock before the releasing the page.

* PG_isolated

To prevent concurrent isolation among several CPUs, VM marks isolated
page as PG_isolated under lock_page.  So if a CPU encounters PG_isolated
non-lru movable page, it can skip it.  Driver doesn't need to manipulate
the flag because VM will set/clear it automatically.  Keep in mind that
if driver sees PG_isolated page, it means the page have been isolated by
VM so it shouldn't touch page.lru field.  PG_isolated is alias with
PG_reclaim flag so driver shouldn't use the flag for own purpose.

[opensource.ganesh@gmail.com: mm/compaction: remove local variable is_lru]
  Link: http://lkml.kernel.org/r/20160618014841.GA7422@leo-test
Link: http://lkml.kernel.org/r/1464736881-24886-3-git-send-email-minchan@kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: John Einar Reitan <john.reitan@foss.arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: bda807d4445414e8e77da704f116bb0880fe0c76
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I03380d927fed84c7464bd5f7c4405bef6b265b69
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2017-02-21 12:38:28 +05:30
Minchan Kim
83e92bcc15 mm: use put_page() to free page instead of putback_lru_page()
Recently, I got many reports about perfermance degradation in embedded
system(Android mobile phone, webOS TV and so on) and easy fork fail.

The problem was fragmentation caused by zram and GPU driver mainly.
With memory pressure, their pages were spread out all of pageblock and
it cannot be migrated with current compaction algorithm which supports
only LRU pages.  In the end, compaction cannot work well so reclaimer
shrinks all of working set pages.  It made system very slow and even to
fail to fork easily which requires order-[2 or 3] allocations.

Other pain point is that they cannot use CMA memory space so when OOM
kill happens, I can see many free pages in CMA area, which is not memory
efficient.  In our product which has big CMA memory, it reclaims zones
too exccessively to allocate GPU and zram page although there are lots
of free space in CMA so system becomes very slow easily.

To solve these problem, this patch tries to add facility to migrate
non-lru pages via introducing new functions and page flags to help
migration.

struct address_space_operations {
	..
	..
	bool (*isolate_page)(struct page *, isolate_mode_t);
	void (*putback_page)(struct page *);
	..
}

new page flags

	PG_movable
	PG_isolated

For details, please read description in "mm: migrate: support non-lru
movable page migration".

Originally, Gioh Kim had tried to support this feature but he moved so I
took over the work.  I took many code from his work and changed a little
bit and Konstantin Khlebnikov helped Gioh a lot so he should deserve to
have many credit, too.

And I should mention Chulmin who have tested this patchset heavily so I
can find many bugs from him.  :)

Thanks, Gioh, Konstantin and Chulmin!

This patchset consists of five parts.

1. clean up migration
  mm: use put_page to free page instead of putback_lru_page

2. add non-lru page migration feature
  mm: migrate: support non-lru movable page migration

3. rework KVM memory-ballooning
  mm: balloon: use general non-lru movable page feature

4. zsmalloc refactoring for preparing page migration
  zsmalloc: keep max_object in size_class
  zsmalloc: use bit_spin_lock
  zsmalloc: use accessor
  zsmalloc: factor page chain functionality out
  zsmalloc: introduce zspage structure
  zsmalloc: separate free_zspage from putback_zspage
  zsmalloc: use freeobj for index

5. zsmalloc page migration
  zsmalloc: page migration support
  zram: use __GFP_MOVABLE for memory allocation

This patch (of 12):

Procedure of page migration is as follows:

First of all, it should isolate a page from LRU and try to migrate the
page.  If it is successful, it releases the page for freeing.
Otherwise, it should put the page back to LRU list.

For LRU pages, we have used putback_lru_page for both freeing and
putback to LRU list.  It's okay because put_page is aware of LRU list so
if it releases last refcount of the page, it removes the page from LRU
list.  However, It makes unnecessary operations (e.g., lru_cache_add,
pagevec and flags operations.  It would be not significant but no worth
to do) and harder to support new non-lru page migration because put_page
isn't aware of non-lru page's data structure.

To solve the problem, we can add new hook in put_page with PageMovable
flags check but it can increase overhead in hot path and needs new
locking scheme to stabilize the flag check with put_page.

So, this patch cleans it up to divide two semantic(ie, put and putback).
If migration is successful, use put_page instead of putback_lru_page and
use putback_lru_page only on failure.  That makes code more readable and
doesn't add overhead in put_page.

Comment from Vlastimil
 "Yeah, and compaction (perhaps also other migration users) has to drain
  the lru pvec...  Getting rid of this stuff is worth even by itself."

Link: http://lkml.kernel.org/r/1464736881-24886-2-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: c6c919eb90e021fbcfcbfa9dd3d55930cdbb67f9
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I2c00623996f8017b9c84ed12fcc4d85290a1712c
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2017-02-21 12:38:13 +05:30
Runmin Wang
9cc5c789d9 Merge remote-tracking branch 'msm4.4/tmp-da9a92f' into msm-4.4
* origin/tmp-da9a92f:
  arm64: kaslr: increase randomization granularity
  arm64: relocatable: deal with physically misaligned kernel images
  arm64: don't map TEXT_OFFSET bytes below the kernel if we can avoid it
  arm64: kernel: replace early 64-bit literal loads with move-immediates
  arm64: introduce mov_q macro to move a constant into a 64-bit register
  arm64: kernel: perform relocation processing from ID map
  arm64: kernel: use literal for relocated address of __secondary_switched
  arm64: kernel: don't export local symbols from head.S
  arm64: simplify kernel segment mapping granularity
  arm64: cover the .head.text section in the .text segment mapping
  arm64: move early boot code to the .init segment
  arm64: use 'segment' rather than 'chunk' to describe mapped kernel regions
  arm64: mm: Mark .rodata as RO
  Linux 4.4.16
  ovl: verify upper dentry before unlink and rename
  drm/i915: Revert DisplayPort fast link training feature
  tmpfs: fix regression hang in fallocate undo
  tmpfs: don't undo fallocate past its last page
  crypto: qat - make qat_asym_algs.o depend on asn1 headers
  xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7
  File names with trailing period or space need special case conversion
  cifs: dynamic allocation of ntlmssp blob
  Fix reconnect to not defer smb3 session reconnect long after socket reconnect
  53c700: fix BUG on untagged commands
  s390: fix test_fp_ctl inline assembly contraints
  scsi: fix race between simultaneous decrements of ->host_failed
  ovl: verify upper dentry in ovl_remove_and_whiteout()
  ovl: Copy up underlying inode's ->i_mode to overlay inode
  ARM: mvebu: fix HW I/O coherency related deadlocks
  ARM: dts: armada-38x: fix MBUS_ID for crypto SRAM on Armada 385 Linksys
  ARM: sunxi/dt: make the CHIP inherit from allwinner,sun5i-a13
  ALSA: hda: add AMD Stoney PCI ID with proper driver caps
  ALSA: hda - fix use-after-free after module unload
  ALSA: ctl: Stop notification after disconnection
  ALSA: pcm: Free chmap at PCM free callback, too
  ALSA: hda/realtek - add new pin definition in alc225 pin quirk table
  ALSA: hda - fix read before array start
  ALSA: hda - Add PCI ID for Kabylake-H
  ALSA: hda/realtek: Add Lenovo L460 to docking unit fixup
  ALSA: timer: Fix negative queue usage by racy accesses
  ALSA: echoaudio: Fix memory allocation
  ALSA: au88x0: Fix calculation in vortex_wtdma_bufshift()
  ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup
  ALSA: hda - Fix the headset mic jack detection on Dell machine
  ALSA: dummy: Fix a use-after-free at closing
  hwmon: (dell-smm) Cache fan_type() calls and change fan detection
  hwmon: (dell-smm) Disallow fan_type() calls on broken machines
  hwmon: (dell-smm) Restrict fan control and serial number to CAP_SYS_ADMIN by default
  tty/vt/keyboard: fix OOB access in do_compute_shiftstate()
  tty: vt: Fix soft lockup in fbcon cursor blink timer.
  iio:ad7266: Fix probe deferral for vref
  iio:ad7266: Fix support for optional regulators
  iio:ad7266: Fix broken regulator error handling
  iio: accel: kxsd9: fix the usage of spi_w8r8()
  staging: iio: accel: fix error check
  iio: hudmidity: hdc100x: fix incorrect shifting and scaling
  iio: humidity: hdc100x: fix IIO_TEMP channel reporting
  iio: humidity: hdc100x: correct humidity integration time mask
  iio: proximity: as3935: fix buffer stack trashing
  iio: proximity: as3935: remove triggered buffer processing
  iio: proximity: as3935: correct IIO_CHAN_INFO_RAW output
  iio: light apds9960: Add the missing dev.parent
  iio:st_pressure: fix sampling gains (bring inline with ABI)
  iio: Fix error handling in iio_trigger_attach_poll_func
  xen/balloon: Fix declared-but-not-defined warning
  perf/x86: Fix undefined shift on 32-bit kernels
  memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing
  drm/vmwgfx: Fix error paths when mapping framebuffer
  drm/vmwgfx: Delay pinning fbdev framebuffer until after mode set
  drm/vmwgfx: Check pin count before attempting to move a buffer
  drm/vmwgfx: Work around mode set failure in 2D VMs
  drm/vmwgfx: Add an option to change assumed FB bpp
  drm/ttm: Make ttm_bo_mem_compat available
  drm: atmel-hlcdc: actually disable scaling when no scaling is required
  drm: make drm_atomic_set_mode_prop_for_crtc() more reliable
  drm: add missing drm_mode_set_crtcinfo call
  drm/i915: Update CDCLK_FREQ register on BDW after changing cdclk frequency
  drm/i915: Update ifdeffery for mutex->owner
  drm/i915: Refresh cached DP port register value on resume
  drm/i915/ilk: Don't disable SSC source if it's in use
  drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern
  drm/nouveau: fix for disabled fbdev emulation
  drm/nouveau/fbcon: fix out-of-bounds memory accesses
  drm/nouveau/gr/gf100-: update sm error decoding from gk20a nvgpu headers
  drm/nouveau/disp/sor/gf119: both links use the same training register
  virtio_balloon: fix PFN format for virtio-1
  drm/dp/mst: Always clear proposed vcpi table for port.
  drm/amdkfd: destroy dbgmgr in notifier release
  drm/amdkfd: unbind only existing processes
  ubi: Make recover_peb power cut aware
  drm/amdgpu/gfx7: fix broken condition check
  drm/radeon: fix asic initialization for virtualized environments
  btrfs: account for non-CoW'd blocks in btrfs_abort_transaction
  percpu: fix synchronization between synchronous map extension and chunk destruction
  percpu: fix synchronization between chunk->map_extend_work and chunk destruction
  af_unix: fix hard linked sockets on overlay
  vfs: add d_real_inode() helper
  arm64: Rework valid_user_regs
  ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg()
  drm/mgag200: Black screen fix for G200e rev 4
  iommu/amd: Fix unity mapping initialization race
  iommu/vt-d: Enable QI on all IOMMUs before setting root entry
  iommu/arm-smmu: Wire up map_sg for arm-smmu-v3
  base: make module_create_drivers_dir race-free
  tracing: Handle NULL formats in hold_module_trace_bprintk_format()
  HID: multitouch: enable palm rejection for Windows Precision Touchpad
  HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands
  HID: elo: kill not flush the work
  KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode.
  kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES
  KEYS: potential uninitialized variable
  ARCv2: LLSC: software backoff is NOT needed starting HS2.1c
  ARCv2: Check for LL-SC livelock only if LLSC is enabled
  ipv6: Fix mem leak in rt6i_pcpu
  cdc_ncm: workaround for EM7455 "silent" data interface
  net_sched: fix mirrored packets checksum
  packet: Use symmetric hash for PACKET_FANOUT_HASH.
  sched/fair: Fix cfs_rq avg tracking underflow
  UBIFS: Implement ->migratepage()
  mm: Export migrate_page_move_mapping and migrate_page_copy
  MIPS: KVM: Fix modular KVM under QEMU
  ARM: 8579/1: mm: Fix definition of pmd_mknotpresent
  ARM: 8578/1: mm: ensure pmd_present only checks the valid bit
  ARM: imx6ul: Fix Micrel PHY mask
  NFS: Fix another OPEN_DOWNGRADE bug
  make nfs_atomic_open() call d_drop() on all ->open_context() errors.
  nfsd: check permissions when setting ACLs
  posix_acl: Add set_posix_acl
  nfsd: Extend the mutex holding region around in nfsd4_process_open2()
  nfsd: Always lock state exclusively.
  nfsd4/rpc: move backchannel create logic into rpc code
  writeback: use higher precision calculation in domain_dirty_limits()
  thermal: cpu_cooling: fix improper order during initialization
  uvc: Forward compat ioctls to their handlers directly
  Revert "gpiolib: Split GPIO flags parsing and GPIO configuration"
  x86/amd_nb: Fix boot crash on non-AMD systems
  kprobes/x86: Clear TF bit in fault on single-stepping
  x86, build: copy ldlinux.c32 to image.iso
  locking/static_key: Fix concurrent static_key_slow_inc()
  locking/qspinlock: Fix spin_unlock_wait() some more
  locking/ww_mutex: Report recursive ww_mutex locking early
  of: irq: fix of_irq_get[_byname]() kernel-doc
  of: fix autoloading due to broken modalias with no 'compatible'
  mnt: If fs_fully_visible fails call put_filesystem.
  mnt: Account for MS_RDONLY in fs_fully_visible
  mnt: fs_fully_visible test the proper mount for MNT_LOCKED
  usb: common: otg-fsm: add license to usb-otg-fsm
  USB: EHCI: declare hostpc register as zero-length array
  usb: dwc2: fix regression on big-endian PowerPC/ARM systems
  powerpc/tm: Always reclaim in start_thread() for exec() class syscalls
  powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
  powerpc/pseries: Fix PCI config address for DDW
  powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
  IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs
  IB/cm: Fix a recently introduced locking bug
  EDAC, sb_edac: Fix rank lookup on Broadwell
  mac80211: Fix mesh estab_plinks counting in STA removal case
  mac80211_hwsim: Add missing check for HWSIM_ATTR_SIGNAL
  mac80211: mesh: flush mesh paths unconditionally
  mac80211: fix fast_tx header alignment
  Linux 4.4.15
  usb: dwc3: exynos: Fix deferred probing storm.
  usb: host: ehci-tegra: Grab the correct UTMI pads reset
  usb: gadget: fix spinlock dead lock in gadgetfs
  USB: mos7720: delete parport
  xhci: Fix handling timeouted commands on hosts in weird states.
  USB: xhci: Add broken streams quirk for Frescologic device id 1009
  usb: xhci-plat: properly handle probe deferral for devm_clk_get()
  xhci: Cleanup only when releasing primary hcd
  usb: musb: host: correct cppi dma channel for isoch transfer
  usb: musb: Ensure rx reinit occurs for shared_fifo endpoints
  usb: musb: Stop bulk endpoint while queue is rotated
  usb: musb: only restore devctl when session was set in backup
  usb: quirks: Add no-lpm quirk for Acer C120 LED Projector
  usb: quirks: Fix sorting
  USB: uas: Fix slave queue_depth not being set
  crypto: user - re-add size check for CRYPTO_MSG_GETALG
  crypto: ux500 - memmove the right size
  crypto: vmx - Increase priority of aes-cbc cipher
  AX.25: Close socket connection on session completion
  bpf: try harder on clones when writing into skb
  net: alx: Work around the DMA RX overflow issue
  net: macb: fix default configuration for GMAC on AT91
  neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit()
  bpf, perf: delay release of BPF prog after grace period
  sock_diag: do not broadcast raw socket destruction
  Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address
  ipmr/ip6mr: Initialize the last assert time of mfc entries.
  netem: fix a use after free
  esp: Fix ESN generation under UDP encapsulation
  sit: correct IP protocol used in ipip6_err
  net: Don't forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC_DEBUG
  net_sched: fix pfifo_head_drop behavior vs backlog
  sdcardfs: Truncate packages_gid.list on overflow
  UPSTREAM: cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  BACKPORT: proc: add /proc/<pid>/timerslack_ns interface
  BACKPORT: timer: convert timer_slack_ns from unsigned long to u64
  netfilter: xt_quota2: make quota2_log work well
  Revert "usb: gadget: prevent change of Host MAC address of 'usb0' interface"
  BACKPORT: PM / sleep: Go direct_complete if driver has no callbacks
  ANDROID: base-cfg: enable UID_CPUTIME
  UPSTREAM: USB: usbfs: fix potential infoleak in devio
  UPSTREAM: ALSA: timer: Fix leak in events via snd_timer_user_ccallback
  UPSTREAM: ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt
  UPSTREAM: ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS
  ANDROID: configs: remove unused configs
  ANDROID: cpu: send KOBJ_ONLINE event when enabling cpus
  ANDROID: dm verity fec: initialize recursion level
  ANDROID: dm verity fec: fix RS block calculation
  Linux 4.4.14
  netfilter: x_tables: introduce and use xt_copy_counters_from_user
  netfilter: x_tables: do compat validation via translate_table
  netfilter: x_tables: xt_compat_match_from_user doesn't need a retval
  netfilter: ip6_tables: simplify translate_compat_table args
  netfilter: ip_tables: simplify translate_compat_table args
  netfilter: arp_tables: simplify translate_compat_table args
  netfilter: x_tables: don't reject valid target size on some architectures
  netfilter: x_tables: validate all offsets and sizes in a rule
  netfilter: x_tables: check for bogus target offset
  netfilter: x_tables: check standard target size too
  netfilter: x_tables: add compat version of xt_check_entry_offsets
  netfilter: x_tables: assert minimum target size
  netfilter: x_tables: kill check_entry helper
  netfilter: x_tables: add and use xt_check_entry_offsets
  netfilter: x_tables: validate targets of jumps
  netfilter: x_tables: don't move to non-existent next rule
  drm/core: Do not preserve framebuffer on rmfb, v4.
  crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq
  netfilter: x_tables: fix unconditional helper
  netfilter: x_tables: make sure e->next_offset covers remaining blob size
  netfilter: x_tables: validate e->target_offset early
  MIPS: Fix 64k page support for 32 bit kernels.
  sparc64: Fix return from trap window fill crashes.
  sparc: Harden signal return frame checks.
  sparc64: Take ctx_alloc_lock properly in hugetlb_setup().
  sparc64: Reduce TLB flushes during hugepte changes
  sparc/PCI: Fix for panic while enabling SR-IOV
  sparc64: Fix sparc64_set_context stack handling.
  sparc64: Fix numa node distance initialization
  sparc64: Fix bootup regressions on some Kconfig combinations.
  sparc: Fix system call tracing register handling.
  fix d_walk()/non-delayed __d_free() race
  sched: panic on corrupted stack end
  proc: prevent stacking filesystems on top
  x86/entry/traps: Don't force in_interrupt() to return true in IST handlers
  wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel
  ecryptfs: forbid opening files without mmap handler
  memcg: add RCU locking around css_for_each_descendant_pre() in memcg_offline_kmem()
  parisc: Fix pagefault crash in unaligned __get_user() call
  pinctrl: mediatek: fix dual-edge code defect
  powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call
  powerpc: Use privileged SPR number for MMCR2
  powerpc: Fix definition of SIAR and SDAR registers
  powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge
  arm64: mm: always take dirty state from new pte in ptep_set_access_flags
  arm64: Provide "model name" in /proc/cpuinfo for PER_LINUX32 tasks
  crypto: ccp - Fix AES XTS error for request sizes above 4096
  crypto: public_key: select CRYPTO_AKCIPHER
  irqchip/gic-v3: Fix ICC_SGI1R_EL1.INTID decoding mask
  s390/bpf: reduce maximum program size to 64 KB
  s390/bpf: fix recache skb->data/hlen for skb_vlan_push/pop
  gpio: bcm-kona: fix bcm_kona_gpio_reset() warnings
  ARM: fix PTRACE_SETVFPREGS on SMP systems
  ALSA: hda/realtek: Add T560 docking unit fixup
  ALSA: hda/realtek - Add support for new codecs ALC700/ALC701/ALC703
  ALSA: hda/realtek - ALC256 speaker noise issue
  ALSA: hda - Fix headset mic detection problem for Dell machine
  ALSA: hda - Add PCI ID for Kabylake
  KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi
  KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS
  vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices
  geneve: Relax MTU constraints
  vxlan: Relax MTU constraints
  ipv6: Skip XFRM lookup if dst_entry in socket cache is valid
  l2tp: fix configuration passed to setup_udp_tunnel_sock()
  bridge: Don't insert unnecessary local fdb entry on changing mac address
  tcp: record TLP and ER timer stats in v6 stats
  vxlan: Accept user specified MTU value when create new vxlan link
  team: don't call netdev_change_features under team->lock
  sfc: on MC reset, clear PIO buffer linkage in TXQs
  bpf, inode: disallow userns mounts
  uapi glibc compat: fix compilation when !__USE_MISC in glibc
  udp: prevent skbs lingering in tunnel socket queues
  bpf: Use mount_nodev not mount_ns to mount the bpf filesystem
  tuntap: correctly wake up process during uninit
  switchdev: pass pointer to fib_info instead of copy
  tipc: fix nametable publication field in nl compat
  netlink: Fix dump skb leak/double free
  tipc: check nl sock before parsing nested attributes
  scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist
  scsi_lib: correctly retry failed zero length REQ_TYPE_FS commands
  cs-etm: associating output packet with CPU they executed on
  cs-etm: removing unecessary structure field
  cs-etm: account for each trace buffer in the queue
  cs-etm: avoid casting variable
  perf tools: fixing Makefile problems
  perf tools: new naming convention for openCSD
  perf scripts: Add python scripts for CoreSight traces
  perf tools: decoding capailitity for CoreSight traces
  perf symbols: Check before overwriting build_id
  perf tools: pushing driver configuration down to the kernel
  perf tools: add infrastructure for PMU specific configuration
  coresight: etm-perf: incorporating sink definition from the cmd line
  coresight: adding sink parameter to function coresight_build_path()
  perf: passing struct perf_event to function setup_aux()
  perf/core: adding PMU driver specific configuration
  perf tools: adding coresight etm PMU record capabilities
  perf tools: making coresight PMU listable
  coresight: tmc: implementing TMC-ETR AUX space API
  coresight: Add support for Juno platform
  coresight: Handle build path error
  coresight: Fix erroneous memset in tmc_read_unprepare_etr
  coresight: Fix tmc_read_unprepare_etr
  coresight: Fix NULL pointer dereference in _coresight_build_path
  ANDROID: dm verity fec: add missing release from fec_ktype
  ANDROID: dm verity fec: limit error correction recursion
  ANDROID: restrict access to perf events
  FROMLIST: security,perf: Allow further restriction of perf_event_open
  BACKPORT: perf tools: Document the perf sysctls
  Revert "armv6 dcc tty driver"
  Revert "arm: dcc_tty: fix armv6 dcc tty build failure"
  ARM64: Ignore Image-dtb from git point of view
  arm64: add option to build Image-dtb
  ANDROID: usb: gadget: f_midi: set fi->f to NULL when free f_midi function
  Linux 4.4.13
  xfs: handle dquot buffer readahead in log recovery correctly
  xfs: print name of verifier if it fails
  xfs: skip stale inodes in xfs_iflush_cluster
  xfs: fix inode validity check in xfs_iflush_cluster
  xfs: xfs_iflush_cluster fails to abort on error
  xfs: Don't wrap growfs AGFL indexes
  xfs: disallow rw remount on fs with unknown ro-compat features
  gcov: disable tree-loop-im to reduce stack usage
  scripts/package/Makefile: rpmbuild add support of RPMOPTS
  dma-debug: avoid spinlock recursion when disabling dma-debug
  PM / sleep: Handle failures in device_suspend_late() consistently
  ext4: silence UBSAN in ext4_mb_init()
  ext4: address UBSAN warning in mb_find_order_for_block()
  ext4: fix oops on corrupted filesystem
  ext4: clean up error handling when orphan list is corrupted
  ext4: fix hang when processing corrupted orphaned inode list
  drm/imx: Match imx-ipuv3-crtc components using device node in platform data
  drm/i915: Don't leave old junk in ilk active watermarks on readout
  drm/atomic: Verify connector->funcs != NULL when clearing states
  drm/fb_helper: Fix references to dev->mode_config.num_connector
  drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config()
  drm/amdgpu: Fix hdmi deep color support.
  drm/amdgpu: use drm_mode_vrefresh() rather than mode->vrefresh
  drm/vmwgfx: Fix order of operation
  drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
  drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
  drm/gma500: Fix possible out of bounds read
  sunrpc: fix stripping of padded MIC tokens
  xen: use same main loop for counting and remapping pages
  xen/events: Don't move disabled irqs
  powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
  Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
  powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
  powerpc/book3s64: Fix branching to OOL handlers in relocatable kernel
  pipe: limit the per-user amount of pages allocated in pipes
  QE-UART: add "fsl,t1040-ucc-uart" to of_device_id
  wait/ptrace: assume __WALL if the child is traced
  mm: use phys_addr_t for reserve_bootmem_region() arguments
  media: v4l2-compat-ioctl32: fix missing reserved field copy in put_v4l2_create32
  PCI: Disable all BAR sizing for devices with non-compliant BARs
  pinctrl: exynos5440: Use off-stack memory for pinctrl_gpio_range
  clk: bcm2835: divider value has to be 1 or more
  clk: bcm2835: pll_off should only update CM_PLL_ANARST
  clk: at91: fix check of clk_register() returned value
  clk: bcm2835: Fix PLL poweron
  cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter()
  cpuidle: Indicate when a device has been unregistered
  PM / Runtime: Fix error path in pm_runtime_force_resume()
  mfd: intel_soc_pmic_core: Terminate panel control GPIO lookup table correctly
  mfd: intel-lpss: Save register context on suspend
  hwmon: (ads7828) Enable internal reference
  aacraid: Fix for KDUMP driver hang
  aacraid: Fix for aac_command_thread hang
  aacraid: Relinquish CPU during timeout wait
  rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring
  rtlwifi: Fix logic error in enter/exit power-save mode
  rtlwifi: btcoexist: Implement antenna selection
  rtlwifi: rtl8723be: Add antenna select module parameter
  hwrng: exynos - Fix unbalanced PM runtime put on timeout error path
  ath5k: Change led pin configuration for compaq c700 laptop
  ath10k: fix kernel panic, move arvifs list head init before htt init
  ath10k: fix rx_channel during hw reconfigure
  ath10k: fix firmware assert in monitor mode
  ath10k: fix debugfs pktlog_filter write
  ath9k: Fix LED polarity for some Mini PCI AR9220 MB92 cards.
  ath9k: Add a module parameter to invert LED polarity.
  ARM: dts: imx35: restore existing used clock enumeration
  ARM: dts: exynos: Add interrupt line to MAX8997 PMIC on exynos4210-trats
  ARM: dts: at91: fix typo in sama5d2 PIN_PD24 description
  ARM: mvebu: fix GPIO config on the Linksys boards
  Input: uinput - handle compat ioctl for UI_SET_PHYS
  ASoC: ak4642: Enable cache usage to fix crashes on resume
  affs: fix remount failure when there are no options changed
  MIPS: VDSO: Build with `-fno-strict-aliasing'
  MIPS: lib: Mark intrinsics notrace
  MIPS: Build microMIPS VDSO for microMIPS kernels
  MIPS: Fix sigreturn via VDSO on microMIPS kernel
  MIPS: ptrace: Prevent writes to read-only FCSR bits
  MIPS: ptrace: Fix FP context restoration FCSR regression
  MIPS: Disable preemption during prctl(PR_SET_FP_MODE, ...)
  MIPS: Prevent "restoration" of MSA context in non-MSA kernels
  MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
  MIPS: Use copy_s.fmt rather than copy_u.fmt
  MIPS: Loongson-3: Reserve 32MB for RS780E integrated GPU
  MIPS: Reserve nosave data for hibernation
  MIPS: ath79: make bootconsole wait for both THRE and TEMT
  MIPS: Sync icache & dcache in set_pte_at
  MIPS: Handle highmem pages in __update_cache
  MIPS: Flush highmem pages in __flush_dcache_page
  MIPS: Fix watchpoint restoration
  MIPS: Fix uapi include in exported asm/siginfo.h
  MIPS: Fix siginfo.h to use strict posix types
  MIPS: Avoid using unwind_stack() with usermode
  MIPS: Don't unwind to user mode with EVA
  MIPS: MSA: Fix a link error on `_init_msa_upper' with older GCC
  MIPS: math-emu: Fix jalr emulation when rd == $0
  MIPS64: R6: R2 emulation bugfix
  coresight: etb10: adjust read pointer only when needed
  coresight: configuring ETF in FIFO mode when acting as link
  coresight: tmc: implementing TMC-ETF AUX space API
  coresight: moving struct cs_buffers to header file
  coresight: tmc: keep track of memory width
  coresight: tmc: make sysFS and Perf mode mutually exclusive
  coresight: tmc: dump system memory content only when needed
  coresight: tmc: adding mode of operation for link/sinks
  coresight: tmc: getting rid of multiple read access
  coresight: tmc: allocating memory when needed
  coresight: tmc: making prepare/unprepare functions generic
  coresight: tmc: splitting driver in ETB/ETF and ETR components
  coresight: tmc: cleaning up header file
  coresight: tmc: introducing new header file
  coresight: tmc: clearly define number of transfers per burst
  coresight: tmc: re-implementing tmc_read_prepare/unprepare() functions
  coresight: tmc: waiting for TMCReady bit before programming
  coresight: tmc: modifying naming convention
  coresight: tmc: adding sysFS management entries
  coresight: etm4x: add tracer ID for A72 Maia processor.
  coresight: etb10: fixing the right amount of words to read
  coresight: stm: adding driver for CoreSight STM component
  coresight: adding path for STM device
  coresight: etm4x: modify q_support type
  coresight: no need to do the forced type conversion
  coresight: removing gratuitous boot time log messages
  coresight: etb10: splitting sysFS "status" entry
  coresight: moving coresight_simple_func() to header file
  coresight: etm4x: implementing the perf PMU API
  coresight: etm4x: implementing user/kernel mode tracing
  coresight: etm4x: moving etm_drvdata::enable to atomic field
  coresight: etm4x: unlocking tracers in default arch init
  coresight: etm4x: splitting etmv4 default configuration
  coresight: etm4x: splitting struct etmv4_drvdata
  coresight: etm4x: adding config and traceid registers
  coresight: etm4x: moving sysFS entries to a dedicated file
  stm class: Support devices that override software assigned masters
  stm class: Remove unnecessary pointer increment
  stm class: Fix stm device initialization order
  stm class: Do not leak the chrdev in error path
  stm class: Remove a pointless line
  stm class: stm_heartbeat: Make nr_devs parameter read-only
  stm class: dummy_stm: Make nr_dummies parameter read-only
  MAINTAINERS: Add a git tree for the stm class
  perf/ring_buffer: Document AUX API usage
  perf/core: Free AUX pages in unmap path
  perf/ring_buffer: Refuse to begin AUX transaction after rb->aux_mmap_count drops
  perf auxtrace: Add perf_evlist pointer to *info_priv_size()
  perf session: Simplify tool stubs
  perf inject: Hit all DSOs for AUX data in JIT and other cases
  perf tools: tracepoint_error() can receive e=NULL, robustify it
  perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)
  perf evsel: Introduce disable() method
  perf cpumap: Auto initialize cpu__max_{node,cpu}
  drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular
  drivers/hwtracing: make coresight-* explicitly non-modular
  coresight: introducing a global trace ID function
  coresight: etm-perf: new PMU driver for ETM tracers
  coresight: etb10: implementing AUX API
  coresight: etb10: adding operation mode for sink->enable()
  coresight: etb10: moving to local atomic operations
  coresight: etm3x: implementing perf_enable/disable() API
  coresight: etm3x: implementing user/kernel mode tracing
  coresight: etm3x: consolidating initial config
  coresight: etm3x: changing default trace configuration
  coresight: etm3x: set progbit to stop trace collection
  coresight: etm3x: adding operation mode for etm_enable()
  coresight: etm3x: splitting struct etm_drvdata
  coresight: etm3x: unlocking tracers in default arch init
  coresight: etm3x: moving sysFS entries to dedicated file
  coresight: etm3x: moving etm_readl/writel to header file
  coresight: moving PM runtime operations to core framework
  coresight: add API to get sink from path
  coresight: associating path with session rather than tracer
  coresight: etm4x: Check every parameter used by dma_xx_coherent.
  coresight: "DEVICE_ATTR_RO" should defined as static.
  coresight: implementing 'cpu_id()' API
  coresight: removing bind/unbind options from sysfs
  coresight: remove csdev's link from topology
  coresight: release reference taken by 'bus_find_device()'
  coresight: coresight_unregister() function cleanup
  coresight: fixing lockdep error
  coresight: fixing indentation problem
  coresight: Fix a typo in Kconfig
  coresight: checking for NULL string in coresight_name_match()
  perf/core: Disable the event on a truncated AUX record
  perf/core: Don't leak event in the syscall error path
  perf/core: Fix perf_sched_count derailment
  stm class: dummy_stm: Add link callback for fault injection
  stm class: Plug stm device's unlink callback
  stm class: Fix a race in unlinking
  stm class: Fix unbalanced module/device refcounting
  stm class: Guard output assignment against concurrency
  stm class: Fix unlocking braino in the error path
  stm class: Add heartbeat stm source device
  stm class: dummy_stm: Create multiple devices
  stm class: Support devices with multiple instances
  stm class: Use driver's packet callback return value
  stm class: Prevent user-controllable allocations
  stm class: Fix link list locking
  stm class: Fix locking in unbinding policy path
  stm class: Select CONFIG_SRCU
  stm class: Hide STM-specific options if STM is disabled
  perf: Synchronously free aux pages in case of allocation failure
  Linux 4.4.12
  kbuild: move -Wunused-const-variable to W=1 warning level
  Revert "scsi: fix soft lockup in scsi_remove_target() on module removal"
  scsi: Add intermediate STARGET_REMOVE state to scsi_target_state
  hpfs: implement the show_options method
  hpfs: fix remount failure when there are no options changed
  UBI: Fix static volume checks when Fastmap is used
  SIGNAL: Move generic copy_siginfo() to signal.h
  thunderbolt: Fix double free of drom buffer
  IB/srp: Fix a debug kernel crash
  ALSA: hda - Fix headset mic detection problem for one Dell machine
  ALSA: hda/realtek - Add support for ALC295/ALC3254
  ALSA: hda - Fix headphone noise on Dell XPS 13 9360
  ALSA: hda/realtek - New codecs support for ALC234/ALC274/ALC294
  mcb: Fixed bar number assignment for the gdd
  clk: bcm2835: add locking to pll*_on/off methods
  locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()
  serial: samsung: Reorder the sequence of clock control when call s3c24xx_serial_set_termios()
  serial: 8250_mid: recognize interrupt source in handler
  serial: 8250_mid: use proper bar for DNV platform
  serial: 8250_pci: fix divide error bug if baud rate is 0
  Fix OpenSSH pty regression on close
  tty/serial: atmel: fix hardware handshake selection
  TTY: n_gsm, fix false positive WARN_ON
  tty: vt, return error when con_startup fails
  xen/x86: actually allocate legacy interrupts on PV guests
  KVM: x86: mask CPUID(0xD,0x1).EAX against host value
  MIPS: KVM: Fix timer IRQ race when writing CP0_Compare
  MIPS: KVM: Fix timer IRQ race when freezing timer
  KVM: x86: fix ordering of cr0 initialization code in vmx_cpu_reset
  KVM: MTRR: remove MSR 0x2f8
  staging: comedi: das1800: fix possible NULL dereference
  usb: gadget: udc: core: Fix argument of dev_err() in usb_gadget_map_request()
  USB: leave LPM alone if possible when binding/unbinding interface drivers
  usb: misc: usbtest: fix pattern tests for scatterlists.
  usb: f_mass_storage: test whether thread is running before starting another
  usb: gadget: f_fs: Fix EFAULT generation for async read operations
  USB: serial: option: add even more ZTE device ids
  USB: serial: option: add more ZTE device ids
  USB: serial: option: add support for Cinterion PH8 and AHxx
  USB: serial: io_edgeport: fix memory leaks in probe error path
  USB: serial: io_edgeport: fix memory leaks in attach error path
  USB: serial: quatech2: fix use-after-free in probe error path
  USB: serial: keyspan: fix use-after-free in probe error path
  USB: serial: mxuport: fix use-after-free in probe error path
  mei: bus: call mei_cl_read_start under device lock
  mei: amthif: discard not read messages
  mei: fix NULL dereferencing during FW initiated disconnection
  Bluetooth: vhci: Fix race at creating hci device
  Bluetooth: vhci: purge unhandled skbs
  Bluetooth: vhci: fix open_timeout vs. hdev race
  mmc: sdhci-pci: Remove MMC_CAP_BUS_WIDTH_TEST for Intel controllers
  mmc: longer timeout for long read time quirk
  dell-rbtn: Ignore ACPI notifications if device is suspended
  ACPI / osi: Fix an issue that acpi_osi=!* cannot disable ACPICA internal strings
  mmc: sdhci-acpi: Remove MMC_CAP_BUS_WIDTH_TEST for Intel controllers
  mmc: mmc: Fix partition switch timeout for some eMMCs
  can: fix handling of unmodifiable configuration options
  irqchip/gic-v3: Configure all interrupts as non-secure Group-1
  irqchip/gic: Ensure ordering between read of INTACK and shared data
  Input: pwm-beeper - fix - scheduling while atomic
  mfd: omap-usb-tll: Fix scheduling while atomic BUG
  sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems
  clk: qcom: msm8916: Fix crypto clock flags
  crypto: sun4i-ss - Replace spinlock_bh by spin_lock_irq{save|restore}
  crypto: talitos - fix ahash algorithms registration
  crypto: caam - fix caam_jr_alloc() ret code
  ring-buffer: Prevent overflow of size in ring_buffer_resize()
  ring-buffer: Use long for nr_pages to avoid overflow failures
  asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
  fs/cifs: correctly to anonymous authentication for the NTLM(v2) authentication
  fs/cifs: correctly to anonymous authentication for the NTLM(v1) authentication
  fs/cifs: correctly to anonymous authentication for the LANMAN authentication
  fs/cifs: correctly to anonymous authentication via NTLMSSP
  remove directory incorrectly tries to set delete on close on non-empty directories
  kvm: arm64: Fix EC field in inject_abt64
  arm/arm64: KVM: Enforce Break-Before-Make on Stage-2 page tables
  arm64: cpuinfo: Missing NULL terminator in compat_hwcap_str
  arm64: Implement pmdp_set_access_flags() for hardware AF/DBM
  arm64: Implement ptep_set_access_flags() for hardware AF/DBM
  arm64: Ensure pmd_present() returns false after pmd_mknotpresent()
  arm64: Fix typo in the pmdp_huge_get_and_clear() definition
  ext4: iterate over buffer heads correctly in move_extent_per_page()
  perf test: Fix build of BPF and LLVM on older glibc libraries
  perf/core: Fix perf_event_open() vs. execve() race
  perf/x86/intel/pt: Generate PMI in the STOP region as well
  Btrfs: don't use src fd for printk
  UPSTREAM: mac80211: fix "warning: ‘target_metric’ may be used uninitialized"
  Revert "drivers: power: use 'current' instead of 'get_current()'"
  cpufreq: interactive: drop cpufreq_{get,put}_global_kobject func calls
  Revert "cpufreq: interactive: build fixes for 4.4"
  xt_qtaguid: Fix panic caused by processing non-full socket.
  fiq_debugger: Add fiq_debugger.disable option
  UPSTREAM: procfs: fixes pthread cross-thread naming if !PR_DUMPABLE
  FROMLIST: wlcore: Disable filtering in AP role
  Revert "drivers: power: Add watchdog timer to catch drivers which lockup during suspend."
  fiq_debugger: Add option to apply uart overlay by FIQ_DEBUGGER_UART_OVERLAY
  Revert "Recreate asm/mach/mmc.h include file"
  Revert "ARM: Add 'card_present' state to mmc_platfrom_data"
  usb: dual-role: make stub functions inline
  Revert "mmc: Add status IRQ and status callback function to mmc platform data"
  quick selinux support for tracefs
  Revert "hid-multitouch: Filter collections by application usage."
  Revert "HID: steelseries: validate output report details"
  xt_qtaguid: Fix panic caused by synack processing
  Revert "mm: vmscan: Add a debug file for shrinkers"
  Revert "SELinux: Enable setting security contexts on rootfs inodes."
  Revert "SELinux: build fix for 4.1"
  fuse: Add support for d_canonical_path
  vfs: change d_canonical_path to take two paths
  android: recommended.cfg: remove CONFIG_UID_STAT
  netfilter: xt_qtaguid: seq_printf fixes
  Revert "misc: uidstat: Adding uid stat driver to collect network statistics."
  Revert "net: activity_stats: Add statistics for network transmission activity"
  Revert "net: activity_stats: Stop using obsolete create_proc_read_entry api"
  Revert "misc: uidstat: avoid create_stat() race and blockage."
  Revert "misc: uidstat: Remove use of obsolete create_proc_read_entry api"
  Revert "misc seq_printf fixes for 4.4"
  Revert "misc: uid_stat: Include linux/atomic.h instead of asm/atomic.h"
  Revert "net: socket ioctl to reset connections matching local address"
  Revert "net: fix iterating over hashtable in tcp_nuke_addr()"
  Revert "net: fix crash in tcp_nuke_addr()"
  Revert "Don't kill IPv4 sockets when killing IPv6 sockets was requested."
  Revert "tcp: Fix IPV6 module build errors"
  android: base-cfg: remove CONFIG_SWITCH
  Revert "switch: switch class and GPIO drivers."
  Revert "drivers: switch: remove S_IWUSR from dev_attr"
  ANDROID: base-cfg: enable CONFIG_IP_NF_NAT
  BACKPORT: selinux: restrict kernel module loading
  android: base-cfg: enable CONFIG_QUOTA

Conflicts:
	Documentation/sysctl/kernel.txt
	drivers/cpufreq/cpufreq_interactive.c
	drivers/hwtracing/coresight/Kconfig
	drivers/hwtracing/coresight/Makefile
	drivers/hwtracing/coresight/coresight-etm4x.c
	drivers/hwtracing/coresight/coresight-etm4x.h
	drivers/hwtracing/coresight/coresight-priv.h
	drivers/hwtracing/coresight/coresight-stm.c
	drivers/hwtracing/coresight/coresight-tmc.c
	drivers/mmc/core/core.c
	include/linux/coresight-stm.h
	include/linux/coresight.h
	include/linux/msm_mdp.h
	include/uapi/linux/coresight-stm.h
	kernel/events/core.c
	kernel/sched/fair.c
	net/Makefile
	net/ipv4/netfilter/arp_tables.c
	net/ipv4/netfilter/ip_tables.c
	net/ipv4/tcp.c
	net/ipv6/netfilter/ip6_tables.c
	net/netfilter/xt_quota2.c
	sound/core/pcm.c

Change-Id: I17aa0002815014e9bddc47e67769a53c15768a99
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
2016-10-28 10:48:35 -07:00
Runmin Wang
617229a3e9 Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4
* msm-4.4/tmp-510d0a3f:
  Linux 4.4.11
  nf_conntrack: avoid kernel pointer value leak in slab name
  drm/radeon: fix DP link training issue with second 4K monitor
  drm/i915/bdw: Add missing delay during L3 SQC credit programming
  drm/i915: Bail out of pipe config compute loop on LPT
  drm/radeon: fix PLL sharing on DCE6.1 (v2)
  Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"
  Input: max8997-haptic - fix NULL pointer dereference
  get_rock_ridge_filename(): handle malformed NM entries
  tools lib traceevent: Do not reassign parg after collapse_tree()
  qla1280: Don't allocate 512kb of host tags
  atomic_open(): fix the handling of create_error
  regulator: axp20x: Fix axp22x ldo_io voltage ranges
  regulator: s2mps11: Fix invalid selector mask and voltages for buck9
  workqueue: fix rebind bound workers warning
  ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC
  vfs: rename: check backing inode being equal
  vfs: add vfs_select_inode() helper
  perf/core: Disable the event on a truncated AUX record
  regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case
  pinctrl: at91-pio4: fix pull-up/down logic
  spi: spi-ti-qspi: Handle truncated frames properly
  spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden
  spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT
  ALSA: hda - Fix broken reconfig
  ALSA: hda - Fix white noise on Asus UX501VW headset
  ALSA: hda - Fix subwoofer pin on ASUS N751 and N551
  ALSA: usb-audio: Yet another Phoneix Audio device quirk
  ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)
  crypto: testmgr - Use kmalloc memory for RSA input
  crypto: hash - Fix page length clamping in hash walk
  crypto: qat - fix invalid pf2vf_resp_wq logic
  s390/mm: fix asce_bits handling with dynamic pagetable levels
  zsmalloc: fix zs_can_compact() integer overflow
  ocfs2: fix posix_acl_create deadlock
  ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang
  net/route: enforce hoplimit max value
  tcp: refresh skb timestamp at retransmit time
  net: thunderx: avoid exposing kernel stack
  net: fix a kernel infoleak in x25 module
  uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h MIME-Version: 1.0
  bridge: fix igmp / mld query parsing
  net: bridge: fix old ioctl unlocked net device walk
  VSOCK: do not disconnect socket when peer has shutdown SEND only
  net/mlx4_en: Fix endianness bug in IPV6 csum calculation
  net: fix infoleak in rtnetlink
  net: fix infoleak in llc
  net: fec: only clear a queue's work bit if the queue was emptied
  netem: Segment GSO packets on enqueue
  sch_dsmark: update backlog as well
  sch_htb: update backlog as well
  net_sched: update hierarchical backlog too
  net_sched: introduce qdisc_replace() helper
  gre: do not pull header in ICMP error processing
  net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
  samples/bpf: fix trace_output example
  bpf: fix check_map_func_compatibility logic
  bpf: fix refcnt overflow
  bpf: fix double-fdput in replace_map_fd_with_map_ptr()
  net/mlx4_en: fix spurious timestamping callbacks
  ipv4/fib: don't warn when primary address is missing if in_dev is dead
  net/mlx5e: Fix minimum MTU
  net/mlx5e: Device's mtu field is u16 and not int
  openvswitch: use flow protocol when recalculating ipv6 checksums
  atl2: Disable unimplemented scatter/gather feature
  vlan: pull on __vlan_insert_tag error path and fix csum correction
  net: use skb_postpush_rcsum instead of own implementations
  cdc_mbim: apply "NDP to end" quirk to all Huawei devices
  bpf/verifier: reject invalid LD_ABS | BPF_DW instruction
  net: sched: do not requeue a NULL skb
  packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface
  route: do not cache fib route info on local routes with oif
  decnet: Do not build routes to devices without decnet private data.
  parisc: Use generic extable search and sort routines
  arm64: kasan: Use actual memory node when populating the kernel image shadow
  arm64: mm: treat memstart_addr as a signed quantity
  arm64: lse: deal with clobbered IP registers after branch via PLT
  arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly
  arm64: kasan: Fix zero shadow mapping overriding kernel image shadow
  arm64: consistently use p?d_set_huge
  arm64: fix KASLR boot-time I-cache maintenance
  arm64: hugetlb: partial revert of 66b3923a1a0f
  arm64: make irq_stack_ptr more robust
  arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  efi: stub: use high allocation for converted command line
  efi: stub: add implementation of efi_random_alloc()
  efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL
  arm64: kaslr: randomize the linear region
  arm64: add support for kernel ASLR
  arm64: add support for building vmlinux as a relocatable PIE binary
  arm64: switch to relative exception tables
  extable: add support for relative extables to search and sort routines
  scripts/sortextable: add support for ET_DYN binaries
  arm64: futex.h: Add missing PAN toggling
  arm64: make asm/elf.h available to asm files
  arm64: avoid dynamic relocations in early boot code
  arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  arm64: add support for module PLTs
  arm64: move brk immediate argument definitions to separate header
  arm64: mm: use bit ops rather than arithmetic in pa/va translations
  arm64: mm: only perform memstart_addr sanity check if DEBUG_VM
  arm64: User die() instead of panic() in do_page_fault()
  arm64: allow kernel Image to be loaded anywhere in physical memory
  arm64: defer __va translation of initrd_start and initrd_end
  arm64: move kernel image to base of vmalloc area
  arm64: kvm: deal with kernel symbols outside of linear mapping
  arm64: decouple early fixmap init from linear mapping
  arm64: pgtable: implement static [pte|pmd|pud]_offset variants
  arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  arm64: add support for ioremap() block mappings
  arm64: prevent potential circular header dependencies in asm/bug.h
  of/fdt: factor out assignment of initrd_start/initrd_end
  of/fdt: make memblock minimum physical address arch configurable
  arm64: Remove the get_thread_info() function
  arm64: kernel: Don't toggle PAN on systems with UAO
  arm64: cpufeature: Test 'matches' pointer to find the end of the list
  arm64: kernel: Add support for User Access Override
  arm64: add ARMv8.2 id_aa64mmfr2 boiler plate
  arm64: cpufeature: Change read_cpuid() to use sysreg's mrs_s macro
  arm64: use local label prefixes for __reg_num symbols
  arm64: vdso: Mark vDSO code as read-only
  arm64: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL
  arm64: ptdump: Indicate whether memory should be faulting
  arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
  arm64: Drop alloc function from create_mapping
  arm64: prefetch: add missing #include for spin_lock_prefetch
  arm64: lib: patch in prfm for copy_page if requested
  arm64: lib: improve copy_page to deal with 128 bytes at a time
  arm64: prefetch: add alternative pattern for CPUs without a prefetcher
  arm64: prefetch: don't provide spin_lock_prefetch with LSE
  arm64: allow vmalloc regions to be set with set_memory_*
  arm64: kernel: implement ACPI parking protocol
  arm64: mm: create new fine-grained mappings at boot
  arm64: ensure _stext and _etext are page-aligned
  arm64: mm: allow passing a pgdir to alloc_init_*
  arm64: mm: allocate pagetables anywhere
  arm64: mm: use fixmap when creating page tables
  arm64: mm: add functions to walk tables in fixmap
  arm64: mm: add __{pud,pgd}_populate
  arm64: mm: avoid redundant __pa(__va(x))
  arm64: mm: add functions to walk page tables by PA
  arm64: mm: move pte_* macros
  arm64: kasan: avoid TLB conflicts
  arm64: mm: add code to safely replace TTBR1_EL1
  arm64: add function to install the idmap
  arm64: unmap idmap earlier
  arm64: unify idmap removal
  arm64: mm: place empty_zero_page in bss
  arm64: mm: specialise pagetable allocators
  asm-generic: Fix local variable shadow in __set_fixmap_offset
  Eliminate the .eh_frame sections from the aarch64 vmlinux and kernel modules
  arm64: Fix an enum typo in mm/dump.c
  arm64: kasan: ensure that the KASAN zero page is mapped read-only
  arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP
  arm64: hide __efistub_ aliases from kallsyms
  Linux 4.4.10
  drm/i915/skl: Fix DMC load on Skylake J0 and K0
  lib/test-string_helpers.c: fix and improve string_get_size() tests
  ACPI / processor: Request native thermal interrupt handling via _OSC
  drm/i915: Fake HDMI live status
  drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
  drm/i915: Fix eDP low vswing for Broadwell
  drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  drm/radeon: make sure vertical front porch is at least 1
  iio: ak8975: fix maybe-uninitialized warning
  iio: ak8975: Fix NULL pointer exception on early interrupt
  drm/amdgpu: set metadata pointer to NULL after freeing.
  drm/amdgpu: make sure vertical front porch is at least 1
  gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
  nvmem: mxs-ocotp: fix buffer overflow in read
  USB: serial: cp210x: add Straizona Focusers device ids
  USB: serial: cp210x: add ID for Link ECU
  ata: ahci-platform: Add ports-implemented DT bindings.
  libahci: save port map for forced port map
  powerpc: Fix bad inline asm constraint in create_zero_mask()
  ACPICA: Dispatcher: Update thread ID for recursive method calls
  x86/sysfb_efi: Fix valid BAR address range check
  ARC: Add missing io barriers to io{read,write}{16,32}be()
  ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
  propogate_mnt: Handle the first propogated copy being a slave
  fs/pnode.c: treat zero mnt_group_id-s as unequal
  x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
  MAINTAINERS: Remove asterisk from EFI directory names
  writeback: Fix performance regression in wb_over_bg_thresh()
  batman-adv: Reduce refcnt of removed router when updating route
  batman-adv: Fix broadcast/ogm queue limit on a removed interface
  batman-adv: Check skb size before using encapsulated ETH+VLAN header
  batman-adv: fix DAT candidate selection (must use vid)
  mm: update min_free_kbytes from khugepaged after core initialization
  proc: prevent accessing /proc/<PID>/environ until it's ready
  Input: zforce_ts - fix dual touch recognition
  HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk
  HID: wacom: Add support for DTK-1651
  xen/evtchn: fix ring resize when binding new events
  xen/balloon: Fix crash when ballooning on x86 32 bit PAE
  xen: Fix page <-> pfn conversion on 32 bit systems
  ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel
  ARM: EXYNOS: Properly skip unitialized parent clock in power domain on
  mm/zswap: provide unique zpool name
  mm, cma: prevent nr_isolated_* counters from going negative
  Minimal fix-up of bad hashing behavior of hash_64()
  MD: make bio mergeable
  tracing: Don't display trigger file for events that can't be enabled
  mac80211: fix statistics leak if dev_alloc_name() fails
  ath9k: ar5008_hw_cmn_spur_mitigate: add missing mask_m & mask_p initialisation
  lpfc: fix misleading indentation
  clk: qcom: msm8960: Fix ce3_src register offset
  clk: versatile: sp810: support reentrance
  clk: qcom: msm8960: fix ce3_core clk enable register
  clk: meson: Fix meson_clk_register_clks() signature type mismatch
  clk: rockchip: free memory in error cases when registering clock branches
  soc: rockchip: power-domain: fix err handle while probing
  clk-divider: make sure read-only dividers do not write to their register
  CNS3xxx: Fix PCI cns3xxx_write_config()
  mwifiex: fix corner case association failure
  ata: ahci_xgene: dereferencing uninitialized pointer in probe
  nbd: ratelimit error msgs after socket close
  mfd: intel-lpss: Remove clock tree on error path
  ipvs: drop first packet to redirect conntrack
  ipvs: correct initial offset of Call-ID header search in SIP persistence engine
  ipvs: handle ip_vs_fill_iph_skb_off failure
  RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
  Revert: "powerpc/tm: Check for already reclaimed tasks"
  arm64: head.S: use memset to clear BSS
  efi: stub: define DISABLE_BRANCH_PROFILING for all architectures
  arm64: entry: remove pointless SPSR mode check
  arm64: mm: move pgd_cache initialisation to pgtable_cache_init
  arm64: module: avoid undefined shift behavior in reloc_data()
  arm64: module: fix relocation of movz instruction with negative immediate
  arm64: traps: address fallout from printk -> pr_* conversion
  arm64: ftrace: fix a stack tracer's output under function graph tracer
  arm64: pass a task parameter to unwind_frame()
  arm64: ftrace: modify a stack frame in a safe way
  arm64: remove irq_count and do_softirq_own_stack()
  arm64: hugetlb: add support for PTE contiguous bit
  arm64: Use PoU cache instr for I/D coherency
  arm64: Defer dcache flush in __cpu_copy_user_page
  arm64: reduce stack use in irq_handler
  arm64: Documentation: add list of software workarounds for errata
  arm64: mm: place __cpu_setup in .text
  arm64: cmpxchg: Don't incldue linux/mmdebug.h
  arm64: mm: fold alternatives into .init
  arm64: Remove redundant padding from linker script
  arm64: mm: remove pointless PAGE_MASKing
  arm64: don't call C code with el0's fp register
  arm64: when walking onto the task stack, check sp & fp are in current->stack
  arm64: Add this_cpu_ptr() assembler macro for use in entry.S
  arm64: irq: fix walking from irq stack to task stack
  arm64: Add do_softirq_own_stack() and enable irq_stacks
  arm64: Modify stack trace and dump for use with irq_stack
  arm64: Store struct thread_info in sp_el0
  arm64: Add trace_hardirqs_off annotation in ret_to_user
  arm64: ftrace: fix the comments for ftrace_modify_code
  arm64: ftrace: stop using kstop_machine to enable/disable tracing
  arm64: spinlock: serialise spin_unlock_wait against concurrent lockers
  arm64: enable HAVE_IRQ_TIME_ACCOUNTING
  arm64: fix COMPAT_SHMLBA definition for large pages
  arm64: add __init/__initdata section marker to some functions/variables
  arm64: pgtable: implement pte_accessible()
  arm64: mm: allow sections for unaligned bases
  arm64: mm: detect bad __create_mapping uses
  Linux 4.4.9
  extcon: max77843: Use correct size for reading the interrupt register
  stm class: Select CONFIG_SRCU
  megaraid_sas: add missing curly braces in ioctl handler
  sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race
  thermal: rockchip: fix a impossible condition caused by the warning
  unbreak allmodconfig KCONFIG_ALLCONFIG=...
  jme: Fix device PM wakeup API usage
  jme: Do not enable NIC WoL functions on S0
  bus: imx-weim: Take the 'status' property value into account
  ARM: dts: pxa: fix dma engine node to pxa3xx-nand
  ARM: dts: armada-375: use armada-370-sata for SATA
  ARM: EXYNOS: select THERMAL_OF
  ARM: prima2: always enable reset controller
  ARM: OMAP3: Add cpuidle parameters table for omap3430
  ext4: fix races of writeback with punch hole and zero range
  ext4: fix races between buffered IO and collapse / insert range
  ext4: move unlocked dio protection from ext4_alloc_file_blocks()
  ext4: fix races between page faults and hole punching
  perf stat: Document --detailed option
  perf tools: handle spaces in file names obtained from /proc/pid/maps
  perf hists browser: Only offer symbol scripting when a symbol is under the cursor
  mtd: nand: Drop mtd.owner requirement in nand_scan
  mtd: brcmnand: Fix v7.1 register offsets
  mtd: spi-nor: remove micron_quad_enable()
  serial: sh-sci: Remove cpufreq notifier to fix crash/deadlock
  ext4: fix NULL pointer dereference in ext4_mark_inode_dirty()
  x86/mm/kmmio: Fix mmiotrace for hugepages
  perf evlist: Reference count the cpu and thread maps at set_maps()
  drivers/misc/ad525x_dpot: AD5274 fix RDAC read back errors
  rtc: max77686: Properly handle regmap_irq_get_virq() error code
  rtc: rx8025: remove rv8803 id
  rtc: ds1685: passing bogus values to irq_restore
  rtc: vr41xx: Wire up alarm_irq_enable
  rtc: hym8563: fix invalid year calculation
  PM / Domains: Fix removal of a subdomain
  PM / OPP: Initialize u_volt_min/max to a valid value
  misc: mic/scif: fix wrap around tests
  misc/bmp085: Enable building as a module
  lib/mpi: Endianness fix
  fbdev: da8xx-fb: fix videomodes of lcd panels
  scsi_dh: force modular build if SCSI is a module
  paride: make 'verbose' parameter an 'int' again
  regulator: s5m8767: fix get_register() error handling
  irqchip/mxs: Fix error check of of_io_request_and_map()
  irqchip/sunxi-nmi: Fix error check of of_io_request_and_map()
  spi/rockchip: Make sure spi clk is on in rockchip_spi_set_cs
  locking/mcs: Fix mcs_spin_lock() ordering
  regulator: core: Fix nested locking of supplies
  regulator: core: Ensure we lock all regulators
  regulator: core: fix regulator_lock_supply regression
  Revert "regulator: core: Fix nested locking of supplies"
  videobuf2-v4l2: Verify planes array in buffer dequeueing
  videobuf2-core: Check user space planes array in dqbuf
  USB: usbip: fix potential out-of-bounds write
  cgroup: make sure a parent css isn't freed before its children
  mm/hwpoison: fix wrong num_poisoned_pages accounting
  mm: vmscan: reclaim highmem zone if buffer_heads is over limit
  numa: fix /proc/<pid>/numa_maps for THP
  mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
  memcg: relocate charge moving from ->attach to ->post_attach
  cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
  slub: clean up code for kmem cgroup support to kmem_cache_free_bulk
  workqueue: fix ghost PENDING flag while doing MQ IO
  x86/apic: Handle zero vector gracefully in clear_vector_irq()
  efi: Expose non-blocking set_variable() wrapper to efivars
  efi: Fix out-of-bounds read in variable_matches()
  IB/security: Restrict use of the write() interface
  IB/mlx5: Expose correct max_sge_rd limit
  cxl: Keep IRQ mappings on context teardown
  v4l2-dv-timings.h: fix polarity for 4k formats
  vb2-memops: Fix over allocation of frame vectors
  ASoC: rt5640: Correct the digital interface data select
  ASoC: dapm: Make sure we have a card when displaying component widgets
  ASoC: ssm4567: Reset device before regcache_sync()
  ASoC: s3c24xx: use const snd_soc_component_driver pointer
  EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
  toshiba_acpi: Fix regression caused by hotkey enabling value
  i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared
  i2c: cpm: Fix build break due to incompatible pointer types
  perf intel-pt: Fix segfault tracing transactions
  drm/i915: Use fw_domains_put_with_fifo() on HSW
  drm/i915: Fixup the free space logic in ring_prepare
  drm/amdkfd: uninitialized variable in dbgdev_wave_control_set_registers()
  drm/i915: skl_update_scaler() wants a rotation bitmask instead of bit number
  drm/i915: Cleanup phys status page too
  pwm: brcmstb: Fix check of devm_ioremap_resource() return code
  drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()
  drm/dp/mst: Restore primary hub guid on resume
  drm/dp/mst: Validate port in drm_dp_payload_send_msg()
  drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries
  drm: Loongson-3 doesn't fully support wc memory
  drm/radeon: fix vertical bars appear on monitor (v2)
  drm/radeon: forbid mapping of userptr bo through radeon device file
  drm/radeon: fix initial connector audio value
  drm/radeon: add a quirk for a XFX R9 270X
  drm/amdgpu: fix regression on CIK (v2)
  amdgpu/uvd: add uvd fw version for amdgpu
  drm/amdgpu: bump the afmt limit for CZ, ST, Polaris
  drm/amdgpu: use defines for CRTCs and AMFT blocks
  drm/amdgpu: when suspending, if uvd/vce was running. need to cancel delay work.
  iommu/dma: Restore scatterlist offsets correctly
  iommu/amd: Fix checking of pci dma aliases
  pinctrl: single: Fix pcs_parse_bits_in_pinctrl_entry to use __ffs than ffs
  pinctrl: mediatek: correct debounce time unit in mtk_gpio_set_debounce
  xen kconfig: don't "select INPUT_XEN_KBDDEV_FRONTEND"
  Input: pmic8xxx-pwrkey - fix algorithm for converting trigger delay
  Input: gtco - fix crash on detecting device without endpoints
  netlink: don't send NETLINK_URELEASE for unbound sockets
  nl80211: check netlink protocol in socket release notification
  powerpc: Update TM user feature bits in scan_features()
  powerpc: Update cpu_user_features2 in scan_features()
  powerpc: scan_features() updates incorrect bits for REAL_LE
  crypto: talitos - fix AEAD tcrypt tests
  crypto: talitos - fix crash in talitos_cra_init()
  crypto: sha1-mb - use corrcet pointer while completing jobs
  crypto: ccp - Prevent information leakage on export
  iwlwifi: mvm: fix memory leak in paging
  iwlwifi: pcie: lower the debug level for RSA semaphore access
  s390/pci: add extra padding to function measurement block
  cpufreq: intel_pstate: Fix processing for turbo activation ratio
  Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
  Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control"
  drm/i915: Fix race condition in intel_dp_destroy_mst_connector()
  drm/qxl: fix cursor position with non-zero hotspot
  drm/nouveau/core: use vzalloc for allocating ramht
  futex: Acknowledge a new waiter in counter before plist
  futex: Handle unlock_pi race gracefully
  asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic()
  ALSA: hda - Add dock support for ThinkPad X260
  ALSA: pcxhr: Fix missing mutex unlock
  ALSA: hda - add PCI ID for Intel Broxton-T
  ALSA: hda - Keep powering up ADCs on Cirrus codecs
  ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m
  ALSA: hda - Don't trust the reported actual power state
  x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address
  x86/mm/xen: Suppress hugetlbfs in PV guests
  arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission
  arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings
  sched/cgroup: Fix/cleanup cgroup teardown/init
  dmaengine: pxa_dma: fix the maximum requestor line
  dmaengine: hsu: correct use of channel status register
  dmaengine: dw: fix master selection
  debugfs: Make automount point inodes permanently empty
  lib: lz4: fixed zram with lz4 on big endian machines
  dm cache metadata: fix cmd_read_lock() acquiring write lock
  dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros
  usb: gadget: f_fs: Fix use-after-free
  usb: hcd: out of bounds access in for_each_companion
  xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers
  usb: xhci: fix wild pointers in xhci_mem_cleanup
  xhci: resume USB 3 roothub first
  usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host
  assoc_array: don't call compare_object() on a node
  ARM: OMAP2+: hwmod: Fix updating of sysconfig register
  ARM: OMAP2: Fix up interconnect barrier initialization for DRA7
  ARM: mvebu: Correct unit address for linksys
  ARM: dts: AM43x-epos: Fix clk parent for synctimer
  KVM: arm/arm64: Handle forward time correction gracefully
  kvm: x86: do not leak guest xcr0 into host interrupt handlers
  x86/mce: Avoid using object after free in genpool
  block: loop: fix filesystem corruption in case of aio/dio
  block: partition: initialize percpuref before sending out KOBJ_ADD

Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/cputype.h
	arch/arm64/include/asm/hardirq.h
	arch/arm64/include/asm/irq.h
	arch/arm64/include/asm/mmu_context.h
	arch/arm64/kernel/cpu_errata.c
	arch/arm64/kernel/cpuinfo.c
	arch/arm64/kernel/setup.c
	arch/arm64/kernel/smp.c
	arch/arm64/kernel/stacktrace.c
	arch/arm64/mm/init.c
	arch/arm64/mm/mmu.c
	arch/arm64/mm/pageattr.c
	mm/memcontrol.c

CRs-Fixed: 1069136
Signed-off-by: Bryan Huntsman <bryanh@codeaurora.org>
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
Change-Id: Ie9a16debd0578331a66947376f3b787a7bb54d65
2016-10-21 18:00:55 -07:00
Trilok Soni
5ab1e18aa3 Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4"
This reverts commit 9d6fd2c3e9 ("Merge remote-tracking branch
'msm-4.4/tmp-510d0a3f' into msm-4.4"), because it breaks the
dump parsing tools due to kernel can be loaded anywhere in the memory
now and not fixed at linear mapping.

Change-Id: Id416f0a249d803442847d09ac47781147b0d0ee6
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-08-26 14:34:05 -07:00
Trilok Soni
9d6fd2c3e9 Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4
* msm-4.4/tmp-510d0a3f:
  Linux 4.4.11
  nf_conntrack: avoid kernel pointer value leak in slab name
  drm/radeon: fix DP link training issue with second 4K monitor
  drm/i915/bdw: Add missing delay during L3 SQC credit programming
  drm/i915: Bail out of pipe config compute loop on LPT
  drm/radeon: fix PLL sharing on DCE6.1 (v2)
  Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"
  Input: max8997-haptic - fix NULL pointer dereference
  get_rock_ridge_filename(): handle malformed NM entries
  tools lib traceevent: Do not reassign parg after collapse_tree()
  qla1280: Don't allocate 512kb of host tags
  atomic_open(): fix the handling of create_error
  regulator: axp20x: Fix axp22x ldo_io voltage ranges
  regulator: s2mps11: Fix invalid selector mask and voltages for buck9
  workqueue: fix rebind bound workers warning
  ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC
  vfs: rename: check backing inode being equal
  vfs: add vfs_select_inode() helper
  perf/core: Disable the event on a truncated AUX record
  regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case
  pinctrl: at91-pio4: fix pull-up/down logic
  spi: spi-ti-qspi: Handle truncated frames properly
  spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden
  spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT
  ALSA: hda - Fix broken reconfig
  ALSA: hda - Fix white noise on Asus UX501VW headset
  ALSA: hda - Fix subwoofer pin on ASUS N751 and N551
  ALSA: usb-audio: Yet another Phoneix Audio device quirk
  ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)
  crypto: testmgr - Use kmalloc memory for RSA input
  crypto: hash - Fix page length clamping in hash walk
  crypto: qat - fix invalid pf2vf_resp_wq logic
  s390/mm: fix asce_bits handling with dynamic pagetable levels
  zsmalloc: fix zs_can_compact() integer overflow
  ocfs2: fix posix_acl_create deadlock
  ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang
  net/route: enforce hoplimit max value
  tcp: refresh skb timestamp at retransmit time
  net: thunderx: avoid exposing kernel stack
  net: fix a kernel infoleak in x25 module
  uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h MIME-Version: 1.0
  bridge: fix igmp / mld query parsing
  net: bridge: fix old ioctl unlocked net device walk
  VSOCK: do not disconnect socket when peer has shutdown SEND only
  net/mlx4_en: Fix endianness bug in IPV6 csum calculation
  net: fix infoleak in rtnetlink
  net: fix infoleak in llc
  net: fec: only clear a queue's work bit if the queue was emptied
  netem: Segment GSO packets on enqueue
  sch_dsmark: update backlog as well
  sch_htb: update backlog as well
  net_sched: update hierarchical backlog too
  net_sched: introduce qdisc_replace() helper
  gre: do not pull header in ICMP error processing
  net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
  samples/bpf: fix trace_output example
  bpf: fix check_map_func_compatibility logic
  bpf: fix refcnt overflow
  bpf: fix double-fdput in replace_map_fd_with_map_ptr()
  net/mlx4_en: fix spurious timestamping callbacks
  ipv4/fib: don't warn when primary address is missing if in_dev is dead
  net/mlx5e: Fix minimum MTU
  net/mlx5e: Device's mtu field is u16 and not int
  openvswitch: use flow protocol when recalculating ipv6 checksums
  atl2: Disable unimplemented scatter/gather feature
  vlan: pull on __vlan_insert_tag error path and fix csum correction
  net: use skb_postpush_rcsum instead of own implementations
  cdc_mbim: apply "NDP to end" quirk to all Huawei devices
  bpf/verifier: reject invalid LD_ABS | BPF_DW instruction
  net: sched: do not requeue a NULL skb
  packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface
  route: do not cache fib route info on local routes with oif
  decnet: Do not build routes to devices without decnet private data.
  parisc: Use generic extable search and sort routines
  arm64: kasan: Use actual memory node when populating the kernel image shadow
  arm64: mm: treat memstart_addr as a signed quantity
  arm64: lse: deal with clobbered IP registers after branch via PLT
  arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly
  arm64: kasan: Fix zero shadow mapping overriding kernel image shadow
  arm64: consistently use p?d_set_huge
  arm64: fix KASLR boot-time I-cache maintenance
  arm64: hugetlb: partial revert of 66b3923a1a0f
  arm64: make irq_stack_ptr more robust
  arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  efi: stub: use high allocation for converted command line
  efi: stub: add implementation of efi_random_alloc()
  efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL
  arm64: kaslr: randomize the linear region
  arm64: add support for kernel ASLR
  arm64: add support for building vmlinux as a relocatable PIE binary
  arm64: switch to relative exception tables
  extable: add support for relative extables to search and sort routines
  scripts/sortextable: add support for ET_DYN binaries
  arm64: futex.h: Add missing PAN toggling
  arm64: make asm/elf.h available to asm files
  arm64: avoid dynamic relocations in early boot code
  arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  arm64: add support for module PLTs
  arm64: move brk immediate argument definitions to separate header
  arm64: mm: use bit ops rather than arithmetic in pa/va translations
  arm64: mm: only perform memstart_addr sanity check if DEBUG_VM
  arm64: User die() instead of panic() in do_page_fault()
  arm64: allow kernel Image to be loaded anywhere in physical memory
  arm64: defer __va translation of initrd_start and initrd_end
  arm64: move kernel image to base of vmalloc area
  arm64: kvm: deal with kernel symbols outside of linear mapping
  arm64: decouple early fixmap init from linear mapping
  arm64: pgtable: implement static [pte|pmd|pud]_offset variants
  arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  arm64: add support for ioremap() block mappings
  arm64: prevent potential circular header dependencies in asm/bug.h
  of/fdt: factor out assignment of initrd_start/initrd_end
  of/fdt: make memblock minimum physical address arch configurable
  arm64: Remove the get_thread_info() function
  arm64: kernel: Don't toggle PAN on systems with UAO
  arm64: cpufeature: Test 'matches' pointer to find the end of the list
  arm64: kernel: Add support for User Access Override
  arm64: add ARMv8.2 id_aa64mmfr2 boiler plate
  arm64: cpufeature: Change read_cpuid() to use sysreg's mrs_s macro
  arm64: use local label prefixes for __reg_num symbols
  arm64: vdso: Mark vDSO code as read-only
  arm64: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL
  arm64: ptdump: Indicate whether memory should be faulting
  arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
  arm64: Drop alloc function from create_mapping
  arm64: prefetch: add missing #include for spin_lock_prefetch
  arm64: lib: patch in prfm for copy_page if requested
  arm64: lib: improve copy_page to deal with 128 bytes at a time
  arm64: prefetch: add alternative pattern for CPUs without a prefetcher
  arm64: prefetch: don't provide spin_lock_prefetch with LSE
  arm64: allow vmalloc regions to be set with set_memory_*
  arm64: kernel: implement ACPI parking protocol
  arm64: mm: create new fine-grained mappings at boot
  arm64: ensure _stext and _etext are page-aligned
  arm64: mm: allow passing a pgdir to alloc_init_*
  arm64: mm: allocate pagetables anywhere
  arm64: mm: use fixmap when creating page tables
  arm64: mm: add functions to walk tables in fixmap
  arm64: mm: add __{pud,pgd}_populate
  arm64: mm: avoid redundant __pa(__va(x))
  arm64: mm: add functions to walk page tables by PA
  arm64: mm: move pte_* macros
  arm64: kasan: avoid TLB conflicts
  arm64: mm: add code to safely replace TTBR1_EL1
  arm64: add function to install the idmap
  arm64: unmap idmap earlier
  arm64: unify idmap removal
  arm64: mm: place empty_zero_page in bss
  arm64: mm: specialise pagetable allocators
  asm-generic: Fix local variable shadow in __set_fixmap_offset
  Eliminate the .eh_frame sections from the aarch64 vmlinux and kernel modules
  arm64: Fix an enum typo in mm/dump.c
  arm64: kasan: ensure that the KASAN zero page is mapped read-only
  arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP
  arm64: hide __efistub_ aliases from kallsyms
  Linux 4.4.10
  drm/i915/skl: Fix DMC load on Skylake J0 and K0
  lib/test-string_helpers.c: fix and improve string_get_size() tests
  ACPI / processor: Request native thermal interrupt handling via _OSC
  drm/i915: Fake HDMI live status
  drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
  drm/i915: Fix eDP low vswing for Broadwell
  drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  drm/radeon: make sure vertical front porch is at least 1
  iio: ak8975: fix maybe-uninitialized warning
  iio: ak8975: Fix NULL pointer exception on early interrupt
  drm/amdgpu: set metadata pointer to NULL after freeing.
  drm/amdgpu: make sure vertical front porch is at least 1
  gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
  nvmem: mxs-ocotp: fix buffer overflow in read
  USB: serial: cp210x: add Straizona Focusers device ids
  USB: serial: cp210x: add ID for Link ECU
  ata: ahci-platform: Add ports-implemented DT bindings.
  libahci: save port map for forced port map
  powerpc: Fix bad inline asm constraint in create_zero_mask()
  ACPICA: Dispatcher: Update thread ID for recursive method calls
  x86/sysfb_efi: Fix valid BAR address range check
  ARC: Add missing io barriers to io{read,write}{16,32}be()
  ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
  propogate_mnt: Handle the first propogated copy being a slave
  fs/pnode.c: treat zero mnt_group_id-s as unequal
  x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
  MAINTAINERS: Remove asterisk from EFI directory names
  writeback: Fix performance regression in wb_over_bg_thresh()
  batman-adv: Reduce refcnt of removed router when updating route
  batman-adv: Fix broadcast/ogm queue limit on a removed interface
  batman-adv: Check skb size before using encapsulated ETH+VLAN header
  batman-adv: fix DAT candidate selection (must use vid)
  mm: update min_free_kbytes from khugepaged after core initialization
  proc: prevent accessing /proc/<PID>/environ until it's ready
  Input: zforce_ts - fix dual touch recognition
  HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk
  HID: wacom: Add support for DTK-1651
  xen/evtchn: fix ring resize when binding new events
  xen/balloon: Fix crash when ballooning on x86 32 bit PAE
  xen: Fix page <-> pfn conversion on 32 bit systems
  ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel
  ARM: EXYNOS: Properly skip unitialized parent clock in power domain on
  mm/zswap: provide unique zpool name
  mm, cma: prevent nr_isolated_* counters from going negative
  Minimal fix-up of bad hashing behavior of hash_64()
  MD: make bio mergeable
  tracing: Don't display trigger file for events that can't be enabled
  mac80211: fix statistics leak if dev_alloc_name() fails
  ath9k: ar5008_hw_cmn_spur_mitigate: add missing mask_m & mask_p initialisation
  lpfc: fix misleading indentation
  clk: qcom: msm8960: Fix ce3_src register offset
  clk: versatile: sp810: support reentrance
  clk: qcom: msm8960: fix ce3_core clk enable register
  clk: meson: Fix meson_clk_register_clks() signature type mismatch
  clk: rockchip: free memory in error cases when registering clock branches
  soc: rockchip: power-domain: fix err handle while probing
  clk-divider: make sure read-only dividers do not write to their register
  CNS3xxx: Fix PCI cns3xxx_write_config()
  mwifiex: fix corner case association failure
  ata: ahci_xgene: dereferencing uninitialized pointer in probe
  nbd: ratelimit error msgs after socket close
  mfd: intel-lpss: Remove clock tree on error path
  ipvs: drop first packet to redirect conntrack
  ipvs: correct initial offset of Call-ID header search in SIP persistence engine
  ipvs: handle ip_vs_fill_iph_skb_off failure
  RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
  Revert: "powerpc/tm: Check for already reclaimed tasks"
  arm64: head.S: use memset to clear BSS
  efi: stub: define DISABLE_BRANCH_PROFILING for all architectures
  arm64: entry: remove pointless SPSR mode check
  arm64: mm: move pgd_cache initialisation to pgtable_cache_init
  arm64: module: avoid undefined shift behavior in reloc_data()
  arm64: module: fix relocation of movz instruction with negative immediate
  arm64: traps: address fallout from printk -> pr_* conversion
  arm64: ftrace: fix a stack tracer's output under function graph tracer
  arm64: pass a task parameter to unwind_frame()
  arm64: ftrace: modify a stack frame in a safe way
  arm64: remove irq_count and do_softirq_own_stack()
  arm64: hugetlb: add support for PTE contiguous bit
  arm64: Use PoU cache instr for I/D coherency
  arm64: Defer dcache flush in __cpu_copy_user_page
  arm64: reduce stack use in irq_handler
  arm64: Documentation: add list of software workarounds for errata
  arm64: mm: place __cpu_setup in .text
  arm64: cmpxchg: Don't incldue linux/mmdebug.h
  arm64: mm: fold alternatives into .init
  arm64: Remove redundant padding from linker script
  arm64: mm: remove pointless PAGE_MASKing
  arm64: don't call C code with el0's fp register
  arm64: when walking onto the task stack, check sp & fp are in current->stack
  arm64: Add this_cpu_ptr() assembler macro for use in entry.S
  arm64: irq: fix walking from irq stack to task stack
  arm64: Add do_softirq_own_stack() and enable irq_stacks
  arm64: Modify stack trace and dump for use with irq_stack
  arm64: Store struct thread_info in sp_el0
  arm64: Add trace_hardirqs_off annotation in ret_to_user
  arm64: ftrace: fix the comments for ftrace_modify_code
  arm64: ftrace: stop using kstop_machine to enable/disable tracing
  arm64: spinlock: serialise spin_unlock_wait against concurrent lockers
  arm64: enable HAVE_IRQ_TIME_ACCOUNTING
  arm64: fix COMPAT_SHMLBA definition for large pages
  arm64: add __init/__initdata section marker to some functions/variables
  arm64: pgtable: implement pte_accessible()
  arm64: mm: allow sections for unaligned bases
  arm64: mm: detect bad __create_mapping uses
  Linux 4.4.9
  extcon: max77843: Use correct size for reading the interrupt register
  stm class: Select CONFIG_SRCU
  megaraid_sas: add missing curly braces in ioctl handler
  sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race
  thermal: rockchip: fix a impossible condition caused by the warning
  unbreak allmodconfig KCONFIG_ALLCONFIG=...
  jme: Fix device PM wakeup API usage
  jme: Do not enable NIC WoL functions on S0
  bus: imx-weim: Take the 'status' property value into account
  ARM: dts: pxa: fix dma engine node to pxa3xx-nand
  ARM: dts: armada-375: use armada-370-sata for SATA
  ARM: EXYNOS: select THERMAL_OF
  ARM: prima2: always enable reset controller
  ARM: OMAP3: Add cpuidle parameters table for omap3430
  ext4: fix races of writeback with punch hole and zero range
  ext4: fix races between buffered IO and collapse / insert range
  ext4: move unlocked dio protection from ext4_alloc_file_blocks()
  ext4: fix races between page faults and hole punching
  perf stat: Document --detailed option
  perf tools: handle spaces in file names obtained from /proc/pid/maps
  perf hists browser: Only offer symbol scripting when a symbol is under the cursor
  mtd: nand: Drop mtd.owner requirement in nand_scan
  mtd: brcmnand: Fix v7.1 register offsets
  mtd: spi-nor: remove micron_quad_enable()
  serial: sh-sci: Remove cpufreq notifier to fix crash/deadlock
  ext4: fix NULL pointer dereference in ext4_mark_inode_dirty()
  x86/mm/kmmio: Fix mmiotrace for hugepages
  perf evlist: Reference count the cpu and thread maps at set_maps()
  drivers/misc/ad525x_dpot: AD5274 fix RDAC read back errors
  rtc: max77686: Properly handle regmap_irq_get_virq() error code
  rtc: rx8025: remove rv8803 id
  rtc: ds1685: passing bogus values to irq_restore
  rtc: vr41xx: Wire up alarm_irq_enable
  rtc: hym8563: fix invalid year calculation
  PM / Domains: Fix removal of a subdomain
  PM / OPP: Initialize u_volt_min/max to a valid value
  misc: mic/scif: fix wrap around tests
  misc/bmp085: Enable building as a module
  lib/mpi: Endianness fix
  fbdev: da8xx-fb: fix videomodes of lcd panels
  scsi_dh: force modular build if SCSI is a module
  paride: make 'verbose' parameter an 'int' again
  regulator: s5m8767: fix get_register() error handling
  irqchip/mxs: Fix error check of of_io_request_and_map()
  irqchip/sunxi-nmi: Fix error check of of_io_request_and_map()
  spi/rockchip: Make sure spi clk is on in rockchip_spi_set_cs
  locking/mcs: Fix mcs_spin_lock() ordering
  regulator: core: Fix nested locking of supplies
  regulator: core: Ensure we lock all regulators
  regulator: core: fix regulator_lock_supply regression
  Revert "regulator: core: Fix nested locking of supplies"
  videobuf2-v4l2: Verify planes array in buffer dequeueing
  videobuf2-core: Check user space planes array in dqbuf
  USB: usbip: fix potential out-of-bounds write
  cgroup: make sure a parent css isn't freed before its children
  mm/hwpoison: fix wrong num_poisoned_pages accounting
  mm: vmscan: reclaim highmem zone if buffer_heads is over limit
  numa: fix /proc/<pid>/numa_maps for THP
  mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
  memcg: relocate charge moving from ->attach to ->post_attach
  cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
  slub: clean up code for kmem cgroup support to kmem_cache_free_bulk
  workqueue: fix ghost PENDING flag while doing MQ IO
  x86/apic: Handle zero vector gracefully in clear_vector_irq()
  efi: Expose non-blocking set_variable() wrapper to efivars
  efi: Fix out-of-bounds read in variable_matches()
  IB/security: Restrict use of the write() interface
  IB/mlx5: Expose correct max_sge_rd limit
  cxl: Keep IRQ mappings on context teardown
  v4l2-dv-timings.h: fix polarity for 4k formats
  vb2-memops: Fix over allocation of frame vectors
  ASoC: rt5640: Correct the digital interface data select
  ASoC: dapm: Make sure we have a card when displaying component widgets
  ASoC: ssm4567: Reset device before regcache_sync()
  ASoC: s3c24xx: use const snd_soc_component_driver pointer
  EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
  toshiba_acpi: Fix regression caused by hotkey enabling value
  i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared
  i2c: cpm: Fix build break due to incompatible pointer types
  perf intel-pt: Fix segfault tracing transactions
  drm/i915: Use fw_domains_put_with_fifo() on HSW
  drm/i915: Fixup the free space logic in ring_prepare
  drm/amdkfd: uninitialized variable in dbgdev_wave_control_set_registers()
  drm/i915: skl_update_scaler() wants a rotation bitmask instead of bit number
  drm/i915: Cleanup phys status page too
  pwm: brcmstb: Fix check of devm_ioremap_resource() return code
  drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()
  drm/dp/mst: Restore primary hub guid on resume
  drm/dp/mst: Validate port in drm_dp_payload_send_msg()
  drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries
  drm: Loongson-3 doesn't fully support wc memory
  drm/radeon: fix vertical bars appear on monitor (v2)
  drm/radeon: forbid mapping of userptr bo through radeon device file
  drm/radeon: fix initial connector audio value
  drm/radeon: add a quirk for a XFX R9 270X
  drm/amdgpu: fix regression on CIK (v2)
  amdgpu/uvd: add uvd fw version for amdgpu
  drm/amdgpu: bump the afmt limit for CZ, ST, Polaris
  drm/amdgpu: use defines for CRTCs and AMFT blocks
  drm/amdgpu: when suspending, if uvd/vce was running. need to cancel delay work.
  iommu/dma: Restore scatterlist offsets correctly
  iommu/amd: Fix checking of pci dma aliases
  pinctrl: single: Fix pcs_parse_bits_in_pinctrl_entry to use __ffs than ffs
  pinctrl: mediatek: correct debounce time unit in mtk_gpio_set_debounce
  xen kconfig: don't "select INPUT_XEN_KBDDEV_FRONTEND"
  Input: pmic8xxx-pwrkey - fix algorithm for converting trigger delay
  Input: gtco - fix crash on detecting device without endpoints
  netlink: don't send NETLINK_URELEASE for unbound sockets
  nl80211: check netlink protocol in socket release notification
  powerpc: Update TM user feature bits in scan_features()
  powerpc: Update cpu_user_features2 in scan_features()
  powerpc: scan_features() updates incorrect bits for REAL_LE
  crypto: talitos - fix AEAD tcrypt tests
  crypto: talitos - fix crash in talitos_cra_init()
  crypto: sha1-mb - use corrcet pointer while completing jobs
  crypto: ccp - Prevent information leakage on export
  iwlwifi: mvm: fix memory leak in paging
  iwlwifi: pcie: lower the debug level for RSA semaphore access
  s390/pci: add extra padding to function measurement block
  cpufreq: intel_pstate: Fix processing for turbo activation ratio
  Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
  Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control"
  drm/i915: Fix race condition in intel_dp_destroy_mst_connector()
  drm/qxl: fix cursor position with non-zero hotspot
  drm/nouveau/core: use vzalloc for allocating ramht
  futex: Acknowledge a new waiter in counter before plist
  futex: Handle unlock_pi race gracefully
  asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic()
  ALSA: hda - Add dock support for ThinkPad X260
  ALSA: pcxhr: Fix missing mutex unlock
  ALSA: hda - add PCI ID for Intel Broxton-T
  ALSA: hda - Keep powering up ADCs on Cirrus codecs
  ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m
  ALSA: hda - Don't trust the reported actual power state
  x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address
  x86/mm/xen: Suppress hugetlbfs in PV guests
  arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission
  arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings
  sched/cgroup: Fix/cleanup cgroup teardown/init
  dmaengine: pxa_dma: fix the maximum requestor line
  dmaengine: hsu: correct use of channel status register
  dmaengine: dw: fix master selection
  debugfs: Make automount point inodes permanently empty
  lib: lz4: fixed zram with lz4 on big endian machines
  dm cache metadata: fix cmd_read_lock() acquiring write lock
  dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros
  usb: gadget: f_fs: Fix use-after-free
  usb: hcd: out of bounds access in for_each_companion
  xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers
  usb: xhci: fix wild pointers in xhci_mem_cleanup
  xhci: resume USB 3 roothub first
  usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host
  assoc_array: don't call compare_object() on a node
  ARM: OMAP2+: hwmod: Fix updating of sysconfig register
  ARM: OMAP2: Fix up interconnect barrier initialization for DRA7
  ARM: mvebu: Correct unit address for linksys
  ARM: dts: AM43x-epos: Fix clk parent for synctimer
  KVM: arm/arm64: Handle forward time correction gracefully
  kvm: x86: do not leak guest xcr0 into host interrupt handlers
  x86/mce: Avoid using object after free in genpool
  block: loop: fix filesystem corruption in case of aio/dio
  block: partition: initialize percpuref before sending out KOBJ_ADD

Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/cputype.h
	arch/arm64/include/asm/hardirq.h
	arch/arm64/include/asm/irq.h
	arch/arm64/kernel/cpu_errata.c
	arch/arm64/kernel/cpuinfo.c
	arch/arm64/kernel/setup.c
	arch/arm64/kernel/smp.c
	arch/arm64/kernel/stacktrace.c
	arch/arm64/mm/init.c
	arch/arm64/mm/mmu.c
	arch/arm64/mm/pageattr.c
	mm/memcontrol.c

CRs-Fixed: 1054234
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Change-Id: I2a7a34631ffee36ce18b9171f16d023be777392f
2016-08-18 14:50:45 -07:00
Richard Weinberger
4b1cb3c89e mm: Export migrate_page_move_mapping and migrate_page_copy
commit 1118dce773d84f39ebd51a9fe7261f9169cb056e upstream.

Export these symbols such that UBIFS can implement
->migratepage.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-27 09:47:31 -07:00
Runmin Wang
750075feff Merge remote-tracking branch 'origin/tmp-917a9a9133a6' into lsk
* tmp-917a9:
  ARM/vdso: Mark the vDSO code read-only after init
  x86/vdso: Mark the vDSO code read-only after init
  lkdtm: Verify that '__ro_after_init' works correctly
  arch: Introduce post-init read-only memory
  x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option
  mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings
  asm-generic: Consolidate mark_rodata_ro()
  Linux 4.4.6
  ld-version: Fix awk regex compile failure
  target: Drop incorrect ABORT_TASK put for completed commands
  block: don't optimize for non-cloned bio in bio_get_last_bvec()
  MIPS: smp.c: Fix uninitialised temp_foreign_map
  MIPS: Fix build error when SMP is used without GIC
  ovl: fix getcwd() failure after unsuccessful rmdir
  ovl: copy new uid/gid into overlayfs runtime inode
  userfaultfd: don't block on the last VM updates at exit time
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/powernv: Add a kmsg_dumper that flushes console output on panic
  powerpc: Fix dedotify for binutils >= 2.26
  Revert "drm/radeon/pm: adjust display configuration after powerstate"
  drm/radeon: Fix error handling in radeon_flip_work_func.
  drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
  Revert "drm/radeon: call hpd_irq_event on resume"
  x86/mm: Fix slow_virt_to_phys() for X86_PAE again
  gpu: ipu-v3: Do not bail out on missing optional port nodes
  mac80211: Fix Public Action frame RX in AP mode
  mac80211: check PN correctly for GCMP-encrypted fragmented MPDUs
  mac80211: minstrel_ht: fix a logic error in RTS/CTS handling
  mac80211: minstrel_ht: set default tx aggregation timeout to 0
  mac80211: fix use of uninitialised values in RX aggregation
  mac80211: minstrel: Change expected throughput unit back to Kbps
  iwlwifi: mvm: inc pending frames counter also when txing non-sta
  can: gs_usb: fixed disconnect bug by removing erroneous use of kfree()
  cfg80211/wext: fix message ordering
  wext: fix message delay/ordering
  ovl: fix working on distributed fs as lower layer
  ovl: ignore lower entries when checking purity of non-directory entries
  ASoC: wm8958: Fix enum ctl accesses in a wrong type
  ASoC: wm8994: Fix enum ctl accesses in a wrong type
  ASoC: samsung: Use IRQ safe spin lock calls
  ASoC: dapm: Fix ctl value accesses in a wrong type
  ncpfs: fix a braino in OOM handling in ncp_fill_cache()
  jffs2: reduce the breakage on recovery from halfway failed rename()
  dmaengine: at_xdmac: fix residue computation
  tracing: Fix check for cpu online when event is disabled
  s390/dasd: fix diag 0x250 inline assembly
  s390/mm: four page table levels vs. fork
  KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
  KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
  KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
  KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
  KVM: VMX: disable PEBS before a guest entry
  kvm: cap halt polling at exactly halt_poll_ns
  PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr()
  ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
  ARM: dts: dra7: do not gate cpsw clock due to errata i877
  ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
  arm64: account for sparsemem section alignment when choosing vmemmap offset
  Linux 4.4.5
  drm/amdgpu: fix topaz/tonga gmc assignment in 4.4 stable
  modules: fix longstanding /proc/kallsyms vs module insertion race.
  drm/i915: refine qemu south bridge detection
  drm/i915: more virtual south bridge detection
  block: get the 1st and last bvec via helpers
  block: check virt boundary in bio_will_gap()
  drm/amdgpu: Use drm_calloc_large for VM page_tables array
  thermal: cpu_cooling: fix out of bounds access in time_in_idle
  i2c: brcmstb: allocate correct amount of memory for regmap
  ubi: Fix out of bounds write in volume update code
  cxl: Fix PSL timebase synchronization detection
  MIPS: traps: Fix SIGFPE information leak from `do_ov' and `do_trap_or_bp'
  MIPS: scache: Fix scache init with invalid line size.
  USB: serial: option: add support for Quectel UC20
  USB: serial: option: add support for Telit LE922 PID 0x1045
  USB: qcserial: add Sierra Wireless EM74xx device ID
  USB: qcserial: add Dell Wireless 5809e Gobi 4G HSPA+ (rev3)
  USB: cp210x: Add ID for Parrot NMEA GPS Flight Recorder
  usb: chipidea: otg: change workqueue ci_otg as freezable
  ALSA: timer: Fix broken compat timer user status ioctl
  ALSA: hdspm: Fix zero-division
  ALSA: hdsp: Fix wrong boolean ctl value accesses
  ALSA: hdspm: Fix wrong boolean ctl value accesses
  ALSA: seq: oss: Don't drain at closing a client
  ALSA: pcm: Fix ioctls for X32 ABI
  ALSA: timer: Fix ioctls for X32 ABI
  ALSA: rawmidi: Fix ioctls X32 ABI
  ALSA: hda - Fix mic issues on Acer Aspire E1-472
  ALSA: ctl: Fix ioctls for X32 ABI
  ALSA: usb-audio: Add a quirk for Plantronics DA45
  adv7604: fix tx 5v detect regression
  dmaengine: pxa_dma: fix cyclic transfers
  Fix directory hardlinks from deleted directories
  jffs2: Fix page lock / f->sem deadlock
  Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"
  Btrfs: fix loading of orphan roots leading to BUG_ON
  pata-rb532-cf: get rid of the irq_to_gpio() call
  tracing: Do not have 'comm' filter override event 'comm' field
  ata: ahci: don't mark HotPlugCapable Ports as external/removable
  PM / sleep / x86: Fix crash on graph trace through x86 suspend
  arm64: vmemmap: use virtual projection of linear region
  Adding Intel Lewisburg device IDs for SATA
  writeback: flush inode cgroup wb switches instead of pinning super_block
  block: bio: introduce helpers to get the 1st and last bvec
  libata: Align ata_device's id on a cacheline
  libata: fix HDIO_GET_32BIT ioctl
  drm/amdgpu: return from atombios_dp_get_dpcd only when error
  drm/amdgpu/gfx8: specify which engine to wait before vm flush
  drm/amdgpu: apply gfx_v8 fixes to gfx_v7 as well
  drm/amdgpu/pm: update current crtc info after setting the powerstate
  drm/radeon/pm: update current crtc info after setting the powerstate
  drm/ast: Fix incorrect register check for DRAM width
  target: Fix WRITE_SAME/DISCARD conversion to linux 512b sectors
  iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path
  iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered
  iommu/amd: Apply workaround for ATS write permission check
  arm/arm64: KVM: Fix ioctl error handling
  KVM: x86: fix root cause for missed hardware breakpoints
  vfio: fix ioctl error handling
  Fix cifs_uniqueid_to_ino_t() function for s390x
  CIFS: Fix SMB2+ interim response processing for read requests
  cifs: fix out-of-bounds access in lease parsing
  fbcon: set a default value to blink interval
  kvm: x86: Update tsc multiplier on change.
  mips/kvm: fix ioctl error handling
  parisc: Fix ptrace syscall number and return value modification
  PCI: keystone: Fix MSI code that retrieves struct pcie_port pointer
  block: Initialize max_dev_sectors to 0
  drm/amdgpu: mask out WC from BO on unsupported arches
  btrfs: async-thread: Fix a use-after-free error for trace
  btrfs: Fix no_space in write and rm loop
  Btrfs: fix deadlock running delayed iputs at transaction commit time
  drivers: sh: Restore legacy clock domain on SuperH platforms
  use ->d_seq to get coherency between ->d_inode and ->d_flags
  Linux 4.4.4
  iwlwifi: mvm: don't allow sched scans without matches to be started
  iwlwifi: update and fix 7265 series PCI IDs
  iwlwifi: pcie: properly configure the debug buffer size for 8000
  iwlwifi: dvm: fix WoWLAN
  security: let security modules use PTRACE_MODE_* with bitmasks
  IB/cma: Fix RDMA port validation for iWarp
  x86/irq: Plug vector cleanup race
  x86/irq: Call irq_force_move_complete with irq descriptor
  x86/irq: Remove outgoing CPU from vector cleanup mask
  x86/irq: Remove the cpumask allocation from send_cleanup_vector()
  x86/irq: Clear move_in_progress before sending cleanup IPI
  x86/irq: Remove offline cpus from vector cleanup
  x86/irq: Get rid of code duplication
  x86/irq: Copy vectormask instead of an AND operation
  x86/irq: Check vector allocation early
  x86/irq: Reorganize the search in assign_irq_vector
  x86/irq: Reorganize the return path in assign_irq_vector
  x86/irq: Do not use apic_chip_data.old_domain as temporary buffer
  x86/irq: Validate that irq descriptor is still active
  x86/irq: Fix a race in x86_vector_free_irqs()
  x86/irq: Call chip->irq_set_affinity in proper context
  x86/entry/compat: Add missing CLAC to entry_INT80_32
  x86/mpx: Fix off-by-one comparison with nr_registers
  hpfs: don't truncate the file when delete fails
  do_last(): ELOOP failure exit should be done after leaving RCU mode
  should_follow_link(): validate ->d_seq after having decided to follow
  xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted.
  xen/pciback: Save the number of MSI-X entries to be copied later.
  xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY
  xen/scsiback: correct frontend counting
  xen/arm: correctly handle DMA mapping of compound pages
  ARM: at91/dt: fix typo in sama5d2 pinmux descriptions
  ARM: OMAP2+: Fix onenand initialization to avoid filesystem corruption
  do_last(): don't let a bogus return value from ->open() et.al. to confuse us
  kernel/resource.c: fix muxed resource handling in __request_region()
  sunrpc/cache: fix off-by-one in qword_get()
  tracing: Fix showing function event in available_events
  powerpc/eeh: Fix partial hotplug criterion
  KVM: x86: MMU: fix ubsan index-out-of-range warning
  KVM: x86: fix conversion of addresses to linear in 32-bit protected mode
  KVM: x86: fix missed hardware breakpoints
  KVM: arm/arm64: vgic: Ensure bitmaps are long enough
  KVM: async_pf: do not warn on page allocation failures
  of/irq: Fix msi-map calculation for nonzero rid-base
  NFSv4: Fix a dentry leak on alias use
  nfs: fix nfs_size_to_loff_t
  block: fix use-after-free in dio_bio_complete
  bio: return EINTR if copying to user space got interrupted
  i2c: i801: Adding Intel Lewisburg support for iTCO
  phy: core: fix wrong err handle for phy_power_on
  writeback: keep superblock pinned during cgroup writeback association switches
  cgroup: make sure a parent css isn't offlined before its children
  cpuset: make mm migration asynchronous
  PCI/AER: Flush workqueue on device remove to avoid use-after-free
  ARCv2: SMP: Emulate IPI to self using software triggered interrupt
  ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2
  libata: fix sff host state machine locking while polling
  qla2xxx: Fix stale pointer access.
  spi: atmel: fix gpio chip-select in case of non-DT platform
  target: Fix race with SCF_SEND_DELAYED_TAS handling
  target: Fix remote-port TMR ABORT + se_cmd fabric stop
  target: Fix TAS handling for multi-session se_node_acls
  target: Fix LUN_RESET active TMR descriptor handling
  target: Fix LUN_RESET active I/O handling for ACK_KREF
  ALSA: hda - Fixing background noise on Dell Inspiron 3162
  ALSA: hda - Apply clock gate workaround to Skylake, too
  Revert "workqueue: make sure delayed work run in local cpu"
  workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
  mac80211: Requeue work after scan complete for all VIF types.
  rfkill: fix rfkill_fop_read wait_event usage
  tick/nohz: Set the correct expiry when switching to nohz/lowres mode
  perf stat: Do not clean event's private stats
  cdc-acm:exclude Samsung phone 04e8:685d
  Revert "Staging: panel: usleep_range is preferred over udelay"
  Staging: speakup: Fix getting port information
  sd: Optimal I/O size is in bytes, not sectors
  libceph: don't spam dmesg with stray reply warnings
  libceph: use the right footer size when skipping a message
  libceph: don't bail early from try_read() when skipping a message
  libceph: fix ceph_msg_revoke()
  seccomp: always propagate NO_NEW_PRIVS on tsync
  cpufreq: Fix NULL reference crash while accessing policy->governor_data
  cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype
  hwmon: (ads1015) Handle negative conversion values correctly
  hwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook
  hwmon: (dell-smm) Blacklist Dell Studio XPS 8000
  Thermal: do thermal zone update after a cooling device registered
  Thermal: handle thermal zone device properly during system sleep
  Thermal: initialize thermal zone device correctly
  IB/mlx5: Expose correct maximum number of CQE capacity
  IB/qib: Support creating qps with GFP_NOIO flag
  IB/qib: fix mcast detach when qp not attached
  IB/cm: Fix a recently introduced deadlock
  dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer
  dmaengine: at_xdmac: fix resume for cyclic transfers
  dmaengine: dw: fix cyclic transfer callbacks
  dmaengine: dw: fix cyclic transfer setup
  nfit: fix multi-interface dimm handling, acpi6.1 compatibility
  ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot()
  ACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist"
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700
  lib: sw842: select crc32
  uapi: update install list after nvme.h rename
  ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list
  ideapad-laptop: Add Lenovo ideapad Y700-17ISK to no_hw_rfkill dmi list
  toshiba_acpi: Fix blank screen at boot if transflective backlight is supported
  make sure that freeing shmem fast symlinks is RCU-delayed
  drm/radeon/pm: adjust display configuration after powerstate
  drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)
  drm: Fix treatment of drm_vblank_offdelay in drm_vblank_on() (v2)
  drm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4
  drm: Prevent vblank counter bumps > 1 with active vblank clients. (v2)
  drm: No-Op redundant calls to drm_vblank_off() (v2)
  drm/radeon: use post-decrement in error handling
  drm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command
  drm/i915: fix error path in intel_setup_gmbus()
  drm/i915/dsi: don't pass arbitrary data to sideband
  drm/i915/dsi: defend gpio table against out of bounds access
  drm/i915/skl: Don't skip mst encoders in skl_ddi_pll_select()
  drm/i915: Don't reject primary plane windowing with color keying enabled on SKL+
  drm/i915/dp: fall back to 18 bpp when sink capability is unknown
  drm/i915: Make sure DC writes are coherent on flush.
  drm/i915: Init power domains early in driver load
  drm/i915: intel_hpd_init(): Fix suspend/resume reprobing
  drm/i915: Restore inhibiting the load of the default context
  drm: fix missing reference counting decrease
  drm/radeon: hold reference to fences in radeon_sa_bo_new
  drm/radeon: mask out WC from BO on unsupported arches
  drm: add helper to check for wc memory support
  drm/radeon: fix DP audio support for APU with DCE4.1 display engine
  drm/radeon: Add a common function for DFS handling
  drm/radeon: cleaned up VCO output settings for DP audio
  drm/radeon: properly byte swap vce firmware setup
  drm/radeon: clean up fujitsu quirks
  drm/radeon: Fix "slow" audio over DP on DCE8+
  drm/radeon: call hpd_irq_event on resume
  drm/radeon: Fix off-by-one errors in radeon_vm_bo_set_addr
  drm/dp/mst: deallocate payload on port destruction
  drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
  drm/dp/mst: move GUID storage from mgr, port to only mst branch
  drm/dp/mst: Calculate MST PBN with 31.32 fixed point
  drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
  drm/dp/mst: fix in RAD element access
  drm/dp/mst: fix in MSTB RAD initialization
  drm/dp/mst: always send reply for UP request
  drm/dp/mst: process broadcast messages correctly
  drm/nouveau: platform: Fix deferred probe
  drm/nouveau/disp/dp: ensure sink is powered up before attempting link training
  drm/nouveau/display: Enable vblank irqs after display engine is on again.
  drm/nouveau/kms: take mode_config mutex in connector hotplug path
  drm/amdgpu/pm: adjust display configuration after powerstate
  drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.
  drm/amdgpu: use post-decrement in error handling
  drm/amdgpu: fix issue with overlapping userptrs
  drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)
  drm/amdgpu: remove unnecessary forward declaration
  drm/amdgpu: fix s4 resume
  drm/amdgpu: remove exp hardware support from iceland
  drm/amdgpu: don't load MEC2 on topaz
  drm/amdgpu: drop topaz support from gmc8 module
  drm/amdgpu: pull topaz gmc bits into gmc_v7
  drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
  drm/amdgpu: iceland use CI based MC IP
  drm/amdgpu: move gmc7 support out of CIK dependency
  drm/amdgpu: no need to load MC firmware on fiji
  drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2
  drm/amdgpu: fix tonga smu resume
  drm/amdgpu: fix lost sync_to if scheduler is enabled.
  drm/amdgpu: call hpd_irq_event on resume
  drm/amdgpu: Fix off-by-one errors in amdgpu_vm_bo_map
  drm/vmwgfx: respect 'nomodeset'
  drm/vmwgfx: Fix a width / pitch mismatch on framebuffer updates
  drm/vmwgfx: Fix an incorrect lock check
  virtio_pci: fix use after free on release
  virtio_balloon: fix race between migration and ballooning
  virtio_balloon: fix race by fill and leak
  regulator: mt6311: MT6311_REGULATOR needs to select REGMAP_I2C
  regulator: axp20x: Fix GPIO LDO enable value for AXP22x
  clk: exynos: use irqsave version of spin_lock to avoid deadlock with irqs
  cxl: use correct operator when writing pcie config space values
  sparc64: fix incorrect sign extension in sys_sparc64_personality
  EDAC, mc_sysfs: Fix freeing bus' name
  EDAC: Robustify workqueues destruction
  MIPS: Fix buffer overflow in syscall_get_arguments()
  MIPS: Fix some missing CONFIG_CPU_MIPSR6 #ifdefs
  MIPS: hpet: Choose a safe value for the ETIME check
  MIPS: Loongson-3: Fix SMP_ASK_C0COUNT IPI handler
  Revert "MIPS: Fix PAGE_MASK definition"
  cputime: Prevent 32bit overflow in time[val|spec]_to_cputime()
  time: Avoid signed overflow in timekeeping_get_ns()
  Bluetooth: 6lowpan: Fix handling of uncompressed IPv6 packets
  Bluetooth: 6lowpan: Fix kernel NULL pointer dereferences
  Bluetooth: Fix incorrect removing of IRKs
  Bluetooth: Add support of Toshiba Broadcom based devices
  Bluetooth: Use continuous scanning when creating LE connections
  Drivers: hv: vmbus: Fix a Host signaling bug
  tools: hv: vss: fix the write()'s argument: error -> vss_msg
  mmc: sdhci: Allow override of get_cd() called from sdhci_request()
  mmc: sdhci: Allow override of mmc host operations
  mmc: sdhci-pci: Fix card detect race for Intel BXT/APL
  mmc: pxamci: fix again read-only gpio detection polarity
  mmc: sdhci-acpi: Fix card detect race for Intel BXT/APL
  mmc: mmci: fix an ages old detection error
  mmc: core: Enable tuning according to the actual timing
  mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off()
  mmc: mmc: Fix incorrect use of driver strength switching HS200 and HS400
  mmc: sdio: Fix invalid vdd in voltage switch power cycle
  mmc: sdhci: Fix DMA descriptor with zero data length
  mmc: sdhci-pci: Do not default to 33 Ohm driver strength for Intel SPT
  mmc: usdhi6rol0: handle NULL data in timeout
  clockevents/tcb_clksrc: Prevent disabling an already disabled clock
  posix-clock: Fix return code on the poll method's error path
  irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1
  irqchip/atmel-aic: Fix wrong bit operation for IRQ priority
  irqchip/mxs: Add missing set_handle_irq()
  irqchip/omap-intc: Add support for spurious irq handling
  coresight: checking for NULL string in coresight_name_match()
  dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths
  dm snapshot: fix hung bios when copy error occurs
  dm space map metadata: remove unused variable in brb_pop()
  tda1004x: only update the frontend properties if locked
  vb2: fix a regression in poll() behavior for output,streams
  gspca: ov534/topro: prevent a division by 0
  si2157: return -EINVAL if firmware blob is too big
  media: dvb-core: Don't force CAN_INVERSION_AUTO in oneshot mode
  rc: sunxi-cir: Initialize the spinlock properly
  namei: ->d_inode of a pinned dentry is stable only for positives
  mei: validate request value in client notify request ioctl
  mei: fix fasync return value on error
  rtlwifi: rtl8723be: Fix module parameter initialization
  rtlwifi: rtl8188ee: Fix module parameter initialization
  rtlwifi: rtl8192se: Fix module parameter initialization
  rtlwifi: rtl8723ae: Fix initialization of module parameters
  rtlwifi: rtl8192de: Fix incorrect module parameter descriptions
  rtlwifi: rtl8192ce: Fix handling of module parameters
  rtlwifi: rtl8192cu: Add missing parameter setup
  rtlwifi: rtl_pci: Fix kernel panic
  locks: fix unlock when fcntl_setlk races with a close
  um: link with -lpthread
  uml: fix hostfs mknod()
  uml: flush stdout before forking
  s390/fpu: signals vs. floating point control register
  s390/compat: correct restore of high gprs on signal return
  s390/dasd: fix performance drop
  s390/dasd: fix refcount for PAV reassignment
  s390/dasd: prevent incorrect length error under z/VM after PAV changes
  s390: fix normalization bug in exception table sorting
  btrfs: initialize the seq counter in struct btrfs_device
  Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots
  Btrfs: fix transaction handle leak on failure to create hard link
  Btrfs: fix number of transaction units required to create symlink
  Btrfs: send, don't BUG_ON() when an empty symlink is found
  btrfs: statfs: report zero available if metadata are exhausted
  Btrfs: igrab inode in writepage
  Btrfs: add missing brelse when superblock checksum fails
  KVM: s390: fix memory overwrites when vx is disabled
  s390/kvm: remove dependency on struct save_area definition
  clocksource/drivers/vt8500: Increase the minimum delta
  genirq: Validate action before dereferencing it in handle_irq_event_percpu()
  mm: numa: quickly fail allocations for NUMA balancing on full nodes
  mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
  ocfs2: unlock inode if deleting inode from orphan fails
  drm/i915: shut up gen8+ SDE irq dmesg noise
  iw_cxgb3: Fix incorrectly returning error on success
  spi: omap2-mcspi: Prevent duplicate gpio_request
  drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE
  USB: option: add "4G LTE usb-modem U901"
  USB: option: add support for SIM7100E
  USB: cp210x: add IDs for GE B650V3 and B850V3 boards
  usb: dwc3: Fix assignment of EP transfer resources
  can: ems_usb: Fix possible tx overflow
  dm thin: fix race condition when destroying thin pool workqueue
  bcache: Change refill_dirty() to always scan entire disk if necessary
  bcache: prevent crash on changing writeback_running
  bcache: allows use of register in udev to avoid "device_busy" error.
  bcache: unregister reboot notifier if bcache fails to unregister device
  bcache: fix a leak in bch_cached_dev_run()
  bcache: clear BCACHE_DEV_UNLINK_DONE flag when attaching a backing device
  bcache: Add a cond_resched() call to gc
  bcache: fix a livelock when we cause a huge number of cache misses
  lib/ucs2_string: Correct ucs2 -> utf8 conversion
  efi: Add pstore variables to the deletion whitelist
  efi: Make efivarfs entries immutable by default
  efi: Make our variable validation list include the guid
  efi: Do variable name validation tests in utf8
  efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
  lib/ucs2_string: Add ucs2 -> utf8 helper functions
  ARM: 8457/1: psci-smp is built only for SMP
  drm/gma500: Use correct unref in the gem bo create function
  devm_memremap: Fix error value when memremap failed
  KVM: s390: fix guest fprs memory leak
  arm64: errata: Add -mpc-relative-literal-loads to build flags
  ARM: debug-ll: fix BCM63xx entry for multiplatform
  ext4: fix bh->b_state corruption
  sctp: Fix port hash table size computation
  unix_diag: fix incorrect sign extension in unix_lookup_by_ino
  tipc: unlock in error path
  rtnl: RTM_GETNETCONF: fix wrong return value
  IFF_NO_QUEUE: Fix for drivers not calling ether_setup()
  tcp/dccp: fix another race at listener dismantle
  route: check and remove route cache when we get route
  net_sched fix: reclassification needs to consider ether protocol changes
  pppoe: fix reference counting in PPPoE proxy
  l2tp: Fix error creating L2TP tunnels
  net/mlx4_en: Avoid changing dev->features directly in run-time
  net/mlx4_en: Choose time-stamping shift value according to HW frequency
  net/mlx4_en: Count HW buffer overrun only once
  qmi_wwan: add "4G LTE usb-modem U901"
  tcp: md5: release request socket instead of listener
  tipc: fix premature addition of node to lookup table
  af_unix: Guard against other == sk in unix_dgram_sendmsg
  af_unix: Don't set err in unix_stream_read_generic unless there was an error
  ipv4: fix memory leaks in ip_cmsg_send() callers
  bonding: Fix ARP monitor validation
  bpf: fix branch offset adjustment on backjumps after patching ctx expansion
  flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen
  net: Copy inner L3 and L4 headers as unaligned on GRE TEB
  sctp: translate network order to host order when users get a hmacid
  enic: increment devcmd2 result ring in case of timeout
  tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs
  net:Add sysctl_max_skb_frags
  tcp: do not drop syn_recv on all icmp reports
  unix: correctly track in-flight fds in sending process user_struct
  ipv6: fix a lockdep splat
  ipv6: addrconf: Fix recursive spin lock call
  ipv6/udp: use sticky pktinfo egress ifindex on connect()
  ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()
  tcp: beware of alignments in tcp_get_info()
  switchdev: Require RTNL mutex to be held when sending FDB notifications
  inet: frag: Always orphan skbs inside ip_defrag()
  tipc: fix connection abort during subscription cancel
  net: dsa: fix mv88e6xxx switches
  sctp: allow setting SCTP_SACK_IMMEDIATELY by the application
  pptp: fix illegal memory access caused by multiple bind()s
  af_unix: fix struct pid memory leak
  tcp: fix NULL deref in tcp_v4_send_ack()
  lwt: fix rx checksum setting for lwt devices tunneling over ipv6
  tunnels: Allow IPv6 UDP checksums to be correctly controlled.
  net: dp83640: Fix tx timestamp overflow handling.
  gro: Make GRO aware of lightweight tunnels.
  af_iucv: Validate socket address length in iucv_sock_bind()

Conflicts:
	arch/arm64/Makefile
	arch/arm64/include/asm/cacheflush.h
	drivers/mmc/host/sdhci.c
	drivers/usb/dwc3/ep0.c
	drivers/usb/dwc3/gadget.c
	kernel/module.c
	sound/core/pcm_compat.c

CRs-Fixed: 1010239
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
Change-Id: I41a28636fc9ad91f9d979b191784609476294cdf
2016-07-12 11:40:49 -07:00
Minchan Kim
06de050ac6 mm: Enhance per process reclaim to consider shared pages
Some pages could be shared by several processes. (ex, libc)
In case of that, it's too bad to reclaim them from the beginnig.

This patch causes VM to keep them on memory until last task
try to reclaim them so shared pages will be reclaimed only if
all of task has gone swapping out.

This feature doesn't handle non-linear mapping on ramfs because
it's very time-consuming and doesn't make sure of reclaiming and
not common.

Change-Id: I7e5f34f2e947f5db6d405867fe2ad34863ca40f7
Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:27
[vinmenon@codeaurora.org: trivial merge conflict fixes + changes
to make the patch work with 3.18 kernel]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-22 14:43:57 -07:00
Minchan Kim
36abe7272a mm/hwpoison: fix wrong num_poisoned_pages accounting
commit d7e69488bd04de165667f6bc741c1c0ec6042ab9 upstream.

Currently, migration code increses num_poisoned_pages on *failed*
migration page as well as successfully migrated one at the trial of
memory-failure.  It will make the stat wrong.  As well, it marks the
page as PG_HWPoison even if the migration trial failed.  It would mean
we cannot recover the corrupted page using memory-failure facility.

This patches fixes it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:49 -07:00
Liam Mark
5d871bae1b mm: cma: add trace events for CMA alloc perf testing
Add cma and migrate trace events to enable CMA allocation
performance to be measured via ftrace.

Change-Id: I1e471e9e21f1a14ce2ed167d8515ccb5f83eb88c
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-03-23 21:14:00 -07:00
Mel Gorman
18b75e0bdc mm: numa: quickly fail allocations for NUMA balancing on full nodes
commit 8479eba7781fa9ffb28268840de6facfc12c35a7 upstream.

Commit 4167e9b2cf ("mm: remove GFP_THISNODE") removed the GFP_THISNODE
flag combination due to confusing semantics.  It noted that
alloc_misplaced_dst_page() was one such user after changes made by
commit e97ca8e5b8 ("mm: fix GFP_THISNODE callers and clarify").

Unfortunately when GFP_THISNODE was removed, users of
alloc_misplaced_dst_page() started waking kswapd and entering direct
reclaim because the wrong GFP flags are cleared.  The consequence is
that workloads that used to fit into memory now get reclaimed which is
addressed by this patch.

The problem can be demonstrated with "mutilate" that exercises memcached
which is software dedicated to memory object caching.  The configuration
uses 80% of memory and is run 3 times for varying numbers of clients.
The results on a 4-socket NUMA box are

mutilate
                            4.4.0                 4.4.0
                          vanilla           numaswap-v1
Hmean    1      8394.71 (  0.00%)     8395.32 (  0.01%)
Hmean    4     30024.62 (  0.00%)    34513.54 ( 14.95%)
Hmean    7     32821.08 (  0.00%)    70542.96 (114.93%)
Hmean    12    55229.67 (  0.00%)    93866.34 ( 69.96%)
Hmean    21    39438.96 (  0.00%)    85749.21 (117.42%)
Hmean    30    37796.10 (  0.00%)    50231.49 ( 32.90%)
Hmean    47    18070.91 (  0.00%)    38530.13 (113.22%)

The metric is queries/second with the more the better.  The results are
way outside of the noise and the reason for the improvement is obvious
from some of the vmstats

                                 4.4.0       4.4.0
                               vanillanumaswap-v1r1
Minor Faults                1929399272  2146148218
Major Faults                  19746529        3567
Swap Ins                      57307366        9913
Swap Outs                     50623229       17094
Allocation stalls                35909         443
DMA allocs                           0           0
DMA32 allocs                  72976349   170567396
Normal allocs               5306640898  5310651252
Movable allocs                       0           0
Direct pages scanned         404130893      799577
Kswapd pages scanned         160230174           0
Kswapd pages reclaimed        55928786           0
Direct pages reclaimed         1843936       41921
Page writes file                  2391           0
Page writes anon              50623229       17094

The vanilla kernel is swapping like crazy with large amounts of direct
reclaim and kswapd activity.  The figures are aggregate but it's known
that the bad activity is throughout the entire test.

Note that simple streaming anon/file memory consumers also see this
problem but it's not as obvious.  In those cases, kswapd is awake when
it should not be.

As there are at least two reclaim-related bugs out there, it's worth
spelling out the user-visible impact.  This patch only addresses bugs
related to excessive reclaim on NUMA hardware when the working set is
larger than a NUMA node.  There is a bug related to high kswapd CPU
usage but the reports are against laptops and other UMA hardware and is
not addressed by this patch.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-03 15:07:11 -08:00
Mel Gorman
71baba4b92 mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM
__GFP_WAIT was used to signal that the caller was in atomic context and
could not sleep.  Now it is possible to distinguish between true atomic
context and callers that are not willing to sleep.  The latter should
clear __GFP_DIRECT_RECLAIM so kswapd will still wake.  As clearing
__GFP_WAIT behaves differently, there is a risk that people will clear the
wrong flags.  This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
indicate what it does -- setting it allows all reclaim activity, clearing
them prevents it.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Mel Gorman
d0164adc89 mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Hugh Dickins
42cb14b110 mm: migrate dirty page without clear_page_dirty_for_io etc
clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty, later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.

No actual problems seen with this procedure recently, but if you look into
what the clear_page_dirty_for_io(page)+set_page_dirty(newpage) is actually
achieving, it turns out to be nothing more than moving the PageDirty flag,
and its NR_FILE_DIRTY stat from one zone to another.

It would be good to avoid a pile of irrelevant decrementations and
incrementations, and improper event counting, and unnecessary descent of
the radix_tree under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).

Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts still disabled in migrate_page_move_mapping(); and don't even
bother if the zone is the same.  Do the PageDirty movement there under
tree_lock too, where old page is frozen and newpage not yet visible:
bearing in mind that as soon as newpage becomes visible in radix_tree, an
un-page-locked set_page_dirty() might interfere (or perhaps that's just
not possible: anything doing so should already hold an additional
reference to the old page, preventing its migration; but play safe).

But we do still need to transfer PageDirty in migrate_page_copy(), for
those who don't go the mapping route through migrate_page_move_mapping().

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
cf4b769abb mm: page migration avoid touching newpage until no going back
We have had trouble in the past from the way in which page migration's
newpage is initialized in dribs and drabs - see commit 8bdd638091 ("mm:
fix direct reclaim writeback regression") which proposed a cleanup.

We have no actual problem now, but I think the procedure would be clearer
(and alternative get_new_page pools safer to implement) if we assert that
newpage is not touched until we are sure that it's going to be used -
except for taking the trylock on it in __unmap_and_move().

So shift the early initializations from move_to_new_page() into
migrate_page_move_mapping(), mapping and NULL-mapping paths.  Similarly
migrate_huge_page_move_mapping(), but its NULL-mapping path can just be
deleted: you cannot reach hugetlbfs_migrate_page() with a NULL mapping.

Adjust stages 3 to 8 in the Documentation file accordingly.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
03f15c86c8 mm: simplify page migration's anon_vma comment and flow
__unmap_and_move() contains a long stale comment on page_get_anon_vma()
and PageSwapCache(), with an odd control flow that's hard to follow.
Mostly this reflects our confusion about the lifetime of an anon_vma, in
the early days of page migration, before we could take a reference to one.
 Nowadays this seems quite straightforward: cut it all down to essentials.

I cannot see the relevance of swapcache here at all, so don't treat it any
differently: I believe the old comment reflects in part our anon_vma
confusions, and in part the original v2.6.16 page migration technique,
which used actual swap to migrate anon instead of swap-like migration
entries.  Why should a swapcache page not be migrated with the aid of
migration entry ptes like everything else?  So lose that comment now, and
enable migration entries for swapcache in the next patch.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
5c3f9a6737 mm: page migration remove_migration_ptes at lock+unlock level
Clean up page migration a little more by calling remove_migration_ptes()
from the same level, on success or on failure, from __unmap_and_move() or
from unmap_and_move_huge_page().

Don't reset page->mapping of a PageAnon old page in move_to_new_page(),
leave that to when the page is freed.  Except for here in page migration,
it has been an invariant that a PageAnon (bit set in page->mapping) page
stays PageAnon until it is freed, and I think we're safer to keep to that.

And with the above rearrangement, it's necessary because zap_pte_range()
wants to identify whether a migration entry represents a file or an anon
page, to update the appropriate rss stats without waiting on it.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
7db7671f83 mm: page migration trylock newpage at same level as oldpage
Clean up page migration a little by moving the trylock of newpage from
move_to_new_page() into __unmap_and_move(), where the old page has been
locked.  Adjust unmap_and_move_huge_page() and balloon_page_migrate()
accordingly.

But make one kind-of-functional change on the way: whereas trylock of
newpage used to BUG() if it failed, now simply return -EAGAIN if so.
Cutting out BUG()s is good, right?  But, to be honest, this is really to
extend the usefulness of the custom put_new_page feature, allowing a pool
of new pages to be shared perhaps with racing uses.

Use an "else" instead of that "skip_unmap" label.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
2def7424c9 mm: page migration use the put_new_page whenever necessary
I don't know of any problem from the way it's used in our current tree,
but there is one defect in page migration's custom put_new_page feature.

An unused newpage is expected to be released with the put_new_page(), but
there was one MIGRATEPAGE_SUCCESS (0) path which released it with
putback_lru_page(): which can be very wrong for a custom pool.

Fixed more easily by resetting put_new_page once it won't be needed, than
by adding a further flag to modify the rc test.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
14e0f9bcc9 mm: correct a couple of page migration comments
It's migrate.c not migration,c, and nowadays putback_movable_pages() not
putback_lru_pages().

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
45637bab30 mm: rename mem_cgroup_migrate to mem_cgroup_replace_page
After v4.3's commit 0610c25daa ("memcg: fix dirty page migration")
mem_cgroup_migrate() doesn't have much to offer in page migration: convert
migrate_misplaced_transhuge_page() to set_page_memcg() instead.

Then rename mem_cgroup_migrate() to mem_cgroup_replace_page(), since its
remaining callers are replace_page_cache_page() and shmem_replace_page():
both of whom passed lrucare true, so just eliminate that argument.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Hugh Dickins
51afb12ba8 mm: page migration fix PageMlocked on migrated pages
Commit e6c509f854 ("mm: use clear_page_mlock() in page_remove_rmap()")
in v3.7 inadvertently made mlock_migrate_page() impotent: page migration
unmaps the page from userspace before migrating, and that commit clears
PageMlocked on the final unmap, leaving mlock_migrate_page() with
nothing to do.  Not a serious bug, the next attempt at reclaiming the
page would fix it up; but a betrayal of page migration's intent - the
new page ought to emerge as PageMlocked.

I don't see how to fix it for mlock_migrate_page() itself; but easily
fixed in remove_migration_pte(), by calling mlock_vma_page() when the vma
is VM_LOCKED - under pte lock as in try_to_unmap_one().

Delete mlock_migrate_page()?  Not quite, it does still serve a purpose for
migrate_misplaced_transhuge_page(): where we could replace it by a test,
clear_page_mlock(), mlock_vma_page() sequence; but would that be an
improvement?  mlock_migrate_page() is fairly lean, and let's make it
leaner by skipping the irq save/restore now clearly not needed.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Vlastimil Babka
f2f81fb2b7 mm, migrate: count pages failing all retries in vmstat and tracepoint
Migration tries up to 10 times to migrate pages that return -EAGAIN until
it gives up.  If some pages fail all retries, they are counted towards the
number of failed pages that migrate_pages() returns.  They should also be
counted in the /proc/vmstat pgmigrate_fail and in the mm_migrate_pages
tracepoint.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Greg Thelen
0610c25daa memcg: fix dirty page migration
The problem starts with a file backed dirty page which is charged to a
memcg.  Then page migration is used to move oldpage to newpage.

Migration:
 - copies the oldpage's data to newpage
 - clears oldpage.PG_dirty
 - sets newpage.PG_dirty
 - uncharges oldpage from memcg
 - charges newpage to memcg

Clearing oldpage.PG_dirty decrements the charged memcg's dirty page
count.

However, because newpage is not yet charged, setting newpage.PG_dirty
does not increment the memcg's dirty page count.  After migration
completes newpage.PG_dirty is eventually cleared, often in
account_page_cleaned().  At this time newpage is charged to a memcg so
the memcg's dirty page count is decremented which causes underflow
because the count was not previously incremented by migration.  This
underflow causes balance_dirty_pages() to see a very large unsigned
number of dirty memcg pages which leads to aggressive throttling of
buffered writes by processes in non root memcg.

This issue:
 - can harm performance of non root memcg buffered writes.
 - can report too small (even negative) values in
   memory.stat[(total_)dirty] counters of all memcg, including the root.

To avoid polluting migrate.c with #ifdef CONFIG_MEMCG checks, introduce
page_memcg() and set_page_memcg() helpers.

Test:
    0) setup and enter limited memcg
    mkdir /sys/fs/cgroup/test
    echo 1G > /sys/fs/cgroup/test/memory.limit_in_bytes
    echo $$ > /sys/fs/cgroup/test/cgroup.procs

    1) buffered writes baseline
    dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
    sync
    grep ^dirty /sys/fs/cgroup/test/memory.stat

    2) buffered writes with compaction antagonist to induce migration
    yes 1 > /proc/sys/vm/compact_memory &
    rm -rf /data/tmp/foo
    dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
    kill %
    sync
    grep ^dirty /sys/fs/cgroup/test/memory.stat

    3) buffered writes without antagonist, should match baseline
    rm -rf /data/tmp/foo
    dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
    sync
    grep ^dirty /sys/fs/cgroup/test/memory.stat

                       (speed, dirty residue)
             unpatched                       patched
    1) 841 MB/s 0 dirty pages          886 MB/s 0 dirty pages
    2) 611 MB/s -33427456 dirty pages  793 MB/s 0 dirty pages
    3) 114 MB/s -33427456 dirty pages  891 MB/s 0 dirty pages

    Notice that unpatched baseline performance (1) fell after
    migration (3): 841 -> 114 MB/s.  In the patched kernel, post
    migration performance matches baseline.

Fixes: c4843a7593 ("memcg: add per cgroup dirty page accounting")
Signed-off-by: Greg Thelen <gthelen@google.com>
Reported-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>	[4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-01 21:42:35 -04:00
Naoya Horiguchi
3aaa76e125 mm: migrate: hugetlb: putback destination hugepage to active list
Since commit bcc5422230 ("mm: hugetlb: introduce page_huge_active")
each hugetlb page maintains its active flag to avoid a race condition
betwe= en multiple calls of isolate_huge_page(), but current kernel
doesn't set the f= lag on a hugepage allocated by migration because the
proper putback routine isn= 't called.  This means that users could
still encounter the race referred to by bcc5422230 in this special
case, so this patch fixes it.

Fixes: bcc5422230 ("mm: hugetlb: introduce page_huge_active")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>  [4.1.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-22 15:09:53 -07:00
Vladimir Davydov
33c3fc71c8 mm: introduce idle page tracking
Knowing the portion of memory that is not used by a certain application or
memory cgroup (idle memory) can be useful for partitioning the system
efficiently, e.g.  by setting memory cgroup limits appropriately.
Currently, the only means to estimate the amount of idle memory provided
by the kernel is /proc/PID/{clear_refs,smaps}: the user can clear the
access bit for all pages mapped to a particular process by writing 1 to
clear_refs, wait for some time, and then count smaps:Referenced.  However,
this method has two serious shortcomings:

 - it does not count unmapped file pages
 - it affects the reclaimer logic

To overcome these drawbacks, this patch introduces two new page flags,
Idle and Young, and a new sysfs file, /sys/kernel/mm/page_idle/bitmap.
A page's Idle flag can only be set from userspace by setting bit in
/sys/kernel/mm/page_idle/bitmap at the offset corresponding to the page,
and it is cleared whenever the page is accessed either through page tables
(it is cleared in page_referenced() in this case) or using the read(2)
system call (mark_page_accessed()). Thus by setting the Idle flag for
pages of a particular workload, which can be found e.g.  by reading
/proc/PID/pagemap, waiting for some time to let the workload access its
working set, and then reading the bitmap file, one can estimate the amount
of pages that are not used by the workload.

The Young page flag is used to avoid interference with the memory
reclaimer.  A page's Young flag is set whenever the Access bit of a page
table entry pointing to the page is cleared by writing to the bitmap file.
If page_referenced() is called on a Young page, it will add 1 to its
return value, therefore concealing the fact that the Access bit was
cleared.

Note, since there is no room for extra page flags on 32 bit, this feature
uses extended page flags when compiled on 32 bit.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: kpageidle requires an MMU]
[akpm@linux-foundation.org: decouple from page-flags rework]
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vlastimil Babka
96db800f5d mm: rename alloc_pages_exact_node() to __alloc_pages_node()
alloc_pages_exact_node() was introduced in commit 6484eb3e2a ("page
allocator: do not check NUMA node ID when the caller knows the node is
valid") as an optimized variant of alloc_pages_node(), that doesn't
fallback to current node for nid == NUMA_NO_NODE.  Unfortunately the
name of the function can easily suggest that the allocation is
restricted to the given node and fails otherwise.  In truth, the node is
only preferred, unless __GFP_THISNODE is passed among the gfp flags.

The misleading name has lead to mistakes in the past, see for example
commits 5265047ac3 ("mm, thp: really limit transparent hugepage
allocation to local node") and b360edb43f ("mm, mempolicy:
migrate_to_node should only migrate to node").

Another issue with the name is that there's a family of
alloc_pages_exact*() functions where 'exact' means exact size (instead
of page order), which leads to more confusion.

To prevent further mistakes, this patch effectively renames
alloc_pages_exact_node() to __alloc_pages_node() to better convey that
it's an optimized variant of alloc_pages_node() not intended for general
usage.  Both functions get described in comments.

It has been also considered to really provide a convenience function for
allocations restricted to a node, but the major opinion seems to be that
__GFP_THISNODE already provides that functionality and we shouldn't
duplicate the API needlessly.  The number of users would be small
anyway.

Existing callers of alloc_pages_exact_node() are simply converted to
call __alloc_pages_node(), with the exception of sba_alloc_coherent()
which open-codes the check for NUMA_NO_NODE, so it is converted to use
alloc_pages_node() instead.  This means it no longer performs some
VM_BUG_ON checks, and since the current check for nid in
alloc_pages_node() uses a 'nid < 0' comparison (which includes
NUMA_NO_NODE), it may hide wrong values which would be previously
exposed.

Both differences will be rectified by the next patch.

To sum up, this patch makes no functional changes, except temporarily
hiding potentially buggy callers.  Restricting the checks in
alloc_pages_node() is left for the next patch which can in turn expose
more existing buggy callers.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Robin Holt <robinmholt@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cliff Whickman <cpw@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Wanpeng Li
da1b13ccfb mm/hwpoison: fix race between soft_offline_page and unpoison_memory
Wanpeng Li reported a race between soft_offline_page() and
unpoison_memory(), which causes the following kernel panic:

   BUG: Bad page state in process bash  pfn:97000
   page:ffffea00025c0000 count:0 mapcount:1 mapping:          (null) index:0x7f4fdbe00
   flags: 0x1fffff80080048(uptodate|active|swapbacked)
   page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
   bad because of flags:
   flags: 0x40(active)
   Modules linked in: snd_hda_codec_hdmi i915 rpcsec_gss_krb5 nfsv4 dns_resolver bnep rfcomm nfsd bluetooth auth_rpcgss nfs_acl nfs rfkill lockd grace sunrpc i2c_algo_bit drm_kms_helper snd_hda_codec_realtek snd_hda_codec_generic drm snd_hda_intel fscache snd_hda_codec x86_pkg_temp_thermal coretemp kvm_intel snd_hda_core snd_hwdep kvm snd_pcm snd_seq_dummy snd_seq_oss crct10dif_pclmul snd_seq_midi crc32_pclmul snd_seq_midi_event ghash_clmulni_intel snd_rawmidi aesni_intel lrw gf128mul snd_seq glue_helper ablk_helper snd_seq_device cryptd fuse snd_timer dcdbas serio_raw mei_me parport_pc snd mei ppdev i2c_core video lp soundcore parport lpc_ich shpchp mfd_core ext4 mbcache jbd2 sd_mod e1000e ahci ptp libahci crc32c_intel libata pps_core
   CPU: 3 PID: 2211 Comm: bash Not tainted 4.2.0-rc5-mm1+ #45
   Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
   Call Trace:
     dump_stack+0x48/0x5c
     bad_page+0xe6/0x140
     free_pages_prepare+0x2f9/0x320
     ? uncharge_list+0xdd/0x100
     free_hot_cold_page+0x40/0x170
     __put_single_page+0x20/0x30
     put_page+0x25/0x40
     unmap_and_move+0x1a6/0x1f0
     migrate_pages+0x100/0x1d0
     ? kill_procs+0x100/0x100
     ? unlock_page+0x6f/0x90
     __soft_offline_page+0x127/0x2a0
     soft_offline_page+0xa6/0x200

This race is explained like below:

  CPU0                    CPU1

  soft_offline_page
  __soft_offline_page
  TestSetPageHWPoison
                        unpoison_memory
                        PageHWPoison check (true)
                        TestClearPageHWPoison
                        put_page    -> release refcount held by get_hwpoison_page in unpoison_memory
                        put_page    -> release refcount held by isolate_lru_page in __soft_offline_page
  migrate_pages

The second put_page() releases refcount held by isolate_lru_page() which
will lead to unmap_and_move() releases the last refcount of page and w/
mapcount still 1 since try_to_unmap() is not called if there is only one
user map the page.  Anyway, the page refcount and mapcount will still
mess if the page is mapped by multiple users.

This race was introduced by commit 4491f71260 ("mm/memory-failure: set
PageHWPoison before migrate_pages()"), which focuses on preventing the
reuse of successfully migrated page.  Before this commit we prevent the
reuse by changing the migratetype to MIGRATE_ISOLATE during soft
offlining, which has the following problems, so simply reverting the
commit is not a best option:

  1) it doesn't eliminate the reuse completely, because
     set_migratetype_isolate() can fail to set MIGRATE_ISOLATE to the
     target page if the pageblock of the page contains one or more
     unmovable pages (i.e.  has_unmovable_pages() returns true).

  2) the original code changes migratetype to MIGRATE_ISOLATE
     forcibly, and sets it to MIGRATE_MOVABLE forcibly after soft offline,
     regardless of the original migratetype state, which could impact
     other subsystems like memory hotplug or compaction.

This patch moves PageSetHWPoison just after put_page() in
unmap_and_move(), which closes up the reported race window and minimizes
another race window b/w SetPageHWPoison and reallocation (which causes
the reuse of soft-offlined page.) The latter race window still exists
but it's acceptable, because it's rare and effectively the same as
ordinary "containment failure" case even if it happens, so keep the
window open is acceptable.

Fixes: 4491f71260 ("mm/memory-failure: set PageHWPoison before migrate_pages()")
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Tested-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Kirill A. Shutemov
d899844e9c mm: fix status code which move_pages() returns for zero page
The manpage for move_pages(2) specifies that status code for zero page is
supposed to be -EFAULT.  Currently kernel return -ENOENT in this case.

follow_page() can do it for us, if we would ask for FOLL_DUMP.  The use of
FOLL_DUMP also means that the upper layer page tables pages are no longer
allocated.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Naoya Horiguchi
4491f71260 mm/memory-failure: set PageHWPoison before migrate_pages()
Now page freeing code doesn't consider PageHWPoison as a bad page, so by
setting it before completing the page containment, we can prevent the
error page from being reused just after successful page migration.

I added TTU_IGNORE_HWPOISON for try_to_unmap() to make sure that the
page table entry is transformed into migration entry, not to hwpoison
entry.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dean Nelson <dnelson@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-07 04:39:42 +03:00
Naoya Horiguchi
f4c18e6f7b mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
The race condition addressed in commit add05cecef ("mm: soft-offline:
don't free target page in successful page migration") was not closed
completely, because that can happen not only for soft-offline, but also
for hard-offline.  Consider that a slab page is about to be freed into
buddy pool, and then an uncorrected memory error hits the page just
after entering __free_one_page(), then VM_BUG_ON_PAGE(page->flags &
PAGE_FLAGS_CHECK_AT_PREP) is triggered, despite the fact that it's not
necessary because the data on the affected page is not consumed.

To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time.  I think it's justified because __PG_HWPOISON
flags is defined to prevent the page from being reused, and setting it
outside the page's alloc-free cycle is a designed behavior (not a bug.)

For recent months, I was annoyed about BUG_ON when soft-offlined page
remains on lru cache list for a while, which is avoided by calling
put_page() instead of putback_lru_page() in page migration's success
path.  This means that this patch reverts a major change from commit
add05cecef about the new refcounting rule of soft-offlined pages, so
"reuse window" revives.  This will be closed by a subsequent patch.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dean Nelson <dnelson@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-07 04:39:42 +03:00
Aneesh Kumar K.V
8809aa2d28 mm: clarify that the function operates on hugepage pte
We have confusing functions to clear pmd, pmd_clear_* and pmd_clear.  Add
_huge_ to pmdp_clear functions so that we are clear that they operate on
hugepage pte.

We don't bother about other functions like pmdp_set_wrprotect,
pmdp_clear_flush_young, because they operate on PTE bits and hence
indicate they are operating on hugepage ptes

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:44 -07:00
Naoya Horiguchi
add05cecef mm: soft-offline: don't free target page in successful page migration
Stress testing showed that soft offline events for a process iterating
"mmap-pagefault-munmap" loop can trigger
VM_BUG_ON(PAGE_FLAGS_CHECK_AT_PREP) in __free_one_page():

  Soft offlining page 0x70fe1 at 0x70100008d000
  Soft offlining page 0x705fb at 0x70300008d000
  page:ffffea0001c3f840 count:0 mapcount:0 mapping:          (null) index:0x2
  flags: 0x1fffff80800000(hwpoison)
  page dumped because: VM_BUG_ON_PAGE(page->flags & ((1 << 25) - 1))
  ------------[ cut here ]------------
  kernel BUG at /src/linux-dev/mm/page_alloc.c:585!
  invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
  Modules linked in: cfg80211 rfkill crc32c_intel microcode ppdev parport_pc pcspkr serio_raw virtio_balloon parport i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi floppy
  CPU: 3 PID: 1779 Comm: test_base_madv_ Not tainted 4.0.0-v4.0-150511-1451-00009-g82360a3730e6 #139
  RIP: free_pcppages_bulk+0x52a/0x6f0
  Call Trace:
    drain_pages_zone+0x3d/0x50
    drain_local_pages+0x1d/0x30
    on_each_cpu_mask+0x46/0x80
    drain_all_pages+0x14b/0x1e0
    soft_offline_page+0x432/0x6e0
    SyS_madvise+0x73c/0x780
    system_call_fastpath+0x12/0x17
  Code: ff 89 45 b4 48 8b 45 c0 48 83 b8 a8 00 00 00 00 0f 85 e3 fb ff ff 0f 1f 00 0f 0b 48 8b 7d 90 48 c7 c6 e8 95 a6 81 e8 e6 32 02 00 <0f> 0b 8b 45 cc 49 89 47 30 41 8b 47 18 83 f8 ff 0f 85 10 ff ff
  RIP  [<ffffffff811a806a>] free_pcppages_bulk+0x52a/0x6f0
   RSP <ffff88007a117d28>
  ---[ end trace 53926436e76d1f35 ]---

When soft offline successfully migrates page, the source page is supposed
to be freed.  But there is a race condition where a source page looks
isolated (i.e.  the refcount is 0 and the PageHWPoison is set) but
somewhat linked to pcplist.  Then another soft offline event calls
drain_all_pages() and tries to free such hwpoisoned page, which is
forbidden.

This odd page state seems to happen due to the race between put_page() in
putback_lru_page() and __pagevec_lru_add_fn().  But I don't want to play
with tweaking drain code as done in commit 9ab3b598d2 "mm: hwpoison:
drop lru_add_drain_all() in __soft_offline_page()", or to change page
freeing code for this soft offline's purpose.

Instead, let's think about the difference between hard offline and soft
offline.  There is an interesting difference in how to isolate the in-use
page between these, that is, hard offline marks PageHWPoison of the target
page at first, and doesn't free it by keeping its refcount 1.  OTOH, soft
offline tries to free the target page then marks PageHWPoison.  This
difference might be the source of complexity and result in bugs like the
above.  So making soft offline isolate with keeping refcount can be a
solution for this problem.

We can pass to page migration code the "reason" which shows the caller, so
let's use this more to avoid calling putback_lru_page() when called from
soft offline, which effectively does the isolation for soft offline.  With
this change, target pages of soft offline never be reused without changing
migratetype, so this patch also removes the related code.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:42 -07:00
Naoya Horiguchi
b3b3a99c53 mm/migrate: check-before-clear PageSwapCache
With the page flag sanitization patchset, an invalid usage of
ClearPageSwapCache() is detected in migration_page_copy().
migrate_page_copy() is shared by both normal and hugepage (both thp and
hugetlb) code path, so let's check PageSwapCache() and clear it if it's
set to avoid misuse of the invalid clear operation.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:17 -07:00
Mel Gorman
2a8e700264 mm: numa: remove migrate_ratelimited
This code is dead since commit 9e645ab6d0 ("sched/numa: Continue PTE
scanning even if migrate rate limited") so remove it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14 16:49:05 -07:00
Geert Uytterhoeven
ef2a5153b4 mm/migrate: mark unmap_and_move() "noinline" to avoid ICE in gcc 4.7.3
With gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-12ubuntu1) :

    mm/migrate.c: In function `migrate_pages':
    mm/migrate.c:1148:1: internal compiler error: in push_minipool_fix, at config/arm/arm.c:13500
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <file:///usr/share/doc/gcc-4.7/README.Bugs> for instructions.
    Preprocessed source stored into /tmp/ccPoM1tr.out file, please attach this to your bugreport.
    make[1]: *** [mm/migrate.o] Error 1
    make: *** [mm/migrate.o] Error 2

Mark unmap_and_move() (which is used in a single place only) "noinline"
to work around this compiler bug.

[akpm@linux-foundation.org: make it conditional on gcc-4.7.3 and arm]
[khilman@kernel.org: fine-tune compiler versions]
[akpm@linux-foundation.org: fix comment]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reported-by: Kevin Hilman <khilman@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14 16:48:59 -07:00
Mel Gorman
4d94246699 mm: convert p[te|md]_mknonnuma and remaining page table manipulations
With PROT_NONE, the traditional page table manipulation functions are
sufficient.

[andre.przywara@arm.com: fix compiler warning in pmdp_invalidate()]
[akpm@linux-foundation.org: fix build with STRICT_MM_TYPECHECKS]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:08 -08:00
Mel Gorman
5d83306213 mm: numa: do not dereference pmd outside of the lock during NUMA hinting fault
Automatic NUMA balancing depends on being able to protect PTEs to trap a
fault and gather reference locality information.  Very broadly speaking
it would mark PTEs as not present and use another bit to distinguish
between NUMA hinting faults and other types of faults.  It was
universally loved by everybody and caused no problems whatsoever.  That
last sentence might be a lie.

This series is very heavily based on patches from Linus and Aneesh to
replace the existing PTE/PMD NUMA helper functions with normal change
protections.  I did alter and add parts of it but I consider them
relatively minor contributions.  At their suggestion, acked-bys are in
there but I've no problem converting them to Signed-off-by if requested.

AFAIK, this has received no testing on ppc64 and I'm depending on Aneesh
for that.  I tested trinity under kvm-tool and passed and ran a few
other basic tests.  At the time of writing, only the short-lived tests
have completed but testing of V2 indicated that long-term testing had no
surprises.  In most cases I'm leaving out detail as it's not that
interesting.

specjbb single JVM: There was negligible performance difference in the
	benchmark itself for short runs. However, system activity is
	higher and interrupts are much higher over time -- possibly TLB
	flushes. Migrations are also higher. Overall, this is more overhead
	but considering the problems faced with the old approach I think
	we just have to suck it up and find another way of reducing the
	overhead.

specjbb multi JVM: Negligible performance difference to the actual benchmark
	but like the single JVM case, the system overhead is noticeably
	higher.  Again, interrupts are a major factor.

autonumabench: This was all over the place and about all that can be
	reasonably concluded is that it's different but not necessarily
	better or worse.

autonumabench
                                     3.18.0-rc5            3.18.0-rc5
                                 mmotm-20141119         protnone-v3r3
User    NUMA01               32380.24 (  0.00%)    21642.92 ( 33.16%)
User    NUMA01_THEADLOCAL    22481.02 (  0.00%)    22283.22 (  0.88%)
User    NUMA02                3137.00 (  0.00%)     3116.54 (  0.65%)
User    NUMA02_SMT            1614.03 (  0.00%)     1543.53 (  4.37%)
System  NUMA01                 322.97 (  0.00%)     1465.89 (-353.88%)
System  NUMA01_THEADLOCAL       91.87 (  0.00%)       49.32 ( 46.32%)
System  NUMA02                  37.83 (  0.00%)       14.61 ( 61.38%)
System  NUMA02_SMT               7.36 (  0.00%)        7.45 ( -1.22%)
Elapsed NUMA01                 716.63 (  0.00%)      599.29 ( 16.37%)
Elapsed NUMA01_THEADLOCAL      553.98 (  0.00%)      539.94 (  2.53%)
Elapsed NUMA02                  83.85 (  0.00%)       83.04 (  0.97%)
Elapsed NUMA02_SMT              86.57 (  0.00%)       79.15 (  8.57%)
CPU     NUMA01                4563.00 (  0.00%)     3855.00 ( 15.52%)
CPU     NUMA01_THEADLOCAL     4074.00 (  0.00%)     4136.00 ( -1.52%)
CPU     NUMA02                3785.00 (  0.00%)     3770.00 (  0.40%)
CPU     NUMA02_SMT            1872.00 (  0.00%)     1959.00 ( -4.65%)

System CPU usage of NUMA01 is worse but it's an adverse workload on this
machine so I'm reluctant to conclude that it's a problem that matters.  On
the other workloads that are sensible on this machine, system CPU usage is
great.  Overall time to complete the benchmark is comparable

          3.18.0-rc5  3.18.0-rc5
        mmotm-20141119protnone-v3r3
User        59612.50    48586.44
System        460.22     1537.45
Elapsed      1442.20     1304.29

NUMA alloc hit                 5075182     5743353
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local               5075174     5743339
NUMA base PTE updates        637061448   443106883
NUMA huge PMD updates          1243434      864747
NUMA page range updates     1273699656   885857347
NUMA hint faults               1658116     1214277
NUMA hint local faults          959487      754113
NUMA hint local percent             57          62
NUMA pages migrated            5467056    61676398

The NUMA pages migrated look terrible but when I looked at a graph of the
activity over time I see that the massive spike in migration activity was
during NUMA01.  This correlates with high system CPU usage and could be
simply down to bad luck but any modifications that affect that workload
would be related to scan rates and migrations, not the protection
mechanism.  For all other workloads, migration activity was comparable.

Overall, headline performance figures are comparable but the overhead is
higher, mostly in interrupts.  To some extent, higher overhead from this
approach was anticipated but not to this degree.  It's going to be
necessary to reduce this again with a separate series in the future.  It's
still worth going ahead with this series though as it's likely to avoid
constant headaches with Xen and is probably easier to maintain.

This patch (of 10):

A transhuge NUMA hinting fault may find the page is migrating and should
wait until migration completes.  The check is race-prone because the pmd
is deferenced outside of the page lock and while the race is tiny, it'll
be larger if the PMD is cleared while marking PMDs for hinting fault.
This patch closes the race.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:08 -08:00
Naoya Horiguchi
e66f17ff71 mm/hugetlb: take page table lock in follow_huge_pmd()
We have a race condition between move_pages() and freeing hugepages, where
move_pages() calls follow_page(FOLL_GET) for hugepages internally and
tries to get its refcount without preventing concurrent freeing.  This
race crashes the kernel, so this patch fixes it by moving FOLL_GET code
for hugepages into follow_huge_pmd() with taking the page table lock.

This patch intentionally removes page==NULL check after pte_page.
This is justified because pte_page() never returns NULL for any
architectures or configurations.

This patch changes the behavior of follow_huge_pmd() for tail pages and
then tail pages can be pinned/returned.  So the caller must be changed to
properly handle the returned tail pages.

We could have a choice to add the similar locking to
follow_huge_(addr|pud) for consistency, but it's not necessary because
currently these functions don't support FOLL_GET flag, so let's leave it
for future development.

Here is the reproducer:

  $ cat movepages.c
  #include <stdio.h>
  #include <stdlib.h>
  #include <numaif.h>

  #define ADDR_INPUT      0x700000000000UL
  #define HPS             0x200000
  #define PS              0x1000

  int main(int argc, char *argv[]) {
          int i;
          int nr_hp = strtol(argv[1], NULL, 0);
          int nr_p  = nr_hp * HPS / PS;
          int ret;
          void **addrs;
          int *status;
          int *nodes;
          pid_t pid;

          pid = strtol(argv[2], NULL, 0);
          addrs  = malloc(sizeof(char *) * nr_p + 1);
          status = malloc(sizeof(char *) * nr_p + 1);
          nodes  = malloc(sizeof(char *) * nr_p + 1);

          while (1) {
                  for (i = 0; i < nr_p; i++) {
                          addrs[i] = (void *)ADDR_INPUT + i * PS;
                          nodes[i] = 1;
                          status[i] = 0;
                  }
                  ret = numa_move_pages(pid, nr_p, addrs, nodes, status,
                                        MPOL_MF_MOVE_ALL);
                  if (ret == -1)
                          err("move_pages");

                  for (i = 0; i < nr_p; i++) {
                          addrs[i] = (void *)ADDR_INPUT + i * PS;
                          nodes[i] = 0;
                          status[i] = 0;
                  }
                  ret = numa_move_pages(pid, nr_p, addrs, nodes, status,
                                        MPOL_MF_MOVE_ALL);
                  if (ret == -1)
                          err("move_pages");
          }
          return 0;
  }

  $ cat hugepage.c
  #include <stdio.h>
  #include <sys/mman.h>
  #include <string.h>

  #define ADDR_INPUT      0x700000000000UL
  #define HPS             0x200000

  int main(int argc, char *argv[]) {
          int nr_hp = strtol(argv[1], NULL, 0);
          char *p;

          while (1) {
                  p = mmap((void *)ADDR_INPUT, nr_hp * HPS, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
                  if (p != (void *)ADDR_INPUT) {
                          perror("mmap");
                          break;
                  }
                  memset(p, 0, nr_hp * HPS);
                  munmap(p, nr_hp * HPS);
          }
  }

  $ sysctl vm.nr_hugepages=40
  $ ./hugepage 10 &
  $ ./movepages 10 $(pgrep -f hugepage)

Fixes: e632a938d9 ("mm: migrate: add hugepage migration code to move_pages()")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>	[3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:01 -08:00
Kirill A. Shutemov
27ba0644ea rmap: drop support of non-linear mappings
We don't create non-linear mappings anymore.  Let's drop code which
handles them in rmap.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:31 -08:00
Al Viro
50062175ff vm_area_operations: kill ->migrate()
the only instance this method has ever grown was one in kernfs -
one that call ->migrate() of another vm_ops if it exists.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-17 08:26:51 -05:00
Linus Torvalds
988adfdffd Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "Highlights:

   - AMD KFD driver merge

     This is the AMD HSA interface for exposing a lowlevel interface for
     GPGPU use.  They have an open source userspace built on top of this
     interface, and the code looks as good as it was going to get out of
     tree.

   - Initial atomic modesetting work

     The need for an atomic modesetting interface to allow userspace to
     try and send a complete set of modesetting state to the driver has
     arisen, and been suffering from neglect this past year.  No more,
     the start of the common code and changes for msm driver to use it
     are in this tree.  Ongoing work to get the userspace ioctl finished
     and the code clean will probably wait until next kernel.

   - DisplayID 1.3 and tiled monitor exposed to userspace.

     Tiled monitor property is now exposed for userspace to make use of.

   - Rockchip drm driver merged.

   - imx gpu driver moved out of staging

  Other stuff:

   - core:
        panel - MIPI DSI + new panels.
        expose suggested x/y properties for virtual GPUs

   - i915:
        Initial Skylake (SKL) support
        gen3/4 reset work
        start of dri1/ums removal
        infoframe tracking
        fixes for lots of things.

   - nouveau:
        tegra k1 voltage support
        GM204 modesetting support
        GT21x memory reclocking work

   - radeon:
        CI dpm fixes
        GPUVM improvements
        Initial DPM fan control

   - rcar-du:
        HDMI support added
        removed some support for old boards
        slave encoder driver for Analog Devices adv7511

   - exynos:
        Exynos4415 SoC support

   - msm:
        a4xx gpu support
        atomic helper conversion

   - tegra:
        iommu support
        universal plane support
        ganged-mode DSI support

   - sti:
        HDMI i2c improvements

   - vmwgfx:
        some late fixes.

   - qxl:
        use suggested x/y properties"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits)
  drm: sti: fix module compilation issue
  drm/i915: save/restore GMBUS freq across suspend/resume on gen4
  drm: sti: correctly cleanup CRTC and planes
  drm: sti: add HQVDP plane
  drm: sti: add cursor plane
  drm: sti: enable auxiliary CRTC
  drm: sti: fix delay in VTG programming
  drm: sti: prepare sti_tvout to support auxiliary crtc
  drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off}
  drm: sti: fix hdmi avi infoframe
  drm: sti: remove event lock while disabling vblank
  drm: sti: simplify gdp code
  drm: sti: clear all mixer control
  drm: sti: remove gpio for HDMI hot plug detection
  drm: sti: allow to change hdmi ddc i2c adapter
  drm/doc: Document drm_add_modes_noedid() usage
  drm/i915: Remove '& 0xffff' from the mask given to WA_REG()
  drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()
  drm: Zero out DRM object memory upon cleanup
  drm/i915/bdw: Fix the write setting up the WIZ hashing mode
  ...
2014-12-15 15:52:01 -08:00
Hugh Dickins
2ebba6b7e1 mm: unmapped page migration avoid unmap+remap overhead
Page migration's __unmap_and_move(), and rmap's try_to_unmap(), were
created for use on pages almost certainly mapped into userspace.  But
nowadays compaction often applies them to unmapped page cache pages: which
may exacerbate contention on i_mmap_rwsem quite unnecessarily, since
try_to_unmap_file() makes no preliminary page_mapped() check.

Now check page_mapped() in __unmap_and_move(); and avoid repeating the
same overhead in rmap_walk_file() - don't remove_migration_ptes() when we
never inserted any.

(The PageAnon(page) comment blocks now look even sillier than before, but
clean that up on some other occasion.  And note in passing that
try_to_unmap_one() does not use a migration entry when PageSwapCache, so
remove_migration_ptes() will then not update that swap entry to newpage
pte: not a big deal, but something else to clean up later.)

Davidlohr remarked in "mm,fs: introduce helpers around the i_mmap_mutex"
conversion to i_mmap_rwsem, that "The biggest winner of these changes is
migration": a part of the reason might be all of that unnecessary taking
of i_mmap_mutex in page migration; and it's rather a shame that I didn't
get around to sending this patch in before his - this one is much less
useful after Davidlohr's conversion to rwsem, but still good.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13 12:42:49 -08:00