list_lru_count_node() iterates over all memcgs to get the total number of
entries on the node, but it can race with memcg_drain_all_list_lrus(),
which migrates the entries from a dead cgroup to another. As a result,
list_lru_count_node() can return an incorrect number of entries.
Fix this by keeping track of entries per node and simply returning that
count in list_lru_count_node().
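The idea can be sketched in a small userspace model. This is only an
illustration, not the kernel code: the struct, the function names and the
fixed NR_MEMCGS are assumptions, and real locking is left out.

```c
#include <assert.h>
#include <stddef.h>

#define NR_MEMCGS 4  /* hypothetical fixed number of cgroups */

/* Simplified model of one NUMA node of a list_lru. */
struct lru_node_model {
    long per_memcg[NR_MEMCGS]; /* entries owned by each cgroup */
    long nr_items;             /* total entries on this node */
};

static void lru_add(struct lru_node_model *n, int memcg)
{
    n->per_memcg[memcg]++;
    n->nr_items++;
}

static void lru_del(struct lru_node_model *n, int memcg)
{
    n->per_memcg[memcg]--;
    n->nr_items--;
}

/* Move every entry from a dead cgroup to another; the node total is unchanged. */
static void lru_drain(struct lru_node_model *n, int dead, int target)
{
    n->per_memcg[target] += n->per_memcg[dead];
    n->per_memcg[dead] = 0;
}

/* Return the node total directly instead of summing per-cgroup counts,
 * so an in-progress drain cannot skew the result. */
static long lru_count_node(const struct lru_node_model *n)
{
    return n->nr_items;
}
```

Because the drain only moves entries between cgroups, the per-node total
never changes during it, which is why reading that single counter is safe.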
Change-Id: I19e3b527804e95be75f48ee363c8207c0e7ee2ff
Link: http://lkml.kernel.org/r/1498707555-30525-1-git-send-email-stummala@codeaurora.org
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Alexander Polakov <apolyakov@beget.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Patch-mainline: linux-mm @ 29/06/17, 09:09:15
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* refs/heads/tmp-64a73ff:
Linux 4.4.76
KVM: nVMX: Fix exception injection
KVM: x86: zero base3 of unusable segments
KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
KVM: x86: fix emulation of RSM and IRET instructions
cpufreq: s3c2416: double free on driver init error path
iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
iommu: Handle default domain attach failure
iommu/vt-d: Don't over-free page table directories
ocfs2: o2hb: revert hb threshold to keep compatible
x86/mm: Fix flush_tlb_page() on Xen
x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
ARM: 8685/1: ensure memblock-limit is pmd-aligned
ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation
sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
watchdog: bcm281xx: Fix use of uninitialized spinlock.
xfrm: Oops on error in pfkey_msg2xfrm_state()
xfrm: NULL dereference on allocation failure
xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY
jump label: fix passing kbuild_cflags when checking for asm goto support
ravb: Fix use-after-free on `ifconfig eth0 down`
sctp: check af before verify address in sctp_addr_id2transport
net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV
perf probe: Fix to show correct locations for events on modules
be2net: fix status check in be_cmd_pmac_add()
s390/ctl_reg: make __ctl_load a full memory barrier
swiotlb: ensure that page-sized mappings are page-aligned
coredump: Ensure proper size of sparse core files
x86/mpx: Use compatible types in comparison to fix sparse error
mac80211: initialize SMPS field in HT capabilities
spi: davinci: use dma_mapping_error()
scsi: lpfc: avoid double free of resource identifiers
HID: i2c-hid: Add sleep between POWER ON and RESET
kernel/panic.c: add missing \n
ibmveth: Add a proper check for the availability of the checksum features
vxlan: do not age static remote mac entries
virtio_net: fix PAGE_SIZE > 64k
vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null
drm/amdgpu: check ring being ready before using
net: dsa: Check return value of phy_connect_direct()
amd-xgbe: Check xgbe_init() return code
platform/x86: ideapad-laptop: handle ACPI event 1
scsi: virtio_scsi: Reject commands when virtqueue is broken
xen-netfront: Fix Rx stall during network stress and OOM
swiotlb-xen: update dev_addr after swapping pages
virtio_console: fix a crash in config_work_handler
Btrfs: fix truncate down when no_holes feature is enabled
gianfar: Do not reuse pages from emergency reserve
powerpc/eeh: Enable IO path on permanent error
net: bgmac: Remove superflous netif_carrier_on()
net: bgmac: Start transmit queue in bgmac_open
net: bgmac: Fix SOF bit checking
bgmac: Fix reversed test of build_skb() return value.
mtd: bcm47xxpart: don't fail because of bit-flips
bgmac: fix a missing check for build_skb
mtd: bcm47xxpart: limit scanned flash area on BCM47XX (MIPS) only
MIPS: ralink: fix MT7628 wled_an pinmux gpio
MIPS: ralink: fix MT7628 pinmux typos
MIPS: ralink: Fix invalid assignment of SoC type
MIPS: ralink: fix USB frequency scaling
MIPS: ralink: MT7688 pinmux fixes
net: korina: Fix NAPI versus resources freeing
MIPS: ath79: fix regression in PCI window initialization
net: mvneta: Fix for_each_present_cpu usage
ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags
qla2xxx: Fix erroneous invalid handle message
scsi: lpfc: Set elsiocb contexts to NULL after freeing it
scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
KVM: x86: fix fixing of hypercalls
mm: numa: avoid waiting on freed migrated pages
block: fix module reference leak on put_disk() call for cgroups throttle
sysctl: enable strict writes
usb: gadget: f_fs: Fix possibe deadlock
drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
ALSA: hda - set input_path bitmap to zero after moving it to new place
ALSA: hda - Fix endless loop of codec configure
MIPS: Fix IRQ tracing & lockdep when rescheduling
MIPS: pm-cps: Drop manual cache-line alignment of ready_count
MIPS: Avoid accidental raw backtrace
mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()
drm/ast: Handle configuration without P2A bridge
NFSv4: fix a reference leak caused WARNING messages
netfilter: synproxy: fix conntrackd interaction
netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
rtnetlink: add IFLA_GROUP to ifla_policy
ipv6: Do not leak throw route references
sfc: provide dummy definitions of vswitch functions
net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
decnet: always not take dst->__refcnt when inserting dst into hash table
net/mlx5: Wait for FW readiness before initializing command interface
ipv6: fix calling in6_ifa_hold incorrectly for dad work
igmp: add a missing spin_lock_init()
igmp: acquire pmc lock for ip_mc_clear_src()
net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
Fix an intermittent pr_emerg warning about lo becoming free.
af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers
net: Zero ifla_vf_info in rtnl_fill_vfinfo()
decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb
net: don't call strlen on non-terminated string in dev_set_alias()
ipv6: release dst on error in ip6_dst_lookup_tail
UPSTREAM: selinux: enable genfscon labeling for tracefs
Change-Id: I05ae1d6271769a99ea3817e5066f5ab6511f3254
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
This patch is motivated by Hugh and Vlastimil's concern [1].
There are two ways to get a free page from the allocator. One is the
normal memory allocation API and the other is __isolate_free_page(),
which is used internally for compaction and pageblock isolation. The
latter is rather tricky since it doesn't do the whole post-allocation
processing done by the normal API.
One problem I already know of is that a poisoned page is not checked
if it is allocated by __isolate_free_page(). There may be more.
We could add more debug logic for allocated pages in the future, and this
separation would then cause more problems, so I'd like to fix the
situation now. The solution is simple: this patch factors out the common
logic for newly allocated pages and uses it at all sites. That solves
the problem.
[1] http://marc.info/?i=alpine.LSU.2.11.1604270029350.7066%40eggly.anvils%3E
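A minimal userspace sketch of the approach follows. All names are
hypothetical stand-ins and the page structure is reduced to two flags; the
real call sites in the patch may be arranged differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page: only the fields this sketch needs. */
struct page_model {
    bool poisoned;  /* free pages carry a poison pattern */
    bool prepared;  /* set once post-allocation processing ran */
};

/* Shared post-allocation step: every path that hands out a page calls it. */
static void post_alloc_hook_model(struct page_model *page)
{
    page->poisoned = false;  /* stands in for checking and clearing poison */
    page->prepared = true;   /* stands in for further debug bookkeeping */
}

/* Normal allocation path. */
static struct page_model *alloc_page_model(struct page_model *free_page)
{
    post_alloc_hook_model(free_page);
    return free_page;
}

/* Isolation path used by compaction; it runs the same hook. */
static struct page_model *isolate_free_page_model(struct page_model *free_page)
{
    post_alloc_hook_model(free_page);
    return free_page;
}
```

With one shared hook, any debug logic added later reaches both paths
without further changes.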
Change-Id: I601ec8ce8ee4ab76cd408ff2148dd8c73b959fc2
[iamjoonsoo.kim@lge.com: mm-page_alloc-introduce-post-allocation-processing-on-page-allocator-v3]
Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1466150259-27727-9-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 46f24fd857b37bb86ddd5d0ac3d194e984dfdf1c
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Currently, we store each page's allocation stacktrace in the corresponding
page_ext structure, which requires a lot of memory. As a result,
memory-tight systems don't work well when page_owner is enabled.
Moreover, even with this large memory consumption, we cannot get the full
stacktrace because we allocate memory at boot time and maintain just
8 stacktrace slots to limit memory consumption. We could increase the
number of slots, but that would make the system unusable or change its
behaviour.
To solve the problem, this patch uses stackdepot to store stacktraces.
It clearly saves memory, but there is a drawback: stackdepot can fail.
stackdepot allocates memory at runtime, so it can fail if the system does
not have enough memory. However, most allocation stacks are generated very
early, when plenty of memory is available, so failure is unlikely.
Also, one failure means that we miss just one page's allocation
stacktrace, so it is not a big problem. In this patch, when memory
allocation fails, we store a special stacktrace handle in the page whose
stacktrace could not be saved. With it, the user can still estimate memory
usage properly even if a failure happens.
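The deduplication and the failure handle can be illustrated with a toy
depot. This is a sketch under assumptions, not the stackdepot
implementation: the fixed-size table, the linear search and the names
(depot_save, FAILURE_HANDLE) are all made up for the example.

```c
#include <assert.h>
#include <string.h>

#define DEPOT_SLOTS    2   /* deliberately tiny so the failure path is reachable */
#define TRACE_DEPTH    4
#define FAILURE_HANDLE 0u  /* hypothetical special handle: "trace not saved" */

struct trace_model {
    unsigned long entries[TRACE_DEPTH];
};

struct depot_model {
    struct trace_model slots[DEPOT_SLOTS];
    unsigned int used;
};

/* Return a handle (slot index + 1) for the trace, storing each trace once. */
static unsigned int depot_save(struct depot_model *d, const struct trace_model *t)
{
    for (unsigned int i = 0; i < d->used; i++)
        if (memcmp(&d->slots[i], t, sizeof(*t)) == 0)
            return i + 1;       /* identical trace already stored */

    if (d->used == DEPOT_SLOTS)
        return FAILURE_HANDLE;  /* out of space: caller records the failure */

    d->slots[d->used] = *t;
    return ++d->used;
}
```

Each page then needs to hold only a small handle, and pages that share an
allocation path share one stored trace.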
The memory savings are as follows (4GB memory system with page_owner;
before the patch -> after the patch):
static allocation:
92274688 bytes -> 25165824 bytes
dynamic allocation after boot + kernel build:
0 bytes -> 327680 bytes
total:
92274688 bytes -> 25493504 bytes
72% reduction in total.
Note that the implementation looks more complex than one would expect
because there is a recursion issue. stackdepot uses the page allocator, and
page_owner is called at page allocation. Using stackdepot in page_owner
can re-enter the page allocator and then page_owner again. That is
recursion. To detect and avoid it, recursion is checked whenever we obtain
a stacktrace, and page_owner is set to dummy information if recursion is
found. Dummy information means that this page was allocated for the
page_owner feature itself (such as stackdepot), which is understandable
behavior for the user.
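One way to detect such recursion is to look for the tracking function's
own return address appearing twice in the captured trace. The sketch below
assumes that approach; the function names and DUMMY_HANDLE are
hypothetical, and the trace is passed in as a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DUMMY_HANDLE 1u  /* hypothetical handle: "allocated for page_owner itself" */

/* True if the return address `ip` appears at least twice in the captured
 * trace, i.e. the tracking code re-entered itself through the allocator. */
static bool trace_is_recursive(const unsigned long *entries, size_t nr,
                               unsigned long ip)
{
    size_t count = 0;

    for (size_t i = 0; i < nr; i++)
        if (entries[i] == ip && ++count == 2)
            return true;
    return false;
}

/* Pick the handle to record: the dummy one on recursion, else the real one. */
static unsigned int choose_handle(const unsigned long *entries, size_t nr,
                                  unsigned long ip, unsigned int real_handle)
{
    return trace_is_recursive(entries, nr, ip) ? DUMMY_HANDLE : real_handle;
}
```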
Change-Id: I9f96f1b527836a7577b1818a6a4fde7786e23a3b
[iamjoonsoo.kim@lge.com: mm-page_owner-use-stackdepot-to-store-stacktrace-v3]
Link: http://lkml.kernel.org/r/1464230275-25791-6-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1466150259-27727-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: f2ca0b55710752588ccff5224a11e6aea43a996a
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
We have dereferenced page_ext before checking it. Let's check it first
and then use it.
Change-Id: I9184110069df51ddcf6eb699cb6ed2320fa09ab0
Fixes: f86e4271978b ("mm: check the return value of lookup_page_ext for all call sites")
Link: http://lkml.kernel.org/r/1465249059-7883-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 8285027fc479949a7a166bc1b26ce57e894878a7
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Per the discussion with Joonsoo Kim [1], we need to check the return value
of lookup_page_ext() at all call sites since it might return NULL in
some cases, although that is unlikely, e.g. with memory hotplug.
Tested with ltp with "page_owner=0".
[1] http://lkml.kernel.org/r/20160519002809.GA10245@js1304-P5Q-DELUXE
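The call-site pattern looks roughly like the following userspace sketch.
The lookup function here is a stand-in that returns NULL for an
out-of-range index; it is not lookup_page_ext() itself, and the struct and
flag are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES_MODEL 4

struct page_ext_model {
    unsigned long flags;
};

/* Stand-in for the lookup: returns NULL when no extension exists for the
 * page (modelled here as an out-of-range index). */
static struct page_ext_model *lookup_ext_model(struct page_ext_model *table,
                                               size_t pfn)
{
    return pfn < NR_PAGES_MODEL ? &table[pfn] : NULL;
}

/* Check the lookup result before the first dereference. */
static bool set_owner_flag_model(struct page_ext_model *table, size_t pfn)
{
    struct page_ext_model *ext = lookup_ext_model(table, pfn);

    if (!ext)
        return false;  /* nothing to update; skip quietly */
    ext->flags |= 1UL;
    return true;
}
```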
Change-Id: Ie0c577c1136a7f6f4e0fa2ceacfb007cd5323b8e
[akpm@linux-foundation.org: fix build-breaking typos]
[arnd@arndb.de: fix build problems from lookup_page_ext]
Link: http://lkml.kernel.org/r/6285269.2CksypHdYp@wuerfel
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1464023768-31025-1-git-send-email-yang.shi@linaro.org
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: f86e4271978bd93db466d6a95dad4b0fdcdb04f6
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
split_page() calls set_page_owner() to set up page_owner for each page.
However, this has the drawback that the head page and the others have
different stacktraces because the call site of set_page_owner() is slightly
different. To avoid this problem, this patch copies the head page's
page_owner to the others. It needs to introduce a new function,
split_page_owner(), but it also removes another function,
get_page_owner_gfp(), so it looks like a good trade.
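In outline, the copy can be modelled as below. The struct and function
name are assumptions for illustration; the real page_owner data carries
more fields than the two shown.

```c
#include <assert.h>
#include <stddef.h>

struct owner_model {
    unsigned int order;
    unsigned int handle;  /* identifies the allocation stack trace */
};

/* After splitting an order-N page, give every tail page the head's owner
 * info (with order 0) so all pieces report the same stack trace. */
static void split_owner_model(struct owner_model *pages, unsigned int order)
{
    size_t nr = (size_t)1 << order;

    pages[0].order = 0;
    for (size_t i = 1; i < nr; i++)
        pages[i] = pages[0];
}
```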
Change-Id: Ie946ccf7dc1e9eeacb03ac81720c178daa7db21e
Link: http://lkml.kernel.org/r/1464230275-25791-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: a9627bc5e34e79ae80a33241b8a1501cc498e191
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Currently, copy_page_owner() doesn't copy all the owner information. It
skips last_migrate_reason because copy_page_owner() is used for
migration and the field will be properly set soon afterwards. However, a
following patch will also use copy_page_owner(), and this skip would
leave the allocated page with an uninitialized last_migrate_reason. To
prevent that, this patch also copies last_migrate_reason in
copy_page_owner().
Change-Id: Ibaf1320c296f808098481a29bde3147390199b90
Link: http://lkml.kernel.org/r/1464230275-25791-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: a8efe1c982a22c95884dee1ddf2e721567d1f483
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
It's not necessary to initialize page_owner while holding the zone lock.
Doing so causes more contention on the zone lock, although that is not a
big problem since page_owner is just a debug feature. Still, avoiding it is
an improvement, so do it. This is also a preparation step for using
stackdepot in the page owner feature. Stackdepot allocates new pages when
there is no reserved space, and holding the zone lock in that case would
cause a deadlock.
Change-Id: Id96ab8444f194bead3fa4a8ddda30cdcca4ddc9f
Link: http://lkml.kernel.org/r/1464230275-25791-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 83358ece26b70f20c0ba2e0e00dc84b0ee24fe6d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
We don't need to split freepages while holding the zone lock. Doing so
causes more contention on the zone lock, which is not desirable.
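The general shape is: do only the cheap list manipulation under the lock,
then do the expensive per-page work after dropping it. The sketch below
models the lock as a plain flag so the ordering can be checked; all names
and the BATCH size are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define BATCH 3

struct zone_model {
    bool locked;           /* stands in for the zone spinlock */
    int free_pages;        /* pages available on the free list */
    int processed_locked;  /* expensive work done while locked (should stay 0) */
    int processed;         /* expensive work done in total */
};

/* Expensive per-page work (splitting, owner setup); may itself allocate. */
static void expensive_setup(struct zone_model *z)
{
    if (z->locked)
        z->processed_locked++;
    z->processed++;
}

/* Take pages under the lock, but do the expensive part after unlocking. */
static int isolate_batch(struct zone_model *z)
{
    int taken = 0;

    z->locked = true;
    while (taken < BATCH && z->free_pages > 0) {
        z->free_pages--;  /* cheap list manipulation only */
        taken++;
    }
    z->locked = false;

    for (int i = 0; i < taken; i++)
        expensive_setup(z);
    return taken;
}
```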
Change-Id: Ifb1ee4e48e322abb25a9293885f68dfe75afb743
[rientjes@google.com: if __isolate_free_page() fails, avoid adding to freelist so we don't call map_pages() with it]
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606211447001.43430@chino.kir.corp.google.com
Link: http://lkml.kernel.org/r/1464230275-25791-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 66c64223ad4e7a4a9161fcd9606426d9f57227ca
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
There are systems whose nodes' pfns overlap as follows:
-----pfn-------->
N0 N1 N2 N0 N1 N2
Therefore, we need to take this overlap into account when iterating over
a pfn range. There is one place in page_owner.c that iterates over a pfn
range and doesn't consider this overlap. Add the check.
Without this patch, the above system could overcount the number of early
allocated pages before page_owner is activated.
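The check amounts to skipping pfns that fall inside a node's span but
belong to a different node. A small model, assuming a hypothetical
page_nid[] table that maps each pfn to its owning node:

```c
#include <assert.h>
#include <stddef.h>

/* Count the pages in [start, end) that really belong to node `nid`.
 * `page_nid[pfn]` gives the owning node of each pfn (hypothetical table). */
static size_t count_node_pages(const int *page_nid, size_t start, size_t end,
                               int nid)
{
    size_t count = 0;

    for (size_t pfn = start; pfn < end; pfn++) {
        if (page_nid[pfn] != nid)
            continue;  /* inside the span, but owned by another node */
        count++;
    }
    return count;
}
```

For the layout N0 N1 N2 N0 N1 N2, node 0 spans pfns 0 to 3 but owns only
two of those four pages; without the check, all four would be counted.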
Change-Id: I2addf2fe2ae4d2b0d82b2dcbdcda37663daec0f3
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 9d43f5aec9506d98ad492a783aa8a18226c5d474
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
The page_owner mechanism is useful for dealing with memory leaks. By
reading /sys/kernel/debug/page_owner one can determine the stack traces
leading to allocations of all pages, and find e.g. a buggy driver.
This information might also be useful for debugging in other contexts, such
as the dump_page() calls made by VM_BUG_ON_PAGE(). So let's print the
stored info from dump_page().
Example output:
page:ffffea000292f1c0 count:1 mapcount:0 mapping:ffff8800b2f6cc18 index:0x91d
flags: 0x1fffff8001002c(referenced|uptodate|lru|mappedtodisk)
page dumped because: VM_BUG_ON_PAGE(1)
page->mem_cgroup:ffff8801392c5000
page allocated via order 0, migratetype Movable, gfp_mask 0x24213ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD|__GFP_NOWARN|__GFP_NORETRY)
[<ffffffff811682c4>] __alloc_pages_nodemask+0x134/0x230
[<ffffffff811b40c8>] alloc_pages_current+0x88/0x120
[<ffffffff8115e386>] __page_cache_alloc+0xe6/0x120
[<ffffffff8116ba6c>] __do_page_cache_readahead+0xdc/0x240
[<ffffffff8116bd05>] ondemand_readahead+0x135/0x260
[<ffffffff8116be9c>] page_cache_async_readahead+0x6c/0x70
[<ffffffff811604c2>] generic_file_read_iter+0x3f2/0x760
[<ffffffff811e0dc7>] __vfs_read+0xa7/0xd0
page has been migrated, last migrate reason: compaction
Change-Id: Ie5f3716ab34b3a66a00973f5d87360ebd0155348
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 4e462112e98f9ad6dd62e160f8b14c7df5fed2fc
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
During migration, page_owner info is now copied with the rest of the
page, so the stacktrace leading to free page allocation during migration
is overwritten. For debugging purposes, however, it might be useful to
know that the page has been migrated since its initial allocation. This
might happen many times during the page's lifetime for different reasons,
and fully tracking it, especially with stacktraces, would incur extra
memory costs. As a compromise, store and print the migrate_reason of
the last migration that occurred to the page. This is enough to
distinguish compaction, numa balancing etc.
Example page_owner entry after the patch:
Page allocated via order 0, mask 0x24200ca(GFP_HIGHUSER_MOVABLE)
PFN 628753 type Movable Block 1228 type Movable Flags 0x1fffff80040030(dirty|lru|swapbacked)
[<ffffffff811682c4>] __alloc_pages_nodemask+0x134/0x230
[<ffffffff811b6325>] alloc_pages_vma+0xb5/0x250
[<ffffffff81177491>] shmem_alloc_page+0x61/0x90
[<ffffffff8117a438>] shmem_getpage_gfp+0x678/0x960
[<ffffffff8117c2b9>] shmem_fallocate+0x329/0x440
[<ffffffff811de600>] vfs_fallocate+0x140/0x230
[<ffffffff811df434>] SyS_fallocate+0x44/0x70
[<ffffffff8158cc2e>] entry_SYSCALL_64_fastpath+0x12/0x71
Page has been migrated, last migrate reason: compaction
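A compact model of the compromise: the new page inherits the original
owner info, and only a single field records why it last moved. The enum
values, struct and reason strings below are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reason codes; MR_NONE means "never migrated". */
enum migrate_reason_model { MR_NONE = -1, MR_COMPACTION_M, MR_NUMA_M };

struct owner_info_model {
    unsigned int handle;      /* original allocation stack trace */
    int last_migrate_reason;  /* MR_NONE until the page is first migrated */
};

/* Migration: carry over the original owner info, then note why it moved. */
static void migrate_owner_model(struct owner_info_model *newpage,
                                const struct owner_info_model *oldpage,
                                int reason)
{
    *newpage = *oldpage;
    newpage->last_migrate_reason = reason;
}

/* Map a reason code to a printable name (names are illustrative). */
static const char *reason_name_model(int reason)
{
    switch (reason) {
    case MR_COMPACTION_M: return "compaction";
    case MR_NUMA_M:       return "numa_balancing";
    default:              return "none";
    }
}
```

Only one integer per page is added, regardless of how many times the page
migrates.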
Change-Id: I9c93f9f91fa71feaea1505d80ee56caf8daf5562
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 7cd12b4abfd2f8f42414c520bbd051a5b7dc7a8c
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
The page_owner mechanism stores the gfp_flags of an allocation and the stack
trace that led to it. During page migration, the original information
is effectively replaced by that of the free page allocated as the migration
target. Arguably this is less useful and might lead to all the
page_owner info for migratable pages gradually converging towards
compaction or numa balancing migrations. It has also led to
inaccuracies such as the one fixed by commit e2cfc91120 ("mm/page_owner:
set correct gfp_mask on page_owner").
This patch thus introduces copying the page_owner info during migration.
However, since the fact that the page has been migrated from its
original place might be useful for debugging, the next patch will
introduce a way to track that information as well.
Change-Id: I4eb94be5fb2c93bbf165edb9f2a80091b5c8d7b1
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: d435edca928805074dae005ab9a42d9fa60fc702
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
CONFIG_PAGE_OWNER attempts to impose negligible runtime overhead when
enabled during compilation, but not actually enabled during runtime by
boot param page_owner=on. This overhead can be further reduced using
the static key mechanism, which this patch does.
Change-Id: I76e44d92ed973647d4fd6489f97db5ffeb893354
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 7dd80b8af0bcd705a9ef2fa272c082882616a499
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
The information in /sys/kernel/debug/page_owner includes the migratetype
of the pageblock the page belongs to. This is also checked against the
page's migratetype (as declared by gfp_flags during its allocation), and
the page is reported as Fallback if its migratetype differs from the
pageblock's one. This is somewhat misleading because in fact fallback
allocation is not the only reason why these two can differ. It also
doesn't directly provide the page's migratetype, although it's possible
to derive that from the gfp_flags.
It's arguably better to print both page and pageblock's migratetype and
leave the interpretation to the consumer than to suggest fallback
allocation as the only possible reason. While at it, we can print the
migratetypes as strings, the same way /proc/pagetypeinfo does, as some
of the numeric values depend on kernel configuration. For that, this
patch moves the migratetype_names array from #ifdef CONFIG_PROC_FS part
of mm/vmstat.c to mm/page_alloc.c and exports it.
With the new format strings for flags, we can now also provide symbolic
page and gfp flags in the /sys/kernel/debug/page_owner file. This
replaces the positional printing of page flags as single letters, which
might have looked nicer, but was limited to a subset of flags, and
required the user to remember the letters.
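Printing both migratetypes by name can be sketched as follows. The name
table here is a hypothetical subset and the helper is made up for the
example; the real migratetype_names array depends on kernel configuration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { MT_UNMOVABLE, MT_MOVABLE, MT_RECLAIMABLE, MT_TYPES };

/* Hypothetical subset of names; the real array depends on kernel config. */
static const char *const mt_names[MT_TYPES] = {
    "Unmovable", "Movable", "Reclaimable",
};

/* Print both migratetypes by name and leave interpretation to the reader. */
static int format_types(char *buf, size_t len, unsigned long pfn, int page_mt,
                        unsigned long block, int block_mt)
{
    return snprintf(buf, len, "PFN %lu type %s Block %lu type %s",
                    pfn, mt_names[page_mt], block, mt_names[block_mt]);
}
```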
Example page_owner entry after the patch:
Page allocated via order 0, mask 0x24213ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD|__GFP_NOWARN|__GFP_NORETRY)
PFN 520 type Movable Block 1 type Movable Flags 0xfffff8001006c(referenced|uptodate|lru|active|mappedtodisk)
[<ffffffff811682c4>] __alloc_pages_nodemask+0x134/0x230
[<ffffffff811b4058>] alloc_pages_current+0x88/0x120
[<ffffffff8115e386>] __page_cache_alloc+0xe6/0x120
[<ffffffff8116ba6c>] __do_page_cache_readahead+0xdc/0x240
[<ffffffff8116bd05>] ondemand_readahead+0x135/0x260
[<ffffffff8116bfb1>] page_cache_sync_readahead+0x31/0x50
[<ffffffff81160523>] generic_file_read_iter+0x453/0x760
[<ffffffff811e0d57>] __vfs_read+0xa7/0xd0
Change-Id: I08f3412dbda9075d5534eee81444843a7679e54e
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 60f30350fd69a3e4d5f0f45937d3274c22565134
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[guptap@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Merge 4.4.76 into android-4.4
Changes in 4.4.76
ipv6: release dst on error in ip6_dst_lookup_tail
net: don't call strlen on non-terminated string in dev_set_alias()
decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb
net: Zero ifla_vf_info in rtnl_fill_vfinfo()
af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers
Fix an intermittent pr_emerg warning about lo becoming free.
net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
igmp: acquire pmc lock for ip_mc_clear_src()
igmp: add a missing spin_lock_init()
ipv6: fix calling in6_ifa_hold incorrectly for dad work
net/mlx5: Wait for FW readiness before initializing command interface
decnet: always not take dst->__refcnt when inserting dst into hash table
net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
sfc: provide dummy definitions of vswitch functions
ipv6: Do not leak throw route references
rtnetlink: add IFLA_GROUP to ifla_policy
netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
netfilter: synproxy: fix conntrackd interaction
NFSv4: fix a reference leak caused WARNING messages
drm/ast: Handle configuration without P2A bridge
mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()
MIPS: Avoid accidental raw backtrace
MIPS: pm-cps: Drop manual cache-line alignment of ready_count
MIPS: Fix IRQ tracing & lockdep when rescheduling
ALSA: hda - Fix endless loop of codec configure
ALSA: hda - set input_path bitmap to zero after moving it to new place
drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
usb: gadget: f_fs: Fix possibe deadlock
sysctl: enable strict writes
block: fix module reference leak on put_disk() call for cgroups throttle
mm: numa: avoid waiting on freed migrated pages
KVM: x86: fix fixing of hypercalls
scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
scsi: lpfc: Set elsiocb contexts to NULL after freeing it
qla2xxx: Fix erroneous invalid handle message
ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags
net: mvneta: Fix for_each_present_cpu usage
MIPS: ath79: fix regression in PCI window initialization
net: korina: Fix NAPI versus resources freeing
MIPS: ralink: MT7688 pinmux fixes
MIPS: ralink: fix USB frequency scaling
MIPS: ralink: Fix invalid assignment of SoC type
MIPS: ralink: fix MT7628 pinmux typos
MIPS: ralink: fix MT7628 wled_an pinmux gpio
mtd: bcm47xxpart: limit scanned flash area on BCM47XX (MIPS) only
bgmac: fix a missing check for build_skb
mtd: bcm47xxpart: don't fail because of bit-flips
bgmac: Fix reversed test of build_skb() return value.
net: bgmac: Fix SOF bit checking
net: bgmac: Start transmit queue in bgmac_open
net: bgmac: Remove superflous netif_carrier_on()
powerpc/eeh: Enable IO path on permanent error
gianfar: Do not reuse pages from emergency reserve
Btrfs: fix truncate down when no_holes feature is enabled
virtio_console: fix a crash in config_work_handler
swiotlb-xen: update dev_addr after swapping pages
xen-netfront: Fix Rx stall during network stress and OOM
scsi: virtio_scsi: Reject commands when virtqueue is broken
platform/x86: ideapad-laptop: handle ACPI event 1
amd-xgbe: Check xgbe_init() return code
net: dsa: Check return value of phy_connect_direct()
drm/amdgpu: check ring being ready before using
vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null
virtio_net: fix PAGE_SIZE > 64k
vxlan: do not age static remote mac entries
ibmveth: Add a proper check for the availability of the checksum features
kernel/panic.c: add missing \n
HID: i2c-hid: Add sleep between POWER ON and RESET
scsi: lpfc: avoid double free of resource identifiers
spi: davinci: use dma_mapping_error()
mac80211: initialize SMPS field in HT capabilities
x86/mpx: Use compatible types in comparison to fix sparse error
coredump: Ensure proper size of sparse core files
swiotlb: ensure that page-sized mappings are page-aligned
s390/ctl_reg: make __ctl_load a full memory barrier
be2net: fix status check in be_cmd_pmac_add()
perf probe: Fix to show correct locations for events on modules
net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV
sctp: check af before verify address in sctp_addr_id2transport
ravb: Fix use-after-free on `ifconfig eth0 down`
jump label: fix passing kbuild_cflags when checking for asm goto support
xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY
xfrm: NULL dereference on allocation failure
xfrm: Oops on error in pfkey_msg2xfrm_state()
watchdog: bcm281xx: Fix use of uninitialized spinlock.
sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation
ARM: 8685/1: ensure memblock-limit is pmd-aligned
x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
x86/mm: Fix flush_tlb_page() on Xen
ocfs2: o2hb: revert hb threshold to keep compatible
iommu/vt-d: Don't over-free page table directories
iommu: Handle default domain attach failure
iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
cpufreq: s3c2416: double free on driver init error path
KVM: x86: fix emulation of RSM and IRET instructions
KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
KVM: x86: zero base3 of unusable segments
KVM: nVMX: Fix exception injection
Linux 4.4.76
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 3c226c637b69104f6b9f1c6ec5b08d7b741b3229 upstream.
In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():
  // do_huge_pmd_numa_page                // migrate_misplaced_transhuge_page()
  // Holds 0 refs on page                 // Holds 2 refs on page
  vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
  /* ... */
  if (pmd_trans_migrating(*vmf->pmd)) {
          page = pmd_page(*vmf->pmd);
          spin_unlock(vmf->ptl);
                                          ptl = pmd_lock(mm, pmd);
                                          if (page_count(page) != 2) {
                                                  /* roll back */
                                          }
                                          /* ... */
                                          mlock_migrate_page(new_page, page);
                                          /* ... */
                                          spin_unlock(ptl);
                                          put_page(page);
                                          put_page(page); // page freed here
          wait_on_page_locked(page);
          goto out;
  }
This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.
We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().
When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.
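The effect of taking the reference first can be sketched in a small userspace model. This is only an illustration of the ordering described above, not the kernel change itself: struct model_page and the model_* helpers are hypothetical stand-ins for struct page, get_page()/put_page() and the page_count() check in the migration path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the reference count is modelled. */
struct model_page {
	int refcount;
};

/* Fault side: pin the page before the pmd lock would be dropped. */
void model_fault_pin(struct model_page *page)
{
	page->refcount++;	/* plays the role of get_page() */
}

/* Fault side: drop the pin once waiting on the page lock has finished. */
void model_fault_unpin(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;	/* plays the role of put_page() */
}

/*
 * Migration side: it expects to hold the only two references. Any extra
 * reference makes it roll back instead of freeing the page.
 */
bool model_migrate(struct model_page *page)
{
	if (page->refcount != 2)
		return false;	/* roll back, page stays allocated */

	page->refcount = 0;	/* both references dropped, page freed */
	return true;
}
```

With the pin in place the migration attempt backs off, so the waiter never touches a freed page; once the pin is dropped a later migration attempt can proceed.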
Fixes: b8916634b7 ("mm: Prevent parallel splits during THP migration")
Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 460bcec84e11c75122ace5976214abbc596eb91b upstream.
We got need_resched() warnings in swap_cgroup_swapoff() because
swap_cgroup_ctrl[type].length is particularly large.
Reschedule when needed.
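The general pattern, yielding the CPU at intervals inside a long loop, can be sketched in userspace as follows. This is a model under stated assumptions: model_release_pages() and its batch parameter are hypothetical, free() stands in for freeing a page, and sched_yield() stands in for the kernel's cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Free every entry of a (possibly very long) array, giving up the CPU
 * after each full batch so that other runnable tasks are not starved.
 * Returns the number of yield points that were hit.
 */
size_t model_release_pages(void **pages, size_t length, size_t batch)
{
	size_t yields = 0;

	if (batch == 0)
		batch = 1;	/* avoid a modulo by zero */

	for (size_t i = 0; i < length; i++) {
		free(pages[i]);	/* free(NULL) is a no-op */
		pages[i] = NULL;

		if ((i + 1) % batch == 0) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

The batch size trades scheduling latency against the overhead of yielding; the value to use in a real loop is a tuning decision that this sketch does not settle.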
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704061315270.80559@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* refs/heads/tmp-77ddb50:
UPSTREAM: usb: gadget: f_fs: avoid out of bounds access on comp_desc
Linux 4.4.74
mm: fix new crash in unmapped_area_topdown()
Allow stack to grow up to address space limit
mm: larger stack guard gap, between vmas
alarmtimer: Rate limit periodic intervals
MIPS: Fix bnezc/jialc return address calculation
usb: dwc3: exynos fix axius clock error path to do cleanup
alarmtimer: Prevent overflow of relative timers
genirq: Release resources in __setup_irq() error path
swap: cond_resched in swap_cgroup_prepare()
mm/memory-failure.c: use compound_head() flags for huge pages
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
drivers/misc/c2port/c2port-duramar2150.c: checking for NULL instead of IS_ERR()
usb: r8a66597-hcd: decrease timeout
usb: r8a66597-hcd: select a different endpoint on timeout
USB: gadget: dummy_hcd: fix hub-descriptor removable fields
pvrusb2: reduce stack usage pvr2_eeprom_analyze()
usb: core: fix potential memory leak in error path during hcd creation
USB: hub: fix SS max number of ports
iio: proximity: as3935: recalibrate RCO after resume
staging: rtl8188eu: prevent an underflow in rtw_check_beacon_data()
mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode
x86/mm/32: Set the '__vmalloc_start_set' flag in initmem_init()
serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
mac80211: fix IBSS presp allocation size
mac80211: fix CSA in IBSS mode
mac80211/wpa: use constant time memory comparison for MACs
mac80211: don't look at the PM bit of BAR frames
vb2: Fix an off by one error in 'vb2_plane_vaddr'
cpufreq: conservative: Allow down_threshold to take values from 1 to 10
can: gs_usb: fix memory leak in gs_cmd_reset()
configfs: Fix race between create_link and configfs_rmdir
UPSTREAM: bpf: don't let ldimm64 leak map addresses on unprivileged
BACKPORT: ext4: fix data exposure after a crash
ANDROID: sdcardfs: remove dead function open_flags_to_access_mode()
ANDROID: android-base.cfg: split out arm64-specific configs
Linux 4.4.73
sparc64: make string buffers large enough
s390/kvm: do not rely on the ILC on kvm host protection fauls
xtensa: don't use linux IRQ #0
tipc: ignore requests when the connection state is not CONNECTED
proc: add a schedule point in proc_pid_readdir()
romfs: use different way to generate fsid for BLOCK or MTD
sctp: sctp_addr_id2transport should verify the addr before looking up assoc
r8152: avoid start_xmit to schedule napi when napi is disabled
r8152: fix rtl8152_post_reset function
r8152: re-schedule napi for tx
nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
ravb: unmap descriptors when freeing rings
drm/ast: Fixed system hanged if disable P2A
drm/nouveau: Don't enabling polling twice on runtime resume
parisc, parport_gsc: Fixes for printk continuation lines
net: adaptec: starfire: add checks for dma mapping errors
pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
net/mlx4_core: Avoid command timeouts during VF driver device shutdown
drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
drm/nouveau: prevent userspace from deleting client object
ipv6: fix flow labels when the traffic class is non-0
FS-Cache: Initialise stores_lock in netfs cookie
fscache: Clear outstanding writes when disabling a cookie
fscache: Fix dead object requeue
ethtool: do not vzalloc(0) on registers dump
log2: make order_base_2() behave correctly on const input value zero
kasan: respect /proc/sys/kernel/traceoff_on_warning
jump label: pass kbuild_cflags when checking for asm goto support
PM / runtime: Avoid false-positive warnings from might_sleep_if()
ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches
i2c: piix4: Fix request_region size
sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications
sierra_net: Skip validating irrelevant fields for IDLE LSIs
net: hns: Fix the device being used for dma mapping during TX
NET: mkiss: Fix panic
NET: Fix /proc/net/arp for AX.25
ipv6: Inhibit IPv4-mapped src address on the wire.
ipv6: Handle IPv4-mapped src to in6addr_any dst.
net: xilinx_emaclite: fix receive buffer overflow
net: xilinx_emaclite: fix freezes due to unordered I/O
Call echo service immediately after socket reconnect
staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.
ARM: dts: imx6dl: Fix the VDD_ARM_CAP voltage for 396MHz operation
partitions/msdos: FreeBSD UFS2 file systems are not recognized
s390/vmem: fix identity mapping
usb: gadget: f_fs: Fix possibe deadlock
Conflicts:
drivers/usb/gadget/function/f_fs.c
Change-Id: I23106e9fc2c4f2d0b06acce59b781f6c36487fcc
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
pagetypeinfo_showmixedcount_print is found to take a lot of time to
complete, and it does so while holding the zone lock with interrupts
disabled. In some cases it takes more than a second (on a 2.4GHz, 8Gb
RAM, arm64 CPU). Avoid taking the zone lock, similar to what
read_page_owner does; this means the results may be inaccurate.
Change-Id: I11ec4a3a445d602e47fcc18a3e40480b74ad98af
Link: http://lkml.kernel.org/r/1498045643-12257-1-git-send-email-vinmenon@codeaurora.org
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: zhongjiang <zhongjiang@huawei.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Git-commit: a94b5fd913ac55a32fe05dfba21eb6af0e539781
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
[vinmenon@codeaurora.org: fix trivial merge conflicts]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Merge 4.4.74 into android-4.4
Changes in 4.4.74
configfs: Fix race between create_link and configfs_rmdir
can: gs_usb: fix memory leak in gs_cmd_reset()
cpufreq: conservative: Allow down_threshold to take values from 1 to 10
vb2: Fix an off by one error in 'vb2_plane_vaddr'
mac80211: don't look at the PM bit of BAR frames
mac80211/wpa: use constant time memory comparison for MACs
mac80211: fix CSA in IBSS mode
mac80211: fix IBSS presp allocation size
serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
x86/mm/32: Set the '__vmalloc_start_set' flag in initmem_init()
mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode
staging: rtl8188eu: prevent an underflow in rtw_check_beacon_data()
iio: proximity: as3935: recalibrate RCO after resume
USB: hub: fix SS max number of ports
usb: core: fix potential memory leak in error path during hcd creation
pvrusb2: reduce stack usage pvr2_eeprom_analyze()
USB: gadget: dummy_hcd: fix hub-descriptor removable fields
usb: r8a66597-hcd: select a different endpoint on timeout
usb: r8a66597-hcd: decrease timeout
drivers/misc/c2port/c2port-duramar2150.c: checking for NULL instead of IS_ERR()
usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
mm/memory-failure.c: use compound_head() flags for huge pages
swap: cond_resched in swap_cgroup_prepare()
genirq: Release resources in __setup_irq() error path
alarmtimer: Prevent overflow of relative timers
usb: dwc3: exynos fix axius clock error path to do cleanup
MIPS: Fix bnezc/jialc return address calculation
alarmtimer: Rate limit periodic intervals
mm: larger stack guard gap, between vmas
Allow stack to grow up to address space limit
mm: fix new crash in unmapped_area_topdown()
Linux 4.4.74
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Merge 4.4.73 into android-4.4
Changes in 4.4.73
s390/vmem: fix identity mapping
partitions/msdos: FreeBSD UFS2 file systems are not recognized
ARM: dts: imx6dl: Fix the VDD_ARM_CAP voltage for 396MHz operation
staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.
Call echo service immediately after socket reconnect
net: xilinx_emaclite: fix freezes due to unordered I/O
net: xilinx_emaclite: fix receive buffer overflow
ipv6: Handle IPv4-mapped src to in6addr_any dst.
ipv6: Inhibit IPv4-mapped src address on the wire.
NET: Fix /proc/net/arp for AX.25
NET: mkiss: Fix panic
net: hns: Fix the device being used for dma mapping during TX
sierra_net: Skip validating irrelevant fields for IDLE LSIs
sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications
i2c: piix4: Fix request_region size
ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches
PM / runtime: Avoid false-positive warnings from might_sleep_if()
jump label: pass kbuild_cflags when checking for asm goto support
kasan: respect /proc/sys/kernel/traceoff_on_warning
log2: make order_base_2() behave correctly on const input value zero
ethtool: do not vzalloc(0) on registers dump
fscache: Fix dead object requeue
fscache: Clear outstanding writes when disabling a cookie
FS-Cache: Initialise stores_lock in netfs cookie
ipv6: fix flow labels when the traffic class is non-0
drm/nouveau: prevent userspace from deleting client object
drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
net/mlx4_core: Avoid command timeouts during VF driver device shutdown
gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
net: adaptec: starfire: add checks for dma mapping errors
parisc, parport_gsc: Fixes for printk continuation lines
drm/nouveau: Don't enabling polling twice on runtime resume
drm/ast: Fixed system hanged if disable P2A
ravb: unmap descriptors when freeing rings
nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
r8152: re-schedule napi for tx
r8152: fix rtl8152_post_reset function
r8152: avoid start_xmit to schedule napi when napi is disabled
sctp: sctp_addr_id2transport should verify the addr before looking up assoc
romfs: use different way to generate fsid for BLOCK or MTD
proc: add a schedule point in proc_pid_readdir()
tipc: ignore requests when the connection state is not CONNECTED
xtensa: don't use linux IRQ #0
s390/kvm: do not rely on the ILC on kvm host protection fauls
sparc64: make string buffers large enough
Linux 4.4.73
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit f4cb767d76cf7ee72f97dd76f6cfa6c76a5edc89 upstream.
Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the
end of unmapped_area_topdown(). Linus points out how MAP_FIXED
(which does not have to respect our stack guard gap intentions)
could result in gap_end below gap_start there. Fix that, and
the similar case in its alternative, unmapped_area().
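The kind of check this calls for can be sketched as below. It is a minimal userspace model, not the actual patch: model_gap_fits() is a hypothetical helper, and the point is only that the ordering of gap_start and gap_end is tested before their difference is used, since the subtraction is unsigned.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a candidate gap [gap_start, gap_end) can hold a mapping
 * of the given length. The ordering test comes first: if gap_end were
 * below gap_start, the unsigned subtraction would wrap around and look
 * like a huge gap.
 */
bool model_gap_fits(unsigned long gap_start, unsigned long gap_end,
		    unsigned long length)
{
	return gap_end > gap_start && gap_end - gap_start >= length;
}
```

An inverted gap is simply rejected as a candidate instead of being treated as a bug.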
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Debugged-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd726c90b6b8ce87602208701b208a208e6d5600 upstream.
Fix expand_upwards() on architectures with an upward-growing stack (parisc,
metag and partly IA-64) to allow the stack to reliably grow exactly up to
the address space limit given by TASK_SIZE.
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs like gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
of no stack size limit, because those applications can be tricked into
consuming a large portion of the stack, and a single glibc call could
jump over the guard page. These attacks are not theoretical,
unfortunately.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and the vma tree's subtree_gap support for that.
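How a gap-aware start address might be computed can be sketched in userspace as follows. This is a model under stated assumptions: struct model_vma, MODEL_VM_GROWSDOWN and MODEL_PAGE_SHIFT are hypothetical stand-ins, and the gap is fixed at 256 pages, which is the 1MB figure for 4k pages mentioned above.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT	12UL	/* assumes 4k pages */
#define MODEL_VM_GROWSDOWN	0x1UL	/* hypothetical flag value */

/* Hypothetical stand-in for the fields of a vma that matter here. */
struct model_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
};

/* Guard gap in bytes: 256 pages, i.e. 1MB with 4k pages. */
unsigned long model_stack_guard_gap = 256UL << MODEL_PAGE_SHIFT;

/*
 * Start address that neighbouring mappings must stay below: for a
 * downward-growing stack this is vm_start minus the guard gap, clamped
 * to 0 if the subtraction wraps around.
 */
unsigned long model_vm_start_gap(const struct model_vma *vma)
{
	unsigned long start = vma->vm_start;

	if (vma->vm_flags & MODEL_VM_GROWSDOWN) {
		start -= model_stack_guard_gap;
		if (start > vma->vm_start)	/* wrapped below zero */
			start = 0;
	}
	return start;
}
```

Because the gap lives outside the vma, the stack vma's own size is unchanged, which is why limits such as RLIMIT_AS are not charged for the gap.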
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[wt: backport to 4.11: adjust context]
[wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide]
[wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
Signed-off-by: Willy Tarreau <w@1wt.eu>
[gkh: minor build fixes for 4.4]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef70762948dde012146926720b70e79736336764 upstream.
I saw need_resched() warnings when swapping on a large swapfile (TBs)
because continuously allocating many pages in swap_cgroup_prepare() took
too long.
We already cond_resched when freeing page in swap_cgroup_swapoff(). Do
the same for the page allocation.
Link: http://lkml.kernel.org/r/20170604200109.17606-1-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7258ae5c5a2ce2f5969e8b18b881be40ab55433d upstream.
memory_failure() chooses a recovery action function based on the page
flags. For huge pages it uses the tail page flags which don't have
anything interesting set, resulting in:
> Memory failure: 0x9be3b4: Unknown page state
> Memory failure: 0x9be3b4: recovery action for unknown page: Failed
Instead, save a copy of the head page's flags if this is a huge page.
This means that if there are no relevant flags for this tail page, we use
the head page's flags instead. This results in the me_huge_page() recovery
action being called:
> Memory failure: 0x9b7969: recovery action for huge page: Delayed
For hugepages that have not yet been allocated, this allows the hugepage
to be dequeued.
Fixes: 524fca1e73 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages")
Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* refs/heads/tmp-e76c0fa:
Linux 4.4.72
arm64: ensure extension of smp_store_release value
arm64: armv8_deprecated: ensure extension of addr
usercopy: Adjust tests to deal with SMAP/PAN
RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
arm64: entry: improve data abort handling of tagged pointers
arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
Make __xfs_xattr_put_listen preperly report errors.
NFSv4: Don't perform cached access checks before we've OPENed the file
NFS: Ensure we revalidate attributes before using execute_ok()
mm: consider memblock reservations for deferred memory initialization sizing
net: better skb->sender_cpu and skb->napi_id cohabitation
serial: sh-sci: Fix panic when serial console and DMA are enabled
tty: Drop krefs for interrupted tty lock
drivers: char: mem: Fix wraparound check to allow mappings up to the end
ASoC: Fix use-after-free at card unregistration
ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
ALSA: timer: Fix race between read and ioctl
drm/nouveau/tmr: fully separate alarm execution/pending lists
drm/vmwgfx: Make sure backup_handle is always valid
drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()
perf/core: Drop kernel samples even though :u is specified
powerpc/hotplug-mem: Fix missing endian conversion of aa_index
powerpc/numa: Fix percpu allocations to be NUMA aware
powerpc/eeh: Avoid use after free in eeh_handle_special_event()
scsi: qla2xxx: don't disable a not previously enabled PCI device
KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages
btrfs: fix memory leak in update_space_info failure path
btrfs: use correct types for page indices in btrfs_page_exists_in_range
cxl: Fix error path on bad ioctl
ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
ufs: set correct ->s_maxsize
ufs: restore maintaining ->i_blocks
fix ufs_isblockset()
ufs: restore proper tail allocation
fs: add i_blocksize()
cpuset: consider dying css as offline
Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled
drm/msm: Expose our reservation object when exporting a dmabuf.
target: Re-add check to reject control WRITEs with overflow data
cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms
random: properly align get_random_int_hash
drivers: char: random: add get_random_long()
iio: proximity: as3935: fix AS3935_INT mask
iio: light: ltr501 Fix interchanged als/ps register field
staging/lustre/lov: remove set_fs() call from lov_getstripe()
usb: chipidea: debug: check before accessing ci_role
usb: chipidea: udc: fix NULL pointer dereference if udc_start failed
usb: gadget: f_mass_storage: Serialize wake and sleep execution
ext4: fix fdatasync(2) after extent manipulation operations
ext4: keep existing extra fields when inode expands
ext4: fix SEEK_HOLE
xen-netfront: cast grant table reference first to type int
xen-netfront: do not cast grant table reference to signed short
xen/privcmd: Support correctly 64KB page granularity when mapping memory
dmaengine: ep93xx: Always start from BASE0
dmaengine: usb-dmac: Fix DMAOR AE bit definition
KVM: async_pf: avoid async pf injection when in guest mode
arm: KVM: Allow unaligned accesses at HYP
KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation
kvm: async_pf: fix rcu_irq_enter() with irqs enabled
nfsd: Fix up the "supattr_exclcreat" attributes
nfsd4: fix null dereference on replay
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
crypto: gcm - wait for crypto op not signal safe
KEYS: fix freeing uninitialized memory in key_update()
KEYS: fix dereferencing NULL payload with nonzero length
ptrace: Properly initialize ptracer_cred on fork
serial: ifx6x60: fix use-after-free on module unload
arch/sparc: support NR_CPUS = 4096
sparc64: delete old wrap code
sparc64: new context wrap
sparc64: add per-cpu mm of secondary contexts
sparc64: redefine first version
sparc64: combine activate_mm and switch_mm
sparc64: reset mm cpumask after wrap
sparc: Machine description indices can vary
sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
net: bridge: start hello timer only if device is up
net: ethoc: enable NAPI before poll may be scheduled
net: ping: do not abuse udp_poll()
ipv6: Fix leak in ipv6_gso_segment().
vxlan: fix use-after-free on deletion
tcp: disallow cwnd undo when switching congestion control
cxgb4: avoid enabling napi twice to the same queue
ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt()
bnx2x: Fix Multi-Cos
ANDROID: uid_sys_stats: check previous uid_entry before call find_or_register_uid
ANDROID: sdcardfs: d_splice_alias can return error values
Change-Id: I829ebf1a9271dcf0462c537e7bfcbcfde322f336
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
* refs/heads/tmp-6fc0573:
Linux 4.4.71
xfs: only return -errno or success from attr ->put_listent
xfs: in _attrlist_by_handle, copy the cursor back to userspace
xfs: fix unaligned access in xfs_btree_visit_blocks
xfs: bad assertion for delalloc an extent that start at i_size
xfs: fix indlen accounting error on partial delalloc conversion
xfs: wait on new inodes during quotaoff dquot release
xfs: update ag iterator to support wait on new inodes
xfs: support ability to wait on new inodes
xfs: fix up quotacheck buffer list error handling
xfs: prevent multi-fsb dir readahead from reading random blocks
xfs: handle array index overrun in xfs_dir2_leaf_readbuf()
xfs: fix over-copying of getbmap parameters from userspace
xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
xfs: Fix missed holes in SEEK_HOLE implementation
mlock: fix mlock count can not decrease in race condition
mm/migrate: fix refcount handling when !hugepage_migration_supported()
drm/gma500/psb: Actually use VBT mode when it is found
slub/memcg: cure the brainless abuse of sysfs attributes
ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430
pcmcia: remove left-over %Z format
drm/radeon: Unbreak HPD handling for r600+
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
scsi: mpt3sas: Force request partial completion alignment
HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference
mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
i2c: i2c-tiny-usb: fix buffer not being DMA capable
vlan: Fix tcp checksum offloads in Q-in-Q vlans
net: phy: marvell: Limit errata to 88m1101
netem: fix skb_orphan_partial()
ipv4: add reference counting to metrics
sctp: fix ICMP processing if skb is non-linear
tcp: avoid fastopen API to be used on AF_UNSPEC
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
be2net: Fix offload features for Q-in-Q packets
ipv6: fix out of bound writes in __ip6_append_data()
bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
qmi_wwan: add another Lenovo EM74xx device ID
bridge: netlink: check vlan_default_pvid range
ipv6: Check ip6_find_1stfragopt() return value properly.
ipv6: Prevent overrun when parsing v6 header options
net: Improve handling of failures on link and route dumps
tcp: eliminate negative reordering in tcp_clean_rtx_queue
sctp: do not inherit ipv6_{mc|ac|fl}_list from parent
sctp: fix src address selection if using secondary addresses for ipv6
tcp: avoid fragmenting peculiar skbs in SACK
s390/qeth: avoid null pointer dereference on OSN
s390/qeth: unbreak OSM and OSN support
s390/qeth: handle sysfs error during initialization
ipv6/dccp: do not inherit ipv6_mc_list from parent
dccp/tcp: do not inherit mc_list from parent
sparc: Fix -Wstringop-overflow warning
android: base-cfg: disable CONFIG_NFS_FS and CONFIG_NFSD
schedstats/eas: guard properly to avoid breaking non-smp schedstats users
BACKPORT: f2fs: sanity check size of nat and sit cache
FROMLIST: f2fs: sanity check checkpoint segno and blkoff
sched/tune: don't use schedtune before it is ready
sched/fair: use SCHED_CAPACITY_SCALE for energy normalization
sched/{fair,tune}: use reciprocal_value to compute boost margin
sched/tune: Initialize raw_spin_lock in boosted_groups
sched/tune: report when SchedTune has not been initialized
sched/tune: fix sched_energy_diff tracepoint
sched/tune: increase group count to 5
cpufreq/schedutil: use boosted_cpu_util for PELT to match WALT
sched/fair: Fix sched_group_energy() to support per-cpu capacity states
sched/fair: discount task contribution to find CPU with lowest utilization
sched/fair: ensure utilization signals are synchronized before use
sched/fair: remove task util from own cpu when placing waking task
trace:sched: Make util_avg in load_avg trace reflect PELT/WALT as used
sched/fair: Add eas (& cas) specific rq, sd and task stats
sched/core: Fix PELT jump to max OPP upon util increase
sched: EAS & 'single cpu per cluster'/cpu hotplug interoperability
UPSTREAM: sched/core: Fix group_entity's share update
UPSTREAM: sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion
UPSTREAM: sched/fair: Fix incorrect task group ->load_avg
UPSTREAM: sched/fair: Fix effective_load() to consistently use smoothed load
UPSTREAM: sched/fair: Propagate asynchrous detach
UPSTREAM: sched/fair: Propagate load during synchronous attach/detach
UPSTREAM: sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
BACKPORT: sched/fair: Factorize PELT update
UPSTREAM: sched/fair: Factorize attach/detach entity
UPSTREAM: sched/fair: Improve PELT stuff some more
UPSTREAM: sched/fair: Apply more PELT fixes
UPSTREAM: sched/fair: Fix post_init_entity_util_avg() serialization
BACKPORT: sched/fair: Initiate a new task's util avg to a bounded value
sched/fair: Simplify idle_idx handling in select_idle_sibling()
sched/fair: refactor find_best_target() for simplicity
sched/fair: Change cpu iteration order in find_best_target()
sched/core: Add first cpu w/ max/min orig capacity to root domain
sched/core: Remove remnants of commit fd5c98da1a42
sched: Remove sysctl_sched_is_big_little
sched/fair: Code !is_big_little path into select_energy_cpu_brute()
EAS: sched/fair: Re-integrate 'honor sync wakeups' into wakeup path
Fixup!: sched/fair.c: Set SchedTune specific struct energy_env.task
sched/fair: Energy-aware wake-up task placement
sched/fair: Add energy_diff dead-zone margin
sched/fair: Decommission energy_aware_wake_cpu()
sched/fair: Do not force want_affine eq. true if EAS is enabled
arm64: Set SD_ASYM_CPUCAPACITY sched_domain flag on DIE level
UPSTREAM: sched/fair: Fix incorrect comment for capacity_margin
UPSTREAM: sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups
UPSTREAM: sched/fair: Add per-CPU min capacity to sched_group_capacity
UPSTREAM: sched/fair: Consider spare capacity in find_idlest_group()
UPSTREAM: sched/fair: Compute task/cpu utilization at wake-up correctly
UPSTREAM: sched/fair: Let asymmetric CPU configurations balance at wake-up
UPSTREAM: sched/core: Enable SD_BALANCE_WAKE for asymmetric capacity systems
UPSTREAM: sched/core: Pass child domain into sd_init()
UPSTREAM: sched/core: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag
UPSTREAM: sched/core: Remove unnecessary NULL-pointer check
UPSTREAM: sched/fair: Optimize find_idlest_cpu() when there is no choice
BACKPORT: sched/fair: Make the use of prev_cpu consistent in the wakeup path
UPSTREAM: sched/core: Fix power to capacity renaming in comment
Partial Revert: "WIP: sched: Add cpu capacity awareness to wakeup balancing"
Revert "WIP: sched: Consider spare cpu capacity at task wake-up"
FROM-LIST: cpufreq: schedutil: Redefine the rate_limit_us tunable
cpufreq: schedutil: add up/down frequency transition rate limits
trace/sched: add rq utilization signal for WALT
sched/cpufreq: make schedutil use WALT signal
sched: cpufreq: use rt_avg as estimate of required RT CPU capacity
cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
BACKPORT: kthread: allow to cancel kthread work
sched/cpufreq: fix tunables for schedfreq governor
BACKPORT: cpufreq: schedutil: New governor based on scheduler utilization data
sched: backport cpufreq hooks from 4.9-rc4
ANDROID: Kconfig: add depends for UID_SYS_STATS
ANDROID: hid: uhid: implement refcount for open and close
Revert "ext4: require encryption feature for EXT4_IOC_SET_ENCRYPTION_POLICY"
ANDROID: mnt: Fix next_descendent
Conflicts:
include/trace/events/sched.h
kernel/sched/Makefile
kernel/sched/core.c
kernel/sched/fair.c
kernel/sched/sched.h
Change-Id: I55318828f2c858e192ac7015bcf2bf0ec5c5b2c5
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
Use cond_resched_lock to avoid holding the vmap_area_lock for a
potentially long time and thus creating bad latencies for various
workloads.
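
The lock-break idea described above can be sketched in user space. This is only an illustration under stated assumptions: `area_lock`, `purge_items()` and `lock_breaks` are hypothetical names, a pthread mutex stands in for vmap_area_lock, and the lock is dropped unconditionally every `batch` items, whereas cond_resched_lock() in the kernel only drops the lock when a reschedule is pending or the lock is contended.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for vmap_area_lock. */
static pthread_mutex_t area_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts how many times the lock was dropped and retaken. */
static size_t lock_breaks;

/*
 * Process n items under area_lock, but release the lock after every
 * `batch` items so that other waiters are not starved for the whole run.
 */
static size_t purge_items(int *items, size_t n, size_t batch)
{
	size_t done = 0;

	assert(batch > 0);
	pthread_mutex_lock(&area_lock);
	for (size_t i = 0; i < n; i++) {
		items[i] = 0;	/* "free" the item */
		done++;
		if (done % batch == 0 && i + 1 < n) {
			/* Lock break: let others run, then continue. */
			pthread_mutex_unlock(&area_lock);
			sched_yield();
			pthread_mutex_lock(&area_lock);
			lock_breaks++;
		}
	}
	pthread_mutex_unlock(&area_lock);
	return done;
}
```

Any state that the loop relies on must stay valid across the lock break, since other threads may modify the protected data while the lock is released.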
Change-Id: I36eb4d8dbd6604f52e5c463373a9754847a44bc2
[hch: split from a larger patch by Joel, wrote the crappy changelog]
Link: http://lkml.kernel.org/r/1479474236-4139-11-git-send-email-hch@lst.de
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 763b218ddfaf56761c19923beb7e16656f66ec62
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
The purge_lock spinlock causes high latencies on non-RT kernels. This
has been reported multiple times on lkml [1] [2] and affects
applications like audio.
This patch replaces it with a mutex to allow preemption while holding
the lock.
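
As a rough user-space analogy of a purge path guarded by a sleeping lock, the sketch below uses a pthread mutex. All names (`purge_mutex`, `try_purge()`, `purge()`, `do_purge()`) are hypothetical, and the split into an opportunistic and a mandatory path is an assumption made for illustration, not something this changelog states.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the purge lock, here a sleeping mutex. */
static pthread_mutex_t purge_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int purge_runs;

/* Placeholder for the real purge work. */
static void do_purge(void)
{
	purge_runs++;
}

/* Opportunistic path: skip the purge if the lock is already held. */
static bool try_purge(void)
{
	if (pthread_mutex_trylock(&purge_mutex) != 0)
		return false;
	do_purge();
	pthread_mutex_unlock(&purge_mutex);
	return true;
}

/* Mandatory path: sleep until the lock is free, then purge. */
static void purge(void)
{
	int rc = pthread_mutex_lock(&purge_mutex);

	assert(rc == 0);
	(void)rc;
	do_purge();
	pthread_mutex_unlock(&purge_mutex);
}
```

A waiter blocked in `purge()` sleeps instead of spinning, so the holder can be preempted without other CPUs burning time on the lock.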
Thanks to Joel Fernandes for the detailed report and analysis as well as
an earlier attempt at fixing this issue.
[1] http://lists.openwall.net/linux-kernel/2016/03/23/29
[2] https://lkml.org/lkml/2016/10/9/59
Change-Id: I57d4e9c7ce5aeb3273574026da2a8b737ef0b809
Link: http://lkml.kernel.org/r/1479474236-4139-10-git-send-email-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: f9e09977671b618aeb25ddc0d4c9a84d5b5cde9d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
We will take a sleeping lock later in this series, so this adds the
proper safeguards.
Change-Id: Iba7efcb690ad584a30ac31cfb7937889bab44e2e
Link: http://lkml.kernel.org/r/1479474236-4139-9-git-send-email-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 5803ed292e63a1bf00722d6655d0229794607183
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
We are going to use a sleeping lock for freeing vmap areas. However, some
vfree() users want to free memory from atomic (but not from interrupt)
context. For this we add vfree_atomic() - a deferred variant of vfree()
which can be used in any atomic context (except NMIs).
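
The deferred-free pattern can be sketched in plain C as follows. This is a single-threaded illustration, not the kernel implementation: `free_deferred()`, `drain_deferred()` and `pending` are hypothetical names, and the sketch assumes every buffer is at least as large as one pointer because the list link is stored inside the buffer itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Link stored inside the buffer that is waiting to be freed. */
struct deferred_node {
	struct deferred_node *next;
};

/* Hypothetical pending list; this sketch is single-threaded. */
static struct deferred_node *pending;

/* Queue p for a later free. Does not sleep and takes no lock. */
static void free_deferred(void *p)
{
	struct deferred_node *node = p;

	if (!p)
		return;
	node->next = pending;
	pending = node;
}

/*
 * Called later from a context that is allowed to sleep.
 * Returns the number of buffers that were actually freed.
 */
static size_t drain_deferred(void)
{
	size_t freed = 0;

	while (pending) {
		struct deferred_node *node = pending;

		pending = node->next;
		free(node);
		freed++;
	}
	return freed;
}
```

The caller in atomic context only pays for a pointer store; the expensive part of the free runs later, where taking a sleeping lock is permitted.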
[akpm@linux-foundation.org: tweak comment grammar]
[aryabinin@virtuozzo.com: use raw_cpu_ptr() instead of this_cpu_ptr()]
Link: http://lkml.kernel.org/r/1481553981-3856-1-git-send-email-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/1479474236-4139-5-git-send-email-hch@lst.de
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Jisheng Zhang <jszhang@marvell.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: bf22e37a641327e34681b7b6959d9646e3886770
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I5f67e939774da6e811f3a5180a6b0f5d31fbe32b
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Move the purge_lock synchronization to the callers, move the call to
purge_fragmented_blocks_allcpus at the beginning of the function to the
callers that need it, move the force_flush behavior to the caller that
needs it, and pass start and end by value instead of by reference.
No change in behavior.
Change-Id: I6344f3c1de50e6fe939e886edeca610d6b539365
Link: http://lkml.kernel.org/r/1479474236-4139-4-git-send-email-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 0574ecd141df28d573d4364adec59766ddf5f38d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Just inline it into the only caller.
Change-Id: I5691805a6cec3e9e160b653551a99c6c998ff087
Link: http://lkml.kernel.org/r/1479474236-4139-3-git-send-email-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 9c3acf6043ac437ae0a45de4657ee700c3dc8850
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Patch series "reduce latency in __purge_vmap_area_lazy", v2.
This patch (of 10):
Sort out the long lock hold times in __purge_vmap_area_lazy. It is
based on a patch from Joel.
Inline free_unmap_vmap_area_noflush() into the only caller.
Change-Id: I1cb90a5f4e14bae7b513da9cc672f2f8d06bfcfd
Link: http://lkml.kernel.org/r/1479474236-4139-2-git-send-email-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jisheng Zhang <jszhang@marvell.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: c8eef01e2f98e09a6733f2acdc675b4cf87a22a1
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
When mixing lots of vmallocs and set_memory_*() (which calls
vm_unmap_aliases()) I encountered situations where the performance
degraded severely because the entire vmap_area list is walked on each
invocation.
One simple improvement is to add the lazily freed vmap_area to a
separate lockless free list, such that we then avoid having to walk the
full list on each purge.
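
A minimal sketch of such a lockless list, written with C11 atomics rather than the kernel's own list primitives. The names `lazy_node`, `lazy_add()`, `lazy_take_all()` and `lazy_count()` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct lazy_node {
	struct lazy_node *next;
	int id;
};

/* Hypothetical head of the "lazily freed" list; starts out empty. */
static _Atomic(struct lazy_node *) lazy_head;

/* Lock-free push: retry the compare-and-swap until it succeeds. */
static void lazy_add(struct lazy_node *node)
{
	struct lazy_node *old;

	assert(node != NULL);
	old = atomic_load(&lazy_head);
	do {
		node->next = old;
	} while (!atomic_compare_exchange_weak(&lazy_head, &old, node));
}

/* Detach the whole list with one atomic exchange and return it. */
static struct lazy_node *lazy_take_all(void)
{
	return atomic_exchange(&lazy_head, (struct lazy_node *)NULL);
}

/* Count the nodes on an already detached list. */
static size_t lazy_count(const struct lazy_node *list)
{
	size_t n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}
```

The purge side only walks what `lazy_take_all()` returned, so its cost scales with the number of lazily freed entries rather than with the size of the full list.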
Change-Id: I489700962fc86d539a68b5af489dfa9da04dfaad
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Roman Pen <r.peniaev@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Roman Pen <r.peniaev@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 80c4bd7a5e4368b680e0aeb57050a1b06eb573d8
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
[ Upstream commit 4f40c6e5627ea73b4e7c615c59631f38cc880885 ]
After much waiting I finally reproduced a KASAN issue, only to find my
trace-buffer empty of useful information because it got spooled out :/
Make kasan_report honour the /proc/sys/kernel/traceoff_on_warning
interface.
Link: http://lkml.kernel.org/r/20170125164106.3514-1-aryabinin@virtuozzo.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAllBIXAACgkQONu9yGCS
aT6T+w//VjXDZ+MddWJ4UeQDyIANYeFpa4tJNoqR3JsnT6yg1HODRZDR7aP5QJmN
GIoRWU/2Q2nmYbAO0c8RPxs07w2xtIZzTUn+H+i6sG7bRs5RbLM5AMg4W/A/X88L
V5c34kCvCf1HRfrdd4rXIZiibFnSZGqUv6o1YyQqCIvx15pyB6elMM714zt8uubk
iL4/WJ2M4SrmamHWA349ldEtPjQKpwpwdBcCn+M4awbimdc0pm8oZqNkAfwJ+vLO
HsuClO57I699ESU2Zt5bfEdVsW/gc7WiJOAr1Mrl2suToryrWfs2YT+sC/IQhkfC
gUsi9Cm/6YMu+tiP4o6aqYvTFoFplFErpEbC3mqAEvHGGHKhrgEDotYJ+FnvI3q7
Jaxix0B/Q/NIqsJPnqe5ONOCKFmW7rGR2e2j5+45GuiofioNVNF12HWfQkoItPOL
YeR2JB8K9aywzYM4gaJuy8ScJ1shN8TY1FKgZa5gBT2ym4pDDcQmxz7Jr7agREHe
F2sJ23zMU+o9guGA4Is2yqWCQ5yM+3kpPPISz+Pcgh8Q95o+ftCSyOeB2F5roW8I
EO22AlJPlQH0LWDQhOJ5ZuAVe+qB8EdrQqqdLbP4/oHp7MtlR5ge+idRuZc+AUsa
UoASccPsEwHyBErQmHoWNI4nPRciFrKliOqERmPLcuzewUwSatw=
=wXRR
-----END PGP SIGNATURE-----
Merge 4.4.72 into android-4.4
Changes in 4.4.72
bnx2x: Fix Multi-Cos
ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt()
cxgb4: avoid enabling napi twice to the same queue
tcp: disallow cwnd undo when switching congestion control
vxlan: fix use-after-free on deletion
ipv6: Fix leak in ipv6_gso_segment().
net: ping: do not abuse udp_poll()
net: ethoc: enable NAPI before poll may be scheduled
net: bridge: start hello timer only if device is up
sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
sparc: Machine description indices can vary
sparc64: reset mm cpumask after wrap
sparc64: combine activate_mm and switch_mm
sparc64: redefine first version
sparc64: add per-cpu mm of secondary contexts
sparc64: new context wrap
sparc64: delete old wrap code
arch/sparc: support NR_CPUS = 4096
serial: ifx6x60: fix use-after-free on module unload
ptrace: Properly initialize ptracer_cred on fork
KEYS: fix dereferencing NULL payload with nonzero length
KEYS: fix freeing uninitialized memory in key_update()
crypto: gcm - wait for crypto op not signal safe
drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)
nfsd4: fix null dereference on replay
nfsd: Fix up the "supattr_exclcreat" attributes
kvm: async_pf: fix rcu_irq_enter() with irqs enabled
KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation
arm: KVM: Allow unaligned accesses at HYP
KVM: async_pf: avoid async pf injection when in guest mode
dmaengine: usb-dmac: Fix DMAOR AE bit definition
dmaengine: ep93xx: Always start from BASE0
xen/privcmd: Support correctly 64KB page granularity when mapping memory
xen-netfront: do not cast grant table reference to signed short
xen-netfront: cast grant table reference first to type int
ext4: fix SEEK_HOLE
ext4: keep existing extra fields when inode expands
ext4: fix fdatasync(2) after extent manipulation operations
usb: gadget: f_mass_storage: Serialize wake and sleep execution
usb: chipidea: udc: fix NULL pointer dereference if udc_start failed
usb: chipidea: debug: check before accessing ci_role
staging/lustre/lov: remove set_fs() call from lov_getstripe()
iio: light: ltr501 Fix interchanged als/ps register field
iio: proximity: as3935: fix AS3935_INT mask
drivers: char: random: add get_random_long()
random: properly align get_random_int_hash
stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms
cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
target: Re-add check to reject control WRITEs with overflow data
drm/msm: Expose our reservation object when exporting a dmabuf.
Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled
cpuset: consider dying css as offline
fs: add i_blocksize()
ufs: restore proper tail allocation
fix ufs_isblockset()
ufs: restore maintaining ->i_blocks
ufs: set correct ->s_maxsize
ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
cxl: Fix error path on bad ioctl
btrfs: use correct types for page indices in btrfs_page_exists_in_range
btrfs: fix memory leak in update_space_info failure path
KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages
scsi: qla2xxx: don't disable a not previously enabled PCI device
powerpc/eeh: Avoid use after free in eeh_handle_special_event()
powerpc/numa: Fix percpu allocations to be NUMA aware
powerpc/hotplug-mem: Fix missing endian conversion of aa_index
perf/core: Drop kernel samples even though :u is specified
drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()
drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
drm/vmwgfx: Make sure backup_handle is always valid
drm/nouveau/tmr: fully separate alarm execution/pending lists
ALSA: timer: Fix race between read and ioctl
ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
ASoC: Fix use-after-free at card unregistration
drivers: char: mem: Fix wraparound check to allow mappings up to the end
tty: Drop krefs for interrupted tty lock
serial: sh-sci: Fix panic when serial console and DMA are enabled
net: better skb->sender_cpu and skb->napi_id cohabitation
mm: consider memblock reservations for deferred memory initialization sizing
NFS: Ensure we revalidate attributes before using execute_ok()
NFSv4: Don't perform cached access checks before we've OPENed the file
Make __xfs_xattr_put_listen preperly report errors.
arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
arm64: entry: improve data abort handling of tagged pointers
RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
usercopy: Adjust tests to deal with SMAP/PAN
arm64: armv8_deprecated: ensure extension of addr
arm64: ensure extension of smp_store_release value
Linux 4.4.72
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 864b9a393dcb5aed09b8fd31b9bbda0fdda99374 upstream.
We have seen an early OOM killer invocation on ppc64 systems with
crashkernel=4096M:
kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
kthreadd cpuset=/ mems_allowed=7
CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
Call Trace:
dump_stack+0xb0/0xf0 (unreliable)
dump_header+0xb0/0x258
out_of_memory+0x5f0/0x640
__alloc_pages_nodemask+0xa8c/0xc80
kmem_getpages+0x84/0x1a0
fallback_alloc+0x2a4/0x320
kmem_cache_alloc_node+0xc0/0x2e0
copy_process.isra.25+0x260/0x1b30
_do_fork+0x94/0x470
kernel_thread+0x48/0x60
kthreadd+0x264/0x330
ret_from_kernel_thread+0x5c/0xa4
Mem-Info:
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:5 slab_unreclaimable:73
mapped:0 shmem:0 pagetables:0 bounce:0
free:0 free_pcp:0 free_cma:0
Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
0 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
819200 pages RAM
0 pages HighMem/MovableOnly
817481 pages reserved
0 pages cma reserved
0 pages hwpoisoned
The reason is that the managed memory is too low (only 110MB) while the
rest of the 50GB is still waiting for the deferred initialization to
be done. update_defer_init estimates the initial memory to initialize
to 2GB at least but it doesn't consider any memory allocated in that
range. In this particular case we've had
Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)
so the low 2GB is mostly depleted.
Fix this by considering memblock allocations in the initial static
initialization estimation. Move the max_initialise to
reset_deferred_meminit and implement a simple memblock_reserved_memory
helper which iterates all reserved blocks and sums the size of all that
start below the given address. The cumulative size is then added on top
of the initial estimation. This is still not ideal because
reset_deferred_meminit doesn't consider holes and so reservation might
be above the initial estimation which we ignore but let's make the
logic simpler until we really need to handle more complicated cases.
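
The summation described above can be sketched as a standalone function. `struct reserved_region` and `reserved_below()` are hypothetical stand-ins for the memblock data and the helper, not the actual kernel code. A region that starts below the limit is counted in full, even if it extends past it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one reserved memblock region. */
struct reserved_region {
	uint64_t base;	/* start address in bytes */
	uint64_t size;	/* length in bytes */
};

/* Sum the sizes of all regions that start below `limit`. */
static uint64_t reserved_below(const struct reserved_region *regions,
			       size_t count, uint64_t limit)
{
	uint64_t total = 0;

	assert(regions != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (regions[i].base < limit)
			total += regions[i].size;
	}
	return total;
}
```

With the crashkernel reservation from the report (4096MB at 128MB) and a 2GB initial estimate, the function would return 4096MB, which is then added to the amount of memory initialized up front.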
Fixes: 3a80a7fa79 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93407472a21b82f39c955ea7787e5bc7da100642 upstream.
Replace all 1 << inode->i_blkbits and (1 << inode->i_blkbits) in fs
branch.
This patch also fixes multiple checkpatch warnings: WARNING: Prefer
'unsigned int' to bare use of 'unsigned'
Thanks to Andrew Morton for suggesting a more appropriate function instead
of a macro.
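
For illustration, a helper of this shape could look like the sketch below. `struct inode_sketch` and `blocksize_of()` are hypothetical stand-ins; only the `i_blkbits` field name is taken from the text above.

```c
#include <assert.h>

/* Hypothetical stand-in carrying only the field the helper reads. */
struct inode_sketch {
	unsigned int i_blkbits;	/* log2 of the block size in bytes */
};

/* Return the block size in bytes, replacing open-coded shifts. */
static inline unsigned int blocksize_of(const struct inode_sketch *inode)
{
	assert(inode->i_blkbits < 32);	/* keep the shift well defined */
	return 1U << inode->i_blkbits;
}
```

Compared with a macro, an inline function gives the argument a checked type and a single, searchable name for the computation.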
[geliangtang@gmail.com: truncate: use i_blocksize()]
Link: http://lkml.kernel.org/r/9c8b2cd83c8f5653805d43debde9fa8817e02fc4.1484895804.git.geliangtang@gmail.com
Link: http://lkml.kernel.org/r/1481319905-10126-1-git-send-email-fabf@skynet.be
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* refs/heads/tmp-9bc4622:
Linux 4.4.70
drivers: char: mem: Check for address space wraparound with mmap()
nfsd: encoders mustn't use unitialized values in error cases
drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2
PCI: Freeze PME scan before suspending devices
PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms
tracing/kprobes: Enforce kprobes teardown after testing
osf_wait4(): fix infoleak
genirq: Fix chained interrupt data ordering
uwb: fix device quirk on big-endian hosts
metag/uaccess: Check access_ok in strncpy_from_user
metag/uaccess: Fix access_ok()
iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings
staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD.
staging: rtl8192e: fix 2 byte alignment of register BSSIDR.
mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp
xc2028: Fix use-after-free bug properly
arm64: documentation: document tagged pointer stack constraints
arm64: uaccess: ensure extension of access_ok() addr
arm64: xchg: hazard against entire exchange variable
ARM: dts: at91: sama5d3_xplained: not all ADC channels are available
ARM: dts: at91: sama5d3_xplained: fix ADC vref
powerpc/64e: Fix hang when debugging programs with relocated kernel
powerpc/pseries: Fix of_node_put() underflow during DLPAR remove
powerpc/book3s/mce: Move add_taint() later in virtual mode
cx231xx-cards: fix NULL-deref at probe
cx231xx-audio: fix NULL-deref at probe
cx231xx-audio: fix init error path
dvb-frontends/cxd2841er: define symbol_rate_min/max in T/C fe-ops
zr364xx: enforce minimum size when reading header
dib0700: fix NULL-deref at probe
s5p-mfc: Fix unbalanced call to clock management
gspca: konica: add missing endpoint sanity check
ceph: fix recursion between ceph_set_acl() and __ceph_setattr()
iio: proximity: as3935: fix as3935_write
ipx: call ipxitf_put() in ioctl error path
USB: hub: fix non-SS hub-descriptor handling
USB: hub: fix SS hub-descriptor handling
USB: serial: io_ti: fix div-by-zero in set_termios
USB: serial: mct_u232: fix big-endian baud-rate handling
USB: serial: qcserial: add more Lenovo EM74xx device IDs
usb: serial: option: add Telit ME910 support
USB: iowarrior: fix info ioctl on big-endian hosts
usb: musb: tusb6010_omap: Do not reset the other direction's packet size
ttusb2: limit messages to buffer size
mceusb: fix NULL-deref at probe
usbvision: fix NULL-deref at probe
net: irda: irda-usb: fix firmware name on big-endian hosts
usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
usb: host: xhci-plat: propagate return value of platform_get_irq()
sched/fair: Initialize throttle_count for new task-groups lazily
sched/fair: Do not announce throttled next buddy in dequeue_task_fair()
fscrypt: avoid collisions when presenting long encrypted filenames
f2fs: check entire encrypted bigname when finding a dentry
fscrypt: fix context consistency check when key(s) unavailable
net: qmi_wwan: Add SIMCom 7230E
ext4 crypto: fix some error handling
ext4 crypto: don't let data integrity writebacks fail with ENOMEM
USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs
USB: serial: ftdi_sio: fix setting latency for unprivileged users
pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes()
pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes
iio: dac: ad7303: fix channel description
of: fix sparse warning in of_pci_range_parser_one
proc: Fix unbalanced hard link numbers
cdc-acm: fix possible invalid access when processing notification
drm/nouveau/tmr: handle races with hw when updating the next alarm time
drm/nouveau/tmr: avoid processing completed alarms when adding a new one
drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm
drm/nouveau/tmr: ack interrupt before processing alarms
drm/nouveau/therm: remove ineffective workarounds for alarm bugs
drm/amdgpu: Make display watermark calculations more accurate
drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations.
ath9k_htc: fix NULL-deref at probe
ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device
s390/cputime: fix incorrect system time
s390/kdump: Add final note
regulator: tps65023: Fix inverted core enable logic.
KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation
KVM: x86: Fix load damaged SSEx MXCSR register
ima: accept previously set IMA_NEW_FILE
mwifiex: pcie: fix cmd_buf use-after-free in remove/reset
rtlwifi: rtl8821ae: setup 8812ae RFE according to device type
md: update slab_cache before releasing new stripes when stripes resizing
dm space map disk: fix some book keeping in the disk space map
dm thin metadata: call precommit before saving the roots
dm bufio: make the parameter "retain_bytes" unsigned long
dm cache metadata: fail operations if fail_io mode has been established
dm bufio: check new buffer allocation watermark every 30 seconds
dm bufio: avoid a possible ABBA deadlock
dm raid: select the Kconfig option CONFIG_MD_RAID0
dm btree: fix for dm_btree_find_lowest_key()
infiniband: call ipv6 route lookup via the stub interface
tpm_crb: check for bad response size
ARM: tegra: paz00: Mark panel regulator as enabled on boot
USB: core: replace %p with %pK
char: lp: fix possible integer overflow in lp_setup()
watchdog: pcwd_usb: fix NULL-deref at probe
USB: ene_usb6250: fix DMA to the stack
usb: misc: legousbtower: Fix memory leak
usb: misc: legousbtower: Fix buffers on stack
ANDROID: uid_sys_stats: defer io stats calulation for dead tasks
ANDROID: AVB: Fix linter errors.
ANDROID: AVB: Fix invalidate_vbmeta_submit().
ANDROID: sdcardfs: Check for NULL in revalidate
Linux 4.4.69
ipmi: Fix kernel panic at ipmi_ssif_thread()
wlcore: Add RX_BA_WIN_SIZE_CHANGE_EVENT event
wlcore: Pass win_size taken from ieee80211_sta to FW
mac80211: RX BA support for sta max_rx_aggregation_subframes
mac80211: pass block ack session timeout to to driver
mac80211: pass RX aggregation window size to driver
Bluetooth: hci_intel: add missing tty-device sanity check
Bluetooth: hci_bcm: add missing tty-device sanity check
Bluetooth: Fix user channel for 32bit userspace on 64bit kernel
tty: pty: Fix ldisc flush after userspace become aware of the data already
serial: omap: suspend device on probe errors
serial: omap: fix runtime-pm handling on unbind
serial: samsung: Use right device for DMA-mapping calls
arm64: KVM: Fix decoding of Rt/Rt2 when trapping AArch32 CP accesses
padata: free correct variable
CIFS: add misssing SFM mapping for doublequote
cifs: fix CIFS_IOC_GET_MNT_INFO oops
CIFS: fix mapping of SFM_SPACE and SFM_PERIOD
SMB3: Work around mount failure when using SMB3 dialect to Macs
Set unicode flag on cifs echo request to avoid Mac error
fs/block_dev: always invalidate cleancache in invalidate_bdev()
ceph: fix memory leak in __ceph_setxattr()
fs/xattr.c: zero out memory copied to userspace in getxattr
ext4: evict inline data when writing to memory map
IB/mlx4: Reduce SRIOV multicast cleanup warning message to debug level
IB/mlx4: Fix ib device initialization error flow
IB/IPoIB: ibX: failed to create mcg debug file
IB/core: Fix sysfs registration error flow
vfio/type1: Remove locked page accounting workqueue
dm era: save spacemap metadata root after the pre-commit
crypto: algif_aead - Require setkey before accept(2)
block: fix blk_integrity_register to use template's interval_exp if not 0
KVM: arm/arm64: fix races in kvm_psci_vcpu_on
KVM: x86: fix user triggerable warning in kvm_apic_accept_events()
um: Fix PTRACE_POKEUSER on x86_64
x86, pmem: Fix cache flushing for iovec write < 8 bytes
selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug
x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup
usb: hub: Do not attempt to autosuspend disconnected devices
usb: hub: Fix error loop seen after hub communication errors
usb: Make sure usb/phy/of gets built-in
usb: misc: add missing continue in switch
staging: comedi: jr3_pci: cope with jiffies wraparound
staging: comedi: jr3_pci: fix possible null pointer dereference
staging: gdm724x: gdm_mux: fix use-after-free on module unload
staging: vt6656: use off stack for out buffer USB transfers.
staging: vt6656: use off stack for in buffer USB transfers.
USB: Proper handling of Race Condition when two USB class drivers try to call init_usb_class simultaneously
USB: serial: ftdi_sio: add device ID for Microsemi/Arrow SF2PLUS Dev Kit
usb: host: xhci: print correct command ring address
iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
target: Convert ACL change queue_depth se_session reference usage
target/fileio: Fix zero-length READ and WRITE handling
target: Fix compare_and_write_callback handling for non GOOD status
xen: adjust early dom0 p2m handling to xen hypervisor behavior
ANDROID: AVB: Only invalidate vbmeta when told to do so.
ANDROID: sdcardfs: Move top to its own struct
ANDROID: lowmemorykiller: account for unevictable pages
ANDROID: usb: gadget: fix NULL pointer issue in mtp_read()
ANDROID: usb: f_mtp: return error code if transfer error in receive_file_work function
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
Conflicts:
drivers/usb/gadget/function/f_mtp.c
fs/ext4/page-io.c
net/mac80211/agg-rx.c
Change-Id: Id65e75bf3bcee4114eb5d00730a9ef2444ad58eb
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>