Once an object is put into quarantine, we no longer own it, i.e. object
could leave the quarantine and be reallocated. So having set_track()
call after the quarantine_put() may corrupt slab objects.
BUG kmalloc-4096 (Not tainted): Poison overwritten
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: 0xffff8804540de850-0xffff8804540de857. First byte 0xb5 instead of 0x6b
...
INFO: Freed in qlist_free_all+0x42/0x100 age=75 cpu=3 pid=24492
__slab_free+0x1d6/0x2e0
___cache_free+0xb6/0xd0
qlist_free_all+0x83/0x100
quarantine_reduce+0x177/0x1b0
kasan_kmalloc+0xf3/0x100
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0x109/0x3e0
mmap_region+0x53e/0xe40
do_mmap+0x70f/0xa50
vm_mmap_pgoff+0x147/0x1b0
SyS_mmap_pgoff+0x2c7/0x5b0
SyS_mmap+0x1b/0x30
do_syscall_64+0x1a0/0x4e0
return_from_SYSCALL_64+0x0/0x7a
INFO: Slab 0xffffea0011503600 objects=7 used=7 fp=0x (null) flags=0x8000000000004080
INFO: Object 0xffff8804540de848 @offset=26696 fp=0xffff8804540dc588
Redzone ffff8804540de840: bb bb bb bb bb bb bb bb ........
Object ffff8804540de848: 6b 6b 6b 6b 6b 6b 6b 6b b5 52 00 00 f2 01 60 cc kkkkkkkk.R....`.
Similarly, poisoning after the quarantine_put() leads to false positive
use-after-free reports:
BUG: KASAN: use-after-free in anon_vma_interval_tree_insert+0x304/0x430 at addr ffff880405c540a0
Read of size 8 by task trinity-c0/3036
CPU: 0 PID: 3036 Comm: trinity-c0 Not tainted 4.7.0-think+ #9
Call Trace:
dump_stack+0x68/0x96
kasan_report_error+0x222/0x600
__asan_report_load8_noabort+0x61/0x70
anon_vma_interval_tree_insert+0x304/0x430
anon_vma_chain_link+0x91/0xd0
anon_vma_clone+0x136/0x3f0
anon_vma_fork+0x81/0x4c0
copy_process.part.47+0x2c43/0x5b20
_do_fork+0x16d/0xbd0
SyS_clone+0x19/0x20
do_syscall_64+0x1a0/0x4e0
entry_SYSCALL64_slow_path+0x25/0x25
Fix this by putting an object in the quarantine after all other
operations.
Fixes: 80a9201a5965 ("mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB")
Link: http://lkml.kernel.org/r/1470062715-14077-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 4a3d308d6674fabf213bce9c1a661ef43a85e515)
Change-Id: Iaa699c447b97f8cb04afdd2d6a5f572bea439185
Signed-off-by: Paul Lawrence <paullawrence@google.com>
For KASAN builds:
- switch SLUB allocator to using stackdepot instead of storing the
allocation/deallocation stacks in the objects;
- change the freelist hook so that parts of the freelist can be put
into the quarantine.
[aryabinin@virtuozzo.com: fixes]
Link: http://lkml.kernel.org/r/1468601423-28676-1-git-send-email-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/1468347165-41906-3-git-send-email-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 80a9201a5965f4715d5c09790862e0df84ce0614)
Change-Id: I2b59c6d50d0db62d3609edfdc7be54e48f8afa5c
Signed-off-by: Paul Lawrence <paullawrence@google.com>
There are two bugs on qlist_move_cache(). One is that qlist's tail
isn't set properly. curr->next can be NULL since it is singly linked
list and NULL value on tail is invalid if there is one item on qlist.
Another one is that if cache is matched, qlist_put() is called and it
will set curr->next to NULL. It would cause to stop the loop
prematurely.
These problems come from complicated implementation so I'd like to
re-implement it completely. Implementation in this patch is really
simple. Iterate all qlist_nodes and put them to appropriate list.
Unfortunately, I got this bug sometime ago and lose oops message. But,
the bug looks trivial and no need to attach oops.
Fixes: 55834c59098d ("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/1467766348-22419-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Kuthonuzo Luruo <poll.stdin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 0ab686d8c8303069e80300663b3be6201a8697fb)
Change-Id: Ifca87bd938c74ff18e7fc2680afb15070cc7019f
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Currently we may put reserved by mempool elements into quarantine via
kasan_kfree(). This is totally wrong since quarantine may really free
these objects. So when mempool will try to use such element,
use-after-free will happen. Or mempool may decide that it no longer
need that element and double-free it.
So don't put object into quarantine in kasan_kfree(), just poison it.
Rename kasan_kfree() to kasan_poison_kfree() to respect that.
Also, we shouldn't use kasan_slab_alloc()/kasan_krealloc() in
kasan_unpoison_element() because those functions may update allocation
stacktrace. This would be wrong for the most of the remove_element call
sites.
(The only call site where we may want to update alloc stacktrace is
in mempool_alloc(). Kmemleak solves this by calling
kmemleak_update_trace(), so we could make something like that too.
But this is out of scope of this patch).
Fixes: 55834c59098d ("mm: kasan: initial memory quarantine implementation")
Link: http://lkml.kernel.org/r/575977C3.1010905@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 9b75a867cc9ddbafcaf35029358ac500f2635ff3)
Change-Id: Idb6c152dae8f8f2975dbe6acb7165315be8b465b
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change the following memory hot-add error messages to info messages.
There is no need for these to be errors.
kasan: WARNING: KASAN doesn't support memory hot-add
kasan: Memory hot-add will be disabled
Link: http://lkml.kernel.org/r/1464794430-5486-1-git-send-email-shuahkh@osg.samsung.com
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 91a4c272145652d798035c17e1c02c91001d3f51)
Change-Id: I6ac2acf71cb04f18d25c3e4cbf7317055d130f74
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Memory access coded in an assembly won't be seen by KASAN as a compiler
can instrument only C code. Add kasan_check_[read,write]() API which is
going to be used to check a certain memory range.
Link: http://lkml.kernel.org/r/1462538722-1574-3-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 64f8ebaf115bcddc4aaa902f981c57ba6506bc42)
Change-Id: I3e75c7c22e77d390c55ca1b86ec58a6d6ea1da87
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Quarantine isolates freed objects in a separate queue. The objects are
returned to the allocator later, which helps to detect use-after-free
errors.
When the object is freed, its state changes from KASAN_STATE_ALLOC to
KASAN_STATE_QUARANTINE. The object is poisoned and put into quarantine
instead of being returned to the allocator, therefore every subsequent
access to that object triggers a KASAN error, and the error handler is
able to say where the object has been allocated and deallocated.
When it's time for the object to leave quarantine, its state becomes
KASAN_STATE_FREE and it's returned to the allocator. From now on the
allocator may reuse it for another allocation. Before that happens,
it's still possible to detect a use-after free on that object (it
retains the allocation/deallocation stacks).
When the allocator reuses this object, the shadow is unpoisoned and old
allocation/deallocation stacks are wiped. Therefore a use of this
object, even an incorrect one, won't trigger ASan warning.
Without the quarantine, it's not guaranteed that the objects aren't
reused immediately, that's why the probability of catching a
use-after-free is lower than with quarantine in place.
Quarantine isolates freed objects in a separate queue. The objects are
returned to the allocator later, which helps to detect use-after-free
errors.
Freed objects are first added to per-cpu quarantine queues. When a
cache is destroyed or memory shrinking is requested, the objects are
moved into the global quarantine queue. Whenever a kmalloc call allows
memory reclaiming, the oldest objects are popped out of the global queue
until the total size of objects in quarantine is less than 3/4 of the
maximum quarantine size (which is a fraction of installed physical
memory).
As long as an object remains in the quarantine, KASAN is able to report
accesses to it, so the chance of reporting a use-after-free is
increased. Once the object leaves quarantine, the allocator may reuse
it, in which case the object is unpoisoned and KASAN can't detect
incorrect accesses to it.
Right now quarantine support is only enabled in SLAB allocator.
Unification of KASAN features in SLAB and SLUB will be done later.
This patch is based on the "mm: kasan: quarantine" patch originally
prepared by Dmitry Chernenkov. A number of improvements have been
suggested by Andrey Ryabinin.
[glider@google.com: v9]
Link: http://lkml.kernel.org/r/1462987130-144092-1-git-send-email-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 55834c59098d0c5a97b0f3247e55832b67facdcf)
Change-Id: Ib808d72a40f2e5137961d93dad540e85f8bbd2c4
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Implement the stack depot and provide CONFIG_STACKDEPOT. Stack depot
will allow KASAN store allocation/deallocation stack traces for memory
chunks. The stack traces are stored in a hash table and referenced by
handles which reside in the kasan_alloc_meta and kasan_free_meta
structures in the allocated memory chunks.
IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
duplication.
Right now stackdepot support is only enabled in SLAB allocator. Once
KASAN features in SLAB are on par with those in SLUB we can switch SLUB
to stackdepot as well, thus removing the dependency on SLUB stack
bookkeeping, which wastes a lot of memory.
This patch is based on the "mm: kasan: stack depots" patch originally
prepared by Dmitry Chernenkov.
Joonsoo has said that he plans to reuse the stackdepot code for the
mm/page_owner.c debugging facility.
[akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
[aryabinin@virtuozzo.com: comment style fixes]
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from cd11016e5f5212c13c0cec7384a525edc93b4921)
Change-Id: Ic804318410823b95d84e264a6334e018f21ef943
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Add GFP flags to KASAN hooks for future patches to use.
This patch is based on the "mm: kasan: unified support for SLUB and SLAB
allocators" patch originally prepared by Dmitry Chernenkov.
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 505f5dcb1c419e55a9621a01f83eb5745d8d7398)
Change-Id: I7c5539f59e6969e484a6ff4f104dce2390669cfd
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Add KASAN hooks to SLAB allocator.
This patch is based on the "mm: kasan: unified support for SLUB and SLAB
allocators" patch originally prepared by Dmitry Chernenkov.
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 7ed2f9e663854db313f177a511145630e398b402)
Change-Id: I131fdafc1c27a25732475f5bbd1653b66954e1b7
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Finding suitable OFF_SLAB candidate is more related to aligned cache
size rather than original size. Same reasoning can be applied to the
debug pagealloc candidate. So, this patch moves up alignment fixup to
proper position. From that point, size is aligned so we can remove some
alignment fixups.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 832a15d209cd260180407bde1af18965b21623f3)
Change-Id: I8338d647da4a6eb6402c6fe4e2402b7db45ea5a5
Signed-off-by: Paul Lawrence <paullawrence@google.com>
debug_pagealloc debugging is related to SLAB_POISON flag rather than
FORCED_DEBUG option, although FORCED_DEBUG option will enable
SLAB_POISON. Fix it.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 40323278b557a5909bbecfa181c91a3af7afbbe3)
Change-Id: I4a7dbd40b6a2eb777439f271fa600b201e6db1d3
Signed-off-by: Paul Lawrence <paullawrence@google.com>
cache_init_objs() will be changed in following patch and current form
doesn't fit well for that change. So, before doing it, this patch
separates debugging initialization. This would cause two loop iteration
when debugging is enabled, but, this overhead seems too light than debug
feature itself so effect may not be visible. This patch will greatly
simplify changes in cache_init_objs() in following patch.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 10b2e9e8e808bd30e1f4018a36366d07b0abd12f)
Change-Id: I9904974a674f17fc7d57daf0fe351742db67c006
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Now, we don't use object status buffer in any setup. Remove it.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 249247b6f8ee362189a2f2bf598a14ff6c95fb4c)
Change-Id: I60b1fb030da5d44aff2627f57dffd9dd7436e511
Signed-off-by: Paul Lawrence <paullawrence@google.com>
DEBUG_SLAB_LEAK is a debug option. It's current implementation requires
status buffer so we need more memory to use it. And, it cause
kmem_cache initialization step more complex.
To remove this extra memory usage and to simplify initialization step,
this patch implement this feature with another way.
When user requests to get slab object owner information, it marks that
getting information is started. And then, all free objects in caches
are flushed to corresponding slab page. Now, we can distinguish all
freed object so we can know all allocated objects, too. After
collecting slab object owner information on allocated objects, mark is
checked that there is no free during the processing. If true, we can be
sure that our information is correct so information is returned to user.
Although this way is rather complex, it has two important benefits
mentioned above. So, I think it is worth changing.
There is one drawback that it takes more time to get slab object owner
information but it is just a debug option so it doesn't matter at all.
To help review, this patch implements new way only. Following patch
will remove useless code.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from d31676dfde257cb2b3e52d4e657d8ad2251e4d49)
Change-Id: I204ea0dd5553577d17c93f32f0d5a797ba0304af
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Currently, open code for checking DEBUG_PAGEALLOC cache is spread to
some sites. It makes code unreadable and hard to change.
This patch cleans up this code. The following patch will change the
criteria for DEBUG_PAGEALLOC cache so this clean-up will help it, too.
[akpm@linux-foundation.org: fix build with CONFIG_DEBUG_PAGEALLOC=n]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from 40b44137971c2e5865a78f9f7de274449983ccb5)
Change-Id: I784df4a54b62f77a22f8fa70990387fdf968219f
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from a307ebd468e0b97c203f5a99a56a6017e4d1991a)
Change-Id: I4407dbf0a5a22ab6f9335219bafc6f28023635a0
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Functions which the compiler has instrumented for ASAN place poison on
the stack shadow upon entry and remove this poison prior to returning.
In some cases (e.g. hotplug and idle), CPUs may exit the kernel a
number of levels deep in C code. If there are any instrumented
functions on this critical path, these will leave portions of the idle
thread stack shadow poisoned.
If a CPU returns to the kernel via a different path (e.g. a cold
entry), then depending on stack frame layout subsequent calls to
instrumented functions may use regions of the stack with stale poison,
resulting in (spurious) KASAN splats to the console.
Contemporary GCCs always add stack shadow poisoning when ASAN is
enabled, even when asked to not instrument a function [1], so we can't
simply annotate functions on the critical path to avoid poisoning.
Instead, this series explicitly removes any stale poison before it can
be hit. In the common hotplug case we clear the entire stack shadow in
common code, before a CPU is brought online.
On architectures which perform a cold return as part of cpu idle may
retain an architecture-specific amount of stack contents. To retain the
poison for this retained context, the arch code must call the core KASAN
code, passing a "watermark" stack pointer value beyond which shadow will
be cleared. Architectures which don't perform a cold return as part of
idle do not need any additional code.
This patch (of 3):
Functions which the compiler has instrumented for KASAN place poison on
the stack shadow upon entry and remove this poision prior to returning.
In some cases (e.g. hotplug and idle), CPUs may exit the kernel a number
of levels deep in C code. If there are any instrumented functions on this
critical path, these will leave portions of the stack shadow poisoned.
If a CPU returns to the kernel via a different path (e.g. a cold entry),
then depending on stack frame layout subsequent calls to instrumented
functions may use regions of the stack with stale poison, resulting in
(spurious) KASAN splats to the console.
To avoid this, we must clear stale poison from the stack prior to
instrumented functions being called. This patch adds functions to the
KASAN core for removing poison from (portions of) a task's stack. These
will be used by subsequent patches to avoid problems with hotplug and
idle.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 64145065
(cherry-picked from e3ae116339f9a0c77523abc95e338fa405946e07)
Change-Id: I9be31b714d5bdaec94a2dad3f0e468c094fe5fa2
Signed-off-by: Paul Lawrence <paullawrence@google.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlomc30ACgkQONu9yGCS
aT5Hkw/9GJCZ3LyqVfd9mkB97/9OTFduQUjLsqFdgQIRZdByHe6fR8cfJ1ydeiJS
fcnBtZzJNMBXurEWourqrSWptDkFANoG0EGResza+3aXXC2SuwmuXG0JgR4KdnpE
1pkqnKJ42rnK1XWuHIp2S0A6S+SW8fA5HrVFpQDLiXUb4NB9xerG2PS6tq94rQN0
TB2IdJzb5ITWQkCzYvuTphl4jz2XnMJskd53c9osG2izBxVXzA650ygjtW6HrqG6
AWvc0BK7p/Rz+ty3PbGbSNUJyDicnknS2/UARMz/BlI5SiyL8aAHVC7h3k9dh2CU
uPVhfebFCSBcZ+iYl4UUuywjiJwDTMk/QwNZFph5Hw5HntB2LAHSG3rrv/8OIoPe
+bvMRonfWB+rtzDiy0UbCCutswJzf6h9Pj5y/PA+GIFnNmeDQ096EOI7BFGO1MVe
s/d/krw4xqhDRb97ZlWWX8UqRplmPSe4pWP47Gr0K4qNAPhp4B2tW6VJcMG3z2oo
0/iV9cZ2CVxpIA2oN7ymWCKSQ764VADTcNSqC9Wd+rDso2XcBPRYUo9pbNKLqryb
hxQR1Hj0do35qkR1L/e6thdygMV2ZyQ4xBq9fVPjDrwi2d9dO1jfnkQkL+QEa73G
AaufuvFdJefssFP91FwGZnhhMi8kWIuJSTFM3Pg9OwNEjgt3nxo=
=eTmV
-----END PGP SIGNATURE-----
Merge 4.4.104 into android-4.4
Changes in 4.4.104
netlink: add a start callback for starting a netlink dump
ipsec: Fix aborted xfrm policy dump crash
x86/mm/pat: Ensure cpa->pfn only contains page frame numbers
x86/efi: Hoist page table switching code into efi_call_virt()
x86/efi: Build our own page table structures
ARM: dts: omap3: logicpd-torpedo-37xx-devkit: Fix MMC1 cd-gpio
x86/efi-bgrt: Fix kernel panic when mapping BGRT data
x86/efi-bgrt: Replace early_memremap() with memremap()
mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d()
mm/madvise.c: fix madvise() infinite loop under special circumstances
btrfs: clear space cache inode generation always
KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk
KVM: x86: Exit to user-mode on #UD intercept when emulator requires
KVM: x86: inject exceptions produced by x86_decode_insn
mmc: core: Do not leave the block driver in a suspended state
eeprom: at24: check at24_read/write arguments
bcache: Fix building error on MIPS
Revert "drm/radeon: dont switch vt on suspend"
drm/radeon: fix atombios on big endian
drm/panel: simple: Add missing panel_simple_unprepare() calls
mtd: nand: Fix writing mtdoops to nand flash.
NFS: revalidate "." etc correctly on "open".
drm/i915: Don't try indexed reads to alternate slave addresses
drm/i915: Prevent zero length "index" write
nfsd: Make init_open_stateid() a bit more whole
nfsd: Fix stateid races between OPEN and CLOSE
nfsd: Fix another OPEN stateid race
Linux 4.4.104
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 6ea8d958a2c95a1d514015d4e29ba21a8c0a1a91 upstream.
MADVISE_WILLNEED has always been a noop for DAX (formerly XIP) mappings.
Unfortunately madvise_willneed() doesn't communicate this information
properly to the generic madvise syscall implementation. The calling
convention is quite subtle there. madvise_vma() is supposed to either
return an error or update &prev otherwise the main loop will never
advance to the next vma and it will keep looping for ever without a way
to get out of the kernel.
It seems this has been broken since introduction. Nobody has noticed
because nobody seems to be using MADVISE_WILLNEED on these DAX mappings.
[mhocko@suse.com: rewrite changelog]
Link: http://lkml.kernel.org/r/20171127115318.911-1-guoxuenan@huawei.com
Fixes: fe77ba6f4f ("[PATCH] xip: madvice/fadvice: execute in place")
Signed-off-by: chenjie <chenjie6@huawei.com>
Signed-off-by: guoxuenan <guoxuenan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: zhangyi (F) <yi.zhang@huawei.com>
Cc: Miao Xie <miaoxie@huawei.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8f97366452ed491d13cf1e44241bc0b5740b1f0 upstream.
Currently, we unconditionally make page table dirty in touch_pmd().
It may result in false-positive can_follow_write_pmd().
We may avoid the situation, if we would only make the page table entry
dirty if caller asks for write access -- FOLL_WRITE.
The patch also changes touch_pud() in the same way.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[Salvatore Bonaccorso: backport for 3.16:
- Adjust context
- Drop specific part for PUD-sized transparent hugepages. Support
for PUD-sized transparent hugepages was added in v4.11-rc1
]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Backport of the upstream commit f86e4271978b ("mm: check the return
value of lookup_page_ext for all call sites") is wrong for hwpoison
pages. I have accidentally negated the condition for bailout. This
basically disables hwpoison pages tracking while the code still
might crash on unusual configurations when struct pages do not have
page_ext allocated. The fix is trivial to invert the condition.
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAloXywoACgkQONu9yGCS
aT78dxAAoM0uHsL9r+ivJ4uMj81dwBL3Rd/1Lb/PMV5Yblh/LJ2WOcXriq/JgMLt
+ARoyugEpB8JzAi1Y3bq3Jku2TYcT0o55UmjRgZzQitdX5o8j1g1baNnpRuMz63z
S/g4Msh5aJyoHmwgxWZ+mWKn3SYdNwHy+r0gGwgtvlUO97iXqwM3nqQ/4tHnIv1B
sz0NtJ7cgFvWVaneUkZ4z0ZGTlKfxaQg95enyyCRWM7MJ6Be03+KnhmQZ6GEb8vP
tf9GtXiMEDJdwppDmXjtdjFW5adejBOoCF/grvbQoEdn7XPC47k6/l5Y6A3PYLMj
kqlC8IbMHbiQXvgwezxp6Mv+oc+LuSjSCVikZW2SGMacs5kF92+0MIUvBtfUwvsA
FP7q6jUcT3Or4xiG4xLDQW+RLPetidd+1Ms4jia6jaCajbMjU7ZYaBuAplT4qhIl
koJ9pn1ksna3fUyxnNFJttUN2ulGDzcSBP5EZf3bLWMXkG4daa8Cen7vBkG1VqZE
tspXCbB/mZ/eGv/rH3b7F2BVfP2RY0YqlUZzmfTXIoCwqcmX1zGi/KMfepcZTH3b
LOo8CBmTgSYXYh0/16GAUH3ds3QQt8d0oeaCEtf8BaAZnq5R3M8doZzGzTB6LGjG
Rn1KsUzJPKSqgYis3FTJNU3wmPokvV1ZVXK/ee9zMq5zOtyJyOg=
=XIwd
-----END PGP SIGNATURE-----
Merge 4.4.101 into android-4.4
Changes in 4.4.101
tcp: do not mangle skb->cb[] in tcp_make_synack()
netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed
bonding: discard lowest hash bit for 802.3ad layer3+4
vlan: fix a use-after-free in vlan_device_event()
af_netlink: ensure that NLMSG_DONE never fails in dumps
sctp: do not peel off an assoc from one netns to another one
fealnx: Fix building error on MIPS
net/sctp: Always set scope_id in sctp_inet6_skb_msgname
ima: do not update security.ima if appraisal status is not INTEGRITY_PASS
serial: omap: Fix EFR write on RTS deassertion
arm64: fix dump_instr when PAN and UAO are in use
nvme: Fix memory order on async queue deletion
ocfs2: should wait dio before inode lock in ocfs2_setattr()
ipmi: fix unsigned long underflow
mm/page_alloc.c: broken deferred calculation
coda: fix 'kernel memory exposure attempt' in fsync
mm: check the return value of lookup_page_ext for all call sites
mm/page_ext.c: check if page_ext is not prepared
mm/pagewalk.c: report holes in hugetlb ranges
Linux 4.4.101
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 373c4557d2aa362702c4c2d41288fb1e54990b7c upstream.
This matters at least for the mincore syscall, which will otherwise copy
uninitialized memory from the page allocator to userspace. It is
probably also a correctness error for /proc/$pid/pagemap, but I haven't
tested that.
Removing the `walk->hugetlb_entry` condition in walk_hugetlb_range() has
no effect because the caller already checks for that.
This only reports holes in hugetlb ranges to callers who have specified
a hugetlb_entry callback.
This issue was found using an AFL-based fuzzer.
v2:
- don't crash on ->pte_hole==NULL (Andrew Morton)
- add Cc stable (Andrew Morton)
Changed for 4.4/4.9 stable backport:
- fix up conflict in the huge_pte_offset() call
Fixes: 1e25a271c8 ("mincore: apply page table walker on do_mincore()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e492080e640c2d1235ddf3441cae634cfffef7e1 upstream.
online_page_ext() and page_ext_init() allocate page_ext for each
section, but they do not allocate if the first PFN is !pfn_present(pfn)
or !pfn_valid(pfn). Then section->page_ext remains as NULL.
lookup_page_ext checks NULL only if CONFIG_DEBUG_VM is enabled. For a
valid PFN, __set_page_owner will try to get page_ext through
lookup_page_ext. Without CONFIG_DEBUG_VM lookup_page_ext will misuse
NULL pointer as value 0. This incurrs invalid address access.
This is the panic example when PFN 0x100000 is not valid but PFN
0x13FC00 is being used for page_ext. section->page_ext is NULL,
get_entry returned invalid page_ext address as 0x1DFA000 for a PFN
0x13FC00.
To avoid this panic, CONFIG_DEBUG_VM should be removed so that page_ext
will be checked at all times.
Unable to handle kernel paging request at virtual address 01dfa014
------------[ cut here ]------------
Kernel BUG at ffffff80082371e0 [verbose debug info unavailable]
Internal error: Oops: 96000045 [#1] PREEMPT SMP
Modules linked in:
PC is at __set_page_owner+0x48/0x78
LR is at __set_page_owner+0x44/0x78
__set_page_owner+0x48/0x78
get_page_from_freelist+0x880/0x8e8
__alloc_pages_nodemask+0x14c/0xc48
__do_page_cache_readahead+0xdc/0x264
filemap_fault+0x2ac/0x550
ext4_filemap_fault+0x3c/0x58
__do_fault+0x80/0x120
handle_mm_fault+0x704/0xbb0
do_page_fault+0x2e8/0x394
do_mem_abort+0x88/0x124
Pre-4.7 kernels also need commit f86e4271978b ("mm: check the return
value of lookup_page_ext for all call sites").
Link: http://lkml.kernel.org/r/20171107094131.14621-1-jaewon31.kim@samsung.com
Fixes: eefa864b70 ("mm/page_ext: resurrect struct page extending code for debugging")
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f86e4271978bd93db466d6a95dad4b0fdcdb04f6 upstream.
Per the discussion with Joonsoo Kim [1], we need check the return value
of lookup_page_ext() for all call sites since it might return NULL in
some cases, although it is unlikely, i.e. memory hotplug.
Tested with ltp with "page_owner=0".
[1] http://lkml.kernel.org/r/20160519002809.GA10245@js1304-P5Q-DELUXE
[akpm@linux-foundation.org: fix build-breaking typos]
[arnd@arndb.de: fix build problems from lookup_page_ext]
Link: http://lkml.kernel.org/r/6285269.2CksypHdYp@wuerfel
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1464023768-31025-1-git-send-email-yang.shi@linaro.org
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d135e5750205a21a212a19dbb05aeb339e2cbea7 upstream.
In reset_deferred_meminit() we determine number of pages that must not
be deferred. We initialize pages for at least 2G of memory, but also
pages for reserved memory in this node.
The reserved memory is determined in this function:
memblock_reserved_memory_within(), which operates over physical
addresses, and returns size in bytes. However, reset_deferred_meminit()
assumes that that this function operates with pfns, and returns page
count.
The result is that in the best case machine boots slower than expected
due to initializing more pages than needed in single thread, and in the
worst case panics because fewer than needed pages are initialized early.
Link: http://lkml.kernel.org/r/20171021011707.15191-1-pasha.tatashin@oracle.com
Fixes: 864b9a393dcb ("mm: consider memblock reservations for deferred memory initialization sizing")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce a varible to save bootloader enforced memory limits and
restricts adding beyond this boundary during a memory hotplug.
Change-Id: I28c100644b7287ec4625c4c018b5fffc865e2e72
Signed-off-by: Arun KS <arunks@codeaurora.org>
Add arm64 to the list of architectures which supports
memory hotplug.
Change-Id: Iefeb8294bf06eaebb17a3b3aa8b33bb3b7133099
Signed-off-by: Arun KS <arunks@codeaurora.org>
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
This is a second and improved version of the patch previously released
in [3].
It builds on the work by Scott Branden [2] and, henceforth,
it needs to be applied on top of Scott's patches [2].
Comments are very welcome.
Changes from the original patchset and known issues:
- Compared to Scott's original patchset, this work adds the mapping of
the new hotplugged pages into the kernel page tables. This is done by
copying the old swapper_pg_dir over a new page, adding the new mappings,
and then switching to the newly built pg_dir (see `hotplug_paging` in
arch/arm64/mmu.c). There might be better ways to to this: suggestions
are more than welcome.
- The stub function for `arch_remove_memory` has been removed for now; we
are working in parallel on memory hot remove, and we plan to contribute
it as a separate patch.
- Corresponding Kconfig flags have been added;
- Note that this patch does not work when NUMA is enabled; in fact,
the function `memory_add_physaddr_to_nid` does not have an
implementation when the NUMA flag is on: this function is supposed to
return the nid the hotplugged memory should be associated with. However
it is not really clear to us yet what the semantics of this function
in the context of a NUMA system should be. A quick and dirty fix would
be to always attach to the first available NUMA node.
- In arch/arm64/mm/init.c `arch_add_memory`, we are doing a hack with the
nomap memory block flags to satisfy preconditions and postconditions of
`__add_pages` and postconditions of `arch_add_memory`. Compared to
memory hotplug implementation for other architectures, the "issue"
seems to be in the implemenation of `pfn_valid`. Suggestions on how
to cleanly avoid this hack are welcome.
This patchset can be tested by starting the kernel with the `mem=X` flag, where
X is less than the total available physical memory and has to be multiple of
MIN_MEMORY_BLOCK_SIZE. We also tested it on a customised version of QEMU
capable to emulate physical hotplug on arm64 platform.
To enable the feature the CONFIG_MEMORY_HOTPLUG compilation flag
needs to be set to true. Then, after memory is physically hotplugged,
the standard two steps to make it available (as also documented in
Documentation/memory-hotplug.txt) are:
(1) Notify memory hot-add
echo '0xYY000000' > /sys/devices/system/memory/probe
where 0xYY000000 is the first physical address of the new memory section.
(2) Online new memory block(s)
echo online > /sys/devices/system/memory/memoryXXX/state
-- or --
echo online_movable > /sys/devices/system/memory/memoryXXX/state
where XXX corresponds to the ids of newly added blocks.
Onlining can optionally be automatic at hot-add notification by enabling
the global flag:
echo online > /sys/devices/system/memory/auto_online_blocks
or by setting the corresponding config flag in the kernel build.
Again, any comment is highly appreciated.
[1] https://lkml.org/lkml/2016/11/17/49
[2] https://lkml.org/lkml/2016/12/1/811
[3] https://lkml.org/lkml/2016/12/14/188
Change-Id: I545807e3121c159aaa2f917ea914ee98f38fb296
Signed-off-by: Maciej Bielski <m.bielski@virtualopensystems.com>
Signed-off-by: Andrea Reale <ar@linux.vnet.ibm.com>
Patch-mainline: linux-kernel @ 11 Apr 2017, 18:25
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[arunks@codeaurora.org: fix to pass checker test]
Signed-off-by: Arun KS <arunks@codeaurora.org>
* refs/heads/tmp-89074de
Linux 4.4.94
Revert "tty: goldfish: Fix a parameter of a call to free_irq"
cpufreq: CPPC: add ACPI_PROCESSOR dependency
nfsd/callback: Cleanup callback cred on shutdown
target/iscsi: Fix unsolicited data seq_end_offset calculation
uapi: fix linux/mroute6.h userspace compilation errors
uapi: fix linux/rds.h userspace compilation errors
ceph: clean up unsafe d_parent accesses in build_dentry_path
i2c: at91: ensure state is restored after suspending
net: mvpp2: release reference to txq_cpu[] entry after unmapping
scsi: scsi_dh_emc: return success in clariion_std_inquiry()
slub: do not merge cache if slub_debug contains a never-merge flag
ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock
crypto: xts - Add ECB dependency
net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs
sparc64: Migrate hvcons irq to panicked cpu
md/linear: shutup lockdep warnning
f2fs: do not wait for writeback in write_begin
Btrfs: send, fix failure to rename top level inode due to name collision
iio: adc: xilinx: Fix error handling
netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value.
net/mlx4_en: fix overflow in mlx4_en_init_timestamp()
mac80211: fix power saving clients handling in iwlwifi
mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length
irqchip/crossbar: Fix incorrect type of local variables
watchdog: kempld: fix gcc-4.3 build
locking/lockdep: Add nest_lock integrity test
Revert "bsg-lib: don't free job in bsg_prepare_job"
tipc: use only positive error codes in messages
net: Set sk_prot_creator when cloning sockets to the right proto
packet: only test po->has_vnet_hdr once in packet_snd
packet: in packet_do_bind, test fanout with bind_lock held
tun: bail out from tun_get_user() if the skb is empty
l2tp: fix race condition in l2tp_tunnel_delete
l2tp: Avoid schedule while atomic in exit_net
vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit
isdn/i4l: fetch the ppp_write buffer in one shot
bpf: one perf event close won't free bpf program attached by another perf event
packet: hold bind lock when rebinding to fanout hook
net: emac: Fix napi poll list corruption
ip6_gre: skb_push ipv6hdr before packing the header in ip6gre_header
udpv6: Fix the checksum computation when HW checksum does not apply
bpf/verifier: reject BPF_ALU64|BPF_END
sctp: potential read out of bounds in sctp_ulpevent_type_enabled()
MIPS: Fix minimum alignment requirement of IRQ stack
drm/dp/mst: save vcpi with payloads
percpu: make this_cpu_generic_read() atomic w.r.t. interrupts
trace: sched: Fix util_avg_walt in sched_load_avg_cpu trace
sched/fair: remove erroneous RCU_LOCKDEP_WARN from start_cpu()
sched: EAS/WALT: finish accounting prior to task_tick
cpufreq: sched: update capacity request upon tick always
sched/fair: prevent meaningless active migration
sched: walt: Leverage existing helper APIs to apply invariance
Conflicts:
kernel/sched/core.c
kernel/sched/fair.c
kernel/sched/sched.h
Change-Id: I0effac90fb6a4db559479bfa2fefa31c41200ce9
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlnrYxcACgkQONu9yGCS
aT6VpQ/9GzBA59FGi6ohxZnrUR35+U5Ehuw0IZo4JTUTrlj28QozeV6dBAdgQHLH
eGcejtzAsD39m7JjmBzkxiBBlCH9nQkq5IaUrJG6q5dYoTCYMLzHwQLgPSbrhbnS
hCSeHdJ0fevw9tKQELtWlIiG1iOULrWATf4MtpOCHcRmpxxSMRi22yQ4vKD2Vz4y
TdKb5c8bYjJoEqbtON4wKIiEK1JfyO80E4eZtNK7FXI+XX1WI65pum9/NBiDqB78
K0sK1t5pSJHvDgMwtOJ7Nxzcwle1cG3xm7NhZhNCfF9OWedCy+ZCc+e48T+TeoF4
UDHIhEvhOOOf/W3dRBQQj8VElj0zt92I+ivsWxKmheY9JzJdOvq2pTQoPAtLsBMD
/mChCvMSNEcHTfLYrm6Bjap0e6D10n1oUHX7jgLtq04EcX9Rh2zgYvL9u9QFLjFx
sAgTp+kmScgj0fi0XgiXJxj8mPc2MpTVmSUjcwAZD+N9Kuafkqbf3ZddZJiGyPfw
v4ZiAdUAtACdOaIRVPRUcG2fyLfKYqg2bFsif4Z67/0RmNf3C3rJpS9yX+Q36zCo
f66xbvysN3pRiME0obenrsxBJ0LvIkSVskxV0+5x0UfP5pOdf7jZqqpkr6IFMtLZ
/o4DYV+Da/qeYZQnmvF0BEvEnnX8GJFIJ+9RSbz9mAWcCWtWxTU=
=gevA
-----END PGP SIGNATURE-----
Merge 4.4.94 into android-4.4
Changes in 4.4.94
percpu: make this_cpu_generic_read() atomic w.r.t. interrupts
drm/dp/mst: save vcpi with payloads
MIPS: Fix minimum alignment requirement of IRQ stack
sctp: potential read out of bounds in sctp_ulpevent_type_enabled()
bpf/verifier: reject BPF_ALU64|BPF_END
udpv6: Fix the checksum computation when HW checksum does not apply
ip6_gre: skb_push ipv6hdr before packing the header in ip6gre_header
net: emac: Fix napi poll list corruption
packet: hold bind lock when rebinding to fanout hook
bpf: one perf event close won't free bpf program attached by another perf event
isdn/i4l: fetch the ppp_write buffer in one shot
vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit
l2tp: Avoid schedule while atomic in exit_net
l2tp: fix race condition in l2tp_tunnel_delete
tun: bail out from tun_get_user() if the skb is empty
packet: in packet_do_bind, test fanout with bind_lock held
packet: only test po->has_vnet_hdr once in packet_snd
net: Set sk_prot_creator when cloning sockets to the right proto
tipc: use only positive error codes in messages
Revert "bsg-lib: don't free job in bsg_prepare_job"
locking/lockdep: Add nest_lock integrity test
watchdog: kempld: fix gcc-4.3 build
irqchip/crossbar: Fix incorrect type of local variables
mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length
mac80211: fix power saving clients handling in iwlwifi
net/mlx4_en: fix overflow in mlx4_en_init_timestamp()
netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value.
iio: adc: xilinx: Fix error handling
Btrfs: send, fix failure to rename top level inode due to name collision
f2fs: do not wait for writeback in write_begin
md/linear: shutup lockdep warnning
sparc64: Migrate hvcons irq to panicked cpu
net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs
crypto: xts - Add ECB dependency
ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock
slub: do not merge cache if slub_debug contains a never-merge flag
scsi: scsi_dh_emc: return success in clariion_std_inquiry()
net: mvpp2: release reference to txq_cpu[] entry after unmapping
i2c: at91: ensure state is restored after suspending
ceph: clean up unsafe d_parent accesses in build_dentry_path
uapi: fix linux/rds.h userspace compilation errors
uapi: fix linux/mroute6.h userspace compilation errors
target/iscsi: Fix unsolicited data seq_end_offset calculation
nfsd/callback: Cleanup callback cred on shutdown
cpufreq: CPPC: add ACPI_PROCESSOR dependency
Revert "tty: goldfish: Fix a parameter of a call to free_irq"
Linux 4.4.94
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit c6e28895a4372992961888ffaadc9efc643b5bfe ]
In case CONFIG_SLUB_DEBUG_ON=n, find_mergeable() gets debug features from
commandline but never checks if there are features from the
SLAB_NEVER_MERGE set.
As a result selected by slub_debug caches are always mergeable if they
have been created without a custom constructor set or without one of the
SLAB_* debug features on.
This moves the SLAB_NEVER_MERGE check below the flags update from
commandline to make sure it won't merge the slab cache if one of the debug
features is on.
Link: http://lkml.kernel.org/r/20170101124451.GA4740@lp-laptop-d
Signed-off-by: Grygorii Maistrenko <grygoriimkd@gmail.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of storing backing_dev_info inside struct request_queue,
allocate it dynamically, reference count it, and free it when the last
reference is dropped. Currently only request_queue holds the reference
but in the following patch we add other users referencing
backing_dev_info.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
Change-Id: Ibcee7b4c014018f9243cd3edbfd9c4a8877c3862
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
Git-commit: d03f6cdc1fc422accb734c7c07a661a0018d8631
[riteshh@codeaurora.org: resolved merge conflicts]
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
We will want to have struct backing_dev_info allocated separately from
struct request_queue. As the first step add pointer to backing_dev_info
to request_queue and convert all users touching it. No functional
changes in this patch.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
Change-Id: I77fbb181de7e39c83fbfba8cfb128d6ace161f31
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
Git-commit: 97419acd22a0bacc52dbc34d5bbc96d315e48acb
[riteshh@codeaurora.org: resolved merge conflicts]
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
This is cherry-picked from upstrea-f2fs-stable-linux-4.4.y.
Changes include:
commit c7fd9e2b4a ("f2fs: hurry up to issue discard after io interruption")
commit 603dde3965 ("f2fs: fix to show correct discard_granularity in sysfs")
...
commit 565f0225f9 ("f2fs: factor out discard command info into discard_cmd_control")
commit c4cc29d19e ("f2fs: remove batched discard in f2fs_trim_fs")
Change-Id: Icd8a85ac0c19a8aa25cd2591a12b4e9b85bdf1c5
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
This function always returns true, because it calls
memblock_overlaps_region() which returns a bool now. Change the
signature to bool and pass it on up instead.
Change-Id: I4b6403b823d20552a28006e35083d8056346dc51
Cc: Patrick Daly <pdaly@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
[satyap@codeaurora.org: trivial merge conflict resolution]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlnLaLoACgkQONu9yGCS
aT7hDw/+Ipx/xnjIUJFV/aqo8lTh3XqP/TjD5whoi+yYC8axLEZBLiOSLZceVjsG
hi2mP22gKn1i7GLXNeWIZ+rMtVzAN+qNg7i8cjWNfFp1fA7cCfFaYvlV0LVrO2tK
WnvvE8r5kQAKyQG8498ebEjianxwxHVERnNiE5/SDpCNj14DnwCJBTEYM0tEnuXZ
/jBIIs4xvndVa0fFfUjuAzh65AefAT1BmgsPll4GnFMUFHh30smYdFla5LL0GNIq
FQGFvIi8Q02disSMg9lFJVOlazc/HUREiFB1qy1DRtGMnS6/Q0HW0kCxeRi/7QEi
+HN2rLxtbpnuD5P7W4lDJ5/cyCHMIv8SJ8OqUd8uxbTWz31P/QxbM7d35d+w3rq8
dv3sQ6CMRnuIXGL5dFHh7zYqlzNS9PKjLmxzAw9grDf+nVsDxE4KUfJy00DSN1I1
Bopi1kCD2nUMOiBrmxkIczN6OOvcGBHh6/TTB2WEKVHn42D0fjLnO66kJVJLMsBm
vDdKJDDSGM/0HiUa5ydr6R0Ae7My3h5AJZRa5gn0kL/myatX/vsa0B2ZLpHlVipM
GhODBsDFkI4k4yceONDZPJmhhVab1lewTMuIW5D2KRMsgpQqLmlOyL5gykfH0rTx
FVnLSoMAHsgm6qVPwRS5BqK/UnXogfqjiB0iXzNNZnkiABWWoUQ=
=Skkr
-----END PGP SIGNATURE-----
Merge 4.4.89 into android-4.4
Changes in 4.4.89
ipv6: accept 64k - 1 packet length in ip6_find_1stfragopt()
ipv6: add rcu grace period before freeing fib6_node
ipv6: fix sparse warning on rt6i_node
qlge: avoid memcpy buffer overflow
Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"
tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0
Revert "net: use lib/percpu_counter API for fragmentation mem accounting"
Revert "net: fix percpu memory leaks"
gianfar: Fix Tx flow control deactivation
ipv6: fix memory leak with multiple tables during netns destruction
ipv6: fix typo in fib6_net_exit()
f2fs: check hot_data for roll-forward recovery
x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
md/raid5: release/flush io in raid5_do_work()
nfsd: Fix general protection fault in release_lock_stateid()
mm: prevent double decrease of nr_reserved_highatomic
tty: improve tty_insert_flip_char() fast path
tty: improve tty_insert_flip_char() slow path
tty: fix __tty_insert_flip_char regression
Input: i8042 - add Gigabyte P57 to the keyboard reset table
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix cases of both inputs zero
MIPS: math-emu: <MAX|MIN>.<D|S>: Fix cases of both inputs negative
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of input values with opposite signs
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of both infinite inputs
MIPS: math-emu: MINA.<D|S>: Fix some cases of infinity and zero inputs
crypto: AF_ALG - remove SGL terminator indicator when chaining
ext4: fix incorrect quotaoff if the quota feature is enabled
ext4: fix quota inconsistency during orphan cleanup for read-only mounts
powerpc: Fix DAR reporting when alignment handler faults
block: Relax a check in blk_start_queue()
md/bitmap: disable bitmap_resize for file-backed bitmaps.
skd: Avoid that module unloading triggers a use-after-free
skd: Submit requests to firmware before triggering the doorbell
scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled
scsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress path
scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records
scsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBA
scsi: zfcp: fix missing trace records for early returns in TMF eh handlers
scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records
scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response
scsi: zfcp: trace high part of "new" 64 bit SCSI LUN
scsi: megaraid_sas: Check valid aen class range to avoid kernel panic
scsi: megaraid_sas: Return pended IOCTLs with cmd_status MFI_STAT_WRONG_STATE in case adapter is dead
scsi: storvsc: fix memory leak on ring buffer busy
scsi: sg: remove 'save_scat_len'
scsi: sg: use standard lists for sg_requests
scsi: sg: off by one in sg_ioctl()
scsi: sg: factor out sg_fill_request_table()
scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE
scsi: qla2xxx: Fix an integer overflow in sysfs code
ftrace: Fix selftest goto location on error
tracing: Apply trace_clock changes to instance max buffer
ARC: Re-enable MMU upon Machine Check exception
PCI: shpchp: Enable bridge bus mastering if MSI is enabled
media: v4l2-compat-ioctl32: Fix timespec conversion
media: uvcvideo: Prevent heap overflow when accessing mapped controls
bcache: initialize dirty stripes in flash_dev_run()
bcache: Fix leak of bdev reference
bcache: do not subtract sectors_to_gc for bypassed IO
bcache: correct cache_dirty_target in __update_writeback_rate()
bcache: Correct return value for sysfs attach errors
bcache: fix for gc and write-back race
bcache: fix bch_hprint crash and improve output
ftrace: Fix memleak when unregistering dynamic ops when tracing disabled
Linux 4.4.89
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 4855e4a7f29d6d10b0b9c84e189c770c9a94e91e upstream.
There is race between page freeing and unreserved highatomic.
CPU 0 CPU 1
free_hot_cold_page
mt = get_pfnblock_migratetype
set_pcppage_migratetype(page, mt)
unreserve_highatomic_pageblock
spin_lock_irqsave(&zone->lock)
move_freepages_block
set_pageblock_migratetype(page)
spin_unlock_irqrestore(&zone->lock)
free_pcppages_bulk
__free_one_page(mt) <- mt is stale
By above race, a page on CPU 0 could go non-highorderatomic free list
since the pageblock's type is changed. By that, unreserve logic of
highorderatomic can decrease reserved count on a same pageblock severak
times and then it will make mismatch between nr_reserved_highatomic and
the number of reserved pageblock.
So, this patch verifies whether the pageblock is highatomic or not and
decrease the count only if the pageblock is highatomic.
Link: http://lkml.kernel.org/r/1476259429-18279-3-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Sangseok Lee <sangseok.lee@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On a senario like writing out the first dirty page of the inode
as the inline data, we only cleared dirty flags of the pages, but
didn't clear the dirty tags of those pages in the radix tree.
If we don't clear the dirty tags of the pages in the radix tree, the
inodes which contain the pages will be marked with I_DIRTY_PAGES again
and again, and writepages() for the inodes will be invoked in every
writeback period. As a result, nothing will be done in every
writepages() for the inodes and it will just consume CPU time
meaninglessly.
Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The page_owner stacktrace always begin as follows:
[<ffffff987bfd48f4>] save_stack+0x40/0xc8
[<ffffff987bfd4da8>] __set_page_owner+0x3c/0x6c
These two entries do not provide any useful information and limits the
available stacktrace depth. The page_owner stacktrace was skipping caller
function from stack entries but this was missed with commit f2ca0b557107
("mm/page_owner: use stackdepot to store stacktrace")
Example page_owner entry after the patch:
Page allocated via order 0, mask 0x8(ffffff80085fb714)
PFN 654411 type Movable Block 639 type CMA Flags 0x0(ffffffbe5c7f12c0)
[<ffffff9b64989c14>] post_alloc_hook+0x70/0x80
...
[<ffffff9b651216e8>] msm_comm_try_state+0x5f8/0x14f4
[<ffffff9b6512486c>] msm_vidc_open+0x5e4/0x7d0
[<ffffff9b65113674>] msm_v4l2_open+0xa8/0x224
Change-Id: Ia34fc127d691c2858d991bb631aad9ebc703bcee
Link: http://lkml.kernel.org/r/1504078343-28754-2-git-send-email-guptap@codeaurora.org
Fixes: f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace")
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Git-commit: 665bb76dd71bc061c5f730226dc1881d151983c8
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
* refs/heads/tmp-610af85
Linux 4.4.85
ACPI / APEI: Add missing synchronize_rcu() on NOTIFY_SCI removal
ACPI: ioapic: Clear on-stack resource before using it
ntb_transport: fix bug calculating num_qps_mw
ntb_transport: fix qp count bug
ASoC: rsnd: don't call update callback if it was NULL
ASoC: rsnd: ssi: 24bit data needs right-aligned settings
ASoC: rsnd: Add missing initialization of ADG req_rate
ASoC: rsnd: avoid pointless loop in rsnd_mod_interrupt()
ASoC: rsnd: disable SRC.out only when stop timing
ASoC: simple-card: don't fail if sysclk setting is not supported
staging: rtl8188eu: add RNX-N150NUB support
iio: hid-sensor-trigger: Fix the race with user space powering up sensors
iio: imu: adis16480: Fix acceleration scale factor for adis16480
ANDROID: binder: fix proc->tsk check.
binder: Use wake up hint for synchronous transactions.
binder: use group leader instead of open thread
Bluetooth: bnep: fix possible might sleep error in bnep_session
Bluetooth: cmtp: fix possible might sleep error in cmtp_session
Bluetooth: hidp: fix possible might sleep error in hidp_session_thread
perf/core: Fix group {cpu,task} validation
nfsd: Limit end of page list when decoding NFSv4 WRITE
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
cifs: Fix df output for users with quota limits
tracing: Fix freeing of filter in create_filter() when set_str is false
drm: rcar-du: Fix H/V sync signal polarity configuration
drm: rcar-du: Fix display timing controller parameter
drm: rcar-du: Fix crash in encoder failure error path
drm: rcar-du: lvds: Rename PLLEN bit to PLLON
drm: rcar-du: lvds: Fix PLL frequency-related configuration
drm/atomic: If the atomic check fails, return its value first
drm: Release driver tracking before making the object available again
i2c: designware: Fix system suspend
ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
ALSA: core: Fix unexpected error at replacing user TLV
Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
Input: trackpoint - add new trackpoint firmware ID
mei: me: add lewisburg device ids
mei: me: add broxton pci device ids
net_sched: fix order of queue length updates in qdisc_replace()
net: sched: fix NULL pointer dereference when action calls some targets
irda: do not leak initialized list.dev to userspace
tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
ipv6: repair fib6 tree in failure case
ipv6: reset fn->rr_ptr when replacing route
tipc: fix use-after-free
sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
ipv4: better IP_MAX_MTU enforcement
net_sched/sfq: update hierarchical backlog when drop packet
ipv4: fix NULL dereference in free_fib_info_rcu()
dccp: defer ccid_hc_tx_delete() at dismantle time
dccp: purge write queue in dccp_destroy_sock()
af_key: do not use GFP_KERNEL in atomic contexts
ANDROID: NFC: st21nfca: Fix memory OOB and leak issues in connectivity events handler
Linux 4.4.84
usb: qmi_wwan: add D-Link DWM-222 device ID
usb: optimize acpi companion search for usb port devices
perf/x86: Fix LBR related crashes on Intel Atom
pids: make task_tgid_nr_ns() safe
Sanitize 'move_pages()' permission checks
irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
x86/asm/64: Clear AC on NMI entries
xen: fix bio vec merging
mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
mm/mempolicy: fix use after free when calling get_mempolicy
ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices
ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset
ALSA: seq: 2nd attempt at fixing race creating a queue
Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB
Input: elan_i2c - add ELAN0608 to the ACPI table
crypto: x86/sha1 - Fix reads beyond the number of blocks passed
parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo
audit: Fix use after free in audit_remove_watch_rule()
netfilter: nf_ct_ext: fix possible panic after nf_ct_extend_unregister
ANDROID: check dir value of xfrm_userpolicy_id
ANDROID: NFC: Fix possible memory corruption when handling SHDLC I-Frame commands
ANDROID: nfc: fdp: Fix possible buffer overflow in WCS4000 NFC driver
ANDROID: NFC: st21nfca: Fix out of bounds kernel access when handling ATR_REQ
UPSTREAM: usb: dwc3: gadget: don't send extra ZLP
BACKPORT: usb: dwc3: gadget: handle request->zero
ANDROID: usb: gadget: assign no-op request complete callbacks
ANDROID: usb: gadget: configfs: fix null ptr in android_disconnect
ANDROID: uid_sys_stats: Fix implicit declaration of get_cmdline()
uid_sys_stats: log task io with a debug flag
Linux 4.4.83
pinctrl: samsung: Remove bogus irq_[un]mask from resource management
pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
pnfs/blocklayout: require 64-bit sector_t
iio: adc: vf610_adc: Fix VALT selection value for REFSEL bits
usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume
usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter
usb: core: unlink urbs from the tail of the endpoint's urb_list
USB: Check for dropped connection before switching to full speed
uas: Add US_FL_IGNORE_RESIDUE for Initio Corporation INIC-3069
iio: light: tsl2563: use correct event code
iio: accel: bmc150: Always restore device to normal mode after suspend-resume
staging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read
USB: hcd: Mark secondary HCD as dead if the primary one died
usb: musb: fix tx fifo flush handling again
USB: serial: pl2303: add new ATEN device id
USB: serial: cp210x: add support for Qivicon USB ZigBee dongle
USB: serial: option: add D-Link DWM-222 device ID
nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
fuse: initialize the flock flag in fuse_file on allocation
iscsi-target: Fix iscsi_np reset hung task during parallel delete
iscsi-target: fix memory leak in iscsit_setup_text_cmd()
mm: ratelimit PFNs busy info message
cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()
ANDROID: Use sk_uid to replace uid get from socket file
UPSTREAM: arm64: smp: Prevent raw_smp_processor_id() recursion
UPSTREAM: arm64: restore get_current() optimisation
ANDROID: arm64: Fix a copy-paste error in prior init_thread_info build fix
Conflicts:
drivers/misc/Kconfig
drivers/usb/dwc3/gadget.c
include/linux/sched.h
mm/migrate.c
net/netfilter/xt_qtaguid.c
Change-Id: I3a0107fcb5c7455114b316426c9d669bb871acd1
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
* refs/heads/tmp-4b8fc9f
UPSTREAM: locking: avoid passing around 'thread_info' in mutex debugging code
ANDROID: arm64: fix undeclared 'init_thread_info' error
UPSTREAM: kdb: use task_cpu() instead of task_thread_info()->cpu
Linux 4.4.82
net: account for current skb length when deciding about UFO
ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
mm/mempool: avoid KASAN marking mempool poison checks as use-after-free
KVM: arm/arm64: Handle hva aging while destroying the vm
sparc64: Prevent perf from running during super critical sections
udp: consistently apply ufo or fragmentation
revert "ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output"
revert "net: account for current skb length when deciding about UFO"
packet: fix tp_reserve race in packet_set_ring
net: avoid skb_warn_bad_offload false positives on UFO
tcp: fastopen: tcp_connect() must refresh the route
net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target
bpf, s390: fix jit branch offset related to ldimm64
net: fix keepalive code vs TCP_FASTOPEN_CONNECT
tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states
ANDROID: keychord: Fix for a memory leak in keychord.
ANDROID: keychord: Fix races in keychord_write.
Use %zu to print resid (size_t).
ANDROID: keychord: Fix a slab out-of-bounds read.
Linux 4.4.81
workqueue: implicit ordered attribute should be overridable
net: account for current skb length when deciding about UFO
ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
mm: don't dereference struct page fields of invalid pages
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
lib/Kconfig.debug: fix frv build failure
mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER
ARM: 8632/1: ftrace: fix syscall name matching
virtio_blk: fix panic in initialization error path
drm/virtio: fix framebuffer sparse warning
scsi: qla2xxx: Get mutex lock before checking optrom_state
phy state machine: failsafe leave invalid RUNNING state
x86/boot: Add missing declaration of string functions
tg3: Fix race condition in tg3_get_stats64().
net: phy: dp83867: fix irq generation
sh_eth: R8A7740 supports packet shecksumming
wext: handle NULL extra data in iwe_stream_add_point better
sparc64: Measure receiver forward progress to avoid send mondo timeout
xen-netback: correctly schedule rate-limited queues
net: phy: Fix PHY unbind crash
net: phy: Correctly process PHY_HALTED in phy_stop_machine()
net/mlx5: Fix command bad flow on command entry allocation failure
sctp: fix the check for _sctp_walk_params and _sctp_walk_errors
sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()
dccp: fix a memleak for dccp_feat_init err process
dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly
dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly
net: ethernet: nb8800: Handle all 4 RGMII modes identically
ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment()
packet: fix use-after-free in prb_retire_rx_blk_timer_expired()
openvswitch: fix potential out of bound access in parse_ct
mcs7780: Fix initialization when CONFIG_VMAP_STACK is enabled
rtnetlink: allocate more memory for dev_set_mac_address()
ipv4: initialize fib_trie prior to register_netdev_notifier call.
ipv6: avoid overflow of offset in ip6_find_1stfragopt
net: Zero terminate ifr_name in dev_ifname().
ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check()
saa7164: fix double fetch PCIe access condition
drm: rcar-du: fix backport bug
f2fs: sanity check checkpoint segno and blkoff
media: lirc: LIRC_GET_REC_RESOLUTION should return microseconds
mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
iser-target: Avoid isert_conn->cm_id dereference in isert_login_recv_done
iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP
iscsi-target: Fix initial login PDU asynchronous socket close OOPs
iscsi-target: Fix early sk_data_ready LOGIN_FLAGS_READY race
iscsi-target: Always wait for kthread_should_stop() before kthread exit
target: Avoid mappedlun symlink creation during lun shutdown
media: platform: davinci: return -EINVAL for VPFE_CMD_S_CCDC_RAW_PARAMS ioctl
ARM: dts: armada-38x: Fix irq type for pca955
ext4: fix overflow caused by missing cast in ext4_resize_fs()
ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
mm/page_alloc: Remove kernel address exposure in free_reserved_area()
KVM: async_pf: make rcu irq exit if not triggered from idle task
ASoC: do not close shared backend dailink
ALSA: hda - Fix speaker output from VAIO VPCL14M1R
workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
libata: array underflow in ata_find_dev()
ANDROID: binder: don't queue async transactions to thread.
ANDROID: binder: don't enqueue death notifications to thread todo.
ANDROID: binder: call poll_wait() unconditionally.
android: configs: move quota-related configs to recommended
BACKPORT: arm64: split thread_info from task stack
UPSTREAM: arm64: assembler: introduce ldr_this_cpu
UPSTREAM: arm64: make cpu number a percpu variable
UPSTREAM: arm64: smp: prepare for smp_processor_id() rework
BACKPORT: arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx
UPSTREAM: arm64: prep stack walkers for THREAD_INFO_IN_TASK
UPSTREAM: arm64: unexport walk_stackframe
UPSTREAM: arm64: traps: simplify die() and __die()
UPSTREAM: arm64: factor out current_stack_pointer
BACKPORT: arm64: asm-offsets: remove unused definitions
UPSTREAM: arm64: thread_info remove stale items
UPSTREAM: thread_info: include <current.h> for THREAD_INFO_IN_TASK
UPSTREAM: thread_info: factor out restart_block
UPSTREAM: kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function
UPSTREAM: sched/core: Add try_get_task_stack() and put_task_stack()
UPSTREAM: sched/core: Allow putting thread_info into task_struct
UPSTREAM: printk: when dumping regs, show the stack, not thread_info
UPSTREAM: fix up initial thread stack pointer vs thread_info confusion
UPSTREAM: Clarify naming of thread info/stack allocators
ANDROID: sdcardfs: override credential for ioctl to lower fs
Conflicts:
android/configs/android-base.cfg
arch/arm64/Kconfig
arch/arm64/include/asm/suspend.h
arch/arm64/kernel/head.S
arch/arm64/kernel/smp.c
arch/arm64/kernel/suspend.c
arch/arm64/kernel/traps.c
arch/arm64/mm/proc.S
kernel/fork.c
sound/soc/soc-pcm.c
Change-Id: I273e216c94899a838bbd208391c6cbe20b2bf683
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlmfaTcACgkQONu9yGCS
aT65LhAAlTz6ssLhstl1QhFIUZ5cTIMouONCYK2+fwg67J3nf8I4whaj0eXX9RIT
4f6L1JNn3YV6lEfXZLs3EHeiMWft2cal95yQULwhsSwI/1rAzjYkFtj59gPxp2Rc
03ZrU8UnWNZmKpzneiwnd0kkFi+7wBKT5GbERy6Voh1hpAg8HbHQeWdVxFXKCdLD
eSP1+RNsvknZBjcJibxhsArs8E8r5t+dXDzi0HYpiZvctV23VXD+y2UDE1RMEDx5
k5fH30DIxd3T1JU1qHGnJUlfK5jKVho76zoSThwEFm9xqoZat/xrby5gW5sMcWeD
0BMw4F6GYE8BoeViC/+iujR0B8ngU0e+ExH+M6WYoEGPH1BFHyPNqDoKjnyAjyyH
tQEOD/0aRWuxcBVyk34EafNZeou/AeDd0IReAHciCIomN0+3u104+HlxkGH1oXEn
u0O5kVXQPaB/YeXd3jRLSfDmzxojaaihTeJZGFi//1iAj+jJEeYagfeI+flqrtaC
Gcwi55HrNrLbEj9kBFLEnm8RgFyWFsO0oVbfu1bPUGZOmuMi4u1Ptkffi3p/Wrsh
cx9ErKXj6meOgkcmzCWYl1Ygp3rY3bdlbidixJnzEfOTeZ2FyxnMz3BQAxGeOPTD
OhUevEK08oMTb1YDt3i7Sh1BGKpU0AEaEw5i8m36m4rC6KdqJfA=
=JeJw
-----END PGP SIGNATURE-----
Merge 4.4.84 into android-4.4
Changes in 4.4.84
netfilter: nf_ct_ext: fix possible panic after nf_ct_extend_unregister
audit: Fix use after free in audit_remove_watch_rule()
parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo
crypto: x86/sha1 - Fix reads beyond the number of blocks passed
Input: elan_i2c - add ELAN0608 to the ACPI table
Input: elan_i2c - Add antoher Lenovo ACPI ID for upcoming Lenovo NB
ALSA: seq: 2nd attempt at fixing race creating a queue
ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset
ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices
mm/mempolicy: fix use after free when calling get_mempolicy
mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
xen: fix bio vec merging
x86/asm/64: Clear AC on NMI entries
irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
Sanitize 'move_pages()' permission checks
pids: make task_tgid_nr_ns() safe
perf/x86: Fix LBR related crashes on Intel Atom
usb: optimize acpi companion search for usb port devices
usb: qmi_wwan: add D-Link DWM-222 device ID
Linux 4.4.84
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 197e7e521384a23b9e585178f3f11c9fa08274b9 upstream.
The 'move_paghes()' system call was introduced long long ago with the
same permission checks as for sending a signal (except using
CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability).
That turns out to not be a great choice - while the system call really
only moves physical page allocations around (and you need other
capabilities to do a lot of it), you can check the return value to map
out some the virtual address choices and defeat ASLR of a binary that
still shares your uid.
So change the access checks to the more common 'ptrace_may_access()'
model instead.
This tightens the access checks for the uid, and also effectively
changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that
anybody really _uses_ this legacy system call any more (we hav ebetter
NUMA placement models these days), so I expect nobody to notice.
Famous last words.
Reported-by: Otto Ebeling <otto.ebeling@iki.fi>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>