evie/android_kernel_oneplus_msm8998 - Gay Catgirls Forgejo: gay catgirls having sex

evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Morten Rasmussen	1e960320c9	sched: Prevent unnecessary active balance of single task in sched group Scenarios with the busiest group having just one task and the local being idle on topologies with sched groups with different numbers of cpus manage to dodge all load-balance bailout conditions resulting the nr_balance_failed counter to be incremented. This eventually causes a pointless active migration of the task. This patch prevents this by not incrementing the counter when the busiest group only has one task. ASYM_PACKING migrations and migrations due to reduced capacity should still take place as these are explicitly captured by need_active_balance(). A better solution would be to not attempt the load-balance in the first place, but that requires significant changes to the order of bailout conditions and statistics gathering. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>	2016-09-14 14:48:50 +05:30
Dietmar Eggemann	11d962803d	sched: Enable idle balance to pull single task towards cpu with higher capacity We do not want to miss out on the ability to pull a single remaining task from a potential source cpu towards an idle destination cpu. Add an extra criteria to need_active_balance() to kick off active load balance if the source cpu is over-utilized and has lower capacity than the destination cpu. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>	2016-09-14 14:48:50 +05:30
Morten Rasmussen	d72801bf86	sched: Consider spare cpu capacity at task wake-up find_idlest_group() selects the wake-up target group purely based on group load which leads to suboptimal choices in low load scenarios. An idle group with reduced capacity (due to RT tasks or different cpu type) isn't necessarily a better target than a lightly loaded group with higher capacity. The patch adds spare capacity as an additional group selection parameter. The target group is now selected based on the following criteria: 1. Return the group with the cpu with most spare capacity and this capacity is significant if such group exists. Significant spare capacity is currently at least 20% to spare. 2. Return the group with the lowest load, unless it is the local group in which case NULL is returned and the search is continued at the next (lower) level. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>	2016-09-14 14:48:50 +05:30
Morten Rasmussen	91b2b63314	sched: Add cpu capacity awareness to wakeup balancing Wakeup balancing is completely unaware of cpu capacity, cpu utilization and task utilization. The task is preferably placed on a cpu which is idle in the instant the wakeup happens. New tasks (SD_BALANCE_{FORK,EXEC} are placed on an idle cpu in the idlest group if such can be found, otherwise it goes on the least loaded one. Existing tasks (SD_BALANCE_WAKE) are placed on the previous cpu or an idle cpu sharing the same last level cache unless the wakee_flips heuristic in wake_wide() decides to fallback to considering cpus outside SD_LLC. Hence existing tasks are not guaranteed to get a chance to migrate to a different group at wakeup in case the current one has reduced cpu capacity (due RT/IRQ pressure or different uarch e.g. ARM big.LITTLE). They may eventually get pulled by other cpus doing periodic/idle/nohz_idle balance, but it may take quite a while before it happens. This patch adds capacity awareness to find_idlest_{group,queue} (used by SD_BALANCE_{FORK,EXEC} and SD_BALANCE_WAKE under certain circumstances) such that groups/cpus that can accommodate the waking task based on task utilization are preferred. In addition, wakeup of existing tasks (SD_BALANCE_WAKE) is sent through find_idlest_{group,queue} also if the task doesn't fit the capacity of the previous cpu to allow it to escape (override wake_affine) when necessary instead of relying on periodic/idle/nohz_idle balance to eventually sort it out. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>	2016-09-14 14:48:50 +05:30
Dietmar Eggemann	50df3f37c6	sched: Store system-wide maximum cpu capacity in root domain To be able to compare the capacity of the target cpu with the highest cpu capacity of the system in the wakeup path, store the system-wide maximum cpu capacity in the root domain. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>	2016-09-14 14:48:50 +05:30
Morten Rasmussen	49e029d7a7	arm: Update arch_scale_cpu_capacity() to reflect change to define arch_scale_cpu_capacity() is no longer a weak function but a #define instead. Include the #define in topology.h. cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>	2016-09-14 14:48:50 +05:30
Dietmar Eggemann	777273b78e	arm64: Enable frequency invariant scheduler load-tracking support Defines arch_scale_freq_capacity() to use cpufreq implementation. Including <linux/cpufreq.h> in topology.h like for the arm arch doesn't work because of CONFIG_COMPAT=y (Kernel support for 32-bit EL0). That's why cpufreq_scale_freq_capacity() has to be declared extern in topology.h. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>	2016-09-14 14:48:50 +05:30
Dietmar Eggemann	ec2b699198	arm: Enable frequency invariant scheduler load-tracking support Defines arch_scale_freq_capacity() to use cpufreq implementation. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>	2016-09-14 14:48:50 +05:30
Dietmar Eggemann	e3c1175d91	cpufreq: Frequency invariant scheduler load-tracking support Implements cpufreq_scale_freq_capacity() to provide the scheduler with a frequency scaling correction factor for more accurate load-tracking. The factor is: current_freq(cpu) << SCHED_CAPACITY_SHIFT / max_freq(cpu) In fact, freq_scale should be a struct cpufreq_policy data member. But this would require that the scheduler hot path (__update_load_avg()) would have to grab the cpufreq lock. This can be avoided by using per-cpu data initialized to SCHED_CAPACITY_SCALE for freq_scale. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>	2016-09-14 14:48:50 +05:30
Yuyang Du	8e21145595	sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task() If a newly created task is selected to go to a different CPU in fork balance when it wakes up the first time, its load averages should not be removed from the source CPU since they are never added to it before. The same is also applicable to a never used group entity. Fix it in remove_entity_load_avg(): when entity's last_update_time is 0, simply return. This should precisely identify the case in question, because in other migrations, the last_update_time is set to 0 after remove_entity_load_avg(). Reported-by: Steve Muckle <steve.muckle@linaro.org> Signed-off-by: Yuyang Du <yuyang.du@intel.com> [peterz: cfs_rq_last_update_time] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Juri Lelli <Juri.Lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Patrick Bellasi <patrick.bellasi@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincent Guittot <vincent.guittot@linaro.org> Link: http://lkml.kernel.org/r/20151216233427.GJ28098@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-09-14 14:45:40 +05:30
Mark Salyzyn	872ffdd406	FROMLIST: pstore: drop pmsg bounce buffer (from https://lkml.org/lkml/2016/9/1/428) (cherry pick from android-3.10 commit b58133100b38f2bf83cad2d7097417a3a196ed0b) Removing a bounce buffer copy operation in the pmsg driver path is always better. We also gain in overall performance by not requesting a vmalloc on every write as this can cause precious RT tasks, such as user facing media operation, to stall while memory is being reclaimed. Added a write_buf_user to the pstore functions, a backup platform write_buf_user that uses the small buffer that is part of the instance, and implemented a ramoops write_buf_user that only supports PSTORE_TYPE_PMSG. Signed-off-by: Mark Salyzyn <salyzyn@google.com> Bug: 31057326 Change-Id: I4cdee1cd31467aa3e6c605bce2fbd4de5b0f8caa	2016-09-14 14:44:33 +05:30
Kees Cook	f9ae153c05	UPSTREAM: usercopy: remove page-spanning test for now A custom allocator without __GFP_COMP that copies to userspace has been found in vmw_execbuf_process[1], so this disables the page-span checker by placing it behind a CONFIG for future work where such things can be tracked down later. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1373326 Reported-by: Vinson Lee <vlee@freedesktop.org> Fixes: f5509cc18daa ("mm: Hardened usercopy") Signed-off-by: Kees Cook <keescook@chromium.org> Change-Id: I4177c0fb943f14a5faf5c70f5e54bf782c316f43 (cherry picked from commit 8e1f74ea02cf4562404c48c6882214821552c13f) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:44:29 +05:30
Kees Cook	0c9f69c443	UPSTREAM: usercopy: force check_object_size() inline Just for good measure, make sure that check_object_size() is always inlined too, as already done for copy__user() and __copy__user(). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kees Cook <keescook@chromium.org> Change-Id: Ibfdf4790d03fe426e68d9a864c55a0d1bbfb7d61 (cherry picked from commit a85d6b8242dc78ef3f4542a0f979aebcbe77fc4e) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:44:29 +05:30
Kees Cook	ab8010a1ae	BACKPORT: usercopy: fold builtin_const check into inline function Instead of having each caller of check_object_size() need to remember to check for a const size parameter, move the check into check_object_size() itself. This actually matches the original implementation in PaX, though this commit cleans up the now-redundant builtin_const() calls in the various architectures. Signed-off-by: Kees Cook <keescook@chromium.org> Change-Id: I348809399c10ffa051251866063be674d064b9ff (cherry picked from 81409e9e28058811c9ea865345e1753f8f677e44) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:44:29 +05:30
Kees Cook	b53d6e9ab1	UPSTREAM: x86/uaccess: force copy__user() to be inlined As already done with __copy__user(), mark copy_*_user() as __always_inline. Without this, the checks for things like __builtin_const_p() won't work consistently in either hardened usercopy nor the recent adjustments for detecting usercopy overflows at compile time. The change in kernel text size is detectable, but very small: text data bss dec hex filename 12118735 5768608 14229504 32116847 1ea106f vmlinux.before 12120207 5768608 14229504 32118319 1ea162f vmlinux.after Signed-off-by: Kees Cook <keescook@chromium.org> Change-Id: I284c85c2a782145f46655a91d4f83874c90eba61 (cherry picked from commit e6971009a95a74f28c58bbae415c40effad1226c) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:44:29 +05:30
Benjamin Tissoires	4774accdcc	UPSTREAM: HID: core: prevent out-of-bound readings (cherry picked from commit 50220dead1650609206efe91f0cc116132d59b3f) Plugging a Logitech DJ receiver with KASAN activated raises a bunch of out-of-bound readings. The fields are allocated up to MAX_USAGE, meaning that potentially, we do not have enough fields to fit the incoming values. Add checks and silence KASAN. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Change-Id: Iaf25e882a6696884439d7091b5fbb0b350d893d3 Bug: 30951261	2016-09-14 14:44:29 +05:30
Mohan Srinivasan	34363937f7	Android: Fix build breakages. The IO latency histogram change broke allmodconfig and allnoconfig builds. This fixes those breakages. Change-Id: I9cdae655b40ed155468f3cef25cdb74bb56c4d3e Signed-off-by: Mohan Srinivasan <srmohan@google.com>	2016-09-14 14:44:29 +05:30
Peter Hurley	f52aa97a9b	UPSTREAM: tty: Prevent ldisc drivers from re-using stale tty fields (cherry picked from commit dd42bf1197144ede075a9d4793123f7689e164bc) Line discipline drivers may mistakenly misuse ldisc-related fields when initializing. For example, a failure to initialize tty->receive_room in the N_GIGASET_M101 line discipline was recently found and fixed [1]. Now, the N_X25 line discipline has been discovered accessing the previous line discipline's already-freed private data [2]. Harden the ldisc interface against misuse by initializing revelant tty fields before instancing the new line discipline. [1] commit `fd98e9419d` Author: Tilman Schmidt <tilman@imap.cc> Date: Tue Jul 14 00:37:13 2015 +0200 isdn/gigaset: reset tty->receive_room when attaching ser_gigaset [2] Report from Sasha Levin <sasha.levin@oracle.com> [ 634.336761] ================================================================== [ 634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at addr ffff8800a743efd0 [ 634.339558] Read of size 4 by task syzkaller_execu/8981 [ 634.340359] ============================================================================= [ 634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected ... [ 634.405018] Call Trace: [ 634.405277] dump_stack (lib/dump_stack.c:52) [ 634.405775] print_trailer (mm/slub.c:655) [ 634.406361] object_err (mm/slub.c:662) [ 634.406824] kasan_report_error (mm/kasan/report.c:138 mm/kasan/report.c:236) [ 634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279) [ 634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 (discriminator 1)) [ 634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447) [ 634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567) [ 634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 drivers/tty/tty_io.c:2879) [ 634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607) [ 634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613) [ 634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188) Cc: Tilman Schmidt <tilman@imap.cc> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: Ibed6feadfb9706d478f93feec3b240aecfc64af3 Bug: 30951112	2016-09-14 14:44:29 +05:30
Phil Turnbull	d24bf3c8b2	UPSTREAM: netfilter: nfnetlink: correctly validate length of batch messages (cherry picked from commit c58d6c93680f28ac58984af61d0a7ebf4319c241) If nlh->nlmsg_len is zero then an infinite loop is triggered because 'skb_pull(skb, msglen);' pulls zero bytes. The calculation in nlmsg_len() underflows if 'nlh->nlmsg_len < NLMSG_HDRLEN' which bypasses the length validation and will later trigger an out-of-bound read. If the length validation does fail then the malformed batch message is copied back to userspace. However, we cannot do this because the nlh->nlmsg_len can be invalid. This leads to an out-of-bounds read in netlink_ack: [ 41.455421] ================================================================== [ 41.456431] BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr ffff880119e79340 [ 41.456431] Read of size 4294967280 by task a.out/987 [ 41.456431] ============================================================================= [ 41.456431] BUG kmalloc-512 (Not tainted): kasan: bad access detected [ 41.456431] ----------------------------------------------------------------------------- ... [ 41.456431] Bytes b4 ffff880119e79310: 00 00 00 00 d5 03 00 00 b0 fb fe ff 00 00 00 00 ................ [ 41.456431] Object ffff880119e79320: 20 00 00 00 10 00 05 00 00 00 00 00 00 00 00 00 ............... [ 41.456431] Object ffff880119e79330: 14 00 0a 00 01 03 fc 40 45 56 11 22 33 10 00 05 .......@EV."3... [ 41.456431] Object ffff880119e79340: f0 ff ff ff 88 99 aa bb 00 14 00 0a 00 06 fe fb ................ ^^ start of batch nlmsg with nlmsg_len=4294967280 ... [ 41.456431] Memory state around the buggy address: [ 41.456431] ffff880119e79400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 41.456431] ffff880119e79480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 41.456431] >ffff880119e79500: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc [ 41.456431] ^ [ 41.456431] ffff880119e79580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 41.456431] ffff880119e79600: fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb fb [ 41.456431] ================================================================== Fix this with better validation of nlh->nlmsg_len and by setting NFNL_BATCH_FAILURE if any batch message fails length validation. CAP_NET_ADMIN is required to trigger the bugs. Fixes: `9ea2aa8b7d` ("netfilter: nfnetlink: validate nfnetlink header from batch") Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Id3e15c40cb464bf2791af907c235d8a316b2449c Bug: 30947055	2016-09-14 14:44:29 +05:30
Riley Andrews	9400d22ae1	cpuset: Make cpusets restore on hotplug This deliberately changes the behavior of the per-cpuset cpus file to not be effected by hotplug. When a cpu is offlined, it will be removed from the cpuset/cpus file. When a cpu is onlined, if the cpuset originally requested that that cpu was part of the cpuset, that cpu will be restored to the cpuset. The cpus files still have to be hierachical, but the ranges no longer have to be out of the currently online cpus, just the physically present cpus. Change-Id: I22cdf33e7d312117bcefba1aeb0125e1ada289a9 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>	2016-09-14 14:44:29 +05:30
Joonsoo Kim	e08e07aec2	UPSTREAM: mm/slub: support left redzone SLUB already has a redzone debugging feature. But it is only positioned at the end of object (aka right redzone) so it cannot catch left oob. Although current object's right redzone acts as left redzone of next object, first object in a slab cannot take advantage of this effect. This patch explicitly adds a left red zone to each object to detect left oob more precisely. Background: Someone complained to me that left OOB doesn't catch even if KASAN is enabled which does page allocation debugging. That page is out of our control so it would be allocated when left OOB happens and, in this case, we can't find OOB. Moreover, SLUB debugging feature can be enabled without page allocator debugging and, in this case, we will miss that OOB. Before trying to implement, I expected that changes would be too complex, but, it doesn't look that complex to me now. Almost changes are applied to debug specific functions so I feel okay. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: Ib893a17ecabd692e6c402e864196bf89cd6781a5 (cherry picked from commit d86bd1bece6fc41d59253002db5441fe960a37f6) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:44:29 +05:30
Linus Torvalds	03eb77747d	UPSTREAM: Make the hardened user-copy code depend on having a hardened allocator The kernel test robot reported a usercopy failure in the new hardened sanity checks, due to a page-crossing copy of the FPU state into the task structure. This happened because the kernel test robot was testing with SLOB, which doesn't actually do the required book-keeping for slab allocations, and as a result the hardening code didn't realize that the task struct allocation was one single allocation - and the sanity checks fail. Since SLOB doesn't even claim to support hardening (and you really shouldn't use it), the straightforward solution is to just make the usercopy hardening code depend on the allocator supporting it. Reported-by: kernel test robot <xiaolong.ye@intel.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: I37d51f866f873341bf7d5297249899b852e1c6ce (cherry picked from commit 6040e57658eee6eb1315a26119101ca832d1f854) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:43:39 +05:30
Mohan Srinivasan	8e4b2f84a8	Android: MMC/UFS IO Latency Histograms. This patch adds a new sysfs node (latency_hist) and reports IO (svc time) latency histograms. Disabled by default, can be enabled by echoing 0 into latency_hist, stats can be cleared by writing 2 into latency_hist. This commit fixes the 32 bit build breakage in the previous commit. Tested on both 32 bit and 64 bit arm devices. Bug: 30677035 Change-Id: I9a615a16616d80f87e75676ac4d078a5c429dcf9 Signed-off-by: Mohan Srinivasan <srmohan@google.com>	2016-09-14 14:43:34 +05:30
Josh Poimboeuf	10bd621d38	UPSTREAM: usercopy: fix overlap check for kernel text When running with a local patch which moves the '_stext' symbol to the very beginning of the kernel text area, I got the following panic with CONFIG_HARDENED_USERCOPY: usercopy: kernel memory exposure attempt detected from ffff88103dfff000 (<linear kernel text>) (4096 bytes) ------------[ cut here ]------------ kernel BUG at mm/usercopy.c:79! invalid opcode: 0000 [#1] SMP ... CPU: 0 PID: 4800 Comm: cp Not tainted 4.8.0-rc3.after+ #1 Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.5.4 01/22/2016 task: ffff880817444140 task.stack: ffff880816274000 RIP: 0010:[<ffffffff8121c796>] __check_object_size+0x76/0x413 RSP: 0018:ffff880816277c40 EFLAGS: 00010246 RAX: 000000000000006b RBX: ffff88103dfff000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88081f80dfa8 RDI: ffff88081f80dfa8 RBP: ffff880816277c90 R08: 000000000000054c R09: 0000000000000000 R10: 0000000000000005 R11: 0000000000000006 R12: 0000000000001000 R13: ffff88103e000000 R14: ffff88103dffffff R15: 0000000000000001 FS: 00007fb9d1750800(0000) GS:ffff88081f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000021d2000 CR3: 000000081a08f000 CR4: 00000000001406f0 Stack: ffff880816277cc8 0000000000010000 000000043de07000 0000000000000000 0000000000001000 ffff880816277e60 0000000000001000 ffff880816277e28 000000000000c000 0000000000001000 ffff880816277ce8 ffffffff8136c3a6 Call Trace: [<ffffffff8136c3a6>] copy_page_to_iter_iovec+0xa6/0x1c0 [<ffffffff8136e766>] copy_page_to_iter+0x16/0x90 [<ffffffff811970e3>] generic_file_read_iter+0x3e3/0x7c0 [<ffffffffa06a738d>] ? xfs_file_buffered_aio_write+0xad/0x260 [xfs] [<ffffffff816e6262>] ? down_read+0x12/0x40 [<ffffffffa06a61b1>] xfs_file_buffered_aio_read+0x51/0xc0 [xfs] [<ffffffffa06a6692>] xfs_file_read_iter+0x62/0xb0 [xfs] [<ffffffff812224cf>] __vfs_read+0xdf/0x130 [<ffffffff81222c9e>] vfs_read+0x8e/0x140 [<ffffffff81224195>] SyS_read+0x55/0xc0 [<ffffffff81003a47>] do_syscall_64+0x67/0x160 [<ffffffff816e8421>] entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:[<00007fb9d0c33c00>] 0x7fb9d0c33c00 RSP: 002b:00007ffc9c262f28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: fffffffffff8ffff RCX: 00007fb9d0c33c00 RDX: 0000000000010000 RSI: 00000000021c3000 RDI: 0000000000000004 RBP: 00000000021c3000 R08: 0000000000000000 R09: 00007ffc9c264d6c R10: 00007ffc9c262c50 R11: 0000000000000246 R12: 0000000000010000 R13: 00007ffc9c2630b0 R14: 0000000000000004 R15: 0000000000010000 Code: 81 48 0f 44 d0 48 c7 c6 90 4d a3 81 48 c7 c0 bb b3 a2 81 48 0f 44 f0 4d 89 e1 48 89 d9 48 c7 c7 68 16 a3 81 31 c0 e8 f4 57 f7 ff <0f> 0b 48 8d 90 00 40 00 00 48 39 d3 0f 83 22 01 00 00 48 39 c3 RIP [<ffffffff8121c796>] __check_object_size+0x76/0x413 RSP <ffff880816277c40> The checked object's range [ffff88103dfff000, ffff88103e000000) is valid, so there shouldn't have been a BUG. The hardened usercopy code got confused because the range's ending address is the same as the kernel's text starting address at 0xffff88103e000000. The overlap check is slightly off. Fixes: f5509cc18daa ("mm: Hardened usercopy") Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Change-Id: I839dbf4ddbb4d9874026a42abed557eb9b3f8bef (cherry picked from commit 94cd97af690dd9537818dc9841d0ec68bb1dd877) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:43:30 +05:30
Eric Biggers	c3f4d074ed	UPSTREAM: usercopy: avoid potentially undefined behavior in pointer math check_bogus_address() checked for pointer overflow using this expression, where 'ptr' has type 'const void *': ptr + n < ptr Since pointer wraparound is undefined behavior, gcc at -O2 by default treats it like the following, which would not behave as intended: (long)n < 0 Fortunately, this doesn't currently happen for kernel code because kernel code is compiled with -fno-strict-overflow. But the expression should be fixed anyway to use well-defined integer arithmetic, since it could be treated differently by different compilers in the future or could be reported by tools checking for undefined behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Change-Id: I73b13be651cf35c03482f2014bf2c3dd291518ab (cherry picked from commit 7329a655875a2f4bd6984fe8a7e00a6981e802f3) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:43:25 +05:30
Linus Torvalds	1cbefb3fb1	UPSTREAM: unsafe_[get\|put]_user: change interface to use a error target label When I initially added the unsafe_[get\|put]_user() helpers in commit 5b24a7a2aa20 ("Add 'unsafe' user access functions for batched accesses"), I made the mistake of modeling the interface on our traditional __[get\|put]_user() functions, which return zero on success, or -EFAULT on failure. That interface is fairly easy to use, but it's actually fairly nasty for good code generation, since it essentially forces the caller to check the error value for each access. In particular, since the error handling is already internally implemented with an exception handler, and we already use "asm goto" for various other things, we could fairly easily make the error cases just jump directly to an error label instead, and avoid the need for explicit checking after each operation. So switch the interface to pass in an error label, rather than checking the error value in the caller. Best do it now before we start growing more users (the signal handling code in particular would be a good place to use the new interface). So rather than if (unsafe_get_user(x, ptr)) ... handle error .. the interface is now unsafe_get_user(x, ptr, label); where an error during the user mode fetch will now just cause a jump to 'label' in the caller. Right now the actual _implementation_ of this all still ends up being a "if (err) goto label", and does not take advantage of any exception label tricks, but for "unsafe_put_user()" in particular it should be fairly straightforward to convert to using the exception table model. Note that "unsafe_get_user()" is much harder to convert to a clever exception table model, because current versions of gcc do not allow the use of "asm goto" (for the exception) with output values (for the actual value to be fetched). But that is hopefully not a limitation in the long term. [ Also note that it might be a good idea to switch unsafe_get_user() to actually _return_ the value it fetches from user space, but this commit only changes the error handling semantics ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: Ib905a84a04d46984320f6fd1056da4d72f3d6b53 (cherry picked from commit 1bd4403d86a1c06cb6cc9ac87664a0c9d3413d51) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:43:17 +05:30
Ard Biesheuvel	a37344911d	BACKPORT: arm64: mm: fix location of _etext As Kees Cook notes in the ARM counterpart of this patch [0]: The _etext position is defined to be the end of the kernel text code, and should not include any part of the data segments. This interferes with things that might check memory ranges and expect executable code up to _etext. In particular, Kees is referring to the HARDENED_USERCOPY patch set [1], which rejects attempts to call copy_to_user() on kernel ranges containing executable code, but does allow access to the .rodata segment. Regardless of whether one may or may not agree with the distinction, it makes sense for _etext to have the same meaning across architectures. So let's put _etext where it belongs, between .text and .rodata, and fix up existing references to use __init_begin instead, which unlike _end_rodata includes the exception and notes sections as well. The _etext references in kaslr.c are left untouched, since its references to [_stext, _etext) are meant to capture potential jump instruction targets, and so disregarding .rodata is actually an improvement here. [0] http://article.gmane.org/gmane.linux.kernel/2245084 [1] http://thread.gmane.org/gmane.linux.kernel.hardened.devel/2502 Reported-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit 9fdc14c55cd6579d619ccd9d40982e0805e62b6d) Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2016-09-14 14:42:34 +05:30
Kees Cook	de3e4231c6	BACKPORT: ARM: 8583/1: mm: fix location of _etext The _etext position is defined to be the end of the kernel text code, and should not include any part of the data segments. This interferes with things that might check memory ranges and expect executable code up to _etext. Just to be conservative, leave the kernel resource as it was, using __init_begin instead of _etext as the end mark. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Change-Id: Ida514d1359dbe6f782f562ce29b4ba09ae72bfc0 (cherry picked from commit 14c4a533e0996f95a0a64dfd0b6252d788cebc74) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2016-09-14 14:26:51 +05:30
Mohamad Ayyash	0202669aab	BACKPORT: Don't show empty tag stats for unprivileged uids BUG: 27577101 BUG: 27532522 Signed-off-by: Mohamad Ayyash <mkayyash@google.com>	2016-09-14 14:26:46 +05:30
Eric Dumazet	2a111a3c24	UPSTREAM: tcp: fix use after free in tcp_xmit_retransmit_queue() (cherry picked from commit bb1fceca22492109be12640d49f5ea5a544c6bb4) When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the tail of the write queue using tcp_add_write_queue_tail() Then it attempts to copy user data into this fresh skb. If the copy fails, we undo the work and remove the fresh skb. Unfortunately, this undo lacks the change done to tp->highest_sack and we can leave a dangling pointer (to a freed skb) Later, tcp_xmit_retransmit_queue() can dereference this pointer and access freed memory. For regular kernels where memory is not unmapped, this might cause SACK bugs because tcp_highest_sack_seq() is buggy, returning garbage instead of tp->snd_nxt, but with various debug features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel. This bug was found by Marco Grassi thanks to syzkaller. Fixes: `6859d49475` ("[TCP]: Abstract tp->highest_sack accessing & point to next skb") Reported-by: Marco Grassi <marco.gra@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I58bb02d6e4e399612e8580b9e02d11e661df82f5 Bug: 31183296	2016-09-14 14:26:42 +05:30
Amit Pundir	bfdbb3be1e	ANDROID: base-cfg: drop SECCOMP_FILTER config Don't need to set SECCOMP_FILTER explicitly since CONFIG_SECCOMP=y will select that config anyway. Fixes: `a49dcf2e74` ("ANDROID: base-cfg: enable SECCOMP config") Change-Id: Iff18ed4d2db5a55b9f9480d5ecbeef7b818b3837 Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2016-09-14 14:26:37 +05:30
Dan Carpenter	6bc91eb13f	UPSTREAM: [media] xc2028: unlock on error in xc2028_set_config() (cherry picked from commit 210bd104c6acd31c3c6b8b075b3f12d4a9f6b60d) We have to unlock before returning -ENOMEM. Fixes: 8dfbcc4351a0 ('[media] xc2028: avoid use after free') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Change-Id: I7b6ba9fde5c6e29467e6de23d398af2fe56e2547 Bug: 30946097	2016-09-14 14:26:32 +05:30
Mauro Carvalho Chehab	c70373ec84	UPSTREAM: [media] xc2028: avoid use after free (cherry picked from commit 8dfbcc4351a0b6d2f2d77f367552f48ffefafe18) If struct xc2028_config is passed without a firmware name, the following trouble may happen: [11009.907205] xc2028 5-0061: type set to XCeive xc2028/xc3028 tuner [11009.907491] ================================================================== [11009.907750] BUG: KASAN: use-after-free in strcmp+0x96/0xb0 at addr ffff8803bd78ab40 [11009.907992] Read of size 1 by task modprobe/28992 [11009.907994] ============================================================================= [11009.907997] BUG kmalloc-16 (Tainted: G W ): kasan: bad access detected [11009.907999] ----------------------------------------------------------------------------- [11009.908008] INFO: Allocated in xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd] age=0 cpu=3 pid=28992 [11009.908012] ___slab_alloc+0x581/0x5b0 [11009.908014] __slab_alloc+0x51/0x90 [11009.908017] __kmalloc+0x27b/0x350 [11009.908022] xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd] [11009.908026] usb_hcd_submit_urb+0x1e8/0x1c60 [11009.908029] usb_submit_urb+0xb0e/0x1200 [11009.908032] usb_serial_generic_write_start+0xb6/0x4c0 [11009.908035] usb_serial_generic_write+0x92/0xc0 [11009.908039] usb_console_write+0x38a/0x560 [11009.908045] call_console_drivers.constprop.14+0x1ee/0x2c0 [11009.908051] console_unlock+0x40d/0x900 [11009.908056] vprintk_emit+0x4b4/0x830 [11009.908061] vprintk_default+0x1f/0x30 [11009.908064] printk+0x99/0xb5 [11009.908067] kasan_report_error+0x10a/0x550 [11009.908070] __asan_report_load1_noabort+0x43/0x50 [11009.908074] INFO: Freed in xc2028_set_config+0x90/0x630 [tuner_xc2028] age=1 cpu=3 pid=28992 [11009.908077] __slab_free+0x2ec/0x460 [11009.908080] kfree+0x266/0x280 [11009.908083] xc2028_set_config+0x90/0x630 [tuner_xc2028] [11009.908086] xc2028_attach+0x310/0x8a0 [tuner_xc2028] [11009.908090] em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb] [11009.908094] em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb] [11009.908098] em28xx_dvb_init+0x81/0x8a [em28xx_dvb] [11009.908101] em28xx_register_extension+0xd9/0x190 [em28xx] [11009.908105] em28xx_dvb_register+0x10/0x1000 [em28xx_dvb] [11009.908108] do_one_initcall+0x141/0x300 [11009.908111] do_init_module+0x1d0/0x5ad [11009.908114] load_module+0x6666/0x9ba0 [11009.908117] SyS_finit_module+0x108/0x130 [11009.908120] entry_SYSCALL_64_fastpath+0x16/0x76 [11009.908123] INFO: Slab 0xffffea000ef5e280 objects=25 used=25 fp=0x (null) flags=0x2ffff8000004080 [11009.908126] INFO: Object 0xffff8803bd78ab40 @offset=2880 fp=0x0000000000000001 [11009.908130] Bytes b4 ffff8803bd78ab30: 01 00 00 00 2a 07 00 00 9d 28 00 00 01 00 00 00 ....*....(...... [11009.908133] Object ffff8803bd78ab40: 01 00 00 00 00 00 00 00 b0 1d c3 6a 00 88 ff ff ...........j.... [11009.908137] CPU: 3 PID: 28992 Comm: modprobe Tainted: G B W 4.5.0-rc1+ #43 [11009.908140] Hardware name: /NUC5i7RYB, BIOS RYBDWi35.86A.0350.2015.0812.1722 08/12/2015 [11009.908142] ffff8803bd78a000 ffff8802c273f1b8 ffffffff81932007 ffff8803c6407a80 [11009.908148] ffff8802c273f1e8 ffffffff81556759 ffff8803c6407a80 ffffea000ef5e280 [11009.908153] ffff8803bd78ab40 dffffc0000000000 ffff8802c273f210 ffffffff8155ccb4 [11009.908158] Call Trace: [11009.908162] [<ffffffff81932007>] dump_stack+0x4b/0x64 [11009.908165] [<ffffffff81556759>] print_trailer+0xf9/0x150 [11009.908168] [<ffffffff8155ccb4>] object_err+0x34/0x40 [11009.908171] [<ffffffff8155f260>] kasan_report_error+0x230/0x550 [11009.908175] [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290 [11009.908179] [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50 [11009.908182] [<ffffffff8155f5c3>] __asan_report_load1_noabort+0x43/0x50 [11009.908185] [<ffffffff8155ea00>] ? __asan_register_globals+0x50/0xa0 [11009.908189] [<ffffffff8194cea6>] ? strcmp+0x96/0xb0 [11009.908192] [<ffffffff8194cea6>] strcmp+0x96/0xb0 [11009.908196] [<ffffffffa13ba4ac>] xc2028_set_config+0x15c/0x630 [tuner_xc2028] [11009.908200] [<ffffffffa13bac90>] xc2028_attach+0x310/0x8a0 [tuner_xc2028] [11009.908203] [<ffffffff8155ea78>] ? memset+0x28/0x30 [11009.908206] [<ffffffffa13ba980>] ? xc2028_set_config+0x630/0x630 [tuner_xc2028] [11009.908211] [<ffffffffa157a59a>] em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb] [11009.908215] [<ffffffffa157aa2a>] ? em28xx_dvb_init.part.3+0x37c/0x5cf4 [em28xx_dvb] [11009.908219] [<ffffffffa157a3a1>] ? hauppauge_hvr930c_init+0x487/0x487 [em28xx_dvb] [11009.908222] [<ffffffffa01795ac>] ? lgdt330x_attach+0x1cc/0x370 [lgdt330x] [11009.908226] [<ffffffffa01793e0>] ? i2c_read_demod_bytes.isra.2+0x210/0x210 [lgdt330x] [11009.908230] [<ffffffff812e87d0>] ? ref_module.part.15+0x10/0x10 [11009.908233] [<ffffffff812e56e0>] ? module_assert_mutex_or_preempt+0x80/0x80 [11009.908238] [<ffffffffa157af92>] em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb] [11009.908242] [<ffffffffa157a6ae>] ? em28xx_attach_xc3028.constprop.7+0x30d/0x30d [em28xx_dvb] [11009.908245] [<ffffffff8195222d>] ? string+0x14d/0x1f0 [11009.908249] [<ffffffff8195381f>] ? symbol_string+0xff/0x1a0 [11009.908253] [<ffffffff81953720>] ? uuid_string+0x6f0/0x6f0 [11009.908257] [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0 [11009.908260] [<ffffffff8104b02f>] ? print_context_stack+0x7f/0xf0 [11009.908264] [<ffffffff812e9846>] ? __module_address+0xb6/0x360 [11009.908268] [<ffffffff8137fdc9>] ? is_ftrace_trampoline+0x99/0xe0 [11009.908271] [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0 [11009.908275] [<ffffffff81240a70>] ? debug_check_no_locks_freed+0x290/0x290 [11009.908278] [<ffffffff8104a24b>] ? dump_trace+0x11b/0x300 [11009.908282] [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx] [11009.908285] [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290 [11009.908289] [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590 [11009.908292] [<ffffffff812404dd>] ? trace_hardirqs_on+0xd/0x10 [11009.908296] [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx] [11009.908299] [<ffffffff822dcbb0>] ? mutex_trylock+0x400/0x400 [11009.908302] [<ffffffff810021a1>] ? do_one_initcall+0x131/0x300 [11009.908306] [<ffffffff81296dc7>] ? call_rcu_sched+0x17/0x20 [11009.908309] [<ffffffff8159e708>] ? put_object+0x48/0x70 [11009.908314] [<ffffffffa1579f11>] em28xx_dvb_init+0x81/0x8a [em28xx_dvb] [11009.908317] [<ffffffffa13e81f9>] em28xx_register_extension+0xd9/0x190 [em28xx] [11009.908320] [<ffffffffa0150000>] ? 0xffffffffa0150000 [11009.908324] [<ffffffffa0150010>] em28xx_dvb_register+0x10/0x1000 [em28xx_dvb] [11009.908327] [<ffffffff810021b1>] do_one_initcall+0x141/0x300 [11009.908330] [<ffffffff81002070>] ? try_to_run_init_process+0x40/0x40 [11009.908333] [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590 [11009.908337] [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50 [11009.908340] [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50 [11009.908343] [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50 [11009.908346] [<ffffffff8155ea37>] ? __asan_register_globals+0x87/0xa0 [11009.908350] [<ffffffff8144da7b>] do_init_module+0x1d0/0x5ad [11009.908353] [<ffffffff812f2626>] load_module+0x6666/0x9ba0 [11009.908356] [<ffffffff812e9c90>] ? symbol_put_addr+0x50/0x50 [11009.908361] [<ffffffffa1580037>] ? em28xx_dvb_init.part.3+0x5989/0x5cf4 [em28xx_dvb] [11009.908366] [<ffffffff812ebfc0>] ? module_frob_arch_sections+0x20/0x20 [11009.908369] [<ffffffff815bc940>] ? open_exec+0x50/0x50 [11009.908374] [<ffffffff811671bb>] ? ns_capable+0x5b/0xd0 [11009.908377] [<ffffffff812f5e58>] SyS_finit_module+0x108/0x130 [11009.908379] [<ffffffff812f5d50>] ? SyS_init_module+0x1f0/0x1f0 [11009.908383] [<ffffffff81004044>] ? lockdep_sys_exit_thunk+0x12/0x14 [11009.908394] [<ffffffff822e6936>] entry_SYSCALL_64_fastpath+0x16/0x76 [11009.908396] Memory state around the buggy address: [11009.908398] ffff8803bd78aa00: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [11009.908401] ffff8803bd78aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [11009.908403] >ffff8803bd78ab00: fc fc fc fc fc fc fc fc 00 00 fc fc fc fc fc fc [11009.908405] ^ [11009.908407] ffff8803bd78ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [11009.908409] ffff8803bd78ac00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [11009.908411] ================================================================== In order to avoid it, let's set the cached value of the firmware name to NULL after freeing it. While here, return an error if the memory allocation fails. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Change-Id: I945c841dcfb45de2056267e4aa50bbe176b527cf Bug: 30946097	2016-09-14 14:26:20 +05:30
Yongqin Liu	7988ef0ccc	ANDROID: base-cfg: enable SECCOMP config Enable following seccomp configs CONFIG_SECCOMP=y CONFIG_SECCOMP_FILTER=y Otherwise we will get mediacode error like this on Android N: E /system/bin/mediaextractor: libminijail: prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER): Invalid argument Change-Id: I2477b6a2cfdded5c0ebf6ffbb6150b0e5fe2ba12 Signed-off-by: Yongqin Liu <yongqin.liu@linaro.org> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>	2016-09-14 14:26:20 +05:30
Guenter Roeck	8b94247342	ANDROID: rcu_sync: Export rcu_sync_lockdep_assert x86_64:allmodconfig fails to build with the following error. ERROR: "rcu_sync_lockdep_assert" [kernel/locking/locktorture.ko] undefined! Introduced by commit `3228c5eb7a` ("RFC: FROMLIST: locking/percpu-rwsem: Optimize readers and reduce global impact"). The applied upstream version exports the missing symbol, so let's do the same. Change-Id: If4e516715c3415fe8c82090f287174857561550d Fixes: `3228c5eb7a` ("RFC: FROMLIST: locking/percpu-rwsem: Optimize ...") Signed-off-by: Guenter Roeck <groeck@chromium.org>	2016-09-14 14:26:20 +05:30
Balbir Singh	48bb58c012	RFC: FROMLIST: cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork cgroup_threadgroup_rwsem is acquired in read mode during process exit and fork. It is also grabbed in write mode during __cgroups_proc_write(). I've recently run into a scenario with lots of memory pressure and OOM and I am beginning to see systemd __switch_to+0x1f8/0x350 __schedule+0x30c/0x990 schedule+0x48/0xc0 percpu_down_write+0x114/0x170 __cgroup_procs_write.isra.12+0xb8/0x3c0 cgroup_file_write+0x74/0x1a0 kernfs_fop_write+0x188/0x200 __vfs_write+0x6c/0xe0 vfs_write+0xc0/0x230 SyS_write+0x6c/0x110 system_call+0x38/0xb4 This thread is waiting on the reader of cgroup_threadgroup_rwsem to exit. The reader itself is under memory pressure and has gone into reclaim after fork. There are times the reader also ends up waiting on oom_lock as well. __switch_to+0x1f8/0x350 __schedule+0x30c/0x990 schedule+0x48/0xc0 jbd2_log_wait_commit+0xd4/0x180 ext4_evict_inode+0x88/0x5c0 evict+0xf8/0x2a0 dispose_list+0x50/0x80 prune_icache_sb+0x6c/0x90 super_cache_scan+0x190/0x210 shrink_slab.part.15+0x22c/0x4c0 shrink_zone+0x288/0x3c0 do_try_to_free_pages+0x1dc/0x590 try_to_free_pages+0xdc/0x260 __alloc_pages_nodemask+0x72c/0xc90 alloc_pages_current+0xb4/0x1a0 page_table_alloc+0xc0/0x170 __pte_alloc+0x58/0x1f0 copy_page_range+0x4ec/0x950 copy_process.isra.5+0x15a0/0x1870 _do_fork+0xa8/0x4b0 ppc_clone+0x8/0xc In the meanwhile, all processes exiting/forking are blocked almost stalling the system. This patch moves the threadgroup_change_begin from before cgroup_fork() to just before cgroup_canfork(). There is no nee to worry about threadgroup changes till the task is actually added to the threadgroup. This avoids having to call reclaim with cgroup_threadgroup_rwsem held. tj: Subject and description edits. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Zefan Li <lizefan@huawei.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Tejun Heo <tj@kernel.org> [jstultz: Cherry-picked from: git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git 568ac888215c7f] Change-Id: Ie8ece84fb613cf6a7b08cea1468473a8df2b9661 Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-09-14 14:26:20 +05:30
Peter Zijlstra	a81c69e149	RFC: FROMLIST: cgroup: avoid synchronize_sched() in __cgroup_procs_write() The current percpu-rwsem read side is entirely free of serializing insns at the cost of having a synchronize_sched() in the write path. The latency of the synchronize_sched() is too high for cgroups. The commit `1ed1328792` talks about the write path being a fairly cold path but this is not the case for Android which moves task to the foreground cgroup and back around binder IPC calls from foreground processes to background processes, so it is significantly hotter than human initiated operations. Switch cgroup_threadgroup_rwsem into the slow mode for now to avoid the problem, hopefully it should not be that slow after another commit 80127a39681b ("locking/percpu-rwsem: Optimize readers and reduce global impact"). We could just add rcu_sync_enter() into cgroup_init() but we do not want another synchronize_sched() at boot time, so this patch adds the new helper which doesn't block but currently can only be called before the first use. Cc: Tejun Heo <tj@kernel.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Reported-by: John Stultz <john.stultz@linaro.org> Reported-by: Dmitry Shmidt <dimitrysh@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> [jstultz: backported to 4.4] Change-Id: I34aa9c394d3052779b56976693e96d861bd255f2 Mailing-list-URL: https://lkml.org/lkml/2016/8/11/557 Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-09-14 14:26:20 +05:30
Peter Zijlstra	d4d74af4b8	RFC: FROMLIST: locking/percpu-rwsem: Optimize readers and reduce global impact Currently the percpu-rwsem switches to (global) atomic ops while a writer is waiting; which could be quite a while and slows down releasing the readers. This patch cures this problem by ordering the reader-state vs reader-count (see the comments in __percpu_down_read() and percpu_down_write()). This changes a global atomic op into a full memory barrier, which doesn't have the global cacheline contention. This also enables using the percpu-rwsem with rcu_sync disabled in order to bias the implementation differently, reducing the writer latency by adding some cost to readers. Mailing-list-URL: https://lkml.org/lkml/2016/8/9/181 Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [jstultz: Backported to 4.4] Change-Id: I8ea04b4dca2ec36f1c2469eccafde1423490572f Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-09-14 14:26:20 +05:30
Lorenzo Colitti	0e806c83bc	net: ipv6: Fix ping to link-local addresses. ping_v6_sendmsg does not set flowi6_oif in response to sin6_scope_id or sk_bound_dev_if, so it is not possible to use these APIs to ping an IPv6 address on a different interface. Instead, it sets flowi6_iif, which is incorrect but harmless. Stop setting flowi6_iif, and support various ways of setting oif in the same priority order used by udpv6_sendmsg. [Backport of net 5e457896986e16c440c97bb94b9ccd95dd157292] Bug: 29370996 Change-Id: Ibe1b9434c00ed96f1e30acb110734c6570b087b8 Tested: https://android-review.googlesource.com/#/c/254470/ Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-14 14:26:20 +05:30
Hannes Frederic Sowa	a270309c90	ipv6: fix endianness error in icmpv6_err IPv6 ping socket error handler doesn't correctly convert the new 32 bit mtu to host endianness before using. [Cherry-pick of net dcb94b88c09ce82a80e188d49bcffdc83ba215a6] Bug: 29370996 Change-Id: Iea0ca79f16c2a1366d82b3b0a3097093d18da8b7 Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: `6d0bfe2261` ("net: ipv6: Add IPv6 support to the ping socket.") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-14 14:26:20 +05:30
Badhri Jagan Sridharan	eed022ee2f	ANDROID: dm: android-verity: Allow android-verity to be compiled as an independent module Exports the device mapper callbacks of linear and dm-verity-target methods. Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com> Change-Id: I0358be0615c431dce3cc78575aaac4ccfe3aacd7	2016-09-14 14:26:20 +05:30
Alex Shi	b56111f481	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android Conflicts: arch/arm/Kconfig	2016-08-30 10:27:13 +08:00
Alex Shi	59e65b4bbf	Merge remote-tracking branch 'v4.4/topic/mm-kaslr-pax_usercopy' into linux-linaro-lsk-v4.4	2016-08-27 11:27:14 +08:00
Kees Cook	3ad78bad4f	mm: SLUB hardened usercopy support Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the SLUB allocator to catch any copies that may span objects. Includes a redzone handling fix discovered by Michael Ellerman. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Reviwed-by: Laura Abbott <labbott@redhat.com> (cherry picked from commit ed18adc1cdd00a5c55a20fbdaed4804660772281) Signed-off-by: Alex Shi <alex.shi@linaro.org>	2016-08-27 11:23:38 +08:00
Kees Cook	784bd0f8d7	mm: SLAB hardened usercopy support Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the SLAB allocator to catch any copies that may span objects. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> (cherry picked from commit 04385fc5e8fffed84425d909a783c0f0c587d847) Signed-off-by: Alex Shi <alex.shi@linaro.org>	2016-08-27 11:23:38 +08:00
Kees Cook	41e3ca9b2f	s390/uaccess: Enable hardened usercopy Enables CONFIG_HARDENED_USERCOPY checks on s390. Signed-off-by: Kees Cook <keescook@chromium.org> (cherry picked from commit 97433ea4fda62349bfa42089455593cbcb57e06c) Signed-off-by: Alex Shi <alex.shi@linaro.org>	2016-08-27 11:23:38 +08:00
Kees Cook	17427c2db3	sparc/uaccess: Enable hardened usercopy Enables CONFIG_HARDENED_USERCOPY checks on sparc. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> (cherry picked from commit 9d9208a15800f9f06f102f9aac1e8b323c3b8575) Signed-off-by: Alex Shi <alex.shi@linaro.org>	2016-08-27 11:23:38 +08:00
Kees Cook	225237bf68	powerpc/uaccess: Enable hardened usercopy Enables CONFIG_HARDENED_USERCOPY checks on powerpc. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> (cherry picked from commit 1d3c1324746fed0e34a5b94d3ed303e7521ed603) Signed-off-by: Alex Shi <alex.shi@linaro.org>	2016-08-27 11:23:38 +08:00
Kees Cook	434bef236c	ia64/uaccess: Enable hardened usercopy Enables CONFIG_HARDENED_USERCOPY checks on ia64. Based on code from PaX and grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> (cherry picked from commit 73d35887e24da77e8d1321b2e92bd9b9128e2fc2) Signed-off-by: Alex Shi <alex.shi@linaro.org>	2016-08-27 11:23:38 +08:00
Kees Cook	3308a2cca1	arm64/uaccess: Enable hardened usercopy Enables CONFIG_HARDENED_USERCOPY checks on arm64. As done by KASAN in -next, renames the low-level functions to __arch_copy_*_user() so a static inline can do additional work before the copy. Signed-off-by: Kees Cook <keescook@chromium.org> (cherry picked from commit faf5b63e294151d6ac24ca6906d6f221bd3496cd) Signed-off-by: Alex Shi <alex.shi@linaro.org>	2016-08-27 11:23:38 +08:00

1 2 3 4 5 ...