Commit graph

563384 commits

Author SHA1 Message Date
Laura Abbott
1e701cdc5b UPSTREAM: mm: Add is_migrate_cma_page
Code such as hardened user copy[1] needs a way to tell if a
page is CMA or not. Add is_migrate_cma_page in a similar way
to is_migrate_isolate_page.

[1]http://article.gmane.org/gmane.linux.kernel.mm/155238

Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>

Change-Id: I1f9aa13d8d063038fa70b93282a836648fbb4f6d
(cherry picked from commit 7c15d9bb8231f998ae7dc0b72415f5215459f7fb)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-06 15:53:07 +00:00
Linus Torvalds
5f1b3400f0 UPSTREAM: unsafe_[get|put]_user: change interface to use a error target label
When I initially added the unsafe_[get|put]_user() helpers in commit
5b24a7a2aa20 ("Add 'unsafe' user access functions for batched
accesses"), I made the mistake of modeling the interface on our
traditional __[get|put]_user() functions, which return zero on success,
or -EFAULT on failure.

That interface is fairly easy to use, but it's actually fairly nasty for
good code generation, since it essentially forces the caller to check
the error value for each access.

In particular, since the error handling is already internally
implemented with an exception handler, and we already use "asm goto" for
various other things, we could fairly easily make the error cases just
jump directly to an error label instead, and avoid the need for explicit
checking after each operation.

So switch the interface to pass in an error label, rather than checking
the error value in the caller.  Best do it now before we start growing
more users (the signal handling code in particular would be a good place
to use the new interface).

So rather than

	if (unsafe_get_user(x, ptr))
		... handle error ..

the interface is now

	unsafe_get_user(x, ptr, label);

where an error during the user mode fetch will now just cause a jump to
'label' in the caller.

Right now the actual _implementation_ of this all still ends up being a
"if (err) goto label", and does not take advantage of any exception
label tricks, but for "unsafe_put_user()" in particular it should be
fairly straightforward to convert to using the exception table model.

Note that "unsafe_get_user()" is much harder to convert to a clever
exception table model, because current versions of gcc do not allow the
use of "asm goto" (for the exception) with output values (for the actual
value to be fetched).  But that is hopefully not a limitation in the
long term.

[ Also note that it might be a good idea to switch unsafe_get_user() to
  actually _return_ the value it fetches from user space, but this
  commit only changes the error handling semantics ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Change-Id: Ib905a84a04d46984320f6fd1056da4d72f3d6b53
(cherry picked from commit 1bd4403d86a1c06cb6cc9ac87664a0c9d3413d51)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-06 15:52:54 +00:00
Ard Biesheuvel
79389c0d78 BACKPORT: arm64: mm: fix location of _etext
As Kees Cook notes in the ARM counterpart of this patch [0]:

  The _etext position is defined to be the end of the kernel text code,
  and should not include any part of the data segments. This interferes
  with things that might check memory ranges and expect executable code
  up to _etext.

In particular, Kees is referring to the HARDENED_USERCOPY patch set [1],
which rejects attempts to call copy_to_user() on kernel ranges containing
executable code, but does allow access to the .rodata segment. Regardless
of whether one may or may not agree with the distinction, it makes sense
for _etext to have the same meaning across architectures.

So let's put _etext where it belongs, between .text and .rodata, and fix
up existing references to use __init_begin instead, which unlike _end_rodata
includes the exception and notes sections as well.

The _etext references in kaslr.c are left untouched, since its references
to [_stext, _etext) are meant to capture potential jump instruction targets,
and so disregarding .rodata is actually an improvement here.

[0] http://article.gmane.org/gmane.linux.kernel/2245084
[1] http://thread.gmane.org/gmane.linux.kernel.hardened.devel/2502

Reported-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: I8f6582525217b9ca324f6a382ea52d30ce1d0dbd
(cherry picked from commit 9fdc14c55cd6579d619ccd9d40982e0805e62b6d)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-06 15:52:14 +00:00
Kees Cook
fed752db02 BACKPORT: ARM: 8583/1: mm: fix location of _etext
The _etext position is defined to be the end of the kernel text code,
and should not include any part of the data segments. This interferes
with things that might check memory ranges and expect executable code
up to _etext. Just to be conservative, leave the kernel resource as
it was, using __init_begin instead of _etext as the end mark.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Change-Id: Ida514d1359dbe6f782f562ce29b4ba09ae72bfc0
(cherry picked from commit 14c4a533e0996f95a0a64dfd0b6252d788cebc74)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-06 15:51:59 +00:00
Linus Torvalds
c1932b1047 UPSTREAM: Use the new batched user accesses in generic user string handling
This converts the generic user string functions to use the batched user
access functions.

It makes a big difference on Skylake, which is the first x86
microarchitecture to implement SMAP.  The STAC/CLAC instructions are not
very fast, and doing them for each access inside the loop that copies
strings from user space (which is what the pathname handling does for
every pathname the kernel uses, for example) is very inefficient.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Change-Id: Ic39a686b4bb1ad9cd16ad0887bb669beed6fe8aa
(cherry picked from commit 9fd4470ff4974c41b1db43c3b355b9085af9c12a)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-06 15:51:48 +00:00
Linus Torvalds
8ac959cc06 UPSTREAM: Add 'unsafe' user access functions for batched accesses
The naming is meant to discourage random use: the helper functions are
not really any more "unsafe" than the traditional double-underscore
functions (which need the address range checking), but they do need even
more infrastructure around them, and should not be used willy-nilly.

In addition to checking the access range, these user access functions
require that you wrap the user access with a "user_acess_{begin,end}()"
around it.

That allows architectures that implement kernel user access control
(x86: SMAP, arm64: PAN) to do the user access control in the wrapping
user_access_begin/end part, and then batch up the actual user space
accesses using the new interfaces.

The main (and hopefully only) use for these are for core generic access
helpers, initially just the generic user string functions
(strnlen_user() and strncpy_from_user()).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Change-Id: Ic64efea41f97171bdbdabe3e531489aebd9b6fac
(cherry picked from commit 5b24a7a2aa2040c8c50c3b71122901d01661ff78)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-06 15:50:38 +00:00
Mohamad Ayyash
87518a45ae BACKPORT: Don't show empty tag stats for unprivileged uids
BUG: 27577101
BUG: 27532522
Signed-off-by: Mohamad Ayyash <mkayyash@google.com>
2016-09-02 14:58:30 -07:00
Eric Dumazet
d25e3081ba UPSTREAM: tcp: fix use after free in tcp_xmit_retransmit_queue()
(cherry picked from commit bb1fceca22492109be12640d49f5ea5a544c6bb4)

When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
tail of the write queue using tcp_add_write_queue_tail()

Then it attempts to copy user data into this fresh skb.

If the copy fails, we undo the work and remove the fresh skb.

Unfortunately, this undo lacks the change done to tp->highest_sack and
we can leave a dangling pointer (to a freed skb)

Later, tcp_xmit_retransmit_queue() can dereference this pointer and
access freed memory. For regular kernels where memory is not unmapped,
this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
returning garbage instead of tp->snd_nxt, but with various debug
features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.

This bug was found by Marco Grassi thanks to syzkaller.

Fixes: 6859d49475 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
Reported-by: Marco Grassi <marco.gra@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change-Id: I58bb02d6e4e399612e8580b9e02d11e661df82f5
Bug: 31183296
2016-09-02 13:44:25 -07:00
Amit Pundir
8e975e71f5 ANDROID: base-cfg: drop SECCOMP_FILTER config
Don't need to set SECCOMP_FILTER explicitly since CONFIG_SECCOMP=y will
select that config anyway.

Fixes: a49dcf2e74 ("ANDROID: base-cfg: enable SECCOMP config")
Change-Id: Iff18ed4d2db5a55b9f9480d5ecbeef7b818b3837
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-09-02 14:36:35 +00:00
Mathias Krause
a4e59955ac UPSTREAM: proc: prevent accessing /proc/<PID>/environ until it's ready
(cherry picked from commit 8148a73c9901a8794a50f950083c00ccf97d43b3)

If /proc/<PID>/environ gets read before the envp[] array is fully set up
in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to
read more bytes than are actually written, as env_start will already be
set but env_end will still be zero, making the range calculation
underflow, allowing to read beyond the end of what has been written.

Fix this as it is done for /proc/<PID>/cmdline by testing env_end for
zero.  It is, apparently, intentionally set last in create_*_tables().

This bug was found by the PaX size_overflow plugin that detected the
arithmetic underflow of 'this_len = env_end - (env_start + src)' when
env_end is still zero.

The expected consequence is that userland trying to access
/proc/<PID>/environ of a not yet fully set up process may get
inconsistent data as we're in the middle of copying in the environment
variables.

Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&t=4363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: Pax Team <pageexec@freemail.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ia2f58d48c15478ed4b6e237b63e704c70ff21e96
Bug: 30951939
2016-09-01 17:09:47 -07:00
Dan Carpenter
30439e51fc UPSTREAM: [media] xc2028: unlock on error in xc2028_set_config()
(cherry picked from commit 210bd104c6acd31c3c6b8b075b3f12d4a9f6b60d)

We have to unlock before returning -ENOMEM.

Fixes: 8dfbcc4351a0 ('[media] xc2028: avoid use after free')

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Change-Id: I7b6ba9fde5c6e29467e6de23d398af2fe56e2547
Bug: 30946097
2016-09-01 16:08:12 -07:00
Mauro Carvalho Chehab
c7a407d41e UPSTREAM: [media] xc2028: avoid use after free
(cherry picked from commit 8dfbcc4351a0b6d2f2d77f367552f48ffefafe18)

If struct xc2028_config is passed without a firmware name,
the following trouble may happen:

[11009.907205] xc2028 5-0061: type set to XCeive xc2028/xc3028 tuner
[11009.907491] ==================================================================
[11009.907750] BUG: KASAN: use-after-free in strcmp+0x96/0xb0 at addr ffff8803bd78ab40
[11009.907992] Read of size 1 by task modprobe/28992
[11009.907994] =============================================================================
[11009.907997] BUG kmalloc-16 (Tainted: G        W      ): kasan: bad access detected
[11009.907999] -----------------------------------------------------------------------------

[11009.908008] INFO: Allocated in xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd] age=0 cpu=3 pid=28992
[11009.908012] 	___slab_alloc+0x581/0x5b0
[11009.908014] 	__slab_alloc+0x51/0x90
[11009.908017] 	__kmalloc+0x27b/0x350
[11009.908022] 	xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd]
[11009.908026] 	usb_hcd_submit_urb+0x1e8/0x1c60
[11009.908029] 	usb_submit_urb+0xb0e/0x1200
[11009.908032] 	usb_serial_generic_write_start+0xb6/0x4c0
[11009.908035] 	usb_serial_generic_write+0x92/0xc0
[11009.908039] 	usb_console_write+0x38a/0x560
[11009.908045] 	call_console_drivers.constprop.14+0x1ee/0x2c0
[11009.908051] 	console_unlock+0x40d/0x900
[11009.908056] 	vprintk_emit+0x4b4/0x830
[11009.908061] 	vprintk_default+0x1f/0x30
[11009.908064] 	printk+0x99/0xb5
[11009.908067] 	kasan_report_error+0x10a/0x550
[11009.908070] 	__asan_report_load1_noabort+0x43/0x50
[11009.908074] INFO: Freed in xc2028_set_config+0x90/0x630 [tuner_xc2028] age=1 cpu=3 pid=28992
[11009.908077] 	__slab_free+0x2ec/0x460
[11009.908080] 	kfree+0x266/0x280
[11009.908083] 	xc2028_set_config+0x90/0x630 [tuner_xc2028]
[11009.908086] 	xc2028_attach+0x310/0x8a0 [tuner_xc2028]
[11009.908090] 	em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb]
[11009.908094] 	em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb]
[11009.908098] 	em28xx_dvb_init+0x81/0x8a [em28xx_dvb]
[11009.908101] 	em28xx_register_extension+0xd9/0x190 [em28xx]
[11009.908105] 	em28xx_dvb_register+0x10/0x1000 [em28xx_dvb]
[11009.908108] 	do_one_initcall+0x141/0x300
[11009.908111] 	do_init_module+0x1d0/0x5ad
[11009.908114] 	load_module+0x6666/0x9ba0
[11009.908117] 	SyS_finit_module+0x108/0x130
[11009.908120] 	entry_SYSCALL_64_fastpath+0x16/0x76
[11009.908123] INFO: Slab 0xffffea000ef5e280 objects=25 used=25 fp=0x          (null) flags=0x2ffff8000004080
[11009.908126] INFO: Object 0xffff8803bd78ab40 @offset=2880 fp=0x0000000000000001

[11009.908130] Bytes b4 ffff8803bd78ab30: 01 00 00 00 2a 07 00 00 9d 28 00 00 01 00 00 00  ....*....(......
[11009.908133] Object ffff8803bd78ab40: 01 00 00 00 00 00 00 00 b0 1d c3 6a 00 88 ff ff  ...........j....
[11009.908137] CPU: 3 PID: 28992 Comm: modprobe Tainted: G    B   W       4.5.0-rc1+ #43
[11009.908140] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0350.2015.0812.1722 08/12/2015
[11009.908142]  ffff8803bd78a000 ffff8802c273f1b8 ffffffff81932007 ffff8803c6407a80
[11009.908148]  ffff8802c273f1e8 ffffffff81556759 ffff8803c6407a80 ffffea000ef5e280
[11009.908153]  ffff8803bd78ab40 dffffc0000000000 ffff8802c273f210 ffffffff8155ccb4
[11009.908158] Call Trace:
[11009.908162]  [<ffffffff81932007>] dump_stack+0x4b/0x64
[11009.908165]  [<ffffffff81556759>] print_trailer+0xf9/0x150
[11009.908168]  [<ffffffff8155ccb4>] object_err+0x34/0x40
[11009.908171]  [<ffffffff8155f260>] kasan_report_error+0x230/0x550
[11009.908175]  [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290
[11009.908179]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908182]  [<ffffffff8155f5c3>] __asan_report_load1_noabort+0x43/0x50
[11009.908185]  [<ffffffff8155ea00>] ? __asan_register_globals+0x50/0xa0
[11009.908189]  [<ffffffff8194cea6>] ? strcmp+0x96/0xb0
[11009.908192]  [<ffffffff8194cea6>] strcmp+0x96/0xb0
[11009.908196]  [<ffffffffa13ba4ac>] xc2028_set_config+0x15c/0x630 [tuner_xc2028]
[11009.908200]  [<ffffffffa13bac90>] xc2028_attach+0x310/0x8a0 [tuner_xc2028]
[11009.908203]  [<ffffffff8155ea78>] ? memset+0x28/0x30
[11009.908206]  [<ffffffffa13ba980>] ? xc2028_set_config+0x630/0x630 [tuner_xc2028]
[11009.908211]  [<ffffffffa157a59a>] em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb]
[11009.908215]  [<ffffffffa157aa2a>] ? em28xx_dvb_init.part.3+0x37c/0x5cf4 [em28xx_dvb]
[11009.908219]  [<ffffffffa157a3a1>] ? hauppauge_hvr930c_init+0x487/0x487 [em28xx_dvb]
[11009.908222]  [<ffffffffa01795ac>] ? lgdt330x_attach+0x1cc/0x370 [lgdt330x]
[11009.908226]  [<ffffffffa01793e0>] ? i2c_read_demod_bytes.isra.2+0x210/0x210 [lgdt330x]
[11009.908230]  [<ffffffff812e87d0>] ? ref_module.part.15+0x10/0x10
[11009.908233]  [<ffffffff812e56e0>] ? module_assert_mutex_or_preempt+0x80/0x80
[11009.908238]  [<ffffffffa157af92>] em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb]
[11009.908242]  [<ffffffffa157a6ae>] ? em28xx_attach_xc3028.constprop.7+0x30d/0x30d [em28xx_dvb]
[11009.908245]  [<ffffffff8195222d>] ? string+0x14d/0x1f0
[11009.908249]  [<ffffffff8195381f>] ? symbol_string+0xff/0x1a0
[11009.908253]  [<ffffffff81953720>] ? uuid_string+0x6f0/0x6f0
[11009.908257]  [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0
[11009.908260]  [<ffffffff8104b02f>] ? print_context_stack+0x7f/0xf0
[11009.908264]  [<ffffffff812e9846>] ? __module_address+0xb6/0x360
[11009.908268]  [<ffffffff8137fdc9>] ? is_ftrace_trampoline+0x99/0xe0
[11009.908271]  [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0
[11009.908275]  [<ffffffff81240a70>] ? debug_check_no_locks_freed+0x290/0x290
[11009.908278]  [<ffffffff8104a24b>] ? dump_trace+0x11b/0x300
[11009.908282]  [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx]
[11009.908285]  [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290
[11009.908289]  [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590
[11009.908292]  [<ffffffff812404dd>] ? trace_hardirqs_on+0xd/0x10
[11009.908296]  [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx]
[11009.908299]  [<ffffffff822dcbb0>] ? mutex_trylock+0x400/0x400
[11009.908302]  [<ffffffff810021a1>] ? do_one_initcall+0x131/0x300
[11009.908306]  [<ffffffff81296dc7>] ? call_rcu_sched+0x17/0x20
[11009.908309]  [<ffffffff8159e708>] ? put_object+0x48/0x70
[11009.908314]  [<ffffffffa1579f11>] em28xx_dvb_init+0x81/0x8a [em28xx_dvb]
[11009.908317]  [<ffffffffa13e81f9>] em28xx_register_extension+0xd9/0x190 [em28xx]
[11009.908320]  [<ffffffffa0150000>] ? 0xffffffffa0150000
[11009.908324]  [<ffffffffa0150010>] em28xx_dvb_register+0x10/0x1000 [em28xx_dvb]
[11009.908327]  [<ffffffff810021b1>] do_one_initcall+0x141/0x300
[11009.908330]  [<ffffffff81002070>] ? try_to_run_init_process+0x40/0x40
[11009.908333]  [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590
[11009.908337]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908340]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908343]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
[11009.908346]  [<ffffffff8155ea37>] ? __asan_register_globals+0x87/0xa0
[11009.908350]  [<ffffffff8144da7b>] do_init_module+0x1d0/0x5ad
[11009.908353]  [<ffffffff812f2626>] load_module+0x6666/0x9ba0
[11009.908356]  [<ffffffff812e9c90>] ? symbol_put_addr+0x50/0x50
[11009.908361]  [<ffffffffa1580037>] ? em28xx_dvb_init.part.3+0x5989/0x5cf4 [em28xx_dvb]
[11009.908366]  [<ffffffff812ebfc0>] ? module_frob_arch_sections+0x20/0x20
[11009.908369]  [<ffffffff815bc940>] ? open_exec+0x50/0x50
[11009.908374]  [<ffffffff811671bb>] ? ns_capable+0x5b/0xd0
[11009.908377]  [<ffffffff812f5e58>] SyS_finit_module+0x108/0x130
[11009.908379]  [<ffffffff812f5d50>] ? SyS_init_module+0x1f0/0x1f0
[11009.908383]  [<ffffffff81004044>] ? lockdep_sys_exit_thunk+0x12/0x14
[11009.908394]  [<ffffffff822e6936>] entry_SYSCALL_64_fastpath+0x16/0x76
[11009.908396] Memory state around the buggy address:
[11009.908398]  ffff8803bd78aa00: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908401]  ffff8803bd78aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908403] >ffff8803bd78ab00: fc fc fc fc fc fc fc fc 00 00 fc fc fc fc fc fc
[11009.908405]                                            ^
[11009.908407]  ffff8803bd78ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908409]  ffff8803bd78ac00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[11009.908411] ==================================================================

In order to avoid it, let's set the cached value of the firmware
name to NULL after freeing it. While here, return an error if
the memory allocation fails.

Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Change-Id: I945c841dcfb45de2056267e4aa50bbe176b527cf
Bug: 30946097
2016-09-01 14:35:13 -07:00
Vegard Nossum
687549af9e UPSTREAM: block: fix use-after-free in seq file
(cherry picked from commit 77da160530dd1dc94f6ae15a981f24e5f0021e84)

I got a KASAN report of use-after-free:

    ==================================================================
    BUG: KASAN: use-after-free in klist_iter_exit+0x61/0x70 at addr ffff8800b6581508
    Read of size 8 by task trinity-c1/315
    =============================================================================
    BUG kmalloc-32 (Not tainted): kasan: bad access detected
    -----------------------------------------------------------------------------

    Disabling lock debugging due to kernel taint
    INFO: Allocated in disk_seqf_start+0x66/0x110 age=144 cpu=1 pid=315
            ___slab_alloc+0x4f1/0x520
            __slab_alloc.isra.58+0x56/0x80
            kmem_cache_alloc_trace+0x260/0x2a0
            disk_seqf_start+0x66/0x110
            traverse+0x176/0x860
            seq_read+0x7e3/0x11a0
            proc_reg_read+0xbc/0x180
            do_loop_readv_writev+0x134/0x210
            do_readv_writev+0x565/0x660
            vfs_readv+0x67/0xa0
            do_preadv+0x126/0x170
            SyS_preadv+0xc/0x10
            do_syscall_64+0x1a1/0x460
            return_from_SYSCALL_64+0x0/0x6a
    INFO: Freed in disk_seqf_stop+0x42/0x50 age=160 cpu=1 pid=315
            __slab_free+0x17a/0x2c0
            kfree+0x20a/0x220
            disk_seqf_stop+0x42/0x50
            traverse+0x3b5/0x860
            seq_read+0x7e3/0x11a0
            proc_reg_read+0xbc/0x180
            do_loop_readv_writev+0x134/0x210
            do_readv_writev+0x565/0x660
            vfs_readv+0x67/0xa0
            do_preadv+0x126/0x170
            SyS_preadv+0xc/0x10
            do_syscall_64+0x1a1/0x460
            return_from_SYSCALL_64+0x0/0x6a

    CPU: 1 PID: 315 Comm: trinity-c1 Tainted: G    B           4.7.0+ #62
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
     ffffea0002d96000 ffff880119b9f918 ffffffff81d6ce81 ffff88011a804480
     ffff8800b6581500 ffff880119b9f948 ffffffff8146c7bd ffff88011a804480
     ffffea0002d96000 ffff8800b6581500 fffffffffffffff4 ffff880119b9f970
    Call Trace:
     [<ffffffff81d6ce81>] dump_stack+0x65/0x84
     [<ffffffff8146c7bd>] print_trailer+0x10d/0x1a0
     [<ffffffff814704ff>] object_err+0x2f/0x40
     [<ffffffff814754d1>] kasan_report_error+0x221/0x520
     [<ffffffff8147590e>] __asan_report_load8_noabort+0x3e/0x40
     [<ffffffff83888161>] klist_iter_exit+0x61/0x70
     [<ffffffff82404389>] class_dev_iter_exit+0x9/0x10
     [<ffffffff81d2e8ea>] disk_seqf_stop+0x3a/0x50
     [<ffffffff8151f812>] seq_read+0x4b2/0x11a0
     [<ffffffff815f8fdc>] proc_reg_read+0xbc/0x180
     [<ffffffff814b24e4>] do_loop_readv_writev+0x134/0x210
     [<ffffffff814b4c45>] do_readv_writev+0x565/0x660
     [<ffffffff814b8a17>] vfs_readv+0x67/0xa0
     [<ffffffff814b8de6>] do_preadv+0x126/0x170
     [<ffffffff814b92ec>] SyS_preadv+0xc/0x10

This problem can occur in the following situation:

open()
 - pread()
    - .seq_start()
       - iter = kmalloc() // succeeds
       - seqf->private = iter
    - .seq_stop()
       - kfree(seqf->private)
 - pread()
    - .seq_start()
       - iter = kmalloc() // fails
    - .seq_stop()
       - class_dev_iter_exit(seqf->private) // boom! old pointer

As the comment in disk_seqf_stop() says, stop is called even if start
failed, so we need to reinitialise the private pointer to NULL when seq
iteration stops.

An alternative would be to set the private pointer to NULL when the
kmalloc() in disk_seqf_start() fails.

Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Change-Id: I07b33f4b38341f60a37806cdd45b0a0c3ab4d84d
Bug: 30942273
2016-09-01 13:46:04 -07:00
Jerome Marchand
af4104512c UPSTREAM: assoc_array: don't call compare_object() on a node
(cherry picked from commit 8d4a2ec1e0b41b0cf9a0c5cd4511da7f8e4f3de2)

Changes since V1: fixed the description and added KASan warning.

In assoc_array_insert_into_terminal_node(), we call the
compare_object() method on all non-empty slots, even when they're
not leaves, passing a pointer to an unexpected structure to
compare_object(). Currently it causes an out-of-bound read access
in keyring_compare_object detected by KASan (see below). The issue
is easily reproduced with keyutils testsuite.
Only call compare_object() when the slot is a leave.

KASan warning:
==================================================================
BUG: KASAN: slab-out-of-bounds in keyring_compare_object+0x213/0x240 at addr ffff880060a6f838
Read of size 8 by task keyctl/1655
=============================================================================
BUG kmalloc-192 (Not tainted): kasan: bad access detected
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in assoc_array_insert+0xfd0/0x3a60 age=69 cpu=1 pid=1647
	___slab_alloc+0x563/0x5c0
	__slab_alloc+0x51/0x90
	kmem_cache_alloc_trace+0x263/0x300
	assoc_array_insert+0xfd0/0x3a60
	__key_link_begin+0xfc/0x270
	key_create_or_update+0x459/0xaf0
	SyS_add_key+0x1ba/0x350
	entry_SYSCALL_64_fastpath+0x12/0x76
INFO: Slab 0xffffea0001829b80 objects=16 used=8 fp=0xffff880060a6f550 flags=0x3fff8000004080
INFO: Object 0xffff880060a6f740 @offset=5952 fp=0xffff880060a6e5d1

Bytes b4 ffff880060a6f730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f740: d1 e5 a6 60 00 88 ff ff 0e 00 00 00 00 00 00 00  ...`............
Object ffff880060a6f750: 02 cf 8e 60 00 88 ff ff 02 c0 8e 60 00 88 ff ff  ...`.......`....
Object ffff880060a6f760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7d0: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff880060a6f7f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
CPU: 0 PID: 1655 Comm: keyctl Tainted: G    B           4.5.0-rc4-kasan+ #291
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 0000000000000000 000000001b2800b4 ffff880060a179e0 ffffffff81b60491
 ffff88006c802900 ffff880060a6f740 ffff880060a17a10 ffffffff815e2969
 ffff88006c802900 ffffea0001829b80 ffff880060a6f740 ffff880060a6e650
Call Trace:
 [<ffffffff81b60491>] dump_stack+0x85/0xc4
 [<ffffffff815e2969>] print_trailer+0xf9/0x150
 [<ffffffff815e9454>] object_err+0x34/0x40
 [<ffffffff815ebe50>] kasan_report_error+0x230/0x550
 [<ffffffff819949be>] ? keyring_get_key_chunk+0x13e/0x210
 [<ffffffff815ec62d>] __asan_report_load_n_noabort+0x5d/0x70
 [<ffffffff81994cc3>] ? keyring_compare_object+0x213/0x240
 [<ffffffff81994cc3>] keyring_compare_object+0x213/0x240
 [<ffffffff81bc238c>] assoc_array_insert+0x86c/0x3a60
 [<ffffffff81bc1b20>] ? assoc_array_cancel_edit+0x70/0x70
 [<ffffffff8199797d>] ? __key_link_begin+0x20d/0x270
 [<ffffffff8199786c>] __key_link_begin+0xfc/0x270
 [<ffffffff81993389>] key_create_or_update+0x459/0xaf0
 [<ffffffff8128ce0d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81992f30>] ? key_type_lookup+0xc0/0xc0
 [<ffffffff8199e19d>] ? lookup_user_key+0x13d/0xcd0
 [<ffffffff81534763>] ? memdup_user+0x53/0x80
 [<ffffffff819983ea>] SyS_add_key+0x1ba/0x350
 [<ffffffff81998230>] ? key_get_type_from_user.constprop.6+0xa0/0xa0
 [<ffffffff828bcf4e>] ? retint_user+0x18/0x23
 [<ffffffff8128cc7e>] ? trace_hardirqs_on_caller+0x3fe/0x580
 [<ffffffff81004017>] ? trace_hardirqs_on_thunk+0x17/0x19
 [<ffffffff828bc432>] entry_SYSCALL_64_fastpath+0x12/0x76
Memory state around the buggy address:
 ffff880060a6f700: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
 ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
>ffff880060a6f800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                        ^
 ffff880060a6f880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880060a6f900: fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00
==================================================================

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Change-Id: I903935a221a5b9fb14cec14ef64bd2b6fa8eb222
Bug: 30513364
2016-09-01 12:57:32 -07:00
Yongqin Liu
a49dcf2e74 ANDROID: base-cfg: enable SECCOMP config
Enable following seccomp configs

CONFIG_SECCOMP=y
CONFIG_SECCOMP_FILTER=y

Otherwise we will get mediacode error like this on Android N:

E /system/bin/mediaextractor: libminijail: prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER): Invalid argument

Change-Id: I2477b6a2cfdded5c0ebf6ffbb6150b0e5fe2ba12
Signed-off-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-09-01 17:14:00 +00:00
Guenter Roeck
4547c0f935 ANDROID: rcu_sync: Export rcu_sync_lockdep_assert
x86_64:allmodconfig fails to build with the following error.

ERROR: "rcu_sync_lockdep_assert" [kernel/locking/locktorture.ko] undefined!

Introduced by commit 3228c5eb7a ("RFC: FROMLIST: locking/percpu-rwsem:
Optimize readers and reduce global impact"). The applied upstream version
exports the missing symbol, so let's do the same.

Change-Id: If4e516715c3415fe8c82090f287174857561550d
Fixes: 3228c5eb7a ("RFC: FROMLIST: locking/percpu-rwsem: Optimize ...")
Signed-off-by: Guenter Roeck <groeck@chromium.org>
2016-08-31 10:10:44 -07:00
Badhri Jagan Sridharan
6087158f06 UPSTREAM: USB: cdc-acm: more sanity checking
commit 8835ba4a39cf53f705417b3b3a94eb067673f2c9 upstream.

An attack has become available which pretends to be a quirky
device circumventing normal sanity checks and crashes the kernel
by an insufficient number of interfaces. This patch adds a check
to the code path for quirky devices.

BUG: 28242610

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I9a5f7f3c704b65e866335054f470451fcfae9d1c
2016-08-30 13:41:35 -07:00
Badhri Jagan Sridharan
b24e99b3ae UPSTREAM: USB: iowarrior: fix oops with malicious USB descriptors
commit 4ec0ef3a82125efc36173062a50624550a900ae0 upstream.

The iowarrior driver expects at least one valid endpoint.  If given
malicious descriptors that specify 0 for the number of endpoints,
it will crash in the probe function.  Ensure there is at least
one endpoint on the interface before using it.

The full report of this issue can be found here:
http://seclists.org/bugtraq/2016/Mar/87

BUG: 28242610

Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: If5161c23928e9ef77cb3359cba9b36622b1908df
2016-08-30 13:41:35 -07:00
Badhri Jagan Sridharan
ae6790121c UPSTREAM: USB: usb_driver_claim_interface: add sanity checking
commit 0b818e3956fc1ad976bee791eadcbb3b5fec5bfd upstream.

Attacks that trick drivers into passing a NULL pointer
to usb_driver_claim_interface() using forged descriptors are
known. This thwarts them by sanity checking.

BUG: 28242610

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: Ib43ec5edb156985a9db941785a313f6801df092a
2016-08-30 13:41:35 -07:00
Badhri Jagan Sridharan
ee5c7a9739 UPSTREAM: USB: mct_u232: add sanity checking in probe
commit 4e9a0b05257f29cf4b75f3209243ed71614d062e upstream.

An attack using the lack of sanity checking in probe is known. This
patch checks for the existence of a second port.

CVE-2016-3136
BUG: 28242610
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
[johan: add error message ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I284ad648c2087c34a098d67e0cc6d948a568413c
2016-08-30 13:41:35 -07:00
Badhri Jagan Sridharan
125d90ca2f UPSTREAM: USB: cypress_m8: add endpoint sanity check
commit c55aee1bf0e6b6feec8b2927b43f7a09a6d5f754 upstream.

An attack using missing endpoints exists.

CVE-2016-3137

BUG: 28242610
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I1cc7957a5924175d24f12fdc41162ece67c907e5
2016-08-30 13:41:35 -07:00
Badhri Jagan Sridharan
258a8f519a UPSTREAM: Input: powermate - fix oops with malicious USB descriptors
The powermate driver expects at least one valid USB endpoint in its
probe function.  If given malicious descriptors that specify 0 for
the number of endpoints, it will crash.  Validate the number of
endpoints on the interface before using them.

The full report for this issue can be found here:
http://seclists.org/bugtraq/2016/Mar/85

BUG: 28242610

Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I1cb956a35f3bba73324240d5bd0a029f49d3c456
2016-08-30 13:41:35 -07:00
Jason Baron
ba52437821 BACKPORT: tcp: enable per-socket rate limiting of all 'challenge acks'
(cherry picked from commit 083ae308280d13d187512b9babe3454342a7987e)

The per-socket rate limit for 'challenge acks' was introduced in the
context of limiting ack loops:

commit f2b2c582e8 ("tcp: mitigate ACK loops for connections as tcp_sock")

And I think it can be extended to rate limit all 'challenge acks' on a
per-socket basis.

Since we have the global tcp_challenge_ack_limit, this patch allows for
tcp_challenge_ack_limit to be set to a large value and effectively rely on
the per-socket limit, or set tcp_challenge_ack_limit to a lower value and
still prevents a single connections from consuming the entire challenge ack
quota.

It further moves in the direction of eliminating the global limit at some
point, as Eric Dumazet has suggested. This a follow-up to:
Subject: tcp: make challenge acks less predictable

Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Yue Cao <ycao009@ucr.edu>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Change-Id: I622d5ae96e9387e775a0196c892d8d0e1a5564a7
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-30 16:58:07 +00:00
Balbir Singh
e91f1799ff RFC: FROMLIST: cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork
cgroup_threadgroup_rwsem is acquired in read mode during process exit
and fork.  It is also grabbed in write mode during
__cgroups_proc_write().  I've recently run into a scenario with lots
of memory pressure and OOM and I am beginning to see

systemd

 __switch_to+0x1f8/0x350
 __schedule+0x30c/0x990
 schedule+0x48/0xc0
 percpu_down_write+0x114/0x170
 __cgroup_procs_write.isra.12+0xb8/0x3c0
 cgroup_file_write+0x74/0x1a0
 kernfs_fop_write+0x188/0x200
 __vfs_write+0x6c/0xe0
 vfs_write+0xc0/0x230
 SyS_write+0x6c/0x110
 system_call+0x38/0xb4

This thread is waiting on the reader of cgroup_threadgroup_rwsem to
exit.  The reader itself is under memory pressure and has gone into
reclaim after fork. There are times the reader also ends up waiting on
oom_lock as well.

 __switch_to+0x1f8/0x350
 __schedule+0x30c/0x990
 schedule+0x48/0xc0
 jbd2_log_wait_commit+0xd4/0x180
 ext4_evict_inode+0x88/0x5c0
 evict+0xf8/0x2a0
 dispose_list+0x50/0x80
 prune_icache_sb+0x6c/0x90
 super_cache_scan+0x190/0x210
 shrink_slab.part.15+0x22c/0x4c0
 shrink_zone+0x288/0x3c0
 do_try_to_free_pages+0x1dc/0x590
 try_to_free_pages+0xdc/0x260
 __alloc_pages_nodemask+0x72c/0xc90
 alloc_pages_current+0xb4/0x1a0
 page_table_alloc+0xc0/0x170
 __pte_alloc+0x58/0x1f0
 copy_page_range+0x4ec/0x950
 copy_process.isra.5+0x15a0/0x1870
 _do_fork+0xa8/0x4b0
 ppc_clone+0x8/0xc

In the meanwhile, all processes exiting/forking are blocked almost
stalling the system.

This patch moves the threadgroup_change_begin from before
cgroup_fork() to just before cgroup_canfork().  There is no nee to
worry about threadgroup changes till the task is actually added to the
threadgroup.  This avoids having to call reclaim with
cgroup_threadgroup_rwsem held.

tj: Subject and description edits.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Zefan Li <lizefan@huawei.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Tejun Heo <tj@kernel.org>
[jstultz: Cherry-picked from:
 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git 568ac888215c7f]
Change-Id: Ie8ece84fb613cf6a7b08cea1468473a8df2b9661
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-26 09:37:56 -07:00
Peter Zijlstra
0c3240a1ef RFC: FROMLIST: cgroup: avoid synchronize_sched() in __cgroup_procs_write()
The current percpu-rwsem read side is entirely free of serializing insns
at the cost of having a synchronize_sched() in the write path.

The latency of the synchronize_sched() is too high for cgroups. The
commit 1ed1328792 talks about the write path being a fairly cold path
but this is not the case for Android which moves task to the foreground
cgroup and back around binder IPC calls from foreground processes to
background processes, so it is significantly hotter than human initiated
operations.

Switch cgroup_threadgroup_rwsem into the slow mode for now to avoid the
problem, hopefully it should not be that slow after another commit
80127a39681b ("locking/percpu-rwsem: Optimize readers and reduce global
impact").

We could just add rcu_sync_enter() into cgroup_init() but we do not want
another synchronize_sched() at boot time, so this patch adds the new helper
which doesn't block but currently can only be called before the first use.

Cc: Tejun Heo <tj@kernel.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Reported-by: Dmitry Shmidt <dimitrysh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[jstultz: backported to 4.4]
Change-Id: I34aa9c394d3052779b56976693e96d861bd255f2
Mailing-list-URL: https://lkml.org/lkml/2016/8/11/557
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-26 09:37:43 -07:00
Peter Zijlstra
3228c5eb7a RFC: FROMLIST: locking/percpu-rwsem: Optimize readers and reduce global impact
Currently the percpu-rwsem switches to (global) atomic ops while a
writer is waiting; which could be quite a while and slows down
releasing the readers.

This patch cures this problem by ordering the reader-state vs
reader-count (see the comments in __percpu_down_read() and
percpu_down_write()). This changes a global atomic op into a full
memory barrier, which doesn't have the global cacheline contention.

This also enables using the percpu-rwsem with rcu_sync disabled in order
to bias the implementation differently, reducing the writer latency by
adding some cost to readers.

Mailing-list-URL: https://lkml.org/lkml/2016/8/9/181
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[jstultz: Backported to 4.4]
Change-Id: I8ea04b4dca2ec36f1c2469eccafde1423490572f
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-26 09:37:34 -07:00
Lorenzo Colitti
75c591ea57 net: ipv6: Fix ping to link-local addresses.
ping_v6_sendmsg does not set flowi6_oif in response to
sin6_scope_id or sk_bound_dev_if, so it is not possible to use
these APIs to ping an IPv6 address on a different interface.
Instead, it sets flowi6_iif, which is incorrect but harmless.

Stop setting flowi6_iif, and support various ways of setting oif
in the same priority order used by udpv6_sendmsg.

[Backport of net 5e457896986e16c440c97bb94b9ccd95dd157292]

Bug: 29370996
Change-Id: Ibe1b9434c00ed96f1e30acb110734c6570b087b8
Tested: https://android-review.googlesource.com/#/c/254470/
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26 16:26:53 +09:00
Hannes Frederic Sowa
8f5bf76871 ipv6: fix endianness error in icmpv6_err
IPv6 ping socket error handler doesn't correctly convert the new 32 bit
mtu to host endianness before using.

[Cherry-pick of net dcb94b88c09ce82a80e188d49bcffdc83ba215a6]

Bug: 29370996
Change-Id: Iea0ca79f16c2a1366d82b3b0a3097093d18da8b7
Cc: Lorenzo Colitti <lorenzo@google.com>
Fixes: 6d0bfe2261 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26 16:26:53 +09:00
Badhri Jagan Sridharan
8a58605e46 ANDROID: dm: android-verity: Allow android-verity to be compiled as an independent module
Exports the device mapper callbacks of linear and dm-verity-target
methods.

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I0358be0615c431dce3cc78575aaac4ccfe3aacd7
2016-08-25 15:59:13 +00:00
Mohamad Ayyash
1c6df5fdcb Revert "Android: MMC/UFS IO Latency Histograms."
This reverts commit 8d525c5122.

Change-Id: I69350b98d9de9b1c9f591e03a90f133e328ba72a
2016-08-25 00:59:21 +00:00
Mohan Srinivasan
8d525c5122 Android: MMC/UFS IO Latency Histograms.
This patch adds a new sysfs node (latency_hist) and reports IO
(svc time) latency histograms. Disabled by default, can be enabled
by echoing 0 into latency_hist, stats can be cleared by writing 2
into latency_hist.

Bug: 30677035
Change-Id: I625938135ea33e6e87cf6af1fc7edc136d8b4b32
Signed-off-by: Mohan Srinivasan <srmohan@google.com>
2016-08-24 19:09:22 +00:00
Rainer Weikusat
50acf9e6e3 UPSTREAM: af_unix: Guard against other == sk in unix_dgram_sendmsg
(cherry picked from commit a5527dda344fff0514b7989ef7a755729769daa1)

The unix_dgram_sendmsg routine use the following test

if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {

to determine if sk and other are in an n:1 association (either
established via connect or by using sendto to send messages to an
unrelated socket identified by address). This isn't correct as the
specified address could have been bound to the sending socket itself or
because this socket could have been connected to itself by the time of
the unix_peer_get but disconnected before the unix_state_lock(other). In
both cases, the if-block would be entered despite other == sk which
might either block the sender unintentionally or lead to trying to unlock
the same spin lock twice for a non-blocking send. Add a other != sk
check to guard against this.

Fixes: 7d267278a9 ("unix: avoid use-after-free in ep_remove_wait_queue")
Reported-By: Philipp Hahn <pmhahn@pmhahn.de>
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Tested-by: Philipp Hahn <pmhahn@pmhahn.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Change-Id: I4ebef6a390df3487903b166b837e34c653e01cb2
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-24 13:48:36 +05:30
Takashi Iwai
a20900c4aa UPSTREAM: ALSA: timer: Fix race among timer ioctls
(cherry picked from commit af368027a49a751d6ff4ee9e3f9961f35bb4fede)

ALSA timer ioctls have an open race and this may lead to a
use-after-free of timer instance object.  A simplistic fix is to make
each ioctl exclusive.  We have already tread_sem for controlling the
tread, and extend this as a global mutex to be applied to each ioctl.

The downside is, of course, the worse concurrency.  But these ioctls
aren't to be parallel accessible, in anyway, so it should be fine to
serialize there.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Change-Id: I1ac52f1cba5e7408fd88c8fc1c30ca2e83967ebb
Bug: 28694392
2016-08-22 23:15:17 -07:00
Eric Dumazet
ca9b7b070b UPSTREAM: tcp: make challenge acks less predictable
(cherry picked from commit 75ff39ccc1bd5d3c455b6822ab09e533c551f758)

Yue Cao claims that current host rate limiting of challenge ACKS
(RFC 5961) could leak enough information to allow a patient attacker
to hijack TCP sessions. He will soon provide details in an academic
paper.

This patch increases the default limit from 100 to 1000, and adds
some randomization so that the attacker can no longer hijack
sessions without spending a considerable amount of probes.

Based on initial analysis and patch from Linus.

Note that we also have per socket rate limiting, so it is tempting
to remove the host limit in the future.

v2: randomize the count of challenge acks per second, not the period.

Fixes: 282f23c6ee ("tcp: implement RFC 5961 3.2")
Reported-by: Yue Cao <ycao009@ucr.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change-Id: Ib46ba66f5e4a5a7c81bfccd7b0aa83c3d9e1b3bb
Bug: 30809774
2016-08-16 17:33:30 -07:00
Waiman Long
76554612b4 sched/fair: Avoid redundant idle_cpu() call in update_sg_lb_stats()
Part of the responsibility of the update_sg_lb_stats() function is to
update the idle_cpus statistical counter in struct sg_lb_stats. This
check is done by calling idle_cpu(). The idle_cpu() function, in
turn, checks a number of fields within the run queue structure such
as rq->curr and rq->nr_running.

With the current layout of the run queue structure, rq->curr and
rq->nr_running are in separate cachelines. The rq->curr variable is
checked first followed by nr_running. As nr_running is also accessed
by update_sg_lb_stats() earlier, it makes no sense to load another
cacheline when nr_running is not 0 as idle_cpu() will always return
false in this case.

This patch eliminates this redundant cacheline load by checking the
cached nr_running before calling idle_cpu().

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1448478580-26467-2-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit a426f99c91d1036767a7819aaaba6bd3191b7f06)
Signed-off-by: Javi Merino <javi.merino@arm.com>
2016-08-15 13:23:38 -07:00
Ricky Liang
34828bd3e7 FIXUP: sched: scheduler-driven cpu frequency selection
Two fixups that have been reported on LKML. The next version of
scheduler-driver cpu frequency selection patch set should include
these fixes and we can drop this patch then.

Signed-off-by: Ricky Liang <jcliang@chromium.org>

Change-Id: Ia2f8b5c0dd5dac06580256eeb4b259929688af68
2016-08-15 13:22:59 -07:00
Winter Wang
1a0cf8ace1 UPSTREAM: usb: gadget: configfs: add mutex lock before unregister gadget
There may be a race condition if f_fs calls unregister_gadget_item in
ffs_closed() when unregister_gadget is called by UDC store at the same time.
this leads to a kernel NULL pointer dereference:

[  310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[  310.645053] init: Service 'adbd' is being killed...
[  310.658938] pgd = c9528000
[  310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000
[  310.669702] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
[  310.675211] Modules linked in:
[  310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c #2
[  310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000
[  310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0
[  310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c
[  310.708648] pc : [<c075efc0>]    lr : [<c0bfb0bc>]    psr: 600f0113
<snip..>
[  311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34)
[  311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c)
[  311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0)
[  311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0)
[  311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14)
[  311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0)
[  311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8)
[  311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4)
[  311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20)

for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0
when usb IO thread fails, but switch adb from on to off also triggers write
"none" > UDC. These 2 operations both call unregister_gadget, which will lead
to the panic above.

add a mutex before calling unregister_gadget for api used in f_fs.

Signed-off-by: Winter Wang <wente.wang@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-15 09:53:21 -07:00
Badhri Jagan Sridharan
aa3cda16a5 ANDROID: dm-verity: adopt changes made to dm callbacks
v4.4 introduced changes to the callbacks used for
dm-linear and dm-verity-target targets. Move to those headers
in dm-android-verity.

Verified on hikey while having
BOARD_USES_RECOVERY_AS_BOOT := true
BOARD_BUILD_SYSTEM_ROOT_IMAGE := true

BUG: 27339727
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: Ic64950c3b55f0a6eaa570bcedc2ace83bbf3005e
2016-08-12 14:18:03 -07:00
Al Viro
4804692cb1 UPSTREAM: ecryptfs: fix handling of directory opening
(cherry picked from commit 6a480a7842545ec520a91730209ec0bae41694c1)

First of all, trying to open them r/w is idiocy; it's guaranteed to fail.
Moreover, assigning ->f_pos and assuming that everything will work is
blatantly broken - try that with e.g. tmpfs as underlying layer and watch
the fireworks.  There may be a non-trivial amount of state associated with
current IO position, well beyond the numeric offset.  Using the single
struct file associated with underlying inode is really not a good idea;
we ought to open one for each ecryptfs directory struct file.

Additionally, file_operations both for directories and non-directories are
full of pointless methods; non-directories should *not* have ->iterate(),
directories should not have ->flush(), ->fasync() and ->splice_read().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Change-Id: I4813ce803f270fdd364758ce1dc108b76eab226e
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-12 13:44:30 -07:00
Jeff Mahoney
7997255b0d UPSTREAM: ecryptfs: don't allow mmap when the lower fs doesn't support it
(cherry picked from commit f0fe970df3838c202ef6c07a4c2b36838ef0a88b)

There are legitimate reasons to disallow mmap on certain files, notably
in sysfs or procfs.  We shouldn't emulate mmap support on file systems
that don't offer support natively.

CVE-2016-1583

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
[tyhicks: clean up f_op check by using ecryptfs_file_to_lower()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>

Change-Id: I66e3670771630a25b0608f10019d1584e9ce73a6
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-12 13:44:17 -07:00
Jeff Mahoney
1da2a42de4 UPSTREAM: Revert "ecryptfs: forbid opening files without mmap handler"
(cherry picked from commit 78c4e172412de5d0456dc00d2b34050aa0b683b5)

This reverts upstream commit 2f36db71009304b3f0b95afacd8eba1f9f046b87.

It fixed a local root exploit but also introduced a dependency on
the lower file system implementing an mmap operation just to open a file,
which is a bit of a heavy hammer.  The right fix is to have mmap depend
on the existence of the mmap handler instead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>

Fixes: Change-Id I0be77c7f8bd3046bc34cd87ef577529792d479bc
       ("UPSTREAM: ecryptfs: forbid opening files without mmap handler")
Change-Id: Ib9bc87099f7f89e4e12dbc1a79e884dcadb1befb
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-12 13:44:01 -07:00
Amit Pundir
0b848391c2 ANDROID: net: core: fix UID-based routing
Fix RTA_UID enum to match it with the Android userspace code which
assumes RTA_UID=18.

With this patch all Android kernel networking unit tests mentioned here
https://source.android.com/devices/tech/config/kernel_network_tests.html
are success.

Without this patch multinetwork_test.py unit test fails.

Change-Id: I3ff36670f7d4e5bf5f01dce584ae9d53deabb3ed
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-12 13:38:04 -07:00
Amit Pundir
ef58c0a269 ANDROID: net: fib: remove duplicate assignment
Remove duplicate FRA_GOTO assignment.

Fixes: fd2cf795f3 ("net: core: Support UID-based routing.")

Change-Id: I462c24b16fdef42ae2332571a0b95de3ef9d2e25
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-12 13:37:44 -07:00
John Stultz
fb8741a0ba FROMLIST: proc: Fix timerslack_ns CAP_SYS_NICE check when adjusting self
In changing from checking ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)
to capable(CAP_SYS_NICE), I missed that ptrace_my_access succeeds
when p == current, but the CAP_SYS_NICE doesn't.

Thus while the previous commit was intended to loosen the needed
privledges to modify a processes timerslack, it needlessly restricted
a task modifying its own timerslack via the proc/<tid>/timerslack_ns
(which is permitted also via the PR_SET_TIMERSLACK method).

This patch corrects this by checking if p == current before checking
the CAP_SYS_NICE value.

Cc: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
CC: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Colin Cross <ccross@android.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Elliott Hughes <enh@google.com>
Cc: Android Kernel Team <kernel-team@android.com>
Mailing-list-url: http://www.spinics.net/lists/kernel/msg2317488.html
Change-Id: Ia3e8aff07c2d41f55b6617502d33c39b7d781aac
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-11 15:20:37 -07:00
Matt Wagantall
2a4445395f sched/rt: Add Kconfig option to enable panicking for RT throttling
This may be useful for detecting and debugging RT throttling issues.

Change-Id: I5807a897d11997d76421c1fcaa2918aad988c6c9
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
[rameezmustafa@codeaurora.org]: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[jstultz: forwardported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-11 14:26:55 -07:00
Matt Wagantall
989f33f789 sched/rt: print RT tasks when RT throttling is activated
Existing debug prints do not provide any clues about which tasks
may have triggered RT throttling. Print the names and PIDs of
all tasks on the throttled rt_rq to help narrow down the source
of the problem.

Change-Id: I180534c8a647254ed38e89d0c981a8f8bccd741c
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
[rameezmustafa@codeaurora.org]: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-11 14:26:54 -07:00
Peter Zijlstra
3276c3d7bc UPSTREAM: sched: Fix a race between __kthread_bind() and sched_setaffinity()
Because sched_setscheduler() checks p->flags & PF_NO_SETAFFINITY
without locks, a caller might observe an old value and race with the
set_cpus_allowed_ptr() call from __kthread_bind() and effectively undo
it:

	__kthread_bind()
	  do_set_cpus_allowed()
						<SYSCALL>
						  sched_setaffinity()
						    if (p->flags & PF_NO_SETAFFINITIY)
						    set_cpus_allowed_ptr()
	  p->flags |= PF_NO_SETAFFINITY

Fix the bug by putting everything under the regular scheduler locks.

This also closes a hole in the serialization of task_struct::{nr_,}cpus_allowed.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dedekind1@gmail.com
Cc: juri.lelli@arm.com
Cc: mgorman@suse.de
Cc: riel@redhat.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20150515154833.545640346@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 25834c73f9)
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>

BUG=chrome-os-partner:44828
TEST=Boot kernel on Oak.
TEST=smaug-release and strago-release trybots.

Change-Id: Id3c898c5ee1a22ed704e83f2ecf5f78199280d38
Reviewed-on: https://chromium-review.googlesource.com/321264
Commit-Ready: Ricky Liang <jcliang@chromium.org>
Tested-by: Ricky Liang <jcliang@chromium.org>
Reviewed-by: Ricky Liang <jcliang@chromium.org>

Conflicts:
	kernel/sched/core.c
2016-08-11 14:26:54 -07:00
Srinath Sridharan
07ec7db165 sched/fair: Favor higher cpus only for boosted tasks
This CL separates the notion of boost and prefer_idle schedtune
attributes in cpu selection. Today only top-app
tasks are boosted. The CPU selection is slightly tweaked such that
higher order cpus are preferred only for boosted tasks (top-app) and the
rest would be skewed towards lower order cpus.
This avoids starvation issues for fg tasks when interacting with high
priority top-app tasks (a problem often seen in the case of system_server).

bug: 30245369
bug: 30292998
Change-Id: I0377e00893b9f6586eec55632a265518fd2fa8a1

Conflicts:
	kernel/sched/fair.c
2016-08-11 14:26:53 -07:00
Christoph Lameter
142b2acc79 vmstat: make vmstat_updater deferrable again and shut down on idle
Currently the vmstat updater is not deferrable as a result of commit
ba4877b9ca ("vmstat: do not use deferrable delayed work for
vmstat_update").  This in turn can cause multiple interruptions of the
applications because the vmstat updater may run at

Make vmstate_update deferrable again and provide a function that folds
the differentials when the processor is going to idle mode thus
addressing the issue of the above commit in a clean way.

Note that the shepherd thread will continue scanning the differentials
from another processor and will reenable the vmstat workers if it
detects any changes.

Change-Id: Idf256cfacb40b4dc8dbb6795cf06b34e8fec7a06
Fixes: ba4877b9ca ("vmstat: do not use deferrable delayed work for vmstat_update")
Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Git-commit: 0eb77e9880321915322d42913c3b53241739c8aa
[shashim@codeaurora.org: resolve minor merge conflicts]
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
[jstultz: fwdport to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-11 14:26:53 -07:00
Juri Lelli
74b4fa8e5c sched/fair: call OPP update when going idle after migration
When a task leaves a rq because it is migrated away it carries its
utilization with him. In this case and OPP update on the src rq might be
needed. The corresponding update at dst rq will happen at enqueue time.

Change-Id: I22754a43760fc8d22a488fe15044af93787ea7a8

sched/fair: Fix uninitialised variable in idle_balance

compiler warned, looks legit.

Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2016-08-11 14:26:52 -07:00