Commit graph

566530 commits

Author SHA1 Message Date
Florian Fainelli
a3fb2b3bd4 net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
[ Upstream commit 4f101c47791cdcb831b3ef1f831b1cc51e4fe03c ]

We kept shadow copies of which interrupt sources we have enabled and
disabled, but due to an order bug in how intrl2_mask_clear was defined,
we could run into the following scenario:

CPU0					CPU1
intrl2_1_mask_clear(..)
sets INTRL2_CPU_MASK_CLEAR
					bcm_sf2_switch_1_isr
					read INTRL2_CPU_STATUS and masks with stale
					irq1_mask value
updates irq1_mask value

Which would make us loop again and again trying to process and interrupt
we are not clearing since our copy of whether it was enabled before
still indicates it was not. Fix this by updating the shadow copy first,
and then unasking at the HW level.

Fixes: 246d7f773c ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Paul Blakey
c03c024fd3 net/mlx5: Added missing check of msg length in verifying its signature
[ Upstream commit 2c0f8ce1b584a4d7b8ff53140d21dfed99834940 ]

Set and verify signature calculates the signature for each of the
mailbox nodes, even for those that are unused (from cache). Added
a missing length check to set and verify only those which are used.

While here, also moved the setting of msg's nodes token to where we
already go over them. This saves a pass because checksum is disabled,
and the only useful thing remaining that set signature does is setting
the token.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB
adapters')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Vegard Nossum
4be4511af7 tipc: fix NULL pointer dereference in shutdown()
[ Upstream commit d2fbdf76b85bcdfe57b8ef2ba09d20e8ada79abd ]

tipc_msg_create() can return a NULL skb and if so, we shouldn't try to
call tipc_node_xmit_skb() on it.

    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 3 PID: 30298 Comm: trinity-c0 Not tainted 4.7.0-rc7+ #19
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    task: ffff8800baf09980 ti: ffff8800595b8000 task.ti: ffff8800595b8000
    RIP: 0010:[<ffffffff830bb46b>]  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
    RSP: 0018:ffff8800595bfce8  EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003023b0e0
    RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff83d12580
    RBP: ffff8800595bfd78 R08: ffffed000b2b7f32 R09: 0000000000000000
    R10: fffffbfff0759725 R11: 0000000000000000 R12: 1ffff1000b2b7f9f
    R13: ffff8800595bfd58 R14: ffffffff83d12580 R15: dffffc0000000000
    FS:  00007fcdde242700(0000) GS:ffff88011af80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fcddde1db10 CR3: 000000006874b000 CR4: 00000000000006e0
    DR0: 00007fcdde248000 DR1: 00007fcddd73d000 DR2: 00007fcdde248000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
    Stack:
     0000000000000018 0000000000000018 0000000041b58ab3 ffffffff83954208
     ffffffff830bb400 ffff8800595bfd30 ffffffff8309d767 0000000000000018
     0000000000000018 ffff8800595bfd78 ffffffff8309da1a 00000000810ee611
    Call Trace:
     [<ffffffff830c84a3>] tipc_shutdown+0x553/0x880
     [<ffffffff825b4a3b>] SyS_shutdown+0x14b/0x170
     [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
     [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 90 00 b4 0b 83 c7 00 f1 f1 f1 f1 4c 8d 6d e0 c7 40 04 00 00 00 f4 c7 40 08 f3 f3 f3 f3 48 89 d8 48 c1 e8 03 c7 45 b4 00 00 00 00 <80> 3c 30 00 75 78 48 8d 7b 08 49 8d 75 c0 48 b8 00 00 00 00 00
    RIP  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
     RSP <ffff8800595bfce8>
    ---[ end trace 57b0484e351e71f1 ]---

I feel like we should maybe return -ENOMEM or -ENOBUFS, but I'm not sure
userspace is equipped to handle that. Anyway, this is better than a GPF
and looks somewhat consistent with other tipc_msg_create() callers.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Vegard Nossum
8d0d2ce6c8 net/irda: handle iriap_register_lsap() allocation failure
[ Upstream commit 5ba092efc7ddff040777ae7162f1d195f513571b ]

If iriap_register_lsap() fails to allocate memory, self->lsap is
set to NULL. However, none of the callers handle the failure and
irlmp_connect_request() will happily dereference it:

    iriap_register_lsap: Unable to allocated LSAP!
    ================================================================================
    UBSAN: Undefined behaviour in net/irda/irlmp.c:378:2
    member access within null pointer of type 'struct lsap_cb'
    CPU: 1 PID: 15403 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #81
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
    04/01/2014
     0000000000000000 ffff88010c7e78a8 ffffffff82344f40 0000000041b58ab3
     ffffffff84f98000 ffffffff82344e94 ffff88010c7e78d0 ffff88010c7e7880
     ffff88010630ad00 ffffffff84a5fae0 ffffffff84d3f5c0 000000000000017a
    Call Trace:
     [<ffffffff82344f40>] dump_stack+0xac/0xfc
     [<ffffffff8242f5a8>] ubsan_epilogue+0xd/0x8a
     [<ffffffff824302bf>] __ubsan_handle_type_mismatch+0x157/0x411
     [<ffffffff83b7bdbc>] irlmp_connect_request+0x7ac/0x970
     [<ffffffff83b77cc0>] iriap_connect_request+0xa0/0x160
     [<ffffffff83b77f48>] state_s_disconnect+0x88/0xd0
     [<ffffffff83b78904>] iriap_do_client_event+0x94/0x120
     [<ffffffff83b77710>] iriap_getvaluebyclass_request+0x3e0/0x6d0
     [<ffffffff83ba6ebb>] irda_find_lsap_sel+0x1eb/0x630
     [<ffffffff83ba90c8>] irda_connect+0x828/0x12d0
     [<ffffffff833c0dfb>] SYSC_connect+0x22b/0x340
     [<ffffffff833c7e09>] SyS_connect+0x9/0x10
     [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
     [<ffffffff845f946a>] entry_SYSCALL64_slow_path+0x25/0x25
    ================================================================================

The bug seems to have been around since forever.

There's more problems with missing error checks in iriap_init() (and
indeed all of irda_init()), but that's a bigger problem that needs
very careful review and testing. This patch will fix the most serious
bug (as it's easily reached from unprivileged userspace).

I have tested my patch with a reproducer.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Lance Richardson
0bb225a04d vti: flush x-netns xfrm cache when vti interface is removed
[ Upstream commit a5d0dc810abf3d6b241777467ee1d6efb02575fc ]

When executing the script included below, the netns delete operation
hangs with the following message (repeated at 10 second intervals):

  kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

This occurs because a reference to the lo interface in the "secure" netns
is still held by a dst entry in the xfrm bundle cache in the init netns.

Address this problem by garbage collecting the tunnel netns flow cache
when a cross-namespace vti interface receives a NETDEV_DOWN notification.

A more detailed description of the problem scenario (referencing commands
in the script below):

(1) ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1

  The vti_test interface is created in the init namespace. vti_tunnel_init()
  attaches a struct ip_tunnel to the vti interface's netdev_priv(dev),
  setting the tunnel net to &init_net.

(2) ip link set vti_test netns secure

  The vti_test interface is moved to the "secure" netns. Note that
  the associated struct ip_tunnel still has tunnel->net set to &init_net.

(3) ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1

  The first packet sent using the vti device causes xfrm_lookup() to be
  called as follows:

      dst = xfrm_lookup(tunnel->net, skb_dst(skb), fl, NULL, 0);

  Note that tunnel->net is the init namespace, while skb_dst(skb) references
  the vti_test interface in the "secure" namespace. The returned dst
  references an interface in the init namespace.

  Also note that the first parameter to xfrm_lookup() determines which flow
  cache is used to store the computed xfrm bundle, so after xfrm_lookup()
  returns there will be a cached bundle in the init namespace flow cache
  with a dst referencing a device in the "secure" namespace.

(4) ip netns del secure

  Kernel begins to delete the "secure" namespace.  At some point the
  vti_test interface is deleted, at which point dst_ifdown() changes
  the dst->dev in the cached xfrm bundle flow from vti_test to lo (still
  in the "secure" namespace however).
  Since nothing has happened to cause the init namespace's flow cache
  to be garbage collected, this dst remains attached to the flow cache,
  so the kernel loops waiting for the last reference to lo to go away.

<Begin script>
ip link add br1 type bridge
ip link set dev br1 up
ip addr add dev br1 1.1.1.1/8

ip netns add secure
ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
ip link set vti_test netns secure
ip netns exec secure ip link set vti_test up
ip netns exec secure ip link s lo up
ip netns exec secure ip addr add dev lo 192.168.100.1/24
ip netns exec secure ip route add 192.168.200.0/24 dev vti_test
ip xfrm policy flush
ip xfrm state flush
ip xfrm policy add dir out tmpl src 1.1.1.1 dst 1.1.1.2 \
   proto esp mode tunnel mark 1
ip xfrm policy add dir in tmpl src 1.1.1.2 dst 1.1.1.1 \
   proto esp mode tunnel mark 1
ip xfrm state add src 1.1.1.1 dst 1.1.1.2 proto esp spi 1 \
   mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
ip xfrm state add src 1.1.1.2 dst 1.1.1.1 proto esp spi 1 \
   mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788

ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1

ip netns del secure
<End script>

Reported-by: Hangbin Liu <haliu@redhat.com>
Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Linus Torvalds
9b5390d7b6 af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
commit 6e1ce3c3451291142a57c4f3f6f999a29fb5b3bc upstream.

Right now we use the 'readlock' both for protecting some of the af_unix
IO path and for making the bind be single-threaded.

The two are independent, but using the same lock makes for a nasty
deadlock due to ordering with regards to filesystem locking.  The bind
locking would want to nest outside the VSF pathname locking, but the IO
locking wants to nest inside some of those same locks.

We tried to fix this earlier with commit c845acb324 ("af_unix: Fix
splice-bind deadlock") which moved the readlock inside the vfs locks,
but that caused problems with overlayfs that will then call back into
filesystem routines that take the lock in the wrong order anyway.

Splitting the locks means that we can go back to having the bind lock be
the outermost lock, and we don't have any deadlocks with lock ordering.

Acked-by: Rainer Weikusat <rweikusat@cyberadapt.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Linus Torvalds
941f699544 Revert "af_unix: Fix splice-bind deadlock"
commit 38f7bd94a97b542de86a2be9229289717e33a7a4 upstream.

This reverts commit c845acb324.

It turns out that it just replaces one deadlock with another one: we can
still get the wrong lock ordering with the readlock due to overlayfs
calling back into the filesystem layer and still taking the vfs locks
after the readlock.

The proper solution ends up being to just split the readlock into two
pieces: the bind lock (taken *outside* the vfs locks) and the IO lock
(taken *inside* the filesystem locks).  The two locks are independent
anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Mahesh Bandewar
f357a79839 bonding: Fix bonding crash
[ Upstream commit 24b27fc4cdf9e10c5e79e5923b6b7c2c5c95096c ]

Following few steps will crash kernel -

  (a) Create bonding master
      > modprobe bonding miimon=50
  (b) Create macvlan bridge on eth2
      > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
	   type macvlan
  (c) Now try adding eth2 into the bond
      > echo +eth2 > /sys/class/net/bond0/bonding/slaves
      <crash>

Bonding does lots of things before checking if the device enslaved is
busy or not.

In this case when the notifier call-chain sends notifications, the
bond_netdev_event() assumes that the rx_handler /rx_handler_data is
registered while the bond_enslave() hasn't progressed far enough to
register rx_handler for the new slave.

This patch adds a rx_handler check that can be performed right at the
beginning of the enslave code to avoid getting into this situation.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Maurizio Lombardi
56e5ad1e1d megaraid: fix null pointer check in megasas_detach_one().
commit 546e559c79b1a8d27c23262907a00fc209e392a0 upstream.

The pd_seq_sync pointer can't be NULL, we have to check its entries
instead.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Arnd Bergmann
e3718ed1df nouveau: fix nv40_perfctr_next() cleanup regression
commit 86d65b7e7a0c927d07d18605c276d0f142438ead upstream.

gcc-6 warns about code in the nouveau driver that is obviously silly:

drivers/gpu/drm/nouveau/nvkm/engine/pm/nv40.c: In function 'nv40_perfctr_next':
drivers/gpu/drm/nouveau/nvkm/engine/pm/nv40.c:62:19: warning: self-comparison always evaluats to false [-Wtautological-compare]
  if (pm->sequence != pm->sequence) {

The behavior was accidentally introduced in a patch described as "This is
purely preparation for upcoming commits, there should be no code changes here.".
As far as I can tell, that was true for the rest of that patch except for
this one function, which has been changed to a NOP.

This patch restores the original behavior.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 8c1aeaa139 ("drm/nouveau/pm: cosmetic changes")
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Colin Ian King
2c4e9913a1 Staging: iio: adc: fix indent on break statement
commit b6acb0cfc21293a1bfc283e9217f58f7474ef728 upstream.

Fix indent warning when building with gcc 6:
drivers/staging/iio/adc/ad7192.c:239:4: warning: statement is indented
  as if it were guarded by... [-Wmisleading-indentation]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Arnd Bergmann
682c360eb2 iwlegacy: avoid warning about missing braces
commit 2cce76c3fab410520610a7d2f52faebc3cfcf843 upstream.

gcc-6 warns about code in il3945_hw_txq_ctx_free() being
somewhat ambiguous:

drivers/net/wireless/intel/iwlegacy/3945.c:1022:5: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]

This adds a set of curly braces to avoid the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:36 +02:00
Arnd Bergmann
1fff631e52 ath9k: fix misleading indentation
commit 362210e0dff4eb7bb36a9b34dbef3b39d779d95e upstream.

A cleanup patch in linux-3.18 moved around some code in the ath9k
driver and left some code to be indented in a misleading way,
made worse by the addition of some new code for p2p mode, as
discovered by a new gcc-6 warning:

drivers/net/wireless/ath/ath9k/init.c: In function 'ath9k_set_hw_capab':
drivers/net/wireless/ath/ath9k/init.c:851:4: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
    hw->wiphy->iface_combinations = if_comb;
    ^~
drivers/net/wireless/ath/ath9k/init.c:847:3: note: ...this 'if' clause, but it is not
   if (ath9k_is_chanctx_enabled())
   ^~

The code is in fact correct, but the indentation is not, so I'm
reformatting it as it should have been after the original cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 499afaccf6 ("ath9k: Isolate ath9k_use_chanctx module parameter")
Fixes: eb61f9f623 ("ath9k: advertise p2p dev support when chanctx")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Arnd Bergmann
7fb33fb721 am437x-vfpe: fix typo in vpfe_get_app_input_index
commit 0fb504001192c1df62c847a8bb6558753c36ebef upstream.

gcc-6 points out an obviously silly comparison in vpfe_get_app_input_index():

drivers/media/platform/am437x/am437x-vpfe.c: In function 'vpfe_get_app_input_index':
drivers/media/platform/am437x/am437x-vpfe.c:1709:27: warning: self-comparison always evaluats to true [-Wtautological-compare]
       client->adapter->nr == client->adapter->nr) {
                           ^~

This was introduced in a slighly incorrect conversion, and it's
clear that the comparison was meant to compare the iterator
to the current subdev instead, as we do in the line above.

Fixes: d37232390f ("[media] media: am437x-vpfe: match the OF node/i2c addr instead of name")

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Linus Torvalds
c48692f59b Add braces to avoid "ambiguous ‘else’" compiler warnings
commit 194dc870a5890e855ecffb30f3b80ba7c88f96d6 upstream.

Some of our "for_each_xyz()" macro constructs make gcc unhappy about
lack of braces around if-statements inside or outside the loop, because
the loop construct itself has a "if-then-else" statement inside of it.

The resulting warnings look something like this:

  drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
  drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
     if (ctx != dev_priv->kernel_context)
        ^

even if the code itself is fine.

Since the warning is fairly easy to avoid by adding a braces around the
if-statement near the for_each_xyz() construct, do so, rather than
disabling the otherwise potentially useful warning.

(The if-then-else statements used in the "for_each_xyz()" constructs are
designed to be inherently safe even with no braces, but in this case
it's quite understandable that gcc isn't really able to tell that).

This finally leaves the standard "allmodconfig" build with just a
handful of remaining warnings, so new and valid warnings hopefully will
stand out.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Arnd Bergmann
bc43ac8bd6 net: caif: fix misleading indentation
commit 8e0cc8c326d99e41468c96fea9785ab78883a281 upstream.

gcc points out code that is not indented the way it is
interpreted:

net/caif/cfpkt_skbuff.c: In function 'cfpkt_setlen':
net/caif/cfpkt_skbuff.c:289:4: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]
    return cfpkt_getlen(pkt);
    ^~~~~~
net/caif/cfpkt_skbuff.c:286:3: note: ...this 'else' clause, but it is not
   else
   ^~~~

It is clear from the context that not returning here would be
a bug, as we'd end up passing a negative length into a function
that takes a u16 length, so it is not missing curly braces
here, and I'm assuming that the indentation is the only part
that's wrong about it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Steven Rostedt
a52031beb0 Makefile: Mute warning for __builtin_return_address(>0) for tracing only
commit 377ccbb483738f84400ddf5840c7dd8825716985 upstream.

With the latest gcc compilers, they give a warning if
__builtin_return_address() parameter is greater than 0. That is because if
it is used by a function called by a top level function (or in the case of
the kernel, by assembly), it can try to access stack frames outside the
stack and crash the system.

The tracing system uses __builtin_return_address() of up to 2! But it is
well aware of the dangers that it may have, and has even added precautions
to protect against it (see the thunk code in arch/x86/entry/thunk*.S)

Linus originally added KBUILD_CFLAGS that would suppress the warning for the
entire kernel, as simply adding KBUILD_CFLAGS to the tracing directory
wouldn't work. The tracing directory plays a bit with the CFLAGS and
requires a little more logic.

This adds that special logic to only suppress the warning for the tracing
directory. If it is used anywhere else outside of tracing, the warning will
still be triggered.

Link: http://lkml.kernel.org/r/20160728223043.51996267@grimm.local.home

Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Linus Torvalds
a521e942bd Disable "frame-address" warning
commit 124a3d88fa20e1869fc229d7d8c740cc81944264 upstream.

Newer versions of gcc warn about the use of __builtin_return_address()
with a non-zero argument when "-Wall" is specified:

  kernel/trace/trace_irqsoff.c: In function ‘stop_critical_timings’:
  kernel/trace/trace_irqsoff.c:433:86: warning: calling ‘__builtin_return_address’ with a nonzero argument is unsafe [-Wframe-address]
     stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
  [ .. repeats a few times for other similar cases .. ]

It is true that a non-zero argument is somewhat dangerous, and we do not
actually have very many uses of that in the kernel - but the ftrace code
does use it, and as Stephen Rostedt says:

 "We are well aware of the danger of using __builtin_return_address() of
  > 0.  In fact that's part of the reason for having the "thunk" code in
  x86 (See arch/x86/entry/thunk_{64,32}.S).  [..] it adds extra frames
  when tracking irqs off sections, to prevent __builtin_return_address()
  from accessing bad areas.  In fact the thunk_32.S states: 'Trampoline to
  trace irqs off.  (otherwise CALLER_ADDR1 might crash)'."

For now, __builtin_return_address() with a non-zero argument is the best
we can do, and the warning is not helpful and can end up making people
miss other warnings for real problems.

So disable the frame-address warning on compilers that need it.

Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Linus Torvalds
3da2a4cb6e Disable "maybe-uninitialized" warning globally
commit 6e8d666e925333c55378e8d5540a8a9ee0eea9c5 upstream.

Several build configurations had already disabled this warning because
it generates a lot of false positives.  But some had not, and it was
still enabled for "allmodconfig" builds, for example.

Looking at the warnings produced, every single one I looked at was a
false positive, and the warnings are frequent enough (and big enough)
that they can easily hide real problems that you don't notice in the
noise generated by -Wmaybe-uninitialized.

The warning is good in theory, but this is a classic case of a warning
that causes more problems than the warning can solve.

If gcc gets better at avoiding false positives, we may be able to
re-enable this warning.  But as is, we're better off without it, and I
want to be able to see the *real* warnings.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Arnd Bergmann
7cd4d22328 gcov: disable -Wmaybe-uninitialized warning
commit e72e2dfe7c16ffbfbabf9cb24adc6d9f93a4fe37 upstream.

When gcov profiling is enabled, we see a lot of spurious warnings about
possibly uninitialized variables being used:

arch/arm/mm/dma-mapping.c: In function 'arm_coherent_iommu_map_page':
arch/arm/mm/dma-mapping.c:1085:16: warning: 'start' may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/clk/st/clk-flexgen.c: In function 'st_of_flexgen_setup':
drivers/clk/st/clk-flexgen.c:323:9: warning: 'num_parents' may be used uninitialized in this function [-Wmaybe-uninitialized]
kernel/cgroup.c: In function 'cgroup_mount':
kernel/cgroup.c:2119:11: warning: 'root' may be used uninitialized in this function [-Wmaybe-uninitialized]

All of these are false positives, so it seems better to just disable
the warnings whenever GCOV is enabled. Most users don't enable GCOV,
and based on a prior patch, it is now also disabled for 'allmodconfig'
builds, so there should be no downsides of doing this.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Arnd Bergmann
605623774e Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES
commit 815eb71e7149ecce40db9dd0ad09c4dd9d33c60f upstream.

CONFIG_PROFILE_ALL_BRANCHES confuses gcc-5.x to the degree that it prints
incorrect warnings about a lot of variables that it thinks can be used
uninitialized, e.g.:

i2c/busses/i2c-diolan-u2c.c: In function 'diolan_usb_xfer':
i2c/busses/i2c-diolan-u2c.c:391:16: warning: 'byte' may be used uninitialized in this function
iio/gyro/itg3200_core.c: In function 'itg3200_probe':
iio/gyro/itg3200_core.c:213:6: warning: 'val' may be used uninitialized in this function
leds/leds-lp55xx-common.c: In function 'lp55xx_update_bits':
leds/leds-lp55xx-common.c:350:6: warning: 'tmp' may be used uninitialized in this function
misc/bmp085.c: In function 'show_pressure':
misc/bmp085.c:363:10: warning: 'pressure' may be used uninitialized in this function
power/ds2782_battery.c: In function 'ds2786_get_capacity':
power/ds2782_battery.c:214:17: warning: 'raw' may be used uninitialized in this function

These are all false positives that either rob someone's time when trying
to figure out whether they are real, or they get people to send wrong
patches to shut up the warnings.

Nobody normally wants to run a CONFIG_PROFILE_ALL_BRANCHES kernel in
production, so disabling the whole class of warnings for this configuration
has no serious downsides either.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Steven Rostedt <rostedtgoodmis.org>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Robert Jarzmik
d772ec1314 kbuild: forbid kernel directory to contain spaces and colons
commit 51193b76bfff5027cf96ba63effae808ad67cca7 upstream.

When the kernel path contains a space or a colon somewhere in the path
name, the modules_install target doesn't work anymore, as the path names
are not enclosed in double quotes. It is also supposed that and O= build
will suffer from the same weakness as modules_install.

Instead of checking and improving kbuild to resist to directories
including these characters, error out early to prevent any build if the
kernel's main directory contains a space.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Josh Poimboeuf
9b6bbc3d96 tools: Support relative directory path for 'O='
commit e17cf3a80d4ba0c4e40bf1a89deb1354c2e10e14 upstream.

Running "make O=foo" (with a relative directory path) fails with:

  scripts/Makefile.include:3: *** O=foo does not exist.  Stop.
  /home/jpoimboe/git/linux/Makefile:1547: recipe for target 'tools/objtool' failed

The tools Makefile gets confused by the relative path and tries to build
objtool in tools/foo.  Convert the output directory to an absolute path
before passing it to the tools Makefile.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-next@vger.kernel.org
Cc: linux@roeck-us.net
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/94a078c6c998fac9f01a14f574008bf7dff40191.1457016803.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Wang YanQing
97283248f5 Makefile: revert "Makefile: Document ability to make file.lst and file.S" partially
commit 40ab87a4003c7952976ce901a2b9ece5ed833168 upstream.

Commit 6271897978 ("Makefile: Document ability to make file.lst
and file.S") document ability to make file.S, but there isn't such
ability in kbuild, so revert it.

Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Michal Marek
252644d83b kbuild: Do not run modules_install and install in paralel
commit a85a41ed69f27c4c667d8c418df14b4fb220c4ad upstream.

Based on a x86-only patch by Andy Lutomirski <luto@amacapital.net>

With modular kernels, 'make install' is going to need the installed
modules at some point to generate the initramfs.

Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Ashish Samant
cf5fa7b898 ocfs2: fix start offset to ocfs2_zero_range_for_truncate()
commit d21c353d5e99c56cdd5b5c1183ffbcaf23b8b960 upstream.

If we punch a hole on a reflink such that following conditions are met:

1. start offset is on a cluster boundary
2. end offset is not on a cluster boundary
3. (end offset is somewhere in another extent) or
   (hole range > MAX_CONTIG_BYTES(1MB)),

we dont COW the first cluster starting at the start offset.  But in this
case, we were wrongly passing this cluster to
ocfs2_zero_range_for_truncate() to zero out.  This will modify the
cluster in place and zero it in the source too.

Fix this by skipping this cluster in such a scenario.

To reproduce:

1. Create a random file of say 10 MB
     xfs_io -c 'pwrite -b 4k 0 10M' -f 10MBfile
2. Reflink  it
     reflink -f 10MBfile reflnktest
3. Punch a hole at starting at cluster boundary  with range greater that
1MB. You can also use a range that will put the end offset in another
extent.
     fallocate -p -o 0 -l 1048615 reflnktest
4. sync
5. Check the  first cluster in the source file. (It will be zeroed out).
    dd if=10MBfile iflag=direct bs=<cluster size> count=1 | hexdump -C

Link: http://lkml.kernel.org/r/1470957147-14185-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reported-by: Saar Maoz <saar.maoz@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Eric Ren <zren@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:34 +02:00
Joseph Qi
f1ce664e68 ocfs2/dlm: fix race between convert and migration
commit e6f0c6e6170fec175fe676495f29029aecdf486c upstream.

Commit ac7cf246dfdb ("ocfs2/dlm: fix race between convert and recovery")
checks if lockres master has changed to identify whether new master has
finished recovery or not.  This will introduce a race that right after
old master does umount ( means master will change), a new convert
request comes.

In this case, it will reset lockres state to DLM_RECOVERING and then
retry convert, and then fail with lockres->l_action being set to
OCFS2_AST_INVALID, which will cause inconsistent lock level between
ocfs2 and dlm, and then finally BUG.

Since dlm recovery will clear lock->convert_pending in
dlm_move_lockres_to_recovery_list, we can use it to correctly identify
the race case between convert and recovery.  So fix it.

Fixes: ac7cf246dfdb ("ocfs2/dlm: fix race between convert and recovery")
Link: http://lkml.kernel.org/r/57CE1569.8010704@huawei.com
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:34 +02:00
Herbert Xu
2426cdb3f4 crypto: echainiv - Replace chaining with multiplication
commit 53a5d5ddccf849dbc27a8c1bba0b43c3a45fb792 upstream.

The current implementation uses a global per-cpu array to store
data which are used to derive the next IV.  This is insecure as
the attacker may change the stored data.

This patch removes all traces of chaining and replaces it with
multiplication of the salt and the sequence number.

Fixes: a10f554fa7 ("crypto: echainiv - Add encrypted chain IV...")
Reported-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:34 +02:00
Herbert Xu
1c95a8a481 crypto: skcipher - Fix blkcipher walk OOM crash
commit acdb04d0b36769b3e05990c488dc74d8b7ac8060 upstream.

When we need to allocate a temporary blkcipher_walk_next and it
fails, the code is supposed to take the slow path of processing
the data block by block.  However, due to an unrelated change
we instead end up dereferencing the NULL pointer.

This patch fixes it by moving the unrelated bsize setting out
of the way so that we enter the slow path as inteded.

Fixes: 7607bd8ff0 ("[CRYPTO] blkcipher: Added blkcipher_walk_virt_block")
Reported-by: xiakaixu <xiakaixu@huawei.com>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:34 +02:00
Ard Biesheuvel
9246fd26f9 crypto: arm/aes-ctr - fix NULL dereference in tail processing
commit f82e90b28654804ab72881d577d87c3d5c65e2bc upstream.

The AES-CTR glue code avoids calling into the blkcipher API for the
tail portion of the walk, by comparing the remainder of walk.nbytes
modulo AES_BLOCK_SIZE with the residual nbytes, and jumping straight
into the tail processing block if they are equal. This tail processing
block checks whether nbytes != 0, and does nothing otherwise.

However, in case of an allocation failure in the blkcipher layer, we
may enter this code with walk.nbytes == 0, while nbytes > 0. In this
case, we should not dereference the source and destination pointers,
since they may be NULL. So instead of checking for nbytes != 0, check
for (walk.nbytes % AES_BLOCK_SIZE) != 0, which implies the former in
non-error conditions.

Fixes: 86464859cc ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Reported-by: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:34 +02:00
Ard Biesheuvel
3e2d986d8b crypto: arm64/aes-ctr - fix NULL dereference in tail processing
commit 2db34e78f126c6001d79d3b66ab1abb482dc7caa upstream.

The AES-CTR glue code avoids calling into the blkcipher API for the
tail portion of the walk, by comparing the remainder of walk.nbytes
modulo AES_BLOCK_SIZE with the residual nbytes, and jumping straight
into the tail processing block if they are equal. This tail processing
block checks whether nbytes != 0, and does nothing otherwise.

However, in case of an allocation failure in the blkcipher layer, we
may enter this code with walk.nbytes == 0, while nbytes > 0. In this
case, we should not dereference the source and destination pointers,
since they may be NULL. So instead of checking for nbytes != 0, check
for (walk.nbytes % AES_BLOCK_SIZE) != 0, which implies the former in
non-error conditions.

Fixes: 49788fe2a1 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions")
Reported-by: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:34 +02:00
Eric Dumazet
c867a11289 tcp: properly scale window in tcp_v[46]_reqsk_send_ack()
[ Upstream commit 20a2b49fc538540819a0c552877086548cff8d8d ]

When sending an ack in SYN_RECV state, we must scale the offered
window if wscale option was negotiated and accepted.

Tested:
 Following packetdrill test demonstrates the issue :

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

// Establish a connection.
+0 < S 0:0(0) win 20000 <mss 1000,sackOK,wscale 7, nop, TS val 100 ecr 0>
+0 > S. 0:0(0) ack 1 win 28960 <mss 1460,sackOK, TS val 100 ecr 100, nop, wscale 7>

+0 < . 1:11(10) ack 1 win 156 <nop,nop,TS val 99 ecr 100>
// check that window is properly scaled !
+0 > . 1:1(0) ack 1 win 226 <nop,nop,TS val 200 ecr 100>

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
2016-09-30 10:18:34 +02:00
Eric Dumazet
0f55fa7541 tcp: fix use after free in tcp_xmit_retransmit_queue()
[ Upstream commit bb1fceca22492109be12640d49f5ea5a544c6bb4 ]

When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
tail of the write queue using tcp_add_write_queue_tail()

Then it attempts to copy user data into this fresh skb.

If the copy fails, we undo the work and remove the fresh skb.

Unfortunately, this undo lacks the change done to tp->highest_sack and
we can leave a dangling pointer (to a freed skb)

Later, tcp_xmit_retransmit_queue() can dereference this pointer and
access freed memory. For regular kernels where memory is not unmapped,
this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
returning garbage instead of tp->snd_nxt, but with various debug
features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.

This bug was found by Marco Grassi thanks to syzkaller.

Fixes: 6859d49475 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
Reported-by: Marco Grassi <marco.gra@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
2016-09-30 10:18:34 +02:00
Artem Germanov
98418550e1 tcp: cwnd does not increase in TCP YeAH
[ Upstream commit db7196a0d0984b933ccf2cd6a60e26abf466e8a3 ]

Commit 76174004a0
(tcp: do not slow start when cwnd equals ssthresh )
introduced regression in TCP YeAH. Using 100ms delay 1% loss virtual
ethernet link kernel 4.2 shows bandwidth ~500KB/s for single TCP
connection and kernel 4.3 and above (including 4.8-rc4) shows bandwidth
~100KB/s.
   That is caused by stalled cwnd when cwnd equals ssthresh. This patch
fixes it by proper increasing cwnd in this case.

Signed-off-by: Artem Germanov <agermanov@anchorfree.com>
Acked-by: Dmitry Adamushko <d.adamushko@anchorfree.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Holger Hoffstätte <holger@applied-asynchrony.com>
2016-09-30 10:18:34 +02:00
Dave Jones
ea7dd213c1 ipv6: release dst in ping_v6_sendmsg
[ Upstream commit 03c2778a938aaba0893f6d6cdc29511d91a79848 ]

Neither the failure or success paths of ping_v6_sendmsg release
the dst it acquires.  This leads to a flood of warnings from
"net/core/dst.c:288 dst_release" on older kernels that
don't have 8bf4ada2e2 backported.

That patch optimistically hoped this had been fixed post 3.10, but
it seems at least one case wasn't, where I've seen this triggered
a lot from machines doing unprivileged icmp sockets.

Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
2016-09-30 10:18:34 +02:00
David Forster
6b8076b8a7 ipv4: panic in leaf_walk_rcu due to stale node pointer
[ Upstream commit 94d9f1c5906b20053efe375b6d66610bca4b8b64 ]

Panic occurs when issuing "cat /proc/net/route" whilst
populating FIB with > 1M routes.

Use of cached node pointer in fib_route_get_idx is unsafe.

 BUG: unable to handle kernel paging request at ffffc90001630024
 IP: [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0
 PGD 11b08d067 PUD 11b08e067 PMD dac4b067 PTE 0
 Oops: 0000 [#1] SMP
 Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscac
 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep virti
 acpi_cpufreq button parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd
tio_ring virtio floppy uhci_hcd ehci_hcd usbcore usb_common libata scsi_mod
 CPU: 1 PID: 785 Comm: cat Not tainted 4.2.0-rc8+ #4
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 task: ffff8800da1c0bc0 ti: ffff88011a05c000 task.ti: ffff88011a05c000
 RIP: 0010:[<ffffffff814cf6a0>]  [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0
 RSP: 0018:ffff88011a05fda0  EFLAGS: 00010202
 RAX: ffff8800d8a40c00 RBX: ffff8800da4af940 RCX: ffff88011a05ff20
 RDX: ffffc90001630020 RSI: 0000000001013531 RDI: ffff8800da4af950
 RBP: 0000000000000000 R08: ffff8800da1f9a00 R09: 0000000000000000
 R10: ffff8800db45b7e4 R11: 0000000000000246 R12: ffff8800da4af950
 R13: ffff8800d97a74c0 R14: 0000000000000000 R15: ffff8800d97a7480
 FS:  00007fd3970e0700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffffc90001630024 CR3: 000000011a7e4000 CR4: 00000000000006e0
 Stack:
  ffffffff814d00d3 0000000000000000 ffff88011a05ff20 ffff8800da1f9a00
  ffffffff811dd8b9 0000000000000800 0000000000020000 00007fd396f35000
  ffffffff811f8714 0000000000003431 ffffffff8138dce0 0000000000000f80
 Call Trace:
  [<ffffffff814d00d3>] ? fib_route_seq_start+0x93/0xc0
  [<ffffffff811dd8b9>] ? seq_read+0x149/0x380
  [<ffffffff811f8714>] ? fsnotify+0x3b4/0x500
  [<ffffffff8138dce0>] ? process_echoes+0x70/0x70
  [<ffffffff8121cfa7>] ? proc_reg_read+0x47/0x70
  [<ffffffff811bb823>] ? __vfs_read+0x23/0xd0
  [<ffffffff811bbd42>] ? rw_verify_area+0x52/0xf0
  [<ffffffff811bbe61>] ? vfs_read+0x81/0x120
  [<ffffffff811bcbc2>] ? SyS_read+0x42/0xa0
  [<ffffffff81549ab2>] ? entry_SYSCALL_64_fastpath+0x16/0x75
 Code: 48 85 c0 75 d8 f3 c3 31 c0 c3 f3 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00
a 04 89 f0 33 02 44 89 c9 48 d3 e8 0f b6 4a 05 49 89
 RIP  [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0
  RSP <ffff88011a05fda0>
 CR2: ffffc90001630024

Signed-off-by: Dave Forster <dforster@brocade.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
2016-09-30 10:18:34 +02:00
Jeff Mahoney
daef25aa90 reiserfs: fix "new_insert_key may be used uninitialized ..."
commit 0a11b9aae49adf1f952427ef1a1d9e793dd6ffb6 upstream.

new_insert_key only makes any sense when it's associated with a
new_insert_ptr, which is initialized to NULL and changed to a
buffer_head when we also initialize new_insert_key.  We can key off of
that to avoid the uninitialized warning.

Link: http://lkml.kernel.org/r/5eca5ffb-2155-8df2-b4a2-f162f105efed@suse.com
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:34 +02:00
Arnd Bergmann
29bd03596e Fix build warning in kernel/cpuset.c
>           2 ../kernel/cpuset.c:2101:11: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
>           1 ../kernel/cpuset.c:2101:2: warning: initialization from incompatible pointer type
>           1 ../kernel/cpuset.c:2101:2: warning: (near initialization for 'cpuset_cgrp_subsys.fork')

This got introduced by 06ec7a1d76 ("cpuset: make sure new tasks
conform to the current config of the cpuset"). In the upstream
kernel, the function prototype was changed as of b53202e63089
("cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends").

That patch is not suitable for stable kernels, and fortunately
the warning seems harmless as the prototypes only differ in the
second argument that is unused. Adding that argument gets rid
of the warning:

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:33 +02:00
Michal Nazarewicz
df12772578 include/linux/kernel.h: change abs() macro so it uses consistent return type
commit 8f57e4d930d48217268315898212518d4d3e0773 upstream.

Rewrite abs() so that its return type does not depend on the
architecture and no unexpected type conversion happen inside of it.  The
only conversion is from unsigned to signed type.  char is left as a
return type but treated as a signed type regradless of it's actual
signedness.

With the old version, int arguments were promoted to long and depending
on architecture a long argument might result in s64 or long return type
(which may or may not be the same).

This came after some back and forth with Nicolas.  The current macro has
different return type (for the same input type) depending on
architecture which might be midly iritating.

An alternative version would promote to int like so:

	#define abs(x)	__abs_choose_expr(x, long long,			\
			__abs_choose_expr(x, long,			\
			__builtin_choose_expr(				\
				sizeof(x) <= sizeof(int),		\
				({ int __x = (x); __x<0?-__x:__x; }),	\
				((void)0))))

I have no preference but imagine Linus might.  :] Nicolas argument against
is that promoting to int causes iconsistent behaviour:

	int main(void) {
		unsigned short a = 0, b = 1, c = a - b;
		unsigned short d = abs(a - b);
		unsigned short e = abs(c);
		printf("%u %u\n", d, e);  // prints: 1 65535
	}

Then again, no sane person expects consistent behaviour from C integer
arithmetic.  ;)

Note:

  __builtin_types_compatible_p(unsigned char, char) is always false, and
  __builtin_types_compatible_p(signed char, char) is also always false.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:33 +02:00
Catalin Marinas
67cd3bda54 FROMLIST: arm64: Enable CONFIG_ARM64_SW_TTBR0_PAN
This patch adds the Kconfig option to enable support for TTBR0 PAN
emulation. The option is default off because of a slight performance hit
when enabled, caused by the additional TTBR0_EL1 switching during user
access operations or exception entry/exit code.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: Id00a8ad4169d6eb6176c468d953436eb4ae887ae
(cherry picked from commit 6a2d7bad43474c48b68394d455b84a16b7d7dc3f)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Catalin Marinas
4dbc88bd2b FROMLIST: arm64: xen: Enable user access before a privcmd hvc call
Privcmd calls are issued by the userspace. The kernel needs to enable
access to TTBR0_EL1 as the hypervisor would issue stage 1 translations
to user memory via AT instructions. Since AT instructions are not
affected by the PAN bit (ARMv8.1), we only need the explicit
uaccess_enable/disable if the TTBR0 PAN option is enabled.

Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: I927f14076ba94c83e609b19f46dd373287e11fc4
(cherry picked from commit 8cc1f33d2c9f206b6505bedba41aed2b33c203c0)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Catalin Marinas
5dc2b7c7bb FROMLIST: arm64: Handle faults caused by inadvertent user access with PAN enabled
When TTBR0_EL1 is set to the reserved page, an erroneous kernel access
to user space would generate a translation fault. This patch adds the
checks for the software-set PSR_PAN_BIT to emulate a permission fault
and report it accordingly.

This patch also updates the description of the synchronous external
aborts on translation table walks.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: I623113fc8bf6d5f023aeec7a0640b62a25ef8420
(cherry picked from commit be3db9340c8011d22f06715339b66bcbbd4893bd)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Catalin Marinas
5775ca3482 FROMLIST: arm64: Disable TTBR0_EL1 during normal kernel execution
When the TTBR0 PAN feature is enabled, the kernel entry points need to
disable access to TTBR0_EL1. The PAN status of the interrupted context
is stored as part of the saved pstate, reusing the PSR_PAN_BIT (22).
Restoring access to TTBR0_PAN is done on exception return if returning
to user or returning to a context where PAN was disabled.

Context switching via switch_mm() must defer the update of TTBR0_EL1
until a return to user or an explicit uaccess_enable() call.

Special care needs to be taken for two cases where TTBR0_EL1 is set
outside the normal kernel context switch operation: EFI run-time
services (via efi_set_pgd) and CPU suspend (via cpu_(un)install_idmap).
Code has been added to avoid deferred TTBR0_EL1 switching as in
switch_mm() and restore the reserved TTBR0_EL1 when uninstalling the
special TTBR0_EL1.

This patch also removes a stale comment on the switch_mm() function.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: Id1198cf1cde022fad10a94f95d698fae91d742aa
(cherry picked from commit d26cfd64c973b31f73091c882e07350e14fdd6c9)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Catalin Marinas
1911d36b27 FROMLIST: arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1
This patch adds the uaccess macros/functions to disable access to user
space by setting TTBR0_EL1 to a reserved zeroed page. Since the value
written to TTBR0_EL1 must be a physical address, for simplicity this
patch introduces a reserved_ttbr0 page at a constant offset from
swapper_pg_dir. The uaccess_disable code uses the ttbr1_el1 value
adjusted by the reserved_ttbr0 offset.

Enabling access to user is done by restoring TTBR0_EL1 with the value
from the struct thread_info ttbr0 variable. Interrupts must be disabled
during the uaccess_ttbr0_enable code to ensure the atomicity of the
thread_info.ttbr0 read and TTBR0_EL1 write. This patch also moves the
get_thread_info asm macro from entry.S to assembler.h for reuse in the
uaccess_ttbr0_* macros.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: Idf09a870b8612dce23215bce90d88781f0c0c3aa
(cherry picked from commit 940d37234182d2675ab8ab46084840212d735018)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Catalin Marinas
3b66929169 FROMLIST: arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro
This patch takes the errata workaround code out of cpu_do_switch_mm into
a dedicated post_ttbr0_update_workaround macro which will be reused in a
subsequent patch.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: I69f94e4c41046bd52ca9340b72d97bfcf955b586
(cherry picked from commit 4398e6a1644373a4c2f535f4153c8378d0914630)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Catalin Marinas
23368b642d FROMLIST: arm64: Factor out PAN enabling/disabling into separate uaccess_* macros
This patch moves the directly coded alternatives for turning PAN on/off
into separate uaccess_{enable,disable} macros or functions. The asm
macros take a few arguments which will be used in subsequent patches.

Note that any (unlikely) access that the compiler might generate between
uaccess_enable() and uaccess_disable(), other than those explicitly
specified by the user access code, will not be protected by PAN.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: Ic3fddd706400c8798f57456c56361d84d234f6ef
(cherry picked from commit a4820644c627b82cbc865f2425bb788c94743b16)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Laura Abbott
d907eb2529 UPSTREAM: arm64: Handle el1 synchronous instruction aborts cleanly
Executing from a non-executable area gives an ugly message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at ffff0000084c0e08
lkdtm: attempting bad execution at ffff000008880700
Bad mode in Synchronous Abort handler detected on CPU2, code 0x8400000e -- IABT (current EL)
CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
Hardware name: linux,dummy-virt (DT)
task: ffff800077e35780 ti: ffff800077970000 task.ti: ffff800077970000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

The 'IABT (current EL)' indicates the error but it's a bit cryptic
without knowledge of the ARM ARM. There is also no indication of the
specific address which triggered the fault. The increase in kernel
page permissions makes hitting this case more likely as well.
Handling the case in the vectors gives a much more familiar looking
error message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at ffff0000084c0840
lkdtm: attempting bad execution at ffff000008880680
Unable to handle kernel paging request at virtual address ffff000008880680
pgd = ffff8000089b2000
[ffff000008880680] *pgd=00000000489b4003, *pud=0000000048904003, *pmd=0000000000000000
Internal error: Oops: 8400000e [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
Hardware name: linux,dummy-virt (DT)
task: ffff800077f9f080 ti: ffff800008a1c000 task.ti: ffff800008a1c000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: Ifba74589ba2cf05b28335d4fd3e3140ef73668db
(cherry picked from commit 9adeb8e72dbfe976709df01e259ed556ee60e779)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
James Morse
4caec81662 BACKPORT: arm64: kernel: Save and restore UAO and addr_limit on exception entry
If we take an exception while at EL1, the exception handler inherits
the original context's addr_limit and PSTATE.UAO values. To be consistent
always reset addr_limit and PSTATE.UAO on (re-)entry to EL1. This
prevents accidental re-use of the original context's addr_limit.

Based on a similar patch for arm from Russell King.

Cc: <stable@vger.kernel.org> # 4.6-
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

Change-Id: Iab453201c6e08bc6e22500b7c5570dd0fe2d1b74
(cherry picked from commit e19a6ee2460bdd0d0055a6029383422773f9999a)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Andre Przywara
76440ecb4d UPSTREAM: arm64: include alternative handling in dcache_by_line_op
The newly introduced dcache_by_line_op macro is used at least in
one occassion at the moment to issue a "dc cvau" instruction,
which is affected by ARM errata 819472, 826319, 827319 and 824069.
Change the macro to allow for alternative patching in there to
protect affected Cortex-A53 cores.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
[catalin.marinas@arm.com: indentation fixups]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: I450594dc311b09b6b832b707a9abb357608cc6e4
(cherry picked from commit 823066d9edcdfe4cedb06216c2b1f91efaf68a87)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00
Andre Przywara
405893f77e UPSTREAM: arm64: fix "dc cvau" cache operation on errata-affected core
The ARM errata 819472, 826319, 827319 and 824069 for affected
Cortex-A53 cores demand to promote "dc cvau" instructions to
"dc civac" as well.
Attribute the usage of the instruction in __flush_cache_user_range
to also be covered by our alternative patching efforts.
For that we introduce an assembly macro which both deals with
alternatives while still tagging the instructions as USER.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Change-Id: If5e7933ba32331b2aa28fc5d9e019649452f0f6c
(cherry picked from commit 290622efc76ece22ef76a30bf117755891ab27f6)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-29 10:52:56 -07:00