Commit graph

549327 commits

Author SHA1 Message Date
David S. Miller
c1bf5fe031 Merge branch 'bpf-unprivileged'
Alexei Starovoitov says:

====================
bpf: unprivileged

v1-v2:
- this set logically depends on cb patch
  "bpf: fix cb access in socket filter programs":
  http://patchwork.ozlabs.org/patch/527391/
  which is must have to allow unprivileged programs.
  Thanks Daniel for finding that issue.
- refactored sysctl to be similar to 'modules_disabled'
- dropped bpf_trace_printk
- split tests into separate patch and added more tests
  based on discussion

v1 cover letter:
I think it is time to liberate eBPF from CAP_SYS_ADMIN.
As was discussed when eBPF was first introduced two years ago
the only piece missing in eBPF verifier is 'pointer leak detection'
to make it available to non-root users.
Patch 1 adds this pointer analysis.
The eBPF programs, obviously, need to see and operate on kernel addresses,
but with these extra checks they won't be able to pass these addresses
to user space.
Patch 2 adds accounting of kernel memory used by programs and maps.
It changes behavoir for existing root users, but I think it needs
to be done consistently for both root and non-root, since today
programs and maps are only limited by number of open FDs (RLIMIT_NOFILE).
Patch 2 accounts program's and map's kernel memory as RLIMIT_MEMLOCK.

Unprivileged eBPF is only meaningful for 'socket filter'-like programs.
eBPF programs for tracing and TC classifiers/actions will stay root only.

In parallel the bpf fuzzing effort is ongoing and so far
we've found only one verifier bug and that was already fixed.
The 'constant blinding' pass also being worked on.
It will obfuscate constant-like values that are part of eBPF ISA
to make jit spraying attacks even harder.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 19:13:41 -07:00
Alexei Starovoitov
bf5088773f bpf: add unprivileged bpf tests
Add new tests samples/bpf/test_verifier:

unpriv: return pointer
  checks that pointer cannot be returned from the eBPF program

unpriv: add const to pointer
unpriv: add pointer to pointer
unpriv: neg pointer
  checks that pointer arithmetic is disallowed

unpriv: cmp pointer with const
unpriv: cmp pointer with pointer
  checks that comparison of pointers is disallowed
  Only one case allowed 'void *value = bpf_map_lookup_elem(..); if (value == 0) ...'

unpriv: check that printk is disallowed
  since bpf_trace_printk is not available to unprivileged

unpriv: pass pointer to helper function
  checks that pointers cannot be passed to functions that expect integers
  If function expects a pointer the verifier allows only that type of pointer.
  Like 1st argument of bpf_map_lookup_elem() must be pointer to map.
  (applies to non-root as well)

unpriv: indirectly pass pointer on stack to helper function
  checks that pointer stored into stack cannot be used as part of key
  passed into bpf_map_lookup_elem()

unpriv: mangle pointer on stack 1
unpriv: mangle pointer on stack 2
  checks that writing into stack slot that already contains a pointer
  is disallowed

unpriv: read pointer from stack in small chunks
  checks that < 8 byte read from stack slot that contains a pointer is
  disallowed

unpriv: write pointer into ctx
  checks that storing pointers into skb->fields is disallowed

unpriv: write pointer into map elem value
  checks that storing pointers into element values is disallowed
  For example:
  int bpf_prog(struct __sk_buff *skb)
  {
    u32 key = 0;
    u64 *value = bpf_map_lookup_elem(&map, &key);
    if (value)
       *value = (u64) skb;
  }
  will be rejected.

unpriv: partial copy of pointer
  checks that doing 32-bit register mov from register containing
  a pointer is disallowed

unpriv: pass pointer to tail_call
  checks that passing pointer as an index into bpf_tail_call
  is disallowed

unpriv: cmp map pointer with zero
  checks that comparing map pointer with constant is disallowed

unpriv: write into frame pointer
  checks that frame pointer is read-only (applies to root too)

unpriv: cmp of frame pointer
  checks that R10 cannot be using in comparison

unpriv: cmp of stack pointer
  checks that Rx = R10 - imm is ok, but comparing Rx is not

unpriv: obfuscate stack pointer
  checks that Rx = R10 - imm is ok, but Rx -= imm is not

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 19:13:37 -07:00
Alexei Starovoitov
aaac3ba95e bpf: charge user for creation of BPF maps and programs
since eBPF programs and maps use kernel memory consider it 'locked' memory
from user accounting point of view and charge it against RLIMIT_MEMLOCK limit.
This limit is typically set to 64Kbytes by distros, so almost all
bpf+tracing programs would need to increase it, since they use maps,
but kernel charges maximum map size upfront.
For example the hash map of 1024 elements will be charged as 64Kbyte.
It's inconvenient for current users and changes current behavior for root,
but probably worth doing to be consistent root vs non-root.

Similar accounting logic is done by mmap of perf_event.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 19:13:36 -07:00
Alexei Starovoitov
1be7f75d16 bpf: enable non-root eBPF programs
In order to let unprivileged users load and execute eBPF programs
teach verifier to prevent pointer leaks.
Verifier will prevent
- any arithmetic on pointers
  (except R10+Imm which is used to compute stack addresses)
- comparison of pointers
  (except if (map_value_ptr == 0) ... )
- passing pointers to helper functions
- indirectly passing pointers in stack to helper functions
- returning pointer from bpf program
- storing pointers into ctx or maps

Spill/fill of pointers into stack is allowed, but mangling
of pointers stored in the stack or reading them byte by byte is not.

Within bpf programs the pointers do exist, since programs need to
be able to access maps, pass skb pointer to LD_ABS insns, etc
but programs cannot pass such pointer values to the outside
or obfuscate them.

Only allow BPF_PROG_TYPE_SOCKET_FILTER unprivileged programs,
so that socket filters (tcpdump), af_packet (quic acceleration)
and future kcm can use it.
tracing and tc cls/act program types still require root permissions,
since tracing actually needs to be able to see all kernel pointers
and tc is for root only.

For example, the following unprivileged socket filter program is allowed:
int bpf_prog1(struct __sk_buff *skb)
{
  u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
  u64 *value = bpf_map_lookup_elem(&my_map, &index);

  if (value)
	*value += skb->len;
  return 0;
}

but the following program is not:
int bpf_prog1(struct __sk_buff *skb)
{
  u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
  u64 *value = bpf_map_lookup_elem(&my_map, &index);

  if (value)
	*value += (u64) skb;
  return 0;
}
since it would leak the kernel address into the map.

Unprivileged socket filter bpf programs have access to the
following helper functions:
- map lookup/update/delete (but they cannot store kernel pointers into them)
- get_random (it's already exposed to unprivileged user space)
- get_smp_processor_id
- tail_call into another socket filter program
- ktime_get_ns

The feature is controlled by sysctl kernel.unprivileged_bpf_disabled.
This toggle defaults to off (0), but can be set true (1).  Once true,
bpf programs and maps cannot be accessed from unprivileged process,
and the toggle cannot be set back to false.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 19:13:35 -07:00
Ken-ichirou MATSUZAWA
914eebf2f4 netfilter: nfnetlink_log: autoload nf_conntrack_netlink module NFQA_CFG_F_CONNTRACK config flag
This patch enables to load nf_conntrack_netlink module if
NFULNL_CFG_F_CONNTRACK config flag is specified.

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-12 21:44:12 +02:00
Manjeet Pawar
c9692657c0 arm64: Fix MINSIGSTKSZ and SIGSTKSZ
MINSIGSTKSZ and SIGSTKSZ for ARM64 are not correctly set in latest kernel.
This patch fixes this issue.

This issue is reported in LTP (testcase: sigaltstack02.c).
Testcase failed when sigaltstack() called with stack size "MINSIGSTKSZ - 1"
Since in Glibc-2.22, MINSIGSTKSZ is set to 5120 but in kernel
it is set to 2048 so testcase gets failed.

Testcase Output:
sigaltstack02 1  TPASS  :  stgaltstack() fails, Invalid Flag value,errno:22
sigaltstack02 2  TFAIL  :  sigaltstack() returned 0, expected -1,errno:12

Reported Issue in Glibc Bugzilla:
Bugfix in Glibc-2.22: [Bug 16850]
https://sourceware.org/bugzilla/show_bug.cgi?id=16850

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
Signed-off-by: Manjeet Pawar <manjeet.p@samsung.com>
Signed-off-by: Rohit Thapliyal <r.thapliyal@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-12 17:40:12 +01:00
Will Deacon
b6dd8e0719 arm64: errata: use KBUILD_CFLAGS_MODULE for erratum #843419
Commit df057cc7b4 ("arm64: errata: add module build workaround for
erratum #843419") sets CFLAGS_MODULE to ensure that the large memory
model is used by the compiler when building kernel modules.

However, CFLAGS_MODULE is an environment variable and intended to be
overridden on the command line, which appears to be the case with the
Ubuntu kernel packaging system, so use KBUILD_CFLAGS_MODULE instead.

Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: df057cc7b4 ("arm64: errata: add module build workaround for erratum #843419")
Reported-by: Dann Frazier <dann.frazier@canonical.com>
Tested-by: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-12 17:40:12 +01:00
Chuck Lever
3be7f32878 svcrdma: Fix NFS server crash triggered by 1MB NFS WRITE
Now that the NFS server advertises a maximum payload size of 1MB
for RPC/RDMA again, it crashes in svc_process_common() when NFS
client sends a 1MB NFS WRITE on an NFS/RDMA mount.

The server has set up a 259 element array of struct page pointers
in rq_pages[] for each incoming request. The last element of the
array is NULL.

When an incoming request has been completely received,
rdma_read_complete() attempts to set the starting page of the
incoming page vector:

  rqstp->rq_arg.pages = &rqstp->rq_pages[head->hdr_count];

and the page to use for the reply:

  rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];

But the value of page_no has already accounted for head->hdr_count.
Thus rq_respages now points past the end of the incoming pages.

For NFS WRITE operations smaller than the maximum, this is harmless.
But when the NFS WRITE operation is as large as the server's max
payload size, rq_respages now points at the last entry in rq_pages,
which is NULL.

Fixes: cc9a903d91 ('svcrdma: Change maximum server payload . . .')
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=270
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-10-12 11:55:43 -04:00
Arnd Bergmann
c932245811 netfilter: bridge: avoid unused label warning
With the ARM mini2440_defconfig, the bridge netfilter code gets
built with both CONFIG_NF_DEFRAG_IPV4 and CONFIG_NF_DEFRAG_IPV6
disabled, which leads to a harmless gcc warning:

net/bridge/br_netfilter_hooks.c: In function 'br_nf_dev_queue_xmit':
net/bridge/br_netfilter_hooks.c:792:2: warning: label 'drop' defined but not used [-Wunused-label]

This gets rid of the warning by cleaning up the code to avoid
the respective #ifdefs causing this problem, and replacing them
with if(IS_ENABLED()) checks. I have verified that the resulting
object code is unchanged, and an additional advantage is that
we now get compile coverage of the unused functions in more
configurations.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: dd302b59bd ("netfilter: bridge: don't leak skb in error paths")
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-12 17:48:36 +02:00
Pablo Neira Ayuso
d53195c259 Merge tag 'ipvs4-for-v4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next
Simon Horman says:

====================
Fourth Round of IPVS Updates for v4.4

please consider these build warning cleanups from David Ahern and myself.
They resolve some minor side effects of Eric Biederman' heroic work to
cleanup IPVS which you recently pulled: its queued up for v4.4 so no need
to worry about earlier kernel versions.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-12 17:38:54 +02:00
Pablo Neira Ayuso
4302f5eeb9 nfnetlink_cttimeout: add rcu_barrier() on module removal
Make sure kfree_rcu() released objects before leaving the module removal
exit path.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-12 17:04:41 +02:00
Pablo Neira Ayuso
ae2d708ed8 netfilter: conntrack: fix crash on timeout object removal
The object and module refcounts are updated for each conntrack template,
however, if we delete the iptables rules and we flush the timeout
database, we may end up with invalid references to timeout object that
are just gone.

Resolve this problem by setting the timeout reference to NULL when the
custom timeout entry is removed from our base. This patch requires some
RCU trickery to ensure safe pointer handling.

This handling is similar to what we already do with conntrack helpers,
the idea is to avoid bumping the timeout object reference counter from
the packet path to avoid the cost of atomic ops.

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-12 17:04:34 +02:00
Pablo Neira Ayuso
403d89ad9c netfilter: xt_CT: don't put back reference to timeout policy object
On success, this shouldn't put back the timeout policy object, otherwise
we may have module refcount overflow and we allow deletion of timeout
that are still in use.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-12 16:54:45 +02:00
Arnd Bergmann
0fa28877b2 net: HNS: fix MDIO dependencies
The newly introduced HNS_MDIO Kconfig symbol selects 'MDIO', but
that is the wrong symbol as the code used by this driver is
provided by PHYLIB rather than the MDIO driver. Also, there is
no need to make this driver user selectable, because it is already
selected by all drivers that need it.

This changes the Kconfig file to select the correct library, and
to make the option silent.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 5b904d3940 ("net: add Hisilicon Network Subsystem MDIO support")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:41:15 -07:00
Daniel Pieczko
c577e59ed7 sfc: fully reset if MC_REBOOT event received without warm_boot_count increment
On EF10, MC_CMD_VPORT_RECONFIGURE can cause a CODE_MC_REBOOT event
to be sent to a function without incrementing the (adapter-wide)
warm_boot_count.  In this case, the reboot is not detected by the
loop on efx_mcdi_poll_reboot(), so prepare for recovery from an MC
reboot anyway.  When this codepath is run, the MC has always just
rebooted, so this recovery is valid.

The loop on efx_mcdi_poll_reboot() is still required for other MC
reboot cases, so that actions in response to an MC reboot are
performed, such as clearing locally calculated statistics.
Siena NICs are unaffected by this change as the above scenario
does not apply.

Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:35:25 -07:00
David S. Miller
d5404915a9 Merge branch 'switchdev_ageing_time'
Scott Feldman says:

====================
switchdev: push bridge ageing_time attribute down

Push bridge-level attributes down to switchdev drivers.  This patchset
adds the infrastructure and then pushes, as an example, ageing_time attribute
down from bridge to switchdev (rocker) driver.  Add some range-checking
for ageing_time.

RTNETLINK answers: Numerical result out of range

Up until now, switchdev attrs where port-level attrs, so the netdev used in
switchdev_attr_set() would be a switch port or bond of switch ports.  With
bridge-level attrs, the netdev passed to switchdev_attr_set() is the bridge
netdev.  The same recusive algo is used to visit the leaves of the stacked
drivers to set the attr, it's just in this case we start one layer higher in
the stack.  One note is not all ports in the bridge may support setting a
bridge-level attribute, so rather than failing the entire set, we'll skip over
those ports returning -EOPNOTSUPP.

v2->v3: Per Jiri review: push only ageing_time attr down at this time, and
don't pass raw bridge IFLA_BR_* values; rather use new switchdev attr ID for
ageing_time.

v1->v2: rebase w/ net-next
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:20:28 -07:00
Scott Feldman
d0cf57f9dd rocker: handle setting bridge ageing_time
The FDB cleanup timer will get rescheduled to re-evaluate FDB entries
based on new ageing_time.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:20:22 -07:00
Scott Feldman
c62987bbd8 bridge: push bridge setting ageing_time down to switchdev
Use SWITCHDEV_F_SKIP_EOPNOTSUPP to skip over ports in bridge that don't
support setting ageing_time (or setting bridge attrs in general).

If push fails, don't update ageing_time in bridge and return err to user.

If push succeeds, update ageing_time in bridge and run gc_timer now to
recalabrate when to run gc_timer next, based on new ageing_time.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:20:20 -07:00
Scott Feldman
464314ea6c switchdev: skip over ports returning -EOPNOTSUPP when recursing ports
This allows us to recurse over all the ports, skipping over unsupporting
ports.  Without the change, the recursion would stop at first unsupported
port.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:20:20 -07:00
Scott Feldman
f55ac58ae6 switchdev: add bridge ageing_time attribute
Setting the stage to push bridge-level attributes down to port driver so
hardware can be programmed accordingly.  Bridge-level attribute example is
ageing_time.  This is a per-bridge attribute, not a per-bridge-port attr.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:20:19 -07:00
Richard Sailer
7533ce3055 tcp: change type of alive from int to bool
The alive parameter of tcp_orphan_retries, indicates
whether the connection is assumed alive or not.
In the function and all places calling it is used as a boolean value.

Therefore this changes the type of alive to bool in the function
definition and all calling locations.

Since tcp_orphan_tries is a tcp_timer.c local function no change in
any other file or header is necessary.

Signed-off-by: Richard Sailer <richard@weltraumpflege.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:15:03 -07:00
Roopa Prabhu
3741873b4f bridge: allow adding of fdb entries pointing to the bridge device
This patch enables adding of fdb entries pointing to the bridge device.
This can be used to propagate mac address of vlan interfaces
configured on top of the vlan filtering bridge.

Before:
$bridge fdb add 44:38:39:00:27:9f dev bridge
RTNETLINK answers: Invalid argument

After:
$bridge fdb add 44:38:39:00:27:9f dev bridge

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12 05:11:58 -07:00
Dave Airlie
7b98040a77 Merge branch 'linux-4.3' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
Nothing too crazy here, a couple of regression fixes + runpm/fbcon
race fix.

* 'linux-4.3' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau/bios: fix OF loading
  drm/nouveau/fbcon: take runpm reference when userspace has an open fd
  drm/nouveau/nouveau: Disable AGP for SiS 761
  drm/nouveau/display: allow up to 16k width/height for fermi+
  drm/nouveau/bios: translate devinit pri/sec i2c bus to internal identifiers
2015-10-12 13:59:04 +10:00
Ilia Mirkin
25d295882a drm/nouveau/bios: fix OF loading
Currently OF bios load fails for a few reasons:
 - checksum failure
 - bios size too small
 - no PCIR header
 - bios length not a multiple of 4

In this change, we resolve all of the above by ignoring any checksum
failures (since OF VBIOS tends not to have a checksum), and faking the
PCIR data when loading from OF.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-10-12 13:54:56 +10:00
Ben Skeggs
f231976c2e drm/nouveau/fbcon: take runpm reference when userspace has an open fd
We need to do this in order to prevent accesses to the device while it's
powered down.  Userspace may have an mmap of the fb, and there's no good
way (that I know of) to prevent it from touching the device otherwise.

This fixes some nasty races between runpm and plymouth on some systems,
which result in the GPU getting very upset and hanging the boot.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
2015-10-12 13:54:40 +10:00
Ondrej Zary
953132b56a drm/nouveau/nouveau: Disable AGP for SiS 761
SiS 761 chipset does not support AGP cards but has AGP capability (for
the onboard video). At least PC Chips A31G board using this chipset has
an AGP-like AGPro slot that's wired to the PCI bus. Enabling AGP will
fail (GPU lockup and software fbcon, X11 hangs).

Add support for matching just the host bridge in nvkm_device_agp_quirks
and add entry for SiS 761 with mode 0 (AGP disabled).

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-10-12 13:48:29 +10:00
Ilia Mirkin
5102ec3e99 drm/nouveau/display: allow up to 16k width/height for fermi+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-10-12 13:48:28 +10:00
Ben Skeggs
2239b76b0b drm/nouveau/bios: translate devinit pri/sec i2c bus to internal identifiers
fdo#92013.

Regression from "i2c: transition pad/ports away from being based on nvkm_object"

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-10-12 13:48:28 +10:00
Linus Torvalds
25cb62b764 Linux 4.3-rc5 2015-10-11 11:09:45 -07:00
Linus Torvalds
9a78f9c3c6 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
 "Fix a long standing state race in finish_task_switch()"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Fix TASK_DEAD race in finish_task_switch()
2015-10-11 10:24:32 -07:00
Linus Torvalds
7cbbab00cb Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Thomas Glexiner:
 "Fix build breakage on powerpc in perf tools"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Fix build break on powerpc due to sample_reg_masks
2015-10-11 10:23:52 -07:00
Linus Torvalds
a145164ba8 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull maintainer email update from Thomas Gleixner:
 "Change Matt Fleming's email address in the maintainers file"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Change Matt Fleming's email address
2015-10-11 10:23:00 -07:00
Linus Torvalds
e3d6e0e701 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "Three trivial commits:

   - Fix a kerneldoc regression

   - Export handle_bad_irq to unbreak a driver in next

   - Add an accessor for the of_node field so refactoring in next does
     not depend on merge ordering"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Add an accessor for the of_node field
  genirq: Fix handle_bad_irq kerneldoc comment
  genirq: Export handle_bad_irq
2015-10-11 10:16:59 -07:00
Linus Torvalds
5a433f7a6b SCSI fixes on 20151010
This is a set of three bug fixes, two of which are regressions from recent
 updates (the 3ware one from 4.1 and the device handler fixes from 4.2).
 
 Signed-off-by: James Bottomley <JBottomley@Odin.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEbBAABAgAGBQJWGgjcAAoJEDeqqVYsXL0MDjIH+N1poa/RB6jU163q1bMnJlp6
 6ygEqLtrJH7FNjadiRuYjsZaSWfuabbEyj0zV8t6S7Z2wWQmXG7Qfkic3W0CZa9u
 qCa3aPmRB06Al9nwmEDlgilBfFP5hWkWldiPF7uEBqcCsBtzT3cxxUnyoCS1fy28
 USMHSQ4fnQDFdaTafXqDCRrMjXIJLeRY1Gg7YuiG7l6h4YK5qPC+0cCpiIeDyDyI
 WjTr/SbFzIyDg0r/SNwjZqbhY2+s4a2/4GcAmjpBMWvg2GnXDGt6vxibRvrwZfHf
 PtUvsVS7eJhAOAKs67KOJavP6kvheXkj/QTZWq5Y/DqDrd/14qIgjOPpIHYyig==
 =20ji
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of three bug fixes, two of which are regressions from
  recent updates (the 3ware one from 4.1 and the device handler fixes
  from 4.2)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  3w-9xxx: don't unmap bounce buffered commands
  scsi_dh: Use the correct module name when loading device handler
  libiscsi: Fix iscsi_check_transport_timeouts possible infinite loop
2015-10-11 10:02:30 -07:00
Linus Torvalds
f24fe98df8 One bug fix for raid1/raid10.
Very careless bug earler in 4.3-rc, now fixed :-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWGYrdAAoJEDnsnt1WYoG56vcP/03E7DoycDQ7uH46i2szf3M/
 isy0dk+2S9D87iBz/xf4RXvAY7FTHy9/vIG/o6UKYkhSRzm6T6xCbrwd+duS2TSc
 yQM33BJ1VdM+trGj5ywrdF8guwRMjW4NFPnez16moVSVZDbNK2pUdZiw8kGSi39n
 hpjftyefojISG6rbDGBGK2JiVTNOqDjMH2Ny8MhX2J5ryQQOsd6+9ojgri3nfTbP
 6PmP08QyVxdYA3ZUlTZaKUNZ8AQHgoydhiEyGbdCewcE8pYaeEUqvcBi4DrDOil8
 9BGHnf755Wl3k26P8uBsvri9zp+SZl0LEZLhSpyFpRmCTaFGn0pnSKJ0intnRTPc
 JZ7gTY6q1Bt5DXToZw7hHVWgxjos8aweS2JLzSJloB6FFlDCckypkvSR4GQL7R9N
 jIYntfwaQaJIgUSzVo/Aw6vjBTWbqyLHf3DP8ImsPSe/z0gjtRiyPkjoZgthuYp2
 ErLoVe/JgKstR0gmobbdRhShIfXMFVAIwasXOXfq4Ye4LRwvfAwP2UDHrC25mb0O
 IJi6fMqf3bWxmLIzFUcTe8Z2nzuKolAgP2rcd6kb0bbLxE4Y5xtzCV8fgnhk2obw
 HvP4zZnacLKx8Nvet+YGUKjVJU3wx4RTgyGLU4WqC13fwZREeJLWwxgK859ZJ8yl
 k+TQud5fKgfkX20+eTA0
 =qdcM
 -----END PGP SIGNATURE-----

Merge tag 'md/4.3-rc4-fix' of git://neil.brown.name/md

Pull md bugfix from Neil Brown:
 "One bug fix for raid1/raid10.

  Very careless bug earler in 4.3-rc, now fixed :-)"

* tag 'md/4.3-rc4-fix' of git://neil.brown.name/md:
  crash in md-raid1 and md-raid10 due to incorrect list manipulation
2015-10-11 09:35:51 -07:00
Eric Dumazet
6bcfd7f8c2 tcp: fix RFS vs lockless listeners
Before recent TCP listener patches, we were updating listener
sk->sk_rxhash before the cloning of master socket.

children sk_rxhash was therefore correct after the normal 3WHS.

But with lockless listener, we no longer dirty/change listener sk_rxhash
as it would be racy.

We need to correctly update the child sk_rxhash, otherwise first data
packet wont hit correct cpu if RFS is used.

Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:33:15 -07:00
Hannes Frederic Sowa
9ef2e965e5 ipv6: drop frames with attached skb->sk in forwarding
This is a clone of commit 2ab957492d ("ip_forward: Drop frames with
attached skb->sk") for ipv6.

This commit has exactly the same reasons as the above mentioned commit,
namely to prevent panics during netfilter reload or a misconfigured stack.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:30:44 -07:00
Hannes Frederic Sowa
d9e4ce65b2 ipv6: gre: setup default multicast routes over PtP links
GRE point-to-point interfaces should also support ipv6 multicast. Setting
up default multicast routes on interface creation was forgotten. Add it.

Bugzilla: <https://bugzilla.kernel.org/show_bug.cgi?id=103231>
Cc: Julien Muchembled <jm@jmuchemb.eu>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Nicolas Dumazet <ndumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:30:43 -07:00
David S. Miller
5010ea59e1 Merge branch 'dsa-next'
Vivien Didelot says:

====================
net: dsa: push switchdev prepare phase in FDB ops

This patchset pushes the switchdev prepare phase for the FDB add and del
operations down to the DSA drivers. Currently only mv88e6xxx is affected.

Since the dump requires a bit of refactoring in the driver, it'll come in a
future patchset.

Changes in v2:
 * forward declare switchdev structs instead of fixing the dsa.h include.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:28:57 -07:00
Vivien Didelot
8057b3e7a1 net: dsa: use switchdev obj in port_fdb_del
For consistency with the FDB add operation, propagate the
switchdev_obj_port_fdb structure in the DSA drivers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:28:52 -07:00
Vivien Didelot
1f36faf269 net: dsa: push prepare phase in port_fdb_add
Now that the prepare phase is pushed down to the DSA drivers, propagate
it to the port_fdb_add function.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:28:50 -07:00
Vivien Didelot
146a32067b net: dsa: add port_fdb_prepare
Push the prepare phase for FDB operations down to the DSA drivers, with
a new port_fdb_prepare function. Currently only mv88e6xxx is affected.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:28:49 -07:00
Jon Ringle
6e86ac120e net: encx24j600: Fix typos in Kconfig
Signed-off-by: Jon Ringle <jringle@gridpoint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:26:10 -07:00
David S. Miller
7bcfeead48 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-10-08

Here's another set of Bluetooth & 802.15.4 patches for the 4.4 kernel.

802.15.4:
 - Many improvements & fixes to the mrf24j40 driver
 - Fixes and cleanups to nl802154, mac802154 & ieee802154 code

Bluetooth:
 - New chipset support in btmrvl driver
 - Fixes & cleanups to btbcm, btmrvl, bpa10x & btintel drivers
 - Support for vendor specific diagnostic data through common API
 - Cleanups to the 6lowpan code
 - New events & message types for monitor channel

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:15:30 -07:00
Yuval Mintz
f9468e8dc8 bnx2x: Prevent UDP 4-tuple configurations on older adapters
Configuring 4-tuple RSS hsahing for UDP
[E.g., by using `ethtool -N <interface> rx-flow-hash udp4 sdfn']
on a 57710/57711 adapter would cause it to assert as HW does not
support such a configuration.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:09:30 -07:00
David S. Miller
5edc11ab87 Merge branch 'mlxsw-fixes'
Jiri Pirko says:

====================
mlxsw: couple of fixes

Just a couple of small fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:08:14 -07:00
Ido Schimmel
bee1f753bf mlxsw: Fix bug in __mlxsw_item_bit_array_offset
When calculating the shift needed in order to access a bit array element
in a byte, we should multiply the index by the element size and not
assume it is fixed at 2-bits.

Fixes: 93c1edb27f ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:08:09 -07:00
Elad Raz
4b0c2541cb mlxsw: switchx2: changing order of exit fallbacks
Fixes: 31557f0f97 ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support")
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:08:08 -07:00
wangweidong
8fae307c8f BNX2: fix a Null Pointer for stats_blk
we have two processes to do:
P1#: ifconfig eth0 down; which will call bnx2_close, then will
, and set Null to stats_blk
P2#: ifconfig eth0; which will call bnx2_get_stats64, it will
use stats_blk.
In one case:
    --P1#--                   --P2#--
                              stats_blk(no null)
    bnx2_free_mem
    ->bp->stats_blk = NULL
                              GET_64BIT_NET_STATS

then it will cause 'NULL Pointer' Problem.
it is as well with 'ethtool -S ethx'.

Allocate the statistics block at probe time so that this problem is
impossible

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:06:21 -07:00
Eric Dumazet
e446f9dfe1 net: synack packets can be attached to request sockets
selinux needs few changes to accommodate fact that SYNACK messages
can be attached to a request socket, lacking sk_security pointer

(Only syncookies are still attached to a TCP_LISTEN socket)

Adds a new sk_listener() helper, and use it in selinux and sch_fq

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported by: kernel test robot <ying.huang@linux.intel.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:05:06 -07:00