Commit graph

34197 commits

Author SHA1 Message Date
Pablo Neira Ayuso
65cd90ac76 netfilter: nft_chain_nat_ipv4: use generic IPv4 NAT code from core
Use the exported IPv4 NAT functions that are provided by the core. This
removes duplicated code so iptables and nft use the same NAT codebase.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-02 17:14:11 +02:00
Pablo Neira Ayuso
30766f4c2d netfilter: nat: move specific NAT IPv4 to core
Move the specific NAT IPv4 core functions that are called from the
hooks from iptable_nat.c to nf_nat_l3proto_ipv4.c. This prepares the
ground to allow iptables and nft to use the same NAT engine code that
comes in a follow up patch.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-02 17:14:10 +02:00
Willem de Bruijn
364a9e9324 sock: deduplicate errqueue dequeue
sk->sk_error_queue is dequeued in four locations. All share the
exact same logic. Deduplicate.

Also collapse the two critical sections for dequeue (at the top of
the recv handler) and signal (at the bottom).

This moves signal generation for the next packet forward, which should
be harmless.

It also changes the behavior if the recv handler exits early with an
error. Previously, a signal for follow-up packets on the errqueue
would then not be scheduled. The new behavior, to always signal, is
arguably a bug fix.

For rxrpc, the change causes the same function to be called repeatedly
for each queued packet (because the recv handler == sk_error_report).
It is likely that all packets will fail for the same reason (e.g.,
memory exhaustion).

This code runs without sk_lock held, so it is not safe to trust that
sk->sk_err is immutable inbetween releasing q->lock and the subsequent
test. Introduce int err just to avoid this potential race.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 21:49:08 -07:00
Tom Herbert
72297c59f7 l2tp: Enable checksum unnecessary conversions for l2tp/UDP sockets
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 21:36:28 -07:00
Tom Herbert
884d338c04 gre: Add support for checksum unnecessary conversions
Call skb_checksum_try_convert and skb_gro_checksum_try_convert
after checksum is found present and validated in the GRE header
for normal and GRO paths respectively.

In GRO path, call skb_gro_checksum_try_convert

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 21:36:28 -07:00
Tom Herbert
2abb7cdc0d udp: Add support for doing checksum unnecessary conversion
Add support for doing CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE
conversion in UDP tunneling path.

In the normal UDP path, we call skb_checksum_try_convert after locating
the UDP socket. The check is that checksum conversion is enabled for
the socket (new flag in UDP socket) and that checksum field is
non-zero.

In the UDP GRO path, we call skb_gro_checksum_try_convert after
checksum is validated and checksum field is non-zero. Since this is
already in GRO we assume that checksum conversion is always wanted.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 21:36:28 -07:00
Tom Herbert
5a21232983 net: Support for csum_bad in skbuff
This flag indicates that an invalid checksum was detected in the
packet. __skb_mark_checksum_bad helper function was added to set this.

Checksums can be marked bad from a driver or the GRO path (the latter
is implemented in this patch). csum_bad is checked in
__skb_checksum_validate_complete (i.e. calling that when ip_summed ==
CHECKSUM_NONE).

csum_bad works in conjunction with ip_summed value. In the case that
ip_summed is CHECKSUM_NONE and csum_bad is set, this implies that the
first (or next) checksum encountered in the packet is bad. When
ip_summed is CHECKSUM_UNNECESSARY, the first checksum after the last
one validated is bad. For example, if ip_summed == CHECKSUM_UNNECESSARY,
csum_level == 1, and csum_bad is set-- then the third checksum in the
packet is bad. In the normal path, the packet will be dropped when
processing the protocol layer of the bad checksum:
__skb_decr_checksum_unnecessary called twice for the good checksums
changing ip_summed to CHECKSUM_NONE so that
__skb_checksum_validate_complete is called to validate the third
checksum and that will fail since csum_bad is set.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 21:36:27 -07:00
Florian Fainelli
61b7363ffa net: dsa: make dsa_pack_type static
net/dsa/dsa.c:624:20: sparse: symbol 'dsa_pack_type' was not declared.
Should it be static?

Fixes: 3e8a72d1da ("net: dsa: reduce number of protocol hooks")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 20:41:45 -07:00
stephen hemminger
688d1945bc tcp: whitespace fixes
Fix places where there is space before tab, long lines, and
awkward if(){, double spacing etc. Add blank line after declaration/initialization.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 18:12:45 -07:00
David S. Miller
377655fe6b Merge tag 'master-2014-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
pull request: wireless 2014-08-28

Please pull this batch of fixes intended for the 3.17 stream.

For the Bluetooth/6LowPAN/802.15.4 bits, Johan says:

'It contains a connection reference counting fix for LE where a
connection might stay up even though it should get disconnected.

The other 802.15.4 6LoWPAN related patches were sent to the bluetooth
tree by Alexander Aring and described as follows by him:

"
these patches contains patches for the bluetooth branch.

This series includes memory leak fixes and an errno value fix.
Also there are two patches for sending and receiving 1280 6LoWPAN
packets, which makes the IEEE 802.15.4 6LoWPAN stack more RFC
compliant.
"'

Along with that...

Alexey Khoroshilov fixes a use-after-free bug on at76c50x-usb.

Hauke Mehrtens adds a PCI ID to bcma.

Himangi Saraogi fixes a silly "A || A" test in rtlwifi.

Larry Finger adds a device ID to rtl8192cu.

Maks Naumov fixes a strncmp argument in ath9k.

Álvaro Fernández Rojas adds a PCI ID to ssb.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 18:08:11 -07:00
Jesper Dangaard Brouer
afb84b6261 pktgen: add flag NO_TIMESTAMP to disable timestamping
Then testing the TX limits of the stack, then it is useful to
be-able to disable the do_gettimeofday() timetamping on every packet.

This implements a pktgen flag NO_TIMESTAMP which will disable this
call to do_gettimeofday().

The performance change on (my system E5-2695) with skb_clone=0, goes
from TX 2,423,751 pps to 2,567,165 pps with flag NO_TIMESTAMP. Thus,
the cost of do_gettimeofday() or saving is approx 23 nanosec.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 18:06:59 -07:00
Erik Hugne
a5325ae5b8 tipc: add name distributor resiliency queue
TIPC name table updates are distributed asynchronously in a cluster,
entailing a risk of certain race conditions. E.g., if two nodes
simultaneously issue conflicting (overlapping) publications, this may
not be detected until both publications have reached a third node, in
which case one of the publications will be silently dropped on that
node. Hence, we end up with an inconsistent name table.

In most cases this conflict is just a temporary race, e.g., one
node is issuing a publication under the assumption that a previous,
conflicting, publication has already been withdrawn by the other node.
However, because of the (rtt related) distributed update delay, this
may not yet hold true on all nodes. The symptom of this failure is a
syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error".

In this commit we add a resiliency queue at the receiving end of
the name table distributor. When insertion of an arriving publication
fails, we retain it in this queue for a short amount of time, assuming
that another update will arrive very soon and clear the conflict. If so
happens, we insert the publication, otherwise we drop it.

The (configurable) retention value defaults to 2000 ms. Knowing from
experience that the situation described above is extremely rare, there
is no risk that the queue will accumulate any large number of items.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:51:48 -07:00
Erik Hugne
f4ad8a4b8b tipc: refactor name table updates out of named packet receive routine
We need to perform the same actions when processing deferred name
table updates, so this functionality is moved to a separate
function.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:51:48 -07:00
David S. Miller
8dcda22a5d net: xmit_list() becomes dev_hard_start_xmit().
Now fundamentally we can process lists of SKBs as cheaply
as single packets.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:56 -07:00
David S. Miller
ce93718fb7 net: Don't keep around original SKB when we software segment GSO frames.
Just maintain the list properly by returning the head of the remaining
SKB list from dev_hard_start_xmit().

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:56 -07:00
David S. Miller
50cbe9ab5f net: Validate xmit SKBs right when we pull them out of the qdisc.
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:56 -07:00
David S. Miller
eae3f88ee4 net: Separate out SKB validation logic from transmit path.
dev_hard_start_xmit() does two things, it first validates and
canonicalizes the SKB, then it actually sends it.

Make a set of helper functions for doing the first part.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:55 -07:00
David S. Miller
95f6b3dda2 net: Have xmit_list() signal more==true when appropriate.
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:55 -07:00
David S. Miller
fa2dbdc253 net: Pass a "more" indication down into netdev_start_xmit() code paths.
For now it will always be false.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:55 -07:00
David S. Miller
7f2e870f2a net: Move main gso loop out of dev_hard_start_xmit() into helper.
There is a slight policy change happening here as well.

The previous code would drop the entire rest of the GSO skb if any of
them got, for example, a congestion notification.

That makes no sense, anything NET_XMIT_MASK and below is something
like congestion or policing.  And in the congestion case it doesn't
even mean the packet was actually dropped.

Just continue until dev_xmit_complete() evaluates to false.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:55 -07:00
David S. Miller
2ea2551375 net: Create xmit_one() helper for dev_hard_start_xmit()
Hopefully making the code a bit easier to read and digest.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:55 -07:00
David S. Miller
10b3ad8c21 net: Do txq_trans_update() in netdev_start_xmit()
That way we don't have to audit every call site to make sure it is
doing this properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01 17:39:55 -07:00
Pablo Neira Ayuso
d79a61d646 netfilter: NETFILTER_XT_TARGET_LOG selects NF_LOG_*
CONFIG_NETFILTER_XT_TARGET_LOG is not selected anymore when jumping
from 3.16 to 3.17-rc1 if you don't set on the new NF_LOG_IPV4 and
NF_LOG_IPV6 switches.

Change this to select the three new symbols NF_LOG_COMMON, NF_LOG_IPV4
and NF_LOG_IPV6 instead, so NETFILTER_XT_TARGET_LOG remains enabled
when moving from old to new kernels.

Reported-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-01 13:46:31 +02:00
Tom Herbert
202863fe4c sctp: Change sctp to implement csum_levels
CHECKSUM_UNNECESSARY may be applied to the SCTP CRC so we need to
appropriate account for this by decrementing csum_level. This is
done by calling __skb_dec_checksum_unnecessary.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:41:11 -07:00
Tom Herbert
662880f442 net: Allow GRO to use and set levels of checksum unnecessary
Allow GRO path to "consume" checksums provided in CHECKSUM_UNNECESSARY
and to report new checksums verfied for use in fallback to normal
path.

Change GRO checksum path to track csum_level using a csum_cnt field
in NAPI_GRO_CB. On GRO initialization, if ip_summed is
CHECKSUM_UNNECESSARY set NAPI_GRO_CB(skb)->csum_cnt to
skb->csum_level + 1. For each checksum verified, decrement
NAPI_GRO_CB(skb)->csum_cnt while its greater than zero. If a checksum
is verfied and NAPI_GRO_CB(skb)->csum_cnt == 0, we have verified a
deeper checksum than originally indicated in skbuf so increment
csum_level (or initialize to CHECKSUM_UNNECESSARY if ip_summed is
CHECKSUM_NONE or CHECKSUM_COMPLETE).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:41:11 -07:00
Tom Herbert
77cffe23c1 net: Clarification of CHECKSUM_UNNECESSARY
This patch:
 - Clarifies the specific requirements of devices returning
   CHECKSUM_UNNECESSARY (comments in skbuff.h).
 - Adds csum_level field to skbuff. This is used to express how
   many checksums are covered by CHECKSUM_UNNECESSARY (stores n - 1).
   This replaces the overloading of skb->encapsulation, that field is
   is now only used to indicate inner headers are valid.
 - Change __skb_checksum_validate_needed to "consume" each checksum
   as indicated by csum_level as layers of the the packet are parsed.
 - Remove skb_pop_rcv_encapsulation, no longer needed in the new
   csum_level model.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:41:11 -07:00
Daniel Borkmann
38ab1fa981 net: sctp: fix ABI mismatch through sctp_assoc_to_state helper
Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp
tree, the official <netinet/sctp.h> header carries a copy of enum
sctp_sstat_state that looks like (compared to the current in-kernel
enumeration):

  User definition:                     Kernel definition:

  enum sctp_sstat_state {              typedef enum {
    SCTP_EMPTY             = 0,          <removed>
    SCTP_CLOSED            = 1,          SCTP_STATE_CLOSED            = 0,
    SCTP_COOKIE_WAIT       = 2,          SCTP_STATE_COOKIE_WAIT       = 1,
    SCTP_COOKIE_ECHOED     = 3,          SCTP_STATE_COOKIE_ECHOED     = 2,
    SCTP_ESTABLISHED       = 4,          SCTP_STATE_ESTABLISHED       = 3,
    SCTP_SHUTDOWN_PENDING  = 5,          SCTP_STATE_SHUTDOWN_PENDING  = 4,
    SCTP_SHUTDOWN_SENT     = 6,          SCTP_STATE_SHUTDOWN_SENT     = 5,
    SCTP_SHUTDOWN_RECEIVED = 7,          SCTP_STATE_SHUTDOWN_RECEIVED = 6,
    SCTP_SHUTDOWN_ACK_SENT = 8,          SCTP_STATE_SHUTDOWN_ACK_SENT = 7,
  };                                   } sctp_state_t;

This header was later on also placed into the uapi, so that user space
programs can compile without having <netinet/sctp.h>, but the shipped
with <linux/sctp.h> instead.

While RFC6458 under 8.2.1.Association Status (SCTP_STATUS) says that
sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we
nevertheless have a what it appears to be dummy SCTP_EMPTY state from
the very early days.

While it seems to do just nothing, commit 0b8f9e25b0 ("sctp: remove
completely unsed EMPTY state") did the right thing and removed this dead
code. That however, causes an off-by-one when the user asks the SCTP
stack via SCTP_STATUS API and checks for the current socket state thus
yielding possibly undefined behaviour in applications as they expect
the kernel to tell the right thing.

The enumeration had to be changed however as based on the current socket
state, we access a function pointer lookup-table through this. Therefore,
I think the best way to deal with this is just to add a helper function
sctp_assoc_to_state() to encapsulate the off-by-one quirk.

Reported-by: Tristan Su <sooqing@gmail.com>
Fixes: 0b8f9e25b0 ("sctp: remove completely unsed EMPTY state")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:31:08 -07:00
Eric Dumazet
d9b2938aab net: attempt a single high order allocation
In commit ed98df3361 ("net: use __GFP_NORETRY for high order
allocations") we tried to address one issue caused by order-3
allocations.

We still observe high latencies and system overhead in situations where
compaction is not successful.

Instead of trying order-3, order-2, and order-1, do a single order-3
best effort and immediately fallback to plain order-0.

This mimics slub strategy to fallback to slab min order if the high
order allocation used for performance failed.

Order-3 allocations give a performance boost only if they can be done
without recurring and expensive memory scan.

Quoting David :

The page allocator relies on synchronous (sync light) memory compaction
after direct reclaim for allocations that don't retry and deferred
compaction doesn't work with this strategy because the allocation order
is always decreasing from the previous failed attempt.

This means sync light compaction will always be encountered if memory
cannot be defragmented or reclaimed several times during the
skb_page_frag_refill() iteration.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:28:23 -07:00
Ying Xue
cc086fcf92 tipc: fix a potential oops
Commit 6c9808ce09 ("tipc: remove port_lock") accidentally involves
a potential bug: when tipc socket instance(tsk) is not got with given
reference number in tipc_sk_get(), tsk is set to NULL. Subsequently
we jump to exit label where to decrease socket reference counter
pointed by tsk pointer in tipc_sk_put(). However, As now tsk is NULL,
oops may happen because of touching a NULL pointer.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:22:43 -07:00
Daniel Borkmann
10c51b5623 net: add skb_get_tx_queue() helper
Replace occurences of skb_get_queue_mapping() and follow-up
netdev_get_tx_queue() with an actual helper function.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:02:07 -07:00
Mika Westerberg
d0616613d9 net: rfkill: gpio: Add more Broadcom bluetooth ACPI IDs
This adds one more ACPI ID of a Broadcom bluetooth chip.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-29 13:10:44 +02:00
Michal Kazior
a00f4f6e04 mac80211: fix chantype recalc warning
When a device driver is unloaded local->interfaces
list is cleared. If there was more than 1
interface running and connected (bound to a
chanctx) then chantype recalc was called and it
ended up with compat being NULL causing a call
trace warning.

Warn if compat becomes NULL as a result of
incompatible bss_conf.chandef of interfaces bound
to a given channel context only.

The call trace looked like this:

 WARNING: CPU: 2 PID: 2594 at /devel/src/linux/net/mac80211/chan.c:557 ieee80211_recalc_chanctx_chantype+0x2cd/0x2e0()
 Modules linked in: ath10k_pci(-) ath10k_core ath
 CPU: 2 PID: 2594 Comm: rmmod Tainted: G        W     3.16.0-rc1+ #150
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  0000000000000009 ffff88001ea279c0 ffffffff818dfa93 0000000000000000
  ffff88001ea279f8 ffffffff810514a8 ffff88001ce09cd0 ffff88001e03cc58
  0000000000000000 ffff88001ce08840 ffff88001ce09cd0 ffff88001ea27a08
 Call Trace:
  [<ffffffff818dfa93>] dump_stack+0x4d/0x66
  [<ffffffff810514a8>] warn_slowpath_common+0x78/0xa0
  [<ffffffff81051585>] warn_slowpath_null+0x15/0x20
  [<ffffffff818a407d>] ieee80211_recalc_chanctx_chantype+0x2cd/0x2e0
  [<ffffffff818a3dda>] ? ieee80211_recalc_chanctx_chantype+0x2a/0x2e0
  [<ffffffff818a4919>] ieee80211_assign_vif_chanctx+0x1a9/0x770
  [<ffffffff818a6220>] __ieee80211_vif_release_channel+0x70/0x130
  [<ffffffff818a6dd3>] ieee80211_vif_release_channel+0x43/0xb0
  [<ffffffff81885f4e>] ieee80211_stop_ap+0x21e/0x5a0
  [<ffffffff8184b9b5>] __cfg80211_stop_ap+0x85/0x520
  [<ffffffff8181c188>] __cfg80211_leave+0x68/0x120
  [<ffffffff8181c268>] cfg80211_leave+0x28/0x40
  [<ffffffff8181c5f3>] cfg80211_netdev_notifier_call+0x373/0x6b0
  [<ffffffff8107f965>] notifier_call_chain+0x55/0x110
  [<ffffffff8107fa41>] raw_notifier_call_chain+0x11/0x20
  [<ffffffff816a8dc0>] call_netdevice_notifiers_info+0x30/0x60
  [<ffffffff816a8eb9>] __dev_close_many+0x59/0xf0
  [<ffffffff816a9021>] dev_close_many+0x81/0x120
  [<ffffffff816aa1c5>] rollback_registered_many+0x115/0x2a0
  [<ffffffff816aa3a6>] unregister_netdevice_many+0x16/0xa0
  [<ffffffff8187d841>] ieee80211_remove_interfaces+0x121/0x1b0
  [<ffffffff8185e0e6>] ieee80211_unregister_hw+0x56/0x110
  [<ffffffffa0011ac4>] ath10k_mac_unregister+0x14/0x60 [ath10k_core]
  [<ffffffffa0014fe7>] ath10k_core_unregister+0x27/0x40 [ath10k_core]
  [<ffffffffa003b1f4>] ath10k_pci_remove+0x44/0xa0 [ath10k_pci]
  [<ffffffff81373138>] pci_device_remove+0x28/0x60
  [<ffffffff814cb534>] __device_release_driver+0x64/0xd0
  [<ffffffff814cbcc8>] driver_detach+0xb8/0xc0
  [<ffffffff814cb23a>] bus_remove_driver+0x4a/0xb0
  [<ffffffff814cc697>] driver_unregister+0x27/0x50

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-29 13:06:01 +02:00
Florian Fainelli
246d7f773c net: dsa: add Broadcom SF2 switch driver
Add support for the Broadcom Starfigther 2 switch chip using a DSA
driver. This switch driver supports the following features:

- configuration of the external switch port interface: MII, RevMII,
  RGMII and RGMII_NO_ID are supported
- support for the per-port MIB counters
- support for link interrupts for special ports (e.g: MoCA)
- powering up/down of switch memories to conserve power when ports are
  unused

Finally, update the compatible property for the DSA core code to match
our switch top-level compatible node.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
5037d532b8 net: dsa: add Broadcom tag RX/TX handler
Add support for the 4-bytes Broadcom tag that built-in switches such as
the Starfighter 2 might insert when receiving packets, or that we need
to insert while targetting specific switch ports. We use a fake local
EtherType value for this 4-bytes switch tag: ETH_P_BRCMTAG to make sure
we can assign DSA-specific network operations within the DSA drivers.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
ce31b31c68 net: dsa: allow updating fixed PHY link information
Allow switch drivers to hook a PHY link update callback to perform
port-specific link work.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
ec9436baed net: dsa: allow drivers to do link adjustment
Whenever libphy determines that the link status of a given PHY/port has
changed, allow to call into the switch driver link adjustment callback
so proper actions can be taken care of by the switch driver upon link
notification.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
5aed85cec2 net: dsa: allow switches to work without tagging
In case switch port tagging is disabled (voluntarily, or the switch just
does not support it), allow us to continue using the defined set of
dsa_device_ops in net/dsa/slave.c.

We introduce dsa_protocol_is_tagged() to check whether we need to
override skb->protocol and go through the DSA-specifif packet_type
function, or if we just go on and receive the SKB through the normal
path.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
0d8bcdd383 net: dsa: allow for more complex PHY setups
Modify the DSA slave interface to be bound to an arbitray PHY, not just
the ones that are available as child PHY devices of the switch MDIO bus.

This allows us for instance to have external PHYs connected to a
separate MDIO bus, but yet also connected to a given switch port.

Under certain configurations, the physical port mask might not be a 1:1
mapping to the MII PHYs mask. This is the case, if e.g: Port 1 of the
switch is used and connects to a PHY at a MDIO address different than 1.

Introduce a phys_mii_mask variable which allows driver to implement and
divert their own MDIO read/writes operations for a subset of the MDIO
PHY addresses.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
bd47497a01 net: dsa: retain a per-port device_node pointer
We will later use the per-port device_node pointer to fetch a bunch of
port-specific properties.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:40 -07:00
Florian Fainelli
fa981d9af8 net: dsa: provide a switch device device tree node pointer
We might need to fetch additional resources from the device tree node
pointer, such as register ranges or other properties. Keep a device_node
pointer around for this.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:39 -07:00
Florian Fainelli
3e8a72d1da net: dsa: reduce number of protocol hooks
DSA is currently registering one packet_type function per EtherType it
needs to intercept in the receive path of a DSA-enabled Ethernet device.
Right now we have three of them: trailer, DSA and eDSA, and there might
be more in the future, this will not scale to the addition of new
protocols.

This patch proceeds with adding a new layer of abstraction and two new
functions:

dsa_switch_rcv() which will dispatch into the tag-protocol specific
receive function implemented by net/dsa/tag_*.c

dsa_slave_xmit() which will dispatch into the tag-protocol specific
transmit function implemented by net/dsa/tag_*.c

When we do create the per-port slave network devices, we iterate over
the switch protocol to assign the DSA-specific receive and transmit
operations.

A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact
that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER
like it used to be.

This allows us to greatly simplify the check in eth_type_trans() and
always override the skb->protocol with ETH_P_XDSA for Ethernet switches
tagged protocol, while also reducing the number repetitive slave
netdevice_ops assignments.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 22:59:39 -07:00
Julian Anastasov
eb90b0c734 ipvs: fix ipv6 hook registration for local replies
commit fc60476761
("ipvs: changes for local real server") from 2.6.37
introduced DNAT support to local real server but the
IPv6 LOCAL_OUT handler ip_vs_local_reply6() is
registered incorrectly as IPv4 hook causing any outgoing
IPv4 traffic to be dropped depending on the IP header values.

Chris tracked down the problem to CONFIG_IP_VS_IPV6=y
Bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349768

Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Tested-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2014-08-28 10:52:37 +09:00
Florian Westphal
253ff51635 tcp: syncookies: mark cookie_secret read_mostly
only written once.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-27 16:30:49 -07:00
Andreea-Cristina Bernat
2688eba9d5 mac80211: Replace rcu_dereference() with rcu_access_pointer()
The "rcu_dereference()" calls are used directly in conditions.
Since their return values are never dereferenced it is recommended to
use "rcu_access_pointer()" instead of "rcu_dereference()".
Therefore, this patch makes the replacements.

The following Coccinelle semantic patch was used:
@@
@@

(
 if(
 (<+...
- rcu_dereference
+ rcu_access_pointer
  (...)
  ...+>)) {...}
|
 while(
 (<+...
- rcu_dereference
+ rcu_access_pointer
  (...)
  ...+>)) {...}
)

Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-27 12:14:10 +02:00
Julian Anastasov
ea1d5d7755 ipvs: properly declare tunnel encapsulation
The tunneling method should properly use tunnel encapsulation.
Fixes problem with CHECKSUM_PARTIAL packets when TCP/UDP csum
offload is supported.

Thanks to Alex Gartrell for reporting the problem, providing
solution and for all suggestions.

Reported-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2014-08-27 14:31:56 +09:00
Alexey Perevalov
f111f780ae netfilter: nfnetlink_acct: add filter support to nfacct counter list/reset
You can use this to skip accounting objects when listing/resetting
via NFNL_MSG_ACCT_GET/NFNL_MSG_ACCT_GET_CTRZERO messages with the
NLM_F_DUMP netlink flag. The filtering covers the following cases:

1. No filter specified. In this case, the client will get old behaviour,
2. List/reset counter object only: In this case, you have to use
   NFACCT_F_QUOTA as mask and value 0.
3. List/reset quota objects only: You have to use NFACCT_F_QUOTA_PKTS
   as mask and value - the same, for byte based quota mask should be
   NFACCT_F_QUOTA_BYTES and value - the same.

If you want to obtain the object with any quota type
(ie. NFACCT_F_QUOTA_PKTS|NFACCT_F_QUOTA_BYTES), you need to perform
two dump requests, one to obtain NFACCT_F_QUOTA_PKTS objects and
another for NFACCT_F_QUOTA_BYTES.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-08-26 21:36:19 +02:00
Andreea-Cristina Bernat
ad053a962f mac80211: scan: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1.   This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.

The following Coccinelle semantic patch was used:
@@
@@

- rcu_assign_pointer
+ RCU_INIT_POINTER
  (..., NULL)

Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:16:31 +02:00
Johannes Berg
5bc8c1f2b0 cfg80211: allow passing frame type to cfg80211_inform_bss()
When using the cfg80211_inform_bss[_width]() functions drivers
cannot currently indicate whether the data was received in a
beacon or probe response. Fix that by passing a new enum that
indicates such (or unknown).

For good measure, use it in ath6kl.

Acked-by: Kalle Valo <kvalo@qca.qualcomm.com> [ath6kl]
Acked-by: Arend van Spriel <arend@broadcom.com> [brcmfmac]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:16:02 +02:00
Johannes Berg
0e227084ae cfg80211: clarify BSS probe response vs. beacon data
There are a few possible cases of where BSS data came from:
 1) only a beacon has been received
 2) only a probe response has been received
 3) the driver didn't report what it received (this happens when
    using cfg80211_inform_bss[_width]())
 4) both probe response and beacon data has been received

Unfortunately, in the userspace API, a few things weren't there:
 a) there was no way to differentiate cases 1) and 4) above
    without comparing the data of the IEs
 b) the TSF was always from the last frame, instead of being
    exposed for beacon/probe response separately like IEs

Fix this by
   i) exporting a new flag attribute that indicates whether or
      not probe response data has been received - this addresses (a)
  ii) exporting a BEACON_TSF attribute that holds the beacon's TSF
      if a beacon has been received
 iii) not exporting the beacon attributes in case (3) above as that
      would just lead userspace into thinking the data actually came
      from a beacon when that isn't clear

To implement this, track inside the IEs struct whether or not it
(definitely) came from a beacon.

Reported-by: William Seto
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:16:01 +02:00
Michal Kazior
f41ef64853 cfg80211: re-enable CSA for drivers that support it
This reverts commit dda444d524.

Channel switching code has been reworked and
improved significantly since the time original
locking issues were found.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-08-26 11:16:01 +02:00