Another bug in the "cfg80211: do not replace BSS structs" patch,
a forgotten length update leads to bogus data being stored and
passed to userspace, often truncated.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The fragmentation threshold is defined to be including the
FCS, and the code that sets the TX_FRAGMENTED flag correctly
accounts for those four bytes. The code that verifies this
doesn't though, which could lead to spurious warnings and
frames being dropped although everything is ok. Correct the
code by accounting for the FCS.
(JWL -- The problem is described here:
http://article.gmane.org/gmane.linux.kernel.wireless.general/32205 )
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
net_create() will be used by C/R to create fresh netns on restart.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
copy_net_ns() doesn't copy anything, it creates fresh netns, so
get/put of old netns isn't needed.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_prequeue() refers to the constant value (TCP_RTO_MIN) regardless of
the actual value might be tuned. The following patches fix this and make
tcp_prequeue get the actual value returns from tcp_rto_min().
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This should be very safe compared with full enabled, so I see
no reason why it shouldn't be done right away. As ECN can only
be negotiated if the SYN sending party is also supporting it,
somebody in the loop probably knows what he/she is doing. If
SYN does not ask for ECN, the server side SYN-ACK is identical
to what it is without ECN. Thus it's quite safe.
The chosen value is safe w.r.t to existing configs which
choose to currently set manually either 0 or 1 but
silently upgrades those who have not explicitly requested
ECN off.
Whether to just enable both sides comes up time to time but
unless that gets done now we can at least make the servers
aware of ECN already. As there are some known problems to occur
if ECN is enabled, it's currently questionable whether there's
any real gain from enabling clients as servers mostly won't
support it anyway (so we'd hit just the negative sides). After
enabling the servers and getting that deployed, the client end
enable really has some potential gain too.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based almost entirely upon a patch by Eric Dumazet.
The common case is to have num-tx-queues <= num_rx_queues
and even if num_tx_queues is larger it will not be significantly
larger.
Therefore, a subtraction loop is always going to be faster than
modulus.
Signed-off-by: David S. Miller <davem@davemloft.net>
These fixes resolved crashes due to resource leak BUG_ON checks. The
resource leaks were detected by introducing asynchronous transport errors.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Commit 0ba25ff4c6 ("br2684: convert to
net_device_ops") inadvertently deleted the initialization of the net_dev
pointer in the br2684_dev structure, leading to crashes. This patch
adds it back.
Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel should only be using the high 16 bits of a kernel
generated priority. Filter priorities in all other cases only
use the upper 16 bits of the u32 'prio' field of 'struct tcf_proto',
but when the kernel generates the priority of a filter is saves all
32 bits which can result in incorrect lookup failures when a filter
needs to be deleted or modified.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xt_socket can use connection tracking, and checks whether it is a module.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
When skb_rx_queue_recorded() is true, we dont want to use jash distribution
as the device driver exactly told us which queue was selected at RX time.
jhash makes a statistical shuffle, but this wont work with 8 static inputs.
Later improvements would be to compute reciprocal value of real_num_tx_queues
to avoid a divide here. But this computation should be done once,
when real_num_tx_queues is set. This needs a separate patch, and a new
field in struct net_device.
Reported-by: Andrew Dickinson <andrew@whydna.net>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lennert Buytenhek wrote:
> Since 4fb6699481 ("net: Optimize memory
> usage when splicing from sockets.") I'm seeing this oops (e.g. in
> 2.6.30-rc3) when splicing from a TCP socket to /dev/null on a driver
> (mv643xx_eth) that uses LRO in the skb mode (lro_receive_skb) rather
> than the frag mode:
My patch incorrectly assumed skb->sk was always valid, but for
"frag_listed" skbs we can only use skb->sk of their parent.
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
Debugged-by: Lennert Buytenhek <buytenh@wantstofly.org>
Tested-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In "mac80211: correct wext transmit power handler"
I fixed the wext handler, but forgot to make the default of the
user_power_level -1 (aka "auto"), so that now the transmit power
is always set to 0, causing associations to time out and similar
problems since we're transmitting with very little power. Correct
this by correcting the default user_power_level to -1.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Bisected-by: Niel Lambrechts <niel.lambrechts@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
- ieee80211_wep_init(), which is called with rtnl_lock held, blocks in
request_module() [waiting for modprobe to load a crypto module].
- modprobe blocks in a call to flush_workqueue(), when it closes a TTY
[presumably when it exits].
- The workqueue item linkwatch_event() blocks on rtnl_lock.
There's no reason for wep_init() to be called with rtnl_lock held, so
just move it outside the critical section.
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The x_tables are organized with a table structure and a per-cpu copies
of the counters and rules. On older kernels there was a reader/writer
lock per table which was a performance bottleneck. In 2.6.30-rc, this
was converted to use RCU and the counters/rules which solved the performance
problems for do_table but made replacing rules much slower because of
the necessary RCU grace period.
This version uses a per-cpu set of spinlocks and counters to allow to
table processing to proceed without the cache thrashing of a global
reader lock and keeps the same performance for table updates.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up svc_one_sock_name() by setting up automatic variables for
frequently used expressions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Add an arm to the switch statement in svc_one_sock_name() so it can
construct the name of PF_INET6 sockets properly.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aime Le Rouzic <aime.le-rouzic@bull.net>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Use snprintf() in one_sock_name() to prevent overflowing the output
buffer. If the name doesn't fit in the buffer, the buffer is filled
in with an empty string, and -ENAMETOOLONG is returned.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Adjust the synopsis of svc_sock_names() to pass in the size of the
output buffer. Add a documenting comment.
This is a cosmetic change for now. A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Adjust the synopsis of svc_addsock() to pass in the size of the output
buffer. Add a documenting comment.
This is a cosmetic change for now. A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The svc_xprt_names() function can overflow its buffer if it's so near
the end of the passed in buffer that the "name too long" string still
doesn't fit. Of course, it could never tell if it was near the end
of the passed in buffer, since its only caller passes in zero as the
buffer length.
Let's make this API a little safer.
Change svc_xprt_names() so it *always* checks for a buffer overflow,
and change its only caller to pass in the correct buffer length.
If svc_xprt_names() does overflow its buffer, it now fails with an
ENAMETOOLONG errno, instead of trying to write a message at the end
of the buffer. I don't like this much, but I can't figure out a clean
way that's always safe to return some of the names, *and* an
indication that the buffer was not long enough.
The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is
"File name too long".
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The svc_addr_len() helper function returns -EAFNOSUPPORT if it doesn't
recognize the address family of the passed-in socket address. However,
the return type of this function is size_t, which means -EAFNOSUPPORT
is turned into a very large positive value in this case.
The check in svc_udp_recvfrom() to see if the return value is less
than zero therefore won't work at all.
Additionally, handle_connect_req() passes this value directly to
memset(). This could cause memset() to clobber a large chunk of memory
if svc_addr_len() has returned an error. Currently the address family
of these addresses, however, is known to be supported long before
handle_connect_req() is called, so this isn't a real risk.
Change the error return value of svc_addr_len() to zero, which fits in
the range of size_t, and is safer to pass to memset() directly.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Fix the following sparse warnings in net/sunrpc/svc_xprt.c.
warning: symbol 'svc_recv' was not declared. Should it be static?
warning: symbol 'svc_drop' was not declared. Should it be static?
warning: symbol 'svc_send' was not declared. Should it be static?
warning: symbol 'svc_close_all' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The Bluetooth 2.1 specification introduced four different security modes
that can be mapped using Legacy Pairing and Simple Pairing. With the
usage of Simple Pairing it is required that all connections (except
the ones for SDP) are encrypted. So even the low security requirement
mandates an encrypted connection when using Simple Pairing. When using
Legacy Pairing (for Bluetooth 2.0 devices and older) this is not required
since it causes interoperability issues.
To support this properly the low security requirement translates into
different host controller transactions depending if Simple Pairing is
supported or not. However in case of Simple Pairing the command to
switch on encryption after a successful authentication is not triggered
for the low security mode. This patch fixes this and actually makes
the logic to differentiate between Simple Pairing and Legacy Pairing
a lot simpler.
Based on a report by Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth stack uses a reference counting for all established ACL
links and if no user (L2CAP connection) is present, the link will be
terminated to save power. The problem part is the dedicated pairing
when using Legacy Pairing (Bluetooth 2.0 and before). At that point
no user is present and pairing attempts will be disconnected within
10 seconds or less. In previous kernel version this was not a problem
since the disconnect timeout wasn't triggered on incoming connections
for the first time. However this caused issues with broken host stacks
that kept the connections around after dedicated pairing. When the
support for Simple Pairing got added, the link establishment procedure
needed to be changed and now causes issues when using Legacy Pairing
When using Simple Pairing it is possible to do a proper reference
counting of ACL link users. With Legacy Pairing this is not possible
since the specification is unclear in some areas and too many broken
Bluetooth devices have already been deployed. So instead of trying to
deal with all the broken devices, a special pairing timeout will be
introduced that increases the timeout to 60 seconds when pairing is
triggered.
If a broken devices now puts the stack into an unforeseen state, the
worst that happens is the disconnect timeout triggers after 120 seconds
instead of 4 seconds. This allows successful pairings with legacy and
broken devices now.
Based on a report by Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
These are later assigned to other values without being used meanwhile.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In 2.6.25 we added UDP mem accounting.
This unfortunatly added a penalty when a frame is transmitted, since
we have at TX completion time to call sock_wfree() to perform necessary
memory accounting. This calls sock_def_write_space() and utimately
scheduler if any thread is waiting on the socket.
Thread(s) waiting for an incoming frame was scheduled, then had to sleep
again as event was meaningless.
(All threads waiting on a socket are using same sk_sleep anchor)
This adds lot of extra wakeups and increases latencies, as noted
by Christoph Lameter, and slows down softirq handler.
Reference : http://marc.info/?l=linux-netdev&m=124060437012283&w=2
Fortunatly, Davide Libenzi recently added concept of keyed wakeups
into kernel, and particularly for sockets (see commit
37e5540b3c
epoll keyed wakeups: make sockets use keyed wakeups)
Davide goal was to optimize epoll, but this new wakeup infrastructure
can help non epoll users as well, if they care to setup an appropriate
handler.
This patch introduces new DEFINE_WAIT_FUNC() helper and uses it
in wait_for_packet(), so that only relevant event can wakeup a thread
blocked in this function.
Trace of function calls from bnx2 TX completion bnx2_poll_work() is :
__kfree_skb()
skb_release_head_state()
sock_wfree()
sock_def_write_space()
__wake_up_sync_key()
__wake_up_common()
receiver_wake_function() : Stops here since thread is waiting for an INPUT
Reported-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
this is the sctp code to enable hardware crc32c offload for
adapters that support it.
Originally by: Vlad Yasevich <vladislav.yasevich@hp.com>
modified by Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a brand new GRO skb, we cannot call ip_hdr since the header
may lie in the non-linear area. This patch adds the helper
skb_gro_network_header to handle this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb_gro_* code fails to handle the case where a header starts
in the linear area but ends in the frags area. Since the goal
of skb_gro_* is to optimise the case of completely non-linear
packets, we can simply bail out if we have anything in the linear
area.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now we have no upper limit on the size of the route cache hash table.
On a 128GB POWER6 box it ends up as 32MB:
IP route cache hash table entries: 4194304 (order: 9, 33554432 bytes)
It would be nice to cap this for memory consumption reasons, but a massive
hashtable also causes a significant spike when measuring OS jitter.
With a 32MB hashtable and 4 million entries, rt_worker_func is taking
5 ms to complete. On another system with more memory it's taking 14 ms.
Even though rt_worker_func does call cond_sched() to limit its impact,
in an HPC environment we want to keep all sources of OS jitter to a minimum.
With the patch applied we limit the number of entries to 512k which
can still be overriden by using the rt_entries boot option:
IP route cache hash table entries: 524288 (order: 6, 4194304 bytes)
With this patch rt_worker_func now takes 0.460 ms on the same system.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I initially implemented this protocol, I disregarded the use of netlink
attribute headers, thinking for my purposes they weren't needed. I've come to
find out that, as I'm starting to work with sending down messages with
associated data (like config messages), the kernel code spits out warnings about
trailing data in a netlink skb that doesn't have an associated header on it. As
such, I'm going to start including attribute headers in my netlink transaction,
and so for completeness, I should likely include them on messages bound from the
kernel to user space. This patch adds that header to the kernel, and bumps the
protocol version accordingly
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When kernel inserts a temporary SA for IKE, it uses the wrong hash
value for dst list. Two hash values were calcultated before: one with
source address and one with a wildcard source address.
Bug hinted by Junwei Zhang <junwei.zhang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IP MIB (RFC 4293) defines stats for InOctets, OutOctets, InMcastOctets and
OutMcastOctets:
http://tools.ietf.org/html/rfc4293
But it seems we don't track those in any way that easy to separate from other
protocols. This patch adds those missing counters to the stats file. Tested
successfully by me
With help from Eric Dumazet.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the VLAN event handler does not adjust the VLAN
device's carrier state when the real device or the VLAN device is set
administratively up or down.
The following patch adds a transfer of operating state from the
real device to the VLAN device when the real device is administratively
set up or down, and sets the carrier state up or down during init, open
and close of the VLAN device.
This permits observers above the VLAN device that care about the
carrier state (bonding's link monitor, for example) to receive updates
for administrative changes by more closely mimicing the behavior of real
devices.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The nfs server rdma transport was mapping rdma read target pages for
TO_DEVICE instead of FROM_DEVICE. This causes data corruption on non
cache-coherent systems if frmrs are used.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
net/mac80211/tx.c: In function ‘ieee80211_tx_h_select_key’:
net/mac80211/tx.c:448: warning: ‘key’ may be used uninitialized in this function
drivers/net/wireless/ath/ath9k/rc.c: In function ‘ath_rc_rate_getidx’:
drivers/net/wireless/ath/ath9k/rc.c:815: warning: ‘nextindex’ may be used uninitialized in this function
drivers/net/wireless/hostap/hostap_plx.c: In function ‘prism2_plx_probe’:
drivers/net/wireless/hostap/hostap_plx.c:438: warning: ‘cor_index’ may be used uninitialized in this function
drivers/net/wireless/hostap/hostap_plx.c:438: warning: ‘cor_offset’ may be used uninitialized in this function
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Related-to: commit 325fb5b4d2
The compat path suffers from a similar problem. It only uses a __be32
when all of the recent code uses, and expects, an nf_inet_addr
everywhere. As a result, addresses stored by xt_recents were
filled with whatever other stuff was on the stack following the be32.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
With a minor compile fix from Roman.
Reported-and-tested-by: Roman Hoog Antink <rha@open.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds missing role attribute to the DCCP type, otherwise
the creation of entries is not of any use.
The attribute added is CTA_PROTOINFO_DCCP_ROLE which contains the
role of the conntrack original tuple.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Commit d0dba725 (netfilter: ctnetlink: add callbacks to the per-proto
nlattrs) changed the protocol registration function to abort if the
to-be registered protocol doesn't provide a new callback function.
The DCCP and UDP-Lite IPv6 protocols were missed in this conversion,
add the required callback pointer.
Reported-and-tested-by: Steven Jan Springl <steven@springl.ukfsn.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>