Rework TIPC's message sending routines to take advantage of the total
amount of data value passed to it by the kernel socket infrastructure.
This change eliminates the need for TIPC to compute the size of outgoing
messages itself, as well as the check for an oversize message in
tipc_msg_build(). In addition, this change warrants an explanation:
- res = send_packet(NULL, sock, &my_msg, 0);
+ res = send_packet(NULL, sock, &my_msg, bytes_to_send);
Previously, the final argument to send_packet() was ignored (since the
amount of data being sent was recalculated by a lower-level routine)
and we could just pass in a dummy value (0). Now that the
recalculation is being eliminated, the argument value being passed to
send_packet() is significant and we have to supply the actual amount
of data we want to send.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Adds checks to TIPC's socket send routines to promptly detect and
abort attempts to send more than 66,000 bytes in a single TIPC
message or more than 2**31-1 bytes in a single TIPC byte stream request.
In addition, this ensures that the number of iovecs in a send request
does not exceed the limits of a standard integer variable.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances existing checks on the discovery domain associated with a TIPC
bearer. A bearer can no longer be configured to accept links from itself
only (which would be pointless), or to nodes outside its own cluster
(since multi-cluster support has now been removed from TIPC). Also, the
neighbor discovery routine now validates link setup requests against the
configured discovery domain for the bearer, rather than simply ensuring
the requesting node belongs to the node's own cluster.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
This allows them to be available for easy re-use in other places
and avoids trivial mistakes caused by "count the f's and 0's".
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies a TIPC send routine that did not discard the outgoing sk_buff
if it was not transmitted because of link congestion; this eliminates
the potential for buffer leakage in the many callers who did not clean up
the unsent buffer. (The two routines that previously did discard the unsent
buffer have been updated to eliminate their now-redundant clean up.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Sets the destination node field of an incoming multicast message
to the receiving node's network address before handing off the message
to each receiving port. This ensures that, in the event the destination
port returns the message to the sender, the sender can identify which
node the destination port belonged to.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Set the destination node and destination port fields of an outgoing
multicast message header to zero; this is necessary to ensure that
the receiving node can route the message properly if it was packed
into a bundle due to link congestion. (Previously, there was a chance
that the receiving node would send the unbundled message to a random
node & port, rather than processing the message itself.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Ensures that all outgoing data messages have the "name lookup scope"
field of their header set correctly; that is, named multicast messages
now specify cluster-wide name lookup, while messages not using TIPC
naming zero out the lookup field. (Previously, the lookup scope specified
for these types of messages was inherited from the last message sent
by the sending port.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies the routine that fragments an existing message buffer to
use similar logic to that used when generating fragments from an iovec.
The routine now creates a complete chain of fragments and adds them to
the link transmit queue as a unit, so that the link sends all fragments
or none; this prevents the incomplete transmission of a fragmented
message that might otherwise result because of link congestion or
memory exhaustion. This change also ensures that the counter recording
the number of fragmented messages sent by the link is now incremented
only if the message is actually sent.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates code that restricts a link's counter of its fragmented
messages to a 16-bit value, since the counter value is automatically
restricted to this range when it is written into the message header.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates code that sets the link selector field in the header of
fragmented messages, since this information is never referenced.
(The unnecessary initialization was harmless as it was over-written
by the fragmented message identifier value before the fragments were
transmitted.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates optional code used to test TIPC's ability to recover
from lost broadcast messages. This code duplicates functionality
already provided by the network stack's QoS option "network emulator".
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Half of the #define entries in msg.h were down at the bottom
of the header, instead of up at the top before any of the static
inlines etc. Relocate them up to the top, to be consistent with
the other normal linux header file layout conventions.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Gets rid of unused constants defining the types used in routing
messages. These messages no longer exist in TIPC now that multicluster
and multizone support has been eliminated.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Removes comments in TIPC's message header include file that are
outdated and/or unnecessary. Also introduces short comments (or
supplements existing ones) to better describe several set of existing
symbolic constants.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
This seems to be a leftover from the old days, when we didn't support
any frames that didn't contain the full ieee802.11 header. This is
not the case anymore. It does not cause problems now, because they
are only dropped during scan. But when scheduled scans get merged,
this would become a problem because we would drop all small frames
while scheduled scan is running.
To fix this, return RX_CONTINUE instead of RX_DROP_MONITOR.
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When mac80211 is built without CONFIG_PM being defined, the following errors
are output:
net/mac80211/main.c: In function ‘ieee80211_register_hw’:
net/mac80211/main.c:700: error: ‘const struct ieee80211_ops’ has no member named ‘suspend’
net/mac80211/main.c:700: error: ‘const struct ieee80211_ops’ has no member named ‘resume’
make[2]: *** [net/mac80211/main.o] Error 1
make[1]: *** [net/mac80211] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [net] Error 2
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
These warnings are exposed by gcc 4.6.
net/wireless/reg.c: In function 'freq_reg_info_regd':
net/wireless/reg.c:675:38: warning: variable 'pr' set but not used
[-Wunused-but-set-variable]
net/wireless/lib80211_crypt_wep.c: In function 'lib80211_wep_build_iv':
net/wireless/lib80211_crypt_wep.c:99:12: warning: variable 'len' set but
not used [-Wunused-but-set-variable]
Signed-off-by: Rajkumar Manoharan <rmanoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we are disconnecting, we set PS off, but this happens before we
send the deauth/disassoc request. When the deauth/disassoc frames are
sent, we trigger the dynamic ps timer, which then times out and turns
PS back on. Thus, PS remains on after disconnecting, causing problems
when associating again.
This can be fixed by preventing the timer to start when we're not
associated anymore.
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The kernel already prints its build timestamp during boot, no need to
repeat it in random drivers and produce different object files each
time.
Signed-off-by: Michal Marek <mmarek@suse.cz>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: netdev@vger.kernel.org
Cc: tipc-discussion@lists.sourceforge.net
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
This patch reverts a2361c8735:
"[PATCH] netfilter: xt_conntrack: warn about use in raw table"
Florian Wesphal says:
"... when the packet was sent from the local machine the skb
already has ->nfct attached, and -m conntrack seems to do
the right thing."
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Reported-by: Florian Wesphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The IPv6 header is not zeroed out in alloc_skb so we must initialize
it properly unless we want to see IPv6 packets with random TOS fields
floating around. The current implementation resets the flow label
but this could be changed if deemed necessary.
We stumbled upon this issue when trying to apply a mangle rule to
the RST packet generated by the REJECT target module.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.
IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.
This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.
The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
If the sync daemons run in a name space while it crashes
or get killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.
Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release cant be used intead sk_release_kernel() should be used.
Thanks Eric W Biederman for your advices.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
The optimizations in commit 255d0dc340
(netfilter: x_table: speedup compat operations) assume that
xt_compat_add_offset is called once per rule.
ebtables however called it for each match/target found in a rule.
The match/watcher/target parser already returns the needed delta, so it
is sufficient to move the xt_compat_add_offset call to a more reasonable
location.
While at it, also get rid of the unused COMPAT iterator macros.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
commit 255d0dc340 (netfilter: x_table: speedup compat operations)
made ebtables not working anymore.
1) xt_compat_calc_jump() is not an exact match lookup
2) compat_table_info() has a typo in xt_compat_init_offsets() call
3) compat_do_replace() misses a xt_compat_init_offsets() call
Reported-by: dann frazier <dannf@dannf.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes the missing initialization of the start time if
the timestamp support is enabled.
libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp 6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=1303296401 use=2
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Use proper data type to handle get_user_pages_fast error condition. Also
do not treat EFAULT error as fatal.
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
mac_pton() parses MAC address in form XX:XX:XX:XX:XX:XX and only in that form.
mac_pton() doesn't dirty result until it's sure string representation is valid.
mac_pton() doesn't care about characters _after_ last octet,
it's up to caller to deal with it.
mac_pton() diverges from 0/-E return value convention.
Target usage:
if (!mac_pton(str, whatever->mac))
return -EINVAL;
/* ->mac being u8 [ETH_ALEN] is filled at this point. */
/* optionally check str[3 * ETH_ALEN - 1] for termination */
Use mac_pton() in pktgen and netconsole for start.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At VLAN dismantle phase, unregister_vlan_dev() makes one
synchronize_net() call after vlan_group_set_device(grp, vlan_id, NULL).
This call can be safely removed because we are calling
unregister_netdevice_queue() to queue device for deletion, and this
process needs at least one rcu grace period to complete.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Speedup vlan dismantling in CONFIG_VLAN_8021Q_GVRP=y cases,
by using a call_rcu() to free the memory instead of waiting with
expensive synchronize_rcu() [ while RTNL is held ]
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
veth devices dont use the batched device unregisters yet.
Since veth are a pair of devices, it makes sense to use a batch of two
unregisters, this roughly divides dismantle time by two.
Fix this by changing dellink() callers to always provide a non NULL
head. (Idea from Michał Mirosław)
This patch also handles macvlan case : We now dismantle all macvlans on
top of a lower dev at once.
Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Michał Mirosław <mirqus@gmail.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I messed things up when I converted over to the transport
flow, I passed the ipv4 address value instead of it's address.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This way ip_output.c no longer needs rt->rt_{src,dst}.
We already have these keys sitting, ready and waiting, on the stack or
in a socket structure.
Signed-off-by: David S. Miller <davem@davemloft.net>
We have two cases.
Either the socket is in TCP_ESTABLISHED state and connect() filled
in the inet socket cork flow, or we looked up the route here and
used an on-stack flow.
Track which one it was, and use it to obtain src/dst addrs.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables ethtool to set the loopback mode on a given interface.
By configuring the interface in loopback mode in conjunction with a policy
route / rule, a userland application can stress the egress / ingress path
exposing the flows of the change in progress and potentially help developer(s)
understand the impact of those changes without even sending a packet out
on the network.
Following set of commands illustrates one such example -
a) ip -4 addr add 192.168.1.1/24 dev eth1
b) ip -4 rule add from all iif eth1 lookup 250
c) ip -4 route add local 0/0 dev lo proto kernel scope host table 250
d) arp -Ds 192.168.1.100 eth1
e) arp -Ds 192.168.1.200 eth1
f) sysctl -w net.ipv4.ip_nonlocal_bind=1
g) sysctl -w net.ipv4.conf.all.accept_local=1
# Assuming that the machine has 8 cores
h) taskset 000f netserver -L 192.168.1.200
i) taskset 00f0 netperf -t TCP_CRR -L 192.168.1.100 -H 192.168.1.200 -l 30
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP Cubic keeps a metric that estimates the amount of delayed
acknowledgements to use in adjusting the window. If an abnormally
large number of packets are acknowledged at once, then the update
could wrap and reach zero. This kind of ACK could only
happen when there was a large window and huge number of
ACK's were lost.
This patch limits the value of delayed ack ratio. The choice of 32
is just a conservative value since normally it should be range of
1 to 4 packets.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I don't know why %pI6 doesn't compress, but the format specifier is
kernel-standard, so use it.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows us to acquire the exact route keying information from the
protocol, however that might be managed.
It handles all of the possibilities, from the simplest case of storing
the key in inet->cork.fl to the more complex setup SCTP has where
individual transports determine the flow.
Signed-off-by: David S. Miller <davem@davemloft.net>
Operation order is now transposed, we first create the child
socket then we try to hook up the route.
Signed-off-by: David S. Miller <davem@davemloft.net>
DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.
IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.
This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.
The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
If the sync daemons run in a name space while it crashes
or get killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.
Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release cant be used intead sk_release_kernel() should be used.
Thanks Eric W Biederman for your advices.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
This is just like inet_csk_route_req() except that it operates after
we've created the new child socket.
In this way we can use the new socket's cork flow for proper route
key storage.
This will be used by DCCP and TCP child socket creation handling.
Signed-off-by: David S. Miller <davem@davemloft.net>