Commit graph

361125 commits

Author SHA1 Message Date
Johannes Berg
8d1f7ecd2a mac80211: defer tailroom counter manipulation when roaming
During roaming, the crypto_tx_tailroom_needed_cnt counter
will often take values 2,1,0,1,2 because first keys are
removed and then new keys are added. This is inefficient
because during the 0->1 transition, synchronize_net must
be called to avoid packet races, although typically no
packets would be flowing during that time.

To avoid that, defer the decrement (2->1, 1->0) when keys
are removed (by half a second). This means the counter
will really have the values 2,2,2,3,4 ... 2, thus never
reaching 0 and having to do the 0->1 transition.

Note that this patch entirely disregards the drivers for
which this optimisation was done to start with, for them
the key removal itself will be expensive because it has
to synchronize_net() after the counter is incremented to
remove the key from HW crypto. For them the sequence will
look like this: 0,1,0,1,0,1,0,1,0 (*) which is clearly a
lot more inefficient. This could be addressed separately,
during key removal the 0->1->0 sequence isn't necessary.

(*) it starts at 0 because HW crypto is on, then goes to
    1 when HW crypto is disabled for a key, then back to
    0 because the key is deleted; this happens for both
    keys in the example. When new keys are added, it goes
    to 1 first because they're added in software; when a
    key is moved to hardware it goes back to 0

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:36:00 +01:00
Johannes Berg
a87121051c mac80211: remove IEEE80211_KEY_FLAG_WMM_STA
There's no driver using this flag, so it seems
that all drivers support HW crypto with WMM or
don't support it at all. Remove the flag and
code setting it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:59 +01:00
Stanislaw Gruszka
153a5fc410 mac80211: merge reconfig assign chanctx code
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:58 +01:00
Stanislaw Gruszka
690205f18f mac80211: cleanup suspend/resume on mesh mode
Remove not used any longer suspend/resume code.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:58 +01:00
Stanislaw Gruszka
a61829437e mac80211: cleanup suspend/resume on ibss mode
Remove not used any longer suspend/resume code.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:57 +01:00
Stanislaw Gruszka
9b7d72c104 mac80211: cleanup suspend/resume on managed mode
Remove not used any longer suspend/resume code.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:56 +01:00
Stanislaw Gruszka
12e7f51702 mac80211: cleanup generic suspend/resume procedures
Since now we disconnect before suspend, various code which save
connection state can now be removed from suspend and resume
procedure. Cleanup on resume side is smaller as ieee80211_reconfig()
is also used for H/W restart.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:56 +01:00
Stanislaw Gruszka
8125696991 cfg80211/mac80211: disconnect on suspend
If possible that after suspend, cfg80211 will receive request to
disconnect what require action on interface that was removed during
suspend.

Problem can manifest itself by various warnings similar to below one:

WARNING: at net/mac80211/driver-ops.h:12 ieee80211_bss_info_change_notify+0x2f9/0x300 [mac80211]()
wlan0:  Failed check-sdata-in-driver check, flags: 0x4
Call Trace:
 [<c043e0b3>] warn_slowpath_fmt+0x33/0x40
 [<f83707c9>] ieee80211_bss_info_change_notify+0x2f9/0x300 [mac80211]
 [<f83a660a>] ieee80211_recalc_ps_vif+0x2a/0x30 [mac80211]
 [<f83a6706>] ieee80211_set_disassoc+0xf6/0x500 [mac80211]
 [<f83a9441>] ieee80211_mgd_deauth+0x1f1/0x280 [mac80211]
 [<f8381b36>] ieee80211_deauth+0x16/0x20 [mac80211]
 [<f8261e70>] cfg80211_mlme_down+0x70/0xc0 [cfg80211]
 [<f8264de1>] __cfg80211_disconnect+0x1b1/0x1d0 [cfg80211]

To fix the problem disconnect from any associated network before
suspend. User space is responsible to establish connection again
after resume. This basically need to be done by user space anyway,
because associated stations can go away during suspend (for example
NetworkManager disconnects on suspend and connect on resume by default).

Patch also handle situation when driver refuse to suspend with wowlan
configured and try to suspend again without it.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:55 +01:00
Stanislaw Gruszka
30c97120c6 mac80211: remove napi
Since two years no mac80211 driver implement support for NAPI. Looks
this feature is unneeded, so remove it from generic mac80211 code.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:54 +01:00
Felix Fietkau
098b8afbf2 mac80211/minstrel_ht: fix spacing between sample attempts
A sample attempt should only count in mi->sample_tries if the sample
attempt wasn't skipped based on slower rate criteria.
This patch increases the sampling frequency for potentially desirable
rates and thus enables faster recovery from interference or collisions.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:54 +01:00
Felix Fietkau
965237ab9f mac80211/minstrel_ht: increase sampling frequency of some slower rates
If a rate is below the max_tp_rate, sample it frequently if:
- it is above max_tp_rate2, or
- it is above max_prob_rate and is a candidate for max_prob_rate
  (has fewer streams than max_tp_rate).
This helps the retry chain recover more quickly from bad statistics
caused by collisions or interference, and slightly reduces throughput
fluctuations with higher rates.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:53 +01:00
Felix Fietkau
96d4ac3f2f minstrel_ht: increase sampling frequency
Try to sample all available rates, as sample attempts do not cost much
airtime and are appropriately spaced based on the average A-MPDU length.
This helps with faster recovery on rate fluctuations.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:53 +01:00
Felix Fietkau
a299c6d591 mac80211/minstrel_ht: improve max_prob_rate selection
max_prob_rate should be selected to be very reliable, however limiting
it to single-stream on 3-stream devices is a bit much.
Allow max_prob_rate to use one stream less than the max_tp_rate.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:52 +01:00
Felix Fietkau
ed97a13c54 mac80211/minstrel_ht: improve accuracy of throughput metric at high data rates
At high data rates the average frame transmission durations are small
enough for rounding errors to matter, sometimes causing minstrel to use
slightly lower transmit rates than necessary.
To fix this, change the unit of the duration value to nanoseconds
instead of microseconds, and reorder the multiplications/divisions when
calculating the throughput metric so that they don't overflow or
truncate prematurely.
At 2-stream HT40 this makes TCP throughput a bit more stable.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:51 +01:00
Jouni Malinen
355199e02b cfg80211: Extend support for IEEE 802.11r Fast BSS Transition
Add NL80211_CMD_UPDATE_FT_IES to support update of FT IEs to the WLAN
driver and NL80211_CMD_FT_EVENT to send FT events from the WLAN driver.
This will carry the target AP's MAC address along with the relevant
Information Elements. This event is used to report received FT IEs
(MDIE, FTIE, RSN IE, TIE, RICIE). These changes allow FT to be supported
with drivers that use an internal SME instead of user space option (like
FT implementation in wpa_supplicant with mac80211-based drivers).

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:51 +01:00
Johannes Berg
723d568aa5 cfg80211: prohibit zero keepalive interval
It's not useful to specify a 0 keepalive interval, this
would send too much data. Prohibit this to also avoid
device issues.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:50 +01:00
Ilan Peer
d339d5ca8e mac80211: Allow drivers to differentiate between ROC types
Some devices can handle remain on channel requests differently
based on the request type/priority. Add support to
differentiate between different ROC types, i.e., indicate that
the ROC is required for sending managment frames.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:49 +01:00
Johannes Berg
f62fab735e cfg80211: refactor association parameters
cfg80211_mlme_assoc() has grown far too many arguments,
make the caller build almost all of the driver struct
and pass that to the function instead.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:49 +01:00
Johannes Berg
dd5ecfeac8 mac80211: support VHT capability overrides
Support the cfg80211 API to override VHT capabilities
on association.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:48 +01:00
Johannes Berg
ee2aca343c cfg80211: add ability to override VHT capabilities
For testing it's sometimes useful to be able to
override certain VHT capability advertisement,
add the ability to do that in cfg80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:47 +01:00
Johannes Berg
c8bb93f5f5 wireless: remove unused VHT MCS defines
There's an enum with the same values (but slightly
different names except for NOT_SUPPORTED) that is
actually used, so remove the defines.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:47 +01:00
Johannes Berg
947add36ca cfg80211: move exported event functions into nl80211
This is the sort of thing gcc's LTO could do, but since
we don't have that yet we can also do it manually. The
advantage is reduced code, both source and binary, e.g.
on x86-64

   text	   data	    bss	    dec	    hex	filename
 442825	  56230	    776	 499831	  7a077	cfg80211.ko (before)
 441585	  56230	    776	 498591	  79b9f	cfg80211.ko (after)

a reduction of ~1k.

But in order to not complicate the code move only those
functions that are simple wrappers, not those that have
functionality of their own.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:46 +01:00
Johannes Berg
fe1abafd94 nl80211: re-add channel width and extended capa advertising
Add back the channel width and extended capability data
to wiphy information if split information is supported.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:45 +01:00
Felix Fietkau
b8a31c9a5a ieee80211: mark 802.11 related structs as being 2-byte aligned
Regardless of what header features they use, or if they align the IP
header or not, 802.11 packets from all drivers guarantee a 2-byte
alignment (and there's a debug WARN_ON in case they don't).

Annotate packet structs with __aligned(2) to allow the compiler to use
16-bit load/store operations on platforms with extremely inefficient
unaligned access (e.g. MIPS).

This reduces code size and improves performance on affected platforms
and causes no binary code change on others.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:45 +01:00
Johannes Berg
9a886586c8 wireless: move sequence number arithmetic to ieee80211.h
Move the sequence number arithmetic code from mac80211 to
ieee80211.h so others can use it. Also rename the functions
from _seq to _sn, they operate on the sequence number, not
the sequence_control field.

Also move macros to convert the sequence control to/from
the sequence number value from various drivers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:44 +01:00
Johannes Berg
b56cf72083 nl80211: conditionally add back TCP WoWLAN information
Add back the previously removed TCP WoWLAN information,
but only if userspace is prepared to deal with large
wiphy capability data dumps.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:43 +01:00
Johannes Berg
cdc89b97bf nl80211: conditionally add back radar information
If userspace is updated to deal with large split wiphy
information dumps, add back the radar information that
could otherwise push the data over the limit of the
netlink dump messages.

Cc: Simon Wunderlich <simon.wunderlich@s2003.tu-chemnitz.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:43 +01:00
Johannes Berg
3713b4e364 nl80211: allow splitting wiphy information in dumps
The per-wiphy information is getting large, to the point
where with more than the typical number of channels it's
too large and overflows, and userspace can't get any of
the information at all.

To address this (in a way that doesn't require making all
messages bigger) allow userspace to specify that it can
deal with wiphy information split across multiple parts
of the dump, and if it can split up the data. This also
splits up each channel separately so an arbitrary number
of channels can be supported.

Additionally, since GET_WIPHY has the same problem, add
support for filtering the wiphy dump and get information
for a single wiphy only, this allows userspace apps to
use dump in this case to retrieve all data from a single
device.

As userspace needs to know if all this this is supported,
add a global nl80211 feature set and include a bit for
this behaviour in it.

Cc: Dennis H Jensen <dennis.h.jensen@siemens.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:42 +01:00
Johannes Berg
191922cd4b mac80211: clarify alignment comment
The comment says something about __skb_push(), but that
isn't even called in the code any more. Looking at the
git history, that comment never even made sense when it
was still called, so just replace that part to note it
still works even when align isn't 0 or 2.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:41 +01:00
Sachin Kamat
9fed3096d7 net: rfkill: Fix sparse warning in rfkill-regulator.c
'rfkill_regulator_ops' is used only in this file. Hence make it static.
Silences the following warning:
net/rfkill/rfkill-regulator.c:54:19: warning:
symbol 'rfkill_regulator_ops' was not declared. Should it be static?

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:41 +01:00
Johannes Berg
77ee7c891a cfg80211: comprehensively check station changes
The station change API isn't being checked properly before
drivers are called, and as a result it is difficult to see
what should be allowed and what not.

In order to comprehensively check the API parameters parse
everything first, and then have the driver call a function
(cfg80211_check_station_change()) with the additionally
information about the kind of station that is being changed;
this allows the function to make better decisions than the
old code could.

While at it, also add a few checks, particularly in mesh
and clarify the TDLS station lifetime in documentation.

To be able to reduce a few checks, ignore any flag set bits
when the mask isn't set, they shouldn't be applied then.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:40 +01:00
Johannes Berg
ff276691e9 cfg80211: unify station WME parsing
Instead of copying the code, create a new function
to parse the station's WME information.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:39 +01:00
Johannes Berg
984c311b09 cfg80211: clean up station WME attribute parsing
Parse the attributes first, and then disable the apply
flag if needed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:39 +01:00
Johannes Berg
2c1aabf33d cfg80211: constify station parameter pointers
All the pointers point right into the skb data and
not to anything that would be useful to change, so
make them const.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:38 +01:00
Johannes Berg
f8bacc2104 cfg80211: clean up mesh plink station change API
Make the ability to leave the plink_state unchanged not use a
magic -1 variable that isn't in the enum, but an explicit change
flag; reject invalid plink states or actions and move the needed
constants for plink actions to the right header file. Also
reject plink_state changes for non-mesh interfaces.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:35:37 +01:00
Johannes Berg
c0f3a317f2 Merge remote-tracking branch 'mac80211/master' into HEAD
There are a few things that would otherwise conflict.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-03-06 16:33:12 +01:00
John W. Linville
32cdd592b7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-03-06 10:21:17 -05:00
Gavin Shan
66d29cbc59 benet: Wait f/w POST until timeout
While PCI card faces EEH errors, reset (usually hot reset) is
expected to recover from the EEH errors. After EEH core finishes
the reset, the driver callback (be_eeh_reset) is called and wait
the firmware to complete POST successfully. The original code would
return with error once detecting failure during POST stage. That
seems not enough.

The patch forces the driver (be_eeh_reset) to wait the firmware
completes POST until timeout, instead of returning error upon
detection POST failure immediately. Also, it would improve the
reliability of the EEH funtionality of the driver.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Acked-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:06 -05:00
David Ward
fa2b04f450 net/ipv4: Timestamp option cannot overflow with prespecified addresses
When a router forwards a packet that contains the IPv4 timestamp option,
if there is no space left in the option for the router to add its own
timestamp, then the router increments the Overflow value in the option.

However, if the addresses of the routers are prespecified in the option,
then the overflow condition cannot happen: the option is structured so
that each prespecified router has a place to write its timestamp. Other
routers do not add a timestamp, so there will never be a lack of space.

This fix ensures that the Overflow value in the IPv4 timestamp option is
not incremented when the addresses of the routers are prespecified, even
if the Pointer value is greater than the Length value.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:06 -05:00
Eric Dumazet
d1f41b67ff net: reduce net_rx_action() latency to 2 HZ
We should use time_after_eq() to get maximum latency of two ticks,
instead of three.

Bug added in commit 24f8b2385 (net: increase receive packet quantum)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:06 -05:00
Randy Dunlap
691b3b7e13 net: fix new kernel-doc warnings in net core
Fix new kernel-doc warnings in net/core/dev.c:

Warning(net/core/dev.c:4788): No description found for parameter 'new_carrier'
Warning(net/core/dev.c:4788): Excess function parameter 'new_carries' description in 'dev_change_carrier'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:06 -05:00
Zang MingJie
88c4c066c6 reset nf before xmit vxlan encapsulated packet
We should reset nf settings bond to the skb as ipip/ipgre do.

If not, the conntrack/nat info bond to the origin packet may continually
redirect the packet to vxlan interface causing a routing loop.

this is the scenario:

     VETP     VXLAN Gateway
    /----\  /---------------\
    |    |  |               |
    |  vx+--+vx --NAT-> eth0+--> Internet
    |    |  |               |
    \----/  \---------------/

when there are any packet coming from internet to the vetp, there will be lots
of garbage packets coming out the gateway's vxlan interface, but none actually
sent to the physical interface, because they are redirected back to the vxlan
interface in the postrouting chain of NAT rule, and dmesg complains:

    Mar  1 21:52:53 debian kernel: [ 8802.997699] Dead loop on virtual device vxlan0, fix it urgently!
    Mar  1 21:52:54 debian kernel: [ 8804.004907] Dead loop on virtual device vxlan0, fix it urgently!
    Mar  1 21:52:55 debian kernel: [ 8805.012189] Dead loop on virtual device vxlan0, fix it urgently!
    Mar  1 21:52:56 debian kernel: [ 8806.020593] Dead loop on virtual device vxlan0, fix it urgently!

the patch should fix the problem

Signed-off-by: Zang MingJie <zealot0630@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Paolo Valente
76e4cb0d3a pkt_sched: sch_qfq: remove a useless invocation of qfq_update_eligible
QFQ+ can select for service only 'eligible' aggregates, i.e.,
aggregates that would have started to be served also in the emulated
ideal system.  As a consequence, for QFQ+ to be work conserving, at
least one of the active aggregates must be eligible when it is time to
choose the next aggregate to serve.

The set of eligible aggregates is updated through the function
qfq_update_eligible(), which does guarantee that, after its
invocation, at least one of the active aggregates is eligible.
Because of this property, this function is invoked in
qfq_deactivate_agg() to guarantee that at least one of the active
aggregates is still eligible after an aggregate has been deactivated.
In particular, the critical case is when there are other active
aggregates, but the aggregate being deactivated happens to be the only
one eligible.

However, this precaution is not needed for QFQ+ to be work conserving,
because update_eligible() is always invoked also at the beginning of
qfq_choose_next_agg(). This patch removes the additional invocation of
update_eligible() in qfq_deactivate_agg().

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Paolo Valente
40dd2d5461 pkt_sched: sch_qfq: do not allow virtual time to jump if an aggregate is in service
By definition of (the algorithm of) QFQ+, the system virtual time must
be pushed up only if there is no 'eligible' aggregate, i.e. no
aggregate that would have started to be served also in the ideal
system emulated by QFQ+.  QFQ+ serves only eligible aggregates, hence
the aggregate currently in service is eligible.  As a consequence, to
decide whether there is no eligible aggregate, QFQ+ must also check
whether there is no aggregate in service.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Paolo Valente
a0143efa96 pkt_sched: sch_qfq: prevent budget from wrapping around after a dequeue
Aggregate budgets are computed so as to guarantee that, after an
aggregate has been selected for service, that aggregate has enough
budget to serve at least one maximum-size packet for the classes it
contains. For this reason, after a new aggregate has been selected
for service, its next packet is immediately dequeued, without any
further control.

The maximum packet size for a class, lmax, can be changed through
qfq_change_class(). In case the user sets lmax to a lower value than
the the size of some of the still-to-arrive packets, QFQ+ will
automatically push up lmax as it enqueues these packets.  This
automatic push up is likely to happen with TSO/GSO.

In any case, if lmax is assigned a lower value than the size of some
of the packets already enqueued for the class, then the following
problem may occur: the size of the next packet to dequeue for the
class may happen to be larger than lmax, after the aggregate to which
the class belongs has been just selected for service. In this case,
even the budget of the aggregate, which is an unsigned value, may be
lower than the size of the next packet to dequeue. After dequeueing
this packet and subtracting its size from the budget, the latter would
wrap around.

This fix prevents the budget from wrapping around after any packet
dequeue.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Paolo Valente
2f3b89a1fe pkt_sched: sch_qfq: serve activated aggregates immediately if the scheduler is empty
If no aggregate is in service, then the function qfq_dequeue() does
not dequeue any packet. For this reason, to guarantee QFQ+ to be work
conserving, a just-activated aggregate must be set as in service
immediately if it happens to be the only active aggregate.
This is done by the function qfq_enqueue().

Unfortunately, the function qfq_add_to_agg(), used to add a class to
an aggregate, does not perform this important additional operation.
In particular, if: 1) qfq_add_to_agg() is invoked to complete the move
of a class from a source aggregate, becoming, for this move, inactive,
to a destination aggregate, becoming instead active, and 2) the
destination aggregate becomes the only active aggregate, then this
aggregate is not however set as in service. QFQ+ remains then in a
non-work-conserving state until a new invocation of qfq_enqueue()
recovers the situation.

This fix solves the problem by moving the logic for setting an
aggregate as in service directly into the function qfq_activate_agg().
Hence, from whatever point qfq_activate_aggregate() is invoked, QFQ+
remains work conserving.  Since the more-complex logic of this new
version of activate_aggregate() is not necessary, in qfq_dequeue(), to
reschedule an aggregate that finishes its budget, then the aggregate
is now rescheduled by invoking directly the functions needed.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Paolo Valente
624b85fb96 pkt_sched: sch_qfq: fix the update of eligible-group sets
Between two invocations of make_eligible, the system virtual time may
happen to grow enough that, in its binary representation, a bit with
higher order than 31 flips. This happens especially with
TSO/GSO. Before this fix, the mask used in make_eligible was computed
as (1UL<<index_of_last_flipped_bit)-1, whose value is well defined on
a 64-bit architecture, because index_of_flipped_bit <= 63, but is in
general undefined on a 32-bit architecture if index_of_flipped_bit > 31.
The fix just replaces 1UL with 1ULL.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Paolo Valente
9b99b7e90b pkt_sched: sch_qfq: properly cap timestamps in charge_actual_service
QFQ+ schedules the active aggregates in a group using a bucket list
(one list per group). The bucket in which each aggregate is inserted
depends on the aggregate's timestamps, and the number
of buckets in a group is enough to accomodate the possible (range of)
values of the timestamps of all the aggregates in the group. For this
property to hold, timestamps must however be computed correctly.  One
necessary condition for computing timestamps correctly is that the
number of bits dequeued for each aggregate, while the aggregate is in
service, does not exceed the maximum budget budgetmax assigned to the
aggregate.

For each aggregate, budgetmax is proportional to the number of classes
in the aggregate. If the number of classes of the aggregate is
decreased through qfq_change_class(), then budgetmax is decreased
automatically as well.  Problems may occur if the aggregate is in
service when budgetmax is decreased, because the current remaining
budget of the aggregate and/or the service already received by the
aggregate may happen to be larger than the new value of budgetmax.  In
this case, when the aggregate is eventually deselected and its
timestamps are updated, the aggregate may happen to have received an
amount of service larger than budgetmax.  This may cause the aggregate
to be assigned a higher virtual finish time than the maximum
acceptable value for the last bucket in the bucket list of the group.

This fix introduces a cap that addresses this issue.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Peter Hurley
f74861ca87 net/irda: Raise dtr in non-blocking open
DTR/RTS need to be raised, regardless of the open() mode, but not
if the port has already shutdown.

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:05 -05:00
Peter Hurley
0b176ce3a7 net/irda: Use barrier to set task state
Without a memory and compiler barrier, the task state change
can migrate relative to the condition testing in a blocking loop.
However, the task state change must be visible across all cpus
prior to testing those conditions. Failing to do this can result
in the familiar 'lost wakeup' and this task will hang until killed.

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-06 02:47:04 -05:00