* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu:
rcu: Prevent RCU callbacks from executing before scheduler initialized
Commit 3fe1698b7f ("sched: Deal with non-atomic min_vruntime reads
on 32bit") forgot to initialize min_vruntime_copy which could lead to
an infinite while loop in task_waking_fair() under some circumstances
(early boot, lucky timing).
[ This bug was also reported by others that blamed it on the RCU
initialization problems ]
Reported-and-tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When receiving the first RX interrupt before the internal call
to napi_schedule_prep is successful the RX interrupt gets disabled
and is never enabled again as the poll function never gets executed.
Signed-off-by: Michael Thalmeier <Michael.Thalmeier@sigmatek.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Coefficients to convert chip register values to voltage/current have been
slightly changed in revision B of the chip datasheet. Update driver coefficients
to match the coefficients in the datasheet.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
The registers and descriptors bits are identical to the pre-8168
8169 chipsets : {RxDesc / TxDesc}.opts2 can only contain VLAN information.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes obsolete code in the initialisation/creation of
slcan devices.
It follows the suggested cleanups from Ilya Matvejchikov in
drivers/net/slip.c that where recently applied to net-next-2.6:
- slip: remove dead code within the slip initialization
- slip: remove redundant check slip_devs for NULL
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5f77898de1 does not completely
fix the problem of handling allocations with irqs disabled.. The
below patch on top of it fixes the problem completely.
Based on review by "Ivan Vecera" <ivecera@redhat.com>..
"
Small note, the root of the problem was that non-atomic allocation was requested with IRQs disabled. Your patch description does not contain wwhy were the IRQs disabled.
The function bnad_mbox_irq_alloc incorrectly uses 'flags' var for two different things, 1) to save current CPU flags and 2) for request_irq
call.
First the spin_lock_irqsave disables the IRQs and saves _all_ CPU flags (including one that enables/disables interrupts) to 'flags'. Then the 'flags' is overwritten by 0 or 0x80 (IRQF_SHARED). Finally the spin_unlock_irqrestore should restore saved flags, but these flags are now either 0x00 or 0x80. The interrupt bit value in flags register on x86 arch is 0x100.
This means that the interrupt bit is zero (IRQs disabled) after spin_unlock_irqrestore so the request_irq function is called with disabled interrupts.
"
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__gfs2_free_data and __gfs2_free_meta are almost identical, and
can be trivially combined.
[This is as per Eric's original patch minus gfs2_free_data() which had
no callers left and plus the conversion of the bmap.c calls to these
functions. All in all, a nice clean up]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This adds S_NOSEC support to GFS2. We set/reset the flag either when
a user calls setattr or when we have just regained the glock
from another node. The flag is only set if there are no xattrs
on the inode and there is no suid bit set.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
This patch is a performance improvement for GFS2 in a clustered
environment. It makes the glock hold time self-adjusting.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch adds a cache for the hash table to the directory code
in order to help simplify the way in which the hash table is
accessed. This is intended to be a first step towards introducing
some performance improvements in the directory code.
There are two follow ups that I'm hoping to see fairly shortly. One
is to simplify the hash table reading code now that we always read the
complete hash table, whether we want one entry or all of them. The
other is to introduce readahead on the heads of the hash chains
which are referred to from the table.
The hash table is a maximum of 128k in size, so it is not worth trying
to read it in small chunks.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
handle_signal()->set_fs() has a nice comment which explains what
set_fs() is, but it doesn't explain why it is needed and why it
depends on CONFIG_X86_64.
Afaics, the history of this confusion is:
1. I guess today nobody can explain why it was needed
in arch/i386/kernel/signal.c, perhaps it was always
wrong. This predates 2.4.0 kernel.
2. then it was copy-and-past'ed to the new x86_64 arch.
3. then it was removed from i386 (but not from x86_64)
by b93b6ca3 "i386: remove unnecessary code".
4. then it was reintroduced under CONFIG_X86_64 when x86
unified i386 and x86_64, because the patch above didn't
touch x86_64.
Remove it. ->addr_limit should be correct. Even if it was possible
that it is wrong, it is too late to fix it after setup_rt_frame().
Linus commented in:
http://lkml.kernel.org/r/alpine.LFD.0.999.0707170902570.19166@woody.linux-foundation.org
... about the equivalent bit from i386:
Heh. I think it's entirely historical.
Please realize that the whole reason that function is called "set_fs()" is
that it literally used to set the %fs segment register, not
"->addr_limit".
So I think the "set_fs(USER_DS)" is there _only_ to match the other
regs->xds = __USER_DS;
regs->xes = __USER_DS;
regs->xss = __USER_DS;
regs->xcs = __USER_CS;
things, and never mattered. And now it matters even less, and has been
copied to all other architectures where it is just totally insane.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20110710164424.GA20261@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
1. do_signal() looks at TS_RESTORE_SIGMASK and calculates the
mask which should be stored in the signal frame, then it
passes "oldset" to the callees, down to setup_rt_frame().
This is ugly, setup_rt_frame() can do this itself and nobody
else needs this sigset_t. Move this code into setup_rt_frame.
2. do_signal() also clears TS_RESTORE_SIGMASK if handle_signal()
succeeds.
We can move this to setup_rt_frame() as well, this avoids the
unnecessary checks and makes the logic more clear.
3. use set_current_blocked() instead of sigprocmask(SIG_SETMASK),
sigprocmask() should be avoided.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20110710182203.GA27979@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
sys_sigsuspend() and sys_sigreturn() change ->blocked directly.
This is not correct, see the changelog in e6fa16ab
"signal: sigprocmask() should do retarget_shared_pending()"
Change them to use set_current_blocked().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20110710192727.GA31759@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
sys32_sigsuspend() and sys32_*sigreturn() change ->blocked directly.
This is not correct, see the changelog in e6fa16ab
"signal: sigprocmask() should do retarget_shared_pending()"
Change them to use set_current_blocked().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20110710192724.GA31755@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Enabling function tracer to trace all functions, then load a module and
then disable function tracing will cause ftrace to fail.
This can also happen by enabling function tracing on the command line:
ftrace=function
and during boot up, modules are loaded, then you disable function tracing
with 'echo nop > current_tracer' you will trigger a bug in ftrace that
will shut itself down.
The reason is, the new ftrace code keeps ref counts of all ftrace_ops that
are registered for tracing. When one or more ftrace_ops are registered,
all the records that represent the functions that the ftrace_ops will
trace have a ref count incremented. If this ref count is not zero,
when the code modification runs, that function will be enabled for tracing.
If the ref count is zero, that function will be disabled from tracing.
To make sure the accounting was working, FTRACE_WARN_ON()s were added
to updating of the ref counts.
If the ref count hits its max (> 2^30 ftrace_ops added), or if
the ref count goes below zero, a FTRACE_WARN_ON() is triggered which
disables all modification of code.
Since it is common for ftrace_ops to trace all functions in the kernel,
instead of creating > 20,000 hash items for the ftrace_ops, the hash
count is just set to zero, and it represents that the ftrace_ops is
to trace all functions. This is where the issues arrise.
If you enable function tracing to trace all functions, and then add
a module, the modules function records do not get the ref count updated.
When the function tracer is disabled, all function records ref counts
are subtracted. Since the modules never had their ref counts incremented,
they go below zero and the FTRACE_WARN_ON() is triggered.
The solution to this is rather simple. When modules are loaded, and
their functions are added to the the ftrace pool, look to see if any
ftrace_ops are registered that trace all functions. And for those,
update the ref count for the module function records.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Both __d_unalias() and __d_materialise_dentry() need loop prevention.
Grab rename_lock in caller, check for loops there...
As a side benefit, we have dentry_lock_for_move() called only under
rename_lock, which seriously reduces deadlock potential of the
execrable "locking order" used for ->d_lock.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that clocksource.archdata is available, use it for ia64-specific
code.
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: linux-ia64@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/d31de0ee0842a0e322fb6441571c2b0adb323fa2.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Fix the condition where PS3 network RX hangs when no network
TX is occurring by calling gelic_card_enable_rxdmac() during
RX_DMA_CHAIN_END event processing.
The gelic hardware automatically clears its RX_DMA_EN flag when
it detects an RX_DMA_CHAIN_END event. In its processing of
RX_DMA_CHAIN_END the gelic driver is required to set RX_DMA_EN
(with a call to gelic_card_enable_rxdmac()) to restart RX DMA
transfers. The existing gelic driver code does not set
RX_DMA_EN directly in its processing of the RX_DMA_CHAIN_END
event, but uses a flag variable card->rx_dma_restart_required
to schedule the setting of RX_DMA_EN until next inside the
interrupt handler.
It seems this delayed setting of RX_DMA_EN causes the hang since
the next RX interrupt after the RX_DMA_CHAIN_END event where
RX_DMA_EN is scheduled to be set will not occur since RX_DMA_EN
was not set. In the case were network TX is occuring, RX_DMA_EN
is set in the next TX interrupt and RX processing continues.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Overview:
Support mapping of priorities to traffic classes and
traffic classes to transmission queues ranges in the net device.
The queue ranges are (count, offset) pairs relating to the txq
array.
This can be done via DCBX negotiation or by kernel.
As a result Enhanced Transmission Selection (ETS) and Priority Flow
Control (PFC) are supported between L2 network traffic classes.
Mapping:
This patch uses the netdev_set_num_tc, netdev_set_prio_tc_map and
netdev_set_tc_queue functions to map priorities to traffic classes
and traffic classes to transmission queue ranges.
This mapping is performed by bnx2x_setup_tc function which is
connected to the ndo_setup_tc.
This function is always called at nic load where by default it
maps all priorities to tc 0, and it may also be called by the
kernel or by the bnx2x upon DCBX negotiation to modify the mapping.
rtnl lock:
When the ndo_setup_tc is called at nic load or by kernel the rtnl
lock is already taken. However, when DCBX negotiation takes place
the lock is not taken. The work is therefore scheduled to be
handled by the sp_rtnl task.
Fastpath:
The fastpath structure of the bnx2x which was previously used
to hold the information of one tx queue and one rx queue was
redesigned to represent multiple tx queues, one for each traffic
class.
The transmission queue supplied in the skb by the kernel can no
longer be interpreted as a straightforward index into the fastpath
structure array, but it must rather be decoded to the appropriate
fastpath index and the tc within that fastpath.
Slowpath:
The bnx2x's queue object was redesigned to accommodate multiple
transmission queues. The queue object's state machine was enhanced
to allow opening multiple transmission-only connections on top of
the regular tx-rx connection.
Firmware:
This feature relies on the tx-only queue feature introduced in the
bnx2x 7.0.23 firmware and the FW likewise must have the bnx2x multi
cos support.
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Renaming the "reset_task" to a more general purpose name,
"sp_rtnl_task", as it is already used for another purpose
other than reset which is parity recovery, and since I
plan to add a third operation for this task, updating the
priority to traffic class and traffic class to transmission
queues mappings after dcbx negotiation takes place.
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no software fallback implemented for SCTP or FCoE checksumming,
and so it should not be passed on by software devices like bridge or bonding.
For VLAN devices, this is different. First, the driver for underlying device
should be prepared to get offloaded packets even when the feature is disabled
(especially if it advertises it in vlan_features). Second, devices under
VLANs do not get replaced without tearing down the VLAN first.
This fixes a mess I accidentally introduced while converting bonding to
ndo_fix_features.
NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
are unused as of commit 712ae51afd.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets to devices without NETIF_F_SCTP_CSUM (including NETIF_F_NO_CSUM)
should be properly checksummed because the packets can be diverted or
rerouted after construction. This still leaves packets diverted from
NETIF_F_SCTP_CSUM-enabled devices with broken checksums. Fixing this
needs implementing software offload fallback in networking core.
For users of sctp_checksum_disable, skb->ip_summed should be left as
CHECKSUM_NONE and not CHECKSUM_UNNECESSARY as per include/linux/skbuff.h.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a trivial typo in the name of the constant
ENERGY_PERF_BIAS_POWERSAVE. This didn't cause trouble because this
constant is not currently used for anything.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
Link: http://lkml.kernel.org/r/tip-abe48b108247e9b90b4c6739662a2e5c765ed114@git.kernel.org
Remove it, as it indirectly exposes netdev features. It's not used in
iproute2 (2.6.38) - is anything else using its interface?
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same information and more can be obtained by using ethtool
with ETHTOOL_GFEATURES.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename probe_* to trace_probe_* for avoiding namespace
confliction. This also fixes improper names of find_probe_event()
and cleanup_all_probes() to find_trace_probe() and
release_all_trace_probes() respectively.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20110627072636.6528.60374.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
It is not used anywhere except net/core/dev.c now.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
vlan_features contains features inherited from underlying device.
NETIF_SOFT_FEATURES are not inherited but belong to the vlan device
itself (ensured in vlan_dev_fix_features()).
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the fact that ORing with zero is a no-op.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove wrong setting of dev->flags. NETIF_F_NO_CSUM maps to IFF_DEBUG
there, so looks like a mistake.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of hw_nmi_watchdog_set_attr() weak function
and appropriate x86_pmu::hw_watchdog_set_attr() call
we introduce even alias mechanism which allow us
to drop this routines completely and isolate quirks
of Netburst architecture inside P4 PMU code only.
The main idea remains the same though -- to allow
nmi-watchdog and perf top run simultaneously.
Note the aliasing mechanism applies to generic
PERF_COUNT_HW_CPU_CYCLES event only because arbitrary
event (say passed as RAW initially) might have some
additional bits set inside ESCR register changing
the behaviour of event and we can't guarantee anymore
that alias event will give the same result.
P.S. Thanks a huge to Don and Steven for for testing
and early review.
Acked-by: Don Zickus <dzickus@redhat.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Stephane Eranian <eranian@google.com>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Set the init value before reset in probe function. And then just
modify the relative bits and keep the init settings.
For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig
needs to be set after the tx/rx is enabled.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Only 8111b needs to enable rx when shutdowning with WoL.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Only 8111E needs enable RxConfig bit 0 ~ 3 when suspending or
shutdowning for wake on lan.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Add the ERI functions which would be used by the new chips.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
- Disable tx and rx by resetting hw, so replace rtl8169_asic_down
with rtl8169_hw_reset.
- RxConfig bits 0 ~ 5 have to be cleared before hw reset to avoid
receiving spurious data.
- Certain chips need to do some checking before reset.
- Remove hw reset which is done before hw_start. It is done in close,
down or device probe functions.
- Move rtl8169_init_ring_indexes function into rtl_hw_reset function.
The indexes of tx and rx only need to be zero when the hw resets.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Currently the stack trace per event in ftace is only 8 frames.
This can be quite limiting and sometimes useless. Especially when
the "ignore frames" is wrong and we also use up stack frames for
the event processing itself.
Change this to be dynamic by adding a percpu buffer that we can
write a large stack frame into and then copy into the ring buffer.
For interrupts and NMIs that come in while another event is being
process, will only get to use the 8 frame stack. That should be enough
as the task that it interrupted will have the full stack frame anyway.
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded. This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file. It also simplifies the code somewhat.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>