While doing stress tests with a disabled IP route cache, I found
__mkroute_output() was touching three times in_device atomic refcount.
Use RCU to touch it once to reduce cache line ping pongs.
Before patch
time to perform the test
real 1m42.009s
user 0m12.545s
sys 25m0.726s
Profile :
16109.00 26.4% ip_route_output_slow vmlinux
7434.00 12.2% dst_destroy vmlinux
3280.00 5.4% fib_rules_lookup vmlinux
3252.00 5.3% fib_semantic_match vmlinux
2622.00 4.3% fib_table_lookup vmlinux
2535.00 4.1% dst_alloc vmlinux
1750.00 2.9% _raw_read_lock vmlinux
1532.00 2.5% rt_set_nexthop vmlinux
After patch
real 1m36.503s
user 0m12.977s
sys 23m25.608s
14234.00 22.4% ip_route_output_slow vmlinux
8717.00 13.7% dst_destroy vmlinux
4052.00 6.4% fib_rules_lookup vmlinux
3951.00 6.2% fib_semantic_match vmlinux
3191.00 5.0% dst_alloc vmlinux
1764.00 2.8% fib_table_lookup vmlinux
1692.00 2.7% _raw_read_lock vmlinux
1605.00 2.5% rt_set_nexthop vmlinux
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch restores the below flow control patch submitted by Rémi
Denis-Courmont, which accidentaly got lost due to Pipe controller patch
on Phonet.
commit 1a98214fee
Author: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Date: Mon Aug 30 12:57:03 2010 +0000
Phonet: restore flow control credits when sending fails
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Sachin Sanap <ssanap@marvell.com>
Signed-off-by: Mark Brown <markb@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were lots of places being inconsistent since handle count
looked like a kref but it really wasn't.
Fix this my just making handle count an atomic on the object,
and have it increase the normal object kref.
Now i915/radeon/nouveau drivers can drop the normal reference on
userspace object creation, and have the handle hold it.
This patch fixes a memory leak or corruption on unload, because
the driver had no way of knowing if a handle had been actually
added for this object, and the fbcon object needed to know this
to clean itself up properly.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
S3C2440 style I2C controller uses PCLK to calculate the SDA line delay.
The driver wrongly assumed that this delay is calculated from the
frequency that the controller is operating on. This patch fixes this
issue.
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
create_irq() returns -1 if the interrupt allocation failed, but the
code checks for irq == 0.
Use create_irq_nr() instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Venkatesh Pallipadi <venki@google.com>
LKML-Reference: <alpine.LFD.2.00.1009282310360.2416@localhost6.localdomain6>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
free_irq_cfg() is not freeing the cpumask_vars in irq_cfg. Fixing this
triggers a use after free caused by the fact that copying struct
irq_cfg is done with memcpy, which copies the pointer not the cpumask.
Fix both places.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
LKML-Reference: <alpine.LFD.2.00.1009282052570.2416@localhost6.localdomain6>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
If acpi_evaluate_object() function call doesn't fail, we must kfree()
output.buffer before returning from pcc_cpufreq_do_osc().
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Dave Jones <davej@redhat.com>
acpi_perf_data is a percpu pointer but was missing __percpu markup.
Add it.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Commit c52c2ddc1d ("alpha: switch osf_sigprocmask() to use of
sigprocmask()") had several problems. The more obvious compile issues
got fixed in commit 0f44fbd297 ("alpha: fix compile problem in
arch/alpha/kernel/signal.c"), but it also caused a regression.
Since _BLOCKABLE is already the set of signals that can be blocked, the
code should do "newmask & _BLOCKABLE" rather than inverting _BLOCKABLE
before masking.
Reported-by: Michael Cree <mcree@orcon.net.nz>
Patch-by: Al Viro <viro@zeniv.linux.org.uk>
Patch-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes a resource leak on failure, where the
oprofilefs and some counters may not released properly.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: <stable@kernel.org> # .35.x
LKML-Reference: <20100929145225.GJ13563@erda.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
HARD_TX_LOCK no longer protects tunnels from dead loops,
but xmit_recursion percpu counter.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support building wl1271-equipped boards without building the
wl1271 driver itself, e.g.:
CONFIG_MACH_OMAP_ZOOM3=y
CONFIG_WL12XX is not set
Reported-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
It should be ready after 8 years...remove the experimental dependency.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Print media debug messages only when HW debug is enabled.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following DCA improvements to myri10ge:
1) Finally move myri10ge to use dca3 API
2) Disable PCIe relaxed ordering when enabling DCA on
myri10ge. This provides a performance boost on Nehalem
based Xeons
3) Make sure to properly initialize NIC's DCA state when it is enabled,
rather than giving the NIC a bogus tag (0) and waiting for
the first received packet to trigger an update. Not using a
real tag can cause hardware exceptions on some motherboards
when a CPU socket is empty.
3) Always update the cached CPU when our interrupt affinity changes
so as to avoid excessive calls to dca3_get_tag()
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Loic Prylli <loic@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moved enabling of MSI to the bnx2x_set_num_queues() - the same functions that
handles the initialization of the MSI-X.
From: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The i810 and i830 device drivers may replace their file operations
on an open file descriptor. My previous patch to move the BKL
out of the common DRM code into these drivers only caught the
default file operations, not the ones that actually end up being
used.
Found while trying to come up with a way to kill the BKL for
good in these drivers.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Retrieve the header after doing pskb_may_pull since, pskb_may_pull
could change the buffer structure.
This is based on the comment given by Eric Dumazet on Phonet
Pipe controller patch for a similar problem.
Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
uml_net_set_mac() was broken and luckily it was never used, before.
What it was trying to do is spin_lock before memcopy the mac address.
Linus attempted to fix it in assumption that someone decided the
lock was needed. But since it was never ever used at all, and was
just dead code, I think we can assume that it is not needed, after
all.
On the other hand patch [f25c80a4] was trying to use eth_mac_addr()
in eth_configure(), *which was the real fallout*. Because of state
checks done inside eth_mac_addr() the address was never set. I have
not reintroduced the memcpy wrapper, but I've put a comment for future
cats.
The code now is back to exactly as it was before [f25c80a4]. With
the cleanup applied. If the spin_lock is indeed needed then a contender
should supply a test case that fails, then fix it with the proper
locking, as a separate unrelated patch.
CC: Julia Lawall <julia@diku.dk>
CC: David S. Miller <davem@davemloft.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Al Viro <viro@ZenIV.linux.org.uk>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmaengine: fix interrupt clearing for mv_xor
missing inline keyword for static function in linux/dmaengine.h
dma/shdma: move dereference below the NULL check
ocfs2 fast symlinks are NUL terminated strings stored inline in the
inode data area. However, disk corruption or a local attacker could, in
theory, remove that NUL. Because we're using strlen() (my fault,
introduced in a731d1 when removing vfs_follow_link()), we could walk off
the end of that string.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Cc: stable@kernel.org
There is some confusion with rx_queue name after RPS, and net drivers
private rx_queue fields.
I suggest to rename "struct net_device"->rx_queue to ingress_queue.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets.
Other seldom used fields are kept in netdev->stats structure, possibly
unsafe.
This is a preliminary work to support lockless transmit path, and
correct RX stats, that are already unsafe.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPIP tunnels can benefit from lockless xmits, using NETIF_F_LLTX
Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending
10000000 UDP frames via one ipip tunnel (size:200 bytes per frame)
Before patch :
real 2m53.321s
user 0m10.277s
sys 46m0.597s
After patch:
real 0m32.063s
user 0m9.237s
sys 8m16.255s
Last problem to solve is the contention on dst :
16118.00 28.3% __ip_route_output_key vmlinux
6135.00 10.8% dst_release vmlinux
3220.00 5.6% ip_finish_output vmlinux
2149.00 3.8% ip_route_output_flow vmlinux
1575.00 2.8% ip_append_data vmlinux
1481.00 2.6% ip_push_pending_frames vmlinux
1349.00 2.4% __xfrm_lookup vmlinux
1216.00 2.1% csum_partial_copy_generic vmlinux
1208.00 2.1% udp_sendmsg vmlinux
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GRE tunnels can benefit from lockless xmits, using NETIF_F_LLTX
Note: If tunnels are created with the "oseq" option, LLTX is not
enabled :
Even using an atomic_t o_seq, we would increase chance for packets being
out of order at receiver.
Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending
10000000 UDP frames via one gre tunnel (size:200 bytes per frame)
Before patch :
real 3m0.094s
user 0m9.365s
sys 47m50.103s
After patch:
real 0m29.756s
user 0m11.097s
sys 7m33.012s
Last problem to solve is the contention on dst :
38660.00 21.4% __ip_route_output_key vmlinux
20786.00 11.5% dst_release vmlinux
14191.00 7.8% __xfrm_lookup vmlinux
12410.00 6.9% ip_finish_output vmlinux
4540.00 2.5% ip_push_pending_frames vmlinux
4427.00 2.4% ip_append_data vmlinux
4265.00 2.4% __alloc_skb vmlinux
4140.00 2.3% __ip_local_out vmlinux
3991.00 2.2% dev_queue_xmit vmlinux
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 15fc1f7056 (sit: percpu stats accounting) forgot the fallback
tunnel case (sit0), and can crash pretty fast.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 3c97af99a5 (ipip: percpu stats accounting) forgot the fallback
tunnel case (tunl0), and can crash pretty fast.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As tunnel devices are going to be lockless, we need to make sure a
misconfigured machine wont enter an infinite loop.
Add a percpu variable, and limit to three the number of stacked xmits.
Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Testing on very recent kernel (2.6.36-rc6) made this warning pop:
WARNING: at fs/fs-writeback.c:87 inode_to_bdi+0x65/0x70()
Hardware name:
Dirtiable inode bdi default != sb bdi cifs
...the following patch fixes it and seems to be the obviously correct
thing to do for cifs.
Cc: stable@kernel.org
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Our list of Toshiba Satellite models that require this workaround
is growing -- so invoke the workaround for the entire product line.
https://bugzilla.kernel.org/show_bug.cgi?id=14679
Signed-off-by: Len Brown <len.brown@intel.com>
The src_base and dst_base fields in apei_exec_context are physical
address, so they should be ioremaped before being used in ERST
MOVE_DATA instruction.
Reported-by: Javier Martinez Canillas <martinez.javier@gmail.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
commit 934231de70 fixes an unbalanced
CONFIG_ACPI_PROCFS code block during module initialisation. This
patch fixes similar issue but for the module exit.
Signed-off-by: Luis Henriques <luis.henrix@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_pad.c:432: warning: ‘num_cpus’ may be used uninitialized in this function
gcc 4.4.4 was unable to notice that num_cpus is always set.
Re-arrange the code to un-confuse gcc, and also make
it easier for humans to read....
Signed-off-by: Len Brown <len.browns@intel.com>
In ERST debug/test support patch, a dynamic allocated buffer is
used. The may-failed memory allocation should be tried firstly before
free the previous buffer.
APEI resource management memory allocation related error path is fixed
too.
v2:
- Fix error messages for APEI resources management
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
platform_data in hest_parse_ghes() is used for saving the address of entry
information of erst_tab. When the device is failed to be added, platform_data
will be freed by platform_device_put(). But the value saved in platform_data
should not be freed here. If it is done, it will make system panic.
So I think platform_data should save the address of allocated memory
which saves entry information of erst_tab.
This patch fixed it and I confirmed it on x86_64 next-tree.
v2:
Transport the pointer of hest_hdr to platform_data using
platform_device_add_data()
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
After we ioremap() a new region, we call __acpi_try_ioremap() to
see whether another thread has already mapped the same region.
This check clobbers "vaddr", so compute the return value of
acpi_pre_map() using the ioremap() result "map->vaddr" instead.
v2:
Modified the unsuitable description of patch.
v3:
Removed unlikely() check and made description simpler.
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
On Huang Ying's machine:
erst_tab->header_length == sizeof(struct acpi_table_einj)
but Yinghai reported that on his machine,
erst_tab->header_length == sizeof(struct acpi_table_einj) -
sizeof(struct acpi_table_header)
To make erst table size checking code works on all systems, both
testing are treated as PASS.
Same situation applies to einj_tab->header_length, so corresponding
table size checking is changed in similar way too.
v2:
- Treat both table size as valid
Originally-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>