EE is hard-disabled on entry to kvmppc_handle_exit(), so call
hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled
is unset.
Without this, we get warnings such as arch/powerpc/kernel/time.c:300,
and sometimes host kernel hangs.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
The previous patch made 64-bit booke KVM build again, but Altivec
support is still not complete, and we can't prevent the guest from
turning on Altivec (which can corrupt host state until state
save/restore is implemented). Disable e6500 on KVM until this is
fixed.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Interrupt numbers defined for Book3E follows IVORs definition. Align
BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this
rule which also fixes the build breakage.
IVORs 32 and 33 are shared so reflect this in the interrupts naming.
This fixes a build break for 64-bit booke KVM.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
This patch makes legacy code on suspend/resume path being executed
conditionally, on non-DT platforms only, to fix suspend/resume of
DT-enabled systems, for which the code is inappropriate.
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
[olof: add #include <linux/of.h>]
Signed-off-by: Olof Johansson <olof@lixom.net>
The API requires that the GET_ONE_REG and SET_ONE_REG ioctls have this
extra information encoded in the register identifiers.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
We use 0x7000000000000000ULL as 0x6000000000000000ULL is reserved for
ARM64.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
In prima2, some functions of checking DT is registered in initcall
level. If it doesn't match the compatible name of sirf, kernel
will panic. It blocks the usage of multiplatform on other verndor.
The error message is in below.
Knic - not syncing: unable to find compatible pwrc node in dtb
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc3-00006-gd7f26ea-dirty #86
[<c0013adc>] (unwind_backtrace+0x0/0xf8) from [<c0011430>] (show_stack+0x10/0x1)
[<c0011430>] (show_stack+0x10/0x14) from [<c026f724>] (panic+0x90/0x1e8)
[<c026f724>] (panic+0x90/0x1e8) from [<c03267fc>] (sirfsoc_of_pwrc_init+0x24/0x)
[<c03267fc>] (sirfsoc_of_pwrc_init+0x24/0x58) from [<c0320864>] (do_one_initcal)
[<c0320864>] (do_one_initcall+0x90/0x150) from [<c0320a20>] (kernel_init_freeab)
[<c0320a20>] (kernel_init_freeable+0xfc/0x1c4) from [<c026b9e8>] (kernel_init+0)
[<c026b9e8>] (kernel_init+0x8/0xe4) from [<c000e158>] (ret_from_fork+0x14/0x3c)
Signen-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Filters need to be translated to real BPF code for userland, like SO_GETFILTER.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to the following commits:
commit 00f97da17a (netpoll: fix position of network header)
commit 525cebedb3 (pktgen: Fix position of ip and udp header)
using skb_tail_offset() seems not correct since the offset
is based on head pointer.
With the last caller removed, skb_tail_offset() can be killed
finally.
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Daniel Borkmann <dborkmann@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliezer Tamir says:
====================
This patch set adds the ability for the socket layer code to
poll directly on an Ethernet device's RX queue.
This eliminates the cost of the interrupt and context switch
and with proper tuning allows us to get very close to the HW latency.
This is a follow up to Jesse Brandeburg's Kernel Plumbers talk from
last year
http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/09/2012-lpc-Low-Latency-Sockets-slides-brandeburg.pdf
Patch 1 adds a napi_id and a hashing mechanism to lookup a napi by id.
Patch 2 adds an ndo_ll_poll method and the code that supports it.
Patch 3 adds support for busy-polling on UDP sockets.
Patch 4 adds support for TCP.
Patch 5 adds the ixgbe driver code implementing ndo_ll_poll.
Patch 6 adds additional statistics to the ixgbe driver for ndo_ll_poll.
Performance numbers:
setup TCP_RR UDP_RR
kernel Config C3/6 rx-usecs tps cpu% S.dem tps cpu% S.dem
patched optimized on 100 87k 3.13 11.4 94K 3.17 10.7
patched optimized on 0 71k 3.12 14.0 84k 3.19 12.0
patched optimized on adaptive 80k 3.13 12.5 90k 3.46 12.2
patched typical on 100 72 3.13 14.0 79k 3.17 12.8
patched typical on 0 60k 2.13 16.5 71k 3.18 14.0
patched typical on adaptive 67k 3.51 16.7 75k 3.36 14.5
3.9 optimized on adaptive 25k 1.0 12.7 28k 0.98 11.2
3.9 typical off 0 48k 1.09 7.3 52k 1.11 4.18
3.9 typical 0ff adaptive 35k 1.12 4.08 38k 0.65 5.49
3.9 optimized off adaptive 40k 0.82 4.83 43k 0.70 5.23
3.9 optimized off 0 57k 1.17 4.08 62k 1.04 3.95
Test setup details:
Machines: each with two Intel Xeon 2680 CPUs and X520 (82599) optical
NICs
Tests: Netperf tcp_rr and udp_rr, 1 byte (round trips per second)
Kernel: unmodified 3.9 and patched 3.9
Config: typical is derived from RH6.2, optimized is a stripped down
config.
Interrupt coalescing (ethtool rx-usecs) settings: 0=off, 1=adaptive,
100 us
When C3/6 states were turned on (via BIOS) the performance governor
was used.
These performance numbers were measured with v2 of the patch set.
Performance of the optimized config with an rx-usecs setting of 100
(the first line in the table above) was tracked during the evolution
of the patches and has never varied by more than 1%.
Design:
A global hash table that allows us to look up a struct napi by a
unique id was added.
A napi_id field was added both to struct sk_buff and struct sk.
This is used to track which NAPI we need to poll for a specific
socket.
The device driver marks every incoming skb with this id.
This is propagated to the sk when the socket is looked up in the
protocol handler.
When the socket code does not find any more data on the socket queue,
it now may call ndo_ll_poll which will crank the device's rx queue and
feed incoming packets to the stack directly from the context of the
socket.
A sysctl value (net.core4.low_latency_poll) controls how many
microseconds we busy-wait before giving up. (setting to 0 globally
disables busy-polling)
Locking:
1. Locking between napi poll and ndo_ll_poll:
Since what needs to be locked between a device's NAPI poll and
ndo_ll_poll, is highly device / configuration dependent, we do this
inside the Ethernet driver.
For example, when packets for high priority connections are sent to
separate rx queues, you might not need locking between napi poll and
ndo_ll_poll at all.
For ixgbe we only lock the RX queue.
ndo_ll_poll does not touch the interrupt state or the TX queues.
(earlier versions of this patchset did touch them,
but this design is simpler and works better.)
If a queue is actively polled by a socket (on another CPU) napi poll
will not service it, but will wait until the queue can be locked
and cleaned before doing a napi_complete().
If a socket can't lock the queue because another CPU has it,
either from napi or from another socket polling on the queue,
the socket code can busy wait on the socket's skb queue.
Ndo_ll_poll does not have preferential treatment for the data from the
calling socket vs. data from others, so if another CPU is polling,
you will see your data on this socket's queue when it arrives.
Ndo_ll_poll is called with local BHs disabled, so it won't race on
the same CPU with net_rx_action, which calls the napi poll method.
2. Napi_hash
The napi hash mechanism uses RCU.
napi_by_id() must be called under rcu_read_lock().
After a call to napi_hash_del(), caller must take care to wait an rcu
grace period before freeing the memory containing the napi struct.
(Ixgbe already had this because the queue vector structure uses rcu to
protect the statistics counters in it.)
how to test:
1. The patchset should apply cleanly to net-next.
(don't forget to configure INET_LL_RX_POLL).
2. The ethtool -c setting for rx-usecs should be on the order of 100.
3. Use ethtool -K to disable GRO and LRO
(You are encouraged to try it both ways. If you find that your
workload
does better with GRO on do tell us.)
4. Sysctl value net.core.low_latency_poll controls how long
(in us) to busy-wait for more data, You are encouraged to play
with this and see what works for you. The default is now 0 so you need
to
set it to turn the feature on. I recommend a value around 50.
4. benchmark thread and IRQ should be bound to separate cores.
Both cores should be on the same CPU NUMA node as the NIC.
When the app and the IRQ run on the same CPU you get a small penalty.
If interrupt coalescing is set to a low value this penalty can be very
large.
5. If you suspect that your machine is not configured properly,
use numademo to make sure that the CPU to memory BW is OK.
numademo 128m memcpy local copy numbers should be more than
8GB/s on a properly configured machine.
Change log:
v10
- removed select/poll support. (we will work on this some more and try again)
v9
- correct sysctl proc_handler, reported by Eric Dumazet and Amir Vadai.
- more int -> bool changes, reported by Eric Dumazet.
- better mask testing in sock_poll(), reported by Eric Dumazet.
v8
- split out udp and select/poll into separate patches.
what used to be patch 2/5 is now three patches.
- type corrections from Amir Vadai and Cong Wang:
one unsigned long that was left when changing to cycles_t
int -> bool
- more detailed patch descriptions.
v7
- suggested by Ben Hutchings and Eric Dumazet:
type fixes, static for globals in net/core.c,
avoid napi_id collisions in napi_hash_add()
v6
- many small fixes suggested by Eric Dumazet:
data locality, typos, documentation
protect napi_hash insert/delete with a spinlock (napi_gen_id is no
longer atomic_t since it's only accessed with the spinlock held.)
- added IPv6 TCP and UDP support (only minimally tested)
v5
- corrections suggested by Ben Hutchings:
fixed typos, moved the config option and sysctl value from IPv4 to net
- moved sk_mark_ll() to the protocol handlers
- removed global id mechanism, replaced with a hashed napi_id.
based on code sample from Eric Dumazet
Note that ixgbe_free_q_vector() already waits an rcu grace period
before freeing the q_vector, so nothing additional needs to be done
when adding a call to napi_hash_del().
- simple poll/select support
v4
- removed separate config option for TCP as suggested Eric Dumazet.
- added linux mib counter for packets received through the low latency path,
as suggested by Andi Kleen.
- re-allow module unloading, remove module param, use a global generation id
instead to prevent the use of a stale napi pointer, as suggested
by Eric Dumazet
- updated Documentation/networking/ip-sysctl.txt text
v3
- coding style changes suggested by Dave Miller
v2
- the sysctl knob is now in microseconds. The default value is now 0 (off).
- for now the code depends at configure time on CONFIG_I86_TSC
- the napi reference in struct skb is now a union with the dma cookie
since the former is only used on RX and the latter on TX,
as suggested by Eric Dumazet.
- we do a better job at honoring non-blocking operations.
- removed busy-polling support for tcp_read_sock()
- remove dynamic disabling of GRO
- coding style fixes
- disallow unloading the device module after the feature has been used
Credit:
Jesse Brandeburg, Arun Chekhov Ilango, Julie Cummings,
Alexander Duyck, Eric Geisler, Jason Neighbors, Yadong Li,
Mike Polehn, Anil Vasudevan, Don Wood
Special thanks for finding bugs in earlier versions:
Willem de Bruijn and Andi Kleen
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add additional statistics to the ixgbe driver for ndo_ll_poll
Defined under LL_EXTENDED_STATS
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the ixgbe driver code implementing ndo_ll_poll.
Adds ndo_ll_poll method and locking between it and the napi poll.
When receiving a packet we use skb_mark_ll to record the napi it came from.
Add each napi to the napi_hash right after netif_napi_add().
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds low latency socket poll support for TCP.
In tcp_v[46]_rcv() add a call to sk_mark_ll() to copy the napi_id
from the skb to the sk.
In tcp_recvmsg(), when there is no data in the socket we busy-poll.
This is a good example of how to add busy-poll support to more protocols.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add upport for busy-polling on UDP sockets.
In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id
from the skb into the sk.
This is done at the earliest possible moment, right after we identify
which socket this skb is for.
In __skb_recv_datagram When there is no data and the user
tries to read we busy poll.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds an ndo_ll_poll method and the code that supports it.
This method can be used by low latency applications to busy-poll
Ethernet device queues directly from the socket code.
sysctl_net_ll_poll controls how many microseconds to poll.
Default is zero (disabled).
Individual protocol support will be added by subsequent patches.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds a napi_id and a hashing mechanism to lookup a napi by id.
This will be used by subsequent patches to implement low latency
Ethernet device polling.
Based on a code sample by Eric Dumazet.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stop machine logic can lock up if all but one of the migration
threads make it through the disable-irq step and the one remaining
thread gets stuck in __do_softirq. The reason __do_softirq can hang is
that it has a bail-out based on jiffies timeout, but in the lockup case,
jiffies itself is not incremented.
To work around this, re-add the max_restart counter in __do_irq and stop
processing irqs after 10 restarts.
Thanks to Tejun Heo and Rusty Russell and others for helping me track
this down.
This was introduced in 3.9 by commit c10d73671a ("softirq: reduce
latencies").
It may be worth looking into ath9k to see if it has issues with its irq
handler at a later date.
The hang stack traces look something like this:
------------[ cut here ]------------
WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
Watchdog detected hard LOCKUP on cpu 2
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11
Call Trace:
<NMI> warn_slowpath_common+0x85/0x9f
warn_slowpath_fmt+0x46/0x48
watchdog_overflow_callback+0x9c/0xa7
__perf_event_overflow+0x137/0x1cb
perf_event_overflow+0x14/0x16
intel_pmu_handle_irq+0x2dc/0x359
perf_event_nmi_handler+0x19/0x1b
nmi_handle+0x7f/0xc2
do_nmi+0xbc/0x304
end_repeat_nmi+0x1e/0x2e
<<EOE>>
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0
---[ end trace 4947dfa9b0a4cec3 ]---
BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
irq event stamp: 835637905
hardirqs last enabled at (835637904): __do_softirq+0x9f/0x257
hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
softirqs last enabled at (5654720): __do_softirq+0x1ff/0x257
softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
CPU 1
Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
RIP: tasklet_hi_action+0xf0/0xf0
Process migration/1
Call Trace:
<IRQ>
__do_softirq+0x117/0x257
irq_exit+0x5f/0xbb
smp_apic_timer_interrupt+0x8a/0x98
apic_timer_interrupt+0x72/0x80
<EOI>
printk+0x4d/0x4f
stop_machine_cpu_stop+0x22c/0x274
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0
Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Pekka Riikonen <priikone@iki.fi>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In (bc6bcb5 netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond
packet boundary), the use of tcp_hdr was introduced. However, we
cannot assume that skb->transport_header is set for non-local packets.
Cc: Florian Westphal <fw@strlen.de>
Reported-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Patrik writes:
Two fixes for memory leaks split into Cedarview and Poulsbo versions,
and a fix for properly setting the pipe base when using fbdev. It's on
my todo-list to start unifying the chips since they are very similar,
but until then I'd like to split them up in case there are side-effects
on Cedarview that I cannot currently test.
airled: Verified pull from github matches what I expected.
* 'gma500-fixes' of git://github.com/patjak/drm-gma500:
drm/gma500/cdv: Fix cursor gem obj referencing on cdv
drm/gma500/psb: Fix cursor gem obj referencing on psb
drm/gma500/cdv: Unpin framebuffer on crtc disable
drm/gma500/psb: Unpin framebuffer on crtc disable
drm/gma500: Add fb gtt offset to fb base
Complier may generate codes that re-read the tun->numqueues during
tun_select_queue(). This may be a race if vlan->numqueues were changed in the
same time and can lead unexpected result (e.g. very huge value).
We need prevent the compiler from generating such codes by adding an
ACCESS_ONCE() to make sure tun->numqueues were only read once.
Bug were introduced by commit c8d68e6be1
(tuntap: multiqueue support).
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we decide not use zero-copy, msg.control should be set to NULL otherwise
macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs
wrongly.
Bug were introduced by commit cedb9bdce0
(vhost-net: skip head management if no outstanding).
This solves the following warnings:
WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]()
Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun]
CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566
Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011
ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48
ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0
ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58
Call Trace:
[<ffffffff81796b73>] dump_stack+0x19/0x1e
[<ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0
[<ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20
[<ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net]
[<ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net]
[<ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net]
[<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<ffffffff81061f46>] kthread+0xc6/0xd0
[<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff817a1aec>] ret_from_fork+0x7c/0xb0
[<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer Broadcom BCM63xx SoCs: 6328, 6362 and 6368 have an integrated switch
which needs to be driven slightly differently from the traditional
external switches. This patch introduces changes in arch/mips/bcm63xx in order
to:
- register a bcm63xx_enetsw driver instead of bcm63xx_enet driver
- update DMA channels configuration & state RAM base addresses
- add a new platform data configuration knob to define the number of
ports per switch/device and force link on some ports
- define the required switch registers
On the driver side, the following changes are required:
- the switch ports need to be polled to ensure the link is up and
running and RX/TX can properly work
- basic switch configuration needs to be performed for the switch to
forward packets to the CPU
- update the MIB counters since the integrated
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current bcm63xx_enet driver always uses bcmenet_shared_base whenever
it needs to access DMA channel configuration space or access the DMA
channel state RAM. Split these register in 3 parts to be more accurate:
- global DMA configuration
- per DMA channel configuration space
- per DMA channel state RAM space
This is preliminary to support new chips where the global DMA
configuration remains the same, but there is a varying number of DMA
channels located at a different memory offset.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the rset_nway ethtool callback which uses libphy generic
autonegotiation restart function.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch reworks the UEFI anti-bricking code, including an effective
reversion of cc5a080c and 31ff2f20. It turns out that calling
QueryVariableInfo() from boot services results in some firmware
implementations jumping to physical addresses even after entering virtual
mode, so until we have 1:1 mappings for UEFI runtime space this isn't
going to work so well.
Reverting these gets us back to the situation where we'd refuse to create
variables on some systems because they classify deleted variables as "used"
until the firmware triggers a garbage collection run, which they won't do
until they reach a lower threshold. This results in it being impossible to
install a bootloader, which is unhelpful.
Feedback from Samsung indicates that the firmware doesn't need more than
5KB of storage space for its own purposes, so that seems like a reasonable
threshold. However, there's still no guarantee that a platform will attempt
garbage collection merely because it drops below this threshold. It seems
that this is often only triggered if an attempt to write generates a
genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to
create a variable larger than the remaining space. This should fail, but if
it somehow succeeds we can then immediately delete it.
I've tested this on the UEFI machines I have available, but I don't have
a Samsung and so can't verify that it avoids the bricking problem.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ]
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
__DECLARE_TRACE_RCU() currently creates an _rcuidle() tracepoint which
may safely be invoked from what RCU considers to be an idle CPU.
However, these _rcuidle() tracepoints may -not- be invoked from the
handler of an irq taken from idle, because rcu_idle_enter() zeroes
RCU's nesting-level counter, so that the rcu_irq_exit() returning to
idle will trigger a WARN_ON_ONCE().
This commit therefore substitutes rcu_irq_enter() for rcu_idle_exit()
and rcu_irq_exit() for rcu_idle_enter() in order to make the _rcuidle()
tracepoints usable from irq handlers as well as from process context.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Pablo Neira Ayuso says:
====================
The following patchset contains four fixes for Netfilter and one fix
for IPVS, they are:
* Fix data leak to user-space via getsockopt IP_VS_SO_GET_DESTS, from
Dan Carpenter.
* Fix xt_TCPMSS if no TCP MSS is specified in syn packets, to avoid the
violation of RFC879, from Phil Oester.
* Fix incomplete dump of objects via nfnetlink_acct and nfnetlink_cttimeout,
from myself.
* Fix missing HW protocol in packets passed to user-space via NFQUEUE,
from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A few nasty issues, particularly a race with the interrupt controller in
the xilinx driver, together with a couple of more minor fixes and a much
needed move of the mailing list away from sourceforge.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRrissAAoJELSic+t+oim99qsQAJZJsEnlp/NOSt7OmPVrXv4c
4nlqSFgqNA0wEwIjgRJgusVQsSPlTzmJpoMJrA1PDSV5lC6ilaqo1z2Nlg6w0PzL
BGJU72vbmfCd9+rIqm5icEZPDG5CajM5fFl6cngNDUhPbOYz1Lo+DHVebkBp9won
SNEG2UM2GL5orEvPNvy6DRrLJLBnfFAtezyC8gcL9Q4wyL2woQAaF3FaytFvL0sz
lehIwejLQgJ9Irzz+ciqQQI+jWyII2pjSWBXb9tmoL1HAeeODrxeH/Qs1dn/s9MQ
3ImWAOhQJCjajBzwRxuvW+I08piAr8ZaMMbVhLBUR2ZuJcjEdcJ2h5zKdU97cv2D
hFS1JJjb1f88iOJMmcApE/NM8FVtJJjit5G6NaNGcEIi2RriTNZ6ZP8bfxvC1/sT
wBXG3C8KTMYKHhYkM0LmBI5deQGgj5Xm1Yq/aFLiil0UCrnxIK/1cWY5UT6EWWpa
/G6UfkE4Vt13leW/xPJtpPM9q2dsSzGFIf108t7DAVIsg2kCKFQ58yLqi4uNNnxl
yqyhW2dCfFmas9+Oef4XdQa4e7SWQ6hCmXx1x4GncY9y/PeJqRmbpq4ROW6HwCgy
URNvbJGoJxv+hzNlpTVI18MEq4ej1dTwJCHnMjtZir6Nt0nbjofxpCplsbRkwN4b
eI0ubMqyFUBpTcXgYPuu
=Lvj8
-----END PGP SIGNATURE-----
Merge tag 'spi-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few nasty issues, particularly a race with the interrupt controller
in the xilinx driver, together with a couple of more minor fixes and a
much needed move of the mailing list away from sourceforge."
* tag 'spi-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: hspi: fixup long delay time
spi: spi-xilinx: Remove ISR race condition
spi: topcliff-pch: fix error return code in pch_spi_probe()
spi: topcliff-pch: Pass correct pointer to free_irq()
spi: Move mailing list to vger
- xen/tmem stopped working after a certain combination of modprobe/swapon was used
- cpu online/offlining would trigger WARN_ON.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRtgQxAAoJEFjIrFwIi8fJ9QYH/ibEGBlKLfGzN4Apx6evBA68
l9CuLIpGCkPOiJLgVs10zY77Sg3f95bHFkJrwrEIDeTQ42joNKybsIp+qZCIS5CW
sAnzSd6Mb8jqQYJpgLl03+z3GdFzILTJ9e0zYTETbW2vfLpAETm87XWH+gjxDK0e
9I0kZX+Q3+2VWi5xv+UwWkYIOLbggLyKYajTHDwWNuC7vQgkJulCAbmjgN/NBv7A
xacvXdbEClIfZ5tvJJN0RggdEWo6WyTxyExfLPmpXFbHEvWDUX5LPEf8MhFY1ORK
U1C0BSV6YuLo350G5lY4I0R75ZEWLeUyVkZFnJIeGnUF1rP6OXEuVWTyeivPDBA=
=Fy36
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen fixes from Konrad Rzeszutek Wilk:
"Two bug-fixes for regressions:
- xen/tmem stopped working after a certain combination of
modprobe/swapon was used
- cpu online/offlining would trigger WARN_ON."
* tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init set it.
xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU.
The biggest fix here is Lars-Peter's fix for custom locking callbacks
which is pretty localised but important for those devices that use the
feature. Otherwise we've got a couple of fairly small cleanups which
would have been sent sooner were it not for letting Lars-Peter's patch
soak for a while.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRtZ9aAAoJELSic+t+oim9FKsP/09KdtRUMGIjJQFX3Lm4bEoU
0ZQhvV6ozQWa+3faKlUeBu61yUx4R+zAgxO3rJHSEZM2+50bTZ9t70InvHOegKU7
ipayoR/4cS23kK0uZJL9TURmoK4JuII+QVlTX/Fn9siJrrFvoSSOHjv7DxGzPZtt
o4xgK+R8WU6zTUVIf2ncHzryaKPMs50tEt7FjPTMn+MetWsn0kgYxk3VLsORQqnL
VJwzusJn/Z+vIy4kr9NAFGV6UljecRrIzFmn98kVHgMW1nEjr3vEE1upP2Y1TrF3
flZqooulg7qlc2bL60/aakWXAWQ33lKZCvtMHykk46cUFN0faSbL3V9wk1N/85AN
n6wHZKHA9APGMeImv1nL+p3DLZkhY3HYpuOgoSP8ocUpZ7y7C0wH+oGBO3nmLGjK
4vEGPClsNFrdeF5viViIQ7ECHd9ls7t182gj3UJJjKPi27N5KppXM2ki3TQZABsS
Q+iZ1dk/GYdyUV2E6LhoBGNnB9v6zo5pi/bABT59KuM0G6SuieRpPbchxo0VIvqY
6cUZrzJ8Kn9VRDIc5/RIgijm9VjhbGGas/gc5nr014IKiSHMjPVXvBQ3NYw7yQw/
SFtLgu2UBb+twQ7TWGK7BG6clHtwvSEjnvCFzPJ1q77c1gXzNfY8mCOYjfyEZ5Tb
JuxD5znZHTvNLo+1QRQh
=mOJz
-----END PGP SIGNATURE-----
Merge tag 'regmap-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"The biggest fix here is Lars-Peter's fix for custom locking callbacks
which is pretty localised but important for those devices that use the
feature. Otherwise we've got a couple of fairly small cleanups which
would have been sent sooner were it not for letting Lars-Peter's patch
soak for a while"
* tag 'regmap-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: rbtree: Fixed node range check on sync
regmap: regcache: Fixup locking for custom lock callbacks
regmap: debugfs: Check return value of regmap_write()
Pull crypto fixes from Herbert Xu:
"This fixes a build problem in sahara and temporarily disables two new
optimisations because of performance regressions until a permanent fix
is ready"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sahara - fix building as module
crypto: blowfish - disable AVX2 implementation
crypto: twofish - disable AVX2 implementation
Support for rt2800 device is broken since my
'rt2x00: rt2x00dev: use rt2x00dev->tx->limit'
patch. The changelog of that commit says that the
TX data queue is initialized already when the
rt2x00lib_probe_hw() function is called.
However as Jakub noticed it, this statement is not
correct. The queue->limit field is initialized in
the rt2x00queue_alloc_entries routine and that is
not yet called when rt2x00lib_probe_hw() runs.
Because the value of tx->limit contains zero, the
driver tries to allocate a kernel fifo with zero
size and kfifo_alloc rejects that with -EINVAL.
PCI: Enabling device 0000:01:00.0 (0000 -> 0002)
ieee80211 phy1: rt2x00_set_rt: Info - RT chipset 3071, rev 021c detected
ieee80211 phy1: rt2x00_set_rf: Info - RF chipset 0008 detected
ieee80211 phy1: rt2x00lib_probe_dev: Error - Failed to initialize hw
rt2800pci: probe of 0000:01:00.0 failed with error -22
Move the data_queue field initialization from
the rt2x00queue_alloc_entries routine into the
rt2x00queue_init function. The initialization
code is not strictly related to the allocation,
and the change ensures that the queue_data fields
can be used in the probe routines.
The patch also introduces a helper function in
order to be able to get the correct data_queue_desc
structure for a given queue. This helper is only
needed temporarily and it will be removed later.
Reported-by: Jakub Kicinski <moorray@wp.pl>
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
GCC 4.8 is spitting out uninitialized-variable warnings against
"drivers/net/wireless/rtlwifi/rtl8192de/dm.c".
drivers/net/wireless/rtlwifi/rtl8192de/dm.c:941:31:
error: 'ofdm_index_old[1]' may be used uninitialized in this
function [-Werror=maybe-uninitialized]
rtlpriv->dm.ofdm_index[i] = ofdm_index_old[i];
This patch adds initialization to the variable and properly sets its value.
Signed-off-by: Yunlian Jiang <yunlian@google.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Check for allocation failures and return -ENOMEM. The caller
already expects it.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This has only one caller and rates[] is an array with
IEEE80211_TX_MAX_RATES (4) elements.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The optional debugfs interface to the vendor's engineering tools wasn't
bounds checking at all, which made it trivial to perform a buffer
overflow if this interface was compiled in and then explicitly enabled
at runtime.
This patch checks both the length supplied as part of the data to ensure
it is sane, and also the amount of data compared to the remaining buffer
space. If either is too large, fail immediately.
(This bug was spotted by Dan Carpenter <dan.carpenter@oracle.com>)
Signed-off-by: Solomon Peachy <pizza@shaftnet.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
goto after return is wrong.
The other code in this block needs to set an
error value then goto an error release block.
This one doesn't need to release anything and
was likely a copy/paste remainder.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-By: Solomon Peachy <pizza@shaftnet.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Do not use uninitialised termios data to determine when to configure the
device at open.
This also prevents stack data from leaking to userspace in the OOM error
path.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Do not use uninitialised termios data to determine when to configure the
device at open.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Do not use uninitialised termios data to determine when to configure the
device at open.
This also prevents stack data from leaking to userspace.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
arch_ftrace_update_code and ftrace_modify_all_code are only
available if CONFIG_DYNAMIC_FTRACE is selected.
Fixes the following build problem on MIPS randconfig:
arch/mips/kernel/ftrace.c: In function 'arch_ftrace_update_code':
arch/mips/kernel/ftrace.c:31:2: error: implicit declaration of function
'ftrace_modify_all_code' [-Werror=implicit-function-declaration]
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5435/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The kvm_* symbols are only available if KVM is selected.
Fixes the following linking problem on a randconfig:
arch/mips/built-in.o: In function `local_flush_tlb_mm':
(.text+0x18a94): undefined reference to `kvm_local_flush_tlb_all'
arch/mips/built-in.o: In function `local_flush_tlb_range':
(.text+0x18d0c): undefined reference to `kvm_local_flush_tlb_all'
kernel/built-in.o: In function `__schedule':
core.c:(.sched.text+0x2a00): undefined reference to `kvm_local_flush_tlb_all'
mm/built-in.o: In function `use_mm':
(.text+0x30214): undefined reference to `kvm_local_flush_tlb_all'
fs/built-in.o: In function `flush_old_exec':
(.text+0xf0a0): undefined reference to `kvm_local_flush_tlb_all'
make: *** [vmlinux] Error 1
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5437/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Only an interrupt can wake the core from 'wait', enable interrupts
locally before executing 'wait'.
[ralf@linux-mips.org: This leave the race between an interrupt that's
setting TIF_NEED_RESCHEd and entering the WAIT status. but at least it's
going to bring Alchemy back from the dead, so I'm going to apply this
patch.]
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/5408/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 10a7a07713 ("xen: tmem: enable Xen
tmem shim to be built/loaded as a module") allows the tmem module
to be loaded any time. For this work the frontswap API had to
be able to asynchronously to call tmem_frontswap_init before
or after the swap image had been set. That was added in git
commit 905cd0e1bf
("mm: frontswap: lazy initialization to allow tmem backends to build/run as modules").
Which means we could do this (The common case):
modprobe tmem [so calls frontswap_register_ops, no ->init]
modifies tmem_frontswap_poolid = -1
swapon /dev/xvda1 [__frontswap_init, calls -> init, tmem_frontswap_poolid is
< 0 so tmem hypercall done]
Or the failing one:
swapon /dev/xvda1 [calls __frontswap_init, sets the need_init bitmap]
modprobe tmem [calls frontswap_register_ops, -->init calls, finds out
tmem_frontswap_poolid is 0, does not make a hypercall.
Later in the module_init, sets tmem_frontswap_poolid=-1]
Which meant that in the failing case we would not call the hypercall
to initialize the pool and never be able to make any frontswap
backend calls.
Moving the frontswap_register_ops after setting the tmem_frontswap_poolid
fixes it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
All architectures must implement IRQ functions. Since various
dependencies on !S390 were removed, there are various drivers that can
be selected but will fail to link. Provide a dummy implementation of
these functions for the !PCI case.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org # 3.9
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>