NETIF_F_HW_L2FW_DOFFLOAD allows upper layer net devices such
as macvlan to use queues in the hardware to directly submit and
receive skbs.
This creates a subtle change in the datapath though. One change
being the skb may no longer use the root devices qdisc.
Because users may not expect this we can't enable the feature
by default unless the hardware can offload all the software
functionality above it. So for now disable it by default and
let users opt in.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When compiling with -Wstrict-prototypes gcc catches a static
I missed.
./ixgbe_main.c:4254: warning: no previous prototype for 'ixgbe_fwd_ring_down'
Reported-by: Phillip Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On e1000_down(), we should ensure every asynchronous work is canceled
before proceeding. Since the watchdog_task can schedule other works
apart from itself, it should be stopped first, but currently it is
stopped after the reset_task. This can result in the following race
leading to the reset_task running after the module unload:
e1000_down_and_stop(): e1000_watchdog():
---------------------- -----------------
cancel_work_sync(reset_task)
schedule_work(reset_task)
cancel_delayed_work_sync(watchdog_task)
The patch moves cancel_delayed_work_sync(watchdog_task) at the beginning
of e1000_down_and_stop() thus ensuring the race is impossible.
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The patch fixes the following lockdep warning, which is 100%
reproducible on network restart:
======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0+ #47 Tainted: GF
-------------------------------------------------------
kworker/1:1/27 is trying to acquire lock:
((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70
but task is already holding lock:
(&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&adapter->mutex){+.+...}:
[<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
[<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390
[<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000]
[<ffffffff8108b972>] process_one_work+0x1d2/0x510
[<ffffffff8108ca80>] worker_thread+0x120/0x3a0
[<ffffffff81092c1e>] kthread+0xee/0x110
[<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
-> #0 ((&(&adapter->watchdog_task)->work)){+.+...}:
[<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
[<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
[<ffffffff8108a5eb>] flush_work+0x3b/0x70
[<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
[<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
[<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
[<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
[<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
[<ffffffff8108b972>] process_one_work+0x1d2/0x510
[<ffffffff8108ca80>] worker_thread+0x120/0x3a0
[<ffffffff81092c1e>] kthread+0xee/0x110
[<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&adapter->mutex);
lock((&(&adapter->watchdog_task)->work));
lock(&adapter->mutex);
lock((&(&adapter->watchdog_task)->work));
*** DEADLOCK ***
3 locks held by kworker/1:1/27:
#0: (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
#1: ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
#2: (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]
stack backtrace:
CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF 3.12.0+ #47
Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501 05/31/2007
Workqueue: events e1000_reset_task [e1000]
ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002
ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8
ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00
Call Trace:
[<ffffffff816b54a2>] dump_stack+0x49/0x5f
[<ffffffff810ba936>] print_circular_bug+0x216/0x310
[<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
[<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
[<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
[<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
[<ffffffff8108a5eb>] flush_work+0x3b/0x70
[<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
[<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
[<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
[<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
[<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
[<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
[<ffffffff8108b972>] process_one_work+0x1d2/0x510
[<ffffffff8108b906>] ? process_one_work+0x166/0x510
[<ffffffff8108ca80>] worker_thread+0x120/0x3a0
[<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0
[<ffffffff81092c1e>] kthread+0xee/0x110
[<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
[<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
[<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
== The issue background ==
The problem occurs, because e1000_down(), which is called under
adapter->mutex by e1000_reset_task(), tries to synchronously cancel
e1000 auxiliary works (reset_task, watchdog_task, phy_info_task,
fifo_stall_task), which take adapter->mutex in their handlers. So the
question is what does adapter->mutex protect there?
The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to
private mutex from rtnl") as a replacement for rtnl_lock() taken in the
asynchronous handlers. It targeted on fixing a similar lockdep warning
issued when e1000_down() was called under rtnl_lock(), and it fixed it,
but unfortunately it introduced the lockdep warning described above.
Anyway, that said the source of this bug is that the asynchronous works
were made to take rtnl_lock() some time ago, so let's look deeper and
find why it was added there.
The rtnl_lock() was added to asynchronous handlers by commit 338c15
("e1000: fix occasional panic on unload") in order to prevent
asynchronous handlers from execution after the module is unloaded
(e1000_down() is called) as it follows from the comment to the commit:
> Net drivers in general have an issue where timers fired
> by mod_timer or work threads with schedule_work are running
> outside of the rtnl_lock.
>
> With no other lock protection these routines are vulnerable
> to races with driver unload or reset paths.
>
> The longer term solution to this might be a redesign with
> safer locks being taken in the driver to guarantee no
> reentrance, but for now a safe and effective fix is
> to take the rtnl_lock in these routines.
I'm not sure if this locking scheme fixed the problem or just made it
unlikely, although I incline to the latter. Anyway, this was long time
ago when e1000 auxiliary works were implemented as timers scheduling
real work handlers in their routines. The e1000_down() function only
canceled the timers, but left the real handlers running if they were
running, which could result in work execution after module unload.
Today, the e1000 driver uses sane delayed works instead of the pair
timer+work to implement its delayed asynchronous handlers, and the
e1000_down() synchronously cancels all the works so that the problem
that commit 338c15 tried to cope with disappeared, and we don't need any
locks in the handlers any more. Moreover, any locking there can
potentially result in a deadlock.
So, this patch reverts commits 0ef4ee and 338c15.
Fixes: 0ef4eedc2e ("e1000: convert to private mutex from rtnl")
Fixes: 338c15e470 ("e1000: fix occasional panic on unload")
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is based on a similar change made to e1000e support in
commit bb9e44d0d0 ("e1000e: prevent oops when adapter is being closed
and reset simultaneously"). The same issue has also been observed
on the older e1000 cards.
Here, we have increased the RESET_COUNT value to 50 because there are too
many accesses to e1000 nic on stress tests to e1000 nic, it is not enough
to set RESET_COUT 25. Experimentation has shown that it is enough to set
RESET_COUNT 50.
Signed-off-by: yzhu1 <yanjun.zhu@windriver.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes Wake on LAN being reported as supported on some Ethernet
ports, in contrary to Hardware capability.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Simplify the code. Avoid race conditions caused by attributes
being created after hwmon device registration. Implicitly
(through hwmon API) add mandatory 'name' sysfs attribute.
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit b268daffdc.
I applied the wrong version of this patch, the proper version
is coming up next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Windows driver will enable ALDPS function, but linux driver and firmware
do not have any configuration related to ALDPS function for 8168g.
So restart system to linux and remove the NIC cable, LAN enter ALDPS,
then LAN RX will be disabled.
This issue can be easily reproduced on dual boot windows and linux
system with RTL_GIGA_MAC_VER_40 chip.
Realtek said, ALDPS function can be disabled by configuring to PHY,
switch to page 0x0A43, reg0x10 bit2=0.
Signed-off-by: David Chang <dchang@suse.com>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent be2net commit 6384a4d (adds a support for busy polling)
introduces a regression that results in kernel crash. It incorrectly
modified be_close() so napi_disable() is called only for the first queue.
This breaks a correct pairing of napi_enable/_disable for the rest
of event queues and causes a crash in subsequent be_open() call.
v2: Applied suggestions from Sathya
Fixes: 6384a4d ("be2net: add support for ndo_busy_poll")
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 55485e7b41.
I applied the wrong version of this patch, the right one is coming up
next.
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent be2net commit 6384a4d (adds a support for busy polling)
introduces a regression that results in kernel crash. It incorrectly
modified be_close() so napi_disable() is called only for the first queue.
This breaks a correct pairing of napi_enable/_disable for the rest
of event queues and causes a crash in subsequent be_open() call.
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2fdac010bd ("via-velocity.c: update napi
implementation") overlooked an irq disabling spinlock when the Rx part
of the NAPI poll handler was converted from netif_rx to netif_receive_skb.
NAPI Rx processing can be taken out of the locked section with a pair of
napi_{disable / enable} since it only races with the MTU change function.
An heavier rework of the NAPI locking would be able to perform NAPI Tx
before Rx where I simply removed one of velocity_tx_srv calls.
References: https://bugzilla.redhat.com/show_bug.cgi?id=1022733
Fixes: 2fdac010bd (via-velocity.c: update napi implementation)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Alex A. Schmidt <aaschmidt1@gmail.com>
Cc: Jamie Heilman <jamie@audible.transient.net>
Cc: Michele Baldessari <michele@acksyn.org>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use new hwmon API to simplify code, provide missing mandatory 'name'
sysfs attribute, and attach hwmon attributes to hwmon device instead
of pci device.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On BE3-R, the PF programs the initial MAC address for its VFs. Doing it again
in VF probe, causes a FW error which although harmless generates
an unnecessary error log message.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is not being set currently. (This field is not applicable for Lancer)
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Interrupts need to be enabled in be_resume, when adapter boots back up from D3cold.
disabling interrupts in be_suspend() just to be symmetric to be_resume().
Signed-off-by: Ravikumar Nelavelli <ravikumar.nelavelli@emulex.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While we're doing this, fix the error code for SIOCSHWTSTAMP ioctl on
non-timestamping hardware.
Compile-tested only.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Pull slave-dmaengine changes from Vinod Koul:
"This brings for slave dmaengine:
- Change dma notification flag to DMA_COMPLETE from DMA_SUCCESS as
dmaengine can only transfer and not verify validaty of dma
transfers
- Bunch of fixes across drivers:
- cppi41 driver fixes from Daniel
- 8 channel freescale dma engine support and updated bindings from
Hongbo
- msx-dma fixes and cleanup by Markus
- DMAengine updates from Dan:
- Bartlomiej and Dan finalized a rework of the dma address unmap
implementation.
- In the course of testing 1/ a collection of enhancements to
dmatest fell out. Notably basic performance statistics, and
fixed / enhanced test control through new module parameters
'run', 'wait', 'noverify', and 'verbose'. Thanks to Andriy and
Linus [Walleij] for their review.
- Testing the raid related corner cases of 1/ triggered bugs in
the recently added 16-source operation support in the ioatdma
driver.
- Some minor fixes / cleanups to mv_xor and ioatdma"
* 'next' of git://git.infradead.org/users/vkoul/slave-dma: (99 commits)
dma: mv_xor: Fix mis-usage of mmio 'base' and 'high_base' registers
dma: mv_xor: Remove unneeded NULL address check
ioat: fix ioat3_irq_reinit
ioat: kill msix_single_vector support
raid6test: add new corner case for ioatdma driver
ioatdma: clean up sed pool kmem_cache
ioatdma: fix selection of 16 vs 8 source path
ioatdma: fix sed pool selection
ioatdma: Fix bug in selftest after removal of DMA_MEMSET.
dmatest: verbose mode
dmatest: convert to dmaengine_unmap_data
dmatest: add a 'wait' parameter
dmatest: add basic performance metrics
dmatest: add support for skipping verification and random data setup
dmatest: use pseudo random numbers
dmatest: support xor-only, or pq-only channels in tests
dmatest: restore ability to start test at module load and init
dmatest: cleanup redundant "dmatest: " prefixes
dmatest: replace stored results mechanism, with uniform messages
Revert "dmatest: append verify result to results"
...
Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"
1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols we passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.
We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.
In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.
2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.
3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.
4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.
5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.
6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.
7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.
8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumzaet, reported by Dave Jones.
9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.
10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.
11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.
12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.
13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.
14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.
15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.
16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.
17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.
18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.
19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.
20) Fix route leak in VTI tunnel transmit, from Fan Du.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...
- Re-enable flow steering verbs with new improved userspace ABI
- Fixes for slow connection due to GID lookup scalability
- IPoIB fixes
- Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
- Further improvements to SRP error handling
- Add new transport type for Cisco usNIC
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABCAAGBQJSil7BAAoJEENa44ZhAt0hbtgP/A+AmUalbOX6ZKzuOFxsrtY2
r55CX9b1JBeFM/Zhn2o6y+81lpCjkckJSggESMe4izNgocGw0nW4vYGN4SBynatj
y8sR9OSn+G3ihuENrzG41MJUGEa5WbcNMy4boN+Oa+qyTlV/WjLR7Fv4WbikK7Wm
o8FNlXiiDhMoGfHHG5J0MD0EQsnxuLDk2XP+ciu4tLtTs+wBka+gFK8WnMvztle3
gTeMNna5ilvCS2fdBxteuPA3KeDnJE9AgJSMJ2a4Rh+DR8uTgWYQ6n7amjmOc546
yhAKkoBkxPE10+Yj82WOPhCFxSeWcuSwJvpgv5dTVZ1XqUUcC1V3TEcZDHmyyHQ7
uPXgS1A+erBW3OYPBjZqtKvnHObscV12fL+rId3vIhcAQIbFroci08ZwPidEYRkn
fvwlEKcrIsBIpRXEyjlFCxsiiDnfq1wC1VayMR3jrIK0P6idf1SXf/geiRp9+RGT
wKUc0j51jvEx29qc65xuhEP9FQV9pCMxyd+FEE0d0KkjMz5hsIkjmcUcBbgF0CGg
GEyDPlgRLv+vmWDGpT8XraaV/0CJOEQDIgB4WSN87/AZ4UoNt7spW2xqsLsp1toy
5e0100tpWUleTPLe/Wig5GtBdagQ2jAUK1+186CP93pFPtkwc4/7X3hyp7qPIPTz
VDvT9DEy6zjSMCLpMcdo
=xxC+
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband/rdma updates from Roland Dreier:
- Re-enable flow steering verbs with new improved userspace ABI
- Fixes for slow connection due to GID lookup scalability
- IPoIB fixes
- Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
- Further improvements to SRP error handling
- Add new transport type for Cisco usNIC
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
IB/core: Re-enable create_flow/destroy_flow uverbs
IB/core: extended command: an improved infrastructure for uverbs commands
IB/core: Remove ib_uverbs_flow_spec structure from userspace
IB/core: Use a common header for uverbs flow_specs
IB/core: Make uverbs flow structure use names like verbs ones
IB/core: Rename 'flow' structs to match other uverbs structs
IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
IB/ucma: Convert use of typedef ctl_table to struct ctl_table
IB/cm: Convert to using idr_alloc_cyclic()
IB/mlx5: Fix page shift in create CQ for userspace
IB/mlx4: Fix device max capabilities check
IB/mlx5: Fix list_del of empty list
IB/mlx5: Remove dead code
IB/core: Encorce MR access rights rules on kernel consumers
IB/mlx4: Fix endless loop in resize CQ
RDMA/cma: Remove unused argument and minor dead code
RDMA/ucma: Discard events for IDs not yet claimed by user space
IB/core: Add Cisco usNIC rdma node and transport types
RDMA/nes: Remove self-assignment from nes_query_qp()
IB/srp: Report receive errors correctly
...
Secondary unicast MAC addresses will get deleted only when the interface is UP.
When the interface is DOWN, though these secondary MAC addresses are unusable
and awaiting to be deleted, cause the firmware to believe that they are being used.
If the user intends to set a MAC address as primary MAC from one of these
secondary MAC addresses, the firmware returns a MAC address Collision error.
Delete these secondary MAC addresses during be_close.
The secondary MAC addresses list will be refreshed during interface open anyway.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver currently requests the firmware to enable rx_interface options
without considering if the interface was created with that capability.
This could cause commands to firmware to fail.
To avoid this, enable only those options on an interface if the interface
was created with that capability.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current driver release rtnl lock in between DCB re-configuration.
As a result, other flows (e.g., mtu config) may enter in between and fail
due to halted tx path for dcb configuration.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During VF load, prior to sending messages on HW channel to PF the VF
checks its bulletin board to see whether the PF indicated it has closed;
If a closed PF is encountered, the VF skips sending the message.
Due to incorrect return values, there's a possible scenario in which the VF
finishes loading "successfully", while the PF hasn't actually fully configured
FW/HW for the VFs supposed configuration.
Once VF tries to send Tx packets, HW will raise an attention (and FW possibly
will start treat the VF as malicious).
The patch fails the loading process in such a scenario.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If chip enters a recovery flow just after the driver issues a DMAE request
the DMAE will timeout. Current code will cause a bnx2x_panic() as a result,
which means interface will no longer be usable (regardless of the recovery
results), as bnx2x_panic() is irreversible for the driver.
As this is a possible flow, the panic should be reached only when driver
is compiled with STOP_ON_ERROR.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While unloading, bnx2x needs to clean the sp_rtnl_state to prevent
configuration made before the unload to be applied afterwards with
stale values.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull dmaengine changes from Dan
1/ Bartlomiej and Dan finalized a rework of the dma address unmap
implementation.
2/ In the course of testing 1/ a collection of enhancements to dmatest
fell out. Notably basic performance statistics, and fixed / enhanced
test control through new module parameters 'run', 'wait', 'noverify',
and 'verbose'. Thanks to Andriy and Linus for their review.
3/ Testing the raid related corner cases of 1/ triggered bugs in the
recently added 16-source operation support in the ioatdma driver.
4/ Some minor fixes / cleanups to mv_xor and ioatdma.
Conflicts:
drivers/dma/dmatest.c
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Pull trivial tree updates from Jiri Kosina:
"Usual earth-shaking, news-breaking, rocket science pile from
trivial.git"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
doc: add missing files to timers/00-INDEX
timekeeping: Fix some trivial typos in comments
mm: Fix some trivial typos in comments
irq: Fix some trivial typos in comments
NUMA: fix typos in Kconfig help text
mm: update 00-INDEX
doc: Documentation/DMA-attributes.txt fix typo
DRM: comment: `halve' -> `half'
Docs: Kconfig: `devlopers' -> `developers'
doc: typo on word accounting in kprobes.c in mutliple architectures
treewide: fix "usefull" typo
treewide: fix "distingush" typo
mm/Kconfig: Grammar s/an/a/
kexec: Typo s/the/then/
Documentation/kvm: Update cpuid documentation for steal time and pv eoi
treewide: Fix common typo in "identify"
__page_to_pfn: Fix typo in comment
Correct some typos for word frequency
clk: fixed-factor: Fix a trivial typo
...
During resume, use for_each_slave to walk the slaves of the cpsw, and
soft-reset each of them. This prevents oopses if there is only one
slave configured.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For archs with pages size of 4K, when the chunk is freed, fwp is not in the
list so avoid attempting to delete it.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Use this new function to make code more comprehensible, since we are
reinitialzing the completion, not initializing.
[akpm@linux-foundation.org: linux-next resyncs]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13)
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes bug 62491 (https://bugzilla.kernel.org/show_bug.cgi?id=62491).
After resuming some users got the following error flooding the kernel log:
alx 0000:02:00.0: invalid PHY speed/duplex: 0xffff
Signed-off-by: Jonas Hahnfeld <linux@hahnjo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver fails to check the results of DMA mapping and results in
the following warning: (with kernel config "CONFIG_DMA_API_DEBUG" enable)
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:937 check_unmap+0x43c/0x7d8()
fec 2188000.ethernet: DMA-API: device driver failed to check map
error[device address=0x00000000383a8040] [size=2048 bytes] [mapped as single]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.17-16827-g9cdb0ba-dirty #188
[<80013c4c>] (unwind_backtrace+0x0/0xf8) from [<80011704>] (show_stack+0x10/0x14)
[<80011704>] (show_stack+0x10/0x14) from [<80025614>] (warn_slowpath_common+0x4c/0x6c)
[<80025614>] (warn_slowpath_common+0x4c/0x6c) from [<800256c8>] (warn_slowpath_fmt+0x30/0x40)
[<800256c8>] (warn_slowpath_fmt+0x30/0x40) from [<8026bfdc>] (check_unmap+0x43c/0x7d8)
[<8026bfdc>] (check_unmap+0x43c/0x7d8) from [<8026c584>] (debug_dma_unmap_page+0x6c/0x78)
[<8026c584>] (debug_dma_unmap_page+0x6c/0x78) from [<8038049c>] (fec_enet_rx_napi+0x254/0x8a8)
[<8038049c>] (fec_enet_rx_napi+0x254/0x8a8) from [<804dc8c0>] (net_rx_action+0x94/0x160)
[<804dc8c0>] (net_rx_action+0x94/0x160) from [<8002c758>] (__do_softirq+0xe8/0x1d0)
[<8002c758>] (__do_softirq+0xe8/0x1d0) from [<8002c8e8>] (do_softirq+0x4c/0x58)
[<8002c8e8>] (do_softirq+0x4c/0x58) from [<8002cb50>] (irq_exit+0x90/0xc8)
[<8002cb50>] (irq_exit+0x90/0xc8) from [<8000ea88>] (handle_IRQ+0x3c/0x94)
[<8000ea88>] (handle_IRQ+0x3c/0x94) from [<8000855c>] (gic_handle_irq+0x28/0x5c)
[<8000855c>] (gic_handle_irq+0x28/0x5c) from [<8000de00>] (__irq_svc+0x40/0x50)
Exception stack(0x815a5f38 to 0x815a5f80)
5f20: 815a5f80 3b9aca00
5f40: 0fe52383 00000002 0dd8950e 00000002 81e7b080 00000000 00000000 815ac4d8
5f60: 806032ec 00000000 00000017 815a5f80 80059028 8041fc4c 60000013 ffffffff
[<8000de00>] (__irq_svc+0x40/0x50) from [<8041fc4c>] (cpuidle_enter_state+0x50/0xf0)
[<8041fc4c>] (cpuidle_enter_state+0x50/0xf0) from [<8041fd94>] (cpuidle_idle_call+0xa8/0x14c)
[<8041fd94>] (cpuidle_idle_call+0xa8/0x14c) from [<8000edac>] (arch_cpu_idle+0x10/0x4c)
[<8000edac>] (arch_cpu_idle+0x10/0x4c) from [<800582f8>] (cpu_startup_entry+0x60/0x130)
[<800582f8>] (cpu_startup_entry+0x60/0x130) from [<80bc7a48>] (start_kernel+0x2d0/0x328)
[<80bc7a48>] (start_kernel+0x2d0/0x328) from [<10008074>] (0x10008074)
---[ end trace c6edec32436e0042 ]---
Because dma-debug add new interfaces to debug dma mapping errors, pls refer
to: http://lwn.net/Articles/516640/
After dma mapping, it must call dma_mapping_error() to check mapping error,
otherwise the map_err_type alway is MAP_ERR_NOT_CHECKED, check_unmap() define
the mapping is not checked and dump the error msg. So,add dma_mapping_error()
checking to fix the WARNING
And RX DMA buffers are used repeatedly and the driver copies it into an skb,
fec_enet_rx() should not map or unmap, use dma_sync_single_for_cpu()/dma_sync_single_for_device()
instead of dma_map_single()/dma_unmap_single().
There have another potential issue: fec_enet_rx() passes the DMA address to __va().
Physical and DMA addresses are *not* the same thing. They may differ if the device
is behind an IOMMU or bounce buffering was required, or just because there is a fixed
offset between the device and host physical addresses. Also fix it in this patch.
=============================================
V2: add net_ratelimit() to limit map err message.
use dma_sync_single_for_cpu() instead of dma_map_single().
fix the issue that pass DMA addresses to __va() to get virture address.
V1: initial send
=============================================
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hwtstamp_ioctl() should validate all fields of hwtstamp_config
before making any changes. Currently it sets the TX configuration
before validating the rx_filter field.
Untested as I don't have a cross-compiler to hand.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cpsw_hwtstamp_ioctl() should validate all fields of hwtstamp_config,
and the hardware version, before making any changes. Currently it
sets the TX configuration before validating the rx_filter field
or that the hardware supports timestamping.
Also correct the error code for hardware versions that don't
support timestamping. ENOTSUPP is used by the NFS implementation
and is not part of userland API; we want EOPNOTSUPP (which glibc
also calls ENOTSUP, with one 'P').
Untested as I don't have a cross-compiler to hand.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Mugunthan V N <mugunthanvnm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac_hwtstamp_ioctl() should validate all fields of hwtstamp_config
before making any changes. Currently it sets the TX configuration
before validating the rx_filter field.
Compile-tested only.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>