Commit graph

496848 commits

Author SHA1 Message Date
Sasha Levin
99531e6063 scsi_debug: use atomic allocation in resp_rsup_opcodes
resp_rsup_opcodes() may get called from atomic context and would need to
use GFP_ATOMIC for allocations:

[ 1237.913419] BUG: sleeping function called from invalid context at mm/slub.c:1262
[ 1237.914865] in_atomic(): 1, irqs_disabled(): 0, pid: 7556, name: trinity-c311
[ 1237.916142] 3 locks held by trinity-c311/7556:
[ 1237.916981] #0: (sb_writers#5){.+.+.+}, at: do_readv_writev (include/linux/fs.h:2346 fs/read_write.c:844)
[ 1237.919713] #1: (&of->mutex){+.+.+.}, at: kernfs_fop_write (fs/kernfs/file.c:297)
[ 1237.922626] Mutex: counter: -1 owner: trinity-c311
[ 1237.924044] #2: (s_active#51){.+.+.+}, at: kernfs_fop_write (fs/kernfs/file.c:297)
[ 1237.925960] Preemption disabled blk_execute_rq_nowait (block/blk-exec.c:95)
[ 1237.927416]
[ 1237.927680] CPU: 24 PID: 7556 Comm: trinity-c311 Not tainted 3.19.0-rc4-next-20150116-sasha-00054-g4ad498c-dirty #1744
[ 1237.929603]  ffff8804fc9d8000 ffff8804d9bc3548 ffffffff9d439fb2 0000000000000000
[ 1237.931097]  0000000000000000 ffff8804d9bc3588 ffffffff9a18389a ffff8804d9bc3598
[ 1237.932466]  ffffffff9a1b1715 ffffffffa15935d8 ffffffff9e6f8cb1 00000000000004ee
[ 1237.933984] Call Trace:
[ 1237.934434] dump_stack (lib/dump_stack.c:52)
[ 1237.935323] ___might_sleep (kernel/sched/core.c:7339)
[ 1237.936259] ? mark_held_locks (kernel/locking/lockdep.c:2549)
[ 1237.937293] __might_sleep (kernel/sched/core.c:7305)
[ 1237.938272] __kmalloc (mm/slub.c:1262 mm/slub.c:2419 mm/slub.c:2491 mm/slub.c:3291)
[ 1237.939137] ? resp_rsup_opcodes (include/linux/slab.h:435 drivers/scsi/scsi_debug.c:1689)
[ 1237.940173] resp_rsup_opcodes (include/linux/slab.h:435 drivers/scsi/scsi_debug.c:1689)
[ 1237.941211] ? add_host_store (drivers/scsi/scsi_debug.c:1584)
[ 1237.942261] scsi_debug_queuecommand (drivers/scsi/scsi_debug.c:5276)
[ 1237.943404] ? blk_rq_map_sg (block/blk-merge.c:254)
[ 1237.944398] ? scsi_init_sgtable (drivers/scsi/scsi_lib.c:1095)
[ 1237.945402] sdebug_queuecommand_lock_or_not (drivers/scsi/scsi_debug.c:5300)
[ 1237.946735] scsi_dispatch_cmd (drivers/scsi/scsi_lib.c:1706)
[ 1237.947720] scsi_queue_rq (drivers/scsi/scsi_lib.c:1996)
[ 1237.948687] __blk_mq_run_hw_queue (block/blk-mq.c:816)
[ 1237.949796] blk_mq_run_hw_queue (block/blk-mq.c:896)
[ 1237.950903] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:154 kernel/locking/spinlock.c:183)
[ 1237.951862] blk_mq_insert_request (block/blk-mq.c:1037)
[ 1237.952876] blk_execute_rq_nowait (block/blk-exec.c:95)
[ 1237.953981] ? lockdep_init_map (kernel/locking/lockdep.c:3034)
[ 1237.954967] blk_execute_rq (block/blk-exec.c:131)
[ 1237.955929] ? blk_rq_bio_prep (block/blk-core.c:2835)
[ 1237.956913] scsi_execute (drivers/scsi/scsi_lib.c:252)
[ 1237.957821] scsi_execute_req_flags (drivers/scsi/scsi_lib.c:281)
[ 1237.958968] scsi_report_opcode (drivers/scsi/scsi.c:956)
[ 1237.960009] sd_revalidate_disk (drivers/scsi/sd.c:2707 drivers/scsi/sd.c:2792)
[ 1237.961139] revalidate_disk (fs/block_dev.c:1081)
[ 1237.962223] sd_rescan (drivers/scsi/sd.c:1532)
[ 1237.963142] scsi_rescan_device (drivers/scsi/scsi_scan.c:1579)
[ 1237.964165] store_rescan_field (drivers/scsi/scsi_sysfs.c:672)
[ 1237.965254] dev_attr_store (drivers/base/core.c:138)
[ 1237.966319] sysfs_kf_write (fs/sysfs/file.c:131)
[ 1237.967289] kernfs_fop_write (fs/kernfs/file.c:311)
[ 1237.968274] do_readv_writev (fs/read_write.c:722 fs/read_write.c:854)
[ 1237.969295] ? __acct_update_integrals (kernel/tsacct.c:145)
[ 1237.970452] ? kernfs_fop_open (fs/kernfs/file.c:271)
[ 1237.971505] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:154 kernel/locking/spinlock.c:183)
[ 1237.972512] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:114 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:140)
[ 1237.973668] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2578 kernel/locking/lockdep.c:2625)
[ 1237.974882] ? trace_hardirqs_on (kernel/locking/lockdep.c:2633)
[ 1237.975850] vfs_writev (fs/read_write.c:893)
[ 1237.976691] SyS_writev (fs/read_write.c:926 fs/read_write.c:917)
[ 1237.977538] system_call_fastpath (arch/x86/kernel/entry_64.S:423)

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-01-20 09:04:21 +01:00
Fabio Estevam
7ecd0bde5b ARM: dts: imx25: Fix PWM "per" clocks
Currently PWM functionality is broken on mx25 due to the wrong assignment of the
PWM "per" clock.

According to Documentation/devicetree/bindings/clock/imx25-clock.txt:
	pwm_ipg_per		52

,so update the pwm "per" to use 'pwm_ipg_per' instead of 'per10' clock.

With this change PWM can work fine on mx25.

Cc: <stable@vger.kernel.org>
Reported-by: Carlos Soto <csotoalonso@gmail.com>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
2015-01-20 15:37:10 +08:00
Eyal Shapira
2cee4762c5 iwlwifi: mvm: validate tid and sta_id in ba_notif
These are coming from the FW and are used to access arrays.
Bad values can cause an out of bounds access so discard
such ba_notifs and warn.

CC: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-01-20 08:47:16 +02:00
Linus Torvalds
eef8f4c2ac Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Socket addresses returned in the error queue need to be fully
    initialized before being passed on to userspace, fix from Willem de
    Bruijn.

 2) Interrupt handling fixes to davinci_emac driver from Tony Lindgren.

 3) Fix races between receive packet steering and cpu hotplug, from Eric
    Dumazet.

 4) Allowing netlink sockets to subscribe to unknown multicast groups
    leads to crashes, don't allow it.  From Johannes Berg.

 5) One to many socket races in SCTP fixed by Daniel Borkmann.

 6) Put in a guard against the mis-use of ipv6 atomic fragments, from
    Hagen Paul Pfeifer.

 7) Fix promisc mode and ethtool crashes in sh_eth driver, from Ben
    Hutchings.

 8) NULL deref and double kfree fix in sxgbe driver from Girish K.S and
    Byungho An.

 9) cfg80211 deadlock fix from Arik Nemtsov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
  s2io: use snprintf() as a safety feature
  r8152: remove sram_read
  r8152: remove generic_ocp_read before writing
  bgmac: activate irqs only if there is nothing to poll
  bgmac: register napi before the device
  sh_eth: Fix ethtool operation crash when net device is down
  sh_eth: Fix promiscuous mode on chips without TSU
  ipv6: stop sending PTB packets for MTU < 1280
  net: sctp: fix race for one-to-many sockets in sendmsg's auto associate
  genetlink: synchronize socket closing and family removal
  genetlink: disallow subscribing to unknown mcast groups
  genetlink: document parallel_ops
  net: rps: fix cpu unplug
  net: davinci_emac: Add support for emac on dm816x
  net: davinci_emac: Fix ioremap for devices with MDIO within the EMAC address space
  net: davinci_emac: Fix incomplete code for getting the phy from device tree
  net: davinci_emac: Free clock after checking the frequency
  net: davinci_emac: Fix runtime pm calls for davinci_emac
  net: davinci_emac: Fix hangs with interrupts
  ip: zero sockaddr returned on error queue
  ...
2015-01-20 18:19:31 +12:00
Linus Torvalds
2262889091 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes a regression that arose from the change to add a crypto
  prefix to module names which was done to prevent the loading of
  arbitrary modules through the Crypto API.

  In particular, a number of modules were missing the crypto prefix
  which meant that they could no longer be autoloaded"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: add missing crypto module aliases
2015-01-20 18:17:34 +12:00
Rusty Russell
c749637909 module: fix race in kallsyms resolution during module load success.
The kallsyms routines (module_symbol_name, lookup_module_* etc) disable
preemption to walk the modules rather than taking the module_mutex:
this is because they are used for symbol resolution during oopses.

This works because there are synchronize_sched() and synchronize_rcu()
in the unload and failure paths.  However, there's one case which doesn't
have that: the normal case where module loading succeeds, and we free
the init section.

We don't want a synchronize_rcu() there, because it would slow down
module loading: this bug was introduced in 2009 to speed module
loading in the first place.

Thus, we want to do the free in an RCU callback.  We do this in the
simplest possible way by allocating a new rcu_head: if we put it in
the module structure we'd have to worry about that getting freed.

Reported-by: Rui Xiang <rui.xiang@huawei.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-01-20 11:38:34 +10:30
Rusty Russell
be1f221c04 module: remove mod arg from module_free, rename module_memfree().
Nothing needs the module pointer any more, and the next patch will
call it from RCU, where the module itself might no longer exist.
Removing the arg is the safest approach.

This just codifies the use of the module_alloc/module_free pattern
which ftrace and bpf use.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: x86@kernel.org
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: netdev@vger.kernel.org
2015-01-20 11:38:33 +10:30
Rusty Russell
d453cded05 module_arch_freeing_init(): new hook for archs before module->module_init freed.
Archs have been abusing module_free() to clean up their arch-specific
allocations.  Since module_free() is also (ab)used by BPF and trace code,
let's keep it to simple allocations, and provide a hook called before
that.

This means that avr32, ia64, parisc and s390 no longer need to implement
their own module_free() at all.  avr32 doesn't need module_finalize()
either.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-s390@vger.kernel.org
2015-01-20 11:38:32 +10:30
Rusty Russell
c772be5231 param: fix uninitialized read with CONFIG_DEBUG_LOCK_ALLOC
ignore_lockdep is uninitialized, and sysfs_attr_init() doesn't initialize
it, so memset to 0.

Reported-by: Huang Ying <ying.huang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-01-20 11:38:31 +10:30
Dan Carpenter
a8c1d28ac3 s2io: use snprintf() as a safety feature
"sp->desc[i]" has 25 characters.  "dev->name" has 15 characters.  If we
used all 15 characters then the sprintf() would overflow.

I changed the "sprintf(sp->name, "%s Neterion %s"" to snprintf(), as
well, even though it can't overflow just to be consistent.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 19:42:21 -05:00
Olof Johansson
07bf328350 A rather urgent pull request to fix omap4 legacy interrupts.
The legacy interrupts on omap4 got broken when gic got changed to
 use irq_domain_add_linear() instead of the irq_domain_add_legacy(). We
 still have the hardcoded legacy IRQ numbers in use in several places,
 most notably the in the legacy DMA. It took a while to figure out
 what the problem was and how it should be fixed for the -rc series.
 
 Also include is a regression fix for the dra7 dwc3 suspend.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUvY51AAoJEBvUPslcq6VzWEoP/1eH0XFWFUFz0A7f7bq2DRgM
 O7KHNwBhVDWBMfpwzycfu2wlfWCe++T5R/DjtzIPzZG28rHmq/0A8q9ZSFmIsLLv
 6Fbl0NbYZcQbgb7SfoCXvoTcUkv4rUTRbFblMNUHES1lm/JbDkKIxGU0iQDoL1oE
 6wo1mcF/+0VdrCtTaVtrQcO2+hnvQCDS4Qxcwgdz3FHPhkMGWqkuLqzNh4WrVp+r
 Ma3UOeyM/gHWfG1SV6C4Y5H7ADp+Vmo7Wvhl/KMrr5L4SFCJ0AryWP1LZpcuwbeD
 IN+fcVBG7tIKaN4BtPedUAuigAhOgAtRo0JcdWc00V+sfgyWGiYtK/5ZFuMx1Fc8
 yzLgFInlEf67DrSgMGAZbLBDWNmVVblPbcPH1PmYIoD7YWocFikAY1c3kw8KXAqR
 sZf5uXDRrBSYF5F1AvMX1ktWiKMWFoQ032BoifWZ6b/PF4LY/GEA/IJj8zxHBmDW
 +1PD3LxtuawUce7g0gD2/pvIRvSbZ7I5wOPMX8bug1Pgnla3p7lGZQ0dlYBQxn80
 SD7liN5T+GzxXhtjv1ly5le3uz8a0M5BKXsuWLzCsPBO7DERVKcjv0a3/TIVQONB
 EUT2sNGPwTfqvpj1GDvMtAMWat444V094zf/HgXANDe5+q+HUBMqoZCK4LDsFq8n
 dxVJ4XpWVo24mn70gIv3
 =lpO3
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v3.19/gic-regression-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Merge "Urgent omap4 legacy interrupt regression fix for v3.19-rc series" from
Tony Lindgren:

A rather urgent pull request to fix omap4 legacy interrupts.

The legacy interrupts on omap4 got broken when gic got changed to
use irq_domain_add_linear() instead of the irq_domain_add_legacy(). We
still have the hardcoded legacy IRQ numbers in use in several places,
most notably the in the legacy DMA. It took a while to figure out
what the problem was and how it should be fixed for the -rc series.

Also include is a regression fix for the dra7 dwc3 suspend.

* tag 'omap-for-v3.19/gic-regression-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP: Work around hardcoded interrupts
  arm: boot: dts: dra7: enable dwc3 suspend PHY quirk

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-01-19 16:23:01 -08:00
Andrew Lunn
38bdf45f4a bus: mvebu-mbus: fix support of MBus window 13
On Armada XP, 375 and 38x the MBus window 13 has the remap capability,
like windows 0 to 7. However, the mvebu-mbus driver isn't currently
taking into account this special case, which means that when window 13
is actually used, the remap registers are left to 0, making the device
using this MBus window unavailable.

As a minimal fix for stable, don't use window 13. A full fix will
follow later.

Fixes: fddddb52a6 ("bus: introduce an Marvell EBU MBus driver")
Cc: <stable@vger.kernel.org> # v3.10+
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2015-01-19 15:40:53 -06:00
David S. Miller
0c49087462 Some further updates for net-next:
* fix network-manager which was broken by the previous changes
  * fix delete-station events, which were broken by me making the
    genlmsg_end() mistake
  * fix a timer left running during suspend in some race conditions
    that would cause an annoying (but harmless) warning
  * (less important, but in the tree already) remove 80+80 MHz rate
    reporting since the spec doesn't distinguish it from 160 MHz;
    as the bitrate they're both 160 MHz bandwidth
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUvUZlAAoJEDBSmw7B7bqrfNMQAJT5jjOSjmwW8Zdvujvda/qt
 bFpYa9t0NsN3izzMpjPSrCwPrHN5qE86ZA8TcZrIzejPH4rpltjaXB6JNHZardVo
 deCUWU9xotoPELoE0Xex9mHPEkYYvOaht/P8A/88qP1S2PykMmj9fqNeijyUvwuo
 Jlsh0wKe4Jq6bCmdxvy/bde84ceAQcuh2TKNov1S0tB38tRY9qSu2n6ZGpoMNcEe
 CWuW+23jL1uAvt6Ljk2fTKdLR8iyXykfM0UMX2/4R2PMnJXK/dQqV/eeXTjpxtMk
 UV4aIMcSS19D7HxICKOXOdZLdMMuXaFUnUMlGLBtXZz3n9lZoP7RtVIHoib8VBXZ
 tY7xQrK6YNvwZ4SZZPuv/yr6YWP2+vBM2FUfXjzD+or6uYsej203a5q0RsOY+3Tp
 6Yklr+mQNlrOtpMsHMSgrBUUZsAK1I95ALMVVqaq1hgf1cDvRIDHlOo4A7bjwNFw
 eA3L1o4O1/E/IGp4v6+2Iukc9rIwm11sNr/wuD8dDkZTradUPH1iY6J5sxJNb2Nq
 YdpCnQ/lNXj650+z9/G2omSA6DTTTOtXJPxKR+FOHZVKDpZYtF6TxKb0S79fINps
 6ZlWIna5bUiXF1b6xad1x+vtyjNMgTvkg6mR+WQnvF57Ri8hucbtpv5wpA5bhYUQ
 Fbz9VZF2nfMeIbXfTaWi
 =Bvmr
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2015-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Some further updates for net-next:
 * fix network-manager which was broken by the previous changes
 * fix delete-station events, which were broken by me making the
   genlmsg_end() mistake
 * fix a timer left running during suspend in some race conditions
   that would cause an annoying (but harmless) warning
 * (less important, but in the tree already) remove 80+80 MHz rate
   reporting since the spec doesn't distinguish it from 160 MHz;
   as the bitrate they're both 160 MHz bandwidth

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:22:19 -05:00
Johannes Berg
926e9878a3 phonet netlink: allow multiple messages per skb in route dump
My previous patch to this file changed the code to be bug-compatible
towards userspace. Unless userspace (which I wasn't able to find)
implements the dump reader by hand in a wrong way, this isn't needed.
If it uses libnl or similar code putting multiple messages into a
single SKB is far more efficient.

Change the code to do this. While at it, also clean it up and don't
use so many variables - just store the address in the callback args
directly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:20:17 -05:00
Nimrod Andy
fc83477780 ARM: dts: imx6sx: correct i.MX6sx sdb board enet phy address
The commit (3d125f9c91) cause i.MX6SX sdb enet cannot work. The cause is
the commit add mdio node with un-correct phy address.

The patch just correct i.MX6sx sdb board enet phy address.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Acked-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:19:22 -05:00
David S. Miller
ef5a1ba145 Merge branch 'r8152'
Hayes Wang says:

====================
r8152: couldn't read OCP_SRAM_DATA

Read OCP_SRAM_DATA would read additional bytes and may let
the hw abnormal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:16:36 -05:00
hayeswang
b4d99def09 r8152: remove sram_read
Read OCP register 0xa43a~0xa43b would clear some flags which the hw
would use, and it may let the device lost. However, the unit of
reading is 4 bytes. That is, it would read 0xa438~0xa43b when calling
sram_read() to read OCP_SRAM_DATA.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:16:32 -05:00
hayeswang
8cb3db24c8 r8152: remove generic_ocp_read before writing
For ocp_write_word() and ocp_write_byte(), there is a generic_ocp_read()
which is used to read the whole 4 byte data, keep the unchanged bytes,
and modify the expected bytes. However, the "byen" could be used to
determine which bytes of the 4 bytes to write, so the action could be
removed.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:16:32 -05:00
Satoru Takeuchi
6e1103a6e9 btrfs: fix state->private cast on 32 bit machines
Suppress the following warning displayed on building 32bit (i686) kernel.

===============================================================================
...
   CC [M]  fs/btrfs/extent_io.o
fs/btrfs/extent_io.c: In function ‘btrfs_free_io_failure_record’:
fs/btrfs/extent_io.c:2193:13: warning: cast to pointer from integer of
different size [-Wint-to-pointer-cast]
    failrec = (struct io_failure_record *)state->private;
...
===============================================================================

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Reported-by: Chris Murphy <chris@colorremedies.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:06:06 -08:00
Filipe Manana
75c68e9fbb Btrfs: fix race deleting block group from space_info->ro_bgs list
When removing a block group we were deleting it from its space_info's
ro_bgs list without the correct protection - the space info's spinlock.
Fix this by doing the list delete while holding the spinlock of the
corresponding space info, which is the correct lock for any operation
on that list.

This issue was introduced in the 3.19 kernel by the following change:

    Btrfs: move read only block groups onto their own list V2
    commit 633c0aad4c

I ran into a kernel crash while a task was running statfs, which iterates
the space_info->ro_bgs list while holding the space info's spinlock,
and another task was deleting it from the same list, without holding that
spinlock, as part of the block group remove operation (while running the
function btrfs_remove_block_group). This happened often when running the
stress test xfstests/generic/038 I recently made.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:05:45 -08:00
Tsutomu Itoh
379d6854a2 Btrfs: fix incorrect freeing in scrub_stripe
The address that should be freed is not 'ppath' but 'path'.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Reviewed-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:05:44 -08:00
David Sterba
98bd5c547e btrfs: sync ioctl, handle errors after transaction start
The version merged to 3.19 did not handle errors from start_trancaction
and could pass an invalid pointer to commit_transaction.

Fixes: 6b5fe46dfa ("btrfs: do commit in sync_fs if there are pending changes")
Reported-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:05:44 -08:00
Felix Fietkau
22a5dc0e5e net: sched: Introduce connmark action
This tc action allows you to retrieve the connection tracking mark
This action has been used heavily by openwrt for a few years now.

There are known limitations currently:

doesn't work for initial packets, since we only query the ct table.
  Fine given use case is for returning packets

no implicit defrag.
  frags should be rare so fix later..

won't work for more complex tasks, e.g. lookup of other extensions
  since we have no means to store results

we still have a 2nd lookup later on via normal conntrack path.
This shouldn't break anything though since skb->nfct isn't altered.

V2:
remove unnecessary braces (Jiri)
change the action identifier to 14 (Jiri)
Fix some stylistic issues caught by checkpatch
V3:
Move module params to bottom (Cong)
Get rid of tcf_hashinfo_init and friends and conform to newer API (Cong)

Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:02:06 -05:00
David S. Miller
e60bf80615 Merge branch 'bgmac'
Hauke Mehrtens says:

====================
bgmac: some fixes to napi usage

I compared the napi documentation with the bgmac driver and found some
problems in that driver. These two patches should fix the problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:00:02 -05:00
Hauke Mehrtens
43f159c60a bgmac: activate irqs only if there is nothing to poll
IRQs should only get activated when there is nothing to poll in the
queue any more and to after every poll.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:59:57 -05:00
Hauke Mehrtens
6216642f20 bgmac: register napi before the device
napi should get registered before the netdev and not after.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:59:57 -05:00
David S. Miller
cbcd1fa72c Merge branch 'dsa-next'
Florian Fainelli says:

====================
net: DSA fixes for bridge and ip-autoconf

These two patches address some real world use cases of the DSA master and slave
network devices.

You have already seen patch 1 previously and you rejected it since my
explanations were not good enough to provide a justification as to why it is
useful, hopefully this time my explanation is better.

Patch 2 solves a different, yet very real problem as well at the bridge layer
when using DSA network devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:45:16 -05:00
Florian Fainelli
8db0a2ee2c net: bridge: reject DSA-enabled master netdevices as bridge members
DSA-enabled master network devices with a switch tagging protocol should
strip the protocol specific format before handing the frame over to
higher layer.

When adding such a DSA master network device as a bridge member, we go
through the following code path when receiving a frame:

__netif_receive_skb_core
	-> first ptype check against ptype_all is not returning any
	   handler for this skb

	-> check and invoke rx_handler:
		-> deliver frame to the bridge layer: br_handle_frame

DSA registers a ptype handler with the fake ETH_XDSA ethertype, which is
called *after* the bridge-layer rx_handler has run. br_handle_frame()
tries to parse the frame it received from the DSA master network device,
and will not be able to match any of its conditions and jumps straight
at the end of the end of br_handle_frame() and returns
RX_HANDLER_CONSUMED there.

Since we returned RX_HANDLER_CONSUMED, __netif_receive_skb_core() stops
RX processing for this frame and returns NET_RX_SUCCESS, so we never get
a chance to call our switch tag packet processing logic and deliver
frames to the DSA slave network devices, and so we do not get any
functional bridge members at all.

Instead of cluttering the bridge receive path with DSA-specific checks,
and rely on assumptions about how __netif_receive_skb_core() is
processing frames, we simply deny adding the DSA master network device
(conduit interface) as a bridge member, leaving only the slave DSA
network devices to be bridge members, since those will work correctly in
all circumstances.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:45:10 -05:00
Florian Fainelli
728c02089a net: ipv4: handle DSA enabled master network devices
The logic to configure a network interface for kernel IP
auto-configuration is very simplistic, and does not handle the case
where a device is stacked onto another such as with DSA. This causes the
kernel not to open and configure the master network device in a DSA
switch tree, and therefore slave network devices using this master
network devices as conduit device cannot be open.

This restriction comes from a check in net/dsa/slave.c, which is
basically checking the master netdev flags for IFF_UP and returns
-ENETDOWN if it is not the case.

Automatically bringing-up DSA master network devices allows DSA slave
network devices to be used as valid interfaces for e.g: NFS root booting
by allowing kernel IP autoconfiguration to succeed on these interfaces.

On the reverse path, make sure we do not attempt to close a DSA-enabled
device as this would implicitely prevent the slave DSA network device
from operating.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:45:10 -05:00
Ben Hutchings
5bdc73800d mii: Handle link state changes for forced modes in mii_check_media()
mii_check_media() does not update the link (carrier) state or log link
changes when the link mode is forced.  Drivers using the mii library
must do this themselves, but most of them do not.

Instead of changing them all, provide a sensible default behaviour
similar to mii_check_link() when the mode is forced.

via-rhine depends on it being a no-op in this case, so make its call
to mii_check_media() conditional.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:43:42 -05:00
David S. Miller
852c5d9c98 Merge branch 'sh_eth'
Ben Hutchings says:

====================
sh_eth fixes

I'm currently looking at Ethernet support on the R-Car H2 chip,
reviewing and testing the sh_eth driver.  Here are fixes for two fairly
obvious bugs in the driver; I will probably have some more later.

These are not tested on any of the other supported chips.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:37:44 -05:00
Ben Hutchings
4f9dce230b sh_eth: Fix ethtool operation crash when net device is down
The driver connects and disconnects the PHY device whenever the
net device is brought up and down.  The ethtool get_settings,
set_settings and nway_reset operations will dereference a null
or dangling pointer if called while it is down.

I think it would be preferable to keep the PHY connected, but there
may be good reasons not to.

As an immediate fix for this bug:
- Set the phydev pointer to NULL after disconnecting the PHY
- Change those three operations to return -ENODEV while the PHY is
  not connected

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:37:40 -05:00
Ben Hutchings
b37feed7c2 sh_eth: Fix promiscuous mode on chips without TSU
Currently net_device_ops::set_rx_mode is only implemented for
chips with a TSU (multiple address table).  However we do need
to turn the PRM (promiscuous) flag on and off for other chips.

- Remove the unlikely() from the TSU functions that we may safely
  call for chips without a TSU
- Make setting of the MCT flag conditional on the tsu capability flag
- Rename sh_eth_set_multicast_list() to sh_eth_set_rx_mode() and plumb
  it into both net_device_ops structures
- Remove the previously-unreachable branch in sh_eth_rx_mode() that
  would otherwise reset the flags to defaults for non-TSU chips

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:37:40 -05:00
David S. Miller
8f1115b4f2 Merge branch 'csiostor'
Praveen Madhavan says:

====================
csiostor: Remove T4 FCoE support

We found a subtle issue with FCoE on T4 very late in the game
and decided not to productize FCoE on T4 and therefore there
are no customers that will be impacted by this change. FCoE is
supported on T5 cards.

Please apply on net-next since depends on previous commits.

Changes in v2:
  - Make the commit message more clearer.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:30:06 -05:00
Praveen Madhavan
d394431523 csiostor:Removed file csio_hw_t4.c
We have decided not to productize FCoE on T4.
Hence file is removed.

Signed-off-by: Praveen Madhavan <praveenm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:30:02 -05:00
Praveen Madhavan
3fb4c22eaa csiostor:Remove T4 FCoE Support.
We found a subtle issue with FCoE on T4 very late in the game
and decided not to productize FCoE on T4 and therefore there
are no customers that will be impacted by this change. Hence
T4 FCoE support is removed. FCoE supported only on T5 cards.

changes in v2:
  - Make the commit message more clearer.

Signed-off-by: Praveen Madhavan <praveenm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:30:02 -05:00
David S. Miller
b66a4eaaee Merge branch 'netcp'
Murali Karicheri says:

====================
net: Add Keystone NetCP ethernet driver support

The Network Coprocessor (NetCP) is a hardware accelerator that processes
Ethernet packets. NetCP has a gigabit Ethernet (GbE) subsystem with a ethernet
switch sub-module to send and receive packets. NetCP also includes a packet
accelerator (PA) module to perform packet classification operations such as
header matching, and packet modification operations such as checksum
generation. NetCP can also optionally include a Security Accelerator(SA)
capable of performing IPSec operations on ingress/egress packets.

Keystone SoC's also have a 10 Gigabit Ethernet Subsystem (XGbE) which
includes a 3-port Ethernet switch sub-module capable of 10Gb/s and
1Gb/s rates per Ethernet port.

Both GBE and XGBE network processors supported using common driver. It
is also designed to handle future variants of NetCP.

version history
---------------
v7->v8

 - Reworked comments against v7, related to checker warning.
 - Patch 2/4 that has all of the driver code in v7 is now split into 3
   patches based on functionality so that we have 3 smaller patches
   review instead of a big patch.
 - Patch for MAINTAINER is merged to 2/4 along with netcp core driver
 - Separate patch (3/4) for 1G and  (4/4) for 10G
 - Removed big endian support for initial version (will add it later)

v6->v7
 - Fixed some minor documentation error and also modified the netcp driver
   to fix the set* functions to include correct le/be macros.

v5->v6
 - updated version after incorporating comments [6] from David Miller,
   David Laight & Geert Uytterhoeven on v5. I would like get this in
   for v3.19 merge window if the latest version is acceptable.

v4->v5
 - Sorry to spin v5 quickly but I missed few check-patch warnings which
   were pointed by Joe Perches(thanks). I folded his changes [5] along with
   few more check-patch warning fixes. I would like get this in for v3.18
   merge window if David is happy with this version.

v3->v4
 - Couple of fixes in in error path as pointed [4] out by David. Rest of
   the patches are unchanged from v3.

v2->v3
 - Update v3 after incorporating Jamal and David Miller's comment/suggestion
   from earlier versions [1] [2].  After per the discussion here [3], the
   controversial custom exports have been dropped now. And for future
   future offload support additions, we will plug into generic frameworks
   as an when they are available.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:07:43 -05:00
Wingman Kwok
90cff9e2da net: netcp: Enhance GBE driver to support 10G Ethernet
This patch enhances the NetCP gbe driver to support 10GbE subsystem
available in Keystone NetCP. The 3-port 10GbE switch sub-module contains
the following components:- 10GbE Switch, MDIO Module, 2 PCS-R Modules
(10GBase-R) and 2 SGMII modules (10/100/1000Base-T). The GBE driver
together with netcp core driver provides support for 10G Ethernet
on Keystone SoCs.

10GbE hardware spec is available at

http://www.ti.com/general/docs/lit/getliterature.tsp?baseLiteratureNumber=spruhj5&fileType=pdf

 Cc: David Miller <davem@davemloft.net>
 Cc: Rob Herring <robh+dt@kernel.org>
 Cc: Grant Likely <grant.likely@linaro.org>
 Cc: Santosh Shilimkar <santosh.shilimkar@kernel.org>
 Cc: Pawel Moll <pawel.moll@arm.com>
 Cc: Mark Rutland <mark.rutland@arm.com>
 Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
 Cc: Kumar Gala <galak@codeaurora.org>

Signed-off-by: Wingman Kwok <w-kwok2@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:07:39 -05:00
Wingman Kwok
6f8d3f3338 net: netcp: Add Keystone NetCP GbE driver
This patch add support for 1G Ethernet driver based on Keystone
NetCP hardware. The gigabit Ethernet (GbE) switch subsystem is one of the main
components of the network coprocessor (NETCP) peripheral. The purpose of the
gigabit Ethernet switch subsystem in the NETCP is to provide an interface to
transfer data between the host device and another connected device in
compliance with the Ethernet protocol. GbE consists of 5 port Ethernet Switch
module, 4 Serial Gigabit Media Independent Interface (SGMII) modules, MDIO
module and SerDes.

Driver for 5 port GbE switch and SGMII module is added in this patch. These
hardware modules along with netcp core driver provides Network driver functions
for 1G Ethernet.

Detailed hardware spec is available at

http://www.ti.com/lit/ug/sprugv9d/sprugv9d.pdf

 Cc: David Miller <davem@davemloft.net>
 Cc: Rob Herring <robh+dt@kernel.org>
 Cc: Grant Likely <grant.likely@linaro.org>
 Cc: Santosh Shilimkar <santosh.shilimkar@kernel.org>
 Cc: Pawel Moll <pawel.moll@arm.com>
 Cc: Mark Rutland <mark.rutland@arm.com>
 Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
 Cc: Kumar Gala <galak@codeaurora.org>

Signed-off-by: Wingman Kwok <w-kwok2@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:07:39 -05:00
Karicheri, Muralidharan
84640e27f2 net: netcp: Add Keystone NetCP core ethernet driver
The network coprocessor (NetCP) is a hardware accelerator available in
Keystone SoCs that processes Ethernet packets. NetCP consists of following
hardware components

 1 Gigabit Ethernet (GbE) subsystem with a Ethernet switch sub-module to
   send and receive packets.
 2 Packet Accelerator (PA) module to perform packet classification
   operations such as header matching, and packet modification operations
   such as checksum generation.
 3 Security Accelerator(SA) capable of performing IPSec operations on
   ingress/egress packets.
 4 An optional 10 Gigabit Ethernet Subsystem (XGbE) which includes a
   3-port Ethernet switch sub-module capable of 10Gb/s and 1Gb/s rates
   per Ethernet port.
 5 Packet DMA and Queue Management Subsystem (QMSS) to enqueue and dequeue
   packets and DMA the packets between memory and NetCP hardware components
   described above.

NetCP core driver make use of the Keystone Navigator driver API to allocate
DMA channel for the Ethenet device and to handle packet queue/de-queue,
Please refer API's in include/linux/soc/ti/knav_dma.h and
drivers/soc/ti/knav_qmss.h for details.

NetCP driver consists of NetCP core driver and at a minimum Gigabit
Ethernet (GBE) module (1) driver to implement the Network device function.
Other modules (2,3) can be optionally added to achieve supported hardware
acceleration function. The initial version of the driver include NetCP
core driver and GBE driver modules.

Please refer Documentation/devicetree/bindings/net/keystone-netcp.txt
for design of the driver.

 Cc: David Miller <davem@davemloft.net>
 Cc: Rob Herring <robh+dt@kernel.org>
 Cc: Grant Likely <grant.likely@linaro.org>
 Cc: Santosh Shilimkar <santosh.shilimkar@kernel.org>
 Cc: Pawel Moll <pawel.moll@arm.com>
 Cc: Mark Rutland <mark.rutland@arm.com>
 Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
 Cc: Kumar Gala <galak@codeaurora.org>

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Wingman Kwok <w-kwok2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:07:39 -05:00
Karicheri, Muralidharan
44eefcdfb9 Documentation: dt: net: Add binding doc for Keystone NetCP ethernet driver
The network coprocessor (NetCP) is a hardware accelerator that processes
Ethernet packets. NetCP has a gigabit Ethernet (GbE) subsystem with a ethernet
switch sub-module to send and receive packets. NetCP also includes a packet
accelerator (PA) module to perform packet classification operations such as
header matching, and packet modification operations such as checksum
generation. NetCP can also optionally include a Security Accelerator(SA)
capable of performing IPSec operations on ingress/egress packets.

Keystone SoC's also have a 10 Gigabit Ethernet Subsystem (XGbE) which
includes a 3-port Ethernet switch sub-module capable of 10Gb/s and
1Gb/s rates per Ethernet port.

NetCP Subsystem device tree layout looks something like below:

-----------------------------
  NetCP subsystem(10G or 1G)
-----------------------------
	|
	|-> NetCP Devices ->	|
	|			|-> GBE/XGBE Switch
	|			|
	|			|-> Packet Accelerator
	|			|
	|			|-> Security Accelerator
	|
	|
	|
	|-> NetCP Interfaces ->	|
				|-> Ethernet Port 0
				|
				|-> Ethernet Port 1
				|
				|-> Ethernet Port 2
				|
				|-> Ethernet Port 3

Common driver supports GBE as well XGBE network processors.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: David Miller <davem@davemloft.net>
Cc: Santosh Shilimkar <santosh.shilimkar@kernel.org>

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 15:07:39 -05:00
Hagen Paul Pfeifer
9d289715eb ipv6: stop sending PTB packets for MTU < 1280
Reduce the attack vector and stop generating IPv6 Fragment Header for
paths with an MTU smaller than the minimum required IPv6 MTU
size (1280 byte) - called atomic fragments.

See IETF I-D "Deprecating the Generation of IPv6 Atomic Fragments" [1]
for more information and how this "feature" can be misused.

[1] https://tools.ietf.org/html/draft-ietf-6man-deprecate-atomfrag-generation-00

Signed-off-by: Fernando Gont <fgont@si6networks.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:52:07 -05:00
Felipe Balbi
92cb13fb21 net: ethernet: ti: cpsw: fix buld break when NET_POLL_CONTROLLER
Commit c03abd8463 (net: ethernet: cpsw: don't requests IRQs we don't
use) left one build breakage when NET_POLL_CONTROLLER is enabled.

Fix this build break by referring to the correct irqs_table array.

Fixes: c03abd8463 (net: ethernet: cpsw: don't requests IRQs we don't use)
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:45:00 -05:00
David S. Miller
7f9091f0a7 Merge branch 'link_netns'
Merge branch 'link_netns'

Nicolas Dichtel says:

====================
netns: allow to identify peer netns

The goal of this serie is to be able to multicast netlink messages with an
attribute that identify a peer netns.
This is needed by the userland to interpret some information contained in
netlink messages (like IFLA_LINK value, but also some other attributes in case
of x-netns netdevice (see also
http://thread.gmane.org/gmane.linux.network/315933/focus=316064 and
http://thread.gmane.org/gmane.linux.kernel.containers/28301/focus=4239)).

Ids of peer netns can be set by userland via a new rtnl cmd RTM_NEWNSID. When
the kernel needs an id for a peer (for example when advertising a new x-netns
interface via netlink), if the user didn't allocate an id, one will be
automatically allocated.
These ids are stored per netns and are local (ie only valid in the netns where
they are set). To avoid allocating an int for each peer netns, I use
idr_for_each() to retrieve the id of a peer netns. Note that it will be possible
to add a table (struct net -> id) later to optimize this lookup if needed.

Patch 1/4 introduces the rtnetlink API mechanism to set and get these ids.
Patch 2/4 and 3/4 implements an example of how to use these ids when advertising
information about a x-netns interface.
And patch 4/4 shows that the netlink messages can be symetric between a GET and
a SET.

iproute2 patches are available, I can send them on demand.

Here is a small screenshot to show how it can be used by userland.

$ ip netns add foo
$ ip netns del foo
$ ip netns
$ touch /var/run/netns/init_net
$ mount --bind /proc/1/ns/net /var/run/netns/init_net
$ ip netns add foo
$ ip -n foo netns
foo
init_net
$ ip -n foo netns set init_net 0
$ ip -n foo netns set foo 1

$ ip netns
foo
init_net
$ ip -n foo netns
foo (id: 1)
init_net (id: 0)

$ ip -n foo link add ipip1 link-netnsid 0 type ipip remote 10.16.0.121 local 10.16.0.249
$ ip -n foo link ls ipip1
6: ipip1@NONE: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
    link/ipip 10.16.0.249 peer 10.16.0.121 link-netnsid 0

$ ip netns
foo
init_net
$ ip -n foo link add ipip2 type ipip remote 10.16.0.121 local 10.16.0.249
$ ip -n foo link set ipip2 netns init_net
$ ip link ls ipip2
7: ipip2@NONE: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
    link/ipip 10.16.0.249 peer 10.16.0.121 link-netnsid 0
$ ip netns
foo (id: 0)
init_net

v4 -> v5:
  use rtnetlink instead of genetlink
  allocate automatically an id if user didn't assign one
  rename include/uapi/linux/netns.h to include/uapi/linux/net_namespace.h
  add vxlan in patch #3

RFCv3 -> v4:
  rebase on net-next
  add copyright text in the new netns.h file

RFCv2 -> RFCv3:
  ids are now defined by userland (via netlink). Ids are stored in each netns
  (and they are local to this netns).
  add get_link_net support for ip6 tunnels
  netnsid is now a s32 instead of a u32

RFCv1 -> RFCv2:
  remove useless ()
  ids are now stored in the user ns. It's possible to get an id for a peer netns
  only if the current netns and the peer netns have the same user ns parent.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:44:33 -05:00
Nicolas Dichtel
317f4810e4 rtnl: allow to create device with IFLA_LINK_NETNSID set
This patch adds the ability to create a netdevice in a specified netns and
then move it into the final netns. In fact, it allows to have a symetry between
get and set rtnl messages.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:32:03 -05:00
Nicolas Dichtel
1728d4fabd tunnels: advertise link netns via netlink
Implement rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is
added to rtnetlink messages.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:32:03 -05:00
Nicolas Dichtel
d37512a277 rtnl: add link netns id to interface messages
This patch adds a new attribute (IFLA_LINK_NETNSID) which contains the 'link'
netns id when this netns is different from the netns where the interface
stands (for example for x-net interfaces like ip tunnels).
With this attribute, it's possible to interpret correctly all advertised
information (like IFLA_LINK, etc.).

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:21:26 -05:00
Nicolas Dichtel
0c7aecd4bd netns: add rtnl cmd to add and get peer netns ids
With this patch, a user can define an id for a peer netns by providing a FD or a
PID. These ids are local to the netns where it is added (ie valid only into this
netns).

The main function (ie the one exported to other module), peernet2id(), allows to
get the id of a peer netns. If no id has been assigned by the user, this
function allocates one.

These ids will be used in netlink messages to point to a peer netns, for example
in case of a x-netns interface.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 14:21:18 -05:00
David Jeffery
ce75145267 libata: prevent HSM state change race between ISR and PIO
It is possible for ata_sff_flush_pio_task() to set ap->hsm_task_state to
HSM_ST_IDLE in between the time __ata_sff_port_intr() checks for HSM_ST_IDLE
and before it calls ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG().

This problem is hard to reproduce making this patch hard to verify, but this
fix will prevent the race.

I have not been able to reproduce the problem, but here is a crash dump from
a 2.6.32 kernel.

On examining the ata port's state, its hsm_task_state field has a value of HSM_ST_IDLE:

crash> struct ata_port.hsm_task_state ffff881c1121c000
  hsm_task_state = 0

Normally, this should not be possible as ata_sff_hsm_move() was called from ata_sff_host_intr(),
which checks hsm_task_state and won't call ata_sff_hsm_move() if it has a HSM_ST_IDLE value.

PID: 11053  TASK: ffff8816e846cae0  CPU: 0   COMMAND: "sshd"
 #0 [ffff88008ba03960] machine_kexec at ffffffff81038f3b
 #1 [ffff88008ba039c0] crash_kexec at ffffffff810c5d92
 #2 [ffff88008ba03a90] oops_end at ffffffff8152b510
 #3 [ffff88008ba03ac0] die at ffffffff81010e0b
 #4 [ffff88008ba03af0] do_trap at ffffffff8152ad74
 #5 [ffff88008ba03b50] do_invalid_op at ffffffff8100cf95
 #6 [ffff88008ba03bf0] invalid_op at ffffffff8100bf9b
    [exception RIP: ata_sff_hsm_move+317]
    RIP: ffffffff813a77ad  RSP: ffff88008ba03ca0  RFLAGS: 00010097
    RAX: 0000000000000000  RBX: ffff881c1121dc60  RCX: 0000000000000000
    RDX: ffff881c1121dd10  RSI: ffff881c1121dc60  RDI: ffff881c1121c000
    RBP: ffff88008ba03d00   R8: 0000000000000000   R9: 000000000000002e
    R10: 000000000001003f  R11: 000000000000009b  R12: ffff881c1121c000
    R13: 0000000000000000  R14: 0000000000000050  R15: ffff881c1121dd78
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff88008ba03d08] ata_sff_host_intr at ffffffff813a7fbd
 #8 [ffff88008ba03d38] ata_sff_interrupt at ffffffff813a821e
 #9 [ffff88008ba03d78] handle_IRQ_event at ffffffff810e6ec0
--- <IRQ stack> ---
    [exception RIP: pipe_poll+48]
    RIP: ffffffff81192780  RSP: ffff880f26d459b8  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffff880f26d459c8  RCX: 0000000000000000
    RDX: 0000000000000001  RSI: 0000000000000000  RDI: ffff881a0539fa80
    RBP: ffffffff8100bb8e   R8: ffff8803b23324a0   R9: 0000000000000000
    R10: ffff880f26d45dd0  R11: 0000000000000008  R12: ffffffff8109b646
    R13: ffff880f26d45948  R14: 0000000000000246  R15: 0000000000000246
    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
    RIP: 00007f26017435c3  RSP: 00007fffe020c420  RFLAGS: 00000206
    RAX: 0000000000000017  RBX: ffffffff8100b072  RCX: 00007fffe020c45c
    RDX: 00007f2604a3f120  RSI: 00007f2604a3f140  RDI: 000000000000000d
    RBP: 0000000000000000   R8: 00007fffe020e570   R9: 0101010101010101
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007fffe020e5f0
    R13: 00007fffe020e5f4  R14: 00007f26045f373c  R15: 00007fffe020e5e0
    ORIG_RAX: 0000000000000017  CS: 0033  SS: 002b

Somewhere between the ata_sff_hsm_move() check and the ata_sff_host_intr() check, the value changed.
On examining the other cpus to see what else was running, another cpu was running the error handler
routines:

PID: 326    TASK: ffff881c11014aa0  CPU: 1   COMMAND: "scsi_eh_1"
 #0 [ffff88008ba27e90] crash_nmi_callback at ffffffff8102fee6
 #1 [ffff88008ba27ea0] notifier_call_chain at ffffffff8152d515
 #2 [ffff88008ba27ee0] atomic_notifier_call_chain at ffffffff8152d57a
 #3 [ffff88008ba27ef0] notify_die at ffffffff810a154e
 #4 [ffff88008ba27f20] do_nmi at ffffffff8152b1db
 #5 [ffff88008ba27f50] nmi at ffffffff8152aaa0
    [exception RIP: _spin_lock_irqsave+47]
    RIP: ffffffff8152a1ff  RSP: ffff881c11a73aa0  RFLAGS: 00000006
    RAX: 0000000000000001  RBX: ffff881c1121deb8  RCX: 0000000000000000
    RDX: 0000000000000246  RSI: 0000000000000020  RDI: ffff881c122612d8
    RBP: ffff881c11a73aa0   R8: ffff881c17083800   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: ffff881c1121c000
    R13: 000000000000001f  R14: ffff881c1121dd50  R15: ffff881c1121dc60
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
--- <NMI exception stack> ---
 #6 [ffff881c11a73aa0] _spin_lock_irqsave at ffffffff8152a1ff
 #7 [ffff881c11a73aa8] ata_exec_internal_sg at ffffffff81396fb5
 #8 [ffff881c11a73b58] ata_exec_internal at ffffffff81397109
 #9 [ffff881c11a73bd8] atapi_eh_request_sense at ffffffff813a34eb

Before it tried to acquire a spinlock, ata_exec_internal_sg() called ata_sff_flush_pio_task().
This function will set ap->hsm_task_state to HSM_ST_IDLE, and has no locking around setting this
value. ata_sff_flush_pio_task() can then race with the interrupt handler and potentially set
HSM_ST_IDLE at a fatal moment, which will trigger a kernel BUG.

v2: Fixup comment in ata_sff_flush_pio_task()

tj: Further updated comment.  Use ap->lock instead of shost lock and
    use the [un]lock_irq variant instead of the irqsave/restore one.

Signed-off-by: David Milburn <dmilburn@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
2015-01-19 14:11:23 -05:00
Emmanuel Grumbach
c1e140bf79 mac80211: delete the assoc/auth timer upon suspend
While suspending, we destroy the authentication /
association that might be taking place. While doing so, we
forgot to delete the timer which can be firing after
local->suspended is already set, producing the warning below.

Fix that by deleting the timer.

[66722.825487] WARNING: CPU: 2 PID: 5612 at net/mac80211/util.c:755 ieee80211_can_queue_work.isra.18+0x32/0x40 [mac80211]()
[66722.825487] queueing ieee80211 work while going to suspend
[66722.825529] CPU: 2 PID: 5612 Comm: kworker/u16:69 Tainted: G        W  O  3.16.1+ #24
[66722.825537] Workqueue: events_unbound async_run_entry_fn
[66722.825545] Call Trace:
[66722.825552]  <IRQ>  [<ffffffff817edbb2>] dump_stack+0x4d/0x66
[66722.825556]  [<ffffffff81075cad>] warn_slowpath_common+0x7d/0xa0
[66722.825572]  [<ffffffffa06b5b90>] ? ieee80211_sta_bcn_mon_timer+0x50/0x50 [mac80211]
[66722.825573]  [<ffffffff81075d1c>] warn_slowpath_fmt+0x4c/0x50
[66722.825586]  [<ffffffffa06977a2>] ieee80211_can_queue_work.isra.18+0x32/0x40 [mac80211]
[66722.825598]  [<ffffffffa06977d5>] ieee80211_queue_work+0x25/0x50 [mac80211]
[66722.825611]  [<ffffffffa06b5bac>] ieee80211_sta_timer+0x1c/0x20 [mac80211]
[66722.825614]  [<ffffffff8108655a>] call_timer_fn+0x8a/0x300

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-19 18:59:20 +01:00