Commit graph

507097 commits

Author SHA1 Message Date
Akeem G Abodunrin
579b23d8dc i40e: Add safety net for switch calling
This patch adds a default case to handle unmatched switch calls.

Change-ID: Icd203570a1dc5322c1038f68b98a83195e8ad28c
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-03 01:07:24 -08:00
Shannon Nelson
7edf810c61 i40e/i40evf: print FW build number in version string
Include the FW build number in the formatted FW version string.  In order
to fit within ethtool's 32 character limit, the etrack's unused high order
bits are trimmed as is the leading 0 for the NVM version.  This leaves
us with 2 characters left for if/when the etrack id goes to 5 hex chars
and the NVM major number goes to 2 chars.

Change-ID: Icb004c4b9b14a2f54dd200b467fcc1d7b9297308
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-03 01:07:24 -08:00
Neerav Parikh
d40d00b1c2 i40e: Skip the priority tagging if DCB is not enabled
If DCB is not enabled, priority tagging is not needed,
so skip over that section.

Change-ID: Ia3f3fa07945b421259a9ca38329d6d1cbd6c6bcc
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-03 01:07:23 -08:00
Matthew Vick
eca3204765 fm10k: Resolve various spelling errors and checkpatch warnings
Fix a few silly typos in the code and checkpatch warnings in support of
general code cleanliness.

Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-03 01:07:23 -08:00
Matthew Vick
5bf33dc687 fm10k: Implement ndo_features_check
The introduction of ndo_features_check allows drivers to report their
offload capabilities per-skb. Implement this in fm10k to take advantage
of this new functionality.

Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-03 01:07:22 -08:00
Matthew Vick
8c1a90aa49 fm10k: Modify tunnel length header check when offloading
The FM10000 host interface can only support up to 184 bytes when
performing tunnel offloads. Because of this, a check was added to
prevent the driver from attempting to feed a header to the hardware too
big for it to parse. Make this check a little more robust by calculating
the inner L4 header length based on whether it is TCP or UDP.

Cc: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-03-03 01:07:21 -08:00
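
The tunnel header length check described in "fm10k: Modify tunnel length header check when offloading" can be sketched in user-space C. This is an illustration only, not the driver's code: the 184-byte limit comes from the commit message, while the enum, the function names, and the way the TCP data offset is passed in are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limit quoted in the commit message for FM10000 tunnel offloads. */
#define TUNNEL_HLEN_MAX 184u

/* Hypothetical tag for the inner transport protocol. */
enum inner_l4 { INNER_TCP, INNER_UDP };

/*
 * Length of the inner L4 header in bytes.
 * A UDP header is always 8 bytes; a TCP header is data_offset 32-bit
 * words long (5..15), i.e. 20..60 bytes.
 */
size_t inner_l4_hlen(enum inner_l4 proto, uint8_t tcp_data_offset)
{
    if (proto == INNER_UDP)
        return 8;
    assert(tcp_data_offset >= 5 && tcp_data_offset <= 15);
    return (size_t)tcp_data_offset * 4;
}

/* True when everything up to the end of the inner L4 header fits the limit. */
bool tunnel_headers_fit(size_t inner_l4_offset, enum inner_l4 proto,
                        uint8_t tcp_data_offset)
{
    return inner_l4_offset + inner_l4_hlen(proto, tcp_data_offset)
           <= TUNNEL_HLEN_MAX;
}
```

With an inner L4 header starting at offset 164, a TCP header without options (data offset 5) just fits, while one with a single 4-byte option word does not.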
Tyler Hicks
6d65261a09 eCryptfs: don't pass fs-specific ioctl commands through
eCryptfs can't be aware of what to expect after passing an
arbitrary ioctl command through to the lower filesystem. The ioctl
command may trigger an action in the lower filesystem that is
incompatible with eCryptfs.

One specific example is when one attempts to use the Btrfs clone
ioctl command when the source file is in the Btrfs filesystem that
eCryptfs is mounted on top of and the destination fd is from a new file
created in the eCryptfs mount. The ioctl syscall incorrectly returns
success because the command is passed down to Btrfs which thinks that it
was able to do the clone operation. However, the result is an empty
eCryptfs file.

This patch allows the trim, {g,s}etflags, and {g,s}etversion ioctl
commands through and then copies up the inode metadata from the lower
inode to the eCryptfs inode to catch any changes made to the lower
inode's metadata. Those five ioctl commands are mostly common across all
filesystems but the whitelist may need to be further pruned in the
future.

https://bugzilla.kernel.org/show_bug.cgi?id=93691
https://launchpad.net/bugs/1305335

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Rocko <rockorequin@hotmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: stable@vger.kernel.org # v2.6.36+: c43f7b8 eCryptfs: Handle ioctl calls with unlocked and compat functions
2015-03-03 02:03:56 -06:00
Max Mansfield
c7d373c3f0 usb: ftdi_sio: Add jtag quirk support for Cyber Cortex AV boards
This patch integrates Cyber Cortex AV boards with the existing
ftdi_jtag_quirk in order to use serial port 0 with JTAG which is
required by the manufacturers' software.

Steps: 2

[ftdi_sio_ids.h]
1. Defined the device PID

[ftdi_sio.c]
2. Added a macro declaration to the ids array, in order to enable the
jtag quirk for the device.

Signed-off-by: Max Mansfield <max.m.mansfield@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2015-03-03 07:47:06 +01:00
Michal Kubeček
acf8dd0a9d udp: only allow UFO for packets from SOCK_DGRAM sockets
If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a
UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to
CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer
checksum is to be computed on segmentation. However, in this case,
skb->csum_start and skb->csum_offset are never set as raw socket
transmit path bypasses udp_send_skb() where they are usually set. As a
result, driver may access invalid memory when trying to calculate the
checksum and store the result (as observed in virtio_net driver).

Moreover, the very idea of modifying the userspace provided UDP header
is IMHO against raw socket semantics (I wasn't able to find a document
clearly stating this or the opposite, though). And while allowing
CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit
too intrusive a change just to handle a corner case like this. Therefore
disallowing UFO for packets from sockets other than SOCK_DGRAM seems to
be the best option.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 22:19:29 -05:00
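
The rule stated in "udp: only allow UFO for packets from SOCK_DGRAM sockets" can be written as a small predicate. This is a simplified sketch under stated assumptions: the enum and function name are hypothetical, and the real kernel condition checks more than the three inputs modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket types discussed in the commit. */
enum toy_sock_type { TOY_SOCK_DGRAM, TOY_SOCK_RAW };

/*
 * UFO is considered only for over-MTU datagrams on a UFO-capable device,
 * and, per the commit, only when the packet comes from a datagram socket.
 */
bool may_use_ufo(size_t length, size_t mtu, bool dev_has_ufo,
                 enum toy_sock_type type)
{
    assert(mtu > 0);
    return length > mtu && dev_has_ufo && type == TOY_SOCK_DGRAM;
}
```

A raw socket sending the same over-MTU datagram falls back to the non-UFO path, so the driver never sees CHECKSUM_PARTIAL without valid checksum offsets.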
David S. Miller
096b1c1728 Merge branch 'sh_eth'
Ben Hutchings says:

====================
Fixes for sh_eth #4 v2

I'm continuing review and testing of Ethernet support on the R-Car H2
chip, with help from a colleague.  This series fixes a few more issues.

These are not tested on any of the other supported chips.

v2: Add note that the revert is not a pure revert.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:31:00 -05:00
Ben Hutchings
dacc73e0cf sh_eth: Really fix padding of short frames on TX
My previous fix to clear padding of short frames used skb->len as the
DMA length, assuming that skb_padto() extended skb->len to include the
padding.  That isn't the case; we need to use skb_put_padto() instead.

(This wasn't immediately obvious because software padding isn't
actually needed on the R-Car H2.  We could make it conditional on
which chip is being driven, but it's probably not worth the effort.)

Reported-by: "Violeta Menéndez González" <violeta.menendez@codethink.co.uk>
Fixes: 612a17a54b50 ("sh_eth: Fix padding of short frames on TX")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
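
The difference that "sh_eth: Really fix padding of short frames on TX" relies on can be modelled outside the kernel. The sketch below is a toy, assuming a fixed-size buffer in place of a real sk_buff; `toy_frame`, `toy_padto` and `toy_put_padto` are made-up names that only imitate the length behaviour of skb_padto() and skb_put_padto().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_CAP 64u   /* capacity of the toy buffer */

/* Toy stand-in for a socket buffer: payload bytes plus a logical length. */
struct toy_frame {
    unsigned char data[FRAME_CAP];
    size_t len;
};

/* Models skb_padto(): zero the tail up to min_len, leave len unchanged. */
void toy_padto(struct toy_frame *f, size_t min_len)
{
    assert(min_len <= FRAME_CAP);
    if (f->len < min_len)
        memset(f->data + f->len, 0, min_len - f->len);
}

/* Models skb_put_padto(): zero the tail and extend len to cover it. */
void toy_put_padto(struct toy_frame *f, size_t min_len)
{
    toy_padto(f, min_len);
    if (f->len < min_len)
        f->len = min_len;
}
```

If the DMA length is taken from `len`, only the second variant makes the padding part of what is transmitted.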
Ben Hutchings
9b4a6364a6 Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790"
This reverts commit fd9af07c34.

The hardware manual states that the frame error and multicast bits are
copied to bits 9:0 of RD0, not bits 25:16.  I've tested that this is
true for RFS1 (CRC error), RFS3 (frame too short), RFS4 (frame too
long) and RFS8 (multicast).

Also adjust a comment to agree with this.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
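
The bit layout discussed in the revert of "sh_eth: Enable Rx descriptor word 0 shift for r8a7790" can be illustrated with two extraction helpers. The field positions (9:0 versus 25:16) come from the commit message; the function names and the mapping of RFSn to bit n-1 are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Ten status bits, RD0[9:0], as stated in the commit message. */
#define RD0_STATUS_MASK 0x3ffu
static_assert(RD0_STATUS_MASK == (1u << 10) - 1, "status field is 10 bits wide");

/* Assumed mapping: RFSn lives at bit n-1 of the status field. */
#define RFS1_CRC_ERROR  (1u << 0)
#define RFS3_TOO_SHORT  (1u << 2)
#define RFS4_TOO_LONG   (1u << 3)
#define RFS8_MULTICAST  (1u << 7)

/* Status taken from bits 9:0, the layout the commit restores. */
uint32_t rd0_status(uint32_t rd0)
{
    return rd0 & RD0_STATUS_MASK;
}

/* Status taken from bits 25:16, the layout the reverted commit assumed. */
uint32_t rd0_status_shifted(uint32_t rd0)
{
    return (rd0 >> 16) & RD0_STATUS_MASK;
}
```

Reading a descriptor whose low bits carry CRC-error and multicast flags through the shifted helper yields zero, which is how such errors would go unnoticed.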
Ben Hutchings
6ded286555 sh_eth: Fix RX recovery on R-Car in case of RX ring underrun
In case of RX ring underrun (RDE), we attempt to reset the software
descriptor pointers (dirty_rx and cur_rx) to match where the hardware
will read the next descriptor from, as that might not be the first
dirty descriptor.  This relies on reading RDFAR, but that register
doesn't exist on all supported chips - specifically, not on the R-Car
chips.  This will result in unpredictable behaviour on those chips
after an RDE.

Make this pointer reset conditional and assume that it isn't needed on
the R-Car chips.  This fix also assumes that RDFAR is never exposed at
offset 0 in the memory map - this is currently true, and a subsequent
commit will fix the ambiguity between offset 0 and no-offset in the
register offset maps.

Fixes: 79fba9f517 ("net: sh_eth: fix the rxdesc pointer when rx ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
Ben Hutchings
7d7355f58b sh_eth: Ensure proper ordering of descriptor active bit write/read
When submitting a DMA descriptor, the active bit must be written last.
When reading a completed DMA descriptor, the active bit must be read
first.

Add memory barriers to ensure that this ordering is maintained.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 21:30:56 -05:00
Eduardo Valentin
78045bfe85 Merge branch 'fixes' of github.com:lmajewski/linux-samsung-thermal into work-fixes
Pull samsung thermal fixes from Lukasz Majewski:
"Changes:
- Exynos7 power down detection mode fix
- Fix for cpufreq cooling device regression
- Updating MAINTAINER's entry for Samsung Exynos Thermal"

Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-03-02 22:18:31 -04:00
Alexander Aring
263be3326b at86rf230: restore trx len when needed
In most cases the spi messages have a length of two. Currently we
always set the len field to two before transmitting a spi message. For
reading out or writing the frame buffer we need a different len. This
patch uses a trx len of two as the default. For the frame buffer cases
we restore the trx len to two on success and failure. This avoids
setting the len to two when it's already two.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:15:25 +01:00
Alexander Aring
31fa74344c at86rf230: remove multiple dereferencing for ctx
This patch cleans up the referencing of the state change context
variable. The state change context should only be set once, when a
state change is initiated. This patch uses the initial state change
variable in the complete handler of the state change via the ctx
context, which should always be the same as the initial state change
context.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:15:25 +01:00
Alexander Aring
cca990c85d at86rf230: remove multiple dereferencing for irq
By holding the irq variable inside at86rf230_state_change we can squash
some multiple dereferencing for getting irq num.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:15:25 +01:00
Alexander Aring
74de4c804c at86rf230: refactor receive handling
This patch refactors the receive handling into one function.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:15:25 +01:00
Alexander Aring
ef5428a138 at86rf230: cleanup and squash stack variable
I had this variable because I thought it would be protected by
disable/enable irq but this is not true. It's protected by stop/wake
netdev queue which is called by ieee802154_xmit_complete.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:15:25 +01:00
Alexander Aring
ba6d223932 at86rf230: add transmit retry support
This patch introduces transmit retry handling into the at86rf230
transmit path. The current behaviour is to wait the normal receive time
if we want to go into STATE_TX_ON while the transceiver is in
STATE_BUSY_RX_AACK, which indicates that a frame is currently being
received. A non-forced state change will not interrupt the receiving
state.

The current behaviour is that after the normal receive time we will
start a forced change into STATE_TX_ON. With this patch we do seven
retries to go into STATE_TX_ON without forcing. After we hit
AT86RF2XX_MAX_TX_RETRIES we will start the forced state change.
This is a polling-like method to go into STATE_TX_ON during the maximum
receiving time.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:15:24 +01:00
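
The retry policy from "at86rf230: add transmit retry support" can be sketched as a bounded polling loop. The count of seven comes from the commit message; everything else (the callback type, the result enum, the example poll function) is hypothetical, and the waiting between attempts that a real driver would need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TX_RETRIES 7   /* "seven retries" in the commit message */

/* Hypothetical callback: true if a non-forced change to TX_ON succeeded. */
typedef bool (*try_tx_on_fn)(void *ctx);

enum tx_on_result { TX_ON_NORMAL, TX_ON_FORCED };

/*
 * Try the non-forced state change up to MAX_TX_RETRIES times; report that
 * a forced change is needed only once the retries are exhausted.
 * *attempts, if non-NULL, is incremented once per poll.
 */
enum tx_on_result enter_tx_on(try_tx_on_fn try_tx_on, void *ctx,
                              unsigned *attempts)
{
    assert(try_tx_on != NULL);
    for (unsigned i = 0; i < MAX_TX_RETRIES; i++) {
        if (attempts)
            (*attempts)++;
        if (try_tx_on(ctx))
            return TX_ON_NORMAL;
    }
    return TX_ON_FORCED;
}

/* Example poll: ctx counts the polls for which the receiver stays busy. */
bool busy_for_n_polls(void *ctx)
{
    unsigned *busy = ctx;
    if (*busy == 0)
        return true;
    (*busy)--;
    return false;
}
```

A receiver that stays busy for three polls is entered normally on the fourth attempt; one that stays busy for seven or more polls ends in the forced path.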
Florian Westphal
72500bc11e netfilter: bridge: rework reject handling
bridge reject handling is not straightforward, there are many subtle
differences depending on configuration.

skb->dev is either the bridge port (PRE_ROUTING) or the bridge
itself (INPUT), so we need to use indev instead.

Also, checksum validation will only work reliably if we trim skb
according to the l3 header size.

While at it, add csum validation for ipv6 and skip existing tests
if skb was already checked e.g. by GRO.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-03 02:10:51 +01:00
Florian Westphal
ee586bbc28 netfilter: reject: don't send icmp error if csum is invalid
tcp resets are never emitted if the packet that triggers the
reject/reset has an invalid checksum.

For icmp error responses there was no such check.
This makes it possible to distinguish icmp responses generated via

iptables -I INPUT -p udp --dport 42 -j REJECT

and those emitted by network stack (won't respond if csum is invalid,
REJECT does).

Arguably it's possible to avoid this by using conntrack and only
using REJECT with -m conntrack NEW/RELATED.

However, this doesn't work when connection tracking is not in use
or when using nf_conntrack_checksum=0.

Furthermore, sending errors in response to invalid csums doesn't make
much sense so just add similar test as in nf_send_reset.

Validate csum if needed and only send the response if it is ok.

Reference: http://bugzilla.redhat.com/show_bug.cgi?id=1169829
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-03 02:10:35 +01:00
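
The idea behind "netfilter: reject: don't send icmp error if csum is invalid" is to verify the checksum before replying. A minimal user-space sketch of the Internet checksum (the RFC 1071 one's-complement sum) follows. It assumes the buffer already contains everything the checksum covers, including any pseudo-header, and the function names are illustrative rather than netfilter's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of 16-bit big-endian words, folded to 16 bits. */
uint16_t inet_fold_sum(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)                      /* odd trailing byte: pad with zero */
        sum += (uint32_t)buf[len - 1] << 8;
    while (sum >> 16)                 /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Data with a correct embedded checksum sums to 0xffff. */
bool csum_ok(const uint8_t *buf, size_t len)
{
    return inet_fold_sum(buf, len) == 0xffff;
}

/* Hypothetical policy hook: reply only if the checksum verifies. */
bool should_send_icmp_error(const uint8_t *buf, size_t len)
{
    return csum_ok(buf, len);
}
```

The same check that already guards TCP resets then also guards ICMP errors, so a corrupted packet produces no response in either case.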
Kim, Ben Young Tae
3267c884ce Bluetooth: btusb: Add support for QCA ROME chipset family
This patch supports ROME Bluetooth family from Qualcomm Atheros,
e.g. QCA61x4 or QCA6574.

The new chipsets have a firmware download sequence similar to previous
Atheros chipsets. However, they don't support vid/pid switching
after downloading the patch, so the firmware needs to be handled by
the btusb module directly.

ROME chipsets can be distinguished from previous versions by reading
the ROM version.

T:  Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 16 Spd=12   MxCh= 0
D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=0cf3 ProdID=e300 Rev= 0.01
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms

T:  Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  8 Spd=12   MxCh= 0
D:  Ver= 2.01 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=0cf3 ProdID=e360 Rev= 0.01
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms

Signed-off-by: Ben Young Tae Kim <ytkim@qca.qualcomm.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:07:01 +01:00
Kim, Ben Young Tae
ace3198258 Bluetooth: btusb: Add setup callback for chip init on USB
Some chipsets do not allow sending patch or config files through the
HCI VS channel at an early stage, and they do not support sending
USB patch files over any channel other than the USB bulk path.

The new callback is for initialization of the BT controller through USB.

Signed-off-by: Ben Young Tae Kim <ytkim@qca.qualcomm.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-03-03 02:07:00 +01:00
Peter Zijlstra
c064a0de1b livepatch: fix RCU usage in klp_find_external_symbol()
While one must hold RCU-sched (aka. preempt_disable) for find_symbol()
one must equally hold it over the use of the object returned.

The moment you release the RCU-sched read lock, the object can be dead
and gone.

[jkosina@suse.cz: change subject line to be aligned with other patches]
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-03-03 00:22:55 +01:00
Arnd Bergmann
4a6155a465 Input: sun4i-ts - add thermal driver dependency
The sun4i-ts driver has had a dependency on the thermal code
with the addition of the thermal zone sensor support, but this
is not currently enforced in Kconfig, so with TOUCHSCREEN_SUN4I=y,
THERMAL=m and THERMAL_OF=y we get

drivers/built-in.o: In function `sun4i_ts_remove':
:(.text+0x2376f4): undefined reference to `thermal_zone_of_sensor_unregister'
drivers/built-in.o: In function `sun4i_ts_probe':
:(.text+0x237a94): undefined reference to `thermal_zone_of_sensor_register'
:(.text+0x237c00): undefined reference to `thermal_zone_of_sensor_unregister'

We need the dependency on THERMAL in order to ensure that this
driver becomes a loadable module if the thermal support itself
is modular, while the dependency on THERMAL_OF is a runtime
dependency and the driver will still build if it is missing.
It is entirely possible to build sun4i-ts without THERMAL_OF
just to use the hwmon sensors and/or touchscreen.

Fixes: 2236971079 ("Input: sun4i-ts - add thermal zone sensor support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[wens@csie.org: Fix description and Kconfig dependencies]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2015-03-02 15:15:50 -08:00
Trond Myklebust
ec3ca4e57e NFSv4: Ensure we skip delegations that are already being returned
In nfs_client_return_marked_delegations() and nfs_delegation_reap_unclaimed()
we want to optimise the loop traversal by skipping delegations that are
already in the process of being returned.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:15 -05:00
Trond Myklebust
9f0f8e12c4 NFSv4: Pin the superblock while we're returning the delegation
This patch ensures that the superblock doesn't go ahead and disappear
underneath us while the state manager thread is returning delegations.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:14 -05:00
Trond Myklebust
ade04647dd NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation()
Ensure that nfs_inode_set_delegation() doesn't inadvertently detach a
delegation that is already in the process of being returned.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:14 -05:00
Trond Myklebust
b04b22f4ca NFSv4: Ensure that we don't reap a delegation that is being returned
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:09:13 -05:00
Anna Schumaker
369d6b7f00 NFS: Fix stateid used for NFS v4 closes
After 566fcec60 the client uses the "current stateid" from the
nfs4_state structure to close a file.  This could potentially contain a
delegation stateid, which is disallowed by the protocol and causes
servers to return NFS4ERR_BAD_STATEID.  This patch restores the
(correct) behavior of sending the open stateid to close a file.

Reported-by: Olga Kornievskaia <kolga@netapp.com>
Fixes: 566fcec60 (NFSv4: Fix an atomicity problem in CLOSE)
Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02 18:06:42 -05:00
Tapasweni Pathak
cfec0e75f5 KVM: MIPS: Enable after disabling interrupt
Enable the disabled interrupt on unsuccessful operation.

Found by Coccinelle.

Signed-off-by: Tapasweni Pathak <tapaswenipathak@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02 19:18:12 -03:00
James Hogan
b3cffac04e KVM: MIPS: Fix trace event to save PC directly
Currently the guest exit trace event saves the VCPU pointer to the
structure, and the guest PC is retrieved by dereferencing it when the
event is printed rather than directly from the trace record. This isn't
safe as the printing may occur long afterwards, after the PC has changed
and potentially after the VCPU has been freed. Usually this results in
the same (wrong) PC being printed for multiple trace events. It also
isn't portable as userland has no way to access the VCPU data structure
when interpreting the trace record itself.

Let's save the actual PC in the structure so that the correct value is
accessible later.

Fixes: 669e846e6c ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # v3.10+
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02 19:17:52 -03:00
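
The problem fixed by "KVM: MIPS: Fix trace event to save PC directly" comes down to recording a pointer versus recording a value. The toy model below uses made-up types (`toy_vcpu`, `rec_ptr`, `rec_val`) and is not the kernel's tracing API; it only shows why a record that dereferences the VCPU at print time reports a stale or wrong PC.

```c
#include <assert.h>
#include <stddef.h>

/* Toy VCPU with just a program counter. */
struct toy_vcpu { unsigned long pc; };

/* Record that keeps a pointer and resolves the PC when printed. */
struct rec_ptr { const struct toy_vcpu *vcpu; };

/* Record that captures the PC at the time of the event. */
struct rec_val { unsigned long pc; };

struct rec_ptr trace_exit_by_pointer(const struct toy_vcpu *v)
{
    assert(v != NULL);
    return (struct rec_ptr){ .vcpu = v };
}

struct rec_val trace_exit_by_value(const struct toy_vcpu *v)
{
    assert(v != NULL);
    return (struct rec_val){ .pc = v->pc };
}

/* "Printing" happens later, possibly after the guest has moved on. */
unsigned long print_by_pointer(const struct rec_ptr *r) { return r->vcpu->pc; }
unsigned long print_by_value(const struct rec_val *r)   { return r->pc; }
```

If the guest PC changes between recording and printing, only the by-value record still shows the PC at the time of the exit; the by-pointer record would additionally be unsafe once the VCPU is freed.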
Linus Torvalds
023a6007a0 Two GPIO fixes for the v4.0 kernel series:
- Fix a translation problem in of_get_named_gpiod_flags()
 - Fix a long standing container_of() mistake in the TPS65912
   driver.

Merge tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Two GPIO fixes:

   - Fix a translation problem in of_get_named_gpiod_flags()

   - Fix a long standing container_of() mistake in the TPS65912 driver"

* tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: tps65912: fix wrong container_of arguments
  gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one chip per node
2015-03-02 14:13:39 -08:00
Linus Torvalds
10d6dfc197 Merge branch 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal management fixes from Eduardo Valentin:
 "Specifics:

   - Several fixes in tmon tool.

   - Fixes in intel int340x for _ART and _TRT tables.

   - Add id for Avoton SoC into powerclamp driver.

   - Fixes in RCAR thermal driver to remove race conditions and fix fail
     path

   - Fixes in TI thermal driver: removal of unnecessary code and build
     fix if !CONFIG_PM_SLEEP

   - Cleanups in exynos thermal driver

   - Add stubs for include/linux/thermal.h.  Now drivers using thermal
     calls but that also work without CONFIG_THERMAL will be able to
     compile for systems that don't care about thermal.

  Note: I am sending this pull on Rui's behalf while he fixes issues in
  his Linux box"

* 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: int340x_thermal: Ignore missing _ART, _TRT tables
  thermal/intel_powerclamp: add id for Avoton SoC
  tools/thermal: tmon: silence 'set but not used' warnings
  tools/thermal: tmon: use pkg-config to determine library dependencies
  tools/thermal: tmon: support cross-compiling
  tools/thermal: tmon: add .gitignore
  tools/thermal: tmon: fixup tui windowing calculations
  tools/thermal: tmon: tui: don't hard-code dialog window size assumptions
  tools/thermal: tmon: add min/max macros
  tools/thermal: tmon: add --target-temp parameter
  thermal: exynos: Clean-up code to use oneline entry for exynos compatible table
  thermal: rcar: Make error and remove paths symmetrical with init
  thermal: rcar: Fix race condition between init and interrupt
  thermal: Introduce dummy functions when thermal is not defined
  ti-soc-thermal: Delete an unnecessary check before the function call "cpufreq_cooling_unregister"
  thermal: ti-soc-thermal: bandgap: Fix build warning if !CONFIG_PM_SLEEP
2015-03-02 14:08:10 -08:00
Filipe Manana
84471e2429 Btrfs: incremental send, don't rename a directory too soon
There's one more case where we can't issue a rename operation for a
directory as soon as we process it. We used to delay directory renames
only if they have some ancestor directory with a higher inode number
that got renamed too, but there's another case where we need to delay
the rename too - when a directory A is renamed to the old name of a
directory B but that directory B has its rename delayed because it
has now (in the send root) an ancestor with a higher inode number that
was renamed. If we don't delay the directory rename in this case, the
receiving end of the send stream will attempt to rename A to the old
name of B before B got renamed to its new name, which results in a
"directory not empty" error. So fix this by delaying directory renames
for this case too.

Steps to reproduce:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ mkdir /mnt/a
  $ mkdir /mnt/b
  $ mkdir /mnt/c
  $ touch /mnt/a/file

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1

  $ mv /mnt/c /mnt/x
  $ mv /mnt/a /mnt/x/y
  $ mv /mnt/b /mnt/a

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2

  $ btrfs send /mnt/snap1 -f /tmp/1.send
  $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.send

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt2
  $ btrfs receive /mnt2 -f /tmp/1.send
  $ btrfs receive /mnt2 -f /tmp/2.send
  ERROR: rename b -> a failed. Directory not empty

A test case for xfstests follows soon.

Reported-by: Ames Cornish <ames@cornishes.net>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
David Sterba
1932b7be97 btrfs: fix lost return value due to variable shadowing
A block-local variable stores error code but btrfs_get_blocks_direct may
not return it in the end as there's a ret defined in the function scope.

CC: <stable@vger.kernel.org>	# 3.6+
Fixes: d187663ef2 ("Btrfs: lock extents as we map them in DIO")
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
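
The bug class named in "btrfs: fix lost return value due to variable shadowing" is easy to reproduce in isolation. The two functions below are hypothetical and much smaller than btrfs_get_blocks_direct(); they only show how a block-local `ret` hides the function-scope one.

```c
#include <assert.h>
#include <errno.h>

/* Buggy shape: the inner declaration shadows the outer ret. */
int map_blocks_shadowed(int fail)
{
    int ret = 0;

    if (fail) {
        int ret = -EIO;   /* new variable, hides the function-scope ret */
        (void)ret;        /* the error never leaves this block */
    }
    return ret;           /* always 0 */
}

/* Fixed shape: assign to the function-scope variable instead. */
int map_blocks_fixed(int fail)
{
    int ret = 0;

    if (fail)
        ret = -EIO;
    return ret;
}
```

Compilers can flag this pattern (for example with GCC's or Clang's `-Wshadow`), which is one way such bugs are found.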
Filipe Manana
5cdf83edb8 Btrfs: do not ignore errors from btrfs_lookup_xattr in do_setxattr
The return value from btrfs_lookup_xattr() can be a pointer encoding an
error, therefore deal with it. This fixes commit 5f5bc6b1e2
("Btrfs: make xattr replace operations atomic").

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
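
"A pointer encoding an error", as mentioned in "Btrfs: do not ignore errors from btrfs_lookup_xattr in do_setxattr", refers to the kernel convention of returning small negative errno values cast to pointers. The sketch below imitates that convention in user space; the `toy_` helpers are stand-ins written for this example, not the kernel's ERR_PTR/IS_ERR/PTR_ERR macros, and `toy_lookup` is a hypothetical function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }

/* The top TOY_MAX_ERRNO addresses are reserved for encoded errors. */
bool toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Decode the errno value again. */
long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical lookup: NULL = not found, error pointer = failure. */
void *toy_lookup(int *item, int err)
{
    return err ? toy_err_ptr(err) : (void *)item;
}

/* Caller that handles the error case before the NULL case. */
int toy_setxattr(int *item, int lookup_err)
{
    void *di = toy_lookup(item, lookup_err);

    if (toy_is_err(di))
        return (int)toy_ptr_err(di);
    if (di == NULL)
        return -ENODATA;
    return 0;
}
```

A caller that only tests for NULL would treat an encoded error as a valid item and dereference it, which is the kind of mistake the commit addresses.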
Filipe Manana
5dfe2be7ea Btrfs: fix off-by-one logic error in btrfs_realloc_node
The end_slot variable actually matches the number of pointers in the
node and not the last slot (which is 'nritems - 1'). Therefore in order
to check that the current slot in the for loop doesn't match the last
one, the correct logic is to check if 'i' is less than 'end_slot - 1'
and not 'end_slot - 2'.

Fix this and set end_slot to be 'nritems - 1', as it's less confusing
since the variable name implies it's inclusive rather than exclusive.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:45 -08:00
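
The off-by-one described in "Btrfs: fix off-by-one logic error in btrfs_realloc_node" can be shown with a reduced loop. This is not the btrfs code: the two functions are hypothetical and merely count the slots for which the "not the last slot" test is true, once with the exclusive bound and `- 2`, once with the inclusive bound.

```c
#include <assert.h>

/* end_slot is exclusive (== nritems) but the test subtracts 2. */
int count_not_last_buggy(int nritems)
{
    int end_slot = nritems;
    int n = 0;

    assert(nritems >= 0);
    for (int i = 0; i < end_slot; i++)
        if (i < end_slot - 2)   /* wrongly excludes slot nritems - 2 */
            n++;
    return n;
}

/* end_slot is inclusive (== nritems - 1), so "not last" is i < end_slot. */
int count_not_last_fixed(int nritems)
{
    int end_slot = nritems - 1;
    int n = 0;

    assert(nritems >= 0);
    for (int i = 0; i <= end_slot; i++)
        if (i < end_slot)
            n++;
    return n;
}
```

For a node with five pointers, four slots have a successor; the buggy test recognises only three of them.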
Filipe Manana
e8c1c76e80 Btrfs: add missing inode update when punching hole
When punching a file hole if we end up only zeroing parts of a page,
because the start offset isn't a multiple of the sector size or the
start offset and length fall within the same page, we were not updating
the inode item. This prevented an fsync from doing anything, if no other
file changes happened in the current transaction, because the fields
in btrfs_inode used to check if the inode needs to be fsync'ed weren't
updated.

This issue is easy to reproduce and the following excerpt from the
xfstest case I made shows how to trigger it:

  _scratch_mkfs >> $seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our test file.
  $XFS_IO_PROG -f -c "pwrite -S 0x22 -b 16K 0 16K" \
      $SCRATCH_MNT/foo | _filter_xfs_io

  # Fsync the file, this makes btrfs update some btrfs inode specific fields
  # that are used to track if the inode needs to be written/updated to the fsync
  # log or not. After this fsync, the new values for those fields indicate that
  # a subsequent fsync does not need to touch the fsync log.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

  # Force a commit of the current transaction. After this point, any operation
  # that modifies the data or metadata of our file, should update those fields in
  # the btrfs inode with values that make the next fsync operation write to the
  # fsync log.
  sync

  # Punch a hole in our file. This small range affects only 1 page.
  # This made the btrfs hole punching implementation write only some zeroes in
  # one page, but it did not update the btrfs inode fields used to determine if
  # the next fsync needs to write to the fsync log.
  $XFS_IO_PROG -c "fpunch 8000 4K" $SCRATCH_MNT/foo

  # Another variation of the previously mentioned case.
  $XFS_IO_PROG -c "fpunch 15000 100" $SCRATCH_MNT/foo

  # Now fsync the file. This was a no-operation because the previous hole punch
  # operation didn't update the inode's fields mentioned before, so they remained
  # with the values they had after the first fsync - that is, they indicate that
  # it is not needed to write to fsync log.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

  echo "File content before:"
  od -t x1 $SCRATCH_MNT/foo

  # Simulate a crash/power loss.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  # Enable writes and mount the fs. This makes the fsync log replay code run.
  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  # Because the last fsync didn't do anything, here the file content matched what
  # it was after the first fsync, before the holes were punched, and not what it
  # was after the holes were punched.
  echo "File content after:"
  od -t x1 $SCRATCH_MNT/foo

This issue has been around since 2012, when the punch hole implementation
was added, commit 2aaa665581 ("Btrfs: add hole punching").

A test case for xfstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:44 -08:00
Josef Bacik
0c0ef4bc84 Btrfs: abort the transaction if we fail to update the free space cache inode
Our gluster boxes were hitting a problem where they'd run out of space when
updating the block group cache and therefore wouldn't be able to update the free
space inode.  This is a problem because this is how we invalidate the cache and
protect ourselves from errors further down the stack, so if this fails we have
to abort the transaction so we make sure we don't end up with stale free space
cache.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:44 -08:00
Filipe Manana
4d884fceaa Btrfs: fix fsync race leading to ordered extent memory leaks
We can have multiple fsync operations against the same file during the
same transaction and they can collect the same ordered extents while they
don't complete (still accessible from the inode's ordered tree). If this
happens, those ordered extents will never get their reference counts
decremented to 0, leading to memory leaks and inode leaks (an iput for an
ordered extent's inode is scheduled only when the ordered extent's refcount
drops to 0). The following sequence diagram explains this race:

         CPU 1                                         CPU 2

btrfs_sync_file()

                                                 btrfs_sync_file()

  mutex_lock(inode->i_mutex)
  btrfs_log_inode()
    btrfs_get_logged_extents()
      --> collects ordered extent X
      --> increments ordered
          extent X's refcount
    btrfs_submit_logged_extents()
  mutex_unlock(inode->i_mutex)

                                                   mutex_lock(inode->i_mutex)
  btrfs_sync_log()
     btrfs_wait_logged_extents()
       --> list_del_init(&ordered->log_list)
                                                     btrfs_log_inode()
                                                       btrfs_get_logged_extents()
                                                         --> Adds ordered extent X
                                                             to logged_list because
                                                             at this point:
                                                             list_empty(&ordered->log_list)
                                                             && test_bit(BTRFS_ORDERED_LOGGED,
                                                                         &ordered->flags) == 0
                                                         --> Increments ordered extent
                                                             X's refcount
       --> check if ordered extent's io is
           finished or not, start it if
           necessary and wait for it to finish
       --> sets bit BTRFS_ORDERED_LOGGED
           on ordered extent X's flags
           and adds it to trans->ordered
  btrfs_sync_log() finishes

                                                       btrfs_submit_logged_extents()
                                                     btrfs_log_inode() finishes
                                                   mutex_unlock(inode->i_mutex)

btrfs_sync_file() finishes

                                                   btrfs_sync_log()
                                                      btrfs_wait_logged_extents()
                                                        --> Sees ordered extent X has the
                                                            bit BTRFS_ORDERED_LOGGED set in
                                                            its flags
                                                        --> X's refcount is untouched
                                                   btrfs_sync_log() finishes

                                                 btrfs_sync_file() finishes

btrfs_commit_transaction()
  --> called by transaction kthread for e.g.
  btrfs_wait_pending_ordered()
    --> waits for ordered extent X to
        complete
    --> decrements ordered extent X's
        refcount by 1 only, corresponding
        to the increment done by the fsync
        task run by CPU 1

In the scenario of the above diagram, after the transaction commit,
the ordered extent will remain with a refcount of 1 forever, leaking
the ordered extent structure. Because the delayed iput is scheduled only
when the ordered extent's refcount drops to 0, the i_count of its inode
never decreases to 0 either, so the inode can never be evicted by the
VFS.

Fix this by using the flag BTRFS_ORDERED_LOGGED differently. Use it to
mean that an ordered extent is already being processed by an fsync call,
which will attach it to the current transaction, preventing it from being
collected by subsequent fsync operations against the same inode.
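
The interleaving from the diagram can be replayed in a small single-threaded
toy model (a sketch only; the struct, its fields and the toy_* functions are
hypothetical simplifications and not the kernel data structures or the actual
patch). It counts only the references taken by fsync tasks and returns what is
left after the transaction commit:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an ordered extent; all fields are simplifications. */
struct toy_ordered_extent {
	int  refs;		/* references taken by fsync tasks */
	bool on_log_list;	/* models !list_empty(&ordered->log_list) */
	bool logged;		/* models the BTRFS_ORDERED_LOGGED bit */
	int  attached;		/* times attached to trans->ordered */
};

/* Models the collection step; returns true if the extent was collected. */
static bool toy_collect(struct toy_ordered_extent *oe, bool fixed)
{
	if (oe->on_log_list || oe->logged)
		return false;
	if (fixed)
		oe->logged = true;	/* claimed by this fsync right away */
	oe->on_log_list = true;
	oe->refs++;
	return true;
}

/* Models the end of the wait step, after the IO has finished. */
static void toy_wait_end(struct toy_ordered_extent *oe, bool fixed)
{
	if (!fixed) {
		if (oe->logged)
			return;		/* refcount left untouched */
		oe->logged = true;
	}
	oe->attached++;			/* added to trans->ordered */
}

/* Replays the diagram and returns the refcount left after the commit. */
static int toy_simulate_race(bool fixed)
{
	struct toy_ordered_extent oe = { 0 };
	bool cpu2_collected;

	toy_collect(&oe, fixed);			/* CPU 1 collects X */
	oe.on_log_list = false;				/* CPU 1: list_del_init() */
	cpu2_collected = toy_collect(&oe, fixed);	/* CPU 2 tries to collect X */
	toy_wait_end(&oe, fixed);			/* CPU 1 finishes waiting */
	if (cpu2_collected) {
		oe.on_log_list = false;			/* CPU 2: list_del_init() */
		toy_wait_end(&oe, fixed);		/* CPU 2 finishes waiting */
	}
	oe.refs -= oe.attached;		/* commit drops one ref per attachment */
	assert(oe.refs >= 0);
	return oe.refs;
}
```

In this model the old flag usage leaves one reference behind, while claiming
the flag at collection time leaves none.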

This race was introduced with the following change (added in 3.19 and
backported to stable 3.18 and 3.17):

  Btrfs: make sure logged extents complete in the current transaction V3
  commit 50d9aa99bd

I ran into this issue while running xfstests/generic/113 in a loop, which
failed about 1 out of 10 runs with the following warning in dmesg:

[ 2612.440038] WARNING: CPU: 4 PID: 22057 at fs/btrfs/disk-io.c:3558 free_fs_root+0x36/0x133 [btrfs]()
[ 2612.442810] Modules linked in: btrfs crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop processor parport_pc parport psmouse thermal_sys i2c_piix4 serio_raw pcspkr evdev microcode button i2c_core ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom virtio_scsi ata_generic virtio_pci ata_piix virtio_ring libata virtio floppy e1000 scsi_mod [last unloaded: btrfs]
[ 2612.452711] CPU: 4 PID: 22057 Comm: umount Tainted: G        W      3.19.0-rc5-btrfs-next-4+ #1
[ 2612.454921] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 2612.457709]  0000000000000009 ffff8801342c3c78 ffffffff8142425e ffff88023ec8f2d8
[ 2612.459829]  0000000000000000 ffff8801342c3cb8 ffffffff81045308 ffff880046460000
[ 2612.461564]  ffffffffa036da56 ffff88003d07b000 ffff880046460000 ffff880046460068
[ 2612.463163] Call Trace:
[ 2612.463719]  [<ffffffff8142425e>] dump_stack+0x4c/0x65
[ 2612.464789]  [<ffffffff81045308>] warn_slowpath_common+0xa1/0xbb
[ 2612.466026]  [<ffffffffa036da56>] ? free_fs_root+0x36/0x133 [btrfs]
[ 2612.467247]  [<ffffffff810453c5>] warn_slowpath_null+0x1a/0x1c
[ 2612.468416]  [<ffffffffa036da56>] free_fs_root+0x36/0x133 [btrfs]
[ 2612.469625]  [<ffffffffa036f2a7>] btrfs_drop_and_free_fs_root+0x93/0x9b [btrfs]
[ 2612.471251]  [<ffffffffa036f353>] btrfs_free_fs_roots+0xa4/0xd6 [btrfs]
[ 2612.472536]  [<ffffffff8142612e>] ? wait_for_completion+0x24/0x26
[ 2612.473742]  [<ffffffffa0370bbc>] close_ctree+0x1f3/0x33c [btrfs]
[ 2612.475477]  [<ffffffff81059d1d>] ? destroy_workqueue+0x148/0x1ba
[ 2612.476695]  [<ffffffffa034e3da>] btrfs_put_super+0x19/0x1b [btrfs]
[ 2612.477911]  [<ffffffff81153e53>] generic_shutdown_super+0x73/0xef
[ 2612.479106]  [<ffffffff811540e2>] kill_anon_super+0x13/0x1e
[ 2612.480226]  [<ffffffffa034e1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
[ 2612.481471]  [<ffffffff81154307>] deactivate_locked_super+0x3b/0x50
[ 2612.482686]  [<ffffffff811547a7>] deactivate_super+0x3f/0x43
[ 2612.483791]  [<ffffffff8116b3ed>] cleanup_mnt+0x59/0x78
[ 2612.484842]  [<ffffffff8116b44c>] __cleanup_mnt+0x12/0x14
[ 2612.485900]  [<ffffffff8105d019>] task_work_run+0x8f/0xbc
[ 2612.486960]  [<ffffffff810028d8>] do_notify_resume+0x5a/0x6b
[ 2612.488083]  [<ffffffff81236e5b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 2612.489333]  [<ffffffff8142a17f>] int_signal+0x12/0x17
[ 2612.490353] ---[ end trace 54a960a6bdcb8d93 ]---
[ 2612.557253] VFS: Busy inodes after unmount of sdb. Self-destruct in 5 seconds.  Have a nice day...

Kmemleak confirmed the ordered extent leak (and btrfs inode specific
structures such as delayed nodes):

$ cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff880154290db0 (size 576):
  comm "btrfsck", pid 21980, jiffies 4295542503 (age 1273.412s)
  hex dump (first 32 bytes):
    01 40 00 00 01 00 00 00 b0 1d f1 4e 01 88 ff ff  .@.........N....
    00 00 00 00 00 00 00 00 c8 0d 29 54 01 88 ff ff  ..........)T....
  backtrace:
    [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a
    [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83
    [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190
    [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac
    [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs]
    [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs]
    [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs]
    [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs]
    [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs]
    [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed
    [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa
    [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b
    [<ffffffff81429e92>] system_call_fastpath+0x12/0x17
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff88014ef11db0 (size 576):
  comm "rm", pid 22009, jiffies 4295542593 (age 1273.052s)
  hex dump (first 32 bytes):
    02 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 c8 1d f1 4e 01 88 ff ff  ...........N....
  backtrace:
    [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a
    [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83
    [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190
    [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac
    [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs]
    [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs]
    [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs]
    [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs]
    [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs]
    [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed
    [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa
    [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b
    [<ffffffff81429e92>] system_call_fastpath+0x12/0x17
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff8800336feda8 (size 584):
  comm "aio-stress", pid 22031, jiffies 4295543006 (age 1271.400s)
  hex dump (first 32 bytes):
    00 40 3e 00 00 00 00 00 00 00 8f 42 00 00 00 00  .@>........B....
    00 00 01 00 00 00 00 00 00 00 01 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8114eb34>] create_object+0x172/0x29a
    [<ffffffff8141d790>] kmemleak_alloc+0x25/0x41
    [<ffffffff81141ae6>] kmemleak_alloc_recursive.constprop.52+0x16/0x18
    [<ffffffff81145288>] kmem_cache_alloc+0xf7/0x198
    [<ffffffffa0389243>] __btrfs_add_ordered_extent+0x43/0x309 [btrfs]
    [<ffffffffa038968b>] btrfs_add_ordered_extent_dio+0x12/0x14 [btrfs]
    [<ffffffffa03810e2>] btrfs_get_blocks_direct+0x3ef/0x571 [btrfs]
    [<ffffffff81181349>] do_blockdev_direct_IO+0x62a/0xb47
    [<ffffffff8118189a>] __blockdev_direct_IO+0x34/0x36
    [<ffffffffa03776e5>] btrfs_direct_IO+0x16a/0x1e8 [btrfs]
    [<ffffffff81100373>] generic_file_direct_write+0xb8/0x12d
    [<ffffffffa038615c>] btrfs_file_write_iter+0x24b/0x42f [btrfs]
    [<ffffffff8118bb0d>] aio_run_iocb+0x2b7/0x32e
    [<ffffffff8118c99a>] do_io_submit+0x26e/0x2ff
    [<ffffffff8118ca3b>] SyS_io_submit+0x10/0x12
    [<ffffffff81429e92>] system_call_fastpath+0x12/0x17

CC: <stable@vger.kernel.org> # 3.19, 3.18 and 3.17
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02 14:04:44 -08:00
Radim Krčmář
f563db4bdb KVM: SVM: fix interrupt injection (apic->isr_count always 0)
In commit b4eef9b36d, we started to use hwapic_isr_update() != NULL
instead of kvm_apic_vid_enabled(vcpu->kvm).  This didn't work because
SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
change apic->isr_count, because it should always be 1.  The initial
value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
which is always 0 for SVM, so KVM could have injected interrupts when it
shouldn't.

Fix it by implicitly setting SVM's hwapic_isr_update to NULL and make the
initial isr_count depend on hwapic_isr_update() for good measure.
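
The interaction between the hook and the counter can be sketched with a toy
model (an illustration under assumptions, not the KVM code; the toy_* names and
struct layouts are hypothetical, only hwapic_isr_update and isr_count are taken
from this message). When the hook is defined, setting an ISR bit leaves the
counter alone, so the counter has to start at 1; when it is NULL, the counter
is maintained in software and starts at 0:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy ops table; a NULL hook means the ISR is tracked in software. */
struct toy_x86_ops {
	void (*hwapic_isr_update)(int isr);
};

struct toy_apic {
	int isr_count;
};

static void toy_stub_isr_update(int isr)
{
	(void)isr;	/* stub: defined, but does nothing */
}

/* Initial counter derived from the presence of the hook. */
static int toy_initial_count(bool hook_defined)
{
	return hook_defined ? 1 : 0;
}

/* Models setting one ISR bit, then returns the resulting counter. */
static int toy_count_after_set(bool hook_defined, int initial)
{
	struct toy_x86_ops ops = { hook_defined ? toy_stub_isr_update : NULL };
	struct toy_apic apic = { initial };

	assert(initial == 0 || initial == 1);
	if (ops.hwapic_isr_update)
		ops.hwapic_isr_update(0);	/* "apicv" path: counter untouched */
	else
		apic.isr_count++;		/* software path */
	return apic.isr_count;
}
```

A defined hook combined with an initial value of 0 keeps the counter at 0 even
with an interrupt in service, which is the mismatch described above.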

Fixes: b4eef9b36d ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv")
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02 19:04:40 -03:00
Linus Torvalds
1a6f77ab08 3 md fixes for 4.0
- fix a read-balance problem that was reported 2 years ago, but
   that I never noticed the report :-(
 - fix for rare RAID6 problem causing incorrect bitmap updates when
   two devices fail.
 - add __ATTR_PREALLOC annotation now that it is possible.

Merge tag 'md/4.0-fixes' of git://neil.brown.name/md

Pull md fixes from Neil Brown:
 "Three md fixes:

   - fix a read-balance problem that was reported 2 years ago, but that
     I never noticed the report :-(

   - fix for rare RAID6 problem causing incorrect bitmap updates when
     two devices fail.

   - add __ATTR_PREALLOC annotation now that it is possible"

* tag 'md/4.0-fixes' of git://neil.brown.name/md:
  md: mark some attributes as pre-alloc
  raid5: check faulty flag for array status during recovery.
  md/raid1: fix read balance when a drive is write-mostly.
2015-03-02 14:03:27 -08:00
Linus Torvalds
49db1f0ef2 arch/metag fixes for v4.0
This is just a single patch to fix the KSTK_EIP() and KSTK_ESP() macros
 for metag which have always been erroneously returning the PC and stack
 pointer of the task's kernel context rather than from its user context
 saved at entry from userland into the kernel, which affects the contents
 of /proc/<pid>/maps and /proc/<pid>/stat.

Merge tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull arch/metag fix from James Hogan:
 "This is just a single patch to fix the KSTK_EIP() and KSTK_ESP()
  macros for metag which have always been erroneously returning the PC
  and stack pointer of the task's kernel context rather than from its
  user context saved at entry from userland into the kernel, which
  affects the contents of /proc/<pid>/maps and /proc/<pid>/stat"

* tag 'metag-fixes-v4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  metag: Fix KSTK_EIP() and KSTK_ESP() macros
2015-03-02 14:02:17 -08:00
David S. Miller
b898441f4e Merge branch 'neigh_cleanups'
Eric W. Biederman says:

====================
Neighbour table and ax25 cleanups

While looking at the neighbour table to see what it would take to allow
using next hops in a different address family than the current packets
I found a partial resolution for my issues and I stumbled upon some
work that makes the neighbour table code easier to understand and
maintain.

Long ago, in a much younger kernel, ax25 found a hack to use
dev_rebuild_header to transmit its packets instead of going through
what today is ndo_start_xmit.

When the neighbour table was rewritten into its current form the ax25
code was such a challenge that arp_broken_ops appeared in arp.c and
neigh_compat_output appeared in neighbour.c to keep the ax25 hack alive.

With a little bit of work I was able to remove some of the hack that
is the ax25 transmit path for ip packets and to isolate what remains
into a slightly more readable piece of code in ax25_ip.c, removing the
need for the generic code to worry about ax25 special cases.

After cleaning up the old ax25 hacks I also performed a little bit of
work on neigh_resolve_output to remove the need for a dst entry and to
ensure cached headers get a deterministic protocol value in their cached
header.   This guarantees that a cached header will not be different
depending on which protocol of packet is transmitted, and it allows
packets to be transmitted that don't have a dst entry.  There remains
a small amount of code that takes advantage of when packets have a dst
entry but that is something different.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 16:43:46 -05:00
Eric W. Biederman
435e8eb27e neigh: Don't require a dst in neigh_resolve_output
Having a dst helps a little bit for teql but is fundamentally
unnecessary, and there are code paths without a dst available where
it would be nice to use the neighbour cache.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 16:43:41 -05:00
Eric W. Biederman
bdf53c5849 neigh: Don't require dst in neigh_hh_init
- Add protocol to neigh_tbl so that dst->ops->protocol is not needed
- Acquire the device from neigh->dev

This results in a neigh_hh_init that will cache the same values
regardless of the packets flowing through it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 16:43:41 -05:00
Eric W. Biederman
59b2af26b9 arp: Kill arp_find
There are no more callers so kill this function.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 16:43:41 -05:00