Commit graph

565201 commits

Author SHA1 Message Date
Steffen Maier
4abdfdd09e zfcp: fix fc_host port_type with NPIV
commit bd77befa5bcff8c51613de271913639edf85fbc2 upstream.

For an NPIV-enabled FCP device, zfcp can erroneously show
"NPort (fabric via point-to-point)" instead of "NPIV VPORT"
for the port_type sysfs attribute of the corresponding
fc_host.
s390-tools that can be affected are dbginfo.sh and ziomon.

zfcp_fsf_exchange_config_evaluate() ignores
fsf_qtcb_bottom_config.connection_features indicating NPIV
and only sets fc_host_port_type to FC_PORTTYPE_NPORT if
fsf_qtcb_bottom_config.fc_topology is FSF_TOPO_FABRIC.

Only the independent zfcp_fsf_exchange_port_evaluate()
evaluates connection_features to overwrite fc_host_port_type
to FC_PORTTYPE_NPIV in case of NPIV.
Code was introduced with upstream kernel 2.6.30
commit 0282985da5
("[SCSI] zfcp: Report fc_host_port_type as NPIV").

This works during FCP device recovery (such as set online)
because it performs FSF_QTCB_EXCHANGE_CONFIG_DATA followed by
FSF_QTCB_EXCHANGE_PORT_DATA in sequence.

However, the zfcp-specific scsi host sysfs attributes
"requests", "megabytes", or "seconds_active" trigger only
zfcp_fsf_exchange_config_evaluate() resetting fc_host
port_type to FC_PORTTYPE_NPORT despite NPIV.

The zfcp-specific scsi host sysfs attribute "utilization"
triggers only zfcp_fsf_exchange_port_evaluate() correcting
the fc_host port_type again in case of NPIV.

Evaluate fsf_qtcb_bottom_config.connection_features
in zfcp_fsf_exchange_config_evaluate() where it belongs to.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 0282985da5 ("[SCSI] zfcp: Report fc_host_port_type as NPIV")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Richard Weinberger
7f2e25fa12 ubi: Deal with interrupted erasures in WL
commit 2365418879e9abf12ea9def7f9f3caf0dfa7ffb0 upstream.

When Fastmap is used we can face here an -EBADMSG
since Fastmap cannot know about unmaps.
If the erasure was interrupted the PEB may show ECC
errors and UBI would go to ro-mode as it assumes
that the PEB was check during attach time, which is
not the case with Fastmap.

Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Laurent Dufour
b57af607da powerpc/pseries: Fix stack corruption in htpe code
commit 05af40e885955065aee8bb7425058eb3e1adca08 upstream.

This commit fixes a stack corruption in the pseries specific code dealing
with the huge pages.

In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments
to the hypervisor is not large enough. This leads to a stack corruption
where a previously saved register could be corrupted leading to unexpected
result in the caller, like the following panic:

  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: virtio_balloon ip_tables x_tables autofs4
  virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio
  CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 #76
  task: c000000005394880 task.stack: c000000005570000
  NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000
  REGS: c000000005573820 TRAP: 0300   Not tainted  (4.8.0)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 84822884  XER: 20000000
  CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1
  GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964
  GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104
  GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5
  GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000
  GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000
  GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000
  GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964
  GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0
  NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470
  LR [c00000000027bf64] zap_huge_pmd+0x44/0x470
  Call Trace:
  [c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable)
  [c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0
  [c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120
  [c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0
  [c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0
  [c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0
  [c000000005573e30] [c000000000009560] system_call+0x38/0x108
  Instruction dump:
  fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1
  7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378

Most of the time, the bug is surfacing in a caller up in the stack from
__pSeries_lpar_hugepage_invalidate() which is quite confusing.

This bug is pending since v3.11 but was hidden if a caller of the
caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped
register (r18 in this case) in the stack and is not using it until
restoring it. GCC 6.2.0 seems to raise it more frequently.

This commit also change the definition of the parameter buffer in
pSeries_lpar_flush_hash_range() to rely on the global define
PLPAR_HCALL9_BUFSIZE (no functional change here).

Fixes: 1a5272866f ("powerpc: Optimize hugepage invalidate")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Paul Mackerras
800b55c471 powerpc/64: Fix incorrect return value from __copy_tofrom_user
commit 1a34439e5a0b2235e43f96816dbb15ee1154f656 upstream.

Debugging a data corruption issue with virtio-net/vhost-net led to
the observation that __copy_tofrom_user was occasionally returning
a value 16 larger than it should.  Since the return value from
__copy_tofrom_user is the number of bytes not copied, this means
that __copy_tofrom_user can occasionally return a value larger
than the number of bytes it was asked to copy.  In turn this can
cause higher-level copy functions such as copy_page_to_iter_iovec
to corrupt memory by copying data into the wrong memory locations.

It turns out that the failing case involves a fault on the store
at label 79, and at that point the first unmodified byte of the
destination is at R3 + 16.  Consequently the exception handler
for that store needs to add 16 to R3 before using it to work out
how many bytes were not copied, but in this one case it was not
adding the offset to R3.  To fix it, this moves the label 179 to
the point where we add 16 to R3.  I have checked manually all the
exception handlers for the loads and stores in this code and the
rest of them are correct (it would be excellent to have an
automated test of all the exception cases).

This bug has been present since this code was initially
committed in May 2002 to Linux version 2.5.20.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Gavin Shan
8f7929ce0c powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()
commit 5adaf8629b193f185ca5a1665b9e777a0579f518 upstream.

This fixes the warnings reported from sparse:

  pci.c:312:33: warning: restricted __be64 degrades to integer
  pci.c:313:33: warning: restricted __be64 degrades to integer

Fixes: cee72d5bb4 ("powerpc/powernv: Display diag data on p7ioc EEH errors")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Gavin Shan
450a3ad866 powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()
commit a7032132d7560a8434e1f54b71efd7fa20f073bd upstream.

The hub diag-data type is filled with big-endian data by OPAL call
opal_pci_get_hub_diag_data(). We need convert it to CPU-endian value
before using it. The issue is reported by sparse as pointed by Michael
Ellerman:

  eeh-powernv.c:1309:21: warning: restricted __be16 degrades to integer

This converts hub diag-data type to CPU-endian before using it in
pnv_eeh_get_and_dump_hub_diag().

Fixes: 2a485ad7c8 ("powerpc/powernv: Drop PHB operation next_error()")
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Gavin Shan
f0a933ef44 powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()
commit d63e51b31e0b655ed0f581b8a8fd4c4b4f8d1919 upstream.

The PE number (@frozen_pe_no), filled by opal_pci_next_error() is in
big-endian format. It should be converted to CPU-endian before it is
passed to opal_pci_eeh_freeze_clear() when clearing the frozen state if
the PE is invalid one. As Michael Ellerman pointed out, the issue is
also detected by sparse:

  eeh-powernv.c:1541:41: warning: incorrect type in argument 2 (different base types)

This passes CPU-endian PE number to opal_pci_eeh_freeze_clear() and it
should be part of commit <0f36db77643b> ("powerpc/eeh: Fix wrong printed
PE number"), which was merged to 4.3 kernel.

Fixes: 71b540adff ("powerpc/powernv: Don't escalate non-existing frozen PE")
Suggested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Anton Blanchard
74c7701815 powerpc/vdso64: Use double word compare on pointers
commit 5045ea37377ce8cca6890d32b127ad6770e6dce5 upstream.

__kernel_get_syscall_map() and __kernel_clock_getres() use cmpli to
check if the passed in pointer is non zero. cmpli maps to a 32 bit
compare on binutils, so we ignore the top 32 bits.

A simple test case can be created by passing in a bogus pointer with
the bottom 32 bits clear. Using a clk_id that is handled by the VDSO,
then one that is handled by the kernel shows the problem:

  printf("%d\n", clock_getres(CLOCK_REALTIME, (void *)0x100000000));
  printf("%d\n", clock_getres(CLOCK_BOOTTIME, (void *)0x100000000));

And we get:

  0
  -1

The bigger issue is if we pass a valid pointer with the bottom 32 bits
clear, in this case we will return success but won't write any data
to the pointer.

I stumbled across this issue because the LLVM integrated assembler
doesn't accept cmpli with 3 arguments. Fix this by converting them to
cmpldi.

Fixes: a7f290dad3 ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Rabin Vincent
a35c9d6749 dm crypt: fix crash on exit
commit f659b10087daaf4ce0087c3f6aec16746be9628f upstream.

As the documentation for kthread_stop() says, "if threadfn() may call
do_exit() itself, the caller must ensure task_struct can't go away".
dm-crypt does not ensure this and therefore crashes when crypt_dtr()
calls kthread_stop().  The crash is trivially reproducible by adding a
delay before the call to kthread_stop() and just opening and closing a
dm-crypt device.

 general protection fault: 0000 [#1] PREEMPT SMP
 CPU: 0 PID: 533 Comm: cryptsetup Not tainted 4.8.0-rc7+ #7
 task: ffff88003bd0df40 task.stack: ffff8800375b4000
 RIP: 0010: kthread_stop+0x52/0x300
 Call Trace:
  crypt_dtr+0x77/0x120
  dm_table_destroy+0x6f/0x120
  __dm_destroy+0x130/0x250
  dm_destroy+0x13/0x20
  dev_remove+0xe6/0x120
  ? dev_suspend+0x250/0x250
  ctl_ioctl+0x1fc/0x530
  ? __lock_acquire+0x24f/0x1b10
  dm_ctl_ioctl+0x13/0x20
  do_vfs_ioctl+0x91/0x6a0
  ? ____fput+0xe/0x10
  ? entry_SYSCALL_64_fastpath+0x5/0xbd
  ? trace_hardirqs_on_caller+0x151/0x1e0
  SyS_ioctl+0x41/0x70
  entry_SYSCALL_64_fastpath+0x1f/0xbd

This problem was introduced by bcbd94ff48 ("dm crypt: fix a possible
hang due to race condition on exit").

Looking at the description of that patch (excerpted below), it seems
like the problem it addresses can be solved by just using
set_current_state instead of __set_current_state, since we obviously
need the memory barrier.

| dm crypt: fix a possible hang due to race condition on exit
|
| A kernel thread executes __set_current_state(TASK_INTERRUPTIBLE),
| __add_wait_queue, spin_unlock_irq and then tests kthread_should_stop().
| It is possible that the processor reorders memory accesses so that
| kthread_should_stop() is executed before __set_current_state().  If
| such reordering happens, there is a possible race on thread
| termination: [...]

So this patch just reverts the aforementioned patch and changes the
__set_current_state(TASK_INTERRUPTIBLE) to set_current_state(...).  This
fixes the crash and should also fix the potential hang.

Fixes: bcbd94ff48 ("dm crypt: fix a possible hang due to race condition on exit")
Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Mike Snitzer
293441e493 dm mpath: check if path's request_queue is dying in activate_path()
commit f10e06b744074824fb8ec7066bc03ecc90918f5b upstream.

If pg_init_retries is set and a request is queued against a multipath
device with all underlying block device request_queues in the "dying"
state then an infinite loop is triggered because activate_path() never
succeeds and hence never calls pg_init_done().

This change avoids that device removal triggers an infinite loop by
failing the activate_path() which causes the "dying" path to be failed.

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:28 -04:00
Minfei Huang
bf74a108c6 dm: return correct error code in dm_resume()'s retry loop
commit 8dc23658b7aaa8b6b0609c81c8ad75e98b612801 upstream.

dm_resume() will return success (0) rather than -EINVAL if
!dm_suspended_md() upon retry within dm_resume().

Reset the error code at the start of dm_resume()'s retry loop.
Also, remove a useless assignment at the end of dm_resume().

Fixes: ffcc393641 ("dm: enhance internal suspend and resume interface")
Signed-off-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Bart Van Assche
90be7f1538 dm: mark request_queue dead before destroying the DM device
commit 3b785fbcf81c3533772c52b717f77293099498d3 upstream.

This avoids that new requests are queued while __dm_destroy() is in
progress.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Adrian Hunter
b1d528d090 perf intel-pt: Fix MTC timestamp calculation for large MTC periods
commit 3bccbe20f6d188ce7b00326e776b745cfd35b10a upstream.

The MTC packet provides a 8-bit slice of CTC which is related to TSC by
the TMA packet, however the TMA packet only provides the lower 16 bits
of CTC.  If mtc_shift > 8 then some of the MTC bits are not in the CTC
provided by the TMA packet. Fix-up the last_mtc calculated from the TMA
packet by copying the missing bits from the current MTC assuming the
least difference between the two, and that the current MTC comes after
last_mtc.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1475062896-22274-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Adrian Hunter
d04565939c perf intel-pt: Fix estimated timestamps for cycle-accurate mode
commit 51ee6481fa8e879cc942bcc1b0af713e158b7a98 upstream.

In cycle-accurate mode, timestamps can be calculated from CYC packets.
The decoder also estimates timestamps based on the number of
instructions since the last timestamp. For that to work in
cycle-accurate mode, the instruction count needs to be reset to zero
when a timestamp is calculated from a CYC packet, but that wasn't
happening, so fix it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1475062896-22274-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Adrian Hunter
0ea2bdfcdd perf intel-pt: Fix snapshot overlap detection decoder errors
commit 810c398bc09b2f2dfde52a7d2483a710612c5fb8 upstream.

Fix occasional decoder errors decoding trace data collected in snapshot
mode.

Snapshot mode can take successive snapshots of trace which might overlap.
The decoder checks whether there is an overlap but only looks at the
current and previous buffer. However buffers that do not contain
synchronization (i.e. PSB) packets cannot be decoded or used for overlap
checking. That means the decoder actually needs to check overlaps between
the current buffer and the previous buffer that contained usable data.
Make that change.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lkml.kernel.org/r/1474641528-18776-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Andrew Bresticker
987ec3eb8f pstore/ram: Use memcpy_fromio() to save old buffer
commit d771fdf94180de2bd811ac90cba75f0f346abf8d upstream.

The ramoops buffer may be mapped as either I/O memory or uncached
memory.  On ARM64, this results in a device-type (strongly-ordered)
mapping.  Since unnaligned accesses to device-type memory will
generate an alignment fault (regardless of whether or not strict
alignment checking is enabled), it is not safe to use memcpy().
memcpy_fromio() is guaranteed to only use aligned accesses, so use
that instead.

Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Enric Balletbo Serra <enric.balletbo@collabora.com>
Reviewed-by: Puneet Kumar <puneetster@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Furquan Shaikh
93a6ca7f6d pstore/ram: Use memcpy_toio instead of memcpy
commit 7e75678d23167c2527e655658a8ef36a36c8b4d9 upstream.

persistent_ram_update uses vmap / iomap based on whether the buffer is in
memory region or reserved region. However, both map it as non-cacheable
memory. For armv8 specifically, non-cacheable mapping requests use a
memory type that has to be accessed aligned to the request size. memcpy()
doesn't guarantee that.

Signed-off-by: Furquan Shaikh <furquan@google.com>
Signed-off-by: Enric Balletbo Serra <enric.balletbo@collabora.com>
Reviewed-by: Aaron Durbin <adurbin@chromium.org>
Reviewed-by: Olof Johansson <olofj@chromium.org>
Tested-by: Furquan Shaikh <furquan@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Sebastian Andrzej Siewior
8a3a5110c3 pstore/core: drop cmpxchg based updates
commit d5a9bf0b38d2ac85c9a693c7fb851f74fd2a2494 upstream.

I have here a FPGA behind PCIe which exports SRAM which I use for
pstore. Now it seems that the FPGA no longer supports cmpxchg based
updates and writes back 0xff…ff and returns the same.  This leads to
crash during crash rendering pstore useless.
Since I doubt that there is much benefit from using cmpxchg() here, I am
dropping this atomic access and use the spinlock based version.

Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Rabin Vincent <rabinv@axis.com>
Tested-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
[kees: remove "_locked" suffix since it's the only option now]
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Sebastian Andrzej Siewior
6ae56031f1 pstore/ramoops: fixup driver removal
commit 4407de74df18ed405cc5998990004c813ccfdbde upstream.

A basic rmmod ramoops segfaults. Let's see why.

Since commit 34f0ec82e0 ("pstore: Correct the max_dump_cnt clearing of
ramoops") sets ->max_dump_cnt to zero before looping over ->przs but we
didn't use it before that either.

And since commit ee1d267423 ("pstore: add pstore unregister") we free
that memory on rmmod.

But even then, we looped until a NULL pointer or ERR. I don't see where
it is ensured that the last member is NULL. Let's try this instead:
simply error recovery and free. Clean up in error case where resources
were allocated. And then, in the free path, rely on ->max_dump_cnt in
the free path.

Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Helge Deller
e3fffa6705 parisc: Increase initial kernel mapping size
commit 65bf34f59594c11f13d371c5334a6a0a385cd7ae upstream.

Increase the initial kernel default page mapping size for 64-bit kernels to
64 MB and for 32-bit kernels to 32 MB.

Due to the additional support of ftrace, tracepoint and huge pages the kernel
size can exceed the sizes we used up to now.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Helge Deller
59db586d09 parisc: Fix kernel memory layout regarding position of __gp
commit f8850abb7ba68229838014b3409460e576751c6d upstream.

Architecturally we need to keep __gp below 0x1000000.

But because of ftrace and tracepoint support, the RO_DATA_SECTION now gets much
bigger than it was before. By moving the linkage tables before RO_DATA_SECTION
we can avoid that __gp gets positioned at a too high address.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:27 -04:00
Helge Deller
97ba83a01c parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels
commit 690d097c00c88fa9d93d198591e184164b1d8c20 upstream.

Increase the initial kernel default page mapping size for SMP kernels to 32MB
and add a runtime check which panics early if the kernel is bigger than the
initial mapping size.

This fixes boot crashes of 32bit SMP kernels. Due to the introduction of huge
page support in kernel 4.4 and it's required initial kernel layout in memory, a
32bit SMP kernel usually got bigger (in layout, not size) than 16MB.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Srinivas Pandruvada
d756b52c8b cpufreq: intel_pstate: Fix unsafe HWP MSR access
commit f9f4872df6e1801572949f8a370c886122d4b6da upstream.

This is a requirement that MSR MSR_PM_ENABLE must be set to 0x01 before
reading MSR_HWP_CAPABILITIES on a given CPU. If cpufreq init() is
scheduled on a CPU which is not same as policy->cpu or migrates to a
different CPU before calling msr read for MSR_HWP_CAPABILITIES, it
is possible that MSR_PM_ENABLE was not to set to 0x01 on that CPU.
This will cause GP fault. So like other places in this path
rdmsrl_on_cpu should be used instead of rdmsrl.

Moreover the scope of MSR_HWP_CAPABILITIES is on per thread basis, so it
should be read from the same CPU, for which MSR MSR_HWP_REQUEST is
getting set.

dmesg dump or warning:

[   22.014488] WARNING: CPU: 139 PID: 1 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x68/0x70
[   22.014492] unchecked MSR access error: RDMSR from 0x771
[   22.014493] Modules linked in:
[   22.014507] CPU: 139 PID: 1 Comm: swapper/0 Not tainted 4.7.5+ #1
...
...
[   22.014516] Call Trace:
[   22.014542]  [<ffffffff813d7dd1>] dump_stack+0x63/0x82
[   22.014558]  [<ffffffff8107bc8b>] __warn+0xcb/0xf0
[   22.014561]  [<ffffffff8107bcff>] warn_slowpath_fmt+0x4f/0x60
[   22.014563]  [<ffffffff810676f8>] ex_handler_rdmsr_unsafe+0x68/0x70
[   22.014564]  [<ffffffff810677d9>] fixup_exception+0x39/0x50
[   22.014604]  [<ffffffff8102e400>] do_general_protection+0x80/0x150
[   22.014610]  [<ffffffff817f9ec8>] general_protection+0x28/0x30
[   22.014635]  [<ffffffff81687940>] ? get_target_pstate_use_performance+0xb0/0xb0
[   22.014642]  [<ffffffff810600c7>] ? native_read_msr+0x7/0x40
[   22.014657]  [<ffffffff81688123>] intel_pstate_hwp_set+0x23/0x130
[   22.014660]  [<ffffffff81688406>] intel_pstate_set_policy+0x1b6/0x340
[   22.014662]  [<ffffffff816829bb>] cpufreq_set_policy+0xeb/0x2c0
[   22.014664]  [<ffffffff81682f39>] cpufreq_init_policy+0x79/0xe0
[   22.014666]  [<ffffffff81682cb0>] ? cpufreq_update_policy+0x120/0x120
[   22.014669]  [<ffffffff816833a6>] cpufreq_online+0x406/0x820
[   22.014671]  [<ffffffff8168381f>] cpufreq_add_dev+0x5f/0x90
[   22.014717]  [<ffffffff81530ac8>] subsys_interface_register+0xb8/0x100
[   22.014719]  [<ffffffff816821bc>] cpufreq_register_driver+0x14c/0x210
[   22.014749]  [<ffffffff81fe1d90>] intel_pstate_init+0x39d/0x4d5
[   22.014751]  [<ffffffff81fe13f2>] ? cpufreq_gov_dbs_init+0x12/0x12

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Sergei Shtylyov
3fc96b9e2b platform: don't return 0 from platform_get_irq[_byname]() on error
commit e330b9a6bb35dc7097a4f02cb1ae7b6f96df92af upstream.

of_irq_get[_byname]() return 0 iff  irq_create_of_mapping() call fails.
Returning both  error code and 0 on failure is a sign of a misdesigned API,
it makes the failure check unnecessarily complex and error prone. We should
rely  on the platform IRQ resource in this case, not return 0,  especially
as 0 can be  a valid  IRQ resource too...

Fixes: aff008ad81 ("platform_get_irq: Revert to platform_get_resource if of_irq_get fails")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Maik Broemme
e7837af616 PCI: Mark Atheros AR9580 to avoid bus reset
commit 8e2e03179923479ca0c0b6fdc7c93ecf89bce7a8 upstream.

Similar to the AR93xx and the AR94xx series, the AR95xx also have the same
quirk for the Bus Reset.  It will lead to instant system reset if the
device is assigned via VFIO to a KVM VM.  I've been able reproduce this
behavior with a MikroTik R11e-2HnD.

Fixes: c3e59ee4e7 ("PCI: Mark Atheros AR93xx to avoid bus reset")
Signed-off-by: Maik Broemme <mbroemme@libmpq.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Haibo Chen
68b13e229b mmc: sdhci: cast unsigned int to unsigned long long to avoid unexpeted error
commit 02265cd60335a2c1417abae4192611e1fc05a6e5 upstream.

Potentially overflowing expression 1000000 * data->timeout_clks with
type unsigned int is evaluated using 32-bit arithmetic, and then used
in a context that expects an expression of type unsigned long long.

To avoid overflow, cast 1000000U to type unsigned long long.
Special thanks to Coverity.

Fixes: 7f05538af71c ("mmc: sdhci: fix data timeout (part 2)")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Daniel Glöckner
bc7e4a29a5 mmc: block: don't use CMD23 with very old MMC cards
commit 0ed50abb2d8fc81570b53af25621dad560cd49b3 upstream.

CMD23 aka SET_BLOCK_COUNT was introduced with MMC v3.1.
Older versions of the specification allowed to terminate
multi-block transfers only with CMD12.

The patch fixes the following problem:

  mmc0: new MMC card at address 0001
  mmcblk0: mmc0:0001 SDMB-16 15.3 MiB
  mmcblk0: timed out sending SET_BLOCK_COUNT command, card status 0x400900
  ...
  blk_update_request: I/O error, dev mmcblk0, sector 0
  Buffer I/O error on dev mmcblk0, logical block 0, async page read
   mmcblk0: unable to read partition table

Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Larry Finger
7dcd1cd077 rtlwifi: Fix missing country code for Great Britain
commit 0c9d3491530773858ff9d705ec2a9c382f449230 upstream.

Some RTL8821AE devices sold in Great Britain have the country code of
0x25 encoded in their EEPROM. This value is not tested in the routine
that establishes the regulatory info for the chip. The fix is to set
this code to have the same capabilities as the EU countries. In addition,
the channels allowed for COUNTRY_CODE_ETSI were more properly suited
for China and Israel, not the EU. This problem has also been fixed.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Lin Huang
8bfd0c41d2 PM / devfreq: event: remove duplicate devfreq_event_get_drvdata()
commit c8a9a6daccad495c48d5435d3487956ce01bc6a1 upstream.

there define two devfreq_event_get_drvdata() function in devfreq-event.h
when disable CONFIG_PM_DEVFREQ_EVENT, it will lead to build fail. So
remove devfreq_event_get_drvdata() function.

Fixes: f262f28c14 ("PM / devfreq: event: Add devfreq_event class")
Signed-off-by: Lin Huang <hl@rock-chips.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Lucas Stach
aed6dd5646 clk: imx6: initialize GPU clocks
commit d8846023aed1293e54d33499558fc2aa2b2f393f upstream.

Initialize the GPU clock muxes to sane inputs. Until now they have
not been changed from their default values, which means that both
GPU3D shader and GPU2D core were fed by clock inputs whose rates
exceed the maximium allowed frequency of the cores by as much as
200MHz.

This fixes a severe GPU stability issue on i.MX6DL.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:26 -04:00
Jan Remmet
5122ea50c6 regulator: tps65910: Work around silicon erratum SWCZ010
commit 8f9165c981fed187bb483de84caf9adf835aefda upstream.

http://www.ti.com/lit/pdf/SWCZ010:
  DCDC o/p voltage can go higher than programmed value

Impact:
VDDI, VDD2, and VIO output programmed voltage level can go higher than
expected or crash, when coming out of PFM to PWM mode or using DVFS.

Description:
When DCDC CLK SYNC bits are 11/01:
* VIO 3-MHz oscillator is the source clock of the digital core and input
  clock of VDD1 and VDD2
* Turn-on of VDD1 and VDD2 HSD PFETis synchronized or at a constant
  phase shift
* Current pulled though VCC1+VCC2 is Iload(VDD1) + Iload(VDD2)
* The 3 HSD PFET will be turned-on at the same time, causing the highest
  possible switching noise on the application. This noise level depends
  on the layout, the VBAT level, and the load current. The noise level
  increases with improper layout.

When DCDC CLK SYNC bits are 00:
* VIO 3-MHz oscillator is the source clock of digital core
* VDD1 and VDD2 are running on their own 3-MHz oscillator
* Current pulled though VCC1+VCC2 average of Iload(VDD1) + Iload(VDD2)
* The switching noise of the 3 SMPS will be randomly spread over time,
  causing lower overall switching noise.

Workaround:
Set DCDCCTRL_REG[1:0]= 00.

Signed-off-by: Jan Remmet <j.remmet@phytec.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:25 -04:00
Alexander Usyskin
5c99f32c46 mei: me: add kaby point device ids
commit ac182e8abc6f93c1c4cc12f042af64c9d7be0d1e upstream.

Add device ids for Intel Kabypoint PCH (Kabylake)

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:25 -04:00
Liu Gang
02785b1fc2 gpio: mpc8xxx: Correct irq handler function
commit d71cf15b865bdd45925f7b094d169aaabd705145 upstream.

From the beginning of the gpio-mpc8xxx.c, the "handle_level_irq"
has being used to handle GPIO interrupts in the PowerPC/Layerscape
platforms. But actually, almost all PowerPC/Layerscape platforms
assert an interrupt request upon either a high-to-low change or
any change on the state of the signal.

So the "handle_level_irq" is not reasonable for PowerPC/Layerscape
GPIO interrupt, it should be "handle_edge_irq". Otherwise the system
may lost some interrupts from the PIN's state changes.

Signed-off-by: Liu Gang <Gang.Liu@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:25 -04:00
Greg Kroah-Hartman
3afd8362fa Linux 4.4.27 2016-10-22 12:27:13 +02:00
Glauber Costa
c7077fba80 cfq: fix starvation of asynchronous writes
commit 3932a86b4b9d1f0b049d64d4591ce58ad18b44ec upstream.

While debugging timeouts happening in my application workload (ScyllaDB), I have
observed calls to open() taking a long time, ranging everywhere from 2 seconds -
the first ones that are enough to time out my application - to more than 30
seconds.

The problem seems to happen because XFS may block on pending metadata updates
under certain circumnstances, and that's confirmed with the following backtrace
taken by the offcputime tool (iovisor/bcc):

    ffffffffb90c57b1 finish_task_switch
    ffffffffb97dffb5 schedule
    ffffffffb97e310c schedule_timeout
    ffffffffb97e1f12 __down
    ffffffffb90ea821 down
    ffffffffc046a9dc xfs_buf_lock
    ffffffffc046abfb _xfs_buf_find
    ffffffffc046ae4a xfs_buf_get_map
    ffffffffc046babd xfs_buf_read_map
    ffffffffc0499931 xfs_trans_read_buf_map
    ffffffffc044a561 xfs_da_read_buf
    ffffffffc0451390 xfs_dir3_leaf_read.constprop.16
    ffffffffc0452b90 xfs_dir2_leaf_lookup_int
    ffffffffc0452e0f xfs_dir2_leaf_lookup
    ffffffffc044d9d3 xfs_dir_lookup
    ffffffffc047d1d9 xfs_lookup
    ffffffffc0479e53 xfs_vn_lookup
    ffffffffb925347a path_openat
    ffffffffb9254a71 do_filp_open
    ffffffffb9242a94 do_sys_open
    ffffffffb9242b9e sys_open
    ffffffffb97e42b2 entry_SYSCALL_64_fastpath
    00007fb0698162ed [unknown]

Inspecting my run with blktrace, I can see that the xfsaild kthread exhibit very
high "Dispatch wait" times, on the dozens of seconds range and consistent with
the open() times I have saw in that run.

Still from the blktrace output, we can after searching a bit, identify the
request that wasn't dispatched:

  8,0   11      152    81.092472813   804  A  WM 141698288 + 8 <- (8,1) 141696240
  8,0   11      153    81.092472889   804  Q  WM 141698288 + 8 [xfsaild/sda1]
  8,0   11      154    81.092473207   804  G  WM 141698288 + 8 [xfsaild/sda1]
  8,0   11      206    81.092496118   804  I  WM 141698288 + 8 (   22911) [xfsaild/sda1]
  <==== 'I' means Inserted (into the IO scheduler) ===================================>
  8,0    0   289372    96.718761435     0  D  WM 141698288 + 8 (15626265317) [swapper/0]
  <==== Only 15s later the CFQ scheduler dispatches the request ======================>

As we can see above, in this particular example CFQ took 15 seconds to dispatch
this request. Going back to the full trace, we can see that the xfsaild queue
had plenty of opportunity to run, and it was selected as the active queue many
times. It would just always be preempted by something else (example):

  8,0    1        0    81.117912979     0  m   N cfq1618SN / insert_request
  8,0    1        0    81.117913419     0  m   N cfq1618SN / add_to_rr
  8,0    1        0    81.117914044     0  m   N cfq1618SN / preempt
  8,0    1        0    81.117914398     0  m   N cfq767A  / slice expired t=1
  8,0    1        0    81.117914755     0  m   N cfq767A  / resid=40
  8,0    1        0    81.117915340     0  m   N / served: vt=1948520448 min_vt=1948520448
  8,0    1        0    81.117915858     0  m   N cfq767A  / sl_used=1 disp=0 charge=0 iops=1 sect=0

where cfq767 is the xfsaild queue and cfq1618 corresponds to one of the ScyllaDB
IO dispatchers.

The requests preempting the xfsaild queue are synchronous requests. That's a
characteristic of ScyllaDB workloads, as we only ever issue O_DIRECT requests.
While it can be argued that preempting ASYNC requests in favor of SYNC is part
of the CFQ logic, I don't believe that doing so for 15+ seconds is anyone's
goal.

Moreover, unless I am misunderstanding something, that breaks the expectation
set by the "fifo_expire_async" tunable, which in my system is set to the
default.

Looking at the code, it seems to me that the issue is that after we make
an async queue active, there is no guarantee that it will execute any request.

When the queue itself tests if it cfq_may_dispatch() it can bail if it sees SYNC
requests in flight. An incoming request from another queue can also preempt it
in such situation before we have the chance to execute anything (as seen in the
trace above).

This patch sets the must_dispatch flag if we notice that we have requests
that are already fifo_expired. This flag is always cleared after
cfq_dispatch_request() returns from cfq_dispatch_requests(), so it won't pin
the queue for subsequent requests (unless they are themselves expired)

Care is taken during preempt to still allow rt requests to preempt us
regardless.

Testing my workload with this patch applied produces much better results.
From the application side I see no timeouts, and the open() latency histogram
generated by systemtap looks much better, with the worst outlier at 131ms:

Latency histogram of xfs_buf_lock acquisition (microseconds):
 value |-------------------------------------------------- count
     0 |                                                     11
     1 |@@@@                                                161
     2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  1966
     4 |@                                                    54
     8 |                                                     36
    16 |                                                      7
    32 |                                                      0
    64 |                                                      0
       ~
  1024 |                                                      0
  2048 |                                                      0
  4096 |                                                      1
  8192 |                                                      1
 16384 |                                                      2
 32768 |                                                      0
 65536 |                                                      0
131072 |                                                      1
262144 |                                                      0
524288 |                                                      0

Signed-off-by: Glauber Costa <glauber@scylladb.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: linux-block@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Miklos Szeredi
b3b4283f7e vfs: move permission checking into notify_change() for utimes(NULL)
commit f2b20f6ee842313a0d681dbbf7f87b70291a6a3b upstream.

This fixes a bug where the permission was not properly checked in
overlayfs.  The testcase is ltp/utimensat01.

It is also cleaner and safer to do the permission checking in the vfs
helper instead of the caller.

This patch introduces an additional ia_valid flag ATTR_TOUCH (since
touch(1) is the most obvious user of utimes(NULL)) that is passed into
notify_change whenever the conditions for this special permission checking
mode are met.

Reported-by: Aihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by: Aihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Marcelo Ricardo Leitner
1832397d8c dlm: free workqueues after the connections
commit 3a8db79889ce16930aff19b818f5b09651bb7644 upstream.

After backporting commit ee44b4bc05 ("dlm: use sctp 1-to-1 API")
series to a kernel with an older workqueue which didn't use RCU yet, it
was noticed that we are freeing the workqueues in dlm_lowcomms_stop()
too early as free_conn() will try to access that memory for canceling
the queued works if any.

This issue was introduced by commit 0d737a8cfd as before it such
attempt to cancel the queued works wasn't performed, so the issue was
not present.

This patch fixes it by simply inverting the free order.

Fixes: 0d737a8cfd ("dlm: fix race while closing connections")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Marcelo Cerri
78e1355d42 crypto: vmx - Fix memory corruption caused by p8_ghash
commit 80da44c29d997e28c4442825f35f4ac339813877 upstream.

This patch changes the p8_ghash driver to use ghash-generic as a fixed
fallback implementation. This allows the correct value of descsize to be
defined directly in its shash_alg structure and avoids problems with
incorrect buffer sizes when its state is exported or imported.

Reported-by: Jan Stancek <jstancek@redhat.com>
Fixes: cc333cd68d ("crypto: vmx - Adding GHASH routines for VMX module")
Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Marcelo Cerri
4f8c2ad3ee crypto: ghash-generic - move common definitions to a new header file
commit a397ba829d7f8aff4c90af3704573a28ccd61a59 upstream.

Move common values and types used by ghash-generic to a new header file
so drivers can directly use ghash-generic as a fallback implementation.

Fixes: cc333cd68d ("crypto: vmx - Adding GHASH routines for VMX module")
Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
gmail
847d63fcaf ext4: release bh in make_indexed_dir
commit e81d44778d1d57bbaef9e24c4eac7c8a7a401d40 upstream.

The commit 6050d47adc: "ext4: bail out from make_indexed_dir() on
first error" could end up leaking bh2 in the error path.

[ Also avoid renaming bh2 to bh, which just confuses things --tytso ]

Signed-off-by: yangsheng <yngsion@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Ross Zwisler
2db2fd9538 ext4: allow DAX writeback for hole punch
commit cca32b7eeb4ea24fa6596650e06279ad9130af98 upstream.

Currently when doing a DAX hole punch with ext4 we fail to do a writeback.
This is because the logic around filemap_write_and_wait_range() in
ext4_punch_hole() only looks for dirty page cache pages in the radix tree,
not for dirty DAX exceptional entries.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Fabian Frederick
6f6c12ce00 ext4: fix memory leak in ext4_insert_range()
commit edf15aa180d7b98fe16bd3eda42f9dd0e60dee20 upstream.

Running xfstests generic/013 with kmemleak gives the following:

unreferenced object 0xffff8801d3d27de0 (size 96):
  comm "fsstress", pid 4941, jiffies 4294860168 (age 53.485s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff818eaaf3>] kmemleak_alloc+0x23/0x40
    [<ffffffff81179805>] __kmalloc+0xf5/0x1d0
    [<ffffffff8122ef5c>] ext4_find_extent+0x1ec/0x2f0
    [<ffffffff8123530c>] ext4_insert_range+0x34c/0x4a0
    [<ffffffff81235942>] ext4_fallocate+0x4e2/0x8b0
    [<ffffffff81181334>] vfs_fallocate+0x134/0x210
    [<ffffffff8118203f>] SyS_fallocate+0x3f/0x60
    [<ffffffff818efa9b>] entry_SYSCALL_64_fastpath+0x13/0x8f
    [<ffffffffffffffff>] 0xffffffffffffffff

Problem seems mitigated by dropping refs and freeing path
when there's no path[depth].p_ext

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Daeho Jeong
d380cbb863 ext4: reinforce check of i_dtime when clearing high fields of uid and gid
commit 93e3b4e6631d2a74a8cf7429138096862ff9f452 upstream.

Now, ext4_do_update_inode() clears high 16-bit fields of uid/gid
of deleted and evicted inode to fix up interoperability with old
kernels. However, it checks only i_dtime of an inode to determine
whether the inode was deleted and evicted, and this is very risky,
because i_dtime can be used for the pointer maintaining orphan inode
list, too. We need to further check whether the i_dtime is being
used for the orphan inode list even if the i_dtime is not NULL.

We found that high 16-bit fields of uid/gid of inode are unintentionally
and permanently cleared when the inode truncation is just triggered,
but not finished, and the inode metadata, whose high uid/gid bits are
cleared, is written on disk, and the sudden power-off follows that
in order.

Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Hobin Woo <hobin.woo@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Eric Whitney
76a8f17e0b ext4: enforce online defrag restriction for encrypted files
commit 14fbd4aa613bd5110556c281799ce36dc6f3ba97 upstream.

Online defragging of encrypted files is not currently implemented.
However, the move extent ioctl can still return successfully when
called.  For example, this occurs when xfstest ext4/020 is run on an
encrypted file system, resulting in a corrupted test file and a
corresponding test failure.

Until the proper functionality is implemented, fail the move extent
ioctl if either the original or donor file is encrypted.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Brian King
2ed1b50a40 scsi: ibmvfc: Fix I/O hang when port is not mapped
commit 07d0e9a847401ffd2f09bd450d41644cd090e81d upstream.

If a VFC port gets unmapped in the VIOS, it may not respond with a CRQ
init complete following H_REG_CRQ. If this occurs, we can end up having
called scsi_block_requests and not a resulting unblock until the init
complete happens, which may never occur, and we end up hanging I/O
requests.  This patch ensures the host action stay set to
IBMVFC_HOST_ACTION_TGT_DEL so we move all rports into devloss state and
unblock unless we receive an init complete.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Borislav Petkov
161cbfec10 scsi: arcmsr: Simplify user_len checking
commit 4bd173c30792791a6daca8c64793ec0a4ae8324f upstream.

Do the user_len check first and then the ver_addr allocation so that we
can save us the kfree() on the error path when user_len is >
ARCMSR_API_DATA_BUFLEN.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Marco Grassi <marco.gra@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Tomas Henzl <thenzl@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:56 +02:00
Dan Carpenter
2404092282 scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer()
commit 7bc2b55a5c030685b399bb65b6baa9ccc3d1f167 upstream.

We need to put an upper bound on "user_len" so the memcpy() doesn't
overflow.

Reported-by: Marco Grassi <marco.gra@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:55 +02:00
Justin Maggard
ffb3c7a99a async_pq_val: fix DMA memory leak
commit c84750906b4818d4929fbf73a4ae6c113b94f52b upstream.

Add missing dmaengine_unmap_put(), so we don't OOM during RAID6 sync.

Fixes: 1786b943da ("async_pq_val: convert to dmaengine_unmap_data")
Signed-off-by: Justin Maggard <jmaggard@netgear.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:55 +02:00
Al Viro
be90a09bd9 reiserfs: switch to generic_{get,set,remove}xattr()
commit 79a628d14ec7ee9adfdc3ce04343d5ff7ec20c18 upstream.

reiserfs_xattr_[sg]et() will fail with -EOPNOTSUPP for V1 inodes anyway,
and all reiserfs instances of ->[sg]et() call it and so does ->set_acl().

Checks for name length in the instances had been bogus; they should've
been "bugger off if it's _exactly_ the prefix" (as generic would
do on its own) and not "bugger off if it's shorter than the prefix" -
that can't happen.

xattr_full_name() is needed to adjust for the fact that generic instances
will skip the prefix in the name passed to ->[gs]et(); reiserfs homegrown
analogues didn't.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[jeffm: Backported to v4.4: adjust context]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:55 +02:00
Mike Galbraith
5c16520bdc reiserfs: Unlock superblock before calling reiserfs_quota_on_mount()
commit 420902c9d086848a7548c83e0a49021514bd71b7 upstream.

If we hold the superblock lock while calling reiserfs_quota_on_mount(), we can
deadlock our own worker - mount blocks kworker/3:2, sleeps forever more.

crash> ps|grep UN
    715      2   3  ffff880220734d30  UN   0.0       0      0  [kworker/3:2]
   9369   9341   2  ffff88021ffb7560  UN   1.3  493404 123184  Xorg
   9665   9664   3  ffff880225b92ab0  UN   0.0   47368    812  udisks-daemon
  10635  10403   3  ffff880222f22c70  UN   0.0   14904    936  mount
crash> bt ffff880220734d30
PID: 715    TASK: ffff880220734d30  CPU: 3   COMMAND: "kworker/3:2"
 #0 [ffff8802244c3c20] schedule at ffffffff8144584b
 #1 [ffff8802244c3cc8] __rt_mutex_slowlock at ffffffff814472b3
 #2 [ffff8802244c3d28] rt_mutex_slowlock at ffffffff814473f5
 #3 [ffff8802244c3dc8] reiserfs_write_lock at ffffffffa05f28fd [reiserfs]
 #4 [ffff8802244c3de8] flush_async_commits at ffffffffa05ec91d [reiserfs]
 #5 [ffff8802244c3e08] process_one_work at ffffffff81073726
 #6 [ffff8802244c3e68] worker_thread at ffffffff81073eba
 #7 [ffff8802244c3ec8] kthread at ffffffff810782e0
 #8 [ffff8802244c3f48] kernel_thread_helper at ffffffff81450064
crash> rd ffff8802244c3cc8 10
ffff8802244c3cc8:  ffffffff814472b3 ffff880222f23250   .rD.....P2."....
ffff8802244c3cd8:  0000000000000000 0000000000000286   ................
ffff8802244c3ce8:  ffff8802244c3d30 ffff880220734d80   0=L$.....Ms ....
ffff8802244c3cf8:  ffff880222e8f628 0000000000000000   (.."............
ffff8802244c3d08:  0000000000000000 0000000000000002   ................
crash> struct rt_mutex ffff880222e8f628
struct rt_mutex {
  wait_lock = {
    raw_lock = {
      slock = 65537
    }
  },
  wait_list = {
    node_list = {
      next = 0xffff8802244c3d48,
      prev = 0xffff8802244c3d48
    }
  },
  owner = 0xffff880222f22c71,
  save_state = 0
}
crash> bt 0xffff880222f22c70
PID: 10635  TASK: ffff880222f22c70  CPU: 3   COMMAND: "mount"
 #0 [ffff8802216a9868] schedule at ffffffff8144584b
 #1 [ffff8802216a9910] schedule_timeout at ffffffff81446865
 #2 [ffff8802216a99a0] wait_for_common at ffffffff81445f74
 #3 [ffff8802216a9a30] flush_work at ffffffff810712d3
 #4 [ffff8802216a9ab0] schedule_on_each_cpu at ffffffff81074463
 #5 [ffff8802216a9ae0] invalidate_bdev at ffffffff81178aba
 #6 [ffff8802216a9af0] vfs_load_quota_inode at ffffffff811a3632
 #7 [ffff8802216a9b50] dquot_quota_on_mount at ffffffff811a375c
 #8 [ffff8802216a9b80] finish_unfinished at ffffffffa05dd8b0 [reiserfs]
 #9 [ffff8802216a9cc0] reiserfs_fill_super at ffffffffa05de825 [reiserfs]
    RIP: 00007f7b9303997a  RSP: 00007ffff443c7a8  RFLAGS: 00010202
    RAX: 00000000000000a5  RBX: ffffffff8144ef12  RCX: 00007f7b932e9ee0
    RDX: 00007f7b93d9a400  RSI: 00007f7b93d9a3e0  RDI: 00007f7b93d9a3c0
    RBP: 00007f7b93d9a2c0   R8: 00007f7b93d9a550   R9: 0000000000000001
    R10: ffffffffc0ed040e  R11: 0000000000000202  R12: 000000000000040e
    R13: 0000000000000000  R14: 00000000c0ed040e  R15: 00007ffff443ca20
    ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b

Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22 12:26:55 +02:00