Commit graph

570828 commits

Author SHA1 Message Date
Vitaly Kuznetsov
462558aefc KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use
[ Upstream commit 0bcc3fb95b97ac2ca223a5a870287b37f56265ac ]

Devices which use level-triggered interrupts under Windows 2016 with
Hyper-V role enabled don't work: Windows disables EOI broadcast in SPIV
unconditionally. Our in-kernel IOAPIC implementation emulates an old IOAPIC
version which has no EOI register so EOI never happens.

The issue was discovered and discussed a while ago:
https://www.spinics.net/lists/kvm/msg148098.html

While this is a guest OS bug (it should check that IOAPIC has the required
capabilities before disabling EOI broadcast) we can workaround it in KVM:
advertising DIRECTED_EOI with in-kernel IOAPIC makes little sense anyway.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Gregory CLEMENT
82eac614c6 i2c: mv64xxx: Apply errata delay only in standard mode
[ Upstream commit 31184d8c6ea49ea0676d100cdd7e1f102ad025b5 ]

The errata FE-8471889 description has been updated. There is still a
timing violation for repeated start. But the errata now states that it
was only the case for the Standard mode (100 kHz), in Fast mode (400 kHz)
there is no issue.

This patch limit the errata fix to the Standard mode.

It has been tesed successfully on the clearfog (Aramda 388 based board).

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Seunghun Han
dfcb739c20 ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
[ Upstream commit 97f3c0a4b0579b646b6b10ae5a3d59f0441cc12c ]

I found an ACPI cache leak in ACPI early termination and boot continuing case.

When early termination occurs due to malicious ACPI table, Linux kernel
terminates ACPI function and continues to boot process. While kernel terminates
ACPI function, kmem_cache_destroy() reports Acpi-Operand cache leak.

Boot log of ACPI operand cache leak is as follows:
>[    0.464168] ACPI: Added _OSI(Module Device)
>[    0.467022] ACPI: Added _OSI(Processor Device)
>[    0.469376] ACPI: Added _OSI(3.0 _SCP Extensions)
>[    0.471647] ACPI: Added _OSI(Processor Aggregator Device)
>[    0.477997] ACPI Error: Null stack entry at ffff880215c0aad8 (20170303/exresop-174)
>[    0.482706] ACPI Exception: AE_AML_INTERNAL, While resolving operands for [opcode_name unavailable] (20170303/dswexec-461)
>[    0.487503] ACPI Error: Method parse/execution failed [\DBG] (Node ffff88021710ab40), AE_AML_INTERNAL (20170303/psparse-543)
>[    0.492136] ACPI Error: Method parse/execution failed [\_SB._INI] (Node ffff88021710a618), AE_AML_INTERNAL (20170303/psparse-543)
>[    0.497683] ACPI: Interpreter enabled
>[    0.499385] ACPI: (supports S0)
>[    0.501151] ACPI: Using IOAPIC for interrupt routing
>[    0.503342] ACPI Error: Null stack entry at ffff880215c0aad8 (20170303/exresop-174)
>[    0.506522] ACPI Exception: AE_AML_INTERNAL, While resolving operands for [opcode_name unavailable] (20170303/dswexec-461)
>[    0.510463] ACPI Error: Method parse/execution failed [\DBG] (Node ffff88021710ab40), AE_AML_INTERNAL (20170303/psparse-543)
>[    0.514477] ACPI Error: Method parse/execution failed [\_PIC] (Node ffff88021710ab18), AE_AML_INTERNAL (20170303/psparse-543)
>[    0.518867] ACPI Exception: AE_AML_INTERNAL, Evaluating _PIC (20170303/bus-991)
>[    0.522384] kmem_cache_destroy Acpi-Operand: Slab cache still has objects
>[    0.524597] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc5 #26
>[    0.526795] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS virtual_box 12/01/2006
>[    0.529668] Call Trace:
>[    0.530811]  ? dump_stack+0x5c/0x81
>[    0.532240]  ? kmem_cache_destroy+0x1aa/0x1c0
>[    0.533905]  ? acpi_os_delete_cache+0xa/0x10
>[    0.535497]  ? acpi_ut_delete_caches+0x3f/0x7b
>[    0.537237]  ? acpi_terminate+0xa/0x14
>[    0.538701]  ? acpi_init+0x2af/0x34f
>[    0.540008]  ? acpi_sleep_proc_init+0x27/0x27
>[    0.541593]  ? do_one_initcall+0x4e/0x1a0
>[    0.543008]  ? kernel_init_freeable+0x19e/0x21f
>[    0.546202]  ? rest_init+0x80/0x80
>[    0.547513]  ? kernel_init+0xa/0x100
>[    0.548817]  ? ret_from_fork+0x25/0x30
>[    0.550587] vgaarb: loaded
>[    0.551716] EDAC MC: Ver: 3.0.0
>[    0.553744] PCI: Probing PCI hardware
>[    0.555038] PCI host bridge to bus 0000:00
> ... Continue to boot and log is omitted ...

I analyzed this memory leak in detail and found acpi_ns_evaluate() function
only removes Info->return_object in AE_CTRL_RETURN_VALUE case. But, when errors
occur, the status value is not AE_CTRL_RETURN_VALUE, and Info->return_object is
also not null. Therefore, this causes acpi operand memory leak.

This cache leak causes a security threat because an old kernel (<= 4.9) shows
memory locations of kernel functions in stack dump. Some malicious users
could use this information to neutralize kernel ASLR.

I made a patch to fix ACPI operand cache leak.

Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Erik Schmauss
0170c50523 ACPICA: Events: add a return on failure from acpi_hw_register_read
[ Upstream commit b4c0de312613ca676db5bd7e696a44b56795612a ]

This ensures that acpi_ev_fixed_event_detect() does not use fixed_status
and and fixed_enable as uninitialized variables.

Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Coly Li
7fb3ba3029 bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
[ Upstream commit fadd94e05c02afec7b70b0b14915624f1782f578 ]

In patch "bcache: fix cached_dev->count usage for bch_cache_set_error()",
cached_dev_get() is called when creating dc->writeback_thread, and
cached_dev_put() is called when exiting dc->writeback_thread. This
modification works well unless people detach the bcache device manually by
    'echo 1 > /sys/block/bcache<N>/bcache/detach'
Because this sysfs interface only calls bch_cached_dev_detach() which wakes
up dc->writeback_thread but does not stop it. The reason is, before patch
"bcache: fix cached_dev->count usage for bch_cache_set_error()", inside
bch_writeback_thread(), if cache is not dirty after writeback,
cached_dev_put() will be called here. And in cached_dev_make_request() when
a new write request makes cache from clean to dirty, cached_dev_get() will
be called there. Since we don't operate dc->count in these locations,
refcount d->count cannot be dropped after cache becomes clean, and
cached_dev_detach_finish() won't be called to detach bcache device.

This patch fixes the issue by checking whether BCACHE_DEV_DETACHING is
set inside bch_writeback_thread(). If this bit is set and cache is clean
(no existing writeback_keys), break the while-loop, call cached_dev_put()
and quit the writeback thread.

Please note if cache is still dirty, even BCACHE_DEV_DETACHING is set the
writeback thread should continue to perform writeback, this is the original
design of manually detach.

It is safe to do the following check without locking, let me explain why,
+	if (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
+	    (!atomic_read(&dc->has_dirty) || !dc->writeback_running)) {

If the kenrel thread does not sleep and continue to run due to conditions
are not updated in time on the running CPU core, it just consumes more CPU
cycles and has no hurt. This should-sleep-but-run is safe here. We just
focus on the should-run-but-sleep condition, which means the writeback
thread goes to sleep in mistake while it should continue to run.
1, First of all, no matter the writeback thread is hung or not,
   kthread_stop() from cached_dev_detach_finish() will wake up it and
   terminate by making kthread_should_stop() return true. And in normal
   run time, bit on index BCACHE_DEV_DETACHING is always cleared, the
   condition
	!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags)
   is always true and can be ignored as constant value.
2, If one of the following conditions is true, the writeback thread should
   go to sleep,
   "!atomic_read(&dc->has_dirty)" or "!dc->writeback_running)"
   each of them independently controls the writeback thread should sleep or
   not, let's analyse them one by one.
2.1 condition "!atomic_read(&dc->has_dirty)"
   If dc->has_dirty is set from 0 to 1 on another CPU core, bcache will
   call bch_writeback_queue() immediately or call bch_writeback_add() which
   indirectly calls bch_writeback_queue() too. In bch_writeback_queue(),
   wake_up_process(dc->writeback_thread) is called. It sets writeback
   thread's task state to TASK_RUNNING and following an implicit memory
   barrier, then tries to wake up the writeback thread.
   In writeback thread, its task state is set to TASK_INTERRUPTIBLE before
   doing the condition check. If other CPU core sets the TASK_RUNNING state
   after writeback thread setting TASK_INTERRUPTIBLE, the writeback thread
   will be scheduled to run very soon because its state is not
   TASK_INTERRUPTIBLE. If other CPU core sets the TASK_RUNNING state before
   writeback thread setting TASK_INTERRUPTIBLE, the implict memory barrier
   of wake_up_process() will make sure modification of dc->has_dirty on
   other CPU core is updated and observed on the CPU core of writeback
   thread. Therefore the condition check will correctly be false, and
   continue writeback code without sleeping.
2.2 condition "!dc->writeback_running)"
   dc->writeback_running can be changed via sysfs file, every time it is
   modified, a following bch_writeback_queue() is alwasy called. So the
   change is always observed on the CPU core of writeback thread. If
   dc->writeback_running is changed from 0 to 1 on other CPU core, this
   condition check will observe the modification and allow writeback
   thread to continue to run without sleeping.
Now we can see, even without a locking protection, multiple conditions
check is safe here, no deadlock or process hang up will happen.

I compose a separte patch because that patch "bcache: fix cached_dev->count
usage for bch_cache_set_error()" already gets a "Reviewed-by:" from Hannes
Reinecke. Also this fix is not trivial and good for a separate patch.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Huijun Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Michael Schmitz
ae90d15119 zorro: Set up z->dev.dma_mask for the DMA API
[ Upstream commit 55496d3fe2acd1a365c43cbd613a20ecd4d74395 ]

The generic DMA API uses dev->dma_mask to check the DMA addressable
memory bitmask, and warns if no mask is set or even allocated.

Set z->dev.dma_coherent_mask on Zorro bus scan, and make z->dev.dma_mask
to point to z->dev.dma_coherent_mask so device drivers that need DMA have
everything set up to avoid warnings from dma_alloc_coherent(). Drivers can
still use dma_set_mask_and_coherent() to explicitly set their DMA bit mask.

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
[geert: Handle Zorro II with 24-bit address space]
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Shawn Lin
9c43286998 clk: Don't show the incorrect clock phase
[ Upstream commit 1f9c63e8de3d7b377c9d74e4a17524cfb60e6384 ]

It's found that the clock phase output from clk_summary is
wrong compared to the actual phase reading from the register.

cat /sys/kernel/debug/clk/clk_summary | grep sdio_sample
sdio_sample     0        1        0 50000000          0 -22

It exposes an issue that clk core, clk_core_get_phase, always
returns the cached core->phase which should be either updated
by calling clk_set_phase or directly from the first place the
clk was registered.

When registering the clk, the core->phase geting from ->get_phase()
may return negative value indicating error. This is quite common
since the clk's phase may be highly related to its parent chain,
but it was temporarily orphan when registered, since its parent
chains hadn't be ready at that time, so the clk drivers decide to
return error in this case. However, if no clk_set_phase is called or
maybe the ->set_phase() isn't even implemented, the core->phase would
never be updated. This is wrong, and we should try to update it when
all its parent chains are settled down, like the way of updating clock
rate for that. But it's not deserved to complicate the code now and
just update it anyway when calling clk_core_get_phase, which would be
much simple and enough.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Acked-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Chunyu Hu
1177eca042 cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path
[ Upstream commit 55b55abc17f238c61921360e61dde90dd9a326d1 ]

Kmemleak reported the below leak. When cppc_cpufreq_init went into
failure path, the cpu mask is not freed. After fix, this report is
gone. And to avaoid potential NULL pointer reference, check the cpu
value first.

unreferenced object 0xffff800fd5ea4880 (size 128):
  comm "swapper/0", pid 1, jiffies 4294939510 (age 668.680s)
  hex dump (first 32 bytes):
    00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00  .... ...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffff0000082c4ae4>] __kmalloc_node+0x278/0x634
    [<ffff0000088f4a74>] alloc_cpumask_var_node+0x28/0x60
    [<ffff0000088f4af0>] zalloc_cpumask_var+0x14/0x1c
    [<ffff000008d20254>] cppc_cpufreq_init+0xd0/0x19c
    [<ffff000008083828>] do_one_initcall+0xec/0x15c
    [<ffff000008cd1018>] kernel_init_freeable+0x1f4/0x2a4
    [<ffff0000089099b0>] kernel_init+0x18/0x10c
    [<ffff000008084d50>] ret_from_fork+0x10/0x18
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Thinh Nguyen
b5cc81ad4b usb: dwc3: Update DWC_usb31 GTXFIFOSIZ reg fields
[ Upstream commit 0cab8d26d6e5e053b2bed3356992aaa71dc93628 ]

Update two GTXFIFOSIZ bit fields for the DWC_usb31 controller. TXFDEP
is a 15-bit value instead of 16-bit value, and bit 15 is TXFRAMNUM.

The GTXFIFOSIZ register for DWC_usb31 is as follows:
 +-------+-----------+----------------------------------+
 | BITS  | Name      | Description                      |
 +=======+===========+==================================+
 | 31:16 | TXFSTADDR | Transmit FIFOn RAM Start Address |
 | 15    | TXFRAMNUM | Asynchronous/Periodic TXFIFO     |
 | 14:0  | TXFDEP    | TXFIFO Depth                     |
 +-------+-----------+----------------------------------+

Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Philipp Puschmann
02152ad1ca arm: dts: socfpga: fix GIC PPI warning
[ Upstream commit 6d97d5aba08b26108f95dc9fb7bbe4d9436c769c ]

Fixes the warning "GIC: PPI13 is secure or misconfigured" by
changing the interrupt type from level_low to edge_raising

Signed-off-by: Philipp Puschmann <pp@emlix.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Jay Vosburgh
da782aed49 virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
[ Upstream commit bda7fab54828bbef2164bb23c0f6b1a7d05cc718 ]

The operstate update logic will leave an interface in the
default UNKNOWN operstate if the interface carrier state never changes
from the default carrier up state set at creation.  This includes the
case of an explicit call to netif_carrier_on, as the carrier on to on
transition has no effect on operstate.

	This affects virtio-net for the case that the virtio peer does
not support VIRTIO_NET_F_STATUS (the feature that provides carrier state
updates).  Without this feature, the virtio specification states that
"the link should be assumed active," so, logically, the operstate should
be UP instead of UNKNOWN.  This has impact on user space applications
that use the operstate to make availability decisions for the interface.

	Resolve this by changing the virtio probe logic slightly to call
netif_carrier_off for both the "with" and "without" VIRTIO_NET_F_STATUS
cases, and then the existing call to netif_carrier_on for the "without"
case will cause an operstate transition.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:11 +02:00
Petr Vorel
5cace4b68c ima: Fallback to the builtin hash algorithm
[ Upstream commit ab60368ab6a452466885ef4edf0cefd089465132 ]

IMA requires having it's hash algorithm be compiled-in due to it's
early use.  The default IMA algorithm is protected by Kconfig to be
compiled-in.

The ima_hash kernel parameter allows to choose the hash algorithm. When
the specified algorithm is not available or available as a module, IMA
initialization fails, which leads to a kernel panic (mknodat syscall calls
ima_post_path_mknod()).  Therefore as fallback we force IMA to use
the default builtin Kconfig hash algorithm.

Fixed crash:

$ grep CONFIG_CRYPTO_MD4 .config
CONFIG_CRYPTO_MD4=m

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.12.14-2.3-default root=UUID=74ae8202-9ca7-4e39-813b-22287ec52f7a video=1024x768-16 plymouth.ignore-serial-consoles console=ttyS0 console=tty resume=/dev/disk/by-path/pci-0000:00:07.0-part3 splash=silent showopts ima_hash=md4
...
[    1.545190] ima: Can not allocate md4 (reason: -2)
...
[    2.610120] BUG: unable to handle kernel NULL pointer dereference at           (null)
[    2.611903] IP: ima_match_policy+0x23/0x390
[    2.612967] PGD 0 P4D 0
[    2.613080] Oops: 0000 [#1] SMP
[    2.613080] Modules linked in: autofs4
[    2.613080] Supported: Yes
[    2.613080] CPU: 0 PID: 1 Comm: systemd Not tainted 4.12.14-2.3-default #1
[    2.613080] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[    2.613080] task: ffff88003e2d0040 task.stack: ffffc90000190000
[    2.613080] RIP: 0010:ima_match_policy+0x23/0x390
[    2.613080] RSP: 0018:ffffc90000193e88 EFLAGS: 00010296
[    2.613080] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000004
[    2.613080] RDX: 0000000000000010 RSI: 0000000000000001 RDI: ffff880037071728
[    2.613080] RBP: 0000000000008000 R08: 0000000000000000 R09: 0000000000000000
[    2.613080] R10: 0000000000000008 R11: 61c8864680b583eb R12: 00005580ff10086f
[    2.613080] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000008000
[    2.613080] FS:  00007f5c1da08940(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[    2.613080] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.613080] CR2: 0000000000000000 CR3: 0000000037002000 CR4: 00000000003406f0
[    2.613080] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    2.613080] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    2.613080] Call Trace:
[    2.613080]  ? shmem_mknod+0xbf/0xd0
[    2.613080]  ima_post_path_mknod+0x1c/0x40
[    2.613080]  SyS_mknod+0x210/0x220
[    2.613080]  entry_SYSCALL_64_fastpath+0x1a/0xa5
[    2.613080] RIP: 0033:0x7f5c1bfde570
[    2.613080] RSP: 002b:00007ffde1c90dc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000085
[    2.613080] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c1bfde570
[    2.613080] RDX: 0000000000000000 RSI: 0000000000008000 RDI: 00005580ff10086f
[    2.613080] RBP: 00007ffde1c91040 R08: 00005580ff10086f R09: 0000000000000000
[    2.613080] R10: 0000000000104000 R11: 0000000000000246 R12: 00005580ffb99660
[    2.613080] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000002
[    2.613080] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56 44 8d 14 09 41 55 41 54 55 53 44 89 d3 09 cb 48 83 ec 38 48 8b 05 c5 03 29 01 <4c> 8b 20 4c 39 e0 0f 84 d7 01 00 00 4c 89 44 24 08 89 54 24 20
[    2.613080] RIP: ima_match_policy+0x23/0x390 RSP: ffffc90000193e88
[    2.613080] CR2: 0000000000000000
[    2.613080] ---[ end trace 9a9f0a8a73079f6a ]---
[    2.673052] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    2.673052]
[    2.675337] Kernel Offset: disabled
[    2.676405] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

Signed-off-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Jiandi An
edf3bf9ee2 ima: Fix Kconfig to select TPM 2.0 CRB interface
[ Upstream commit fac37c628fd5d68fd7298d9b57ae8601ee1b4723 ]

TPM_CRB driver provides TPM CRB 2.0 support.  If it is built as a
module, the TPM chip is registered after IMA init.  tpm_pcr_read() in
IMA fails and displays the following message even though eventually
there is a TPM chip on the system.

ima: No TPM chip found, activating TPM-bypass! (rc=-19)

Fix IMA Kconfig to select TPM_CRB so TPM_CRB driver is built in the kernel
and initializes before IMA.

Signed-off-by: Jiandi An <anjiandi@codeaurora.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Karthikeyan Periyasamy
d1dbe5dbfd ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk)
[ Upstream commit 8b2d93dd22615cb7f3046a5a2083a6f8bb8052ed ]

When attempt to run worker (ath10k_sta_rc_update_wk) after the station object
(ieee80211_sta) delete will trigger the kernel panic.

This problem arise in AP + Mesh configuration, Where the current node AP VAP
and neighbor node mesh VAP MAC address are same. When the current mesh node
try to establish the mesh link with neighbor node, driver peer creation for
the neighbor mesh node fails due to duplication MAC address. Already the AP
VAP created with same MAC address.

It is caused by the following scenario steps.

Steps:
1. In above condition, ath10k driver sta_state callback (ath10k_sta_state)
   fails to do the state change for a station from IEEE80211_STA_NOTEXIST
   to IEEE80211_STA_NONE due to peer creation fails. Sta_state callback is
   called from ieee80211_add_station() to handle the new station
   (neighbor mesh node) request from the wpa_supplicant.
2. Concurrently ath10k receive the sta_rc_update callback notification from
   the mesh_neighbour_update() to handle the beacon frames of the above
   neighbor mesh node. since its atomic callback, ath10k driver queue the
   work (ath10k_sta_rc_update_wk) to handle rc update.
3. Due to driver sta_state callback fails (step 1), mac80211 free the station
   object.
4. When the worker (ath10k_sta_rc_update_wk) scheduled to run, it will access
   the station object which is already deleted. so it will trigger kernel
   panic.

Added the peer exist check in sta_rc_update callback before queue the work.

Kernel Panic log:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0204000
[00000000] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT SMP ARM
CPU: 1 PID: 1833 Comm: kworker/u4:2 Not tainted 3.14.77 #1
task: dcef0000 ti: d72b6000 task.ti: d72b6000
PC is at pwq_activate_delayed_work+0x10/0x40
LR is at pwq_activate_delayed_work+0xc/0x40
pc : [<c023f988>]    lr : [<c023f984>]    psr: 40000193
sp : d72b7f18  ip : 0000007a  fp : d72b6000
r10: 00000000  r9 : dd404414  r8 : d8c31998
r7 : d72b6038  r6 : 00000004  r5 : d4907ec8  r4 : dcee1300
r3 : ffffffe0  r2 : 00000000  r1 : 00000001  r0 : 00000000
Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5787d  Table: 595bc06a  DAC: 00000015
...
Process kworker/u4:2 (pid: 1833, stack limit = 0xd72b6238)
Stack: (0xd72b7f18 to 0xd72b8000)
7f00:                                                       00000001 dcee1300
7f20: 00000001 c02410dc d8c31980 dd404400 dd404400 c0242790 d8c31980 00000089
7f40: 00000000 d93e1340 00000000 d8c31980 c0242568 00000000 00000000 00000000
7f60: 00000000 c02474dc 00000000 00000000 000000f8 d8c31980 00000000 00000000
7f80: d72b7f80 d72b7f80 00000000 00000000 d72b7f90 d72b7f90 d72b7fac d93e1340
7fa0: c0247404 00000000 00000000 c0208d20 00000000 00000000 00000000 00000000
7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<c023f988>] (pwq_activate_delayed_work) from [<c02410dc>] (pwq_dec_nr_in_flight+0x58/0xc4)
[<c02410dc>] (pwq_dec_nr_in_flight) from [<c0242790>] (worker_thread+0x228/0x360)
[<c0242790>] (worker_thread) from [<c02474dc>] (kthread+0xd8/0xec)
[<c02474dc>] (kthread) from [<c0208d20>] (ret_from_fork+0x14/0x34)
Code: e92d4038 e1a05000 ebffffbc[69210.619376] SMP: failed to stop secondary CPUs
Rebooting in 3 seconds..

Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Leon Romanovsky
3a97320ce9 net/mlx5: Protect from command bit overflow
[ Upstream commit 957f6ba8adc7be401a74ccff427e4cfd88d3bfcb ]

The system with CONFIG_UBSAN enabled on produces the following error
during driver initialization. The reason to it that max_reg_cmds can be
larger enough to cause to "1 << max_reg_cmds" overflow the unsigned long.

================================================================================
UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1805:42
signed integer overflow:
-2147483648 - 1 cannot be represented in type 'int'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-00032-g06cda2358d9b-dirty #724
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0xe9/0x18f
 ? dma_virt_alloc+0x81/0x81
 ubsan_epilogue+0xe/0x4e
 handle_overflow+0x187/0x20c
 mlx5_cmd_init+0x73a/0x12b0
 mlx5_load_one+0x1c3d/0x1d30
 init_one+0xd02/0xf10
 pci_device_probe+0x26c/0x3b0
 driver_probe_device+0x622/0xb40
 __driver_attach+0x175/0x1b0
 bus_for_each_dev+0xef/0x190
 bus_add_driver+0x2db/0x490
 driver_register+0x16b/0x1e0
 __pci_register_driver+0x177/0x1b0
 init+0x6d/0x92
 do_one_initcall+0x15b/0x270
 kernel_init_freeable+0x2d8/0x3d0
 kernel_init+0x14/0x190
 ret_from_fork+0x24/0x30
================================================================================

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Michael Ellerman
e1448fac67 selftests: Print the test we're running to /dev/kmsg
[ Upstream commit 88893cf787d3062c631cc20b875068eb11756e03 ]

Some tests cause the kernel to print things to the kernel log
buffer (ie. printk), in particular oops and warnings etc. However when
running all the tests in succession it's not always obvious which
test(s) caused the kernel to print something.

We can narrow it down by printing which test directory we're running
in to /dev/kmsg, if it's writable.

Example output:

  [  170.149149] kselftest: Running tests in powerpc
  [  305.300132] kworker/dying (71) used greatest stack depth: 7776 bytes
                 left
  [  808.915456] kselftest: Running tests in pstore

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Frank Asseg
0bc0aa2736 tools/thermal: tmon: fix for segfault
[ Upstream commit 6c59f64b7ecf2bccbe73931d7d573d66ed13b537 ]

Fixes a segfault occurring when e.g. <TAB> is pressed multiple times in the
ncurses tmon application. The segfault is caused by incrementing
cur_thermal_record in the main function without checking if it's value reached
NR_THERMAL_RECORD immediately. Since the boundary check only occurred in
update_thermal_data a race condition existed, which lead to an attempted read
beyond the last element of the trec array.

The fix was implemented by moving the cur_thermal_record incrementation to the
update_thermal_data function using a temporary variable on which the boundary
condition is checked before updating cur_thread_record, so that the variable is
never incremented beyond the trec array's boundary.

It seems the segfault does not occur on every machine: On a HP EliteBook G4 the
segfault happens, while it does not happen on a Thinkpad T540p.

Signed-off-by: Frank Asseg <frank.asseg@objecthunter.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Michael Ellerman
55f43f3b10 powerpc/perf: Fix kernel address leak via sampling registers
[ Upstream commit e1ebd0e5b9d0a10ba65e63a3514b6da8c6a5a819 ]

Current code in power_pmu_disable() does not clear the sampling
registers like Sampling Instruction Address Register (SIAR) and
Sampling Data Address Register (SDAR) after disabling the PMU. Since
these are userspace readable and could contain kernel addresses, add
code to explicitly clear the content of these registers.

Also add a "context synchronizing instruction" to enforce no further
updates to these registers as suggested by Power ISA v3.0B. From
section 9.4, on page 1108:

  "If an mtspr instruction is executed that changes the value of a
  Performance Monitor register other than SIAR, SDAR, and SIER, the
  change is not guaranteed to have taken effect until after a
  subsequent context synchronizing instruction has been executed (see
  Chapter 11. "Synchronization Requirements for Context Alterations"
  on page 1133)."

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Massage change log and add ISA reference]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Madhavan Srinivasan
83fb2a49cc powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer
[ Upstream commit bb19af816025d495376bd76bf6fbcf4244f9a06d ]

The current Branch History Rolling Buffer (BHRB) code does not check
for any privilege levels before updating the data from BHRB. This
could leak kernel addresses to userspace even when profiling only with
userspace privileges. Add proper checks to prevent it.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Alexandre Belloni
0cf2a2141a rtc: hctosys: Ensure system time doesn't overflow time_t
[ Upstream commit b3a5ac42ab18b7d1a8f2f072ca0ee76a3b754a43 ]

On 32bit platforms, time_t is still a signed 32bit long. If it is
overflowed, userspace and the kernel cant agree on the current system time.
This causes multiple issues, in particular with systemd:
https://github.com/systemd/systemd/issues/1143

A good workaround is to simply avoid using hctosys which is something I
greatly encourage as the time is better set by userspace.

However, many distribution enable it and use systemd which is rendering the
system unusable in case the RTC holds a date after 2038 (and more so after
2106). Many drivers have workaround for this case and they should be
eliminated so there is only one place left to fix when userspace is able to
cope with dates after the 31bit overflow.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Guenter Roeck
634ac348e9 hwmon: (nct6775) Fix writing pwmX_mode
[ Upstream commit 415eb2a1aaa4881cf85bd86c683356fdd8094a23 ]

pwmX_mode is defined in the ABI as 0=DC mode, 1=pwm mode. The chip
register bit is set to 1 for DC mode. This got mixed up, and writing
1 into pwmX_mode resulted in DC mode enabled. Fix it up by using
the ABI definition throughout the driver for consistency.

Fixes: 77eb5b3703 ("hwmon: (nct6775) Add support for pwm, pwm_mode, ... ")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Helge Deller
c33badb975 parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode
[ Upstream commit b845f66f78bf42a4ce98e5cfe0e94fab41dd0742 ]

Carlo Pisani noticed that his C3600 workstation behaved unstable during heavy
I/O on the PCI bus with a VIA VT6421 IDE/SATA PCI card.

To avoid such instability, this patch switches the LBA PCI bus from Hard Fail
mode into Soft Fail mode. In this mode the bus will return -1UL for timed out
MMIO transactions, which is exactly how the x86 (and most other architectures)
PCI busses behave.

This patch is based on a proposal by Grant Grundler and Kyle McMartin 10
years ago:
https://www.spinics.net/lists/linux-parisc/msg01027.html

Cc: Carlo Pisani <carlojpisani@gmail.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Reviewed-by: Grant Grundler <grantgrundler@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:10 +02:00
Greg Ungerer
b1a557b92d m68k: set dma and coherent masks for platform FEC ethernets
[ Upstream commit f61e64310b75733d782e930d1fb404b84699eed6 ]

As of commit 205e1b7f51e4 ("dma-mapping: warn when there is no
coherent_dma_mask") the Freescale FEC driver is issuing the following
warning on driver initialization on ColdFire systems:

WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 0x40159e20
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-rc7-dirty #4
Stack from 41833dd8:
        41833dd8 40259c53 40025534 40279e26 00000003 00000000 4004e514 41827000
        400255de 40244e42 00000204 40159e20 00000009 00000000 00000000 4024531d
        40159e20 40244e42 00000204 00000000 00000000 00000000 00000007 00000000
        00000000 40279e26 4028d040 40226576 4003ae88 40279e26 418273f6 41833ef8
        7fffffff 418273f2 41867028 4003c9a2 4180ac6c 00000004 41833f8c 4013e71c
        40279e1c 40279e26 40226c16 4013ced2 40279e26 40279e58 4028d040 00000000
Call Trace:
        [<40025534>] 0x40025534
 [<4004e514>] 0x4004e514
 [<400255de>] 0x400255de
 [<40159e20>] 0x40159e20
 [<40159e20>] 0x40159e20

It is not fatal, the driver and the system continue to function normally.

As per the warning the coherent_dma_mask is not set on this device.
There is nothing special about the DMA memory coherency on this hardware
so we can just set the mask to 32bits in the platform data for the FEC
ethernet devices.

Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Michael Ellerman
f8240a4feb powerpc/mpic: Check if cpu_possible() in mpic_physmask()
[ Upstream commit 0834d627fbea00c1444075eb3e448e1974da452d ]

In mpic_physmask() we loop over all CPUs up to 32, then get the hard
SMP processor id of that CPU.

Currently that's possibly walking off the end of the paca array, but
in a future patch we will change the paca array to be an array of
pointers, and in that case we will get a NULL for missing CPUs and
oops. eg:

  Unable to handle kernel paging request for data at address 0x88888888888888b8
  Faulting instruction address: 0xc00000000004e380
  Oops: Kernel access of bad area, sig: 11 [#1]
  ...
  NIP .mpic_set_affinity+0x60/0x1a0
  LR  .irq_do_set_affinity+0x48/0x100

Fix it by checking the CPU is possible, this also fixes the code if
there are gaps in the CPU numbering which probably never happens on
mpic systems but who knows.

Debugged-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Lenny Szubowicz
30ae624081 ACPI: acpi_pad: Fix memory leak in power saving threads
[ Upstream commit 8b29d29abc484d638213dd79a18a95ae7e5bb402 ]

Fix once per second (round_robin_time) memory leak of about 1 KB in
each acpi_pad kernel idling thread that is activated.

Found by testing with kmemleak.

Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Dan Carpenter
a1b03156db xen/acpi: off by one in read_acpi_id()
[ Upstream commit c37a3c94775855567b90f91775b9691e10bd2806 ]

If acpi_id is == nr_acpi_bits, then we access one element beyond the end
of the acpi_psd[] array or we set one bit beyond the end of the bit map
when we do __set_bit(acpi_id, acpi_id_present);

Fixes: 59a5680291 ("xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Jeff Mahoney
93c13652e0 btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers
[ Upstream commit 8a5a916d9a35e13576d79cc16e24611821b13e34 ]

While running btrfs/011, I hit the following lockdep splat.

This is the important bit:
   pcpu_alloc+0x1ac/0x5e0
   __percpu_counter_init+0x4e/0xb0
   btrfs_init_fs_root+0x99/0x1c0 [btrfs]
   btrfs_get_fs_root.part.54+0x5b/0x150 [btrfs]
   resolve_indirect_refs+0x130/0x830 [btrfs]
   find_parent_nodes+0x69e/0xff0 [btrfs]
   btrfs_find_all_roots_safe+0xa0/0x110 [btrfs]
   btrfs_find_all_roots+0x50/0x70 [btrfs]
   btrfs_qgroup_prepare_account_extents+0x53/0x90 [btrfs]
   btrfs_commit_transaction+0x3ce/0x9b0 [btrfs]

The percpu_counter_init call in btrfs_alloc_subvolume_writers
uses GFP_KERNEL, which we can't do during transaction commit.

This switches it to GFP_NOFS.

========================================================
WARNING: possible irq lock inversion dependency detected
4.12.14-kvmsmall #8 Tainted: G        W
--------------------------------------------------------
kswapd0/50 just changed the state of lock:
 (&delayed_node->mutex){+.+.-.}, at: [<ffffffffc06994fa>] __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs]
but this lock took another, RECLAIM_FS-unsafe lock in the past:
 (pcpu_alloc_mutex){+.+.+.}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
Chain exists of:
  &delayed_node->mutex --> &found->groups_sem --> pcpu_alloc_mutex

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pcpu_alloc_mutex);
                               local_irq_disable();
                               lock(&delayed_node->mutex);
                               lock(&found->groups_sem);
  <Interrupt>
    lock(&delayed_node->mutex);

 *** DEADLOCK ***

2 locks held by kswapd0/50:
 #0:  (shrinker_rwsem){++++..}, at: [<ffffffff811dc11f>] shrink_slab+0x7f/0x5b0
 #1:  (&type->s_umount_key#30){+++++.}, at: [<ffffffff8126dec6>] trylock_super+0x16/0x50

the shortest dependencies between 2nd lock and 1st lock:
   -> (pcpu_alloc_mutex){+.+.+.} ops: 4904 {
      HARDIRQ-ON-W at:
                          __mutex_lock+0x4e/0x8c0
                          pcpu_alloc+0x1ac/0x5e0
                          alloc_kmem_cache_cpus.isra.70+0x25/0xa0
                          __do_tune_cpucache+0x2c/0x220
                          do_tune_cpucache+0x26/0xc0
                          enable_cpucache+0x6d/0xf0
                          kmem_cache_init_late+0x42/0x75
                          start_kernel+0x343/0x4cb
                          x86_64_start_kernel+0x127/0x134
                          secondary_startup_64+0xa5/0xb0
      SOFTIRQ-ON-W at:
                          __mutex_lock+0x4e/0x8c0
                          pcpu_alloc+0x1ac/0x5e0
                          alloc_kmem_cache_cpus.isra.70+0x25/0xa0
                          __do_tune_cpucache+0x2c/0x220
                          do_tune_cpucache+0x26/0xc0
                          enable_cpucache+0x6d/0xf0
                          kmem_cache_init_late+0x42/0x75
                          start_kernel+0x343/0x4cb
                          x86_64_start_kernel+0x127/0x134
                          secondary_startup_64+0xa5/0xb0
      RECLAIM_FS-ON-W at:
                             __kmalloc+0x47/0x310
                             pcpu_extend_area_map+0x2b/0xc0
                             pcpu_alloc+0x3ec/0x5e0
                             alloc_kmem_cache_cpus.isra.70+0x25/0xa0
                             __do_tune_cpucache+0x2c/0x220
                             do_tune_cpucache+0x26/0xc0
                             enable_cpucache+0x6d/0xf0
                             __kmem_cache_create+0x1bf/0x390
                             create_cache+0xba/0x1b0
                             kmem_cache_create+0x1f8/0x2b0
                             ksm_init+0x6f/0x19d
                             do_one_initcall+0x50/0x1b0
                             kernel_init_freeable+0x201/0x289
                             kernel_init+0xa/0x100
                             ret_from_fork+0x3a/0x50
      INITIAL USE at:
                         __mutex_lock+0x4e/0x8c0
                         pcpu_alloc+0x1ac/0x5e0
                         alloc_kmem_cache_cpus.isra.70+0x25/0xa0
                         setup_cpu_cache+0x2f/0x1f0
                         __kmem_cache_create+0x1bf/0x390
                         create_boot_cache+0x8b/0xb1
                         kmem_cache_init+0xa1/0x19e
                         start_kernel+0x270/0x4cb
                         x86_64_start_kernel+0x127/0x134
                         secondary_startup_64+0xa5/0xb0
    }
    ... key      at: [<ffffffff821d8e70>] pcpu_alloc_mutex+0x70/0xa0
    ... acquired at:
   pcpu_alloc+0x1ac/0x5e0
   __percpu_counter_init+0x4e/0xb0
   btrfs_init_fs_root+0x99/0x1c0 [btrfs]
   btrfs_get_fs_root.part.54+0x5b/0x150 [btrfs]
   resolve_indirect_refs+0x130/0x830 [btrfs]
   find_parent_nodes+0x69e/0xff0 [btrfs]
   btrfs_find_all_roots_safe+0xa0/0x110 [btrfs]
   btrfs_find_all_roots+0x50/0x70 [btrfs]
   btrfs_qgroup_prepare_account_extents+0x53/0x90 [btrfs]
   btrfs_commit_transaction+0x3ce/0x9b0 [btrfs]
   transaction_kthread+0x176/0x1b0 [btrfs]
   kthread+0x102/0x140
   ret_from_fork+0x3a/0x50

  -> (&fs_info->commit_root_sem){++++..} ops: 1566382 {
     HARDIRQ-ON-W at:
                        down_write+0x3e/0xa0
                        cache_block_group+0x287/0x420 [btrfs]
                        find_free_extent+0x106c/0x12d0 [btrfs]
                        btrfs_reserve_extent+0xd8/0x170 [btrfs]
                        cow_file_range.isra.66+0x133/0x470 [btrfs]
                        run_delalloc_range+0x121/0x410 [btrfs]
                        writepage_delalloc.isra.50+0xfe/0x180 [btrfs]
                        __extent_writepage+0x19a/0x360 [btrfs]
                        extent_write_cache_pages.constprop.56+0x249/0x3e0 [btrfs]
                        extent_writepages+0x4d/0x60 [btrfs]
                        do_writepages+0x1a/0x70
                        __filemap_fdatawrite_range+0xa7/0xe0
                        btrfs_rename+0x5ee/0xdb0 [btrfs]
                        vfs_rename+0x52a/0x7e0
                        SyS_rename+0x351/0x3b0
                        do_syscall_64+0x79/0x1e0
                        entry_SYSCALL_64_after_hwframe+0x42/0xb7
     HARDIRQ-ON-R at:
                        down_read+0x35/0x90
                        caching_thread+0x57/0x560 [btrfs]
                        normal_work_helper+0x1c0/0x5e0 [btrfs]
                        process_one_work+0x1e0/0x5c0
                        worker_thread+0x44/0x390
                        kthread+0x102/0x140
                        ret_from_fork+0x3a/0x50
     SOFTIRQ-ON-W at:
                        down_write+0x3e/0xa0
                        cache_block_group+0x287/0x420 [btrfs]
                        find_free_extent+0x106c/0x12d0 [btrfs]
                        btrfs_reserve_extent+0xd8/0x170 [btrfs]
                        cow_file_range.isra.66+0x133/0x470 [btrfs]
                        run_delalloc_range+0x121/0x410 [btrfs]
                        writepage_delalloc.isra.50+0xfe/0x180 [btrfs]
                        __extent_writepage+0x19a/0x360 [btrfs]
                        extent_write_cache_pages.constprop.56+0x249/0x3e0 [btrfs]
                        extent_writepages+0x4d/0x60 [btrfs]
                        do_writepages+0x1a/0x70
                        __filemap_fdatawrite_range+0xa7/0xe0
                        btrfs_rename+0x5ee/0xdb0 [btrfs]
                        vfs_rename+0x52a/0x7e0
                        SyS_rename+0x351/0x3b0
                        do_syscall_64+0x79/0x1e0
                        entry_SYSCALL_64_after_hwframe+0x42/0xb7
     SOFTIRQ-ON-R at:
                        down_read+0x35/0x90
                        caching_thread+0x57/0x560 [btrfs]
                        normal_work_helper+0x1c0/0x5e0 [btrfs]
                        process_one_work+0x1e0/0x5c0
                        worker_thread+0x44/0x390
                        kthread+0x102/0x140
                        ret_from_fork+0x3a/0x50
     INITIAL USE at:
                       down_write+0x3e/0xa0
                       cache_block_group+0x287/0x420 [btrfs]
                       find_free_extent+0x106c/0x12d0 [btrfs]
                       btrfs_reserve_extent+0xd8/0x170 [btrfs]
                       cow_file_range.isra.66+0x133/0x470 [btrfs]
                       run_delalloc_range+0x121/0x410 [btrfs]
                       writepage_delalloc.isra.50+0xfe/0x180 [btrfs]
                       __extent_writepage+0x19a/0x360 [btrfs]
                       extent_write_cache_pages.constprop.56+0x249/0x3e0 [btrfs]
                       extent_writepages+0x4d/0x60 [btrfs]
                       do_writepages+0x1a/0x70
                       __filemap_fdatawrite_range+0xa7/0xe0
                       btrfs_rename+0x5ee/0xdb0 [btrfs]
                       vfs_rename+0x52a/0x7e0
                       SyS_rename+0x351/0x3b0
                       do_syscall_64+0x79/0x1e0
                       entry_SYSCALL_64_after_hwframe+0x42/0xb7
   }
   ... key      at: [<ffffffffc0729578>] __key.61970+0x0/0xfffffffffff9aa88 [btrfs]
   ... acquired at:
   cache_block_group+0x287/0x420 [btrfs]
   find_free_extent+0x106c/0x12d0 [btrfs]
   btrfs_reserve_extent+0xd8/0x170 [btrfs]
   btrfs_alloc_tree_block+0x12f/0x4c0 [btrfs]
   btrfs_create_tree+0xbb/0x2a0 [btrfs]
   btrfs_create_uuid_tree+0x37/0x140 [btrfs]
   open_ctree+0x23c0/0x2660 [btrfs]
   btrfs_mount+0xd36/0xf90 [btrfs]
   mount_fs+0x3a/0x160
   vfs_kern_mount+0x66/0x150
   btrfs_mount+0x18c/0xf90 [btrfs]
   mount_fs+0x3a/0x160
   vfs_kern_mount+0x66/0x150
   do_mount+0x1c1/0xcc0
   SyS_mount+0x7e/0xd0
   do_syscall_64+0x79/0x1e0
   entry_SYSCALL_64_after_hwframe+0x42/0xb7

 -> (&found->groups_sem){++++..} ops: 2134587 {
    HARDIRQ-ON-W at:
                      down_write+0x3e/0xa0
                      __link_block_group+0x34/0x130 [btrfs]
                      btrfs_read_block_groups+0x33d/0x7b0 [btrfs]
                      open_ctree+0x2054/0x2660 [btrfs]
                      btrfs_mount+0xd36/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      btrfs_mount+0x18c/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      do_mount+0x1c1/0xcc0
                      SyS_mount+0x7e/0xd0
                      do_syscall_64+0x79/0x1e0
                      entry_SYSCALL_64_after_hwframe+0x42/0xb7
    HARDIRQ-ON-R at:
                      down_read+0x35/0x90
                      btrfs_calc_num_tolerated_disk_barrier_failures+0x113/0x1f0 [btrfs]
                      open_ctree+0x207b/0x2660 [btrfs]
                      btrfs_mount+0xd36/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      btrfs_mount+0x18c/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      do_mount+0x1c1/0xcc0
                      SyS_mount+0x7e/0xd0
                      do_syscall_64+0x79/0x1e0
                      entry_SYSCALL_64_after_hwframe+0x42/0xb7
    SOFTIRQ-ON-W at:
                      down_write+0x3e/0xa0
                      __link_block_group+0x34/0x130 [btrfs]
                      btrfs_read_block_groups+0x33d/0x7b0 [btrfs]
                      open_ctree+0x2054/0x2660 [btrfs]
                      btrfs_mount+0xd36/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      btrfs_mount+0x18c/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      do_mount+0x1c1/0xcc0
                      SyS_mount+0x7e/0xd0
                      do_syscall_64+0x79/0x1e0
                      entry_SYSCALL_64_after_hwframe+0x42/0xb7
    SOFTIRQ-ON-R at:
                      down_read+0x35/0x90
                      btrfs_calc_num_tolerated_disk_barrier_failures+0x113/0x1f0 [btrfs]
                      open_ctree+0x207b/0x2660 [btrfs]
                      btrfs_mount+0xd36/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      btrfs_mount+0x18c/0xf90 [btrfs]
                      mount_fs+0x3a/0x160
                      vfs_kern_mount+0x66/0x150
                      do_mount+0x1c1/0xcc0
                      SyS_mount+0x7e/0xd0
                      do_syscall_64+0x79/0x1e0
                      entry_SYSCALL_64_after_hwframe+0x42/0xb7
    INITIAL USE at:
                     down_write+0x3e/0xa0
                     __link_block_group+0x34/0x130 [btrfs]
                     btrfs_read_block_groups+0x33d/0x7b0 [btrfs]
                     open_ctree+0x2054/0x2660 [btrfs]
                     btrfs_mount+0xd36/0xf90 [btrfs]
                     mount_fs+0x3a/0x160
                     vfs_kern_mount+0x66/0x150
                     btrfs_mount+0x18c/0xf90 [btrfs]
                     mount_fs+0x3a/0x160
                     vfs_kern_mount+0x66/0x150
                     do_mount+0x1c1/0xcc0
                     SyS_mount+0x7e/0xd0
                     do_syscall_64+0x79/0x1e0
                     entry_SYSCALL_64_after_hwframe+0x42/0xb7
  }
  ... key      at: [<ffffffffc0729488>] __key.59101+0x0/0xfffffffffff9ab78 [btrfs]
  ... acquired at:
   find_free_extent+0xcb4/0x12d0 [btrfs]
   btrfs_reserve_extent+0xd8/0x170 [btrfs]
   btrfs_alloc_tree_block+0x12f/0x4c0 [btrfs]
   __btrfs_cow_block+0x110/0x5b0 [btrfs]
   btrfs_cow_block+0xd7/0x290 [btrfs]
   btrfs_search_slot+0x1f6/0x960 [btrfs]
   btrfs_lookup_inode+0x2a/0x90 [btrfs]
   __btrfs_update_delayed_inode+0x65/0x210 [btrfs]
   btrfs_commit_inode_delayed_inode+0x121/0x130 [btrfs]
   btrfs_evict_inode+0x3fe/0x6a0 [btrfs]
   evict+0xc4/0x190
   __dentry_kill+0xbf/0x170
   dput+0x2ae/0x2f0
   SyS_rename+0x2a6/0x3b0
   do_syscall_64+0x79/0x1e0
   entry_SYSCALL_64_after_hwframe+0x42/0xb7

-> (&delayed_node->mutex){+.+.-.} ops: 5580204 {
   HARDIRQ-ON-W at:
                    __mutex_lock+0x4e/0x8c0
                    btrfs_delayed_update_inode+0x46/0x6e0 [btrfs]
                    btrfs_update_inode+0x83/0x110 [btrfs]
                    btrfs_dirty_inode+0x62/0xe0 [btrfs]
                    touch_atime+0x8c/0xb0
                    do_generic_file_read+0x818/0xb10
                    __vfs_read+0xdc/0x150
                    vfs_read+0x8a/0x130
                    SyS_read+0x45/0xa0
                    do_syscall_64+0x79/0x1e0
                    entry_SYSCALL_64_after_hwframe+0x42/0xb7
   SOFTIRQ-ON-W at:
                    __mutex_lock+0x4e/0x8c0
                    btrfs_delayed_update_inode+0x46/0x6e0 [btrfs]
                    btrfs_update_inode+0x83/0x110 [btrfs]
                    btrfs_dirty_inode+0x62/0xe0 [btrfs]
                    touch_atime+0x8c/0xb0
                    do_generic_file_read+0x818/0xb10
                    __vfs_read+0xdc/0x150
                    vfs_read+0x8a/0x130
                    SyS_read+0x45/0xa0
                    do_syscall_64+0x79/0x1e0
                    entry_SYSCALL_64_after_hwframe+0x42/0xb7
   IN-RECLAIM_FS-W at:
                       __mutex_lock+0x4e/0x8c0
                       __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs]
                       btrfs_evict_inode+0x22c/0x6a0 [btrfs]
                       evict+0xc4/0x190
                       dispose_list+0x35/0x50
                       prune_icache_sb+0x42/0x50
                       super_cache_scan+0x139/0x190
                       shrink_slab+0x262/0x5b0
                       shrink_node+0x2eb/0x2f0
                       kswapd+0x2eb/0x890
                       kthread+0x102/0x140
                       ret_from_fork+0x3a/0x50
   INITIAL USE at:
                   __mutex_lock+0x4e/0x8c0
                   btrfs_delayed_update_inode+0x46/0x6e0 [btrfs]
                   btrfs_update_inode+0x83/0x110 [btrfs]
                   btrfs_dirty_inode+0x62/0xe0 [btrfs]
                   touch_atime+0x8c/0xb0
                   do_generic_file_read+0x818/0xb10
                   __vfs_read+0xdc/0x150
                   vfs_read+0x8a/0x130
                   SyS_read+0x45/0xa0
                   do_syscall_64+0x79/0x1e0
                   entry_SYSCALL_64_after_hwframe+0x42/0xb7
 }
 ... key      at: [<ffffffffc072d488>] __key.56935+0x0/0xfffffffffff96b78 [btrfs]
 ... acquired at:
   __lock_acquire+0x264/0x11c0
   lock_acquire+0xbd/0x1e0
   __mutex_lock+0x4e/0x8c0
   __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs]
   btrfs_evict_inode+0x22c/0x6a0 [btrfs]
   evict+0xc4/0x190
   dispose_list+0x35/0x50
   prune_icache_sb+0x42/0x50
   super_cache_scan+0x139/0x190
   shrink_slab+0x262/0x5b0
   shrink_node+0x2eb/0x2f0
   kswapd+0x2eb/0x890
   kthread+0x102/0x140
   ret_from_fork+0x3a/0x50

stack backtrace:
CPU: 1 PID: 50 Comm: kswapd0 Tainted: G        W        4.12.14-kvmsmall #8 SLE15 (unreleased)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
 dump_stack+0x78/0xb7
 print_irq_inversion_bug.part.38+0x19f/0x1aa
 check_usage_forwards+0x102/0x120
 ? ret_from_fork+0x3a/0x50
 ? check_usage_backwards+0x110/0x110
 mark_lock+0x16c/0x270
 __lock_acquire+0x264/0x11c0
 ? pagevec_lookup_entries+0x1a/0x30
 ? truncate_inode_pages_range+0x2b3/0x7f0
 lock_acquire+0xbd/0x1e0
 ? __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs]
 __mutex_lock+0x4e/0x8c0
 ? __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs]
 ? __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs]
 ? btrfs_evict_inode+0x1f6/0x6a0 [btrfs]
 __btrfs_release_delayed_node+0x3a/0x1f0 [btrfs]
 btrfs_evict_inode+0x22c/0x6a0 [btrfs]
 evict+0xc4/0x190
 dispose_list+0x35/0x50
 prune_icache_sb+0x42/0x50
 super_cache_scan+0x139/0x190
 shrink_slab+0x262/0x5b0
 shrink_node+0x2eb/0x2f0
 kswapd+0x2eb/0x890
 kthread+0x102/0x140
 ? mem_cgroup_shrink_node+0x2c0/0x2c0
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x3a/0x50

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Filipe Manana
b037a568e3 Btrfs: fix copy_items() return value when logging an inode
[ Upstream commit 8434ec46c6e3232cebc25a910363b29f5c617820 ]

When logging an inode, at tree-log.c:copy_items(), if we call
btrfs_next_leaf() at the loop which checks for the need to log holes, we
need to make sure copy_items() returns the value 1 to its caller and
not 0 (on success). This is because the path the caller passed was
released and is now different from what is was before, and the caller
expects a return value of 0 to mean both success and that the path
has not changed, while a return value of 1 means both success and
signals the caller that it can not reuse the path, it has to perform
another tree search.

Even though this is a case that should not be triggered on normal
circumstances or very rare at least, its consequences can be very
unpredictable (especially when replaying a log tree).

Fixes: 16e7549f04 ("Btrfs: incompatible format change to remove hole extents")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Qu Wenruo
a77b0cfe80 btrfs: tests/qgroup: Fix wrong tree backref level
[ Upstream commit 3c0efdf03b2d127f0e40e30db4e7aa0429b1b79a ]

The extent tree of the test fs is like the following:

 BTRFS info (device (null)): leaf 16327509003777336587 total ptrs 1 free space 3919
  item 0 key (4096 168 4096) itemoff 3944 itemsize 51
          extent refs 1 gen 1 flags 2
          tree block key (68719476736 0 0) level 1
                                           ^^^^^^^
          ref#0: tree block backref root 5

And it's using an empty tree for fs tree, so there is no way that its
level can be 1.

For REAL (created by mkfs) fs tree backref with no skinny metadata, the
result should look like:

 item 3 key (30408704 EXTENT_ITEM 4096) itemoff 3845 itemsize 51
         refs 1 gen 4 flags TREE_BLOCK
         tree block key (256 INODE_ITEM 0) level 0
                                           ^^^^^^^
         tree block backref root 5

Fix the level to 0, so it won't break later tree level checker.

Fixes: faa2dbf004 ("Btrfs: add sanity tests for new qgroup accounting code")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Vicente Bergas
b18cda2787 Bluetooth: btusb: Add USB ID 7392:a611 for Edimax EW-7611ULB
[ Upstream commit a41e0796396eeceff673af4a38feaee149c6ff86 ]

This WiFi/Bluetooth USB dongle uses a Realtek chipset, so, use btrtl for it.

Product information:
https://wikidevi.com/wiki/Edimax_EW-7611ULB

>From /sys/kernel/debug/usb/devices
T:  Bus=02 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#=  3 Spd=480  MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=7392 ProdID=a611 Rev= 2.00
S:  Manufacturer=Realtek
S:  Product=Edimax Wi-Fi N150 Bluetooth4.0 USB Adapter
S:  SerialNumber=00e04c000001
C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=500mA
A:  FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=01
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
I:* If#= 2 Alt= 0 #EPs= 6 Cls=ff(vend.) Sub=ff Prot=ff Driver=rtl8723bu
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=03(Int.) MxPS=  64 Ivl=500us
E:  Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Tested-by: Vicente Bergas <vicencb@gmail.com>
Signed-off-by: Vicente Bergas <vicencb@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Florian Fainelli
7dd8604365 net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
[ Upstream commit 60d6e6f0b9e422dd01aeda39257ee0428e5e2a3f ]

bgmac_dma_tx_ring_free() assigns the ctl1 word which is a litle endian
32-bit word without using proper accessors, fix this, and because a
length cannot be negative, use unsigned int while at it.

Fixes: 9cde94506e ("bgmac: implement scatter/gather support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
Bryan O'Donoghue
d9bc96c4dc rtc: snvs: Fix usage of snvs_rtc_enable
[ Upstream commit 1485991c024603b2fb4ae77beb7a0d741128a48e ]

commit 179a502f8c ("rtc: snvs: add Freescale rtc-snvs driver") introduces
the SNVS RTC driver with a function snvs_rtc_enable().

snvs_rtc_enable() can return an error on the enable path however this
driver does not currently trap that failure on the probe() path and
consequently if enabling the RTC fails we encounter a later error spinning
forever in rtc_write_sync_lp().

[   36.093481] [<c010d630>] (__irq_svc) from [<c0c2e9ec>] (_raw_spin_unlock_irqrestore+0x34/0x44)
[   36.102122] [<c0c2e9ec>] (_raw_spin_unlock_irqrestore) from [<c072e32c>] (regmap_read+0x4c/0x5c)
[   36.110938] [<c072e32c>] (regmap_read) from [<c085d0f4>] (rtc_write_sync_lp+0x6c/0x98)
[   36.118881] [<c085d0f4>] (rtc_write_sync_lp) from [<c085d160>] (snvs_rtc_alarm_irq_enable+0x40/0x4c)
[   36.128041] [<c085d160>] (snvs_rtc_alarm_irq_enable) from [<c08567b4>] (rtc_timer_do_work+0xd8/0x1a8)
[   36.137291] [<c08567b4>] (rtc_timer_do_work) from [<c01441b8>] (process_one_work+0x28c/0x76c)
[   36.145840] [<c01441b8>] (process_one_work) from [<c01446cc>] (worker_thread+0x34/0x58c)
[   36.153961] [<c01446cc>] (worker_thread) from [<c014aee4>] (kthread+0x138/0x150)
[   36.161388] [<c014aee4>] (kthread) from [<c0107e14>] (ret_from_fork+0x14/0x20)
[   36.168635] rcu_sched kthread starved for 2602 jiffies! g496 c495 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
[   36.178564] rcu_sched       R  running task        0     8      2 0x00000000
[   36.185664] [<c0c288b0>] (__schedule) from [<c0c29134>] (schedule+0x3c/0xa0)
[   36.192739] [<c0c29134>] (schedule) from [<c0c2db80>] (schedule_timeout+0x78/0x4e0)
[   36.200422] [<c0c2db80>] (schedule_timeout) from [<c01a7ab0>] (rcu_gp_kthread+0x648/0x1864)
[   36.208800] [<c01a7ab0>] (rcu_gp_kthread) from [<c014aee4>] (kthread+0x138/0x150)
[   36.216309] [<c014aee4>] (kthread) from [<c0107e14>] (ret_from_fork+0x14/0x20)

This patch fixes by parsing the result of rtc_write_sync_lp() and
propagating both in the probe and elsewhere. If the RTC doesn't start we
don't proceed loading the driver and don't get into this loop mess later
on.

Fixes: 179a502f8c ("rtc: snvs: add Freescale rtc-snvs driver")
Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:09 +02:00
David S. Miller
659db5217b sparc64: Make atomic_xchg() an inline function rather than a macro.
[ Upstream commit d13864b68e41c11e4231de90cf358658f6ecea45 ]

This avoids a lot of -Wunused warnings such as:

====================
kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’:
./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value]
 #define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))

./arch/sparc/include/asm/atomic_64.h:86:30: note: in expansion of macro ‘xchg’
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
                              ^~~~
kernel/debug/debug_core.c:508:4: note: in expansion of macro ‘atomic_xchg’
    atomic_xchg(&kgdb_active, cpu);
    ^~~~~~~~~~~
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
David Howells
61b4c050ad fscache: Fix hanging wait on page discarded by writeback
[ Upstream commit 2c98425720233ae3e135add0c7e869b32913502f ]

If the fscache asynchronous write operation elects to discard a page that's
pending storage to the cache because the page would be over the store limit
then it needs to wake the page as someone may be waiting on completion of
the write.

The problem is that the store limit may be updated by a different
asynchronous operation - and so may miss the write - and that the store
limit may not even get updated until later by the netfs.

Fix the kernel hang by making fscache_write_op() mark as written any pages
that are over the limit.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Sean Christopherson
ffd0502d82 KVM: VMX: raise internal error for exception during invalid protected mode state
[ Upstream commit add5ff7a216ee545a214013f26d1ef2f44a9c9f8 ]

Exit to userspace with KVM_INTERNAL_ERROR_EMULATION if we encounter
an exception in Protected Mode while emulating guest due to invalid
guest state.  Unlike Big RM, KVM doesn't support emulating exceptions
in PM, i.e. PM exceptions are always injected via the VMCS.  Because
we will never do VMRESUME due to emulation_required, the exception is
never realized and we'll keep emulating the faulting instruction over
and over until we receive a signal.

Exit to userspace iff there is a pending exception, i.e. don't exit
simply on a requested event. The purpose of this check and exit is to
aid in debugging a guest that is in all likelihood already doomed.
Invalid guest state in PM is extremely limited in normal operation,
e.g. it generally only occurs for a few instructions early in BIOS,
and any exception at this time is all but guaranteed to be fatal.
Non-vectored interrupts, e.g. INIT, SIPI and SMI, can be cleanly
handled/emulated, while checking for vectored interrupts, e.g. INTR
and NMI, without hitting false positives would add a fair amount of
complexity for almost no benefit (getting hit by lightning seems
more likely than encountering this specific scenario).

Add a WARN_ON_ONCE to vmx_queue_exception() if we try to inject an
exception via the VMCS and emulation_required is true.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Davidlohr Bueso
094c145f4c sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
[ Upstream commit d29a20645d5e929aa7e8616f28e5d8e1c49263ec ]

While running rt-tests' pi_stress program I got the following splat:

  rq->clock_update_flags < RQCF_ACT_SKIP
  WARNING: CPU: 27 PID: 0 at kernel/sched/sched.h:960 assert_clock_updated.isra.38.part.39+0x13/0x20

  [...]

  <IRQ>
  enqueue_top_rt_rq+0xf4/0x150
  ? cpufreq_dbs_governor_start+0x170/0x170
  sched_rt_rq_enqueue+0x65/0x80
  sched_rt_period_timer+0x156/0x360
  ? sched_rt_rq_enqueue+0x80/0x80
  __hrtimer_run_queues+0xfa/0x260
  hrtimer_interrupt+0xcb/0x220
  smp_apic_timer_interrupt+0x62/0x120
  apic_timer_interrupt+0xf/0x20
  </IRQ>

  [...]

  do_idle+0x183/0x1e0
  cpu_startup_entry+0x5f/0x70
  start_secondary+0x192/0x1d0
  secondary_startup_64+0xa5/0xb0

We can get rid of it be the "traditional" means of adding an
update_rq_clock() call after acquiring the rq->lock in
do_sched_rt_period_timer().

The case for the RT task throttling (which this workload also hits)
can be ignored in that the skip_update call is actually bogus and
quite the contrary (the request bits are removed/reverted).

By setting RQCF_UPDATED we really don't care if the skip is happening
or not and will therefore make the assert_clock_updated() check happy.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: linux-kernel@vger.kernel.org
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20180402164954.16255-1-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Jun Piao
6b6933fdb1 ocfs2/dlm: don't handle migrate lockres if already in shutdown
[ Upstream commit bb34f24c7d2c98d0c81838a7700e6068325b17a0 ]

We should not handle migrate lockres if we are already in
'DLM_CTXT_IN_SHUTDOWN', as that will cause lockres remains after leaving
dlm domain.  At last other nodes will get stuck into infinite loop when
requsting lock from us.

The problem is caused by concurrency umount between nodes.  Before
receiveing N1's DLM_BEGIN_EXIT_DOMAIN_MSG, N2 has picked up N1 as the
migrate target.  So N2 will continue sending lockres to N1 even though
N1 has left domain.

        N1                             N2 (owner)
                                       touch file

    access the file,
    and get pr lock

                                       begin leave domain and
                                       pick up N1 as new owner

    begin leave domain and
    migrate all lockres done

                                       begin migrate lockres to N1

    end leave domain, but
    the lockres left
    unexpectedly, because
    migrate task has passed

[piaojun@huawei.com: v3]
  Link: http://lkml.kernel.org/r/5A9CBD19.5020107@huawei.com
Link: http://lkml.kernel.org/r/5A99F028.2090902@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Nikolay Borisov
e1ccd597ff btrfs: Fix possible softlock on single core machines
[ Upstream commit 1e1c50a929bc9e49bc3f9935b92450d9e69f8158 ]

do_chunk_alloc implements a loop checking whether there is a pending
chunk allocation and if so causes the caller do loop. Generally this
loop is executed only once, however testing with btrfs/072 on a single
core vm machines uncovered an extreme case where the system could loop
indefinitely. This is due to a missing cond_resched when loop which
doesn't give a chance to the previous chunk allocator finish its job.

The fix is to simply add the missing cond_resched.

Fixes: 6d74119f1a ("Btrfs: avoid taking the chunk_mutex in do_chunk_alloc")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Liu Bo
b487d1cd2c Btrfs: fix NULL pointer dereference in log_dir_items
[ Upstream commit 80c0b4210a963e31529e15bf90519708ec947596 ]

0, 1 and <0 can be returned by btrfs_next_leaf(), and when <0 is
returned, path->nodes[0] could be NULL, log_dir_items lacks such a
check for <0 and we may run into a null pointer dereference panic.

Fixes: e02119d5a7 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Liu Bo
0c3968cc74 Btrfs: bail out on error during replay_dir_deletes
[ Upstream commit b98def7ca6e152ee55e36863dddf6f41f12d1dc6 ]

If errors were returned by btrfs_next_leaf(), replay_dir_deletes needs
to bail out, otherwise @ret would be forced to be 0 after 'break;' and
the caller won't be aware of it.

Fixes: e02119d5a7 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Huang Ying
532618b474 mm: fix races between address_space dereference and free in page_evicatable
[ Upstream commit e92bb4dd9673945179b1fc738c9817dd91bfb629 ]

When page_mapping() is called and the mapping is dereferenced in
page_evicatable() through shrink_active_list(), it is possible for the
inode to be truncated and the embedded address space to be freed at the
same time.  This may lead to the following race.

CPU1                                                CPU2

truncate(inode)                                     shrink_active_list()
  ...                                                 page_evictable(page)
  truncate_inode_page(mapping, page);
    delete_from_page_cache(page)
      spin_lock_irqsave(&mapping->tree_lock, flags);
        __delete_from_page_cache(page, NULL)
          page_cache_tree_delete(..)
            ...                                         mapping = page_mapping(page);
            page->mapping = NULL;
            ...
      spin_unlock_irqrestore(&mapping->tree_lock, flags);
      page_cache_free_page(mapping, page)
        put_page(page)
          if (put_page_testzero(page)) -> false
- inode now has no pages and can be freed including embedded address_space

                                                        mapping_unevictable(mapping)
							  test_bit(AS_UNEVICTABLE, &mapping->flags);
- we've dereferenced mapping which is potentially already free.

Similar race exists between swap cache freeing and page_evicatable()
too.

The address_space in inode and swap cache will be freed after a RCU
grace period.  So the races are fixed via enclosing the page_mapping()
and address_space usage in rcu_read_lock/unlock().  Some comments are
added in code to make it clear what is protected by the RCU read lock.

Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Claudio Imbrenda
35de741a55 mm/ksm: fix interaction with THP
[ Upstream commit 77da2ba0648a4fd52e5ff97b8b2b8dd312aec4b0 ]

This patch fixes a corner case for KSM.  When two pages belong or
belonged to the same transparent hugepage, and they should be merged,
KSM fails to split the page, and therefore no merging happens.

This bug can be reproduced by:
* making sure ksm is running (in case disabling ksmtuned)
* enabling transparent hugepages
* allocating a THP-aligned 1-THP-sized buffer
  e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21)
* filling it with the same values
  e.g. memset(p, 42, 1<<21)
* performing madvise to make it mergeable
  e.g. madvise(p, 1<<21, MADV_MERGEABLE)
* waiting for KSM to perform a few scans

The expected outcome is that the all the pages get merged (1 shared and
the rest sharing); the actual outcome is that no pages get merged (1
unshared and the rest volatile)

The reason of this behaviour is that we increase the reference count
once for both pages we want to merge, but if they belong to the same
hugepage (or compound page), the reference counter used in both cases is
the one of the head of the compound page.  This means that
split_huge_page will find a value of the reference counter too high and
will fail.

This patch solves this problem by testing if the two pages to merge
belong to the same hugepage when attempting to merge them.  If so, the
hugepage is split safely.  This means that the hugepage is not split if
not necessary.

Link: http://lkml.kernel.org/r/1521548069-24758-1-git-send-email-imbrenda@linux.vnet.ibm.com
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Co-authored-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Esben Haabendal
1d4902ad2b dp83640: Ensure against premature access to PHY registers after reset
[ Upstream commit 76327a35caabd1a932e83d6a42b967aa08584e5d ]

The datasheet specifies a 3uS pause after performing a software
reset. The default implementation of genphy_soft_reset() does not
provide this, so implement soft_reset with the needed pause.

Signed-off-by: Esben Haabendal <eha@deif.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:08 +02:00
Dave Carroll
48b7c436bf scsi: aacraid: Insure command thread is not recursively stopped
[ Upstream commit 1c6b41fb92936fa5facea464d5d7cbf855966d04 ]

If a recursive IOP_RESET is invoked, usually due to the eh_thread
handling errors after the first reset, be sure we flag that the command
thread has been stopped to avoid an Oops of the form;

 [ 336.620256] CPU: 28 PID: 1193 Comm: scsi_eh_0 Kdump: loaded Not tainted 4.14.0-49.el7a.ppc64le #1
 [ 336.620297] task: c000003fd630b800 task.stack: c000003fd61a4000
 [ 336.620326] NIP: c000000000176794 LR: c00000000013038c CTR: c00000000024bc10
 [ 336.620361] REGS: c000003fd61a7720 TRAP: 0300 Not tainted (4.14.0-49.el7a.ppc64le)
 [ 336.620395] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22084022 XER: 20040000
 [ 336.620435] CFAR: c000000000130388 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
 [ 336.620435] GPR00: c00000000013038c c000003fd61a79a0 c0000000014c7e00 0000000000000000
 [ 336.620435] GPR04: 000000000000000c 000000000000000c 9000000000009033 0000000000000477
 [ 336.620435] GPR08: 0000000000000477 0000000000000000 0000000000000000 c008000010f7d940
 [ 336.620435] GPR12: c00000000024bc10 c000000007a33400 c0000000001708a8 c000003fe3b881d8
 [ 336.620435] GPR16: c000003fe3b88060 c000003fd61a7d10 fffffffffffff000 000000000000001e
 [ 336.620435] GPR20: 0000000000000001 c000000000ebf1a0 0000000000000001 c000003fe3b88000
 [ 336.620435] GPR24: 0000000000000003 0000000000000002 c000003fe3b88840 c000003fe3b887e8
 [ 336.620435] GPR28: c000003fe3b88000 c000003fc8181788 0000000000000000 c000003fc8181700
 [ 336.620750] NIP [c000000000176794] exit_creds+0x34/0x160
 [ 336.620775] LR [c00000000013038c] __put_task_struct+0x8c/0x1f0
 [ 336.620804] Call Trace:
 [ 336.620817] [c000003fd61a79a0] [c000003fe3b88000] 0xc000003fe3b88000 (unreliable)
 [ 336.620853] [c000003fd61a79d0] [c00000000013038c] __put_task_struct+0x8c/0x1f0
 [ 336.620889] [c000003fd61a7a00] [c000000000171418] kthread_stop+0x1e8/0x1f0
 [ 336.620922] [c000003fd61a7a40] [c008000010f7448c] aac_reset_adapter+0x14c/0x8d0 [aacraid]
 [ 336.620959] [c000003fd61a7b00] [c008000010f60174] aac_eh_host_reset+0x84/0x100 [aacraid]
 [ 336.621010] [c000003fd61a7b30] [c000000000864f24] scsi_try_host_reset+0x74/0x180
 [ 336.621046] [c000003fd61a7bb0] [c000000000867ac0] scsi_eh_ready_devs+0xc00/0x14d0
 [ 336.625165] [c000003fd61a7ca0] [c0000000008699e0] scsi_error_handler+0x550/0x730
 [ 336.632101] [c000003fd61a7dc0] [c000000000170a08] kthread+0x168/0x1b0
 [ 336.639031] [c000003fd61a7e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
 [ 336.645971] Instruction dump:
 [ 336.648743] 384216a0 7c0802a6 fbe1fff8 f8010010 f821ffd1 7c7f1b78 60000000 60000000
 [ 336.657056] 39400000 e87f0838 f95f0838 7c0004ac <7d401828> 314affff 7d40192d 40c2fff4
 [ 336.663997] -[ end trace 4640cf8d4945ad95 ]-

So flag when the thread is stopped by setting the thread pointer to NULL.

Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00
Shunyong Yang
11f3429796 cpufreq: CPPC: Initialize shared perf capabilities of CPUs
[ Upstream commit 8913315e9459b146e5888ab5138e10daa061b885 ]

When multiple CPUs are related in one cpufreq policy, the first online
CPU will be chosen by default to handle cpufreq operations. Let's take
cpu0 and cpu1 as an example.

When cpu0 is offline, policy->cpu will be shifted to cpu1. cpu1's perf
capabilities should be initialized. Otherwise, perf capabilities are 0s
and speed change can not take effect.

This patch copies perf capabilities of the first online CPU to other
shared CPUs when policy shared type is CPUFREQ_SHARED_TYPE_ANY.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00
Carlos Maiolino
72422ef801 Force log to disk before reading the AGF during a fstrim
[ Upstream commit 8c81dd46ef3c416b3b95e3020fb90dbd44e6140b ]

Forcing the log to disk after reading the agf is wrong, we might be
calling xfs_log_force with XFS_LOG_SYNC with a metadata lock held.

This can cause a deadlock when racing a fstrim with a filesystem
shutdown.

The deadlock has been identified due a miscalculation bug in device-mapper
dm-thin, which returns lack of space to its users earlier than the device itself
really runs out of space, changing the device-mapper volume into an error state.

The problem happened while filling the filesystem with a single file,
triggering the bug in device-mapper, consequently causing an IO error
and shutting down the filesystem.

If such file is removed, and fstrim executed before the XFS finishes the
shut down process, the fstrim process will end up holding the buffer
lock, and going to sleep on the cil wait queue.

At this point, the shut down process will try to wake up all the threads
waiting on the cil wait queue, but for this, it will try to hold the
same buffer log already held my the fstrim, locking up the filesystem.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00
Jens Axboe
651a896485 sr: get/drop reference to device in revalidate and check_events
[ Upstream commit 2d097c50212e137e7b53ffe3b37561153eeba87d ]

We can't just use scsi_cd() to get the scsi_cd structure, we have
to grab a live reference to the device. For both callbacks, we're
not inside an open where we already hold a reference to the device.

This fixes device removal/addition under concurrent device access,
which otherwise could result in the below oops.

NULL pointer dereference at 0000000000000010
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
sr 12:0:0:0: [sr2] scsi-1 drive
 scsi_debug crc_t10dif crct10dif_generic crct10dif_common nvme nvme_core sb_edac xl
sr 12:0:0:0: Attached scsi CD-ROM sr2
 sr_mod cdrom btrfs xor zstd_decompress zstd_compress xxhash lzo_compress zlib_defc
sr 12:0:0:0: Attached scsi generic sg7 type 5
 igb ahci libahci i2c_algo_bit libata dca [last unloaded: crc_t10dif]
CPU: 43 PID: 4629 Comm: systemd-udevd Not tainted 4.16.0+ #650
Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
RIP: 0010:sr_block_revalidate_disk+0x23/0x190 [sr_mod]
RSP: 0018:ffff883ff357bb58 EFLAGS: 00010292
RAX: ffffffffa00b07d0 RBX: ffff883ff3058000 RCX: ffff883ff357bb66
RDX: 0000000000000003 RSI: 0000000000007530 RDI: ffff881fea631000
RBP: 0000000000000000 R08: ffff881fe4d38400 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000000001b6 R12: 000000000800005d
R13: 000000000800005d R14: ffff883ffd9b3790 R15: 0000000000000000
FS:  00007f7dc8e6d8c0(0000) GS:ffff883fff340000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000003ffda98005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? __invalidate_device+0x48/0x60
 check_disk_change+0x4c/0x60
 sr_block_open+0x16/0xd0 [sr_mod]
 __blkdev_get+0xb9/0x450
 ? iget5_locked+0x1c0/0x1e0
 blkdev_get+0x11e/0x320
 ? bdget+0x11d/0x150
 ? _raw_spin_unlock+0xa/0x20
 ? bd_acquire+0xc0/0xc0
 do_dentry_open+0x1b0/0x320
 ? inode_permission+0x24/0xc0
 path_openat+0x4e6/0x1420
 ? cpumask_any_but+0x1f/0x40
 ? flush_tlb_mm_range+0xa0/0x120
 do_filp_open+0x8c/0xf0
 ? __seccomp_filter+0x28/0x230
 ? _raw_spin_unlock+0xa/0x20
 ? __handle_mm_fault+0x7d6/0x9b0
 ? list_lru_add+0xa8/0xc0
 ? _raw_spin_unlock+0xa/0x20
 ? __alloc_fd+0xaf/0x160
 ? do_sys_open+0x1a6/0x230
 do_sys_open+0x1a6/0x230
 do_syscall_64+0x5a/0x100
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00
Tom Abraham
69ccb50c17 swap: divide-by-zero when zero length swap file on ssd
[ Upstream commit a06ad633a37c64a0cd4c229fc605cee8725d376e ]

Calling swapon() on a zero length swap file on SSD can lead to a
divide-by-zero.

Although creating such files isn't possible with mkswap and they woud be
considered invalid, it would be better for the swapon code to be more
robust and handle this condition gracefully (return -EINVAL).
Especially since the fix is small and straightforward.

To help with wear leveling on SSD, the swapon syscall calculates a
random position in the swap file using modulo p->highest_bit, which is
set to maxpages - 1 in read_swap_header.

If the swap file is zero length, read_swap_header sets maxpages=1 and
last_page=0, resulting in p->highest_bit=0 and we divide-by-zero when we
modulo p->highest_bit in swapon syscall.

This can be prevented by having read_swap_header return zero if
last_page is zero.

Link: http://lkml.kernel.org/r/5AC747C1020000A7001FA82C@prv-mh.provo.novell.com
Signed-off-by: Thomas Abraham <tabraham@suse.com>
Reported-by: <Mark.Landis@Teradata.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00
Danilo Krummrich
ea8a1f4a9a fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table
[ Upstream commit a0b0d1c345d0317efe594df268feb5ccc99f651e ]

proc_sys_link_fill_cache() does not take currently unregistering sysctl
tables into account, which might result into a page fault in
sysctl_follow_link() - add a check to fix it.

This bug has been present since v3.4.

Link: http://lkml.kernel.org/r/20180228013506.4915-1-danilokrummrich@dk-develop.de
Fixes: 0e47c99d7f ("sysctl: Replace root_list with links between sysctl_table_sets")
Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00
Joerg Roedel
dd0b7b0cee x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
[ Upstream commit e3e288121408c3abeed5af60b87b95c847143845 ]

The pmd_set_huge() and pud_set_huge() functions are used from
the generic ioremap() code to establish large mappings where this
is possible.

But the generic ioremap() code does not check whether the
PMD/PUD entries are already populated with a non-leaf entry,
so that any page-table pages these entries point to will be
lost.

Further, on x86-32 with SHARED_KERNEL_PMD=0, this causes a
BUG_ON() in vmalloc_sync_one() when PMD entries are synced
from swapper_pg_dir to the current page-table. This happens
because the PMD entry from swapper_pg_dir was promoted to a
huge-page entry while the current PGD still contains the
non-leaf entry. Because both entries are present and point
to a different page, the BUG_ON() triggers.

This was actually triggered with pti-x32 enabled in a KVM
virtual machine by the graphics driver.

A real and better fix for that would be to improve the
page-table handling in the generic ioremap() code. But that is
out-of-scope for this patch-set and left for later work.

Reported-by: David H. Gutteridge <dhgutteridge@sympatico.ca>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <llong@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20180411152437.GC15462@8bytes.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00