-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl13bQ8ACgkQONu9yGCS
aT7MNA/8CJJOI1UwUD2bmGsPg5OMOkAhucsIS1U2NH0OiPRbYbHRKiau8ML5FHAk
9MMEjndZoDxokndbBFAao/oq+8uXrzokxfoLAvzSm6ICIv8x5tI81v2yjVh8EPk3
6/2fJ68ugovSgAyMZlIK9Z/UeoJAoLPVqSlp7h0ecP00DOvEe5Vs347MfabNiNHP
OrKCd0myMieBdAy2idEOs/s97q0WbwPhrEELQFCDOEhTTdDMGMP14/X7TdFHELVf
G927erT6ztErLQHe8BUYmeQvxrLVoL9IV0vOMh98rbWxHD32/jqxO5yVY77oHH0b
lDT8nH2s+6pa3b3bZkVnEYKN9omE/O8QJVBHMuRGbWFjwyXJfAE7rMxVul6tUn/d
I96wBHD4kskGeXXcgb4zc/TSVo2WFJDvnnIgRs7hvUjPTAJ+iwJHXg9yjgu4BP7Z
uP14eTTv0CILtcKpkcRtjdmr64crJ/YdedZxlJut8nM5s7ui0waytpxzoaKIKg7i
j/QiRIYRikTTOqeABRw6/EZeYnieSdh1xrozI1fnKdikzU0nSx5Qxk7dsR4pTxrd
0699ba+TZHaJ+4eSYPxm+CejAXom9MBBgyLevT9ZkGu2wk3nPJ5GZU0olMcV1WXo
yN6/Xid8aXQBFOXw3Ex9pUI/qrV0fGEhZwFuOyCNFy1/lqMv2TY=
=s9Vs
-----END PGP SIGNATURE-----
Merge 4.4.192 into android-4.4
Changes in 4.4.192
net: tundra: tsi108: use spin_lock_irqsave instead of spin_lock_irq in IRQ context
net: tc35815: Explicitly check NET_IP_ALIGN is not zero in tc35815_rx
Bluetooth: btqca: Add a short delay before downloading the NVM
ibmveth: Convert multicast list size for little-endian system
gpio: Fix build error of function redefinition
cxgb4: fix a memory leak bug
net: myri10ge: fix memory leaks
cx82310_eth: fix a memory leak bug
net: kalmia: fix memory leaks
wimax/i2400m: fix a memory leak bug
ravb: Fix use-after-free ravb_tstamp_skb
Tools: hv: kvp: eliminate 'may be used uninitialized' warning
IB/mlx4: Fix memory leaks
ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
KVM: arm/arm64: Only skip MMIO insn once
libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
spi: bcm2835aux: ensure interrupts are enabled for shared handler
spi: bcm2835aux: unifying code between polling and interrupt driven code
spi: bcm2835aux: remove dangerous uncontrolled read of fifo
spi: bcm2835aux: fix corruptions for longer spi transfers
Revert "x86/apic: Include the LDR when clearing out APIC registers"
net: fix skb use after free in netpoll
net: stmmac: dwmac-rk: Don't fail if phy regulator is absent
Linux 4.4.192
Change-Id: I5e02cd84379aa9da7da5ed9545e939e0ca13197f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 950b07c14e8c59444e2359f15fd70ed5112e11a0 ]
This reverts commit 558682b5291937a70748d36fd9ba757fb25b99ae.
Chris Wilson reports that it breaks his CPU hotplug test scripts. In
particular, it breaks offlining and then re-onlining the boot CPU, which
we treat specially (and the BIOS does too).
The symptoms are that we can offline the CPU, but it then does not come
back online again:
smpboot: CPU 0 is now offline
smpboot: Booting Node 0 Processor 0 APIC 0x0
smpboot: do_boot_cpu failed(-1) to wakeup CPU#0
Thomas says he knows why it's broken (my personal suspicion: our magic
handling of the "cpu0_logical_apicid" thing), but for 5.3 the right fix
is to just revert it, since we've never touched the LDR bits before, and
it's not worth the risk to do anything else at this stage.
[ Hotpluging of the boot CPU is special anyway, and should be off by
default. See the "BOOTPARAM_HOTPLUG_CPU0" config option and the
cpu0_hotplug kernel parameter.
In general you should not do it, and it has various known limitations
(hibernate and suspend require the boot CPU, for example).
But it should work, even if the boot CPU is special and needs careful
treatment - Linus ]
Link: https://lore.kernel.org/lkml/156785100521.13300.14461504732265570003@skylake-alporthouse-com/
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bandan Das <bsd@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2113c5f62b7423e4a72b890bd479704aa85c81ba ]
If after an MMIO exit to userspace a VCPU is immediately run with an
immediate_exit request, such as when a signal is delivered or an MMIO
emulation completion is needed, then the VCPU completes the MMIO
emulation and immediately returns to userspace. As the exit_reason
does not get changed from KVM_EXIT_MMIO in these cases we have to
be careful not to complete the MMIO emulation again, when the VCPU is
eventually run again, because the emulation does an instruction skip
(and doing too many skips would be a waste of guest code :-) We need
to use additional VCPU state to track if the emulation is complete.
As luck would have it, we already have 'mmio_needed', which even
appears to be used in this way by other architectures already.
Fixes: 0d640732dbeb ("arm64: KVM: Skip MMIO insn after emulation")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1yFlMACgkQONu9yGCS
aT558A//bJDJD/ngcixs+2kRz5puPovxWVB70iPlyAn/0wtMaGU6R1w7Pd9MnQIM
TnvLN2e5HDnddHIsYak+JNuDgpNQAvr865wzWdf+J0WP/zsu5oro+KeFOztQhGB+
V0u6JYnvUuhVnR6ZHNI2997J42YthMJEsltRoXGMDqiWxCv47w+Z6c0iDwRQdI+A
O2neu9mmSRMkwaTTQSt8eY3us4/GuSkoFKiUoQOL2SDG9GAU+BT1Z7JnymWrQ20U
wSxKDwxpi44jQowq4aQofbcjofpvgcHxiM6p1MbbKBqwSBWxbcRpYOHjchry0h+W
PWFXDiXollAeci7nkWF35ASzwKBg3lhe8genhafqUo6HTZzoOdgjZ7NQ5F5nHuHh
M8oFXiZVpdjYziMKIqhPcVgABJFtm2iZmrUe3mXloLG9Qm0af/WtdobTFXA7Gsgh
v8tqU94dQYvvXzfi9G1hU5tRBs+yzdoI0ovyLPKBPP/2xI7qsP484Oq374VkW7Xc
Lm15aU0sivd3VPelDqvyUhmIX6XpBHjD1PK0N5mF777wJWt0+IEy7/GiPDMH3vSs
prgCdYaQFav9Yc6xQViAxSK+9L/k5Wdhtj+Otux/UBPGVwPugHqlVqMIVj7ND1Nz
5VN4B4YqfJhiiwygaDzly9PkvMQ+KOUgn7sBHJZONkL40ivHLDY=
=glFL
-----END PGP SIGNATURE-----
Merge 4.4.191 into android-4.4
Changes in 4.4.191
HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT
MIPS: kernel: only use i8253 clocksource with periodic clockevent
netfilter: ebtables: fix a memory leak bug in compat
bonding: Force slave speed check after link state recovery for 802.3ad
can: dev: call netif_carrier_off() in register_candev()
st21nfca_connectivity_event_received: null check the allocation
st_nci_hci_connectivity_event_received: null check the allocation
ASoC: ti: davinci-mcasp: Correct slot_width posed constraint
net: usb: qmi_wwan: Add the BroadMobi BM818 card
isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in start_isoc_chain()
isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack
perf bench numa: Fix cpu0 binding
can: sja1000: force the string buffer NULL-terminated
can: peak_usb: force the string buffer NULL-terminated
NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
net: hisilicon: make hip04_tx_reclaim non-reentrant
net: hisilicon: fix hip04-xmit never return TX_BUSY
net: hisilicon: Fix dma_map_single failed on arm64
libata: add SG safety checks in SFF pio transfers
selftests: kvm: Adding config fragments
HID: wacom: correct misreported EKR ring values
Revert "dm bufio: fix deadlock with loop device"
userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
x86/apic: Handle missing global clockevent gracefully
x86/boot: Save fields explicitly, zero out everything else
x86/boot: Fix boot regression caused by bootparam sanitizing
dm btree: fix order of block initialization in btree_split_beneath
dm space map metadata: fix missing store of apply_bops() return value
dm table: fix invalid memory accesses with too high sector number
cgroup: Disable IRQs while holding css_set_lock
GFS2: don't set rgrp gl_object until it's inserted into rgrp tree
net: arc_emac: fix koops caused by sk_buff free
vhost-net: set packet weight of tx polling to 2 * vq size
vhost_net: use packet weight for rx handler, too
vhost_net: introduce vhost_exceeds_weight()
vhost: introduce vhost_exceeds_weight()
vhost_net: fix possible infinite loop
vhost: scsi: add weight support
siphash: add cryptographically secure PRF
siphash: implement HalfSipHash1-3 for hash tables
inet: switch IP ID generator to siphash
netfilter: ctnetlink: don't use conntrack/expect object addresses as id
netfilter: conntrack: Use consistent ct id hash calculation
Revert "perf test 6: Fix missing kvm module load for s390"
x86/pm: Introduce quirk framework to save/restore extra MSR registers around suspend/resume
x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm()
dmaengine: ste_dma40: fix unneeded variable warning
usb: gadget: composite: Clear "suspended" on reset/disconnect
usb: host: fotg2: restart hcd after port reset
tools: hv: fix KVP and VSS daemons exit code
watchdog: bcm2835_wdt: Fix module autoload
tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue
ALSA: usb-audio: Fix a stack buffer overflow bug in check_input_term
ALSA: usb-audio: Fix an OOB bug in parse_audio_mixer_unit
tcp: make sure EPOLLOUT wont be missed
ALSA: seq: Fix potential concurrent access to the deleted pool
KVM: x86: Don't update RIP or do single-step on faulting emulation
x86/apic: Do not initialize LDR and DFR for bigsmp
x86/apic: Include the LDR when clearing out APIC registers
usb-storage: Add new JMS567 revision to unusual_devs
USB: cdc-wdm: fix race between write and disconnect due to flag abuse
usb: host: ohci: fix a race condition between shutdown and irq
USB: storage: ums-realtek: Update module parameter description for auto_delink_en
USB: storage: ums-realtek: Whitelist auto-delink support
ptrace,x86: Make user_64bit_mode() available to 32-bit builds
uprobes/x86: Fix detection of 32-bit user mode
mmc: sdhci-of-at91: add quirk for broken HS200
mmc: core: Fix init of SD cards reporting an invalid VDD range
stm class: Fix a double free of stm_source_device
VMCI: Release resource if the work is already queued
Revert "cfg80211: fix processing world regdomain when non modular"
mac80211: fix possible sta leak
x86/ptrace: fix up botched merge of spectrev1 fix
Linux 4.4.191
Change-Id: I9c9b3ec748ba2977b818fd569d1788ed5da295b2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
I incorrectly merged commit 31a2fbb390fe ("x86/ptrace: Fix possible
spectre-v1 in ptrace_get_debugreg()") when backporting it, as was
graciously pointed out at
https://grsecurity.net/teardown_of_a_failed_linux_lts_spectre_fix.php
Resolve the upstream difference with the stable kernel merge to properly
protect things.
Reported-by: Brad Spengler <spender@grsecurity.net>
Cc: Dianzhang Chen <dianzhangchen0@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <bp@alien8.de>
Cc: <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9212ec7d8357ea630031e89d0d399c761421c83b ]
32-bit processes running on a 64-bit kernel are not always detected
correctly, causing the process to crash when uretprobes are installed.
The reason for the crash is that in_ia32_syscall() is used to determine the
process's mode, which only works correctly when called from a syscall.
In the case of uretprobes, however, the function is called from a exception
and always returns 'false' on a 64-bit kernel. In consequence this leads to
corruption of the process's return address.
Fix this by using user_64bit_mode() instead of in_ia32_syscall(), which
is correct in any situation.
[ tglx: Add a comment and the following historical info ]
This should have been detected by the rename which happened in commit
abfb9498ee13 ("x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()")
which states in the changelog:
The is_ia32_task()/is_x32_task() function names are a big misnomer: they
suggests that the compat-ness of a system call is a task property, which
is not true, the compatness of a system call purely depends on how it
was invoked through the system call layer.
.....
and then it went and blindly renamed every call site.
Sadly enough this was already mentioned here:
8faaed1b9f ("uprobes/x86: Introduce sizeof_long(), cleanup adjust_ret_addr() and
arch_uretprobe_hijack_return_addr()")
where the changelog says:
TODO: is_ia32_task() is not what we actually want, TS_COMPAT does
not necessarily mean 32bit. Fortunately syscall-like insns can't be
probed so it actually works, but it would be better to rename and
use is_ia32_frame().
and goes all the way back to:
0326f5a94d ("uprobes/core: Handle breakpoint and singlestep exceptions")
Oh well. 7+ years until someone actually tried a uretprobe on a 32bit
process on a 64bit kernel....
Fixes: 0326f5a94d ("uprobes/core: Handle breakpoint and singlestep exceptions")
Signed-off-by: Sebastian Mayr <me@sam.st>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190728152617.7308-1-me@sam.st
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e27c310af5c05cf876d9cad006928076c27f54d4 ]
In its current form, user_64bit_mode() can only be used when CONFIG_X86_64
is selected. This implies that code built with CONFIG_X86_64=n cannot use
it. If a piece of code needs to be built for both CONFIG_X86_64=y and
CONFIG_X86_64=n and wants to use this function, it needs to wrap it in
an #ifdef/#endif; potentially, in multiple places.
This can be easily avoided with a single #ifdef/#endif pair within
user_64bit_mode() itself.
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-4-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 558682b5291937a70748d36fd9ba757fb25b99ae upstream.
Although APIC initialization will typically clear out the LDR before
setting it, the APIC cleanup code should reset the LDR.
This was discovered with a 32-bit KVM guest jumping into a kdump
kernel. The stale bits in the LDR triggered a bug in the KVM APIC
implementation which caused the destination mapping for VCPUs to be
corrupted.
Note that this isn't intended to paper over the KVM APIC bug. The kernel
has to clear the LDR when resetting the APIC registers except when X2APIC
is enabled.
This lacks a Fixes tag because missing to clear LDR goes way back into pre
git history.
[ tglx: Made x2apic_enabled a function call as required ]
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190826101513.5080-3-bsd@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bae3a8d3308ee69a7dbdf145911b18dfda8ade0d upstream.
Legacy apic init uses bigsmp for smp systems with 8 and more CPUs. The
bigsmp APIC implementation uses physical destination mode, but it
nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with
multiple bit being set.
This does not cause a functional problem because LDR and DFR are ignored
when physical destination mode is active, but it triggered a problem on a
32-bit KVM guest which jumps into a kdump kernel.
The multiple bits set unearthed a bug in the KVM APIC implementation. The
code which creates the logical destination map for VCPUs ignores the
disabled state of the APIC and ends up overwriting an existing valid entry
and as a result, APIC calibration hangs in the guest during kdump
initialization.
Remove the bogus LDR/DFR initialization.
This is not intended to work around the KVM APIC bug. The LDR/DFR
ininitalization is wrong on its own.
The issue goes back into the pre git history. The fixes tag is the commit
in the bitkeeper import which introduced bigsmp support in 2003.
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: db7b9e9f26b8 ("[PATCH] Clustered APIC setup for >8 CPU systems")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190826101513.5080-2-bsd@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75ee23b30dc712d80d2421a9a547e7ab6e379b44 upstream.
Don't advance RIP or inject a single-step #DB if emulation signals a
fault. This logic applies to all state updates that are conditional on
clean retirement of the emulation instruction, e.g. updating RFLAGS was
previously handled by commit 38827dbd3f ("KVM: x86: Do not update
EFLAGS on faulting emulation").
Not advancing RIP is likely a nop, i.e. ctxt->eip isn't updated with
ctxt->_eip until emulation "retires" anyways. Skipping #DB injection
fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to
invalid state with EFLAGS.TF=1 would loop indefinitely due to emulation
overwriting the #UD with #DB and thus restarting the bad SYSCALL over
and over.
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Fixes: 663f4c61b8 ("KVM: x86: handle singlestep during emulation")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24 ]
There have been reports of RDRAND issues after resuming from suspend on
some AMD family 15h and family 16h systems. This issue stems from a BIOS
not performing the proper steps during resume to ensure RDRAND continues
to function properly.
RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be
reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND
support using CPUID, including the kernel, will believe that RDRAND is
not supported.
Update the CPU initialization to clear the RDRAND CPUID bit for any family
15h and 16h processor that supports RDRAND. If it is known that the family
15h or family 16h system does not have an RDRAND resume issue or that the
system will not be placed in suspend, the "rdrand=force" kernel parameter
can be used to stop the clearing of the RDRAND CPUID bit.
Additionally, update the suspend and resume path to save and restore the
MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in
place after resuming from suspend.
Note, that clearing the RDRAND CPUID bit does not prevent a processor
that normally supports the RDRAND instruction from executing it. So any
code that determined the support based on family and model won't #UD.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
[sl: adjust context in docs]
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a9c2dd08eadd5c6943115dbbec040c38d2e0822 ]
A bug was reported that on certain Broadwell platforms, after
resuming from S3, the CPU is running at an anomalously low
speed.
It turns out that the BIOS has modified the value of the
THERM_CONTROL register during S3, and changed it from 0 to 0x10,
thus enabled clock modulation(bit4), but with undefined CPU Duty
Cycle(bit1:3) - which causes the problem.
Here is a simple scenario to reproduce the issue:
1. Boot up the system
2. Get MSR 0x19a, it should be 0
3. Put the system into sleep, then wake it up
4. Get MSR 0x19a, it shows 0x10, while it should be 0
Although some BIOSen want to change the CPU Duty Cycle during
S3, in our case we don't want the BIOS to do any modification.
Fix this issue by introducing a more generic x86 framework to
save/restore specified MSR registers(THERM_CONTROL in this case)
for suspend/resume. This allows us to fix similar bugs in a much
simpler way in the future.
When the kernel wants to protect certain MSRs during suspending,
we simply add a quirk entry in msr_save_dmi_table, and customize
the MSR registers inside the quirk callback, for example:
u32 msr_id_need_to_save[] = {MSR_ID0, MSR_ID1, MSR_ID2...};
and the quirk mechanism ensures that, once resumed from suspend,
the MSRs indicated by these IDs will be restored to their
original, pre-suspend values.
Since both 64-bit and 32-bit kernels are affected, this patch
covers the common 64/32-bit suspend/resume code path. And
because the MSRs specified by the user might not be available or
readable in any situation, we use rdmsrl_safe() to safely save
these MSRs.
Reported-and-tested-by: Marcin Kaszewski <marcin.kaszewski@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: len.brown@intel.com
Cc: linux@horizon.com
Cc: luto@kernel.org
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/c9abdcbc173dd2f57e8990e304376f19287e92ba.1448382971.git.yu.c.chen@intel.com
[ More edits to the naming of data structures. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7846f58fba964af7cb8cf77d4d13c33254725211 upstream.
commit a90118c445cc ("x86/boot: Save fields explicitly, zero out everything
else") had two errors:
* It preserved boot_params.acpi_rsdp_addr, and
* It failed to preserve boot_params.hdr
Therefore, zero out acpi_rsdp_addr, and preserve hdr.
Fixes: a90118c445cc ("x86/boot: Save fields explicitly, zero out everything else")
Reported-by: Neil MacLeod <neil@nmacleod.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neil MacLeod <neil@nmacleod.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190821192513.20126-1-jhubbard@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a90118c445cc7f07781de26a9684d4ec58bfcfd1 upstream.
Recent gcc compilers (gcc 9.1) generate warnings about an out of bounds
memset, if the memset goes accross several fields of a struct. This
generated a couple of warnings on x86_64 builds in sanitize_boot_params().
Fix this by explicitly saving the fields in struct boot_params
that are intended to be preserved, and zeroing all the rest.
[ tglx: Tagged for stable as it breaks the warning free build there as well ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190731054627.5627-2-jhubbard@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f897e60a12f0b9146357780d317879bce2a877dc upstream.
Some newer machines do not advertise legacy timers. The kernel can handle
that situation if the TSC and the CPU frequency are enumerated by CPUID or
MSRs and the CPU supports TSC deadline timer. If the CPU does not support
TSC deadline timer the local APIC timer frequency has to be known as well.
Some Ryzens machines do not advertize legacy timers, but there is no
reliable way to determine the bus frequency which feeds the local APIC
timer when the machine allows overclocking of that frequency.
As there is no legacy timer the local APIC timer calibration crashes due to
a NULL pointer dereference when accessing the not installed global clock
event device.
Switch the calibration loop to a non interrupt based one, which polls
either TSC (if frequency is known) or jiffies. The latter requires a global
clockevent. As the machines which do not have a global clockevent installed
have a known TSC frequency this is a non issue. For older machines where
TSC frequency is not known, there is no known case where the legacy timers
do not exist as that would have been reported long ago.
Reported-by: Daniel Drake <drake@endlessm.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908091443030.21433@nanos.tec.linutronix.de
Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b63f20a778c88b6a04458ed6ffc69da953d3a109 upstream.
Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to
avoid clobbering flags.
KVM's emulator makes indirect calls into a jump table of sorts, where
the destination of the CALL_NOSPEC is a small blob of code that performs
fast emulation by executing the target instruction with fixed operands.
adcb_al_dl:
0x000339f8 <+0>: adc %dl,%al
0x000339fa <+2>: ret
A major motiviation for doing fast emulation is to leverage the CPU to
handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is
both an input and output to the target of CALL_NOSPEC. Clobbering flags
results in all sorts of incorrect emulation, e.g. Jcc instructions often
take the wrong path. Sans the nops...
asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
0x0003595a <+58>: mov 0xc0(%ebx),%eax
0x00035960 <+64>: mov 0x60(%ebx),%edx
0x00035963 <+67>: mov 0x90(%ebx),%ecx
0x00035969 <+73>: push %edi
0x0003596a <+74>: popf
0x0003596b <+75>: call *%esi
0x000359a0 <+128>: pushf
0x000359a1 <+129>: pop %edi
0x000359a2 <+130>: mov %eax,0xc0(%ebx)
0x000359b1 <+145>: mov %edx,0x60(%ebx)
ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
0x000359a8 <+136>: mov -0x10(%ebp),%eax
0x000359ab <+139>: and $0x8d5,%edi
0x000359b4 <+148>: and $0xfffff72a,%eax
0x000359b9 <+153>: or %eax,%edi
0x000359bd <+157>: mov %edi,0x4(%ebx)
For the most part this has gone unnoticed as emulation of guest code
that can trigger fast emulation is effectively limited to MMIO when
running on modern hardware, and MMIO is rarely, if ever, accessed by
instructions that affect or consume flags.
Breakage is almost instantaneous when running with unrestricted guest
disabled, in which case KVM must emulate all instructions when the guest
has invalid state, e.g. when the guest is in Big Real Mode during early
BIOS.
Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support")
Fixes: 1a29b5b7f347a ("KVM: x86: Make indirect calls in emulator speculation safe")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a07e3324538a989b7cdbf2c679be6a7f9df2544f ]
i8253 clocksource needs a free running timer. This could only
be used, if i8253 clockevent is set up as periodic.
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
This represents a rollup of a series of reverts, simplified are
modifications to remove fiq_glue and fiq_debugger references in:
arch/arm/common/Kconfig
arch/arm/common/Makefile
drivers/staging/android/Kconfig
drivers/staging/android/Makefile
And deletion of:
arch/arm/common/fiq_glue.S
arch/arm/common/fiq_glue_setup.c
drivers/staging/android/fiq_debugger/
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 32402555
Bug: 36101220
Change-Id: I3f74b1ff5e4971d619bcb37a911fed68fbb538d5
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1iTHIACgkQONu9yGCS
aT6r1RAAizAiIopvXWZ6Z7BAWj0MyalChPY+DhkGgS9egBs9TIqRJ4QCffTeIbHC
2PYXf//eoWRP1fT7AesMwD1s4lSaT0BuEoFwZn3bwFhR4Xf9HzVEl8bCXIgj4bq2
WCjE8u/W8ALFZOZ5yJJjRtVOEpTt512u5OzkaF3h+iXh4/g3UYHh1QPNSHA9l7fx
UbX+PsI9jYl3Ge4zqMIcuzPkgCIuF+g+EirzNRKijfYxOfoVgad83UXPFAMI2isF
+ftDAbIR+Tc7sBug/30ATdhQjFWDfM9Gzz0rBl9Pw1SpCH+h33e2cEMzLJC42DI2
mLpTABI7TMt+tygNAxceHCunPma80z22oobxgkGoZJRKH7MfQg/FD9N05bR+8C11
AlOcix4p1oaWRJssv4myrLjJq4Yt5Ura+/MvWSp1FLhodUbA+F7lfB1L1nW2LtXu
/edinaBKNMYUVxAmkJOWm3HT79OsonzbC6KqPDLsTTEfYISS6S5i99WnrLdjNoim
ozXkPUG3ymT9oRcgndRwEDGRGe0lAhI5SwdXujvxf1J26f90r8e8sHnAiDeO63ne
H+Uxd76b6BxAVpOnluKMTouHI+nxhRyvnp0rUYC523rOtFo9mYluWzte5BOCQ/Ss
hFgUKSe1F9+Vzm1bIdt1LicrooKGIeoYOqhk8A3jTum67jnUaQA=
=Kv4T
-----END PGP SIGNATURE-----
Merge 4.4.190 into android-4.4
Changes in 4.4.190
usb: iowarrior: fix deadlock on disconnect
sound: fix a memory leak bug
x86/mm: Check for pfn instead of page in vmalloc_sync_one()
x86/mm: Sync also unmappings in vmalloc_sync_all()
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
perf db-export: Fix thread__exec_comm()
usb: yurex: Fix use-after-free in yurex_delete
can: peak_usb: fix potential double kfree_skb()
netfilter: nfnetlink: avoid deadlock due to synchronous request_module
iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND
mac80211: don't warn about CW params when not using them
hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
s390/qdio: add sanity checks to the fast-requeue path
ALSA: compress: Fix regression on compressed capture streams
ALSA: compress: Prevent bypasses of set_params
ALSA: compress: Be more restrictive about when a drain is allowed
perf probe: Avoid calling freeing routine multiple times for same pointer
ARM: davinci: fix sleep.S build error on ARMv4
scsi: megaraid_sas: fix panic on loading firmware crashdump
scsi: ibmvfc: fix WARN_ON during event pool release
tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
perf/core: Fix creating kernel counters for PMUs that override event->cpu
can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices
can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices
hwmon: (nct7802) Fix wrong detection of in4 presence
ALSA: firewire: fix a memory leak bug
mac80211: don't WARN on short WMM parameters from AP
SMB3: Fix deadlock in validate negotiate hits reconnect
smb3: send CAP_DFS capability during session setup
mwifiex: fix 802.11n/WPA detection
scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
sh: kernel: hw_breakpoint: Fix missing break in switch statement
usb: gadget: f_midi: fail if set_alt fails to allocate requests
USB: gadget: f_midi: fixing a possible double-free in f_midi
mm/memcontrol.c: fix use after free in mem_cgroup_iter()
ALSA: hda - Fix a memory leak bug
HID: holtek: test for sanity of intfdata
HID: hiddev: avoid opening a disconnected device
HID: hiddev: do cleanup in failure of opening a device
Input: kbtab - sanity check for endpoint type
Input: iforce - add sanity checks
net: usb: pegasus: fix improper read if get_registers() fail
xen/pciback: remove set but not used variable 'old_state'
irqchip/irq-imx-gpcv2: Forward irq type to parent
perf header: Fix divide by zero error if f_header.attr_size==0
perf header: Fix use of unitialized value warning
libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
scsi: hpsa: correct scsi command status issue after reset
ata: libahci: do not complain in case of deferred probe
kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
IB/core: Add mitigation for Spectre V1
ocfs2: remove set but not used variable 'last_hash'
asm-generic: fix -Wtype-limits compiler warnings
staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
staging: comedi: dt3000: Fix rounding up of timer divisor
USB: core: Fix races in character device registration and deregistraion
usb: cdc-acm: make sure a refcount is taken early enough
USB: serial: option: add D-Link DWM-222 device ID
USB: serial: option: Add support for ZTE MF871A
USB: serial: option: add the BroadMobi BM818 card
USB: serial: option: Add Motorola modem UARTs
Backport minimal compiler_attributes.h to support GCC 9
include/linux/module.h: copy __init/__exit attrs to init/cleanup_module
arm64: compat: Allow single-byte watchpoints on all addresses
Input: psmouse - fix build error of multiple definition
asm-generic: default BUG_ON(x) to if(x)BUG()
scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure
RDMA: Directly cast the sockaddr union to sockaddr
IB/mlx5: Make coding style more consistent
x86/vdso: Remove direct HPET access through the vDSO
iommu/amd: Move iommu_init_pci() to .init section
x86/boot: Disable the address-of-packed-member compiler warning
net/packet: fix race in tpacket_snd()
xen/netback: Reset nr_frags before freeing skb
net/mlx5e: Only support tx/rx pause setting for port owner
sctp: fix the transport error_count check
bonding: Add vlan tx offload to hw_enc_features
Linux 4.4.190
Change-Id: I2af7fee66e6ce77c41266cec8cfa7b7c4a78a05c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 20c6c189045539d29f4854d92b7ea9c329e1edfc upstream.
The clang warning 'address-of-packed-member' is disabled for the general
kernel code, also disable it for the x86 boot code.
This suppresses a bunch of warnings like this when building with clang:
./arch/x86/include/asm/processor.h:535:30: warning: taking address of
packed member 'sp0' of class or structure 'x86_hw_tss' may result in an
unaligned pointer value [-Waddress-of-packed-member]
return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
^~~~~~~~~~~~~~~~~~~
./arch/x86/include/asm/percpu.h:391:59: note: expanded from macro
'this_cpu_read_stable'
#define this_cpu_read_stable(var) percpu_stable_op("mov", var)
^~~
./arch/x86/include/asm/percpu.h:228:16: note: expanded from macro
'percpu_stable_op'
: "p" (&(var)));
^~~
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170725215053.135586-1-mka@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ed95e52d902035e39a715ff3a314a893a96e5b7 upstream.
Allowing user code to map the HPET is problematic. HPET
implementations are notoriously buggy, and there are probably many
machines on which even MMIO reads from bogus HPET addresses are
problematic.
We have a report that the Dell Precision M2800 with:
ACPI: HPET 0x00000000C8FE6238 000038 (v01 DELL CBX3 01072009 AMI. 00000005)
is either so slow when accessing the HPET or actually hangs in some
regard, causing soft lockups to be reported if users do unexpected
things to the HPET.
The vclock HPET code has also always been a questionable speedup.
Accessing an HPET is exceedingly slow (on the order of several
microseconds), so the added overhead in requiring a syscall to read
the HPET is a small fraction of the total code of accessing it.
To avoid future problems, let's just delete the code entirely.
In the long run, this could actually be a speedup. Waiman Long as a
patch to optimize the case where multiple CPUs contend for the HPET,
but that won't help unless all the accesses are mediated by the
kernel.
Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: Waiman Long <waiman.long@hpe.com>
Link: http://lkml.kernel.org/r/d2f90bba98db9905041cff294646d290d378f67a.1460074438.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 849adec41203ac5837c40c2d7e08490ffdef3c2c upstream.
Commit d968d2b801 ("ARM: 7497/1: hw_breakpoint: allow single-byte
watchpoints on all addresses") changed the validation requirements for
hardware watchpoints on arch/arm/. Update our compat layer to implement
the same relaxation.
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ee1119d184bb06af921b48c3021d921bbd85bac upstream.
Add missing break statement in order to prevent the code from falling
through to case SH_BREAKPOINT_WRITE.
Fixes: 09a0729477 ("sh: hw-breakpoints: Add preliminary support for SH-4A UBC.")
Cc: stable@vger.kernel.org
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d64b212ea960db4276a1d8372bd98cb861dfcbb0 ]
When building a multiplatform kernel that includes armv4 support,
the default target CPU does not support the blx instruction,
which leads to a build failure:
arch/arm/mach-davinci/sleep.S: Assembler messages:
arch/arm/mach-davinci/sleep.S:56: Error: selected processor does not support `blx ip' in ARM mode
Add a .arch statement in the sources to make this file build.
Link: https://lore.kernel.org/r/20190722145211.1154785-1-arnd@arndb.de
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8e998fc24de47c55b47a887f6c95ab91acd4a720 upstream.
With huge-page ioremap areas the unmappings also need to be synced between
all page-tables. Otherwise it can cause data corruption when a region is
unmapped and later re-used.
Make the vmalloc_sync_one() function ready to sync unmappings and make sure
vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD
is found.
Fixes: 5d72b4fba4 ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 51b75b5b563a2637f9d8dc5bd02a31b2ff9e5ea0 upstream.
Do not require a struct page for the mapped memory location because it
might not exist. This can happen when an ioremapped region is mapped with
2MB pages.
Fixes: 5d72b4fba4 ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f36cf386e3fec258a341d446915862eded3e13d8 upstream.
Intel provided the following information:
On all current Atom processors, instructions that use a segment register
value (e.g. a load or store) will not speculatively execute before the
last writer of that segment retires. Thus they will not use a
speculatively written segment value.
That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
entry paths can be excluded from the extra LFENCE if PTI is disabled.
Create a separate bug flag for the through SWAPGS speculation and mark all
out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
are excluded from the whole mitigation mess anyway.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
[bwh: Backported to 4.4:
- There's no whitelist entry (or any support) for Hygon CPUs
- Adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64dbc122b20f75183d8822618c24f85144a5a94d upstream.
Somehow the swapgs mitigation entry code patch ended up with a JMPQ
instruction instead of JMP, where only the short jump is needed. Some
assembler versions apparently fail to optimize JMPQ into a two-byte JMP
when possible, instead always using a 7-byte JMP with relocation. For
some reason that makes the entry code explode with a #GP during boot.
Change it back to "JMP" as originally intended.
Fixes: 18ec54fdd6d1 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 4.4: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream.
The previous commit added macro calls in the entry code which mitigate the
Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
enabled. Enable those features where applicable.
The mitigations may be disabled with "nospectre_v1" or "mitigations=off".
There are different features which can affect the risk of attack:
- When FSGSBASE is enabled, unprivileged users are able to place any
value in GS, using the wrgsbase instruction. This means they can
write a GS value which points to any value in kernel space, which can
be useful with the following gadget in an interrupt/exception/NMI
handler:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
// dependent load or store based on the value of %reg
// for example: mov %(reg1), %reg2
If an interrupt is coming from user space, and the entry code
speculatively skips the swapgs (due to user branch mistraining), it
may speculatively execute the GS-based load and a subsequent dependent
load or store, exposing the kernel data to an L1 side channel leak.
Note that, on Intel, a similar attack exists in the above gadget when
coming from kernel space, if the swapgs gets speculatively executed to
switch back to the user GS. On AMD, this variant isn't possible
because swapgs is serializing with respect to future GS-based
accesses.
NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
doesn't exist quite yet.
- When FSGSBASE is disabled, the issue is mitigated somewhat because
unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
restricts GS values to user space addresses only. That means the
gadget would need an additional step, since the target kernel address
needs to be read from user space first. Something like:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
mov (%reg1), %reg2
// dependent load or store based on the value of %reg2
// for example: mov %(reg2), %reg3
It's difficult to audit for this gadget in all the handlers, so while
there are no known instances of it, it's entirely possible that it
exists somewhere (or could be introduced in the future). Without
tooling to analyze all such code paths, consider it vulnerable.
Effects of SMAP on the !FSGSBASE case:
- If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
susceptible to Meltdown), the kernel is prevented from speculatively
reading user space memory, even L1 cached values. This effectively
disables the !FSGSBASE attack vector.
- If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
still prevents the kernel from speculatively reading user space
memory. But it does *not* prevent the kernel from reading the
user value from L1, if it has already been cached. This is probably
only a small hurdle for an attacker to overcome.
Thanks to Dave Hansen for contributing the speculative_smap() function.
Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
is serializing on AMD.
[ tglx: Fixed the USER fence decision and polished the comment as suggested
by Dave Hansen ]
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
[bwh: Backported to 4.4:
- Check for X86_FEATURE_KAISER instead of X86_FEATURE_PTI
- mitigations= parameter is x86-only here
- Don't use __ro_after_init
- Adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18ec54fdd6d18d92025af097cd042a75cf0ea24c upstream.
Spectre v1 isn't only about array bounds checks. It can affect any
conditional checks. The kernel entry code interrupt, exception, and NMI
handlers all have conditional swapgs checks. Those may be problematic in
the context of Spectre v1, as kernel code can speculatively run with a user
GS.
For example:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg
mov (%reg), %reg1
When coming from user space, the CPU can speculatively skip the swapgs, and
then do a speculative percpu load using the user GS value. So the user can
speculatively force a read of any kernel value. If a gadget exists which
uses the percpu value as an address in another load/store, then the
contents of the kernel value may become visible via an L1 side channel
attack.
A similar attack exists when coming from kernel space. The CPU can
speculatively do the swapgs, causing the user GS to get used for the rest
of the speculative window.
The mitigation is similar to a traditional Spectre v1 mitigation, except:
a) index masking isn't possible; because the index (percpu offset)
isn't user-controlled; and
b) an lfence is needed in both the "from user" swapgs path and the
"from kernel" non-swapgs path (because of the two attacks described
above).
The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
CR3 write when PTI is enabled. Since CR3 writes are serializing, the
lfences can be skipped in those cases.
On the other hand, the kernel entry swapgs paths don't depend on PTI.
To avoid unnecessary lfences for the user entry case, create two separate
features for alternative patching:
X86_FEATURE_FENCE_SWAPGS_USER
X86_FEATURE_FENCE_SWAPGS_KERNEL
Use these features in entry code to patch in lfences where needed.
The features aren't enabled yet, so there's no functional change.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
[bwh: Backported to 4.4:
- Assign the CPU feature bits from word 7
- Add FENCE_SWAPGS_KERNEL_ENTRY to NMI entry, since it does not
use paranoid_entry
- Include <asm/cpufeatures.h> in calling.h
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fa5f04f85730d0c4f49f984b7efeb4f8d5bd1fc upstream.
This warning:
WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50
CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13
Call Trace:
dump_stack+0x99/0xd0
__warn+0xd1/0xf0
warn_slowpath_null+0x1d/0x20
enter_from_user_mode+0x32/0x50
error_entry+0x6d/0xc0
? general_protection+0x12/0x30
? native_load_gs_index+0xd/0x20
? do_set_thread_area+0x19c/0x1f0
SyS_set_thread_area+0x24/0x30
do_int80_syscall_32+0x7c/0x220
entry_INT80_compat+0x38/0x50
... can be reproduced by running the GS testcase of the ldt_gdt test unit in
the x86 selftests.
do_int80_syscall_32() will call enter_form_user_mode() to convert context
tracking state from user state to kernel state. The load_gs_index() call
can fail with user gsbase, gsbase will be fixed up and proceed if this
happen.
However, enter_from_user_mode() will be called again in the fixed up path
though it is context tracking kernel state currently.
This patch fixes it by just fixing up gsbase and telling lockdep that IRQs
are off once load_gs_index() failed with user gsbase.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1475197266-3440-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This will make it clearer which bits are allocated, in case we need to
assign more feature bits for later backports.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 147b9635e6347104b91f48ca9dca61eb0fbf2a54 upstream.
If CTR_EL0.{CWG,ERG} are 0b0000 then they must be interpreted to have
their architecturally maximum values, which defeats the use of
FTR_HIGHER_SAFE when sanitising CPU ID registers on heterogeneous
machines.
Introduce FTR_HIGHER_OR_ZERO_SAFE so that these fields effectively
saturate at zero.
Fixes: 3c739b5710 ("arm64: Keep track of CPU feature registers")
Cc: <stable@vger.kernel.org> # 4.4.y only
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit be68a8aaf925aaf35574260bf820bb09d2f9e07f upstream.
Our field definitions for CTR_EL0 suffer from a number of problems:
- The IDC and DIC fields are missing, which causes us to enable CTR
trapping on CPUs with either of these returning non-zero values.
- The ERG is FTR_LOWER_SAFE, whereas it should be treated like CWG as
FTR_HIGHER_SAFE so that applications can use it to avoid false sharing.
- [nit] A RES1 field is described as "RAO"
This patch updates the CTR_EL0 field definitions to fix these issues.
Cc: <stable@vger.kernel.org> # 4.4.y only
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3901336ed9887b075531bffaeef7742ba614058b ]
After making a change to improve objtool's sibling call detection, it
started showing the following warning:
arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame
The problem is the ____kvm_handle_fault_on_reboot() macro. It does a
fake call by pushing a fake RIP and doing a jump. That tricks the
unwinder into printing the function which triggered the exception,
rather than the .fixup code.
Instead of the hack to make it look like the original function made the
call, just change the macro so that the original function actually does
make the call. This allows removal of the hack, and also makes objtool
happy.
I triggered a vmx instruction exception and verified that the stack
trace is still sane:
kernel BUG at arch/x86/kvm/x86.c:358!
invalid opcode: 0000 [#1] SMP PTI
CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16
Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017
RIP: 0010:kvm_spurious_fault+0x5/0x10
Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41
RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246
RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000
RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000
R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0
R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000
FS: 00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
loaded_vmcs_init+0x4f/0xe0
alloc_loaded_vmcs+0x38/0xd0
vmx_create_vcpu+0xf7/0x600
kvm_vm_ioctl+0x5e9/0x980
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? free_one_page+0x13f/0x4e0
do_vfs_ioctl+0xa4/0x630
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x55/0x1c0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fa349b1ee5b
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.1563413318.git.jpoimboe@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 29e7e9664aec17b94a9c8c5a75f8d216a206aa3a ]
clang warns about a few parts of the math-emu implementation
where a 16-bit integer becomes negative during assignment:
arch/x86/math-emu/poly_tan.c:88:35: error: implicit conversion from 'int' to 'short' changes value from 49216 to -16320 [-Werror,-Wconstant-conversion]
(0x41 + EXTENDED_Ebias) | SIGN_Negative);
~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
arch/x86/math-emu/fpu_emu.h:180:58: note: expanded from macro 'setexponent16'
#define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); }
~ ^
arch/x86/math-emu/reg_constant.c:37:32: error: implicit conversion from 'int' to 'short' changes value from 49085 to -16451 [-Werror,-Wconstant-conversion]
FPU_REG const CONST_PI2extra = MAKE_REG(NEG, -66,
^~~~~~~~~~~~~~~~~~
arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG'
((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) }
~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/math-emu/reg_constant.c:48:28: error: implicit conversion from 'int' to 'short' changes value from 65535 to -1 [-Werror,-Wconstant-conversion]
FPU_REG const CONST_QNaN = MAKE_REG(NEG, EXP_OVER, 0x00000000, 0xC0000000);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG'
((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) }
~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
The code is correct as is, so add a typecast to shut up the warnings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190712090816.350668-1-arnd@arndb.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec6335586953b0df32f83ef696002063090c7aef ]
There are many compiler warnings like this,
In file included from ./arch/x86/include/asm/smp.h:13,
from ./arch/x86/include/asm/mmzone_64.h:11,
from ./arch/x86/include/asm/mmzone.h:5,
from ./include/linux/mmzone.h:969,
from ./include/linux/gfp.h:6,
from ./include/linux/mm.h:10,
from arch/x86/kernel/apic/io_apic.c:34:
arch/x86/kernel/apic/io_apic.c: In function 'check_timer':
./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
expression >= 0 is always true [-Wtype-limits]
if ((v) <= apic_verbosity) \
^~
arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro
'apic_printk'
apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X "
^~~~~~~~~~~
./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned
expression >= 0 is always true [-Wtype-limits]
if ((v) <= apic_verbosity) \
^~
arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro
'apic_printk'
apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: "
^~~~~~~~~~~
APIC_QUIET is 0, so silence them by making apic_verbosity type int.
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pw
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba1bc0fcdeaf3bf583c1517bd2e3e29cf223c969 ]
The modification of EXIN register doesn't clean the bitfield before
the writing of a new value. After a few modifications the bitfield would
accumulate only '1's.
Signed-off-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: hauke@hauke-m.de
Cc: john@phrozen.org
Cc: linux-mips@vger.kernel.org
Cc: openwrt-devel@lists.openwrt.org
Cc: pakahmar@hotmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ef1ba39a9fa53d2205e633bc9b21840a275908e ]
This is similar to commit e6186820a745 ("arm64: dts: rockchip: Arch
counter doesn't tick in system suspend"). Specifically on the rk3288
it can be seen that the timer stops ticking in suspend if we end up
running through the "osc_disable" path in rk3288_slp_mode_set(). In
that path the 24 MHz clock will turn off and the timer stops.
To test this, I ran this on a Chrome OS filesystem:
before=$(date); \
suspend_stress_test -c1 --suspend_min=30 --suspend_max=31; \
echo ${before}; date
...and I found that unless I plug in a device that requests USB wakeup
to be active that the two calls to "date" would show that fewer than
30 seconds passed.
NOTE: deep suspend (where the 24 MHz clock gets disabled) isn't
supported yet on upstream Linux so this was tested on a downstream
kernel.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ffd9a1ba9fdb7f2bd1d1ad9b9243d34e96756ba2 ]
DMA got broken a while back in two different ways:
1) a change in the behaviour of disable_irq() to wait for the interrupt
to finish executing causes us to deadlock at the end of DMA.
2) a change to avoid modifying the scatterlist left the first transfer
uninitialised.
DMA is only used with expansion cards, so has gone unnoticed.
Fixes: fa4e998999 ("[ARM] dma: RiscPC: don't modify DMA SG entries")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1GiqYACgkQONu9yGCS
aT5poQ//XNZuSNH5NeE8y37z/7EC5cnx5QOdgpVEz/RZF6Al7DzM0SK/oWiMJR9O
+gJOoHEwlW/GmVw5O/yOll6ChnAlXfbGnZy9TlXkVUVIa9qU3xVrSFnh4lM1xiZy
crEaIQ9ow6tfQHnq/DcODvfyEdZgaiW0xTBTB/ZBEKmN9//rBphTuZlFvAKX7bv5
JBflHDCGl/1zO09xqR9jgWcrCW//a2Ip/O2D61IW1l3oqp7eVGDZMBHMbac45zQ0
4tpD/ppzv8ak3+HTknIujuZSMlMkCJ6FYBlTqpp44e/qQ8ZvQ2s0OdP3iHwlC5HA
E60F2ynewg1JJ6RnhmnTn2g4C1MEvL7QMroo3fo1TujpHYLJBpLiQpggXnweTfYN
eR+Ux1i38SyyqhYSMncp42vttsIXnYTpAGzZi0gLOenVj9MnrNjQueBI4o5PmJwF
CcYP8SIaadSZhBPv/FDo0mKFdepb10g1PBi/0Dk+tqJuxSDbqc+cD5BywkJh67T5
y+3LBVOIZCYA6WY8v7J65x9gNZI50RGKcoX0YWsbEKhBjnfCmW0B0qB17HwWpPWz
UvSIGY7Vj7ufhCMSgzuqOPSVKQ5gL36BsJOZPyrnqz2GdMebSpKRMPEGsNSdPvnl
8M8GuZFotgKmW7m2aU5nr8+Mwh82zXir9He1aShxd172caefGIk=
=ml+6
-----END PGP SIGNATURE-----
Merge 4.4.187 into android-4.4
Changes in 4.4.187
MIPS: ath79: fix ar933x uart parity mode
MIPS: fix build on non-linux hosts
dmaengine: imx-sdma: fix use-after-free on probe error path
ath10k: Do not send probe response template for mesh
ath9k: Check for errors when reading SREV register
ath6kl: add some bounds checking
ath: DFS JP domain W56 fixed pulse type 3 RADAR detection
batman-adv: fix for leaked TVLV handler.
media: dvb: usb: fix use after free in dvb_usb_device_exit
crypto: talitos - fix skcipher failure due to wrong output IV
media: marvell-ccic: fix DMA s/g desc number calculation
media: vpss: fix a potential NULL pointer dereference
net: stmmac: dwmac1000: Clear unused address entries
signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig
af_key: fix leaks in key_pol_get_resp and dump_sp.
xfrm: Fix xfrm sel prefix length validation
media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails.
net: phy: Check against net_device being NULL
tua6100: Avoid build warnings.
locking/lockdep: Fix merging of hlocks with non-zero references
media: wl128x: Fix some error handling in fm_v4l2_init_video_device()
cpupower : frequency-set -r option misses the last cpu in related cpu list
net: fec: Do not use netdev messages too early
net: axienet: Fix race condition causing TX hang
s390/qdio: handle PENDING state for QEBSM devices
perf test 6: Fix missing kvm module load for s390
gpio: omap: fix lack of irqstatus_raw0 for OMAP4
gpio: omap: ensure irq is enabled before wakeup
regmap: fix bulk writes on paged registers
bpf: silence warning messages in core
rcu: Force inlining of rcu_read_lock()
xfrm: fix sa selector validation
perf evsel: Make perf_evsel__name() accept a NULL argument
vhost_net: disable zerocopy by default
EDAC/sysfs: Fix memory leak when creating a csrow object
media: i2c: fix warning same module names
ntp: Limit TAI-UTC offset
timer_list: Guard procfs specific code
acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
media: coda: fix mpeg2 sequence number handling
media: coda: increment sequence offset for the last returned frame
mt7601u: do not schedule rx_tasklet when the device has been disconnected
x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c
mt7601u: fix possible memory leak when the device is disconnected
ath10k: fix PCIE device wake up failed
rslib: Fix decoding of shortened codes
rslib: Fix handling of of caller provided syndrome
ixgbe: Check DDM existence in transceiver before access
EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec
bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush()
Bluetooth: hci_bcsp: Fix memory leak in rx_skb
Bluetooth: 6lowpan: search for destination address in all peers
Bluetooth: Check state in l2cap_disconnect_rsp
Bluetooth: validate BLE connection interval updates
crypto: ghash - fix unaligned memory access in ghash_setkey()
crypto: arm64/sha1-ce - correct digest for empty data in finup
crypto: arm64/sha2-ce - correct digest for empty data in finup
Input: gtco - bounds check collection indent level
regulator: s2mps11: Fix buck7 and buck8 wrong voltages
tracing/snapshot: Resize spare buffer if size changed
NFSv4: Handle the special Linux file open access mode
lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
ALSA: seq: Break too long mutex context in the write loop
media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom()
media: coda: Remove unbalanced and unneeded mutex unlock
KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
drm/nouveau/i2c: Enable i2c pads & busses during preinit
padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
9p/virtio: Add cleanup path in p9_virtio_init
PCI: Do not poll for PME if the device is in D3cold
take floppy compat ioctls to sodding floppy.c
floppy: fix div-by-zero in setup_format_params
floppy: fix out-of-bounds read in next_valid_format
floppy: fix invalid pointer dereference in drive_name
floppy: fix out-of-bounds read in copy_buffer
coda: pass the host file in vma->vm_file on mmap
gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM
parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1
powerpc/32s: fix suspend/resume when IBATs 4-7 are used
powerpc/watchpoint: Restore NV GPRs while returning from exception
eCryptfs: fix a couple type promotion bugs
intel_th: msu: Fix single mode with disabled IOMMU
Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug
usb: Handle USB3 remote wakeup for LPM enabled devices correctly
dm bufio: fix deadlock with loop device
bnx2x: Prevent load reordering in tx completion processing
caif-hsi: fix possible deadlock in cfhsi_exit_module()
ipv4: don't set IPv6 only flags to IPv4 addresses
net: bcmgenet: use promisc for unsupported filters
net: neigh: fix multiple neigh timer scheduling
nfc: fix potential illegal memory access
sky2: Disable MSI on ASUS P6T
netrom: fix a memory leak in nr_rx_frame()
netrom: hold sock when setting skb->destructor
tcp: Reset bytes_acked and bytes_received when disconnecting
bonding: validate ip header before check IPPROTO_IGMP
net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling
net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query
net: bridge: stp: don't cache eth dest pointer before skb pull
elevator: fix truncation of icq_cache_name
NFSv4: Fix open create exclusive when the server reboots
nfsd: increase DRC cache limit
nfsd: give out fewer session slots as limit approaches
nfsd: fix performance-limiting session calculation
nfsd: Fix overflow causing non-working mounts on 1 TB machines
drm/panel: simple: Fix panel_simple_dsi_probe
usb: core: hub: Disable hub-initiated U1/U2
tty: max310x: Fix invalid baudrate divisors calculator
pinctrl: rockchip: fix leaked of_node references
tty: serial: cpm_uart - fix init when SMC is relocated
memstick: Fix error cleanup path of memstick_init
tty/serial: digicolor: Fix digicolor-usart already registered warning
tty: serial: msm_serial: avoid system lockup condition
drm/virtio: Add memory barriers for capset cache.
phy: renesas: rcar-gen2: Fix memory leak at error paths
usb: gadget: Zero ffs_io_data
powerpc/pci/of: Fix OF flags parsing for 64bit BARs
PCI: sysfs: Ignore lockdep for remove attribute
iio: iio-utils: Fix possible incorrect mask calculation
recordmcount: Fix spurious mcount entries on powerpc
mfd: core: Set fwnode for created devices
mfd: arizona: Fix undefined behavior
um: Silence lockdep complaint about mmap_sem
powerpc/4xx/uic: clear pending interrupt after irq type/pol change
serial: sh-sci: Fix TX DMA buffer flushing and workqueue races
kallsyms: exclude kasan local symbols on s390
perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning
f2fs: avoid out-of-range memory access
mailbox: handle failed named mailbox channel request
powerpc/eeh: Handle hugepages in ioremap space
sh: prevent warnings when using iounmap
mm/kmemleak.c: fix check for softirq context
9p: pass the correct prototype to read_cache_page
mm/mmu_notifier: use hlist_add_head_rcu()
locking/lockdep: Fix lock used or unused stats error
locking/lockdep: Hide unused 'class' variable
usb: wusbcore: fix unbalanced get/put cluster_id
usb: pci-quirks: Correct AMD PLL quirk detection
x86/sysfb_efi: Add quirks for some devices with swapped width and height
x86/speculation/mds: Apply more accurate check on hypervisor platform
hpet: Fix division by zero in hpet_time_div()
ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1
ALSA: hda - Add a conexant codec entry to let mute led work
powerpc/tm: Fix oops on sigreturn on systems without TM
access: avoid the RCU grace period for the temporary subjective credentials
vmstat: Remove BUG_ON from vmstat_update
mm, vmstat: make quiet_vmstat lighter
ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt
tcp: reset sk_send_head in tcp_write_queue_purge
ISDN: hfcsusb: checking idx of ep configuration
media: cpia2_usb: first wake up, then free in disconnect
media: radio-raremono: change devm_k*alloc to k*alloc
Bluetooth: hci_uart: check for missing tty operations
sched/fair: Don't free p->numa_faults with concurrent readers
drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
ceph: hold i_ceph_lock when removing caps for freeing inode
Linux 4.4.187
Change-Id: Id03e619b24750a6b3faaff02166469569f5deb4f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit f16d80b75a096c52354c6e0a574993f3b0dfbdfe upstream.
On systems like P9 powernv where we have no TM (or P8 booted with
ppc_tm=off), userspace can construct a signal context which still has
the MSR TS bits set. The kernel tries to restore this context which
results in the following crash:
Unexpected TM Bad Thing exception at c0000000000022fc (msr 0x8000000102a03031) tm_scratch=800000020280f033
Oops: Unrecoverable exception, sig: 6 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 1636 Comm: sigfuz Not tainted 5.2.0-11043-g0a8ad0ffa4 #69
NIP: c0000000000022fc LR: 00007fffb2d67e48 CTR: 0000000000000000
REGS: c00000003fffbd70 TRAP: 0700 Not tainted (5.2.0-11045-g7142b497d8)
MSR: 8000000102a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[E]> CR: 42004242 XER: 00000000
CFAR: c0000000000022e0 IRQMASK: 0
GPR00: 0000000000000072 00007fffb2b6e560 00007fffb2d87f00 0000000000000669
GPR04: 00007fffb2b6e728 0000000000000000 0000000000000000 00007fffb2b6f2a8
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fffb2b76900 0000000000000000 0000000000000000
GPR16: 00007fffb2370000 00007fffb2d84390 00007fffea3a15ac 000001000a250420
GPR20: 00007fffb2b6f260 0000000010001770 0000000000000000 0000000000000000
GPR24: 00007fffb2d843a0 00007fffea3a14a0 0000000000010000 0000000000800000
GPR28: 00007fffea3a14d8 00000000003d0f00 0000000000000000 00007fffb2b6e728
NIP [c0000000000022fc] rfi_flush_fallback+0x7c/0x80
LR [00007fffb2d67e48] 0x7fffb2d67e48
Call Trace:
Instruction dump:
e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 7d2903a6 e92d0c00
e94d0c08 e96d0c10 e82d0c18 7db242a6 <4c000024> 7db243a6 7db142a6 f82d0c18
The problem is the signal code assumes TM is enabled when
CONFIG_PPC_TRANSACTIONAL_MEM is enabled. This may not be the case as
with P9 powernv or if `ppc_tm=off` is used on P8.
This means any local user can crash the system.
Fix the problem by returning a bad stack frame to the user if they try
to set the MSR TS bits with sigreturn() on systems where TM is not
supported.
Found with sigfuz kernel selftest on P9.
This fixes CVE-2019-13648.
Fixes: 2b0a576d15 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9
Reported-by: Praveen Pandey <Praveen.Pandey@in.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190719050502.405-1-mikey@neuling.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 517c3ba00916383af6411aec99442c307c23f684 upstream.
X86_HYPER_NATIVE isn't accurate for checking if running on native platform,
e.g. CONFIG_HYPERVISOR_GUEST isn't set or "nopv" is enabled.
Checking the CPU feature bit X86_FEATURE_HYPERVISOR to determine if it's
running on native platform is more accurate.
This still doesn't cover the platforms on which X86_FEATURE_HYPERVISOR is
unsupported, e.g. VMware, but there is nothing which can be done about this
scenario.
Fixes: 8a4b06d391b0 ("x86/speculation/mds: Add sysfs reporting for MDS")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1564022349-17338-1-git-send-email-zhenzhong.duan@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d02f1aa39189e0619c3525d5cd03254e61bf606a upstream.
Some Lenovo 2-in-1s with a detachable keyboard have a portrait screen but
advertise a landscape resolution and pitch, resulting in a messed up
display if the kernel tries to show anything on the efifb (because of the
wrong pitch).
Fix this by adding a new DMI match table for devices which need to have
their width and height swapped.
At first it was tried to use the existing table for overriding some of the
efifb parameters, but some of the affected devices have variants with
different LCD resolutions which will not work with hardcoded override
values.
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1730783
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190721152418.11644-1-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 733f0025f0fb43e382b84db0930ae502099b7e62 ]
When building drm/exynos for sh, as part of an allmodconfig build, the
following warning triggered:
exynos7_drm_decon.c: In function `decon_remove':
exynos7_drm_decon.c:769:24: warning: unused variable `ctx'
struct decon_context *ctx = dev_get_drvdata(&pdev->dev);
The ctx variable is only used as argument to iounmap().
In sh - allmodconfig CONFIG_MMU is not defined
so it ended up in:
\#define __iounmap(addr) do { } while (0)
\#define iounmap __iounmap
Fix the warning by introducing a static inline function for iounmap.
This is similar to several other architectures.
Link: http://lkml.kernel.org/r/20190622114208.24427-1-sam@ravnborg.org
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Inki Dae <inki.dae@samsung.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 33439620680be5225c1b8806579a291e0d761ca0 ]
In commit 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap
space") support for using hugepages in the vmalloc and ioremap areas was
enabled for radix. Unfortunately this broke EEH MMIO error checking.
Detection works by inserting a hook which checks the results of the
ioreadXX() set of functions. When a read returns a 0xFFs response we
need to check for an error which we do by mapping the (virtual) MMIO
address back to a physical address, then mapping physical address to a
PCI device via an interval tree.
When translating virt -> phys we currently assume the ioremap space is
only populated by PAGE_SIZE mappings. If a hugepage mapping is found we
emit a WARN_ON(), but otherwise handles the check as though a normal
page was found. In pathalogical cases such as copying a buffer
containing a lot of 0xFFs from BAR memory this can result in the system
not booting because it's too busy printing WARN_ON()s.
There's no real reason to assume huge pages can't be present and we're
prefectly capable of handling them, so do that.
Fixes: 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap space")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190710150517.27114-1-oohall@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ab3a0689e74e6aa5b41360bc18861040ddef5b1 ]
When testing out gpio-keys with a button, a spurious
interrupt (and therefore a key press or release event)
gets triggered as soon as the driver enables the irq
line for the first time.
This patch clears any potential bogus generated interrupt
that was caused by the switching of the associated irq's
type and polarity.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 80bf6ceaf9310b3f61934c69b382d4912deee049 ]
When we get into activate_mm(), lockdep complains that we're doing
something strange:
WARNING: possible circular locking dependency detected
5.1.0-10252-gb00152307319-dirty #121 Not tainted
------------------------------------------------------
inside.sh/366 is trying to acquire lock:
(____ptrval____) (&(&p->alloc_lock)->rlock){+.+.}, at: flush_old_exec+0x703/0x8d7
but task is already holding lock:
(____ptrval____) (&mm->mmap_sem){++++}, at: flush_old_exec+0x6c5/0x8d7
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mm->mmap_sem){++++}:
[...]
__lock_acquire+0x12ab/0x139f
lock_acquire+0x155/0x18e
down_write+0x3f/0x98
flush_old_exec+0x748/0x8d7
load_elf_binary+0x2ca/0xddb
[...]
-> #0 (&(&p->alloc_lock)->rlock){+.+.}:
[...]
__lock_acquire+0x12ab/0x139f
lock_acquire+0x155/0x18e
_raw_spin_lock+0x30/0x83
flush_old_exec+0x703/0x8d7
load_elf_binary+0x2ca/0xddb
[...]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&mm->mmap_sem);
lock(&(&p->alloc_lock)->rlock);
lock(&mm->mmap_sem);
lock(&(&p->alloc_lock)->rlock);
*** DEADLOCK ***
2 locks held by inside.sh/366:
#0: (____ptrval____) (&sig->cred_guard_mutex){+.+.}, at: __do_execve_file+0x12d/0x869
#1: (____ptrval____) (&mm->mmap_sem){++++}, at: flush_old_exec+0x6c5/0x8d7
stack backtrace:
CPU: 0 PID: 366 Comm: inside.sh Not tainted 5.1.0-10252-gb00152307319-dirty #121
Stack:
[...]
Call Trace:
[<600420de>] show_stack+0x13b/0x155
[<6048906b>] dump_stack+0x2a/0x2c
[<6009ae64>] print_circular_bug+0x332/0x343
[<6009c5c6>] check_prev_add+0x669/0xdad
[<600a06b4>] __lock_acquire+0x12ab/0x139f
[<6009f3d0>] lock_acquire+0x155/0x18e
[<604a07e0>] _raw_spin_lock+0x30/0x83
[<60151e6a>] flush_old_exec+0x703/0x8d7
[<601a8eb8>] load_elf_binary+0x2ca/0xddb
[...]
I think it's because in exec_mmap() we have
down_read(&old_mm->mmap_sem);
...
task_lock(tsk);
...
activate_mm(active_mm, mm);
(which does down_write(&mm->mmap_sem))
I'm not really sure why lockdep throws in the whole knowledge
about the task lock, but it seems that old_mm and mm shouldn't
ever be the same (and it doesn't deadlock) so tell lockdep that
they're different.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>