diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index e4cd3be77663..f97d1aaec1f9 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -279,6 +279,8 @@ What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/spec_store_bypass /sys/devices/system/cpu/vulnerabilities/l1tf /sys/devices/system/cpu/vulnerabilities/mds + /sys/devices/system/cpu/vulnerabilities/tsx_async_abort + /sys/devices/system/cpu/vulnerabilities/itlb_multihit Date: January 2018 Contact: Linux kernel mailing list Description: Information about CPU vulnerabilities diff --git a/Documentation/hw-vuln/tsx_async_abort.rst b/Documentation/hw-vuln/tsx_async_abort.rst new file mode 100644 index 000000000000..38beda735f39 --- /dev/null +++ b/Documentation/hw-vuln/tsx_async_abort.rst @@ -0,0 +1,268 @@ +.. SPDX-License-Identifier: GPL-2.0 + +TAA - TSX Asynchronous Abort +====================================== + +TAA is a hardware vulnerability that allows unprivileged speculative access to +data which is available in various CPU internal buffers by using asynchronous +aborts within an Intel TSX transactional region. + +Affected processors +------------------- + +This vulnerability only affects Intel processors that support Intel +Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8) +is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit +(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations +also mitigate against TAA. + +Whether a processor is affected or not can be read out from the TAA +vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`. + +Related CVEs +------------ + +The following CVE entry is related to this TAA issue: + + ============== ===== =================================================== + CVE-2019-11135 TAA TSX Asynchronous Abort (TAA) condition on some + microprocessors utilizing speculative execution may + allow an authenticated user to potentially enable + information disclosure via a side channel with + local access. + ============== ===== =================================================== + +Problem +------- + +When performing store, load or L1 refill operations, processors write +data into temporary microarchitectural structures (buffers). The data in +those buffers can be forwarded to load operations as an optimization. + +Intel TSX is an extension to the x86 instruction set architecture that adds +hardware transactional memory support to improve performance of multi-threaded +software. TSX lets the processor expose and exploit concurrency hidden in an +application due to dynamically avoiding unnecessary synchronization. + +TSX supports atomic memory transactions that are either committed (success) or +aborted. During an abort, operations that happened within the transactional region +are rolled back. An asynchronous abort takes place, among other options, when a +different thread accesses a cache line that is also used within the transactional +region when that access might lead to a data race. + +Immediately after an uncompleted asynchronous abort, certain speculatively +executed loads may read data from those internal buffers and pass it to dependent +operations. This can be then used to infer the value via a cache side channel +attack. + +Because the buffers are potentially shared between Hyper-Threads cross +Hyper-Thread attacks are possible. 
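For readers who have not used TSX directly, the sketch below (an editorial illustration, not part of this patch) shows what an RTM transactional region looks like from user space. It assumes a compiler that provides the RTM intrinsics in <immintrin.h> and a build with -mrtm; the function name, the shared counter and the lock-based fallback path are purely illustrative::

    /*
     * Minimal RTM sketch: the region between _xbegin() and _xend() is a
     * TSX transaction.  Any abort -- including an asynchronous abort
     * raised when another thread touches a cache line used inside the
     * region -- rolls the whole region back and returns an abort status
     * from _xbegin(), after which software must take a fallback path.
     */
    #include <immintrin.h>
    #include <pthread.h>

    static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;
    static long shared_counter;

    static void increment_counter(void)
    {
            unsigned int status = _xbegin();

            if (status == _XBEGIN_STARTED) {
                    shared_counter++;       /* transactional accesses */
                    _xend();                /* commit the transaction */
                    return;
            }

            /* Aborted: transactional state was discarded, take the lock. */
            pthread_mutex_lock(&fallback_lock);
            shared_counter++;
            pthread_mutex_unlock(&fallback_lock);
    }

TAA is not about the committed or rolled-back architectural state of such a region; it is about the stale buffer contents that speculatively executed loads may observe while an asynchronous abort is pending, which is why only the attacker, and not the victim, needs to execute TSX code.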
+
+The victim of a malicious actor does not need to make use of TSX. Only the
+attacker needs to begin a TSX transaction and raise an asynchronous abort,
+which in turn potentially leaks data stored in the buffers.
+
+More detailed technical information is available in the TAA-specific x86
+architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the TAA vulnerability can be implemented from unprivileged
+applications running on hosts or guests.
+
+As with MDS, the attacker has no control over the memory addresses that can
+be leaked. Only the victim is responsible for bringing data to the CPU. As
+a result, the malicious actor has to sample as much data as possible and
+then postprocess it to try to infer any useful information from it.
+
+A potential attacker only has read access to the data. Also, there is no direct
+privilege escalation by using this technique.
+
+
+.. _tsx_async_abort_sys_info:
+
+TAA system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current TAA status
+of mitigated systems. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+
+The possible values in this file are:
+
+.. list-table::
+
+   * - 'Vulnerable'
+     - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
+   * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+     - The system tries to clear the buffers but the microcode might not support the operation.
+   * - 'Mitigation: Clear CPU buffers'
+     - The microcode has been updated to clear the buffers. TSX is still enabled.
+   * - 'Mitigation: TSX disabled'
+     - TSX is disabled.
+   * - 'Not affected'
+     - The CPU is not affected by this issue.
+
+.. _ucode_needed:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the processor is vulnerable, but the availability of the microcode-based
+mitigation mechanism is not advertised via CPUID, the kernel selects a best
+effort mitigation mode. This mode invokes the mitigation instructions
+without a guarantee that they clear the CPU buffers.
+
+This is done to address virtualization scenarios where the host has the
+microcode update applied, but the hypervisor is not yet updated to expose the
+CPUID to the guest. If the host has updated microcode the protection takes
+effect; otherwise a few CPU cycles are wasted pointlessly.
+
+The state in the tsx_async_abort sysfs file reflects this situation
+accordingly.
+
+
+Mitigation mechanism
+--------------------
+
+The kernel detects the affected CPUs and the presence of the microcode which is
+required. If a CPU is affected and the microcode is available, then the kernel
+enables the mitigation by default.
+
+
+The mitigation can be controlled at boot time via a kernel command line option.
+See :ref:`taa_mitigation_control_command_line`.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Affected systems where the host has the TAA microcode and TAA is mitigated by
+having previously disabled TSX are not vulnerable, regardless of the status
+of the VMs.
+
+In all other cases, if the host either does not have the TAA microcode or
+the kernel is not mitigated, the system might be vulnerable.
+
+
+..
_taa_mitigation_control_command_line: + +Mitigation control on the kernel command line +--------------------------------------------- + +The kernel command line allows to control the TAA mitigations at boot time with +the option "tsx_async_abort=". The valid arguments for this option are: + + ============ ============================================================= + off This option disables the TAA mitigation on affected platforms. + If the system has TSX enabled (see next parameter) and the CPU + is affected, the system is vulnerable. + + full TAA mitigation is enabled. If TSX is enabled, on an affected + system it will clear CPU buffers on ring transitions. On + systems which are MDS-affected and deploy MDS mitigation, + TAA is also mitigated. Specifying this option on those + systems will have no effect. + ============ ============================================================= + +Not specifying this option is equivalent to "tsx_async_abort=full". + +The kernel command line also allows to control the TSX feature using the +parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used +to control the TSX feature and the enumeration of the TSX feature bits (RTM +and HLE) in CPUID. + +The valid options are: + + ============ ============================================================= + off Disables TSX on the system. + + Note that this option takes effect only on newer CPUs which are + not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 + and which get the new IA32_TSX_CTRL MSR through a microcode + update. This new MSR allows for the reliable deactivation of + the TSX functionality. + + on Enables TSX. + + Although there are mitigations for all known security + vulnerabilities, TSX has been known to be an accelerator for + several previous speculation-related CVEs, and so there may be + unknown security risks associated with leaving it enabled. + + auto Disables TSX if X86_BUG_TAA is present, otherwise enables TSX + on the system. + ============ ============================================================= + +Not specifying this option is equivalent to "tsx=off". + +The following combinations of the "tsx_async_abort" and "tsx" are possible. For +affected platforms tsx=auto is equivalent to tsx=off and the result will be: + + ========= ========================== ========================================= + tsx=on tsx_async_abort=full The system will use VERW to clear CPU + buffers. Cross-thread attacks are still + possible on SMT machines. + tsx=on tsx_async_abort=off The system is vulnerable. + tsx=off tsx_async_abort=full TSX might be disabled if microcode + provides a TSX control MSR. If so, + system is not vulnerable. + tsx=off tsx_async_abort=off ditto + ========= ========================== ========================================= + + +For unaffected platforms "tsx=on" and "tsx_async_abort=full" does not clear CPU +buffers. For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0) +"tsx" command line argument has no effect. + +For the affected platforms below table indicates the mitigation status for the +combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits MDS_NO +and TSX_CTRL_MSR. 
+ + ======= ========= ============= ======================================== + MDS_NO MD_CLEAR TSX_CTRL_MSR Status + ======= ========= ============= ======================================== + 0 0 0 Vulnerable (needs microcode) + 0 1 0 MDS and TAA mitigated via VERW + 1 1 0 MDS fixed, TAA vulnerable if TSX enabled + because MD_CLEAR has no meaning and + VERW is not guaranteed to clear buffers + 1 X 1 MDS fixed, TAA can be mitigated by + VERW or TSX_CTRL_MSR + ======= ========= ============= ======================================== + +Mitigation selection guide +-------------------------- + +1. Trusted userspace and guests +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If all user space applications are from a trusted source and do not execute +untrusted code which is supplied externally, then the mitigation can be +disabled. The same applies to virtualized environments with trusted guests. + + +2. Untrusted userspace and guests +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If there are untrusted applications or guests on the system, enabling TSX +might allow a malicious actor to leak data from the host or from other +processes running on the same physical core. + +If the microcode is available and the TSX is disabled on the host, attacks +are prevented in a virtualized environment as well, even if the VMs do not +explicitly enable the mitigation. + + +.. _taa_default_mitigations: + +Default mitigations +------------------- + +The kernel's default action for vulnerable processors is: + + - Deploy TSX disable mitigation (tsx_async_abort=full tsx=off). diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index b3c7bdffbdc0..d170c8fcced6 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2240,6 +2240,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. spectre_v2_user=off [X86] spec_store_bypass_disable=off [X86] mds=off [X86] + tsx_async_abort=off [X86] auto (default) Mitigate all CPU vulnerabilities, but leave SMT @@ -4131,6 +4132,67 @@ bytes respectively. Such letter suffixes can also be entirely omitted. platforms where RDTSC is slow and this accounting can add overhead. + tsx= [X86] Control Transactional Synchronization + Extensions (TSX) feature in Intel processors that + support TSX control. + + This parameter controls the TSX feature. The options are: + + on - Enable TSX on the system. Although there are + mitigations for all known security vulnerabilities, + TSX has been known to be an accelerator for + several previous speculation-related CVEs, and + so there may be unknown security risks associated + with leaving it enabled. + + off - Disable TSX on the system. (Note that this + option takes effect only on newer CPUs which are + not vulnerable to MDS, i.e., have + MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get + the new IA32_TSX_CTRL MSR through a microcode + update. This new MSR allows for the reliable + deactivation of the TSX functionality.) + + auto - Disable TSX if X86_BUG_TAA is present, + otherwise enable TSX on the system. + + Not specifying this option is equivalent to tsx=off. + + See Documentation/hw-vuln/tsx_async_abort.rst + for more details. + + tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async + Abort (TAA) vulnerability. 
+ + Similar to Micro-architectural Data Sampling (MDS) + certain CPUs that support Transactional + Synchronization Extensions (TSX) are vulnerable to an + exploit against CPU internal buffers which can forward + information to a disclosure gadget under certain + conditions. + + In vulnerable processors, the speculatively forwarded + data can be used in a cache side channel attack, to + access data to which the attacker does not have direct + access. + + This parameter controls the TAA mitigation. The + options are: + + full - Enable TAA mitigation on vulnerable CPUs + if TSX is enabled. + + off - Unconditionally disable TAA mitigation + + Not specifying this option is equivalent to + tsx_async_abort=full. On CPUs which are MDS affected + and deploy MDS mitigation, TAA mitigation is not + required and doesn't provide any additional + mitigation. + + For details see: + Documentation/hw-vuln/tsx_async_abort.rst + turbografx.map[2|3]= [HW,JOY] TurboGraFX parallel port interface Format: diff --git a/Documentation/x86/tsx_async_abort.rst b/Documentation/x86/tsx_async_abort.rst new file mode 100644 index 000000000000..4a4336a89372 --- /dev/null +++ b/Documentation/x86/tsx_async_abort.rst @@ -0,0 +1,117 @@ +.. SPDX-License-Identifier: GPL-2.0 + +TSX Async Abort (TAA) mitigation +================================ + +.. _tsx_async_abort: + +Overview +-------- + +TSX Async Abort (TAA) is a side channel attack on internal buffers in some +Intel processors similar to Microachitectural Data Sampling (MDS). In this +case certain loads may speculatively pass invalid data to dependent operations +when an asynchronous abort condition is pending in a Transactional +Synchronization Extensions (TSX) transaction. This includes loads with no +fault or assist condition. Such loads may speculatively expose stale data from +the same uarch data structures as in MDS, with same scope of exposure i.e. +same-thread and cross-thread. This issue affects all current processors that +support TSX. + +Mitigation strategy +------------------- + +a) TSX disable - one of the mitigations is to disable TSX. A new MSR +IA32_TSX_CTRL will be available in future and current processors after +microcode update which can be used to disable TSX. In addition, it +controls the enumeration of the TSX feature bits (RTM and HLE) in CPUID. + +b) Clear CPU buffers - similar to MDS, clearing the CPU buffers mitigates this +vulnerability. More details on this approach can be found in +:ref:`Documentation/hw-vuln/mds.rst `. + +Kernel internal mitigation modes +-------------------------------- + + ============= ============================================================ + off Mitigation is disabled. Either the CPU is not affected or + tsx_async_abort=off is supplied on the kernel command line. + + tsx disabled Mitigation is enabled. TSX feature is disabled by default at + bootup on processors that support TSX control. + + verw Mitigation is enabled. CPU is affected and MD_CLEAR is + advertised in CPUID. + + ucode needed Mitigation is enabled. CPU is affected and MD_CLEAR is not + advertised in CPUID. That is mainly for virtualization + scenarios where the host has the updated microcode but the + hypervisor does not expose MD_CLEAR in CPUID. It's a best + effort approach without guarantee. 
+ ============= ============================================================ + +If the CPU is affected and the "tsx_async_abort" kernel command line parameter is +not provided then the kernel selects an appropriate mitigation depending on the +status of RTM and MD_CLEAR CPUID bits. + +Below tables indicate the impact of tsx=on|off|auto cmdline options on state of +TAA mitigation, VERW behavior and TSX feature for various combinations of +MSR_IA32_ARCH_CAPABILITIES bits. + +1. "tsx=off" + +========= ========= ============ ============ ============== =================== ====================== +MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=off +---------------------------------- ------------------------------------------------------------------------- +TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation + after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full +========= ========= ============ ============ ============== =================== ====================== + 0 0 0 HW default Yes Same as MDS Same as MDS + 0 0 1 Invalid case Invalid case Invalid case Invalid case + 0 1 0 HW default No Need ucode update Need ucode update + 0 1 1 Disabled Yes TSX disabled TSX disabled + 1 X 1 Disabled X None needed None needed +========= ========= ============ ============ ============== =================== ====================== + +2. "tsx=on" + +========= ========= ============ ============ ============== =================== ====================== +MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=on +---------------------------------- ------------------------------------------------------------------------- +TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation + after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full +========= ========= ============ ============ ============== =================== ====================== + 0 0 0 HW default Yes Same as MDS Same as MDS + 0 0 1 Invalid case Invalid case Invalid case Invalid case + 0 1 0 HW default No Need ucode update Need ucode update + 0 1 1 Enabled Yes None Same as MDS + 1 X 1 Enabled X None needed None needed +========= ========= ============ ============ ============== =================== ====================== + +3. "tsx=auto" + +========= ========= ============ ============ ============== =================== ====================== +MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=auto +---------------------------------- ------------------------------------------------------------------------- +TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation + after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full +========= ========= ============ ============ ============== =================== ====================== + 0 0 0 HW default Yes Same as MDS Same as MDS + 0 0 1 Invalid case Invalid case Invalid case Invalid case + 0 1 0 HW default No Need ucode update Need ucode update + 0 1 1 Disabled Yes TSX disabled TSX disabled + 1 X 1 Enabled X None needed None needed +========= ========= ============ ============ ============== =================== ====================== + +In the tables, TSX_CTRL_MSR is a new bit in MSR_IA32_ARCH_CAPABILITIES that +indicates whether MSR_IA32_TSX_CTRL is supported. + +There are two control bits in IA32_TSX_CTRL MSR: + + Bit 0: When set it disables the Restricted Transactional Memory (RTM) + sub-feature of TSX (will force all transactions to abort on the + XBEGIN instruction). 
+ + Bit 1: When set it disables the enumeration of the RTM and HLE feature + (i.e. it will make CPUID(EAX=7).EBX{bit4} and + CPUID(EAX=7).EBX{bit11} read as 0). diff --git a/Makefile b/Makefile index 992923e533ff..09e872accf0e 100644 --- a/Makefile +++ b/Makefile @@ -1,6 +1,6 @@ VERSION = 4 PATCHLEVEL = 4 -SUBLEVEL = 200 +SUBLEVEL = 202 EXTRAVERSION = NAME = Blurry Fish Butt diff --git a/arch/mips/bcm63xx/reset.c b/arch/mips/bcm63xx/reset.c index d1fe51edf5e6..4d411da2497b 100644 --- a/arch/mips/bcm63xx/reset.c +++ b/arch/mips/bcm63xx/reset.c @@ -119,7 +119,7 @@ #define BCM6368_RESET_DSL 0 #define BCM6368_RESET_SAR SOFTRESET_6368_SAR_MASK #define BCM6368_RESET_EPHY SOFTRESET_6368_EPHY_MASK -#define BCM6368_RESET_ENETSW 0 +#define BCM6368_RESET_ENETSW SOFTRESET_6368_ENETSW_MASK #define BCM6368_RESET_PCM SOFTRESET_6368_PCM_MASK #define BCM6368_RESET_MPI SOFTRESET_6368_MPI_MASK #define BCM6368_RESET_PCIE 0 diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 96efd8213c1c..d7eb035a9c96 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -66,29 +66,35 @@ endif UTS_MACHINE := $(OLDARCH) ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y) -override CC += -mlittle-endian -ifneq ($(cc-name),clang) -override CC += -mno-strict-align -endif -override AS += -mlittle-endian override LD += -EL -override CROSS32CC += -mlittle-endian override CROSS32AS += -mlittle-endian LDEMULATION := lppc GNUTARGET := powerpcle MULTIPLEWORD := -mno-multiple KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-save-toc-indirect) else -ifeq ($(call cc-option-yn,-mbig-endian),y) -override CC += -mbig-endian -override AS += -mbig-endian -endif override LD += -EB LDEMULATION := ppc GNUTARGET := powerpc MULTIPLEWORD := -mmultiple endif +ifdef CONFIG_PPC64 +cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mabi=elfv1) +cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mcall-aixdesc) +aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mabi=elfv1) +aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2 +endif + +cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mlittle-endian +cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mbig-endian) +ifneq ($(cc-name),clang) + cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mno-strict-align +endif + +aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mbig-endian) +aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mlittle-endian + ifeq ($(HAS_BIARCH),y) override AS += -a$(CONFIG_WORD_SIZE) override LD += -m elf$(CONFIG_WORD_SIZE)$(LDEMULATION) @@ -121,7 +127,9 @@ ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y) CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2,$(call cc-option,-mcall-aixdesc)) AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2) else +CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1) CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcall-aixdesc) +AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1) endif CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc)) CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions) @@ -212,6 +220,9 @@ cpu-as-$(CONFIG_E200) += -Wa,-me200 KBUILD_AFLAGS += $(cpu-as-y) KBUILD_CFLAGS += $(cpu-as-y) +KBUILD_AFLAGS += $(aflags-y) +KBUILD_CFLAGS += $(cflags-y) + head-y := arch/powerpc/kernel/head_$(CONFIG_WORD_SIZE).o head-$(CONFIG_8xx) := arch/powerpc/kernel/head_8xx.o head-$(CONFIG_40x) := arch/powerpc/kernel/head_40x.o diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper index ceaa75d5a684..be4831acda22 100755 --- a/arch/powerpc/boot/wrapper +++ b/arch/powerpc/boot/wrapper @@ -161,6 +161,28 @@ 
case "$elfformat" in elf32-powerpc) format=elf32ppc ;; esac +ld_version() +{ + # Poached from scripts/ld-version.sh, but we don't want to call that because + # this script (wrapper) is distributed separately from the kernel source. + # Extract linker version number from stdin and turn into single number. + awk '{ + gsub(".*\\)", ""); + gsub(".*version ", ""); + gsub("-.*", ""); + split($1,a, "."); + print a[1]*100000000 + a[2]*1000000 + a[3]*10000; + exit + }' +} + +# Do not include PT_INTERP segment when linking pie. Non-pie linking +# just ignores this option. +LD_VERSION=$(${CROSS}ld --version | ld_version) +LD_NO_DL_MIN_VERSION=$(echo 2.26 | ld_version) +if [ "$LD_VERSION" -ge "$LD_NO_DL_MIN_VERSION" ] ; then + nodl="--no-dynamic-linker" +fi platformo=$object/"$platform".o lds=$object/zImage.lds @@ -412,7 +434,7 @@ if [ "$platform" != "miboot" ]; then if [ -n "$link_address" ] ; then text_start="-Ttext $link_address" fi - ${CROSS}ld -m $format -T $lds $text_start $pie -o "$ofile" \ + ${CROSS}ld -m $format -T $lds $text_start $pie $nodl -o "$ofile" \ $platformo $tmp $object/wrapper.a rm $tmp fi diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 75b5d8f59940..2dccd604c6eb 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1718,6 +1718,51 @@ config X86_INTEL_MPX If unsure, say N. +choice + prompt "TSX enable mode" + depends on CPU_SUP_INTEL + default X86_INTEL_TSX_MODE_OFF + help + Intel's TSX (Transactional Synchronization Extensions) feature + allows to optimize locking protocols through lock elision which + can lead to a noticeable performance boost. + + On the other hand it has been shown that TSX can be exploited + to form side channel attacks (e.g. TAA) and chances are there + will be more of those attacks discovered in the future. + + Therefore TSX is not enabled by default (aka tsx=off). An admin + might override this decision by tsx=on the command line parameter. + Even with TSX enabled, the kernel will attempt to enable the best + possible TAA mitigation setting depending on the microcode available + for the particular machine. + + This option allows to set the default tsx mode between tsx=on, =off + and =auto. See Documentation/kernel-parameters.txt for more + details. + + Say off if not sure, auto if TSX is in use but it should be used on safe + platforms or on if TSX is in use and the security aspect of tsx is not + relevant. + +config X86_INTEL_TSX_MODE_OFF + bool "off" + help + TSX is disabled if possible - equals to tsx=off command line parameter. + +config X86_INTEL_TSX_MODE_ON + bool "on" + help + TSX is always enabled on TSX capable HW - equals the tsx=on command + line parameter. + +config X86_INTEL_TSX_MODE_AUTO + bool "auto" + help + TSX is enabled on TSX capable HW that is believed to be safe against + side channel attacks- equals the tsx=auto command line parameter. 
+endchoice + config EFI bool "EFI runtime service support" depends on ACPI diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 113cb01ebaac..94491e4d21a7 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -340,5 +340,7 @@ #define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */ #define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */ #define X86_BUG_SWAPGS X86_BUG(21) /* CPU is affected by speculation through SWAPGS */ +#define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */ +#define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */ #endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 39f202462029..dac449879113 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -408,6 +408,7 @@ struct kvm_vcpu_arch { u64 smbase; bool tpr_access_reporting; u64 ia32_xss; + u64 arch_capabilities; /* * Paging state of the vcpu @@ -1226,6 +1227,7 @@ void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu); void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm, unsigned long address); +u64 kvm_get_arch_capabilities(void); void kvm_define_shared_msr(unsigned index, u32 msr); int kvm_set_shared_msr(unsigned index, u64 val, u64 mask); diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 30183770132a..854a20efa771 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -71,10 +71,26 @@ * Microarchitectural Data * Sampling (MDS) vulnerabilities. */ +#define ARCH_CAP_PSCHANGE_MC_NO BIT(6) /* + * The processor is not susceptible to a + * machine check error due to modifying the + * code page size along with either the + * physical address or cache type + * without TLB invalidation. + */ +#define ARCH_CAP_TSX_CTRL_MSR BIT(7) /* MSR for TSX control is available. */ +#define ARCH_CAP_TAA_NO BIT(8) /* + * Not susceptible to + * TSX Async Abort (TAA) vulnerabilities. 
+ */ #define MSR_IA32_BBL_CR_CTL 0x00000119 #define MSR_IA32_BBL_CR_CTL3 0x0000011e +#define MSR_IA32_TSX_CTRL 0x00000122 +#define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */ +#define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */ + #define MSR_IA32_SYSENTER_CS 0x00000174 #define MSR_IA32_SYSENTER_ESP 0x00000175 #define MSR_IA32_SYSENTER_EIP 0x00000176 diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index c3138ac80db2..783f0711895b 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -268,7 +268,7 @@ DECLARE_STATIC_KEY_FALSE(mds_idle_clear); #include /** - * mds_clear_cpu_buffers - Mitigation for MDS vulnerability + * mds_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability * * This uses the otherwise unused and obsolete VERW instruction in * combination with microcode which triggers a CPU buffer flush when the @@ -291,7 +291,7 @@ static inline void mds_clear_cpu_buffers(void) } /** - * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability + * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability * * Clear CPU buffers if the corresponding static key is enabled */ diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index dab73faef9b0..cac54e61c299 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -852,4 +852,11 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +enum taa_mitigations { + TAA_MITIGATION_OFF, + TAA_MITIGATION_UCODE_NEEDED, + TAA_MITIGATION_VERW, + TAA_MITIGATION_TSX_DISABLED, +}; + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index 1e5184092ee6..ea8f887da6cc 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -25,7 +25,7 @@ obj-y += bugs.o obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o -obj-$(CONFIG_CPU_SUP_INTEL) += intel.o +obj-$(CONFIG_CPU_SUP_INTEL) += intel.o tsx.o obj-$(CONFIG_CPU_SUP_AMD) += amd.o obj-$(CONFIG_CPU_SUP_CYRIX_32) += cyrix.o obj-$(CONFIG_CPU_SUP_CENTAUR) += centaur.o diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 917c63aa1599..7fd0a13ae0ba 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -30,11 +30,14 @@ #include #include +#include "cpu.h" + static void __init spectre_v1_select_mitigation(void); static void __init spectre_v2_select_mitigation(void); static void __init ssb_select_mitigation(void); static void __init l1tf_select_mitigation(void); static void __init mds_select_mitigation(void); +static void __init taa_select_mitigation(void); /* The base value of the SPEC_CTRL MSR that always has to be preserved. 
*/ u64 x86_spec_ctrl_base; @@ -94,6 +97,7 @@ void __init check_bugs(void) ssb_select_mitigation(); l1tf_select_mitigation(); mds_select_mitigation(); + taa_select_mitigation(); arch_smt_update(); @@ -246,6 +250,93 @@ static int __init mds_cmdline(char *str) } early_param("mds", mds_cmdline); +#undef pr_fmt +#define pr_fmt(fmt) "TAA: " fmt + +/* Default mitigation for TAA-affected CPUs */ +static enum taa_mitigations taa_mitigation = TAA_MITIGATION_VERW; + +static const char * const taa_strings[] = { + [TAA_MITIGATION_OFF] = "Vulnerable", + [TAA_MITIGATION_UCODE_NEEDED] = "Vulnerable: Clear CPU buffers attempted, no microcode", + [TAA_MITIGATION_VERW] = "Mitigation: Clear CPU buffers", + [TAA_MITIGATION_TSX_DISABLED] = "Mitigation: TSX disabled", +}; + +static void __init taa_select_mitigation(void) +{ + u64 ia32_cap; + + if (!boot_cpu_has_bug(X86_BUG_TAA)) { + taa_mitigation = TAA_MITIGATION_OFF; + return; + } + + /* TSX previously disabled by tsx=off */ + if (!boot_cpu_has(X86_FEATURE_RTM)) { + taa_mitigation = TAA_MITIGATION_TSX_DISABLED; + goto out; + } + + if (cpu_mitigations_off()) { + taa_mitigation = TAA_MITIGATION_OFF; + return; + } + + /* TAA mitigation is turned off on the cmdline (tsx_async_abort=off) */ + if (taa_mitigation == TAA_MITIGATION_OFF) + goto out; + + if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) + taa_mitigation = TAA_MITIGATION_VERW; + else + taa_mitigation = TAA_MITIGATION_UCODE_NEEDED; + + /* + * VERW doesn't clear the CPU buffers when MD_CLEAR=1 and MDS_NO=1. + * A microcode update fixes this behavior to clear CPU buffers. It also + * adds support for MSR_IA32_TSX_CTRL which is enumerated by the + * ARCH_CAP_TSX_CTRL_MSR bit. + * + * On MDS_NO=1 CPUs if ARCH_CAP_TSX_CTRL_MSR is not set, microcode + * update is required. + */ + ia32_cap = x86_read_arch_cap_msr(); + if ( (ia32_cap & ARCH_CAP_MDS_NO) && + !(ia32_cap & ARCH_CAP_TSX_CTRL_MSR)) + taa_mitigation = TAA_MITIGATION_UCODE_NEEDED; + + /* + * TSX is enabled, select alternate mitigation for TAA which is + * the same as MDS. Enable MDS static branch to clear CPU buffers. + * + * For guests that can't determine whether the correct microcode is + * present on host, enable the mitigation for UCODE_NEEDED as well. + */ + static_branch_enable(&mds_user_clear); + +out: + pr_info("%s\n", taa_strings[taa_mitigation]); +} + +static int __init tsx_async_abort_parse_cmdline(char *str) +{ + if (!boot_cpu_has_bug(X86_BUG_TAA)) + return 0; + + if (!str) + return -EINVAL; + + if (!strcmp(str, "off")) { + taa_mitigation = TAA_MITIGATION_OFF; + } else if (!strcmp(str, "full")) { + taa_mitigation = TAA_MITIGATION_VERW; + } + + return 0; +} +early_param("tsx_async_abort", tsx_async_abort_parse_cmdline); + #undef pr_fmt #define pr_fmt(fmt) "Spectre V1 : " fmt @@ -758,13 +849,10 @@ static void update_mds_branch_idle(void) } #define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n" +#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n" void arch_smt_update(void) { - /* Enhanced IBRS implies STIBP. No update required. 
*/ - if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) - return; - mutex_lock(&spec_ctrl_mutex); switch (spectre_v2_user) { @@ -790,6 +878,17 @@ void arch_smt_update(void) break; } + switch (taa_mitigation) { + case TAA_MITIGATION_VERW: + case TAA_MITIGATION_UCODE_NEEDED: + if (sched_smt_active()) + pr_warn_once(TAA_MSG_SMT); + break; + case TAA_MITIGATION_TSX_DISABLED: + case TAA_MITIGATION_OFF: + break; + } + mutex_unlock(&spec_ctrl_mutex); } @@ -1178,6 +1277,11 @@ static void __init l1tf_select_mitigation(void) #ifdef CONFIG_SYSFS +static ssize_t itlb_multihit_show_state(char *buf) +{ + return sprintf(buf, "Processor vulnerable\n"); +} + static ssize_t mds_show_state(char *buf) { #ifdef CONFIG_HYPERVISOR_GUEST @@ -1197,6 +1301,21 @@ static ssize_t mds_show_state(char *buf) sched_smt_active() ? "vulnerable" : "disabled"); } +static ssize_t tsx_async_abort_show_state(char *buf) +{ + if ((taa_mitigation == TAA_MITIGATION_TSX_DISABLED) || + (taa_mitigation == TAA_MITIGATION_OFF)) + return sprintf(buf, "%s\n", taa_strings[taa_mitigation]); + + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { + return sprintf(buf, "%s; SMT Host state unknown\n", + taa_strings[taa_mitigation]); + } + + return sprintf(buf, "%s; SMT %s\n", taa_strings[taa_mitigation], + sched_smt_active() ? "vulnerable" : "disabled"); +} + static char *stibp_state(void) { if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) @@ -1262,6 +1381,12 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr case X86_BUG_MDS: return mds_show_state(buf); + case X86_BUG_TAA: + return tsx_async_abort_show_state(buf); + + case X86_BUG_ITLB_MULTIHIT: + return itlb_multihit_show_state(buf); + default: break; } @@ -1298,4 +1423,14 @@ ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *bu { return cpu_show_common(dev, attr, buf, X86_BUG_MDS); } + +ssize_t cpu_show_tsx_async_abort(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_TAA); +} + +ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_ITLB_MULTIHIT); +} #endif diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 3965235973c8..e8fa12c7ad5b 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -847,13 +847,14 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c) #endif } -#define NO_SPECULATION BIT(0) -#define NO_MELTDOWN BIT(1) -#define NO_SSB BIT(2) -#define NO_L1TF BIT(3) -#define NO_MDS BIT(4) -#define MSBDS_ONLY BIT(5) -#define NO_SWAPGS BIT(6) +#define NO_SPECULATION BIT(0) +#define NO_MELTDOWN BIT(1) +#define NO_SSB BIT(2) +#define NO_L1TF BIT(3) +#define NO_MDS BIT(4) +#define MSBDS_ONLY BIT(5) +#define NO_SWAPGS BIT(6) +#define NO_ITLB_MULTIHIT BIT(7) #define VULNWL(_vendor, _family, _model, _whitelist) \ { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist } @@ -871,26 +872,26 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION), /* Intel Family 6 */ - VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION), - VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION), - VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION), - VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION), - VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION), + VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION | NO_ITLB_MULTIHIT), + 
VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT), - VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), - VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), - VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), - VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), - VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), - VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), + VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), VULNWL_INTEL(CORE_YONAH, NO_SSB), - VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS), + VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), - VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS), - VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF | NO_SWAPGS), - VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS), + VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT), /* * Technically, swapgs isn't serializing on AMD (despite it previously @@ -901,13 +902,13 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { */ /* AMD Family 0xf - 0x12 */ - VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), - VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), - VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), - VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), + VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), + VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */ - VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS), + VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), {} }; @@ -918,19 +919,30 @@ static bool __init cpu_matches(unsigned long which) return m && !!(m->driver_data & which); } -static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) +u64 x86_read_arch_cap_msr(void) { u64 ia32_cap = 0; + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); + + return ia32_cap; +} + +static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) +{ + u64 ia32_cap = x86_read_arch_cap_msr(); + + /* Set ITLB_MULTIHIT bug if cpu is not in the whitelist and not mitigated */ + if (!cpu_matches(NO_ITLB_MULTIHIT) && !(ia32_cap & 
ARCH_CAP_PSCHANGE_MC_NO)) + setup_force_cpu_bug(X86_BUG_ITLB_MULTIHIT); + if (cpu_matches(NO_SPECULATION)) return; setup_force_cpu_bug(X86_BUG_SPECTRE_V1); setup_force_cpu_bug(X86_BUG_SPECTRE_V2); - if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES)) - rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); - if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) && !cpu_has(c, X86_FEATURE_AMD_SSB_NO)) setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS); @@ -947,6 +959,21 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) if (!cpu_matches(NO_SWAPGS)) setup_force_cpu_bug(X86_BUG_SWAPGS); + /* + * When the CPU is not mitigated for TAA (TAA_NO=0) set TAA bug when: + * - TSX is supported or + * - TSX_CTRL is present + * + * TSX_CTRL check is needed for cases when TSX could be disabled before + * the kernel boot e.g. kexec. + * TSX_CTRL check alone is not sufficient for cases when the microcode + * update is not present or running as guest that don't get TSX_CTRL. + */ + if (!(ia32_cap & ARCH_CAP_TAA_NO) && + (cpu_has(c, X86_FEATURE_RTM) || + (ia32_cap & ARCH_CAP_TSX_CTRL_MSR))) + setup_force_cpu_bug(X86_BUG_TAA); + if (cpu_matches(NO_MELTDOWN)) return; @@ -1287,6 +1314,8 @@ void __init identify_boot_cpu(void) enable_sep_cpu(); #endif cpu_detect_tlb(&boot_cpu_data); + + tsx_init(); } void identify_secondary_cpu(struct cpuinfo_x86 *c) diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h index 3b19d82f7932..c42cc1acd668 100644 --- a/arch/x86/kernel/cpu/cpu.h +++ b/arch/x86/kernel/cpu/cpu.h @@ -44,9 +44,27 @@ struct _tlb_table { extern const struct cpu_dev *const __x86_cpu_dev_start[], *const __x86_cpu_dev_end[]; +#ifdef CONFIG_CPU_SUP_INTEL +enum tsx_ctrl_states { + TSX_CTRL_ENABLE, + TSX_CTRL_DISABLE, + TSX_CTRL_NOT_SUPPORTED, +}; + +extern enum tsx_ctrl_states tsx_ctrl_state; + +extern void __init tsx_init(void); +extern void tsx_enable(void); +extern void tsx_disable(void); +#else +static inline void tsx_init(void) { } +#endif /* CONFIG_CPU_SUP_INTEL */ + extern void get_cpu_cap(struct cpuinfo_x86 *c); extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c); extern void x86_spec_ctrl_setup_ap(void); +extern u64 x86_read_arch_cap_msr(void); + #endif /* ARCH_X86_CPU_H */ diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index b0e0c7a12e61..7beef3da5904 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -582,6 +582,11 @@ static void init_intel(struct cpuinfo_x86 *c) detect_vmx_virtcap(c); init_intel_energy_perf(c); + + if (tsx_ctrl_state == TSX_CTRL_ENABLE) + tsx_enable(); + if (tsx_ctrl_state == TSX_CTRL_DISABLE) + tsx_disable(); } #ifdef CONFIG_X86_32 diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c index 989d3c215d2b..66ca6ec09bd4 100644 --- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c +++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c @@ -555,7 +555,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs) if (event->attr.sample_type & PERF_SAMPLE_RAW) offset_max = perf_ibs->offset_max; else if (check_rip) - offset_max = 2; + offset_max = 3; else offset_max = 1; do { diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c new file mode 100644 index 000000000000..c2a9dd816c5c --- /dev/null +++ b/arch/x86/kernel/cpu/tsx.c @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Intel Transactional Synchronization Extensions (TSX) control. 
+ * + * Copyright (C) 2019 Intel Corporation + * + * Author: + * Pawan Gupta + */ + +#include + +#include + +#include "cpu.h" + +enum tsx_ctrl_states tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED; + +void tsx_disable(void) +{ + u64 tsx; + + rdmsrl(MSR_IA32_TSX_CTRL, tsx); + + /* Force all transactions to immediately abort */ + tsx |= TSX_CTRL_RTM_DISABLE; + + /* + * Ensure TSX support is not enumerated in CPUID. + * This is visible to userspace and will ensure they + * do not waste resources trying TSX transactions that + * will always abort. + */ + tsx |= TSX_CTRL_CPUID_CLEAR; + + wrmsrl(MSR_IA32_TSX_CTRL, tsx); +} + +void tsx_enable(void) +{ + u64 tsx; + + rdmsrl(MSR_IA32_TSX_CTRL, tsx); + + /* Enable the RTM feature in the cpu */ + tsx &= ~TSX_CTRL_RTM_DISABLE; + + /* + * Ensure TSX support is enumerated in CPUID. + * This is visible to userspace and will ensure they + * can enumerate and use the TSX feature. + */ + tsx &= ~TSX_CTRL_CPUID_CLEAR; + + wrmsrl(MSR_IA32_TSX_CTRL, tsx); +} + +static bool __init tsx_ctrl_is_supported(void) +{ + u64 ia32_cap = x86_read_arch_cap_msr(); + + /* + * TSX is controlled via MSR_IA32_TSX_CTRL. However, support for this + * MSR is enumerated by ARCH_CAP_TSX_MSR bit in MSR_IA32_ARCH_CAPABILITIES. + * + * TSX control (aka MSR_IA32_TSX_CTRL) is only available after a + * microcode update on CPUs that have their MSR_IA32_ARCH_CAPABILITIES + * bit MDS_NO=1. CPUs with MDS_NO=0 are not planned to get + * MSR_IA32_TSX_CTRL support even after a microcode update. Thus, + * tsx= cmdline requests will do nothing on CPUs without + * MSR_IA32_TSX_CTRL support. + */ + return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR); +} + +static enum tsx_ctrl_states x86_get_tsx_auto_mode(void) +{ + if (boot_cpu_has_bug(X86_BUG_TAA)) + return TSX_CTRL_DISABLE; + + return TSX_CTRL_ENABLE; +} + +void __init tsx_init(void) +{ + char arg[5] = {}; + int ret; + + if (!tsx_ctrl_is_supported()) + return; + + ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg)); + if (ret >= 0) { + if (!strcmp(arg, "on")) { + tsx_ctrl_state = TSX_CTRL_ENABLE; + } else if (!strcmp(arg, "off")) { + tsx_ctrl_state = TSX_CTRL_DISABLE; + } else if (!strcmp(arg, "auto")) { + tsx_ctrl_state = x86_get_tsx_auto_mode(); + } else { + tsx_ctrl_state = TSX_CTRL_DISABLE; + pr_err("tsx: invalid option, defaulting to off\n"); + } + } else { + /* tsx= not provided */ + if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_AUTO)) + tsx_ctrl_state = x86_get_tsx_auto_mode(); + else if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_OFF)) + tsx_ctrl_state = TSX_CTRL_DISABLE; + else + tsx_ctrl_state = TSX_CTRL_ENABLE; + } + + if (tsx_ctrl_state == TSX_CTRL_DISABLE) { + tsx_disable(); + + /* + * tsx_disable() will change the state of the + * RTM CPUID bit. Clear it here since it is now + * expected to be not set. + */ + setup_clear_cpu_cap(X86_FEATURE_RTM); + } else if (tsx_ctrl_state == TSX_CTRL_ENABLE) { + + /* + * HW defaults TSX to be enabled at bootup. + * We may still need the TSX enable support + * during init for special cases like + * kexec after TSX is disabled. + */ + tsx_enable(); + + /* + * tsx_enable() will change the state of the + * RTM CPUID bit. Force it here since it is now + * expected to be set. 
+ */ + setup_force_cpu_cap(X86_FEATURE_RTM); + } +} diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 53918abccbc3..40e415fedcee 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -447,6 +447,18 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, entry->ebx |= F(TSC_ADJUST); entry->edx &= kvm_cpuid_7_0_edx_x86_features; cpuid_mask(&entry->edx, CPUID_7_EDX); + if (boot_cpu_has(X86_FEATURE_IBPB) && + boot_cpu_has(X86_FEATURE_IBRS)) + entry->edx |= F(SPEC_CTRL); + if (boot_cpu_has(X86_FEATURE_STIBP)) + entry->edx |= F(INTEL_STIBP); + if (boot_cpu_has(X86_FEATURE_SSBD)) + entry->edx |= F(SPEC_CTRL_SSBD); + /* + * We emulate ARCH_CAPABILITIES in software even + * if the host doesn't support it. + */ + entry->edx |= F(ARCH_CAPABILITIES); } else { entry->ebx = 0; entry->edx = 0; diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index f8f9d1b368bf..1b3a432f6fd5 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -546,7 +546,6 @@ struct vcpu_vmx { u64 msr_guest_kernel_gs_base; #endif - u64 arch_capabilities; u64 spec_ctrl; u32 vm_entry_controls_shadow; @@ -2866,12 +2865,6 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = to_vmx(vcpu)->spec_ctrl; break; - case MSR_IA32_ARCH_CAPABILITIES: - if (!msr_info->host_initiated && - !guest_cpuid_has_arch_capabilities(vcpu)) - return 1; - msr_info->data = to_vmx(vcpu)->arch_capabilities; - break; case MSR_IA32_SYSENTER_CS: msr_info->data = vmcs_read32(GUEST_SYSENTER_CS); break; @@ -3028,11 +3021,6 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_W); break; - case MSR_IA32_ARCH_CAPABILITIES: - if (!msr_info->host_initiated) - return 1; - vmx->arch_capabilities = data; - break; case MSR_IA32_CR_PAT: if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) { if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data)) @@ -5079,9 +5067,6 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx) ++vmx->nmsrs; } - if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) - rdmsrl(MSR_IA32_ARCH_CAPABILITIES, vmx->arch_capabilities); - vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl); /* 22.2.1, 20.8.1 */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2b47fd3d4b8c..3b711cd261d7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -575,7 +575,7 @@ static bool pdptrs_changed(struct kvm_vcpu *vcpu) gfn_t gfn; int r; - if (is_long_mode(vcpu) || !is_pae(vcpu)) + if (is_long_mode(vcpu) || !is_pae(vcpu) || !is_paging(vcpu)) return false; if (!test_bit(VCPU_EXREG_PDPTR, @@ -995,6 +995,43 @@ static u32 emulated_msrs[] = { static unsigned num_emulated_msrs; +u64 kvm_get_arch_capabilities(void) +{ + u64 data; + + rdmsrl_safe(MSR_IA32_ARCH_CAPABILITIES, &data); + + if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN)) + data |= ARCH_CAP_RDCL_NO; + if (!boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS)) + data |= ARCH_CAP_SSB_NO; + if (!boot_cpu_has_bug(X86_BUG_MDS)) + data |= ARCH_CAP_MDS_NO; + + /* + * On TAA affected systems, export MDS_NO=0 when: + * - TSX is enabled on the host, i.e. X86_FEATURE_RTM=1. + * - Updated microcode is present. This is detected by + * the presence of ARCH_CAP_TSX_CTRL_MSR and ensures + * that VERW clears CPU buffers. 
+ * + * When MDS_NO=0 is exported, guests deploy clear CPU buffer + * mitigation and don't complain: + * + * "Vulnerable: Clear CPU buffers attempted, no microcode" + * + * If TSX is disabled on the system, guests are also mitigated against + * TAA and clear CPU buffer mitigation is not required for guests. + */ + if (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM) && + (data & ARCH_CAP_TSX_CTRL_MSR)) + data &= ~ARCH_CAP_MDS_NO; + + return data; +} + +EXPORT_SYMBOL_GPL(kvm_get_arch_capabilities); + static bool __kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer) { if (efer & EFER_FFXSR) { @@ -2070,6 +2107,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_AMD64_BU_CFG2: break; + case MSR_IA32_ARCH_CAPABILITIES: + if (!msr_info->host_initiated) + return 1; + vcpu->arch.arch_capabilities = data; + break; case MSR_EFER: return set_efer(vcpu, msr_info); case MSR_K7_HWCR: @@ -2344,6 +2386,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_IA32_UCODE_REV: msr_info->data = 0x100000000ULL; break; + case MSR_IA32_ARCH_CAPABILITIES: + if (!msr_info->host_initiated && + !guest_cpuid_has_arch_capabilities(vcpu)) + return 1; + msr_info->data = vcpu->arch.arch_capabilities; + break; case MSR_MTRRcap: case 0x200 ... 0x2ff: return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); @@ -7168,7 +7216,7 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, kvm_update_cpuid(vcpu); idx = srcu_read_lock(&vcpu->kvm->srcu); - if (!is_long_mode(vcpu) && is_pae(vcpu)) { + if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu)) { load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu)); mmu_reset_needed = 1; } @@ -7392,6 +7440,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) { int r; + vcpu->arch.arch_capabilities = kvm_get_arch_capabilities(); kvm_vcpu_mtrr_init(vcpu); r = vcpu_load(vcpu); if (r) diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index e5ac568c7c9b..2a5aad29c8a3 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -711,12 +711,27 @@ ssize_t __weak cpu_show_mds(struct device *dev, return sprintf(buf, "Not affected\n"); } +ssize_t __weak cpu_show_tsx_async_abort(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + return sprintf(buf, "Not affected\n"); +} + +ssize_t __weak cpu_show_itlb_multihit(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "Not affected\n"); +} + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL); static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL); static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL); +static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL); +static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL); static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -725,6 +740,8 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_spec_store_bypass.attr, &dev_attr_l1tf.attr, &dev_attr_mds.attr, + &dev_attr_tsx_async_abort.attr, + &dev_attr_itlb_multihit.attr, NULL }; diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index db58c8d664c2..6188b70c2790 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -50,13 +50,11 @@ 
* granting userspace undue privileges. There are three categories of privilege. * * First, commands which are explicitly defined as privileged or which should - * only be used by the kernel driver. The parser generally rejects such - * commands, though it may allow some from the drm master process. + * only be used by the kernel driver. The parser rejects such commands * * Second, commands which access registers. To support correct/enhanced * userspace functionality, particularly certain OpenGL extensions, the parser - * provides a whitelist of registers which userspace may safely access (for both - * normal and drm master processes). + * provides a whitelist of registers which userspace may safely access * * Third, commands which access privileged memory (i.e. GGTT, HWS page, etc). * The parser always rejects such commands. @@ -81,9 +79,9 @@ * in the per-ring command tables. * * Other command table entries map fairly directly to high level categories - * mentioned above: rejected, master-only, register whitelist. The parser - * implements a number of checks, including the privileged memory checks, via a - * general bitmasking mechanism. + * mentioned above: rejected, register whitelist. The parser implements a number + * of checks, including the privileged memory checks, via a general bitmasking + * mechanism. */ #define STD_MI_OPCODE_MASK 0xFF800000 @@ -94,7 +92,7 @@ #define CMD(op, opm, f, lm, fl, ...) \ { \ .flags = (fl) | ((f) ? CMD_DESC_FIXED : 0), \ - .cmd = { (op), (opm) }, \ + .cmd = { (op) & (opm), (opm) }, \ .length = { (lm) }, \ __VA_ARGS__ \ } @@ -109,14 +107,13 @@ #define R CMD_DESC_REJECT #define W CMD_DESC_REGISTER #define B CMD_DESC_BITMASK -#define M CMD_DESC_MASTER /* Command Mask Fixed Len Action ---------------------------------------------------------- */ -static const struct drm_i915_cmd_descriptor common_cmds[] = { +static const struct drm_i915_cmd_descriptor gen7_common_cmds[] = { CMD( MI_NOOP, SMI, F, 1, S ), CMD( MI_USER_INTERRUPT, SMI, F, 1, R ), - CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, M ), + CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, R ), CMD( MI_ARB_CHECK, SMI, F, 1, S ), CMD( MI_REPORT_HEAD, SMI, F, 1, S ), CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ), @@ -146,7 +143,7 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { CMD( MI_BATCH_BUFFER_START, SMI, !F, 0xFF, S ), }; -static const struct drm_i915_cmd_descriptor render_cmds[] = { +static const struct drm_i915_cmd_descriptor gen7_render_cmds[] = { CMD( MI_FLUSH, SMI, F, 1, S ), CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), CMD( MI_PREDICATE, SMI, F, 1, S ), @@ -213,7 +210,7 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { CMD( MI_URB_ATOMIC_ALLOC, SMI, F, 1, S ), CMD( MI_SET_APPID, SMI, F, 1, S ), CMD( MI_RS_CONTEXT, SMI, F, 1, S ), - CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ), + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ), CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, R ), CMD( MI_RS_STORE_DATA_IMM, SMI, !F, 0xFF, S ), @@ -229,7 +226,7 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { CMD( GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS, S3D, !F, 0x1FF, S ), }; -static const struct drm_i915_cmd_descriptor video_cmds[] = { +static const struct drm_i915_cmd_descriptor gen7_video_cmds[] = { CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), CMD( MI_SET_APPID, SMI, F, 1, S ), CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, @@ -273,7 +270,7 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = { CMD( MFX_WAIT, SMFX, F, 1, S ), }; -static const struct 
drm_i915_cmd_descriptor vecs_cmds[] = { +static const struct drm_i915_cmd_descriptor gen7_vecs_cmds[] = { CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), CMD( MI_SET_APPID, SMI, F, 1, S ), CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, @@ -311,7 +308,7 @@ static const struct drm_i915_cmd_descriptor vecs_cmds[] = { }}, ), }; -static const struct drm_i915_cmd_descriptor blt_cmds[] = { +static const struct drm_i915_cmd_descriptor gen7_blt_cmds[] = { CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B, .bits = {{ @@ -345,10 +342,62 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = { }; static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = { - CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ), + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ), CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), }; +/* + * For Gen9 we can still rely on the h/w to enforce cmd security, and only + * need to re-enforce the register access checks. We therefore only need to + * teach the cmdparser how to find the end of each command, and identify + * register accesses. The table doesn't need to reject any commands, and so + * the only commands listed here are: + * 1) Those that touch registers + * 2) Those that do not have the default 8-bit length + * + * Note that the default MI length mask chosen for this table is 0xFF, not + * the 0x3F used on older devices. This is because the vast majority of MI + * cmds on Gen9 use a standard 8-bit Length field. + * All the Gen9 blitter instructions are standard 0xFF length mask, and + * none allow access to non-general registers, so in fact no BLT cmds are + * included in the table at all. + * + */ +static const struct drm_i915_cmd_descriptor gen9_blt_cmds[] = { + CMD( MI_NOOP, SMI, F, 1, S ), + CMD( MI_USER_INTERRUPT, SMI, F, 1, S ), + CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, S ), + CMD( MI_FLUSH, SMI, F, 1, S ), + CMD( MI_ARB_CHECK, SMI, F, 1, S ), + CMD( MI_REPORT_HEAD, SMI, F, 1, S ), + CMD( MI_ARB_ON_OFF, SMI, F, 1, S ), + CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ), + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, S ), + CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), + CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W, + .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 2 } ), + CMD( MI_UPDATE_GTT, SMI, !F, 0x3FF, S ), + CMD( MI_STORE_REGISTER_MEM_GEN8, SMI, F, 4, W, + .reg = { .offset = 1, .mask = 0x007FFFFC } ), + CMD( MI_FLUSH_DW, SMI, !F, 0x3F, S ), + CMD( MI_LOAD_REGISTER_MEM_GEN8, SMI, F, 4, W, + .reg = { .offset = 1, .mask = 0x007FFFFC } ), + CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W, + .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 1 } ), + + /* + * We allow BB_START but apply further checks. We just sanitize the + * basic fields here. 
+ */ + CMD( MI_BATCH_BUFFER_START_GEN8, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = ~SMI, + .expected = (MI_BATCH_PPGTT_HSW | 1), + }}, ), +}; + #undef CMD #undef SMI #undef S3D @@ -359,40 +408,44 @@ static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = { #undef R #undef W #undef B -#undef M -static const struct drm_i915_cmd_table gen7_render_cmds[] = { - { common_cmds, ARRAY_SIZE(common_cmds) }, - { render_cmds, ARRAY_SIZE(render_cmds) }, +static const struct drm_i915_cmd_table gen7_render_cmd_table[] = { + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, + { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) }, }; -static const struct drm_i915_cmd_table hsw_render_ring_cmds[] = { - { common_cmds, ARRAY_SIZE(common_cmds) }, - { render_cmds, ARRAY_SIZE(render_cmds) }, +static const struct drm_i915_cmd_table hsw_render_ring_cmd_table[] = { + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, + { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) }, { hsw_render_cmds, ARRAY_SIZE(hsw_render_cmds) }, }; -static const struct drm_i915_cmd_table gen7_video_cmds[] = { - { common_cmds, ARRAY_SIZE(common_cmds) }, - { video_cmds, ARRAY_SIZE(video_cmds) }, +static const struct drm_i915_cmd_table gen7_video_cmd_table[] = { + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, + { gen7_video_cmds, ARRAY_SIZE(gen7_video_cmds) }, }; -static const struct drm_i915_cmd_table hsw_vebox_cmds[] = { - { common_cmds, ARRAY_SIZE(common_cmds) }, - { vecs_cmds, ARRAY_SIZE(vecs_cmds) }, +static const struct drm_i915_cmd_table hsw_vebox_cmd_table[] = { + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, + { gen7_vecs_cmds, ARRAY_SIZE(gen7_vecs_cmds) }, }; -static const struct drm_i915_cmd_table gen7_blt_cmds[] = { - { common_cmds, ARRAY_SIZE(common_cmds) }, - { blt_cmds, ARRAY_SIZE(blt_cmds) }, +static const struct drm_i915_cmd_table gen7_blt_cmd_table[] = { + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, + { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) }, }; -static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] = { - { common_cmds, ARRAY_SIZE(common_cmds) }, - { blt_cmds, ARRAY_SIZE(blt_cmds) }, +static const struct drm_i915_cmd_table hsw_blt_ring_cmd_table[] = { + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, + { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) }, { hsw_blt_cmds, ARRAY_SIZE(hsw_blt_cmds) }, }; +static const struct drm_i915_cmd_table gen9_blt_cmd_table[] = { + { gen9_blt_cmds, ARRAY_SIZE(gen9_blt_cmds) }, +}; + + /* * Register whitelists, sorted by increasing register offset. 
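For reference, a minimal sketch of how the parser turns a matched command header into a dword count once a descriptor is known. This restates the length handling already present in i915_parse_cmds() (LENGTH_BIAS is 2); the helper name is illustrative only:

static u32 example_cmd_dword_count(const struct drm_i915_cmd_descriptor *desc,
				   u32 cmd_header)
{
	/* F (fixed length) commands take their length from the table. */
	if (desc->flags & CMD_DESC_FIXED)
		return desc->length.fixed;

	/*
	 * Variable-length commands keep (dword count - LENGTH_BIAS) in the
	 * low bits of the header, masked with 0xFF for most Gen9 MI commands
	 * and 0x3F on the older Gen7 tables.
	 */
	return (cmd_header & desc->length.mask) + LENGTH_BIAS;
}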
*/ @@ -426,6 +479,10 @@ struct drm_i915_reg_descriptor { #define REG64(addr) \ REG32(addr), REG32(addr + sizeof(u32)) +#define REG64_IDX(_reg, idx) \ + { .addr = _reg(idx) }, \ + { .addr = _reg ## _UDW(idx) } + static const struct drm_i915_reg_descriptor gen7_render_regs[] = { REG64(GPGPU_THREADS_DISPATCHED), REG64(HS_INVOCATION_COUNT), @@ -479,17 +536,27 @@ static const struct drm_i915_reg_descriptor gen7_blt_regs[] = { REG32(BCS_SWCTRL), }; -static const struct drm_i915_reg_descriptor ivb_master_regs[] = { - REG32(FORCEWAKE_MT), - REG32(DERRMR), - REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_A)), - REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_B)), - REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_C)), -}; - -static const struct drm_i915_reg_descriptor hsw_master_regs[] = { - REG32(FORCEWAKE_MT), - REG32(DERRMR), +static const struct drm_i915_reg_descriptor gen9_blt_regs[] = { + REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE), + REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE), + REG32(BCS_SWCTRL), + REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE), + REG64_IDX(BCS_GPR, 0), + REG64_IDX(BCS_GPR, 1), + REG64_IDX(BCS_GPR, 2), + REG64_IDX(BCS_GPR, 3), + REG64_IDX(BCS_GPR, 4), + REG64_IDX(BCS_GPR, 5), + REG64_IDX(BCS_GPR, 6), + REG64_IDX(BCS_GPR, 7), + REG64_IDX(BCS_GPR, 8), + REG64_IDX(BCS_GPR, 9), + REG64_IDX(BCS_GPR, 10), + REG64_IDX(BCS_GPR, 11), + REG64_IDX(BCS_GPR, 12), + REG64_IDX(BCS_GPR, 13), + REG64_IDX(BCS_GPR, 14), + REG64_IDX(BCS_GPR, 15), }; #undef REG64 @@ -550,6 +617,17 @@ static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header) return 0; } +static u32 gen9_blt_get_cmd_length_mask(u32 cmd_header) +{ + u32 client = (cmd_header & INSTR_CLIENT_MASK) >> INSTR_CLIENT_SHIFT; + + if (client == INSTR_MI_CLIENT || client == INSTR_BC_CLIENT) + return 0xFF; + + DRM_DEBUG_DRIVER("CMD: Abnormal blt cmd length! 0x%08X\n", cmd_header); + return 0; +} + static bool validate_cmds_sorted(struct intel_engine_cs *ring, const struct drm_i915_cmd_table *cmd_tables, int cmd_table_count) @@ -608,9 +686,7 @@ static bool check_sorted(int ring_id, static bool validate_regs_sorted(struct intel_engine_cs *ring) { - return check_sorted(ring->id, ring->reg_table, ring->reg_count) && - check_sorted(ring->id, ring->master_reg_table, - ring->master_reg_count); + return check_sorted(ring->id, ring->reg_table, ring->reg_count); } struct cmd_node { @@ -691,63 +767,61 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring) int cmd_table_count; int ret; - if (!IS_GEN7(ring->dev)) + if (!IS_GEN7(ring->dev) && !(IS_GEN9(ring->dev) && ring->id == BCS)) return 0; switch (ring->id) { case RCS: if (IS_HASWELL(ring->dev)) { - cmd_tables = hsw_render_ring_cmds; + cmd_tables = hsw_render_ring_cmd_table; cmd_table_count = - ARRAY_SIZE(hsw_render_ring_cmds); + ARRAY_SIZE(hsw_render_ring_cmd_table); } else { - cmd_tables = gen7_render_cmds; - cmd_table_count = ARRAY_SIZE(gen7_render_cmds); + cmd_tables = gen7_render_cmd_table; + cmd_table_count = ARRAY_SIZE(gen7_render_cmd_table); } ring->reg_table = gen7_render_regs; ring->reg_count = ARRAY_SIZE(gen7_render_regs); - if (IS_HASWELL(ring->dev)) { - ring->master_reg_table = hsw_master_regs; - ring->master_reg_count = ARRAY_SIZE(hsw_master_regs); - } else { - ring->master_reg_table = ivb_master_regs; - ring->master_reg_count = ARRAY_SIZE(ivb_master_regs); - } - ring->get_cmd_length_mask = gen7_render_get_cmd_length_mask; break; case VCS: - cmd_tables = gen7_video_cmds; - cmd_table_count = ARRAY_SIZE(gen7_video_cmds); + cmd_tables = gen7_video_cmd_table; + cmd_table_count = ARRAY_SIZE(gen7_video_cmd_table); ring->get_cmd_length_mask = 
gen7_bsd_get_cmd_length_mask; break; case BCS: - if (IS_HASWELL(ring->dev)) { - cmd_tables = hsw_blt_ring_cmds; - cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmds); - } else { - cmd_tables = gen7_blt_cmds; - cmd_table_count = ARRAY_SIZE(gen7_blt_cmds); - } - - ring->reg_table = gen7_blt_regs; - ring->reg_count = ARRAY_SIZE(gen7_blt_regs); - - if (IS_HASWELL(ring->dev)) { - ring->master_reg_table = hsw_master_regs; - ring->master_reg_count = ARRAY_SIZE(hsw_master_regs); - } else { - ring->master_reg_table = ivb_master_regs; - ring->master_reg_count = ARRAY_SIZE(ivb_master_regs); - } - ring->get_cmd_length_mask = gen7_blt_get_cmd_length_mask; + if (IS_GEN9(ring->dev)) { + cmd_tables = gen9_blt_cmd_table; + cmd_table_count = ARRAY_SIZE(gen9_blt_cmd_table); + ring->get_cmd_length_mask = + gen9_blt_get_cmd_length_mask; + + /* BCS Engine unsafe without parser */ + ring->requires_cmd_parser = 1; + } + else if (IS_HASWELL(ring->dev)) { + cmd_tables = hsw_blt_ring_cmd_table; + cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmd_table); + } else { + cmd_tables = gen7_blt_cmd_table; + cmd_table_count = ARRAY_SIZE(gen7_blt_cmd_table); + } + + if (IS_GEN9(ring->dev)) { + ring->reg_table = gen9_blt_regs; + ring->reg_count = ARRAY_SIZE(gen9_blt_regs); + } else { + ring->reg_table = gen7_blt_regs; + ring->reg_count = ARRAY_SIZE(gen7_blt_regs); + } + break; case VECS: - cmd_tables = hsw_vebox_cmds; - cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds); + cmd_tables = hsw_vebox_cmd_table; + cmd_table_count = ARRAY_SIZE(hsw_vebox_cmd_table); /* VECS can use the same length_mask function as VCS */ ring->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask; break; @@ -769,7 +843,7 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring) return ret; } - ring->needs_cmd_parser = true; + ring->using_cmd_parser = true; return 0; } @@ -783,7 +857,7 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring) */ void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring) { - if (!ring->needs_cmd_parser) + if (!ring->using_cmd_parser) return; fini_hash_table(ring); @@ -949,30 +1023,9 @@ unpin_src: return ret ? ERR_PTR(ret) : dst; } -/** - * i915_needs_cmd_parser() - should a given ring use software command parsing? - * @ring: the ring in question - * - * Only certain platforms require software batch buffer command parsing, and - * only when enabled via module parameter. 
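The requires_cmd_parser/using_cmd_parser flags set here feed the batch-scan decision made further down in this patch by use_cmdparser() in i915_gem_execbuffer.c; a condensed restatement of that helper, for readability only:

static inline bool example_should_scan_batch(const struct intel_engine_cs *ring,
					     u32 batch_len)
{
	/* Gen9 BCS: scanning is mandatory, there is no hardware fallback. */
	if (ring->requires_cmd_parser)
		return true;

	/* Gen7: keep the historical opt-in behaviour. */
	return ring->using_cmd_parser && batch_len && USES_PPGTT(ring->dev);
}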
- * - * Return: true if the ring requires software command parsing - */ -bool i915_needs_cmd_parser(struct intel_engine_cs *ring) -{ - if (!ring->needs_cmd_parser) - return false; - - if (!USES_PPGTT(ring->dev)) - return false; - - return (i915.enable_cmd_parser == 1); -} - -static bool check_cmd(const struct intel_engine_cs *ring, +static int check_cmd(const struct intel_engine_cs *ring, const struct drm_i915_cmd_descriptor *desc, const u32 *cmd, u32 length, - const bool is_master, bool *oacontrol_set) { if (desc->flags & CMD_DESC_REJECT) { @@ -980,12 +1033,6 @@ static bool check_cmd(const struct intel_engine_cs *ring, return false; } - if ((desc->flags & CMD_DESC_MASTER) && !is_master) { - DRM_DEBUG_DRIVER("CMD: Rejected master-only command: 0x%08X\n", - *cmd); - return false; - } - if (desc->flags & CMD_DESC_REGISTER) { /* * Get the distance between individual register offset @@ -1002,11 +1049,6 @@ static bool check_cmd(const struct intel_engine_cs *ring, find_reg(ring->reg_table, ring->reg_count, reg_addr); - if (!reg && is_master) - reg = find_reg(ring->master_reg_table, - ring->master_reg_count, - reg_addr); - if (!reg) { DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (ring=%d)\n", reg_addr, *cmd, ring->id); @@ -1091,16 +1133,113 @@ static bool check_cmd(const struct intel_engine_cs *ring, return true; } +static int check_bbstart(struct intel_context *ctx, + u32 *cmd, u64 offset, u32 length, + u32 batch_len, + u64 batch_start, + u64 shadow_batch_start) +{ + + u64 jump_offset, jump_target; + u32 target_cmd_offset, target_cmd_index; + + /* For igt compatibility on older platforms */ + if (CMDPARSER_USES_GGTT(ctx->i915)) { + DRM_DEBUG("CMD: Rejecting BB_START for ggtt based submission\n"); + return -EACCES; + } + + if (length != 3) { + DRM_DEBUG("CMD: Recursive BB_START with bad length(%u)\n", + length); + return -EINVAL; + } + + jump_target = *(u64*)(cmd+1); + jump_offset = jump_target - batch_start; + + /* + * Any underflow of jump_target is guaranteed to be outside the range + * of a u32, so >= test catches both too large and too small + */ + if (jump_offset >= batch_len) { + DRM_DEBUG("CMD: BB_START to 0x%llx jumps out of BB\n", + jump_target); + return -EINVAL; + } + + /* + * This cannot overflow a u32 because we already checked jump_offset + * is within the BB, and the batch_len is a u32 + */ + target_cmd_offset = lower_32_bits(jump_offset); + target_cmd_index = target_cmd_offset / sizeof(u32); + + *(u64*)(cmd + 1) = shadow_batch_start + target_cmd_offset; + + if (target_cmd_index == offset) + return 0; + + if (ctx->jump_whitelist_cmds <= target_cmd_index) { + DRM_DEBUG("CMD: Rejecting BB_START - truncated whitelist array\n"); + return -EINVAL; + } else if (!test_bit(target_cmd_index, ctx->jump_whitelist)) { + DRM_DEBUG("CMD: BB_START to 0x%llx not a previously executed cmd\n", + jump_target); + return -EINVAL; + } + + return 0; +} + +static void init_whitelist(struct intel_context *ctx, u32 batch_len) +{ + const u32 batch_cmds = DIV_ROUND_UP(batch_len, sizeof(u32)); + const u32 exact_size = BITS_TO_LONGS(batch_cmds); + u32 next_size = BITS_TO_LONGS(roundup_pow_of_two(batch_cmds)); + unsigned long *next_whitelist; + + if (CMDPARSER_USES_GGTT(ctx->i915)) + return; + + if (batch_cmds <= ctx->jump_whitelist_cmds) { + bitmap_zero(ctx->jump_whitelist, batch_cmds); + return; + } + +again: + next_whitelist = kcalloc(next_size, sizeof(long), GFP_KERNEL); + if (next_whitelist) { + kfree(ctx->jump_whitelist); + ctx->jump_whitelist = next_whitelist; + 
ctx->jump_whitelist_cmds = + next_size * BITS_PER_BYTE * sizeof(long); + return; + } + + if (next_size > exact_size) { + next_size = exact_size; + goto again; + } + + DRM_DEBUG("CMD: Failed to extend whitelist. BB_START may be disallowed\n"); + bitmap_zero(ctx->jump_whitelist, ctx->jump_whitelist_cmds); + + return; +} + #define LENGTH_BIAS 2 /** * i915_parse_cmds() - parse a submitted batch buffer for privilege violations + * @ctx: the context in which the batch is to execute * @ring: the ring on which the batch is to execute * @batch_obj: the batch buffer in question - * @shadow_batch_obj: copy of the batch buffer in question + * @user_batch_start: Canonical base address of original user batch * @batch_start_offset: byte offset in the batch at which execution starts * @batch_len: length of the commands in batch_obj - * @is_master: is the submitting process the drm master? + * @shadow_batch_obj: copy of the batch buffer in question + * @shadow_batch_start: Canonical base address of shadow_batch_obj * * Parses the specified batch buffer looking for privilege violations as * described in the overview. @@ -1108,14 +1247,16 @@ static bool check_cmd(const struct intel_engine_cs *ring, * Return: non-zero if the parser finds violations or otherwise fails; -EACCES * if the batch appears legal but should use hardware parsing */ -int i915_parse_cmds(struct intel_engine_cs *ring, +int i915_parse_cmds(struct intel_context *ctx, + struct intel_engine_cs *ring, struct drm_i915_gem_object *batch_obj, - struct drm_i915_gem_object *shadow_batch_obj, + u64 user_batch_start, u32 batch_start_offset, u32 batch_len, - bool is_master) + struct drm_i915_gem_object *shadow_batch_obj, + u64 shadow_batch_start) { - u32 *cmd, *batch_base, *batch_end; + u32 *cmd, *batch_base, *batch_end, offset = 0; struct drm_i915_cmd_descriptor default_desc = { 0 }; bool oacontrol_set = false; /* OACONTROL tracking. See check_cmd() */ int ret = 0; @@ -1127,6 +1268,8 @@ int i915_parse_cmds(struct intel_engine_cs *ring, return PTR_ERR(batch_base); } + init_whitelist(ctx, batch_len); + /* * We use the batch length as size because the shadow object is as * large or larger and copy_batch() will write MI_NOPs to the extra @@ -1150,16 +1293,6 @@ int i915_parse_cmds(struct intel_engine_cs *ring, break; } - /* - * If the batch buffer contains a chained batch, return an - * error that tells the caller to abort and dispatch the - * workload as a non-secure batch. - */ - if (desc->cmd.value == MI_BATCH_BUFFER_START) { - ret = -EACCES; - break; - } - if (desc->flags & CMD_DESC_FIXED) length = desc->length.fixed; else @@ -1174,13 +1307,23 @@ int i915_parse_cmds(struct intel_engine_cs *ring, break; } - if (!check_cmd(ring, desc, cmd, length, is_master, - &oacontrol_set)) { - ret = -EINVAL; + if (!check_cmd(ring, desc, cmd, length, &oacontrol_set)) { + ret = CMDPARSER_USES_GGTT(ring->dev) ? 
-EINVAL : -EACCES; break; } + if (desc->cmd.value == MI_BATCH_BUFFER_START) { + ret = check_bbstart(ctx, cmd, offset, length, + batch_len, user_batch_start, + shadow_batch_start); + break; + } + + if (ctx->jump_whitelist_cmds > offset) + set_bit(offset, ctx->jump_whitelist); + cmd += length; + offset += length; } if (oacontrol_set) { @@ -1206,7 +1349,7 @@ int i915_parse_cmds(struct intel_engine_cs *ring, * * Return: the current version number of the cmd parser */ -int i915_cmd_parser_get_version(void) +int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv) { /* * Command parser version history @@ -1218,6 +1361,7 @@ int i915_cmd_parser_get_version(void) * 3. Allow access to the GPGPU_THREADS_DISPATCHED register. * 4. L3 atomic chicken bits of HSW_SCRATCH1 and HSW_ROW_CHICKEN3. * 5. GPGPU dispatch compute indirect registers. + * 10. Gen9 only - Supports the new ppgtt based BLIT parser */ - return 5; + return CMDPARSER_USES_GGTT(dev_priv) ? 5 : 10; } diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 61fcb3b22297..7b61078c2330 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -133,7 +133,7 @@ static int i915_getparam(struct drm_device *dev, void *data, value = 1; break; case I915_PARAM_HAS_SECURE_BATCHES: - value = capable(CAP_SYS_ADMIN); + value = HAS_SECURE_BATCHES(dev_priv) && capable(CAP_SYS_ADMIN); break; case I915_PARAM_HAS_PINNED_BATCHES: value = 1; @@ -145,7 +145,7 @@ static int i915_getparam(struct drm_device *dev, void *data, value = 1; break; case I915_PARAM_CMD_PARSER_VERSION: - value = i915_cmd_parser_get_version(); + value = i915_cmd_parser_get_version(dev_priv); break; case I915_PARAM_HAS_COHERENT_PHYS_GTT: value = 1; diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index a6ad938f44a6..697b2499c7a1 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -698,6 +698,8 @@ static int i915_drm_suspend_late(struct drm_device *drm_dev, bool hibernation) return ret; } + i915_rc6_ctx_wa_suspend(dev_priv); + pci_disable_device(drm_dev->pdev); /* * During hibernation on some platforms the BIOS may try to access @@ -849,6 +851,8 @@ static int i915_drm_resume_early(struct drm_device *dev) intel_uncore_sanitize(dev); intel_power_domains_init_hw(dev_priv); + i915_rc6_ctx_wa_resume(dev_priv); + return ret; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6fca39e1c419..5799356f6b6b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -891,6 +891,12 @@ struct intel_context { int pin_count; } engine[I915_NUM_RINGS]; + /* jump_whitelist: Bit array for tracking cmds during cmdparsing */ + unsigned long *jump_whitelist; + + /* jump_whitelist_cmds: No of cmd slots available */ + uint32_t jump_whitelist_cmds; + struct list_head link; }; @@ -1153,6 +1159,7 @@ struct intel_gen6_power_mgmt { bool client_boost; bool enabled; + bool ctx_corrupted; struct delayed_work delayed_resume_work; unsigned boosts; @@ -2539,6 +2546,9 @@ struct drm_i915_cmd_table { #define HAS_BSD2(dev) (INTEL_INFO(dev)->ring_mask & BSD2_RING) #define HAS_BLT(dev) (INTEL_INFO(dev)->ring_mask & BLT_RING) #define HAS_VEBOX(dev) (INTEL_INFO(dev)->ring_mask & VEBOX_RING) + +#define HAS_SECURE_BATCHES(dev_priv) (INTEL_INFO(dev_priv)->gen < 6) + #define HAS_LLC(dev) (INTEL_INFO(dev)->has_llc) #define HAS_WT(dev) ((IS_HASWELL(dev) || IS_BROADWELL(dev)) && \ __I915__(dev)->ellc_size) @@ -2553,8 +2563,18 @@ struct drm_i915_cmd_table 
{ #define HAS_OVERLAY(dev) (INTEL_INFO(dev)->has_overlay) #define OVERLAY_NEEDS_PHYSICAL(dev) (INTEL_INFO(dev)->overlay_needs_physical) +/* + * The Gen7 cmdparser copies the scanned buffer to the ggtt for execution + * All later gens can run the final buffer from the ppgtt + */ +#define CMDPARSER_USES_GGTT(dev_priv) IS_GEN7(dev_priv) + /* Early gen2 have a totally busted CS tlb and require pinned batches. */ #define HAS_BROKEN_CS_TLB(dev) (IS_I830(dev) || IS_845G(dev)) + +#define NEEDS_RC6_CTX_CORRUPTION_WA(dev) \ + (IS_BROADWELL(dev) || INTEL_INFO(dev)->gen == 9) + /* * dp aux and gmbus irq on gen4 seems to be able to generate legacy interrupts * even when in MSI mode. This results in spurious interrupt warnings if the @@ -3276,16 +3296,19 @@ void i915_get_extra_instdone(struct drm_device *dev, uint32_t *instdone); const char *i915_cache_level_str(struct drm_i915_private *i915, int type); /* i915_cmd_parser.c */ -int i915_cmd_parser_get_version(void); +int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv); int i915_cmd_parser_init_ring(struct intel_engine_cs *ring); void i915_cmd_parser_fini_ring(struct intel_engine_cs *ring); bool i915_needs_cmd_parser(struct intel_engine_cs *ring); -int i915_parse_cmds(struct intel_engine_cs *ring, +int i915_parse_cmds(struct intel_context *cxt, + struct intel_engine_cs *ring, struct drm_i915_gem_object *batch_obj, - struct drm_i915_gem_object *shadow_batch_obj, + u64 user_batch_start, u32 batch_start_offset, u32 batch_len, - bool is_master); + struct drm_i915_gem_object *shadow_batch_obj, + u64 shadow_batch_start); + /* i915_suspend.c */ extern int i915_save_state(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 0433d25f9d23..20fb0ee1df4f 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -157,6 +157,8 @@ void i915_gem_context_free(struct kref *ctx_ref) if (i915.enable_execlists) intel_lr_context_free(ctx); + kfree(ctx->jump_whitelist); + /* * This context is going away and we need to remove all VMAs still * around. 
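The jump_whitelist freed above is the per-context bitmap added to struct intel_context earlier in this patch: one bit per dword offset of the batch being scanned, set for every offset at which a validated command starts and consulted by check_bbstart() before a batch-buffer jump is allowed. A hypothetical lookup helper, using only names from this patch:

static bool example_is_valid_jump_target(const struct intel_context *ctx,
					 u32 dword_offset)
{
	return dword_offset < ctx->jump_whitelist_cmds &&
	       test_bit(dword_offset, ctx->jump_whitelist);
}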
This is to handle imported shared objects for which @@ -246,6 +248,9 @@ __create_hw_context(struct drm_device *dev, ctx->hang_stats.ban_period_seconds = DRM_I915_CTX_BAN_PERIOD; + ctx->jump_whitelist = NULL; + ctx->jump_whitelist_cmds = 0; + return ctx; err_out: diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 8800f410b2d2..56f6f5c5941e 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1121,17 +1121,52 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev, return 0; } +static struct i915_vma* +shadow_batch_pin(struct drm_i915_gem_object *obj, struct i915_address_space *vm) +{ + struct drm_i915_private *dev_priv = to_i915(obj->base.dev); + struct i915_address_space *pin_vm = vm; + u64 flags; + int ret; + + /* + * PPGTT backed shadow buffers must be mapped RO, to prevent + * post-scan tampering + */ + if (CMDPARSER_USES_GGTT(dev_priv)) { + flags = PIN_GLOBAL; + pin_vm = &dev_priv->gtt.base; + } else if (vm->has_read_only) { + flags = PIN_USER; + obj->gt_ro = 1; + } else { + DRM_DEBUG("Cannot prevent post-scan tampering without RO capable vm\n"); + return ERR_PTR(-EINVAL); + } + + ret = i915_gem_object_pin(obj, pin_vm, 0, flags); + if (ret) + return ERR_PTR(ret); + else + return i915_gem_obj_to_vma(obj, pin_vm); +} + static struct drm_i915_gem_object* -i915_gem_execbuffer_parse(struct intel_engine_cs *ring, +i915_gem_execbuffer_parse(struct intel_context *ctx, + struct intel_engine_cs *ring, struct drm_i915_gem_exec_object2 *shadow_exec_entry, struct eb_vmas *eb, + struct i915_address_space *vm, struct drm_i915_gem_object *batch_obj, u32 batch_start_offset, - u32 batch_len, - bool is_master) + u32 batch_len) { struct drm_i915_gem_object *shadow_batch_obj; struct i915_vma *vma; + struct i915_vma *user_vma = list_entry(eb->vmas.prev, + typeof(*user_vma), exec_list); + u64 batch_start; + u64 shadow_batch_start; int ret; shadow_batch_obj = i915_gem_batch_pool_get(&ring->batch_pool, @@ -1139,24 +1174,34 @@ i915_gem_execbuffer_parse(struct intel_engine_cs *ring, if (IS_ERR(shadow_batch_obj)) return shadow_batch_obj; - ret = i915_parse_cmds(ring, + vma = shadow_batch_pin(shadow_batch_obj, vm); + if (IS_ERR(vma)) { + ret = PTR_ERR(vma); + goto err; + } + + batch_start = user_vma->node.start + batch_start_offset; + + shadow_batch_start = vma->node.start; + + ret = i915_parse_cmds(ctx, + ring, batch_obj, - shadow_batch_obj, + batch_start, batch_start_offset, batch_len, - is_master); - if (ret) - goto err; - - ret = i915_gem_obj_ggtt_pin(shadow_batch_obj, 0, 0); - if (ret) + shadow_batch_obj, + shadow_batch_start); + if (ret) { + WARN_ON(vma->pin_count == 0); + vma->pin_count--; goto err; + } i915_gem_object_unpin_pages(shadow_batch_obj); memset(shadow_exec_entry, 0, sizeof(*shadow_exec_entry)); - vma = i915_gem_obj_to_ggtt(shadow_batch_obj); vma->exec_entry = shadow_exec_entry; vma->exec_entry->flags = __EXEC_OBJECT_HAS_PIN; drm_gem_object_reference(&shadow_batch_obj->base); @@ -1168,7 +1213,14 @@ i915_gem_execbuffer_parse(struct intel_engine_cs *ring, err: i915_gem_object_unpin_pages(shadow_batch_obj); - if (ret == -EACCES) /* unhandled chained batch */ + + /* + * Unsafe GGTT-backed buffers can still be submitted safely + * as non-secure. 
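A worked example, with made-up addresses, of how the two values computed above are used by check_bbstart() to relocate a second-level MI_BATCH_BUFFER_START from the user batch into the shadow copy:

/*
 * batch_start        = user_vma->node.start + batch_start_offset = 0x100000
 * shadow_batch_start = vma->node.start                           = 0x200000
 *
 * A BB_START operand of 0x100040 gives jump_offset = 0x40, which must be
 * smaller than batch_len. The operand in the shadow copy is rewritten to
 * shadow_batch_start + 0x40 = 0x200040, and the jump is accepted only if
 * dword index 0x40 / 4 = 16 is set in the context's jump_whitelist, i.e.
 * it was the start of a command already validated earlier in this scan.
 */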
+ * For PPGTT backing however, we have no choice but to forcibly + * reject unsafe buffers + */ + if (CMDPARSER_USES_GGTT(batch_obj->base.dev) && (ret == -EACCES)) return batch_obj; else return ERR_PTR(ret); @@ -1320,6 +1372,13 @@ eb_get_batch(struct eb_vmas *eb) return vma->obj; } +static inline bool use_cmdparser(const struct intel_engine_cs *ring, + u32 batch_len) +{ + return ring->requires_cmd_parser || + (ring->using_cmd_parser && batch_len && USES_PPGTT(ring->dev)); +} + static int i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_file *file, @@ -1349,6 +1408,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, dispatch_flags = 0; if (args->flags & I915_EXEC_SECURE) { + /* Return -EPERM to trigger fallback code on old binaries. */ + if (!HAS_SECURE_BATCHES(dev_priv)) + return -EPERM; + if (!file->is_master || !capable(CAP_SYS_ADMIN)) return -EPERM; @@ -1487,16 +1550,20 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, } params->args_batch_start_offset = args->batch_start_offset; - if (i915_needs_cmd_parser(ring) && args->batch_len) { + if (use_cmdparser(ring, args->batch_len)) { struct drm_i915_gem_object *parsed_batch_obj; - parsed_batch_obj = i915_gem_execbuffer_parse(ring, + u32 batch_off = args->batch_start_offset; + u32 batch_len = args->batch_len; + if (batch_len == 0) + batch_len = batch_obj->base.size - batch_off; + + parsed_batch_obj = i915_gem_execbuffer_parse(ctx, ring, &shadow_exec_entry, - eb, + eb, vm, batch_obj, - args->batch_start_offset, - args->batch_len, - file->is_master); + batch_off, + batch_len); if (IS_ERR(parsed_batch_obj)) { ret = PTR_ERR(parsed_batch_obj); goto err; @@ -1506,18 +1573,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, * parsed_batch_obj == batch_obj means batch not fully parsed: * Accept, but don't promote to secure. */ - if (parsed_batch_obj != batch_obj) { - /* - * Batch parsed and accepted: - * - * Set the DISPATCH_SECURE bit to remove the NON_SECURE - * bit from MI_BATCH_BUFFER_START commands issued in - * the dispatch_execbuffer implementations. We - * specifically don't want that set on batches the - * command parser has accepted. - */ - dispatch_flags |= I915_DISPATCH_SECURE; + if (CMDPARSER_USES_GGTT(dev_priv)) + dispatch_flags |= I915_DISPATCH_SECURE; params->args_batch_start_offset = 0; batch_obj = parsed_batch_obj; } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index b37fe0df743e..65a53ee398b8 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -119,7 +119,8 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt) (enable_ppgtt == 0 || !has_aliasing_ppgtt)) return 0; - if (enable_ppgtt == 1) + /* Full PPGTT is required by the Gen9 cmdparser */ + if (enable_ppgtt == 1 && INTEL_INFO(dev)->gen != 9) return 1; if (enable_ppgtt == 2 && has_full_ppgtt) @@ -152,7 +153,8 @@ static int ppgtt_bind_vma(struct i915_vma *vma, { u32 pte_flags = 0; - /* Currently applicable only to VLV */ + /* Applicable to VLV, and gen8+ */ + pte_flags = 0; if (vma->obj->gt_ro) pte_flags |= PTE_READ_ONLY; @@ -172,11 +174,14 @@ static void ppgtt_unbind_vma(struct i915_vma *vma) static gen8_pte_t gen8_pte_encode(dma_addr_t addr, enum i915_cache_level level, - bool valid) + bool valid, u32 flags) { gen8_pte_t pte = valid ? 
_PAGE_PRESENT | _PAGE_RW : 0; pte |= addr; + if (unlikely(flags & PTE_READ_ONLY)) + pte &= ~_PAGE_RW; + switch (level) { case I915_CACHE_NONE: pte |= PPAT_UNCACHED_INDEX; @@ -460,7 +465,7 @@ static void gen8_initialize_pt(struct i915_address_space *vm, gen8_pte_t scratch_pte; scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page), - I915_CACHE_LLC, true); + I915_CACHE_LLC, true, 0); fill_px(vm->dev, pt, scratch_pte); } @@ -757,8 +762,9 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); - gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page), - I915_CACHE_LLC, use_scratch); + gen8_pte_t scratch_pte = + gen8_pte_encode(px_dma(vm->scratch_page), + I915_CACHE_LLC, use_scratch, 0); if (!USES_FULL_48BIT_PPGTT(vm->dev)) { gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length, @@ -779,7 +785,8 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm, struct i915_page_directory_pointer *pdp, struct sg_page_iter *sg_iter, uint64_t start, - enum i915_cache_level cache_level) + enum i915_cache_level cache_level, + u32 flags) { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); @@ -799,7 +806,7 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm, pt_vaddr[pte] = gen8_pte_encode(sg_page_iter_dma_address(sg_iter), - cache_level, true); + cache_level, true, flags); if (++pte == GEN8_PTES) { kunmap_px(ppgtt, pt_vaddr); pt_vaddr = NULL; @@ -820,7 +827,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, struct sg_table *pages, uint64_t start, enum i915_cache_level cache_level, - u32 unused) + u32 flags) { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); @@ -830,7 +837,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, if (!USES_FULL_48BIT_PPGTT(vm->dev)) { gen8_ppgtt_insert_pte_entries(vm, &ppgtt->pdp, &sg_iter, start, - cache_level); + cache_level, flags); } else { struct i915_page_directory_pointer *pdp; uint64_t templ4, pml4e; @@ -838,7 +845,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, - start, cache_level); + start, cache_level, flags); } } } @@ -1447,7 +1454,7 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m) uint64_t start = ppgtt->base.start; uint64_t length = ppgtt->base.total; gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page), - I915_CACHE_LLC, true); + I915_CACHE_LLC, true, 0); if (!USES_FULL_48BIT_PPGTT(vm->dev)) { gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m); @@ -1515,6 +1522,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt) ppgtt->base.clear_range = gen8_ppgtt_clear_range; ppgtt->base.unbind_vma = ppgtt_unbind_vma; ppgtt->base.bind_vma = ppgtt_bind_vma; + + /* + * From bdw, there is support for read-only pages in the PPGTT. + * + * XXX GVT is not honouring the lack of RW in the PTE bits. 
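A worked example, with an illustrative address, of what the new flags argument changes in the encoded PTE:

/*
 * gen8_pte_encode(0x12345000, I915_CACHE_NONE, true, 0)
 *     -> 0x12345000 | _PAGE_PRESENT | _PAGE_RW | PPAT_UNCACHED_INDEX
 *
 * gen8_pte_encode(0x12345000, I915_CACHE_NONE, true, PTE_READ_ONLY)
 *     -> the same value with _PAGE_RW cleared, so a GPU write through this
 *        entry is refused rather than landing in an already-scanned shadow
 *        batch.
 */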
+ */ + ppgtt->base.has_read_only = !intel_vgpu_active(ppgtt->base.dev); + ppgtt->debug_dump = gen8_dump_ppgtt; if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { @@ -2343,7 +2358,7 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte) static void gen8_ggtt_insert_entries(struct i915_address_space *vm, struct sg_table *st, uint64_t start, - enum i915_cache_level level, u32 unused) + enum i915_cache_level level, u32 flags) { struct drm_i915_private *dev_priv = vm->dev->dev_private; unsigned first_entry = start >> PAGE_SHIFT; @@ -2357,7 +2372,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm, addr = sg_dma_address(sg_iter.sg) + (sg_iter.sg_pgoffset << PAGE_SHIFT); gen8_set_pte(&gtt_entries[i], - gen8_pte_encode(addr, level, true)); + gen8_pte_encode(addr, level, true, flags)); i++; } @@ -2370,7 +2385,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm, */ if (i != 0) WARN_ON(readq(&gtt_entries[i-1]) - != gen8_pte_encode(addr, level, true)); + != gen8_pte_encode(addr, level, true, flags)); /* This next bit makes the above posting read even more important. We * want to flush the TLBs only after we're certain all the PTE updates @@ -2444,7 +2459,7 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm, scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page), I915_CACHE_LLC, - use_scratch); + use_scratch, 0); for (i = 0; i < num_entries; i++) gen8_set_pte(&gtt_base[i], scratch_pte); readl(gtt_base); @@ -2510,7 +2525,8 @@ static int ggtt_bind_vma(struct i915_vma *vma, if (ret) return ret; - /* Currently applicable only to VLV */ + /* Applicable to VLV (gen8+ do not support RO in the GGTT) */ + pte_flags = 0; if (obj->gt_ro) pte_flags |= PTE_READ_ONLY; @@ -2653,6 +2669,9 @@ static int i915_gem_setup_global_gtt(struct drm_device *dev, i915_address_space_init(ggtt_vm, dev_priv); ggtt_vm->total += PAGE_SIZE; + /* Only VLV supports read-only GGTT mappings */ + ggtt_vm->has_read_only = IS_VALLEYVIEW(dev_priv); + if (intel_vgpu_active(dev)) { ret = intel_vgt_balloon(dev); if (ret) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index a216397ead52..d36f2d77576a 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -307,6 +307,9 @@ struct i915_address_space { */ struct list_head inactive_list; + /* Some systems support read-only mappings for GGTT and/or PPGTT */ + bool has_read_only:1; + /* FIXME: Need a more generic return type */ gen6_pte_t (*pte_encode)(dma_addr_t addr, enum i915_cache_level level, diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index cace154bbdc0..603d8cdfc5f1 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -170,6 +170,8 @@ #define ECOCHK_PPGTT_WT_HSW (0x2<<3) #define ECOCHK_PPGTT_WB_HSW (0x3<<3) +#define GEN8_RC6_CTX_INFO 0x8504 + #define GAC_ECO_BITS 0x14090 #define ECOBITS_SNB_BIT (1<<13) #define ECOBITS_PPGTT_CACHE64B (3<<8) @@ -511,6 +513,10 @@ */ #define BCS_SWCTRL 0x22200 +/* There are 16 GPR registers */ +#define BCS_GPR(n) (0x22600 + (n) * 8) +#define BCS_GPR_UDW(n) (0x22600 + (n) * 8 + 4) #define GPGPU_THREADS_DISPATCHED 0x2290 #define HS_INVOCATION_COUNT 0x2300 #define DS_INVOCATION_COUNT 0x2308 @@ -1567,6 +1573,7 @@ enum skl_disp_power_wells { #define RING_IMR(base) ((base)+0xa8) #define RING_HWSTAM(base) ((base)+0x98) #define RING_TIMESTAMP(base) ((base)+0x358) +#define RING_TIMESTAMP_UDW(base) ((base) + 0x358 + 4) #define TAIL_ADDR 0x001FFFF8 #define HEAD_WRAP_COUNT
0xFFE00000 #define HEAD_WRAP_ONE 0x00200000 @@ -5704,6 +5711,10 @@ enum skl_disp_power_wells { #define GAMMA_MODE_MODE_12BIT (2 << 0) #define GAMMA_MODE_MODE_SPLIT (3 << 0) +/* Display Internal Timeout Register */ +#define RM_TIMEOUT 0x42060 +#define MMIO_TIMEOUT_US(us) ((us) << 0) + /* interrupts */ #define DE_MASTER_IRQ_CONTROL (1 << 31) #define DE_SPRITEB_FLIP_DONE (1 << 29) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 4f5d07bb3511..a9166ff48a26 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -10747,6 +10747,10 @@ void intel_mark_busy(struct drm_device *dev) return; intel_runtime_pm_get(dev_priv); + + if (NEEDS_RC6_CTX_CORRUPTION_WA(dev_priv)) + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); + i915_update_gfx_val(dev_priv); if (INTEL_INFO(dev)->gen >= 6) gen6_rps_busy(dev_priv); @@ -10765,6 +10769,11 @@ void intel_mark_idle(struct drm_device *dev) if (INTEL_INFO(dev)->gen >= 6) gen6_rps_idle(dev->dev_private); + if (NEEDS_RC6_CTX_CORRUPTION_WA(dev_priv)) { + i915_rc6_ctx_wa_check(dev_priv); + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); + } + intel_runtime_pm_put(dev_priv); } diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 722aa159cd28..78503e481313 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -1410,6 +1410,9 @@ void intel_enable_gt_powersave(struct drm_device *dev); void intel_disable_gt_powersave(struct drm_device *dev); void intel_suspend_gt_powersave(struct drm_device *dev); void intel_reset_gt_powersave(struct drm_device *dev); +bool i915_rc6_ctx_wa_check(struct drm_i915_private *i915); +void i915_rc6_ctx_wa_suspend(struct drm_i915_private *i915); +void i915_rc6_ctx_wa_resume(struct drm_i915_private *i915); void gen6_update_ring_freq(struct drm_device *dev); void gen6_rps_busy(struct drm_i915_private *dev_priv); void gen6_rps_reset_ei(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index fd4690ed93c0..81bd84f9156b 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -66,6 +66,14 @@ static void bxt_init_clock_gating(struct drm_device *dev) */ I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) | GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ); + + /* + * Lower the display internal timeout. + * This is needed to avoid any hard hangs when DSI port PLL + * is off and a MMIO access is attempted by any privilege + * application, using batch buffers or any other means. 
+ */ + I915_WRITE(RM_TIMEOUT, MMIO_TIMEOUT_US(950)); } static void i915_pineview_get_mem_freq(struct drm_device *dev) @@ -4591,30 +4599,42 @@ void intel_set_rps(struct drm_device *dev, u8 val) gen6_set_rps(dev, val); } -static void gen9_disable_rps(struct drm_device *dev) +static void gen9_disable_rc6(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; I915_WRITE(GEN6_RC_CONTROL, 0); +} + +static void gen9_disable_rps(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + I915_WRITE(GEN9_PG_ENABLE, 0); } +static void gen6_disable_rc6(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + + I915_WRITE(GEN6_RC_CONTROL, 0); +} + static void gen6_disable_rps(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; - I915_WRITE(GEN6_RC_CONTROL, 0); I915_WRITE(GEN6_RPNSWREQ, 1 << 31); } -static void cherryview_disable_rps(struct drm_device *dev) +static void cherryview_disable_rc6(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; I915_WRITE(GEN6_RC_CONTROL, 0); } -static void valleyview_disable_rps(struct drm_device *dev) +static void valleyview_disable_rc6(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; @@ -4818,7 +4838,8 @@ static void gen9_enable_rc6(struct drm_device *dev) I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 25); /* 3a: Enable RC6 */ - if (intel_enable_rc6(dev) & INTEL_RC6_ENABLE) + if (!dev_priv->rps.ctx_corrupted && + intel_enable_rc6(dev) & INTEL_RC6_ENABLE) rc6_mask = GEN6_RC_CTL_RC6_ENABLE; DRM_INFO("RC6 %s\n", (rc6_mask & GEN6_RC_CTL_RC6_ENABLE) ? "on" : "off"); @@ -4841,7 +4862,7 @@ static void gen9_enable_rc6(struct drm_device *dev) * WaRsDisableCoarsePowerGating:skl,bxt - Render/Media PG need to be disabled with RC6. */ if ((IS_BROXTON(dev) && (INTEL_REVID(dev) < BXT_REVID_B0)) || - ((IS_SKL_GT3(dev) || IS_SKL_GT4(dev)) && (INTEL_REVID(dev) <= SKL_REVID_F0))) + INTEL_INFO(dev)->gen == 9) I915_WRITE(GEN9_PG_ENABLE, 0); else I915_WRITE(GEN9_PG_ENABLE, (rc6_mask & GEN6_RC_CTL_RC6_ENABLE) ? @@ -4884,7 +4905,8 @@ static void gen8_enable_rps(struct drm_device *dev) I915_WRITE(GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */ /* 3: Enable RC6 */ - if (intel_enable_rc6(dev) & INTEL_RC6_ENABLE) + if (!dev_priv->rps.ctx_corrupted && + intel_enable_rc6(dev) & INTEL_RC6_ENABLE) rc6_mask = GEN6_RC_CTL_RC6_ENABLE; intel_print_rc6_info(dev, rc6_mask); if (IS_BROADWELL(dev)) @@ -6128,10 +6150,101 @@ static void intel_init_emon(struct drm_device *dev) dev_priv->ips.corr = (lcfuse & LCFUSE_HIV_MASK); } +static bool i915_rc6_ctx_corrupted(struct drm_i915_private *dev_priv) +{ + return !I915_READ(GEN8_RC6_CTX_INFO); +} + +static void i915_rc6_ctx_wa_init(struct drm_i915_private *i915) +{ + if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915)) + return; + + if (i915_rc6_ctx_corrupted(i915)) { + DRM_INFO("RC6 context corrupted, disabling runtime power management\n"); + i915->rps.ctx_corrupted = true; + intel_runtime_pm_get(i915); + } +} + +static void i915_rc6_ctx_wa_cleanup(struct drm_i915_private *i915) +{ + if (i915->rps.ctx_corrupted) { + intel_runtime_pm_put(i915); + i915->rps.ctx_corrupted = false; + } +} + +/** + * i915_rc6_ctx_wa_suspend - system suspend sequence for the RC6 CTX WA + * @i915: i915 device + * + * Perform any steps needed to clean up the RC6 CTX WA before system suspend. 
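Taken together, the RC6 context corruption workaround added here (gated by NEEDS_RC6_CTX_CORRUPTION_WA, i.e. Broadwell and Gen9) hangs off the following entry points; this summarises the code in this patch rather than adding behaviour:

/*
 * driver load     i915_rc6_ctx_wa_init()    GEN8_RC6_CTX_INFO reading 0 means
 *                                           the RC6 context is corrupted; hold
 *                                           a runtime PM reference so RC6 and
 *                                           runtime suspend stay off.
 * GPU goes idle   i915_rc6_ctx_wa_check()   detect a corruption that happened
 *                                           at run time, disable RC6 and take
 *                                           a noresume wakeref.
 * system suspend  i915_rc6_ctx_wa_suspend() drop that wakeref for S3/S4.
 * system resume   i915_rc6_ctx_wa_resume()  re-read the register; re-enable
 *                                           runtime PM only if the context is
 *                                           valid again, otherwise re-take the
 *                                           reference.
 * driver unload   i915_rc6_ctx_wa_cleanup() release any reference still held.
 */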
+ */ +void i915_rc6_ctx_wa_suspend(struct drm_i915_private *i915) +{ + if (i915->rps.ctx_corrupted) + intel_runtime_pm_put(i915); +} + +/** + * i915_rc6_ctx_wa_resume - system resume sequence for the RC6 CTX WA + * @i915: i915 device + * + * Perform any steps needed to re-init the RC6 CTX WA after system resume. + */ +void i915_rc6_ctx_wa_resume(struct drm_i915_private *i915) +{ + if (!i915->rps.ctx_corrupted) + return; + + if (i915_rc6_ctx_corrupted(i915)) { + intel_runtime_pm_get(i915); + return; + } + + DRM_INFO("RC6 context restored, re-enabling runtime power management\n"); + i915->rps.ctx_corrupted = false; +} + +static void intel_disable_rc6(struct drm_device *dev); + +/** + * i915_rc6_ctx_wa_check - check for a new RC6 CTX corruption + * @i915: i915 device + * + * Check if an RC6 CTX corruption has happened since the last check and if so + * disable RC6 and runtime power management. + * + * Return false if no context corruption has happened since the last call of + * this function, true otherwise. +*/ +bool i915_rc6_ctx_wa_check(struct drm_i915_private *i915) +{ + if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915)) + return false; + + if (i915->rps.ctx_corrupted) + return false; + + if (!i915_rc6_ctx_corrupted(i915)) + return false; + + DRM_NOTE("RC6 context corruption, disabling runtime power management\n"); + + intel_disable_rc6(i915->dev); + i915->rps.ctx_corrupted = true; + intel_runtime_pm_get_noresume(i915); + + return true; +} + void intel_init_gt_powersave(struct drm_device *dev) { i915.enable_rc6 = sanitize_rc6_option(dev, i915.enable_rc6); + i915_rc6_ctx_wa_init(to_i915(dev)); + if (IS_CHERRYVIEW(dev)) cherryview_init_gt_powersave(dev); else if (IS_VALLEYVIEW(dev)) @@ -6144,6 +6257,8 @@ void intel_cleanup_gt_powersave(struct drm_device *dev) return; else if (IS_VALLEYVIEW(dev)) valleyview_cleanup_gt_powersave(dev); + + i915_rc6_ctx_wa_cleanup(to_i915(dev)); } static void gen6_suspend_rps(struct drm_device *dev) @@ -6176,6 +6291,38 @@ void intel_suspend_gt_powersave(struct drm_device *dev) gen6_rps_idle(dev_priv); } +static void __intel_disable_rc6(struct drm_device *dev) +{ + if (INTEL_INFO(dev)->gen >= 9) + gen9_disable_rc6(dev); + else if (IS_CHERRYVIEW(dev)) + cherryview_disable_rc6(dev); + else if (IS_VALLEYVIEW(dev)) + valleyview_disable_rc6(dev); + else + gen6_disable_rc6(dev); +} + +static void intel_disable_rc6(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = to_i915(dev); + + mutex_lock(&dev_priv->rps.hw_lock); + __intel_disable_rc6(dev); + mutex_unlock(&dev_priv->rps.hw_lock); +} + +static void intel_disable_rps(struct drm_device *dev) +{ + if (IS_CHERRYVIEW(dev) || IS_VALLEYVIEW(dev)) + return; + + if (INTEL_INFO(dev)->gen >= 9) + gen9_disable_rps(dev); + else + gen6_disable_rps(dev); +} + void intel_disable_gt_powersave(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; @@ -6186,16 +6333,12 @@ void intel_disable_gt_powersave(struct drm_device *dev) intel_suspend_gt_powersave(dev); mutex_lock(&dev_priv->rps.hw_lock); - if (INTEL_INFO(dev)->gen >= 9) - gen9_disable_rps(dev); - else if (IS_CHERRYVIEW(dev)) - cherryview_disable_rps(dev); - else if (IS_VALLEYVIEW(dev)) - valleyview_disable_rps(dev); - else - gen6_disable_rps(dev); + + __intel_disable_rc6(dev); + intel_disable_rps(dev); dev_priv->rps.enabled = false; + mutex_unlock(&dev_priv->rps.hw_lock); } } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 9d48443bca2e..df6547f60a5c 100644 --- 
a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2058,6 +2058,8 @@ static void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf) static int intel_alloc_ringbuffer_obj(struct drm_device *dev, struct intel_ringbuffer *ringbuf) { + struct drm_i915_private *dev_priv = to_i915(dev); + struct i915_address_space *vm = &dev_priv->gtt.base; struct drm_i915_gem_object *obj; obj = NULL; @@ -2068,8 +2070,12 @@ static int intel_alloc_ringbuffer_obj(struct drm_device *dev, if (obj == NULL) return -ENOMEM; - /* mark ring buffers as read-only from GPU side by default */ - obj->gt_ro = 1; + /* + * Mark ring buffers as read-only from GPU side (so no stray overwrites) + * if supported by the platform's GGTT. + */ + if (vm->has_read_only) + obj->gt_ro = 1; ringbuf->obj = obj; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 49fa41dc0eb6..56c872b89a92 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -314,7 +314,8 @@ struct intel_engine_cs { volatile u32 *cpu_page; } scratch; - bool needs_cmd_parser; + bool using_cmd_parser; + bool requires_cmd_parser; /* * Table of commands the command parser needs to know about diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c index 892d0a71d766..57724a3afe78 100644 --- a/drivers/gpu/drm/radeon/si_dpm.c +++ b/drivers/gpu/drm/radeon/si_dpm.c @@ -1956,6 +1956,7 @@ static void si_initialize_powertune_defaults(struct radeon_device *rdev) case 0x682C: si_pi->cac_weights = cac_weights_cape_verde_pro; si_pi->dte_data = dte_data_sun_xt; + update_dte_from_pl2 = true; break; case 0x6825: case 0x6827: diff --git a/drivers/iio/imu/adis16480.c b/drivers/iio/imu/adis16480.c index 1880105cc8c4..778a46247f8d 100644 --- a/drivers/iio/imu/adis16480.c +++ b/drivers/iio/imu/adis16480.c @@ -266,8 +266,11 @@ static int adis16480_set_freq(struct iio_dev *indio_dev, int val, int val2) struct adis16480 *st = iio_priv(indio_dev); unsigned int t; + if (val < 0 || val2 < 0) + return -EINVAL; + t = val * 1000 + val2 / 1000; - if (t <= 0) + if (t == 0) return -EINVAL; t = 2460000 / t; diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 1bf4f54c2bef..e31b4c7d2522 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1719,7 +1719,8 @@ err_detach: slave_disable_netpoll(new_slave); err_close: - slave_dev->priv_flags &= ~IFF_BONDING; + if (!netif_is_bond_master(slave_dev)) + slave_dev->priv_flags &= ~IFF_BONDING; dev_close(slave_dev); err_restore_mac: @@ -1915,7 +1916,8 @@ static int __bond_release_one(struct net_device *bond_dev, dev_set_mtu(slave_dev, slave->original_mtu); - slave_dev->priv_flags &= ~IFF_BONDING; + if (!netif_is_bond_master(slave_dev)) + slave_dev->priv_flags &= ~IFF_BONDING; bond_free_slave(slave); diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c index e3dccd3200d5..7d35f6737499 100644 --- a/drivers/net/can/c_can/c_can.c +++ b/drivers/net/can/c_can/c_can.c @@ -97,6 +97,9 @@ #define BTR_TSEG2_SHIFT 12 #define BTR_TSEG2_MASK (0x7 << BTR_TSEG2_SHIFT) +/* interrupt register */ +#define INT_STS_PENDING 0x8000 + /* brp extension register */ #define BRP_EXT_BRPE_MASK 0x0f #define BRP_EXT_BRPE_SHIFT 0 @@ -1029,10 +1032,16 @@ static int c_can_poll(struct napi_struct *napi, int quota) u16 curr, last = priv->last_status; int work_done = 0; - priv->last_status = curr = priv->read_reg(priv, C_CAN_STS_REG); - /* Ack status on 
C_CAN. D_CAN is self clearing */ - if (priv->type != BOSCH_D_CAN) - priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED); + /* Only read the status register if a status interrupt was pending */ + if (atomic_xchg(&priv->sie_pending, 0)) { + priv->last_status = curr = priv->read_reg(priv, C_CAN_STS_REG); + /* Ack status on C_CAN. D_CAN is self clearing */ + if (priv->type != BOSCH_D_CAN) + priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED); + } else { + /* no change detected ... */ + curr = last; + } /* handle state changes */ if ((curr & STATUS_EWARN) && (!(last & STATUS_EWARN))) { @@ -1083,10 +1092,16 @@ static irqreturn_t c_can_isr(int irq, void *dev_id) { struct net_device *dev = (struct net_device *)dev_id; struct c_can_priv *priv = netdev_priv(dev); + int reg_int; - if (!priv->read_reg(priv, C_CAN_INT_REG)) + reg_int = priv->read_reg(priv, C_CAN_INT_REG); + if (!reg_int) return IRQ_NONE; + /* save for later use */ + if (reg_int & INT_STS_PENDING) + atomic_set(&priv->sie_pending, 1); + /* disable all interrupts and schedule the NAPI */ c_can_irq_control(priv, false); napi_schedule(&priv->napi); diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h index 8acdc7fa4792..d5567a7c1c6d 100644 --- a/drivers/net/can/c_can/c_can.h +++ b/drivers/net/can/c_can/c_can.h @@ -198,6 +198,7 @@ struct c_can_priv { struct net_device *dev; struct device *device; atomic_t tx_active; + atomic_t sie_pending; unsigned long tx_dir; int last_status; u16 (*read_reg) (const struct c_can_priv *priv, enum reg index); diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c index baef09b9449f..6b866d0451b2 100644 --- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -923,6 +923,7 @@ static int flexcan_chip_start(struct net_device *dev) reg_mecr = flexcan_read(&regs->mecr); reg_mecr &= ~FLEXCAN_MECR_ECRWRDIS; flexcan_write(reg_mecr, &regs->mecr); + reg_mecr |= FLEXCAN_MECR_ECCDIS; reg_mecr &= ~(FLEXCAN_MECR_NCEFAFRZ | FLEXCAN_MECR_HANCEI_MSK | FLEXCAN_MECR_FANCEI_MSK); flexcan_write(reg_mecr, &regs->mecr); diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c index b227f81e4a7e..6982ab8777b7 100644 --- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -617,6 +617,7 @@ static int gs_can_open(struct net_device *netdev) rc); usb_unanchor_urb(urb); + usb_free_urb(urb); break; } diff --git a/drivers/net/can/usb/peak_usb/pcan_usb.c b/drivers/net/can/usb/peak_usb/pcan_usb.c index 838545ce468d..e626c2afbbb1 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb.c @@ -108,7 +108,7 @@ struct pcan_usb_msg_context { u8 *end; u8 rec_cnt; u8 rec_idx; - u8 rec_data_idx; + u8 rec_ts_idx; struct net_device *netdev; struct pcan_usb *pdev; }; @@ -552,10 +552,15 @@ static int pcan_usb_decode_status(struct pcan_usb_msg_context *mc, mc->ptr += PCAN_USB_CMD_ARGS; if (status_len & PCAN_USB_STATUSLEN_TIMESTAMP) { - int err = pcan_usb_decode_ts(mc, !mc->rec_idx); + int err = pcan_usb_decode_ts(mc, !mc->rec_ts_idx); if (err) return err; + + /* Next packet in the buffer will have a timestamp on a single + * byte + */ + mc->rec_ts_idx++; } switch (f) { @@ -638,10 +643,13 @@ static int pcan_usb_decode_data(struct pcan_usb_msg_context *mc, u8 status_len) cf->can_dlc = get_can_dlc(rec_len); - /* first data packet timestamp is a word */ - if (pcan_usb_decode_ts(mc, !mc->rec_ts_idx)) + /* Only first packet timestamp is a word */ + if (pcan_usb_decode_ts(mc, !mc->rec_ts_idx)) goto decode_failed; + /* Next packet in the buffer will have a
timestamp on a single byte */ + mc->rec_ts_idx++; + /* read data */ memset(cf->data, 0x0, sizeof(cf->data)); if (status_len & PCAN_USB_STATUSLEN_RTR) { @@ -695,7 +703,6 @@ static int pcan_usb_decode_msg(struct peak_usb_device *dev, u8 *ibuf, u32 lbuf) /* handle normal can frames here */ } else { err = pcan_usb_decode_data(&mc, sl); - mc.rec_data_idx++; } } diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.c b/drivers/net/can/usb/peak_usb/pcan_usb_core.c index b1d68f49b398..8c47cc8dc896 100644 --- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c +++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c @@ -776,7 +776,7 @@ static int peak_usb_create_dev(const struct peak_usb_adapter *peak_usb_adapter, dev = netdev_priv(netdev); /* allocate a buffer large enough to send commands */ - dev->cmd_buf = kmalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL); + dev->cmd_buf = kzalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL); if (!dev->cmd_buf) { err = -ENOMEM; goto lbl_free_candev; diff --git a/drivers/net/can/usb/usb_8dev.c b/drivers/net/can/usb/usb_8dev.c index 522286cc0f9c..50d9b945089e 100644 --- a/drivers/net/can/usb/usb_8dev.c +++ b/drivers/net/can/usb/usb_8dev.c @@ -1010,9 +1010,8 @@ static void usb_8dev_disconnect(struct usb_interface *intf) netdev_info(priv->netdev, "device disconnected\n"); unregister_netdev(priv->netdev); - free_candev(priv->netdev); - unlink_all_urbs(priv); + free_candev(priv->netdev); } } diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c index e8b7dc1bcfa6..2a7dfac20546 100644 --- a/drivers/net/ethernet/hisilicon/hip04_eth.c +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c @@ -950,7 +950,6 @@ static int hip04_remove(struct platform_device *pdev) hip04_free_ring(ndev, d); unregister_netdev(ndev); - free_irq(ndev->irq, ndev); of_node_put(priv->phy_node); cancel_work_sync(&priv->tx_timeout_task); free_netdev(ndev); diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c index d70b2e5d5222..cbb0bdf85177 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c +++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c @@ -628,6 +628,7 @@ static int e1000_set_ringparam(struct net_device *netdev, for (i = 0; i < adapter->num_rx_queues; i++) rxdr[i].count = rxdr->count; + err = 0; if (netif_running(adapter->netdev)) { /* Try to get new resources before deleting old */ err = e1000_setup_all_rx_resources(adapter); @@ -648,14 +649,13 @@ static int e1000_set_ringparam(struct net_device *netdev, adapter->rx_ring = rxdr; adapter->tx_ring = txdr; err = e1000_up(adapter); - if (err) - goto err_setup; } kfree(tx_old); kfree(rx_old); clear_bit(__E1000_RESETTING, &adapter->flags); - return 0; + return err; + err_setup_tx: e1000_free_all_rx_resources(adapter); err_setup_rx: @@ -667,7 +667,6 @@ err_alloc_rx: err_alloc_tx: if (netif_running(adapter->netdev)) e1000_up(adapter); -err_setup: clear_bit(__E1000_RESETTING, &adapter->flags); return err; } diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 70ed5e5c3514..9404f38d9d0d 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -1673,7 +1673,8 @@ static void igb_check_swap_media(struct igb_adapter *adapter) if ((hw->phy.media_type == e1000_media_type_copper) && (!(connsw & E1000_CONNSW_AUTOSENSE_EN))) { swap_now = true; - } else if (!(connsw & E1000_CONNSW_SERDESD)) { + } else if ((hw->phy.media_type != e1000_media_type_copper) && + !(connsw & 
E1000_CONNSW_SERDESD)) { /* copper signal takes time to appear */ if (adapter->copper_tries < 4) { adapter->copper_tries++; diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c index f4657a2e730a..8b63c9d183a2 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_main.c +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c @@ -1465,8 +1465,16 @@ enum qede_remove_mode { static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode) { struct net_device *ndev = pci_get_drvdata(pdev); - struct qede_dev *edev = netdev_priv(ndev); - struct qed_dev *cdev = edev->cdev; + struct qede_dev *edev; + struct qed_dev *cdev; + + if (!ndev) { + dev_info(&pdev->dev, "Device has already been removed\n"); + return; + } + + edev = netdev_priv(ndev); + cdev = edev->cdev; DP_INFO(edev, "Starting qede_remove\n"); diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index 442efbccd005..71ef895b4dca 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -533,8 +533,8 @@ static void cdc_ncm_set_dgram_size(struct usbnet *dev, int new_size) /* read current mtu value from device */ err = usbnet_read_cmd(dev, USB_CDC_GET_MAX_DATAGRAM_SIZE, USB_TYPE_CLASS | USB_DIR_IN | USB_RECIP_INTERFACE, - 0, iface_no, &max_datagram_size, 2); - if (err < 0) { + 0, iface_no, &max_datagram_size, sizeof(max_datagram_size)); + if (err < sizeof(max_datagram_size)) { dev_dbg(&dev->intf->dev, "GET_MAX_DATAGRAM_SIZE failed\n"); goto out; } @@ -545,7 +545,7 @@ static void cdc_ncm_set_dgram_size(struct usbnet *dev, int new_size) max_datagram_size = cpu_to_le16(ctx->max_datagram_size); err = usbnet_write_cmd(dev, USB_CDC_SET_MAX_DATAGRAM_SIZE, USB_TYPE_CLASS | USB_DIR_OUT | USB_RECIP_INTERFACE, - 0, iface_no, &max_datagram_size, 2); + 0, iface_no, &max_datagram_size, sizeof(max_datagram_size)); if (err < 0) dev_dbg(&dev->intf->dev, "SET_MAX_DATAGRAM_SIZE failed\n"); diff --git a/drivers/nfc/fdp/i2c.c b/drivers/nfc/fdp/i2c.c index 2d043415f326..dc2f3d5b2a36 100644 --- a/drivers/nfc/fdp/i2c.c +++ b/drivers/nfc/fdp/i2c.c @@ -278,7 +278,7 @@ static void fdp_nci_i2c_read_device_properties(struct device *dev, *fw_vsc_cfg, len); if (r) { - devm_kfree(dev, fw_vsc_cfg); + devm_kfree(dev, *fw_vsc_cfg); goto vsc_read_err; } } else { diff --git a/drivers/nfc/st21nfca/core.c b/drivers/nfc/st21nfca/core.c index dd8b150fbffa..1826cd330468 100644 --- a/drivers/nfc/st21nfca/core.c +++ b/drivers/nfc/st21nfca/core.c @@ -726,6 +726,7 @@ static int st21nfca_hci_complete_target_discovered(struct nfc_hci_dev *hdev, NFC_PROTO_FELICA_MASK; } else { kfree_skb(nfcid_skb); + nfcid_skb = NULL; /* P2P in type A */ r = nfc_hci_get_param(hdev, ST21NFCA_RF_READER_F_GATE, ST21NFCA_RF_READER_F_NFCID1, diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c index 30323114c53c..9865793b538a 100644 --- a/drivers/pci/host/pci-tegra.c +++ b/drivers/pci/host/pci-tegra.c @@ -586,12 +586,15 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0bf1, tegra_pcie_fixup_class); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e1c, tegra_pcie_fixup_class); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e1d, tegra_pcie_fixup_class); -/* Tegra PCIE requires relaxed ordering */ +/* Tegra20 and Tegra30 PCIE requires relaxed ordering */ static void tegra_pcie_relax_enable(struct pci_dev *dev) { pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN); } -DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, tegra_pcie_relax_enable); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 
0x0bf0, tegra_pcie_relax_enable); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 0x0bf1, tegra_pcie_relax_enable); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 0x0e1c, tegra_pcie_relax_enable); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 0x0e1d, tegra_pcie_relax_enable); static int tegra_pcie_setup(int nr, struct pci_sys_data *sys) { diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c index 193733e8c823..3a4613f9fb9f 100644 --- a/drivers/scsi/lpfc/lpfc_nportdisc.c +++ b/drivers/scsi/lpfc/lpfc_nportdisc.c @@ -759,9 +759,9 @@ lpfc_disc_set_adisc(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) if (!(vport->fc_flag & FC_PT2PT)) { /* Check config parameter use-adisc or FCP-2 */ - if ((vport->cfg_use_adisc && (vport->fc_flag & FC_RSCN_MODE)) || + if (vport->cfg_use_adisc && ((vport->fc_flag & FC_RSCN_MODE) || ((ndlp->nlp_fcp_info & NLP_FCP_2_DEVICE) && - (ndlp->nlp_type & NLP_FCP_TARGET))) { + (ndlp->nlp_type & NLP_FCP_TARGET)))) { spin_lock_irq(shost->host_lock); ndlp->nlp_flag |= NLP_NPR_ADISC; spin_unlock_irq(shost->host_lock); diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c index c26acde797f0..2d5375d67736 100644 --- a/drivers/scsi/qla2xxx/qla_bsg.c +++ b/drivers/scsi/qla2xxx/qla_bsg.c @@ -252,7 +252,7 @@ qla2x00_process_els(struct fc_bsg_job *bsg_job) srb_t *sp; const char *type; int req_sg_cnt, rsp_sg_cnt; - int rval = (DRIVER_ERROR << 16); + int rval = (DID_ERROR << 16); uint16_t nextlid = 0; if (bsg_job->request->msgcode == FC_BSG_RPT_ELS) { @@ -426,7 +426,7 @@ qla2x00_process_ct(struct fc_bsg_job *bsg_job) struct Scsi_Host *host = bsg_job->shost; scsi_qla_host_t *vha = shost_priv(host); struct qla_hw_data *ha = vha->hw; - int rval = (DRIVER_ERROR << 16); + int rval = (DID_ERROR << 16); int req_sg_cnt, rsp_sg_cnt; uint16_t loop_id; struct fc_port *fcport; @@ -1910,7 +1910,7 @@ qlafx00_mgmt_cmd(struct fc_bsg_job *bsg_job) struct Scsi_Host *host = bsg_job->shost; scsi_qla_host_t *vha = shost_priv(host); struct qla_hw_data *ha = vha->hw; - int rval = (DRIVER_ERROR << 16); + int rval = (DID_ERROR << 16); struct qla_mt_iocb_rqst_fx00 *piocb_rqst; srb_t *sp; int req_sg_cnt = 0, rsp_sg_cnt = 0; diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index ff5df33fc740..611a127f08d8 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -2962,6 +2962,10 @@ qla2x00_shutdown(struct pci_dev *pdev) /* Stop currently executing firmware. 
*/ qla2x00_try_to_stop_firmware(vha); + /* Disable timer */ + if (vha->timer_active) + qla2x00_stop_timer(vha); + /* Turn adapter off line */ vha->flags.online = 0; diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c index 126987c9da75..eb82ee0a3ba8 100644 --- a/drivers/usb/core/config.c +++ b/drivers/usb/core/config.c @@ -314,6 +314,11 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno, int inum, /* Validate the wMaxPacketSize field */ maxp = usb_endpoint_maxp(&endpoint->desc); + if (maxp == 0) { + dev_warn(ddev, "config %d interface %d altsetting %d endpoint 0x%X has wMaxPacketSize 0, skipping\n", + cfgno, inum, asnum, d->bEndpointAddress); + goto skip_to_next_endpoint_or_interface_descriptor; + } /* Find the highest legal maxpacket size for this endpoint */ i = 0; /* additional transactions per microframe */ diff --git a/drivers/usb/gadget/configfs.c b/drivers/usb/gadget/configfs.c index a31322ed9447..6b0c4187aae1 100644 --- a/drivers/usb/gadget/configfs.c +++ b/drivers/usb/gadget/configfs.c @@ -90,6 +90,8 @@ struct gadget_info { bool unbinding; char b_vendor_code; char qw_sign[OS_STRING_QW_SIGN_LEN]; + spinlock_t spinlock; + bool unbind; #ifdef CONFIG_USB_CONFIGFS_UEVENT bool connected; bool sw_connected; @@ -1289,6 +1291,7 @@ static int configfs_composite_bind(struct usb_gadget *gadget, int ret; /* the gi->lock is hold by the caller */ + gi->unbind = 0; cdev->gadget = gadget; set_gadget_data(gadget, cdev); ret = composite_dev_prepare(composite, cdev); @@ -1475,19 +1478,118 @@ static void configfs_composite_unbind(struct usb_gadget *gadget) { struct usb_composite_dev *cdev; struct gadget_info *gi; + unsigned long flags; /* the gi->lock is hold by the caller */ cdev = get_gadget_data(gadget); gi = container_of(cdev, struct gadget_info, cdev); + spin_lock_irqsave(&gi->spinlock, flags); + gi->unbind = 1; + spin_unlock_irqrestore(&gi->spinlock, flags); kfree(otg_desc[0]); otg_desc[0] = NULL; purge_configs_funcs(gi); composite_dev_cleanup(cdev); usb_ep_autoconfig_reset(cdev->gadget); + spin_lock_irqsave(&gi->spinlock, flags); cdev->gadget = NULL; set_gadget_data(gadget, NULL); + spin_unlock_irqrestore(&gi->spinlock, flags); +} + +#if !IS_ENABLED(CONFIG_USB_CONFIGFS_UEVENT) +static int configfs_composite_setup(struct usb_gadget *gadget, + const struct usb_ctrlrequest *ctrl) +{ + struct usb_composite_dev *cdev; + struct gadget_info *gi; + unsigned long flags; + int ret; + + cdev = get_gadget_data(gadget); + if (!cdev) + return 0; + + gi = container_of(cdev, struct gadget_info, cdev); + spin_lock_irqsave(&gi->spinlock, flags); + cdev = get_gadget_data(gadget); + if (!cdev || gi->unbind) { + spin_unlock_irqrestore(&gi->spinlock, flags); + return 0; + } + + ret = composite_setup(gadget, ctrl); + spin_unlock_irqrestore(&gi->spinlock, flags); + return ret; +} + +static void configfs_composite_disconnect(struct usb_gadget *gadget) +{ + struct usb_composite_dev *cdev; + struct gadget_info *gi; + unsigned long flags; + + cdev = get_gadget_data(gadget); + if (!cdev) + return; + + gi = container_of(cdev, struct gadget_info, cdev); + spin_lock_irqsave(&gi->spinlock, flags); + cdev = get_gadget_data(gadget); + if (!cdev || gi->unbind) { + spin_unlock_irqrestore(&gi->spinlock, flags); + return; + } + + composite_disconnect(gadget); + spin_unlock_irqrestore(&gi->spinlock, flags); +} +#endif + +static void configfs_composite_suspend(struct usb_gadget *gadget) +{ + struct usb_composite_dev *cdev; + struct gadget_info *gi; + unsigned long flags; + + cdev = get_gadget_data(gadget); 
+ if (!cdev) + return; + + gi = container_of(cdev, struct gadget_info, cdev); + spin_lock_irqsave(&gi->spinlock, flags); + cdev = get_gadget_data(gadget); + if (!cdev || gi->unbind) { + spin_unlock_irqrestore(&gi->spinlock, flags); + return; + } + + composite_suspend(gadget); + spin_unlock_irqrestore(&gi->spinlock, flags); +} + +static void configfs_composite_resume(struct usb_gadget *gadget) +{ + struct usb_composite_dev *cdev; + struct gadget_info *gi; + unsigned long flags; + + cdev = get_gadget_data(gadget); + if (!cdev) + return; + + gi = container_of(cdev, struct gadget_info, cdev); + spin_lock_irqsave(&gi->spinlock, flags); + cdev = get_gadget_data(gadget); + if (!cdev || gi->unbind) { + spin_unlock_irqrestore(&gi->spinlock, flags); + return; + } + + composite_resume(gadget); + spin_unlock_irqrestore(&gi->spinlock, flags); } #ifdef CONFIG_USB_CONFIGFS_UEVENT @@ -1579,12 +1681,12 @@ static const struct usb_gadget_driver configfs_driver_template = { .reset = android_disconnect, .disconnect = android_disconnect, #else - .setup = composite_setup, - .reset = composite_disconnect, - .disconnect = composite_disconnect, + .setup = configfs_composite_setup, + .reset = configfs_composite_disconnect, + .disconnect = configfs_composite_disconnect, #endif - .suspend = composite_suspend, - .resume = composite_resume, + .suspend = configfs_composite_suspend, + .resume = configfs_composite_resume, .max_speed = USB_SPEED_SUPER, .driver = { @@ -1714,6 +1816,7 @@ static struct config_group *gadgets_make( mutex_init(&gi->lock); INIT_LIST_HEAD(&gi->string_list); INIT_LIST_HEAD(&gi->available_func); + spin_lock_init(&gi->spinlock); composite_init_dev(&gi->cdev); gi->cdev.desc.bLength = USB_DT_DEVICE_SIZE; diff --git a/drivers/usb/gadget/udc/atmel_usba_udc.c b/drivers/usb/gadget/udc/atmel_usba_udc.c index 585cb8734f50..668ac5e8681b 100644 --- a/drivers/usb/gadget/udc/atmel_usba_udc.c +++ b/drivers/usb/gadget/udc/atmel_usba_udc.c @@ -403,9 +403,11 @@ static void submit_request(struct usba_ep *ep, struct usba_request *req) next_fifo_transaction(ep, req); if (req->last_transaction) { usba_ep_writel(ep, CTL_DIS, USBA_TX_PK_RDY); - usba_ep_writel(ep, CTL_ENB, USBA_TX_COMPLETE); + if (ep_is_control(ep)) + usba_ep_writel(ep, CTL_ENB, USBA_TX_COMPLETE); } else { - usba_ep_writel(ep, CTL_DIS, USBA_TX_COMPLETE); + if (ep_is_control(ep)) + usba_ep_writel(ep, CTL_DIS, USBA_TX_COMPLETE); usba_ep_writel(ep, CTL_ENB, USBA_TX_PK_RDY); } } diff --git a/drivers/usb/gadget/udc/fsl_udc_core.c b/drivers/usb/gadget/udc/fsl_udc_core.c index 8991a4070792..bd98557caa28 100644 --- a/drivers/usb/gadget/udc/fsl_udc_core.c +++ b/drivers/usb/gadget/udc/fsl_udc_core.c @@ -2570,7 +2570,7 @@ static int fsl_udc_remove(struct platform_device *pdev) dma_pool_destroy(udc_controller->td_pool); free_irq(udc_controller->irq, udc_controller); iounmap(dr_regs); - if (pdata->operating_mode == FSL_USB2_DR_DEVICE) + if (res && (pdata->operating_mode == FSL_USB2_DR_DEVICE)) release_mem_region(res->start, resource_size(res)); /* free udc --wait for the release() finished */ diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c index 4d68a1e9e878..3476d02967f7 100644 --- a/drivers/usb/usbip/vhci_hcd.c +++ b/drivers/usb/usbip/vhci_hcd.c @@ -303,6 +303,7 @@ static int vhci_hub_control(struct usb_hcd *hcd, u16 typeReq, u16 wValue, default: break; } + break; default: usbip_dbg_vhci_rh(" ClearPortFeature: default %x\n", wValue); diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 72e914de473e..81754c33c3a9 100644 --- 
a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -21,6 +21,14 @@ #include "vhost.h" #define VHOST_VSOCK_DEFAULT_HOST_CID 2 +/* Max number of bytes transferred before requeueing the job. + * Using this limit prevents one virtqueue from starving others. */ +#define VHOST_VSOCK_WEIGHT 0x80000 +/* Max number of packets transferred before requeueing the job. + * Using this limit prevents one virtqueue from starving others with + * small pkts. + */ +#define VHOST_VSOCK_PKT_WEIGHT 256 enum { VHOST_VSOCK_FEATURES = VHOST_FEATURES, @@ -529,7 +537,8 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file) vsock->vqs[VSOCK_VQ_TX].handle_kick = vhost_vsock_handle_tx_kick; vsock->vqs[VSOCK_VQ_RX].handle_kick = vhost_vsock_handle_rx_kick; - vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs)); + vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs), + VHOST_VSOCK_PKT_WEIGHT, VHOST_VSOCK_WEIGHT); file->private_data = vsock; spin_lock_init(&vsock->send_pkt_list_lock); diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index e137ff6cd9da..aa4df4a02252 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -926,6 +926,11 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release) dout("__ceph_remove_cap %p from %p\n", cap, &ci->vfs_inode); + /* remove from inode's cap rbtree, and clear auth cap */ + rb_erase(&cap->ci_node, &ci->i_caps); + if (ci->i_auth_cap == cap) + ci->i_auth_cap = NULL; + /* remove from session list */ spin_lock(&session->s_cap_lock); if (session->s_cap_iterator == cap) { @@ -961,11 +966,6 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release) spin_unlock(&session->s_cap_lock); - /* remove from inode list */ - rb_erase(&cap->ci_node, &ci->i_caps); - if (ci->i_auth_cap == cap) - ci->i_auth_cap = NULL; - if (removed) ceph_put_cap(mdsc, cap); diff --git a/fs/configfs/symlink.c b/fs/configfs/symlink.c index 66e8c5d58b21..3af565e8fd51 100644 --- a/fs/configfs/symlink.c +++ b/fs/configfs/symlink.c @@ -157,11 +157,42 @@ int configfs_symlink(struct inode *dir, struct dentry *dentry, const char *symna !type->ct_item_ops->allow_link) goto out_put; + /* + * This is really sick. What they wanted was a hybrid of + * link(2) and symlink(2) - they wanted the target resolved + * at syscall time (as link(2) would've done), be a directory + * (which link(2) would've refused to do) *AND* be a deep + * fucking magic, making the target busy from rmdir POV. + * symlink(2) is nothing of that sort, and the locking it + * gets matches the normal symlink(2) semantics. Without + * attempts to resolve the target (which might very well + * not even exist yet) done prior to locking the parent + * directory. This perversion, OTOH, needs to resolve + * the target, which would lead to obvious deadlocks if + * attempted with any directories locked. + * + * Unfortunately, that garbage is userland ABI and we should've + * said "no" back in 2005. Too late now, so we get to + * play very ugly games with locking. + * + * Try *ANYTHING* of that sort in new code, and you will + * really regret it. Just ask yourself - what could a BOFH + * do to me and do I want to find it out first-hand? + * + * AV, a thoroughly annoyed bastard. 
+ */ + inode_unlock(dir); ret = get_target(symname, &path, &target_item, dentry->d_sb); + inode_lock(dir); if (ret) goto out_put; - ret = type->ct_item_ops->allow_link(parent_item, target_item); + if (dentry->d_inode || d_unhashed(dentry)) + ret = -EEXIST; + else + ret = inode_permission(dir, MAY_WRITE | MAY_EXEC); + if (!ret) + ret = type->ct_item_ops->allow_link(parent_item, target_item); if (!ret) { mutex_lock(&configfs_symlink_mutex); ret = create_link(parent_item, target_item, dentry); diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index de1138e4b049..67fbe6837fdb 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -582,10 +582,13 @@ void wbc_attach_and_unlock_inode(struct writeback_control *wbc, spin_unlock(&inode->i_lock); /* - * A dying wb indicates that the memcg-blkcg mapping has changed - * and a new wb is already serving the memcg. Switch immediately. + * A dying wb indicates that either the blkcg associated with the + * memcg changed or the associated memcg is dying. In the first + * case, a replacement wb should already be available and we should + * refresh the wb immediately. In the second case, trying to + * refresh will keep failing. */ - if (unlikely(wb_dying(wbc->wb))) + if (unlikely(wb_dying(wbc->wb) && !css_is_dying(wbc->wb->memcg_css))) inode_switch_wbs(inode, wbc->wb_id); } diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 7af5eeabc80e..5dac3382405c 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -52,6 +52,16 @@ nfs4_is_valid_delegation(const struct nfs_delegation *delegation, return false; } +struct nfs_delegation *nfs4_get_valid_delegation(const struct inode *inode) +{ + struct nfs_delegation *delegation; + + delegation = rcu_dereference(NFS_I(inode)->delegation); + if (nfs4_is_valid_delegation(delegation, 0)) + return delegation; + return NULL; +} + static int nfs4_do_check_delegation(struct inode *inode, fmode_t flags, bool mark) { diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h index 333063e032f0..26a8af7bdca3 100644 --- a/fs/nfs/delegation.h +++ b/fs/nfs/delegation.h @@ -58,6 +58,7 @@ int nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs4_state int nfs4_lock_delegation_recall(struct file_lock *fl, struct nfs4_state *state, const nfs4_stateid *stateid); bool nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode, fmode_t flags); +struct nfs_delegation *nfs4_get_valid_delegation(const struct inode *inode); void nfs_mark_delegation_referenced(struct nfs_delegation *delegation); int nfs4_have_delegation(struct inode *inode, fmode_t flags); int nfs4_check_delegation(struct inode *inode, fmode_t flags); diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 900a62a9ad4e..08207001d475 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1243,8 +1243,6 @@ static int can_open_delegated(struct nfs_delegation *delegation, fmode_t fmode, return 0; if ((delegation->type & fmode) != fmode) return 0; - if (test_bit(NFS_DELEGATION_RETURNING, &delegation->flags)) - return 0; switch (claim) { case NFS4_OPEN_CLAIM_NULL: case NFS4_OPEN_CLAIM_FH: @@ -1473,7 +1471,6 @@ static void nfs4_return_incompatible_delegation(struct inode *inode, fmode_t fmo static struct nfs4_state *nfs4_try_open_cached(struct nfs4_opendata *opendata) { struct nfs4_state *state = opendata->state; - struct nfs_inode *nfsi = NFS_I(state->inode); struct nfs_delegation *delegation; int open_mode = opendata->o_arg.open_flags; fmode_t fmode = opendata->o_arg.fmode; @@ -1490,7 +1487,7 @@ static struct nfs4_state 
*nfs4_try_open_cached(struct nfs4_opendata *opendata) } spin_unlock(&state->owner->so_lock); rcu_read_lock(); - delegation = rcu_dereference(nfsi->delegation); + delegation = nfs4_get_valid_delegation(state->inode); if (!can_open_delegated(delegation, fmode, claim)) { rcu_read_unlock(); break; @@ -1981,7 +1978,7 @@ static void nfs4_open_prepare(struct rpc_task *task, void *calldata) if (can_open_cached(data->state, data->o_arg.fmode, data->o_arg.open_flags)) goto out_no_action; rcu_read_lock(); - delegation = rcu_dereference(NFS_I(data->state->inode)->delegation); + delegation = nfs4_get_valid_delegation(data->state->inode); if (can_open_delegated(delegation, data->o_arg.fmode, claim)) goto unlock_no_action; rcu_read_unlock(); diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 960f2750bc42..1945aa03a52a 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -65,6 +65,11 @@ extern ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *buf); extern ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_tsx_async_abort(struct device *dev, + struct device_attribute *attr, + char *buf); +extern ssize_t cpu_show_itlb_multihit(struct device *dev, + struct device_attribute *attr, char *buf); extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata, diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index a6cc576fd467..b0156f8a9ab7 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -880,6 +880,7 @@ struct netns_ipvs { struct delayed_work defense_work; /* Work handler */ int drop_rate; int drop_counter; + int old_secure_tcp; atomic_t dropentry; /* locks in ctl.c */ spinlock_t dropentry_lock; /* drop entry handling */ diff --git a/include/net/neighbour.h b/include/net/neighbour.h index f6017ddc4ded..1c0d07376125 100644 --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -425,8 +425,8 @@ static inline int neigh_event_send(struct neighbour *neigh, struct sk_buff *skb) { unsigned long now = jiffies; - if (neigh->used != now) - neigh->used = now; + if (READ_ONCE(neigh->used) != now) + WRITE_ONCE(neigh->used, now); if (!(neigh->nud_state&(NUD_CONNECTED|NUD_DELAY|NUD_PROBE))) return __neigh_event_send(neigh, skb); return 0; diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 4bd7508bedc9..b96df7499600 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -648,7 +648,8 @@ struct nft_expr_ops { */ struct nft_expr { const struct nft_expr_ops *ops; - unsigned char data[]; + unsigned char data[] + __attribute__((aligned(__alignof__(u64)))); }; static inline void *nft_expr_priv(const struct nft_expr *expr) diff --git a/include/net/sock.h b/include/net/sock.h index 026c48680b71..8eaba44fdc09 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2184,7 +2184,7 @@ static inline ktime_t sock_read_timestamp(struct sock *sk) return kt; #else - return sk->sk_stamp; + return READ_ONCE(sk->sk_stamp); #endif } @@ -2195,7 +2195,7 @@ static inline void sock_write_timestamp(struct sock *sk, ktime_t kt) sk->sk_stamp = kt; write_sequnlock(&sk->sk_stamp_seq); #else - sk->sk_stamp = kt; + WRITE_ONCE(sk->sk_stamp, kt); #endif } diff --git a/lib/dump_stack.c b/lib/dump_stack.c index c30d07e99dba..72de6444934d 100644 --- a/lib/dump_stack.c +++ b/lib/dump_stack.c @@ -44,7 +44,12 @@ retry: was_locked = 1; } else { local_irq_restore(flags); - cpu_relax(); + /* + * Wait for the lock to release before 
jumping to + * atomic_cmpxchg() in order to mitigate the thundering herd + * problem. + */ + do { cpu_relax(); } while (atomic_read(&dump_lock) != -1); goto retry; } diff --git a/mm/filemap.c b/mm/filemap.c index ea5c7a6e20d2..6f3c539d9e68 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -342,7 +342,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start, .range_end = end, }; - if (!mapping_cap_writeback_dirty(mapping)) + if (!mapping_cap_writeback_dirty(mapping) || + !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) return 0; wbc_attach_fdatawrite_inode(&wbc, mapping->host); diff --git a/mm/vmstat.c b/mm/vmstat.c index e297ed47aa58..3a07628297da 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1588,7 +1588,7 @@ static int __init setup_vmstat(void) #endif #ifdef CONFIG_PROC_FS proc_create("buddyinfo", S_IRUGO, NULL, &fragmentation_file_operations); - proc_create("pagetypeinfo", S_IRUGO, NULL, &pagetypeinfo_file_ops); + proc_create("pagetypeinfo", 0400, NULL, &pagetypeinfo_file_ops); proc_create("vmstat", S_IRUGO, NULL, &proc_vmstat_file_operations); proc_create("zoneinfo", S_IRUGO, NULL, &proc_zoneinfo_file_operations); #endif diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 54f3d7cb23e6..caa26184f7e3 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -1930,8 +1930,9 @@ ip_set_sockfn_get(struct sock *sk, int optval, void __user *user, int *len) } req_version->version = IPSET_PROTOCOL; - ret = copy_to_user(user, req_version, - sizeof(struct ip_set_req_version)); + if (copy_to_user(user, req_version, + sizeof(struct ip_set_req_version))) + ret = -EFAULT; goto done; } case IP_SET_OP_GET_BYNAME: { @@ -1988,7 +1989,8 @@ ip_set_sockfn_get(struct sock *sk, int optval, void __user *user, int *len) } /* end of switch(op) */ copy: - ret = copy_to_user(user, data, copylen); + if (copy_to_user(user, data, copylen)) + ret = -EFAULT; done: vfree(data); diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index e9a3e74567b7..79f4ffe7291a 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -97,7 +97,6 @@ static bool __ip_vs_addr_is_local_v6(struct net *net, static void update_defense_level(struct netns_ipvs *ipvs) { struct sysinfo i; - static int old_secure_tcp = 0; int availmem; int nomem; int to_change = -1; @@ -178,35 +177,35 @@ static void update_defense_level(struct netns_ipvs *ipvs) spin_lock(&ipvs->securetcp_lock); switch (ipvs->sysctl_secure_tcp) { case 0: - if (old_secure_tcp >= 2) + if (ipvs->old_secure_tcp >= 2) to_change = 0; break; case 1: if (nomem) { - if (old_secure_tcp < 2) + if (ipvs->old_secure_tcp < 2) to_change = 1; ipvs->sysctl_secure_tcp = 2; } else { - if (old_secure_tcp >= 2) + if (ipvs->old_secure_tcp >= 2) to_change = 0; } break; case 2: if (nomem) { - if (old_secure_tcp < 2) + if (ipvs->old_secure_tcp < 2) to_change = 1; } else { - if (old_secure_tcp >= 2) + if (ipvs->old_secure_tcp >= 2) to_change = 0; ipvs->sysctl_secure_tcp = 1; } break; case 3: - if (old_secure_tcp < 2) + if (ipvs->old_secure_tcp < 2) to_change = 1; break; } - old_secure_tcp = ipvs->sysctl_secure_tcp; + ipvs->old_secure_tcp = ipvs->sysctl_secure_tcp; if (to_change >= 0) ip_vs_protocol_timeout_change(ipvs, ipvs->sysctl_secure_tcp > 1); diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c index 80def98c9dba..04d4c388a7a8 100644 --- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -1066,7 +1066,6 @@ static int nfc_genl_llc_set_params(struct sk_buff *skb, struct 
genl_info *info) local = nfc_llcp_find_local(dev); if (!local) { - nfc_put_device(dev); rc = -ENODEV; goto exit; } @@ -1126,7 +1125,6 @@ static int nfc_genl_llc_sdreq(struct sk_buff *skb, struct genl_info *info) local = nfc_llcp_find_local(dev); if (!local) { - nfc_put_device(dev); rc = -ENODEV; goto exit; } diff --git a/sound/firewire/bebob/bebob_focusrite.c b/sound/firewire/bebob/bebob_focusrite.c index f11090057949..d0a8736613a1 100644 --- a/sound/firewire/bebob/bebob_focusrite.c +++ b/sound/firewire/bebob/bebob_focusrite.c @@ -28,6 +28,8 @@ #define SAFFIRE_CLOCK_SOURCE_SPDIF 1 /* clock sources as returned from register of Saffire Pro 10 and 26 */ +#define SAFFIREPRO_CLOCK_SOURCE_SELECT_MASK 0x000000ff +#define SAFFIREPRO_CLOCK_SOURCE_DETECT_MASK 0x0000ff00 #define SAFFIREPRO_CLOCK_SOURCE_INTERNAL 0 #define SAFFIREPRO_CLOCK_SOURCE_SKIP 1 /* never used on hardware */ #define SAFFIREPRO_CLOCK_SOURCE_SPDIF 2 @@ -190,6 +192,7 @@ saffirepro_both_clk_src_get(struct snd_bebob *bebob, unsigned int *id) map = saffirepro_clk_maps[1]; /* In a case that this driver cannot handle the value of register. */ + value &= SAFFIREPRO_CLOCK_SOURCE_SELECT_MASK; if (value >= SAFFIREPRO_CLOCK_SOURCE_COUNT || map[value] < 0) { err = -EIO; goto end; diff --git a/sound/pci/hda/patch_ca0132.c b/sound/pci/hda/patch_ca0132.c index c55c0131be0a..c0742ee11519 100644 --- a/sound/pci/hda/patch_ca0132.c +++ b/sound/pci/hda/patch_ca0132.c @@ -4440,7 +4440,7 @@ static void hp_callback(struct hda_codec *codec, struct hda_jack_callback *cb) /* Delay enabling the HP amp, to let the mic-detection * state machine run. */ - cancel_delayed_work_sync(&spec->unsol_hp_work); + cancel_delayed_work(&spec->unsol_hp_work); schedule_delayed_work(&spec->unsol_hp_work, msecs_to_jiffies(500)); tbl = snd_hda_jack_tbl_get(codec, cb->nid); if (tbl) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index f6720afa9f34..97ebd1d3646d 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -1080,7 +1080,7 @@ void hists__collapse_resort(struct hists *hists, struct ui_progress *prog) } } -static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b) +static int64_t hist_entry__sort(struct hist_entry *a, struct hist_entry *b) { struct perf_hpp_fmt *fmt; int64_t cmp = 0;