Commit graph

23986 commits

Author SHA1 Message Date
Greg Kroah-Hartman
5541782ce2 This is the 4.4.150 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlt33LMACgkQONu9yGCS
 aT7SIhAAp3gDjUe/MZRRbfxinJU1E157rp/wFsr/I1BmQ37+wxDvO4UrPthAC3p0
 /ZAUiFr8dyr0KLhUYiQmGSHckrhAMgw1V/r0nFg3YvWBT6bYY9kOmyJb8tnkRGLz
 zzr0jVQAgN78tfUD+8j41AnUhNrapXkvUT02ZboKji7QFGgh8Ll5iTurQ+NFDNG4
 Qpnb7+dCRZfjfUL0Dy4H0rSxJV4CUlS/6DqDMnr4uQRRCwbXogNloYQj2OHS2Kc2
 06P2qatJ99QTA6mlg0lIOjN7oywtUVTYrsbBN006UxPtn58swK2CDhMgA2oPXWRe
 44aSw1FPM58p+UyxAhDAVFRy231c6b2zTYILBXL4nGqLM+bIlme4K+JBEo8AyIXO
 1M/+MxCdTTx5CjZKp5hJVeQzCIPdDGAVrawSqrUmjtl5dEF6csWJ833gO/Sk1QXD
 DpBVKjSBl+prKtVaeRRg0ImJ2cAr+8TA0SpoVuFPwpMid41w3xKigUewmoqTV65i
 nPTIw44Fx8KCv/476H21lEbZKNRQkFf39IfpH4PptZOGcQ7+nSVA/GNdSAV8rj3c
 dcHkhkHiec1yK+jawEwsuDrpdsg0S43Qq2l57FEsqlLvOHGwzInCxZW0mStfKXkb
 4rj9ccc1bZhKtlJFVITzpLMy3EsjSsvAvGA3i+fWWHrD74idEgc=
 =5rdo
 -----END PGP SIGNATURE-----

Merge 4.4.150 into android-4.4

Changes in 4.4.150
	x86/speculation/l1tf: Exempt zeroed PTEs from inversion
	Linux 4.4.150

Change-Id: I2dfd6e160998ae2f55f3b7621df62e96a4511f7c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-08-18 11:35:52 +02:00
Sean Christopherson
4cdedeefa3 x86/speculation/l1tf: Exempt zeroed PTEs from inversion
commit f19f5c49bbc3ffcc9126cc245fc1b24cc29f4a37 upstream.

It turns out that we should *not* invert all not-present mappings,
because the all zeroes case is obviously special.

clear_page() does not undergo the XOR logic to invert the address bits,
i.e. PTE, PMD and PUD entries that have not been individually written
will have val=0 and so will trigger __pte_needs_invert(). As a result,
{pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones
(adjusted by the max PFN mask) instead of zero. A zeroed entry is ok
because the page at physical address 0 is reserved early in boot
specifically to mitigate L1TF, so explicitly exempt them from the
inversion when reading the PFN.

Manifested as an unexpected mprotect(..., PROT_NONE) failure when called
on a VMA that has VM_PFNMAP and was mmap'd to as something other than
PROT_NONE but never used. mprotect() sends the PROT_NONE request down
prot_none_walk(), which walks the PTEs to check the PFNs.
prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns
-EACCES because it thinks mprotect() is trying to adjust a high MMIO
address.

[ This is a very modified version of Sean's original patch, but all
  credit goes to Sean for doing this and also pointing out that
  sometimes the __pte_needs_invert() function only gets the protection
  bits, not the full eventual pte.  But zero remains special even in
  just protection bits, so that's ok.   - Linus ]

Fixes: f22cc87f6c1f ("x86/speculation/l1tf: Invert all not present mappings")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-18 10:45:38 +02:00
Greg Kroah-Hartman
f76bdbdd51 This is the 4.4.149 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlt3Gm4ACgkQONu9yGCS
 aT5yQBAAnvjt4kSAgBR774bBS4SM3OmRlpop7l/vx7S0oaAPkrQeqoyeLa210S/i
 4yUZiwhd6B8/Bd4dvD31ETOqmxjYfTsne6aZ4JuKwhzEdTZxXUEY82HcVwZ0Yvwy
 /p74whVpCx2W9Wb46/IGRpOZ+WgBKmr/GvCFVs9mU3mLTRdPn/BNOmTXHum7BCax
 QU/az9mLjx9yC5o+35QjLLeOpeunz8OiAN6h5E4bqkD2xd2Pl/5iLg+wGAwMlVUc
 3+cNZpZ18RSEU377wGf00b0PkOBG8ZeKrW7+HlpG8xW0avVuilHFbvpxtnEioAWe
 0CFAuhhZV7gXpC8pbP8hqnlCvJntKF0ybRRx/pt4iTmaqPZwn8VKc67k11FnlbTr
 unfOzqEkCCOJzZ4rg2FYZaPUqPFpcOPlXzD87mwHi3BfwPqdkTiyTtuiOkJTa30X
 Uom6q5GMuTVgz45+jKL4I+gtIrRO1DX/Quz1BVeEZZgOArLtAbKtB1qaJ78FsRqp
 fhwRRm5DHtlbn2kun/r4EP6+TYFw5l+GhVEPpZnzwH5HiBdl9/hSN3e+0H9Pais5
 EkLQSHJsPJXQHbEiIek18Lj3I/lblpoQP2DZjFfPfxBx3Og9EYF1BTC6u7LUw+0p
 9+KI+FkiRC+nv3sub0jhn/5k6F8PPsR2f9YVknop+AqFVnNiH3c=
 =0GGR
 -----END PGP SIGNATURE-----

Merge 4.4.149 into android-4.4

Changes in 4.4.149
	x86/mm: Disable ioremap free page handling on x86-PAE
	tcp: Fix missing range_truesize enlargement in the backport
	kasan: don't emit builtin calls when sanitization is off
	i2c: ismt: fix wrong device address when unmap the data buffer
	kbuild: verify that $DEPMOD is installed
	crypto: vmac - require a block cipher with 128-bit block size
	crypto: vmac - separate tfm and request context
	crypto: blkcipher - fix crash flushing dcache in error path
	crypto: ablkcipher - fix crash flushing dcache in error path
	ASoC: Intel: cht_bsw_max98090_ti: Fix jack initialization
	Bluetooth: hidp: buffer overflow in hidp_process_report
	ioremap: Update pgtable free interfaces with addr
	x86/mm: Add TLB purge to free pmd/pte page interfaces
	Linux 4.4.149

Change-Id: I1e23095dd229992359341bda5c05e9b5b59fec45
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-08-17 21:25:15 +02:00
Toshi Kani
5b9b4a8cca x86/mm: Add TLB purge to free pmd/pte page interfaces
commit 5e0fb5df2ee871b841f96f9cb6a7f2784e96aa4e upstream.

ioremap() calls pud_free_pmd_page() / pmd_free_pte_page() when it creates
a pud / pmd map.  The following preconditions are met at their entry.
 - All pte entries for a target pud/pmd address range have been cleared.
 - System-wide TLB purges have been peformed for a target pud/pmd address
   range.

The preconditions assure that there is no stale TLB entry for the range.
Speculation may not cache TLB entries since it requires all levels of page
entries, including ptes, to have P & A-bits set for an associated address.
However, speculation may cache pud/pmd entries (paging-structure caches)
when they have P-bit set.

Add a system-wide TLB purge (INVLPG) to a single page after clearing
pud/pmd entry's P-bit.

SDM 4.10.4.1, Operation that Invalidate TLBs and Paging-Structure Caches,
states that:
  INVLPG invalidates all paging-structure caches associated with the
  current PCID regardless of the liner addresses to which they correspond.

Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: cpandya@codeaurora.org
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Joerg Roedel <joro@8bytes.org>
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-4-toshi.kani@hpe.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-17 20:56:45 +02:00
Chintan Pandya
29f475cbff ioremap: Update pgtable free interfaces with addr
commit 785a19f9d1dd8a4ab2d0633be4656653bd3de1fc upstream.

The following kernel panic was observed on ARM64 platform due to a stale
TLB entry.

 1. ioremap with 4K size, a valid pte page table is set.
 2. iounmap it, its pte entry is set to 0.
 3. ioremap the same address with 2M size, update its pmd entry with
    a new value.
 4. CPU may hit an exception because the old pmd entry is still in TLB,
    which leads to a kernel panic.

Commit b6bdb7517c3d ("mm/vmalloc: add interfaces to free unmapped page
table") has addressed this panic by falling to pte mappings in the above
case on ARM64.

To support pmd mappings in all cases, TLB purge needs to be performed
in this case on ARM64.

Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
so that TLB purge can be added later in seprate patches.

[toshi.kani@hpe.com: merge changes, rewrite patch description]
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-17 20:56:45 +02:00
Toshi Kani
438604aa02 x86/mm: Disable ioremap free page handling on x86-PAE
commit f967db0b9ed44ec3057a28f3b28efc51df51b835 upstream.

ioremap() supports pmd mappings on x86-PAE.  However, kernel's pmd
tables are not shared among processes on x86-PAE.  Therefore, any
update to sync'd pmd entries need re-syncing.  Freeing a pte page
also leads to a vmalloc fault and hits the BUG_ON in vmalloc_sync_one().

Disable free page handling on x86-PAE.  pud_free_pmd_page() and
pmd_free_pte_page() simply return 0 if a given pud/pmd entry is present.
This assures that ioremap() does not update sync'd pmd entries at the
cost of falling back to pte mappings.

Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Reported-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: cpandya@codeaurora.org
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-2-toshi.kani@hpe.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-17 20:56:44 +02:00
Greg Kroah-Hartman
f057ff9377 This is the 4.4.148 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlt0SdMACgkQONu9yGCS
 aT4NLxAAovDVqFFejBk8M1nxAtQSqRzB2PMboc+l62clKa6BAJtWAsgPFjECgzEp
 edlDeUttliQoTB6S3GYYM82oj50myUKlGvlJRptRE3Gr1iYubdB/U2RDmwEzCxbC
 AEzu4tEv+Z23jaLGsuAIOg66faBTqqgVoKtp/TlKwl+Y/b6WzkI0gRzxWTBFnAlj
 AKuhmoc1JoS9JF/MQ4q02gYSQ0g1eTpr1gIU2GMow9pK9Rahk4Jdl4yRjNLUFDxd
 ojrBYCoElf90R3q+NvmZBbzxwanm2OgzeEBffhh647aB5kHEUd5h4z9w+sIoXmSq
 50uD59q62Umdpp2O125HH5KHeHbcTUCXXp3g1VY6A/+d9dTs9GZqo//vf6aJsxEb
 gixoYyNbIcqw1k0jhEEW2ah3F3j+ZHvNmLKPyV4U8h2Tw2K5QKzFu/fVnQw7Xfv6
 Gv0z1TQ4Y+w2bqpzDiDBO4sRgKOXVr3hzWa0jggW5AoKWTco/oIVkE+Rqmj65AfK
 DROqCMQq75K+pymrM8I3wTXRSxtSH9bO/iqCu2LiiaG+JAkvr0OIHEHgizxLtAFO
 ivpREPDsWhVAYUmnoCgJa8Za1GdJk1I9uvxoJY1TBL8gbcYc61yjjeJDYqLghuNT
 EhrvFvJ4r/fQ6BJ76+rO7FSJIl+Kov2Uf7CWql3Lzxps6/u5GNQ=
 =73dO
 -----END PGP SIGNATURE-----

Merge 4.4.148 into android-4.4

Changes in 4.4.148
	ext4: fix check to prevent initializing reserved inodes
	tpm: fix race condition in tpm_common_write()
	ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV
	fork: unconditionally clear stack on fork
	parisc: Enable CONFIG_MLONGCALLS by default
	parisc: Define mb() and add memory barriers to assembler unlock sequences
	xen/netfront: don't cache skb_shinfo()
	ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
	scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled
	root dentries need RCU-delayed freeing
	fix mntput/mntput race
	fix __legitimize_mnt()/mntput() race
	IB/core: Make testing MR flags for writability a static inline function
	IB/mlx4: Mark user MR as writable if actual virtual memory is writable
	IB/ocrdma: fix out of bounds access to local buffer
	ARM: dts: imx6sx: fix irq for pcie bridge
	x86/paravirt: Fix spectre-v2 mitigations for paravirt guests
	x86/speculation: Protect against userspace-userspace spectreRSB
	kprobes/x86: Fix %p uses in error messages
	x86/irqflags: Provide a declaration for native_save_fl
	x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT
	x86/mm: Move swap offset/type up in PTE to work around erratum
	x86/mm: Fix swap entry comment and macro
	mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1
	x86/speculation/l1tf: Change order of offset/type in swap entry
	x86/speculation/l1tf: Protect swap entries against L1TF
	x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation
	x86/speculation/l1tf: Make sure the first page is always reserved
	x86/speculation/l1tf: Add sysfs reporting for l1tf
	mm: Add vm_insert_pfn_prot()
	mm: fix cache mode tracking in vm_insert_mixed()
	x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings
	x86/speculation/l1tf: Limit swap file size to MAX_PA/2
	x86/bugs: Move the l1tf function and define pr_fmt properly
	x86/speculation/l1tf: Extend 64bit swap file size limit
	x86/cpufeatures: Add detection of L1D cache flush support.
	x86/speculation/l1tf: Protect PAE swap entries against L1TF
	x86/speculation/l1tf: Fix up pte->pfn conversion for PAE
	x86/speculation/l1tf: Invert all not present mappings
	x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert
	x86/mm/pat: Make set_memory_np() L1TF safe
	x86/mm/kmmio: Make the tracer robust against L1TF
	x86/speculation/l1tf: Fix up CPU feature flags
	x86/init: fix build with CONFIG_SWAP=n
	x86/speculation/l1tf: Unbreak !__HAVE_ARCH_PFN_MODIFY_ALLOWED architectures
	Linux 4.4.148

Change-Id: I83c857d9d9d74ee47e61d15eb411f276f057ba3d
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-08-15 18:20:41 +02:00
Vlastimil Babka
4b90ff885c x86/init: fix build with CONFIG_SWAP=n
commit 792adb90fa724ce07c0171cbc96b9215af4b1045 upstream.

The introduction of generic_max_swapfile_size and arch-specific versions has
broken linking on x86 with CONFIG_SWAP=n due to undefined reference to
'generic_max_swapfile_size'. Fix it by compiling the x86-specific
max_swapfile_size() only with CONFIG_SWAP=y.

Reported-by: Tomas Pruzina <pruzinat@gmail.com>
Fixes: 377eeaa8e11f ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:11 +02:00
Guenter Roeck
eb993211b9 x86/speculation/l1tf: Fix up CPU feature flags
In linux-4.4.y, the definition of X86_FEATURE_RETPOLINE and
X86_FEATURE_RETPOLINE_AMD is different from the upstream
definition. Result is an overlap with the newly introduced
X86_FEATURE_L1TF_PTEINV. Update RETPOLINE definitions to match
upstream definitions to improve alignment with upstream code.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:11 +02:00
Andi Kleen
6b06f36f07 x86/mm/kmmio: Make the tracer robust against L1TF
commit 1063711b57393c1999248cccb57bebfaf16739e7 upstream

The mmio tracer sets io mapping PTEs and PMDs to non present when enabled
without inverting the address bits, which makes the PTE entry vulnerable
for L1TF.

Make it use the right low level macros to actually invert the address bits
to protect against L1TF.

In principle this could be avoided because MMIO tracing is not likely to be
enabled on production machines, but the fix is straigt forward and for
consistency sake it's better to get rid of the open coded PTE manipulation.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:11 +02:00
Andi Kleen
02ff2769ed x86/mm/pat: Make set_memory_np() L1TF safe
commit 958f79b9ee55dfaf00c8106ed1c22a2919e0028b upstream

set_memory_np() is used to mark kernel mappings not present, but it has
it's own open coded mechanism which does not have the L1TF protection of
inverting the address bits.

Replace the open coded PTE manipulation with the L1TF protecting low level
PTE routines.

Passes the CPA self test.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ dwmw2: Pull in pud_mkhuge() from commit a00cc7d9dd, and pfn_pud() ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
[groeck: port to 4.4]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:11 +02:00
Andi Kleen
9feecdb6cb x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert
commit 0768f91530ff46683e0b372df14fd79fe8d156e5 upstream

Some cases in THP like:
  - MADV_FREE
  - mprotect
  - split

mark the PMD non present for temporarily to prevent races. The window for
an L1TF attack in these contexts is very small, but it wants to be fixed
for correctness sake.

Use the proper low level functions for pmd/pud_mknotpresent() to address
this.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Andi Kleen
0aae5fe841 x86/speculation/l1tf: Invert all not present mappings
commit f22cc87f6c1f771b57c407555cfefd811cdd9507 upstream

For kernel mappings PAGE_PROTNONE is not necessarily set for a non present
mapping, but the inversion logic explicitely checks for !PRESENT and
PROT_NONE.

Remove the PROT_NONE check and make the inversion unconditional for all not
present mappings.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Michal Hocko
09049f022a x86/speculation/l1tf: Fix up pte->pfn conversion for PAE
commit e14d7dfb41f5807a0c1c26a13f2b8ef16af24935 upstream

Jan has noticed that pte_pfn and co. resp. pfn_pte are incorrect for
CONFIG_PAE because phys_addr_t is wider than unsigned long and so the
pte_val reps. shift left would get truncated. Fix this up by using proper
types.

[dwmw2: Backport to 4.9]

Fixes: 6b28baca9b1f ("x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation")
Reported-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Vlastimil Babka
b55b06bd3b x86/speculation/l1tf: Protect PAE swap entries against L1TF
commit 0d0f6249058834ffe1ceaad0bb31464af66f6e7a upstream

The PAE 3-level paging code currently doesn't mitigate L1TF by flipping the
offset bits, and uses the high PTE word, thus bits 32-36 for type, 37-63 for
offset. The lower word is zeroed, thus systems with less than 4GB memory are
safe. With 4GB to 128GB the swap type selects the memory locations vulnerable
to L1TF; with even more memory, also the swap offfset influences the address.
This might be a problem with 32bit PAE guests running on large 64bit hosts.

By continuing to keep the whole swap entry in either high or low 32bit word of
PTE we would limit the swap size too much. Thus this patch uses the whole PAE
PTE with the same layout as the 64bit version does. The macros just become a
bit tricky since they assume the arch-dependent swp_entry_t to be 32bit.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Konrad Rzeszutek Wilk
dc48c1a2f4 x86/cpufeatures: Add detection of L1D cache flush support.
commit 11e34e64e4103955fc4568750914c75d65ea87ee upstream

336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR
(IA32_FLUSH_CMD) which is detected by CPUID.7.EDX[28]=1 bit being set.

This new MSR "gives software a way to invalidate structures with finer
granularity than other architectual methods like WBINVD."

A copy of this document is available at
  https://bugzilla.kernel.org/show_bug.cgi?id=199511

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Vlastimil Babka
df7fd6ccb3 x86/speculation/l1tf: Extend 64bit swap file size limit
commit 1a7ed1ba4bba6c075d5ad61bb75e3fbc870840d6 upstream

The previous patch has limited swap file size so that large offsets cannot
clear bits above MAX_PA/2 in the pte and interfere with L1TF mitigation.

It assumed that offsets are encoded starting with bit 12, same as pfn. But
on x86_64, offsets are encoded starting with bit 9.

Thus the limit can be raised by 3 bits. That means 16TB with 42bit MAX_PA
and 256TB with 46bit MAX_PA.

Fixes: 377eeaa8e11f ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Konrad Rzeszutek Wilk
fa86c208d2 x86/bugs: Move the l1tf function and define pr_fmt properly
commit 56563f53d3066afa9e63d6c997bf67e76a8b05c0 upstream

The pr_warn in l1tf_select_mitigation would have used the prior pr_fmt
which was defined as "Spectre V2 : ".

Move the function to be past SSBD and also define the pr_fmt.

Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Andi Kleen
685b44483f x86/speculation/l1tf: Limit swap file size to MAX_PA/2
commit 377eeaa8e11fe815b1d07c81c4a0e2843a8c15eb upstream

For the L1TF workaround its necessary to limit the swap file size to below
MAX_PA/2, so that the higher bits of the swap offset inverted never point
to valid memory.

Add a mechanism for the architecture to override the swap file size check
in swapfile.c and add a x86 specific max swapfile check function that
enforces that limit.

The check is only enabled if the CPU is vulnerable to L1TF.

In VMs with 42bit MAX_PA the typical limit is 2TB now, on a native system
with 46bit PA it is 32TB. The limit is only per individual swap file, so
it's always possible to exceed these limits with multiple swap files or
partitions.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Andi Kleen
d71af2dbac x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings
commit 42e4089c7890725fcd329999252dc489b72f2921 upstream

For L1TF PROT_NONE mappings are protected by inverting the PFN in the page
table entry. This sets the high bits in the CPU's address space, thus
making sure to point to not point an unmapped entry to valid cached memory.

Some server system BIOSes put the MMIO mappings high up in the physical
address space. If such an high mapping was mapped to unprivileged users
they could attack low memory by setting such a mapping to PROT_NONE. This
could happen through a special device driver which is not access
protected. Normal /dev/mem is of course access protected.

To avoid this forbid PROT_NONE mappings or mprotect for high MMIO mappings.

Valid page mappings are allowed because the system is then unsafe anyways.

It's not expected that users commonly use PROT_NONE on MMIO. But to
minimize any impact this is only enforced if the mapping actually refers to
a high MMIO address (defined as the MAX_PA-1 bit being set), and also skip
the check for root.

For mmaps this is straight forward and can be handled in vm_insert_pfn and
in remap_pfn_range().

For mprotect it's a bit trickier. At the point where the actual PTEs are
accessed a lot of state has been changed and it would be difficult to undo
on an error. Since this is a uncommon case use a separate early page talk
walk pass for MMIO PROT_NONE mappings that checks for this condition
early. For non MMIO and non PROT_NONE there are no changes.

[dwmw2: Backport to 4.9]
[groeck: Backport to 4.4]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Andi Kleen
bf0cca01b8 x86/speculation/l1tf: Add sysfs reporting for l1tf
commit 17dbca119312b4e8173d4e25ff64262119fcef38 upstream

L1TF core kernel workarounds are cheap and normally always enabled, However
they still should be reported in sysfs if the system is vulnerable or
mitigated. Add the necessary CPU feature/bug bits.

- Extend the existing checks for Meltdowns to determine if the system is
  vulnerable. All CPUs which are not vulnerable to Meltdown are also not
  vulnerable to L1TF

- Check for 32bit non PAE and emit a warning as there is no practical way
  for mitigation due to the limited physical address bits

- If the system has more than MAX_PA/2 physical memory the invert page
  workarounds don't protect the system against the L1TF attack anymore,
  because an inverted physical address will also point to valid
  memory. Print a warning in this case and report that the system is
  vulnerable.

Add a function which returns the PFN limit for the L1TF mitigation, which
will be used in follow up patches for sanity and range checks.

[ tglx: Renamed the CPU feature bit to L1TF_PTEINV ]
[ dwmw2: Backport to 4.9 (cpufeatures.h, E820) ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:09 +02:00
Andi Kleen
52dc5c9f8e x86/speculation/l1tf: Make sure the first page is always reserved
commit 10a70416e1f067f6c4efda6ffd8ea96002ac4223 upstream

The L1TF workaround doesn't make any attempt to mitigate speculate accesses
to the first physical page for zeroed PTEs. Normally it only contains some
data from the early real mode BIOS.

It's not entirely clear that the first page is reserved in all
configurations, so add an extra reservation call to make sure it is really
reserved. In most configurations (e.g.  with the standard reservations)
it's likely a nop.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:09 +02:00
Andi Kleen
9ee2d2da67 x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation
commit 6b28baca9b1f0d4a42b865da7a05b1c81424bd5c upstream

When PTEs are set to PROT_NONE the kernel just clears the Present bit and
preserves the PFN, which creates attack surface for L1TF speculation
speculation attacks.

This is important inside guests, because L1TF speculation bypasses physical
page remapping. While the host has its own migitations preventing leaking
data from other VMs into the guest, this would still risk leaking the wrong
page inside the current guest.

This uses the same technique as Linus' swap entry patch: while an entry is
is in PROTNONE state invert the complete PFN part part of it. This ensures
that the the highest bit will point to non existing memory.

The invert is done by pte/pmd_modify and pfn/pmd/pud_pte for PROTNONE and
pte/pmd/pud_pfn undo it.

This assume that no code path touches the PFN part of a PTE directly
without using these primitives.

This doesn't handle the case that MMIO is on the top of the CPU physical
memory. If such an MMIO region was exposed by an unpriviledged driver for
mmap it would be possible to attack some real memory.  However this
situation is all rather unlikely.

For 32bit non PAE the inversion is not done because there are really not
enough bits to protect anything.

Q: Why does the guest need to be protected when the HyperVisor already has
   L1TF mitigations?

A: Here's an example:

   Physical pages 1 2 get mapped into a guest as
   GPA 1 -> PA 2
   GPA 2 -> PA 1
   through EPT.

   The L1TF speculation ignores the EPT remapping.

   Now the guest kernel maps GPA 1 to process A and GPA 2 to process B, and
   they belong to different users and should be isolated.

   A sets the GPA 1 PA 2 PTE to PROT_NONE to bypass the EPT remapping and
   gets read access to the underlying physical page. Which in this case
   points to PA 2, so it can read process B's data, if it happened to be in
   L1, so isolation inside the guest is broken.

   There's nothing the hypervisor can do about this. This mitigation has to
   be done in the guest itself.

[ tglx: Massaged changelog ]
[ dwmw2: backported to 4.9 ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:09 +02:00
Linus Torvalds
9bbdab847f x86/speculation/l1tf: Protect swap entries against L1TF
commit 2f22b4cd45b67b3496f4aa4c7180a1271c6452f6 upstream

With L1 terminal fault the CPU speculates into unmapped PTEs, and resulting
side effects allow to read the memory the PTE is pointing too, if its
values are still in the L1 cache.

For swapped out pages Linux uses unmapped PTEs and stores a swap entry into
them.

To protect against L1TF it must be ensured that the swap entry is not
pointing to valid memory, which requires setting higher bits (between bit
36 and bit 45) that are inside the CPUs physical address space, but outside
any real memory.

To do this invert the offset to make sure the higher bits are always set,
as long as the swap file is not too big.

Note there is no workaround for 32bit !PAE, or on systems which have more
than MAX_PA/2 worth of memory. The later case is very unlikely to happen on
real systems.

[AK: updated description and minor tweaks by. Split out from the original
     patch ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:09 +02:00
Linus Torvalds
614f5e8464 x86/speculation/l1tf: Change order of offset/type in swap entry
commit bcd11afa7adad8d720e7ba5ef58bdcd9775cf45f upstream

If pages are swapped out, the swap entry is stored in the corresponding
PTE, which has the Present bit cleared. CPUs vulnerable to L1TF speculate
on PTE entries which have the present bit set and would treat the swap
entry as phsyical address (PFN). To mitigate that the upper bits of the PTE
must be set so the PTE points to non existent memory.

The swap entry stores the type and the offset of a swapped out page in the
PTE. type is stored in bit 9-13 and offset in bit 14-63. The hardware
ignores the bits beyond the phsyical address space limit, so to make the
mitigation effective its required to start 'offset' at the lowest possible
bit so that even large swap offsets do not reach into the physical address
space limit bits.

Move offset to bit 9-58 and type to bit 59-63 which are the bits that
hardware generally doesn't care about.

That, in turn, means that if you on desktop chip with only 40 bits of
physical addressing, now that the offset starts at bit 9, there needs to be
30 bits of offset actually *in use* until bit 39 ends up being set, which
means when inverted it will again point into existing memory.

So that's 4 terabyte of swap space (because the offset is counted in pages,
so 30 bits of offset is 42 bits of actual coverage). With bigger physical
addressing, that obviously grows further, until the limit of the offset is
hit (at 50 bits of offset - 62 bits of actual swap file coverage).

This is a preparatory change for the actual swap entry inversion to protect
against L1TF.

[ AK: Updated description and minor tweaks. Split into two parts ]
[ tglx: Massaged changelog ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:07 +02:00
Naoya Horiguchi
86b0948d7c mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1
commit eee4818baac0f2b37848fdf90e4b16430dc536ac upstream

_PAGE_PSE is used to distinguish between a truly non-present
(_PAGE_PRESENT=0) PMD, and a PMD which is undergoing a THP split and
should be treated as present.

But _PAGE_SWP_SOFT_DIRTY currently uses the _PAGE_PSE bit, which would
cause confusion between one of those PMDs undergoing a THP split, and a
soft-dirty PMD.  Dropping _PAGE_PSE check in pmd_present() does not work
well, because it can hurt optimization of tlb handling in thp split.

Thus, we need to move the bit.

In the current kernel, bits 1-4 are not used in non-present format since
commit 00839ee3b299 ("x86/mm: Move swap offset/type up in PTE to work
around erratum").  So let's move _PAGE_SWP_SOFT_DIRTY to bit 1.  Bit 7
is used as reserved (always clear), so please don't use it for other
purpose.

[dwmw2: Pulled in to 4.9 backport to support L1TF changes]

Link: http://lkml.kernel.org/r/20170717193955.20207-3-zi.yan@sent.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: David Nellans <dnellans@nvidia.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:07 +02:00
Dave Hansen
f487cf69cf x86/mm: Fix swap entry comment and macro
commit ace7fab7a6cdd363a615ec537f2aa94dbc761ee2 upstream

A recent patch changed the format of a swap PTE.

The comment explaining the format of the swap PTE is wrong about
the bits used for the swap type field.  Amusingly, the ASCII art
and the patch description are correct, but the comment itself
is wrong.

As I was looking at this, I also noticed that the
SWP_OFFSET_FIRST_BIT has an off-by-one error.  This does not
really hurt anything.  It just wasted a bit of space in the PTE,
giving us 2^59 bytes of addressable space in our swapfiles
instead of 2^60.  But, it doesn't match with the comments, and it
wastes a bit of space, so fix it.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Fixes: 00839ee3b299 ("x86/mm: Move swap offset/type up in PTE to work around erratum")
Link: http://lkml.kernel.org/r/20160810172325.E56AD7DA@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:07 +02:00
Dave Hansen
0a5deacaac x86/mm: Move swap offset/type up in PTE to work around erratum
commit 00839ee3b299303c6a5e26a0a2485427a3afcbbf upstream

This erratum can result in Accessed/Dirty getting set by the hardware
when we do not expect them to be (on !Present PTEs).

Instead of trying to fix them up after this happens, we just
allow the bits to get set and try to ignore them.  We do this by
shifting the layout of the bits we use for swap offset/type in
our 64-bit PTEs.

It looks like this:

 bitnrs: |     ...            | 11| 10|  9|8|7|6|5| 4| 3|2|1|0|
 names:  |     ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U|W|P|
 before: |         OFFSET (9-63)          |0|X|X| TYPE(1-5) |0|
  after: | OFFSET (14-63)  |  TYPE (9-13) |0|X|X|X| X| X|X|X|0|

Note that D was already a don't care (X) even before.  We just
move TYPE up and turn its old spot (which could be hit by the
A bit) into all don't cares.

We take 5 bits away from the offset, but that still leaves us
with 50 bits which lets us index into a 62-bit swapfile (4 EiB).
I think that's probably fine for the moment.  We could
theoretically reclaim 5 of the bits (1, 2, 3, 4, 7) but it
doesn't gain us anything.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: dave.hansen@intel.com
Cc: linux-mm@kvack.org
Cc: mhocko@suse.com
Link: http://lkml.kernel.org/r/20160708001911.9A3FD2B6@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:07 +02:00
Andi Kleen
90a231c63c x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT
commit 50896e180c6aa3a9c61a26ced99e15d602666a4c upstream

L1 Terminal Fault (L1TF) is a speculation related vulnerability. The CPU
speculates on PTE entries which do not have the PRESENT bit set, if the
content of the resulting physical address is available in the L1D cache.

The OS side mitigation makes sure that a !PRESENT PTE entry points to a
physical address outside the actually existing and cachable memory
space. This is achieved by inverting the upper bits of the PTE. Due to the
address space limitations this only works for 64bit and 32bit PAE kernels,
but not for 32bit non PAE.

This mitigation applies to both host and guest kernels, but in case of a
64bit host (hypervisor) and a 32bit PAE guest, inverting the upper bits of
the PAE address space (44bit) is not enough if the host has more than 43
bits of populated memory address space, because the speculation treats the
PTE content as a physical host address bypassing EPT.

The host (hypervisor) protects itself against the guest by flushing L1D as
needed, but pages inside the guest are not protected against attacks from
other processes inside the same guest.

For the guest the inverted PTE mask has to match the host to provide the
full protection for all pages the host could possibly map into the
guest. The hosts populated address space is not known to the guest, so the
mask must cover the possible maximal host address space, i.e. 52 bit.

On 32bit PAE the maximum PTE mask is currently set to 44 bit because that
is the limit imposed by 32bit unsigned long PFNs in the VMs. This limits
the mask to be below what the host could possible use for physical pages.

The L1TF PROT_NONE protection code uses the PTE masks to determine which
bits to invert to make sure the higher bits are set for unmapped entries to
prevent L1TF speculation attacks against EPT inside guests.

In order to invert all bits that could be used by the host, increase
__PHYSICAL_PAGE_SHIFT to 52 to match 64bit.

The real limit for a 32bit PAE kernel is still 44 bits because all Linux
PTEs are created from unsigned long PFNs, so they cannot be higher than 44
bits on a 32bit kernel. So these extra PFN bits should be never set. The
only users of this macro are using it to look at PTEs, so it's safe.

[ tglx: Massaged changelog ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:06 +02:00
Nick Desaulniers
ec5aa64fec x86/irqflags: Provide a declaration for native_save_fl
commit 208cbb32558907f68b3b2a081ca2337ac3744794 upstream.

It was reported that the commit d0a8d9378d16 is causing users of gcc < 4.9
to observe -Werror=missing-prototypes errors.

Indeed, it seems that:
extern inline unsigned long native_save_fl(void) { return 0; }

compiled with -Werror=missing-prototypes produces this warning in gcc <
4.9, but not gcc >= 4.9.

Fixes: d0a8d9378d16 ("x86/paravirt: Make native_save_fl() extern inline").
Reported-by: David Laight <david.laight@aculab.com>
Reported-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: jgross@suse.com
Cc: kstewart@linuxfoundation.org
Cc: gregkh@linuxfoundation.org
Cc: boris.ostrovsky@oracle.com
Cc: astrachan@google.com
Cc: mka@chromium.org
Cc: arnd@arndb.de
Cc: tstellar@redhat.com
Cc: sedat.dilek@gmail.com
Cc: David.Laight@aculab.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180803170550.164688-1-ndesaulniers@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:06 +02:00
Masami Hiramatsu
866234c373 kprobes/x86: Fix %p uses in error messages
commit 0ea063306eecf300fcf06d2f5917474b580f666f upstream.

Remove all %p uses in error messages in kprobes/x86.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Howells <dhowells@redhat.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jon Medhurst <tixy@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tobin C . Harding <me@tobin.cc>
Cc: Will Deacon <will.deacon@arm.com>
Cc: acme@kernel.org
Cc: akpm@linux-foundation.org
Cc: brueckner@linux.vnet.ibm.com
Cc: linux-arch@vger.kernel.org
Cc: rostedt@goodmis.org
Cc: schwidefsky@de.ibm.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/lkml/152491902310.9916.13355297638917767319.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:06 +02:00
Jiri Kosina
7744abbe29 x86/speculation: Protect against userspace-userspace spectreRSB
commit fdf82a7856b32d905c39afc85e34364491e46346 upstream.

The article "Spectre Returns! Speculation Attacks using the Return Stack
Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks,
making use solely of the RSB contents even on CPUs that don't fallback to
BTB on RSB underflow (Skylake+).

Mitigate userspace-userspace attacks by always unconditionally filling RSB on
context switch when the generic spectrev2 mitigation has been enabled.

[1] https://arxiv.org/pdf/1807.07940.pdf

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1807261308190.997@cbobk.fhfr.pm
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:06 +02:00
Peter Zijlstra
8dbce8a2e9 x86/paravirt: Fix spectre-v2 mitigations for paravirt guests
commit 5800dc5c19f34e6e03b5adab1282535cb102fafd upstream.

Nadav reported that on guests we're failing to rewrite the indirect
calls to CALLEE_SAVE paravirt functions. In particular the
pv_queued_spin_unlock() call is left unpatched and that is all over the
place. This obviously wrecks Spectre-v2 mitigation (for paravirt
guests) which relies on not actually having indirect calls around.

The reason is an incorrect clobber test in paravirt_patch_call(); this
function rewrites an indirect call with a direct call to the _SAME_
function, there is no possible way the clobbers can be different
because of this.

Therefore remove this clobber check. Also put WARNs on the other patch
failure case (not enough room for the instruction) which I've not seen
trigger in my (limited) testing.

Three live kernel image disassemblies for lock_sock_nested (as a small
function that illustrates the problem nicely). PRE is the current
situation for guests, POST is with this patch applied and NATIVE is with
or without the patch for !guests.

PRE:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    callq  *0xffffffff822299e8
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.

POST:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    callq  0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock>
   0xffffffff817be9a5 <+53>:    xchg   %ax,%ax
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063aa0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.

NATIVE:

(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
   0xffffffff817be970 <+0>:     push   %rbp
   0xffffffff817be971 <+1>:     mov    %rdi,%rbp
   0xffffffff817be974 <+4>:     push   %rbx
   0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
   0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
   0xffffffff817be981 <+17>:    mov    %rbx,%rdi
   0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
   0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
   0xffffffff817be98f <+31>:    test   %eax,%eax
   0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
   0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
   0xffffffff817be99d <+45>:    mov    %rbx,%rdi
   0xffffffff817be9a0 <+48>:    movb   $0x0,(%rdi)
   0xffffffff817be9a3 <+51>:    nopl   0x0(%rax)
   0xffffffff817be9a7 <+55>:    pop    %rbx
   0xffffffff817be9a8 <+56>:    pop    %rbp
   0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
   0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
   0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
   0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
   0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
   0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.


Fixes: 63f70270cc ("[PATCH] i386: PARAVIRT: add common patching machinery")
Fixes: 3010a0663fd9 ("x86/paravirt, objtool: Annotate indirect calls")
Reported-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:06 +02:00
Greg Kroah-Hartman
1396226023 This is the 4.4.146 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltoWioACgkQONu9yGCS
 aT6YrQ//d8dWKaNZK08Z/l2ZqRS56wlNTJyHIB81p1uM2PuPHfLjsZzLQ+HnZ3Ha
 G+fedEj3sbwJp8i61TRu9Q1p/PyLWsnaryWZaK3gm4Yo8GrdVbXAY47EHwz3fbUK
 yxrC0+zQmIlyZgwzbUNGspDuAdNt2MFDug97RFF8BdhJd84Rv0BbicGMwKJQFfFN
 g0Tv6yB+8cjmnCMjmLreLyi+puWvXZtZXAi+idl9eTC4ysGDKNvO1ERptv2NC5C6
 171cbsS/ngpY5ZIUcmLy0QPPFh/ZCeoft22R3gOxZDkjT4Ro6lY5ubPKDEcn57Hv
 FSV5fuQ3cBtmsODn7LMIWqLDKuCRM/gTmvXrWxM91JDLSsuAdZWATj8k4iIoocmk
 l/3iOixBMFCGToQ1I2/O33QZOssKoDIz4bpG6+HM/Cj4anSnVZKjouJSTlNZr/3i
 ZJOXpu/MpQItc/RGo/PumzJLkXhS+HyGwPbTIOPy29NMqp+xvjZv4DttuJbqyHJ2
 Pm/OZcvU7z1wSMhcTknvZLLMQVRIICQjfPJNDefqAdrCdd233cRo37cU8egg4A0l
 F3q+ZI/ny01YWQP8KrCJyWB5lHHbEc44wUHCxet0TPZ1qaqvVcXzaWhwxP2H0L3I
 7r2u9bDg15ielw3jhPpRWZMvANbQlToNoj6YROqj5ArcIowcBPc=
 =7/iL
 -----END PGP SIGNATURE-----

Merge 4.4.146 into android-4.4

Changes in 4.4.146
	MIPS: Fix off-by-one in pci_resource_to_user()
	Input: elan_i2c - add ACPI ID for lenovo ideapad 330
	Input: i8042 - add Lenovo LaVie Z to the i8042 reset list
	Input: elan_i2c - add another ACPI ID for Lenovo Ideapad 330-15AST
	tracing: Fix double free of event_trigger_data
	tracing: Fix possible double free in event_enable_trigger_func()
	tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure
	tracing: Quiet gcc warning about maybe unused link variable
	xen/netfront: raise max number of slots in xennet_get_responses()
	ALSA: emu10k1: add error handling for snd_ctl_add
	ALSA: fm801: add error handling for snd_ctl_add
	nfsd: fix potential use-after-free in nfsd4_decode_getdeviceinfo
	mm: vmalloc: avoid racy handling of debugobjects in vunmap
	mm/slub.c: add __printf verification to slab_err()
	rtc: ensure rtc_set_alarm fails when alarms are not supported
	netfilter: ipset: List timing out entries with "timeout 1" instead of zero
	infiniband: fix a possible use-after-free bug
	hvc_opal: don't set tb_ticks_per_usec in udbg_init_opal_common()
	powerpc/64s: Fix compiler store ordering to SLB shadow area
	RDMA/mad: Convert BUG_ONs to error flows
	disable loading f2fs module on PAGE_SIZE > 4KB
	f2fs: fix to don't trigger writeback during recovery
	usbip: usbip_detach: Fix memory, udev context and udev leak
	perf/x86/intel/uncore: Correct fixed counter index check in generic code
	perf/x86/intel/uncore: Correct fixed counter index check for NHM
	iwlwifi: pcie: fix race in Rx buffer allocator
	Bluetooth: hci_qca: Fix "Sleep inside atomic section" warning
	Bluetooth: btusb: Add a new Realtek 8723DE ID 2ff8:b011
	ASoC: dpcm: fix BE dai not hw_free and shutdown
	mfd: cros_ec: Fail early if we cannot identify the EC
	mwifiex: handle race during mwifiex_usb_disconnect
	wlcore: sdio: check for valid platform device data before suspend
	media: videobuf2-core: don't call memop 'finish' when queueing
	btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
	btrfs: qgroup: Finish rescan when hit the last leaf of extent tree
	PCI: Prevent sysfs disable of device while driver is attached
	ath: Add regulatory mapping for FCC3_ETSIC
	ath: Add regulatory mapping for ETSI8_WORLD
	ath: Add regulatory mapping for APL13_WORLD
	ath: Add regulatory mapping for APL2_FCCA
	ath: Add regulatory mapping for Uganda
	ath: Add regulatory mapping for Tanzania
	ath: Add regulatory mapping for Serbia
	ath: Add regulatory mapping for Bermuda
	ath: Add regulatory mapping for Bahamas
	powerpc/32: Add a missing include header
	powerpc/chrp/time: Make some functions static, add missing header include
	powerpc/powermac: Add missing prototype for note_bootable_part()
	powerpc/powermac: Mark variable x as unused
	powerpc/8xx: fix invalid register expression in head_8xx.S
	pinctrl: at91-pio4: add missing of_node_put
	PCI: pciehp: Request control of native hotplug only if supported
	mwifiex: correct histogram data with appropriate index
	scsi: ufs: fix exception event handling
	ALSA: emu10k1: Rate-limit error messages about page errors
	regulator: pfuze100: add .is_enable() for pfuze100_swb_regulator_ops
	md: fix NULL dereference of mddev->pers in remove_and_add_spares()
	media: smiapp: fix timeout checking in smiapp_read_nvm
	ALSA: usb-audio: Apply rate limit to warning messages in URB complete callback
	HID: hid-plantronics: Re-resend Update to map button for PTT products
	drm/radeon: fix mode_valid's return type
	powerpc/embedded6xx/hlwd-pic: Prevent interrupts from being handled by Starlet
	HID: i2c-hid: check if device is there before really probing
	tty: Fix data race in tty_insert_flip_string_fixed_flag
	dma-iommu: Fix compilation when !CONFIG_IOMMU_DMA
	media: rcar_jpu: Add missing clk_disable_unprepare() on error in jpu_open()
	libata: Fix command retry decision
	media: saa7164: Fix driver name in debug output
	mtd: rawnand: fsl_ifc: fix FSL NAND driver to read all ONFI parameter pages
	brcmfmac: Add support for bcm43364 wireless chipset
	s390/cpum_sf: Add data entry sizes to sampling trailer entry
	perf: fix invalid bit in diagnostic entry
	scsi: 3w-9xxx: fix a missing-check bug
	scsi: 3w-xxxx: fix a missing-check bug
	scsi: megaraid: silence a static checker bug
	thermal: exynos: fix setting rising_threshold for Exynos5433
	bpf: fix references to free_bpf_prog_info() in comments
	media: siano: get rid of __le32/__le16 cast warnings
	drm/atomic: Handling the case when setting old crtc for plane
	ALSA: hda/ca0132: fix build failure when a local macro is defined
	memory: tegra: Do not handle spurious interrupts
	memory: tegra: Apply interrupts mask per SoC
	drm/gma500: fix psb_intel_lvds_mode_valid()'s return type
	ipconfig: Correctly initialise ic_nameservers
	rsi: Fix 'invalid vdd' warning in mmc
	audit: allow not equal op for audit by executable
	microblaze: Fix simpleImage format generation
	usb: hub: Don't wait for connect state at resume for powered-off ports
	crypto: authencesn - don't leak pointers to authenc keys
	crypto: authenc - don't leak pointers to authenc keys
	media: omap3isp: fix unbalanced dma_iommu_mapping
	scsi: scsi_dh: replace too broad "TP9" string with the exact models
	scsi: megaraid_sas: Increase timeout by 1 sec for non-RAID fastpath IOs
	media: si470x: fix __be16 annotations
	drm: Add DP PSR2 sink enable bit
	random: mix rdrand with entropy sent in from userspace
	squashfs: be more careful about metadata corruption
	ext4: fix inline data updates with checksums enabled
	ext4: check for allocation block validity with block group locked
	dmaengine: pxa_dma: remove duplicate const qualifier
	ASoC: pxa: Fix module autoload for platform drivers
	ipv4: remove BUG_ON() from fib_compute_spec_dst
	net: fix amd-xgbe flow-control issue
	net: lan78xx: fix rx handling before first packet is send
	xen-netfront: wait xenbus state change when load module manually
	NET: stmmac: align DMA stuff to largest cache line length
	tcp: do not force quickack when receiving out-of-order packets
	tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode
	tcp: do not aggressively quick ack after ECN events
	tcp: refactor tcp_ecn_check_ce to remove sk type cast
	tcp: add one more quick ack after after ECN events
	inet: frag: enforce memory limits earlier
	net: dsa: Do not suspend/resume closed slave_dev
	netlink: Fix spectre v1 gadget in netlink_create()
	squashfs: more metadata hardening
	squashfs: more metadata hardenings
	can: ems_usb: Fix memory leak on ems_usb_disconnect()
	net: socket: fix potential spectre v1 gadget in socketcall
	virtio_balloon: fix another race between migration and ballooning
	kvm: x86: vmx: fix vpid leak
	crypto: padlock-aes - Fix Nano workaround data corruption
	scsi: sg: fix minor memory leak in error path
	Linux 4.4.146

Change-Id: Ia7e43a90d0f5603c741811436b8de41884cb2851
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-08-06 19:12:19 +02:00
Roman Kagan
314b46558c kvm: x86: vmx: fix vpid leak
commit 63aff65573d73eb8dda4732ad4ef222dd35e4862 upstream.

VPID for the nested vcpu is allocated at vmx_create_vcpu whenever nested
vmx is turned on with the module parameter.

However, it's only freed if the L1 guest has executed VMXON which is not
a given.

As a result, on a system with nested==on every creation+deletion of an
L1 vcpu without running an L2 guest results in leaking one vpid.  Since
the total number of vpids is limited to 64k, they can eventually get
exhausted, preventing L2 from starting.

Delay allocation of the L2 vpid until VMXON emulation, thus matching its
freeing.

Fixes: 5c614b3583
Cc: stable@vger.kernel.org
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-06 16:24:42 +02:00
Kan Liang
d6a260fe05 perf/x86/intel/uncore: Correct fixed counter index check for NHM
[ Upstream commit d71f11c076c420c4e2fceb4faefa144e055e0935 ]

For Nehalem and Westmere, there is only one fixed counter for W-Box.
There is no index which is bigger than UNCORE_PMC_IDX_FIXED.
It is not correct to use >= to check fixed counter.
The code quality issue will bring problem when new counter index is
introduced.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-06 16:24:32 +02:00
Kan Liang
b4d9e5e88b perf/x86/intel/uncore: Correct fixed counter index check in generic code
[ Upstream commit 4749f8196452eeb73cf2086a6a9705bae479d33d ]

There is no index which is bigger than UNCORE_PMC_IDX_FIXED. The only
exception is client IMC uncore, which has been specially handled.
For generic code, it is not correct to use >= to check fixed counter.
The code quality issue will bring problem when a new counter index is
introduced.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1525371913-10597-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-06 16:24:32 +02:00
Greg Kroah-Hartman
4b2d6badbc This is the 4.4.144 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltYMlwACgkQONu9yGCS
 aT5ZmxAAjAWUndXt7fTUyHgxkoG61sEkdX4jcsp6NFwQMudU0UHx4/kcZE+HdMjL
 VU8BZtdUg+jMLXM4erVBpQRKY9YHIPi8nWMTm1UjduMCxVD6dVL1HU6/RXl1cYIx
 rf/opYOimqT9lYCeffmd9ai2zEEJKSt7/avddcJY4qHiqLan27gbUdAq2H26aM/5
 LUzAaSBzhq3VYo9Q5zv03b1+tORAxh2BIffZjGEFe8SQQl1o63WqwV4RxEhV/Bjt
 hBgl/6B/+EHtQnYnbnoOT/an9Ma15ik4/z3vVv6yRLNK+hS5T31OKcYCsUrjp6O+
 TQVaVLWWmn/VpIHAMkrhBs9Xxg5GmRziF77AkzyC506tK268M2+IoY77ursVl1YK
 STaOwUcLUlKLbl5OADqMpYtNU9ybkP+MmgDZsIEXz9UiCZM721fL5Au2PHuzaYOD
 2nE2EQb04It4k9GN8FStv2KPIiKUCEXi9MlNsHGPs6Mc+fliIigoKPhpU5JG+sxR
 eJgPMNv4OWhwXWTd1wf0Gy5X+i0lQlwlGgIHFfSB8vzArJ0Y/yuPj2a6xhQshOza
 Ivq7JudHvxYxhDSWYoCKgtTgzMdSBbJ3xjOoUUHy4ryamYeyaMvgFjsaCTMr0dsw
 76BkgNTbpsip+I77a9h4Ozlk5QE7h61EsqjmZBkGVqLYjrUQ/IU=
 =X4tZ
 -----END PGP SIGNATURE-----

Merge 4.4.144 into android-4.4

Changes in 4.4.144
	KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.
	x86/MCE: Remove min interval polling limitation
	fat: fix memory allocation failure handling of match_strdup()
	ALSA: rawmidi: Change resized buffers atomically
	ARC: Fix CONFIG_SWAP
	ARC: mm: allow mprotect to make stack mappings executable
	mm: memcg: fix use after free in mem_cgroup_iter()
	ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns
	ipv6: fix useless rol32 call on hash
	lib/rhashtable: consider param->min_size when setting initial table size
	net/ipv4: Set oif in fib_compute_spec_dst
	net: phy: fix flag masking in __set_phy_supported
	ptp: fix missing break in switch
	tg3: Add higher cpu clock for 5762.
	net: Don't copy pfmemalloc flag in __copy_skb_header()
	skbuff: Unconditionally copy pfmemalloc in __skb_clone()
	xhci: Fix perceived dead host due to runtime suspend race with event handler
	x86/paravirt: Make native_save_fl() extern inline
	x86/cpufeatures: Add CPUID_7_EDX CPUID leaf
	x86/cpufeatures: Add Intel feature bits for Speculation Control
	x86/cpufeatures: Add AMD feature bits for Speculation Control
	x86/msr: Add definitions for new speculation control MSRs
	x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
	x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
	x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support
	x86/cpufeatures: Clean up Spectre v2 related CPUID flags
	x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
	x86/pti: Mark constant arrays as __initconst
	x86/asm/entry/32: Simplify pushes of zeroed pt_regs->REGs
	x86/entry/64/compat: Clear registers for compat syscalls, to reduce speculation attack surface
	x86/speculation: Update Speculation Control microcode blacklist
	x86/speculation: Correct Speculation Control microcode blacklist again
	x86/speculation: Clean up various Spectre related details
	x86/speculation: Fix up array_index_nospec_mask() asm constraint
	x86/speculation: Add <asm/msr-index.h> dependency
	x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
	x86/mm: Factor out LDT init from context init
	x86/mm: Give each mm TLB flush generation a unique ID
	x86/speculation: Use Indirect Branch Prediction Barrier in context switch
	x86/spectre_v2: Don't check microcode versions when running under hypervisors
	x86/speculation: Use IBRS if available before calling into firmware
	x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
	x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
	selftest/seccomp: Fix the flag name SECCOMP_FILTER_FLAG_TSYNC
	selftest/seccomp: Fix the seccomp(2) signature
	xen: set cpu capabilities from xen_start_kernel()
	x86/amd: don't set X86_BUG_SYSRET_SS_ATTRS when running under Xen
	x86/nospec: Simplify alternative_msr_write()
	x86/bugs: Concentrate bug detection into a separate function
	x86/bugs: Concentrate bug reporting into a separate function
	x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
	x86/bugs, KVM: Support the combination of guest and host IBRS
	x86/cpu: Rename Merrifield2 to Moorefield
	x86/cpu/intel: Add Knights Mill to Intel family
	x86/bugs: Expose /sys/../spec_store_bypass
	x86/cpufeatures: Add X86_FEATURE_RDS
	x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
	x86/bugs/intel: Set proper CPU features and setup RDS
	x86/bugs: Whitelist allowed SPEC_CTRL MSR values
	x86/bugs/AMD: Add support to disable RDS on Fam[15, 16, 17]h if requested
	x86/speculation: Create spec-ctrl.h to avoid include hell
	prctl: Add speculation control prctls
	x86/process: Optimize TIF checks in __switch_to_xtra()
	x86/process: Correct and optimize TIF_BLOCKSTEP switch
	x86/process: Optimize TIF_NOTSC switch
	x86/process: Allow runtime control of Speculative Store Bypass
	x86/speculation: Add prctl for Speculative Store Bypass mitigation
	nospec: Allow getting/setting on non-current task
	proc: Provide details on speculation flaw mitigations
	seccomp: Enable speculation flaw mitigations
	prctl: Add force disable speculation
	seccomp: Use PR_SPEC_FORCE_DISABLE
	seccomp: Add filter flag to opt-out of SSB mitigation
	seccomp: Move speculation migitation control to arch code
	x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass
	x86/bugs: Rename _RDS to _SSBD
	proc: Use underscores for SSBD in 'status'
	Documentation/spec_ctrl: Do some minor cleanups
	x86/bugs: Fix __ssb_select_mitigation() return type
	x86/bugs: Make cpu_show_common() static
	x86/bugs: Fix the parameters alignment and missing void
	x86/cpu: Make alternative_msr_write work for 32-bit code
	x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
	x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
	x86/cpufeatures: Disentangle SSBD enumeration
	x86/cpu/AMD: Fix erratum 1076 (CPB bit)
	x86/cpufeatures: Add FEATURE_ZEN
	x86/speculation: Handle HT correctly on AMD
	x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
	x86/speculation: Add virtualized speculative store bypass disable support
	x86/speculation: Rework speculative_store_bypass_update()
	x86/bugs: Unify x86_spec_ctrl_{set_guest, restore_host}
	x86/bugs: Expose x86_spec_ctrl_base directly
	x86/bugs: Remove x86_spec_ctrl_set()
	x86/bugs: Rework spec_ctrl base and mask logic
	x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
	x86/bugs: Rename SSBD_NO to SSB_NO
	x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
	x86/cpu: Re-apply forced caps every time CPU caps are re-read
	block: do not use interruptible wait anywhere
	clk: tegra: Fix PLL_U post divider and initial rate on Tegra30
	ubi: Introduce vol_ignored()
	ubi: Rework Fastmap attach base code
	ubi: Be more paranoid while seaching for the most recent Fastmap
	ubi: Fix races around ubi_refill_pools()
	ubi: Fix Fastmap's update_vol()
	ubi: fastmap: Erase outdated anchor PEBs during attach
	Linux 4.4.144

Change-Id: Ia3e9b2b7bc653cba68b76878d34f8fcbbc007a13
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-31 20:18:19 +02:00
Greg Kroah-Hartman
7bbfac1903 This is the 4.4.143 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltUd9MACgkQONu9yGCS
 aT4zHxAAgRG0sISgcD3y8JaWSM/clVrzAouarIl6O5cvaLPj7CuBwaVtnEqsfihw
 7p+o0vTyJXifyPa2Zvvu+EgWTRsV/zx2ClOr0qwdXJaXyvPHyO5PQhefMVDs5WOR
 tzAAR+O72Au/zFBsRUB/Skn9iMgcilfPJo4kFND50nIbHB1iBCnS7YEFxUQEOZaD
 gwVT5gMD6vabq1TdPQCKIgm8X46pS8A8l0Kh68t/cxZzHRKbNb5vEusCCReRFBUQ
 IxAFQa9vjPblCI3jfvthtwIhDdTSkfuZ/mWYTfo/VnmDucR3yZdtxAgggoEPHGlV
 gsPZWmlhRwH8CPmJ6C89lz25hQZe6o2s++qMoUZ9A/YBVjNgQjVXYVWF/btenqdJ
 VkBRCSAUhUSOKz9PJvNfd1R65dI20k1CsHHk2f7O7GNiZ5QuznpyOimpLYlKvQl8
 n3nVyhkYtomYf1LcBOCbR3DqfFDfCJi7fWiCj1JkkdQ8CbHwNF9bdI+EROdjKpKz
 4rNRKlCtmDUlyJgt6x2I6Kjqgby6hC7KnUnnFtZxylq+M2bXRhcL6XaeP3CAh1M+
 3//yHX/l0utLg07jjbdwZncADGwlGhj0yCsbpcUH4SB5IIX8An6py9YTMiDXDspj
 mpWu9QRuXI/Y1qdIwyhkCGr7YUpRWttJZCbz4eMeVleJqiwvluo=
 =FYCr
 -----END PGP SIGNATURE-----

Merge 4.4.143 into android-4.4

Changes in 4.4.143
	compiler, clang: suppress warning for unused static inline functions
	compiler, clang: properly override 'inline' for clang
	compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabled
	compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarations
	x86/asm: Add _ASM_ARG* constants for argument registers to <asm/asm.h>
	Revert "sit: reload iphdr in ipip6_rcv"
	ocfs2: subsystem.su_mutex is required while accessing the item->ci_parent
	bcm63xx_enet: correct clock usage
	bcm63xx_enet: do not write to random DMA channel on BCM6345
	crypto: crypto4xx - remove bad list_del
	crypto: crypto4xx - fix crypto4xx_build_pdr, crypto4xx_build_sdr leak
	atm: zatm: Fix potential Spectre v1
	net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
	net: dccp: switch rx_tstamp_last_feedback to monotonic clock
	net/mlx5: Fix incorrect raw command length parsing
	net: sungem: fix rx checksum support
	qed: Limit msix vectors in kdump kernel to the minimum required count.
	r8152: napi hangup fix after disconnect
	tcp: fix Fast Open key endianness
	tcp: prevent bogus FRTO undos with non-SACK flows
	vhost_net: validate sock before trying to put its fd
	net_sched: blackhole: tell upper qdisc about dropped packets
	net/mlx5: Fix command interface race in polling mode
	net: cxgb3_main: fix potential Spectre v1
	rtlwifi: rtl8821ae: fix firmware is not ready to run
	MIPS: Call dump_stack() from show_regs()
	MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
	netfilter: ebtables: reject non-bridge targets
	KEYS: DNS: fix parsing multiple options
	rds: avoid unenecessary cong_update in loop transport
	net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
	Linux 4.4.143

Change-Id: Icacfd188cbb6075bf82a48ec1066e8653eb73ae4
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-31 20:11:21 +02:00
Greg Kroah-Hartman
8ec9fd8936 This is the 4.4.142 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltQk74ACgkQONu9yGCS
 aT5nCg/8DfqvQgljahgvf2GnF/XD/0pvLFAFWYG81Dmgf247c8ZoELD2VF06CWsJ
 6r7C2yLmO+TO1TZw07OJOWaqyepA3ae06G0ZSXuMVojG6yV5BaYnXPjLJwp2LXVO
 glxxjuwinu0NWW5YjgnV/DL8T7aRI+0hKecmDJJ+6yp3H8Jk9Q1Oc5g2wyNQLY6v
 agwtTUgpHt9sd5+TZmyKhzyiqFTESPmLUScpam0UCM8gyMS31Sk5UyezlFB4+/sD
 UdlsXmCcXpxx/s51OlkMiTTB/ErRbik19FShohx5MmwjHRiSGX4RYn/b0fj/LeL1
 PUFyfzg4xzn8Ff8F2G0NO8LRiF5zjOrPSTcA96raoaRxjBnd2GvGAG7MZV1YxuP0
 rpeUrjwspZgmtGKJK2A99y9il7pVcWhyruZieEO8jYoUWB4wv+JqRherDeUjq7Xm
 xcEszKKzOduOPVlRVbyPyUKxnlFz9+FwLmZug1aMDdKgmCIZWSLgMorQOjlGDasu
 zYbvLoC3feVusAu1orMqYbb/sYsk5QaN6ytsM4Fp+dRA2F6DZ9j2xnKd16fL9B8y
 rNpdLEjS/JvjZMhRP8/SPOxBndz6u2bm5gVrabo4UydlQwM5PWjGvSNmWcM4+PBz
 A7ZbQviTdCL2M1NwPI2L/HYMXHcNTxNuchRnvDmIbsEZlYzl4JM=
 =aV1k
 -----END PGP SIGNATURE-----

Merge 4.4.142 into android-4.4

Changes in 4.4.142
	Kbuild: fix # escaping in .cmd files for future Make
	x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
	perf tools: Move syscall number fallbacks from perf-sys.h to tools/arch/x86/include/asm/
	Linux 4.4.142

Change-Id: If45535d0aeb1ad6f7239d3bc15aa4d3d60754da7
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-31 20:08:13 +02:00
Alistair Strachan
56b516c5e3 x86_64_cuttlefish_defconfig: Enable android-verity
Bug: 72722987
Test: Build & boot with x86_64_cuttlefish_defconfig
Change-Id: I961e6aaa944b5ab0c005cb39604a52f8dc98fb06
Signed-off-by: Alistair Strachan <astrachan@google.com>
2018-07-26 18:26:03 +00:00
Alistair Strachan
f402eb9ad5 x86_64_cuttlefish_defconfig: enable verity cert
Bug: 72722987
Test: Build, boot and verify in /proc/keys
Change-Id: Ia55b94d56827003a88cb6083a75340ee31347470
Signed-off-by: Alistair Strachan <astrachan@google.com>
2018-07-26 18:25:43 +00:00
Andy Lutomirski
42a8fe474e x86/cpu: Re-apply forced caps every time CPU caps are re-read
commit 60d3450167433f2d099ce2869dc52dd9e7dc9b29 upstream.

Calling get_cpu_cap() will reset a bunch of CPU features.  This will
cause the system to lose track of force-set and force-cleared
features in the words that are reset until the end of CPU
initialization.  This can cause X86_FEATURE_FPU, for example, to
change back and forth during boot and potentially confuse CPU setup.

To minimize the chance of confusion, re-apply forced caps every time
get_cpu_cap() is called.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/c817eb373d2c67c2c81413a70fc9b845fa34a37e.1484705016.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:31 +02:00
Juergen Gross
399a9d0cc4 x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
commit 74899d92e66663dc7671a8017b3146dcd4735f3b upstream.

Commit:

  1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD")

... added speculative_store_bypass_ht_init() to the per-CPU initialization sequence.

speculative_store_bypass_ht_init() needs to be called on each CPU for
PV guests, too.

Reported-by: Brian Woods <brian.woods@amd.com>
Tested-by: Brian Woods <brian.woods@amd.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: xen-devel@lists.xenproject.org
Fixes: 1f50ddb4f4189243c05926b842dc1a0332195f31 ("x86/speculation: Handle HT correctly on AMD")
Link: https://lore.kernel.org/lkml/20180621084331.21228-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:31 +02:00
Konrad Rzeszutek Wilk
cadb98135d x86/bugs: Rename SSBD_NO to SSB_NO
commit 240da953fcc6a9008c92fae5b1f727ee5ed167ab upstream

The "336996 Speculative Execution Side Channel Mitigations" from
May defines this as SSB_NO, hence lets sync-up.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:31 +02:00
Thomas Gleixner
48805280d0 x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
commit 47c61b3955cf712cadfc25635bf9bc174af030ea upstream

Add the necessary logic for supporting the emulated VIRT_SPEC_CTRL MSR to
x86_virt_spec_ctrl().  If either X86_FEATURE_LS_CFG_SSBD or
X86_FEATURE_VIRT_SPEC_CTRL is set then use the new guest_virt_spec_ctrl
argument to check whether the state must be modified on the host. The
update reuses speculative_store_bypass_update() so the ZEN-specific sibling
coordination can be reused.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:31 +02:00
Thomas Gleixner
80d7439fb0 x86/bugs: Rework spec_ctrl base and mask logic
commit be6fcb5478e95bb1c91f489121238deb3abca46a upstream

x86_spec_ctrL_mask is intended to mask out bits from a MSR_SPEC_CTRL value
which are not to be modified. However the implementation is not really used
and the bitmask was inverted to make a check easier, which was removed in
"x86/bugs: Remove x86_spec_ctrl_set()"

Aside of that it is missing the STIBP bit if it is supported by the
platform, so if the mask would be used in x86_virt_spec_ctrl() then it
would prevent a guest from setting STIBP.

Add the STIBP bit if supported and use the mask in x86_virt_spec_ctrl() to
sanitize the value which is supplied by the guest.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:31 +02:00
Thomas Gleixner
90cfa767bc x86/bugs: Remove x86_spec_ctrl_set()
commit 4b59bdb569453a60b752b274ca61f009e37f4dae upstream

x86_spec_ctrl_set() is only used in bugs.c and the extra mask checks there
provide no real value as both call sites can just write x86_spec_ctrl_base
to MSR_SPEC_CTRL. x86_spec_ctrl_base is valid and does not need any extra
masking or checking.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:31 +02:00
Thomas Gleixner
9ed7ee52e4 x86/bugs: Expose x86_spec_ctrl_base directly
commit fa8ac4988249c38476f6ad678a4848a736373403 upstream

x86_spec_ctrl_base is the system wide default value for the SPEC_CTRL MSR.
x86_spec_ctrl_get_default() returns x86_spec_ctrl_base and was intended to
prevent modification to that variable. Though the variable is read only
after init and globaly visible already.

Remove the function and export the variable instead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:30 +02:00
Borislav Petkov
d5aec90670 x86/bugs: Unify x86_spec_ctrl_{set_guest, restore_host}
commit cc69b34989210f067b2c51d5539b5f96ebcc3a01 upstream

Function bodies are very similar and are going to grow more almost
identical code. Add a bool arg to determine whether SPEC_CTRL is being set
for the guest or restored to the host.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:30 +02:00