Add Intel CPUID for Intel Secure Hash Algorithm Extensions. This feature
provides new instructions for accelerated computation of SHA-1 and SHA-256.
This allows the feature to be shown in the /proc/cpuinfo for cpus that
support it.
Refer to SHA extension programming guide in chapter 8.2 of the Intel
Architecture Instruction Set Extensions Programming reference
for definition of this feature's cpuid: CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29] = 1
https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
Originally-by: Chandramouli Narayanan <mouli_7982@yahoo.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Link: http://lkml.kernel.org/r/1440194206.3940.6.camel@schen9-mobl2
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some in-flight code makes use of the old rdtscll() (now removed), provide a wrapper
for a kernel cycle to smooth the transition to rdtsc().
( We use the safest variant, rdtsc_ordered(), which has barriers - this adds another
incentive to remove the wrapper in the future. )
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/dddbf98a2af53312e9aa73a5a2b1622fe5d6f52b.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add support for two new PMEM APIs, copy_from_iter_pmem() and
clear_pmem(). copy_from_iter_pmem() is used to copy data from an
iterator into a PMEM buffer. clear_pmem() zeros a PMEM memory range.
Both of these new APIs must be explicitly ordered using a wmb_pmem()
function call and are implemented in such a way that the wmb_pmem()
will make the stores to PMEM durable. Because both APIs are unordered
they can be called as needed without introducing any unwanted memory
barriers.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Prior to this change x86_64 used the pmem defines in
arch/x86/include/asm/pmem.h, and UM used the default ones at the
top of include/linux/pmem.h. The inclusion or exclusion in linux/pmem.h
was controlled by CONFIG_ARCH_HAS_PMEM_API, but the ones in asm/pmem.h
were controlled by ARCH_HAS_NOCACHE_UACCESS.
Instead, control them both with CONFIG_ARCH_HAS_PMEM_API so that it's
clear that they are related and we don't run into the possibility where
they are both included or excluded. Also remove a bunch of stale
function prototypes meant for UM in asm/pmem.h - these just conflicted
with the inline defaults in linux/pmem.h and gave compile errors.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Prior to this change arch_has_wmb_pmem() was only called by
arch_has_pmem_api(). Both arch_has_wmb_pmem() and arch_has_pmem_api()
checked to make sure that CONFIG_ARCH_HAS_PMEM_API was enabled.
Instead, remove the old arch_has_wmb_pmem() wrapper to be rid of one
extra layer of indirection and the redundant CONFIG_ARCH_HAS_PMEM_API
check. Rename __arch_has_wmb_pmem() to arch_has_wmb_pmem() since we no
longer have a wrapper, and just have arch_has_pmem_api() call the
architecture specific arch_has_wmb_pmem() directly.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Move the x86 PMEM API implementation out of asm/cacheflush.h and into
its own header asm/pmem.h. This will allow members of the PMEM API to
be more easily identified on this and other architectures.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Map shared data structure that will hold CPU registers, VPMU context,
V/PCPU IDs of the CPU interrupted by PMU interrupt. Hypervisor fills
this information in its handler and passes it to the guest for further
processing.
Set up PMU VIRQ.
Now that perf infrastructure will assume that PMU is available on a PV
guest we need to be careful and make sure that accesses via RDPMC
instruction don't cause fatal traps by the hypervisor. Provide a nop
RDPMC handler.
For the same reason avoid issuing a warning on a write to APIC's LVTPC.
Both of these will be made functional in later patches.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cleanup by removing arch/x86/xen/p2m.h as it isn't needed any more.
Most definitions in this file are used in p2m.c only. Move those into
p2m.c.
set_phys_range_identity() is already declared in
arch/x86/include/asm/xen/page.h, add __init annotation there.
MAX_REMAP_RANGES isn't used at all, just delete it.
The only define left is P2M_PER_PAGE which is moved to page.h as well.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
64 bit pv-domains under Xen are limited to 512 GB of RAM today. The
main reason has been the 3 level p2m tree, which was replaced by the
virtual mapped linear p2m list. Parallel to the p2m list which is
being used by the kernel itself there is a 3 level mfn tree for usage
by the Xen tools and eventually for crash dump analysis. For this tree
the linear p2m list can serve as a replacement, too. As the kernel
can't know whether the tools are capable of dealing with the p2m list
instead of the mfn tree, the limit of 512 GB can't be dropped in all
cases.
This patch replaces the hard limit by a kernel parameter which tells
the kernel to obey the 512 GB limit or not. The default is selected by
a configuration parameter which specifies whether the 512 GB limit
should be active per default for domUs (domain save/restore/migration
and crash dump analysis are affected).
Memory above the domain limit is returned to the hypervisor instead of
being identity mapped, which was wrong anyway.
The kernel configuration parameter to specify the maximum size of a
domain can be deleted, as it is not relevant any more.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Use the newest headers from the xen tree to get some new structure
layouts.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Currently, the event channel rebind code is gated with the presence of
the vector callback.
The virtual interrupt controller on ARM has the concept of per-CPU
interrupt (PPI) which allow us to support per-VCPU event channel.
Therefore there is no need of vector callback for ARM.
Xen is already using a free PPI to notify the guest VCPU of an event.
Furthermore, the xen code initialization in Linux (see
arch/arm/xen/enlighten.c) is requesting correctly a per-CPU IRQ.
Introduce new helper xen_support_evtchn_rebind to allow architecture
decide whether rebind an event is support or not. It will always return
true on ARM and keep the same behavior on x86.
This is also allow us to drop the usage of xen_have_vector_callback
entirely in the ARM code.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
This reverts commit:
2c7577a758 ("sched/x86_64: Don't save flags on context switch")
It was a nice speedup. It's also not quite correct: SYSENTER
enables interrupts too early.
We can re-add this optimization once the SYSENTER code is beaten
into shape, which should happen in 4.3 or 4.4.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # v3.19
Link: http://lkml.kernel.org/r/85f56651f59f76624e80785a8fd3bdfdd089a818.1439838962.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Both the per-APIC flag ".wait_for_init_deassert",
and the global atomic_t "init_deasserted"
are dead code -- remove them.
For all APIC types, "wait_for_master()"
prevents an AP from proceeding until the BSP has set
cpu_callout_mask, making "init_deasserted" {unnecessary}:
BSP: <de-assert INIT>
...
BSP: {set init_deasserted}
AP: wait_for_master()
set cpu_initialized_mask
wait for cpu_callout_mask
BSP: test cpu_initialized_mask
BSP: set cpu_callout_mask
AP: test cpu_callout_mask
AP: {wait for init_deasserted}
...
AP: <touch APIC>
Deleting the {dead code} above is necessary to enable
some parallelism in a future patch.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/de4b3a9bab894735e285870b5296da25ee6a8a5a.1439739165.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
VMX encodes access rights differently from LAR, and the latter is
most likely what x86 people think of when they think of "access
rights".
Rename them to avoid confusion.
Cc: kvm@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Kill arch_memremap_pmem() and just let the architecture specify the
flags to be passed to memremap(). Default to writethrough by default.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This reverts commits 9a036b93a3 ("x86/signal/64: Remove 'fs' and 'gs'
from sigcontext") and c6f2062935 ("x86/signal/64: Fix SS handling for
signals delivered to 64-bit programs").
They were cleanups, but they break dosemu by changing the signal return
behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
not actually changing any behavior - causes build problems).
Reported-and-tested-by: Stas Sergeev <stsp@list.ru>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kexec could boot a kernel that could be legacy with no knowledge
of LMCE. Hence we should make sure we clear LMCE optin before
kexec reboot.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1439396985-12812-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This used to flush out MCEs logged during early boot and which
were in the MCA registers from a previous system run. No need
for that now, since we've moved to a genpool.
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use unified genpool to save Action Optional error events and put
Action Optional error handling in the same notification chain as
MCE error decoding.
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
[ Fold in subsequent patch from Boris for early boot logging. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Correct a lot. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Boris reported that gcc version 4.4.4 20100503 (Red Hat
4.4.4-2) fails to build linux-next kernels that have
this fresh commit via the locking tree:
11276d5306 ("locking/static_keys: Add a new static_key interface")
The problem appears to be that even though @key and @branch are
compile time constants, it doesn't see the following expression
as an immediate value:
&((char *)key)[branch]
More recent GCCs don't appear to have this problem.
In particular, Red Hat backported the 'asm goto' feature into 4.4,
'normal' 4.4 compilers will not have this feature and thus not
run into this asm.
The workaround is to supply both values to the asm as immediates
and do the addition in asm.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the following commit:
536fa40222 ("compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()")
smp_store_release() supports byte accesses, so use that in writer unlock
and remove the conditional macro override.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Waiman Long <Waiman.Long@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1438880084-18856-6-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
... to allow an arch specific implementation of getting page
protection type associated with a physical address.
On x86, we currently have no way to look up the EFI memory map
attributes for a region in a consistent way, because the
memmap is discarded after efi_free_boot_services(). So if
you call efi_mem_attributes() during boot and at runtime,
you could theoretically see different attributes.
Since we are yet to see any x86 platforms that require
anything other than PAGE_KERNEL (some arm64 platforms
require the equivalent of PAGE_KERNEL_NOCACHE), return that
until we know differently.
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438936621-5215-5-git-send-email-matt@codeblueprint.co.uk
[ Small fixes to spelling. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We can spare the irq_desc lookup in the interrupt entry code if we
store the descriptor pointer in the vector array instead the interrupt
number.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/20150802203609.717724106@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
VECTOR_UNDEFINED is a misnomer. The vector is defined, but unused.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/20150802203609.477282494@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We have abstracted the data struct and functions which are used to check
reserved bit on guest page tables, now we extend the logic to check
zero bits on shadow page tables
The zero bits on sptes include not only reserved bits on hardware but also
the bits that SPTEs willnever use. For example, shadow pages will never
use GB pages unless the guest uses them too.
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These two fields, rsvd_bits_mask and bad_mt_xwr, in "struct kvm_mmu" are
used to check if reserved bits set on guest ptes, move them to a data
struct so that the approach can be applied to check host shadow page
table entries as well
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
They are no longer used. Good riddance!
Deleting the TIF_ macros is really nice. It was never clear why
there were so many variants.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/22c61682f446628573dde0f1d573ab821677e06da.1438378274.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With this config:
http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os
gcc-4.7.2 generates many copies of these tiny functions:
__arch_hweight32 (35 copies):
55 push %rbp
e8 66 9b 4a 00 callq __sw_hweight32
48 89 e5 mov %rsp,%rbp
5d pop %rbp
c3 retq
__arch_hweight64 (8 copies):
55 push %rbp
e8 5e c2 8a 00 callq __sw_hweight64
48 89 e5 mov %rsp,%rbp
5d pop %rbp
c3 retq
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
This patch fixes this via s/inline/__always_inline/
To avoid touching 32-bit case where such change was not tested
to be a win, reformat __arch_hweight64() to have completely
disjoint 64-bit and 32-bit implementations. IOW: made #ifdef /
32 bits and 64 bits instead of having #ifdef / #else / #endif
inside a single function body. Only 64-bit __arch_hweight64() is
__always_inline'd.
text data bss dec filename
86971120 17195912 36659200 140826232 vmlinux.before
86970954 17195912 36659200 140826066 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1438697716-28121-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Hypervisor Top Level Functional Specification v3.1/4.0 notes that cpuid
(0x40000003) EDX's 10th bit should be used to check that Hyper-V guest
crash MSR's functionality available.
This patch should fix this recognition. Currently the code checks EAX
register instead of EDX.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Full kernel hang is observed when kdump kernel starts after a crash. This
hang happens in vmbus_negotiate_version() function on
wait_for_completion() as Hyper-V host (Win2012R2 in my testing) never
responds to CHANNELMSG_INITIATE_CONTACT as it thinks the connection is
already established. We need to perform some mandatory minimalistic
cleanup before we start new kernel.
Reported-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When general-purpose kexec (not kdump) is being performed in Hyper-V guest
the newly booted kernel fails with an MCE error coming from the host. It
is the same error which was fixed in the "Drivers: hv: vmbus: Implement
the protocol for tearing down vmbus state" commit - monitor pages remain
special and when they're being written to (as the new kernel doesn't know
these pages are special) bad things happen. We need to perform some
minimalistic cleanup before booting a new kernel on kexec. To do so we
need to register a special machine_ops.shutdown handler to be executed
before the native_machine_shutdown(). Registering a shutdown notification
handler via the register_reboot_notifier() call is not sufficient as it
happens to early for our purposes. machine_ops is not being exported to
modules (and I don't think we want to export it) so let's do this in
mshyperv.c
The minimalistic cleanup consists of cleaning up clockevents, synic MSRs,
guest os id MSR, and hypercall MSR.
Kdump doesn't require all this stuff as it lives in a separate memory
space.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add new MSRs (LBR_INFO) and some new MSR bits used by the Intel Skylake
PMU driver.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285767-27027-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
PEBSv3 has a raw TSC time stamp in its memory buffer that
later needs to to be converted to perf_clock.
Add a native_sched_clock_from_tsc() that works the same
as native_sched_clock(), but starts with an already given
TSC value.
Paravirt is ignored, it will just get the native clock.
But there isn't a para virtualized PEBS anyway.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285767-27027-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Intel PT chapter in the new Intel Architecture SDM adds several packets
corresponding enable bits and registers that control packet generation.
Also, additional bits in the Intel PT CPUID leaf were added to enumerate
presence and parameters of these new packets and features.
The packets and enables are:
* CYC: cycle accurate mode, provides the number of cycles elapsed since
previous CYC packet; its presence and available threshold values are
enumerated via CPUID;
* MTC: mini time counter packets, used for tracking TSC time between
full TSC packets; its presence and available resolution options are
enumerated via CPUID;
* PSB packet period is now configurable, available period values are
enumerated via CPUID.
This patch adds corresponding bit and register definitions, pmu driver
capabilities based on CPUID enumeration, new attribute format bits for
the new featurens and extends event configuration validation function
to take these into account.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1438262131-12725-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There are various problems and short-comings with the current
static_key interface:
- static_key_{true,false}() read like a branch depending on the key
value, instead of the actual likely/unlikely branch depending on
init value.
- static_key_{true,false}() are, as stated above, tied to the
static_key init values STATIC_KEY_INIT_{TRUE,FALSE}.
- we're limited to the 2 (out of 4) possible options that compile to
a default NOP because that's what our arch_static_branch() assembly
emits.
So provide a new static_key interface:
DEFINE_STATIC_KEY_TRUE(name);
DEFINE_STATIC_KEY_FALSE(name);
Which define a key of different types with an initial true/false
value.
Then allow:
static_branch_likely()
static_branch_unlikely()
to take a key of either type and emit the right instruction for the
case.
This means adding a second arch_static_branch_jump() assembly helper
which emits a JMP per default.
In order to determine the right instruction for the right state,
encode the branch type in the LSB of jump_entry::key.
This is the final step in removing the naming confusion that has led to
a stream of avoidable bugs such as:
a833581e37 ("x86, perf: Fix static_key bug in load_mm_cr4()")
... but it also allows new static key combinations that will give us
performance enhancements in the subsequent patches.
Tested-by: Rabin Vincent <rabin@rab.in> # arm
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Upcoming changes to static keys is interacting/conflicting with the following
pending TSC commits in tip:x86/asm:
4ea1636b04 x86/asm/tsc: Rename native_read_tsc() to rdtsc()
...
So merge it into the locking tree to have a smoother resolution.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Replace ACCESS_ONCE() macro in smp_store_release() and smp_load_acquire()
with WRITE_ONCE() and READ_ONCE() on x86, arm, arm64, ia64, metag, mips,
powerpc, s390, sparc and asm-generic since ACCESS_ONCE() does not work
reliably on non-scalar types.
WRITE_ONCE() and READ_ONCE() were introduced in the following commits:
230fa253df ("kernel: Provide READ_ONCE and ASSIGN_ONCE")
43239cbe79 ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1438528264-714-1-git-send-email-andreyknvl@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Make it clearer that this is the pointer to the userspace vm86
state area.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438148483-11932-8-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
vm86.h was being implicitly included in alot of places via
processor.h, which in turn got it from math_emu.h. Break that
chain and explicitly include vm86.h in all files that need it.
Also remove unused vm86 field from math_emu_info.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438148483-11932-7-git-send-email-brgerst@gmail.com
[ Fixed build failure. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move vm86 specific definitions from irq_vectors.h to vm86.h.
Based on patch from Brian Gerst.
Originally-from: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438148483-11932-6-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>