Commits 4a31276930 ("x86: Turn the
copy_from_user check into an (optional) compile time warning")
and 63312b6a6f ("x86: Add a
Kconfig option to turn the copy_from_user warnings into errors")
touched only the 32-bit variant of copy_from_user(), whereas the
original commit 9f0cf4adb6 ("x86:
Use __builtin_object_size() to validate the buffer size for
copy_from_user()") also added the same code to the 64-bit one.
Further the earlier conversion from an inline WARN() to the call
to copy_from_user_overflow() went a little too far: When the
number of bytes to be copied is not a constant (e.g. [looking at
3.11] in drivers/net/tun.c:__tun_chr_ioctl() or
drivers/pci/pcie/aer/aer_inject.c:aer_inject_write()), the
compiler will always have to keep the funtion call, and hence
there will always be a warning. By using __builtin_constant_p()
we can avoid this.
And then this slightly extends the effect of
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS in that apart from
converting warnings to errors in the constant size case, it
retains the (possibly wrong) warnings in the non-constant size
case, such that if someone is prepared to get a few false
positives, (s)he'll be able to recover the current behavior
(except that these diagnostics now will never be converted to
errors).
Since the 32-bit variant (intentionally) didn't call
might_fault(), the unification results in this being called
twice now. Adding a suitable #ifdef would be the alternative if
that's a problem.
I'd like to point out though that with
__compiletime_object_size() being restricted to gcc before 4.6,
the whole construct is going to become more and more pointless
going forward. I would question however that commit
2fb0815c9e ("gcc4: disable
__compiletime_object_size for GCC 4.6+") was really necessary,
and instead this should have been dealt with as is done here
from the beginning.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5265056D02000078000FC4F3@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the paravirt spinlock optimizations went into the v3.12 kernel
we have a very good performance benefit for paravirtualized KVM / Xen
kernels. Also we no longer suffer from 5% side effect on native
kernel that is mentioned in the Kconfig entry.
So update the Kconfig entry accordingly.
pvspinlock benefit on KVM link:
https://lkml.org/lkml/2013/8/6/178
Attilio's tests on native kernel impact:
http://blog.xen.org/index.php/2012/05/11/benchmarking-the-new-pv-ticketlock-implementation/
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: <konrad.wilk@oracle.com>
Cc: <linux@eikelenboom.it>
Cc: <gleb@redhat.com>
Cc: <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1382371508-3843-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com
[ Updated the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull MCE updates from Tony Luck:
"There is a enhanced error logging mechanism for Xeon processors.
Full description is here:
http://www.intel.com/content/www/us/en/architecture-and-technology/enhanced-mca-logging-xeon-paper.html
This patch series provides a module (and support code) to
check for an extended error log and print extra details about
the error on the console.
"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Introduce xen_dma_map_page, xen_dma_unmap_page,
xen_dma_sync_single_for_cpu and xen_dma_sync_single_for_device.
They have empty implementations on x86 and ia64 but they call the
corresponding platform dma_ops function on arm and arm64.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Changes in v9:
- xen_dma_map_page return void, avoid page_to_phys.
Several architectures open code effectively the same code block for
finding and mapping PCI irqs. This patch consolidates it down to a
single function.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
All the callers of irq_create_of_mapping() pass the contents of a struct
of_phandle_args structure to the function. Since all the callers already
have an of_phandle_args pointer, why not pass it directly to
irq_create_of_mapping()?
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
struct of_irq and struct of_phandle_args are exactly the same structure.
This patch makes the kernel use of_phandle_args everywhere. This in
itself isn't a big deal, but it makes some follow-on patches simpler.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The OF irq handling code has been overloading the term 'map' to refer to
both parsing the data in the device tree and mapping it to the internal
linux irq system. This is probably because the device tree does have the
concept of an 'interrupt-map' function for translating interrupt
references from one node to another, but 'map' is still confusing when
the primary purpose of some of the functions are to parse the DT data.
This patch renames all the of_irq_map_* functions to of_irq_parse_*
which makes it clear that there is a difference between the parsing
phase and the mapping phase. Kernel code can make use of just the
parsing or just the mapping support as needed by the subsystem.
The patch was generated mechanically with a handful of sed commands.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Conflicts:
drivers/net/usb/qmi_wwan.c
include/net/dst.h
Trivial merge conflicts, both were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Architectures which support CONFIG_PARPORT_PC should select
ARCH_MIGHT_HAVE_PC_PARPORT.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Ingo Molnar <mingo@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: x86@kernel.org
In latest UEFI spec(by now it is 2.4) memory error definition
for CPER (UEFI 2.4 Appendix N Common Platform Error Record)
adds some new fields. These fields help people to locate
memory error to an actual DIMM location.
Original-author: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch adds a new interface to decode memory device (type 17)
to help error reporting on DIMMs.
Original-author: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This H/W error log driver (a.k.a eMCA driver) is implemented based on
http://www.intel.com/content/www/us/en/architecture-and-technology/enhanced-mca-logging-xeon-paper.html
After errors are captured, more detailed platform specific information
can be got via this new enhanced H/W error log driver. Most notably we
can track memory errors back to the DIMM slot silk screen label.
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Pull networking fixes from David Miller:
"Sorry I let so much accumulate, I was in Buffalo and wanted a few
things to cook in my tree for a while before sending to you. Anyways,
it's a lot of little things as usual at this stage in the game"
1) Make bonding MAINTAINERS entry reflect reality, from Andy
Gospodarek.
2) Fix accidental sock_put() on timewait mini sockets, from Eric
Dumazet.
3) Fix crashes in l2tp due to mis-handling of ipv4 mapped ipv6
addresses, from François CACHEREUL.
4) Fix heap overflow in __audit_sockaddr(), from the eagle eyed Dan
Carpenter.
5) tcp_shifted_skb() doesn't take handle FINs properly, from Eric
Dumazet.
6) SFC driver bug fixes from Ben Hutchings.
7) Fix TX packet scheduling wedge after channel change in ath9k driver,
from Felix Fietkau.
8) Fix user after free in BPF JIT code, from Alexei Starovoitov.
9) Source address selection test is reversed in
__ip_route_output_key(), fix from Jiri Benc.
10) VLAN and CAN layer mis-size netlink attributes, from Marc
Kleine-Budde.
11) Fix permission checks in sysctls to use current_euid() instead of
current_uid(). From Eric W Biederman.
12) IPSEC policies can go away while a timer is still pending for them,
add appropriate ref-counting to fix, from Steffen Klassert.
13) Fix mis-programming of FDR and RMCR registers on R8A7740 sh_eth
chips, from Nguyen Hong Ky and Simon Horman.
14) MLX4 forgets to DMA unmap pages on RX, fix from Amir Vadai.
15) IPV6 GRE tunnel MTU upper limit is miscalculated, from Oussama
Ghorbel.
16) Fix typo in fq_change(), we were assigning "initial quantum" to
"quantum". From Eric Dumazet.
17) Set a more appropriate sk_pacing_rate for non-TCP sockets, otherwise
FQ packet scheduler does not pace those flows properly. Also from
Eric Dumazet.
18) rtlwifi miscalculates packet pointers, from Mark Cave-Ayland.
19) l2tp_xmit_skb() can be called from process context, not just softirq
context, so we must always make sure to BH disable around it. From
Eric Dumazet.
20) On qdisc reset, we forget to purge the RB tree of SKBs in netem
packet scheduler. From Stephen Hemminger.
21) Fix info leak in farsync WAN driver ioctl() handler, from Dan
Carpenter and Salva Peiró.
22) Fix PHY reset and other issues in dm9000 driver, from Nikita
Kiryanov and Michael Abbott.
23) When hardware can do SCTP crc32 checksums, we accidently don't
disable the csum offload when IPSEC transformations have been
applied. From Fan Du and Vlad Yasevich.
24) Tail loss probing in TCP leaves the socket in the wrong congestion
avoidance state. From Yuchung Cheng.
25) In CPSW driver, enable NAPI before interrupts are turned on, from
Markus Pargmann.
26) Integer underflow and dual-assignment in YAM hamradio driver, from
Dan Carpenter.
27) If we are going to mangle a packet in tcp_set_skb_tso_segs() we must
unclone it. This fixes various hard to track down crashes in
drivers where the SKBs ->gso_segs was changing right from underneath
the driver during TX queueing. From Eric Dumazet.
28) Fix the handling of VLAN IDs, and in particular the special IDs 0
and 4095, in the bridging layer. From Toshiaki Makita.
29) Another info leak, this time in wanxl WAN driver, from Salva Peiró.
30) Fix race in socket credential passing, from Daniel Borkmann.
31) WHen NETLABEL is disabled, we don't validate CIPSO packets properly,
from Seif Mazareeb.
32) Fix identification of fragmented frames in ipv4/ipv6 UDP
Fragmentation Offload output paths, from Jiri Pirko.
33) Virtual Function fixes in bnx2x driver from Yuval Mintz and Ariel
Elior.
34) When we removed the explicit neighbour pointer from ipv6 routes a
slight regression was introduced for users such as IPVS, xt_TEE, and
raw sockets. We mix up the users requested destination address with
the routes assigned nexthop/gateway. From Julian Anastasov and
Simon Horman.
35) Fix stack overruns in rt6_probe(), the issue is that can end up
doing two full packet xmit paths at the same time when emitting
neighbour discovery messages. From Hannes Frederic Sowa.
36) davinci_emac driver doesn't handle IFF_ALLMULTI correctly, from
Mariusz Ceier.
37) Make sure to set TCP sk_pacing_rate after the first legitimate RTT
sample, from Neal Cardwell.
38) Wrong netlink attribute passed to xfrm_replay_verify_len(), from
Steffen Klassert.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (152 commits)
ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter
ax88179_178a: Correct the RX error definition in RX header
Revert "bridge: only expire the mdb entry when query is received"
tcp: initialize passive-side sk_pacing_rate after 3WHS
davinci_emac.c: Fix IFF_ALLMULTI setup
mac802154: correct a typo in ieee802154_alloc_device() prototype
ipv6: probe routes asynchronous in rt6_probe
netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper
ipv6: fill rt6i_gateway with nexthop address
ipv6: always prefer rt6i_gateway if present
bnx2x: Set NETIF_F_HIGHDMA unconditionally
bnx2x: Don't pretend during register dump
bnx2x: Lock DMAE when used by statistic flow
bnx2x: Prevent null pointer dereference on error flow
bnx2x: Fix config when SR-IOV and iSCSI are enabled
bnx2x: Fix Coalescing configuration
bnx2x: Unlock VF-PF channel on MAC/VLAN config error
bnx2x: Prevent an illegal pointer dereference during panic
bnx2x: Fix Maximum CoS estimation for VFs
drivers: net: cpsw: fix kernel warn during iperf test with interrupt pacing
...
net_get_random_once(intrduced in the next patch) uses static_keys in
a way that they get enabled on boot-up instead of replaced with an
ideal_nop. So check for default_nop on initial enabling.
Other architectures don't check for this.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: x86@kernel.org
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Intel rolling out more SoC's after Moorestown, we need to
re-structure the code in a way that is backward compatible and easy to
expand. This patch implements a flexible way to support multiple boards
and devices.
This patch does not add any new functional support. It just refactors
the existing code to increase the modularity and decrease the code
duplication for supporting multiple soc's and boards.
Currently intel-mid.c has both board and soc related code in one file.
This patch moves the board related code to new files and let linker
script to create SFI devite table following this:
1. Move the SFI device specific code to
arch/x86/platform/intel-mid/device-libs/platform_<device>.*
A new device file is added for every supported device. This code will
get conditionally compiled by using corresponding device driver
CONFIG option.
2. Move the device_ids location to .x86_intel_mid_dev.init section by
using new sfi_device() macro.
This patch was based on previous code from Sathyanarayanan Kuppuswamy.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-13-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
When Intel mid uses SFI table to enumerate devices, it requires an extra
device table with further information about how to probe such devices.
This patch creates a section where the device table will stay if
CONFIG_X86_INTEL_MID is selected.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-12-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Intel mid sfi code doesn't need struct devs_id.get_platform_data != NULL.
If the callback is not set, just assume there is no platform_data.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-11-git-send-email-david.a.cohen@linux.intel.com
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Moved SFI specific parsing/handling code to sfi.c. This will enable us
to reuse our intel-mid code for platforms that supports firmware
interfaces other than SFI (like ACPI).
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-10-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Added a custom handler for medfield based ipc devices and
moved devs_id structure defintion to header file.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-9-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch provides a means to add custom handler for
SFI devices. If you set device_handler as NULL in
device_id table standard SFI device handler will be used.
If its not NULL custom handler will be called.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-8-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
SFI device_id[] table parsing code is duplicated in every SFI
device handler. This patch removes this code duplication, by
adding a seperate function get_device_id() to parse through the
device table. Also this patch moves the SPI, I2C, IPC info code from
sfi_parse_devs() to respective device handlers.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-7-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
mrst is used as common name to represent all intel_mid type
soc's. But moorsetwon is just one of the intel_mid soc. So
renamed them to use intel_mid.
This patch mainly renames the variables and related
functions that uses *mrst* prefix with *intel_mid*.
To ensure that there are no functional changes, I have compared
the objdump of related files before and after rename and found
the only difference is symbol and name changes.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-6-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Function 'type1_access_ok' should return bool value, not 0/1.
This patch changes 'return 0/1' to 'return false/true'.
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-5-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Following files contains code that is common to all intel mid
soc's. So renamed them as below.
mrst/mrst.c -> intel-mid/intel-mid.c
mrst/vrtc.c -> intel-mid/intel_mid_vrtc.c
mrst/early_printk_mrst.c -> intel-mid/intel_mid_vrtc.c
pci/mrst.c -> pci/intel_mid_pci.c
Also, renamed the corresponding header files and made changes
to the driver files that included these header files.
To ensure that there are no functional changes, I have compared
the objdump of renamed files before and after rename and found
that the only difference is file name change.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-4-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Fixed indentation issues reported by checkpatch script in
mrst related files.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-3-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Fixed printk and pr_* related issues in mrst related files.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: http://lkml.kernel.org/r/1382049336-21316-2-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We will use that in the later patch to find the kvm ops handler
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Having 64-bit MSR access methods on given CPU can avoid shifting and
simplify MSR content manipulation. We already have other combinations
of rdmsrl_xxx and wrmsrl_xxx but missing the _on_cpu version.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There's been reports of high NMI handler overhead, highlighted by
such kernel messages:
[ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
[ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs
Don Zickus analyzed the source of the overhead and reported:
> While there are a few places that are causing latencies, for now I focused on
> the longest one first. It seems to be 'copy_user_from_nmi'
>
> intel_pmu_handle_irq ->
> intel_pmu_drain_pebs_nhm ->
> __intel_pmu_drain_pebs_nhm ->
> __intel_pmu_pebs_event ->
> intel_pmu_pebs_fixup_ip ->
> copy_from_user_nmi
>
> In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of
> all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles
> (there are some cases where only 10 iterations are needed to go that high
> too, but in generall over 50 or so). At this point copy_user_from_nmi
> seems to account for over 90% of the nmi latency.
The solution to that is to avoid having to call copy_from_user_nmi() for
every instruction.
Since we already limit the max basic block size, we can easily
pre-allocate a piece of memory to copy the entire thing into in one
go.
Don reported this test result:
> Your patch made a huge difference in improvement. The
> copy_from_user_nmi() no longer hits the million of cycles. I still
> have a batch of 100,000-300,000 cycles. My longest NMI paths used
> to be dominated by copy_from_user_nmi, now it is not (I have to dig
> up the new hot path).
Reported-and-tested-by: Don Zickus <dzickus@redhat.com>
Cc: jmario@redhat.com
Cc: acme@infradead.org
Cc: dave.hansen@linux.intel.com
Cc: eranian@google.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJSXSSbAAoJEIlPj0hw4a6QkIoQAMe69LtP8x4l7ZfgqT/QxwHK
COGw+KpPXZPxASpCJO9aTXW/Mr54X9BpC4if6OmpRA6BXyU2sTrp2TIeuPbZCFvg
DjFy4g6Snd9yvb0FPbIcUbMbUZUWmLA05PstUZV+FNub2l9ie7WWzfjyVNQv/izA
AJ6J5GDSPxHegoygeBCCrn0Qc5rgN6soxhRDndmVGLxlBNRs71ORojRMGUrWY2X/
OhxSj97hCXZHZyaQBI2JfuNzmFAVGvdLVN3NGzROKK1u8zKGn1I1kL39OU1QkCHZ
iUmBeJt7jzetS85XcDxCzcErWs6K6HjFqiSF1tC6GGOtX73oqNqV4weIzNSdO+lc
tLUGDfJSJdaVB6tNoGsywqx380CHyJi6WjUR21o2RSW8TtsNrzkpqLE0Aqv84tnl
dyTgykzmpKhtmchjHHO4mGDioCRDYLcok3gol2IRc6P+BYXzHgwvWkH/TWMhvVjK
IHGTd+cuUqrs7MmuxGZ/kQiqV6OegdFZYrW6E6vulITUMWNPxUaXFBu9njzrMxGt
2bLeC+xDTi2BW9ihaln+zVUxmHZ2TLq6fHMZ/XTWrwGNNdUhKU/jNIWqC25MxSX8
vOVnCLvT+AqqSU82ng3JL5LvnZFGlAd/cUdVEdo723xRNz6OgWLbrhi7Pq/OkjIJ
9VKoA3zX8mgiMl1Wgp9t
=wJOk
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fixes from Stefano Stabellini:
"A small fix for Xen on x86_32 and a build fix for xen-tpmfront on
arm64"
* tag 'stable/for-linus-3.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Fix possible user space selector corruption
tpm: xen-tpmfront: fix missing declaration of xen_domain
We use jump label to enable pv-spinlock. With the changes in (442e0973e9
Merge branch 'x86/jumplabel'), the jump label behaviour has changed
that would result in eventual hang of the VM since we would end up in a
situation where slow path locks would halt the vcpus but we will not be
able to wakeup the vcpu by lock releaser using unlock kick.
Similar problem in Xen and more detailed description is available in
a945928ea2 (xen: Do not enable spinlocks before jump_label_init()
has executed)
This patch splits kvm_spinlock_init to separate jump label changes with
pvops patching and also make jump label enabling after jump_label_init().
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Page pinning is not mandatory in kvm async page fault processing since
after async page fault event is delivered to a guest it accesses page once
again and does its own GUP. Drop the FOLL_GET flag in GUP in async_pf
code, and do some simplifying in check/clear processing.
Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
The UV3 hub revision ID is different than expected. The first
revision was supposed to start at 1 but instead will start at 0.
Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: <stable@kernel.org> # v3.9, v3.10, v3.11
Link: http://lkml.kernel.org/r/20131014161733.GA6274@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Correct common misspelling of "identify" as "indentify" throughout
the kernel
Signed-off-by: Maxime Jayat <maxime@artisandeveloppeur.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
I have a randconfig here which has enabled only
CONFIG_MICROCODE=y
CONFIG_MICROCODE_OLD_INTERFACE=y
with both
# CONFIG_MICROCODE_INTEL is not set
# CONFIG_MICROCODE_AMD is not set
off. Which makes building the microcode functionality a little
pointless. Don't do that in such cases then.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1381682189-14470-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an unconvenient way for other architectures. Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Pull gcc "asm goto" miscompilation workaround from Ingo Molnar:
"This is the fix for the GCC miscompilation discussed in the following
lkml thread:
[x86] BUG: unable to handle kernel paging request at 00740060
The bug in GCC has been fixed by Jakub and the fix will be part of the
GCC 4.8.2 release expected to be released next week - so the quirk's
version test checks for <= 4.8.1.
The quirk is only added to compiler-gcc4.h and not to the higher level
compiler.h because all asm goto uses are behind a feature check"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
Pull x86 fixes from Ingo Molnar:
"A build fix and a reboot quirk"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/reboot: Add reboot quirk for Dell Latitude E5410
x86, build, pci: Fix PCI_MSI build on !SMP
Apply the asm_volatile_goto() compiler quirk to the new rmwcc.h
file as well, introduced in:
c2daa3bed5 sched, x86: Provide a per-cpu preempt_count implementation
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
Implement a workaround suggested by Jakub Jelinek.
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>