Commit graph

108987 commits

Author SHA1 Message Date
Peter Zijlstra
b92b8b35a2 locking/arch: Rename set_mb() to smp_store_mb()
Since set_mb() is really about an smp_mb() -- not a IO/DMA barrier
like mb() rename it to match the recent smp_load_acquire() and
smp_store_release().

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19 08:32:00 +02:00
Peter Zijlstra
ab3f02fc23 locking/arch: Add WRITE_ONCE() to set_mb()
Since we assume set_mb() to result in a single store followed by a
full memory barrier, employ WRITE_ONCE().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19 08:31:59 +02:00
Thomas Gleixner
a22e5f579b arch: Remove __ARCH_HAVE_CMPXCHG
We removed the only user of this define in the rtmutex code. Get rid
of it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2015-05-13 10:55:42 +02:00
Waiman Long
c7114b4e6c locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
To be consistent with the queued spinlocks which use
CONFIG_QUEUED_SPINLOCKS config parameter, the one for the queued
rwlocks is now renamed to CONFIG_QUEUED_RWLOCKS.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431367031-36697-1-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-12 09:46:00 +02:00
Ingo Molnar
62c7a1e9ae locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
Valentin Rothberg reported that we use CONFIG_QUEUED_SPINLOCKS
in arch/x86/kernel/paravirt_patch_32.c, while the symbol is
called CONFIG_QUEUED_SPINLOCK. (Note the extra 'S')

But the typo was natural: the proper English term for such
a generic object would be 'queued spinlocks' - so rename
this and related symbols accordingly to the plural form.

Reported-by: Valentin Rothberg <valentinrothberg@gmail.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-11 09:52:09 +02:00
David Vrabel
e95e6f176c locking/pvqspinlock, x86: Enable PV qspinlock for Xen
This patch adds the necessary Xen specific code to allow Xen to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-12-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:37:18 +02:00
Waiman Long
bf0c7c34ad locking/pvqspinlock, x86: Enable PV qspinlock for KVM
This patch adds the necessary KVM specific code to allow KVM to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-11-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:37:17 +02:00
Peter Zijlstra (Intel)
f233f7f158 locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
We use the regular paravirt call patching to switch between:

  native_queued_spin_lock_slowpath()	__pv_queued_spin_lock_slowpath()
  native_queued_spin_unlock()		__pv_queued_spin_unlock()

We use a callee saved call for the unlock function which reduces the
i-cache footprint and allows 'inlining' of SPIN_UNLOCK functions
again.

We further optimize the unlock path by patching the direct call with a
"movb $0,%arg1" if we are indeed using the native unlock code. This
makes the unlock code almost as fast as the !PARAVIRT case.

This significantly lowers the overhead of having
CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-10-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:37:09 +02:00
Peter Zijlstra (Intel)
2aa79af642 locking/qspinlock: Revert to test-and-set on hypervisors
When we detect a hypervisor (!paravirt, see qspinlock paravirt support
patches), revert to a simple test-and-set lock to avoid the horrors
of queue preemption.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-8-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:36:58 +02:00
Waiman Long
d73a33973f locking/qspinlock, x86: Enable x86-64 to use queued spinlocks
This patch makes the necessary changes at the x86 architecture
specific layer to enable the use of queued spinlocks for x86-64. As
x86-32 machines are typically not multi-socket. The benefit of queue
spinlock may not be apparent. So queued spinlocks are not enabled.

Currently, there is some incompatibilities between the para-virtualized
spinlock code (which hard-codes the use of ticket spinlock) and the
queued spinlocks. Therefore, the use of queued spinlocks is disabled
when the para-virtualized spinlock is enabled.

The arch/x86/include/asm/qspinlock.h header file includes some x86
specific optimization which will make the queueds spinlock code
perform better than the generic implementation.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-3-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:36:26 +02:00
Linus Torvalds
3e0283a53f Power management and ACPI fixes for v4.1-rc3
- Fix for a PCI resources management regression introduced during
    the 4.0 cycle and related to the handling of ACPI resources'
    Producer/Consumer flags that turn out to be useless (Jiang Liu).
 
  - Fix for a MacBook regression related to the Smart Battery Subsystem
    (SBS) driver causing various problems (stalls on boot, failure to
    detect or report battery) to happen and introduced during the 3.18
    cycle (Chris Bainbridge).
 
  - Fix for an ACPI/PNP device enumeration regression introduced during
    the 3.16 cycle caused by failing to include two PNP device IDs into
    the list of IDs that PNP device objects need to be created for
    (Witold Szczeponik).
 
  - Fixes for two minor mistakes in the ACPI GPIO properties
    documentation (Antonio Ospite, Rafael J Wysocki).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJVS8ZQAAoJEILEb/54YlRx38QP/1dtmb1eVZZMP+InQ9COisHp
 qYuBRt8FCWCIyd1PZeRXmLVo8UM337mka2pyUpa+hFJoY+TpVku/8WW18ucjI1TU
 sbJSycYoCiATdHpk2qXV9h6SmCq0e0tEePJx90fA9xYXGTGYs1VEhoBLQ6ayuqnF
 gCiWy1wl3OmFEDDIpykW3pSXJf+8y3LcCqqDGdsL3yqNOXKszmzQJn98F5mxnpp6
 XDG9nBdoC17KxRcvO6vzVikFPNu0FAuPO1JO/vzmKyl4fRbxj7ZTGenzwwV+72aH
 i20SS4sIWTLtpxOj0vtDFaJzqdeAl1tZxNJ+06mJoEcJ67l9PpiE/mqWa9qMYLop
 eDz2dDZES7zLgpDT6sY3ofcAVq59SWrEyWvKiiRWtYe5+LAGZrw6RxgV0ZR1QJTy
 UFPvlYVLhIrPnbywl0pLReQU7DM+199v7RzHSABCa/X2ZOI1/W6streM4zRn4y9U
 JlRLnoW0GAFyO9muRGTesnR8J2+umQvdWbaiIkJ4i8H8KmkM4XTtPNruDfykScE5
 9b5NuxR8QyLrBVqz8dqVO9SkeXvbmnoxYKNCZeCWo6bq46Miyx2W+lAZhpnQTAFD
 jlwb+PpDjMu0R2ERmzhJD6lLNlxSjAUJCJP3nAfjPKoIRlSp9+5KrqLS5MeyQbZz
 b/zdMfGlP1CcfyU3Gn4M
 =JFUn
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These include three regression fixes (PCI resources management,
  ACPI/PNP device enumeration, ACPI SBS on MacBook) and two ACPI
  documentation fixes related to GPIO.

  Specifics:

   - Fix for a PCI resources management regression introduced during the
     4.0 cycle and related to the handling of ACPI resources'
     Producer/Consumer flags that turn out to be useless (Jiang Liu)

   - Fix for a MacBook regression related to the Smart Battery Subsystem
     (SBS) driver causing various problems (stalls on boot, failure to
     detect or report battery) to happen and introduced during the 3.18
     cycle (Chris Bainbridge)

   - Fix for an ACPI/PNP device enumeration regression introduced during
     the 3.16 cycle caused by failing to include two PNP device IDs into
     the list of IDs that PNP device objects need to be created for
     (Witold Szczeponik)

   - Fixes for two minor mistakes in the ACPI GPIO properties
     documentation (Antonio Ospite, Rafael J Wysocki)"

* tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / PNP: add two IDs to list for PNPACPI device enumeration
  ACPI / documentation: Fix ambiguity in the GPIO properties document
  ACPI / documentation: fix a sentence about GPIO resources
  ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook
  x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus
2015-05-07 15:58:00 -07:00
Rafael J. Wysocki
9a5d9315e4 Merge branches 'acpi-resources', 'acpi-battery', 'acpi-doc' and 'acpi-pnp'
* acpi-resources:
  x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus

* acpi-battery:
  ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook

* acpi-doc:
  ACPI / documentation: Fix ambiguity in the GPIO properties document
  ACPI / documentation: fix a sentence about GPIO resources

* acpi-pnp:
  ACPI / PNP: add two IDs to list for PNPACPI device enumeration
2015-05-07 21:24:34 +02:00
Linus Torvalds
0e1dc42748 xen: bug fixes for 4.1-rc2
- Fix blkback regression if using persistent grants.
 - Fix various event channel related suspend/resume bugs.
 - Fix AMD x86 regression with X86_BUG_SYSRET_SS_ATTRS.
 - SWIOTLB on ARM now uses frames <4 GiB (if available) so device only
   capable of 32-bit DMA work.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJVSiC1AAoJEFxbo/MsZsTRojgH/1zWPD0r5WMAEPb6DFdb7Ga1
 SqBbyHFu43axNwZ7EvUzSqI8BKDPbTnScQ3+zC6Zy1SIEfS+40+vn7kY/uASmWtK
 LYaYu8nd49OZP8ykH0HEvsJ2LXKnAwqAwvVbEigG7KJA7h8wXo7aDwdwxtZmHlFP
 18xRTfHcrnINtAJpjVRmIGZsCMXhXQz4bm0HwsXTTX0qUcRWtxydKDlMPTVFyWR8
 wQ2m5+76fQ8KlFsoJEB0M9ygFdheZBF4FxBGHRrWXBUOhHrQITnH+cf1aMVxTkvy
 NDwiEebwXUDHacv21onszoOkNjReLsx+DWp9eHknlT/fgPo6tweMM2yazFGm+JQ=
 =W683
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - fix blkback regression if using persistent grants

 - fix various event channel related suspend/resume bugs

 - fix AMD x86 regression with X86_BUG_SYSRET_SS_ATTRS

 - SWIOTLB on ARM now uses frames <4 GiB (if available) so device only
   capable of 32-bit DMA work.

* tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
  hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests
  xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()
  xen/console: Update console event channel on resume
  xen/xenbus: Update xenbus event channel on resume
  xen/events: Clear cpu_evtchn_mask before resuming
  xen-pciback: Add name prefix to global 'permissive' variable
  xen: Suspend ticks on all CPUs during suspend
  xen/grant: introduce func gnttab_unmap_refs_sync()
  xen/blkback: safely unmap purge persistent grants
2015-05-06 15:58:06 -07:00
Linus Torvalds
3d54ac9e35 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "EFI fixes, and FPU fix, a ticket spinlock boundary condition fix and
  two build fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Always restore_xinit_state() when use_eager_cpu()
  x86: Make cpu_tss available to external modules
  efi: Fix error handling in add_sysfs_runtime_map_entry()
  x86/spinlocks: Fix regression in spinlock contention detection
  x86/mm: Clean up types in xlate_dev_mem_ptr()
  x86/efi: Store upper bits of command line buffer address in ext_cmd_line_ptr
  efivarfs: Ensure VariableName is NUL-terminated
2015-05-06 10:57:37 -07:00
Linus Torvalds
d8fce2db72 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also an uncore PMU driver fix and an uncore
  PMU driver hardware-enablement addition"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf probe: Fix segfault if passed with ''.
  perf report: Fix -T/--threads option to work again
  perf bench numa: Fix immediate meeting of convergence condition
  perf bench numa: Fixes of --quiet argument
  perf bench futex: Fix hung wakeup tasks after requeueing
  perf probe: Fix bug with global variables handling
  perf top: Fix a segfault when kernel map is restricted.
  tools lib traceevent: Fix build failure on 32-bit arch
  perf kmem: Fix compiles on RHEL6/OL6
  tools lib api: Undefine _FORTIFY_SOURCE before setting it
  perf kmem: Consistently use PRIu64 for printing u64 values
  perf trace: Disable events and drain events when forked workload ends
  perf trace: Enable events when doing system wide tracing and starting a workload
  perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver
  perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs
  perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu
2015-05-06 10:47:25 -07:00
Stefano Stabellini
8746515d7f xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
Make sure that xen_swiotlb_init allocates buffers that are DMA capable
when at least one memblock is available below 4G. Otherwise we assume
that all devices on the SoC can cope with >4G addresses. We do this on
ARM and ARM64, where dom0 is mapped 1:1, so pfn == mfn in this case.

No functional changes on x86.

From: Chen Baozi <baozich@gmail.com>

Signed-off-by: Chen Baozi <baozich@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Chen Baozi <baozich@gmail.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-05-06 15:02:58 +01:00
Bobby Powers
c88d47480d x86/fpu: Always restore_xinit_state() when use_eager_cpu()
The following commit:

  f893959b08 ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()")

removed drop_init_fpu() usage from flush_thread(). This seems to break
things for me - the Go 1.4 test suite fails all over the place with
floating point comparision errors (offending commit found through
bisection).

The functional change was that flush_thread() after this commit
only calls restore_init_xstate() when both use_eager_fpu() and
!used_math() are true. drop_init_fpu() (now fpu_reset_state()) calls
restore_init_xstate() regardless of whether current used_math() - apply
the same logic here.

Switch used_math() -> tsk_used_math(tsk) to consistently use the grabbed
tsk instead of current, like in the rest of flush_thread().

Tested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Bobby Powers <bobbypowers@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f893959b ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()")
Link: http://lkml.kernel.org/r/1430147441-9820-1-git-send-email-bobbypowers@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-06 11:22:03 +02:00
Ingo Molnar
c102cb097d * Avoid garbage names in efivarfs due to buggy firmware by zero'ing
EFI variable name - Ross Lagerwall
 
  * Stop erroneously dropping upper 32-bits of boot command line pointer
    in EFI boot stub and stash them in ext_cmd_line_ptr - Roy Franz
 
  * Fix double-free bug in error handling code path of EFI runtime map
    code - Dan Carpenter
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVSOSjAAoJEC84WcCNIz1VXk4P/R4GwmmzZBdYAseiwv6u/NRm
 bTXnK7SN1ZyY8WibEm8ptXJuTIyXZxmQYr4lY97canJy8P7umtoCP7P3tS0Ier8U
 N1AMFGes7xlwBhjIRz2Cr9e5plr5H3qk65JNMuUDp0/MVuPEiNEzi6efbL82dh9S
 RCLxQ94paX+wV6ltQMKWGD3v0WnHkzouuCdETCGaozqQmJx6PGzDmJ51kXYRWDyP
 esTCZpRHlIzKN0u3XEFgswlIev2wab0BtjXYOzUqb0AH1Q13OgQfiswX3WIG6k+c
 3xuMH4JByBIDwOLudgu0D6Sst2QwVJZnw6JavoEgGCFao0n6IPzUGolAWLFMdDhL
 Kparzc6ObHpiqYtqBjJXW+awOENVS4qIrn9MHc9wwsJxXOy++0YnyYCgge0iia47
 F2/pOHvkd52QiQ0gC442W0EdX1VlPCUR04G0s4d3UX3O875yl80QTyLQ4n7ZK074
 3wfi/9+Fuv8wWMJ4HI8FJgaTl57KzAP4ZPh2cy8oPs6bkiiwlnMWH24bEhlxKBK4
 mEIze045kyswz3rV7j1WX3MSXrPA2cM95L5WlvVTxckMn40QwLPBWSDCOJIj3K5K
 yhXNHHfHzG/GRm3SfD2i1EcK4gUW82awl72jJn0F69YMI5a+T1BIppEMP2pzsWE4
 FcwvWDxzWwKxYKJosfkk
 =f7a2
 -----END PGP SIGNATURE-----

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

 * Avoid garbage names in efivarfs due to buggy firmware by zeroing
   EFI variable name. (Ross Lagerwall)

 * Stop erroneously dropping upper 32 bits of boot command line pointer
   in EFI boot stub and stash them in ext_cmd_line_ptr. (Roy Franz)

 * Fix double-free bug in error handling code path of EFI runtime map
   code. (Dan Carpenter)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-06 08:30:24 +02:00
Marc Dionne
de71ad2c97 x86: Make cpu_tss available to external modules
Commit 75182b1632 ("x86/asm/entry: Switch all C consumers of
kernel_stack to this_cpu_sp0()") changed current_thread_info
to use this_cpu_sp0, and indirectly made it rely on init_tss
which was exported with EXPORT_PER_CPU_SYMBOL_GPL.
As a result some macros and inline functions such as set/get_fs,
test_thread_flag and variants have been made unusable for
external modules.

Make cpu_tss exported with EXPORT_PER_CPU_SYMBOL so that these
functions are accessible again, as they were previously.

Signed-off-by: Marc Dionne <marc.dionne@your-file-system.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/1430763404-21221-1-git-send-email-marc.dionne@your-file-system.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-05 20:40:31 +02:00
Boris Ostrovsky
a71dbdaa8c hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests
Commit 61f01dd941 ("x86_64, asm: Work around AMD SYSRET SS descriptor
attribute issue") makes AMD processors set SS to __KERNEL_DS in
__switch_to() to deal with cases when SS is NULL.

This breaks Xen PV guests who do not want to load SS with__KERNEL_DS.

Since the problem that the commit is trying to address would have to be
fixed in the hypervisor (if it in fact exists under Xen) there is no
reason to set X86_BUG_SYSRET_SS_ATTRS flag for PV VPCUs here.

This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features
op which will clear this flag. (And since this structure is no longer
HVM-specific we should do some renaming).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-05-05 18:27:43 +01:00
Tahsin Erdogan
e8a4a2696f x86/spinlocks: Fix regression in spinlock contention detection
A spinlock is regarded as contended when there is at least one waiter.
Currently, the code that checks whether there are any waiters rely on
tail value being greater than head. However, this is not true if tail
reaches the max value and wraps back to zero, so arch_spin_is_contended()
incorrectly returns 0 (not contended) when tail is smaller than head.

The original code (before regression) handled this case by casting the
(tail - head) to an unsigned value. This change simply restores that
behavior.

Fixes: d6abfdb202 ("x86/spinlocks/paravirt: Fix memory corruption on unlock")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Cc: peterz@infradead.org
Cc: Waiman.Long@hp.com
Cc: borntraeger@de.ibm.com
Cc: oleg@redhat.com
Cc: raghavendra.kt@linux.vnet.ibm.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1430799331-20445-1-git-send-email-tahsin@google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-05 11:01:38 +02:00
Linus Torvalds
180d89f6ef powerpc fixes for 4.1 # 2
- Build fix for SMP=n in book3s_xics.c
 - Fix for Daniel's pci_controller_ops on powernv.
 - Revert the TM syscall abort patch for now.
 - CPU affinity fix from Nathan.
 - Two EEH fixes from Gavin.
 - Fix for CR corruption from Sam.
 - Selftest build fix.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVRKvXAAoJEFHr6jzI4aWAQPwP/jctjzdpbt+/Ra+/f48E4TuP
 cLDqbVJcOV+aC0lflXDBwnORn7qff2zzN6yTUcj9lAkq/ILBY7lY8m/bNvj/C0g1
 yH1Bh6EIjKLyqLKyfFnu+H1DU2s+ROhaAFh9JXhW28j7gU0iSwb7kyBlQ3MP7py4
 8OTbVs1vBKg42SND5FX8JsJG7Vk5v/sNz7WXc2HdtIIWQip4tp95vKvftuCABZgj
 2bMfHF5OXCYd3yalVZGeuiIX3ZAezN9F2GpfFoetCn0118Fkp97pfEVkQ0p64tI7
 xomtzgNXZh9jKFvqqhqlcUDFWEpqr27UjB5/ToWa2YKL4ACrYrgvvo+ifL4qLLtb
 M9itrZVfHElHjA0JSn/hDMdaRKBALcyX1+71rvTpGOMvrdtUY7NaD/h+2jQJ6Cz8
 V8o7uI7SGOdGjWtzNV+bHN+bmhF1MKA1WJXk9a1Pexi+T0vtyZNcTQXr00RVoZJp
 zsrE5cZGwgXkz0tlkNK4Zf5U8xURqZKGZWoCxG4kCkwWPPyZZCWH0HDQtNzxMJXJ
 xrxDTuuF9B/B72xZ6UpVHYlIwYGLEzPz5jtL7r9muxjVEuaewT3NmX+3ZAQZKk/f
 hKMiwHpDSKs36K1Afn8g4ycjfzAy2HyL6TVMvHjO8XG14HVyI+tJ49oeqBTRQLO1
 2ZGZCkjGNJd/R1Ii1qeW
 =S0WU
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc fixes from Michael Ellerman:
 - build fix for SMP=n in book3s_xics.c
 - fix for Daniel's pci_controller_ops on powernv.
 - revert the TM syscall abort patch for now.
 - CPU affinity fix from Nathan.
 - two EEH fixes from Gavin.
 - fix for CR corruption from Sam.
 - selftest build fix.

* tag 'powerpc-4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc/powernv: Restore non-volatile CRs after nap
  powerpc/eeh: Delay probing EEH device during hotplug
  powerpc/eeh: Fix race condition in pcibios_set_pcie_reset_state()
  powerpc/pseries: Correct cpu affinity for dlpar added cpus
  selftests/powerpc: Fix the pmu install rule
  Revert "powerpc/tm: Abort syscalls in active transactions"
  powerpc/powernv: Fix early pci_controller_ops loading.
  powerpc/kvm: Fix SMP=n build error in book3s_xics.c
2015-05-03 10:28:36 -07:00
Linus Torvalds
036f351e25 arm64 fixes:
- Fix perf devicetree warnings at probe time
 - Fix memory leak in __dma_free()
 - Ensure DMA buffers are always zeroed
 - Show IRQ trigger in /proc/interrupts (for parity with ARM)
 - Implement byte and halfword access for smp_{load_acquire,store_release}
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJVQlyfAAoJELescNyEwWM0ZTIIAKnK5bIr1jS9wP9wLJ8eZ++y
 8p5RYluqtMBkYzbOpKr8HiJdg6GupNiSq6as+cf5T+Gc2D01HhFocqJbLrOCqJRA
 RG5gFqkx/7FKTCwIt/DSygiwA+szPKTArf3YOi+EuDmCkcgRv38vac8yqlM1rI+M
 UL2vdAEqvjPxPquzeAqrSIcwQ9jAYzEouh8VrMWigLFN5BvCUrO64r1y4zN7hbO1
 RWEKPkIFMLDGCFwCn17mzLafVF2f0JH514sMZEQi2AQA/MlaIS3hM+SFJgxbRzib
 ZRqXujX1Osi9W20VPwMtuUgQymEBQDAbXT35BxehL7MVMZhv66q+LBEabIpDMqM=
 =4S7b
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Not too much here, but we've addressed a couple of nasty issues in the
  dma-mapping code as well as adding the halfword and byte variants of
  load_acquire/store_release following on from the CSD locking bug that
  you fixed in the core.

   - fix perf devicetree warnings at probe time

   - fix memory leak in __dma_free()

   - ensure DMA buffers are always zeroed

   - show IRQ trigger in /proc/interrupts (for parity with ARM)

   - implement byte and halfword access for smp_{load_acquire,store_release}"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: perf: Fix the pmu node name in warning message
  arm64: perf: don't warn about missing interrupt-affinity property for PPIs
  arm64: add missing PAGE_ALIGN() to __dma_free()
  arm64: dma-mapping: always clear allocated buffers
  ARM64: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL
  arm64: add missing data types in smp_load_acquire/smp_store_release
2015-05-01 07:44:32 -07:00
Sam Bobroff
0aab374709 powerpc/powernv: Restore non-volatile CRs after nap
Patches 7cba160ad "powernv/cpuidle: Redesign idle states management"
and 77b54e9f2 "powernv/powerpc: Add winkle support for offline cpus"
use non-volatile condition registers (cr2, cr3 and cr4) early in the system
reset interrupt handler (system_reset_pSeries()) before it has been determined
if state loss has occurred. If state loss has not occurred, control returns via
the power7_wakeup_noloss() path which does not restore those condition
registers, leaving them corrupted.

Fix this by restoring the condition registers in the power7_wakeup_noloss()
case.

This is apparent when running a KVM guest on hardware that does not
support winkle or sleep and the guest makes use of secondary threads. In
practice this means Power7 machines, though some early unreleased Power8
machines may also be susceptible.

The secondary CPUs are taken off line before the guest is started and
they call pnv_smp_cpu_kill_self(). This checks support for sleep
states (in this case there is no support) and power7_nap() is called.

When the CPU is woken, power7_nap() returns and because the CPU is
still off line, the main while loop executes again. The sleep states
support test is executed again, but because the tested values cannot
have changed, the compiler has optimized the test away and instead we
rely on the result of the first test, which has been left in cr3
and/or cr4. With the result overwritten, the wrong branch is taken and
power7_winkle() is called on a CPU that does not support it, leading
to it stalling.

Fixes: 7cba160ad7 ("powernv/cpuidle: Redesign idle states management")
Fixes: 77b54e9f21 ("powernv/powerpc: Add winkle support for offline cpus")
[mpe: Massage change log a bit more]
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-01 16:55:11 +10:00
Gavin Shan
d91dafc02f powerpc/eeh: Delay probing EEH device during hotplug
Commit 1c509148b ("powerpc/eeh: Do probe on pci_dn") probes EEH
devices in early stage, which is reasonable to pSeries platform.
However, it's wrong for PowerNV platform because the PE# isn't
determined until the resources (IO and MMIO) are assigned to
PE in hotplug case. So we have to delay probing EEH devices
for PowerNV platform until the PE# is assigned.

Fixes: ff57b454dd ("powerpc/eeh: Do probe on pci_dn")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-01 13:52:32 +10:00
Gavin Shan
1ae79b78bc powerpc/eeh: Fix race condition in pcibios_set_pcie_reset_state()
When asserting reset in pcibios_set_pcie_reset_state(), the PE
is enforced to (hardware) frozen state in order to drop unexpected
PCI transactions (except PCI config read/write) automatically by
hardware during reset, which would cause recursive EEH error.
However, the (software) frozen state EEH_PE_ISOLATED is missed.
When users get 0xFF from PCI config or MMIO read, EEH_PE_ISOLATED
is set in PE state retrival backend. Unfortunately, nobody (the
reset handler or the EEH recovery functinality in host) will clear
EEH_PE_ISOLATED when the PE has been passed through to guest.

The patch sets and clears EEH_PE_ISOLATED properly during reset
in function pcibios_set_pcie_reset_state() to fix the issue.

Fixes: 28158cd ("Enhance pcibios_set_pcie_reset_state()")
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-01 13:52:09 +10:00
Nathan Fontenot
f32393c943 powerpc/pseries: Correct cpu affinity for dlpar added cpus
The incorrect ordering of operations during cpu dlpar add results in invalid
affinity for the cpu being added. The ibm,associativity property in the
device tree is populated with all zeroes for the added cpu which results in
invalid affinity mappings and all cpus appear to belong to node 0.

This occurs because rtas configure-connector is called prior to making the
rtas set-indicator calls. Phyp does not assign affinity information
for a cpu until the rtas set-indicator calls are made to set the isolation
and allocation state.

Correct the order of operations to make the rtas set-indicator
calls (done in dlpar_acquire_drc) before calling rtas configure-connector.

Fixes: 1a8061c46c ("powerpc/pseries: Add kernel based CPU DLPAR handling")

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-01 13:47:24 +10:00
Jiang Liu
2c62e8492e x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus
An IO port or MMIO resource assigned to a PCI host bridge may be
consumed by the host bridge itself or available to its child
bus/devices. The ACPI specification defines a bit (Producer/Consumer)
to tell whether the resource is consumed by the host bridge itself,
but firmware hasn't used that bit consistently, so we can't rely on it.

Before commit 593669c2ac ("x86/PCI/ACPI: Use common ACPI resource
interfaces to simplify implementation"), arch/x86/pci/acpi.c ignored
all IO port resources defined by acpi_resource_io and
acpi_resource_fixed_io to filter out IO ports consumed by the host
bridge itself.

Commit 593669c2ac ("x86/PCI/ACPI: Use common ACPI resource interfaces
to simplify implementation") started accepting all IO port and MMIO
resources, which caused a regression that IO port resources consumed
by the host bridge itself became available to its child devices.

Then commit 63f1789ec7 ("x86/PCI/ACPI: Ignore resources consumed by
host bridge itself") ignored resources consumed by the host bridge
itself by checking the IORESOURCE_WINDOW flag, which accidently removed
MMIO resources defined by acpi_resource_memory24, acpi_resource_memory32
and acpi_resource_fixed_memory32.

On x86 and IA64 platforms, all IO port and MMIO resources are assumed
to be available to child bus/devices except one special case:
    IO port [0xCF8-0xCFF] is consumed by the host bridge itself
    to access PCI configuration space.

So explicitly filter out PCI CFG IO ports[0xCF8-0xCFF]. This solution
will also ease the way to consolidate ACPI PCI host bridge common code
from x86, ia64 and ARM64.

Related ACPI table are archived at:
https://bugzilla.kernel.org/show_bug.cgi?id=94221

Related discussions at:
http://patchwork.ozlabs.org/patch/461633/
https://lkml.org/lkml/2015/3/29/304

Fixes: 63f1789ec7 (Ignore resources consumed by host bridge itself)
Reported-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-30 22:17:34 +02:00
Linus Torvalds
9dbbe3cfc3 Remove from guest code the handling of task migration during a
pvclock read; instead use the correct protocol in KVM.
 
 This removes the need for task migration notifiers in core
 scheduler code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJVQkUWAAoJEL/70l94x66DhfcH/A8RTHUOELtoy+v2weahn21m
 FFWEnEUlCWzYgmiddgFdlr6+ub386W3ryFsXKPqjrn/8LVv3yS7tK1NJF8d03LQw
 n7HtIsrF01E9UI8CIWO4S/mUxWQev6vEJ9NXtNrsJcRmhSeLaIZkPjTH8Zqyx4i9
 ZvG4731WHXmxvbJ03bfJU9Y8OwHXe55GMi614aTxPndVBGdvIRu2Oj6aTfQTeab/
 7tEujub0MKWp74a7eyNU4GItcvIAXZCQt2wMc5dN1VK3ma5FTOnHIOuhAb8mACFF
 qEeGhtxAnOf7W+s9J8i7zVBdA5MOS0vUKng361ZOVGDb0OLqcVADW7GpuTZfRAM=
 =2A7v
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm changes from Paolo Bonzini:
 "Remove from guest code the handling of task migration during a pvclock
  read; instead use the correct protocol in KVM.

  This removes the need for task migration notifiers in core scheduler
  code"

[ The scheduler people really hated the migration notifiers, so this was
  kind of required  - Linus ]

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86: pvclock: Really remove the sched notifier for cross-cpu migrations
  kvm: x86: fix kvmclock update protocol
2015-04-30 09:44:04 -07:00
Suzuki K. Poulose
8291fd04d8 arm64: perf: Fix the pmu node name in warning message
With commit d5efd9cc9c ("arm64: pmu: add support for interrupt-affinity
property"), we print a warning when we find a PMU SPI with a missing
missing interrupt-affinity property in a pmu node. Unfortunately, we
pass the wrong (NULL) device node to of_node_full_name, resulting in
unhelpful messages such as:

 hw perfevents: Failed to parse <no-node>/interrupt-affinity[0]

This patch fixes the name to that of the pmu node.

Fixes: d5efd9cc9c (arm64: pmu: add support for interrupt-affinity property)
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-04-30 12:11:30 +01:00
Will Deacon
d795ef9aa8 arm64: perf: don't warn about missing interrupt-affinity property for PPIs
PPIs are affine by nature, so the interrupt-affinity property is not
used and therefore we shouldn't print a warning in its absence.

Reported-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-04-30 12:11:23 +01:00
Michael Ellerman
68fc378ce3 Revert "powerpc/tm: Abort syscalls in active transactions"
This reverts commit feba40362b.

Although the principle of this change is good, the implementation has a
few issues.

Firstly we can sometimes fail to abort a syscall because r12 may have
been clobbered by C code if we went down the virtual CPU accounting
path, or if syscall tracing was enabled.

Secondly we have decided that it is safer to abort the syscall even
earlier in the syscall entry path, so that we avoid the syscall tracing
path when we are transactional.

So that we have time to thoroughly test those changes we have decided to
revert this for this merge window and will merge the fixed version in
the next window.

NB. Rather than reverting the selftest we just drop tm-syscall from
TEST_PROGS so that it's not run by default.

Fixes: feba40362b ("powerpc/tm: Abort syscalls in active transactions")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-30 15:24:58 +10:00
Dean Nelson
2cff98b99c arm64: add missing PAGE_ALIGN() to __dma_free()
__dma_alloc() does a PAGE_ALIGN() on the passed in size argument before
doing anything else. __dma_free() does not. And because it doesn't, it is
possible to leak memory should size not be an integer multiple of PAGE_SIZE.

The solution is to add a PAGE_ALIGN() to __dma_free() like is done in
__dma_alloc().

Additionally, this patch removes a redundant PAGE_ALIGN() from
__dma_alloc_coherent(), since __dma_alloc_coherent() can only be called
from __dma_alloc(), which already does a PAGE_ALIGN() before the call.

Cc: stable@vger.kernel.org
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-04-29 17:39:39 +01:00
Boris Ostrovsky
2b953a5e99 xen: Suspend ticks on all CPUs during suspend
Commit 77e32c89a7 ("clockevents: Manage device's state separately for
the core") decouples clockevent device's modes from states. With this
change when a Xen guest tries to resume, it won't be calling its
set_mode op which needs to be done on each VCPU in order to make the
hypervisor aware that we are in oneshot mode.

This happens because clockevents_tick_resume() (which is an intermediate
step of resuming ticks on a processor) doesn't call clockevents_set_state()
anymore and because during suspend clockevent devices on all VCPUs (except
for the one doing the suspend) are left in ONESHOT state. As result, during
resume the clockevents state machine will assume that device is already
where it should be and doesn't need to be updated.

To avoid this problem we should suspend ticks on all VCPUs during
suspend.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-04-29 17:10:05 +01:00
Daniel Axtens
d33047fd7e powerpc/powernv: Fix early pci_controller_ops loading.
Load the PowerNV platform pci controller ops into pci controllers
after all the operations are loaded into the platform ops struct, not
before.

Otherwise we aren't actually setting the ops properly which can break
IO for some devices.

Fixes: 65ebf4b63 ("powerpc/powernv: Move controller ops from ppc_md to controller_ops")
Reported-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-29 19:43:58 +10:00
Michael Ellerman
433c5c20c5 powerpc/kvm: Fix SMP=n build error in book3s_xics.c
Commit 34cb7954c0 "Convert ICS mutex lock to spin lock" added an
include of asm/spinlock.h, which does not work in the SMP=n case.

It should instead include linux/spinlock.h

Fixes: 34cb7954c0 ("KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock")
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-04-29 08:06:32 +10:00
Linus Torvalds
3d99e3fe13 Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull arch/tile bugfix from Chris Metcalf:
 "This just fixes a compiler warning from an old bug that only recently
  started generating a warning"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: properly use node_isset() on a nodemask_t
2015-04-28 14:22:35 -07:00
Linus Torvalds
14bc84ce0b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "One additional new feature for 4.1, a new PRNG based on SHA-512 for
  the zcrypt driver.

  Two memory management related changes, the page table reallocation for
  KVM is removed, and with file ptes gone the encoding of page table
  entries is improved.

  And three bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.
  s390/mm: change swap pte encoding and pgtable cleanup
  s390/mm: correct transfer of dirty & young bits in __pmd_to_pte
  s390/bpf: add dependency to z196 features
  s390/3215: free memory in error path
  s390/kvm: remove delayed reallocation of page tables for KVM
  kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP
2015-04-28 09:58:46 -07:00
Chris Metcalf
9b0f5d63e7 tile: properly use node_isset() on a nodemask_t
The code accidentally used cpu_isset() previously in one place
(though properly node_isset() elsewhere).

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-04-28 10:36:45 -04:00
Paolo Bonzini
73459e2a1a x86: pvclock: Really remove the sched notifier for cross-cpu migrations
This reverts commits 0a4e6be9ca
and 80f7fdb1c7.

The task migration notifier was originally introduced in order to support
the pvclock vsyscall with non-synchronized TSC, but KVM only supports it
with synchronized TSC.  Hence, on KVM the race condition is only needed
due to a bad implementation on the host side, and even then it's so rare
that it's mostly theoretical.

As far as KVM is concerned it's possible to fix the host, avoiding the
additional complexity in the vDSO and the (re)introduction of the task
migration notifier.

Xen, on the other hand, hasn't yet implemented vsyscall support at
all, so we do not care about its plans for non-synchronized TSC.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 15:49:30 +02:00
Radim Krčmář
5dca0d9147 kvm: x86: fix kvmclock update protocol
The kvmclock spec says that the host will increment a version field to
an odd number, then update stuff, then increment it to an even number.
The host is buggy and doesn't do this, and the result is observable
when one vcpu reads another vcpu's kvmclock data.

There's no good way for a guest kernel to keep its vdso from reading
a different vcpu's kvmclock data, but we don't need to care about
changing VCPUs as long as we read a consistent data from kvmclock.
(VCPU can change outside of this loop too, so it doesn't matter if we
return a value not fit for this VCPU.)

Based on a patch by Radim Krčmář.

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-27 15:48:59 +02:00
Marek Szyprowski
6829e274a6 arm64: dma-mapping: always clear allocated buffers
Buffers allocated by dma_alloc_coherent() are always zeroed on Alpha,
ARM (32bit), MIPS, PowerPC, x86/x86_64 and probably other architectures.
It turned out that some drivers rely on this 'feature'. Allocated buffer
might be also exposed to userspace with dma_mmap() call, so clearing it
is desired from security point of view to avoid exposing random memory
to userspace. This patch unifies dma_alloc_coherent() behavior on ARM64
architecture with other implementations by unconditionally zeroing
allocated buffer.

Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-04-27 11:39:50 +01:00
Sudeep Holla
6544e67bfb ARM64: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL
Since several interrupt controllers including GIC support both edge and
level triggered interrupts, it's useful to provide that information in
/proc/interrupts even on ARM64 similar to ARM and PPC.

This is based on Geert Uytterhoeven's commit 7c07005eea ("ARM: 8339/1:
Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL")

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-04-27 11:39:05 +01:00
Andre Przywara
878a84d5a8 arm64: add missing data types in smp_load_acquire/smp_store_release
Commit 8053871d0f ("smp: Fix smp_call_function_single_async()
locking") introduced a call to smp_load_acquire() with a u16 argument,
but we only cared about u32 and u64 types in that function so far.
This resulted in a compiler warning fortunately, pointing at an
uninitialized use. Due to the implementation structure the compiler
misses that bug in the smp_store_release(), though.
Add the u16 and u8 variants using ldarh/stlrh and ldarb/stlrb,
respectively. Together with the compiletime_assert_atomic_type() check
this should cover all cases now.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-04-27 11:39:04 +01:00
Andy Lutomirski
61f01dd941 x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue
AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET with
SS == 0 results in an invalid usermode state in which SS is apparently
equal to __USER_DS but causes #SS if used.

Work around the issue by setting SS to __KERNEL_DS __switch_to, thus
ensuring that SYSRET never happens with SS set to NULL.

This was exposed by a recent vDSO cleanup.

Fixes: e7d6eefaaa x86/vdso32/syscall.S: Do not load __USER32_DS to %ss
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-26 17:57:38 -07:00
Linus Torvalds
9ec3a646fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode->i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some ->d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26 17:22:07 -07:00
Linus Torvalds
d89b3e19ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This push fixes a build problem with img-hash under non-standard
  configurations and a serious regression with sha512_ssse3 which can
  lead to boot failures"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: img-hash - CRYPTO_DEV_IMGTEC_HASH should depend on HAS_DMA
  crypto: x86/sha512_ssse3 - fixup for asm function prototype change
2015-04-26 13:51:05 -07:00
Linus Torvalds
7f9f44308c CRIS changes for 4.1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iEYEABECAAYFAlU70nMACgkQ31LbvUHyf1cYgwCfSmPhyLFmr0pGM/BxsVY7K1v6
 PaEAn2+7xfZV38E6hwrGMrT42ZvKyL6r
 =LHQU
 -----END PGP SIGNATURE-----

Merge tag 'cris-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris

Pull arch/cris updates from Jesper Nilsson:
 "Some much needed love for the CRIS-port.

  There's a bunch of changes this time, giving the CRISv32 port a bit of
  modern makeover with device-tree, irq domain and gpiolib support, and
  more switchover to generic frameworks.

  Some small fixes and removal of the theoretical SMP support brings up
  the rear"

* tag 'cris-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
  cris: fix integer overflow in ELF_ET_DYN_BASE
  CRISv32: use GENERIC_SCHED_CLOCK
  CRISv32: use MMIO clocksource
  CRISv32: use generic clockevents
  CRIS: use generic headers via Kbuild
  CRIS: use generic cmpxchg.h
  CRIS: use generic atomic.h
  CRIS: use generic atomic bitops
  CRISv10: remove redundant macros from system.h
  CRIS: remove SMP code
  CRISv32: don't enable irqs in INIT_THREAD
  CRISv32: handle multiple signals
  CRISv32: prevent bogus restarts on sigreturn
  CRISv32: don't attempt syscall restart on irq exit
  Add binding documentation for CRIS
  CRIS: add Axis 88 board device tree
  CRISv32: add device tree support
  CRISv32: add irq domains support
  CRIS: enable GPIOLIB
2015-04-26 13:31:05 -07:00
Linus Torvalds
63905bba5b powerpc fixes for 4.1
- Fix for mm_dec_nr_pmds() from Scott.
 - Fixes for oopses seen with KVM + THP from Aneesh.
 - Build fixes from Aneesh & Shreyas.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVOsk5AAoJEFHr6jzI4aWACWMP/3EaNoeA1g8VbZWZEdRaoLvX
 W7D08DI3Dt8HLxyn2JR08jYZF0gr68XrF6OiscVVki7wVXT8fbH4jSNmBkbzNH95
 d9taScJyR1CUavkhsXivnR1qEE1Fi2KA2OW9RaNfoSt1MVtdsvOK6xXklUGksuJQ
 XygzyrRr4Dj82kuMUAMO0YDMvknMlzi3a8dzyrWZBXBZOOTWavGB6bQKtCTaOQ99
 3OFGLQ10uY7lmdHDi0t0tQ99FuYfLiJpg5fTLoUni4J5tFp8JlZ+x0Gwc0apN0cy
 Ym8EO6++qWDv8FXvYEPfVUEjbF1fyPiawUgpkMnyvXgd8K5G85SIrtkGW0Ml+6sX
 GfJH8w9hpDbF5EnWlC9bn/jT7sHBHFdrxZuQUc0L4M2OtM73R2a0Xr3b7ZxFCD1q
 7RpYu8MKKcyvaIXNg7VBJjj8zL+WmUJKF6J5uX5bGU2xH0khmp0vTknyyjbwrlcF
 uHidv5ZhMt3aAI70v14jA5BTEmLyOYRu58Ei6cT/VT/DjdbpEApdK8BMAvKSEeib
 +hzh6oDFT92AM0tbg15bNmqGbGfgqtVKe4GDS2QyGaHGAFOGs1nPuSa9se1xYDcM
 CCtRyABwpzJsrCfwra2fsTU6FxlatK4ONViyWFBXa6mEjBNSZ4XmyZvdWUqlwpSC
 F5jNGppm5Ama6xxcLphA
 =6yQx
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc fixes from Michael Ellerman:

 - fix for mm_dec_nr_pmds() from Scott.

 - fixes for oopses seen with KVM + THP from Aneesh.

 - build fixes from Aneesh & Shreyas.

* tag 'powerpc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc/mm: Fix build error with CONFIG_PPC_TRANSACTIONAL_MEM disabled
  powerpc/kvm: Fix ppc64_defconfig + PPC_POWERNV=n build error
  powerpc/mm/thp: Return pte address if we find trans_splitting.
  powerpc/mm/thp: Make page table walk safe against thp split/collapse
  KVM: PPC: Remove page table walk helpers
  KVM: PPC: Use READ_ONCE when dereferencing pte_t pointer
  powerpc/hugetlb: Call mm_dec_nr_pmds() in hugetlb_free_pmd_range()
2015-04-26 13:23:15 -07:00
Linus Torvalds
eadf16a912 This mostly includes the PPC changes for 4.1, which this time cover
Book3S HV only (debugging aids, minor performance improvements and some
 cleanups).  But there are also bug fixes and small cleanups for ARM,
 x86 and s390.
 
 The task_migration_notifier revert and real fix is still pending review,
 but I'll send it as soon as possible after -rc1.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJVOONLAAoJEL/70l94x66DbsMIAIpZPsaqgXOC1sDEiZuYay+6
 rD4n4id7j8hIAzcf3AlZdyf5XgLlr6I1Zyt62s1WcoRq/CCnL7k9EljzSmw31WFX
 P2y7/J0iBdkn0et+PpoNThfL2GsgTqNRCLOOQlKgEQwMP9Dlw5fnUbtC1UchOzTg
 eAMeBIpYwufkWkXhdMw4PAD4lJ9WxUZ1eXHEBRzJb0o0ZxAATJ1tPZGrFJzoUOSM
 WsVNTuBsNd7upT02kQdvA1TUo/OPjseTOEoksHHwfcORt6bc5qvpctL3jYfcr7sk
 /L6sIhYGVNkjkuredjlKGLfT2DDJjSEdJb1k2pWrDRsY76dmottQubAE9J9cDTk=
 =OAi2
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull second batch of KVM changes from Paolo Bonzini:
 "This mostly includes the PPC changes for 4.1, which this time cover
  Book3S HV only (debugging aids, minor performance improvements and
  some cleanups).  But there are also bug fixes and small cleanups for
  ARM, x86 and s390.

  The task_migration_notifier revert and real fix is still pending
  review, but I'll send it as soon as possible after -rc1"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits)
  KVM: arm/arm64: check IRQ number on userland injection
  KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi
  KVM: VMX: Preserve host CR4.MCE value while in guest mode.
  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8
  KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C
  KVM: PPC: Book3S HV: Streamline guest entry and exit
  KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
  KVM: PPC: Book3S HV: Use decrementer to wake napping threads
  KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI
  KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
  KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu
  KVM: PPC: Book3S HV: Minor cleanups
  KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update
  KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
  KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT
  KVM: PPC: Book3S HV: Add ICP real mode counters
  KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode
  KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock
  KVM: PPC: Book3S HV: Add guest->host real mode completion counters
  KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte
  ...
2015-04-26 13:06:22 -07:00