Commit graph

405311 commits

Author SHA1 Message Date
Catalin Marinas
61c77e0802 arm64: locks: Remove CONFIG_GENERIC_LOCKBREAK
Commit 52ea2a560a (arm64: locks: introduce ticket-based spinlock
implementation) introduces the arch_spin_is_contended() function, making
CONFIG_GENERIC_LOCKBREAK unnecessary.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
2013-11-06 11:42:41 +00:00
Preeti U Murthy
37dc6b50ce sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus
The nr_busy_cpus parameter is used by nohz_kick_needed() to find out the
number of busy cpus in a sched domain which has the SD_SHARE_PKG_RESOURCES
flag set.  Updating nr_busy_cpus at every level of sched domain is
therefore unnecessary; we can update this parameter only at the parent
domain of the sd which has this flag set. Introduce a per-cpu parameter
sd_busy which represents this parent domain.

In nohz_kick_needed() we directly query the nr_busy_cpus parameter
associated with the groups of sd_busy.

By associating sd_busy with the highest domain which has
SD_SHARE_PKG_RESOURCES flag set, we cover all lower level domains
which could have this flag set and trigger nohz_idle_balancing if any
of the levels have more than one busy cpu.

sd_busy is irrelevant for asymmetric load balancing. However sd_asym
has been introduced to represent the highest sched domain which has
SD_ASYM_PACKING flag set so that it can be queried directly when
required.

While we are at it, we might as well change the nohz_idle parameter to
be updated at the sd_busy domain level alone and not the base domain
level of a CPU.  This will unify the concept of busy cpus at just one
level of sched domain where it is currently used.
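
A minimal sketch of the idea (illustrative, not the literal patch; helper
and field names such as highest_flag_domain() and sgp->nr_busy_cpus follow
the kernel of this era and are assumptions here):

  DEFINE_PER_CPU(struct sched_domain *, sd_busy);

  static void update_top_cache_domain(int cpu)
  {
          struct sched_domain *sd;

          /* highest domain with SD_SHARE_PKG_RESOURCES set ... */
          sd = highest_flag_domain(cpu, SD_SHARE_PKG_RESOURCES);
          /* ... its parent is the one place nr_busy_cpus is maintained */
          rcu_assign_pointer(per_cpu(sd_busy, cpu), sd ? sd->parent : NULL);
  }

  /* in nohz_kick_needed(): query nr_busy_cpus via sd_busy directly */
  sd = rcu_dereference(per_cpu(sd_busy, cpu));
  if (sd && atomic_read(&sd->groups->sgp->nr_busy_cpus) > 1)
          goto need_kick_unlock;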

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: svaidy@linux.vnet.ibm.com
Cc: vincent.guittot@linaro.org
Cc: bitbucket@online.de
Cc: benh@kernel.crashing.org
Cc: anton@samba.org
Cc: Morten.Rasmussen@arm.com
Cc: pjt@google.com
Cc: peterz@infradead.org
Cc: mikey@neuling.org
Link: http://lkml.kernel.org/r/20131030031252.23426.4417.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:37:55 +01:00
Vaidyanathan Srinivasan
2042abe797 sched: Fix asymmetric scheduling for POWER7
Asymmetric scheduling within a core is a scheduler load-balancing
feature that is triggered when the SD_ASYM_PACKING flag is set.  The goal
for the load balancer is to move tasks to lower-order idle SMT threads
within a core on a POWER7 system.

In nohz_kick_needed(), we intend to check if our sched domain (core)
is completely busy or has an idle cpu.

The following check for SD_ASYM_PACKING:

    (cpumask_first_and(nohz.idle_cpus_mask, sched_domain_span(sd)) < cpu)

already covers the case of checking if the domain has an idle cpu,
because cpumask_first_and() will not yield any set bits if this domain
has no idle cpu.

Hence, the nr_busy check against group weight can be removed.
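
Roughly, the remaining check then looks like this (a sketch of the shape,
not the literal diff):

  for_each_domain(cpu, sd) {
          if (sd->flags & SD_ASYM_PACKING &&
              cpumask_first_and(nohz.idle_cpus_mask,
                                sched_domain_span(sd)) < cpu)
                  goto need_kick_unlock;
  }

  /*
   * If the domain has no idle cpu, cpumask_first_and() returns
   * >= nr_cpu_ids, which can never be < cpu, so the test fails.
   */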

Reported-by: Michael Neuling <michael.neuling@au1.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: vincent.guittot@linaro.org
Cc: bitbucket@online.de
Cc: benh@kernel.crashing.org
Cc: anton@samba.org
Cc: Morten.Rasmussen@arm.com
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131030031242.23426.13019.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:37:54 +01:00
Yan, Zheng
f891d8cfb8 perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
Unlike other uncore boxes, IRP boxes live on PCI buses with no UBOX
device. For a PCI bus without a UBOX device, we find the next bus that
has a UBOX device and use its 'bus to socket' mapping.

Besides, the counter/control registers in IRP boxes are not properly
aligned.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:31 +01:00
Yan, Zheng
d1e8f4a836 perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
The encoding for filter registers of IvyBridge-EP uncore QPI boxes is
completely the same as SandyBridge-EP.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:29 +01:00
Oleg Nesterov
c7e548b45c perf: Factor out strncpy() in perf_event_mmap_event()
While this is really minor, strncpy() does unnecessary zero-padding
till the end of tmp[16], and it is called every time we are going to
use the string literal.

Turn these strncpy()'s into a single strlcpy() under a new label;
this saves 72 bytes.
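
The shape of the change, sketched (the label name and the "//enomem"
literal serve as examples here; the real diff touches several such
error paths):

  /* before: duplicated per error path, zero-pads all of tmp[16] */
  name = strncpy(tmp, "//enomem", sizeof(tmp));

  /* after: each path only sets the literal and jumps */
  name = "//enomem";
  goto cpy_name;
  ...
  cpy_name:
          strlcpy(tmp, name, sizeof(tmp));
          name = tmp;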

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131017182417.GA17753@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:28 +01:00
Peter Zijlstra
a94d342b9c tools/perf: Add required memory barriers
To match patch bf378d341e ("perf: Fix perf ring buffer memory
ordering"), change userspace to also adhere to the ordering outlined
there.
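
The ordering in question, from the userspace side (a sketch; 'pc' is the
mmap'ed struct perf_event_mmap_page, and rmb()/mb() are the per-arch
barriers this patch adds to the tools):

  u64 head = pc->data_head;
  rmb();                  /* (C) pairs with the kernel's wmb() (B)     */

  /* ... consume records in [data_tail, head) ... */

  mb();                   /* (D) finish the reads before freeing space */
  pc->data_tail = head;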

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Link: http://lkml.kernel.org/r/20131030104246.GH16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:26 +01:00
Peter Zijlstra
0a196848ca perf: Fix arch_perf_out_copy_user default
The arch_perf_output_copy_user() default of
__copy_from_user_inatomic() returns bytes not copied, while all the
other functions given to DEFINE_OUTPUT_COPY() return bytes copied.

Since copy_from_user_nmi() is the odd one out by returning bytes
copied, where all other *copy_{to,from}* functions return bytes not
copied, change it over and amend DEFINE_OUTPUT_COPY() to expect bytes
not copied.

Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied
while expecting its worker functions to return bytes copied.
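
Sketched, the unified convention inside DEFINE_OUTPUT_COPY() now reads
roughly like this (simplified; the page-advance logic is elided):

  do {
          unsigned long size = min(handle->size, len);

          /* worker returns bytes NOT copied; 0 means full success */
          written = memcpy_func(handle->addr, buf, size);
          written = size - written;       /* back to bytes copied */

          len          -= written;
          handle->addr += written;
          buf          += written;
          handle->size -= written;
          /* ... advance to the next ring-buffer page when size hits 0 ... */
  } while (len && written == size);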

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: will.deacon@arm.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:25 +01:00
Peter Zijlstra
394570b793 perf: Update a stale comment
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-9s5mze78gmlz19agt39i8rii@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:23 +01:00
Peter Zijlstra
524feca5e9 perf: Optimize perf_output_begin() -- address calculation
Rewrite the handle address calculation code to be clearer.

Saves 8 bytes on x86_64-defconfig.
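
For reference, the rewritten calculation has roughly this shape (a
reconstruction for illustration, not quoted from the patch):

  unsigned int page_shift = PAGE_SHIFT + page_order(rb);

  handle->page = (offset >> page_shift) & (rb->nr_pages - 1);
  offset      &= (1UL << page_shift) - 1;
  handle->addr = rb->data_pages[handle->page] + offset;
  handle->size = (1UL << page_shift) - offset;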

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-3trb2n2henb9m27tncef3ag7@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:22 +01:00
Peter Zijlstra
d20a973f46 perf: Optimize perf_output_begin() -- lost_event case
Avoid touching the lost_event and sample_data cachelines twice. It's
not like we end up doing less work, but it might help to keep all
accesses to these cachelines in one place.

Due to code shuffle, this loses 4 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-zfxnc58qxj0eawdoj31hhupv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:21 +01:00
Peter Zijlstra
85f59edf96 perf: Optimize perf_output_begin()
There's no point in re-doing the memory barrier when we fail the
cmpxchg(). Also, placing it after the space-reservation loop makes it
clearer that it only separates the userpage->tail read from the data
stores.
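
Sketch of the resulting structure (simplified from perf_output_begin();
the fail path and wakeup logic are elided):

  do {
          tail = ACCESS_ONCE(rb->user_page->data_tail);
          offset = head = local_read(&rb->head);
          if (!rb->overwrite &&
              unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
                  goto fail;
          head += size;
  } while (local_cmpxchg(&rb->head, offset, head) != offset);

  /*
   * One barrier, after the loop: separates the userpage->data_tail
   * read above from the data stores that follow.
   */
  smp_mb();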

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-c19u6egfldyx86tpyc3zgkw9@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:20 +01:00
Peter Zijlstra
c72b42a3dd perf: Add unlikely() to the ring-buffer code
Add unlikely() annotations to 'slow' paths:

When you have a sampling event but no output buffer, you have bigger
issues -- also, the bail is still faster than actually doing the work.

When you have a sampling event but a control-page-only buffer, you have
bigger issues -- again, the bail is still faster than actually doing
the work.

Optimize for the case where you're not losing events -- again, not
doing the work is still faster, but make sure that when you do have to
do work it's as fast as possible.

The typical watermark is 1/2 the buffer size, so most events will not
take this path.

Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig.
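
For instance (sketch):

  rb = rcu_dereference(event->rb);
  if (unlikely(!rb))
          goto out;

  if (unlikely(!rb->nr_pages))    /* control page only, no data pages */
          goto out;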

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-wlg3jew3qnutm8opd0hyeuwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:19 +01:00
Peter Zijlstra
26c86da882 perf: Simplify the ring-buffer code
By using CIRC_SPACE() we can obviate the need for perf_output_space().

Shrinks the size of perf_output_begin() by 17 bytes on
x86_64-defconfig.
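
For reference, CIRC_SPACE() is the standard helper from
include/linux/circ_buf.h (power-of-2 sizes assumed):

  /* Return count in buffer. */
  #define CIRC_CNT(head,tail,size) (((head) - (tail)) & ((size)-1))

  /* Return space available, 0..size-1; one slot is always left free. */
  #define CIRC_SPACE(head,tail,size) CIRC_CNT((tail),((head)+1),(size))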

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-vtb0xb0llebmsdlfn1v5vtfj@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:18 +01:00
James Hogan
95281171a7 metag: handle low level kicks directly
Kick interrupts trigger the LWK (low level kick) signal, usually handled
by the __TBIDoStdLWK() function, which is the only handler inherited from
the bootloader. The LWK signal is converted either to a SWK (plain
software kick) or a SWS (software kick with an attached message).

Linux has kick_handler() to handle SWK and call registered kick handlers
(IPIs and inter-thread comms), but SWS is, as far as I'm aware, unused
with Linux.

Therefore remove that abstraction and have Linux handle LWK directly.
This will reduce kick latency slightly, and reduce our dependence on the
bootloader, which makes it easier to directly boot a kernel in QEMU
(particularly for SMP).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
2013-11-06 10:40:02 +00:00
Marc Zyngier
c5b2c0f520 arm64: KVM: vgic: byteswap GICv2 access on world switch if BE
Ensure that accesses to the GICH_* registers are byteswapped
when the kernel is compiled as big-endian.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-06 10:10:12 +00:00
Marc Zyngier
18ea3dbc9e arm64: KVM: initialize HYP mode following the kernel endianness
Force SCTLR_EL2.EE to 1 if the kernel is compiled as BE.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-06 10:10:01 +00:00
Michael Opdenacker
edf6844ebf microblaze: Remove unused NO_MMU Kconfig parameter
This removes the NO_MMU Kconfig parameter,
which was no longer used anywhere in the source code
and Makefiles.

This also updates a comment referring to this parameter.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2013-11-06 08:48:58 +01:00
Josh Boyer
b53b5eda81 x86/cpu: Increase max CPU count to 8192
The MAXSMP option is intended to enable silly large numbers of
CPUs for testing purposes.  The current value of 4096 isn't very
silly any longer as there are actual SGI machines that approach
6096 CPUs when taking HT into account.

Increase the value to a nice round 8192 to account for this and
allow for short term future increases.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: prarit@redhat.com
Cc: Russ Anderson <rja@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131105143816.GK9944@hansolo.jdub.homelinux.org
[ Tweaked it so that MAXSMP simply sets the maximum of the normal range. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 08:16:34 +01:00
Josh Boyer
bb61ccc7db x86/cpu: Allow higher NR_CPUS values
The current range for SMP configs is 2 - 512 CPUs, or a full
4096 in the case of MAXSMP.  There are machines that have 1024
CPUs in them today and configuring a kernel for that means you
are forced to set MAXSMP.  This adds additional unnecessary
overhead.  While that overhead might be considered tiny for
large machines, it isn't necessarily so if you are building a
kernel that runs across a wide variety of machines.

To cover the range of more common machines today, we allow
NR_CPUS to be up to 4096 when CPUMASK_OFFSTACK is enabled.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: prarit@redhat.com
Cc: Russ Anderson <rja@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131105143728.GJ9944@hansolo.jdub.homelinux.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 08:16:28 +01:00
HATAYAMA Daisuke
a477c8594b x86/cpu: Always print SMP information in /proc/cpuinfo
Currently show_cpuinfo_core() displays cpu core information only if
the number of threads in a whole package is 2 or larger.

However, this condition doesn't care about the number of
sockets. For example, this condition doesn't hold on systems
with two logical cpus consisting of two sockets and a single
core on each socket - yet the topology information would be
interesting to see in that case as well.

I don't know whether or not there are processors in the real world
with which such configurations are possible, but at least in
virtual machine environments, such a configuration can occur,
typically when no explicit SMP information is provided in
advance.

For example, on qemu/KVM, SMP information is specified via -smp
command-line option, more specifically, its syntax is:

  -smp n[,cores=cores][,threads=threads][,sockets=sockets][,maxcpus=maxcpus]

If this is not specified, qemu tells the guest machine it has
n sockets, 1 core and 1 thread, in which case SMP information is
not displayed in /proc/cpuinfo.

I saw this situation in a VMware guest environment, too.

To fix this issue, this patch simply removes the condition
because this information is useful even if there's only 1
thread.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5277D644.4090707@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 08:13:56 +01:00
Peter Zijlstra
b8a216269e sched: Move completion code from core.c to completion.c
Completions already have their own header file: linux/completion.h.
Move the implementation out of kernel/sched/core.c and into its own
file: kernel/sched/completion.c.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-x2y49rmxu5dljt66ai2lcfuw@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:49:19 +01:00
Peter Zijlstra
b4145872f7 sched: Move wait code from core.c to wait.c
For some reason only the wait part of the wait API lives in
kernel/sched/wait.c and the wake part still lives in kernel/sched/core.c;
amend this.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-ftycee88naznulqk7ei5mbci@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:49:18 +01:00
Peter Zijlstra
7a6354e241 sched: Move wait.c into kernel/sched/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-5q5yqvdaen0rmapwloeaotx3@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 07:49:16 +01:00
Ingo Molnar
164777530b Linux 3.12
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSdt9HAAoJEHm+PkMAQRiGnzEH/345Keg5dp+oKACnokBfzOtp
 V0p3g5EBsGtzEVnV+1B96trczDUtWdDFFr5GfGSj565NBQpFyc+iZC1mC99RDJCs
 WUquGFqlLMK2aV0SbKwCO4K1rJ5A0TRVj0ZRJOUJUY7jwNf5Qahny0WBVjO/8qAY
 UvJK1rktBClhKdH53YtpDHHgXBeZ2LOrzt1fQ/AMpujGbZauGvnLdNOli5r2kCFK
 jzoOgFLvX+PHU/5/d4/QyJPeQNPva5hjk5Ho9UuSJYhnFtPO3EkD4XZLcpcbNEJb
 LqBvbnZWm6CS435lfU1l93RqQa5xMO9ITk0oe4h69syTSHwWk9aJ+ZTc/4Up+t8=
 =57MC
 -----END PGP SIGNATURE-----

Merge tag 'v3.12' into x86/cpu, to refresh the branch before queueing up more changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 06:50:21 +01:00
Ingo Molnar
83bf9702c8 perf/core improvements and fixes:
. Check maximum frequency rate for record/top, emitting better error
   messages, from Jiri Olsa.
 
 . Disable live kvm command if timerfd is not supported, from David Ahern.
 
 . Add usage to 'perf list', from David Ahern.
 
 . Fix detection of non-core features, from David Ahern.
 
 . Consolidate __hists__add_*entry(), cleanup from Namhyung Kim.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSeVA1AAoJENZQFvNTUqpAsz4P/0x7yyXBlJHgf5qyLNvmIKoS
 V5vVrEfXTpfiDrGSJmPJ+iasUi1caiEV3wKI4tmpZZ5GllNvIHPYQMIbnQgipzQ+
 JuVXoLgMe1jnzHuVflVoXhAHrexSpXIBJ27XJaZb9AIy7A1pwqnKDomUP9zn2Mth
 jzS835nCp/tsA+jcxJ4l8Fwr8XMz7F4aV5/Kp2I6+GDxBXbuAW29eTfBrEmAn9eH
 jocQq6m0m2f+upSwRWvLr7rlFBVgs1+yQszzq7QydZfb0dyczEPwgKIpJto7JnmR
 KyuSNFZZ9NArjQP7i1L84i/zA6WFT/shVv3T+8+EEoRMINTyZTZH4y6GSKNev30p
 hC3nSdGJ8tKVaLOjO7kv/OjQpOR/WfH6CUNKbVXPi1DruZTWQlgGsNczjO0OLpfx
 NAiy0qo2M8GSgl7uu7FjnpUgiSZgNnBztTUi9me/d5a0ES6ZAhId1EZfVYZ0h+A4
 sUsXkrSVst62OiaolSc6GNeosgkvUPl/l6K+Mbbqt3gS2uJlx9jUQ18dwdTLU7NO
 aBPTujmQdTNgLOEVd7YixpN+3xg/ZSSRNuJbqsJcL8EyXi0bY+3oBvcG87/txI/r
 6pnhCV0wFB2KoiSDaHkRtCi8ODvjCB6lwZBIILlybE9U4/qgagwj68IVSaW6o2gF
 4cG6ZQHCKK/uA3y2ljsF
 =CXFq
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

  * Check maximum frequency rate for record/top, emitting better error
    messages, from Jiri Olsa.

  * Disable live kvm command if timerfd is not supported, from David Ahern.

  * Add usage to 'perf list', from David Ahern.

  * Fix detection of non-core features, from David Ahern.

  * Consolidate __hists__add_*entry(), cleanup from Namhyung Kim.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 06:28:23 +01:00
Vineet Gupta
57e26e5745 ARC: [SMP] Fix build failures for large NR_CPUS
ST.as only takes S9 (255) for its offset. This was going out of range
when accessing a task_struct field with 4k NR_CPUS (due to 128b of
cpumask itself in there).

Work around this by using an intermediate register to do the address
scaling.

There is some duplication of the fix between ctx_sw.c and ctx_sw_asm.S,
however given that the C version will go away soon I'm not bothering to
factor out the common code.

Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:46 +05:30
Noam Camus
3aa4f80e41 ARC: [SMP] enlarge possible NR_CPUS
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:46 +05:30
Vineet Gupta
5ea72a9026 ARC: [SMP] TLB flush
- Add mm_cpumask setting (aggregating only, unlike some other arches)
  used to restrict the TLB flush cross-calling

- cross-calling versions of TLB flush routines (thanks to Noam); a rough
  sketch of the pattern follows below
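
The cross-calling pattern, roughly (a sketch; the tlb_args struct and
callback names mirror what other SMP arches do and are assumptions here):

  struct tlb_args {
          struct vm_area_struct *ta_vma;
          unsigned long ta_start, ta_end;
  };

  static void ipi_flush_tlb_page(void *arg)
  {
          struct tlb_args *ta = arg;

          local_flush_tlb_page(ta->ta_vma, ta->ta_start);
  }

  void flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr)
  {
          struct tlb_args ta = { .ta_vma = vma, .ta_start = uaddr };

          /* cross-call restricted to cpus that ever ran this mm */
          on_each_cpu_mask(mm_cpumask(vma->vm_mm), ipi_flush_tlb_page,
                           &ta, 1);
  }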

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:45 +05:30
Vineet Gupta
63eca94ca2 ARC: [SMP] ASID allocation
- Track a per-CPU ASID counter
- Track mm-per-cpu ASID (multiple threads, or mm migrated around)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:45 +05:30
Chen Gang
b6fe8e7c01 arc: export symbol for pm_power_off in reset.c
Need to export the symbol for it, or compilation can not pass. The
related error with allmodconfig:

    MODPOST 2994 modules
  ERROR: "pm_power_off" [drivers/mfd/retu-mfd.ko] undefined!
  ERROR: "pm_power_off" [drivers/char/ipmi/ipmi_poweroff.ko] undefined!
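
The fix itself is the usual one-liner next to the definition in
arch/arc/kernel/reset.c (sketch):

  void (*pm_power_off)(void) = NULL;
  EXPORT_SYMBOL(pm_power_off);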

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:44 +05:30
Chen Gang
8f146d0204 arc: export symbol for save_stack_trace() in stacktrace.c
Need to export its symbol, just like other architectures do, or
compilation with allmodconfig can not pass. The related error:

    MODPOST 2994 modules
  ERROR: "save_stack_trace" [kernel/backtracetest.ko] undefined!
  ERROR: "save_stack_trace" [drivers/md/persistent-data/dm-persistent-data.ko] undefined!

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:44 +05:30
Chen Gang
4782f7f9ae arc: remove '__init' for get_hw_config_num_irq()
get_hw_config_num_irq() may be called by the normal function
iss_model_init_smp(), which is assigned to the 'init_smp' function
pointer, which in turn may be called by first_lines_of_secondary();
so it also needs to be normal (non-__init).

The related warning (with allmodconfig):

    MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x5814): Section mismatch in reference from the function iss_model_init_smp() to the function .init.text:get_hw_config_num_irq()
  The function iss_model_init_smp() references
  the function __init get_hw_config_num_irq().
  This is often because iss_model_init_smp lacks a __init
  annotation or the annotation of get_hw_config_num_irq is wrong.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
2013-11-06 10:41:43 +05:30
Chen Gang
8f5d221b06 arc: remove '__init' for first_lines_of_secondary()
first_lines_of_secondary() is an '__init' function, but it may be called
via cpu_up() -> _cpu_up() -> __cpu_up(), where cpu_up() is a normal
exported function. So it is recommended to remove '__init'.

The related warning (with allmodconfig):

    MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x315c): Section mismatch in reference from the function __cpu_up() to the function .init.text:first_lines_of_secondary()
  The function __cpu_up() references
  the function __init first_lines_of_secondary().
  This is often because __cpu_up lacks a __init
  annotation or the annotation of first_lines_of_secondary is wrong.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
2013-11-06 10:41:43 +05:30
Chen Gang
ef3a661af6 arc: remove '__init' for setup_processor() and arc_init_IRQ()
They don't have '__init' in their definitions, but do have '__init' in
their declarations. And the normal function start_kernel_secondary() may
call setup_processor(), which will call arc_init_IRQ().

So '__init' needs to be removed from both of them. The related warning
(with allmodconfig):

    MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x3084): Section mismatch in reference from the function start_kernel_secondary() to the function .init.text:setup_processor()
  The function start_kernel_secondary() references
  the function __init setup_processor().
  This is often because start_kernel_secondary lacks a __init
  annotation or the annotation of setup_processor is wrong.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
2013-11-06 10:41:42 +05:30
Chen Gang
3d01c1ce41 arc: kgdb: add default implementation for kgdb_roundup_cpus()
arc supports kgdb, but needs an update -- add the function
kgdb_roundup_cpus(), or compilation can not pass. At present, add the
simple generic one just like other architectures (e.g. tile, mips ...).

The related error (with allmodconfig):

  kernel/built-in.o: In function `kgdb_cpu_enter':
  kernel/debug/debug_core.c:580: undefined reference to `kgdb_roundup_cpus'
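
The generic implementation, roughly as on those arches (a sketch; whether
ARC passes real regs via get_irq_regs() here is an assumption):

  static void kgdb_call_nmi_hook(void *ignored)
  {
          /* park this cpu in the debugger */
          kgdb_nmicallback(raw_smp_processor_id(), NULL);
  }

  void kgdb_roundup_cpus(unsigned long flags)
  {
          local_irq_enable();
          smp_call_function(kgdb_call_nmi_hook, NULL, 0);
          local_irq_disable();
  }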

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:41 +05:30
Vineet Gupta
0a4c40a3b7 ARC: Fix bogus gcc warning and micro-optimise TLB iteration loop
------------------>8----------------------
arch/arc/mm/tlb.c: In function ‘do_tlb_overlap_fault’:
arch/arc/mm/tlb.c:688:13: warning: array subscript is above array bounds
[-Warray-bounds]
         (pd0[n] & PAGE_MASK)) {
             ^
------------------>8----------------------

While at it, remove the useless last iteration of the outer loop when
reading a TLB SET for duplicate entries.

Suggested-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:41 +05:30
Vineet Gupta
0dafafc3ef ARC: Add support for irqflags tracing and lockdep
Lockdep required a small fix to the stacktrace API, which was incorrectly
unwinding out of __switch_to for the current call frame.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:40 +05:30
Vineet Gupta
54c8bff14d ARC: Reset the value of Interrupt Priority Register
Reset it in case the bootloader has changed the priority of one or more
IRQ lines.

Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:40 +05:30
Vineet Gupta
07ba69a46c ARC: Reduce #ifdef'ery for unaligned access emulation
Emulation being disabled is treated as if the fixup failed, so there is
no need for special #ifdef checks.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:39 +05:30
Vineet Gupta
21a63b5604 ARC: Change calling convention of do_page_fault()
Switch the args to (address, pt_regs) to match all the other "C"
exception handlers.

This removes the awkwardness in EV_ProtV for page fault vs. unaligned
access.
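
I.e. the prototype becomes (sketch):

  /* address first, pt_regs second, like the other "C" exception handlers */
  void do_page_fault(unsigned long address, struct pt_regs *regs);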

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:39 +05:30
Vineet Gupta
d4599baf5c ARC: cacheflush optim - PTAG can be loop invariant if V-P is const
Line ops need vaddr (indexing) and paddr (tag match). For page-sized
flushes (V-P const), each line op will need a different index, but the
tag bits will remain constant, hence paddr can be set up once outside the
loop.

This improves select LMBench numbers for Aliasing dcache where we have
more "preventive" cache flushing.

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.11-rc7- Linux 3.11.0-   80 4.66 8.88 69.7 112. 268. 8.60 28.0 3489 13.K 27.K	# Non alias ARC700
3.11-rc7- Linux 3.11.0-   80 4.64 8.51 68.6 98.5 271. 8.58 28.1 4160 15.K 32.K	# Aliasing
3.11-rc7- Linux 3.11.0-   80 4.64 8.51 69.8 99.4 270. 8.73 27.5 3880 15.K 31.K	# PTAG loop Inv
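
The loop shape after the change, roughly (a sketch; variable and aux
register names are illustrative):

  if (full_page_op)
          write_aux_reg(aux_tag, paddr);          /* tag set up once */

  while (num_lines-- > 0) {
          if (!full_page_op) {
                  write_aux_reg(aux_tag, paddr);  /* per line otherwise */
                  paddr += L1_CACHE_BYTES;
          }
          write_aux_reg(aux_cmd, vaddr);          /* index changes per line */
          vaddr += L1_CACHE_BYTES;
  }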

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:38 +05:30
Vineet Gupta
bd12976c36 ARC: cacheflush refactor #3: Unify the {d,i}cache flush leaf helpers
With Line length being constant now, we can fold the 2 helpers into 1.
This allows applying any optimizations (forthcoming) in a single place.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:38 +05:30
Vineet Gupta
63d2dfdbf4 ARC: cacheflush refactor #2: I and D caches lines to have same size
Having them be different seems an obscure configuration.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:37 +05:30
Vineet Gupta
f3e4de3274 ARC: cacheflush refactor #1: push aux reg ascertaining into leaf routine
ARC dcache supports 3 ops - Inv, Flush, Flush-n-Inv.
The programming model however provides only 2 commands: FLUSH, INV.
INV will either discard or flush-n-discard (based on the DT_CTRL bit).

The leaf helper __dc_line_loop() used to take the AUX register
(corresponding to the 2 commands). Now we push that to within the
helper, paving the way for code consolidations to follow.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:41:29 +05:30
Vineet Gupta
064a626924 ARC: use __weak instead of __attribute__((weak))
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:40:37 +05:30
Vineet Gupta
8e457d6a75 ARC: Annotate some functions as static
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-06 10:40:37 +05:30
Christoph Lameter
6855e95ce3 arc: Replace __get_cpu_var uses
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is
address calculation via the form &__get_cpu_var(x). This calculates the address for
the instance of the percpu variable of the current processor based on an offset.

Other use cases are for storing and retrieving data from the current processor's percpu area.
__get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store and retrieve operations
could use a segment prefix (or global register on other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use
optimized assembly code to read and write per cpu variables.

This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr()
or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided
and fewer registers are used when code is generated.

At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.

The patchset includes passes over all arches as well. Once these operations are used throughout,
specialized macros can be defined in non-x86 arches as well in order to optimize per cpu access by,
e.g., using a global register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processor's instance of a per cpu variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y);

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	this_cpu_inc(y)

Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
2013-11-06 10:40:37 +05:30
Benjamin Herrenschmidt
0c4888ef1d powerpc: Fix fatal SLB miss when restoring PPR
When restoring the PPR value, we incorrectly access the thread structure
at a time when MSR:RI is clear, which means we cannot recover from nested
faults. However, the thread structure isn't covered by the "bolted" SLB
entries and thus accessing it can fault.

This fixes it by splitting the code so that the PPR value is loaded into
a GPR before MSR:RI is cleared.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-11-06 14:13:53 +11:00
Gavin Shan
36954dc78d powerpc/powernv: Reserve the correct PE number
We're assigning PE numbers after the completion of the PCI probe. During
the PCI probe, we had PE#0 as the super container to encompass all
PCI devices. However, that's inappropriate since the PELTM has ascending
order of priority on search on P7IOC. So we need PE#127 to take the
role that PE#0 had previously. For PHB3, we still have PE#0 as the
reserved PE.

The patch assumes that the underlying firmware has built the RID to
PE# mapping after resetting the IODA tables: all PELTM entries except
the last one have invalid mappings on P7IOC, but all RTEs are bound
to PE#0. The reserved PE# is exported by firmware through the device
tree.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-11-06 14:13:52 +11:00