Commit graph

6118 commits

Author SHA1 Message Date
Ingo Molnar
bb3c3e8071 Merge commit 'v2.6.32-rc5' into perf/probes
Conflicts:
	kernel/trace/trace_event_profile.c

Merge reason: update to -rc5 and resolve conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:58:25 +02:00
Robin Holt
1d21e6e3ff x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode()
Create an inline function to extract the pnode from a global
physical address and then convert the broadcast assist unit to
use the newly created uv_gpa_to_pnode function.

The open-coded code was wrong as well - it might explain a
few of our unexplained bau hangs.

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Cliff Whickman <cpw@sgi.com>
Cc: linux-mm@kvack.org
Cc: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20091016112920.GZ8903@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:51:53 +02:00
Borislav Petkov
b33a636364 x86, mce: Add a global MCE init helper
Add an early initcall (pre SMP) which sets up global MCE
functionality.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1255689093-26921-2-git-send-email-borislav.petkov@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:46:50 +02:00
Borislav Petkov
5e09954a9a x86, mce: Fix up MCE naming nomenclature
Prefix global/setup routines with "mcheck_" thus differentiating
from the internal facilities prefixed with "mce_". Also, prefix
the per cpu calls with mcheck_cpu and rename them to reflect the
MCE setup hierarchy of calls better.

There should be no functionality change resulting from this
patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:46:49 +02:00
Ingo Molnar
6b50f5c7c7 Merge branches 'x86/mce' and 'x86/urgent' into perf/mce
Merge reason: Put all MCE changes into this branch, we are
              queueing up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:42:25 +02:00
Roland Dreier
93ae5012a7 x86: Don't print number of MCE banks for every CPU
The MCE initialization code explicitly says it doesn't handle
asymmetric configurations where different CPUs support different
numbers of MCE banks, and it prints a big warning in that case.

Therefore, printing the "mce: CPU supports <x> MCE banks"
message into the kernel log for every CPU is pure redundancy
that clutters the log significantly for systems with lots of
CPUs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
LKML-Reference: <adaeip473qt.fsf@cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 09:20:03 +02:00
Robin Holt
036ed8ba61 x86, UV: Fix information in __uv_hub_info structure
A few parts of the uv_hub_info structure are initialized
incorrectly.

 - n_val is being loaded with m_val.
 - gpa_mask is initialized with a bytes instead of an unsigned long.
 - Handle the case where none of the alias registers are used.

Lastly I converted the bau over to using the uv_hub_info->m_val
which is the correct value.

Without this patch, booting a large configuration hits a
problem where the upper bits of the gnode affect the pnode
and the bau will not operate.

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Cliff Whickman <cpw@sgi.com>
Cc: stable@kernel.org
LKML-Reference: <20091015224946.396355000@alcatraz.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 08:18:34 +02:00
Ingo Molnar
a5912f6b3e x86: Document linker script ASSERT() quirk
Older binutils breaks if ASSERT() is used without a sink
for the output.

For example 2.14.90.0.6 is known to be broken, the link
fails with:

  LD      .tmp_vmlinux1
  ld:arch/x86/kernel/vmlinux.lds:678: parse error

Document this quirk in all three files that use it.

  See:    http://marc.info/?l=linux-kbuild&m=124930110427870&w=2
  See[2]: d2ba8b2 ("x86: Fix assert syntax in vmlinux.lds.S")

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <4AD6523D.5030909@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 07:18:46 +02:00
Cyrill Gorcunov
f88f2b4fdb x86: apic: Allow noop operations to be called almost at any time
As only apic noop is used we allow to use almost any operation
caller wants (and which of them noop driver supports of
course).

Initially it was reported by Ingo Molnar that apic noop
issue a warning for pkg id (which is actually false positive
and should be eliminated).

So we save checking (and warning issue) for read/write
operations while allow any other ops to be freely used.

Also:
 - fix noop_cpu_to_logical_apicid, it should be 0.
 - rename noop_default_phys_pkg_id to noop_phys_pkg_id
   (we use default_ prefix for more general routines
    in apic subsystem).

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
LKML-Reference: <20091015150416.GC5331@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 17:26:53 +02:00
Ingo Molnar
713490e02e Merge branch 'tracing/core' into perf/core
Merge reason: to add event filter support we need the following
commits from the tracing tree:

 3f6fe06: tracing/filters: Unify the regex parsing helpers
 1889d20: tracing/filters: Provide basic regex support
 737f453: tracing/filters: Cleanup useless headers

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 11:34:00 +02:00
Ingo Molnar
b226f744d4 Merge branch 'linus' into perf/core
Merge reason: pick up tools/perf/ changes from upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:44:44 +02:00
Ingo Molnar
db8590f504 Revert "x86: linker script syntax nits"
This reverts commit e9a63a4e55.

This breaks older binutils, where sink-less asserts are broken.

See this commit for further details:

  d2ba8b2: x86: Fix assert syntax in vmlinux.lds.S

Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4AD6523D.5030909@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:09:55 +02:00
Ingo Molnar
a0738a688d Merge branch 'linus' into x86/urgent
Merge reason: pull in latest, to be able to revert a patch there.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:07:30 +02:00
Linus Torvalds
601adfedba Merge branch 'topic/x86-lds-nits' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland
* 'topic/x86-lds-nits' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland:
  x86: linker script syntax nits
2009-10-14 15:33:05 -07:00
Linus Torvalds
f061d83a2b Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix missing kernel-doc notation
  Revert "x86, timers: Check for pending timers after (device) interrupts"
  sched: Update the clock of runqueue select_task_rq() selected
2009-10-14 15:25:04 -07:00
Linus Torvalds
ea87644105 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/paravirt: Use normal calling sequences for irq enable/disable
  x86: fix kernel panic on 32 bits when profiling
  x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop
  x86, vmi: Mark VMI deprecated and schedule it for removal
2009-10-14 15:24:32 -07:00
Roland McGrath
e9a63a4e55 x86: linker script syntax nits
The linker scripts grew some use of weirdly wrong linker script syntax.
It happens to work, but it's not what the syntax is documented to be.
Clean it up to use the official syntax.

Signed-off-by: Roland McGrath <roland@redhat.com>
CC: Ian Lance Taylor <iant@google.com>
2009-10-14 14:16:38 -07:00
Dimitri Sivanich
4a4de9c7d7 x86: UV RTC: Rename generic_interrupt to x86_platform_ipi
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20091014142257.GE11048@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 18:27:11 +02:00
Dimitri Sivanich
d5991ff297 x86: UV RTC: Clean up error handling
Cleanup error handling in uv_rtc_setup_clock.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20091014142103.GD11048@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 18:27:10 +02:00
Dimitri Sivanich
8c28de4d01 x86: UV RTC: Add clocksource only boot option
Add clocksource only boot option for UV RTC.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20091014141848.GC11048@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 18:27:10 +02:00
Dimitri Sivanich
e47938b1fa x86: UV RTC: Fix early expiry handling
Tune/fix early timer expiry handling and return correct early timeout value
for set_next_event.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20091014141630.GB11048@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 18:27:09 +02:00
Thomas Gleixner
05d86412ea x86: Remove BKL from apm_32
The lock/unlock kernel pair in do_open() got there with the BKL push
down and protects nothing. Remove it.

Replace the lock/unlock kernel in the ioctl code with a mutex to
protect standbys_pending and suspends_pending.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091010153349.365236337@linutronix.de>
2009-10-14 17:04:48 +02:00
Thomas Gleixner
ac06ea2cd0 x86: Remove BKL from microcode
cycle_lock_kernel() in microcode_open() is a worthless exercise as
there is nothing to wait for. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091010153349.196074920@linutronix.de>
2009-10-14 17:04:48 +02:00
Li Hong
89ccf465ab x86, perf_event: Rename 'performance counter interrupt'
In 'cdd6c482c9ff9c55475ee7392ec8f672eddb7be6', we renamed
Performance Counters -> Performance Events.

The name showed up in /proc/interrupts also needs a change. I use
PMI (Performance monitoring interrupt) here, since it is the
official name used in Intel's documents.

Signed-off-by: Li Hong <lihong.hi@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091014105039.GA22670@uhli>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 14:37:24 +02:00
Frederic Weisbecker
c44fc77084 tracing: Move syscalls metadata handling from arch to core
Most of the syscalls metadata processing is done from arch.
But these operations are mostly generic accross archs. Especially now
that we have a common variable name that expresses the number of
syscalls supported by an arch: NR_syscalls, the only remaining bits
that need to reside in arch is the syscall nr to addr translation.

v2: Compare syscalls symbols only after the "sys" prefix so that we
    avoid spurious mismatches with archs that have syscalls wrappers,
    in which case syscalls symbols have "SyS" prefixed aliases.
    (Reported by: Heiko Carstens)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
2009-10-14 09:53:56 +02:00
Dimitri Sivanich
9338ad6ffb x86, apic: Move SGI UV functionality out of generic IO-APIC code
Move UV specific functionality out of the generic IO-APIC code.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20091013203236.GD20543@sgi.com>
[ Cleaned up the code some more in their new places. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:09 +02:00
Dimitri Sivanich
6c2c502910 x86: SGI UV: Fix irq affinity for hub based interrupts
This patch fixes handling of uv hub irq affinity.  IRQs with ALL or
NODE affinity can be routed to cpus other than their originally
assigned cpu.  Those with CPU affinity cannot be rerouted.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20090930160259.GA7822@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:01 +02:00
Cyrill Gorcunov
2626eb2b2f x86, apic: Limit apic dumping, introduce new show_lapic= setup option
In case if a system has a large number of cpus printing apics
contents may consume a long time period.

We limit such an output by 1 apic by default. But to have an
ability to see all apics or some part of them we introduce
"show_lapic" setup option which allow us to limit/unlimit the
number of APICs being dumped.

Example: apic=debug show_lapic=5, or apic=debug show_lapic=all

Also move apic_verbosity checking upper that way so helper routines
do not need to inspect it at all.

Suggested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.926793122@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:01 +02:00
Cyrill Gorcunov
a933c61829 x86, apic: Use apic noop driver
In case if apic were disabled we may use the whole apic NOOP driver
instead of sparse poking the some functions in apic driver.

Also NOOP would catch any inappropriate apic operation calls (not
just read/write).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.747817361@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:00 +02:00
Cyrill Gorcunov
9844ab11c7 x86, apic: Introduce the NOOP apic driver
Introduce NOOP APIC driver. We should use it in case if apic was
disabled due to hardware of software/firmware problems (including
user requested to disable it case).

The driver is attempting to catch any inappropriate apic operation
call with warning issue.

Also it is possible to use some apic operation like IPI calls,
read/write without checking for apic presence which should make
callers code easier.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.534682104@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:00 +02:00
Steven Rostedt
194ec34184 function-graph/x86: Replace unbalanced ret with jmp
The function graph tracer replaces the return address with a hook
to trace the exit of the function call. This hook will finish by
returning to the real location the function should return to.

But the current implementation uses a ret to jump to the real
return location. This causes a imbalance between calls and ret.
That is the original function does a call, the ret goes to the
handler and then the handler does a ret without a matching call.

Although the function graph tracer itself still breaks the branch
predictor by replacing the original ret, by using a second ret and
causing an imbalance, it breaks the predictor even more.

This patch replaces the ret with a jmp to keep the calls and ret
balanced. I tested this on one box and it showed a 1.7% increase in
performance. Another box only showed a small 0.3% increase. But no
box that I tested this on showed a decrease in performance by
making this change.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091013203425.042034383@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 08:13:53 +02:00
Linus Torvalds
80fa680d22 Merge git://git.infradead.org/~dwmw2/iommu-2.6.32
* git://git.infradead.org/~dwmw2/iommu-2.6.32:
  x86: Move pci_iommu_init to rootfs_initcall()
  Run pci_apply_final_quirks() sooner.
  Mark pci_apply_final_quirks() __init rather than __devinit
  Rename pci_init() to pci_apply_final_quirks(), move it to quirks.c
  intel-iommu: Yet another BIOS workaround: Isoch DMAR unit with no TLB space
  intel-iommu: Decode (and ignore) RHSA entries
  intel-iommu: Make "Unknown DMAR structure" message more informative
2009-10-13 10:04:40 -07:00
Hidetoshi Seto
8968f9d3dc perf_event, x86, mce: Use TRACE_EVENT() for MCE logging
This approach is the first baby step towards solving many of the
structural problems the x86 MCE logging code is having today:

 - It has a private ring-buffer implementation that has a number
   of limitations and has been historically fragile and buggy.

 - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE
   specific. /dev/mcelog is not part of any larger logging
   framework and hence has remained on the fringes for many years.

 - The MCE logging code is still very unclean partly due to its ABI
   limitations. Fields are being reused for multiple purposes, and
   the whole message structure is limited and x86 specific to begin
   with.

All in one, the x86 tree would like to move away from this private
implementation of an event logging facility to a broader framework.

By using perf events we gain the following advantages:

 - Multiple user-space agents can access MCE events. We can have an
   mcelog daemon running but also a system-wide tracer capturing
   important events in flight-recorder mode.

 - Sampling support: the kernel and the user-space call-chain of MCE
   events can be stored and analyzed as well. This way actual patterns
   of bad behavior can be matched to precisely what kind of activity
   happened in the kernel (and/or in the app) around that moment in
   time.

 - Coupling with other hardware and software events: the PMU can track a
   number of other anomalies - monitoring software might chose to
   monitor those plus the MCE events as well - in one coherent stream of
   events.

 - Discovery of MCE sources - tracepoints are enumerated and tools can
   act upon the existence (or non-existence) of various channels of MCE
   information.

 - Filtering support: we just subscribe to and act upon the events we
   are interested in. Then even on a per event source basis there's
   in-kernel filter expressions available that can restrict the amount
   of data that hits the event channel.

 - Arbitrary deep per cpu buffering of events - we can buffer 32
   entries or we can buffer as much as we want, as long as we have
   the RAM.

 - An NMI-safe ring-buffer implementation - mappable to user-space.

 - Built-in support for timestamping of events, PID markers, CPU
   markers, etc.

 - A rich ABI accessible over system call interface. Per cpu, per task
   and per workload monitoring of MCE events can be done this way. The
   ABI itself has a nice, meaningful structure.

 - Extensible ABI: new fields can be added without breaking tooling.
   New tracepoints can be added as the hardware side evolves. There's
   various parsers that can be used.

 - Lots of scheduling/buffering/batching modes of operandi for MCE
   events. poll() support. mmap() support. read() support. You name it.

 - Rich tooling support: even without any MCE specific extensions added
   the 'perf' tool today offers various views of MCE data: perf report,
   perf stat, perf trace can all be used to view logged MCE events and
   perhaps correlate them to certain user-space usage patterns. But it
   can be used directly as well, for user-space agents and policy action
   in mcelog, etc.

With this we hope to achieve significant code cleanup and feature
improvements in the MCE code, and we hope to be able to drop the
/dev/mcelog facility in the end.

This patch is just a plain dumb dump of mce_log() records to
the tracepoints / perf events framework - a first proof of
concept step.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 09:43:38 +02:00
Ingo Molnar
9dbdd6c41c Merge commit 'v2.6.32-rc4' into perf/core
Merge reason: we were on an -rc1 base, merge up to -rc4.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 09:31:34 +02:00
Arnaldo Carvalho de Melo
a2e2725541 net: Introduce recvmmsg socket syscall
Meaning receive multiple messages, reducing the number of syscalls and
net stack entry/exit operations.

Next patches will introduce mechanisms where protocols that want to
optimize this operation will provide an unlocked_recvmsg operation.

This takes into account comments made by:

. Paul Moore: sock_recvmsg is called only for the first datagram,
  sock_recvmsg_nosec is used for the rest.

. Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
  works in the same fashion as the ppoll one.

  If the underlying protocol returns a datagram with MSG_OOB set, this
  will make recvmmsg return right away with as many datagrams (+ the OOB
  one) it has received so far.

. Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
  datagrams and then recvmsg returns an error, recvmmsg will return
  the successfully received datagrams, store the error and return it
  in the next call.

This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
where we will be able to acquire the lock only at batch start and end, not at
every underlying recvmsg call.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-12 23:40:10 -07:00
Ingo Molnar
7a693d3f0d perf_events, x86: Fix event constraints code
There was namespace overlap due to a rename i did - this caused
the following build warning, reported by Stephen Rothwell against
linux-next x86_64 allmodconfig:

  arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx':
  arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function

This is a real bug not just a warning: fix it by renaming the
global event-constraints table pointer to 'event_constraints'.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 08:19:53 +02:00
H. Peter Anvin
98272ed0d2 x86: use kernel_stack_pointer() in kprobes.c
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame.  For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
2009-10-12 14:19:35 -07:00
H. Peter Anvin
5ca6c0ca5d x86: use kernel_stack_pointer() in kgdb.c
The way to obtain a kernel-mode stack pointer from a struct
pt_regs in 32-bit mode is "subtle": the stack doesn't actually
contain the stack pointer, but rather the location where it would
have been marks the actual previous stack frame.  For clarity, use
kernel_stack_pointer() instead of coding this weirdness
explicitly.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
2009-10-12 14:19:35 -07:00
H. Peter Anvin
a343c75d33 x86: use kernel_stack_pointer() in dumpstack.c
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame.  For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.

Furthermore, user_mode() is only valid when the process is known to
not run in V86 mode.  Use the safer user_mode_vm() instead.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12 14:19:34 -07:00
H. Peter Anvin
def3c5d0a3 x86: use kernel_stack_pointer() in process_32.c
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame.  For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12 14:19:34 -07:00
David Rientjes
8716273cae x86: Export srat physical topology
This is the counterpart to "x86: export k8 physical topology" for
SRAT. It is not as invasive because the acpi code already seperates
node setup into detection and registration steps, with the
exception of registering e820 active regions in
acpi_numa_memory_affinity_init().  This is now moved to
acpi_scan_nodes() if NUMA emulation is disabled or deferred.

acpi_numa_init() now returns a value which specifies whether an
underlying SRAT was located.  If so, that topology can be used by
the emulation code to interleave emulated nodes over physical nodes
or to register the nodes for ACPI.

acpi_get_nodes() may now be used to export the srat physical
topology of the machine for NUMA emulation.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 22:56:46 +02:00
David Rientjes
8ee2debce3 x86: Export k8 physical topology
To eventually interleave emulated nodes over physical nodes, we
need to know the physical topology of the machine without actually
registering it.  This does the k8 node setup in two parts:
detection and registration.  NUMA emulation can then used the
physical topology detected to setup the address ranges of emulated
nodes accordingly.  If emulation isn't used, the k8 nodes are
registered as normal.

Two formals are added to the x86 NUMA setup functions: `acpi' and
`k8'. These represent whether ACPI or K8 NUMA has been detected;
both cannot be true at the same time.  This specifies to the NUMA
emulation code whether an underlying physical NUMA topology exists
and which interface to use.

This patch deals solely with separating the k8 setup path into
Northbridge detection and registration steps and leaves the ACPI
changes for a subsequent patch.  The `acpi' formal is added here,
however, to avoid touching all the header files again in the next
patch.

This approach also ensures emulated nodes will not span physical
nodes so the true memory latency is not misrepresented.

k8_get_nodes() may now be used to export the k8 physical topology
of the machine for NUMA emulation.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 22:56:45 +02:00
H. Peter Anvin
d1705c558c x86: fix kernel panic on 32 bits when profiling
Latest kernel has a kernel panic in booting on i386 machine when
profile=2 setting in cmdline.  It is due to 'sp' being incorrect in
profile_pc().

BUG: unable to handle kernel NULL pointer dereference at 00000246
IP: [<c01288b6>] profile_pc+0x2a/0x48
*pde = 00000000
Oops: 0000 [#1] SMP

This differs from the original version by Alex Shi in that we use the
kernel_stack_pointer() inline already defined in <asm/ptrace.h> for
this purpose, instead of #ifdef.

Originally-by: Alex Shi <alex.shi@intel.com>
Cc: "Chen, Tim C" <tim.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12 11:53:51 -07:00
Brian Gerst
ae24ffe5ec x86, 64-bit: Move K8 B step iret fixup to fault entry asm
Move the handling of truncated %rip from an iret fault to the fault
entry path.

This allows x86-64 to use the standard search_extable() function.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <1255357103-5418-1-git-send-email-brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 18:29:46 +02:00
Jan Beulich
7a4b7e5e74 x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop
Move the trampoline and accessors back out of .cpuinit.* for the
case of 64-bits+ACPI_SLEEP.

This solves s2ram hangs reported in:

  http://bugzilla.kernel.org/show_bug.cgi?id=14279

Reported-and-bisected-by: Christian Casteyde <casteyde.christian@free.fr>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: <bugzilla-daemon@bugzilla.kernel.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 18:06:48 +02:00
David Woodhouse
9a821b2316 x86: Move pci_iommu_init to rootfs_initcall()
We want this to happen after the PCI quirks, which are now running at
the very end of the fs_initcalls.

This works around the BIOS problems which were originally addressed by
commit db8be50c43 ('USB: Work around BIOS
bugs by quiescing USB controllers earlier'), which was reverted in
commit d93a8f829f.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-10-12 14:42:11 +01:00
Christoph Lameter
494f6a9e12 this_cpu: Use this_cpu_xx in nmi handling
this_cpu_inc/dec reduces the number of instructions needed.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-10-12 19:51:49 +09:00
Borislav Petkov
fb2531953f mce, edac: Use an atomic notifier for MCEs decoding
Add an atomic notifier which ensures proper locking when conveying
MCE info to EDAC for decoding. The actual notifier call overrides a
default, negative priority notifier.

Note: make sure we register the default decoder only once since
mcheck_init() runs on each CPU.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20091003065752.GA8935@liondog.tnic>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 12:24:45 +02:00
Joe Perches
3bb258bf43 ftrace.c: Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
- Remove prefixes from pr_<level>, use pr_fmt(fmt).

No change in output.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <9b377eefae9e28c599dd4a17bdc81172965e9931.1254701151.git.joe@perches.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 08:05:40 +02:00
Yinghai Lu
15b812f1d0 pci: increase alignment to make more space for hidden code
As reported in

	http://bugzilla.kernel.org/show_bug.cgi?id=13940

on some system when acpi are enabled, acpi clears some BAR for some
devices without reason, and kernel will need to allocate devices for
them.  It then apparently hits some undocumented resource conflict,
resulting in non-working devices.

Try to increase alignment to get more safe range for unassigned devices.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-11 14:43:36 -07:00