Commit graph

2908 commits

Author SHA1 Message Date
Richard Weinberger
449a66fd1f x86: Remove warning and warning_symbol from struct stacktrace_ops
Both warning and warning_symbol are nowhere used.
Let's get rid of them.

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Soeren Sandmann Pedersen <ssp@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: x86 <x86@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Link: http://lkml.kernel.org/r/1305205872-10321-2-git-send-email-richard@nod.at
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-05-12 15:31:28 +02:00
Avi Kivity
ca1d4a9e77 KVM: x86 emulator: drop vcpu argument from pio callbacks
Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:11 -04:00
Avi Kivity
0f65dd70a4 KVM: x86 emulator: drop vcpu argument from memory read/write callbacks
Making the emulator caller agnostic.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:10 -04:00
Nelson Elhage
3d9b938eef KVM: emulator: Use linearize() when fetching instructions
Since segments need to be handled slightly differently when fetching
instructions, we add a __linearize helper that accepts a new 'fetch' boolean.

[avi: fix oops caused by wrong segmented_address initialization order]

Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:10 -04:00
Joerg Roedel
7c4c0f4fd5 KVM: X86: Update last_guest_tsc in vcpu_put
The last_guest_tsc is used in vcpu_load to adjust the
tsc_offset since tsc-scaling is merged. So the
last_guest_tsc needs to be updated in vcpu_put instead of
the the last_host_tsc. This is fixed with this patch.

Reported-by: Jan Kiszka <jan.kiszka@web.de>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:10 -04:00
Gleb Natapov
7ae441eac5 KVM: emulator: do not needlesly sync registers from emulator ctxt to vcpu
Currently we sync registers back and forth before/after exiting
to userspace for IO, but during IO device model shouldn't need to
read/write the registers, so we can as well skip those sync points. The
only exaception is broken vmware backdor interface. The new code sync
registers content during IO only if registers are read from/written to
by userspace in the middle of the IO operation and this almost never
happens in practise.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-05-11 07:57:08 -04:00
Joerg Roedel
92a1f12d25 KVM: X86: Implement userspace interface to set virtual_tsc_khz
This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:06 -04:00
Joerg Roedel
857e40999e KVM: X86: Delegate tsc-offset calculation to architecture code
With TSC scaling in SVM the tsc-offset needs to be
calculated differently. This patch propagates this
calculation into the architecture specific modules so that
this complexity can be handled there.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:05 -04:00
Joerg Roedel
4051b18801 KVM: X86: Implement call-back to propagate virtual_tsc_khz
This patch implements a call-back into the architecture code
to allow the propagation of changes to the virtual tsc_khz
of the vcpu.
On SVM it updates the tsc_ratio variable, on VMX it does
nothing.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:05 -04:00
Joerg Roedel
1e993611d0 KVM: X86: Let kvm-clock report the right tsc frequency
This patch changes the kvm_guest_time_update function to use
TSC frequency the guest actually has for updating its clock.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:04 -04:00
Joerg Roedel
fbc0db76b7 KVM: SVM: Implement infrastructure for TSC_RATE_MSR
This patch enhances the kvm_amd module with functions to
support the TSC_RATE_MSR which can be used to set a given
tsc frequency for the guest vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:04 -04:00
Xiao Guangrong
7c5625227f KVM: MMU: remove mmu_seq verification on pte update path
The mmu_seq verification can be removed since we get the pfn in the
protection of mmu_lock.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:03 -04:00
Joerg Roedel
f6511935f4 KVM: SVM: Add checks for IO instructions
This patch adds code to check for IOIO intercepts on
instructions decoded by the KVM instruction emulator.

[avi: fix build error due to missing #define D2bvIP]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:03 -04:00
Joerg Roedel
8061252ee0 KVM: SVM: Add intercept checks for remaining twobyte instructions
This patch adds intercepts checks for the remaining twobyte
instructions to the KVM instruction emulator.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:02 -04:00
Joerg Roedel
3b88e41a41 KVM: SVM: Add intercept check for accessing dr registers
This patch adds the intercept checks for instruction
accessing the debug registers.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:01 -04:00
Joerg Roedel
cfec82cb7d KVM: SVM: Add intercept check for emulated cr accesses
This patch adds all necessary intercept checks for
instructions that access the crX registers.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:01 -04:00
Joerg Roedel
8a76d7f25f KVM: x86: Add x86 callback for intercept check
This patch adds a callback into kvm_x86_ops so that svm and
vmx code can do intercept checks on emulated instructions.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:01 -04:00
Joerg Roedel
8ea7d6aef8 KVM: x86 emulator: Add flag to check for protected mode instructions
This patch adds a flag for the opcoded to tag instruction
which are only recognized in protected mode. The necessary
check is added too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:01 -04:00
Joerg Roedel
d09beabd7c KVM: x86 emulator: Add check_perm callback
This patch adds a check_perm callback for each opcode into
the instruction emulator. This will be used to do all
necessary permission checks on instructions before checking
whether they are intercepted or not.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:01 -04:00
Joerg Roedel
775fde8648 KVM: x86 emulator: Don't write-back cpu-state on X86EMUL_INTERCEPTED
This patch prevents the changed CPU state to be written back
when the emulator detected that the instruction was
intercepted by the guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:00 -04:00
Avi Kivity
3c6e276f22 KVM: x86 emulator: add SVM intercepts
Add intercept codes for instructions defined by SVM as
interceptable.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:00 -04:00
Avi Kivity
c4f035c60d KVM: x86 emulator: add framework for instruction intercepts
When running in guest mode, certain instructions can be intercepted by
hardware.  This also holds for nested guests running on emulated
virtualization hardware, in particular instructions emulated by kvm
itself.

This patch adds a framework for intercepting instructions.  If an
instruction is marked for interception, and if we're running in guest
mode, a callback is called to check whether an intercept is needed or
not.  The callback is called at three points in time: immediately after
beginning execution, after checking privilge exceptions, and after
checking memory exception.  This suits the different interception points
defined for different instructions and for the various virtualization
instruction sets.

In addition, a new X86EMUL_INTERCEPT is defined, which any callback or
memory access may define, allowing the more complicated intercepts to be
implemented in existing callbacks.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:00 -04:00
Avi Kivity
1253791df9 KVM: x86 emulator: SSE support
Add support for marking an instruction as SSE, switching registers used
to the SSE register file.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:56:59 -04:00
Avi Kivity
5037f6f324 KVM: x86 emulator: define callbacks for using the guest fpu within the emulator
Needed for emulating fpu instructions.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:56:58 -04:00
Avi Kivity
1d6b114f20 KVM: x86 emulator: do not munge rep prefix
Currently we store a rep prefix as 1 or 2 depending on whether it is a REPE or
REPNE.  Since sse instructions depend on the prefix value, store it as the
original opcode to simplify things further on.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:56:58 -04:00
Avi Kivity
cef4dea07f KVM: 16-byte mmio support
Since sse instructions can issue 16-byte mmios, we need to support them.  We
can't increase the kvm_run mmio buffer size to 16 bytes without breaking
compatibility, so instead we break the large mmios into two smaller 8-byte
ones.  Since the bus is 64-bit we aren't breaking any atomicity guarantees.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:56:58 -04:00
Avi Kivity
69c7302890 KVM: VMX: Cache cpl
We may read the cpl quite often in the same vmexit (instruction privilege
check, memory access checks for instruction and operands), so we gain
a bit if we cache the value.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:56:54 -04:00
Avi Kivity
6de12732c4 KVM: VMX: Optimize vmx_get_rflags()
If called several times within the same exit, return cached results.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:56:54 -04:00
Yinghai Lu
5491ff511d x86/PCI: Remove dma32_reserve_bootmem
This workaround holds a dma32 buffer at early boot to prevent later
bootmem allocations from stealing it in the case of large RAM configs.

Now that x86 is using memblock, and the nobootmem wrapper does top-down
allocation, it's no longer necessary, so remove it.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:32 -07:00
Ingo Molnar
932fed4e2e Merge commit 'v2.6.39-rc7' into perf/core
Merge reason: pull in the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10 17:05:45 +02:00
Joerg Roedel
72fe00f01f x86/amd-iommu: Use threaded interupt handler
Move the interupt handling for the iommu into the interupt
thread to reduce latencies and prepare interupt handling for
pri handling.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-05-10 11:07:58 +02:00
Joerg Roedel
604c307bf4 Merge branches 'dma-debug/next', 'amd-iommu/command-cleanups', 'amd-iommu/ats' and 'amd-iommu/extended-features' into iommu/2.6.40
Conflicts:
	arch/x86/include/asm/amd_iommu_types.h
	arch/x86/kernel/amd_iommu.c
	arch/x86/kernel/amd_iommu_init.c
2011-05-10 10:25:23 +02:00
Jack Steiner
1d44e8288a x86, UV: Fix NMI handler for UV platforms
This fixes problems seen on UV systems handling NMIs from the
node controller.

I isolated the "dazed..." messages that I saw earlier to a bug in
the BMC on our platform. It was sending NMIs w/o properly setting
a register that indicated the source of NMI.

So rather than _assuming_ any unhandled NMI came from the UV system
maintenance console (SMC), add a check to verify that the SMC actually
sent the NMI.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: gorcunov@gmail.com
Cc: dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10 09:26:55 +02:00
Matthew Garrett
9cd2b07c19 x86, efi: Consolidate EFI nx control
The core EFI code and 64-bit EFI code currently have independent
implementations of code for setting memory regions as executable or not.
Let's consolidate them.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Link: http://lkml.kernel.org/r/1304623186-18261-2-git-send-email-mjg@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-05-09 12:14:29 -07:00
Anton Blanchard
228e548e60 net: Add sendmmsg socket system call
This patch adds a multiple message send syscall and is the send
version of the existing recvmmsg syscall. This is heavily
based on the patch by Arnaldo that added recvmmsg.

I wrote a microbenchmark to test the performance gains of using
this new syscall:

http://ozlabs.org/~anton/junkcode/sendmmsg_test.c

The test was run on a ppc64 box with a 10 Gbit network card. The
benchmark can send both UDP and RAW ethernet packets.

64B UDP

batch   pkts/sec
1       804570
2       872800 (+ 8 %)
4       916556 (+14 %)
8       939712 (+17 %)
16      952688 (+18 %)
32      956448 (+19 %)
64      964800 (+20 %)

64B raw socket

batch   pkts/sec
1       1201449
2       1350028 (+12 %)
4       1461416 (+22 %)
8       1513080 (+26 %)
16      1541216 (+28 %)
32      1553440 (+29 %)
64      1557888 (+30 %)

We see a 20% improvement in throughput on UDP send and 30%
on raw socket send.

[ Add sparc syscall entries. -DaveM ]

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-05 11:10:14 -07:00
Tejun Heo
1b7e03ef75 x86, NUMA: Enable emulation on 32bit too
Now that NUMA init path is unified, NUMA emulation can be enabled on
32bit.  Make numa_emluation.c safe on 32bit by doing the followings.

* Define MAX_DMA32_PFN on 32bit too.

* Include bootmem.h for max_pfn declaration.

* Use u64 explicitly and always use PFN_PHYS() when converting page
  number to address.

* Avoid __udivdi3() generation on 32bit by doing number of pages
  calculation instead in split_nodes_interleave().

And drop X86_64 dependency from Kconfig.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 17:24:48 +02:00
Tejun Heo
752d4f372f x86, NUMA: Make numa_init_array() static
numa_init_array() no longer has users outside of numa.c.  Make it
static.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 17:24:48 +02:00
Tejun Heo
bd6709a91a x86, NUMA: Make 32bit use common NUMA init path
With both _numa_init() methods converted and the rest of init code
adjusted, numa_32.c now can switch from the 32bit only init code to
the common one in numa.c.

* Shim get_memcfg_*()'s are dropped and initmem_init() calls
  x86_numa_init(), which is updated to handle NUMAQ.

* All boilerplate operations including node range limiting, pgdat
  alloc/init are handled by numa_init().  32bit only implementation is
  removed.

* 32bit numa_add_memblk(), numa_set_distance() and
  memory_add_physaddr_to_nid() removed and common versions in
  numa_32.c enabled for 32bit.

This change causes the following behavior changes.

* NODE_DATA()->node_start_pfn/node_spanned_pages properly initialized
  for 32bit too.

* Much more sanity checks and configuration cleanups.

* Proper handling of node distances.

* The same NUMA init messages as 64bit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 17:24:48 +02:00
Tejun Heo
744baba0c4 x86, NUMA: Enable build of generic NUMA init code on 32bit
Generic NUMA init code was moved to numa.c from numa_64.c but is still
guaraded by CONFIG_X86_64.  This patch removes the compile guard and
enables compiling on 32bit.

* numa_add_memblk() and numa_set_distance() clash with the shim
  implementation in numa_32.c and are left out.

* memory_add_physaddr_to_nid() clashes with 32bit implementation and
  is left out.

* MAX_DMA_PFN definition in dma.h moved out of !CONFIG_X86_32.

* node_data definition in numa_32.c removed in favor of the one in
  numa.c.

There are places where ulong is assumed to be 64bit.  The next patch
will fix them up.  Note that although the code is compiled it isn't
used yet and this patch doesn't cause any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:53 +02:00
Tejun Heo
a4106eae65 x86, NUMA: Move NUMA init logic from numa_64.c to numa.c
Move the generic 64bit NUMA init machinery from numa_64.c to numa.c.

* node_data[], numa_mem_info and numa_distance
* numa_add_memblk[_to](), numa_remove_memblk[_from]()
* numa_set_distance() and friends
* numa_init() and all the numa_meminfo handling helpers called from it
* dummy_numa_init()
* memory_add_physaddr_to_nid()

A new function x86_numa_init() is added and the content of
numa_64.c::initmem_init() is moved into it.  initmem_init() now simply
calls x86_numa_init().

Constants and numa_off declaration are moved from numa_{32|64}.h to
numa.h.

This is code reorganization and doesn't involve any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:53 +02:00
Tejun Heo
299a180aec x86-32, NUMA: Update numaq to use new NUMA init protocol
Update numaq such that it calls numa_add_memblk() and sets
numa_nodes_parsed instead of directly diddling with NUMA states.  The
original get_memcfg_numaq() is renamed to numaq_numa_init() and new
get_memcfg_numaq() is created in numa_32.c.

The shim numa_add_memblk() implementation handles node_start/end_pfn[]
and node_set_online() for nodes with memory.  The new
get_memcfg_numaq() exactly the same with get_memcfg_from_srat() other
than calling the numaq init function.  Things get_memcfgs_numaq() do
are not strictly necessary for numaq but added for consistency and to
help unifying NUMA init handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:53 +02:00
Tejun Heo
5acd91ab83 x86-32, NUMA: Replace srat_32.c with srat.c
SRAT support implementation in srat_32.c and srat.c are generally
similar; however, there are some differences.

First of all, 64bit implementation supports more types of SRAT
entries.  64bit supports x2apic, affinity, memory and SLIT.  32bit
only supports processor and memory.

Most other differences stem from different initialization protocols
employed by 64bit and 32bit NUMA init paths.

On 64bit,

* Mappings among PXM, node and apicid are directly done in each SRAT
  entry callback.

* Memory affinity information is passed to numa_add_memblk() which
  takes care of all interfacing with NUMA init.

* Doesn't directly initialize NUMA configurations.  All the
  information is recorded in numa_nodes_parsed and memblks.

On 32bit,

* Checks numa_off.

* Things go through one more level of indirection via private tables
  but eventually end up initializing the same mappings.

* node_start/end_pfn[] are initialized and
  memblock_x86_register_active_regions() is called for each memory
  chunk.

* node_set_online() is called for each online node.

* sort_node_map() is called.

There are also other minor differences in sanity checking and messages
but taking 64bit version should be good enough.

This patch drops the 32bit specific implementation and makes the 64bit
implementation common for both 32 and 64bit.

The init protocol differences are dealt with in two places - the
numa_add_memblk() shim added in the previous patch and new temporary
numa_32.c:get_memcfg_from_srat() which wraps invocation of
x86_acpi_numa_init().

The shim numa_add_memblk() handles the folowings.

* node_start/end_pfn[] initialization.

* node_set_online() for memory nodes.

* Invocation of memblock_x86_register_active_regions().

The shim get_memcfg_from_srat() handles the followings.

* numa_off check.

* node_set_online() for CPU nodes.

* sort_node_map() invocation.

* Clearing of numa_nodes_parsed and active_ranges on failure.

The shims are temporary and will be removed as the generic NUMA init
path in 32bit is replaced with 64bit one.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:53 +02:00
Tejun Heo
b0d310801a x86-32, NUMA: implement temporary NUMA init shims
To help transition to common NUMA init, implement temporary 32bit
shims for numa_add_memblk() and numa_set_distance().
numa_add_memblk() registers the memblk and adjusts
node_start/end_pfn[].  numa_set_distance() is noop.

These shims will allow using 64bit NUMA init functions on 32bit and
gradual transition to common NUMA init path.

For detailed description, please read description of commits which
make use of the shim functions.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:53 +02:00
Tejun Heo
e6df595b37 x86, NUMA: Move numa_nodes_parsed to numa.[hc]
Move numa_nodes_parsed from numa_64.[hc] to numa.[hc] to prepare for
NUMA init path unification.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:53 +02:00
Tejun Heo
daf4f480ae x86-32, NUMA: Move get_memcfg_numa() into numa_32.c
There's no reason get_memcfg_numa() to be implemented inline in
mmzone_32.h.  Move it to numa_32.c and also make
get_memcfg_numa_flag() static.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:53 +02:00
Tejun Heo
1201e10a09 x86, NUMA: trivial cleanups
* Kill no longer used struct bootnode.

* Kill dangling declaration of pxm_to_nid() in numa_32.h.

* Make setup_node_bootmem() static.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:52 +02:00
Tejun Heo
84914ed0ec x86-32, NUMA: Make apic->x86_32_numa_cpu_node() optional
NUMAQ is the only meaningful user of this callback and
setup_local_APIC() the only callsite.  Stop torturing everyone else by
making the callback optional and removing all the boilerplate
implementations and assignments.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:52 +02:00
Tejun Heo
6bd262731b x86, NUMA: Unify 32/64bit numa_cpu_node() implementation
Currently, the only meaningful user of apic->x86_32_numa_cpu_node() is
NUMAQ which returns valid mapping only after CPU is initialized during
SMP bringup; thus, the previous patch to set apicid -> node in
setup_local_APIC() makes __apicid_to_node[] always contain the correct
mapping whether custom apic->x86_32_numa_cpu_node() is used or not.

So, there is no reason to keep separate 32bit implementation.  We can
always consult __apicid_to_node[].  Move 64bit implementation from
numa_64.c to numa.c and remove 32bit implementation from numa_32.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-05-02 14:18:52 +02:00
Tejun Heo
ba67cf5cf2 Merge branch 'x86/urgent' into x86-mm
Merge reason: Pick up the following two fix commits.

  2be19102b7: x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo()
  765af22da8: x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change

Scheduled NUMA init 32/64bit unification changes depend on these.

Signed-off-by: Tejun Heo <tj@kernel.org>
2011-05-02 14:16:47 +02:00
Tejun Heo
aff364860a Merge branch 'x86/numa' into x86-mm
Merge reason: Pick up x86-32 remap allocator cleanup changes - 14
commits, 3fe14ab541^..993ba1585c.

  3fe14ab541: x86-32, numa: Fix failure condition check in alloc_remap()
  993ba1585c: x86-32, numa: Update remap allocator comments

Scheduled NUMA init 32/64bit unification changes depend on them.

Signed-off-by: Tejun Heo <tj@kernel.org>
2011-05-02 14:08:47 +02:00