Split xen-asm into 32- and 64-bit files, and implement the 64-bit
variants.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add the Xen entrypoint and ELF notes to head_64.S. Adapts xen-head.S
to compile either 32-bit or 64-bit.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A number of random changes to make xen/smp.c compile in 64-bit mode.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>a
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As a stopgap until Mike Travis's x86-64 gs-based percpu patches are
ready, provide workaround functions for x86_read/write_percpu for
Xen's use.
Specifically, this means that we can't really make use of vcpu
placement, because we can't use a single gs-based memory access to get
to vcpu fields. So disable all that for now.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move all the smp_ops setup into smp.c, allowing a lot of things to
become static.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 stores the active_mm in the pda, so fetch it from there.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need extra pv_mmu_ops for 64-bit, to deal with the extra level of
pagetable.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Copy 64-bit definitions of various interface structures into place.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need set_pte to work from a relatively early point, so enable it
from the start.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add xen_timer_resume() hook.
Timer resume should be done after event channel is resumed.
add xen_arch_resume() hook when ipi becomes usable after resume.
After resume, some cpu specific resource must be reinitialized
on ia64 that can't be set by another cpu.
However available hooks is run once on only one cpu so that ipi has
to be used.
During stop_machine_run() ipi can't be used because interrupt is masked.
So add another hook after stop_machine_run().
Another approach might be use resume hook which is run by
device_resume(). However device_resume() may be executed on
suspend error recovery path.
So it is necessary to determine whether it is executed on real resume path
or error recovery path.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Print a backtrace if a multicall fails, to help with debugging.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows Xen's xen_cpu_up() to allocate a pda for the new CPU.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Update arch/x86's use of page-aligned variables. The change to
arch/x86/xen/mmu.c fixes an actual bug, but the rest are cleanups
and to set a precedent.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
process_64.c:__switch_to has some very old strange formatting, some of
it dating back to pre-git. Fix it up.
No functional changes.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Early fixmap will allocate its own L1 pagetable page for fixmap
mappings, so there's no need to preallocate one.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Call paravirt_pagetable_setup_{start,done}
These paravirt_ops functions were not being called on x86_64.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using ACPI to find free address space allows us to find a gap for the
unallocated PCI resources or MMIO resources for hotplug devices within
the BIOS allowed PCI regions.
It works by evaluating the _CRS object under PCI0 looking for producer
resources. Then searches the e820 memory space for a gap within these
producer resources.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Dave Hansen reported a build error on 32bit which went unnoticed
as newer gcc versions seem to optimize unused static functions
away before compiling them.
Make vread_tsc() depend on CONFIG_X86_64
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
fix merge fallout:
arch/x86/pci/amd_bus.c: In function ‘enable_pci_io_ecs':
arch/x86/pci/amd_bus.c:581: error: too many arguments to function ‘on_each_cpu'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'timers/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: add PCI ID for 6300ESB force hpet
x86: add another PCI ID for ICH6 force-hpet
kernel-paramaters: document pmtmr= command line option
acpi_pm clccksource: fix printk format warning
nohz: don't stop idle tick if softirqs are pending.
pmtmr: allow command line override of ioport
nohz: reduce jiffies polling overhead
hrtimer: Remove unused variables in ktime_divns()
hrtimer: remove warning in hres_timers_resume
posix-timers: print RT watchdog message
On Mon, 2008-04-21 at 18:54 -0400, Masami Hiramatsu wrote:
> Thank you for reporting.
>
> Actually, kprobes tries to fixup thread's flags in post_kprobe_handler
> (which is called from kprobe_exceptions_notify) by
> trace_hardirqs_fixup_flags(pt_regs->flags). However, even the irq flag
> is set in pt_regs->flags, true hardirq is still off until returning
> from do_debug. Thus, lockdep assumes that hardirq is off without annotation.
>
> IMHO, one possible solution is that fixing hardirq flags right after
> notify_die in do_debug instead of in post_kprobe_handler.
My reply to BZ 10489:
> [ 2.707509] Kprobe smoke test started
> [ 2.709300] ------------[ cut here ]------------
> [ 2.709420] WARNING: at kernel/lockdep.c:2658 check_flags+0x4d/0x12c()
> [ 2.709541] Modules linked in:
> [ 2.709588] Pid: 1, comm: swapper Not tainted 2.6.25.jml.057 #1
> [ 2.709588] [<c0126acc>] warn_on_slowpath+0x41/0x51
> [ 2.709588] [<c010bafc>] ? save_stack_trace+0x1d/0x3b
> [ 2.709588] [<c0140a83>] ? save_trace+0x37/0x89
> [ 2.709588] [<c011987d>] ? kernel_map_pages+0x103/0x11c
> [ 2.709588] [<c0109803>] ? native_sched_clock+0xca/0xea
> [ 2.709588] [<c0142958>] ? mark_held_locks+0x41/0x5c
> [ 2.709588] [<c0382580>] ? kprobe_exceptions_notify+0x322/0x3af
> [ 2.709588] [<c0142aff>] ? trace_hardirqs_on+0xf1/0x119
> [ 2.709588] [<c03825b3>] ? kprobe_exceptions_notify+0x355/0x3af
> [ 2.709588] [<c0140823>] check_flags+0x4d/0x12c
> [ 2.709588] [<c0143c9d>] lock_release+0x58/0x195
> [ 2.709588] [<c038347c>] ? __atomic_notifier_call_chain+0x0/0x80
> [ 2.709588] [<c03834d6>] __atomic_notifier_call_chain+0x5a/0x80
> [ 2.709588] [<c0383508>] atomic_notifier_call_chain+0xc/0xe
> [ 2.709588] [<c013b6d4>] notify_die+0x2d/0x2f
> [ 2.709588] [<c038168a>] do_debug+0x67/0xfe
> [ 2.709588] [<c0381287>] debug_stack_correct+0x27/0x30
> [ 2.709588] [<c01564c0>] ? kprobe_target+0x1/0x34
> [ 2.709588] [<c0156572>] ? init_test_probes+0x50/0x186
> [ 2.709588] [<c04fae48>] init_kprobes+0x85/0x8c
> [ 2.709588] [<c04e947b>] kernel_init+0x13d/0x298
> [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298
> [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298
> [ 2.709588] [<c0105ef7>] kernel_thread_helper+0x7/0x10
> [ 2.709588] =======================
> [ 2.709588] ---[ end trace 778e504de7e3b1e3 ]---
> [ 2.709588] possible reason: unannotated irqs-off.
> [ 2.709588] irq event stamp: 370065
> [ 2.709588] hardirqs last enabled at (370065): [<c0382580>] kprobe_exceptions_notify+0x322/0x3af
> [ 2.709588] hardirqs last disabled at (370064): [<c0381bb7>] do_int3+0x1d/0x7d
> [ 2.709588] softirqs last enabled at (370050): [<c012b464>] __do_softirq+0xfa/0x100
> [ 2.709588] softirqs last disabled at (370045): [<c0107438>] do_softirq+0x74/0xd9
> [ 2.714751] Kprobe smoke test passed successfully
how I love this stuff...
Ok, do_debug() is a trap, this can happen at any time regardless of the
machine's IRQ state. So the first thing we do is fix up the IRQ state.
Then we call this die notifier stuff; and return with messed up IRQ
state... YAY.
So, kprobes fudges it..
notify_die(DIE_DEBUG)
kprobe_exceptions_notify()
post_kprobe_handler()
modify regs->flags
trace_hardirqs_fixup_flags(regs->flags); <--- must be it
So what's the use of modifying flags if they're not meant to take effect
at some point.
/me tries to reproduce issue; enable kprobes test thingy && boot
OK, that reproduces..
So the below makes it work - but I'm not getting this code; at the time
I wrote that stuff I CC'ed each and every kprobe maintainer listed in
the usual places but got no reposonse - can some please explain this
stuff to me?
Are the saved flags only for the TF bit or are they made in full effect
later (and if so, where) ?
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-2.6.27' of git://git.infradead.org/users/dwmw2/firmware-2.6: (64 commits)
firmware: convert sb16_csp driver to use firmware loader exclusively
dsp56k: use request_firmware
edgeport-ti: use request_firmware()
edgeport: use request_firmware()
vicam: use request_firmware()
dabusb: use request_firmware()
cpia2: use request_firmware()
ip2: use request_firmware()
firmware: convert Ambassador ATM driver to request_firmware()
whiteheat: use request_firmware()
ti_usb_3410_5052: use request_firmware()
emi62: use request_firmware()
emi26: use request_firmware()
keyspan_pda: use request_firmware()
keyspan: use request_firmware()
ttusb-budget: use request_firmware()
kaweth: use request_firmware()
smctr: use request_firmware()
firmware: convert ymfpci driver to use firmware loader exclusively
firmware: convert maestro3 driver to use firmware loader exclusively
...
Fix up trivial conflicts with BKL removal in drivers/char/dsp56k.c and
drivers/char/ip2/ip2main.c manually.
* 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, generic: mark early_printk as asmlinkage
printk: export console_drivers
printk: remember the message level for multi-line output
printk: refactor processing of line severity tokens
printk: don't prefer unsuited consoles on registration
printk: clean up recursion check related static variables
namespacecheck: more kernel/printk.c fixes
namespacecheck: fix kernel printk.c
* 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (228 commits)
ftrace: build fix for ftraced_suspend
ftrace: separate out the function enabled variable
ftrace: add ftrace_kill_atomic
ftrace: use current CPU for function startup
ftrace: start wakeup tracing after setting function tracer
ftrace: check proper config for preempt type
ftrace: trace schedule
ftrace: define function trace nop
ftrace: move sched_switch enable after markers
ftrace: prevent ftrace modifications while being kprobe'd, v2
fix "ftrace: store mcount address in rec->ip"
mmiotrace broken in linux-next (8-bit writes only)
ftrace: avoid modifying kprobe'd records
ftrace: freeze kprobe'd records
kprobes: enable clean usage of get_kprobe
ftrace: store mcount address in rec->ip
ftrace: build fix with gcc 4.3
namespacecheck: fixes
ftrace: fix "notrace" filtering priority
ftrace: fix printout
...
John Keller reports that PCI config space access is broken on machines
with more than one domain. conf1 accesses only work for domain 0, so make sure
we check the domain number in the raw routines before trying conf1.
Reported-by: John Keller <jpk@sgi.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Explain that we set up the descriptors for Big Real Mode, and why we
do so. In particular, one system that is known to fail without it is
the Lenovo X61.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The explanation for recent video BIOS suspend quirk failures is that
the VESA BIOS expects to be entered in Big Real Mode (*.limit = 0xffffffff)
instead of ordinary Real Mode (*.limit = 0xffff).
This patch changes the segment descriptors to Big Real Mode instead.
The segment descriptor registers (what Intel calls "segment cache") is
always active. The only thing that changes based on CR0.PE is how it is
*loaded* and the interpretation of the CS flags.
The segment descriptor registers contain of the following sub-registers:
selector (the "visible" part), base, limit and flags. In protected mode
or long mode, they are loaded from descriptors (or fs.base or gs.base can
be manipulated directly in long mode.) In real mode, the only thing
changed by a segment register load is the selector and the base, where the
base <- selector << 4. In particular, *the limit and the flags are not
changed*.
As far as the handling of the CS flags: a code segment cannot be writable
in protected mode, whereas it is "just another segment" in real mode, so
there is some kind of quirk that kicks in for this when CR0.PE <- 0. I'm
not sure if this is accomplished by actually changing the cs.flags register
or just changing the interpretation; it might be something that is
CPU-specific. In particular, the Transmeta CPUs had an explicit "CS is
writable if you're in real mode" override, so even if you had loaded CS
with an execute-only segment it'd be writable (but not readable!) on return
to real mode. I'm not at all sure if that is how other CPUs behave.
Signed-off-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>