Commit graph

13365 commits

Author SHA1 Message Date
Patrick McHardy
4e9cac2ba4 [NET]: Add __dev_getfirstbyhwtype
Add __dev_getfirstbyhwtype for callers that don't want a reference but
some data from the device and thus need to take the rtnl anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 03:28:13 -07:00
Randy Dunlap
be52178b9f [NET] skbuff: fix kernel-doc
Fix skbuff.h kernel-doc:
linux-2.6.21-git4//include/linux/skbuff.h:316): No description found for parameter 'transport_header'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 03:16:20 -07:00
David Howells
ef4533f8af [AFS]: Make the match_*() functions take const options.
Make the match_*() functions take a const pointer to the options table
and make strings pointers in the options table const too.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 03:10:39 -07:00
Eric Dumazet
709525fad8 [IPV6]: Get rid of __HAVE_ARCH_ADDR_SET.
__HAVE_ARCH_ADDR_SET seems unused these days, just get rid of it.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 03:08:43 -07:00
Avi Kivity
2ff81f70b5 KVM: Remove unused 'instruction_length'
As we no longer emulate in userspace, this is meaningless.  We don't
compute it on SVM anyway.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:32 +03:00
Avi Kivity
02c8320972 KVM: Don't require explicit indication of completion of mmio or pio
It is illegal not to return from a pio or mmio request without completing
it, as mmio or pio is an atomic operation.  Therefore, we can simplify
the userspace interface by avoiding the completion indication.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:32 +03:00
Avi Kivity
b8836737d9 KVM: Add fpu get/set operations
These are really helpful when migrating an floating point app to another
machine.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Avi Kivity
e8207547d2 KVM: Add physical memory aliasing feature
With this, we can specify that accesses to one physical memory range will
be remapped to another.  This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Avi Kivity
039576c03c KVM: Avoid guest virtual addresses in string pio userspace interface
The current string pio interface communicates using guest virtual addresses,
relying on userspace to translate addresses and to check permissions.  This
interface cannot fully support guest smp, as the check needs to take into
account two pages at one in case an unaligned string transfer straddles a
page boundary.

Change the interface not to communicate guest addresses at all; instead use
a buffer page (mmaped by userspace) and do transfers there.  The kernel
manages the virtual to physical translation and can perform the checks
atomically by taking the appropriate locks.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:25 +03:00
Avi Kivity
07c45a366d KVM: Allow kernel to select size of mmap() buffer
This allows us to store offsets in the kernel/user kvm_run area, and be
sure that userspace has them mapped.  As offsets can be outside the
kvm_run struct, userspace has no way of knowing how much to mmap.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity
1961d276c8 KVM: Add guest mode signal mask
Allow a special signal mask to be used while executing in guest mode.  This
allows signals to be used to interrupt a vcpu without requiring signal
delivery to a userspace handler, which is quite expensive.  Userspace still
receives -EINTR and can get the signal via sigwait().

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity
1b19f3e61d KVM: Add a special exit reason when exiting due to an interrupt
This is redundant, as we also return -EINTR from the ioctl, but it
allows us to examine the exit_reason field on resume without seeing
old data.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity
8eb7d334bd KVM: Fold kvm_run::exit_type into kvm_run::exit_reason
Currently, userspace is told about the nature of the last exit from the
guest using two fields, exit_type and exit_reason, where exit_type has
just two enumerations (and no need for more).  So fold exit_type into
exit_reason, reducing the complexity of determining what really happened.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity
b4e63f560b KVM: Allow userspace to process hypercalls which have no kernel handler
This is useful for paravirtualized graphics devices, for example.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity
5d308f4550 KVM: Add method to check for backwards-compatible API extensions
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity
739872c56f KVM: Renumber ioctls
The recent changes have left the ioctl numbers in complete disarray.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:23 +03:00
Avi Kivity
2a4dac3952 KVM: Remove minor wart from KVM_CREATE_VCPU ioctl
That ioctl does not transfer any data, so it should be an _IO rather than an
_IOW.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:23 +03:00
Avi Kivity
106b552b43 KVM: Remove the 'emulated' field from the userspace interface
We no longer emulate single instructions in userspace.  Instead, we service
mmio or pio requests.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:23 +03:00
Avi Kivity
06465c5a3a KVM: Handle cpuid in the kernel instead of punting to userspace
KVM used to handle cpuid by letting userspace decide what values to
return to the guest.  We now handle cpuid completely in the kernel.  We
still let userspace decide which values the guest will see by having
userspace set up the value table beforehand (this is necessary to allow
management software to set the cpu features to the least common denominator,
so that live migration can work).

The motivation for the change is that kvm kernel code can be impacted by
cpuid features, for example the x86 emulator.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:23 +03:00
Avi Kivity
46fc147788 KVM: Do not communicate to userspace through cpu registers during PIO
Currently when passing the a PIO emulation request to userspace, we
rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx
(on string instructions).  This (a) requires two extra ioctls for getting
and setting the registers and (b) is unfriendly to non-x86 archs, when
they get kvm ports.

So fix by doing the register fixups in the kernel and passing to userspace
only an abstract description of the PIO to be done.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:23 +03:00
Avi Kivity
9a2bb7f486 KVM: Use a shared page for kernel/user communication when runing a vcpu
Instead of passing a 'struct kvm_run' back and forth between the kernel and
userspace, allocate a page and allow the user to mmap() it.  This reduces
needless copying and makes the interface expandable by providing lots of
free space.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:23 +03:00
Avi Kivity
ff42697436 KVM: Export <linux/kvm.h>
This allows users to actually build prgrams that use kvm without
the entire source tree.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:22 +03:00
Avi Kivity
bbe4432e66 KVM: Use own minor number
Use the minor number (232) allocated to kvm by lanana.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:22 +03:00
Mike Frysinger
a830df367c Input: pull input.h into uinpit.h
uinput.h relies on structures found in input.h, so pull in the header

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2007-05-03 00:55:34 -04:00
Adrian Bunk
ecf36501bc PCI: the overdue removal of pci_module_init()
Unless we finally completely remove it, people will always add new users.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Adrian Bunk
5adc55da4a PCI: remove the broken PCI_MULTITHREAD_PROBE option
This patch removes the PCI_MULTITHREAD_PROBE option that had already 
been marked as broken.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Michael Ellerman
032de8e2fe MSI: Give archs the option to free all MSI/Xs at once.
This patch introduces an optional function, arch_teardown_msi_irqs(),
which gives an arch the opportunity to do per-device teardown for
MSI/X. If that's not required, the default version simply calls
arch_teardown_msi_irq() for each msi irq required.

arch_teardown_msi_irqs() is simply passed a pdev, attached to the pdev
is a list of msi_descs, it is up to the arch to free the irq associated
with each of these as appropriate.

For archs that _don't_ implement arch_teardown_msi_irqs(), all msi_descs
with irq == 0 are considered unallocated, and the arch teardown routine
is not called on them.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Michael Ellerman
9c8313343c MSI: Give archs the option to allocate all MSI/Xs at once.
This patch introduces an optional function, arch_setup_msi_irqs(),
(note the plural) which gives an arch the opportunity to do per-device
setup for MSI/X and then allocate all the requested MSI/Xs at once.

If that's not required by the arch, the default version simply calls
arch_setup_msi_irq() for each MSI irq required.

arch_setup_msi_irqs() is passed a pdev, attached to the pdev is a list
of msi_descs with irq == 0, it is up to the arch to connect these up to
an irq (via set_irq_msi()) or return an error. For convenience the number
of vectors and the type are passed also.

All msi_descs with irq != 0 are considered allocated, and the arch
teardown routine will be called on them when necessary.

The existing semantics of pci_enable_msix() are that if the requested
number of irqs can not be allocated, the maximum number that _could_ be
allocated is returned. To support that, we define that in case of an
error from arch_setup_msi_irqs(), the number of msi_descs with irq != 0
are considered allocated, and are counted toward the "max that could be
allocated".


Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Michael Ellerman
314e77b3ee MSI: Remove dev->first_msi_irq
Now that we keep a list of msi descriptors, we don't need first_msi_irq
in the pci dev.

If we somehow have zero MSIs configured list_entry() will give us weird
oopes or nice memory corruption bugs. So be paranoid. Add BUG_ONs and also
a check in pci_msi_check_device() to make sure nvec > 0.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Michael Ellerman
4aa9bc955d MSI: Use a list instead of the custom link structure
The msi descriptors are linked together with what looks a lot like
a linked list, but isn't a struct list_head list. Make it one.

The only complication is that previously we walked a list of irqs, and
got the descriptor for each with get_irq_msi(). Now we have a list of
descriptors and need to get the irq out of it, so it needs to be in the
actual struct msi_desc. We use 0 to indicate no irq is setup.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Michael Ellerman
65891215e6 PCI: Create alloc_pci_dev(), the one true way to create a struct pci_dev
There are currently several places in the kernel where we kmalloc()
a struct pci_dev and start initialising it. It'd be preferable to
have an allocator so we can ensure the pci_dev is correctly initialised
in one place.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Michael Ellerman
c9953a73e9 MSI: Add an arch_msi_check_device()
Add an arch_check_device(), which gives archs a chance to check the input
to pci_enable_msi/x. The arch might be interested in the value of nvec so
pass it in. Propagate the error value returned from the arch routine out
to the caller.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Sergei Shtylyov
0da0ead901 PCI: define pci_request/release_regions() for CONFIG_PCI=n
Balance declarations of pci_request_regions() and pci_release_regions() with
empty inline definitions for the CONFIG_PCI=n case -- otherwise my patch to
drivers/net/3c59x.c in the -mm tree doesn't compile. :-)

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:35 -07:00
Jean Delvare
6473d160b4 PCI: Cleanup the includes of <linux/pci.h>
I noticed that many source files include <linux/pci.h> while they do
not appear to need it. Here is an attempt to clean it all up.

In order to find all possibly affected files, I searched for all
files including <linux/pci.h> but without any other occurence of "pci"
or "PCI". I removed the include statement from all of these, then I
compiled an allmodconfig kernel on both i386 and x86_64 and fixed the
false positives manually.

My tests covered 66% of the affected files, so there could be false
positives remaining. Untested files are:

arch/alpha/kernel/err_common.c
arch/alpha/kernel/err_ev6.c
arch/alpha/kernel/err_ev7.c
arch/ia64/sn/kernel/huberror.c
arch/ia64/sn/kernel/xpnet.c
arch/m68knommu/kernel/dma.c
arch/mips/lib/iomap.c
arch/powerpc/platforms/pseries/ras.c
arch/ppc/8260_io/enet.c
arch/ppc/8260_io/fcc_enet.c
arch/ppc/8xx_io/enet.c
arch/ppc/syslib/ppc4xx_sgdma.c
arch/sh64/mach-cayman/iomap.c
arch/xtensa/kernel/xtensa_ksyms.c
arch/xtensa/platform-iss/setup.c
drivers/i2c/busses/i2c-at91.c
drivers/i2c/busses/i2c-mpc.c
drivers/media/video/saa711x.c
drivers/misc/hdpuftrs/hdpu_cpustate.c
drivers/misc/hdpuftrs/hdpu_nexus.c
drivers/net/au1000_eth.c
drivers/net/fec_8xx/fec_main.c
drivers/net/fec_8xx/fec_mii.c
drivers/net/fs_enet/fs_enet-main.c
drivers/net/fs_enet/mac-fcc.c
drivers/net/fs_enet/mac-fec.c
drivers/net/fs_enet/mac-scc.c
drivers/net/fs_enet/mii-bitbang.c
drivers/net/fs_enet/mii-fec.c
drivers/net/ibm_emac/ibm_emac_core.c
drivers/net/lasi_82596.c
drivers/parisc/hppb.c
drivers/sbus/sbus.c
drivers/video/g364fb.c
drivers/video/platinumfb.c
drivers/video/stifb.c
drivers/video/valkyriefb.c
include/asm-arm/arch-ixp4xx/dma.h
sound/oss/au1550_ac97.c

I would welcome test reports for these files. I am fine with removing
the untested files from the patch if the general opinion is that these
changes aren't safe. The tested part would still be nice to have.

Note that this patch depends on another header fixup patch I submitted
to LKML yesterday:
  [PATCH] scatterlist.h needs types.h
  http://lkml.org/lkml/2007/3/01/141

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:35 -07:00
Jean Delvare
a9dfd281a7 PCI: scatterlist.h needs types.h
Most architectures' scatterlist.h use the type dma_addr_t, but omit to
include <asm/types.h> which defines it.  This could lead to build failures,
so let's add the missing includes.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:34 -07:00
Brian King
f7bdd12d23 pci: New PCI-E reset API
Adds a new API which can be used to issue various types
of PCI-E reset, including PCI-E warm reset and PCI-E hot reset.
This is needed for an ipr PCI-E adapter which does not properly
implement BIST. Running BIST on this adapter results in PCI-E
errors. The only reliable reset mechanism that exists on this
hardware is PCI Fundamental reset (warm reset). Since driving
this type of reset is architecture unique, this provides the
necessary hooks for architectures to add this support.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:34 -07:00
Greg Kroah-Hartman
823bccfc40 remove "struct subsystem" as it is no longer needed
We need to work on cleaning up the relationship between kobjects, ksets and
ktypes.  The removal of 'struct subsystem' is the first step of this,
especially as it is not really needed at all.

Thanks to Kay for fixing the bugs in this patch.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 18:57:59 -07:00
Sam Ravnborg
dc24f0e708 kbuild: remove dependency on input.h from file2alias
Almost all definitions used by file2alias was already
present in mod_devicetable.h.
Added the last definition and killed the input.h usage.

The errornous include was pointed out
by: Jan Engelhardt <jengelh@linux01.gwdg.de>

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Deepak Saxena <dsaxena@plexity.net>
2007-05-02 20:58:08 +02:00
Jan Kiszka
c41bf8fa5e [PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
There are two callers of __unlazy_fpu, unlazy_fpu and __switch_to, and
none of them appear to require additional preempt_disable/enable here.
Let's open-code save_init_fpu in __unlazy_fpu to save a few ops.

Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:21 +02:00
Jan Kiszka
02b64dab56 [PATCH] i386: white space fixes in i387.h
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:21 +02:00
Andi Kleen
c5bcb5635a [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
RDTSCP is already synchronous and doesn't need an explicit CPUID.
This is a little faster and more importantly avoids VMEXITs on Hypervisors.

Original patch from Joerg Roedel, but reworked by AK
Also includes miscompilation fix by Eric Biederman

Cc: "Joerg Roedel" <joerg.roedel@amd.com>

Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:21 +02:00
Andi Kleen
9bccb23dc5 [PATCH] i386: Add X86_FEATURE_RDTSCP
Following x86-64
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:20 +02:00
Andi Kleen
3aefbe0746 [PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
Syncs up with x86-64.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:20 +02:00
Andi Kleen
e859dc553c [PATCH] i386: Implement alternative_io for i386
Ported from x86-64.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:20 +02:00
Andi Kleen
3671df8572 [PATCH] i386: Evaluate constant cpu features at runtime
Redefine cpu_has() to evaluate cpu features already checked in early
boot at compile time.  This way the compiler might eliminate some dead code.
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:20 +02:00
Andi Kleen
c7f81c9453 [PATCH] i386: Verify important CPUID bits in real mode
Check some CPUID bits that are needed for compiler generated early in boot.
When the system is still in real mode before changing the VESA BIOS mode
it is possible to still display an visible error message on the screen.

Similar to x86-64.

Includes cleanups from Eric Biederman

Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:20 +02:00
Andi Kleen
05cb007dac [PATCH] x86-64: Use the 32bit wd_ops for 64bit too.
This mainly removes a lot of code, replacing it with calls into the new 32bit
perfctr-watchdog.c

Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:20 +02:00
Andi Kleen
09198e6850 [PATCH] i386: Clean up NMI watchdog code
- Introduce a wd_ops structure
- Convert the various nmi watchdogs over to it
- This allows to split the perfctr reservation from the watchdog
setup cleanly.
- Do perfctr reservation globally as it should have always been
- Remove dead code referenced only by unused EXPORT_SYMBOLs

Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:20 +02:00
Zachary Amsden
9e5e3162b2 [PATCH] i386: pte simplify ops
Add comment and condense code to make use of native_local_ptep_get_and_clear
function.  Also, it turns out the 2-level and 3-level paging definitions were
identical, so move the common definition into pgtable.h

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:19 +02:00
Zachary Amsden
142dd97591 [PATCH] i386: pte xchg optimization
In situations where page table updates need only be made locally, and there is
no cross-processor A/D bit races involved, we need not use the heavyweight
xchg instruction to atomically fetch and clear page table entries.  Instead,
we can just read and clear them directly.

This introduces a neat optimization for non-SMP kernels; drop the atomic xchg
operations from page table updates.

Thanks to Michel Lespinasse for noting this potential optimization.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:19 +02:00