Commit graph

4571 commits

Author SHA1 Message Date
Benjamin Herrenschmidt
9d1e24928e memblock: Separate memblock_alloc_nid() and memblock_alloc_try_nid()
The former is now strict, it will fail if it cannot honor the allocation
within the node, while the later implements the previous semantic which
falls back to allocating anywhere.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:24 +10:00
Benjamin Herrenschmidt
c196f76fd5 memblock: NUMA allocate can now use early_pfn_map
We now provide a default (weak) implementation of memblock_nid_range()
which uses the early_pfn_map[] if CONFIG_ARCH_POPULATES_NODE_MAP
is set. Sparc still needs to use its own method due to the way
the pages can be scattered between nodes.

This implementation is inefficient due to our main algorithm and
callback construct wanting to work on an ascending addresses bases
while early_pfn_map[] would rather work with nid's (it's unsorted
at that stage). But it should work and we can look into improving
it subsequently, possibly using arch compile options to chose a
different algorithm alltogether.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:23 +10:00
Benjamin Herrenschmidt
fef501d49d memblock: Add "start" argument to memblock_find_base()
To constraint the search of a region between two boundaries,
which will be used by the new NUMA aware allocator among others.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:22 +10:00
Benjamin Herrenschmidt
d2cd563ba8 memblock: Add arch function to control coalescing of memblock memory regions
Some archs such as ARM want to avoid coalescing accross things such
as the lowmem/highmem boundary or similar. This provides the option
to control it via an arch callback for which a weak default is provided
which always allows coalescing.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:21 +10:00
Benjamin Herrenschmidt
142b45a72e memblock: Add array resizing support
When one of the array gets full, we resize it. After much thinking and
a few iterations of that code, I went back to on-demand resizing using
the (new) internal memblock_find_base() function, which is pretty much what
Yinghai initially proposed, though there some differences in the details.

To work this relies on the default alloc limit being set sensibly by
the architecture.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:20 +10:00
Benjamin Herrenschmidt
6ed311b282 memblock: Move functions around into a more sensible order
Some shuffling is needed for doing array resize so we may as well
put some sense into the ordering of the functions in the whole memblock.c
file. No code change. Added some comments.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:19 +10:00
Benjamin Herrenschmidt
7f219c736f memblock: split memblock_find_base() out of __memblock_alloc_base()
This will be used by the array resize code and might prove useful
to some arch code as well at which point it can be made non-static.

Also add comment as to why aligning size is important

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v2. Fix loss of size alignment
v3. Fix result code
2010-08-05 12:56:18 +10:00
Benjamin Herrenschmidt
7590abe891 memblock: Move memblock_init() to the bottom of the file
It's a real PITA to have to search for it in the middle

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:17 +10:00
Benjamin Herrenschmidt
4d629f9a02 memblock: Define MEMBLOCK_ERROR internally instead of using ~(phys_addr_t)0
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:16 +10:00
Benjamin Herrenschmidt
3a9c2c81eb memblock: Make memblock_find_region() out of memblock_alloc_region()
This function will be used to locate a free area to put the new memblock
arrays when attempting to resize them. memblock_alloc_region() is gone,
the two callsites now call memblock_add_region().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2. Fix membase_alloc_nid_region() conversion
2010-08-05 12:56:14 +10:00
Benjamin Herrenschmidt
449e8df39d memblock: Add debug markers at the end of the array
Since we allocate one more than needed, why not do a bit of sanity checking
here to ensure we don't walk past the end of the array ?

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:13 +10:00
Benjamin Herrenschmidt
bf23c51f1f memblock: Move memblock arrays to static storage in memblock.c and make their size a variable
This is in preparation for having resizable arrays.

Note that we still allocate one more than needed, this is unchanged from
the previous implementation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:12 +10:00
Benjamin Herrenschmidt
4734b594c6 memblock: Remove memblock_type.size and add memblock.memory_size instead
Right now, both the "memory" and "reserved" memblock_type structures have
a "size" member. It represents the calculated memory size in the former
case and is unused in the latter.

This moves it out to the main memblock structure instead

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:11 +10:00
Benjamin Herrenschmidt
2898cc4cdf memblock: Change u64 to phys_addr_t
Let's not waste space and cycles on archs that don't support >32-bit
physical address space.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:09 +10:00
Benjamin Herrenschmidt
cd3db0c4ca memblock: Remove rmo_size, burry it in arch/powerpc where it belongs
The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact
server ppc64 though I hijack it on embedded ppc64 for similar purposes)
and represents the area of memory that can be accessed in real mode
(aka with MMU off), or on embedded, from the exception vectors (which
is bolted in the TLB) which pretty much boils down to the same thing.

We take that out of the generic MEMBLOCK data structure and move it into
arch/powerpc where it belongs, renaming it to "RMA" while at it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:08 +10:00
Benjamin Herrenschmidt
e63075a3c9 memblock: Introduce default allocation limit and use it to replace explicit ones
This introduce memblock.current_limit which is used to limit allocations
from memblock_alloc() or memblock_alloc_base(..., MEMBLOCK_ALLOC_ACCESSIBLE).

The old MEMBLOCK_ALLOC_ANYWHERE changes value from 0 to ~(u64)0 and can still
be used with memblock_alloc_base() to allocate really anywhere.

It is -no-longer- cropped to MEMBLOCK_REAL_LIMIT which disappears.

Note to archs: I'm leaving the default limit to MEMBLOCK_ALLOC_ANYWHERE. I
strongly recommend that you ensure that you set an appropriate limit
during boot in order to guarantee that an memblock_alloc() at any time
results in something that is accessible with a simple __va().

The reason is that a subsequent patch will introduce the ability for
the array to resize itself by reallocating itself. The MEMBLOCK core will
honor the current limit when performing those allocations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:07 +10:00
Benjamin Herrenschmidt
27f574c223 memblock: Expose MEMBLOCK_ALLOC_ANYWHERE
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:06 +10:00
Benjamin Herrenschmidt
c3f72b5706 memblock: Factor the lowest level alloc function
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:05 +10:00
Benjamin Herrenschmidt
35a1f0bd07 memblock: Remove nid_range argument, arch provides memblock_nid_range() instead
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:04 +10:00
Benjamin Herrenschmidt
b693fffb18 memblock: Remove memblock_find()
Nobody uses it anymore. It's semantics were ... weird

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:03 +10:00
Linus Torvalds
ffd386a9a8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: allow limited allocation before slab is online
  percpu: make @dyn_size always mean min dyn_size in first chunk init functions
2010-08-04 15:17:52 -07:00
Pekka Enberg
415cb47998 Merge branches 'slab/fixes', 'slob/fixes', 'slub/cleanups' and 'slub/fixes' into for-linus 2010-08-04 22:04:43 +03:00
Linus Torvalds
5e83f6fbdb Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (198 commits)
  KVM: VMX: Fix host GDT.LIMIT corruption
  KVM: MMU: using __xchg_spte more smarter
  KVM: MMU: cleanup spte set and accssed/dirty tracking
  KVM: MMU: don't atomicly set spte if it's not present
  KVM: MMU: fix page dirty tracking lost while sync page
  KVM: MMU: fix broken page accessed tracking with ept enabled
  KVM: MMU: add missing reserved bits check in speculative path
  KVM: MMU: fix mmu notifier invalidate handler for huge spte
  KVM: x86 emulator: fix xchg instruction emulation
  KVM: x86: Call mask notifiers from pic
  KVM: x86: never re-execute instruction with enabled tdp
  KVM: Document KVM_GET_SUPPORTED_CPUID2 ioctl
  KVM: x86: emulator: inc/dec can have lock prefix
  KVM: MMU: Eliminate redundant temporaries in FNAME(fetch)
  KVM: MMU: Validate all gptes during fetch, not just those used for new pages
  KVM: MMU: Simplify spte fetch() function
  KVM: MMU: Add gpte_valid() helper
  KVM: MMU: Add validate_direct_spte() helper
  KVM: MMU: Add drop_large_spte() helper
  KVM: MMU: Use __set_spte to link shadow pages
  ...
2010-08-04 10:43:01 -07:00
Benjamin Herrenschmidt
72d4b0b4e0 memblock: Implement memblock_is_memory and memblock_is_region_memory
To make it fast, we steal ARM's binary search for memblock_is_memory()
and we use that to also the replace existing implementation of
memblock_is_reserved().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-04 14:38:47 +10:00
Benjamin Herrenschmidt
e3239ff92a memblock: Rename memblock_region to memblock_type and memblock_property to memblock_region
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-04 14:21:49 +10:00
Benjamin Herrenschmidt
f1c2c19c49 memblock: Fix memblock_is_region_reserved() to return a boolean
All callers expect a boolean result which is true if the region
overlaps a reserved region. However, the implementation actually
returns -1 if there is no overlap, and a region index (0 based)
if there is.

Make it behave as callers (and common sense) expect.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-04 14:17:17 +10:00
Christoph Lameter
2bce648584 slub: Allow removal of slab caches during boot
Serialize kmem_cache_create and kmem_cache_destroy using the slub_lock. Only
possible after the use of the slub_lock during dynamic dma creation has been
removed.

Then make sure that the setup of the slab sysfs entries does not race
with kmem_cache_create and kmem_cache destroy.

If a slab cache is removed before we have setup sysfs then simply skip over
the sysfs handling.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-08-03 07:28:32 +03:00
Pekka Enberg
e438444de8 Revert "slub: Allow removal of slab caches during boot"
This reverts commit f5b801ac38.
2010-08-03 07:28:21 +03:00
Ingo Molnar
3772b73472 Merge commit 'v2.6.35' into perf/core
Conflicts:
	tools/perf/Makefile
	tools/perf/util/hist.c

Merge reason: Resolve the conflicts and update to latest upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-02 08:31:54 +02:00
Huang Ying
bbeb34062f KVM: Fix a race condition for usage of is_hwpoison_address()
is_hwpoison_address accesses the page table, so the caller must hold
current->mm->mmap_sem in read mode. So fix its usage in hva_to_pfn of
kvm accordingly.

Comment is_hwpoison_address to remind other users.

Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:47:11 +03:00
Huang Ying
bf998156d2 KVM: Avoid killing userspace through guest SRAO MCE on unmapped pages
In common cases, guest SRAO MCE will cause corresponding poisoned page
be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay
the MCE to guest OS.

But it is reported that if the poisoned page is accessed in guest
after unmapping and before MCE is relayed to guest OS, userspace will
be killed.

The reason is as follows. Because poisoned page has been un-mapped,
guest access will cause guest exit and kvm_mmu_page_fault will be
called. kvm_mmu_page_fault can not get the poisoned page for fault
address, so kernel and user space MMIO processing is tried in turn. In
user MMIO processing, poisoned page is accessed again, then userspace
is killed by force_sig_info.

To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM
and do not try kernel and user space MMIO processing for poisoned
page.

[xiao: fix warning introduced by avi]

Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01 10:35:26 +03:00
Hugh Dickins
de51257aa3 mm: fix ia64 crash when gcore reads gate area
Debian's ia64 autobuilders have been seeing kernel freeze or reboot
when running the gdb testsuite (Debian bug 588574): dannf bisected to
2.6.32 62eede62da "mm: ZERO_PAGE without
PTE_SPECIAL"; and reproduced it with gdb's gcore on a simple target.

I'd missed updating the gate_vma handling in __get_user_pages(): that
happens to use vm_normal_page() (nowadays failing on the zero page),
yet reported success even when it failed to get a page - boom when
access_process_vm() tried to copy that to its intermediate buffer.

Fix this, resisting cleanups: in particular, leave it for now reporting
success when not asked to get any pages - very probably safe to change,
but let's not risk it without testing exposure.

Why did ia64 crash with 16kB pages, but succeed with 64kB pages?
Because setup_gate() pads each 64kB of its gate area with zero pages.

Reported-by: Andreas Barth <aba@not.so.argh.org>
Bisected-by: dann frazier <dannf@debian.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: dann frazier <dannf@dannf.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-30 18:56:09 -07:00
Christoph Lameter
bc6488e910 slub numa: Fix rare allocation from unexpected node
The network developers have seen sporadic allocations resulting in objects
coming from unexpected NUMA nodes despite asking for objects from a
specific node.

This is due to get_partial() calling get_any_partial() if partial
slabs are exhausted for a node even if a node was specified and therefore
one would expect allocations only from the specified node.

get_any_partial() sporadically may return a slab from a foreign
node to gradually reduce the size of partial lists on remote nodes
and thereby reduce total memory use for a slab cache.

The behavior is controlled by the remote_defrag_ratio of each cache.

Strictly speaking this is permitted behavior since __GFP_THISNODE was
not specified for the allocation but it is certain surprising.

This patch makes sure that the remote defrag behavior only occurs
if no node was specified.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-29 12:59:00 +03:00
Jeremy Fitzhardinge
a0d40c8025 vmap: add flag to allow lazy unmap to be disabled at runtime
Add a flag to force lazy_max_pages() to zero to prevent any outstanding
mapped pages.  We'll need this for Xen.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nick Piggin <npiggin@suse.de>
2010-07-27 11:49:09 -04:00
Ingo Molnar
9dcdbf7a33 Merge branch 'linus' into perf/core
Merge reason: Pick up the latest perf fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-21 21:43:06 +02:00
Yinghai Lu
b8ab9f8202 x86,nobootmem: make alloc_bootmem_node fall back to other node when 32bit numa is used
Borislav Petkov reported his 32bit numa system has problem:

[    0.000000] Reserving total of 4c00 pages for numa KVA remap
[    0.000000] kva_start_pfn ~ 32800 max_low_pfn ~ 375fe
[    0.000000] max_pfn = 238000
[    0.000000] 8202MB HIGHMEM available.
[    0.000000] 885MB LOWMEM available.
[    0.000000]   mapped low ram: 0 - 375fe000
[    0.000000]   low ram: 0 - 375fe000
[    0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 1000 1000 => 34e7000
[    0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 200 40 => 34c9d80
[    0.000000] alloc (nid=0 100000 - 7ee00000) (1000000 - ffffffffffffffff) 180 40 => 34e6140
[    0.000000] alloc (nid=1 80000000 - c7e60000) (1000000 - ffffffffffffffff) 240 40 => 80000000
[    0.000000] BUG: unable to handle kernel paging request at 40000000
[    0.000000] IP: [<c2c8cff1>] __alloc_memory_core_early+0x147/0x1d6
[    0.000000] *pdpt = 0000000000000000 *pde = f000ff53f000ff00
...
[    0.000000] Call Trace:
[    0.000000]  [<c2c8b4f8>] ? __alloc_bootmem_node+0x216/0x22f
[    0.000000]  [<c2c90c9b>] ? sparse_early_usemaps_alloc_node+0x5a/0x10b
[    0.000000]  [<c2c9149e>] ? sparse_init+0x1dc/0x499
[    0.000000]  [<c2c79118>] ? paging_init+0x168/0x1df
[    0.000000]  [<c2c780ff>] ? native_pagetable_setup_start+0xef/0x1bb

looks like it allocates too much high address for bootmem.

Try to cut limit with get_max_mapped()

Reported-by: Borislav Petkov <borislav.petkov@amd.com>
Tested-by: Conny Seidel <conny.seidel@amd.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@kernel.org>		[2.6.34.x]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-20 16:25:40 -07:00
Nick Piggin
a6aa62a090 mm/vmscan.c: fix mapping use after free
We need lock_page_nosync() here because we have no reference to the
mapping when taking the page lock.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-20 16:25:40 -07:00
Arjan van de Ven
78b435368f slab: use deferable timers for its periodic housekeeping
slab has a "once every 2 second" timer for its housekeeping.
As the number of logical processors is growing, its more and more
common that this 2 second timer becomes the primary wakeup source.

This patch turns this housekeeping timer into a deferable timer,
which means that the timer does not interrupt idle, but just runs
at the next event that wakes the cpu up.

The impact is that the timer likely runs a bit later, but during the
delay no code is running so there's not all that much reason for
a difference in housekeeping to occur because of this delay.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-20 10:03:23 +03:00
Catalin Marinas
a2b6bf63cb kmemleak: Add DocBook style comments to kmemleak.c
The description and parameters of the kmemleak API weren't obvious. This
patch adds comments clarifying the API usage.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-19 11:54:17 +01:00
Jason Baron
ab0155a22a kmemleak: Introduce a default off mode for kmemleak
Introduce a new DEBUG_KMEMLEAK_DEFAULT_OFF config parameter that allows
kmemleak to be disabled by default, but enabled on the command line
via: kmemleak=on. Although a reboot is required to turn it on, its still
useful to not require a re-compile.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-19 11:54:17 +01:00
Catalin Marinas
a7686a45c0 kmemleak: Show more information for objects found by alias
There may be situations when an object is freed using a pointer inside
the memory block. Kmemleak should show more information to help with
debugging.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-19 11:54:16 +01:00
Catalin Marinas
9078370c0d kmemleak: Add support for NO_BOOTMEM configurations
With commits 08677214 and 59be5a8e, alloc_bootmem()/free_bootmem() and
friends use the early_res functions for memory management when
NO_BOOTMEM is enabled. This patch adds the kmemleak calls in the
corresponding code paths for bootmem allocations.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: stable@kernel.org
2010-07-19 11:54:15 +01:00
Catalin Marinas
7952f98818 kmemleak: Annotate false positive in init_section_page_cgroup()
The pointer to the page_cgroup table allocated in
init_section_page_cgroup() is stored in section->page_cgroup as (base -
pfn). Since this value does not point to the beginning or inside the
allocated memory block, kmemleak reports a false positive.

This was reported in bugzilla.kernel.org as #16297.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Adrien Dessemond <adrien.dessemond@gmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
2010-07-19 11:54:14 +01:00
Dave Chinner
7f8275d0d6 mm: add context argument to shrinker callback
The current shrinker implementation requires the registered callback
to have global state to work from. This makes it difficult to shrink
caches that are not global (e.g. per-filesystem caches). Pass the shrinker
structure to the callback so that users can embed the shrinker structure
in the context the shrinker needs to operate on and get back to it in the
callback via container_of().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-07-19 14:56:17 +10:00
Linus Torvalds
46ac0cc92e Merge branch 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm
* 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm:
  kmemleak: Add support for NO_BOOTMEM configurations
  kmemleak: Annotate false positive in init_section_page_cgroup()
2010-07-19 13:18:34 -07:00
Christoph Lameter
af537b0a6c slub: Use kmem_cache flags to detect if slab is in debugging mode.
The cacheline with the flags is reachable from the hot paths after the
percpu allocator changes went in. So there is no need anymore to put a
flag into each slab page. Get rid of the SlubDebug flag and use
the flags in kmem_cache instead.

Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-16 11:13:08 +03:00
Christoph Lameter
f5b801ac38 slub: Allow removal of slab caches during boot
If a slab cache is removed before we have setup sysfs then simply skip over
the sysfs handling.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Roland Dreier <rdreier@cisco.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-16 11:13:07 +03:00
Christoph Lameter
d7278bd7d1 slub: Check kasprintf results in kmem_cache_init()
Small allocations may fail during slab bringup which is fatal. Add a BUG_ON()
so that we fail immediately rather than failing later during sysfs
processing.

Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-16 11:13:07 +03:00
Christoph Lameter
f90ec39014 SLUB: Constants need UL
UL suffix is missing in some constants. Conform to how slab.h uses constants.

Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-16 11:13:07 +03:00
Christoph Lameter
2154a33638 slub: Use a constant for a unspecified node.
kmalloc_node() and friends can be passed a constant -1 to indicate
that no choice was made for the node from which the object needs to
come.

Use NUMA_NO_NODE instead of -1.

CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-16 11:13:06 +03:00