zbud: add to mm/ zbud is an special purpose allocator for storing compressed pages. It is designed to store up to two compressed pages per physical page. While this design limits storage density, it has simple and deterministic reclaim properties that make it preferable to a higher density approach when reclaim will be used. zbud works by storing compressed pages, or "zpages", together in pairs in a single memory page called a "zbud page". The first buddy is "left justifed" at the beginning of the zbud page, and the last buddy is "right justified" at the end of the zbud page. The benefit is that if either buddy is freed, the freed buddy space, coalesced with whatever slack space that existed between the buddies, results in the largest possible free region within the zbud page. zbud also provides an attractive lower bound on density. The ratio of zpages to zbud pages can not be less than 1. This ensures that zbud can never "do harm" by using more pages to store zpages than the uncompressed zpages would have used on their own. This implementation is a rewrite of the zbud allocator internally used by zcache in the driver/staging tree. The rewrite was necessary to remove some of the zcache specific elements that were ingrained throughout and provide a generic allocation interface that can later be used by zsmalloc and others. This patch adds zbud to mm/ for later use by zswap. 
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Jenifer Hopper <jhopper@us.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Hansen <dave@sr71.net> Cc: Joe Perches <joe@perches.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Hugh Dickens <hughd@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Bob Liu <bob.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2013-07-10 16:04:55 -07:00			`/*`
			`* zbud.c`
			`*`
			`* Copyright (C) 2013, Seth Jennings, IBM`
			`*`
			`* Concepts based on zcache internal zbud allocator by Dan Magenheimer.`
			`*`
			`* zbud is a special-purpose allocator for storing compressed pages. Contrary`
			`* to what its name may suggest, zbud is not a buddy allocator, but rather an`
			`* allocator that "buddies" two compressed pages together in a single memory`
			`* page.`
			`*`
			`* While this design limits storage density, it has simple and deterministic`
			`* reclaim properties that make it preferable to a higher density approach when`
			`* reclaim will be used.`
			`*`
			`* zbud works by storing compressed pages, or "zpages", together in pairs in a`
			`* single memory page called a "zbud page". The first buddy is "left`
mm/zbud: fix some trivial typos in comments Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2013-09-11 14:21:42 -07:00			`* justified" at the beginning of the zbud page, and the last buddy is "right`
			`* justified" at the end of the zbud page. The benefit is that if either`
			`* buddy is freed, the freed buddy space, coalesced with whatever slack space`
			`* that existed between the buddies, results in the largest possible free region`
			`* within the zbud page.`
			`*`
			`* zbud also provides an attractive lower bound on density. The ratio of zpages`
			`* to zbud pages cannot be less than 1. This ensures that zbud can never "do`
			`* harm" by using more pages to store zpages than the uncompressed zpages would`
			`* have used on their own.`
			`*`
			`* zbud pages are divided into "chunks". The size of the chunks is fixed at`
			`* compile time and determined by NCHUNKS_ORDER below. Dividing zbud pages`
			`* into chunks allows organizing unbuddied zbud pages into a manageable number`
			`* of unbuddied lists according to the number of free chunks available in the`
			`* zbud page.`
			`*`
			`* The zbud API differs from that of conventional allocators in that the`
			`* allocation function, zbud_alloc(), returns an opaque handle to the user,`
			`* not a dereferenceable pointer. The user must map the handle using`
			`* zbud_map() in order to get a usable pointer by which to access the`
			`* allocation data and unmap the handle with zbud_unmap() when operations`
			`* on the allocation data are complete.`
			`*/`

			`#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt`

			`#include <linux/atomic.h>`
			`#include <linux/list.h>`
			`#include <linux/mm.h>`
			`#include <linux/module.h>`
			`#include <linux/preempt.h>`
			`#include <linux/slab.h>`
			`#include <linux/spinlock.h>`
			`#include <linux/zbud.h>`
mm/zpool: zbud/zsmalloc implement zpool Update zbud and zsmalloc to implement the zpool api. [fengguang.wu@intel.com: make functions static] Signed-off-by: Dan Streetman <ddstreet@ieee.org> Tested-by: Seth Jennings <sjennings@variantweb.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Weijie Yang <weijie.yang@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2014-08-06 16:08:38 -07:00			`#include <linux/zpool.h>`
			`/*****************`
			`* Structures`
			`*****************/`
			`/*`
			`* NCHUNKS_ORDER determines the internal allocation granularity, effectively`
			`* adjusting internal fragmentation. It also determines the number of`
			`* freelists maintained in each pool. NCHUNKS_ORDER of 6 means that the`
zbud: avoid accessing last unused freelist For now, there are NCHUNKS of 64 freelists in zbud_pool, the last unbuddied[63] freelist linked with all zbud pages which have free chunks of 63. Calculating according to context of num_free_chunks(), our max chunk number of unbuddied zbud page is 62, so none of zbud pages will be added/removed in last freelist, but still we will try to find an unbuddied zbud page in the last unused freelist, it is unneeded. This patch redefines NCHUNKS to 63 as free chunk number in one zbud page, hence we can decrease size of zpool and avoid accessing the last unused freelist whenever failing to allocate zbud from freelist in zbud_alloc. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Cc: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2014-10-09 15:30:04 -07:00			`* allocation granularity will be in chunks of size PAGE_SIZE/64. As one chunk`
			`* in each allocated page is occupied by the zbud header, NCHUNKS evaluates to`
			`* 63, which is the maximum number of free chunks in a zbud page; there are`
			`* therefore 63 freelists per pool.`
			`*/`
			`#define NCHUNKS_ORDER 6`

			`#define CHUNK_SHIFT (PAGE_SHIFT - NCHUNKS_ORDER)`
			`#define CHUNK_SIZE (1 << CHUNK_SHIFT)`
			`#define ZHDR_SIZE_ALIGNED CHUNK_SIZE`
			`#define NCHUNKS ((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)`
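The arithmetic behind these macros can be checked in user space. The following is a minimal sketch, assuming 4 KiB pages (`PAGE_SHIFT` of 12; the kernel takes the real value from the architecture). All `EX_`/`ex_` names are hypothetical and exist only for this illustration, and the round-up helper is one plausible way to convert a byte count into chunks.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel gets PAGE_SHIFT from the architecture. */
#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE (1UL << EX_PAGE_SHIFT)

/* Same derivation as the macros above, with an EX_ prefix. */
#define EX_NCHUNKS_ORDER 6
#define EX_CHUNK_SHIFT (EX_PAGE_SHIFT - EX_NCHUNKS_ORDER)
#define EX_CHUNK_SIZE (1UL << EX_CHUNK_SHIFT)
#define EX_ZHDR_SIZE_ALIGNED EX_CHUNK_SIZE
#define EX_NCHUNKS ((EX_PAGE_SIZE - EX_ZHDR_SIZE_ALIGNED) >> EX_CHUNK_SHIFT)

/* Hypothetical helper: round a byte count up to a whole number of chunks. */
static unsigned long ex_size_to_chunks(size_t size)
{
	return (size + EX_CHUNK_SIZE - 1) >> EX_CHUNK_SHIFT;
}
```

Under these assumptions a chunk is 64 bytes and 63 chunks remain once the header chunk is subtracted, so a 2000-byte compressed page would occupy 32 chunks.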
			`/**`
			`* struct zbud_pool - stores metadata for each zbud pool`
			`* @lock: protects all pool fields and the first_chunks/last_chunks fields of any`
			`* zbud page in the pool`
			`* @unbuddied: array of lists tracking zbud pages that only contain one buddy;`
			`* the list each zbud page is added to depends on the size of`
			`* its free region.`
			`* @buddied: list tracking the zbud pages that contain two buddies;`
			`* these zbud pages are full`
			`* @lru: list tracking the zbud pages in LRU order by most recently`
			`* added buddy.`
			`* @pages_nr: number of zbud pages in the pool.`
			`* @ops: pointer to a structure of user defined operations specified at`
			`* pool creation time.`
			`*`
			`* This structure is allocated at pool creation time and maintains metadata`
			`* pertaining to a particular zbud pool.`
			`*/`
			`struct zbud_pool {`
			`spinlock_t lock;`
			`struct list_head unbuddied[NCHUNKS];`
			`struct list_head buddied;`
			`struct list_head lru;`
			`u64 pages_nr;`
mm: zbud: constify the zbud_ops The structure zbud_ops is not modified so make the pointer to it a pointer to const. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2015-09-08 15:05:06 -07:00			`const struct zbud_ops *ops;`
zpool: remove zpool_evict() Remove zpool_evict() helper function. As zbud is currently the only zpool implementation that supports eviction, add zpool and zpool_ops references to struct zbud_pool and directly call zpool_ops->evict(zpool, handle) on eviction. Currently zpool provides the zpool_evict helper which locks the zpool list lock and searches through all pools to find the specific one matching the caller, and call the corresponding zpool_ops->evict function. However, this is unnecessary, as the zbud pool can simply keep a reference to the zpool that created it, as well as the zpool_ops, and directly call the zpool_ops->evict function, when it needs to evict a page. This avoids a spinlock and list search in zpool for each eviction. Signed-off-by: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2015-06-25 15:00:40 -07:00			`#ifdef CONFIG_ZPOOL`
			`struct zpool *zpool;`
mm: zpool: constify the zpool_ops The structure zpool_ops is not modified so make the pointer to it a pointer to const. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2015-09-08 15:05:03 -07:00			`const struct zpool_ops *zpool_ops;`
			`#endif`
			`};`

			`/*`
			`* struct zbud_header - zbud page metadata occupying the first chunk of each`
			`* zbud page.`
			`* @buddy: links the zbud page into the unbuddied/buddied lists in the pool`
			`* @lru: links the zbud page into the lru list in the pool`
			`* @first_chunks: the size of the first buddy in chunks, 0 if free`
			`* @last_chunks: the size of the last buddy in chunks, 0 if free`
			`* @under_reclaim: set while the zbud page is being reclaimed`
			`*/`
			`struct zbud_header {`
			`struct list_head buddy;`
			`struct list_head lru;`
			`unsigned int first_chunks;`
			`unsigned int last_chunks;`
			`bool under_reclaim;`
			`};`
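The two chunk counts in the header are enough to locate both buddies and the free region between them. The sketch below is a user-space model, assuming a 4 KiB page, 64-byte chunks, and a header that occupies exactly the first chunk. The `ex_` names are hypothetical, and the offset rules follow the "left justified" and "right justified" description in the file header rather than any function shown in this excerpt.

```c
#include <assert.h>

/* Assumptions: 4 KiB page, 64-byte chunks, header takes the first chunk. */
#define EX_PAGE_SIZE 4096u
#define EX_CHUNK_SHIFT 6
#define EX_CHUNK_SIZE (1u << EX_CHUNK_SHIFT)
#define EX_NCHUNKS ((EX_PAGE_SIZE - EX_CHUNK_SIZE) >> EX_CHUNK_SHIFT)

/* Only the two size fields of the header matter for the layout. */
struct ex_zbud_header {
	unsigned int first_chunks;	/* 0 if the first buddy is free */
	unsigned int last_chunks;	/* 0 if the last buddy is free */
};

/* The first buddy is left justified: it starts right after the header. */
static unsigned int ex_first_offset(void)
{
	return EX_CHUNK_SIZE;
}

/* The last buddy is right justified: it ends at the end of the page. */
static unsigned int ex_last_offset(const struct ex_zbud_header *h)
{
	return EX_PAGE_SIZE - (h->last_chunks << EX_CHUNK_SHIFT);
}

/* Everything between the two buddies is one contiguous free region. */
static unsigned int ex_free_chunks(const struct ex_zbud_header *h)
{
	return EX_NCHUNKS - h->first_chunks - h->last_chunks;
}
```

With a 10-chunk first buddy and a 20-chunk last buddy, the last buddy starts at byte 2816 and 33 chunks stay free; freeing the first buddy grows that single region to 43 chunks.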

			`/*****************`
			`* zpool`
			`****************/`

			`#ifdef CONFIG_ZPOOL`

			`static int zbud_zpool_evict(struct zbud_pool *pool, unsigned long handle)`
			`{`
			`if (pool->zpool && pool->zpool_ops && pool->zpool_ops->evict)`
			`return pool->zpool_ops->evict(pool->zpool, handle);`
			`else`
			`return -ENOENT;`
			`}`

			`static const struct zbud_ops zbud_zpool_ops = {`
			`.evict = zbud_zpool_evict`
			`};`

mm: zsmalloc: constify struct zs_pool name Constify `struct zs_pool' ->name. [akpm@inux-foundation.org: constify zpool_create_pool()'s `type' arg also] Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2015-11-06 16:29:21 -08:00			`static void *zbud_zpool_create(const char *name, gfp_t gfp,`
			`const struct zpool_ops *zpool_ops,`
			`struct zpool *zpool)`
			`{`
			`struct zbud_pool *pool;`

			`pool = zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL);`
			`if (pool) {`
			`pool->zpool = zpool;`
			`pool->zpool_ops = zpool_ops;`
			`}`
			`return pool;`
			`}`

			`static void zbud_zpool_destroy(void *pool)`
			`{`
			`zbud_destroy_pool(pool);`
			`}`

			`static int zbud_zpool_malloc(void *pool, size_t size, gfp_t gfp,`
			`unsigned long *handle)`
			`{`
			`return zbud_alloc(pool, size, gfp, handle);`
			`}`
			`static void zbud_zpool_free(void *pool, unsigned long handle)`
			`{`
			`zbud_free(pool, handle);`
			`}`

			`static int zbud_zpool_shrink(void *pool, unsigned int pages,`
			`unsigned int *reclaimed)`
			`{`
			`unsigned int total = 0;`
			`int ret = -EINVAL;`

			`while (total < pages) {`
			`ret = zbud_reclaim_page(pool, 8);`
			`if (ret < 0)`
			`break;`
			`total++;`
			`}`

			`if (reclaimed)`
			`*reclaimed = total;`

			`return ret;`
			`}`
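The return-value behaviour of the shrink loop above can be tried out in user space. The following is a minimal sketch, not the kernel code: `struct demo_pool`, `demo_reclaim_page()` and `demo_shrink()` are hypothetical stand-ins, and the stand-in reclaim function simply fails with `-EINVAL` once nothing is left to reclaim.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool: only counts reclaimable pages. */
struct demo_pool {
	unsigned int reclaimable;
};

/* Hypothetical stand-in for zbud_reclaim_page(): 0 on success, <0 on failure. */
static int demo_reclaim_page(struct demo_pool *pool)
{
	if (pool->reclaimable == 0)
		return -EINVAL;
	pool->reclaimable--;
	return 0;
}

/* Same loop shape as zbud_zpool_shrink(): stop at the first failure. */
static int demo_shrink(struct demo_pool *pool, unsigned int pages,
		       unsigned int *reclaimed)
{
	unsigned int total = 0;
	int ret = -EINVAL;

	assert(pool != NULL);

	while (total < pages) {
		ret = demo_reclaim_page(pool);
		if (ret < 0)
			break;
		total++;
	}

	/* The count is reported even when the loop ended on an error. */
	if (reclaimed)
		*reclaimed = total;

	return ret;
}
```

With this shape, a partially successful shrink returns the error from the last attempt while still reporting how many pages were reclaimed, and a request for zero pages returns the initial `-EINVAL`.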

			`static void *zbud_zpool_map(void *pool, unsigned long handle,`
			`enum zpool_mapmode mm)`
			`{`
			`return zbud_map(pool, handle);`
			`}`
			`static void zbud_zpool_unmap(void *pool, unsigned long handle)`
			`{`
			`zbud_unmap(pool, handle);`
			`}`

			`static u64 zbud_zpool_total_size(void *pool)`
			`{`
			`return zbud_get_pool_size(pool) * PAGE_SIZE;`
			`}`

			`static struct zpool_driver zbud_zpool_driver = {`
			`.type = "zbud",`
			`.owner = THIS_MODULE,`
			`.create = zbud_zpool_create,`
			`.destroy = zbud_zpool_destroy,`
			`.malloc = zbud_zpool_malloc,`
			`.free = zbud_zpool_free,`
			`.shrink = zbud_zpool_shrink,`
			`.map = zbud_zpool_map,`
			`.unmap = zbud_zpool_unmap,`
			`.total_size = zbud_zpool_total_size,`
			`};`

mm/zpool: use prefixed module loading To avoid potential format string expansion via module parameters, do not use the zpool type directly in request_module() without a format string. Additionally, to avoid arbitrary modules being loaded via zpool API (e.g. via the zswap_zpool_type module parameter) add a "zpool-" prefix to the requested module, as well as module aliases for the existing zpool types (zbud and zsmalloc). Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Acked-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2014-08-29 15:18:40 -07:00			`MODULE_ALIAS("zpool-zbud");`
			`#endif /* CONFIG_ZPOOL */`

			`/*****************`
			`* Helpers`
			`*****************/`
			`/* Just to make the code easier to read */`
			`enum buddy {`
			`FIRST,`
			`LAST`
			`};`

			`/* Converts an allocation size in bytes to size in zbud chunks */`
mm/zbud: change zbud_alloc size type to size_t Change the type of the zbud_alloc() size param from unsigned int to size_t. Technically, this should not make any difference, as the zbud implementation already restricts the size to well within either type's limits; but as zsmalloc (and kmalloc) use size_t, and zpool will use size_t, this brings the size parameter type in line with zsmalloc/zpool. Signed-off-by: Dan Streetman <ddstreet@ieee.org> Acked-by: Seth Jennings <sjennings@variantweb.net> Tested-by: Seth Jennings <sjennings@variantweb.net> Cc: Weijie Yang <weijie.yang@samsung.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2014-08-06 16:08:33 -07:00			`static int size_to_chunks(size_t size)`
			`{`
			`return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT;`
			`}`
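The round-up in `size_to_chunks()` is easiest to see with concrete numbers. The sketch below assumes 4 KiB pages divided into 64-byte chunks (`CHUNK_SHIFT` of 6); the real constants are defined elsewhere in the file and may differ, and every `DEMO_`/`demo_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration: 4 KiB pages, 64 chunks per page. */
#define DEMO_PAGE_SHIFT    12
#define DEMO_NCHUNKS_ORDER 6
#define DEMO_PAGE_SIZE     (1UL << DEMO_PAGE_SHIFT)
#define DEMO_CHUNK_SHIFT   (DEMO_PAGE_SHIFT - DEMO_NCHUNKS_ORDER)
#define DEMO_CHUNK_SIZE    (1UL << DEMO_CHUNK_SHIFT)

/* Rounds a byte count up to a whole number of chunks. */
static int demo_size_to_chunks(size_t size)
{
	assert(size <= DEMO_PAGE_SIZE);
	return (int)((size + DEMO_CHUNK_SIZE - 1) >> DEMO_CHUNK_SHIFT);
}
```

Under these assumptions a 1-byte request and a 64-byte request both take one chunk, while 65 bytes spills into a second chunk.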

			`#define for_each_unbuddied_list(_iter, _begin) \`
			`for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)`
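As a rough illustration of how an iterator macro of this shape can drive a first-fit search, the sketch below walks an array of per-list entry counts starting at a requested chunk count. The array of counters replaces the kernel's list heads, `DEMO_NCHUNKS` is assumed to be 63, and all names are hypothetical.

```c
#include <assert.h>

#define DEMO_NCHUNKS 63 /* assumed number of unbuddied lists */

#define demo_for_each_unbuddied_list(_iter, _begin) \
	for ((_iter) = (_begin); (_iter) < DEMO_NCHUNKS; (_iter)++)

/*
 * Returns the index of the first non-empty list at or above `chunks`,
 * or -1 if every such list is empty. counts[i] is the number of entries
 * on list i, where list i holds pages with i free chunks.
 */
static int demo_find_list(const unsigned int counts[DEMO_NCHUNKS], int chunks)
{
	int i;

	assert(chunks >= 0);

	demo_for_each_unbuddied_list(i, chunks) {
		if (counts[i] > 0)
			return i;
	}
	return -1;
}
```

Starting the iteration at the requested size skips every list whose pages are too small to hold the allocation.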

			`/* Initializes the zbud header of a newly allocated zbud page */`
			`static struct zbud_header *init_zbud_page(struct page *page)`
			`{`
			`struct zbud_header *zhdr = page_address(page);`
			`zhdr->first_chunks = 0;`
			`zhdr->last_chunks = 0;`
			`INIT_LIST_HEAD(&zhdr->buddy);`
			`INIT_LIST_HEAD(&zhdr->lru);`
			`zhdr->under_reclaim = 0;`
			`return zhdr;`
			`}`

			`/* Resets the struct page fields and frees the page */`
			`static void free_zbud_page(struct zbud_header *zhdr)`
			`{`
			`__free_page(virt_to_page(zhdr));`
			`}`

			`/*`
			`* Encodes the handle of a particular buddy within a zbud page`
			`* Pool lock should be held as this function accesses first|last_chunks`
			`*/`
			`static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)`
			`{`
			`unsigned long handle;`

			`/*`
			`* For now, the encoded handle is actually just the pointer to the data`
			`* but this might not always be the case. A little information hiding.`
			`* Add ZHDR_SIZE_ALIGNED to the handle if it is the first allocation to jump`
			`* over the zbud header in the first chunk.`
			`*/`
			`handle = (unsigned long)zhdr;`
			`if (bud == FIRST)`
			`/* skip over zbud header */`
			`handle += ZHDR_SIZE_ALIGNED;`
			`else /* bud == LAST */`
			`handle += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);`
			`return handle;`
			`}`

			`/* Returns the zbud page where a given handle is stored */`
			`static struct zbud_header *handle_to_zbud_header(unsigned long handle)`
			`{`
			`return (struct zbud_header *)(handle & PAGE_MASK);`
			`}`
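The encode and decode steps above round-trip because both buddies live inside one page-aligned page. A user-space sketch of that idea follows, assuming 4 KiB pages, 64-byte chunks, and a header that fits in one chunk; `aligned_alloc()` stands in for the page allocator and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed values for illustration: 4 KiB pages, 64-byte chunks. */
#define DEMO_PAGE_SIZE         ((uintptr_t)4096)
#define DEMO_PAGE_MASK         (~(DEMO_PAGE_SIZE - 1))
#define DEMO_CHUNK_SHIFT       6
#define DEMO_ZHDR_SIZE_ALIGNED ((uintptr_t)1 << DEMO_CHUNK_SHIFT)

enum demo_buddy { DEMO_FIRST, DEMO_LAST };

struct demo_header {
	unsigned int first_chunks;
	unsigned int last_chunks;
};

/* The handle is the address of the buddy's data within the page. */
static uintptr_t demo_encode_handle(struct demo_header *hdr, enum demo_buddy bud)
{
	uintptr_t handle = (uintptr_t)hdr;

	if (bud == DEMO_FIRST) {
		/* skip over the header at the start of the page */
		handle += DEMO_ZHDR_SIZE_ALIGNED;
	} else {
		/* an empty last buddy would point past the end of the page */
		assert(hdr->last_chunks > 0);
		handle += DEMO_PAGE_SIZE -
			  ((uintptr_t)hdr->last_chunks << DEMO_CHUNK_SHIFT);
	}
	return handle;
}

/* Masking off the in-page offset recovers the header address. */
static struct demo_header *demo_handle_to_header(uintptr_t handle)
{
	return (struct demo_header *)(handle & DEMO_PAGE_MASK);
}

/* Allocates one page-aligned, page-sized buffer to stand in for a zbud page. */
static struct demo_header *demo_new_page(void)
{
	struct demo_header *hdr = aligned_alloc(DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);

	if (hdr) {
		hdr->first_chunks = 0;
		hdr->last_chunks = 0;
	}
	return hdr;
}
```

The mask only works because the page is aligned to its own size, so any address inside the page shares the same upper bits as the header.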

			`/* Returns the number of free chunks in a zbud page */`
			`static int num_free_chunks(struct zbud_header *zhdr)`
			`{`
			`/*`
			`* Rather than branch for different situations, just use the fact that`
zbud: avoid accessing last unused freelist For now, there are NCHUNKS of 64 freelists in zbud_pool, the last unbuddied[63] freelist linked with all zbud pages which have free chunks of 63. Calculating according to context of num_free_chunks(), our max chunk number of unbuddied zbud page is 62, so none of zbud pages will be added/removed in last freelist, but still we will try to find an unbuddied zbud page in the last unused freelist, it is unneeded. This patch redefines NCHUNKS to 63 as free chunk number in one zbud page, hence we can decrease size of zpool and avoid accessing the last unused freelist whenever failing to allocate zbud from freelist in zbud_alloc. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Cc: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2014-10-09 15:30:04 -07:00			`* free buddies have a length of zero to simplify everything.`
			`*/`
			`return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;`
			`}`
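A short worked example of the free-chunk arithmetic, assuming `NCHUNKS` is 63 as described in the "avoid accessing last unused freelist" commit message; the struct and function names here are hypothetical.

```c
#include <assert.h>

#define DEMO_NCHUNKS 63 /* assumed: 64 chunks per page minus one header chunk */

struct demo_header {
	unsigned int first_chunks;
	unsigned int last_chunks;
};

/* A free buddy has a length of zero, so no branching is needed. */
static int demo_num_free_chunks(const struct demo_header *hdr)
{
	assert(hdr->first_chunks + hdr->last_chunks <= DEMO_NCHUNKS);
	return (int)(DEMO_NCHUNKS - hdr->first_chunks - hdr->last_chunks);
}
```

An empty page reports all 63 chunks free, a page with only a 10-chunk first buddy reports 53, and adding a 20-chunk last buddy leaves 33.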

			`/*****************`
			`* API Functions`
			`*****************/`
			`/**`
			`* zbud_create_pool() - create a new zbud pool`
			`* @gfp: gfp flags when allocating the zbud pool structure`
			`* @ops: user-defined operations for the zbud pool`
			`*`
			`* Return: pointer to the new zbud pool or NULL if the metadata allocation`
			`* failed.`
			`*/`
mm: zbud: constify the zbud_ops The structure zbud_ops is not modified so make the pointer to it a pointer to const. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2015-09-08 15:05:06 -07:00			`struct zbud_pool *zbud_create_pool(gfp_t gfp, const struct zbud_ops *ops)`
			`{`
			`struct zbud_pool *pool;`
			`int i;`

			`pool = kzalloc(sizeof(struct zbud_pool), gfp);`
			`if (!pool)`
			`return NULL;`
			`spin_lock_init(&pool->lock);`
			`for_each_unbuddied_list(i, 0)`
			`INIT_LIST_HEAD(&pool->unbuddied[i]);`
			`INIT_LIST_HEAD(&pool->buddied);`
			`INIT_LIST_HEAD(&pool->lru);`
			`pool->pages_nr = 0;`
			`pool->ops = ops;`
			`return pool;`
			`}`

			`/**`
			`* zbud_destroy_pool() - destroys an existing zbud pool`
			`* @pool: the zbud pool to be destroyed`
			`*`
			`* The pool should be emptied before this function is called.`
			`*/`
			`void zbud_destroy_pool(struct zbud_pool *pool)`
			`{`
			`kfree(pool);`
			`}`

			`/**`
			`* zbud_alloc() - allocates a region of a given size`
			`* @pool: zbud pool from which to allocate`
			`* @size: size in bytes of the desired allocation`
			`* @gfp: gfp flags used if the pool needs to grow`
			`* @handle: handle of the new allocation`
			`*`
			`* This function will attempt to find a free region in the pool large enough to`
			`* satisfy the allocation request. A search of the unbuddied lists is`
			`* performed first. If no suitable free region is found, then a new page is`
			`* allocated and added to the pool to satisfy the request.`
			`*`
			`* gfp should not set __GFP_HIGHMEM as highmem pages cannot be used`
			`* as zbud pool pages.`
			`*`
mm/zbud: fix some trivial typos in comments Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2013-09-11 14:21:42 -07:00			`* Return: 0 if success and handle is set, otherwise -EINVAL if the size or`
			`* gfp arguments are invalid or -ENOMEM if the pool was unable to allocate`
			`* a new page.`
			`*/`
			`int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,`
			`unsigned long *handle)`
			`{`
			`int chunks, i, freechunks;`
			`struct zbud_header *zhdr = NULL;`
			`enum buddy bud;`
			`struct page *page;`
mm: zbud: fix the locking scenarios with zcache With zcache using zbud, strange locking scenarios are observed. The first problem seen is: Core 2 waiting on mapping->tree_lock which is taken by core 6 do_raw_spin_lock raw_spin_lock_irq atomic_cmpxchg page_freeze_refs __remove_mapping shrink_page_list Core 6 after taking mapping->tree_lock is waiting on zbud pool lock which is held by core 5 zbud_alloc zcache_store_page __cleancache_put_page cleancache_put_page __delete_from_page_cache spin_unlock_irq __remove_mapping shrink_page_list shrink_inactive_list Core 5 after taking zbud pool lock from zbud_free received an IRQ, and after IRQ exit, softirqs were scheduled and end_page_writeback tried to lock on mapping->tree_lock which is already held by Core 6. Deadlock. do_raw_spin_lock raw_spin_lock_irqsave test_clear_page_writeba end_page_writeback ext4_finish_bio ext4_end_bio bio_endio blk_update_request end_clone_bio bio_endio blk_update_request blk_update_bidi_request blk_end_bidi_request blk_end_request mmc_blk_cmdq_complete_r mmc_cmdq_softirq_done blk_done_softirq static_key_count static_key_false trace_softirq_exit __do_softirq() tick_irq_exit irq_exit() set_irq_regs __handle_domain_irq gic_handle_irq el1_irq exception __list_del_entry list_del zbud_free zcache_load_page __cleancache_get_page(? This shows that allowing softirqs while holding zbud pool lock can result in deadlocks. To fix this, 'commit 6a1fdaa36272 ("mm: zbud: prevent softirq during zbud alloc, free and reclaim")' decided to take spin_lock_bh during zbud_free, zbud_alloc and zbud_reclaim. But this resulted in another deadlock. 
spin_bug() do_raw_spin_lock() _raw_spin_lock_irqsave() test_clear_page_writeback() end_page_writeback() ext4_finish_bio() ext4_end_bio() bio_endio() blk_update_request() end_clone_bio() bio_endio() blk_update_request() blk_update_bidi_request() blk_end_request() mmc_blk_cmdq_complete_rq() mmc_cmdq_softirq_done() blk_done_softirq() __do_softirq() do_softirq() __local_bh_enable_ip() _raw_spin_unlock_bh() zbud_alloc() zcache_store_page() __cleancache_put_page() __delete_from_page_cache() __remove_mapping() shrink_page_list() Here, the spin_unlock_bh resulted in an explicit invocation of do_softirq, which resulted in the acquisition of mapping->tree_lock, which was already taken by __remove_mapping. The new fix considers the following facts. 1) zcache_store_page is always called from __delete_from_page_cache with mapping->tree_lock held and interrupts disabled. Thus zbud_alloc, which is called only from zcache_store_page, is always called with interrupts disabled. 2) zbud_free and zbud_reclaim_page can be called from zcache with or without interrupts disabled. So an interrupt while holding zbud pool lock can result in do_softirq and acquisition of mapping->tree_lock. (1) implies zbud_alloc need not explicitly disable bh. But disable interrupts to make sure zbud_alloc is safe with zcache, irrespective of future changes. This will fix the second scenario. (2) implies zbud_free and zbud_reclaim_page should use spin_lock_irqsave, so that interrupts, and in turn softirqs, will not be triggered. spin_lock_bh can't be used because a spin_unlock_bh can trigger a softirq even in interrupt context. This will fix the first scenario. Change-Id: Ibc810525dddf97614db41643642fec7472bd6a2c Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> 2016-06-28 09:48:42 +05:30			`unsigned long flags;`
mm: zbud: initialize object to 0 on GFP_ZERO If zbud_alloc returns a free object from the pool, it must also initialize it to 0 when asked to do so. This is already taken care of when a fresh object is allocated. CRs-fixed: 979234 Change-Id: Id171edf131df321385fcdcd7660d06da97689e3e Signed-off-by: Shiraz Hashim <shashim@codeaurora.org> 2016-03-03 19:49:41 +05:30			`int found = 0;`
mm/zbud.c: make size unsigned like unique callsite zbud_alloc is only called by zswap_frontswap_store with unsigned int len. Change function parameter + update >= 0 check. Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2014-06-04 16:11:07 -07:00			`if (!size || (gfp & __GFP_HIGHMEM))`
			`return -EINVAL;`
mm: zbud: fix condition check on allocation size zbud_alloc() checks the allocation size limit incorrectly. It should deny allocation requests greater than (PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE), not (PAGE_SIZE - ZHDR_SIZE_ALIGNED), which leaves no space for a buddy. There is no point in spending an entire zbud page on a single zpage, since that brings no benefit. Signed-off-by: Heesub Shin <heesub.shin@samsung.com> Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Dongjun Shin <d.j.shin@samsung.com> Cc: Sunae Seo <sunae.seo@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2013-07-31 13:53:40 -07:00			`if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)`
			`return -ENOSPC;`
			`chunks = size_to_chunks(size);`
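The size checks and the chunk rounding above can be modelled in user space. The following is only a sketch under stated assumptions: the `SKETCH_*` constants (4096-byte page, 64-byte chunk, one-chunk header) and the `sketch_*` helpers are hypothetical stand-ins, since the real `PAGE_SIZE`, `CHUNK_SIZE`, `ZHDR_SIZE_ALIGNED` and `size_to_chunks()` are defined elsewhere in the file and are not shown here. The `__GFP_HIGHMEM` test is left out because it has no user-space equivalent.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed values for illustration only; not taken from this listing. */
#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_CHUNK_SHIFT 6u
#define SKETCH_CHUNK_SIZE  (1u << SKETCH_CHUNK_SHIFT)
#define SKETCH_ZHDR_SIZE   SKETCH_CHUNK_SIZE

/* Round a byte count up to a whole number of chunks (assumed behaviour). */
static unsigned int sketch_size_to_chunks(size_t size)
{
	return (unsigned int)((size + SKETCH_CHUNK_SIZE - 1) >> SKETCH_CHUNK_SHIFT);
}

/*
 * Mirror the two size checks from the listing: a zero size is invalid,
 * and a request must leave at least one chunk free for a buddy.
 * Returns 0 and stores the chunk count, or a negative errno value.
 */
static int sketch_check_size(size_t size, unsigned int *chunks)
{
	if (size == 0)
		return -EINVAL;
	if (size > SKETCH_PAGE_SIZE - SKETCH_ZHDR_SIZE - SKETCH_CHUNK_SIZE)
		return -ENOSPC;
	*chunks = sketch_size_to_chunks(size);
	return 0;
}
```

With these assumed constants the largest accepted request is 3968 bytes (62 chunks), which leaves exactly one chunk for a buddy.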
			`spin_lock_irqsave(&pool->lock, flags);`
			`/* First, try to find an unbuddied zbud page. */`
			`zhdr = NULL;`
			`for_each_unbuddied_list(i, chunks) {`
			`if (!list_empty(&pool->unbuddied[i])) {`
			`zhdr = list_first_entry(&pool->unbuddied[i],`
			`struct zbud_header, buddy);`
			`list_del(&zhdr->buddy);`
			`if (zhdr->first_chunks == 0)`
			`bud = FIRST;`
			`else`
			`bud = LAST;`
			`found = 1;`
			`goto found;`
			`}`
			`}`

			`/* Couldn't find unbuddied zbud page, create new one */`
			`spin_unlock_irqrestore(&pool->lock, flags);`
			`page = alloc_page(gfp);`
			`if (!page)`
			`return -ENOMEM;`
			`spin_lock_irqsave(&pool->lock, flags);`
			`pool->pages_nr++;`
			`zhdr = init_zbud_page(page);`
			`bud = FIRST;`

			`found:`
			`if (bud == FIRST)`
			`zhdr->first_chunks = chunks;`
			`else`
			`zhdr->last_chunks = chunks;`

			`if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) {`
			`/* Add to unbuddied list */`
			`freechunks = num_free_chunks(zhdr);`
			`list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);`
			`} else {`
			`/* Add to buddied list */`
			`list_add(&zhdr->buddy, &pool->buddied);`
			`}`

			`/* Add/move zbud page to beginning of LRU */`
			`if (!list_empty(&zhdr->lru))`
			`list_del(&zhdr->lru);`
			`list_add(&zhdr->lru, &pool->lru);`

			`*handle = encode_handle(zhdr, bud);`
			`if ((gfp & __GFP_ZERO) && found)`
			`memset((void *)*handle, 0, size);`
			`spin_unlock_irqrestore(&pool->lock, flags);`

			`return 0;`
			`}`
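
The tail of zbud_alloc() above files a zbud page either on the buddied list or on the unbuddied list indexed by its number of free chunks. A minimal userspace sketch of that bookkeeping follows. The constants (4 KiB page, 64-byte chunks, a header occupying one chunk) and every `zs_` name are assumptions made for illustration; they are not taken from this file's definitions, which fall outside this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: 4 KiB page, 64-byte chunks, header takes one chunk. */
#define ZS_PAGE_SIZE   4096u
#define ZS_CHUNK_SHIFT 6u
#define ZS_CHUNK_SIZE  (1u << ZS_CHUNK_SHIFT)
#define ZS_HDR_SIZE    ZS_CHUNK_SIZE
#define ZS_NCHUNKS     ((ZS_PAGE_SIZE - ZS_HDR_SIZE) >> ZS_CHUNK_SHIFT)

/* Hypothetical stand-in for the two chunk counters of a zbud header. */
struct zs_header {
	unsigned int first_chunks;
	unsigned int last_chunks;
};

/* Round a byte size up to a whole number of chunks. */
static unsigned int zs_size_to_chunks(size_t size)
{
	return (unsigned int)((size + ZS_CHUNK_SIZE - 1) >> ZS_CHUNK_SHIFT);
}

/* Chunks left over once both buddies are accounted for. */
static unsigned int zs_num_free_chunks(const struct zs_header *h)
{
	assert(h->first_chunks + h->last_chunks <= ZS_NCHUNKS);
	return ZS_NCHUNKS - h->first_chunks - h->last_chunks;
}

/*
 * Mirrors the list choice made after the found: label.
 * Returns -1 for the buddied list, otherwise the unbuddied list index.
 */
static int zs_list_index(const struct zs_header *h)
{
	if (h->first_chunks == 0 || h->last_chunks == 0)
		return (int)zs_num_free_chunks(h);
	return -1;
}
```

Under these assumptions a page holding a single 100-byte buddy occupies 2 chunks and lands on unbuddied list 61; once a second buddy is stored, the page moves to the buddied list.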

			`/**`
			`* zbud_free() - frees the allocation associated with the given handle`
			`* @pool: pool in which the allocation resided`
			`* @handle: handle associated with the allocation returned by zbud_alloc()`
			`*`
			`* In the case that the zbud page in which the allocation resides is under`
			`* reclaim, as indicated by zhdr->under_reclaim being set, this function`
			`* only sets first_chunks or last_chunks to 0. The page is actually freed`
			`* once both buddies are evicted (see zbud_reclaim_page() below).`
			`*/`
			`void zbud_free(struct zbud_pool *pool, unsigned long handle)`
			`{`
			`struct zbud_header *zhdr;`
			`int freechunks;`
			`unsigned long flags;`

			`spin_lock_irqsave(&pool->lock, flags);`
			`zhdr = handle_to_zbud_header(handle);`

			`/* If first buddy, handle minus the header size will be page aligned */`
			`if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)`
			`zhdr->last_chunks = 0;`
			`else`
			`zhdr->first_chunks = 0;`

			`if (zhdr->under_reclaim) {`
			`/* zbud page is under reclaim, reclaim will free */`
			`spin_unlock_irqrestore(&pool->lock, flags);`
			`return;`
			`}`

			`/* Remove from existing buddy list */`
			`list_del(&zhdr->buddy);`

			`if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {`
			`/* zbud page is empty, free */`
			`list_del(&zhdr->lru);`
			`free_zbud_page(zhdr);`
			`pool->pages_nr--;`
			`} else {`
			`/* Add to unbuddied list */`
			`freechunks = num_free_chunks(zhdr);`
			`list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);`
			`}`

			`spin_unlock_irqrestore(&pool->lock, flags);`
			`}`

			`#define list_tail_entry(ptr, type, member) \`
			`list_entry((ptr)->prev, type, member)`

			`/**`
			`* zbud_reclaim_page() - evicts allocations from a pool page and frees it`
			`* @pool: pool from which a page will attempt to be evicted`
			`* @retries: number of pages on the LRU list for which eviction will`
			`* be attempted before failing`
			`*`
			`* zbud reclaim is different from normal system reclaim in that the reclaim is`
			`* done from the bottom, up. This is because only the bottom layer, zbud, has`
			`* information on how the allocations are organized within each zbud page. This`
			`* has the potential to create interesting locking situations between zbud and`
			`* the user, however.`
			`*`
			`* To avoid these, this is how zbud_reclaim_page() should be called:`

			`* The user detects a page should be reclaimed and calls zbud_reclaim_page().`
			`* zbud_reclaim_page() will remove a zbud page from the pool LRU list and call`
			`* the user-defined eviction handler with the pool and handle as arguments.`
			`*`
			`* If the handle can not be evicted, the eviction handler should return`
			`* non-zero. zbud_reclaim_page() will add the zbud page back to the`
			`* appropriate list and try the next zbud page on the LRU up to`
			`* a user defined number of retries.`
			`*`
			`* If the handle is successfully evicted, the eviction handler should`
			`* return 0 _and_ should have called zbud_free() on the handle. zbud_free()`
			`* contains logic to delay freeing the page if the page is under reclaim,`
			`* as indicated by the under_reclaim flag set in the zbud header.`
			`*`
			`* If all buddies in the zbud page are successfully evicted, then the`
			`* zbud page can be freed.`
			`*`
			`* Returns: 0 if a page is successfully freed, -EINVAL if there are no pages`
			`* to evict, no eviction handler is registered, or @retries is 0, and`
			`* -EAGAIN if the retry limit was hit.`
			`*/`
			`int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)`
			`{`
			`int i, ret, freechunks;`
			`struct zbud_header *zhdr;`
			`unsigned long flags;`
			`unsigned long first_handle = 0, last_handle = 0;`

			`spin_lock_irqsave(&pool->lock, flags);`
			`if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) ||`
			`retries == 0) {`
			`spin_unlock_irqrestore(&pool->lock, flags);`
			`return -EINVAL;`
			`}`
			`for (i = 0; i < retries; i++) {`
			`zhdr = list_tail_entry(&pool->lru, struct zbud_header, lru);`
			`list_del(&zhdr->lru);`
			`list_del(&zhdr->buddy);`
			`/* Protect zbud page against free */`
			`zhdr->under_reclaim = true;`
			`/*`
			`* We need to encode the handles before unlocking, since we can`
			`* race with free that will set (first|last)_chunks to 0`
			`*/`
			`first_handle = 0;`
			`last_handle = 0;`
			`if (zhdr->first_chunks)`
			`first_handle = encode_handle(zhdr, FIRST);`
			`if (zhdr->last_chunks)`
			`last_handle = encode_handle(zhdr, LAST);`
			`spin_unlock_irqrestore(&pool->lock, flags);`
			`/* Issue the eviction callback(s) */`
			`if (first_handle) {`
			`ret = pool->ops->evict(pool, first_handle);`
			`if (ret)`
			`goto next;`
			`}`
			`if (last_handle) {`
			`ret = pool->ops->evict(pool, last_handle);`
			`if (ret)`
			`goto next;`
			`}`
			`next:`
			`spin_lock_irqsave(&pool->lock, flags);`
			`zhdr->under_reclaim = false;`
			`if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {`
			`/*`
			`* Both buddies are now free, free the zbud page and`
			`* return success.`
			`*/`
			`free_zbud_page(zhdr);`
			`pool->pages_nr--;`
			`spin_unlock_irqrestore(&pool->lock, flags);`
			`return 0;`
			`} else if (zhdr->first_chunks == 0 ||`
			`zhdr->last_chunks == 0) {`
			`/* add to unbuddied list */`
			`freechunks = num_free_chunks(zhdr);`
			`list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);`
			`} else {`
			`/* add to buddied list */`
			`list_add(&zhdr->buddy, &pool->buddied);`
			`}`

			`/* add to beginning of LRU */`
			`list_add(&zhdr->lru, &pool->lru);`
			`}`
			`spin_unlock_irqrestore(&pool->lock, flags);`
			`return -EAGAIN;`
			`}`

			`/**`
			`* zbud_map() - maps the allocation associated with the given handle`
			`* @pool: pool in which the allocation resides`
			`* @handle: handle associated with the allocation to be mapped`
			`*`
			`* While trivial for zbud, the mapping functions for other allocators`
			`* implementing this allocation API could have more complex information encoded`
			`* in the handle and could create temporary mappings to make the data`
			`* accessible to the user.`
			`*`
			`* Returns: a pointer to the mapped allocation`
			`*/`
			`void *zbud_map(struct zbud_pool *pool, unsigned long handle)`
			`{`
			`return (void *)(handle);`
			`}`

			`/**`
			`* zbud_unmap() - unmaps the allocation associated with the given handle`
			`* @pool: pool in which the allocation resides`
			`* @handle: handle associated with the allocation to be unmapped`
			`*/`
			`void zbud_unmap(struct zbud_pool *pool, unsigned long handle)`
			`{`
			`}`

			`/**`
			`* zbud_get_pool_size() - gets the zbud pool size in pages`
			`* @pool: pool whose size is being queried`
			`*`
			`* Returns: size in pages of the given pool. The pool lock need not be`
			`* taken to access pages_nr.`
			`*/`
			`u64 zbud_get_pool_size(struct zbud_pool *pool)`
			`{`
			`return pool->pages_nr;`
			`}`

			`static int __init init_zbud(void)`
			`{`
			`/* Make sure the zbud header will fit in one chunk */`
			`BUILD_BUG_ON(sizeof(struct zbud_header) > ZHDR_SIZE_ALIGNED);`
			`pr_info("loaded\n");`
			`#ifdef CONFIG_ZPOOL`
			`zpool_register_driver(&zbud_zpool_driver);`
			`#endif`

			`return 0;`
			`}`

			`static void __exit exit_zbud(void)`
			`{`
			`#ifdef CONFIG_ZPOOL`
			`zpool_unregister_driver(&zbud_zpool_driver);`
			`#endif`

			`pr_info("unloaded\n");`
			`}`

			`module_init(init_zbud);`
			`module_exit(exit_zbud);`

			`MODULE_LICENSE("GPL");`
zbud, zswap: change module author email Old email no longer viable. Signed-off-by: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz> 2014-11-12 21:08:46 -06:00			`MODULE_AUTHOR("Seth Jennings <sjennings@variantweb.net>");`
			`MODULE_DESCRIPTION("Buddy Allocator for Compressed Pages");`
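
The tail of the reclaim path earlier in this file decides what happens to a zbud page once eviction has been attempted: free it if both buddies are empty, put it on an unbuddied list if exactly one is empty, otherwise put it back on the buddied list. The following is a minimal userspace sketch of that decision only. `demo_header`, `demo_placement`, `demo_free_chunks` and the value of `DEMO_NCHUNKS` are hypothetical stand-ins for illustration and are not part of the kernel source; the real code also takes the pool lock and manipulates kernel lists.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed chunk count per page for this sketch; the kernel derives its own. */
#define DEMO_NCHUNKS 63

/* Hypothetical outcome labels for the three branches in the reclaim tail. */
enum demo_placement_kind {
	DEMO_FREE_PAGE,	/* both buddies empty: page can be freed */
	DEMO_UNBUDDIED,	/* exactly one buddy empty */
	DEMO_BUDDIED	/* both buddies in use */
};

/* Hypothetical reduced header: only the two fields the decision reads. */
struct demo_header {
	unsigned int first_chunks;
	unsigned int last_chunks;
};

/* Mirrors the if / else-if / else ladder on first_chunks and last_chunks. */
static enum demo_placement_kind demo_placement(const struct demo_header *h)
{
	assert(h != NULL);
	if (h->first_chunks == 0 && h->last_chunks == 0)
		return DEMO_FREE_PAGE;
	if (h->first_chunks == 0 || h->last_chunks == 0)
		return DEMO_UNBUDDIED;
	return DEMO_BUDDIED;
}

/* Assumed index into the unbuddied lists: chunks not used by either buddy. */
static unsigned int demo_free_chunks(const struct demo_header *h)
{
	assert(h != NULL);
	return DEMO_NCHUNKS - h->first_chunks - h->last_chunks;
}
```

In this sketch a header with `first_chunks = 10` and `last_chunks = 0` is classified as `DEMO_UNBUDDIED` and would be filed under index `DEMO_NCHUNKS - 10`, which corresponds to the `pool->unbuddied[freechunks]` insertion in the listing.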