evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
NeilBrown	0b59bb6422	md/raid10: avoid fullsync when not necessary. This is the raid10 equivalent of commit `4f0a5e012c` MD RAID1: Further conditionalize 'fullsync' If a device in a newly assembled array is not fully recovered we currently do a full resync but don't need to. Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:21 +11:00
NeilBrown	7eb418851f	md: allow a partially recovered device to be hot-added to an array. When adding a new device into an array it is normally important to clear any stale data from ->recovery_offset else the new device may not be recovered properly. However when re-adding a device which is known to be nearly in-sync, this is not needed and can be detrimental. The (bitmap-based) resync will still happen, and further recovery is only needed from where-ever it was already up to. So if save_raid_disk is set, signifying a re-add, don't clear ->recovery_offset. Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:21 +11:00
NeilBrown	f466722ca6	md: Change handling of save_raid_disk and metadata update during recovery. Since commit `d70ed2e4fa` MD: Allow restarting an interrupted incremental recovery. we don't write out the metadata to devices while they are recovering. This had a good reason, but has unfortunate consequences. This patch changes things to make them work better. At issue is what happens if the array is shut down while a recovery is happening, particularly a bitmap-guided recovery. Ideally the recovery should pick up where it left off. However the metadata cannot represent the state "A recovery is in process which is guided by the bitmap". Before the above mentioned commit, we wrote metadata to the device which said "this is being recovered and it is up to <here>". So after a restart, a full recovery (not bitmap-guided) would happen from where-ever it was up to. After the commit the metadata wasn't updated so it still said "This device is fully in sync with <this> event count". That leads to a bitmap-based recovery following the whole bitmap, which should be a lot less work than a full recovery from some starting point. So this was an improvement. However updating some metadata but not all leads to other problems. In particular, the metadata written to the fully-up-to-date device records that the array has all devices present (even though some are recovering). So on restart, mdadm wants to find all devices and expects them to have current event counts. Obviously it doesn't (some have old event counts) so (when assembling with --incremental) it waits indefinitely for the rest of the expected devices. It really is wrong to not update all the metadata together. Doing that is bound to cause confusion. Instead, we should make it possible to record the truth in the metadata. i.e. we need to be able to record that a device is being recovered based on the bitmap. We already have a Feature flag to say that recovery is happening. We now add another one to say that it is a bitmap-based recovery. With this we can remove the code that disables the write-out of metadata on some devices. So this patch: - moves the setting of 'saved_raid_disk' from add_new_disk to the validate_super methods. This makes sure it is always set properly, both when adding a new device to an array, and when assembling an array from a collection of devices. - Adds a metadata flag MD_FEATURE_RECOVERY_BITMAP which is only used if MD_FEATURE_RECOVERY_OFFSET is set, and records that a bitmap-based recovery is allowed. This is only present in v1.x metadata. v0.90 doesn't support devices which are in the middle of recovery at all. - Only skips writing metadata to Faulty devices. - Also allows rdev state to be set to "-insync" via sysfs. This can be used for external-metadata arrays. When the 'role' is set the device is assumed to be in-sync. If, after setting the role, we set the state to "-insync", the role is moved to saved_raid_disk which effectively says the device is partly in-sync with that slot and needs a bitmap recovery. Cc: Andrei Warkentin <andreiw@vmware.com> Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:21 +11:00
NeilBrown	8313b8e57f	md: fix problem when adding device to read-only array with bitmap. If an array is started degraded, and then the missing device is found it can be re-added and a minimal bitmap-based recovery will bring it fully up-to-date. If the array is read-only a recovery would not be allowed. But also if the array is read-only and the missing device was present very recently, then there could be no need for any recovery at all, so we simply include the device in the read-only array without any recovery. However... if the missing device was removed a little longer ago it could be missing some updates, but if a bitmap is present it will be conditionally accepted pending a bitmap-based update. We don't currently detect this case properly and will include that old device into the read-only array with no recovery even though it really needs a recovery. This patch keeps track of whether a bitmap-based-recovery is really needed or not in the new Bitmap_sync rdev flag. If that is set, then the device will not be added to a read-only array. Cc: Andrei Warkentin <andreiw@vmware.com> Fixes: `d70ed2e4fa` Cc: stable@vger.kernel.org (3.2+) Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:08 +11:00
NeilBrown	e8b8491585	md/raid10: fix bug when raid10 recovery fails to recover a block. commit `e875ecea26` md/raid10 record bad blocks as needed during recovery. added code to the "cannot recover this block" path to record a bad block rather than fail the whole recovery. Unfortunately this new case was placed after r10bio was freed rather than before, yet it still uses r10bio. This will crash with a null dereference. So move the freeing of r10bio down where it is safe. Cc: stable@vger.kernel.org (v3.1+) Fixes: `e875ecea26` Reported-by: Damian Nowak <spam@nowaker.net> URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181 Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:08 +11:00
NeilBrown	5af9bef72c	md/raid5: fix a recently broken BUG_ON(). commit `6d183de407` md/raid5: fix newly-broken locking in get_active_stripe. simplified a BUG_ON, but removed too much so now it sometimes fires when it shouldn't. When the STRIPE_EXPANDING flag is set, the stripe_head might be on a special list while multiple stripe_heads are collected, or it might not be on any list, even a 'free' list when the refcount is zero. As long as STRIPE_EXPANDING is set, it will be found and added back to a list eventually. So both of the BUG_ONs which test for the ->lru being empty or not need to avoid the case where STRIPE_EXPANDING is set. The patch which broke this was marked for -stable, so this patch needs to be applied to any branch that received `6d183de4` Fixes: `6d183de407` Cc: stable@vger.kernel.org (any release to which above was applied) Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:07 +11:00
NeilBrown	41a336e011	md/raid1: fix request counting bug in new 'barrier' code. The new iobarrier implementation in raid1 (which keeps normal writes and resync activity separate) counts every request that is not before the current resync point in either next_window_requests or current_window_requests. It flags that the request is counted by setting ->start_next_window. allow_barrier follows this model exactly and decrements one of the _window_requests if and only if ->start_next_window is set. However wait_barrier(), which increments _window_requests, uses a slightly different test for setting ->start_next_window (which is set from the return value of this function). So there is a possibility of the counts getting out of sync, and this leads to the resync hanging. So change wait_barrier() to return a non-zero value in exactly the same cases that it increments *_window_requests. Bug was introduced in 3.13-rc1. Reported-by: Bruno Wolff III <bruno@wolff.to> URL: https://bugzilla.kernel.org/show_bug.cgi?id=68061 Fixes: `79ef3a8aa1` Cc: majianpeng <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:07 +11:00
NeilBrown	b50c259e25	md/raid10: fix two bugs in handling of known-bad-blocks. If we discover a bad block when reading we split the request and potentially read some of it from a different device. The code path of this has two bugs in RAID10. 1/ we get a spin_lock with _irq, but unlock without _irq!! 2/ The calculation of 'sectors_handled' is wrong, as can be clearly seen by comparison with raid1.c This leads to at least 2 warnings and a probable crash if a RAID10 ever had known bad blocks. Cc: stable@vger.kernel.org (v3.1+) Fixes: `856e08e237` Reported-by: Damian Nowak <spam@nowaker.net> URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181 Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:07 +11:00
NeilBrown	1cc03eb932	md/raid5: Fix possible confusion when multiple write errors occur. commit `5d8c71f9e5` md: raid5 crash during degradation Fixed a crash in an overly simplistic way which could leave R5_WriteError or R5_MadeGood set in the stripe cache for devices for which it is no longer relevant. When those devices are removed and spares added the flags are still set and can cause incorrect behaviour. commit `14a75d3e07` md/raid5: preferentially read from replacement device if possible. Fixed the same bug in a more effective way, so we can now revert the original commit. Reported-and-tested-by: Alexander Lyakas <alex.bolshoy@gmail.com> Cc: stable@vger.kernel.org (3.2+ - 3.2 will need a different fix though) Fixes: `5d8c71f9e5` Signed-off-by: NeilBrown <neilb@suse.de>	2014-01-14 16:44:07 +11:00
Hugh Dickins	b3ff8a2f95	cgroup: remove stray references to css_id Trivial: remove the few stray references to css_id, which itself was removed in v3.13's `2ff2a7d03b` "cgroup: kill css_id". Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-01-13 10:48:18 -05:00
Mike Snitzer	6a388618f1	dm cache: add block sizes and total cache blocks to status output Improve cache_status to emit: <metadata block size> <#used metadata blocks>/<#total metadata blocks> <cache block size> <#used cache blocks>/<#total cache blocks> ... Adding the block sizes allows for easier calculation of the overall size of both the metadata and cache devices. Adding <#total cache blocks> provides useful context for how much of the cache is used. Unfortunately these additions to the status will require updates to users' scripts that monitor the cache status. But these changes help provide more comprehensive information about the cache device and will simplify tools that are being developed to manage dm-cache devices -- because they won't need to issue 3 operations to cobble together the information that we can easily provide via a single status ioctl. While updating the status documentation in cache.txt spaces were tabify'd. Requested-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com>	2014-01-10 10:24:33 -05:00
Joe Thornber	f164e6900f	dm btree: add dm_btree_find_lowest_key dm_btree_find_lowest_key is the reciprocal of dm_btree_find_highest_key. Factor out common code for dm_btree_find_{highest,lowest}_key. dm_btree_find_lowest_key is needed for an upcoming DM target, as such it is best to get this interface in place. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2014-01-09 16:29:17 -05:00
Kent Overstreet	9dd6358a21	bcache: Fix auxiliary search trees for key size > cacheline size Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:15 -08:00
Kent Overstreet	3b3e9e50dd	bcache: Don't return -EINTR when insert finished We need to return -EINTR after a split because we invalidated iterators (and freed the btree node) - but if we were finished inserting, we don't want to redo the traversal. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:14 -08:00
Kent Overstreet	e0a985a4b1	bcache: Improve bucket_prio() calculation When deciding what order to reuse buckets we take into account both the bucket's priority (which indicates lru order) and also the amount of live data in that bucket. The way they were scaled together wasn't as correct as it could be... this patch improves and documents it. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:14 -08:00
Nicholas Swenson	3bdad1e40d	bcache: Add bch_bkey_equal_header() Checks if two keys have equivalent header fields. (good enough for replacement or merging) Used in bch_bkey_try_merge, and replacing a key in the btree. Signed-off-by: Nicholas Swenson <nks@daterainc.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:14 -08:00
Nicholas Swenson	0f49cf3d83	bcache: update bch_bkey_try_merge Added generic header checks to bch_bkey_try_merge, which then calls the bkey specific function Removed extraneous checks from bch_extent_merge Signed-off-by: Nicholas Swenson <nks@daterainc.com>	2014-01-08 13:05:14 -08:00
Kent Overstreet	829a60b905	bcache: Move insert_fixup() to btree_keys_ops Now handling overlapping extents/keys is a method that's specific to what the btree node contains. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:14 -08:00
Kent Overstreet	89ebb4a28b	bcache: Convert sorting to btree_keys More work to disentangle various code from struct btree Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:13 -08:00
Kent Overstreet	dc9d98d621	bcache: Convert debug code to btree_keys More work to disentangle various code from struct btree Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:13 -08:00
Kent Overstreet	c052dd9a26	bcache: Convert btree_iter to struct btree_keys More work to disentangle bset.c from struct btree Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:13 -08:00
Kent Overstreet	f67342dd34	bcache: Refactor bset_tree sysfs stats We're in the process of turning bset.c into library code, so none of the code in that file should know about struct cache_set or struct btree - so, move the btree traversal part of the stats code to sysfs.c. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:13 -08:00
Kent Overstreet	59158fde42	bcache: Add bch_btree_keys_u64s_remaining() Helper function to explicitly check how much space is free in a btree node Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:13 -08:00
Kent Overstreet	a85e968e66	bcache: Add struct btree_keys Soon, bset.c won't need to depend on struct btree. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:13 -08:00
Kent Overstreet	65d45231b5	bcache: Abstract out stuff needed for sorting Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:12 -08:00
Kent Overstreet	ee811287c9	bcache: Rename/shuffle various code around More work to disentangle bset.c from the rest of the code: Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:12 -08:00
Kent Overstreet	67539e8528	bcache: Add struct bset_sort_state More disentangling bset.c from the rest of the bcache code - soon, the sorting routines won't have any dependencies on any outside structs. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:12 -08:00
Kent Overstreet	911c961009	bcache: Split out sort_extent_cmp() Only use extent comparison for comparing extents, so we're not using START_KEY() on other key types (i.e. btree pointers) Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:12 -08:00
Kent Overstreet	fafff81cea	bcache: Bkey indexing renaming More refactoring: node() -> bset_bkey_idx() end() -> bset_bkey_last() Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:12 -08:00
Kent Overstreet	085d2a3dd4	bcache: Make bch_keylist_realloc() take u64s, not nptrs Getting away from KEY_PTRS and moving toward KEY_U64s - and getting rid of magic 2s Also - split out the part that checks against journal entry size so as to avoid a dependency on struct cache_set in bset.c Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:11 -08:00
Kent Overstreet	9a02b7eeeb	bcache: Remove/fix some header dependencies In the process of disentangling/libraryizing bset.c from the rest of the bcache code. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:11 -08:00
Kent Overstreet	0a45114534	bcache: Use a mempool for mergesort temporary space It was a single element mempool before, it's slightly cleaner to just use a real mempool. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:11 -08:00
Kent Overstreet	78b77bf8b2	bcache: Btree verify code improvements Used this fixed code to find and fix the bug fixed by a4d885097b0ac0cd1337f171f2d4b83e946094d4. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:10 -08:00
Kent Overstreet	88b9f8c426	bcache: kill index() That was a terrible name for a macro, add some better helpers to replace it. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:10 -08:00
Kent Overstreet	5c41c8a713	bcache: Trivial error handling fix Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:10 -08:00
Kent Overstreet	c78afc6261	bcache/md: Use raid stripe size Now that we've got code for raid5/6 stripe awareness, bcache just needs to know about the stripes and when writing partial stripes is expensive - we probably don't want to enable this optimization for raid1 or 10, even though they have stripes. So add a flag to queue_limits. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:09 -08:00
Kent Overstreet	5f5837d2d6	bcache: Do bkey_put() in btree_split() error path This error path shouldn't have been hit in practice... and we've got reworked reserve code coming soon so that it shouldn't _ever_ be hit... but if we've got code for this error path it should be correct. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:09 -08:00
Kent Overstreet	78365411b3	bcache: Rework allocator reserves We need a reserve for allocating buckets for new btree nodes - and now that we've got multiple btrees, it really needs to be per btree. This reworks the reserves so we've got separate freelists for each reserve instead of watermarks, which seems to make things a bit cleaner, and it adds some code so that btree_split() can make sure the reserve is available before it starts. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:09 -08:00
Kent Overstreet	1dd13c8d3c	bcache: kill closure locking code Also flesh out the documentation a bit Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:08 -08:00
Kent Overstreet	cb7a583e6a	bcache: kill closure locking usage Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:08 -08:00
Kent Overstreet	a5ae4300c1	bcache: Zero less memory Another minor performance optimization Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:08 -08:00
Kent Overstreet	d56d000a1f	bcache: Don't touch bucket gen for dirty ptrs Unnecessary since a bucket that has dirty pointers pointing to it can never be invalidated - and skipping it is a measurable performance boost, since the bucket gen will usually be a cache miss. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:07 -08:00
Kent Overstreet	b0f32a56f2	bcache: Minor btree cache fix Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:07 -08:00
Kent Overstreet	5775e2133d	bcache: Performance fix for when journal entry is full We were unnecessarily waiting on a journal write to complete when we just needed to start a journal write and start setting up the next one. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:07 -08:00
Kent Overstreet	b3fa7e77e6	bcache: Minor journal fix The real fix is where we check the bytes we need against how much is remaining - we also need to check for a journal entry bigger than our buffer, we'll never write those and it would be bad if we tried to read one. Also improve the diagnostic messages. Signed-off-by: Kent Overstreet <kmo@daterainc.com>	2014-01-08 13:05:06 -08:00
Kent Overstreet	ef71ec0000	bcache: Data corruption fix The code that handles overlapping extents that we've just read back in from disk was depending on the behaviour of the code that handles overlapping extents as we're inserting into a btree node in the case of an insert that forced an existing extent to be split: on insert, if we had to split we'd also insert a new extent to represent the top part of the old extent - and then that new extent would get written out. The code that read the extents back in thus did not bother with splitting extents - if it saw an extent that overlapped in the middle of an older extent, it would trim the old extent to only represent the bottom part, assuming that the original insert would've inserted a new extent to represent the top part. I still haven't figured out _how_ it can happen, but I'm now pretty convinced (and testing has confirmed) that there's some kind of an obscure corner case (probably involving extent merging, and multiple overwrites in different sets) that breaks this. The fix is to change the mergesort fixup code to split extents itself when required. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10	2014-01-08 13:05:06 -08:00
Joe Thornber	7e664b3dec	dm space map metadata: fix extending the space map When extending a metadata space map we should do the first commit whilst still in bootstrap mode -- a mode where all blocks get allocated in the new area. That way the commit overhead is allocated from the newly added space. Otherwise we risk running out of space. With this fix, and the previous commit "dm space map common: make sure new space is used during extend", the following device mapper testsuite test passes: dmtest run --suite thin-provisioning -n /resize_metadata_no_io/ Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org	2014-01-07 21:05:18 -05:00
Joe Thornber	12c91a5c2d	dm space map common: make sure new space is used during extend When extending a low level space map we should update nr_blocks at the start so the new space is used for the index entries. Otherwise extend can fail, e.g.: sm_metadata_extend call sequence that fails: -> sm_ll_extend -> dm_tm_new_block -> dm_sm_new_block -> sm_bootstrap_new_block => returns -ENOSPC because smm->begin == smm->ll.nr_blocks Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org	2014-01-07 21:05:17 -05:00
Mikulas Patocka	be35f48610	dm: wait until embedded kobject is released before destroying a device There may be other parts of the kernel holding a reference on the dm kobject. We must wait until all references are dropped before deallocating the mapped_device structure. The dm_kobject_release method signals that all references are dropped via completion. But dm_kobject_release doesn't free the kobject (which is embedded in the mapped_device structure). This is the sequence of operations: * when destroying a DM device, call kobject_put from dm_sysfs_exit * wait until all users stop using the kobject, when it happens the release method is called * the release method signals the completion and should return without delay * the dm device removal code that waits on the completion continues * the dm device removal code drops the dm_mod reference the device had * the dm device removal code frees the mapped_device structure that contains the kobject Using kobject this way should avoid the module unload race that was mentioned at the beginning of this thread: https://lkml.org/lkml/2014/1/4/83 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org	2014-01-07 21:01:43 -05:00
Mikulas Patocka	1ddd641ddc	dm: remove pointless kobject comparison in dm_get_from_kobject The comparison is always true and the compiler optimizes it out anyway. Milan offered additional context relative to the original commit `784aae735d` ("dm: add name and uuid to sysfs") which introduced the code: "I think it is just relict of some experiments before I committed this simple embedded sysfs kobj handling". Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2014-01-07 13:22:32 -05:00
