android_kernel_oneplus_msm8998/drivers/md
Mikulas Patocka 40dbcbebf0 dm snapshot: rework COW throttling to fix deadlock
[ Upstream commit b21555786f18cd77f2311ad89074533109ae3ffa ]

Commit 721b1d98fb517a ("dm snapshot: Fix excessive memory usage and
workqueue stalls") introduced a semaphore to limit the maximum number of
in-flight kcopyd (COW) jobs.

The implementation of this throttling mechanism is prone to a deadlock:

1. One or more threads write to the origin device causing COW, which is
   performed by kcopyd.

2. At some point some of these threads might reach the s->cow_count
   semaphore limit and block in down(&s->cow_count), holding a read lock
   on _origins_lock.

3. Someone tries to acquire a write lock on _origins_lock, e.g.,
   snapshot_ctr(), which blocks because the threads at step (2) already
   hold a read lock on it.

4. A COW operation completes and kcopyd runs dm-snapshot's completion
   callback, which ends up calling pending_complete().
   pending_complete() tries to resubmit any deferred origin bios. This
   requires acquiring a read lock on _origins_lock, which blocks.

   This happens because the read-write semaphore implementation gives
   priority to writers, meaning that as soon as a writer tries to enter
   the critical section, no readers will be allowed in, until all
   writers have completed their work.

   So, pending_complete() waits for the writer at step (3) to acquire
   and release the lock. This writer waits for the readers at step (2)
   to release the read lock and those readers wait for
   pending_complete() (the kcopyd thread) to signal the s->cow_count
   semaphore: DEADLOCK.

The above was thoroughly analyzed and documented by Nikos Tsironis as
part of his initial proposal for fixing this deadlock, see:
https://www.redhat.com/archives/dm-devel/2019-October/msg00001.html

Fix this deadlock by reworking COW throttling so that it waits without
holding any locks. Add a variable 'in_progress' that counts how many
kcopyd jobs are running. A function wait_for_in_progress() will sleep if
'in_progress' is over the limit. It drops _origins_lock in order to
avoid the deadlock.

Reported-by: Guruswamy Basavaiah <guru2018@gmail.com>
Reported-by: Nikos Tsironis <ntsironis@arrikto.com>
Reviewed-by: Nikos Tsironis <ntsironis@arrikto.com>
Tested-by: Nikos Tsironis <ntsironis@arrikto.com>
Fixes: 721b1d98fb51 ("dm snapshot: Fix excessive memory usage and workqueue stalls")
Cc: stable@vger.kernel.org # v5.0+
Depends-on: 4a3f111a73a8c ("dm snapshot: introduce account_start_copy() and account_end_copy()")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06 12:09:10 +01:00
..
bcache bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush() 2019-08-04 09:34:48 +02:00
persistent-data dm space map metadata: fix missing store of apply_bops() return value 2019-09-06 10:18:10 +02:00
bitmap.c md/bitmap: disable bitmap_resize for file-backed bitmaps. 2017-09-27 11:00:14 +02:00
bitmap.h md-cluster: Use a small window for resync 2015-10-12 01:32:05 -05:00
dm-bio-prison.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
dm-bio-prison.h dm bio prison: add dm_cell_promote_or_release() 2015-05-29 14:19:06 -04:00
dm-bio-record.h
dm-bufio.c Revert "dm bufio: fix deadlock with loop device" 2019-09-06 10:18:09 +02:00
dm-bufio.h
dm-builtin.c
dm-cache-block-types.h
dm-cache-metadata.c dm cache metadata: save in-core policy_hint_size to on-disk superblock 2018-09-09 20:04:33 +02:00
dm-cache-metadata.h dm cache: make sure every metadata function checks fail_io 2016-04-12 09:08:40 -07:00
dm-cache-policy-cleaner.c - Revert a dm-multipath change that caused a regression for unprivledged 2015-11-04 21:19:53 -08:00
dm-cache-policy-internal.h dm cache: age and write back cache entries even without active IO 2015-06-11 17:13:01 -04:00
dm-cache-policy-mq.c dm: convert ffs to __ffs 2015-10-31 19:06:01 -04:00
dm-cache-policy-smq.c dm: convert ffs to __ffs 2015-10-31 19:06:01 -04:00
dm-cache-policy.c
dm-cache-policy.h dm cache: age and write back cache entries even without active IO 2015-06-11 17:13:01 -04:00
dm-cache-target.c dm cache: fix resize crash if user doesn't reload cache table 2018-10-13 09:11:32 +02:00
dm-crypt.c dm crypt: mark key as invalid until properly loaded 2017-01-06 11:16:15 +01:00
dm-delay.c dm delay: fix a crash when invalid device is specified 2019-06-11 12:23:48 +02:00
dm-era-target.c dm era: save spacemap metadata root after the pre-commit 2017-05-20 14:27:00 +02:00
dm-exception-store.c - Revert a dm-multipath change that caused a regression for unprivledged 2015-11-04 21:19:53 -08:00
dm-exception-store.h dm snapshot: fix hung bios when copy error occurs 2016-03-03 15:07:14 -08:00
dm-flakey.c dm flakey: return -EINVAL on interval bounds error in flakey_ctr() 2017-01-06 11:16:15 +01:00
dm-io.c dm io: fix duplicate bio completion due to missing ref count 2018-03-11 16:19:47 +01:00
dm-ioctl.c dm ioctl: harden copy_params()'s copy_from_user() from malicious users 2018-11-21 09:27:36 +01:00
dm-kcopyd.c dm kcopyd: Fix bug causing workqueue stalls 2019-01-26 09:42:54 +01:00
dm-linear.c dm linear: remove redundant target name from error messages 2015-10-31 19:06:03 -04:00
dm-log-userspace-base.c dm: drop NULL test before kmem_cache_destroy() and mempool_destroy() 2015-10-31 19:06:00 -04:00
dm-log-userspace-transfer.c dm log userspace transfer: match wait_for_completion_timeout return type 2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.h
dm-log-writes.c dm log writes: fix bug with too large bios 2016-10-07 15:23:47 +02:00
dm-log.c
dm-mpath.c dm mpath: check if path's request_queue is dying in activate_path() 2016-10-28 03:01:28 -04:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c
dm-raid.c dm raid: fix round up of default region size 2015-10-02 12:02:31 -04:00
dm-raid1.c dm mirror: fix read error on recovery after default leg failure 2016-11-10 16:36:35 +01:00
dm-region-hash.c dm: convert ffs to __ffs 2015-10-31 19:06:01 -04:00
dm-round-robin.c
dm-service-time.c
dm-snap-persistent.c dm snapshot: fix hung bios when copy error occurs 2016-03-03 15:07:14 -08:00
dm-snap-transient.c dm snapshot: fix hung bios when copy error occurs 2016-03-03 15:07:14 -08:00
dm-snap.c dm snapshot: rework COW throttling to fix deadlock 2019-11-06 12:09:10 +01:00
dm-stats.c dm stats: fix a leaked s->histogram_boundaries array 2017-03-12 06:37:26 +01:00
dm-stats.h dm stats: support precise timestamps 2015-06-17 12:40:40 -04:00
dm-stripe.c Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm 2015-09-02 16:35:26 -07:00
dm-switch.c dm switch: simplify conditional in alloc_region_table() 2015-10-31 19:06:06 -04:00
dm-sysfs.c
dm-table.c dm table: fix invalid memory accesses with too high sector number 2019-09-06 10:18:11 +02:00
dm-target.c
dm-thin-metadata.c dm thin metadata: fix __udivdi3 undefined on 32-bit 2018-10-10 08:52:13 +02:00
dm-thin-metadata.h dm thin metadata: add dm_thin_remove_range() 2015-06-11 17:13:04 -04:00
dm-thin.c dm thin: add sanity checks to thin-pool and external snapshot creation 2019-04-27 09:33:49 +02:00
dm-uevent.c
dm-uevent.h
dm-verity.c dm verity: use message limit for data block corruption message 2019-07-21 09:07:14 +02:00
dm-zero.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
dm.c dm: correctly handle chained bios in dec_pending() 2018-02-22 15:45:01 +01:00
dm.h block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
faulty.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
Kconfig dm raid: select the Kconfig option CONFIG_MD_RAID0 2017-05-25 14:30:07 +02:00
linear.c md/linear: shutup lockdep warnning 2017-10-21 17:09:05 +02:00
linear.h md linear: fix a race between linear_add() and linear_congested() 2017-03-12 06:37:30 +01:00
Makefile raid5: add basic stripe log 2015-10-24 17:16:19 +11:00
md-cluster.c md-cluster: clear another node's suspend_area after the copy is finished 2018-10-10 08:52:04 +02:00
md-cluster.h md-cluster: Fix adding of new disk with new reload code 2015-10-12 03:35:30 -05:00
md.c md: don't set In_sync if array is frozen 2019-10-05 12:27:47 +02:00
md.h md/raid: only permit hot-add of compatible integrity profiles 2016-02-17 12:30:57 -08:00
multipath.c md: multipath: don't hardcopy bio in .make_request path 2016-04-12 09:08:57 -07:00
multipath.h
raid0.c md/raid0: apply base queue limits *before* disk_stack_limits 2015-10-02 17:23:44 +10:00
raid0.h block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
raid1.c md/raid1: fail run raid1 array when active disk less than one 2019-10-05 12:27:49 +02:00
raid1.h md-cluster: Use a small window for resync 2015-10-12 01:32:05 -05:00
raid5-cache.c raid5-cache: start raid5 readonly if journal is missing 2015-11-01 13:48:29 +11:00
raid5.c md/raid6: Set R5_ReadError when there is read failure on parity disk 2019-10-05 12:27:54 +02:00
raid5.h RAID5: revert e9e4c377e2 to fix a livelock 2016-04-12 09:08:57 -07:00
raid10.c md: Fix failed allocation of md_register_thread 2019-03-23 08:44:39 +01:00
raid10.h md/raid10: ensure device failure recorded before write request returns. 2015-08-31 19:43:45 +02:00