android_kernel_oneplus_msm8998/fs
Qu Wenruo 02e48c4d57 btrfs: Don't remove block group that still has pinned down bytes
[ Upstream commit 43794446548730ac8461be30bbe47d5d027d1d16 ]

[BUG]
Under certain KVM load and LTP tests, it is possible to hit the
following calltrace if quota is enabled:

BTRFS critical (device vda2): unable to find logical 8820195328 length 4096
BTRFS critical (device vda2): unable to find logical 8820195328 length 4096

WARNING: CPU: 0 PID: 49 at ../block/blk-core.c:172 blk_status_to_errno+0x1a/0x30
CPU: 0 PID: 49 Comm: kworker/u2:1 Not tainted 4.12.14-15-default #1 SLE15 (unreleased)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
task: ffff9f827b340bc0 task.stack: ffffb4f8c0304000
RIP: 0010:blk_status_to_errno+0x1a/0x30
Call Trace:
 submit_extent_page+0x191/0x270 [btrfs]
 ? btrfs_create_repair_bio+0x130/0x130 [btrfs]
 __do_readpage+0x2d2/0x810 [btrfs]
 ? btrfs_create_repair_bio+0x130/0x130 [btrfs]
 ? run_one_async_done+0xc0/0xc0 [btrfs]
 __extent_read_full_page+0xe7/0x100 [btrfs]
 ? run_one_async_done+0xc0/0xc0 [btrfs]
 read_extent_buffer_pages+0x1ab/0x2d0 [btrfs]
 ? run_one_async_done+0xc0/0xc0 [btrfs]
 btree_read_extent_buffer_pages+0x94/0xf0 [btrfs]
 read_tree_block+0x31/0x60 [btrfs]
 read_block_for_search.isra.35+0xf0/0x2e0 [btrfs]
 btrfs_search_slot+0x46b/0xa00 [btrfs]
 ? kmem_cache_alloc+0x1a8/0x510
 ? btrfs_get_token_32+0x5b/0x120 [btrfs]
 find_parent_nodes+0x11d/0xeb0 [btrfs]
 ? leaf_space_used+0xb8/0xd0 [btrfs]
 ? btrfs_leaf_free_space+0x49/0x90 [btrfs]
 ? btrfs_find_all_roots_safe+0x93/0x100 [btrfs]
 btrfs_find_all_roots_safe+0x93/0x100 [btrfs]
 btrfs_find_all_roots+0x45/0x60 [btrfs]
 btrfs_qgroup_trace_extent_post+0x20/0x40 [btrfs]
 btrfs_add_delayed_data_ref+0x1a3/0x1d0 [btrfs]
 btrfs_alloc_reserved_file_extent+0x38/0x40 [btrfs]
 insert_reserved_file_extent.constprop.71+0x289/0x2e0 [btrfs]
 btrfs_finish_ordered_io+0x2f4/0x7f0 [btrfs]
 ? pick_next_task_fair+0x2cd/0x530
 ? __switch_to+0x92/0x4b0
 btrfs_worker_helper+0x81/0x300 [btrfs]
 process_one_work+0x1da/0x3f0
 worker_thread+0x2b/0x3f0
 ? process_one_work+0x3f0/0x3f0
 kthread+0x11a/0x130
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x35/0x40

BTRFS critical (device vda2): unable to find logical 8820195328 length 16384
BTRFS: error (device vda2) in btrfs_finish_ordered_io:3023: errno=-5 IO failure
BTRFS info (device vda2): forced readonly
BTRFS error (device vda2): pending csums is 2887680

[CAUSE]
It's caused by race with block group auto removal:

- There is a meta block group X, which has only one tree block
  The tree block belongs to fs tree 257.
- In current transaction, some operation modified fs tree 257
  The tree block gets COWed, so the block group X is empty, and marked
  as unused, queued to be deleted.
- Some workload (like fsync) wakes up cleaner_kthread()
  Which will call btrfs_delete_unused_bgs() to remove unused block
  groups.
  So block group X along its chunk map get removed.
- Some delalloc work finished for fs tree 257
  Quota needs to get the original reference of the extent, which will
  read tree blocks of commit root of 257.
  Then since the chunk map gets removed, the above warning gets
  triggered.

[FIX]
Just let btrfs_delete_unused_bgs() skip block group which still has
pinned bytes.

However there is a minor side effect: currently we only queue empty
blocks at update_block_group(), and such empty block group with pinned
bytes won't go through update_block_group() again, such block group
won't be removed, until it gets new extent allocated and removed.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15 09:40:40 +02:00
..
9p fs/9p/xattr.c: catch the error of p9_client_clunk when setting xattr failed 2018-09-09 20:04:33 +02:00
adfs
affs affs_lookup(): close a race with affs_remove_link() 2018-05-30 07:48:51 +02:00
afs afs: Fix afs_kill_pages() 2017-12-20 10:04:56 +01:00
autofs4 autofs: mount point create should honour passed in mode 2018-04-24 09:32:11 +02:00
befs
bfs
btrfs btrfs: Don't remove block group that still has pinned down bytes 2018-09-15 09:40:40 +02:00
cachefiles cachefiles: Wait rather than BUG'ing on "Unexpected object collision" 2018-09-05 09:18:35 +02:00
ceph ceph: drop negative child dentries before try pruning inode's alias 2017-12-20 10:04:52 +01:00
cifs SMB3: Number of requests sent should be displayed for SMB3 not just CIFS 2018-09-15 09:40:39 +02:00
coda coda: fix 'kernel memory exposure attempt' in fsync 2017-11-24 08:32:25 +01:00
configfs configfs: Fix race between create_link and configfs_rmdir 2017-06-26 07:13:08 +02:00
cramfs
debugfs dentry name snapshots 2017-08-06 19:19:42 -07:00
devpts devpts: clean up interface to pty drivers 2016-08-16 09:30:49 +02:00
dlm dlm: avoid double-free on error path in dlm_device_{register,unregister} 2017-09-13 14:09:45 -07:00
ecryptfs do d_instantiate/unlock_new_inode combinations safely 2018-05-30 07:48:52 +02:00
efivarfs efi: Make efivarfs entries immutable by default 2016-03-03 15:07:09 -08:00
efs
exofs osd fs: __r4w_get_page rely on PageUptodate for uptodate 2015-12-12 10:15:34 -08:00
exportfs
ext2 do d_instantiate/unlock_new_inode combinations safely 2018-05-30 07:48:52 +02:00
ext4 ext4: reset error code in ext4_find_entry in fallback 2018-09-05 09:18:38 +02:00
f2fs f2fs: fix to don't trigger writeback during recovery 2018-08-06 16:24:31 +02:00
fat fat: validate ->i_start before using 2018-09-15 09:40:38 +02:00
freevxfs
fscache fscache: Allow cancelled operations to be enqueued 2018-09-05 09:18:35 +02:00
fuse fuse: Add missed unlock_page() to fuse_readpages_fill() 2018-09-05 09:18:39 +02:00
gfs2 gfs2: Fix fallocate chunk size 2018-05-30 07:49:13 +02:00
hfs hfs: prevent crash on exit from failed search 2018-09-15 09:40:37 +02:00
hfsplus hfsplus: fix NULL dereference in hfsplus_lookup() 2018-09-15 09:40:38 +02:00
hostfs hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common() 2016-09-30 10:18:39 +02:00
hpfs hpfs: implement the show_options method 2016-06-01 12:15:54 -07:00
hugetlbfs mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
isofs isofs: fix timestamps beyond 2027 2017-11-30 08:37:20 +00:00
jbd2 jbd2: don't mark block as modified if the handle is out of credits 2018-07-11 16:03:48 +02:00
jffs2 jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path 2018-05-30 07:48:54 +02:00
jfs jfs: Fix inconsistency between memory allocation and ea_buf->max_size 2018-08-09 12:19:28 +02:00
kernfs kernfs: fix regression in kernfs_fop_write caused by wrong type 2018-02-16 20:09:42 +01:00
lockd lockd: lost rollback of set_grace_period() in lockd_down_net() 2018-05-26 08:48:50 +02:00
logfs
minix
ncpfs staging: ncpfs: memory corruption in ncp_read_kernel() 2018-03-28 18:40:15 +02:00
nfs pnfs/blocklayout: off by one in bl_map_stripe() 2018-09-09 20:04:34 +02:00
nfs_common lockd: fix "list_add double add" caused by legacy signal interface 2018-02-03 17:04:28 +01:00
nfsd nfsd: fix potential use-after-free in nfsd4_decode_getdeviceinfo 2018-08-06 16:24:30 +02:00
nilfs2 do d_instantiate/unlock_new_inode combinations safely 2018-05-30 07:48:52 +02:00
nls
notify fanotify: fix logic of events on child 2018-04-24 09:32:11 +02:00
ntfs
ocfs2 ocfs2: subsystem.su_mutex is required while accessing the item->ci_parent 2018-07-22 14:25:52 +02:00
omfs
openpromfs
overlayfs ovl: warn instead of error if d_type is not supported 2018-08-28 07:23:43 +02:00
proc proc: Use underscores for SSBD in 'status' 2018-07-25 10:18:28 +02:00
pstore pstore: Use dynamic spinlock initializer 2017-08-06 19:19:43 -07:00
qnx4
qnx6
quota fs/quota: Fix spectre gadget in do_quotactl 2018-09-09 20:04:36 +02:00
ramfs
reiserfs reiserfs: change j_timestamp type to time64_t 2018-09-15 09:40:38 +02:00
romfs romfs: use different way to generate fsid for BLOCK or MTD 2017-06-17 06:39:38 +02:00
squashfs squashfs: more metadata hardenings 2018-08-06 16:24:42 +02:00
sysfs scsi: sysfs: Introduce sysfs_{un,}break_active_protection() 2018-09-05 09:18:40 +02:00
sysv
tracefs
ubifs ubifs: Fix synced_i_size calculation for xattr inodes 2018-09-09 20:04:35 +02:00
udf udf: Detect incorrect directory size 2018-07-03 11:21:34 +02:00
ufs do d_instantiate/unlock_new_inode combinations safely 2018-05-30 07:48:52 +02:00
xfs xfs: fix incorrect log_flushed on fsync 2018-06-13 16:15:27 +02:00
aio.c fix io_destroy()/aio_complete() race 2018-06-06 16:46:23 +02:00
anon_inodes.c
attr.c vfs: move permission checking into notify_change() for utimes(NULL) 2016-10-22 12:26:56 +02:00
bad_inode.c
binfmt_aout.c
binfmt_elf.c binfmt_elf: use ELF_ET_DYN_BASE only for PIE 2017-07-21 07:44:57 +02:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c fs/binfmt_misc.c: do not allow offset overflow 2018-07-03 11:21:26 +02:00
binfmt_script.c
block_dev.c fs/block_dev: always invalidate cleancache in invalidate_bdev() 2017-05-20 14:27:01 +02:00
buffer.c fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
char_dev.c
compat.c
compat_binfmt_elf.c binfmt_elf: compat: avoid unused function warning 2018-02-25 11:03:51 +01:00
compat_ioctl.c fs: compat: Remove warning from COMPATIBLE_IOCTL 2018-04-08 11:51:57 +02:00
coredump.c coredump: Ensure proper size of sparse core files 2017-07-05 14:37:20 +02:00
dax.c
dcache.c fs/dcache.c: fix kmemcheck splat at take_dentry_name_snapshot() 2018-09-15 09:40:38 +02:00
dcookies.c
direct-io.c direct-io: Prevent NULL pointer access in submit_page_section 2017-10-18 09:20:42 +02:00
drop_caches.c
eventfd.c
eventpoll.c epoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove() 2017-09-07 08:34:10 +02:00
exec.c exec: Limit arg stack to at most 75% of _STK_LIM 2017-07-21 07:44:57 +02:00
fcntl.c fs/fcntl: f_setown, avoid undefined behaviour 2018-01-31 12:06:11 +01:00
fhandle.c fs/coredump: prevent fsuid=0 dumps into user-controlled directories 2016-04-12 09:08:58 -07:00
file.c
file_table.c
filesystems.c
fs-writeback.c bdi: Fix oops in wb_workfn() 2018-05-16 10:06:51 +02:00
fs_pin.c
fs_struct.c
inode.c Fix up non-directory creation in SGID directories 2018-07-17 11:31:43 +02:00
internal.h
ioctl.c
Kconfig
Kconfig.binfmt
libfs.c
locks.c locks: don't check for race with close when setting OFD lock 2018-01-17 09:35:27 +01:00
Makefile
mbcache.c
mount.h mnt: In propgate_umount handle visiting mounts in any order 2017-07-21 07:44:57 +02:00
mpage.c fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
namei.c getname_kernel() needs to make sure that ->name != ->iname in long case 2018-04-24 09:32:04 +02:00
namespace.c fix __legitimize_mnt()/mntput() race 2018-08-15 17:42:05 +02:00
no-block.c
nsfs.c nsfs: mark dentry with DCACHE_RCUACCESS 2018-02-16 20:09:43 +01:00
open.c fs: completely ignore unknown open flags 2017-07-15 11:57:44 +02:00
pipe.c pipe: cap initial pipe capacity according to pipe-max-size limit 2018-05-26 08:48:51 +02:00
pnode.c mnt: Make propagate_umount less slow for overlapping mount propagation trees 2017-07-21 07:44:58 +02:00
pnode.h mnt: Add a per mount namespace limit on the number of mounts 2017-04-30 05:49:28 +02:00
posix_acl.c tmpfs: clear S_ISGID when setting posix ACLs 2017-01-26 08:23:47 +01:00
proc_namespace.c vfs: show_vfsstat: do not ignore errors from show_devname method 2016-04-12 09:08:55 -07:00
read_write.c vfs: Return -ENXIO for negative SEEK_HOLE / SEEK_DATA offsets 2017-10-05 09:41:45 +02:00
readdir.c
select.c fs/select: add vmalloc fallback for select(2) 2018-01-31 12:06:09 +01:00
seq_file.c Make file credentials available to the seqfile interfaces 2017-08-06 19:19:42 -07:00
signalfd.c
splice.c vfs: fix uninitialized flags in splice_to_pipe() 2017-02-23 17:43:09 +01:00
stack.c
stat.c ufs: restore maintaining ->i_blocks 2017-06-14 13:16:24 +02:00
statfs.c
super.c sget(): handle failures of register_shrinker() 2018-03-03 10:19:41 +01:00
sync.c
timerfd.c timerfd: Protect the might cancel mechanism proper 2017-05-08 07:46:01 +02:00
userfaultfd.c userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE 2017-12-20 10:04:53 +01:00
utimes.c vfs: move permission checking into notify_change() for utimes(NULL) 2016-10-22 12:26:56 +02:00
xattr.c getxattr: use correct xattr length 2018-09-09 20:04:36 +02:00