android_kernel_oneplus_msm8998/fs/btrfs
Robbie Ko 44b0243956 Btrfs: send, fix infinite loop due to directory rename dependencies
[ Upstream commit a4390aee72713d9e73f1132bcdeb17d72fbbf974 ]

When doing an incremental send, due to the need of delaying directory move
(rename) operations we can end up in infinite loop at
apply_children_dir_moves().

An example scenario that triggers this problem is described below, where
directory names correspond to the numbers of their respective inodes.

Parent snapshot:

 .
 |--- 261/
       |--- 271/
             |--- 266/
                   |--- 259/
                   |--- 260/
                   |     |--- 267
                   |
                   |--- 264/
                   |     |--- 258/
                   |           |--- 257/
                   |
                   |--- 265/
                   |--- 268/
                   |--- 269/
                   |     |--- 262/
                   |
                   |--- 270/
                   |--- 272/
                   |     |--- 263/
                   |     |--- 275/
                   |
                   |--- 274/
                         |--- 273/

Send snapshot:

 .
 |-- 275/
      |-- 274/
           |-- 273/
                |-- 262/
                     |-- 269/
                          |-- 258/
                               |-- 271/
                                    |-- 268/
                                         |-- 267/
                                              |-- 270/
                                                   |-- 259/
                                                   |    |-- 265/
                                                   |
                                                   |-- 272/
                                                        |-- 257/
                                                             |-- 260/
                                                             |-- 264/
                                                                  |-- 263/
                                                                       |-- 261/
                                                                            |-- 266/

When processing inode 257 we delay its move (rename) operation because its
new parent in the send snapshot, inode 272, was not yet processed. Then
when processing inode 272, we delay the move operation for that inode
because inode 274 is its ancestor in the send snapshot. Finally we delay
the move operation for inode 274 when processing it because inode 275 is
its new parent in the send snapshot and was not yet moved.

When finishing processing inode 275, we start to do the move operations
that were previously delayed (at apply_children_dir_moves()), resulting in
the following iterations:

1) We issue the move operation for inode 274;

2) Because inode 262 depended on the move operation of inode 274 (it was
   delayed because 274 is its ancestor in the send snapshot), we issue the
   move operation for inode 262;

3) We issue the move operation for inode 272, because it was delayed by
   inode 274 too (ancestor of 272 in the send snapshot);

4) We issue the move operation for inode 269 (it was delayed by 262);

5) We issue the move operation for inode 257 (it was delayed by 272);

6) We issue the move operation for inode 260 (it was delayed by 272);

7) We issue the move operation for inode 258 (it was delayed by 269);

8) We issue the move operation for inode 264 (it was delayed by 257);

9) We issue the move operation for inode 271 (it was delayed by 258);

10) We issue the move operation for inode 263 (it was delayed by 264);

11) We issue the move operation for inode 268 (it was delayed by 271);

12) We verify if we can issue the move operation for inode 270 (it was
    delayed by 271). We detect a path loop in the current state, because
    inode 267 needs to be moved first before we can issue the move
    operation for inode 270. So we delay again the move operation for
    inode 270, this time we will attempt to do it after inode 267 is
    moved;

13) We issue the move operation for inode 261 (it was delayed by 263);

14) We verify if we can issue the move operation for inode 266 (it was
    delayed by 263). We detect a path loop in the current state, because
    inode 270 needs to be moved first before we can issue the move
    operation for inode 266. So we delay again the move operation for
    inode 266, this time we will attempt to do it after inode 270 is
    moved (its move operation was delayed in step 12);

15) We issue the move operation for inode 267 (it was delayed by 268);

16) We verify if we can issue the move operation for inode 266 (it was
    delayed by 270). We detect a path loop in the current state, because
    inode 270 needs to be moved first before we can issue the move
    operation for inode 266. So we delay again the move operation for
    inode 266, this time we will attempt to do it after inode 270 is
    moved (its move operation was delayed in step 12). So here we added
    again the same delayed move operation that we added in step 14;

17) We attempt again to see if we can issue the move operation for inode
    266, and as in step 16, we realize we can not due to a path loop in
    the current state due to a dependency on inode 270. Again we delay
    inode's 266 rename to happen after inode's 270 move operation, adding
    the same dependency to the empty stack that we did in steps 14 and 16.
    The next iteration will pick the same move dependency on the stack
    (the only entry) and realize again there is still a path loop and then
    again the same dependency to the stack, over and over, resulting in
    an infinite loop.

So fix this by preventing adding the same move dependency entries to the
stack by removing each pending move record from the red black tree of
pending moves. This way the next call to get_pending_dir_moves() will
not return anything for the current parent inode.

A test case for fstests, with this reproducer, follows soon.

Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
[Wrote changelog with example and more clear explanation]
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17 21:55:10 +01:00
..
tests btrfs: tests/qgroup: Fix wrong tree backref level 2018-05-30 07:49:09 +02:00
acl.c btrfs: preserve i_mode if __btrfs_set_acl() fails 2018-03-11 16:19:47 +01:00
async-thread.c btrfs: limit async_work allocation and worker func duration 2017-01-06 11:16:06 +01:00
async-thread.h btrfs: limit async_work allocation and worker func duration 2017-01-06 11:16:06 +01:00
backref.c Btrfs: fix hang on extent buffer lock caused by the inode_paths ioctl 2016-02-25 12:01:15 -08:00
backref.h btrfs: cleanup, remove inode_item_info helper 2015-01-14 19:23:47 +01:00
btrfs_inode.h Btrfs: Direct I/O: Fix space accounting 2015-09-21 13:47:55 -07:00
check-integrity.c Merge branch 'cleanups/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-21 18:21:40 -07:00
check-integrity.h
compression.c btrfs: assign error values to the correct bio structs 2016-10-22 12:26:54 +02:00
compression.h btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
ctree.c btrfs: Fix out of bounds access in btrfs_search_slot 2018-05-30 07:48:54 +02:00
ctree.h btrfs: store and load values of stripes_min/stripes_max in balance status item 2017-01-06 11:16:06 +01:00
delayed-inode.c btrfs: limit async_work allocation and worker func duration 2017-01-06 11:16:06 +01:00
delayed-inode.h btrfs: properly set the termination value of ctx->pos in readdir 2016-02-25 12:01:15 -08:00
delayed-ref.c btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans 2015-10-26 19:44:39 -07:00
delayed-ref.h btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans 2015-10-26 19:44:39 -07:00
dev-replace.c btrfs: replace: Reset on-disk dev stats value after replace 2018-09-15 09:40:40 +02:00
dev-replace.h
dir-item.c Btrfs: make xattr replace operations atomic 2014-11-20 17:20:07 -08:00
disk-io.c btrfs: Always try all copies when reading extent buffers 2018-12-13 09:21:32 +01:00
disk-io.h btrfs: don't create or leak aliased root while cleaning up orphans 2018-11-10 07:41:36 -08:00
export.c BTRFS: support NFSv2 export 2015-10-06 06:55:23 -07:00
export.h
extent-tree.c btrfs: Ensure btrfs_trim_fs can trim the whole filesystem 2018-12-01 09:46:41 +01:00
extent-tree.h btrfs: qgroup: Add new qgroup calculation function 2015-06-10 09:25:49 -07:00
extent_io.c btrfs: fix incorrect error return ret being passed to mapping_set_error 2018-04-13 19:50:06 +02:00
extent_io.h btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function 2015-10-21 18:37:45 -07:00
extent_map.c Btrfs: do not move em to modified list when unpinning 2014-11-21 11:59:54 -08:00
extent_map.h
file-item.c Merge branch 'cleanups-post-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.1 2015-03-25 10:52:48 -07:00
file.c Btrfs: set plug for fsync 2018-05-30 07:48:54 +02:00
free-space-cache.c Btrfs: fix use-after-free when dumping free space 2018-12-13 09:21:32 +01:00
free-space-cache.h Btrfs: keep track of largest extent in bitmaps 2015-10-21 18:55:40 -07:00
hash.c btrfs: LLVMLinux: Remove VLAIS 2014-10-14 10:51:22 +02:00
hash.h
inode-item.c Btrfs: consolidate btrfs_error() to btrfs_std_error() 2015-09-29 16:30:00 +02:00
inode-map.c Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-03-03 15:07:12 -08:00
inode-map.h Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-03-03 15:07:12 -08:00
inode.c Btrfs: fix null pointer dereference on compressed write path error 2018-11-21 09:27:38 +01:00
ioctl.c btrfs: Ensure btrfs_trim_fs can trim the whole filesystem 2018-12-01 09:46:41 +01:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
locking.c btrfs: comment the rest of implicit barriers before waitqueue_active 2015-10-10 18:42:00 +02:00
locking.h btrfs: fix lockups from btrfs_clear_path_blocking 2014-11-19 10:34:35 -08:00
lzo.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
Makefile
math.h btrfs: cleanup 64bit/32bit divs, compile time constants 2015-03-03 17:23:57 +01:00
ordered-data.c Btrfs: change how we wait for pending ordered extents 2015-10-21 18:51:40 -07:00
ordered-data.h Btrfs: change how we wait for pending ordered extents 2015-10-21 18:51:40 -07:00
orphan.c
print-tree.c btrfs: remove parameter blocksize from read_tree_block 2014-10-02 17:14:50 +02:00
print-tree.h
props.c btrfs: cleanup iterating over prop_handlers array 2015-10-21 18:28:48 +02:00
props.h
qgroup.c btrfs: qgroup: Dirty all qgroups before rescan 2018-11-21 09:27:38 +01:00
qgroup.h btrfs: waiting on qgroup rescan should not always be interruptible 2016-09-07 08:32:43 +02:00
raid56.c Btrfs: make raid6 rebuild retry more 2018-07-03 11:21:24 +02:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c btrfs: reada: Fix returned errno code 2015-10-21 18:29:50 +02:00
relocation.c btrfs: Handle owner mismatch gracefully when walking up tree 2018-11-21 09:27:37 +01:00
root-tree.c btrfs: don't create or leak aliased root while cleaning up orphans 2018-11-10 07:41:36 -08:00
scrub.c btrfs: scrub: Don't use inode pages for device replace 2018-07-03 11:21:25 +02:00
send.c Btrfs: send, fix infinite loop due to directory rename dependencies 2018-12-17 21:55:10 +01:00
send.h
struct-funcs.c
super.c Btrfs: ensure path name is null terminated at btrfs_control_ioctl 2018-12-13 09:21:26 +01:00
sysfs.c Btrfs: rename super_kobj to fsid_kobj 2015-09-29 16:29:59 +02:00
sysfs.h Btrfs: rename btrfs_kobj_rm_device to btrfs_sysfs_rm_device_link 2015-09-29 16:29:59 +02:00
transaction.c btrfs: release metadata before running delayed refs 2018-12-13 09:21:27 +01:00
transaction.h btrfs: account for non-CoW'd blocks in btrfs_abort_transaction 2016-07-27 09:47:33 -07:00
tree-defrag.c Btrfs: cleanup: remove unnecessary check before btrfs_free_path is called 2015-08-31 11:46:41 -07:00
tree-log.c Btrfs: fix wrong dentries after fsync of file that got its parent replaced 2018-11-21 09:27:38 +01:00
tree-log.h Btrfs: fix metadata inconsistencies after directory fsync 2015-03-26 17:56:23 -07:00
ulist.c btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c btrfs: return the actual error value from from btrfs_uuid_tree_iterate 2017-11-30 08:37:28 +00:00
volumes.c Btrfs: make raid6 rebuild retry more 2018-07-03 11:21:24 +02:00
volumes.h btrfs: fix clashing number of the enhanced balance usage filter 2015-11-25 05:19:50 -08:00
xattr.c Btrfs: fix race when listing an inode's xattrs 2015-11-09 18:34:40 +00:00
xattr.h
zlib.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00