Commit graph

188 commits

Author SHA1 Message Date
Chao Yu
9e6ece8a41 f2fs: support journalled quota
This patch enables f2fs to accept quota information through the
following mount options:
- {usr,grp,prj}jquota=<quota file path>
- jqfmt=<quota type>

Then, in the ->mount flow, we can recover the quota files during log
replay; with this, journalled quota can be supported.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: Fix wrong return values.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-21 14:12:24 -07:00
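As a rough illustration of the mount options named in 9e6ece8a41, the
userspace sketch below splits one option token such as usrjquota=<path>
or jqfmt=<type> into its key and value. The struct, enum, and function
names are hypothetical and are not taken from the f2fs sources, and the
jqfmt value is stored unchecked because the commit message does not list
the accepted quota types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quota kinds matching the {usr,grp,prj}jquota= prefixes. */
enum toy_quota_type { TOY_USRQUOTA, TOY_GRPQUOTA, TOY_PRJQUOTA, TOY_MAXQUOTAS };

/* Hypothetical container for parsed options; pointers alias the input. */
struct toy_quota_opts {
	const char *qf_names[TOY_MAXQUOTAS]; /* quota file path per type */
	const char *jqfmt;                   /* raw quota format string  */
};

/*
 * Parse one "key=value" token. Returns 0 if the token was a quota option
 * with a non-empty value, -1 otherwise.
 */
int toy_parse_quota_opt(const char *tok, struct toy_quota_opts *o)
{
	static const char *const keys[TOY_MAXQUOTAS] = {
		"usrjquota=", "grpjquota=", "prjjquota="
	};
	size_t i, n;

	assert(tok != NULL && o != NULL);

	for (i = 0; i < TOY_MAXQUOTAS; i++) {
		n = strlen(keys[i]);
		if (strncmp(tok, keys[i], n) == 0 && tok[n] != '\0') {
			o->qf_names[i] = tok + n;
			return 0;
		}
	}

	n = strlen("jqfmt=");
	if (strncmp(tok, "jqfmt=", n) == 0 && tok[n] != '\0') {
		o->jqfmt = tok + n;
		return 0;
	}
	return -1;
}
```

A real parser would also have to validate the format string and reject
conflicting options; this sketch only shows the key/value split.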
Chao Yu
d39f75a593 f2fs: add app/fs io stat
This patch enables internal app/fs io stats and introduces the virtual fs
nodes below for exposing stats info:
/sys/fs/f2fs/<dev>/iostat_enable
/proc/fs/f2fs/<dev>/iostat_info

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix wrong stat assignment]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-21 14:11:49 -07:00
Yunlong Song
48ac27052b f2fs: provide f2fs_balance_fs to __write_node_page
Let node writeback also do f2fs_balance_fs to ensure there are always enough free
segments.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-21 14:11:38 -07:00
Jaegeuk Kim
745ad3de2f f2fs: avoid cpu lockup
Before retrying to flush data or dentry pages, we need to release the CPU in
order to avoid triggering the watchdog.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-20 16:55:24 -07:00
Chao Yu
a2bd44aca5 f2fs: use spin_{,un}lock_irq{save,restore}
generic/361 reports the warning below. The cause is as follows: once
someone has entered the critical region of sbi.cp_lock, if
f2fs_stop_checkpoint is invoked via write_end_io from a triggered IRQ,
we will encounter a deadlock.

So this patch switches to spin_{,un}lock_irq{save,restore} so that the
critical region runs with IRQs disabled, avoiding the potential deadlock.

 irq event stamp: 83391573
 loop: Write error at byte offset 438729728, length 1024.
 hardirqs last  enabled at (83391573): [<c1809752>] restore_all+0xf/0x65
 hardirqs last disabled at (83391572): [<c1809eac>] reschedule_interrupt+0x30/0x3c
 loop: Write error at byte offset 438860288, length 1536.
 softirqs last  enabled at (83389244): [<c180cc4e>] __do_softirq+0x1ae/0x476
 softirqs last disabled at (83389237): [<c101ca7c>] do_softirq_own_stack+0x2c/0x40
 loop: Write error at byte offset 438990848, length 2048.
 ================================
 WARNING: inconsistent lock state
 4.12.0-rc2+ #30 Tainted: G           O
 --------------------------------
 inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
 xfs_io/7959 [HC1[1]:SC0[0]:HE0:SE1] takes:
  (&(&sbi->cp_lock)->rlock){?.+...}, at: [<f96f96cc>] f2fs_stop_checkpoint+0x1c/0x50 [f2fs]
 {HARDIRQ-ON-W} state was registered at:
   __lock_acquire+0x527/0x7b0
   lock_acquire+0xae/0x220
   _raw_spin_lock+0x42/0x50
   do_checkpoint+0x165/0x9e0 [f2fs]
   write_checkpoint+0x33f/0x740 [f2fs]
   __f2fs_sync_fs+0x92/0x1f0 [f2fs]
   f2fs_sync_fs+0x12/0x20 [f2fs]
   sync_filesystem+0x67/0x80
   generic_shutdown_super+0x27/0x100
   kill_block_super+0x22/0x50
   kill_f2fs_super+0x3a/0x40 [f2fs]
   deactivate_locked_super+0x3d/0x70
   deactivate_super+0x40/0x60
   cleanup_mnt+0x39/0x70
   __cleanup_mnt+0x10/0x20
   task_work_run+0x69/0x80
   exit_to_usermode_loop+0x57/0x85
   do_fast_syscall_32+0x18c/0x1b0
   entry_SYSENTER_32+0x4c/0x7b
 irq event stamp: 1957420
 hardirqs last  enabled at (1957419): [<c1808f37>] _raw_spin_unlock_irq+0x27/0x50
 hardirqs last disabled at (1957420): [<c1809f9c>] call_function_single_interrupt+0x30/0x3c
 softirqs last  enabled at (1953784): [<c180cc4e>] __do_softirq+0x1ae/0x476
 softirqs last disabled at (1953773): [<c101ca7c>] do_softirq_own_stack+0x2c/0x40

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&(&sbi->cp_lock)->rlock);
   <Interrupt>
     lock(&(&sbi->cp_lock)->rlock);

  *** DEADLOCK ***

 2 locks held by xfs_io/7959:
  #0:  (sb_writers#13){.+.+.+}, at: [<c11fd7ca>] vfs_write+0x16a/0x190
  #1:  (&sb->s_type->i_mutex_key#16){+.+.+.}, at: [<f96e33f5>] f2fs_file_write_iter+0x25/0x140 [f2fs]

 stack backtrace:
 CPU: 2 PID: 7959 Comm: xfs_io Tainted: G           O    4.12.0-rc2+ #30
 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
 Call Trace:
  dump_stack+0x5f/0x92
  print_usage_bug+0x1d3/0x1dd
  ? check_usage_backwards+0xe0/0xe0
  mark_lock+0x23d/0x280
  __lock_acquire+0x699/0x7b0
  ? __this_cpu_preempt_check+0xf/0x20
  ? trace_hardirqs_off_caller+0x91/0xe0
  lock_acquire+0xae/0x220
  ? f2fs_stop_checkpoint+0x1c/0x50 [f2fs]
  _raw_spin_lock+0x42/0x50
  ? f2fs_stop_checkpoint+0x1c/0x50 [f2fs]
  f2fs_stop_checkpoint+0x1c/0x50 [f2fs]
  f2fs_write_end_io+0x147/0x150 [f2fs]
  bio_endio+0x7a/0x1e0
  blk_update_request+0xad/0x410
  blk_mq_end_request+0x16/0x60
  lo_complete_rq+0x3c/0x70
  __blk_mq_complete_request_remote+0x11/0x20
  flush_smp_call_function_queue+0x6d/0x120
  ? debug_smp_processor_id+0x12/0x20
  generic_smp_call_function_single_interrupt+0x12/0x30
  smp_call_function_single_interrupt+0x25/0x40
  call_function_single_interrupt+0x37/0x3c
 EIP: _raw_spin_unlock_irq+0x2d/0x50
 EFLAGS: 00000296 CPU: 2
 EAX: 00000001 EBX: d2ccc51c ECX: 00000001 EDX: c1aacebd
 ESI: 00000000 EDI: 00000000 EBP: c96c9d1c ESP: c96c9d18
  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
  ? inherit_task_group.isra.98.part.99+0x6b/0xb0
  __add_to_page_cache_locked+0x1d4/0x290
  add_to_page_cache_lru+0x38/0xb0
  pagecache_get_page+0x8e/0x200
  f2fs_write_begin+0x96/0xf00 [f2fs]
  ? trace_hardirqs_on_caller+0xdd/0x1c0
  ? current_time+0x17/0x50
  ? trace_hardirqs_on+0xb/0x10
  generic_perform_write+0xa9/0x170
  __generic_file_write_iter+0x1a2/0x1f0
  ? f2fs_preallocate_blocks+0x137/0x160 [f2fs]
  f2fs_file_write_iter+0x6e/0x140 [f2fs]
  ? __lock_acquire+0x429/0x7b0
  __vfs_write+0xc1/0x140
  vfs_write+0x9b/0x190
  SyS_pwrite64+0x63/0xa0
  do_fast_syscall_32+0xa1/0x1b0
  entry_SYSENTER_32+0x4c/0x7b
 EIP: 0xb7786c61
 EFLAGS: 00000293 CPU: 2
 EAX: ffffffda EBX: 00000003 ECX: 08416000 EDX: 00001000
 ESI: 18b24000 EDI: 00000000 EBP: 00000003 ESP: bf9b36b0
  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b

Fixes: aaec2b1d1879 ("f2fs: introduce cp_lock to protect updating of ckpt_flags")
Cc: stable@vger.kernel.org
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-13 16:58:18 -07:00
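The locking scenario reported for a2bd44aca5 can be mimicked with a toy
userspace model of a single CPU. This is not kernel code: the struct and
the toy_* functions are invented for illustration and only track two
booleans, whether IRQs are enabled and whether the lock is held. The
point is that a plain spin_lock() leaves IRQs enabled, so an interrupt
handler taking the same lock on the same CPU would spin forever, while
the irqsave variant closes that window.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; this is not kernel code. */
struct toy_cpu {
	bool irqs_enabled;  /* can a hardirq preempt the current context? */
	bool holds_cp_lock; /* is the (single) spinlock held by this CPU? */
};

/* Models spin_lock(): takes the lock, leaves the IRQ state untouched. */
void toy_spin_lock(struct toy_cpu *c)
{
	assert(!c->holds_cp_lock);
	c->holds_cp_lock = true;
}

/* Models spin_lock_irqsave(): saves the IRQ state, disables IRQs, locks. */
bool toy_spin_lock_irqsave(struct toy_cpu *c)
{
	bool saved = c->irqs_enabled;

	assert(!c->holds_cp_lock);
	c->irqs_enabled = false;
	c->holds_cp_lock = true;
	return saved;
}

/* Models spin_unlock_irqrestore(): unlocks, then restores the saved state. */
void toy_spin_unlock_irqrestore(struct toy_cpu *c, bool saved)
{
	c->holds_cp_lock = false;
	c->irqs_enabled = saved;
}

/*
 * True if an interrupt handler that takes the same lock could run now and
 * spin forever on a lock its own CPU already holds.
 */
bool toy_irq_would_deadlock(const struct toy_cpu *c)
{
	return c->irqs_enabled && c->holds_cp_lock;
}
```

In this model, toy_irq_would_deadlock() is true after toy_spin_lock() and
false after toy_spin_lock_irqsave(), which mirrors the "Possible unsafe
locking scenario" section of the lockdep report.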
Chao Yu
0b1e7ba9ee f2fs: skip ->writepages for {meta,node}_inode during recovery
Skip ->writepages prior to ->writepage for {meta,node}_inode during
recovery, so that the unneeded loop in ->writepages can be avoided.

Moreover, check SBI_POR_DOING earlier while writing back pages.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-13 16:58:08 -07:00
Chao Yu
843d3364d7 f2fs: introduce io_list for serialize data/node IOs
Serialize data/node IOs by using a FIFO list instead of a mutex lock;
this helps enhance the concurrency of f2fs while keeping LFS
IO semantics.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-13 16:56:13 -07:00
Jaegeuk Kim
4a6ac1475b f2fs: remove unnecessary read cases in merged IO flow
Merged IO flow doesn't need to care about read IOs.

f2fs_submit_merged_bio -> f2fs_submit_merged_write
f2fs_submit_merged_bios -> f2fs_submit_merged_writes
f2fs_submit_merged_bio_cond -> f2fs_submit_merged_write_cond

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-13 16:55:56 -07:00
Jaegeuk Kim
74683b0ea0 f2fs: use f2fs_submit_page_bio for ra_meta_pages
This patch avoids using f2fs_submit_merged_bio for reads; this was the only
read case.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-13 16:55:54 -07:00
Chao Yu
a725708ca4 f2fs: introduce CP_TRIMMED_FLAG to avoid unneeded discard
Introduce CP_TRIMMED_FLAG to indicate that all invalid blocks were trimmed
before umount. When we later mount an image that contains the flag,
we don't record invalid blocks as undiscarded ones, so when fstrim is
triggered we can avoid issuing redundant discard commands.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:31:15 -07:00
Chao Yu
30d60edd7b f2fs: allow cpc->reason to indicate more than one reason
Change to use different bits of cpc->reason to indicate different status,
so cpc->reason can indicate more than one reason.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:31:13 -07:00
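The idea behind 30d60edd7b, one bit per checkpoint reason so that several
reasons can be recorded at once, can be sketched as below. The flag names
and values are placeholders chosen for this example and should not be
read as the values used in the f2fs headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values: one bit per checkpoint reason (assumed). */
#define TOY_CP_UMOUNT  0x01
#define TOY_CP_SYNC    0x02
#define TOY_CP_DISCARD 0x04

/* Hypothetical stand-in for the checkpoint control structure. */
struct toy_cp_control {
	int reason; /* bitwise OR of TOY_CP_* flags */
};

/* Record one more reason without losing the ones already set. */
void toy_cp_add_reason(struct toy_cp_control *cpc, int flag)
{
	assert(flag != 0 && (flag & (flag - 1)) == 0); /* exactly one bit */
	cpc->reason |= flag;
}

/* Test for a reason with a mask instead of an equality comparison. */
bool toy_cp_has_reason(const struct toy_cp_control *cpc, int flag)
{
	return (cpc->reason & flag) != 0;
}
```

With an enumerated reason, a check like "reason == X" silently fails once
a second reason is added; the mask test keeps working.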
Jaegeuk Kim
24f3c7e195 f2fs: give time to flush dirty pages for checkpoint
If all the threads are waiting for checkpoint, we have no chance to flush
required dirty pages.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:26:52 -07:00
Kaixu Xia
bdc8c12ddd f2fs: remove the redundant variable definition
The variable 'i' has been defined before, so here we can
use it directly.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:24:45 -07:00
Yunlei He
aa9de43b3b f2fs: allow write page cache when writing cp
This patch allows writing data to normal files while writing
a new checkpoint.

We relax three limitations for write_begin path:
1. data allocation
2. node allocation
3. variables in checkpoint

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:24:26 -07:00
Chao Yu
afc8c720de f2fs: don't track volatile file in dirty inode list
Don't track volatile files in the dirty inode list; otherwise, with the
data_flush option, the background thread will enter an endless loop
flushing the journal file's pages.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:24:20 -07:00
Kinglong Mee
96d73c33c4 f2fs: sanity check of crc_offset from raw checkpoint
A crc_offset that points too close to, or beyond, the end of the block
is invalid; sanity check it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:24:06 -07:00
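A bounds check of the kind described in 96d73c33c4 might look like the
sketch below. It assumes the CRC is a 32-bit value and that the block
size is passed in by the caller; the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if a 32-bit CRC stored at crc_offset lies entirely inside
 * a block of blk_size bytes.
 */
bool toy_crc_offset_valid(size_t crc_offset, size_t blk_size)
{
	assert(blk_size >= sizeof(uint32_t));
	/* The last valid start position leaves room for all four bytes. */
	return crc_offset <= blk_size - sizeof(uint32_t);
}
```

For a 4096-byte block, offset 4092 is the last position accepted;
4093 and above would make the CRC read run past the end of the block.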
Yunlei He
633f62b7dc f2fs: skip writeback meta pages if cp_mutex acquire failed
Skip writing back meta pages if acquiring cp_mutex fails; the checkpoint
will flush the dirty pages instead.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:16:23 -07:00
Jaegeuk Kim
9f5bdf3b0a f2fs: show more precise message on orphan recovery failure
This case is not caused by fsck.f2fs. The user needs to retry the mount.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:16:21 -07:00
Kinglong Mee
bf5320b6fb f2fs: le32_to_cpu for ckpt->cp_pack_total_block_count
Fixes: 22ad0b6ab4 ("f2fs: add bitmaps for empty or full NAT blocks")
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08 16:15:58 -07:00
Kinglong Mee
4db9ebac24 f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint
There are four places that get the crc value in f2fs_checkpoint;
add a new helper, cur_cp_crc, for them.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-08 22:01:23 -08:00
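A helper in the spirit of cur_cp_crc() from 4db9ebac24 could be sketched
as follows. The sketch works on a raw byte buffer rather than on the
on-disk checkpoint structure, and every name in it is hypothetical. It
decodes the stored value as little-endian, which is also the byte-order
concern that bf5320b6fb addresses for cp_pack_total_block_count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 32-bit value regardless of host byte order. */
uint32_t toy_le32_to_cpu(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical stand-in for a cur_cp_crc()-style helper: read the CRC
 * stored at crc_offset inside a raw checkpoint block.
 */
uint32_t toy_cur_cp_crc(const uint8_t *blk, size_t blk_size, size_t crc_offset)
{
	assert(blk_size >= 4 && crc_offset <= blk_size - 4);
	return toy_le32_to_cpu(blk + crc_offset);
}
```

Collecting the read in one helper means the offset handling and the
endianness conversion live in a single place instead of four.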
Chao Yu
3a40c74cce f2fs: show simple call stack in fault injection message
Previously, the kernel message showed the function in which the injection
was done, but unfortunately most of the callers are the same. To track
more information about the injection path, the upper caller's name needs
to be shown as well. This patch adds that ability.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/data.c
2017-03-08 22:00:43 -08:00
Jaegeuk Kim
273924c377 f2fs: add bitmaps for empty or full NAT blocks
This patch adds bitmaps to represent empty or full NAT blocks containing
free nid entries.

If we can find valid crc|cp_ver in the last block of checkpoint pack, we'll
use these bitmaps when building free nids. In order to avoid checkpointing
burden, up-to-date bitmaps will be flushed only during umount time. So,
normally we can get this gain, but when power-cut happens, we rely on fsck.f2fs
which recovers this bitmap again.

After this patch, we build free nids from nid #0 at mount time to make more
full NAT blocks, but in runtime, we check empty NAT blocks to load free nids
without loading any NAT pages from disk.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-08 21:57:15 -08:00
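The empty/full bitmaps described in 273924c377 can be pictured with the
userspace sketch below. The block count, the number of entries per NAT
block, and all names are assumptions made for this example. A block is
"full" when every nid in it is in use (no free nid to find there) and
"empty" when none is in use (every nid is free without reading the block).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NAT_BLOCKS        64  /* assumed number of NAT blocks      */
#define TOY_ENTRIES_PER_BLOCK 455 /* assumed nid entries per NAT block */

struct toy_nat_bitmaps {
	uint8_t full[TOY_NAT_BLOCKS / 8];  /* bit set: no free nid in block */
	uint8_t empty[TOY_NAT_BLOCKS / 8]; /* bit set: every nid is free    */
};

static void toy_set_bit(uint8_t *map, unsigned int nr, bool on)
{
	uint8_t mask = (uint8_t)(1u << (nr % 8));

	if (on)
		map[nr / 8] |= mask;
	else
		map[nr / 8] &= (uint8_t)~mask;
}

static bool toy_test_bit(const uint8_t *map, unsigned int nr)
{
	return ((map[nr / 8] >> (nr % 8)) & 1u) != 0;
}

/* Refresh both bits for one NAT block from its count of in-use entries. */
void toy_update_nat_bits(struct toy_nat_bitmaps *b, unsigned int blk,
			 unsigned int valid)
{
	assert(blk < TOY_NAT_BLOCKS && valid <= TOY_ENTRIES_PER_BLOCK);
	toy_set_bit(b->full, blk, valid == TOY_ENTRIES_PER_BLOCK);
	toy_set_bit(b->empty, blk, valid == 0);
}

bool toy_block_is_full(const struct toy_nat_bitmaps *b, unsigned int blk)
{
	return toy_test_bit(b->full, blk);
}

bool toy_block_is_empty(const struct toy_nat_bitmaps *b, unsigned int blk)
{
	return toy_test_bit(b->empty, blk);
}
```

When scanning for free nids, full blocks can be skipped and empty blocks
can be handed out directly, which is the disk-read saving the commit
message refers to.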
Jaegeuk Kim
69a0a6912f f2fs: remove build_free_nids() during checkpoint
Let's avoid build_free_nids() in checkpoint path.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-08 21:56:33 -08:00
Jaegeuk Kim
5b7c840833 f2fs: check last page index in cached bio to decide submission
If the cached bio has the last page's index, then we need to submit it.
Otherwise, we don't need to submit it and can wait for further IO merges.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-08 21:56:25 -08:00
Chao Yu
f948bcc51e f2fs: clean up with list_{first, last}_entry
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:53:44 -08:00
Jaegeuk Kim
99a5dca4d9 f2fs: return fs_trim if there is no candidate
If there is no candidate to submit a discard command for during f2fs_trim_fs,
let's return without a checkpoint.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:53:44 -08:00
Jaegeuk Kim
711f0385dc f2fs: avoid needless checkpoint in f2fs_trim_fs
f2fs_trim_fs() doesn't need to do a checkpoint if there are only newly
allocated data blocks, which don't change critical checkpoint data such as
nat and sit entries.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:53:43 -08:00
Jaegeuk Kim
5521ead704 f2fs: relax async discard commands more
This patch relaxes async discard commands to avoid waiting for their end_io
during checkpoint.
Instead of waiting for them during checkpoint, the wait is done when the
blocks are actually reused.

Test on initial partition of nvme drive.

 # time fstrim /mnt/test

Before : 6.158s
After : 4.822s

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:53:42 -08:00
Jaegeuk Kim
373bb0247a f2fs: support async discard based on v4.9
This patch is based on commit 275b66b09e85 (f2fs: support async discard).

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:53:30 -08:00
Jaegeuk Kim
7146292938 f2fs: resolve op and op_flags conflicts
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:53:26 -08:00
Jaegeuk Kim
640bdae24f f2fs: free meta pages if sanity check for ckpt is failed
This fixes the missing freeing of meta pages in the error case.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-05 17:38:12 -08:00
Chao Yu
2ea2e28982 f2fs: don't wait writeback for datas during checkpoint
Normally, while committing a checkpoint, we wait for all pages to be
written back, no matter whether a page holds data or metadata. So in a
scenario where lots of data IO is submitted along with metadata, we may
suffer long latency waiting for writeback during checkpoint.

In fact, we only care about persistence for pages with metadata, not
pages with data, because file system consistency depends only on
metadata. So, to avoid the long latency in the above scenario, let's
recognize and reference metadata in submitted IOs, and wait for writeback
only for metadata.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/data.c
2016-12-01 11:01:23 -08:00
Jaegeuk Kim
86f4d9f42e f2fs: make clean inodes when flushing inode page
This patch tries to make more clean inodes when flushing dirty inodes in
checkpoint.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-01 10:39:33 -08:00
Jaegeuk Kim
1d486e74cf f2fs: use BIO_MAX_PAGES for bio allocation
We don't need to allocate bio partially in order to maximize sequential writes.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-01 10:38:31 -08:00
Chao Yu
b1b14da24a f2fs: don't interrupt free nids building during nid allocation
Let build_free_nids support sync/async methods. In the allocation flow of
nids, we use the synchronous method, so that we can avoid looping in
alloc_nid when free memory is low; in unblock_operations and
f2fs_balance_fs_bg we use the asynchronous method, where a low memory
condition can interrupt us.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-01 10:38:22 -08:00
Chao Yu
7a2d5d5f81 f2fs: fix to release discard entries during checkpoint
In f2fs_fill_super, if any IO error occurs during recovery, cached
discard entries will be leaked. To avoid this, make write_checkpoint()
handle the memory release by itself. Besides, move
clear_prefree_segments to write_checkpoint for readability.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/checkpoint.c
2016-12-01 10:36:45 -08:00
Jaegeuk Kim
34a546cb04 f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack
We don't guarantee cp_addr is fixed by cp_version.
This is to sync with f2fs-tools.

Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-28 13:43:12 -08:00
Jaegeuk Kim
8da9e3f747 f2fs: backport from (4c1fad64 - Merge tag 'for-f2fs-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs)
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-10-17 16:38:44 -07:00
Chao Yu
26879fb101 f2fs: support lower priority asynchronous readahead in ra_meta_pages
Now, we use ra_meta_pages to read as many contiguous physical blocks as
possible to improve the performance of subsequent reads. However,
ra_meta_pages uses a synchronous readahead approach by submitting bios with
READ. Because READ has high priority, it is not suitable for preloading
blocks when it is uncertain when these RAed pages will be used.

This patch supports asynchronous readahead in ra_meta_pages by tagging bio
with READA flag in order to allow preloading.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-12 14:03:15 -07:00
Chao Yu
2b947003fa f2fs: don't tag REQ_META for temporary non-meta pages
In the recovery or checkpoint flow, we temporarily grab pages in the meta
inode's mapping for caching temporary data. The data in these pages is
not f2fs metadata, but we still tag them with the REQ_META flag. However,
a lower device like eMMC may apply some optimization to data of this type.
So, to avoid a wrong optimization, we'd better remove the flag
for temporary non-meta pages.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-12 14:01:46 -07:00
Jaegeuk Kim
6066d8cdb6 f2fs: merge meta writes as many as possible
This patch tries to merge IOs as many as possible when background flusher
conducts flushing the dirty meta pages.

[Before]

...
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124320, size = 4096
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124560, size = 32768
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 95720, size = 987136
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123928, size = 4096
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123944, size = 8192
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123968, size = 45056
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124064, size = 4096
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 97648, size = 1007616
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123776, size = 8192
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123800, size = 32768
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124624, size = 4096
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 99616, size = 921600
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123608, size = 4096
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123624, size = 77824
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123792, size = 4096
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123864, size = 32768
...

[After]

...
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 92168, size = 892928
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 93912, size = 753664
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 95384, size = 716800
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 96784, size = 712704
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 104160, size = 364544
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 104872, size = 356352
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 105568, size = 278528
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 106112, size = 319488
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 106736, size = 258048
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 107240, size = 270336
f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 107768, size = 180224
...

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-09 16:20:57 -07:00
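The effect shown in the before/after traces of 6066d8cdb6, fewer and
larger writes, comes from merging pages whose block addresses are
consecutive. The sketch below counts how many bios a sequence of block
addresses would need under that rule. It is a simplified userspace model
with hypothetical names; the per-bio block limit is a parameter here, not
a value taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many bios are needed for n block addresses when consecutive
 * addresses are merged into one bio of at most max_blocks blocks.
 */
size_t toy_count_merged_bios(const uint64_t *blkaddr, size_t n,
			     size_t max_blocks)
{
	size_t bios = 0, in_bio = 0, i;

	assert(max_blocks > 0);

	for (i = 0; i < n; i++) {
		int contiguous = i > 0 && blkaddr[i] == blkaddr[i - 1] + 1;

		if (!contiguous || in_bio == max_blocks) {
			bios++;     /* start a new bio */
			in_bio = 0;
		}
		in_bio++;
	}
	return bios;
}
```

Flushing pages in address order therefore matters: the same set of pages
submitted out of order produces many small bios, as in the [Before] trace.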
Jaegeuk Kim
60b99b486b f2fs: introduce a periodic checkpoint flow
This patch introduces a periodic checkpoint feature.
Note that this does not strictly enforce checkpoints in terms of trigger
timing; it just aims to help the user experience.
The default value is 60 seconds.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-09 16:20:57 -07:00
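The periodic trigger described in 60b99b486b boils down to comparing the
time elapsed since the last checkpoint against an interval. A minimal
sketch, assuming timestamps in whole seconds and using hypothetical names
(only the 60-second default comes from the commit message):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_DEF_CP_INTERVAL 60 /* seconds; default stated in the commit */

/*
 * Returns true once at least `interval` seconds have passed since the
 * last checkpoint. Callers poll this opportunistically, so the actual
 * checkpoint may happen somewhat later than the interval.
 */
bool toy_time_to_checkpoint(uint64_t now, uint64_t last_cp, uint64_t interval)
{
	assert(now >= last_cp);
	return now - last_cp >= interval;
}
```

Because the check only runs when some caller asks, the interval is a
lower bound on the spacing between periodic checkpoints, not an exact
schedule.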
Jaegeuk Kim
a7230d16d5 f2fs: check end_io for metapages before making next checkpoint blocks
This patch avoids producing new checkpoint blocks before the previous meta
pages have been written completely.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-09 16:20:51 -07:00
Jaegeuk Kim
80c545055d f2fs: use __GFP_NOFAIL to avoid infinite loop
__GFP_NOFAIL can avoid retrying the whole path of kmem_cache_alloc and
bio_alloc.
And, it also fixes the use cases of GFP_ATOMIC correctly.

Suggested-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-24 09:37:21 -07:00
Jaegeuk Kim
315df8398e f2fs: do not write any node pages related to orphan inodes
We should not write node pages when deleting orphan inodes.
In order to do that, we can easily set the POR_DOING flag earlier, before
entering the
orphan inode routine.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-20 08:59:42 -07:00
Chao Yu
8c14bfadea f2fs: handle error of f2fs_iget correctly
In recover_orphan_inode, whenever f2fs_iget fails, we make the kernel panic,
but that is not reasonable, because f2fs_iget can fail for many reasons,
including running out of memory.

So we change error handling method as below:
a) when finding no entry for the orphan inode, bug_on for catching bugs;
b) for other reasons, report it to caller.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-14 16:02:14 -07:00
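The two-way error policy listed in 8c14bfadea can be written as a small
classifier. This is a userspace sketch with invented names; it assumes
that "no entry for the orphan inode" is reported as -ENOENT, which the
commit message implies but does not spell out.

```c
#include <assert.h>
#include <errno.h>

/* What the caller should do with the result of the inode lookup. */
enum toy_iget_action {
	TOY_IGET_OK,     /* lookup succeeded, continue recovery       */
	TOY_IGET_BUG,    /* case a): missing entry, treat as a bug    */
	TOY_IGET_REPORT  /* case b): pass the error back to the caller */
};

/* err is 0 on success or a negative errno value on failure. */
enum toy_iget_action toy_classify_iget_error(int err)
{
	assert(err <= 0);

	if (err == 0)
		return TOY_IGET_OK;
	if (err == -ENOENT)
		return TOY_IGET_BUG;
	return TOY_IGET_REPORT;
}
```

Transient failures such as -ENOMEM fall into the last branch, so they no
longer bring the whole system down.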
Chao Yu
e90c2d2850 f2fs: invalidate temporary meta page
To avoid meeting garbage data in the next free node block at the end of the
warm node chain when doing recovery, we try to zero out that invalid block.

If the device does not support discard, our way of zeroing out the block is:
grab a temporary zeroed page in the meta inode, then issue a write request
with this page.

But we forgot to release that temporary page, so our memory usage will
increase without gaining any hit ratio benefit; it's better to free it
to save memory.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-05 08:19:21 -07:00
Chao Yu
f3f338caad f2fs: freeze filesystem when fail to update meta page due to IO error
In get_meta_page, we guarantee no failure for the returned page,
but sometimes an IO error from the device will result in returning a
non-updated page.

Then we still use this page as an updated one, and an exception could happen
when using this kind of page.

So in this condition, we'd better freeze the fs by making it read-only
and stop doing checkpoints.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-05 08:08:17 -07:00
Jaegeuk Kim
86531d6b84 f2fs: callers take care of the page from bio error
This patch makes the caller handle the page after its bio gets an error.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-05 08:08:07 -07:00
Chao Yu
bd936f8407 f2fs: cleanup write_orphan_inodes
Previously, since 'commit 4531929e39 ("f2fs: move grabing orphan
pages out of protection region")' was committed, in write_orphan_inodes()
we grabbed all meta pages in a batch before using them under the spinlock,
so that we could avoid the large delay of grabbing meta pages under the
spinlock.

Now, 'commit d6c67a4fee ("f2fs: revmove spin_lock for
write_orphan_inodes")' has removed the spinlock in write_orphan_inodes,
so the issue described above no longer exists; we'd better move
the grab operation back to its original place for readability.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-08-04 14:09:59 -07:00