In ext4_es_can_be_merged() when checking whether we can merge two
extents we should use EXT_MAX_BLOCKS instead of defining it manually.
Also if it is really the case we should notify userspace because clearly
there is a bug in extent status tree implementation since this should
never happen.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Pull cifs fix from Steve French:
"Small cifs fix for metadata caching"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix actimeo=0 corner case when cifs_i->time == jiffies
Caches from 'ext4_groupinfo_caches' may be in use by other mounts,
which have already existed. So, it is incorrect to destroy them when
newly requested mount fails.
Found by Linux File System Verification project (linuxtesting.org).
Signed-off-by: Andrey Tsyvarev <tsyvarev@ispras.ru>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
I have been running make namespacecheck to look for unneeded globals, and
found these in ext4.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
On a bigalloc,^flex_bg filesystem, the ext4_valid_block_bitmap
function fails to convert from blocks to clusters when spot-checking
the validity of the bitmap block that we've just read from disk. This
causes ext4 to think that the bitmap is garbage, which results in the
block group being taken offline when it's not necessary. Add in the
necessary EXT4_B2C() calls to perform the conversions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext4_bg_has_super() function doesn't know about the new rules for
where backup superblocks go on a sparse_super2 filesystem. Therefore,
block bitmap initialization doesn't know that it shouldn't reserve
space for backups in groups that are never going to contain backups.
The result of this is e2fsck complaining about the block bitmap being
incorrect (fortunately not in a way that results in cross-linked
files), so fix the whole thing.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
On a filesystem with a 1k block size, the group descriptors live in
block 2, not block 1. If the filesystem has bigalloc,meta_bg set,
however, the calculation of the group descriptor table location does
not take this into account and returns the wrong block number. Fix
the calculation to return the correct value for this case.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In ext4_xattr_set_handle() we have checked the xattr name's length. So
we should also check it in ext4_xattr_get() to avoid unneeded lookup
caused by invalid name.
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When we perform a data integrity sync we tag all the dirty pages with
PAGECACHE_TAG_TOWRITE at start of ext4_da_writepages. Later we check
for this tag in write_cache_pages_da and creates a struct
mpage_da_data containing contiguously indexed pages tagged with this
tag and sync these pages with a call to mpage_da_map_and_submit. This
process is done in while loop until all the PAGECACHE_TAG_TOWRITE
pages are synced. We also do journal start and stop in each iteration.
journal_stop could initiate journal commit which would call
ext4_writepage which in turn will call ext4_bio_write_page even for
delayed OR unwritten buffers. When ext4_bio_write_page is called for
such buffers, even though it does not sync them but it clears the
PAGECACHE_TAG_TOWRITE of the corresponding page and hence these pages
are also not synced by the currently running data integrity sync. We
will end up with dirty pages although sync is completed.
This could cause a potential data loss when the sync call is followed
by a truncate_pagecache call, which is exactly the case in
collapse_range. (It will cause generic/127 failure in xfstests)
To avoid this issue, we can use set_page_writeback_keepwrite instead of
set_page_writeback, which doesn't clear TOWRITE tag.
Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Pull nfsd fixes from Bruce Fields.
* 'for-3.15' of git://linux-nfs.org/~bfields/linux:
NFSD: Call ->set_acl with a NULL ACL structure if no entries
NFSd: call rpc_destroy_wait_queue() from free_client()
NFSd: Move default initialisers from create_client() to alloc_client()
v2: replace missing break in switch statement (as pointed out by Dave
Jones)
commit bce7560d49 (locks: consolidate checks for compatible
filp->f_mode values in setlk handlers) introduced a regression in the
F_GETLK handler.
flock64_to_posix_lock is a shared codepath between F_GETLK and F_SETLK,
but the f_mode checks should only be applicable to the F_SETLK codepaths
according to POSIX.
Instead of just reverting the patch, add a new function to do this
checking and have the F_SETLK handlers call it.
Cc: Dave Jones <davej@redhat.com>
Reported-and-Tested-by: Reuben Farrelly <reuben@reub.net>
Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
- fix a remote attribute size calculation bug that leads to a
transaction overrun
- add default ACLs to O_TMPFILE files
- Remove the EXPERIMENTAL tag from filesystems with metadata CRC
support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJTa+0vAAoJEK3oKUf0dfodQGkQAI4IzVYr4K02Rj7UcbZaGlUM
R9OAATojaQ8/hPDshzrhyLK7/KvF2tJvH9PhP6zj3bko3fmhofJ5/mB6LYIsQtt+
3rNd/jij9Icmq4+ouZPRDl00nJdnCZjJcfYys6N/tXwLNIvwKP04vjB4QoC1rxVv
j6L85yUkMpPohA0Wbf+PKrTVJDrtTOe+YpczciYgGKHr0YF27Bdy6iYSU3KvTvd+
wuqXvGAc9ARZDsrVHt8t6eh9OKRRk1RAV5vdwGwucBrVlnxGaspvia/85JyU3Kv0
F2EQ3fWcGQs5ydQjpvSZlEIttDqBDn/LiuncNctXIUvHpr+MQ73XMVrNLoNY1m6d
wQqXFQXT4e/vzJTXyQz/jYgzGl5t9Lvf/1Z5lFHliqhaBm1aNMhdjfCZhEpehoaQ
09JSVj8ZKLHZt3yRgwkZdOmM0bl4thJmY1Wf5O2EPMrk3NE3nZKiNG+W2U/sSFti
i12M4uVgInmeHoDIWFNL9kXp3fs+gr6HF5BNQOulm0ywzG3U1ozWGyKsnRmpPFQr
995voVKZKDP410wzp98UKpjXalmonYuTFLNUDEEjr2UKUWq6fRpvDdSeBSRirGxP
kdwfpgCZHDJlZEY7d4lv4Pv6L84KgYYHQpmbaFcPEAmJmlMZ4web1KqHl8TDy1hT
Z+STYvTImpXV9sP5TZYT
=79c6
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Dave Chinner:
"The main fix is adding support for default ACLs on O_TMPFILE opened
inodes to bring XFS into line with other filesystems. Metadata CRCs
are now also considered well enough tested to be fully supported, so
we're removing the shouty warnings issued at mount time for
filesystems with that format. And there's transaction block
reservation overrun fix.
Summary:
- fix a remote attribute size calculation bug that leads to a
transaction overrun
- add default ACLs to O_TMPFILE files
- Remove the EXPERIMENTAL tag from filesystems with metadata CRC
support"
* tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs:
xfs: remote attribute overwrite causes transaction overrun
xfs: initialize default acls for ->tmpfile()
xfs: fully support v5 format filesystems
Use fh_fsid when reffering to the fsid part of the filehandle. The
variable length auth field envisioned in nfsfh wasn't ever implemented.
Also clean up some lose ends around this and document the file handle
format better.
Btw, why do we even export nfsfh.h to userspace? The file handle very
much is kernel private, and nothing in nfs-utils include the header
either.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
commit 4ac7249ea5 have remove all EXPORT_SYMBOL,
linux/export.h is not needed, just clean it.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch uses exported inode_init_owner() to simplify codes in
f2fs_new_inode().
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
If we use slab memory in f2fs_issue_flush(), we will face memory pressure and
latency time caused by racing of kmem_cache_{alloc,free}.
Let's alloc memory in stack instead of slab.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Break up balance_leaf_paste_right into:
balance_leaf_paste_right_shift
balance_leaf_paste_right_shift_dirent
balance_leaf_paste_right_whole
and keep balance_leaf_paste_right as a handler to select which is appropriate.
Also reformat to adhere to CodingStyle.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Break up balance_leaf_paste_left into:
balance_leaf_paste_left_shift
balance_leaf_paste_left_shift_dirent
balance_leaf_paste_left_whole
and keep balance_leaf_paste_left as a handler to select which is appropriate.
Also reformat to adhere to CodingStyle.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_finish_node_paste from the code
in balance_leaf responsible for pasting new content into existing items
held in S[0].
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_finish_node_insert from the code
in balance_leaf responsible for inserting new items into S[0]
It has not been reformatted yet.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_new_nodes_insert from the code
in balance_leaf responsible for pasting new content into existing items
that may have been shifted into new nodes in the tree.
It has not been reformatted yet.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_new_nodes_insert from the code
in balance_leaf responsible for inserting new items into new nodes in
the tree.
It has not been reformatted yet.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_paste_right from the code in
balance_leaf responsible for pasting new contents into an existing item
located in the node to the right of S[0] in the tree.
It has not been reformatted yet.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_insert_right from the code in
balance_leaf responsible for inserting new items into the node to
the right of S[0] in the tree.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_paste_left from the code in
balance_leaf responsible for pasting new content into an existing item
located in the node to the left of S[0] in the tree.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch factors out a new balance_leaf_insert_left from the code in
balance_leaf responsible for inserting new items into the node to
the left of S[0] in the tree.
It is not yet formatted correctly.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch pushes the rest of the state variables in balance_leaf into
the tree_balance structure so we can use them when we split balance_leaf
into separate functions.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
The comments in balance_leaf are as bad as the code. This patch shifts
them around to fit in 80 columns and be easier to read.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
The hash detection code uses long ugly macros multiple times to get the same
value. This patch cleans it up to be easier to read.
[JK: Fixed up path leak in find_hash_out()]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Move the state locking and file descriptor reference out from the
callers and into nfs4_preprocess_stateid_op() itself.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch adds a tracepoint for f2fs_read_data_page to trace when page is
readed by user.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
This patch adds a tracepoint for f2fs_write_{meta,node,data}_pages to trace when
pages are fsyncing/flushing.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
This patch adds a tracepoint for f2fs_write_{meta,node,data}_page to trace when
page is writting out.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
This patch adds a tracepoint for f2fs_write_end to trace write op of user.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
This patch adds a tracepoint for f2fs_write_begin to trace write op of user.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
fix the following checkpatch warning:
WARNING: do {} while (0) macros should not be semicolon terminated
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
If the inode page is clean during its inode eviction, it'd better drop the page
to reduce further memory pressure.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
This patch reduces the lock granularity during write_begin.
When the system is under memory pressure, it would be better to reduce
the locking time for the data pages.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
We don't need to wait on page writeback for these cases.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>