We want to assign an offset when the dentry goes from null to linked, which
is always done by splice_dentry(). Notably, we should NOT assign an
offset when a dentry is first created and is still null.
BUG if we try to splice a non-null dentry (we shouldn't).
Signed-off-by: Sage Weil <sage@newdream.net>
If the version hasn't changed, don't rebuild the index.
Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
If we use the xattr_blob, clear the pointer so we don't release the memory
at the bottom of the fuction.
Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Simplify messenger locking, and close race between ceph_con_close() setting
the CLOSED bit and con_work() checking the bit, then taking the mutex.
Signed-off-by: Sage Weil <sage@newdream.net>
d_obtain_alias() doesn't return NULL, it returns an ERR_PTR().
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
We only need to pass in front_len. Callers can attach any other payload
pieces (middle, data) as they see fit.
Signed-off-by: Sage Weil <sage@newdream.net>
Returning ERR_PTR(-ENOMEM) is useless extra work. Return NULL on failure
instead, and fix up the callers (about half of which were wrong anyway).
Signed-off-by: Sage Weil <sage@newdream.net>
Since we don't need to maintain large pools of messages, we can just
use the standard mempool_t. We maintain a msgpool 'wrapper' because we
need the mempool_t* in the alloc function, and mempool gives us only
pool_data.
Signed-off-by: Sage Weil <sage@newdream.net>
ceph_sb_to_client and ceph_client are really identical, we need to dump
one; while function ceph_client is confusing with "struct ceph_client",
ceph_sb_to_client's definition is more clear; so we'd better switch all
call to ceph_sb_to_client.
-static inline struct ceph_client *ceph_client(struct super_block *sb)
-{
- return sb->s_fs_info;
-}
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
This would only trigger if we bailed out before resetting r_con_filling_msg
because the server reply was corrupt (oversized).
Signed-off-by: Sage Weil <sage@newdream.net>
"xattr" is never NULL here. We took care of that in the previous
if statement block.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Following Nick Piggin patches in btrfs, pagecache pages should be
allocated with __page_cache_alloc, so they obey pagecache memory
policies.
Also, using add_to_page_cache_lru instead of using a private
pagevec where applicable.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
The uniqueid field sent by the server when unix extensions are enabled
is currently used sometimes when it shouldn't be. The readdir codepath
is correct, but most others are not. Fix it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We use this value to find an inode within the hash bucket, so we can't
change this without re-hashing the inode. For now, treat this value
as immutable.
Eventually, we should probably use an inode number change on a path
based operation to indicate that the lookup cache is invalid, but that's
a bit more code to deal with.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The old cifs_revalidate logic always revalidated hardlinked inodes.
This hack allowed CIFS to pass some connectathon tests when server inode
numbers aren't used (basic test7, in particular).
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Sparse does not like inline function declared without body,
because it is not part of the standard kernel practice.
The xattr_handler tables can be declared static.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Sparse detected that unsigned pointer was being passed as int pointer.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
[fixed up to deal with code refactoring]
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Add new extended inode types that store the xattr_id field.
Also add the necessary code changes to make xattrs visibile.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
This patch adds support for mapping xattr ids (stored in inodes)
into the on-disk location of the xattrs themselves.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
If we abort a request, we return to caller, but the request may still
complete. And if we hold the dir FILE_EXCL bit, we may not release a
lease when sending a request. A simple un-tar, control-c, un-tar again
will reproduce the bug (manifested as a 'Cannot open: File exists').
Ensure we invalidate affected dentry leases (as well dir I_COMPLETE) so
we don't have valid (but incorrect) leases. Do the same, consistently, at
other sites where I_COMPLETE is similarly cleared.
Signed-off-by: Sage Weil <sage@newdream.net>
When we abort requests we need to prevent fill_trace et al from doing
anything that relies on locks held by the VFS caller. This fixes a race
between the reply handler and the abort code, ensuring that continue
holding the dir mutex until the reply handler completes.
Signed-off-by: Sage Weil <sage@newdream.net>
We would occasionally BUG out in the reply handler because r_reply was
nonzero, due to a race with ceph_mdsc_do_request temporarily setting
r_reply to an ERR_PTR value. This is unnecessary, messy, and also wrong
in the EIO case.
Clean up by consistently using r_err for errors and r_reply for messages.
Also fix the abort logic to trigger consistently for all errors that return
to the caller early (e.g., EIO from timeout case). If an abort races with
a reply, use the result from the reply.
Also fix locking for r_err, r_reply update in the reply handler.
Signed-off-by: Sage Weil <sage@newdream.net>
Filesystems with delalloc support may dirty inode during writepages.
As result inode will have dirty metadata flags even after write_inode.
In fact we have two dedicated functions for proper data and metadata
writeback. It is reasonable to separate flags updates in two stages.
https://bugzilla.kernel.org/show_bug.cgi?id=15906
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
When umount calls sync_filesystem(), we first do a WB_SYNC_NONE
writeback to kick off writeback of pending dirty inodes, then follow
that up with a WB_SYNC_ALL to wait for it. Since umount already holds
the sb s_umount mutex, WB_SYNC_NONE ends up doing nothing and all
writeback happens as WB_SYNC_ALL. This can greatly slow down umount,
since WB_SYNC_ALL writeback is a data integrity operation and thus
a bigger hammer than simple WB_SYNC_NONE. For barrier aware file systems
it's a lot slower.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Prior to 2.6.32, setting /proc/sys/vm/dirty_writeback_centisecs disabled
periodic dirty writeback from kupdate. This got broken and now causes
excessive sys CPU usage if set to zero, as we'll keep beating on
schedule().
Cc: stable@kernel.org
Reported-by: Justin Maggard <jmaggard10@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
For kmap_atomic() we call kunmap_atomic() on the returned pointer.
That's different from kmap() and kunmap() and so it's easy to get them
backwards.
Cc: Stable <stable@kernel.org>
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
All vectors of address_space_operations should be initialized
by the filesystem. Add the missing parts.
This is actually an optimization, by using
__set_page_dirty_nobuffers. The default, in case of NULL,
would be __set_page_dirty_buffers which has these extar if(s).
.releasepage && .invalidatepage should both not be called
because page_private() is NULL in exofs. Put a WARN_ON if
they are called, to indicate the Kernel has changed in this
regard, if when it does.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>