...for clarity and so we can reuse the name for the real name_len.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Add a label that we can goto on error, and reduce some of the
if/then/else indentation in this function.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
...to remove ambiguity about how these values are interpreted when
passing in more complex values as arguments.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
803bf5ec25 ("fs/exec.c: restrict initial
stack space expansion to rlimit") attempts to limit the initial stack to
20*PAGE_SIZE. Unfortunately, in attempting ensure the stack is not
reduced in size, we ended up not changing the stack at all.
This size reduction check is not necessary as the expand_stack call does
this already.
This caused a regression in UML resulting in most guest processes being
killed.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jouni Malinen <j@w1.fi>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many usages of seq_file use RCU protected lists, so non RCU
iterators will not work safely.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the commit_metadata export operation for XFS.
- Takes one inode to be committed.
- Forces the log up to the lsn of the inode.
- Doesn't force the log if the inode doesn't have a pincount.
Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
[bfields@citi.umich.edu: trivial whitespace fix]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
- Add commit_metadata export_operation to allow the underlying filesystem to
decide how to commit an inode most efficiently.
- Usage of nfsd_sync_dir and write_inode_now has been replaced with the
commit_metadata function that takes a svc_fh.
- The commit_metadata function calls the commit_metadata export_op if it's
there, or else falls back to sync_inode instead of fsync and write_inode_now
because only metadata need be synced here.
- nfsd4_sync_rec_dir now uses vfs_fsync so that commit_metadata can be static
Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
cachefiles_delete_object() can race with rename. It gets the parent directory
of the object it's asked to delete, then locks it - but rename may have changed
the object's parent between the get and the completion of the lock.
However, if such a circumstance is detected, we abandon our attempt to delete
the object - since it's no longer in the index key path, it won't be seen
again by lookups of that key. The assumption is that cachefilesd may have
culled it by renaming it to the graveyard for later destruction.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This adds reader's lock for the_nilfs->cno in nilfs_ioctl_sync,
for the_nilfs->cno should be proctected by segctor_sem when reading.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
commit 1e41568d73 ("Take ima_path_check()
in nfsd past dentry_open() in nfsd_open()") moved this code back to its
original location but missed the "else".
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make sure that automount "symlinks" are followed regardless of LOOKUP_FOLLOW;
it should have no effect on them.
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This is a trivial patch to remove unnecessary condition.
load_segment_summary() checks crc of segment_summary OR crc of whole
log data blocks based on boolean argument full_check. However,
callers of the function pass only 1 as full_check, which means only
whole log data blocks checking code is running all the time.
This patch deletes the condition and full_check argument and also
deletes enum 'NILFS_SEG_FAIL_CHECKSUM_SEGSUM' and corresponding case
clause, for it is nolonger used anymore.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Add __percpu sparse annotations to fs.
These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
There is currently a bug in sysfs_sd_setattr inherited from
sysfs_setattr in 2.6.32 where the first time we set the attributes
on a sysfs file we allocate backing store but do not set the
backing store attributes. Resulting in overly restrictive
permissions on sysfs files.
The fix is to simply modify the code so that it always executes
when we update the sysfs attributes, as we did in 2.6.31 and earlier.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Jean Delvare <khali@linux-fr.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Calls to ext4_handle_dirty_metadata should only pass in an inode
pointer for inode-specific metadata, and not for shared metadata
blocks such as inode table blocks, block group descriptors, the
superblock, etc.
The BUG_ON can get tripped when updating a special device (such as a
block device) that is opened (so that i_mapping is set in
fs/block_dev.c) and the file system is mounted in no journal mode.
Addresses-Google-Bug: #2404870
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The cached read and write paths initialize fattr->time_start in their
setup procedures. The value of fattr->time_start is propagated to
read_cache_jiffies by nfs_update_inode(). Subsequent calls to
nfs_attribute_timeout() will then use a good time stamp when
computing the attribute cache timeout, and squelch unneeded GETATTR
calls.
Since the direct I/O paths erroneously leave the inode's
fattr->time_start field set to zero, read_cache_jiffies for that inode
is set to zero after any direct read or write operation. This
triggers an otw GETATTR or ACCESS call to update the file's attribute
and access caches properly, even when the NFS READ or WRITE replies
have usable post-op attributes.
Make sure the direct read and write setup code performs the same fattr
initialization as the cached I/O paths to prevent unnecessary GETATTR
calls.
This was likely introduced by commit 0e574af1 in 2.6.15, which appears
to add new nfs_fattr_init() call sites in the cached read and write
paths, but not in the equivalent places in fs/nfs/direct.c. A
subsequent commit in the same series, 33801147, introduces the
fattr->time_start field.
Interestingly, the direct write reschedule path already has a call to
nfs_fattr_init() in the right place.
Reported-by: Quentin Barnes <qbarnes@yahoo-inc.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Delay discarding buffers in journal_unmap_buffer until
we know that "add to orphan" operation has definitely been
committed, otherwise the log space of committing transation
may be freed and reused before truncate get committed, updates
may get lost if crash happens.
Signed-off-by: dingdinghua <dingdinghua@nrchpc.ac.cn>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext4_fiemap() rounds the length of the requested range down to
blocksize, which is is not the true number of blocks that cover the
requested region. This problem is especially impressive if the user
requests only the first byte of a file: not a single extent will be
reported.
We fix this by calculating the last block of the region and then
subtract to find the number of blocks in the extents.
Signed-off-by: Leonard Michlmayr <leonard.michlmayr@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Just a pet peeve of mine; we had a mishash of calls with either __func__
or "function_name" and the latter tends to get out of sync.
I think it's easier to just hide the __func__ in a macro, and it'll
be consistent from then on.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When we wait for an inode through reiserfs_iget(), we hold
the reiserfs lock. And waiting for an inode may imply waiting
for its writeback. But the inode writeback path may also require
the reiserfs lock, which leads to a deadlock.
We just need to release the reiserfs lock from reiserfs_iget()
to fix this.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Chris Mason <chris.mason@oracle.com>
Commit e22f628395 introduced a build
breakage for ARM devtree work: the THIS_MODULE macro was added, but we
don't have module.h
This change adds the necessary #include to get THIS_MODULE defined.
While we could just replace it with NULL (PROC_FS is a bool, not a
tristate), using THIS_MODULE will prevent unexpected breakage if we
ever do compile this as a module.
Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Test the value that was just allocated rather than the previously tested one.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r@
expression *x;
expression e;
identifier l;
@@
if (x == NULL || ...) {
... when forall
return ...; }
... when != goto l;
when != x = e
when != &x
*x == NULL
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
This moves iterator to submit write requests for a series of logs into
segbuf.c, and hides nilfs_segbuf_write() and nilfs_segbuf_wait() in
the file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This replaces s_dirt flag use in nilfs with a new flag added on the
nilfs object. The s_dirt flag was used to indicate if
sop->write_super() should be called, however the current version of
nilfs does not use the callback. Thus, it can be replaced with the
own flag.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Jiro SEKIBA <jir@unicus.jp>
This will clean up nilfs_segctor_req struct and the obscure request
argument passed among private methods of segment constructor.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This is a trivial patch to delete unnecessary condition in nilfs_dat_translate.
nilfs_dat_translate() will asign translated address to *blocknrp if blocknrp
is not NULL. However the condition is unneeded, because all callers of
nilfs_dat_translate() pass blocknrp properly.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_error() calls nilfs_detach_segment_constructor() if
errors=remount-ro option is specified, and this may lead to a hang due
to recursive locking of, for instance, nilfs->ns_segctor_sem and
others.
In this case, detaching segment constructor is not necessary because
read-only flag is set to the filesystem and further writes are
blocked.
This fixes the potential hang issue by removing the
nilfs_detach_segment_constructor() call from nilfs_error.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
A few nilfs2 ioctls need to ask for and then later release write
access to the mount in order to avoid potential write to read-only
mounts.
This adds the missing mnt_want_write and mnt_drop_write in
nilfs_ioctl_change_cpmode, nilfs_ioctl_delete_checkpoint, and
nilfs_ioctl_clean_segments.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This adds a function to send discard requests for given array of
segment numbers, and calls the function when garbage collection
succeeded.
Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
My test do: fallocate a big file and do write. The file is 512M, but
after file write is done btrfs-debug-tree shows:
item 6 key (257 EXTENT_DATA 0) itemoff 3516 itemsize 53
extent data disk byte 1103101952 nr 536870912
extent data offset 0 nr 399634432 ram 536870912
extent compression 0
Looks like a regression introducted by
6c7d54ac87, where we set wrong slot.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
If we have a pinned inode it must have a log item attached to it.
Usually that log item will have ili_last_lsn already set, in which
case we only need to flush the log up to that LSN instead of doing a
full log force. This gives speedups of about 5% in some fsync heavy
workloads.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
file_remove_suid already calls into ->setattr to clear the suid and
sgid bits if needed, no need to start a second transaction to do it
ourselves.
Note that xfs_write_clear_setuid issues a sync transaction while the
path through ->setattr doesn't, but that is consistant with the
other filesystems.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
This patch solves a corner case during allocation which occurs if both
metadata (indirect) and data blocks are required but there is an
obstacle in the filesystem (e.g. a resource group header or another
allocated block) such that when the allocation is requested only
enough blocks for the metadata are returned.
By changing the exit condition of this loop, we ensure that a
minimum of one data block will always be returned.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
We need this one-liner to signal the mount helper of the 'insufficient journals' condition.
Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Fix the mapping of the NFSERR_SERVERFAULT error
NFS: Remove a redundant check for PageFsCache in nfs_migrate_page()
NFS: Fix a bug in nfs_fscache_release_page()
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] qla2xxx: Obtain proper host structure during response-queue processing.
[SCSI] compat_ioct: fix bsg SG_IO
[SCSI] qla2xxx: make msix interrupt handler safe for irq
[SCSI] zfcp: Report FC BSG errors in correct field
[SCSI] mptfusion : mptscsih_abort return value should be SUCCESS instead of value 0.
When reserving stack space for a new process, make sure we're not
attempting to expand the stack by more than rlimit allows.
This fixes a bug caused by b6a2fea393 ("mm:
variable length argument support") and unmasked by
fc63cf2370 ("exec: setup_arg_pages() fails
to return errors").
This bug means that when limiting the stack to less the 20*PAGE_SIZE (eg.
80K on 4K pages or 'ulimit -s 79') all processes will be killed before
they start. This is particularly bad with 64K pages, where a ulimit below
1280K will kill every process.
To test, do:
'ulimit -s 15; ls'
before and after the patch is applied. Before it's applied, 'ls' should
be killed. After the patch is applied, 'ls' should no longer be killed.
A stack limit of 15KB since it's small enough to trigger 20*PAGE_SIZE.
Also 15KB not a multiple of PAGE_SIZE, which is a trickier case to handle
correctly with this code.
4K pages should be fine to test with.
[kosaki.motohiro@jp.fujitsu.com: cleanup]
[akpm@linux-foundation.org: cleanup cleanup]
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is used by tcgetsid(3).
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some places in kernel need to iterate over a hlist in seq_file,
so provide some common helpers.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
md ioctls are now handled by the md driver itself, but mdadm
may call RAID_VERSION on other devices as well. Mark the command
as IGNORE_IOCTL so this fails silently rather than printing
an annoying message.
Reported-by: "Michael S. Tsirkin" <m.s.tsirkin@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix dentry hash calculation for case-insensitive mounts
[CIFS] Don't cache timestamps on utimes due to coarse granularity
[CIFS] Maximum username length check in session setup does not match
cifs: fix length calculation for converted unicode readdir names
[CIFS] Add support for TCP_NODELAY
For NFSv2 and v3:
O_DIRECT writes are always synchronous, and aren't cached, so nothing
should be flushed when closing an NFS O_DIRECT file descriptor. Thus
there are no write errors to report on close(2).
In addition, there's no cached data to verify on the next open(2),
so we don't need clean GETATTR results at close time to compare with.
Thus, there's no need for the nfs_revalidate_inode() call when closing
an NFS O_DIRECT file. This reduces the number of synchronous
on-the-wire requests for a simple open-write-close of an NFS O_DIRECT
file by roughly 20%.
For NFSv4:
Call nfs4_do_close() with wait set to zero when closing an NFS
O_DIRECT file. The CLOSE will go on the wire, but the application
won't wait for it to complete.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The bytes counted by the performance counters for NFS writes should
reflect write and sync errors. If the write(2) system call reports
an error, the bytes should not be counted. And, if the write is
short, the actual number of bytes that was written should be counted,
not the number of bytes that was requested.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>