Pull CIFS fixes from Steve French:
"A set of small cifs fixes fixing a memory leak, kernel oops, and
infinite loop (and some spotted by Coverity)"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
Fix warning
Fix another dereference before null check warning
CIFS: session servername can't be null
Fix warning on impossible comparison
Fix coverity warning
Fix dereference before null check warning
Don't ignore errors on encrypting password in SMBTcon
Fix warning on uninitialized buftype
cifs: potential memory leaks when parsing mnt opts
cifs: fix use-after-free bug in find_writable_file
cifs: smb2_clone_range() - exit on unhandled error
Previously commit 14ece1028b added
support for syncing the parent directory of newly created inodes to
make sure that the inode is not lost after a power failure in
no-journal mode.
However, this does not work in the majority of cases, namely:
- if the directory has inline data
- if the directory is already indexed
- if the directory already has at least one block and:
- the new entry fits into it
- or we've successfully converted it to indexed
So in those cases we might lose the inode entirely even after fsync in
the no-journal mode. This also includes ext2 default mode obviously.
I noticed this while running xfstest generic/321. Even though the
test should fail (we need to run fsck after a crash in no-journal mode),
I could not find newly created entries even when they had been fsynced
beforehand.
Fix this by adjusting the ext4_add_entry() successful exit paths to set
the inode EXT4_STATE_NEWENTRY so that fsync has the chance to fsync the
parent directory as well.
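The idea of marking the inode on every successful exit path can be
sketched in user-space C. This is only a simplified model: dir_kind,
inode_model, STATE_NEWENTRY_MODEL, insert_model() and add_entry_model()
are hypothetical names used for illustration, not the ext4 code.

```c
#include <errno.h>

/* Hypothetical model of the directory layouts listed above. */
enum dir_kind { DIR_INLINE, DIR_INDEXED, DIR_LINEAR };

#define STATE_NEWENTRY_MODEL 0x1u	/* stand-in for EXT4_STATE_NEWENTRY */

struct inode_model {
	unsigned int state;
};

/* Pretend insertion: fails with -ENOSPC when asked to, else succeeds. */
static int insert_model(enum dir_kind kind, int should_fail)
{
	(void)kind;
	return should_fail ? -ENOSPC : 0;
}

static int add_entry_model(struct inode_model *inode, enum dir_kind kind,
			   int should_fail)
{
	int err;

	if (kind == DIR_INLINE) {
		err = insert_model(kind, should_fail);
		goto out;	/* modelled as its own exit path */
	}
	if (kind == DIR_INDEXED) {
		err = insert_model(kind, should_fail);
		goto out;	/* modelled as its own exit path */
	}
	err = insert_model(kind, should_fail);
out:
	/* Every successful path flags the inode so fsync can sync the parent. */
	if (!err)
		inode->state |= STATE_NEWENTRY_MODEL;
	return err;
}
```

Funnelling all exits through one label is just one way to make sure no
successful path skips the flag.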
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Frank Mayhar <fmayhar@google.com>
Cc: stable@vger.kernel.org
If something went wrong with creating a debugfs file/symlink/directory,
that value could be passed down into debugfs again as a parent dentry.
To make caller code simpler, just error out if this happens, and don't
crash the kernel.
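A minimal user-space sketch of such a check, assuming the failed
creation is reported as an error-encoded pointer. ERR_PTR(), PTR_ERR()
and IS_ERR() below are simplified stand-ins for the kernel helpers, and
create_entry() is a hypothetical name, not the debugfs API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct entry {
	struct entry *parent;
	const char *name;
};

/*
 * Hypothetical creation helper: if the caller passes the error result of
 * an earlier failed creation as the parent, hand that error back instead
 * of dereferencing it.
 */
static struct entry *create_entry(const char *name, struct entry *parent,
				  struct entry *storage)
{
	if (IS_ERR(parent))
		return parent;		/* error out, do not crash */

	storage->parent = parent;	/* NULL parent means "root" here */
	storage->name = name;
	return storage;
}
```

With this shape a caller can chain creations and check for an error
only once at the end.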
Reported-by: Alex Elder <elder@linaro.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Alex Elder <elder@linaro.org>
During the v3.20/v4.0 cycle, I had originally had the code manage the
inode->i_flctx pointer using a compare-and-swap operation instead of the
i_lock.
Sasha Levin though hit a problem while testing with trinity that made me
believe that that wasn't safe. At the time, changing the code to protect
the i_flctx pointer seemed to fix the issue, but I now think that was
just coincidence.
The issue was likely the same race that Kirill Shutemov hit while
testing the pre-rc1 v4.0 kernel and that Linus spotted. Due to the way
that the spinlock was dropped in the middle of flock_lock_file, you
could end up with multiple flock locks for the same struct file on the
inode.
Reinstate the use of a CAS operation to assign this pointer since it's
likely to be more efficient and gets the i_lock completely out of the
file locking business.
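The lock-free assignment can be illustrated with C11 atomics standing
in for the kernel's compare-and-swap primitive. This is a sketch under
that assumption: inode_model, lock_context and get_lock_context() are
hypothetical names, not the fs/locks.c code.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct lock_context {
	int nr_locks;
};

struct inode_model {
	_Atomic(struct lock_context *) flctx;
};

/* Return the inode's lock context, installing a new one if none exists. */
static struct lock_context *get_lock_context(struct inode_model *inode)
{
	struct lock_context *ctx = atomic_load(&inode->flctx);
	struct lock_context *expected = NULL;
	struct lock_context *new_ctx;

	if (ctx)
		return ctx;

	new_ctx = calloc(1, sizeof(*new_ctx));
	if (!new_ctx)
		return NULL;

	/* Install only if the pointer is still NULL; no lock is taken. */
	if (atomic_compare_exchange_strong(&inode->flctx, &expected, new_ctx))
		return new_ctx;

	/* Lost the race: drop our copy and use the winner's context. */
	free(new_ctx);
	return expected;
}
```

On a failed exchange, C11 stores the current value in `expected`, so
the loser of the race returns the context that was actually installed.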
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
As Bruce points out, there's no compelling reason to change /proc/locks
output at this point. If we did want to do this, then we'd almost
certainly want to introduce a new file to display this info (maybe via
debugfs?).
Let's remove the dead WE_CAN_BREAK_LSLK_NOW ifdef here and just plan to
stay with the legacy format.
Reported-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
The current prototypes for these operations are somewhat awkward as they
deal with fl_owners but take struct file_lock arguments. In the future,
we'll want to be able to take references without necessarily dealing
with a struct file_lock.
Change them to take fl_owner_t arguments instead and have the callers
deal with assigning the values to the file_lock structs.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
In the event that we get an F_UNLCK request on an inode that has no lock
context, there is no reason to allocate one. Change
locks_get_lock_context to take a "type" pointer and avoid allocating a
new context if it's F_UNLCK.
Then, fix the callers to return appropriately if that function returns
NULL.
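Roughly, the behaviour described above looks like the following
user-space sketch. It assumes the lock type is passed as a plain int;
inode_model, get_lock_context() and do_lock_request() are hypothetical
names, and the choice of -ENOMEM for the failure case is an assumption
of this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

struct lock_context {
	int nr_locks;
};

struct inode_model {
	struct lock_context *flctx;
};

/* Only allocate a context when the request could actually add a lock. */
static struct lock_context *get_lock_context(struct inode_model *inode,
					     int type)
{
	if (inode->flctx || type == F_UNLCK)
		return inode->flctx;

	inode->flctx = calloc(1, sizeof(*inode->flctx));
	return inode->flctx;
}

static int do_lock_request(struct inode_model *inode, int type)
{
	struct lock_context *ctx = get_lock_context(inode, type);

	if (!ctx) {
		/* No context on unlock: nothing was locked, so succeed. */
		return type == F_UNLCK ? 0 : -ENOMEM;
	}

	if (type == F_UNLCK) {
		if (ctx->nr_locks > 0)
			ctx->nr_locks--;
	} else {
		ctx->nr_locks++;
	}
	return 0;
}
```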
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Annotate the insert, remove and iterate functions to show that they
need blocked_lock_lock held.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
We know that the locks being passed into this function are of the
correct type, now that they live on their own lists.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Since the following change
commit bd61e0a9c8
Author: Jeff Layton <jlayton@primarydata.com>
Date: Fri Jan 16 15:05:55 2015 -0500
locks: convert posix locks to file_lock_context
all POSIX locks are kept on their own separate list, so the test is
redundant.
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Jeff Layton <jlayton@primarydata.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
When xfstests' auto group is run on a bigalloc filesystem with a
4.0-rc3 kernel, e2fsck failures and kernel warnings occur for some
tests. e2fsck reports incorrect iblocks values, and the warnings
indicate that the space reserved for delayed allocation is being
overdrawn at allocation time.
Some of these errors occur because the reserved space is incorrectly
decreased by one cluster when ext4_ext_map_blocks satisfies an
allocation request by mapping an unused portion of a previously
allocated cluster. Because a cluster's worth of reserved space was
already released when it was first allocated, it should not be released
again.
This patch appears to correct the e2fsck failure reported for
generic/232 and the kernel warnings produced by ext4/001, generic/009,
and generic/033. Failures and warnings for some other tests remain to
be addressed.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In ext4_zero_range(), removing a file's entire block range from the
extent status tree removes all records of that file's delalloc extents.
The delalloc accounting code uses this information, and its loss can
then lead to accounting errors and kernel warnings at writeback time and
subsequent file system damage. This is most noticeable on bigalloc
file systems where code in ext4_ext_map_blocks() handles cases where
delalloc extents share clusters with a newly allocated extent.
Because we're not deleting a block range and are correctly updating the
status of its associated extent, there is no need to remove anything
from the extent status tree.
When this patch is combined with an unrelated bug fix for
ext4_zero_range(), kernel warnings and e2fsck errors reported during
xfstests runs on bigalloc filesystems are greatly reduced without
introducing regressions on other xfstests-bld test scenarios.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently there is a bug in the zero range code which causes zero range
calls to allocate only the block-aligned portion of the range, while
ignoring the rest in some cases.
In some cases, namely if the end of the range is past i_size, we do
attempt to preallocate the last nonaligned block. However, this might
cause the kernel to BUG() for some carefully designed zero range
requests on setups where page size > block size.
Fix this problem by first preallocating the entire range, including
the nonaligned edges, and converting the written extents to unwritten
in the next step. This approach also has the advantage of keeping the
range as linearly contiguous as possible.
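As a worked illustration of the difference between the block-aligned
portion of a byte range and the whole range including its nonaligned
edges, consider the arithmetic below. It is only a sketch: struct range,
aligned_interior() and covering_range() are hypothetical helpers, not
the ext4_zero_range() code.

```c
#include <stdint.h>

struct range {
	uint64_t start;	/* inclusive, in bytes */
	uint64_t end;	/* exclusive, in bytes */
};

/* Largest block-aligned sub-range fully inside [offset, offset + len). */
static struct range aligned_interior(uint64_t offset, uint64_t len,
				     uint64_t bs)
{
	struct range r;

	r.start = (offset + bs - 1) / bs * bs;	/* round start up */
	r.end = (offset + len) / bs * bs;	/* round end down */
	if (r.end < r.start)
		r.end = r.start;	/* range lies within a single block */
	return r;
}

/* Smallest block-aligned range covering [offset, offset + len). */
static struct range covering_range(uint64_t offset, uint64_t len,
				   uint64_t bs)
{
	struct range r;

	r.start = offset / bs * bs;			/* round start down */
	r.end = (offset + len + bs - 1) / bs * bs;	/* round end up */
	return r;
}
```

For offset 1000, length 10000 and a 4096-byte block, the aligned
interior is [4096, 8192) while the covering range is [0, 12288), so
handling only the interior leaves bytes 1000-4095 and 8192-10999
untouched.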
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This is a leftover of commit 71d4f7d032
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
bdi->dev now never goes away, so this function became useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In this if statement, the earlier condition is redundant because the
later one already covers it.
Signed-off-by: Weiyuan <weiyuan.wei@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Remove unused header files and header files which are included in
ext4.h.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the first mount in a shared subtree is locked, don't unmount the
shared subtree.
This is ensured by walking through the mounts parents before children
and marking a mount as unmountable if it is not locked or it is locked
but its parent is marked.
This allows recursive mount detach to propagate through a set of
mounts when unmounting them would not reveal what is under any locked
mount.
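The marking rule can be modelled in a few lines of user-space C. This
is a sketch of the rule as stated above only; mnt_model and
mark_umountable() are hypothetical names and the real propagation code
handles considerably more state.

```c
#include <stdbool.h>
#include <stddef.h>

struct mnt_model {
	struct mnt_model *parent;
	bool locked;
	bool marked;	/* true: may be unmounted */
};

/*
 * list[] must be ordered parents before children, so that a parent's
 * mark is already final when its children are examined.
 */
static void mark_umountable(struct mnt_model **list, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct mnt_model *m = list[i];
		bool parent_marked = m->parent && m->parent->marked;

		m->marked = !m->locked || parent_marked;
	}
}
```

A locked mount whose parent is also going away can be detached without
revealing what it covers, which is why the parent's mark is consulted.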
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
A prerequisite of calling umount_tree is that the point where the tree
is mounted at is valid to unmount.
If we are propagating the effect of the unmount, clear MNT_LOCKED in
every instance where the same filesystem is mounted on the same
mountpoint in the mount tree, as we know (by virtue of the fact
that umount_tree was called) that it is safe to reveal what
is at that mountpoint.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
- Modify __lookup_mnt_hash_last to ignore mounts that have MNT_UMOUNTED set.
- Don't remove mounts from the mount hash table in propagate_umount.
- Don't remove mounts from the mount hash table in umount_tree before
the entire list of mounts to be umounted is selected.
- Remove mounts from the mount hash table as the last thing that
happens in the case where a mount has a parent in umount_tree.
Mounts without parents are not hashed (by definition).
This paves the way for delaying removal from the mount hash table even
farther and fixing the MNT_LOCKED vs MNT_DETACH issue.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
In some instances it is necessary to know if the unmounting
process has begun on a mount. Add MNT_UMOUNT to make that reliably
testable.
This fix gets used in fixing locked mounts in MNT_DETACH.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
umount_tree builds a list of mounts that need to be unmounted.
Utilize mnt_list for this purpose instead of mnt_hash. This begins to
allow keeping a mount on the mnt_hash after it is unmounted, which is
necessary for a properly functioning MNT_LOCKED implementation.
The fact that mnt_list is an ordinary list, which makes list_move
available, is a nice bonus.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Invoking mount propagation from __detach_mounts is inefficient and
wrong.
It is inefficient because __detach_mounts already walks the list of
mounts where something needs to be done, and mount propagation
walks some subset of those mounts again.
It is actively wrong because if the dentry that is passed to
__detach_mounts is not part of the path to a mount that mount should
not be affected.
change_mnt_propagation(p,MS_PRIVATE) modifies the mount propagation
tree of a master mount so its slaves are connected to another master
if possible. This means that even removing a mount from the middle of a
mount tree with __detach_mounts will not deprive any mount of
propagated mount events.
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
- Remove the unneeded declaration from pnode.h
- Mark umount_tree static as it has no callers outside of namespace.c
- Define an enumeration of umount_tree's flags.
- Pass umount_tree's flags in by name
This removes the magic numbers 0, 1 and 2, making the code a little
clearer, and makes it possible for there to be lazy unmounts that don't
propagate, which is what __detach_mounts actually wants, for example.
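A small sketch of what named flags buy over bare integers. The
identifiers and the meaning assigned to each bit here are assumptions
made for illustration (umount_how_model, UMOUNT_MODEL_SYNC,
UMOUNT_MODEL_PROPAGATE, plan_umount()), not a copy of the enumeration
in namespace.c.

```c
#include <stdbool.h>

/* Hypothetical named flags replacing bare integers at call sites. */
enum umount_how_model {
	UMOUNT_MODEL_SYNC = 1,		/* assumed: not a lazy unmount */
	UMOUNT_MODEL_PROPAGATE = 2,	/* assumed: follow mount propagation */
};

struct umount_plan {
	bool sync;
	bool propagate;
};

/* Decode the flag word into the two independent decisions. */
static struct umount_plan plan_umount(unsigned int how)
{
	struct umount_plan plan;

	plan.sync = (how & UMOUNT_MODEL_SYNC) != 0;
	plan.propagate = (how & UMOUNT_MODEL_PROPAGATE) != 0;
	return plan;
}
```

Because the two bits are independent, a lazy unmount that does not
propagate is simply the value with neither flag set.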
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Since commit a9b8241594, we are allowed to merge unwritten extents,
so these comments are wrong; remove them.
Signed-off-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
According to C99, %*.s means the same as %*.0s, in other words, print as
many spaces as the field width argument says and effectively ignore the
string argument. That is certainly not what was meant here. The kernel's
printf implementation, however, treats it as if the . was not there,
i.e. as %*s. I don't know if de->name is nul-terminated or not, but in
any case I'm guessing the intention was to use de->name_len as precision
instead of field width.
[ Note: this is debugging code which is commented out, so this is not
a security issue; a developer would have to explicitly enable
INLINE_DIR_DEBUG before this would be an issue. ]
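The difference between the two format strings can be seen with standard
C snprintf(). The helper names format_name() and format_name_c99_bug()
are hypothetical; only the conversion specifications matter.

```c
#include <stdio.h>

/*
 * "%.*s": the int argument is the precision, i.e. the maximum number of
 * bytes read from name, so name need not be NUL-terminated.
 */
static int format_name(char *out, size_t outlen, const char *name,
		       int name_len)
{
	return snprintf(out, outlen, "%.*s", name_len, name);
}

/*
 * "%*.s": the int argument is the field width and the precision is 0,
 * so a C99 printf emits only padding and ignores the string contents.
 */
static int format_name_c99_bug(char *out, size_t outlen, const char *name,
			       int name_len)
{
	return snprintf(out, outlen, "%*.s", name_len, name);
}
```

With a five-byte unterminated name and name_len 3, the first helper
writes the first three bytes; the second, given "abc" and 3, writes
three spaces under a C99-conforming printf.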
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Release references to buffer-heads if ext4_journal_start() fails.
Fixes: 5b61de7575 ("ext4: start handle at least possible moment when renaming files")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
This should cover the set emitted by viced and the volume server.
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Building alpha:allmodconfig fails with
fs/btrfs/inode.c: In function 'check_direct_IO':
fs/btrfs/inode.c:8050:2: error: implicit declaration of function 'iov_iter_alignment'
due to a missing include file.
Fixes: 3737c63e1fb0 ("fs: move struct kiocb to fs.h")
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Merge tag 'lazytime_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull lazytime fixes from Ted Ts'o:
"This fixes a problem in the lazy time patches, which can cause
frequently updated inodes to never have their timestamps updated.
These changes guarantee that no timestamp on disk will be stale by
more than 24 hours"
* tag 'lazytime_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
fs: add dirtytime_expire_seconds sysctl
fs: make sure the timestamps for lazytime inodes eventually get written
Pull nfsd fixes from Bruce Fields:
"Two main issues:
- We found that turning on pNFS by default (when it's configured at
build time) was too aggressive, so we want to switch the default
before the 4.0 release.
- Recent client changes to increase open parallelism uncovered a
serious bug lurking in the server's open code.
Also fix a krb5/selinux regression.
The rest is mainly smaller pNFS fixes"
* 'for-4.0' of git://linux-nfs.org/~bfields/linux:
sunrpc: make debugfs file creation failure non-fatal
nfsd: require an explicit option to enable pNFS
NFSD: Fix bad update of layout in nfsd4_return_file_layout
NFSD: Take care the return value from nfsd4_encode_stateid
NFSD: Printk blocklayout length and offset as format 0x%llx
nfsd: return correct lockowner when there is a race on hash insert
nfsd: return correct openowner when there is a race to put one in the hash
NFSD: Put exports after nfsd4_layout_verify fail
NFSD: Error out when register_shrinker() fail
NFSD: Take care the return value from nfsd4_decode_stateid
NFSD: Check layout type when returning client layouts
NFSD: restore trace event lost in mismerge
afs_send_empty_reply() doesn't require an iovec array with which to initialise
the msghdr, but can pass NULL instead.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
We failed to update ctime & mtime of a directory when a new entry was
created in it during rename, link, create, etc. Fix that.
Reported-by: Taesoo Kim <tsgatesv@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Instead of -ENOMEM, properly return -EIO on udf_update_inode()
error, consistent with the rest of the filesystems.
Signed-off-by: Changwoo Min <changwoo.m@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
In this if statement, the earlier condition is redundant because the later one already covers it.
Signed-off-by: Weiyuan <weiyuan.wei@huawei.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Coverity reports a warning due to an uninitialized attr structure in
one code path.
Reported by Coverity (CID 728535)
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeff Layton <jlayton@samba.org>
null tcon is not possible in these paths so
remove confusing null check
Reported by Coverity (CID 728519)
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeff Layton <jlayton@samba.org>
remove impossible check
Pointed out by Coverity (CID 115422)
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeff Layton <jlayton@samba.org>
workstation_RFC1001_name is part of the struct and can't be null,
remove impossible comparison (array vs. null)
Pointed out by Coverity (CID 140095)
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeff Layton <jlayton@samba.org>
Coverity reports a warning for referencing the beginning of the
SMB2/SMB3 frame using the ProtocolId field as an array. Although
it works the same either way, this patch should quiet the warning
and might be a little clearer.
Reported by Coverity (CID 741269)
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
null tcon is not likely in these paths in current
code, but obviously it does clarify the code to
check for null (if at all) before dereferencing
rather than after.
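The ordering issue can be shown with a tiny user-space sketch.
tcon_model and tree_id_checked() are hypothetical names, and the
-EINVAL return value is an assumption of this example rather than what
the CIFS code returns.

```c
#include <errno.h>
#include <stddef.h>

struct tcon_model {
	int tid;
};

/*
 * Check the pointer first, then dereference. Reading tcon->tid before
 * the NULL test would make the test pointless, which is what the
 * static analyzer flags.
 */
static int tree_id_checked(const struct tcon_model *tcon, int *tid_out)
{
	if (tcon == NULL)
		return -EINVAL;

	*tid_out = tcon->tid;
	return 0;
}
```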
Reported by Coverity (CID 1042666)
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Although unlikely to fail (and tree connect does not commonly send
a password since SECMODE_USER is the default for most servers)
do not ignore errors on SMBNTEncrypt in SMB Tree Connect.
Reported by Coverity (CID 1226853)
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Pointed out by the Coverity analyzer. resp_buftype is
not initialized in one path, which can on rare occasions log
a spurious warning (buf is null, so there will
not be a problem with freeing data, but if buf_type
were randomly set to a wrong value, a warning could be logged).
Reported by Coverity (CID 1269144)
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
NFS4_MAXLABELLEN is already defined as the maximum security label length; use it directly.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We've been refusing ACLs that DENY permissions that we can't effectively
deny. (For example, we can't deny permission to read attributes.)
Andreas points out that any DENY of Windows' "read", "write", or
"modify" permissions would trigger this. That would be annoying.
So maybe we should be a little less paranoid, and ignore entirely the
permissions that are meaningless to us.
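The change in policy amounts to masking rather than rejecting. The
sketch below uses made-up permission bits and names (PERM_*,
DENIABLE_MASK, deny_mask_strict(), deny_mask_lenient()); the values are
illustrative and are not the NFSv4 ACE mask constants.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative permission bits, not the real protocol values. */
#define PERM_READ_DATA	0x01u
#define PERM_WRITE_DATA	0x02u
#define PERM_EXECUTE	0x04u
#define PERM_READ_ATTRS	0x80u	/* cannot be denied effectively */

#define DENIABLE_MASK	(PERM_READ_DATA | PERM_WRITE_DATA | PERM_EXECUTE)

/* Old policy as described: refuse any DENY we cannot fully enforce. */
static int deny_mask_strict(uint32_t mask, uint32_t *out)
{
	if (mask & ~DENIABLE_MASK)
		return -EINVAL;
	*out = mask;
	return 0;
}

/* New policy as described: silently drop the bits that mean nothing to us. */
static uint32_t deny_mask_lenient(uint32_t mask)
{
	return mask & DENIABLE_MASK;
}
```

A DENY of "read" that bundles PERM_READ_ATTRS with PERM_READ_DATA is
refused by the strict version but reduces to PERM_READ_DATA under the
lenient one.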
Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
NFSD_FAULT_INJECTION depends on DEBUG_FS; otherwise the debugfs_create_*
interface may return the unexpected error -ENODEV and cause a system crash.
Signed-off-by: Chengyu Song <csong84@gatech.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>