Commit graph

29074 commits

Author SHA1 Message Date
Trond Myklebust
579342785f NFSv4.1: Remove unused 'default allocation' for pnfs_alloc_layout_hdr()
...and ditto for pnfs_free_layout_hdr()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:16 -04:00
Trond Myklebust
a9136d4914 NFSv4.1: Get rid of pNFS spin lock debugging asserts...
These are all in static declared functions that are called only once.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:16 -04:00
Trond Myklebust
8f0d27dc5d NFSv4.1: Balance pnfs_layout_hdr refcount in pnfs_layout_(insert|remove)_lseg
Ensure that the reference count for pnfs_layout_hdr reverts to the
original value after a call to pnfs_layout_remove_lseg().

Note that the caller is expected to hold a reference to the struct
pnfs_layout_hdr.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:15 -04:00
Trond Myklebust
905ca191cf NFSv4.1: Clean up pnfs_put_lseg()
There is no longer a need to use pnfs_free_lseg_list(). Just call
pnfs_free_lseg() directly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:15 -04:00
Trond Myklebust
9c6263819f NFSv4.1: Clean up the removal of pnfs_layout_hdr from the server list
Move the code into pnfs_free_layout_hdr(), and add checks to
get_layout_by_fh_locked to ensure that they don't reference a layout
that is being freed.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:14 -04:00
Trond Myklebust
6622c3ea05 NFSv4.1: Free the pnfs_layout_hdr outside the inode->i_lock
None of the existing pNFS layout drivers seem to require the inode
to be locked while they free the layout header.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:14 -04:00
Trond Myklebust
01d39ce82b NFSv4.1: Remove redundant reference to the pnfs_layout_hdr
Each layout segment already holds a reference to the pnfs_layout_hdr,
so there is no need to hold an extra reference that is released once
the last layout segment is freed.

Ensure that pnfs_find_alloc_layout() always returns a reference
to the pnfs_layout_hdr, which will be matched by the final call to
pnfs_put_layout_hdr() in pnfs_update_layout().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:13 -04:00
Trond Myklebust
57036a3776 NFSv4.1: Rename the pnfs_put_lseg_common to pnfs_layout_remove_lseg
The latter name is more descriptive of the actual function.
Also rename pnfs_insert_layout to pnfs_layout_insert_lseg.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:13 -04:00
Trond Myklebust
bb346f6397 NFSv4.1: reset the inode MDS threshold counters on layout destruction
Instead of resetting the inode MDS threshold counters when we mark
the layout for destruction, do it as part of freeing the layout.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:12 -04:00
Trond Myklebust
965938b83b NFSv4.1: Get rid of pNFS layout state "NFS_LAYOUT_INVALID"
In all cases where we set NFS_LAYOUT_INVALID, we also set NFS_LAYOUT_DESTROYED.
Furthermore, in all cases where we test for NFS_LAYOUT_INVALID, we should
also be testing for NFS_LAYOUT_DESTROYED, since the latter means that
we hold no valid layout segments.
Ergo the two are redundant.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:12 -04:00
Trond Myklebust
1f7977c136 NFSv4.1: Simplify the pNFS return-on-close code
Confine it to the nfs4_do_close() code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:12 -04:00
Trond Myklebust
7fdab069b7 NFSv4.1: Fix a race in the pNFS return-on-close code
If we sleep after dropping the inode->i_lock, then we are no longer
atomic with respect to the rpc_wake_up() call in pnfs_layout_remove_lseg().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:11 -04:00
Trond Myklebust
115ce575cb NFSv4.1: pnfs_layout_io_set_failed must clear invalid lsegs
If pnfs_layout_io_test_failed() authorises a retry of the failed layoutgets,
we should clear the existing layout segments so that we start afresh. Do
this in pnfs_layout_io_set_failed().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:11 -04:00
Trond Myklebust
3e62121493 NFSv4.1: Don't drop the pnfs_layout_hdr after a layoutget failure
We want to cache the pnfs_layout_hdr after a layoutget or i/o
failure so that pnfs_update_layout() can find it and know when
it is time to retry.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:10 -04:00
Trond Myklebust
830ffb5657 NFSv4.1: Fix a reference leak in pnfs_update_layout
If we exit after the call to pnfs_find_alloc_layout(), we have to ensure
that we put the struct pnfs_layout_hdr.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:10 -04:00
Trond Myklebust
1dfed2737d NFSv4.1: pNFS data servers may be temporarily offline
In cases where the pNFS data server is just temporarily out of service,
we want to mark it as such, and then try again later. Typically that will
be in cases of network connection errors etc.
This patch allows us to mark the devices as being "unavailable" for such
transient errors, and will make them available for retries after a
2 minute timeout period.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:09 -04:00
Trond Myklebust
25c7533357 NFSv4.1: Retry pNFS after a 2 minute timeout
If we had to fall back to read/write through MDS, then assume that we should
retry pNFS after a suitable timeout period.
The following patch sets a timeout of 2 minutes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:09 -04:00
Trond Myklebust
b9e028fd89 NFSv4.1: Add helpers for setting/reading the I/O fail bit
...and make them local to the pnfs.c file.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:09 -04:00
Trond Myklebust
f86bbcf85d NFSv4.1: Replace dprintk() in pnfs_update_layout with something less buggy
Dereferencing nfsi->layout in order to read plh_flags without holding
a spin lock is bug prone. Furthermore, the dprintk() tells you nothing
about whether or not the call succeeded.
Replace it with something that tells you about whether or not a valid
layout segment was returned for the inode in question.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:08 -04:00
Trond Myklebust
78e4e05c64 NFSv4.1: Replace get_device_info() with filelayout_get_device_info()
Fix the namespace pollution issue.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:08 -04:00
Trond Myklebust
9369a431bc NFSv4.1: Cleanup; add "pnfs_" prefix to put_lseg() and get_lseg()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:07 -04:00
Trond Myklebust
70c3bd2bdf NFSv4.1: Cleanup; add "pnfs_" prefix to get_layout_hdr() and put_layout_hdr()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:07 -04:00
Trond Myklebust
49a85061b0 NFSv4.1: Cleanup add a "pnfs_" prefix to mark_matching_lsegs_invalid
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:06 -04:00
Trond Myklebust
a0b0a6e39b NFS: Clean up the pNFS layoutget interface
Ensure that we do return errors from nfs4_proc_layoutget() and that we
don't mark the layout as having failed if the error was due to a
signal or resource problem on the client side.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:06 -04:00
Trond Myklebust
dcfc4f2546 NFS: Write the entire file if a server reboot occurs during fsync()
This is to ensure that we don't clear the NFS_CONTEXT_RESEND_WRITES
flag while there are still writes that haven't been resent.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:05 -04:00
Trond Myklebust
05990d1bf2 NFS: Fix fdatasync/fsync() when confronted with a server reboot
If the server reboots before it can commit the unstable writes to disk,
then nfs_commit_release_pages() will detect this when it compares the
verifier returned by COMMIT to the one returned by WRITE. When this
happens, the client needs to resend those writes in order to guarantee
that they make it to stable storage.

This patch adds a signalling mechanism to notify fsync() that it
needs to retry all writes before it can exit.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:05 -04:00
Trond Myklebust
795a88c968 NFSv4: Convert the nfs4_lock_state->ls_flags to a bit field
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:04 -04:00
Trond Myklebust
2a369153c8 NFS: Clean up helper function nfs4_select_rw_stateid()
We want to be able to pass on the information that the page was not
dirtied under a lock. Instead of adding a flag parameter, do this
by passing a pointer to a 'struct nfs_lock_owner' that may be NULL.

Also reuse this structure in struct nfs_lock_context to carry the
fl_owner_t and pid_t.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:04 -04:00
Trond Myklebust
b3c54de6f8 NFS: Convert nfs_get_lock_context to return an ERR_PTR on failure
We want to be able to distinguish between allocation failures, and
the case where the lock context is not needed (because there are no
locks).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:03 -04:00
David S. Miller
6a06e5e1bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/team/team.c
	drivers/net/usb/qmi_wwan.c
	net/batman-adv/bat_iv_ogm.c
	net/ipv4/fib_frontend.c
	net/ipv4/route.c
	net/l2tp/l2tp_netlink.c

The team, fib_frontend, route, and l2tp_netlink conflicts were simply
overlapping changes.

qmi_wwan and bat_iv_ogm were of the "use HEAD" variety.

With help from Antonio Quartulli.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-28 14:40:49 -04:00
Trond Myklebust
0cac120233 NFSv4: Ensure that idmap_pipe_downcall sanity-checks the downcall data
Use the idmapper upcall data to verify that the legacy idmapper daemon
is indeed responding to an upcall that we sent.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
2012-09-28 13:43:34 -04:00
Trond Myklebust
e9ab41b620 NFSv4: Clean up the legacy idmapper upcall
Replace the BUG_ON(idmap->idmap_key_cons != NULL) with a
WARN_ON_ONCE(). Then get rid of the ACCESS_ONCE(idmap->idmap_key_cons).

Then add helper functions for starting, finishing and aborting the
legacy upcall.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
2012-09-28 13:42:41 -04:00
Linus Torvalds
7596824e66 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "A couple of fixes; one for automount/lazy umount race, another a
  classic "we don't protect the refcount transition to zero with the
  lock that protects looking for object in hash" kind of crap in lockd."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  close the race in nlmsvc_free_block()
  do_add_mount()/umount -l races
2012-09-28 10:02:53 -07:00
Trond Myklebust
0e24d849c4 NFSv4: Remove BUG_ON() and ACCESS_ONCE() calls in the idmapper
The use of ACCESS_ONCE() is wrong, since the various routines that set/clear
idmap->idmap_key_cons should be strictly ordered w.r.t. each other, and
the idmap->idmap_mutex ensures that only one thread at a time may be in
an upcall situation.

Also replace the BUG_ON()s with WARN_ON_ONCE() where appropriate.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 12:27:11 -04:00
Shaohua Li
02f3939e1a block: makes bio_split support bio without data
discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes
bio_split works for such bio.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-09-28 10:38:48 +02:00
James Morris
bf53083445 Linux 3.6-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s
 LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI
 u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT
 xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU
 i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF
 GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk=
 =0v2l
 -----END PGP SIGNATURE-----

Merge tag 'v3.6-rc7' into next

Linux 3.6-rc7

Requested by David Howells so he can merge his key susbsystem work into
my tree with requisite -linus changesets.
2012-09-28 13:37:32 +10:00
J. Bruce Fields
fd51790949 trivial select_parent documentation fix
"Search list for X" sounds like you're trying to find X on a list.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-27 15:43:08 -07:00
Wei Yongjun
ba39ebb614 ext4: convert to use leXX_add_cpu()
Convert cpu_to_leXX(leXX_to_cpu(E1) + E2) to use leXX_add_cpu().

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-27 09:37:53 -04:00
Carlos Maiolino
6d1ab10e69 ext4: ext4_bread usage audit
When ext4_bread() returns NULL and err is set to zero, this means
there is no phyical block mapped to the specified logical block
number.  (Previous to commit 90b0a97323, err was uninitialized in this
case, which caused other problems.)

The directory handling routines use ext4_bread() in many places, the
fact that ext4_bread() now returns NULL with err set to zero could
cause problems since a number of these functions will simply return
the value of err if the result of ext4_bread() was the NULL pointer,
causing the caller of the function to think that the function was
successful.

Since directories should never contain holes, this case can only
happen if the file system is corrupted.  This commit audits all of the
callers of ext4_bread(), and makes sure they do the right thing if a
hole in a directory is found by ext4_bread().

Some ext4_bread() callers did not need any changes either because they
already had its own hole detector paths.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-27 09:31:33 -04:00
Wang Sheng-Hui
cbb4ee830e ext4: remove redundant offset check in mext_check_arguments()
In the check code above, if orig_start != donor_start, we would
return -EINVAL. So here, orig_start should be equal with donor_start.
Remove the redundant check here.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-27 08:00:01 -04:00
Eric Sandeen
c25f9bc614 ext4: don't clear orphan list on ro mount with errors
If the file system contains errors and it is being mounted read-only,
don't clear the orphan list.  We should minimize changes to the file
system if it is mounted read-only.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-26 23:30:12 -04:00
Pavel Shilovsky
f065fd099f CIFS: Fix possible freed pointer dereference in CIFS_SessSetup
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-26 22:15:24 -05:00
Pavel Shilovsky
4ca3a99ca4 CIFS: Fix possible freed pointer dereference in SMB2_sess_setup
and remove redundant (rsp == NULL) checks after SendReceive2.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-26 22:15:18 -05:00
Jan Kara
b794e7a6eb jbd2: fix assertion failure in commit code due to lacking transaction credits
ext4 users of data=journal mode with blocksize < pagesize were
occasionally hitting assertion failure in
jbd2_journal_commit_transaction() checking whether the transaction has
at least as many credits reserved as buffers attached.  The core of the
problem is that when a file gets truncated, buffers that still need
checkpointing or that are attached to the committing transaction are
left with buffer_mapped set. When this happens to buffers beyond i_size
attached to a page stradding i_size, subsequent write extending the file
will see these buffers and as they are mapped (but underlying blocks
were freed) things go awry from here.

The assertion failure just coincidentally (and in this case luckily as
we would start corrupting filesystem) triggers due to journal_head not
being properly cleaned up as well.

We fix the problem by unmapping buffers if possible (in lots of cases we
just need a buffer attached to a transaction as a place holder but it
must not be written out anyway).  And in one case, we just have to bite
the bullet and wait for transaction commit to finish.

CC: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2012-09-26 23:11:13 -04:00
Pavel Shilovsky
760ad0cac1 CIFS: Make ops->close return void
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-26 22:05:10 -05:00
Djalal Harouni
9b68733273 ext4: release donor reference when EXT4_IOC_MOVE_EXT ioctl fails
When the EXT4_IOC_MOVE_EXT ioctl() fails on bigalloc file systems, we
should jump to the 'mext_out' label to release the donor file reference.

Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-26 22:58:50 -04:00
Lukas Czerner
aaf7d73e54 ext4: enable FITRIM ioctl on bigalloc file system
With a minor tweaks regarding minimum extent size to discard and
discarded bytes reporting the FITRIM can be enabled on bigalloc file
system and it works without any problem.

This patch fixes minlen handling and discarded bytes reporting to
take into consideration bigalloc enabled file systems and finally
removes the restriction and allow FITRIM to be used on file system with
bigalloc feature enabled.

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-26 22:21:21 -04:00
Denys Vlasenko
f34f9d186d coredump: prevent double-free on an error path in core dumper
In !CORE_DUMP_USE_REGSET case, if elf_note_info_init fails to allocate
memory for info->fields, it frees already allocated stuff and returns
error to its caller, fill_note_info.  Which in turn returns error to its
caller, elf_core_dump.  Which jumps to cleanup label and calls
free_note_info, which will happily try to free all info->fields again.
BOOM.

This is the fix.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Venu Byravarasu <vbyravarasu@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2012-09-26 22:20:21 -04:00
Al Viro
63784dd02b fcntl: fix misannotations
__user * != * __user...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 22:20:20 -04:00
Al Viro
2744c171db ceph: don't abuse d_delete() on failure exits
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 22:20:20 -04:00