Commit graph

31744 commits

Author SHA1 Message Date
Dave Chinner
6b2647a12a xfs: shortform directory offsets change for dir3 format
Because the header size for the CRC enabled directory blocks is
larger, the offset of the first entry into a directory block is
different to the dir2 format. The shortform directory stores the
dirent's offset so that it doesn't change when moving from shortform
to block form and back again, and hence it needs to take into
account the different header sizes to maintain the correct offsets.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-27 12:24:32 -05:00
Dave Chinner
24df33b45e xfs: add CRC checking to dir2 leaf blocks
This addition follows the same pattern as the dir2 block CRCs.
Seeing as both LEAF1 and LEAFN types need to changed at the same
time, this is a pretty large amount of change. leaf block headers
need to be abstracted away from the on-disk structures (struct
xfs_dir3_icleaf_hdr), as do the base leaf entry locations.

This header abstract allows the in-core header and leaf entry
location to be passed around instead of the leaf block itself. This
saves a lot of converting individual variables from on-disk format
to host format where they are used, so there's a good chance that
the compiler will be able to produce much more optimal code as it's
not having to byteswap variables all over the place.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-27 12:19:53 -05:00
Dave Chinner
33363feed1 xfs: add CRC checking to dir2 data blocks
This addition follows the same pattern as the dir2 block CRCs.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-27 12:00:00 -05:00
Dave Chinner
cbc8adf897 xfs: add CRC checking to dir2 free blocks
This addition follows the same pattern as the dir2 block CRCs, but
with a few differences. The main difference is that the free block
header is different between the v2 and v3 formats, so an "in-core"
free block header has been added and _todisk/_from_disk functions
used to abstract the differences in structure format from the code.
This is similar to the on-disk superblock versus the in-core
superblock setup. The in-core strucutre is populated when the buffer
is read from disk, all the in memory checks and modifications are
done on the in-core version of the structure which is written back
to the buffer before the buffer is logged.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-27 11:58:16 -05:00
Dave Chinner
f5f3d9b016 xfs: add CRC checks to block format directory blocks
Now that directory buffers are made from a single struct xfs_buf, we
can add CRC calculation and checking callbacks. While there, add all
the fields to the on disk structures for future functionality such
as d_type support, uuids, block numbers, owner inode, etc.

To distinguish between the different on disk formats, change the
magic numbers for the new format directory blocks.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-27 11:51:56 -05:00
Dave Chinner
f948dd76dd xfs: add CRC checks to remote symlinks
Add a header to the remote symlink block, containing location and
owner information, as well as CRCs and LSN fields. This requires
verifiers to be added to the remote symlink buffers for CRC enabled
filesystems.

This also fixes a bug reading multiple block symlinks, where the second
block overwrites the first block when copying out the link name.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-27 11:49:28 -05:00
J. Bruce Fields
dd30333cf5 nfsd4: better error return to indicate SSV non-support
As 4.1 becomes less experimental and SSV still isn't implemented, we
have to admit it's not going to be, and return some sensible error
rather than just saying "our server's broken".  Discussion in the ietf
group hasn't turned up any objections to using NFS4ERR_ENC_ALG_UNSUPP
for that purpose.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 16:18:15 -04:00
J. Bruce Fields
aa387d6ce1 nfsd: fix EXDEV checking in rename
We again check for the EXDEV a little later on, so the first check is
redundant.  This check is also slightly racier, since a badly timed
eviction from the export cache could leave us with the two fh_export
pointers pointing to two different cache entries which each refer to the
same underlying export.

It's better to compare vfsmounts as the later check does, but that
leaves a minor security hole in the case where the two exports refer to
two different directories especially if (for example) they have
different root-squashing options.

So, compare ex_path.dentry too.

Reported-by: Joe Habermann <joe.habermann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 16:18:15 -04:00
J. Bruce Fields
c85b03ab20 Merge Trond's nfs-for-next
Merging Trond's nfs-for-next branch, mainly to get
b7993cebb8 "SUNRPC: Allow rpc_create() to
request that TCP slots be unlimited", which a small piece of the
gss-proxy work depends on.
2013-04-26 11:37:43 -04:00
Zhao Hongjiang
91d80a84bb aio: fix possible invalid memory access when DEBUG is enabled
dprintk() shouldn't access @ring after it's unmapped.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-26 07:56:18 -07:00
Bob Peterson
222cb538f5 GFS2: Flush work queue before clearing glock hash tables
There was a timing window when a GFS2 file system was unmounted
that caused GFS2 to call BUG() and panic the kernel. The call
to BUG() is meant to ensure that the glock reference count,
gl_ref, never gets down to zero and bounce back up again. What was
happening during umount is that function gfs2_put_super was dequeing
its glocks for well-known files. In particular, we saw it on the
journal glock, sd_jinode_gh. The dequeue caused delayed work to be
queued for the glock state machine, to transition the lock to an
"unlocked" state. While the work was still queued, gfs2_put_super
called gfs2_gl_hash_clear to clear out the glock hash tables.
If the timing was just so, the glock work function would drop the
reference count at the time when it was being checked for zero,
and that caused BUG() to be called. This patch calls
flush_workqueue before clearing the glock hash tables, thereby
ensuring that the delayed work is executed before the hash tables
are cleared, and therefore the reference count never goes to zero
until the glock is cleared.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-04-26 10:09:04 +01:00
Michael Neuling
2171364d1a powerpc: Add HWCAP2 aux entry
We are currently out of free bits in AT_HWCAP. With POWER8, we have
several hardware features that we need to advertise.

Tested on POWER and x86.

Signed-off-by: Michael Neuling <michael@neuling.org>
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26 16:08:16 +10:00
Jaegeuk Kim
9198aceb53 f2fs: check nid == 0 in add_free_nid
It is more obvious that add_free_nid checks whether the free nid is zero or not.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-26 10:35:13 +09:00
Namjae Jeon
8680441caa f2fs: add REQ_META about metadata requests for submit
Adding REQ_META for all the metadata requests can help in improving the
FS performance, if the underlying device supports TAGGING.
So, when considering the submit_bio path for all the f2fs requests. We can
add REQ_META for all the META requests.
As a precursor to this change we considered the commit
4265900e0b 'mmc: MMC-4.5 Data Tag Support'

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-26 10:35:11 +09:00
Jaegeuk Kim
c718379b6b f2fs: give a chance to merge IOs by IO scheduler
Previously, background GC submits many 4KB read requests to load victim blocks
and/or its (i)node blocks.

...
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed
f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee
f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0]
f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef
f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0]
...

However, by the fact that many IOs are sequential, we can give a chance to merge
the IOs by IO scheduler.
In order to do that, let's use blk_plug.

...
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee
f2fs_gc : f2fs_iget: ino = 143
f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef
<idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0]
<idle> : block_rq_complete: 8,16 R () 1520432 + 96 [0]
<idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0]
<idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0]
<idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0]
<idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0]
<idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0]
<idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0]
...

Note that this issue should be addressed in checkpoint, and some readahead
flows too.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-26 10:35:10 +09:00
Jaegeuk Kim
6cb968d9b0 f2fs: avoid frequent background GC
If there is no victim segments selected by background GC, let's wait
a little bit longer time to collect dirty segments.
By default, let's give 5 minutes.

Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-26 10:35:03 +09:00
Zheng Liu
e162b2f835 jbd: use kmem_cache_zalloc instead of kmem_cache_alloc/memset
Now jbd_alloc_handle is only called by new_handle.  So this commit
uses kmem_cache_zalloc instead of kmem_cache_alloc/memset.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2013-04-25 15:25:23 +02:00
Thomas Gleixner
6402c7dc2a Merge branch 'linus' into timers/core
Reason: Get upstream fixes before adding conflicting code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-24 20:33:54 +02:00
Trond Myklebust
b0212b84fb Merge branch 'bugfixes' into linux-next
Fix up a conflict between the linux-next branch and mainline.
Conflicts:
	fs/nfs/nfs4proc.c
2013-04-23 15:52:14 -04:00
Trond Myklebust
bd1d421abc Merge branch 'rpcsec_gss-from_cel' into linux-next
* rpcsec_gss-from_cel: (21 commits)
  NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
  NFSv4: Don't clear the machine cred when client establish returns EACCES
  NFSv4: Fix issues in nfs4_discover_server_trunking
  NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
  NFS: Use server-recommended security flavor by default (NFSv3)
  SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
  NFS: Use "krb5i" to establish NFSv4 state whenever possible
  NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
  NFS: Use static list of security flavors during root FH lookup recovery
  NFS: Avoid PUTROOTFH when managing leases
  NFS: Clean up nfs4_proc_get_rootfh
  NFS: Handle missing rpc.gssd when looking up root FH
  SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
  SUNRPC: Make gss_mech_get() static
  SUNRPC: Refactor nfsd4_do_encode_secinfo()
  SUNRPC: Consider qop when looking up pseudoflavors
  SUNRPC: Load GSS kernel module by OID
  SUNRPC: Introduce rpcauth_get_pseudoflavor()
  SUNRPC: Define rpcsec_gss_info structure
  NFS: Remove unneeded forward declaration
  ...
2013-04-23 15:40:40 -04:00
Trond Myklebust
bdeca1b76c NFSv4: Don't recheck permissions on open in case of recovery cached open
If we already checked the user access permissions on the original open,
then don't bother checking again on recovery. Doing so can cause a
deadlock with NFSv4.1, since the may_open() operation is not privileged.
Furthermore, we can't report an access permission failure here anyway.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-23 14:52:44 -04:00
Bryan Schumaker
bf8d909705 nfsd: Decode and send 64bit time values
The seconds field of an nfstime4 structure is 64bit, but we are assuming
that the first 32bits are zero-filled.  So if the client tries to set
atime to a value before the epoch (touch -t 196001010101), then the
server will save the wrong value on disk.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-23 14:49:37 -04:00
Trond Myklebust
cd4c9be2c6 NFSv4.1: Don't do a delegated open for NFS4_OPEN_CLAIM_DELEG_CUR_FH modes
If we're in a delegation recall situation, we can't do a delegated open.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-23 14:46:25 -04:00
Trond Myklebust
8188df1733 NFSv4.1: Use the more efficient open_noattr call for open-by-filehandle
When we're doing open-by-filehandle in NFSv4.1, we shouldn't need to
do the cache consistency revalidation on the directory. It is
therefore more efficient to just use open_noattr, which returns the
file attributes, but not the directory attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-23 14:31:19 -04:00
Theodore Ts'o
0d606e2c9f ext4: fix type-widening bug in inode table readahead code
Due to a missing cast, the high 32-bits of a 64-bit block number used
when calculating the readahead block for inode tables can get lost.
This means we can end up fetching the wrong blocks for readahead for
file systems > 16TB.

Linus found this when experimenting with an enhacement to the sparse
static code checker which checks for missing widening casts before
binary "not" operators.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-23 08:59:35 -04:00
Namjae Jeon
2af4bd6ca5 f2fs: add tracepoints to debug checkpoint request
Add tracepoints to debug checkpoint request.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: change expressions]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 19:16:37 +09:00
Namjae Jeon
6ec178dac6 f2fs: add tracepoints for write page operations
Add tracepoints to debug the various page write operation
like data pages, meta pages.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: remove unnecessary tracepoints]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 18:15:17 +09:00
Namjae Jeon
c01e285324 f2fs: add tracepoints to debug the block allocation
Add tracepoints to debug the block allocation & fallocate.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: enhance information]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 18:15:16 +09:00
Namjae Jeon
8e46b3ed11 f2fs: add tracepoints for GC threads
Add tracepoints for tracing the garbage collector
threads in f2fs with status of collection & type.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: modify slightly to show information]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 18:15:10 +09:00
Namjae Jeon
848753aa3b f2fs: add tracepoint for tracing the page i/o
Add tracepoints for page i/o operations and block allocation
tracing during page read operation.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 16:40:43 +09:00
Namjae Jeon
51dd624934 f2fs: add tracepoints for truncate operation
add tracepoints for tracing the truncate operations
like truncate node/data blocks, f2fs_truncate etc.

Tracepoints are added at entry and exit of operation
to trace the success & failure of operation.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 16:40:38 +09:00
Namjae Jeon
a2a4a7e4ab f2fs: add tracepoints for sync & inode operations
Add tracepoints in f2fs for tracing the syncing
operations like filesystem sync, file sync enter/exit.
It will helf to trace the code under debugging scenarios.

Also add tracepoints for tracing the various inode operations
like building inode, eviction of inode, link/unlike of
inodes.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
[Jaegeuk: combine and modify the tracepoint structures]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 15:30:27 +09:00
David S. Miller
6e0895c2ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be_main.c
	drivers/net/ethernet/intel/igb/igb_main.c
	drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
	include/net/scm.h
	net/batman-adv/routing.c
	net/ipv4/tcp_input.c

The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
cleanup in net-next to now pass cred structs around.

The be2net driver had a bug fix in 'net' that overlapped with the VLAN
interface changes by Patrick McHardy in net-next.

An IGB conflict existed because in 'net' the build_skb() support was
reverted, and in 'net-next' there was a comment style fix within that
code.

Several batman-adv conflicts were resolved by making sure that all
calls to batadv_is_my_mac() are changed to have a new bat_priv first
argument.

Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
rewrite in 'net-next', mostly overlapping changes.

Thanks to Stephen Rothwell and Antonio Quartulli for help with several
of these merge resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-22 20:32:51 -04:00
Namjae Jeon
e66509f03e f2fs: make is_multimedia_file code align with its name
The code conditions put inside the function is_multimedia_file are
reverse to the name i.e, we need to negate the return to actually
check if the file is a multimedia file. So, change the code and usage
path to align both the name and comparision conditions.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23 08:56:21 +09:00
Chuck Lever
79d852bf5e NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
Recently I changed the SETCLIENTID code to use AUTH_GSS(krb5i), and
then retry with AUTH_NONE if that didn't work.  This was to enable
Kerberos NFS mounts to work without forcing Linux NFS clients to
have a keytab on hand.

Rick Macklem reports that the FreeBSD server accepts AUTH_NONE only
for NULL operations (thus certainly not for SETCLIENTID).  Falling
back to AUTH_NONE means our proposed 3.10 NFS client will not
interoperate with FreeBSD servers over NFSv4 unless Kerberos is
fully configured on both ends.

If the Linux client falls back to using AUTH_SYS instead for
SETCLIENTID, all should work fine as long as the NFS server is
configured to allow AUTH_SYS for SETCLIENTID.

This may still prevent access to Kerberos-only FreeBSD servers by
Linux clients with no keytab.  Rick is of the opinion that the
security settings the server applies to its pseudo-fs should also
apply to the SETCLIENTID operation.

Linux and Solaris NFS servers do not place that limitation on
SETCLIENTID.  The security settings for the server's pseudo-fs are
determined automatically as the union of security flavors allowed on
real exports, as recommended by RFC 3530bis; and the flavors allowed
for SETCLIENTID are all flavors supported by the respective server
implementation.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-22 16:09:53 -04:00
Trond Myklebust
fd068b200f NFSv4: Ensure that we clear the NFS_OPEN_STATE flag when appropriate
We should always clear it before initiating file recovery.
Also ensure that we clear it after a CLOSE and/or after TEST_STATEID fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-22 11:29:51 -04:00
Theodore Ts'o
3f8a6411fb ext4: add check for inodes_count overflow in new resize ioctl
Addresses-Red-Hat-Bugzilla: #913245

Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Cc: stable@vger.kernel.org
2013-04-21 22:56:32 -04:00
Theodore Ts'o
7f3e3c7cfc ext4: fix Kconfig documentation for CONFIG_EXT4_DEBUG
Fox the Kconfig documentation for CONFIG_EXT4_DEBUG to match the
change made by commit a0b30c1229: ext4: use module parameters instead
of debugfs for mballoc_debug

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-04-21 20:32:03 -04:00
Theodore Ts'o
c5c72d814c ext4: fix online resizing for ext3-compat file systems
Commit fb0a387dcd restricts block allocations for indirect-mapped
files to block groups less than s_blockfile_groups.  However, the
online resizing code wasn't setting s_blockfile_groups, so the newly
added block groups were not available for non-extent mapped files.

Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-04-21 20:19:43 -04:00
Wei Yongjun
66348b723b f2fs: fix error return code in f2fs_fill_super()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.
Introduce by commit c0d39e(f2fs: fix return values from validate superblock)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-22 08:56:03 +09:00
Trond Myklebust
1dfd89af86 LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot
After a server reboot, the reclaimer thread will recover all the existing
locks. For locks that are blocked, however, it will change the value
of block->b_status to nlm_lck_denied_grace_period in order to signal that
they need to wake up and resend the original blocking lock request.

Due to a bug, however, the block->b_status never gets reset after the
blocked locks have been woken up, and so the process goes into an
infinite loop of resends until the blocked lock is satisfied.

Reported-by: Marc Eshel <eshel@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-04-21 18:08:42 -04:00
Theodore Ts'o
f783f091e4 jbd2: trace when lock_buffer in do_get_write_access takes a long time
While investigating interactivity problems it was clear that processes
sometimes stall for long periods of times if an attempt is made to
lock a buffer which is undergoing writeback.  It would stall in
a trace looking something like

[<ffffffff811a39de>] __lock_buffer+0x2e/0x30
[<ffffffff8123a60f>] do_get_write_access+0x43f/0x4b0
[<ffffffff8123a7cb>] jbd2_journal_get_write_access+0x2b/0x50
[<ffffffff81220f79>] __ext4_journal_get_write_access+0x39/0x80
[<ffffffff811f3198>] ext4_reserve_inode_write+0x78/0xa0
[<ffffffff811f3209>] ext4_mark_inode_dirty+0x49/0x220
[<ffffffff811f57d1>] ext4_dirty_inode+0x41/0x60
[<ffffffff8119ac3e>] __mark_inode_dirty+0x4e/0x2d0
[<ffffffff8118b9b9>] update_time+0x79/0xc0
[<ffffffff8118ba98>] file_update_time+0x98/0x100
[<ffffffff81110ffc>] __generic_file_aio_write+0x17c/0x3b0
[<ffffffff811112aa>] generic_file_aio_write+0x7a/0xf0
[<ffffffff811ea853>] ext4_file_write+0x83/0xd0
[<ffffffff81172b23>] do_sync_write+0xa3/0xe0
[<ffffffff811731ae>] vfs_write+0xae/0x180
[<ffffffff8117361d>] sys_write+0x4d/0x90
[<ffffffff8159d62d>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-21 16:47:54 -04:00
Theodore Ts'o
13fca323e9 ext4: mark metadata blocks using bh flags
This allows metadata writebacks which are issued via block device
writeback to be sent with the current write request flags.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-21 16:45:54 -04:00
Dave Chinner
19de7351a8 xfs: split out symlink code into it's own file.
The symlink code is about to get more complicated when CRCs are
added for remote symlink blocks. The symlink management code is
mostly self contained, so move it to it's own files so that all the
new code and the existing symlink code will not be intermingled
with other unrelated code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-21 15:38:04 -05:00
Christoph Hellwig
93848a999c xfs: add version 3 inode format with CRCs
Add a new inode version with a larger core.  The primary objective is
to allow for a crc of the inode, and location information (uuid and ino)
to verify it was written in the right place.  We also extend it by:

	a creation time (for Samba);
	a changecount (for NFSv4);
	a flush sequence (in LSN format for recovery);
	an additional inode flags field; and
	some additional padding.

These additional fields are not implemented yet, but already laid
out in the structure.

[dchinner@redhat.com] Added LSN and flags field, some factoring and rework to
capture all the necessary information in the crc calculation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-21 15:03:33 -05:00
Christoph Hellwig
3fe58f30b4 xfs: add CRC checks for quota blocks
Use the reserved space in struct xfs_dqblk to store a UUID and a crc
for the quota blocks.

[dchinner@redhat.com] Add a LSN field and update for current verifier
infrastructure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-21 14:58:22 -05:00
Dave Chinner
983d09ffe3 xfs: add CRC checks to the AGI
Same set of changes made to the AGF need to be made to the AGI.
This patch has a similar history to the AGF, hence a similar
sign-off chain.

Signed-off-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dgc@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-21 14:57:43 -05:00
Christoph Hellwig
77c95bba01 xfs: add CRC checks to the AGFL
Add CRC checks, location information and a magic number to the AGFL.
Previously the AGFL was just a block containing nothing but the
free block pointers.  The new AGFL has a real header with the usual
boilerplate instead, so that we can verify it's not corrupted and
written into the right place.

[dchinner@redhat.com] Added LSN field, reworked significantly to fit
into new verifier structure and growfs structure, enabled full
verifier functionality now there is a header to verify and we can
guarantee an initialised AGFL.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-21 14:55:34 -05:00
Dave Chinner
4e0e6040c4 xfs: add CRC checks to the AGF
The AGF already has some self identifying fields (e.g. the sequence
number) so we only need to add the uuid to it to identify the
filesystem it belongs to. The location is fixed based on the
sequence number, so there's no need to add a block number, either.

Hence the only additional fields are the CRC and LSN fields. These
are unlogged, so place some space between the end of the logged
fields and them so that future expansion of the AGF for logged
fields can be placed adjacent to the existing logged fields and
hence not complicate the field-derived range based logging we
currently have.

Based originally on a patch from myself, modified further by
Christoph Hellwig and then modified again to fit into the
verifier structure with additional fields by myself. The multiple
signed-off-by tags indicate the age and history of this patch.

Signed-off-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-21 14:54:46 -05:00
Christoph Hellwig
ee1a47ab0e xfs: add support for large btree blocks
Add support for larger btree blocks that contains a CRC32C checksum,
a filesystem uuid and block number for detecting filesystem
consistency and out of place writes.

[dchinner@redhat.com] Also include an owner field to allow reverse
mappings to be implemented for improved repairability and a LSN
field to so that log recovery can easily determine the last
modification that made it to disk for each buffer.

[dchinner@redhat.com] Add buffer log format flags to indicate the
type of buffer to recovery so that we don't have to do blind magic
number tests to determine what the buffer is.

[dchinner@redhat.com] Modified to fit into the verifier structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-04-21 14:53:46 -05:00