Commit graph

35016 commits

Author SHA1 Message Date
J. Bruce Fields
208d0acc49 nfsd4: break only delegations when appropriate
As a temporary fix, nfsd was breaking all leases on unlink, link,
rename, and setattr.

Now that we can distinguish between leases and delegations, we can be
nicer and break only the delegations, and not bother lease-holders with
operations they don't care about.

And we get to delete some code while we're at it.

Note that in the presence of delegations the vfs calls here all return
-EWOULDBLOCK instead of blocking, so nfsd threads will not get stuck
waiting for delegation returns.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-07 16:01:15 -05:00
Theodore Ts'o
8c9367fd9b ext4: don't pass freed handle to ext4_walk_page_buffers
This is harmless, since ext4_walk_page_buffers only passes the handle
onto the callback function, and in this call site the function in
question, bput_one(), doesn't actually use the handle.  But there's no
point passing in an invalid handle, and it creates a Coverity warning,
so let's just clean it up.

Addresses-Coverity-Id: #1091168

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-01-07 13:08:03 -05:00
Theodore Ts'o
09c455aaa8 ext4: avoid clearing beyond i_blocks when truncating an inline data file
A missing cast means that when we are truncating a file which is less
than 60 bytes, we don't clear the correct area of memory, and in fact
we can end up truncating the next inode in the inode table, or worse
yet, some other kernel data structure.

Addresses-Coverity-Id: #751987

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2014-01-07 12:58:19 -05:00
Masanari Iida
8faaaead62 treewide: fix comments and printk msgs
This patch fixed several typo in printk from various
part of kernel source.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-01-07 15:06:07 +01:00
Bob Peterson
62e96cf819 GFS2: Increase i_writecount during gfs2_setattr_chown
This patch calls get_write_access in function gfs2_setattr_chown,
which merely increases inode->i_writecount for the duration of the
function. That will ensure that any file closes won't delete the
inode's multi-block reservation while the function is running.
It also ensures that a multi-block reservation exists when needed
for quota change operations during the chown.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-07 09:43:51 +00:00
Linus Torvalds
ef350bb7c5 Fix a regression introduced in v3.13-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQIcBAABCAAGBQJSyv4GAAoJENNvdpvBGATwwG4QANhHQupchnt4vnyetcvTZvs3
 x0BlnZnGDwzBqRhiZa8tORABn8z/8JuzsepqZOjKfmuULDHO4hjv42DSYhiBf7l7
 rIjrPhDuSoP6aVIiaLxllaVe+d18/fLUeoJ4bw2/Np9lLTjALA7j7zpVfy9RsIrr
 mreIh5Nu7ay8R/5Mts7ApJwQTtHHEOWm+NcsisZIFoCyuJKGDyeWOutlpGcgSn2T
 W3pUTF/iuN3trXAr+VYWfn/yqewWYlQ9hEifFYiqef7dEo9ITgO7Gn0Ig12PhUcX
 KuaRcAXsx+ynB3gLjsocCPHfHRouqeEN1jzfbVLn9GIHlgU9JEYAhyZB7eKvjoAH
 kf7IKWEOOVVQcRpJLmr3cXiY3ut5gqyytwftc4lntJG5nLbDAw3MihScSRdm1DBR
 ELHD60IDHvFtGGeCqgu11MCXoXBM2HG7iQWnCQfsktlAWPSmnAtqZO6vYzPtJHvD
 iMv2WuIBlEB0Qx/JJwi27vCb8PXmfj3mT/mvr8UhlMSl1W5cBG3nXOmbiZC33tmE
 nlB0j1UC21VFOUoA9BxYP5imfpUvD5fvGCmr6QE/mw9Y675q4QIWGo8P9EJiaCBL
 zx1VbgofNLs/DoCPzMlqR9qrO4w2SwWOuwY1dJq7EkP3cfQd2mv/cq3UrqDEtxju
 otnvt2Z2dadntRq3vGv3
 =6Ghs
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bugfix from Ted Ts'o:
 "Fix a regression introduced in v3.13-rc6"

* tag 'ext4_for_linus_stable' of http://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix bigalloc regression
2014-01-07 08:22:42 +08:00
Kinglong Mee
60810e5489 NFSD: Fix a memory leak in nfsd4_create_session
If failed after calling alloc_session but before init_session, nfsd will call __free_session to
free se_slots in session. But, session->se_fchannel.maxreqs is not initialized (value is zero).
So that, the memory malloced for slots will be lost in free_session_slots for maxreqs is zero.

This path sets the information for channel in alloc_session after mallocing slots succeed,
instead in init_session.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-06 15:33:54 -05:00
Jie Liu
85dd0707f0 xfs: fix off-by-one error in xfs_attr3_rmt_verify
With CRC check is enabled, if trying to set an attributes value just
equal to the maximum size of XATTR_SIZE_MAX would cause the v3 remote
attr write verification procedure failure, which would yield the back
trace like below:

<snip>
XFS (sda7): Internal error xfs_attr3_rmt_write_verify at line 191 of file fs/xfs/xfs_attr_remote.c
<snip>
Call Trace:
[<ffffffff816f0042>] dump_stack+0x45/0x56
[<ffffffffa0d99c8b>] xfs_error_report+0x3b/0x40 [xfs]
[<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffffa0d99ce5>] xfs_corruption_error+0x55/0x80 [xfs]
[<ffffffffa0dbef6b>] xfs_attr3_rmt_write_verify+0x14b/0x1a0 [xfs]
[<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d96edd>] _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffff81184cda>] ? vm_map_ram+0x31a/0x460
[<ffffffff81097230>] ? wake_up_state+0x20/0x20
[<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d9726b>] xfs_buf_iorequest+0x6b/0xc0 [xfs]
[<ffffffffa0d97315>] xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d97906>] xfs_bwrite+0x46/0x80 [xfs]
[<ffffffffa0dbfa94>] xfs_attr_rmtval_set+0x334/0x490 [xfs]
[<ffffffffa0db84aa>] xfs_attr_leaf_addname+0x24a/0x410 [xfs]
[<ffffffffa0db8893>] xfs_attr_set_int+0x223/0x470 [xfs]
[<ffffffffa0db8b76>] xfs_attr_set+0x96/0xb0 [xfs]
[<ffffffffa0db13b2>] xfs_xattr_set+0x42/0x70 [xfs]
[<ffffffff811df9b2>] generic_setxattr+0x62/0x80
[<ffffffff811e0213>] __vfs_setxattr_noperm+0x63/0x1b0
[<ffffffff81307afe>] ? evm_inode_setxattr+0xe/0x10
[<ffffffff811e0415>] vfs_setxattr+0xb5/0xc0
[<ffffffff811e054e>] setxattr+0x12e/0x1c0
[<ffffffff811c6e82>] ? final_putname+0x22/0x50
[<ffffffff811c708b>] ? putname+0x2b/0x40
[<ffffffff811cc4bf>] ? user_path_at_empty+0x5f/0x90
[<ffffffff811bdfd9>] ? __sb_start_write+0x49/0xe0
[<ffffffff81168589>] ? vm_mmap_pgoff+0x99/0xc0
[<ffffffff811e07df>] SyS_setxattr+0x8f/0xe0
[<ffffffff81700c2d>] system_call_fastpath+0x1a/0x1f

Tests:
    setfattr -n user.longxattr -v `perl -e 'print "A"x65536'` testfile

This patch fix it to check the remote EA size is greater than the
XATTR_SIZE_MAX rather than more than or equal to it, because it's
valid if the specified EA value size is equal to the limitation as
per VFS setxattr interface.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2014-01-06 13:37:35 -06:00
Yongqiang Yang
65eddb56f4 ext4: ext4_inode_is_fast_symlink should use EXT4_CLUSTER_SIZE
Can be reproduced by xfstests 62 with bigalloc and 128bit size inode.

Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2014-01-06 14:06:18 -05:00
Yongqiang Yang
9e740568bc ext4: fix a typo in extents.c
Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2014-01-06 14:05:23 -05:00
David Howells
a34e15cc35 ext4: use %pd printk specificer
Use the new %pd printk() specifier in Ext4 to replace passing of
dentry name or dentry name and name length * 2 with just passing the
dentry.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
cc: Andreas Dilger <adilger.kernel@dilger.ca>
cc: linux-ext4@vger.kernel.org
2014-01-06 14:04:23 -05:00
Jan Kara
52e4477758 ext4: standardize error handling in ext4_da_write_inline_data_begin()
The function has a bit non-standard (for ext4) error recovery in that it
used a mix of 'out' labels and testing for 'handle' being NULL. There
isn't a good reason for that in the function so clean it up a bit.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-01-06 14:03:23 -05:00
Jan Kara
bc0ca9df3b ext4: retry allocation when inline->extent conversion failed
Similarly as other ->write_begin functions in ext4, also
ext4_da_write_inline_data_begin() should retry allocation if the
conversion failed because of ENOSPC. This avoids returning ENOSPC
prematurely because of uncommitted block deletions.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-01-06 14:02:23 -05:00
Zheng Liu
9cb00419fa ext4: enable punch hole for bigalloc
After applied this commit (d23142c6), ext4 has supported punch hole for
a file system with bigalloc feature.  But we forgot to enable it.  This
commit fixes it.

Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-01-06 14:01:23 -05:00
Eric Whitney
d0abafac8c ext4: fix bigalloc regression
Commit f5a44db5d2 introduced a regression on filesystems created with
the bigalloc feature (cluster size > blocksize).  It causes xfstests
generic/006 and /013 to fail with an unexpected JBD2 failure and
transaction abort that leaves the test file system in a read only state.
Other xfstests run on bigalloc file systems are likely to fail as well.

The cause is the accidental use of a cluster mask where a cluster
offset was needed in ext4_ext_map_blocks().

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
2014-01-06 14:00:23 -05:00
Kinglong Mee
73ca65904c nfsd: get rid of unused function definition
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-06 12:23:33 -05:00
Kinglong Mee
3ff69309fe Define op_iattr for nfsd4_open instead using macro
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-06 12:23:32 -05:00
Kinglong Mee
ff88825fbb NFSD: fix compile warning without CONFIG_NFSD_V3
Without CONFIG_NFSD_V3, compile will get warning as,

fs/nfsd/nfssvc.c: In function 'nfsd_svc':
>> fs/nfsd/nfssvc.c:246:60: warning: array subscript is above array bounds [-Warray-bounds]
        return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
                                                               ^

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-06 12:23:31 -05:00
Steven Whitehouse
2b47dad866 GFS2: Remember directory insert point
When we look to see if there is enough space to add a dir
entry without allocation, we have then been repeating the
same search later when we do the actual insertion. This
patch caches the details of the location in the gfs2_diradd
structure, so that we do not have to repeat the search.

This will provide a performance improvement which will be
greater as the size of the directory increases.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-06 12:49:43 +00:00
Steven Whitehouse
534cf9ca55 GFS2: Consolidate transaction blocks calculation for dir add
There are three cases where we need to calculate the number of
blocks to reserve in a transaction involving linking an inode
into a directory. The one in rename is a bit more complicated,
but the basis of it is the same as for link and create. So it
makes sense to move this calculation into a single function
rather than repeating it three times.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-06 12:03:05 +00:00
Steven Whitehouse
3c1c0ae1db GFS2: Add directory addition info structure
The intent is that this structure will hold the information
required when adding entries to a directory (linking). To
start with, it will contain only the number of blocks which
are required to link the new entry into the directory. The
current calculation returns either 0 or the maximim number of
blocks that can ever be requested by such a transaction.

The intent is that in a later patch, we can update the dir
code to calculate this value more accurately. In addition
further patches will also add further fields to the new
structure to increase its utility.

In addition this patch fixes a bug where the link used during
inode creation was adding requesting too many blocks in
some cases. This is harmless unless the fs is close to being
full.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-06 11:28:41 +00:00
Chao Yu
04a17fb17f f2fs: avoid to read inline data except first page
Here is a case which could read inline page data not from first page.

1. write inline data
2. lseek to offset 4096
3. read 4096 bytes from offset 4096
	(read_inline_data read inline data page to non-first page,
	And previously VFS has add this page to page cache)
4. ftruncate offset 8192
5. read 4096 bytes from offset 4096
	(we meet this updated page with inline data in cache)

So we should leave this page with inited data and uptodate flag
for this case.

Change log from v1:
 o fix a deadlock bug

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:22 +09:00
Chao Yu
18309aaa41 f2fs: avoid to left uninitialized data in page when read inline data
Change log from v1:
 o reduce unneeded memset in __f2fs_convert_inline_data

>From 58796be2bd2becbe8d52305210fb2a64e7dd80b6 Mon Sep 17 00:00:00 2001
From: Chao Yu <chao2.yu@samsung.com>
Date: Mon, 30 Dec 2013 09:21:33 +0800
Subject: [PATCH] f2fs: avoid to left uninitialized data in page when read
 inline data

We left uninitialized data in the tail of page when we read an inline data
page. So let's initialize left part of the page excluding inline data region.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:22 +09:00
shifei10.ge
a225dca394 f2fs: fix truncate_partial_nodes bug
The truncate_partial_nodes puts pages incorrectly in the following two cases.
Note that the value for argc 'depth' can only be 2 or 3.
Please see truncate_inode_blocks() and truncate_partial_nodes().

1) An err is occurred in the first 'for' loop
  When err is occurred with depth = 2, pages[0] is invalid, so this page doesn't
  need to be put. There is no problem, however, when depth is 3, it doesn't put
  the pages correctly where pages[0] is valid and pages[1] is invalid.
  In this case, depth is set to 2 (ref to statemnt depth = i + 1), and then
  'goto fail'.
  In label 'fail', for (i = depth - 3; i >= 0; i--) cannot meet the condition
  because i = -1, so pages[0] cann't be put.

2) An err happened in the second 'for' loop
  Now we've got pages[0] with depth = 2, or we've got pages[0] and pages[1]
  with depth = 3. When an err is detected, we need 'goto fail' to put such
  the pages.
  When depth is 2, in label 'fail', for (i = depth - 3; i >= 0; i--) cann't
  meet the condition because i = -1, so pages[0] cann't be put.
  When depth is 3, in label 'fail', for (i = depth - 3; i >= 0; i--) can
  only put pages[0], pages[1] also cann't be put.

Note that 'depth' has been changed before first 'goto fail' (ref to statemnt
depth = i + 1), so passing this modified 'depth' to the tracepoint,
trace_f2fs_truncate_partial_nodes, is also incorrect.

Signed-off-by: Shifei Ge <shifei10.ge@samsung.com>
[Jaegeuk Kim: modify the description and fix one bug]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:21 +09:00
Jaegeuk Kim
a8865372a8 f2fs: handle errors correctly during f2fs_reserve_block
The get_dnode_of_data nullifies inode and node page when error is occurred.

There are two cases that passes inode page into get_dnode_of_data().

1. make_empty_dir()
    -> get_new_data_page()
      -> f2fs_reserve_block(ipage)
	-> get_dnode_of_data()

2. f2fs_convert_inline_data()
    -> __f2fs_convert_inline_data()
      -> f2fs_reserve_block(ipage)
	-> get_dnode_of_data()

This patch adds correct error handling codes when get_dnode_of_data() returns
an error.

At first, f2fs_reserve_block() calls f2fs_put_dnode() whenever reserve_new_block
returns an error.
So, the rule of f2fs_reserve_block() is to nullify inode page when there is any
error internally.

Finally, two callers of f2fs_reserve_block() should call f2fs_put_dnode()
appropriately if they got an error since successful f2fs_reserve_block().

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:21 +09:00
Jaegeuk Kim
1e1bb4baf1 f2fs: add inline_data recovery routine
This patch adds a inline_data recovery routine with the following policy.

[prev.] [next] of inline_data flag
   o       o  -> recover inline_data
   o       x  -> remove inline_data, and then recover data blocks
   x       o  -> remove inline_data, and then recover inline_data
   x       x  -> recover data blocks

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:20 +09:00
Jaegeuk Kim
0dbdc2ae9b f2fs: add the number of inline_data files to status info
This patch adds the number of inline_data files into the status information.
Note that the number is reset whenever the filesystem is newly mounted.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:20 +09:00
Jaegeuk Kim
9e09fc855d f2fs: refactor f2fs_convert_inline_data
Change log from v1:
 o handle NULL pointer of grab_cache_page_write_begin() pointed by Chao Yu.

This patch refactors f2fs_convert_inline_data to check a couple of conditions
internally for deciding whether it needs to convert inline_data or not.

So, the new f2fs_convert_inline_data initially checks:
1) f2fs_has_inline_data(), and
2) the data size to be changed.

If the inode has inline_data but the size to fill is less than MAX_INLINE_DATA,
then we don't need to convert the inline_data with data allocation.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:19 +09:00
Jaegeuk Kim
26f466f4a9 f2fs: call f2fs_put_page at the error case
In f2fs_write_begin(), if f2fs_conver_inline_data() returns an error like
-ENOSPC, f2fs should call f2fs_put_page().
Otherwise, it is remained as a locked page, resulting in the following bug.

[<ffffffff8114657e>] sleep_on_page+0xe/0x20
[<ffffffff81146567>] __lock_page+0x67/0x70
[<ffffffff81157d08>] truncate_inode_pages_range+0x368/0x5d0
[<ffffffff81157ff5>] truncate_inode_pages+0x15/0x20
[<ffffffff8115804b>] truncate_pagecache+0x4b/0x70
[<ffffffff81158082>] truncate_setsize+0x12/0x20
[<ffffffffa02a1842>] f2fs_setattr+0x72/0x270 [f2fs]
[<ffffffff811cdae3>] notify_change+0x213/0x400
[<ffffffff811ab376>] do_truncate+0x66/0xa0
[<ffffffff811ab541>] vfs_truncate+0x191/0x1b0
[<ffffffff811ab5bc>] do_sys_truncate+0x5c/0xa0
[<ffffffff811ab78e>] SyS_truncate+0xe/0x10
[<ffffffff81756052>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:19 +09:00
Jaegeuk Kim
8230a0a49f f2fs: convert inline_data for punch_hole
In the punch_hole(), let's convert inline_data all the time for simplicity and
to avoid potential deadlock conditions.
It is pretty much not a big deal to do this.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-06 16:42:12 +09:00
Toralf Förster
16a6ddc709 point to the right include file in a comment (left over from a9004abc3)
Signed-off-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-05 15:51:35 -05:00
Niels de Vos
1e8968c5b0 NFS: dprintk() should not print negative fileids and inode numbers
A fileid in NFS is a uint64. There are some occurrences where dprintk()
outputs a signed fileid. This leads to confusion and more difficult to
read debugging (negative fileids matching positive inode numbers).

Signed-off-by: Niels de Vos <ndevos@redhat.com>
CC: Santosh Pradhan <spradhan@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-05 15:51:23 -05:00
Alexander Aring
a8c2275493 nfs: fix dead code of ipv6_addr_scope
The correct way to check on IPV6_ADDR_SCOPE_LINKLOCAL is to check with
the ipv6_addr_src_scope function.

Currently this can't be work, because ipv6_addr_scope returns a int with
a mask of IPV6_ADDR_SCOPE_MASK (0x00f0U) and IPV6_ADDR_SCOPE_LINKLOCAL
is 0x02. So the condition is always false.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-05 15:38:21 -05:00
Kinglong Mee
8ef667140c NFSD: Don't start lockd when only NFSv4 is running
When starting without nfsv2 and nfsv3, nfsd does not need to start
lockd (and certainly doesn't need to fail because lockd failed to
register with the portmapper).

Reported-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:50 -05:00
Kinglong Mee
7e55b59b2f SUNRPC/NFSD: Support a new option for ignoring the result of svc_register
NFSv4 clients can contact port 2049 directly instead of needing the
portmapper.

Therefore a failure to register to the portmapper when starting an
NFSv4-only server isn't really a problem.

But Gareth Williams reports that an attempt to start an NFSv4-only
server without starting portmap fails:

  #rpc.nfsd -N 2 -N 3
  rpc.nfsd: writing fd to kernel failed: errno 111 (Connection refused)
  rpc.nfsd: unable to set any sockets for nfsd

Add a flag to svc_version to tell the rpc layer it can safely ignore an
rpcbind failure in the NFSv4-only case.

Reported-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:49 -05:00
Kinglong Mee
8a891633b8 NFSD: fix bad length checking for backchannel
the length for backchannel checking should be multiplied by sizeof(__be32).

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:48 -05:00
Kinglong Mee
f403e450e8 NFSD: fix a leak which can cause CREATE_SESSION failures
check_forechannel_attrs gets drc memory, so nfsd must put it when
check_backchannel_attrs fails.

After many requests with bad back channel attrs, nfsd will deny any
client's CREATE_SESSION forever.

A new test case named CSESS29 for pynfs will send in another mail.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:47 -05:00
Kinglong Mee
2ce02b6b6c Add missing recording of back channel attrs in nfsd4_session
commit 5b6feee960 forgot
recording the back channel attrs in nfsd4_session.

nfsd just check the back channel attars by check_backchannel_attrs,
but do not  record it in nfsd4_session in the latest kernel.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:46 -05:00
Zhouyi Zhou
2072552c33 jffs2: NULL return of kmem_cache_zalloc should be handled
NULL return of kmem_cache_zalloc should be handled in jffs2_alloc_xattr_datum
and jff2_alloc_xattr_ref.

Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2014-01-03 11:22:22 -08:00
Kinglong Mee
dfeecc829e nfsd: get rid of unused macro definition
Since defined in Linux-2.6.12-rc2, READTIME has not been used.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 13:44:24 -05:00
Kinglong Mee
eba1c99ce4 nfsd: clean up unnecessary temporary variable in nfsd4_decode_fattr
host_err was only used for nfs4_acl_new.
This patch delete it, and return nfserr_jukebox directly.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 13:44:23 -05:00
Kinglong Mee
43212cc7df nfsd: using nfsd4_encode_noop for encoding destroy_session/free_stateid
Get rid of the extra code, using nfsd4_encode_noop for encoding destroy_session and free_stateid.
And, delete unused argument (fr_status) int nfsd4_free_stateid.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 13:44:22 -05:00
Kinglong Mee
a9f7b4a06c nfsd: clean up an xdr reserved space calculation
We should use XDR_LEN to calculate reserved space in case the oid is not
a multiple of 4.

RESERVE_SPACE actually rounds up for us, but it's probably better to be
careful here.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 13:44:12 -05:00
Steven Whitehouse
70d4ee94b3 GFS2: Use only a single address space for rgrps
Prior to this patch, GFS2 had one address space for each rgrp,
stored in the glock. This patch changes them to use a single
address space in the super block. This therefore saves
(sizeof(struct address_space) * nr_of_rgrps) bytes of memory
and for large filesystems, that can be significant.

It would be nice to be able to do something similar and merge
the inode metadata address space into the same global
address space. However, that is rather more complicated as the
on-disk location doesn't have a 1:1 mapping with the inodes in
general. So while it could be done, it will be a more complicated
operation as it requires changing a lot more code paths.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-03 10:01:50 +00:00
Steven Whitehouse
7005c3e4ae GFS2: Use range based functions for rgrp sync/invalidation
Each rgrp header is represented as a single extent on disk, so we
can calculate the position within the address space, since we are
using address spaces mapped 1:1 to the disk. This means that it
is possible to use the range based versions of filemap_fdatawrite/wait
and for invalidating the page cache.

Our eventual intent is to then be able to merge the address spaces
used for rgrps into a single address space, rather than to have
one for each glock, saving memory and reducing complexity.

Since during umount, the rgrp structures are disposed of before
the glocks, we need to store the extent information in the glock
so that is is available for a final invalidation. This patch uses
a field which is otherwise unused in rgrp glocks to do that, so
that we do not have to expand the size of a glock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-03 10:00:31 +00:00
Steven Whitehouse
7de41d36ff GFS2: Remove test which is always true
Since gfs2_inplace_reserve() is always called with a valid
alloc parms structure, there is no need to test for this
within the function itself - and in any case, after we've
all ready dereferenced it anyway.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-03 09:59:30 +00:00
Steven Whitehouse
7aed98fb1d GFS2: Remove gfs2_quota_change_host structure
There is only one place this is used, when reading in the quota
changes at mount time. It is not really required and much
simpler to just convert the fields from the on-disk structure
as required.

There should be no functional change as a result of this patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-03 09:59:05 +00:00
Steven Whitehouse
e4f2920625 GFS2: Clean up releasepage
For historical reasons, we drop and retake the log lock in ->releasepage()
however, since there is no reason why we cannot hold the log lock over
the whole function, this allows some simplification. In particular,
pinning a buffer is only ever done under the log lock, so it is possible
here to remove the test for pinned buffers in the second loop, since it
is impossible for that to happen (it is also tested in the first loop).

As a result, two tests made later in the second loop become constants
and can also be reduced to the only possible branch. So the net result
is to remove various bits of unreachable code and make this more
readable.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-03 09:58:41 +00:00
Bob Peterson
5ea5050cec GFS2: Implement a "rgrp has no extents longer than X" scheme
With the preceding patch, we started accepting block reservations
smaller than the ideal size, which requires a lot more parsing of the
bitmaps. To reduce the amount of bitmap searching, this patch
implements a scheme whereby each rgrp keeps track of the point
at this multi-block reservations will fail.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-03 09:58:08 +00:00
Bob Peterson
1330edbeaf GFS2: Drop inadequate rgrps from the reservation tree
This is just basically a resend of a patch I posted earlier.
It didn't change from its original, except in diff offsets, etc:

This patch fixes a bug in the GFS2 block allocation code. The problem
starts if a process already has a multi-block reservation, but for
some reason, another process disqualifies it from further allocations.
For example, the other process might set on the GFS2_RDF_ERROR bit.
The process holding the reservation jumps to label skip_rgrp, but
that label comes after the code that removes the reservation from the
tree. Therefore, the no longer usable reservation is not removed from
the rgrp's reservations tree; it's lost. Eventually, the lost reservation
causes the count of reserved blocks to get off, and eventually that
causes a BUG_ON(rs->rs_rbm.rgd->rd_reserved < rs->rs_free) to trigger.
This patch moves the call to after label skip_rgrp so that the
disqualified reservation is properly removed from the tree, thus keeping
the rgrp rd_reserved count sane.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-01-03 09:57:31 +00:00