This patch is in response of the following post:
http://lwn.net/Articles/556136/
"ext4: introduce two new ioctls"
Dave chinner suggested that truncate_block_range
(which was one of the ioctls name) should be a fallocate operation
and not any fs specific ioctl, hence we add this functionality to new flags of fallocate.
This new functionality of collapsing range could be used by media editing tools
which does non linear editing to quickly purge and edit parts of a media file.
This will immensely improve the performance of these operations.
The limitation of fs block size aligned offsets can be easily handled
by media codecs which are encapsulated in a conatiner as they have to
just change the offset to next keyframe value to match the proper alignment.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
This patch implements fallocate's FALLOC_FL_COLLAPSE_RANGE for Ext4.
The semantics of this flag are following:
1) It collapses the range lying between offset and length by removing any data
blocks which are present in this range and than updates all the logical
offsets of extents beyond "offset + len" to nullify the hole created by
removing blocks. In short, it does not leave a hole.
2) It should be used exclusively. No other fallocate flag in combination.
3) Offset and length supplied to fallocate should be fs block size aligned
in case of xfs and ext4.
4) Collaspe range does not work beyond i_size.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Pull xfs fixes from Dave Chinner:
"This is the first pull request I've had to do for you, so I'm still
sorting things out. The reason I'm sending this and not Ben should be
obvious from the first commit below - SGI has stepped down from the
XFS maintainership role. As such, I'd like to take another
opportunity to thank them for their many years of effort maintaining
XFS and supporting the XFS community that they developed from the
ground up.
So I haven't had time to work things like signed tags into my
workflows yet, so this is just a repo branch I'm asking you to pull
from. And yes, I named the branch -rc4 because I wanted the fixes in
rc4, not because the branch was for merging into -rc3. Probably not
right, either.
Anyway, I should have everything sorted out by the time the next merge
window comes around. If there's anything that you don't like in the
pull req, feel free to flame me unmercifully.
The changes are fixes for recent regressions and important thinkos in
verification code:
- a log vector buffer alignment issue on ia32
- timestamps on truncate got mangled
- primary superblock CRC validation fixes and error message
sanitisation"
* 'xfs-fixes-for-3.14-rc4' of git://oss.sgi.com/xfs/xfs:
xfs: limit superblock corruption errors to actual corruption
xfs: skip verification on initial "guess" superblock read
MAINTAINERS: SGI no longer maintaining XFS
xfs: xfs_sb_read_verify() doesn't flag bad crcs on primary sb
xfs: ensure correct log item buffer alignment
xfs: ensure correct timestamp updates from truncate
This reverts commit c4a391b53a. Dave
Chinner <david@fromorbit.com> has reported the commit may cause some
inodes to be left out from sync(2). This is because we can call
redirty_tail() for some inode (which sets i_dirtied_when to current time)
after sync(2) has started or similarly requeue_inode() can set
i_dirtied_when to current time if writeback had to skip some pages. The
real problem is in the functions clobbering i_dirtied_when but fixing
that isn't trivial so revert is a safer choice for now.
CC: stable@vger.kernel.org # >= 3.13
Signed-off-by: Jan Kara <jack@suse.cz>
Given that bip->bip_iter.bi_size is decremented after bio_advance() ->
bio_integrity_advance() is called, the BUG_ON() in bio_integrity_verify()
ends up tripping in v3.14-rc1 code with the advent of immutable biovecs
in:
commit d57a5f7c66
Author: Kent Overstreet <kmo@daterainc.com>
Date: Sat Nov 23 17:20:16 2013 -0800
bio-integrity: Convert to bvec_iter
Given that there is no easy way to ascertain the original bi_size
value, go ahead and drop this BUG_ON().
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Reported-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Most code of function bio_integrity_verify and bio_integrity_generate
is the same, so introduce a help function bio_integrity_generate_verify()
to remove the duplicate code.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
A couple of "int" fields were being used as boolean values
so we can make them bitfields of one bit, and put them in
what might otherwise be a hole in the structure with 64
bit alignment.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
While handling punch-hole fallocate, it's useless to truncate page cache
before removing the range from extent tree (or block map in indirect case)
because page cache can be re-populated (by read-ahead or read(2) or mmap-ed
read) immediately after truncating page cache, but before updating extent
tree (or block map). In that case the user will see stale data even after
fallocate is completed.
Until the problem of data corruption resulting from pages backed by
already freed blocks is fully resolved, the simple thing we can do now
is to add another truncation of pagecache after punch hole is done.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Adjust the conversion specifications in a few optionally compiled debug
messages to match the return type of ext4_es_status(). Also, make a
couple of minor grammatical message edits while we're at it.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently last dqput() can race with dquot_scan_active() causing it to
call callback for an already deactivated dquot. The race is as follows:
CPU1 CPU2
dqput()
spin_lock(&dq_list_lock);
if (atomic_read(&dquot->dq_count) > 1) {
- not taken
if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
spin_unlock(&dq_list_lock);
->release_dquot(dquot);
if (atomic_read(&dquot->dq_count) > 1)
- not taken
dquot_scan_active()
spin_lock(&dq_list_lock);
if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
- not taken
atomic_inc(&dquot->dq_count);
spin_unlock(&dq_list_lock);
- proceeds to release dquot
ret = fn(dquot, priv);
- called for inactive dquot
Fix the problem by making sure possible ->release_dquot() is finished by
the time we call the callback and new calls to it will notice reference
dquot_scan_active() has taken and bail out.
CC: stable@vger.kernel.org # >= 2.6.29
Signed-off-by: Jan Kara <jack@suse.cz>
UDF has two types of files - files with data stored in inode (ICB in
UDF terminology) and files with data stored in external data blocks. We
convert file from in-inode format to external format in
udf_file_aio_write() when we find out data won't fit into inode any
longer. However the following race between two O_APPEND writes can happen:
CPU1 CPU2
udf_file_aio_write() udf_file_aio_write()
down_write(&iinfo->i_data_sem);
checks that i_size + count1 fits within inode
=> no need to convert
up_write(&iinfo->i_data_sem);
down_write(&iinfo->i_data_sem);
checks that i_size + count2 fits
within inode => no need to convert
up_write(&iinfo->i_data_sem);
generic_file_aio_write()
- extends file by count1 bytes
generic_file_aio_write()
- extends file by count2 bytes
Clearly if count1 + count2 doesn't fit into the inode, we overwrite
kernel buffers beyond inode, possibly corrupting the filesystem as well.
Fix the problem by acquiring i_mutex before checking whether write fits
into the inode and using __generic_file_aio_write() afterwards which
puts check and write into one critical section.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
Pull cgroup fixes from Tejun Heo:
"Quite a few fixes this time.
Three locking fixes, all marked for -stable. A couple error path
fixes and some misc fixes. Hugh found a bug in memcg offlining
sequence and we thought we could fix that from cgroup core side but
that turned out to be insufficient and got reverted. A different fix
has been applied to -mm"
* 'for-3.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: update cgroup_enable_task_cg_lists() to grab siglock
Revert "cgroup: use an ordered workqueue for cgroup destruction"
cgroup: protect modifications to cgroup_idr with cgroup_mutex
cgroup: fix locking in cgroup_cfts_commit()
cgroup: fix error return from cgroup_create()
cgroup: fix error return value in cgroup_mount()
cgroup: use an ordered workqueue for cgroup destruction
nfs: include xattr.h from fs/nfs/nfs3proc.c
cpuset: update MAINTAINERS entry
arm, pm, vmpressure: add missing slab.h includes
When looking at a bug report with:
> kernel: EXT4-fs: 0 scanned, 0 found
I thought wow, 0 scanned, that's odd? But it's not odd; it's printing
a variable that is initialized to 0 and never touched again.
It's never been used since the original merge, so I don't really even
know what the original intent was, either.
If anyone knows how to hook it up, speak now via patch, otherwise just
yank it so it's not making a confusing situation more confusing in
kernel logs.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext4_map_blocks() function returns the number of blocks which
satisfying the caller's request. This number of blocks requested by
the caller is specified by an unsigned integer, but the return value
of ext4_map_blocks() is a signed integer (to accomodate error codes
per the kernel's standard error signalling convention).
Historically, overflows could never happen since mballoc() will refuse
to allocate more than 2048 blocks at a time (which is something we
should fix), and if the blocks were already allocated, the fact that
there would be some number of intervening metadata blocks pretty much
guaranteed that there could never be a contiguous region of data
blocks that was greater than 2**31 blocks.
However, this is now possible if there is a file system which is a bit
bigger than 8TB, and is created using the new mke2fs hugeblock
feature, which can create a perfectly contiguous file. In that case,
if a userspace program attempted to call fallocate() on this already
fully allocated file, it's possible that ext4_map_blocks() could
return a number large enough that it would overflow a signed integer,
resulting in a ext4 thinking that the ext4_map_blocks() call had
failed with some strange error code.
Since ext4_map_blocks() is always free to return a smaller number of
blocks than what was requested by the caller, fix this by capping the
number of blocks that ext4_map_blocks() will ever try to map to 2**31
- 1. In practice this should never get hit, except by someone
deliberately trying to provke the above-described bug.
Thanks to the PaX team for asking whethre this could possibly happen
in some off-line discussions about using some static code checking
technology they are developing to find bugs in kernel code.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The lowest levels of mballoc set all of the fields of struct
ext4_free_extent except for fe_logical, since they are just trying to
find the requested free set of blocks, and the logical block hasn't
been set yet. This makes some static code checkers sad. Set it to
various different debug values, which would be useful when
debugging mballoc if these values were to ever show up due to the
parts of mballoc triyng to use ac->ac_b_ex.fe_logical before it is
properly upper layers of mballoc failing to properly set, usually by
ext4_mb_use_best_found().
Addresses-Coverity-Id: #139697
Addresses-Coverity-Id: #139698
Addresses-Coverity-Id: #139699
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The function ext4_expand_extra_isize_ea() doesn't need the size of all
of the extended attribute headers. So if we don't calculate it when
it is unneeded, it we can skip some undeeded memory references, and as
a bonus, we eliminate some kvetching by static code analysis tools.
Addresses-Coverity-Id: #741291
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Avoid false positives by static code analysis tools such as sparse and
coverity caused by the fact that we set the physical block, and then
the status in the extent_status structure. It is also more efficient
to set both of these values at once.
Addresses-Coverity-Id: #989077
Addresses-Coverity-Id: #989078
Addresses-Coverity-Id: #1080722
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Commit 3779473246 breaks the return of error codes from
ext4_ext_handle_uninitialized_extents() in ext4_ext_map_blocks(). A
portion of the patch assigns that function's signed integer return
value to an unsigned int. Consequently, negatively valued error codes
are lost and can be treated as a bogus allocated block count.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
Highlights include stable fixes for the following bugs:
- General performance regression due to NFS_INO_INVALID_LABEL being set
when the server doesn't support labeled NFS
- Hang in the RPC code due to a socket out-of-buffer race
- Infinite loop when trying to establish the NFSv4 lease
- Use-after-free bug in the RPCSEC gss code.
- nfs4_select_rw_stateid is returning with a non-zero error value on success
Other bug fixes:
- Potential memory scribble in the RPC bi-directional RPC code
- Pipe version reference leak
- Use the correct net namespace in the new NFSv4 migration code
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTBMBlAAoJEGcL54qWCgDyOakP+gKDh0VhKw8GziJFfY+6kHHI
wej86M/coNnRPUv8n7s3N5TXMoV36qisYpxbIG/WQbOn6MfR3qjto6WoP7+vsrEq
iNomtLivgEsWJePydTuIyAR/TK0du/zqP4zoPEDgdLDenucEVvkCzGIkqzg8Mddc
duknEhIq918BkXIe3hRBWuxl+pRjwZur+TY0h/OR11oodqTYHxrE37f5PvREWOmB
08hhjOFBFYlyEnCjD3I1SFmcXQxkKzvACavvbhTyF6u/37oL/QC1/DZKL5mSdOJ6
novO8sv9gIpn/RhsEMOdaeYMYM5QTvkYIJQyLpKAYyaLZ42EMbRkczwNE7C0ZWyi
F9MizDMNTie+DvsSHZPYwABTDOOQOuWPa9PO3Lyo8UtWxfmTOjYr2Rre1wyG0/+0
ywb3JiKQCtVDPnmSHhqxFfVp9XvS7D2/vz0udgKjrCLKyDC2OMMGLVa/acnrtZmz
s94QpPiqhjnRqIuKo251HtbK3AVaLBNQxBCPszieYwPm9aZ04P7mjsGg1WuhhB2v
+eSa9UkicGJwKWJWtIBr54qIAOlEXu2bXY+vio5UfbDb+5qfBHe0TmNrz5QNJ53A
x0eUBocth9VpW1cv/Rf+o30tJyZGy6Jtv8kn2hfXkJXNL3N1Gn25rD3tpb+5TtdY
b7gtHy13/oe8yi0tK91f
=tll2
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.14-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include stable fixes for the following bugs:
- General performance regression due to NFS_INO_INVALID_LABEL being
set when the server doesn't support labeled NFS
- Hang in the RPC code due to a socket out-of-buffer race
- Infinite loop when trying to establish the NFSv4 lease
- Use-after-free bug in the RPCSEC gss code.
- nfs4_select_rw_stateid is returning with a non-zero error value on
success
Other bug fixes:
- Potential memory scribble in the RPC bi-directional RPC code
- Pipe version reference leak
- Use the correct net namespace in the new NFSv4 migration code"
* tag 'nfs-for-3.14-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS fix error return in nfs4_select_rw_stateid
NFSv4: Use the correct net namespace in nfs4_update_server
SUNRPC: Fix a pipe_version reference leak
SUNRPC: Ensure that gss_auth isn't freed before its upcall messages
SUNRPC: Fix potential memory scribble in xprt_free_bc_request()
SUNRPC: Fix races in xs_nospace()
SUNRPC: Don't create a gss auth cache unless rpc.gssd is running
NFS: Do not set NFS_INO_INVALID_LABEL unless server supports labeled NFS
This patch fix spelling typo in Documentation/DocBook.
It is because .html and .xml files are generated by make htmldocs,
I have to fix a typo within the source files.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Today, if
xfs_sb_read_verify
xfs_sb_verify
xfs_mount_validate_sb
detects superblock corruption, it'll be extremely noisy, dumping
2 stacks, 2 hexdumps, etc.
This is because we call XFS_CORRUPTION_ERROR in xfs_mount_validate_sb
as well as in xfs_sb_read_verify.
Also, *any* errors in xfs_mount_validate_sb which are not corruption
per se; things like too-big-blocksize, bad version, bad magic, v1 dirs,
rw-incompat etc - things which do not return EFSCORRUPTED - will
still do the whole XFS_CORRUPTION_ERROR spew when xfs_sb_read_verify
sees any error at all. And it suggests to the user that they
should run xfs_repair, even if the root cause of the mount failure
is a simple incompatibility.
I'll submit that the probably-not-corrupted errors don't warrant
this much noise, so this patch removes the warning for anything
other than EFSCORRUPTED returns, and replaces the lower-level
XFS_CORRUPTION_ERROR with an xfs_notice().
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
When xfs_readsb() does the very first read of the superblock,
it makes a guess at the length of the buffer, based on the
sector size of the underlying storage. This may or may
not match the filesystem sector size in sb_sectsize, so
we can't i.e. do a CRC check on it; it might be too short.
In fact, mounting a filesystem with sb_sectsize larger
than the device sector size will cause a mount failure
if CRCs are enabled, because we are checksumming a length
which exceeds the buffer passed to it.
So always read twice; the first time we read with NULL
buffer ops to skip verification; then set the proper
read length, hook up the proper verifier, and give it
another go.
Once we are sure that we've got the right buffer length,
we can also use bp->b_length in the xfs_sb_read_verify,
rather than the less-trusted on-disk sectorsize for
secondary superblocks. Before this we ran the risk of
passing junk to the crc32c routines, which didn't always
handle extreme values.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
My earlier commit 10e6e65 deserves a layer or two of brown paper
bags. The logic in that commit means that a CRC failure on the
primary superblock will *never* result in an error return.
Hopefully this fixes it, so that we always return the error
if it's a primary superblock, otherwise only if the filesystem
has CRCs enabled.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJTA+AkAAoJEDaohF61QIxkIQwP/jkvIFJFuLCAn62ogIW/lnFe
nyfjqoz5UpWHEIofXzMalt0ugYby5VLHWI17FPFmAmrpTrpyHFjVt1qt8GhJ8yM3
mj3inorur9+/COozpEfqacS9bqiuH5DB35ufgA4EQTT9uwnI4AKypwQ3PogrBAxw
9TE4AedPbqAbYPAyNEZhuNnLCCf6kRIlb0lK6HWPQ7769YsSokmoHxa+Rke1NDyx
b2oABa4PHQTx0H53ppZKQok77Rg1dALeOfak6AawOeHijzRz05IEdV5ZH8MEMPTD
Yb9R6cDBMxGg6YKUYgQrE1BYQ9azqsotFFmqE0gYB376ag/R6M3NmM/Jx6bD2OkW
jmS+pI18EdJ97cRnylmasGYxI1G/3N9RhoTK7g4H5Cvmzs84Khw3cp7cN4LqUMzA
7+3rh+Gd49gvR0YY3/gjlyTVZihvS7JDiYsAJBCIiTW2UtsLPdNaT/X8K18hmZ5/
z7awKk/GPoNxUDke4NRFv+zoI+7GjorLG9DZZ/vKeIwR0DN1DQZpNGu/YGN+nHG7
YfIwAFNjBnyFsR1ev18dR0wSuSm0fGuvPx5CKWQaLdZit/2WxZNVc6oslZ08vUNn
VqE+MEkd5zKlQ5a7IXo2GUOUkuSsdW9aYXlNbbG4I/CBE2Nanu296lvbRH85bYnf
hokisNr50zX/7a41v9FD
=iBAK
-----END PGP SIGNATURE-----
Merge tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy
Pull jfs fix from David Kleikamp:
"Another ACL regression. This one more subtle"
* tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy:
jfs: set i_ctime when setting ACL
When using device mapper, there are many "bio: create slab" messages in
the log. Device mapper targets have different front_pad, so each time when
we load a target that wasn't loaded before, we allocate a slab with the
appropriate front_pad and there is associated "bio: create slab" message.
This patch removes these messages, there is no need for them.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJTA56tAAoJENNvdpvBGATwgrYQAKKWtbMT3xUHJE5Et4//0Pep
Wn1c8icQDqz+Y1C0L6ELZbY8rQaYb2cQxuf6kJAa7KJHF4lVCa8LzW+VG6S+z+Fy
+HSoi3eaS/IWzP+eZSE71u+MPN+tdURg5n305/YP40r7OH5lzsQT7LRKosifJwFc
t+B7lz6XwpgPQ4ooDEMNAHwTaxcocSYlmQqRUgiMsEwqOvskm+CWswB+piPrOzaH
kyNNIKwds6UH4hVhWEwPk7IrOS6VXe0IlgpkNZM1AOW6DWJ3SNF0qGeH5/e4pG+O
xpv8UvVej7QJWMyLGPE8z1NBCZ7P57nFOfeiag3Xuvgl/q3WB+5Ye/UZk6DNUaAF
h2SqS8D4rHuYB9n5izfIDtqJQZ69jbt2dDzTW9a88OVvN1UScaqh++2GeX2OJ/V8
W6XkOE/g1atO7SyAOaH9RD3D/8+Dp3DZZxocZ4G+/SoKETF8/RboLxyeozoRima2
FFPaqS105W7b9+QdG1avOlod+0E4wU8gw5kKKuFdAoAjFXaQ+u3R+87w5r8G5alU
BO+Trf4CTWMLZQv40HEcgDEeF1Nggd48+3GG3oiUmch9qRAZjwyk6gxLvDo1673e
CcAXpITLn3qPDVzWu/rba8uxRr95PFpXLQ9V8it8b9rWPogodoKg+A7v0Ro/n7bR
MszulL/GyP1isllyO+Ku
=XNON
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Miscellaneous ext4 bug fixes for v3.14"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
jbd2: fix use after free in jbd2_journal_start_reserved()
ext4: don't leave i_crtime.tv_sec uninitialized
ext4: fix online resize with a non-standard blocks per group setting
ext4: fix online resize with very large inode tables
ext4: don't try to modify s_flags if the the file system is read-only
ext4: fix error paths in swap_inode_boot_loader()
ext4: fix xfstest generic/299 block validity failures
There is a regression in
208d0ac 2014-01-07 nfsd4: break only delegations when appropriate
which deletes an nfserrno() call in nfsd_setattr() (by accident,
probably), and NFSD becomes ignoring an error from VFS.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
My rework of handling of notification events (namely commit 7053aee26a
"fsnotify: do not share events between notification groups") broke
sending of cookies with inotify events. We didn't propagate the value
passed to fsnotify() properly and passed 4 uninitialized bytes to
userspace instead (so it is also an information leak). Sadly I didn't
notice this during my testing because inotify cookies aren't used very
much and LTP inotify tests ignore them.
Fix the problem by passing the cookie value properly.
Fixes: 7053aee26a
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
When !defined(CONFIG_EXT4_DEBUG), mb_debug() should be defined as a
no_printk() statement instead of an empty statement in order to suppress
the following compiler warning:
fs/ext4/mballoc.c: In function ‘ext4_mb_cleanup_pa’:
fs/ext4/mballoc.c:2659:47: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
mb_debug(1, "mballoc: %u PAs left\n", count);
Signed-off-by: Patrick Palka <patrick@parcs.ath.cx>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Mark functions as static in jbd2/journal.c because they are not used
outside this file.
This eliminates the following warning in jbd2/journal.c:
fs/jbd2/journal.c:125:5: warning: no previous prototype for ‘jbd2_verify_csum_type’ [-Wmissing-prototypes]
fs/jbd2/journal.c:146:5: warning: no previous prototype for ‘jbd2_superblock_csum_verify’ [-Wmissing-prototypes]
fs/jbd2/journal.c:154:6: warning: no previous prototype for ‘jbd2_superblock_csum_set’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
"err" is zero here, there is no need to check again.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If start_this_handle() fails then it leads to a use after free of
"handle".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
(Trivial patch.)
If the code is looking at the RCU-protected pointer itself, but not
dereferencing it, the rcu_dereference() functions can be downgraded to
rcu_access_pointer(). This commit makes this downgrade in __alloc_fd(),
which simply compares the RCU-protected pointer against NULL with no
dereferencing.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Pull Ceph fixes from Sage Weil:
"We have some patches fixing up ACL support issues from Zheng and
Guangliang and a mount option to enable/disable this support. (These
fixes were somewhat delayed by the Chinese holiday.)
There is also a small fix for cached readdir handling when directories
are fragmented"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: fix __dcache_readdir()
ceph: add acl, noacl options for cephfs mount
ceph: make ceph_forget_all_cached_acls() static inline
ceph: add missing init_acl() for mkdir() and atomic_open()
ceph: fix ceph_set_acl()
ceph: fix ceph_removexattr()
ceph: remove xattr when null value is given to setxattr()
ceph: properly handle XATTR_CREATE and XATTR_REPLACE
Pull CIFS fixes from Steve French:
"Three cifs fixes, the most important fixing the problem with passing
bogus pointers with writev (CVE-2014-0069).
Two additional cifs fixes are still in review (including the fix for
an append problem which Al also discovered)"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix too big maxBuf size for SMB3 mounts
cifs: ensure that uncached writes handle unmapped areas correctly
[CIFS] Fix cifsacl mounts over smb2 to not call cifs
When FS-Cache allocates an object, the following sequence of events can
occur:
-->fscache_alloc_object()
-->cachefiles_alloc_object() [via cache->ops->alloc_object]
<--[returns new object]
-->fscache_attach_object()
<--[failed]
-->cachefiles_put_object() [via cache->ops->put_object]
-->fscache_object_destroy()
-->fscache_objlist_remove()
-->rb_erase() to remove the object from fscache_object_list.
resulting in a crash in the rbtree code.
The problem is that the object is only added to fscache_object_list on
the success path of fscache_attach_object() where it calls
fscache_objlist_add().
So if fscache_attach_object() fails, the object won't have been added to
the objlist rbtree. We do, however, unconditionally try to remove the
object from the tree.
Thanks to NeilBrown for finding this and suggesting this solution.
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: (a customer of) NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This has been this way for years, and every time I stumble across it I
lose my lunch. After coming across it for the nth time in the Coverity
results, I had to overcome the bystander effect and do something about
it.
This ignores the 79 column limit in favor of making it look like C
instead of gibberish.
The correct thing to do here would be to lose some of the indentation by
breaking this function up into several smaller ones. I might do that at
some point if I have the stomach to look at this again.
(Also some of those overlong ternary operations would likely be more
readable as regular if's)
Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If directory is fragmented, readdir() read its dirfrags one by one.
After reading all dirfrags, the corresponding dentries are sorted in
(frag_t, off) order in the dcache. If dentries of a directory are all
cached, __dcache_readdir() can use the cached dentries to satisfy
readdir syscall. But when checking if a given dentry is after the
position of readdir, __dcache_readdir() compares numerical value of
frag_t directly. This is wrong, it should use ceph_frag_compare().
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Make the 'acl' option dependent on having ACL support compiled in. Make
the 'noacl' option work even without it so that one can always ask it to
be off and not error out on mount when it is not supported.
Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
Signed-off-by: Sage Weil <sage@inktank.com>