Commit graph

39459 commits

Author SHA1 Message Date
Jaegeuk Kim
f7ef9b83b5 f2fs: introduce macros to convert bytes and blocks in f2fs
This patch adds two macros for transition between byte and block offsets.
Currently, f2fs only supports 4KB blocks, so use the default size for now.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:48 -08:00
Jaegeuk Kim
da17eece03 f2fs: call set_buffer_new for get_block
This patch fixes wrong handling of buffer_new flag in get_block.
If f2fs allocates new blocks and mapped buffer_head, it needs to set buffer_new
for the bh_result.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:47 -08:00
Jaegeuk Kim
aaf9607516 f2fs: check node page contents all the time
In get_node_page, if the page is up-to-date, we assumed that the page was not
reclaimed at all.
But, sometimes it was reported that its contents was missing.
So, just for sure, let's check its mapping and contents.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:46 -08:00
Chao Yu
2e023174a8 f2fs: avoid data offset overflow when lseeking huge file
xfstest generic/285 complains our issue in lseeking huge file.

Here is the detail output of generic/285:
"./check -f2fs tests/generic/285
Ran: generic/285
Failures: generic/285
Failed 1 of 1 tests

10. Test a huge file for offset overflow
10.01 SEEK_HOLE expected 65536 or 8589934592, got 65536.          succ
10.02 SEEK_HOLE expected 65536 or 8589934592, got 65536.          succ
10.03 SEEK_DATA expected 0 or 0, got 0.                           succ
10.04 SEEK_DATA expected 1 or 1, got 1.                           succ
10.05 SEEK_HOLE expected 8589934592 or 8589934592, got 0.         FAIL
10.06 SEEK_DATA expected 8589869056 or 8589869056, got 8589869056. succ
10.07 SEEK_DATA expected 8589869057 or 8589869057, got 8589869057. succ
10.08 SEEK_DATA expected 8589869056 or 8589869056, got 4294901760. FAIL"

The reason of this issue is:
We will calculate current offset through left shifting page-offset with
PAGE_CACHE_SHIFT bits, but our page-offset is a type of unsigned long, its size
is 4 bytes in 32-bits machine.

So if our page-offset is bigger than (1 << 32 / pagesize - 1), result of left
shifting will overflow.

Let's fix this issue by casting type of page-offset to type of current offset:
loff_t.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:46 -08:00
Chao Yu
560d4672e2 f2fs: fix to use highmem for pages of newly created directory
In commit a78186ebe5 ("f2fs: use highmem for directory pages"), we have set
__GFP_HIGHMEM into dir mapping's gfp flag in f2fs_iget, so high address memory
could be used for these existing dir's page.

But we forgot to set flag for newly created dir, due to this reason, our newly
created dir pages could not be allocated from high address memory. Fix it.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:45 -08:00
Jaegeuk Kim
bba681cbb2 f2fs: introduce a batched trim
This patch introduces a batched trimming feature, which submits split discard
commands.

This is to avoid long latency due to huge trim commands.
If fstrim was triggered ranging from 0 to the end of device, we should lock
all the checkpoint-related mutexes, resulting in very long latency.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:44 -08:00
Chao Yu
487261f39b f2fs: merge {invalidate,release}page for meta/node/data pages
This patch merges ->{invalidate,release}page function for meta/node/data pages.

After this, duplication of codes could be removed.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:44 -08:00
Jaegeuk Kim
d24bdcbfc6 f2fs: show the number of writeback pages in stat
This patch adds the # of writeback pages in stat info.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:43 -08:00
Jaegeuk Kim
f68daeebba f2fs: keep PagePrivate during releasepage
If PagePrivate is removed by releasepage, f2fs loses counting dirty pages.

e.g., try_to_release_page will not release page when the page is dirty,
but our releasepage removes PagePrivate.

    [<ffffffff81188d75>] try_to_release_page+0x35/0x50
    [<ffffffff811996f9>] invalidate_inode_pages2_range+0x2f9/0x3b0
    [<ffffffffa02a7f54>] ? truncate_blocks+0x384/0x4d0 [f2fs]
    [<ffffffffa02b7583>] ? f2fs_direct_IO+0x283/0x290 [f2fs]
    [<ffffffffa02b7fb0>] ? get_data_block_fiemap+0x20/0x20 [f2fs]
    [<ffffffff8118aa53>] generic_file_direct_write+0x163/0x170
    [<ffffffff8118ad06>] __generic_file_write_iter+0x2a6/0x350
    [<ffffffff8118adef>] generic_file_write_iter+0x3f/0xb0
    [<ffffffff81203081>] new_sync_write+0x81/0xb0
    [<ffffffff81203837>] vfs_write+0xb7/0x1f0
    [<ffffffff81204459>] SyS_write+0x49/0xb0
    [<ffffffff817c286d>] system_call_fastpath+0x16/0x1b

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:42 -08:00
Jaegeuk Kim
081d78c2fc f2fs: should fail mount when trying to recover data on read-only dev
If device is read-only, we should not proceed data recovery.
But, if the previous checkpoint was done by normal clean shutdown, it's safe to
proceed the recovery, since there will be no data to be recovered.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:42 -08:00
Jaegeuk Kim
119ee91445 f2fs: split UMOUNT and FASTBOOT flags
This patch adds FASTBOOT flag into checkpoint as follows.

 - CP_UMOUNT_FLAG is set when system is umounted.
 - CP_FASTBOOT_FLAG is set when intermediate checkpoint having node summaries
   was done.

So, if you get CP_UMOUNT_FLAG from checkpoint, the system was umounted cleanly.
Instead, if there was sudden-power-off, you can get CP_FASTBOOT_FLAG or nothing.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:41 -08:00
Jaegeuk Kim
11504a8e7e f2fs: avoid write_checkpoint if f2fs is mounted readonly
Do not change any partition when f2fs is changed to readonly mode.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:40 -08:00
Jaegeuk Kim
2d834bf9ac f2fs: support norecovery mount option
This patch adds a mount option, norecovery, which is mostly same as
disable_roll_forward. The only difference is that norecovery should be activated
with read-only mount option.

This can be used when user wants to check whether f2fs is mountable or not
without any recovery process. (e.g., xfstests/200)

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:40 -08:00
Jaegeuk Kim
dabc4a5c60 f2fs: fix not to drop mount options when retrying fill_super
If wrong mount option was requested, f2fs tries to fill_super again.
But, during the next trial, f2fs has no valid mount options, since
parse_options deleted all the separators in the original string.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:39 -08:00
Chao Yu
caf0047e7e f2fs: merge flags in struct f2fs_sb_info
Currently, there are several variables with Boolean type as below:

struct f2fs_sb_info {
...
	int s_dirty;
	bool need_fsck;
	bool s_closing;
...
	bool por_doing;
...
}

For this there are some issues:
1. there are some space of f2fs_sb_info is wasted due to aligning after Boolean
   type variables by compiler.
2. if we continuously add new flag into f2fs_sb_info, structure will be messed
   up.

So in this patch, we try to:
1. switch s_dirty to Boolean type variable since it has two status 0/1.
2. merge s_dirty/need_fsck/s_closing/por_doing variables into s_flag.
3. introduce an enum type which can indicate different states of sbi.
4. use new introduced universal interfaces is_sbi_flag_set/{set,clear}_sbi_flag
   to operate flags for sbi.

After that, above issues will be fixed.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:38 -08:00
Chao Yu
88dd893419 f2fs: clean up {in,de}create_sleep_time
Use pointer parameter @wait to pass result in {in,de}create_sleep_time for
cleanup.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:37 -08:00
Chao Yu
feeb0debfb f2fs: make truncate_inline_date static
1. make truncate_inline_date static;
2. remove parameter @from of truncate_inline_date as callers only pass zero.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:37 -08:00
Kinglong Mee
3b6709b771 f2fs: fix a bug of inheriting default ACL from parent
Introduced by a6dda0e63e
"f2fs: use generic posix ACL infrastructure".

When testing default acl, gets in recent kernel (3.19.0-rc5),
user::rwx
group::r-x
other::r-x
default:user::rwx
default:group::r-x
default:group:root:rwx
default😷:rwx
default:other::r-x

]# getfacl testdir/
user::rwx
group::rwx
                // missing an acl "group:root:rwx" inherited from parent
other::r-x
default:user::rwx
default:group::r-x
default:group:root:rwx
default😷:rwx
default:other::r-x

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:36 -08:00
Chao Yu
f28e503429 f2fs: use f2fs_radix_tree_insert to clean codes
No modification in functionality, just clean codes with f2fs_radix_tree_insert.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:35 -08:00
Chao Yu
d49f3e8902 f2fs: add F2FS_IOC_GETVERSION support
In this patch we add the FS_IOC_GETVERSION ioctl for getting i_generation from
inode, after that, users can list file's generation number by using "lsattr -v".

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:35 -08:00
Jaegeuk Kim
bc4a1f873b f2fs: leave comment for code readability
During the recovery, any xattr blocks should not be found, since they are
written into cold log, not the warm node chain.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:34 -08:00
Chao Yu
1601839e9e f2fs: fix to release count of meta page in ->invalidatepage
We will encounter deadloop in below scenario:

1. increase page count for F2FS_DIRTY_META type in following path:
->recover_fsync_data
  ->recover_data
    ->do_recover_data
      ->recover_data_page
        ->change_curseg
          ->write_sum_page
            ->set_page_dirty
2. fail in recover_data()
3. invalidate meta pages in truncate_inode_pages_final without decreasing page
   count.
4. deadloop when sync_meta_pages as page count will always be non-zero.

message:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!

 [<c1129a37>] pagevec_lookup_tag+0x27/0x30
 [<f0e774c7>] sync_meta_pages+0x87/0x160 [f2fs]
 [<f0e86dd9>] recover_fsync_data+0xeb9/0xf10 [f2fs]
 [<f0e75398>] f2fs_fill_super+0x888/0x980 [f2fs]
 [<c11733ca>] mount_bdev+0x16a/0x1a0
 [<f0e7180f>] f2fs_mount+0x1f/0x30 [f2fs]
 [<c1173da6>] mount_fs+0x36/0x170
 [<c118b6f5>] vfs_kern_mount+0x55/0xe0
 [<c118d63f>] do_mount+0x1df/0x9f0
 [<c118e110>] SyS_mount+0x70/0xb0
 [<c15a0c48>] sysenter_do_call+0x12/0x12

To avoid page count leak, let's add ->invalidatepage and ->releasepage in
f2fs_meta_aops as f2fs_node_aops to release meta page count correctly.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:33 -08:00
Jaegeuk Kim
85dc2f2c6c f2fs: do checkpoint when umount flag is not set
If the previous checkpoint was done without CP_UMOUNT flag, it needs to do
checkpoint with CP_UMOUNT for the next fast boot.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:33 -08:00
Jaegeuk Kim
30a5537f9a f2fs: trigger correct checkpoint during umount
This patch fixes to trigger checkpoint with umount flag when kill_sb was called.
In kill_sb, f2fs_sync_fs was finally called, but at this time, f2fs can't do
checkpoint with CP_UMOUNT.
After then, f2fs_put_super is not doing checkpoint, since it is not dirty.

So, this patch adds a flag to indicate f2fs_sync_fs is called during umount.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:32 -08:00
Jaegeuk Kim
6f0aacbc3c f2fs: update memory footprint information
This patch adds missing memory usages, and splits them in detail.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:31 -08:00
Chao Yu
9066c6a7eb f2fs: fix wrong memory footprint statistics in debugfs
Our value of memory footprint statistics showed in debugfs is not calculated
correctly. Fix it in this patch.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:31 -08:00
Jaegeuk Kim
871f599f4a f2fs: avoid infinite loop on cp_error
If cp_error is set, we should avoid all the infinite loop.
In f2fs_sync_file, there is a hole, and this patch fixes that.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11 17:04:30 -08:00
Linus Torvalds
c5ce28df0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) More iov_iter conversion work from Al Viro.

    [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
      wrong, and this pull actually adds an extra commit on top of the
      branch I'm pulling to fix that up, so that the pre-merge state is
      ok.   - Linus ]

 2) Various optimizations to the ipv4 forwarding information base trie
    lookup implementation.  From Alexander Duyck.

 3) Remove sock_iocb altogether, from CHristoph Hellwig.

 4) Allow congestion control algorithm selection via routing metrics.
    From Daniel Borkmann.

 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.

 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.

 7) Add xmit_more support to r8169, e1000, and e1000e drivers.  From
    Florian Westphal.

 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.

 9) Add BPF packet actions to packet scheduler, from Jiri Pirko.

10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer.

11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
    Kwok.

12) More sanely handle out-of-window dupacks, which can result in
    serious ACK storms.  From Neal Cardwell.

13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
    Patrick McHardy, and Thomas Graf.

14) Support xmit_more in be2net, from Sathya Perla.

15) Group Policy extensions for vxlan, from Thomas Graf.

16) Remove Checksum Offload support for vxlan, from Tom Herbert.

17) Like ipv4, support lockless transmit over ipv6 UDP sockets.  From
    Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
  crypto: fix af_alg_make_sg() conversion to iov_iter
  ipv4: Namespecify TCP PMTU mechanism
  i40e: Fix for stats init function call in Rx setup
  tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
  openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
  ipv6: Make __ipv6_select_ident static
  ipv6: Fix fragment id assignment on LE arches.
  bridge: Fix inability to add non-vlan fdb entry
  net: Mellanox: Delete unnecessary checks before the function call "vunmap"
  cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
  ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
  net: dsa: Remove redundant phy_attach()
  IB/mlx4: Reset flow support for IB kernel ULPs
  IB/mlx4: Always use the correct port for mirrored multicast attachments
  net/bonding: Fix potential bad memory access during bonding events
  tipc: remove tipc_snprintf
  tipc: nl compat add noop and remove legacy nl framework
  tipc: convert legacy nl stats show to nl compat
  tipc: convert legacy nl net id get to nl compat
  tipc: convert legacy nl net id set to nl compat
  ...
2015-02-10 20:01:30 -08:00
Linus Torvalds
992de5a8ec Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "Bite-sized chunks this time, to avoid the MTA ratelimiting woes.

   - fs/notify updates

   - ocfs2

   - some of MM"

That laconic "some MM" is mainly the removal of remap_file_pages(),
which is a big simplification of the VM, and which gets rid of a *lot*
of random cruft and special cases because we no longer support the
non-linear mappings that it used.

From a user interface perspective, nothing has changed, because the
remap_file_pages() syscall still exists, it's just done by emulating the
old behavior by creating a lot of individual small mappings instead of
one non-linear one.

The emulation is slower than the old "native" non-linear mappings, but
nobody really uses or cares about remap_file_pages(), and simplifying
the VM is a big advantage.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (78 commits)
  memcg: zap memcg_slab_caches and memcg_slab_mutex
  memcg: zap memcg_name argument of memcg_create_kmem_cache
  memcg: zap __memcg_{charge,uncharge}_slab
  mm/page_alloc.c: place zone_id check before VM_BUG_ON_PAGE check
  mm: hugetlb: fix type of hugetlb_treat_as_movable variable
  mm, hugetlb: remove unnecessary lower bound on sysctl handlers"?
  mm: memory: merge shared-writable dirtying branches in do_wp_page()
  mm: memory: remove ->vm_file check on shared writable vmas
  xtensa: drop _PAGE_FILE and pte_file()-related helpers
  x86: drop _PAGE_FILE and pte_file()-related helpers
  unicore32: drop pte_file()-related helpers
  um: drop _PAGE_FILE and pte_file()-related helpers
  tile: drop pte_file()-related helpers
  sparc: drop pte_file()-related helpers
  sh: drop _PAGE_FILE and pte_file()-related helpers
  score: drop _PAGE_FILE and pte_file()-related helpers
  s390: drop pte_file()-related helpers
  parisc: drop _PAGE_FILE and pte_file()-related helpers
  openrisc: drop _PAGE_FILE and pte_file()-related helpers
  nios2: drop _PAGE_FILE and pte_file()-related helpers
  ...
2015-02-10 16:45:56 -08:00
Linus Torvalds
b2718bffb4 This time we have mostly clean ups. There is a bug fix for a NULL dereference
relating to ACLs, and another which improves (but does not fix entirely) an
 allocation fall-back code path. The other three patches are small clean
 ups.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQIcBAABAgAGBQJU2dvJAAoJEMrg3m4a/8jSQJYQAKJXz9l0MDHPpm4qcjEO1p8T
 JLSEa8X24taugcbSavC15g0v3HOCveleMymNo2wrfTFqkcpy/YTlqj0LjZywXOZc
 KJMZWL4ZYZXmdf2DDhZ6WFb6Ouw+0l2MbtIWb/L2NQiv1K6nSdOVsgXiRvyehvxD
 YEQBbdQxsa8uFr1QbBVNvaM6X6iHOLLvkLHHpw84N1dMcijt3BjQqqXQphkD6aM8
 uSEYSY4aMl2nZv/fEItjqLqLZvkhQX/dbN5aUnixHd65bSYkdsWMjn4DInJujbNd
 aT/iLNFhCKN3FqAcl+FAfnOhanJsj0jDSS4puyufuLOv2vgK9AmF8TiAW4TEZ0ny
 IAqdurn7+1dAINxKTuo/ktBdkdwWI35cwP3PUrvKJifgKxX/EvmGDC3U2EoeItP1
 /TbwE2PVdQspCiXUGIAx40BhmEEszrQHIdM+p+cUxxmmt74Hm9582pby1aMOPnHQ
 FZoPhtFRKD8JttxRu+MyEwC4K0/JRK7R496fFPoY225+DSqeNibXrhU3USznZXYc
 0h2kRoXTMO74ptdZr+I91RKUWZo17Ujm96mRZgnCSGgEX3IwD/6dxgd0cjU7075r
 mSVEy6fngwM6cGF0qQsnqhf2bktCPg+8zhSilbxwuFfWIa66NUjoupqB3wzpdlyC
 sd+3tpZijhStqjPYTKmM
 =CiRw
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw

Pull gfs2 updates from Steven Whitehouse:
 "This time we have mostly clean ups.  There is a bug fix for a NULL
  dereference relating to ACLs, and another which improves (but does not
  fix entirely) an allocation fall-back code path.  The other three
  patches are small clean ups"

* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
  GFS2: Fix crash during ACL deletion in acl max entry check in gfs2_set_acl()
  GFS2: use __vmalloc GFP_NOFS for fs-related allocations.
  GFS2: Eliminate a nonsense goto
  GFS2: fix sprintf format specifier
  GFS2: Eliminate __gfs2_glock_remove_from_lru
2015-02-10 16:20:49 -08:00
Linus Torvalds
ae90fb1420 xfs: update for 3.20-rc1
This update contains:
 o RENAME_EXCHANGE support
 o Rework of the superblock logging infrastructure
 o Rework of the XFS_IOCTL_SETXATTR implementation
 	- enables use inside user namespaces
 	- fixes inconsistencies setting extent size hints
 o fixes for missing buffer type annotations used in log recovery
 o more consolidation of libxfs headers
 o preparation patches for block based PNFS support
 o miscellaneous bug fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJU2UQ6AAoJEK3oKUf0dfodNAwQAKD2T154VZCXZtjC6LobdEL4
 tuk72eDEyduNQZ1yL8MPNf090lsFizoLmXocawQEKPHZnSnZu1E6VQLughI5wSge
 N658/5mNvuRQkvUTw13onPhNdOzSoqigiy+6u0qiZbhab72ZExfuloqAAgZ908Ak
 DWZRQNXDspG8olGD1EKyEOKAoWDgRljiBEUw7+wVAN2thdQRXs6pyyTcaIXst3j6
 bk9EAeWZOK78OJUjtWlvIFv7RqjjUQL/8Z4Arr3lMuv8Y1fxS3oL8V9GjKae0MUn
 ZFPJzRHE9kfcO4AYAwYDUDW0Ia4ccq3sYu8FDzX6FTf3AqziQXKW/ovw83So5oEE
 vpT/ltsugWfu2feShvfa9FTYxqmecXUqoITZ4Sczm/8F9tDz0Q4HZ/GxleAmJyH9
 89IbTlSgm6qy1nmSbg21x0CVuOqJqHFAG83nnMKv8X4gmaMg9SzdNlU1iL2ugBQ7
 l2XKS2DKDbjgBnSdW+bOdBoFiMyCUjrwMAp23vYELuTIFSdHkTFsBQBvqtpjkJ0X
 KEutMaZdy8uo5DdPRN9k78OANbzOc4qYKJHTjiO5sQsMybsPNkciLBX49CgYFmw4
 xoVQ+EYrebpBKi4WHyLKWFOWb5CP/AUtLgxsX0clQVBuuaEBhA306qic1ZZixtIm
 ea2Q8ODEgpzs8QWZ7j6+
 =eZEU
 -----END PGP SIGNATURE-----

Merge tag 'xfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

Pull xfs update from Dave Chinner:
 "This update contains:

   - RENAME_EXCHANGE support

   - Rework of the superblock logging infrastructure

   - Rework of the XFS_IOCTL_SETXATTR implementation
       * enables use inside user namespaces
       * fixes inconsistencies setting extent size hints

   - fixes for missing buffer type annotations used in log recovery

   - more consolidation of libxfs headers

   - preparation patches for block based PNFS support

   - miscellaneous bug fixes and cleanups"

* tag 'xfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (37 commits)
  xfs: only trace buffer items if they exist
  xfs: report proper f_files in statfs if we overshoot imaxpct
  xfs: fix panic_mask documentation
  xfs: xfs_ioctl_setattr_check_projid can be static
  xfs: growfs should use synchronous transactions
  xfs: fix behaviour of XFS_IOC_FSSETXATTR on directories
  xfs: factor projid hint checking out of xfs_ioctl_setattr
  xfs: factor extsize hint checking out of xfs_ioctl_setattr
  xfs: XFS_IOCTL_SETXATTR can run in user namespaces
  xfs: kill xfs_ioctl_setattr behaviour mask
  xfs: disaggregate xfs_ioctl_setattr
  xfs: factor out xfs_ioctl_setattr transaciton preamble
  xfs: separate xflags from xfs_ioctl_setattr
  xfs: FSX_NONBLOCK is not used
  xfs: don't allocate an ioend for direct I/O completions
  xfs: change kmem_free to use generic kvfree()
  xfs: factor out a xfs_update_prealloc_flags() helper
  xfs: remove incorrect error negation in attr_multi ioctl
  xfs: set superblock buffer type correctly
  xfs: set buf types when converting extent formats
  ...
2015-02-10 16:15:17 -08:00
Linus Torvalds
c5452a58db Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota interface unification and misc cleanups from Jan Kara:
 "The first part of the series unifying XFS and VFS quota interfaces.

  This part unifies turning quotas on and off so quota-tools and
  xfs_quota can be used to manage any filesystem.  This is useful so
  that userspace doesn't have to distinguish which filesystem it is
  working with.  As a result we can then easily reuse tests for project
  quotas in XFS for ext4.

  This also contains minor cleanups and fixes for udf, isofs, and ext3"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (23 commits)
  udf: remove bool assignment to 0/1
  udf: use bool for done
  quota: Store maximum space limit in bytes
  quota: Remove quota_on_meta callback
  ocfs2: Use generic helpers for quotaon and quotaoff
  ext4: Use generic helpers for quotaon and quotaoff
  quota: Add ->quota_{enable,disable} callbacks for VFS quotas
  quota: Wire up ->quota_{enable,disable} callbacks into Q_QUOTA{ON,OFF}
  quota: Split ->set_xstate callback into two
  xfs: Remove some pointless quota checks
  xfs: Remove some useless flags tests
  xfs: Remove useless test
  quota: Verify flags passed to Q_SETINFO
  quota: Cleanup flags definitions
  ocfs2: Move OLQF_CLEAN flag out of generic quota flags
  quota: Don't store flags for v2 quota format
  jbd: drop jbd_ENOSYS debug
  udf: destroy sbi mutex in put_super
  udf: Check length of extended attributes and allocation descriptors
  udf: Remove repeated loads blocksize
  ...
2015-02-10 15:52:38 -08:00
Linus Torvalds
4b4f8580a4 File locking related changes for v3.20 (pile #1)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU1MYmAAoJEAAOaEEZVoIV/rAQAKoHj/PCOATTy05lF/NDhJlS
 6NbNjupnC8HrbNPv6Z/cQ902eC1YRVH96gf6we4FeAm9Tjctpje6uEqvPQCUxpot
 2jWgCG+g95OeEaQEjXQvR3x5ZfXvPUtwKVOnMF423L1p5Xfbj3kJfGi+dv2k8XOi
 GArsUB7uCwqLyyz+L47RJ2Cz7s47M9O25HkVRfWlgYOv+4afq5OpADGKQAhMLL/s
 CPhYgqw/7r1p+pLkjUE/x+5BAliDzUinFtDatgD4CeHOdq0RKlxzQ1rFg6uJVg/k
 3ZttGOxWUtGIeGM4v5cosDFReLPCESax/TUzn58jxxFR702MjHAA+lHRgjZoWvW/
 9EnShl0XlznQX1ns6f0rI1seWe4M5R3CWus8AcG0kDmdbTp8nARo+pBLFhCME/kZ
 15GHLz4tDSRt5SNow6aqJdlYJR7p3WrsceKyM5aH9M7odM3eaB5vJxIJ0fljsZbS
 Qtz4t+Ua1oVSYD7TX3y7EUiQVPVo8VKS3o6Ua73wCHIXNbSH7hZLOvPLFs6V1Psi
 RKqRiad5iO3+iavVGuDDcs12zXZ5hmksE8oMh0NkjFZ6wJlO4Hf5iOt5thABNDmT
 Km+40IBq1DYwclPTofaRpB+ytDOnWedMxdWfWdEWQ710zuuNY3cfi/XMXEX34kBY
 fLhUMabqcyfUegpA6S0R
 =6+UV
 -----END PGP SIGNATURE-----

Merge tag 'locks-v3.20-1' of git://git.samba.org/jlayton/linux

Pull file locking related changes #1 from Jeff Layton:
 "This patchset contains a fairly major overhaul of how file locks are
  tracked within the inode.  Rather than a single list, we now create a
  per-inode "lock context" that contains individual lists for the file
  locks, and a new dedicated spinlock for them.

  There are changes in other trees that are based on top of this set so
  it may be easiest to pull this in early"

* tag 'locks-v3.20-1' of git://git.samba.org/jlayton/linux:
  locks: update comments that refer to inode->i_flock
  locks: consolidate NULL i_flctx checks in locks_remove_file
  locks: keep a count of locks on the flctx lists
  locks: clean up the lm_change prototype
  locks: add a dedicated spinlock to protect i_flctx lists
  locks: remove i_flock field from struct inode
  locks: convert lease handling to file_lock_context
  locks: convert posix locks to file_lock_context
  locks: move flock locks to file_lock_context
  ceph: move spinlocking into ceph_encode_locks_to_buffer and ceph_count_locks
  locks: add a new struct file_locking_context pointer to struct inode
  locks: have locks_release_file use flock_lock_file to release generic flock locks
  locks: add new struct list_head to struct file_lock
2015-02-10 15:34:42 -08:00
Kirill A. Shutemov
27ba0644ea rmap: drop support of non-linear mappings
We don't create non-linear mappings anymore.  Let's drop code which
handles them in rmap.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:31 -08:00
Kirill A. Shutemov
1da4b35b00 proc: drop handling non-linear mappings
We have to handle non-linear mappings for /proc/PID/{smaps,clear_refs}
which is unused now.  Let's drop it.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:31 -08:00
Kirill A. Shutemov
d83a08db5b mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub
Nobody uses it anymore.

[akpm@linux-foundation.org: fix filemap_xip.c]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:30 -08:00
Dmitry Monakhov
913e027ca1 fsioctl.c: make generic_block_fiemap() signal-tolerant
__generic_block_fiemap may spin very long time for large sparse files.

Without this patch an unprivileged user may abuse system resources simply
by spawning a vast number of unkilable busyloops (works on ext2/ext3):

  truncate --size 1T test
  for ((i=0;i<1024;i++))
  do
         filefrag test > /dev/null &
  done

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:30 -08:00
Srinivas Eeda
99b8874e79 o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper
A tiny race between BAST and unlock message causes the NULL dereference.

A node sends an unlock request to master and receives a response.  Before
processing the response it receives a BAST from the master.  Since both
requests are processed by different threads it creates a race.  While the
BAST is being processed, lock can get freed by unlock code.

This patch makes bast to return immediately if lock is found but unlock is
pending.  The code should handle this race.  We also have to fix master
node to skip sending BAST after receiving unlock message.

Below is the crash stack

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
    IP: o2dlm_blocking_ast_wrapper+0xd/0x16
    dlm_do_local_bast+0x8e/0x97 [ocfs2_dlm]
    dlm_proxy_ast_handler+0x838/0x87e [ocfs2_dlm]
    o2net_process_message+0x395/0x5b8 [ocfs2_nodemanager]
    o2net_rx_until_empty+0x762/0x90d [ocfs2_nodemanager]
    worker_thread+0x14d/0x1ed

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:30 -08:00
alex chen
10ab88117d ocfs2: prune the dcache before deleting the dentry of directory
In ocfs2_dentry_convert_worker, we should prune the dcache before deleting
the dentry of directory, otherwise, in the following cases the inode of
directory will still remain in orphan directory until the device being
umounted.

Mount point: /mnt/ocfs2
Node A                              Node B
mkdir /mnt/ocfs2/testdir
  ocfs2_mkdir
  ->ocfs2_mknod
  ->ocfs2_dentry_attach_lock
  ->ocfs2_dentry_lock(dentry, 0)
  ... ...
touch /mnt/ocfs2/testdir/testfile
                                    unlink /mnt/test/testdir/testfile
                                    rmdir /mnt/ocfs2/testdir
                                      ocfs2_unlink
                                      ->ocfs2_remote_dentry_delete
                                      ->ocfs2_dentry_lock(dentry, 1)
                                      ... ...
... ...
ocfs2_downconvert_thread
->ocfs2_unblock_lock
->ocfs2_dentry_convert_worker
->ocfs2_find_local_alias
  ->dget_dlock
->d_delete
Here the dentry can not be
released because the children's
dentry is negative but still exist.
Finally, this inode will still remain
in orphan directory until its children
are destroyed.

So before deleting dentry of directory, we should prune the dcache to
remove unused children of the parent dentry by shrink_dcache_parent().

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:30 -08:00
Fabian Frederick
8989b67330 ocfs2: make resv_lock spinlock static
resv_lock is only used in reservations.c

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Daeseok Youn
9d6008c759 ocfs2: remove unreachable code in __ocfs2_recovery_thread()
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Daeseok Youn
9b57269170 ocfs2: removes mlog_errno() call twice in ocfs2_find_dir_space_el()
mlog_errno() is called twice when some functions are failed.

Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Daeseok Youn
592a202a3d ocfs2: remove unreachable code
Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Dan Carpenter
e6d9f86d6b ocfs2: o2net: silence uninitialized variable warning
Smatch complains that, if o2net_tx_can_proceed() returns false, then "sc"
and "ret" are uninialized or maybe we are re-using the data from previous
iteration.  I do not know if we can hit this bug in real life but checking
the return value is harmless and we may as well silence the static checker
warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Jan Kara
d9510a20f8 ocfs2: remove pointless assignment from ocfs2_calc_refcount_meta_credits()
The assigned value is never used.

Coverity-id 1226847.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
alex chen
1dfeb76847 ocfs2: add a mount option journal_async_commit on ocfs2 filesystem
Add a mount option to support JBD2 feature:

JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT.  When this feature is opened, journal
commit block can be written to disk without waiting for descriptor blocks,
which can improve journal commit performance.  This option will enable
'journal_checksum' internally.

Using the fs_mark benchmark, using journal_async_commit shows a 50%
improvement, the files per second go up from 215.2 to 317.5.

test script:
fs_mark  -d  /mnt/ocfs2/  -s  10240  -n  1000

default:
FSUse%        Count         Size    Files/sec     App Overhead
     0         1000        10240        215.2            17878

with journal_async_commit option:
FSUse%        Count         Size    Files/sec     App Overhead
     0         1000        10240        317.5            17881

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Signed-off-by: Weiwei Wang <wangww631@huawei.comm>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
alex chen
15eba0fe3e ocfs2: fix journal commit deadlock in ocfs2_convert_inline_data_to_extents
Similar to ocfs2_write_end_nolock() which is metioned at commit
136f49b917 ("ocfs2: fix journal commit deadlock"), we should unlock
pages before ocfs2_commit_trans() in ocfs2_convert_inline_data_to_extents.

Otherwise, it will cause a deadlock with journal commit threads.

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Rickard Strandqvist
95671c63d5 ocfs2: dlm: dlmdomain: remove unused function
Remove dlm_joined() that is not used anywhere.

This was partially found by using a static code analysis program called
cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Rickard Strandqvist
7ea62d7031 ocfs2: quota_local: remove unused function
Remove ol_dqblk_file_block() that is not used anywhere.

This was partially found by using a static code analysis program called
cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00
Rickard Strandqvist
d6edc87af8 ocfs2: xattr: remove unused function
Remove ocfs2_xattr_bucket_get_val() that is not used anywhere.

This was partially found by using a static code analysis program called
cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:29 -08:00