Commit graph

42598 commits

Author SHA1 Message Date
Jaegeuk Kim
f96999c35f f2fs: refactor __find_rev_next_{zero}_bit
This patch refactors __find_rev_next_{zero}_bit which was disabled previously
due to bugs.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-21 15:26:00 -07:00
Trond Myklebust
a85240d254 Merge branch 'bugfixes'
* bugfixes:
  NFSv4.1/pnfs: Retry through MDS when getting bad length of data
  nfs/blocklayout: Fix bad using of page offset in bl_read_pagelist
  NFS: Return directly if encode_sessionid fail
  NFS: Fix bad checking of max taglen in callback request
  NFS: Fix bad defines of callback response maxsize
  NFS: Use NFS4_MAX_SESSIONID_LEN directly for decode/encode sessionid
  NFS: Remove unneeded NFS_DEBUG checking before define NFSDBG_FACILITY
  NFS: Remove the left function defines in callback.h
  NFS: Remove the left global variable nfs_callback_tcpport
  NFS: Get rid of the unneeded addr stored in callback arguments
  nfsroot: make nfsroot to accept the 1024 bytes long directory name
2015-10-21 16:07:21 -05:00
Kinglong Mee
f8417b481c NFSv4.1/pnfs: Retry through MDS when getting bad length of data
If non rpc-based layout driver return bad length of data, nfs retries
by calling rpc_restart_call_prepare() that cause an NULL reference panic.

This patch lets nfs retry through MDS for non rpc-based layout driver
return bad length of data.

[13034.883329] BUG: unable to handle kernel NULL pointer dereference at           (null)
[13034.884902] IP: [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
[13034.886558] PGD 0
[13034.888126] Oops: 0000 [#1] KASAN
[13034.889710] Modules linked in: blocklayoutdriver(OE) nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c coretemp btrfs crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon auth_rpcgss shpchp nfs_acl lockd vmw_vmci parport_pc xor raid6_pq grace parport sunrpc i2c_piix4 vmwgfx drm_kms_helper ttm drm mptspi e1000 serio_raw scsi_transport_spi mptscsih mptbase ata_generic pata_acpi [last unloaded: fscache]
[13034.898260] CPU: 0 PID: 10112 Comm: kworker/0:1 Tainted: G           OE   4.3.0-rc5+ #279
[13034.899932] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[13034.903342] Workqueue: events bl_read_cleanup [blocklayoutdriver]
[13034.905059] task: ffff88006a9148c0 ti: ffff880035e90000 task.ti: ffff880035e90000
[13034.906827] RIP: 0010:[<ffffffffa00db372>]  [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
[13034.910522] RSP: 0018:ffff880035e97b58  EFLAGS: 00010282
[13034.912378] RAX: fffffbfff04a5a94 RBX: ffff880068fe4858 RCX: 0000000000000003
[13034.914339] RDX: dffffc0000000000 RSI: 0000000000000003 RDI: 0000000000000282
[13034.916236] RBP: ffff880035e97b68 R08: 0000000000000001 R09: 0000000000000001
[13034.918229] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[13034.920007] R13: ffff880068fe4858 R14: ffff880068fe4a60 R15: 0000000000001000
[13034.921845] FS:  0000000000000000(0000) GS:ffffffff82247000(0000) knlGS:0000000000000000
[13034.923645] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13034.925525] CR2: 0000000000000000 CR3: 00000000063dd000 CR4: 00000000001406f0
[13034.932808] Stack:
[13034.934813]  ffff880068fe4780 0000000000001000 ffff880035e97ba8 ffffffffa08800d2
[13034.936675]  ffffffffa088029d ffff880068fe4780 ffff880068fe4858 ffffffffa089c0a0
[13034.938593]  ffff880068fe47e0 ffff88005d59faf0 ffff880035e97be0 ffffffffa087e08f
[13034.940454] Call Trace:
[13034.942388]  [<ffffffffa08800d2>] nfs_readpage_result+0x112/0x200 [nfs]
[13034.944317]  [<ffffffffa088029d>] ? nfs_readpage_done+0xdd/0x160 [nfs]
[13034.946267]  [<ffffffffa087e08f>] nfs_pgio_result+0x9f/0x120 [nfs]
[13034.948166]  [<ffffffffa09266cc>] pnfs_ld_read_done+0x7c/0x1e0 [nfsv4]
[13034.950247]  [<ffffffffa03b07ee>] bl_read_cleanup+0x2e/0x60 [blocklayoutdriver]
[13034.952156]  [<ffffffff810ebf62>] process_one_work+0x412/0x870
[13034.954102]  [<ffffffff810ebe84>] ? process_one_work+0x334/0x870
[13034.955949]  [<ffffffff810ebb50>] ? queue_delayed_work_on+0x40/0x40
[13034.957985]  [<ffffffff810ec441>] worker_thread+0x81/0x6a0
[13034.959817]  [<ffffffff810ec3c0>] ? process_one_work+0x870/0x870
[13034.961785]  [<ffffffff810f43bd>] kthread+0x17d/0x1a0
[13034.963544]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
[13034.965479]  [<ffffffff81100428>] ? finish_task_switch+0x88/0x220
[13034.967223]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
[13034.968929]  [<ffffffff81b6ae5f>] ret_from_fork+0x3f/0x70
[13034.970534]  [<ffffffff810f4240>] ? kthread_create_on_node+0x330/0x330
[13034.972176] Code: c7 43 50 40 84 0d a0 e8 3d fe 1c e1 48 8d 7b 58 c7 83 e4 00 00 00 00 00 00 00 e8 ca fe 1c e1 4c 8b 63 58 4c 89 e7 e8 be fe 1c e1 <49> 83 3c 24 00 74 12 48 c7 43 50 f0 a2 0e a0 b8 01 00 00 00 5b
[13034.977148] RIP  [<ffffffffa00db372>] rpc_restart_call_prepare+0x62/0x90 [sunrpc]
[13034.978780]  RSP <ffff880035e97b58>
[13034.980399] CR2: 0000000000000000

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:55:47 -05:00
Kinglong Mee
15ae2c7bdc nfs/blocklayout: Fix bad using of page offset in bl_read_pagelist
Blocklayout uses file offset for the read-back page's offset of first writing,
it's definitely wrong, it writes data to bad address of page that cause userspace
application segment fault. It must be the page base stored in header->args.pgbase.

Also, the pg_offset has no influence with isect and extent length.

Note: The offset of the non-first page is always zero.

Ps: A test program will segment fault at read() as,
#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>

int main(int argc, char **argv)
{
        char buf[2049];
        char *filename = NULL;
        int fd = -1;

        if (argc < 2) {
                printf("Usage: %s filename\n", argv[0]);
                return 0;
        }

        filename = argv[1];
        fd = open(filename, O_RDONLY | O_DIRECT);
        if (fd < 0) {
                printf("Open %s fail: %m\n", filename);
                return 1;
        }

        lseek(fd, 2048, SEEK_SET);
        if (read(fd, buf, sizeof(buf) - 1) != (sizeof(buf) - 1))
                printf("Read 4096 bityes data from %s fail: %m\n", filename);
out:
        close(fd);
        return 0;
}

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:55:47 -05:00
Kinglong Mee
e0a63c0bfc NFS: Return directly if encode_sessionid fail
encode_sessionid() may return error, nfs needs process the return value.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:23 -05:00
Kinglong Mee
403889c039 NFS: Fix bad checking of max taglen in callback request
The taglen should be checked with CB_OP_TAGLEN_MAXSZ directly.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:23 -05:00
Kinglong Mee
45724e8a5b NFS: Fix bad defines of callback response maxsize
As CB_OP_TAGLEN_MAXSZ, all XXX_MAXSZ should be defined as bit.
Each operation should not cantains CB_OP_TAGLEN_MAXSZ.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:23 -05:00
Kinglong Mee
590184a6ce NFS: Use NFS4_MAX_SESSIONID_LEN directly for decode/encode sessionid
It's no need to define a temporary variables for NFS4_MAX_SESSIONID_LEN.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:23 -05:00
Kinglong Mee
39de493e88 NFS: Remove unneeded NFS_DEBUG checking before define NFSDBG_FACILITY
It's not needed to checking NFS_DEBUG before define NFSDBG_FACILITY, remove it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:23 -05:00
Kinglong Mee
f765bf762b NFS: Remove the left function defines in callback.h
Commit 778be232a2 "NFS do not find client in NFSv4 pg_authenticate" has remove
the define and using of nfs4_set_callback_sessionid(), and
commit 36281caa83 "NFSv4: Further clean-ups of delegation stateid validation"
has update the checking of stateid, and move the code to nfs4proc.c.

This patch remove those function defines left in callback.h

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:22 -05:00
Kinglong Mee
8c163d8e5a NFS: Remove the left global variable nfs_callback_tcpport
Commit bbe0a3aa4e "NFS: make nfs_callback_tcpport per network context" has
make nfs_callback_tcpport per network, but left the global nfs_callback_tcpport,
remove it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:22 -05:00
Kinglong Mee
d4e2ce0961 NFS: Get rid of the unneeded addr stored in callback arguments
Commit c36fca52f5 "NFS refactor nfs_find_client and reference client
across callback processing" has store clp in cb_process_state
which is set in cb_sequence.

So that, it's unneeded to store address pointer in any callback arguments.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:22 -05:00
Li RongQing
c646619355 nfsroot: make nfsroot to accept the 1024 bytes long directory name
although NFS_MAXPATHLEN is defined to 1024, nfs client hopes to accept
a 1024 byte path, but nfs_root_parms is limited to 256, and the nfs path
will truncated when a user inputs nfs path from kernel cmdline

enlarge nfs_root_parms to 1024, to make it accept the 1024 bytes long
directory name, since nfs_root_parms is defined as _initdata, it will
be released after system bootup

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-21 15:49:19 -05:00
Martin K. Petersen
25520d55cd block: Inline blk_integrity in struct gendisk
Up until now the_integrity profile has been dynamically allocated and
attached to struct gendisk after the disk has been made active.

This causes problems because NVMe devices need to register the profile
prior to the partition table being read due to a mandatory metadata
buffer requirement. In addition, DM goes through hoops to deal with
preallocating, but not initializing integrity profiles.

Since the integrity profile is small (4 bytes + a pointer), Christoph
suggested moving it to struct gendisk proper. This requires several
changes:

 - Moving the blk_integrity definition to genhd.h.

 - Inlining blk_integrity in struct gendisk.

 - Removing the dynamic allocation code.

 - Adding helper functions which allow gendisk to set up and tear down
   the integrity sysfs dir when a disk is added/deleted.

 - Adding a blk_integrity_revalidate() callback for updating the stable
   pages bdi setting.

 - The calls that depend on whether a device has an integrity profile or
   not now key off of the bi->profile pointer.

 - Simplifying the integrity support routines in DM (Mike Snitzer).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-21 14:42:42 -06:00
Trond Myklebust
51e0164ebe Merge branch 'nfsclone'
* nfsclone:
  nfs: add missing linux/types.h
  NFS: Fix an 'unused variable' complaint when #ifndef CONFIG_NFS_V4_2
  nfs42: add NFS_IOC_CLONE_RANGE ioctl
  nfs42: respect clone_blksize
  nfs: get clone_blksize when probing fsinfo
  nfs42: add NFS_IOC_CLONE ioctl
  nfs42: add CLONE proc functions
  nfs42: add CLONE xdr functions
2015-10-21 15:42:20 -05:00
Luis de Bethencourt
ddd664f447 btrfs: reada: Fix returned errno code
reada is using -1 instead of the -ENOMEM defined macro to specify that
a buffer allocation failed. Since the error number is propagated, the
caller will get a -EPERM which is the wrong error condition.

Also, updating the caller to return the exact value from
reada_add_block.

Smatch tool warning:
reada_add_block() warn: returning -1 instead of -ENOMEM is sloppy

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:29:50 +02:00
Luis de Bethencourt
0b8d8ce029 btrfs: check-integrity: Fix returned errno codes
check-integrity is using -1 instead of the -ENOMEM defined macro to
specify that a buffer allocation failed. Since the error number is
propagated, the caller will get a -EPERM which is the wrong error
condition.

Also, the smatch tool complains with the following warnings:
btrfsic_process_superblock() warn: returning -1 instead of -ENOMEM is sloppy
btrfsic_read_block() warn: returning -1 instead of -ENOMEM is sloppy

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:29:44 +02:00
Byongho Lee
d91876496b btrfs: compress: put variables defined per compress type in struct to make cache friendly
Below variables are defined per compress type.
 - struct list_head comp_idle_workspace[BTRFS_COMPRESS_TYPES]
 - spinlock_t comp_workspace_lock[BTRFS_COMPRESS_TYPES]
 - int comp_num_workspace[BTRFS_COMPRESS_TYPES]
 - atomic_t comp_alloc_workspace[BTRFS_COMPRESS_TYPES]
 - wait_queue_head_t comp_workspace_wait[BTRFS_COMPRESS_TYPES]

BTW, while accessing one compress type of these variables, the next or
before address is other compress types of it.
So this patch puts these variables in a struct to make cache friendly.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Byongho Lee
619ed39242 btrfs: cleanup iterating over prop_handlers array
This patch eliminates the last item of prop_handlers array which is used
to check end of array and instead uses ARRAY_SIZE macro.
Though this is a very tiny optimization, using ARRAY_SIZE macro is a
good practice to iterate array.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Geliang Tang
8cd1e73111 btrfs: fix a comment typo
Just fix a typo in the code comment.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
6e4d6fa12c btrfs: declare rsv_count as unsigned int instead of int
rsv_count ultimately gets passed to start_transaction() which
now takes an unsigned int as its num_items parameter.
The value of rsv_count should always be positive so declare it
as being unsigned.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
5aed1dd8b4 btrfs: change num_items type from u64 to unsigned int
The value of num_items that start_transaction() ultimately
always takes is a small one, so a 64 bit integer is overkill.

Also change num_items for btrfs_start_transaction() and
btrfs_start_transaction_lflush() as well.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
bdcd3c97d1 btrfs: cleanup btrfs_balance profile validity checks
Improve readability by generalizing the profile validity checks.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Shan Hai
bb78915203 btrfs/file.c: remove an unsed varialbe first_index
The commit b37392ea86 ("Btrfs: cleanup unnecessary parameter
and variant of prepare_pages()") makes it redundant.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Shan Hai <haishan.bai@hotmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Zhao Lei
9c170b2644 btrfs: use btrfs_raid_array in btrfs_reduce_alloc_profile
btrfs_raid_array[] holds attributes of all raid types.

Use btrfs_raid_array[].devs_min is best way for request
in btrfs_reduce_alloc_profile(), instead of use complex
condition of each raid types.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Zhao Lei
8789f4fe60 btrfs: use btrfs_raid_array for btrfs_get_num_tolerated_disk_barrier_failures()
btrfs_raid_array[] is used to define all raid attributes, use it
to get tolerated_failures in btrfs_get_num_tolerated_disk_barrier_failures(),
instead of complex condition in function.

It can make code simple and auto-support other possible raid-type in
future.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Zhao Lei
af90204750 btrfs: Move btrfs_raid_array to public
This array is used to record attributes of each raid type,
make it public, and many functions will benifit with this array.

For example, num_tolerated_disk_barrier_failures(), we can
avoid complex conditions in this function, and get raid attribute
simply by accessing above array.

It can also make code logic simple, reduce duplication code, and
increase maintainability.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
e9cf439f0d btrfs: use a single if() statement for one outcome in get_block_rsv()
Rather than have three separate if() statements for the same outcome
we should just OR them together in the same if() statement.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
a099d0fdb3 btrfs: memset cur_trans->delayed_refs to zero
Use memset() to null out the btrfs_delayed_ref_root of
btrfs_transaction instead of setting all the members to 0 by hand.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Byongho Lee
568b1c9cca btrfs: remove unnecessary list_del
We can safely iterate whole list items, without using list_del macro.
So remove the list_del call.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Byongho Lee
d7641a49a5 btrfs: replace unnecessary list_for_each_entry_safe to list_for_each_entry
There is no removing list element while iterating over list.
So, replace list_for_each_entry_safe to list_for_each_entry.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
f2f767e734 btrfs: trimming some start_transaction() code away
Just call kmem_cache_zalloc() instead of calling kmem_cache_alloc().
We're just initializing most fields to 0, false and NULL later on
_anyway_, so to make the code mode readable and potentially gain
a bit of performance (completely untested claim), we should fill our
btrfs_trans_handle with zeros on allocation then just initialize
those five remaining fields (not counting the list_heads) as normal.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
0412e58c6d btrfs: Fixed declaration of old_len
old_len is used to store the return value of btrfs_item_size_nr().
The return value of btrfs_item_size_nr() is of type u32.
To improve code correctness and avoid mixing signed and unsigned
integers I've changed old_len to be of type u32 as well.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Alexandru Moise
ce0eac2a1d btrfs: Fixed dsize and last_off declarations
The return values of btrfs_item_offset_nr and btrfs_item_size_nr are of
type u32. To avoid mixing signed and unsigned integers we should also
declare dsize and last_off to be of type u32.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:48 +02:00
Chandan Rajendra
0d51e28a11 Btrfs: btrfs_submit_bio_hook: Use btrfs_wq_endio_type values instead of integer constants
btrfs_submit_bio_hook() uses integer constants instead of values from "enum
btrfs_wq_endio_type". Fix this.

Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2015-10-21 18:28:47 +02:00
Geliang Tang
1873041152 pstore: add a helper function pstore_register_kmsg
Add a new wrapper function pstore_register_kmsg to keep the
consistency with other similar pstore_register_* functions.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2015-10-21 09:27:10 -07:00
Geliang Tang
549b39a9e7 pstore: add vmalloc error check
If vmalloc fails, make write_pmsg return -ENOMEM.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2015-10-21 09:27:10 -07:00
David Howells
146aa8b145 KEYS: Merge the type-specific data with the payload data
Merge the type-specific data with the payload data into one four-word chunk
as it seems pointless to keep them separate.

Use user_key_payload() for accessing the payloads of overloaded
user-defined keys.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cifs@vger.kernel.org
cc: ecryptfs@vger.kernel.org
cc: linux-ext4@vger.kernel.org
cc: linux-f2fs-devel@lists.sourceforge.net
cc: linux-nfs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: linux-ima-devel@lists.sourceforge.net
2015-10-21 15:18:36 +01:00
Qu Wenruo
0f6925fa29 btrfs: Avoid truncate tailing page if fallocate range doesn't exceed inode size
Current code will always truncate tailing page if its alloc_start is
smaller than inode size.

For example, the file extent layout is like:
0	4K	8K	16K	32K
|<-----Extent A---------------->|
|<--Inode size: 18K---------->|

But if calling fallocate even for range [0,4K), it will cause btrfs to
re-truncate the range [16,32K), causing COW and a new extent.

0	4K	8K	16K	32K
|///////|	<- Fallocate call range
|<-----Extent A-------->|<--B-->|

The cause is quite easy, just a careless btrfs_truncate_inode() in a
else branch without extra judgment.
Fix it by add judgment on whether the fallocate range is beyond isize.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-10-20 19:07:29 -07:00
Jaegeuk Kim
67f8cf3cee f2fs: support fiemap for inline_data
There is a FIEMAP_EXTENT_INLINE_DATA, pointed out by Marc.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-20 11:33:21 -07:00
Jaegeuk Kim
1d373a0ef7 f2fs: flush dirty data for bmap
Users expect bmap will give allocated block addresses.
Let's play likewise ext4.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-10-20 11:33:11 -07:00
Ross Zwisler
5726b27b09 ext2: Add locking for DAX faults
Add locking to ensure that DAX faults are isolated from ext2 operations
that modify the data blocks allocation for an inode.  This is intended to
be analogous to the work being done in XFS by Dave Chinner:

http://www.spinics.net/lists/linux-fsdevel/msg90260.html

Compared with XFS the ext2 case is greatly simplified by the fact that ext2
already allocates and zeros new blocks before they are returned as part of
ext2_get_block(), so DAX doesn't need to worry about getting unmapped or
unwritten buffer heads.

This means that the only work we need to do in ext2 is to isolate the DAX
faults from inode block allocation changes.  I believe this just means that
we need to isolate the DAX faults from truncate operations.

The newly introduced dax_sem is intended to replicate the protection
offered by i_mmaplock in XFS.  In addition to truncate the i_mmaplock also
protects XFS operations like hole punching, fallocate down, extent
manipulation IOCTLS like xfs_ioc_space() and extent swapping.  Truncate is
the only one of these operations supported by ext2.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.com>
2015-10-19 14:40:54 +02:00
John Stultz
16175039e6 ext4: fix abs() usage in ext4_mb_check_group_pa
The ext4_fsblk_t type is a long long, which should not be used
with abs(), as is done in ext4_mb_check_group_pa().

This patch modifies ext4_mb_check_group_pa() to use abs64()
instead.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-10-19 00:01:05 -04:00
Dmitry Monakhov
1e381f60da ext4: do not allow journal_opts for fs w/o journal
It is appeared that we can pass journal related mount options and such options
be shown in /proc/mounts

Example:
#mkfs.ext4 -F /dev/vdb
#tune2fs -O ^has_journal /dev/vdb
#mount /dev/vdb /mnt/  -ocommit=20,journal_async_commit
#cat /proc/mounts  | grep /mnt
 /dev/vdb /mnt ext4 rw,relatime,journal_checksum,journal_async_commit,commit=20,data=ordered 0 0

But options:"journal_checksum,journal_async_commit,commit=20,data=ordered" has
nothing with reality because there is no journal at all.

This patch disallow following options for journalless configurations:
 - journal_checksum
 - journal_async_commit
 - commit=%ld
 - data={writeback,ordered,journal}

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
2015-10-18 23:50:26 -04:00
Dmitry Monakhov
c93cf2d757 ext4: explicit mount options parsing cleanup
Currently MOPT_EXPLICIT treated as EXPLICIT_DELALLOC which may be changed
in future. Let's fix it now.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-10-18 23:35:32 -04:00
Jarkko Sakkinen
37c1c04cca sysfs: added __compat_only_sysfs_link_entry_to_kobj()
Added a new function __compat_only_sysfs_link_group_to_kobj() that adds
a symlink from attribute or group to a kobject. This needed for
maintaining backwards compatibility with PPI attributes in the TPM
driver.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
2015-10-19 01:01:19 +02:00
Dave Chinner
fcd8a399a9 Merge branch 'xfs-stats-fixes' into for-next 2015-10-19 09:03:30 +11:00
Dan Carpenter
f9d460b341 xfs: fix an error code in xfs_fs_fill_super()
If alloc_percpu() fails, we accidentally return PTR_ERR(NULL), which
means success, but we intended to return -ENOMEM.

Fixes: 225e463558 ('xfs: per-filesystem stats in sysfs')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-10-19 08:42:47 +11:00
Dave Chinner
985ef4dcf9 xfs: stats are no longer dependent on CONFIG_PROC_FS
So we need to fix the makefile to understand this, otherwise build
errors with CONFIG_PROC_FS=n occur.

Reported-and-tested-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-10-19 08:42:46 +11:00
Daeho Jeong
4327ba52af ext4, jbd2: ensure entering into panic after recording an error in superblock
If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
journaling will be aborted first and the error number will be recorded
into JBD2 superblock and, finally, the system will enter into the
panic state in "errors=panic" option.  But, in the rare case, this
sequence is little twisted like the below figure and it will happen
that the system enters into panic state, which means the system reset
in mobile environment, before completion of recording an error in the
journal superblock. In this case, e2fsck cannot recognize that the
filesystem failure occurred in the previous run and the corruption
wouldn't be fixed.

Task A                        Task B
ext4_handle_error()
-> jbd2_journal_abort()
  -> __journal_abort_soft()
    -> __jbd2_journal_abort_hard()
    | -> journal->j_flags |= JBD2_ABORT;
    |
    |                         __ext4_abort()
    |                         -> jbd2_journal_abort()
    |                         | -> __journal_abort_soft()
    |                         |   -> if (journal->j_flags & JBD2_ABORT)
    |                         |           return;
    |                         -> panic()
    |
    -> jbd2_journal_update_sb_errno()

Tested-by: Hobin Woo <hobin.woo@samsung.com>
Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2015-10-18 17:02:56 -04:00