Commit graph

569807 commits

Author SHA1 Message Date
Steve Muckle
69fbcb521a ANDROID: add script to fetch android kernel config fragments
The Android kernel config fragments now live in a separate repository.
To prevent others from having to search for this location, add a script
to fetch and unpack the fragments.

Update .gitignore to include these fragments.

Change-Id: If2d4a59b86e4573b0a9b3190025dfe4191870b46
Signed-off-by: Steve Muckle <smuckle@google.com>
2017-10-03 10:59:04 -07:00
Jaegeuk Kim
d78a12988c f2fs: reorganize stat information
commit d4adb30f25f5f2aa9b205891e395251d2a9098be upstream.

This patch modifies stat information more clearly.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 15:39:56 +00:00
Jaegeuk Kim
eee3f1f510 f2fs: clean up flush/discard command namings
commit b01a92019cac30398ef75b560d2668b399f4e393 upstream.

This patch simply cleans up the names for flush/discard commands.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 15:27:53 +00:00
Chao Yu
fb2e2f44af f2fs: check in-memory sit version bitmap
commit ae27d62e6befd3cac4ffa702e644cc52019642e8 upstream.

This patch adds a mirror for sit version bitmap, and use it to detect
in-memory bitmap corruption which may be caused by bit-transition of
cache or memory overflow.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 15:27:19 +00:00
Chao Yu
5c53448ff2 f2fs: check in-memory nat version bitmap
commit 599a09b2c1ac222e6aad0c22515d1ccde7c3b702 upstream.

This patch adds a mirror for nat version bitmap, and use it to detect
in-memory bitmap corruption which may be caused by bit-transition of
cache or memory overflow.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 15:26:55 +00:00
Chao Yu
dd5804b214 f2fs: check in-memory block bitmap
commit 355e78913c0d57492076d545b6f44b94fec2bf6b upstream.

This patch adds a mirror for valid block bitmap, and use it to detect
in-memory bitmap corruption which may be caused by bit-transition of
cache or memory overflow.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 15:22:12 +00:00
Chao Yu
dc8b8cea1e f2fs: introduce FI_ATOMIC_COMMIT
commit 5fe457430e554a2f5188f13c1a2e36ad845640c5 upstream.

This patch introduces a new flag to indicate inode status of doing atomic
write committing, so that, we can keep atomic write status for inode
during atomic committing, then we can skip GCing pages of atomic write inode,
that avoids random GCed datas being mixed with current transaction, so
isolation of transaction can be kept.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 15:20:50 +00:00
Chao Yu
7129702a48 f2fs: clean up with list_{first, last}_entry
commit 939afa943c5290a3b92f01612a792af17bc98115 upstream.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:49:36 +00:00
Jaegeuk Kim
556f5ba349 f2fs: return fs_trim if there is no candidate
commit 25290fa5591d81767713db304e0d567bf991786f upstream.

If there is no candidate to submit discard command during f2sf_trim_fs, let's
return without checkpoint.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:49:15 +00:00
Jaegeuk Kim
d051ccbd1b f2fs: avoid needless checkpoint in f2fs_trim_fs
commit 0333ad4e4f49e14217256e1db1134a70cf60b2de upstream.

The f2fs_trim_fs() doesn't need to do checkpoint if there are newly allocated
data blocks only which didn't change the critical checkpoint data such as nat
and sit entries.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:48:56 +00:00
Jaegeuk Kim
132263ddad f2fs: relax async discard commands more
commit 4e6a8d9b224f886362ea6e8f6046b541437c944f upstream.

This patch relaxes async discard commands to avoid waiting its end_io during
checkpoint.
Instead of waiting them during checkpoint, it will be done when actually reusing
them.

Test on initial partition of nvme drive.

 # time fstrim /mnt/test

Before : 6.158s
After : 4.822s

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:48:43 +00:00
Jaegeuk Kim
66e2310bf9 f2fs: drop exist_data for inline_data when truncated to 0
commit bb95d9ab2a9d4afd03b59a603cccb2c601f68b78 upstream.

A test program gets the SEEK_DATA with two values between
a new created file and the exist file on f2fs filesystem.

F2FS filesystem,  (the first "test1" is a new file)
SEEK_DATA size != 0 (offset = 8192)
SEEK_DATA size != 0 (offset = 4096)

PNFS filesystem, (the first "test1" is a new file)
SEEK_DATA size != 0 (offset = 4096)
SEEK_DATA size != 0 (offset = 4096)

int main(int argc, char **argv)
{
        char *filename = argv[1];
        int offset = 1, i = 0, fd = -1;

        if (argc < 2) {
                printf("Usage: %s f2fsfilename\n", argv[0]);
                return -1;
        }

        /*
        if (!access(filename, F_OK) || errno != ENOENT) {
                printf("Needs a new file for test, %m\n");
                return -1;
        }*/

        fd = open(filename, O_RDWR | O_CREAT, 0777);
        if (fd < 0) {
                printf("Create test file %s failed, %m\n", filename);
                return -1;
        }

        for (i = 0; i < 20; i++) {
                offset = 1 << i;
                ftruncate(fd, 0);
                lseek(fd, offset, SEEK_SET);
                write(fd, "test", 5);
                /* Get the alloc size by seek data equal zero*/
                if (lseek(fd, 0, SEEK_DATA)) {
                        printf("SEEK_DATA size != 0 (offset = %d)\n", offset);
                        break;
                }
        }

        close(fd);
        return 0;
}

Reported-and-Tested-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:47:48 +00:00
Jaegeuk Kim
dde5a6f8fd f2fs: don't allow encrypted operations without keys
commit 363fa4e078cbdc97a172c19d19dc04b41b52ebc8 upstream.

This patch fixes the renaming bug on encrypted filenames, which was pointed by

 (ext4: don't allow encrypted operations without keys)

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:46:57 +00:00
Jaegeuk Kim
97a43c7059 f2fs: show the max number of atomic operations
commit 26a28a0c1eb756ba18bfb1f93309c4b4406b9cd9 upstream.

This patch adds to show the max number of atomic operations which are
conducting concurrently.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:46:38 +00:00
Jaegeuk Kim
7b214391b2 f2fs: get io size bit from mount option
commit ec91538dccd44329ad83d3aae1aa6a8389b5c75f upstream.

This patch adds to set io_size_bits from mount option.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:46:12 +00:00
Jaegeuk Kim
b3fcb70064 f2fs: support IO alignment for DATA and NODE writes
commit 0a595ebaaa6b53a2226d3fee2a2fd616ea5ba378 upstream.

This patch implements IO alignment by filling dummy blocks in DATA and NODE
write bios. If we can guarantee, for example, 32KB or 64KB for such the IOs,
we can eliminate underlying dummy page problem which FTL conducts in order to
close MLC or TLC partial written pages.

Note that,
 - it requires "-o mode=lfs".
 - IO size should be power of 2, not exceed BIO_MAX_PAGES, 256.
 - read IO is still 4KB.
 - do checkpoint at fsync, if dummy NODE page was written.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:45:50 +00:00
Jaegeuk Kim
8ef4f0ca7b f2fs: add submit_bio tracepoint
commit 554b5125f5cfca6653461fd52bad24d4ef35ec29 upstream.

This patch adds final submit_bio() tracepoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:45:13 +00:00
Jaegeuk Kim
d4e5223d81 f2fs: reassign new segment for mode=lfs
commit 9d52a504db6db9e4e254576130aa867838daff55 upstream.

Otherwise we can remain wrong curseg->next_blkoff, resulting in fsck failure.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:40:17 +00:00
Yunlei He
c70e14cdaf f2fs: fix a missing discard prefree segments
commit 650d3c4e56e1e92ee6e004648c9deb243e5963e0 upstream.

If userspace issue a fstrim with a range not involve prefree segments,
it will reuse these segments without discard. This patch fix it.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:39:58 +00:00
Geliang Tang
574da11960 f2fs: use rb_entry_safe
commit ed0b56209fe79a1309653d4b03f5c3147f580f6b upstream.

Use rb_entry_safe() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:39:34 +00:00
Yunlei He
ff9199293b f2fs: add a case of no need to read a page in write begin
commit 746e2403927efbd7c7f2e796314e3cfb3cfabaa4 upstream.

If the range we write cover the whole valid data in the last page,
we do not need to read it.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
[Jaegeuk Kim: nullify the remaining area (fix: xfstests/f2fs/001)]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:39:12 +00:00
Yunlei He
75487d02a7 f2fs: fix a problem of using memory after free
commit 7855eba4d6102f811b6dd142d6c749f53b591fa3 upstream.

This patch fix a problem of using memory after free
in function __try_merge_extent_node.

Fixes: 0f825ee6e8 ("f2fs: add new interfaces for extent tree")
Cc: <stable@vger.kernel.org>
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:38:30 +00:00
Dan Carpenter
a1c90b43fc f2fs: remove unneeded condition
commit 07fe8d44409f88be8f4a4e8f22b47ee709a22657 upstream.

We checked that "inode" is not an error pointer earlier so there is
no need to check again here.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:38:07 +00:00
Chao Yu
1b05b5e173 f2fs: don't cache nat entry if out of memory
commit 5c9e418436f3445d7cc4f3ba2964f231a4b33f17 upstream.

If we run out of memory, in cache_nat_entry, it's better to avoid loop
for allocating memory to cache nat entry, so in low memory scenario, for
read path of node block, I expect this can avoid unneeded latency.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:37:14 +00:00
Yunlei He
2ed473dc91 f2fs: remove unused values in recover_fsync_data
commit fed24668482e07421b8e746a4886e7725434050a upstream.

This patch remove unused values in function recover_fsync_data

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:36:20 +00:00
Jaegeuk Kim
401c465b81 f2fs: support async discard based on v4.9
commit 275b66b09e85cf0520dc610dd89706952751a473 upstream.

This patch is based on commit 275b66b09e85 (f2fs: support async discard).

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:36:05 +00:00
Jaegeuk Kim
dc45fd9e28 f2fs: resolve op and op_flags confilcts
commit 70fd76140a6cb63262bd47b68d57b42e889c10ee upstream.

This patch backported ("block,fs: use REQ_* flags directly")

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:35:45 +00:00
Jaegeuk Kim
6b1f845ef3 f2fs: remove wrong backported codes
Kconfig and dentry RCU mode fixes.

Fixes: c1286ff41c ("f2fs: backport from 'for-f2fs-4.9'")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-03 13:35:15 +00:00
Todd Kjos
642da1dade FROMLIST: binder: fix use-after-free in binder_transaction()
(from https://patchwork.kernel.org/patch/9978801/)

User-space normally keeps the node alive when creating a transaction
since it has a reference to the target. The local strong ref keeps it
alive if the sending process dies before the target process processes
the transaction. If the source process is malicious or has a reference
counting bug, this can fail.

In this case, when we attempt to decrement the node in the failure
path, the node has already been freed.

This is fixed by taking a tmpref on the node while constructing
the transaction. To avoid re-acquiring the node lock and inner
proc lock to increment the proc's tmpref, a helper is used that
does the ref increments on both the node and proc.

Bug: 66899329
Change-Id: Iad40e1e0bccee88234900494fb52a510a37fe8d7
Signed-off-by: Todd Kjos <tkjos@google.com>
2017-10-02 18:08:29 +00:00
Ido Schimmel
a886cc1d3a UPSTREAM: ipv6: fib: Unlink replaced routes from their nodes
When a route is deleted its node pointer is set to NULL to indicate it's
no longer linked to its node. Do the same for routes that are replaced.

This will later allow us to test if a route is still in the FIB by
checking its node pointer instead of its reference count.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Cherry-pick from: 7483cea79957312e9f8e9cf760a1bc5d6c507113
Bug: 64978549

Change-Id: Ibfa54cf918084138b6b19437e9ef86bfaea5deae
2017-09-29 17:30:40 -07:00
Greg Kroah-Hartman
d68ba9f116 This is the 4.4.89 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlnLaLoACgkQONu9yGCS
 aT7hDw/+Ipx/xnjIUJFV/aqo8lTh3XqP/TjD5whoi+yYC8axLEZBLiOSLZceVjsG
 hi2mP22gKn1i7GLXNeWIZ+rMtVzAN+qNg7i8cjWNfFp1fA7cCfFaYvlV0LVrO2tK
 WnvvE8r5kQAKyQG8498ebEjianxwxHVERnNiE5/SDpCNj14DnwCJBTEYM0tEnuXZ
 /jBIIs4xvndVa0fFfUjuAzh65AefAT1BmgsPll4GnFMUFHh30smYdFla5LL0GNIq
 FQGFvIi8Q02disSMg9lFJVOlazc/HUREiFB1qy1DRtGMnS6/Q0HW0kCxeRi/7QEi
 +HN2rLxtbpnuD5P7W4lDJ5/cyCHMIv8SJ8OqUd8uxbTWz31P/QxbM7d35d+w3rq8
 dv3sQ6CMRnuIXGL5dFHh7zYqlzNS9PKjLmxzAw9grDf+nVsDxE4KUfJy00DSN1I1
 Bopi1kCD2nUMOiBrmxkIczN6OOvcGBHh6/TTB2WEKVHn42D0fjLnO66kJVJLMsBm
 vDdKJDDSGM/0HiUa5ydr6R0Ae7My3h5AJZRa5gn0kL/myatX/vsa0B2ZLpHlVipM
 GhODBsDFkI4k4yceONDZPJmhhVab1lewTMuIW5D2KRMsgpQqLmlOyL5gykfH0rTx
 FVnLSoMAHsgm6qVPwRS5BqK/UnXogfqjiB0iXzNNZnkiABWWoUQ=
 =Skkr
 -----END PGP SIGNATURE-----

Merge 4.4.89 into android-4.4

Changes in 4.4.89
	ipv6: accept 64k - 1 packet length in ip6_find_1stfragopt()
	ipv6: add rcu grace period before freeing fib6_node
	ipv6: fix sparse warning on rt6i_node
	qlge: avoid memcpy buffer overflow
	Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"
	tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0
	Revert "net: use lib/percpu_counter API for fragmentation mem accounting"
	Revert "net: fix percpu memory leaks"
	gianfar: Fix Tx flow control deactivation
	ipv6: fix memory leak with multiple tables during netns destruction
	ipv6: fix typo in fib6_net_exit()
	f2fs: check hot_data for roll-forward recovery
	x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
	md/raid5: release/flush io in raid5_do_work()
	nfsd: Fix general protection fault in release_lock_stateid()
	mm: prevent double decrease of nr_reserved_highatomic
	tty: improve tty_insert_flip_char() fast path
	tty: improve tty_insert_flip_char() slow path
	tty: fix __tty_insert_flip_char regression
	Input: i8042 - add Gigabyte P57 to the keyboard reset table
	MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation
	MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix cases of both inputs zero
	MIPS: math-emu: <MAX|MIN>.<D|S>: Fix cases of both inputs negative
	MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of input values with opposite signs
	MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of both infinite inputs
	MIPS: math-emu: MINA.<D|S>: Fix some cases of infinity and zero inputs
	crypto: AF_ALG - remove SGL terminator indicator when chaining
	ext4: fix incorrect quotaoff if the quota feature is enabled
	ext4: fix quota inconsistency during orphan cleanup for read-only mounts
	powerpc: Fix DAR reporting when alignment handler faults
	block: Relax a check in blk_start_queue()
	md/bitmap: disable bitmap_resize for file-backed bitmaps.
	skd: Avoid that module unloading triggers a use-after-free
	skd: Submit requests to firmware before triggering the doorbell
	scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled
	scsi: zfcp: add handling for FCP_RESID_OVER to the fcp ingress path
	scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records
	scsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBA
	scsi: zfcp: fix missing trace records for early returns in TMF eh handlers
	scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records
	scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response
	scsi: zfcp: trace high part of "new" 64 bit SCSI LUN
	scsi: megaraid_sas: Check valid aen class range to avoid kernel panic
	scsi: megaraid_sas: Return pended IOCTLs with cmd_status MFI_STAT_WRONG_STATE in case adapter is dead
	scsi: storvsc: fix memory leak on ring buffer busy
	scsi: sg: remove 'save_scat_len'
	scsi: sg: use standard lists for sg_requests
	scsi: sg: off by one in sg_ioctl()
	scsi: sg: factor out sg_fill_request_table()
	scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE
	scsi: qla2xxx: Fix an integer overflow in sysfs code
	ftrace: Fix selftest goto location on error
	tracing: Apply trace_clock changes to instance max buffer
	ARC: Re-enable MMU upon Machine Check exception
	PCI: shpchp: Enable bridge bus mastering if MSI is enabled
	media: v4l2-compat-ioctl32: Fix timespec conversion
	media: uvcvideo: Prevent heap overflow when accessing mapped controls
	bcache: initialize dirty stripes in flash_dev_run()
	bcache: Fix leak of bdev reference
	bcache: do not subtract sectors_to_gc for bypassed IO
	bcache: correct cache_dirty_target in __update_writeback_rate()
	bcache: Correct return value for sysfs attach errors
	bcache: fix for gc and write-back race
	bcache: fix bch_hprint crash and improve output
	ftrace: Fix memleak when unregistering dynamic ops when tracing disabled
	Linux 4.4.89

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-09-27 11:52:16 +02:00
Greg Kroah-Hartman
10def3a677 Linux 4.4.89 2017-09-27 11:00:37 +02:00
Steven Rostedt (VMware)
ed1bf4397d ftrace: Fix memleak when unregistering dynamic ops when tracing disabled
commit edb096e00724f02db5f6ec7900f3bbd465c6c76f upstream.

If function tracing is disabled by the user via the function-trace option or
the proc sysctl file, and a ftrace_ops that was allocated on the heap is
unregistered, then the shutdown code exits out without doing the proper
clean up. This was found via kmemleak and running the ftrace selftests, as
one of the tests unregisters with function tracing disabled.

 # cat kmemleak
unreferenced object 0xffffffffa0020000 (size 4096):
  comm "swapper/0", pid 1, jiffies 4294668889 (age 569.209s)
  hex dump (first 32 bytes):
    55 ff 74 24 10 55 48 89 e5 ff 74 24 18 55 48 89  U.t$.UH...t$.UH.
    e5 48 81 ec a8 00 00 00 48 89 44 24 50 48 89 4c  .H......H.D$PH.L
  backtrace:
    [<ffffffff81d64665>] kmemleak_vmalloc+0x85/0xf0
    [<ffffffff81355631>] __vmalloc_node_range+0x281/0x3e0
    [<ffffffff8109697f>] module_alloc+0x4f/0x90
    [<ffffffff81091170>] arch_ftrace_update_trampoline+0x160/0x420
    [<ffffffff81249947>] ftrace_startup+0xe7/0x300
    [<ffffffff81249bd2>] register_ftrace_function+0x72/0x90
    [<ffffffff81263786>] trace_selftest_ops+0x204/0x397
    [<ffffffff82bb8971>] trace_selftest_startup_function+0x394/0x624
    [<ffffffff81263a75>] run_tracer_selftest+0x15c/0x1d7
    [<ffffffff82bb83f1>] init_trace_selftests+0x75/0x192
    [<ffffffff81002230>] do_one_initcall+0x90/0x1e2
    [<ffffffff82b7d620>] kernel_init_freeable+0x350/0x3fe
    [<ffffffff81d61ec3>] kernel_init+0x13/0x122
    [<ffffffff81d72c6a>] ret_from_fork+0x2a/0x40
    [<ffffffffffffffff>] 0xffffffffffffffff

Fixes: 12cce594fa ("ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:17 +02:00
Michael Lyle
a069d0a43d bcache: fix bch_hprint crash and improve output
commit 9276717b9e297a62d1151a43d1cd286213f68eb7 upstream.

Most importantly, solve a crash where %llu was used to format signed
numbers.  This would cause a buffer overflow when reading sysfs
writeback_rate_debug, as only 20 bytes were allocated for this and
%llu writes 20 characters plus a null.

Always use the units mechanism rather than having different output
paths for simplicity.

Also, correct problems with display output where 1.10 was a larger
number than 1.09, by multiplying by 10 and then dividing by 1024 instead
of dividing by 100.  (Remainders of >= 1000 would print as .10).

Minor changes: Always display the decimal point instead of trying to
omit it based on number of digits shown.  Decide what units to use
based on 1000 as a threshold, not 1024 (in other words, always print
at most 3 digits before the decimal point).

Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reported-by: Dmitry Yu Okunev <dyokunev@ut.mephi.ru>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:17 +02:00
Tang Junhui
f522051a84 bcache: fix for gc and write-back race
commit 9baf30972b5568d8b5bc8b3c46a6ec5b58100463 upstream.

gc and write-back get raced (see the email "bcache get stucked" I sended
before):
gc thread                               write-back thread
|                                       |bch_writeback_thread()
|bch_gc_thread()                        |
|                                       |==>read_dirty()
|==>bch_btree_gc()                      |
|==>btree_root() //get btree root       |
|                //node write locker    |
|==>bch_btree_gc_root()                 |
|                                       |==>read_dirty_submit()
|                                       |==>write_dirty()
|                                       |==>continue_at(cl,
|                                       |               write_dirty_finish,
|                                       |               system_wq);
|                                       |==>write_dirty_finish()//excute
|                                       |               //in system_wq
|                                       |==>bch_btree_insert()
|                                       |==>bch_btree_map_leaf_nodes()
|                                       |==>__bch_btree_map_nodes()
|                                       |==>btree_root //try to get btree
|                                       |              //root node read
|                                       |              //lock
|                                       |-----stuck here
|==>bch_btree_set_root()
|==>bch_journal_meta()
|==>bch_journal()
|==>journal_try_write()
|==>journal_write_unlocked() //journal_full(&c->journal)
|                            //condition satisfied
|==>continue_at(cl, journal_write, system_wq); //try to excute
|                               //journal_write in system_wq
|                               //but work queue is excuting
|                               //write_dirty_finish()
|==>closure_sync(); //wait journal_write execute
|                   //over and wake up gc,
|-------------stuck here
|==>release root node write locker

This patch alloc a separate work-queue for write-back thread to avoid such
race.

(Commit log re-organized by Coly Li to pass checkpatch.pl checking)

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:17 +02:00
Tony Asleson
a6c5e7a0cd bcache: Correct return value for sysfs attach errors
commit 77fa100f27475d08a569b9d51c17722130f089e7 upstream.

If you encounter any errors in bch_cached_dev_attach it will return
a negative error code.  The variable 'v' which stores the result is
unsigned, thus user space sees a very large value returned for bytes
written which can cause incorrect user space behavior.  Utilize 1
signed variable to use throughout the function to preserve error return
capability.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:17 +02:00
Tang Junhui
d9c6a28a6a bcache: correct cache_dirty_target in __update_writeback_rate()
commit a8394090a9129b40f9d90dcb7f4a49d60c727ca6 upstream.

__update_write_rate() uses a Proportion-Differentiation Controller
algorithm to control writeback rate. A dirty target number is used in
this PD controller to control writeback rate. A larger target number
will make the writeback rate smaller, on the versus, a smaller target
number will make the writeback rate larger.

bcache uses the following steps to calculate the target number,
1) cache_sectors = all-buckets-of-cache-set * buckets-size
2) cache_dirty_target = cache_sectors * cached-device-writeback_percent
3) target = cache_dirty_target *
(sectors-of-cached-device/sectors-of-all-cached-devices-of-this-cache-set)

The calculation at step 1) for cache_sectors is incorrect, which does
not consider dirty blocks occupied by flash only volume.

A flash only volume can be took as a bcache device without cached
device. All data sectors allocated for it are persistent on cache device
and marked dirty, they are not touched by bcache writeback and garbage
collection code. So data blocks of flash only volume should be ignore
when calculating cache_sectors of cache set.

Current code does not subtract dirty sectors of flash only volume, which
results a larger target number from the above 3 steps. And in sequence
the cache device's writeback rate is smaller then a correct value,
writeback speed is slower on all cached devices.

This patch fixes the incorrect slower writeback rate by subtracting
dirty sectors of flash only volumes in __update_writeback_rate().

(Commit log composed by Coly Li to pass checkpatch.pl checking)

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:17 +02:00
Tang Junhui
0471f58e18 bcache: do not subtract sectors_to_gc for bypassed IO
commit 69daf03adef5f7bc13e0ac86b4b8007df1767aab upstream.

Since bypassed IOs use no bucket, so do not subtract sectors_to_gc to
trigger gc thread.

Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Acked-by: Coly Li <colyli@suse.de>
Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Jan Kara
093457f2bd bcache: Fix leak of bdev reference
commit 4b758df21ee7081ab41448d21d60367efaa625b3 upstream.

If blkdev_get_by_path() in register_bcache() fails, we try to lookup the
block device using lookup_bdev() to detect which situation we are in to
properly report error. However we never drop the reference returned to
us from lookup_bdev(). Fix that.

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Tang Junhui
5025da3b53 bcache: initialize dirty stripes in flash_dev_run()
commit 175206cf9ab63161dec74d9cd7f9992e062491f5 upstream.

bcache uses a Proportion-Differentiation Controller algorithm to control
writeback rate to cached devices. In the PD controller algorithm, dirty
stripes of thin flash device should not be counted in, because flash only
volumes never write back dirty data.

Currently dirty stripe counter for thin flash device is not initialized
when the thin flash device starts. Which means the following calculation
in PD controller will reference an undefined dirty stripes number, and
all cached devices attached to the same cache set where the thin flash
device lies on may have an inaccurate writeback rate.

This patch calles bch_sectors_dirty_init() in flash_dev_run(), to
correctly initialize dirty stripe counter when the thin flash device
starts to run. This patch also does following parameter data type change,
 -void bch_sectors_dirty_init(struct cached_dev *dc);
 +void bch_sectors_dirty_init(struct bcache_device *);
to call this function conveniently in flash_dev_run().

(Commit log is composed by Coly Li)

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Guenter Roeck
4931578fbe media: uvcvideo: Prevent heap overflow when accessing mapped controls
commit 7e09f7d5c790278ab98e5f2c22307ebe8ad6e8ba upstream.

The size of uvc_control_mapping is user controlled leading to a
potential heap overflow in the uvc driver. This adds a check to verify
the user provided size fits within the bounds of the defined buffer
size.

Originally-from: Richard Simmons <rssimmo@amazon.com>

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Daniel Mentz
04affe4e11 media: v4l2-compat-ioctl32: Fix timespec conversion
commit 9c7ba1d7634cef490b85bc64c4091ff004821bfd upstream.

Certain syscalls like recvmmsg support 64 bit timespec values for the
X32 ABI. The helper function compat_put_timespec converts a timespec
value to a 32 bit or 64 bit value depending on what ABI is used. The
v4l2 compat layer, however, is not designed to support 64 bit timespec
values and always uses 32 bit values. Hence, compat_put_timespec must
not be used.

Without this patch, user space will be provided with bad timestamp
values from the VIDIOC_DQEVENT ioctl. Also, fields of the struct
v4l2_event32 that come immediately after timestamp get overwritten,
namely the field named id.

Fixes: 81993e81a9 ("compat: Get rid of (get|put)_compat_time(val|spec)")
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: Tiffany Lin <tiffany.lin@mediatek.com>
Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Aleksandr Bezzubikov
7498bd6058 PCI: shpchp: Enable bridge bus mastering if MSI is enabled
commit 48b79a14505349a29b3e20f03619ada9b33c4b17 upstream.

An SHPC may generate MSIs to notify software about slot or controller
events (SHPC spec r1.0, sec 4.7).  A PCI device can only generate an MSI if
it has bus mastering enabled.

Enable bus mastering if the bridge contains an SHPC that uses MSI for event
notifications.

Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Jose Abreu
81306fc3db ARC: Re-enable MMU upon Machine Check exception
commit 1ee55a8f7f6b7ca4c0c59e0b4b4e3584a085c2d3 upstream.

I recently came upon a scenario where I would get a double fault
machine check exception tiriggered by a kernel module.
However the ensuing crash stacktrace (ksym lookup) was not working
correctly.

Turns out that machine check auto-disables MMU while modules are allocated
in kernel vaddr spapce.

This patch re-enables the MMU before start printing the stacktrace
making stacktracing of modules work upon a fatal exception.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: moved code into low level handler to avoid in 2 places]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Baohong Liu
d28e96be7c tracing: Apply trace_clock changes to instance max buffer
commit 170b3b1050e28d1ba0700e262f0899ffa4fccc52 upstream.

Currently trace_clock timestamps are applied to both regular and max
buffers only for global trace. For instance trace, trace_clock
timestamps are applied only to regular buffer. But, regular and max
buffers can be swapped, for example, following a snapshot. So, for
instance trace, bad timestamps can be seen following a snapshot.
Let's apply trace_clock timestamps to instance max buffer as well.

Link: http://lkml.kernel.org/r/ebdb168d0be042dcdf51f81e696b17fabe3609c1.1504642143.git.tom.zanussi@linux.intel.com

Fixes: 277ba0446 ("tracing: Add interface to allow multiple trace buffers")
Signed-off-by: Baohong Liu <baohong.liu@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Steven Rostedt (VMware)
753154fcfe ftrace: Fix selftest goto location on error
commit 46320a6acc4fb58f04bcf78c4c942cc43b20f986 upstream.

In the second iteration of trace_selftest_ops(), the error goto label is
wrong in the case where trace_selftest_test_global_cnt is off. In the
case of error, it leaks the dynamic ops that was allocated.

Fixes: 95950c2e ("ftrace: Add self-tests for multiple function trace users")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Dan Carpenter
d8663aa277 scsi: qla2xxx: Fix an integer overflow in sysfs code
commit e6f77540c067b48dee10f1e33678415bfcc89017 upstream.

The value of "size" comes from the user.  When we add "start + size" it
could lead to an integer overflow bug.

It means we vmalloc() a lot more memory than we had intended.  I believe
that on 64 bit systems vmalloc() can succeed even if we ask it to
allocate huge 4GB buffers.  So we would get memory corruption and likely
a crash when we call ha->isp_ops->write_optrom() and ->read_optrom().

Only root can trigger this bug.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=194061

Fixes: b7cc176c9e ("[SCSI] qla2xxx: Allow region-based flash-part accesses.")
Reported-by: shqking <shqking@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Hannes Reinecke
72896ca30a scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE
commit 3e0097499839e0fe3af380410eababe5a47c4cf9 upstream.

When calling SG_GET_REQUEST_TABLE ioctl only a half-filled table is
returned; the remaining part will then contain stale kernel memory
information.  This patch zeroes out the entire table to avoid this
issue.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Hannes Reinecke
c04996ad58 scsi: sg: factor out sg_fill_request_table()
commit 4759df905a474d245752c9dc94288e779b8734dd upstream.

Factor out sg_fill_request_table() for better readability.

[mkp: typos, applied by hand]

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:15 +02:00
Dan Carpenter
f0cd701d47 scsi: sg: off by one in sg_ioctl()
commit bd46fc406b30d1db1aff8dabaff8d18bb423fdcf upstream.

If "val" is SG_MAX_QUEUE then we are one element beyond the end of the
"rinfo" array so the > should be >=.

Fixes: 109bade9c625 ("scsi: sg: use standard lists for sg_requests")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:15 +02:00