Commit graph

481374 commits

Author SHA1 Message Date
Theodore Ts'o
ad7fefb109 Revert "ext4: fix suboptimal seek_{data,hole} extents traversial"
This reverts commit 14516bb7bb.

This was causing regression test failures with generic/285 with an ext3
filesystem using CONFIG_EXT4_USE_FOR_EXT23.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-01-02 15:16:00 -05:00
Theodore Ts'o
011fa99404 ext4: prevent online resize with backup superblock
Prevent BUG or corrupted file systems after the following:

mkfs.ext4 /dev/vdc 100M
mount -t ext4 -o sb=40961 /dev/vdc /vdc
resize2fs /dev/vdc

We previously prevented online resizing using the old resize ioctl.
Move the code to ext4_resize_begin(), so the check applies for all of
the resize ioctl's.

Reported-by: Maxim Malkov <malkov@ispras.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-26 23:58:21 -05:00
Dmitry Monakhov
50db71abc5 ext4: ext4_da_convert_inline_data_to_extent drop locked page after error
Testcase:
xfstests generic/270
MKFS_OPTIONS="-q -I 256 -O inline_data,64bit"

Call Trace:
 [<ffffffff81144c76>] lock_page+0x35/0x39 -------> DEADLOCK
 [<ffffffff81145260>] pagecache_get_page+0x65/0x15a
 [<ffffffff811507fc>] truncate_inode_pages_range+0x1db/0x45c
 [<ffffffff8120ea63>] ? ext4_da_get_block_prep+0x439/0x4b6
 [<ffffffff811b29b7>] ? __block_write_begin+0x284/0x29c
 [<ffffffff8120e62a>] ? ext4_change_inode_journal_flag+0x16b/0x16b
 [<ffffffff81150af0>] truncate_inode_pages+0x12/0x14
 [<ffffffff81247cb4>] ext4_truncate_failed_write+0x19/0x25
 [<ffffffff812488cf>] ext4_da_write_inline_data_begin+0x196/0x31c
 [<ffffffff81210dad>] ext4_da_write_begin+0x189/0x302
 [<ffffffff810c07ac>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff810ddd13>] ? read_seqcount_begin.clone.1+0x9f/0xcc
 [<ffffffff8114309d>] generic_perform_write+0xc7/0x1c6
 [<ffffffff810c040e>] ? mark_held_locks+0x59/0x77
 [<ffffffff811445d1>] __generic_file_write_iter+0x17f/0x1c5
 [<ffffffff8120726b>] ext4_file_write_iter+0x2a5/0x354
 [<ffffffff81185656>] ? file_start_write+0x2a/0x2c
 [<ffffffff8107bcdb>] ? bad_area_nosemaphore+0x13/0x15
 [<ffffffff811858ce>] new_sync_write+0x8a/0xb2
 [<ffffffff81186e7b>] vfs_write+0xb5/0x14d
 [<ffffffff81186ffb>] SyS_write+0x5c/0x8c
 [<ffffffff816f2529>] system_call_fastpath+0x12/0x17

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-05 21:37:15 -05:00
Dmitry Monakhov
14516bb7bb ext4: fix suboptimal seek_{data,hole} extents traversial
It is ridiculous practice to scan inode block by block, this technique
applicable only for old indirect files. This takes significant amount
of time for really large files. Let's reuse ext4_fiemap which already
traverse inode-tree in most optimal meaner.

TESTCASE:
ftruncate64(fd, 0);
ftruncate64(fd, 1ULL << 40);
/* lseek will spin very long time */
lseek64(fd, 0, SEEK_DATA);
lseek64(fd, 0, SEEK_HOLE);

Original report: https://lkml.org/lkml/2014/10/16/620

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-02 18:08:53 -05:00
Dmitry Monakhov
d952d69e26 ext4: ext4_inline_data_fiemap should respect callers argument
Currently ext4_inline_data_fiemap ignores requested arguments (start
and len) which may lead endless loop if start != 0.  Also fix incorrect
extent length determination.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-02 16:11:20 -05:00
Dmitry Monakhov
5cc28a9eaa ext4: prevent fsreentrance deadlock for inline_data
ext4_da_convert_inline_data_to_extent() invokes
grab_cache_page_write_begin().  grab_cache_page_write_begin performs
memory allocation, so fs-reentrance should be prohibited because we
are inside journal transaction.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-12-02 16:09:50 -05:00
Jan Kara
d4f7610743 ext4: forbid journal_async_commit in data=ordered mode
Option journal_async_commit breaks gurantees of data=ordered mode as it
sends only a single cache flush after writing a transaction commit
block. Thus even though the transaction including the commit block is
fully stored on persistent storage, file data may still linger in drives
caches and will be lost on power failure. Since all checksums match on
journal recovery, we replay the transaction thus possibly exposing stale
user data.

To fix this data exposure issue, remove the possibility to use
journal_async_commit in data=ordered mode.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 20:19:17 -05:00
Theodore Ts'o
d9f39d1e44 jbd2: remove unnecessary NULL check before iput()
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 20:02:37 -05:00
Markus Elfring
bfcba2d035 ext4: Remove an unnecessary check for NULL before iput()
The iput() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 20:01:37 -05:00
Namjae Jeon
31fc006b12 ext4: remove unneeded code in ext4_unlink
Setting retval to zero is not needed in ext4_unlink.
Remove unneeded code.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 16:34:38 -05:00
Eric Sandeen
b003b52496 ext4: don't count external journal blocks as overhead
This was fixed for ext3 with:

e6d8fb3 ext3: Count internal journal as bsddf overhead in ext3_statfs

but was never fixed for ext4.

With a large external journal and no used disk blocks, df comes
out negative without this, as journal blocks are added to the
overhead & subtracted from used blocks unconditionally.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 16:27:44 -05:00
Jan Kara
733ded2a80 ext4: remove never taken branch from ext4_ext_shift_path_extents()
path[depth].p_hdr can never be NULL for a path passed to us (and even if
it could, EXT_LAST_EXTENT() would make something != NULL from it). So
just remove the branch.

Coverity-id: 1196498
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 16:23:48 -05:00
Darrick J. Wong
c6d3d56dd0 ext4: create nojournal_checksum mount option
Create a mount option to disable journal checksumming (because the
metadata_csum feature turns it on by default now), and fix remount not
to allow changing the journal checksumming option, since changing the
mount options has no effect on the journal.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 16:20:50 -05:00
Wang Shilong
58d86a50ee ext4: update comments regarding ext4_delete_inode()
ext4_delete_inode() has been renamed for a long time, update
comments for this.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 16:17:29 -05:00
Dmitry Monakhov
4fdb554318 ext4: cleanup GFP flags inside resize path
We must use GFP_NOFS instead GFP_KERNEL inside ext4_mb_add_groupinfo
and ext4_calculate_overhead() because they are called from inside a
journal transaction. Call trace:

ioctl
 ->ext4_group_add
   ->journal_start
   ->ext4_setup_new_descs
     ->ext4_mb_add_groupinfo -> GFP_KERNEL
   ->ext4_flex_group_add
     ->ext4_update_super
       ->ext4_calculate_overhead  -> GFP_KERNEL
   ->journal_stop

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 13:08:04 -05:00
Jan Kara
2be12de98a ext4: introduce aging to extent status tree
Introduce a simple aging to extent status tree. Each extent has a
REFERENCED bit which gets set when the extent is used. Shrinker then
skips entries with referenced bit set and clears the bit. Thus
frequently used extents have higher chances of staying in memory.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 11:55:24 -05:00
Jan Kara
624d0f1dd7 ext4: cleanup flag definitions for extent status tree
Currently flags for extent status tree are defined twice, once shifted
and once without a being shifted. Consolidate these definitions into one
place and make some computations automatic to make adding flags less
error prone. Compiler should be clever enough to figure out these are
constants and generate the same code.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 11:53:47 -05:00
Jan Kara
dd47592551 ext4: limit number of scanned extents in status tree shrinker
Currently we scan extent status trees of inodes until we reclaim nr_to_scan
extents. This can however require a lot of scanning when there are lots
of delayed extents (as those cannot be reclaimed).

Change shrinker to work as shrinkers are supposed to and *scan* only
nr_to_scan extents regardless of how many extents did we actually
reclaim. We however need to be careful and avoid scanning each status
tree from the beginning - that could lead to a situation where we would
not be able to reclaim anything at all when first nr_to_scan extents in
the tree are always unreclaimable. We remember with each inode offset
where we stopped scanning and continue from there when we next come
across the inode.

Note that we also need to update places calling __es_shrink() manually
to pass reasonable nr_to_scan to have a chance of reclaiming anything and
not just 1.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 11:51:23 -05:00
Jan Kara
b0dea4c165 ext4: move handling of list of shrinkable inodes into extent status code
Currently callers adding extents to extent status tree were responsible
for adding the inode to the list of inodes with freeable extents. This
is error prone and puts list handling in unnecessarily many places.

Just add inode to the list automatically when the first non-delay extent
is added to the tree and remove inode from the list when the last
non-delay extent is removed.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 11:49:25 -05:00
Zheng Liu
edaa53cac8 ext4: change LRU to round-robin in extent status tree shrinker
In this commit we discard the lru algorithm for inodes with extent
status tree because it takes significant effort to maintain a lru list
in extent status tree shrinker and the shrinker can take a long time to
scan this lru list in order to reclaim some objects.

We replace the lru ordering with a simple round-robin.  After that we
never need to keep a lru list.  That means that the list needn't be
sorted if the shrinker can not reclaim any objects in the first round.

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 11:45:37 -05:00
Zheng Liu
2f8e0a7c6c ext4: cache extent hole in extent status tree for ext4_da_map_blocks()
Currently extent status tree doesn't cache extent hole when a write
looks up in extent tree to make sure whether a block has been allocated
or not.  In this case, we don't put extent hole in extent cache because
later this extent might be removed and a new delayed extent might be
added back.  But it will cause a defect when we do a lot of writes.  If
we don't put extent hole in extent cache, the following writes also need
to access extent tree to look at whether or not a block has been
allocated.  It brings a cache miss.  This commit fixes this defect.
Also if the inode doesn't have any extent, this extent hole will be
cached as well.

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 11:44:37 -05:00
Jan Kara
cbd7584e6e ext4: fix block reservation for bigalloc filesystems
For bigalloc filesystems we have to check whether newly requested inode
block isn't already part of a cluster for which we already have delayed
allocation reservation. This check happens in ext4_ext_map_blocks() and
that function sets EXT4_MAP_FROM_CLUSTER if that's the case. However if
ext4_da_map_blocks() finds in extent cache information about the block,
we don't call into ext4_ext_map_blocks() and thus we always end up
getting new reservation even if the space for cluster is already
reserved. This results in overreservation and premature ENOSPC reports.

Fix the problem by checking for existing cluster reservation already in
ext4_da_map_blocks(). That simplifies the logic and actually allows us
to get rid of the EXT4_MAP_FROM_CLUSTER flag completely.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-25 11:41:49 -05:00
Eric Whitney
0756b908a3 ext4: fix end of region partial cluster handling
ext4_ext_remove_space() can incorrectly free a partial_cluster if
EAGAIN is encountered while truncating or punching.  Extent removal
should be retried in this case.

It also fails to free a partial cluster when the punched region begins
at the start of a file on that unaligned cluster and where the entire
file has not been punched.  Remove the requirement that all blocks in
the file must have been freed in order to free the partial cluster.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-23 00:59:39 -05:00
Eric Whitney
345ee94748 ext4: miscellaneous partial cluster cleanups
Add some casts and rearrange a few statements for improved readability.
Some code can also be simplified and made more readable if we set
partial_cluster to 0 rather than to a negative value when we can tell
we've hit the left edge of the punched region.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-23 00:59:39 -05:00
Eric Whitney
5bf4376065 ext4: fix end of leaf partial cluster handling
The fix in commit ad6599ab3a ("ext4: fix premature freeing of
partial clusters split across leaf blocks"), intended to avoid
dereferencing an invalid extent pointer when determining whether a
partial cluster should be freed, wasn't quite good enough.  Assure that
at least one extent remains at the start of the leaf once the hole has
been punched.  Otherwise, the pointer to the extent to the right of the
hole will be invalid and a partial cluster will be incorrectly freed.

Set partial_cluster to 0 when we can tell we've hit the left edge of
the punched region within the leaf.  This prevents incorrect freeing
of a partial cluster when ext4_ext_rm_leaf is called one last time
during extent tree traversal after the punched region has been removed.

Adjust comments to reflect code changes and a correction.  Remove a bit
of dead code.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-23 00:58:11 -05:00
Eric Whitney
f4226d9ea4 ext4: fix partial cluster initialization
The partial_cluster variable is not always initialized correctly when
hole punching on bigalloc file systems.  Although commit c063449394
("ext4: fix partial cluster handling for bigalloc file systems")
addressed the case where the right edge of the punched region and the
next extent to its right were within the same leaf, it didn't handle
the case where the next extent to its right is in the next leaf.  This
causes xfstest generic/300 to fail.

Fix this by replacing the code in c0634493922 with a more general
solution that can continue the search for the first cluster to the
right of the punched region into the next leaf if present.  If found,
partial_cluster is initialized to this cluster's negative value.
There's no need to determine if that cluster is actually shared;  we
simply record it so its blocks won't be freed in the event it does
happen to be shared.

Also, minimize the burden on non-bigalloc file systems with some minor
code simplification.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-23 00:55:42 -05:00
Al Viro
b93b41d4c7 ext4: kill ext4_kvfree()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-20 12:19:11 -05:00
Dmitry Monakhov
88c6b61ff1 ext4: move_extent improve bh vanishing success factor
Xiaoguang Wang has reported sporadic EBUSY failures of ext4/302
Unfortunetly there is nothing we can do if some other task holds BH's
refenrence.  So we must return EBUSY in this case.  But we can try
kicking the journal to see if the other task releases the bh reference
after the commit is complete.  Also decrease false positives by
properly checking for ENOSPC and retrying the allocation after kicking
the journal --- which is done by ext4_should_retry_alloc().

[ Modified by tytso to properly check for ENOSPC. ]

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-11-05 11:52:38 -05:00
Linus Torvalds
0df1f2487d Linux 3.18-rc3 2014-11-02 15:01:51 -08:00
Linus Torvalds
81d92dc117 Three main MTD fixes for 3.18:
* A regression from 3.16 which was noticed in 3.17. With the restructuring of
    the m25p80.c driver and the SPI NOR library framework, we omitted proper
    listing of the SPI device IDs. This means m25p80.c wouldn't auto-load
    (modprobe) properly when built as a module. For now, we duplicate the device
    IDs into both modules.
 
 * The OMAP / ELM modules were depending on an implicit link ordering. Use
   deferred probing so that the new link order (in 3.18-rc) can still allow for
   successful probing.
 
 * Fix suspend/resume support for LH28F640BF NOR flash
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUVq20AAoJEFySrpd9RFgt8XQQAI4oIygz4zGQ6n0y4HOqwOBy
 F4ZPtOuzuCYA86x2zORFgj4A9JGVjDQwTfnMQnn1NG+XEEmZMJfG2IwqlUxsZd5A
 KkAS5XUoi/Fvq95Qi95KQYXqm1dniXEGKsRFsHKXIsnnDmbqRK5fBn6Ve5PAwcau
 uru5FwrZ2Ve0EwF9/Z/bxAatRirdAhwgMGlaXdXLmL7S13NQGmXP9QI7CbxSZ38R
 GJ+A6PhiYs6Sml6Ou5bovNXyFGfx4J35pk6nTWoWe5MfZHRQk447OQwBPbsrM119
 Boq8F/6diXyJfuXdxvF6JiDmDzaw/fBY+Xuq1O6p+JzLONN16x93KlpAPhzy4a15
 PwFHCBzg5khY49if/dmrPJ+kLkU+9wIHUib8m6HSImCKBT5Bv/VJXoQ1g4s1IJ8/
 Di3mz/8pWm/cscABIkuEqb9TwwUrSHzVXgGH/p4CY0eUo8DbQQA1zDsig8aRIX36
 FlAReaHH8QivdnghkMX9Px7SIo7XoMZZEi+55k8FrIVqjqEHNGx+w+BKhxgtFggN
 nAg0l7NrLdQHpigK1SZjLFGIYi7MmarvbatUjVPGagiRqoQ0mCSS7eKX1DEs4EAo
 P2g64BSJickGAhUiAV9ZO1EBoaOU6olIPpc33J+uG/8qBU1cNClx3FJ1UPWX27JQ
 +FBsD1mec4FuoZ2SoE7r
 =IxjM
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20141102' of git://git.infradead.org/linux-mtd

Pull MTD fixes from Brian Norris:
 "Three main MTD fixes for 3.18:

   - A regression from 3.16 which was noticed in 3.17.  With the
     restructuring of the m25p80.c driver and the SPI NOR library
     framework, we omitted proper listing of the SPI device IDs.  This
     means m25p80.c wouldn't auto-load (modprobe) properly when built as
     a module.  For now, we duplicate the device IDs into both modules.

   - The OMAP / ELM modules were depending on an implicit link ordering.
     Use deferred probing so that the new link order (in 3.18-rc) can
     still allow for successful probing.

   - Fix suspend/resume support for LH28F640BF NOR flash"

* tag 'for-linus-20141102' of git://git.infradead.org/linux-mtd:
  mtd: cfi_cmdset_0001.c: fix resume for LH28F640BF chips
  mtd: omap: fix mtd devices not showing up
  mtd: m25p80,spi-nor: Fix module aliases for m25p80
  mtd: spi-nor: make spi_nor_scan() take a chip type name, not spi_device_id
  mtd: m25p80: get rid of spi_get_device_id
2014-11-02 14:45:52 -08:00
Linus Torvalds
ad2be3796f SCSI for-linus on 20141102
This is a set of six patches consisting of two MAINTAINER updates, two scsi-mq
 fixs for the old parallel interface (not every request is tagged and we need
 to set the right flags to populate the SPI tag message) and a fix for a memory
 leak in scatterlist traversal caused by a preallocation update in 3.17) and an
 ipv6 fix for cxgbi.
 
 Signed-off-by: James Bottomley <JBottomley@Parallels.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJUVqYyAAoJEDeqqVYsXL0MDwQIAI4aCdC3LUelt6IzywToiJ9u
 GTUZU7eULzV1KuaExcBM8g9DyE3wtnhfS0WA5TqvfKf8ImMKntan0loK0juIjj8K
 UBtbSDapwkKRRAIoRHvBzcHU+Jzc//naO95ojsFt2ELHB91jta368+DiPb8m+wQx
 IKeYO6vFaSwZZCKVNaWwyCpxEKmMO+ib80zfuANBzWjau3IFxMiUofSvtPGIWb+E
 Zxi9qvOyqjfFE8OLfL9ckH5NPiIYcwmarAXyuYcJstiw6VX2deGBaJalueSSC0JV
 TAqEtT5k1AraFS2HM0TRxar+HK5iBj3s8f0pFBfgD2+oagUln6L/486nMh2Ktac=
 =l6wS
 -----END PGP SIGNATURE-----

Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of six patches consisting of:
   - two MAINTAINER updates
   - two scsi-mq fixs for the old parallel interface (not every request
     is tagged and we need to set the right flags to populate the SPI
     tag message)
   - a fix for a memory leak in scatterlist traversal caused by a
     preallocation update in 3.17
   - an ipv6 fix for cxgbi"

[ The scatterlist fix also came in separately through the block layer tree ]

* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  MAINTAINERS: ufs - remove self
  MAINTAINERS: change hpsa and cciss maintainer
  libcxgbi : support ipv6 address host_param
  scsi: set REQ_QUEUE for the blk-mq case
  Revert "block: all blk-mq requests are tagged"
  lib/scatterlist: fix memory leak with scsi-mq
2014-11-02 14:39:35 -08:00
Linus Torvalds
12267166c5 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Nothing too astounding or major: radeon, i915, vmwgfx, armada and
  exynos.

  Biggest ones:
   - vmwgfx has one big locking regression fix
   - i915 has come displayport fixes
   - radeon has some stability and a memory alloc failure
   - armada and exynos have some vblank fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (24 commits)
  drm/exynos: correct connector->dpms field before resuming
  drm/exynos: enable vblank after DPMS on
  drm/exynos: init kms poll at the end of initialization
  drm/exynos: propagate plane initialization errors
  drm/exynos: vidi: fix build warning
  drm/exynos: remove explicit encoder/connector de-initialization
  drm/exynos: init vblank with real number of crtcs
  drm/vmwgfx: Filter out modes those cannot be supported by the current VRAM size.
  drm/vmwgfx: Fix hash key computation
  drm/vmwgfx: fix lock breakage
  drm/i915/dp: only use training pattern 3 on platforms that support it
  drm/radeon: remove some buggy dead code
  drm/i915: Ignore VBT backlight check on Macbook 2, 1
  drm/radeon: remove invalid pci id
  drm/radeon: dpm fixes for asrock systems
  radeon: clean up coding style differences in radeon_get_bios()
  drm/radeon: Use drm_malloc_ab instead of kmalloc_array
  drm/radeon/dpm: disable ulv support on SI
  drm/i915: Fix GMBUSFREQ on vlv/chv
  drm/i915: Ignore long hpds on eDP ports
  ...
2014-11-02 14:27:30 -08:00
Linus Torvalds
3c43de0ffd Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 - add the new bpf syscall to ARM.
 - drop a redundant return statement in __iommu_alloc_remap()
 - fix a performance issue noticed by Thomas Petazzoni with
   kmap_atomic().
 - fix an issue with the L2 cache OF parsing code which caused it to
   incorrectly print warnings on each boot, and make the warning text
   more consistent with the rest of the code

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8180/1: mm: implement no-highmem fast path in kmap_atomic_pfn()
  ARM: 8183/1: l2c: Improve l2c310_of_parse() error message
  ARM: 8181/1: Drop extra return statement
  ARM: 8182/1: l2c: Make l2x0_cache_size_of_parse() return 'int'
  ARM: enable bpf syscall
2014-11-02 12:56:20 -08:00
Linus Torvalds
7501a53329 A small set of x86 fixes. The most serious is an SRCU lockdep fix.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJUVd9KAAoJEL/70l94x66Dc1AH/0jdb8DsewyAuJzLKaJ/qJwK
 9JMqglpDQ+Sm0f2puPyJkR8NQd2AMPK7J5aJjWAl/XxJjsDcn+TQur20okzUDXLJ
 21sIbqo92hCgpSNs+RHLHlj7/iMQVYnMFh7bp6JcvzmhpN8F/D793BT+oOxdjMRg
 PLCQ794ugGhFboesDkV822VWgtQ26yG2aQDWbYgL9r5xPp5OpbzSiq85KopSEfS0
 K+PPntI8yNI+EvOC9ta0FfEOMMfQoLDds+V0FXiEIRx43MV8bwAXpWzsB8ibd1F6
 eY+cVvSPzWgDSCVLn3gfYkrRl3sWGdvyfxTe/cz507ZfXcuT2uHJhtbpH2KCGto=
 =FJ6/
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A small set of x86 fixes.  The most serious is an SRCU lockdep fix.

  A bit late - needed some time to test the SRCU fix, which only came in
  on Friday"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: vmx: defer load of APIC access page address during reset
  KVM: nVMX: Disable preemption while reading from shadow VMCS
  KVM: x86: Fix far-jump to non-canonical check
  KVM: emulator: fix execution close to the segment limit
  KVM: emulator: fix error code for __linearize
2014-11-02 12:31:02 -08:00
Dave Airlie
66338feee4 Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
This pull-request includes some bug fixes and code cleanups.
Especially, this fixes the bind failure issue occurred when it tries
to re-bind Exynos drm driver after unbound, and the modetest failure
issue incurred by not having a pair to vblank on and off requests.

* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos: correct connector->dpms field before resuming
  drm/exynos: enable vblank after DPMS on
  drm/exynos: init kms poll at the end of initialization
  drm/exynos: propagate plane initialization errors
  drm/exynos: vidi: fix build warning
  drm/exynos: remove explicit encoder/connector de-initialization
  drm/exynos: init vblank with real number of crtcs
2014-11-03 05:23:17 +10:00
Linus Torvalds
7e05b807b9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS fixes from Al Viro:
 "A bunch of assorted fixes, most of them followups to overlayfs merge"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ovl: initialize ->is_cursor
  Return short read or 0 at end of a raw device, not EIO
  isofs: don't bother with ->d_op for normal case
  isofs_cmp(): we'll never see a dentry for . or ..
  overlayfs: fix lockdep misannotation
  ovl: fix check for cursor
  overlayfs: barriers for opening upper-layer directory
  rcu: Provide counterpart to rcu_dereference() for non-RCU situations
  staging: android: logger: Fix log corruption regression
2014-11-02 10:28:43 -08:00
Linus Torvalds
4cb8c3593b irda: stop calling sk_prot->disconnect() on connection failure
The sk_prot is irda's own set of protocol handlers, so irda should
statically know what that function is anyway, without using an indirect
pointer.  And as it happens, we know *exactly* what that pointer is
statically: it's NULL, because irda doesn't define a disconnect
operation.

So calling that function is doubly wrong, and will just cause an oops.

Reported-by: Martin Lang <mlg.hessigheim@gmail.com>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-02 10:20:26 -08:00
Andrzej Hajda
74cfe07a83 drm/exynos: correct connector->dpms field before resuming
During system suspend after connector switch off its dpms field
is set to connector previous dpms state. To properly resume dpms field
should be set to its actual state (off) before resuming to previous dpms state.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-11-03 01:51:28 +09:00
Andrzej Hajda
d6948b2fd8 drm/exynos: enable vblank after DPMS on
Before DPMS off driver disables vblank.
It should be balanced by vblank enable after DPMS on.
The patch fixes issue with page_flip ioctl not being able
to acquire vblank counter introduced by patch:
drm: Always reject drm_vblank_get() after drm_vblank_off()

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-11-03 01:51:28 +09:00
Andrzej Hajda
3cb6830a75 drm/exynos: init kms poll at the end of initialization
HPD events can be generated by components even if drm_dev is not fully
initialized, to skip such events kms poll initialization should
be performed at the end of load callback followed directly by forced
connection detection.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-11-03 01:51:28 +09:00
Andrzej Hajda
64f7aed83d drm/exynos: propagate plane initialization errors
In case of error during plane initialization load callback
incorrectly return success, this patch fixes it.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-11-03 01:51:28 +09:00
Inki Dae
9887e2d9da drm/exynos: vidi: fix build warning
encoder object isn't used anymore so remove it.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-11-03 01:51:27 +09:00
Andrzej Hajda
d9aaf75762 drm/exynos: remove explicit encoder/connector de-initialization
All KMS objects are destroyed by drm_mode_config_cleanup in proper order
so component drivers should not care about it.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-11-03 01:51:27 +09:00
Andrzej Hajda
c52142e6a8 drm/exynos: init vblank with real number of crtcs
Initialization of vblank with MAX_CRTC caused attempts
to disabling vblanks for non-existing crtcs in case
drm used fewer crtcs. The patch fixes it.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-11-03 01:51:27 +09:00
Paolo Bonzini
a73896cb5b KVM: vmx: defer load of APIC access page address during reset
Most call paths to vmx_vcpu_reset do not hold the SRCU lock.  Defer loading
the APIC access page to the next vmentry.

This avoids the following lockdep splat:

[ INFO: suspicious RCU usage. ]
3.18.0-rc2-test2+ #70 Not tainted
-------------------------------
include/linux/kvm_host.h:474 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-x86/2371:
 #0:  (&vcpu->mutex){+.+...}, at: [<ffffffffa037d800>] vcpu_load+0x20/0xd0 [kvm]

stack backtrace:
CPU: 4 PID: 2371 Comm: qemu-system-x86 Not tainted 3.18.0-rc2-test2+ #70
Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
 0000000000000001 ffff880209983ca8 ffffffff816f514f 0000000000000000
 ffff8802099b8990 ffff880209983cd8 ffffffff810bd687 00000000000fee00
 ffff880208a2c000 ffff880208a10000 ffff88020ef50040 ffff880209983d08
Call Trace:
 [<ffffffff816f514f>] dump_stack+0x4e/0x71
 [<ffffffff810bd687>] lockdep_rcu_suspicious+0xe7/0x120
 [<ffffffffa037d055>] gfn_to_memslot+0xd5/0xe0 [kvm]
 [<ffffffffa03807d3>] __gfn_to_pfn+0x33/0x60 [kvm]
 [<ffffffffa0380885>] gfn_to_page+0x25/0x90 [kvm]
 [<ffffffffa038aeec>] kvm_vcpu_reload_apic_access_page+0x3c/0x80 [kvm]
 [<ffffffffa08f0a9c>] vmx_vcpu_reset+0x20c/0x460 [kvm_intel]
 [<ffffffffa039ab8e>] kvm_vcpu_reset+0x15e/0x1b0 [kvm]
 [<ffffffffa039ac0c>] kvm_arch_vcpu_setup+0x2c/0x50 [kvm]
 [<ffffffffa037f7e0>] kvm_vm_ioctl+0x1d0/0x780 [kvm]
 [<ffffffff810bc664>] ? __lock_is_held+0x54/0x80
 [<ffffffff812231f0>] do_vfs_ioctl+0x300/0x520
 [<ffffffff8122ee45>] ? __fget+0x5/0x250
 [<ffffffff8122f0fa>] ? __fget_light+0x2a/0xe0
 [<ffffffff81223491>] SyS_ioctl+0x81/0xa0
 [<ffffffff816fed6d>] system_call_fastpath+0x16/0x1b

Reported-by: Takashi Iwai <tiwai@suse.de>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Fixes: 38b9917350
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-02 08:37:18 +01:00
Jan Kiszka
282da870f4 KVM: nVMX: Disable preemption while reading from shadow VMCS
In order to access the shadow VMCS, we need to load it. At this point,
vmx->loaded_vmcs->vmcs and the actually loaded one start to differ. If
we now get preempted by Linux, vmx_vcpu_put and, on return, the
vmx_vcpu_load will work against the wrong vmcs. That can cause
copy_shadow_to_vmcs12 to corrupt the vmcs12 state.

Fix the issue by disabling preemption during the copy operation.
copy_vmcs12_to_shadow is safe from this issue as it is executed by
vmx_vcpu_run when preemption is already disabled before vmentry.

This bug is exposed by running Jailhouse within KVM on CPUs with
shadow VMCS support.  Jailhouse never expects an interrupt pending
vmexit, but the bug can cause it if, after copy_shadow_to_vmcs12
is preempted, the active VMCS happens to have the virtual interrupt
pending flag set in the CPU-based execution controls.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-02 07:55:46 +01:00
Nadav Amit
7e46dddd6f KVM: x86: Fix far-jump to non-canonical check
Commit d1442d85cc ("KVM: x86: Handle errors when RIP is set during far
jumps") introduced a bug that caused the fix to be incomplete.  Due to
incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit
segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may
not trigger #GP.  As we know, this imposes a security problem.

In addition, the condition for two warnings was incorrect.

Fixes: d1442d85cc
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
[Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-02 07:54:55 +01:00
Dave Airlie
10a8fce846 Merge branch 'vmwgfx-fixes-3.18' of git://people.freedesktop.org/~thomash/linux
A critical 3.18 regression fix from Rob, (thanks!)
A fix to avoid advertizing modes we can't support from Sinclair
  (welcome Sinclair!)
and a fix for an incorrect  hash key computation from me that is
  completely harmless, but can wait 'til the next merge window if necessary.
  (I can't really bother stable with this one).

* 'vmwgfx-fixes-3.18' of git://people.freedesktop.org/~thomash/linux:
  drm/vmwgfx: Filter out modes those cannot be supported by the current VRAM size.
  drm/vmwgfx: Fix hash key computation
  drm/vmwgfx: fix lock breakage
2014-11-02 09:23:31 +10:00
Linus Torvalds
12d7aacab5 Staging fixes for 3.18-rc3
Here are some staging driver fixes for 3.18-rc3.  Mostly iio and comedi
 driver fixes for issues reported by people.
 
 All of these have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlRVPt0ACgkQMUfUDdst+yndhACgyuy8zg68a3VBujGyJN1iVigF
 wmEAoIqv0NfNrJ6tOKGGQlB40ZEEyjF3
 =h/D5
 -----END PGP SIGNATURE-----

Merge tag 'staging-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging fixes from Greg KH:
 "Here are some staging driver fixes for 3.18-rc3.  Mostly iio and
  comedi driver fixes for issues reported by people.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: comedi: fix memory leak / bad pointer freeing for chanlist
  staging: comedi: Kconfig: fix config COMEDI_ADDI_APCI_3120 dependants
  staging: comedi: widen subdevice number argument in ioctl handlers
  staging: rtl8723au: Fix alignment of mac_addr for ether_addr_copy() usage
  drivers/staging/comedi/Kconfig: Let COMEDI_II_PCI20KC depend on HAS_IOMEM
  staging: comedi: (regression) channel list must be set for COMEDI_CMD ioctl
  iio: adc: mxs-lradc: Disable the clock on probe failure
  iio: st_sensors: Fix buffer copy
  staging:iio:ad5933: Drop "raw" from channel names
  staging:iio:ad5933: Fix NULL pointer deref when enabling buffer
2014-11-01 15:11:27 -07:00
Linus Torvalds
528a506e4b USB fixes for 3.18-rc3
Here are a bunch of USB fixes for 3.18-rc3.
 
 Mostly usb-serial device ids and gadget fixes for issues that have been
 reported.  Full details are in the shortlog.
 
 All of these have been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlRVP34ACgkQMUfUDdst+ykxrwCbB2HvQU+YKThwHs2g6Gf+MElI
 Rk8An2ftMJlvvB4rcqwUYqbMZGV02zxV
 =0SHY
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a bunch of USB fixes for 3.18-rc3.

  Mostly usb-serial device ids and gadget fixes for issues that have
  been reported.  Full details are in the shortlog.

  All of these have been in linux-next for a while"

* tag 'usb-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (42 commits)
  usb: chipidea: Fix oops when removing the ci_hdrc module
  usb: gadget: function: Fixed the return value on error path
  usb: dwc2: gadget: disable phy before turning off power regulators
  usb: gadget: function: Remove redundant usb_free_all_descriptors
  usb: dwc3: gadget: Properly initialize LINK TRB
  usb: dwc2: gadget: fix gadget unregistration in udc_stop() function
  usb: dwc2: Bits in bitfield should add up to 32
  usb: dwc2: gadget: sparse warning of context imbalance
  usb: gadget: udc: core: fix kernel oops with soft-connect
  usb: musb: musb_dsps: fix NULL pointer in suspend
  usb: musb: dsps: start OTG timer on resume again
  usb: gadget: loopback: don't queue requests to bogus endpoints
  usb: ffs: fix regression when quirk_ep_out_aligned_size flag is set
  usb: gadget: f_fs: remove redundant ffs_data_get()
  usb: gadget: udc: USB_GADGET_XILINX should depend on HAS_DMA
  Revert "usb: dwc3: dwc3-omap: Disable/Enable only wrapper interrupts in prepare/complete"
  usb: gadget: composite: enable BESL support
  usb: musb: cppi41: restart hrtimer only if not yet done
  usb: dwc3: ep0: fix Data Phase for transfer sizes aligned to wMaxPacketSize
  usb: serial: ftdi_sio: add "bricked" FTDI device PID
  ...
2014-11-01 15:08:04 -07:00