For buffered IO, we don't need to use block plug to cache bio,
for direct IO, generic f2fs_direct_IO has already added block
plug, so let's remove redundant one in .write_iter.
As Yunlei described in his patch:
-f2fs_file_write_iter
-blk_start_plug
-__generic_file_write_iter
...
-do_blockdev_direct_IO
-blk_start_plug
...
-blk_finish_plug
...
-blk_finish_plug
which may conduct performance decrease in our platform
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Since the layout of regular dentry block is different from inline dentry
block, zero_user_segment starting from MAX_INLINE_DATA(dir) is not
correct for regular dentry block, besides, bitmap is already copied and
used, so there is no necessary to zero page at all, so just remove the
zero_user_segment is OK.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, we use generic FS_*_FL defined by vfs to indicate inode status
for each bit of i_flags, so f2fs's flag status definition is tied to vfs'
one, it will be hard for f2fs to reuse bits f2fs never used to indicate
new status..
In order to solve this issue, we introduce private inode status mapping,
Note, for these bits have already been persisted into disk, we should
never change their definition, for other ones, we can remap them for
later new coming status.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* refs/heads/tmp-7ba5557
Linux 4.4.139
Bluetooth: Fix connection if directed advertising and privacy is used
cdc_ncm: avoid padding beyond end of skb
dm thin: handle running out of data space vs concurrent discard
block: Fix transfer when chunk sectors exceeds max
spi: Fix scatterlist elements size in spi_map_buf
Btrfs: fix unexpected cow in run_delalloc_nocow
ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
Input: elantech - fix V4 report decoding for module with middle key
Input: elantech - enable middle button of touchpads on ThinkPad P52
Input: elan_i2c_smbus - fix more potential stack buffer overflows
udf: Detect incorrect directory size
xen: Remove unnecessary BUG_ON from __unbind_from_irq()
Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID
video: uvesafb: Fix integer overflow in allocation
NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message
nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdir
media: dvb_frontend: fix locking issues at dvb_frontend_get_event()
media: cx231xx: Add support for AverMedia DVD EZMaker 7
media: v4l2-compat-ioctl32: prevent go past max size
perf intel-pt: Fix packet decoding of CYC packets
perf intel-pt: Fix "Unexpected indirect branch" error
perf intel-pt: Fix MTC timing after overflow
perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
mfd: intel-lpss: Program REMAP register in PIO mode
backlight: tps65217_bl: Fix Device Tree node lookup
backlight: max8925_bl: Fix Device Tree node lookup
backlight: as3711_bl: Fix Device Tree node lookup
xfrm: skip policies marked as dead while rehashing
xfrm: Ignore socket policies when rebuilding hash tables
UBIFS: Fix potential integer overflow in allocation
ubi: fastmap: Cancel work upon detach
md: fix two problems with setting the "re-add" device state.
linvdimm, pmem: Preserve read-only setting for pmem devices
scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread
scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return
scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed
scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF
scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler
scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails
iio:buffer: make length types match kfifo types
Btrfs: fix clone vs chattr NODATASUM race
time: Make sure jiffies_to_msecs() preserves non-zero time periods
MIPS: io: Add barrier after register read in inX()
PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
mtd: cfi_cmdset_0002: Change write buffer to check correct value
RDMA/mlx4: Discard unknown SQP work requests
IB/qib: Fix DMA api warning with debug kernel
of: unittest: for strings, account for trailing \0 in property length field
ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size
powerpc/fadump: Unregister fadump on kexec down path.
cpuidle: powernv: Fix promotion from snooze if next state disabled
powerpc/ptrace: Fix enforcement of DAWR constraints
powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch
fuse: fix control dir setup and teardown
fuse: don't keep dead fuse_conn at fuse_fill_super().
fuse: atomic_o_trunc should truncate pagecache
Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader
ipmi:bt: Set the timeout before doing a capabilities check
branch-check: fix long->int truncation when profiling branches
mips: ftrace: fix static function graph tracing
lib/vsprintf: Remove atomic-unsafe support for %pCr
ASoC: cirrus: i2s: Fix {TX|RX}LinCtrlData setup
ASoC: cirrus: i2s: Fix LRCLK configuration
ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it
1wire: family module autoload fails because of upper/lower case mismatch.
usb: do not reset if a low-speed or full-speed device timed out
signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version
m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap()
x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
fs/binfmt_misc.c: do not allow offset overflow
w1: mxc_w1: Enable clock before calling clk_get_rate() on it
libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk
libata: zpodd: small read overflow in eject_tray()
libata: zpodd: make arrays cdb static, reduces object code size
cpufreq: Fix new policy initialization during limits updates via sysfs
ALSA: hda: add dock and led support for HP ProBook 640 G4
ALSA: hda: add dock and led support for HP EliteBook 830 G5
ALSA: hda - Handle kzalloc() failure in snd_hda_attach_pcm_stream()
btrfs: scrub: Don't use inode pages for device replace
driver core: Don't ignore class_dir_create_and_add() failure.
ext4: fix fencepost error in check for inode count overflow during resize
ext4: update mtime in ext4_punch_hole even if no blocks are released
tcp: verify the checksum of the first data segment in a new connection
bonding: re-evaluate force_primary when the primary slave name changes
usb: musb: fix remote wakeup racing with suspend
Btrfs: make raid6 rebuild retry more
tcp: do not overshoot window_clamp in tcp_rcv_space_adjust()
Revert "Btrfs: fix scrub to repair raid6 corruption"
net/sonic: Use dma_mapping_error()
net: qmi_wwan: Add Netgear Aircard 779S
atm: zatm: fix memcmp casting
ipvs: fix buffer overflow with sync daemon and service
netfilter: ebtables: handle string from userspace with care
xfrm6: avoid potential infinite loop in _decode_session6()
ANDROID: Add kconfig to make dm-verity check_at_most_once default enabled
ANDROID: sdcardfs: fix potential crash when reserved_mb is not zero
Change-Id: Ibcd2b6614843e4e8fd5a57acf350a9e83e1c0dbc
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAls7QB8ACgkQONu9yGCS
aT6trQ/9EO1dgc0lZO0zGCxFFiikPzzMp1auSKd99FhSaqlrCPutT5K0gBVc1rug
EvggbqWj2MBX2HZvxQR8LbGNvp7+kkM3apIdYOqyTQPvs7x03YNeuvXZUF3EFyPO
eDZ71nLuwgnEeySceJ+Z9HcVBcWR/0dEkwjhjpIJ2IO25tcecWzbqOOdzNypBIKK
EG4dGhO5JY6jLqxbEFZ9d302bGZQozOQHiDfEZz6NueI0yYVJIjQQvuLp/V0ChDg
TN+PgTOdzxIPCpZw9y4XzN4nhdsOial1xeX7agzAkZDjdbprNpbZrxjfY0NLdpQ0
4ZV3vLqIZ5rs8xuCRgNJ7yTVt6X7miw/h7TQp30qpeDuRf1SHZa4ITqMzdXJUahW
BT+XkjrrCjKxXkCH+rWy0txtouUaVwM+sKHIW0bvrOJwHM0UJXNAUppt4NrBtgtD
7Zt/FDKAHCJk1GuW3U5zXOHmgn+QkRNEndpwbUjwRowvHcE5jVSLLkH4XZkA0+SL
ucQCxOqGKrbHjhyXT+e2Kpx4Z5sqJIUHhc4iw6gi7xyaoJ55kHZ2S+sCwo3cjreq
B43SrwkQ0EJXwHzcrmvDfnvEFf7ylDVWH597lQsIQMNI7Gg04fXixYpvr6DYOBSN
AKHvoqd7VztHnX/ZogyLXp4jWiU5dU6qYXdj/zEs+tB8DYPZ4+c=
=Mli0
-----END PGP SIGNATURE-----
Merge 4.4.139 into android-4.4
Changes in 4.4.139
xfrm6: avoid potential infinite loop in _decode_session6()
netfilter: ebtables: handle string from userspace with care
ipvs: fix buffer overflow with sync daemon and service
atm: zatm: fix memcmp casting
net: qmi_wwan: Add Netgear Aircard 779S
net/sonic: Use dma_mapping_error()
Revert "Btrfs: fix scrub to repair raid6 corruption"
tcp: do not overshoot window_clamp in tcp_rcv_space_adjust()
Btrfs: make raid6 rebuild retry more
usb: musb: fix remote wakeup racing with suspend
bonding: re-evaluate force_primary when the primary slave name changes
tcp: verify the checksum of the first data segment in a new connection
ext4: update mtime in ext4_punch_hole even if no blocks are released
ext4: fix fencepost error in check for inode count overflow during resize
driver core: Don't ignore class_dir_create_and_add() failure.
btrfs: scrub: Don't use inode pages for device replace
ALSA: hda - Handle kzalloc() failure in snd_hda_attach_pcm_stream()
ALSA: hda: add dock and led support for HP EliteBook 830 G5
ALSA: hda: add dock and led support for HP ProBook 640 G4
cpufreq: Fix new policy initialization during limits updates via sysfs
libata: zpodd: make arrays cdb static, reduces object code size
libata: zpodd: small read overflow in eject_tray()
libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk
w1: mxc_w1: Enable clock before calling clk_get_rate() on it
fs/binfmt_misc.c: do not allow offset overflow
x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap()
serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version
signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
usb: do not reset if a low-speed or full-speed device timed out
1wire: family module autoload fails because of upper/lower case mismatch.
ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it
ASoC: cirrus: i2s: Fix LRCLK configuration
ASoC: cirrus: i2s: Fix {TX|RX}LinCtrlData setup
lib/vsprintf: Remove atomic-unsafe support for %pCr
mips: ftrace: fix static function graph tracing
branch-check: fix long->int truncation when profiling branches
ipmi:bt: Set the timeout before doing a capabilities check
Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader
fuse: atomic_o_trunc should truncate pagecache
fuse: don't keep dead fuse_conn at fuse_fill_super().
fuse: fix control dir setup and teardown
powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch
powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
powerpc/ptrace: Fix enforcement of DAWR constraints
cpuidle: powernv: Fix promotion from snooze if next state disabled
powerpc/fadump: Unregister fadump on kexec down path.
ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size
of: unittest: for strings, account for trailing \0 in property length field
IB/qib: Fix DMA api warning with debug kernel
RDMA/mlx4: Discard unknown SQP work requests
mtd: cfi_cmdset_0002: Change write buffer to check correct value
mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
MIPS: io: Add barrier after register read in inX()
time: Make sure jiffies_to_msecs() preserves non-zero time periods
Btrfs: fix clone vs chattr NODATASUM race
iio:buffer: make length types match kfifo types
scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails
scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler
scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF
scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED
scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED
scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread
linvdimm, pmem: Preserve read-only setting for pmem devices
md: fix two problems with setting the "re-add" device state.
ubi: fastmap: Cancel work upon detach
UBIFS: Fix potential integer overflow in allocation
xfrm: Ignore socket policies when rebuilding hash tables
xfrm: skip policies marked as dead while rehashing
backlight: as3711_bl: Fix Device Tree node lookup
backlight: max8925_bl: Fix Device Tree node lookup
backlight: tps65217_bl: Fix Device Tree node lookup
mfd: intel-lpss: Program REMAP register in PIO mode
perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
perf intel-pt: Fix MTC timing after overflow
perf intel-pt: Fix "Unexpected indirect branch" error
perf intel-pt: Fix packet decoding of CYC packets
media: v4l2-compat-ioctl32: prevent go past max size
media: cx231xx: Add support for AverMedia DVD EZMaker 7
media: dvb_frontend: fix locking issues at dvb_frontend_get_event()
nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdir
NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message
video: uvesafb: Fix integer overflow in allocation
Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID
xen: Remove unnecessary BUG_ON from __unbind_from_irq()
udf: Detect incorrect directory size
Input: elan_i2c_smbus - fix more potential stack buffer overflows
Input: elantech - enable middle button of touchpads on ThinkPad P52
Input: elantech - fix V4 report decoding for module with middle key
ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
Btrfs: fix unexpected cow in run_delalloc_nocow
spi: Fix scatterlist elements size in spi_map_buf
block: Fix transfer when chunk sectors exceeds max
dm thin: handle running out of data space vs concurrent discard
cdc_ncm: avoid padding beyond end of skb
Bluetooth: Fix connection if directed advertising and privacy is used
Linux 4.4.139
Change-Id: I93013bedf2ebe3e6a8718972d8854723609963cc
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 5811375325420052fcadd944792a416a43072b7f upstream.
Fstests generic/475 provides a way to fail metadata reads while
checking if checksum exists for the inode inside run_delalloc_nocow(),
and csum_exist_in_range() interprets error (-EIO) as inode having
checksum and makes its caller enter the cow path.
In case of free space inode, this ends up with a warning in
cow_file_range().
The same problem applies to btrfs_cross_ref_exist() since it may also
read metadata in between.
With this, run_delalloc_nocow() bails out when errors occur at the two
places.
cc: <stable@vger.kernel.org> v2.6.28+
Fixes: 17d217fe97 ("Btrfs: fix nodatasum handling in balancing code")
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa65653e575fbd958bdf5fb9c4a71a324e39510d upstream.
Detect when a directory entry is (possibly partially) beyond directory
size and return EIO in that case since it means the filesystem is
corrupted. Otherwise directory operations can further corrupt the
directory and possibly also oops the kernel.
CC: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
CC: stable@vger.kernel.org
Reported-and-tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d68894800ec5712d7ddf042356f11e36f87d7f78 upstream.
In nfs_idmap_read_and_verify_message there is an incorrect sprintf '%d'
that converts the __u32 'im_id' from struct idmap_msg to 'id_str', which
is a stack char array variable of length NFS_UINT_MAXLEN == 11.
If a uid or gid value is > 2147483647 = 0x7fffffff, the conversion
overflows into a negative value, for example:
crash> p (unsigned) (0x80000000)
$1 = 2147483648
crash> p (signed) (0x80000000)
$2 = -2147483648
The '-' sign is written to the buffer and this causes a 1 byte overflow
when the NULL byte is written, which corrupts kernel stack memory. If
CONFIG_CC_STACKPROTECTOR_STRONG is set we see a stack-protector panic:
[11558053.616565] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffa05b8a8c
[11558053.639063] CPU: 6 PID: 9423 Comm: rpc.idmapd Tainted: G W ------------ T 3.10.0-514.el7.x86_64 #1
[11558053.641990] Hardware name: Red Hat OpenStack Compute, BIOS 1.10.2-3.el7_4.1 04/01/2014
[11558053.644462] ffffffff818c7bc0 00000000b1f3aec1 ffff880de0f9bd48 ffffffff81685eac
[11558053.646430] ffff880de0f9bdc8 ffffffff8167f2b3 ffffffff00000010 ffff880de0f9bdd8
[11558053.648313] ffff880de0f9bd78 00000000b1f3aec1 ffffffff811dcb03 ffffffffa05b8a8c
[11558053.650107] Call Trace:
[11558053.651347] [<ffffffff81685eac>] dump_stack+0x19/0x1b
[11558053.653013] [<ffffffff8167f2b3>] panic+0xe3/0x1f2
[11558053.666240] [<ffffffff811dcb03>] ? kfree+0x103/0x140
[11558053.682589] [<ffffffffa05b8a8c>] ? idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4]
[11558053.689710] [<ffffffff810855db>] __stack_chk_fail+0x1b/0x30
[11558053.691619] [<ffffffffa05b8a8c>] idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4]
[11558053.693867] [<ffffffffa00209d6>] rpc_pipe_write+0x56/0x70 [sunrpc]
[11558053.695763] [<ffffffff811fe12d>] vfs_write+0xbd/0x1e0
[11558053.702236] [<ffffffff810acccc>] ? task_work_run+0xac/0xe0
[11558053.704215] [<ffffffff811fec4f>] SyS_write+0x7f/0xe0
[11558053.709674] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
Fix this by calling the internally defined nfs_map_numeric_to_string()
function which properly uses '%u' to convert this __u32. For consistency,
also replace the one other place where snprintf is called.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reported-by: Stephen Johnston <sjohnsto@redhat.com>
Fixes: cf4ab538f1 ("NFSv4: Fix the string length returned by the idmapper")
Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c2ece6ef67e9d376f32823086169b489c422ed0 upstream.
nfsd4_readdir_rsize restricts rd_maxcount to svc_max_payload when
estimating the size of the readdir reply, but nfsd_encode_readdir
restricts it to INT_MAX when encoding the reply. This can result in log
messages like "kernel: RPC request reserved 32896 but used 1049444".
Restrict rd_dircount similarly (no reason it should be larger than
svc_max_payload).
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 353748a359f1821ee934afc579cf04572406b420 upstream.
There is potential for the size and len fields in ubifs_data_node to be
too large causing either a negative value for the length fields or an
integer overflow leading to an incorrect memory allocation. Likewise,
when the len field is small, an integer underflow may occur.
Signed-off-by: Silvio Cesare <silvio.cesare@gmail.com>
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5c40d598f5408bd0ca22dfffa82f03cd9433f23 upstream.
In btrfs_clone_files(), we must check the NODATASUM flag while the
inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags()
will change the flags after we check and we can end up with a party
checksummed file.
The race window is only a few instructions in size, between the if and
the locks which is:
3834 if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode))
3835 return -EISDIR;
where the setflags must be run and toggle the NODATASUM flag (provided
the file size is 0). The clone will block on the inode lock, segflags
takes the inode lock, changes flags, releases log and clone continues.
Not impossible but still needs a lot of bad luck to hit unintentionally.
Fixes: 0e7b824c4e ("Btrfs: don't make a file partly checksummed through file clone")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ adjusted for 4.4 ]
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
commit 6becdb601bae2a043d7fb9762c4d48699528ea6e upstream.
syzbot is reporting NULL pointer dereference at fuse_ctl_remove_conn() [1].
Since fc->ctl_ndents is incremented by fuse_ctl_add_conn() when new_inode()
failed, fuse_ctl_remove_conn() reaches an inode-less dentry and tries to
clear d_inode(dentry)->i_private field.
Fix by only adding the dentry to the array after being fully set up.
When tearing down the control directory, do d_invalidate() on it to get rid
of any mounts that might have been added.
[1] https://syzkaller.appspot.com/bug?id=f396d863067238959c91c0b7cfc10b163638cac6
Reported-by: syzbot <syzbot+32c236387d66c4516827@syzkaller.appspotmail.com>
Fixes: bafa96541b ("[PATCH] fuse: add control filesystem")
Cc: <stable@vger.kernel.org> # v2.6.18
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 543b8f8662fe6d21f19958b666ab0051af9db21a upstream.
syzbot is reporting use-after-free at fuse_kill_sb_blk() [1].
Since sb->s_fs_info field is not cleared after fc was released by
fuse_conn_put() when initialization failed, fuse_kill_sb_blk() finds
already released fc and tries to hold the lock. Fix this by clearing
sb->s_fs_info field after calling fuse_conn_put().
[1] https://syzkaller.appspot.com/bug?id=a07a680ed0a9290585ca424546860464dd9658db
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+ec3986119086fe4eec97@syzkaller.appspotmail.com>
Fixes: 3b463ae0c6 ("fuse: invalidation reverse calls")
Cc: John Muir <john@jmuir.com>
Cc: Csaba Henk <csaba@gluster.com>
Cc: Anand Avati <avati@redhat.com>
Cc: <stable@vger.kernel.org> # v2.6.31
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df0e91d488276086bc07da2e389986cae0048c37 upstream.
Fuse has an "atomic_o_trunc" mode, where userspace filesystem uses the
O_TRUNC flag in the OPEN request to truncate the file atomically with the
open.
In this mode there's no need to send a SETATTR request to userspace after
the open, so fuse_do_setattr() checks this mode and returns. But this
misses the important step of truncating the pagecache.
Add the missing parts of truncation to the ATTR_OPEN branch.
Reported-by: Chad Austin <chadaustin@fb.com>
Fixes: 6ff958edbf ("fuse: add atomic open+truncate support")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cc41e099504b77014358b58567c5ea6293dd220 upstream.
WHen registering a new binfmt_misc handler, it is possible to overflow
the offset to get a negative value, which might crash the system, or
possibly leak kernel data.
Here is a crash log when 2500000000 was used as an offset:
BUG: unable to handle kernel paging request at ffff989cfd6edca0
IP: load_misc_binary+0x22b/0x470 [binfmt_misc]
PGD 1ef3e067 P4D 1ef3e067 PUD 0
Oops: 0000 [#1] SMP NOPTI
Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy
CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic #24-Ubuntu
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc]
Call Trace:
search_binary_handler+0x97/0x1d0
do_execveat_common.isra.34+0x667/0x810
SyS_execve+0x31/0x40
do_syscall_64+0x73/0x130
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Use kstrtoint instead of simple_strtoul. It will work as the code
already set the delimiter byte to '\0' and we only do it when the field
is not empty.
Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX. Also tested
with examples documented at Documentation/admin-guide/binfmt-misc.rst
and other registrations from packages on Ubuntu.
Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac0b4145d662a3b9e34085dea460fb06ede9b69b upstream.
[BUG]
Btrfs can create compressed extent without checksum (even though it
shouldn't), and if we then try to replace device containing such extent,
the result device will contain all the uncompressed data instead of the
compressed one.
Test case already submitted to fstests:
https://patchwork.kernel.org/patch/10442353/
[CAUSE]
When handling compressed extent without checksum, device replace will
goe into copy_nocow_pages() function.
In that function, btrfs will get all inodes referring to this data
extents and then use find_or_create_page() to get pages direct from that
inode.
The problem here is, pages directly from inode are always uncompressed.
And for compressed data extent, they mismatch with on-disk data.
Thus this leads to corrupted compressed data extent written to replace
device.
[FIX]
In this attempt, we could just remove the "optimization" branch, and let
unified scrub_pages() to handle it.
Although scrub_pages() won't bother reusing page cache, it will be a
little slower, but it does the correct csum checking and won't cause
such data corruption caused by "optimization".
Note about the fix: this is the minimal fix that can be backported to
older stable trees without conflicts. The whole callchain from
copy_nocow_pages() can be deleted, and will be in followup patches.
Fixes: ff023aac31 ("Btrfs: add code to scrub to copy read data to another disk")
CC: stable@vger.kernel.org # 4.4+
Reported-by: James Harvey <jamespharvey20@gmail.com>
Reviewed-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ remove code removal, add note why ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f2f76f751433908364ccff82f437a57d0e6e9b7 upstream.
ext4_resize_fs() has an off-by-one bug when checking whether growing of
a filesystem will not overflow inode count. As a result it allows a
filesystem with 8192 inodes per group to grow to 64TB which overflows
inode count to 0 and makes filesystem unusable. Fix it.
Cc: stable@vger.kernel.org
Fixes: 3f8a6411fb
Reported-by: Jaco Kroon <jaco@uls.co.za>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eee597ac931305eff3d3fd1d61d6aae553bc0984 upstream.
Currently in ext4_punch_hole we're going to skip the mtime update if
there are no actual blocks to release. However we've actually modified
the file by zeroing the partial block so the mtime should be updated.
Moreover the sync and datasync handling is skipped as well, which is
also wrong. Fix it.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Joe Habermann <joe.habermann@quantum.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8810f7517a3bc4ca2d41d022446d3f5fd6b77c09 ]
There is a scenario that can end up with rebuild process failing to
return good content, i.e.
suppose that all disks can be read without problems and if the content
that was read out doesn't match its checksum, currently for raid6
btrfs at most retries twice,
- the 1st retry is to rebuild with all other stripes, it'll eventually
be a raid5 xor rebuild,
- if the 1st fails, the 2nd retry will deliberately fail parity p so
that it will do raid6 style rebuild,
however, the chances are that another non-parity stripe content also
has something corrupted, so that the above retries are not able to
return correct content, and users will think of this as data loss.
More seriouly, if the loss happens on some important internal btree
roots, it could refuse to mount.
This extends btrfs to do more retries and each retry fails only one
stripe. Since raid6 can tolerate 2 disk failures, if there is one
more failure besides the failure on which we're recovering, this can
always work.
The worst case is to retry as many times as the number of raid6 disks,
but given the fact that such a scenario is really rare in practice,
it's still acceptable.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 95b286daf7.
This commit used an incorrect log message.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Log the crypto algorithm driver name for each fscrypt encryption mode on
its first use, also showing a friendly name for the mode.
This will help people determine whether the expected implementations are
being used. In some cases we've seen people do benchmarks and reject
using encryption for performance reasons, when in fact they used a much
slower implementation of AES-XTS than was possible on the hardware. It
can make an enormous difference; e.g., AES-XTS on ARM is about 10x
faster with the crypto extensions (AES instructions) than without.
This also makes it more obvious which modes are being used, now that
fscrypt supports multiple combinations of modes.
Example messages (with default modes, on x86_64):
[ 35.492057] fscrypt: AES-256-CTS-CBC using implementation "cts(cbc-aes-aesni)"
[ 35.492171] fscrypt: AES-256-XTS using implementation "xts-aes-aesni"
Note: algorithms can be dynamically added to the crypto API, which can
result in different implementations being used at different times. But
this is rare; for most users, showing the first will be good enough.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
fscrypt currently only supports AES encryption. However, many low-end
mobile devices have older CPUs that don't have AES instructions, e.g.
the ARMv8 Cryptography Extensions. Currently, user data on such devices
is not encrypted at rest because AES is too slow, even when the NEON
bit-sliced implementation of AES is used. Unfortunately, it is
infeasible to encrypt these devices at all when AES is the only option.
Therefore, this patch updates fscrypt to support the Speck block cipher,
which was recently added to the crypto API. The C implementation of
Speck is not especially fast, but Speck can be implemented very
efficiently with general-purpose vector instructions, e.g. ARM NEON.
For example, on an ARMv7 processor, we measured the NEON-accelerated
Speck128/256-XTS at 69 MB/s for both encryption and decryption, while
AES-256-XTS with the NEON bit-sliced implementation was only 22 MB/s
encryption and 19 MB/s decryption.
There are multiple variants of Speck. This patch only adds support for
Speck128/256, which is the variant with a 128-bit block size and 256-bit
key size -- the same as AES-256. This is believed to be the most secure
variant of Speck, and it's only about 6% slower than Speck128/128.
Speck64/128 would be at least 20% faster because it has 20% rounds, and
it can be even faster on CPUs that can't efficiently do the 64-bit
operations needed for Speck128. However, Speck64's 64-bit block size is
not preferred security-wise. ARM NEON also supports the needed 64-bit
operations even on 32-bit CPUs, resulting in Speck128 being fast enough
for our targeted use cases so far.
The chosen modes of operation are XTS for contents and CTS-CBC for
filenames. These are the same modes of operation that fscrypt defaults
to for AES. Note that as with the other fscrypt modes, Speck will not
be used unless userspace chooses to use it. Nor are any of the existing
modes (which are all AES-based) being removed, of course.
We intentionally don't make CONFIG_FS_ENCRYPTION select
CONFIG_CRYPTO_SPECK, so people will have to enable Speck support
themselves if they need it. This is because we shouldn't bloat the
FS_ENCRYPTION dependencies with every new cipher, especially ones that
aren't recommended for most users. Moreover, CRYPTO_SPECK is just the
generic implementation, which won't be fast enough for many users; in
practice, they'll need to enable CRYPTO_SPECK_NEON to get acceptable
performance.
More details about our choice of Speck can be found in our patches that
added Speck to the crypto API, and the follow-on discussion threads.
We're planning a publication that explains the choice in more detail.
But briefly, we can't use ChaCha20 as we previously proposed, since it
would be insecure to use a stream cipher in this context, with potential
IV reuse during writes on f2fs and/or on wear-leveling flash storage.
We also evaluated many other lightweight and/or ARX-based block ciphers
such as Chaskey-LTS, RC5, LEA, CHAM, Threefish, RC6, NOEKEON, SPARX, and
XTEA. However, all had disadvantages vs. Speck, such as insufficient
performance with NEON, much less published cryptanalysis, or an
insufficient security level. Various design choices in Speck make it
perform better with NEON than competing ciphers while still having a
security margin similar to AES, and in the case of Speck128 also the
same available security levels. Unfortunately, Speck does have some
political baggage attached -- it's an NSA designed cipher, and was
rejected from an ISO standard (though for context, as far as I know none
of the above-mentioned alternatives are ISO standards either).
Nevertheless, we believe it is a good solution to the problem from a
technical perspective.
Certain algorithms constructed from ChaCha or the ChaCha permutation,
such as MEM (Masked Even-Mansour) or HPolyC, may also meet our
performance requirements. However, these are new constructions that
need more time to receive the cryptographic review and acceptance needed
to be confident in their security. HPolyC hasn't been published yet,
and we are concerned that MEM makes stronger assumptions about the
underlying permutation than the ChaCha stream cipher does. In contrast,
the XTS mode of operation is relatively well accepted, and Speck has
over 70 cryptanalysis papers. Of course, these ChaCha-based algorithms
can still be added later if they become ready.
The best known attack on Speck128/256 is a differential cryptanalysis
attack on 25 of 34 rounds with 2^253 time complexity and 2^125 chosen
plaintexts, i.e. only marginally faster than brute force. There is no
known attack on the full 34 rounds.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently the key derivation function in fscrypt uses the master key
length as the amount of output key material to derive. This works, but
it means we can waste time deriving more key material than is actually
used, e.g. most commonly, deriving 64 bytes for directories which only
take a 32-byte AES-256-CTS-CBC key. It also forces us to validate that
the master key length is a multiple of AES_BLOCK_SIZE, which wouldn't
otherwise be necessary.
Fix it to only derive the needed length key.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Refactor the confusingly-named function 'validate_user_key()' into a new
function 'find_and_derive_key()' which first finds the keyring key, then
does the key derivation. Among other benefits this avoids the strange
behavior we had previously where if key derivation failed for some
reason, then we would fall back to the alternate key prefix. Now, we'll
only fall back to the alternate key prefix if a valid key isn't found.
This patch also improves the warning messages that are logged when the
keyring key's payload is invalid.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use a common function for fscrypt warning and error messages so that all
the messages are consistently ratelimited, include the "fscrypt:"
prefix, and include the filesystem name if applicable.
Also fix up a few of the log messages to be more descriptive.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
With one exception, the internal key size constants such as
FS_AES_256_XTS_KEY_SIZE are only used for the 'available_modes' array,
where they really only serve to obfuscate what the values are. Also
some of the constants are unused, and the key sizes tend to be in the
names of the algorithms anyway. In the past these values were also
misused, e.g. we used to have FS_AES_256_XTS_KEY_SIZE in places that
technically should have been FS_MAX_KEY_SIZE.
The exception is that FS_AES_128_ECB_KEY_SIZE is used for key
derivation. But it's more appropriate to use
FS_KEY_DERIVATION_NONCE_SIZE for that instead.
Thus, just put the sizes directly in the 'available_modes' array.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We're passing 'key_type_logon' to request_key(), so the found key is
guaranteed to be of type "logon". Thus, there is no reason to check
later that the key is really a "logon" key.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now ->max_namelen() is only called to limit the filename length when
adding NUL padding, and only for real filenames -- not symlink targets.
It also didn't give the correct length for symlink targets anyway since
it forgot to subtract 'sizeof(struct fscrypt_symlink_data)'.
Thus, change ->max_namelen from a function to a simple 'unsigned int'
that gives the filesystem's maximum filename length.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
fname_decrypt() is validating that the encrypted filename is nonempty.
However, earlier a stronger precondition was already enforced: the
encrypted filename must be at least 16 (FS_CRYPTO_BLOCK_SIZE) bytes.
Drop the redundant check for an empty filename.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
fname_decrypt() returns an error if the input filename is longer than
the inode's ->max_namelen() as given by the filesystem. But, this
doesn't actually make sense because the filesystem provided the input
filename in the first place, where it was subject to the filesystem's
limits. And fname_decrypt() has no internal limit itself.
Thus, remove this unnecessary check.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In fscrypt_setup_filename(), remove the unnecessary check for
fscrypt_get_encryption_info() returning EOPNOTSUPP. There's no reason
to handle this error differently from any other. I think there may have
been some confusion because the "notsupp" version of
fscrypt_get_encryption_info() returns EOPNOTSUPP -- but that's not
applicable from inside fs/crypto/.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
fscrypt is clearing the flags on the crypto_skcipher it allocates for
each inode. But, this is unnecessary and may cause problems in the
future because it will even clear flags that are meant to be internal to
the crypto API, e.g. CRYPTO_TFM_NEED_KEY.
Remove the unnecessary flag clearing.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
skcipher_request_alloc() can only fail due to lack of memory, and in
that case the memory allocator will have already printed a detailed
error message. Thus, remove the redundant error messages from fscrypt.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
crypto_alloc_skcipher() returns an ERR_PTR() on failure, not NULL.
Remove the unnecessary check for NULL.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now that all filesystems have been converted to use
fscrypt_prepare_lookup(), we can remove the fscrypt_set_d_op() and
fscrypt_set_encrypted_dentry() functions as well as un-export
fscrypt_d_ops.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Improve fscrypt read performance by switching the decryption workqueue
from bound to unbound. With the bound workqueue, when multiple bios
completed on the same CPU, they were decrypted on that same CPU. But
with the unbound queue, they are now decrypted in parallel on any CPU.
Although fscrypt read performance can be tough to measure due to the
many sources of variation, this change is most beneficial when
decryption is slow, e.g. on CPUs without AES instructions. For example,
I timed tarring up encrypted directories on f2fs. On x86 with AES-NI
instructions disabled, the unbound workqueue improved performance by
about 25-35%, using 1 to NUM_CPUs jobs with 4 or 8 CPUs available. But
with AES-NI enabled, performance was unchanged to within ~2%.
I also did the same test on a quad-core ARM CPU using xts-speck128-neon
encryption. There performance was usually about 10% better with the
unbound workqueue, bringing it closer to the unencrypted speed.
The unbound workqueue may be worse in some cases due to worse locality,
but I think it's still the better default. dm-crypt uses an unbound
workqueue by default too, so this change makes fscrypt match.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* refs/heads/tmp-a2e2217
Linux 4.4.137
net: metrics: add proper netlink validation
net: phy: broadcom: Fix bcm_write_exp()
rtnetlink: validate attributes in do_setlink()
team: use netdev_features_t instead of u32
net/mlx4: Fix irq-unsafe spinlock usage
qed: Fix mask for physical address in ILT entry
packet: fix reserve calculation
net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
net/packet: refine check for priv area size
netdev-FAQ: clarify DaveM's position for stable backports
isdn: eicon: fix a missing-check bug
ipv4: remove warning in ip_recv_error
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
enic: set DMA mask to 47 bit
dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect()
bnx2x: use the right constant
brcmfmac: Fix check for ISO3166 code
drm: set FMODE_UNSIGNED_OFFSET for drm files
xfs: fix incorrect log_flushed on fsync
kconfig: Avoid format overflow warning from GCC 8.1
mmap: relax file size limit for regular files
mmap: introduce sane default mmap limits
tpm: self test failure should not cause suspend to fail
tpm: do not suspend/resume if power stays on
ANDROID: Update arm64 ranchu64_defconfig
Linux 4.4.136
sparc64: Fix build warnings with gcc 7.
mm: fix the NULL mapping case in __isolate_lru_page()
fix io_destroy()/aio_complete() race
Kbuild: change CC_OPTIMIZE_FOR_SIZE definition
drm/i915: Disable LVDS on Radiant P845
hwtracing: stm: fix build error on some arches
stm class: Use vmalloc for the master map
scsi: scsi_transport_srp: Fix shost to rport translation
MIPS: prctl: Disallow FRE without FR with PR_SET_FP_MODE requests
MIPS: ptrace: Fix PTRACE_PEEKUSR requests for 64-bit FGRs
iio:kfifo_buf: check for uint overflow
dmaengine: usb-dmac: fix endless loop in usb_dmac_chan_terminate_all()
i2c: rcar: revoke START request early
i2c: rcar: check master irqs before slave irqs
i2c: rcar: don't issue stop when HW does it automatically
i2c: rcar: init new messages in irq
i2c: rcar: refactor setup of a msg
i2c: rcar: remove spinlock
i2c: rcar: remove unused IOERROR state
i2c: rcar: rework hw init
i2c: rcar: make sure clocks are on when doing clock calculation
tcp: avoid integer overflows in tcp_rcv_space_adjust()
irda: fix overly long udelay()
ASoC: Intel: sst: remove redundant variable dma_dev_name
rtlwifi: rtl8192cu: Remove variable self-assignment in rf.c
cfg80211: further limit wiphy names to 64 bytes
selinux: KASAN: slab-out-of-bounds in xattr_getsecurity
tracing: Fix crash when freeing instances with event triggers
Input: elan_i2c_smbus - fix corrupted stack
Revert "ima: limit file hash setting by user to fix and log modes"
xfs: detect agfl count corruption and reset agfl
sh: New gcc support
USB: serial: cp210x: use tcflag_t to fix incompatible pointer type
powerpc/64s: Clear PCR on boot
arm64: lse: Add early clobbers to some input/output asm operands
FROMLIST: f2fs: run fstrim asynchronously if runtime discard is on
goldfish: pipe: ANDROID: address must be written as __pa(x), not x
goldfish: pipe: ANDROID: add missing check for memory allocated
goldfish: pipe: ANDROID: remove redundant blank lines
Update arch/x86/configs/x86_64_ranchu_defconfig
ANDROID: x86_64_cuttlefish_defconfig: Enable F2FS
ANDROID: Update x86_64_cuttlefish_defconfig
FROMLIST: f2fs: early updates queued for v4.18-rc1
Change-Id: I314254168cd5ad06a7c6bca2fa68c8a6ae6c257d
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
* refs/heads/tmp-c9d74f2
Linux 4.4.135
Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
Linux 4.4.134
s390/ftrace: use expoline for indirect branches
kdb: make "mdr" command repeat
Bluetooth: btusb: Add device ID for RTL8822BE
ASoC: samsung: i2s: Ensure the RCLK rate is properly determined
regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()'
scsi: lpfc: Fix frequency of Release WQE CQEs
scsi: lpfc: Fix soft lockup in lpfc worker thread during LIP testing
scsi: lpfc: Fix issue_lip if link is disabled
netlabel: If PF_INET6, check sk_buff ip header version
selftests/net: fixes psock_fanout eBPF test case
perf report: Fix memory corruption in --branch-history mode --branch-history
perf tests: Use arch__compare_symbol_names to compare symbols
x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified
drm/rockchip: Respect page offset for PRIME mmap calls
MIPS: Octeon: Fix logging messages with spurious periods after newlines
audit: return on memory error to avoid null pointer dereference
crypto: sunxi-ss - Add MODULE_ALIAS to sun4i-ss
clk: samsung: exynos3250: Fix PLL rates
clk: samsung: exynos5250: Fix PLL rates
clk: samsung: exynos5433: Fix PLL rates
clk: samsung: exynos5260: Fix PLL rates
clk: samsung: s3c2410: Fix PLL rates
media: cx25821: prevent out-of-bounds read on array card
udf: Provide saner default for invalid uid / gid
PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
serial: arc_uart: Fix out-of-bounds access through DT alias
serial: fsl_lpuart: Fix out-of-bounds access through DT alias
serial: imx: Fix out-of-bounds access through serial port index
serial: mxs-auart: Fix out-of-bounds access through serial port index
serial: samsung: Fix out-of-bounds access through serial port index
serial: xuartps: Fix out-of-bounds access through DT alias
rtc: tx4939: avoid unintended sign extension on a 24 bit shift
staging: rtl8192u: return -ENOMEM on failed allocation of priv->oldaddr
hwrng: stm32 - add reset during probe
enic: enable rq before updating rq descriptors
clk: rockchip: Prevent calculating mmc phase if clock rate is zero
media: em28xx: USB bulk packet size fix
dmaengine: pl330: fix a race condition in case of threaded irqs
media: s3c-camif: fix out-of-bounds array access
media: cx23885: Set subdev host data to clk_freq pointer
media: cx23885: Override 888 ImpactVCBe crystal frequency
ALSA: vmaster: Propagate slave error
x86/devicetree: Fix device IRQ settings in DT
x86/devicetree: Initialize device tree before using it
usb: gadget: composite: fix incorrect handling of OS desc requests
usb: gadget: udc: change comparison to bitshift when dealing with a mask
gfs2: Fix fallocate chunk size
cdrom: do not call check_disk_change() inside cdrom_open()
hwmon: (pmbus/adm1275) Accept negative page register values
hwmon: (pmbus/max8688) Accept negative page register values
perf/core: Fix perf_output_read_group()
ASoC: topology: create TLV data for dapm widgets
powerpc: Add missing prototype for arch_irq_work_raise()
usb: gadget: ffs: Execute copy_to_user() with USER_DS set
usb: gadget: ffs: Let setup() return USB_GADGET_DELAYED_STATUS
usb: dwc2: Fix interval type issue
ipmi_ssif: Fix kernel panic at msg_done_handler
PCI: Restore config space on runtime resume despite being unbound
MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset
xhci: zero usb device slot_id member when disabling and freeing a xhci slot
KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use
i2c: mv64xxx: Apply errata delay only in standard mode
ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
ACPICA: Events: add a return on failure from acpi_hw_register_read
bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
zorro: Set up z->dev.dma_mask for the DMA API
clk: Don't show the incorrect clock phase
cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path
usb: dwc3: Update DWC_usb31 GTXFIFOSIZ reg fields
arm: dts: socfpga: fix GIC PPI warning
virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
ima: Fallback to the builtin hash algorithm
ima: Fix Kconfig to select TPM 2.0 CRB interface
ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk)
net/mlx5: Protect from command bit overflow
selftests: Print the test we're running to /dev/kmsg
tools/thermal: tmon: fix for segfault
powerpc/perf: Fix kernel address leak via sampling registers
powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer
rtc: hctosys: Ensure system time doesn't overflow time_t
hwmon: (nct6775) Fix writing pwmX_mode
parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode
m68k: set dma and coherent masks for platform FEC ethernets
powerpc/mpic: Check if cpu_possible() in mpic_physmask()
ACPI: acpi_pad: Fix memory leak in power saving threads
xen/acpi: off by one in read_acpi_id()
btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers
Btrfs: fix copy_items() return value when logging an inode
btrfs: tests/qgroup: Fix wrong tree backref level
Bluetooth: btusb: Add USB ID 7392:a611 for Edimax EW-7611ULB
net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
rtc: snvs: Fix usage of snvs_rtc_enable
sparc64: Make atomic_xchg() an inline function rather than a macro.
fscache: Fix hanging wait on page discarded by writeback
KVM: VMX: raise internal error for exception during invalid protected mode state
sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
ocfs2/dlm: don't handle migrate lockres if already in shutdown
btrfs: Fix possible softlock on single core machines
Btrfs: fix NULL pointer dereference in log_dir_items
Btrfs: bail out on error during replay_dir_deletes
mm: fix races between address_space dereference and free in page_evicatable
mm/ksm: fix interaction with THP
dp83640: Ensure against premature access to PHY registers after reset
scsi: aacraid: Insure command thread is not recursively stopped
cpufreq: CPPC: Initialize shared perf capabilities of CPUs
Force log to disk before reading the AGF during a fstrim
sr: get/drop reference to device in revalidate and check_events
swap: divide-by-zero when zero length swap file on ssd
fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table
x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
sh: fix debug trap failure to process signals before return to user
net: mvneta: fix enable of all initialized RXQs
net: Fix untag for vlan packets without ethernet header
mm/kmemleak.c: wait for scan completion before disabling free
llc: properly handle dev_queue_xmit() return value
net-usb: add qmi_wwan if on lte modem wistron neweb d18q1
net/usb/qmi_wwan.c: Add USB id for lt4120 modem
net: qmi_wwan: add BroadMobi BM806U 2020:2033
ARM: 8748/1: mm: Define vdso_start, vdso_end as array
batman-adv: fix packet loss for broadcasted DHCP packets to a server
batman-adv: fix multicast-via-unicast transmission with AP isolation
selftests: ftrace: Add a testcase for probepoint
selftests: ftrace: Add a testcase for string type with kprobe_event
selftests: ftrace: Add probe event argument syntax testcase
mm/mempolicy.c: avoid use uninitialized preferred_node
RDMA/ucma: Correct option size check using optlen
perf/cgroup: Fix child event counting bug
vti4: Don't override MTU passed on link creation via IFLA_MTU
vti4: Don't count header length twice on tunnel setup
batman-adv: fix header size check in batadv_dbg_arp()
net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off
sunvnet: does not support GSO for sctp
ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu
workqueue: use put_device() instead of kfree()
bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa().
netfilter: ebtables: fix erroneous reject of last rule
USB: OHCI: Fix NULL dereference in HCDs using HCD_LOCAL_MEM
xen: xenbus: use put_device() instead of kfree()
fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
scsi: sd: Keep disk read-only when re-reading partition
scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM
usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers
e1000e: allocate ring descriptors with dma_zalloc_coherent
e1000e: Fix check_for_link return value with autoneg off
watchdog: f71808e_wdt: Fix magic close handling
KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
Btrfs: send, fix issuing write op when processing hole in no data mode
xen/pirq: fix error path cleanup when binding MSIs
net/tcp/illinois: replace broken algorithm reference link
gianfar: Fix Rx byte accounting for ndev stats
sit: fix IFLA_MTU ignored on NEWLINK
bcache: fix kcrashes with fio in RAID5 backend dev
dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
virtio-gpu: fix ioctl and expose the fixed status to userspace.
r8152: fix tx packets accounting
clocksource/drivers/fsl_ftm_timer: Fix error return checking
nvme-pci: Fix nvme queue cleanup if IRQ setup fails
netfilter: ebtables: convert BUG_ONs to WARN_ONs
batman-adv: invalidate checksum on fragment reassembly
batman-adv: fix packet checksum in receive path
md/raid1: fix NULL pointer dereference
media: dmxdev: fix error code for invalid ioctls
x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
regulatory: add NUL to request alpha2
smsc75xx: fix smsc75xx_set_features()
ARM: OMAP: Fix dmtimer init for omap1
s390/cio: clear timer when terminating driver I/O
s390/cio: fix return code after missing interrupt
powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
md: raid5: avoid string overflow warning
locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
usb: musb: fix enumeration after resume
drm/exynos: fix comparison to bitshift when dealing with a mask
md raid10: fix NULL deference in handle_write_completed()
mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4
NFC: llcp: Limit size of SDP URI
ARM: OMAP1: clock: Fix debugfs_create_*() usage
ARM: OMAP3: Fix prm wake interrupt for resume
ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
scsi: qla4xxx: skip error recovery in case of register disconnect.
scsi: aacraid: fix shutdown crash when init fails
scsi: storvsc: Increase cmd_per_lun for higher speed devices
selftests: memfd: add config fragment for fuse
usb: dwc2: Fix dwc2_hsotg_core_init_disconnected()
usb: gadget: fsl_udc_core: fix ep valid checks
usb: gadget: f_uac2: fix bFirstInterface in composite gadget
ARC: Fix malformed ARC_EMUL_UNALIGNED default
scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
scsi: sym53c8xx_2: iterator underflow in sym_getsync()
scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
irqchip/gic-v3: Change pr_debug message to pr_devel
locking/qspinlock: Ensure node->count is updated before initialising node
tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
bcache: return attach error when no cache set exist
bcache: fix for data collapse after re-attaching an attached device
bcache: fix for allocator and register thread race
bcache: properly set task state in bch_writeback_thread()
cifs: silence compiler warnings showing up with gcc-8.0.0
proc: fix /proc/*/map_files lookup
arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics
RDS: IB: Fix null pointer issue
xen/grant-table: Use put_page instead of free_page
xen-netfront: Fix race between device setup and open
MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS
bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y
ACPI: processor_perflib: Do not send _PPC change notification if not ready
firmware: dmi_scan: Fix handling of empty DMI strings
x86/power: Fix swsusp_arch_resume prototype
IB/ipoib: Fix for potential no-carrier state
mm: pin address_space before dereferencing it while isolating an LRU page
asm-generic: provide generic_pmdp_establish()
mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
mm/mempolicy: fix the check of nodemask from user
ocfs2: return error when we attempt to access a dirty bh in jbd2
ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute
ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid
ntb_transport: Fix bug with max_mw_size parameter
RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure
powerpc/numa: Ensure nodes initialized for hotplug
powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
HID: roccat: prevent an out of bounds read in kovaplus_profile_activated()
scsi: fas216: fix sense buffer initialization
Btrfs: fix scrub to repair raid6 corruption
btrfs: Fix out of bounds access in btrfs_search_slot
Btrfs: set plug for fsync
ipmi/powernv: Fix error return code in ipmi_powernv_probe()
mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()
kconfig: Fix expr_free() E_NOT leak
kconfig: Fix automatic menu creation mem leak
kconfig: Don't leak main menus during parsing
watchdog: sp5100_tco: Fix watchdog disable bit
nfs: Do not convert nfs_idmap_cache_timeout to jiffies
dm thin: fix documentation relative to low water mark threshold
tools lib traceevent: Fix get_field_str() for dynamic strings
perf callchain: Fix attr.sample_max_stack setting
tools lib traceevent: Simplify pointer print logic and fix %pF
PCI: Add function 1 DMA alias quirk for Marvell 9128
tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account
kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read()
ALSA: hda - Use IS_REACHABLE() for dependency on input
NFSv4: always set NFS_LOCK_LOST when a lock is lost.
firewire-ohci: work around oversized DMA reads on JMicron controllers
do d_instantiate/unlock_new_inode combinations safely
xfs: remove racy hasattr check from attr ops
kernel/signal.c: avoid undefined behaviour in kill_something_info
kernel/sys.c: fix potential Spectre v1 issue
kasan: fix memory hotplug during boot
ipc/shm: fix shmat() nil address after round-down when remapping
Revert "ipc/shm: Fix shmat mmap nil-page protection"
xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
libata: blacklist Micron 500IT SSD with MU01 firmware
libata: Blacklist some Sandisk SSDs for NCQ
mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
ALSA: timer: Fix pause event notification
aio: fix io_destroy(2) vs. lookup_ioctx() race
affs_lookup(): close a race with affs_remove_link()
KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
MIPS: ptrace: Expose FIR register through FP regset
UPSTREAM: sched/fair: Consider RT/IRQ pressure in capacity_spare_wake
Conflicts:
drivers/media/dvb-core/dmxdev.c
drivers/scsi/sd.c
drivers/scsi/ufs/ufshcd.c
drivers/usb/gadget/function/f_fs.c
fs/ecryptfs/inode.c
Change-Id: I15751ed8c82ec65ba7eedcb0d385b9f803c333f7
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
sdcardfs_mkdir() calls check_min_free_space(). When reserved_mb is not zero, a negative dentry will be passed to
ext4_statfs() at last and ext4_statfs() will crash. The parent dentry is positive. So we use the parent dentry to
check free space.
Change-Id: I80ab9623fe59ba911f4cc9f0e029a1c6f7ee421b
Signed-off-by: Lianjun Huang <huanglianjun@vivo.com>
commit 362f924b64ba0f4be2ee0cb697690c33d40be721 upstream.
Those are stupid and code should use static_cpu_has_safe() or
boot_cpu_has() instead. Kill the least used and unused ones.
The remaining ones need more careful inspection before a conversion can
happen. On the TODO.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de
Cc: David Sterba <dsterba@suse.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlshJwIACgkQONu9yGCS
aT79XQ//S3GUXgBKG87+4HlPW0bebORgAbMTw47VFAm09jKRlaI65TXpT+EaXy+R
iuHeyrie9SmN6eG2P5o25txl0NRL/WoZcKHBRdT3P8of19iG64g5zPd2meWgO8vO
TtGy2fTNnmYoUEcCtDIQayRTPkeWbLUNE56grA1/LtxEeCmLTsm7tzhxXy+qzQSg
YdlpKeTMTy9yUWc8Dt7Mt7Njmq+hj7HUVs42fvfddcW2y4jZl2rKnZ6HN2uADIJx
sKyzeN1i8LMBdP216kg785jeBaaMnk01hhd/F+D+S+qTBrHbacO+reO9CkEEhovM
UaneMG2j3t3THdaPs+amx/39IX3t+duiyyz+zI6kKlYJ9WrhbBuOJhG97B6jbnN2
2QNs7Ll6cucRwcOY6pMbIzh46bIsUtUODB1/gLn4ALiB8OBWh78qH5CISBwpzGVY
UNtBvV+nk9/aoo1BwdD2IZ6QQtafVwK3pzsb3bdGVJWZPkeGKbsQ3Y8ffGIdAbQg
6MsBOCyF9t+dSk9ShGwdNr5vOp7VfmbOoUtHBp+czYFObWJsMzN5bG2dfvqQU0Sd
LH0AkA4j4DWJQpl9OVhBCuUJ/SMk693b6wlO2dMnLsQg/I8j2eihtc5CTcJ6pH1V
BjO6Uc3MwqmyR4/vF8acYKhL1zY6wtZtVkTTiKDoEuyDtZTTaqs=
=Gmn/
-----END PGP SIGNATURE-----
Merge 4.4.137 into android-4.4
Changes in 4.4.137
tpm: do not suspend/resume if power stays on
tpm: self test failure should not cause suspend to fail
mmap: introduce sane default mmap limits
mmap: relax file size limit for regular files
kconfig: Avoid format overflow warning from GCC 8.1
xfs: fix incorrect log_flushed on fsync
drm: set FMODE_UNSIGNED_OFFSET for drm files
brcmfmac: Fix check for ISO3166 code
bnx2x: use the right constant
dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect()
enic: set DMA mask to 47 bit
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
ipv4: remove warning in ip_recv_error
isdn: eicon: fix a missing-check bug
netdev-FAQ: clarify DaveM's position for stable backports
net/packet: refine check for priv area size
net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
packet: fix reserve calculation
qed: Fix mask for physical address in ILT entry
net/mlx4: Fix irq-unsafe spinlock usage
team: use netdev_features_t instead of u32
rtnetlink: validate attributes in do_setlink()
net: phy: broadcom: Fix bcm_write_exp()
net: metrics: add proper netlink validation
Linux 4.4.137
Change-Id: I247cc9905e330810546f7105bdf723bf84c3308f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 47c7d0b19502583120c3f396c7559e7a77288a68 upstream.
When calling into _xfs_log_force{,_lsn}() with a pointer
to log_flushed variable, log_flushed will be set to 1 if:
1. xlog_sync() is called to flush the active log buffer
AND/OR
2. xlog_wait() is called to wait on a syncing log buffers
xfs_file_fsync() checks the value of log_flushed after
_xfs_log_force_lsn() call to optimize away an explicit
PREFLUSH request to the data block device after writing
out all the file's pages to disk.
This optimization is incorrect in the following sequence of events:
Task A Task B
-------------------------------------------------------
xfs_file_fsync()
_xfs_log_force_lsn()
xlog_sync()
[submit PREFLUSH]
xfs_file_fsync()
file_write_and_wait_range()
[submit WRITE X]
[endio WRITE X]
_xfs_log_force_lsn()
xlog_wait()
[endio PREFLUSH]
The write X is not guarantied to be on persistent storage
when PREFLUSH request in completed, because write A was submitted
after the PREFLUSH request, but xfs_file_fsync() of task A will
be notified of log_flushed=1 and will skip explicit flush.
If the system crashes after fsync of task A, write X may not be
present on disk after reboot.
This bug was discovered and demonstrated using Josef Bacik's
dm-log-writes target, which can be used to record block io operations
and then replay a subset of these operations onto the target device.
The test goes something like this:
- Use fsx to execute ops of a file and record ops on log device
- Every now and then fsync the file, store md5 of file and mark
the location in the log
- Then replay log onto device for each mark, mount fs and compare
md5 of file to stored value
Cc: Christoph Hellwig <hch@lst.de>
Cc: Josef Bacik <jbacik@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4faa99965e027cc057c5145ce45fa772caa04e8d upstream.
If io_destroy() gets to cancelling everything that can be cancelled and
gets to kiocb_cancel() calling the function driver has left in ->ki_cancel,
it becomes vulnerable to a race with IO completion. At that point req
is already taken off the list and aio_complete() does *NOT* spin until
we (in free_ioctx_users()) releases ->ctx_lock. As the result, it proceeds
to kiocb_free(), freing req just it gets passed to ->ki_cancel().
Fix is simple - remove from the list after the call of kiocb_cancel(). All
instances of ->ki_cancel() already have to cope with the being called with
iocb still on list - that's what happens in io_cancel(2).
Cc: stable@kernel.org
Fixes: 0460fef2a9 "aio: use cancellation list lazily"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a27ba2607e60312554cbcd43fc660b2c7f29dc9c upstream.
The struct xfs_agfl v5 header was originally introduced with
unexpected padding that caused the AGFL to operate with one less
slot than intended. The header has since been packed, but the fix
left an incompatibility for users who upgrade from an old kernel
with the unpacked header to a newer kernel with the packed header
while the AGFL happens to wrap around the end. The newer kernel
recognizes one extra slot at the physical end of the AGFL that the
previous kernel did not. The new kernel will eventually attempt to
allocate a block from that slot, which contains invalid data, and
cause a crash.
This condition can be detected by comparing the active range of the
AGFL to the count. While this detects a padding mismatch, it can
also trigger false positives for unrelated flcount corruption. Since
we cannot distinguish a size mismatch due to padding from unrelated
corruption, we can't trust the AGFL enough to simply repopulate the
empty slot.
Instead, avoid unnecessarily complex detection logic and and use a
solution that can handle any form of flcount corruption that slips
through read verifiers: distrust the entire AGFL and reset it to an
empty state. Any valid blocks within the AGFL are intentionally
leaked. This requires xfs_repair to rectify (which was already
necessary based on the state the AGFL was found in). The reset
mitigates the side effect of the padding mismatch problem from a
filesystem crash to a free space accounting inconsistency. The
generic approach also means that this patch can be safely backported
to kernels with or without a packed struct xfs_agfl.
Check the AGF for an invalid freelist count on initial read from
disk. If detected, set a flag on the xfs_perag to indicate that a
reset is required before the AGFL can be used. In the first
transaction that attempts to use a flagged AGFL, reset it to empty,
warn the user about the inconsistency and allow the freelist fixup
code to repopulate the AGFL with new blocks. The xfs_perag flag is
cleared to eliminate the need for repeated checks on each block
allocation operation.
This allows kernels that include the packing fix commit 96f859d52bcb
("libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct")
to handle older unpacked AGFL formats without a filesystem crash.
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by Dave Chiluk <chiluk+linuxxfs@indeed.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chiluk <chiluk+linuxxfs@indeed.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cherry-picked from:
origin/upstream-f2fs-stable-linux-4.4.y
We don't need to wait for whole bunch of discard candidates in fstrim, since
runtime discard will issue them in idle time.
Change-Id: I32602711842d603cca030765eab49b337789e8ad
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>