Commit graph

606876 commits

Author SHA1 Message Date
Kirill Smelkov
e6779b264d fuse: retrieve: cap requested size to negotiated max_write
[ Upstream commit 7640682e67b33cab8628729afec8ca92b851394f ]

FUSE filesystem server and kernel client negotiate during initialization
phase, what should be the maximum write size the client will ever issue.
Correspondingly the filesystem server then queues sys_read calls to read
requests with buffer capacity large enough to carry request header + that
max_write bytes. A filesystem server is free to set its max_write in
anywhere in the range between [1*page, fc->max_pages*page]. In particular
go-fuse[2] sets max_write by default as 64K, wheres default fc->max_pages
corresponds to 128K. Libfuse also allows users to configure max_write, but
by default presets it to possible maximum.

If max_write is < fc->max_pages*page, and in NOTIFY_RETRIEVE handler we
allow to retrieve more than max_write bytes, corresponding prepared
NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the
filesystem server, in full correspondence with server/client contract, will
be only queuing sys_read with ~max_write buffer capacity, and
fuse_dev_do_read throws away requests that cannot fit into server request
buffer. In turn the filesystem server could get stuck waiting indefinitely
for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK which is
understood by clients as that NOTIFY_REPLY was queued and will be sent
back.

Cap requested size to negotiate max_write to avoid the problem.  This
aligns with the way NOTIFY_RETRIEVE handler works, which already
unconditionally caps requested retrieve size to fuse_conn->max_pages.  This
way it should not hurt NOTIFY_RETRIEVE semantic if we return less data than
was originally requested.

Please see [1] for context where the problem of stuck filesystem was hit
for real, how the situation was traced and for more involving patch that
did not make it into the tree.

[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
[2] https://github.com/hanwen/go-fuse

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:20 +02:00
Jorge Ramirez-Ortiz
742cb74bf1 nvmem: core: fix read buffer in place
[ Upstream commit 2fe518fecb3a4727393be286db9804cd82ee2d91 ]

When the bit_offset in the cell is zero, the pointer to the msb will
not be properly initialized (ie, will still be pointing to the first
byte in the buffer).

This being the case, if there are bits to clear in the msb, those will
be left untouched while the mask will incorrectly clear bit positions
on the first byte.

This commit also makes sure that any byte unused in the cell is
cleared.

Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:20 +02:00
Takashi Iwai
750c55e69c ALSA: hda - Register irq handler after the chip initialization
[ Upstream commit f495222e28275222ab6fd93813bd3d462e16d340 ]

Currently the IRQ handler in HD-audio controller driver is registered
before the chip initialization.  That is, we have some window opened
between the azx_acquire_irq() call and the CORB/RIRB setup.  If an
interrupt is triggered in this small window, the IRQ handler may
access to the uninitialized RIRB buffer, which leads to a NULL
dereference Oops.

This is usually no big problem since most of Intel chips do register
the IRQ via MSI, and we've already fixed the order of the IRQ
enablement and the CORB/RIRB setup in the former commit b61749a89f82
("sound: enable interrupt after dma buffer initialization"), hence the
IRQ won't be triggered in that room.  However, some platforms use a
shared IRQ, and this may allow the IRQ trigger by another source.

Another possibility is the kdump environment: a stale interrupt might
be present in there, the IRQ handler can be falsely triggered as well.

For covering this small race, let's move the azx_acquire_irq() call
after hda_intel_init_chip() call.  Although this is a bit radical
change, it can cover more widely than checking the CORB/RIRB setup
locally in the callee side.

Reported-by: Liwei Song <liwei.song@windriver.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:20 +02:00
Lu Baolu
09ad374f23 iommu/vt-d: Set intel_iommu_gfx_mapped correctly
[ Upstream commit cf1ec4539a50bdfe688caad4615ca47646884316 ]

The intel_iommu_gfx_mapped flag is exported by the Intel
IOMMU driver to indicate whether an IOMMU is used for the
graphic device. In a virtualized IOMMU environment (e.g.
QEMU), an include-all IOMMU is used for graphic device.
This flag is found to be clear even the IOMMU is used.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Fixes: c0771df8d5 ("intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.")
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Chao Yu
9e4ed17b94 f2fs: fix to do sanity check on valid block count of segment
[ Upstream commit e95bcdb2fefa129f37bd9035af1d234ca92ee4ef ]

As Jungyeon reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=203233

- Overview
When mounting the attached crafted image and running program, following errors are reported.
Additionally, it hangs on sync after running program.

The image is intentionally fuzzed from a normal f2fs image for testing.
Compile options for F2FS are as follows.
CONFIG_F2FS_FS=y
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
CONFIG_F2FS_CHECK_FS=y

- Reproduces
cc poc_13.c
mkdir test
mount -t f2fs tmp.img test
cp a.out test
cd test
sudo ./a.out
sync

- Kernel messages
 F2FS-fs (sdb): Bitmap was wrongly set, blk:4608
 kernel BUG at fs/f2fs/segment.c:2102!
 RIP: 0010:update_sit_entry+0x394/0x410
 Call Trace:
  f2fs_allocate_data_block+0x16f/0x660
  do_write_page+0x62/0x170
  f2fs_do_write_node_page+0x33/0xa0
  __write_node_page+0x270/0x4e0
  f2fs_sync_node_pages+0x5df/0x670
  f2fs_write_checkpoint+0x372/0x1400
  f2fs_sync_fs+0xa3/0x130
  f2fs_do_sync_file+0x1a6/0x810
  do_fsync+0x33/0x60
  __x64_sys_fsync+0xb/0x10
  do_syscall_64+0x43/0xf0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

sit.vblocks and sum valid block count in sit.valid_map may be
inconsistent, segment w/ zero vblocks will be treated as free
segment, while allocating in free segment, we may allocate a
free block, if its bitmap is valid previously, it can cause
kernel crash due to bitmap verification failure.

Anyway, to avoid further serious metadata inconsistence and
corruption, it is necessary and worth to detect SIT
inconsistence. So let's enable check_block_count() to verify
vblocks and valid_map all the time rather than do it only
CONFIG_F2FS_CHECK_FS is enabled.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Chao Yu
534ef92237 f2fs: fix to avoid panic in do_recover_data()
[ Upstream commit 22d61e286e2d9097dae36f75ed48801056b77cac ]

As Jungyeon reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=203227

- Overview
When mounting the attached crafted image, following errors are reported.
Additionally, it hangs on sync after trying to mount it.

The image is intentionally fuzzed from a normal f2fs image for testing.
Compile options for F2FS are as follows.
CONFIG_F2FS_FS=y
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
CONFIG_F2FS_CHECK_FS=y

- Reproduces
mkdir test
mount -t f2fs tmp.img test
sync

- Messages
 kernel BUG at fs/f2fs/recovery.c:549!
 RIP: 0010:recover_data+0x167a/0x1780
 Call Trace:
  f2fs_recover_fsync_data+0x613/0x710
  f2fs_fill_super+0x1043/0x1aa0
  mount_bdev+0x16d/0x1a0
  mount_fs+0x4a/0x170
  vfs_kern_mount+0x5d/0x100
  do_mount+0x200/0xcf0
  ksys_mount+0x79/0xc0
  __x64_sys_mount+0x1c/0x20
  do_syscall_64+0x43/0xf0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

During recovery, if ofs_of_node is inconsistent in between recovered
node page and original checkpointed node page, let's just fail recovery
instead of making kernel panic.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Miroslav Lichvar
90a238a8a2 ntp: Allow TAI-UTC offset to be set to zero
[ Upstream commit fdc6bae940ee9eb869e493990540098b8c0fd6ab ]

The ADJ_TAI adjtimex mode sets the TAI-UTC offset of the system clock.
It is typically set by NTP/PTP implementations and it is automatically
updated by the kernel on leap seconds. The initial value is zero (which
applications may interpret as unknown), but this value cannot be set by
adjtimex. This limitation seems to go back to the original "nanokernel"
implementation by David Mills.

Change the ADJ_TAI check to accept zero as a valid TAI-UTC offset in
order to allow setting it back to the initial value.

Fixes: 153b5d054a ("ntp: support for TAI")
Suggested-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: https://lkml.kernel.org/r/20190417084833.7401-1-mlichvar@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Matt Redfearn
25be7d51a0 drm/bridge: adv7511: Fix low refresh rate selection
[ Upstream commit 67793bd3b3948dc8c8384b6430e036a30a0ecb43 ]

The driver currently sets register 0xfb (Low Refresh Rate) based on the
value of mode->vrefresh. Firstly, this field is specified to be in Hz,
but the magic numbers used by the code are Hz * 1000. This essentially
leads to the low refresh rate always being set to 0x01, since the
vrefresh value will always be less than 24000. Fix the magic numbers to
be in Hz.
Secondly, according to the comment in drm_modes.h, the field is not
supposed to be used in a functional way anyway. Instead, use the helper
function drm_mode_vrefresh().

Fixes: 9c8af882bf ("drm: Add adv7511 encoder driver")
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Matt Redfearn <matt.redfearn@thinci.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424132210.26338-1-matt.redfearn@thinci.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Stephane Eranian
8fdebdd06c perf/x86/intel: Allow PEBS multi-entry in watermark mode
[ Upstream commit c7a286577d7592720c2f179aadfb325a1ff48c95 ]

This patch fixes a restriction/bug introduced by:

   583feb08e7f7 ("perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS")

The original patch prevented using multi-entry PEBS when wakeup_events != 0.
However given that wakeup_events is part of a union with wakeup_watermark, it
means that in watermark mode, PEBS multi-entry is also disabled which is not the
intent. This patch fixes this by checking is watermark mode is enabled.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Cc: vincent.weaver@maine.edu
Fixes: 583feb08e7f7 ("perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS")
Link: http://lkml.kernel.org/r/20190514003400.224340-1-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Tony Lindgren
dee1ba919c mfd: twl6040: Fix device init errors for ACCCTL register
[ Upstream commit 48171d0ea7caccf21c9ee3ae75eb370f2a756062 ]

I noticed that we can get a -EREMOTEIO errors on at least omap4 duovero:

twl6040 0-004b: Failed to write 2d = 19: -121

And then any following register access will produce errors.

There 2d offset above is register ACCCTL that gets written on twl6040
powerup. With error checking added to the related regcache_sync() call,
the -EREMOTEIO error is reproducable on twl6040 powerup at least
duovero.

To fix the error, we need to wait until twl6040 is accessible after the
powerup. Based on tests on omap4 duovero, we need to wait over 8ms after
powerup before register write will complete without failures. Let's also
make sure we warn about possible errors too.

Note that we have twl6040_patch[] reg_sequence with the ACCCTL register
configuration and regcache_sync() will write the new value to ACCCTL.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Binbin Wu
4110c41888 mfd: intel-lpss: Set the device in reset state when init
[ Upstream commit dad06532292d77f37fbe831a02948a593500f682 ]

In virtualized setup, when system reboots due to warm
reset interrupt storm is seen.

Call Trace:
<IRQ>
dump_stack+0x70/0xa5
__report_bad_irq+0x2e/0xc0
note_interrupt+0x248/0x290
? add_interrupt_randomness+0x30/0x220
handle_irq_event_percpu+0x54/0x80
handle_irq_event+0x39/0x60
handle_fasteoi_irq+0x91/0x150
handle_irq+0x108/0x180
do_IRQ+0x52/0xf0
common_interrupt+0xf/0xf
</IRQ>
RIP: 0033:0x76fc2cfabc1d
Code: 24 28 bf 03 00 00 00 31 c0 48 8d 35 63 77 0e 00 48 8d 15 2e
94 0e 00 4c 89 f9 49 89 d9 4c 89 d3 e8 b8 e2 01 00 48 8b 54 24 18
<48> 89 ef 48 89 de 4c 89 e1 e8 d5 97 01 00 84 c0 74 2d 48 8b 04
24
RSP: 002b:00007ffd247c1fc0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffda
RAX: 0000000000000000 RBX: 00007ffd247c1ff0 RCX: 000000000003d3ce
RDX: 0000000000000000 RSI: 00007ffd247c1ff0 RDI: 000076fc2cbb6010
RBP: 000076fc2cded010 R08: 00007ffd247c2210 R09: 00007ffd247c22a0
R10: 000076fc29465470 R11: 0000000000000000 R12: 00007ffd247c1fc0
R13: 000076fc2ce8e470 R14: 000076fc27ec9960 R15: 0000000000000414
handlers:
[<000000000d3fa913>] idma64_irq
Disabling IRQ #27

To avoid interrupt storm, set the device in reset state
before bringing out the device from reset state.

Changelog v2:
- correct the subject line by adding "mfd: "

Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:19 +02:00
Cyrill Gorcunov
1bef19130b kernel/sys.c: prctl: fix false positive in validate_prctl_map()
[ Upstream commit a9e73998f9d705c94a8dca9687633adc0f24a19a ]

While validating new map we require the @start_data to be strictly less
than @end_data, which is fine for regular applications (this is why this
nit didn't trigger for that long).  These members are set from executable
loaders such as elf handers, still it is pretty valid to have a loadable
data section with zero size in file, in such case the start_data is equal
to end_data once kernel loader finishes.

As a result when we're trying to restore such programs the procedure fails
and the kernel returns -EINVAL.  From the image dump of a program:

 | "mm_start_code": "0x400000",
 | "mm_end_code": "0x8f5fb4",
 | "mm_start_data": "0xf1bfb0",
 | "mm_end_data": "0xf1bfb0",

Thus we need to change validate_prctl_map from strictly less to less or
equal operator use.

Link: http://lkml.kernel.org/r/20190408143554.GY1421@uranus.lan
Fixes: f606b77f1a ("prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation")
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Andrey Vagin <avagin@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:18 +02:00
Yue Hu
937fa1624a mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
[ Upstream commit f0fd50504a54f5548eb666dc16ddf8394e44e4b7 ]

If not find zero bit in find_next_zero_bit(), it will return the size
parameter passed in, so the start bit should be compared with bitmap_maxno
rather than cma->count.  Although getting maxchunk is working fine due to
zero value of order_per_bit currently, the operation will be stuck if
order_per_bit is set as non-zero.

Link: http://lkml.kernel.org/r/20190319092734.276-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Safonov <d.safonov@partner.samsung.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:18 +02:00
Yue Hu
fceb0be418 mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
[ Upstream commit 1df3a339074e31db95c4790ea9236874b13ccd87 ]

f022d8cb7e ("mm: cma: Don't crash on allocation if CMA area can't be
activated") fixes the crash issue when activation fails via setting
cma->count as 0, same logic exists if bitmap allocation fails.

Link: http://lkml.kernel.org/r/20190325081309.6004-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Mike Kravetz
9c8d4d77e3 hugetlbfs: on restore reserve error path retain subpool reservation
[ Upstream commit 0919e1b69ab459e06df45d3ba6658d281962db80 ]

When a huge page is allocated, PagePrivate() is set if the allocation
consumed a reservation.  When freeing a huge page, PagePrivate is checked.
If set, it indicates the reservation should be restored.  PagePrivate
being set at free huge page time mostly happens on error paths.

When huge page reservations are created, a check is made to determine if
the mapping is associated with an explicitly mounted filesystem.  If so,
pages are also reserved within the filesystem.  The default action when
freeing a huge page is to decrement the usage count in any associated
explicitly mounted filesystem.  However, if the reservation is to be
restored the reservation/use count within the filesystem should not be
decrementd.  Otherwise, a subsequent page allocation and free for the same
mapping location will cause the file filesystem usage to go 'negative'.

Filesystem                         Size  Used Avail Use% Mounted on
nodev                              4.0G -4.0M  4.1G    - /opt/hugepool

To fix, when freeing a huge page do not adjust filesystem usage if
PagePrivate() is set to indicate the reservation should be restored.

I did not cc stable as the problem has been around since reserves were
added to hugetlbfs and nobody has noticed.

Link: http://lkml.kernel.org/r/20190328234704.27083-2-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Li Rongqing
d8129a5d7a ipc: prevent lockup on alloc_msg and free_msg
[ Upstream commit d6a2946a88f524a47cc9b79279667137899db807 ]

msgctl10 of ltp triggers the following lockup When CONFIG_KASAN is
enabled on large memory SMP systems, the pages initialization can take a
long time, if msgctl10 requests a huge block memory, and it will block
rcu scheduler, so release cpu actively.

After adding schedule() in free_msg, free_msg can not be called when
holding spinlock, so adding msg to a tmp list, and free it out of
spinlock

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu:     Tasks blocked on level-1 rcu_node (CPUs 16-31): P32505
  rcu:     Tasks blocked on level-1 rcu_node (CPUs 48-63): P34978
  rcu:     (detected by 11, t=35024 jiffies, g=44237529, q=16542267)
  msgctl10        R  running task    21608 32505   2794 0x00000082
  Call Trace:
   preempt_schedule_irq+0x4c/0xb0
   retint_kernel+0x1b/0x2d
  RIP: 0010:__is_insn_slot_addr+0xfb/0x250
  Code: 82 1d 00 48 8b 9b 90 00 00 00 4c 89 f7 49 c1 ee 03 e8 59 83 1d 00 48 b8 00 00 00 00 00 fc ff df 4c 39 eb 48 89 9d 58 ff ff ff <41> c6 04 06 f8 74 66 4c 8d 75 98 4c 89 f1 48 c1 e9 03 48 01 c8 48
  RSP: 0018:ffff88bce041f758 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
  RAX: dffffc0000000000 RBX: ffffffff8471bc50 RCX: ffffffff828a2a57
  RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88bce041f780
  RBP: ffff88bce041f828 R08: ffffed15f3f4c5b3 R09: ffffed15f3f4c5b3
  R10: 0000000000000001 R11: ffffed15f3f4c5b2 R12: 000000318aee9b73
  R13: ffffffff8471bc50 R14: 1ffff1179c083ef0 R15: 1ffff1179c083eec
   kernel_text_address+0xc1/0x100
   __kernel_text_address+0xe/0x30
   unwind_get_return_address+0x2f/0x50
   __save_stack_trace+0x92/0x100
   create_object+0x380/0x650
   __kmalloc+0x14c/0x2b0
   load_msg+0x38/0x1a0
   do_msgsnd+0x19e/0xcf0
   do_syscall_64+0x117/0x400
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu:     Tasks blocked on level-1 rcu_node (CPUs 0-15): P32170
  rcu:     (detected by 14, t=35016 jiffies, g=44237525, q=12423063)
  msgctl10        R  running task    21608 32170  32155 0x00000082
  Call Trace:
   preempt_schedule_irq+0x4c/0xb0
   retint_kernel+0x1b/0x2d
  RIP: 0010:lock_acquire+0x4d/0x340
  Code: 48 81 ec c0 00 00 00 45 89 c6 4d 89 cf 48 8d 6c 24 20 48 89 3c 24 48 8d bb e4 0c 00 00 89 74 24 0c 48 c7 44 24 20 b3 8a b5 41 <48> c1 ed 03 48 c7 44 24 28 b4 25 18 84 48 c7 44 24 30 d0 54 7a 82
  RSP: 0018:ffff88af83417738 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
  RAX: dffffc0000000000 RBX: ffff88bd335f3080 RCX: 0000000000000002
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88bd335f3d64
  RBP: ffff88af83417758 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000001 R11: ffffed13f3f745b2 R12: 0000000000000000
  R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
   is_bpf_text_address+0x32/0xe0
   kernel_text_address+0xec/0x100
   __kernel_text_address+0xe/0x30
   unwind_get_return_address+0x2f/0x50
   __save_stack_trace+0x92/0x100
   save_stack+0x32/0xb0
   __kasan_slab_free+0x130/0x180
   kfree+0xfa/0x2d0
   free_msg+0x24/0x50
   do_msgrcv+0x508/0xe60
   do_syscall_64+0x117/0x400
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Davidlohr said:
 "So after releasing the lock, the msg rbtree/list is empty and new
  calls will not see those in the newly populated tmp_msg list, and
  therefore they cannot access the delayed msg freeing pointers, which
  is good. Also the fact that the node_cache is now freed before the
  actual messages seems to be harmless as this is wanted for
  msg_insert() avoiding GFP_ATOMIC allocations, and after releasing the
  info->lock the thing is freed anyway so it should not change things"

Link: http://lkml.kernel.org/r/1552029161-4957-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Christian Brauner
50c0db5399 sysctl: return -EINVAL if val violates minmax
[ Upstream commit e260ad01f0aa9e96b5386d5cd7184afd949dc457 ]

Currently when userspace gives us a values that overflow e.g.  file-max
and other callers of __do_proc_doulongvec_minmax() we simply ignore the
new value and leave the current value untouched.

This can be problematic as it gives the illusion that the limit has
indeed be bumped when in fact it failed.  This commit makes sure to
return EINVAL when an overflow is detected.  Please note that this is a
userspace facing change.

Link: http://lkml.kernel.org/r/20190210203943.8227-4-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Hou Tao
8b9241b052 fs/fat/file.c: issue flush after the writeback of FAT
[ Upstream commit bd8309de0d60838eef6fb575b0c4c7e95841cf73 ]

fsync() needs to make sure the data & meta-data of file are persistent
after the return of fsync(), even when a power-failure occurs later.  In
the case of fat-fs, the FAT belongs to the meta-data of file, so we need
to issue a flush after the writeback of FAT instead before.

Also bail out early when any stage of fsync fails.

Link: http://lkml.kernel.org/r/20190409030158.136316-1-houtao1@huawei.com
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Linux Build Service Account
335e20ca95 Merge "soc: qcom: glink_spi_xprt: Sanitize input for short cmd" 2019-06-21 16:30:48 -07:00
Linux Build Service Account
abc2a59bfd Merge "soc: qcom: hab: add error handling when dt item is missing" 2019-06-21 08:40:23 -07:00
Daniel Colascione
9998d2e52d mm: add /proc/pid/smaps_rollup
/proc/pid/smaps_rollup is a new proc file that improves the performance
of user programs that determine aggregate memory statistics (e.g., total
PSS) of a process.

Android regularly "samples" the memory usage of various processes in
order to balance its memory pool sizes.  This sampling process involves
opening /proc/pid/smaps and summing certain fields.  For very large
processes, sampling memory use this way can take several hundred
milliseconds, due mostly to the overhead of the seq_printf calls in
task_mmu.c.

smaps_rollup improves the situation.  It contains most of the fields of
/proc/pid/smaps, but instead of a set of fields for each VMA,
smaps_rollup instead contains one synthetic smaps-format entry
representing the whole process.  In the single smaps_rollup synthetic
entry, each field is the summation of the corresponding field in all of
the real-smaps VMAs.  Using a common format for smaps_rollup and smaps
allows userspace parsers to repurpose parsers meant for use with
non-rollup smaps for smaps_rollup, and it allows userspace to switch
between smaps_rollup and smaps at runtime (say, based on the
availability of smaps_rollup in a given kernel) with minimal fuss.

By using smaps_rollup instead of smaps, a caller can avoid the
significant overhead of formatting, reading, and parsing each of a large
process's potentially very numerous memory mappings.  For sampling
system_server's PSS in Android, we measured a 12x speedup, representing
a savings of several hundred milliseconds.

One alternative to a new per-process proc file would have been including
PSS information in /proc/pid/status.  We considered this option but
thought that PSS would be too expensive (by a few orders of magnitude)
to collect relative to what's already emitted as part of
/proc/pid/status, and slowing every user of /proc/pid/status for the
sake of readers that happen to want PSS feels wrong.

The code itself works by reusing the existing VMA-walking framework we
use for regular smaps generation and keeping the mem_size_stats
structure around between VMA walks instead of using a fresh one for each
VMA.  In this way, summation happens automatically.  We let seq_file
walk over the VMAs just as it does for regular smaps and just emit
nothing to the seq_file until we hit the last VMA.

Benchmarks:

    using smaps:
    iterations:1000 pid:1163 pss:220023808
    0m29.46s real 0m08.28s user 0m20.98s system

    using smaps_rollup:
    iterations:1000 pid:1163 pss:220702720
    0m04.39s real 0m00.03s user 0m04.31s system

We're using the PSS samples we collect asynchronously for
system-management tasks like fine-tuning oom_adj_score, memory use
tracking for debugging, application-level memory-use attribution, and
deciding whether we want to kill large processes during system idle
maintenance windows.  Android has been using PSS for these purposes for
a long time; as the average process VMA count has increased and and
devices become more efficiency-conscious, PSS-collection inefficiency
has started to matter more.  IMHO, it'd be a lot safer to optimize the
existing PSS-collection model, which has been fine-tuned over the years,
instead of changing the memory tracking approach entirely to work around
smaps-generation inefficiency.

Tim said:

: There are two main reasons why Android gathers PSS information:
:
: 1. Android devices can show the user the amount of memory used per
:    application via the settings app.  This is a less important use case.
:
: 2. We log PSS to help identify leaks in applications.  We have found
:    an enormous number of bugs (in the Android platform, in Google's own
:    apps, and in third-party applications) using this data.
:
: To do this, system_server (the main process in Android userspace) will
: sample the PSS of a process three seconds after it changes state (for
: example, app is launched and becomes the foreground application) and about
: every ten minutes after that.  The net result is that PSS collection is
: regularly running on at least one process in the system (usually a few
: times a minute while the screen is on, less when screen is off due to
: suspend).  PSS of a process is an incredibly useful stat to track, and we
: aren't going to get rid of it.  We've looked at some very hacky approaches
: using RSS ("take the RSS of the target process, subtract the RSS of the
: zygote process that is the parent of all Android apps") to reduce the
: accounting time, but it regularly overestimated the memory used by 20+
: percent.  Accordingly, I don't think that there's a good alternative to
: using PSS.
:
: We started looking into PSS collection performance after we noticed random
: frequency spikes while a phone's screen was off; occasionally, one of the
: CPU clusters would ramp to a high frequency because there was 200-300ms of
: constant CPU work from a single thread in the main Android userspace
: process.  The work causing the spike (which is reasonable governor
: behavior given the amount of CPU time needed) was always PSS collection.
: As a result, Android is burning more power than we should be on PSS
: collection.
:
: The other issue (and why I'm less sure about improving smaps as a
: long-term solution) is that the number of VMAs per process has increased
: significantly from release to release.  After trying to figure out why we
: were seeing these 200-300ms PSS collection times on Android O but had not
: noticed it in previous versions, we found that the number of VMAs in the
: main system process increased by 50% from Android N to Android O (from
: ~1800 to ~2700) and varying increases in every userspace process.  Android
: M to N also had an increase in the number of VMAs, although not as much.
: I'm not sure why this is increasing so much over time, but thinking about
: ASLR and ways to make ASLR better, I expect that this will continue to
: increase going forward.  I would not be surprised if we hit 5000 VMAs on
: the main Android process (system_server) by 2020.
:
: If we assume that the number of VMAs is going to increase over time, then
: doing anything we can do to reduce the overhead of each VMA during PSS
: collection seems like the right way to go, and that means outputting an
: aggregate statistic (to avoid whatever overhead there is per line in
: writing smaps and in reading each line from userspace).

Link: http://lkml.kernel.org/r/20170812022148.178293-1-dancol@google.com
Signed-off-by: Daniel Colascione <dancol@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Change-Id: Id86ab841e6ad1a50b372e8d6e8fa4ecc148ffd44
Git-commit: 493b0e9d945fa9dfe96be93ae41b4ca4b6fdb317
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[spathi@codeaurora.org: resolved conflicts in task_mmu.c file]
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2019-06-20 21:56:35 -07:00
Linux Build Service Account
a216516368 Merge "msm: ais: sensor: actuator: fix out of bound read for bivcm region params" 2019-06-20 05:00:41 -07:00
Jiacheng Zheng
7ec9476039 soc: qcom: hab: add error handling when dt item is missing
If certain testgipc node under aliases in device tree is
missing, ep_path remains NULL and it will cause NULL
dereference problem later in the code. We need to go to
error handler and deinitialize everything under this
condition.

Change-Id: I98a27acfb2e8af9687114d610304a31a1ac9c3ca
Signed-off-by: Jiacheng Zheng <jiaczhen@codeaurora.org>
2019-06-20 18:15:25 +08:00
E V Ravi
7967865bb4 msm: ais: sensor: actuator: fix out of bound read for bivcm region params
Issue:
The region index for bivcm is not validated against the region size.
This cause out-of-bound read on the KASAN kernel.
Fix:
Add restriction that region index smaller than region size.
CRs-Fixed: 2379514

Change-Id: Iff77993000fc89513be90d9de174728fed3b50a7
Signed-off-by: E V Ravi <evenka@codeaurora.org>
2019-06-19 20:59:02 -07:00
Linux Build Service Account
14f9c77766 Merge "Merge android-4.4.182 (9c4ab57) into msm-4.4" 2019-06-19 17:24:59 -07:00
Roman Kiryanov
eee21eb4a6 ANDROID: kernel: cgroup: cpuset: Clear cpus_requested for empty buf
update_cpumask had a special case for empty buf which
did not update cpus_requested. This change reduces
differences (only to parsing) in empty/non-empty codepaths
to make them consistent.

Bug: 120444281
Fixes: 4803def4e0b2 ("ANDROID: cpuset: Make cpusets restore on hotplug")
Test: check that writes to /dev/cpuset/background/tasks
Test: work as expected, e.g.:
Test: echo $$ > /dev/cpuset/background/tasks
Test: echo > /dev/cpuset/background/tasks
Signed-off-by: Roman Kiryanov <rkir@google.com>
Change-Id: I49d320ea046636ec38bd23f053317abc59f64f8e
2019-06-19 11:13:11 -07:00
Roman Kiryanov
f1adac4c22 ANDROID: kernel: cgroup: cpuset: Add missing allocation of cpus_requested in alloc_trial_cpuset
alloc_trial_cpuset missed allocation of the alloc_trial_cpuset
field which caused it to be shared from the base cs provided.

Once update_cpumask parsed buf into cpus_requested and updated
cpus_allowed, the result were never written to cs because
cs and trialcs shared the same pointer to cpus_requested and
cpus_requested always matched to itself and no updates were
written. This caused cpus_requested to be non-empty and
cpus_allowed empty.

This issue occurs only with CONFIG_CPUMASK_OFFSTACK enabled
(e.g. via CONFIG_MAXSMP).

Bug: 134051784
Bug: 120444281
Fixes: 4803def4e0b2 ("ANDROID: cpuset: Make cpusets restore on hotplug")
Test: enable CONFIG_CPUSETS, boot and check logcat that
Test: libprocessgroup does not fail with something similar to
Test: AddTidToCgroup failed to write '2354'; fd=93: No space left on device
Signed-off-by: Roman Kiryanov <rkir@google.com>
Change-Id: I866836b5c0acfde8349c250a510ee89d8d37cb8e
2019-06-19 11:13:03 -07:00
Xianbin Zhu
291c54b284 i2c: virtio: reallocate memory for each msg buffer
We'd better reallocate memory for each msg buffer to avoid
the memory in stack space be transferred to the backend for
the driver of i2c virtualization.

Change-Id: I6daa58cf819d4f612ae246b5a685898e28fbeb54
Signed-off-by: Xianbin Zhu <xianzhu@codeaurora.org>
2019-06-19 14:22:56 +08:00
Gerrit - the friendly Code Review server
ca64572d54 Merge changes into msm-4.4 2019-06-18 08:47:08 -07:00
Srinivasarao P
a32f2cd759 Merge android-4.4.182 (9c4ab57) into msm-4.4
* refs/heads/tmp-9c4ab57
  Linux 4.4.182
  tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
  tcp: add tcp_min_snd_mss sysctl
  tcp: tcp_fragment() should apply sane memory limits
  tcp: limit payload size of sacked skbs
  UPSTREAM: binder: check for overflow when alloc for security context
  BACKPORT: binder: fix race between munmap() and direct reclaim

Change-Id: I4cfb9eb282f54f083631fec4c8161eed42e8ab54
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2019-06-18 12:42:28 +05:30
Linux Build Service Account
2ae4cde579 Merge "i2c: refine the driver of i2c virtualization" 2019-06-17 23:51:36 -07:00
Greg Kroah-Hartman
9c4ab57299 This is the 4.4.182 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0H088ACgkQONu9yGCS
 aT76DQ/+P8y0xk1PUWdaIWj8sjlZJXFc/5lBdxYiJ7pVi8pfXS2aFq1EG9ubTecR
 JWSJ7aMSJHuyLx4Lc2LIeW9dIz1bhp7h61Jqrxvfeo1EoS3diiiu+J1jp3rJyG3A
 4QrqzSsJTswMldmaEVM4vbC0NClwLuHh/c0715f7qi8kuZiTrSuh2x2FVAsaswp0
 Pt8XnKFQ9oE9TvsivfcB24ZiCVQlFSqgk8YwGL111Tfr51mTGddExHyTawXjnH23
 5zgPzVQBMpONLy9r+aW8NJ+CFjk0xTJ2kaWejN9+909J2Nic3h8CZmbb+VMbOnP2
 +ELcy/vUYik/YTaGKmutAWEDJ8Pwlzw0c2EXdXZ8srv8nwz4i/GXwutBsCKsDc0D
 XS9ftELJJDGe4cOACLKwabL5i5vIMqgy78XcNsCXmMEJP+tKF20U3G3Amze9vR/1
 FXtSHRfunzMdTIPqRUuVPzuTze6dP4pICcPkuxZePnbVeOTjjz5GZs0fTGH0xkr9
 p94mSFC90nIbkGpzCqO33uTfqmmWsadbq39DgVZotKR7u9hlfEHst6djEBt7gFej
 CQV28C0Gv1I7rNdAjy+RV9siyhOjtaIsqjhhOAiNf01HlrcgBXCVUJ/Jzpisfs8P
 B23cnWgEjjzsCCaq9nH61Zcd4iyU2Yu0azOWRKigbh1Nfe3X3pA=
 =R6Xb
 -----END PGP SIGNATURE-----

Merge 4.4.182 into android-4.4

Changes in 4.4.182
	tcp: limit payload size of sacked skbs
	tcp: tcp_fragment() should apply sane memory limits
	tcp: add tcp_min_snd_mss sysctl
	tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
	Linux 4.4.182

Change-Id: I2698fa2ad3884a20d936e6bee638304f52df19bd
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-06-17 20:28:36 +02:00
Greg Kroah-Hartman
33790f2eda Linux 4.4.182 2019-06-17 19:54:23 +02:00
Eric Dumazet
f938ae0ce5 tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
commit 967c05aee439e6e5d7d805e195b3a20ef5c433d6 upstream.

If mtu probing is enabled tcp_mtu_probing() could very well end up
with a too small MSS.

Use the new sysctl tcp_min_snd_mss to make sure MSS search
is performed in an acceptable range.

CVE-2019-11479 -- tcp mss hardcoded to 48

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Cc: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-17 19:54:22 +02:00
Eric Dumazet
e757d052f3 tcp: add tcp_min_snd_mss sysctl
commit 5f3e2bf008c2221478101ee72f5cb4654b9fc363 upstream.

Some TCP peers announce a very small MSS option in their SYN and/or
SYN/ACK messages.

This forces the stack to send packets with a very high network/cpu
overhead.

Linux has enforced a minimal value of 48. Since this value includes
the size of TCP options, and that the options can consume up to 40
bytes, this means that each segment can include only 8 bytes of payload.

In some cases, it can be useful to increase the minimal value
to a saner value.

We still let the default to 48 (TCP_MIN_SND_MSS), for compatibility
reasons.

Note that TCP_MAXSEG socket option enforces a minimal value
of (TCP_MIN_MSS). David Miller increased this minimal value
in commit c39508d6f1 ("tcp: Make TCP_MAXSEG minimum more correct.")
from 64 to 88.

We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS.

CVE-2019-11479 -- tcp mss hardcoded to 48

Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-17 19:54:22 +02:00
Eric Dumazet
ad472d3a94 tcp: tcp_fragment() should apply sane memory limits
commit f070ef2ac66716357066b683fb0baf55f8191a2e upstream.

Jonathan Looney reported that a malicious peer can force a sender
to fragment its retransmit queue into tiny skbs, inflating memory
usage and/or overflow 32bit counters.

TCP allows an application to queue up to sk_sndbuf bytes,
so we need to give some allowance for non malicious splitting
of retransmit queue.

A new SNMP counter is added to monitor how many times TCP
did not allow to split an skb if the allowance was exceeded.

Note that this counter might increase in the case applications
use SO_SNDBUF socket option to lower sk_sndbuf.

CVE-2019-11478 : tcp_fragment, prevent fragmenting a packet when the
	socket is already using more than half the allowed space

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-17 19:54:22 +02:00
Eric Dumazet
4657ee0fe0 tcp: limit payload size of sacked skbs
commit 3b4929f65b0d8249f19a50245cd88ed1a2f78cff upstream.

Jonathan Looney reported that TCP can trigger the following crash
in tcp_shifted_skb() :

	BUG_ON(tcp_skb_pcount(skb) < pcount);

This can happen if the remote peer has advertized the smallest
MSS that linux TCP accepts : 48

An skb can hold 17 fragments, and each fragment can hold 32KB
on x86, or 64KB on PowerPC.

This means that the 16bit witdh of TCP_SKB_CB(skb)->tcp_gso_segs
can overflow.

Note that tcp_sendmsg() builds skbs with less than 64KB
of payload, so this problem needs SACK to be enabled.
SACK blocks allow TCP to coalesce multiple skbs in the retransmit
queue, thus filling the 17 fragments to maximal capacity.

CVE-2019-11477 -- u16 overflow of TCP_SKB_CB(skb)->tcp_gso_segs

Backport notes, provided by Joao Martins <joao.m.martins@oracle.com>

v4.15 or since commit 737ff314563 ("tcp: use sequence distance to
detect reordering") had switched from the packet-based FACK tracking and
switched to sequence-based.

v4.14 and older still have the old logic and hence on
tcp_skb_shift_data() needs to retain its original logic and have
@fack_count in sync. In other words, we keep the increment of pcount with
tcp_skb_pcount(skb) to later used that to update fack_count. To make it
more explicit we track the new skb that gets incremented to pcount in
@next_pcount, and we get to avoid the constant invocation of
tcp_skb_pcount(skb) all together.

Fixes: 832d11c5cd ("tcp: Try to restore large SKBs while SACK processing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Looney <jtl@netflix.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Bruce Curtis <brucec@netflix.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-17 19:54:22 +02:00
Linux Build Service Account
17c66e9e49 Merge "msm: vidc: add additional check to avoid out of bound access" 2019-06-17 10:19:40 -07:00
Linux Build Service Account
4e933027e6 Merge "Merge android-4.4.181 (bd858d7) into msm-4.4" 2019-06-17 10:19:38 -07:00
Xianbin Zhu
be5afd9343 i2c: refine the driver of i2c virtualization
1. there's no need to allocate buffer for virtio_i2c_req every
   time when it comes to data transmission, we can prepare this
   buffer at probe stage.

2. we need to add some input parameter checking for avoiding
   security risks.

3. refine the code to reduce the number of judgement statements.

Change-Id: Idb6ae8bb632c1a3032c6ba04de6ebb460f1e1a5c
Signed-off-by: Xianbin Zhu <xianzhu@codeaurora.org>
2019-06-17 07:23:00 -07:00
Govindaraj Rajagopal
f1057f82fa msm: vidc: add additional check to avoid out of bound access
pkt->msg_size can be corrupted and that leads to OOB access. So added
additional conditional check to avoid OOB access in debug queue
packet handling.

Change-Id: I360812c40369ecef2dd99464d400661bc785074b
Signed-off-by: Govindaraj Rajagopal <grajagop@codeaurora.org>
2019-06-17 06:03:29 -07:00
Manoj Prabhu B
c176a066df diag: dci: Correct out of bounds check in processing dci pkt rsp
Correct the out of bounds check and prevent moving the temp pointer
further than out of bounds check which is not necessary while
processing dci pkt rsp.

Change-Id: I01f8cd7454aff81b24c986eade35c79724976151
Signed-off-by: Manoj Prabhu B <bmanoj@codeaurora.org>
2019-06-17 01:30:06 -07:00
Linux Build Service Account
cc47c2c325 Merge "msm: adsprpc: maintain local copy of rpra offloaded to DSP" 2019-06-14 08:06:12 -07:00
Tharun Kumar Merugu
1f3eb68e71 msm: adsprpc: maintain local copy of rpra offloaded to DSP
Since DSP is not supposed to modify the base pointer rpra of the
input/output arguments offloaded to DSP, maintain a local copy of
the pointer and use it after receiving interrupt from DSP.

Change-Id: I4afade7184cb2aca148060fb0cda06c6174f3b55
Acked-by: Maitreyi Gupta <maitreyi@qti.qualcomm.com>
Signed-off-by: Tharun Kumar Merugu <mtharu@codeaurora.org>
Signed-off-by: Mohammed Nayeem Ur Rahman <mohara@codeaurora.org>
2019-06-14 15:03:35 +05:30
Linux Build Service Account
a50f44266d Merge "diag: Prevent out-of-bound access while processing userspace data" 2019-06-13 16:34:55 -07:00
Linux Build Service Account
c658076e45 Merge "ASoC: msm: Add support for AVS version check" 2019-06-13 16:34:53 -07:00
Linux Build Service Account
f328b4d28d Merge "arm: dts: msm: Add avs-version dt property for 8996" 2019-06-12 18:51:25 -07:00
Hardik Arya
5dbfa1533a diag: Prevent out-of-bound access while processing userspace data
Proper buffer length checks are missing in diagchar_write
handlers for userspace data while processing the same buffer.

Change-Id: I5b8095766e09c22f164398089505fe827fee8b54
Signed-off-by: Hardik Arya <harya@codeaurora.org>
2019-06-12 17:39:26 +05:30
Soumya Managoli
eb0322ea9b arm: dts: msm: Add avs-version dt property for 8996
Add avs-version dt property for reading the ADSP AVS version

Change-Id: I5da40c9530ca4c6ff61f0f1ea5e6b39f1e47e2ad
Signed-off-by: Soumya Managoli <smanag@codeaurora.org>
2019-06-12 14:24:22 +05:30
Srinivasarao P
5ef154a266 Merge android-4.4.181 (bd858d7) into msm-4.4
* refs/heads/tmp-bd858d7
  Linux 4.4.181
  ethtool: check the return value of get_regs_len
  ipv4: Define __ipv4_neigh_lookup_noref when CONFIG_INET is disabled
  fuse: Add FOPEN_STREAM to use stream_open()
  fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
  drm/gma500/cdv: Check vbt config bits when detecting lvds panels
  genwqe: Prevent an integer overflow in the ioctl
  MIPS: pistachio: Build uImage.gz by default
  fuse: fallocate: fix return with locked inode
  parisc: Use implicit space register selection for loading the coherence index of I/O pdirs
  rcu: locking and unlocking need to always be at least barriers
  pktgen: do not sleep with the thread lock held.
  net: rds: fix memory leak in rds_ib_flush_mr_pool
  net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
  neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit
  ethtool: fix potential userspace buffer overflow
  media: uvcvideo: Fix uvc_alloc_entity() allocation alignment
  usb: gadget: fix request length error for isoc transfer
  net: cdc_ncm: GetNtbFormat endian fix
  Revert "x86/build: Move _etext to actual end of .text"
  userfaultfd: don't pin the user memory in userfaultfd_file_create()
  brcmfmac: add subtype check for event handling in data path
  brcmfmac: add length checks in scheduled scan result handler
  brcmfmac: fix incorrect event channel deduction
  brcmfmac: revise handling events in receive path
  brcmfmac: screening firmware event packet
  brcmfmac: Add length checks on firmware events
  bnx2x: disable GSO where gso_size is too big for hardware
  net: create skb_gso_validate_mac_len()
  binder: replace "%p" with "%pK"
  binder: Replace "%p" with "%pK" for stable
  CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM
  kernel/signal.c: trace_signal_deliver when signal_group_exit
  memcg: make it work on sparse non-0-node systems
  tty: max310x: Fix external crystal register setup
  tty: serial: msm_serial: Fix XON/XOFF
  drm/nouveau/i2c: Disable i2c bus access after ->fini()
  ALSA: hda/realtek - Set default power save node to 0
  Btrfs: fix race updating log root item during fsync
  scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
  scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
  media: smsusb: better handle optional alignment
  media: usb: siano: Fix false-positive "uninitialized variable" warning
  media: usb: siano: Fix general protection fault in smsusb
  USB: rio500: fix memory leak in close after disconnect
  USB: rio500: refuse more than one device at a time
  USB: Add LPM quirk for Surface Dock GigE adapter
  USB: sisusbvga: fix oops in error path of sisusb_probe
  USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
  usb: xhci: avoid null pointer deref when bos field is NULL
  xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
  include/linux/bitops.h: sanitize rotate primitives
  sparc64: Fix regression in non-hypervisor TLB flush xcall
  tipc: fix modprobe tipc failed after switch order of device registration -v2
  Revert "tipc: fix modprobe tipc failed after switch order of device registration"
  xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
  crypto: vmx - ghash: do nosimd fallback manually
  net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
  bnxt_en: Fix aggregation buffer leak under OOM condition.
  tipc: Avoid copying bytes beyond the supplied data
  usbnet: fix kernel crash after disconnect
  net: stmmac: fix reset gpio free missing
  net-gro: fix use-after-free read in napi_gro_frags()
  llc: fix skb leak in llc_build_and_send_ui_pkt()
  ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
  ASoC: davinci-mcasp: Fix clang warning without CONFIG_PM
  spi: Fix zero length xfer bug
  spi: rspi: Fix sequencer reset during initialization
  spi : spi-topcliff-pch: Fix to handle empty DMA buffers
  scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices
  media: saa7146: avoid high stack usage with clang
  media: go7007: avoid clang frame overflow warning with KASAN
  media: m88ds3103: serialize reset messages in m88ds3103_set_frontend
  scsi: qla4xxx: avoid freeing unallocated dma memory
  usb: core: Add PM runtime calls to usb_hcd_platform_shutdown
  rcutorture: Fix cleanup path for invalid torture_type strings
  tty: ipwireless: fix missing checks for ioremap
  virtio_console: initialize vtermno value for ports
  media: wl128x: prevent two potential buffer overflows
  spi: tegra114: reset controller on probe
  cxgb3/l2t: Fix undefined behaviour
  ASoC: fsl_utils: fix a leaked reference by adding missing of_node_put
  ASoC: eukrea-tlv320: fix a leaked reference by adding missing of_node_put
  HID: core: move Usage Page concatenation to Main item
  chardev: add additional check for minor range overlap
  x86/ia32: Fix ia32_restore_sigcontext() AC leak
  arm64: cpu_ops: fix a leaked reference by adding missing of_node_put
  scsi: ufs: Avoid configuring regulator with undefined voltage range
  scsi: ufs: Fix regulator load and icc-level configuration
  brcmfmac: fix race during disconnect when USB completion is in progress
  brcmfmac: convert dev_init_lock mutex to completion
  b43: shut up clang -Wuninitialized variable warning
  brcmfmac: fix missing checks for kmemdup
  rtlwifi: fix a potential NULL pointer dereference
  iio: common: ssp_sensors: Initialize calculated_time in ssp_common_process_data
  iio: hmc5843: fix potential NULL pointer dereferences
  iio: ad_sigma_delta: Properly handle SPI bus locking vs CS assertion
  x86/build: Keep local relocations with ld.lld
  cpufreq: pmac32: fix possible object reference leak
  cpufreq/pasemi: fix possible object reference leak
  cpufreq: ppc_cbe: fix possible object reference leak
  s390: cio: fix cio_irb declaration
  extcon: arizona: Disable mic detect if running when driver is removed
  PM / core: Propagate dev->power.wakeup_path when no callbacks
  mmc: sdhci-of-esdhc: add erratum eSDHC-A001 and A-008358 support
  mmc: sdhci-of-esdhc: add erratum eSDHC5 support
  mmc_spi: add a status check for spi_sync_locked
  scsi: libsas: Do discovery on empty PHY to update PHY info
  hwmon: (f71805f) Use request_muxed_region for Super-IO accesses
  hwmon: (pc87427) Use request_muxed_region for Super-IO accesses
  hwmon: (smsc47b397) Use request_muxed_region for Super-IO accesses
  hwmon: (smsc47m1) Use request_muxed_region for Super-IO accesses
  hwmon: (vt1211) Use request_muxed_region for Super-IO accesses
  RDMA/cxgb4: Fix null pointer dereference on alloc_skb failure
  i40e: don't allow changes to HW VLAN stripping on active port VLANs
  x86/irq/64: Limit IST stack overflow check to #DB stack
  USB: core: Don't unbind interfaces following device reset failure
  sched/core: Handle overflow in cpu_shares_write_u64
  sched/core: Check quota and period overflow at usec to nsec conversion
  powerpc/numa: improve control of topology updates
  media: pvrusb2: Prevent a buffer overflow
  media: au0828: Fix NULL pointer dereference in au0828_analog_stream_enable()
  audit: fix a memory leak bug
  media: ov2659: make S_FMT succeed even if requested format doesn't match
  media: au0828: stop video streaming only when last user stops
  media: ov6650: Move v4l2_clk_get() to ov6650_video_probe() helper
  media: coda: clear error return value before picture run
  dmaengine: at_xdmac: remove BUG_ON macro in tasklet
  pinctrl: pistachio: fix leaked of_node references
  HID: logitech-hidpp: use RAP instead of FAP to get the protocol version
  mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions
  x86/mm: Remove in_nmi() warning from 64-bit implementation of vmalloc_fault()
  smpboot: Place the __percpu annotation correctly
  x86/build: Move _etext to actual end of .text
  bcache: avoid clang -Wunintialized warning
  bcache: add failure check to run_cache_set() for journal replay
  bcache: fix failure in journal relplay
  bcache: return error immediately in bch_journal_replay()
  net: cw1200: fix a NULL pointer dereference
  mwifiex: prevent an array overflow
  ASoC: fsl_sai: Update is_slave_mode with correct value
  mac80211/cfg80211: update bss channel on channel switch
  dmaengine: pl330: _stop: clear interrupt status
  w1: fix the resume command API
  rtc: 88pm860x: prevent use-after-free on device remove
  brcm80211: potential NULL dereference in brcmf_cfg80211_vndr_cmds_dcmd_handler()
  spi: pxa2xx: fix SCR (divisor) calculation
  ASoC: imx: fix fiq dependencies
  powerpc/boot: Fix missing check of lseek() return value
  mmc: core: Verify SD bus width
  cxgb4: Fix error path in cxgb4_init_module
  gfs2: Fix lru_count going negative
  tools include: Adopt linux/bits.h
  perf tools: No need to include bitops.h in util.h
  at76c50x-usb: Don't register led_trigger if usb_register_driver failed
  ssb: Fix possible NULL pointer dereference in ssb_host_pcmcia_exit
  media: vivid: use vfree() instead of kfree() for dev->bitmap_cap
  media: cpia2: Fix use-after-free in cpia2_exit
  fbdev: fix WARNING in __alloc_pages_nodemask bug
  hugetlb: use same fault hash key for shared and private mappings
  fbdev: fix divide error in fb_var_to_videomode
  btrfs: sysfs: don't leak memory when failing add fsid
  Btrfs: fix race between ranged fsync and writeback of adjacent ranges
  gfs2: Fix sign extension bug in gfs2_update_stats
  crypto: vmx - CTR: always increment IV as quadword
  Revert "scsi: sd: Keep disk read-only when re-reading partition"
  bio: fix improper use of smp_mb__before_atomic()
  KVM: x86: fix return value for reserved EFER
  ext4: do not delete unlinked inode from orphan list on failed truncate
  fbdev: sm712fb: fix memory frequency by avoiding a switch/case fallthrough
  btrfs: Honour FITRIM range constraints during free space trim
  md/raid: raid5 preserve the writeback action after the parity check
  Revert "Don't jump to compute_result state from check_result state"
  perf bench numa: Add define for RUSAGE_THREAD if not present
  ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
  power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
  KVM: arm/arm64: Ensure vcpu target is unset on reset failure
  xfrm4: Fix uninitialized memory read in _decode_session4
  vti4: ipip tunnel deregistration fixes.
  xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
  xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
  dm delay: fix a crash when invalid device is specified
  PCI: Mark Atheros AR9462 to avoid bus reset
  fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting
  fbdev: sm712fb: use 1024x768 by default on non-MIPS, fix garbled display
  fbdev: sm712fb: fix support for 1024x768-16 mode
  fbdev: sm712fb: fix crashes during framebuffer writes by correctly mapping VRAM
  fbdev: sm712fb: fix boot screen glitch when sm712fb replaces VGA
  fbdev: sm712fb: fix white screen of death on reboot, don't set CR3B-CR3F
  fbdev: sm712fb: fix VRAM detection, don't set SR70/71/74/75
  fbdev: sm712fb: fix brightness control on reboot, don't set SR30
  perf intel-pt: Fix sample timestamp wrt non-taken branches
  perf intel-pt: Fix improved sample timestamp
  perf intel-pt: Fix instructions sampling rate
  memory: tegra: Fix integer overflow on tick value calculation
  tracing: Fix partial reading of trace event's id file
  ceph: flush dirty inodes before proceeding with remount
  iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114
  fuse: honor RLIMIT_FSIZE in fuse_file_fallocate
  fuse: fix writepages on 32bit
  clk: tegra: Fix PLLM programming on Tegra124+ when PMC overrides divider
  NFS4: Fix v4.0 client state corruption when mount
  media: ov6650: Fix sensor possibly not detected on probe
  cifs: fix strcat buffer overflow and reduce raciness in smb21_set_oplock_level()
  of: fix clang -Wunsequenced for be32_to_cpu()
  intel_th: msu: Fix single mode with IOMMU
  md: add mddev->pers to avoid potential NULL pointer dereference
  stm class: Fix channel free in stm output free path
  tipc: fix modprobe tipc failed after switch order of device registration
  tipc: switch order of device registration to fix a crash
  ppp: deflate: Fix possible crash in deflate_init
  net/mlx4_core: Change the error print to info print
  net: avoid weird emergency message
  KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes
  ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug
  ext4: zero out the unused memory region in the extent tree block
  fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
  writeback: synchronize sync(2) against cgroup writeback membership switches
  crypto: arm/aes-neonbs - don't access already-freed walk.iv
  crypto: salsa20 - don't access already-freed walk.iv
  crypto: chacha20poly1305 - set cra_name correctly
  crypto: gcm - fix incompatibility between "gcm" and "gcm_base"
  crypto: gcm - Fix error return code in crypto_gcm_create_common()
  ipmi:ssif: compare block number correctly for multi-part return messages
  bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim()
  bcache: fix a race between cache register and cacheset unregister
  Btrfs: do not start a transaction at iterate_extent_inodes()
  ext4: fix ext4_show_options for file systems w/o journal
  ext4: actually request zeroing of inode table after grow
  tty/vt: fix write/write race in ioctl(KDSKBSENT) handler
  mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L
  ocfs2: fix ocfs2 read inode data panic in ocfs2_iget
  mm/mincore.c: make mincore() more conservative
  ASoC: RT5677-SPI: Disable 16Bit SPI Transfers
  ASoC: max98090: Fix restore of DAPM Muxes
  ALSA: hda/realtek - EAPD turn on later
  ALSA: hda/hdmi - Consider eld_valid when reporting jack event
  ALSA: usb-audio: Fix a memory leak bug
  crypto: x86/crct10dif-pcl - fix use via crypto_shash_digest()
  crypto: crct10dif-generic - fix use via crypto_shash_digest()
  crypto: vmx - fix copy-paste error in CTR mode
  ARM: exynos: Fix a leaked reference by adding missing of_node_put
  x86/speculation/mds: Improve CPU buffer clear documentation
  x86/speculation/mds: Revert CPU buffer clear on double fault exit
  f2fs: link f2fs quota ops for sysfile
  fs: sdcardfs: Add missing option to show_options

Conflicts:
	drivers/scsi/sd.c
	drivers/scsi/ufs/ufshcd.c

Change-Id: If6679c7cc8c3fee323c749ac359353fbebfd12d9
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2019-06-12 13:53:42 +05:30