android_kernel_oneplus_msm8998/drivers
Douglas Anderson 1a2b3e7807 UPSTREAM: dm bufio: avoid sleeping while holding the dm_bufio lock
We've seen in-field reports showing _lots_ (18 in one case, 41 in
another) of tasks all sitting there blocked on:

  mutex_lock+0x4c/0x68
  dm_bufio_shrink_count+0x38/0x78
  shrink_slab.part.54.constprop.65+0x100/0x464
  shrink_zone+0xa8/0x198

In the two cases analyzed, we see one task that looks like this:

  Workqueue: kverityd verity_prefetch_io

  __switch_to+0x9c/0xa8
  __schedule+0x440/0x6d8
  schedule+0x94/0xb4
  schedule_timeout+0x204/0x27c
  schedule_timeout_uninterruptible+0x44/0x50
  wait_iff_congested+0x9c/0x1f0
  shrink_inactive_list+0x3a0/0x4cc
  shrink_lruvec+0x418/0x5cc
  shrink_zone+0x88/0x198
  try_to_free_pages+0x51c/0x588
  __alloc_pages_nodemask+0x648/0xa88
  __get_free_pages+0x34/0x7c
  alloc_buffer+0xa4/0x144
  __bufio_new+0x84/0x278
  dm_bufio_prefetch+0x9c/0x154
  verity_prefetch_io+0xe8/0x10c
  process_one_work+0x240/0x424
  worker_thread+0x2fc/0x424
  kthread+0x10c/0x114

...and that looks to be the one holding the mutex.

The problem has been reproduced fairly easily:
0. Be running Chrome OS w/ verity enabled on the root filesystem
1. Pick test patch: http://crosreview.com/412360
2. Install launchBalloons.sh and balloon.arm from
     http://crbug.com/468342
   ...that's just a memory stress test app.
3. On a 4GB rk3399 machine, run
     nice ./launchBalloons.sh 4 900 100000
   ...that tries to eat 4 * 900 MB of memory and keep accessing.
4. Log in to the Chrome web browser and restore many tabs

With that, I've seen printouts like:
  DOUG: long bufio 90758 ms
...and the stack trace always shows we're in dm_bufio_prefetch().

The problem is that we try to allocate memory with GFP_NOIO while
we're holding the dm_bufio lock.  Instead we should be using
GFP_NOWAIT.  Using GFP_NOIO can cause us to sleep while holding the
lock and that causes the above problems.
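
As a rough illustration of the idea (not the kernel patch itself), the
locking pattern can be modeled in userspace with a pthread mutex. All
names below (client_lock, alloc_nowait, alloc_may_sleep,
alloc_buffer_locked, simulate_pressure) are hypothetical stand-ins:
alloc_nowait plays the role of a GFP_NOWAIT attempt that never sleeps,
and alloc_may_sleep plays the role of a GFP_NOIO attempt that may
sleep. The sketch assumes the caller falls back to the sleeping
allocation only after dropping the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the dm_bufio client lock. */
static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

/* Knob for the model: when true, the non-blocking attempt fails,
 * as if memory were tight. */
static bool simulate_pressure;

/* Records whether client_lock was free during the most recent
 * potentially-sleeping allocation. */
static bool lock_was_free_during_slow_alloc;

/* Models a GFP_NOWAIT-style attempt: returns immediately, never sleeps. */
static void *alloc_nowait(size_t size)
{
	return simulate_pressure ? NULL : malloc(size);
}

/* Models a GFP_NOIO-style attempt: it may sleep, so the caller must
 * not hold client_lock. The trylock only probes the lock state. */
static void *alloc_may_sleep(size_t size)
{
	if (pthread_mutex_trylock(&client_lock) == 0) {
		lock_was_free_during_slow_alloc = true;
		pthread_mutex_unlock(&client_lock);
	} else {
		lock_was_free_during_slow_alloc = false;
	}
	return malloc(size);
}

/* Called with client_lock held; returns with client_lock held. */
static void *alloc_buffer_locked(size_t size)
{
	void *buf = alloc_nowait(size);	/* fast path: never sleeps */

	if (buf)
		return buf;

	/* Slow path: drop the lock before anything that can sleep. */
	pthread_mutex_unlock(&client_lock);
	buf = alloc_may_sleep(size);
	pthread_mutex_lock(&client_lock);
	return buf;
}
```

The point of the model is only that the lock is never held across the
call that can sleep, so other tasks that need the lock are not stuck
behind a sleeping allocator.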

The current behavior, as explained by David Rientjes:

  It will still try reclaim initially because __GFP_WAIT (or
  __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO.  This is the cause of
  contention on dm_bufio_lock() that the thread holds.  You want to
  pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
  mutex that can be contended by a concurrent slab shrinker (if
  count_objects didn't use a trylock, this pattern would trivially
  deadlock).
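
To make the trylock remark concrete, here is a small userspace model
of a shrinker count callback. It is a sketch under assumptions, not
the dm-bufio code: cache_lock, cached_buffers, shrink_count and the
caller_may_block flag are all hypothetical. The assumption is that a
caller which might already hold the lock (reclaim entered from the
allocating task) must only trylock and report 0 on contention, while
other callers are allowed to block on the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the cache lock and the buffer count. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long cached_buffers = 128;	/* protected by cache_lock */

/* Returns the number of reclaimable buffers. When the caller must not
 * block, a contended lock is reported as "nothing to reclaim" (0)
 * instead of waiting. */
static unsigned long shrink_count(bool caller_may_block)
{
	unsigned long n;

	if (caller_may_block)
		pthread_mutex_lock(&cache_lock);
	else if (pthread_mutex_trylock(&cache_lock) != 0)
		return 0;

	n = cached_buffers;
	pthread_mutex_unlock(&cache_lock);
	return n;
}
```

In this model the trylock path is what keeps the lock holder from
deadlocking on itself, and the blocking path is where other tasks pile
up if the holder sleeps with the lock held.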

This change significantly increases responsiveness of the system while
in this state.  It makes a real difference because it unblocks kswapd.
In the bug report analyzed, kswapd was hung:

   kswapd0         D ffffffc000204fd8     0    72      2 0x00000000
   Call trace:
   [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
   [<ffffffc00090b794>] __schedule+0x440/0x6d8
   [<ffffffc00090bac0>] schedule+0x94/0xb4
   [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
   [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
   [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
   [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
   [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
   [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
   [<ffffffc00030e578>] balance_pgdat+0x328/0x508
   [<ffffffc00030eb7c>] kswapd+0x424/0x51c
   [<ffffffc00023f06c>] kthread+0x10c/0x114
   [<ffffffc000203dd0>] ret_from_fork+0x10/0x40

By unblocking kswapd, memory pressure should be reduced.

Change-Id: I10da1bcb02160d75320c16259a54b5de4aafede1
Suggested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
(cherry picked from commit 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc)
Signed-off-by: Minchan Kim <minchan@google.com>
2018-05-22 00:27:36 +00:00
accessibility
acpi ACPI / video: Add quirk to force acpi-video backlight on Samsung 670Z5E 2018-04-24 09:32:06 +02:00
amba ARM: amba: Don't read past the end of sysfs "driver_override" buffer 2018-05-02 07:53:42 -07:00
android UPSTREAM: ANDROID: binder: prevent transactions into own process. 2018-05-11 07:29:13 +00:00
ata libata: Apply NOLPM quirk for SanDisk SD7UB3Q*G1001 SSDs 2018-05-16 10:06:51 +02:00
atm atm: zatm: Fix potential Spectre v1 2018-05-16 10:06:52 +02:00
auxdisplay
base This is the 4.4.129 stable release 2018-04-24 10:42:34 +02:00
bcma
block block/loop: fix deadlock after loop_set_status 2018-04-24 09:32:03 +02:00
bluetooth Revert "Bluetooth: btusb: Fix quirk for Atheros 1525/QCA6174" 2018-05-16 10:06:52 +02:00
bus bus: brcmstb_gisb: correct support for 64-bit address output 2018-04-13 19:50:05 +02:00
cdrom cdrom: information leak in cdrom_ioctl_media_changed() 2018-04-29 07:50:07 +02:00
char virtio_console: free buffers after reset 2018-05-02 07:53:40 -07:00
clk clk: bcm2835: De-assert/assert PLL reset signal when appropriate 2018-04-24 09:32:08 +02:00
clocksource clocksource: arch_timer: make virtual counter access configurable 2018-01-09 13:35:07 +01:00
connector
cpufreq This is the 4.4.127 stable release 2018-04-08 16:07:37 +02:00
cpuidle This is the 4.4.128 stable release 2018-04-14 15:35:32 +02:00
crypto crypto: s5p-sss - Fix kernel Oops in AES-ECB mode 2018-02-25 11:03:55 +01:00
dca
devfreq PM / devfreq: Propagate error from devfreq_add_device() 2018-02-22 15:44:58 +01:00
dio
dma dmaengine: at_xdmac: fix rare residue corruption 2018-04-24 09:32:08 +02:00
dma-buf
edac EDAC, mv64x60: Fix an error handling path 2018-04-13 19:50:23 +02:00
eisa
extcon extcon: palmas: Check the parent instance to prevent the NULL 2017-11-21 09:21:18 +01:00
firewire
firmware This is the 4.4.107 stable release 2017-12-20 10:49:07 +01:00
fmc
fpga
gpio gpio: label descriptors using the device name 2018-04-13 19:50:14 +02:00
gpu This is the 4.4.132 stable release 2018-05-16 11:32:47 +02:00
hid This is the 4.4.129 stable release 2018-04-24 10:42:34 +02:00
hsi HSI: ssi_protocol: double free in ssip_pn_xmit() 2018-03-24 10:58:42 +01:00
hv Drivers: hv: vmbus: fix build warning 2018-02-25 11:03:46 +01:00
hwmon hwmon: (ina2xx) Fix access to uninitialized mutex 2018-04-24 09:32:04 +02:00
hwspinlock
hwtracing coresight: Fix disabling of CoreSight TPIU 2018-03-24 10:58:48 +01:00
i2c i2c: i2c-scmi: add a MS HID 2018-03-24 10:58:41 +01:00
ide
idle idle: i7300: add PCI dependency 2018-02-25 11:03:51 +01:00
iio iio: magnetometer: st_magn_spi: fix spi_device_id table 2018-04-13 19:50:21 +02:00
infiniband IB/mlx5: Use unlimited rate when static rate is not supported 2018-05-16 10:06:48 +02:00
input This is the 4.4.132 stable release 2018-05-16 11:32:47 +02:00
iommu iommu/vt-d: Fix a potential memory leak 2018-04-24 09:32:08 +02:00
ipack
irqchip This is the 4.4.123 stable release 2018-03-22 09:57:28 +01:00
isdn mISDN: Fix a sleep-in-atomic bug 2018-04-13 19:50:16 +02:00
leds leds: pca955x: Correct I2C Functionality 2018-04-13 19:50:09 +02:00
lguest
lightnvm
macintosh
mailbox
mcb
md UPSTREAM: dm bufio: avoid sleeping while holding the dm_bufio lock 2018-05-22 00:27:36 +00:00
media media: v4l2-compat-ioctl32: don't oops on overlay 2018-04-24 09:32:03 +02:00
memory ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure 2017-12-16 10:33:51 +01:00
memstick
message scsi: mptsas: Disable WRITE SAME 2018-04-29 07:50:06 +02:00
mfd mfd: palmas: Reset the POWERHOLD mux during power off 2018-03-24 10:58:44 +01:00
misc This is the 4.4.128 stable release 2018-04-14 15:35:32 +02:00
mmc This is the 4.4.129 stable release 2018-04-24 10:42:34 +02:00
mtd This is the 4.4.132 stable release 2018-05-16 11:32:47 +02:00
net This is the 4.4.132 stable release 2018-05-16 11:32:47 +02:00
nfc This is the 4.4.123 stable release 2018-03-22 09:57:28 +01:00
ntb ntb_transport: fix bug calculating num_qps_mw 2017-08-30 10:19:29 +02:00
nubus
nvdimm libnvdimm, namespace: make 'resource' attribute only readable by root 2017-11-30 08:37:23 +00:00
nvme nvme: Fix managing degraded controllers 2018-02-16 20:09:47 +01:00
nvmem
of This is the 4.4.123 stable release 2018-03-22 09:57:28 +01:00
oprofile
parisc parisc: Hide Diva-built-in serial aux and graphics card 2018-01-02 20:33:20 +01:00
parport parport_pc: Add support for WCH CH382L PCI-E single parallel port card. 2018-04-08 11:52:00 +02:00
pci ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status() 2018-04-24 09:32:06 +02:00
pcmcia
perf This is the 4.4.123 stable release 2018-03-22 09:57:28 +01:00
phy phy: work around 'phys' references to usb-nop-xceiv devices 2018-01-23 19:50:16 +01:00
pinctrl pinctrl: Really force states during suspend/resume 2018-03-24 10:58:48 +01:00
platform goldfish: pipe: ANDROID: mark local functions static 2018-05-11 11:21:59 -07:00
pnp
power This is the 4.4.124 stable release 2018-03-25 10:51:55 +02:00
powercap PowerCap: Fix an error code in powercap_register_zone() 2018-04-13 19:50:05 +02:00
pps
ps3
ptp time: Change posix clocks ops interfaces to use timespec64 2018-03-24 10:58:40 +01:00
pwm pwm: tegra: Increase precision in PWM rate calculation 2018-03-22 09:23:27 +01:00
rapidio
ras
regulator regulator: anatop: set default voltage selector for pcie 2018-03-24 10:58:40 +01:00
remoteproc
reset
rpmsg
rtc This is the 4.4.128 stable release 2018-04-14 15:35:32 +02:00
s390 s390/cio: update chpid descriptor after resource accessibility event 2018-04-29 07:50:07 +02:00
sbus
scsi This is the 4.4.131 stable release 2018-05-02 11:10:46 -07:00
sfi
sh
sn
soc
spi spi: davinci: fix up dma_mapping_error() incorrect patch 2018-04-08 11:52:02 +02:00
spmi
ssb ssb: mark ssb_bus_register as __maybe_unused 2018-02-25 11:03:44 +01:00
staging FROMLIST: staging: Fix sparse warnings in vsoc driver. 2018-05-03 12:34:17 -07:00
target tcm_fileio: Prevent information leak for short reads 2018-03-24 10:58:45 +01:00
tc
tee BACKPORT: tee: shm: Potential NULL dereference calling tee_shm_register() 2018-02-21 15:40:49 +00:00
thermal thermal: imx: Fix race condition in imx_thermal_probe() 2018-04-24 09:32:08 +02:00
thunderbolt thunderbolt: Resume control channel after hibernation image is created 2018-04-24 09:32:07 +02:00
tty This is the 4.4.131 stable release 2018-05-02 11:10:46 -07:00
uio
usb This is the 4.4.132 stable release 2018-05-16 11:32:47 +02:00
uwb uwb: ensure that endpoint is interrupt 2017-10-12 11:27:35 +02:00
vfio vfio/pci: Virtualize Maximum Read Request Size 2018-04-24 09:32:09 +02:00
vhost vhost: correctly remove wait queue during poll failure 2018-04-13 19:50:25 +02:00
video This is the 4.4.128 stable release 2018-04-14 15:35:32 +02:00
virt
virtio virtio_balloon: prevent uninitialized variable use 2018-02-25 11:03:42 +01:00
vlynq
vme
w1
watchdog watchdog: f71808e_wdt: Fix WD_EN register read 2018-04-24 09:32:08 +02:00
xen xen/gntdev: Fix partial gntdev_mmap() cleanup 2018-03-03 10:19:45 +01:00
zorro
Kconfig tee: generic TEE subsystem 2017-12-02 06:53:27 +00:00
Makefile This is the 4.4.118 stable release 2018-02-26 09:24:57 +01:00