Commit graph

10164 commits

Author SHA1 Message Date
Greg Kroah-Hartman
583bdda5ea This is the 4.4.204 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3gA2oACgkQONu9yGCS
 aT7rfQ//c4X05XMCcC7uHpMX43BvzLYIRLMt81PrLuOIJWloyzKZQ6/24smCVqHS
 AER8+DVvVORLKMyXV5fEwPubXfeckAEqjTFUyI3vvAyxtQA4MYMW+a6b/GIyoZG0
 jjGBKYUGwSYsSD1nTmfiGkX8tbCQqYcQzRMk0N6drefluVo18Dxn59J+2Q4hBaRi
 /PQ2XKb9upW7Lq63rfnfgoBHgllI+Jkfl9MW8xuMnTZFda1a9xKqpNpxycQMLT5b
 wtSa8S30Tt10boQcJsj/yeG9vxiCHMNjpju3Z9DBSbAKdcZQI/DvKxh0cFk39pSi
 mvH3rW/CBEjR0+7hX46gu51mVIcIObiqz45BO5ln6KN0yC1s1EuDHYRxnyyoaC+i
 +kmxrAuO2i+S9aYtbODnHclUB7n6LxUrCmHwYtBLEwez1Cha6kH2hC2+SB08H7a8
 2PwTPbgvuwfuHloUNNC0svfCBwy/RQJRPf5NQacuZqHriAJOVuUwRFoBweWqozsS
 BVbrA1KQtR43/xjcKfQJVvnOQr923MkZ1r8qx1USXOhoZLhFXUe1yJW5gO88i3IT
 qOTRR/zisINt7Cw0KBzLiTD1sxxffkLjjg7+Mzoci6C6KHpLVXkf6BbFbD5u6XzG
 CvxzznMtyPqVyIepFi0+PT8q5+XGALSLzLo8gt3x5q+WP7h7JCY=
 =+alR
 -----END PGP SIGNATURE-----

Merge 4.4.204 into android-4.4-p

Changes in 4.4.204
	net/mlx4_en: fix mlx4 ethtool -N insertion
	sfc: Only cancel the PPS workqueue if it exists
	net/sched: act_pedit: fix WARN() in the traffic path
	net: rtnetlink: prevent underflows in do_setvfinfo()
	Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
	mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
	asus-wmi: Create quirk for airplane_mode LED
	asus-wmi: Add quirk_no_rfkill_wapf4 for the Asus X456UF
	asus-wmi: Add quirk_no_rfkill for the Asus N552VW
	asus-wmi: Add quirk_no_rfkill for the Asus U303LB
	asus-wmi: Add quirk_no_rfkill for the Asus Z550MA
	platform/x86: asus-wmi: Filter buggy scan codes on ASUS Q500A
	platform/x86: asus-wmi: fix asus ux303ub brightness issue
	platform/x86: asus-wmi: Set specified XUSB2PR value for X550LB
	asus-wmi: provide access to ALS control
	platform/x86: asus-wmi: try to set als by default
	platform/x86: asus-nb-wmi: Support ALS on the Zenbook UX430UQ
	platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi
	platform/x86: asus-wmi: add SERIO_I8042 dependency
	mwifiex: Fix NL80211_TX_POWER_LIMITED
	ALSA: isight: fix leak of reference to firewire unit in error path of .probe callback
	printk: fix integer overflow in setup_log_buf()
	gfs2: Fix marking bitmaps non-full
	synclink_gt(): fix compat_ioctl()
	powerpc: Fix signedness bug in update_flash_db()
	powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field
	brcmsmac: AP mode: update beacon when TIM changes
	spi: sh-msiof: fix deferred probing
	mmc: mediatek: fix cannot receive new request when msdc_cmd_is_ready fail
	btrfs: handle error of get_old_root
	gsmi: Fix bug in append_to_eventlog sysfs handler
	misc: mic: fix a DMA pool free failure
	amiflop: clean up on errors during setup
	scsi: ips: fix missing break in switch
	KVM/x86: Fix invvpid and invept register operand size in 64-bit mode
	scsi: isci: Use proper enumerated type in atapi_d2h_reg_frame_handler
	scsi: isci: Change sci_controller_start_task's return type to sci_status
	scsi: iscsi_tcp: Explicitly cast param in iscsi_sw_tcp_host_get_param
	clk: mmp2: fix the clock id for sdh2_clk and sdh3_clk
	scsi: dc395x: fix dma API usage in srb_done
	scsi: dc395x: fix DMA API usage in sg_update_list
	net: fix warning in af_unix
	kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
	ALSA: i2c/cs8427: Fix int to char conversion
	macintosh/windfarm_smu_sat: Fix debug output
	USB: misc: appledisplay: fix backlight update_status return code
	SUNRPC: Fix a compile warning for cmpxchg64()
	atm: zatm: Fix empty body Clang warnings
	s390/perf: Return error when debug_register fails
	spi: omap2-mcspi: Set FIFO DMA trigger level to word length
	sparc: Fix parport build warnings.
	ceph: fix dentry leak in ceph_readdir_prepopulate
	rtc: s35390a: Change buf's type to u8 in s35390a_init
	mISDN: Fix type of switch control variable in ctrl_teimanager
	qlcnic: fix a return in qlcnic_dcb_get_capability()
	mfd: mc13xxx-core: Fix PMIC shutdown when reading ADC values
	mfd: max8997: Enale irq-wakeup unconditionally
	selftests/ftrace: Fix to test kprobe $comm arg only if available
	thermal: rcar_thermal: Prevent hardware access during system suspend
	sparc64: Rework xchg() definition to avoid warnings.
	fs/ocfs2/dlm/dlmdebug.c: fix a sleep-in-atomic-context bug in dlm_print_one_mle()
	mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock
	um: Make line/tty semantics use true write IRQ
	linux/bitmap.h: handle constant zero-size bitmaps correctly
	linux/bitmap.h: fix type of nbits in bitmap_shift_right()
	hfsplus: fix BUG on bnode parent update
	hfs: fix BUG on bnode parent update
	hfsplus: prevent btree data loss on ENOSPC
	hfs: prevent btree data loss on ENOSPC
	hfsplus: fix return value of hfsplus_get_block()
	hfs: fix return value of hfs_get_block()
	fs/hfs/extent.c: fix array out of bounds read of array extent
	igb: shorten maximum PHC timecounter update interval
	ntb_netdev: fix sleep time mismatch
	ntb: intel: fix return value for ndev_vec_mask()
	ocfs2: don't put and assigning null to bh allocated outside
	ocfs2: fix clusters leak in ocfs2_defrag_extent()
	net: do not abort bulk send on BQL status
	sched/fair: Don't increase sd->balance_interval on newidle balance
	audit: print empty EXECVE args
	wlcore: Fix the return value in case of error in 'wlcore_vendor_cmd_smart_config_start()'
	rtl8xxxu: Fix missing break in switch
	brcmsmac: never log "tid x is not agg'able" by default
	wireless: airo: potential buffer overflow in sprintf()
	rtlwifi: rtl8192de: Fix misleading REG_MCUFWDL information
	scsi: mpt3sas: Fix Sync cache command failure during driver unload
	scsi: mpt3sas: Fix driver modifying persistent data in Manufacturing page11
	scsi: megaraid_sas: Fix msleep granularity
	scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces
	dlm: fix invalid free
	dlm: don't leak kernel pointer to userspace
	net: bcmgenet: return correct value 'ret' from bcmgenet_power_down
	sock: Reset dst when changing sk_mark via setsockopt
	pinctrl: qcom: spmi-gpio: fix gpio-hog related boot issues
	pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD
	PCI: keystone: Use quirk to limit MRRS for K2G
	spi: omap2-mcspi: Fix DMA and FIFO event trigger size mismatch
	IB/hfi1: Ensure full Gen3 speed in a Gen4 system
	Bluetooth: Fix invalid-free in bcsp_close()
	ath9k_hw: fix uninitialized variable data
	dm: use blk_set_queue_dying() in __dm_destroy()
	arm64: fix for bad_mode() handler to always result in panic
	cpufreq: Skip cpufreq resume if it's not suspended
	ocfs2: remove ocfs2_is_o2cb_active()
	mmc: block: Fix tag condition with packed writes
	ARC: perf: Accommodate big-endian CPU
	x86/insn: Fix awk regexp warnings
	x86/speculation: Fix incorrect MDS/TAA mitigation status
	x86/speculation: Fix redundant MDS mitigation message
	media: vivid: Set vid_cap_streaming and vid_out_streaming to true
	media: vivid: Fix wrong locking that causes race conditions on streaming stop
	cpufreq: Add NULL checks to show() and store() methods of cpufreq
	media: b2c2-flexcop-usb: add sanity checking
	media: cxusb: detect cxusb_ctrl_msg error in query
	media: imon: invalid dereference in imon_touch_event
	virtio_console: reset on out of memory
	virtio_console: don't tie bufs to a vq
	virtio_console: allocate inbufs in add_port() only if it is needed
	virtio_console: fix uninitialized variable use
	virtio_console: drop custom control queue cleanup
	virtio_console: move removal code
	usb-serial: cp201x: support Mark-10 digital force gauge
	appledisplay: fix error handling in the scheduled work
	USB: serial: mos7840: add USB ID to support Moxa UPort 2210
	USB: serial: mos7720: fix remote wakeup
	USB: serial: mos7840: fix remote wakeup
	USB: serial: option: add support for DW5821e with eSIM support
	USB: serial: option: add support for Foxconn T77W968 LTE modules
	staging: comedi: usbduxfast: usbduxfast_ai_cmdtest rounding error
	powerpc/64s: support nospectre_v2 cmdline option
	powerpc/book3s64: Fix link stack flush on context switch
	KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel
	Linux 4.4.204

Change-Id: I63f64a109a8797f479bc7226be23ca591fa01b1c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-11-28 18:42:19 +01:00
Dave Chinner
29f9d8d958 mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock
[ Upstream commit 64081362e8ff4587b4554087f3cfc73d3e0a4cd7 ]

We've recently seen a workload on XFS filesystems with a repeatable
deadlock between background writeback and a multi-process application
doing concurrent writes and fsyncs to a small range of a file.

range_cyclic
writeback		Process 1		Process 2

xfs_vm_writepages
  write_cache_pages
    writeback_index = 2
    cycled = 0
    ....
    find page 2 dirty
    lock Page 2
    ->writepage
      page 2 writeback
      page 2 clean
      page 2 added to bio
    no more pages
			write()
			locks page 1
			dirties page 1
			locks page 2
			dirties page 1
			fsync()
			....
			xfs_vm_writepages
			write_cache_pages
			  start index 0
			  find page 1 towrite
			  lock Page 1
			  ->writepage
			    page 1 writeback
			    page 1 clean
			    page 1 added to bio
			  find page 2 towrite
			  lock Page 2
			  page 2 is writeback
			  <blocks>
						write()
						locks page 1
						dirties page 1
						fsync()
						....
						xfs_vm_writepages
						write_cache_pages
						  start index 0

    !done && !cycled
      sets index to 0, restarts lookup
    find page 1 dirty
						  find page 1 towrite
						  lock Page 1
						  page 1 is writeback
						  <blocks>

    lock Page 1
    <blocks>

DEADLOCK because:

	- process 1 needs page 2 writeback to complete to make
	  enough progress to issue IO pending for page 1
	- writeback needs page 1 writeback to complete so process 2
	  can progress and unlock the page it is blocked on, then it
	  can issue the IO pending for page 2
	- process 2 can't make progress until process 1 issues IO
	  for page 1

The underlying cause of the problem here is that range_cyclic writeback is
processing pages in descending index order as we hold higher index pages
in a structure controlled from above write_cache_pages().  The
write_cache_pages() caller needs to be able to submit these pages for IO
before write_cache_pages restarts writeback at mapping index 0 to avoid
wcp inverting the page lock/writeback wait order.

generic_writepages() is not susceptible to this bug as it has no private
context held across write_cache_pages() - filesystems using this
infrastructure always submit pages in ->writepage immediately and so there
is no problem with range_cyclic going back to mapping index 0.

However:
	mpage_writepages() has a private bio context,
	exofs_writepages() has page_collect
	fuse_writepages() has fuse_fill_wb_data
	nfs_writepages() has nfs_pageio_descriptor
	xfs_vm_writepages() has xfs_writepage_ctx

All of these ->writepages implementations can hold pages under writeback
in their private structures until write_cache_pages() returns, and hence
they are all susceptible to this deadlock.

Also worth noting is that ext4 has it's own bastardised version of
write_cache_pages() and so it /may/ have an equivalent deadlock.  I looked
at the code long enough to understand that it has a similar retry loop for
range_cyclic writeback reaching the end of the file and then promptly ran
away before my eyes bled too much.  I'll leave it for the ext4 developers
to determine if their code is actually has this deadlock and how to fix it
if it has.

There's a few ways I can see avoid this deadlock.  There's probably more,
but these are the first I've though of:

1. get rid of range_cyclic altogether

2. range_cyclic always stops at EOF, and we start again from
writeback index 0 on the next call into write_cache_pages()

2a. wcp also returns EAGAIN to ->writepages implementations to
indicate range cyclic has hit EOF. writepages implementations can
then flush the current context and call wpc again to continue. i.e.
lift the retry into the ->writepages implementation

3. range_cyclic uses trylock_page() rather than lock_page(), and it
skips pages it can't lock without blocking. It will already do this
for pages under writeback, so this seems like a no-brainer

3a. all non-WB_SYNC_ALL writeback uses trylock_page() to avoid
blocking as per pages under writeback.

I don't think #1 is an option - range_cyclic prevents frequently
dirtied lower file offset from starving background writeback of
rarely touched higher file offsets.

#2 is simple, and I don't think it will have any impact on
performance as going back to the start of the file implies an
immediate seek. We'll have exactly the same number of seeks if we
switch writeback to another inode, and then come back to this one
later and restart from index 0.

#2a is pretty much "status quo without the deadlock". Moving the
retry loop up into the wcp caller means we can issue IO on the
pending pages before calling wcp again, and so avoid locking or
waiting on pages in the wrong order. I'm not convinced we need to do
this given that we get the same thing from #2 on the next writeback
call from the writeback infrastructure.

#3 is really just a band-aid - it doesn't fix the access/wait
inversion problem, just prevents it from becoming a deadlock
situation. I'd prefer we fix the inversion, not sweep it under the
carpet like this.

#3a is really an optimisation that just so happens to include the
band-aid fix of #3.

So it seems that the simplest way to fix this issue is to implement
solution #2

Link: http://lkml.kernel.org/r/20181005054526.21507-1-david@fromorbit.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.de>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28 18:25:52 +01:00
Andrey Ryabinin
4fed90d99b mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
commit 9a63236f1ad82d71a98aa80320b6cb618fb32f44 upstream.

It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in
remove_stable_node() when it races with __mmput() and squeezes in
between ksm_exit() and exit_mmap().

  WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150

  Call Trace:
   remove_all_stable_nodes+0x12b/0x330
   run_store+0x4ef/0x7b0
   kernfs_fop_write+0x200/0x420
   vfs_write+0x154/0x450
   ksys_write+0xf9/0x1d0
   do_syscall_64+0x99/0x510
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Remove the warning as there is nothing scary going on.

Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com
Fixes: cbf86cfe04 ("ksm: remove old stable nodes more thoroughly")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-28 18:25:27 +01:00
Greg Kroah-Hartman
40ef73d67a This is the 4.4.203 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3b60UACgkQONu9yGCS
 aT7SHA//a75vH8zxZnVvNaDBbpw6GdvWAXiDuwFiaikG/UHOLFjv08aE/+QiuJz/
 AX94klb25jHsXVvMEk79lyDanQYGrrbfuXR6XxY+Q4N8dEdVmp+fBmM+Q/sktdOA
 M6BsAYuim0Ttz/Rv1Vb+dm8U5KlSpqBmqGs/aBSvpVMGCb9AKGbUNF3k4jB42xOU
 zHhyfG2u3K2YU7MbH9b6bktV7Q7ZpqQYD0qDT9aa9Mx1A1z9/mB4CVWjpCvhKPD7
 Dsjuz+/1+lBfvElLKxV1J9Xg+RI4kaqkv42gBydWP/PpsNKvZorZ5X1oFy/a5JSB
 qj4C6FkpTJmvJ0QLISS6s+vC6bEn2G+ojUT4UkgUKlsORyjQBV4twJTVUnX71vNC
 BVOgd/KNBUtu919JRL8Jr39ZTEUkpkhF6XbMjuCiKtoyDN46z13gi9ul54T+Go6S
 npyOBxK2QRbOfo+5b1XSqswfcbOOSTEk4WkSXtYO6XLojl7XRFsCYnxVm50Rc201
 U8nA/Mkk3FunSS21lGbm4e2SCPsVjiyewtolqc5J/4BY/l2y6vkYCEqVMJNelIP+
 cwN81i0Ugwp3v1Zj05dTlxFB8RduZoIIJmJdtrFczdg6gT44qtZR2GsIBMlBaxR/
 PaIYg2MSkWv8ednnPS05d1shgZXczr4aVI3pkj0e5mESu7Q8cRA=
 =NAKA
 -----END PGP SIGNATURE-----

Merge 4.4.203 into android-4.4-p

Changes in 4.4.203
	slip: Fix memory leak in slip_open error path
	ax88172a: fix information leak on short answers
	ALSA: usb-audio: Fix missing error check at mixer resolution test
	ALSA: usb-audio: not submit urb for stopped endpoint
	Input: ff-memless - kill timer in destroy()
	ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
	ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
	iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros
	mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
	mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
	mmc: sdhci-of-at91: fix quirk2 overwrite
	iio: dac: mcp4922: fix error handling in mcp4922_write_raw
	ALSA: pcm: signedness bug in snd_pcm_plug_alloc()
	ARM: dts: at91/trivial: Fix USART1 definition for at91sam9g45
	ALSA: seq: Do error checks at creating system ports
	gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
	ASoC: dpcm: Properly initialise hw->rate_max
	MIPS: BCM47XX: Enable USB power on Netgear WNDR3400v3
	ARM: dts: exynos: Fix sound in Snow-rev5 Chromebook
	i40e: use correct length for strncpy
	i40e: hold the rtnl lock on clearing interrupt scheme
	i40e: Prevent deleting MAC address from VF when set by PF
	ARM: dts: pxa: fix power i2c base address
	rtl8187: Fix warning generated when strncpy() destination length matches the sixe argument
	net: lan78xx: Bail out if lan78xx_get_endpoints fails
	ASoC: sgtl5000: avoid division by zero if lo_vag is zero
	ath10k: wmi: disable softirq's while calling ieee80211_rx
	mips: txx9: fix iounmap related issue
	of: make PowerMac cache node search conditional on CONFIG_PPC_PMAC
	ARM: dts: omap3-gta04: give spi_lcd node a label so that we can overwrite in other DTS files
	ARM: dts: omap3-gta04: tvout: enable as display1 alias
	ARM: dts: omap3-gta04: make NAND partitions compatible with recent U-Boot
	ARM: dts: omap3-gta04: keep vpll2 always on
	dmaengine: dma-jz4780: Further residue status fix
	signal: Always ignore SIGKILL and SIGSTOP sent to the global init
	signal: Properly deliver SIGILL from uprobes
	signal: Properly deliver SIGSEGV from x86 uprobes
	scsi: sym53c8xx: fix NULL pointer dereference panic in sym_int_sir()
	ARM: imx6: register pm_power_off handler if "fsl,pmic-stby-poweroff" is set
	scsi: pm80xx: Corrected dma_unmap_sg() parameter
	scsi: pm80xx: Fixed system hang issue during kexec boot
	kprobes: Don't call BUG_ON() if there is a kprobe in use on free list
	nvmem: core: return error code instead of NULL from nvmem_device_get
	media: fix: media: pci: meye: validate offset to avoid arbitrary access
	ALSA: intel8x0m: Register irq handler after register initializations
	pinctrl: at91-pio4: fix has_config check in atmel_pctl_dt_subnode_to_map()
	llc: avoid blocking in llc_sap_close()
	powerpc/vdso: Correct call frame information
	ARM: dts: socfpga: Fix I2C bus unit-address error
	pinctrl: at91: don't use the same irqchip with multiple gpiochips
	cxgb4: Fix endianness issue in t4_fwcache()
	power: supply: ab8500_fg: silence uninitialized variable warnings
	power: supply: max8998-charger: Fix platform data retrieval
	kernfs: Fix range checks in kernfs_get_target_path
	s390/qeth: invoke softirqs after napi_schedule()
	PCI/ACPI: Correct error message for ASPM disabling
	serial: mxs-auart: Fix potential infinite loop
	powerpc/iommu: Avoid derefence before pointer check
	powerpc/64s/hash: Fix stab_rr off by one initialization
	powerpc/pseries: Disable CPU hotplug across migrations
	libfdt: Ensure INT_MAX is defined in libfdt_env.h
	power: supply: twl4030_charger: fix charging current out-of-bounds
	power: supply: twl4030_charger: disable eoc interrupt on linear charge
	net: toshiba: fix return type of ndo_start_xmit function
	net: xilinx: fix return type of ndo_start_xmit function
	net: broadcom: fix return type of ndo_start_xmit function
	net: amd: fix return type of ndo_start_xmit function
	usb: chipidea: Fix otg event handler
	ARM: dts: am335x-evm: fix number of cpsw
	ARM: dts: ux500: Correct SCU unit address
	ARM: dts: ux500: Fix LCDA clock line muxing
	ARM: dts: ste: Fix SPI controller node names
	cpufeature: avoid warning when compiling with clang
	bnx2x: Ignore bandwidth attention in single function mode
	net: micrel: fix return type of ndo_start_xmit function
	x86/CPU: Use correct macros for Cyrix calls
	MIPS: kexec: Relax memory restriction
	media: pci: ivtv: Fix a sleep-in-atomic-context bug in ivtv_yuv_init()
	media: davinci: Fix implicit enum conversion warning
	usb: gadget: uvc: configfs: Drop leaked references to config items
	usb: gadget: uvc: configfs: Prevent format changes after linking header
	usb: gadget: uvc: Factor out video USB request queueing
	usb: gadget: uvc: Only halt video streaming endpoint in bulk mode
	misc: kgdbts: Fix restrict error
	misc: genwqe: should return proper error value.
	vfio/pci: Fix potential memory leak in vfio_msi_cap_len
	scsi: libsas: always unregister the old device if going to discover new
	ARM: dts: tegra30: fix xcvr-setup-use-fuses
	ARM: tegra: apalis_t30: fix mmc1 cmd pull-up
	net: smsc: fix return type of ndo_start_xmit function
	EDAC: Raise the maximum number of memory controllers
	Bluetooth: L2CAP: Detect if remote is not able to use the whole MPS
	arm64: dts: amd: Fix SPI bus warnings
	fuse: use READ_ONCE on congestion_threshold and max_background
	Bluetooth: hci_ldisc: Fix null pointer derefence in case of early data
	Bluetooth: hci_ldisc: Postpone HCI_UART_PROTO_READY bit set in hci_uart_set_proto()
	memfd: Use radix_tree_deref_slot_protected to avoid the warning.
	slcan: Fix memory leak in error path
	net: cdc_ncm: Signedness bug in cdc_ncm_set_dgram_size()
	x86/atomic: Fix smp_mb__{before,after}_atomic()
	apparmor: fix uninitialized lsm_audit member
	apparmor: fix update the mtime of the profile file on replacement
	apparmor: fix module parameters can be changed after policy is locked
	kprobes/x86: Prohibit probing on exception masking instructions
	uprobes/x86: Prohibit probing on MOV SS instruction
	fbdev: Remove unused SH-Mobile HDMI driver
	fbdev: Ditch fb_edid_add_monspecs
	block: introduce blk_rq_is_passthrough
	libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
	net: ovs: fix return type of ndo_start_xmit function
	f2fs: return correct errno in f2fs_gc
	SUNRPC: Fix priority queue fairness
	ath10k: fix vdev-start timeout on error
	ath9k: fix reporting calculated new FFT upper max
	usb: gadget: udc: fotg210-udc: Fix a sleep-in-atomic-context bug in fotg210_get_status()
	nl80211: Fix a GET_KEY reply attribute
	dmaengine: ep93xx: Return proper enum in ep93xx_dma_chan_direction
	dmaengine: timb_dma: Use proper enum in td_prep_slave_sg
	mei: samples: fix a signedness bug in amt_host_if_call()
	cxgb4: Use proper enum in cxgb4_dcb_handle_fw_update
	cxgb4: Use proper enum in IEEE_FAUX_SYNC
	powerpc/pseries: Fix DTL buffer registration
	powerpc/pseries: Fix how we iterate over the DTL entries
	mtd: rawnand: sh_flctl: Use proper enum for flctl_dma_fifo0_transfer
	ixgbe: Fix crash with VFs and flow director on interface flap
	IB/mthca: Fix error return code in __mthca_init_one()
	ata: ep93xx: Use proper enums for directions
	ALSA: hda/sigmatel - Disable automute for Elo VuPoint
	KVM: PPC: Book3S PR: Exiting split hack mode needs to fixup both PC and LR
	USB: serial: cypress_m8: fix interrupt-out transfer length
	mtd: physmap_of: Release resources on error
	brcmfmac: fix full timeout waiting for action frame on-channel tx
	NFSv4.x: fix lock recovery during delegation recall
	dmaengine: ioat: fix prototype of ioat_enumerate_channels
	Input: st1232 - set INPUT_PROP_DIRECT property
	x86/olpc: Fix build error with CONFIG_MFD_CS5535=m
	crypto: mxs-dcp - Fix SHA null hashes and output length
	crypto: mxs-dcp - Fix AES issues
	ACPI / SBS: Fix rare oops when removing modules
	fbdev: sbuslib: use checked version of put_user()
	fbdev: sbuslib: integer overflow in sbusfb_ioctl_helper()
	bcache: recal cached_dev_sectors on detach
	proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()
	backlight: lm3639: Unconditionally call led_classdev_unregister
	printk: Give error on attempt to set log buffer length to over 2G
	media: isif: fix a NULL pointer dereference bug
	GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads
	media: cx231xx: fix potential sign-extension overflow on large shift
	x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
	gpio: syscon: Fix possible NULL ptr usage
	spi: spidev: Fix OF tree warning logic
	ARM: 8802/1: Call syscall_trace_exit even when system call skipped
	hwmon: (pwm-fan) Silence error on probe deferral
	mac80211: minstrel: fix CCK rate group streams value
	spi: rockchip: initialize dma_slave_config properly
	arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
	Linux 4.4.203

Change-Id: Icba08e9fbb6f47274ee6fcf1023a1469cd8550d3
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-11-25 17:44:35 +01:00
zhong jiang
905bf98eb3 memfd: Use radix_tree_deref_slot_protected to avoid the warning.
The commit eb4058d8da ("memfd: Fix locking when tagging pins")
introduces the following warning messages.

*WARNING: suspicious RCU usage in memfd_wait_for_pins*

It is because we still use radix_tree_deref_slot without read_rcu_lock.
We should use radix_tree_deref_slot_protected instead in the case.

Cc: stable@vger.kernel.org
Fixes: eb4058d8da ("memfd: Fix locking when tagging pins")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-25 15:54:19 +01:00
Roman Gushchin
69ab55f2bb mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
commit 0362f326d86c645b5e96b7dbc3ee515986ed019d upstream.

An exiting task might belong to an offline cgroup.  In this case an
attempt to grab a cgroup reference from the task can end up with an
infinite loop in hugetlb_cgroup_charge_cgroup(), because neither the
cgroup will become online, neither the task will be migrated to a live
cgroup.

Fix this by switching over to css_tryget().  As css_tryget_online()
can't guarantee that the cgroup won't go offline, in most cases the
check doesn't make sense.  In this particular case users of
hugetlb_cgroup_charge_cgroup() are not affected by this change.

A similar problem is described by commit 18fa84a2db0e ("cgroup: Use
css_tryget() instead of css_tryget_online() in task_get_css()").

Link: http://lkml.kernel.org/r/20191106225131.3543616-2-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-25 15:53:43 +01:00
Roman Gushchin
f023333e92 mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
commit 00d484f354d85845991b40141d40ba9e5eb60faf upstream.

We've encountered a rcu stall in get_mem_cgroup_from_mm():

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 33-....: (21000 ticks this GP) idle=6c6/1/0x4000000000000002 softirq=35441/35441 fqs=5017
  (t=21031 jiffies g=324821 q=95837) NMI backtrace for cpu 33
  <...>
  RIP: 0010:get_mem_cgroup_from_mm+0x2f/0x90
  <...>
   __memcg_kmem_charge+0x55/0x140
   __alloc_pages_nodemask+0x267/0x320
   pipe_write+0x1ad/0x400
   new_sync_write+0x127/0x1c0
   __kernel_write+0x4f/0xf0
   dump_emit+0x91/0xc0
   writenote+0xa0/0xc0
   elf_core_dump+0x11af/0x1430
   do_coredump+0xc65/0xee0
   get_signal+0x132/0x7c0
   do_signal+0x36/0x640
   exit_to_usermode_loop+0x61/0xd0
   do_syscall_64+0xd4/0x100
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

The problem is caused by an exiting task which is associated with an
offline memcg.  We're iterating over and over in the do {} while
(!css_tryget_online()) loop, but obviously the memcg won't become online
and the exiting task won't be migrated to a live memcg.

Let's fix it by switching from css_tryget_online() to css_tryget().

As css_tryget_online() cannot guarantee that the memcg won't go offline,
the check is usually useless, except some rare cases when for example it
determines if something should be presented to a user.

A similar problem is described by commit 18fa84a2db0e ("cgroup: Use
css_tryget() instead of css_tryget_online() in task_get_css()").

Johannes:

: The bug aside, it doesn't matter whether the cgroup is online for the
: callers.  It used to matter when offlining needed to evacuate all charges
: from the memcg, and so needed to prevent new ones from showing up, but we
: don't care now.

Link: http://lkml.kernel.org/r/20191106225131.3543616-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Shakeel Butt <shakeeb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutn <mkoutny@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-25 15:53:43 +01:00
Greg Kroah-Hartman
ef0b39d33a This is the 4.4.201 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3K9lIACgkQONu9yGCS
 aT4PXw/9ExBjUrU6NzDZXAr8h+mR3D9lDzZb3KC3Jn/8bzARG7OHx2i3wsYqQB3p
 a+A5HViZlBJCRl70CkfPmUAgQ0OpLQDnrkW3XPRzpa/x+xE1IBKM1nkmvxofD3Jh
 HRW8eD8YcIR2vqzYhYiSpqKYMfMYcKfSl7XYs6QBGMsRcbDs2O8+KP+S4Z5wm3ZO
 aCJ6v3GVWhOosE4oDklXm4OxhIQ12IQMtP66j4RskF31wd3iXoUzTQkxJxTLWHpK
 D8e+7oFUCVDRB8kdfmsNOL/HCkazqvZ9ZsuU91P6/f91S9vimzaR7xOzk7XZRxSF
 FRDbe3uwWWvscs4E4MU3cqHQXO1PePdGalty2pzMKQxQzLyh4VOF13d2GmlOjac6
 BV7Yim8En5SSsGh3V1VhRbLBodboFp8paLVBQoXBDJ0ErpTCwxxCzfKfK/+QJ0RD
 esdrcl+iAuz4CFJQLBwfB4iFJDG31lD3sc8IWQ9bx4FDQzZxtPf2UPJJCGF6JvCS
 eiGqO5blbhasuvsGxgBVdAdlpXDssGI6LDDZPy5nxGkMtFs/3Ic6OtjS0V3NQEHt
 2zdeYmGkiZ0OSTYUnlXUfhm1NAp8m3HMGvTD4VU8UDx+cnI2p9FF103/6X4m0dui
 0+7cGeWnAlKxORmOV8C49Pc0OXQ8SJzxoiTF4rF7KU+n1loypgY=
 =IvZj
 -----END PGP SIGNATURE-----

Merge 4.4.201 into android-4.4-p

Changes in 4.4.201
	CDC-NCM: handle incomplete transfer of MTU
	net: fix data-race in neigh_event_send()
	NFC: fdp: fix incorrect free object
	NFC: st21nfca: fix double free
	qede: fix NULL pointer deref in __qede_remove()
	nfc: netlink: fix double device reference drop
	ALSA: bebob: fix to detect configured source of sampling clock for Focusrite Saffire Pro i/o series
	ALSA: hda/ca0132 - Fix possible workqueue stall
	mm, vmstat: hide /proc/pagetypeinfo from normal users
	dump_stack: avoid the livelock of the dump_lock
	perf tools: Fix time sorting
	drm/radeon: fix si_enable_smc_cac() failed issue
	ceph: fix use-after-free in __ceph_remove_cap()
	iio: imu: adis16480: make sure provided frequency is positive
	netfilter: nf_tables: Align nft_expr private data to 64-bit
	netfilter: ipset: Fix an error code in ip_set_sockfn_get()
	can: usb_8dev: fix use-after-free on disconnect
	can: c_can: c_can_poll(): only read status register after status IRQ
	can: peak_usb: fix a potential out-of-sync while decoding packets
	can: gs_usb: gs_can_open(): prevent memory leak
	can: peak_usb: fix slab info leak
	drivers: usb: usbip: Add missing break statement to switch
	configfs: fix a deadlock in configfs_symlink()
	PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30
	scsi: qla2xxx: fixup incorrect usage of host_byte
	scsi: lpfc: Honor module parameter lpfc_use_adisc
	ipvs: move old_secure_tcp into struct netns_ipvs
	bonding: fix unexpected IFF_BONDING bit unset
	usb: fsl: Check memory resource before releasing it
	usb: gadget: udc: atmel: Fix interrupt storm in FIFO mode.
	usb: gadget: composite: Fix possible double free memory bug
	usb: gadget: configfs: fix concurrent issue between composite APIs
	perf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise RIP validity
	USB: Skip endpoints with 0 maxpacket length
	scsi: qla2xxx: stop timer in shutdown path
	net: hisilicon: Fix "Trying to free already-free IRQ"
	NFSv4: Don't allow a cached open with a revoked delegation
	igb: Fix constant media auto sense switching when no cable is connected
	e1000: fix memory leaks
	can: flexcan: disable completely the ECC mechanism
	mm/filemap.c: don't initiate writeback if mapping has no dirty pages
	cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
	net: prevent load/store tearing on sk->sk_stamp
	drm/i915/gtt: Add read only pages to gen8_pte_encode
	drm/i915/gtt: Read-only pages for insert_entries on bdw+
	drm/i915/gtt: Disable read-only support under GVT
	drm/i915: Rename gen7 cmdparser tables
	drm/i915: Disable Secure Batches for gen6+
	drm/i915: Remove Master tables from cmdparser
	drm/i915: Add support for mandatory cmdparsing
	drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
	drm/i915: Allow parsing of unsized batches
	drm/i915: Add gen9 BCS cmdparsing
	drm/i915/cmdparser: Add support for backward jumps
	drm/i915/cmdparser: Ignore Length operands during command matching
	drm/i915: Lower RM timeout to avoid DSI hard hangs
	drm/i915/gen8+: Add RC6 CTX corruption WA
	drm/i915/cmdparser: Fix jump whitelist clearing
	Linux 4.4.201

Change-Id: Ifc1fa5b9734f244745b862c6dbf7e34b73245806
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-11-14 14:39:48 +08:00
Konstantin Khlebnikov
bfb01e6a4f mm/filemap.c: don't initiate writeback if mapping has no dirty pages
commit c3aab9a0bd91b696a852169479b7db1ece6cbf8c upstream.

Functions like filemap_write_and_wait_range() should do nothing if inode
has no dirty pages or pages currently under writeback.  But they anyway
construct struct writeback_control and this does some atomic operations if
CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
updates state of writeback ownership, on slow path might be more work.
Current this path is safely avoided only when inode mapping has no pages.

For example generic_file_read_iter() calls filemap_write_and_wait_range()
at each O_DIRECT read - pretty hot path.

This patch skips starting new writeback if mapping has no dirty tags set.
If writeback is already in progress filemap_write_and_wait_range() will
wait for it.

Link: http://lkml.kernel.org/r/156378816804.1087.8607636317907921438.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12 19:13:30 +01:00
Michal Hocko
a6d17aa291 mm, vmstat: hide /proc/pagetypeinfo from normal users
commit abaed0112c1db08be15a784a2c5c8a8b3063cdd3 upstream.

/proc/pagetypeinfo is a debugging tool to examine internal page
allocator state wrt to fragmentation.  It is not very useful for any
other use so normal users really do not need to read this file.

Waiman Long has noticed that reading this file can have negative side
effects because zone->lock is necessary for gathering data and that a)
interferes with the page allocator and its users and b) can lead to hard
lockups on large machines which have very long free_list.

Reduce both issues by simply not exporting the file to regular users.

Link: http://lkml.kernel.org/r/20191025072610.18526-2-mhocko@kernel.org
Fixes: 467c996c1e ("Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Waiman Long <longman@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Jann Horn <jannh@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12 19:13:18 +01:00
Greg Kroah-Hartman
dbd016261f This is the 4.4.198 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl239LAACgkQONu9yGCS
 aT6L9A//XPoRZILliafvNuuuA7wsJ0P1lLyMdVom4TxoyJOWo7e+toU3SOWXsvO2
 5oxlBJ26e1lwZSbne77EPS17N2Ym77q546fqCm/XzifdEyyxkydKaO2JAsYSg5E0
 /9uv45HOxYbd++tKNMZewnztSUVlOrxr4JrF18R/QuVtgffxJrbM4flI8boyiCIV
 +HyWbYOZmV2hGy3s4Y2bC3xohfA9Dd1C8MvDarlPj55AfQtC4klkRA3szuOQa2p3
 Kp8kgddcRCE7CBU0o4P3S5LGuM34Yu8uU7/KUy5zmuBVtg9wbH35Y8vjXucke9L6
 DlsAOOaCbTn0o9eHC2d/DStRJyGuI9JEaJVlBOy5064OSddyGeuknkPIgPexkNts
 2K8Y1CxaddwjuhKq1ugJ/tFY0wxShm6RCgBsz07iZgRAPysxJwnkocCItm0W82W0
 I1RBNKxkulLWM6V2XD5bE1C6Nt64fb5sL24VELqKuBwC97mREm5dxviOB8TSwh+e
 bRH6sVm8X/vEfnmjqx8TBW9JSgAR3gVRJXHA1i//wT508dPBA6Lk7+YH6mGbauZp
 eK64tGtWbBaShLBhyMMsLDPcO5TFGdmM0VWi39ST9Z36YtnlB+1EpM712X+2Wp3i
 P+LS82ym8xVA2/T+RBJUfcLu9hYwVaa7ypTh00Q1Whb/EOVQ7uo=
 =fGQC
 -----END PGP SIGNATURE-----

Merge 4.4.198 into android-4.4-p

Changes in 4.4.198
	scsi: ufs: skip shutdown if hba is not powered
	scsi: megaraid: disable device when probe failed after enabled device
	scsi: qla2xxx: Fix unbound sleep in fcport delete path.
	ARM: OMAP2+: Fix missing reset done flag for am3 and am43
	ARM: dts: am4372: Set memory bandwidth limit for DISPC
	nl80211: fix null pointer dereference
	mips: Loongson: Fix the link time qualifier of 'serial_exit()'
	net: hisilicon: Fix usage of uninitialized variable in function mdio_sc_cfg_reg_write()
	namespace: fix namespace.pl script to support relative paths
	loop: Add LOOP_SET_DIRECT_IO to compat ioctl
	net: bcmgenet: Fix RGMII_MODE_EN value for GENET v1/2/3
	net: bcmgenet: Set phydev->dev_flags only for internal PHYs
	sctp: change sctp_prot .no_autobind with true
	net: avoid potential infinite loop in tc_ctl_action()
	ipv4: Return -ENETUNREACH if we can't create route but saddr is valid
	memfd: Fix locking when tagging pins
	USB: legousbtower: fix memleak on disconnect
	usb: udc: lpc32xx: fix bad bit shift operation
	USB: serial: ti_usb_3410_5052: fix port-close races
	USB: ldusb: fix memleak on disconnect
	USB: usblp: fix use-after-free on disconnect
	USB: ldusb: fix read info leaks
	scsi: core: try to get module before removing device
	ASoC: rsnd: Reinitialize bit clock inversion flag for every format setting
	cfg80211: wext: avoid copying malformed SSIDs
	mac80211: Reject malformed SSID elements
	drm/edid: Add 6 bpc quirk for SDC panel in Lenovo G50
	scsi: zfcp: fix reaction on bit error threshold notification
	mm/slub: fix a deadlock in show_slab_objects()
	xtensa: drop EXPORT_SYMBOL for outs*/ins*
	parisc: Fix vmap memory leak in ioremap()/iounmap()
	CIFS: avoid using MID 0xFFFF
	btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group()
	memstick: jmb38x_ms: Fix an error handling path in 'jmb38x_ms_probe()'
	cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown
	xen/netback: fix error path of xenvif_connect_data()
	PCI: PM: Fix pci_power_up()
	net: sched: Fix memory exposure from short TCA_U32_SEL
	RDMA/cxgb4: Do not dma memory off of the stack
	Linux 4.4.198

Change-Id: Ibaaa507ab0873375f5ad9ef2d53982aa8d346599
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-10-29 09:47:13 +01:00
Qian Cai
cff989b79e mm/slub: fix a deadlock in show_slab_objects()
commit e4f8e513c3d353c134ad4eef9fd0bba12406c7c8 upstream.

A long time ago we fixed a similar deadlock in show_slab_objects() [1].
However, it is apparently due to the commits like 01fb58bcba63 ("slab:
remove synchronous synchronize_sched() from memcg cache deactivation
path") and 03afc0e25f ("slab: get_online_mems for
kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by
just reading files in /sys/kernel/slab which will generate a lockdep
splat below.

Since the "mem_hotplug_lock" here is only to obtain a stable online node
mask while racing with NUMA node hotplug, in the worst case, the results
may me miscalculated while doing NUMA node hotplug, but they shall be
corrected by later reads of the same files.

  WARNING: possible circular locking dependency detected
  ------------------------------------------------------
  cat/5224 is trying to acquire lock:
  ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at:
  show_slab_objects+0x94/0x3a8

  but task is already holding lock:
  b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (kn->count#45){++++}:
         lock_acquire+0x31c/0x360
         __kernfs_remove+0x290/0x490
         kernfs_remove+0x30/0x44
         sysfs_remove_dir+0x70/0x88
         kobject_del+0x50/0xb0
         sysfs_slab_unlink+0x2c/0x38
         shutdown_cache+0xa0/0xf0
         kmemcg_cache_shutdown_fn+0x1c/0x34
         kmemcg_workfn+0x44/0x64
         process_one_work+0x4f4/0x950
         worker_thread+0x390/0x4bc
         kthread+0x1cc/0x1e8
         ret_from_fork+0x10/0x18

  -> #1 (slab_mutex){+.+.}:
         lock_acquire+0x31c/0x360
         __mutex_lock_common+0x16c/0xf78
         mutex_lock_nested+0x40/0x50
         memcg_create_kmem_cache+0x38/0x16c
         memcg_kmem_cache_create_func+0x3c/0x70
         process_one_work+0x4f4/0x950
         worker_thread+0x390/0x4bc
         kthread+0x1cc/0x1e8
         ret_from_fork+0x10/0x18

  -> #0 (mem_hotplug_lock.rw_sem){++++}:
         validate_chain+0xd10/0x2bcc
         __lock_acquire+0x7f4/0xb8c
         lock_acquire+0x31c/0x360
         get_online_mems+0x54/0x150
         show_slab_objects+0x94/0x3a8
         total_objects_show+0x28/0x34
         slab_attr_show+0x38/0x54
         sysfs_kf_seq_show+0x198/0x2d4
         kernfs_seq_show+0xa4/0xcc
         seq_read+0x30c/0x8a8
         kernfs_fop_read+0xa8/0x314
         __vfs_read+0x88/0x20c
         vfs_read+0xd8/0x10c
         ksys_read+0xb0/0x120
         __arm64_sys_read+0x54/0x88
         el0_svc_handler+0x170/0x240
         el0_svc+0x8/0xc

  other info that might help us debug this:

  Chain exists of:
    mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(kn->count#45);
                                 lock(slab_mutex);
                                 lock(kn->count#45);
    lock(mem_hotplug_lock.rw_sem);

   *** DEADLOCK ***

  3 locks held by cat/5224:
   #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8
   #1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0
   #2: b8ff009693eee398 (kn->count#45){++++}, at:
  kernfs_seq_start+0x44/0xf0

  stack backtrace:
  Call trace:
   dump_backtrace+0x0/0x248
   show_stack+0x20/0x2c
   dump_stack+0xd0/0x140
   print_circular_bug+0x368/0x380
   check_noncircular+0x248/0x250
   validate_chain+0xd10/0x2bcc
   __lock_acquire+0x7f4/0xb8c
   lock_acquire+0x31c/0x360
   get_online_mems+0x54/0x150
   show_slab_objects+0x94/0x3a8
   total_objects_show+0x28/0x34
   slab_attr_show+0x38/0x54
   sysfs_kf_seq_show+0x198/0x2d4
   kernfs_seq_show+0xa4/0xcc
   seq_read+0x30c/0x8a8
   kernfs_fop_read+0xa8/0x314
   __vfs_read+0x88/0x20c
   vfs_read+0xd8/0x10c
   ksys_read+0xb0/0x120
   __arm64_sys_read+0x54/0x88
   el0_svc_handler+0x170/0x240
   el0_svc+0x8/0xc

I think it is important to mention that this doesn't expose the
show_slab_objects to use-after-free.  There is only a single path that
might really race here and that is the slab hotplug notifier callback
__kmem_cache_shrink (via slab_mem_going_offline_callback) but that path
doesn't really destroy kmem_cache_node data structures.

[1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html

[akpm@linux-foundation.org: add comment explaining why we don't need mem_hotplug_lock]
Link: http://lkml.kernel.org/r/1570192309-10132-1-git-send-email-cai@lca.pw
Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path")
Fixes: 03afc0e25f ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29 09:13:28 +01:00
Matthew Wilcox (Oracle)
eb4058d8da memfd: Fix locking when tagging pins
The RCU lock is insufficient to protect the radix tree iteration as
a deletion from the tree can occur before we take the spinlock to
tag the entry.  In 4.19, this has manifested as a bug with the following
trace:

kernel BUG at lib/radix-tree.c:1429!
invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 7 PID: 6935 Comm: syz-executor.2 Not tainted 4.19.36 #25
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:radix_tree_tag_set+0x200/0x2f0 lib/radix-tree.c:1429
Code: 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 44 24 10 e8 a3 29 7e fe 48 8b 44 24 10 48 0f ab 03 e9 d2 fe ff ff e8 90 29 7e fe <0f> 0b 48 c7 c7 e0 5a 87 84 e8 f0 e7 08 ff 4c 89 ef e8 4a ff ac fe
RSP: 0018:ffff88837b13fb60 EFLAGS: 00010016
RAX: 0000000000040000 RBX: ffff8883c5515d58 RCX: ffffffff82cb2ef0
RDX: 0000000000000b72 RSI: ffffc90004cf2000 RDI: ffff8883c5515d98
RBP: ffff88837b13fb98 R08: ffffed106f627f7e R09: ffffed106f627f7e
R10: 0000000000000001 R11: ffffed106f627f7d R12: 0000000000000004
R13: ffffea000d7fea80 R14: 1ffff1106f627f6f R15: 0000000000000002
FS:  00007fa1b8df2700(0000) GS:ffff8883e2fc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa1b8df1db8 CR3: 000000037d4d2001 CR4: 0000000000160ee0
Call Trace:
 memfd_tag_pins mm/memfd.c:51 [inline]
 memfd_wait_for_pins+0x2c5/0x12d0 mm/memfd.c:81
 memfd_add_seals mm/memfd.c:215 [inline]
 memfd_fcntl+0x33d/0x4a0 mm/memfd.c:247
 do_fcntl+0x589/0xeb0 fs/fcntl.c:421
 __do_sys_fcntl fs/fcntl.c:463 [inline]
 __se_sys_fcntl fs/fcntl.c:448 [inline]
 __x64_sys_fcntl+0x12d/0x180 fs/fcntl.c:448
 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:293

The problem does not occur in mainline due to the XArray rewrite which
changed the locking to exclude modification of the tree during iteration.
At the time, nobody realised this was a bugfix.  Backport the locking
changes to stable.

Cc: stable@vger.kernel.org
Reported-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-29 09:13:23 +01:00
Greg Kroah-Hartman
dfd52e7009 This is the 4.4.190 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1iTHIACgkQONu9yGCS
 aT6r1RAAizAiIopvXWZ6Z7BAWj0MyalChPY+DhkGgS9egBs9TIqRJ4QCffTeIbHC
 2PYXf//eoWRP1fT7AesMwD1s4lSaT0BuEoFwZn3bwFhR4Xf9HzVEl8bCXIgj4bq2
 WCjE8u/W8ALFZOZ5yJJjRtVOEpTt512u5OzkaF3h+iXh4/g3UYHh1QPNSHA9l7fx
 UbX+PsI9jYl3Ge4zqMIcuzPkgCIuF+g+EirzNRKijfYxOfoVgad83UXPFAMI2isF
 +ftDAbIR+Tc7sBug/30ATdhQjFWDfM9Gzz0rBl9Pw1SpCH+h33e2cEMzLJC42DI2
 mLpTABI7TMt+tygNAxceHCunPma80z22oobxgkGoZJRKH7MfQg/FD9N05bR+8C11
 AlOcix4p1oaWRJssv4myrLjJq4Yt5Ura+/MvWSp1FLhodUbA+F7lfB1L1nW2LtXu
 /edinaBKNMYUVxAmkJOWm3HT79OsonzbC6KqPDLsTTEfYISS6S5i99WnrLdjNoim
 ozXkPUG3ymT9oRcgndRwEDGRGe0lAhI5SwdXujvxf1J26f90r8e8sHnAiDeO63ne
 H+Uxd76b6BxAVpOnluKMTouHI+nxhRyvnp0rUYC523rOtFo9mYluWzte5BOCQ/Ss
 hFgUKSe1F9+Vzm1bIdt1LicrooKGIeoYOqhk8A3jTum67jnUaQA=
 =Kv4T
 -----END PGP SIGNATURE-----

Merge 4.4.190 into android-4.4-p

Changes in 4.4.190
	usb: iowarrior: fix deadlock on disconnect
	sound: fix a memory leak bug
	x86/mm: Check for pfn instead of page in vmalloc_sync_one()
	x86/mm: Sync also unmappings in vmalloc_sync_all()
	mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
	perf db-export: Fix thread__exec_comm()
	usb: yurex: Fix use-after-free in yurex_delete
	can: peak_usb: fix potential double kfree_skb()
	netfilter: nfnetlink: avoid deadlock due to synchronous request_module
	iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND
	mac80211: don't warn about CW params when not using them
	hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
	cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
	s390/qdio: add sanity checks to the fast-requeue path
	ALSA: compress: Fix regression on compressed capture streams
	ALSA: compress: Prevent bypasses of set_params
	ALSA: compress: Be more restrictive about when a drain is allowed
	perf probe: Avoid calling freeing routine multiple times for same pointer
	ARM: davinci: fix sleep.S build error on ARMv4
	scsi: megaraid_sas: fix panic on loading firmware crashdump
	scsi: ibmvfc: fix WARN_ON during event pool release
	tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
	perf/core: Fix creating kernel counters for PMUs that override event->cpu
	can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices
	can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices
	hwmon: (nct7802) Fix wrong detection of in4 presence
	ALSA: firewire: fix a memory leak bug
	mac80211: don't WARN on short WMM parameters from AP
	SMB3: Fix deadlock in validate negotiate hits reconnect
	smb3: send CAP_DFS capability during session setup
	mwifiex: fix 802.11n/WPA detection
	scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
	sh: kernel: hw_breakpoint: Fix missing break in switch statement
	usb: gadget: f_midi: fail if set_alt fails to allocate requests
	USB: gadget: f_midi: fixing a possible double-free in f_midi
	mm/memcontrol.c: fix use after free in mem_cgroup_iter()
	ALSA: hda - Fix a memory leak bug
	HID: holtek: test for sanity of intfdata
	HID: hiddev: avoid opening a disconnected device
	HID: hiddev: do cleanup in failure of opening a device
	Input: kbtab - sanity check for endpoint type
	Input: iforce - add sanity checks
	net: usb: pegasus: fix improper read if get_registers() fail
	xen/pciback: remove set but not used variable 'old_state'
	irqchip/irq-imx-gpcv2: Forward irq type to parent
	perf header: Fix divide by zero error if f_header.attr_size==0
	perf header: Fix use of unitialized value warning
	libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
	scsi: hpsa: correct scsi command status issue after reset
	ata: libahci: do not complain in case of deferred probe
	kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
	IB/core: Add mitigation for Spectre V1
	ocfs2: remove set but not used variable 'last_hash'
	asm-generic: fix -Wtype-limits compiler warnings
	staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
	staging: comedi: dt3000: Fix rounding up of timer divisor
	USB: core: Fix races in character device registration and deregistraion
	usb: cdc-acm: make sure a refcount is taken early enough
	USB: serial: option: add D-Link DWM-222 device ID
	USB: serial: option: Add support for ZTE MF871A
	USB: serial: option: add the BroadMobi BM818 card
	USB: serial: option: Add Motorola modem UARTs
	Backport minimal compiler_attributes.h to support GCC 9
	include/linux/module.h: copy __init/__exit attrs to init/cleanup_module
	arm64: compat: Allow single-byte watchpoints on all addresses
	Input: psmouse - fix build error of multiple definition
	asm-generic: default BUG_ON(x) to if(x)BUG()
	scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure
	RDMA: Directly cast the sockaddr union to sockaddr
	IB/mlx5: Make coding style more consistent
	x86/vdso: Remove direct HPET access through the vDSO
	iommu/amd: Move iommu_init_pci() to .init section
	x86/boot: Disable the address-of-packed-member compiler warning
	net/packet: fix race in tpacket_snd()
	xen/netback: Reset nr_frags before freeing skb
	net/mlx5e: Only support tx/rx pause setting for port owner
	sctp: fix the transport error_count check
	bonding: Add vlan tx offload to hw_enc_features
	Linux 4.4.190

Change-Id: Ic4094fbac2f9b8f6d4a9b4397e82471f40424332
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-08-25 14:35:01 +02:00
Miles Chen
3d29e6420b mm/memcontrol.c: fix use after free in mem_cgroup_iter()
commit 54a83d6bcbf8f4700013766b974bf9190d40b689 upstream.

This patch is sent to report an use after free in mem_cgroup_iter()
after merging commit be2657752e9e ("mm: memcg: fix use after free in
mem_cgroup_iter()").

I work with android kernel tree (4.9 & 4.14), and commit be2657752e9e
("mm: memcg: fix use after free in mem_cgroup_iter()") has been merged
to the trees.  However, I can still observe use after free issues
addressed in the commit be2657752e9e.  (on low-end devices, a few times
this month)

backtrace:
        css_tryget <- crash here
        mem_cgroup_iter
        shrink_node
        shrink_zones
        do_try_to_free_pages
        try_to_free_pages
        __perform_reclaim
        __alloc_pages_direct_reclaim
        __alloc_pages_slowpath
        __alloc_pages_nodemask

To debug, I poisoned mem_cgroup before freeing it:

  static void __mem_cgroup_free(struct mem_cgroup *memcg)
        for_each_node(node)
        free_mem_cgroup_per_node_info(memcg, node);
        free_percpu(memcg->stat);
  +     /* poison memcg before freeing it */
  +     memset(memcg, 0x78, sizeof(struct mem_cgroup));
        kfree(memcg);
  }

The coredump shows the position=0xdbbc2a00 is freed.

  (gdb) p/x ((struct mem_cgroup_per_node *)0xe5009e00)->iter[8]
  $13 = {position = 0xdbbc2a00, generation = 0x2efd}

  0xdbbc2a00:     0xdbbc2e00      0x00000000      0xdbbc2800      0x00000100
  0xdbbc2a10:     0x00000200      0x78787878      0x00026218      0x00000000
  0xdbbc2a20:     0xdcad6000      0x00000001      0x78787800      0x00000000
  0xdbbc2a30:     0x78780000      0x00000000      0x0068fb84      0x78787878
  0xdbbc2a40:     0x78787878      0x78787878      0x78787878      0xe3fa5cc0
  0xdbbc2a50:     0x78787878      0x78787878      0x00000000      0x00000000
  0xdbbc2a60:     0x00000000      0x00000000      0x00000000      0x00000000
  0xdbbc2a70:     0x00000000      0x00000000      0x00000000      0x00000000
  0xdbbc2a80:     0x00000000      0x00000000      0x00000000      0x00000000
  0xdbbc2a90:     0x00000001      0x00000000      0x00000000      0x00100000
  0xdbbc2aa0:     0x00000001      0xdbbc2ac8      0x00000000      0x00000000
  0xdbbc2ab0:     0x00000000      0x00000000      0x00000000      0x00000000
  0xdbbc2ac0:     0x00000000      0x00000000      0xe5b02618      0x00001000
  0xdbbc2ad0:     0x00000000      0x78787878      0x78787878      0x78787878
  0xdbbc2ae0:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2af0:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b00:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b10:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b20:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b30:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b40:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b50:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b60:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b70:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2b80:     0x78787878      0x78787878      0x00000000      0x78787878
  0xdbbc2b90:     0x78787878      0x78787878      0x78787878      0x78787878
  0xdbbc2ba0:     0x78787878      0x78787878      0x78787878      0x78787878

In the reclaim path, try_to_free_pages() does not setup
sc.target_mem_cgroup and sc is passed to do_try_to_free_pages(), ...,
shrink_node().

In mem_cgroup_iter(), root is set to root_mem_cgroup because
sc->target_mem_cgroup is NULL.  It is possible to assign a memcg to
root_mem_cgroup.nodeinfo.iter in mem_cgroup_iter().

        try_to_free_pages
        	struct scan_control sc = {...}, target_mem_cgroup is 0x0;
        do_try_to_free_pages
        shrink_zones
        shrink_node
        	 mem_cgroup *root = sc->target_mem_cgroup;
        	 memcg = mem_cgroup_iter(root, NULL, &reclaim);
        mem_cgroup_iter()
        	if (!root)
        		root = root_mem_cgroup;
        	...

        	css = css_next_descendant_pre(css, &root->css);
        	memcg = mem_cgroup_from_css(css);
        	cmpxchg(&iter->position, pos, memcg);

My device uses memcg non-hierarchical mode.  When we release a memcg:
invalidate_reclaim_iterators() reaches only dead_memcg and its parents.
If non-hierarchical mode is used, invalidate_reclaim_iterators() never
reaches root_mem_cgroup.

  static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
  {
        struct mem_cgroup *memcg = dead_memcg;

        for (; memcg; memcg = parent_mem_cgroup(memcg)
        ...
  }

So the use after free scenario looks like:

  CPU1						CPU2

  try_to_free_pages
  do_try_to_free_pages
  shrink_zones
  shrink_node
  mem_cgroup_iter()
      if (!root)
      	root = root_mem_cgroup;
      ...
      css = css_next_descendant_pre(css, &root->css);
      memcg = mem_cgroup_from_css(css);
      cmpxchg(&iter->position, pos, memcg);

        				invalidate_reclaim_iterators(memcg);
        				...
        				__mem_cgroup_free()
        					kfree(memcg);

  try_to_free_pages
  do_try_to_free_pages
  shrink_zones
  shrink_node
  mem_cgroup_iter()
      if (!root)
      	root = root_mem_cgroup;
      ...
      mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
      iter = &mz->iter[reclaim->priority];
      pos = READ_ONCE(iter->position);
      css_tryget(&pos->css) <- use after free

To avoid this, we should also invalidate root_mem_cgroup.nodeinfo.iter
in invalidate_reclaim_iterators().

[cai@lca.pw: fix -Wparentheses compilation warning]
  Link: http://lkml.kernel.org/r/1564580753-17531-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/20190730015729.4406-1-miles.chen@mediatek.com
Fixes: 5ac8fb31ad ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-25 10:52:56 +02:00
Joerg Roedel
a89f96d5d9 mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
commit 3f8fd02b1bf1d7ba964485a56f2f4b53ae88c167 upstream.

On x86-32 with PTI enabled, parts of the kernel page-tables are not shared
between processes. This can cause mappings in the vmalloc/ioremap area to
persist in some page-tables after the region is unmapped and released.

When the region is re-used the processes with the old mappings do not fault
in the new mappings but still access the old ones.

This causes undefined behavior, in reality often data corruption, kernel
oopses and panics and even spontaneous reboots.

Fix this problem by activly syncing unmaps in the vmalloc/ioremap area to
all page-tables in the system before the regions can be re-used.

References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Fixes: 5d72b4fba4 ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-4-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-25 10:52:44 +02:00
Greg Kroah-Hartman
7838c48a55 This is the 4.4.188 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1Jqq8ACgkQONu9yGCS
 aT4l7Q//TuNNZ1dlnc+f4CiH+RL8bpiNL0McILfP1p8/0iBHEg4e/eYn1LhVRzyD
 7luYa5zVJCWuZleeTcHpTedXuUbSY7KEpmX8Hd3A2JOlPLtaPn9V1yZ50CEj6Nmq
 QEU5QhuROaWh7mF6qcIZBcWSqQjrTplqC38/j3GVqESdcPiKL3pm8E9ueNNrJeuG
 twGcVnhZd3il+Pm1d0js+qupuqk9Ah/MLaVnMnmVgkc+Foq7K2eVOvQXcUn6EoAm
 6bEHst1GFYnsUGmud54hSRcdbh1TzxgOjTl8UJ3rxm9ktxY/6M2PsvuZkMzf2xGK
 QKXCgMdA56BfMpPM0VMhjugKZ8vxxc7wfANYLw+bFqhF3TWrn8MB80AZ+2Y/8Oso
 TSib3D5fxv5Q0cnznVo9IxS08sx5Gz0+/IOLwwpnePp2CUR3OaQP2vgO1M1/oQjB
 h8k8tVEknSNpD02WfDYm1ahrJyQcbqhjt2pmtrHVYdO1+Szh4iuyZm1EJXMOTQoz
 Tpn0kdg3lMkBnSW+AkTLNlXqoKWUbMdzvLK1oqI+jFLxMEuHxHqkO3KXck5gZmJH
 wX1we1OOzr0nSDHbux4QfxsuniHYl7YhDyc70208psUWvw59FxIG76V8Du1ecqcZ
 O41nro2lV0ymv3QMLILtjk/JF62+fs4aa5ykM9bCvSIaPhothUM=
 =TYvp
 -----END PGP SIGNATURE-----

Merge 4.4.188 into android-4.4-p

Changes in 4.4.188
	ARM: riscpc: fix DMA
	ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend
	kernel/module.c: Only return -EEXIST for modules that have finished loading
	MIPS: lantiq: Fix bitfield masking
	dmaengine: rcar-dmac: Reject zero-length slave DMA requests
	fs/adfs: super: fix use-after-free bug
	btrfs: fix minimum number of chunk errors for DUP
	ceph: fix improper use of smp_mb__before_atomic()
	scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
	ACPI: fix false-positive -Wuninitialized warning
	be2net: Signal that the device cannot transmit during reconfiguration
	x86/apic: Silence -Wtype-limits compiler warnings
	x86: math-emu: Hide clang warnings for 16-bit overflow
	mm/cma.c: fail if fixed declaration can't be honored
	coda: add error handling for fget
	coda: fix build using bare-metal toolchain
	uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
	ipc/mqueue.c: only perform resource calculation if user valid
	x86/kvm: Don't call kvm_spurious_fault() from .fixup
	selinux: fix memory leak in policydb_init()
	s390/dasd: fix endless loop after read unit address configuration
	xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
	Linux 4.4.188

Change-Id: I6ed0db8e205744849b0242a9fd12b38f728077e0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-08-06 18:36:03 +02:00
Doug Berger
6e48d75966 mm/cma.c: fail if fixed declaration can't be honored
[ Upstream commit c633324e311243586675e732249339685e5d6faa ]

The description of cma_declare_contiguous() indicates that if the
'fixed' argument is true the reserved contiguous area must be exactly at
the address of the 'base' argument.

However, the function currently allows the 'base', 'size', and 'limit'
arguments to be silently adjusted to meet alignment constraints.  This
commit enforces the documented behavior through explicit checks that
return an error if the region does not fit within a specified region.

Link: http://lkml.kernel.org/r/1561422051-16142-1-git-send-email-opendmb@gmail.com
Fixes: 5ea3b1b2f8 ("cma: add placement specifier for "cma=" kernel parameter")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Yue Hu <huyue2@yulong.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Peng Fan <peng.fan@nxp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 18:28:27 +02:00
Greg Kroah-Hartman
ebf4d7ea8d This is the 4.4.187 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1GiqYACgkQONu9yGCS
 aT5poQ//XNZuSNH5NeE8y37z/7EC5cnx5QOdgpVEz/RZF6Al7DzM0SK/oWiMJR9O
 +gJOoHEwlW/GmVw5O/yOll6ChnAlXfbGnZy9TlXkVUVIa9qU3xVrSFnh4lM1xiZy
 crEaIQ9ow6tfQHnq/DcODvfyEdZgaiW0xTBTB/ZBEKmN9//rBphTuZlFvAKX7bv5
 JBflHDCGl/1zO09xqR9jgWcrCW//a2Ip/O2D61IW1l3oqp7eVGDZMBHMbac45zQ0
 4tpD/ppzv8ak3+HTknIujuZSMlMkCJ6FYBlTqpp44e/qQ8ZvQ2s0OdP3iHwlC5HA
 E60F2ynewg1JJ6RnhmnTn2g4C1MEvL7QMroo3fo1TujpHYLJBpLiQpggXnweTfYN
 eR+Ux1i38SyyqhYSMncp42vttsIXnYTpAGzZi0gLOenVj9MnrNjQueBI4o5PmJwF
 CcYP8SIaadSZhBPv/FDo0mKFdepb10g1PBi/0Dk+tqJuxSDbqc+cD5BywkJh67T5
 y+3LBVOIZCYA6WY8v7J65x9gNZI50RGKcoX0YWsbEKhBjnfCmW0B0qB17HwWpPWz
 UvSIGY7Vj7ufhCMSgzuqOPSVKQ5gL36BsJOZPyrnqz2GdMebSpKRMPEGsNSdPvnl
 8M8GuZFotgKmW7m2aU5nr8+Mwh82zXir9He1aShxd172caefGIk=
 =ml+6
 -----END PGP SIGNATURE-----

Merge 4.4.187 into android-4.4-p

Changes in 4.4.187
	MIPS: ath79: fix ar933x uart parity mode
	MIPS: fix build on non-linux hosts
	dmaengine: imx-sdma: fix use-after-free on probe error path
	ath10k: Do not send probe response template for mesh
	ath9k: Check for errors when reading SREV register
	ath6kl: add some bounds checking
	ath: DFS JP domain W56 fixed pulse type 3 RADAR detection
	batman-adv: fix for leaked TVLV handler.
	media: dvb: usb: fix use after free in dvb_usb_device_exit
	crypto: talitos - fix skcipher failure due to wrong output IV
	media: marvell-ccic: fix DMA s/g desc number calculation
	media: vpss: fix a potential NULL pointer dereference
	net: stmmac: dwmac1000: Clear unused address entries
	signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig
	af_key: fix leaks in key_pol_get_resp and dump_sp.
	xfrm: Fix xfrm sel prefix length validation
	media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails.
	net: phy: Check against net_device being NULL
	tua6100: Avoid build warnings.
	locking/lockdep: Fix merging of hlocks with non-zero references
	media: wl128x: Fix some error handling in fm_v4l2_init_video_device()
	cpupower : frequency-set -r option misses the last cpu in related cpu list
	net: fec: Do not use netdev messages too early
	net: axienet: Fix race condition causing TX hang
	s390/qdio: handle PENDING state for QEBSM devices
	perf test 6: Fix missing kvm module load for s390
	gpio: omap: fix lack of irqstatus_raw0 for OMAP4
	gpio: omap: ensure irq is enabled before wakeup
	regmap: fix bulk writes on paged registers
	bpf: silence warning messages in core
	rcu: Force inlining of rcu_read_lock()
	xfrm: fix sa selector validation
	perf evsel: Make perf_evsel__name() accept a NULL argument
	vhost_net: disable zerocopy by default
	EDAC/sysfs: Fix memory leak when creating a csrow object
	media: i2c: fix warning same module names
	ntp: Limit TAI-UTC offset
	timer_list: Guard procfs specific code
	acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
	media: coda: fix mpeg2 sequence number handling
	media: coda: increment sequence offset for the last returned frame
	mt7601u: do not schedule rx_tasklet when the device has been disconnected
	x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c
	mt7601u: fix possible memory leak when the device is disconnected
	ath10k: fix PCIE device wake up failed
	rslib: Fix decoding of shortened codes
	rslib: Fix handling of of caller provided syndrome
	ixgbe: Check DDM existence in transceiver before access
	EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec
	bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush()
	Bluetooth: hci_bcsp: Fix memory leak in rx_skb
	Bluetooth: 6lowpan: search for destination address in all peers
	Bluetooth: Check state in l2cap_disconnect_rsp
	Bluetooth: validate BLE connection interval updates
	crypto: ghash - fix unaligned memory access in ghash_setkey()
	crypto: arm64/sha1-ce - correct digest for empty data in finup
	crypto: arm64/sha2-ce - correct digest for empty data in finup
	Input: gtco - bounds check collection indent level
	regulator: s2mps11: Fix buck7 and buck8 wrong voltages
	tracing/snapshot: Resize spare buffer if size changed
	NFSv4: Handle the special Linux file open access mode
	lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
	ALSA: seq: Break too long mutex context in the write loop
	media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom()
	media: coda: Remove unbalanced and unneeded mutex unlock
	KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
	drm/nouveau/i2c: Enable i2c pads & busses during preinit
	padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
	9p/virtio: Add cleanup path in p9_virtio_init
	PCI: Do not poll for PME if the device is in D3cold
	take floppy compat ioctls to sodding floppy.c
	floppy: fix div-by-zero in setup_format_params
	floppy: fix out-of-bounds read in next_valid_format
	floppy: fix invalid pointer dereference in drive_name
	floppy: fix out-of-bounds read in copy_buffer
	coda: pass the host file in vma->vm_file on mmap
	gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM
	parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1
	powerpc/32s: fix suspend/resume when IBATs 4-7 are used
	powerpc/watchpoint: Restore NV GPRs while returning from exception
	eCryptfs: fix a couple type promotion bugs
	intel_th: msu: Fix single mode with disabled IOMMU
	Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug
	usb: Handle USB3 remote wakeup for LPM enabled devices correctly
	dm bufio: fix deadlock with loop device
	bnx2x: Prevent load reordering in tx completion processing
	caif-hsi: fix possible deadlock in cfhsi_exit_module()
	ipv4: don't set IPv6 only flags to IPv4 addresses
	net: bcmgenet: use promisc for unsupported filters
	net: neigh: fix multiple neigh timer scheduling
	nfc: fix potential illegal memory access
	sky2: Disable MSI on ASUS P6T
	netrom: fix a memory leak in nr_rx_frame()
	netrom: hold sock when setting skb->destructor
	tcp: Reset bytes_acked and bytes_received when disconnecting
	bonding: validate ip header before check IPPROTO_IGMP
	net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling
	net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query
	net: bridge: stp: don't cache eth dest pointer before skb pull
	elevator: fix truncation of icq_cache_name
	NFSv4: Fix open create exclusive when the server reboots
	nfsd: increase DRC cache limit
	nfsd: give out fewer session slots as limit approaches
	nfsd: fix performance-limiting session calculation
	nfsd: Fix overflow causing non-working mounts on 1 TB machines
	drm/panel: simple: Fix panel_simple_dsi_probe
	usb: core: hub: Disable hub-initiated U1/U2
	tty: max310x: Fix invalid baudrate divisors calculator
	pinctrl: rockchip: fix leaked of_node references
	tty: serial: cpm_uart - fix init when SMC is relocated
	memstick: Fix error cleanup path of memstick_init
	tty/serial: digicolor: Fix digicolor-usart already registered warning
	tty: serial: msm_serial: avoid system lockup condition
	drm/virtio: Add memory barriers for capset cache.
	phy: renesas: rcar-gen2: Fix memory leak at error paths
	usb: gadget: Zero ffs_io_data
	powerpc/pci/of: Fix OF flags parsing for 64bit BARs
	PCI: sysfs: Ignore lockdep for remove attribute
	iio: iio-utils: Fix possible incorrect mask calculation
	recordmcount: Fix spurious mcount entries on powerpc
	mfd: core: Set fwnode for created devices
	mfd: arizona: Fix undefined behavior
	um: Silence lockdep complaint about mmap_sem
	powerpc/4xx/uic: clear pending interrupt after irq type/pol change
	serial: sh-sci: Fix TX DMA buffer flushing and workqueue races
	kallsyms: exclude kasan local symbols on s390
	perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning
	f2fs: avoid out-of-range memory access
	mailbox: handle failed named mailbox channel request
	powerpc/eeh: Handle hugepages in ioremap space
	sh: prevent warnings when using iounmap
	mm/kmemleak.c: fix check for softirq context
	9p: pass the correct prototype to read_cache_page
	mm/mmu_notifier: use hlist_add_head_rcu()
	locking/lockdep: Fix lock used or unused stats error
	locking/lockdep: Hide unused 'class' variable
	usb: wusbcore: fix unbalanced get/put cluster_id
	usb: pci-quirks: Correct AMD PLL quirk detection
	x86/sysfb_efi: Add quirks for some devices with swapped width and height
	x86/speculation/mds: Apply more accurate check on hypervisor platform
	hpet: Fix division by zero in hpet_time_div()
	ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1
	ALSA: hda - Add a conexant codec entry to let mute led work
	powerpc/tm: Fix oops on sigreturn on systems without TM
	access: avoid the RCU grace period for the temporary subjective credentials
	vmstat: Remove BUG_ON from vmstat_update
	mm, vmstat: make quiet_vmstat lighter
	ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt
	tcp: reset sk_send_head in tcp_write_queue_purge
	ISDN: hfcsusb: checking idx of ep configuration
	media: cpia2_usb: first wake up, then free in disconnect
	media: radio-raremono: change devm_k*alloc to k*alloc
	Bluetooth: hci_uart: check for missing tty operations
	sched/fair: Don't free p->numa_faults with concurrent readers
	drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
	ceph: hold i_ceph_lock when removing caps for freeing inode
	Linux 4.4.187

Change-Id: I6086b23376cdf9f6a905f727fb07175a7ebdd356
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-08-04 09:53:45 +02:00
Michal Hocko
1ab1512366 mm, vmstat: make quiet_vmstat lighter
commit f01f17d3705bb6081c9e5728078f64067982be36 upstream.

Mike has reported a considerable overhead of refresh_cpu_vm_stats from
the idle entry during pipe test:

    12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
     4.75%  [kernel]       [k] __schedule
     4.70%  [kernel]       [k] mutex_unlock
     3.14%  [kernel]       [k] __switch_to

This is caused by commit 0eb77e988032 ("vmstat: make vmstat_updater
deferrable again and shut down on idle") which has placed quiet_vmstat
into cpu_idle_loop.  The main reason here seems to be that the idle
entry has to get over all zones and perform atomic operations for each
vmstat entry even though there might be no per cpu diffs.  This is a
pointless overhead for _each_ idle entry.

Make sure that quiet_vmstat is as light as possible.

First of all it doesn't make any sense to do any local sync if the
current cpu is already set in oncpu_stat_off because vmstat_update puts
itself there only if there is nothing to do.

Then we can check need_update which should be a cheap way to check for
potential per-cpu diffs and only then do refresh_cpu_vm_stats.

The original patch also did cancel_delayed_work which we are not doing
here.  There are two reasons for that.  Firstly cancel_delayed_work from
idle context will blow up on RT kernels (reported by Mike):

  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
  Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
  Call Trace:
    dump_stack+0x49/0x67
    ___might_sleep+0xf5/0x180
    rt_spin_lock+0x20/0x50
    try_to_grab_pending+0x69/0x240
    cancel_delayed_work+0x26/0xe0
    quiet_vmstat+0x75/0xa0
    cpu_idle_loop+0x38/0x3e0
    cpu_startup_entry+0x13/0x20
    start_secondary+0x114/0x140

And secondly, even on !RT kernels it might add some non trivial overhead
which is not necessary.  Even if the vmstat worker wakes up and preempts
idle then it will be most likely a single shot noop because the stats
were already synced and so it would end up on the oncpu_stat_off anyway.
We just need to teach both vmstat_shepherd and vmstat_update to stop
scheduling the worker if there is nothing to do.

[mgalbraith@suse.de: cancel pending work of the cpu_stat_off CPU]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04 09:35:01 +02:00
Christoph Lameter
fece2f828f vmstat: Remove BUG_ON from vmstat_update
commit 587198ba5206cdf0d30855f7361af950a4172cd6 upstream.

If we detect that there is nothing to do just set the flag and do not
check if it was already set before.  Races really do not matter.  If the
flag is set by any code then the shepherd will start dealing with the
situation and reenable the vmstat workers when necessary again.

Since commit 0eb77e988032 ("vmstat: make vmstat_updater deferrable again
and shut down on idle") quiet_vmstat might update cpu_stat_off and mark
a particular cpu to be handled by vmstat_shepherd.  This might trigger a
VM_BUG_ON in vmstat_update because the work item might have been
sleeping during the idle period and see the cpu_stat_off updated after
the wake up.  The VM_BUG_ON is therefore misleading and no more
appropriate.  Moreover it doesn't really suite any protection from real
bugs because vmstat_shepherd will simply reschedule the vmstat_work
anytime it sees a particular cpu set or vmstat_update would do the same
from the worker context directly.  Even when the two would race the
result wouldn't be incorrect as the counters update is fully idempotent.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04 09:35:01 +02:00
Jean-Philippe Brucker
eb2f57fd9f mm/mmu_notifier: use hlist_add_head_rcu()
[ Upstream commit 543bdb2d825fe2400d6e951f1786d92139a16931 ]

Make mmu_notifier_register() safer by issuing a memory barrier before
registering a new notifier.  This fixes a theoretical bug on weakly
ordered CPUs.  For example, take this simplified use of notifiers by a
driver:

	my_struct->mn.ops = &my_ops; /* (1) */
	mmu_notifier_register(&my_struct->mn, mm)
		...
		hlist_add_head(&mn->hlist, &mm->mmu_notifiers); /* (2) */
		...

Once mmu_notifier_register() releases the mm locks, another thread can
invalidate a range:

	mmu_notifier_invalidate_range()
		...
		hlist_for_each_entry_rcu(mn, &mm->mmu_notifiers, hlist) {
			if (mn->ops->invalidate_range)

The read side relies on the data dependency between mn and ops to ensure
that the pointer is properly initialized.  But the write side doesn't have
any dependency between (1) and (2), so they could be reordered and the
readers could dereference an invalid mn->ops.  mmu_notifier_register()
does take all the mm locks before adding to the hlist, but those have
acquire semantics which isn't sufficient.

By calling hlist_add_head_rcu() instead of hlist_add_head() we update the
hlist using a store-release, ensuring that readers see prior
initialization of my_struct.  This situation is better illustated by
litmus test MP+onceassign+derefonce.

Link: http://lkml.kernel.org/r/20190502133532.24981-1-jean-philippe.brucker@arm.com
Fixes: cddb8a5c14 ("mmu-notifiers: core")
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-04 09:34:59 +02:00
Dmitry Vyukov
723bcdcfdb mm/kmemleak.c: fix check for softirq context
[ Upstream commit 6ef9056952532c3b746de46aa10d45b4d7797bd8 ]

in_softirq() is a wrong predicate to check if we are in a softirq
context.  It also returns true if we have BH disabled, so objects are
falsely stamped with "softirq" comm.  The correct predicate is
in_serving_softirq().

If user does cat from /sys/kernel/debug/kmemleak previously they would
see this, which is clearly wrong, this is system call context (see the
comm):

unreferenced object 0xffff88805bd661c0 (size 64):
  comm "softirq", pid 0, jiffies 4294942959 (age 12.400s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00  ................
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000007dcb30c>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [<0000000007dcb30c>] slab_post_alloc_hook mm/slab.h:439 [inline]
    [<0000000007dcb30c>] slab_alloc mm/slab.c:3326 [inline]
    [<0000000007dcb30c>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
    [<00000000969722b7>] kmalloc include/linux/slab.h:547 [inline]
    [<00000000969722b7>] kzalloc include/linux/slab.h:742 [inline]
    [<00000000969722b7>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
    [<00000000969722b7>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
    [<00000000a4134b5f>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
    [<00000000d20248ad>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
    [<000000003d367be7>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
    [<000000003c7c76af>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
    [<000000000c1aeb23>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
    [<000000000157b92b>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
    [<00000000a9f3d058>] __do_sys_setsockopt net/socket.c:2089 [inline]
    [<00000000a9f3d058>] __se_sys_setsockopt net/socket.c:2086 [inline]
    [<00000000a9f3d058>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
    [<000000001b8da885>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
    [<00000000ba770c62>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

now they will see this:

unreferenced object 0xffff88805413c800 (size 64):
  comm "syz-executor.4", pid 8960, jiffies 4294994003 (age 14.350s)
  hex dump (first 32 bytes):
    00 7a 8a 57 80 88 ff ff e0 00 00 01 00 00 00 00  .z.W............
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000c5d3be64>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [<00000000c5d3be64>] slab_post_alloc_hook mm/slab.h:439 [inline]
    [<00000000c5d3be64>] slab_alloc mm/slab.c:3326 [inline]
    [<00000000c5d3be64>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
    [<0000000023865be2>] kmalloc include/linux/slab.h:547 [inline]
    [<0000000023865be2>] kzalloc include/linux/slab.h:742 [inline]
    [<0000000023865be2>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
    [<0000000023865be2>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
    [<000000003029a9d4>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
    [<00000000ccd0a87c>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
    [<00000000a85a3785>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
    [<00000000ec13c18d>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
    [<0000000052d748e3>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
    [<00000000512f1014>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
    [<00000000181758bc>] __do_sys_setsockopt net/socket.c:2089 [inline]
    [<00000000181758bc>] __se_sys_setsockopt net/socket.c:2086 [inline]
    [<00000000181758bc>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
    [<00000000d4b73623>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
    [<00000000c1098bec>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Link: http://lkml.kernel.org/r/20190517171507.96046-1-dvyukov@gmail.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-04 09:34:59 +02:00
Greg Kroah-Hartman
60a02c0d81 This is the 4.4.185 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0lmj0ACgkQONu9yGCS
 aT67kRAAgoQ2/imz/GJ7bBKhtql6XneFD7fPDnggxHNRGA4VvkV+KcpHIinWZtXp
 QV5sq8u092SlvQHYolLcVyK3OsMzfzjsK0HMXJRxEa4UsI38hEDBqpgDaUIYiKrb
 l1NUsQLy8foFobbKsTkdiN3+RBdjkTFWZ/SxDXg1T6EicbntvpP/E0QXQ4J+H9TQ
 Wdvx0wMS1m8RkBQzoUxyqz10vCDP67e2NPIBotaKOQS3iYa1he6nU5Y7mLpJOw7s
 mEPkwnT9+DnDENhL7YHo37JPIgSKz+kLdnw6xLzKSIEOl7MUylu7ocbxcJlAHyxC
 LkLRA6E3vCBu540ajHhyjIfN4IPln78qnV1ciGmyTE+YNWtvyPqZVERDXqh9thM9
 4lPUm20HAyhopmxdYfoAq933Ki8IH/mTc3vXpcXbVnAOp2uZfJ0nhVOyhqov9B+t
 p6ct9t9/1ARPITmFTGNWFvTTRT+OoPLDi6ND1o+8ukXRmn9+sl0/VpWsJNpdIAiM
 ss93JfdVTFkR84Oc0zL4Cg55q01BvJYiGtj2oeU5cBECXvXAEvR3ro17b17fmCFh
 jk5xcckickpSK1g8iXNsC8EAAL0tIE+vQemxmJDPCcV3rM83addo9Lwqyme6Wa/3
 F702sfcOSvRvoEfd+9WZcKV01GfTEFu4jK7onrvL16Xn60Wq1iM=
 =Qr7D
 -----END PGP SIGNATURE-----

Merge 4.4.185 into android-4.4-p

Changes in 4.4.185
	fs/binfmt_flat.c: make load_flat_shared_library() work
	mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
	scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck()
	tracing: Silence GCC 9 array bounds warning
	gcc-9: silence 'address-of-packed-member' warning
	usb: chipidea: udc: workaround for endpoint conflict issue
	Input: uinput - add compat ioctl number translation for UI_*_FF_UPLOAD
	apparmor: enforce nullbyte at end of tag string
	parport: Fix mem leak in parport_register_dev_model
	parisc: Fix compiler warnings in float emulation code
	IB/hfi1: Insure freeze_work work_struct is canceled on shutdown
	MIPS: uprobes: remove set but not used variable 'epc'
	net: hns: Fix loopback test failed at copper ports
	sparc: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD
	scripts/checkstack.pl: Fix arm64 wrong or unknown architecture
	scsi: ufs: Check that space was properly alloced in copy_query_response
	s390/qeth: fix VLAN attribute in bridge_hostnotify udev event
	hwmon: (pmbus/core) Treat parameters as paged if on multiple pages
	Btrfs: fix race between readahead and device replace/removal
	btrfs: start readahead also in seed devices
	can: flexcan: fix timeout when set small bitrate
	can: purge socket error queue on sock destruct
	ARM: imx: cpuidle-imx6sx: Restrict the SW2ISO increase to i.MX6SX
	Bluetooth: Align minimum encryption key size for LE and BR/EDR connections
	Bluetooth: Fix regression with minimum encryption key size alignment
	SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write
	cfg80211: fix memory leak of wiphy device name
	mac80211: drop robust management frames from unknown TA
	perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul
	perf help: Remove needless use of strncpy()
	9p/rdma: do not disconnect on down_interruptible EAGAIN
	9p: acl: fix uninitialized iattr access
	9p/rdma: remove useless check in cm_event_handler
	9p: p9dirent_read: check network-provided name length
	net/9p: include trans_common.h to fix missing prototype warning.
	KVM: X86: Fix scan ioapic use-before-initialization
	ovl: modify ovl_permission() to do checks on two inodes
	x86/speculation: Allow guests to use SSBD even if host does not
	cpu/speculation: Warn on unsupported mitigations= parameter
	sctp: change to hold sk after auth shkey is created successfully
	tipc: change to use register_pernet_device
	tipc: check msg->req data len in tipc_nl_compat_bearer_disable
	team: Always enable vlan tx offload
	ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop
	bonding: Always enable vlan tx offload
	net: check before dereferencing netdev_ops during busy poll
	Bluetooth: Fix faulty expression for minimum encryption key size check
	um: Compile with modern headers
	ASoC : cs4265 : readable register too low
	spi: bitbang: Fix NULL pointer dereference in spi_unregister_master
	ASoC: max98090: remove 24-bit format support if RJ is 0
	usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i]
	usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC
	scsi: hpsa: correct ioaccel2 chaining
	ARC: Assume multiplier is always present
	ARC: fix build warning in elf.h
	MIPS: math-emu: do not use bools for arithmetic
	mfd: omap-usb-tll: Fix register offsets
	swiotlb: Make linux/swiotlb.h standalone includible
	bug.h: work around GCC PR82365 in BUG()
	MIPS: Workaround GCC __builtin_unreachable reordering bug
	ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME
	crypto: user - prevent operating on larval algorithms
	ALSA: seq: fix incorrect order of dest_client/dest_ports arguments
	ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages
	ALSA: usb-audio: fix sign unintended sign extension on left shifts
	lib/mpi: Fix karactx leak in mpi_powm
	btrfs: Ensure replaced device doesn't have pending chunk allocation
	tty: rocket: fix incorrect forward declaration of 'rp_init()'
	ARC: handle gcc generated __builtin_trap for older compiler
	arm64, vdso: Define vdso_{start,end} as array
	KVM: x86: degrade WARN to pr_warn_ratelimited
	dmaengine: imx-sdma: remove BD_INTR for channel0
	Linux 4.4.185

Change-Id: If1b1ee0b61d5f6d6fb162dc446c621d6baebfab9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-07-10 12:57:28 +02:00
Colin Ian King
bd6042e9c3 mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
commit 7298e3b0a149c91323b3205d325e942c3b3b9ef6 upstream.

Currently the calcuation of end_pfn can round up the pfn number to more
than the actual maximum number of pfns, causing an Oops.  Fix this by
ensuring end_pfn is never more than max_pfn.

This can be easily triggered when on systems where the end_pfn gets
rounded up to more than max_pfn using the idle-page stress-ng stress test:

sudo stress-ng --idle-page 0

  BUG: unable to handle kernel paging request at 00000000000020d8
  #PF error: [normal kernel read fault]
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 1 PID: 11039 Comm: stress-ng-idle- Not tainted 5.0.0-5-generic #6-Ubuntu
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
  RIP: 0010:page_idle_get_page+0xc8/0x1a0
  Code: 0f b1 0a 75 7d 48 8b 03 48 89 c2 48 c1 e8 33 83 e0 07 48 c1 ea 36 48 8d 0c 40 4c 8d 24 88 49 c1 e4 07 4c 03 24 d5 00 89 c3 be <49> 8b 44 24 58 48 8d b8 80 a1 02 00 e8 07 d5 77 00 48 8b 53 08 48
  RSP: 0018:ffffafd7c672fde8 EFLAGS: 00010202
  RAX: 0000000000000005 RBX: ffffe36341fff700 RCX: 000000000000000f
  RDX: 0000000000000284 RSI: 0000000000000275 RDI: 0000000001fff700
  RBP: ffffafd7c672fe00 R08: ffffa0bc34056410 R09: 0000000000000276
  R10: ffffa0bc754e9b40 R11: ffffa0bc330f6400 R12: 0000000000002080
  R13: ffffe36341fff700 R14: 0000000000080000 R15: ffffa0bc330f6400
  FS: 00007f0ec1ea5740(0000) GS:ffffa0bc7db00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000020d8 CR3: 0000000077d68000 CR4: 00000000000006e0
  Call Trace:
    page_idle_bitmap_write+0x8c/0x140
    sysfs_kf_bin_write+0x5c/0x70
    kernfs_fop_write+0x12e/0x1b0
    __vfs_write+0x1b/0x40
    vfs_write+0xab/0x1b0
    ksys_write+0x55/0xc0
    __x64_sys_write+0x1a/0x20
    do_syscall_64+0x5a/0x110
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Link: http://lkml.kernel.org/r/20190618124352.28307-1-colin.king@canonical.com
Fixes: 33c3fc71c8 ("mm: introduce idle page tracking")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-10 09:56:30 +02:00
Greg Kroah-Hartman
032bab8306 This is the 4.4.183 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl0NyDMACgkQONu9yGCS
 aT5MEA//XPA94FjtAzSiaMZlf79PAusz1FJIazheoOuhgn+dkLyrmZtO4BoADglK
 3G8M7ArAYYGg4grZgIkUdqjeTAlWu5XQm+pS8N7b2fJFiKM/p/1t4s1RitQwk2xm
 giCcqvCA+5D9FIwPsktAlRjhtLK7chmQAvxHDv8CYI60GPkMxSTLrx+7rMwj31wP
 4MR/Ll8MJAxXWXF7PPJnfDgMCIp5T8FEV+Uu3UUQc5kB0YNQofi9sFyhbCdiHSgN
 g9HIGuUKt5wV+bwzGcshRR86sumfsh0ayuEToZQHTUSWBgPvmh4Q9hyn/cFkOmAq
 tSFB9wC+DIjgGwvrT1573i1pwZBhetz8L8sgo4QGilqQl4QTbGv3PR5dpdxm4VFK
 BbzhWGmZGGd/J8KSIjRpwEEE2FoyUi/kalcYxqcIu8N+FuQVX3HvUdDkFSmwyoLB
 7/xMZb+xNkkmsuw+4JTGQ2CMQKcY4XbLIpa1cVbvDY9z923DGMJCqmMfb2FU4Rn1
 KRUXSV2Dt231IHVbXjqt7jLnMdTIq9Z8gMjH/yoX0Z4AFphvq2pk/xB3W7AkWQE3
 Q4iqSzPzT8Hij/3OJBVQTcwUajdmwjmMf+6ZvKjnECUcYv2nOuyuU6hXMDQuhH/v
 ZEB9WyH/kjUaRwjNensR+NhEmsKRKRW6eWaZ3CSJvzlUZKWMBsU=
 =8C1+
 -----END PGP SIGNATURE-----

Merge 4.4.183 into android-4.4-p

Changes in 4.4.183
	fs/fat/file.c: issue flush after the writeback of FAT
	sysctl: return -EINVAL if val violates minmax
	ipc: prevent lockup on alloc_msg and free_msg
	hugetlbfs: on restore reserve error path retain subpool reservation
	mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
	mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
	kernel/sys.c: prctl: fix false positive in validate_prctl_map()
	mfd: intel-lpss: Set the device in reset state when init
	mfd: twl6040: Fix device init errors for ACCCTL register
	perf/x86/intel: Allow PEBS multi-entry in watermark mode
	drm/bridge: adv7511: Fix low refresh rate selection
	ntp: Allow TAI-UTC offset to be set to zero
	f2fs: fix to avoid panic in do_recover_data()
	f2fs: fix to do sanity check on valid block count of segment
	iommu/vt-d: Set intel_iommu_gfx_mapped correctly
	ALSA: hda - Register irq handler after the chip initialization
	nvmem: core: fix read buffer in place
	fuse: retrieve: cap requested size to negotiated max_write
	nfsd: allow fh_want_write to be called twice
	x86/PCI: Fix PCI IRQ routing table memory leak
	platform/chrome: cros_ec_proto: check for NULL transfer function
	soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher
	clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288
	ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA
	ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA
	ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA
	PCI: rpadlpar: Fix leaked device_node references in add/remove paths
	PCI: rcar: Fix a potential NULL pointer dereference
	video: hgafb: fix potential NULL pointer dereference
	video: imsttfb: fix potential NULL pointer dereferences
	PCI: xilinx: Check for __get_free_pages() failure
	gpio: gpio-omap: add check for off wake capable gpios
	dmaengine: idma64: Use actual device for DMA transfers
	pwm: tiehrpwm: Update shadow register for disabling PWMs
	ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa
	pwm: Fix deadlock warning when removing PWM device
	ARM: exynos: Fix undefined instruction during Exynos5422 resume
	futex: Fix futex lock the wrong page
	Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections"
	ALSA: seq: Cover unsubscribe_port() in list_mutex
	libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk
	mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
	fs/ocfs2: fix race in ocfs2_dentry_attach_lock()
	signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO
	ptrace: restore smp_rmb() in __ptrace_may_access()
	i2c: acorn: fix i2c warning
	bcache: fix stack corruption by PRECEDING_KEY()
	cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()
	ASoC: cs42xx8: Add regcache mask dirty
	Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var
	scsi: lpfc: add check for loss of ndlp when sending RRQ
	scsi: bnx2fc: fix incorrect cast to u64 on shift operation
	usbnet: ipheth: fix racing condition
	KVM: x86/pmu: do not mask the value that is written to fixed PMUs
	KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION
	drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read
	drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define()
	USB: Fix chipmunk-like voice when using Logitech C270 for recording audio.
	USB: usb-storage: Add new ID to ums-realtek
	USB: serial: pl2303: add Allied Telesis VT-Kit3
	USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode
	USB: serial: option: add Telit 0x1260 and 0x1261 compositions
	ax25: fix inconsistent lock state in ax25_destroy_timer
	be2net: Fix number of Rx queues used for flow hashing
	ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero
	lapb: fixed leak of control-blocks.
	neigh: fix use-after-free read in pneigh_get_next
	sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg
	mISDN: make sure device name is NUL terminated
	x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor
	perf/ring_buffer: Fix exposing a temporarily decreased data_head
	perf/ring_buffer: Add ordering to rb->nest increment
	gpio: fix gpio-adp5588 build errors
	net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE()
	i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
	configfs: Fix use-after-free when accessing sd->s_dentry
	ia64: fix build errors by exporting paddr_to_nid()
	KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
	net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs
	scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
	scsi: libsas: delete sas port if expander discover failed
	Revert "crypto: crypto4xx - properly set IV after de- and encrypt"
	coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
	Abort file_remove_privs() for non-reg. files
	Linux 4.4.183

Change-Id: I26e2772a587b1dcf557adede5bcff66962f72432
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-06-22 09:45:38 +02:00
Andrea Arcangeli
8f6345a11c coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
commit 04f5866e41fb70690e28397487d8bd8eea7d712a upstream.

The core dumping code has always run without holding the mmap_sem for
writing, despite that is the only way to ensure that the entire vma
layout will not change from under it.  Only using some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier.  For example in Hugh's post from Jul 2017:

  https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils

  "Not strictly relevant here, but a related note: I was very surprised
   to discover, only quite recently, how handle_mm_fault() may be called
   without down_read(mmap_sem) - when core dumping. That seems a
   misguided optimization to me, which would also be nice to correct"

In particular because the growsdown and growsup can move the
vm_start/vm_end the various loops the core dump does around the vma will
not be consistent if page faults can happen concurrently.

Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.

Adding mmap_sem for writing around the ->core_dump invocation is a
viable long term fix, but it requires removing all copy user and page
faults and to replace them with get_dump_page() for all binary formats
which is not suitable as a short term fix.

For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs.  Once ->core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.

Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.

In order to facilitate the backporting I added "Fixes: 86039bd3b4e6"
however the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2 so I couldn't add any
other "Fixes:" because there's no hash beyond the git genesis commit.

Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm().  The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.

Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes: 86039bd3b4 ("userfaultfd: add new syscall to provide memory externalization")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
[mhocko@suse.com: stable 4.4 backport
 - drop infiniband part because of missing 5f9794dc94f59
 - drop userfaultfd_event_wait_completion hunk because of
   missing 9cd75c3cd4c3d]
 - handle binder_update_page_range because of missing 720c241924046
 - handle mlx5_ib_disassociate_ucontext - akaher@vmware.com
]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-22 08:18:27 +02:00
Shakeel Butt
c05fed5075 mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
commit 3510955b327176fd4cbab5baa75b449f077722a2 upstream.

Syzbot reported following memory leak:

ffffffffda RBX: 0000000000000003 RCX: 0000000000441f79
BUG: memory leak
unreferenced object 0xffff888114f26040 (size 32):
  comm "syz-executor626", pid 7056, jiffies 4294948701 (age 39.410s)
  hex dump (first 32 bytes):
    40 60 f2 14 81 88 ff ff 40 60 f2 14 81 88 ff ff  @`......@`......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
     slab_post_alloc_hook mm/slab.h:439 [inline]
     slab_alloc mm/slab.c:3326 [inline]
     kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
     kmalloc include/linux/slab.h:547 [inline]
     __memcg_init_list_lru_node+0x58/0xf0 mm/list_lru.c:352
     memcg_init_list_lru_node mm/list_lru.c:375 [inline]
     memcg_init_list_lru mm/list_lru.c:459 [inline]
     __list_lru_init+0x193/0x2a0 mm/list_lru.c:626
     alloc_super+0x2e0/0x310 fs/super.c:269
     sget_userns+0x94/0x2a0 fs/super.c:609
     sget+0x8d/0xb0 fs/super.c:660
     mount_nodev+0x31/0xb0 fs/super.c:1387
     fuse_mount+0x2d/0x40 fs/fuse/inode.c:1236
     legacy_get_tree+0x27/0x80 fs/fs_context.c:661
     vfs_get_tree+0x2e/0x120 fs/super.c:1476
     do_new_mount fs/namespace.c:2790 [inline]
     do_mount+0x932/0xc50 fs/namespace.c:3110
     ksys_mount+0xab/0x120 fs/namespace.c:3319
     __do_sys_mount fs/namespace.c:3333 [inline]
     __se_sys_mount fs/namespace.c:3330 [inline]
     __x64_sys_mount+0x26/0x30 fs/namespace.c:3330
     do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is a simple off by one bug on the error path.

Link: http://lkml.kernel.org/r/20190528043202.99980-1-shakeelb@google.com
Fixes: 60d3fd32a7 ("list_lru: introduce per-memcg lists")
Reported-by: syzbot+f90a420dfe2b1b03cb2c@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-22 08:18:22 +02:00
Yue Hu
937fa1624a mm/cma_debug.c: fix the break condition in cma_maxchunk_get()
[ Upstream commit f0fd50504a54f5548eb666dc16ddf8394e44e4b7 ]

If not find zero bit in find_next_zero_bit(), it will return the size
parameter passed in, so the start bit should be compared with bitmap_maxno
rather than cma->count.  Although getting maxchunk is working fine due to
zero value of order_per_bit currently, the operation will be stuck if
order_per_bit is set as non-zero.

Link: http://lkml.kernel.org/r/20190319092734.276-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Safonov <d.safonov@partner.samsung.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:18 +02:00
Yue Hu
fceb0be418 mm/cma.c: fix crash on CMA allocation if bitmap allocation fails
[ Upstream commit 1df3a339074e31db95c4790ea9236874b13ccd87 ]

f022d8cb7e ("mm: cma: Don't crash on allocation if CMA area can't be
activated") fixes the crash issue when activation fails via setting
cma->count as 0, same logic exists if bitmap allocation fails.

Link: http://lkml.kernel.org/r/20190325081309.6004-1-zbestahu@gmail.com
Signed-off-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Mike Kravetz
9c8d4d77e3 hugetlbfs: on restore reserve error path retain subpool reservation
[ Upstream commit 0919e1b69ab459e06df45d3ba6658d281962db80 ]

When a huge page is allocated, PagePrivate() is set if the allocation
consumed a reservation.  When freeing a huge page, PagePrivate is checked.
If set, it indicates the reservation should be restored.  PagePrivate
being set at free huge page time mostly happens on error paths.

When huge page reservations are created, a check is made to determine if
the mapping is associated with an explicitly mounted filesystem.  If so,
pages are also reserved within the filesystem.  The default action when
freeing a huge page is to decrement the usage count in any associated
explicitly mounted filesystem.  However, if the reservation is to be
restored the reservation/use count within the filesystem should not be
decrementd.  Otherwise, a subsequent page allocation and free for the same
mapping location will cause the file filesystem usage to go 'negative'.

Filesystem                         Size  Used Avail Use% Mounted on
nodev                              4.0G -4.0M  4.1G    - /opt/hugepool

To fix, when freeing a huge page do not adjust filesystem usage if
PagePrivate() is set to indicate the reservation should be restored.

I did not cc stable as the problem has been around since reserves were
added to hugetlbfs and nobody has noticed.

Link: http://lkml.kernel.org/r/20190328234704.27083-2-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:18:17 +02:00
Greg Kroah-Hartman
f992814dac This is the 4.4.181 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlz/gVcACgkQONu9yGCS
 aT4kBQ//adwq+iNdyEF550hc8tWZny0dSLPRKflzTb4hPXnzGdImCSY6pO1KdXzK
 IhjtgLb8aeFpDSZyyAw+sqFxY/2Nd9GZ5pgetWedm218uX/Hr9ETRUe+QqfmXKfx
 sIeBfhSSCm2T8HV23SOL+MWqLaHLQFEXWjSDxJAPxB7ptzGiYJ4jmje0MBrN1xV8
 22H5ijDR9SweZoR83AFtDAr9hKnpXz2ciQtJ/0xOjnVPGDQgD2uK3mpaO+F2r1hR
 kbLA2Hst3m4C3mtQZnns/SZWCKURkPk1hFYhKZyD0k757sRcSR4iHnqKdBBk29kR
 lFNfjVsAARCIj1ucYwwBbkiRJfBaCpT6TMphdtgT0f91zVMOCDTuVTN2couGSsJl
 6wWmboyM20SKkHJ3VawvtZ4YcTUjut2B1mZC/iFBSQJsMyVPQkhFzSdAXUKO6VZ9
 ZLrMTXNpPwlkYLL7VluIzdUr5crRmYj9sYIH1A/+pyzfM8WZO779jev/i2E4Eipt
 lU7ak2UMgSEZhv3GWmqPkFnJIpZwHyIsl5bGUWJ2b3wd69VasUjroVxRu1CyynXN
 CeDnqmJGLSoOlFD6/SF3MCqgvuavt3hgF+eKT2gbVti9zwLnxCxkQ7pgWMQpiMZs
 uIECSg9f1Zox/E+RpsyWc6Jx7r5yIkYHTlAyIpMuwgT+zwhWXaY=
 =sf4M
 -----END PGP SIGNATURE-----

Merge 4.4.181 into android-4.4-p

Changes in 4.4.181
	x86/speculation/mds: Revert CPU buffer clear on double fault exit
	x86/speculation/mds: Improve CPU buffer clear documentation
	ARM: exynos: Fix a leaked reference by adding missing of_node_put
	crypto: vmx - fix copy-paste error in CTR mode
	crypto: crct10dif-generic - fix use via crypto_shash_digest()
	crypto: x86/crct10dif-pcl - fix use via crypto_shash_digest()
	ALSA: usb-audio: Fix a memory leak bug
	ALSA: hda/hdmi - Consider eld_valid when reporting jack event
	ALSA: hda/realtek - EAPD turn on later
	ASoC: max98090: Fix restore of DAPM Muxes
	ASoC: RT5677-SPI: Disable 16Bit SPI Transfers
	mm/mincore.c: make mincore() more conservative
	ocfs2: fix ocfs2 read inode data panic in ocfs2_iget
	mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L
	tty/vt: fix write/write race in ioctl(KDSKBSENT) handler
	ext4: actually request zeroing of inode table after grow
	ext4: fix ext4_show_options for file systems w/o journal
	Btrfs: do not start a transaction at iterate_extent_inodes()
	bcache: fix a race between cache register and cacheset unregister
	bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim()
	ipmi:ssif: compare block number correctly for multi-part return messages
	crypto: gcm - Fix error return code in crypto_gcm_create_common()
	crypto: gcm - fix incompatibility between "gcm" and "gcm_base"
	crypto: chacha20poly1305 - set cra_name correctly
	crypto: salsa20 - don't access already-freed walk.iv
	crypto: arm/aes-neonbs - don't access already-freed walk.iv
	writeback: synchronize sync(2) against cgroup writeback membership switches
	fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
	ext4: zero out the unused memory region in the extent tree block
	ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug
	KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes
	net: avoid weird emergency message
	net/mlx4_core: Change the error print to info print
	ppp: deflate: Fix possible crash in deflate_init
	tipc: switch order of device registration to fix a crash
	tipc: fix modprobe tipc failed after switch order of device registration
	stm class: Fix channel free in stm output free path
	md: add mddev->pers to avoid potential NULL pointer dereference
	intel_th: msu: Fix single mode with IOMMU
	of: fix clang -Wunsequenced for be32_to_cpu()
	cifs: fix strcat buffer overflow and reduce raciness in smb21_set_oplock_level()
	media: ov6650: Fix sensor possibly not detected on probe
	NFS4: Fix v4.0 client state corruption when mount
	clk: tegra: Fix PLLM programming on Tegra124+ when PMC overrides divider
	fuse: fix writepages on 32bit
	fuse: honor RLIMIT_FSIZE in fuse_file_fallocate
	iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114
	ceph: flush dirty inodes before proceeding with remount
	tracing: Fix partial reading of trace event's id file
	memory: tegra: Fix integer overflow on tick value calculation
	perf intel-pt: Fix instructions sampling rate
	perf intel-pt: Fix improved sample timestamp
	perf intel-pt: Fix sample timestamp wrt non-taken branches
	fbdev: sm712fb: fix brightness control on reboot, don't set SR30
	fbdev: sm712fb: fix VRAM detection, don't set SR70/71/74/75
	fbdev: sm712fb: fix white screen of death on reboot, don't set CR3B-CR3F
	fbdev: sm712fb: fix boot screen glitch when sm712fb replaces VGA
	fbdev: sm712fb: fix crashes during framebuffer writes by correctly mapping VRAM
	fbdev: sm712fb: fix support for 1024x768-16 mode
	fbdev: sm712fb: use 1024x768 by default on non-MIPS, fix garbled display
	fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting
	PCI: Mark Atheros AR9462 to avoid bus reset
	dm delay: fix a crash when invalid device is specified
	xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
	xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
	vti4: ipip tunnel deregistration fixes.
	xfrm4: Fix uninitialized memory read in _decode_session4
	KVM: arm/arm64: Ensure vcpu target is unset on reset failure
	power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
	ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
	perf bench numa: Add define for RUSAGE_THREAD if not present
	Revert "Don't jump to compute_result state from check_result state"
	md/raid: raid5 preserve the writeback action after the parity check
	btrfs: Honour FITRIM range constraints during free space trim
	fbdev: sm712fb: fix memory frequency by avoiding a switch/case fallthrough
	ext4: do not delete unlinked inode from orphan list on failed truncate
	KVM: x86: fix return value for reserved EFER
	bio: fix improper use of smp_mb__before_atomic()
	Revert "scsi: sd: Keep disk read-only when re-reading partition"
	crypto: vmx - CTR: always increment IV as quadword
	gfs2: Fix sign extension bug in gfs2_update_stats
	Btrfs: fix race between ranged fsync and writeback of adjacent ranges
	btrfs: sysfs: don't leak memory when failing add fsid
	fbdev: fix divide error in fb_var_to_videomode
	hugetlb: use same fault hash key for shared and private mappings
	fbdev: fix WARNING in __alloc_pages_nodemask bug
	media: cpia2: Fix use-after-free in cpia2_exit
	media: vivid: use vfree() instead of kfree() for dev->bitmap_cap
	ssb: Fix possible NULL pointer dereference in ssb_host_pcmcia_exit
	at76c50x-usb: Don't register led_trigger if usb_register_driver failed
	perf tools: No need to include bitops.h in util.h
	tools include: Adopt linux/bits.h
	gfs2: Fix lru_count going negative
	cxgb4: Fix error path in cxgb4_init_module
	mmc: core: Verify SD bus width
	powerpc/boot: Fix missing check of lseek() return value
	ASoC: imx: fix fiq dependencies
	spi: pxa2xx: fix SCR (divisor) calculation
	brcm80211: potential NULL dereference in brcmf_cfg80211_vndr_cmds_dcmd_handler()
	rtc: 88pm860x: prevent use-after-free on device remove
	w1: fix the resume command API
	dmaengine: pl330: _stop: clear interrupt status
	mac80211/cfg80211: update bss channel on channel switch
	ASoC: fsl_sai: Update is_slave_mode with correct value
	mwifiex: prevent an array overflow
	net: cw1200: fix a NULL pointer dereference
	bcache: return error immediately in bch_journal_replay()
	bcache: fix failure in journal relplay
	bcache: add failure check to run_cache_set() for journal replay
	bcache: avoid clang -Wunintialized warning
	x86/build: Move _etext to actual end of .text
	smpboot: Place the __percpu annotation correctly
	x86/mm: Remove in_nmi() warning from 64-bit implementation of vmalloc_fault()
	mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions
	HID: logitech-hidpp: use RAP instead of FAP to get the protocol version
	pinctrl: pistachio: fix leaked of_node references
	dmaengine: at_xdmac: remove BUG_ON macro in tasklet
	media: coda: clear error return value before picture run
	media: ov6650: Move v4l2_clk_get() to ov6650_video_probe() helper
	media: au0828: stop video streaming only when last user stops
	media: ov2659: make S_FMT succeed even if requested format doesn't match
	audit: fix a memory leak bug
	media: au0828: Fix NULL pointer dereference in au0828_analog_stream_enable()
	media: pvrusb2: Prevent a buffer overflow
	powerpc/numa: improve control of topology updates
	sched/core: Check quota and period overflow at usec to nsec conversion
	sched/core: Handle overflow in cpu_shares_write_u64
	USB: core: Don't unbind interfaces following device reset failure
	x86/irq/64: Limit IST stack overflow check to #DB stack
	i40e: don't allow changes to HW VLAN stripping on active port VLANs
	RDMA/cxgb4: Fix null pointer dereference on alloc_skb failure
	hwmon: (vt1211) Use request_muxed_region for Super-IO accesses
	hwmon: (smsc47m1) Use request_muxed_region for Super-IO accesses
	hwmon: (smsc47b397) Use request_muxed_region for Super-IO accesses
	hwmon: (pc87427) Use request_muxed_region for Super-IO accesses
	hwmon: (f71805f) Use request_muxed_region for Super-IO accesses
	scsi: libsas: Do discovery on empty PHY to update PHY info
	mmc_spi: add a status check for spi_sync_locked
	mmc: sdhci-of-esdhc: add erratum eSDHC5 support
	mmc: sdhci-of-esdhc: add erratum eSDHC-A001 and A-008358 support
	PM / core: Propagate dev->power.wakeup_path when no callbacks
	extcon: arizona: Disable mic detect if running when driver is removed
	s390: cio: fix cio_irb declaration
	cpufreq: ppc_cbe: fix possible object reference leak
	cpufreq/pasemi: fix possible object reference leak
	cpufreq: pmac32: fix possible object reference leak
	x86/build: Keep local relocations with ld.lld
	iio: ad_sigma_delta: Properly handle SPI bus locking vs CS assertion
	iio: hmc5843: fix potential NULL pointer dereferences
	iio: common: ssp_sensors: Initialize calculated_time in ssp_common_process_data
	rtlwifi: fix a potential NULL pointer dereference
	brcmfmac: fix missing checks for kmemdup
	b43: shut up clang -Wuninitialized variable warning
	brcmfmac: convert dev_init_lock mutex to completion
	brcmfmac: fix race during disconnect when USB completion is in progress
	scsi: ufs: Fix regulator load and icc-level configuration
	scsi: ufs: Avoid configuring regulator with undefined voltage range
	arm64: cpu_ops: fix a leaked reference by adding missing of_node_put
	x86/ia32: Fix ia32_restore_sigcontext() AC leak
	chardev: add additional check for minor range overlap
	HID: core: move Usage Page concatenation to Main item
	ASoC: eukrea-tlv320: fix a leaked reference by adding missing of_node_put
	ASoC: fsl_utils: fix a leaked reference by adding missing of_node_put
	cxgb3/l2t: Fix undefined behaviour
	spi: tegra114: reset controller on probe
	media: wl128x: prevent two potential buffer overflows
	virtio_console: initialize vtermno value for ports
	tty: ipwireless: fix missing checks for ioremap
	rcutorture: Fix cleanup path for invalid torture_type strings
	usb: core: Add PM runtime calls to usb_hcd_platform_shutdown
	scsi: qla4xxx: avoid freeing unallocated dma memory
	media: m88ds3103: serialize reset messages in m88ds3103_set_frontend
	media: go7007: avoid clang frame overflow warning with KASAN
	media: saa7146: avoid high stack usage with clang
	scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices
	spi : spi-topcliff-pch: Fix to handle empty DMA buffers
	spi: rspi: Fix sequencer reset during initialization
	spi: Fix zero length xfer bug
	ASoC: davinci-mcasp: Fix clang warning without CONFIG_PM
	ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
	llc: fix skb leak in llc_build_and_send_ui_pkt()
	net-gro: fix use-after-free read in napi_gro_frags()
	net: stmmac: fix reset gpio free missing
	usbnet: fix kernel crash after disconnect
	tipc: Avoid copying bytes beyond the supplied data
	bnxt_en: Fix aggregation buffer leak under OOM condition.
	net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
	crypto: vmx - ghash: do nosimd fallback manually
	xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
	Revert "tipc: fix modprobe tipc failed after switch order of device registration"
	tipc: fix modprobe tipc failed after switch order of device registration -v2
	sparc64: Fix regression in non-hypervisor TLB flush xcall
	include/linux/bitops.h: sanitize rotate primitives
	xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
	usb: xhci: avoid null pointer deref when bos field is NULL
	USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
	USB: sisusbvga: fix oops in error path of sisusb_probe
	USB: Add LPM quirk for Surface Dock GigE adapter
	USB: rio500: refuse more than one device at a time
	USB: rio500: fix memory leak in close after disconnect
	media: usb: siano: Fix general protection fault in smsusb
	media: usb: siano: Fix false-positive "uninitialized variable" warning
	media: smsusb: better handle optional alignment
	scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
	scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
	Btrfs: fix race updating log root item during fsync
	ALSA: hda/realtek - Set default power save node to 0
	drm/nouveau/i2c: Disable i2c bus access after ->fini()
	tty: serial: msm_serial: Fix XON/XOFF
	tty: max310x: Fix external crystal register setup
	memcg: make it work on sparse non-0-node systems
	kernel/signal.c: trace_signal_deliver when signal_group_exit
	CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM
	binder: Replace "%p" with "%pK" for stable
	binder: replace "%p" with "%pK"
	net: create skb_gso_validate_mac_len()
	bnx2x: disable GSO where gso_size is too big for hardware
	brcmfmac: Add length checks on firmware events
	brcmfmac: screening firmware event packet
	brcmfmac: revise handling events in receive path
	brcmfmac: fix incorrect event channel deduction
	brcmfmac: add length checks in scheduled scan result handler
	brcmfmac: add subtype check for event handling in data path
	userfaultfd: don't pin the user memory in userfaultfd_file_create()
	Revert "x86/build: Move _etext to actual end of .text"
	net: cdc_ncm: GetNtbFormat endian fix
	usb: gadget: fix request length error for isoc transfer
	media: uvcvideo: Fix uvc_alloc_entity() allocation alignment
	ethtool: fix potential userspace buffer overflow
	neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit
	net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
	net: rds: fix memory leak in rds_ib_flush_mr_pool
	pktgen: do not sleep with the thread lock held.
	rcu: locking and unlocking need to always be at least barriers
	parisc: Use implicit space register selection for loading the coherence index of I/O pdirs
	fuse: fallocate: fix return with locked inode
	MIPS: pistachio: Build uImage.gz by default
	genwqe: Prevent an integer overflow in the ioctl
	drm/gma500/cdv: Check vbt config bits when detecting lvds panels
	fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
	fuse: Add FOPEN_STREAM to use stream_open()
	ipv4: Define __ipv4_neigh_lookup_noref when CONFIG_INET is disabled
	ethtool: check the return value of get_regs_len
	Linux 4.4.181

Change-Id: I0c9e7effbb6bd5d1978b4ffad3db3b76af6692bc
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-06-11 14:23:58 +02:00
Jiri Slaby
7a47d18731 memcg: make it work on sparse non-0-node systems
commit 3e8589963773a5c23e2f1fe4bcad0e9a90b7f471 upstream.

We have a single node system with node 0 disabled:
  Scanning NUMA topology in Northbridge 24
  Number of physical nodes 2
  Skipping disabled node 0
  Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
  NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]

This causes crashes in memcg when system boots:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  #PF error: [normal kernel read fault]
...
  RIP: 0010:list_lru_add+0x94/0x170
...
  Call Trace:
   d_lru_add+0x44/0x50
   dput.part.34+0xfc/0x110
   __fput+0x108/0x230
   task_work_run+0x9f/0xc0
   exit_to_usermode_loop+0xf5/0x100

It is reproducible as far as 4.12.  I did not try older kernels.  You have
to have a new enough systemd, e.g.  241 (the reason is unknown -- was not
investigated).  Cannot be reproduced with systemd 234.

The system crashes because the size of lru array is never updated in
memcg_update_all_list_lrus and the reads are past the zero-sized array,
causing dereferences of random memory.

The root cause are list_lru_memcg_aware checks in the list_lru code.  The
test in list_lru_memcg_aware is broken: it assumes node 0 is always
present, but it is not true on some systems as can be seen above.

So fix this by avoiding checks on node 0.  Remember the memcg-awareness by
a bool flag in struct list_lru.

Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz
Fixes: 60d3fd32a7 ("list_lru: introduce per-memcg lists")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11 12:24:10 +02:00
Mike Kravetz
bf8474c648 hugetlb: use same fault hash key for shared and private mappings
commit 1b426bac66e6cc83c9f2d92b96e4e72acf43419a upstream.

hugetlb uses a fault mutex hash table to prevent page faults of the
same pages concurrently.  The key for shared and private mappings is
different.  Shared keys off address_space and file index.  Private keys
off mm and virtual address.  Consider a private mappings of a populated
hugetlbfs file.  A fault will map the page from the file and if needed
do a COW to map a writable page.

Hugetlbfs hole punch uses the fault mutex to prevent mappings of file
pages.  It uses the address_space file index key.  However, private
mappings will use a different key and could race with this code to map
the file page.  This causes problems (BUG) for the page cache remove
code as it expects the page to be unmapped.  A sample stack is:

page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
kernel BUG at mm/filemap.c:169!
...
RIP: 0010:unaccount_page_cache_page+0x1b8/0x200
...
Call Trace:
__delete_from_page_cache+0x39/0x220
delete_from_page_cache+0x45/0x70
remove_inode_hugepages+0x13c/0x380
? __add_to_page_cache_locked+0x162/0x380
hugetlbfs_fallocate+0x403/0x540
? _cond_resched+0x15/0x30
? __inode_security_revalidate+0x5d/0x70
? selinux_file_permission+0x100/0x130
vfs_fallocate+0x13f/0x270
ksys_fallocate+0x3c/0x80
__x64_sys_fallocate+0x1a/0x20
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9

There seems to be another potential COW issue/race with this approach
of different private and shared keys as noted in commit 8382d914eb
("mm, hugetlb: improve page-fault scalability").

Since every hugetlb mapping (even anon and private) is actually a file
mapping, just use the address_space index key for all mappings.  This
results in potentially more hash collisions.  However, this should not
be the common case.

Link: http://lkml.kernel.org/r/20190328234704.27083-3-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20190412165235.t4sscoujczfhuiyt@linux-r8p5
Fixes: b5cec28d36 ("hugetlbfs: truncate_hugepages() takes a range of pages")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11 12:23:52 +02:00
Tejun Heo
bfce20eaf1 writeback: synchronize sync(2) against cgroup writeback membership switches
commit 7fc5854f8c6efae9e7624970ab49a1eac2faefb1 upstream.

sync_inodes_sb() can race against cgwb (cgroup writeback) membership
switches and fail to writeback some inodes.  For example, if an inode
switches to another wb while sync_inodes_sb() is in progress, the new
wb might not be visible to bdi_split_work_to_wbs() at all or the inode
might jump from a wb which hasn't issued writebacks yet to one which
already has.

This patch adds backing_dev_info->wb_switch_rwsem to synchronize cgwb
switch path against sync_inodes_sb() so that sync_inodes_sb() is
guaranteed to see all the target wbs and inodes can't jump wbs to
escape syncing.

v2: Fixed misplaced rwsem init.  Spotted by Jiufei.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jiufei Xue <xuejiufei@gmail.com>
Link: http://lkml.kernel.org/r/dc694ae2-f07f-61e1-7097-7c8411cee12d@gmail.com
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11 12:23:41 +02:00
Jiri Kosina
b614485b6b mm/mincore.c: make mincore() more conservative
commit 134fca9063ad4851de767d1768180e5dede9a881 upstream.

The semantics of what mincore() considers to be resident is not
completely clear, but Linux has always (since 2.3.52, which is when
mincore() was initially done) treated it as "page is available in page
cache".

That's potentially a problem, as that [in]directly exposes
meta-information about pagecache / memory mapping state even about
memory not strictly belonging to the process executing the syscall,
opening possibilities for sidechannel attacks.

Change the semantics of mincore() so that it only reveals pagecache
information for non-anonymous mappings that belog to files that the
calling process could (if it tried to) successfully open for writing;
otherwise we'd be including shared non-exclusive mappings, which

 - is the sidechannel

 - is not the usecase for mincore(), as that's primarily used for data,
   not (shared) text

[jkosina@suse.cz: v2]
  Link: http://lkml.kernel.org/r/20190312141708.6652-2-vbabka@suse.cz
[mhocko@suse.com: restructure can_do_mincore() conditions]
Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1903062342020.19912@cbobk.fhfr.pm
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Josh Snyder <joshs@netflix.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Originally-by: Linus Torvalds <torvalds@linux-foundation.org>
Originally-by: Dominique Martinet <asmadeus@codewreck.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Kevin Easton <kevin@guarana.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Cyril Hrubis <chrubis@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Daniel Gruss <daniel@gruss.cc>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11 12:23:36 +02:00
Greg Kroah-Hartman
505ad68286 This is the 4.4.179 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlzEBesACgkQONu9yGCS
 aT4KhA//fTPDn1cwSHYX7vCK1gWVfAC+d7SnWcdlyjiCzuUNOHeGtwYl2xIi2pmH
 7cnxCpzKRbxgPCuawj/T4MrZoKtc2fH8sduQGQtiIgtz4rew9vqYcQTIxO0AF8cI
 KcqZefj7L6M04RxSq+F0O2MpXtPCIfDnoYej44Espa5tCt1tWJLxQvaEV4waHmA1
 +6/Sh9ZlWzMF/vcyTJS0RpCSNrHDVjaoQjgQuaDNGaGhPhaWqvPavd4eSzLrOQ4U
 r54X+T3vqQuF3gjSVywGgSvRkpFX+ZTsgPtg9mkm3oJTNoLUa9nnCRl1p6ig+moI
 +7ZPFbk0Duhi+N34GpSL2MXbFkMz49Cvon+kDlQrO0LvETWha39VyEG/GcHSCm+o
 ISNWwEY//rsYl8JF3rZIzfbw973/x0QKZeNQySs184ls4pwnIo3To3fL2Wv9ArA1
 jCIHT3lFwBkNZUMfK9hz8e7fC93QZPIgDLGx/HWBDPRE05D5O29kRSuj6cQ10GFk
 PChQiV3WLiwgwEhq3/tIso3052MheAVaNsz76wYbpakTrupjkapNeN64bOOZO2FD
 BzdLkpSjOIe5kaSMzzVCTt9A25M0t8iz/rj7/6OEPYtaT8T49t1ZtTxjrAzqL2nc
 oyB0U20t67uMoaloZUQF6kYmMirvnYYVwDWSHXQ568bU5wXhI0U=
 =pJ1x
 -----END PGP SIGNATURE-----

Merge 4.4.179 into android-4.4-p

Changes in 4.4.179
	arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
	arm64: debug: Ensure debug handlers check triggering exception level
	ext4: cleanup bh release code in ext4_ind_remove_space()
	lib/int_sqrt: optimize initial value compute
	tty/serial: atmel: Add is_half_duplex helper
	mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
	i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA
	Bluetooth: Fix decrementing reference count twice in releasing socket
	tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped
	CIFS: fix POSIX lock leak and invalid ptr deref
	h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux-
	tracing: kdb: Fix ftdump to not sleep
	gpio: gpio-omap: fix level interrupt idling
	sysctl: handle overflow for file-max
	enic: fix build warning without CONFIG_CPUMASK_OFFSTACK
	mm/cma.c: cma_declare_contiguous: correct err handling
	mm/page_ext.c: fix an imbalance with kmemleak
	mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512!
	mm/slab.c: kmemleak no scan alien caches
	ocfs2: fix a panic problem caused by o2cb_ctl
	f2fs: do not use mutex lock in atomic context
	fs/file.c: initialize init_files.resize_wait
	cifs: use correct format characters
	dm thin: add sanity checks to thin-pool and external snapshot creation
	cifs: Fix NULL pointer dereference of devname
	fs: fix guard_bio_eod to check for real EOD errors
	tools lib traceevent: Fix buffer overflow in arg_eval
	usb: chipidea: Grab the (legacy) USB PHY by phandle first
	scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
	coresight: etm4x: Add support to enable ETMv4.2
	ARM: 8840/1: use a raw_spinlock_t in unwind
	mmc: omap: fix the maximum timeout setting
	e1000e: Fix -Wformat-truncation warnings
	IB/mlx4: Increase the timeout for CM cache
	scsi: megaraid_sas: return error when create DMA pool failed
	perf test: Fix failure of 'evsel-tp-sched' test on s390
	SoC: imx-sgtl5000: add missing put_device()
	media: sh_veu: Correct return type for mem2mem buffer helpers
	media: s5p-jpeg: Correct return type for mem2mem buffer helpers
	media: s5p-g2d: Correct return type for mem2mem buffer helpers
	media: mx2_emmaprp: Correct return type for mem2mem buffer helpers
	leds: lp55xx: fix null deref on firmware load failure
	kprobes: Prohibit probing on bsearch()
	ARM: 8833/1: Ensure that NEON code always compiles with Clang
	ALSA: PCM: check if ops are defined before suspending PCM
	bcache: fix input overflow to cache set sysfs file io_error_halflife
	bcache: fix input overflow to sequential_cutoff
	bcache: improve sysfs_strtoul_clamp()
	fbdev: fbmem: fix memory access if logo is bigger than the screen
	cdrom: Fix race condition in cdrom_sysctl_register
	ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe
	soc: qcom: gsbi: Fix error handling in gsbi_probe()
	mt7601u: bump supported EEPROM version
	ARM: avoid Cortex-A9 livelock on tight dmb loops
	tty: increase the default flip buffer limit to 2*640K
	media: mt9m111: set initial frame size other than 0x0
	hwrng: virtio - Avoid repeated init of completion
	soc/tegra: fuse: Fix illegal free of IO base address
	hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable
	dmaengine: imx-dma: fix warning comparison of distinct pointer types
	netfilter: physdev: relax br_netfilter dependency
	media: s5p-jpeg: Check for fmt_ver_flag when doing fmt enumeration
	regulator: act8865: Fix act8600_sudcdc_voltage_ranges setting
	wlcore: Fix memory leak in case wl12xx_fetch_firmware failure
	x86/build: Mark per-CPU symbols as absolute explicitly for LLD
	dmaengine: tegra: avoid overflow of byte tracking
	drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers
	binfmt_elf: switch to new creds when switching to new mm
	kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD
	x86/build: Specify elf_i386 linker emulation explicitly for i386 objects
	x86: vdso: Use $LD instead of $CC to link
	x86/vdso: Drop implicit common-page-size linker flag
	lib/string.c: implement a basic bcmp
	tty: mark Siemens R3964 line discipline as BROKEN
	tty: ldisc: add sysctl to prevent autoloading of ldiscs
	ipv6: Fix dangling pointer when ipv6 fragment
	ipv6: sit: reset ip header pointer in ipip6_rcv
	net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock().
	openvswitch: fix flow actions reallocation
	qmi_wwan: add Olicard 600
	sctp: initialize _pad of sockaddr_in before copying to user memory
	tcp: Ensure DCTCP reacts to losses
	netns: provide pure entropy for net_hash_mix()
	net: ethtool: not call vzalloc for zero sized memory request
	ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type
	ALSA: seq: Fix OOB-reads from strlcpy
	include/linux/bitrev.h: fix constant bitrev
	ASoC: fsl_esai: fix channel swap issue when stream starts
	block: do not leak memory in bio_copy_user_iov()
	genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
	ARM: dts: at91: Fix typo in ISC_D0 on PC9
	arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
	xen: Prevent buffer overflow in privcmd ioctl
	sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
	xtensa: fix return_address
	PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
	perf/core: Restore mmap record type correctly
	ext4: add missing brelse() in add_new_gdb_meta_bg()
	ext4: report real fs size after failed resize
	ALSA: echoaudio: add a check for ioremap_nocache
	ALSA: sb8: add a check for request_region
	IB/mlx4: Fix race condition between catas error reset and aliasguid flows
	mmc: davinci: remove extraneous __init annotation
	ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration
	thermal/int340x_thermal: Add additional UUIDs
	thermal/int340x_thermal: fix mode setting
	tools/power turbostat: return the exit status of a command
	perf top: Fix error handling in cmd_top()
	perf evsel: Free evsel->counts in perf_evsel__exit()
	perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test
	perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test()
	x86/hpet: Prevent potential NULL pointer dereference
	x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors
	iommu/vt-d: Check capability before disabling protected memory
	x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error
	fix incorrect error code mapping for OBJECTID_NOT_FOUND
	ext4: prohibit fstrim in norecovery mode
	rsi: improve kernel thread handling to fix kernel panic
	9p: do not trust pdu content for stat item size
	9p locks: add mount option for lock retry interval
	f2fs: fix to do sanity check with current segment number
	serial: uartps: console_setup() can't be placed to init section
	ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms
	ACPI / SBS: Fix GPE storm on recent MacBookPro's
	cifs: fallback to older infolevels on findfirst queryinfo retry
	crypto: sha256/arm - fix crash bug in Thumb2 build
	crypto: sha512/arm - fix crash bug in Thumb2 build
	iommu/dmar: Fix buffer overflow during PCI bus notification
	ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t
	appletalk: Fix use-after-free in atalk_proc_exit
	lib/div64.c: off by one in shift
	include/linux/swap.h: use offsetof() instead of custom __swapoffset macro
	tpm/tpm_crb: Avoid unaligned reads in crb_recv()
	ovl: fix uid/gid when creating over whiteout
	appletalk: Fix compile regression
	bonding: fix event handling for stacked bonds
	net: atm: Fix potential Spectre v1 vulnerabilities
	net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
	net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
	tcp: tcp_grow_window() needs to respect tcp_space()
	ipv4: recompile ip options in ipv4_link_failure
	ipv4: ensure rcu_read_lock() in ipv4_link_failure()
	crypto: crypto4xx - properly set IV after de- and encrypt
	modpost: file2alias: go back to simple devtable lookup
	modpost: file2alias: check prototype of handler
	tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
	KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
	iio/gyro/bmg160: Use millidegrees for temperature scale
	iio: ad_sigma_delta: select channel when reading register
	iio: adc: at91: disable adc channel interrupt in timeout case
	io: accel: kxcjk1013: restore the range after resume.
	staging: comedi: vmk80xx: Fix use of uninitialized semaphore
	staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
	staging: comedi: ni_usb6501: Fix use of uninitialized mutex
	staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
	ALSA: core: Fix card races between register and disconnect
	crypto: x86/poly1305 - fix overflow during partial reduction
	arm64: futex: Restore oldval initialization to work around buggy compilers
	x86/kprobes: Verify stack frame on kretprobe
	kprobes: Mark ftrace mcount handler functions nokprobe
	kprobes: Fix error check when reusing optimized probes
	mac80211: do not call driver wake_tx_queue op during reconfig
	Revert "kbuild: use -Oz instead of -Os when using clang"
	sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
	device_cgroup: fix RCU imbalance in error case
	mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
	ALSA: info: Fix racy addition/deletion of nodes
	Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"
	kernel/sysctl.c: fix out-of-bounds access when setting file-max
	Linux 4.4.179

Change-Id: Ia88dbd6c37250a682098a4a8540672869c6adf42
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-04-30 13:25:38 +02:00
Konstantin Khlebnikov
0e4d4e0d6b mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
commit e8277b3b52240ec1caad8e6df278863e4bf42eac upstream.

Commit 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
depends on skipping vmstat entries with empty name introduced in
7aaf77272358 ("mm: don't show nr_indirectly_reclaimable in
/proc/vmstat") but reverted in b29940c1abd7 ("mm: rename and change
semantics of nr_indirectly_reclaimable_bytes").

So skipping no longer works and /proc/vmstat has misformatted lines " 0".

This patch simply shows debug counters "nr_tlb_remote_*" for UP.

Link: http://lkml.kernel.org/r/155481488468.467.4295519102880913454.stgit@buzz
Fixes: 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <guro@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27 09:34:02 +02:00
Qian Cai
b1399497b7 mm/slab.c: kmemleak no scan alien caches
[ Upstream commit 92d1d07daad65c300c7d0b68bbef8867e9895d54 ]

Kmemleak throws endless warnings during boot due to in
__alloc_alien_cache(),

    alc = kmalloc_node(memsize, gfp, node);
    init_arraycache(&alc->ac, entries, batch);
    kmemleak_no_scan(ac);

Kmemleak does not track the array cache (alc->ac) but the alien cache
(alc) instead, so let it track the latter by lifting kmemleak_no_scan()
out of init_arraycache().

There is another place that calls init_arraycache(), but
alloc_kmem_cache_cpus() uses the percpu allocation where will never be
considered as a leak.

  kmemleak: Found object by alias at 0xffff8007b9aa7e38
  CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2
  Call trace:
   dump_backtrace+0x0/0x168
   show_stack+0x24/0x30
   dump_stack+0x88/0xb0
   lookup_object+0x84/0xac
   find_and_get_object+0x84/0xe4
   kmemleak_no_scan+0x74/0xf4
   setup_kmem_cache_node+0x2b4/0x35c
   __do_tune_cpucache+0x250/0x2d4
   do_tune_cpucache+0x4c/0xe4
   enable_cpucache+0xc8/0x110
   setup_cpu_cache+0x40/0x1b8
   __kmem_cache_create+0x240/0x358
   create_cache+0xc0/0x198
   kmem_cache_create_usercopy+0x158/0x20c
   kmem_cache_create+0x50/0x64
   fsnotify_init+0x58/0x6c
   do_one_initcall+0x194/0x388
   kernel_init_freeable+0x668/0x688
   kernel_init+0x18/0x124
   ret_from_fork+0x10/0x18
  kmemleak: Object 0xffff8007b9aa7e00 (size 256):
  kmemleak:   comm "swapper/0", pid 1, jiffies 4294697137
  kmemleak:   min_count = 1
  kmemleak:   count = 0
  kmemleak:   flags = 0x1
  kmemleak:   checksum = 0
  kmemleak:   backtrace:
       kmemleak_alloc+0x84/0xb8
       kmem_cache_alloc_node_trace+0x31c/0x3a0
       __kmalloc_node+0x58/0x78
       setup_kmem_cache_node+0x26c/0x35c
       __do_tune_cpucache+0x250/0x2d4
       do_tune_cpucache+0x4c/0xe4
       enable_cpucache+0xc8/0x110
       setup_cpu_cache+0x40/0x1b8
       __kmem_cache_create+0x240/0x358
       create_cache+0xc0/0x198
       kmem_cache_create_usercopy+0x158/0x20c
       kmem_cache_create+0x50/0x64
       fsnotify_init+0x58/0x6c
       do_one_initcall+0x194/0x388
       kernel_init_freeable+0x668/0x688
       kernel_init+0x18/0x124
  kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38
  CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2
  Call trace:
   dump_backtrace+0x0/0x168
   show_stack+0x24/0x30
   dump_stack+0x88/0xb0
   kmemleak_no_scan+0x90/0xf4
   setup_kmem_cache_node+0x2b4/0x35c
   __do_tune_cpucache+0x250/0x2d4
   do_tune_cpucache+0x4c/0xe4
   enable_cpucache+0xc8/0x110
   setup_cpu_cache+0x40/0x1b8
   __kmem_cache_create+0x240/0x358
   create_cache+0xc0/0x198
   kmem_cache_create_usercopy+0x158/0x20c
   kmem_cache_create+0x50/0x64
   fsnotify_init+0x58/0x6c
   do_one_initcall+0x194/0x388
   kernel_init_freeable+0x668/0x688
   kernel_init+0x18/0x124
   ret_from_fork+0x10/0x18

Link: http://lkml.kernel.org/r/20190129184518.39808-1-cai@lca.pw
Fixes: 1fe00d50a9 ("slab: factor out initialization of array cache")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-27 09:33:48 +02:00
Uladzislau Rezki (Sony)
cb4d6cd276 mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512!
[ Upstream commit afd07389d3f4933c7f7817a92fb5e053d59a3182 ]

One of the vmalloc stress test case triggers the kernel BUG():

  <snip>
  [60.562151] ------------[ cut here ]------------
  [60.562154] kernel BUG at mm/vmalloc.c:512!
  [60.562206] invalid opcode: 0000 [#1] PREEMPT SMP PTI
  [60.562247] CPU: 0 PID: 430 Comm: vmalloc_test/0 Not tainted 4.20.0+ #161
  [60.562293] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
  [60.562351] RIP: 0010:alloc_vmap_area+0x36f/0x390
  <snip>

it can happen due to big align request resulting in overflowing of
calculated address, i.e.  it becomes 0 after ALIGN()'s fixup.

Fix it by checking if calculated address is within vstart/vend range.

Link: http://lkml.kernel.org/r/20190124115648.9433-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-27 09:33:48 +02:00
Qian Cai
2ea83494ce mm/page_ext.c: fix an imbalance with kmemleak
[ Upstream commit 0c81585499601acd1d0e1cbf424cabfaee60628c ]

After offlining a memory block, kmemleak scan will trigger a crash, as
it encounters a page ext address that has already been freed during
memory offlining.  At the beginning in alloc_page_ext(), it calls
kmemleak_alloc(), but it does not call kmemleak_free() in
free_page_ext().

    BUG: unable to handle kernel paging request at ffff888453d00000
    PGD 128a01067 P4D 128a01067 PUD 128a04067 PMD 47e09e067 PTE 800ffffbac2ff060
    Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
    CPU: 1 PID: 1594 Comm: bash Not tainted 5.0.0-rc8+ #15
    Hardware name: HP ProLiant DL180 Gen9/ProLiant DL180 Gen9, BIOS U20 10/25/2017
    RIP: 0010:scan_block+0xb5/0x290
    Code: 85 6e 01 00 00 48 b8 00 00 30 f5 81 88 ff ff 48 39 c3 0f 84 5b 01 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 0f 85 87 01 00 00 <4c> 8b 3b e8 f3 0c fa ff 4c 39 3d 0c 6b 4c 01 0f 87 08 01 00 00 4c
    RSP: 0018:ffff8881ec57f8e0 EFLAGS: 00010082
    RAX: 0000000000000000 RBX: ffff888453d00000 RCX: ffffffffa61e5a54
    RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff888453d00000
    RBP: ffff8881ec57f920 R08: fffffbfff4ed588d R09: fffffbfff4ed588c
    R10: fffffbfff4ed588c R11: ffffffffa76ac463 R12: dffffc0000000000
    R13: ffff888453d00ff9 R14: ffff8881f80cef48 R15: ffff8881f80cef48
    FS:  00007f6c0e3f8740(0000) GS:ffff8881f7680000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffff888453d00000 CR3: 00000001c4244003 CR4: 00000000001606a0
    Call Trace:
     scan_gray_list+0x269/0x430
     kmemleak_scan+0x5a8/0x10f0
     kmemleak_write+0x541/0x6ca
     full_proxy_write+0xf8/0x190
     __vfs_write+0xeb/0x980
     vfs_write+0x15a/0x4f0
     ksys_write+0xd2/0x1b0
     __x64_sys_write+0x73/0xb0
     do_syscall_64+0xeb/0xaaa
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7f6c0dad73b8
    Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 63 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
    RSP: 002b:00007ffd5b863cb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f6c0dad73b8
    RDX: 0000000000000005 RSI: 000055a9216e1710 RDI: 0000000000000001
    RBP: 000055a9216e1710 R08: 000000000000000a R09: 00007ffd5b863840
    R10: 000000000000000a R11: 0000000000000246 R12: 00007f6c0dda9780
    R13: 0000000000000005 R14: 00007f6c0dda4740 R15: 0000000000000005
    Modules linked in: nls_iso8859_1 nls_cp437 vfat fat kvm_intel kvm irqbypass efivars ip_tables x_tables xfs sd_mod ahci libahci igb i2c_algo_bit libata i2c_core dm_mirror dm_region_hash dm_log dm_mod efivarfs
    CR2: ffff888453d00000
    ---[ end trace ccf646c7456717c5 ]---
    Kernel panic - not syncing: Fatal exception
    Shutting down cpus with NMI
    Kernel Offset: 0x24c00000 from 0xffffffff81000000 (relocation range:
    0xffffffff80000000-0xffffffffbfffffff)
    ---[ end Kernel panic - not syncing: Fatal exception ]---

Link: http://lkml.kernel.org/r/20190227173147.75650-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-27 09:33:48 +02:00
Peng Fan
4970a8ba94 mm/cma.c: cma_declare_contiguous: correct err handling
[ Upstream commit 0d3bd18a5efd66097ef58622b898d3139790aa9d ]

In case cma_init_reserved_mem failed, need to free the memblock
allocated by memblock_reserve or memblock_alloc_range.

Quote Catalin's comments:
  https://lkml.org/lkml/2019/2/26/482

Kmemleak is supposed to work with the memblock_{alloc,free} pair and it
ignores the memblock_reserve() as a memblock_alloc() implementation
detail. It is, however, tolerant to memblock_free() being called on
a sub-range or just a different range from a previous memblock_alloc().
So the original patch looks fine to me. FWIW:

Link: http://lkml.kernel.org/r/20190227144631.16708-1-peng.fan@nxp.com
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-27 09:33:48 +02:00
Yang Shi
b3b489eea2 mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
commit a7f40cfe3b7ada57af9b62fd28430eeb4a7cfcb7 upstream.

When MPOL_MF_STRICT was specified and an existing page was already on a
node that does not follow the policy, mbind() should return -EIO.  But
commit 6f4576e368 ("mempolicy: apply page table walker on
queue_pages_range()") broke the rule.

And commit c8633798497c ("mm: mempolicy: mbind and migrate_pages support
thp migration") didn't return the correct value for THP mbind() too.

If MPOL_MF_STRICT is set, ignore vma_migratable() to make sure it
reaches queue_pages_to_pte_range() or queue_pages_pmd() to check if an
existing page was already on a node that does not follow the policy.
And, non-migratable vma may be used, return -EIO too if MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL was specified.

Tested with https://github.com/metan-ucw/ltp/blob/master/testcases/kernel/syscalls/mbind/mbind02.c

[akpm@linux-foundation.org: tweak code comment]
Link: http://lkml.kernel.org/r/1553020556-38583-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: 6f4576e368 ("mempolicy: apply page table walker on queue_pages_range()")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: Rafael Aquini <aquini@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27 09:33:47 +02:00
Greg Kroah-Hartman
254f2a04a8 This is the 4.4.178 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlykNUEACgkQONu9yGCS
 aT6n6A//QT/8UQ8IWU2J1iTtlxX95RWxfgbsip0bBh8PdVOhRAalR6+fa6F/Fh9D
 kM82QHro5R9ZO48mkQ1yF4ooJmVapabS4bvlgLil+/La9gDsF/Z2T/wxsUht2nCm
 aic3ZjLX2mtte75zQAL+lvEjPR6q92PibNOgBvt51ueLK7Hcxga4uiAzpdlZausp
 YKAtqwhaj7AD2xUqPuyB9xHw5tvFbGqiN6rMmxIbQSOUhgtiUxiiLHRM8ppanoHv
 D2fMKKj8Pz5FGgzd7c0b9fZUERFNqHeKSTPgxNENzLS0TCRexP94Ihp5FoWN4tY+
 HPQT291DrWyquSl0c7FrI1BuF41fmKJ+CZHbvXBwT429bJQQ2dehgIUfdGYgrSBt
 J/zbh0OO2fkLCxNDVpA0cNm+tlYUGbc+TCG4R2I3V2dn5yTxru/w+TdG/GyM8h75
 jUAGS3hFKBCFQSLC8M+nRcOsLsV1H4H9/MnQ84+wpXXC/Z5MseYHo07E1xWNViUW
 UHuM6PlGRUPJ0JrC6J6wLkkvHDyjXbaSitligH8K2aW9PtCU814T7+4rwgyaHCVr
 OMizAmy65Y2lutJ4mtMNc05mKlQRlGfWu/EOBgTRzB+V4hadp2NRZ1b9rk3MFRgk
 ckxiYM91MjtuvHV/SrLd3e2PlnvouGw30jaWhScy2Sl5D4g76ok=
 =zBsi
 -----END PGP SIGNATURE-----

Merge 4.4.178 into android-4.4-p

Changes in 4.4.178
	mmc: pxamci: fix enum type confusion
	drm/vmwgfx: Don't double-free the mode stored in par->set_mode
	udf: Fix crash on IO error during truncate
	mips: loongson64: lemote-2f: Add IRQF_NO_SUSPEND to "cascade" irqaction.
	MIPS: Fix kernel crash for R6 in jump label branch function
	futex: Ensure that futex address is aligned in handle_futex_death()
	ext4: fix NULL pointer dereference while journal is aborted
	ext4: fix data corruption caused by unaligned direct AIO
	ext4: brelse all indirect buffer in ext4_ind_remove_space()
	mmc: tmio_mmc_core: don't claim spurious interrupts
	media: v4l2-ctrls.c/uvc: zero v4l2_event
	locking/lockdep: Add debug_locks check in __lock_downgrade()
	ALSA: hda - Record the current power state before suspend/resume calls
	ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec
	mmc: pwrseq_simple: Make reset-gpios optional to match doc
	mmc: debugfs: Add a restriction to mmc debugfs clock setting
	mmc: make MAN_BKOPS_EN message a debug
	mmc: sanitize 'bus width' in debug output
	mmc: core: shut up "voltage-ranges unspecified" pr_info()
	usb: dwc3: gadget: Fix suspend/resume during device mode
	arm64: mm: Add trace_irqflags annotations to do_debug_exception()
	mmc: core: fix using wrong io voltage if mmc_select_hs200 fails
	mm/rmap: replace BUG_ON(anon_vma->degree) with VM_WARN_ON
	extcon: usb-gpio: Don't miss event during suspend/resume
	kbuild: setlocalversion: print error to STDERR
	usb: gadget: composite: fix dereference after null check coverify warning
	usb: gadget: Add the gserial port checking in gs_start_tx()
	tcp/dccp: drop SYN packets if accept queue is full
	serial: sprd: adjust TIMEOUT to a big value
	Hang/soft lockup in d_invalidate with simultaneous calls
	arm64: traps: disable irq in die()
	usb: renesas_usbhs: gadget: fix unused-but-set-variable warning
	serial: sprd: clear timeout interrupt only rather than all interrupts
	lib/int_sqrt: optimize small argument
	USB: core: only clean up what we allocated
	rtc: Fix overflow when converting time64_t to rtc_time
	ath10k: avoid possible string overflow
	Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt
	Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer
	sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task()
	mmc: block: Allow more than 8 partitions per card
	arm64: fix COMPAT_SHMLBA definition for large pages
	efi: stub: define DISABLE_BRANCH_PROFILING for all architectures
	ARM: 8458/1: bL_switcher: add GIC dependency
	ARM: 8494/1: mm: Enable PXN when running non-LPAE kernel on LPAE processor
	android: unconditionally remove callbacks in sync_fence_free()
	vmstat: make vmstat_updater deferrable again and shut down on idle
	hid-sensor-hub.c: fix wrong do_div() usage
	arm64: hide __efistub_ aliases from kallsyms
	perf: Synchronously free aux pages in case of allocation failure
	net: diag: support v4mapped sockets in inet_diag_find_one_icsk()
	Revert "mmc: block: don't use parameter prefix if built as module"
	writeback: initialize inode members that track writeback history
	coresight: fixing lockdep error
	coresight: coresight_unregister() function cleanup
	coresight: release reference taken by 'bus_find_device()'
	coresight: remove csdev's link from topology
	stm class: Fix locking in unbinding policy path
	stm class: Fix link list locking
	stm class: Prevent user-controllable allocations
	stm class: Support devices with multiple instances
	stm class: Fix unlocking braino in the error path
	stm class: Guard output assignment against concurrency
	stm class: Fix unbalanced module/device refcounting
	stm class: Fix a race in unlinking
	coresight: "DEVICE_ATTR_RO" should defined as static.
	coresight: etm4x: Check every parameter used by dma_xx_coherent.
	asm-generic: Fix local variable shadow in __set_fixmap_offset
	staging: ashmem: Avoid deadlock with mmap/shrink
	staging: ashmem: Add missing include
	staging: ion: Set minimum carveout heap allocation order to PAGE_SHIFT
	staging: goldfish: audio: fix compiliation on arm
	ARM: 8510/1: rework ARM_CPU_SUSPEND dependencies
	arm64/kernel: fix incorrect EL0 check in inv_entry macro
	mac80211: fix "warning: ‘target_metric’ may be used uninitialized"
	perf/ring_buffer: Refuse to begin AUX transaction after rb->aux_mmap_count drops
	arm64: kernel: Include _AC definition in page.h
	PM / Hibernate: Call flush_icache_range() on pages restored in-place
	stm class: Do not leak the chrdev in error path
	stm class: Fix stm device initialization order
	ipv6: fix endianness error in icmpv6_err
	usb: gadget: configfs: add mutex lock before unregister gadget
	usb: gadget: rndis: free response queue during REMOTE_NDIS_RESET_MSG
	cpu/hotplug: Handle unbalanced hotplug enable/disable
	video: fbdev: Set pixclock = 0 in goldfishfb
	arm64: kconfig: drop CONFIG_RTC_LIB dependency
	mmc: mmc: fix switch timeout issue caused by jiffies precision
	cfg80211: size various nl80211 messages correctly
	stmmac: copy unicast mac address to MAC registers
	dccp: do not use ipv6 header for ipv4 flow
	mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S
	net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec
	net: rose: fix a possible stack overflow
	Add hlist_add_tail_rcu() (Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
	packets: Always register packet sk in the same order
	tcp: do not use ipv6 header for ipv4 flow
	vxlan: Don't call gro_cells_destroy() before device is unregistered
	sctp: get sctphdr by offset in sctp_compute_cksum
	mac8390: Fix mmio access size probe
	btrfs: remove WARN_ON in log_dir_items
	btrfs: raid56: properly unmap parity page in finish_parity_scrub()
	ARM: imx6q: cpuidle: fix bug that CPU might not wake up at expected time
	ALSA: compress: add support for 32bit calls in a 64bit kernel
	ALSA: rawmidi: Fix potential Spectre v1 vulnerability
	ALSA: seq: oss: Fix Spectre v1 vulnerability
	ALSA: pcm: Fix possible OOB access in PCM oss plugins
	ALSA: pcm: Don't suspend stream in unrecoverable PCM state
	scsi: sd: Fix a race between closing an sd device and sd I/O
	scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host
	scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices
	tty: atmel_serial: fix a potential NULL pointer dereference
	staging: vt6655: Remove vif check from vnt_interrupt
	staging: vt6655: Fix interrupt race condition on device start up.
	serial: max310x: Fix to avoid potential NULL pointer dereference
	serial: sh-sci: Fix setting SCSCR_TIE while transferring data
	USB: serial: cp210x: add new device id
	USB: serial: ftdi_sio: add additional NovaTech products
	USB: serial: mos7720: fix mos_parport refcount imbalance on error path
	USB: serial: option: set driver_info for SIM5218 and compatibles
	USB: serial: option: add Olicard 600
	Disable kgdboc failed by echo space to /sys/module/kgdboc/parameters/kgdboc
	fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links
	gpio: adnp: Fix testing wrong value in adnp_gpio_direction_input
	perf intel-pt: Fix TSC slip
	x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y
	KVM: Reject device ioctls from processes other than the VM's creator
	xhci: Fix port resume done detection for SS ports with LPM enabled
	Revert "USB: core: only clean up what we allocated"
	arm64: support keyctl() system call in 32-bit mode
	coresight: removing bind/unbind options from sysfs
	stm class: Hide STM-specific options if STM is disabled
	Linux 4.4.178

Change-Id: Iac01be124213731798a36b20d80ea3a8e911d025
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-04-03 10:21:44 +02:00
Christoph Lameter
bdf3c006b9 vmstat: make vmstat_updater deferrable again and shut down on idle
[ Upstream commit 0eb77e9880321915322d42913c3b53241739c8aa ]

Currently the vmstat updater is not deferrable as a result of commit
ba4877b9ca ("vmstat: do not use deferrable delayed work for
vmstat_update").  This in turn can cause multiple interruptions of the
applications because the vmstat updater may run at

Make vmstate_update deferrable again and provide a function that folds
the differentials when the processor is going to idle mode thus
addressing the issue of the above commit in a clean way.

Note that the shepherd thread will continue scanning the differentials
from another processor and will reenable the vmstat workers if it
detects any changes.

Fixes: ba4877b9ca ("vmstat: do not use deferrable delayed work for vmstat_update")
Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-03 06:23:21 +02:00
Konstantin Khlebnikov
7f69a980f6 mm/rmap: replace BUG_ON(anon_vma->degree) with VM_WARN_ON
commit e4c5800a3991f0c6a766983535dfc10d51802cf6 upstream.

This check effectively catches anon vma hierarchy inconsistence and some
vma corruptions.  It was effective for catching corner cases in anon vma
reusing logic.  For now this code seems stable so check could be hidden
under CONFIG_DEBUG_VM and replaced with WARN because it's not so fatal.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Suggested-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03 06:23:18 +02:00
Greg Kroah-Hartman
349ac1a59c This is the 4.4.177 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlyV4+kACgkQONu9yGCS
 aT5T2RAAn9hyo4LmxMvxab61d+PSEfn9TKhNjEtF8vFKNiYb+W+vI0ALHYSWcT1Z
 O5T4d1TeSeMrs9G1McL/D80vMJFIzcg0a9QIYuFObFAB21VpDiiGcVc74d+6fHtH
 m6loPE1d2GCpzwJ7VOCvdC9DR8C9SK0IVANyMJApXUL8mkNRo2H6vY/NGt65+5zb
 vioEbGbXZQJl1GvvwquM6cX9ABH4nyAU1yTX9r2CHMFCBQ0JDkpY4yxClY1NBZ02
 1Rc1NpJCR6OJUPvQUpyHuY5rkkPfM12Iz9dxFHARXvtTsmzm3AFdkev5GEMlR5e1
 hNXs6ZPyTADJL/fKO8nmeKwKf30xTaWObgMw9A3d8FOFSmDXAW6FLKAmIz+yZBGc
 27Tta1pGkZscC1iajEX2dcp5Zjkwr4y/HA5EJJ3jCCwrfTPDL5u8N900GbKMx4Lk
 EgPB3byZUAn/9k1m5HEA8RS08LqsNTAEA2Q6nZZhuhmqGJQPRtbBPG7tib9bvhUy
 KBLQdqJ8ubi9T1EopHu8xZdpZbbB/uCS+FB6NIkXuWR1IHkAGdEPheHrv3tuR5rf
 8/2OU970h63ztE5qHFsBci2uC4htiZFY62NULiPbI7HjeEUdym0AGK4JzGnn0lnX
 8McOBeOKwQwR5XuHZcMKWrsstt4mv9zo5QOdCJ1XDxFv628G2dQ=
 =eGAC
 -----END PGP SIGNATURE-----

Merge 4.4.177 into android-4.4-p

Changes in 4.4.177
	ceph: avoid repeatedly adding inode to mdsc->snap_flush_list
	numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
	KEYS: allow reaching the keys quotas exactly
	mfd: ti_am335x_tscadc: Use PLATFORM_DEVID_AUTO while registering mfd cells
	mfd: twl-core: Fix section annotations on {,un}protect_pm_master
	mfd: db8500-prcmu: Fix some section annotations
	mfd: ab8500-core: Return zero in get_register_interruptible()
	mfd: qcom_rpm: write fw_version to CTRL_REG
	mfd: wm5110: Add missing ASRC rate register
	mfd: mc13xxx: Fix a missing check of a register-read failure
	net: hns: Fix use after free identified by SLUB debug
	MIPS: ath79: Enable OF serial ports in the default config
	scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param
	scsi: isci: initialize shost fully before calling scsi_add_host()
	MIPS: jazz: fix 64bit build
	isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
	atm: he: fix sign-extension overflow on large shift
	leds: lp5523: fix a missing check of return value of lp55xx_read
	isdn: avm: Fix string plus integer warning from Clang
	RDMA/srp: Rework SCSI device reset handling
	KEYS: user: Align the payload buffer
	KEYS: always initialize keyring_index_key::desc_len
	batman-adv: fix uninit-value in batadv_interface_tx()
	net/packet: fix 4gb buffer limit due to overflow check
	team: avoid complex list operations in team_nl_cmd_options_set()
	sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()
	net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
	ARCv2: Enable unaligned access in early ASM code
	Revert "bridge: do not add port to router list when receives query with source 0.0.0.0"
	libceph: handle an empty authorize reply
	scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached
	drm/msm: Unblock writer if reader closes file
	ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field
	ALSA: compress: prevent potential divide by zero bugs
	thermal: int340x_thermal: Fix a NULL vs IS_ERR() check
	usb: dwc3: gadget: Fix the uninitialized link_state when udc starts
	usb: gadget: Potential NULL dereference on allocation error
	ASoC: dapm: change snprintf to scnprintf for possible overflow
	ASoC: imx-audmux: change snprintf to scnprintf for possible overflow
	ARC: fix __ffs return value to avoid build warnings
	mac80211: fix miscounting of ttl-dropped frames
	serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling
	scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()
	net: altera_tse: fix connect_local_phy error path
	ibmveth: Do not process frames after calling napi_reschedule
	mac80211: don't initiate TDLS connection if station is not associated to AP
	cfg80211: extend range deviation for DMG
	KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
	arm/arm64: KVM: Feed initialized memory to MMIO accesses
	KVM: arm/arm64: Fix MMIO emulation data handling
	powerpc: Always initialize input array when calling epapr_hypercall()
	mmc: spi: Fix card detection during probe
	mm: enforce min addr even if capable() in expand_downwards()
	x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
	USB: serial: option: add Telit ME910 ECM composition
	USB: serial: cp210x: add ID for Ingenico 3070
	USB: serial: ftdi_sio: add ID for Hjelmslund Electronics USB485
	cpufreq: Use struct kobj_attribute instead of struct global_attr
	sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
	ncpfs: fix build warning of strncpy
	isdn: isdn_tty: fix build warning of strncpy
	staging: lustre: fix buffer overflow of string buffer
	net-sysfs: Fix mem leak in netdev_register_kobject
	sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
	team: Free BPF filter when unregistering netdev
	bnxt_en: Drop oversize TX packets to prevent errors.
	net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails
	xen-netback: fix occasional leak of grant ref mappings under memory pressure
	net: Add __icmp_send helper.
	net: avoid use IPCB in cipso_v4_error
	net: phy: Micrel KSZ8061: link failure after cable connect
	x86/CPU/AMD: Set the CPB bit unconditionally on F17h
	applicom: Fix potential Spectre v1 vulnerabilities
	MIPS: irq: Allocate accurate order pages for irq stack
	hugetlbfs: fix races and page leaks during migration
	netlabel: fix out-of-bounds memory accesses
	net: dsa: mv88e6xxx: Fix u64 statistics
	ip6mr: Do not call __IP6_INC_STATS() from preemptible context
	media: uvcvideo: Fix 'type' check leading to overflow
	vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel
	perf tools: Handle TOPOLOGY headers with no CPU
	IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM
	ipvs: Fix signed integer overflow when setsockopt timeout
	iommu/amd: Fix IOMMU page flush when detach device from a domain
	xtensa: SMP: fix ccount_timer_shutdown
	xtensa: SMP: fix secondary CPU initialization
	xtensa: smp_lx200_defconfig: fix vectors clash
	xtensa: SMP: mark each possible CPU as present
	xtensa: SMP: limit number of possible CPUs by NR_CPUS
	net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case
	net: hns: Fix wrong read accesses via Clause 45 MDIO protocol
	net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup()
	gpio: vf610: Mask all GPIO interrupts
	nfs: Fix NULL pointer dereference of dev_name
	scsi: libfc: free skb when receiving invalid flogi resp
	platform/x86: Fix unmet dependency warning for SAMSUNG_Q10
	cifs: fix computation for MAX_SMB2_HDR_SIZE
	x86/kexec: Don't setup EFI info if EFI runtime is not enabled
	x86_64: increase stack size for KASAN_EXTRA
	mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
	mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
	fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
	autofs: drop dentry reference only when it is never used
	autofs: fix error return in autofs_fill_super()
	ARM: pxa: ssp: unneeded to free devm_ allocated data
	irqchip/mmp: Only touch the PJ4 IRQ & FIQ bits on enable/disable
	dmaengine: at_xdmac: Fix wrongfull report of a channel as in use
	dmaengine: dmatest: Abort test in case of mapping error
	s390/qeth: fix use-after-free in error path
	perf symbols: Filter out hidden symbols from labels
	MIPS: Remove function size check in get_frame_info()
	Input: wacom_serial4 - add support for Wacom ArtPad II tablet
	Input: elan_i2c - add id for touchpad found in Lenovo s21e-20
	iscsi_ibft: Fix missing break in switch statement
	futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
	ARM: dts: exynos: Add minimal clkout parameters to Exynos3250 PMU
	Revert "x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls"
	ARM: dts: exynos: Do not ignore real-world fuse values for thermal zone 0 on Exynos5420
	udplite: call proper backlog handlers
	netfilter: x_tables: enforce nul-terminated table name from getsockopt GET_ENTRIES
	netfilter: nfnetlink_log: just returns error for unknown command
	netfilter: nfnetlink_acct: validate NFACCT_FILTER parameters
	netfilter: nf_conntrack_tcp: Fix stack out of bounds when parsing TCP options
	KEYS: restrict /proc/keys by credentials at open time
	l2tp: fix infoleak in l2tp_ip6_recvmsg()
	net: hsr: fix memory leak in hsr_dev_finalize()
	net: sit: fix UBSAN Undefined behaviour in check_6rd
	net/x25: fix use-after-free in x25_device_event()
	net/x25: reset state in x25_connect()
	pptp: dst_release sk_dst_cache in pptp_sock_destruct
	ravb: Decrease TxFIFO depth of Q3 and Q2 to one
	route: set the deleted fnhe fnhe_daddr to 0 in ip_del_fnhe to fix a race
	tcp: handle inet_csk_reqsk_queue_add() failures
	net/mlx4_core: Fix reset flow when in command polling mode
	net/mlx4_core: Fix qp mtt size calculation
	net/x25: fix a race in x25_bind()
	mdio_bus: Fix use-after-free on device_register fails
	net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255
	missing barriers in some of unix_sock ->addr and ->path accesses
	ipvlan: disallow userns cap_net_admin to change global mode/flags
	vxlan: test dev->flags & IFF_UP before calling gro_cells_receive()
	vxlan: Fix GRO cells race condition between receive and link delete
	net/hsr: fix possible crash in add_timer()
	gro_cells: make sure device is up in gro_cells_receive()
	tcp/dccp: remove reqsk_put() from inet_child_forget()
	ALSA: bebob: use more identical mod_alias for Saffire Pro 10 I/O against Liquid Saffire 56
	fs/9p: use fscache mutex rather than spinlock
	It's wrong to add len to sector_nr in raid10 reshape twice
	media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused()
	9p: use inode->i_lock to protect i_size_write() under 32-bit
	9p/net: fix memory leak in p9_client_create
	ASoC: fsl_esai: fix register setting issue in RIGHT_J mode
	stm class: Fix an endless loop in channel allocation
	crypto: caam - fixed handling of sg list
	crypto: ahash - fix another early termination in hash walk
	gpu: ipu-v3: Fix i.MX51 CSI control registers offset
	gpu: ipu-v3: Fix CSI offsets for imx53
	s390/dasd: fix using offset into zero size array error
	ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be uninitialized
	Input: matrix_keypad - use flush_delayed_work()
	i2c: cadence: Fix the hold bit setting
	Input: st-keyscan - fix potential zalloc NULL dereference
	ARM: 8824/1: fix a migrating irq bug when hotplug cpu
	assoc_array: Fix shortcut creation
	scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
	net: systemport: Fix reception of BPDUs
	pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins
	net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
	ASoC: topology: free created components in tplg load error
	arm64: Relax GIC version check during early boot
	tmpfs: fix link accounting when a tmpfile is linked in
	ARC: uacces: remove lp_start, lp_end from clobber list
	phonet: fix building with clang
	mac80211_hwsim: propagate genlmsg_reply return code
	net: set static variable an initial value in atl2_probe()
	tmpfs: fix uninitialized return value in shmem_link
	stm class: Prevent division by zero
	crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling
	CIFS: Fix read after write for files with read caching
	tracing: Do not free iter->trace in fail path of tracing_open_pipe()
	ACPI / device_sysfs: Avoid OF modalias creation for removed device
	regulator: s2mps11: Fix steps for buck7, buck8 and LDO35
	regulator: s2mpa01: Fix step values for some LDOs
	clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
	clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
	s390/virtio: handle find on invalid queue gracefully
	scsi: virtio_scsi: don't send sc payload with tmfs
	scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
	m68k: Add -ffreestanding to CFLAGS
	btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
	Btrfs: fix corruption reading shared and compressed extents after hole punching
	crypto: pcbc - remove bogus memcpy()s with src == dest
	cpufreq: tegra124: add missing of_node_put()
	cpufreq: pxa2xx: remove incorrect __init annotation
	ext4: fix crash during online resizing
	ext2: Fix underflow in ext2_max_size()
	clk: ingenic: Fix round_rate misbehaving with non-integer dividers
	dmaengine: usb-dmac: Make DMAC system sleep callbacks explicit
	mm/vmalloc: fix size check for remap_vmalloc_range_partial()
	kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
	intel_th: Don't reference unassigned outputs
	parport_pc: fix find_superio io compare code, should use equal test.
	i2c: tegra: fix maximum transfer size
	perf bench: Copy kernel files needed to build mem{cpy,set} x86_64 benchmarks
	serial: 8250_pci: Fix number of ports for ACCES serial cards
	serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
	jbd2: clear dirty flag when revoking a buffer from an older transaction
	jbd2: fix compile warning when using JBUFFER_TRACE
	powerpc/32: Clear on-stack exception marker upon exception return
	powerpc/wii: properly disable use of BATs when requested.
	powerpc/powernv: Make opal log only readable by root
	powerpc/83xx: Also save/restore SPRG4-7 during suspend
	ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
	dm: fix to_sector() for 32bit
	NFS41: pop some layoutget errors to application
	perf intel-pt: Fix CYC timestamp calculation after OVF
	perf auxtrace: Define auxtrace record alignment
	perf intel-pt: Fix overlap calculation for padding
	md: Fix failed allocation of md_register_thread
	NFS: Fix an I/O request leakage in nfs_do_recoalesce
	NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
	nfsd: fix memory corruption caused by readdir
	nfsd: fix wrong check in write_v4_end_grace()
	PM / wakeup: Rework wakeup source timer cancellation
	rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
	media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
	drm/radeon/evergreen_cs: fix missing break in switch statement
	KVM: nVMX: Sign extend displacements of VMX instr's mem operands
	KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
	KVM: X86: Fix residual mmio emulation request to userspace
	Linux 4.4.177

Change-Id: Ia33b88c9634e04612874d79ce4cc166e8aa8096a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-03-23 09:28:32 +01:00
Roman Penyaev
49b3c4a292 mm/vmalloc: fix size check for remap_vmalloc_range_partial()
commit 401592d2e095947344e10ec0623adbcd58934dd4 upstream.

When VM_NO_GUARD is not set area->size includes adjacent guard page,
thus for correct size checking get_vm_area_size() should be used, but
not area->size.

This fixes possible kernel oops when userspace tries to mmap an area on
1 page bigger than was allocated by vmalloc_user() call: the size check
inside remap_vmalloc_range_partial() accounts non-existing guard page
also, so check successfully passes but vmalloc_to_page() returns NULL
(guard page does not physically exist).

The following code pattern example should trigger an oops:

  static int oops_mmap(struct file *file, struct vm_area_struct *vma)
  {
        void *mem;

        mem = vmalloc_user(4096);
        BUG_ON(!mem);
        /* Do not care about mem leak */

        return remap_vmalloc_range(vma, mem, 0);
  }

And userspace simply mmaps size + PAGE_SIZE:

  mmap(NULL, 8192, PROT_WRITE|PROT_READ, MAP_PRIVATE, fd, 0);

Possible candidates for oops which do not have any explicit size
checks:

   *** drivers/media/usb/stkwebcam/stk-webcam.c:
   v4l_stk_mmap[789]   ret = remap_vmalloc_range(vma, sbuf->buffer, 0);

Or the following one:

   *** drivers/video/fbdev/core/fbmem.c
   static int
   fb_mmap(struct file *file, struct vm_area_struct * vma)
        ...
        res = fb->fb_mmap(info, vma);

Where fb_mmap callback calls remap_vmalloc_range() directly without any
explicit checks:

   *** drivers/video/fbdev/vfb.c
   static int vfb_mmap(struct fb_info *info,
             struct vm_area_struct *vma)
   {
       return remap_vmalloc_range(vma, (void *)info->fix.smem_start, vma->vm_pgoff);
   }

Link: http://lkml.kernel.org/r/20190103145954.16942-2-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23 08:44:37 +01:00
Darrick J. Wong
5f4c9964d1 tmpfs: fix uninitialized return value in shmem_link
[ Upstream commit 29b00e609960ae0fcff382f4c7079dd0874a5311 ]

When we made the shmem_reserve_inode call in shmem_link conditional, we
forgot to update the declaration for ret so that it always has a known
value.  Dan Carpenter pointed out this deficiency in the original patch.

Fixes: 1062af920c07 ("tmpfs: fix link accounting when a tmpfile is linked in")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Matej Kupljen <matej.kupljen@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-03-23 08:44:34 +01:00
Darrick J. Wong
f8f413336b tmpfs: fix link accounting when a tmpfile is linked in
[ Upstream commit 1062af920c07f5b54cf5060fde3339da6df0cf6b ]

tmpfs has a peculiarity of accounting hard links as if they were
separate inodes: so that when the number of inodes is limited, as it is
by default, a user cannot soak up an unlimited amount of unreclaimable
dcache memory just by repeatedly linking a file.

But when v3.11 added O_TMPFILE, and the ability to use linkat() on the
fd, we missed accommodating this new case in tmpfs: "df -i" shows that
an extra "inode" remains accounted after the file is unlinked and the fd
closed and the actual inode evicted.  If a user repeatedly links
tmpfiles into a tmpfs, the limit will be hit (ENOSPC) even after they
are deleted.

Just skip the extra reservation from shmem_link() in this case: there's
a sense in which this first link of a tmpfile is then cheaper than a
hard link of another file, but the accounting works out, and there's
still good limiting, so no need to do anything more complicated.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1902182134370.7035@eggly.anvils
Fixes: f4e0c30c19 ("allow the temp files created by open() to be linked to")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Matej Kupljen <matej.kupljen@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-03-23 08:44:34 +01:00