Commit graph

253 commits

Author SHA1 Message Date
Srinivasarao P
b87d31674a Merge android-4.4.153 (5e24b4e) into msm-4.4
* refs/heads/tmp-5e24b4e
  Linux 4.4.153
  ovl: warn instead of error if d_type is not supported
  ovl: Do d_type check only if work dir creation was successful
  ovl: Ensure upper filesystem supports d_type
  x86/mm: Fix use-after-free of ldt_struct
  x86/mm/pat: Fix L1TF stable backport for CPA
  ANDROID: x86_64_cuttlefish_defconfig: Enable lz4 compression for zram
  UPSTREAM: drivers/block/zram/zram_drv.c: fix bug storing backing_dev
  BACKPORT: zram: introduce zram memory tracking
  BACKPORT: zram: record accessed second
  BACKPORT: zram: mark incompressible page as ZRAM_HUGE
  UPSTREAM: zram: correct flag name of ZRAM_ACCESS
  UPSTREAM: zram: Delete gendisk before cleaning up the request queue
  UPSTREAM: drivers/block/zram/zram_drv.c: make zram_page_end_io() static
  BACKPORT: zram: set BDI_CAP_STABLE_WRITES once
  UPSTREAM: zram: fix null dereference of handle
  UPSTREAM: zram: add config and doc file for writeback feature
  BACKPORT: zram: read page from backing device
  BACKPORT: zram: write incompressible pages to backing device
  BACKPORT: zram: identify asynchronous IO's return value
  BACKPORT: zram: add free space management in backing device
  UPSTREAM: zram: add interface to specif backing device
  UPSTREAM: zram: rename zram_decompress_page to __zram_bvec_read
  UPSTREAM: zram: inline zram_compress
  UPSTREAM: zram: clean up duplicated codes in __zram_bvec_write
  Linux 4.4.152
  reiserfs: fix broken xattr handling (heap corruption, bad retval)
  i2c: imx: Fix race condition in dma read
  PCI: pciehp: Fix use-after-free on unplug
  PCI: Skip MPS logic for Virtual Functions (VFs)
  PCI: hotplug: Don't leak pci_slot on registration failure
  parisc: Remove unnecessary barriers from spinlock.h
  bridge: Propagate vlan add failure to user
  packet: refine ring v3 block size test to hold one frame
  netfilter: conntrack: dccp: treat SYNC/SYNCACK as invalid if no prior state
  xfrm_user: prevent leaking 2 bytes of kernel memory
  parisc: Remove ordered stores from syscall.S
  ext4: fix spectre gadget in ext4_mb_regular_allocator()
  KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer
  staging: android: ion: check for kref overflow
  tcp: identify cryptic messages as TCP seq # bugs
  net: qca_spi: Fix log level if probe fails
  net: qca_spi: Make sure the QCA7000 reset is triggered
  net: qca_spi: Avoid packet drop during initial sync
  net: usb: rtl8150: demote allmulti message to dev_dbg()
  net/ethernet/freescale/fman: fix cross-build error
  drm/nouveau/gem: off by one bugs in nouveau_gem_pushbuf_reloc_apply()
  tcp: remove DELAYED ACK events in DCTCP
  qlogic: check kstrtoul() for errors
  packet: reset network header if packet shorter than ll reserved space
  ixgbe: Be more careful when modifying MAC filters
  ARM: dts: am3517.dtsi: Disable reference to OMAP3 OTG controller
  ARM: 8780/1: ftrace: Only set kernel memory back to read-only after boot
  perf llvm-utils: Remove bashism from kernel include fetch script
  bnxt_en: Fix for system hang if request_irq fails
  drm/armada: fix colorkey mode property
  ieee802154: fakelb: switch from BUG_ON() to WARN_ON() on problem
  ieee802154: at86rf230: use __func__ macro for debug messages
  ieee802154: at86rf230: switch from BUG_ON() to WARN_ON() on problem
  ARM: pxa: irq: fix handling of ICMR registers in suspend/resume
  netfilter: x_tables: set module owner for icmp(6) matches
  smsc75xx: Add workaround for gigabit link up hardware errata.
  kasan: fix shadow_size calculation error in kasan_module_alloc
  tracing: Use __printf markup to silence compiler
  ARM: imx_v4_v5_defconfig: Select ULPI support
  ARM: imx_v6_v7_defconfig: Select ULPI support
  HID: wacom: Correct touch maximum XY of 2nd-gen Intuos
  m68k: fix "bad page state" oops on ColdFire boot
  bnx2x: Fix receiving tx-timeout in error or recovery state.
  drm/exynos: decon5433: Fix WINCONx reset value
  drm/exynos: decon5433: Fix per-plane global alpha for XRGB modes
  drm/exynos: gsc: Fix support for NV16/61, YUV420/YVU420 and YUV422 modes
  md/raid10: fix that replacement cannot complete recovery after reassemble
  dmaengine: k3dma: Off by one in k3_of_dma_simple_xlate()
  ARM: dts: da850: Fix interrups property for gpio
  selftests/x86/sigreturn/64: Fix spurious failures on AMD CPUs
  perf report powerpc: Fix crash if callchain is empty
  perf test session topology: Fix test on s390
  usb: xhci: increase CRS timeout value
  ARM: dts: am437x: make edt-ft5x06 a wakeup source
  brcmfmac: stop watchdog before detach and free everything
  cxgb4: when disabling dcb set txq dcb priority to 0
  Smack: Mark inode instant in smack_task_to_inode
  ipv6: mcast: fix unsolicited report interval after receiving querys
  locking/lockdep: Do not record IRQ state within lockdep code
  net: davinci_emac: match the mdio device against its compatible if possible
  ARC: Enable machine_desc->init_per_cpu for !CONFIG_SMP
  net: propagate dev_get_valid_name return code
  net: hamradio: use eth_broadcast_addr
  enic: initialize enic->rfs_h.lock in enic_probe
  qed: Add sanity check for SIMD fastpath handler.
  arm64: make secondary_start_kernel() notrace
  scsi: xen-scsifront: add error handling for xenbus_printf
  usb: gadget: dwc2: fix memory leak in gadget_init()
  usb: gadget: composite: fix delayed_status race condition when set_interface
  usb: dwc2: fix isoc split in transfer with no data
  ARM: dts: Cygnus: Fix I2C controller interrupt type
  selftests: sync: add config fragment for testing sync framework
  selftests: zram: return Kselftest Skip code for skipped tests
  selftests: user: return Kselftest Skip code for skipped tests
  selftests: static_keys: return Kselftest Skip code for skipped tests
  selftests: pstore: return Kselftest Skip code for skipped tests
  netfilter: ipv6: nf_defrag: reduce struct net memory waste
  ARC: Explicitly add -mmedium-calls to CFLAGS
  ANDROID: x86_64_cuttlefish_defconfig: Enable zram and zstd
  BACKPORT: crypto: zstd - Add zstd support
  UPSTREAM: zram: add zstd to the supported algorithms list
  UPSTREAM: lib: Add zstd modules
  UPSTREAM: lib: Add xxhash module
  UPSTREAM: zram: rework copy of compressor name in comp_algorithm_store()
  UPSTREAM: zram: constify attribute_group structures.
  UPSTREAM: zram: count same page write as page_stored
  UPSTREAM: zram: reduce load operation in page_same_filled
  UPSTREAM: zram: use zram_free_page instead of open-coded
  UPSTREAM: zram: introduce zram data accessor
  UPSTREAM: zram: remove zram_meta structure
  UPSTREAM: zram: use zram_slot_lock instead of raw bit_spin_lock op
  BACKPORT: zram: partial IO refactoring
  BACKPORT: zram: handle multiple pages attached bio's bvec
  UPSTREAM: zram: fix operator precedence to get offset
  BACKPORT: zram: extend zero pages to same element pages
  BACKPORT: zram: remove waitqueue for IO done
  UPSTREAM: zram: remove obsolete sysfs attrs
  UPSTREAM: zram: support BDI_CAP_STABLE_WRITES
  UPSTREAM: zram: revalidate disk under init_lock
  BACKPORT: mm: support anonymous stable page
  UPSTREAM: zram: use __GFP_MOVABLE for memory allocation
  UPSTREAM: zram: drop gfp_t from zcomp_strm_alloc()
  UPSTREAM: zram: add more compression algorithms
  UPSTREAM: zram: delete custom lzo/lz4
  UPSTREAM: zram: cosmetic: cleanup documentation
  UPSTREAM: zram: use crypto api to check alg availability
  BACKPORT: zram: switch to crypto compress API
  UPSTREAM: zram: rename zstrm find-release functions
  UPSTREAM: zram: introduce per-device debug_stat sysfs node
  UPSTREAM: zram: remove max_comp_streams internals
  UPSTREAM: zram: user per-cpu compression streams
  BACKPORT: zsmalloc: require GFP in zs_malloc()
  UPSTREAM: zram/zcomp: do not zero out zcomp private pages
  UPSTREAM: zram: pass gfp from zcomp frontend to backend
  UPSTREAM: socket: close race condition between sock_close() and sockfs_setattr()
  ANDROID: Refresh x86_64_cuttlefish_defconfig
  Linux 4.4.151
  isdn: Disable IIOCDBGVAR
  Bluetooth: avoid killing an already killed socket
  x86/mm: Simplify p[g4um]d_page() macros
  serial: 8250_dw: always set baud rate in dw8250_set_termios
  ACPI / PM: save NVS memory for ASUS 1025C laptop
  ACPI: save NVS memory for Lenovo G50-45
  USB: option: add support for DW5821e
  USB: serial: sierra: fix potential deadlock at close
  ALSA: vxpocket: Fix invalid endian conversions
  ALSA: memalloc: Don't exceed over the requested size
  ALSA: hda: Correct Asrock B85M-ITX power_save blacklist entry
  ALSA: cs5535audio: Fix invalid endian conversion
  ALSA: virmidi: Fix too long output trigger loop
  ALSA: vx222: Fix invalid endian conversions
  ALSA: hda - Turn CX8200 into D3 as well upon reboot
  ALSA: hda - Sleep for 10ms after entering D3 on Conexant codecs
  net_sched: fix NULL pointer dereference when delete tcindex filter
  vsock: split dwork to avoid reinitializations
  net_sched: Fix missing res info when create new tc_index filter
  llc: use refcount_inc_not_zero() for llc_sap_find()
  l2tp: use sk_dst_check() to avoid race on sk->sk_dst_cache
  dccp: fix undefined behavior with 'cwnd' shift in ccid2_cwnd_restart()

Conflicts:
	drivers/block/zram/zram_drv.c
	drivers/staging/android/ion/ion.c
	include/linux/swap.h
	mm/zsmalloc.c

Change-Id: I1c437ac5133503a939d06d51ec778b65371df6d1
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2018-08-28 17:28:39 +05:30
Minchan Kim
354502bc5e BACKPORT: mm: support anonymous stable page
During developemnt for zram-swap asynchronous writeback, I found strange
corruption of compressed page, resulting in:

  Modules linked in: zram(E)
  CPU: 3 PID: 1520 Comm: zramd-1 Tainted: G            E   4.8.0-mm1-00320-ge0d4894c9c38-dirty #3274
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
  task: ffff88007620b840 task.stack: ffff880078090000
  RIP: set_freeobj.part.43+0x1c/0x1f
  RSP: 0018:ffff880078093ca8  EFLAGS: 00010246
  RAX: 0000000000000018 RBX: ffff880076798d88 RCX: ffffffff81c408c8
  RDX: 0000000000000018 RSI: 0000000000000000 RDI: 0000000000000246
  RBP: ffff880078093cb0 R08: 0000000000000000 R09: 0000000000000000
  R10: ffff88005bc43030 R11: 0000000000001df3 R12: ffff880076798d88
  R13: 000000000005bc43 R14: ffff88007819d1b8 R15: 0000000000000001
  FS:  0000000000000000(0000) GS:ffff88007e380000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fc934048f20 CR3: 0000000077b01000 CR4: 00000000000406e0
  Call Trace:
    obj_malloc+0x22b/0x260
    zs_malloc+0x1e4/0x580
    zram_bvec_rw+0x4cd/0x830 [zram]
    page_requests_rw+0x9c/0x130 [zram]
    zram_thread+0xe6/0x173 [zram]
    kthread+0xca/0xe0
    ret_from_fork+0x25/0x30

With investigation, it reveals currently stable page doesn't support
anonymous page.  IOW, reuse_swap_page can reuse the page without waiting
writeback completion so it can overwrite page zram is compressing.

Unfortunately, zram has used per-cpu stream feature from v4.7.
It aims for increasing cache hit ratio of scratch buffer for
compressing. Downside of that approach is that zram should ask
memory space for compressed page in per-cpu context which requires
stricted gfp flag which could be failed. If so, it retries to
allocate memory space out of per-cpu context so it could get memory
this time and compress the data again, copies it to the memory space.

In this scenario, zram assumes the data should never be changed
but it is not true unless stable page supports. So, If the data is
changed under us, zram can make buffer overrun because second
compression size could be bigger than one we got in previous trial
and blindly, copy bigger size object to smaller buffer which is
buffer overrun. The overrun breaks zsmalloc free object chaining
so system goes crash like above.

I think below is same problem.
https://bugzilla.suse.com/show_bug.cgi?id=997574

Unfortunately, reuse_swap_page should be atomic so that we cannot wait on
writeback in there so the approach in this patch is simply return false if
we found it needs stable page.  Although it increases memory footprint
temporarily, it happens rarely and it should be reclaimed easily althoug
it happened.  Also, It would be better than waiting of IO completion,
which is critial path for application latency.

Fixes: da9556a2367c ("zram: user per-cpu compression streams")
Link: http://lkml.kernel.org/r/20161120233015.GA14113@bbox
Link: http://lkml.kernel.org/r/1482366980-3782-2-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Hyeoncheol Lee <cheol.lee@lge.com>
Cc: <yjay.kim@lge.com>
Cc: Sangseok Lee <sangseok.lee@lge.com>
Cc: <stable@vger.kernel.org> [4.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit f05714293a591038304ddae7cb0dd747bb3786cc)
Signed-off-by: Peter Kalauskas <peskal@google.com>
Bug: 112488418
Change-Id: I0fa5012aff9daf614b2d1d04f35b86ff7043ff21
2018-08-23 12:00:17 -07:00
Srinivasarao P
79de04d806 Merge android-4.4.148 (f057ff9) into msm-4.4
* refs/heads/tmp-f057ff9
  Linux 4.4.148
  x86/speculation/l1tf: Unbreak !__HAVE_ARCH_PFN_MODIFY_ALLOWED architectures
  x86/init: fix build with CONFIG_SWAP=n
  x86/speculation/l1tf: Fix up CPU feature flags
  x86/mm/kmmio: Make the tracer robust against L1TF
  x86/mm/pat: Make set_memory_np() L1TF safe
  x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert
  x86/speculation/l1tf: Invert all not present mappings
  x86/speculation/l1tf: Fix up pte->pfn conversion for PAE
  x86/speculation/l1tf: Protect PAE swap entries against L1TF
  x86/cpufeatures: Add detection of L1D cache flush support.
  x86/speculation/l1tf: Extend 64bit swap file size limit
  x86/bugs: Move the l1tf function and define pr_fmt properly
  x86/speculation/l1tf: Limit swap file size to MAX_PA/2
  x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings
  mm: fix cache mode tracking in vm_insert_mixed()
  mm: Add vm_insert_pfn_prot()
  x86/speculation/l1tf: Add sysfs reporting for l1tf
  x86/speculation/l1tf: Make sure the first page is always reserved
  x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation
  x86/speculation/l1tf: Protect swap entries against L1TF
  x86/speculation/l1tf: Change order of offset/type in swap entry
  mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1
  x86/mm: Fix swap entry comment and macro
  x86/mm: Move swap offset/type up in PTE to work around erratum
  x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT
  x86/irqflags: Provide a declaration for native_save_fl
  kprobes/x86: Fix %p uses in error messages
  x86/speculation: Protect against userspace-userspace spectreRSB
  x86/paravirt: Fix spectre-v2 mitigations for paravirt guests
  ARM: dts: imx6sx: fix irq for pcie bridge
  IB/ocrdma: fix out of bounds access to local buffer
  IB/mlx4: Mark user MR as writable if actual virtual memory is writable
  IB/core: Make testing MR flags for writability a static inline function
  fix __legitimize_mnt()/mntput() race
  fix mntput/mntput race
  root dentries need RCU-delayed freeing
  scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled
  ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
  xen/netfront: don't cache skb_shinfo()
  parisc: Define mb() and add memory barriers to assembler unlock sequences
  parisc: Enable CONFIG_MLONGCALLS by default
  fork: unconditionally clear stack on fork
  ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV
  tpm: fix race condition in tpm_common_write()
  ext4: fix check to prevent initializing reserved inodes
  Linux 4.4.147
  jfs: Fix inconsistency between memory allocation and ea_buf->max_size
  i2c: imx: Fix reinit_completion() use
  ring_buffer: tracing: Inherit the tracing setting to next ring buffer
  ACPI / PCI: Bail early in acpi_pci_add_bus() if there is no ACPI handle
  ext4: fix false negatives *and* false positives in ext4_check_descriptors()
  netlink: Don't shift on 64 for ngroups
  netlink: Don't shift with UB on nlk->ngroups
  netlink: Do not subscribe to non-existent groups
  nohz: Fix local_timer_softirq_pending()
  genirq: Make force irq threading setup more robust
  scsi: qla2xxx: Return error when TMF returns
  scsi: qla2xxx: Fix ISP recovery on unload

Conflicts:
	include/linux/swapfile.h

Removed CONFIG_CRYPTO_ECHAINIV from defconfig files since this upmerge is
adding this config to Kconfig file.

Change-Id: Ide96c29f919d76590c2bdccf356d1d464a892fd7
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2018-08-24 00:07:01 +05:30
Greg Kroah-Hartman
f057ff9377 This is the 4.4.148 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlt0SdMACgkQONu9yGCS
 aT4NLxAAovDVqFFejBk8M1nxAtQSqRzB2PMboc+l62clKa6BAJtWAsgPFjECgzEp
 edlDeUttliQoTB6S3GYYM82oj50myUKlGvlJRptRE3Gr1iYubdB/U2RDmwEzCxbC
 AEzu4tEv+Z23jaLGsuAIOg66faBTqqgVoKtp/TlKwl+Y/b6WzkI0gRzxWTBFnAlj
 AKuhmoc1JoS9JF/MQ4q02gYSQ0g1eTpr1gIU2GMow9pK9Rahk4Jdl4yRjNLUFDxd
 ojrBYCoElf90R3q+NvmZBbzxwanm2OgzeEBffhh647aB5kHEUd5h4z9w+sIoXmSq
 50uD59q62Umdpp2O125HH5KHeHbcTUCXXp3g1VY6A/+d9dTs9GZqo//vf6aJsxEb
 gixoYyNbIcqw1k0jhEEW2ah3F3j+ZHvNmLKPyV4U8h2Tw2K5QKzFu/fVnQw7Xfv6
 Gv0z1TQ4Y+w2bqpzDiDBO4sRgKOXVr3hzWa0jggW5AoKWTco/oIVkE+Rqmj65AfK
 DROqCMQq75K+pymrM8I3wTXRSxtSH9bO/iqCu2LiiaG+JAkvr0OIHEHgizxLtAFO
 ivpREPDsWhVAYUmnoCgJa8Za1GdJk1I9uvxoJY1TBL8gbcYc61yjjeJDYqLghuNT
 EhrvFvJ4r/fQ6BJ76+rO7FSJIl+Kov2Uf7CWql3Lzxps6/u5GNQ=
 =73dO
 -----END PGP SIGNATURE-----

Merge 4.4.148 into android-4.4

Changes in 4.4.148
	ext4: fix check to prevent initializing reserved inodes
	tpm: fix race condition in tpm_common_write()
	ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV
	fork: unconditionally clear stack on fork
	parisc: Enable CONFIG_MLONGCALLS by default
	parisc: Define mb() and add memory barriers to assembler unlock sequences
	xen/netfront: don't cache skb_shinfo()
	ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
	scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled
	root dentries need RCU-delayed freeing
	fix mntput/mntput race
	fix __legitimize_mnt()/mntput() race
	IB/core: Make testing MR flags for writability a static inline function
	IB/mlx4: Mark user MR as writable if actual virtual memory is writable
	IB/ocrdma: fix out of bounds access to local buffer
	ARM: dts: imx6sx: fix irq for pcie bridge
	x86/paravirt: Fix spectre-v2 mitigations for paravirt guests
	x86/speculation: Protect against userspace-userspace spectreRSB
	kprobes/x86: Fix %p uses in error messages
	x86/irqflags: Provide a declaration for native_save_fl
	x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT
	x86/mm: Move swap offset/type up in PTE to work around erratum
	x86/mm: Fix swap entry comment and macro
	mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1
	x86/speculation/l1tf: Change order of offset/type in swap entry
	x86/speculation/l1tf: Protect swap entries against L1TF
	x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation
	x86/speculation/l1tf: Make sure the first page is always reserved
	x86/speculation/l1tf: Add sysfs reporting for l1tf
	mm: Add vm_insert_pfn_prot()
	mm: fix cache mode tracking in vm_insert_mixed()
	x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings
	x86/speculation/l1tf: Limit swap file size to MAX_PA/2
	x86/bugs: Move the l1tf function and define pr_fmt properly
	x86/speculation/l1tf: Extend 64bit swap file size limit
	x86/cpufeatures: Add detection of L1D cache flush support.
	x86/speculation/l1tf: Protect PAE swap entries against L1TF
	x86/speculation/l1tf: Fix up pte->pfn conversion for PAE
	x86/speculation/l1tf: Invert all not present mappings
	x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert
	x86/mm/pat: Make set_memory_np() L1TF safe
	x86/mm/kmmio: Make the tracer robust against L1TF
	x86/speculation/l1tf: Fix up CPU feature flags
	x86/init: fix build with CONFIG_SWAP=n
	x86/speculation/l1tf: Unbreak !__HAVE_ARCH_PFN_MODIFY_ALLOWED architectures
	Linux 4.4.148

Change-Id: I83c857d9d9d74ee47e61d15eb411f276f057ba3d
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-08-15 18:20:41 +02:00
Andi Kleen
685b44483f x86/speculation/l1tf: Limit swap file size to MAX_PA/2
commit 377eeaa8e11fe815b1d07c81c4a0e2843a8c15eb upstream

For the L1TF workaround its necessary to limit the swap file size to below
MAX_PA/2, so that the higher bits of the swap offset inverted never point
to valid memory.

Add a mechanism for the architecture to override the swap file size check
in swapfile.c and add a x86 specific max swapfile check function that
enforces that limit.

The check is only enabled if the CPU is vulnerable to L1TF.

In VMs with 42bit MAX_PA the typical limit is 2TB now, on a native system
with 46bit PA it is 32TB. The limit is only per individual swap file, so
it's always possible to exceed these limits with multiple swap files or
partitions.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15 17:42:10 +02:00
Srinivasarao P
f9cff13b5d Merge android-4.4.135 (c9d74f2) into msm-4.4
* refs/heads/tmp-c9d74f2
  Linux 4.4.135
  Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
  Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
  Linux 4.4.134
  s390/ftrace: use expoline for indirect branches
  kdb: make "mdr" command repeat
  Bluetooth: btusb: Add device ID for RTL8822BE
  ASoC: samsung: i2s: Ensure the RCLK rate is properly determined
  regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()'
  scsi: lpfc: Fix frequency of Release WQE CQEs
  scsi: lpfc: Fix soft lockup in lpfc worker thread during LIP testing
  scsi: lpfc: Fix issue_lip if link is disabled
  netlabel: If PF_INET6, check sk_buff ip header version
  selftests/net: fixes psock_fanout eBPF test case
  perf report: Fix memory corruption in --branch-history mode --branch-history
  perf tests: Use arch__compare_symbol_names to compare symbols
  x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified
  drm/rockchip: Respect page offset for PRIME mmap calls
  MIPS: Octeon: Fix logging messages with spurious periods after newlines
  audit: return on memory error to avoid null pointer dereference
  crypto: sunxi-ss - Add MODULE_ALIAS to sun4i-ss
  clk: samsung: exynos3250: Fix PLL rates
  clk: samsung: exynos5250: Fix PLL rates
  clk: samsung: exynos5433: Fix PLL rates
  clk: samsung: exynos5260: Fix PLL rates
  clk: samsung: s3c2410: Fix PLL rates
  media: cx25821: prevent out-of-bounds read on array card
  udf: Provide saner default for invalid uid / gid
  PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
  serial: arc_uart: Fix out-of-bounds access through DT alias
  serial: fsl_lpuart: Fix out-of-bounds access through DT alias
  serial: imx: Fix out-of-bounds access through serial port index
  serial: mxs-auart: Fix out-of-bounds access through serial port index
  serial: samsung: Fix out-of-bounds access through serial port index
  serial: xuartps: Fix out-of-bounds access through DT alias
  rtc: tx4939: avoid unintended sign extension on a 24 bit shift
  staging: rtl8192u: return -ENOMEM on failed allocation of priv->oldaddr
  hwrng: stm32 - add reset during probe
  enic: enable rq before updating rq descriptors
  clk: rockchip: Prevent calculating mmc phase if clock rate is zero
  media: em28xx: USB bulk packet size fix
  dmaengine: pl330: fix a race condition in case of threaded irqs
  media: s3c-camif: fix out-of-bounds array access
  media: cx23885: Set subdev host data to clk_freq pointer
  media: cx23885: Override 888 ImpactVCBe crystal frequency
  ALSA: vmaster: Propagate slave error
  x86/devicetree: Fix device IRQ settings in DT
  x86/devicetree: Initialize device tree before using it
  usb: gadget: composite: fix incorrect handling of OS desc requests
  usb: gadget: udc: change comparison to bitshift when dealing with a mask
  gfs2: Fix fallocate chunk size
  cdrom: do not call check_disk_change() inside cdrom_open()
  hwmon: (pmbus/adm1275) Accept negative page register values
  hwmon: (pmbus/max8688) Accept negative page register values
  perf/core: Fix perf_output_read_group()
  ASoC: topology: create TLV data for dapm widgets
  powerpc: Add missing prototype for arch_irq_work_raise()
  usb: gadget: ffs: Execute copy_to_user() with USER_DS set
  usb: gadget: ffs: Let setup() return USB_GADGET_DELAYED_STATUS
  usb: dwc2: Fix interval type issue
  ipmi_ssif: Fix kernel panic at msg_done_handler
  PCI: Restore config space on runtime resume despite being unbound
  MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset
  xhci: zero usb device slot_id member when disabling and freeing a xhci slot
  KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use
  i2c: mv64xxx: Apply errata delay only in standard mode
  ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
  ACPICA: Events: add a return on failure from acpi_hw_register_read
  bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
  zorro: Set up z->dev.dma_mask for the DMA API
  clk: Don't show the incorrect clock phase
  cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path
  usb: dwc3: Update DWC_usb31 GTXFIFOSIZ reg fields
  arm: dts: socfpga: fix GIC PPI warning
  virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
  ima: Fallback to the builtin hash algorithm
  ima: Fix Kconfig to select TPM 2.0 CRB interface
  ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk)
  net/mlx5: Protect from command bit overflow
  selftests: Print the test we're running to /dev/kmsg
  tools/thermal: tmon: fix for segfault
  powerpc/perf: Fix kernel address leak via sampling registers
  powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer
  rtc: hctosys: Ensure system time doesn't overflow time_t
  hwmon: (nct6775) Fix writing pwmX_mode
  parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode
  m68k: set dma and coherent masks for platform FEC ethernets
  powerpc/mpic: Check if cpu_possible() in mpic_physmask()
  ACPI: acpi_pad: Fix memory leak in power saving threads
  xen/acpi: off by one in read_acpi_id()
  btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers
  Btrfs: fix copy_items() return value when logging an inode
  btrfs: tests/qgroup: Fix wrong tree backref level
  Bluetooth: btusb: Add USB ID 7392:a611 for Edimax EW-7611ULB
  net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
  rtc: snvs: Fix usage of snvs_rtc_enable
  sparc64: Make atomic_xchg() an inline function rather than a macro.
  fscache: Fix hanging wait on page discarded by writeback
  KVM: VMX: raise internal error for exception during invalid protected mode state
  sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
  ocfs2/dlm: don't handle migrate lockres if already in shutdown
  btrfs: Fix possible softlock on single core machines
  Btrfs: fix NULL pointer dereference in log_dir_items
  Btrfs: bail out on error during replay_dir_deletes
  mm: fix races between address_space dereference and free in page_evicatable
  mm/ksm: fix interaction with THP
  dp83640: Ensure against premature access to PHY registers after reset
  scsi: aacraid: Insure command thread is not recursively stopped
  cpufreq: CPPC: Initialize shared perf capabilities of CPUs
  Force log to disk before reading the AGF during a fstrim
  sr: get/drop reference to device in revalidate and check_events
  swap: divide-by-zero when zero length swap file on ssd
  fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table
  x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
  sh: fix debug trap failure to process signals before return to user
  net: mvneta: fix enable of all initialized RXQs
  net: Fix untag for vlan packets without ethernet header
  mm/kmemleak.c: wait for scan completion before disabling free
  llc: properly handle dev_queue_xmit() return value
  net-usb: add qmi_wwan if on lte modem wistron neweb d18q1
  net/usb/qmi_wwan.c: Add USB id for lt4120 modem
  net: qmi_wwan: add BroadMobi BM806U 2020:2033
  ARM: 8748/1: mm: Define vdso_start, vdso_end as array
  batman-adv: fix packet loss for broadcasted DHCP packets to a server
  batman-adv: fix multicast-via-unicast transmission with AP isolation
  selftests: ftrace: Add a testcase for probepoint
  selftests: ftrace: Add a testcase for string type with kprobe_event
  selftests: ftrace: Add probe event argument syntax testcase
  mm/mempolicy.c: avoid use uninitialized preferred_node
  RDMA/ucma: Correct option size check using optlen
  perf/cgroup: Fix child event counting bug
  vti4: Don't override MTU passed on link creation via IFLA_MTU
  vti4: Don't count header length twice on tunnel setup
  batman-adv: fix header size check in batadv_dbg_arp()
  net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off
  sunvnet: does not support GSO for sctp
  ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu
  workqueue: use put_device() instead of kfree()
  bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa().
  netfilter: ebtables: fix erroneous reject of last rule
  USB: OHCI: Fix NULL dereference in HCDs using HCD_LOCAL_MEM
  xen: xenbus: use put_device() instead of kfree()
  fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
  scsi: sd: Keep disk read-only when re-reading partition
  scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM
  usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers
  e1000e: allocate ring descriptors with dma_zalloc_coherent
  e1000e: Fix check_for_link return value with autoneg off
  watchdog: f71808e_wdt: Fix magic close handling
  KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
  selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
  Btrfs: send, fix issuing write op when processing hole in no data mode
  xen/pirq: fix error path cleanup when binding MSIs
  net/tcp/illinois: replace broken algorithm reference link
  gianfar: Fix Rx byte accounting for ndev stats
  sit: fix IFLA_MTU ignored on NEWLINK
  bcache: fix kcrashes with fio in RAID5 backend dev
  dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
  virtio-gpu: fix ioctl and expose the fixed status to userspace.
  r8152: fix tx packets accounting
  clocksource/drivers/fsl_ftm_timer: Fix error return checking
  nvme-pci: Fix nvme queue cleanup if IRQ setup fails
  netfilter: ebtables: convert BUG_ONs to WARN_ONs
  batman-adv: invalidate checksum on fragment reassembly
  batman-adv: fix packet checksum in receive path
  md/raid1: fix NULL pointer dereference
  media: dmxdev: fix error code for invalid ioctls
  x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
  locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
  regulatory: add NUL to request alpha2
  smsc75xx: fix smsc75xx_set_features()
  ARM: OMAP: Fix dmtimer init for omap1
  s390/cio: clear timer when terminating driver I/O
  s390/cio: fix return code after missing interrupt
  powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
  kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
  md: raid5: avoid string overflow warning
  locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
  usb: musb: fix enumeration after resume
  drm/exynos: fix comparison to bitshift when dealing with a mask
  md raid10: fix NULL deference in handle_write_completed()
  mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4
  NFC: llcp: Limit size of SDP URI
  ARM: OMAP1: clock: Fix debugfs_create_*() usage
  ARM: OMAP3: Fix prm wake interrupt for resume
  ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
  scsi: qla4xxx: skip error recovery in case of register disconnect.
  scsi: aacraid: fix shutdown crash when init fails
  scsi: storvsc: Increase cmd_per_lun for higher speed devices
  selftests: memfd: add config fragment for fuse
  usb: dwc2: Fix dwc2_hsotg_core_init_disconnected()
  usb: gadget: fsl_udc_core: fix ep valid checks
  usb: gadget: f_uac2: fix bFirstInterface in composite gadget
  ARC: Fix malformed ARC_EMUL_UNALIGNED default
  scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
  scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
  scsi: sym53c8xx_2: iterator underflow in sym_getsync()
  scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
  scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
  irqchip/gic-v3: Change pr_debug message to pr_devel
  locking/qspinlock: Ensure node->count is updated before initialising node
  tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
  bcache: return attach error when no cache set exist
  bcache: fix for data collapse after re-attaching an attached device
  bcache: fix for allocator and register thread race
  bcache: properly set task state in bch_writeback_thread()
  cifs: silence compiler warnings showing up with gcc-8.0.0
  proc: fix /proc/*/map_files lookup
  arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics
  RDS: IB: Fix null pointer issue
  xen/grant-table: Use put_page instead of free_page
  xen-netfront: Fix race between device setup and open
  MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS
  bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y
  ACPI: processor_perflib: Do not send _PPC change notification if not ready
  firmware: dmi_scan: Fix handling of empty DMI strings
  x86/power: Fix swsusp_arch_resume prototype
  IB/ipoib: Fix for potential no-carrier state
  mm: pin address_space before dereferencing it while isolating an LRU page
  asm-generic: provide generic_pmdp_establish()
  mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  mm/mempolicy: fix the check of nodemask from user
  ocfs2: return error when we attempt to access a dirty bh in jbd2
  ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute
  ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid
  ntb_transport: Fix bug with max_mw_size parameter
  RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure
  powerpc/numa: Ensure nodes initialized for hotplug
  powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
  jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
  HID: roccat: prevent an out of bounds read in kovaplus_profile_activated()
  scsi: fas216: fix sense buffer initialization
  Btrfs: fix scrub to repair raid6 corruption
  btrfs: Fix out of bounds access in btrfs_search_slot
  Btrfs: set plug for fsync
  ipmi/powernv: Fix error return code in ipmi_powernv_probe()
  mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()
  kconfig: Fix expr_free() E_NOT leak
  kconfig: Fix automatic menu creation mem leak
  kconfig: Don't leak main menus during parsing
  watchdog: sp5100_tco: Fix watchdog disable bit
  nfs: Do not convert nfs_idmap_cache_timeout to jiffies
  dm thin: fix documentation relative to low water mark threshold
  tools lib traceevent: Fix get_field_str() for dynamic strings
  perf callchain: Fix attr.sample_max_stack setting
  tools lib traceevent: Simplify pointer print logic and fix %pF
  PCI: Add function 1 DMA alias quirk for Marvell 9128
  tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account
  kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
  ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read()
  ALSA: hda - Use IS_REACHABLE() for dependency on input
  NFSv4: always set NFS_LOCK_LOST when a lock is lost.
  firewire-ohci: work around oversized DMA reads on JMicron controllers
  do d_instantiate/unlock_new_inode combinations safely
  xfs: remove racy hasattr check from attr ops
  kernel/signal.c: avoid undefined behaviour in kill_something_info
  kernel/sys.c: fix potential Spectre v1 issue
  kasan: fix memory hotplug during boot
  ipc/shm: fix shmat() nil address after round-down when remapping
  Revert "ipc/shm: Fix shmat mmap nil-page protection"
  xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
  libata: blacklist Micron 500IT SSD with MU01 firmware
  libata: Blacklist some Sandisk SSDs for NCQ
  mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
  ALSA: timer: Fix pause event notification
  aio: fix io_destroy(2) vs. lookup_ioctx() race
  affs_lookup(): close a race with affs_remove_link()
  KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
  MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
  MIPS: ptrace: Expose FIR register through FP regset
  UPSTREAM: sched/fair: Consider RT/IRQ pressure in capacity_spare_wake

Conflicts:
	drivers/media/dvb-core/dmxdev.c
	drivers/scsi/sd.c
	drivers/scsi/ufs/ufshcd.c
	drivers/usb/gadget/function/f_fs.c
	fs/ecryptfs/inode.c

Change-Id: I15751ed8c82ec65ba7eedcb0d385b9f803c333f7
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2018-06-27 14:42:55 +05:30
Greg Kroah-Hartman
6e37ae0e7a This is the 4.4.134 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlsOO14ACgkQONu9yGCS
 aT4ulhAAhMVYSRa/cOFm0BHxSL/59WmJTa3Na8TJqkTrJy+LRluBiKCywyiMZknp
 4rIffv4jcxcFNCpqYTjNTSStGLWCCkBLNSzxuzFv5M89Jdx4Gz1Ww1hzMESP3gxK
 puHUewSJQm7qtVOiC2l4YcW3Q6nFK0kqbCWpSkHoGVfZoX9JS2P1V8n+KFZpUH1a
 UyhVW48ainUpXfhSKJZ5xABiWYM2hcSq52RW1edNZvwuKwulZ+2EME26HgGCK7ff
 WHzGHECE6Lem+iunR26J/QtbTo8LKEyU0F039X21E7FIxf33S0xyPx+MGjJfWBOo
 Q6A23mAEWwEhlMomNKzdd/iUzSVlWSzKe8LJa7GI5G6BxftN8Z0TGTnKzIDkw++M
 T6RfK03CP6c9rQ756d0fTPxdZh6ae9EN8WSot/Sbbc9SvGSfy6o4I8Y/uJygShmF
 j13JfMweC+t7/6fyUqc5dcgY0Xy7LUFiWqfPxQj6axDiT82Mx2AvQaczrPUAKr1K
 KQsetmyhHC+Cpy7ILrhUGYjEWlvQm11ZiFoX8BkocFLFWk736QA63iB7mOUpCOQR
 SKLK00dF163GJdQC6nb4wCtyBxnCg4pSoP/72Z1foPtaSd3ccJ4CLsIE6GY5sP/I
 sDlPnIlnzEDfDPIxtVfKC8e1JINP6awXwtoJJo6MnuCuP3LDb58=
 =ogZQ
 -----END PGP SIGNATURE-----

Merge 4.4.134 into android-4.4

Changes in 4.4.134
	MIPS: ptrace: Expose FIR register through FP regset
	MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
	KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
	affs_lookup(): close a race with affs_remove_link()
	aio: fix io_destroy(2) vs. lookup_ioctx() race
	ALSA: timer: Fix pause event notification
	mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
	libata: Blacklist some Sandisk SSDs for NCQ
	libata: blacklist Micron 500IT SSD with MU01 firmware
	xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
	Revert "ipc/shm: Fix shmat mmap nil-page protection"
	ipc/shm: fix shmat() nil address after round-down when remapping
	kasan: fix memory hotplug during boot
	kernel/sys.c: fix potential Spectre v1 issue
	kernel/signal.c: avoid undefined behaviour in kill_something_info
	xfs: remove racy hasattr check from attr ops
	do d_instantiate/unlock_new_inode combinations safely
	firewire-ohci: work around oversized DMA reads on JMicron controllers
	NFSv4: always set NFS_LOCK_LOST when a lock is lost.
	ALSA: hda - Use IS_REACHABLE() for dependency on input
	ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read()
	kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
	tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account
	PCI: Add function 1 DMA alias quirk for Marvell 9128
	tools lib traceevent: Simplify pointer print logic and fix %pF
	perf callchain: Fix attr.sample_max_stack setting
	tools lib traceevent: Fix get_field_str() for dynamic strings
	dm thin: fix documentation relative to low water mark threshold
	nfs: Do not convert nfs_idmap_cache_timeout to jiffies
	watchdog: sp5100_tco: Fix watchdog disable bit
	kconfig: Don't leak main menus during parsing
	kconfig: Fix automatic menu creation mem leak
	kconfig: Fix expr_free() E_NOT leak
	mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()
	ipmi/powernv: Fix error return code in ipmi_powernv_probe()
	Btrfs: set plug for fsync
	btrfs: Fix out of bounds access in btrfs_search_slot
	Btrfs: fix scrub to repair raid6 corruption
	scsi: fas216: fix sense buffer initialization
	HID: roccat: prevent an out of bounds read in kovaplus_profile_activated()
	jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
	powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
	powerpc/numa: Ensure nodes initialized for hotplug
	RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure
	ntb_transport: Fix bug with max_mw_size parameter
	ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid
	ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute
	ocfs2: return error when we attempt to access a dirty bh in jbd2
	mm/mempolicy: fix the check of nodemask from user
	mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
	asm-generic: provide generic_pmdp_establish()
	mm: pin address_space before dereferencing it while isolating an LRU page
	IB/ipoib: Fix for potential no-carrier state
	x86/power: Fix swsusp_arch_resume prototype
	firmware: dmi_scan: Fix handling of empty DMI strings
	ACPI: processor_perflib: Do not send _PPC change notification if not ready
	bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y
	MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS
	xen-netfront: Fix race between device setup and open
	xen/grant-table: Use put_page instead of free_page
	RDS: IB: Fix null pointer issue
	arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics
	proc: fix /proc/*/map_files lookup
	cifs: silence compiler warnings showing up with gcc-8.0.0
	bcache: properly set task state in bch_writeback_thread()
	bcache: fix for allocator and register thread race
	bcache: fix for data collapse after re-attaching an attached device
	bcache: return attach error when no cache set exist
	tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
	locking/qspinlock: Ensure node->count is updated before initialising node
	irqchip/gic-v3: Change pr_debug message to pr_devel
	scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
	scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
	scsi: sym53c8xx_2: iterator underflow in sym_getsync()
	scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
	scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
	ARC: Fix malformed ARC_EMUL_UNALIGNED default
	usb: gadget: f_uac2: fix bFirstInterface in composite gadget
	usb: gadget: fsl_udc_core: fix ep valid checks
	usb: dwc2: Fix dwc2_hsotg_core_init_disconnected()
	selftests: memfd: add config fragment for fuse
	scsi: storvsc: Increase cmd_per_lun for higher speed devices
	scsi: aacraid: fix shutdown crash when init fails
	scsi: qla4xxx: skip error recovery in case of register disconnect.
	ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
	ARM: OMAP3: Fix prm wake interrupt for resume
	ARM: OMAP1: clock: Fix debugfs_create_*() usage
	NFC: llcp: Limit size of SDP URI
	mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4
	md raid10: fix NULL deference in handle_write_completed()
	drm/exynos: fix comparison to bitshift when dealing with a mask
	usb: musb: fix enumeration after resume
	locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
	md: raid5: avoid string overflow warning
	kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
	powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
	s390/cio: fix return code after missing interrupt
	s390/cio: clear timer when terminating driver I/O
	ARM: OMAP: Fix dmtimer init for omap1
	smsc75xx: fix smsc75xx_set_features()
	regulatory: add NUL to request alpha2
	locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
	x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
	media: dmxdev: fix error code for invalid ioctls
	md/raid1: fix NULL pointer dereference
	batman-adv: fix packet checksum in receive path
	batman-adv: invalidate checksum on fragment reassembly
	netfilter: ebtables: convert BUG_ONs to WARN_ONs
	nvme-pci: Fix nvme queue cleanup if IRQ setup fails
	clocksource/drivers/fsl_ftm_timer: Fix error return checking
	r8152: fix tx packets accounting
	virtio-gpu: fix ioctl and expose the fixed status to userspace.
	dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
	bcache: fix kcrashes with fio in RAID5 backend dev
	sit: fix IFLA_MTU ignored on NEWLINK
	gianfar: Fix Rx byte accounting for ndev stats
	net/tcp/illinois: replace broken algorithm reference link
	xen/pirq: fix error path cleanup when binding MSIs
	Btrfs: send, fix issuing write op when processing hole in no data mode
	selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
	KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
	watchdog: f71808e_wdt: Fix magic close handling
	e1000e: Fix check_for_link return value with autoneg off
	e1000e: allocate ring descriptors with dma_zalloc_coherent
	usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers
	scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM
	scsi: sd: Keep disk read-only when re-reading partition
	fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
	xen: xenbus: use put_device() instead of kfree()
	USB: OHCI: Fix NULL dereference in HCDs using HCD_LOCAL_MEM
	netfilter: ebtables: fix erroneous reject of last rule
	bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa().
	workqueue: use put_device() instead of kfree()
	ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu
	sunvnet: does not support GSO for sctp
	net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off
	batman-adv: fix header size check in batadv_dbg_arp()
	vti4: Don't count header length twice on tunnel setup
	vti4: Don't override MTU passed on link creation via IFLA_MTU
	perf/cgroup: Fix child event counting bug
	RDMA/ucma: Correct option size check using optlen
	mm/mempolicy.c: avoid use uninitialized preferred_node
	selftests: ftrace: Add probe event argument syntax testcase
	selftests: ftrace: Add a testcase for string type with kprobe_event
	selftests: ftrace: Add a testcase for probepoint
	batman-adv: fix multicast-via-unicast transmission with AP isolation
	batman-adv: fix packet loss for broadcasted DHCP packets to a server
	ARM: 8748/1: mm: Define vdso_start, vdso_end as array
	net: qmi_wwan: add BroadMobi BM806U 2020:2033
	net/usb/qmi_wwan.c: Add USB id for lt4120 modem
	net-usb: add qmi_wwan if on lte modem wistron neweb d18q1
	llc: properly handle dev_queue_xmit() return value
	mm/kmemleak.c: wait for scan completion before disabling free
	net: Fix untag for vlan packets without ethernet header
	net: mvneta: fix enable of all initialized RXQs
	sh: fix debug trap failure to process signals before return to user
	x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
	fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table
	swap: divide-by-zero when zero length swap file on ssd
	sr: get/drop reference to device in revalidate and check_events
	Force log to disk before reading the AGF during a fstrim
	cpufreq: CPPC: Initialize shared perf capabilities of CPUs
	scsi: aacraid: Insure command thread is not recursively stopped
	dp83640: Ensure against premature access to PHY registers after reset
	mm/ksm: fix interaction with THP
	mm: fix races between address_space dereference and free in page_evicatable
	Btrfs: bail out on error during replay_dir_deletes
	Btrfs: fix NULL pointer dereference in log_dir_items
	btrfs: Fix possible softlock on single core machines
	ocfs2/dlm: don't handle migrate lockres if already in shutdown
	sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
	KVM: VMX: raise internal error for exception during invalid protected mode state
	fscache: Fix hanging wait on page discarded by writeback
	sparc64: Make atomic_xchg() an inline function rather than a macro.
	rtc: snvs: Fix usage of snvs_rtc_enable
	net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
	Bluetooth: btusb: Add USB ID 7392:a611 for Edimax EW-7611ULB
	btrfs: tests/qgroup: Fix wrong tree backref level
	Btrfs: fix copy_items() return value when logging an inode
	btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers
	xen/acpi: off by one in read_acpi_id()
	ACPI: acpi_pad: Fix memory leak in power saving threads
	powerpc/mpic: Check if cpu_possible() in mpic_physmask()
	m68k: set dma and coherent masks for platform FEC ethernets
	parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode
	hwmon: (nct6775) Fix writing pwmX_mode
	rtc: hctosys: Ensure system time doesn't overflow time_t
	powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer
	powerpc/perf: Fix kernel address leak via sampling registers
	tools/thermal: tmon: fix for segfault
	selftests: Print the test we're running to /dev/kmsg
	net/mlx5: Protect from command bit overflow
	ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk)
	ima: Fix Kconfig to select TPM 2.0 CRB interface
	ima: Fallback to the builtin hash algorithm
	virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
	arm: dts: socfpga: fix GIC PPI warning
	usb: dwc3: Update DWC_usb31 GTXFIFOSIZ reg fields
	cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path
	clk: Don't show the incorrect clock phase
	zorro: Set up z->dev.dma_mask for the DMA API
	bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
	ACPICA: Events: add a return on failure from acpi_hw_register_read
	ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
	i2c: mv64xxx: Apply errata delay only in standard mode
	KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use
	xhci: zero usb device slot_id member when disabling and freeing a xhci slot
	MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset
	PCI: Restore config space on runtime resume despite being unbound
	ipmi_ssif: Fix kernel panic at msg_done_handler
	usb: dwc2: Fix interval type issue
	usb: gadget: ffs: Let setup() return USB_GADGET_DELAYED_STATUS
	usb: gadget: ffs: Execute copy_to_user() with USER_DS set
	powerpc: Add missing prototype for arch_irq_work_raise()
	ASoC: topology: create TLV data for dapm widgets
	perf/core: Fix perf_output_read_group()
	hwmon: (pmbus/max8688) Accept negative page register values
	hwmon: (pmbus/adm1275) Accept negative page register values
	cdrom: do not call check_disk_change() inside cdrom_open()
	gfs2: Fix fallocate chunk size
	usb: gadget: udc: change comparison to bitshift when dealing with a mask
	usb: gadget: composite: fix incorrect handling of OS desc requests
	x86/devicetree: Initialize device tree before using it
	x86/devicetree: Fix device IRQ settings in DT
	ALSA: vmaster: Propagate slave error
	media: cx23885: Override 888 ImpactVCBe crystal frequency
	media: cx23885: Set subdev host data to clk_freq pointer
	media: s3c-camif: fix out-of-bounds array access
	dmaengine: pl330: fix a race condition in case of threaded irqs
	media: em28xx: USB bulk packet size fix
	clk: rockchip: Prevent calculating mmc phase if clock rate is zero
	enic: enable rq before updating rq descriptors
	hwrng: stm32 - add reset during probe
	staging: rtl8192u: return -ENOMEM on failed allocation of priv->oldaddr
	rtc: tx4939: avoid unintended sign extension on a 24 bit shift
	serial: xuartps: Fix out-of-bounds access through DT alias
	serial: samsung: Fix out-of-bounds access through serial port index
	serial: mxs-auart: Fix out-of-bounds access through serial port index
	serial: imx: Fix out-of-bounds access through serial port index
	serial: fsl_lpuart: Fix out-of-bounds access through DT alias
	serial: arc_uart: Fix out-of-bounds access through DT alias
	PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
	udf: Provide saner default for invalid uid / gid
	media: cx25821: prevent out-of-bounds read on array card
	clk: samsung: s3c2410: Fix PLL rates
	clk: samsung: exynos5260: Fix PLL rates
	clk: samsung: exynos5433: Fix PLL rates
	clk: samsung: exynos5250: Fix PLL rates
	clk: samsung: exynos3250: Fix PLL rates
	crypto: sunxi-ss - Add MODULE_ALIAS to sun4i-ss
	audit: return on memory error to avoid null pointer dereference
	MIPS: Octeon: Fix logging messages with spurious periods after newlines
	drm/rockchip: Respect page offset for PRIME mmap calls
	x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified
	perf tests: Use arch__compare_symbol_names to compare symbols
	perf report: Fix memory corruption in --branch-history mode --branch-history
	selftests/net: fixes psock_fanout eBPF test case
	netlabel: If PF_INET6, check sk_buff ip header version
	scsi: lpfc: Fix issue_lip if link is disabled
	scsi: lpfc: Fix soft lockup in lpfc worker thread during LIP testing
	scsi: lpfc: Fix frequency of Release WQE CQEs
	regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()'
	ASoC: samsung: i2s: Ensure the RCLK rate is properly determined
	Bluetooth: btusb: Add device ID for RTL8822BE
	kdb: make "mdr" command repeat
	s390/ftrace: use expoline for indirect branches
	Linux 4.4.134

Change-Id: Iababaf9b89bc8d0437b95e1368d8b0a9126a178c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-30 13:25:24 +02:00
Tom Abraham
69ccb50c17 swap: divide-by-zero when zero length swap file on ssd
[ Upstream commit a06ad633a37c64a0cd4c229fc605cee8725d376e ]

Calling swapon() on a zero length swap file on SSD can lead to a
divide-by-zero.

Although creating such files isn't possible with mkswap and they woud be
considered invalid, it would be better for the swapon code to be more
robust and handle this condition gracefully (return -EINVAL).
Especially since the fix is small and straightforward.

To help with wear leveling on SSD, the swapon syscall calculates a
random position in the swap file using modulo p->highest_bit, which is
set to maxpages - 1 in read_swap_header.

If the swap file is zero length, read_swap_header sets maxpages=1 and
last_page=0, resulting in p->highest_bit=0 and we divide-by-zero when we
modulo p->highest_bit in swapon syscall.

This can be prevented by having read_swap_header return zero if
last_page is zero.

Link: http://lkml.kernel.org/r/5AC747C1020000A7001FA82C@prv-mh.provo.novell.com
Signed-off-by: Thomas Abraham <tabraham@suse.com>
Reported-by: <Mark.Landis@Teradata.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:07 +02:00
Srinivasarao P
dd4f1e35fa Merge android-4.4.106 (2fea039) into msm-4.4
* refs/heads/tmp-2fea039
  Linux 4.4.106
  usb: gadget: ffs: Forbid usb_ep_alloc_request from sleeping
  arm: KVM: Fix VTTBR_BADDR_MASK BUG_ON off-by-one
  Revert "x86/mm/pat: Ensure cpa->pfn only contains page frame numbers"
  Revert "x86/efi: Hoist page table switching code into efi_call_virt()"
  Revert "x86/efi: Build our own page table structures"
  net/packet: fix a race in packet_bind() and packet_notifier()
  packet: fix crash in fanout_demux_rollover()
  sit: update frag_off info
  rds: Fix NULL pointer dereference in __rds_rdma_map
  tipc: fix memory leak in tipc_accept_from_sock()
  more bio_map_user_iov() leak fixes
  s390: always save and restore all registers on context switch
  ipmi: Stop timers before cleaning up the module
  audit: ensure that 'audit=1' actually enables audit for PID 1
  ipvlan: fix ipv6 outbound device
  afs: Connect up the CB.ProbeUuid
  IB/mlx5: Assign send CQ and recv CQ of UMR QP
  IB/mlx4: Increase maximal message size under UD QP
  xfrm: Copy policy family in clone_policy
  jump_label: Invoke jump_label_test() via early_initcall()
  atm: horizon: Fix irq release error
  sctp: use the right sk after waking up from wait_buf sleep
  sctp: do not free asoc when it is already dead in sctp_sendmsg
  sparc64/mm: set fields in deferred pages
  block: wake up all tasks blocked in get_request()
  sunrpc: Fix rpc_task_begin trace point
  NFS: Fix a typo in nfs_rename()
  dynamic-debug-howto: fix optional/omitted ending line number to be LARGE instead of 0
  lib/genalloc.c: make the avail variable an atomic_long_t
  route: update fnhe_expires for redirect when the fnhe exists
  route: also update fnhe_genid when updating a route cache
  mac80211_hwsim: Fix memory leak in hwsim_new_radio_nl()
  kbuild: pkg: use --transform option to prefix paths in tar
  EDAC, i5000, i5400: Fix definition of NRECMEMB register
  EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro
  powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
  drm/amd/amdgpu: fix console deadlock if late init failed
  axonram: Fix gendisk handling
  netfilter: don't track fragmented packets
  zram: set physical queue limits to avoid array out of bounds accesses
  i2c: riic: fix restart condition
  crypto: s5p-sss - Fix completing crypto request in IRQ handler
  ipv6: reorder icmpv6_init() and ip6_mr_init()
  bnx2x: do not rollback VF MAC/VLAN filters we did not configure
  bnx2x: fix possible overrun of VFPF multicast addresses array
  bnx2x: prevent crash when accessing PTP with interface down
  spi_ks8995: fix "BUG: key accdaa28 not in .data!"
  arm64: KVM: Survive unknown traps from guests
  arm: KVM: Survive unknown traps from guests
  KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset
  irqchip/crossbar: Fix incorrect type of register size
  scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters
  workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq
  libata: drop WARN from protocol error in ata_sff_qc_issue()
  kvm: nVMX: VMCLEAR should not cause the vCPU to shut down
  USB: gadgetfs: Fix a potential memory leak in 'dev_config()'
  usb: gadget: configs: plug memory leak
  HID: chicony: Add support for another ASUS Zen AiO keyboard
  gpio: altera: Use handle_level_irq when configured as a level_high
  ARM: OMAP2+: Release device node after it is no longer needed.
  ARM: OMAP2+: Fix device node reference counts
  module: set __jump_table alignment to 8
  selftest/powerpc: Fix false failures for skipped tests
  x86/hpet: Prevent might sleep splat on resume
  ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure
  vti6: Don't report path MTU below IPV6_MIN_MTU.
  Revert "s390/kbuild: enable modversions for symbols exported from asm"
  Revert "spi: SPI_FSL_DSPI should depend on HAS_DMA"
  Revert "drm/armada: Fix compile fail"
  mm: drop unused pmdp_huge_get_and_clear_notify()
  thp: fix MADV_DONTNEED vs. numa balancing race
  thp: reduce indentation level in change_huge_pmd()
  scsi: storvsc: Workaround for virtual DVD SCSI version
  ARM: avoid faulting on qemu
  ARM: BUG if jumping to usermode address in kernel mode
  arm64: fpsimd: Prevent registers leaking from dead tasks
  KVM: VMX: remove I/O port 0x80 bypass on Intel hosts
  arm64: KVM: fix VTTBR_BADDR_MASK BUG_ON off-by-one
  media: dvb: i2c transfers over usb cannot be done from stack
  drm/exynos: gem: Drop NONCONTIG flag for buffers allocated without IOMMU
  drm: extra printk() wrapper macros
  kdb: Fix handling of kallsyms_symbol_next() return value
  s390: fix compat system call table
  iommu/vt-d: Fix scatterlist offset handling
  ALSA: usb-audio: Add check return value for usb_string()
  ALSA: usb-audio: Fix out-of-bound error
  ALSA: seq: Remove spurious WARN_ON() at timer check
  ALSA: pcm: prevent UAF in snd_pcm_info
  x86/PCI: Make broadcom_postcore_init() check acpi_disabled
  X.509: reject invalid BIT STRING for subjectPublicKey
  ASN.1: check for error from ASN1_OP_END__ACT actions
  ASN.1: fix out-of-bounds read when parsing indefinite length item
  efi: Move some sysfs files to be read-only by root
  scsi: libsas: align sata_device's rps_resp on a cacheline
  isa: Prevent NULL dereference in isa_bus driver callbacks
  hv: kvp: Avoid reading past allocated blocks from KVP file
  virtio: release virtio index when fail to device_register
  can: usb_8dev: cancel urb on -EPIPE and -EPROTO
  can: esd_usb2: cancel urb on -EPIPE and -EPROTO
  can: ems_usb: cancel urb on -EPIPE and -EPROTO
  can: kvaser_usb: cancel urb on -EPIPE and -EPROTO
  can: kvaser_usb: ratelimit errors if incomplete messages are received
  can: kvaser_usb: Fix comparison bug in kvaser_usb_read_bulk_callback()
  can: kvaser_usb: free buf in error paths
  can: ti_hecc: Fix napi poll return value for repoll
  BACKPORT: irq: Make the irqentry text section unconditional
  UPSTREAM: arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
  UPSTREAM: x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.text
  UPSTREAM: kasan: make get_wild_bug_type() static
  UPSTREAM: kasan: separate report parts by empty lines
  UPSTREAM: kasan: improve double-free report format
  UPSTREAM: kasan: print page description after stacks
  UPSTREAM: kasan: improve slab object description
  UPSTREAM: kasan: change report header
  UPSTREAM: kasan: simplify address description logic
  UPSTREAM: kasan: change allocation and freeing stack traces headers
  UPSTREAM: kasan: unify report headers
  UPSTREAM: kasan: introduce helper functions for determining bug type
  BACKPORT: kasan: report only the first error by default
  UPSTREAM: kasan: fix races in quarantine_remove_cache()
  UPSTREAM: kasan: resched in quarantine_remove_cache()
  BACKPORT: kasan, sched/headers: Uninline kasan_enable/disable_current()
  BACKPORT: kasan: drain quarantine of memcg slab objects
  UPSTREAM: kasan: eliminate long stalls during quarantine reduction
  UPSTREAM: kasan: support panic_on_warn
  UPSTREAM: x86/suspend: fix false positive KASAN warning on suspend/resume
  UPSTREAM: kasan: support use-after-scope detection
  UPSTREAM: kasan/tests: add tests for user memory access functions
  UPSTREAM: mm, kasan: add a ksize() test
  UPSTREAM: kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
  UPSTREAM: kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
  UPSTREAM: lib/stackdepot: export save/fetch stack for drivers
  UPSTREAM: lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB
  BACKPORT: kprobes: Unpoison stack in jprobe_return() for KASAN
  UPSTREAM: kasan: remove the unnecessary WARN_ONCE from quarantine.c
  UPSTREAM: kasan: avoid overflowing quarantine size on low memory systems
  UPSTREAM: kasan: improve double-free reports
  BACKPORT: mm: coalesce split strings
  BACKPORT: mm/kasan: get rid of ->state in struct kasan_alloc_meta
  UPSTREAM: mm/kasan: get rid of ->alloc_size in struct kasan_alloc_meta
  UPSTREAM: mm: kasan: remove unused 'reserved' field from struct kasan_alloc_meta
  UPSTREAM: mm/kasan, slub: don't disable interrupts when object leaves quarantine
  UPSTREAM: mm/kasan: don't reduce quarantine in atomic contexts
  UPSTREAM: mm/kasan: fix corruptions and false positive reports
  UPSTREAM: lib/stackdepot.c: use __GFP_NOWARN for stack allocations
  BACKPORT: mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
  UPSTREAM: kasan/quarantine: fix bugs on qlist_move_cache()
  UPSTREAM: mm: mempool: kasan: don't poot mempool objects in quarantine
  UPSTREAM: kasan: change memory hot-add error messages to info messages
  BACKPORT: mm/kasan: add API to check memory regions
  UPSTREAM: mm/kasan: print name of mem[set,cpy,move]() caller in report
  UPSTREAM: mm: kasan: initial memory quarantine implementation
  UPSTREAM: lib/stackdepot: avoid to return 0 handle
  UPSTREAM: lib/stackdepot.c: allow the stack trace hash to be zero
  UPSTREAM: mm, kasan: fix compilation for CONFIG_SLAB
  BACKPORT: mm, kasan: stackdepot implementation. Enable stackdepot for SLAB
  BACKPORT: mm, kasan: add GFP flags to KASAN API
  UPSTREAM: mm, kasan: SLAB support
  UPSTREAM: mm/slab: align cache size first before determination of OFF_SLAB candidate
  UPSTREAM: mm/slab: use more appropriate condition check for debug_pagealloc
  UPSTREAM: mm/slab: factor out debugging initialization in cache_init_objs()
  UPSTREAM: mm/slab: remove object status buffer for DEBUG_SLAB_LEAK
  UPSTREAM: mm/slab: alternative implementation for DEBUG_SLAB_LEAK
  UPSTREAM: mm/slab: clean up DEBUG_PAGEALLOC processing code
  UPSTREAM: mm/slab: activate debug_pagealloc in SLAB when it is actually enabled
  sched: EAS/WALT: Don't take into account of running task's util
  BACKPORT: schedutil: Reset cached freq if it is not in sync with next_freq
  UPSTREAM: kasan: add functions to clear stack poison

Conflicts:
	arch/arm/include/asm/kvm_arm.h
	arch/arm64/kernel/vmlinux.lds.S
	include/linux/kasan.h
	kernel/softirq.c
	lib/Kconfig
	lib/Kconfig.kasan
	lib/Makefile
	lib/stackdepot.c
	mm/kasan/kasan.c
	sound/usb/mixer.c

Change-Id: If70ced6da5f19be3dd92d10a8d8cd4d5841e5870
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2018-01-18 12:45:07 +05:30
Joe Perches
1a44264ae8 BACKPORT: mm: coalesce split strings
Kernel style prefers a single string over split strings when the string is
'user-visible'.

Miscellanea:

 - Add a missing newline
 - Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Tejun Heo <tj@kernel.org>	[percpu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Bug: 64145065
(cherry-picked from 756a025f00091918d9d09ca3229defb160b409c0)
Change-Id: I377fb1542980c15d2f306924656227ad17b02b5e
Signed-off-by: Paul Lawrence <paullawrence@google.com>
2017-12-14 08:21:34 -08:00
Runmin Wang
4b7c952db6 Merge tag 'lsk-v4.4-16.12-android' into branch 'msm-4.4'
* remotes/origin/tmp-2f0de51:
  Linux 4.4.38
  esp6: Fix integrity verification when ESN are used
  esp4: Fix integrity verification when ESN are used
  ipv4: Set skb->protocol properly for local output
  ipv6: Set skb->protocol properly for local output
  Don't feed anything but regular iovec's to blk_rq_map_user_iov
  constify iov_iter_count() and iter_is_iovec()
  sparc64: fix compile warning section mismatch in find_node()
  sparc64: Fix find_node warning if numa node cannot be found
  sparc32: Fix inverted invalid_frame_pointer checks on sigreturns
  net: ping: check minimum size on ICMP header length
  net: avoid signed overflows for SO_{SND|RCV}BUFFORCE
  geneve: avoid use-after-free of skb->data
  sh_eth: remove unchecked interrupts for RZ/A1
  net: bcmgenet: Utilize correct struct device for all DMA operations
  packet: fix race condition in packet_set_ring
  net/dccp: fix use-after-free in dccp_invalid_packet
  netlink: Do not schedule work from sk_destruct
  netlink: Call cb->done from a worker thread
  net/sched: pedit: make sure that offset is valid
  net, sched: respect rcu grace period on cls destruction
  net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change
  l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
  rtnetlink: fix FDB size computation
  af_unix: conditionally use freezable blocking calls in read
  net: sky2: Fix shutdown crash
  ip6_tunnel: disable caching when the traffic class is inherited
  net: check dead netns for peernet2id_alloc()
  virtio-net: add a missing synchronize_net()
  Linux 4.4.37
  arm64: suspend: Reconfigure PSTATE after resume from idle
  arm64: mm: Set PSTATE.PAN from the cpu_enable_pan() call
  arm64: cpufeature: Schedule enable() calls instead of calling them via IPI
  pwm: Fix device reference leak
  mwifiex: printk() overflow with 32-byte SSIDs
  PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)
  PCI: Export pcie_find_root_port
  rcu: Fix soft lockup for rcu_nocb_kthread
  ALSA: pcm : Call kill_fasync() in stream lock
  x86/traps: Ignore high word of regs->cs in early_fixup_exception()
  kasan: update kasan_global for gcc 7
  zram: fix unbalanced idr management at hot removal
  ARC: Don't use "+l" inline asm constraint
  Linux 4.4.36
  scsi: mpt3sas: Unblock device after controller reset
  flow_dissect: call init_default_flow_dissectors() earlier
  mei: fix return value on disconnection
  mei: me: fix place for kaby point device ids.
  mei: me: disable driver on SPT SPS firmware
  drm/radeon: Ensure vblank interrupt is enabled on DPMS transition to on
  mpi: Fix NULL ptr dereference in mpi_powm() [ver #3]
  parisc: Also flush data TLB in flush_icache_page_asm
  parisc: Fix race in pci-dma.c
  parisc: Fix races in parisc_setup_cache_timing()
  NFSv4.x: hide array-bounds warning
  apparmor: fix change_hat not finding hat after policy replacement
  cfg80211: limit scan results cache size
  tile: avoid using clocksource_cyc2ns with absolute cycle count
  scsi: mpt3sas: Fix secure erase premature termination
  Fix USB CB/CBI storage devices with CONFIG_VMAP_STACK=y
  USB: serial: ftdi_sio: add support for TI CC3200 LaunchPad
  USB: serial: cp210x: add ID for the Zone DPMX
  usb: chipidea: move the lock initialization to core file
  KVM: x86: check for pic and ioapic presence before use
  KVM: x86: drop error recovery in em_jmp_far and em_ret_far
  iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
  iommu/vt-d: Fix PASID table allocation
  sched: tune: Fix lacking spinlock initialization
  UPSTREAM: trace: Update documentation for mono, mono_raw and boot clock
  UPSTREAM: trace: Add an option for boot clock as trace clock
  UPSTREAM: timekeeping: Add a fast and NMI safe boot clock
  ANDROID: goldfish_pipe: fix allmodconfig build
  ANDROID: goldfish: goldfish_pipe: fix locking errors
  ANDROID: video: goldfishfb: fix platform_no_drv_owner.cocci warnings
  ANDROID: goldfish_pipe: fix call_kern.cocci warnings
  arm64: rename ranchu defconfig to ranchu64
  ANDROID: arch: x86: disable pic for Android toolchain
  ANDROID: goldfish_pipe: An implementation of more parallel pipe
  ANDROID: goldfish_pipe: bugfixes and performance improvements.
  ANDROID: goldfish: Add goldfish sync driver
  ANDROID: goldfish: add ranchu defconfigs
  ANDROID: goldfish_audio: Clear audio read buffer status after each read
  ANDROID: goldfish_events: no extra EV_SYN; register goldfish
  ANDROID: goldfish_fb: Set pixclock = 0
  ANDROID: goldfish: Enable ACPI-based enumeration for goldfish audio
  ANDROID: goldfish: Enable ACPI-based enumeration for goldfish framebuffer
  ANDROID: video: goldfishfb: add devicetree bindings
  BACKPORT: staging: goldfish: audio: fix compiliation on arm
  BACKPORT: Input: goldfish_events - enable ACPI-based enumeration for goldfish events
  BACKPORT: goldfish: Enable ACPI-based enumeration for goldfish battery
  BACKPORT: drivers: tty: goldfish: Add device tree bindings
  BACKPORT: tty: goldfish: support platform_device with id -1
  BACKPORT: Input: goldfish_events - add devicetree bindings
  BACKPORT: power: goldfish_battery: add devicetree bindings
  BACKPORT: staging: goldfish: audio: add devicetree bindings
  ANDROID: usb: gadget: function: cleanup: Add blank line after declaration
  cpufreq: sched: Fix kernel crash on accessing sysfs file
  usb: gadget: f_mtp: simplify ptp NULL pointer check
  cgroup: replace unified-hierarchy.txt with a proper cgroup v2 documentation
  cgroup: rename Documentation/cgroups/ to Documentation/cgroup-legacy/
  cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type
  writeback: initialize inode members that track writeback history
  mm: page_alloc: generalize the dirty balance reserve
  block: fix module reference leak on put_disk() call for cgroups throttle
  Linux 4.4.35
  netfilter: nft_dynset: fix element timeout for HZ != 1000
  IB/cm: Mark stale CM id's whenever the mad agent was unregistered
  IB/uverbs: Fix leak of XRC target QPs
  IB/core: Avoid unsigned int overflow in sg_alloc_table
  IB/mlx5: Fix fatal error dispatching
  IB/mlx5: Use cache line size to select CQE stride
  IB/mlx4: Fix create CQ error flow
  IB/mlx4: Check gid_index return value
  PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
  PM / sleep: fix device reference leak in test_suspend
  uwb: fix device reference leaks
  mfd: core: Fix device reference leak in mfd_clone_cell
  iwlwifi: pcie: fix SPLC structure parsing
  rtc: omap: Fix selecting external osc
  clk: mmp: mmp2: fix return value check in mmp2_clk_init()
  clk: mmp: pxa168: fix return value check in pxa168_clk_init()
  clk: mmp: pxa910: fix return value check in pxa910_clk_init()
  drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
  crypto: caam - do not register AES-XTS mode on LP units
  ext4: sanity check the block and cluster size at mount time
  kbuild: Steal gcc's pie from the very beginning
  x86/kexec: add -fno-PIE
  scripts/has-stack-protector: add -fno-PIE
  kbuild: add -fno-PIE
  i2c: mux: fix up dependencies
  can: bcm: fix warning in bcm_connect/proc_register
  mfd: intel-lpss: Do not put device in reset state on suspend
  fuse: fix fuse_write_end() if zero bytes were copied
  KVM: Disable irq while unregistering user notifier
  KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
  x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems
  Linux 4.4.34
  sparc64: Delete now unused user copy fixup functions.
  sparc64: Delete now unused user copy assembler helpers.
  sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NG2copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NGcopy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NG4copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert U1copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert GENcopy_{from,to}_user to accurate exception reporting.
  sparc64: Convert copy_in_user to accurate exception reporting.
  sparc64: Prepare to move to more saner user copy exception handling.
  sparc64: Delete __ret_efault.
  sparc64: Handle extremely large kernel TLB range flushes more gracefully.
  sparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.
  sparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.
  sparc64: Fix illegal relative branches in hypervisor patched TLB code.
  sparc64: Handle extremely large kernel TSB range flushes sanely.
  sparc: Handle negative offsets in arch_jump_label_transform
  sparc64 mm: Fix base TSB sizing when hugetlb pages are used
  sparc: serial: sunhv: fix a double lock bug
  sparc: Don't leak context bits into thread->fault_address
  tty: Prevent ldisc drivers from re-using stale tty fields
  tcp: take care of truncations done by sk_filter()
  ipv4: use new_gw for redirect neigh lookup
  net: __skb_flow_dissect() must cap its return value
  sock: fix sendmmsg for partial sendmsg
  fib_trie: Correct /proc/net/route off by one error
  sctp: assign assoc_id earlier in __sctp_connect
  ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped
  ipv6: dccp: fix out of bound access in dccp_v6_err()
  dccp: fix out of bound access in dccp_v4_err()
  dccp: do not send reset to already closed sockets
  tcp: fix potential memory corruption
  ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()
  bgmac: stop clearing DMA receive control register right after it is set
  net: mangle zero checksum in skb_checksum_help()
  net: clear sk_err_soft in sk_clone_lock()
  dctcp: avoid bogus doubling of cwnd after loss
  ARM: 8485/1: cpuidle: remove cpu parameter from the cpuidle_ops suspend hook
  Linux 4.4.33
  netfilter: fix namespace handling in nf_log_proc_dostring
  btrfs: qgroup: Prevent qgroup->reserved from going subzero
  mmc: mxs: Initialize the spinlock prior to using it
  ASoC: sun4i-codec: return error code instead of NULL when create_card fails
  ACPI / APEI: Fix incorrect return value of ghes_proc()
  i40e: fix call of ndo_dflt_bridge_getlink()
  hwrng: core - Don't use a stack buffer in add_early_randomness()
  lib/genalloc.c: start search from start of chunk
  mei: bus: fix received data size check in NFC fixup
  iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path
  iommu/amd: Free domain id when free a domain of struct dma_ops_domain
  tty/serial: at91: fix hardware handshake on Atmel platforms
  dmaengine: at_xdmac: fix spurious flag status for mem2mem transfers
  drm/i915: Respect alternate_ddc_pin for all DDI ports
  KVM: MIPS: Precalculate MMIO load resume PC
  scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk
  scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
  iio: orientation: hid-sensor-rotation: Add PM function (fix non working driver)
  iio: hid-sensors: Increase the precision of scale to fix wrong reading interpretation.
  clk: qoriq: Don't allow CPU clocks higher than starting value
  toshiba-wmi: Fix loading the driver on non Toshiba laptops
  drbd: Fix kernel_sendmsg() usage - potential NULL deref
  usb: gadget: u_ether: remove interrupt throttling
  USB: cdc-acm: fix TIOCMIWAIT
  staging: nvec: remove managed resource from PS2 driver
  Revert "staging: nvec: ps2: change serio type to passthrough"
  drivers: staging: nvec: remove bogus reset command for PS/2 interface
  staging: iio: ad5933: avoid uninitialized variable in error case
  pinctrl: cherryview: Prevent possible interrupt storm on resume
  pinctrl: cherryview: Serialize register access in suspend/resume
  ARC: timer: rtc: implement read loop in "C" vs. inline asm
  s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
  coredump: fix unfreezable coredumping task
  swapfile: fix memory corruption via malformed swapfile
  dib0700: fix nec repeat handling
  ASoC: cs4270: fix DAPM stream name mismatch
  ALSA: info: Limit the proc text input size
  ALSA: info: Return error for invalid read/write
  arm64: Enable KPROBES/HIBERNATION/CORESIGHT in defconfig
  arm64: kvm: allows kvm cpu hotplug
  arm64: KVM: Register CPU notifiers when the kernel runs at HYP
  arm64: KVM: Skip HYP setup when already running in HYP
  arm64: hyp/kvm: Make hyp-stub reject kvm_call_hyp()
  arm64: hyp/kvm: Make hyp-stub extensible
  arm64: kvm: Move lr save/restore from do_el2_call into EL1
  arm64: kvm: deal with kernel symbols outside of linear mapping
  arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  ANDROID: video: adf: Avoid directly referencing user pointers
  ANDROID: usb: gadget: audio_source: fix comparison of distinct pointer types
  android: binder: support for file-descriptor arrays.
  android: binder: support for scatter-gather.
  android: binder: add extra size to allocator.
  android: binder: refactor binder_transact()
  android: binder: support multiple /dev instances.
  android: binder: deal with contexts in debugfs.
  android: binder: support multiple context managers.
  android: binder: split flat_binder_object.
  disable aio support in recommended configuration
  Linux 4.4.32
  scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
  drm/radeon: fix DP mode validation
  drm/radeon/dp: add back special handling for NUTMEG
  drm/amdgpu: fix DP mode validation
  drm/amdgpu/dp: add back special handling for NUTMEG
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  Revert KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  of: silence warnings due to max() usage
  packet: on direct_xmit, limit tso and csum to supported devices
  sctp: validate chunk len before actually using it
  net sched filters: fix notification of filter delete with proper handle
  udp: fix IP_CHECKSUM handling
  net: sctp, forbid negative length
  ipv4: use the right lock for ping_group_range
  ipv4: disable BH in set_ping_group_range()
  net: add recursion limit to GRO
  rtnetlink: Add rtnexthop offload flag to compare mask
  bridge: multicast: restore perm router ports on multicast enable
  net: pktgen: remove rcu locking in pktgen_change_name()
  ipv6: correctly add local routes when lo goes up
  ip6_tunnel: fix ip6_tnl_lookup
  ipv6: tcp: restore IP6CB for pktoptions skbs
  netlink: do not enter direct reclaim from netlink_dump()
  packet: call fanout_release, while UNREGISTERING a netdev
  net: Add netdev all_adj_list refcnt propagation to fix panic
  net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions
  net: pktgen: fix pkt_size
  net: fec: set mac address unconditionally
  tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
  ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
  ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
  tcp: fix a compile error in DBGUNDO()
  tcp: fix wrong checksum calculation on MTU probing
  net: avoid sk_forward_alloc overflows
  tcp: fix overflow in __tcp_retransmit_skb()
  arm64/kvm: fix build issue on kvm debug
  arm64: ptdump: Indicate whether memory should be faulting
  arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
  arm64: Drop alloc function from create_mapping
  arm64: allow vmalloc regions to be set with set_memory_*
  arm64: kernel: implement ACPI parking protocol
  arm64: mm: create new fine-grained mappings at boot
  arm64: ensure _stext and _etext are page-aligned
  arm64: mm: allow passing a pgdir to alloc_init_*
  arm64: mm: allocate pagetables anywhere
  arm64: mm: use fixmap when creating page tables
  arm64: mm: add functions to walk tables in fixmap
  arm64: mm: add __{pud,pgd}_populate
  arm64: mm: avoid redundant __pa(__va(x))
  Linux 4.4.31
  HID: usbhid: add ATEN CS962 to list of quirky devices
  ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap()
  kvm: x86: Check memopp before dereference (CVE-2016-8630)
  tty: vt, fix bogus division in csi_J
  usb: dwc3: Fix size used in dma_free_coherent()
  pwm: Unexport children before chip removal
  UBI: fastmap: scrub PEB when bitflips are detected in a free PEB EC header
  Disable "frame-address" warning
  smc91x: avoid self-comparison warning
  cgroup: avoid false positive gcc-6 warning
  drm/exynos: fix error handling in exynos_drm_subdrv_open
  mm/cma: silence warnings due to max() usage
  ARM: 8584/1: floppy: avoid gcc-6 warning
  powerpc/ptrace: Fix out of bounds array access warning
  x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
  perf build: Fix traceevent plugins build race
  drm/dp/mst: Check peer device type before attempting EDID read
  drm/radeon: drop register readback in cayman_cp_int_cntl_setup
  drm/radeon/si_dpm: workaround for SI kickers
  drm/radeon/si_dpm: Limit clocks on HD86xx part
  Revert "drm/radeon: fix DP link training issue with second 4K monitor"
  mmc: dw_mmc-pltfm: fix the potential NULL pointer dereference
  scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware
  scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded
  scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
  mac80211: discard multicast and 4-addr A-MSDUs
  firewire: net: fix fragmented datagram_size off-by-one
  firewire: net: guard against rx buffer overflows
  Input: i8042 - add XMG C504 to keyboard reset table
  dm mirror: fix read error on recovery after default leg failure
  virtio: console: Unlock vqs while freeing buffers
  virtio_ring: Make interrupt suppression spec compliant
  parisc: Ensure consistent state when switching to kernel stack at syscall entry
  ovl: fsync after copy-up
  KVM: MIPS: Make ERET handle ERL before EXL
  KVM: x86: fix wbinvd_dirty_mask use-after-free
  dm: free io_barrier after blk_cleanup_queue call
  USB: serial: cp210x: fix tiocmget error handling
  tty: limit terminal size to 4M chars
  xhci: add restart quirk for Intel Wildcatpoint PCH
  hv: do not lose pending heartbeat vmbus packets
  vt: clear selection before resizing
  Fix potential infoleak in older kernels
  GenWQE: Fix bad page access during abort of resource allocation
  usb: increase ohci watchdog delay to 275 msec
  xhci: use default USB_RESUME_TIMEOUT when resuming ports.
  USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7
  USB: serial: fix potential NULL-dereference at probe
  usb: gadget: function: u_ether: don't starve tx request queue
  mei: txe: don't clean an unprocessed interrupt cause.
  ubifs: Fix regression in ubifs_readdir()
  ubifs: Abort readdir upon error
  btrfs: fix races on root_log_ctx lists
  ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
  ANDROID: binder: Add strong ref checks
  ALSA: hda - Fix headset mic detection problem for two Dell laptops
  ALSA: hda - Adding a new group of pin cfg into ALC295 pin quirk table
  ALSA: hda - allow 40 bit DMA mask for NVidia devices
  ALSA: hda - Raise AZX_DCAPS_RIRB_DELAY handling into top drivers
  ALSA: hda - Merge RIRB_PRE_DELAY into CTX_WORKAROUND caps
  ALSA: usb-audio: Add quirk for Syntek STK1160
  KEYS: Fix short sprintf buffer in /proc/keys show function
  mm: memcontrol: do not recurse in direct reclaim
  mm/list_lru.c: avoid error-path NULL pointer deref
  libxfs: clean up _calc_dquots_per_chunk
  h8300: fix syscall restarting
  drm/dp/mst: Clear port->pdt when tearing down the i2c adapter
  i2c: core: fix NULL pointer dereference under race condition
  i2c: xgene: Avoid dma_buffer overrun
  arm64:cpufeature ARM64_NCAPS is the indicator of last feature
  arm64: hibernate: Refuse to hibernate if the boot cpu is offline
  PM / sleep: Add support for read-only sysfs attributes
  arm64: kernel: Add support for hibernate/suspend-to-disk
  arm64: mm: add functions to walk page tables by PA
  arm64: mm: move pte_* macros
  PM / Hibernate: Call flush_icache_range() on pages restored in-place
  arm64: Add new asm macro copy_page
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  arm64: kernel: Include _AC definition in page.h
  arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  arm64: kernel: Rework finisher callback out of __cpu_suspend_enter()
  arm64: Cleanup SCTLR flags
  arm64: Fold proc-macros.S into assembler.h
  arm/arm64: KVM: Add hook for C-based stage2 init
  arm/arm64: KVM: Detect vGIC presence at runtime
  arm64: KVM: Add support for 16-bit VMID
  arm: KVM: Make kvm_arm.h friendly to assembly code
  arm/arm64: KVM: Remove unreferenced S2_PGD_ORDER
  arm64: KVM: debug: Remove spurious inline attributes
  ARM: KVM: Cleanup exception injection
  arm64: KVM: Remove weak attributes
  arm64: KVM: Cleanup asm-offset.c
  arm64: KVM: Turn system register numbers to an enum
  arm64: KVM: VHE: Patch out use of HVC
  arm64: Add ARM64_HAS_VIRT_HOST_EXTN feature
  arm/arm64: Add new is_kernel_in_hyp_mode predicate
  arm64: KVM: Move away from the assembly version of the world switch
  arm64: KVM: Map the kernel RO section into HYP
  arm64: KVM: Add compatibility aliases
  arm64: KVM: Implement vgic-v3 save/restore
  arm64: KVM: Add panic handling
  arm64: KVM: HYP mode entry points
  arm64: KVM: Implement TLB handling
  arm64: KVM: Implement fpsimd save/restore
  arm64: KVM: Implement the core world switch
  arm64: KVM: Add patchable function selector
  arm64: KVM: Implement guest entry
  arm64: KVM: Implement debug save/restore
  arm64: KVM: Implement 32bit system register save/restore
  arm64: KVM: Implement system register save/restore
  arm64: KVM: Implement timer save/restore
  arm64: KVM: Implement vgic-v2 save/restore
  arm64: KVM: Add a HYP-specific header file
  KVM: arm/arm64: vgic-v3: Make the LR indexing macro public
  arm64: Add macros to read/write system registers
  Linux 4.4.30
  Revert "fix minor infoleak in get_user_ex()"
  Revert "x86/mm: Expand the exception table logic to allow new handling options"
  Linux 4.4.29
  ARM: pxa: pxa_cplds: fix interrupt handling
  powerpc/nvram: Fix an incorrect partition merge
  mpt3sas: Don't spam logs if logging level is 0
  perf symbols: Fixup symbol sizes before picking best ones
  perf symbols: Check symbol_conf.allow_aliases for kallsyms loading too
  perf hists browser: Fix event group display
  clk: divider: Fix clk_divider_round_rate() to use clk_readl()
  clk: qoriq: fix a register offset error
  s390/con3270: fix insufficient space padding
  s390/con3270: fix use of uninitialised data
  s390/cio: fix accidental interrupt enabling during resume
  x86/mm: Expand the exception table logic to allow new handling options
  dmaengine: ipu: remove bogus NO_IRQ reference
  power: bq24257: Fix use of uninitialized pointer bq->charger
  staging: r8188eu: Fix scheduling while atomic splat
  ASoC: dapm: Fix kcontrol creation for output driver widget
  ASoC: dapm: Fix value setting for _ENUM_DOUBLE MUX's second channel
  ASoC: dapm: Fix possible uninitialized variable in snd_soc_dapm_get_volsw()
  ASoC: topology: Fix error return code in soc_tplg_dapm_widget_create()
  hwrng: omap - Only fail if pm_runtime_get_sync returns < 0
  crypto: arm/ghash-ce - add missing async import/export
  crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
  mwifiex: correct aid value during tdls setup
  spi: spi-fsl-dspi: Drop extra spi_master_put in device remove function
  ARM: clk-imx35: fix name for ckil clk
  uio: fix dmem_region_start computation
  genirq/generic_chip: Add irq_unmap callback
  perf stat: Fix interval output values
  powerpc/eeh: Null check uses of eeh_pe_bus_get
  tunnels: Remove encapsulation offloads on decap.
  tunnels: Don't apply GRO to multiple layers of encapsulation.
  ipip: Properly mark ipip GRO packets as encapsulated.
  posix_acl: Clear SGID bit when setting file permissions
  brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()
  mm/hugetlb: fix memory offline with hugepage size > memory block size
  drm/i915: Unalias obj->phys_handle and obj->userptr
  drm/i915: Account for TSEG size when determining 865G stolen base
  Revert "drm/i915: Check live status before reading edid"
  drm/i915/gen9: fix the WaWmMemoryReadLatency implementation
  xenbus: don't look up transaction IDs for ordinary writes
  drm/vmwgfx: Limit the user-space command buffer size
  drm/radeon: change vblank_time's calculation method to reduce computational error.
  drm/radeon/si/dpm: fix phase shedding setup
  drm/radeon: narrow asic_init for virtualization
  drm/amdgpu: change vblank_time's calculation method to reduce computational error.
  drm/amdgpu/dce11: add missing drm_mode_config_cleanup call
  drm/amdgpu/dce11: disable hpd on local panels
  drm/amdgpu/dce8: disable hpd on local panels
  drm/amdgpu/dce10: disable hpd on local panels
  drm/amdgpu: fix IB alignment for UVD
  drm/prime: Pass the right module owner through to dma_buf_export()
  Linux 4.4.28
  target: Don't override EXTENDED_COPY xcopy_pt_cmd SCSI status code
  target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLE
  target: Re-add missing SCF_ACK_KREF assignment in v4.1.y
  ubifs: Fix xattr_names length in exit paths
  jbd2: fix incorrect unlock on j_list_lock
  ext4: do not advertise encryption support when disabled
  mmc: rtsx_usb_sdmmc: Handle runtime PM while changing the led
  mmc: rtsx_usb_sdmmc: Avoid keeping the device runtime resumed when unused
  mmc: core: Annotate cmd_hdr as __le32
  powerpc/mm: Prevent unlikely crash in copro_calculate_slb()
  ceph: fix error handling in ceph_read_iter
  arm64: kernel: Init MDCR_EL2 even in the absence of a PMU
  arm64: percpu: rewrite ll/sc loops in assembly
  memstick: rtsx_usb_ms: Manage runtime PM when accessing the device
  memstick: rtsx_usb_ms: Runtime resume the device when polling for cards
  isofs: Do not return EACCES for unknown filesystems
  irqchip/gic-v3-its: Fix entry size mask for GITS_BASER
  s390/mm: fix gmap tlb flush issues
  Using BUG_ON() as an assert() is _never_ acceptable
  mm: filemap: fix mapping->nrpages double accounting in fuse
  mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
  acpi, nfit: check for the correct event code in notifications
  net/mlx4_core: Allow resetting VF admin mac to zero
  bnx2x: Prevent false warning for lack of FC NPIV
  PKCS#7: Don't require SpcSpOpusInfo in Authenticode pkcs7 signatures
  hpsa: correct skipping masked peripherals
  sd: Fix rw_max for devices that report an optimal xfer size
  irqchip/gicv3: Handle loop timeout proper
  kvm: x86: memset whole irq_eoi
  x86/e820: Don't merge consecutive E820_PRAM ranges
  blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL
  Fix regression which breaks DFS mounting
  Cleanup missing frees on some ioctls
  Do not send SMB3 SET_INFO request if nothing is changing
  SMB3: GUIDs should be constructed as random but valid uuids
  Set previous session id correctly on SMB3 reconnect
  Display number of credits available
  Clarify locking of cifs file and tcon structures and make more granular
  fs/cifs: keep guid when assigning fid to fileinfo
  cifs: Limit the overall credit acquired
  fs/super.c: fix race between freeze_super() and thaw_super()
  arc: don't leak bits of kernel stack into coredump
  lightnvm: ensure that nvm_dev_ops can be used without CONFIG_NVM
  ipc/sem.c: fix complex_count vs. simple op race
  mm: filemap: don't plant shadow entries without radix tree node
  metag: Only define atomic_dec_if_positive conditionally
  scsi: Fix use-after-free
  NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic
  NFSv4: Open state recovery must account for file permission changes
  NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid
  NFSv4: Don't report revoked delegations as valid in nfs_have_delegation()
  sunrpc: fix write space race causing stalls
  Input: elantech - add Fujitsu Lifebook E556 to force crc_enabled
  Input: elantech - force needed quirks on Fujitsu H760
  Input: i8042 - skip selftest on ASUS laptops
  lib: add "on"/"off" support to kstrtobool
  lib: update single-char callers of strtobool()
  lib: move strtobool() to kstrtobool()
  MIPS: ptrace: Fix regs_return_value for kernel context
  MIPS: Fix -mabi=64 build of vdso.lds
  ALSA: hda - Fix a failure of micmute led when having multi adcs
  cx231xx: fix GPIOs for Pixelview SBTVD hybrid
  cx231xx: don't return error on success
  mb86a20s: fix demod settings
  mb86a20s: fix the locking logic
  ovl: copy_up_xattr(): use strnlen
  ovl: Fix info leak in ovl_lookup_temp()
  fbdev/efifb: Fix 16 color palette entry calculation
  scsi: zfcp: spin_lock_irqsave() is not nestable
  zfcp: trace full payload of all SAN records (req,resp,iels)
  zfcp: fix payload trace length for SAN request&response
  zfcp: fix D_ID field with actual value on tracing SAN responses
  zfcp: restore tracing of handle for port and LUN with HBA records
  zfcp: trace on request for open and close of WKA port
  zfcp: restore: Dont use 0 to indicate invalid LUN in rec trace
  zfcp: retain trace level for SCSI and HBA FSF response records
  zfcp: close window with unblocked rport during rport gone
  zfcp: fix ELS/GS request&response length for hardware data router
  zfcp: fix fc_host port_type with NPIV
  ubi: Deal with interrupted erasures in WL
  powerpc/pseries: Fix stack corruption in htpe code
  powerpc/64: Fix incorrect return value from __copy_tofrom_user
  powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()
  powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()
  powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()
  powerpc/vdso64: Use double word compare on pointers
  dm crypt: fix crash on exit
  dm mpath: check if path's request_queue is dying in activate_path()
  dm: return correct error code in dm_resume()'s retry loop
  dm: mark request_queue dead before destroying the DM device
  perf intel-pt: Fix MTC timestamp calculation for large MTC periods
  perf intel-pt: Fix estimated timestamps for cycle-accurate mode
  perf intel-pt: Fix snapshot overlap detection decoder errors
  pstore/ram: Use memcpy_fromio() to save old buffer
  pstore/ram: Use memcpy_toio instead of memcpy
  pstore/core: drop cmpxchg based updates
  pstore/ramoops: fixup driver removal
  parisc: Increase initial kernel mapping size
  parisc: Fix kernel memory layout regarding position of __gp
  parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels
  cpufreq: intel_pstate: Fix unsafe HWP MSR access
  platform: don't return 0 from platform_get_irq[_byname]() on error
  PCI: Mark Atheros AR9580 to avoid bus reset
  mmc: sdhci: cast unsigned int to unsigned long long to avoid unexpeted error
  mmc: block: don't use CMD23 with very old MMC cards
  rtlwifi: Fix missing country code for Great Britain
  PM / devfreq: event: remove duplicate devfreq_event_get_drvdata()
  clk: imx6: initialize GPU clocks
  regulator: tps65910: Work around silicon erratum SWCZ010
  mei: me: add kaby point device ids
  gpio: mpc8xxx: Correct irq handler function
  cgroup: Change from CAP_SYS_NICE to CAP_SYS_RESOURCE for cgroup migration permissions
  UPSTREAM: cpu/hotplug: Handle unbalanced hotplug enable/disable
  UPSTREAM: arm64: kaslr: fix breakage with CONFIG_MODVERSIONS=y
  UPSTREAM: arm64: kaslr: keep modules close to the kernel when DYNAMIC_FTRACE=y
  cgroup: Remove leftover instances of allow_attach
  BACKPORT: lib: harden strncpy_from_user
  CHROMIUM: cgroups: relax permissions on moving tasks between cgroups
  CHROMIUM: remove Android's cgroup generic permissions checks
  Linux 4.4.27
  cfq: fix starvation of asynchronous writes
  vfs: move permission checking into notify_change() for utimes(NULL)
  dlm: free workqueues after the connections
  crypto: vmx - Fix memory corruption caused by p8_ghash
  crypto: ghash-generic - move common definitions to a new header file
  ext4: release bh in make_indexed_dir
  ext4: allow DAX writeback for hole punch
  ext4: fix memory leak in ext4_insert_range()
  ext4: reinforce check of i_dtime when clearing high fields of uid and gid
  ext4: enforce online defrag restriction for encrypted files
  scsi: ibmvfc: Fix I/O hang when port is not mapped
  scsi: arcmsr: Simplify user_len checking
  scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer()
  async_pq_val: fix DMA memory leak
  reiserfs: switch to generic_{get,set,remove}xattr()
  reiserfs: Unlock superblock before calling reiserfs_quota_on_mount()
  ASoC: Intel: Atom: add a missing star in a memcpy call
  brcmfmac: fix memory leak in brcmf_fill_bss_param
  i40e: avoid NULL pointer dereference and recursive errors on early PCI error
  fuse: fix killing s[ug]id in setattr
  fuse: invalidate dir dentry after chmod
  fuse: listxattr: verify xattr list
  drivers: base: dma-mapping: page align the size when unmap_kernel_range
  btrfs: assign error values to the correct bio structs
  serial: 8250_dw: Check the data->pclk when get apb_pclk
  arm64: Use PoU cache instr for I/D coherency
  arm64: mm: add code to safely replace TTBR1_EL1
  arm64: mm: place __cpu_setup in .text
  arm64: add function to install the idmap
  arm64: unmap idmap earlier
  arm64: unify idmap removal
  arm64: mm: place empty_zero_page in bss
  arm64: head.S: use memset to clear BSS
  arm64: mm: specialise pagetable allocators
  arm64: mm: remove pointless PAGE_MASKing
  asm-generic: Fix local variable shadow in __set_fixmap_offset
  arm64: mm: fold alternatives into .init
  ARM: 8511/1: ARM64: kernel: PSCI: move PSCI idle management code to drivers/firmware
  ARM: 8481/2: drivers: psci: replace psci firmware calls
  ARM: 8480/2: arm64: add implementation for arm-smccc
  ARM: 8479/2: add implementation for arm-smccc
  ARM: 8478/2: arm/arm64: add arm-smccc
  ARM: 8510/1: rework ARM_CPU_SUSPEND dependencies
  ARM: 8458/1: bL_switcher: add GIC dependency
  Linux 4.4.26
  mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
  x86/build: Build compressed x86 kernels as PIE
  arm64: Remove stack duplicating code from jprobes
  arm64: kprobes: Add KASAN instrumentation around stack accesses
  arm64: kprobes: Cleanup jprobe_return
  arm64: kprobes: Fix overflow when saving stack
  arm64: kprobes: WARN if attempting to step with PSTATE.D=1
  kprobes: Add arm64 case in kprobe example module
  arm64: Add kernel return probes support (kretprobes)
  arm64: Add trampoline code for kretprobes
  arm64: kprobes instruction simulation support
  arm64: Treat all entry code as non-kprobe-able
  arm64: Blacklist non-kprobe-able symbol
  arm64: Kprobes with single stepping support
  arm64: add conditional instruction simulation support
  arm64: Add more test functions to insn.c
  arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
  Linux 4.4.25
  tpm_crb: fix crb_req_canceled behavior
  tpm: fix a race condition in tpm2_unseal_trusted()
  ima: use file_dentry()
  ARM: cpuidle: Fix error return code
  ARM: dts: MSM8064 remove flags from SPMI/MPP IRQs
  ARM: dts: mvebu: armada-390: add missing compatibility string and bracket
  x86/dumpstack: Fix x86_32 kernel_stack_pointer() previous stack access
  x86/irq: Prevent force migration of irqs which are not in the vector domain
  x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation
  KVM: PPC: BookE: Fix a sanity check
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
  mfd: wm8350-i2c: Make sure the i2c regmap functions are compiled
  mfd: 88pm80x: Double shifting bug in suspend/resume
  mfd: atmel-hlcdc: Do not sleep in atomic context
  mfd: rtsx_usb: Avoid setting ucr->current_sg.status
  ALSA: usb-line6: use the same declaration as definition in header for MIDI manufacturer ID
  ALSA: usb-audio: Extend DragonFly dB scale quirk to cover other variants
  ALSA: ali5451: Fix out-of-bound position reporting
  timekeeping: Fix __ktime_get_fast_ns() regression
  time: Add cycles to nanoseconds translation
  mm: Fix build for hardened usercopy
  ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
  ANDROID: binder: Add strong ref checks
  UPSTREAM: staging/android/ion : fix a race condition in the ion driver
  ANDROID: android-base: CONFIG_HARDENED_USERCOPY=y
  UPSTREAM: fs/proc/kcore.c: Add bounce buffer for ktext data
  UPSTREAM: fs/proc/kcore.c: Make bounce buffer global for read
  BACKPORT: arm64: Correctly bounds check virt_addr_valid
  Fix a build breakage in IO latency hist code.
  UPSTREAM: efi: include asm/early_ioremap.h not asm/efi.h to get early_memremap
  UPSTREAM: ia64: split off early_ioremap() declarations into asm/early_ioremap.h
  FROMLIST: arm64: Enable CONFIG_ARM64_SW_TTBR0_PAN
  FROMLIST: arm64: xen: Enable user access before a privcmd hvc call
  FROMLIST: arm64: Handle faults caused by inadvertent user access with PAN enabled
  FROMLIST: arm64: Disable TTBR0_EL1 during normal kernel execution
  FROMLIST: arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1
  FROMLIST: arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro
  FROMLIST: arm64: Factor out PAN enabling/disabling into separate uaccess_* macros
  UPSTREAM: arm64: Handle el1 synchronous instruction aborts cleanly
  UPSTREAM: arm64: include alternative handling in dcache_by_line_op
  UPSTREAM: arm64: fix "dc cvau" cache operation on errata-affected core
  UPSTREAM: Revert "arm64: alternatives: add enable parameter to conditional asm macros"
  UPSTREAM: arm64: Add new asm macro copy_page
  UPSTREAM: arm64: kill ESR_LNX_EXEC
  UPSTREAM: arm64: add macro to extract ESR_ELx.EC
  UPSTREAM: arm64: mm: mark fault_info table const
  UPSTREAM: arm64: fix dump_instr when PAN and UAO are in use
  BACKPORT: arm64: Fold proc-macros.S into assembler.h
  UPSTREAM: arm64: choose memstart_addr based on minimum sparsemem section alignment
  UPSTREAM: arm64/mm: ensure memstart_addr remains sufficiently aligned
  UPSTREAM: arm64/kernel: fix incorrect EL0 check in inv_entry macro
  UPSTREAM: arm64: Add macros to read/write system registers
  UPSTREAM: arm64/efi: refactor EFI init and runtime code for reuse by 32-bit ARM
  UPSTREAM: arm64/efi: split off EFI init and runtime code for reuse by 32-bit ARM
  UPSTREAM: arm64/efi: mark UEFI reserved regions as MEMBLOCK_NOMAP
  BACKPORT: arm64: only consider memblocks with NOMAP cleared for linear mapping
  UPSTREAM: mm/memblock: add MEMBLOCK_NOMAP attribute to memblock memory table
  ANDROID: dm: android-verity: Remove fec_header location constraint
  BACKPORT: audit: consistently record PIDs with task_tgid_nr()
  android-base.cfg: Enable kernel ASLR
  UPSTREAM: vmlinux.lds.h: allow arch specific handling of ro_after_init data section
  UPSTREAM: arm64: spinlock: fix spin_unlock_wait for LSE atomics
  UPSTREAM: arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE
  UPSTREAM: arm64: Only select ARM64_MODULE_PLTS if MODULES=y
  sched: Add Kconfig option DEFAULT_USE_ENERGY_AWARE to set ENERGY_AWARE feature flag
  sched/fair: remove printk while schedule is in progress
  ANDROID: fs: FS tracepoints to track IO.
  sched/walt: Drop arch-specific timer access
  ANDROID: fiq_debugger: Pass task parameter to unwind_frame()
  eas/sched/fair: Fixing comments in find_best_target.
  input: keyreset: switch to orderly_reboot
  UPSTREAM: tun: fix transmit timestamp support
  UPSTREAM: arch/arm/include/asm/pgtable-3level.h: add pmd_mkclean for THP
  net: inet: diag: expose the socket mark to privileged processes.
  net: diag: make udp_diag_destroy work for mapped addresses.
  net: diag: support SOCK_DESTROY for UDP sockets
  net: diag: allow socket bytecode filters to match socket marks
  net: diag: slightly refactor the inet_diag_bc_audit error checks.
  net: diag: Add support to filter on device index
  UPSTREAM: brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()
  Linux 4.4.24
  ALSA: hda - Add the top speaker pin config for HP Spectre x360
  ALSA: hda - Fix headset mic detection problem for several Dell laptops
  ACPICA: acpi_get_sleep_type_data: Reduce warnings
  ALSA: hda - Adding one more ALC255 pin definition for headset problem
  Revert "usbtmc: convert to devm_kzalloc"
  USB: serial: cp210x: Add ID for a Juniper console
  Staging: fbtft: Fix bug in fbtft-core
  usb: misc: legousbtower: Fix NULL pointer deference
  USB: serial: cp210x: fix hardware flow-control disable
  dm log writes: fix bug with too large bios
  clk: xgene: Add missing parenthesis when clearing divider value
  aio: mark AIO pseudo-fs noexec
  batman-adv: remove unused callback from batadv_algo_ops struct
  IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV
  IB/mlx4: Fix code indentation in QP1 MAD flow
  IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV
  IB/ipoib: Don't allow MC joins during light MC flush
  IB/core: Fix use after free in send_leave function
  IB/ipoib: Fix memory corruption in ipoib cm mode connect flow
  KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
  dmaengine: at_xdmac: fix to pass correct device identity to free_irq()
  kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
  ASoC: omap-mcpdm: Fix irq resource handling
  sysctl: handle error writing UINT_MAX to u32 fields
  powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
  brcmsmac: Initialize power in brcms_c_stf_ss_algo_channel_get()
  brcmsmac: Free packet if dma_mapping_error() fails in dma_rxfill
  brcmfmac: Fix glob_skb leak in brcmf_sdiod_recv_chain
  ASoC: Intel: Skylake: Fix error return code in skl_probe()
  pNFS/flexfiles: Fix layoutcommit after a commit to DS
  pNFS/files: Fix layoutcommit after a commit to DS
  NFS: Don't drop CB requests with invalid principals
  svc: Avoid garbage replies when pc_func() returns rpc_drop_reply
  dmaengine: at_xdmac: fix debug string
  fnic: pci_dma_mapping_error() doesn't return an error code
  avr32: off by one in at32_init_pio()
  ath9k: Fix programming of minCCA power threshold
  gspca: avoid unused variable warnings
  em28xx-i2c: rt_mutex_trylock() returns zero on failure
  NFC: fdp: Detect errors from fdp_nci_create_conn()
  iwlmvm: mvm: set correct state in smart-fifo configuration
  tile: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
  pstore: drop file opened reference count
  blk-mq: actually hook up defer list when running requests
  hwrng: omap - Fix assumption that runtime_get_sync will always succeed
  ARM: sa1111: fix pcmcia suspend/resume
  ARM: shmobile: fix regulator quirk for Gen2
  ARM: sa1100: clear reset status prior to reboot
  ARM: sa1100: fix 3.6864MHz clock
  ARM: sa1100: register clocks early
  ARM: sun5i: Fix typo in trip point temperature
  regulator: qcom_smd: Fix voltage ranges for pm8x41
  regulator: qcom_spmi: Update mvs1/mvs2 switches on pm8941
  regulator: qcom_spmi: Add support for get_mode/set_mode on switches
  regulator: qcom_spmi: Add support for S4 supply on pm8941
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  printk: fix parsing of "brl=" option
  MIPS: uprobes: fix use of uninitialised variable
  MIPS: Malta: Fix IOCU disable switch read for MIPS64
  MIPS: fix uretprobe implementation
  MIPS: uprobes: remove incorrect set_orig_insn
  arm64: debug: avoid resetting stepping state machine when TIF_SINGLESTEP
  ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
  irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning
  gpio: sa1100: fix irq probing for ucb1x00
  usb: gadget: fsl_qe_udc: signedness bug in qe_get_frame()
  ceph: fix race during filling readdir cache
  iwlwifi: mvm: don't use ret when not initialised
  iwlwifi: pcie: fix access to scratch buffer
  spi: sh-msiof: Avoid invalid clock generator parameters
  hwmon: (adt7411) set bit 3 in CFG1 register
  nvmem: Declare nvmem_cell_read() consistently
  ipvs: fix bind to link-local mcast IPv6 address in backup
  tools/vm/slabinfo: fix an unintentional printf
  mmc: pxamci: fix potential oops
  drivers/perf: arm_pmu: Fix leak in error path
  pinctrl: Flag strict is a field in struct pinmux_ops
  pinctrl: uniphier: fix .pin_dbg_show() callback
  i40e: avoid null pointer dereference
  perf/core: Fix pmu::filter_match for SW-led groups
  iwlwifi: mvm: fix a few firmware capability checks
  usb: musb: fix DMA for host mode
  usb: musb: Fix DMA desired mode for Mentor DMA engine
  ARM: 8617/1: dma: fix dma_max_pfn()
  ARM: 8616/1: dt: Respect property size when parsing CPUs
  drm/radeon/si/dpm: add workaround for for Jet parts
  drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion
  x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
  x86/init: Fix cr4_init_shadow() on CR4-less machines
  can: dev: fix deadlock reported after bus-off
  mm,ksm: fix endless looping in allocating memory when ksm enable
  mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl
  cpuset: handle race between CPU hotplug and cpuset_hotplug_work
  usercopy: fold builtin_const check into inline function
  Linux 4.4.23
  hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common()
  qxl: check for kmap failures
  power: supply: max17042_battery: fix model download bug.
  power_supply: tps65217-charger: fix missing platform_set_drvdata()
  PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
  PM / hibernate: Restore processor state before using per-CPU variables
  MIPS: paravirt: Fix undefined reference to smp_bootstrap
  MIPS: Add a missing ".set pop" in an early commit
  MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
  MIPS: Remove compact branch policy Kconfig entries
  MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
  MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
  MIPS: Fix pre-r6 emulation FPU initialisation
  i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
  i2c-eg20t: fix race between i2c init and interrupt enable
  btrfs: ensure that file descriptor used with subvol ioctls is a dir
  nl80211: validate number of probe response CSA counters
  can: flexcan: fix resume function
  mm: delete unnecessary and unsafe init_tlb_ubc()
  tracing: Move mutex to protect against resetting of seq data
  fix memory leaks in tracing_buffers_splice_read()
  power: reset: hisi-reboot: Unmap region obtained by of_iomap
  mtd: pmcmsp-flash: Allocating too much in init_msp_flash()
  mtd: maps: sa1100-flash: potential NULL dereference
  fix fault_in_multipages_...() on architectures with no-op access_ok()
  fanotify: fix list corruption in fanotify_get_response()
  fsnotify: add a way to stop queueing events on group shutdown
  xfs: prevent dropping ioend completions during buftarg wait
  autofs: use dentry flags to block walks during expire
  autofs races
  pwm: Mark all devices as "might sleep"
  bridge: re-introduce 'fix parsing of MLDv2 reports'
  net: smc91x: fix SMC accesses
  Revert "phy: IRQ cannot be shared"
  net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
  net/mlx5: Added missing check of msg length in verifying its signature
  tipc: fix NULL pointer dereference in shutdown()
  net/irda: handle iriap_register_lsap() allocation failure
  vti: flush x-netns xfrm cache when vti interface is removed
  af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
  Revert "af_unix: Fix splice-bind deadlock"
  bonding: Fix bonding crash
  megaraid: fix null pointer check in megasas_detach_one().
  nouveau: fix nv40_perfctr_next() cleanup regression
  Staging: iio: adc: fix indent on break statement
  iwlegacy: avoid warning about missing braces
  ath9k: fix misleading indentation
  am437x-vfpe: fix typo in vpfe_get_app_input_index
  Add braces to avoid "ambiguous ‘else’" compiler warnings
  net: caif: fix misleading indentation
  Makefile: Mute warning for __builtin_return_address(>0) for tracing only
  Disable "frame-address" warning
  Disable "maybe-uninitialized" warning globally
  gcov: disable -Wmaybe-uninitialized warning
  Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES
  kbuild: forbid kernel directory to contain spaces and colons
  tools: Support relative directory path for 'O='
  Makefile: revert "Makefile: Document ability to make file.lst and file.S" partially
  kbuild: Do not run modules_install and install in paralel
  ocfs2: fix start offset to ocfs2_zero_range_for_truncate()
  ocfs2/dlm: fix race between convert and migration
  crypto: echainiv - Replace chaining with multiplication
  crypto: skcipher - Fix blkcipher walk OOM crash
  crypto: arm/aes-ctr - fix NULL dereference in tail processing
  crypto: arm64/aes-ctr - fix NULL dereference in tail processing
  tcp: properly scale window in tcp_v[46]_reqsk_send_ack()
  tcp: fix use after free in tcp_xmit_retransmit_queue()
  tcp: cwnd does not increase in TCP YeAH
  ipv6: release dst in ping_v6_sendmsg
  ipv4: panic in leaf_walk_rcu due to stale node pointer
  reiserfs: fix "new_insert_key may be used uninitialized ..."
  Fix build warning in kernel/cpuset.c
  include/linux/kernel.h: change abs() macro so it uses consistent return type
  Linux 4.4.22
  openrisc: fix the fix of copy_from_user()
  avr32: fix 'undefined reference to `___copy_from_user'
  ia64: copy_from_user() should zero the destination on access_ok() failure
  genirq/msi: Fix broken debug output
  ppc32: fix copy_from_user()
  sparc32: fix copy_from_user()
  mn10300: copy_from_user() should zero on access_ok() failure...
  nios2: copy_from_user() should zero the tail of destination
  openrisc: fix copy_from_user()
  parisc: fix copy_from_user()
  metag: copy_from_user() should zero the destination on access_ok() failure
  alpha: fix copy_from_user()
  asm-generic: make copy_from_user() zero the destination properly
  mips: copy_from_user() must zero the destination on access_ok() failure
  hexagon: fix strncpy_from_user() error return
  sh: fix copy_from_user()
  score: fix copy_from_user() and friends
  blackfin: fix copy_from_user()
  cris: buggered copy_from_user/copy_to_user/clear_user
  frv: fix clear_user()
  asm-generic: make get_user() clear the destination on errors
  ARC: uaccess: get_user to zero out dest in cause of fault
  s390: get_user() should zero on failure
  score: fix __get_user/get_user
  nios2: fix __get_user()
  sh64: failing __get_user() should zero
  m32r: fix __get_user()
  mn10300: failing __get_user() and get_user() should zero
  fix minor infoleak in get_user_ex()
  microblaze: fix copy_from_user()
  avr32: fix copy_from_user()
  microblaze: fix __get_user()
  fix iov_iter_fault_in_readable()
  irqchip/atmel-aic: Fix potential deadlock in ->xlate()
  genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers
  drm: Only use compat ioctl for addfb2 on X86/IA64
  drm: atmel-hlcdc: Fix vertical scaling
  net: simplify napi_synchronize() to avoid warnings
  kconfig: tinyconfig: provide whole choice blocks to avoid warnings
  soc: qcom/spm: shut up uninitialized variable warning
  pinctrl: at91-pio4: use %pr format string for resource
  mmc: dw_mmc: use resource_size_t to store physical address
  drm/i915: Avoid pointer arithmetic in calculating plane surface offset
  mpssd: fix buffer overflow warning
  gma500: remove annoying deprecation warning
  ipv6: addrconf: fix dev refcont leak when DAD failed
  sched/core: Fix a race between try_to_wake_up() and a woken up task
  Revert "wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel"
  ath9k: fix using sta->drv_priv before initializing it
  md-cluster: make md-cluster also can work when compiled into kernel
  xhci: fix null pointer dereference in stop command timeout function
  fuse: direct-io: don't dirty ITER_BVEC pages
  Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns
  crypto: cryptd - initialize child shash_desc on import
  arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb()
  pinctrl: sunxi: fix uart1 CTS/RTS pins at PG on A23/A33
  pinctrl: pistachio: fix mfio pll_lock pinmux
  dm crypt: fix error with too large bios
  dm log writes: move IO accounting earlier to fix error path
  dm log writes: fix check of kthread_run() return value
  bus: arm-ccn: Fix XP watchpoint settings bitmask
  bus: arm-ccn: Do not attempt to configure XPs for cycle counter
  bus: arm-ccn: Fix PMU handling of MN
  ARM: dts: STiH407-family: Provide interconnect clock for consumption in ST SDHCI
  ARM: dts: overo: fix gpmc nand on boards with ethernet
  ARM: dts: overo: fix gpmc nand cs0 range
  ARM: dts: imx6qdl: Fix SPDIF regression
  ARM: OMAP3: hwmod data: Add sysc information for DSI
  ARM: kirkwood: ib62x0: fix size of u-boot environment partition
  ARM: imx6: add missing BM_CLPCR_BYPASS_PMIC_READY setting for imx6sx
  ARM: imx6: add missing BM_CLPCR_BYP_MMDC_CH0_LPM_HS setting for imx6ul
  ARM: AM43XX: hwmod: Fix RSTST register offset for pruss
  cpuset: make sure new tasks conform to the current config of the cpuset
  net: thunderx: Fix OOPs with ethtool --register-dump
  USB: change bInterval default to 10 ms
  ARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)
  usb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phase
  usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition
  USB: serial: simple: add support for another Infineon flashloader
  serial: 8250: added acces i/o products quad and octal serial cards
  serial: 8250_mid: fix divide error bug if baud rate is 0
  iio: ensure ret is initialized to zero before entering do loop
  iio:core: fix IIO_VAL_FRACTIONAL sign handling
  iio: accel: kxsd9: Fix scaling bug
  iio: fix pressure data output unit in hid-sensor-attributes
  iio: accel: bmc150: reset chip at init time
  iio: adc: at91: unbreak channel adc channel 3
  iio: ad799x: Fix buffered capture for ad7991/ad7995/ad7999
  iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample
  iio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access
  iio: adc: rockchip_saradc: reset saradc controller before programming it
  iio: proximity: as3935: set up buffer timestamps for non-zero values
  iio: accel: kxsd9: Fix raw read return
  kvm-arm: Unmap shadow pagetables properly
  x86/AMD: Apply erratum 665 on machines without a BIOS fix
  x86/paravirt: Do not trace _paravirt_ident_*() functions
  ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS
  IB/uverbs: Fix race between uverbs_close and remove_one
  dm flakey: fix reads to be issued if drop_writes configured
  audit: fix exe_file access in audit_exe_compare
  mm: introduce get_task_exe_file
  kexec: fix double-free when failing to relocate the purgatory
  NFSv4.1: Fix the CREATE_SESSION slot number accounting
  pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised
  nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock
  NFSv4.x: Fix a refcount leak in nfs_callback_up_net
  pNFS: The client must not do I/O to the DS if it's lease has expired
  kernfs: don't depend on d_find_any_alias() when generating notifications
  powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET
  powerpc/powernv : Drop reference added by kset_find_obj()
  powerpc/tm: do not use r13 for tabort_syscall
  tipc: move linearization of buffers to generic code
  lightnvm: put bio before return
  fscrypto: require write access to mount to set encryption policy
  Revert "KVM: x86: fix missed hardware breakpoints"
  MIPS: KVM: Check for pfn noslot case
  clocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function
  fscrypto: add authorization check for setting encryption policy
  ext4: use __GFP_NOFAIL in ext4_free_blocks()

Conflicts:
	arch/arm/kernel/devtree.c
	arch/arm64/Kconfig
	arch/arm64/kernel/arm64ksyms.c
	arch/arm64/kernel/psci.c
	arch/arm64/mm/fault.c
	drivers/android/binder.c
	drivers/usb/host/xhci-hub.c
	fs/ext4/readpage.c
	include/linux/mmc/core.h
	include/linux/mmzone.h
	mm/memcontrol.c
	net/core/filter.c
	net/netlink/af_netlink.c
	net/netlink/af_netlink.h

Change-Id: I99fe7a0914e83e284b11b33185b71448a8999d1f
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2017-02-28 17:10:49 -08:00
Jann Horn
5c54f79ad2 swapfile: fix memory corruption via malformed swapfile
commit dd111be69114cc867f8e826284559bfbc1c40e37 upstream.

When root activates a swap partition whose header has the wrong
endianness, nr_badpages elements of badpages are swabbed before
nr_badpages has been checked, leading to a buffer overrun of up to 8GB.

This normally is not a security issue because it can only be exploited
by root (more specifically, a process with CAP_SYS_ADMIN or the ability
to modify a swap file/partition), and such a process can already e.g.
modify swapped-out memory of any other userspace process on the system.

Link: http://lkml.kernel.org/r/1477949533-2509-1-git-send-email-jann@thejh.net
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-18 10:48:34 +01:00
Vinayak Menon
ad6f486284 mm: swap: swap ratio support
Add support to receive a static ratio from userspace to
divide the swap pages between ZRAM and disk based swap
devices. The existing infrastructure allows to keep
same priority for multiple swap devices, which results
in round robin distribution of pages. With this patch,
the ratio can be defined.

CRs-fixed: 968416
Change-Id: I54f54489db84cabb206569dd62d61a8a7a898991
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-03-23 21:19:04 -07:00
Vinayak Menon
588411ea51 mm: swap: fix swapcache usage for fast swap devices
1) The swap readahead algorithm need not be applied for fast swap
devices like zram.
2) Code to set SWP_FAST is placed incorrectly, resulting in the flag
not being set.
Fix these to reduce the swapcache usage.

Change-Id: I23d9af5819f4b25f90f14a12657fa19ed401fb2a
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-03-23 21:13:01 -07:00
Vinayak Menon
e5ce54a9cb mm: swap: don't delay swap free for fast swap devices
There are couple of issues with swapcache usage when ZRAM is used
as swap device.
1) Kernel does a swap readahead which can be around 6 to 8 pages
depending on total ram, which is not required for zram since
accesses are fast.
2) Kernel delays the freeing up of swapcache expecting a later hit,
which again is useless in the case of zram.
3) This is not related to swapcache, but zram usage itself.
As mentioned in (2) kernel delays freeing of swapcache, but along with
that it delays zram compressed page free also. i.e. there can be 2 copies,
though one is compressed.

This patch addresses these issues using two new flags
QUEUE_FLAG_FAST and SWP_FAST, to indicate that accesses to the device
will be fast and cheap, and instructs the swap layer to free up
swap space agressively, and not to do read ahead.

Change-Id: I5d2d5176a5f9420300bb2f843f6ecbdb25ea80e4
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-03-22 11:03:52 -07:00
Minchan Kim
8334b96221 mm: /proc/pid/smaps:: show proportional swap share of the mapping
We want to know per-process workingset size for smart memory management
on userland and we use swap(ex, zram) heavily to maximize memory
efficiency so workingset includes swap as well as RSS.

On such system, if there are lots of shared anonymous pages, it's really
hard to figure out exactly how many each process consumes memory(ie, rss
+ wap) if the system has lots of shared anonymous memory(e.g, android).

This patch introduces SwapPss field on /proc/<pid>/smaps so we can get
more exact workingset size per process.

Bongkyu tested it. Result is below.

1. 50M used swap
SwapTotal: 461976 kB
SwapFree: 411192 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
48236
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
141184

2. 240M used swap
SwapTotal: 461976 kB
SwapFree: 216808 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
230315
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
1387744

[akpm@linux-foundation.org: simplify kunmap_atomic() call]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Bongkyu Kim <bongkyu.kim@lge.com>
Tested-by: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Hugh Dickins
6f179af88f mm: fix potential data race in SyS_swapon
While running KernelThreadSanitizer (ktsan) on upstream kernel with
trinity, we got a few reports from SyS_swapon, here is one of them:

Read of size 8 by thread T307 (K7621):
 [<     inlined    >] SyS_swapon+0x3c0/0x1850 SYSC_swapon mm/swapfile.c:2395
 [<ffffffff812242c0>] SyS_swapon+0x3c0/0x1850 mm/swapfile.c:2345
 [<ffffffff81e97c8a>] ia32_do_call+0x1b/0x25

Looks like the swap_lock should be taken when iterating through the
swap_info array on lines 2392 - 2401: q->swap_file may be reset to
NULL by another thread before it is dereferenced for f_mapping.

But why is that iteration needed at all?  Doesn't the claim_swapfile()
which follows do all that is needed to check for a duplicate entry -
FMODE_EXCL on a bdev, testing IS_SWAPFILE under i_mutex on a regfile?

Well, not quite: bd_may_claim() allows the same "holder" to claim the
bdev again, so we do need to use a different holder than "sys_swapon";
and we should not replace appropriate -EBUSY by inappropriate -EINVAL.

Index i was reused in a cpu loop further down: renamed cpu there.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-08-21 02:33:07 -04:00
Miklos Szeredi
2726d56620 vfs: add seq_file_path() helper
Turn
	seq_path(..., &file->f_path, ...);
into
	seq_file_path(..., file, ...);

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-06-23 18:01:07 -04:00
Jason Low
4db0c3c298 mm: remove rest of ACCESS_ONCE() usages
We converted some of the usages of ACCESS_ONCE to READ_ONCE in the mm/
tree since it doesn't work reliably on non-scalar types.

This patch removes the rest of the usages of ACCESS_ONCE, and use the new
READ_ONCE API for the read accesses.  This makes things cleaner, instead
of using separate/multiple sets of APIs.

Signed-off-by: Jason Low <jason.low2@hp.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:18 -07:00
Johannes Weiner
5d1ea48bdd mm: page_cgroup: rename file to mm/swap_cgroup.c
Now that the external page_cgroup data structure and its lookup is gone,
the only code remaining in there is swap slot accounting.

Rename it and move the conditional compilation into mm/Makefile.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Tejun Heo <tj@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 17:41:09 -08:00
Johannes Weiner
0a31bc97c8 mm: memcontrol: rewrite uncharge API
The memcg uncharging code that is involved towards the end of a page's
lifetime - truncation, reclaim, swapout, migration - is impressively
complicated and fragile.

Because anonymous and file pages were always charged before they had their
page->mapping established, uncharges had to happen when the page type
could still be known from the context; as in unmap for anonymous, page
cache removal for file and shmem pages, and swap cache truncation for swap
pages.  However, these operations happen well before the page is actually
freed, and so a lot of synchronization is necessary:

- Charging, uncharging, page migration, and charge migration all need
  to take a per-page bit spinlock as they could race with uncharging.

- Swap cache truncation happens during both swap-in and swap-out, and
  possibly repeatedly before the page is actually freed.  This means
  that the memcg swapout code is called from many contexts that make
  no sense and it has to figure out the direction from page state to
  make sure memory and memory+swap are always correctly charged.

- On page migration, the old page might be unmapped but then reused,
  so memcg code has to prevent untimely uncharging in that case.
  Because this code - which should be a simple charge transfer - is so
  special-cased, it is not reusable for replace_page_cache().

But now that charged pages always have a page->mapping, introduce
mem_cgroup_uncharge(), which is called after the final put_page(), when we
know for sure that nobody is looking at the page anymore.

For page migration, introduce mem_cgroup_migrate(), which is called after
the migration is successful and the new page is fully rmapped.  Because
the old page is no longer uncharged after migration, prevent double
charges by decoupling the page's memcg association (PCG_USED and
pc->mem_cgroup) from the page holding an actual charge.  The new bits
PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
to the new page during migration.

mem_cgroup_migrate() is suitable for replace_page_cache() as well,
which gets rid of mem_cgroup_replace_page_cache().  However, care
needs to be taken because both the source and the target page can
already be charged and on the LRU when fuse is splicing: grab the page
lock on the charge moving side to prevent changing pc->mem_cgroup of a
page under migration.  Also, the lruvecs of both pages change as we
uncharge the old and charge the new during migration, and putback may
race with us, so grab the lru lock and isolate the pages iff on LRU to
prevent races and ensure the pages are on the right lruvec afterward.

Swap accounting is massively simplified: because the page is no longer
uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
before the final put_page() in page reclaim.

Finally, page_cgroup changes are now protected by whatever protection the
page itself offers: anonymous pages are charged under the page table lock,
whereas page cache insertions, swapin, and migration hold the page lock.
Uncharging happens under full exclusion with no outstanding references.
Charging and uncharging also ensure that the page is off-LRU, which
serializes against charge migration.  Remove the very costly page_cgroup
lock and set pc->flags non-atomically.

[mhocko@suse.cz: mem_cgroup_charge_statistics needs preempt_disable]
[vdavydov@parallels.com: fix flags definition]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 15:57:17 -07:00
Johannes Weiner
00501b531c mm: memcontrol: rewrite charge API
These patches rework memcg charge lifetime to integrate more naturally
with the lifetime of user pages.  This drastically simplifies the code and
reduces charging and uncharging overhead.  The most expensive part of
charging and uncharging is the page_cgroup bit spinlock, which is removed
entirely after this series.

Here are the top-10 profile entries of a stress test that reads a 128G
sparse file on a freshly booted box, without even a dedicated cgroup (i.e.
 executing in the root memcg).  Before:

    15.36%              cat  [kernel.kallsyms]   [k] copy_user_generic_string
    13.31%              cat  [kernel.kallsyms]   [k] memset
    11.48%              cat  [kernel.kallsyms]   [k] do_mpage_readpage
     4.23%              cat  [kernel.kallsyms]   [k] get_page_from_freelist
     2.38%              cat  [kernel.kallsyms]   [k] put_page
     2.32%              cat  [kernel.kallsyms]   [k] __mem_cgroup_commit_charge
     2.18%          kswapd0  [kernel.kallsyms]   [k] __mem_cgroup_uncharge_common
     1.92%          kswapd0  [kernel.kallsyms]   [k] shrink_page_list
     1.86%              cat  [kernel.kallsyms]   [k] __radix_tree_lookup
     1.62%              cat  [kernel.kallsyms]   [k] __pagevec_lru_add_fn

After:

    15.67%           cat  [kernel.kallsyms]   [k] copy_user_generic_string
    13.48%           cat  [kernel.kallsyms]   [k] memset
    11.42%           cat  [kernel.kallsyms]   [k] do_mpage_readpage
     3.98%           cat  [kernel.kallsyms]   [k] get_page_from_freelist
     2.46%           cat  [kernel.kallsyms]   [k] put_page
     2.13%       kswapd0  [kernel.kallsyms]   [k] shrink_page_list
     1.88%           cat  [kernel.kallsyms]   [k] __radix_tree_lookup
     1.67%           cat  [kernel.kallsyms]   [k] __pagevec_lru_add_fn
     1.39%       kswapd0  [kernel.kallsyms]   [k] free_pcppages_bulk
     1.30%           cat  [kernel.kallsyms]   [k] kfree

As you can see, the memcg footprint has shrunk quite a bit.

   text    data     bss     dec     hex filename
  37970    9892     400   48262    bc86 mm/memcontrol.o.old
  35239    9892     400   45531    b1db mm/memcontrol.o

This patch (of 4):

The memcg charge API charges pages before they are rmapped - i.e.  have an
actual "type" - and so every callsite needs its own set of charge and
uncharge functions to know what type is being operated on.  Worse,
uncharge has to happen from a context that is still type-specific, rather
than at the end of the page's lifetime with exclusive access, and so
requires a lot of synchronization.

Rewrite the charge API to provide a generic set of try_charge(),
commit_charge() and cancel_charge() transaction operations, much like
what's currently done for swap-in:

  mem_cgroup_try_charge() attempts to reserve a charge, reclaiming
  pages from the memcg if necessary.

  mem_cgroup_commit_charge() commits the page to the charge once it
  has a valid page->mapping and PageAnon() reliably tells the type.

  mem_cgroup_cancel_charge() aborts the transaction.

This reduces the charge API and enables subsequent patches to
drastically simplify uncharging.

As pages need to be committed after rmap is established but before they
are added to the LRU, page_add_new_anon_rmap() must stop doing LRU
additions again.  Revive lru_cache_add_active_or_unevictable().

[hughd@google.com: fix shmem_unuse]
[hughd@google.com: Add comments on the private use of -EAGAIN]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 15:57:17 -07:00
Chen Yucong
50088c4409 mm/swapfile.c: delete the "last_in_cluster < scan_base" loop in the body of scan_swap_map()
Via commit ebc2a1a691 ("swap: make cluster allocation per-cpu"), we
can find that all SWP_SOLIDSTATE "seek is cheap"(SSD case) has already
gone to si->cluster_info scan_swap_map_try_ssd_cluster() route.  So that
the "last_in_cluster < scan_base" loop in the body of scan_swap_map()
has already become a dead code snippet, and it should have been deleted.

This patch is to delete the redundant loop as Hugh and Shaohua
suggested.

[hughd@google.com: fix comment, simplify code]
Signed-off-by: Chen Yucong <slaoub@gmail.com>
Cc: Shaohua Li <shli@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:12 -07:00
Dan Streetman
18ab4d4ced swap: change swap_list_head to plist, add swap_avail_head
Originally get_swap_page() started iterating through the singly-linked
list of swap_info_structs using swap_list.next or highest_priority_index,
which both were intended to point to the highest priority active swap
target that was not full.  The first patch in this series changed the
singly-linked list to a doubly-linked list, and removed the logic to start
at the highest priority non-full entry; it starts scanning at the highest
priority entry each time, even if the entry is full.

Replace the manually ordered swap_list_head with a plist, swap_active_head.
Add a new plist, swap_avail_head.  The original swap_active_head plist
contains all active swap_info_structs, as before, while the new
swap_avail_head plist contains only swap_info_structs that are active and
available, i.e. not full.  Add a new spinlock, swap_avail_lock, to protect
the swap_avail_head list.

Mel Gorman suggested using plists since they internally handle ordering
the list entries based on priority, which is exactly what swap was doing
manually.  All the ordering code is now removed, and swap_info_struct
entries and simply added to their corresponding plist and automatically
ordered correctly.

Using a new plist for available swap_info_structs simplifies and
optimizes get_swap_page(), which no longer has to iterate over full
swap_info_structs.  Using a new spinlock for swap_avail_head plist
allows each swap_info_struct to add or remove themselves from the
plist when they become full or not-full; previously they could not
do so because the swap_info_struct->lock is held when they change
from full<->not-full, and the swap_lock protecting the main
swap_active_head must be ordered before any swap_info_struct->lock.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:07 -07:00
Dan Streetman
adfab836f4 swap: change swap_info singly-linked list to list_head
The logic controlling the singly-linked list of swap_info_struct entries
for all active, i.e.  swapon'ed, swap targets is rather complex, because:

 - it stores the entries in priority order
 - there is a pointer to the highest priority entry
 - there is a pointer to the highest priority not-full entry
 - there is a highest_priority_index variable set outside the swap_lock
 - swap entries of equal priority should be used equally

this complexity leads to bugs such as: https://lkml.org/lkml/2014/2/13/181
where different priority swap targets are incorrectly used equally.

That bug probably could be solved with the existing singly-linked lists,
but I think it would only add more complexity to the already difficult to
understand get_swap_page() swap_list iteration logic.

The first patch changes from a singly-linked list to a doubly-linked list
using list_heads; the highest_priority_index and related code are removed
and get_swap_page() starts each iteration at the highest priority
swap_info entry, even if it's full.  While this does introduce unnecessary
list iteration (i.e.  Schlemiel the painter's algorithm) in the case where
one or more of the highest priority entries are full, the iteration and
manipulation code is much simpler and behaves correctly re: the above bug;
and the fourth patch removes the unnecessary iteration.

The second patch adds some minor plist helper functions; nothing new
really, just functions to match existing regular list functions.  These
are used by the next two patches.

The third patch adds plist_requeue(), which is used by get_swap_page() in
the next patch - it performs the requeueing of same-priority entries
(which moves the entry to the end of its priority in the plist), so that
all equal-priority swap_info_structs get used equally.

The fourth patch converts the main list into a plist, and adds a new plist
that contains only swap_info entries that are both active and not full.
As Mel suggested using plists allows removing all the ordering code from
swap - plists handle ordering automatically.  The list naming is also
clarified now that there are two lists, with the original list changed
from swap_list_head to swap_active_head and the new list named
swap_avail_head.  A new spinlock is also added for the new list, so
swap_info entries can be added or removed from the new list immediately as
they become full or not full.

This patch (of 4):

Replace the singly-linked list tracking active, i.e.  swapon'ed,
swap_info_struct entries with a doubly-linked list using struct
list_heads.  Simplify the logic iterating and manipulating the list of
entries, especially get_swap_page(), by using standard list_head
functions, and removing the highest priority iteration logic.

The change fixes the bug:
https://lkml.org/lkml/2014/2/13/181
in which different priority swap entries after the highest priority entry
are incorrectly used equally in pairs.  The swap behavior is now as
advertised, i.e. different priority swap entries are used in order, and
equal priority swap targets are used concurrently.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:07 -07:00
Weijie Yang
f893ab41e4 mm/swap: fix race on swap_info reuse between swapoff and swapon
swapoff clear swap_info's SWP_USED flag prematurely and free its
resources after that.  A concurrent swapon will reuse this swap_info
while its previous resources are not cleared completely.

These late freed resources are:
 - p->percpu_cluster
 - swap_cgroup_ctrl[type]
 - block_device setting
 - inode->i_flags &= ~S_SWAPFILE

This patch clears the SWP_USED flag after all its resources are freed,
so that swapon can reuse this swap_info by alloc_swap_info() safely.

[akpm@linux-foundation.org: tidy up code comment]
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06 13:48:51 -08:00
Jamie Liu
a5998061da mm/swapfile.c: do not skip lowest_bit in scan_swap_map() scan loop
In the second half of scan_swap_map()'s scan loop, offset is set to
si->lowest_bit and then incremented before entering the loop for the
first time, causing si->swap_map[si->lowest_bit] to be skipped.

Signed-off-by: Jamie Liu <jamieliu@google.com>
Cc: Shaohua Li <shli@fusionio.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:36:53 -08:00
Sasha Levin
309381feae mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE
Most of the VM_BUG_ON assertions are performed on a page.  Usually, when
one of these assertions fails we'll get a BUG_ON with a call stack and
the registers.

I've recently noticed based on the requests to add a small piece of code
that dumps the page to various VM_BUG_ON sites that the page dump is
quite useful to people debugging issues in mm.

This patch adds a VM_BUG_ON_PAGE(cond, page) which beyond doing what
VM_BUG_ON() does, also dumps the page before executing the actual
BUG_ON.

[akpm@linux-foundation.org: fix up includes]
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:36:50 -08:00
Krzysztof Kozlowski
58e97ba6b1 frontswap: enable call to invalidate area on swapoff
During swapoff the frontswap_map was NULL-ified before calling
frontswap_invalidate_area().  However the frontswap_invalidate_area()
exits early if frontswap_map is NULL.  Invalidate was never called
during swapoff.

This patch moves frontswap_map_set() in swapoff just after calling
frontswap_invalidate_area() so outside of locks (swap_lock and
swap_info_struct->lock).  This shouldn't be a problem as during swapon
the frontswap_map_set() is called also outside of any locks.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:07 +09:00
Seth Jennings
2de1a7e40a mm/swapfile.c: fix comment typos
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 12:09:07 +09:00
Krzysztof Kozlowski
5b808a2300 swap: fix set_blocksize race during swapon/swapoff
Fix race between swapoff and swapon.  Swapoff used old_block_size from
swap_info outside of swapon_mutex so it could be overwritten by
concurrent swapon.

The race has visible effect only if more than one swap block device
exists with different block sizes (e.g.  /dev/sda1 with block size 4096
and /dev/sdb1 with 512).  In such case it leads to setting the blocksize
of swapped off device with wrong blocksize.

The bug can be triggered with multiple concurrent swapoff and swapon:
0. Swap for some device is on.
1. swapoff:
First the swapoff is called on this device and "struct swap_info_struct
*p" is assigned. This is done under swap_lock however this lock is
released for the call try_to_unuse().

2. swapon:
After the assignment above (and before acquiring swapon_mutex &
swap_lock by swapoff) the swapon is called on the same device.
The p->old_block_size is assigned to the value of block_size the device.
This block size should be the same as previous but sometimes it is not.
The swapon ends successfully.

3. swapoff:
Swapoff resumes, grabs the locks and mutex and continues to disable this
swap device. Now it sets the block size to value taken from swap_info
which was overwritten by swapon in 2.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reported-by: Weijie Yang <weijie.yang.kh@gmail.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-16 21:35:53 -07:00
Shaohua Li
ebc2a1a691 swap: make cluster allocation per-cpu
swap cluster allocation is to get better request merge to improve
performance.  But the cluster is shared globally, if multiple tasks are
doing swap, this will cause interleave disk access.  While multiple tasks
swap is quite common, for example, each numa node has a kswapd thread
doing swap and multiple threads/processes doing direct page reclaim.

ioscheduler can't help too much here, because tasks don't send swapout IO
down to block layer in the meantime.  Block layer does merge some IOs, but
a lot not, depending on how many tasks are doing swapout concurrently.  In
practice, I've seen a lot of small size IO in swapout workloads.

We makes the cluster allocation per-cpu here.  The interleave disk access
issue goes away.  All tasks swapout to their own cluster, so swapout will
become sequential, which can be easily merged to big size IO.  If one CPU
can't get its per-cpu cluster (for example, there is no free cluster
anymore in the swap), it will fallback to scan swap_map.  The CPU can
still continue swap.  We don't need recycle free swap entries of other
CPUs.

In my test (swap to a 2-disk raid0 partition), this improves around 10%
swapout throughput, and request size is increased significantly.

How does this impact swap readahead is uncertain though.  On one side,
page reclaim always isolates and swaps several adjancent pages, this will
make page reclaim write the pages sequentially and benefit readahead.  On
the other side, several CPU write pages interleave means the pages don't
live _sequentially_ but relatively _near_.  In the per-cpu allocation
case, if adjancent pages are written by different cpus, they will live
relatively _far_.  So how this impacts swap readahead depends on how many
pages page reclaim isolates and swaps one time.  If the number is big,
this patch will benefit swap readahead.  Of course, this is about
sequential access pattern.  The patch has no impact for random access
pattern, because the new cluster allocation algorithm is just for SSD.

Alternative solution is organizing swap layout to be per-mm instead of
this per-cpu approach.  In the per-mm layout, we allocate a disk range for
each mm, so pages of one mm live in swap disk adjacently.  per-mm layout
has potential issues of lock contention if multiple reclaimers are swap
pages from one mm.  For a sequential workload, per-mm layout is better to
implement swap readahead, because pages from the mm are adjacent in disk.
But per-cpu layout isn't very bad in this workload, as page reclaim always
isolates and swaps several pages one time, such pages will still live in
disk sequentially and readahead can utilize this.  For a random workload,
per-mm layout isn't beneficial of request merge, because it's quite
possible pages from different mm are swapout in the meantime and IO can't
be merged in per-mm layout.  while with per-cpu layout we can merge
requests from any mm.  Considering random workload is more popular in
workloads with swap (and per-cpu approach isn't too bad for sequential
workload too), I'm choosing per-cpu layout.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:17 -07:00
Shaohua Li
edfe23dac3 swap: fix races exposed by swap discard
The previous patch can expose races, according to Hugh:

swapoff was sometimes failing with "Cannot allocate memory", coming from
try_to_unuse()'s -ENOMEM: it needs to allow for swap_duplicate() failing
on a free entry temporarily SWAP_MAP_BAD while being discarded.

We should use ACCESS_ONCE() there, and whenever accessing swap_map
locklessly; but rather than peppering it throughout try_to_unuse(), just
declare *swap_map with volatile.

try_to_unuse() is accustomed to *swap_map going down racily, but not
necessarily to it jumping up from 0 to SWAP_MAP_BAD: we'll be safer to
prevent that transition once SWP_WRITEOK is switched off, when it's a
waste of time to issue discards anyway (swapon can do a whole discard).

Another issue is:

In swapin_readahead(), read_swap_cache_async() can read a bad swap entry,
because we don't check if readahead swap entry is bad.  This doesn't break
anything but such swapin page is wasteful and can only be freed at page
reclaim.  We should avoid read such swap entry.  And in discard, we mark
swap entry SWAP_MAP_BAD and then switch it to normal when discard is
finished.  If readahead reads such swap entry, we have the same issue, so
we much check if swap entry is bad too.

Thanks Hugh to inspire swapin_readahead could use bad swap entry.

[include Hugh's patch 'swap: fix swapoff ENOMEMs from discard']
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:16 -07:00
Shaohua Li
815c2c543d swap: make swap discard async
swap can do cluster discard for SSD, which is good, but there are some
problems here:

1. swap do the discard just before page reclaim gets a swap entry and
   writes the disk sectors.  This is useless for high end SSD, because an
   overwrite to a sector implies a discard to original sector too.  A
   discard + overwrite == overwrite.

2. the purpose of doing discard is to improve SSD firmware garbage
   collection.  Idealy we should send discard as early as possible, so
   firmware can do something smart.  Sending discard just after swap entry
   is freed is considered early compared to sending discard before write.
   Of course, if workload is already bound to gc speed, sending discard
   earlier or later doesn't make

3. block discard is a sync API, which will delay scan_swap_map()
   significantly.

4. Write and discard command can be executed parallel in PCIe SSD.
   Making swap discard async can make execution more efficiently.

This patch makes swap discard async and moves discard to where swap entry
is freed.  Discard and write have no dependence now, so above issues can
be avoided.  Idealy we should do discard for any freed sectors, but some
SSD discard is very slow.  This patch still does discard for a whole
cluster.

My test does a several round of 'mmap, write, unmap', which will trigger a
lot of swap discard.  In a fusionio card, with this patch, the test
runtime is reduced to 18% of the time without it, so around 5.5x faster.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:15 -07:00
Shaohua Li
2a8f944934 swap: change block allocation algorithm for SSD
I'm using a fast SSD to do swap.  scan_swap_map() sometimes uses up to
20~30% CPU time (when cluster is hard to find, the CPU time can be up to
80%), which becomes a bottleneck.  scan_swap_map() scans a byte array to
search a 256 page cluster, which is very slow.

Here I introduced a simple algorithm to search cluster.  Since we only
care about 256 pages cluster, we can just use a counter to track if a
cluster is free.  Every 256 pages use one int to store the counter.  If
the counter of a cluster is 0, the cluster is free.  All free clusters
will be added to a list, so searching cluster is very efficient.  With
this, scap_swap_map() overhead disappears.

This might help low end SD card swap too.  Because if the cluster is
aligned, SD firmware can do flash erase more efficiently.

We only enable the algorithm for SSD.  Hard disk swap isn't fast enough
and has downside with the algorithm which might introduce regression (see
below).

The patch slightly changes which cluster is choosen.  It always adds free
cluster to list tail.  This can help wear leveling for low end SSD too.
And if no cluster found, the scan_swap_map() will do search from the end
of last cluster.  So if no cluster found, the scan_swap_map() will do
search from the end of last free cluster, which is random.  For SSD, this
isn't a problem at all.

Another downside is the cluster must be aligned to 256 pages, which will
reduce the chance to find a cluster.  I would expect this isn't a big
problem for SSD because of the non-seek penality.  (And this is the reason
I only enable the algorithm for SSD).

Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:15 -07:00
Andrew Morton
465c47fd8d mm/swapfile.c: convert to pr_foo()
A few 80-col gymnastics were cleaned up as a result.

Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:03 -07:00
Raymond Jennings
d6bbbd29b1 swap: warn when a swap area overflows the maximum size
It is possible to swapon a swap area that is too big for the pte width
to handle.

Presently this failure happens silently.

Instead, emit a diagnostic to warn the user.

Testing results, root prompt commands and kernel log messages:

# lvresize /dev/system/swap --size 16G
# mkswap /dev/system/swap
# swapon /dev/system/swap

Jul  7 04:27:22 warfang kernel: Adding 16777212k swap
on /dev/mapper/system-swap.  Priority:-1 extents:1 across:16777212k

# lvresize /dev/system/swap --size 64G
# mkswap /dev/system/swap
# swapon /dev/system/swap

Jul  7 04:27:22 warfang kernel: Truncating oversized swap area, only
using 33554432k out of 67108860k
Jul  7 04:27:22 warfang kernel: Adding 33554428k swap
on /dev/mapper/system-swap.  Priority:-1 extents:1 across:33554428k

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Raymond Jennings <shentino@gmail.com>
Acked-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:00 -07:00
Cyrill Gorcunov
179ef71cbc mm: save soft-dirty bits on swapped pages
Andy Lutomirski reported that if a page with _PAGE_SOFT_DIRTY bit set
get swapped out, the bit is getting lost and no longer available when
pte read back.

To resolve this we introduce _PTE_SWP_SOFT_DIRTY bit which is saved in
pte entry for the page being swapped out.  When such page is to be read
back from a swap cache we check for bit presence and if it's there we
clear it and restore the former _PAGE_SOFT_DIRTY bit back.

One of the problem was to find a place in pte entry where we can save
the _PTE_SWP_SOFT_DIRTY bit while page is in swap.  The _PAGE_PSE was
chosen for that, it doesn't intersect with swap entry format stored in
pte.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:47 -07:00
Rafael Aquini
dcf6b7ddd7 swap: discard while swapping only if SWAP_FLAG_DISCARD_PAGES
Considering the use cases where the swap device supports discard:
a) and can do it quickly;
b) but it's slow to do in small granularities (or concurrent with other
   I/O);
c) but the implementation is so horrendous that you don't even want to
   send one down;

And assuming that the sysadmin considers it useful to send the discards down
at all, we would (probably) want the following solutions:

  i. do the fine-grained discards for freed swap pages, if device is
     capable of doing so optimally;
 ii. do single-time (batched) swap area discards, either at swapon
     or via something like fstrim (not implemented yet);
iii. allow doing both single-time and fine-grained discards; or
 iv. turn it off completely (default behavior)

As implemented today, one can only enable/disable discards for swap, but
one cannot select, for instance, solution (ii) on a swap device like (b)
even though the single-time discard is regarded to be interesting, or
necessary to the workload because it would imply (1), and the device is
not capable of performing it optimally.

This patch addresses the scenario depicted above by introducing a way to
ensure the (probably) wanted solutions (i, ii, iii and iv) can be flexibly
flagged through swapon(8) to allow a sysadmin to select the best suitable
swap discard policy accordingly to system constraints.

This patch introduces SWAP_FLAG_DISCARD_PAGES and SWAP_FLAG_DISCARD_ONCE
new flags to allow more flexibe swap discard policies being flagged
through swapon(8).  The default behavior is to keep both single-time, or
batched, area discards (SWAP_FLAG_DISCARD_ONCE) and fine-grained discards
for page-clusters (SWAP_FLAG_DISCARD_PAGES) enabled, in order to keep
consistentcy with older kernel behavior, as well as maintain compatibility
with older swapon(8).  However, through the new introduced flags the best
suitable discard policy can be selected accordingly to any given swap
device constraint.

[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Karel Zak <kzak@redhat.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:32 -07:00
Akinobu Mita
7b57976da4 frontswap: fix incorrect zeroing and allocation size for frontswap_map
The bitmap accessed by bitops must have enough size to hold the required
numbers of bits rounded up to a multiple of BITS_PER_LONG.  And the
bitmap must not be zeroed by memset() if the number of bits cleared is
not a multiple of BITS_PER_LONG.

This fixes incorrect zeroing and allocation size for frontswap_map.  The
incorrect zeroing part doesn't cause any problem because frontswap_map
is freed just after zeroing.  But the wrongly calculated allocation size
may cause the problem.

For 32bit systems, the allocation size of frontswap_map is about twice
as large as required size.  For 64bit systems, the allocation size is
smaller than requeired if the number of bits is not a multiple of
BITS_PER_LONG.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-12 16:29:46 -07:00
Minchan Kim
4f89849da2 frontswap: get rid of swap_lock dependency
Frontswap initialization routine depends on swap_lock, which want to be
atomic about frontswap's first appearance.  IOW, frontswap is not present
and will fail all calls OR frontswap is fully functional but if new
swap_info_struct isn't registered by enable_swap_info, swap subsystem
doesn't start I/O so there is no race between init procedure and page I/O
working on frontswap.

So let's remove unnecessary swap_lock dependency.

Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
[v1: Rebased on my branch, reworked to work with backends loading late]
[v2: Added a check for !map]
[v3: Made the invalidate path follow the init path]
[v4: Address comments by Wanpeng Li <liwanp@linux.vnet.ibm.com>]
Signed-off-by: Konrad Rzeszutek Wilk <konrad@darnok.org>
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Andor Daam <andor.daam@googlemail.com>
Cc: Florian Schmaus <fschmaus@gmail.com>
Cc: Stefan Hengelein <ilendir@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:00 -07:00
Akinobu Mita
d3d30417d3 mm/: rename random32() to prandom_u32()
Use preferable function name which implies using a pseudo-random
number generator.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 18:28:42 -07:00
Linus Torvalds
d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Hugh Dickins
9e16b7fb1d mm,ksm: swapoff might need to copy
Before establishing that KSM page migration was the cause of my
WARN_ON_ONCE(page_mapped(page))s, I suspected that they came from the
lack of a ksm_might_need_to_copy() in swapoff's unuse_pte() - which in
many respects is equivalent to faulting in a page.

In fact I've never caught that as the cause: but in theory it does at
least need the KSM_RUN_UNMERGE check in ksm_might_need_to_copy(), to
avoid bringing a KSM page back in when it's not supposed to be.

I intended to copy how it's done in do_swap_page(), but have a strong
aversion to how "swapcache" ends up being used there: rework it with
"page != swapcache".

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Petr Holasek <pholasek@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Izik Eidus <izik.eidus@ravellosystems.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:23 -08:00
Shaohua Li
ec8acf20af swap: add per-partition lock for swapfile
swap_lock is heavily contended when I test swap to 3 fast SSD (even
slightly slower than swap to 2 such SSD).  The main contention comes
from swap_info_get().  This patch tries to fix the gap with adding a new
per-partition lock.

Global data like nr_swapfiles, total_swap_pages, least_priority and
swap_list are still protected by swap_lock.

nr_swap_pages is an atomic now, it can be changed without swap_lock.  In
theory, it's possible get_swap_page() finds no swap pages but actually
there are free swap pages.  But sounds not a big problem.

Accessing partition specific data (like scan_swap_map and so on) is only
protected by swap_info_struct.lock.

Changing swap_info_struct.flags need hold swap_lock and
swap_info_struct.lock, because scan_scan_map() will check it.  read the
flags is ok with either the locks hold.

If both swap_lock and swap_info_struct.lock must be hold, we always hold
the former first to avoid deadlock.

swap_entry_free() can change swap_list.  To delete that code, we add a
new highest_priority_index.  Whenever get_swap_page() is called, we
check it.  If it's valid, we use it.

It's a pity get_swap_page() still holds swap_lock().  But in practice,
swap_lock() isn't heavily contended in my test with this patch (or I can
say there are other much more heavier bottlenecks like TLB flush).  And
BTW, looks get_swap_page() doesn't really need the lock.  We never free
swap_info[] and we check SWAP_WRITEOK flag.  The only risk without the
lock is we could swapout to some low priority swap, but we can quickly
recover after several rounds of swap, so sounds not a big deal to me.
But I'd prefer to fix this if it's a real problem.

"swap: make each swap partition have one address_space" improved the
swapout speed from 1.7G/s to 2G/s.  This patch further improves the
speed to 2.3G/s, so around 15% improvement.  It's a multi-process test,
so TLB flush isn't the biggest bottleneck before the patches.

[arnd@arndb.de: fix it for nommu]
[hughd@google.com: add missing unlock]
[minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:17 -08:00
Shaohua Li
33806f06da swap: make each swap partition have one address_space
When I use several fast SSD to do swap, swapper_space.tree_lock is
heavily contended.  This makes each swap partition have one
address_space to reduce the lock contention.  There is an array of
address_space for swap.  The swap entry type is the index to the array.

In my test with 3 SSD, this increases the swapout throughput 20%.

[akpm@linux-foundation.org: revert unneeded change to  __add_to_swap_cache]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:17 -08:00
Al Viro
496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
David Rientjes
e1e12d2f31 mm, oom: fix race when specifying a thread as the oom origin
test_set_oom_score_adj() and compare_swap_oom_score_adj() are used to
specify that current should be killed first if an oom condition occurs in
between the two calls.

The usage is

	short oom_score_adj = test_set_oom_score_adj(OOM_SCORE_ADJ_MAX);
	...
	compare_swap_oom_score_adj(OOM_SCORE_ADJ_MAX, oom_score_adj);

to store the thread's oom_score_adj, temporarily change it to the maximum
score possible, and then restore the old value if it is still the same.

This happens to still be racy, however, if the user writes
OOM_SCORE_ADJ_MAX to /proc/pid/oom_score_adj in between the two calls.
The compare_swap_oom_score_adj() will then incorrectly reset the old value
prior to the write of OOM_SCORE_ADJ_MAX.

To fix this, introduce a new oom_flags_t member in struct signal_struct
that will be used for per-thread oom killer flags.  KSM and swapoff can
now use a bit in this member to specify that threads should be killed
first in oom conditions without playing around with oom_score_adj.

This also allows the correct oom_score_adj to always be shown when reading
/proc/pid/oom_score.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
David Rientjes
a9c58b907d mm, oom: change type of oom_score_adj to short
The maximum oom_score_adj is 1000 and the minimum oom_score_adj is -1000,
so this range can be represented by the signed short type with no
functional change.  The extra space this frees up in struct signal_struct
will be used for per-thread oom kill flags in the next patch.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
Cesar Eduardo Barros
6555bc0357 mm: do not call frontswap_init() during swapoff
The call to frontswap_init() was added within enable_swap_info(), which
was called not only during sys_swapon, but also to reinsert the swap_info
into the swap_list in case of failure of try_to_unuse() within
sys_swapoff.  This means that frontswap_init() might be called more than
once for the same swap area.

While as far as I could see no frontswap implementation has any problem
with it (and in fact, all the ones I found ignore the parameter passed to
frontswap_init), this could change in the future.

To prevent future problems, move the call to frontswap_init() to outside
the code shared between sys_swapon and sys_swapoff.

Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:24 -08:00