Commit graph

9884 commits

Author SHA1 Message Date
Linux Build Service Account
2b4e8cbd34 Merge "Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4"" 2016-08-29 00:49:25 -07:00
Trilok Soni
5ab1e18aa3 Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4"
This reverts commit 9d6fd2c3e9 ("Merge remote-tracking branch
'msm-4.4/tmp-510d0a3f' into msm-4.4"), because it breaks the
dump parsing tools due to kernel can be loaded anywhere in the memory
now and not fixed at linear mapping.

Change-Id: Id416f0a249d803442847d09ac47781147b0d0ee6
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-08-26 14:34:05 -07:00
Vinayak Menon
47b41f43ee mm: zbud: fix the locking scenarios with zcache
With zcache using zbud, strange locking scenarios are
observed. The first problem seen is:

Core 2 waiting on mapping->tree_lock which is taken by core 6
do_raw_spin_lock

raw_spin_lock_irq
atomic_cmpxchg
page_freeze_refs
__remove_mapping
shrink_page_list

Core 6 after taking mapping->tree_lock is waiting on zbud pool lock
which is held by core 5
zbud_alloc
zcache_store_page
__cleancache_put_page
cleancache_put_page
__delete_from_page_cache
spin_unlock_irq
__remove_mapping
shrink_page_list
shrink_inactive_list

Core 5 after taking zbud pool lock from zbud_free received an IRQ, and
after IRQ exit, softirqs were scheduled and end_page_writeback tried to
lock on mapping->tree_lock which is already held by Core 6. Deadlock.
do_raw_spin_lock
raw_spin_lock_irqsave
test_clear_page_writeba
end_page_writeback
ext4_finish_bio
ext4_end_bio
bio_endio
blk_update_request
end_clone_bio
bio_endio
blk_update_request
blk_update_bidi_request
blk_end_bidi_request
blk_end_request
mmc_blk_cmdq_complete_r
mmc_cmdq_softirq_done
blk_done_softirq
static_key_count
static_key_false
trace_softirq_exit
__do_softirq()
tick_irq_exit
irq_exit()
set_irq_regs
__handle_domain_irq
gic_handle_irq
el1_irq
exception
__list_del_entry
list_del
zbud_free
zcache_load_page
__cleancache_get_page(?

This shows that allowing softirqs while holding zbud pool lock
can result in deadlocks. To fix this, 'commit 6a1fdaa36272
("mm: zbud: prevent softirq during zbud alloc, free and reclaim")'
decided to take spin_lock_bh during zbud_free, zbud_alloc and
zbud_reclaim. But this resulted in another deadlock.

spin_bug()
do_raw_spin_lock()
_raw_spin_lock_irqsave()
test_clear_page_writeback()
end_page_writeback()
ext4_finish_bio()
ext4_end_bio()
bio_endio()
blk_update_request()
end_clone_bio()
bio_endio()
blk_update_request()
blk_update_bidi_request()
blk_end_request()
mmc_blk_cmdq_complete_rq()
mmc_cmdq_softirq_done()
blk_done_softirq()
__do_softirq()
do_softirq()
__local_bh_enable_ip()
_raw_spin_unlock_bh()
zbud_alloc()
zcache_store_page()
__cleancache_put_page()
__delete_from_page_cache()
__remove_mapping()
shrink_page_list()

Here, the spin_unlock_bh resulted in explicit invocation of do_sofirq,
which resulted in the acquisition of mapping->tree_lock which was already
taken by __remove_mapping.

The new fix considers the following facts.
1) zcache_store_page is always be called from __delete_from_page_cache
with mapping->tree_lock held and interrupts disabled. Thus zbud_alloc
which is called only from zcache_store_page is always called with
interrupts disabled.
2) zbud_free and zbud_reclaim_page can be called from zcache with or
without interrupts disabled. So an interrupt while holding zbud pool
lock can result in do_softirq and acquisition of mapping->tree_lock.

(1) implies zbud_alloc need not explicitly disable bh. But disable
interrupts to make sure zbud_alloc is safe with zcache, irrespective
of future changes. This will fix the second scenario.
(2) implies zbud_free and zbud_reclaim_page should use spin_lock_irqsave,
so that interrupts will not be triggered and inturn softirqs.
spin_lock_bh can't be used because a spin_unlock_bh can triger a softirq
even in interrupt context. This will fix the first scenario.

Change-Id: Ibc810525dddf97614db41643642fec7472bd6a2c
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-08-25 11:49:45 +05:30
Trilok Soni
9d6fd2c3e9 Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4
* msm-4.4/tmp-510d0a3f:
  Linux 4.4.11
  nf_conntrack: avoid kernel pointer value leak in slab name
  drm/radeon: fix DP link training issue with second 4K monitor
  drm/i915/bdw: Add missing delay during L3 SQC credit programming
  drm/i915: Bail out of pipe config compute loop on LPT
  drm/radeon: fix PLL sharing on DCE6.1 (v2)
  Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing"
  Input: max8997-haptic - fix NULL pointer dereference
  get_rock_ridge_filename(): handle malformed NM entries
  tools lib traceevent: Do not reassign parg after collapse_tree()
  qla1280: Don't allocate 512kb of host tags
  atomic_open(): fix the handling of create_error
  regulator: axp20x: Fix axp22x ldo_io voltage ranges
  regulator: s2mps11: Fix invalid selector mask and voltages for buck9
  workqueue: fix rebind bound workers warning
  ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC
  vfs: rename: check backing inode being equal
  vfs: add vfs_select_inode() helper
  perf/core: Disable the event on a truncated AUX record
  regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case
  pinctrl: at91-pio4: fix pull-up/down logic
  spi: spi-ti-qspi: Handle truncated frames properly
  spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden
  spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT
  ALSA: hda - Fix broken reconfig
  ALSA: hda - Fix white noise on Asus UX501VW headset
  ALSA: hda - Fix subwoofer pin on ASUS N751 and N551
  ALSA: usb-audio: Yet another Phoneix Audio device quirk
  ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)
  crypto: testmgr - Use kmalloc memory for RSA input
  crypto: hash - Fix page length clamping in hash walk
  crypto: qat - fix invalid pf2vf_resp_wq logic
  s390/mm: fix asce_bits handling with dynamic pagetable levels
  zsmalloc: fix zs_can_compact() integer overflow
  ocfs2: fix posix_acl_create deadlock
  ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang
  net/route: enforce hoplimit max value
  tcp: refresh skb timestamp at retransmit time
  net: thunderx: avoid exposing kernel stack
  net: fix a kernel infoleak in x25 module
  uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h MIME-Version: 1.0
  bridge: fix igmp / mld query parsing
  net: bridge: fix old ioctl unlocked net device walk
  VSOCK: do not disconnect socket when peer has shutdown SEND only
  net/mlx4_en: Fix endianness bug in IPV6 csum calculation
  net: fix infoleak in rtnetlink
  net: fix infoleak in llc
  net: fec: only clear a queue's work bit if the queue was emptied
  netem: Segment GSO packets on enqueue
  sch_dsmark: update backlog as well
  sch_htb: update backlog as well
  net_sched: update hierarchical backlog too
  net_sched: introduce qdisc_replace() helper
  gre: do not pull header in ICMP error processing
  net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
  samples/bpf: fix trace_output example
  bpf: fix check_map_func_compatibility logic
  bpf: fix refcnt overflow
  bpf: fix double-fdput in replace_map_fd_with_map_ptr()
  net/mlx4_en: fix spurious timestamping callbacks
  ipv4/fib: don't warn when primary address is missing if in_dev is dead
  net/mlx5e: Fix minimum MTU
  net/mlx5e: Device's mtu field is u16 and not int
  openvswitch: use flow protocol when recalculating ipv6 checksums
  atl2: Disable unimplemented scatter/gather feature
  vlan: pull on __vlan_insert_tag error path and fix csum correction
  net: use skb_postpush_rcsum instead of own implementations
  cdc_mbim: apply "NDP to end" quirk to all Huawei devices
  bpf/verifier: reject invalid LD_ABS | BPF_DW instruction
  net: sched: do not requeue a NULL skb
  packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface
  route: do not cache fib route info on local routes with oif
  decnet: Do not build routes to devices without decnet private data.
  parisc: Use generic extable search and sort routines
  arm64: kasan: Use actual memory node when populating the kernel image shadow
  arm64: mm: treat memstart_addr as a signed quantity
  arm64: lse: deal with clobbered IP registers after branch via PLT
  arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly
  arm64: kasan: Fix zero shadow mapping overriding kernel image shadow
  arm64: consistently use p?d_set_huge
  arm64: fix KASLR boot-time I-cache maintenance
  arm64: hugetlb: partial revert of 66b3923a1a0f
  arm64: make irq_stack_ptr more robust
  arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness
  efi: stub: use high allocation for converted command line
  efi: stub: add implementation of efi_random_alloc()
  efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL
  arm64: kaslr: randomize the linear region
  arm64: add support for kernel ASLR
  arm64: add support for building vmlinux as a relocatable PIE binary
  arm64: switch to relative exception tables
  extable: add support for relative extables to search and sort routines
  scripts/sortextable: add support for ET_DYN binaries
  arm64: futex.h: Add missing PAN toggling
  arm64: make asm/elf.h available to asm files
  arm64: avoid dynamic relocations in early boot code
  arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
  arm64: add support for module PLTs
  arm64: move brk immediate argument definitions to separate header
  arm64: mm: use bit ops rather than arithmetic in pa/va translations
  arm64: mm: only perform memstart_addr sanity check if DEBUG_VM
  arm64: User die() instead of panic() in do_page_fault()
  arm64: allow kernel Image to be loaded anywhere in physical memory
  arm64: defer __va translation of initrd_start and initrd_end
  arm64: move kernel image to base of vmalloc area
  arm64: kvm: deal with kernel symbols outside of linear mapping
  arm64: decouple early fixmap init from linear mapping
  arm64: pgtable: implement static [pte|pmd|pud]_offset variants
  arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  arm64: add support for ioremap() block mappings
  arm64: prevent potential circular header dependencies in asm/bug.h
  of/fdt: factor out assignment of initrd_start/initrd_end
  of/fdt: make memblock minimum physical address arch configurable
  arm64: Remove the get_thread_info() function
  arm64: kernel: Don't toggle PAN on systems with UAO
  arm64: cpufeature: Test 'matches' pointer to find the end of the list
  arm64: kernel: Add support for User Access Override
  arm64: add ARMv8.2 id_aa64mmfr2 boiler plate
  arm64: cpufeature: Change read_cpuid() to use sysreg's mrs_s macro
  arm64: use local label prefixes for __reg_num symbols
  arm64: vdso: Mark vDSO code as read-only
  arm64: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL
  arm64: ptdump: Indicate whether memory should be faulting
  arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
  arm64: Drop alloc function from create_mapping
  arm64: prefetch: add missing #include for spin_lock_prefetch
  arm64: lib: patch in prfm for copy_page if requested
  arm64: lib: improve copy_page to deal with 128 bytes at a time
  arm64: prefetch: add alternative pattern for CPUs without a prefetcher
  arm64: prefetch: don't provide spin_lock_prefetch with LSE
  arm64: allow vmalloc regions to be set with set_memory_*
  arm64: kernel: implement ACPI parking protocol
  arm64: mm: create new fine-grained mappings at boot
  arm64: ensure _stext and _etext are page-aligned
  arm64: mm: allow passing a pgdir to alloc_init_*
  arm64: mm: allocate pagetables anywhere
  arm64: mm: use fixmap when creating page tables
  arm64: mm: add functions to walk tables in fixmap
  arm64: mm: add __{pud,pgd}_populate
  arm64: mm: avoid redundant __pa(__va(x))
  arm64: mm: add functions to walk page tables by PA
  arm64: mm: move pte_* macros
  arm64: kasan: avoid TLB conflicts
  arm64: mm: add code to safely replace TTBR1_EL1
  arm64: add function to install the idmap
  arm64: unmap idmap earlier
  arm64: unify idmap removal
  arm64: mm: place empty_zero_page in bss
  arm64: mm: specialise pagetable allocators
  asm-generic: Fix local variable shadow in __set_fixmap_offset
  Eliminate the .eh_frame sections from the aarch64 vmlinux and kernel modules
  arm64: Fix an enum typo in mm/dump.c
  arm64: kasan: ensure that the KASAN zero page is mapped read-only
  arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP
  arm64: hide __efistub_ aliases from kallsyms
  Linux 4.4.10
  drm/i915/skl: Fix DMC load on Skylake J0 and K0
  lib/test-string_helpers.c: fix and improve string_get_size() tests
  ACPI / processor: Request native thermal interrupt handling via _OSC
  drm/i915: Fake HDMI live status
  drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
  drm/i915: Fix eDP low vswing for Broadwell
  drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  drm/radeon: make sure vertical front porch is at least 1
  iio: ak8975: fix maybe-uninitialized warning
  iio: ak8975: Fix NULL pointer exception on early interrupt
  drm/amdgpu: set metadata pointer to NULL after freeing.
  drm/amdgpu: make sure vertical front porch is at least 1
  gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
  nvmem: mxs-ocotp: fix buffer overflow in read
  USB: serial: cp210x: add Straizona Focusers device ids
  USB: serial: cp210x: add ID for Link ECU
  ata: ahci-platform: Add ports-implemented DT bindings.
  libahci: save port map for forced port map
  powerpc: Fix bad inline asm constraint in create_zero_mask()
  ACPICA: Dispatcher: Update thread ID for recursive method calls
  x86/sysfb_efi: Fix valid BAR address range check
  ARC: Add missing io barriers to io{read,write}{16,32}be()
  ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
  propogate_mnt: Handle the first propogated copy being a slave
  fs/pnode.c: treat zero mnt_group_id-s as unequal
  x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
  MAINTAINERS: Remove asterisk from EFI directory names
  writeback: Fix performance regression in wb_over_bg_thresh()
  batman-adv: Reduce refcnt of removed router when updating route
  batman-adv: Fix broadcast/ogm queue limit on a removed interface
  batman-adv: Check skb size before using encapsulated ETH+VLAN header
  batman-adv: fix DAT candidate selection (must use vid)
  mm: update min_free_kbytes from khugepaged after core initialization
  proc: prevent accessing /proc/<PID>/environ until it's ready
  Input: zforce_ts - fix dual touch recognition
  HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk
  HID: wacom: Add support for DTK-1651
  xen/evtchn: fix ring resize when binding new events
  xen/balloon: Fix crash when ballooning on x86 32 bit PAE
  xen: Fix page <-> pfn conversion on 32 bit systems
  ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel
  ARM: EXYNOS: Properly skip unitialized parent clock in power domain on
  mm/zswap: provide unique zpool name
  mm, cma: prevent nr_isolated_* counters from going negative
  Minimal fix-up of bad hashing behavior of hash_64()
  MD: make bio mergeable
  tracing: Don't display trigger file for events that can't be enabled
  mac80211: fix statistics leak if dev_alloc_name() fails
  ath9k: ar5008_hw_cmn_spur_mitigate: add missing mask_m & mask_p initialisation
  lpfc: fix misleading indentation
  clk: qcom: msm8960: Fix ce3_src register offset
  clk: versatile: sp810: support reentrance
  clk: qcom: msm8960: fix ce3_core clk enable register
  clk: meson: Fix meson_clk_register_clks() signature type mismatch
  clk: rockchip: free memory in error cases when registering clock branches
  soc: rockchip: power-domain: fix err handle while probing
  clk-divider: make sure read-only dividers do not write to their register
  CNS3xxx: Fix PCI cns3xxx_write_config()
  mwifiex: fix corner case association failure
  ata: ahci_xgene: dereferencing uninitialized pointer in probe
  nbd: ratelimit error msgs after socket close
  mfd: intel-lpss: Remove clock tree on error path
  ipvs: drop first packet to redirect conntrack
  ipvs: correct initial offset of Call-ID header search in SIP persistence engine
  ipvs: handle ip_vs_fill_iph_skb_off failure
  RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
  Revert: "powerpc/tm: Check for already reclaimed tasks"
  arm64: head.S: use memset to clear BSS
  efi: stub: define DISABLE_BRANCH_PROFILING for all architectures
  arm64: entry: remove pointless SPSR mode check
  arm64: mm: move pgd_cache initialisation to pgtable_cache_init
  arm64: module: avoid undefined shift behavior in reloc_data()
  arm64: module: fix relocation of movz instruction with negative immediate
  arm64: traps: address fallout from printk -> pr_* conversion
  arm64: ftrace: fix a stack tracer's output under function graph tracer
  arm64: pass a task parameter to unwind_frame()
  arm64: ftrace: modify a stack frame in a safe way
  arm64: remove irq_count and do_softirq_own_stack()
  arm64: hugetlb: add support for PTE contiguous bit
  arm64: Use PoU cache instr for I/D coherency
  arm64: Defer dcache flush in __cpu_copy_user_page
  arm64: reduce stack use in irq_handler
  arm64: Documentation: add list of software workarounds for errata
  arm64: mm: place __cpu_setup in .text
  arm64: cmpxchg: Don't incldue linux/mmdebug.h
  arm64: mm: fold alternatives into .init
  arm64: Remove redundant padding from linker script
  arm64: mm: remove pointless PAGE_MASKing
  arm64: don't call C code with el0's fp register
  arm64: when walking onto the task stack, check sp & fp are in current->stack
  arm64: Add this_cpu_ptr() assembler macro for use in entry.S
  arm64: irq: fix walking from irq stack to task stack
  arm64: Add do_softirq_own_stack() and enable irq_stacks
  arm64: Modify stack trace and dump for use with irq_stack
  arm64: Store struct thread_info in sp_el0
  arm64: Add trace_hardirqs_off annotation in ret_to_user
  arm64: ftrace: fix the comments for ftrace_modify_code
  arm64: ftrace: stop using kstop_machine to enable/disable tracing
  arm64: spinlock: serialise spin_unlock_wait against concurrent lockers
  arm64: enable HAVE_IRQ_TIME_ACCOUNTING
  arm64: fix COMPAT_SHMLBA definition for large pages
  arm64: add __init/__initdata section marker to some functions/variables
  arm64: pgtable: implement pte_accessible()
  arm64: mm: allow sections for unaligned bases
  arm64: mm: detect bad __create_mapping uses
  Linux 4.4.9
  extcon: max77843: Use correct size for reading the interrupt register
  stm class: Select CONFIG_SRCU
  megaraid_sas: add missing curly braces in ioctl handler
  sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race
  thermal: rockchip: fix a impossible condition caused by the warning
  unbreak allmodconfig KCONFIG_ALLCONFIG=...
  jme: Fix device PM wakeup API usage
  jme: Do not enable NIC WoL functions on S0
  bus: imx-weim: Take the 'status' property value into account
  ARM: dts: pxa: fix dma engine node to pxa3xx-nand
  ARM: dts: armada-375: use armada-370-sata for SATA
  ARM: EXYNOS: select THERMAL_OF
  ARM: prima2: always enable reset controller
  ARM: OMAP3: Add cpuidle parameters table for omap3430
  ext4: fix races of writeback with punch hole and zero range
  ext4: fix races between buffered IO and collapse / insert range
  ext4: move unlocked dio protection from ext4_alloc_file_blocks()
  ext4: fix races between page faults and hole punching
  perf stat: Document --detailed option
  perf tools: handle spaces in file names obtained from /proc/pid/maps
  perf hists browser: Only offer symbol scripting when a symbol is under the cursor
  mtd: nand: Drop mtd.owner requirement in nand_scan
  mtd: brcmnand: Fix v7.1 register offsets
  mtd: spi-nor: remove micron_quad_enable()
  serial: sh-sci: Remove cpufreq notifier to fix crash/deadlock
  ext4: fix NULL pointer dereference in ext4_mark_inode_dirty()
  x86/mm/kmmio: Fix mmiotrace for hugepages
  perf evlist: Reference count the cpu and thread maps at set_maps()
  drivers/misc/ad525x_dpot: AD5274 fix RDAC read back errors
  rtc: max77686: Properly handle regmap_irq_get_virq() error code
  rtc: rx8025: remove rv8803 id
  rtc: ds1685: passing bogus values to irq_restore
  rtc: vr41xx: Wire up alarm_irq_enable
  rtc: hym8563: fix invalid year calculation
  PM / Domains: Fix removal of a subdomain
  PM / OPP: Initialize u_volt_min/max to a valid value
  misc: mic/scif: fix wrap around tests
  misc/bmp085: Enable building as a module
  lib/mpi: Endianness fix
  fbdev: da8xx-fb: fix videomodes of lcd panels
  scsi_dh: force modular build if SCSI is a module
  paride: make 'verbose' parameter an 'int' again
  regulator: s5m8767: fix get_register() error handling
  irqchip/mxs: Fix error check of of_io_request_and_map()
  irqchip/sunxi-nmi: Fix error check of of_io_request_and_map()
  spi/rockchip: Make sure spi clk is on in rockchip_spi_set_cs
  locking/mcs: Fix mcs_spin_lock() ordering
  regulator: core: Fix nested locking of supplies
  regulator: core: Ensure we lock all regulators
  regulator: core: fix regulator_lock_supply regression
  Revert "regulator: core: Fix nested locking of supplies"
  videobuf2-v4l2: Verify planes array in buffer dequeueing
  videobuf2-core: Check user space planes array in dqbuf
  USB: usbip: fix potential out-of-bounds write
  cgroup: make sure a parent css isn't freed before its children
  mm/hwpoison: fix wrong num_poisoned_pages accounting
  mm: vmscan: reclaim highmem zone if buffer_heads is over limit
  numa: fix /proc/<pid>/numa_maps for THP
  mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
  memcg: relocate charge moving from ->attach to ->post_attach
  cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
  slub: clean up code for kmem cgroup support to kmem_cache_free_bulk
  workqueue: fix ghost PENDING flag while doing MQ IO
  x86/apic: Handle zero vector gracefully in clear_vector_irq()
  efi: Expose non-blocking set_variable() wrapper to efivars
  efi: Fix out-of-bounds read in variable_matches()
  IB/security: Restrict use of the write() interface
  IB/mlx5: Expose correct max_sge_rd limit
  cxl: Keep IRQ mappings on context teardown
  v4l2-dv-timings.h: fix polarity for 4k formats
  vb2-memops: Fix over allocation of frame vectors
  ASoC: rt5640: Correct the digital interface data select
  ASoC: dapm: Make sure we have a card when displaying component widgets
  ASoC: ssm4567: Reset device before regcache_sync()
  ASoC: s3c24xx: use const snd_soc_component_driver pointer
  EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
  toshiba_acpi: Fix regression caused by hotkey enabling value
  i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared
  i2c: cpm: Fix build break due to incompatible pointer types
  perf intel-pt: Fix segfault tracing transactions
  drm/i915: Use fw_domains_put_with_fifo() on HSW
  drm/i915: Fixup the free space logic in ring_prepare
  drm/amdkfd: uninitialized variable in dbgdev_wave_control_set_registers()
  drm/i915: skl_update_scaler() wants a rotation bitmask instead of bit number
  drm/i915: Cleanup phys status page too
  pwm: brcmstb: Fix check of devm_ioremap_resource() return code
  drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()
  drm/dp/mst: Restore primary hub guid on resume
  drm/dp/mst: Validate port in drm_dp_payload_send_msg()
  drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries
  drm: Loongson-3 doesn't fully support wc memory
  drm/radeon: fix vertical bars appear on monitor (v2)
  drm/radeon: forbid mapping of userptr bo through radeon device file
  drm/radeon: fix initial connector audio value
  drm/radeon: add a quirk for a XFX R9 270X
  drm/amdgpu: fix regression on CIK (v2)
  amdgpu/uvd: add uvd fw version for amdgpu
  drm/amdgpu: bump the afmt limit for CZ, ST, Polaris
  drm/amdgpu: use defines for CRTCs and AMFT blocks
  drm/amdgpu: when suspending, if uvd/vce was running. need to cancel delay work.
  iommu/dma: Restore scatterlist offsets correctly
  iommu/amd: Fix checking of pci dma aliases
  pinctrl: single: Fix pcs_parse_bits_in_pinctrl_entry to use __ffs than ffs
  pinctrl: mediatek: correct debounce time unit in mtk_gpio_set_debounce
  xen kconfig: don't "select INPUT_XEN_KBDDEV_FRONTEND"
  Input: pmic8xxx-pwrkey - fix algorithm for converting trigger delay
  Input: gtco - fix crash on detecting device without endpoints
  netlink: don't send NETLINK_URELEASE for unbound sockets
  nl80211: check netlink protocol in socket release notification
  powerpc: Update TM user feature bits in scan_features()
  powerpc: Update cpu_user_features2 in scan_features()
  powerpc: scan_features() updates incorrect bits for REAL_LE
  crypto: talitos - fix AEAD tcrypt tests
  crypto: talitos - fix crash in talitos_cra_init()
  crypto: sha1-mb - use corrcet pointer while completing jobs
  crypto: ccp - Prevent information leakage on export
  iwlwifi: mvm: fix memory leak in paging
  iwlwifi: pcie: lower the debug level for RSA semaphore access
  s390/pci: add extra padding to function measurement block
  cpufreq: intel_pstate: Fix processing for turbo activation ratio
  Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
  Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control"
  drm/i915: Fix race condition in intel_dp_destroy_mst_connector()
  drm/qxl: fix cursor position with non-zero hotspot
  drm/nouveau/core: use vzalloc for allocating ramht
  futex: Acknowledge a new waiter in counter before plist
  futex: Handle unlock_pi race gracefully
  asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic()
  ALSA: hda - Add dock support for ThinkPad X260
  ALSA: pcxhr: Fix missing mutex unlock
  ALSA: hda - add PCI ID for Intel Broxton-T
  ALSA: hda - Keep powering up ADCs on Cirrus codecs
  ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m
  ALSA: hda - Don't trust the reported actual power state
  x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address
  x86/mm/xen: Suppress hugetlbfs in PV guests
  arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission
  arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings
  sched/cgroup: Fix/cleanup cgroup teardown/init
  dmaengine: pxa_dma: fix the maximum requestor line
  dmaengine: hsu: correct use of channel status register
  dmaengine: dw: fix master selection
  debugfs: Make automount point inodes permanently empty
  lib: lz4: fixed zram with lz4 on big endian machines
  dm cache metadata: fix cmd_read_lock() acquiring write lock
  dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros
  usb: gadget: f_fs: Fix use-after-free
  usb: hcd: out of bounds access in for_each_companion
  xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers
  usb: xhci: fix wild pointers in xhci_mem_cleanup
  xhci: resume USB 3 roothub first
  usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host
  assoc_array: don't call compare_object() on a node
  ARM: OMAP2+: hwmod: Fix updating of sysconfig register
  ARM: OMAP2: Fix up interconnect barrier initialization for DRA7
  ARM: mvebu: Correct unit address for linksys
  ARM: dts: AM43x-epos: Fix clk parent for synctimer
  KVM: arm/arm64: Handle forward time correction gracefully
  kvm: x86: do not leak guest xcr0 into host interrupt handlers
  x86/mce: Avoid using object after free in genpool
  block: loop: fix filesystem corruption in case of aio/dio
  block: partition: initialize percpuref before sending out KOBJ_ADD

Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/cputype.h
	arch/arm64/include/asm/hardirq.h
	arch/arm64/include/asm/irq.h
	arch/arm64/kernel/cpu_errata.c
	arch/arm64/kernel/cpuinfo.c
	arch/arm64/kernel/setup.c
	arch/arm64/kernel/smp.c
	arch/arm64/kernel/stacktrace.c
	arch/arm64/mm/init.c
	arch/arm64/mm/mmu.c
	arch/arm64/mm/pageattr.c
	mm/memcontrol.c

CRs-Fixed: 1054234
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Change-Id: I2a7a34631ffee36ce18b9171f16d023be777392f
2016-08-18 14:50:45 -07:00
Satya Durga Srinivasu Prabhala
4e6dcec1bd Revert "arm64: Add support for app specific settings"
This reverts commit 7ab05c20ad ("arm64: Add support
for app specific settings").

Feature is not applicable to msmcobalt and only applicable
to MSM8996.

CRs-Fixed: 1054373
Change-Id: I12d3a22362b965c7d302976c83ab0e757c98d3c6
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2016-08-15 15:08:53 -07:00
Trilok Soni
f145f41478 Merge remote-tracking branch 'msm-4.4/tmp-2bf7955' into msm-4.4
* msm-4.4/tmp-2bf7955:
  Linux 4.4.8
  Revert "usb: hub: do not clear BOS field during reset device"
  usbvision: fix crash on detecting device with invalid configuration
  staging: android: ion: Set the length of the DMA sg entries in buffer
  Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"
  Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed"
  Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"
  HID: usbhid: fix inconsistent reset/resume/reset-resume behavior
  HID: wacom: fix Bamboo ONE oops
  ALSA: usb-audio: Skip volume controls triggers hangup on Dell USB Dock
  ALSA: usb-audio: Add a quirk for Plantronics BT300
  ALSA: usb-audio: Add a sample rate quirk for Phoenix Audio TMX320
  ALSA: hda/realtek - Enable the ALC292 dock fixup on the Thinkpad T460s
  ALSA: hda - fix front mic problem for a HP desktop
  ALSA: hda - Fix headset support and noise on HP EliteBook 755 G2
  ALSA: hda - Fixup speaker pass-through control for nid 0x14 on ALC225
  mmc: sdhci-pci: Add support and PCI IDs for more Broxton host controllers
  perf: Cure event->pending_disable race
  perf: Do not double free
  arm64: replace read_lock to rcu lock in call_step_hook
  Btrfs: fix file/data loss caused by fsync after rename and new inode
  iommu: Don't overwrite domain pointer when there is no default_domain
  ext4: ignore quota mount options if the quota feature is enabled
  ext4: add lockdep annotations for i_data_sem
  btrfs: fix crash/invalid memory access on fsync when using overlayfs
  nfs: use file_dentry()
  fs: add file_dentry()
  sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
  iio: gyro: bmg160: fix endianness when reading axes
  iio: gyro: bmg160: fix buffer read values
  iio: accel: bmc150: fix endianness when reading axes
  iio: st_magn: always define ST_MAGN_TRIGGER_SET_STATE
  usb: renesas_usbhs: fix to avoid using a disabled ep in usbhsg_queue_done()
  usb: renesas_usbhs: disable TX IRQ before starting TX DMAC transfer
  usb: renesas_usbhs: avoid NULL pointer derefernce in usbhsf_pkt_handler()
  mac80211: fix txq queue related crashes
  mac80211: fix unnecessary frame drops in mesh fwding
  mac80211: fix ibss scan parameters
  mac80211: avoid excessive stack usage in sta_info
  mac80211: properly deal with station hashtable insert errors
  virtio: virtio 1.0 cs04 spec compliance for reset
  rbd: use GFP_NOIO consistently for request allocations
  pcmcia: db1xxx_ss: fix last irq_to_gpio user
  v4l: vsp1: Set the SRU CTRL0 register when starting the stream
  coda: fix error path in case of missing pdata on non-DT platform
  au0828: Fix dev_state handling
  au0828: fix au0828_v4l2_close() dev_state race condition
  pinctrl: freescale: imx: fix bogus check of of_iomap() return value
  pinctrl: nomadik: fix pull debug print inversion
  pinctrl: sunxi: Fix A33 external interrupts not working
  pinctrl: sh-pfc: only use dummy states for non-DT platforms
  pinctrl: pistachio: fix mfio84-89 function description and pinmux.
  MIPS: Fix MSA ld unaligned failure cases
  KVM: x86: reduce default value of halt_poll_ns parameter
  KVM: x86: Inject pending interrupt even if pending nmi exist
  cdc-acm: fix NULL pointer reference
  USB: uas: Add a new NO_REPORT_LUNS quirk
  USB: uas: Limit qdepth at the scsi-host level
  mpls: find_outdev: check for err ptr in addition to NULL check
  ipv6: Count in extension headers in skb->network_header
  ip6_tunnel: set rtnl_link_ops before calling register_netdevice
  ipv6: l2tp: fix a potential issue in l2tp_ip6_recv
  ipv4: l2tp: fix a potential issue in l2tp_ip_recv
  tuntap: restore default qdisc
  tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter
  rtnl: fix msg size calculation in if_nlmsg_size()
  bridge: Allow set bridge ageing time when switchdev disabled
  ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates
  qmi_wwan: add "D-Link DWM-221 B1" device id
  xfrm: Fix crash observed during device unregistration and decryption
  ppp: take reference on channels netns
  ipv4: initialize flowi4_flags before calling fib_lookup()
  ipv4: fix broadcast packets reception
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  qlge: Fix receive packets drop.
  tcp/dccp: remove obsolete WARN_ON() in icmp handlers
  ppp: ensure file->private_data can't be overridden
  ath9k: fix buffer overrun for ar9287
  farsync: fix off-by-one bug in fst_add_one
  mlx4: add missing braces in verify_qp_parameters
  net: Fix use after free in the recvmmsg exit path
  ipv4: Don't do expensive useless work during inetdev destroy.
  bridge: allow zero ageing time
  rocker: set FDB cleanup timer according to lowest ageing time
  mlxsw: spectrum: Check requested ageing time is valid
  macvtap: always pass ethernet header in linear
  qlcnic: Fix mailbox completion handling during spurious interrupt
  qlcnic: Remove unnecessary usage of atomic_t
  sh_eth: advance 'rxdesc' later in sh_eth_ring_format()
  sh_eth: fix NULL pointer dereference in sh_eth_ring_format()
  bpf: avoid copying junk bytes in bpf_get_current_comm()
  packet: validate variable length ll headers
  ax25: add link layer header validation function
  net: validate variable length ll headers
  ppp: release rtnl mutex when interface creation fails
  tcp: fix tcpi_segs_in after connection establishment
  udp6: fix UDP/IPv6 encap resubmit path
  usbnet: cleanup after bind() in probe()
  cdc_ncm: toggle altsetting to force reset before setup
  vxlan: fix missing options_len update on RX with collect metadata
  ipv6: re-enable fragment header matching in ipv6_find_hdr
  qmi_wwan: add Sierra Wireless EM74xx device ID
  tipc: Revert "tipc: use existing sk_write_queue for outgoing packet chain"
  mld, igmp: Fix reserved tailroom calculation
  sctp: lack the check for ports in sctp_v6_cmp_addr
  net: fix bridge multicast packet checksum validation
  net: qca_spi: clear IFF_TX_SKB_SHARING
  net: qca_spi: Don't clear IFF_BROADCAST
  net: vrf: Remove direct access to skb->data
  net: jme: fix suspend/resume on JMC260
  ipv4: only create late gso-skb if skb is already set up with CHECKSUM_PARTIAL
  tunnel: Clear IPCB(skb)->opt before dst_link_failure called
  tcp: convert cached rtt from usec to jiffies when feeding initial rto
  xen/events: Mask a moving irq
  drm/amdgpu/gmc: use proper register for vram type on Fiji
  drm/amdgpu/gmc: move vram type fetching into sw_init
  drm/radeon: add a dpm quirk for all R7 370 parts
  drm/radeon: add another R7 370 quirk
  drm/radeon: add a dpm quirk for sapphire Dual-X R7 370 2G D5
  drm/udl: Use unlocked gem unreferencing
  drm/dp: move hw_mutex up the call stack
  arm64: opcodes.h: Add arm big-endian config options before including arm header
  compiler-gcc: disable -ftracer for __noclone functions
  libnvdimm, pfn: fix uuid validation
  libnvdimm: fix smart data retrieval
  powerpc/mm: Fixup preempt underflow with huge pages
  mm: fix invalid node in alloc_migrate_target()
  ALSA: hda - Apply fix for white noise on Asus N550JV, too
  ALSA: hda - Fix white noise on Asus N750JV headphone
  ALSA: hda - Asus N750JV external subwoofer fixup
  ALSA: timer: Use mod_timer() for rearming the system timer
  parisc: Unbreak handling exceptions from kernel modules
  parisc: Fix kernel crash with reversed copy_from_user()
  parisc: Avoid function pointers for kernel exception routines
  PKCS#7: pkcs7_validate_trust(): initialize the _trusted output argument
  hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated
  Linux 4.4.7
  perf/x86/intel: Fix PEBS data source interpretation on Nehalem/Westmere
  perf/x86/intel: Use PAGE_SIZE for PEBS buffer size on Core2
  perf/x86/intel: Fix PEBS warning by only restoring active PMU in pmi
  perf/x86/pebs: Add workaround for broken OVFL status on HSW+
  sched/cputime: Fix steal time accounting vs. CPU hotplug
  scsi_common: do not clobber fixed sense information
  PM / sleep: Clear pm_suspend_global_flags upon hibernate
  intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled
  mtd: onenand: fix deadlock in onenand_block_markbad
  mm/page_alloc: prevent merging between isolated and other pageblocks
  ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list
  ocfs2/dlm: fix race between convert and recovery
  Input: ati_remote2 - fix crashes on detecting device with invalid descriptor
  Input: ims-pcu - sanity check against missing interfaces
  Input: synaptics - handle spurious release of trackstick buttons, again
  writeback, cgroup: fix use of the wrong bdi_writeback which mismatches the inode
  writeback, cgroup: fix premature wb_put() in locked_inode_to_wb_and_lock_list()
  ACPI / PM: Runtime resume devices when waking from hibernate
  ARM: dts: at91: sama5d4 Xplained: don't disable hsmci regulator
  ARM: dts: at91: sama5d3 Xplained: don't disable hsmci regulator
  nfsd: fix deadlock secinfo+readdir compound
  nfsd4: fix bad bounds checking
  iser-target: Rework connection termination
  iser-target: Separate flows for np listeners and connections cma events
  iser-target: Add new state ISER_CONN_BOUND to isert_conn
  iser-target: Fix identification of login rx descriptor type
  target: Fix target_release_cmd_kref shutdown comp leak
  clk: bcm2835: Fix setting of PLL divider clock rates
  clk: rockchip: add hclk_cpubus to the list of rk3188 critical clocks
  clk: rockchip: rk3368: fix hdmi_cec gate-register
  clk: rockchip: rk3368: fix parents of video encoder/decoder
  clk: rockchip: rk3368: fix cpuclk core dividers
  clk: rockchip: rk3368: fix cpuclk mux bit of big cpu-cluster
  mmc: sdhci: Fix override of timeout clk wrt max_busy_timeout
  mmc: sdhci: fix data timeout (part 2)
  mmc: sdhci: fix data timeout (part 1)
  mmc: mmc_spi: Add Card Detect comments and fix CD GPIO case
  mmc: block: fix ABI regression of mmc_blk_ioctl
  ideapad-laptop: Add ideapad Y700 (15) to the no_hw_rfkill DMI list
  MAINTAINERS: Update mailing list and web page for hwmon subsystem
  kbuild/mkspec: fix grub2 installkernel issue
  scripts/kconfig: allow building with make 3.80 again
  scripts/coccinelle: modernize &
  bitops: Do not default to __clear_bit() for __clear_bit_unlock()
  tracing: Fix trace_printk() to print when not using bprintk()
  tracing: Fix crash from reading trace_pipe with sendfile
  tracing: Have preempt(irqs)off trace preempt disabled functions
  IB/ipoib: fix for rare multicast join race condition
  drm/amdgpu: include the right version of gmc header files for iceland
  drm/amdgpu: disable runtime pm on PX laptops without dGPU power control
  drm/radeon: Don't drop DP 2.7 Ghz link setup on some cards.
  drm/radeon: disable runtime pm on PX laptops without dGPU power control
  iwlwifi: mvm: Fix paging memory leak
  ipr: Fix regression when loading firmware
  ipr: Fix out-of-bounds null overwrite
  rapidio/rionet: fix deadlock on SMP
  fs/coredump: prevent fsuid=0 dumps into user-controlled directories
  fuse: Add reference counting for fuse_io_priv
  fuse: do not use iocb after it may have been freed
  md: multipath: don't hardcopy bio in .make_request path
  md/raid5: preserve STRIPE_PREREAD_ACTIVE in break_stripe_batch_list
  raid10: include bio_end_io_list in nr_queued to prevent freeze_array hang
  RAID5: revert e9e4c377e2 to fix a livelock
  RAID5: check_reshape() shouldn't call mddev_suspend
  md/raid5: Compare apples to apples (or sectors to sectors)
  raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang
  xfs: fix two memory leaks in xfs_attr_list.c error paths
  quota: Fix possible GPF due to uninitialised pointers
  ARC: bitops: Remove non relevant comments
  ARC: [BE] readl()/writel() to work in Big Endian CPU configuration
  xtensa: clear all DBREAKC registers on start
  xtensa: fix preemption in {clear,copy}_user_highpage
  xtensa: ISS: don't hang if stdin EOF is reached
  splice: handle zero nr_pages in splice_to_pipe()
  vfs: show_vfsstat: do not ignore errors from show_devname method
  of: alloc anywhere from memblock if range not specified
  net: mvneta: enable change MAC address when interface is up
  cgroup: ignore css_sets associated with dead cgroups during migration
  Bluetooth: Fix potential buffer overflow with Add Advertising
  Bluetooth: Add new AR3012 ID 0489:e095
  watchdog: rc32434_wdt: fix ioctl error handling
  watchdog: don't run proc_watchdog_update if new value is same as old
  ia64: define ioremap_uc()
  mm: memcontrol: reclaim and OOM kill when shrinking memory.max below usage
  mm: memcontrol: reclaim when shrinking memory.high below usage
  bcache: fix cache_set_flush() NULL pointer dereference on OOM
  bcache: fix race of writeback thread starting before complete initialization
  bcache: cleaned up error handling around register_cache()
  IB/srpt: Simplify srpt_handle_tsk_mgmt()
  brd: Fix discard request processing
  jbd2: fix FS corruption possibility in jbd2_journal_destroy() on umount path
  tools/hv: Use include/uapi with __EXPORTED_HEADERS__
  ALSA: hda - Fix unconditional GPIO toggle via automute
  ALSA: hda - fix the mic mute button and led problem for a Lenovo AIO
  ALSA: hda - Don't handle ELD notify from invalid port
  ALSA: intel8x0: Add clock quirk entry for AD1981B on IBM ThinkPad X41.
  ALSA: pcm: Avoid "BUG:" string for warnings again
  ALSA: hda - Apply reboot D3 fix for CX20724 codec, too
  mtip32xx: Cleanup queued requests after surprise removal
  mtip32xx: Implement timeout handler
  mtip32xx: Handle FTL rebuild failure state during device initialization
  mtip32xx: Handle safe removal during IO
  mtip32xx: Fix for rmmod crash when drive is in FTL rebuild
  mtip32xx: Print exact time when an internal command is interrupted
  mtip32xx: Remove unwanted code from taskfile error handler
  mtip32xx: Fix broken service thread handling
  mtip32xx: Avoid issuing standby immediate cmd during FTL rebuild
  media: v4l2-compat-ioctl32: fix missing length copy in put_v4l2_buffer32
  coda: fix first encoded frame payload
  bttv: Width must be a multiple of 16 when capturing planar formats
  adv7511: TX_EDID_PRESENT is still 1 after a disconnect
  saa7134: Fix bytesperline not being set correctly for planar formats
  8250: use callbacks to access UART_DLL/UART_DLM
  net: irda: Fix use-after-free in irtty_open()
  tty: Fix GPF in flush_to_ldisc(), part 2
  staging: comedi: ni_mio_common: fix the ni_write[blw]() functions
  staging: android: ion_test: fix check of platform_device_register_simple() error code
  staging: comedi: ni_tiocmd: change mistaken use of start_src for start_arg
  HID: fix hid_ignore_special_drivers module parameter
  HID: multitouch: force retrieving of Win8 signature blob
  HID: i2c-hid: fix OOB write in i2c_hid_set_or_send_report()
  HID: logitech: fix Dual Action gamepad support
  tpm: fix the cleanup of struct tpm_chip
  tpm_eventlog.c: fix binary_bios_measurements
  tpm_crb: tpm2_shutdown() must be called before tpm_chip_unregister()
  tpm: fix the rollback in tpm_chip_register()
  mei: bus: check if the device is enabled before data transfer
  X.509: Fix leap year handling again
  crypto: marvell/cesa - forward devm_ioremap_resource() error code
  crypto: ux500 - fix checks of error code returned by devm_ioremap_resource()
  crypto: atmel - fix checks of error code returned by devm_ioremap_resource()
  crypto: keywrap - memzero the correct memory
  crypto: ccp - memset request context to zero during import
  crypto: ccp - Don't assume export/import areas are aligned
  crypto: ccp - Limit the amount of information exported
  crypto: ccp - Add hash state import and export support
  Bluetooth: btusb: Add a new AR3012 ID 13d3:3472
  Bluetooth: btusb: Add a new AR3012 ID 04ca:3014
  Bluetooth: btusb: Add new AR3012 ID 13d3:3395
  ALSA: usb-audio: Fix double-free in error paths after snd_usb_add_audio_stream() call
  ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk()
  ALSA: usb-audio: add Microsoft HD-5001 to quirks
  ALSA: usb-audio: Add sanity checks for endpoint accesses
  ALSA: usb-audio: Fix NULL dereference in create_fixed_stream_quirk()
  Input: powermate - fix oops with malicious USB descriptors
  pwc: Add USB id for Philips Spc880nc webcam
  USB: option: add "D-Link DWM-221 B1" device id
  USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices
  USB: serial: cp210x: Adding GE Healthcare Device ID
  USB: cypress_m8: add endpoint sanity check
  USB: digi_acceleport: do sanity checking for the number of ports
  USB: mct_u232: add sanity checking in probe
  USB: usb_driver_claim_interface: add sanity checking
  USB: iowarrior: fix oops with malicious USB descriptors
  USB: cdc-acm: more sanity checking
  USB: uas: Reduce can_queue to MAX_CMNDS
  usb: hub: fix a typo in hub_port_init() leading to wrong logic
  usb: retry reset if a device times out
  dm: fix rq_end_stats() NULL pointer in dm_requeue_original_request()
  dm cache: make sure every metadata function checks fail_io
  dm thin metadata: don't issue prefetches if a transaction abort has failed
  dm: fix excessive dm-mq context switching
  dm snapshot: disallow the COW and origin devices from being identical
  libnvdimm: Fix security issue with DSM IOCTL.
  aic7xxx: Fix queue depth handling
  be2iscsi: set the boot_kset pointer to NULL in case of failure
  scsi: storvsc: fix SRB_STATUS_ABORTED handling
  sd: Fix discard granularity when LBPRZ=1
  aacraid: Set correct msix count for EEH recovery
  aacraid: Fix memory leak in aac_fib_map_free
  aacraid: Fix RRQ overload
  sg: fix dxferp in from_to case
  x86/mm: TLB_REMOTE_SEND_IPI should count pages
  x86/iopl: Fix iopl capability check on Xen PV
  x86/iopl/64: Properly context-switch IOPL on Xen PV
  x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt()
  x86/irq: Cure live lock in fixup_irqs()
  PCI: ACPI: IA64: fix IO port generic range check
  PCI: Disable IO/MEM decoding for devices with non-compliant BARs
  pinctrl-bcm2835: Fix cut-and-paste error in "pull" parsing
  s390/pci: enforce fmb page boundary rule
  s390/cpumf: add missing lpp magic initialization
  s390: fix floating pointer register corruption (again)
  EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr()
  EDAC/sb_edac: Fix computation of channel address
  sched/preempt, sh: kmap_coherent relies on disabled preemption
  sched/cputime: Fix steal_account_process_tick() to always return jiffies
  Thermal: Ignore invalid trip points
  perf tools: Fix python extension build
  perf tools: Fix checking asprintf return value
  perf tools: Dont stop PMU parsing on alias parse error
  perf/core: Fix perf_sched_count derailment
  KVM: VMX: fix nested vpid for old KVM guests
  KVM: VMX: avoid guest hang on invalid invvpid instruction
  KVM: VMX: avoid guest hang on invalid invept instruction
  KVM: fix spin_lock_init order on x86
  KVM: i8254: change PIT discard tick policy
  KVM: x86: fix missed hardware breakpoints
  x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
  perf/x86/intel: Add definition for PT PMI bit
  x86/entry/compat: Keep TS_COMPAT set during signal delivery
  x86/microcode: Untangle from BLK_DEV_INITRD
  x86/microcode/intel: Make early loader look for builtin microcode too
  mmc: sh_mmcif: Correct TX DMA channel allocation
  mmc: sh_mmcif: rework dma channel handling
  ASoC: samsung: pass DMA channels as pointers
  regulator: core: Fix nested locking of supplies
  regulator: core: avoid unused variable warning
  s390/cpumf: Fix lpp detection
  cpufreq: dt: No need to allocate resources anymore
  cpufreq: dt: No need to fetch voltage-tolerance
  cpufreq: dt: Use dev_pm_opp_set_rate() to switch frequency
  cpufreq: dt: Reuse dev_pm_opp_get_max_transition_latency()
  cpufreq: dt: Unsupported OPPs are already disabled
  cpufreq: dt: Pass regulator name to the OPP core
  cpufreq: dt: OPP layers handles clock-latency for V1 bindings as well
  cpufreq: dt: Rename 'need_update' to 'opp_v1'
  cpufreq: dt: Convert few pr_debug/err() calls to dev_dbg/err()
  cpufreq-dt: fix handling regulator_get_voltage() result
  cpufreq-dt: Supply power coefficient when registering cooling devices
  PM / OPP: Rename structures for clarity
  PM / OPP: Fix incorrect comments
  PM / OPP: Initialize regulator pointer to an error value
  PM / OPP: Initialize u_volt_min/max to a valid value
  PM / OPP: Fix NULL pointer dereference crash when disabling OPPs
  PM / OPP: Add dev_pm_opp_set_rate()
  PM / OPP: Manage device clk
  PM / OPP: Parse clock-latency and voltage-tolerance for v1 bindings
  PM / OPP: Introduce dev_pm_opp_get_max_transition_latency()
  PM / OPP: Introduce dev_pm_opp_get_max_volt_latency()
  PM / OPP: Disable OPPs that aren't supported by the regulator
  PM / OPP: get/put regulators from OPP core
  cpufreq: cpufreq-dt: avoid uninitialized variable warnings:
  PM / OPP: Use snprintf() instead of sprintf()
  PM / OPP: Set cpu_dev->id in cpumask first
  PM / OPP: Fix parsing of opp-microvolt and opp-microamp properties
  PM / OPP: Parse 'opp-<prop>-<name>' bindings
  PM / OPP: Parse 'opp-supported-hw' binding
  PM / OPP: Add missing doc comments
  PM / OPP: Rename OPP nodes as opp@<opp-hz>
  PM / OPP: Remove 'operating-points-names' binding
  PM / OPP: Add {opp-microvolt|opp-microamp}-<name> binding
  PM / OPP: Add "opp-supported-hw" binding
  PM / OPP: Add debugfs support
  arm64: vdso: Mark vDSO code as read-only

Conflicts:
	drivers/staging/android/ion/ion.c
	mm/page_alloc.c

CRs-Fixed: 1010239
Change-Id: Id59539cad642885e1e41340cebae4159ba1f7eaf
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-07-22 16:45:32 -07:00
Runmin Wang
750075feff Merge remote-tracking branch 'origin/tmp-917a9a9133a6' into lsk
* tmp-917a9:
  ARM/vdso: Mark the vDSO code read-only after init
  x86/vdso: Mark the vDSO code read-only after init
  lkdtm: Verify that '__ro_after_init' works correctly
  arch: Introduce post-init read-only memory
  x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option
  mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings
  asm-generic: Consolidate mark_rodata_ro()
  Linux 4.4.6
  ld-version: Fix awk regex compile failure
  target: Drop incorrect ABORT_TASK put for completed commands
  block: don't optimize for non-cloned bio in bio_get_last_bvec()
  MIPS: smp.c: Fix uninitialised temp_foreign_map
  MIPS: Fix build error when SMP is used without GIC
  ovl: fix getcwd() failure after unsuccessful rmdir
  ovl: copy new uid/gid into overlayfs runtime inode
  userfaultfd: don't block on the last VM updates at exit time
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/powernv: Add a kmsg_dumper that flushes console output on panic
  powerpc: Fix dedotify for binutils >= 2.26
  Revert "drm/radeon/pm: adjust display configuration after powerstate"
  drm/radeon: Fix error handling in radeon_flip_work_func.
  drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
  Revert "drm/radeon: call hpd_irq_event on resume"
  x86/mm: Fix slow_virt_to_phys() for X86_PAE again
  gpu: ipu-v3: Do not bail out on missing optional port nodes
  mac80211: Fix Public Action frame RX in AP mode
  mac80211: check PN correctly for GCMP-encrypted fragmented MPDUs
  mac80211: minstrel_ht: fix a logic error in RTS/CTS handling
  mac80211: minstrel_ht: set default tx aggregation timeout to 0
  mac80211: fix use of uninitialised values in RX aggregation
  mac80211: minstrel: Change expected throughput unit back to Kbps
  iwlwifi: mvm: inc pending frames counter also when txing non-sta
  can: gs_usb: fixed disconnect bug by removing erroneous use of kfree()
  cfg80211/wext: fix message ordering
  wext: fix message delay/ordering
  ovl: fix working on distributed fs as lower layer
  ovl: ignore lower entries when checking purity of non-directory entries
  ASoC: wm8958: Fix enum ctl accesses in a wrong type
  ASoC: wm8994: Fix enum ctl accesses in a wrong type
  ASoC: samsung: Use IRQ safe spin lock calls
  ASoC: dapm: Fix ctl value accesses in a wrong type
  ncpfs: fix a braino in OOM handling in ncp_fill_cache()
  jffs2: reduce the breakage on recovery from halfway failed rename()
  dmaengine: at_xdmac: fix residue computation
  tracing: Fix check for cpu online when event is disabled
  s390/dasd: fix diag 0x250 inline assembly
  s390/mm: four page table levels vs. fork
  KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
  KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
  KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
  KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
  KVM: VMX: disable PEBS before a guest entry
  kvm: cap halt polling at exactly halt_poll_ns
  PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr()
  ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
  ARM: dts: dra7: do not gate cpsw clock due to errata i877
  ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
  arm64: account for sparsemem section alignment when choosing vmemmap offset
  Linux 4.4.5
  drm/amdgpu: fix topaz/tonga gmc assignment in 4.4 stable
  modules: fix longstanding /proc/kallsyms vs module insertion race.
  drm/i915: refine qemu south bridge detection
  drm/i915: more virtual south bridge detection
  block: get the 1st and last bvec via helpers
  block: check virt boundary in bio_will_gap()
  drm/amdgpu: Use drm_calloc_large for VM page_tables array
  thermal: cpu_cooling: fix out of bounds access in time_in_idle
  i2c: brcmstb: allocate correct amount of memory for regmap
  ubi: Fix out of bounds write in volume update code
  cxl: Fix PSL timebase synchronization detection
  MIPS: traps: Fix SIGFPE information leak from `do_ov' and `do_trap_or_bp'
  MIPS: scache: Fix scache init with invalid line size.
  USB: serial: option: add support for Quectel UC20
  USB: serial: option: add support for Telit LE922 PID 0x1045
  USB: qcserial: add Sierra Wireless EM74xx device ID
  USB: qcserial: add Dell Wireless 5809e Gobi 4G HSPA+ (rev3)
  USB: cp210x: Add ID for Parrot NMEA GPS Flight Recorder
  usb: chipidea: otg: change workqueue ci_otg as freezable
  ALSA: timer: Fix broken compat timer user status ioctl
  ALSA: hdspm: Fix zero-division
  ALSA: hdsp: Fix wrong boolean ctl value accesses
  ALSA: hdspm: Fix wrong boolean ctl value accesses
  ALSA: seq: oss: Don't drain at closing a client
  ALSA: pcm: Fix ioctls for X32 ABI
  ALSA: timer: Fix ioctls for X32 ABI
  ALSA: rawmidi: Fix ioctls X32 ABI
  ALSA: hda - Fix mic issues on Acer Aspire E1-472
  ALSA: ctl: Fix ioctls for X32 ABI
  ALSA: usb-audio: Add a quirk for Plantronics DA45
  adv7604: fix tx 5v detect regression
  dmaengine: pxa_dma: fix cyclic transfers
  Fix directory hardlinks from deleted directories
  jffs2: Fix page lock / f->sem deadlock
  Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"
  Btrfs: fix loading of orphan roots leading to BUG_ON
  pata-rb532-cf: get rid of the irq_to_gpio() call
  tracing: Do not have 'comm' filter override event 'comm' field
  ata: ahci: don't mark HotPlugCapable Ports as external/removable
  PM / sleep / x86: Fix crash on graph trace through x86 suspend
  arm64: vmemmap: use virtual projection of linear region
  Adding Intel Lewisburg device IDs for SATA
  writeback: flush inode cgroup wb switches instead of pinning super_block
  block: bio: introduce helpers to get the 1st and last bvec
  libata: Align ata_device's id on a cacheline
  libata: fix HDIO_GET_32BIT ioctl
  drm/amdgpu: return from atombios_dp_get_dpcd only when error
  drm/amdgpu/gfx8: specify which engine to wait before vm flush
  drm/amdgpu: apply gfx_v8 fixes to gfx_v7 as well
  drm/amdgpu/pm: update current crtc info after setting the powerstate
  drm/radeon/pm: update current crtc info after setting the powerstate
  drm/ast: Fix incorrect register check for DRAM width
  target: Fix WRITE_SAME/DISCARD conversion to linux 512b sectors
  iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path
  iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered
  iommu/amd: Apply workaround for ATS write permission check
  arm/arm64: KVM: Fix ioctl error handling
  KVM: x86: fix root cause for missed hardware breakpoints
  vfio: fix ioctl error handling
  Fix cifs_uniqueid_to_ino_t() function for s390x
  CIFS: Fix SMB2+ interim response processing for read requests
  cifs: fix out-of-bounds access in lease parsing
  fbcon: set a default value to blink interval
  kvm: x86: Update tsc multiplier on change.
  mips/kvm: fix ioctl error handling
  parisc: Fix ptrace syscall number and return value modification
  PCI: keystone: Fix MSI code that retrieves struct pcie_port pointer
  block: Initialize max_dev_sectors to 0
  drm/amdgpu: mask out WC from BO on unsupported arches
  btrfs: async-thread: Fix a use-after-free error for trace
  btrfs: Fix no_space in write and rm loop
  Btrfs: fix deadlock running delayed iputs at transaction commit time
  drivers: sh: Restore legacy clock domain on SuperH platforms
  use ->d_seq to get coherency between ->d_inode and ->d_flags
  Linux 4.4.4
  iwlwifi: mvm: don't allow sched scans without matches to be started
  iwlwifi: update and fix 7265 series PCI IDs
  iwlwifi: pcie: properly configure the debug buffer size for 8000
  iwlwifi: dvm: fix WoWLAN
  security: let security modules use PTRACE_MODE_* with bitmasks
  IB/cma: Fix RDMA port validation for iWarp
  x86/irq: Plug vector cleanup race
  x86/irq: Call irq_force_move_complete with irq descriptor
  x86/irq: Remove outgoing CPU from vector cleanup mask
  x86/irq: Remove the cpumask allocation from send_cleanup_vector()
  x86/irq: Clear move_in_progress before sending cleanup IPI
  x86/irq: Remove offline cpus from vector cleanup
  x86/irq: Get rid of code duplication
  x86/irq: Copy vectormask instead of an AND operation
  x86/irq: Check vector allocation early
  x86/irq: Reorganize the search in assign_irq_vector
  x86/irq: Reorganize the return path in assign_irq_vector
  x86/irq: Do not use apic_chip_data.old_domain as temporary buffer
  x86/irq: Validate that irq descriptor is still active
  x86/irq: Fix a race in x86_vector_free_irqs()
  x86/irq: Call chip->irq_set_affinity in proper context
  x86/entry/compat: Add missing CLAC to entry_INT80_32
  x86/mpx: Fix off-by-one comparison with nr_registers
  hpfs: don't truncate the file when delete fails
  do_last(): ELOOP failure exit should be done after leaving RCU mode
  should_follow_link(): validate ->d_seq after having decided to follow
  xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted.
  xen/pciback: Save the number of MSI-X entries to be copied later.
  xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY
  xen/scsiback: correct frontend counting
  xen/arm: correctly handle DMA mapping of compound pages
  ARM: at91/dt: fix typo in sama5d2 pinmux descriptions
  ARM: OMAP2+: Fix onenand initialization to avoid filesystem corruption
  do_last(): don't let a bogus return value from ->open() et.al. to confuse us
  kernel/resource.c: fix muxed resource handling in __request_region()
  sunrpc/cache: fix off-by-one in qword_get()
  tracing: Fix showing function event in available_events
  powerpc/eeh: Fix partial hotplug criterion
  KVM: x86: MMU: fix ubsan index-out-of-range warning
  KVM: x86: fix conversion of addresses to linear in 32-bit protected mode
  KVM: x86: fix missed hardware breakpoints
  KVM: arm/arm64: vgic: Ensure bitmaps are long enough
  KVM: async_pf: do not warn on page allocation failures
  of/irq: Fix msi-map calculation for nonzero rid-base
  NFSv4: Fix a dentry leak on alias use
  nfs: fix nfs_size_to_loff_t
  block: fix use-after-free in dio_bio_complete
  bio: return EINTR if copying to user space got interrupted
  i2c: i801: Adding Intel Lewisburg support for iTCO
  phy: core: fix wrong err handle for phy_power_on
  writeback: keep superblock pinned during cgroup writeback association switches
  cgroup: make sure a parent css isn't offlined before its children
  cpuset: make mm migration asynchronous
  PCI/AER: Flush workqueue on device remove to avoid use-after-free
  ARCv2: SMP: Emulate IPI to self using software triggered interrupt
  ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2
  libata: fix sff host state machine locking while polling
  qla2xxx: Fix stale pointer access.
  spi: atmel: fix gpio chip-select in case of non-DT platform
  target: Fix race with SCF_SEND_DELAYED_TAS handling
  target: Fix remote-port TMR ABORT + se_cmd fabric stop
  target: Fix TAS handling for multi-session se_node_acls
  target: Fix LUN_RESET active TMR descriptor handling
  target: Fix LUN_RESET active I/O handling for ACK_KREF
  ALSA: hda - Fixing background noise on Dell Inspiron 3162
  ALSA: hda - Apply clock gate workaround to Skylake, too
  Revert "workqueue: make sure delayed work run in local cpu"
  workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
  mac80211: Requeue work after scan complete for all VIF types.
  rfkill: fix rfkill_fop_read wait_event usage
  tick/nohz: Set the correct expiry when switching to nohz/lowres mode
  perf stat: Do not clean event's private stats
  cdc-acm:exclude Samsung phone 04e8:685d
  Revert "Staging: panel: usleep_range is preferred over udelay"
  Staging: speakup: Fix getting port information
  sd: Optimal I/O size is in bytes, not sectors
  libceph: don't spam dmesg with stray reply warnings
  libceph: use the right footer size when skipping a message
  libceph: don't bail early from try_read() when skipping a message
  libceph: fix ceph_msg_revoke()
  seccomp: always propagate NO_NEW_PRIVS on tsync
  cpufreq: Fix NULL reference crash while accessing policy->governor_data
  cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype
  hwmon: (ads1015) Handle negative conversion values correctly
  hwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook
  hwmon: (dell-smm) Blacklist Dell Studio XPS 8000
  Thermal: do thermal zone update after a cooling device registered
  Thermal: handle thermal zone device properly during system sleep
  Thermal: initialize thermal zone device correctly
  IB/mlx5: Expose correct maximum number of CQE capacity
  IB/qib: Support creating qps with GFP_NOIO flag
  IB/qib: fix mcast detach when qp not attached
  IB/cm: Fix a recently introduced deadlock
  dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer
  dmaengine: at_xdmac: fix resume for cyclic transfers
  dmaengine: dw: fix cyclic transfer callbacks
  dmaengine: dw: fix cyclic transfer setup
  nfit: fix multi-interface dimm handling, acpi6.1 compatibility
  ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot()
  ACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist"
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700
  lib: sw842: select crc32
  uapi: update install list after nvme.h rename
  ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list
  ideapad-laptop: Add Lenovo ideapad Y700-17ISK to no_hw_rfkill dmi list
  toshiba_acpi: Fix blank screen at boot if transflective backlight is supported
  make sure that freeing shmem fast symlinks is RCU-delayed
  drm/radeon/pm: adjust display configuration after powerstate
  drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)
  drm: Fix treatment of drm_vblank_offdelay in drm_vblank_on() (v2)
  drm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4
  drm: Prevent vblank counter bumps > 1 with active vblank clients. (v2)
  drm: No-Op redundant calls to drm_vblank_off() (v2)
  drm/radeon: use post-decrement in error handling
  drm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command
  drm/i915: fix error path in intel_setup_gmbus()
  drm/i915/dsi: don't pass arbitrary data to sideband
  drm/i915/dsi: defend gpio table against out of bounds access
  drm/i915/skl: Don't skip mst encoders in skl_ddi_pll_select()
  drm/i915: Don't reject primary plane windowing with color keying enabled on SKL+
  drm/i915/dp: fall back to 18 bpp when sink capability is unknown
  drm/i915: Make sure DC writes are coherent on flush.
  drm/i915: Init power domains early in driver load
  drm/i915: intel_hpd_init(): Fix suspend/resume reprobing
  drm/i915: Restore inhibiting the load of the default context
  drm: fix missing reference counting decrease
  drm/radeon: hold reference to fences in radeon_sa_bo_new
  drm/radeon: mask out WC from BO on unsupported arches
  drm: add helper to check for wc memory support
  drm/radeon: fix DP audio support for APU with DCE4.1 display engine
  drm/radeon: Add a common function for DFS handling
  drm/radeon: cleaned up VCO output settings for DP audio
  drm/radeon: properly byte swap vce firmware setup
  drm/radeon: clean up fujitsu quirks
  drm/radeon: Fix "slow" audio over DP on DCE8+
  drm/radeon: call hpd_irq_event on resume
  drm/radeon: Fix off-by-one errors in radeon_vm_bo_set_addr
  drm/dp/mst: deallocate payload on port destruction
  drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
  drm/dp/mst: move GUID storage from mgr, port to only mst branch
  drm/dp/mst: Calculate MST PBN with 31.32 fixed point
  drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
  drm/dp/mst: fix in RAD element access
  drm/dp/mst: fix in MSTB RAD initialization
  drm/dp/mst: always send reply for UP request
  drm/dp/mst: process broadcast messages correctly
  drm/nouveau: platform: Fix deferred probe
  drm/nouveau/disp/dp: ensure sink is powered up before attempting link training
  drm/nouveau/display: Enable vblank irqs after display engine is on again.
  drm/nouveau/kms: take mode_config mutex in connector hotplug path
  drm/amdgpu/pm: adjust display configuration after powerstate
  drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.
  drm/amdgpu: use post-decrement in error handling
  drm/amdgpu: fix issue with overlapping userptrs
  drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)
  drm/amdgpu: remove unnecessary forward declaration
  drm/amdgpu: fix s4 resume
  drm/amdgpu: remove exp hardware support from iceland
  drm/amdgpu: don't load MEC2 on topaz
  drm/amdgpu: drop topaz support from gmc8 module
  drm/amdgpu: pull topaz gmc bits into gmc_v7
  drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
  drm/amdgpu: iceland use CI based MC IP
  drm/amdgpu: move gmc7 support out of CIK dependency
  drm/amdgpu: no need to load MC firmware on fiji
  drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2
  drm/amdgpu: fix tonga smu resume
  drm/amdgpu: fix lost sync_to if scheduler is enabled.
  drm/amdgpu: call hpd_irq_event on resume
  drm/amdgpu: Fix off-by-one errors in amdgpu_vm_bo_map
  drm/vmwgfx: respect 'nomodeset'
  drm/vmwgfx: Fix a width / pitch mismatch on framebuffer updates
  drm/vmwgfx: Fix an incorrect lock check
  virtio_pci: fix use after free on release
  virtio_balloon: fix race between migration and ballooning
  virtio_balloon: fix race by fill and leak
  regulator: mt6311: MT6311_REGULATOR needs to select REGMAP_I2C
  regulator: axp20x: Fix GPIO LDO enable value for AXP22x
  clk: exynos: use irqsave version of spin_lock to avoid deadlock with irqs
  cxl: use correct operator when writing pcie config space values
  sparc64: fix incorrect sign extension in sys_sparc64_personality
  EDAC, mc_sysfs: Fix freeing bus' name
  EDAC: Robustify workqueues destruction
  MIPS: Fix buffer overflow in syscall_get_arguments()
  MIPS: Fix some missing CONFIG_CPU_MIPSR6 #ifdefs
  MIPS: hpet: Choose a safe value for the ETIME check
  MIPS: Loongson-3: Fix SMP_ASK_C0COUNT IPI handler
  Revert "MIPS: Fix PAGE_MASK definition"
  cputime: Prevent 32bit overflow in time[val|spec]_to_cputime()
  time: Avoid signed overflow in timekeeping_get_ns()
  Bluetooth: 6lowpan: Fix handling of uncompressed IPv6 packets
  Bluetooth: 6lowpan: Fix kernel NULL pointer dereferences
  Bluetooth: Fix incorrect removing of IRKs
  Bluetooth: Add support of Toshiba Broadcom based devices
  Bluetooth: Use continuous scanning when creating LE connections
  Drivers: hv: vmbus: Fix a Host signaling bug
  tools: hv: vss: fix the write()'s argument: error -> vss_msg
  mmc: sdhci: Allow override of get_cd() called from sdhci_request()
  mmc: sdhci: Allow override of mmc host operations
  mmc: sdhci-pci: Fix card detect race for Intel BXT/APL
  mmc: pxamci: fix again read-only gpio detection polarity
  mmc: sdhci-acpi: Fix card detect race for Intel BXT/APL
  mmc: mmci: fix an ages old detection error
  mmc: core: Enable tuning according to the actual timing
  mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off()
  mmc: mmc: Fix incorrect use of driver strength switching HS200 and HS400
  mmc: sdio: Fix invalid vdd in voltage switch power cycle
  mmc: sdhci: Fix DMA descriptor with zero data length
  mmc: sdhci-pci: Do not default to 33 Ohm driver strength for Intel SPT
  mmc: usdhi6rol0: handle NULL data in timeout
  clockevents/tcb_clksrc: Prevent disabling an already disabled clock
  posix-clock: Fix return code on the poll method's error path
  irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1
  irqchip/atmel-aic: Fix wrong bit operation for IRQ priority
  irqchip/mxs: Add missing set_handle_irq()
  irqchip/omap-intc: Add support for spurious irq handling
  coresight: checking for NULL string in coresight_name_match()
  dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths
  dm snapshot: fix hung bios when copy error occurs
  dm space map metadata: remove unused variable in brb_pop()
  tda1004x: only update the frontend properties if locked
  vb2: fix a regression in poll() behavior for output,streams
  gspca: ov534/topro: prevent a division by 0
  si2157: return -EINVAL if firmware blob is too big
  media: dvb-core: Don't force CAN_INVERSION_AUTO in oneshot mode
  rc: sunxi-cir: Initialize the spinlock properly
  namei: ->d_inode of a pinned dentry is stable only for positives
  mei: validate request value in client notify request ioctl
  mei: fix fasync return value on error
  rtlwifi: rtl8723be: Fix module parameter initialization
  rtlwifi: rtl8188ee: Fix module parameter initialization
  rtlwifi: rtl8192se: Fix module parameter initialization
  rtlwifi: rtl8723ae: Fix initialization of module parameters
  rtlwifi: rtl8192de: Fix incorrect module parameter descriptions
  rtlwifi: rtl8192ce: Fix handling of module parameters
  rtlwifi: rtl8192cu: Add missing parameter setup
  rtlwifi: rtl_pci: Fix kernel panic
  locks: fix unlock when fcntl_setlk races with a close
  um: link with -lpthread
  uml: fix hostfs mknod()
  uml: flush stdout before forking
  s390/fpu: signals vs. floating point control register
  s390/compat: correct restore of high gprs on signal return
  s390/dasd: fix performance drop
  s390/dasd: fix refcount for PAV reassignment
  s390/dasd: prevent incorrect length error under z/VM after PAV changes
  s390: fix normalization bug in exception table sorting
  btrfs: initialize the seq counter in struct btrfs_device
  Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots
  Btrfs: fix transaction handle leak on failure to create hard link
  Btrfs: fix number of transaction units required to create symlink
  Btrfs: send, don't BUG_ON() when an empty symlink is found
  btrfs: statfs: report zero available if metadata are exhausted
  Btrfs: igrab inode in writepage
  Btrfs: add missing brelse when superblock checksum fails
  KVM: s390: fix memory overwrites when vx is disabled
  s390/kvm: remove dependency on struct save_area definition
  clocksource/drivers/vt8500: Increase the minimum delta
  genirq: Validate action before dereferencing it in handle_irq_event_percpu()
  mm: numa: quickly fail allocations for NUMA balancing on full nodes
  mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
  ocfs2: unlock inode if deleting inode from orphan fails
  drm/i915: shut up gen8+ SDE irq dmesg noise
  iw_cxgb3: Fix incorrectly returning error on success
  spi: omap2-mcspi: Prevent duplicate gpio_request
  drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE
  USB: option: add "4G LTE usb-modem U901"
  USB: option: add support for SIM7100E
  USB: cp210x: add IDs for GE B650V3 and B850V3 boards
  usb: dwc3: Fix assignment of EP transfer resources
  can: ems_usb: Fix possible tx overflow
  dm thin: fix race condition when destroying thin pool workqueue
  bcache: Change refill_dirty() to always scan entire disk if necessary
  bcache: prevent crash on changing writeback_running
  bcache: allows use of register in udev to avoid "device_busy" error.
  bcache: unregister reboot notifier if bcache fails to unregister device
  bcache: fix a leak in bch_cached_dev_run()
  bcache: clear BCACHE_DEV_UNLINK_DONE flag when attaching a backing device
  bcache: Add a cond_resched() call to gc
  bcache: fix a livelock when we cause a huge number of cache misses
  lib/ucs2_string: Correct ucs2 -> utf8 conversion
  efi: Add pstore variables to the deletion whitelist
  efi: Make efivarfs entries immutable by default
  efi: Make our variable validation list include the guid
  efi: Do variable name validation tests in utf8
  efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
  lib/ucs2_string: Add ucs2 -> utf8 helper functions
  ARM: 8457/1: psci-smp is built only for SMP
  drm/gma500: Use correct unref in the gem bo create function
  devm_memremap: Fix error value when memremap failed
  KVM: s390: fix guest fprs memory leak
  arm64: errata: Add -mpc-relative-literal-loads to build flags
  ARM: debug-ll: fix BCM63xx entry for multiplatform
  ext4: fix bh->b_state corruption
  sctp: Fix port hash table size computation
  unix_diag: fix incorrect sign extension in unix_lookup_by_ino
  tipc: unlock in error path
  rtnl: RTM_GETNETCONF: fix wrong return value
  IFF_NO_QUEUE: Fix for drivers not calling ether_setup()
  tcp/dccp: fix another race at listener dismantle
  route: check and remove route cache when we get route
  net_sched fix: reclassification needs to consider ether protocol changes
  pppoe: fix reference counting in PPPoE proxy
  l2tp: Fix error creating L2TP tunnels
  net/mlx4_en: Avoid changing dev->features directly in run-time
  net/mlx4_en: Choose time-stamping shift value according to HW frequency
  net/mlx4_en: Count HW buffer overrun only once
  qmi_wwan: add "4G LTE usb-modem U901"
  tcp: md5: release request socket instead of listener
  tipc: fix premature addition of node to lookup table
  af_unix: Guard against other == sk in unix_dgram_sendmsg
  af_unix: Don't set err in unix_stream_read_generic unless there was an error
  ipv4: fix memory leaks in ip_cmsg_send() callers
  bonding: Fix ARP monitor validation
  bpf: fix branch offset adjustment on backjumps after patching ctx expansion
  flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen
  net: Copy inner L3 and L4 headers as unaligned on GRE TEB
  sctp: translate network order to host order when users get a hmacid
  enic: increment devcmd2 result ring in case of timeout
  tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs
  net:Add sysctl_max_skb_frags
  tcp: do not drop syn_recv on all icmp reports
  unix: correctly track in-flight fds in sending process user_struct
  ipv6: fix a lockdep splat
  ipv6: addrconf: Fix recursive spin lock call
  ipv6/udp: use sticky pktinfo egress ifindex on connect()
  ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()
  tcp: beware of alignments in tcp_get_info()
  switchdev: Require RTNL mutex to be held when sending FDB notifications
  inet: frag: Always orphan skbs inside ip_defrag()
  tipc: fix connection abort during subscription cancel
  net: dsa: fix mv88e6xxx switches
  sctp: allow setting SCTP_SACK_IMMEDIATELY by the application
  pptp: fix illegal memory access caused by multiple bind()s
  af_unix: fix struct pid memory leak
  tcp: fix NULL deref in tcp_v4_send_ack()
  lwt: fix rx checksum setting for lwt devices tunneling over ipv6
  tunnels: Allow IPv6 UDP checksums to be correctly controlled.
  net: dp83640: Fix tx timestamp overflow handling.
  gro: Make GRO aware of lightweight tunnels.
  af_iucv: Validate socket address length in iucv_sock_bind()

Conflicts:
	arch/arm64/Makefile
	arch/arm64/include/asm/cacheflush.h
	drivers/mmc/host/sdhci.c
	drivers/usb/dwc3/ep0.c
	drivers/usb/dwc3/gadget.c
	kernel/module.c
	sound/core/pcm_compat.c

CRs-Fixed: 1010239
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
Change-Id: I41a28636fc9ad91f9d979b191784609476294cdf
2016-07-12 11:40:49 -07:00
Vinayak Menon
7da300d6a3 mm: ksm: avoid trageted reclaim of ksm pages
Per Process Reclaim tries to avoid reclaiming of
shared pages by passing the target VMA. For anon and
file pages this works, but not for KSM. This is because
vma_address(page) will not return the intended value
resulting in crashes like this

[From 3.10 kernel]
kernel BUG at kernel/mm/rmap.c:534!
(vma_address+0x28/0x2c) from [<c01eff94>] (try_to_unmap_ksm+0x4c/0x170)
(try_to_unmap_ksm+0x4c/0x170) from [<c01e4834>] (try_to_unmap+0x34/0xa4)
(try_to_unmap+0x34/0xa4) from [<c01c8e64>] (shrink_page_list+0x3f0/0xa34)
(shrink_page_list+0x3f0/0xa34) from [<c01c9688>] (reclaim_pages_from_list)
(reclaim_pages_from_list+0xb0/0x100) from [<c0240ed0>] (reclaim_pte_range)
(reclaim_pte_range+0xf0/0x164) from [<c01e79a4>] (walk_page_range+0x1d0)
(walk_page_range+0x1d0/0x260) from [<c0241ccc>] (reclaim_task_anon+0xb0)
(reclaim_task_anon+0xb0/0x114) from [<c01f8d98>] (swap_fn+0x220/0x460)
(swap_fn+0x220/0x460) from [<c0133a38>] (process_one_work+0x294/0x430)
(process_one_work+0x294/0x430) from [<c0134720>] (worker_thread)
(worker_thread+0x224/0x358) from [<c01390d4>] (kthread+0xa0/0xac)
(kthread+0xa0/0xac) from [<c0105f38>] (ret_from_fork+0x14/0x3c)

CRs-Fixed: 984947
Change-Id: I5208fb68372f7af72868e39399bf545fb7b774f3
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-29 15:12:23 -07:00
Shiraz Hashim
1be1715f14 mm: process_reclaim: use unbounded cpu workqueue
It is observed that in some cases process reclaim work
doesn't get chance to run due to presence of RT scheduled
on the same CPU. This is leading to user space freeze and
a live-lock situation where RT itself is looping for a
page to be present in swap cache while process reclaim
work is unable to schedule and do the same.

Schedule process reclaim work on unbounded cpu workqueue
so that the work has opportunity to be scheduled on to
other cpu.

Change-Id: I6852f7e8d0a344ab5631b188627263f11414f27e
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-29 15:12:09 -07:00
Vinayak Menon
9ff0477006 mm: process_reclaim: do not iterate over stale task structs
swap_fn iterates through the threads of selected tasks after
a rcu_read_unlock which is wrong. But we can't extend the
rcu_read_lock since it will result in severe performance
issues. So better avoid iterating over the threads. Just
lock the group leader and use it further.

Change-Id: I36269b1b6619315f33f6f3b49ec73571a66796f2
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-29 15:11:56 -07:00
Vinayak Menon
1ad1a93af4 mm: process_reclaim: fix reclaim skip on low efficiency
The logic used to skip reclaim on low efficiency results
in process reclaim not triggering at all. Fix it by
properly handling the skip_reclaim atomic variable.

Change-Id: I119097bb9b1baf8f3e8d4afa0a6dc2c30c0de6e7
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-29 15:11:36 -07:00
Vinayak Menon
9caa3b38bb mm: process reclaim: vmpressure based process reclaim
With this patch, anon pages of inactive tasks can be reclaimed,
depending on memory pressure. Memory pressure is detected
using vmpressure events. 'N' best tasks in terms of anon
size is selected and pages proportional to their tasksize
is reclaimed. The total number of pages reclaimed at each
run of the swap work, can be tuned from userspace, the
default being SWAP_CLUSTER_MAX * 32.

The patch also adds tracepoints to debug and tune the
feature.

echo 1 > /sys/module/process_reclaim/parameters/enable_process_reclaim
to enable the feature.

echo <pages> > /sys/module/process_reclaim/parameters/per_swap_size,
to set the number of pages reclaimed in each scan.

/sys/module/process_reclaim/parameters/reclaim_avg_efficiency, provides
the average efficiency (scan to reclaim ratio) of the algorithm.

/sys/module/process_reclaim/parameters/swap_eff_win, to set the window
period (in unit of number of times reclaim is triggered) to detect
low efficiency runs.

/sys/module/process_reclaim/parameters/swap_opt_eff, to set the optimal
efficiency threshold for low efficiency detection.

Change-Id: I895986f10c997d1715761eaaadc4bbbee60db9d2
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-29 15:11:26 -07:00
Mel Gorman
5812a73a14 mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask
Commit d0164adc89 ("mm, page_alloc: distinguish between being unable to
sleep, unwilling to sleep and avoiding waking kswapd") modified __GFP_WAIT
to explicitly identify the difference between atomic callers and those
that were unwilling to sleep.  Later the definition was removed entirely.

The GFP_RECLAIM_MASK is the set of flags that affect watermark checking
and reclaim behaviour but __GFP_ATOMIC was never added. Without it, atomic
users of the slab allocator strip the __GFP_ATOMIC flag and cannot access
the page allocator atomic reserves.  This patch addresses the problem.

The user-visible impact depends on the workload but potentially atomic
allocations unnecessarily fail without this path.

Change-Id: Ieac0932d146f7fd992db9fd834b0e9aa3822f891
Link: http://lkml.kernel.org/r/20160610093832.GK2527@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Marcin Wojtas <mw@semihalf.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Git-Commit: 843f65ccbb2273430b57ae135ccd26dddee05be7
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-24 15:05:16 -07:00
Minchan Kim
384da356db mm: Support address range reclaim
This patch adds address range reclaim of a process.
The requirement is following as,

Like webkit1, it uses a address space for handling multi tabs.
IOW, it uses *one* process model so all tabs shares address space
of the process. In such scenario, per-process reclaim is rather
coarse-grained so this patch supports more fine-grained reclaim
for being able to reclaim target address range of the process.
For reclaim target range, you should use following format.

	echo [addr] [size-byte] > /proc/pid/reclaim
The addr should be page-aligned.

So now reclaim konb's interface is following as.

echo file > /proc/pid/reclaim
	reclaim file-backed pages only
echo anon > /proc/pid/reclaim
	reclaim anonymous pages only
echo all > /proc/pid/reclaim
	reclaim all pages
echo 0x100000 8K > /proc/pid/reclaim
	reclaim pages in (0x100000 - 0x102000)

Change-Id: I111131d31be1cfcfa246617b634a9a8bc4078098
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 08:39:01
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-22 14:44:20 -07:00
Minchan Kim
06de050ac6 mm: Enhance per process reclaim to consider shared pages
Some pages could be shared by several processes. (ex, libc)
In case of that, it's too bad to reclaim them from the beginnig.

This patch causes VM to keep them on memory until last task
try to reclaim them so shared pages will be reclaimed only if
all of task has gone swapping out.

This feature doesn't handle non-linear mapping on ramfs because
it's very time-consuming and doesn't make sure of reclaiming and
not common.

Change-Id: I7e5f34f2e947f5db6d405867fe2ad34863ca40f7
Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:27
[vinmenon@codeaurora.org: trivial merge conflict fixes + changes
to make the patch work with 3.18 kernel]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-22 14:43:57 -07:00
Minchan Kim
a4e92011d4 mm: Remove shrink_page
By previous patch, shrink_page_list can handle pages from
multiple zone so let's remove shrink_page.

Change-Id: I3526377aa6ee6142b8f3ec63396e7ada1e442505
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 22 Apr 2013 17:45:03
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-22 14:43:32 -07:00
Minchan Kim
7d65fdc5d7 mm: make shrink_page_list with pages work from multiple zones
Shrink_page_list expects all pages come from a same zone
but it's too limited to use.

This patch removes the dependency so next patch can use
shrink_page_list with pages from multiple zones.

Change-Id: I34469b7f0a79f2b79e30e40033ba8b3e1dd5f2d0
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:25
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-22 14:43:18 -07:00
Minchan Kim
ecc070a294 mm: Per process reclaim
These day, there are many platforms avaiable in the embedded market
and they are smarter than kernel which has very limited information
about working set so they want to involve memory management more heavily
like android's lowmemory killer and ashmem or recent many lowmemory
notifier.

One of the simple imagine scenario about userspace's intelligence is that
platform can manage tasks as forground and backgroud so it would be
better to reclaim background's task pages for end-user's *responsibility*
although it has frequent referenced pages.

This patch adds new knob "reclaim under proc/<pid>/" so task manager
can reclaim any target process anytime, anywhere. It could give another
method to platform for using memory efficiently.

It can avoid process killing for getting free memory, which was really
terrible experience because I lost my best score of game I had ever
after I switch the phone call while I enjoyed the game.

Reclaim file-backed pages only.
	echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
	echo anon > /proc/PID/reclaim
Reclaim all pages
	echo all > /proc/PID/reclaim

Change-Id: Iabdb7bc2ef3dc4d94e3ea005fbe18f4cd06739ab
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:24
[vinmenon@codeaurora.org: trivial merge conflict fixes,
and minor tweak of the commit msg]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-22 14:43:06 -07:00
Minchan Kim
b3d3ad2c8f mm: prevent to write out dirty page in CMA by may_writepage
Now, local variable references in shrink_page_list is
PAGEREF_RECLAIM_CLEAN as default. It is for preventing to reclaim
dirty pages when CMA try to migrate pages.
Strictly speaking, we don't need it because CMA already didn't allow
to write out by .may_writepage = 0 in reclaim_clean_pages_from_list.

Morever, it has a problem to prevent anonymous pages's swap out when
we use force_reclaim = true in shrink_page_list(ex, per process reclaim
can do it)

So this patch makes references's default value to PAGEREF_RECLAIM
and declare .may_writepage = 0 of scan_control in CMA part to make
code more clear.

Change-Id: I5edc3c955d106ecebc4949ce27daf5b7b7a18089
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Reported-by: Minkyung Kim <minkyung88@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:23
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-22 14:42:54 -07:00
Vinayak Menon
0b8c4c2cee mm: fix cma accounting in zone_watermark_ok
Some cases were reported on 3.18 where atomic unmovable allocations
of order 2 fails, but kswapd does not wakeup. And in such cases it
was seen that, when zone_watermark_ok check is  performed to decide
whether to wake up kswapd, there were lot of CMA pages of order 2 and
above. This makes the watermark check succeed resulting in kswapd not
being woken up. But since these atomic unmovable allocations can't come
from CMA region, further atomic allocations keeps failing, without
kswapd trying to reclaim. Usually concurrent movable allocations result
in reclaim and improves the situtation, but the case reported was from
a network test which was resulting in only atomic skb allocations being
attempted. On 3.18 this was fixed by adding a cma free page counter and
accouting the cma free pages properly in watermark calculations.

Later this issue was indirectly fixed by the commit "mm, page_alloc:
only enforce watermarks for order-0 allocations".

But the commit "mm: add cma pcp list" brought the problem back because
it includes MIGRATE_CMA within MIGRATE_PCPTYPES, and thus watermark
check erroneously returns success for !ALLOC_CMA by finding free pages
in cma free list.

Change-Id: Id0e48b5c2f9deea93c5875c10d5ec72bd360df5f
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-13 19:06:15 -07:00
Vlastimil Babka
9ea305710e mm/page_alloc: prevent merging between isolated and other pageblocks
Hanjun Guo has reported that a CMA stress test causes broken accounting of
CMA and free pages:

> Before the test, I got:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal:         204800 kB
> CmaFree:          195044 kB
>
>
> After running the test:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal:         204800 kB
> CmaFree:         6602584 kB
>
> So the freed CMA memory is more than total..
>
> Also the the MemFree is more than mem total:
>
> -bash-4.3# cat /proc/meminfo
> MemTotal:       16342016 kB
> MemFree:        22367268 kB
> MemAvailable:   22370528 kB

Laura Abbott has confirmed the issue and suspected the freepage accounting
rewrite around 3.18/4.0 by Joonsoo Kim. Joonsoo had a theory that this is
caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
pageblocks:

> CMA isolates MAX_ORDER aligned blocks, but, during the process,
> partialy isolated block exists. If MAX_ORDER is 11 and
> pageblock_order is 9, two pageblocks make up MAX_ORDER
> aligned block and I can think following scenario because pageblock
> (un)isolation would be done one by one.
>
> (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
> MIGRATE_ISOLATE, respectively.
>
> CC -> IC -> II (Isolation)
> II -> CI -> CC (Un-isolation)
>
> If some pages are freed at this intermediate state such as IC or CI,
> that page could be merged to the other page that is resident on
> different type of pageblock and it will cause wrong freepage count.

This was supposed to be prevented by CMA operating on MAX_ORDER blocks, but
since it doesn't hold the zone->lock between pageblocks, a race window does
exist.

It's also likely that unexpected merging can occur between MIGRATE_ISOLATE
and non-CMA pageblocks. This should be prevented in __free_one_page() since
commit 3c605096d3 ("mm/page_alloc: restrict max order of merging on isolated
pageblock"). However, we only check the migratetype of the pageblock where
buddy merging has been initiated, not the migratetype of the buddy pageblock
(or group of pageblocks) which can be MIGRATE_ISOLATE.

Joonsoo has suggested checking for buddy migratetype as part of
page_is_buddy(), but that would add extra checks in allocator hotpath and
bloat-o-meter has shown significant code bloat (the function is inline).

This patch reduces the bloat at some expense of more complicated code. The
buddy-merging while-loop in __free_one_page() is initially bounded to
pageblock_border and without any migratetype checks. The checks are placed
outside, bumping the max_order if merging is allowed, and returning to the
while-loop with a statement which can't be possibly considered harmful.

This fixes the accounting bug and also removes the arguably weird state in the
original commit 3c605096d3 where buddies could be left unmerged.

Fixes: 3c605096d3 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
Link: https://lkml.org/lkml/2016/3/2/280
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Debugged-by: Laura Abbott <labbott@redhat.com>
Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>	[3.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I768a9c4886aa3fe2e827aba682f67bac2dba6f71
Git-commit: d9dddbf556674bf125ecd925b24e43a5cf2a568a
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
[vinemenon@codeaurora.org: fix trivial merge conflicts]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
2016-06-07 16:05:12 -07:00
Vinayak Menon
6cc2fdb17c mm: zcache: fix merge issues
Fix 4.4 merge issues in zero page support, and add the
missing label.

Change-Id: I4bed7add011e0c9b0e148d1b44132ba1873cf607
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-06-07 11:57:44 -07:00
Sarangdhar Joshi
7ab05c20ad arm64: Add support for app specific settings
Add support to provide an interface that can be used from
userspace to decide whether app specific settings need to
be applied / cleared when particular processes are running.

CRs-Fixed: 981519 997757
Change-Id: Id81f8b70de64f291a8586150f4d2c7c8f8b4420f
Signed-off-by: Sarangdhar Joshi <spjoshi@codeaurora.org>
[satyap@codeaurora.org: trivial merge conflict resolution and pull
fixes for CR: 997757]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2016-06-07 11:53:27 -07:00
Vinayak Menon
9e6c849ebb mm: zcache: remove __GFP_NO_KSWAPD
Remove __GFP_NO_KSWAPD. It no longer exist.

Change-Id: I8b50a06bdae050b3a3c47b80e21d0d2edf18b7c5
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-05-31 15:27:23 -07:00
Bob Liu
5248c3b4e4 mm: add WasActive page flag
Zcache could be ineffective if the compressed memory pool is full with
compressed inactive file pages and most of them will be never used again.

So we pick up pages from active file list only, those pages would probably
be accessed again. Compress them in memory can reduce the latency
significantly compared with rereading from disk.

When a file page is shrunk from active file list to inactive file list,
PageActive flag is also cleared.
So adding an extra WasActive page flag for zcache to know whether the
file page was shrunk from the active list.

Change-Id: Ida1f4db17075d1f6f825ef7ce2b3bae4eb799e3f
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Patch-mainline: linux-mm @ 2013-08-06 11:36:17
[vinmenon@codeaurora.org: trivial merge conflict fixes, checkpatch fixes,
fix the definitions of was_active page flag so that it does not create
compile time errors with CONFIG_CLEANCACHE disabled. Also remove the
unnecessary use of PG_was_active in PAGE_FLAGS_CHECK_AT_PREP. Since
was_active is a requirement for zcache, make the definitions dependent on
CONFIG_ZCACHE rather than CONFIG_CLEANCACHE.]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-05-31 15:27:11 -07:00
Shiraz Hashim
9345b8f94b mm/memblock: disable local irqs while late memblock changes
There is a possibility of deadlock while doing late
memblock configuration as only preemption is disabled and
irq can be serviced while seqlock is held and in turn
memblock_is_memory can be called from irq context thus
trying to claim seqlock again. Following call stack was
observed,

[<c02136d4>] memblock_search+0x1c
[<c021487c>] memblock_is_memory+0x10
[<c01e4684>] free_kmem_pages+0x44
[<c0121c04>] free_task+0x28
[<c0178b30>] rcu_process_callbacks+0x488
[<c0127e30>] __do_softirq+0x150
[<c0128284>] irq_exit+0x84
[<c010c11c>] handle_IPI+0x12c
[<c0100588>] gic_handle_irq+0x70
[<c0e9efc0>] __irq_svc+0x40
[<c0214a8c>] memblock_region_resize_late_end+0xc
[<c057010c>] removed_alloc+0x110
[<c04ab2c4>] pil_boot+0x2b0
[<c04b7700>] __subsystem_get+0xe0
[<c04b79cc>] subsys_device_open+0x74
[<c0229f20>] chrdev_open+0x12c
[<c02246e4>] do_dentry_open+0x280
[<c0232698>] do_last+0x9a4
[<c0232b8c>] path_openat+0x23c
[<c0233bf0>] do_filp_open+0x2c

Fix it by disabling irqs during late memblock
configuration. It is a one time operation which changes
memblock related data structures and doesn't carry
performance impact.

CRs-Fixed: 1003890
Change-Id: I3ff1894f0c80580920b1971cda357915665b5054
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
2016-05-31 15:26:50 -07:00
Vinayak Menon
350c68c124 mm: swap_ratio: bail out if there aren't any other swap device
It is pointless to calculate the swap ratio when there is only
one swap device in the group. Moreover the existing code would
result in a spinlock recursion because of not taking this into
consideration. Interestingly, this check is already performed
in swap_ratio_slow by this piece of code

if (&(*si)->avail_list == plist_last(&swap_avail_head)) {
	/* just to make skip work */
	n = *si;
	ret = -ENODEV;
	goto skip;
}

But there is window where we drop the swap_avail_lock before
invoking swap_ratio() and take it back again in swap_ratio_slow.
In this period the si can get removed from swap_avail_head,
resulting in the failure of above logic. So recheck again.

Similarly, bail out from swap_ratio() if the sysctl is disabled,
and thus avoiding overhead of taking unnecessary locks.

Change-Id: I81a9dd61d24b7da55d5341c48a1f71d2b4b1978d
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-05-31 15:23:38 -07:00
Vinayak Menon
44bd107fc9 mm: vmscan: fix the page state calculation in too_many_isolated
It is observed that sometimes multiple tasks get blocked in
the congestion_wait loop below, in shrink_inactive_list.

(__schedule) from [<c0a03328>]
(schedule_timeout) from [<c0a04940>]
(io_schedule_timeout) from [<c01d585c>]
(congestion_wait) from [<c01cc9d8>]
(shrink_inactive_list) from [<c01cd034>]
(shrink_zone) from [<c01cdd08>]
(try_to_free_pages) from [<c01c442c>]
(__alloc_pages_nodemask) from [<c01f1884>]
(new_slab) from [<c09fcf60>]
(__slab_alloc) from [<c01f1a6c>]

In one such instance, zone_page_state(zone, NR_ISOLATED_FILE)
had returned 14, zone_page_state(zone, NR_INACTIVE_FILE)
returned 92, and the gfp_flag was GFP_KERNEL which resulted
in too_many_isolated to return true. But one of the CPU pageset
vmstat diff had NR_ISOLATED_FILE as -14. As there weren't any more
update to per cpu pageset, the threshold wasn't met, and the
tasks were blocked in the congestion wait.

This patch uses zone_page_state_snapshot instead, but restricts
its usage to avoid performance penalty.

Change-Id: Iec767a548e524729c7ed79a92fe4718cdd08ce69
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-05-31 15:23:28 -07:00
Sergey Senozhatsky
1d77f0a51c zsmalloc: fix zs_can_compact() integer overflow
commit 44f43e99fe70833058482d183e99fdfd11220996 upstream.

zs_can_compact() has two race conditions in its core calculation:

unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
				zs_stat_get(class, OBJ_USED);

1) classes are not locked, so the numbers of allocated and used
   objects can change by the concurrent ops happening on other CPUs
2) shrinker invokes it from preemptible context

Depending on the circumstances, thus, OBJ_ALLOCATED can become
less than OBJ_USED, which can result in either very high or
negative `total_scan' value calculated later in do_shrink_slab().

do_shrink_slab() has some logic to prevent those cases:

 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62
 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62

However, due to the way `total_scan' is calculated, not every
shrinker->count_objects() overflow can be spotted and handled.
To demonstrate the latter, I added some debugging code to do_shrink_slab()
(x86_64) and the results were:

 vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
 vmscan: but total_scan > 0: 92679974445502
 vmscan: resulting total_scan: 92679974445502
[..]
 vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615]
 vmscan: but total_scan > 0: 22634041808232578
 vmscan: resulting total_scan: 22634041808232578

Even though shrinker->count_objects() has returned an overflowed value,
the resulting `total_scan' is positive, and, what is more worrisome, it
is insanely huge. This value is getting used later on in
shrinker->scan_objects() loop:

        while (total_scan >= batch_size ||
               total_scan >= freeable) {
                unsigned long ret;
                unsigned long nr_to_scan = min(batch_size, total_scan);

                shrinkctl->nr_to_scan = nr_to_scan;
                ret = shrinker->scan_objects(shrinker, shrinkctl);
                if (ret == SHRINK_STOP)
                        break;
                freed += ret;

                count_vm_events(SLABS_SCANNED, nr_to_scan);
                total_scan -= nr_to_scan;

                cond_resched();
        }

`total_scan >= batch_size' is true for a very-very long time and
'total_scan >= freeable' is also true for quite some time, because
`freeable < 0' and `total_scan' is large enough, for example,
22634041808232578. The only break condition, in the given scheme of
things, is shrinker->scan_objects() == SHRINK_STOP test, which is a
bit too weak to rely on, especially in heavy zsmalloc-usage scenarios.

To fix the issue, take a pool stat snapshot and use it instead of
racy zs_stat_get() calls.

Link: http://lkml.kernel.org/r/20160509140052.3389-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-18 17:06:44 -07:00
Liam Mark
2ef8479273 mm: cma: sleep between retries in cma_alloc
Port support from 3.10 for retrying cma allocations
to 3.18 to help resolve cma allocation failures.

It was observed that CMA pages are sometimes getting
pinned down by BG processes scheduled out in their exit
path. Since BG processes have lower priority they end up
getting less time slice by scheduler there by consuming
more time to free up CMA pages.

Also when a process is being forked copy_one_pte
may create copy-on-write mappings, when this is done
the page _count and page _mapcount are each
incremented sequentially. If the process is context
switched out after incrementing the _count but before
incrementing the _mapcount then the page will appear
temporarily pinned.

So instead of failing to allocate and directly
returning an error on the CMA allocation path we do 2
retries, with sleeps, to give the system an opportunity
to unpin any pinned pages.

Change-Id: I022a9341f8ee44f281c7cb34769695843e97d684
Signed-off-by: Susheel Khiani <skhiani@codeaurora.org>
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-05-18 13:38:15 -07:00
Howard Cochran
4bc9468f16 writeback: Fix performance regression in wb_over_bg_thresh()
commit 74d369443325063a5f0260e63971decb950fd8fa upstream.

Commit 947e9762a8 ("writeback: update wb_over_bg_thresh() to use
wb_domain aware operations") unintentionally changed this function's
meaning from "are there more dirty pages than the background writeback
threshold" to "are there more dirty pages than the writeback threshold".
The background writeback threshold is typically half of the writeback
threshold, so this had the effect of raising the number of dirty pages
required to cause a writeback worker to perform background writeout.

This can cause a very severe performance regression when a BDI uses
BDI_CAP_STRICTLIMIT because balance_dirty_pages() and the writeback worker
can now disagree on whether writeback should be initiated.

For example, in a system having 1GB of RAM, a single spinning disk, and a
"pass-through" FUSE filesystem mounted over the disk, application code
mmapped a 128MB file on the disk and was randomly dirtying pages in that
mapping.

Because FUSE uses strictlimit and has a default max_ratio of only 1%, in
balance_dirty_pages, thresh is ~200, bg_thresh is ~100, and the
dirty_freerun_ceiling is the average of those, ~150. So, it pauses the
dirtying processes when we have 151 dirty pages and wakes up a background
writeback worker. But the worker tests the wrong threshold (200 instead of
100), so it does not initiate writeback and just returns.

Thus, balance_dirty_pages keeps looping, sleeping and then waking up the
worker who will do nothing. It remains stuck in this state until the few
dirty pages that we have finally expire and we write them back for that
reason. Then the whole process repeats, resulting in near-zero throughput
through the FUSE BDI.

The fix is to call the parameterized variant of wb_calc_thresh, so that the
worker will do writeback if the bg_thresh is exceeded which was the
behavior before the referenced commit.

Fixes: 947e9762a8 ("writeback: update wb_over_bg_thresh() to use wb_domain aware operations")
Signed-off-by: Howard Cochran <hcochran@kernelspring.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Tested-by Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-11 11:21:18 +02:00
Jason Baron
24b8a175a6 mm: update min_free_kbytes from khugepaged after core initialization
commit bc22af74f271ef76b2e6f72f3941f91f0da3f5f8 upstream.

Khugepaged attempts to raise min_free_kbytes if its set too low.
However, on boot khugepaged sets min_free_kbytes first from
subsys_initcall(), and then the mm 'core' over-rides min_free_kbytes
after from init_per_zone_wmark_min(), via a module_init() call.

Khugepaged used to use a late_initcall() to set min_free_kbytes (such
that it occurred after the core initialization), however this was
removed when the initialization of min_free_kbytes was integrated into
the starting of the khugepaged thread.

The fix here is simply to invoke the core initialization using a
core_initcall() instead of module_init(), such that the previous
initialization ordering is restored.  I didn't restore the
late_initcall() since start_stop_khugepaged() already sets
min_free_kbytes via set_recommended_min_free_kbytes().

This was noticed when we had a number of page allocation failures when
moving a workload to a kernel with this new initialization ordering.  On
an 8GB system this restores min_free_kbytes back to 67584 from 11365
when CONFIG_TRANSPARENT_HUGEPAGE=y is set and either
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y or
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.

Fixes: 79553da293 ("thp: cleanup khugepaged startup")
Signed-off-by: Jason Baron <jbaron@akamai.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-11 11:21:17 +02:00
Dan Streetman
851375cc49 mm/zswap: provide unique zpool name
commit 32a4e169039927bfb6ee9f0ccbbe3a8aaf13a4bc upstream.

Instead of using "zswap" as the name for all zpools created, add an
atomic counter and use "zswap%x" with the counter number for each zpool
created, to provide a unique name for each new zpool.

As zsmalloc, one of the zpool implementations, requires/expects a unique
name for each pool created, zswap should provide a unique name.  The
zsmalloc pool creation does not fail if a new pool with a conflicting
name is created, unless CONFIG_ZSMALLOC_STAT is enabled; in that case,
zsmalloc pool creation fails with -ENOMEM.  Then zswap will be unable to
change its compressor parameter if its zpool is zsmalloc; it also will
be unable to change its zpool parameter back to zsmalloc, if it has any
existing old zpool using zsmalloc with page(s) in it.  Attempts to
change the parameters will result in failure to create the zpool.  This
changes zswap to provide a unique name for each zpool creation.

Fixes: f1c54846ee ("zswap: dynamic pool creation")
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Dan Streetman <dan.streetman@canonical.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-11 11:21:14 +02:00
Hugh Dickins
d27e2ddc40 mm, cma: prevent nr_isolated_* counters from going negative
commit 14af4a5e9b26ad251f81c174e8a43f3e179434a5 upstream.

/proc/sys/vm/stat_refresh warns nr_isolated_anon and nr_isolated_file go
increasingly negative under compaction: which would add delay when
should be none, or no delay when should delay.  The bug in compaction
was due to a recent mmotm patch, but much older instance of the bug was
also noticed in isolate_migratepages_range() which is used for CMA and
gigantic hugepage allocations.

The bug is caused by putback_movable_pages() in an error path
decrementing the isolated counters without them being previously
incremented by acct_isolated().  Fix isolate_migratepages_range() by
removing the error-path putback, thus reaching acct_isolated() with
migratepages still isolated, and leaving putback to caller like most
other places do.

Fixes: edc2ca6124 ("mm, compaction: move pageblock checks up from isolate_migratepages_range()")
[vbabka@suse.cz: expanded the changelog]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-11 11:21:14 +02:00
Trilok Soni
1ca0992ef1 lib: memtest: Add MEMTEST_ENABLE_DEFAULT option
As of now memtest remains disabled until we specify the patterns
through the kernel command line. Some platforms have two
different configurations files (one for debug and another for
product) which can use the configuration option to enable the
memtest by default (in the debug configuration file).

CRs-Fixed: 1007344
Change-Id: I0bf7b33c3584f3d6cf5ef58dfe72be46212041da
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-05-05 15:05:52 -07:00
Minchan Kim
36abe7272a mm/hwpoison: fix wrong num_poisoned_pages accounting
commit d7e69488bd04de165667f6bc741c1c0ec6042ab9 upstream.

Currently, migration code increses num_poisoned_pages on *failed*
migration page as well as successfully migrated one at the trial of
memory-failure.  It will make the stat wrong.  As well, it marks the
page as PG_HWPoison even if the migration trial failed.  It would mean
we cannot recover the corrupted page using memory-failure facility.

This patches fixes it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:49 -07:00
Minchan Kim
87c855f150 mm: vmscan: reclaim highmem zone if buffer_heads is over limit
commit 7bf52fb891b64b8d61caf0b82060adb9db761aec upstream.

We have been reclaimed highmem zone if buffer_heads is over limit but
commit 6b4f7799c6 ("mm: vmscan: invoke slab shrinkers from
shrink_zone()") changed the behavior so it doesn't reclaim highmem zone
although buffer_heads is over the limit.  This patch restores the logic.

Fixes: 6b4f7799c6 ("mm: vmscan: invoke slab shrinkers from shrink_zone()")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:49 -07:00
Gerald Schaefer
e513b90a9a numa: fix /proc/<pid>/numa_maps for THP
commit 28093f9f34cedeaea0f481c58446d9dac6dd620f upstream.

In gather_pte_stats() a THP pmd is cast into a pte, which is wrong
because the layouts may differ depending on the architecture.  On s390
this will lead to inaccurate numa_maps accounting in /proc because of
misguided pte_present() and pte_dirty() checks on the fake pte.

On other architectures pte_present() and pte_dirty() may work by chance,
but there may be an issue with direct-access (dax) mappings w/o
underlying struct pages when HAVE_PTE_SPECIAL is set and THP is
available.  In vm_normal_page() the fake pte will be checked with
pte_special() and because there is no "special" bit in a pmd, this will
always return false and the VM_PFNMAP | VM_MIXEDMAP checking will be
skipped.  On dax mappings w/o struct pages, an invalid struct page
pointer would then be returned that can crash the kernel.

This patch fixes the numa_maps THP handling by introducing new "_pmd"
variants of the can_gather_numa_stats() and vm_normal_page() functions.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:49 -07:00
Konstantin Khlebnikov
be591a683e mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
commit 3486b85a29c1741db99d0c522211c82d2b7a56d0 upstream.

Khugepaged detects own VMAs by checking vm_file and vm_ops but this way
it cannot distinguish private /dev/zero mappings from other special
mappings like /dev/hpet which has no vm_ops and popultes PTEs in mmap.

This fixes false-positive VM_BUG_ON and prevents installing THP where
they are not expected.

Link: http://lkml.kernel.org/r/CACT4Y+ZmuZMV5CjSFOeXviwQdABAgT7T+StKfTqan9YDtgEi5g@mail.gmail.com
Fixes: 78f11a2557 ("mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:49 -07:00
Tejun Heo
52526076a5 memcg: relocate charge moving from ->attach to ->post_attach
commit 264a0ae164bc0e9144bebcd25ff030d067b1a878 upstream.

Hello,

So, this ended up a lot simpler than I originally expected.  I tested
it lightly and it seems to work fine.  Petr, can you please test these
two patches w/o the lru drain drop patch and see whether the problem
is gone?

Thanks.
------ 8< ------
If charge moving is used, memcg performs relabeling of the affected
pages from its ->attach callback which is called under both
cgroup_threadgroup_rwsem and thus can't create new kthreads.  This is
fragile as various operations may depend on workqueues making forward
progress which relies on the ability to create new kthreads.

There's no reason to perform charge moving from ->attach which is deep
in the task migration path.  Move it to ->post_attach which is called
after the actual migration is finished and cgroup_threadgroup_rwsem is
dropped.

* move_charge_struct->mm is added and ->can_attach is now responsible
  for pinning and recording the target mm.  mem_cgroup_clear_mc() is
  updated accordingly.  This also simplifies mem_cgroup_move_task().

* mem_cgroup_move_task() is now called from ->post_attach instead of
  ->attach.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Debugged-and-tested-by: Petr Mladek <pmladek@suse.com>
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Fixes: 1ed1328792 ("sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:49 -07:00
Jesper Dangaard Brouer
a4e25ff311 slub: clean up code for kmem cgroup support to kmem_cache_free_bulk
commit 376bf125ac781d32e202760ed7deb1ae4ed35d31 upstream.

This change is primarily an attempt to make it easier to realize the
optimizations the compiler performs in-case CONFIG_MEMCG_KMEM is not
enabled.

Performance wise, even when CONFIG_MEMCG_KMEM is compiled in, the
overhead is zero.  This is because, as long as no process have enabled
kmem cgroups accounting, the assignment is replaced by asm-NOP
operations.  This is possible because memcg_kmem_enabled() uses a
static_key_false() construct.

It also helps readability as it avoid accessing the p[] array like:
p[size - 1] which "expose" that the array is processed backwards inside
helper function build_detached_freelist().

Lastly this also makes the code more robust, in error case like passing
NULL pointers in the array.  Which were previously handled before commit
033745189b ("slub: add missing kmem cgroup support to
kmem_cache_free_bulk").

Fixes: 033745189b ("slub: add missing kmem cgroup support to kmem_cache_free_bulk")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:49 -07:00
Trilok Soni
bf6ceddd56 mm/page_owner: ask users about default setting of PAGE_OWNER
Since this commit 48c96a3685 ("mm/page_owner: keep track
of page owners") doesn't enable the page_owner by default
even though CONFIG_PAGE_OWNER is enabled.

Add configuration option CONFIG_PAGE_OWNER_ENABLE_DEFAULT to
allow user to enable it by default through the defconfig file.

CRs-Fixed: 1006743
Change-Id: I9b565a34e2068bf575974eaf3dc9f7820bdd7a96
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-05-03 15:53:55 -07:00
Christian Borntraeger
9194c460b8 mm/debug_pagealloc: ask users for default setting of debug_pagealloc
Since commit 031bc5743f ("mm/debug-pagealloc: make debug-pagealloc
boottime configurable") CONFIG_DEBUG_PAGEALLOC is by default not adding
any page debugging.

This resulted in several unnoticed bugs, e.g.

    https://lkml.kernel.org/g/<569F5E29.3090107@de.ibm.com>
or
    https://lkml.kernel.org/g/<56A20F30.4050705@de.ibm.com>

as this behaviour change was not even documented in Kconfig.

Let's provide a new Kconfig symbol that allows to change the default
back to enabled, e.g.  for debug kernels.  This also makes the change
obvious to kernel packagers.

Let's also change the Kconfig description for CONFIG_DEBUG_PAGEALLOC, to
indicate that there are two stages of overhead.

CRs-Fixed: 1006743
Change-Id: I52c36765837cc873877b9398371ffd840d485a81
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: ea6eabb05b26bd3d6f60b29b77a03bc61479fc0f
Git-repo: git://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-04-29 14:40:10 -07:00
Vinayak Menon
b6fb81015e mm: fix compile time error with !CONFIG_CMA
Fixes compile time failures because of not protecting
CMA related elements with CONFIG_CMA.

Change-Id: I930b7c0ffdce0f1bfc4f8a582a698be16ed44d1f
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-04-26 14:38:03 -07:00
Xishi Qiu
fb4cfc6e0a mm: fix invalid node in alloc_migrate_target()
commit 6f25a14a7053b69917e2ebea0d31dd444cd31fd5 upstream.

It is incorrect to use next_node to find a target node, it will return
MAX_NUMNODES or invalid node.  This will lead to crash in buddy system
allocation.

Fixes: c8721bbbdd ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Laura Abbott" <lauraa@codeaurora.org>
Cc: Hui Zhu <zhuhui@xiaomi.com>
Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-20 15:41:53 +09:00
Liam Mark
50050f2a1d mm: add cma pcp list
Add a cma pcp list in order to increase cma memory utilization.

Increased cma memory utilization will improve overall memory
utilization because free cma pages are ignored when memory reclaim
is done with gfp mask GFP_KERNEL.

Since most memory reclaim is done by kswapd, which uses a gfp mask
of GFP_KERNEL, by increasing cma memory utilization we are therefore
ensuring that less aggressive memory reclaim takes place.

Increased cma memory utilization will improve performance,
for example it will increase app concurrency.

Change-Id: I809589a25c6abca51f1c963f118adfc78e955cf9
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-04-13 11:11:40 -07:00
Heesub Shin
d491cf59f0 cma: redirect page allocation to CMA
CMA pages are designed to be used as fallback for movable allocations
and cannot be used for non-movable allocations. If CMA pages are
utilized poorly, non-movable allocations may end up getting starved if
all regular movable pages are allocated and the only pages left are
CMA. Always using CMA pages first creates unacceptable performance
problems. As a midway alternative, use CMA pages for certain
userspace allocations. The userspace pages can be migrated or dropped
quickly which giving decent utilization.

Change-Id: I6165dda01b705309eebabc6dfa67146b7a95c174
CRs-Fixed: 452508
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Heesub Shin <heesub.shin@samsung.com
[lauraa@codeaurora.org: Missing CONFIG_CMA guards, add commit text]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
[lmark@codeaurora.org: resolve conflicts relating to
 MIGRATE_HIGHATOMIC and some other trivial merge conflicts]
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-04-13 11:11:30 -07:00
Liam Mark
1426d1f8d9 lowmemorykiller: Don't count swap cache pages twice
The lowmem_shrink function discounts all the swap cache pages from
the file cache count. The zone aware code also discounts all file
cache pages from a certain zone.  This results in some swap cache
pages being discounted twice, which can result in the low memory
killer being unnecessarily aggressive.

Fix the low memory killer to only discount the swap cache pages
once.

Change-Id: I650bbfbf0fbbabd01d82bdb3502b57ff59c3e14f
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-04-13 11:11:01 -07:00
Liam Mark
92c1fefed5 android/lowmemorykiller: Selectively count free CMA pages
In certain memory configurations there can be a large number of
CMA pages which are not suitable to satisfy certain memory
requests.

This large number of unsuitable pages can cause the
lowmemorykiller to not kill any tasks because the
lowmemorykiller counts all free pages.
In order to ensure the lowmemorykiller properly evaluates the
free memory only count the free CMA pages if they are suitable
for satisfying the memory request.

Change-Id: I7f06d53e2d8cfe7439e5561fe6e5209ce73b1c90
CRs-fixed: 437016
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-04-13 11:09:54 -07:00
Vlastimil Babka
5dc7e939b6 mm/page_alloc: prevent merging between isolated and other pageblocks
commit d9dddbf556674bf125ecd925b24e43a5cf2a568a upstream.

Hanjun Guo has reported that a CMA stress test causes broken accounting of
CMA and free pages:

> Before the test, I got:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal:         204800 kB
> CmaFree:          195044 kB
>
>
> After running the test:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal:         204800 kB
> CmaFree:         6602584 kB
>
> So the freed CMA memory is more than total..
>
> Also the the MemFree is more than mem total:
>
> -bash-4.3# cat /proc/meminfo
> MemTotal:       16342016 kB
> MemFree:        22367268 kB
> MemAvailable:   22370528 kB

Laura Abbott has confirmed the issue and suspected the freepage accounting
rewrite around 3.18/4.0 by Joonsoo Kim.  Joonsoo had a theory that this is
caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
pageblocks:

> CMA isolates MAX_ORDER aligned blocks, but, during the process,
> partialy isolated block exists. If MAX_ORDER is 11 and
> pageblock_order is 9, two pageblocks make up MAX_ORDER
> aligned block and I can think following scenario because pageblock
> (un)isolation would be done one by one.
>
> (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
> MIGRATE_ISOLATE, respectively.
>
> CC -> IC -> II (Isolation)
> II -> CI -> CC (Un-isolation)
>
> If some pages are freed at this intermediate state such as IC or CI,
> that page could be merged to the other page that is resident on
> different type of pageblock and it will cause wrong freepage count.

This was supposed to be prevented by CMA operating on MAX_ORDER blocks,
but since it doesn't hold the zone->lock between pageblocks, a race
window does exist.

It's also likely that unexpected merging can occur between
MIGRATE_ISOLATE and non-CMA pageblocks.  This should be prevented in
__free_one_page() since commit 3c605096d3 ("mm/page_alloc: restrict
max order of merging on isolated pageblock").  However, we only check
the migratetype of the pageblock where buddy merging has been initiated,
not the migratetype of the buddy pageblock (or group of pageblocks)
which can be MIGRATE_ISOLATE.

Joonsoo has suggested checking for buddy migratetype as part of
page_is_buddy(), but that would add extra checks in allocator hotpath
and bloat-o-meter has shown significant code bloat (the function is
inline).

This patch reduces the bloat at some expense of more complicated code.
The buddy-merging while-loop in __free_one_page() is initially bounded
to pageblock_border and without any migratetype checks.  The checks are
placed outside, bumping the max_order if merging is allowed, and
returning to the while-loop with a statement which can't be possibly
considered harmful.

This fixes the accounting bug and also removes the arguably weird state
in the original commit 3c605096d3 where buddies could be left
unmerged.

Fixes: 3c605096d3 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
Link: https://lkml.org/lkml/2016/3/2/280
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Debugged-by: Laura Abbott <labbott@redhat.com>
Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12 09:09:05 -07:00