Commit graph

285 commits

Author SHA1 Message Date
Runmin Wang
4b7c952db6 Merge tag 'lsk-v4.4-16.12-android' into branch 'msm-4.4'
* remotes/origin/tmp-2f0de51:
  Linux 4.4.38
  esp6: Fix integrity verification when ESN are used
  esp4: Fix integrity verification when ESN are used
  ipv4: Set skb->protocol properly for local output
  ipv6: Set skb->protocol properly for local output
  Don't feed anything but regular iovec's to blk_rq_map_user_iov
  constify iov_iter_count() and iter_is_iovec()
  sparc64: fix compile warning section mismatch in find_node()
  sparc64: Fix find_node warning if numa node cannot be found
  sparc32: Fix inverted invalid_frame_pointer checks on sigreturns
  net: ping: check minimum size on ICMP header length
  net: avoid signed overflows for SO_{SND|RCV}BUFFORCE
  geneve: avoid use-after-free of skb->data
  sh_eth: remove unchecked interrupts for RZ/A1
  net: bcmgenet: Utilize correct struct device for all DMA operations
  packet: fix race condition in packet_set_ring
  net/dccp: fix use-after-free in dccp_invalid_packet
  netlink: Do not schedule work from sk_destruct
  netlink: Call cb->done from a worker thread
  net/sched: pedit: make sure that offset is valid
  net, sched: respect rcu grace period on cls destruction
  net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change
  l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
  rtnetlink: fix FDB size computation
  af_unix: conditionally use freezable blocking calls in read
  net: sky2: Fix shutdown crash
  ip6_tunnel: disable caching when the traffic class is inherited
  net: check dead netns for peernet2id_alloc()
  virtio-net: add a missing synchronize_net()
  Linux 4.4.37
  arm64: suspend: Reconfigure PSTATE after resume from idle
  arm64: mm: Set PSTATE.PAN from the cpu_enable_pan() call
  arm64: cpufeature: Schedule enable() calls instead of calling them via IPI
  pwm: Fix device reference leak
  mwifiex: printk() overflow with 32-byte SSIDs
  PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)
  PCI: Export pcie_find_root_port
  rcu: Fix soft lockup for rcu_nocb_kthread
  ALSA: pcm : Call kill_fasync() in stream lock
  x86/traps: Ignore high word of regs->cs in early_fixup_exception()
  kasan: update kasan_global for gcc 7
  zram: fix unbalanced idr management at hot removal
  ARC: Don't use "+l" inline asm constraint
  Linux 4.4.36
  scsi: mpt3sas: Unblock device after controller reset
  flow_dissect: call init_default_flow_dissectors() earlier
  mei: fix return value on disconnection
  mei: me: fix place for kaby point device ids.
  mei: me: disable driver on SPT SPS firmware
  drm/radeon: Ensure vblank interrupt is enabled on DPMS transition to on
  mpi: Fix NULL ptr dereference in mpi_powm() [ver #3]
  parisc: Also flush data TLB in flush_icache_page_asm
  parisc: Fix race in pci-dma.c
  parisc: Fix races in parisc_setup_cache_timing()
  NFSv4.x: hide array-bounds warning
  apparmor: fix change_hat not finding hat after policy replacement
  cfg80211: limit scan results cache size
  tile: avoid using clocksource_cyc2ns with absolute cycle count
  scsi: mpt3sas: Fix secure erase premature termination
  Fix USB CB/CBI storage devices with CONFIG_VMAP_STACK=y
  USB: serial: ftdi_sio: add support for TI CC3200 LaunchPad
  USB: serial: cp210x: add ID for the Zone DPMX
  usb: chipidea: move the lock initialization to core file
  KVM: x86: check for pic and ioapic presence before use
  KVM: x86: drop error recovery in em_jmp_far and em_ret_far
  iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
  iommu/vt-d: Fix PASID table allocation
  sched: tune: Fix lacking spinlock initialization
  UPSTREAM: trace: Update documentation for mono, mono_raw and boot clock
  UPSTREAM: trace: Add an option for boot clock as trace clock
  UPSTREAM: timekeeping: Add a fast and NMI safe boot clock
  ANDROID: goldfish_pipe: fix allmodconfig build
  ANDROID: goldfish: goldfish_pipe: fix locking errors
  ANDROID: video: goldfishfb: fix platform_no_drv_owner.cocci warnings
  ANDROID: goldfish_pipe: fix call_kern.cocci warnings
  arm64: rename ranchu defconfig to ranchu64
  ANDROID: arch: x86: disable pic for Android toolchain
  ANDROID: goldfish_pipe: An implementation of more parallel pipe
  ANDROID: goldfish_pipe: bugfixes and performance improvements.
  ANDROID: goldfish: Add goldfish sync driver
  ANDROID: goldfish: add ranchu defconfigs
  ANDROID: goldfish_audio: Clear audio read buffer status after each read
  ANDROID: goldfish_events: no extra EV_SYN; register goldfish
  ANDROID: goldfish_fb: Set pixclock = 0
  ANDROID: goldfish: Enable ACPI-based enumeration for goldfish audio
  ANDROID: goldfish: Enable ACPI-based enumeration for goldfish framebuffer
  ANDROID: video: goldfishfb: add devicetree bindings
  BACKPORT: staging: goldfish: audio: fix compiliation on arm
  BACKPORT: Input: goldfish_events - enable ACPI-based enumeration for goldfish events
  BACKPORT: goldfish: Enable ACPI-based enumeration for goldfish battery
  BACKPORT: drivers: tty: goldfish: Add device tree bindings
  BACKPORT: tty: goldfish: support platform_device with id -1
  BACKPORT: Input: goldfish_events - add devicetree bindings
  BACKPORT: power: goldfish_battery: add devicetree bindings
  BACKPORT: staging: goldfish: audio: add devicetree bindings
  ANDROID: usb: gadget: function: cleanup: Add blank line after declaration
  cpufreq: sched: Fix kernel crash on accessing sysfs file
  usb: gadget: f_mtp: simplify ptp NULL pointer check
  cgroup: replace unified-hierarchy.txt with a proper cgroup v2 documentation
  cgroup: rename Documentation/cgroups/ to Documentation/cgroup-legacy/
  cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type
  writeback: initialize inode members that track writeback history
  mm: page_alloc: generalize the dirty balance reserve
  block: fix module reference leak on put_disk() call for cgroups throttle
  Linux 4.4.35
  netfilter: nft_dynset: fix element timeout for HZ != 1000
  IB/cm: Mark stale CM id's whenever the mad agent was unregistered
  IB/uverbs: Fix leak of XRC target QPs
  IB/core: Avoid unsigned int overflow in sg_alloc_table
  IB/mlx5: Fix fatal error dispatching
  IB/mlx5: Use cache line size to select CQE stride
  IB/mlx4: Fix create CQ error flow
  IB/mlx4: Check gid_index return value
  PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
  PM / sleep: fix device reference leak in test_suspend
  uwb: fix device reference leaks
  mfd: core: Fix device reference leak in mfd_clone_cell
  iwlwifi: pcie: fix SPLC structure parsing
  rtc: omap: Fix selecting external osc
  clk: mmp: mmp2: fix return value check in mmp2_clk_init()
  clk: mmp: pxa168: fix return value check in pxa168_clk_init()
  clk: mmp: pxa910: fix return value check in pxa910_clk_init()
  drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
  crypto: caam - do not register AES-XTS mode on LP units
  ext4: sanity check the block and cluster size at mount time
  kbuild: Steal gcc's pie from the very beginning
  x86/kexec: add -fno-PIE
  scripts/has-stack-protector: add -fno-PIE
  kbuild: add -fno-PIE
  i2c: mux: fix up dependencies
  can: bcm: fix warning in bcm_connect/proc_register
  mfd: intel-lpss: Do not put device in reset state on suspend
  fuse: fix fuse_write_end() if zero bytes were copied
  KVM: Disable irq while unregistering user notifier
  KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
  x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems
  Linux 4.4.34
  sparc64: Delete now unused user copy fixup functions.
  sparc64: Delete now unused user copy assembler helpers.
  sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NG2copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NGcopy_{from,to}_user to accurate exception reporting.
  sparc64: Convert NG4copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert U1copy_{from,to}_user to accurate exception reporting.
  sparc64: Convert GENcopy_{from,to}_user to accurate exception reporting.
  sparc64: Convert copy_in_user to accurate exception reporting.
  sparc64: Prepare to move to more saner user copy exception handling.
  sparc64: Delete __ret_efault.
  sparc64: Handle extremely large kernel TLB range flushes more gracefully.
  sparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.
  sparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.
  sparc64: Fix illegal relative branches in hypervisor patched TLB code.
  sparc64: Handle extremely large kernel TSB range flushes sanely.
  sparc: Handle negative offsets in arch_jump_label_transform
  sparc64 mm: Fix base TSB sizing when hugetlb pages are used
  sparc: serial: sunhv: fix a double lock bug
  sparc: Don't leak context bits into thread->fault_address
  tty: Prevent ldisc drivers from re-using stale tty fields
  tcp: take care of truncations done by sk_filter()
  ipv4: use new_gw for redirect neigh lookup
  net: __skb_flow_dissect() must cap its return value
  sock: fix sendmmsg for partial sendmsg
  fib_trie: Correct /proc/net/route off by one error
  sctp: assign assoc_id earlier in __sctp_connect
  ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped
  ipv6: dccp: fix out of bound access in dccp_v6_err()
  dccp: fix out of bound access in dccp_v4_err()
  dccp: do not send reset to already closed sockets
  tcp: fix potential memory corruption
  ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()
  bgmac: stop clearing DMA receive control register right after it is set
  net: mangle zero checksum in skb_checksum_help()
  net: clear sk_err_soft in sk_clone_lock()
  dctcp: avoid bogus doubling of cwnd after loss
  ARM: 8485/1: cpuidle: remove cpu parameter from the cpuidle_ops suspend hook
  Linux 4.4.33
  netfilter: fix namespace handling in nf_log_proc_dostring
  btrfs: qgroup: Prevent qgroup->reserved from going subzero
  mmc: mxs: Initialize the spinlock prior to using it
  ASoC: sun4i-codec: return error code instead of NULL when create_card fails
  ACPI / APEI: Fix incorrect return value of ghes_proc()
  i40e: fix call of ndo_dflt_bridge_getlink()
  hwrng: core - Don't use a stack buffer in add_early_randomness()
  lib/genalloc.c: start search from start of chunk
  mei: bus: fix received data size check in NFC fixup
  iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path
  iommu/amd: Free domain id when free a domain of struct dma_ops_domain
  tty/serial: at91: fix hardware handshake on Atmel platforms
  dmaengine: at_xdmac: fix spurious flag status for mem2mem transfers
  drm/i915: Respect alternate_ddc_pin for all DDI ports
  KVM: MIPS: Precalculate MMIO load resume PC
  scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk
  scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
  iio: orientation: hid-sensor-rotation: Add PM function (fix non working driver)
  iio: hid-sensors: Increase the precision of scale to fix wrong reading interpretation.
  clk: qoriq: Don't allow CPU clocks higher than starting value
  toshiba-wmi: Fix loading the driver on non Toshiba laptops
  drbd: Fix kernel_sendmsg() usage - potential NULL deref
  usb: gadget: u_ether: remove interrupt throttling
  USB: cdc-acm: fix TIOCMIWAIT
  staging: nvec: remove managed resource from PS2 driver
  Revert "staging: nvec: ps2: change serio type to passthrough"
  drivers: staging: nvec: remove bogus reset command for PS/2 interface
  staging: iio: ad5933: avoid uninitialized variable in error case
  pinctrl: cherryview: Prevent possible interrupt storm on resume
  pinctrl: cherryview: Serialize register access in suspend/resume
  ARC: timer: rtc: implement read loop in "C" vs. inline asm
  s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
  coredump: fix unfreezable coredumping task
  swapfile: fix memory corruption via malformed swapfile
  dib0700: fix nec repeat handling
  ASoC: cs4270: fix DAPM stream name mismatch
  ALSA: info: Limit the proc text input size
  ALSA: info: Return error for invalid read/write
  arm64: Enable KPROBES/HIBERNATION/CORESIGHT in defconfig
  arm64: kvm: allows kvm cpu hotplug
  arm64: KVM: Register CPU notifiers when the kernel runs at HYP
  arm64: KVM: Skip HYP setup when already running in HYP
  arm64: hyp/kvm: Make hyp-stub reject kvm_call_hyp()
  arm64: hyp/kvm: Make hyp-stub extensible
  arm64: kvm: Move lr save/restore from do_el2_call into EL1
  arm64: kvm: deal with kernel symbols outside of linear mapping
  arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
  ANDROID: video: adf: Avoid directly referencing user pointers
  ANDROID: usb: gadget: audio_source: fix comparison of distinct pointer types
  android: binder: support for file-descriptor arrays.
  android: binder: support for scatter-gather.
  android: binder: add extra size to allocator.
  android: binder: refactor binder_transact()
  android: binder: support multiple /dev instances.
  android: binder: deal with contexts in debugfs.
  android: binder: support multiple context managers.
  android: binder: split flat_binder_object.
  disable aio support in recommended configuration
  Linux 4.4.32
  scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
  drm/radeon: fix DP mode validation
  drm/radeon/dp: add back special handling for NUTMEG
  drm/amdgpu: fix DP mode validation
  drm/amdgpu/dp: add back special handling for NUTMEG
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  Revert KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  of: silence warnings due to max() usage
  packet: on direct_xmit, limit tso and csum to supported devices
  sctp: validate chunk len before actually using it
  net sched filters: fix notification of filter delete with proper handle
  udp: fix IP_CHECKSUM handling
  net: sctp, forbid negative length
  ipv4: use the right lock for ping_group_range
  ipv4: disable BH in set_ping_group_range()
  net: add recursion limit to GRO
  rtnetlink: Add rtnexthop offload flag to compare mask
  bridge: multicast: restore perm router ports on multicast enable
  net: pktgen: remove rcu locking in pktgen_change_name()
  ipv6: correctly add local routes when lo goes up
  ip6_tunnel: fix ip6_tnl_lookup
  ipv6: tcp: restore IP6CB for pktoptions skbs
  netlink: do not enter direct reclaim from netlink_dump()
  packet: call fanout_release, while UNREGISTERING a netdev
  net: Add netdev all_adj_list refcnt propagation to fix panic
  net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions
  net: pktgen: fix pkt_size
  net: fec: set mac address unconditionally
  tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
  ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
  ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
  tcp: fix a compile error in DBGUNDO()
  tcp: fix wrong checksum calculation on MTU probing
  net: avoid sk_forward_alloc overflows
  tcp: fix overflow in __tcp_retransmit_skb()
  arm64/kvm: fix build issue on kvm debug
  arm64: ptdump: Indicate whether memory should be faulting
  arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
  arm64: Drop alloc function from create_mapping
  arm64: allow vmalloc regions to be set with set_memory_*
  arm64: kernel: implement ACPI parking protocol
  arm64: mm: create new fine-grained mappings at boot
  arm64: ensure _stext and _etext are page-aligned
  arm64: mm: allow passing a pgdir to alloc_init_*
  arm64: mm: allocate pagetables anywhere
  arm64: mm: use fixmap when creating page tables
  arm64: mm: add functions to walk tables in fixmap
  arm64: mm: add __{pud,pgd}_populate
  arm64: mm: avoid redundant __pa(__va(x))
  Linux 4.4.31
  HID: usbhid: add ATEN CS962 to list of quirky devices
  ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap()
  kvm: x86: Check memopp before dereference (CVE-2016-8630)
  tty: vt, fix bogus division in csi_J
  usb: dwc3: Fix size used in dma_free_coherent()
  pwm: Unexport children before chip removal
  UBI: fastmap: scrub PEB when bitflips are detected in a free PEB EC header
  Disable "frame-address" warning
  smc91x: avoid self-comparison warning
  cgroup: avoid false positive gcc-6 warning
  drm/exynos: fix error handling in exynos_drm_subdrv_open
  mm/cma: silence warnings due to max() usage
  ARM: 8584/1: floppy: avoid gcc-6 warning
  powerpc/ptrace: Fix out of bounds array access warning
  x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
  perf build: Fix traceevent plugins build race
  drm/dp/mst: Check peer device type before attempting EDID read
  drm/radeon: drop register readback in cayman_cp_int_cntl_setup
  drm/radeon/si_dpm: workaround for SI kickers
  drm/radeon/si_dpm: Limit clocks on HD86xx part
  Revert "drm/radeon: fix DP link training issue with second 4K monitor"
  mmc: dw_mmc-pltfm: fix the potential NULL pointer dereference
  scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware
  scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded
  scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
  mac80211: discard multicast and 4-addr A-MSDUs
  firewire: net: fix fragmented datagram_size off-by-one
  firewire: net: guard against rx buffer overflows
  Input: i8042 - add XMG C504 to keyboard reset table
  dm mirror: fix read error on recovery after default leg failure
  virtio: console: Unlock vqs while freeing buffers
  virtio_ring: Make interrupt suppression spec compliant
  parisc: Ensure consistent state when switching to kernel stack at syscall entry
  ovl: fsync after copy-up
  KVM: MIPS: Make ERET handle ERL before EXL
  KVM: x86: fix wbinvd_dirty_mask use-after-free
  dm: free io_barrier after blk_cleanup_queue call
  USB: serial: cp210x: fix tiocmget error handling
  tty: limit terminal size to 4M chars
  xhci: add restart quirk for Intel Wildcatpoint PCH
  hv: do not lose pending heartbeat vmbus packets
  vt: clear selection before resizing
  Fix potential infoleak in older kernels
  GenWQE: Fix bad page access during abort of resource allocation
  usb: increase ohci watchdog delay to 275 msec
  xhci: use default USB_RESUME_TIMEOUT when resuming ports.
  USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7
  USB: serial: fix potential NULL-dereference at probe
  usb: gadget: function: u_ether: don't starve tx request queue
  mei: txe: don't clean an unprocessed interrupt cause.
  ubifs: Fix regression in ubifs_readdir()
  ubifs: Abort readdir upon error
  btrfs: fix races on root_log_ctx lists
  ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
  ANDROID: binder: Add strong ref checks
  ALSA: hda - Fix headset mic detection problem for two Dell laptops
  ALSA: hda - Adding a new group of pin cfg into ALC295 pin quirk table
  ALSA: hda - allow 40 bit DMA mask for NVidia devices
  ALSA: hda - Raise AZX_DCAPS_RIRB_DELAY handling into top drivers
  ALSA: hda - Merge RIRB_PRE_DELAY into CTX_WORKAROUND caps
  ALSA: usb-audio: Add quirk for Syntek STK1160
  KEYS: Fix short sprintf buffer in /proc/keys show function
  mm: memcontrol: do not recurse in direct reclaim
  mm/list_lru.c: avoid error-path NULL pointer deref
  libxfs: clean up _calc_dquots_per_chunk
  h8300: fix syscall restarting
  drm/dp/mst: Clear port->pdt when tearing down the i2c adapter
  i2c: core: fix NULL pointer dereference under race condition
  i2c: xgene: Avoid dma_buffer overrun
  arm64:cpufeature ARM64_NCAPS is the indicator of last feature
  arm64: hibernate: Refuse to hibernate if the boot cpu is offline
  PM / sleep: Add support for read-only sysfs attributes
  arm64: kernel: Add support for hibernate/suspend-to-disk
  arm64: mm: add functions to walk page tables by PA
  arm64: mm: move pte_* macros
  PM / Hibernate: Call flush_icache_range() on pages restored in-place
  arm64: Add new asm macro copy_page
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  arm64: kernel: Include _AC definition in page.h
  arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  arm64: kernel: Rework finisher callback out of __cpu_suspend_enter()
  arm64: Cleanup SCTLR flags
  arm64: Fold proc-macros.S into assembler.h
  arm/arm64: KVM: Add hook for C-based stage2 init
  arm/arm64: KVM: Detect vGIC presence at runtime
  arm64: KVM: Add support for 16-bit VMID
  arm: KVM: Make kvm_arm.h friendly to assembly code
  arm/arm64: KVM: Remove unreferenced S2_PGD_ORDER
  arm64: KVM: debug: Remove spurious inline attributes
  ARM: KVM: Cleanup exception injection
  arm64: KVM: Remove weak attributes
  arm64: KVM: Cleanup asm-offset.c
  arm64: KVM: Turn system register numbers to an enum
  arm64: KVM: VHE: Patch out use of HVC
  arm64: Add ARM64_HAS_VIRT_HOST_EXTN feature
  arm/arm64: Add new is_kernel_in_hyp_mode predicate
  arm64: KVM: Move away from the assembly version of the world switch
  arm64: KVM: Map the kernel RO section into HYP
  arm64: KVM: Add compatibility aliases
  arm64: KVM: Implement vgic-v3 save/restore
  arm64: KVM: Add panic handling
  arm64: KVM: HYP mode entry points
  arm64: KVM: Implement TLB handling
  arm64: KVM: Implement fpsimd save/restore
  arm64: KVM: Implement the core world switch
  arm64: KVM: Add patchable function selector
  arm64: KVM: Implement guest entry
  arm64: KVM: Implement debug save/restore
  arm64: KVM: Implement 32bit system register save/restore
  arm64: KVM: Implement system register save/restore
  arm64: KVM: Implement timer save/restore
  arm64: KVM: Implement vgic-v2 save/restore
  arm64: KVM: Add a HYP-specific header file
  KVM: arm/arm64: vgic-v3: Make the LR indexing macro public
  arm64: Add macros to read/write system registers
  Linux 4.4.30
  Revert "fix minor infoleak in get_user_ex()"
  Revert "x86/mm: Expand the exception table logic to allow new handling options"
  Linux 4.4.29
  ARM: pxa: pxa_cplds: fix interrupt handling
  powerpc/nvram: Fix an incorrect partition merge
  mpt3sas: Don't spam logs if logging level is 0
  perf symbols: Fixup symbol sizes before picking best ones
  perf symbols: Check symbol_conf.allow_aliases for kallsyms loading too
  perf hists browser: Fix event group display
  clk: divider: Fix clk_divider_round_rate() to use clk_readl()
  clk: qoriq: fix a register offset error
  s390/con3270: fix insufficient space padding
  s390/con3270: fix use of uninitialised data
  s390/cio: fix accidental interrupt enabling during resume
  x86/mm: Expand the exception table logic to allow new handling options
  dmaengine: ipu: remove bogus NO_IRQ reference
  power: bq24257: Fix use of uninitialized pointer bq->charger
  staging: r8188eu: Fix scheduling while atomic splat
  ASoC: dapm: Fix kcontrol creation for output driver widget
  ASoC: dapm: Fix value setting for _ENUM_DOUBLE MUX's second channel
  ASoC: dapm: Fix possible uninitialized variable in snd_soc_dapm_get_volsw()
  ASoC: topology: Fix error return code in soc_tplg_dapm_widget_create()
  hwrng: omap - Only fail if pm_runtime_get_sync returns < 0
  crypto: arm/ghash-ce - add missing async import/export
  crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
  mwifiex: correct aid value during tdls setup
  spi: spi-fsl-dspi: Drop extra spi_master_put in device remove function
  ARM: clk-imx35: fix name for ckil clk
  uio: fix dmem_region_start computation
  genirq/generic_chip: Add irq_unmap callback
  perf stat: Fix interval output values
  powerpc/eeh: Null check uses of eeh_pe_bus_get
  tunnels: Remove encapsulation offloads on decap.
  tunnels: Don't apply GRO to multiple layers of encapsulation.
  ipip: Properly mark ipip GRO packets as encapsulated.
  posix_acl: Clear SGID bit when setting file permissions
  brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()
  mm/hugetlb: fix memory offline with hugepage size > memory block size
  drm/i915: Unalias obj->phys_handle and obj->userptr
  drm/i915: Account for TSEG size when determining 865G stolen base
  Revert "drm/i915: Check live status before reading edid"
  drm/i915/gen9: fix the WaWmMemoryReadLatency implementation
  xenbus: don't look up transaction IDs for ordinary writes
  drm/vmwgfx: Limit the user-space command buffer size
  drm/radeon: change vblank_time's calculation method to reduce computational error.
  drm/radeon/si/dpm: fix phase shedding setup
  drm/radeon: narrow asic_init for virtualization
  drm/amdgpu: change vblank_time's calculation method to reduce computational error.
  drm/amdgpu/dce11: add missing drm_mode_config_cleanup call
  drm/amdgpu/dce11: disable hpd on local panels
  drm/amdgpu/dce8: disable hpd on local panels
  drm/amdgpu/dce10: disable hpd on local panels
  drm/amdgpu: fix IB alignment for UVD
  drm/prime: Pass the right module owner through to dma_buf_export()
  Linux 4.4.28
  target: Don't override EXTENDED_COPY xcopy_pt_cmd SCSI status code
  target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLE
  target: Re-add missing SCF_ACK_KREF assignment in v4.1.y
  ubifs: Fix xattr_names length in exit paths
  jbd2: fix incorrect unlock on j_list_lock
  ext4: do not advertise encryption support when disabled
  mmc: rtsx_usb_sdmmc: Handle runtime PM while changing the led
  mmc: rtsx_usb_sdmmc: Avoid keeping the device runtime resumed when unused
  mmc: core: Annotate cmd_hdr as __le32
  powerpc/mm: Prevent unlikely crash in copro_calculate_slb()
  ceph: fix error handling in ceph_read_iter
  arm64: kernel: Init MDCR_EL2 even in the absence of a PMU
  arm64: percpu: rewrite ll/sc loops in assembly
  memstick: rtsx_usb_ms: Manage runtime PM when accessing the device
  memstick: rtsx_usb_ms: Runtime resume the device when polling for cards
  isofs: Do not return EACCES for unknown filesystems
  irqchip/gic-v3-its: Fix entry size mask for GITS_BASER
  s390/mm: fix gmap tlb flush issues
  Using BUG_ON() as an assert() is _never_ acceptable
  mm: filemap: fix mapping->nrpages double accounting in fuse
  mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
  acpi, nfit: check for the correct event code in notifications
  net/mlx4_core: Allow resetting VF admin mac to zero
  bnx2x: Prevent false warning for lack of FC NPIV
  PKCS#7: Don't require SpcSpOpusInfo in Authenticode pkcs7 signatures
  hpsa: correct skipping masked peripherals
  sd: Fix rw_max for devices that report an optimal xfer size
  irqchip/gicv3: Handle loop timeout proper
  kvm: x86: memset whole irq_eoi
  x86/e820: Don't merge consecutive E820_PRAM ranges
  blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL
  Fix regression which breaks DFS mounting
  Cleanup missing frees on some ioctls
  Do not send SMB3 SET_INFO request if nothing is changing
  SMB3: GUIDs should be constructed as random but valid uuids
  Set previous session id correctly on SMB3 reconnect
  Display number of credits available
  Clarify locking of cifs file and tcon structures and make more granular
  fs/cifs: keep guid when assigning fid to fileinfo
  cifs: Limit the overall credit acquired
  fs/super.c: fix race between freeze_super() and thaw_super()
  arc: don't leak bits of kernel stack into coredump
  lightnvm: ensure that nvm_dev_ops can be used without CONFIG_NVM
  ipc/sem.c: fix complex_count vs. simple op race
  mm: filemap: don't plant shadow entries without radix tree node
  metag: Only define atomic_dec_if_positive conditionally
  scsi: Fix use-after-free
  NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic
  NFSv4: Open state recovery must account for file permission changes
  NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid
  NFSv4: Don't report revoked delegations as valid in nfs_have_delegation()
  sunrpc: fix write space race causing stalls
  Input: elantech - add Fujitsu Lifebook E556 to force crc_enabled
  Input: elantech - force needed quirks on Fujitsu H760
  Input: i8042 - skip selftest on ASUS laptops
  lib: add "on"/"off" support to kstrtobool
  lib: update single-char callers of strtobool()
  lib: move strtobool() to kstrtobool()
  MIPS: ptrace: Fix regs_return_value for kernel context
  MIPS: Fix -mabi=64 build of vdso.lds
  ALSA: hda - Fix a failure of micmute led when having multi adcs
  cx231xx: fix GPIOs for Pixelview SBTVD hybrid
  cx231xx: don't return error on success
  mb86a20s: fix demod settings
  mb86a20s: fix the locking logic
  ovl: copy_up_xattr(): use strnlen
  ovl: Fix info leak in ovl_lookup_temp()
  fbdev/efifb: Fix 16 color palette entry calculation
  scsi: zfcp: spin_lock_irqsave() is not nestable
  zfcp: trace full payload of all SAN records (req,resp,iels)
  zfcp: fix payload trace length for SAN request&response
  zfcp: fix D_ID field with actual value on tracing SAN responses
  zfcp: restore tracing of handle for port and LUN with HBA records
  zfcp: trace on request for open and close of WKA port
  zfcp: restore: Dont use 0 to indicate invalid LUN in rec trace
  zfcp: retain trace level for SCSI and HBA FSF response records
  zfcp: close window with unblocked rport during rport gone
  zfcp: fix ELS/GS request&response length for hardware data router
  zfcp: fix fc_host port_type with NPIV
  ubi: Deal with interrupted erasures in WL
  powerpc/pseries: Fix stack corruption in htpe code
  powerpc/64: Fix incorrect return value from __copy_tofrom_user
  powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()
  powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()
  powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()
  powerpc/vdso64: Use double word compare on pointers
  dm crypt: fix crash on exit
  dm mpath: check if path's request_queue is dying in activate_path()
  dm: return correct error code in dm_resume()'s retry loop
  dm: mark request_queue dead before destroying the DM device
  perf intel-pt: Fix MTC timestamp calculation for large MTC periods
  perf intel-pt: Fix estimated timestamps for cycle-accurate mode
  perf intel-pt: Fix snapshot overlap detection decoder errors
  pstore/ram: Use memcpy_fromio() to save old buffer
  pstore/ram: Use memcpy_toio instead of memcpy
  pstore/core: drop cmpxchg based updates
  pstore/ramoops: fixup driver removal
  parisc: Increase initial kernel mapping size
  parisc: Fix kernel memory layout regarding position of __gp
  parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels
  cpufreq: intel_pstate: Fix unsafe HWP MSR access
  platform: don't return 0 from platform_get_irq[_byname]() on error
  PCI: Mark Atheros AR9580 to avoid bus reset
  mmc: sdhci: cast unsigned int to unsigned long long to avoid unexpeted error
  mmc: block: don't use CMD23 with very old MMC cards
  rtlwifi: Fix missing country code for Great Britain
  PM / devfreq: event: remove duplicate devfreq_event_get_drvdata()
  clk: imx6: initialize GPU clocks
  regulator: tps65910: Work around silicon erratum SWCZ010
  mei: me: add kaby point device ids
  gpio: mpc8xxx: Correct irq handler function
  cgroup: Change from CAP_SYS_NICE to CAP_SYS_RESOURCE for cgroup migration permissions
  UPSTREAM: cpu/hotplug: Handle unbalanced hotplug enable/disable
  UPSTREAM: arm64: kaslr: fix breakage with CONFIG_MODVERSIONS=y
  UPSTREAM: arm64: kaslr: keep modules close to the kernel when DYNAMIC_FTRACE=y
  cgroup: Remove leftover instances of allow_attach
  BACKPORT: lib: harden strncpy_from_user
  CHROMIUM: cgroups: relax permissions on moving tasks between cgroups
  CHROMIUM: remove Android's cgroup generic permissions checks
  Linux 4.4.27
  cfq: fix starvation of asynchronous writes
  vfs: move permission checking into notify_change() for utimes(NULL)
  dlm: free workqueues after the connections
  crypto: vmx - Fix memory corruption caused by p8_ghash
  crypto: ghash-generic - move common definitions to a new header file
  ext4: release bh in make_indexed_dir
  ext4: allow DAX writeback for hole punch
  ext4: fix memory leak in ext4_insert_range()
  ext4: reinforce check of i_dtime when clearing high fields of uid and gid
  ext4: enforce online defrag restriction for encrypted files
  scsi: ibmvfc: Fix I/O hang when port is not mapped
  scsi: arcmsr: Simplify user_len checking
  scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer()
  async_pq_val: fix DMA memory leak
  reiserfs: switch to generic_{get,set,remove}xattr()
  reiserfs: Unlock superblock before calling reiserfs_quota_on_mount()
  ASoC: Intel: Atom: add a missing star in a memcpy call
  brcmfmac: fix memory leak in brcmf_fill_bss_param
  i40e: avoid NULL pointer dereference and recursive errors on early PCI error
  fuse: fix killing s[ug]id in setattr
  fuse: invalidate dir dentry after chmod
  fuse: listxattr: verify xattr list
  drivers: base: dma-mapping: page align the size when unmap_kernel_range
  btrfs: assign error values to the correct bio structs
  serial: 8250_dw: Check the data->pclk when get apb_pclk
  arm64: Use PoU cache instr for I/D coherency
  arm64: mm: add code to safely replace TTBR1_EL1
  arm64: mm: place __cpu_setup in .text
  arm64: add function to install the idmap
  arm64: unmap idmap earlier
  arm64: unify idmap removal
  arm64: mm: place empty_zero_page in bss
  arm64: head.S: use memset to clear BSS
  arm64: mm: specialise pagetable allocators
  arm64: mm: remove pointless PAGE_MASKing
  asm-generic: Fix local variable shadow in __set_fixmap_offset
  arm64: mm: fold alternatives into .init
  ARM: 8511/1: ARM64: kernel: PSCI: move PSCI idle management code to drivers/firmware
  ARM: 8481/2: drivers: psci: replace psci firmware calls
  ARM: 8480/2: arm64: add implementation for arm-smccc
  ARM: 8479/2: add implementation for arm-smccc
  ARM: 8478/2: arm/arm64: add arm-smccc
  ARM: 8510/1: rework ARM_CPU_SUSPEND dependencies
  ARM: 8458/1: bL_switcher: add GIC dependency
  Linux 4.4.26
  mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
  x86/build: Build compressed x86 kernels as PIE
  arm64: Remove stack duplicating code from jprobes
  arm64: kprobes: Add KASAN instrumentation around stack accesses
  arm64: kprobes: Cleanup jprobe_return
  arm64: kprobes: Fix overflow when saving stack
  arm64: kprobes: WARN if attempting to step with PSTATE.D=1
  kprobes: Add arm64 case in kprobe example module
  arm64: Add kernel return probes support (kretprobes)
  arm64: Add trampoline code for kretprobes
  arm64: kprobes instruction simulation support
  arm64: Treat all entry code as non-kprobe-able
  arm64: Blacklist non-kprobe-able symbol
  arm64: Kprobes with single stepping support
  arm64: add conditional instruction simulation support
  arm64: Add more test functions to insn.c
  arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
  Linux 4.4.25
  tpm_crb: fix crb_req_canceled behavior
  tpm: fix a race condition in tpm2_unseal_trusted()
  ima: use file_dentry()
  ARM: cpuidle: Fix error return code
  ARM: dts: MSM8064 remove flags from SPMI/MPP IRQs
  ARM: dts: mvebu: armada-390: add missing compatibility string and bracket
  x86/dumpstack: Fix x86_32 kernel_stack_pointer() previous stack access
  x86/irq: Prevent force migration of irqs which are not in the vector domain
  x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation
  KVM: PPC: BookE: Fix a sanity check
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
  mfd: wm8350-i2c: Make sure the i2c regmap functions are compiled
  mfd: 88pm80x: Double shifting bug in suspend/resume
  mfd: atmel-hlcdc: Do not sleep in atomic context
  mfd: rtsx_usb: Avoid setting ucr->current_sg.status
  ALSA: usb-line6: use the same declaration as definition in header for MIDI manufacturer ID
  ALSA: usb-audio: Extend DragonFly dB scale quirk to cover other variants
  ALSA: ali5451: Fix out-of-bound position reporting
  timekeeping: Fix __ktime_get_fast_ns() regression
  time: Add cycles to nanoseconds translation
  mm: Fix build for hardened usercopy
  ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
  ANDROID: binder: Add strong ref checks
  UPSTREAM: staging/android/ion : fix a race condition in the ion driver
  ANDROID: android-base: CONFIG_HARDENED_USERCOPY=y
  UPSTREAM: fs/proc/kcore.c: Add bounce buffer for ktext data
  UPSTREAM: fs/proc/kcore.c: Make bounce buffer global for read
  BACKPORT: arm64: Correctly bounds check virt_addr_valid
  Fix a build breakage in IO latency hist code.
  UPSTREAM: efi: include asm/early_ioremap.h not asm/efi.h to get early_memremap
  UPSTREAM: ia64: split off early_ioremap() declarations into asm/early_ioremap.h
  FROMLIST: arm64: Enable CONFIG_ARM64_SW_TTBR0_PAN
  FROMLIST: arm64: xen: Enable user access before a privcmd hvc call
  FROMLIST: arm64: Handle faults caused by inadvertent user access with PAN enabled
  FROMLIST: arm64: Disable TTBR0_EL1 during normal kernel execution
  FROMLIST: arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1
  FROMLIST: arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro
  FROMLIST: arm64: Factor out PAN enabling/disabling into separate uaccess_* macros
  UPSTREAM: arm64: Handle el1 synchronous instruction aborts cleanly
  UPSTREAM: arm64: include alternative handling in dcache_by_line_op
  UPSTREAM: arm64: fix "dc cvau" cache operation on errata-affected core
  UPSTREAM: Revert "arm64: alternatives: add enable parameter to conditional asm macros"
  UPSTREAM: arm64: Add new asm macro copy_page
  UPSTREAM: arm64: kill ESR_LNX_EXEC
  UPSTREAM: arm64: add macro to extract ESR_ELx.EC
  UPSTREAM: arm64: mm: mark fault_info table const
  UPSTREAM: arm64: fix dump_instr when PAN and UAO are in use
  BACKPORT: arm64: Fold proc-macros.S into assembler.h
  UPSTREAM: arm64: choose memstart_addr based on minimum sparsemem section alignment
  UPSTREAM: arm64/mm: ensure memstart_addr remains sufficiently aligned
  UPSTREAM: arm64/kernel: fix incorrect EL0 check in inv_entry macro
  UPSTREAM: arm64: Add macros to read/write system registers
  UPSTREAM: arm64/efi: refactor EFI init and runtime code for reuse by 32-bit ARM
  UPSTREAM: arm64/efi: split off EFI init and runtime code for reuse by 32-bit ARM
  UPSTREAM: arm64/efi: mark UEFI reserved regions as MEMBLOCK_NOMAP
  BACKPORT: arm64: only consider memblocks with NOMAP cleared for linear mapping
  UPSTREAM: mm/memblock: add MEMBLOCK_NOMAP attribute to memblock memory table
  ANDROID: dm: android-verity: Remove fec_header location constraint
  BACKPORT: audit: consistently record PIDs with task_tgid_nr()
  android-base.cfg: Enable kernel ASLR
  UPSTREAM: vmlinux.lds.h: allow arch specific handling of ro_after_init data section
  UPSTREAM: arm64: spinlock: fix spin_unlock_wait for LSE atomics
  UPSTREAM: arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE
  UPSTREAM: arm64: Only select ARM64_MODULE_PLTS if MODULES=y
  sched: Add Kconfig option DEFAULT_USE_ENERGY_AWARE to set ENERGY_AWARE feature flag
  sched/fair: remove printk while schedule is in progress
  ANDROID: fs: FS tracepoints to track IO.
  sched/walt: Drop arch-specific timer access
  ANDROID: fiq_debugger: Pass task parameter to unwind_frame()
  eas/sched/fair: Fixing comments in find_best_target.
  input: keyreset: switch to orderly_reboot
  UPSTREAM: tun: fix transmit timestamp support
  UPSTREAM: arch/arm/include/asm/pgtable-3level.h: add pmd_mkclean for THP
  net: inet: diag: expose the socket mark to privileged processes.
  net: diag: make udp_diag_destroy work for mapped addresses.
  net: diag: support SOCK_DESTROY for UDP sockets
  net: diag: allow socket bytecode filters to match socket marks
  net: diag: slightly refactor the inet_diag_bc_audit error checks.
  net: diag: Add support to filter on device index
  UPSTREAM: brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()
  Linux 4.4.24
  ALSA: hda - Add the top speaker pin config for HP Spectre x360
  ALSA: hda - Fix headset mic detection problem for several Dell laptops
  ACPICA: acpi_get_sleep_type_data: Reduce warnings
  ALSA: hda - Adding one more ALC255 pin definition for headset problem
  Revert "usbtmc: convert to devm_kzalloc"
  USB: serial: cp210x: Add ID for a Juniper console
  Staging: fbtft: Fix bug in fbtft-core
  usb: misc: legousbtower: Fix NULL pointer deference
  USB: serial: cp210x: fix hardware flow-control disable
  dm log writes: fix bug with too large bios
  clk: xgene: Add missing parenthesis when clearing divider value
  aio: mark AIO pseudo-fs noexec
  batman-adv: remove unused callback from batadv_algo_ops struct
  IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV
  IB/mlx4: Fix code indentation in QP1 MAD flow
  IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV
  IB/ipoib: Don't allow MC joins during light MC flush
  IB/core: Fix use after free in send_leave function
  IB/ipoib: Fix memory corruption in ipoib cm mode connect flow
  KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
  dmaengine: at_xdmac: fix to pass correct device identity to free_irq()
  kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
  ASoC: omap-mcpdm: Fix irq resource handling
  sysctl: handle error writing UINT_MAX to u32 fields
  powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
  brcmsmac: Initialize power in brcms_c_stf_ss_algo_channel_get()
  brcmsmac: Free packet if dma_mapping_error() fails in dma_rxfill
  brcmfmac: Fix glob_skb leak in brcmf_sdiod_recv_chain
  ASoC: Intel: Skylake: Fix error return code in skl_probe()
  pNFS/flexfiles: Fix layoutcommit after a commit to DS
  pNFS/files: Fix layoutcommit after a commit to DS
  NFS: Don't drop CB requests with invalid principals
  svc: Avoid garbage replies when pc_func() returns rpc_drop_reply
  dmaengine: at_xdmac: fix debug string
  fnic: pci_dma_mapping_error() doesn't return an error code
  avr32: off by one in at32_init_pio()
  ath9k: Fix programming of minCCA power threshold
  gspca: avoid unused variable warnings
  em28xx-i2c: rt_mutex_trylock() returns zero on failure
  NFC: fdp: Detect errors from fdp_nci_create_conn()
  iwlmvm: mvm: set correct state in smart-fifo configuration
  tile: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
  pstore: drop file opened reference count
  blk-mq: actually hook up defer list when running requests
  hwrng: omap - Fix assumption that runtime_get_sync will always succeed
  ARM: sa1111: fix pcmcia suspend/resume
  ARM: shmobile: fix regulator quirk for Gen2
  ARM: sa1100: clear reset status prior to reboot
  ARM: sa1100: fix 3.6864MHz clock
  ARM: sa1100: register clocks early
  ARM: sun5i: Fix typo in trip point temperature
  regulator: qcom_smd: Fix voltage ranges for pm8x41
  regulator: qcom_spmi: Update mvs1/mvs2 switches on pm8941
  regulator: qcom_spmi: Add support for get_mode/set_mode on switches
  regulator: qcom_spmi: Add support for S4 supply on pm8941
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  printk: fix parsing of "brl=" option
  MIPS: uprobes: fix use of uninitialised variable
  MIPS: Malta: Fix IOCU disable switch read for MIPS64
  MIPS: fix uretprobe implementation
  MIPS: uprobes: remove incorrect set_orig_insn
  arm64: debug: avoid resetting stepping state machine when TIF_SINGLESTEP
  ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
  irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning
  gpio: sa1100: fix irq probing for ucb1x00
  usb: gadget: fsl_qe_udc: signedness bug in qe_get_frame()
  ceph: fix race during filling readdir cache
  iwlwifi: mvm: don't use ret when not initialised
  iwlwifi: pcie: fix access to scratch buffer
  spi: sh-msiof: Avoid invalid clock generator parameters
  hwmon: (adt7411) set bit 3 in CFG1 register
  nvmem: Declare nvmem_cell_read() consistently
  ipvs: fix bind to link-local mcast IPv6 address in backup
  tools/vm/slabinfo: fix an unintentional printf
  mmc: pxamci: fix potential oops
  drivers/perf: arm_pmu: Fix leak in error path
  pinctrl: Flag strict is a field in struct pinmux_ops
  pinctrl: uniphier: fix .pin_dbg_show() callback
  i40e: avoid null pointer dereference
  perf/core: Fix pmu::filter_match for SW-led groups
  iwlwifi: mvm: fix a few firmware capability checks
  usb: musb: fix DMA for host mode
  usb: musb: Fix DMA desired mode for Mentor DMA engine
  ARM: 8617/1: dma: fix dma_max_pfn()
  ARM: 8616/1: dt: Respect property size when parsing CPUs
  drm/radeon/si/dpm: add workaround for for Jet parts
  drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion
  x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
  x86/init: Fix cr4_init_shadow() on CR4-less machines
  can: dev: fix deadlock reported after bus-off
  mm,ksm: fix endless looping in allocating memory when ksm enable
  mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl
  cpuset: handle race between CPU hotplug and cpuset_hotplug_work
  usercopy: fold builtin_const check into inline function
  Linux 4.4.23
  hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common()
  qxl: check for kmap failures
  power: supply: max17042_battery: fix model download bug.
  power_supply: tps65217-charger: fix missing platform_set_drvdata()
  PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
  PM / hibernate: Restore processor state before using per-CPU variables
  MIPS: paravirt: Fix undefined reference to smp_bootstrap
  MIPS: Add a missing ".set pop" in an early commit
  MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
  MIPS: Remove compact branch policy Kconfig entries
  MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
  MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
  MIPS: Fix pre-r6 emulation FPU initialisation
  i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
  i2c-eg20t: fix race between i2c init and interrupt enable
  btrfs: ensure that file descriptor used with subvol ioctls is a dir
  nl80211: validate number of probe response CSA counters
  can: flexcan: fix resume function
  mm: delete unnecessary and unsafe init_tlb_ubc()
  tracing: Move mutex to protect against resetting of seq data
  fix memory leaks in tracing_buffers_splice_read()
  power: reset: hisi-reboot: Unmap region obtained by of_iomap
  mtd: pmcmsp-flash: Allocating too much in init_msp_flash()
  mtd: maps: sa1100-flash: potential NULL dereference
  fix fault_in_multipages_...() on architectures with no-op access_ok()
  fanotify: fix list corruption in fanotify_get_response()
  fsnotify: add a way to stop queueing events on group shutdown
  xfs: prevent dropping ioend completions during buftarg wait
  autofs: use dentry flags to block walks during expire
  autofs races
  pwm: Mark all devices as "might sleep"
  bridge: re-introduce 'fix parsing of MLDv2 reports'
  net: smc91x: fix SMC accesses
  Revert "phy: IRQ cannot be shared"
  net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
  net/mlx5: Added missing check of msg length in verifying its signature
  tipc: fix NULL pointer dereference in shutdown()
  net/irda: handle iriap_register_lsap() allocation failure
  vti: flush x-netns xfrm cache when vti interface is removed
  af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
  Revert "af_unix: Fix splice-bind deadlock"
  bonding: Fix bonding crash
  megaraid: fix null pointer check in megasas_detach_one().
  nouveau: fix nv40_perfctr_next() cleanup regression
  Staging: iio: adc: fix indent on break statement
  iwlegacy: avoid warning about missing braces
  ath9k: fix misleading indentation
  am437x-vfpe: fix typo in vpfe_get_app_input_index
  Add braces to avoid "ambiguous ‘else’" compiler warnings
  net: caif: fix misleading indentation
  Makefile: Mute warning for __builtin_return_address(>0) for tracing only
  Disable "frame-address" warning
  Disable "maybe-uninitialized" warning globally
  gcov: disable -Wmaybe-uninitialized warning
  Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES
  kbuild: forbid kernel directory to contain spaces and colons
  tools: Support relative directory path for 'O='
  Makefile: revert "Makefile: Document ability to make file.lst and file.S" partially
  kbuild: Do not run modules_install and install in paralel
  ocfs2: fix start offset to ocfs2_zero_range_for_truncate()
  ocfs2/dlm: fix race between convert and migration
  crypto: echainiv - Replace chaining with multiplication
  crypto: skcipher - Fix blkcipher walk OOM crash
  crypto: arm/aes-ctr - fix NULL dereference in tail processing
  crypto: arm64/aes-ctr - fix NULL dereference in tail processing
  tcp: properly scale window in tcp_v[46]_reqsk_send_ack()
  tcp: fix use after free in tcp_xmit_retransmit_queue()
  tcp: cwnd does not increase in TCP YeAH
  ipv6: release dst in ping_v6_sendmsg
  ipv4: panic in leaf_walk_rcu due to stale node pointer
  reiserfs: fix "new_insert_key may be used uninitialized ..."
  Fix build warning in kernel/cpuset.c
  include/linux/kernel.h: change abs() macro so it uses consistent return type
  Linux 4.4.22
  openrisc: fix the fix of copy_from_user()
  avr32: fix 'undefined reference to `___copy_from_user'
  ia64: copy_from_user() should zero the destination on access_ok() failure
  genirq/msi: Fix broken debug output
  ppc32: fix copy_from_user()
  sparc32: fix copy_from_user()
  mn10300: copy_from_user() should zero on access_ok() failure...
  nios2: copy_from_user() should zero the tail of destination
  openrisc: fix copy_from_user()
  parisc: fix copy_from_user()
  metag: copy_from_user() should zero the destination on access_ok() failure
  alpha: fix copy_from_user()
  asm-generic: make copy_from_user() zero the destination properly
  mips: copy_from_user() must zero the destination on access_ok() failure
  hexagon: fix strncpy_from_user() error return
  sh: fix copy_from_user()
  score: fix copy_from_user() and friends
  blackfin: fix copy_from_user()
  cris: buggered copy_from_user/copy_to_user/clear_user
  frv: fix clear_user()
  asm-generic: make get_user() clear the destination on errors
  ARC: uaccess: get_user to zero out dest in cause of fault
  s390: get_user() should zero on failure
  score: fix __get_user/get_user
  nios2: fix __get_user()
  sh64: failing __get_user() should zero
  m32r: fix __get_user()
  mn10300: failing __get_user() and get_user() should zero
  fix minor infoleak in get_user_ex()
  microblaze: fix copy_from_user()
  avr32: fix copy_from_user()
  microblaze: fix __get_user()
  fix iov_iter_fault_in_readable()
  irqchip/atmel-aic: Fix potential deadlock in ->xlate()
  genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers
  drm: Only use compat ioctl for addfb2 on X86/IA64
  drm: atmel-hlcdc: Fix vertical scaling
  net: simplify napi_synchronize() to avoid warnings
  kconfig: tinyconfig: provide whole choice blocks to avoid warnings
  soc: qcom/spm: shut up uninitialized variable warning
  pinctrl: at91-pio4: use %pr format string for resource
  mmc: dw_mmc: use resource_size_t to store physical address
  drm/i915: Avoid pointer arithmetic in calculating plane surface offset
  mpssd: fix buffer overflow warning
  gma500: remove annoying deprecation warning
  ipv6: addrconf: fix dev refcont leak when DAD failed
  sched/core: Fix a race between try_to_wake_up() and a woken up task
  Revert "wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel"
  ath9k: fix using sta->drv_priv before initializing it
  md-cluster: make md-cluster also can work when compiled into kernel
  xhci: fix null pointer dereference in stop command timeout function
  fuse: direct-io: don't dirty ITER_BVEC pages
  Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns
  crypto: cryptd - initialize child shash_desc on import
  arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb()
  pinctrl: sunxi: fix uart1 CTS/RTS pins at PG on A23/A33
  pinctrl: pistachio: fix mfio pll_lock pinmux
  dm crypt: fix error with too large bios
  dm log writes: move IO accounting earlier to fix error path
  dm log writes: fix check of kthread_run() return value
  bus: arm-ccn: Fix XP watchpoint settings bitmask
  bus: arm-ccn: Do not attempt to configure XPs for cycle counter
  bus: arm-ccn: Fix PMU handling of MN
  ARM: dts: STiH407-family: Provide interconnect clock for consumption in ST SDHCI
  ARM: dts: overo: fix gpmc nand on boards with ethernet
  ARM: dts: overo: fix gpmc nand cs0 range
  ARM: dts: imx6qdl: Fix SPDIF regression
  ARM: OMAP3: hwmod data: Add sysc information for DSI
  ARM: kirkwood: ib62x0: fix size of u-boot environment partition
  ARM: imx6: add missing BM_CLPCR_BYPASS_PMIC_READY setting for imx6sx
  ARM: imx6: add missing BM_CLPCR_BYP_MMDC_CH0_LPM_HS setting for imx6ul
  ARM: AM43XX: hwmod: Fix RSTST register offset for pruss
  cpuset: make sure new tasks conform to the current config of the cpuset
  net: thunderx: Fix OOPs with ethtool --register-dump
  USB: change bInterval default to 10 ms
  ARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)
  usb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phase
  usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition
  USB: serial: simple: add support for another Infineon flashloader
  serial: 8250: added acces i/o products quad and octal serial cards
  serial: 8250_mid: fix divide error bug if baud rate is 0
  iio: ensure ret is initialized to zero before entering do loop
  iio:core: fix IIO_VAL_FRACTIONAL sign handling
  iio: accel: kxsd9: Fix scaling bug
  iio: fix pressure data output unit in hid-sensor-attributes
  iio: accel: bmc150: reset chip at init time
  iio: adc: at91: unbreak channel adc channel 3
  iio: ad799x: Fix buffered capture for ad7991/ad7995/ad7999
  iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample
  iio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access
  iio: adc: rockchip_saradc: reset saradc controller before programming it
  iio: proximity: as3935: set up buffer timestamps for non-zero values
  iio: accel: kxsd9: Fix raw read return
  kvm-arm: Unmap shadow pagetables properly
  x86/AMD: Apply erratum 665 on machines without a BIOS fix
  x86/paravirt: Do not trace _paravirt_ident_*() functions
  ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS
  IB/uverbs: Fix race between uverbs_close and remove_one
  dm flakey: fix reads to be issued if drop_writes configured
  audit: fix exe_file access in audit_exe_compare
  mm: introduce get_task_exe_file
  kexec: fix double-free when failing to relocate the purgatory
  NFSv4.1: Fix the CREATE_SESSION slot number accounting
  pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised
  nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock
  NFSv4.x: Fix a refcount leak in nfs_callback_up_net
  pNFS: The client must not do I/O to the DS if it's lease has expired
  kernfs: don't depend on d_find_any_alias() when generating notifications
  powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET
  powerpc/powernv : Drop reference added by kset_find_obj()
  powerpc/tm: do not use r13 for tabort_syscall
  tipc: move linearization of buffers to generic code
  lightnvm: put bio before return
  fscrypto: require write access to mount to set encryption policy
  Revert "KVM: x86: fix missed hardware breakpoints"
  MIPS: KVM: Check for pfn noslot case
  clocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function
  fscrypto: add authorization check for setting encryption policy
  ext4: use __GFP_NOFAIL in ext4_free_blocks()

Conflicts:
	arch/arm/kernel/devtree.c
	arch/arm64/Kconfig
	arch/arm64/kernel/arm64ksyms.c
	arch/arm64/kernel/psci.c
	arch/arm64/mm/fault.c
	drivers/android/binder.c
	drivers/usb/host/xhci-hub.c
	fs/ext4/readpage.c
	include/linux/mmc/core.h
	include/linux/mmzone.h
	mm/memcontrol.c
	net/core/filter.c
	net/netlink/af_netlink.c
	net/netlink/af_netlink.h

Change-Id: I99fe7a0914e83e284b11b33185b71448a8999d1f
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2017-02-28 17:10:49 -08:00
Vlastimil Babka
f9d29d58eb mm, compaction: introduce kcompactd
Memory compaction can be currently performed in several contexts:

 - kswapd balancing a zone after a high-order allocation failure
 - direct compaction to satisfy a high-order allocation, including THP
   page fault attemps
 - khugepaged trying to collapse a hugepage
 - manually from /proc

The purpose of compaction is two-fold.  The obvious purpose is to
satisfy a (pending or future) high-order allocation, and is easy to
evaluate.  The other purpose is to keep overal memory fragmentation low
and help the anti-fragmentation mechanism.  The success wrt the latter
purpose is more

The current situation wrt the purposes has a few drawbacks:

 - compaction is invoked only when a high-order page or hugepage is not
   available (or manually).  This might be too late for the purposes of
   keeping memory fragmentation low.
 - direct compaction increases latency of allocations.  Again, it would
   be better if compaction was performed asynchronously to keep
   fragmentation low, before the allocation itself comes.
 - (a special case of the previous) the cost of compaction during THP
   page faults can easily offset the benefits of THP.
 - kswapd compaction appears to be complex, fragile and not working in
   some scenarios.  It could also end up compacting for a high-order
   allocation request when it should be reclaiming memory for a later
   order-0 request.

To improve the situation, we should be able to benefit from an
equivalent of kswapd, but for compaction - i.e. a background thread
which responds to fragmentation and the need for high-order allocations
(including hugepages) somewhat proactively.

One possibility is to extend the responsibilities of kswapd, which could
however complicate its design too much.  It should be better to let
kswapd handle reclaim, as order-0 allocations are often more critical
than high-order ones.

Another possibility is to extend khugepaged, but this kthread is a
single instance and tied to THP configs.

This patch goes with the option of a new set of per-node kthreads called
kcompactd, and lays the foundations, without introducing any new
tunables.  The lifecycle mimics kswapd kthreads, including the memory
hotplug hooks.

For compaction, kcompactd uses the standard compaction_suitable() and
ompact_finished() criteria and the deferred compaction functionality.
Unlike direct compaction, it uses only sync compaction, as there's no
allocation latency to minimize.

This patch doesn't yet add a call to wakeup_kcompactd.  The kswapd
compact/reclaim loop for high-order pages will be replaced by waking up
kcompactd in the next patch with the description of what's wrong with
the old approach.

Waking up of the kcompactd threads is also tied to kswapd activity and
follows these rules:
 - we don't want to affect any fastpaths, so wake up kcompactd only from
   the slowpath, as it's done for kswapd
 - if kswapd is doing reclaim, it's more important than compaction, so
   don't invoke kcompactd until kswapd goes to sleep
 - the target order used for kswapd is passed to kcompactd

Future possible future uses for kcompactd include the ability to wake up
kcompactd on demand in special situations, such as when hugepages are
not available (currently not done due to __GFP_NO_KSWAPD) or when a
fragmentation event (i.e.  __rmqueue_fallback()) occurs.  It's also
possible to perform periodic compaction with kcompactd.

[arnd@arndb.de: fix build errors with kcompactd]
[paul.gortmaker@windriver.com: don't use modular references for non modular code]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 698b1b30642f1ff0ea10ef1de9745ab633031377
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Change-Id: I987ae548cba936987b8479dc02de67d0f88b9cb6
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2017-02-21 12:36:57 +05:30
Runmin Wang
efbe378b81 Merge branch 'v4.4-16.09-android-tmp' into lsk-v4.4-16.09-android
* v4.4-16.09-android-tmp:
  unsafe_[get|put]_user: change interface to use a error target label
  usercopy: remove page-spanning test for now
  usercopy: fix overlap check for kernel text
  mm/slub: support left redzone
  Linux 4.4.21
  lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs
  regulator: anatop: allow regulator to be in bypass mode
  hwrng: exynos - Disable runtime PM on probe failure
  cpufreq: Fix GOV_LIMITS handling for the userspace governor
  metag: Fix atomic_*_return inline asm constraints
  scsi: fix upper bounds check of sense key in scsi_sense_key_string()
  ALSA: timer: fix NULL pointer dereference on memory allocation failure
  ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE
  ALSA: timer: fix NULL pointer dereference in read()/ioctl() race
  ALSA: hda - Enable subwoofer on Dell Inspiron 7559
  ALSA: hda - Add headset mic quirk for Dell Inspiron 5468
  ALSA: rawmidi: Fix possible deadlock with virmidi registration
  ALSA: fireworks: accessing to user space outside spinlock
  ALSA: firewire-tascam: accessing to user space outside spinlock
  ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114
  crypto: caam - fix IV loading for authenc (giv)decryption
  uprobes: Fix the memcg accounting
  x86/apic: Do not init irq remapping if ioapic is disabled
  vhost/scsi: fix reuse of &vq->iov[out] in response
  bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two.
  ubifs: Fix assertion in layout_in_gaps()
  ovl: fix workdir creation
  ovl: listxattr: use strnlen()
  ovl: remove posix_acl_default from workdir
  ovl: don't copy up opaqueness
  wrappers for ->i_mutex access
  lustre: remove unused declaration
  timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING
  timekeeping: Cap array access in timekeeping_debug
  xfs: fix superblock inprogress check
  ASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup
  drm/msm: fix use of copy_from_user() while holding spinlock
  drm: Reject page_flip for !DRIVER_MODESET
  drm/radeon: fix radeon_move_blit on 32bit systems
  s390/sclp_ctl: fix potential information leak with /dev/sclp
  rds: fix an infoleak in rds_inc_info_copy
  powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
  nvme: Call pci_disable_device on the error path.
  cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork
  block: make sure a big bio is split into at most 256 bvecs
  block: Fix race triggered by blk_set_queue_dying()
  ext4: avoid modifying checksum fields directly during checksum verification
  ext4: avoid deadlock when expanding inode size
  ext4: properly align shifted xattrs when expanding inodes
  ext4: fix xattr shifting when expanding inodes part 2
  ext4: fix xattr shifting when expanding inodes
  ext4: validate that metadata blocks do not overlap superblock
  net: Use ns_capable_noaudit() when determining net sysctl permissions
  kernel: Add noaudit variant of ns_capable()
  KEYS: Fix ASN.1 indefinite length object parsing
  drivers:hv: Lock access to hyperv_mmio resource tree
  cxlflash: Move to exponential back-off when cmd_room is not available
  netfilter: x_tables: check for size overflow
  drm/amdgpu/cz: enable/disable vce dpm even if vce pg is disabled
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  IB/IPoIB: Do not set skb truesize since using one linearskb
  udp: properly support MSG_PEEK with truncated buffers
  crypto: nx-842 - Mask XERS0 bit in return value
  cxlflash: Fix to avoid virtual LUN failover failure
  cxlflash: Fix to escalate LINK_RESET also on port 1
  tipc: fix nl compat regression for link statistics
  tipc: fix an infoleak in tipc_nl_compat_link_dump
  netfilter: x_tables: check for size overflow
  Bluetooth: Add support for Intel Bluetooth device 8265 [8087:0a2b]
  drm/i915: Check VBT for port presence in addition to the strap on VLV/CHV
  drm/i915: Only ignore eDP ports that are connected
  Input: xpad - move pending clear to the correct location
  net: thunderx: Fix link status reporting
  x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances
  crypto: vmx - IV size failing on skcipher API
  tda10071: Fix dependency to REGMAP_I2C
  crypto: vmx - Fix ABI detection
  crypto: vmx - comply with ABIs that specify vrsave as reserved.
  HID: core: prevent out-of-bound readings
  lpfc: Fix DMA faults observed upon plugging loopback connector
  block: fix blk_rq_get_max_sectors for driver private requests
  irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144
  clocksource: Allow unregistering the watchdog
  btrfs: Continue write in case of can_not_nocow
  blk-mq: End unstarted requests on dying queue
  cxlflash: Fix to resolve dead-lock during EEH recovery
  drm/radeon/mst: fix regression in lane/link handling.
  ecryptfs: fix handling of directory opening
  ALSA: hda: add AMD Polaris-10/11 AZ PCI IDs with proper driver caps
  drm: Balance error path for GEM handle allocation
  ntp: Fix ADJ_SETOFFSET being used w/ ADJ_NANO
  time: Verify time values in adjtimex ADJ_SETOFFSET to avoid overflow
  Input: xpad - correctly handle concurrent LED and FF requests
  net: thunderx: Fix receive packet stats
  net: thunderx: Fix for multiqset not configured upon interface toggle
  perf/x86/cqm: Fix CQM memory leak and notifier leak
  perf/x86/cqm: Fix CQM handling of grouping events into a cache_group
  s390/crypto: provide correct file mode at device register.
  proc: revert /proc/<pid>/maps [stack:TID] annotation
  intel_idle: Support for Intel Xeon Phi Processor x200 Product Family
  cxlflash: Fix to avoid unnecessary scan with internal LUNs
  Drivers: hv: vmbus: don't manipulate with clocksources on crash
  Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload()
  Drivers: hv: vmbus: avoid infinite loop in init_vp_index()
  arcmsr: fixes not release allocated resource
  arcmsr: fixed getting wrong configuration data
  s390/pci_dma: fix DMA table corruption with > 4 TB main memory
  net/mlx5e: Don't modify CQ before it was created
  net/mlx5e: Don't try to modify CQ moderation if it is not supported
  mmc: sdhci: Do not BUG on invalid vdd
  UVC: Add support for R200 depth camera
  sched/numa: Fix use-after-free bug in the task_numa_compare
  ALSA: hda - add codec support for Kabylake display audio codec
  drm/i915: Fix hpd live status bits for g4x
  tipc: fix nullptr crash during subscription cancel
  arm64: Add workaround for Cavium erratum 27456
  net: thunderx: Fix for Qset error due to CQ full
  drm/radeon: fix dp link rate selection (v2)
  drm/amdgpu: fix dp link rate selection (v2)
  qla2xxx: Use ATIO type to send correct tmr response
  mmc: sdhci: 64-bit DMA actually has 4-byte alignment
  drm/atomic: Do not unset crtc when an encoder is stolen
  drm/i915/skl: Add missing SKL ids
  drm/i915/bxt: update list of PCIIDs
  hrtimer: Catch illegal clockids
  i40e/i40evf: Fix RSS rx-flow-hash configuration through ethtool
  mpt3sas: Fix for Asynchronous completion of timedout IO and task abort of timedout IO.
  mpt3sas: A correction in unmap_resources
  net: cavium: liquidio: fix check for in progress flag
  arm64: KVM: Configure TCR_EL2.PS at runtime
  irqchip/gic-v3: Make sure read from ICC_IAR1_EL1 is visible on redestributor
  pwm: lpc32xx: fix and simplify duty cycle and period calculations
  pwm: lpc32xx: correct number of PWM channels from 2 to 1
  pwm: fsl-ftm: Fix clock enable/disable when using PM
  megaraid_sas: Add an i/o barrier
  megaraid_sas: Fix SMAP issue
  megaraid_sas: Do not allow PCI access during OCR
  s390/cio: update measurement characteristics
  s390/cio: ensure consistent measurement state
  s390/cio: fix measurement characteristics memleak
  qeth: initialize net_device with carrier off
  lpfc: Fix external loopback failure.
  lpfc: Fix mbox reuse in PLOGI completion
  lpfc: Fix RDP Speed reporting.
  lpfc: Fix crash in fcp command completion path.
  lpfc: Fix driver crash when module parameter lpfc_fcp_io_channel set to 16
  lpfc: Fix RegLogin failed error seen on Lancer FC during port bounce
  lpfc: Fix the FLOGI discovery logic to comply with T11 standards
  lpfc: Fix FCF Infinite loop in lpfc_sli4_fcf_rr_next_index_get.
  cxl: Enable PCI device ID for future IBM CXL adapter
  cxl: fix build for GCC 4.6.x
  cxlflash: Enable device id for future IBM CXL adapter
  cxlflash: Resolve oops in wait_port_offline
  cxlflash: Fix to resolve cmd leak after host reset
  cxl: Fix DSI misses when the context owning task exits
  cxl: Fix possible idr warning when contexts are released
  Drivers: hv: vmbus: fix rescind-offer handling for device without a driver
  Drivers: hv: vmbus: serialize process_chn_event() and vmbus_close_internal()
  Drivers: hv: vss: run only on supported host versions
  drivers/hv: cleanup synic msrs if vmbus connect failed
  Drivers: hv: util: catch allocation errors
  tools: hv: report ENOSPC errors in hv_fcopy_daemon
  Drivers: hv: utils: run polling callback always in interrupt context
  Drivers: hv: util: Increase the timeout for util services
  lightnvm: fix missing grown bad block type
  lightnvm: fix locking and mempool in rrpc_lun_gc
  lightnvm: unlock rq and free ppa_list on submission fail
  lightnvm: add check after mempool allocation
  lightnvm: fix incorrect nr_free_blocks stat
  lightnvm: fix bio submission issue
  cxlflash: a couple off by one bugs
  fm10k: Cleanup exception handling for mailbox interrupt
  fm10k: Cleanup MSI-X interrupts in case of failure
  fm10k: reinitialize queuing scheme after calling init_hw
  fm10k: always check init_hw for errors
  fm10k: reset max_queues on init_hw_vf failure
  fm10k: Fix handling of NAPI budget when multiple queues are enabled per vector
  fm10k: Correct MTU for jumbo frames
  fm10k: do not assume VF always has 1 queue
  clk: xgene: Fix divider with non-zero shift value
  e1000e: fix division by zero on jumbo MTUs
  e1000: fix data race between tx_ring->next_to_clean
  ixgbe: Fix handling of NAPI budget when multiple queues are enabled per vector
  igb: fix NULL derefs due to skipped SR-IOV enabling
  igb: use the correct i210 register for EEMNGCTL
  igb: don't unmap NULL hw_addr
  i40e: Fix Rx hash reported to the stack by our driver
  i40e: clean whole mac filter list
  i40evf: check rings before freeing resources
  i40e: don't add zero MAC filter
  i40e: properly delete VF MAC filters
  i40e: Fix memory leaks, sideband filter programming
  i40e: fix: do not sleep in netdev_ops
  i40e/i40evf: Fix RS bit update in Tx path and disable force WB workaround
  i40evf: handle many MAC filters correctly
  i40e: Workaround fix for mss < 256 issue
  UPSTREAM: audit: fix a double fetch in audit_log_single_execve_arg()
  UPSTREAM: ARM: 8494/1: mm: Enable PXN when running non-LPAE kernel on LPAE processor
  FIXUP: sched/tune: update accouting before CPU capacity
  FIXUP: sched/tune: add fixes missing from a previous patch
  arm: Fix #if/#ifdef typo in topology.c
  arm: Fix build error "conflicting types for 'scale_cpu_capacity'"
  sched/walt: use do_div instead of division operator
  DEBUG: cpufreq: fix cpu_capacity tracing build for non-smp systems
  sched/walt: include missing header for arm_timer_read_counter()
  cpufreq: Kconfig: Fixup incorrect selection by CPU_FREQ_DEFAULT_GOV_SCHED
  sched/fair: Avoid redundant idle_cpu() call in update_sg_lb_stats()
  FIXUP: sched: scheduler-driven cpu frequency selection
  sched/rt: Add Kconfig option to enable panicking for RT throttling
  sched/rt: print RT tasks when RT throttling is activated
  UPSTREAM: sched: Fix a race between __kthread_bind() and sched_setaffinity()
  sched/fair: Favor higher cpus only for boosted tasks
  vmstat: make vmstat_updater deferrable again and shut down on idle
  sched/fair: call OPP update when going idle after migration
  sched/cpufreq_sched: fix thermal capping events
  sched/fair: Picking cpus with low OPPs for tasks that prefer idle CPUs
  FIXUP: sched/tune: do initialization as a postcore_initicall
  DEBUG: sched: add tracepoint for RD overutilized
  sched/tune: Introducing a new schedtune attribute prefer_idle
  sched: use util instead of capacity to select busy cpu
  arch_timer: add error handling when the MPM global timer is cleared
  FIXUP: sched: Fix double-release of spinlock in move_queued_task
  FIXUP: sched/fair: Fix hang during suspend in sched_group_energy
  FIXUP: sched: fix SchedFreq integration for both PELT and WALT
  sched: EAS: Avoid causing spikes to max-freq unnecessarily
  FIXUP: sched: fix set_cfs_cpu_capacity when WALT is in use
  sched/walt: Accounting for number of irqs pending on each core
  sched: Introduce Window Assisted Load Tracking (WALT)
  sched/tune: fix PB and PC cuts indexes definition
  sched/fair: optimize idle cpu selection for boosted tasks
  FIXUP: sched/tune: fix accounting for runnable tasks
  sched/tune: use a single initialisation function
  sched/{fair,tune}: simplify fair.c code
  FIXUP: sched/tune: fix payoff calculation for boost region
  sched/tune: Add support for negative boost values
  FIX: sched/tune: move schedtune_nornalize_energy into fair.c
  FIX: sched/tune: update usage of boosted task utilisation on CPU selection
  sched/fair: add tunable to set initial task load
  sched/fair: add tunable to force selection at cpu granularity
  sched: EAS: take cstate into account when selecting idle core
  sched/cpufreq_sched: Consolidated update
  FIXUP: sched: fix build for non-SMP target
  DEBUG: sched/tune: add tracepoint on P-E space filtering
  DEBUG: sched/tune: add tracepoint for energy_diff() values
  DEBUG: sched/tune: add tracepoint for task boost signal
  arm: topology: Define TC2 energy and provide it to the scheduler
  CHROMIUM: sched: update the average of nr_running
  DEBUG: schedtune: add tracepoint for schedtune_tasks_update() values
  DEBUG: schedtune: add tracepoint for CPU boost signal
  DEBUG: schedtune: add tracepoint for SchedTune configuration update
  DEBUG: sched: add energy procfs interface
  DEBUG: sched,cpufreq: add cpu_capacity change tracepoint
  DEBUG: sched: add tracepoint for CPU load/util signals
  DEBUG: sched: add tracepoint for task load/util signals
  DEBUG: sched: add tracepoint for cpu/freq scale invariance
  sched/fair: filter energy_diff() based on energy_payoff value
  sched/tune: add support to compute normalized energy
  sched/fair: keep track of energy/capacity variations
  sched/fair: add boosted task utilization
  sched/{fair,tune}: track RUNNABLE tasks impact on per CPU boost value
  sched/tune: compute and keep track of per CPU boost value
  sched/tune: add initial support for CGroups based boosting
  sched/fair: add boosted CPU usage
  sched/fair: add function to convert boost value into "margin"
  sched/tune: add sysctl interface to define a boost value
  sched/tune: add detailed documentation
  fixup! sched/fair: jump to max OPP when crossing UP threshold
  fixup! sched: scheduler-driven cpu frequency selection
  sched: rt scheduler sets capacity requirement
  sched: deadline: use deadline bandwidth in scale_rt_capacity
  sched: remove call of sched_avg_update from sched_rt_avg_update
  sched/cpufreq_sched: add trace events
  sched/fair: jump to max OPP when crossing UP threshold
  sched/fair: cpufreq_sched triggers for load balancing
  sched/{core,fair}: trigger OPP change request on fork()
  sched/fair: add triggers for OPP change requests
  sched: scheduler-driven cpu frequency selection
  cpufreq: introduce cpufreq_driver_is_slow
  sched: Consider misfit tasks when load-balancing
  sched: Add group_misfit_task load-balance type
  sched: Add per-cpu max capacity to sched_group_capacity
  sched: Do eas idle balance regardless of the rq avg idle value
  arm64: Enable max freq invariant scheduler load-tracking and capacity support
  arm: Enable max freq invariant scheduler load-tracking and capacity support
  sched: Update max cpu capacity in case of max frequency constraints
  cpufreq: Max freq invariant scheduler load-tracking and cpu capacity support
  arm64, topology: Updates to use DT bindings for EAS costing data
  sched: Support for extracting EAS energy costs from DT
  Documentation: DT bindings for energy model cost data required by EAS
  sched: Disable energy-unfriendly nohz kicks
  sched: Consider a not over-utilized energy-aware system as balanced
  sched: Energy-aware wake-up task placement
  sched: Determine the current sched_group idle-state
  sched, cpuidle: Track cpuidle state index in the scheduler
  sched: Add over-utilization/tipping point indicator
  sched: Estimate energy impact of scheduling decisions
  sched: Extend sched_group_energy to test load-balancing decisions
  sched: Calculate energy consumption of sched_group
  sched: Highest energy aware balancing sched_domain level pointer
  sched: Relocated cpu_util() and change return type
  sched: Compute cpu capacity available at current frequency
  arm64: Cpu invariant scheduler load-tracking and capacity support
  arm: Cpu invariant scheduler load-tracking and capacity support
  sched: Introduce SD_SHARE_CAP_STATES sched_domain flag
  sched: Initialize energy data structures
  sched: Introduce energy data structures
  sched: Make energy awareness a sched feature
  sched: Documentation for scheduler energy cost model
  sched: Prevent unnecessary active balance of single task in sched group
  sched: Enable idle balance to pull single task towards cpu with higher capacity
  sched: Consider spare cpu capacity at task wake-up
  sched: Add cpu capacity awareness to wakeup balancing
  sched: Store system-wide maximum cpu capacity in root domain
  arm: Update arch_scale_cpu_capacity() to reflect change to define
  arm64: Enable frequency invariant scheduler load-tracking support
  arm: Enable frequency invariant scheduler load-tracking support
  cpufreq: Frequency invariant scheduler load-tracking support
  sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task()
  FROMLIST: pstore: drop pmsg bounce buffer
  UPSTREAM: usercopy: remove page-spanning test for now
  UPSTREAM: usercopy: force check_object_size() inline
  BACKPORT: usercopy: fold builtin_const check into inline function
  UPSTREAM: x86/uaccess: force copy_*_user() to be inlined
  UPSTREAM: HID: core: prevent out-of-bound readings
  Android: Fix build breakages.
  UPSTREAM: tty: Prevent ldisc drivers from re-using stale tty fields
  UPSTREAM: netfilter: nfnetlink: correctly validate length of batch messages
  cpuset: Make cpusets restore on hotplug
  UPSTREAM: mm/slub: support left redzone
  UPSTREAM: Make the hardened user-copy code depend on having a hardened allocator
  Android: MMC/UFS IO Latency Histograms.
  UPSTREAM: usercopy: fix overlap check for kernel text
  UPSTREAM: usercopy: avoid potentially undefined behavior in pointer math
  UPSTREAM: unsafe_[get|put]_user: change interface to use a error target label
  BACKPORT: arm64: mm: fix location of _etext
  BACKPORT: ARM: 8583/1: mm: fix location of _etext
  BACKPORT: Don't show empty tag stats for unprivileged uids
  UPSTREAM: tcp: fix use after free in tcp_xmit_retransmit_queue()
  ANDROID: base-cfg: drop SECCOMP_FILTER config
  UPSTREAM: [media] xc2028: unlock on error in xc2028_set_config()
  UPSTREAM: [media] xc2028: avoid use after free
  ANDROID: base-cfg: enable SECCOMP config
  ANDROID: rcu_sync: Export rcu_sync_lockdep_assert
  RFC: FROMLIST: cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork
  RFC: FROMLIST: cgroup: avoid synchronize_sched() in __cgroup_procs_write()
  RFC: FROMLIST: locking/percpu-rwsem: Optimize readers and reduce global impact
  net: ipv6: Fix ping to link-local addresses.
  ipv6: fix endianness error in icmpv6_err
  ANDROID: dm: android-verity: Allow android-verity to be compiled as an independent module
  backporting: a brief introduce of backported feautures on 4.4
  Linux 4.4.20
  sysfs: correctly handle read offset on PREALLOC attrs
  hwmon: (iio_hwmon) fix memory leak in name attribute
  ALSA: line6: Fix POD sysfs attributes segfault
  ALSA: line6: Give up on the lock while URBs are released.
  ALSA: line6: Remove double line6_pcm_release() after failed acquire.
  ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present
  ACPI / sysfs: fix error code in get_status()
  ACPI / drivers: replace acpi_probe_lock spinlock with mutex
  ACPI / drivers: fix typo in ACPI_DECLARE_PROBE_ENTRY macro
  staging: comedi: ni_mio_common: fix wrong insn_write handler
  staging: comedi: ni_mio_common: fix AO inttrig backwards compatibility
  staging: comedi: comedi_test: fix timer race conditions
  staging: comedi: daqboard2000: bug fix board type matching code
  USB: serial: option: add WeTelecom 0x6802 and 0x6803 products
  USB: serial: option: add WeTelecom WM-D200
  USB: serial: mos7840: fix non-atomic allocation in write path
  USB: serial: mos7720: fix non-atomic allocation in write path
  USB: fix typo in wMaxPacketSize validation
  usb: chipidea: udc: don't touch DP when controller is in host mode
  USB: avoid left shift by -1
  dmaengine: usb-dmac: check CHCR.DE bit in usb_dmac_isr_channel()
  crypto: qat - fix aes-xts key sizes
  crypto: nx - off by one bug in nx_of_update_msc()
  Input: i8042 - set up shared ps2_cmd_mutex for AUX ports
  Input: i8042 - break load dependency between atkbd/psmouse and i8042
  Input: tegra-kbc - fix inverted reset logic
  btrfs: properly track when rescan worker is running
  btrfs: waiting on qgroup rescan should not always be interruptible
  fs/seq_file: fix out-of-bounds read
  gpio: Fix OF build problem on UM
  usb: renesas_usbhs: gadget: fix return value check in usbhs_mod_gadget_probe()
  megaraid_sas: Fix probing cards without io port
  mpt3sas: Fix resume on WarpDrive flash cards
  cdc-acm: fix wrong pipe type on rx interrupt xfers
  i2c: cros-ec-tunnel: Fix usage of cros_ec_cmd_xfer()
  mfd: cros_ec: Add cros_ec_cmd_xfer_status() helper
  aacraid: Check size values after double-fetch from user
  ARC: Elide redundant setup of DMA callbacks
  ARC: Call trace_hardirqs_on() before enabling irqs
  ARC: use correct offset in pt_regs for saving/restoring user mode r25
  ARC: build: Better way to detect ISA compatible toolchain
  drm/i915: fix aliasing_ppgtt leak
  drm/amdgpu: record error code when ring test failed
  drm/amd/amdgpu: sdma resume fail during S4 on CI
  drm/amdgpu: skip TV/CV in display parsing
  drm/amdgpu: avoid a possible array overflow
  drm/amdgpu: fix amdgpu_move_blit on 32bit systems
  drm/amdgpu: Change GART offset to 64-bit
  iio: fix sched WARNING "do not call blocking ops when !TASK_RUNNING"
  sched/nohz: Fix affine unpinned timers mess
  sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
  of: fix reference counting in of_graph_get_endpoint_by_regs
  arm64: dts: rockchip: add reset saradc node for rk3368 SoCs
  mac80211: fix purging multicast PS buffer queue
  s390/dasd: fix hanging device after clear subchannel
  EDAC: Increment correct counter in edac_inc_ue_error()
  pinctrl/amd: Remove the default de-bounce time
  iommu/arm-smmu: Don't BUG() if we find aborting STEs with disable_bypass
  iommu/arm-smmu: Fix CMDQ error handling
  iommu/dma: Don't put uninitialised IOVA domains
  xhci: Make sure xhci handles USB_SPEED_SUPER_PLUS devices.
  USB: serial: ftdi_sio: add PIDs for Ivium Technologies devices
  USB: serial: ftdi_sio: add device ID for WICED USB UART dev board
  USB: serial: option: add support for Telit LE920A4
  USB: serial: option: add D-Link DWM-156/A3
  USB: serial: fix memleak in driver-registration error path
  xhci: don't dereference a xhci member after removing xhci
  usb: xhci: Fix panic if disconnect
  xhci: always handle "Command Ring Stopped" events
  usb/gadget: fix gadgetfs aio support.
  usb: gadget: fsl_qe_udc: off by one in setup_received_handle()
  USB: validate wMaxPacketValue entries in endpoint descriptors
  usb: renesas_usbhs: Use dmac only if the pipe type is bulk
  usb: renesas_usbhs: clear the BRDYSTS in usbhsg_ep_enable()
  USB: hub: change the locking in hub_activate
  USB: hub: fix up early-exit pathway in hub_activate
  usb: hub: Fix unbalanced reference count/memory leak/deadlocks
  usb: define USB_SPEED_SUPER_PLUS speed for SuperSpeedPlus USB3.1 devices
  usb: dwc3: gadget: increment request->actual once
  usb: dwc3: pci: add Intel Kabylake PCI ID
  usb: misc: usbtest: add fix for driver hang
  usb: ehci: change order of register cleanup during shutdown
  crypto: caam - defer aead_set_sh_desc in case of zero authsize
  crypto: caam - fix echainiv(authenc) encrypt shared descriptor
  crypto: caam - fix non-hmac hashes
  genirq/msi: Make sure PCI MSIs are activated early
  genirq/msi: Remove unused MSI_FLAG_IDENTITY_MAP
  um: Don't discard .text.exit section
  ACPI / CPPC: Prevent cpc_desc_ptr points to the invalid data
  ACPI: CPPC: Return error if _CPC is invalid on a CPU
  mmc: sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
  PCI: Limit config space size for Netronome NFP4000
  PCI: Add Netronome NFP4000 PF device ID
  PCI: Limit config space size for Netronome NFP6000 family
  PCI: Add Netronome vendor and device IDs
  PCI: Support PCIe devices with short cfg_size
  NVMe: Don't unmap controller registers on reset
  ALSA: hda - Manage power well properly for resume
  libnvdimm, nd_blk: mask off reserved status bits
  perf intel-pt: Fix occasional decoding errors when tracing system-wide
  vfio/pci: Fix NULL pointer oops in error interrupt setup handling
  virtio: fix memory leak in virtqueue_add()
  parisc: Fix order of EREFUSED define in errno.h
  arm64: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
  ALSA: usb-audio: Add quirk for ELP HD USB Camera
  ALSA: usb-audio: Add a sample rate quirk for Creative Live! Cam Socialize HD (VF0610)
  powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
  SUNRPC: allow for upcalls for same uid but different gss service
  SUNRPC: Handle EADDRNOTAVAIL on connection failures
  tools/testing/nvdimm: fix SIGTERM vs hotplug crash
  uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions
  x86/mm: Disable preemption during CR3 read+write
  hugetlb: fix nr_pmds accounting with shared page tables
  mm: SLUB hardened usercopy support
  mm: SLAB hardened usercopy support
  s390/uaccess: Enable hardened usercopy
  sparc/uaccess: Enable hardened usercopy
  powerpc/uaccess: Enable hardened usercopy
  ia64/uaccess: Enable hardened usercopy
  arm64/uaccess: Enable hardened usercopy
  ARM: uaccess: Enable hardened usercopy
  x86/uaccess: Enable hardened usercopy
  x86: remove more uaccess_32.h complexity
  x86: remove pointless uaccess_32.h complexity
  x86: fix SMAP in 32-bit environments
  Use the new batched user accesses in generic user string handling
  Add 'unsafe' user access functions for batched accesses
  x86: reorganize SMAP handling in user space accesses
  mm: Hardened usercopy
  mm: Implement stack frame object validation
  mm: Add is_migrate_cma_page
  Linux 4.4.19
  Documentation/module-signing.txt: Note need for version info if reusing a key
  module: Invalidate signatures on force-loaded modules
  dm flakey: error READ bios during the down_interval
  rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq()
  lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
  ACPI / EC: Work around method reentrancy limit in ACPICA for _Qxx
  x86/platform/intel_mid_pci: Rework IRQ0 workaround
  PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset
  MIPS: hpet: Increase HPET_MIN_PROG_DELTA and decrease HPET_MIN_CYCLES
  MIPS: Don't register r4k sched clock when CPUFREQ enabled
  MIPS: mm: Fix definition of R6 cache instruction
  SUNRPC: Don't allocate a full sockaddr_storage for tracing
  Input: elan_i2c - properly wake up touchpad on ASUS laptops
  target: Fix ordered task CHECK_CONDITION early exception handling
  target: Fix max_unmap_lba_count calc overflow
  target: Fix race between iscsi-target connection shutdown + ABORT_TASK
  target: Fix missing complete during ABORT_TASK + CMD_T_FABRIC_STOP
  target: Fix ordered task target_setup_cmd_from_cdb exception hang
  iscsi-target: Fix panic when adding second TCP connection to iSCSI session
  ubi: Fix race condition between ubi device creation and udev
  ubi: Fix early logging
  ubi: Make volume resize power cut aware
  of: fix memory leak related to safe_name()
  IB/mlx4: Fix memory leak if QP creation failed
  IB/mlx4: Fix error flow when sending mads under SRIOV
  IB/mlx4: Fix the SQ size of an RC QP
  IB/IWPM: Fix a potential skb leak
  IB/IPoIB: Don't update neigh validity for unresolved entries
  IB/SA: Use correct free function
  IB/mlx5: Return PORT_ERR in Active to Initializing tranisition
  IB/mlx5: Fix post send fence logic
  IB/mlx5: Fix entries check in mlx5_ib_resize_cq
  IB/mlx5: Fix returned values of query QP
  IB/mlx5: Fix entries checks in mlx5_ib_create_cq
  IB/mlx5: Fix MODIFY_QP command input structure
  ALSA: hda - Fix headset mic detection problem for two dell machines
  ALSA: hda: add AMD Bonaire AZ PCI ID with proper driver caps
  ALSA: hda/realtek - Can't adjust speaker's volume on a Dell AIO
  ALSA: hda: Fix krealloc() with __GFP_ZERO usage
  mm/hugetlb: avoid soft lockup in set_max_huge_pages()
  mtd: nand: fix bug writing 1 byte less than page size
  block: fix bdi vs gendisk lifetime mismatch
  block: add missing group association in bio-cloning functions
  metag: Fix __cmpxchg_u32 asm constraint for CMP
  ftrace/recordmcount: Work around for addition of metag magic but not relocations
  balloon: check the number of available pages in leak balloon
  drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"
  drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
  drm/edid: Add 6 bpc quirk for display AEO model 0.
  drm: Restore double clflush on the last partial cacheline
  drm/nouveau/fbcon: fix font width not divisible by 8
  drm/nouveau/gr/nv3x: fix instobj write offsets in gr setup
  drm/nouveau: check for supported chipset before booting fbdev off the hw
  drm/radeon: support backlight control for UNIPHY3
  drm/radeon: fix firmware info version checks
  drm/radeon: Poll for both connect/disconnect on analog connectors
  drm/radeon: add a delay after ATPX dGPU power off
  drm/amdgpu/gmc7: add missing mullins case
  drm/amdgpu: fix firmware info version checks
  drm/amdgpu: Disable RPM helpers while reprobing connectors on resume
  drm/amdgpu: support backlight control for UNIPHY3
  drm/amdgpu: Poll for both connect/disconnect on analog connectors
  drm/amdgpu: add a delay after ATPX dGPU power off
  w1:omap_hdq: fix regression
  netlabel: add address family checks to netlbl_{sock,req}_delattr()
  ARM: dts: sunxi: Add a startup delay for fixed regulator enabled phys
  audit: fix a double fetch in audit_log_single_execve_arg()
  iommu/amd: Update Alias-DTE in update_device_table()
  iommu/amd: Init unity mappings only for dma_ops domains
  iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back
  iommu/vt-d: Return error code in domain_context_mapping_one()
  iommu/exynos: Suppress unbinding to prevent system failure
  drm/i915: Don't complain about lack of ACPI video bios
  nfsd: don't return an unhashed lock stateid after taking mutex
  nfsd: Fix race between FREE_STATEID and LOCK
  nfs: don't create zero-length requests
  MIPS: KVM: Propagate kseg0/mapped tlb fault errors
  MIPS: KVM: Fix gfn range check in kseg0 tlb faults
  MIPS: KVM: Add missing gfn range check
  MIPS: KVM: Fix mapped fault broken commpage handling
  random: add interrupt callback to VMBus IRQ handler
  random: print a warning for the first ten uninitialized random users
  random: initialize the non-blocking pool via add_hwgenerator_randomness()
  CIFS: Fix a possible invalid memory access in smb2_query_symlink()
  cifs: fix crash due to race in hmac(md5) handling
  cifs: Check for existing directory when opening file with O_CREAT
  fs/cifs: make share unaccessible at root level mountable
  jbd2: make journal y2038 safe
  ARC: mm: don't loose PTE_SPECIAL in pte_modify()
  remoteproc: Fix potential race condition in rproc_add
  ovl: disallow overlayfs as upperdir
  HID: uhid: fix timeout when probe races with IO
  EDAC: Correct channel count limit
  Bluetooth: Fix l2cap_sock_setsockopt() with optname BT_RCVMTU
  spi: pxa2xx: Clear all RFT bits in reset_sccr1() on Intel Quark
  i2c: efm32: fix a failure path in efm32_i2c_probe()
  s5p-mfc: Add release callback for memory region devs
  s5p-mfc: Set device name for reserved memory region devs
  hp-wmi: Fix wifi cannot be hard-unblocked
  dm: set DMF_SUSPENDED* _before_ clearing DMF_NOFLUSH_SUSPENDING
  sur40: fix occasional oopses on device close
  sur40: lower poll interval to fix occasional FPS drops to ~56 FPS
  Fix RC5 decoding with Fintek CIR chipset
  vb2: core: Skip planes array verification if pb is NULL
  videobuf2-v4l2: Verify planes array in buffer dequeueing
  media: dvb_ringbuffer: Add memory barriers
  media: usbtv: prevent access to free'd resources
  mfd: qcom_rpm: Parametrize also ack selector size
  mfd: qcom_rpm: Fix offset error for msm8660
  intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate()
  s390/cio: allow to reset channel measurement block
  KVM: nVMX: Fix memory corruption when using VMCS shadowing
  KVM: VMX: handle PML full VMEXIT that occurs during event delivery
  KVM: MTRR: fix kvm_mtrr_check_gfn_range_consistency page fault
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  arm64: mm: avoid fdt_check_header() before the FDT is fully mapped
  arm64: dts: rockchip: fixes the gic400 2nd region size for rk3368
  pinctrl: cherryview: prevent concurrent access to GPIO controllers
  Bluetooth: hci_intel: Fix null gpio desc pointer dereference
  gpio: intel-mid: Remove potentially harmful code
  gpio: pca953x: Fix NBANK calculation for PCA9536
  tty/serial: atmel: fix RS485 half duplex with DMA
  serial: samsung: Fix ERR pointer dereference on deferred probe
  tty: serial: msm: Don't read off end of tx fifo
  arm64: Fix incorrect per-cpu usage for boot CPU
  arm64: debug: unmask PSTATE.D earlier
  arm64: kernel: Save and restore UAO and addr_limit on exception entry
  USB: usbfs: fix potential infoleak in devio
  usb: renesas_usbhs: fix NULL pointer dereference in xfer_work()
  USB: serial: option: add support for Telit LE910 PID 0x1206
  usb: dwc3: fix for the isoc transfer EP_BUSY flag
  usb: quirks: Add no-lpm quirk for Elan
  usb: renesas_usbhs: protect the CFIFOSEL setting in usbhsg_ep_enable()
  usb: f_fs: off by one bug in _ffs_func_bind()
  usb: gadget: avoid exposing kernel stack
  UPSTREAM: usb: gadget: configfs: add mutex lock before unregister gadget
  ANDROID: dm-verity: adopt changes made to dm callbacks
  UPSTREAM: ecryptfs: fix handling of directory opening
  ANDROID: net: core: fix UID-based routing
  ANDROID: net: fib: remove duplicate assignment
  FROMLIST: proc: Fix timerslack_ns CAP_SYS_NICE check when adjusting self
  ANDROID: dm verity fec: pack the fec_header structure
  ANDROID: dm: android-verity: Verify header before fetching table
  ANDROID: dm: allow adb disable-verity only in userdebug
  ANDROID: dm: mount as linear target if eng build
  ANDROID: dm: use default verity public key
  ANDROID: dm: fix signature verification flag
  ANDROID: dm: use name_to_dev_t
  ANDROID: dm: rename dm-linear methods for dm-android-verity
  ANDROID: dm: Minor cleanup
  ANDROID: dm: Mounting root as linear device when verity disabled
  ANDROID: dm-android-verity: Rebase on top of 4.1
  ANDROID: dm: Add android verity target
  ANDROID: dm: fix dm_substitute_devices()
  ANDROID: dm: Rebase on top of 4.1
  CHROMIUM: dm: boot time specification of dm=
  Implement memory_state_time, used by qcom,cpubw
  Revert "panic: Add board ID to panic output"
  usb: gadget: f_accessory: remove duplicate endpoint alloc
  BACKPORT: brcmfmac: defer DPC processing during probe
  FROMLIST: proc: Add LSM hook checks to /proc/<tid>/timerslack_ns
  FROMLIST: proc: Relax /proc/<tid>/timerslack_ns capability requirements
  UPSTREAM: ppp: defer netns reference release for ppp channel
  cpuset: Add allow_attach hook for cpusets on android.
  UPSTREAM: KEYS: Fix ASN.1 indefinite length object parsing
  ANDROID: sdcardfs: fix itnull.cocci warnings
  android-recommended.cfg: enable fstack-protector-strong
  Linux 4.4.18
  mm: memcontrol: fix memcg id ref counter on swap charge move
  mm: memcontrol: fix swap counter leak on swapout from offline cgroup
  mm: memcontrol: fix cgroup creation failure after many small jobs
  ext4: fix reference counting bug on block allocation error
  ext4: short-cut orphan cleanup on error
  ext4: validate s_reserved_gdt_blocks on mount
  ext4: don't call ext4_should_journal_data() on the journal inode
  ext4: fix deadlock during page writeback
  ext4: check for extents that wrap around
  crypto: scatterwalk - Fix test in scatterwalk_done
  crypto: gcm - Filter out async ghash if necessary
  fs/dcache.c: avoid soft-lockup in dput()
  fuse: fix wrong assignment of ->flags in fuse_send_init()
  fuse: fuse_flush must check mapping->flags for errors
  fuse: fsync() did not return IO errors
  sysv, ipc: fix security-layer leaking
  block: fix use-after-free in seq file
  x86/syscalls/64: Add compat_sys_keyctl for 32-bit userspace
  drm/i915: Pretend cursor is always on for ILK-style WM calculations (v2)
  x86/mm/pat: Fix BUG_ON() in mmap_mem() on QEMU/i386
  x86/pat: Document the PAT initialization sequence
  x86/xen, pat: Remove PAT table init code from Xen
  x86/mtrr: Fix PAT init handling when MTRR is disabled
  x86/mtrr: Fix Xorg crashes in Qemu sessions
  x86/mm/pat: Replace cpu_has_pat with boot_cpu_has()
  x86/mm/pat: Add pat_disable() interface
  x86/mm/pat: Add support of non-default PAT MSR setting
  devpts: clean up interface to pty drivers
  random: strengthen input validation for RNDADDTOENTCNT
  apparmor: fix ref count leak when profile sha1 hash is read
  Revert "s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL"
  KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace
  arm: oabi compat: add missing access checks
  cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
  i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR
  x86/mm/32: Enable full randomization on i386 and X86_32
  HID: sony: do not bail out when the sixaxis refuses the output report
  PNP: Add Broadwell to Intel MCH size workaround
  PNP: Add Haswell-ULT to Intel MCH size workaround
  scsi: ignore errors from scsi_dh_add_device()
  ipath: Restrict use of the write() interface
  tcp: consider recv buf for the initial window scale
  qed: Fix setting/clearing bit in completion bitmap
  net/irda: fix NULL pointer dereference on memory allocation failure
  net: bgmac: Fix infinite loop in bgmac_dma_tx_add()
  bonding: set carrier off for devices created through netlink
  ipv4: reject RTNH_F_DEAD and RTNH_F_LINKDOWN from user space
  tcp: enable per-socket rate limiting of all 'challenge acks'
  tcp: make challenge acks less predictable
  arm64: relocatable: suppress R_AARCH64_ABS64 relocations in vmlinux
  arm64: vmlinux.lds: make __rela_offset and __dynsym_offset ABSOLUTE
  Linux 4.4.17
  vfs: fix deadlock in file_remove_privs() on overlayfs
  intel_th: Fix a deadlock in modprobing
  intel_th: pci: Add Kaby Lake PCH-H support
  net: mvneta: set real interrupt per packet for tx_done
  libceph: apply new_state before new_up_client on incrementals
  libata: LITE-ON CX1-JB256-HP needs lower max_sectors
  i2c: mux: reg: wrong condition checked for of_address_to_resource return value
  posix_cpu_timer: Exit early when process has been reaped
  media: fix airspy usb probe error path
  ipr: Clear interrupt on croc/crocodile when running with LSI
  SCSI: fix new bug in scsi_dev_info_list string matching
  RDS: fix rds_tcp_init() error path
  can: fix oops caused by wrong rtnl dellink usage
  can: fix handling of unmodifiable configuration options fix
  can: c_can: Update D_CAN TX and RX functions to 32 bit - fix Altera Cyclone access
  can: at91_can: RX queue could get stuck at high bus load
  perf/x86: fix PEBS issues on Intel Atom/Core2
  ovl: handle ATTR_KILL*
  sched/fair: Fix effective_load() to consistently use smoothed load
  mmc: block: fix packed command header endianness
  block: fix use-after-free in sys_ioprio_get()
  qeth: delete napi struct when removing a qeth device
  platform/chrome: cros_ec_dev - double fetch bug in ioctl
  clk: rockchip: initialize flags of clk_init_data in mmc-phase clock
  spi: sun4i: fix FIFO limit
  spi: sunxi: fix transfer timeout
  namespace: update event counter when umounting a deleted dentry
  9p: use file_dentry()
  ext4: verify extent header depth
  ecryptfs: don't allow mmap when the lower fs doesn't support it
  Revert "ecryptfs: forbid opening files without mmap handler"
  locks: use file_inode()
  power_supply: power_supply_read_temp only if use_cnt > 0
  cgroup: set css->id to -1 during init
  pinctrl: imx: Do not treat a PIN without MUX register as an error
  pinctrl: single: Fix missing flush of posted write for a wakeirq
  pvclock: Add CPU barriers to get correct version value
  Input: tsc200x - report proper input_dev name
  Input: xpad - validate USB endpoint count during probe
  Input: wacom_w8001 - w8001_MAX_LENGTH should be 13
  Input: xpad - fix oops when attaching an unknown Xbox One gamepad
  Input: elantech - add more IC body types to the list
  Input: vmmouse - remove port reservation
  ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt
  ALSA: timer: Fix leak in events via snd_timer_user_ccallback
  ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS
  xenbus: don't bail early from xenbus_dev_request_and_reply()
  xenbus: don't BUG() on user mode induced condition
  xen/pciback: Fix conf_space read/write overlap check.
  ARC: unwind: ensure that .debug_frame is generated (vs. .eh_frame)
  arc: unwind: warn only once if DW2_UNWIND is disabled
  kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w
  pps: do not crash when failed to register
  vmlinux.lds: account for destructor sections
  mm, meminit: ensure node is online before checking whether pages are uninitialised
  mm, meminit: always return a valid node from early_pfn_to_nid
  mm, compaction: prevent VM_BUG_ON when terminating freeing scanner
  fs/nilfs2: fix potential underflow in call to crc32_le
  mm, compaction: abort free scanner if split fails
  mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask
  dmaengine: at_xdmac: double FIFO flush needed to compute residue
  dmaengine: at_xdmac: fix residue corruption
  dmaengine: at_xdmac: align descriptors on 64 bits
  x86/quirks: Add early quirk to reset Apple AirPort card
  x86/quirks: Reintroduce scanning of secondary buses
  x86/quirks: Apply nvidia_bugs quirk only on root bus
  USB: OHCI: Don't mark EDs as ED_OPER if scheduling fails

Conflicts:
	arch/arm/kernel/topology.c
	arch/arm64/include/asm/arch_gicv3.h
	arch/arm64/kernel/topology.c
	block/bio.c
	drivers/cpufreq/Kconfig
	drivers/md/Makefile
	drivers/media/dvb-core/dvb_ringbuffer.c
	drivers/media/tuners/tuner-xc2028.c
	drivers/misc/Kconfig
	drivers/misc/Makefile
	drivers/mmc/core/host.c
	drivers/scsi/ufs/ufshcd.c
	drivers/scsi/ufs/ufshcd.h
	drivers/usb/dwc3/gadget.c
	drivers/usb/gadget/configfs.c
	fs/ecryptfs/file.c
	include/linux/mmc/core.h
	include/linux/mmc/host.h
	include/linux/mmzone.h
	include/linux/sched.h
	include/linux/sched/sysctl.h
	include/trace/events/power.h
	include/trace/events/sched.h
	init/Kconfig
	kernel/cpuset.c
	kernel/exit.c
	kernel/sched/Makefile
	kernel/sched/core.c
	kernel/sched/cputime.c
	kernel/sched/fair.c
	kernel/sched/features.h
	kernel/sched/rt.c
	kernel/sched/sched.h
	kernel/sched/stop_task.c
	kernel/sched/tune.c
	lib/Kconfig.debug
	mm/Makefile
	mm/vmstat.c

Change-Id: I243a43231ca56a6362076fa6301827e1b0493be5
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
2016-12-16 13:52:17 -08:00
Alex Shi
072c0f9ee4 Merge remote-tracking branch 'origin/v4.4/topic/wb-cg2' into linux-linaro-lsk-v4.4 2016-11-29 15:58:42 +08:00
Johannes Weiner
aaaf9e59c0 mm: page_alloc: generalize the dirty balance reserve
The dirty balance reserve that dirty throttling has to consider is
merely memory not available to userspace allocations.  There is nothing
writeback-specific about it.  Generalize the name so that it's reusable
outside of that context.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit a8d0143730d7b42c9fe6d1435d92ecce6863a62a)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2016-11-29 15:25:03 +08:00
Laura Abbott
472dd6904d mm: Add is_migrate_cma_page
Code such as hardened user copy[1] needs a way to tell if a
page is CMA or not. Add is_migrate_cma_page in a similar way
to is_migrate_isolate_page.

[1]http://article.gmane.org/gmane.linux.kernel.mm/155238

Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
(cherry picked from commit 7c15d9bb8231f998ae7dc0b72415f5215459f7fb)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2016-08-27 11:23:38 +08:00
Liam Mark
50050f2a1d mm: add cma pcp list
Add a cma pcp list in order to increase cma memory utilization.

Increased cma memory utilization will improve overall memory
utilization because free cma pages are ignored when memory reclaim
is done with gfp mask GFP_KERNEL.

Since most memory reclaim is done by kswapd, which uses a gfp mask
of GFP_KERNEL, by increasing cma memory utilization we are therefore
ensuring that less aggressive memory reclaim takes place.

Increased cma memory utilization will improve performance,
for example it will increase app concurrency.

Change-Id: I809589a25c6abca51f1c963f118adfc78e955cf9
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-04-13 11:11:40 -07:00
Heesub Shin
d491cf59f0 cma: redirect page allocation to CMA
CMA pages are designed to be used as fallback for movable allocations
and cannot be used for non-movable allocations. If CMA pages are
utilized poorly, non-movable allocations may end up getting starved if
all regular movable pages are allocated and the only pages left are
CMA. Always using CMA pages first creates unacceptable performance
problems. As a midway alternative, use CMA pages for certain
userspace allocations. The userspace pages can be migrated or dropped
quickly which giving decent utilization.

Change-Id: I6165dda01b705309eebabc6dfa67146b7a95c174
CRs-Fixed: 452508
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Heesub Shin <heesub.shin@samsung.com
[lauraa@codeaurora.org: Missing CONFIG_CMA guards, add commit text]
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
[lmark@codeaurora.org: resolve conflicts relating to
 MIGRATE_HIGHATOMIC and some other trivial merge conflicts]
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-04-13 11:11:30 -07:00
Liam Mark
1426d1f8d9 lowmemorykiller: Don't count swap cache pages twice
The lowmem_shrink function discounts all the swap cache pages from
the file cache count. The zone aware code also discounts all file
cache pages from a certain zone.  This results in some swap cache
pages being discounted twice, which can result in the low memory
killer being unnecessarily aggressive.

Fix the low memory killer to only discount the swap cache pages
once.

Change-Id: I650bbfbf0fbbabd01d82bdb3502b57ff59c3e14f
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2016-04-13 11:11:01 -07:00
Liam Mark
92c1fefed5 android/lowmemorykiller: Selectively count free CMA pages
In certain memory configurations there can be a large number of
CMA pages which are not suitable to satisfy certain memory
requests.

This large number of unsuitable pages can cause the
lowmemorykiller to not kill any tasks because the
lowmemorykiller counts all free pages.
In order to ensure the lowmemorykiller properly evaluates the
free memory only count the free CMA pages if they are suitable
for satisfying the memory request.

Change-Id: I7f06d53e2d8cfe7439e5561fe6e5209ce73b1c90
CRs-fixed: 437016
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-04-13 11:09:54 -07:00
Laura Abbott
e48a20a27c mm: Add is_cma_pageblock definition
Bring back the is_cma_pageblock definition for determining if a
page is CMA or not.

Change-Id: I39fd546e22e240b752244832c79514f109c8e84b
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
2016-03-22 11:03:37 -07:00
Andrew Morton
8990332760 include/linux/mmzone.h: reflow comment
Someone has an 86 column display.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Mel Gorman
0aaa29a56e mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand
High-order watermark checking exists for two reasons -- kswapd high-order
awareness and protection for high-order atomic requests.  Historically the
kernel depended on MIGRATE_RESERVE to preserve min_free_kbytes as
high-order free pages for as long as possible.  This patch introduces
MIGRATE_HIGHATOMIC that reserves pageblocks for high-order atomic
allocations on demand and avoids using those blocks for order-0
allocations.  This is more flexible and reliable than MIGRATE_RESERVE was.

A MIGRATE_HIGHORDER pageblock is created when an atomic high-order
allocation request steals a pageblock but limits the total number to 1% of
the zone.  Callers that speculatively abuse atomic allocations for
long-lived high-order allocations to access the reserve will quickly fail.
 Note that SLUB is currently not such an abuser as it reclaims at least
once.  It is possible that the pageblock stolen has few suitable
high-order pages and will need to steal again in the near future but there
would need to be strong justification to search all pageblocks for an
ideal candidate.

The pageblocks are unreserved if an allocation fails after a direct
reclaim attempt.

The watermark checks account for the reserved pageblocks when the
allocation request is not a high-order atomic allocation.

The reserved pageblocks can not be used for order-0 allocations.  This may
allow temporary wastage until a failed reclaim reassigns the pageblock.
This is deliberate as the intent of the reservation is to satisfy a
limited number of atomic high-order short-lived requests if the system
requires them.

The stutter benchmark was used to evaluate this but while it was running
there was a systemtap script that randomly allocated between 1 high-order
page and 12.5% of memory's worth of order-3 pages using GFP_ATOMIC.  This
is much larger than the potential reserve and it does not attempt to be
realistic.  It is intended to stress random high-order allocations from an
unknown source, show that there is a reduction in failures without
introducing an anomaly where atomic allocations are more reliable than
regular allocations.  The amount of memory reserved varied throughout the
workload as reserves were created and reclaimed under memory pressure.
The allocation failures once the workload warmed up were as follows;

4.2-rc5-vanilla		70%
4.2-rc5-atomic-reserve	56%

The failure rate was also measured while building multiple kernels.  The
failure rate was 14% but is 6% with this patch applied.

Overall, this is a small reduction but the reserves are small relative to
the number of allocation requests.  In early versions of the patch, the
failure rate reduced by a much larger amount but that required much larger
reserves and perversely made atomic allocations seem more reliable than
regular allocations.

[yalin.wang2010@gmail.com: fix redundant check and a memory leak]
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Mel Gorman
974a786e63 mm, page_alloc: remove MIGRATE_RESERVE
MIGRATE_RESERVE preserves an old property of the buddy allocator that
existed prior to fragmentation avoidance -- min_free_kbytes worth of pages
tended to remain contiguous until the only alternative was to fail the
allocation.  At the time it was discovered that high-order atomic
allocations relied on this property so MIGRATE_RESERVE was introduced.  A
later patch will introduce an alternative MIGRATE_HIGHATOMIC so this patch
deletes MIGRATE_RESERVE and supporting code so it'll be easier to review.
Note that this patch in isolation may look like a false regression if
someone was bisecting high-order atomic allocation failures.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Mel Gorman
f77cf4e4cc mm, page_alloc: delete the zonelist_cache
The zonelist cache (zlc) was introduced to skip over zones that were
recently known to be full.  This avoided expensive operations such as the
cpuset checks, watermark calculations and zone_reclaim.  The situation
today is different and the complexity of zlc is harder to justify.

1) The cpuset checks are no-ops unless a cpuset is active and in general
   are a lot cheaper.

2) zone_reclaim is now disabled by default and I suspect that was a large
   source of the cost that zlc wanted to avoid. When it is enabled, it's
   known to be a major source of stalling when nodes fill up and it's
   unwise to hit every other user with the overhead.

3) Watermark checks are expensive to calculate for high-order
   allocation requests. Later patches in this series will reduce the cost
   of the watermark checking.

4) The most important issue is that in the current implementation it
   is possible for a failed THP allocation to mark a zone full for order-0
   allocations and cause a fallback to remote nodes.

The last issue could be addressed with additional complexity but as the
benefit of zlc is questionable, it is better to remove it.  If stalls due
to zone_reclaim are ever reported then an alternative would be to
introduce deferring logic based on a timeout inside zone_reclaim itself
and leave the page allocator fast paths alone.

The impact on page-allocator microbenchmarks is negligible as they don't
hit the paths where the zlc comes into play.  Most page-reclaim related
workloads showed no noticeable difference as a result of the removal.

The impact was noticeable in a workload called "stutter".  One part uses a
lot of anonymous memory, a second measures mmap latency and a third copies
a large file.  In an ideal world the latency application would not notice
the mmap latency.  On a 2-node machine the results of this patch are

stutter
                             4.3.0-rc1             4.3.0-rc1
                              baseline              nozlc-v4
Min         mmap     20.9243 (  0.00%)     20.7716 (  0.73%)
1st-qrtle   mmap     22.0612 (  0.00%)     22.0680 ( -0.03%)
2nd-qrtle   mmap     22.3291 (  0.00%)     22.3809 ( -0.23%)
3rd-qrtle   mmap     25.2244 (  0.00%)     25.2396 ( -0.06%)
Max-90%     mmap     48.0995 (  0.00%)     28.3713 ( 41.02%)
Max-93%     mmap     52.5557 (  0.00%)     36.0170 ( 31.47%)
Max-95%     mmap     55.8173 (  0.00%)     47.3163 ( 15.23%)
Max-99%     mmap     67.3781 (  0.00%)     70.1140 ( -4.06%)
Max         mmap  24447.6375 (  0.00%)  12915.1356 ( 47.17%)
Mean        mmap     33.7883 (  0.00%)     27.7944 ( 17.74%)
Best99%Mean mmap     27.7825 (  0.00%)     25.2767 (  9.02%)
Best95%Mean mmap     26.3912 (  0.00%)     23.7994 (  9.82%)
Best90%Mean mmap     24.9886 (  0.00%)     23.2251 (  7.06%)
Best50%Mean mmap     22.0157 (  0.00%)     22.0261 ( -0.05%)
Best10%Mean mmap     21.6705 (  0.00%)     21.6083 (  0.29%)
Best5%Mean  mmap     21.5581 (  0.00%)     21.4611 (  0.45%)
Best1%Mean  mmap     21.3079 (  0.00%)     21.1631 (  0.68%)

Note that the maximum stall latency went from 24 seconds to 12 which is
still bad but an improvement.  The milage varies considerably 2-node
machine on an earlier test went from 494 seconds to 47 seconds and a
4-node machine that tested an earlier version of this patch went from a
worst case stall time of 6 seconds to 67ms.  The nature of the benchmark
is inherently unpredictable as it is hammering the system and the milage
will vary between machines.

There is a secondary impact with potentially more direct reclaim because
zones are now being considered instead of being skipped by zlc.  In this
particular test run it did not occur so will not be described.  However,
in at least one test the following was observed

1. Direct reclaim rates were higher. This was likely due to direct reclaim
  being entered instead of the zlc disabling a zone and busy looping.
  Busy looping may have the effect of allowing kswapd to make more
  progress and in some cases may be better overall. If this is found then
  the correct action is to put direct reclaimers to sleep on a waitqueue
  and allow kswapd make forward progress. Busy looping on the zlc is even
  worse than when the allocator used to blindly call congestion_wait().

2. There was higher swap activity as direct reclaim was active.

3. Direct reclaim efficiency was lower. This is related to 1 as more
  scanning activity also encountered more pages that could not be
  immediately reclaimed

In that case, the direct page scan and reclaim rates are noticeable but
it is not considered a problem for a few reasons

1. The test is primarily concerned with latency. The mmap attempts are also
   faulted which means there are THP allocation requests. The ZLC could
   cause zones to be disabled causing the process to busy loop instead
   of reclaiming.  This looks like elevated direct reclaim activity but
   it's the correct action to take based on what processes requested.

2. The test hammers reclaim and compaction heavily. The number of successful
   THP faults is highly variable but affects the reclaim stats. It's not a
   realistic or reasonable measure of page reclaim activity.

3. No other page-reclaim intensive workload that was tested showed a problem.

4. If a workload is identified that benefitted from the busy looping then it
   should be fixed by having direct reclaimers sleep on a wait queue until
   woken by kswapd instead of busy looping. We had this class of problem before
   when congestion_waits() with a fixed timeout was a brain damaged decision
   but happened to benefit some workloads.

If a workload is identified that relied on the zlc to busy loop then it
should be fixed correctly and have a direct reclaimer sleep on a waitqueue
until woken by kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Mel Gorman
016c13daa5 mm, page_alloc: use masks and shifts when converting GFP flags to migrate types
This patch redefines which GFP bits are used for specifying mobility and
the order of the migrate types.  Once redefined it's possible to convert
GFP flags to a migrate type with a simple mask and shift.  The only
downside is that readers of OOM kill messages and allocation failures may
have been used to the existing values but scripts/gfp-translate will help.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Mel Gorman
e2b19197ff mm, page_alloc: remove unnecessary parameter from zone_watermark_ok_safe
Overall, the intent of this series is to remove the zonelist cache which
was introduced to avoid high overhead in the page allocator.  Once this is
done, it is necessary to reduce the cost of watermark checks.

The series starts with minor micro-optimisations.

Next it notes that GFP flags that affect watermark checks are abused.
__GFP_WAIT historically identified callers that could not sleep and could
access reserves.  This was later abused to identify callers that simply
prefer to avoid sleeping and have other options.  A patch distinguishes
between atomic callers, high-priority callers and those that simply wish
to avoid sleep.

The zonelist cache has been around for a long time but it is of dubious
merit with a lot of complexity and some issues that are explained.  The
most important issue is that a failed THP allocation can cause a zone to
be treated as "full".  This potentially causes unnecessary stalls, reclaim
activity or remote fallbacks.  The issues could be fixed but it's not
worth it.  The series places a small number of other micro-optimisations
on top before examining GFP flags watermarks.

High-order watermarks enforcement can cause high-order allocations to fail
even though pages are free.  The watermark checks both protect high-order
atomic allocations and make kswapd aware of high-order pages but there is
a much better way that can be handled using migrate types.  This series
uses page grouping by mobility to reserve pageblocks for high-order
allocations with the size of the reservation depending on demand.  kswapd
awareness is maintained by examining the free lists.  By patch 12 in this
series, there are no high-order watermark checks while preserving the
properties that motivated the introduction of the watermark checks.

This patch (of 10):

No user of zone_watermark_ok_safe() specifies alloc_flags.  This patch
removes the unnecessary parameter.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Yaowei Bai
b171e40930 mm/page_alloc: remove unused parameter in init_currently_empty_zone()
Commit a2f3aa0257 ("[PATCH] Fix sparsemem on Cell") fixed an oops
experienced on the Cell architecture when init-time functions,
early_*(), are called at runtime by introducing an 'enum memmap_context'
parameter to memmap_init_zone() and init_currently_empty_zone().  This
parameter is intended to be used to tell whether the call of these two
functions is being made on behalf of a hotplug event, or happening at
boot-time.  However, init_currently_empty_zone() does not use this
parameter at all, so remove it.

Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Linus Torvalds
12f03ee606 libnvdimm for 4.3:
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
    mechanism for adding device-driver-discovered memory regions to the
    kernel's direct map.  This facility is used by the pmem driver to
    enable pfn_to_page() operations on the page frames returned by DAX
    ('direct_access' in 'struct block_device_operations'). For now, the
    'memmap' allocation for these "device" pages comes from "System
    RAM".  Support for allocating the memmap from device memory will
    arrive in a later kernel.
 
 2/ Introduce memremap() to replace usages of ioremap_cache() and
    ioremap_wt().  memremap() drops the __iomem annotation for these
    mappings to memory that do not have i/o side effects.  The
    replacement of ioremap_cache() with memremap() is limited to the
    pmem driver to ease merging the api change in v4.3.  Completion of
    the conversion is targeted for v4.4.
 
 3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
    driver, update the VFS DAX implementation and PMEM api to provide
    persistence guarantees for kernel operations on a DAX mapping.
 
 4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
    cacheable to improve performance.
 
 5/ Miscellaneous updates and fixes to libnvdimm including support
    for issuing "address range scrub" commands, clarifying the optimal
    'sector size' of pmem devices, a clarification of the usage of the
    ACPI '_STA' (status) property for DIMM devices, and other minor
    fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
 JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
 OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
 nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
 NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
 zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
 1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
 sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
 bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
 o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
 dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
 slsw6DkrWT60CRE42nbK
 =o57/
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "This update has successfully completed a 0day-kbuild run and has
  appeared in a linux-next release.  The changes outside of the typical
  drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
  removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
  the introduction of ZONE_DEVICE + devm_memremap_pages().

  Summary:

   - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
     mechanism for adding device-driver-discovered memory regions to the
     kernel's direct map.

     This facility is used by the pmem driver to enable pfn_to_page()
     operations on the page frames returned by DAX ('direct_access' in
     'struct block_device_operations').

     For now, the 'memmap' allocation for these "device" pages comes
     from "System RAM".  Support for allocating the memmap from device
     memory will arrive in a later kernel.

   - Introduce memremap() to replace usages of ioremap_cache() and
     ioremap_wt().  memremap() drops the __iomem annotation for these
     mappings to memory that do not have i/o side effects.  The
     replacement of ioremap_cache() with memremap() is limited to the
     pmem driver to ease merging the api change in v4.3.

     Completion of the conversion is targeted for v4.4.

   - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
     driver, update the VFS DAX implementation and PMEM api to provide
     persistence guarantees for kernel operations on a DAX mapping.

   - Convert the ACPI NFIT 'BLK' driver to map the block apertures as
     cacheable to improve performance.

   - Miscellaneous updates and fixes to libnvdimm including support for
     issuing "address range scrub" commands, clarifying the optimal
     'sector size' of pmem devices, a clarification of the usage of the
     ACPI '_STA' (status) property for DIMM devices, and other minor
     fixes"

* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
  libnvdimm, pmem: direct map legacy pmem by default
  libnvdimm, pmem: 'struct page' for pmem
  libnvdimm, pfn: 'struct page' provider infrastructure
  x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
  add devm_memremap_pages
  mm: ZONE_DEVICE for "device memory"
  mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
  dax: drop size parameter to ->direct_access()
  nd_blk: change aperture mapping from WC to WB
  nvdimm: change to use generic kvfree()
  pmem, dax: have direct_access use __pmem annotation
  dax: update I/O path to do proper PMEM flushing
  pmem: add copy_from_iter_pmem() and clear_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: move x86 PMEM API to new pmem.h header
  libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
  pmem: switch to devm_ allocations
  devres: add devm_memremap
  libnvdimm, btt: write and validate parent_uuid
  ...
2015-09-08 14:35:59 -07:00
minkyung88.kim
4e6dab4233 mm: remove struct node_active_region
struct node_active_region is not used anymore.  Remove it.

Signed-off-by: minkyung88.kim <minkyung88.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Dan Williams
033fbae988 mm: ZONE_DEVICE for "device memory"
While pmem is usable as a block device or via DAX mappings to userspace
there are several usage scenarios that can not target pmem due to its
lack of struct page coverage. In preparation for "hot plugging" pmem
into the vmemmap add ZONE_DEVICE as a new zone to tag these pages
separately from the ones that are subject to standard page allocations.
Importantly "device memory" can be removed at will by userspace
unbinding the driver of the device.

Having a separate zone prevents allocation and otherwise marks these
pages that are distinct from typical uniform memory.  Device memory has
different lifetime and performance characteristics than RAM.  However,
since we have run out of ZONES_SHIFT bits this functionality currently
depends on sacrificing ZONE_DMA.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Jerome Glisse <j.glisse@gmail.com>
[hch: various simplifications in the arch interface]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27 19:40:58 -04:00
Mel Gorman
3a80a7fa79 mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set
This patch initalises all low memory struct pages and 2G of the highest
zone on each node during memory initialisation if
CONFIG_DEFERRED_STRUCT_PAGE_INIT is set.  That config option cannot be set
but will be available in a later patch.  Parallel initialisation of struct
page depends on some features from memory hotplug and it is necessary to
alter alter section annotations.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:56 -07:00
Mel Gorman
75a592a471 mm: meminit: inline some helper functions
early_pfn_in_nid() and meminit_pfn_in_nid() are small functions that are
unnecessarily visible outside memory initialisation.  As well as
unnecessary visibility, it's unnecessary function call overhead when
initialising pages.  This patch moves the helpers inline.

[akpm@linux-foundation.org: fix build]
[mhocko@suse.cz: fix build]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:56 -07:00
Mel Gorman
8a942fdea5 mm: meminit: make __early_pfn_to_nid SMP-safe and introduce meminit_pfn_in_nid
__early_pfn_to_nid() use static variables to cache recent lookups as
memblock lookups are very expensive but it assumes that memory
initialisation is single-threaded.  Parallel initialisation of struct
pages will break that assumption so this patch makes __early_pfn_to_nid()
SMP-safe by requiring the caller to cache recent search information.
early_pfn_to_nid() keeps the same interface but is only safe to use early
in boot due to the use of a global static variable.  meminit_pfn_in_nid()
is an SMP-safe version that callers must maintain their own state for.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Nate Zimmer <nzimmer@sgi.com>
Tested-by: Waiman Long <waiman.long@hp.com>
Tested-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Nate Zimmer <nzimmer@sgi.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:56 -07:00
Zhang Zhen
d7e4a2ea51 mm: refactor zone_movable_is_highmem()
All callers of zone_movable_is_highmem are under #ifdef CONFIG_HIGHMEM,
so the else branch return 0 is not needed.

Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:16 -07:00
Mel Gorman
a368ab67aa mm: move zone lock to a different cache line than order-0 free page lists
Huang Ying reported the following problem due to commit 3484b2de94 ("mm:
rearrange zone fields into read-only, page alloc, statistics and page
reclaim lines") from the Intel performance tests

    24b7e5819a  3484b2de94
    ----------------  --------------------------
             %stddev     %change         %stddev
                 \          |                \
        152288 \261  0%     -46.2%      81911 \261  0%  aim7.jobs-per-min
           237 \261  0%     +85.6%        440 \261  0%  aim7.time.elapsed_time
           237 \261  0%     +85.6%        440 \261  0%  aim7.time.elapsed_time.max
         25026 \261  0%     +70.7%      42712 \261  0%  aim7.time.system_time
       2186645 \261  5%     +32.0%    2885949 \261  4%  aim7.time.voluntary_context_switches
       4576561 \261  1%     +24.9%    5715773 \261  0%  aim7.time.involuntary_context_switches

The problem is specific to very large machines under stress.  It was not
reproducible with the machines I had used to justify the original patch
because large numbers of CPUs are required.  When pressure is high enough,
the cache line is bouncing between CPUs trying to acquire the lock and the
holder of the lock adjusting free lists.  The intention was that the
acquirer of the lock would automatically have the cache line holding the
free lists but according to Huang, this is not a universal win.

One possibility is to move the zone lock to its own cache line but it
increases the size of the zone.  This patch moves the lock to the other
end of the free lists where they do not contend under high pressure.  It
does mean the page allocator paths now require more cache lines but Huang
reports that it restores performance to previous levels on large machines

             %stddev     %change         %stddev
                 \          |                \
         84568 \261  1%     +94.3%     164280 \261  1%  aim7.jobs-per-min
       2881944 \261  2%     -35.1%    1870386 \261  8%  aim7.time.voluntary_context_switches
           681 \261  1%      -3.4%        658 \261  0%  aim7.time.user_time
       5538139 \261  0%     -12.1%    4867884 \261  0%  aim7.time.involuntary_context_switches
         44174 \261  1%     -46.0%      23848 \261  1%  aim7.time.system_time
           426 \261  1%     -48.4%        219 \261  1%  aim7.time.elapsed_time
           426 \261  1%     -48.4%        219 \261  1%  aim7.time.elapsed_time.max
           468 \261  1%     -43.1%        266 \261  2%  uptime.boot

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Huang Ying <ying.huang@intel.com>
Tested-by: Huang Ying <ying.huang@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-07 16:45:33 -07:00
Vlastimil Babka
05891fb065 mm: microoptimize zonelist operations
next_zones_zonelist() returns a zoneref pointer, as well as a zone pointer
via extra parameter.  Since the latter can be trivially obtained by
dereferencing the former, the overhead of the extra parameter is
unjustified.

This patch thus removes the zone parameter from next_zones_zonelist().
Both callers happen to be in the same header file, so it's simple to add
the zoneref dereference inline.  We save some bytes of code size.

add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-105 (-105)
function                                     old     new   delta
nr_free_zone_pages                           129     115     -14
__alloc_pages_nodemask                      2300    2285     -15
get_page_from_freelist                      2652    2576     -76

add/remove: 0/0 grow/shrink: 1/0 up/down: 10/0 (10)
function                                     old     new   delta
try_to_compact_pages                         569     579     +10

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:02 -08:00
Baoquan He
44628d9755 mm: fix typo of MIGRATE_RESERVE in comment
Found it when I want to jump to the definition of MIGRATE_RESERVE ctags.

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:01 -08:00
Joonsoo Kim
eefa864b70 mm/page_ext: resurrect struct page extending code for debugging
When we debug something, we'd like to insert some information to every
page.  For this purpose, we sometimes modify struct page itself.  But,
this has drawbacks.  First, it requires re-compile.  This makes us
hesitate to use the powerful debug feature so development process is
slowed down.  And, second, sometimes it is impossible to rebuild the
kernel due to third party module dependency.  At third, system behaviour
would be largely different after re-compile, because it changes size of
struct page greatly and this structure is accessed by every part of
kernel.  Keeping this as it is would be better to reproduce errornous
situation.

This feature is intended to overcome above mentioned problems.  This
feature allocates memory for extended data per page in certain place
rather than the struct page itself.  This memory can be accessed by the
accessor functions provided by this code.  During the boot process, it
checks whether allocation of huge chunk of memory is needed or not.  If
not, it avoids allocating memory at all.  With this advantage, we can
include this feature into the kernel in default and can avoid rebuild and
solve related problems.

Until now, memcg uses this technique.  But, now, memcg decides to embed
their variable to struct page itself and it's code to extend struct page
has been removed.  I'd like to use this code to develop debug feature, so
this patch resurrect it.

To help these things to work well, this patch introduces two callbacks for
clients.  One is the need callback which is mandatory if user wants to
avoid useless memory allocation at boot-time.  The other is optional, init
callback, which is used to do proper initialization after memory is
allocated.  Detailed explanation about purpose of these functions is in
code comment.  Please refer it.

Others are completely same with previous extension code in memcg.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13 12:42:48 -08:00
Johannes Weiner
1306a85aed mm: embed the memcg pointer directly into struct page
Memory cgroups used to have 5 per-page pointers.  To allow users to
disable that amount of overhead during runtime, those pointers were
allocated in a separate array, with a translation layer between them and
struct page.

There is now only one page pointer remaining: the memcg pointer, that
indicates which cgroup the page is associated with when charged.  The
complexity of runtime allocation and the runtime translation overhead is
no longer justified to save that *potential* 0.19% of memory.  With
CONFIG_SLUB, page->mem_cgroup actually sits in the doubleword padding
after the page->private member and doesn't even increase struct page,
and then this patch actually saves space.  Remaining users that care can
still compile their kernels without CONFIG_MEMCG.

     text    data     bss     dec     hex     filename
  8828345 1725264  983040 11536649 b00909  vmlinux.old
  8827425 1725264  966656 11519345 afc571  vmlinux.new

[mhocko@suse.cz: update Documentation/cgroups/memory.txt]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 17:41:09 -08:00
Joonsoo Kim
ad53f92eb4 mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
Before describing bugs itself, I first explain definition of freepage.

 1. pages on buddy list are counted as freepage.
 2. pages on isolate migratetype buddy list are *not* counted as freepage.
 3. pages on cma buddy list are counted as CMA freepage, too.

Now, I describe problems and related patch.

Patch 1: There is race conditions on getting pageblock migratetype that
it results in misplacement of freepages on buddy list, incorrect
freepage count and un-availability of freepage.

Patch 2: Freepages on pcp list could have stale cached information to
determine migratetype of buddy list to go.  This causes misplacement of
freepages on buddy list and incorrect freepage count.

Patch 4: Merging between freepages on different migratetype of
pageblocks will cause freepages accouting problem.  This patch fixes it.

Without patchset [3], above problem doesn't happens on my CMA allocation
test, because CMA reserved pages aren't used at all.  So there is no
chance for above race.

With patchset [3], I did simple CMA allocation test and get below
result:

 - Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
 - run kernel build (make -j16) on background
 - 30 times CMA allocation(8MB * 30 = 240MB) attempts in 5 sec interval
 - Result: more than 5000 freepage count are missed

With patchset [3] and this patchset, I found that no freepage count are
missed so that I conclude that problems are solved.

On my simple memory offlining test, these problems also occur on that
environment, too.

This patch (of 4):

There are two paths to reach core free function of buddy allocator,
__free_one_page(), one is free_one_page()->__free_one_page() and the
other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each paths has race condition causing serious problems.  At first, this
patch is focused on first type of freepath.  And then, following patch
will solve the problem in second type of freepath.

In the first type of freepath, we got migratetype of freeing page
without holding the zone lock, so it could be racy.  There are two cases
of this race.

 1. pages are added to isolate buddy list after restoring orignal
    migratetype

    CPU1                                   CPU2

    get migratetype => return MIGRATE_ISOLATE
    call free_one_page() with MIGRATE_ISOLATE

                                grab the zone lock
                                unisolate pageblock
                                release the zone lock

    grab the zone lock
    call __free_one_page() with MIGRATE_ISOLATE
    freepage go into isolate buddy list,
    although pageblock is already unisolated

This may cause two problems.  One is that we can't use this page anymore
until next isolation attempt of this pageblock, because freepage is on
isolate buddy list.  The other is that freepage accouting could be wrong
due to merging between different buddy list.  Freepages on isolate buddy
list aren't counted as freepage, but ones on normal buddy list are
counted as freepage.  If merge happens, buddy freepage on normal buddy
list is inevitably moved to isolate buddy list without any consideration
of freepage accouting so it could be incorrect.

 2. pages are added to normal buddy list while pageblock is isolated.
    It is similar with above case.

This also may cause two problems.  One is that we can't keep these
freepages from being allocated.  Although this pageblock is isolated,
freepage would be added to normal buddy list so that it could be
allocated without any restriction.  And the other problem is same as
case 1, that it, incorrect freepage accouting.

This race condition would be prevented by checking migratetype again
with holding the zone lock.  Because it is somewhat heavy operation and
it isn't needed in common case, we want to avoid rechecking as much as
possible.  So this patch introduce new variable, nr_isolate_pageblock in
struct zone to check if there is isolated pageblock.  With this, we can
avoid to re-check migratetype in common case and do it only if there is
isolated pageblock or migratetype is MIGRATE_ISOLATE.  This solve above
mentioned problems.

Changes from v3:
Add one more check in free_one_page() that checks whether migratetype is
MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Heesub Shin <heesub.shin@samsung.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Gioh Kim <gioh.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-13 16:17:05 -08:00
Johannes Weiner
5705465174 mm: clean up zone flags
Page reclaim tests zone_is_reclaim_dirty(), but the site that actually
sets this state does zone_set_flag(zone, ZONE_TAIL_LRU_DIRTY), sending the
reader through layers indirection just to track down a simple bit.

Remove all zone flag wrappers and just use bitops against zone->flags
directly.  It's just as readable and the lines are barely any longer.

Also rename ZONE_TAIL_LRU_DIRTY to ZONE_DIRTY to match ZONE_WRITEBACK, and
remove the zone_flags_t typedef.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:57 -04:00
Mel Gorman
4ffeaf3560 mm: page_alloc: reduce cost of the fair zone allocation policy
The fair zone allocation policy round-robins allocations between zones
within a node to avoid age inversion problems during reclaim.  If the
first allocation fails, the batch counts are reset and a second attempt
made before entering the slow path.

One assumption made with this scheme is that batches expire at roughly
the same time and the resets each time are justified.  This assumption
does not hold when zones reach their low watermark as the batches will
be consumed at uneven rates.  Allocation failure due to watermark
depletion result in additional zonelist scans for the reset and another
watermark check before hitting the slowpath.

On UMA, the benefit is negligible -- around 0.25%.  On 4-socket NUMA
machine it's variable due to the variability of measuring overhead with
the vmstat changes.  The system CPU overhead comparison looks like

          3.16.0-rc3  3.16.0-rc3  3.16.0-rc3
             vanilla   vmstat-v5 lowercost-v5
User          746.94      774.56      802.00
System      65336.22    32847.27    40852.33
Elapsed     27553.52    27415.04    27368.46

However it is worth noting that the overall benchmark still completed
faster and intuitively it makes sense to take as few passes as possible
through the zonelists.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:20 -07:00
Mel Gorman
0d5d823ab4 mm: move zone->pages_scanned into a vmstat counter
zone->pages_scanned is a write-intensive cache line during page reclaim
and it's also updated during page free.  Move the counter into vmstat to
take advantage of the per-cpu updates and do not update it in the free
paths unless necessary.

On a small UMA machine running tiobench the difference is marginal.  On
a 4-node machine the overhead is more noticable.  Note that automatic
NUMA balancing was disabled for this test as otherwise the system CPU
overhead is unpredictable.

          3.16.0-rc3  3.16.0-rc3  3.16.0-rc3
             vanillarearrange-v5   vmstat-v5
User          746.94      759.78      774.56
System      65336.22    58350.98    32847.27
Elapsed     27553.52    27282.02    27415.04

Note that the overhead reduction will vary depending on where exactly
pages are allocated and freed.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:20 -07:00
Mel Gorman
3484b2de94 mm: rearrange zone fields into read-only, page alloc, statistics and page reclaim lines
The arrangement of struct zone has changed over time and now it has
reached the point where there is some inappropriate sharing going on.
On x86-64 for example

o The zone->node field is shared with the zone lock and zone->node is
  accessed frequently from the page allocator due to the fair zone
  allocation policy.

o span_seqlock is almost never used by shares a line with free_area

o Some zone statistics share a cache line with the LRU lock so
  reclaim-intensive and allocator-intensive workloads can bounce the cache
  line on a stat update

This patch rearranges struct zone to put read-only and read-mostly
fields together and then splits the page allocator intensive fields, the
zone statistics and the page reclaim intensive fields into their own
cache lines.  Note that the type of lowmem_reserve changes due to the
watermark calculations being signed and avoiding a signed/unsigned
conversion there.

On the test configuration I used the overall size of struct zone shrunk
by one cache line.  On smaller machines, this is not likely to be
noticable.  However, on a 4-node NUMA machine running tiobench the
system CPU overhead is reduced by this patch.

          3.16.0-rc3  3.16.0-rc3
             vanillarearrange-v5r9
User          746.94      759.78
System      65336.22    58350.98
Elapsed     27553.52    27282.02

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:20 -07:00
Wang Nan
1a4dc5bc7c mem-hotplug: improve zone_movable_is_highmem logic
In original code, zone_movable_is_highmem() assumes ZONE_MOVABLE not
highmem if CONFIG_HAVE_MEMBLOCK_NODE_MAP is not set.  In online_pages,
it extracts pages from the previous zone before ZONE_MOVABLE.  Which is
logically inconsistent:

If HAVE_MEMBLOCK_NODE_MAP is turned off but HIGHMEM is on,
zone_movable_is_highmem() makes movable zone not highmem, but
online_pages() extracts pages from ZONE_HIGHMEM.

This inconsistency doesn't cause real problem currently, because all
architectures support online_pages also have HAVE_MEMBLOCK_NODE_MAP.
However, fixing it makes code clear, and also helps futher coding.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Zhang Zhen <zhangzhen@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:18 -07:00
Mel Gorman
7aeb09f910 mm: page_alloc: use unsigned int for order in more places
X86 prefers the use of unsigned types for iterators and there is a
tendency to mix whether a signed or unsigned type if used for page order.
This converts a number of sites in mm/page_alloc.c to use unsigned int for
order where possible.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:09 -07:00
Mel Gorman
dc4b0caff2 mm: page_alloc: reduce number of times page_to_pfn is called
In the free path we calculate page_to_pfn multiple times. Reduce that.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:09 -07:00
Mel Gorman
e58469bafd mm: page_alloc: use word-based accesses for get/set pageblock bitmaps
The test_bit operations in get/set pageblock flags are expensive.  This
patch reads the bitmap on a word basis and use shifts and masks to isolate
the bits of interest.  Similarly masks are used to set a local copy of the
bitmap and then use cmpxchg to update the bitmap if there have been no
other changes made in parallel.

In a test running dd onto tmpfs the overhead of the pageblock-related
functions went from 1.27% in profiles to 0.5%.

In addition to the performance benefits, this patch closes races that are
possible between:

a) get_ and set_pageblock_migratetype(), where get_pageblock_migratetype()
   reads part of the bits before and other part of the bits after
   set_pageblock_migratetype() has updated them.

b) set_pageblock_migratetype() and set_pageblock_skip(), where the non-atomic
   read-modify-update set bit operation in set_pageblock_skip() will cause
   lost updates to some bits changed in the set_pageblock_migratetype().

Joonsoo Kim first reported the case a) via code inspection.  Vlastimil
Babka's testing with a debug patch showed that either a) or b) occurs
roughly once per mmtests' stress-highalloc benchmark (although not
necessarily in the same pageblock).  Furthermore during development of
unrelated compaction patches, it was observed that frequent calls to
{start,undo}_isolate_page_range() the race occurs several thousands of
times and has resulted in NULL pointer dereferences in move_freepages()
and free_one_page() in places where free_list[migratetype] is
manipulated by e.g.  list_move().  Further debugging confirmed that
migratetype had invalid value of 6, causing out of bounds access to the
free_list array.

That confirmed that the race exist, although it may be extremely rare,
and currently only fatal where page isolation is performed due to
memory hot remove.  Races on pageblocks being updated by
set_pageblock_migratetype(), where both old and new migratetype are
lower MIGRATE_RESERVE, currently cannot result in an invalid value
being observed, although theoretically they may still lead to
unexpected creation or destruction of MIGRATE_RESERVE pageblocks.
Furthermore, things could get suddenly worse when memory isolation is
used more, or when new migratetypes are added.

After this patch, the race has no longer been observed in testing.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:09 -07:00
David Rientjes
35979ef339 mm, compaction: add per-zone migration pfn cache for async compaction
Each zone has a cached migration scanner pfn for memory compaction so that
subsequent calls to memory compaction can start where the previous call
left off.

Currently, the compaction migration scanner only updates the per-zone
cached pfn when pageblocks were not skipped for async compaction.  This
creates a dependency on calling sync compaction to avoid having subsequent
calls to async compaction from scanning an enormous amount of non-MOVABLE
pageblocks each time it is called.  On large machines, this could be
potentially very expensive.

This patch adds a per-zone cached migration scanner pfn only for async
compaction.  It is updated everytime a pageblock has been scanned in its
entirety and when no pages from it were successfully isolated.  The cached
migration scanner pfn for sync compaction is updated only when called for
sync compaction.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:06 -07:00
Vladimir Davydov
bfc8c90139 mem-hotplug: implement get/put_online_mems
kmem_cache_{create,destroy,shrink} need to get a stable value of
cpu/node online mask, because they init/destroy/access per-cpu/node
kmem_cache parts, which can be allocated or destroyed on cpu/mem
hotplug.  To protect against cpu hotplug, these functions use
{get,put}_online_cpus.  However, they do nothing to synchronize with
memory hotplug - taking the slab_mutex does not eliminate the
possibility of race as described in patch 2.

What we need there is something like get_online_cpus, but for memory.
We already have lock_memory_hotplug, which serves for the purpose, but
it's a bit of a hammer right now, because it's backed by a mutex.  As a
result, it imposes some limitations to locking order, which are not
desirable, and can't be used just like get_online_cpus.  That's why in
patch 1 I substitute it with get/put_online_mems, which work exactly
like get/put_online_cpus except they block not cpu, but memory hotplug.

[ v1 can be found at https://lkml.org/lkml/2014/4/6/68.  I NAK'ed it by
  myself, because it used an rw semaphore for get/put_online_mems,
  making them dead lock prune.  ]

This patch (of 2):

{un}lock_memory_hotplug, which is used to synchronize against memory
hotplug, is currently backed by a mutex, which makes it a bit of a
hammer - threads that only want to get a stable value of online nodes
mask won't be able to proceed concurrently.  Also, it imposes some
strong locking ordering rules on it, which narrows down the set of its
usage scenarios.

This patch introduces get/put_online_mems, which are the same as
get/put_online_cpus, but for memory hotplug, i.e.  executing a code
inside a get/put_online_mems section will guarantee a stable value of
online nodes, present pages, etc.

lock_memory_hotplug()/unlock_memory_hotplug() are removed altogether.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:59 -07:00
Mel Gorman
5f7a75acdb mm: page_alloc: do not cache reclaim distances
pgdat->reclaim_nodes tracks if a remote node is allowed to be reclaimed
by zone_reclaim due to its distance.  As it is expected that
zone_reclaim_mode will be rarely enabled it is unreasonable for all
machines to take a penalty.  Fortunately, the zone_reclaim_mode() path
is already slow and it is the path that takes the hit.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:53:59 -07:00
Johannes Weiner
449dd6984d mm: keep page cache radix tree nodes in check
Previously, page cache radix tree nodes were freed after reclaim emptied
out their page pointers.  But now reclaim stores shadow entries in their
place, which are only reclaimed when the inodes themselves are
reclaimed.  This is problematic for bigger files that are still in use
after they have a significant amount of their cache reclaimed, without
any of those pages actually refaulting.  The shadow entries will just
sit there and waste memory.  In the worst case, the shadow entries will
accumulate until the machine runs out of memory.

To get this under control, the VM will track radix tree nodes
exclusively containing shadow entries on a per-NUMA node list.  Per-NUMA
rather than global because we expect the radix tree nodes themselves to
be allocated node-locally and we want to reduce cross-node references of
otherwise independent cache workloads.  A simple shrinker will then
reclaim these nodes on memory pressure.

A few things need to be stored in the radix tree node to implement the
shadow node LRU and allow tree deletions coming from the list:

1. There is no index available that would describe the reverse path
   from the node up to the tree root, which is needed to perform a
   deletion.  To solve this, encode in each node its offset inside the
   parent.  This can be stored in the unused upper bits of the same
   member that stores the node's height at no extra space cost.

2. The number of shadow entries needs to be counted in addition to the
   regular entries, to quickly detect when the node is ready to go to
   the shadow node LRU list.  The current entry count is an unsigned
   int but the maximum number of entries is 64, so a shadow counter
   can easily be stored in the unused upper bits.

3. Tree modification needs tree lock and tree root, which are located
   in the address space, so store an address_space backpointer in the
   node.  The parent pointer of the node is in a union with the 2-word
   rcu_head, so the backpointer comes at no extra cost as well.

4. The node needs to be linked to an LRU list, which requires a list
   head inside the node.  This does increase the size of the node, but
   it does not change the number of objects that fit into a slab page.

[akpm@linux-foundation.org: export the right function]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:01 -07:00
Johannes Weiner
a528910e12 mm: thrash detection-based file cache sizing
The VM maintains cached filesystem pages on two types of lists.  One
list holds the pages recently faulted into the cache, the other list
holds pages that have been referenced repeatedly on that first list.
The idea is to prefer reclaiming young pages over those that have shown
to benefit from caching in the past.  We call the recently usedbut
ultimately was not significantly better than a FIFO policy and still
thrashed cache based on eviction speed, rather than actual demand for
cache.

This patch solves one half of the problem by decoupling the ability to
detect working set changes from the inactive list size.  By maintaining
a history of recently evicted file pages it can detect frequently used
pages with an arbitrarily small inactive list size, and subsequently
apply pressure on the active list based on actual demand for cache, not
just overall eviction speed.

Every zone maintains a counter that tracks inactive list aging speed.
When a page is evicted, a snapshot of this counter is stored in the
now-empty page cache radix tree slot.  On refault, the minimum access
distance of the page can be assessed, to evaluate whether the page
should be part of the active list or not.

This fixes the VM's blindness towards working set changes in excess of
the inactive list.  And it's the foundation to further improve the
protection ability and reduce the minimum inactive list size of 50%.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:01 -07:00
Johannes Weiner
e97ca8e5b8 mm: fix GFP_THISNODE callers and clarify
GFP_THISNODE is for callers that implement their own clever fallback to
remote nodes.  It restricts the allocation to the specified node and
does not invoke reclaim, assuming that the caller will take care of it
when the fallback fails, e.g.  through a subsequent allocation request
without GFP_THISNODE set.

However, many current GFP_THISNODE users only want the node exclusive
aspect of the flag, without actually implementing their own fallback or
triggering reclaim if necessary.  This results in things like page
migration failing prematurely even when there is easily reclaimable
memory available, unless kswapd happens to be running already or a
concurrent allocation attempt triggers the necessary reclaim.

Convert all callsites that don't implement their own fallback strategy
to __GFP_THISNODE.  This restricts the allocation a single node too, but
at the same time allows the allocator to enter the slowpath, wake
kswapd, and invoke direct reclaim if necessary, to make the allocation
happen when memory is full.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-10 17:26:19 -07:00
Mel Gorman
1c5e9c27cb mm: numa: limit scope of lock for NUMA migrate rate limiting
NUMA migrate rate limiting protects a migration counter and window using
a lock but in some cases this can be a contended lock.  It is not
critical that the number of pages be perfect, lost updates are
acceptable.  Reduce the importance of this lock.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21 16:19:48 -08:00
Yasuaki Ishimatsu
943dca1a1f mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
Yasuaki Ishimatsu reported memory hot-add spent more than 5 _hours_ on
9TB memory machine since onlining memory sections is too slow.  And we
found out setup_zone_migrate_reserve spent >90% of the time.

The problem is, setup_zone_migrate_reserve scans all pageblocks
unconditionally, but it is only necessary if the number of reserved
block was reduced (i.e.  memory hot remove).

Moreover, maximum MIGRATE_RESERVE per zone is currently 2.  It means
that the number of reserved pageblocks is almost always unchanged.

This patch adds zone->nr_migrate_reserve_block to maintain the number of
MIGRATE_RESERVE pageblocks and it reduces the overhead of
setup_zone_migrate_reserve dramatically.  The following table shows time
of onlining a memory section.

  Amount of memory     | 128GB | 192GB | 256GB|
  ---------------------------------------------
  linux-3.12           |  23.9 |  31.4 | 44.5 |
  This patch           |   8.3 |   8.3 |  8.6 |
  Mel's proposal patch |  10.9 |  19.2 | 31.3 |
  ---------------------------------------------
                                   (millisecond)

  128GB : 4 nodes and each node has 32GB of memory
  192GB : 6 nodes and each node has 32GB of memory
  256GB : 8 nodes and each node has 32GB of memory

  (*1) Mel proposed his idea by the following threads.
       https://lkml.org/lkml/2013/10/30/272

[akpm@linux-foundation.org: tweak comment]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21 16:19:43 -08:00
Lisa Du
6e543d5780 mm: vmscan: fix do_try_to_free_pages() livelock
This patch is based on KOSAKI's work and I add a little more description,
please refer https://lkml.org/lkml/2012/6/14/74.

Currently, I found system can enter a state that there are lots of free
pages in a zone but only order-0 and order-1 pages which means the zone is
heavily fragmented, then high order allocation could make direct reclaim
path's long stall(ex, 60 seconds) especially in no swap and no compaciton
enviroment.  This problem happened on v3.4, but it seems issue still lives
in current tree, the reason is do_try_to_free_pages enter live lock:

kswapd will go to sleep if the zones have been fully scanned and are still
not balanced.  As kswapd thinks there's little point trying all over again
to avoid infinite loop.  Instead it changes order from high-order to
0-order because kswapd think order-0 is the most important.  Look at
73ce02e9 in detail.  If watermarks are ok, kswapd will go back to sleep
and may leave zone->all_unreclaimable =3D 0.  It assume high-order users
can still perform direct reclaim if they wish.

Direct reclaim continue to reclaim for a high order which is not a
COSTLY_ORDER without oom-killer until kswapd turn on
zone->all_unreclaimble= .  This is because to avoid too early oom-kill.
So it means direct_reclaim depends on kswapd to break this loop.

In worst case, direct-reclaim may continue to page reclaim forever when
kswapd sleeps forever until someone like watchdog detect and finally kill
the process.  As described in:
http://thread.gmane.org/gmane.linux.kernel.mm/103737

We can't turn on zone->all_unreclaimable from direct reclaim path because
direct reclaim path don't take any lock and this way is racy.  Thus this
patch removes zone->all_unreclaimable field completely and recalculates
zone reclaimable state every time.

Note: we can't take the idea that direct-reclaim see zone->pages_scanned
directly and kswapd continue to use zone->all_unreclaimable.  Because, it
is racy.  commit 929bea7c71 (vmscan: all_unreclaimable() use
zone->all_unreclaimable as a name) describes the detail.

[akpm@linux-foundation.org: uninline zone_reclaimable_pages() and zone_reclaimable()]
Cc: Aaditya Kumar <aaditya.kumar.30@gmail.com>
Cc: Ying Han <yinghan@google.com>
Cc: Nick Piggin <npiggin@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: Neil Zhang <zhangwm@marvell.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Lisa Du <cldu@marvell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:58:01 -07:00
Johannes Weiner
81c0a2bb51 mm: page_alloc: fair zone allocator policy
Each zone that holds userspace pages of one workload must be aged at a
speed proportional to the zone size.  Otherwise, the time an individual
page gets to stay in memory depends on the zone it happened to be
allocated in.  Asymmetry in the zone aging creates rather unpredictable
aging behavior and results in the wrong pages being reclaimed, activated
etc.

But exactly this happens right now because of the way the page allocator
and kswapd interact.  The page allocator uses per-node lists of all zones
in the system, ordered by preference, when allocating a new page.  When
the first iteration does not yield any results, kswapd is woken up and the
allocator retries.  Due to the way kswapd reclaims zones below the high
watermark while a zone can be allocated from when it is above the low
watermark, the allocator may keep kswapd running while kswapd reclaim
ensures that the page allocator can keep allocating from the first zone in
the zonelist for extended periods of time.  Meanwhile the other zones
rarely see new allocations and thus get aged much slower in comparison.

The result is that the occasional page placed in lower zones gets
relatively more time in memory, even gets promoted to the active list
after its peers have long been evicted.  Meanwhile, the bulk of the
working set may be thrashing on the preferred zone even though there may
be significant amounts of memory available in the lower zones.

Even the most basic test -- repeatedly reading a file slightly bigger than
memory -- shows how broken the zone aging is.  In this scenario, no single
page should be able stay in memory long enough to get referenced twice and
activated, but activation happens in spades:

  $ grep active_file /proc/zoneinfo
      nr_inactive_file 0
      nr_active_file 0
      nr_inactive_file 0
      nr_active_file 8
      nr_inactive_file 1582
      nr_active_file 11994
  $ cat data data data data >/dev/null
  $ grep active_file /proc/zoneinfo
      nr_inactive_file 0
      nr_active_file 70
      nr_inactive_file 258753
      nr_active_file 443214
      nr_inactive_file 149793
      nr_active_file 12021

Fix this with a very simple round robin allocator.  Each zone is allowed a
batch of allocations that is proportional to the zone's size, after which
it is treated as full.  The batch counters are reset when all zones have
been tried and the allocator enters the slowpath and kicks off kswapd
reclaim.  Allocation and reclaim is now fairly spread out to all
available/allowable zones:

  $ grep active_file /proc/zoneinfo
      nr_inactive_file 0
      nr_active_file 0
      nr_inactive_file 174
      nr_active_file 4865
      nr_inactive_file 53
      nr_active_file 860
  $ cat data data data data >/dev/null
  $ grep active_file /proc/zoneinfo
      nr_inactive_file 0
      nr_active_file 0
      nr_inactive_file 666622
      nr_active_file 4988
      nr_inactive_file 190969
      nr_active_file 937

When zone_reclaim_mode is enabled, allocations will now spread out to all
zones on the local node, not just the first preferred zone (which on a 4G
node might be a tiny Normal zone).

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Paul Bolle <paul.bollee@gmail.com>
Cc: Zlatko Calusic <zcalusic@bitsync.net>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:57:23 -07:00
Zhang Yanfei
b21fbccd4b mm: remove unused functions is_{normal_idx, normal, dma32, dma}
These functions are nowhere used, so remove them.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09 10:33:22 -07:00