evie/android_kernel_oneplus_msm8998 - Gay Catgirls Forgejo: gay catgirls having sex

evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Trilok Soni	5ab1e18aa3	Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4" This reverts commit `9d6fd2c3e9` ("Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4"), because it breaks the dump parsing tools due to kernel can be loaded anywhere in the memory now and not fixed at linear mapping. Change-Id: Id416f0a249d803442847d09ac47781147b0d0ee6 Signed-off-by: Trilok Soni <tsoni@codeaurora.org>	2016-08-26 14:34:05 -07:00
Trilok Soni	9d6fd2c3e9	Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4 * msm-4.4/tmp-510d0a3f: Linux 4.4.11 nf_conntrack: avoid kernel pointer value leak in slab name drm/radeon: fix DP link training issue with second 4K monitor drm/i915/bdw: Add missing delay during L3 SQC credit programming drm/i915: Bail out of pipe config compute loop on LPT drm/radeon: fix PLL sharing on DCE6.1 (v2) Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing" Input: max8997-haptic - fix NULL pointer dereference get_rock_ridge_filename(): handle malformed NM entries tools lib traceevent: Do not reassign parg after collapse_tree() qla1280: Don't allocate 512kb of host tags atomic_open(): fix the handling of create_error regulator: axp20x: Fix axp22x ldo_io voltage ranges regulator: s2mps11: Fix invalid selector mask and voltages for buck9 workqueue: fix rebind bound workers warning ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC vfs: rename: check backing inode being equal vfs: add vfs_select_inode() helper perf/core: Disable the event on a truncated AUX record regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case pinctrl: at91-pio4: fix pull-up/down logic spi: spi-ti-qspi: Handle truncated frames properly spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT ALSA: hda - Fix broken reconfig ALSA: hda - Fix white noise on Asus UX501VW headset ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 ALSA: usb-audio: Yet another Phoneix Audio device quirk ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) crypto: testmgr - Use kmalloc memory for RSA input crypto: hash - Fix page length clamping in hash walk crypto: qat - fix invalid pf2vf_resp_wq logic s390/mm: fix asce_bits handling with dynamic pagetable levels zsmalloc: fix zs_can_compact() integer overflow ocfs2: fix posix_acl_create deadlock ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang net/route: enforce hoplimit max value tcp: refresh skb timestamp at retransmit time net: thunderx: avoid exposing kernel stack net: fix a kernel infoleak in x25 module uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h MIME-Version: 1.0 bridge: fix igmp / mld query parsing net: bridge: fix old ioctl unlocked net device walk VSOCK: do not disconnect socket when peer has shutdown SEND only net/mlx4_en: Fix endianness bug in IPV6 csum calculation net: fix infoleak in rtnetlink net: fix infoleak in llc net: fec: only clear a queue's work bit if the queue was emptied netem: Segment GSO packets on enqueue sch_dsmark: update backlog as well sch_htb: update backlog as well net_sched: update hierarchical backlog too net_sched: introduce qdisc_replace() helper gre: do not pull header in ICMP error processing net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case samples/bpf: fix trace_output example bpf: fix check_map_func_compatibility logic bpf: fix refcnt overflow bpf: fix double-fdput in replace_map_fd_with_map_ptr() net/mlx4_en: fix spurious timestamping callbacks ipv4/fib: don't warn when primary address is missing if in_dev is dead net/mlx5e: Fix minimum MTU net/mlx5e: Device's mtu field is u16 and not int openvswitch: use flow protocol when recalculating ipv6 checksums atl2: Disable unimplemented scatter/gather feature vlan: pull on __vlan_insert_tag error path and fix csum correction net: use skb_postpush_rcsum instead of own implementations cdc_mbim: apply "NDP to end" quirk to all Huawei devices bpf/verifier: reject invalid LD_ABS \| BPF_DW instruction net: sched: do not requeue a NULL skb packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface route: do not cache fib route info on local routes with oif decnet: Do not build routes to devices without decnet private data. parisc: Use generic extable search and sort routines arm64: kasan: Use actual memory node when populating the kernel image shadow arm64: mm: treat memstart_addr as a signed quantity arm64: lse: deal with clobbered IP registers after branch via PLT arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly arm64: kasan: Fix zero shadow mapping overriding kernel image shadow arm64: consistently use p?d_set_huge arm64: fix KASLR boot-time I-cache maintenance arm64: hugetlb: partial revert of 66b3923a1a0f arm64: make irq_stack_ptr more robust arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness efi: stub: use high allocation for converted command line efi: stub: add implementation of efi_random_alloc() efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL arm64: kaslr: randomize the linear region arm64: add support for kernel ASLR arm64: add support for building vmlinux as a relocatable PIE binary arm64: switch to relative exception tables extable: add support for relative extables to search and sort routines scripts/sortextable: add support for ET_DYN binaries arm64: futex.h: Add missing PAN toggling arm64: make asm/elf.h available to asm files arm64: avoid dynamic relocations in early boot code arm64: avoid R_AARCH64_ABS64 relocations for Image header fields arm64: add support for module PLTs arm64: move brk immediate argument definitions to separate header arm64: mm: use bit ops rather than arithmetic in pa/va translations arm64: mm: only perform memstart_addr sanity check if DEBUG_VM arm64: User die() instead of panic() in do_page_fault() arm64: allow kernel Image to be loaded anywhere in physical memory arm64: defer __va translation of initrd_start and initrd_end arm64: move kernel image to base of vmalloc area arm64: kvm: deal with kernel symbols outside of linear mapping arm64: decouple early fixmap init from linear mapping arm64: pgtable: implement static [pte\|pmd\|pud]_offset variants arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region arm64: add support for ioremap() block mappings arm64: prevent potential circular header dependencies in asm/bug.h of/fdt: factor out assignment of initrd_start/initrd_end of/fdt: make memblock minimum physical address arch configurable arm64: Remove the get_thread_info() function arm64: kernel: Don't toggle PAN on systems with UAO arm64: cpufeature: Test 'matches' pointer to find the end of the list arm64: kernel: Add support for User Access Override arm64: add ARMv8.2 id_aa64mmfr2 boiler plate arm64: cpufeature: Change read_cpuid() to use sysreg's mrs_s macro arm64: use local label prefixes for __reg_num symbols arm64: vdso: Mark vDSO code as read-only arm64: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL arm64: ptdump: Indicate whether memory should be faulting arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC arm64: Drop alloc function from create_mapping arm64: prefetch: add missing #include for spin_lock_prefetch arm64: lib: patch in prfm for copy_page if requested arm64: lib: improve copy_page to deal with 128 bytes at a time arm64: prefetch: add alternative pattern for CPUs without a prefetcher arm64: prefetch: don't provide spin_lock_prefetch with LSE arm64: allow vmalloc regions to be set with set_memory_* arm64: kernel: implement ACPI parking protocol arm64: mm: create new fine-grained mappings at boot arm64: ensure _stext and _etext are page-aligned arm64: mm: allow passing a pgdir to alloc_init_* arm64: mm: allocate pagetables anywhere arm64: mm: use fixmap when creating page tables arm64: mm: add functions to walk tables in fixmap arm64: mm: add __{pud,pgd}_populate arm64: mm: avoid redundant __pa(__va(x)) arm64: mm: add functions to walk page tables by PA arm64: mm: move pte_* macros arm64: kasan: avoid TLB conflicts arm64: mm: add code to safely replace TTBR1_EL1 arm64: add function to install the idmap arm64: unmap idmap earlier arm64: unify idmap removal arm64: mm: place empty_zero_page in bss arm64: mm: specialise pagetable allocators asm-generic: Fix local variable shadow in __set_fixmap_offset Eliminate the .eh_frame sections from the aarch64 vmlinux and kernel modules arm64: Fix an enum typo in mm/dump.c arm64: kasan: ensure that the KASAN zero page is mapped read-only arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP arm64: hide __efistub_ aliases from kallsyms Linux 4.4.10 drm/i915/skl: Fix DMC load on Skylake J0 and K0 lib/test-string_helpers.c: fix and improve string_get_size() tests ACPI / processor: Request native thermal interrupt handling via _OSC drm/i915: Fake HDMI live status drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW drm/i915: Fix eDP low vswing for Broadwell drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume drm/radeon: make sure vertical front porch is at least 1 iio: ak8975: fix maybe-uninitialized warning iio: ak8975: Fix NULL pointer exception on early interrupt drm/amdgpu: set metadata pointer to NULL after freeing. drm/amdgpu: make sure vertical front porch is at least 1 gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading nvmem: mxs-ocotp: fix buffer overflow in read USB: serial: cp210x: add Straizona Focusers device ids USB: serial: cp210x: add ID for Link ECU ata: ahci-platform: Add ports-implemented DT bindings. libahci: save port map for forced port map powerpc: Fix bad inline asm constraint in create_zero_mask() ACPICA: Dispatcher: Update thread ID for recursive method calls x86/sysfb_efi: Fix valid BAR address range check ARC: Add missing io barriers to io{read,write}{16,32}be() ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value propogate_mnt: Handle the first propogated copy being a slave fs/pnode.c: treat zero mnt_group_id-s as unequal x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO MAINTAINERS: Remove asterisk from EFI directory names writeback: Fix performance regression in wb_over_bg_thresh() batman-adv: Reduce refcnt of removed router when updating route batman-adv: Fix broadcast/ogm queue limit on a removed interface batman-adv: Check skb size before using encapsulated ETH+VLAN header batman-adv: fix DAT candidate selection (must use vid) mm: update min_free_kbytes from khugepaged after core initialization proc: prevent accessing /proc/<PID>/environ until it's ready Input: zforce_ts - fix dual touch recognition HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk HID: wacom: Add support for DTK-1651 xen/evtchn: fix ring resize when binding new events xen/balloon: Fix crash when ballooning on x86 32 bit PAE xen: Fix page <-> pfn conversion on 32 bit systems ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel ARM: EXYNOS: Properly skip unitialized parent clock in power domain on mm/zswap: provide unique zpool name mm, cma: prevent nr_isolated_* counters from going negative Minimal fix-up of bad hashing behavior of hash_64() MD: make bio mergeable tracing: Don't display trigger file for events that can't be enabled mac80211: fix statistics leak if dev_alloc_name() fails ath9k: ar5008_hw_cmn_spur_mitigate: add missing mask_m & mask_p initialisation lpfc: fix misleading indentation clk: qcom: msm8960: Fix ce3_src register offset clk: versatile: sp810: support reentrance clk: qcom: msm8960: fix ce3_core clk enable register clk: meson: Fix meson_clk_register_clks() signature type mismatch clk: rockchip: free memory in error cases when registering clock branches soc: rockchip: power-domain: fix err handle while probing clk-divider: make sure read-only dividers do not write to their register CNS3xxx: Fix PCI cns3xxx_write_config() mwifiex: fix corner case association failure ata: ahci_xgene: dereferencing uninitialized pointer in probe nbd: ratelimit error msgs after socket close mfd: intel-lpss: Remove clock tree on error path ipvs: drop first packet to redirect conntrack ipvs: correct initial offset of Call-ID header search in SIP persistence engine ipvs: handle ip_vs_fill_iph_skb_off failure RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips Revert: "powerpc/tm: Check for already reclaimed tasks" arm64: head.S: use memset to clear BSS efi: stub: define DISABLE_BRANCH_PROFILING for all architectures arm64: entry: remove pointless SPSR mode check arm64: mm: move pgd_cache initialisation to pgtable_cache_init arm64: module: avoid undefined shift behavior in reloc_data() arm64: module: fix relocation of movz instruction with negative immediate arm64: traps: address fallout from printk -> pr_* conversion arm64: ftrace: fix a stack tracer's output under function graph tracer arm64: pass a task parameter to unwind_frame() arm64: ftrace: modify a stack frame in a safe way arm64: remove irq_count and do_softirq_own_stack() arm64: hugetlb: add support for PTE contiguous bit arm64: Use PoU cache instr for I/D coherency arm64: Defer dcache flush in __cpu_copy_user_page arm64: reduce stack use in irq_handler arm64: Documentation: add list of software workarounds for errata arm64: mm: place __cpu_setup in .text arm64: cmpxchg: Don't incldue linux/mmdebug.h arm64: mm: fold alternatives into .init arm64: Remove redundant padding from linker script arm64: mm: remove pointless PAGE_MASKing arm64: don't call C code with el0's fp register arm64: when walking onto the task stack, check sp & fp are in current->stack arm64: Add this_cpu_ptr() assembler macro for use in entry.S arm64: irq: fix walking from irq stack to task stack arm64: Add do_softirq_own_stack() and enable irq_stacks arm64: Modify stack trace and dump for use with irq_stack arm64: Store struct thread_info in sp_el0 arm64: Add trace_hardirqs_off annotation in ret_to_user arm64: ftrace: fix the comments for ftrace_modify_code arm64: ftrace: stop using kstop_machine to enable/disable tracing arm64: spinlock: serialise spin_unlock_wait against concurrent lockers arm64: enable HAVE_IRQ_TIME_ACCOUNTING arm64: fix COMPAT_SHMLBA definition for large pages arm64: add __init/__initdata section marker to some functions/variables arm64: pgtable: implement pte_accessible() arm64: mm: allow sections for unaligned bases arm64: mm: detect bad __create_mapping uses Linux 4.4.9 extcon: max77843: Use correct size for reading the interrupt register stm class: Select CONFIG_SRCU megaraid_sas: add missing curly braces in ioctl handler sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race thermal: rockchip: fix a impossible condition caused by the warning unbreak allmodconfig KCONFIG_ALLCONFIG=... jme: Fix device PM wakeup API usage jme: Do not enable NIC WoL functions on S0 bus: imx-weim: Take the 'status' property value into account ARM: dts: pxa: fix dma engine node to pxa3xx-nand ARM: dts: armada-375: use armada-370-sata for SATA ARM: EXYNOS: select THERMAL_OF ARM: prima2: always enable reset controller ARM: OMAP3: Add cpuidle parameters table for omap3430 ext4: fix races of writeback with punch hole and zero range ext4: fix races between buffered IO and collapse / insert range ext4: move unlocked dio protection from ext4_alloc_file_blocks() ext4: fix races between page faults and hole punching perf stat: Document --detailed option perf tools: handle spaces in file names obtained from /proc/pid/maps perf hists browser: Only offer symbol scripting when a symbol is under the cursor mtd: nand: Drop mtd.owner requirement in nand_scan mtd: brcmnand: Fix v7.1 register offsets mtd: spi-nor: remove micron_quad_enable() serial: sh-sci: Remove cpufreq notifier to fix crash/deadlock ext4: fix NULL pointer dereference in ext4_mark_inode_dirty() x86/mm/kmmio: Fix mmiotrace for hugepages perf evlist: Reference count the cpu and thread maps at set_maps() drivers/misc/ad525x_dpot: AD5274 fix RDAC read back errors rtc: max77686: Properly handle regmap_irq_get_virq() error code rtc: rx8025: remove rv8803 id rtc: ds1685: passing bogus values to irq_restore rtc: vr41xx: Wire up alarm_irq_enable rtc: hym8563: fix invalid year calculation PM / Domains: Fix removal of a subdomain PM / OPP: Initialize u_volt_min/max to a valid value misc: mic/scif: fix wrap around tests misc/bmp085: Enable building as a module lib/mpi: Endianness fix fbdev: da8xx-fb: fix videomodes of lcd panels scsi_dh: force modular build if SCSI is a module paride: make 'verbose' parameter an 'int' again regulator: s5m8767: fix get_register() error handling irqchip/mxs: Fix error check of of_io_request_and_map() irqchip/sunxi-nmi: Fix error check of of_io_request_and_map() spi/rockchip: Make sure spi clk is on in rockchip_spi_set_cs locking/mcs: Fix mcs_spin_lock() ordering regulator: core: Fix nested locking of supplies regulator: core: Ensure we lock all regulators regulator: core: fix regulator_lock_supply regression Revert "regulator: core: Fix nested locking of supplies" videobuf2-v4l2: Verify planes array in buffer dequeueing videobuf2-core: Check user space planes array in dqbuf USB: usbip: fix potential out-of-bounds write cgroup: make sure a parent css isn't freed before its children mm/hwpoison: fix wrong num_poisoned_pages accounting mm: vmscan: reclaim highmem zone if buffer_heads is over limit numa: fix /proc/<pid>/numa_maps for THP mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check memcg: relocate charge moving from ->attach to ->post_attach cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback slub: clean up code for kmem cgroup support to kmem_cache_free_bulk workqueue: fix ghost PENDING flag while doing MQ IO x86/apic: Handle zero vector gracefully in clear_vector_irq() efi: Expose non-blocking set_variable() wrapper to efivars efi: Fix out-of-bounds read in variable_matches() IB/security: Restrict use of the write() interface IB/mlx5: Expose correct max_sge_rd limit cxl: Keep IRQ mappings on context teardown v4l2-dv-timings.h: fix polarity for 4k formats vb2-memops: Fix over allocation of frame vectors ASoC: rt5640: Correct the digital interface data select ASoC: dapm: Make sure we have a card when displaying component widgets ASoC: ssm4567: Reset device before regcache_sync() ASoC: s3c24xx: use const snd_soc_component_driver pointer EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback toshiba_acpi: Fix regression caused by hotkey enabling value i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared i2c: cpm: Fix build break due to incompatible pointer types perf intel-pt: Fix segfault tracing transactions drm/i915: Use fw_domains_put_with_fifo() on HSW drm/i915: Fixup the free space logic in ring_prepare drm/amdkfd: uninitialized variable in dbgdev_wave_control_set_registers() drm/i915: skl_update_scaler() wants a rotation bitmask instead of bit number drm/i915: Cleanup phys status page too pwm: brcmstb: Fix check of devm_ioremap_resource() return code drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1() drm/dp/mst: Restore primary hub guid on resume drm/dp/mst: Validate port in drm_dp_payload_send_msg() drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries drm: Loongson-3 doesn't fully support wc memory drm/radeon: fix vertical bars appear on monitor (v2) drm/radeon: forbid mapping of userptr bo through radeon device file drm/radeon: fix initial connector audio value drm/radeon: add a quirk for a XFX R9 270X drm/amdgpu: fix regression on CIK (v2) amdgpu/uvd: add uvd fw version for amdgpu drm/amdgpu: bump the afmt limit for CZ, ST, Polaris drm/amdgpu: use defines for CRTCs and AMFT blocks drm/amdgpu: when suspending, if uvd/vce was running. need to cancel delay work. iommu/dma: Restore scatterlist offsets correctly iommu/amd: Fix checking of pci dma aliases pinctrl: single: Fix pcs_parse_bits_in_pinctrl_entry to use __ffs than ffs pinctrl: mediatek: correct debounce time unit in mtk_gpio_set_debounce xen kconfig: don't "select INPUT_XEN_KBDDEV_FRONTEND" Input: pmic8xxx-pwrkey - fix algorithm for converting trigger delay Input: gtco - fix crash on detecting device without endpoints netlink: don't send NETLINK_URELEASE for unbound sockets nl80211: check netlink protocol in socket release notification powerpc: Update TM user feature bits in scan_features() powerpc: Update cpu_user_features2 in scan_features() powerpc: scan_features() updates incorrect bits for REAL_LE crypto: talitos - fix AEAD tcrypt tests crypto: talitos - fix crash in talitos_cra_init() crypto: sha1-mb - use corrcet pointer while completing jobs crypto: ccp - Prevent information leakage on export iwlwifi: mvm: fix memory leak in paging iwlwifi: pcie: lower the debug level for RSA semaphore access s390/pci: add extra padding to function measurement block cpufreq: intel_pstate: Fix processing for turbo activation ratio Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control" Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control" drm/i915: Fix race condition in intel_dp_destroy_mst_connector() drm/qxl: fix cursor position with non-zero hotspot drm/nouveau/core: use vzalloc for allocating ramht futex: Acknowledge a new waiter in counter before plist futex: Handle unlock_pi race gracefully asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic() ALSA: hda - Add dock support for ThinkPad X260 ALSA: pcxhr: Fix missing mutex unlock ALSA: hda - add PCI ID for Intel Broxton-T ALSA: hda - Keep powering up ADCs on Cirrus codecs ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m ALSA: hda - Don't trust the reported actual power state x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address x86/mm/xen: Suppress hugetlbfs in PV guests arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings sched/cgroup: Fix/cleanup cgroup teardown/init dmaengine: pxa_dma: fix the maximum requestor line dmaengine: hsu: correct use of channel status register dmaengine: dw: fix master selection debugfs: Make automount point inodes permanently empty lib: lz4: fixed zram with lz4 on big endian machines dm cache metadata: fix cmd_read_lock() acquiring write lock dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros usb: gadget: f_fs: Fix use-after-free usb: hcd: out of bounds access in for_each_companion xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers usb: xhci: fix wild pointers in xhci_mem_cleanup xhci: resume USB 3 roothub first usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host assoc_array: don't call compare_object() on a node ARM: OMAP2+: hwmod: Fix updating of sysconfig register ARM: OMAP2: Fix up interconnect barrier initialization for DRA7 ARM: mvebu: Correct unit address for linksys ARM: dts: AM43x-epos: Fix clk parent for synctimer KVM: arm/arm64: Handle forward time correction gracefully kvm: x86: do not leak guest xcr0 into host interrupt handlers x86/mce: Avoid using object after free in genpool block: loop: fix filesystem corruption in case of aio/dio block: partition: initialize percpuref before sending out KOBJ_ADD Conflicts: arch/arm64/Kconfig arch/arm64/include/asm/cputype.h arch/arm64/include/asm/hardirq.h arch/arm64/include/asm/irq.h arch/arm64/kernel/cpu_errata.c arch/arm64/kernel/cpuinfo.c arch/arm64/kernel/setup.c arch/arm64/kernel/smp.c arch/arm64/kernel/stacktrace.c arch/arm64/mm/init.c arch/arm64/mm/mmu.c arch/arm64/mm/pageattr.c mm/memcontrol.c CRs-Fixed: 1054234 Signed-off-by: Trilok Soni <tsoni@codeaurora.org> Change-Id: I2a7a34631ffee36ce18b9171f16d023be777392f	2016-08-18 14:50:45 -07:00
Runmin Wang	750075feff	Merge remote-tracking branch 'origin/tmp-917a9a9133a6' into lsk * tmp-917a9: ARM/vdso: Mark the vDSO code read-only after init x86/vdso: Mark the vDSO code read-only after init lkdtm: Verify that '__ro_after_init' works correctly arch: Introduce post-init read-only memory x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings asm-generic: Consolidate mark_rodata_ro() Linux 4.4.6 ld-version: Fix awk regex compile failure target: Drop incorrect ABORT_TASK put for completed commands block: don't optimize for non-cloned bio in bio_get_last_bvec() MIPS: smp.c: Fix uninitialised temp_foreign_map MIPS: Fix build error when SMP is used without GIC ovl: fix getcwd() failure after unsuccessful rmdir ovl: copy new uid/gid into overlayfs runtime inode userfaultfd: don't block on the last VM updates at exit time powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages powerpc/powernv: Add a kmsg_dumper that flushes console output on panic powerpc: Fix dedotify for binutils >= 2.26 Revert "drm/radeon/pm: adjust display configuration after powerstate" drm/radeon: Fix error handling in radeon_flip_work_func. drm/amdgpu: Fix error handling in amdgpu_flip_work_func. Revert "drm/radeon: call hpd_irq_event on resume" x86/mm: Fix slow_virt_to_phys() for X86_PAE again gpu: ipu-v3: Do not bail out on missing optional port nodes mac80211: Fix Public Action frame RX in AP mode mac80211: check PN correctly for GCMP-encrypted fragmented MPDUs mac80211: minstrel_ht: fix a logic error in RTS/CTS handling mac80211: minstrel_ht: set default tx aggregation timeout to 0 mac80211: fix use of uninitialised values in RX aggregation mac80211: minstrel: Change expected throughput unit back to Kbps iwlwifi: mvm: inc pending frames counter also when txing non-sta can: gs_usb: fixed disconnect bug by removing erroneous use of kfree() cfg80211/wext: fix message ordering wext: fix message delay/ordering ovl: fix working on distributed fs as lower layer ovl: ignore lower entries when checking purity of non-directory entries ASoC: wm8958: Fix enum ctl accesses in a wrong type ASoC: wm8994: Fix enum ctl accesses in a wrong type ASoC: samsung: Use IRQ safe spin lock calls ASoC: dapm: Fix ctl value accesses in a wrong type ncpfs: fix a braino in OOM handling in ncp_fill_cache() jffs2: reduce the breakage on recovery from halfway failed rename() dmaengine: at_xdmac: fix residue computation tracing: Fix check for cpu online when event is disabled s390/dasd: fix diag 0x250 inline assembly s390/mm: four page table levels vs. fork KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS KVM: VMX: disable PEBS before a guest entry kvm: cap halt polling at exactly halt_poll_ns PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr() ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property ARM: dts: dra7: do not gate cpsw clock due to errata i877 ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window arm64: account for sparsemem section alignment when choosing vmemmap offset Linux 4.4.5 drm/amdgpu: fix topaz/tonga gmc assignment in 4.4 stable modules: fix longstanding /proc/kallsyms vs module insertion race. drm/i915: refine qemu south bridge detection drm/i915: more virtual south bridge detection block: get the 1st and last bvec via helpers block: check virt boundary in bio_will_gap() drm/amdgpu: Use drm_calloc_large for VM page_tables array thermal: cpu_cooling: fix out of bounds access in time_in_idle i2c: brcmstb: allocate correct amount of memory for regmap ubi: Fix out of bounds write in volume update code cxl: Fix PSL timebase synchronization detection MIPS: traps: Fix SIGFPE information leak from `do_ov' and `do_trap_or_bp' MIPS: scache: Fix scache init with invalid line size. USB: serial: option: add support for Quectel UC20 USB: serial: option: add support for Telit LE922 PID 0x1045 USB: qcserial: add Sierra Wireless EM74xx device ID USB: qcserial: add Dell Wireless 5809e Gobi 4G HSPA+ (rev3) USB: cp210x: Add ID for Parrot NMEA GPS Flight Recorder usb: chipidea: otg: change workqueue ci_otg as freezable ALSA: timer: Fix broken compat timer user status ioctl ALSA: hdspm: Fix zero-division ALSA: hdsp: Fix wrong boolean ctl value accesses ALSA: hdspm: Fix wrong boolean ctl value accesses ALSA: seq: oss: Don't drain at closing a client ALSA: pcm: Fix ioctls for X32 ABI ALSA: timer: Fix ioctls for X32 ABI ALSA: rawmidi: Fix ioctls X32 ABI ALSA: hda - Fix mic issues on Acer Aspire E1-472 ALSA: ctl: Fix ioctls for X32 ABI ALSA: usb-audio: Add a quirk for Plantronics DA45 adv7604: fix tx 5v detect regression dmaengine: pxa_dma: fix cyclic transfers Fix directory hardlinks from deleted directories jffs2: Fix page lock / f->sem deadlock Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin" Btrfs: fix loading of orphan roots leading to BUG_ON pata-rb532-cf: get rid of the irq_to_gpio() call tracing: Do not have 'comm' filter override event 'comm' field ata: ahci: don't mark HotPlugCapable Ports as external/removable PM / sleep / x86: Fix crash on graph trace through x86 suspend arm64: vmemmap: use virtual projection of linear region Adding Intel Lewisburg device IDs for SATA writeback: flush inode cgroup wb switches instead of pinning super_block block: bio: introduce helpers to get the 1st and last bvec libata: Align ata_device's id on a cacheline libata: fix HDIO_GET_32BIT ioctl drm/amdgpu: return from atombios_dp_get_dpcd only when error drm/amdgpu/gfx8: specify which engine to wait before vm flush drm/amdgpu: apply gfx_v8 fixes to gfx_v7 as well drm/amdgpu/pm: update current crtc info after setting the powerstate drm/radeon/pm: update current crtc info after setting the powerstate drm/ast: Fix incorrect register check for DRAM width target: Fix WRITE_SAME/DISCARD conversion to linux 512b sectors iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered iommu/amd: Apply workaround for ATS write permission check arm/arm64: KVM: Fix ioctl error handling KVM: x86: fix root cause for missed hardware breakpoints vfio: fix ioctl error handling Fix cifs_uniqueid_to_ino_t() function for s390x CIFS: Fix SMB2+ interim response processing for read requests cifs: fix out-of-bounds access in lease parsing fbcon: set a default value to blink interval kvm: x86: Update tsc multiplier on change. mips/kvm: fix ioctl error handling parisc: Fix ptrace syscall number and return value modification PCI: keystone: Fix MSI code that retrieves struct pcie_port pointer block: Initialize max_dev_sectors to 0 drm/amdgpu: mask out WC from BO on unsupported arches btrfs: async-thread: Fix a use-after-free error for trace btrfs: Fix no_space in write and rm loop Btrfs: fix deadlock running delayed iputs at transaction commit time drivers: sh: Restore legacy clock domain on SuperH platforms use ->d_seq to get coherency between ->d_inode and ->d_flags Linux 4.4.4 iwlwifi: mvm: don't allow sched scans without matches to be started iwlwifi: update and fix 7265 series PCI IDs iwlwifi: pcie: properly configure the debug buffer size for 8000 iwlwifi: dvm: fix WoWLAN security: let security modules use PTRACE_MODE_* with bitmasks IB/cma: Fix RDMA port validation for iWarp x86/irq: Plug vector cleanup race x86/irq: Call irq_force_move_complete with irq descriptor x86/irq: Remove outgoing CPU from vector cleanup mask x86/irq: Remove the cpumask allocation from send_cleanup_vector() x86/irq: Clear move_in_progress before sending cleanup IPI x86/irq: Remove offline cpus from vector cleanup x86/irq: Get rid of code duplication x86/irq: Copy vectormask instead of an AND operation x86/irq: Check vector allocation early x86/irq: Reorganize the search in assign_irq_vector x86/irq: Reorganize the return path in assign_irq_vector x86/irq: Do not use apic_chip_data.old_domain as temporary buffer x86/irq: Validate that irq descriptor is still active x86/irq: Fix a race in x86_vector_free_irqs() x86/irq: Call chip->irq_set_affinity in proper context x86/entry/compat: Add missing CLAC to entry_INT80_32 x86/mpx: Fix off-by-one comparison with nr_registers hpfs: don't truncate the file when delete fails do_last(): ELOOP failure exit should be done after leaving RCU mode should_follow_link(): validate ->d_seq after having decided to follow xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted. xen/pciback: Save the number of MSI-X entries to be copied later. xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY xen/scsiback: correct frontend counting xen/arm: correctly handle DMA mapping of compound pages ARM: at91/dt: fix typo in sama5d2 pinmux descriptions ARM: OMAP2+: Fix onenand initialization to avoid filesystem corruption do_last(): don't let a bogus return value from ->open() et.al. to confuse us kernel/resource.c: fix muxed resource handling in __request_region() sunrpc/cache: fix off-by-one in qword_get() tracing: Fix showing function event in available_events powerpc/eeh: Fix partial hotplug criterion KVM: x86: MMU: fix ubsan index-out-of-range warning KVM: x86: fix conversion of addresses to linear in 32-bit protected mode KVM: x86: fix missed hardware breakpoints KVM: arm/arm64: vgic: Ensure bitmaps are long enough KVM: async_pf: do not warn on page allocation failures of/irq: Fix msi-map calculation for nonzero rid-base NFSv4: Fix a dentry leak on alias use nfs: fix nfs_size_to_loff_t block: fix use-after-free in dio_bio_complete bio: return EINTR if copying to user space got interrupted i2c: i801: Adding Intel Lewisburg support for iTCO phy: core: fix wrong err handle for phy_power_on writeback: keep superblock pinned during cgroup writeback association switches cgroup: make sure a parent css isn't offlined before its children cpuset: make mm migration asynchronous PCI/AER: Flush workqueue on device remove to avoid use-after-free ARCv2: SMP: Emulate IPI to self using software triggered interrupt ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2 libata: fix sff host state machine locking while polling qla2xxx: Fix stale pointer access. spi: atmel: fix gpio chip-select in case of non-DT platform target: Fix race with SCF_SEND_DELAYED_TAS handling target: Fix remote-port TMR ABORT + se_cmd fabric stop target: Fix TAS handling for multi-session se_node_acls target: Fix LUN_RESET active TMR descriptor handling target: Fix LUN_RESET active I/O handling for ACK_KREF ALSA: hda - Fixing background noise on Dell Inspiron 3162 ALSA: hda - Apply clock gate workaround to Skylake, too Revert "workqueue: make sure delayed work run in local cpu" workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup mac80211: Requeue work after scan complete for all VIF types. rfkill: fix rfkill_fop_read wait_event usage tick/nohz: Set the correct expiry when switching to nohz/lowres mode perf stat: Do not clean event's private stats cdc-acm:exclude Samsung phone 04e8:685d Revert "Staging: panel: usleep_range is preferred over udelay" Staging: speakup: Fix getting port information sd: Optimal I/O size is in bytes, not sectors libceph: don't spam dmesg with stray reply warnings libceph: use the right footer size when skipping a message libceph: don't bail early from try_read() when skipping a message libceph: fix ceph_msg_revoke() seccomp: always propagate NO_NEW_PRIVS on tsync cpufreq: Fix NULL reference crash while accessing policy->governor_data cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype hwmon: (ads1015) Handle negative conversion values correctly hwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook hwmon: (dell-smm) Blacklist Dell Studio XPS 8000 Thermal: do thermal zone update after a cooling device registered Thermal: handle thermal zone device properly during system sleep Thermal: initialize thermal zone device correctly IB/mlx5: Expose correct maximum number of CQE capacity IB/qib: Support creating qps with GFP_NOIO flag IB/qib: fix mcast detach when qp not attached IB/cm: Fix a recently introduced deadlock dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer dmaengine: at_xdmac: fix resume for cyclic transfers dmaengine: dw: fix cyclic transfer callbacks dmaengine: dw: fix cyclic transfer setup nfit: fix multi-interface dimm handling, acpi6.1 compatibility ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot() ACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist" ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830 ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700 lib: sw842: select crc32 uapi: update install list after nvme.h rename ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list ideapad-laptop: Add Lenovo ideapad Y700-17ISK to no_hw_rfkill dmi list toshiba_acpi: Fix blank screen at boot if transflective backlight is supported make sure that freeing shmem fast symlinks is RCU-delayed drm/radeon/pm: adjust display configuration after powerstate drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2) drm: Fix treatment of drm_vblank_offdelay in drm_vblank_on() (v2) drm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4 drm: Prevent vblank counter bumps > 1 with active vblank clients. (v2) drm: No-Op redundant calls to drm_vblank_off() (v2) drm/radeon: use post-decrement in error handling drm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command drm/i915: fix error path in intel_setup_gmbus() drm/i915/dsi: don't pass arbitrary data to sideband drm/i915/dsi: defend gpio table against out of bounds access drm/i915/skl: Don't skip mst encoders in skl_ddi_pll_select() drm/i915: Don't reject primary plane windowing with color keying enabled on SKL+ drm/i915/dp: fall back to 18 bpp when sink capability is unknown drm/i915: Make sure DC writes are coherent on flush. drm/i915: Init power domains early in driver load drm/i915: intel_hpd_init(): Fix suspend/resume reprobing drm/i915: Restore inhibiting the load of the default context drm: fix missing reference counting decrease drm/radeon: hold reference to fences in radeon_sa_bo_new drm/radeon: mask out WC from BO on unsupported arches drm: add helper to check for wc memory support drm/radeon: fix DP audio support for APU with DCE4.1 display engine drm/radeon: Add a common function for DFS handling drm/radeon: cleaned up VCO output settings for DP audio drm/radeon: properly byte swap vce firmware setup drm/radeon: clean up fujitsu quirks drm/radeon: Fix "slow" audio over DP on DCE8+ drm/radeon: call hpd_irq_event on resume drm/radeon: Fix off-by-one errors in radeon_vm_bo_set_addr drm/dp/mst: deallocate payload on port destruction drm/dp/mst: Reverse order of MST enable and clearing VC payload table. drm/dp/mst: move GUID storage from mgr, port to only mst branch drm/dp/mst: Calculate MST PBN with 31.32 fixed point drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil drm/dp/mst: fix in RAD element access drm/dp/mst: fix in MSTB RAD initialization drm/dp/mst: always send reply for UP request drm/dp/mst: process broadcast messages correctly drm/nouveau: platform: Fix deferred probe drm/nouveau/disp/dp: ensure sink is powered up before attempting link training drm/nouveau/display: Enable vblank irqs after display engine is on again. drm/nouveau/kms: take mode_config mutex in connector hotplug path drm/amdgpu/pm: adjust display configuration after powerstate drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc. drm/amdgpu: use post-decrement in error handling drm/amdgpu: fix issue with overlapping userptrs drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2) drm/amdgpu: remove unnecessary forward declaration drm/amdgpu: fix s4 resume drm/amdgpu: remove exp hardware support from iceland drm/amdgpu: don't load MEC2 on topaz drm/amdgpu: drop topaz support from gmc8 module drm/amdgpu: pull topaz gmc bits into gmc_v7 drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above drm/amdgpu: iceland use CI based MC IP drm/amdgpu: move gmc7 support out of CIK dependency drm/amdgpu: no need to load MC firmware on fiji drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2 drm/amdgpu: fix tonga smu resume drm/amdgpu: fix lost sync_to if scheduler is enabled. drm/amdgpu: call hpd_irq_event on resume drm/amdgpu: Fix off-by-one errors in amdgpu_vm_bo_map drm/vmwgfx: respect 'nomodeset' drm/vmwgfx: Fix a width / pitch mismatch on framebuffer updates drm/vmwgfx: Fix an incorrect lock check virtio_pci: fix use after free on release virtio_balloon: fix race between migration and ballooning virtio_balloon: fix race by fill and leak regulator: mt6311: MT6311_REGULATOR needs to select REGMAP_I2C regulator: axp20x: Fix GPIO LDO enable value for AXP22x clk: exynos: use irqsave version of spin_lock to avoid deadlock with irqs cxl: use correct operator when writing pcie config space values sparc64: fix incorrect sign extension in sys_sparc64_personality EDAC, mc_sysfs: Fix freeing bus' name EDAC: Robustify workqueues destruction MIPS: Fix buffer overflow in syscall_get_arguments() MIPS: Fix some missing CONFIG_CPU_MIPSR6 #ifdefs MIPS: hpet: Choose a safe value for the ETIME check MIPS: Loongson-3: Fix SMP_ASK_C0COUNT IPI handler Revert "MIPS: Fix PAGE_MASK definition" cputime: Prevent 32bit overflow in time[val\|spec]_to_cputime() time: Avoid signed overflow in timekeeping_get_ns() Bluetooth: 6lowpan: Fix handling of uncompressed IPv6 packets Bluetooth: 6lowpan: Fix kernel NULL pointer dereferences Bluetooth: Fix incorrect removing of IRKs Bluetooth: Add support of Toshiba Broadcom based devices Bluetooth: Use continuous scanning when creating LE connections Drivers: hv: vmbus: Fix a Host signaling bug tools: hv: vss: fix the write()'s argument: error -> vss_msg mmc: sdhci: Allow override of get_cd() called from sdhci_request() mmc: sdhci: Allow override of mmc host operations mmc: sdhci-pci: Fix card detect race for Intel BXT/APL mmc: pxamci: fix again read-only gpio detection polarity mmc: sdhci-acpi: Fix card detect race for Intel BXT/APL mmc: mmci: fix an ages old detection error mmc: core: Enable tuning according to the actual timing mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off() mmc: mmc: Fix incorrect use of driver strength switching HS200 and HS400 mmc: sdio: Fix invalid vdd in voltage switch power cycle mmc: sdhci: Fix DMA descriptor with zero data length mmc: sdhci-pci: Do not default to 33 Ohm driver strength for Intel SPT mmc: usdhi6rol0: handle NULL data in timeout clockevents/tcb_clksrc: Prevent disabling an already disabled clock posix-clock: Fix return code on the poll method's error path irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1 irqchip/atmel-aic: Fix wrong bit operation for IRQ priority irqchip/mxs: Add missing set_handle_irq() irqchip/omap-intc: Add support for spurious irq handling coresight: checking for NULL string in coresight_name_match() dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths dm snapshot: fix hung bios when copy error occurs dm space map metadata: remove unused variable in brb_pop() tda1004x: only update the frontend properties if locked vb2: fix a regression in poll() behavior for output,streams gspca: ov534/topro: prevent a division by 0 si2157: return -EINVAL if firmware blob is too big media: dvb-core: Don't force CAN_INVERSION_AUTO in oneshot mode rc: sunxi-cir: Initialize the spinlock properly namei: ->d_inode of a pinned dentry is stable only for positives mei: validate request value in client notify request ioctl mei: fix fasync return value on error rtlwifi: rtl8723be: Fix module parameter initialization rtlwifi: rtl8188ee: Fix module parameter initialization rtlwifi: rtl8192se: Fix module parameter initialization rtlwifi: rtl8723ae: Fix initialization of module parameters rtlwifi: rtl8192de: Fix incorrect module parameter descriptions rtlwifi: rtl8192ce: Fix handling of module parameters rtlwifi: rtl8192cu: Add missing parameter setup rtlwifi: rtl_pci: Fix kernel panic locks: fix unlock when fcntl_setlk races with a close um: link with -lpthread uml: fix hostfs mknod() uml: flush stdout before forking s390/fpu: signals vs. floating point control register s390/compat: correct restore of high gprs on signal return s390/dasd: fix performance drop s390/dasd: fix refcount for PAV reassignment s390/dasd: prevent incorrect length error under z/VM after PAV changes s390: fix normalization bug in exception table sorting btrfs: initialize the seq counter in struct btrfs_device Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots Btrfs: fix transaction handle leak on failure to create hard link Btrfs: fix number of transaction units required to create symlink Btrfs: send, don't BUG_ON() when an empty symlink is found btrfs: statfs: report zero available if metadata are exhausted Btrfs: igrab inode in writepage Btrfs: add missing brelse when superblock checksum fails KVM: s390: fix memory overwrites when vx is disabled s390/kvm: remove dependency on struct save_area definition clocksource/drivers/vt8500: Increase the minimum delta genirq: Validate action before dereferencing it in handle_irq_event_percpu() mm: numa: quickly fail allocations for NUMA balancing on full nodes mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED ocfs2: unlock inode if deleting inode from orphan fails drm/i915: shut up gen8+ SDE irq dmesg noise iw_cxgb3: Fix incorrectly returning error on success spi: omap2-mcspi: Prevent duplicate gpio_request drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE USB: option: add "4G LTE usb-modem U901" USB: option: add support for SIM7100E USB: cp210x: add IDs for GE B650V3 and B850V3 boards usb: dwc3: Fix assignment of EP transfer resources can: ems_usb: Fix possible tx overflow dm thin: fix race condition when destroying thin pool workqueue bcache: Change refill_dirty() to always scan entire disk if necessary bcache: prevent crash on changing writeback_running bcache: allows use of register in udev to avoid "device_busy" error. bcache: unregister reboot notifier if bcache fails to unregister device bcache: fix a leak in bch_cached_dev_run() bcache: clear BCACHE_DEV_UNLINK_DONE flag when attaching a backing device bcache: Add a cond_resched() call to gc bcache: fix a livelock when we cause a huge number of cache misses lib/ucs2_string: Correct ucs2 -> utf8 conversion efi: Add pstore variables to the deletion whitelist efi: Make efivarfs entries immutable by default efi: Make our variable validation list include the guid efi: Do variable name validation tests in utf8 efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version lib/ucs2_string: Add ucs2 -> utf8 helper functions ARM: 8457/1: psci-smp is built only for SMP drm/gma500: Use correct unref in the gem bo create function devm_memremap: Fix error value when memremap failed KVM: s390: fix guest fprs memory leak arm64: errata: Add -mpc-relative-literal-loads to build flags ARM: debug-ll: fix BCM63xx entry for multiplatform ext4: fix bh->b_state corruption sctp: Fix port hash table size computation unix_diag: fix incorrect sign extension in unix_lookup_by_ino tipc: unlock in error path rtnl: RTM_GETNETCONF: fix wrong return value IFF_NO_QUEUE: Fix for drivers not calling ether_setup() tcp/dccp: fix another race at listener dismantle route: check and remove route cache when we get route net_sched fix: reclassification needs to consider ether protocol changes pppoe: fix reference counting in PPPoE proxy l2tp: Fix error creating L2TP tunnels net/mlx4_en: Avoid changing dev->features directly in run-time net/mlx4_en: Choose time-stamping shift value according to HW frequency net/mlx4_en: Count HW buffer overrun only once qmi_wwan: add "4G LTE usb-modem U901" tcp: md5: release request socket instead of listener tipc: fix premature addition of node to lookup table af_unix: Guard against other == sk in unix_dgram_sendmsg af_unix: Don't set err in unix_stream_read_generic unless there was an error ipv4: fix memory leaks in ip_cmsg_send() callers bonding: Fix ARP monitor validation bpf: fix branch offset adjustment on backjumps after patching ctx expansion flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen net: Copy inner L3 and L4 headers as unaligned on GRE TEB sctp: translate network order to host order when users get a hmacid enic: increment devcmd2 result ring in case of timeout tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs net:Add sysctl_max_skb_frags tcp: do not drop syn_recv on all icmp reports unix: correctly track in-flight fds in sending process user_struct ipv6: fix a lockdep splat ipv6: addrconf: Fix recursive spin lock call ipv6/udp: use sticky pktinfo egress ifindex on connect() ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail() tcp: beware of alignments in tcp_get_info() switchdev: Require RTNL mutex to be held when sending FDB notifications inet: frag: Always orphan skbs inside ip_defrag() tipc: fix connection abort during subscription cancel net: dsa: fix mv88e6xxx switches sctp: allow setting SCTP_SACK_IMMEDIATELY by the application pptp: fix illegal memory access caused by multiple bind()s af_unix: fix struct pid memory leak tcp: fix NULL deref in tcp_v4_send_ack() lwt: fix rx checksum setting for lwt devices tunneling over ipv6 tunnels: Allow IPv6 UDP checksums to be correctly controlled. net: dp83640: Fix tx timestamp overflow handling. gro: Make GRO aware of lightweight tunnels. af_iucv: Validate socket address length in iucv_sock_bind() Conflicts: arch/arm64/Makefile arch/arm64/include/asm/cacheflush.h drivers/mmc/host/sdhci.c drivers/usb/dwc3/ep0.c drivers/usb/dwc3/gadget.c kernel/module.c sound/core/pcm_compat.c CRs-Fixed: 1010239 Signed-off-by: Runmin Wang <runminw@codeaurora.org> Change-Id: I41a28636fc9ad91f9d979b191784609476294cdf	2016-07-12 11:40:49 -07:00
Wanpeng Li	cf73d8ad76	workqueue: fix rebind bound workers warning commit f7c17d26f43d5cc1b7a6b896cd2fa24a079739b9 upstream. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 16 at kernel/workqueue.c:4559 rebind_workers+0x1c0/0x1d0 Modules linked in: CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 4.6.0-rc4+ #31 Hardware name: IBM IBM System x3550 M4 Server -[7914IUW]-/00Y8603, BIOS -[D7E128FUS-1.40]- 07/23/2013 0000000000000000 ffff881037babb58 ffffffff8139d885 0000000000000010 0000000000000000 0000000000000000 0000000000000000 ffff881037babba8 ffffffff8108505d ffff881037ba0000 000011cf3e7d6e60 0000000000000046 Call Trace: dump_stack+0x89/0xd4 __warn+0xfd/0x120 warn_slowpath_null+0x1d/0x20 rebind_workers+0x1c0/0x1d0 workqueue_cpu_up_callback+0xf5/0x1d0 notifier_call_chain+0x64/0x90 ? trace_hardirqs_on_caller+0xf2/0x220 ? notify_prepare+0x80/0x80 __raw_notifier_call_chain+0xe/0x10 __cpu_notify+0x35/0x50 notify_down_prepare+0x5e/0x80 ? notify_prepare+0x80/0x80 cpuhp_invoke_callback+0x73/0x330 ? __schedule+0x33e/0x8a0 cpuhp_down_callbacks+0x51/0xc0 cpuhp_thread_fun+0xc1/0xf0 smpboot_thread_fn+0x159/0x2a0 ? smpboot_create_threads+0x80/0x80 kthread+0xef/0x110 ? wait_for_completion+0xf0/0x120 ? schedule_tail+0x35/0xf0 ret_from_fork+0x22/0x50 ? __init_kthread_worker+0x70/0x70 ---[ end trace eb12ae47d2382d8f ]--- notify_down_prepare: attempt to take down CPU 0 failed This bug can be reproduced by below config w/ nohz_full= all cpus: CONFIG_BOOTPARAM_HOTPLUG_CPU0=y CONFIG_DEBUG_HOTPLUG_CPU0=y CONFIG_NO_HZ_FULL=y As Thomas pointed out: \| If a down prepare callback fails, then DOWN_FAILED is invoked for all \| callbacks which have successfully executed DOWN_PREPARE. \| \| But, workqueue has actually two notifiers. One which handles \| UP/DOWN_FAILED/ONLINE and one which handles DOWN_PREPARE. \| \| Now look at the priorities of those callbacks: \| \| CPU_PRI_WORKQUEUE_UP = 5 \| CPU_PRI_WORKQUEUE_DOWN = -5 \| \| So the call order on DOWN_PREPARE is: \| \| CB 1 \| CB ... \| CB workqueue_up() -> Ignores DOWN_PREPARE \| CB ... \| CB X ---> Fails \| \| So we call up to CB X with DOWN_FAILED \| \| CB 1 \| CB ... \| CB workqueue_up() -> Handles DOWN_FAILED \| CB ... \| CB X-1 \| \| So the problem is that the workqueue stuff handles DOWN_FAILED in the up \| callback, while it should do it in the down callback. Which is not a good idea \| either because it wants to be called early on rollback... \| \| Brilliant stuff, isn't it? The hotplug rework will solve this problem because \| the callbacks become symetric, but for the existing mess, we need some \| workaround in the workqueue code. The boot CPU handles housekeeping duty(unbound timers, workqueues, timekeeping, ...) on behalf of full dynticks CPUs. It must remain online when nohz full is enabled. There is a priority set to every notifier_blocks: workqueue_cpu_up > tick_nohz_cpu_down > workqueue_cpu_down So tick_nohz_cpu_down callback failed when down prepare cpu 0, and notifier_blocks behind tick_nohz_cpu_down will not be called any more, which leads to workers are actually not unbound. Then hotplug state machine will fallback to undo and online cpu 0 again. Workers will be rebound unconditionally even if they are not unbound and trigger the warning in this progress. This patch fix it by catching !DISASSOCIATED to avoid rebind bound workers. Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Suggested-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-18 17:06:50 -07:00
Tejun Heo	e2b6ea208b	workqueue: implement lockup detector Workqueue stalls can happen from a variety of usage bugs such as missing WQ_MEM_RECLAIM flag or concurrency managed work item indefinitely staying RUNNING. These stalls can be extremely difficult to hunt down because the usual warning mechanisms can't detect workqueue stalls and the internal state is pretty opaque. To alleviate the situation, this patch implements workqueue lockup detector. It periodically monitors all worker_pools periodically and, if any pool failed to make forward progress longer than the threshold duration, triggers warning and dumps workqueue state as follows. BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s! Showing busy workqueues and worker pools: workqueue events: flags=0x0 pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256 pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent workqueue events_power_efficient: flags=0x80 pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 pending: check_lifetime, neigh_periodic_work workqueue cgroup_pidlist_destroy: flags=0x0 pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1 pending: cgroup_pidlist_destroy_work_fn ... The detection mechanism is controller through kernel parameter workqueue.watchdog_thresh and can be updated at runtime through the sysfs module parameter file. v2: Decoupled from softlockup control knobs. CRs-Fixed: 1007459 Change-Id: Id7dfbbd2701128a942b1bcac2299e07a66db8657 Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Andrew Morton <akpm@linux-foundation.org> Git-commit: 82607adcf9cdf40fb7b5331269780c8f70ec6e35 Git-repo: git://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git Signed-off-by: Trilok Soni <tsoni@codeaurora.org>	2016-05-05 15:05:53 -07:00
Roman Pen	2da9606aea	workqueue: fix ghost PENDING flag while doing MQ IO commit 346c09f80459a3ad97df1816d6d606169a51001a upstream. The bug in a workqueue leads to a stalled IO request in MQ ctx->rq_list with the following backtrace: [ 601.347452] INFO: task kworker/u129:5:1636 blocked for more than 120 seconds. [ 601.347574] Tainted: G O 4.4.5-1-storage+ #6 [ 601.347651] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 601.348142] kworker/u129:5 D ffff880803077988 0 1636 2 0x00000000 [ 601.348519] Workqueue: ibnbd_server_fileio_wq ibnbd_dev_file_submit_io_worker [ibnbd_server] [ 601.348999] ffff880803077988 ffff88080466b900 ffff8808033f9c80 ffff880803078000 [ 601.349662] ffff880807c95000 7fffffffffffffff ffffffff815b0920 ffff880803077ad0 [ 601.350333] ffff8808030779a0 ffffffff815b01d5 0000000000000000 ffff880803077a38 [ 601.350965] Call Trace: [ 601.351203] [<ffffffff815b0920>] ? bit_wait+0x60/0x60 [ 601.351444] [<ffffffff815b01d5>] schedule+0x35/0x80 [ 601.351709] [<ffffffff815b2dd2>] schedule_timeout+0x192/0x230 [ 601.351958] [<ffffffff812d43f7>] ? blk_flush_plug_list+0xc7/0x220 [ 601.352208] [<ffffffff810bd737>] ? ktime_get+0x37/0xa0 [ 601.352446] [<ffffffff815b0920>] ? bit_wait+0x60/0x60 [ 601.352688] [<ffffffff815af784>] io_schedule_timeout+0xa4/0x110 [ 601.352951] [<ffffffff815b3a4e>] ? _raw_spin_unlock_irqrestore+0xe/0x10 [ 601.353196] [<ffffffff815b093b>] bit_wait_io+0x1b/0x70 [ 601.353440] [<ffffffff815b056d>] __wait_on_bit+0x5d/0x90 [ 601.353689] [<ffffffff81127bd0>] wait_on_page_bit+0xc0/0xd0 [ 601.353958] [<ffffffff81096db0>] ? autoremove_wake_function+0x40/0x40 [ 601.354200] [<ffffffff81127cc4>] __filemap_fdatawait_range+0xe4/0x140 [ 601.354441] [<ffffffff81127d34>] filemap_fdatawait_range+0x14/0x30 [ 601.354688] [<ffffffff81129a9f>] filemap_write_and_wait_range+0x3f/0x70 [ 601.354932] [<ffffffff811ced3b>] blkdev_fsync+0x1b/0x50 [ 601.355193] [<ffffffff811c82d9>] vfs_fsync_range+0x49/0xa0 [ 601.355432] [<ffffffff811cf45a>] blkdev_write_iter+0xca/0x100 [ 601.355679] [<ffffffff81197b1a>] __vfs_write+0xaa/0xe0 [ 601.355925] [<ffffffff81198379>] vfs_write+0xa9/0x1a0 [ 601.356164] [<ffffffff811c59d8>] kernel_write+0x38/0x50 The underlying device is a null_blk, with default parameters: queue_mode = MQ submit_queues = 1 Verification that nullb0 has something inflight: root@pserver8:~# cat /sys/block/nullb0/inflight 0 1 root@pserver8:~# find /sys/block/nullb0/mq/0/cpu* -name rq_list -print -exec cat {} \; ... /sys/block/nullb0/mq/0/cpu2/rq_list CTX pending: ffff8838038e2400 ... During debug it became clear that stalled request is always inserted in the rq_list from the following path: save_stack_trace_tsk + 34 blk_mq_insert_requests + 231 blk_mq_flush_plug_list + 281 blk_flush_plug_list + 199 wait_on_page_bit + 192 __filemap_fdatawait_range + 228 filemap_fdatawait_range + 20 filemap_write_and_wait_range + 63 blkdev_fsync + 27 vfs_fsync_range + 73 blkdev_write_iter + 202 __vfs_write + 170 vfs_write + 169 kernel_write + 56 So blk_flush_plug_list() was called with from_schedule == true. If from_schedule is true, that means that finally blk_mq_insert_requests() offloads execution of __blk_mq_run_hw_queue() and uses kblockd workqueue, i.e. it calls kblockd_schedule_delayed_work_on(). That means, that we race with another CPU, which is about to execute __blk_mq_run_hw_queue() work. Further debugging shows the following traces from different CPUs: CPU#0 CPU#1 ---------------------------------- ------------------------------- reqeust A inserted STORE hctx->ctx_map[0] bit marked kblockd_schedule...() returns 1 <schedule to kblockd workqueue> request B inserted STORE hctx->ctx_map[1] bit marked kblockd_schedule...() returns 0 * WORK PENDING bit is cleared * flush_busy_ctxs() is executed, but bit 1, set by CPU#1, is not observed As a result request B pended forever. This behaviour can be explained by speculative LOAD of hctx->ctx_map on CPU#0, which is reordered with clear of PENDING bit and executed _before_ actual STORE of bit 1 on CPU#1. The proper fix is an explicit full barrier <mfence>, which guarantees that clear of PENDING bit is to be executed before all possible speculative LOADS or STORES inside actual work function. Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Cc: Gioh Kim <gi-oh.kim@profitbricks.com> Cc: Michael Wang <yun.wang@profitbricks.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-04 14:48:49 -07:00
Syed Rameez Mustafa	e2cddd1040	kernel/lib: add additional debug capabilites for data corruption Data corruptions in the kernel often end up in system crashes that are easier to debug closer to the time of detection. Specifically, if we do not panic immediately after lock or list corruptions have been detected, the problem context is lost in the ensuing system mayhem. Add support for allowing system crash immediately after such corruptions are detected. The CONFIG option controls the enabling/disabling of the feature. Change-Id: I9b2eb62da506a13007acff63e85e9515145909ff Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [abhimany: minor merge conflict resolution] Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>	2016-03-22 11:16:29 -07:00
Tejun Heo	6684710434	Revert "workqueue: make sure delayed work run in local cpu" commit 041bd12e272c53a35c54c13875839bcb98c999ce upstream. This reverts commit `874bbfe600`. Workqueue used to implicity guarantee that work items queued without explicit CPU specified are put on the local CPU. Recent changes in timer broke the guarantee and led to vmstat breakage which was fixed by `176bed1de5` ("vmstat: explicitly schedule per-cpu work on the CPU we need it to run on"). vmstat is the most likely to expose the issue and it's quite possible that there are other similar problems which are a lot more difficult to trigger. As a preventive measure, `874bbfe600` ("workqueue: make sure delayed work run in local cpu") was applied to restore the local CPU guarnatee. Unfortunately, the change exposed a bug in timer code which got fixed by `22b886dd10` ("timers: Use proper base migration in add_timer_on()"). Due to code restructuring, the commit couldn't be backported beyond certain point and stable kernels which only had `874bbfe600` started crashing. The local CPU guarantee was accidental more than anything else and we want to get rid of it anyway. As, with the vmstat case fixed, `874bbfe600` is causing more problems than it's fixing, it has been decided to take the chance and officially break the guarantee by reverting the commit. A debug feature will be added to force foreign CPU assignment to expose cases relying on the guarantee and fixes for the individual cases will be backported to stable as necessary. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `874bbfe600` ("workqueue: make sure delayed work run in local cpu") Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Daniel Bilik <daniel.bilik@neosystem.cz> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shli@fb.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Bilik <daniel.bilik@neosystem.cz> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-03 15:07:27 -08:00
Tejun Heo	21b34b4574	workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup commit d6e022f1d207a161cd88e08ef0371554680ffc46 upstream. When looking up the pool_workqueue to use for an unbound workqueue, workqueue assumes that the target CPU is always bound to a valid NUMA node. However, currently, when a CPU goes offline, the mapping is destroyed and cpu_to_node() returns NUMA_NO_NODE. This has always been broken but hasn't triggered often enough before `874bbfe600` ("workqueue: make sure delayed work run in local cpu"). After the commit, workqueue forcifully assigns the local CPU for delayed work items without explicit target CPU to fix a different issue. This widens the window where CPU can go offline while a delayed work item is pending causing delayed work items dispatched with target CPU set to an already offlined CPU. The resulting NUMA_NO_NODE mapping makes workqueue try to queue the work item on a NULL pool_workqueue and thus crash. While `874bbfe600` has been reverted for a different reason making the bug less visible again, it can still happen. Fix it by mapping NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node(). This is a temporary workaround. The long term solution is keeping CPU -> NODE mapping stable across CPU off/online cycles which is being worked on. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-03 15:07:27 -08:00
Linus Torvalds	e25ac7ddaa	Merge branch 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue update from Tejun Heo: "This pull request contains one patch to make an unbound worker pool allocated from the NUMA node containing it if such node exists. As unbound worker pools are node-affine by default, this makes most pools allocated on the right node" * 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Allocate the unbound pool using local node memory	2015-11-05 14:16:27 -08:00
Xunlei Pang	e2273584d3	workqueue: Allocate the unbound pool using local node memory Currently, get_unbound_pool() uses kzalloc() to allocate the worker pool. Actually, we can use the right node to do the allocation, achieving local memory access. This patch selects target node first, and uses kzalloc_node() instead. Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-10-12 12:17:31 -04:00
Shaohua Li	874bbfe600	workqueue: make sure delayed work run in local cpu My system keeps crashing with below message. vmstat_update() schedules a delayed work in current cpu and expects the work runs in the cpu. schedule_delayed_work() is expected to make delayed work run in local cpu. The problem is timer can be migrated with NO_HZ. __queue_work() queues work in timer handler, which could run in a different cpu other than where the delayed work is scheduled. The end result is the delayed work runs in different cpu. The patch makes __queue_delayed_work records local cpu earlier. Where the timer runs doesn't change where the work runs with the change. [ 28.010131] ------------[ cut here ]------------ [ 28.010609] kernel BUG at ../mm/vmstat.c:1392! [ 28.011099] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN [ 28.011860] Modules linked in: [ 28.012245] CPU: 0 PID: 289 Comm: kworker/0:3 Tainted: G W4.3.0-rc3+ #634 [ 28.013065] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014 [ 28.014160] Workqueue: events vmstat_update [ 28.014571] task: ffff880117682580 ti: ffff8800ba428000 task.ti: ffff8800ba428000 [ 28.015445] RIP: 0010:[<ffffffff8115f921>] [<ffffffff8115f921>]vmstat_update+0x31/0x80 [ 28.016282] RSP: 0018:ffff8800ba42fd80 EFLAGS: 00010297 [ 28.016812] RAX: 0000000000000000 RBX: ffff88011a858dc0 RCX:0000000000000000 [ 28.017585] RDX: ffff880117682580 RSI: ffffffff81f14d8c RDI:ffffffff81f4df8d [ 28.018366] RBP: ffff8800ba42fd90 R08: 0000000000000001 R09:0000000000000000 [ 28.019169] R10: 0000000000000000 R11: 0000000000000121 R12:ffff8800baa9f640 [ 28.019947] R13: ffff88011a81e340 R14: ffff88011a823700 R15:0000000000000000 [ 28.020071] FS: 0000000000000000(0000) GS:ffff88011a800000(0000)knlGS:0000000000000000 [ 28.020071] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 28.020071] CR2: 00007ff6144b01d0 CR3: 00000000b8e93000 CR4:00000000000006f0 [ 28.020071] Stack: [ 28.020071] ffff88011a858dc0 ffff8800baa9f640 ffff8800ba42fe00ffffffff8106bd88 [ 28.020071] ffffffff8106bd0b 0000000000000096 0000000000000000ffffffff82f9b1e8 [ 28.020071] ffffffff829f0b10 0000000000000000 ffffffff81f18460ffff88011a81e340 [ 28.020071] Call Trace: [ 28.020071] [<ffffffff8106bd88>] process_one_work+0x1c8/0x540 [ 28.020071] [<ffffffff8106bd0b>] ? process_one_work+0x14b/0x540 [ 28.020071] [<ffffffff8106c214>] worker_thread+0x114/0x460 [ 28.020071] [<ffffffff8106c100>] ? process_one_work+0x540/0x540 [ 28.020071] [<ffffffff81071bf8>] kthread+0xf8/0x110 [ 28.020071] [<ffffffff81071b00>] ?kthread_create_on_node+0x200/0x200 [ 28.020071] [<ffffffff81a6522f>] ret_from_fork+0x3f/0x70 [ 28.020071] [<ffffffff81071b00>] ?kthread_create_on_node+0x200/0x200 Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v2.6.31+	2015-09-30 13:06:46 -04:00
Linus Torvalds	7d3e2eb178	Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue updates from Tejun Heo: "Only three trivial changes for workqueue this time - doc, MAINTAINERS and EXPORT_SYMBOL updates" * 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix some docbook warnings workqueue: Make flush_workqueue() available again to non GPL modules workqueue: add myself as a dedicated reviwer	2015-09-02 08:02:20 -07:00
Linus Torvalds	a1d8561172	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The biggest change in this cycle is the rewrite of the main SMP load balancing metric: the CPU load/utilization. The main goal was to make the metric more precise and more representative - see the changelog of this commit for the gory details: `9d89c257df` ("sched/fair: Rewrite runnable load and utilization average tracking") It is done in a way that significantly reduces complexity of the code: 5 files changed, 249 insertions(+), 494 deletions(-) and the performance testing results are encouraging. Nevertheless we need to keep an eye on potential regressions, since this potentially affects every SMP workload in existence. This work comes from Yuyang Du. Other changes: - SCHED_DL updates. (Andrea Parri) - Simplify architecture callbacks by removing finish_arch_switch(). (Peter Zijlstra et al) - cputime accounting: guarantee stime + utime == rtime. (Peter Zijlstra) - optimize idle CPU wakeups some more - inspired by Facebook server loads. (Mike Galbraith) - stop_machine fixes and updates. (Oleg Nesterov) - Introduce the 'trace_sched_waking' tracepoint. (Peter Zijlstra) - sched/numa tweaks. (Srikar Dronamraju) - misc fixes and small cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) sched/deadline: Fix comment in enqueue_task_dl() sched/deadline: Fix comment in push_dl_tasks() sched: Change the sched_class::set_cpus_allowed() calling context sched: Make sched_class::set_cpus_allowed() unconditional sched: Fix a race between __kthread_bind() and sched_setaffinity() sched: Ensure a task has a non-normalized vruntime when returning back to CFS sched/numa: Fix NUMA_DIRECT topology identification tile: Reorganize _switch_to() sched, sparc32: Update scheduler comments in copy_thread() sched: Remove finish_arch_switch() sched, tile: Remove finish_arch_switch sched, sh: Fold finish_arch_switch() into switch_to() sched, score: Remove finish_arch_switch() sched, avr32: Remove finish_arch_switch() sched, MIPS: Get rid of finish_arch_switch() sched, arm: Remove finish_arch_switch() sched/fair: Clean up load average references sched/fair: Provide runnable_load_avg back to cfs_rq sched/fair: Remove task and group entity load when they are dead sched/fair: Init cfs_rq's sched_entity load average ...	2015-08-31 20:26:22 -07:00
Peter Zijlstra	25834c73f9	sched: Fix a race between __kthread_bind() and sched_setaffinity() Because sched_setscheduler() checks p->flags & PF_NO_SETAFFINITY without locks, a caller might observe an old value and race with the set_cpus_allowed_ptr() call from __kthread_bind() and effectively undo it: __kthread_bind() do_set_cpus_allowed() <SYSCALL> sched_setaffinity() if (p->flags & PF_NO_SETAFFINITIY) set_cpus_allowed_ptr() p->flags \|= PF_NO_SETAFFINITY Fix the bug by putting everything under the regular scheduler locks. This also closes a hole in the serialization of task_struct::{nr_,}cpus_allowed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dedekind1@gmail.com Cc: juri.lelli@arm.com Cc: mgorman@suse.de Cc: riel@redhat.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/20150515154833.545640346@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-12 12:06:09 +02:00
Tim Gardner	1dadafa86a	workqueue: Make flush_workqueue() available again to non GPL modules Commit `37b1ef31a5` ("workqueue: move flush_scheduled_work() to workqueue.h") moved the exported non GPL flush_scheduled_work() from a function to an inline wrapper. Unfortunately, it directly calls flush_workqueue() which is a GPL function. This has the effect of changing the licensing requirement for this function and makes it unavailable to non GPL modules. See commit `ad7b1f841f` ("workqueue: Make schedule_work() available again to non GPL modules") for precedent. Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-08-04 14:04:54 -04:00
Paul E. McKenney	f78f5b90c4	rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN() This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for consistency with the WARN() series of macros. This also requires inverting the sense of the conditional, which this commit also does. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Ingo Molnar <mingo@kernel.org>	2015-07-22 15:27:32 -07:00
Linus Torvalds	02201e3f1b	Minor merge needed, due to function move. Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too. Cheers, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVkgKHAAoJENkgDmzRrbjxQpwQAJVmBN6jF3SnwbQXv9vRixjH 58V33sb1G1RW+kXxQ3/e8jLX/4VaN479CufruXQp+IJWXsN/CH0lbC3k8m7u50d7 b1Zeqd/Yrh79rkc11b0X1698uGCSMlzz+V54Z0QOTEEX+nSu2ZZvccFS4UaHkn3z rqDo00lb7rxQz8U25qro2OZrG6D3ub2q20TkWUB8EO4AOHkPn8KWP2r429Axrr0K wlDWDTTt8/IsvPbuPf3T15RAhq1avkMXWn9nDXDjyWbpLfTn8NFnWmtesgY7Jl4t GjbXC5WYekX3w2ZDB9KaT/DAMQ1a7RbMXNSz4RX4VbzDl+yYeSLmIh2G9fZb1PbB PsIxrOgy4BquOWsJPm+zeFPSC3q9Cfu219L4AmxSjiZxC3dlosg5rIB892Mjoyv4 qxmg6oiqtc4Jxv+Gl9lRFVOqyHZrTC5IJ+xgfv1EyP6kKMUKLlDZtxZAuQxpUyxR HZLq220RYnYSvkWauikq4M8fqFM8bdt6hLJnv7bVqllseROk9stCvjSiE3A9szH5 OgtOfYV5GhOeb8pCZqJKlGDw+RoJ21jtNCgOr6DgkNKV9CX/kL/Puwv8gnA0B0eh dxCeB7f/gcLl7Cg3Z3gVVcGlgak6JWrLf5ITAJhBZ8Lv+AtL2DKmwEWS/iIMRmek tLdh/a9GiCitqS0bT7GE =tWPQ -----END PGP SIGNATURE----- Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "Main excitement here is Peter Zijlstra's lockless rbtree optimization to speed module address lookup. He found some abusers of the module lock doing that too. A little bit of parameter work here too; including Dan Streetman's breaking up the big param mutex so writing a parameter can load another module (yeah, really). Unfortunately that broke the usual suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were appended too" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits) modules: only use mod->param_lock if CONFIG_MODULES param: fix module param locks when !CONFIG_SYSFS. rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() module: add per-module param_lock module: make perm const params: suppress unused variable error, warn once just in case code changes. modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'. kernel/module.c: avoid ifdefs for sig_enforce declaration kernel/workqueue.c: remove ifdefs over wq_power_efficient kernel/params.c: export param_ops_bool_enable_only kernel/params.c: generalize bool_enable_only kernel/module.c: use generic module param operaters for sig_enforce kernel/params: constify struct kernel_param_ops uses sysfs: tightened sysfs permission checks module: Rework module_addr_{min,max} module: Use __module_address() for module_address_lookup() module: Make the mod_tree stuff conditional on PERF_EVENTS \|\| TRACING module: Optimize __module_address() using a latched RB-tree rbtree: Implement generic latch_tree seqlock: Introduce raw_read_seqcount_latch() ...	2015-07-01 10:49:25 -07:00
Shailendra Verma	402dd89d6c	workqueue: fix typos in comments tj: dropped iff -> if, iff is if and only if not a typo. Spotted by Randy Dunlap. Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org>	2015-05-29 09:20:01 -04:00
Luis R. Rodriguez	552f530cbc	kernel/workqueue.c: remove ifdefs over wq_power_efficient We can avoid an ifdef over wq_power_efficient's declaration by just using IS_ENABLED(). Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: cocci@systeme.lip6.fr Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2015-05-28 11:32:12 +09:30
Lai Jiangshan	37b1ef31a5	workqueue: move flush_scheduled_work() to workqueue.h flush_scheduled_work() is just a simple call to flush_work(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-21 17:26:22 -04:00
Lai Jiangshan	899a94fe15	workqueue: remove the lock from wq_sysfs_prep_attrs() Reading to wq->unbound_attrs requires protection of either wq_pool_mutex or wq->mutex, and wq_sysfs_prep_attrs() is called with wq_pool_mutex held, so we don't need to grab wq->mutex here. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-21 17:26:22 -04:00
Lai Jiangshan	da7f91b2e2	workqueue: remove the declaration of copy_workqueue_attrs() This pre-declaration was unneeded since a previous refactor patch `6ba94429c8` ("workqueue: Reorder sysfs code"). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-21 17:26:22 -04:00
Lai Jiangshan	d4d3e25797	workqueue: ensure attrs changes are properly synchronized Current modification to attrs via sysfs is not fully synchronized. Process A (change cpumask) \| Process B (change numa affinity) wq_cpumask_store() \| wq_sysfs_prep_attrs() \| \| apply_workqueue_attrs() apply_workqueue_attrs() \| It results that the Process B's operation is totally reverted without any notification, it is a buggy behavior. So this patch moves wq_sysfs_prep_attrs() into the protection under wq_pool_mutex to ensure attrs changes are properly synchronized. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-19 17:37:00 -04:00
Lai Jiangshan	a0111cf671	workqueue: separate out and refactor the locking of applying attrs Applying attrs requires two locks: get_online_cpus() and wq_pool_mutex, and this code is duplicated at two places (apply_workqueue_attrs() and workqueue_set_unbound_cpumask()). So we separate out this locking code into apply_wqattrs_[un]lock() and do a minor refactor on apply_workqueue_attrs(). The apply_wqattrs_[un]lock() will be also used on later patch for ensuring attrs changes are properly synchronized. tj: minor updates to comments Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-19 17:37:00 -04:00
Lai Jiangshan	f7142ed483	workqueue: simplify wq_update_unbound_numa() wq_update_unbound_numa() is known be called with wq_pool_mutex held. But wq_update_unbound_numa() requests wq->mutex before reading wq->unbound_attrs, wq->numa_pwq_tbl[] and wq->dfl_pwq. But these fields were changed to be allowed being read with wq_pool_mutex held. So we simply remove the mutex_lock(&wq->mutex). Without the dependence on the the mutex_lock(&wq->mutex), the test of wq->unbound_attrs->no_numa can also be moved upward. The old code need a long comment to describe the stableness of @wq->unbound_attrs which is also guaranteed by wq_pool_mutex now, so we don't need this such comment. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-18 16:22:57 -04:00
Lai Jiangshan	5b95e1af8d	workqueue: wq_pool_mutex protects the attrs-installation Current wq_pool_mutex doesn't proctect the attrs-installation, it results that ->unbound_attrs, ->numa_pwq_tbl[] and ->dfl_pwq can only be accessed under wq->mutex and causes some inconveniences. Example, wq_update_unbound_numa() has to acquire wq->mutex before fetching the wq->unbound_attrs->no_numa and the old_pwq. attrs-installation is a short operation, so this change will no cause any latency for other operations which also acquire the wq_pool_mutex. The only unprotected attrs-installation code is in apply_workqueue_attrs(), so this patch touches code less than comments. It is also a preparation patch for next several patches which read wq->unbound_attrs, wq->numa_pwq_tbl[] and wq->dfl_pwq with only wq_pool_mutex held. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-18 16:22:56 -04:00
Chen Hanxiao	b749b1b673	workqueue: fix a typo s/detemined/determined Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-13 10:23:56 -04:00
Gong Zhaogang	30186c6fdc	workqueue: function name in the comment differs from the real function name modify wq_calc_node_mask to wq_calc_node_cpumask Signed-off-by: Gong Zhaogang <gongzhaogang@inspur.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-05-11 11:03:34 -04:00
Lai Jiangshan	042f7df15a	workqueue: Allow modifying low level unbound workqueue cpumask Allow to modify the low-level unbound workqueues cpumask through sysfs. This is performed by traversing the entire workqueue list and calling apply_wqattrs_prepare() on the unbound workqueues with the new low level mask. Only after all the preparation are done, we commit them all together. Ordered workqueues are ignored from the low level unbound workqueue cpumask, it will be handled in near future. All the (default & per-node) pwqs are mandatorily controlled by the low level cpumask. If the user configured cpumask doesn't overlap with the low level cpumask, the low level cpumask will be used for the wq instead. The comment of wq_calc_node_cpumask() is updated and explicitly requires that its first argument should be the attrs of the default pwq. The default wq_unbound_cpumask is cpu_possible_mask. The workqueue subsystem doesn't know its best default value, let the system manager or the other subsystem set it when needed. Changed from V8: merge the calculating code for the attrs of the default pwq together. minor change the code&comments for saving the user configured attrs. remove unnecessary list_del(). minor update the comment of wq_calc_node_cpumask(). update the comment of workqueue_set_unbound_cpumask(); Cc: Christoph Lameter <cl@linux.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mike Galbraith <bitbucket@online.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-04-30 09:24:29 -04:00
Frederic Weisbecker	b05a79280b	workqueue: Create low-level unbound workqueues cpumask Create a cpumask that limits the affinity of all unbound workqueues. This cpumask is controlled through a file at the root of the workqueue sysfs directory. It works on a lower-level than the per WQ_SYSFS workqueues cpumask files such that the effective cpumask applied for a given unbound workqueue is the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and the new /sys/devices/virtual/workqueue/cpumask file. This patch implements the basic infrastructure and the read interface. wq_unbound_cpumask is initially set to cpu_possible_mask. Cc: Christoph Lameter <cl@linux.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mike Galbraith <bitbucket@online.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-04-27 11:13:40 -04:00
Lai Jiangshan	2d5f0764b5	workqueue: split apply_workqueue_attrs() into 3 stages Current apply_workqueue_attrs() includes pwqs-allocation and pwqs-installation, so when we batch multiple apply_workqueue_attrs()s as a transaction, we can't ensure the transaction must succeed or fail as a complete unit. To solve this, we split apply_workqueue_attrs() into three stages. The first stage does the preparation: allocation memory, pwqs. The second stage does the attrs-installaion and pwqs-installation. The third stage frees the allocated memory and (old or unused) pwqs. As the result, batching multiple apply_workqueue_attrs()s can succeed or fail as a complete unit: 1) batch do all the first stage for all the workqueues 2) only commit all when all the above succeed. This patch is a preparation for the next patch ("Allow modifying low level unbound workqueue cpumask") which will do a multiple apply_workqueue_attrs(). The patch doesn't have functionality changed except two minor adjustment: 1) free_unbound_pwq() for the error path is removed, we use the heavier version put_pwq_unlocked() instead since the error path is rare. this adjustment simplifies the code. 2) the memory-allocation is also moved into wq_pool_mutex. this is needed to avoid to do the further splitting. tj: minor updates to comments. Suggested-by: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mike Galbraith <bitbucket@online.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-04-27 11:13:40 -04:00
Frederic Weisbecker	6ba94429c8	workqueue: Reorder sysfs code The sysfs code usually belongs to the botom of the file since it deals with high level objects. In the workqueue code it's misplaced and such that we'll need to work around functions references to allow the sysfs code to call APIs like apply_workqueue_attrs(). Lets move that block further in the file, almost the botom. And declare workqueue_sysfs_unregister() just before destroy_workqueue() which reference it. tj: Moved workqueue_sysfs_unregister() forward declaration where other forward declarations are. Suggested-by: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mike Galbraith <bitbucket@online.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2015-04-06 11:16:04 -04:00
Tejun Heo	3494fc3084	workqueue: dump workqueues on sysrq-t Workqueues are used extensively throughout the kernel but sometimes it's difficult to debug stalls involving work items because visibility into its inner workings is fairly limited. Although sysrq-t task dump annotates each active worker task with the information on the work item being executed, it is challenging to find out which work items are pending or delayed on which queues and how pools are being managed. This patch implements show_workqueue_state() which dumps all busy workqueues and pools and is called from the sysrq-t handler. At the end of sysrq-t dump, something like the following is printed. Showing busy workqueues and worker pools: ... workqueue filler_wq: flags=0x0 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=2/256 in-flight: 491:filler_workfn, 507:filler_workfn pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 in-flight: 501:filler_workfn pending: filler_workfn ... workqueue test_wq: flags=0x8 pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1 in-flight: 510(RESCUER):test_workfn BAR(69) BAR(500) delayed: test_workfn1 BAR(492), test_workfn2 ... pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 137 pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=3 manager: 469 pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 idle: 16 pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 62 The above shows that test_wq is executing test_workfn() on pid 510 which is the rescuer and also that there are two tasks 69 and 500 waiting for the work item to finish in flush_work(). As test_wq has max_active of 1, there are two work items for test_workfn1() and test_workfn2() which are delayed till the current work item is finished. In addition, pid 492 is flushing test_workfn1(). The work item for test_workfn() is being executed on pwq of pool 2 which is the normal priority per-cpu pool for CPU 1. The pool has three workers, two of which are executing filler_workfn() for filler_wq and the last one is assuming the manager role trying to create more workers. This extra workqueue state dump will hopefully help chasing down hangs involving workqueues. v3: cpulist_pr_cont() replaced with "%*pbl" printf formatting. v2: As suggested by Andrew, minor formatting change in pr_cont_work(), printk()'s replaced with pr_info()'s, and cpumask printing now uses cpulist_pr_cont(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> CC: Ingo Molnar <mingo@redhat.com>	2015-03-09 09:22:28 -04:00
Tejun Heo	2607d7a6db	workqueue: keep track of the flushing task and pool manager Add wq_barrier->task and worker_pool->manager to keep track of the flushing task and pool manager respectively. These are purely informational and will be used to implement sysrq dump of workqueues. Signed-off-by: Tejun Heo <tj@kernel.org>	2015-03-09 09:22:28 -04:00
Tejun Heo	e2dca7adff	workqueue: make the workqueues list RCU walkable The workqueues list is protected by wq_pool_mutex and a workqueue and its subordinate data structures are freed directly on destruction. We want to add the ability dump workqueues from a sysrq callback which requires walking all workqueues without grabbing wq_pool_mutex. This patch makes freeing of workqueues RCU protected and makes the workqueues list walkable while holding RCU read lock. Note that pool_workqueues and pools are already sched-RCU protected. For consistency, workqueues are also protected with sched-RCU. While at it, reverse the workqueues list so that a workqueue which is created earlier comes before. The order of the list isn't significant functionally but this makes the planned sysrq dump list system workqueues first. Signed-off-by: Tejun Heo <tj@kernel.org>	2015-03-09 09:22:28 -04:00
Tejun Heo	8603e1b300	workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE cancel[_delayed]_work_sync() are implemented using __cancel_work_timer() which grabs the PENDING bit using try_to_grab_pending() and then flushes the work item with PENDING set to prevent the on-going execution of the work item from requeueing itself. try_to_grab_pending() can always grab PENDING bit without blocking except when someone else is doing the above flushing during cancelation. In that case, try_to_grab_pending() returns -ENOENT. In this case, __cancel_work_timer() currently invokes flush_work(). The assumption is that the completion of the work item is what the other canceling task would be waiting for too and thus waiting for the same condition and retrying should allow forward progress without excessive busy looping Unfortunately, this doesn't work if preemption is disabled or the latter task has real time priority. Let's say task A just got woken up from flush_work() by the completion of the target work item. If, before task A starts executing, task B gets scheduled and invokes __cancel_work_timer() on the same work item, its try_to_grab_pending() will return -ENOENT as the work item is still being canceled by task A and flush_work() will also immediately return false as the work item is no longer executing. This puts task B in a busy loop possibly preventing task A from executing and clearing the canceling state on the work item leading to a hang. task A task B worker executing work __cancel_work_timer() try_to_grab_pending() set work CANCELING flush_work() block for work completion completion, wakes up A __cancel_work_timer() while (forever) { try_to_grab_pending() -ENOENT as work is being canceled flush_work() false as work is no longer executing } This patch removes the possible hang by updating __cancel_work_timer() to explicitly wait for clearing of CANCELING rather than invoking flush_work() after try_to_grab_pending() fails with -ENOENT. Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com v3: bit_waitqueue() can't be used for work items defined in vmalloc area. Switched to custom wake function which matches the target work item and exclusive wait and wakeup. v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if the target bit waitqueue has wait_bit_queue's on it. Use DEFINE_WAIT_BIT() and __wake_up_bit() instead. Reported by Tomeu Vizoso. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Rabin Vincent <rabin.vincent@axis.com> Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com> Cc: stable@vger.kernel.org Tested-by: Jesper Nilsson <jesper.nilsson@axis.com> Tested-by: Rabin Vincent <rabin.vincent@axis.com>	2015-03-05 08:04:13 -05:00
Tejun Heo	dfbcbf42dd	workqueue: use %pb[l] to format bitmaps including cpumasks and nodemasks printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-13 21:21:37 -08:00
Tejun Heo	29187a9eea	workqueue: fix subtle pool management issue which can stall whole worker_pool A worker_pool's forward progress is guaranteed by the fact that the last idle worker assumes the manager role to create more workers and summon the rescuers if creating workers doesn't succeed in timely manner before proceeding to execute work items. This manager role is implemented in manage_workers(), which indicates whether the worker may proceed to work item execution with its return value. This is necessary because multiple workers may contend for the manager role, and, if there already is a manager, others should proceed to work item execution. Unfortunately, the function also indicates that the worker may proceed to work item execution if need_to_create_worker() is false at the head of the function. need_to_create_worker() tests the following conditions. pending work items && !nr_running && !nr_idle The first and third conditions are protected by pool->lock and thus won't change while holding pool->lock; however, nr_running can change asynchronously as other workers block and resume and while it's likely to be zero, as someone woke this worker up in the first place, some other workers could have become runnable inbetween making it non-zero. If this happens, manage_worker() could return false even with zero nr_idle making the worker, the last idle one, proceed to execute work items. If then all workers of the pool end up blocking on a resource which can only be released by a work item which is pending on that pool, the whole pool can deadlock as there's no one to create more workers or summon the rescuers. This patch fixes the problem by removing the early exit condition from maybe_create_worker() and making manage_workers() return false iff there's already another manager, which ensures that the last worker doesn't start executing work items. We can leave the early exit condition alone and just ignore the return value but the only reason it was put there is because the manage_workers() used to perform both creations and destructions of workers and thus the function may be invoked while the pool is trying to reduce the number of workers. Now that manage_workers() is called only when more workers are needed, the only case this early exit condition is triggered is rare race conditions rendering it pointless. Tested with simulated workload and modified workqueue code which trigger the pool deadlock reliably without this patch. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Eric Sandeen <sandeen@sandeen.net> Link: http://lkml.kernel.org/g/54B019F4.8030009@sandeen.net Cc: Dave Chinner <david@fromorbit.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: stable@vger.kernel.org	2015-01-16 14:21:16 -05:00
NeilBrown	008847f66c	workqueue: allow rescuer thread to do more work. When there is serious memory pressure, all workers in a pool could be blocked, and a new thread cannot be created because it requires memory allocation. In this situation a WQ_MEM_RECLAIM workqueue will wake up the rescuer thread to do some work. The rescuer will only handle requests that are already on ->worklist. If max_requests is 1, that means it will handle a single request. The rescuer will be woken again in 100ms to handle another max_requests requests. I've seen a machine (running a 3.0 based "enterprise" kernel) with thousands of requests queued for xfslogd, which has a max_requests of 1, and is needed for retiring all 'xfs' write requests. When one of the worker pools gets into this state, it progresses extremely slowly and possibly never recovers (only waited an hour or two). With this patch we leave a pool_workqueue on mayday list until it is clearly no longer in need of assistance. This allows all requests to be handled in a timely fashion. We keep each pool_workqueue on the mayday list until need_to_create_worker() is false, and no work for this workqueue is found in the pool. I have tested this in combination with a (hackish) patch which forces all work items to be handled by the rescuer thread. In that context it significantly improves performance. A similar patch for a 3.0 kernel significantly improved performance on a heavy work load. Thanks to Jan Kara for some design ideas, and to Dongsu Park for some comments and testing. tj: Inverted the lock order between wq_mayday_lock and pool->lock with a preceding patch and simplified this patch. Added comment and updated changelog accordingly. Dongsu spotted missing get_pwq() in the simplified code. Cc: Dongsu Park <dongsu.park@profitbricks.com> Cc: Jan Kara <jack@suse.cz> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-12-08 12:39:16 -05:00
Tejun Heo	b2d829096b	workqueue: invert the order between pool->lock and wq_mayday_lock Currently, pool->lock nests inside pool->lock. There's no inherent reason for this order. The only place where the two locks are held together is pool_mayday_timeout() and it just got decided that way. This nesting order turns out to complicate things with the planned rescuer_thread() update. Let's invert them. This doesn't cause any behavior differences. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: NeilBrown <neilb@suse.de> Cc: Dongsu Park <dongsu.park@profitbricks.com>	2014-12-08 12:39:16 -05:00
Tejun Heo	0479c8c549	workqueue: cosmetic update in rescuer_thread() rescuer_thread() caches &rescuer->scheduled in a local variable scheduled for convenience. There's one WARN_ON_ONCE() which was using &rescuer->scheduled directly. Replace it with the local variable. This patch causes no functional difference. Signed-off-by: Tejun Heo <tj@kernel.org>	2014-12-04 10:14:54 -05:00
Joe Lawrence	3e28e37720	workqueue: Use cond_resched_rcu_qs macro Tidy up and use cond_resched_rcu_qs when calling cond_resched and reporting potential quiescent state to RCU. Splitting this change in this way allows easy backporting to -stable for kernel versions not having cond_resched_rcu_qs(). Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2014-10-06 05:58:26 -07:00
Joe Lawrence	789cbbeca4	workqueue: Add quiescent state between work items Similar to the stop_machine deadlock scenario on !PREEMPT kernels addressed in `b22ce2785d` "workqueue: cond_resched() after processing each work item", kworker threads requeueing back-to-back with zero jiffy delay can stall RCU. The cond_resched call introduced in that fix will yield only iff there are other higher priority tasks to run, so force a quiescent RCU state between work items. Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com Fixes: `b22ce2785d` ("workqueue: cond_resched() after processing each work item") Cc: <stable@vger.kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2014-10-06 05:57:43 -07:00
Linus Torvalds	f2a84170ed	Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: - Major reorganization of percpu header files which I think makes things a lot more readable and logical than before. - percpu-refcount is updated so that it requires explicit destruction and can be reinitialized if necessary. This was pulled into the block tree to replace the custom percpu refcnting implemented in blk-mq. - In the process, percpu and percpu-refcount got cleaned up a bit * 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits) percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero() percpu-refcount: require percpu_ref to be exited explicitly percpu-refcount: use unsigned long for pcpu_count pointer percpu-refcount: add helpers for ->percpu_count accesses percpu-refcount: one bit is enough for REF_STATUS percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc() workqueue: stronger test in process_one_work() workqueue: clear POOL_DISASSOCIATED in rebind_workers() percpu: Use ALIGN macro instead of hand coding alignment calculation percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations percpu: preffity percpu header files percpu: use raw_cpu_() to define __this_cpu_() percpu: reorder macros in percpu header files percpu: move {raw\|this}_cpu_() definitions to include/linux/percpu-defs.h percpu: move generic {raw\|this}_cpu__N() definitions to include/asm-generic/percpu.h percpu: only allow sized arch overrides for {raw\|this}_cpu_*() ops percpu: reorganize include/linux/percpu-defs.h percpu: move accessors from include/linux/percpu.h to percpu-defs.h percpu: include/asm-generic/percpu.h should contain only arch-overridable parts percpu: introduce arch_raw_cpu_ptr() ...	2014-08-04 10:09:27 -07:00
Linus Torvalds	c4c3f5fba0	Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue updates from Tejun Heo: "Lai has been doing a lot of cleanups of workqueue and kthread_work. No significant behavior change. Just a lot of cleanups all over the place. Some are a bit invasive but overall nothing too dangerous" * 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: kthread_work: remove the unused wait_queue_head kthread_work: wake up worker only when the worker is idle workqueue: use nr_node_ids instead of wq_numa_tbl_len workqueue: remove the misnamed out_unlock label in get_unbound_pool() workqueue: remove the stale comment in pwq_unbound_release_workfn() workqueue: move rescuer pool detachment to the end workqueue: unfold start_worker() into create_worker() workqueue: remove @wakeup from worker_set_flags() workqueue: remove an unneeded UNBOUND test before waking up the next worker workqueue: wake regular worker if need_more_worker() when rescuer leave the pool workqueue: alloc struct worker on its local node workqueue: reuse the already calculated pwq in try_to_grab_pending() workqueue: stronger test in process_one_work() workqueue: clear POOL_DISASSOCIATED in rebind_workers() workqueue: sanity check pool->cpu in wq_worker_sleeping() workqueue: clear leftover flags when detached workqueue: remove useless WARN_ON_ONCE() workqueue: use schedule_timeout_interruptible() instead of open code workqueue: remove the empty check in too_many_workers() workqueue: use "pool->cpu < 0" to stand for an unbound pool	2014-08-04 10:04:44 -07:00
Lai Jiangshan	ddcb57e2ed	workqueue: use nr_node_ids instead of wq_numa_tbl_len They are the same and nr_node_ids is provided by the memory subsystem. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-07-22 12:10:39 -04:00
Lai Jiangshan	3fb1823c09	workqueue: remove the misnamed out_unlock label in get_unbound_pool() After the locking was moved up to the caller of the get_unbound_pool(), out_unlock label doesn't need to do any unlock operation and the name became bad, so we just remove this label, and the only usage-site "goto out_unlock" is subsituted to "return pool". Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-07-22 12:10:39 -04:00
Lai Jiangshan	29b1cb416a	workqueue: remove the stale comment in pwq_unbound_release_workfn() In `75ccf5950f` ("workqueue: prepare flush_workqueue() for dynamic creation and destrucion of unbound pool_workqueues"), a comment about the synchronization for the pwq in pwq_unbound_release_workfn() was added. The comment claimed the flush_mutex wasn't strictly necessary, it was correct in that time, due to the pwq was protected by workqueue_lock. But it is incorrect now since the wq->flush_mutex was renamed to wq->mutex and workqueue_lock was removed, the wq->mutex is strictly needed. But the comment was miss-updated when the synchronization was changed. This patch removes the incorrect comments and doesn't add any new comment to explain why wq->mutex is needed here, which is definitely obvious and wq->pwqs_node has "WQ" notation in its definition which is better comment. The old commit mentioned above also introduced a comment in link_pwq() about the synchronization. This comment is also removed in this patch since the whole link_pwq() is proteced by wq->mutex. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-07-22 12:10:39 -04:00
Lai Jiangshan	13b1d625ef	workqueue: move rescuer pool detachment to the end In `51697d3939` ("workqueue: use generic attach/detach routine for rescuers"), The rescuer detaches itself from the pool before put_pwq() so that the put_unbound_pool() will not destroy the rescuer-attached pool. It is unnecessary. worker_detach_from_pool() can be used as the last statement to access to the pool just like the regular workers, put_unbound_pool() will wait for it to detach and then free the pool. So we move the worker_detach_from_pool() down, make it coincide with the regular workers. tj: Minor description update. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-07-22 12:10:39 -04:00

1 2 3 4 5 ...