evie/android_kernel_oneplus_msm8998 - Gay Catgirls Forgejo: gay catgirls having sex

evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Trilok Soni	5ab1e18aa3	Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4" This reverts commit `9d6fd2c3e9` ("Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4"), because it breaks the dump parsing tools due to kernel can be loaded anywhere in the memory now and not fixed at linear mapping. Change-Id: Id416f0a249d803442847d09ac47781147b0d0ee6 Signed-off-by: Trilok Soni <tsoni@codeaurora.org>	2016-08-26 14:34:05 -07:00
Trilok Soni	9d6fd2c3e9	Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4 * msm-4.4/tmp-510d0a3f: Linux 4.4.11 nf_conntrack: avoid kernel pointer value leak in slab name drm/radeon: fix DP link training issue with second 4K monitor drm/i915/bdw: Add missing delay during L3 SQC credit programming drm/i915: Bail out of pipe config compute loop on LPT drm/radeon: fix PLL sharing on DCE6.1 (v2) Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing" Input: max8997-haptic - fix NULL pointer dereference get_rock_ridge_filename(): handle malformed NM entries tools lib traceevent: Do not reassign parg after collapse_tree() qla1280: Don't allocate 512kb of host tags atomic_open(): fix the handling of create_error regulator: axp20x: Fix axp22x ldo_io voltage ranges regulator: s2mps11: Fix invalid selector mask and voltages for buck9 workqueue: fix rebind bound workers warning ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC vfs: rename: check backing inode being equal vfs: add vfs_select_inode() helper perf/core: Disable the event on a truncated AUX record regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case pinctrl: at91-pio4: fix pull-up/down logic spi: spi-ti-qspi: Handle truncated frames properly spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT ALSA: hda - Fix broken reconfig ALSA: hda - Fix white noise on Asus UX501VW headset ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 ALSA: usb-audio: Yet another Phoneix Audio device quirk ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) crypto: testmgr - Use kmalloc memory for RSA input crypto: hash - Fix page length clamping in hash walk crypto: qat - fix invalid pf2vf_resp_wq logic s390/mm: fix asce_bits handling with dynamic pagetable levels zsmalloc: fix zs_can_compact() integer overflow ocfs2: fix posix_acl_create deadlock ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang net/route: enforce hoplimit max value tcp: refresh skb timestamp at retransmit time net: thunderx: avoid exposing kernel stack net: fix a kernel infoleak in x25 module uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h MIME-Version: 1.0 bridge: fix igmp / mld query parsing net: bridge: fix old ioctl unlocked net device walk VSOCK: do not disconnect socket when peer has shutdown SEND only net/mlx4_en: Fix endianness bug in IPV6 csum calculation net: fix infoleak in rtnetlink net: fix infoleak in llc net: fec: only clear a queue's work bit if the queue was emptied netem: Segment GSO packets on enqueue sch_dsmark: update backlog as well sch_htb: update backlog as well net_sched: update hierarchical backlog too net_sched: introduce qdisc_replace() helper gre: do not pull header in ICMP error processing net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case samples/bpf: fix trace_output example bpf: fix check_map_func_compatibility logic bpf: fix refcnt overflow bpf: fix double-fdput in replace_map_fd_with_map_ptr() net/mlx4_en: fix spurious timestamping callbacks ipv4/fib: don't warn when primary address is missing if in_dev is dead net/mlx5e: Fix minimum MTU net/mlx5e: Device's mtu field is u16 and not int openvswitch: use flow protocol when recalculating ipv6 checksums atl2: Disable unimplemented scatter/gather feature vlan: pull on __vlan_insert_tag error path and fix csum correction net: use skb_postpush_rcsum instead of own implementations cdc_mbim: apply "NDP to end" quirk to all Huawei devices bpf/verifier: reject invalid LD_ABS \| BPF_DW instruction net: sched: do not requeue a NULL skb packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface route: do not cache fib route info on local routes with oif decnet: Do not build routes to devices without decnet private data. parisc: Use generic extable search and sort routines arm64: kasan: Use actual memory node when populating the kernel image shadow arm64: mm: treat memstart_addr as a signed quantity arm64: lse: deal with clobbered IP registers after branch via PLT arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly arm64: kasan: Fix zero shadow mapping overriding kernel image shadow arm64: consistently use p?d_set_huge arm64: fix KASLR boot-time I-cache maintenance arm64: hugetlb: partial revert of 66b3923a1a0f arm64: make irq_stack_ptr more robust arm64: efi: invoke EFI_RNG_PROTOCOL to supply KASLR randomness efi: stub: use high allocation for converted command line efi: stub: add implementation of efi_random_alloc() efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOL arm64: kaslr: randomize the linear region arm64: add support for kernel ASLR arm64: add support for building vmlinux as a relocatable PIE binary arm64: switch to relative exception tables extable: add support for relative extables to search and sort routines scripts/sortextable: add support for ET_DYN binaries arm64: futex.h: Add missing PAN toggling arm64: make asm/elf.h available to asm files arm64: avoid dynamic relocations in early boot code arm64: avoid R_AARCH64_ABS64 relocations for Image header fields arm64: add support for module PLTs arm64: move brk immediate argument definitions to separate header arm64: mm: use bit ops rather than arithmetic in pa/va translations arm64: mm: only perform memstart_addr sanity check if DEBUG_VM arm64: User die() instead of panic() in do_page_fault() arm64: allow kernel Image to be loaded anywhere in physical memory arm64: defer __va translation of initrd_start and initrd_end arm64: move kernel image to base of vmalloc area arm64: kvm: deal with kernel symbols outside of linear mapping arm64: decouple early fixmap init from linear mapping arm64: pgtable: implement static [pte\|pmd\|pud]_offset variants arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region arm64: add support for ioremap() block mappings arm64: prevent potential circular header dependencies in asm/bug.h of/fdt: factor out assignment of initrd_start/initrd_end of/fdt: make memblock minimum physical address arch configurable arm64: Remove the get_thread_info() function arm64: kernel: Don't toggle PAN on systems with UAO arm64: cpufeature: Test 'matches' pointer to find the end of the list arm64: kernel: Add support for User Access Override arm64: add ARMv8.2 id_aa64mmfr2 boiler plate arm64: cpufeature: Change read_cpuid() to use sysreg's mrs_s macro arm64: use local label prefixes for __reg_num symbols arm64: vdso: Mark vDSO code as read-only arm64: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL arm64: ptdump: Indicate whether memory should be faulting arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC arm64: Drop alloc function from create_mapping arm64: prefetch: add missing #include for spin_lock_prefetch arm64: lib: patch in prfm for copy_page if requested arm64: lib: improve copy_page to deal with 128 bytes at a time arm64: prefetch: add alternative pattern for CPUs without a prefetcher arm64: prefetch: don't provide spin_lock_prefetch with LSE arm64: allow vmalloc regions to be set with set_memory_* arm64: kernel: implement ACPI parking protocol arm64: mm: create new fine-grained mappings at boot arm64: ensure _stext and _etext are page-aligned arm64: mm: allow passing a pgdir to alloc_init_* arm64: mm: allocate pagetables anywhere arm64: mm: use fixmap when creating page tables arm64: mm: add functions to walk tables in fixmap arm64: mm: add __{pud,pgd}_populate arm64: mm: avoid redundant __pa(__va(x)) arm64: mm: add functions to walk page tables by PA arm64: mm: move pte_* macros arm64: kasan: avoid TLB conflicts arm64: mm: add code to safely replace TTBR1_EL1 arm64: add function to install the idmap arm64: unmap idmap earlier arm64: unify idmap removal arm64: mm: place empty_zero_page in bss arm64: mm: specialise pagetable allocators asm-generic: Fix local variable shadow in __set_fixmap_offset Eliminate the .eh_frame sections from the aarch64 vmlinux and kernel modules arm64: Fix an enum typo in mm/dump.c arm64: kasan: ensure that the KASAN zero page is mapped read-only arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP arm64: hide __efistub_ aliases from kallsyms Linux 4.4.10 drm/i915/skl: Fix DMC load on Skylake J0 and K0 lib/test-string_helpers.c: fix and improve string_get_size() tests ACPI / processor: Request native thermal interrupt handling via _OSC drm/i915: Fake HDMI live status drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW drm/i915: Fix eDP low vswing for Broadwell drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume drm/radeon: make sure vertical front porch is at least 1 iio: ak8975: fix maybe-uninitialized warning iio: ak8975: Fix NULL pointer exception on early interrupt drm/amdgpu: set metadata pointer to NULL after freeing. drm/amdgpu: make sure vertical front porch is at least 1 gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading nvmem: mxs-ocotp: fix buffer overflow in read USB: serial: cp210x: add Straizona Focusers device ids USB: serial: cp210x: add ID for Link ECU ata: ahci-platform: Add ports-implemented DT bindings. libahci: save port map for forced port map powerpc: Fix bad inline asm constraint in create_zero_mask() ACPICA: Dispatcher: Update thread ID for recursive method calls x86/sysfb_efi: Fix valid BAR address range check ARC: Add missing io barriers to io{read,write}{16,32}be() ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value propogate_mnt: Handle the first propogated copy being a slave fs/pnode.c: treat zero mnt_group_id-s as unequal x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO MAINTAINERS: Remove asterisk from EFI directory names writeback: Fix performance regression in wb_over_bg_thresh() batman-adv: Reduce refcnt of removed router when updating route batman-adv: Fix broadcast/ogm queue limit on a removed interface batman-adv: Check skb size before using encapsulated ETH+VLAN header batman-adv: fix DAT candidate selection (must use vid) mm: update min_free_kbytes from khugepaged after core initialization proc: prevent accessing /proc/<PID>/environ until it's ready Input: zforce_ts - fix dual touch recognition HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk HID: wacom: Add support for DTK-1651 xen/evtchn: fix ring resize when binding new events xen/balloon: Fix crash when ballooning on x86 32 bit PAE xen: Fix page <-> pfn conversion on 32 bit systems ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel ARM: EXYNOS: Properly skip unitialized parent clock in power domain on mm/zswap: provide unique zpool name mm, cma: prevent nr_isolated_* counters from going negative Minimal fix-up of bad hashing behavior of hash_64() MD: make bio mergeable tracing: Don't display trigger file for events that can't be enabled mac80211: fix statistics leak if dev_alloc_name() fails ath9k: ar5008_hw_cmn_spur_mitigate: add missing mask_m & mask_p initialisation lpfc: fix misleading indentation clk: qcom: msm8960: Fix ce3_src register offset clk: versatile: sp810: support reentrance clk: qcom: msm8960: fix ce3_core clk enable register clk: meson: Fix meson_clk_register_clks() signature type mismatch clk: rockchip: free memory in error cases when registering clock branches soc: rockchip: power-domain: fix err handle while probing clk-divider: make sure read-only dividers do not write to their register CNS3xxx: Fix PCI cns3xxx_write_config() mwifiex: fix corner case association failure ata: ahci_xgene: dereferencing uninitialized pointer in probe nbd: ratelimit error msgs after socket close mfd: intel-lpss: Remove clock tree on error path ipvs: drop first packet to redirect conntrack ipvs: correct initial offset of Call-ID header search in SIP persistence engine ipvs: handle ip_vs_fill_iph_skb_off failure RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips Revert: "powerpc/tm: Check for already reclaimed tasks" arm64: head.S: use memset to clear BSS efi: stub: define DISABLE_BRANCH_PROFILING for all architectures arm64: entry: remove pointless SPSR mode check arm64: mm: move pgd_cache initialisation to pgtable_cache_init arm64: module: avoid undefined shift behavior in reloc_data() arm64: module: fix relocation of movz instruction with negative immediate arm64: traps: address fallout from printk -> pr_* conversion arm64: ftrace: fix a stack tracer's output under function graph tracer arm64: pass a task parameter to unwind_frame() arm64: ftrace: modify a stack frame in a safe way arm64: remove irq_count and do_softirq_own_stack() arm64: hugetlb: add support for PTE contiguous bit arm64: Use PoU cache instr for I/D coherency arm64: Defer dcache flush in __cpu_copy_user_page arm64: reduce stack use in irq_handler arm64: Documentation: add list of software workarounds for errata arm64: mm: place __cpu_setup in .text arm64: cmpxchg: Don't incldue linux/mmdebug.h arm64: mm: fold alternatives into .init arm64: Remove redundant padding from linker script arm64: mm: remove pointless PAGE_MASKing arm64: don't call C code with el0's fp register arm64: when walking onto the task stack, check sp & fp are in current->stack arm64: Add this_cpu_ptr() assembler macro for use in entry.S arm64: irq: fix walking from irq stack to task stack arm64: Add do_softirq_own_stack() and enable irq_stacks arm64: Modify stack trace and dump for use with irq_stack arm64: Store struct thread_info in sp_el0 arm64: Add trace_hardirqs_off annotation in ret_to_user arm64: ftrace: fix the comments for ftrace_modify_code arm64: ftrace: stop using kstop_machine to enable/disable tracing arm64: spinlock: serialise spin_unlock_wait against concurrent lockers arm64: enable HAVE_IRQ_TIME_ACCOUNTING arm64: fix COMPAT_SHMLBA definition for large pages arm64: add __init/__initdata section marker to some functions/variables arm64: pgtable: implement pte_accessible() arm64: mm: allow sections for unaligned bases arm64: mm: detect bad __create_mapping uses Linux 4.4.9 extcon: max77843: Use correct size for reading the interrupt register stm class: Select CONFIG_SRCU megaraid_sas: add missing curly braces in ioctl handler sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race thermal: rockchip: fix a impossible condition caused by the warning unbreak allmodconfig KCONFIG_ALLCONFIG=... jme: Fix device PM wakeup API usage jme: Do not enable NIC WoL functions on S0 bus: imx-weim: Take the 'status' property value into account ARM: dts: pxa: fix dma engine node to pxa3xx-nand ARM: dts: armada-375: use armada-370-sata for SATA ARM: EXYNOS: select THERMAL_OF ARM: prima2: always enable reset controller ARM: OMAP3: Add cpuidle parameters table for omap3430 ext4: fix races of writeback with punch hole and zero range ext4: fix races between buffered IO and collapse / insert range ext4: move unlocked dio protection from ext4_alloc_file_blocks() ext4: fix races between page faults and hole punching perf stat: Document --detailed option perf tools: handle spaces in file names obtained from /proc/pid/maps perf hists browser: Only offer symbol scripting when a symbol is under the cursor mtd: nand: Drop mtd.owner requirement in nand_scan mtd: brcmnand: Fix v7.1 register offsets mtd: spi-nor: remove micron_quad_enable() serial: sh-sci: Remove cpufreq notifier to fix crash/deadlock ext4: fix NULL pointer dereference in ext4_mark_inode_dirty() x86/mm/kmmio: Fix mmiotrace for hugepages perf evlist: Reference count the cpu and thread maps at set_maps() drivers/misc/ad525x_dpot: AD5274 fix RDAC read back errors rtc: max77686: Properly handle regmap_irq_get_virq() error code rtc: rx8025: remove rv8803 id rtc: ds1685: passing bogus values to irq_restore rtc: vr41xx: Wire up alarm_irq_enable rtc: hym8563: fix invalid year calculation PM / Domains: Fix removal of a subdomain PM / OPP: Initialize u_volt_min/max to a valid value misc: mic/scif: fix wrap around tests misc/bmp085: Enable building as a module lib/mpi: Endianness fix fbdev: da8xx-fb: fix videomodes of lcd panels scsi_dh: force modular build if SCSI is a module paride: make 'verbose' parameter an 'int' again regulator: s5m8767: fix get_register() error handling irqchip/mxs: Fix error check of of_io_request_and_map() irqchip/sunxi-nmi: Fix error check of of_io_request_and_map() spi/rockchip: Make sure spi clk is on in rockchip_spi_set_cs locking/mcs: Fix mcs_spin_lock() ordering regulator: core: Fix nested locking of supplies regulator: core: Ensure we lock all regulators regulator: core: fix regulator_lock_supply regression Revert "regulator: core: Fix nested locking of supplies" videobuf2-v4l2: Verify planes array in buffer dequeueing videobuf2-core: Check user space planes array in dqbuf USB: usbip: fix potential out-of-bounds write cgroup: make sure a parent css isn't freed before its children mm/hwpoison: fix wrong num_poisoned_pages accounting mm: vmscan: reclaim highmem zone if buffer_heads is over limit numa: fix /proc/<pid>/numa_maps for THP mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check memcg: relocate charge moving from ->attach to ->post_attach cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback slub: clean up code for kmem cgroup support to kmem_cache_free_bulk workqueue: fix ghost PENDING flag while doing MQ IO x86/apic: Handle zero vector gracefully in clear_vector_irq() efi: Expose non-blocking set_variable() wrapper to efivars efi: Fix out-of-bounds read in variable_matches() IB/security: Restrict use of the write() interface IB/mlx5: Expose correct max_sge_rd limit cxl: Keep IRQ mappings on context teardown v4l2-dv-timings.h: fix polarity for 4k formats vb2-memops: Fix over allocation of frame vectors ASoC: rt5640: Correct the digital interface data select ASoC: dapm: Make sure we have a card when displaying component widgets ASoC: ssm4567: Reset device before regcache_sync() ASoC: s3c24xx: use const snd_soc_component_driver pointer EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback toshiba_acpi: Fix regression caused by hotkey enabling value i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared i2c: cpm: Fix build break due to incompatible pointer types perf intel-pt: Fix segfault tracing transactions drm/i915: Use fw_domains_put_with_fifo() on HSW drm/i915: Fixup the free space logic in ring_prepare drm/amdkfd: uninitialized variable in dbgdev_wave_control_set_registers() drm/i915: skl_update_scaler() wants a rotation bitmask instead of bit number drm/i915: Cleanup phys status page too pwm: brcmstb: Fix check of devm_ioremap_resource() return code drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1() drm/dp/mst: Restore primary hub guid on resume drm/dp/mst: Validate port in drm_dp_payload_send_msg() drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries drm: Loongson-3 doesn't fully support wc memory drm/radeon: fix vertical bars appear on monitor (v2) drm/radeon: forbid mapping of userptr bo through radeon device file drm/radeon: fix initial connector audio value drm/radeon: add a quirk for a XFX R9 270X drm/amdgpu: fix regression on CIK (v2) amdgpu/uvd: add uvd fw version for amdgpu drm/amdgpu: bump the afmt limit for CZ, ST, Polaris drm/amdgpu: use defines for CRTCs and AMFT blocks drm/amdgpu: when suspending, if uvd/vce was running. need to cancel delay work. iommu/dma: Restore scatterlist offsets correctly iommu/amd: Fix checking of pci dma aliases pinctrl: single: Fix pcs_parse_bits_in_pinctrl_entry to use __ffs than ffs pinctrl: mediatek: correct debounce time unit in mtk_gpio_set_debounce xen kconfig: don't "select INPUT_XEN_KBDDEV_FRONTEND" Input: pmic8xxx-pwrkey - fix algorithm for converting trigger delay Input: gtco - fix crash on detecting device without endpoints netlink: don't send NETLINK_URELEASE for unbound sockets nl80211: check netlink protocol in socket release notification powerpc: Update TM user feature bits in scan_features() powerpc: Update cpu_user_features2 in scan_features() powerpc: scan_features() updates incorrect bits for REAL_LE crypto: talitos - fix AEAD tcrypt tests crypto: talitos - fix crash in talitos_cra_init() crypto: sha1-mb - use corrcet pointer while completing jobs crypto: ccp - Prevent information leakage on export iwlwifi: mvm: fix memory leak in paging iwlwifi: pcie: lower the debug level for RSA semaphore access s390/pci: add extra padding to function measurement block cpufreq: intel_pstate: Fix processing for turbo activation ratio Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control" Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control" drm/i915: Fix race condition in intel_dp_destroy_mst_connector() drm/qxl: fix cursor position with non-zero hotspot drm/nouveau/core: use vzalloc for allocating ramht futex: Acknowledge a new waiter in counter before plist futex: Handle unlock_pi race gracefully asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic() ALSA: hda - Add dock support for ThinkPad X260 ALSA: pcxhr: Fix missing mutex unlock ALSA: hda - add PCI ID for Intel Broxton-T ALSA: hda - Keep powering up ADCs on Cirrus codecs ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m ALSA: hda - Don't trust the reported actual power state x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address x86/mm/xen: Suppress hugetlbfs in PV guests arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings sched/cgroup: Fix/cleanup cgroup teardown/init dmaengine: pxa_dma: fix the maximum requestor line dmaengine: hsu: correct use of channel status register dmaengine: dw: fix master selection debugfs: Make automount point inodes permanently empty lib: lz4: fixed zram with lz4 on big endian machines dm cache metadata: fix cmd_read_lock() acquiring write lock dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros usb: gadget: f_fs: Fix use-after-free usb: hcd: out of bounds access in for_each_companion xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers usb: xhci: fix wild pointers in xhci_mem_cleanup xhci: resume USB 3 roothub first usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host assoc_array: don't call compare_object() on a node ARM: OMAP2+: hwmod: Fix updating of sysconfig register ARM: OMAP2: Fix up interconnect barrier initialization for DRA7 ARM: mvebu: Correct unit address for linksys ARM: dts: AM43x-epos: Fix clk parent for synctimer KVM: arm/arm64: Handle forward time correction gracefully kvm: x86: do not leak guest xcr0 into host interrupt handlers x86/mce: Avoid using object after free in genpool block: loop: fix filesystem corruption in case of aio/dio block: partition: initialize percpuref before sending out KOBJ_ADD Conflicts: arch/arm64/Kconfig arch/arm64/include/asm/cputype.h arch/arm64/include/asm/hardirq.h arch/arm64/include/asm/irq.h arch/arm64/kernel/cpu_errata.c arch/arm64/kernel/cpuinfo.c arch/arm64/kernel/setup.c arch/arm64/kernel/smp.c arch/arm64/kernel/stacktrace.c arch/arm64/mm/init.c arch/arm64/mm/mmu.c arch/arm64/mm/pageattr.c mm/memcontrol.c CRs-Fixed: 1054234 Signed-off-by: Trilok Soni <tsoni@codeaurora.org> Change-Id: I2a7a34631ffee36ce18b9171f16d023be777392f	2016-08-18 14:50:45 -07:00
Subash Abhinov Kasiviswanathan	6d2650831d	Revert "genetlink: disallow subscribing to unknown mcast groups" Commit `5ad6300524` ("genetlink: disallow subscribing to unknown mcast groups") disallows userspace to subscribe to groups that don't exist in kernel. As a result, communication between processes is not possible unless they explicitly register a dummy group with the kernel even if the communication is between userspace processes only. NETLINK_USERSOCK cannot be used here since userspace processes would require CAP_NET_ADMIN to receive multicast messages which is available for priveleged processes only. Fix this problem by reverting the change till a solution is determined internally and upstream discussion. CRs-Fixed: 1052589 Change-Id: Id559d9ef9d1e0a25e3bbdc81503978f01c6ed85f Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>	2016-08-09 21:34:10 -06:00
Dmitry Ivanov	95415ac578	netlink: don't send NETLINK_URELEASE for unbound sockets commit e27260203912b40751fa353d009eaa5a642c739f upstream. All existing users of NETLINK_URELEASE use it to clean up resources that were previously allocated to a socket via some command. As a result, no users require getting this notification for unbound sockets. Sending it for unbound sockets, however, is a problem because any user (including unprivileged users) can create a socket that uses the same ID as an existing socket. Binding this new socket will fail, but if the NETLINK_URELEASE notification is generated for such sockets, the users thereof will be tricked into thinking the socket that they allocated the resources for is closed. In the nl80211 case, this will cause destruction of virtual interfaces that still belong to an existing hostapd process; this is the case that Dmitry noticed. In the NFC case, it will cause a poll abort. In the case of netlink log/queue it will cause them to stop reporting events, as if NFULNL_CFG_CMD_UNBIND/NFQNL_CFG_CMD_UNBIND had been called. Fix this problem by checking that the socket is bound before generating the NETLINK_URELEASE notification. Signed-off-by: Dmitry Ivanov <dima@ubnt.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-04 14:48:45 -07:00
Mel Gorman	d0164adc89	mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-06 17:50:42 -08:00
David S. Miller	ba3e2084f2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/ipv6/xfrm6_output.c net/openvswitch/flow_netlink.c net/openvswitch/vport-gre.c net/openvswitch/vport-vxlan.c net/openvswitch/vport.c net/openvswitch/vport.h The openvswitch conflicts were overlapping changes. One was the egress tunnel info fix in 'net' and the other was the vport ->send() op simplification in 'net-next'. The xfrm6_output.c conflicts was also a simplification overlapping a bug fix. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-24 06:54:12 -07:00
David Herrmann	47191d65b6	netlink: fix locking around NETLINK_LIST_MEMBERSHIPS Currently, NETLINK_LIST_MEMBERSHIPS grabs the netlink table while copying the membership state to user-space. However, grabing the netlink table is effectively a write_lock_irq(), and as such we should not be triggering page-faults in the critical section. This can be easily reproduced by the following snippet: int s = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE); void p = mmap(0, 4096, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_ANON, -1, 0); int r = getsockopt(s, 0x10e, 9, p, (void)((char*)p + 4092)); This should work just fine, but currently triggers EFAULT and a possible WARN_ON below handle_mm_fault(). Fix this by reducing locking of NETLINK_LIST_MEMBERSHIPS to a read-side lock. The write-lock was overkill in the first place, and the read-lock allows page-faults just fine. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:18:28 -07:00
David S. Miller	26440c835f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/asix_common.c net/ipv4/inet_connection_sock.c net/switchdev/switchdev.c In the inet_connection_sock.c case the request socket hashing scheme is completely different in net-next. The other two conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-20 06:08:27 -07:00
Arad, Ronen	db65a3aaf2	netlink: Trim skb to alloc size to avoid MSG_TRUNC netlink_dump() allocates skb based on the calculated min_dump_alloc or a per socket max_recvmsg_len. min_alloc_size is maximum space required for any single netdev attributes as calculated by rtnl_calcit(). max_recvmsg_len tracks the user provided buffer to netlink_recvmsg. It is capped at 16KiB. The intention is to avoid small allocations and to minimize the number of calls required to obtain dump information for all net devices. netlink_dump packs as many small messages as could fit within an skb that was sized for the largest single netdev information. The actual space available within an skb is larger than what is requested. It could be much larger and up to near 2x with align to next power of 2 approach. Allowing netlink_dump to use all the space available within the allocated skb increases the buffer size a user has to provide to avoid truncaion (i.e. MSG_TRUNG flag set). It was observed that with many VLANs configured on at least one netdev, a larger buffer of near 64KiB was necessary to avoid "Message truncated" error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when min_alloc_size was only little over 32KiB. This patch trims skb to allocated size in order to allow the user to avoid truncation with more reasonable buffer size. Signed-off-by: Ronen Arad <ronen.arad@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-18 19:34:12 -07:00
Yaowei Bai	61d03535e4	net/netlink: lockdep_genl_is_held can be boolean This patch makes lockdep_genl_is_held return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-09 07:48:59 -07:00
David S. Miller	4963ed48f2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/ipv4/arp.c The net/ipv4/arp.c conflict was one commit adding a new local variable while another commit was deleting one. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-26 16:08:27 -07:00
Jiri Benc	92c14d9b5e	genetlink: simplify genl_notify The genl_notify function has too many arguments for no real reason - all callers use genl_info to get them anyway. Just pass the genl_info down to genl_notify. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-24 12:25:23 -07:00
Herbert Xu	da314c9923	netlink: Replace rhash_portid with bound On Mon, Sep 21, 2015 at 02:20:22PM -0400, Tejun Heo wrote: > > store_release and load_acquire are different from the usual memory > barriers and can't be paired this way. You have to pair store_release > and load_acquire. Besides, it isn't a particularly good idea to OK I've decided to drop the acquire/release helpers as they don't help us at all and simply pessimises the code by using full memory barriers (on some architectures) where only a write or read barrier is needed. > depend on memory barriers embedded in other data structures like the > above. Here, especially, rhashtable_insert() would have write barrier > before the entry is hashed not necessarily after, which means that > in the above case, a socket which appears to have set bound to a > reader might not visible when the reader tries to look up the socket > on the hashtable. But you are right we do need an explicit write barrier here to ensure that the hashing is visible. > There's no reason to be overly smart here. This isn't a crazy hot > path, write barriers tend to be very cheap, store_release more so. > Please just do smp_store_release() and note what it's paired with. It's not about being overly smart. It's about actually understanding what's going on with the code. I've seen too many instances of people simply sprinkling synchronisation primitives around without any knowledge of what is happening underneath, which is just a recipe for creating hard-to-debug races. > > @@ -1539,7 +1546,7 @@ static int netlink_bind(struct socket sock, struct sockaddr addr, > > } > > } > > > > - if (!nlk->portid) { > > + if (!nlk->bound) { > > I don't think you can skip load_acquire here just because this is the > second deref of the variable. That doesn't change anything. Race > condition could still happen between the first and second tests and > skipping the second would lead to the same kind of bug. The reason this one is OK is because we do not use nlk->portid or try to get nlk from the hash table before we return to user-space. However, there is a real bug here that none of these acquire/release helpers discovered. The two bound tests here used to be a single one. Now that they are separate it is entirely possible for another thread to come in the middle and bind the socket. So we need to repeat the portid check in order to maintain consistency. > > @@ -1587,7 +1594,7 @@ static int netlink_connect(struct socket sock, struct sockaddr addr, > > !netlink_allowed(sock, NL_CFG_F_NONROOT_SEND)) > > return -EPERM; > > > > - if (!nlk->portid) > > + if (!nlk->bound) > > Don't we need load_acquire here too? Is this path holding a lock > which makes that unnecessary? Ditto. ---8<--- The commit `1f770c0a09` ("netlink: Fix autobind race condition that leads to zero port ID") created some new races that can occur due to inconcsistencies between the two port IDs. Tejun is right that a barrier is unavoidable. Therefore I am reverting to the original patch that used a boolean to indicate that a user netlink socket has been bound. Barriers have been added where necessary to ensure that a valid portid and the hashed socket is visible. I have also changed netlink_insert to only return EBUSY if the socket is bound to a portid different to the requested one. This combined with only reading nlk->bound once in netlink_bind fixes a race where two threads that bind the socket at the same time with different port IDs may both succeed. Fixes: `1f770c0a09` ("netlink: Fix autobind race condition that leads to zero port ID") Reported-by: Tejun Heo <tj@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Nacked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-24 12:07:08 -07:00
Herbert Xu	1f770c0a09	netlink: Fix autobind race condition that leads to zero port ID The commit `c0bb07df7d` ("netlink: Reset portid after netlink_insert failure") introduced a race condition where if two threads try to autobind the same socket one of them may end up with a zero port ID. This led to kernel deadlocks that were observed by multiple people. This patch reverts that commit and instead fixes it by introducing a separte rhash_portid variable so that the real portid is only set after the socket has been successfully hashed. Fixes: `c0bb07df7d` ("netlink: Reset portid after netlink_insert failure") Reported-by: Tejun Heo <tj@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-20 22:55:31 -07:00
Daniel Borkmann	1853c94964	netlink, mmap: transform mmap skb into full skb on taps Ken-ichirou reported that running netlink in mmap mode for receive in combination with nlmon will throw a NULL pointer dereference in __kfree_skb() on nlmon_xmit(), in my case I can also trigger an "unable to handle kernel paging request". The problem is the skb_clone() in __netlink_deliver_tap_skb() for skbs that are mmaped. I.e. the cloned skb doesn't have a destructor, whereas the mmap netlink skb has it pointed to netlink_skb_destructor(), set in the handler netlink_ring_setup_skb(). There, skb->head is being set to NULL, so that in such cases, __kfree_skb() doesn't perform a skb_release_data() via skb_release_all(), where skb->head is possibly being freed through kfree(head) into slab allocator, although netlink mmap skb->head points to the mmap buffer. Similarly, the same has to be done also for large netlink skbs where the data area is vmalloced. Therefore, as discussed, make a copy for these rather rare cases for now. This fixes the issue on my and Ken-ichirou's test-cases. Reference: http://thread.gmane.org/gmane.linux.network/371129 Fixes: `bcbde0d449` ("net: netlink: virtual tap device management") Reported-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-11 14:36:49 -07:00
Daniel Borkmann	6bb0fef489	netlink, mmap: fix edge-case leakages in nf queue zero-copy When netlink mmap on receive side is the consumer of nf queue data, it can happen that in some edge cases, we write skb shared info into the user space mmap buffer: Assume a possible rx ring frame size of only 4096, and the network skb, which is being zero-copied into the netlink skb, contains page frags with an overall skb->len larger than the linear part of the netlink skb. skb_zerocopy(), which is generic and thus not aware of the fact that shared info cannot be accessed for such skbs then tries to write and fill frags, thus leaking kernel data/pointers and in some corner cases possibly writing out of bounds of the mmap area (when filling the last slot in the ring buffer this way). I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has an advertised length larger than 4096, where the linear part is visible at the slot beginning, and the leaked sizeof(struct skb_shared_info) has been written to the beginning of the next slot (also corrupting the struct nl_mmap_hdr slot header incl. status etc), since skb->end points to skb->data + ring->frame_size - NL_MMAP_HDRLEN. The fix adds and lets __netlink_alloc_skb() take the actual needed linear room for the network skb + meta data into account. It's completely irrelevant for non-mmaped netlink sockets, but in case mmap sockets are used, it can be decided whether the available skb_tailroom() is really large enough for the buffer, or whether it needs to internally fallback to a normal alloc_skb(). >From nf queue side, the information whether the destination port is an mmap RX ring is not really available without extra port-to-socket lookup, thus it can only be determined in lower layers i.e. when __netlink_alloc_skb() is called that checks internally for this. I chose to add the extra ldiff parameter as mmap will then still work: We have data_len and hlen in nfqnl_build_packet_message(), data_len is the full length (capped at queue->copy_range) for skb_zerocopy() and hlen some possible part of data_len that needs to be copied; the rem_len variable indicates the needed remaining linear mmap space. The only other workaround in nf queue internally would be after allocation time by f.e. cap'ing the data_len to the skb_tailroom() iff we deal with an mmap skb, but that would 1) expose the fact that we use a mmap skb to upper layers, and 2) trim the skb where we otherwise could just have moved the full skb into the normal receive queue. After the patch, in my test case the ring slot doesn't fit and therefore shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and thus needs to be picked up via recv(). Fixes: `3ab1f683bf` ("nfnetlink: add support for memory mapped netlink") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-09 21:43:22 -07:00
Daniel Borkmann	a66e36568e	netlink, mmap: don't walk rx ring on poll if receive queue non-empty In case of netlink mmap, there can be situations where received frames have to be placed into the normal receive queue. The ring buffer indicates this through NL_MMAP_STATUS_COPY, so the user is asked to pick them up via recvmsg(2) syscall, and to put the slot back to NL_MMAP_STATUS_UNUSED. Commit `0ef707700f` ("netlink: rx mmap: fix POLLIN condition") changed polling, so that we walk in the worst case the whole ring through the new netlink_has_valid_frame(), for example, when the ring would have no NL_MMAP_STATUS_VALID, but at least one NL_MMAP_STATUS_COPY frame. Since we do a datagram_poll() already earlier to pick up a mask that could possibly contain POLLIN \| POLLRDNORM already (due to NL_MMAP_STATUS_COPY), we can skip checking the rx ring entirely. In case the kernel is compiled with !CONFIG_NETLINK_MMAP, then all this is irrelevant anyway as netlink_poll() is just defined as datagram_poll(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-09 21:42:51 -07:00
Ken-ichirou MATSUZAWA	0ef707700f	netlink: rx mmap: fix POLLIN condition Poll() returns immediately after setting the kernel current frame (ring->head) to SKIP from user space even though there is no new frame. And in a case of all frames is VALID, user space program unintensionally sets (only) kernel current frame to UNUSED, then calls poll(), it will not return immediately even though there are VALID frames. To avoid situations like above, I think we need to scan all frames to find VALID frames at poll() like netlink_alloc_skb(), netlink_forward_ring() finding an UNUSED frame at skb allocation. Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:55:51 -07:00
Ken-ichirou MATSUZAWA	7084a31589	netlink: mmap: fix lookup frame position __netlink_lookup_frame() was always called with the same "pos" value in netlink_forward_ring(). It will look at the same ring entry header over and over again, every time through this loop. Then cycle through the whole ring, advancing ring->head, not "pos" until it equals the "ring->head != head" loop test fails. Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-28 22:25:42 -07:00
Christophe Ricard	0a6a3a23ea	netlink: add NETLINK_CAP_ACK socket option Since commit `c05cdb1b86` ("netlink: allow large data transfers from user-space"), the kernel may fail to allocate the necessary room for the acknowledgment message back to userspace. This patch introduces a new socket option that trims off the payload of the original netlink message. The netlink message header is still included, so the user can guess from the sequence number what is the message that has triggered the acknowledgment. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-28 22:25:42 -07:00
Ken-ichirou MATSUZAWA	c953e23936	netlink: mmap: fix tx type check I can't send netlink message via mmaped netlink socket since commit: `a8866ff6a5` netlink: make the check for "send from tx_ring" deterministic msg->msg_iter.type is set to WRITE (1) at SYSCALL_DEFINE6(sendto, ... import_single_range(WRITE, ... iov_iter_init(1, WRITE, ... call path, so that we need to check the type by iter_is_iovec() to accept the WRITE. Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-23 16:04:46 -07:00
Daniel Borkmann	4e7c133068	netlink: make sure -EBUSY won't escape from netlink_insert Linus reports the following deadlock on rtnl_mutex; triggered only once so far (extract): [12236.694209] NetworkManager D 0000000000013b80 0 1047 1 0x00000000 [12236.694218] ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018 [12236.694224] ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff [12236.694230] ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80 [12236.694235] Call Trace: [12236.694250] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10 [12236.694257] [<ffffffff815d133a>] ? schedule+0x2a/0x70 [12236.694263] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10 [12236.694271] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0 [12236.694280] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30 [12236.694291] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30 [12236.694299] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180 [12236.694309] [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190 [12236.694319] [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210 [12236.694331] [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70 [12236.694339] [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30 [12236.694346] [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0 [12236.694354] [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30 [12236.694360] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180 [12236.694367] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0 [12236.694376] [<ffffffff810a236f>] ? __wake_up+0x2f/0x50 [12236.694387] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40 [12236.694396] [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240 [12236.694405] [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0 [12236.694415] [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210 [12236.694423] [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0 [12236.694429] [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60 [12236.694434] [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70 [12236.694440] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a It seems so far plausible that the recursive call into rtnetlink_rcv() looks suspicious. One way, where this could trigger is that the senders NETLINK_CB(skb).portid was wrongly 0 (which is rtnetlink socket), so the rtnl_getlink() request's answer would be sent to the kernel instead to the actual user process, thus grabbing rtnl_mutex() twice. One theory would be that netlink_autobind() triggered via netlink_sendmsg() internally overwrites the -EBUSY error to 0, but where it is wrongly originating from __netlink_insert() instead. That would reset the socket's portid to 0, which is then filled into NETLINK_CB(skb).portid later on. As commit `d470e3b483` ("[NETLINK]: Fix two socket hashing bugs.") also puts it, -EBUSY should not be propagated from netlink_insert(). It looks like it's very unlikely to reproduce. We need to trigger the rhashtable_insert_rehash() handler under a situation where rehashing currently occurs (one /rare/ way would be to hit ht->elasticity limits while not filled enough to expand the hashtable, but that would rather require a specifically crafted bind() sequence with knowledge about destination slots, seems unlikely). It probably makes sense to guard __netlink_insert() in any case and remap that error. It was suggested that EOVERFLOW might be better than an already overloaded ENOMEM. Reference: http://thread.gmane.org/gmane.linux.network/372676 Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-10 10:59:10 -07:00
Florian Westphal	0470eb99b4	netlink: don't hold mutex in rcu callback when releasing mmapd ring Kirill A. Shutemov says: This simple test-case trigers few locking asserts in kernel: int main(int argc, char *argv) { unsigned int block_size = 16 4096; struct nl_mmap_req req = { .nm_block_size = block_size, .nm_block_nr = 64, .nm_frame_size = 16384, .nm_frame_nr = 64 * block_size / 16384, }; unsigned int ring_size; int fd; fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC); if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0) exit(1); if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0) exit(1); ring_size = req.nm_block_nr * req.nm_block_size; mmap(NULL, 2 * ring_size, PROT_READ\|PROT_WRITE, MAP_SHARED, fd, 0); return 0; } +++ exited with 0 +++ BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616 in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init 3 locks held by init/1: #0: (reboot_mutex){+.+...}, at: [<ffffffff81080959>] SyS_reboot+0xa9/0x220 #1: ((reboot_notifier_list).rwsem){.+.+..}, at: [<ffffffff8107f379>] __blocking_notifier_call_chain+0x39/0x70 #2: (rcu_callback){......}, at: [<ffffffff810d32e0>] rcu_do_batch.isra.49+0x160/0x10c0 Preemption disabled at:[<ffffffff8145365f>] __delay+0xf/0x20 CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-00009-gbddf4c4818e0 #253 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014 ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102 0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002 ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98 Call Trace: <IRQ> [<ffffffff81929ceb>] dump_stack+0x4f/0x7b [<ffffffff81085a9d>] ___might_sleep+0x16d/0x270 [<ffffffff81085bed>] __might_sleep+0x4d/0x90 [<ffffffff8192e96f>] mutex_lock_nested+0x2f/0x430 [<ffffffff81932fed>] ? _raw_spin_unlock_irqrestore+0x5d/0x80 [<ffffffff81464143>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8182fc3d>] netlink_set_ring+0x1ed/0x350 [<ffffffff8182e000>] ? netlink_undo_bind+0x70/0x70 [<ffffffff8182fe20>] netlink_sock_destruct+0x80/0x150 [<ffffffff817e484d>] __sk_free+0x1d/0x160 [<ffffffff817e49a9>] sk_free+0x19/0x20 [..] Cong Wang says: We can't hold mutex lock in a rcu callback, [..] Thomas Graf says: The socket should be dead at this point. It might be simpler to add a netlink_release_ring() function which doesn't require locking at all. Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name> Diagnosed-by: Cong Wang <cwang@twopensource.com> Suggested-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-21 22:22:56 -07:00
Markus Elfring	92b80eb33c	netlink: Delete an unnecessary check before the function call "module_put" The module_put() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-03 09:27:43 -07:00
David Herrmann	b42be38b27	netlink: add API to retrieve all group memberships This patch adds getsockopt(SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS) to retrieve all groups a socket is a member of. Currently, we have to use getsockname() and look at the nl.nl_groups bitmask. However, this mask is limited to 32 groups. Hence, similar to NETLINK_ADD_MEMBERSHIP and NETLINK_DROP_MEMBERSHIP, this adds a separate sockopt to manager higher groups IDs than 32. This new NETLINK_LIST_MEMBERSHIPS option takes a pointer to __u32 and the size of the array. The array is filled with the full membership-set of the socket, and the required array size is returned in optlen. Hence, user-space can retry with a properly sized array in case it was too small. Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-21 10:18:18 -07:00
David S. Miller	36583eb54d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/cadence/macb.c drivers/net/phy/phy.c include/linux/skbuff.h net/ipv4/tcp.c net/switchdev/switchdev.c Switchdev was a case of RTNH_H_{EXTERNAL --> OFFLOAD} renaming overlapping with net-next changes of various sorts. phy.c was a case of two changes, one adding a local variable to a function whilst the second was removing one. tcp.c overlapped a deadlock fix with the addition of new tcp_info statistic values. macb.c involved the addition of two zyncq device entries. skbuff.h involved adding back ipv4_daddr to nf_bridge_info whilst net-next changes put two other existing members of that struct into a union. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-23 01:22:35 -04:00
Herbert Xu	b9fbe709de	netlink: Use random autobind rover Currently we use a global rover to select a port ID that is unique. This used to work consistently when it was protected with a global lock. However as we're now lockless, the global rover can exhibit pathological behaviour should multiple threads all stomp on it at the same time. Granted this will eventually resolve itself but the process is suboptimal. This patch replaces the global rover with a pseudorandom starting point to avoid this issue. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-17 23:43:31 -04:00
Herbert Xu	c0bb07df7d	netlink: Reset portid after netlink_insert failure The commit `c5adde9468` ("netlink: eliminate nl_sk_hash_lock") breaks the autobind retry mechanism because it doesn't reset portid after a failed netlink_insert. This means that should autobind fail the first time around, then the socket will be stuck in limbo as it can never be bound again since it already has a non-zero portid. Fixes: `c5adde9468` ("netlink: eliminate nl_sk_hash_lock") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-16 17:08:57 -04:00
Eric Dumazet	91dd93f956	netlink: move nl_table in read_mostly section netlink sockets creation and deletion heavily modify nl_table_users and nl_table_lock. If nl_table is sharing one cache line with one of them, netlink performance is really bad on SMP. ffffffff81ff5f00 B nl_table ffffffff81ff5f0c b nl_table_users Putting nl_table in read_mostly section increased performance of my open/delete netlink sockets test by about 80 % This came up while diagnosing a getaddrinfo() problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-14 17:49:06 -04:00
David S. Miller	b04096ff33	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Four minor merge conflicts: 1) qca_spi.c renamed the local variable used for the SPI device from spi_device to spi, meanwhile the spi_set_drvdata() call got moved further up in the probe function. 2) Two changes were both adding new members to codel params structure, and thus we had overlapping changes to the initializer function. 3) 'net' was making a fix to sk_release_kernel() which is completely removed in 'net-next'. 4) In net_namespace.c, the rtnl_net_fill() call for GET operations had the command value fixed, meanwhile 'net-next' adjusted the argument signature a bit. This also matches example merge resolutions posted by Stephen Rothwell over the past two days. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-13 14:31:43 -04:00
Eric W. Biederman	13d3078e22	netlink: Create kernel netlink sockets in the proper network namespace Utilize the new functionality of sk_alloc so that nothing needs to be done to suprress the reference counting on kernel sockets. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-11 10:50:18 -04:00
Eric W. Biederman	11aa9c28b4	net: Pass kern from net_proto_family.create to sk_alloc In preparation for changing how struct net is refcounted on kernel sockets pass the knowledge that we are creating a kernel socket from sock_create_kern through to sk_alloc. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-11 10:50:17 -04:00
Nicolas Dichtel	59324cf35a	netlink: allow to listen "all" netns More accurately, listen all netns that have a nsid assigned into the netns where the netlink socket is opened. For this purpose, a netlink socket option is added: NETLINK_LISTEN_ALL_NSID. When this option is set on a netlink socket, this socket will receive netlink notifications from all netns that have a nsid assigned into the netns where the socket has been opened. The nsid is sent to userland via an anscillary data. With this patch, a daemon needs only one socket to listen many netns. This is useful when the number of netns is high. Because 0 is a valid value for a nsid, the field nsid_is_set indicates if the field nsid is valid or not. skb->cb is initialized to 0 on skb allocation, thus we are sure that we will never send a nsid 0 by error to the userland. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 22:15:31 -04:00
Nicolas Dichtel	cc3a572fe6	netlink: rename private flags and states These flags and states have the same prefix (NETLINK_) that netlink socket options. To avoid confusion and to be able to name a flag like a socket option, let's use an other prefix: NETLINK_[S\|F]_. Note: a comment has been fixed, it was talking about NETLINK_RECV_NO_ENOBUFS socket option instead of NETLINK_NO_ENOBUFS. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 22:15:31 -04:00
Herbert Xu	edac450d5b	netlink: Remove max_size setting We currently limit the hash table size to 64K which is very bad as even 10 years ago it was relatively easy to generate millions of sockets. Since the hash table is naturally limited by memory allocation failure, we don't really need an explicit limit so this patch removes it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-03 23:27:02 -04:00
Eric Dumazet	2ea2f62c8b	net: fix crash in build_skb() When I added pfmemalloc support in build_skb(), I forgot netlink was using build_skb() with a vmalloc() area. In this patch I introduce __build_skb() for netlink use, and build_skb() is a wrapper handling both skb->head_frag and skb->pfmemalloc This means netlink no longer has to hack skb->head_frag [ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26! [ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN [ 1567.700067] Dumping ftrace buffer: [ 1567.700067] (ftrace buffer empty) [ 1567.700067] Modules linked in: [ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167 [ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000 [ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3)) [ 1567.700067] RSP: 0018:ffff8802467779d8 EFLAGS: 00010202 [ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c [ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049 [ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000 [ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000 [ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000 [ 1567.700067] FS: 00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000 [ 1567.700067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0 [ 1567.700067] Stack: [ 1567.700067] ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000 [ 1567.700067] ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08 [ 1567.700067] ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821 [ 1567.700067] Call Trace: [ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316) [ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329) [ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623) [ 1567.774369] sock_write_iter (net/socket.c:823) [ 1567.774369] ? sock_sendmsg (net/socket.c:806) [ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491) [ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249) [ 1567.774369] ? default_llseek (fs/read_write.c:487) [ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701) [ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4)) [ 1567.774369] vfs_write (fs/read_write.c:539) [ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577) [ 1567.774369] ? SyS_read (fs/read_write.c:577) [ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636) [ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42) [ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261) Fixes: `79930f5892` ("net: do not deplete pfmemalloc reserve") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-25 15:49:49 -04:00
Patrick McHardy	49f7b33e63	rhashtable: provide len to obj_hashfn nftables sets will be converted to use so called setextensions, moving the key to a non-fixed position. To hash it, the obj_hashfn must be used, however it so far doesn't receive the length parameter. Pass the key length to obj_hashfn() and convert existing users. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:33 +01:00
Thomas Graf	b5e2c150ac	rhashtable: Disable automatic shrinking by default Introduce a new bool automatic_shrinking to require the user to explicitly opt-in to automatic shrinking of tables. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:40 -04:00
Herbert Xu	11b58ba146	netlink: Use default rhashtable hashfn This patch removes the explicit jhash value for the hashfn parameter of rhashtable. As the key length is a multiple of 4, this means that we will actually end up using jhash2. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:51 -04:00
Herbert Xu	8f2ddaac30	netlink: Remove netlink_compare_arg.trailer Instead of computing the offset from trailer, this patch computes netlink_compare_arg_len from the offset of portid and then adds 4 to it. This allows trailer to be removed. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-21 00:16:39 -04:00
Herbert Xu	c428ecd1a2	netlink: Move namespace into hash key Currently the name space is a de facto key because it has to match before we find an object in the hash table. However, it isn't in the hash value so all objects from different name spaces with the same port ID hash to the same bucket. This is bad as the number of name spaces is unbounded. This patch fixes this by using the namespace when doing the hash. Because the namespace field doesn't lie next to the portid field in the netlink socket, this patch switches over to the rhashtable interface without a fixed key. This patch also uses the new inlined rhashtable interface where possible. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	b06eee59b1	netlink: Use rhashtable max_size instead of max_shift This patch converts netlink to use rhashtable max_size instead of the obsolete max_shift. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 12:46:40 -04:00
David S. Miller	71a83a6db6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/rocker/rocker.c The rocker commit was two overlapping changes, one to rename the ->vport member to ->pport, and another making the bitmask expression use '1ULL' instead of plain '1'. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 21:16:48 -05:00
Ying Xue	1b78414047	net: Remove iocb argument from sendmsg and recvmsg After TIPC doesn't depend on iocb argument in its internal implementations of sendmsg() and recvmsg() hooks defined in proto structure, no any user is using iocb argument in them at all now. Then we can drop the redundant iocb argument completely from kinds of implementations of both sendmsg() and recvmsg() in the entire networking stack. Cc: Christoph Hellwig <hch@lst.de> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 13:06:31 -05:00
Daniel Borkmann	4c4b52d9b2	rhashtable: remove indirection for grow/shrink decision functions Currently, all real users of rhashtable default their grow and shrink decision functions to rht_grow_above_75() and rht_shrink_below_30(), so that there's currently no need to have this explicitly selectable. It can/should be generic and private inside rhashtable until a real use case pops up. Since we can make this private, we'll save us this additional indirection layer and can improve insertion/deletion time as well. Reference: http://patchwork.ozlabs.org/patch/443040/ Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:06:02 -05:00
David S. Miller	6e03f896b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 14:33:28 -08:00
David S. Miller	f2683b743f	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs More iov_iter work from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:46:55 -08:00
Herbert Xu	56d28b1e92	netlink: Use rhashtable walk iterator This patch gets rid of the manual rhashtable walk in netlink which touches rhashtable internals that should not be exposed. It does so by using the rhashtable iterator primitives. In fact the existing code was very buggy. Some sockets weren't shown at all while others were shown more than once. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:34:53 -08:00
Al Viro	a8866ff6a5	netlink: make the check for "send from tx_ring" deterministic As it is, zero msg_iovlen means that the first iovec in the kernel array of iovecs is left uninitialized, so checking if its ->iov_base is NULL is random. Since the real users of that thing are doing sendto(fd, NULL, 0, ...), they are getting msg_iovlen = 1 and msg_iov[0] = {NULL, 0}, which is what this test is trying to catch. As suggested by davem, let's just check that msg_iovlen was 1 and msg_iov[0].iov_base was NULL - _that_ is well-defined and it catches what we want to catch. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-04 01:34:13 -05:00
Pablo Neira	8b7c36d810	netlink: fix wrong subscription bitmask to group mapping in The subscription bitmask passed via struct sockaddr_nl is converted to the group number when calling the netlink_bind() and netlink_unbind() callbacks. The conversion is however incorrect since bitmask (1 << 0) needs to be mapped to group number 1. Note that you cannot specify the group number 0 (usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP since this is rejected through -EINVAL. This problem became noticeable since `97840cb` ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") when binding to bitmask (1 << 0) in ctnetlink. Reported-by: Andre Tomt <andre@tomt.net> Reported-by: Ivan Delalande <colona@arista.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-30 17:43:47 -08:00

1 2 3 4 5 ...