Commit graph

616 commits

Author SHA1 Message Date
Greg Kroah-Hartman
64a73ff728 This is the 4.4.76 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAllc3f0ACgkQONu9yGCS
 aT4fmA/+OHeYbhpaMRKqrUpsxB3NpROr2Z47ow6vaVjYZzd0irrODLlfIfDQ6EEo
 N3v28povu16VeYXk+4h8bsAP2K2j6/BlRaSi2hB6dmnY8GDMaXEfRojPYAlzVz50
 qnK/6152siDDarUx1h5Zc8GcmX/tEl6h3bOOxDcwLR+RvyIcWxenuR+uqRM/AV6o
 BPEiOuMu7P6LjID7KYgBTFNajVBMLrDXt4SCWdzOZmlNt0QXgKB9yw68vTcc+edC
 ZcXqa0M6nEWSDvwobbwBZhFL8H2dJjzweyjeFBgxnxgmOrRh6kvZG2wsz2c8O3/P
 g8TuMxU7siu+I3lFwKy+dgZ/1REz+6Q3oFBqXsuddrcPYu23rV6mz/GxqWy4cerb
 M4eTWz6L9vA2GoYpvBaWi0tKC9tkNM49g48Y24a6CW1O4dJWlz3RrpTiZmequbNF
 mo8EKomSXn4kYAm1xT03DGljQkK/i2JtyI5sk2hLEqqxKvZ/3q9xxLLKOVx8dPvs
 PIbfpapfYMXXMWgR6e+UKueNLgevfWE12X/OU4SgvSY4n/07/mH40XEd3zd82IsZ
 1Mw0qj3JnqCAFDBBMsDYa+OvABaGD1dHARuiv+aeqW8tqoBglFHxWqF+SQVNXLIE
 qTLiKz78vjQpH0zGpkA3HEOh/h4L7a0y3qRMECsk5SUxXsgu1gg=
 =bwNU
 -----END PGP SIGNATURE-----

Merge 4.4.76 into android-4.4

Changes in 4.4.76
	ipv6: release dst on error in ip6_dst_lookup_tail
	net: don't call strlen on non-terminated string in dev_set_alias()
	decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb
	net: Zero ifla_vf_info in rtnl_fill_vfinfo()
	af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers
	Fix an intermittent pr_emerg warning about lo becoming free.
	net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
	igmp: acquire pmc lock for ip_mc_clear_src()
	igmp: add a missing spin_lock_init()
	ipv6: fix calling in6_ifa_hold incorrectly for dad work
	net/mlx5: Wait for FW readiness before initializing command interface
	decnet: always not take dst->__refcnt when inserting dst into hash table
	net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
	sfc: provide dummy definitions of vswitch functions
	ipv6: Do not leak throw route references
	rtnetlink: add IFLA_GROUP to ifla_policy
	netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
	netfilter: synproxy: fix conntrackd interaction
	NFSv4: fix a reference leak caused WARNING messages
	drm/ast: Handle configuration without P2A bridge
	mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()
	MIPS: Avoid accidental raw backtrace
	MIPS: pm-cps: Drop manual cache-line alignment of ready_count
	MIPS: Fix IRQ tracing & lockdep when rescheduling
	ALSA: hda - Fix endless loop of codec configure
	ALSA: hda - set input_path bitmap to zero after moving it to new place
	drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr
	usb: gadget: f_fs: Fix possibe deadlock
	sysctl: enable strict writes
	block: fix module reference leak on put_disk() call for cgroups throttle
	mm: numa: avoid waiting on freed migrated pages
	KVM: x86: fix fixing of hypercalls
	scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
	scsi: lpfc: Set elsiocb contexts to NULL after freeing it
	qla2xxx: Fix erroneous invalid handle message
	ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags
	net: mvneta: Fix for_each_present_cpu usage
	MIPS: ath79: fix regression in PCI window initialization
	net: korina: Fix NAPI versus resources freeing
	MIPS: ralink: MT7688 pinmux fixes
	MIPS: ralink: fix USB frequency scaling
	MIPS: ralink: Fix invalid assignment of SoC type
	MIPS: ralink: fix MT7628 pinmux typos
	MIPS: ralink: fix MT7628 wled_an pinmux gpio
	mtd: bcm47xxpart: limit scanned flash area on BCM47XX (MIPS) only
	bgmac: fix a missing check for build_skb
	mtd: bcm47xxpart: don't fail because of bit-flips
	bgmac: Fix reversed test of build_skb() return value.
	net: bgmac: Fix SOF bit checking
	net: bgmac: Start transmit queue in bgmac_open
	net: bgmac: Remove superflous netif_carrier_on()
	powerpc/eeh: Enable IO path on permanent error
	gianfar: Do not reuse pages from emergency reserve
	Btrfs: fix truncate down when no_holes feature is enabled
	virtio_console: fix a crash in config_work_handler
	swiotlb-xen: update dev_addr after swapping pages
	xen-netfront: Fix Rx stall during network stress and OOM
	scsi: virtio_scsi: Reject commands when virtqueue is broken
	platform/x86: ideapad-laptop: handle ACPI event 1
	amd-xgbe: Check xgbe_init() return code
	net: dsa: Check return value of phy_connect_direct()
	drm/amdgpu: check ring being ready before using
	vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null
	virtio_net: fix PAGE_SIZE > 64k
	vxlan: do not age static remote mac entries
	ibmveth: Add a proper check for the availability of the checksum features
	kernel/panic.c: add missing \n
	HID: i2c-hid: Add sleep between POWER ON and RESET
	scsi: lpfc: avoid double free of resource identifiers
	spi: davinci: use dma_mapping_error()
	mac80211: initialize SMPS field in HT capabilities
	x86/mpx: Use compatible types in comparison to fix sparse error
	coredump: Ensure proper size of sparse core files
	swiotlb: ensure that page-sized mappings are page-aligned
	s390/ctl_reg: make __ctl_load a full memory barrier
	be2net: fix status check in be_cmd_pmac_add()
	perf probe: Fix to show correct locations for events on modules
	net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV
	sctp: check af before verify address in sctp_addr_id2transport
	ravb: Fix use-after-free on `ifconfig eth0 down`
	jump label: fix passing kbuild_cflags when checking for asm goto support
	xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY
	xfrm: NULL dereference on allocation failure
	xfrm: Oops on error in pfkey_msg2xfrm_state()
	watchdog: bcm281xx: Fix use of uninitialized spinlock.
	sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
	ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation
	ARM: 8685/1: ensure memblock-limit is pmd-aligned
	x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
	x86/mm: Fix flush_tlb_page() on Xen
	ocfs2: o2hb: revert hb threshold to keep compatible
	iommu/vt-d: Don't over-free page table directories
	iommu: Handle default domain attach failure
	iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
	cpufreq: s3c2416: double free on driver init error path
	KVM: x86: fix emulation of RSM and IRET instructions
	KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
	KVM: x86: zero base3 of unusable segments
	KVM: nVMX: Fix exception injection
	Linux 4.4.76

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-07-05 16:16:58 +02:00
Xin Long
0d1effe95e ipv6: fix calling in6_ifa_hold incorrectly for dad work
[ Upstream commit f8a894b218138888542a5058d0e902378fd0d4ec ]

Now when starting the dad work in addrconf_mod_dad_work, if the dad work
is idle and queued, it needs to hold ifa.

The problem is there's one gap in [1], during which if the pending dad work
is removed elsewhere. It will miss to hold ifa, but the dad word is still
idea and queue.

        if (!delayed_work_pending(&ifp->dad_work))
                in6_ifa_hold(ifp);
                    <--------------[1]
        mod_delayed_work(addrconf_wq, &ifp->dad_work, delay);

An use-after-free issue can be caused by this.

Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in
net6_ifa_finish_destroy was hit because of it.

As Hannes' suggestion, this patch is to fix it by holding ifa first in
addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if
the dad_work is already in queue.

Note that this patch did not choose to fix it with:

  if (!mod_delayed_work(delay))
          in6_ifa_hold(ifp);

As with it, when delay == 0, dad_work would be scheduled immediately, all
addrconf_mod_dad_work(0) callings had to be moved under ifp->lock.

Reported-by: Wei Chen <weichen@redhat.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05 14:37:14 +02:00
Greg Kroah-Hartman
5672779e72 This is the 4.4.73 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAllEst4ACgkQONu9yGCS
 aT5RJQ//UBkwmDInzy8BbfZk7pXY4IXzbXYfZ/nS5QbmPQNcQArKeIsOH+C1ZTKq
 suo4wR4yWUzATr8BBBxbUKyyrAsA85oDq/fl3EyPVLYOHdChnM/sS9H/eFrQhJIN
 7PM9j1YiqyTohkagB2tkWSb8tO1r/wgTSTLmIE+RzsWn6rvMCPMPVY+3OGgC1fuT
 LJrgtXKGBl9zNxmrns3VlSou1MhJvh/y8ESIrhdYHdZKos0zsgQaISXJjhZtyQcV
 OnEzsuz9NMXWE9XGjNpyAm88Nh8T41Ey/vwOjt6mkvWac3r77IgI5NWaLV/QDyqm
 d3jpVWK5BSPcLsmeN4LwQC5aYayvHlh8CfP8ZlBx1xkB5TpclnqXGgQA9BYpXAKw
 XoeFl8n8xLaPrgX8gp3kw/f6C6443OC2JVeRvgnH/0ZM7M0+rZxE6DstRcUHGqf6
 K8PN+AssQpBLIjXSGHnzDVHME/1xWUSmJZfLd5bd6NJ1zSZqZOy1gkf5dx77p0Ka
 UaGVOg6UzOojr3GeUTE62bRO2ZuAno0QO1NQJJkUK1CbNMYmE61vYLM8i0pLKWZJ
 3mDlhcoGK8aJH8chNLU3mgkXECU/9zOVKveWZFoghhMlv8ImgeTuiqZhvztzzT38
 42DxdXPfMzxCwBF02zYu4qn+WDJbNyOqMQrlMEHwwb88wnKiOUg=
 =Ic1J
 -----END PGP SIGNATURE-----

Merge 4.4.73 into android-4.4

Changes in 4.4.73
	s390/vmem: fix identity mapping
	partitions/msdos: FreeBSD UFS2 file systems are not recognized
	ARM: dts: imx6dl: Fix the VDD_ARM_CAP voltage for 396MHz operation
	staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory.
	Call echo service immediately after socket reconnect
	net: xilinx_emaclite: fix freezes due to unordered I/O
	net: xilinx_emaclite: fix receive buffer overflow
	ipv6: Handle IPv4-mapped src to in6addr_any dst.
	ipv6: Inhibit IPv4-mapped src address on the wire.
	NET: Fix /proc/net/arp for AX.25
	NET: mkiss: Fix panic
	net: hns: Fix the device being used for dma mapping during TX
	sierra_net: Skip validating irrelevant fields for IDLE LSIs
	sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications
	i2c: piix4: Fix request_region size
	ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches
	PM / runtime: Avoid false-positive warnings from might_sleep_if()
	jump label: pass kbuild_cflags when checking for asm goto support
	kasan: respect /proc/sys/kernel/traceoff_on_warning
	log2: make order_base_2() behave correctly on const input value zero
	ethtool: do not vzalloc(0) on registers dump
	fscache: Fix dead object requeue
	fscache: Clear outstanding writes when disabling a cookie
	FS-Cache: Initialise stores_lock in netfs cookie
	ipv6: fix flow labels when the traffic class is non-0
	drm/nouveau: prevent userspace from deleting client object
	drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
	net/mlx4_core: Avoid command timeouts during VF driver device shutdown
	gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
	pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
	net: adaptec: starfire: add checks for dma mapping errors
	parisc, parport_gsc: Fixes for printk continuation lines
	drm/nouveau: Don't enabling polling twice on runtime resume
	drm/ast: Fixed system hanged if disable P2A
	ravb: unmap descriptors when freeing rings
	nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
	r8152: re-schedule napi for tx
	r8152: fix rtl8152_post_reset function
	r8152: avoid start_xmit to schedule napi when napi is disabled
	sctp: sctp_addr_id2transport should verify the addr before looking up assoc
	romfs: use different way to generate fsid for BLOCK or MTD
	proc: add a schedule point in proc_pid_readdir()
	tipc: ignore requests when the connection state is not CONNECTED
	xtensa: don't use linux IRQ #0
	s390/kvm: do not rely on the ILC on kvm host protection fauls
	sparc64: make string buffers large enough
	Linux 4.4.73

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-06-27 09:47:36 +02:00
Linus Lüssing
8d228758f9 ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches
[ Upstream commit a088d1d73a4bcfd7bc482f8d08375b9b665dc3e5 ]

When for instance a mobile Linux device roams from one access point to
another with both APs sharing the same broadcast domain and a
multicast snooping switch in between:

1)    (c) <~~~> (AP1) <--[SSW]--> (AP2)

2)              (AP1) <--[SSW]--> (AP2) <~~~> (c)

Then currently IPv6 multicast packets will get lost for (c) until an
MLD Querier sends its next query message. The packet loss occurs
because upon roaming the Linux host so far stayed silent regarding
MLD and the snooping switch will therefore be unaware of the
multicast topology change for a while.

This patch fixes this by always resending MLD reports when an interface
change happens, for instance from NO-CARRIER to CARRIER state.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17 06:39:36 +02:00
Greg Kroah-Hartman
285c13770a This is the 4.4.68 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlkYQIMACgkQONu9yGCS
 aT6lQRAAx+GV9h6oAE5s6ehb/soIXrgvq/veRM52HRpECKvNOjp8p7rf2V9jLKy4
 HV/6n5Q7CClHgKkyfSvFput6iMzzzJWHl2cCFwiZ3e7eq3yCzIV4+Px0CD9SH5S7
 ukYSdmR5zU5oOoMvbW9op1GlUvyNlCtBqWLkXAhopyAuFG7aqvjprPRoJXNVsDqy
 QooRFbGilztrLTKXvnKlz2y0CDxrrHERRdVwRCpzeOpN0rEDoJfdNO6IoXph5vDj
 T2ZWH8WmL+2RPDUFA3fQ2pRKSZribk7Bw4BUDZGNKnXYGSwBWS4r0+1UkCyXGRda
 gLLajv0EIciXvNglkvZ6mzlCcucyJu1mhjFwh778HlFdzvgayxaXQMqFN72OPF8K
 SRsEZnBs4QiflLf4kI9WjiIBAL2uIrP6p9dFq8yHs5yEzRWGtXyODfFYRBnhW7ka
 KbJ47j+MMYvjyu82W+Zzw7qKFXluzLdQKzmY1HUiqegQEtwqjDr/jOL+uC0CkSBb
 OWSmo9/JZUcKn40epenP+ojgDkhJVoKeN5Cy1vWeDUV1pWjK+ErZ5GQZ9F9fNuvV
 MNaFjgQy+bZ4MQ1TgetZzvDKVnNHvuDwKKX6yIK1PHSMsBI4f7M1KLfwDi5WeUmg
 BeF3wDSQEhLGFhiwn3UzhK6VGjfaRsXXv8AhrELrgpnhWkZkg/A=
 =czqa
 -----END PGP SIGNATURE-----

Merge 4.4.68 into android-4.4

Changes in 4.4.68
	9p: fix a potential acl leak
	ARM: 8452/3: PJ4: make coprocessor access sequences buildable in Thumb2 mode
	cpupower: Fix turbo frequency reporting for pre-Sandy Bridge cores
	powerpc/powernv: Fix opal_exit tracepoint opcode
	power: supply: bq24190_charger: Fix irq trigger to IRQF_TRIGGER_FALLING
	power: supply: bq24190_charger: Call set_mode_host() on pm_resume()
	power: supply: bq24190_charger: Install irq_handler_thread() at end of probe()
	power: supply: bq24190_charger: Call power_supply_changed() for relevant component
	power: supply: bq24190_charger: Don't read fault register outside irq_handle_thread()
	power: supply: bq24190_charger: Handle fault before status on interrupt
	leds: ktd2692: avoid harmless maybe-uninitialized warning
	ARM: OMAP5 / DRA7: Fix HYP mode boot for thumb2 build
	mwifiex: debugfs: Fix (sometimes) off-by-1 SSID print
	mwifiex: remove redundant dma padding in AMSDU
	mwifiex: Avoid skipping WEP key deletion for AP
	x86/ioapic: Restore IO-APIC irq_chip retrigger callback
	x86/pci-calgary: Fix iommu_free() comparison of unsigned expression >= 0
	clk: Make x86/ conditional on CONFIG_COMMON_CLK
	kprobes/x86: Fix kernel panic when certain exception-handling addresses are probed
	x86/platform/intel-mid: Correct MSI IRQ line for watchdog device
	Revert "KVM: nested VMX: disable perf cpuid reporting"
	KVM: nVMX: initialize PML fields in vmcs02
	KVM: nVMX: do not leak PML full vmexit to L1
	usb: host: ehci-exynos: Decrese node refcount on exynos_ehci_get_phy() error paths
	usb: host: ohci-exynos: Decrese node refcount on exynos_ehci_get_phy() error paths
	usb: chipidea: Only read/write OTGSC from one place
	usb: chipidea: Handle extcon events properly
	USB: serial: keyspan_pda: fix receive sanity checks
	USB: serial: digi_acceleport: fix incomplete rx sanity check
	USB: serial: ssu100: fix control-message error handling
	USB: serial: io_edgeport: fix epic-descriptor handling
	USB: serial: ti_usb_3410_5052: fix control-message error handling
	USB: serial: ark3116: fix open error handling
	USB: serial: ftdi_sio: fix latency-timer error handling
	USB: serial: quatech2: fix control-message error handling
	USB: serial: mct_u232: fix modem-status error handling
	USB: serial: io_edgeport: fix descriptor error handling
	phy: qcom-usb-hs: Add depends on EXTCON
	serial: 8250_omap: Fix probe and remove for PM runtime
	scsi: mac_scsi: Fix MAC_SCSI=m option when SCSI=m
	MIPS: R2-on-R6 MULTU/MADDU/MSUBU emulation bugfix
	brcmfmac: Ensure pointer correctly set if skb data location changes
	brcmfmac: Make skb header writable before use
	staging: wlan-ng: add missing byte order conversion
	staging: emxx_udc: remove incorrect __init annotations
	ALSA: hda - Fix deadlock of controller device lock at unbinding
	tcp: do not underestimate skb->truesize in tcp_trim_head()
	bpf, arm64: fix jit branch offset related to ldimm64
	tcp: fix wraparound issue in tcp_lp
	tcp: do not inherit fastopen_req from parent
	ipv4, ipv6: ensure raw socket message is big enough to hold an IP header
	rtnetlink: NUL-terminate IFLA_PHYS_PORT_NAME string
	ipv6: initialize route null entry in addrconf_init()
	ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
	bnxt_en: allocate enough space for ->ntp_fltr_bmap
	f2fs: sanity check segment count
	drm/ttm: fix use-after-free races in vm fault handling
	block: get rid of blk_integrity_revalidate()
	Linux 4.4.68

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-05-15 09:25:05 +02:00
WANG Cong
5c333f84bb ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
[ Upstream commit 242d3a49a2a1a71d8eb9f953db1bcaa9d698ce00 ]

For each netns (except init_net), we initialize its null entry
in 3 places:

1) The template itself, as we use kmemdup()
2) Code around dst_init_metrics() in ip6_route_net_init()
3) ip6_route_dev_notify(), which is supposed to initialize it after
   loopback registers

Unfortunately the last one still happens in a wrong order because
we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to
net->loopback_dev's idev, thus we have to do that after we add
idev to loopback. However, this notifier has priority == 0 same as
ipv6_dev_notf, and ipv6_dev_notf is registered after
ip6_route_dev_notifier so it is called actually after
ip6_route_dev_notifier. This is similar to commit 2f460933f58e
("ipv6: initialize route null entry in addrconf_init()") which
fixes init_net.

Fix it by picking a smaller priority for ip6_route_dev_notifier.
Also, we have to release the refcnt accordingly when unregistering
loopback_dev because device exit functions are called before subsys
exit functions.

Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14 13:32:58 +02:00
WANG Cong
5117f03fd6 ipv6: initialize route null entry in addrconf_init()
[ Upstream commit 2f460933f58eee3393aba64f0f6d14acb08d1724 ]

Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev
since it is always NULL.

This is clearly wrong, we have code to initialize it to loopback_dev,
unfortunately the order is still not correct.

loopback_dev is registered very early during boot, we lose a chance
to re-initialize it in notifier. addrconf_init() is called after
ip6_route_init(), which means we have no chance to correct it.

Fix it by moving this initialization explicitly after
ipv6_add_dev(init_net.loopback_dev) in addrconf_init().

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14 13:32:58 +02:00
Maciej Żenczykowski
e246a2f11f UPSTREAM: ipv6 addrconf: implement RFC7559 router solicitation backoff
This implements:
  https://tools.ietf.org/html/rfc7559

Backoff is performed according to RFC3315 section 14:
  https://tools.ietf.org/html/rfc3315#section-14

We allow setting /proc/sys/net/ipv6/conf/*/router_solicitations
to a negative value meaning an unlimited number of retransmits,
and we make this the new default (inline with the RFC).

We also add a new setting:
  /proc/sys/net/ipv6/conf/*/router_solicitation_max_interval
defaulting to 1 hour (per RFC recommendation).

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bd11f0741fa5a2c296629898ad07759dd12b35bb in
DaveM's net-next/master, should make Linus' tree in 4.9-rc1)
Change-Id: Ia32cdc5c61481893ef8040734e014bf2229fc39e
2017-04-06 08:36:00 +00:00
Joel Scherpelz
e953f89b85 net: ipv6: Add sysctl for minimum prefix len acceptable in RIOs.
This commit adds a new sysctl accept_ra_rt_info_min_plen that
defines the minimum acceptable prefix length of Route Information
Options. The new sysctl is intended to be used together with
accept_ra_rt_info_max_plen to configure a range of acceptable
prefix lengths. It is useful to prevent misconfigurations from
unintentionally blackholing too much of the IPv6 address space
(e.g., home routers announcing RIOs for fc00::/7, which is
incorrect).

[backport of net-next bbea124bc99df968011e76eba105fe964a4eceab]
Bug: 33333670
Test: net_test passes

Signed-off-by: Joel Scherpelz <jscherpelz@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23 05:03:51 +00:00
Dmitry Shmidt
5edfa05a10 This is the 4.4.48 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlicFCgACgkQONu9yGCS
 aT4TLg//QVqQvdkxyy0lKQfOxmo4RSErmpFstgkvuVgucGh6Akvh8OV9hHJKabjK
 RUn3BNASoWfQF+G1vn7EQWcTGDgJhF/P39DvMu3zvpRbSYMMeX7og9iDnoNn2WtG
 l89l+5YfQG7Y8eJWj1mnTW2ul9pUxJFg4j2rjmcLhfgKPvJPCn+cpU2XKUxpj7gM
 yd/nbVuQlMFW6qfEES1W1RbDEOQ1KWJgdupsMEgodRxb/dlg8KldBQFmv1fGcrA6
 5jFqWzsQQ7AyfMWIRDBm9mJlHuvdoGCEGkyTbsZoSyuN72/cyfPSfTZPInpi09bb
 l0sod1nzcZsuQVJzaQHTKlvpMEduIDQVxy2/pNW/pKnGAS++fkK+uJCsu0mz+6+8
 zntaPdVoboiwwoK5dgP27vgWpYpw2QoCpPqWno7NIVNZfUcWWng3NS49goN+ytvY
 m1i1ih4KU1bMqMrT0qZugQwHHqaE9IJ8xyDMdXc86cMH1ylTo8ZnOOyGxRKKLOW1
 nVs4aQT2i7E9yQ8TjVJplLxtU3t/Q3D1qqPr5U70XJyEgT5X4/V0mXJaRRWXAzXP
 2IBJOLznqwbwuIHV8ocp7i76qtpVqbJkpMx2NhB0tFP0XjffqpZvv0v8aBTAdBS2
 060nyG8fZad6L++tWVODt7nd7gkD4NN/I8BqD0XzXx6zbOJexqA=
 =GUZe
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.48' into android-4.4.y

This is the 4.4.48 stable release
2017-02-09 10:59:15 -08:00
Kefeng Wang
8051bf2890 ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock
[ Upstream commit 03e4deff4987f79c34112c5ba4eb195d4f9382b0 ]

Just like commit 4acd4945cd ("ipv6: addrconf: Avoid calling
netdevice notifiers with RCU read-side lock"), it is unnecessary
to make addrconf_disable_change() use RCU iteration over the
netdev list, since it already holds the RTNL lock, or we may meet
Illegal context switch in RCU read-side critical section.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-04 09:45:09 +01:00
Dmitry Shmidt
324e88de4a This is the 4.4.32 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYKq+PAAoJEDjbvchgkmk+W3sQAKHJ6dI10P/sFTe4AlGoRGNr
 ZtCwGwwolBoD/NtXa2HCovc9ofIU4zWYXl5P+kbHtKV/ZB4q5+m7Q5bpWh4TQFUy
 9TKho6aywF9uXpAEV99qKYvAOIq5EgJXdgrhCRTYBBR9+uR3+B1cUJhxpyD6htw4
 H7ABpmihWjij0o9YYAin7y/O+8jeqnuNLPUoCek1Emf0cn7G5keMg8Lli0WCz7jM
 JdKOjbvaYscgvb4BqTKqtg5NneC3GoeNp43Kvz4LbmcPw1yT5N8sHswqlSio4U2U
 Sxyvtj0RxoSoAus2UR62pTGDu1TrSHxWEWpYpqa77hr1/TpBY7put1OldFmUfu1B
 voQUI05Ox74RT9pl5c8DGnXH8Zyiu6a7Fpj6EdWbWxtbIgvWCLaDHniEY1WKR6cj
 Bmil/zjGyDtzANJBasC9NJHF8yd+/vxNfn5n0eAz6Xp94MIdOGPIQle+NATG5osN
 0b/NLit64B2F6Djijkv1vV9V7x1oYqIYVG6f1BoVtRXCjhcx9PnkskXcP+1SKUhH
 xOTXLt6rGNaTj+T2/41VJUtZ6eiZj+0GZMXILu5SIEdKiRiGLfsLHX117OK3ZhYT
 PFzzzWZoC2FOL/ldp/K6ncPZV0oHn3yfQa3T97jGI1LbsYkXXyQkW5PNwqGccbUc
 xvhEAPDvBxDlfcgqWMaw
 =DC+B
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.32' into android-4.4.y

This is the 4.4.32 stable release

Change-Id: I5028402eadfcf055ac44a5e67abc6da75b2068b3
2016-11-15 17:02:38 -08:00
Nicolas Dichtel
e635b47661 ipv6: correctly add local routes when lo goes up
[ Upstream commit a220445f9f4382c36a53d8ef3e08165fa27f7e2c ]

The goal of the patch is to fix this scenario:
 ip link add dummy1 type dummy
 ip link set dummy1 up
 ip link set lo down ; ip link set lo up

After that sequence, the local route to the link layer address of dummy1 is
not there anymore.

When the loopback is set down, all local routes are deleted by
addrconf_ifdown()/rt6_ifdown(). At this time, the rt6_info entry still
exists, because the corresponding idev has a reference on it. After the rcu
grace period, dst_rcu_free() is called, and thus ___dst_free(), which will
set obsolete to DST_OBSOLETE_DEAD.

In this case, init_loopback() is called before dst_rcu_free(), thus
obsolete is still sets to something <= 0. So, the function doesn't add the
route again. To avoid that race, let's check the rt6 refcnt instead.

Fixes: 25fb6ca4ed ("net IPv6 : Fix broken IPv6 routing table after loopback down-up")
Fixes: a881ae1f62 ("ipv6: don't call addrconf_dst_alloc again when enable lo")
Fixes: 33d99113b1 ("ipv6: reallocate addrconf router for ipv6 address when lo device up")
Reported-by: Francesco Santoro <francesco.santoro@6wind.com>
Reported-by: Samuel Gauthier <samuel.gauthier@6wind.com>
CC: Balakumaran Kannan <Balakumaran.Kannan@ap.sony.com>
CC: Maruthi Thotad <Maruthi.Thotad@ap.sony.com>
CC: Sabrina Dubroca <sd@queasysnail.net>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
CC: Weilong Chen <chenweilong@huawei.com>
CC: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-15 07:46:38 +01:00
Dmitry Shmidt
734bcf32c2 This is the 4.4.22 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJX5jR2AAoJEDjbvchgkmk+nXwQAML5WFM1xDL8frXh3vIS3RzD
 fP2YHP0Bm+xE/G9jDnlcoqJmxg4DKPUCP4T/rCZmeNRWc/RaIBX+VTyfVhN969uo
 v5f8jN6fc4TO9WMD+G++Vx3MZqupJbSAXlY2ZSUTF389lM/jHvaWj+DfA1qGLmGJ
 UbfO1jNszadZGIb8yOo/qmR+E3sSV/nT+/y7Sa2rSqkKt5+YI+z1Q1ezLo7BZ+uO
 6p968djKTXSOO7SHciddoegJ8lF2hhgY4cW95CEV+Dqu2O6AVyFyMz+ngYivEueZ
 ZwwQCaYIl+68ssAoI61VmtQHEvuaikTx5g9vjAApScWWijZU+V/M65BLAL6GAMWH
 kWOmilbtZKhyirecAxgnRIkJR8Tp0YcgUYAivsqkYqVPelcPsHvOFRfr4D6HrcBt
 wLrjaoBj+1vAjskozKJEymDNGQJ2Me/nBAWgN44MQYLRGg4kdBxNS/CGyeh8O8wO
 gEeVqa+zDOQCSeg2LJdiql3TdMQfQ+kpCsfjcrrl1oRkRX7OX130+gLuI8Tt1Fno
 6niq6w+QeAY445RSyM45vLeJ6vXB7oFadtuD4QvsB5YFr0X0P0KF3GKlHl0xiyEV
 JFpWJiXYsnOvM8entT23aeCSTlDT1p6os3jLh8p7CBn9TvP3uW2nfgG/FKvy0wGD
 7L7FKYb4Mw+YSrROfxBT
 =5+OA
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.22' into android-4.4.y

This is the 4.4.22 stable release

Change-Id: Id49e3c87d2cacb2fa85d85a17226f718f4a5ac28
2016-09-26 10:37:43 -07:00
Wei Yongjun
8b18e0e498 ipv6: addrconf: fix dev refcont leak when DAD failed
commit 751eb6b6042a596b0080967c1a529a9fe98dac1d upstream.

In general, when DAD detected IPv6 duplicate address, ifp->state
will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
delayed work, the call tree should be like this:

ndisc_recv_ns
  -> addrconf_dad_failure        <- missing ifp put
     -> addrconf_mod_dad_work
       -> schedule addrconf_dad_work()
         -> addrconf_dad_stop()  <- missing ifp hold before call it

addrconf_dad_failure() called with ifp refcont holding but not put.
addrconf_dad_work() call addrconf_dad_stop() without extra holding
refcount. This will not cause any issue normally.

But the race between addrconf_dad_failure() and addrconf_dad_work()
may cause ifp refcount leak and netdevice can not be unregister,
dmesg show the following messages:

IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
...
unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Fixes: c15b1ccadb ("ipv6: move DAD and addrconf_verify processing
to workqueue")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:42 +02:00
Dmitry Shmidt
b558f17a13 This is the 4.4.16 stable release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXmOXmAAoJEDjbvchgkmk+QYIP/1S8oBZsvjfDzvH8t63HyLeH
 i43MFlYoFAqUIZc002XpluSvZ8uHoG+r7R8Hq3wmv48wxe3M6OBnMdBVTht6mPw+
 t5OLTZr40lWaJm2EIi4aekueMIrCgmL+Et+IFYv7ZVBuYLteVcfny+zdq4EqGmgj
 /a19+L/sTTr4SHtJIhHxWhiVJ9fVMgQk/N3VgQmIiNF2+lVbiFI7QQiDPLbFl0KK
 CM4ETO22HxHCYilGpzhpSMsHCxv12VqNaXNLAsPAepGGW7PqvUmrEWAqgwsbOfRc
 GxTLNk0dUgJqMrfEpQ8ZOMlgzvCAYG2jZuNSuT+nuzrWSUP+WOGRi9TTTxp1CYuZ
 PHlhNTH7ZnqosxJUUZS2d9N5ygpqD48Rhlfl824YzOWCy94VeUnedkVLb20uJwPF
 Y5aQ5WjktBC9why5e4OgGQERvx/U9KTk8E1zRfZZPc2oft9My0YxuemjjKAKZiYN
 ne4WhXbgOJTQkAoZwh2xqny3bWyEaoSrWpQ3R7bBJ9SIRLEOdCKzKpduDbAnbMP7
 QWgQOQC/6qA1mKqjrqF4KPA1Quo9PcUK2Ajh523ewMGCowgY90vyejAgh4Q8g0GC
 fKlx+jJDoKVDbQ8v4hc9PPHMsNNIKT9a1ptwVS3lE+bq1D5Ffm57A4/uOTMYHVab
 gKqu8h1CA0MCVBsH3nNA
 =nY8S
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.16' into android-4.4.y

This is the 4.4.16 stable release

Change-Id: Ibaf7b7e03695e1acebc654a2ca1a4bfcc48fcea4
2016-08-01 15:57:55 -07:00
Anton Protopopov
b7c2e2acc6 rtnl: RTM_GETNETCONF: fix wrong return value
[ Upstream commit a97eb33ff225f34a8124774b3373fd244f0e83ce ]

An error response from a RTM_GETNETCONF request can return the positive
error value EINVAL in the struct nlmsgerr that can mislead userspace.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-03 15:07:07 -08:00
subashab@codeaurora.org
cdbc66828d ipv6: addrconf: Fix recursive spin lock call
[ Upstream commit 16186a82de1fdd868255448274e64ae2616e2640 ]

A rcu stall with the following backtrace was seen on a system with
forwarding, optimistic_dad and use_optimistic set. To reproduce,
set these flags and allow ipv6 autoconf.

This occurs because the device write_lock is acquired while already
holding the read_lock. Back trace below -

INFO: rcu_preempt self-detected stall on CPU { 1}  (t=2100 jiffies
 g=3992 c=3991 q=4471)
<6> Task dump for CPU 1:
<2> kworker/1:0     R  running task    12168    15   2 0x00000002
<2> Workqueue: ipv6_addrconf addrconf_dad_work
<6> Call trace:
<2> [<ffffffc000084da8>] el1_irq+0x68/0xdc
<2> [<ffffffc000cc4e0c>] _raw_write_lock_bh+0x20/0x30
<2> [<ffffffc000bc5dd8>] __ipv6_dev_ac_inc+0x64/0x1b4
<2> [<ffffffc000bcbd2c>] addrconf_join_anycast+0x9c/0xc4
<2> [<ffffffc000bcf9f0>] __ipv6_ifa_notify+0x160/0x29c
<2> [<ffffffc000bcfb7c>] ipv6_ifa_notify+0x50/0x70
<2> [<ffffffc000bd035c>] addrconf_dad_work+0x314/0x334
<2> [<ffffffc0000b64c8>] process_one_work+0x244/0x3fc
<2> [<ffffffc0000b7324>] worker_thread+0x2f8/0x418
<2> [<ffffffc0000bb40c>] kthread+0xe0/0xec

v2: do addrconf_dad_kick inside read lock and then acquire write
lock for ipv6_ifa_notify as suggested by Eric

Fixes: 7fd2561e4e ("net: ipv6: Add a sysctl to make optimistic
addresses useful candidates")

Cc: Eric Dumazet <edumazet@google.com>
Cc: Erik Kline <ek@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-03 15:07:05 -08:00
Lorenzo Colitti
6dd69fdc00 net: ipv6: autoconf routes into per-device tables
Currently, IPv6 router discovery always puts routes into
RT6_TABLE_MAIN. This causes problems for connection managers
that want to support multiple simultaneous network connections
and want control over which one is used by default (e.g., wifi
and wired).

To work around this connection managers typically take the routes
they prefer and copy them to static routes with low metrics in
the main table. This puts the burden on the connection manager
to watch netlink to see if the routes have changed, delete the
routes when their lifetime expires, etc.

Instead, this patch adds a per-interface sysctl to have the
kernel put autoconf routes into different tables. This allows
each interface to have its own autoconf table, and choosing the
default interface (or using different interfaces at the same
time for different types of traffic) can be done using
appropriate ip rules.

The sysctl behaves as follows:

- = 0: default. Put routes into RT6_TABLE_MAIN as before.
- > 0: manual. Put routes into the specified table.
- < 0: automatic. Add the absolute value of the sysctl to the
       device's ifindex, and use that table.

The automatic mode is most useful in conjunction with
net.ipv6.conf.default.accept_ra_rt_table. A connection manager
or distribution could set it to, say, -100 on boot, and
thereafter just use IP rules.

Change-Id: I82d16e3737d9cdfa6489e649e247894d0d60cbb1
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2016-02-16 13:51:36 -08:00
WANG Cong
5449a5ca9b addrconf: always initialize sysctl table data
When sysctl performs restrict writes, it allows to write from
a middle position of a sysctl file, which requires us to initialize
the table data before calling proc_dostring() for the write case.

Fixes: 3d1bec9932 ("ipv6: introduce secret_stable to ipv6_devconf")
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-22 17:00:58 -05:00
Hannes Frederic Sowa
9b29c6962b ipv6: automatically enable stable privacy mode if stable_secret set
Bjørn reported that while we switch all interfaces to privacy stable mode
when setting the secret, we don't set this mode for new interfaces. This
does not make sense, so change this behaviour.

Fixes: 622c81d57b ("ipv6: generation of stable privacy addresses for link-local and autoconf")
Reported-by: Bjørn Mork <bjorn@mork.no>
Cc: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:37:32 -05:00
Bjørn Mork
9a1ec4612c ipv6: keep existing flags when setting IFA_F_OPTIMISTIC
Commit 64236f3f3d ("ipv6: introduce IFA_F_STABLE_PRIVACY flag")
failed to update the setting of the IFA_F_OPTIMISTIC flag, causing
the IFA_F_STABLE_PRIVACY flag to be lost if IFA_F_OPTIMISTIC is set.

Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Fixes: 64236f3f3d ("ipv6: introduce IFA_F_STABLE_PRIVACY flag")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 18:09:46 -05:00
Nicolas Dichtel
304d888b29 Revert "ipv6: ndisc: inherit metadata dst when creating ndisc requests"
This reverts commit ab450605b3.

In IPv6, we cannot inherit the dst of the original dst. ndisc packets
are IPv6 packets and may take another route than the original packet.

This patch breaks the following scenario: a packet comes from eth0 and
is forwarded through vxlan1. The encapsulated packet triggers an NS
which cannot be sent because of the wrong route.

CC: Jiri Benc <jbenc@redhat.com>
CC: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:07:59 -05:00
Sabrina Dubroca
2a189f9e57 ipv6: clean up dev_snmp6 proc entry when we fail to initialize inet6_dev
In ipv6_add_dev, when addrconf_sysctl_register fails, we do not clean up
the dev_snmp6 entry that we have already registered for this device.
Call snmp6_unregister_dev in this case.

Fixes: a317a2f19d ("ipv6: fail early when creating netdev named all or default")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-04 23:49:48 -05:00
Alexander Duyck
b7b0b1d290 ipv6: recreate ipv6 link-local addresses when increasing MTU over IPV6_MIN_MTU
This change makes it so that we reinitialize the interface if the MTU is
increased back above IPV6_MIN_MTU and the interface is up.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-30 18:11:07 +09:00
Arad, Ronen
b1974ed05e netlink: Rightsize IFLA_AF_SPEC size calculation
if_nlmsg_size() overestimates the minimum allocation size of netlink
dump request (when called from rtnl_calcit()) or the size of the
message (when called from rtnl_getlink()). This is because
ext_filter_mask is not supported by rtnl_link_get_af_size() and
rtnl_link_get_size().

The over-estimation is significant when at least one netdev has many
VLANs configured (8 bytes for each configured VLAN).

This patch-set "rightsizes" the protocol specific attribute size
calculation by propagating ext_filter_mask to rtnl_link_get_af_size()
and adding this a argument to get_link_af_size op in rtnl_af_ops.

Bridge module already used filtering aware sizing for notifications.
br_get_link_af_size_filtered() is consistent with the modified
get_link_af_size op so it replaces br_get_link_af_size() in br_af_ops.
br_get_link_af_size() becomes unused and thus removed.

Signed-off-by: Ronen Arad <ronen.arad@intel.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21 19:15:20 -07:00
David S. Miller
26440c835f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	net/ipv4/inet_connection_sock.c
	net/switchdev/switchdev.c

In the inet_connection_sock.c case the request socket hashing scheme
is completely different in net-next.

The other two conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-20 06:08:27 -07:00
David Ahern
ca254490c8 net: Add VRF support to IPv6 stack
As with IPv4 support for VRFs added to IPv6 stack by replacing hardcoded
table ids with possibly device specific ones and manipulating the oif in
the flowi6. The flow flags are used to skip oif compare in nexthop lookups
if the device is enslaved to a VRF via the L3 master device.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:55:08 -07:00
Hannes Frederic Sowa
d9e4ce65b2 ipv6: gre: setup default multicast routes over PtP links
GRE point-to-point interfaces should also support ipv6 multicast. Setting
up default multicast routes on interface creation was forgotten. Add it.

Bugzilla: <https://bugzilla.kernel.org/show_bug.cgi?id=103231>
Cc: Julien Muchembled <jm@jmuchemb.eu>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Nicolas Dumazet <ndumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-11 05:30:43 -07:00
David S. Miller
4963ed48f2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv4/arp.c

The net/ipv4/arp.c conflict was one commit adding a new
local variable while another commit was deleting one.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-26 16:08:27 -07:00
Jiri Benc
38cf595b19 ipv6: remove unused neigh parameter from ndisc functions
Since commit 12fd84f438 ("ipv6: Remove unused neigh argument for
icmp6_dst_alloc() and its callers."), the neigh parameter of ndisc_send_na
and ndisc_send_ns is unused.

CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 12:26:08 -07:00
Sowmini Varadhan
d5566fd72e rtnetlink: RTEXT_FILTER_SKIP_STATS support to avoid dumping inet/inet6 stats
Many commonly used functions like getifaddrs() invoke RTM_GETLINK
to dump the interface information, and do not need the
the AF_INET6 statististics that are always returned by default
from rtnl_fill_ifinfo().

Computing the statistics can be an expensive operation that impacts
scaling, so it is desirable to avoid this if the information is
not needed.

This patch adds a the RTEXT_FILTER_SKIP_STATS extended info flag that
can be passed with netlink_request() to avoid statistics computation
for the ifinfo path.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 15:25:02 -07:00
Martin KaFai Lau
8e3d5be736 ipv6: Avoid double dst_free
It is a prep work to get dst freeing from fib tree undergo
a rcu grace period.

The following is a common paradigm:
if (ip6_del_rt(rt))
	dst_free(rt)

which means, if rt cannot be deleted from the fib tree, dst_free(rt) now.
1. We don't know the ip6_del_rt(rt) failure is because it
   was not managed by fib tree (e.g. DST_NOCACHE) or it had already been
   removed from the fib tree.
2. If rt had been managed by the fib tree, ip6_del_rt(rt) failure means
   dst_free(rt) has been called already.  A second
   dst_free(rt) is not always obviously safe.  The rt may have
   been destroyed already.
3. If rt is a DST_NOCACHE, dst_free(rt) should not be called.
4. It is a stopper to make dst freeing from fib tree undergo a
   rcu grace period.

This patch is to use a DST_NOCACHE flag to indicate a rt is
not managed by the fib tree.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:05 -07:00
Linus Torvalds
26d2177e97 Changes for 4.3
- Create drivers/staging/rdma
 - Move amso1100 driver to staging/rdma and schedule for deletion
 - Move ipath driver to staging/rdma and schedule for deletion
 - Add hfi1 driver to staging/rdma and set TODO for move to regular tree
 - Initial support for namespaces to be used on RDMA devices
 - Add RoCE GID table handling to the RDMA core caching code
 - Infrastructure to support handling of devices with differing
   read and write scatter gather capabilities
 - Various iSER updates
 - Kill off unsafe usage of global mr registrations
 - Update SRP driver
 - Misc. mlx4 driver updates
 - Support for the mr_alloc verb
 - Support for a netlink interface between kernel and user space cache
   daemon to speed path record queries and route resolution
 - Ininitial support for safe hot removal of verbs devices
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV7v8wAAoJELgmozMOVy/d2dcP/3PXnGFPgFGJODKE6VCZtTvj
 nooNXRKXjxv470UT5DiAX7SNcBxzzS7Zl/Lj+831H9iNXUyzuH31KtBOAZ3W03vZ
 yXwCB2caOStSldTRSUUvPe2aIFPnyNmSpC4i6XcJLJMCFijKmxin5pAo8qE44BQU
 yjhT+wC9P6LL5wZXsn/nFIMLjOFfu0WBFHNp3gs5j59paxlx5VeIAZk16aQZH135
 m7YCyicwrS8iyWQl2bEXRMon2vlCHlX2RHmOJ4f/P5I0quNcGF2+d8Yxa+K1VyC5
 zcb3OBezz+wZtvh16yhsDfSPqHWirljwID2VzOgRSzTJWvQjju8VkwHtkq6bYoBW
 egIxGCHcGWsD0R5iBXLYr/tB+BmjbDObSm0AsR4+JvSShkeVA1IpeoO+19162ixE
 n6CQnk2jCee8KXeIN4PoIKsjRSbIECM0JliWPLoIpuTuEhhpajftlSLgL5hf1dzp
 HrSy6fXmmoRj7wlTa7DnYIC3X+ffwckB8/t1zMAm2sKnIFUTjtQXF7upNiiyWk4L
 /T1QEzJ2bLQckQ9yY4v528SvBQwA4Dy1amIQB7SU8+2S//bYdUvhysWPkdKC4oOT
 WlqS5PFDCI31MvNbbM3rUbMAD8eBAR8ACw9ZpGI/Rffm5FEX5W3LoxA8gfEBRuqt
 30ZYFuW8evTL+YQcaV65
 =EHLg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull inifiniband/rdma updates from Doug Ledford:
 "This is a fairly sizeable set of changes.  I've put them through a
  decent amount of testing prior to sending the pull request due to
  that.

  There are still a few fixups that I know are coming, but I wanted to
  go ahead and get the big, sizable chunk into your hands sooner rather
  than waiting for those last few fixups.

  Of note is the fact that this creates what is intended to be a
  temporary area in the drivers/staging tree specifically for some
  cleanups and additions that are coming for the RDMA stack.  We
  deprecated two drivers (ipath and amso1100) and are waiting to hear
  back if we can deprecate another one (ehca).  We also put Intel's new
  hfi1 driver into this area because it needs to be refactored and a
  transfer library created out of the factored out code, and then it and
  the qib driver and the soft-roce driver should all be modified to use
  that library.

  I expect drivers/staging/rdma to be around for three or four kernel
  releases and then to go away as all of the work is completed and final
  deletions of deprecated drivers are done.

  Summary of changes for 4.3:

   - Create drivers/staging/rdma
   - Move amso1100 driver to staging/rdma and schedule for deletion
   - Move ipath driver to staging/rdma and schedule for deletion
   - Add hfi1 driver to staging/rdma and set TODO for move to regular
     tree
   - Initial support for namespaces to be used on RDMA devices
   - Add RoCE GID table handling to the RDMA core caching code
   - Infrastructure to support handling of devices with differing read
     and write scatter gather capabilities
   - Various iSER updates
   - Kill off unsafe usage of global mr registrations
   - Update SRP driver
   - Misc  mlx4 driver updates
   - Support for the mr_alloc verb
   - Support for a netlink interface between kernel and user space cache
     daemon to speed path record queries and route resolution
   - Ininitial support for safe hot removal of verbs devices"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (136 commits)
  IB/ipoib: Suppress warning for send only join failures
  IB/ipoib: Clean up send-only multicast joins
  IB/srp: Fix possible protection fault
  IB/core: Move SM class defines from ib_mad.h to ib_smi.h
  IB/core: Remove unnecessary defines from ib_mad.h
  IB/hfi1: Add PSM2 user space header to header_install
  IB/hfi1: Add CSRs for CONFIG_SDMA_VERBOSITY
  mlx5: Fix incorrect wc pkey_index assignment for GSI messages
  IB/mlx5: avoid destroying a NULL mr in reg_user_mr error flow
  IB/uverbs: reject invalid or unknown opcodes
  IB/cxgb4: Fix if statement in pick_local_ip6adddrs
  IB/sa: Fix rdma netlink message flags
  IB/ucma: HW Device hot-removal support
  IB/mlx4_ib: Disassociate support
  IB/uverbs: Enable device removal when there are active user space applications
  IB/uverbs: Explicitly pass ib_dev to uverbs commands
  IB/uverbs: Fix race between ib_uverbs_open and remove_one
  IB/uverbs: Fix reference counting usage of event files
  IB/core: Make ib_dealloc_pd return void
  IB/srp: Create an insecure all physical rkey only if needed
  ...
2015-09-09 08:33:31 -07:00
Raghavendra K T
a3a773726c net: Optimize snmp stat aggregation by walking all the percpu data at once
Docker container creation linearly increased from around 1.6 sec to 7.5 sec
(at 1000 containers) and perf data showed 50% ovehead in snmp_fold_field.

reason: currently __snmp6_fill_stats64 calls snmp_fold_field that walks
through per cpu data of an item (iteratively for around 36 items).

idea: This patch tries to aggregate the statistics by going through
all the items of each cpu sequentially which is reducing cache
misses.

Docker creation got faster by more than 2x after the patch.

Result:
                       Before           After
Docker creation time   6.836s           3.25s
cache miss             2.7%             1.41%

perf before:
    50.73%  docker           [kernel.kallsyms]       [k] snmp_fold_field
     9.07%  swapper          [kernel.kallsyms]       [k] snooze_loop
     3.49%  docker           [kernel.kallsyms]       [k] veth_stats_one
     2.85%  swapper          [kernel.kallsyms]       [k] _raw_spin_lock

perf after:
    10.57%  docker           docker                [.] scanblock
     8.37%  swapper          [kernel.kallsyms]     [k] snooze_loop
     6.91%  docker           [kernel.kallsyms]     [k] snmp_get_cpu_field
     6.67%  docker           [kernel.kallsyms]     [k] veth_stats_one

changes/ideas suggested:
Using buffer in stack (Eric), Usage of memset (David), Using memcpy in
place of unaligned_put (Joe).

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-30 21:48:58 -07:00
Matan Barak
399e6f9581 net/ipv6: Export addrconf_ifid_eui48
For loopback purposes, RoCE devices should have a default GID in the
port GID table, even when the interface is down. In order to do so,
we use the IPv6 link local address which would have been genenrated
for the related Ethernet netdevice when it goes up as a default GID.

addrconf_ifid_eui48 is used to gernerate this address, export it.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:08:49 -04:00
Jiri Benc
ab450605b3 ipv6: ndisc: inherit metadata dst when creating ndisc requests
If output device wants to see the dst, inherit the dst of the original skb
in the ndisc request.

This is an IPv6 counterpart of commit 0accfc268f ("arp: Inherit metadata
dst when creating ARP requests").

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20 15:42:37 -07:00
Andy Gospodarek
0344338bd8 net: addr IFLA_OPERSTATE to netlink message for ipv6 ifinfo
This is useful information to include in ipv6 netlink messages that
report interface information.  IFLA_OPERSTATE is already included in
ipv4 messages, but missing for ipv6.  This closes that gap.

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 22:35:30 -07:00
Andy Gospodarek
35103d1117 net: ipv6 sysctl option to ignore routes when nexthop link is down
Like the ipv4 patch with a similar title, this adds a sysctl to allow
the user to change routing behavior based on whether or not the
interface associated with the nexthop was an up or down link.  The
default setting preserves the current behavior, but anyone that enables
it will notice that nexthops on down interfaces will no longer be
selected:

net.ipv6.conf.all.ignore_routes_with_linkdown = 0
net.ipv6.conf.default.ignore_routes_with_linkdown = 0
net.ipv6.conf.lo.ignore_routes_with_linkdown = 0
...

When the above sysctls are set, not only will link status be reported to
userspace, but an indication that a nexthop is dead and will not be used
is also reported.

1000::/8 via 7000::2 dev p7p1  metric 1024 dead linkdown  pref medium
1000::/8 via 8000::2 dev p8p1  metric 1024  pref medium
7000::/8 dev p7p1  proto kernel  metric 256 dead linkdown  pref medium
8000::/8 dev p8p1  proto kernel  metric 256  pref medium
9000::/8 via 8000::2 dev p8p1  metric 2048  pref medium
9000::/8 via 7000::2 dev p7p1  metric 1024 dead linkdown  pref medium
fe80::/64 dev p7p1  proto kernel  metric 256 dead linkdown  pref medium
fe80::/64 dev p8p1  proto kernel  metric 256  pref medium

This also adds devconf support and notification when sysctl values
change.

v2: drop use of rt6i_nhflags since it is not needed right now

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 21:27:19 -07:00
Hangbin Liu
8013d1d7ea net/ipv6: add sysctl option accept_ra_min_hop_limit
Commit 6fd99094de ("ipv6: Don't reduce hop limit for an interface")
disabled accept hop limit from RA if it is smaller than the current hop
limit for security stuff. But this behavior kind of break the RFC definition.

RFC 4861, 6.3.4.  Processing Received Router Advertisements
   A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time,
   and Retrans Timer) may contain a value denoting that it is
   unspecified.  In such cases, the parameter should be ignored and the
   host should continue using whatever value it is already using.

   If the received Cur Hop Limit value is non-zero, the host SHOULD set
   its CurHopLimit variable to the received value.

So add sysctl option accept_ra_min_hop_limit to let user choose the minimum
hop limit value they can accept from RA. And set default to 1 to meet RFC
standards.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-30 15:56:40 -07:00
Erik Kline
3985e8a361 ipv6: sysctl to restrict candidate source addresses
Per RFC 6724, section 4, "Candidate Source Addresses":

    It is RECOMMENDED that the candidate source addresses be the set
    of unicast addresses assigned to the interface that will be used
    to send to the destination (the "outgoing" interface).

Add a sysctl to enable this behaviour.

Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-22 10:54:11 -07:00
YOSHIFUJI Hideaki
c15df306fc ipv6: Remove unused arguments for __ipv6_dev_get_saddr().
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-16 01:00:56 -07:00
YOSHIFUJI Hideaki/吉藤英明
c0b8da1e76 ipv6: Fix finding best source address in ipv6_dev_get_saddr().
Commit 9131f3de2 ("ipv6: Do not iterate over all interfaces when
finding source address on specific interface.") did not properly
update best source address available.  Plus, it introduced
possible NULL pointer dereference.

Bug was reported by Erik Kline <ek@google.com>.
Based on patch proposed by Hajime Tazaki <thehajime@gmail.com>.

Fixes: 9131f3de24 ("ipv6: Do not
	iterate over all interfaces when finding source address
	on specific interface.")
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Acked-by: Hajime Tazaki <thehajime@gmail.com>
Acked-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-15 21:06:13 -07:00
YOSHIFUJI Hideaki/吉藤英明
9131f3de24 ipv6: Do not iterate over all interfaces when finding source address on specific interface.
If outgoing interface is specified and the candidate address is
restricted to the outgoing interface, it is enough to iterate
over that given interface only.

Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Acked-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-10 23:19:25 -07:00
Martin KaFai Lau
1f56a01f4e ipv6: Consider RTF_CACHE when searching the fib6 tree
It is a prep work for the later bug-fix patch which will stop /128 route
from disappearing after pmtu update.

The later bug-fix patch will allow a /128 route and its RTF_CACHE clone
both exist at the same fib6_node.  To do this, we need to prepare the
existing fib6 tree search to expect RTF_CACHE for /128 route.

Note that the fn->leaf is sorted by rt6i_metric.  Hence,
RTF_CACHE (if there is any) is always at the front.  This property
leads to the following:

1. When doing ip6_route_del(), it should honor the RTF_CACHE flag which
   the caller is used to ask for deleting clone or non-clone.
   The rtm_to_fib6_config() should also check the RTM_F_CLONED and
   then set RTF_CACHE accordingly so that:
   - 'ip -6 r del...' will make ip6_route_del() to delete a route
     and all its clones. Note that its clones is flushed by fib6_del()
   - 'ip -6 r flush table cache' will make ip6_route_del() to
      only delete clone(s).

2. Exclude RTF_CACHE from addrconf_get_prefix_route() which
   should not configure on a cloned route.

3. No change is need for rt6_device_match() since it currently could
   return a RTF_CACHE clone route, so the later bug-fix patch will not
   affect it.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-01 20:57:06 -04:00
Nicolas Dichtel
a54acb3a6f dev: introduce dev_get_iflink()
The goal of this patch is to prepare the removal of the iflink field. It
introduces a new ndo function, which will be implemented by virtual interfaces.

There is no functional change into this patch. All readers of iflink field
now call dev_get_iflink().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:04:59 -04:00
Jiri Benc
930345ea63 netlink: implement nla_put_in_addr and nla_put_in6_addr
IP addresses are often stored in netlink attributes. Add generic functions
to do that.

For nla_put_in_addr, it would be nicer to pass struct in_addr but this is
not used universally throughout the kernel, in way too many places __be32 is
used to store IPv4 address.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 13:58:35 -04:00
Ian Morris
63159f29be ipv6: coding style: comparison for equality with NULL
The ipv6 code uses a mixture of coding styles. In some instances check for NULL
pointer is done as x == NULL and sometimes as !x. !x is preferred according to
checkpatch and this patch makes the code consistent by adopting the latter
form.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 13:51:54 -04:00
Hannes Frederic Sowa
ff40217e73 ipv6: fix sparse warnings in privacy stable addresses generation
Those warnings reported by sparse endianness check (via kbuild test robot)
are harmless, nevertheless fix them up and make the code a little bit
easier to read.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 622c81d57b ("ipv6: generation of stable privacy addresses for link-local and autoconf")
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-24 15:21:35 -04:00
Hannes Frederic Sowa
1855b7c3e8 ipv6: introduce idgen_delay and idgen_retries knobs
This is specified by RFC 7217.

Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:12:09 -04:00