Commit graph

217 commits

Author SHA1 Message Date
Greg Kroah-Hartman
4b2d6badbc This is the 4.4.144 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltYMlwACgkQONu9yGCS
 aT5ZmxAAjAWUndXt7fTUyHgxkoG61sEkdX4jcsp6NFwQMudU0UHx4/kcZE+HdMjL
 VU8BZtdUg+jMLXM4erVBpQRKY9YHIPi8nWMTm1UjduMCxVD6dVL1HU6/RXl1cYIx
 rf/opYOimqT9lYCeffmd9ai2zEEJKSt7/avddcJY4qHiqLan27gbUdAq2H26aM/5
 LUzAaSBzhq3VYo9Q5zv03b1+tORAxh2BIffZjGEFe8SQQl1o63WqwV4RxEhV/Bjt
 hBgl/6B/+EHtQnYnbnoOT/an9Ma15ik4/z3vVv6yRLNK+hS5T31OKcYCsUrjp6O+
 TQVaVLWWmn/VpIHAMkrhBs9Xxg5GmRziF77AkzyC506tK268M2+IoY77ursVl1YK
 STaOwUcLUlKLbl5OADqMpYtNU9ybkP+MmgDZsIEXz9UiCZM721fL5Au2PHuzaYOD
 2nE2EQb04It4k9GN8FStv2KPIiKUCEXi9MlNsHGPs6Mc+fliIigoKPhpU5JG+sxR
 eJgPMNv4OWhwXWTd1wf0Gy5X+i0lQlwlGgIHFfSB8vzArJ0Y/yuPj2a6xhQshOza
 Ivq7JudHvxYxhDSWYoCKgtTgzMdSBbJ3xjOoUUHy4ryamYeyaMvgFjsaCTMr0dsw
 76BkgNTbpsip+I77a9h4Ozlk5QE7h61EsqjmZBkGVqLYjrUQ/IU=
 =X4tZ
 -----END PGP SIGNATURE-----

Merge 4.4.144 into android-4.4

Changes in 4.4.144
	KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.
	x86/MCE: Remove min interval polling limitation
	fat: fix memory allocation failure handling of match_strdup()
	ALSA: rawmidi: Change resized buffers atomically
	ARC: Fix CONFIG_SWAP
	ARC: mm: allow mprotect to make stack mappings executable
	mm: memcg: fix use after free in mem_cgroup_iter()
	ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns
	ipv6: fix useless rol32 call on hash
	lib/rhashtable: consider param->min_size when setting initial table size
	net/ipv4: Set oif in fib_compute_spec_dst
	net: phy: fix flag masking in __set_phy_supported
	ptp: fix missing break in switch
	tg3: Add higher cpu clock for 5762.
	net: Don't copy pfmemalloc flag in __copy_skb_header()
	skbuff: Unconditionally copy pfmemalloc in __skb_clone()
	xhci: Fix perceived dead host due to runtime suspend race with event handler
	x86/paravirt: Make native_save_fl() extern inline
	x86/cpufeatures: Add CPUID_7_EDX CPUID leaf
	x86/cpufeatures: Add Intel feature bits for Speculation Control
	x86/cpufeatures: Add AMD feature bits for Speculation Control
	x86/msr: Add definitions for new speculation control MSRs
	x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
	x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
	x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support
	x86/cpufeatures: Clean up Spectre v2 related CPUID flags
	x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
	x86/pti: Mark constant arrays as __initconst
	x86/asm/entry/32: Simplify pushes of zeroed pt_regs->REGs
	x86/entry/64/compat: Clear registers for compat syscalls, to reduce speculation attack surface
	x86/speculation: Update Speculation Control microcode blacklist
	x86/speculation: Correct Speculation Control microcode blacklist again
	x86/speculation: Clean up various Spectre related details
	x86/speculation: Fix up array_index_nospec_mask() asm constraint
	x86/speculation: Add <asm/msr-index.h> dependency
	x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
	x86/mm: Factor out LDT init from context init
	x86/mm: Give each mm TLB flush generation a unique ID
	x86/speculation: Use Indirect Branch Prediction Barrier in context switch
	x86/spectre_v2: Don't check microcode versions when running under hypervisors
	x86/speculation: Use IBRS if available before calling into firmware
	x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
	x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
	selftest/seccomp: Fix the flag name SECCOMP_FILTER_FLAG_TSYNC
	selftest/seccomp: Fix the seccomp(2) signature
	xen: set cpu capabilities from xen_start_kernel()
	x86/amd: don't set X86_BUG_SYSRET_SS_ATTRS when running under Xen
	x86/nospec: Simplify alternative_msr_write()
	x86/bugs: Concentrate bug detection into a separate function
	x86/bugs: Concentrate bug reporting into a separate function
	x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
	x86/bugs, KVM: Support the combination of guest and host IBRS
	x86/cpu: Rename Merrifield2 to Moorefield
	x86/cpu/intel: Add Knights Mill to Intel family
	x86/bugs: Expose /sys/../spec_store_bypass
	x86/cpufeatures: Add X86_FEATURE_RDS
	x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
	x86/bugs/intel: Set proper CPU features and setup RDS
	x86/bugs: Whitelist allowed SPEC_CTRL MSR values
	x86/bugs/AMD: Add support to disable RDS on Fam[15, 16, 17]h if requested
	x86/speculation: Create spec-ctrl.h to avoid include hell
	prctl: Add speculation control prctls
	x86/process: Optimize TIF checks in __switch_to_xtra()
	x86/process: Correct and optimize TIF_BLOCKSTEP switch
	x86/process: Optimize TIF_NOTSC switch
	x86/process: Allow runtime control of Speculative Store Bypass
	x86/speculation: Add prctl for Speculative Store Bypass mitigation
	nospec: Allow getting/setting on non-current task
	proc: Provide details on speculation flaw mitigations
	seccomp: Enable speculation flaw mitigations
	prctl: Add force disable speculation
	seccomp: Use PR_SPEC_FORCE_DISABLE
	seccomp: Add filter flag to opt-out of SSB mitigation
	seccomp: Move speculation migitation control to arch code
	x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass
	x86/bugs: Rename _RDS to _SSBD
	proc: Use underscores for SSBD in 'status'
	Documentation/spec_ctrl: Do some minor cleanups
	x86/bugs: Fix __ssb_select_mitigation() return type
	x86/bugs: Make cpu_show_common() static
	x86/bugs: Fix the parameters alignment and missing void
	x86/cpu: Make alternative_msr_write work for 32-bit code
	x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
	x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
	x86/cpufeatures: Disentangle SSBD enumeration
	x86/cpu/AMD: Fix erratum 1076 (CPB bit)
	x86/cpufeatures: Add FEATURE_ZEN
	x86/speculation: Handle HT correctly on AMD
	x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
	x86/speculation: Add virtualized speculative store bypass disable support
	x86/speculation: Rework speculative_store_bypass_update()
	x86/bugs: Unify x86_spec_ctrl_{set_guest, restore_host}
	x86/bugs: Expose x86_spec_ctrl_base directly
	x86/bugs: Remove x86_spec_ctrl_set()
	x86/bugs: Rework spec_ctrl base and mask logic
	x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
	x86/bugs: Rename SSBD_NO to SSB_NO
	x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
	x86/cpu: Re-apply forced caps every time CPU caps are re-read
	block: do not use interruptible wait anywhere
	clk: tegra: Fix PLL_U post divider and initial rate on Tegra30
	ubi: Introduce vol_ignored()
	ubi: Rework Fastmap attach base code
	ubi: Be more paranoid while seaching for the most recent Fastmap
	ubi: Fix races around ubi_refill_pools()
	ubi: Fix Fastmap's update_vol()
	ubi: fastmap: Erase outdated anchor PEBs during attach
	Linux 4.4.144

Change-Id: Ia3e9b2b7bc653cba68b76878d34f8fcbbc007a13
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-31 20:18:19 +02:00
David Ahern
be64f9f7a2 net/ipv4: Set oif in fib_compute_spec_dst
[ Upstream commit e7372197e15856ec4ee66b668020a662994db103 ]

Xin reported that ICMP replies may not use the address on the device on
which the echo request was received if the destination address is
broadcast. Instead, a route lookup is done without considering the VRF
context. Fix this by setting oif in the flow struct to the master device
if the receiving device is enslaved, which directs the lookup to the VRF
table. If the device is not enslaved, oif is still 0, so there is no
effect.

Fixes: cd2fbe1b6b ("net: Use VRF device index for lookups on RX")
Reported-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25 10:18:16 +02:00
Greg Kroah-Hartman
8cbe01c651 This is the 4.4.109 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlpL3okACgkQONu9yGCS
 aT6p5g/8CAG9NU/fLu7IMcIlyqfVvdOhzxn44oHCxq08eycqoggdnb3TZXxBUBgY
 +w8uZk8yxNdjXR39GjkMSUy06WRvl2XDSrd36sDGRCBP62Fi8l5scmlRaNEnI/E8
 ltBSB93P16SmnpKa/3Zscz+7LcaoXHpU5Xhs8Zmf4I69qmzOFX2qSKsUyzVT+gNI
 ZoSN/mYuXf7+dzrcKhVdYzm4ZdMRvxdT0WefeoeZMekfAtU9D8zaFOA9jTIAMHSZ
 adNn18s7UKmaipZf/01mW9srvZce4nPKiUC8WVGstiyl27ws+IDleKVmDnqFALjy
 2LIxDvjDth/x8jfqTb7F6bFh6dVtMJjwUmd3KL7hgPuTddoQQe/GfKnjSHkbNxyR
 qNxNtbOgQ2EVOf59fejxWshCP/fButNo8uvCI1ERdm4axGXcf9hiucdlwzCYezHs
 UN0xrxAXprhqTq4hQFB9E4C49e8nMPNsyXTMZwSZRPe2z53spD53JR/0sl5Z2RWe
 ueO21tBZ6ev9jPNi+lJrCVw1oBO+PKOmdNPAaSynUVm96grRnW6grUI3mX9FqMXb
 r62UWG3YCWWBgxA3iQQrMxf/3S2YZXz59TBbp9GU8xOYJZLhKL29/iB7Rv4ANtkR
 aMDrABjWqrCZpIazqkZ5uwbsNl6Q51e3Mji3EfwkBaMqjc41++I=
 =B52+
 -----END PGP SIGNATURE-----

Merge 4.4.109 into android-4.4

Changes in 4.4.109
	ACPI: APEI / ERST: Fix missing error handling in erst_reader()
	crypto: mcryptd - protect the per-CPU queue with a lock
	mfd: cros ec: spi: Don't send first message too soon
	mfd: twl4030-audio: Fix sibling-node lookup
	mfd: twl6040: Fix child-node lookup
	ALSA: rawmidi: Avoid racy info ioctl via ctl device
	ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU
	PCI / PM: Force devices to D0 in pci_pm_thaw_noirq()
	parisc: Hide Diva-built-in serial aux and graphics card
	spi: xilinx: Detect stall with Unknown commands
	KVM: X86: Fix load RFLAGS w/o the fixed bit
	kvm: x86: fix RSM when PCID is non-zero
	powerpc/perf: Dereference BHRB entries safely
	net: mvneta: clear interface link status on port disable
	tracing: Remove extra zeroing out of the ring buffer page
	tracing: Fix possible double free on failure of allocating trace buffer
	tracing: Fix crash when it fails to alloc ring buffer
	ring-buffer: Mask out the info bits when returning buffer page length
	iw_cxgb4: Only validate the MSN for successful completions
	ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
	ASoC: twl4030: fix child-node lookup
	ALSA: hda: Drop useless WARN_ON()
	ALSA: hda - fix headset mic detection issue on a Dell machine
	x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
	x86/mm: Remove flush_tlb() and flush_tlb_current_task()
	x86/mm: Make flush_tlb_mm_range() more predictable
	x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
	x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code
	x86/mm: Disable PCID on 32-bit kernels
	x86/mm: Add the 'nopcid' boot option to turn off PCID
	x86/mm: Enable CR4.PCIDE on supported systems
	x86/mm/64: Fix reboot interaction with CR4.PCIDE
	kbuild: add '-fno-stack-check' to kernel build options
	ipv4: igmp: guard against silly MTU values
	ipv6: mcast: better catch silly mtu values
	net: igmp: Use correct source address on IGMPv3 reports
	netlink: Add netns check on taps
	net: qmi_wwan: add Sierra EM7565 1199:9091
	net: reevalulate autoflowlabel setting after sysctl setting
	tcp md5sig: Use skb's saddr when replying to an incoming segment
	tg3: Fix rx hang on MTU change with 5717/5719
	net: ipv4: fix for a race condition in raw_sendmsg
	net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case
	sctp: Replace use of sockets_allocated with specified macro.
	ipv4: Fix use-after-free when flushing FIB tables
	net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
	net: Fix double free and memory corruption in get_net_ns_by_id()
	net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
	sock: free skb in skb_complete_tx_timestamp on error
	usbip: fix usbip bind writing random string after command in match_busid
	usbip: stub: stop printing kernel pointer addresses in messages
	usbip: vhci: stop printing kernel pointer addresses in messages
	USB: serial: ftdi_sio: add id for Airbus DS P8GR
	USB: serial: qcserial: add Sierra Wireless EM7565
	USB: serial: option: add support for Telit ME910 PID 0x1101
	USB: serial: option: adding support for YUGA CLM920-NC5
	usb: Add device quirk for Logitech HD Pro Webcam C925e
	usb: add RESET_RESUME for ELSA MicroLink 56K
	USB: Fix off by one in type-specific length check of BOS SSP capability
	usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
	nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
	x86/smpboot: Remove stale TLB flush invocations
	n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
	mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP
	Linux 4.4.109

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-01-02 20:58:26 +01:00
Ido Schimmel
169a9861c6 ipv4: Fix use-after-free when flushing FIB tables
[ Upstream commit b4681c2829e24943aadd1a7bb3a30d41d0a20050 ]

Since commit 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse") the
local table uses the same trie allocated for the main table when custom
rules are not in use.

When a net namespace is dismantled, the main table is flushed and freed
(via an RCU callback) before the local table. In case the callback is
invoked before the local table is iterated, a use-after-free can occur.

Fix this by iterating over the FIB tables in reverse order, so that the
main table is always freed after the local table.
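
The ordering argument can be illustrated with a small userspace model. This
is a sketch under assumptions, not the kernel code: the toy_* names are
hypothetical, slot 0 is assumed to hold the main table and slot 1 the local
table, and the release is modeled as immediate, whereas the commit describes
it as an RCU callback that may or may not run before the local table is
iterated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a trie that two tables may share. */
struct toy_trie {
    int alive; /* 1 until the owning table releases it */
};

struct toy_table {
    struct toy_trie *trie;
    int owns_trie; /* nonzero if tearing down this table releases the trie */
};

/*
 * Walk the tables forward or in reverse and count how many were visited
 * after their trie had already been released. In real code each such
 * visit would be a use-after-free.
 */
int toy_teardown(struct toy_table *tables, int count, int reverse)
{
    int stale_visits = 0;

    assert(tables != NULL && count >= 0);
    for (int n = 0; n < count; n++) {
        int i = reverse ? count - 1 - n : n;
        struct toy_table *tb = &tables[i];

        if (!tb->trie->alive)
            stale_visits++;
        if (tb->owns_trie)
            tb->trie->alive = 0; /* models the release, immediate here */
    }
    return stale_visits;
}
```

With the owning (main) table in slot 0, a forward walk reaches the local
table only after the shared trie is gone, while a reverse walk visits the
local table first and never touches released storage.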

v3: Reworded comment according to Alex's suggestion.
v2: Add a comment to make the fix more explicit per Dave's and Alex's
feedback.

Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-02 20:33:26 +01:00
Greg Kroah-Hartman
dfff30bca9 This is the 4.4.81 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlmN2eMACgkQONu9yGCS
 aT7zCQ//eDgCF9YJnE1v8/JJ0yl2uK7XjVrF/tpPvzgTgszu4En4kGfhUO+WvmkU
 0/pqYBMAPZEbfmx+6q8FJx/MHDjFA1oKb+a9pS1RUovzWDLQoRxYwiBtR2osmuOE
 f1fbDMt9ETDUxUGLhRJ/vuzeIjmouhPkz5vZAg863+sKYYjPHlczymcgMs0sRMsE
 3kkgo6mhCKTLt8gvioSUjeVWs4a5y3unvImhSLjEHjcfydlDLwA8RuFdFwBIgNfP
 yPrgW3v5l9HHXI1lWMcOCTpVeDI272sKNOppYg4r2N/I/epBN79j7jGrqGQpG8NP
 mKOkgRDoR7ifyKLSS55R8anLyNoi4jfQAHbOxlSVGymwpd9kRuHoeTE5+IqYs+V5
 qLkqLz63hmbfRQuW6az6L+SGVwgj3DSHakGQFkB0ouB8h5ubU2OqINxOsaNABbHD
 C1Q9giqG8b2MEv5D4O4m7BhK1tDzSJmT2tb9UG+UV8LJn1PhFSnSMkjP4S7trZl+
 +8myxdoNVvDMpd23UqM7o1fuYalbslTKED9el31FimOaNF79+tzyjnNbWA6zqX+X
 U3I+Pp2FafOS2heTLTX59fz09LKRI+iP3pnlCBpp1a+MKAIEbjeW8YB5zTKrSNOv
 RkZ+1qIQtmGyhVp/YDsua5J1lhZVXeLeoEqDXYerELOdGKF30jw=
 =pHqB
 -----END PGP SIGNATURE-----

Merge 4.4.81 into android-4.4

Changes in 4.4.81
	libata: array underflow in ata_find_dev()
	workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
	ALSA: hda - Fix speaker output from VAIO VPCL14M1R
	ASoC: do not close shared backend dailink
	KVM: async_pf: make rcu irq exit if not triggered from idle task
	mm/page_alloc: Remove kernel address exposure in free_reserved_area()
	ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
	ext4: fix overflow caused by missing cast in ext4_resize_fs()
	ARM: dts: armada-38x: Fix irq type for pca955
	media: platform: davinci: return -EINVAL for VPFE_CMD_S_CCDC_RAW_PARAMS ioctl
	target: Avoid mappedlun symlink creation during lun shutdown
	iscsi-target: Always wait for kthread_should_stop() before kthread exit
	iscsi-target: Fix early sk_data_ready LOGIN_FLAGS_READY race
	iscsi-target: Fix initial login PDU asynchronous socket close OOPs
	iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP
	iser-target: Avoid isert_conn->cm_id dereference in isert_login_recv_done
	mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
	media: lirc: LIRC_GET_REC_RESOLUTION should return microseconds
	f2fs: sanity check checkpoint segno and blkoff
	drm: rcar-du: fix backport bug
	saa7164: fix double fetch PCIe access condition
	ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check()
	net: Zero terminate ifr_name in dev_ifname().
	ipv6: avoid overflow of offset in ip6_find_1stfragopt
	ipv4: initialize fib_trie prior to register_netdev_notifier call.
	rtnetlink: allocate more memory for dev_set_mac_address()
	mcs7780: Fix initialization when CONFIG_VMAP_STACK is enabled
	openvswitch: fix potential out of bound access in parse_ct
	packet: fix use-after-free in prb_retire_rx_blk_timer_expired()
	ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment()
	net: ethernet: nb8800: Handle all 4 RGMII modes identically
	dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly
	dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly
	dccp: fix a memleak for dccp_feat_init err process
	sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()
	sctp: fix the check for _sctp_walk_params and _sctp_walk_errors
	net/mlx5: Fix command bad flow on command entry allocation failure
	net: phy: Correctly process PHY_HALTED in phy_stop_machine()
	net: phy: Fix PHY unbind crash
	xen-netback: correctly schedule rate-limited queues
	sparc64: Measure receiver forward progress to avoid send mondo timeout
	wext: handle NULL extra data in iwe_stream_add_point better
	sh_eth: R8A7740 supports packet shecksumming
	net: phy: dp83867: fix irq generation
	tg3: Fix race condition in tg3_get_stats64().
	x86/boot: Add missing declaration of string functions
	phy state machine: failsafe leave invalid RUNNING state
	scsi: qla2xxx: Get mutex lock before checking optrom_state
	drm/virtio: fix framebuffer sparse warning
	virtio_blk: fix panic in initialization error path
	ARM: 8632/1: ftrace: fix syscall name matching
	mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER
	lib/Kconfig.debug: fix frv build failure
	signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
	mm: don't dereference struct page fields of invalid pages
	ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
	net: account for current skb length when deciding about UFO
	workqueue: implicit ordered attribute should be overridable
	Linux 4.4.81

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-08-11 13:09:21 -07:00
Mahesh Bandewar
31afa8b5ed ipv4: initialize fib_trie prior to register_netdev_notifier call.
[ Upstream commit 8799a221f5944a7d74516ecf46d58c28ec1d1f75 ]

Net stack initialization currently initializes fib_trie after the first
call to register_netdevice_notifier(). In fact, fib_trie initialization
needs to happen before the first rtnl_register(). This does not currently
cause a problem because no devices are UP at this point, but bringing 'lo'
UP during initialization would break that assumption and expose the issue.

This fixes the following crash:

 Call Trace:
  ? alternate_node_alloc+0x76/0xa0
  fib_table_insert+0x1b7/0x4b0
  fib_magic.isra.17+0xea/0x120
  fib_add_ifaddr+0x7b/0x190
  fib_netdev_event+0xc0/0x130
  register_netdevice_notifier+0x1c1/0x1d0
  ip_fib_init+0x72/0x85
  ip_rt_init+0x187/0x1e9
  ip_init+0xe/0x1a
  inet_init+0x171/0x26c
  ? ipv4_offload_init+0x66/0x66
  do_one_initcall+0x43/0x160
  kernel_init_freeable+0x191/0x219
  ? rest_init+0x80/0x80
  kernel_init+0xe/0x150
  ret_from_fork+0x22/0x30
 Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
 RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
 CR2: 0000000000000014

Fixes: 7b1a74fdbb ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
Fixes: 7f9b80529b ("[IPV4]: fib hash|trie initialization")

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-11 09:08:52 -07:00
Greg Kroah-Hartman
6fc0573f6d This is the 4.4.71 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlk30BwACgkQONu9yGCS
 aT5cmhAAh3etTuZ3xRw2eGW/Y/C8L2F2CjJjmR4vp1ms8P55uZg3xA20r5jNj7Ho
 pwag3WTNzHpVfKFApavfEzToqDszRAtXcvYPPW9uXUPeu8LWyBJyvmN7lSQVKgDc
 M9SWsd+8EGceopaj8KHjLMxNsV2n8j2ckxNf/BL/KgiMtJlgp/1TCDKUVS1k0cA7
 CsuxDhxpRYpQofsIVww1hdrwCxVuntAY7u+/3B19ozXGFSRe/h5GO6xYRcG8pqfT
 lvIgD6btdQJwI55QoSpJCpL96a534zc+akO0dtyaMJ3Q8UWQXD3JF8ZxMiPPrAe8
 CLW390ATranIafmLi9g9DU1vQeEPNFXpeiYfxe65YL7igeAj/uPtVzKp0MvRcKG7
 IBVNxbtsTQa73ig7gKSJ323CnpEfrr/XG73JNVtUQLxHa2poY7SUonRI587MFW2T
 sONl9Pk3TxRC7Rc45si4RFsIj4jEF8ubUDXOPb2CrmDMB7MrM0PHfOW9lLCP92FD
 pn0fM4vwNvm2ILsblqNcBumgeIBQ8ld2TBTbhRbh2FK4Rzxd2TSlWh4KqkcWcXCt
 Lz8conU06AwTvDob1xoht3m6Gj32maopKZKGn5/Wq0YlfjOB/70CXOvPO3ChhKTh
 QGNgA66bYdm+xn55wf7ty7Bq8yO6kcSNPQCXOb9S61nfCLA4KHM=
 =U7IH
 -----END PGP SIGNATURE-----

Merge 4.4.71 into android-4.4

Changes in 4.4.71
	sparc: Fix -Wstringop-overflow warning
	dccp/tcp: do not inherit mc_list from parent
	ipv6/dccp: do not inherit ipv6_mc_list from parent
	s390/qeth: handle sysfs error during initialization
	s390/qeth: unbreak OSM and OSN support
	s390/qeth: avoid null pointer dereference on OSN
	tcp: avoid fragmenting peculiar skbs in SACK
	sctp: fix src address selection if using secondary addresses for ipv6
	sctp: do not inherit ipv6_{mc|ac|fl}_list from parent
	tcp: eliminate negative reordering in tcp_clean_rtx_queue
	net: Improve handling of failures on link and route dumps
	ipv6: Prevent overrun when parsing v6 header options
	ipv6: Check ip6_find_1stfragopt() return value properly.
	bridge: netlink: check vlan_default_pvid range
	qmi_wwan: add another Lenovo EM74xx device ID
	bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
	ipv6: fix out of bound writes in __ip6_append_data()
	be2net: Fix offload features for Q-in-Q packets
	virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
	tcp: avoid fastopen API to be used on AF_UNSPEC
	sctp: fix ICMP processing if skb is non-linear
	ipv4: add reference counting to metrics
	netem: fix skb_orphan_partial()
	net: phy: marvell: Limit errata to 88m1101
	vlan: Fix tcp checksum offloads in Q-in-Q vlans
	i2c: i2c-tiny-usb: fix buffer not being DMA capable
	mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
	HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference
	scsi: mpt3sas: Force request partial completion alignment
	drm/radeon/ci: disable mclk switching for high refresh rates (v2)
	drm/radeon: Unbreak HPD handling for r600+
	pcmcia: remove left-over %Z format
	ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430
	slub/memcg: cure the brainless abuse of sysfs attributes
	drm/gma500/psb: Actually use VBT mode when it is found
	mm/migrate: fix refcount handling when !hugepage_migration_supported()
	mlock: fix mlock count can not decrease in race condition
	xfs: Fix missed holes in SEEK_HOLE implementation
	xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
	xfs: fix over-copying of getbmap parameters from userspace
	xfs: handle array index overrun in xfs_dir2_leaf_readbuf()
	xfs: prevent multi-fsb dir readahead from reading random blocks
	xfs: fix up quotacheck buffer list error handling
	xfs: support ability to wait on new inodes
	xfs: update ag iterator to support wait on new inodes
	xfs: wait on new inodes during quotaoff dquot release
	xfs: fix indlen accounting error on partial delalloc conversion
	xfs: bad assertion for delalloc an extent that start at i_size
	xfs: fix unaligned access in xfs_btree_visit_blocks
	xfs: in _attrlist_by_handle, copy the cursor back to userspace
	xfs: only return -errno or success from attr ->put_listent
	Linux 4.4.71

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-06-07 12:36:01 +02:00
David Ahern
640bfcf232 net: Improve handling of failures on link and route dumps
[ Upstream commit f6c5775ff0bfa62b072face6bf1d40f659f194b2 ]

In general, rtnetlink dumps do not anticipate failure to dump a single
object (e.g., link or route) on a single pass. As both route and link
objects have grown via more attributes, that is no longer a given.

netlink dumps can handle a failure if the dump function returns an
error; specifically, netlink_dump adds the return code to the response
if it is <= 0 so userspace is notified of the failure. The missing
piece is the rtnetlink dump functions returning the error.

Fix the route and link dump functions to return the error if no object
has been added to the skb (a non-empty skb is detected by skb->len != 0).
IPv6 route dumps (rt6_dump_route) already return the error; this patch
updates IPv4 and link dumps. Other dump functions may need to be adjusted
as well.
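
The rule described above (return the buffer length if anything was added,
otherwise propagate the error) can be sketched in userspace as follows.
This is an illustration only: toy_buf, toy_fill and toy_dump are
hypothetical names, and the buffer is modeled as a plain length and
capacity rather than a real skb.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the message buffer a dump pass fills. */
struct toy_buf {
    size_t len; /* bytes used so far */
    size_t cap; /* total capacity */
};

/* Append one object of obj_size bytes, or fail if it does not fit. */
static int toy_fill(struct toy_buf *b, size_t obj_size)
{
    if (b->len + obj_size > b->cap)
        return -EMSGSIZE;
    b->len += obj_size;
    return 0;
}

/*
 * Dump objects starting at *next. If at least one object was added,
 * return the buffer length so the caller can continue on a later pass.
 * If nothing fit at all, return the error so the caller learns about it.
 */
int toy_dump(struct toy_buf *b, const size_t *sizes, size_t count,
             size_t *next)
{
    int err = 0;

    assert(b != NULL && sizes != NULL && next != NULL);
    while (*next < count) {
        err = toy_fill(b, sizes[*next]);
        if (err < 0)
            break;
        (*next)++;
    }
    if (err < 0 && b->len == 0)
        return err; /* not even one object fit: report the failure */
    return (int)b->len;
}
```

A pass that stops partway still returns a positive length; only a pass
that cannot place a single object returns the negative error code.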

Reported-by: Jan Moskyto Matejka <mq@ucw.cz>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07 12:05:58 +02:00
Greg Kroah-Hartman
29950430ce This is the 4.4.58 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAljctYYACgkQONu9yGCS
 aT6MbxAAwyGobI5sOr63yX5Myji1jf17vlY2h5dXet+8lu/csbFKmqHxaTLNwGIw
 7u6V3AJ4zWdX8Q22dcVC98oySxcLxUhv+Rv/Dbonr3CYM00wNIex2wzON8f77eJQ
 CBGJRNJR4/VG6opbVI/qp0t/c2oFiqHJXPldm3/Ru7jcBrLo5UHDWDY6cDhrj/Tg
 F1maCBMAu1qW0z9KTnrQDvHjPHXmKfCviGzXpFTSVBQrh1s1bJkZkTqcY9eZNa/u
 AXhHek5ZLFxlhkO105leR0YtXADbopiJ5c4EgXCASzNQ92/6IKsl21eOhHgOU9OA
 YUCYftwKVMcxXGB6QFbdefLVtnCjUtlDa9+70oW1/4Ecee5FUBzNjVWVHOtYEsTY
 pA3DqQI+U7EBCuIXsTtV0DiRWhHKq5uS1aphXZwnq/8qc2A3PD86JV/MBK5sWZfB
 2V1N7xkitLFFCR6vMFLuusM8Np7kJ3zaAxQOd3IRc72iiNLkbNjdfJcAQ+E9b/Zx
 5tpcthOl2RKhlOHHVKmYIioar8+RkZgWWl64+RTt6M1KjvHs07lPdCI+4cW8ELLM
 /FUeRNTLmOiUv4dEPj5INYukEcLuCNp4fIo9lq8HrDceXbJXwiMFtHHlo4SQ/ubm
 9v5iYdEmGet8jYPrfa5LDtD7G8K//k08nElVmI6N/4fNxRC2GcY=
 =TI1/
 -----END PGP SIGNATURE-----

Merge 4.4.58 into android-4.4

Changes in 4.4.58
	net/openvswitch: Set the ipv6 source tunnel key address attribute correctly
	net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled
	net: properly release sk_frag.page
	amd-xgbe: Fix jumbo MTU processing on newer hardware
	net: unix: properly re-increment inflight counter of GC discarded candidates
	net/mlx5: Increase number of max QPs in default profile
	net/mlx5e: Count LRO packets correctly
	net: bcmgenet: remove bcmgenet_internal_phy_setup()
	ipv4: provide stronger user input validation in nl_fib_input()
	socket, bpf: fix sk_filter use after free in sk_clone_lock
	tcp: initialize icsk_ack.lrcvtime at session start time
	Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw
	Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000
	Input: iforce - validate number of endpoints before using them
	Input: ims-pcu - validate number of endpoints before using them
	Input: hanwang - validate number of endpoints before using them
	Input: yealink - validate number of endpoints before using them
	Input: cm109 - validate number of endpoints before using them
	Input: kbtab - validate number of endpoints before using them
	Input: sur40 - validate number of endpoints before using them
	ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()
	ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call
	ALSA: hda - Adding a group of pin definition to fix headset problem
	USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems
	USB: serial: qcserial: add Dell DW5811e
	ACM gadget: fix endianness in notifications
	usb: gadget: f_uvc: Fix SuperSpeed companion descriptor's wBytesPerInterval
	usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk
	USB: uss720: fix NULL-deref at probe
	USB: lvtest: fix NULL-deref at probe
	USB: idmouse: fix NULL-deref at probe
	USB: wusbcore: fix NULL-deref at probe
	usb: musb: cppi41: don't check early-TX-interrupt for Isoch transfer
	usb: hub: Fix crash after failure to read BOS descriptor
	uwb: i1480-dfu: fix NULL-deref at probe
	uwb: hwa-rc: fix NULL-deref at probe
	mmc: ushc: fix NULL-deref at probe
	iio: adc: ti_am335x_adc: fix fifo overrun recovery
	iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3
	parport: fix attempt to write duplicate procfiles
	ext4: mark inode dirty after converting inline directory
	mmc: sdhci: Do not disable interrupts while waiting for clock
	xen/acpi: upload PM state from init-domain to Xen
	iommu/vt-d: Fix NULL pointer dereference in device_to_iommu
	ARM: at91: pm: cpu_idle: switch DDR to power-down mode
	ARM: dts: at91: sama5d2: add dma properties to UART nodes
	cpufreq: Restore policy min/max limits on CPU online
	raid10: increment write counter after bio is split
	libceph: don't set weight to IN when OSD is destroyed
	xfs: don't allow di_size with high bit set
	xfs: fix up xfs_swap_extent_forks inline extent handling
	nl80211: fix dumpit error path RTNL deadlocks
	USB: usbtmc: add missing endpoint sanity check
	xfs: clear _XBF_PAGES from buffers when readahead page
	xen: do not re-use pirq number cached in pci device msi msg data
	igb: Workaround for igb i210 firmware issue
	igb: add i211 to i210 PHY workaround
	x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
	PCI: Separate VF BAR updates from standard BAR updates
	PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
	PCI: Add comments about ROM BAR updating
	PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
	PCI: Don't update VF BARs while VF memory space is enabled
	PCI: Update BARs using property bits appropriate for type
	PCI: Ignore BAR updates on virtual functions
	PCI: Do any VF BAR updates before enabling the BARs
	vfio/spapr: Postpone allocation of userspace version of TCE table
	block: allow WRITE_SAME commands with the SG_IO ioctl
	s390/zcrypt: Introduce CEX6 toleration
	uvcvideo: uvc_scan_fallback() for webcams with broken chain
	ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520
	ACPI / blacklist: Make Dell Latitude 3350 ethernet work
	serial: 8250_pci: Detach low-level driver during PCI error recovery
	fbcon: Fix vc attr at deinit
	crypto: algif_hash - avoid zero-sized array
	Linux 4.4.58

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-03-30 13:18:20 +02:00
Eric Dumazet
38dece41e5 ipv4: provide stronger user input validation in nl_fib_input()
[ Upstream commit c64c0b3cac4c5b8cb093727d2c19743ea3965c0b ]

Alexander reported a KMSAN splat caused by reads of an uninitialized
field (tb_id_in) from the user-provided struct fib_result_nl.

It turns out that the nl_fib_input() sanity tests on user input are a bit
wrong:

A user can pretend nlh->nlmsg_len is big enough, but provide a buffer
that is too small at sendmsg() time.
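
The kind of check this calls for can be sketched in userspace: the declared
length in the header must be validated against the number of bytes actually
received, not only against the expected payload size. The toy_* types and
function below are hypothetical, and the sketch ignores the alignment
padding that real netlink messages carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: total message length as claimed by the sender. */
struct toy_hdr {
    uint32_t declared_len;
};

/* Hypothetical fixed-size payload that follows the header. */
struct toy_payload {
    uint32_t fields[4];
};

/* Return 1 if the message can be read safely, 0 otherwise. */
int toy_msg_is_valid(const struct toy_hdr *hdr, size_t received_len)
{
    const size_t need = sizeof(struct toy_hdr) + sizeof(struct toy_payload);

    assert(hdr != NULL);
    if (received_len < need)
        return 0; /* the buffer itself is too small */
    if (received_len < hdr->declared_len)
        return 0; /* header claims more bytes than were sent */
    if (hdr->declared_len < need)
        return 0; /* declared length cannot hold the payload */
    return 1;
}
```

The first test is the one a sender cannot fake, because it uses the length
of the buffer that was really delivered rather than a field the sender
filled in.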

Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-30 09:35:14 +02:00
Dmitry Shmidt
f103e3b0d8 This is the 4.4.43 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlh7bhIACgkQONu9yGCS
 aT5KKRAAw7baMz//gshbaXZuZZHJjqB+rBekdnzgBMBo4P2OJwiuFi7N27dRxiaO
 6uFAB5BUlFoc16AExAnmQJIiWB8lWeAt8S20RBLaiGGQ0iPTr4W7bsVH4Tk3zEaF
 gjCt3Tv8kzbno64lWk02xDilkxFO09y3ZtiMVkleUDpI1DRm5iAF11j+C42OG1Ox
 U1QPsjCoWJyZ9Ta7SEyoQsuJcU32Wl0IW1VAroqfYAJJF5yLOxGoJQfWsiyvwEjQ
 VQg+Yd2LlJkHjuOp4lSAaYjNrCjvV91KwcwOocyI2iw69vyyCQpbKeg50wA1+jBO
 2+b0WKTIYSA6EruAivIj0646UqnzzpUGf9DfeH2NIApO7PvTGWaIWk5uvheOf3Vz
 yVviVGYdedtMXixdzHVXgRVZQThlhLe2D5bvYB0bFInDrY8LlMZJVwjrbJuVQaUy
 u0eguKvOIXSsUwtDOLCEKKh7bH1605JXVm0yUAYRmTPbRjs8LQHu0kPpS70L5tYI
 MaftvgPFyLev88cDns+VjnJxm1cOHrSRyLigM4ArCrZdNs8EKPScFeV3bKcR2Gwi
 u05MdpwagOMSFqKdPFhiGYjjcpAeieeAOkmMro9C1KvIRhVt83cAlbP6L9R0PYSK
 n/wfpvrcbDKl0vcAPVscw1iM590WbRPGGrqlDGv+ak4cjsCb8ro=
 =kCbR
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.43' into android-4.4.y

This is the 4.4.43 stable release
2017-01-17 12:44:14 -08:00
Alexander Duyck
807cac887d ipv4: Do not allow MAIN to be alias for new LOCAL w/ custom rules
[ Upstream commit 5350d54f6cd12eaff623e890744c79b700bd3f17 ]

In the case of custom rules being present we need to handle the case of the
LOCAL table being initialized after the new rule has been added.  To address
that I am adding a new check so that we can make certain we don't use an
alias of MAIN for LOCAL when allocating a new table.

Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Oliver Brunel <jjk@jjacky.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-15 13:41:35 +01:00
Dmitry Shmidt
b558f17a13 This is the 4.4.16 stable release
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXmOXmAAoJEDjbvchgkmk+QYIP/1S8oBZsvjfDzvH8t63HyLeH
 i43MFlYoFAqUIZc002XpluSvZ8uHoG+r7R8Hq3wmv48wxe3M6OBnMdBVTht6mPw+
 t5OLTZr40lWaJm2EIi4aekueMIrCgmL+Et+IFYv7ZVBuYLteVcfny+zdq4EqGmgj
 /a19+L/sTTr4SHtJIhHxWhiVJ9fVMgQk/N3VgQmIiNF2+lVbiFI7QQiDPLbFl0KK
 CM4ETO22HxHCYilGpzhpSMsHCxv12VqNaXNLAsPAepGGW7PqvUmrEWAqgwsbOfRc
 GxTLNk0dUgJqMrfEpQ8ZOMlgzvCAYG2jZuNSuT+nuzrWSUP+WOGRi9TTTxp1CYuZ
 PHlhNTH7ZnqosxJUUZS2d9N5ygpqD48Rhlfl824YzOWCy94VeUnedkVLb20uJwPF
 Y5aQ5WjktBC9why5e4OgGQERvx/U9KTk8E1zRfZZPc2oft9My0YxuemjjKAKZiYN
 ne4WhXbgOJTQkAoZwh2xqny3bWyEaoSrWpQ3R7bBJ9SIRLEOdCKzKpduDbAnbMP7
 QWgQOQC/6qA1mKqjrqF4KPA1Quo9PcUK2Ajh523ewMGCowgY90vyejAgh4Q8g0GC
 fKlx+jJDoKVDbQ8v4hc9PPHMsNNIKT9a1ptwVS3lE+bq1D5Ffm57A4/uOTMYHVab
 gKqu8h1CA0MCVBsH3nNA
 =nY8S
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.16' into android-4.4.y

This is the 4.4.16 stable release

Change-Id: Ibaf7b7e03695e1acebc654a2ca1a4bfcc48fcea4
2016-08-01 15:57:55 -07:00
Paolo Abeni
0633185047 ipv4/fib: don't warn when primary address is missing if in_dev is dead
[ Upstream commit 391a20333b8393ef2e13014e6e59d192c5594471 ]

After commit fbd40ea0180a ("ipv4: Don't do expensive useless work
during inetdev destroy.") when deleting an interface,
fib_del_ifaddr() can be executed without any primary address
present on the dead interface.

The above is safe, but triggers some "bug: prim == NULL" warnings.

This commit avoids warning if the in_dev is dead

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-18 17:06:36 -07:00
Lance Richardson
80de2e4115 ipv4: initialize flowi4_flags before calling fib_lookup()
[ Upstream commit 4cfc86f3dae6ca38ed49cdd78f458a03d4d87992 ]

Field fl4.flowi4_flags is not initialized in fib_compute_spec_dst()
before calling fib_lookup(), which means fib_table_lookup() is
using non-deterministic data at this line:

	if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {

Fix by initializing the entire fl4 structure, which will prevent
similar issues as fields are added in the future by ensuring that
all fields are initialized to zero unless explicitly initialized
to another value.
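
As a rough user-space sketch of why whole-structure initialization
helps: struct flow_key and make_key() below are hypothetical and only
loosely modelled on struct flowi4. C sets every member that a
designated initializer does not name to zero, so a field added later
cannot be left holding stack garbage.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key, loosely modelled on struct flowi4. */
struct flow_key {
	int      iif;
	int      oif;
	uint32_t daddr;
	uint32_t saddr;
	uint8_t  flags;	/* stands in for a field added after the callers */
};

/*
 * Build a key with a designated initializer. Members that are not
 * named ('oif' and 'flags' here) are zero-initialized by the language.
 */
static struct flow_key make_key(int iif, uint32_t daddr, uint32_t saddr)
{
	struct flow_key key = {
		.iif   = iif,
		.daddr = daddr,
		.saddr = saddr,
	};

	return key;
}
```

Assigning the fields one by one to an uninitialized local would leave
'flags' indeterminate, which is the pattern the fix above removes.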

Fixes: 58189ca7b2 ("net: Fix vti use case with oif in dst lookups")
Suggested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-20 15:42:05 +09:00
David S. Miller
5478975991 ipv4: Don't do expensive useless work during inetdev destroy.
[ Upstream commit fbd40ea0180a2d328c5adc61414dc8bab9335ce2 ]

When an inetdev is destroyed, every address assigned to the interface
is removed.  And in this scenario we do two pointless things which can
be very expensive if the number of assigned interfaces is large:

1) Address promotion.  We are deleting all addresses, so there is no
   point in doing this.

2) A full nf conntrack table purge for every address.  We only need to
   do this once, as is already caught by the existing
   masq_dev_notifier so masq_inet_event() can skip this.

Reported-by: Solar Designer <solar@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-20 15:42:03 +09:00
Lorenzo Colitti
fd2cf795f3 net: core: Support UID-based routing.
This contains the following commits:

1. cc2f522 net: core: Add a UID range to fib rules.
2. d7ed2bd net: core: Use the socket UID in routing lookups.
3. 2f9306a net: core: Add a RTA_UID attribute to routes.
    This is so that userspace can do per-UID route lookups.
4. 8e46efb net: ipv6: Use the UID in IPv6 PMTUD
    IPv4 PMTUD already does this because ipv4_sk_update_pmtu
    uses __build_flow_key, which includes the UID.

Bug: 15413527
Change-Id: Iae3d4ca3979d252b6cec989bdc1a6875f811f03a
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
2016-02-16 13:51:37 -08:00
David Ahern
7f49e7a38b net: Flush local routes when device changes vrf association
The VRF driver cycles netdevs when an interface is enslaved or released:
the down event is used to flush neighbor and route tables and the up
event (if the interface was already up) effectively moves local and
connected routes to the proper table.

As of 4f823defdd the local route is left hanging around after a link
down, so when a netdev is moved from one VRF to another (or released
from a VRF altogether) local routes are left in the wrong table.

Fix by handling the NETDEV_CHANGEUPPER event. When the upper dev is
an L3mdev, call fib_disable_ip to flush all routes, local ones
too.

Fixes: 4f823defdd ("ipv4: fix to not remove local route on link down")
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-13 23:58:44 -05:00
David S. Miller
73186df8d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were
fixing the "BH-ness" of the counter bumps whilst in 'net-next'
the functions were modified to take an explicit 'net' parameter.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 13:41:45 -05:00
Julian Anastasov
4f823defdd ipv4: fix to not remove local route on link down
When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
we should not delete the local routes if the local address
is still present. The confusion comes from the fact that both
fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
constant. Fix it by returning back the variable 'force'.

Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
ifconfig dummy0 down
ip route list table local | grep dummy | grep host
local 192.168.168.1 dev dummy0  proto kernel  scope host  src 192.168.168.1

Fixes: 8a3d03166f ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01 16:57:39 -05:00
Paolo Abeni
7b1311807f ipv4: implement support for NOPREFIXROUTE ifa flag for ipv4 address
Currently, adding a new ipv4 address always causes the creation of the
related network route, with the default metric. When a host has multiple
interfaces on the same network, multiple routes with the same metric
are created.

If userspace wants to set a specific metric on each route, e.g.
giving a better metric to ethernet links than to Wi-Fi ones,
the network routes must be deleted and recreated, which is error-prone.

This patch implements support for IFA_F_NOPREFIXROUTE for ipv4
addresses. When an address is added with this flag set, no associated
network route is created, no network route is deleted when
that address is removed, and it is up to userspace to manage the route.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 02:54:54 -07:00
David S. Miller
f6d3125fa3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/dsa/slave.c

net/dsa/slave.c simply had overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-02 07:21:25 -07:00
David Ahern
b84f787820 net: Initialize flow flags in input path
The fib_table_lookup tracepoint found 2 places where the flowi4_flags is
not initialized.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 21:52:32 -07:00
David Ahern
3236b0042b net: Replace vrf_dev_table and friends
Replace calls to vrf_dev_table and friends with l3mdev_fib_table
and kin.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 20:40:33 -07:00
David Ahern
385add906b net: Replace vrf_master_ifindex{, _rcu} with l3mdev equivalents
Replace calls to vrf_master_ifindex_rcu and vrf_master_ifindex with either
l3mdev_master_ifindex_rcu or l3mdev_master_ifindex.

The pattern:
    oif = vrf_master_ifindex(dev) ? : dev->ifindex;
is replaced with
    oif = l3mdev_fib_oif(dev);

And remove the now unused vrf macros.
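
The fallback that the old pattern spelled out can be sketched as
below. struct dev and fib_oif() are hypothetical simplifications for
illustration, not the kernel's struct net_device or the real
l3mdev_fib_oif(). The "a ? : b" form in the old pattern is a GNU C
extension; the sketch uses the standard conditional operator.

```c
#include <assert.h>

/* Hypothetical, simplified device; not the kernel's struct net_device. */
struct dev {
	int ifindex;		/* index of this device */
	int master_ifindex;	/* index of the L3 master, 0 if none */
};

/* Return the master's index when the device is enslaved, else its own. */
static int fib_oif(const struct dev *dev)
{
	return dev->master_ifindex ? dev->master_ifindex : dev->ifindex;
}
```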

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 20:40:33 -07:00
David Ahern
9b8ff51822 net: Make table id type u32
A number of VRF patches used 'int' for table id. It should be u32 to be
consistent with the rest of the stack.

Fixes:
4e3c89920c ("net: Introduce VRF related flags and helpers")
15be405eb2 ("net: Add inet_addr lookup by table")
30bbaa1950 ("net: Fix up inet_addr_type checks")
021dd3b8a1 ("net: Add routes to the table associated with the device")
dc028da54e ("inet: Move VRF table lookup to inlined function")
f6d3c19274 ("net: FIB tracepoints")

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01 14:32:44 -07:00
David Ahern
f6d3c19274 net: FIB tracepoints
A few tracepoints that are useful when developing the VRF driver.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-29 13:05:16 -07:00
David Ahern
021dd3b8a1 net: Add routes to the table associated with the device
When a device associated with a VRF is brought up or down routes
should be added to/removed from the table associated with the VRF.
fib_magic defaults to using the main or local tables. Have it use
the table with the device if there is one.

A part of this is directing prefsrc validations to the correct
table as well.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 22:43:21 -07:00
David Ahern
30bbaa1950 net: Fix up inet_addr_type checks
Currently inet_addr_type and inet_dev_addr_type expect local addresses
to be in the local table. With the VRF device local routes for devices
associated with a VRF will be in the table associated with the VRF.
Provide an alternate inet_addr lookup to use a specific table rather
than defaulting to the local table.

inet_addr_type_dev_table keeps the same semantics as inet_addr_type but
if the passed in device is enslaved to a VRF then the table for that VRF
is used for the lookup.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 22:43:21 -07:00
David Ahern
15be405eb2 net: Add inet_addr lookup by table
Currently inet_addr_type and inet_dev_addr_type expect local addresses
to be in the local table. With the VRF device local routes for devices
associated with a VRF will be in the table associated with the VRF.
Provide an alternate inet_addr lookup to use a specific table rather
than defaulting to the local table.

Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 22:43:21 -07:00
David Ahern
cd2fbe1b6b net: Use VRF device index for lookups on RX
On ingress use index of VRF master device for route lookups if real device
is enslaved. Rules are expected to be installed for the VRF device to
direct lookups to a specific table.

Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 22:43:20 -07:00
Thomas Graf
1b7179d3ad route: Extend flow representation with tunnel key
Add a new flowi_tunnel structure which is a subset of ip_tunnel_key to
allow routes to match on tunnel metadata. For now, the tunnel id is
added to flowi_tunnel which allows for routes to be bound to specific
virtual tunnels.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00
Roopa Prabhu
571e722676 ipv4: support for fib route lwtunnel encap attributes
This patch adds support in ipv4 fib functions to parse user
provided encap attributes and attach encap state data to fib_nh
and rtable.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:03 -07:00
Andy Gospodarek
0eeb075fad net: ipv4 sysctl option to ignore routes when nexthop link is down
This feature is only enabled with the new per-interface or ipv4 global
sysctls called 'ignore_routes_with_linkdown'.

net.ipv4.conf.all.ignore_routes_with_linkdown = 0
net.ipv4.conf.default.ignore_routes_with_linkdown = 0
net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
...

When the above sysctls are set, the kernel will report to userspace that
a route is dead and will no longer resolve to this nexthop when
performing a fib lookup.  This will signal to userspace that the route
will not be selected.  RTNH_F_DEAD is only signalled to userspace if the
sysctl is enabled and the link is down.  This was done because without
it the netlink listeners would have no idea whether or not a nexthop
would be selected.  The kernel only sets RTNH_F_DEAD internally if the
interface has IFF_UP cleared.

With the new sysctl set, the following behavior can be observed
(interface p8p1 is link-down):

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 dead linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 dead linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
90.0.0.1 via 70.0.0.2 dev p7p1  src 70.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 via 10.0.5.2 dev p9p1  src 10.0.5.15
    cache

While the route does remain in the table (so it can be modified if
needed rather than being wiped away as it would be if IFF_UP was
cleared), the proper next-hop is chosen automatically when the link is
down.  Now interface p8p1 is linked-up:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
192.168.56.0/24 dev p2p1  proto kernel  scope link  src 192.168.56.2
90.0.0.1 via 80.0.0.2 dev p8p1  src 80.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 dev p8p1  src 80.0.0.1
    cache

and the output changes to what one would expect.

If the sysctl is not set, the following output would be expected when
p8p1 is down:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2

Since the dead flag does not appear, there should be no expectation that
the kernel would skip using this route due to link being down.

v2: Split kernel changes into 2 patches, this actually makes a
behavioral change if the sysctl is set.  Also took suggestion from Alex
to simplify code by only checking sysctl during fib lookup and
suggestion from Scott to add a per-interface sysctl.

v3: Code clean-ups to make it more readable and efficient as well as a
reverse path check fix.

v4: Drop binary sysctl

v5: Whitespace fixups from Dave

v6: Style changes from Dave and checkpatch suggestions

v7: One more checkpatch fixup

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 02:15:54 -07:00
Andy Gospodarek
8a3d03166f net: track link-status of ipv4 nexthops
Add a fib flag called RTNH_F_LINKDOWN to any ipv4 nexthops that are
reachable via an interface where carrier is off.  No action is taken,
but additional flags are passed to userspace to indicate carrier status.

This also includes a cleanup to fib_disable_ip to more clearly indicate
what event made the function call to replace the more cryptic force
option previously used.

v2: Split out kernel functionality into 2 patches, this patch simply
sets and clears new nexthop flag RTNH_F_LINKDOWN.

v3: Cleanups suggested by Alex as well as a bug noticed in
fib_sync_down_dev and fib_sync_up when multipath was not enabled.

v5: Whitespace and variable declaration fixups suggested by Dave.

v6: Style fixups noticed by Dave; ran checkpatch to be sure I got them
all.

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 02:15:54 -07:00
David S. Miller
c85d6975ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx4/cmd.c
	net/core/fib_rules.c
	net/ipv4/fib_frontend.c

The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.

The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 22:34:15 -04:00
Ian Morris
51456b2914 ipv4: coding style: comparison for equality with NULL
The ipv4 code uses a mixture of coding styles. In some instances check
for NULL pointer is done as x == NULL and sometimes as !x. !x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 12:11:15 -04:00
WANG Cong
419df12fb5 net: move fib_rules_unregister() under rtnl lock
We have to hold rtnl lock for fib_rules_unregister()
otherwise the following race could happen:

fib_rules_unregister():	fib_nl_delrule():
...				...
...				ops = lookup_rules_ops();
list_del_rcu(&ops->list);
				list_for_each_entry(ops->rules) {
fib_rules_cleanup_ops(ops);	  ...
  list_del_rcu();		  list_del_rcu();
				}

Note, net->rules_mod_lock is actually not needed at all,
either upper layer netns code or rtnl lock guarantees
we are safe.

Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 20:52:34 -04:00
Alexander Duyck
6e47d6caff fib_trie: Cleanup ip_fib_net_exit code path
While fixing a recent issue I noticed that we are doing some unnecessary
work inside the loop for ip_fib_net_exit.  As such I am pulling out the
initialization to NULL for the locally stored fib_local, fib_main, and
fib_default.

In addition I am restoring the original code for flushing the table as
there is no need to split up the fib_table_flush and hlist_del work since
the code for packing the tnodes with multiple key vectors was dropped.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 13:18:56 -04:00
Alexander Duyck
ad88d05136 fib_trie: Fix warning on fib4_rules_exit
This fixes the following warning:

 BUG: sleeping function called from invalid context at mm/slub.c:1268
 in_atomic(): 1, irqs_disabled(): 0, pid: 6, name: kworker/u8:0
 INFO: lockdep is turned off.
 CPU: 3 PID: 6 Comm: kworker/u8:0 Tainted: G        W       4.0.0-rc5+ #895
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 Workqueue: netns cleanup_net
  0000000000000006 ffff88011953fa68 ffffffff81a203b6 000000002c3a2c39
  ffff88011952a680 ffff88011953fa98 ffffffff8109daf0 ffff8801186c6aa8
  ffffffff81fbc9e5 00000000000004f4 0000000000000000 ffff88011953fac8
 Call Trace:
  [<ffffffff81a203b6>] dump_stack+0x4c/0x65
  [<ffffffff8109daf0>] ___might_sleep+0x1c3/0x1cb
  [<ffffffff8109db70>] __might_sleep+0x78/0x80
  [<ffffffff8117a60e>] slab_pre_alloc_hook+0x31/0x8f
  [<ffffffff8117d4f6>] __kmalloc+0x69/0x14e
  [<ffffffff818ed0e1>] ? kzalloc.constprop.20+0xe/0x10
  [<ffffffff818ed0e1>] kzalloc.constprop.20+0xe/0x10
  [<ffffffff818ef622>] fib_trie_table+0x27/0x8b
  [<ffffffff818ef6bd>] fib_trie_unmerge+0x37/0x2a6
  [<ffffffff810b06e1>] ? arch_local_irq_save+0x9/0xc
  [<ffffffff818e9793>] fib_unmerge+0x2d/0xb3
  [<ffffffff818f5f56>] fib4_rule_delete+0x1f/0x52
  [<ffffffff817f1c3f>] ? fib_rules_unregister+0x30/0xb2
  [<ffffffff817f1c8b>] fib_rules_unregister+0x7c/0xb2
  [<ffffffff818f64a1>] fib4_rules_exit+0x15/0x18
  [<ffffffff818e8c0a>] ip_fib_net_exit+0x23/0xf2
  [<ffffffff818e91f8>] fib_net_exit+0x32/0x36
  [<ffffffff817c8352>] ops_exit_list+0x45/0x57
  [<ffffffff817c8d3d>] cleanup_net+0x13c/0x1cd
  [<ffffffff8108b05d>] process_one_work+0x255/0x4ad
  [<ffffffff8108af69>] ? process_one_work+0x161/0x4ad
  [<ffffffff8108b4b1>] worker_thread+0x1cd/0x2ab
  [<ffffffff8108b2e4>] ? process_scheduled_works+0x2f/0x2f
  [<ffffffff81090686>] kthread+0xd4/0xdc
  [<ffffffff8109ec8f>] ? local_clock+0x19/0x22
  [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83
  [<ffffffff81a2c0c8>] ret_from_fork+0x58/0x90
  [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83

The issue was that as a part of exiting the default rules were being
deleted which resulted in the local trie being unmerged.  By moving the
freeing of the FIB tables up we can avoid the unmerge since there is no
local table left when we call the fib4_rules_exit function.

Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 13:18:56 -04:00
Alexander Duyck
3c9e9f7320 fib_trie: Avoid NULL pointer if local table is not allocated
The function fib_unmerge assumed the local table had already been
allocated.  If that is not the case however when custom rules are applied
then this can result in a NULL pointer dereference.

In order to prevent this we must check the value of the local table pointer
and if it is NULL simply return 0 as there is no local table to separate
from the main.

Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-12 18:26:51 -04:00
Alexander Duyck
61f0d861fc fib_trie: Fix uninitialized variable warning
The 0-day kernel test infrastructure reported a use of uninitialized
variable warning for local_table due to the fact that the local and main
allocations had been swapped from the original setup.  This change corrects
that by making it so that we free the main table if the local table
allocation fails.

Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-11 17:33:44 -04:00
Sabrina Dubroca
6dede75b7e fib_trie: call fib_table_flush_external under RTNL
Move rtnl_lock() before the call to fib4_rules_exit so that
fib_table_flush_external is called under RTNL.

Fixes: 104616e74e ("switchdev: don't support custom ip rules, for now")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-11 16:46:26 -04:00
Alexander Duyck
0ddcf43d5d ipv4: FIB Local/MAIN table collapse
This patch is meant to collapse local and main into one by converting
tb_data from an array to a pointer.  Doing this allows us to point the
local table into the main while maintaining the same variables in the
table.

As such the tb_data was converted from an array to a pointer, and a new
array called data is added in order to still provide an object for tb_data
to point to.

In order to track the origin of the fib aliases a tb_id value was added in
a hole that existed on 64b systems.  Using this we can also reverse the
merge in the event that custom FIB rules are enabled.

With this patch I am seeing an improvement of 20ns to 30ns for routing
lookups as long as custom rules are not enabled.  With custom rules enabled
we fall back to split tables and the original behavior.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-11 16:22:14 -04:00
Scott Feldman
104616e74e switchdev: don't support custom ip rules, for now
Keep switchdev FIB offload model simple for now and don't allow custom ip
rules.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-06 00:24:58 -05:00
Alexander Duyck
a7e5353123 fib_trie: Make fib_table rcu safe
The fib_table was wrapped in several places with an
rcu_read_lock/rcu_read_unlock however after looking over the code I found
several spots where the tables were being accessed as just standard
pointers without any protections.  This change fixes that so that all of
the proper protections are in place when accessing the table to take RCU
replacement or removal of the table into account.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-04 23:35:18 -05:00
Alexander Duyck
345e9b5426 fib_trie: Push rcu_read_lock/unlock to callers
This change is to start cleaning up some of the rcu_read_lock/unlock
handling.  I realized while reviewing the code there are several spots that
I don't believe are being handled correctly or are masking warnings by
locally calling rcu_read_lock/unlock instead of calling them at the correct
level.

A common example is a call to fib_get_table followed by fib_table_lookup.
The rcu_read_lock/unlock ought to wrap both but there are several spots where
they were not wrapped.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-31 18:25:54 -05:00
Alexander Duyck
8274a97aa4 fib_trie: Update usage stats to be percpu instead of global variables
The trie usage stats were being shared by all threads that were
calling fib_table_lookup.  As a result when multiple threads were
performing lookups simultaneously the trie would begin to cache bounce
between those threads.

In order to prevent this I have updated the usage stats to use a set of
percpu variables.  By doing this we should be able to avoid the cache
bouncing and still make use of these stats.
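
A minimal user-space sketch of the per-CPU idea, under stated
assumptions: the CPU count, the cache line size, struct usage_stats
and both helpers are hypothetical, and a real kernel implementation
would use the percpu allocator rather than a static array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count */
#define CACHE_LINE     64	/* assumed cache line size in bytes */

/* One counter block per CPU, aligned so no two share a cache line. */
struct usage_stats {
	_Alignas(CACHE_LINE) uint64_t gets;
	uint64_t backtracks;
};

static struct usage_stats stats[NR_CPUS_SKETCH];

/* Fast path: each CPU writes only to its own slot. */
static void count_get(unsigned int cpu)
{
	stats[cpu].gets++;
}

/* Slow path: sum all slots when the statistics are read. */
static uint64_t total_gets(void)
{
	uint64_t sum = 0;

	for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += stats[cpu].gets;
	return sum;
}
```

Writers never touch another CPU's cache line, so the cost moves to the
rarely used read side, which has to walk every slot.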

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-31 18:25:53 -05:00
Sébastien Barré
1dced6a854 ipv4: Restore accept_local behaviour in fib_validate_source()
Commit 7a9bc9b81a ("ipv4: Elide fib_validate_source() completely when possible.")
introduced a short-circuit to avoid calling fib_validate_source when not
needed. That change took rp_filter into account, but not accept_local.
This resulted in a change of behaviour: with rp_filter and accept_local
off, incoming packets with a local address in the source field should be
dropped.

Here is how to reproduce the change pre/post 7a9bc9b81a commit:
-configure the same IPv4 address on hosts A and B.
-try to send an ARP request from B to A.
-The ARP request will be dropped before that commit, but accepted and answered
after that commit.

This adds a check for ACCEPT_LOCAL, to maintain full
fib validation in case it is 0. We also leave __fib_validate_source() earlier
when possible, based on the same check as fib_validate_source(), once the
accept_local stuff is verified.

Cc: Gregory Detal <gregory.detal@uclouvain.be>
Cc: Christoph Paasch <christoph.paasch@uclouvain.be>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-22 12:23:10 -07:00
Cong Wang
6a662719c9 ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
As suggested by Julian:

	Simply, flowi4_iif must not contain 0, it does not
	look logical to ignore all ip rules with specified iif.

because in fib_rule_match() we do:

        if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
                goto out;

flowi4_iif should be LOOPBACK_IFINDEX by default.
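
The effect of that test can be sketched in isolation. The types below
are hypothetical reductions of the rule and flow structures, and the
value of LOOPBACK_IFINDEX is assumed for the sketch. A rule bound to
an input interface can never match a flow whose iif is 0.

```c
#include <assert.h>
#include <stdbool.h>

#define LOOPBACK_IFINDEX 1	/* value assumed for this sketch */

struct rule { int iifindex; };	/* 0 means "any input interface" */
struct flow { int iif; };

/* Mirrors only the iif comparison quoted above. */
static bool rule_matches_iif(const struct rule *rule, const struct flow *fl)
{
	if (rule->iifindex && rule->iifindex != fl->iif)
		return false;
	return true;
}
```

With iif set to 0, a rule whose iifindex is LOOPBACK_IFINDEX is
skipped; with iif set to LOOPBACK_IFINDEX, the same rule matches.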

We need to move LOOPBACK_IFINDEX to include/net/flow.h:

1) It is mostly used by flowi_iif

2) It fixes the following compile error, which occurs once later
patches use it in flow.h:

In file included from include/linux/netfilter.h:277:0,
                 from include/net/netns/netfilter.h:5,
                 from include/net/net_namespace.h:21,
                 from include/linux/netdevice.h:43,
                 from include/linux/icmpv6.h:12,
                 from include/linux/ipv6.h:61,
                 from include/net/ipv6.h:16,
                 from include/linux/sunrpc/clnt.h:27,
                 from include/linux/nfs_fs.h:30,
                 from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16 15:05:11 -04:00