-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltYMlwACgkQONu9yGCS
aT5ZmxAAjAWUndXt7fTUyHgxkoG61sEkdX4jcsp6NFwQMudU0UHx4/kcZE+HdMjL
VU8BZtdUg+jMLXM4erVBpQRKY9YHIPi8nWMTm1UjduMCxVD6dVL1HU6/RXl1cYIx
rf/opYOimqT9lYCeffmd9ai2zEEJKSt7/avddcJY4qHiqLan27gbUdAq2H26aM/5
LUzAaSBzhq3VYo9Q5zv03b1+tORAxh2BIffZjGEFe8SQQl1o63WqwV4RxEhV/Bjt
hBgl/6B/+EHtQnYnbnoOT/an9Ma15ik4/z3vVv6yRLNK+hS5T31OKcYCsUrjp6O+
TQVaVLWWmn/VpIHAMkrhBs9Xxg5GmRziF77AkzyC506tK268M2+IoY77ursVl1YK
STaOwUcLUlKLbl5OADqMpYtNU9ybkP+MmgDZsIEXz9UiCZM721fL5Au2PHuzaYOD
2nE2EQb04It4k9GN8FStv2KPIiKUCEXi9MlNsHGPs6Mc+fliIigoKPhpU5JG+sxR
eJgPMNv4OWhwXWTd1wf0Gy5X+i0lQlwlGgIHFfSB8vzArJ0Y/yuPj2a6xhQshOza
Ivq7JudHvxYxhDSWYoCKgtTgzMdSBbJ3xjOoUUHy4ryamYeyaMvgFjsaCTMr0dsw
76BkgNTbpsip+I77a9h4Ozlk5QE7h61EsqjmZBkGVqLYjrUQ/IU=
=X4tZ
-----END PGP SIGNATURE-----
Merge 4.4.144 into android-4.4
Changes in 4.4.144
KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.
x86/MCE: Remove min interval polling limitation
fat: fix memory allocation failure handling of match_strdup()
ALSA: rawmidi: Change resized buffers atomically
ARC: Fix CONFIG_SWAP
ARC: mm: allow mprotect to make stack mappings executable
mm: memcg: fix use after free in mem_cgroup_iter()
ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns
ipv6: fix useless rol32 call on hash
lib/rhashtable: consider param->min_size when setting initial table size
net/ipv4: Set oif in fib_compute_spec_dst
net: phy: fix flag masking in __set_phy_supported
ptp: fix missing break in switch
tg3: Add higher cpu clock for 5762.
net: Don't copy pfmemalloc flag in __copy_skb_header()
skbuff: Unconditionally copy pfmemalloc in __skb_clone()
xhci: Fix perceived dead host due to runtime suspend race with event handler
x86/paravirt: Make native_save_fl() extern inline
x86/cpufeatures: Add CPUID_7_EDX CPUID leaf
x86/cpufeatures: Add Intel feature bits for Speculation Control
x86/cpufeatures: Add AMD feature bits for Speculation Control
x86/msr: Add definitions for new speculation control MSRs
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support
x86/cpufeatures: Clean up Spectre v2 related CPUID flags
x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
x86/pti: Mark constant arrays as __initconst
x86/asm/entry/32: Simplify pushes of zeroed pt_regs->REGs
x86/entry/64/compat: Clear registers for compat syscalls, to reduce speculation attack surface
x86/speculation: Update Speculation Control microcode blacklist
x86/speculation: Correct Speculation Control microcode blacklist again
x86/speculation: Clean up various Spectre related details
x86/speculation: Fix up array_index_nospec_mask() asm constraint
x86/speculation: Add <asm/msr-index.h> dependency
x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
x86/mm: Factor out LDT init from context init
x86/mm: Give each mm TLB flush generation a unique ID
x86/speculation: Use Indirect Branch Prediction Barrier in context switch
x86/spectre_v2: Don't check microcode versions when running under hypervisors
x86/speculation: Use IBRS if available before calling into firmware
x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist
selftest/seccomp: Fix the flag name SECCOMP_FILTER_FLAG_TSYNC
selftest/seccomp: Fix the seccomp(2) signature
xen: set cpu capabilities from xen_start_kernel()
x86/amd: don't set X86_BUG_SYSRET_SS_ATTRS when running under Xen
x86/nospec: Simplify alternative_msr_write()
x86/bugs: Concentrate bug detection into a separate function
x86/bugs: Concentrate bug reporting into a separate function
x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
x86/bugs, KVM: Support the combination of guest and host IBRS
x86/cpu: Rename Merrifield2 to Moorefield
x86/cpu/intel: Add Knights Mill to Intel family
x86/bugs: Expose /sys/../spec_store_bypass
x86/cpufeatures: Add X86_FEATURE_RDS
x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
x86/bugs/intel: Set proper CPU features and setup RDS
x86/bugs: Whitelist allowed SPEC_CTRL MSR values
x86/bugs/AMD: Add support to disable RDS on Fam[15, 16, 17]h if requested
x86/speculation: Create spec-ctrl.h to avoid include hell
prctl: Add speculation control prctls
x86/process: Optimize TIF checks in __switch_to_xtra()
x86/process: Correct and optimize TIF_BLOCKSTEP switch
x86/process: Optimize TIF_NOTSC switch
x86/process: Allow runtime control of Speculative Store Bypass
x86/speculation: Add prctl for Speculative Store Bypass mitigation
nospec: Allow getting/setting on non-current task
proc: Provide details on speculation flaw mitigations
seccomp: Enable speculation flaw mitigations
prctl: Add force disable speculation
seccomp: Use PR_SPEC_FORCE_DISABLE
seccomp: Add filter flag to opt-out of SSB mitigation
seccomp: Move speculation migitation control to arch code
x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass
x86/bugs: Rename _RDS to _SSBD
proc: Use underscores for SSBD in 'status'
Documentation/spec_ctrl: Do some minor cleanups
x86/bugs: Fix __ssb_select_mitigation() return type
x86/bugs: Make cpu_show_common() static
x86/bugs: Fix the parameters alignment and missing void
x86/cpu: Make alternative_msr_write work for 32-bit code
x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
x86/cpufeatures: Disentangle SSBD enumeration
x86/cpu/AMD: Fix erratum 1076 (CPB bit)
x86/cpufeatures: Add FEATURE_ZEN
x86/speculation: Handle HT correctly on AMD
x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
x86/speculation: Add virtualized speculative store bypass disable support
x86/speculation: Rework speculative_store_bypass_update()
x86/bugs: Unify x86_spec_ctrl_{set_guest, restore_host}
x86/bugs: Expose x86_spec_ctrl_base directly
x86/bugs: Remove x86_spec_ctrl_set()
x86/bugs: Rework spec_ctrl base and mask logic
x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
x86/bugs: Rename SSBD_NO to SSB_NO
x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
x86/cpu: Re-apply forced caps every time CPU caps are re-read
block: do not use interruptible wait anywhere
clk: tegra: Fix PLL_U post divider and initial rate on Tegra30
ubi: Introduce vol_ignored()
ubi: Rework Fastmap attach base code
ubi: Be more paranoid while seaching for the most recent Fastmap
ubi: Fix races around ubi_refill_pools()
ubi: Fix Fastmap's update_vol()
ubi: fastmap: Erase outdated anchor PEBs during attach
Linux 4.4.144
Change-Id: Ia3e9b2b7bc653cba68b76878d34f8fcbbc007a13
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit e7372197e15856ec4ee66b668020a662994db103 ]
Xin reported that icmp replies may not use the address on the device the
echo request is received if the destination address is broadcast. Instead
a route lookup is done without considering VRF context. Fix by setting
oif in flow struct to the master device if it is enslaved. That directs
the lookup to the VRF table. If the device is not enslaved, oif is still
0 so no affect.
Fixes: cd2fbe1b6b ("net: Use VRF device index for lookups on RX")
Reported-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlpL3okACgkQONu9yGCS
aT6p5g/8CAG9NU/fLu7IMcIlyqfVvdOhzxn44oHCxq08eycqoggdnb3TZXxBUBgY
+w8uZk8yxNdjXR39GjkMSUy06WRvl2XDSrd36sDGRCBP62Fi8l5scmlRaNEnI/E8
ltBSB93P16SmnpKa/3Zscz+7LcaoXHpU5Xhs8Zmf4I69qmzOFX2qSKsUyzVT+gNI
ZoSN/mYuXf7+dzrcKhVdYzm4ZdMRvxdT0WefeoeZMekfAtU9D8zaFOA9jTIAMHSZ
adNn18s7UKmaipZf/01mW9srvZce4nPKiUC8WVGstiyl27ws+IDleKVmDnqFALjy
2LIxDvjDth/x8jfqTb7F6bFh6dVtMJjwUmd3KL7hgPuTddoQQe/GfKnjSHkbNxyR
qNxNtbOgQ2EVOf59fejxWshCP/fButNo8uvCI1ERdm4axGXcf9hiucdlwzCYezHs
UN0xrxAXprhqTq4hQFB9E4C49e8nMPNsyXTMZwSZRPe2z53spD53JR/0sl5Z2RWe
ueO21tBZ6ev9jPNi+lJrCVw1oBO+PKOmdNPAaSynUVm96grRnW6grUI3mX9FqMXb
r62UWG3YCWWBgxA3iQQrMxf/3S2YZXz59TBbp9GU8xOYJZLhKL29/iB7Rv4ANtkR
aMDrABjWqrCZpIazqkZ5uwbsNl6Q51e3Mji3EfwkBaMqjc41++I=
=B52+
-----END PGP SIGNATURE-----
Merge 4.4.109 into android-4.4
Changes in 4.4.109
ACPI: APEI / ERST: Fix missing error handling in erst_reader()
crypto: mcryptd - protect the per-CPU queue with a lock
mfd: cros ec: spi: Don't send first message too soon
mfd: twl4030-audio: Fix sibling-node lookup
mfd: twl6040: Fix child-node lookup
ALSA: rawmidi: Avoid racy info ioctl via ctl device
ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU
PCI / PM: Force devices to D0 in pci_pm_thaw_noirq()
parisc: Hide Diva-built-in serial aux and graphics card
spi: xilinx: Detect stall with Unknown commands
KVM: X86: Fix load RFLAGS w/o the fixed bit
kvm: x86: fix RSM when PCID is non-zero
powerpc/perf: Dereference BHRB entries safely
net: mvneta: clear interface link status on port disable
tracing: Remove extra zeroing out of the ring buffer page
tracing: Fix possible double free on failure of allocating trace buffer
tracing: Fix crash when it fails to alloc ring buffer
ring-buffer: Mask out the info bits when returning buffer page length
iw_cxgb4: Only validate the MSN for successful completions
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
ASoC: twl4030: fix child-node lookup
ALSA: hda: Drop useless WARN_ON()
ALSA: hda - fix headset mic detection issue on a Dell machine
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
x86/mm: Make flush_tlb_mm_range() more predictable
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code
x86/mm: Disable PCID on 32-bit kernels
x86/mm: Add the 'nopcid' boot option to turn off PCID
x86/mm: Enable CR4.PCIDE on supported systems
x86/mm/64: Fix reboot interaction with CR4.PCIDE
kbuild: add '-fno-stack-check' to kernel build options
ipv4: igmp: guard against silly MTU values
ipv6: mcast: better catch silly mtu values
net: igmp: Use correct source address on IGMPv3 reports
netlink: Add netns check on taps
net: qmi_wwan: add Sierra EM7565 1199:9091
net: reevalulate autoflowlabel setting after sysctl setting
tcp md5sig: Use skb's saddr when replying to an incoming segment
tg3: Fix rx hang on MTU change with 5717/5719
net: ipv4: fix for a race condition in raw_sendmsg
net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case
sctp: Replace use of sockets_allocated with specified macro.
ipv4: Fix use-after-free when flushing FIB tables
net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
net: Fix double free and memory corruption in get_net_ns_by_id()
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
sock: free skb in skb_complete_tx_timestamp on error
usbip: fix usbip bind writing random string after command in match_busid
usbip: stub: stop printing kernel pointer addresses in messages
usbip: vhci: stop printing kernel pointer addresses in messages
USB: serial: ftdi_sio: add id for Airbus DS P8GR
USB: serial: qcserial: add Sierra Wireless EM7565
USB: serial: option: add support for Telit ME910 PID 0x1101
USB: serial: option: adding support for YUGA CLM920-NC5
usb: Add device quirk for Logitech HD Pro Webcam C925e
usb: add RESET_RESUME for ELSA MicroLink 56K
USB: Fix off by one in type-specific length check of BOS SSP capability
usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
x86/smpboot: Remove stale TLB flush invocations
n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP
Linux 4.4.109
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit b4681c2829e24943aadd1a7bb3a30d41d0a20050 ]
Since commit 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse") the
local table uses the same trie allocated for the main table when custom
rules are not in use.
When a net namespace is dismantled, the main table is flushed and freed
(via an RCU callback) before the local table. In case the callback is
invoked before the local table is iterated, a use-after-free can occur.
Fix this by iterating over the FIB tables in reverse order, so that the
main table is always freed after the local table.
v3: Reworded comment according to Alex's suggestion.
v2: Add a comment to make the fix more explicit per Dave's and Alex's
feedback.
Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlmN2eMACgkQONu9yGCS
aT7zCQ//eDgCF9YJnE1v8/JJ0yl2uK7XjVrF/tpPvzgTgszu4En4kGfhUO+WvmkU
0/pqYBMAPZEbfmx+6q8FJx/MHDjFA1oKb+a9pS1RUovzWDLQoRxYwiBtR2osmuOE
f1fbDMt9ETDUxUGLhRJ/vuzeIjmouhPkz5vZAg863+sKYYjPHlczymcgMs0sRMsE
3kkgo6mhCKTLt8gvioSUjeVWs4a5y3unvImhSLjEHjcfydlDLwA8RuFdFwBIgNfP
yPrgW3v5l9HHXI1lWMcOCTpVeDI272sKNOppYg4r2N/I/epBN79j7jGrqGQpG8NP
mKOkgRDoR7ifyKLSS55R8anLyNoi4jfQAHbOxlSVGymwpd9kRuHoeTE5+IqYs+V5
qLkqLz63hmbfRQuW6az6L+SGVwgj3DSHakGQFkB0ouB8h5ubU2OqINxOsaNABbHD
C1Q9giqG8b2MEv5D4O4m7BhK1tDzSJmT2tb9UG+UV8LJn1PhFSnSMkjP4S7trZl+
+8myxdoNVvDMpd23UqM7o1fuYalbslTKED9el31FimOaNF79+tzyjnNbWA6zqX+X
U3I+Pp2FafOS2heTLTX59fz09LKRI+iP3pnlCBpp1a+MKAIEbjeW8YB5zTKrSNOv
RkZ+1qIQtmGyhVp/YDsua5J1lhZVXeLeoEqDXYerELOdGKF30jw=
=pHqB
-----END PGP SIGNATURE-----
Merge 4.4.81 into android-4.4
Changes in 4.4.81
libata: array underflow in ata_find_dev()
workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
ALSA: hda - Fix speaker output from VAIO VPCL14M1R
ASoC: do not close shared backend dailink
KVM: async_pf: make rcu irq exit if not triggered from idle task
mm/page_alloc: Remove kernel address exposure in free_reserved_area()
ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
ext4: fix overflow caused by missing cast in ext4_resize_fs()
ARM: dts: armada-38x: Fix irq type for pca955
media: platform: davinci: return -EINVAL for VPFE_CMD_S_CCDC_RAW_PARAMS ioctl
target: Avoid mappedlun symlink creation during lun shutdown
iscsi-target: Always wait for kthread_should_stop() before kthread exit
iscsi-target: Fix early sk_data_ready LOGIN_FLAGS_READY race
iscsi-target: Fix initial login PDU asynchronous socket close OOPs
iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP
iser-target: Avoid isert_conn->cm_id dereference in isert_login_recv_done
mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
media: lirc: LIRC_GET_REC_RESOLUTION should return microseconds
f2fs: sanity check checkpoint segno and blkoff
drm: rcar-du: fix backport bug
saa7164: fix double fetch PCIe access condition
ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check()
net: Zero terminate ifr_name in dev_ifname().
ipv6: avoid overflow of offset in ip6_find_1stfragopt
ipv4: initialize fib_trie prior to register_netdev_notifier call.
rtnetlink: allocate more memory for dev_set_mac_address()
mcs7780: Fix initialization when CONFIG_VMAP_STACK is enabled
openvswitch: fix potential out of bound access in parse_ct
packet: fix use-after-free in prb_retire_rx_blk_timer_expired()
ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment()
net: ethernet: nb8800: Handle all 4 RGMII modes identically
dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly
dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly
dccp: fix a memleak for dccp_feat_init err process
sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()
sctp: fix the check for _sctp_walk_params and _sctp_walk_errors
net/mlx5: Fix command bad flow on command entry allocation failure
net: phy: Correctly process PHY_HALTED in phy_stop_machine()
net: phy: Fix PHY unbind crash
xen-netback: correctly schedule rate-limited queues
sparc64: Measure receiver forward progress to avoid send mondo timeout
wext: handle NULL extra data in iwe_stream_add_point better
sh_eth: R8A7740 supports packet shecksumming
net: phy: dp83867: fix irq generation
tg3: Fix race condition in tg3_get_stats64().
x86/boot: Add missing declaration of string functions
phy state machine: failsafe leave invalid RUNNING state
scsi: qla2xxx: Get mutex lock before checking optrom_state
drm/virtio: fix framebuffer sparse warning
virtio_blk: fix panic in initialization error path
ARM: 8632/1: ftrace: fix syscall name matching
mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER
lib/Kconfig.debug: fix frv build failure
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
mm: don't dereference struct page fields of invalid pages
ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
net: account for current skb length when deciding about UFO
workqueue: implicit ordered attribute should be overridable
Linux 4.4.81
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 8799a221f5944a7d74516ecf46d58c28ec1d1f75 ]
Net stack initialization currently initializes fib-trie after the
first call to netdevice_notifier() call. In fact fib_trie initialization
needs to happen before first rtnl_register(). It does not cause any problem
since there are no devices UP at this moment, but trying to bring 'lo'
UP at initialization would make this assumption wrong and exposes the issue.
Fixes following crash
Call Trace:
? alternate_node_alloc+0x76/0xa0
fib_table_insert+0x1b7/0x4b0
fib_magic.isra.17+0xea/0x120
fib_add_ifaddr+0x7b/0x190
fib_netdev_event+0xc0/0x130
register_netdevice_notifier+0x1c1/0x1d0
ip_fib_init+0x72/0x85
ip_rt_init+0x187/0x1e9
ip_init+0xe/0x1a
inet_init+0x171/0x26c
? ipv4_offload_init+0x66/0x66
do_one_initcall+0x43/0x160
kernel_init_freeable+0x191/0x219
? rest_init+0x80/0x80
kernel_init+0xe/0x150
ret_from_fork+0x22/0x30
Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
CR2: 0000000000000014
Fixes: 7b1a74fdbb ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
Fixes: 7f9b80529b ("[IPV4]: fib hash|trie initialization")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlk30BwACgkQONu9yGCS
aT5cmhAAh3etTuZ3xRw2eGW/Y/C8L2F2CjJjmR4vp1ms8P55uZg3xA20r5jNj7Ho
pwag3WTNzHpVfKFApavfEzToqDszRAtXcvYPPW9uXUPeu8LWyBJyvmN7lSQVKgDc
M9SWsd+8EGceopaj8KHjLMxNsV2n8j2ckxNf/BL/KgiMtJlgp/1TCDKUVS1k0cA7
CsuxDhxpRYpQofsIVww1hdrwCxVuntAY7u+/3B19ozXGFSRe/h5GO6xYRcG8pqfT
lvIgD6btdQJwI55QoSpJCpL96a534zc+akO0dtyaMJ3Q8UWQXD3JF8ZxMiPPrAe8
CLW390ATranIafmLi9g9DU1vQeEPNFXpeiYfxe65YL7igeAj/uPtVzKp0MvRcKG7
IBVNxbtsTQa73ig7gKSJ323CnpEfrr/XG73JNVtUQLxHa2poY7SUonRI587MFW2T
sONl9Pk3TxRC7Rc45si4RFsIj4jEF8ubUDXOPb2CrmDMB7MrM0PHfOW9lLCP92FD
pn0fM4vwNvm2ILsblqNcBumgeIBQ8ld2TBTbhRbh2FK4Rzxd2TSlWh4KqkcWcXCt
Lz8conU06AwTvDob1xoht3m6Gj32maopKZKGn5/Wq0YlfjOB/70CXOvPO3ChhKTh
QGNgA66bYdm+xn55wf7ty7Bq8yO6kcSNPQCXOb9S61nfCLA4KHM=
=U7IH
-----END PGP SIGNATURE-----
Merge 4.4.71 into android-4.4
Changes in 4.4.71
sparc: Fix -Wstringop-overflow warning
dccp/tcp: do not inherit mc_list from parent
ipv6/dccp: do not inherit ipv6_mc_list from parent
s390/qeth: handle sysfs error during initialization
s390/qeth: unbreak OSM and OSN support
s390/qeth: avoid null pointer dereference on OSN
tcp: avoid fragmenting peculiar skbs in SACK
sctp: fix src address selection if using secondary addresses for ipv6
sctp: do not inherit ipv6_{mc|ac|fl}_list from parent
tcp: eliminate negative reordering in tcp_clean_rtx_queue
net: Improve handling of failures on link and route dumps
ipv6: Prevent overrun when parsing v6 header options
ipv6: Check ip6_find_1stfragopt() return value properly.
bridge: netlink: check vlan_default_pvid range
qmi_wwan: add another Lenovo EM74xx device ID
bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
ipv6: fix out of bound writes in __ip6_append_data()
be2net: Fix offload features for Q-in-Q packets
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
tcp: avoid fastopen API to be used on AF_UNSPEC
sctp: fix ICMP processing if skb is non-linear
ipv4: add reference counting to metrics
netem: fix skb_orphan_partial()
net: phy: marvell: Limit errata to 88m1101
vlan: Fix tcp checksum offloads in Q-in-Q vlans
i2c: i2c-tiny-usb: fix buffer not being DMA capable
mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference
scsi: mpt3sas: Force request partial completion alignment
drm/radeon/ci: disable mclk switching for high refresh rates (v2)
drm/radeon: Unbreak HPD handling for r600+
pcmcia: remove left-over %Z format
ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430
slub/memcg: cure the brainless abuse of sysfs attributes
drm/gma500/psb: Actually use VBT mode when it is found
mm/migrate: fix refcount handling when !hugepage_migration_supported()
mlock: fix mlock count can not decrease in race condition
xfs: Fix missed holes in SEEK_HOLE implementation
xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
xfs: fix over-copying of getbmap parameters from userspace
xfs: handle array index overrun in xfs_dir2_leaf_readbuf()
xfs: prevent multi-fsb dir readahead from reading random blocks
xfs: fix up quotacheck buffer list error handling
xfs: support ability to wait on new inodes
xfs: update ag iterator to support wait on new inodes
xfs: wait on new inodes during quotaoff dquot release
xfs: fix indlen accounting error on partial delalloc conversion
xfs: bad assertion for delalloc an extent that start at i_size
xfs: fix unaligned access in xfs_btree_visit_blocks
xfs: in _attrlist_by_handle, copy the cursor back to userspace
xfs: only return -errno or success from attr ->put_listent
Linux 4.4.71
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit f6c5775ff0bfa62b072face6bf1d40f659f194b2 ]
In general, rtnetlink dumps do not anticipate failure to dump a single
object (e.g., link or route) on a single pass. As both route and link
objects have grown via more attributes, that is no longer a given.
netlink dumps can handle a failure if the dump function returns an
error; specifically, netlink_dump adds the return code to the response
if it is <= 0 so userspace is notified of the failure. The missing
piece is the rtnetlink dump functions returning the error.
Fix route and link dump functions to return the errors if no object is
added to an skb (detected by skb->len != 0). IPv6 route dumps
(rt6_dump_route) already return the error; this patch updates IPv4 and
link dumps. Other dump functions may need to be ajusted as well.
Reported-by: Jan Moskyto Matejka <mq@ucw.cz>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAljctYYACgkQONu9yGCS
aT6MbxAAwyGobI5sOr63yX5Myji1jf17vlY2h5dXet+8lu/csbFKmqHxaTLNwGIw
7u6V3AJ4zWdX8Q22dcVC98oySxcLxUhv+Rv/Dbonr3CYM00wNIex2wzON8f77eJQ
CBGJRNJR4/VG6opbVI/qp0t/c2oFiqHJXPldm3/Ru7jcBrLo5UHDWDY6cDhrj/Tg
F1maCBMAu1qW0z9KTnrQDvHjPHXmKfCviGzXpFTSVBQrh1s1bJkZkTqcY9eZNa/u
AXhHek5ZLFxlhkO105leR0YtXADbopiJ5c4EgXCASzNQ92/6IKsl21eOhHgOU9OA
YUCYftwKVMcxXGB6QFbdefLVtnCjUtlDa9+70oW1/4Ecee5FUBzNjVWVHOtYEsTY
pA3DqQI+U7EBCuIXsTtV0DiRWhHKq5uS1aphXZwnq/8qc2A3PD86JV/MBK5sWZfB
2V1N7xkitLFFCR6vMFLuusM8Np7kJ3zaAxQOd3IRc72iiNLkbNjdfJcAQ+E9b/Zx
5tpcthOl2RKhlOHHVKmYIioar8+RkZgWWl64+RTt6M1KjvHs07lPdCI+4cW8ELLM
/FUeRNTLmOiUv4dEPj5INYukEcLuCNp4fIo9lq8HrDceXbJXwiMFtHHlo4SQ/ubm
9v5iYdEmGet8jYPrfa5LDtD7G8K//k08nElVmI6N/4fNxRC2GcY=
=TI1/
-----END PGP SIGNATURE-----
Merge 4.4.48 into android-4.4
Changes in 4.4.48:
net/openvswitch: Set the ipv6 source tunnel key address attribute correctly
net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled
net: properly release sk_frag.page
amd-xgbe: Fix jumbo MTU processing on newer hardware
net: unix: properly re-increment inflight counter of GC discarded candidates
net/mlx5: Increase number of max QPs in default profile
net/mlx5e: Count LRO packets correctly
net: bcmgenet: remove bcmgenet_internal_phy_setup()
ipv4: provide stronger user input validation in nl_fib_input()
socket, bpf: fix sk_filter use after free in sk_clone_lock
tcp: initialize icsk_ack.lrcvtime at session start time
Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw
Input: i8042 - add noloop quirk for Dell Embedded Box PC 3000
Input: iforce - validate number of endpoints before using them
Input: ims-pcu - validate number of endpoints before using them
Input: hanwang - validate number of endpoints before using them
Input: yealink - validate number of endpoints before using them
Input: cm109 - validate number of endpoints before using them
Input: kbtab - validate number of endpoints before using them
Input: sur40 - validate number of endpoints before using them
ALSA: seq: Fix racy cell insertions during snd_seq_pool_done()
ALSA: ctxfi: Fix the incorrect check of dma_set_mask() call
ALSA: hda - Adding a group of pin definition to fix headset problem
USB: serial: option: add Quectel UC15, UC20, EC21, and EC25 modems
USB: serial: qcserial: add Dell DW5811e
ACM gadget: fix endianness in notifications
usb: gadget: f_uvc: Fix SuperSpeed companion descriptor's wBytesPerInterval
usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk
USB: uss720: fix NULL-deref at probe
USB: lvtest: fix NULL-deref at probe
USB: idmouse: fix NULL-deref at probe
USB: wusbcore: fix NULL-deref at probe
usb: musb: cppi41: don't check early-TX-interrupt for Isoch transfer
usb: hub: Fix crash after failure to read BOS descriptor
uwb: i1480-dfu: fix NULL-deref at probe
uwb: hwa-rc: fix NULL-deref at probe
mmc: ushc: fix NULL-deref at probe
iio: adc: ti_am335x_adc: fix fifo overrun recovery
iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3
parport: fix attempt to write duplicate procfiles
ext4: mark inode dirty after converting inline directory
mmc: sdhci: Do not disable interrupts while waiting for clock
xen/acpi: upload PM state from init-domain to Xen
iommu/vt-d: Fix NULL pointer dereference in device_to_iommu
ARM: at91: pm: cpu_idle: switch DDR to power-down mode
ARM: dts: at91: sama5d2: add dma properties to UART nodes
cpufreq: Restore policy min/max limits on CPU online
raid10: increment write counter after bio is split
libceph: don't set weight to IN when OSD is destroyed
xfs: don't allow di_size with high bit set
xfs: fix up xfs_swap_extent_forks inline extent handling
nl80211: fix dumpit error path RTNL deadlocks
USB: usbtmc: add missing endpoint sanity check
xfs: clear _XBF_PAGES from buffers when readahead page
xen: do not re-use pirq number cached in pci device msi msg data
igb: Workaround for igb i210 firmware issue
igb: add i211 to i210 PHY workaround
x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
PCI: Separate VF BAR updates from standard BAR updates
PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
PCI: Add comments about ROM BAR updating
PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
PCI: Don't update VF BARs while VF memory space is enabled
PCI: Update BARs using property bits appropriate for type
PCI: Ignore BAR updates on virtual functions
PCI: Do any VF BAR updates before enabling the BARs
vfio/spapr: Postpone allocation of userspace version of TCE table
block: allow WRITE_SAME commands with the SG_IO ioctl
s390/zcrypt: Introduce CEX6 toleration
uvcvideo: uvc_scan_fallback() for webcams with broken chain
ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520
ACPI / blacklist: Make Dell Latitude 3350 ethernet work
serial: 8250_pci: Detach low-level driver during PCI error recovery
fbcon: Fix vc attr at deinit
crypto: algif_hash - avoid zero-sized array
Linux 4.4.58
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit c64c0b3cac4c5b8cb093727d2c19743ea3965c0b ]
Alexander reported a KMSAN splat caused by reads of uninitialized
field (tb_id_in) from user provided struct fib_result_nl
It turns out nl_fib_input() sanity tests on user input is a bit
wrong :
User can pretend nlh->nlmsg_len is big enough, but provide
at sendmsg() time a too small buffer.
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5350d54f6cd12eaff623e890744c79b700bd3f17 ]
In the case of custom rules being present we need to handle the case of the
LOCAL table being intialized after the new rule has been added. To address
that I am adding a new check so that we can make certain we don't use an
alias of MAIN for LOCAL when allocating a new table.
Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Oliver Brunel <jjk@jjacky.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 391a20333b8393ef2e13014e6e59d192c5594471 ]
After commit fbd40ea0180a ("ipv4: Don't do expensive useless work
during inetdev destroy.") when deleting an interface,
fib_del_ifaddr() can be executed without any primary address
present on the dead interface.
The above is safe, but triggers some "bug: prim == NULL" warnings.
This commit avoids warning if the in_dev is dead
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4cfc86f3dae6ca38ed49cdd78f458a03d4d87992 ]
Field fl4.flowi4_flags is not initialized in fib_compute_spec_dst()
before calling fib_lookup(), which means fib_table_lookup() is
using non-deterministic data at this line:
if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {
Fix by initializing the entire fl4 structure, which will prevent
similar issues as fields are added in the future by ensuring that
all fields are initialized to zero unless explicitly initialized
to another value.
Fixes: 58189ca7b2 ("net: Fix vti use case with oif in dst lookups")
Suggested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fbd40ea0180a2d328c5adc61414dc8bab9335ce2 ]
When an inetdev is destroyed, every address assigned to the interface
is removed. And in this scenerio we do two pointless things which can
be very expensive if the number of assigned interfaces is large:
1) Address promotion. We are deleting all addresses, so there is no
point in doing this.
2) A full nf conntrack table purge for every address. We only need to
do this once, as is already caught by the existing
masq_dev_notifier so masq_inet_event() can skip this.
Reported-by: Solar Designer <solar@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This contains the following commits:
1. cc2f522 net: core: Add a UID range to fib rules.
2. d7ed2bd net: core: Use the socket UID in routing lookups.
3. 2f9306a net: core: Add a RTA_UID attribute to routes.
This is so that userspace can do per-UID route lookups.
4. 8e46efb net: ipv6: Use the UID in IPv6 PMTUD
IPv4 PMTUD already does this because ipv4_sk_update_pmtu
uses __build_flow_key, which includes the UID.
Bug: 15413527
Change-Id: Iae3d4ca3979d252b6cec989bdc1a6875f811f03a
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
The VRF driver cycles netdevs when an interface is enslaved or released:
the down event is used to flush neighbor and route tables and the up
event (if the interface was already up) effectively moves local and
connected routes to the proper table.
As of 4f823defdd the local route is left hanging around after a link
down, so when a netdev is moved from one VRF to another (or released
from a VRF altogether) local routes are left in the wrong table.
Fix by handling the NETDEV_CHANGEUPPER event. When the upper dev is
an L3mdev then call fib_disable_ip to flush all routes, local ones
to.
Fixes: 4f823defdd ("ipv4: fix to not remove local route on link down")
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were
fixing the "BH-ness" of the counter bumps whilst in 'net-next'
the functions were modified to take an explicit 'net' parameter.
Signed-off-by: David S. Miller <davem@davemloft.net>
When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
we should not delete the local routes if the local address
is still present. The confusion comes from the fact that both
fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
constant. Fix it by returning back the variable 'force'.
Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
ifconfig dummy0 down
ip route list table local | grep dummy | grep host
local 192.168.168.1 dev dummy0 proto kernel scope host src 192.168.168.1
Fixes: 8a3d03166f ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently adding a new ipv4 address always cause the creation of the
related network route, with default metric. When a host has multiple
interfaces on the same network, multiple routes with the same metric
are created.
If the userspace wants to set specific metric on each routes, i.e.
giving better metric to ethernet links in respect to Wi-Fi ones,
the network routes must be deleted and recreated, which is error-prone.
This patch implements the support for IFA_F_NOPREFIXROUTE for ipv4
address. When an address is added with such flag set, no associated
network route is created, no network route is deleted when
said IP is gone and it's up to the user space manage such route.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fib_table_lookup tracepoint found 2 places where the flowi4_flags is
not initialized.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace calls to vrf_dev_table and friends with l3mdev_fib_table
and kin.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace calls to vrf_master_ifindex_rcu and vrf_master_ifindex with either
l3mdev_master_ifindex_rcu or l3mdev_master_ifindex.
The pattern:
oif = vrf_master_ifindex(dev) ? : dev->ifindex;
is replaced with
oif = l3mdev_fib_oif(dev);
And remove the now unused vrf macros.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A number of VRF patches used 'int' for table id. It should be u32 to be
consistent with the rest of the stack.
Fixes:
4e3c89920c ("net: Introduce VRF related flags and helpers")
15be405eb2 ("net: Add inet_addr lookup by table")
30bbaa1950 ("net: Fix up inet_addr_type checks")
021dd3b8a1 ("net: Add routes to the table associated with the device")
dc028da54e ("inet: Move VRF table lookup to inlined function")
f6d3c19274 ("net: FIB tracepoints")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A few useful tracepoints developing VRF driver.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a device associated with a VRF is brought up or down routes
should be added to/removed from the table associated with the VRF.
fib_magic defaults to using the main or local tables. Have it use
the table with the device if there is one.
A part of this is directing prefsrc validations to the correct
table as well.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently inet_addr_type and inet_dev_addr_type expect local addresses
to be in the local table. With the VRF device local routes for devices
associated with a VRF will be in the table associated with the VRF.
Provide an alternate inet_addr lookup to use a specific table rather
than defaulting to the local table.
inet_addr_type_dev_table keeps the same semantics as inet_addr_type but
if the passed in device is enslaved to a VRF then the table for that VRF
is used for the lookup.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently inet_addr_type and inet_dev_addr_type expect local addresses
to be in the local table. With the VRF device local routes for devices
associated with a VRF will be in the table associated with the VRF.
Provide an alternate inet_addr lookup to use a specific table rather
than defaulting to the local table.
Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On ingress use index of VRF master device for route lookups if real device
is enslaved. Rules are expected to be installed for the VRF device to
direct lookups to a specific table.
Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new flowi_tunnel structure which is a subset of ip_tunnel_key to
allow routes to match on tunnel metadata. For now, the tunnel id is
added to flowi_tunnel which allows for routes to be bound to specific
virtual tunnels.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support in ipv4 fib functions to parse user
provided encap attributes and attach encap state data to fib_nh
and rtable.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This feature is only enabled with the new per-interface or ipv4 global
sysctls called 'ignore_routes_with_linkdown'.
net.ipv4.conf.all.ignore_routes_with_linkdown = 0
net.ipv4.conf.default.ignore_routes_with_linkdown = 0
net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
...
When the above sysctls are set, will report to userspace that a route is
dead and will no longer resolve to this nexthop when performing a fib
lookup. This will signal to userspace that the route will not be
selected. The signalling of a RTNH_F_DEAD is only passed to userspace
if the sysctl is enabled and link is down. This was done as without it
the netlink listeners would have no idea whether or not a nexthop would
be selected. The kernel only sets RTNH_F_DEAD internally if the
interface has IFF_UP cleared.
With the new sysctl set, the following behavior can be observed
(interface p8p1 is link-down):
default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 dead linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
90.0.0.1 via 70.0.0.2 dev p7p1 src 70.0.0.1
cache
local 80.0.0.1 dev lo src 80.0.0.1
cache <local>
80.0.0.2 via 10.0.5.2 dev p9p1 src 10.0.5.15
cache
While the route does remain in the table (so it can be modified if
needed rather than being wiped away as it would be if IFF_UP was
cleared), the proper next-hop is chosen automatically when the link is
down. Now interface p8p1 is linked-up:
default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1
90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1
90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
90.0.0.1 via 80.0.0.2 dev p8p1 src 80.0.0.1
cache
local 80.0.0.1 dev lo src 80.0.0.1
cache <local>
80.0.0.2 dev p8p1 src 80.0.0.1
cache
and the output changes to what one would expect.
If the sysctl is not set, the following output would be expected when
p8p1 is down:
default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15
70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2
Since the dead flag does not appear, there should be no expectation that
the kernel would skip using this route due to link being down.
v2: Split kernel changes into 2 patches, this actually makes a
behavioral change if the sysctl is set. Also took suggestion from Alex
to simplify code by only checking sysctl during fib lookup and
suggestion from Scott to add a per-interface sysctl.
v3: Code clean-ups to make it more readable and efficient as well as a
reverse path check fix.
v4: Drop binary sysctl
v5: Whitespace fixups from Dave
v6: Style changes from Dave and checkpatch suggestions
v7: One more checkpatch fixup
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a fib flag called RTNH_F_LINKDOWN to any ipv4 nexthops that are
reachable via an interface where carrier is off. No action is taken,
but additional flags are passed to userspace to indicate carrier status.
This also includes a cleanup to fib_disable_ip to more clearly indicate
what event made the function call to replace the more cryptic force
option previously used.
v2: Split out kernel functionality into 2 patches, this patch simply
sets and clears new nexthop flag RTNH_F_LINKDOWN.
v3: Cleanups suggested by Alex as well as a bug noticed in
fib_sync_down_dev and fib_sync_up when multipath was not enabled.
v5: Whitespace and variable declaration fixups suggested by Dave.
v6: Style fixups noticed by Dave; ran checkpatch to be sure I got them
all.
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/mellanox/mlx4/cmd.c
net/core/fib_rules.c
net/ipv4/fib_frontend.c
The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.
The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.
Signed-off-by: David S. Miller <davem@davemloft.net>
The ipv4 code uses a mixture of coding styles. In some instances check
for NULL pointer is done as x == NULL and sometimes as !x. !x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.
No changes detected by objdiff.
Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to hold rtnl lock for fib_rules_unregister()
otherwise the following race could happen:
fib_rules_unregister(): fib_nl_delrule():
... ...
... ops = lookup_rules_ops();
list_del_rcu(&ops->list);
list_for_each_entry(ops->rules) {
fib_rules_cleanup_ops(ops); ...
list_del_rcu(); list_del_rcu();
}
Note, net->rules_mod_lock is actually not needed at all,
either upper layer netns code or rtnl lock guarantees
we are safe.
Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While fixing a recent issue I noticed that we are doing some unnecessary
work inside the loop for ip_fib_net_exit. As such I am pulling out the
initialization to NULL for the locally stored fib_local, fib_main, and
fib_default.
In addition I am restoring the original code for flushing the table as
there is no need to split up the fib_table_flush and hlist_del work since
the code for packing the tnodes with multiple key vectors was dropped.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the following warning:
BUG: sleeping function called from invalid context at mm/slub.c:1268
in_atomic(): 1, irqs_disabled(): 0, pid: 6, name: kworker/u8:0
INFO: lockdep is turned off.
CPU: 3 PID: 6 Comm: kworker/u8:0 Tainted: G W 4.0.0-rc5+ #895
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: netns cleanup_net
0000000000000006 ffff88011953fa68 ffffffff81a203b6 000000002c3a2c39
ffff88011952a680 ffff88011953fa98 ffffffff8109daf0 ffff8801186c6aa8
ffffffff81fbc9e5 00000000000004f4 0000000000000000 ffff88011953fac8
Call Trace:
[<ffffffff81a203b6>] dump_stack+0x4c/0x65
[<ffffffff8109daf0>] ___might_sleep+0x1c3/0x1cb
[<ffffffff8109db70>] __might_sleep+0x78/0x80
[<ffffffff8117a60e>] slab_pre_alloc_hook+0x31/0x8f
[<ffffffff8117d4f6>] __kmalloc+0x69/0x14e
[<ffffffff818ed0e1>] ? kzalloc.constprop.20+0xe/0x10
[<ffffffff818ed0e1>] kzalloc.constprop.20+0xe/0x10
[<ffffffff818ef622>] fib_trie_table+0x27/0x8b
[<ffffffff818ef6bd>] fib_trie_unmerge+0x37/0x2a6
[<ffffffff810b06e1>] ? arch_local_irq_save+0x9/0xc
[<ffffffff818e9793>] fib_unmerge+0x2d/0xb3
[<ffffffff818f5f56>] fib4_rule_delete+0x1f/0x52
[<ffffffff817f1c3f>] ? fib_rules_unregister+0x30/0xb2
[<ffffffff817f1c8b>] fib_rules_unregister+0x7c/0xb2
[<ffffffff818f64a1>] fib4_rules_exit+0x15/0x18
[<ffffffff818e8c0a>] ip_fib_net_exit+0x23/0xf2
[<ffffffff818e91f8>] fib_net_exit+0x32/0x36
[<ffffffff817c8352>] ops_exit_list+0x45/0x57
[<ffffffff817c8d3d>] cleanup_net+0x13c/0x1cd
[<ffffffff8108b05d>] process_one_work+0x255/0x4ad
[<ffffffff8108af69>] ? process_one_work+0x161/0x4ad
[<ffffffff8108b4b1>] worker_thread+0x1cd/0x2ab
[<ffffffff8108b2e4>] ? process_scheduled_works+0x2f/0x2f
[<ffffffff81090686>] kthread+0xd4/0xdc
[<ffffffff8109ec8f>] ? local_clock+0x19/0x22
[<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83
[<ffffffff81a2c0c8>] ret_from_fork+0x58/0x90
[<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83
The issue was that as a part of exiting the default rules were being
deleted which resulted in the local trie being unmerged. By moving the
freeing of the FIB tables up we can avoid the unmerge since there is no
local table left when we call the fib4_rules_exit function.
Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function fib_unmerge assumed the local table had already been
allocated. If that is not the case however when custom rules are applied
then this can result in a NULL pointer dereference.
In order to prevent this we must check the value of the local table pointer
and if it is NULL simply return 0 as there is no local table to separate
from the main.
Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Madhu Challa <challa@noironetworks.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 0-day kernel test infrastructure reported a use of uninitialized
variable warning for local_table due to the fact that the local and main
allocations had been swapped from the original setup. This change corrects
that by making it so that we free the main table if the local table
allocation fails.
Fixes: 0ddcf43d5 ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move rtnl_lock() before the call to fib4_rules_exit so that
fib_table_flush_external is called under RTNL.
Fixes: 104616e74e ("switchdev: don't support custom ip rules, for now")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is meant to collapse local and main into one by converting
tb_data from an array to a pointer. Doing this allows us to point the
local table into the main while maintaining the same variables in the
table.
As such the tb_data was converted from an array to a pointer, and a new
array called data is added in order to still provide an object for tb_data
to point to.
In order to track the origin of the fib aliases a tb_id value was added in
a hole that existed on 64b systems. Using this we can also reverse the
merge in the event that custom FIB rules are enabled.
With this patch I am seeing an improvement of 20ns to 30ns for routing
lookups as long as custom rules are not enabled, with custom rules enabled
we fall back to split tables and the original behavior.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keep switchdev FIB offload model simple for now and don't allow custom ip
rules.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fib_table was wrapped in several places with an
rcu_read_lock/rcu_read_unlock however after looking over the code I found
several spots where the tables were being accessed as just standard
pointers without any protections. This change fixes that so that all of
the proper protections are in place when accessing the table to take RCU
replacement or removal of the table into account.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change is to start cleaning up some of the rcu_read_lock/unlock
handling. I realized while reviewing the code there are several spots that
I don't believe are being handled correctly or are masking warnings by
locally calling rcu_read_lock/unlock instead of calling them at the correct
level.
A common example is a call to fib_get_table followed by fib_table_lookup.
The rcu_read_lock/unlock ought to wrap both but there are several spots where
they were not wrapped.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The trie usage stats were currently being shared by all threads that were
calling fib_table_lookup. As a result when multiple threads were
performing lookups simultaneously the trie would begin to cache bounce
between those threads.
In order to prevent this I have updated the usage stats to use a set of
percpu variables. By doing this we should be able to avoid the cache
bouncing and still make use of these stats.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7a9bc9b81a ("ipv4: Elide fib_validate_source() completely when possible.")
introduced a short-circuit to avoid calling fib_validate_source when not
needed. That change took rp_filter into account, but not accept_local.
This resulted in a change of behaviour: with rp_filter and accept_local
off, incoming packets with a local address in the source field should be
dropped.
Here is how to reproduce the change pre/post 7a9bc9b81a commit:
-configure the same IPv4 address on hosts A and B.
-try to send an ARP request from B to A.
-The ARP request will be dropped before that commit, but accepted and answered
after that commit.
This adds a check for ACCEPT_LOCAL, to maintain full
fib validation in case it is 0. We also leave __fib_validate_source() earlier
when possible, based on the same check as fib_validate_source(), once the
accept_local stuff is verified.
Cc: Gregory Detal <gregory.detal@uclouvain.be>
Cc: Christoph Paasch <christoph.paasch@uclouvain.be>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
As suggested by Julian:
Simply, flowi4_iif must not contain 0, it does not
look logical to ignore all ip rules with specified iif.
because in fib_rule_match() we do:
if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
goto out;
flowi4_iif should be LOOPBACK_IFINDEX by default.
We need to move LOOPBACK_IFINDEX to include/net/flow.h:
1) It is mostly used by flowi_iif
2) Fix the following compile error if we use it in flow.h
by the patches latter:
In file included from include/linux/netfilter.h:277:0,
from include/net/netns/netfilter.h:5,
from include/net/net_namespace.h:21,
from include/linux/netdevice.h:43,
from include/linux/icmpv6.h:12,
from include/linux/ipv6.h:61,
from include/net/ipv6.h:16,
from include/linux/sunrpc/clnt.h:27,
from include/linux/nfs_fs.h:30,
from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>