PRM driver now has a generic API for asserting hardware resets. SoC
specific support functions are registered through the prm_ll_data.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Added support for prm_init for AM33xx SoC. This is needed to register
SoC specific prm_ll_data for these devices.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
AM43xx is using OMAP4+ PRM driver, so it should be using the corresponding
hardreset ops from the hwmod also.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
These are now identical with the OMAP4 implementations, so use the OMAP4
versions and remove the AM33xx ones.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
CM driver has a generic API which calls the SoC specific split function
through cm_ll_data, so there is no need for the SoC specific functions to
be publicly available.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Adds a generic CM driver API for enabling/disabling modules.
The SoC specific implementations are registered through cm_ll_data.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
These are not accessed outside the cm*.c files themselves, so make them
static.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Adds a generic CM driver API for waiting module to enter idle / standby.
The SoC specific implementations are registered through cm_ll_data.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This patch consolidates the parameters provided for the SoC specific
cm_*_wait_module_ready calls, adds the missing cm_ll_data function
pointers and uses the now generic call from the mach-omap2 board code.
SoC specific *_wait_module_ready calls are also made static so they
can only be accessed through the generic CM driver API only.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This is needed for expanding the generic CM driver API to include
AM33xx and OMAP4 also.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This is not needed for anything. This also eases the consolidation of
the wait_module_ready / wait_module_idle calls behind a generic CM
driver API by reducing the number of needed parameters.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The implementation on these is identical, so no need to have them separate.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
AM43xx will be re-using OMAP4 PRM driver, thus call its init function.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
This is done in attempt to get rid of cpu_is_X calls from the PRM core.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
snd_bebob_stream_check_internal_clock() may get an id from
saffirepro_both_clk_src_get (via clk_src->get()) that was uninitialized.
a) make logic in saffirepro_both_clk_src_get explicit
b) test if id used in snd_bebob_stream_check_internal_clock matches array size
[fixed missing signed prefix to *_maps[] by tiwai]
Signed-off-by: Christian Vogel <vogelchr@vogel.cx>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
A few small driver fixes for v3.18 plus the removal of the s6000 support
since the relevant chip is no longer supported in mainline.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUTiqwAAoJECTWi3JdVIfQpswH/1pP9jyGe7XjT73m9QfodcZT
nIpKd1Hm06vyi74Yehryq/ows9vBQOhDA1pkF6j3ti6Q0Fdpu1Ko9Co64t+9EL47
iIUixRrQFsN0BDJvmlkHuH047x9OLnhgaegC/RgSu/ReIrpoL5AxP8LcJhgT7K8j
tG1YrBg+wVv/lLr2UV216bXpS/dU430G/uGPj3NC4llDiP4U10YXkr1QBSbGcRo9
2YgJV7TIO3fitjn7fZFQYjRUtwXnyjxxDN9jMWUJnbCkfuTPMyecIqoS9xGj5VyB
YMOlFdrkBVqeqkXjks1K6k6hBSVbmLit7dBxJ64ly7+Vf1+8HK+ha7GulEe7WwM=
=TSKi
-----END PGP SIGNATURE-----
Merge tag 'asoc-v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v3.18
A few small driver fixes for v3.18 plus the removal of the s6000 support
since the relevant chip is no longer supported in mainline.
vlv_cdclk_freq is in kHz but we need MHz for the GMBUSFREQ divider.
This is a regression from:
commit f8bf63fdcb
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri Jun 13 13:37:54 2014 +0300
drm/i915: Kill duplicated cdclk readout code from i2c
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Turning vdd on/off can generate a long hpd pulse on eDP ports. In order
to handle hpd we would need to turn on vdd to perform aux transfers.
This would lead to an endless cycle of
"vdd off -> long hpd -> vdd on -> detect -> vdd off -> ..."
So ignore long hpd pulses on eDP ports. eDP panels should be physically
tied to the machine anyway so they should not actually disappear and
thus don't need long hpd handling. Short hpds are still needed for link
re-train and whatnot so we can't just turn off the hpd interrupt
entirely for eDP ports. Perhaps we could turn it off whenever the panel
is disabled, but just ignoring the long hpd seems sufficient.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Sometimes we seem to get utter garbage from DPCD reads. The resulting
buffer is filled with the same byte, and the operation completed without
errors. My HP ZR24w monitor seems particularly susceptible to this
problem once it's gone into a sleep mode.
The issue seems to happen only for the first AUX message that wakes the
sink up. But as the first AUX read we often do is the DPCD receiver
cap it does wreak a bit of havoc with subsequent link training etc. when
the receiver cap bw/lane/etc. information is garbage.
A sufficient workaround seems to be to perform a single byte dummy read
before reading the actual data. I suppose that just wakes up the sink
sufficiently and we can just throw away the returned data in case it's
crap. DP_DPCD_REV seems like a sufficiently safe location to read here.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Todd Previte <tprevite@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
It is lite version of AIO machine(0x0626).
The audio layout of this machine was similar with SSID 0x0626.
The audio was same as commit ad8ff99e6b.
Signed-off-by: Kailang Yang <kailang@realtek.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Precedence of & and >> is not the same and is not left to right.
shift has higher precedence and should be done after the mask.
Add parentheses around the mask.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Terratec PHASE 88 rack fw has two registers for source of clock, one is
for internal/external, and another is for wordclock/spdif for external.
When clock source is internal, information in another register has no meaning.
Thus it must be ignored, but current implementation decodes it. This causes
over-indexing reference to labels.
Reported-by: András Murányi <muranyia@gmail.com>
Tested-by: András Murányi <muranyia@gmail.com>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 0b0b0893d4 "of/pci: Fix the conversion of IO ranges into IO
resources" changed the behaviour of of_pci_range_to_resource().
The issue is described here:
"powerpc/pci: Fix IO space breakage after of_pci_range_to_resource()
change"
(sha1: aeba3731b1)
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
The time Kconfig expects that NR_CPUS is defined.
This patch remove this config warning:
"kernel/time/Kconfig:163:warning: range is invalid"
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Do not reuse skb if it was pfmemalloc tainted, otherwise
future frame might be dropped anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eli Cohen says:
====================
irq sync fixes
This two patch series fixes a race where an interrupt handler could access a
freed memory.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After moving the EQ ownership to software effectively destroying it, call
synchronize_irq() to ensure that any handler routines running on other CPU
cores finish execution. Only then free the EQ buffer.
The same thing is done when we destroy a CQ which is one of the sources
generating interrupts. In the case of CQ we want to avoid completion handlers
on a CQ that was destroyed. In the case we do the same to avoid receiving
asynchronous events after the EQ has been destroyed and its buffers freed.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After destroying the EQ, the object responsible for generating interrupts, call
synchronize_irq() to ensure that any handler routines running on other CPU
cores finish execution. Only then free the EQ buffer. This patch solves a very
rare case when we get panic on driver unload.
The same thing is done when we destroy a CQ which is one of the sources
generating interrupts. In the case of CQ we want to avoid completion handlers
on a CQ that was destroyed. In the case we do the same to avoid receiving
asynchronous events after the EQ has been destroyed and its buffers freed.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The call of 'kfree(ci->hw_bank.regmap)' in ci_hdrc_remove() sometimes causes
a kernel oops when removing the ci_hdrc module.
Since there is no separate memory allocated for the ci->hw_bank.regmap array,
there is no need to free it.
Cc: v3.14+ <stable@@vger.kernel.org>
Signed-off-by: Torsten Fleischer <to-fleischer@t-online.de>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On E2, the arch timer is hooked up to a different clock, and the CA7's arch
timer CNTVOFF register must be initialized.
Based on work by Hisashi Nakamura.
Signed-off-by: Ulrich Hecht <ulrich.hecht+renesas@gmail.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Commit c387f07e62 (clocksource: arm_arch_timer: Discard unavailable
timers correctly) changed the way the driver makes sure both the memory
and system-register timers have been probed before finalizing the probing.
There is a interesting flaw in this logic that leads to this final step
never to be executed. Things seems to work pretty well until something
actually needs the data that is produced during this final stage.
For example, KVM explodes on the first run of a guest when executed on
a platform that has both memory and sysreg nodes (Juno, for example).
Just fix the damned logic, and enjoy booting VMs again.
Tested on a Juno system.
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Another week, another small batch of fixes.
Most of these make zynq, socfpga and sunxi platforms work a bit
better:
* Due to new requirements for regulators, DWMMC on socfpga broke past 3.17.
* SMP spinup fix for socfpga
* A few DT fixes for zynq
* Another option (FIXED_REGULATOR) for sunxi is needed that used to be selected
by other options but no longer is.
* A couple of small DT fixes for at91
* ...and a couple for i.MX.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUTSwQAAoJEIwa5zzehBx3D0AP/3ktsJ9ORSSDDEbpGWUPndQN
bLGOT4DGfWWn/BOlMYN9kM2k7Gr6ttxFzqepKoeb0Dl5myUeqC4C42t8FqrI78TB
wf8e9f2lXI+j3wve55FarTDk9JSh6PbQdavgnNCzVLJQddA//JKz9vZhL4jVYC/s
kh8VeoLOYKeE/sdcYeBF36zNAkmy0CfaGjC01SZEcd7BjVv8qq0TvkXXSP1bjsry
ztH+DN8OJ3gg7IKB8IntfzaxSnDQl+zxlVeOsPaU1Lvahs6wSFgRqA849Nc6KXdl
rpAuaTH6Pa5RNEd1zqhE2+o4xZymk/BM+JU77pizq4dP0o3JnDy5tzzMMd24FuMG
sD+JZrSCP9o58L1y9W1jhVgoxmpnRGZNO1n8FhABcnSTL50W3iAzIvlpxnOIu0/z
SzNMdItA3dtCn/Aec7wL7eGLUlyI73khMIt4heQ0jPY+IncGJ0yvdFe2m8SZKmS2
mDeQaChml8rjXvIdjiWIlDTagBpTkR1R1JX6aJh0lgZIF1K9qf1ZfzJ5dbLAXtZe
xjGeoOe8hXRxR0spc1rRAJlPGJh/Fqkm0UeFLDwP0DOJISTcgz4daT/Y7zdDGRJ6
n+1kjrmwv/M481wNifFt33sdZEB1EcUO/uNAYfUV0Wlpv5ye7x2aLsfbsnMEh+qd
H0a6R6NZu7473ewhWxRu
=MTvh
-----END PGP SIGNATURE-----
Merge tag 'armsoc-for-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Another week, another small batch of fixes.
Most of these make zynq, socfpga and sunxi platforms work a bit
better:
- due to new requirements for regulators, DWMMC on socfpga broke past
v3.17
- SMP spinup fix for socfpga
- a few DT fixes for zynq
- another option (FIXED_REGULATOR) for sunxi is needed that used to
be selected by other options but no longer is.
- a couple of small DT fixes for at91
- ...and a couple for i.MX"
* tag 'armsoc-for-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: imx28-evk: Let i2c0 run at 100kHz
ARM: i.MX6: Fix "emi" clock name typo
ARM: multi_v7_defconfig: enable CONFIG_MMC_DW_ROCKCHIP
ARM: sunxi_defconfig: enable CONFIG_REGULATOR_FIXED_VOLTAGE
ARM: dts: socfpga: Add a 3.3V fixed regulator node
ARM: dts: socfpga: Fix SD card detect
ARM: dts: socfpga: rename gpio nodes
ARM: at91/dt: sam9263: fix PLLB frequencies
power: reset: at91-reset: fix power down register
MAINTAINERS: add atmel ssc driver maintainer entry
arm: socfpga: fix fetching cpu1start_addr for SMP
ARM: zynq: DT: trivial: Fix mc node
ARM: zynq: DT: Add cadence watchdog node
ARM: zynq: DT: Add missing reference for memory-controller
ARM: zynq: DT: Add missing reference for ADC
ARM: zynq: DT: Add missing address for L2 pl310
ARM: zynq: DT: Remove 222 MHz OPP
ARM: zynq: DT: Fix GEM register area size
free_pi_state and exit_pi_state_list both clean up futex_pi_state's.
exit_pi_state_list takes the hb lock first, and most callers of
free_pi_state do too. requeue_pi doesn't, which means free_pi_state
can free the pi_state out from under exit_pi_state_list. For example:
task A | task B
exit_pi_state_list |
pi_state = |
curr->pi_state_list->next |
| futex_requeue(requeue_pi=1)
| // pi_state is the same as
| // the one in task A
| free_pi_state(pi_state)
| list_del_init(&pi_state->list)
| kfree(pi_state)
list_del_init(&pi_state->list) |
Move the free_pi_state calls in requeue_pi to before it drops the hb
locks which it's already holding.
[ tglx: Removed a pointless free_pi_state() call and the hb->lock held
debugging. The latter comes via a seperate patch ]
Signed-off-by: Brian Silverman <bsilver16384@gmail.com>
Cc: austin.linux@gmail.com
Cc: darren@dvhart.com
Cc: peterz@infradead.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1414282837-23092-1-git-send-email-bsilver16384@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Update our documentation as of fix 76835b0ebf (futex: Ensure
get_futex_key_refs() always implies a barrier). Explicitly
state that we don't do key referencing for private futexes.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Matteo Franchin <Matteo.Franchin@arm.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/1414121220.817.0.camel@linux-t7sj.site
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- Revert one patch which increases I2C bus frequency on imx28-evk
- Fix a typo on imx6q EIM clock name
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUTE8JAAoJEFBXWFqHsHzOGawH/0KGNaHbI3rj+Hx1HHtN056y
3rgHSsLZSLQB89+bMd8aEVPJ2z0RKYXfyI1IvkcgEZxsqmHwRY8Fwlof4D38/bfP
tRHnyzT2E+znnyhvUZlH9yd9foTd3VkXbxFxbEssRHl2W2OxA0+3MbskknERPZqs
qr22DcMLKyrTbUH39iiEjS43qcJhuf/6vZmoVGCGdZonZwkH8WccIQ+kKneOn8/Z
11U4ioB4pirqvhM1niYQ95RLG0TveBN6op3c1HWkhqY4EKOlraZHQb4EOoslSO/X
vWoJqgB9DLH3eV+WTFI0FjGDK/6CFhgAth8q0FKVlHA3FFHr+fXdxv/+NLtagzQ=
=elO/
-----END PGP SIGNATURE-----
Merge tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
Merge "ARM: imx: fixes for 3.18" from Shawn Guo:
The i.MX fixes for 3.18:
- Revert one patch which increases I2C bus frequency on imx28-evk
- Fix a typo on imx6q EIM clock name
* tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx28-evk: Let i2c0 run at 100kHz
ARM: i.MX6: Fix "emi" clock name typo
Signed-off-by: Olof Johansson <olof@lixom.net>
drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c: In function ‘xgene_enet_ecc_init’:
drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c:126: warning: ‘data’ may be used uninitialized in this function
Depending on the arbitrary value on the stack, the loop may terminate
too early, and cause a bogus -ENODEV failure.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We accidentally mask by the _SHIFT variable. It means that "event" is
always zero.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to cancel the work queue after rcu grace period,
otherwise it can be rescheduled by incoming packets.
We need to purge queue if some skbs are still in it.
We can use __skb_queue_head_init() variant in
macvlan_process_broadcast()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 412ca1550c ("macvlan: Move broadcasts into a work queue")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
percpu tcp_md5sig_pool contains memory blobs that ultimately
go through sg_set_buf().
-> sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
This requires that whole area is in a physically contiguous portion
of memory. And that @buf is not backed by vmalloc().
Given that alloc_percpu() can use vmalloc() areas, this does not
fit the requirements.
Replace alloc_percpu() by a static DEFINE_PER_CPU() as tcp_md5sig_pool
is small anyway, there is no gain to dynamically allocate it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 765cf9976e ("tcp: md5: remove one indirection level in tcp_md5sig_pool")
Reported-by: Crestez Dan Leonard <cdleonard@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Vrabel says:
====================
xen-netback: guest Rx queue drain and stall fixes
This series fixes two critical xen-netback bugs.
1. Netback may consume all of host memory by queuing an unlimited
number of skb on the internal guest Rx queue. This behaviour is
guest triggerable.
2. Carrier flapping under high traffic rates which reduces
performance.
The first patch is a prerequite. Removing support for frontends with
feature-rx-notify makes it easier to reason about the correctness of
netback since it no longer has to support this outdated and broken
mode.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If a frontend not receiving packets it is useful to detect this and
turn off the carrier so packets are dropped early instead of being
queued and drained when they expire.
A to-guest queue is stalled if it doesn't have enough free slots for a
an extended period of time (default 60 s).
If at least one queue is stalled, the carrier is turned off (in the
expectation that the other queues will soon stall as well). The
carrier is only turned on once all queues are ready.
When the frontend connects, all the queues start in the stalled state
and only become ready once the frontend queues enough Rx requests.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netback needs to discard old to-guest skb's (guest Rx queue drain) and
it needs detect guest Rx stalls (to disable the carrier so packets are
discarded earlier), but the current implementation is very broken.
1. The check in hard_start_xmit of the slot availability did not
consider the number of packets that were already in the guest Rx
queue. This could allow the queue to grow without bound.
The guest stops consuming packets and the ring was allowed to fill
leaving S slot free. Netback queues a packet requiring more than S
slots (ensuring that the ring stays with S slots free). Netback
queue indefinately packets provided that then require S or fewer
slots.
2. The Rx stall detection is not triggered in this case since the
(host) Tx queue is not stopped.
3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback
will consider this an Rx purge event which may result in it taking
the carrier down unnecessarily. It also considers a queue with
only 1 slot free as unstalled (even though the next packet might
not fit in this).
The internal guest Rx queue is limited by a byte length (to 512 Kib,
enough for half the ring). The (host) Tx queue is stopped and started
based on this limit. This sets an upper bound on the amount of memory
used by packets on the internal queue.
This allows the estimatation of the number of slots for an skb to be
removed (it wasn't a very good estimate anyway). Instead, the guest
Rx thread just waits for enough free slots for a maximum sized packet.
skbs queued on the internal queue have an 'expires' time (set to the
current time plus the drain timeout). The guest Rx thread will detect
when the skb at the head of the queue has expired and discard expired
skbs. This sets a clear upper bound on the length of time an skb can
be queued for. For a guest being destroyed the maximum time needed to
wait for all the packets it sent to be dropped is still the drain
timeout (10 s) since it will not be sending new packets.
Rx stall detection is reintroduced in a later commit.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>