Commit graph

521886 commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
3e323dc0a8 perf hists browser: React to unassigned hotkey pressing
When that happens we were just ignoring the key press, now this
message is presented in the bottom line (the help line):

  "Press '?' for help on key bindings"

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-iyma2j5kj3q9i1stl4mfh90n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 18:14:05 -03:00
Arnaldo Carvalho de Melo
ae3b6ab603 perf top: Tell the user how to unfreeze events after pressing 'f'
When the user presses 'f' to disable events the visual cues are, well,
the percentages not changing and the number of events freezing.

Be more explicit by changing the help line at the bottom of the screen
to show the following messages when 'f' is pressed:

  "Press 'f' again to re-enable the events"

And then, when 'f' is pressed again:

  "Press 'f' to disable the events or 'h'

Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-uhiswg9a9rxm5gxg7ptjskjn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 18:13:59 -03:00
Arnaldo Carvalho de Melo
5f00b0f45b perf hists browser: Honour the help line provided by builtin-{top,report}.c
The hists_browser was replacing whatever helpline provided by 'top' or
'report' with a static "Press '?' for help on key bindings", fix it.

Now the message passed by top appears at the bottom of the screen:

"For a higher level overview, try: perf top --sort comm,dso"

As well the message that will be added when the user presses 'f' to
disable the events, something along the lines of "press f again to
re-enable...".

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-dacaja70mbfz3a0yj1n180gx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 17:30:20 -03:00
Arnaldo Carvalho de Melo
516e536849 perf hists browser: Do not exit when 'f' is pressed in 'report' mode
The 'f' hotkey is only used when in 'top', dynamic mode, to
enable/disable events, currently not making sense in the 'report',
static mode, where we can't go from showing the histogram entries
created from a perf.data file to adding more events after recreating the
evlist created from the perf.data file, albeit possible, this is not
implemented right now.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-lholzf472pu98dkkijggwx2m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 16:59:43 -03:00
Arnaldo Carvalho de Melo
fbb7997e30 perf top: Replace CTRL+z with 'f' as hotkey for enable/disable events
I.e. 'freeze'/'unfreeze', this is because CTRL+z has a well known
action, i.e. suspend the app, perf needs to follow that convention, that
will be done on a separate patch, tho.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-oedcl6ovohara4koig14ayip@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 16:56:04 -03:00
Lukas Wunner
75e84ab906 perf tools: Fix build breakage if prefix= is specified
Invoking Makefile.perf with prefix= breaks the build since Makefile.perf
hands that variable down to Makefile.build where it overrides

    prefix       := $(subst ./,,$(OUTPUT)$(dir)/)

leading to errors like this:

    No rule to make target '/usrabspath.o', needed by '/usrlibperf-in.o'

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Fixes: c819e2cf2e
Link: http://lkml.kernel.org/r/5582c48a.84a22b0a.a918.5285SMTPIN_ADDED_MISSING@mx.google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 16:39:42 -03:00
Arnaldo Carvalho de Melo
276af92f10 perf annotate: Rename source_line_percent to source_line_samples
To better reflect the purpose of this struct, that is to hold
info about samples, its total number and is percentage.

Cc: Martin Liska <mliska@suse.cz>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-6bf8gwcl975uurl0ttpvtk69@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 16:39:21 -03:00
Martin Liška
0c4a5bcea4 perf annotate: Display total number of samples with --show-total-period
To compare two records on an instruction base, with --show-total-period
option provided, display total number of samples that belong to a line
in assembly language.

New hot key 't' is introduced for 'perf annotate' TUI.

Signed-off-by: Martin Liska <mliska@suse.cz>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5583E26D.1040407@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 16:39:18 -03:00
Adrian Hunter
a5499b3719 perf tools: Ensure thread-stack is flushed
The thread-stack represents a thread's current stack.  When a thread
exits there can still be many functions on the stack e.g. exit() can be
called many levels deep, so all the callers will never return.  To get
that information output, the thread-stack must be flushed.

Previously it was assumed the thread-stack would be flushed when the
struct thread was deleted.  With thread ref-counting it is no longer
clear when that will be, if ever. So instead explicitly flush all the
thread-stacks at the end of a session.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1432906425-9911-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-19 16:03:33 -03:00
Linus Torvalds
bb16140a2c Very late clk regression fixes for the ARM-based AT91 platform. These went
unnoticed by me until recently, hence the late pull request.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVhCncAAoJEKI6nJvDJaTU+YUQAIPostZWwDMGWLcOWibg1168
 aIhdrpvgS6eFshACYGQgLeyPsBd+UTRWvI0T48YPn3GWJxq/wfQU/41GKjHySlla
 F95AtcJkG0HbqVetTZNtkVk003HMemoq20NC62gijiiK4pC4AEqyjGnziyhn02NT
 Poz7wljr6G/pk26mKTTsx0e8v+7S9trSwwRVNopofjqZ2VGgYk/7Vp4rxM9PebuG
 QNq5Ffy3E9dl9FisOLDl6KnVXBGOslRSx2Dt2liVLicXYodoFUIPq42LbTEVbsS4
 C7Onnm3IOEOiT/nrYiN9xsZHd7jF+xJubvRxN1n3+Lb6FVcyMZGoZPKlwBr9WlLp
 SEzU6V7fwN2t5JpzNYHzVXVYjzjTntAp1jQ0Q8945XyvMF4hHAvfDYX1BErfdVFB
 YhX4yvC3y24GDw308xlwSxvwCIuItA5A3DE3AJj2WlKmvAg3FvlSW3odPATWFB4n
 rg71u9iS1asBE1MYR2zg7HoAzQAwZuNhctAk+DrFkafQIoWxzLm+RWgsPDF69yEn
 BtVjMAR0Cs+7AoI4cweN/W5Ik/p171KIYDsTawc7y9hEiAM2dvzWOfq/CS0OEGFA
 SAGPeVPRVyfcvv0qVr2REkWBiW5JtfpSeYNMldOTEJLCd8QhSPkurZJ6rEIYIDYM
 atDouBGRYTrzhLGnmALi
 =0Lhn
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Michael Turquette:
 "Very late clk regression fixes for the ARM-based AT91 platform.

  These went unnoticed by me until recently, hence the late pull
  request"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: at91: fix h32mx prototype inclusion in pmc header
  clk: at91: trivial: typo in peripheral clock description
  clk: at91: fix PERIPHERAL_MAX_SHIFT definition
  clk: at91: pll: fix input range validity check
2015-06-19 07:36:50 -10:00
Linus Torvalds
9a10758c44 sound fixes for 4.1-final
Nothing looks scary, just a few usual HD-audio regression fixes
 and fixup, in addition to a minor Kconfig dependency fix for
 the old MIPS drivers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJVhAJbAAoJEGwxgFQ9KSmk46UP/0K9YGpUANqAoNdINPmLm6ug
 VECymygbOVLBR7lDooBKxSPfDYH3jXwOcVYm9cBF2tgIpuaS3qgq6RYr+4pKGMQM
 vW2VN0742Ag78X6+YJ70Tw7IbP0pHTvpoNvzYvOpfxhx2ebcF3Zw8Z65BLQtMc16
 n+bYqgc8WeDc9RhnqfziVORD2CwXkATYiGhl1yHVrSAs9V75UKTFwCbV7fVoWcHU
 DKbrkH+2FzWpdWraL01HAQ/z5bGECtww3khFvmPPFnxLcUF6C2bzGTc6OCcduHpX
 NcwvL+NbP++tEAbw9sQiuuWhu2oRvFLhPrmmlN2ngHLVtCyPb5TEL+si6qLvtuRx
 qmlP0Uco2bd5Ypb8SF/mJaoWRBD+AW+mhfF5n81+XrNrQRGZcV6LGTdqBKKag9yA
 p7VX8/CpK4DLn5GggPAMMcO2SDBlwI66ivozGEKEFLYODoFZcDDZcH4dhafW7RCA
 sZPkr8hNEghJr5V28orKFm1ogy6bRMsnUWxMuekJaR6Ux6mTjDZqM2LJRPsZaIu4
 ApCcHi8KVWBV3Io5iU518/+ylobe5heg5lOl8Y1UYGFnfc0QePezzHvryKa7XaB/
 xsCWe+qXaG4s1jZzmrqbryNXIvzfjaZ3SUGJfzFdTrbn5J+JcVKrUEtFZR4L3pAg
 t+d8KJaKLCqajWyuDhlj
 =KSPF
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Nothing looks scary, just a few usual HD-audio regression fixes and
  fixup, in addition to a minor Kconfig dependency fix for the old MIPS
  drivers"

* tag 'sound-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix unused label skip_i915
  ALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)
  ALSA: mips: let SND_SGI_O2 select SND_PCM
  ALSA: hda - Fix audio crackles on Dell Latitude E7x40
  ALSA: hda - adding a DAC/pin preference map for a HP Envy TS machine
2015-06-19 07:34:14 -10:00
Michael Turquette
909aa10e6d Merge branch 'ccf/atmel-fixes-for-4.1' of https://github.com/bbrezillon/linux-at91 into clk-fixes 2015-06-19 07:37:14 -07:00
Boris BREZILLON
2df6bb5d8b crypto: marvell/cesa - add DT bindings documentation
Add DT bindings documentation for the new marvell-cesa driver.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:06 +08:00
Arnaud Ebalard
7240425579 crypto: marvell/cesa - add support for Kirkwood and Dove SoCs
Add the Kirkwood and Dove SoC descriptions, and control the allhwsupport
module parameter to avoid probing the CESA IP when the old CESA driver is
enabled (unless it is explicitly requested to do so).

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:05 +08:00
Boris BREZILLON
0bf6948995 crypto: marvell/cesa - add support for Orion SoCs
Add the Orion SoC description, and select this implementation by default
to support non-DT probing: Orion is the only platform where non-DT boards
are declaring the CESA block.

Control the allhwsupport module parameter to avoid probing the CESA IP when
the old CESA driver is enabled (unless it is explicitly requested to do
so).

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:05 +08:00
Boris BREZILLON
64c55d499b crypto: marvell/cesa - add allhwsupport module parameter
The old and new marvell CESA drivers both support Orion and Kirkwood SoCs.
Add a module parameter to choose whether these SoCs should be attached to
the new or the old driver.

The default policy is to keep attaching those IPs to the old driver if it
is enabled, until we decide the new CESA driver is stable/secure enough.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:05 +08:00
Boris BREZILLON
898c9d5ea2 crypto: marvell/cesa - add support for all armada SoCs
Add CESA IP description for all the missing armada SoCs (XP, 375 and 38x).

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:05 +08:00
Arnaud Ebalard
f85a762e49 crypto: marvell/cesa - add SHA256 support
Add support for SHA256 operations.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:04 +08:00
Arnaud Ebalard
7aeef693d1 crypto: marvell/cesa - add MD5 support
Add support for MD5 operations.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:04 +08:00
Arnaud Ebalard
4ada483978 crypto: marvell/cesa - add Triple-DES support
Add support for Triple-DES operations.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:04 +08:00
Boris BREZILLON
7b3aaaa095 crypto: marvell/cesa - add DES support
Add support for DES operations.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:04 +08:00
Boris BREZILLON
db509a4533 crypto: marvell/cesa - add TDMA support
The CESA IP supports CPU offload through a dedicated DMA engine (TDMA)
which can control the crypto block.
When you use this mode, all the required data (operation metadata and
payload data) are transferred using DMA, and the results are retrieved
through DMA when possible (hash results are not retrieved through DMA yet),
thus reducing the involvement of the CPU and providing better performances
in most cases (for small requests, the cost of DMA preparation might
exceed the performance gain).

Note that some CESA IPs do not embed this dedicated DMA, hence the
activation of this feature on a per platform basis.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:03 +08:00
Boris BREZILLON
f63601fd61 crypto: marvell/cesa - add a new driver for Marvell's CESA
The existing mv_cesa driver supports some features of the CESA IP but is
quite limited, and reworking it to support new features (like involving the
TDMA engine to offload the CPU) is almost impossible.
This driver has been rewritten from scratch to take those new features into
account.

This commit introduce the base infrastructure allowing us to add support
for DMA optimization.
It also includes support for one hash (SHA1) and one cipher (AES)
algorithm, and enable those features on the Armada 370 SoC.

Other algorithms and platforms will be added later on.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:03 +08:00
Boris BREZILLON
1fa2e9ae1d crypto: mv_cesa - explicitly define kirkwood and dove compatible strings
We are about to add a new driver to support new features like using the
TDMA engine to offload the CPU.
Orion, Dove and Kirkwood platforms are already using the mv_cesa driver,
but Orion SoCs do not embed the TDMA engine, which means we will have to
differentiate them if we want to get TDMA support on Dove and Kirkwood.
In the other hand, the migration from the old driver to the new one is not
something all people are willing to do without first auditing the new
driver.
Hence we have to support the new compatible in the mv_cesa driver so that
new platforms with updated DTs can still attach their crypto engine device
to this driver.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:02 +08:00
Boris BREZILLON
51b44fc811 crypto: mv_cesa - use gen_pool to reserve the SRAM memory region
The mv_cesa driver currently expects the SRAM memory region to be passed
as a platform device resource.

This approach implies two drawbacks:
- the DT representation is wrong
- the only one that can access the SRAM is the crypto engine

The last point is particularly annoying in some cases: for example on
armada 370, a small region of the crypto SRAM is used to implement the
cpuidle, which means you would not be able to enable both cpuidle and the
CESA driver.

To address that problem, we explicitly define the SRAM device in the DT
and then reference the sram node from the crypto engine node.

Also note that the old way of retrieving the SRAM memory region is still
supported, or in other words, backward compatibility is preserved.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:18:02 +08:00
Boris BREZILLON
1c07548685 crypto: mv_cesa - document the clocks property
On Dove platforms, the crypto engine requires a clock. Document this
clocks property in the mv_cesa bindings doc.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-19 22:17:27 +08:00
Herbert Xu
c0b59fafe3 Merge branch 'mvebu/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Merge the mvebu/drivers branch of the arm-soc tree which contains
just a single patch bfa1ce5f38 ("bus:
mvebu-mbus: add mv_mbus_dram_info_nooverlap()") that happens to be
a prerequisite of the new marvell/cesa crypto driver.
2015-06-19 22:07:07 +08:00
Borislav Petkov
04c17341b4 x86/boot: Fix overflow warning with 32-bit binutils
When building the kernel with 32-bit binutils built with support
only for the i386 target, we get the following warning:

  arch/x86/kernel/head_32.S:66: Warning: shift count out of range (32 is not between 0 and 31)

The problem is that in that case, binutils' internal type
representation is 32-bit wide and the shift range overflows.

In order to fix this, manipulate the shift expression which
creates the 4GiB constant to not overflow the shift count.

Suggested-by: Michael Matz <matz@suse.de>
Reported-and-tested-by: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 16:03:26 +02:00
Nicolas Ferre
28df9c2fb6 clk: at91: fix h32mx prototype inclusion in pmc header
Trivial fix that prevents to compile this pmc clock driver if h32mx clock is
present but smd clock isn't.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Fixes: bcc5fd49a0 ("clk: at91: add a driver for the h32mx clock")
Cc: <stable@vger.kernel.org> # 3.18+
2015-06-19 15:48:34 +02:00
Nicolas Ferre
c49bb94c84 clk: at91: trivial: typo in peripheral clock description
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2015-06-19 15:47:33 +02:00
Thomas Gleixner
683be13a28 timer: Minimize nohz off overhead
If nohz is disabled on the kernel command line the [hr]timer code
still calls wake_up_nohz_cpu() and tick_nohz_full_cpu(), a pretty
pointless exercise. Cache nohz_active in [hr]timer per cpu bases and
avoid the overhead.

Before:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.73%  hog       [.] main
  15.36%  [kernel]  [k] _raw_spin_lock_irqsave
   9.77%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.61%  [kernel]  [k] lock_timer_base.isra.38
   6.42%  [kernel]  [k] mod_timer
   3.90%  [kernel]  [k] detach_if_pending
   3.76%  [kernel]  [k] del_timer
   2.41%  [kernel]  [k] internal_add_timer
   1.39%  [kernel]  [k] __internal_add_timer
   0.76%  [kernel]  [k] timerfn

We probably should have a cached value for nohz full in the per cpu
bases as well to avoid the cpumask check. The base cache line is hot
already, the cpumask not necessarily.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224512.207378134@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:28 +02:00
Thomas Gleixner
bc7a34b8b9 timer: Reduce timer migration overhead if disabled
Eric reported that the timer_migration sysctl is not really nice
performance wise as it needs to check at every timer insertion whether
the feature is enabled or not. Further the check does not live in the
timer code, so we have an extra function call which checks an extra
cache line to figure out that it is disabled.

We can do better and store that information in the per cpu (hr)timer
bases. I pondered to use a static key, but that's a nightmare to
update from the nohz code and the timer base cache line is hot anyway
when we select a timer base.

The old logic enabled the timer migration unconditionally if
CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
line.

With this modification, we start off with migration disabled. The user
visible sysctl is still set to enabled. If the kernel switches to NOHZ
migration is enabled, if the user did not disable it via the sysctl
prior to the switch. If nohz=off is on the kernel command line,
migration stays disabled no matter what.

Before:
  47.76%  hog       [.] main
  14.84%  [kernel]  [k] _raw_spin_lock_irqsave
   9.55%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.71%  [kernel]  [k] mod_timer
   6.24%  [kernel]  [k] lock_timer_base.isra.38
   3.76%  [kernel]  [k] detach_if_pending
   3.71%  [kernel]  [k] del_timer
   2.50%  [kernel]  [k] internal_add_timer
   1.51%  [kernel]  [k] get_nohz_timer_target
   1.28%  [kernel]  [k] __internal_add_timer
   0.78%  [kernel]  [k] timerfn
   0.48%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu


Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:28 +02:00
Thomas Gleixner
c74441a17e timer: Stats: Simplify the flags handling
Simplify the handling of the flag storage for the timer statistics. No
intermediate storage anymore. Just hand over the flags field.

I left the printout of 'deferrable' for now because changing this
would be an ABI update and I have no idea how strong people feel about
that. OTOH, I wonder whether we should kill the whole timer stats
stuff because all of that information can be retrieved via ftrace/perf
as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224512.046626248@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:27 +02:00
Thomas Gleixner
0eeda71bc3 timer: Replace timer base by a cpu index
Instead of storing a pointer to the per cpu tvec_base we can simply
cache a CPU index in the timer_list and use that to get hold of the
correct per cpu tvec_base. This is only used in lock_timer_base() and
the slightly larger code is peanuts versus the spinlock operation and
the d-cache foot print of the timer wheel.

Aside of that this allows to get rid of following nuisances:

 - boot_tvec_base

   That statically allocated 4k bss data is just kept around so the
   timer has a home when it gets statically initialized. It serves no
   other purpose.

   With the CPU index we assign the timer to CPU0 at static
   initialization time and therefor can avoid the whole boot_tvec_base
   dance.  That also simplifies the init code, which just can use the
   per cpu base.

   Before:
     text	   data	    bss	    dec	    hex	filename
    17491	   9201	   4160	  30852	   7884	../build/kernel/time/timer.o
   After:
     text	   data	    bss	    dec	    hex	filename
    17440	   9193	      0	  26633	   6809	../build/kernel/time/timer.o

 - Overloading the base pointer with various flags

   The CPU index has enough space to hold the flags (deferrable,
   irqsafe) so we can get rid of the extra masking and bit fiddling
   with the base pointer.

As a benefit we reduce the size of struct timer_list on 64 bit
machines. 4 - 8 bytes, a size reduction up to 15% per struct timer_list,
which is a real win as we have tons of them embedded in other structs.

This changes also the newly added deferrable printout of the timer
start trace point to capture and print all timer->flags, which allows
us to decode the target cpu of the timer as well.

We might have used bitfields for this, but that would change the
static initializers and the init function for no value to accomodate
big endian bitfields.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Badhri Jagan Sridharan <Badhri@google.com>
Link: http://lkml.kernel.org/r/20150526224511.950084301@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:27 +02:00
Thomas Gleixner
1dabbcec2c timer: Use hlist for the timer wheel hash buckets
This reduces the size of struct tvec_base by 50% and results in
slightly smaller code as well.

Before:
   struct tvec_base: size: 8256, cachelines: 129

   text	   data	    bss	    dec	    hex	filename
  17698	  13297	   8256	  39251	   9953	../build/kernel/time/timer.o

After:
  struct tvec_base: 4160, cachelines: 65

   text	   data	    bss	    dec	    hex	filename
  17491	   9201	   4160	  30852	   7884	../build/kernel/time/timer.o

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224511.854731214@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:27 +02:00
Thomas Gleixner
1bd04bf6f6 timer: Remove FIFO "guarantee"
The FIFO guarantee is only there if two timers are queued into the
same bucket at the same jiffie on the same cpu:

 - The slack value depends on the delta between expiry and enqueue
   time, so the resulting expiry time can be different for timers
   which are queued in different jiffies.

 - Timers which are queued into the secondary array end up after a
   later queued timer which was queued into the primary array due to
   cascading.

 - Timers can end up on different cpus due to the NOHZ target moving
   around. Obviously there is no guarantee of expiry ordering between
   cpus.

So anything which relies on FIFO behaviour of the timer wheel is
broken already.

This is a preparatory patch for converting the timer wheel to hlist
which reduces the memory foot print of the wheel by 50%.

It's a seperate patch so any (unlikely to happen) regression caused by
this can be identified clearly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Cc: George Spelvin <linux@horizon.com>
Link: http://lkml.kernel.org/r/20150526224511.757520403@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:27 +02:00
Thomas Gleixner
3bb475a344 timers: Sanitize catchup_timer_jiffies() usage
catchup_timer_jiffies() has been applied blindly to several functions
without looking for possible better ways to do it.

1) internal_add_timer()

   Move the update to base->all_timers before we actually insert the
   timer into the wheel.

2) detach_if_pending()

   Again the update to base->all_timers allows us to explicitely do
   the timer_jiffies update in place, if this was the last timer which
   got removed.

3) __run_timers()

   We only check on entry, which is silly, because base->timer_jiffies
   can be behind - especially on NOHZ kernels - and if there is a
   single deferrable timer somewhere between base->timer_jiffies and
   jiffies we expire it and then loop until base->timer_jiffies ==
   jiffies.

   Move it into the loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224511.662994644@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:27 +02:00
Boris Brezillon
86e4404af2 clk: at91: fix PERIPHERAL_MAX_SHIFT definition
Fix the PERIPHERAL_MAX_SHIFT definition (3 instead of 4) and adapt the
round_rate and set_rate logic accordingly.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: "Wu, Songjun" <Songjun.Wu@atmel.com>
2015-06-19 14:43:40 +02:00
Boris Brezillon
6c7b03e1ae clk: at91: pll: fix input range validity check
The PLL impose a certain input range to work correctly, but it appears that
this input range does not apply on the input clock (or parent clock) but
on the input clock after it has passed the PLL divisor.
Fix the implementation accordingly.

Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Jonas Andersson <jonas@microbit.se>
2015-06-19 14:43:39 +02:00
Zhiqiang Zhang
6fab541019 sched/deadline: Remove needless parameter in dl_runtime_exceeded()
Sine commit 269ad8015a ("sched/deadline: Avoid double-accounting in
case of missed deadlines), parameter 'rq' is no longer used, so
remove it.

Signed-off-by: Zhiqiang Zhang <zhangzhiqiang.zhang@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <juri.lelli@gmail.com>
Cc: <luca.abeni@unitn.it>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1434338120-43773-1-git-send-email-zhangzhiqiang.zhang@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:48 +02:00
Wanpeng Li
6713c3aa7f sched: Remove superfluous resetting of the p->dl_throttled flag
Resetting the p->dl_throttled flag in rt_mutex_setprio() (for a task that is going
to be boosted) is superfluous, as the natural place to do so is in
replenish_dl_entity().

If the task was on the runqueue and it is boosted by a DL task, it will be enqueued
back with ENQUEUE_REPLENISH flag set, which can guarantee that dl_throttled is
reset in replenish_dl_entity().

This patch drops the resetting of throttled status in function rt_mutex_setprio().

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431496867-4194-6-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:47 +02:00
Wanpeng Li
178a4d23e4 sched/deadline: Drop duplicate init_sched_dl_class() declaration
There are two init_sched_dl_class() declarations, this patch drops
the duplicate.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431496867-4194-5-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:47 +02:00
Wanpeng Li
9d51426242 sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target
This patch adds a check that prevents futile attempts to move DL tasks
to a CPU with active tasks of equal or earlier deadline. The same
behavior as commit 80e3d87b2c ("sched/rt: Reduce rq lock contention
by eliminating locking of non-feasible target") for rt class.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431496867-4194-3-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:46 +02:00
Wanpeng Li
a6c0e746fb sched/deadline: Make init_sched_dl_class() __init
It's a bootstrap function, make init_sched_dl_class() __init.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431496867-4194-2-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:46 +02:00
Wanpeng Li
8b5e770ed7 sched/deadline: Optimize pull_dl_task()
pull_dl_task() uses pick_next_earliest_dl_task() to select a migration
candidate; this is sub-optimal since the next earliest task -- as per
the regular runqueue -- might not be migratable at all. This could
result in iterating the entire runqueue looking for a task.

Instead iterate the pushable queue -- this queue only contains tasks
that have at least 2 cpus set in their cpus_allowed mask.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
[ Improved the changelog. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431496867-4194-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:45 +02:00
Peter Zijlstra
1cde2930e1 sched/preempt: Add static_key() to preempt_notifiers
Avoid touching the curr->preempt_notifier cacheline when not needed.

Provides a small improvement on pipe-bench:

  taskset 01 perf stat --repeat 10 -- perf bench sched pipe

before:

 Performance counter stats for 'perf bench sched pipe' (10 runs):

      12385.016204      task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.34% )
         2,000,023      context-switches          #    0.161 M/sec                    ( +-  0.00% )
                 0      cpu-migrations            #    0.000 K/sec
               175      page-faults               #    0.014 K/sec                    ( +-  0.26% )
    41,376,162,250      cycles                    #    3.341 GHz                      ( +-  0.11% )
    17,389,139,321      stalled-cycles-frontend   #   42.03% frontend cycles idle     ( +-  0.25% )
   <not supported>      stalled-cycles-backend
    68,788,588,003      instructions              #    1.66  insns per cycle
                                                  #    0.25  stalled cycles per insn  ( +-  0.02% )
    13,449,387,620      branches                  # 1085.940 M/sec                    ( +-  0.02% )
        20,880,690      branch-misses             #    0.16% of all branches          ( +-  0.98% )

      12.372646094 seconds time elapsed                                          ( +-  0.34% )

after:

 Performance counter stats for 'perf bench sched pipe' (10 runs):

      12180.936528      task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.33% )
         2,000,077      context-switches          #    0.164 M/sec                    ( +-  0.00% )
                 0      cpu-migrations            #    0.000 K/sec
               174      page-faults               #    0.014 K/sec                    ( +-  0.27% )
    40,691,545,577      cycles                    #    3.341 GHz                      ( +-  0.06% )
    16,446,333,371      stalled-cycles-frontend   #   40.42% frontend cycles idle     ( +-  0.18% )
   <not supported>      stalled-cycles-backend
    68,570,100,387      instructions              #    1.69  insns per cycle
                                                  #    0.24  stalled cycles per insn  ( +-  0.01% )
    13,389,740,014      branches                  # 1099.237 M/sec                    ( +-  0.01% )
        20,175,440      branch-misses             #    0.15% of all branches          ( +-  0.52% )

      12.169253010 seconds time elapsed                                          ( +-  0.33% )

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:45 +02:00
Mathieu Desnoyers
d84525a845 sched/preempt: Fix preempt notifiers documentation about hlist_del() within unsafe iteration
preempt_notifier_unregister() documents:

  "This is safe to call from within a preemption notifier."

However, both fire_sched_in_preempt_notifiers() and
fire_sched_out_preempt_notifiers() are using hlist_for_each_entry(),
which is not safe against entry removal during iteration.

Inspection of the KVM code does not reveal any use of
preempt_notifier_unregister() within the preempt notifiers.

Therefore, fix the comment.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431881590-1456-1-git-send-email-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:06:44 +02:00
Peter Zijlstra
b17718d02f sched/stop_machine: Fix deadlock between multiple stop_two_cpus()
Jiri reported a machine stuck in multi_cpu_stop() with
migrate_swap_stop() as function and with the following src,dst cpu
pairs: {11,  4} {13, 11} { 4, 13}

                        4       11      13

cpuM: queue(4 ,13)
                        *Ma
cpuN: queue(13,11)
                                *N      Na
                        *M              Mb
cpuO: queue(11, 4)
                        *O      Oa
                                *Nb
                        *Ob

Where *X denotes the cpu running the queueing of cpu-X and X[ab] denotes
the first/second queued work.

You'll observe the top of the workqueue for each cpu: 4,11,13 to be work
from cpus: M, O, N resp. IOW. deadlock.

Do away with the queueing trickery and introduce lg_double_lock() to
lock both CPUs and fully serialize the stop_two_cpus() callers instead
of the partial (and buggy) serialization we have now.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150605153023.GH19282@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:03:12 +02:00
Srikar Dronamraju
82a0d27626 sched/debug: Add sum_sleep_runtime to /proc/<pid>/sched
When CONFIG_SCHEDSTATS is enabled, /proc/<pid>/sched prints almost all
sched statistics except sum_sleep_runtime. Since sum_sleep_runtime is
a good info to collect, add this it to /proc/<pid>/sched.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433751041-11724-4-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:03:11 +02:00
Srikar Dronamraju
c5f3ab1c3b sched/debug: Replace vruntime with wait_sum in /proc/sched_debug
Within runnable tasks in /proc/sched_debug, vruntime is printed twice,
once as tree-key and again as exec-runtime.

Since exec-runtime isnt populated in !CONFIG_SCHEDSTATS, use this field
to print wait_sum.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433751041-11724-3-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 10:03:11 +02:00