Commit graph

8 commits

Author SHA1 Message Date
Chris Redpath
fce0ecf04a schedstats/eas: guard properly to avoid breaking non-smp schedstats users
Add appropriate #ifdef guards to ensure the smp-only easstats structs
are not used when smp is not enabled. Arnd got a report from buildbot,
analysed it, and pointed out exactly what the issue was.

Reported-by: "Arnd Bergmann" <arnd@arndb.de>
Suggested-by: "Arnd Bergmann" <arnd@arndb.de>
Fixes: 4b85765a3d ("sched/fair: Add eas (& cas)
 specific rq, sd and task stats")
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Change-Id: I60554dea20137f6774db3f59b4afd40a06554cfc
2017-06-03 15:03:03 +01:00
Dietmar Eggemann
4b85765a3d sched/fair: Add eas (& cas) specific rq, sd and task stats
The statistic counter are placed in the eas (& cas) wakeup path. Each
of them has one representation for the runqueue (rq), the sched_domain
(sd) and the task.
A task counter is always incremented. A rq counter is always
incremented for the rq the scheduler is currently running on. A sd
counter is only incremented if a relation to a sd exists.

The counters are exposed:

(1) In /proc/schedstat for rq's and sd's:

$ cat /proc/schedstat
...
cpu0 71422 0 2321254 ...
eas  44144 0 0 19446 0 24698 568435 51621 156932 133 222011 17459 120279 516814 83 0 156962 359235 176439 139981
  <- runqueue for cpu0
...
domain0 3 42430 42331 ...
eas 0 0 0 14200 0 0 0 0 0 0 0 0 0 0 0 0 0 0 66355 0  <- MC sched domain for cpu0
...

The per-cpu eas vector has the following elements:

sis_attempts  sis_idle   sis_cache_affine sis_suff_cap    sis_idle_cpu    sis_count               ||
secb_attempts secb_sync  secb_idle_bt     secb_insuff_cap secb_no_nrg_sav secb_nrg_sav secb_count ||
fbt_attempts  fbt_no_cpu fbt_no_sd        fbt_pref_idle   fbt_count                               ||
cas_attempts  cas_count

The following relations exist between these counters (from cpu0 eas
vector above):

sis_attempts = sis_idle + sis_cache_affine + sis_suff_cap + sis_idle_cpu + sis_count

44144        = 0        + 0                + 19446        + 0            + 24698

secb_attempts = secb_sync + secb_idle_bt + secb_insuff_cap + secb_no_nrg_sav + secb_nrg_sav + secb_count

568435        = 51621     + 156932       + 133             + 222011          + 17459        + 120279

fbt_attempts = fbt_no_cpu + fbt_no_sd + fbt_pref_idle + fbt_count + (return -1)

516814       = 83         + 0         + 156962        + 359235    + (534)

cas_attempts = cas_count + (return -1 or smp_processor_id())

176439       = 139981    + (36458)

(2) In /proc/$PROCESS_PID/task/$TASK_PID/sched for a task.

example: main thread of system_server

$ cat /proc/1083/task/1083/sched

...
se.statistics.nr_wakeups_sis_attempts        :                  945
se.statistics.nr_wakeups_sis_idle            :                    0
se.statistics.nr_wakeups_sis_cache_affine    :                    0
se.statistics.nr_wakeups_sis_suff_cap        :                  219
se.statistics.nr_wakeups_sis_idle_cpu        :                    0
se.statistics.nr_wakeups_sis_count           :                  726
se.statistics.nr_wakeups_secb_attempts       :                10376
se.statistics.nr_wakeups_secb_sync           :                 1462
se.statistics.nr_wakeups_secb_idle_bt        :                 6984
se.statistics.nr_wakeups_secb_insuff_cap     :                    3
se.statistics.nr_wakeups_secb_no_nrg_sav     :                  927
se.statistics.nr_wakeups_secb_nrg_sav        :                  206
se.statistics.nr_wakeups_secb_count          :                  794
se.statistics.nr_wakeups_fbt_attempts        :                 8914
se.statistics.nr_wakeups_fbt_no_cpu          :                    0
se.statistics.nr_wakeups_fbt_no_sd           :                    0
se.statistics.nr_wakeups_fbt_pref_idle       :                 6987
se.statistics.nr_wakeups_fbt_count           :                 1554
se.statistics.nr_wakeups_cas_attempts        :                 3107
se.statistics.nr_wakeups_cas_count           :                 1195
...

The same relation between the counters as in the per-cpu case apply.

Change-Id: Ie7d01267c78a3f41f60a3ef52917d5a5d463f195
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Tejun Heo
333470ee46 sched: use %*pb[l] to print bitmaps including cpumasks and nodemasks
printk and friends can now format bitmaps using '%*pb[l]'.  cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:37 -08:00
Paul Gortmaker
c96d6660dc kernel: audit/fix non-modular users of module_init in core code
Code that is obj-y (always built-in) or dependent on a bool Kconfig
(built-in or absent) can never be modular.  So using module_init as an
alias for __initcall can be somewhat misleading.

Fix these up now, so that we can relocate module_init from init.h into
module.h in the future.  If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.

The audit targets the following module_init users for change:
 kernel/user.c                  obj-y
 kernel/kexec.c                 bool KEXEC (one instance per arch)
 kernel/profile.c               bool PROFILING
 kernel/hung_task.c             bool DETECT_HUNG_TASK
 kernel/sched/stats.c           bool SCHEDSTATS
 kernel/user_namespace.c        bool USER_NS

Note that direct use of __initcall is discouraged, vs.  one of the
priority categorized subgroups.  As __initcall gets mapped onto
device_initcall, our use of subsys_initcall (which makes sense for these
files) will thus change this registration from level 6-device to level
4-subsys (i.e.  slightly earlier).  However no observable impact of that
difference has been observed during testing.

Also, two instances of missing ";" at EOL are fixed in kexec.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-03 16:21:07 -07:00
Al Viro
8e0bcc7222 fix a leak in /proc/schedstats
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-29 15:41:45 -04:00
Nathan Zimmer
cb152ff267 sched: Fix /proc/sched_stat failure on very very large systems
On systems with 4096 cores doing a cat /proc/sched_stat fails,
because we are trying to push all the data into a single kmalloc
buffer.

The issue is on these very large machines all the data will not
fit in 4mb.

A better solution is to not use the single_open() mechanism but
to provide our own seq_operations.

The output should be identical to previous version and thus not
need the version number.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
[ Fix memleak]
[ Fix spello in comment]
[ Fix warnings]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-22 10:27:24 +01:00
Rakib Mullick
30fd049afc sched: Remove sched_switch
Currently we don't utilize the sched_switch field anymore.

But, simply removing sched_switch field from the middle of the
sched_stat output will break tools.

So, to stay compatible we hardcode it to zero and remove the
field from the scheduler data structures.

Update the schedstat documentation accordingly.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1327422836.27181.5.camel@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-27 13:28:53 +01:00
Peter Zijlstra
391e43da79 sched: Move all scheduler bits into kernel/sched/
There's too many sched*.[ch] files in kernel/, give them their own
directory.

(No code changed, other than Makefile glue added.)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-17 12:20:22 +01:00
Renamed from kernel/sched_stats.c (Browse further)