Commit graph

21946 commits

Author SHA1 Message Date
Steve Muckle
4edc997e12 sched: take rq lock prior to saving idle task's mark_start
When the idle task is being re-initialized during hotplug its
mark_start value must be retained. The runqueue lock must be
held when reading this value though to serialize this with
other CPUs that could update the idle task's window-based
statistics.

CRs-Fixed: 743991
Change-Id: I1bca092d9ebc32a808cea2b9fe890cd24dc868cd
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
2016-03-23 20:00:55 -07:00
Srivatsa Vaddagiri
98f89f00dc sched: update governor notification logic
Make criteria for notifying governor to be per-cpu. Governor is
notified of any large change in cpu's busy time statistics
(rq->prev_runnable_sum) since the last reported value.

Change-Id: I727354d994d909b166d093b94d3dade7c7dddc0d
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:54 -07:00
Srivatsa Vaddagiri
6139e8a16f sched: window-stats: Retain idle thread's mark_start
init_idle() is called on a cpu's idle-thread once at bootup and
subsequently every time the cpu is hot-added. Since init_idle() calls
__sched_fork(), we end up blowing idle thread's ravg.mark_start value.
As a result we will fail to accurately maintain cpu's
curr/prev_runnable_sum counters. Below example illustrates such a
failure:

CS = curr_runnable_sum, PS = prev_runnable_sum

t0 -> New window starts for CPU2
<after some_task_activity> CS = X, PS = Y
t1 -> <cpu2 is hot-removed. idle_task starts running on cpu2>
      At this time, cpu2_idle_thread.ravg.mark_start = t1

t1 -> t0 + W. One window elapses. CPU2 still hot-removed. We
	defer swapping CS and PS until some future task event occurs

t2 -> CPU2 hot-added.  _cpu_up()->idle_thread_get()->init_idle()
	->__sched_fork() results in cpu2_idle_thread.ravg.mark_start = 0

t3 -> Some task wakes on cpu2. Since mark_start = 0, we don't swap CS
	and PS => which is a BUG!

Fix this by retaining idle task's original mark_start value during
init_idle() call.

Change-Id: I4ac9bfe3a58fb5da8a6c7bc378c79d9930d17942
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:53 -07:00
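The failure described in 6139e8a16f can be modelled in a few lines of C. This is only a minimal sketch under stated assumptions: `struct cpu_window` and `window_rollover()` are hypothetical names, not kernel code, and the "ignore the event when mark_start is 0" rule is a simplification of what the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU counters; names are illustrative only. */
struct cpu_window {
    uint64_t window_start; /* start of the current window */
    uint64_t curr_sum;     /* CS: busy time in the current window */
    uint64_t prev_sum;     /* PS: busy time in the previous window */
};

/*
 * Roll CS into PS if `now` lies past the end of the current window.
 * A mark_start of 0 is treated as "no history", so the event is
 * ignored; this models the case the commit message describes.
 * Returns 1 if a rollover happened, 0 otherwise.
 */
static int window_rollover(struct cpu_window *cw, uint64_t mark_start,
                           uint64_t now, uint64_t window)
{
    uint64_t elapsed;

    assert(window > 0);
    if (mark_start == 0 || now < cw->window_start + window)
        return 0;

    elapsed = (now - cw->window_start) / window;
    /* After exactly one window CS becomes PS; after more, PS is stale. */
    cw->prev_sum = (elapsed == 1) ? cw->curr_sum : 0;
    cw->curr_sum = 0;
    cw->window_start += elapsed * window;
    return 1;
}
```

With a window of 100 and an event at time 150, a zeroed mark_start leaves CS and PS untouched, while a retained mark_start lets CS move into PS as expected.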
Olav Haugan
604c41065b sched: Add checks for frequency change
We need to check for frequency change when a task is migrated due to
affinity change and during active balance.

Change-Id: I96676db04d34b5b91edd83431c236a1c28166985
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[rameezmustafa@codeaurora.org: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in core.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:00:52 -07:00
Srivatsa Vaddagiri
c12a2b5ab9 sched: Use absolute scale for notifying governor
Make the tunables used for deciding the need for notification to be on
absolute scale. The earlier scale (in percent terms relative to
cur_freq) does not work well with available range of frequencies. For
example, 100% tunable value would work well for lower range of
frequencies and not for higher range. Having the tunable to be on
absolute scale makes tuning more realistic.

Change-Id: I35a8c4e2f2e9da57f4ca4462072276d06ad386f1
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:51 -07:00
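The difference between the relative and absolute scales in c12a2b5ab9 can be illustrated with a small C sketch. The function names, the kHz units and the strict "greater than" comparison are assumptions made for illustration; the actual tunables and their semantics are not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute distance between the current and the wanted frequency. */
static uint32_t freq_gap(uint32_t cur_khz, uint32_t want_khz)
{
    return want_khz > cur_khz ? want_khz - cur_khz : cur_khz - want_khz;
}

/* Relative rule: notify when the gap exceeds pct percent of cur_khz. */
static int notify_relative(uint32_t cur_khz, uint32_t want_khz, uint32_t pct)
{
    assert(cur_khz > 0);
    return (uint64_t)freq_gap(cur_khz, want_khz) * 100 >
           (uint64_t)cur_khz * pct;
}

/* Absolute rule: notify when the gap exceeds a fixed number of kHz. */
static int notify_absolute(uint32_t cur_khz, uint32_t want_khz,
                           uint32_t delta_khz)
{
    assert(cur_khz > 0);
    return freq_gap(cur_khz, want_khz) > delta_khz;
}
```

With a 100% relative tunable, a CPU at 300 MHz notifies once demand passes 600 MHz, but a CPU at 2 GHz would need demand above 4 GHz, which it may never reach. A fixed delta behaves the same at both ends of the range.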
Srivatsa Vaddagiri
3a67b4ce87 sched: window-stats: Enhance cpu busy time accounting
rq->curr/prev_runnable_sum counters represent cpu demand from various
tasks that have run on a cpu. Any task that runs on a cpu will have a
representation in rq->curr_runnable_sum. Their partial_demand value
will be included in rq->curr_runnable_sum. Since partial_demand is
derived from historical load samples for a task, rq->curr_runnable_sum
could represent "inflated/un-realistic" cpu usage. As an example, let's
say that a task with partial_demand of 10ms runs for only 1ms on a cpu.
What is included in rq->curr_runnable_sum is 10ms (and not the actual
execution time of 1ms). This leads to cpu busy time being reported on
the upside causing frequency to stay higher than necessary.

This patch fixes cpu busy accounting scheme to strictly represent
actual usage. It also provides for conditional fixup of busy time upon
migration and upon heavy-task wakeup.

CRs-Fixed: 691443
Change-Id: Ic4092627668053934049af4dfef65d9b6b901e6b
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in init_task_load(),
 se.avg.decay_count has been deprecated.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:00:50 -07:00
Srivatsa Vaddagiri
977dc392f7 sched: window-stats: ftrace event improvements
Add two new ftrace events:

* trace_sched_freq_alert, to log notifications sent
  to governor for requesting change in frequency.
* trace_sched_get_busy, to log cpu busytime information returned by
  scheduler

Extend existing ftrace events as follows:

* sched_update_task_ravg() event to log irqtime parameter
* sched_migration_update_sum() to log threadid which is being migrated
  (and thus responsible for update of curr_runnable_sum and
  prev_runnable_sum counters)

Change-Id: Ia68ce0953a2d21d319a1db7f916c51ff6a91557c
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:49 -07:00
Srivatsa Vaddagiri
c9d0953c31 sched: improve logic for alerting governor
Currently we send notification to governor not taking note of cpus
that are synchronized with regard to their frequency. As a result,
scheduler could send pointless notifications (notification spam!).

Avoid this by considering synchronized cpus and alerting governor only
when the highest demand of any cpu within cluster far exceeds or falls
behind current frequency.

Change-Id: I74908b5a212404ca56b38eb94548f9b1fbcca33d
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:48 -07:00
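The cluster-wide check described in c9d0953c31 might look roughly like the following C sketch. The function names, the kHz units and the single symmetric margin are assumptions for illustration only; the real code may use separate thresholds for "far exceeds" and "falls behind".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Highest demand among the CPUs that share one clock. */
static uint32_t cluster_max_demand(const uint32_t *demand_khz, size_t ncpus)
{
    uint32_t max = 0;
    size_t i;

    assert(demand_khz != NULL || ncpus == 0);
    for (i = 0; i < ncpus; i++)
        if (demand_khz[i] > max)
            max = demand_khz[i];
    return max;
}

/*
 * Alert only when the cluster's highest demand differs from the
 * current frequency by more than margin_khz (hypothetical tunable).
 */
static int cluster_needs_alert(const uint32_t *demand_khz, size_t ncpus,
                               uint32_t cur_khz, uint32_t margin_khz)
{
    uint32_t max = cluster_max_demand(demand_khz, ncpus);
    uint32_t gap = max > cur_khz ? max - cur_khz : cur_khz - max;

    return gap > margin_khz;
}
```

A lightly loaded CPU no longer triggers a notification on its own while a sibling in the same cluster keeps the shared frequency high.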
Syed Rameez Mustafa
53eb9ad023 sched: Stop task migration to busy CPUs due to power active balance
Power active balance should only be invoked when the destination CPU
is calling load balance with either a CPU_IDLE or a CPU_NEWLY_IDLE
environment. We do not want to push tasks towards busy CPUs even if they
are a more power efficient place to run that task. This can cause
higher scheduling latencies due to the resulting load imbalance.

Change-Id: I8e0f242338887d189e2fc17acfb63586e7c40839
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:47 -07:00
Srivatsa Vaddagiri
d44a8bea4e sched: window-stats: Fix accounting bug in legacy mode
TASK_UPDATE event currently does not result in increment of
rq->curr_runnable_sum in legacy mode, which is wrong. As a result,
cpu busy time reported under legacy mode could be incorrect.

Change-Id: Ifa76c735a0ead23062c1a64faf97e7b801b66bf9
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:46 -07:00
Srivatsa Vaddagiri
b7609762c9 sched: window-stats: Note legacy mode in fork() and exit()
In legacy mode, mark_task_starting() should avoid adding (new) task's
(initial) demand to rq->curr_runnable_sum and rq->prev_runnable_sum.
Similarly exit() should avoid removing (exiting) task's demand from
rq->curr_runnable_sum and rq->prev_runnable_sum (as those counters
don't include task's demand and partial_demand values in legacy mode).

Change-Id: I26820b1ac5885a9d681d363ec53d6866a2ea2e6f
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:45 -07:00
Srivatsa Vaddagiri
dd4c950f7b sched: Fix reference to stale task_struct in try_to_wake_up()
try_to_wake_up() currently drops p->pi_lock and later checks for need
to notify cpufreq governor on task migrations or wakeups. However the
woken task could exit between the time p->pi_lock is released and the
time the test for notification is run. As a result, the test for
notification could refer to an exited task. task_notify_on_migrate(p)
could thus lead to invalid memory reference.

Fix this by running the test for notification with task's pi_lock
held.

Change-Id: I1c7a337473d2d8e79342a015a179174ce00702e1
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:44 -07:00
Syed Rameez Mustafa
034fb588ae sched: Remove hack to enable/disable HMP scheduling extensions
The current method of turning HMP scheduling extensions on or off
based on the number of CPUs is inappropriate as there may be SoCs with
4 or fewer cores that require the use of these extensions. Remove this
hack as HMP extensions will now be enabled/disabled via command line
options.

Change-Id: Id44b53c2c3b3c3b83e1911a834e2c824f3958135
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:43 -07:00
Srivatsa Vaddagiri
b7e40e50e9 sched: fix wrong load_scale_factor/capacity/nr_big/small_tasks
A couple of bugs exist with incorrect use of cpu_online_mask in
pre/post_big_small_task() functions, leading to potentially incorrect
computation of load_scale_factor/capacity/nr_big/small_tasks.

pre/post_big_small_task_count_change() use cpu_online_mask in an
unreliable manner. While local_irq_disable() in
pre_big_small_task_count_change() ensures a cpu won't go away in
cpu_online_mask, nothing prevents a cpu from coming online
concurrently. As a result, cpu_online_mask used in
pre_big_small_task_count_change() can be inconsistent with that used
in post_big_small_task_count_change() which can lead to an attempt to
unlock rq->lock which was not taken before.

Secondly, when either max_possible_freq or min_max_freq is changing,
it needs to trigger recomputation of load_scale_factor and capacity
for *all* cpus, even if some are offline. Otherwise, an offline cpu
could later come online with incorrect load_scale_factor/capacity.

While it should be sufficient to scan online cpus for
updating their nr_big/small_tasks in
post_big_small_task_count_change(), unfortunately it sounds pretty
hard to provide a stable cpu_online_mask when it's called from
cpufreq_notifier_policy(). cpufreq framework can trigger a
CPUFREQ_NOTIFY notification in multiple contexts, some in cpu-hotplug
paths, which makes it pretty hard to guess whether get_online_cpus()
can be taken without causing deadlocks or not. To work around the
insufficient information we have about the hotplug-safety context when
CPUFREQ_NOTIFY is issued, have post_big_small_task_count_change()
traverse all possible cpus in updating nr_big/small_task_count.

CRs-Fixed: 717134
Change-Id: Ife8f3f7cdfd77d5a21eee63627d7a3465930aed5
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:42 -07:00
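The locking hazard described in b7e40e50e9 can be reproduced with a toy model in C. Everything here is hypothetical: `SKETCH_NR_CPUS`, `lock_cpus()` and `unlock_cpus()` stand in for the pre/post helpers, and a plain array stands in for the per-CPU rq->lock.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4

/* 1 if the (simulated) runqueue lock of that CPU is held. */
static int rq_locked[SKETCH_NR_CPUS];

/* "pre" step: take the lock of every CPU set in mask. */
static void lock_cpus(unsigned int mask)
{
    int cpu;

    assert(mask < (1u << SKETCH_NR_CPUS));
    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            rq_locked[cpu] = 1;
}

/*
 * "post" step: release the lock of every CPU set in mask.
 * Returns the number of unlock attempts on locks never taken.
 */
static int unlock_cpus(unsigned int mask)
{
    int cpu, bogus = 0;

    assert(mask < (1u << SKETCH_NR_CPUS));
    for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        if (!(mask & (1u << cpu)))
            continue;
        if (!rq_locked[cpu])
            bogus++;
        rq_locked[cpu] = 0;
    }
    return bogus;
}
```

If a CPU comes online between the two steps, the masks differ (for example 0x3 and then 0x7) and the post step unlocks a lock it never took. Walking a fixed set such as all possible CPUs in both steps keeps the two sides consistent.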
Syed Rameez Mustafa
38daa13114 sched: add check for cpu idleness when using C-state information
Task enqueue on a CPU occurs prior to that CPU exiting an idle state.
For the time duration between enqueue and idle exit, the CPU C-state
information can no longer be relied on for further task placement
since already enqueued/waiting tasks are not taken into account. The
small task placement algorithm implicitly assumes a non zero C-state
implies an idle CPU. Since this assumption is incorrect for the
duration described above, make the cpu_idle() check explicit. This
problem can lead to task packing beyond the mostly_idle threshold.

Change-Id: Idb5be85705d6b15f187d011ea2196e1bfe31dbf2
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:41 -07:00
Syed Rameez Mustafa
daf32916eb sched: extend sched_task_load tracepoint to indicate small tasks
While debugging it's always useful to know whether a task is small or
not to determine the scheduling algorithm being used. Have the
sched_task_load tracepoint indicate this information rather than
having to do manual calculations for every task placement.

Change-Id: Ibf390095f05c7da80df1ebfe00f4c5af66c97d12
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:40 -07:00
Syed Rameez Mustafa
b12fe96d55 sched: Add C-state tracking to the sched_cpu_load trace event
C-state information is used by the scheduler for small task placement
decisions. Track this information in the sched_cpu_load trace event.
Also add the trace event in best_small_task_cpu(). This will help
better understand small task placement decisions.

Change-Id: Ife5f05bba59f85c968fab999bd13b9fb6b1c184e
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:39 -07:00
Syed Rameez Mustafa
759ea99236 sched: window-stats: add a new AVG policy
The current WINDOW_STATS_AVG policy is actually a misnomer since it
uses the maximum value of the runtime in the recent window and the
average of the past ravg_hist_size windows. Add a policy that only
uses the average and call it WINDOW_STATS_AVG policy. Rename all the
other policies to make them shorter and unambiguous.

Change-Id: I080a4ea072a84a88858ca9da59a4151dfbdbe62c
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:38 -07:00
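The two policies contrasted in 759ea99236 can be sketched in C as follows. The function names are hypothetical, and passing the recent window's runtime as a separate argument is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain average of the recorded window history. */
static uint32_t demand_avg(const uint32_t *hist, size_t n)
{
    uint64_t sum = 0;
    size_t i;

    assert(hist != NULL && n > 0);
    for (i = 0; i < n; i++)
        sum += hist[i];
    return (uint32_t)(sum / n);
}

/*
 * Behaviour the message attributes to the old "AVG" policy:
 * the larger of the recent window's runtime and the history average.
 */
static uint32_t demand_max_recent_avg(const uint32_t *hist, size_t n,
                                      uint32_t recent)
{
    uint32_t avg = demand_avg(hist, n);

    return recent > avg ? recent : avg;
}
```

The two functions only differ when the most recent window is busier than the average, which is exactly when the old policy reported more than a true average.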
Srivatsa Vaddagiri
3dcd52ded0 sched: Fix compile error
sched_get_busy(), sched_set_io_is_busy() and sched_set_window() need
to be defined only when CONFIG_SCHED_FREQ_INPUT is defined, otherwise
we get a compilation error related to dual definition of those routines.

Change-Id: Ifd5c9b6675b78d04c2f7ef0e24efeae70f7ce19b
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in include/linux/sched.h]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:00:37 -07:00
Syed Rameez Mustafa
a6f510aa0a sched: update ld_moved for active balance from the load balancer
ld_moved is currently left set to 0 when the load balancer calls upon
active balance. This behavior is incorrect as it prevents the termination
of load balance for parent sched domains. Currently the feature is used
quite frequently for power active balance and sched boost. This means that
while sched boost is in effect we could run into a scenario where a more
power efficient newly idle big CPU first triggers active migration from a
less power efficient busy big CPU. It then continues to load balance at the
cluster level causing active migration for a task running on a little CPU.
Consequently the more power efficient big CPU ends up with two tasks
whereas the less power efficient big CPU may become idle. Fix this problem by
updating ld_moved when active migration has been requested.

Change-Id: I52e84eafb77249fd9378ebe531abe2d694178537
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:35 -07:00
Syed Rameez Mustafa
9aecd4c576 sched: actively migrate tasks to idle big CPUs during sched boost
The sched boost feature is currently tick driven, i.e. task placement
decisions only take place at a tick (or wakeup). The load balancer
does not have any knowledge of boost being in effect. Tasks that are
woken up on a little CPU when all big CPUs are busy will continue
executing there at least until the next tick even if one of the big
CPUs becomes idle. Reduce this latency by adding support for detecting
whether boost is in effect or not in the load balancer.  If boost is
in effect any big CPU running idle balance will trigger active
migration from a little CPU with the highest task load.

Change-Id: Ib2828809efa0f9857f5009b29931f63b276a59f3
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:34 -07:00
Syed Rameez Mustafa
97b9ad42d9 sched: always do idle balance with a NEWLY_IDLE idle environment
With the introduction of energy aware scheduling, if idle_balance() is
to be called on behalf of a different CPU which is idle, CPU_IDLE is
used in the environment for load_balance(). This, however, introduces
subtle differences in load calculations and policies in the load
balancer. For example there are restrictions on which CPU is permitted
to do load balancing during !CPU_NEWLY_IDLE (see update_sg_lb_stats)
and find_busiest_group() uses different criteria to detect the
presence of a busy group. There are other differences as well. Revert
back to using the NEWLY_IDLE environment irrespective of whether
idle_balance() is called for the newly idle CPU or on behalf of an
already existing idle CPU. This will ensure that task movement logic
while doing idle balance remains unaffected.

Change-Id: I388b0ad9a38ca550667895c8ed19628f3d25ce1a
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:33 -07:00
Syed Rameez Mustafa
251081550f sched: fix bail condition in bail_inter_cluster_balance()
Following commit efcad25cbfb (revert "sched: influence cpu_power based
on max_freq and efficiency"), all CPUs in the system have the same
cpu_power and consequently the same group capacity. Therefore, the
check in bail_inter_cluster_balance() can now no longer be used to
distinguish a higher performance cluster from one with lower
performance. The check is currently broken and always returns true for
every load balancing attempt. Fix this by using runqueue capacity
instead which can still be used as a good measure of cluster
capabilities.

Change-Id: Idecfd1ed221d27d4324b20539e5224a92bf8b751
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:32 -07:00
Srivatsa Vaddagiri
b7f98009c5 sched: Initialize env->loop variable to 0
load_balance() function does not explicitly initialize env->loop
variable to 0. As a result, there is a vague possibility of
move_tasks() hitting a very long (unnecessary) loop when it's unable to
move tasks from src_cpu. This can lead to unpleasant results like a
watchdog bark. Fix this by explicitly initializing env->loop variable
to 0 (in both load_balance() and active_load_balance_cpu_stop()).

Change-Id: I36b84c91a9753870fa16ef9c9339db7b706527be
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:31 -07:00
Srivatsa Vaddagiri
84d1fa51ee sched: window-stats: use policy_mutex in sched_set_window()
Changes to several configuration variables result in
reset_all_window_stats() being called. All of them, except
sched_set_window(), are serialized via policy_mutex. Take
policy_mutex in sched_set_window() as well to serialize use of the
reset_all_window_stats() function.

Change-Id: Iada7ff8ac85caa1517e2adcf6394c5b050e3968a
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:30 -07:00
Srivatsa Vaddagiri
da4ffc0b59 sched: window-stats: Avoid taking all cpu's rq->lock for long
reset_all_window_stats() walks task-list with all cpu's rq->lock held,
which can cause spinlock timeouts if task-list is huge (and hence lead
to a spinlock bug report). Avoid this by walking task-list without
cpu's rq->lock held.

Change-Id: Id09afd8b730fa32c76cd3bff5da7c0cd7aeb8dfb
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:29 -07:00
Srivatsa Vaddagiri
29581dc620 sched: window_stats: Add "disable" mode support
"disabled" mode (sched_disable_window_stats = 1) disables all
window-stats related activity. This is useful when changing key
configuration variables associated with window-stats feature (like
policy or window size).

Change-Id: I9e55c9eb7f7e3b1b646079c3aa338db6259a9cfe
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:28 -07:00
Srivatsa Vaddagiri
2a7d718b3d sched: window-stats: Fix exit race
Exiting tasks are removed from tasklist and hence at some point will
become invisible to do_each_thread/for_each_thread task iterators.
This breaks the functionality of reset_all_window_stats() which *has*
to reset stats for *all* tasks.

This patch causes exiting tasks' stats to be reset *before* they are
removed from tasklist. DONT_ACCOUNT bit in exiting task's ravg.flags
is also marked so that their remaining execution time is not accounted
in cpu busy time counters (rq->curr/prev_runnable_sum).
reset_all_window_stats() is thus guaranteed to return with all tasks'
stats reset to 0.

Change-Id: I5f101156a4f958c1b3f31eb0db8cd06e621b75e9
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:27 -07:00
Srivatsa Vaddagiri
dfeae566bb sched: window-stats: code cleanup
Provide a wrapper function to reset task's window statistics. This will be
reused by a subsequent patch

Change-Id: Ied7d32325854088c91285d8fee55d5a5e8a954b3
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:26 -07:00
Srivatsa Vaddagiri
d8932ae7df sched: window-stats: legacy mode
Support legacy mode, which results in busy time being seen by governor
that is close to what it would have seen via existing APIs, i.e.
get_cpu_idle_time_us(), get_cpu_iowait_time_us() and
get_cpu_idle_time_jiffy(). In particular, legacy mode means that only
task execution time is counted in rq->curr_runnable_sum and
rq->prev_runnable_sum. Also task migration does not result in
adjustment of those counters.

Change-Id: If374ccc084aa73f77374b6b3ab4cd0a4ca7b8c90
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:26 -07:00
Srivatsa Vaddagiri
32e4c4a368 sched: window-stats: Code cleanup
Collapse duplicated comments about keeping a few sysctl knobs
initialized to the same value as their non-sysctl copies

Change-Id: Idc8261d86b9f36e5f2f2ab845213bae268ae9028
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:25 -07:00
Srivatsa Vaddagiri
e39131c3be sched: window-stats: Code cleanup
Remove code duplication associated with update of various window-stats
related sysctl tunables

Change-Id: I64e29ac065172464ba371a03758937999c42a71f
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:24 -07:00
Srivatsa Vaddagiri
90a01bb623 sched: window-stats: Code cleanup
add_task_demand() and the 'long_sleep' calculation in it are not strictly
required. rq_freq_margin() checks for the need to change frequency, which
removes the need for the long_sleep calculation. Once that is removed, the
need for add_task_demand() vanishes.

Change-Id: I936540c06072eb8238fc18754aba88789ee3c9f5
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in core.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:00:23 -07:00
Srivatsa Vaddagiri
9425ce4309 sched: window-stats: Remove unused prev_window variable
Remove unused prev_window variable in 'struct ravg'

Change-Id: I22ec040bae6fa5810f9f8771aa1cb873a2183746
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:22 -07:00
Steve Muckle
6ed9cab723 sched: disable frequency notifications by default
The frequency notifications from the scheduler do not currently respect
synchronous topologies. If demand on CPU 0 is driving frequency high and
CPU 1 is in the same frequency domain, and demand on CPU 1 is low,
frequency notifiers will be continuously sent by CPU 1 in an attempt to
have its frequency lowered.

Until the notifiers are fixed, disable them by default. They can still
be re-enabled at runtime.

Change-Id: Ic8a927af2236d8fe83b4f4a633b20a8ddcfba359
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
2016-03-23 20:00:21 -07:00
Steve Muckle
ecae24dd92 sched: fix misalignment between requested and actual windows
When set_window_start() is first executed sched_clock() has not yet
stabilized. Refresh the sched_init_jiffy and sched_clock_at_init_jiffy
values until it is known that sched_clock has stabilized - this will
be the case by the time a client calls the sched_set_window() API.

Change-Id: Icd057707ff44c3b240e5e7e96891b23c95733daa
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
2016-03-23 20:00:20 -07:00
Olav Haugan
8eede4a8d5 sched: Make RAVG_HIST_SIZE tunable
Make RAVG_HIST_SIZE available from /proc/sys/kernel/sched_ravg_hist_size
to allow tuning of the size of the history that is used in computation
of task demand.

CRs-fixed: 706138
Change-Id: Id54c1e4b6e974a62d787070a0af1b4e8ce3b4be6
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in sysctl.h]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:00:19 -07:00
Srivatsa Vaddagiri
778ce1a13c sched: Fix possibility of "stuck" reserved flag
check_for_migration() could mark a thread for migration (in
rq->push_task) and invoke active_load_balance_cpu_stop(). However that
thread could get migrated to another cpu by the time
active_load_balance_cpu_stop() runs, which could fail to clear
reserved flag for a cpu and drop task_struct reference when cpu has
only one task (stopper thread running
active_load_balance_cpu_stop()). This would cause a cpu to have
reserved bit stuck, which prevents it from being used effectively.

Fix this by having active_load_balance_cpu_stop() drop reserved bit
always.

Change-Id: I2464a46b4ddb52376a95518bcc95dd9768e891f9
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[rameezmustafa@codeaurora.org: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:18 -07:00
Srivatsa Vaddagiri
35e98218fd sched: initialize env->flags variable to 0
env->flags and env->new_dst_cpu fields are not initialized in
load_balance() function. As a result, load_balance() could wrongly see
LBF_SOME_PINNED flag set and access (bogus) new_dst_cpu's runqueue
leading to invalid memory reference. Fix this by initializing
env->flags field to 0. While we are at it, fix similar issue in
active_load_balance_cpu_stop() function, although there is no harm
present currently in that function with uninitialized env->flags
variable.

Change-Id: Ied470b0abd65bf2ecfa33fa991ba554a5393f649
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:17 -07:00
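The class of bug fixed in 35e98218fd (and in b7f98009c5 for env->loop) comes from reading struct fields that were never set. One common way to rule this out in C is a designated initializer, since members left out of the initializer are zero-initialized. The sketch below uses a hypothetical `struct lb_env_sketch` and flag name; it is not the kernel's `struct lb_env`, and the commits may well have fixed the issue by plain assignment instead.

```c
#include <assert.h>

#define SKETCH_SOME_PINNED 0x1u /* hypothetical stand-in for a balance flag */

/* Hypothetical, trimmed-down load-balance environment. */
struct lb_env_sketch {
    int dst_cpu;
    int new_dst_cpu;
    unsigned int flags;
    unsigned int loop;
    unsigned int loop_max;
};

/*
 * Build an environment with only the known fields named.
 * C guarantees that members omitted from a designated initializer
 * (new_dst_cpu, flags, loop) start out as zero.
 */
static struct lb_env_sketch make_env(int dst_cpu, unsigned int loop_max)
{
    struct lb_env_sketch env = {
        .dst_cpu = dst_cpu,
        .loop_max = loop_max,
    };

    assert(dst_cpu >= 0);
    return env;
}
```

A caller that tests `env.flags & SKETCH_SOME_PINNED` right after `make_env()` is then certain to see the flag clear, rather than whatever happened to be on the stack.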
Srivatsa Vaddagiri
13b29fc0f7 sched: window-stats: 64-bit type for curr/prev_runnable_sum
Expand rq->curr_runnable_sum and rq->prev_runnable_sum to be 64-bit
counters as otherwise they can easily overflow when a cpu has many
tasks.

Change-Id: I68ab2658ac6a3174ddb395888ecd6bf70ca70473
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:16 -07:00
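The overflow that 13b29fc0f7 guards against is easy to show with plain integer arithmetic. The sketch below assumes, purely for illustration, that each task contributes 20 ms of demand expressed in nanoseconds; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sum per-task demand into a 32-bit counter; wraps modulo 2^32. */
static uint32_t sum_demand_32(uint32_t per_task_ns, uint32_t ntasks)
{
    uint32_t sum = 0, i;

    assert(ntasks > 0);
    for (i = 0; i < ntasks; i++)
        sum += per_task_ns;
    return sum;
}

/* Same sum in a 64-bit counter; no wrap for realistic inputs. */
static uint64_t sum_demand_64(uint32_t per_task_ns, uint32_t ntasks)
{
    uint64_t sum = 0;
    uint32_t i;

    assert(ntasks > 0);
    for (i = 0; i < ntasks; i++)
        sum += per_task_ns;
    return sum;
}
```

With 300 tasks at 20,000,000 ns each, the true total is 6,000,000,000 ns, which exceeds the 32-bit limit of 4,294,967,295, so the 32-bit counter silently wraps while the 64-bit one holds the correct value.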
Srivatsa Vaddagiri
4641b37da8 sched: window-stats: Allow acct_wait_time to be tuned
Add sysctl interface to tune sched_acct_wait_time variable at runtime

Change-Id: I38339cdb388a507019e429709a7c28e80b5b3585
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:15 -07:00
Srivatsa Vaddagiri
c097c9b574 sched: window-stats: Account interrupt handling time as busy time
Account cycles spent by idle cpu handling interrupts (irq or softirq)
towards its busy time.

Change-Id: I84cc084ced67502e1cfa7037594f29ed2305b2b1
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in core.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:00:14 -07:00
Srivatsa Vaddagiri
c20a41478d sched: window-stats: Account idle time as busy time
Provide a knob to consider idle time as busy time, when cpu becomes
idle as a result of io_schedule() call. This will let governor
parameter 'io_is_busy' to be appropriately honored.

Change-Id: Id9fb4fe448e8e4909696aa8a3be5a165ad7529d3
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:13 -07:00
Srivatsa Vaddagiri
900b44b621 sched: window-stats: Account wait time
Extend window-based task load accounting mechanism to include
wait-time as part of task demand. A subsequent patch will make this
feature configurable at runtime.

Change-Id: I8e79337c30a19921d5c5527a79ac0133b385f8a9
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:12 -07:00
Srivatsa Vaddagiri
b9f8d63c08 sched: window-stats: update task demand on tick
A task can execute on a cpu for a long time without being preempted
or migrated. In such case, its demand can become outdated for a long
time. Prevent that from happening by updating demand of currently
running task during scheduler tick.

Change-Id: I321917b4590635c0a612560e3a1baf1e6921e792
CRs-Fixed: 698662
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[joonwoop@codeaurora.org: fixed trivial merge conflict in core.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:00:11 -07:00
Srivatsa Vaddagiri
8e526b1ab4 sched: Fix herding issue
check_for_migration() could run concurrently on multiple cpus,
resulting in multiple tasks wanting to migrate to same cpu. This could
cause cpus to be underutilized and lead to increased scheduling
latencies for tasks. Fix this by serializing select_best_cpu() calls
from cpus running check_for_migration() check and marking selected
cpus as reserved, so that subsequent call to select_best_cpu() from
check_for_migration() will skip reserved cpus.

Change-Id: I73a22cacab32dee3c14267a98b700f572aa3900c
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[rameezmustafa@codeaurora.org: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:00:10 -07:00
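The reservation idea in 8e526b1ab4 can be sketched as a selection function that skips CPUs already claimed by an earlier caller. This is a single-threaded illustration with hypothetical names; the serialization the commit mentions would need a lock around the call, which is omitted here, and "least loaded" is only a stand-in for whatever select_best_cpu() actually optimizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the least-loaded CPU that is not reserved and mark it reserved.
 * Returns the CPU index, or -1 when every CPU is already reserved.
 */
static int pick_and_reserve(const uint32_t *load, int *reserved, size_t ncpus)
{
    int best = -1;
    size_t i;

    assert(load != NULL && reserved != NULL);
    for (i = 0; i < ncpus; i++) {
        if (reserved[i])
            continue; /* claimed by an earlier caller */
        if (best < 0 || load[i] < load[best])
            best = (int)i;
    }
    if (best >= 0)
        reserved[best] = 1;
    return best;
}
```

Successive callers are spread across different CPUs instead of all choosing the same least-loaded one. The reserved mark has to be cleared once the migration completes or is abandoned, which is the concern addressed by 778ce1a13c.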
Srivatsa Vaddagiri
c820f1c5f2 sched: window-stats: print window size in /proc/sched_debug
Printing window size in /proc/sched_debug would provide useful
information to debug scheduler issues.

Change-Id: Ia12ab2cb544f41a61c8a1d87bf821b85a19e09fd
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:10 -07:00
Srivatsa Vaddagiri
69fec0486f sched: Extend ftrace event to record boost and reason code
Add a new ftrace event to record changes to boost setting. Also extend
sched_task_load() ftrace event to record boost setting and reason code
passed to select_best_cpu(). This will be useful for debug purpose.

Change-Id: Idac72f86d954472abe9f88a8db184343b7730287
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:09 -07:00
Srivatsa Vaddagiri
8f8c8db1c5 sched: Avoid needless migration
Restrict check_for_migration() to operate on fair_sched class tasks
only.

Also check_for_migration() can result in a call to select_best_cpu()
to look for a better cpu for currently running task on a cpu. However
select_best_cpu() can end up suggesting a cpu that is not necessarily
better than the cpu on which task is running currently. This will
result in unnecessary migration. Prevent that from happening.

Change-Id: I391cdda0d7285671d5f79aa2da12eaaa6cae42d7
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:08 -07:00
Srivatsa Vaddagiri
35bf2d9d10 sched: Drop active balance request upon cpu going offline
A cpu could mark its currently running task to be migrated to another
cpu (via rq->push_task/rq->push_cpu) and could go offline before
active load balance handles the request. In such case, clear the
active load balance request.

Change-Id: Ia3e668e34edbeb91d8559c1abb4cbffa25b1830b
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
2016-03-23 20:00:06 -07:00