Commit graph

21927 commits

Author SHA1 Message Date
Joonwoo Park
d009f9c149 sched: eliminate sched_enable_power_aware knob and parameter
Kill unused scheduler knob and parameter sched_enable_power_aware.  The HMP
scheduler always takes power cost into account when placing tasks.

Change-Id: Ib26a21df9b903baac26c026862b0a41b4a8834f3
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-01 15:21:29 -07:00
Joonwoo Park
462213d1ac sched: eliminate sched_freq_account_wait_time knob
Kill unused scheduler knob sched_freq_account_wait_time.

Change-Id: Ib74123ebd69dfa3f86cf7335099f50c12a6e93c3
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-01 15:21:18 -07:00
Joonwoo Park
5160d93b6d sched: eliminate sched_account_wait_time knob
Kill unused scheduler knob sched_account_wait_time.  With this change the
scheduler always accounts a task's wait time into its demand.

Change-Id: Ifa4bcb5685798f48fd020f3d0c9853220b3f5fdc
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-01 15:21:04 -07:00
Srivatsa Vaddagiri
e6aae1c3e0 sched: Aggregate for frequency
Related threads in a group could execute on different CPUs and hence
present a split-demand picture to cpufreq governor. IOW the governor
fails to see the net cpu demand of all related threads in a given
window if the threads' execution were to be split across CPUs. That
could result in a sub-optimal frequency being chosen in comparison to the
ideal frequency at which the aggregate work (taken up by related
threads) needs to be run.

This patch aggregates cpu execution stats in a window for all related
threads in a group. This helps present cpu busy time to governor as if
all related threads were part of the same thread and thus help select
the right frequency required by related threads. This aggregation
is done per-cluster.
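
Illustrative sketch (not taken from the patch): the aggregation described
above can be pictured as summing the per-window busy time of all related
threads per cluster instead of per CPU. The function name, the
`cpu_to_cluster` mapping and the sample data are hypothetical.

```python
from collections import defaultdict

def aggregate_busy_ns(samples, cpu_to_cluster):
    """Sum the per-window busy time of related threads, keyed by cluster.

    samples: iterable of (cpu, busy_ns) pairs, one per related thread.
    cpu_to_cluster: mapping from CPU id to cluster id (hypothetical layout).
    """
    per_cluster = defaultdict(int)
    for cpu, busy_ns in samples:
        per_cluster[cpu_to_cluster[cpu]] += busy_ns
    return dict(per_cluster)

# Three related threads: two on CPUs 0 and 1 (cluster 0), one on CPU 4 (cluster 1).
cpu_to_cluster = {0: 0, 1: 0, 4: 1}
samples = [(0, 3_000_000), (1, 4_000_000), (4, 2_000_000)]
busy = aggregate_busy_ns(samples, cpu_to_cluster)
```

With per-CPU accounting the governor would see at most 4 ms of demand on any
CPU of cluster 0; the aggregate view shows 7 ms for that cluster.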

Change-Id: I71e6047620066323721c6d542034ddd4b2950e7f
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: Fixed notify_migration() to hold rcu read
 lock as this version of Linux doesn't hold p->pi_lock when the
 function gets called while keeping use of rcu_access_pointer() since
 we never dereference return value.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-05-26 15:28:59 -07:00
Kishor PK
05bd41f94e trace: prevent NULL pointer dereference
Prevent unintended NULL pointer dereference in trace_event_perf.

Change-Id: I35151c460b4350ebd414b67c655684c2019f799f
Signed-off-by: Kishor PK <kpbhat@codeaurora.org>
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
2016-05-25 14:21:54 -07:00
Aparna Das
89ee45617e coresight: add stm logging to support optimization in trace printk
The function trace_printk() performs optimization by determining if
there are no format parameters in argument string and calls appropriate
apis to write to ftrace buffer. Add STM logging to support this
optimization in order to allow CoreSight STM tracing for optimized
trace_printk path.

Change-Id: I1a77291e77410c6ed99474335a6d25742c409e47
Signed-off-by: Aparna Das <adas@codeaurora.org>
Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
Signed-off-by: Shashank Mittal <mittals@codeaurora.org>
2016-05-24 14:15:32 -07:00
Shashank Mittal
e56ad58d2c coresight: enable stm logging for trace events, marker and printk
Dup ftrace event traffic and writes to trace_marker file from
userspace to STM. Also dup trace printk traffic to STM. This
allows Linux tracing and log data to be correlated with other
data transported over STM.

Change-Id: I4fcb42f2e97ab963fdc85853f4f3ea1f208bfc3c
Signed-off-by: Pratik Patel <pratikp@codeaurora.org>
[spjoshi@codeaurora.org: 3.18 code fixup]
Signed-off-by: Sarangdhar Joshi <spjoshi@codeaurora.org>
[mittals@codeaurora.org: 4.4 code fixup]
Signed-off-by: Shashank Mittal <mittals@codeaurora.org>
2016-05-24 14:15:31 -07:00
Joonwoo Park
d9ff0d77af sched: simplify CPU frequency estimation and cycle counter API
Most CPUs increase the cycle counter by one every cycle, which makes
frequency = cycles / time_delta correct.  It is therefore reasonable
to get rid of the current cpu_cycle_max_scale_factor and to ask the cycle
counter read callback to return a scaled counter value in the case
where the cycle counter doesn't increase every cycle.

Thus multiply the CPU cycle counter delta by NSEC_PER_SEC / HZ_PER_KHZ,
as we calculate frequency in kHz, and remove cpu_cycle_max_scale_factor.
This allows us to simplify frequency estimation and the cycle counter API.
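
Illustrative sketch (not taken from the patch): the arithmetic above, worked
through in Python. Only NSEC_PER_SEC and HZ_PER_KHZ come from the message;
the function name and the assumption that the time delta is in nanoseconds
are hypothetical.

```python
NSEC_PER_SEC = 1_000_000_000
HZ_PER_KHZ = 1_000

def estimate_freq_khz(cycles_delta, time_delta_ns):
    """Estimate the average CPU frequency in kHz over an interval.

    cycles / ns * NSEC_PER_SEC gives Hz; dividing by HZ_PER_KHZ gives kHz.
    """
    if time_delta_ns <= 0:
        raise ValueError("time_delta_ns must be positive")
    return cycles_delta * (NSEC_PER_SEC // HZ_PER_KHZ) // time_delta_ns

# 3,000,000 cycles counted over 2 ms corresponds to 1.5 GHz.
freq = estimate_freq_khz(cycles_delta=3_000_000, time_delta_ns=2_000_000)
```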

Change-Id: Ie7a628d4bc77c9b6c769f6099ce8d75740262a14
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-05-20 19:23:47 -07:00
Srinivasarao P
06ae01888e perf: duplicate deletion of perf event
A malicious app can open a perf event with constraint_duplicate
bit set, disable the event, and close the fd.  On closing the fd,
the perf_release() modification causes the kernel to clean up
the event as if it still were enabled, leading to the event
being removed from a list twice.

CRs-Fixed: 977563
Change-Id: I5fbec3722407d2f3d0ff0d9f7097c5889e31fd62
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2016-05-18 13:39:58 -07:00
Joonwoo Park
c0cc65346e sched: use correct Kconfig macro name CONFIG_SCHED_HMP_CSTATE_AWARE
Fix the macro name so that CONFIG_SCHED_HMP_CSTATE_AWARE=y takes effect.

Change-Id: I0218b36b2d74974f50a173a0ac3bc59156c57624
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-05-16 20:10:32 -07:00
Joonwoo Park
cd947ad761 Revert "sched: set HMP scheduler's default initial task load to 100%"
This reverts commit 28f67e5a50 ("sched: set HMP scheduler's
default initial task load to 100%") since an initial task load of 100%
is too power-inefficient on some targets.

CRs-fixed: 1006303
Change-Id: I81b4ba8fdc2e2fe1b40f18904964098fa558989b
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-05-16 20:10:30 -07:00
David Keitel
5834faf085 trace: cpu_freq_switch: use tracefs instead of debugfs
Rather than using debugfs, switch to tracefs which trace
moved to in kernel 4.4.

Signed-off-by: David Keitel <dkeitel@codeaurora.org>
Change-Id: I52ef7d45cabb20cc61fbd2fb3ef5016b041bc56c
2016-05-16 20:10:17 -07:00
Biswajit Paul
60c6b65403 kernel: Restrict permissions of /proc/iomem.
The permissions of /proc/iomem currently are -r--r--r--. Everyone can
see its content. As iomem contains information about the physical memory
content of the device, restrict the information only to root.
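
Illustrative sketch (not taken from the patch): the mode strings involved,
rendered with Python's `stat` module. It assumes "only to root" means
owner-read-only (0400) on a root-owned file; the exact mode the patch uses
is not stated in this message.

```python
import stat

# Mode quoted in the message: world-readable regular file.
before = stat.S_IFREG | 0o444
# Assumed restricted mode: readable by the owner (root) only.
after = stat.S_IFREG | 0o400

before_str = stat.filemode(before)  # '-r--r--r--'
after_str = stat.filemode(after)    # '-r--------'
```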

Change-Id: If0be35c3fac5274151bea87b738a48e6ec0ae891
CRs-Fixed: 786116
Signed-off-by: Biswajit Paul <biswajitpaul@codeaurora.org>
Signed-off-by: Avijit Kanti Das <avijitnsec@codeaurora.org>
2016-05-09 18:35:28 -07:00
Tejun Heo
e2b6ea208b workqueue: implement lockup detector
Workqueue stalls can happen from a variety of usage bugs such as
missing WQ_MEM_RECLAIM flag or concurrency managed work item
indefinitely staying RUNNING.  These stalls can be extremely difficult
to hunt down because the usual warning mechanisms can't detect
workqueue stalls and the internal state is pretty opaque.

To alleviate the situation, this patch implements a workqueue lockup
detector.  It periodically monitors all worker_pools and, if any pool
has failed to make forward progress for longer than the threshold
duration, triggers a warning and dumps the workqueue state as follows.

 BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
 Showing busy workqueues and worker pools:
 workqueue events: flags=0x0
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
     pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
 workqueue events_power_efficient: flags=0x80
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
     pending: check_lifetime, neigh_periodic_work
 workqueue cgroup_pidlist_destroy: flags=0x0
   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
     pending: cgroup_pidlist_destroy_work_fn
 ...

The detection mechanism is controlled through the kernel parameter
workqueue.watchdog_thresh and can be updated at runtime through the
sysfs module parameter file.
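
Illustrative sketch (not taken from the patch): the detection rule reduced
to its core, with each pool represented by the timestamp of its last forward
progress. The function name, the pool names and the 30-second threshold used
here are assumptions for the example.

```python
def find_stalled_pools(last_progress, now, thresh):
    """Return the pools that made no forward progress for longer than thresh.

    last_progress: mapping from pool name to the time (seconds) of its last
    forward progress. now and thresh are in seconds.
    """
    return sorted(pool for pool, ts in last_progress.items() if now - ts > thresh)

# pool0 last progressed 31 s ago, pool1 1.5 s ago.
stalled = find_stalled_pools({"pool0": 100.0, "pool1": 129.5}, now=131.0, thresh=30.0)
```

A periodic timer would call such a check and, for each stalled pool, print a
warning like the one in the dump above.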

v2: Decoupled from softlockup control knobs.

CRs-Fixed: 1007459
Change-Id: Id7dfbbd2701128a942b1bcac2299e07a66db8657
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Git-commit: 82607adcf9cdf40fb7b5331269780c8f70ec6e35
Git-repo: git://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-05-05 15:05:53 -07:00
Tejun Heo
b24e86e268 watchdog: introduce touch_softlockup_watchdog_sched()
touch_softlockup_watchdog() is used to tell watchdog that scheduler
stall is expected.  One group of usage is from paths where the task
may not be able to yield for a long time such as performing slow PIO
to finicky device and coming out of suspend.  The other is to account
for scheduler and timer going idle.

For scheduler softlockup detection, there's no reason to distinguish
the two cases; however, workqueue lockup detector is planned and it
can use the same signals from the former group while the latter would
spuriously prevent detection.  This patch introduces a new function
touch_softlockup_watchdog_sched() and converts the latter group to call
it instead.  For now, it just calls touch_softlockup_watchdog() and
there's no functional difference.

CRs-Fixed: 1007459
Change-Id: I6fe77926acd4240458cab29d399f81d8739a16c0
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Git-commit: 03e0d4610bf4d4a93bfa16b2474ed4fd5243aa71
Git-repo: git://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-05-05 15:05:52 -07:00
Joonwoo Park
55b8e041e6 sched: take into account of limited CPU min and max frequencies
A CPU's actual min and max frequencies can be limited by hardware
components without the governor being aware of it.  Provide an API for
such components to notify the scheduler, so that it knows the accurate
current operating frequency boundaries, which helps it make better task
placement decisions.

CRs-fixed: 1006303
Change-Id: I608f5fa8b0baff8d9e998731dcddec59c9073d20
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-27 19:13:06 -07:00
Joonwoo Park
35f1d99e0a sched: add support for CPU frequency estimation with cycle counter
At present the scheduler calculates a task's demand from the task's
execution time weighted by CPU frequency.  The CPU frequency is given by
the governor's CPU frequency transition notification.  Such a notification
may not be available.

Provide an API for the CPU clock driver to register callback functions so
that the scheduler can access the CPU's cycle counter and estimate the CPU's
frequency without notification.  At this point the scheduler assumes the
cycle counter always increases, even when the cluster is idle, which might
not be true.  This will be fixed by a subsequent change for more accurate
I/O wait time accounting.

CRs-fixed: 1006303
Change-Id: I93b187efd7bc225db80da0184683694f5ab99738
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-27 19:13:05 -07:00
Joonwoo Park
f8bf0307bc sched: revise sched_boost to make the best of big cluster CPUs
At present sched_boost makes the scheduler place tasks on the least
loaded CPU under the assumption that the big and little clusters have the
same capacity at the same frequency level.  This is suboptimal for a
big.LITTLE system that doesn't have such symmetrical capacity between
big and little CPUs.

Fix sched_boost to place tasks on the big CPUs on targets with
non-symmetrical capacity.

CRs-fixed: 1006303
Change-Id: I752f020acf1a76580edb5cd0e5ad283b62edfeed
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-25 17:44:39 -07:00
Joonwoo Park
16c433e4c5 sched: fix excessive task packing where CONFIG_SCHED_HMP_CSTATE_AWARE=y
At present, among CPUs with the same power cost and C-state, the scheduler
places a newly waking task on the most loaded CPU, which can cause
excessive task packing on the same CPU.  Place the task onto the most loaded
CPU only when the best CPU is in an idle C-state; otherwise spread out by
placing it onto the least loaded CPU.

CRs-fixed: 1006303
Change-Id: I8ae7332971b3293d912b1582f75e33fd81407d86
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-22 15:05:34 -07:00
Joonwoo Park
2e0ebb0155 sched: add option whether CPU C-state is used to guide task placement
There are CPUs that don't have an obvious low power mode exit latency
penalty.  Add a new Kconfig CONFIG_SCHED_HMP_CSTATE_AWARE which controls
whether CPU C-state is used to guide task placement.

CRs-fixed: 1006303
Change-Id: Ie8dbab8e173c3a1842d922f4d1fbd8cc4221789c
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-22 15:05:24 -07:00
Syed Rameez Mustafa
d4ca4d767f sched: update placement logic to prefer C-state and busier CPUs
Update the wakeup placement logic when need_idle is not set. Break
ties in power with C-state. If C-state is the same break ties with
prev_cpu. Finally go for the most loaded CPU.
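
Illustrative sketch (not taken from the patch): the tie-breaking order above
expressed as a sort key. The candidate layout, the function name and the
assumption that a lower power cost and a lower (shallower) C-state value are
preferred are hypothetical.

```python
def pick_cpu(candidates, prev_cpu):
    """Pick a CPU by power cost, then C-state, then prev_cpu, then highest load.

    candidates: list of dicts with keys 'cpu', 'power_cost', 'cstate', 'load'.
    """
    def key(c):
        # False sorts before True, so the previous CPU wins a C-state tie.
        # Load is negated so that the most loaded CPU wins the final tie.
        return (c["power_cost"], c["cstate"], c["cpu"] != prev_cpu, -c["load"])
    return min(candidates, key=key)["cpu"]

candidates = [
    {"cpu": 0, "power_cost": 10, "cstate": 1, "load": 50},
    {"cpu": 1, "power_cost": 10, "cstate": 0, "load": 20},
    {"cpu": 2, "power_cost": 10, "cstate": 0, "load": 70},
]
no_prev_match = pick_cpu(candidates, prev_cpu=3)  # falls through to most loaded
prev_match = pick_cpu(candidates, prev_cpu=1)     # prev_cpu breaks the tie
```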

CRs-fixed: 1006303
Change-Id: Iafa98a909ed464af33f4fe3345bbfc8e77dee963
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed bug where assigns best_cpu_cstate with
 uninitialized cpu_cstate.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-22 15:05:13 -07:00
Syed Rameez Mustafa
c34b0b85aa sched: Optimize wakeup placement logic when need_idle is set
Try and find the min cstate CPU within the little cluster when a
task fits there. If there is no idle CPU return the least busy
CPU. Also add a prev CPU bias when C-states or load is the same.

CRs-fixed: 1006303
Change-Id: I577cc70a59f2b0c5309c87b54e106211f96e04a0
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-04-22 15:05:01 -07:00
Liam Mark
21f86651a6 android/lowmemorykiller: Ignore tasks with freed mm
A killed task can stay in the task list long after its
memory has been returned to the system, therefore
ignore any tasks whose mm struct has been freed.

Change-Id: I76394b203b4ab2312437c839976f0ecb7b6dde4e
CRs-fixed: 450383
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2016-04-13 11:09:29 -07:00
Jeevan Shriram
a6c4b5ad91 kernel: sched: Fix compilation issues for Usermode Linux
Fix compilation errors for ARCH=um for x86_64 architecture.

CRs-Fixed: 996252
Change-Id: I414b551e28a950e4b601f31bb4bfa2f1200d1713
Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>
2016-04-12 15:49:42 -07:00
Prasad Sodagudi
efea890321 genirq: call cancel_work_sync from irq_set_affinity_notifier
Whenever the notification of IRQ affinity changes, call
cancel_work_sync() from irq_set_affinity_notifier() to cancel
all pending work and avoid work list corruption.

Change-Id: I1f093bcc43be8c6696bad29250e4926cbc6c4029
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
2016-03-25 16:02:35 -07:00
Pavankumar Kondeti
bd887e4a58 sched: fix circular dependency of rq->lock and kswapd waitqueue lock
There is a deadlock scenario due to the circular dependency of CPU's
rq->lock and kswapd's waitqueue lock.

(1) When kswapd is woken up, try_to_wake_up() is called with its
waitqueue lock held. Its previous CPU is offline, so it is woken
up on a different CPU. We try to acquire the offline CPU's rq->lock
in either the cpufreq change callback or fixup_busy_time().

(2) At the same time, the offline CPU is coming online and init_idle()
is called from __cpu_up(). init_idle() calls __sched_fork() with
rq->lock held. A debug object allocation in hrtimer_init() called
from __sched_fork() tries to wake up kswapd and attempts to
take the waitqueue lock held in the (1) path.

Task specific initialization is done in __sched_fork() and rq->lock
is not held when it is called for other tasks. The same holds true for
the idle task as well. __sched_fork() for the idle task is called only
when the CPU is not active.

Acquire the rq->lock after calling __sched_fork() in init_idle()
to fix this deadlock.

CRs-Fixed: 965873
Change-Id: Ib8a265835c29861dba571c9b2a6b7e75b5cb43ee
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
[satyap: trivial merge conflicts resolution and omitted changes for QHMP]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2016-03-23 21:30:42 -07:00
Joonwoo Park
375d7195fc sched: move out migration notification out of spinlock
Commit 5e16bbc2fb ("sched: Streamline the task migration locking
a little") hardened task migration locking, and now __migrate_task() is
called with the rq lock held.  Move the notification out of the spinlock.

Change-Id: I553adcfe80d5c670f4ddf83438226fd5e0924fe8
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:25 -07:00
Joonwoo Park
d96fdc91d1 sched: fix compile failure with !CONFIG_SCHED_HMP
Fix various compilation failures when CONFIG_SCHED_HMP or
CONFIG_SCHED_INPUT isn't enabled.

Change-Id: I385dd37cfd778919f54f606bc13bebedd2fb5b9e
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:24 -07:00
Joonwoo Park
16ecb20600 sched: restrict sync wakee placement bias with waker's demand
Biasing a sync wakee task towards the waker CPU's cluster makes sense when
the waker's demand is high enough that the wakee can also take advantage
of the high CPU frequency voted for because of the waker's load.  Placing a
sync wakee on a low-demand waker's CPU can lead to placement imbalance, which
can lead to unnecessary migration.

Introduce a new tunable "sched_big_waker_task_load" that defines a big
waker, so that the scheduler avoids biasing the wakee towards the waker's
cluster when the waker's load is below the tunable.

CRs-fixed: 971295
Change-Id: I1550ede0a71ac8c9be74a7daabe164c6a269a3fb
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
[joonwoop@codeaurora.org: fixed a minor conflict in
 include/linux/sched/sysctl.h.]
2016-03-23 21:25:23 -07:00
Joonwoo Park
616e04a51c sched: add preference for waker cluster CPU in wakee task placement
If a sync wakee task's demand is small, it is worth placing the wakee task
on the waker's cluster for better performance: the waker and wakee are
correlated, so the wakee should take advantage of the waker cluster's
frequency, which is voted for by the waker, along with the cache locality
benefit.  While biasing towards the waker's cluster we want to avoid the
waker's CPU as much as possible, as placing the wakee on the waker's CPU can
cause the waker to be preempted and migrated by the load balancer.

Introduce a new tunable 'sched_small_wakee_task_load' that identifies
eligible small wakee tasks, and place those small wakee tasks on the waker's
cluster.

CRs-fixed: 971295
Change-Id: I96897d9a72a6f63dca4986d9219c2058cd5a7916
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
[joonwoop@codeaurora.org: fixed a minor conflict in
 include/linux/sched/sysctl.h.]
2016-03-23 21:25:22 -07:00
Olav Haugan
b29f9a7a84 sched/core: Add protection against null-pointer dereference
p->grp is being accessed outside of the lock, which can cause a null-pointer
dereference. Fix this and also add an RCU critical section around accesses
to this data structure.

CRs-fixed: 985379
Change-Id: Ic82de6ae2821845d704f0ec18046cc6a24f98e39
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in init_new_task_load().]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:21 -07:00
Joonwoo Park
615b6f6221 sched: allow select_prev_cpu_us to be set to values greater than 100us
At present sched_select_prev_cpu_us tunable is restricted to values
below 100us.  Fix this unintended restriction.

CRs-Fixed: 972237
Change-Id: I5eaf9f40468805c396328ca1022baef32acf8de0
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:20 -07:00
Pavankumar Kondeti
fbeb32ce8f sched: clean up idle task's mark_start restoring in init_idle()
The idle task's mark_start can get updated even without the CPU being
online. Hence the mark_start is restored when the CPU is coming online.

The idle task's mark_start is reset in init_idle()->__sched_fork()->
init_new_task_load(). The original mark_start is saved and restored
later. This can be avoided by moving init_new_task_load() to
wake_up_new_task(), which never gets called for an idle task.

We only care about idle task's ravg.mark_start and not initializing
the other fields of ravg struct will not have any side effects.

This clean up allows the subsequent patches to drop the rq->lock
while calling __sched_fork() in init_idle().

CRs-Fixed: 965873
Change-Id: I41de6d69944d7d44b9c4d11b2d97ad01bd8fe96d
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
[joonwoop@codeaurora.org: fixed a minor conflict in core.c.  omitted
 changes for CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:19 -07:00
Pavankumar Kondeti
58d411413f sched: let sched_boost take precedence over sched_restrict_cluster_spill
When sched_restrict_cluster_spill knob is enabled, RT tasks are restricted
to lower power cluster. This knob also restricts inter cluster no-hz kicks.
Ignore this knob setting when sched_boost is enabled so that tasks are
placed on CPUs with highest spare capacity.

CRs-Fixed: 968852
Change-Id: I01b3fc10b39dc834a733d64c2ee29c308d7ff730
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-03-23 21:25:18 -07:00
Pavankumar Kondeti
6d742ce87b sched: Add separate load tracking histogram to predict loads
Current window based load tracking only saves history for five
windows. A historically heavy task's heavy load will be completely
forgotten after five windows of light load. Even before the five
windows expire, a heavy task that wakes up on the same CPU it used to run on
won't trigger any frequency change until the end of the window. It would
starve for the entire window. It also adds one "small" load window to its
history because it's accumulating load at a low frequency, further
reducing the tracked load for this heavy task.

Ideally, scheduler should be able to identify such tasks and notify
governor to increase frequency immediately after it wakes up.

Add a histogram for each task to track a much longer load history. A
prediction is made based on the runtime of the previous or current
window, histogram data and load tracked in recent windows. Predictions
for all tasks that are currently running or runnable on a CPU are
aggregated and reported to the CPUFreq governor in sched_get_cpus_busy().

sched_get_cpus_busy() now returns predicted busy time in addition
to previous window busy time and new task busy time, scaled to
the CPU maximum possible frequency.

Tunables:

- /proc/sys/kernel/sched_gov_alert_freq (KHz)

This tunable can be used to further filter the notifications.
Frequency alert notification is sent only when the predicted
load exceeds previous window load by sched_gov_alert_freq converted to
load.

Change-Id: If29098cd2c5499163ceaff18668639db76ee8504
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
[joonwoop@codeaurora.org: fixed merge conflicts around __migrate_task()
 and removed changes for CONFIG_SCHED_QHMP.]
2016-03-23 21:25:17 -07:00
Junjie Wu
efa673322f sched: Provide a wake up API without sending freq notifications
Each time a task wakes up, the scheduler evaluates its load and notifies
the governor if the resulting frequency of the destination CPU is larger
than a threshold. However, some governors wake up a separate task that
handles frequency changes, which again calls wake_up_process().

This is dangerous because if the task being woken up meets the
threshold and ends up being moved around, there is a potential for
endless recursive notifications.

Introduce a new API for waking up a task without triggering
frequency notification.

Change-Id: I24261af81b7dc410c7fb01eaa90920b8d66fbd2a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 21:25:17 -07:00
Pavankumar Kondeti
71a8c392b7 sched: Take downmigrate threshold into consideration
If tasks run on the higher capacity cluster solely because they
cannot fit in the lower capacity cluster, the
downmigrate threshold prevents frequent task migrations between
the clusters.

Change-Id: I234a23ffd907c2476c94d5f6227dab1bb6c9bebb
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-03-23 21:25:16 -07:00
Pavankumar Kondeti
6003b006be sched: Provide a facility to restrict RT tasks to lower power cluster
The current CPU selection algorithm for RT tasks looks for the
least loaded CPU in all clusters. Stop the search at the lowest
possible power cluster based on "sched_restrict_cluster_spill"
sysctl tunable.

Change-Id: I34fdaefea56e0d1b7e7178d800f1bb86aa0ec01c
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-03-23 21:25:15 -07:00
Pavankumar Kondeti
8cd1d7ef16 sched: Take cluster's minimum power into account for optimizing sbc()
The select_best_cpu() algorithm iterates over all the clusters and
selects the most power efficient CPU that satisfies the task needs.
During the search, skip the next cluster if its minimum power cost
is higher than the power cost of an eligible CPU found in the previous
cluster.

In a b.L system, if the BIG cluster minimum power cost is higher than
the maximum power cost of the little cluster, this optimization avoids
searching the BIG cluster if an eligible CPU is found in the little
cluster.
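
Illustrative sketch (not taken from the patch): the cluster-skipping rule
above in a few lines. The cluster layout, the `fits` predicate and the
returned list of searched clusters are hypothetical and exist only to make
the pruning visible.

```python
def select_best_cpu(clusters, fits):
    """Return (best_cpu, searched_cluster_names).

    clusters: list ordered from lowest to highest power cost; each entry is a
    dict with 'name', 'min_power_cost' and 'cpus' (list of (cpu, power_cost)).
    fits: predicate telling whether the task is eligible to run on a CPU.
    """
    best = None  # (power_cost, cpu) of the cheapest eligible CPU so far
    searched = []
    for cluster in clusters:
        # Skip a cluster that cannot beat the eligible CPU already found.
        if best is not None and cluster["min_power_cost"] > best[0]:
            continue
        searched.append(cluster["name"])
        for cpu, cost in cluster["cpus"]:
            if fits(cpu) and (best is None or cost < best[0]):
                best = (cost, cpu)
    return (best[1] if best else None), searched

clusters = [
    {"name": "little", "min_power_cost": 10, "cpus": [(0, 12), (1, 10)]},
    {"name": "big", "min_power_cost": 40, "cpus": [(4, 45), (5, 40)]},
]
small_task = select_best_cpu(clusters, fits=lambda cpu: True)
big_task = select_best_cpu(clusters, fits=lambda cpu: cpu >= 4)
```

For the small task the big cluster is never searched; for the task that only
fits on big CPUs both clusters are visited.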

Change-Id: I5e3755f107edb6c72180edbec2a658be931c276d
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-03-23 21:25:14 -07:00
Pavankumar Kondeti
6418f213ab sched: Revise the inter cluster load balance restrictions
The frequency based inter cluster load balance restrictions are not
reliable as frequency does not provide a good estimate of the CPU's
current load. Replace them with the spill_load and spill_nr_run
based checks.

The higher capacity cluster is restricted from pulling the tasks from
the lower capacity cluster unless all of the lower capacity CPUs are
above spill. This behavior can be controlled by a sysctl tunable and
it is disabled by default (i.e. no load balance restrictions).
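
Illustrative sketch (not taken from the patch): the pull restriction above as
a predicate. How "above spill" is evaluated here (load over spill_load or run
queue length over spill_nr_run) and all names are assumptions made for the
example.

```python
def above_spill(cpu, spill_load, spill_nr_run):
    """Assumed definition: a CPU is above spill if either limit is exceeded."""
    return cpu["load"] > spill_load or cpu["nr_running"] > spill_nr_run

def may_pull_from_lower(lower_cpus, spill_load, spill_nr_run, restrict=False):
    """Decide whether the higher capacity cluster may pull tasks.

    restrict mirrors the sysctl tunable, which is off by default.
    """
    if not restrict:
        return True
    return all(above_spill(c, spill_load, spill_nr_run) for c in lower_cpus)

little = [{"load": 90, "nr_running": 2}, {"load": 40, "nr_running": 1}]
unrestricted = may_pull_from_lower(little, spill_load=80, spill_nr_run=5)
restricted = may_pull_from_lower(little, spill_load=80, spill_nr_run=5, restrict=True)
```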

Change-Id: I45c09c8adcb61a8a7d4e08beadf2f97f1805fb42
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
[joonwoop@codeaurora.org: fixed merge conflicts due to omitted changes
 for CONFIG_SCHED_QHMP.]
2016-03-23 21:25:13 -07:00
Srivatsa Vaddagiri
3004236139 sched: colocate related threads
Provide userspace interface for tasks to be grouped together as
"related" threads. For example, all threads involved in updating
display buffer could be tagged as related.

Scheduler will attempt to provide special treatment for group of
related threads such as:

1) Colocation of related threads in same "preferred" cluster
2) Aggregation of demand towards determination of cluster frequency

This patch extends scheduler to provide best-effort colocation support
for a group of related threads.

Change-Id: Ic2cd769faf5da4d03a8f3cb0ada6224d0101a5f5
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor merge conflicts.  removed ifdefry
 for CONFIG_SCHED_QHMP.]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>

Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:12 -07:00
Srivatsa Vaddagiri
df6bfcaf70 sched: Update fair and rt placement logic to use scheduler clusters
Make use of clusters in the fair and rt scheduling classes. This is
needed as the freq domain mask can no longer be used to do correct
task placement. The freq domain mask was being used to demarcate
clusters.

Change-Id: I57f74147c7006f22d6760256926c10fd0bf50cbd
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed merge conflicts due to omitted changes
 for CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:11 -07:00
Srivatsa Vaddagiri
cb1bb6a8f4 sched: Introduce the concept CPU clusters in the scheduler
A cluster is a set of CPUs sharing some power controls and an L2 cache.
This patch builds a list of clusters at bootup, sorted by
their max_power_cost. Many cluster-shared attributes like cur_freq,
max_freq etc. are currently maintained needlessly in the per-cpu 'struct rq'.
Consolidate them in a cluster structure.

Change-Id: I0567672ad5fb67d211d9336181ceb53b9f6023af
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in
 arch/arm64/kernel/topology.c. fixed conflict due to omitted changes for
 CONFIG_SCHED_QHMP.]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 21:25:10 -07:00
Satya Durga Srinivasu Prabhala
ffff87b7bf kernel/watchdog.c: fix compilation warning on Kernel 4.4
This change fixes below compilation warning on Kernel 4.4.

watchdog.c:122:22: warning: 'hardlockup_allcpu_dumped' \
defined but not used [-Wunused-variable]

Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2016-03-23 21:25:03 -07:00
Junjie Wu
0ea6cc5218 tracing: power: Add trace events for core control
Add trace events for core control module.

Change-Id: I36da5381709f81ef1ba82025cd9cf8610edef3fc
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 21:24:30 -07:00
Jeevan Shriram
48d195bfd6 sched: remove init_new_task_load from CONFIG_SMP
Move the init_new_task_load() function out of CONFIG_SMP to avoid
a linking error for ARCH=um.

Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>
2016-03-23 21:24:22 -07:00
Junjie Wu
809d8460ca sched: Export sched_setscheduler_nocheck()
Export sched_setscheduler_nocheck() so that external kernel modules
can use it.

Change-Id: Ib50f537f5aef50c365ba63fb8ffce05bc1c7c431
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Bryan Huntsman <bryanh@codeaurora.org>
2016-03-23 21:24:02 -07:00
Bryan Huntsman
f720d40148 Revert "sched: Export sched_setscheduler_nocheck"
This reverts commit 84778472e1.

Signed-off-by: Bryan Huntsman <bryanh@codeaurora.org>
2016-03-23 21:24:01 -07:00
Lingutla Chandrasekhar
2fbd2f64e4 time: alarmtimer: include lpm-levels for MSM targets only
lpm-levels headers are required only when CONFIG_MSM_PM is set.

To compile the msm kernel for other targets (ARCH=um), add a config
check around the lpm-levels include.

Change-Id: Ia1bd51da4952e56b945a5e51a3b1ff8aaa643cd5
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>
2016-03-23 21:23:46 -07:00
Mohit Aggarwal
4b81b05c57 rtc: alarm: Change wake-up source
Currently, RTC_ALARM is used to wake up the target from the
suspend state and is also used for the power-off alarm
feature. This patch uses qtimer to wake up from the
suspend state.

Change-Id: Ia42cfecd573309be2f03c18b4f1c321be8202d7d
Signed-off-by: Mohit Aggarwal <maggarwa@codeaurora.org>
2016-03-23 21:23:46 -07:00