Commit graph

564971 commits

Author SHA1 Message Date
Junjie Wu
1bef8bd6b6 cpufreq: interactive: Avoid down_read_trylock if down_write() is held
down_read_trylock is not always non-blocking if the same thread calls
down_write() before.

CPU1					CPU2
					down_read()
down_write()
  __down_write_nested()
    schedule()
      __down_read_trylock()
					up_read()
					  acquires sem->wait_lock
					    __rwsem_wake_one_writer()
	tries to lock sem->wait_lock

Now CPU2 is waiting for CPU1's schedule() to complete, while holding
sem->wait_lock. CPU1 needs sem->wait_lock to continue.

This problem only happens after cpufreq_interactive introduced load
change notification that could be called within schedule().

Add a separate flag to ignore notification if current thread is in
middle of down_write(). This avoids attempting to hold sem->wait_lock.
The additional flag doesn't have any side effects because
down_read_trylock() would have failed anyway.

Change-Id: Iff97cac36c170cf6d03f36de695141289c3d6930
[junjiew@codeaurora.org: Resolved merge conflicts. Dropped changes
 to code that no longer exists.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:14 -07:00
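The flag described in commit 1bef8bd6b6 above can be sketched in user-space C. This is only an illustration under stated assumptions: `in_down_write`, `load_change_notify()` and `governor_write_section()` are hypothetical names, not the governor's actual identifiers, and no real rwsem is involved.

```c
#include <stdbool.h>

/* Hypothetical per-thread marker: set while this thread is inside the
 * down_write() path. The real flag name is not given in the commit message. */
static _Thread_local bool in_down_write;

/* Counts notifications that were actually processed (illustration only). */
static int notifications_handled;

/* Stand-in for the load change callback that may run from within schedule().
 * If the current thread is in the middle of down_write(), the callback returns
 * early and never attempts down_read_trylock(), so sem->wait_lock is untouched. */
static bool load_change_notify(void)
{
    if (in_down_write)
        return false; /* down_read_trylock() would have failed anyway */
    notifications_handled++;
    return true;
}

/* Stand-in for a governor path that takes the semaphore for writing.
 * Returns whether a notification arriving mid-section was handled. */
static bool governor_write_section(void)
{
    bool handled;

    in_down_write = true;
    /* down_write() would go here; it may sleep via schedule(),
     * which is where the callback can fire. */
    handled = load_change_notify();
    in_down_write = false;
    return handled;
}
```

Outside the write section the callback runs normally; inside it the notification is dropped, which matches the message's claim that the flag has no side effects beyond what a failed trylock would have caused.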
Rohit Gupta
f4d69aec4f cpufreq: interactive: Report CPU loads through govinfo notifier
Report CPU load to modules subscribed to cpufreq govinfo notification
chain every time governor timer expires to evaluate load.

Change-Id: I0b35947b1924c179649aafa0b7b93d974164af1a
[junjiew@codeaurora.org: Resolved trivial merge conflicts]
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2016-03-23 20:03:13 -07:00
Junjie Wu
2d91526439 cpufreq: interactive: Do not align sample windows by default
Disable sample window alignment by default to match default behavior
of upstream interactive governor.

Change-Id: Ibbf4bdd4dd423f97d3a9dd5442eba78b378e66e2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:12 -07:00
Junjie Wu
c7cdf7954e cpufreq: interactive: Re-evaluate immediately in load change callback
Previously, there was a limitation in load change callback that it
can't attempt to wake up a task. Therefore the best we can do is to
schedule timer at current jiffy. The timer function will only be
executed at next timer tick. This could take up to 10ms.

Now that this limitation is removed, re-evaluate load immediately upon
receiving this callback.

Change-Id: Iab3de4705b9aae96054655b1541e32fb040f7e60
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:11 -07:00
Junjie Wu
50d577eb97 cpufreq: interactive: Make window alignment optional
Make sampling window alignment optional when scheduler inputs
are not enabled.

Change-Id: If69c111a3efe219cdd1e38c1f46f03404789c0bb
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:11 -07:00
Junjie Wu
814503a967 cpufreq: interactive: Add max_freq_hysteresis feature
Previously known as sampling down factor, max_freq_hysteresis
extends the period that interactive governor will stay at policy->max.
This feature is to accommodate short idle periods in an otherwise very
intensive workload.

When the feature is enabled, it ensures that once a CPU goes to max
frequency, it doesn't reduce the frequency for max_freq_hysteresis
microseconds from the time it first goes to idle.

Change-Id: Ia54985cb554f63f8c22d0b554a0a0f2ed2be038f
[junjiew@codeaurora.org: Resolved conflicts. Dropped changes to code
 that no longer exists. Trivial checkpatch fix. Renamed
 max_freq_idle_start_time to max_freq_hyst_start_time.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:10 -07:00
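The hysteresis rule from commit 814503a967 above reduces to a single time comparison. The sketch below assumes microsecond timestamps from a monotonic clock; `may_lower_from_max()` is a hypothetical helper, and only `max_freq_hysteresis` and `max_freq_hyst_start_time` come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true once the CPU may leave policy->max.
 *
 * now_us:                  current time in microseconds (assumed monotonic)
 * max_freq_hyst_start_us:  time the CPU first went idle while at max frequency
 * max_freq_hysteresis_us:  tunable; 0 effectively disables the feature
 */
static bool may_lower_from_max(uint64_t now_us,
                               uint64_t max_freq_hyst_start_us,
                               uint64_t max_freq_hysteresis_us)
{
    /* Guard against a start time in the future to avoid unsigned wraparound. */
    if (now_us < max_freq_hyst_start_us)
        return false;

    return now_us - max_freq_hyst_start_us >= max_freq_hysteresis_us;
}
```

With a 20000 us hysteresis, a CPU that went idle at t = 0 stays at the maximum frequency until t = 20000, so a brief idle gap inside a heavy workload does not cause a frequency drop.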
Junjie Wu
6cd960aa45 cpufreq: interactive: Add support for using scheduler inputs
Interactive governor does not have enough information about the tasks
on a CPU to make a more informed decision on the frequency the CPUs
should run at. To address this problem, modify interactive governor
to get load information from scheduler. In addition, it can get
notification from scheduler on significant load change to reevaluate
CPU frequency immediately.

Add two sysfs files to control the behavior of load evaluation:
use_sched_load:
	When enabled, governor uses load information from scheduler
	instead of busy/idle time from past window.
use_migration_notif:
	Whenever a task migrates, scheduler might send a notification
	so that governor can re-evaluate load and scale frequency.
	Governor will ignore this notification unless both
	use_sched_load and use_migration_notif are true for
	the policy group.

Change-Id: Iaf66e424c6166ec15480db027002b3a3b357d79c
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:09 -07:00
Junjie Wu
3b48f85cd1 cpufreq: interactive: Use del_timer/add_timer_on to rearm timers
Replace mod_timer_pinned() with del_timer(), add_timer_on().
mod_timer_pinned() always adds timer onto current CPU. Interactive
governor expects each CPU's timers to be running on the same CPU.
If cpufreq_interactive_timer_resched() is called from another CPU,
the timer will be armed on the wrong CPU.

Replacing mod_timer_pinned() with del_timer() and add_timer_on()
guarantees timers are still run on the right CPU even if another
CPU reschedules the timer. This would provide more flexibility
for future changes.

Change-Id: I3a10be37632afc0ea4e0cc9c86323b9783b216b1
[junjiew@codeaurora.org: Dropped changes that are no longer needed
 due to removal of relevant code]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:08 -07:00
Junjie Wu
bbe0d10d97 cpufreq: interactive: Cache tunables when they are created
Currently, tunables are only saved to per_cpu field when
CPUFREQ_GOV_POLICY_EXIT event happens. Save tunables the moment they
are created so that per_cpu cached_tunables field always matches
the tunables in use. This is useful for modifying tunable values
across clusters.

Change-Id: I9e30d5e93d6fde1282b5450458d8a605d568a0f5
[junjiew@codeaurora.org: Resolved trivial conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:07 -07:00
Junjie Wu
6673fa0f98 cpufreq: interactive: Align timer windows for all CPUs
It's more advantageous to evaluate all CPUs at same time so that
interactive governor gets a complete picture of the load on
each CPU at a specific time. It could also reduce number of speed
changes made if there are many CPUs controlled by same policy. In
addition, waking up all CPUs at same time would allow the cluster
to go into a deeper sleep state when it's idle.

Change-Id: I6915050c5339ef1af106eb906ebe4b7c618061e2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:06 -07:00
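One way to picture the alignment described in commit 6673fa0f98 above is to round every CPU's next expiry up to a shared window boundary. This is a sketch under assumptions: `next_aligned_expiry()` is a hypothetical helper, time is an abstract tick count, and the governor's real arithmetic may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the first window boundary strictly after `now`.
 * Boundaries are multiples of `timer_rate`, so every CPU that rearms its
 * timer inside the same window gets the same expiry time. */
static uint64_t next_aligned_expiry(uint64_t now, uint64_t timer_rate)
{
    assert(timer_rate != 0); /* a zero-length window is meaningless */
    return (now / timer_rate + 1) * timer_rate;
}
```

Two CPUs that rearm at ticks 21 and 39 with a window of 20 both expire at tick 40, so they are evaluated together and can go idle together.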
Junjie Wu
7b9f8f19a1 cpufreq: interactive: Move cached_tunables into cpuinfo
Interactive governor already has a per_cpu field cpuinfo to keep track
of per_cpu data. Move cached_tunables into cpuinfo.

Change-Id: I77fda0cda76b56ff949456a95f96d129d877aa7b
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:06 -07:00
Saravana Kannan
9993217296 cpufreq: interactive: Fix freeing of cached tunables during module_exit()
To avoid multiple frees of an allocated tunables struct during
module_exit(), the pointer to the allocated tunables should be stored in
only one of the per-CPU cached_tunables pointers.

So, in the case of per policy governor configuration, store the cached
values in the pointer of first CPU in a policy. In the case of one governor
across all policies, store it in the CPU0 pointer.

Change-Id: Id4334246491519ac91ab725a8758b2748f743bb0
Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
2016-03-23 20:03:05 -07:00
Junjie Wu
249a58c4a5 cpufreq: interactive: Permanently cache tunable values
Userspace might change tunable values for a governor. Currently, if
all CPUs in a policy go offline, governor frees its tunable. This
wipes out all userspace modifications. Kernel drivers can call
cpu_up/down() directly and thus userspace won't have a chance to
restore the tunables.

Permanently save tunable struct in a per_cpu field so that we
preserve tunable values across hotplug, suspend/resume and governor
switch.

Change-Id: I126b8278c8e75c8eadb3e2ddfe97fcc72cddfa23
[junjiew@codeaurora.org: Resolved merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:04 -07:00
Junjie Wu
546c1f0400 cpufreq: interactive: Remove cpufreq_get/put_global_kobject()
Change-Id: I9bb41acc4c86074c2c14562f34480004184494f7
[junjiew@codeaurora.org: resolved trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:03 -07:00
Junjie Wu
edacb2395e cpufreq: Use pr_info() for driver registration and unregistration
Many subsystems depend on cpufreq API for CPU frequency scaling.
Cpufreq API is expected to fail until cpufreq device registers.

Change pr_debug() to pr_info() so that users can determine when
cpufreq API becomes available during boot from kernel messages. This
is crucial to understand whether a cpufreq API failure is benign
during early boot.

Change-Id: Id2dfa009ae33859ec3efcdb29a3296e891852c6a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:02 -07:00
Junjie Wu
fcb680bdad cpufreq: Improve governor related CPUFreq error messages
Governor error messages point to important failures in governor or
framework. Output triggering CPU and policy->cpu to help debugging.

Resolved conflicts for 3.18 kernel.

Change-Id: I4c5c392ec973b764ec3240bb2eb455c624bcaf63
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:01 -07:00
Junjie Wu
79d8f58e5e cpufreq: qcom-cpufreq: Check return of cpufreq_frequency_get_table
cpufreq_frequency_get_table could return NULL. Do error check on the
return value instead of continuing with a potentially NULL pointer.

Change-Id: I0cb8a3a8ae3499e738683e5f45271aeadee488f6
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:00 -07:00
Junjie Wu
1a7d6caef6 cpufreq: Check current frequency in device driver
__cpufreq_driver_target() checks if policy->cur is same as target_freq
without holding any lock. This function is used by governor to
directly set CPU frequency. Governor calling this function can't hold
any CPUfreq framework locks due to deadlock possibility.

However, this results in a race condition where one thread could see
a stale policy->cur while another thread is changing CPU frequency.

Thread A: Governor calls __cpufreq_driver_target(), starts increasing
frequency but hasn't sent out CPUFREQ_POSTCHANGE notification yet.
Thread B: Some other driver (could be thermal mitigation) starts
limiting frequency using cpufreq_update_policy(). All limits are
applied to policy->min/max and final policy->max happens to be same as
policy->cur. __cpufreq_driver_target() simply returns 0.
Thread A: Governor finishes scaling and now policy->cur violates
policy->max and could last forever until next CPU frequency scaling
happens.

Shifting the responsibility of checking policy->cur and target_freq
to CPUfreq device driver would resolve the race as long as the device
driver holds a common mutex.

Change-Id: I6f943228e793a4a4300c58b3ae0143e09ed01d7d
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:03:00 -07:00
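The fix in commit 1a7d6caef6 above moves the "already at target?" comparison into the driver, under a lock shared by all frequency changes. The user-space sketch below illustrates that idea with a pthread mutex; `struct fake_driver` and `driver_set_freq()` are hypothetical and do not correspond to any real cpufreq driver interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical driver state: one mutex serializes every frequency change. */
struct fake_driver {
    pthread_mutex_t lock;
    unsigned int cur_khz;   /* frequency the hardware is actually running at */
    unsigned int hw_writes; /* number of real frequency switches performed */
};

static struct fake_driver drv = {
    .lock = PTHREAD_MUTEX_INITIALIZER,
    .cur_khz = 0,
    .hw_writes = 0,
};

/* Compare against the current frequency only while holding the lock, so a
 * concurrent caller can never act on a stale value of cur_khz. */
static int driver_set_freq(struct fake_driver *d, unsigned int target_khz)
{
    assert(d != NULL);

    pthread_mutex_lock(&d->lock);
    if (d->cur_khz != target_khz) {
        d->cur_khz = target_khz; /* stands in for programming the clock */
        d->hw_writes++;
    }
    pthread_mutex_unlock(&d->lock);
    return 0;
}
```

Because the comparison and the update happen in one critical section, a second caller that requests the new limit either waits for the first change to finish or sees its result, rather than returning early on a stale reading.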
Rohit Gupta
db44d1e057 cpufreq: Add a notifier chain that governors can use to report information
Some modules can benefit from getting additional information cpufreq
governors use to make frequency switch decisions.
This change lays down a basic framework that the governors can use
to report additional information (e.g. CPU load) to
the clients that subscribe to cpufreq govinfo notifier chain.

Change-Id: I511b4bdb7d12394a31ce5352ae47553861e49303
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
[imaund@codeaurora.org: resolved context conflicts]
Signed-off-by: Ian Maund <imaund@codeaurora.org>
2016-03-23 20:02:59 -07:00
Junjie Wu
db35fc6d65 qcom-cpufreq: Use devm_kfree() to match devm_kzalloc()
Frequency table is allocated with devm_kzalloc() and thus should be
freed using devm_kfree().

Change-Id: I9c08838eadb9fc04bda9cc66596e1e0b45b3e4db
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:58 -07:00
Junjie Wu
cc0057fe57 qcom-cpufreq: Fill in policy->freq_table
CPUfreq framework replaced per-cpu freq_table with per-policy
freq_table, and deprecated previous per-cpu APIs.

Fill in policy->freq_table.

Change-Id: Ifc9ac1b6695fd12629a447984dbbd57d657961b2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:57 -07:00
Junjie Wu
1502197f64 qcom-cpufreq: Use new cpufreq_freq_transition_begin/end() API
Previous cpufreq_notify_transition() is deprecated in favor of
cpufreq_freq_transition_begin/end() API which provides serialization
guarantee for notifications.

Use the new API for transition notification.

Change-Id: I8d559e5c6ef4771986b24e017c900476da1f6cdf
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:56 -07:00
Junjie Wu
14d4d12ad6 qcom-cpufreq: Rename cpufreq_suspend to suspend_data
cpufreq_suspend is now a function in core CPUfreq framework. Rename
qcom-cpufreq's local per-cpu variable to suspend_data.

Change-Id: I2f567f0c04271d728d4e6a17b61cea2152c4d8f7
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:55 -07:00
Junjie Wu
16d6aa6ee5 qcom-cpufreq: Remove save/restore of scheduling policy
Different structures might need to be saved and restored based on
different scheduling policies of current thread. Saving and restoring
priority using scheduler APIs is very fragile due to potential changes
in scheduler code. In addition, the priority change doesn't
provide any starvation guarantee because threads can be preempted
before the priority change.

Therefore remove save and restore of priority to avoid potential bugs
when scheduler API changes. Caller will now be responsible for setting
the right priority for their CPU frequency scaling workqueue/thread.

Change-Id: I2a5d8599e75c0c4aa902df3214c17ab2b13dc9a9
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:55 -07:00
Junjie Wu
f1e1ba3b6c qcom-cpufreq: Restore CPU frequency during resume
qcom-cpufreq blocks CPU frequency change request during suspend, because
its dependencies might be suspended. Thus a freq change request would
fail silently, and CPU clock won't change until first frequency update
is requested after system comes out of suspend. This creates a period
when thermal driver cannot perform frequency mitigation, even though
policy->min/max have been correctly updated.

Check each online CPU's policy during resume to correct any frequency
violation as soon as possible.

Change-Id: I3be79cf91e7d5e361314020c9806b770823c0b72
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:54 -07:00
Junjie Wu
9c69b365d2 qcom-cpufreq: Remove per-cpu workqueue
It's no longer a requirement to pin frequency change on the CPU that
is being scaled. Therefore, there is no longer a need for per-cpu
workqueue in qcom-cpufreq. Remove the workqueue.

Change-Id: Ic6fd7f898fa8b1b1226a178b04530c24f0398daa
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:53 -07:00
Junjie Wu
a80a914583 arm: msm: Remove MSM_CPU_FREQ_SET_MIN_MAX related config
MSM_CPU_FREQ_SET_MIN_MAX and related Kconfigs are deprecated. Purge
them from Kconfig and qcom-cpufreq.

Change-Id: I8ac786c155c7e235154b60c79f97d76ea15dace2
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:52 -07:00
Matt Wagantall
c7984141ec trace: power: add cpu_frequency_switch_{start, end}
It is sometimes useful to profile how long CPU frequency switches
take, since they often involve variable overhead (PLL lock times,
voltage increase time, etc.). Add additional traces to make this
possible.

Since the overhead involved may differ based on the frequencies
being switched between, record both the start and the end frequencies
as part of the trace.

Change-Id: I2de743fc357dad3590fd4980f65f38f6073d426e
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
[abhimany: resolve trivial merge conflicts]
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
2016-03-23 20:02:51 -07:00
Stephen Boyd
7252c408d4 cpufreq: Add snapshot of qcom-cpufreq driver
This is a snapshot of qcom-cpufreq as of msm-3.10 commit

acdce027751d5a7488b283f0ce3111f873a5816d (Merge "defconfig: arm64:
Enable ONESHOT_SYNC for msm8994")

Change-Id: Idb99a856330566ffad6309c48edabb220cee7917
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
[junjiew@codeaurora.org: resolved conflicts in Kconfig.arm
 and Makefile. Dropped dependency on ARCH_MSM.]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:50 -07:00
Junjie Wu
86ace8299b cpufreq: cpu-boost: Move CPU_BOOST Kconfig to correct section
cpu-boost driver is not a CPUFreq device. Moving it to the end of
CPUFreq governor section.

Change-Id: Ib433f81e7596789a2e6ea03d0bd0a8d166ecf9e9
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 20:02:49 -07:00
Rohit Gupta
03b71555db cpufreq: cpu-boost: Force most/all tasks to big cluster on input event
Scheduler provides an API to force tasks to the big cluster. To
improve performance, use this API to move most/all tasks to the
big cluster for short duration on an input event. On the removal of
frequency boost (after input_boost_ms), this scheduler boost is also
deactivated.

Change-Id: I9d643914ebc75266478cc22260a45862faad6236
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
2016-03-23 20:02:49 -07:00
Joonwoo Park
449517019b defconfig: msm: enable CONFIG_SCHED_DEBUG
Enable CONFIG_SCHED_DEBUG in order to expose /proc/sched_debug.

Change-Id: Id784c80fe6203f007501637c3d17876528329e2b
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:48 -07:00
Joonwoo Park
48b7015c47 defconfig: msm: enable HMP scheduler
Enable HMP scheduler along with scheduler guided frequency input.

Change-Id: Ia0e7cf6c5c5ff44492836ebb5189574f55cb742e
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:47 -07:00
Joonwoo Park
d794be8975 defconfig: msm: clean up msm_defconfig and msm-perf_defconfig
Clean up msm_defconfig and msm-perf_defconfig with 'make savedefconfig'.

Change-Id: I118d9d4ddc1fb89b4301cb7ceffdbccc60699329
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:46 -07:00
Joonwoo Park
1770a392ab ARM: dts: enable HMP scheduler
Enable HMP scheduler for msm8996.

Change-Id: I2ecdf4b2409b3e1d4f176f2b9f63a9c17aec5ead
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:45 -07:00
Joonwoo Park
20af1732c6 sched: fix compile error where !CONFIG_SCHED_FREQ_INPUT
The sysctl node sched_new_task_windows is only for CONFIG_SCHED_HMP and
CONFIG_SCHED_FREQ_INPUT.

Change-Id: I4791e977fa8516fd2cd31198f71103b8d7e874c3
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:44 -07:00
Joonwoo Park
9df619ba91 sched: fix compile failure where !CONFIG_SCHED_HMP
Fix compile failure when HMP scheduler isn't selected.

Change-Id: I411fa3501a4c4ac280c037a1698aa3b7278d440f
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:44 -07:00
Joonwoo Park
07eb3f803b sched: select task's prev_cpu as the best CPU when it was chosen recently
Select given task's prev_cpu when the task slept for short period to
reduce latency of task placement and migrations.  A new tunable
/proc/sys/kernel/sched_select_prev_cpu_us is introduced to determine whether
tasks are eligible to go through fast path.

CRs-fixed: 947467
Change-Id: Ia507665b91f4e9f0e6ee1448d8df8994ead9739a
[joonwoop@codeaurora.org: fixed conflict in include/linux/sched.h,
 include/linux/sched/sysctl.h, kernel/sched/core.c and kernel/sysctl.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:43 -07:00
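The eligibility test implied by commit 07eb3f803b above can be sketched as a threshold on sleep time. This is an assumption-laden illustration: `use_prev_cpu_fast_path()` is a hypothetical helper, and the real scheduler may apply further conditions before reusing prev_cpu.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the task slept briefly enough to skip the full CPU
 * search and go straight back to its previous CPU.
 *
 * sleep_us:                  how long the task was asleep, in microseconds
 * sched_select_prev_cpu_us:  tunable threshold; 0 disables the fast path
 */
static bool use_prev_cpu_fast_path(uint64_t sleep_us,
                                   uint64_t sched_select_prev_cpu_us)
{
    return sleep_us < sched_select_prev_cpu_us;
}
```

A task that slept 500 us with a 2000 us threshold takes the fast path; one that slept 5000 us goes through normal placement.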
Syed Rameez Mustafa
fd38bb103d sched: Add documentation for the revised hmp zone scheduler.
Add documentation for the revised task placement logic for the
scheduler. Since the old file sched-hmp.txt is still required,
add a new one instead.

Change-Id: Ic7e3845c8d6b85b7918cd35c2a0a482a621fe525
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:02:42 -07:00
Joonwoo Park
0498f793e8 sched: use ktime instead of sched_clock for load tracking
At present, HMP scheduler uses sched_clock to setup window boundary to
be aligned with timer interrupt to ensure timer interrupt fires after
window rollover.  However this alignment won't last long since the timer
interrupt rearms next timer based on time measured by ktime which isn't
coupled with sched_clock.

Convert sched_clock to ktime to avoid wallclock discrepancy between
scheduler and timer so that we can ensure scheduler's window boundary is
always aligned with timer.

CRs-fixed: 933330
Change-Id: I4108819a4382f725b3ce6075eb46aab0cf670b7e
[joonwoop@codeaurora.org: fixed minor conflict in include/linux/tick.h
 and kernel/sched/core.c.  omitted fixes for kernel/sched/qhmp_core.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:41 -07:00
Syed Rameez Mustafa
d2d5734fec sched: Update min/max capacity for the CPUFREQ_CREATE_POLICY notifier
Following the change "57e2905 sched: Skip resetting HMP stats when
max frequencies remain unchanged" the scheduler fails to update
min/max capacities appropriately when CPUs are hot added after being
hot removed. Fix this problem by handling the CPUFREQ_CREATE_POLICY
notification and explicitly updating min/max capacities.

Change-Id: I5dadac3258e18897fa3d505cf128ebe24c091efa
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:02:40 -07:00
Pavankumar Kondeti
516b19042c sched/cputime: fix a deadlock on 32bit systems
cpu_hardirq_time and cpu_softirq_time are protected with
seqlock on 32bit systems. There is a potential deadlock
with this seqlock and rq->lock.

CPU 1                             CPU0
==========================        ========================
--> acquire CPU0 rq->lock         --> __irq_enter()
----> task enqueue/dequeue        ----> irqtime_account_irq()
------> update_rq_clock()         ------> irq_time_write_begin()
--------> irq_time_read()         --------> sched_account_irqtime()
(waiting for the seqlock          (waiting for the CPU0 rq->lock)
held in irq_time_write_begin())

Fix this issue by dropping the seqlock before calling
sched_account_irqtime()

Change-Id: I29a33876e372f99435a57cc11eada9c8cfd59a3f
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-03-23 20:02:39 -07:00
Syed Rameez Mustafa
280b866848 sched: Optimize scheduler trace events to reduce trace buffer usage
Scheduler ftrace events currently generate a lot of data when turned
on. The excessive log messages often end up overflowing trace buffers
for long use cases or crowding out other events. Optimize scheduler
events so that the log spew is less and more manageable. To that end
change the variable type for some event fields; introduce variants
of sched_cpu_load that can be turned on/off for separate code paths
and remove unused fields from various events.

Change-Id: I2b313542b39ad5e09a01ad1303b5dfe2c4883b8a
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in rt.c due to
 CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:38 -07:00
Joonwoo Park
f406df7c35 sched: initialize frequency domain cpumask
It's possible select_best_cpu() gets called before the first cpufreq
notifier call.  In such scenario select_best_cpu() can hang forever by
not clearing search_cpus.

Initialize frequency domain cpumask with the CPU of rq to avoid such
scenario.

CRs-fixed: 931349
Change-Id: If8d31c5477efe61ad7c6b336ba9e27ca6f556b63
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:38 -07:00
Joonwoo Park
60abcbcfdf sched: print sched_task_load always
At present select_best_cpu() bails out when the best idle CPU is found without
printing sched_task_load trace event.  Print it.

Change-Id: Ie749239bdb32afa5b1b704c048342b905733647e
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:37 -07:00
Joonwoo Park
0254e50843 sched: add preference for prev and sibling CPU in RT task placement
Add a bias towards the RT task's previous CPU and sibling CPUs in order
to avoid cache bouncing and migrations.

CRs-fixed: 927903
Change-Id: I45d79d774e65efcb38282130b6692b4c3b03c2f0
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:36 -07:00
Vikram Mulukutla
dba1a27b5a sched: core: Don't use current task_cpu when migrating with stop_one_cpu
To migrate a running task using stop_one_cpu, one has to give up
the pi_lock and rq_lock. To safeguard against migration
between giving up those locks and actually invoking stop_one_cpu,
one has to save away task_cpu(p) before releasing pi_lock, and
use the saved value when passing it as the src_cpu argument to
stop_one_cpu. If the current task_cpu is passed in, the task may
have already been migrated to that CPU for whatever other reason.

sched_exec attempts to invoke stop_one_cpu with source CPU
set to task_cpu(task) after dropping the pi_lock. While this
doesn't result in a functional error, it is rather useless to
have the entire migration code run when the task is already
running on the destination CPU.

Change-Id: I02963ed02c7119a3d707580a191fbc86b94cdfaf
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
[joonwoop@codeaurora.org: omitted changes for qhmp_core.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:35 -07:00
Syed Rameez Mustafa
c00814c023 sched: Notify cpufreq governor early about potential big tasks
Tasks that are on the runqueue continuously for a certain amount of time
have the potential to be big tasks at the end of the window in which they
are runnable. In such scenarios ramping the CPU frequency early can
boost performance rather than waiting till the end of a window for the
governor to query load. Notify the governor early at every tick when a
task has been observed to execute beyond some percentage of the tick
period.

The threshold beyond which a task is eligible for early detection can be
changed via the tunable sched_early_detection_duration. The feature itself
is enabled only when scheduler boost is in effect.

Change-Id: I528b72bbc79a55b4593d1b8ab45450411c6d70f3
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in scheduler_tick() in
 kernel/sched/core.c.  fixed minor conflicts in include/linux/sched.h,
 include/linux/sched/sysctl.h and kernel/sysctl.c due to
 CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:34 -07:00
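The early-notification check in commit c00814c023 above combines two conditions from the message: scheduler boost must be in effect, and the task must have run continuously past the threshold. The sketch below assumes nanosecond units and a hypothetical helper name, `should_notify_early()`; only `sched_early_detection_duration` is taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Evaluated at every tick. Returns true when the governor should be told
 * early that the running task is likely to end the window as a big task.
 *
 * sched_boost_active:              feature is enabled only under scheduler boost
 * continuous_runtime_ns:           time the task has been running without a break
 * sched_early_detection_duration:  tunable threshold (nanoseconds assumed here)
 */
static bool should_notify_early(bool sched_boost_active,
                                uint64_t continuous_runtime_ns,
                                uint64_t sched_early_detection_duration)
{
    if (!sched_boost_active)
        return false;

    return continuous_runtime_ns > sched_early_detection_duration;
}
```

Without boost the function never fires, so the governor only queries load at the end of each window as before.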
Syed Rameez Mustafa
a805e4b220 sched: Skip resetting HMP stats when max frequencies remain unchanged
A change in cpufreq policy parameters currently triggers a partial reset
of HMP stats. This is necessary when there are changes in the max
frequency of any cluster since updated load scaling factors necessitate
updating the number of big and small tasks on every CPU. However, this
computation is redundant when parameters other than the max freq change.
Optimize code by avoiding the redundant calculations.

Change-Id: Ib572f5dfdc4ada378e695f328ff81e2ce31132ba
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:02:33 -07:00
Joonwoo Park
cee02f8168 sched: update sched_task_load trace event
Add best_cpu and latency field to sched_task_load trace event.  The latency
field represents combined latency of update_task_ravg(), update_task_ravg()
and select_best_cpu() which is useful to analyze latency overhead of HMP
scheduler.

Change-Id: Ie6d777c918d0414d361d758490e3cd7d509f5837
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:32 -07:00