evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Pavankumar Kondeti	c17d7d3c40	sched: auto adjust the upmigrate and downmigrate thresholds The load scale factor of a CPU gets boosted when its max freq is restricted. A task load at the same frequency is scaled higher than normal under this scenario. This results in tasks migrating early to the better capacity CPUs and their residency over there also gets increased as their inflated load would be relatively higher than the downmigrate threshold. Auto adjust the upmigrate and downmigrate thresholds by a factor equal to rq->max_possible_freq/rq->max_freq of a lower capacity CPU. If the adjusted upmigrate threshold exceeds the window size, it is clipped to the window size. If the adjusted downmigrate threshold decreases the difference between the upmigrate and downmigrate, it is clipped to a value such that the difference between the modified and the original thresholds is the same. Change-Id: Ifa70ee5d4ca5fe02789093c7f070c77629907f04 Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>	2016-03-23 20:01:59 -07:00
Syed Rameez Mustafa	38f3da47d7	sched: Use only partial wait time as task demand The scheduler currently either considers a task's entire wait time as task demand or completely ignores wait time based on the tunable sched_account_wait_time. Both approaches have their limitations, however. The former artificially boosts a task's demand when it may not actually be justified. With the latter, the scheduler runs the risk of never being able to recognize true load (consider two CPU hogs on a single little CPU). To achieve a compromise between these two extremes, change the load tracking algorithm to only consider part of a task's wait time as its demand. The portion of wait time accounted as demand is determined by each task's percent load, e.g. for a task that waits for 10 ms and has 60% task load, only 6 ms of the wait will contribute to task demand. This approach is more fair as the scheduler now tries to determine how much of its wait time a task would actually have spent using the CPU if it had been executing. It ensures that tasks with high demand continue to see most of the benefits of accounting wait time as busy time, however, lower demand tasks don't experience a disproportionately high boost to demand triggering unjustified big CPU usage. Note that this new approach is only applicable to wait time being considered as task demand and not wait time considered as CPU busy time. To achieve the above effect, ensure that anytime a task is waiting, its runtime in every relevant window segment is appropriately adjusted using its pct load. Change-Id: I6a698d6cb1adeca49113c3499029b422daf7871f Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:55 -07:00
Syed Rameez Mustafa	f0ddb64b10	sched: Update max_capacity when an entire cluster is hotplugged When an entire cluster is hotplugged, the scheduler's notion of max_capacity can get outdated. This introduces the following inefficiencies in behavior: * task_will_fit() does not return true on all tasks. Consequently all big tasks go through fallback CPU selection logic skipping C-state and power checks in select_best_cpu(). * During boost, migration_needed() returns true unnecessarily, causing an avoidable rerun of select_best_cpu(). * An unnecessary kick is sent to all little CPUs when boost is set. * An opportunity for early bailout from nohz_kick_needed() is lost. Start handling CPUFREQ_REMOVE_POLICY in the policy notifier callback which indicates the last CPU in a cluster being hotplugged out. Also modify update_min_max_capacity() to only iterate through online CPUs instead of possible CPUs. While we can't guarantee the integrity of the cpu_online_mask in the notifier callback, the scheduler will fix up all state soon after any changes to the online mask. The change does have one side effect; early termination from the notifier callback when min_max_freq or max_possible_freq remain unchanged is no longer possible. This is because when the last CPU in a cluster is hot removed, only max_capacity is updated without affecting min_max_freq or max_possible_freq. Therefore, when the first CPU in the same cluster gets hot added at a later point max_capacity must once again be recomputed despite there being no change in min_max_freq or max_possible_freq. Change-Id: I9a1256b5c2cd6fcddd85b069faf5e2ace177e122 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:52 -07:00
Srivatsa Vaddagiri	73b7708de7	sched: Add cgroup-based criteria for upmigration It may be desirable to discourage upmigration of tasks belonging to some cgroups. Add a per-cgroup flag (upmigrate_discourage) that discourages upmigration of tasks of a cgroup. Tasks of the cgroup are allowed to upmigrate only under overcommitted scenario. Change-Id: I1780e420af1b6865c5332fb55ee1ee408b74d8ce Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Use new cgroup APIs] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:44 -07:00
Srivatsa Vaddagiri	c41a54cb8d	sched: Keep track of average nr_big_tasks Extend sched_get_nr_running_avg() API to return average nr_big_tasks, in addition to average nr_running and average nr_io_wait tasks. Also add a new trace point to record values returned by sched_get_nr_running_avg() API. Change-Id: Id3591e6d04da8db484b4d1cb9d95dba075f5ab9a Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Resolve trivial merge conflicts] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:42 -07:00
Srivatsa Vaddagiri	44d892787e	sched: Fix bug in average nr_running and nr_iowait calculation sched_get_nr_running_avg() returns average nr_running and nr_iowait task count since it was last invoked. Fix several bugs in their calculation. * sched_update_nr_prod() needs to consider that nr_running count can change by more than 1 when CFS_BANDWIDTH feature is used * sched_get_nr_running_avg() needs to sum up nr_iowait count across all cpus, rather than just one * sched_get_nr_running_avg() could race with sched_update_nr_prod(), as a result of which it could use curr_time which is behind a cpu's 'last_time' value. That would lead to erroneous calculation of average nr_running or nr_iowait. While at it, fix also a bug in BUG_ON() check in sched_update_nr_prod() function and remove unnecessary nr_running argument to sched_update_nr_prod() function. Change-Id: I46737614737292fae0d7204c4648fb9b862f65b2 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:41 -07:00
Srivatsa Vaddagiri	a6c5eb13da	sched: Support CFS_BANDWIDTH feature in HMP scheduler CFS_BANDWIDTH feature is not currently well-supported by HMP scheduler. Issues encountered include a kernel panic when rq->nr_big_tasks count becomes negative. This patch fixes HMP scheduler code to better handle CFS_BANDWIDTH feature. The most prominent change introduced is maintenance of HMP stats (nr_big_tasks, nr_small_tasks, cumulative_runnable_avg) per 'struct cfs_rq' in addition to being maintained in each 'struct rq'. This allows HMP stats to be updated easily when a group is throttled on a cpu. Change-Id: Iad9f378b79ab5d9d76f86d1775913cc1941e266a Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in dequeue_task_fair().]	2016-03-23 20:01:35 -07:00
Srivatsa Vaddagiri	0a33ec2ea9	sched: Consolidate hmp stats into their own struct Key hmp stats (nr_big_tasks, nr_small_tasks and cumulative_runnable_average) are currently maintained per-cpu in 'struct rq'. Merge those stats in their own structure (struct hmp_sched_stats) and modify impacted functions to deal with the newly introduced structure. This cleanup is required for a subsequent patch which fixes various issues with use of CFS_BANDWIDTH feature in HMP scheduler. Change-Id: Ieffc10a3b82a102f561331bc385d042c15a33998 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in __update_load_avg().] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:01:34 -07:00
Jeff Ohlstein	b0ccf5db31	sched_avg: add run queue averaging Add code to calculate the run queue depth of a cpu and iowait depth of the cpu. The scheduler calls in to sched_update_nr_prod whenever there is a runqueue change. This function maintains the runqueue average and the iowait of that cpu in that time interval. Whoever wants to know the runqueue average is expected to call sched_get_nr_running_avg periodically to get the accumulated runqueue and iowait averages for all the cpus. Change-Id: Id8cb2ecf0ed479f090a83ccb72dd59c53fa73e0c Signed-off-by: Jeff Ohlstein <johlstei@codeaurora.org> (cherry picked from commit 0299fcaaad80e2c0ac9aa583c95107f6edc27750) [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:32 -07:00
Srivatsa Vaddagiri	5b45dc56e5	sched: Per-cpu prefer_idle flag Remove the global sysctl_sched_prefer_idle flag and replace it with a per-cpu prefer_idle flag. The per-cpu flag is expected to be the same for all cpus in a cluster. It thus provides a convenient means to disable packing in one cluster while allowing packing in another cluster. Change-Id: Ie4cc73bb1a55b4eac5697be38e558546161faca1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:01:26 -07:00
Olav Haugan	3f947e7ba7	sched: Add sysctl to enable power aware scheduling Add sysctl to enable energy awareness at runtime. This is useful for performance/power tuning/measurements and debugging. In addition this will match up with the Documentation/scheduler/sched-hmp.txt documentation. Change-Id: I0a9185498640d66917b38bf5d55f6c59fc60ad5c Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:24 -07:00
Joonwoo Park	a4ca8c9b56	sched: take account of irq preemption when calculating irqload delta If an irq is raised while sched_irqload() is calculating irqload delta, sched_account_irqtime() can update rq's irqload_ts, which can be greater than the jiffies stored in sched_irqload()'s context, so delta can be negative. This negative delta means there was a recent irq occurrence. So remove the improper BUG_ON(). CRs-fixed: 771894 Change-Id: I5bb01b50ec84c14bf9f26dd9c95de82ec2cd19b5 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:01:23 -07:00
Olav Haugan	5a48aeb06c	sched: Add temperature to cpu_load trace point Add the current CPU temperature to the sched_cpu_load trace point. This will allow us to track the CPU temperature. CRs-Fixed: 764788 Change-Id: Ib2e3559bbbe3fe07a6b7c8115db606828bc36254 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>	2016-03-23 20:01:19 -07:00
Steve Muckle	588055e8c7	sched: make sched_cpu_high_irqload a runtime tunable It may be desirable to be able to alter the sched_cpu_high_irqload setting easily, so make it a runtime tunable value. Change-Id: I832030eec2aafa101f0f435a4fd2d401d447880d Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2016-03-23 20:01:11 -07:00
Steve Muckle	d3abb1dd6b	sched: avoid CPUs with high irq activity CPUs with significant IRQ activity will not be able to serve tasks quickly. Avoid them if possible by disqualifying such CPUs from being recognized as mostly idle. Change-Id: I2c09272a4f259f0283b272455147d288fce11982 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2016-03-23 20:01:09 -07:00
Steve Muckle	3b5eac8886	sched: track soft/hard irqload per-RQ with decaying avg The scheduler currently ignores irq activity when deciding which CPUs to place tasks on. If a CPU is getting hammered with IRQ activity but has no tasks it will look attractive to the scheduler as it will not be in a low power mode. Track irqload with a decaying average. This quantity can be used in the task placement logic to avoid CPUs which are under high irqload. The decay factor is 3/4. Note that with this algorithm the tracked irqload quantity will be higher than the actual irq time observed in any single window. Some sample outcomes with steady irqloads per 10ms window and the 3/4 decay factor (irqload of 10 is used as a threshold in a subsequent patch), given as irqload per window / load value asymptote / # windows to > 10: 2ms / 8 / n/a; 3ms / 12 / 7; 4ms / 16 / 4; 5ms / 20 / 3. Of course irqload will not be constant in each window, these are just given as simple examples. Change-Id: I9dba049f5dfdcecc04339f727c8dd4ff554e01a5 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2016-03-23 20:01:07 -07:00
Syed Rameez Mustafa	57ee8ef06e	sched: Make RT tasks eligible for boost During sched boost RT tasks currently end up going to the lowest power cluster. This can be a performance bottleneck especially if the frequency and IPC differences between clusters are high. Furthermore, when RT tasks go over to the little cluster during boost, the load balancer keeps attempting to pull work over to the big cluster. This results in pre-emption of the executing RT task causing more delays. Finally, containing more work on a single cluster during boost might help save some power if the little cluster can then enter deeper low power modes. Change-Id: I177b2e81be5657c23e7ac43889472561ce9993a9 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:01:05 -07:00
Srivatsa Vaddagiri	8e3aa6790c	sched: Packing support until a frequency threshold Add another dimension for task packing based on frequency. This patch adds a per-cpu tunable, rq->mostly_idle_freq, which when set will result in tasks being packed on a single cpu in cluster as long as cluster frequency is less than set threshold. Change-Id: I318e9af6c8788ddf5dfcda407d621449ea5343c0 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:01:03 -07:00
Srivatsa Vaddagiri	b2e57842c0	sched: per-cpu mostly_idle threshold sched_mostly_idle_load and sched_mostly_idle_nr_run knobs help pack tasks on cpus to some extent. In some cases, it may be desirable to have different packing limits for different cpus. For example, pack to a higher limit on high-performance cpus compared to power-efficient cpus. This patch removes the global mostly_idle tunables and makes them per-cpu, thus letting task packing behavior be controlled in a fine-grained manner. Change-Id: Ifc254cda34b928eae9d6c342ce4c0f64e531e6c2 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:59 -07:00
Srivatsa Vaddagiri	98f89f00dc	sched: update governor notification logic Make criteria for notifying governor to be per-cpu. Governor is notified of any large change in cpu's busy time statistics (rq->prev_runnable_sum) since the last reported value. Change-Id: I727354d994d909b166d093b94d3dade7c7dddc0d Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:54 -07:00
Srivatsa Vaddagiri	3a67b4ce87	sched: window-stats: Enhance cpu busy time accounting rq->curr/prev_runnable_sum counters represent cpu demand from various tasks that have run on a cpu. Any task that runs on a cpu will have a representation in rq->curr_runnable_sum. Their partial_demand value will be included in rq->curr_runnable_sum. Since partial_demand is derived from historical load samples for a task, rq->curr_runnable_sum could represent "inflated/un-realistic" cpu usage. As an example, let's say that a task with partial_demand of 10ms runs for only 1ms on a cpu. What is included in rq->curr_runnable_sum is 10ms (and not the actual execution time of 1ms). This leads to cpu busy time being reported on the upside causing frequency to stay higher than necessary. This patch fixes cpu busy accounting scheme to strictly represent actual usage. It also provides for conditional fixup of busy time upon migration and upon heavy-task wakeup. CRs-Fixed: 691443 Change-Id: Ic4092627668053934049af4dfef65d9b6b901e6b Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in init_task_load(), se.avg.decay_count has deprecated.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:00:50 -07:00
Srivatsa Vaddagiri	c9d0953c31	sched: improve logic for alerting governor Currently we send notification to governor not taking note of cpus that are synchronized with regard to their frequency. As a result, scheduler could send pointless notifications (notification spam!). Avoid this by considering synchronized cpus and alerting governor only when the highest demand of any cpu within cluster far exceeds or falls behind current frequency. Change-Id: I74908b5a212404ca56b38eb94548f9b1fbcca33d Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:48 -07:00
Srivatsa Vaddagiri	b7e40e50e9	sched: fix wrong load_scale_factor/capacity/nr_big/small_tasks A couple of bugs exist with incorrect use of cpu_online_mask in pre/post_big_small_task() functions, leading to potentially incorrect computation of load_scale_factor/capacity/nr_big/small_tasks. pre/post_big_small_task_count_change() use cpu_online_mask in an unreliable manner. While local_irq_disable() in pre_big_small_task_count_change() ensures a cpu won't go away in cpu_online_mask, nothing prevents a cpu from coming online concurrently. As a result, cpu_online_mask used in pre_big_small_task_count_change() can be inconsistent with that used in post_big_small_task_count_change() which can lead to an attempt to unlock rq->lock which was not taken before. Secondly, when either max_possible_freq or min_max_freq is changing, it needs to trigger recomputation of load_scale_factor and capacity for all cpus, even if some are offline. Otherwise, an offline cpu could later come online with incorrect load_scale_factor/capacity. While it should be sufficient to scan online cpus for updating their nr_big/small_tasks in post_big_small_task_count_change(), unfortunately it sounds pretty hard to provide a stable cpu_online_mask when it's called from cpufreq_notifier_policy(). cpufreq framework can trigger a CPUFREQ_NOTIFY notification in multiple contexts, some in cpu-hotplug paths, which makes it pretty hard to guess whether get_online_cpus() can be taken without causing deadlocks or not. To work around the insufficient information we have about the hotplug-safety context when CPUFREQ_NOTIFY is issued, have post_big_small_task_count_change() traverse all possible cpus in updating nr_big/small_task_count. CRs-Fixed: 717134 Change-Id: Ife8f3f7cdfd77d5a21eee63627d7a3465930aed5 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:42 -07:00
Syed Rameez Mustafa	759ea99236	sched: window-stats: add a new AVG policy The current WINDOW_STATS_AVG policy is actually a misnomer since it uses the maximum value of the runtime in the recent window and the average of the past ravg_hist_size windows. Add a policy that only uses the average and call it WINDOW_STATS_AVG policy. Rename all the other policies to make them shorter and unambiguous. Change-Id: I080a4ea072a84a88858ca9da59a4151dfbdbe62c Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:00:38 -07:00
Syed Rameez Mustafa	251081550f	sched: fix bail condition in bail_inter_cluster_balance() Following commit efcad25cbfb (revert "sched: influence cpu_power based on max_freq and efficiency), all CPUs in the system have the same cpu_power and consequently the same group capacity. Therefore, the check in bail_inter_cluster_balance() can now no longer be used to distinguish a higher performance cluster from one with lower performance. The check is currently broken and always returns true for every load balancing attempt. Fix this by using runqueue capacity instead which can still be used as a good measure of cluster capabilities. Change-Id: Idecfd1ed221d27d4324b20539e5224a92bf8b751 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:00:32 -07:00
Srivatsa Vaddagiri	84d1fa51ee	sched: window-stats: use policy_mutex in sched_set_window() Several configuration variable changes will result in reset_all_window_stats() being called. All of them, except sched_set_window(), are serialized via policy_mutex. Take policy_mutex in sched_set_window() as well to serialize use of the reset_all_window_stats() function. Change-Id: Iada7ff8ac85caa1517e2adcf6394c5b050e3968a Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:30 -07:00
Srivatsa Vaddagiri	29581dc620	sched: window_stats: Add "disable" mode support "disabled" mode (sched_disable_window_stats = 1) disables all window-stats related activity. This is useful when changing key configuration variables associated with window-stats feature (like policy or window size). Change-Id: I9e55c9eb7f7e3b1b646079c3aa338db6259a9cfe Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:28 -07:00
Srivatsa Vaddagiri	d8932ae7df	sched: window-stats: legacy mode Support legacy mode, which results in busy time being seen by governor that is close to what it would have seen via existing APIs i.e get_cpu_idle_time_us(), get_cpu_iowait_time_us() and get_cpu_idle_time_jiffy(). In particular, legacy mode means that only task execution time is counted in rq->curr_runnable_sum and rq->prev_runnable_sum. Also task migration does not result in adjustment of those counters. Change-Id: If374ccc084aa73f77374b6b3ab4cd0a4ca7b8c90 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:26 -07:00
Srivatsa Vaddagiri	e39131c3be	sched: window-stats: Code cleanup Remove code duplication associated with update of various window-stats related sysctl tunables Change-Id: I64e29ac065172464ba371a03758937999c42a71f Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:24 -07:00
Olav Haugan	8eede4a8d5	sched: Make RAVG_HIST_SIZE tunable Make RAVG_HIST_SIZE available from /proc/sys/kernel/sched_ravg_hist_size to allow tuning of the size of the history that is used in computation of task demand. CRs-fixed: 706138 Change-Id: Id54c1e4b6e974a62d787070a0af1b4e8ce3b4be6 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in sysctl.h] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:00:19 -07:00
Srivatsa Vaddagiri	13b29fc0f7	sched: window-stats: 64-bit type for curr/prev_runnable_sum Expand rq->curr_runnable_sum and rq->prev_runnable_sum to be 64-bit counters as otherwise they can easily overflow when a cpu has many tasks. Change-Id: I68ab2658ac6a3174ddb395888ecd6bf70ca70473 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:16 -07:00
Srivatsa Vaddagiri	4641b37da8	sched: window-stats: Allow acct_wait_time to be tuned Add sysctl interface to tune sched_acct_wait_time variable at runtime Change-Id: I38339cdb388a507019e429709a7c28e80b5b3585 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:15 -07:00
Srivatsa Vaddagiri	c097c9b574	sched: window-stats: Account interrupt handling time as busy time Account cycles spent by idle cpu handling interrupts (irq or softirq) towards its busy time. Change-Id: I84cc084ced67502e1cfa7037594f29ed2305b2b1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in core.c] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:00:14 -07:00
Srivatsa Vaddagiri	8e526b1ab4	sched: Fix herding issue check_for_migration() could run concurrently on multiple cpus, resulting in multiple tasks wanting to migrate to same cpu. This could cause cpus to be underutilized and lead to increased scheduling latencies for tasks. Fix this by serializing select_best_cpu() calls from cpus running check_for_migration() check and marking selected cpus as reserved, so that subsequent call to select_best_cpu() from check_for_migration() will skip reserved cpus. Change-Id: I73a22cacab32dee3c14267a98b700f572aa3900c Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:00:10 -07:00
Srivatsa Vaddagiri	1f6363e54c	sched: trigger immediate migration of tasks upon boost Currently turning on boost does not immediately trigger migration of tasks from lower capacity cpus. Tasks could incur migration latency of up to one timer tick (when check_for_migration() is run). Fix this by triggering a migration check on cpus with lower capacity as soon as boost is turned on for first time. Change-Id: I244649f9cb6608862d87631325967b887b7f4b7e Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 20:00:05 -07:00
Srivatsa Vaddagiri	1ffae4dc94	sched: window-stats: Handle policy change properly sched_window_stat_policy influences task demand and thus various statistics maintained per-cpu like curr_runnable_sum. Changing policy non-atomically would lead to improper accounting. For example, when a task is enqueued on a cpu's runqueue, its demand that is added to rq->cumulative_runnable_avg could be based on AVG policy and when it's dequeued its demand that is removed can be based on MAX, leading to erroneous accounting. This change causes policy change to be "atomic" i.e all cpus' rq->lock are held and all tasks' window-stats are reset before policy is changed. Change-Id: I6a3e4fb7bc299dfc5c367693b5717a1ef518c32d CRs-Fixed: 687409 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in include/linux/sched/sysctl.h.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:00:03 -07:00
Srivatsa Vaddagiri	f27b626521	sched: remove sysctl control for HMP and power-aware task placement There is no real need to control HMP and power-aware task placement at runtime after kernel has booted. Boot-time control should be sufficient. Not allowing for runtime (sysctl) support simplifies the code quite a bit. Also rename sysctl_sched_enable_hmp_task_placement to be shorter. Change-Id: I60cae51a173c6f73b79cbf90c50ddd41a27604aa Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict. p->nr_cpus_allowed == 1 has moved to core.c] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:55 -07:00
Srivatsa Vaddagiri	ad25ca2afb	sched: support legacy mode better It should be possible to bypass all HMP scheduler changes at runtime by setting sysctl_sched_enable_hmp_task_placement and sysctl_sched_enable_power_aware to 0. Fix various code paths to honor this requirement. Change-Id: I74254e68582b3f9f1b84661baf7dae14f981c025 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in rt.c, p->nr_cpus_allowed == 1 is now moved in core.c] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:54 -07:00
Syed Rameez Mustafa	3e7b06d9cf	sched: Add a per rq max_possible_capacity for use in power calculations In the absence of a power driver providing real power values, the scheduler currently defaults to using capacity of a CPU as a measure of power. This, however, is not a good measure since the capacity of a CPU can change due to thermal conditions and/or other hardware restrictions. These frequency restrictions have no effect on the power efficiency of those CPUs. Introduce max possible capacity of a CPU to track an absolute measure of capacity which translates into a good absolute measure of power efficiency. Max possible capacity takes the max possible frequency of CPUs into account instead of max frequency. Change-Id: Ia970b853e43a90eb8cc6fd990b5c47fca7e50db8 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 19:59:49 -07:00
Syed Rameez Mustafa	6998234564	sched: Make task and CPU load calculations safe from truncation Load calculations have been modified to accept and return 64 bit values. Fix up all the places where we make such calculations to store the result in 64 bit variables. This is necessary to avoid issues caused by truncation of values. While at it update scale_task_load() to scale_load_to_cpu(). This is because the API is used to scale load of both individual tasks as well as the cumulative load of CPUs. In this sense the name was a misnomer. Also clean up power_cost() to use max_task_load(). Change-Id: I51e683e1592a5ea3c4e4b2b06d7a7339a49cce9c Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in power_cost(). power_cost() now supports sched_use_pelt=1 back again.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:46 -07:00
Syed Rameez Mustafa	eefd598594	sched/fair: Introduce C-state aware task placement for small tasks Small tasks execute for small durations. This means that the power cost of taking CPUs out of a low power mode outweighs any performance advantage of using an idle core or power advantage of using the most power efficient CPU. Introduce C-state aware task placement for small tasks. This requires a two pass approach where we first determine the most power efficient CPU and establish a band of CPUs offering a similar power cost for the task. The order of preference then is as follows: 1) Any mostly idle CPU in active C-state in the same power band. 2) A CPU with the shallowest C-state in the same power band. 3) A CPU with the least load in the same power band. 4) Lowest power CPU in a higher power band. The patch also modifies the definition of a small task. Small tasks are now determined relative to minimum capacity CPUs in the system and not the task CPU. Change-Id: Ia09840a5972881cad7ba7bea8fe34c45f909725e Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 19:59:46 -07:00
Srivatsa Vaddagiri	2735664021	sched: Use historical load for freq governor input Historical load maintained per task can be used to influence cpu frequency better. For example, when a heavy demand task wakes up after prolonged sleep, we could use the historical load information to alert cpufreq governor about the need to raise cpu frequency. This patch changes CPU busy statistics to be aggregation of historical task demand. Also task's historical load (as defined by sysctl_sched_window_stats_policy) is added to cpu's busy statistics (rq->curr_runnable_sum) whenever it executes on a cpu. Change-Id: I2b66136f138b147ba19083b9b044c4feb20d9b57 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 19:59:34 -07:00
Syed Rameez Mustafa	a536cf8ac8	sched: Introduce spill threshold tunables to manage overcommitment When the number of tasks intended for a cluster exceeds the number of mostly idle CPUs in that cluster, the scheduler currently freely uses CPUs in other clusters if possible. While this is optimal for performance the power trade off can be quite significant. Introduce spill threshold tunables that govern the extent to which the scheduler should attempt to contain tasks within a cluster. Change-Id: I797e6c6b2aa0c3a376dad93758abe1d587663624 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in nohz_kick_needed()] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:32 -07:00
Steve Muckle	f469bce8e2	sched: add migration load change notifier for frequency guidance When a task moves between CPUs in two different frequency domains the cpufreq governor may wish to immediately modify the frequency of both the source and destination CPUs of the migrating task. A tunable is provided to establish what size task is considered "significant" enough to warrant notifying cpufreq. Also fix a bug that would cause load to not be accounted properly during wakeup migrations. Change-Id: Ie8f6b1cc4d43a602840dac18590b42a81327c95a Signed-off-by: Steve Muckle <smuckle@codeaurora.org> [rameezmustafa@codeaurora.org: Add double rq locking for set_task_cpu()] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 19:59:29 -07:00
Syed Rameez Mustafa	3f9d4439f2	sched/rt: Introduce power aware scheduling for real time tasks Real Time task scheduling has historically been geared towards performance with a significant attempt to keep higher priority tasks on the same CPU. This is not optimal for power since the task CPU may not be the most power efficient CPU. Also task movement via select_lowest_rq() gives CPU priority the primary consideration before looking at CPU topologies to find a CPU closest to the task CPU in terms of topology. This again is not optimal for power since the closest CPU may be significantly worse for power than CPUs further away. This patch removes any bias for the task CPU. When the lowest priority CPUs in the system are found we give no consideration to the CPU topology. Instead we find the lowest power CPU within local_cpu_mask. This takes care of select_task_rq_rt() and push_task(). The pull model remains unaffected since we have no room for power optimization there. Change-Id: I4162ebe2f74be14240e62476f231f9e4a18bd9e8 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [joonwoop@codeaurora.org: s/__get_cpu_var/this_cpu_ptr/] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:26 -07:00
Srivatsa Vaddagiri	6e8842f8be	sched: Extend update_task_ravg() to accept wallclock as argument This will make it easier to account interrupt time on a cpu, introduced in a subsequent patch. Change-Id: I0e1fb5255c280ca374fd255e7fc19d5de9f8b045 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 19:59:20 -07:00
Srivatsa Vaddagiri	188a6bc174	sched: add sched_get_busy, sched_set_window APIs sched_get_busy() returns the busy time of a cpu during the most recent completed window. sched_set_window() sets the window size and aligns windows across all CPUs. Change-Id: Ic53e27f43fd4600109b7b6db979e1c52c7aca103 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in include/linux/sched.h] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:19 -07:00
Steve Muckle	bb0b8e9859	sched: window-stats: Add aggregated runqueue windowed stats Add counters per-cpu to track its busy time in the latest window and one previous to that. This would be needed to track accurate busy time per-cpu that accounts for migrations. Basically once a task migrates, its execution time in current window is migrated as well to new cpu. The idle task's runtime is not accounted since it should not count towards runqueue busy time. Change-Id: I4014dd686f95dbbfaa4274269bc36ed716573421 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 19:59:17 -07:00
Srivatsa Vaddagiri	33e7100103	sched: window-stats: synchronize windows across cpus Synchronizing windows across cpus for task load measurements simplifies cpu busy time accounting during migrations. When a task migrates, its usage in the current window can be carried over to its new cpu. This lets cpufreq governor see a correct picture of cpu busy time that is not affected by migrations. This patch lines up windows across cpus. One of the cpus, sync_cpu, serves as a reference for all others. During bootup sync_cpu would initialize its window_start (from its sched_clock()). Other cpus will synchronize their window_start in reference to sync_cpu. This patch assumes synchronous sched_clock() across cpus and may need some change to address architectures which do not provide such synchronized sched_clock(). Change-Id: I13381389a72f5f9f85cc2446401d493a55c78ab7 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 19:59:16 -07:00
Srivatsa Vaddagiri	fb9ab2a720	sched: Provide tunable to switch between PELT and window-based stats Provide a runtime tunable to switch between using PELT-based load stats and window-based load stats. This will be needed for runtime analysis of the two load tracking schemes. Change-Id: I018f6a90b49844bf2c4e5666912621d87acc7217 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 19:59:11 -07:00
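
The migration accounting described in bb0b8e9859 and 33e7100103 above (a task's execution time in the current window follows it to the new cpu) might be sketched roughly as below. This is a minimal illustration, not the kernel code: `struct rq_stats`, `struct task_stats`, `curr_window` and `migrate_window_busy_time()` are hypothetical names, and only `curr_runnable_sum` is borrowed from the log. It assumes windows are already aligned across cpus.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cpu counter; the field name follows rq->curr_runnable_sum
 * as mentioned in the log, the struct itself is an assumption. */
struct rq_stats {
    uint64_t curr_runnable_sum; /* busy time in the current window, in ns */
};

/* Hypothetical per-task counter (assumption, not a kernel structure). */
struct task_stats {
    uint64_t curr_window; /* task's execution time in the current window, in ns */
};

/* Move the task's current-window contribution from the source cpu to the
 * destination cpu. Because windows are assumed to be synchronized, the
 * value carries over unchanged. */
static void migrate_window_busy_time(struct rq_stats *src,
                                     struct rq_stats *dst,
                                     const struct task_stats *p)
{
    /* The source must have accounted at least the task's contribution. */
    assert(src->curr_runnable_sum >= p->curr_window);

    src->curr_runnable_sum -= p->curr_window;
    dst->curr_runnable_sum += p->curr_window;
}
```

With this shape, the total busy time across the two cpus is unchanged by a migration, which is the property that lets a governor see busy time unaffected by task movement.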
