Commit graph

1654 commits

Author SHA1 Message Date
Vincent Guittot
e62a1ca36b UPSTREAM: sched/core: Fix group_entity's share update
The update of the share of a cfs_rq is done when its load_avg is updated
but before the group_entity's load_avg has been updated for the past time
slot. This generates wrong load_avg accounting which can be significant
when small tasks are involved in the scheduling.

Let's take the example of a task "a" that is dequeued from its task group A:
   root
  (cfs_rq)
    \
    (se)
     A
    (cfs_rq)
      \
      (se)
       a

Task "a" was the only task in task group A which becomes idle when a is
dequeued.

We have the sequence:

- dequeue_entity a->se
    - update_load_avg(a->se)
    - dequeue_entity_load_avg(A->cfs_rq, a->se)
    - update_cfs_shares(A->cfs_rq)
	A->cfs_rq->load.weight == 0
        A->se->load.weight is updated with the new share (0 in this case)
- dequeue_entity A->se
    - update_load_avg(A->se), but its weight is now zero, so the last time
      slot (up to a tick) will be accounted with a weight of 0 instead of
      its real weight during that time slot. The last time slot will be
      accounted as an idle one whereas it was a running one.

If the running time of task a is short enough that no tick happens when it
runs, all running time of group entity A->se will be accounted as idle
time.

Instead, we should update the share of a cfs_rq (in fact the weight of its
group entity) only after having updated the load_avg of the group_entity.

update_cfs_shares() now takes the sched_entity as a parameter instead of the
cfs_rq, and the weight of the group_entity is updated only once its load_avg
has been synced with current time.
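
For illustration, the reordered update in the dequeue path looks roughly
like this (a sketch based on the cherry-picked commit; the exact call
sites differ between kernel versions):

	for_each_sched_entity(se) {
		cfs_rq = cfs_rq_of(se);
		/*
		 * Sync the group entity's load_avg up to now while it
		 * still carries its old weight...
		 */
		update_load_avg(se, UPDATE_TG);
		/*
		 * ...and only then recompute the share, which may set
		 * se->load.weight to 0 for a now-idle group.
		 */
		update_cfs_shares(se);
	}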

Change-Id: Id6ce3be1767b44b444ce2a77ed1ba063e57c0664
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/1482335426-7664-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 89ee048f3cc796db6f26906c6bef4edf0bee70fd)
[minor cherry pick stuff]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Peter Zijlstra
baaa21b59b UPSTREAM: sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion
Commit:

  fde7d22e01 ("sched/fair: Fix overly small weight for interactive group entities")

did something non-obvious, but also introduced a bug that stayed latent.

The problem was exposed for real by a later commit in the v4.7 merge window:

  2159197d6677 ("sched/core: Enable increased load resolution on 64-bit kernels")

... after which tg->load_avg and cfs_rq->load.weight had different
units (10 bit fixed point and 20 bit fixed point resp.).

Add a comment to explain the use of cfs_rq->load.weight over the
'natural' cfs_rq->avg.load_avg and add scale_load_down() to correct
for the difference in unit.

Since this is (now, as per a previous commit) the only user of
calc_tg_weight(), collapse it.

The effects of this bug should be randomly inconsistent SMP-balancing
of cgroups workloads.
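
Roughly, the collapsed calc_cfs_shares() then reads (a sketch following
the upstream commit this is picked from):

	static long calc_cfs_shares(struct cfs_rq *cfs_rq, struct task_group *tg)
	{
		long tg_weight, load, shares;

		/*
		 * This really should be: cfs_rq->avg.load_avg, but instead
		 * we use cfs_rq->load.weight, its upper bound, to ramp up
		 * shares for small-weight interactive tasks.
		 * scale_load_down() maps the 20-bit weight back into
		 * tg->load_avg's 10-bit fixed point units.
		 */
		load = scale_load_down(cfs_rq->load.weight);

		tg_weight = atomic_long_read(&tg->load_avg);

		/* Ensure tg_weight >= load (the collapsed calc_tg_weight()). */
		tg_weight -= cfs_rq->tg_load_avg_contrib;
		tg_weight += load;

		shares = (tg->shares * load);
		if (tg_weight)
			shares /= tg_weight;

		if (shares < MIN_SHARES)
			shares = MIN_SHARES;
		if (shares > tg->shares)
			shares = tg->shares;

		return shares;
	}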

Change-Id: If1e565662ea163485edd94a12aef644d0e0dfe7a
Reported-by: Jirka Hladky <jhladky@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 2159197d6677 ("sched/core: Enable increased load resolution on 64-bit kernels")
Fixes: fde7d22e01 ("sched/fair: Fix overly small weight for interactive group entities")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit ea1dc6fc6242f991656e35e2ed3d90ec1cd13418)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Vincent Guittot
20bbd92679 UPSTREAM: sched/fair: Fix incorrect task group ->load_avg
A scheduler performance regression has been reported by Joseph Salisbury,
which he bisected back to:

  3d30544f0212 ("sched/fair: Apply more PELT fixes)

The regression triggers when several levels of task groups are involved
(read: SystemD) and cpu_possible_mask != cpu_present_mask.

The root cause is that group entity's load (tg_child->se[i]->avg.load_avg)
is initialized to scale_load_down(se->load.weight). During the creation of
a child task group, its group entities on possible CPUs are attached to
parent's cfs_rq (tg_parent) and their loads are added to the parent's load
(tg_parent->load_avg) with update_tg_load_avg().

But only the load on online CPUs will then be updated to reflect real load,
whereas load on other CPUs will stay at the initial value.

The result is a tg_parent->load_avg that is higher than the real load, the
weight of group entities (tg_parent->se[i]->load.weight) on online CPUs is
smaller than it should be, and the task group gets less running time than
it could expect.

( This situation can be detected with /proc/sched_debug. The ".tg_load_avg"
  of the task group will be much higher than sum of ".tg_load_avg_contrib"
  of online cfs_rqs of the task group. )

The load of group entities doesn't have to be initialized to anything
other than 0, because their load will increase when an entity is attached.
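
The corresponding initialization then becomes, roughly (a sketch of the
cherry-picked hunk; other sched_avg fields start at zero):

	void init_entity_runnable_average(struct sched_entity *se)
	{
		struct sched_avg *sa = &se->avg;

		/*
		 * Tasks are initialized with full load to be seen as heavy
		 * tasks until they get a chance to stabilize to their real
		 * load level. Group entities are initialized with zero load
		 * to reflect the fact that nothing has been attached to the
		 * task group yet.
		 */
		if (entity_is_task(se))
			sa->load_avg = scale_load_down(se->load.weight);
		sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
		/* sa->util_avg is set later by post_init_entity_util_avg(). */
	}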

Change-Id: Ie55021ff98ba49016adfddb2444e9c9709939226
Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org> # 4.8.x
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: joonwoop@codeaurora.org
Fixes: 3d30544f0212 ("sched/fair: Apply more PELT fixes")
Link: http://lkml.kernel.org/r/1476881123-10159-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit b5a9b340789b2b24c6896bcf7a065c31a4db671c)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Peter Zijlstra
640c909c34 UPSTREAM: sched/fair: Fix effective_load() to consistently use smoothed load
Starting with the following commit:

  fde7d22e01 ("sched/fair: Fix overly small weight for interactive group entities")

calc_tg_weight() doesn't compute the right value as expected by effective_load().

The difference is in the 'correction' term. In order to ensure \Sum
rw_j >= rw_i we cannot use tg->load_avg directly, since that might be
lagging a correction on the current cfs_rq->avg.load_avg value.
Therefore we use tg->load_avg - cfs_rq->tg_load_avg_contrib +
cfs_rq->avg.load_avg.

Now, per the referenced commit, calc_tg_weight() doesn't use
cfs_rq->avg.load_avg, which is what is later used as @w, but uses
cfs_rq->load.weight instead.

So stop using calc_tg_weight() and do it explicitly.

The effects of this bug are wake_affine() making randomly
poor choices in cgroup-intense workloads.
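
Explicitly, the correction then looks like this (a sketch following the
upstream commit, with w being this cfs_rq's avg.load_avg):

	/* W = @wg + \Sum rw_j */
	W = wg + atomic_long_read(&tg->load_avg);

	/* Ensure \Sum rw_j >= rw_i */
	W -= cfs_rq->tg_load_avg_contrib;
	W += w;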

Change-Id: I1c0058ff674650cf295c8dc3b88a5a3de4bddab0
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v4.3+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: fde7d22e01 ("sched/fair: Fix overly small weight for interactive group entities")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 7dd4912594daf769a46744848b05bd5bc6d62469)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Vincent Guittot
89e4d18a67 UPSTREAM: sched/fair: Propagate asynchronous detach
A task can be asynchronously detached from cfs_rq when migrating
between CPUs. The load of the migrated task is then removed from
source cfs_rq during its next update. We use this event to set the
propagation flag.

During the load balance, we take advantage of the update of blocked
load to propagate any pending changes.

The propagation relies on patch:

  "sched: Fix hierarchical order in rq->leaf_cfs_rq_list"

... which orders children and parents, to ensure that it's done in one pass.

Change-Id: I33782e35fc4711f5901e8c23d6aa7ec5f2ff7ee5
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: kernellwp@gmail.com
Cc: pjt@google.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1478598827-32372-6-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 4e5160766fcc9f41bbd38bac11f92dce993644aa)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Vincent Guittot
e875665411 UPSTREAM: sched/fair: Propagate load during synchronous attach/detach
When a task moves from/to a cfs_rq, we set a flag which is then used to
propagate the change at parent level (sched_entity and cfs_rq) during
next update. If the cfs_rq is throttled, the flag will stay pending until
the cfs_rq is unthrottled.

For propagating the utilization, we copy the utilization of group cfs_rq to
the sched_entity.

For propagating the load, we have to take into account the load of the
whole task group in order to evaluate the load of the sched_entity.
Similarly to what was done before the rewrite of PELT, we add a correction
factor in case the task group's load is greater than its share, so it will
contribute the same load as a task of equal weight.
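
For the utilization side this is a straight copy plus a delta forwarded
to the parent; roughly (a simplified sketch of the cherry-picked commit,
the load side additionally applies the correction factor described above):

	static void update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se)
	{
		struct cfs_rq *gcfs_rq = group_cfs_rq(se);	/* the group's own cfs_rq */
		long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;

		if (!delta)
			return;

		/* Copy the group cfs_rq's utilization to the sched_entity. */
		se->avg.util_avg = gcfs_rq->avg.util_avg;
		se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;

		/* And propagate the change to the parent cfs_rq. */
		cfs_rq->avg.util_avg += delta;
		cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
	}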

Change-Id: Id34a9888484716961c9027299c0b4d82881a39d1
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: kernellwp@gmail.com
Cc: pjt@google.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1478598827-32372-5-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 09a43ace1f986b003c118fdf6ddf1fd685692d49)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Vincent Guittot
8370e07d82 UPSTREAM: sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
Fix the insertion of cfs_rq in rq->leaf_cfs_rq_list to ensure that a
child will always be called before its parent.

The hierarchical order in shares update list has been introduced by
commit:

  67e86250f8 ("sched: Introduce hierarchal order on shares update list")

With the current implementation a child can be still put after its
parent.

Let's take the example of:

       root
        \
         b
         /\
         c d*
           |
           e*

with root -> b -> c already enqueued but not d -> e, so the
leaf_cfs_rq_list looks like: head -> c -> b -> root -> tail

The branch d -> e will be added the first time that they are enqueued,
starting with e then d.

When e is added, its parent is not already on the list, so e is put at
the tail: head -> c -> b -> root -> e -> tail

Then, d is added at the head because its parent is already on the
list: head -> d -> c -> b -> root -> e -> tail

e is not placed at the right position and will be called last, whereas
it should be called at the beginning.

Because enqueueing follows the bottom-up sequence, we are sure to finish
by adding either a cfs_rq without a parent or a cfs_rq whose parent is
already on the list. We can use this event to detect when we have
finished adding a new branch. For the others, whose parents are not yet
added, we have to ensure that they will be added after the children that
were inserted in the steps before, and after any potential parents that
are already in the list. The easiest way is to put the cfs_rq just after
the last inserted one and to keep track of it until the branch is fully
added.
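
In code, the insertion policy reads roughly as follows (a condensed
sketch; parent_on_list() and parent_pos() are hypothetical shorthands for
the cfs_rq->tg->parent->cfs_rq[cpu] lookups, and rq->tmp_alone_branch
tracks the last inserted cfs_rq of a partially added branch):

	if (parent_on_list(cfs_rq)) {
		/* Insert right before the parent; children came earlier. */
		list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list, parent_pos(cfs_rq));
		rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;	/* branch done */
	} else if (!cfs_rq->tg->parent) {
		/* cfs_rq without a parent: the branch is complete. */
		list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list, &rq->leaf_cfs_rq_list);
		rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
	} else {
		/* Parent not added yet: keep the new branch together. */
		list_add_rcu(&cfs_rq->leaf_cfs_rq_list, rq->tmp_alone_branch);
		rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
	}
	cfs_rq->on_list = 1;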

Change-Id: I4fe0b8502ea628c13d14e8e5c5279bce67fb8845
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: kernellwp@gmail.com
Cc: pjt@google.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1478598827-32372-3-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 9c2791f936ef5fd04a118b5c284f2c9a95f4a647)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:54 -07:00
Vincent Guittot
723dab7871 BACKPORT: sched/fair: Factorize PELT update
Every time we modify the load/utilization of a sched_entity, we have to
sync it with its cfs_rq. This update is done in different ways:

 - when attaching/detaching a sched_entity, we update cfs_rq and then
   we sync the entity with the cfs_rq.

 - when enqueueing/dequeuing the sched_entity, we update both
   sched_entity and cfs_rq metrics to now.

Use update_load_avg() every time we have to update and sync the cfs_rq
and sched_entity before changing the state of a sched_entity.

Change-Id: Ibde9a7e07ac80e9d5753bb4a0c30dfb3643cc666
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: kernellwp@gmail.com
Cc: pjt@google.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1478598827-32372-4-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[backported FROMLIST]
Signed-off-by: Andres Oportus <andresoportus@google.com>
(cherry picked from commit d31b1a66cbe0931733583ad9d9e8c6cfd710907d)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Vincent Guittot
18d09a45ec UPSTREAM: sched/fair: Factorize attach/detach entity
Factorize post_init_entity_util_avg() and part of attach_task_cfs_rq()
in one function attach_entity_cfs_rq().

Create symmetric detach_entity_cfs_rq() function.

Change-Id: I44fc6bb5e71460be65f6b8928d4620c6c27a6a67
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: kernellwp@gmail.com
Cc: pjt@google.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1478598827-32372-2-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit df217913e72ec7e603d8b68cc4c70646cf7000db)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Peter Zijlstra
f9bef52c85 UPSTREAM: sched/fair: Improve PELT stuff some more
Vincent noted that the update_tg_load_avg() usage in commit:

  3d30544f0212 ("sched/fair: Apply more PELT fixes")

isn't entirely sufficient. We need to call this function every time
cfs_rq->avg.load changes, this includes when update_cfs_rq_load_avg()
returns true, but {attach,detach}_entity_load_avg() themselves also
change it. This means we need to unconditionally call
update_tg_load_avg().

Also, add more comments.

Change-Id: I7e55fceb587601f73c760c8b0d47a7ef2b777b9e
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 7c3edd2c300b7ef2005a69dc727692ee07434aa5)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Peter Zijlstra
dc1386b6f7 UPSTREAM: sched/fair: Apply more PELT fixes
One additional 'rule' for using update_cfs_rq_load_avg() is that one
should call update_tg_load_avg() if it returns true.

Add a bunch of comments to hopefully clarify some of the rules:

 o  You need to update the cfs_rq _before_ any entity attach/detach.
    This is important because, while for mathematical consistency this
    isn't strictly needed, it is required for the physical
    interpretation of the model: you attach/detach _now_.

 o  When you modify the cfs_rq avg, you have to then call
    update_tg_load_avg() in order to propagate changes upwards.

 o  (Fair) entities are always attached, switched_{to,from}_fair()
    deal with !fair. This directly follows from the definition of the
    cfs_rq averages, namely that they are a direct sum of all
    (runnable or blocked) entities on that rq.

It is the second rule that this patch enforces, but it adds comments
pertaining to all of them.

Change-Id: Icdc906e98c67b84cb9582c893bc761a9886be57a
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3d30544f02120b884bba2a9466c87dba980e3be5)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Peter Zijlstra
3fd734a8f9 UPSTREAM: sched/fair: Fix post_init_entity_util_avg() serialization
Chris Wilson reported a divide by 0 at:

 post_init_entity_util_avg():

 >    725	if (cfs_rq->avg.util_avg != 0) {
 >    726		sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
 > -> 727		sa->util_avg /= (cfs_rq->avg.load_avg + 1);
 >    728
 >    729		if (sa->util_avg > cap)
 >    730			sa->util_avg = cap;
 >    731	} else {

Which, given the lack of serialization and the code generated from
update_cfs_rq_load_avg(), is entirely possible:

	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
		s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
		sa->load_avg = max_t(long, sa->load_avg - r, 0);
		sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
		removed_load = 1;
	}

turns into:

  ffffffff81087064:       49 8b 85 98 00 00 00    mov    0x98(%r13),%rax
  ffffffff8108706b:       48 85 c0                test   %rax,%rax
  ffffffff8108706e:       74 40                   je     ffffffff810870b0
  ffffffff81087070:       4c 89 f8                mov    %r15,%rax
  ffffffff81087073:       49 87 85 98 00 00 00    xchg   %rax,0x98(%r13)
  ffffffff8108707a:       49 29 45 70             sub    %rax,0x70(%r13)
  ffffffff8108707e:       4c 89 f9                mov    %r15,%rcx
  ffffffff81087081:       bb 01 00 00 00          mov    $0x1,%ebx
  ffffffff81087086:       49 83 7d 70 00          cmpq   $0x0,0x70(%r13)
  ffffffff8108708b:       49 0f 49 4d 70          cmovns 0x70(%r13),%rcx

Which you'll note ends up with 'sa->load_avg - r' in memory at
ffffffff8108707a.

By calling post_init_entity_util_avg() under rq->lock we're sure to be
fully serialized against PELT updates and cannot observe intermediate
state like this.
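
The fix, roughly (a sketch; the rq-locking helpers differ between kernel
versions):

	void wake_up_new_task(struct task_struct *p)
	{
		struct rq *rq;

		/* ... cpu selection ... */
		rq = __task_rq_lock(p);
		/* Now serialized against PELT updates by rq->lock. */
		post_init_entity_util_avg(&p->se);
		activate_task(rq, p, 0);
		/* ... */
	}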

Change-Id: I56c11886102b7859df82e26c88b1b7c200a39f6e
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yuyang Du <yuyang.du@intel.com>
Cc: bsegall@google.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: steve.muckle@linaro.org
Fixes: 2b8c41daba32 ("sched/fair: Initiate a new task's util avg to a bounded value")
Link: http://lkml.kernel.org/r/20160609130750.GQ30909@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit b7fa30c9cc48c4f55663420472505d3b4f6e1705)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Yuyang Du
9de438d27c BACKPORT: sched/fair: Initiate a new task's util avg to a bounded value
A new task's util_avg is set to full utilization of a CPU (100% time
running). This accelerates a new task's utilization ramp-up, which is
useful for boosting its execution early on. However, it may result in
(insanely) high utilization for a transient time period when a flood
of tasks are spawned. Importantly, it violates the "fundamentally
bounded" CPU utilization, and its side effect is negative if we don't
take any measure to bound it.

This patch proposes an algorithm to address this issue. It has
two methods to approach a sensible initial util_avg:

(1) An expected (or average) util_avg based on its cfs_rq's util_avg:

  util_avg = cfs_rq->util_avg / (cfs_rq->load_avg + 1) * se.load.weight

(2) A trajectory of how successive new tasks' util develops, which
gives 1/2 of the remaining utilization budget to a new task such that
the additional util is noticeably large (when overall util is low) or
unnoticeably small (when overall util is high enough). In the meantime,
the aggregate utilization is well bounded:

  util_avg_cap = (1024 - cfs_rq->avg.util_avg) / 2^n

where n denotes the nth task.

If util_avg is larger than util_avg_cap, then the effective util is
clamped to the util_avg_cap.
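
Putting both methods together, the initialization reads roughly as
follows (a sketch following the cherry-picked commit; the successive
halving in (2) falls out naturally because each new task raises
cfs_rq->avg.util_avg before the next one is initialized):

	void post_init_entity_util_avg(struct sched_entity *se)
	{
		struct cfs_rq *cfs_rq = cfs_rq_of(se);
		struct sched_avg *sa = &se->avg;
		long cap = (long)(SCHED_CAPACITY_SCALE - cfs_rq->avg.util_avg) / 2;

		if (cap > 0) {
			if (cfs_rq->avg.util_avg != 0) {
				/* Method (1): expected util based on the cfs_rq. */
				sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
				sa->util_avg /= (cfs_rq->avg.load_avg + 1);

				/* Clamp with method (2)'s budget. */
				if (sa->util_avg > cap)
					sa->util_avg = cap;
			} else {
				sa->util_avg = cap;
			}
			sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
		}
	}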

Change-Id: Idafe989b24d9e70911666f09800bf1d5a011e1f4
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: steve.muckle@linaro.org
Link: http://lkml.kernel.org/r/1459283456-21682-1-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 2b8c41daba327c633228169e8bd8ec067ab443f8)
[integrate with schedfreq - schedfreq has a tuneable for init task util
 but this commit removes the use of the tuneable since we have a new
 algorithm for calculating an initial utilisation. I've left the tuneable
 in place, but it is no longer used even when schedfreq is the CPUFreq
 governor]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
4e18c8a10d sched/fair: Simplify idle_idx handling in select_idle_sibling()
Rename best_idle to best_idle_cpu so the same name is used like in
find_best_target().

Fix if (best_idle > 0) since best_idle_cpu = 0 is a valid target.

Use 'unsigned long' data type for best_idle_capacity.

Since we're looking for the shallowest best_idle_cstate, initialize
best_idle_cstate = INT_MAX. For cpus which are not idle (idle_idx = -1)
the condition 'if (idle_idx < best_idle_cstate && ...)' is never
executed.

Change-Id: Ic5b63d58478696b3d1ec6253cf739a69a574cf99
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit 8bff5e9c0968108d465e1f2a4624fc5ec2f00849)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
b31ae71ef7 sched/fair: refactor find_best_target() for simplicity
Simplify backup_capacity handling and use 'unsigned long'
data type for cpu capacity, simplify target_util handling,
simplify idle_idx handling & refactor min_util, new_util.

Also return first idle cpu for prefer_idle task immediately.

Change-Id: Ic89e140f7b369f3965703fdc8463013d16e9b94a
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
d3f5e8c3e9 sched/fair: Change cpu iteration order in find_best_target()
The schedtune task parameter 'boosted' is mapped into the cpu iteration
order. Currently, for 'boosted' equal to true the iteration starts at the
last cpu (NR_CPUS-1), whereas for 'boosted' equal to false it starts at
the first cpu (0).

This only has the desired effect if the cpu topology ordering matches
the underlying assumption. This e.g. is the case for the
Qc snapdragon 821 with its [L0 L1 b0 b1] cpu topology layout
(L=lower max freq, b=higher max freq). This results in cpus with higher
maximum capacity being given the highest logical cpu ids. However not
all big.LITTLE systems enumerate their cpus in the same way. For example,
the ARM Versatile Express Juno board has 6 cpus for which the default
configuration has topology [L0 b0 b1 L1 L2 L3].

To make this approach independent from the cpu topology layout it now
iterates over the cpus in the order of the sched_groups of the EAS
sched_domain (sd_ea). The order of cpu iteration is different for the
different cpu types in case the cpu is used to dereference sd_ea.

Considering the Qc snapdragon 821 again, for cpu L0 and L1 the order is
'L0->L1->b0->b1' whereas for b0 and b1 the order is 'b0->b1->L0->L1'.

This approach does not allow the exact same iteration order as with the
currently used flat iteration over [0 .. NR_CPUS-1] but the cpus
are ordered by the original cpu capacity.

The cpu iteration is now done in the sd_ea sched_group order required by
the 'boosted' value ['L0->L1->b0->b1'/'b0->b1->L0->L1'] rather than
forward/backward over the flat cpu space ['L0->L1->b0->b1'/
'b1->b0->L1->L0'].

Change-Id: I8fbe2073dedd2ecb1c750620c6000c11a5ff4358
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit a0c6a4272c3968c0ff50d3fed65f5865b72d777b)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
633b98b651 sched/core: Add first cpu w/ max/min orig capacity to root domain
This will allow the wakeup path to start iterating from a cpu with max
or min original capacity, regardless of the cpu the scheduler is
currently running on (smp_processor_id()) or the previous cpu of the
task (task_cpu(p)).
all cpus in the order of the sched_groups of this sched_domain seen by
the starting cpu.

In an SMP system, the first cpu with max orig capacity and the one with
min orig capacity are the same. This can temporarily happen on a
big.LITTLE system with hotplug as well.

E.g. the different order of cpu iteration can be used to map schedtune
task parameter 'boosted' into the cpu iteration order in
find_best_target().

Use of READ_ONCE()/WRITE_ONCE() to avoid load/store tearing.

Change-Id: I812fbd9c7e5f506617e456c0eec3edcd2c016e92
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit fd6e9543c1fd8971a5e2e68e39b2f6e591d46114)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
3e44a647c0 sched/core: Remove remnants of commit fd5c98da1a42
Commit fd5c98da1a42 "WIP: sched: Store system-wide maximum cpu capacity
in root domain" was repalced by commit 8148bdfff4f5 "WIP: sched: Update
max cpu capacity in case of max frequency constraints" which didn't
remove all the now unused bits.

Change-Id: I067f6366431f43337cffa7a2a8e0de32dd33d2f9
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit 6d284a607cec51bcafca313bc396bc3103b1e876)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
242695407a sched: Remove sysctl_sched_is_big_little
With the new wakeup approach this sysctl is not necessary any more.

Change-Id: I52114b3c918791f6a4f9f30f50002919ccbc1a9c
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit 885c0d503bcdf0ef4e9b46822496f16b20aa3bbd)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
9e92e8a24f sched/fair: Code !is_big_little path into select_energy_cpu_brute()
This patch replaces the existing EAS upstream implementation of
select_energy_cpu_brute() with the one of find_best_target() used
in Android previously.

It also removes the cpumask 'and' from select_energy_cpu_brute,
see the existing use of 'cpu = smp_processor_id()' in
select_task_rq_fair().

Change-Id: If678c002efaa87d1ba3ec9989a4e9f8df98b83ec
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
[ added guarding for non-schedtune builds ]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
f6f9314893 EAS: sched/fair: Re-integrate 'honor sync wakeups' into wakeup path
This patch re-integrates the part which was initially provided by
3b9d7554aeec ("EAS: sched/fair: tunable to honor sync wakeups") into
energy_aware_wake_cpu() into select_energy_cpu_brute().

Change-Id: I748fde3ecdeb44651179bce0a5bb8dd82d1903f6
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit b75b7286cb068d5761621ea134c23dd131db953f)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:53 -07:00
Dietmar Eggemann
81bd5ed393 Fixup!: sched/fair.c: Set SchedTune specific struct energy_env.task
This has to be done in the caller of the SchedTune version of
energy_diff() to avoid a NULL pointer dereference in energy_diff().

Change-Id: I3f0f68dbd11efb15bbb3b1832f8294419ed85241
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit 14531d4e245d063f713ee5ed835df958e6c7838f)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
3935105f57 sched/fair: Energy-aware wake-up task placement
When the system is not overutilized, place waking tasks on the most
energy efficient cpu. Previous attempts reduced the search space by
matching task utilization to cpu capacity before consulting the energy
model as this is an expensive operation. The search heuristics didn't
work very well and, lacking any better alternatives, this patch takes the
brute-force route and tries all potential targets.

This approach doesn't scale, but it might be sufficient for many
embedded applications while work is continuing on a heuristic that can
minimize the necessary computations. The heuristic must be derrived from
the platform energy model rather than make additional assumptions, such
lower capacity implies better energy efficiency. PeterZ mentioned in the
past that we might be able to derrive some simpler deciding functions
using mathematical (modal?) analysis.

Change-Id: I772bacb4c8fd599f8006fa422f842e66377a9c6c
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[rebase: on top of msm-google/android-msm-marlin-3.18]
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit a894422dbdb7b77ea2acfe7ff909ccb5ded23514)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
02cbde61f4 sched/fair: Add energy_diff dead-zone margin
It is not worth the overhead to migrate tasks for tiny insignificant
energy savings. To prevent this, an energy margin is introduced in
energy_diff() which effectively adds a dead-zone that rounds tiny energy
differences to zero. Since no scale is enforced for energy model data
the margin can't be absolute. Instead it is defined as +/-1.56% energy
saving compared to the current total estimated energy consumption.
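
Since 1/64 is 1.5625%, the margin can be computed with a plain shift;
roughly (a sketch, the eenv field names are an assumption):

	/* Dead-zone: round tiny energy deltas to zero. */
	margin = eenv->nrg.before >> 6;	/* ~1.56% of the current total energy */

	if (eenv->nrg.diff < margin && eenv->nrg.diff > -margin)
		eenv->nrg.diff = 0;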

Change-Id: I6be069c752c701fb825430896b3b768a7ab2fee4
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[rebase: on top of msm-google/android-msm-marlin-3.18,
         massage original patch which changes code in energy_diff()
	 into __energy_diff() introduced by SchedTune]
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
(cherry picked from commit 780cb5a5fa47adf13d4fc2b77e8e94448cd56098)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Dietmar Eggemann
3b6ba235bc sched/fair: Decommission energy_aware_wake_cpu()
The EAS functionality in the wakeup path will be brought back by the
following patch ("sched/fair: Energy-aware wake-up task placement")
providing the function select_energy_cpu_brute().

Change-Id: I927fb9e8261cfacfe404695f853941c7959aa146
[ Trivial merge conflicts resolved. ]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit 80aee424fb7765a777267e144037642625a71304)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Dietmar Eggemann
168228463c sched/fair: Do not force want_affine eq. true if EAS is enabled
This lets us use Capacity-Aware Scheduling (CAS) if EAS is enabled.

Change-Id: I2e647a201ea0b733d1487c3e153047a49fb22847
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
(cherry picked from commit 00b7da2ae58bf568529e67614980f77e275b8d29)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
c6cc7ca915 UPSTREAM: sched/fair: Fix incorrect comment for capacity_margin
The comment for capacity_margin introduced in:

  3273163c6775 ("sched/fair: Let asymmetric CPU configurations balance at wake-up")

... got its usage the wrong way round - fix it.

Change-Id: Ie46eac3e5ff43397b5bed61d0999d2817f1a1d96
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-7-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 893c5d2279041afeb593f1fa8edd9d02edf5b7cb)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
adc7f08b2f UPSTREAM: sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups
For asymmetric CPU capacity systems it is counter-productive for
throughput if low capacity CPUs are pulling tasks from non-overloaded
CPUs with higher capacity. The assumption is that higher CPU capacity is
preferred over running alone in a group with lower CPU capacity.

This patch rejects higher CPU capacity groups with one or fewer tasks per
CPU as a potential busiest group, which could otherwise lead to a series
of failing load-balancing attempts ending in a force-migration.
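
The check added to update_sd_pick_busiest() is roughly (a sketch
following the upstream commit):

	/*
	 * Candidate sg has no more than one task per CPU and has higher
	 * per-CPU capacity: no reason to pull tasks to less capable CPUs.
	 */
	if (sgs->sum_nr_running <= sgs->group_weight &&
	    group_smaller_cpu_capacity(sds->local, sg))
		return false;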

Change-Id: I428875bb6267c780026ef75e2882300738d016e7
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-5-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 9e0994c0a1c1f82c705f1f66388e1bcffcee8bb9)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
60cc9f4e1e UPSTREAM: sched/fair: Add per-CPU min capacity to sched_group_capacity
struct sched_group_capacity currently represents the compute capacity
sum of all CPUs in the sched_group.

Unless it is divided by the group_weight to get the average capacity
per CPU, it hides differences in CPU capacity for mixed capacity systems
(e.g. high RT/IRQ utilization or ARM big.LITTLE).

But even the average may not be sufficient if the group covers CPUs of
different capacities.

Instead, by extending struct sched_group_capacity to indicate the min
per-CPU capacity in the group, a suitable group for a given task
utilization can more easily be found, such that CPUs with reduced
capacity can be avoided for tasks with high utilization (not implemented
by this patch).
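
Roughly, the extension and its aggregation look like this (a sketch
following the cherry-picked commit):

	struct sched_group_capacity {
		atomic_t ref;
		unsigned long capacity;		/* capacity sum of all CPUs in the group */
		unsigned long min_capacity;	/* min per-CPU capacity in the group */
		/* ... */
	};

	/* In update_group_capacity(), aggregating over child groups: */
	capacity = 0;
	min_capacity = ULONG_MAX;
	do {
		struct sched_group_capacity *sgc = group->sgc;

		capacity += sgc->capacity;
		min_capacity = min(sgc->min_capacity, min_capacity);
		group = group->next;
	} while (group != child->groups);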

Change-Id: If3cae1be62d01a199e752bca5abb45357d5d0fbd
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-4-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit bf475ce0a3dd75b5d1df6c6c14ae25168caa15ac)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
f3f132b8e5 UPSTREAM: sched/fair: Consider spare capacity in find_idlest_group()
In low-utilization scenarios, comparing relative loads in
find_idlest_group() doesn't always lead to the optimal choice.
Systems with groups containing different numbers of cpus and/or cpus of
different compute capacity are significantly better off when considering
spare capacity rather than relative load in those scenarios.

In addition to the existing load-based search, an alternative
spare-capacity-based candidate sched_group is found and selected instead
if sufficient spare capacity exists. If not, existing behaviour is
preserved.
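
The spare-capacity estimate used for the alternative candidate is
roughly (a sketch following the cherry-picked commit):

	static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
	{
		return max_t(long, capacity_of(cpu) - cpu_util_wake(cpu, p), 0);
	}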

Change-Id: I6097af76c302a5a12e240ca24c70f707ad118242
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-3-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 6a0b19c0f39a7a7b7fb77d3867a733136ff059a3)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
68c27298cd UPSTREAM: sched/fair: Compute task/cpu utilization at wake-up correctly
At task wake-up load-tracking isn't updated until the task is enqueued.
The task's own view of its utilization contribution may therefore not be
aligned with its contribution to the cfs_rq load-tracking which may have
been updated in the meantime. Basically, the task's own utilization
hasn't yet accounted for the sleep decay, while the cfs_rq may have
(partially). Estimating the cfs_rq utilization in case the task is
migrated at wake-up as task_rq(p)->cfs.avg.util_avg - p->se.avg.util_avg
is therefore incorrect as the two load-tracking signals aren't time
synchronized (different last update).

To solve this problem, this patch synchronizes the task utilization with
its previous rq before the task utilization is used in the wake-up path.
Currently the update/synchronization is done _after_ the task has been
placed by select_task_rq_fair(). The synchronization is done without
having to take the rq lock using the existing mechanism used in
remove_entity_load_avg().
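
The lockless sync looks roughly like this (a sketch following the
cherry-picked commit):

	static void sync_entity_load_avg(struct sched_entity *se)
	{
		struct cfs_rq *cfs_rq = cfs_rq_of(se);
		u64 last_update_time = cfs_rq_last_update_time(cfs_rq);

		/* Decay the task's own signal up to the cfs_rq's timestamp. */
		__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)),
				  &se->avg, 0, 0, NULL);
	}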

Change-Id: I5605cca0c94c6ba43d9ce11554765a2456cf85bc
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-2-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 104cb16d9eb684f071d5bf3aa87c0d01af259b7c)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
abfff9dece UPSTREAM: sched/fair: Let asymmetric CPU configurations balance at wake-up
Currently, SD_WAKE_AFFINE always takes priority over wakeup balancing if
SD_BALANCE_WAKE is set on the sched_domains. For asymmetric
configurations SD_WAKE_AFFINE is only desirable if the waking task's
compute demand (utilization) is suitable for the waking CPU and the
previous CPU, and all CPUs within their respective
SD_SHARE_PKG_RESOURCES domains (sd_llc). If not, let wakeup balancing
take over (find_idlest_{group, cpu}()).

This patch makes affine wake-ups conditional on whether both the waker
CPU and the previous CPU have sufficient capacity for the waking task,
assuming that the CPU capacities within an SD_SHARE_PKG_RESOURCES
domain (sd_llc) are homogeneous.
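
The capacity check gating the affine path is roughly (a sketch following
the cherry-picked commit):

	static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
	{
		long min_cap, max_cap;

		min_cap = min(capacity_orig_of(prev_cpu), capacity_orig_of(cpu));
		max_cap = cpu_rq(cpu)->rd->max_cpu_capacity;

		/* Minimum capacity is close to max, no need to abort wake_affine. */
		if (max_cap - min_cap < max_cap >> 3)
			return 0;

		return min_cap * 1024 < task_util(p) * capacity_margin;
	}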

Change-Id: I6d5d0426713da9ef6198f574ad9afbe58dacc1f0
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1469453670-2660-10-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3273163c6775c4c21823985304c2364b08ca6ea2)
[removed existing definition of capacity_margin]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
5173a85e22 UPSTREAM: sched/core: Enable SD_BALANCE_WAKE for asymmetric capacity systems
A domain with the SD_ASYM_CPUCAPACITY flag set indicates that
sched_groups at this level and below do not include CPUs of all
capacities available (e.g. groups containing little-only or big-only CPUs
in big.LITTLE systems). It is therefore necessary to put more effort
into finding an appropriate CPU at task wake-up, by enabling balancing at
wake-up (SD_BALANCE_WAKE) on all lower (child) levels.
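
In sd_init() this becomes roughly (a sketch following the cherry-picked
commit, relying on the child pointer now being set there):

	if (sd->flags & SD_ASYM_CPUCAPACITY) {
		struct sched_domain *t = sd;

		for_each_lower_domain(t)
			t->flags |= SD_BALANCE_WAKE;
	}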

Change-Id: I4615917f540d03d7e7ef7de8f0da33b1ad97387c
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1469453670-2660-8-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 9ee1cda5ee25c7dd82acf25892e0d229e818f8c7)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
e51c057694 UPSTREAM: sched/core: Pass child domain into sd_init()
If behavioural sched_domain flags depend on topology flags set at higher
domain levels we need a way to update the child domain flags. Moving the
child pointer assignment inside sd_init() should make that possible.

Change-Id: If043921fdf102c310adcc9e0280afa33c48c4783
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1469453670-2660-7-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3676b13e8524c576825fe1e731e347dba0083888)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
68a3b157d9 UPSTREAM: sched/core: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag
Add a topology flag to the sched_domain hierarchy indicating the lowest
domain level where the full range of CPU capacities is represented by
the domain members for asymmetric capacity topologies (e.g. ARM
big.LITTLE).

The flag is intended to indicate that extra care should be taken when
placing tasks on CPUs, and that this level spans all the different types
of CPUs found in the system (no need to look further up the domain
hierarchy). This information is currently available only by iterating
through the capacities of all the CPUs at parent levels in the
sched_domain hierarchy.

  SD 2      [  0      1      2      3]  SD_ASYM_CPUCAPACITY

  SD 1      [  0      1] [   2      3]  !SD_ASYM_CPUCAPACITY

  CPU:         0      1      2      3
  capacity:  756    756   1024   1024

If the topology in the example above is duplicated to create an eight
CPU example with third sched_domain level on top (SD 3), this level
should not have the flag set (!SD_ASYM_CPUCAPACITY), as its two groups
would both have all CPU capacities represented within them.

Change-Id: I1526407b90567cac387419719b7d7fdc8b259a85
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1469453670-2660-6-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 1f6e6c7cb9bcd58abb5ee11243e0eefe6b36fc8e)
[trivial merge conflict]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:52 -07:00
Morten Rasmussen
3e9cdd5ae9 UPSTREAM: sched/core: Remove unnecessary NULL-pointer check
Checking if the sched_domain pointer returned by sd_init() is NULL seems
pointless, as sd_init() neither checks if it is valid to begin with nor
sets it to NULL.

Change-Id: I5e16fd0c2ca7234b097be7c95409ddb15c5e9de9
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1469453670-2660-5-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 0e6d2a67a41321b3ef650b780a279a37855de08e)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:51 -07:00
Morten Rasmussen
bc7c939b3a UPSTREAM: sched/fair: Optimize find_idlest_cpu() when there is no choice
In the current find_idlest_group()/find_idlest_cpu() search we end up
calling find_idlest_cpu() in a sched_group containing only one CPU in
the end. Checking idle-states becomes pointless when there is no
alternative, so bail out instead.
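
The bail-out is roughly (a sketch following the cherry-picked commit):

	/* In find_idlest_cpu(): check if we have any choice at all. */
	if (group->group_weight == 1)
		return cpumask_first(sched_group_cpus(group));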

Change-Id: Ic62bf09b53a7984143ac2431aaa69c69b204cd56
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: linux-kernel@vger.kernel.org
Cc: mgalbraith@suse.de
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1466615004-3503-4-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit eaecf41f5abf80b63c8e025fcb9ee4aa203c3038)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:51 -07:00
Morten Rasmussen
3bb3d7e7d9 BACKPORT: sched/fair: Make the use of prev_cpu consistent in the wakeup path
In commit:

  ac66f54772 ("sched/numa: Introduce migrate_swap()")

select_task_rq() got a 'cpu' argument to enable overriding of prev_cpu
in special cases (NUMA task swapping).

However, the select_task_rq_fair() helper functions: wake_affine() and
select_idle_sibling(), still use task_cpu(p) directly to work out
prev_cpu, which leads to inconsistencies.

This patch passes prev_cpu (potentially overridden by NUMA code) into
the helper functions to ensure prev_cpu is indeed the same CPU
everywhere in the wakeup path.

Change-Id: I4951c4eead2e6045e4fb34e89f6cda17d881d4d7
cc: Ingo Molnar <mingo@redhat.com>
cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: linux-kernel@vger.kernel.org
Cc: mgalbraith@suse.de
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1466615004-3503-3-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 772bd008cd9a1d4e8ce566f2edcc61d1c28fcbe5)
[merged with Android/EAS wakeup path]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:51 -07:00
Dietmar Eggemann
cb88574a68 Partial Revert: "WIP: sched: Add cpu capacity awareness to wakeup balancing"
Revert the changes in find_idlest_cpu() and find_idlest_group().

Keep the infrastructure bits which are used in following EAS patches.

Change-Id: Id516ca5f3e51b9a13db1ebb8de2df3aa25f9679b
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
2017-06-02 08:01:51 -07:00
Dietmar Eggemann
bd6ff3505f Revert "WIP: sched: Consider spare cpu capacity at task wake-up"
This reverts commit 75a9695b619741019363f889c99c97c7bb823797.

Change-Id: I846b21f2bdeb0b0ca30ad65683564ed07a429428
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
[ minor merge changes ]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:51 -07:00
Viresh Kumar
5c015afebd FROM-LIST: cpufreq: schedutil: Redefine the rate_limit_us tunable
The rate_limit_us tunable is intended to reduce the possible overhead
from running the schedutil governor.  However, that overhead can be
divided into two separate parts: the governor computations and the
invocation of the scaling driver to set the CPU frequency.  The latter
is where the real overhead comes from.  The former is much less
expensive in terms of execution time, and running it every time the
governor callback is invoked by the scheduler after the rate_limit_us
interval has passed since the last frequency update would not be a
problem.

For this reason, redefine the rate_limit_us tunable so that it means the
minimum time that has to pass between two consecutive invocations of the
scaling driver by the schedutil governor (to set the CPU frequency).
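
Roughly, the commit path then only rate-limits actual driver invocations
(a sketch following the referenced patch; surrounding details elided):

	static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
					unsigned int next_freq)
	{
		/* Governor ran, but the frequency is unchanged: no driver call. */
		if (sg_policy->next_freq == next_freq)
			return;

		sg_policy->next_freq = next_freq;
		/* Stamped only here, so rate_limit_us throttles driver calls. */
		sg_policy->last_freq_update_time = time;

		/* ... fast switch, or kick the slow-path worker ... */
	}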

Change-Id: Iced64116b826c25441ef537c27a3dabfcf81919e
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[pulled from linux-pm linux-next https://patchwork.kernel.org/patch/9583949/ ]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:51 -07:00
Steve Muckle
51b20b214f cpufreq: schedutil: add up/down frequency transition rate limits
The rate-limit tunable in the schedutil governor applies to transitions
to both lower and higher frequencies. On several platforms it is not the
ideal tunable though, as it is difficult to get best power/performance
figures using the same limit in both directions.

It is common on mobile platforms with demanding user interfaces to want
to increase frequency rapidly for example but decrease slowly.

One of the example can be a case where we have short busy periods
followed by similar or longer idle periods. If we keep the rate-limit
high enough, we will not go to higher frequencies soon enough. On the
other hand, if we keep it too low, we will have too many frequency
transitions, as we will always reduce the frequency after the busy
period.

It would be very useful if we can set low rate-limit while increasing
the frequency (so that we can respond to the short busy periods quickly)
and high rate-limit while decreasing frequency (so that we don't reduce
the frequency immediately after the short busy period and that may avoid
frequency transitions before the next busy period).

Implement separate up/down transition rate limits. Note that the
governor avoids frequency recalculations for a period equal to minimum
of up and down rate-limit. A global mutex is also defined to protect
updates to min_rate_limit_us via two separate sysfs files.
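
The direction-aware check is roughly (a sketch following this patch):

	static bool sugov_up_down_rate_limit(struct sugov_policy *sg_policy,
					     u64 time, unsigned int next_freq)
	{
		s64 delta_ns = time - sg_policy->last_freq_update_time;

		/* Raising the frequency: apply the (short) up limit. */
		if (next_freq > sg_policy->next_freq &&
		    delta_ns < sg_policy->up_rate_delay_ns)
			return true;

		/* Lowering the frequency: apply the (long) down limit. */
		if (next_freq < sg_policy->next_freq &&
		    delta_ns < sg_policy->down_rate_delay_ns)
			return true;

		return false;
	}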

Note that this wouldn't change behavior of the schedutil governor for
the platforms which wish to keep same values for both up and down rate
limits.

This is tested with the rt-app [1] on ARM Exynos, dual A15 processor
platform.

Testcase: Run a SCHED_OTHER thread on CPU0 which will emulate work-load
for X ms of busy period out of the total period of Y ms, i.e. Y - X ms
of idle period. The values of X/Y taken were: 20/40, 20/50, 20/70, i.e
idle periods of 20, 30 and 50 ms respectively. These were tested against
values of up/down rate limits as: 10/10 ms and 10/40 ms.

For every test we noticed a performance increase of 5-10% with the
schedutil governor, which was very much expected.

[Viresh]: Simplified user interface and introduced min_rate_limit_us +
	      mutex, rewrote commit log and included test results.

[1] https://github.com/scheduler-tools/rt-app/

Change-Id: I18720a83855b196b8e21dcdc8deae79131635b84
Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
(applied from https://marc.info/?l=linux-kernel&m=147936011103832&w=2)
[trivial adaptations]
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2017-06-02 08:01:51 -07:00
Juri Lelli
f71d9f01c6 sched/cpufreq: make schedutil use WALT signal
If WALT is available and enabled, make schedutil governor use its
utilization signal.

Change-Id: I92bc37989447a76616e9bcc4e9e8616774fb9925
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[we need to use boosted_cpu_util for schedutil, so make it
 not static]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:51 -07:00
Steve Muckle
e5da6c11b2 sched: cpufreq: use rt_avg as estimate of required RT CPU capacity
A policy of going to fmax on any RT activity will be detrimental
to power on many platforms. Often RT accounts for only a small amount
of CPU activity so sending the CPU frequency to fmax is overkill. Worse
still, some platforms may not be able to even complete the CPU frequency
change before the RT activity has already completed.

Cpufreq governors have not treated RT activity this way in the past so
it is not part of the expected semantics of the RT scheduling class. The
DL class offers guarantees about task completion and could be used for
this purpose.

Modify the schedutil algorithm to instead use rt_avg as an estimate of
RT utilization of the CPU.

Based on previous work by Vincent Guittot <vincent.guittot@linaro.org>.

Change-Id: I1ed605a3e2512a94d34217a8e57c3fd97cca60be
Signed-off-by: Steve Muckle <smuckle@linaro.org>
2017-06-02 08:01:51 -07:00
Viresh Kumar
e2aa75a4c7 cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
If slow path frequency changes are conducted in a SCHED_OTHER context
then they may be delayed for some amount of time, including
indefinitely, when real time or deadline activity is taking place.

Move the slow path to a real time kernel thread. In the future the
thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set
to 50 for now.
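
A sketch of the thread setup described above (error handling trimmed,
thread name illustrative):

	#include <linux/err.h>
	#include <linux/kthread.h>
	#include <linux/sched.h>

	static struct task_struct *sugov_kthread_create(int (*threadfn)(void *),
							void *data)
	{
		struct sched_param param = { .sched_priority = 50 };
		struct task_struct *thread;

		thread = kthread_create(threadfn, data, "sugov");
		if (IS_ERR(thread))
			return thread;

		/* SCHED_FIFO so slow-path frequency changes cannot be
		 * starved by RT activity the way a SCHED_OTHER worker
		 * can be. */
		sched_setscheduler_nocheck(thread, SCHED_FIFO, &param);
		wake_up_process(thread);
		return thread;
	}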

Hackbench results on ARM Exynos, dual core A15 platform for 10
iterations:

$ hackbench -s 100 -l 100 -g 10 -f 20

Before			After
---------------------------------
1.808			1.603
1.847			1.251
2.229			1.590
1.952			1.600
1.947			1.257
1.925			1.627
2.694			1.620
1.258			1.621
1.919			1.632
1.250			1.240

Average:

1.8829			1.5041

Based on initial work by Steve Muckle.

Change-Id: I8f53037e94f353960c6d10abf07822d671631ef7
Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from 02a7b1ee3baa)
[adapt to the 3.18 kthread interface]
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2017-06-02 08:01:51 -07:00
Steve Muckle
ca7b7d3c99 sched/cpufreq: fix tunables for schedfreq governor
The schedfreq governor does not currently handle cpufreq drivers which
use a global set of tunables (!have_governor_per_policy).

For example, on x86 with the acpi-cpufreq driver, doing this

  cat /sys/devices/system/cpu/cpufreq/sched/up_throttle_nsec

will result in a bad pointer access.

Update the tunable code using the upstream schedutil tunable code by
Rafael Wysocki as a guide.

Includes a partial backport of the reorganized cpufreq tunable
infrastructure.
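
The gist of the fix mirrors upstream's get_governor_parent_kobj()
(a sketch, not the full backport):

	/* With global tunables the attributes must hang off one global
	 * kobject shared by all policies; the old code dereferenced a
	 * per-policy pointer that is never set up in this mode. */
	static struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
	{
		if (have_governor_per_policy())
			return &policy->kobj;
		return cpufreq_global_kobject;
	}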

Change-Id: I7e6f8de1dac297077ad43f37dd2f6ddbfe921c98
Signed-off-by: Steve Muckle <smuckle@linaro.org>
[fixed cherry-pick issue]
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[fixed cherry-pick issue]
Signed-off-by: Thierry Strudel <tstrudel@google.com>
2017-06-02 08:01:50 -07:00
Steve Muckle
6bc6115c16 BACKPORT: cpufreq: schedutil: New governor based on scheduler utilization data
Add a new cpufreq scaling governor, called "schedutil", that uses
scheduler-provided CPU utilization information as input for making
its decisions.

Doing that is possible after commit 34e2c55 (cpufreq: Add
mechanism for registering utilization update callbacks), which
introduced cpufreq_update_util(), called by the scheduler on
utilization changes (from CFS) and on RT/DL task status updates.
In particular, CPU frequency scaling decisions may be based on
the utilization data passed to cpufreq_update_util() by CFS.

The new governor is relatively simple.

The frequency selection formula used by it depends on whether or not
the utilization is frequency-invariant.  In the frequency-invariant
case the new CPU frequency is given by

	next_freq = 1.25 * max_freq * util / max

where util and max are the last two arguments of cpufreq_update_util().
In turn, if util is not frequency-invariant, the maximum frequency in
the above formula is replaced with the current frequency of the CPU:

	next_freq = 1.25 * curr_freq * util / max

The coefficient 1.25 corresponds to the frequency tipping point at
(util / max) = 0.8.
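
In integer arithmetic the formula collapses to a couple of lines, since
1.25 == 5/4 (a sketch consistent with the text above; ref_freq is
max_freq in the frequency-invariant case, curr_freq otherwise):

	static unsigned int get_next_freq(unsigned int ref_freq,
					  unsigned long util, unsigned long max)
	{
		/* next_freq = 1.25 * ref_freq * util / max */
		return (ref_freq + (ref_freq >> 2)) * util / max;
	}

At util/max = 0.8 this returns exactly ref_freq, which is the tipping
point mentioned above.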

All of the computations are carried out in the utilization update
handlers provided by the new governor.  One of those handlers is
used for cpufreq policies shared between multiple CPUs and the other
for policies with one CPU only (which therefore doesn't need any
extra synchronization).

The governor supports fast frequency switching if that is supported
by the cpufreq driver in use and possible for the given policy.
In the fast switching case, all operations of the governor take
place in its utilization update handlers.  If fast switching cannot
be used, the frequency switch operations are carried out with the
help of a work item which only calls __cpufreq_driver_target()
(under a mutex) to trigger a frequency update (to a value already
computed beforehand in one of the utilization update handlers).
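
Condensed into a sketch, the two paths look roughly like this (type and
locking details elided; names follow the description above):

	static void sugov_apply_freq(struct sugov_policy *sg_policy)
	{
		if (sg_policy->policy->fast_switch_enabled) {
			/* Fast path: switch right in the utilization
			 * update handler. */
			cpufreq_driver_fast_switch(sg_policy->policy,
						   sg_policy->next_freq);
			return;
		}

		/* Slow path: a work item applies the precomputed value
		 * in process context. */
		mutex_lock(&sg_policy->work_lock);
		__cpufreq_driver_target(sg_policy->policy,
					sg_policy->next_freq,
					CPUFREQ_RELATION_L);
		mutex_unlock(&sg_policy->work_lock);
	}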

Currently, the governor treats all of the RT and DL tasks as
"unknown utilization" and sets the frequency to the allowed
maximum when updated from the RT or DL sched classes.  That
heavy-handed approach should be replaced with something more
subtle and specifically targeted at RT and DL tasks.

The governor shares some tunables management code with the
"ondemand" and "conservative" governors and uses some common
definitions from cpufreq_governor.h, but apart from that it
is stand-alone.

Change-Id: I03876e622768e4b3ee4dc28682af7cce771f2f4c
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
(cherry-picked from 9bdcb44e391da5c41b98573bf0305a0e0b1c9569)
[ Backport the schedutil cpufreq governor from 4.9. Some cpufreq
  tunable infrastructure as well as the resolve_freq API is also
  backported as those are dependencies]
Signed-off-by: Steve Muckle <smuckle@linaro.org>
[trivial cherry-picking fixes]
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[fixed default governor machinery]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:50 -07:00
Steve Muckle
f02702dcf2 sched: backport cpufreq hooks from 4.9-rc4
The scheduler cpufreq hooks are required by the schedutil cpufreq
governor.

Change-Id: Ied6c46262bb33b7e81bbb3d3d2761124e0c676b7
Signed-off-by: Steve Muckle <smuckle@linaro.org>
[trivial cherry-picking fixes]
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2017-06-02 08:01:50 -07:00
Greg Kroah-Hartman
9bc462220d This is the 4.4.70 stable release

Merge 4.4.70 into android-4.4

Changes in 4.4.70
	usb: misc: legousbtower: Fix buffers on stack
	usb: misc: legousbtower: Fix memory leak
	USB: ene_usb6250: fix DMA to the stack
	watchdog: pcwd_usb: fix NULL-deref at probe
	char: lp: fix possible integer overflow in lp_setup()
	USB: core: replace %p with %pK
	ARM: tegra: paz00: Mark panel regulator as enabled on boot
	tpm_crb: check for bad response size
	infiniband: call ipv6 route lookup via the stub interface
	dm btree: fix for dm_btree_find_lowest_key()
	dm raid: select the Kconfig option CONFIG_MD_RAID0
	dm bufio: avoid a possible ABBA deadlock
	dm bufio: check new buffer allocation watermark every 30 seconds
	dm cache metadata: fail operations if fail_io mode has been established
	dm bufio: make the parameter "retain_bytes" unsigned long
	dm thin metadata: call precommit before saving the roots
	dm space map disk: fix some book keeping in the disk space map
	md: update slab_cache before releasing new stripes when stripes resizing
	rtlwifi: rtl8821ae: setup 8812ae RFE according to device type
	mwifiex: pcie: fix cmd_buf use-after-free in remove/reset
	ima: accept previously set IMA_NEW_FILE
	KVM: x86: Fix load damaged SSEx MXCSR register
	KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation
	regulator: tps65023: Fix inverted core enable logic.
	s390/kdump: Add final note
	s390/cputime: fix incorrect system time
	ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device
	ath9k_htc: fix NULL-deref at probe
	drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations.
	drm/amdgpu: Make display watermark calculations more accurate
	drm/nouveau/therm: remove ineffective workarounds for alarm bugs
	drm/nouveau/tmr: ack interrupt before processing alarms
	drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm
	drm/nouveau/tmr: avoid processing completed alarms when adding a new one
	drm/nouveau/tmr: handle races with hw when updating the next alarm time
	cdc-acm: fix possible invalid access when processing notification
	proc: Fix unbalanced hard link numbers
	of: fix sparse warning in of_pci_range_parser_one
	iio: dac: ad7303: fix channel description
	pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes
	pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes()
	USB: serial: ftdi_sio: fix setting latency for unprivileged users
	USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs
	ext4 crypto: don't let data integrity writebacks fail with ENOMEM
	ext4 crypto: fix some error handling
	net: qmi_wwan: Add SIMCom 7230E
	fscrypt: fix context consistency check when key(s) unavailable
	f2fs: check entire encrypted bigname when finding a dentry
	fscrypt: avoid collisions when presenting long encrypted filenames
	sched/fair: Do not announce throttled next buddy in dequeue_task_fair()
	sched/fair: Initialize throttle_count for new task-groups lazily
	usb: host: xhci-plat: propagate return value of platform_get_irq()
	xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
	usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
	net: irda: irda-usb: fix firmware name on big-endian hosts
	usbvision: fix NULL-deref at probe
	mceusb: fix NULL-deref at probe
	ttusb2: limit messages to buffer size
	usb: musb: tusb6010_omap: Do not reset the other direction's packet size
	USB: iowarrior: fix info ioctl on big-endian hosts
	usb: serial: option: add Telit ME910 support
	USB: serial: qcserial: add more Lenovo EM74xx device IDs
	USB: serial: mct_u232: fix big-endian baud-rate handling
	USB: serial: io_ti: fix div-by-zero in set_termios
	USB: hub: fix SS hub-descriptor handling
	USB: hub: fix non-SS hub-descriptor handling
	ipx: call ipxitf_put() in ioctl error path
	iio: proximity: as3935: fix as3935_write
	ceph: fix recursion between ceph_set_acl() and __ceph_setattr()
	gspca: konica: add missing endpoint sanity check
	s5p-mfc: Fix unbalanced call to clock management
	dib0700: fix NULL-deref at probe
	zr364xx: enforce minimum size when reading header
	dvb-frontends/cxd2841er: define symbol_rate_min/max in T/C fe-ops
	cx231xx-audio: fix init error path
	cx231xx-audio: fix NULL-deref at probe
	cx231xx-cards: fix NULL-deref at probe
	powerpc/book3s/mce: Move add_taint() later in virtual mode
	powerpc/pseries: Fix of_node_put() underflow during DLPAR remove
	powerpc/64e: Fix hang when debugging programs with relocated kernel
	ARM: dts: at91: sama5d3_xplained: fix ADC vref
	ARM: dts: at91: sama5d3_xplained: not all ADC channels are available
	arm64: xchg: hazard against entire exchange variable
	arm64: uaccess: ensure extension of access_ok() addr
	arm64: documentation: document tagged pointer stack constraints
	xc2028: Fix use-after-free bug properly
	mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp
	staging: rtl8192e: fix 2 byte alignment of register BSSIDR.
	staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD.
	iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings
	metag/uaccess: Fix access_ok()
	metag/uaccess: Check access_ok in strncpy_from_user
	uwb: fix device quirk on big-endian hosts
	genirq: Fix chained interrupt data ordering
	osf_wait4(): fix infoleak
	tracing/kprobes: Enforce kprobes teardown after testing
	PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms
	PCI: Freeze PME scan before suspending devices
	drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2
	nfsd: encoders mustn't use unitialized values in error cases
	drivers: char: mem: Check for address space wraparound with mmap()
	Linux 4.4.70

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2017-05-25 17:31:28 +02:00
Konstantin Khlebnikov
ada79b5ecd sched/fair: Initialize throttle_count for new task-groups lazily
commit 094f469172e00d6ab0a3130b0e01c83b3cf3a98d upstream.

A cgroup created inside a throttled group must inherit the current
throttle_count. A broken throttle_count allows throttled entries to be
nominated as the next buddy, which later leads to a NULL pointer
dereference in pick_next_task_fair().

This patch initializes cfs_rq->throttle_count at first enqueue:
laziness allows us to skip locking all runqueues at group creation. The
lazy approach also allows skipping a full sub-tree scan of the
throttling hierarchy (not in this patch).
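
A pared-down sketch of the lazy inheritance (field and helper names
illustrative, not the exact patch):

	/* On first enqueue, inherit the parent's throttle_count instead
	 * of locking and updating every runqueue at group creation. */
	static void sync_throttle_count(struct cfs_rq *cfs_rq,
					struct cfs_rq *parent_cfs_rq)
	{
		if (cfs_rq->throttle_uptodate)
			return;

		cfs_rq->throttle_count = parent_cfs_rq ?
					 parent_cfs_rq->throttle_count : 0;
		cfs_rq->throttle_uptodate = 1;
	}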

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Link: http://lkml.kernel.org/r/146608182119.21870.8439834428248129633.stgit@buzz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Pineau <benjamin.pineau@mirakl.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25 14:30:12 +02:00