Commit graph

22243 commits

Author SHA1 Message Date
Linux Build Service Account
7643874858 Merge "sched: Ensure proper task migration when a CPU is isolated" 2016-12-10 15:43:17 -08:00
Syed Rameez Mustafa
6e24ba90a2 sched: Ensure proper task migration when a CPU is isolated
migrate_tasks() migrates all tasks of a CPU by using pick_next_task().
This works in the hotplug case as we force migrate every single task
allowing pick_next_task() to return a new task on every loop iteration.
In the case of isolation, however, task migration is not guaranteed,
which causes pick_next_task() to keep returning the same task over and
over again until we terminate the loop without having migrated all the
tasks that were supposed to be migrated.

Fix the above problem by temporarily dequeuing tasks that are pinned
and marking them with TASK_ON_RQ_MIGRATING. This not only allows
pick_next_task() to properly walk the runqueue but also prevents any
migrations or changes in affinity for the dequeued tasks. Once we are
done with migrating all possible tasks, we re-enqueue all the dequeued
tasks.

While at it, ensure consistent ordering between task de-activation and
setting the TASK_ON_RQ_MIGRATING flag across all scheduling classes.

Change-Id: Id06151a8e34edab49ac76b4bffd50c132f0b792f
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-12-09 14:30:41 -08:00
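
A minimal sketch of the dequeue-and-park approach described above. The
helpers pick_next_task_sketch(), task_pinned_to_cpu() and
migrate_one_task(), and the migration_node list field, are assumptions
for illustration only, not the actual patch:

/*
 * Sketch: temporarily dequeue pinned tasks and mark them
 * TASK_ON_RQ_MIGRATING so pick_next_task() keeps making progress,
 * then re-enqueue them once every migratable task has moved.
 */
static void migrate_tasks_isolation_sketch(struct rq *rq)
{
        struct task_struct *p, *tmp;
        LIST_HEAD(parked);                      /* illustrative local list */

        while ((p = pick_next_task_sketch(rq))) {       /* assumed helper */
                if (task_pinned_to_cpu(p, cpu_of(rq))) {        /* assumed */
                        /*
                         * Dequeue and mark migrating: pick_next_task() now
                         * walks past p, and migrations or affinity changes
                         * for p are blocked.
                         */
                        deactivate_task(rq, p, 0);
                        p->on_rq = TASK_ON_RQ_MIGRATING;
                        list_add(&p->migration_node, &parked);  /* assumed */
                        continue;
                }
                migrate_one_task(p);                    /* assumed helper */
        }

        /* Re-enqueue everything that had to stay behind. */
        list_for_each_entry_safe(p, tmp, &parked, migration_node) {
                list_del(&p->migration_node);
                activate_task(rq, p, 0);
                p->on_rq = TASK_ON_RQ_QUEUED;
        }
}
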
Olav Haugan
8cf404403a sched/core: Fix race condition in clearing hmp request
There is a race condition between clearing an HMP request for active
migration and the actual active migration. Active migration can be
half-way through the migration when the HMP request gets cleared by
another core. Move the clearing of the HMP request to the stopper
thread to avoid this.

Change-Id: I6d73b8f246ae3754ab60984af198333fd284ae16
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-12-09 13:45:42 -08:00
Olav Haugan
584d38f189 sched/core: Prevent (user) space tasks from affining to isolated cpus
We don't want user space tasks to run on isolated cpus. If the
affinity mask that a user space task is trying to set only includes
online cpus that are isolated, return an error.

Also ensure that tasks do not get stuck on isolated cores. We are not
properly updating the mask that we check against the current CPU, so we
might end up wrongly concluding that the task can run on the current
CPU. Fix this.

Change-Id: I078d01e63860d1fc60fc96eb0c739c0f680ae983
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-12-09 13:45:42 -08:00
Linux Build Service Account
b832093be4 Merge "sched: pre-allocate colocation groups" 2016-12-01 16:39:40 -08:00
Joonwoo Park
7437cd7c4b sched: pre-allocate colocation groups
At present, sched_set_group_id() dynamically allocates a structure for
the colocation group in order to assign the given task to the group.
However, this can cause a deadlock, as the memory allocator can wake up
a task which also tries to acquire related_thread_group_lock.

Avoid such deadlocks by pre-allocating the colocation structures.  This
limits the maximum number of colocation groups to a static value, but
that is fine since the count is never expected to be large.

Change-Id: Ifc32ab4ead63c382ae390358ed86f7cc5b6eb2dc
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-12-01 11:28:01 -08:00
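
A sketch of the pre-allocation idea; the array size and lookup helper
are illustrative assumptions, not the actual patch:

#define MAX_COLOC_GROUPS 32             /* assumed static limit */

static struct related_thread_group coloc_groups[MAX_COLOC_GROUPS];

static struct related_thread_group *get_coloc_group(unsigned int id)
{
        if (id >= MAX_COLOC_GROUPS)
                return NULL;
        /*
         * No allocation here: the group already exists, so assigning a
         * task to it can never wake the allocator (and hence another
         * task) while related_thread_group_lock is held.
         */
        return &coloc_groups[id];
}
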
Linux Build Service Account
fbfd0301be Merge "sched: Disable interrupts while holding related_thread_group_lock" 2016-11-29 07:44:06 -08:00
Linux Build Service Account
46c5a88fdf Merge "sched/core: Do not free task while holding rq lock" 2016-11-28 23:57:56 -08:00
Linux Build Service Account
40493b8042 Merge "qos: Register irq notify after adding the qos request" 2016-11-28 23:57:32 -08:00
Olav Haugan
e5c095a2c7 sched/core: Do not free task while holding rq lock
Clearing the HMP request can cause a task to be freed. When a task is
freed, the free call might wake up a kworker, which can cause a
spinlock lockup on the rq lock. Fix this by avoiding calls to
put_task_struct() while holding the rq lock.

In addition, move the call to clear_hmp_request() out of stopper thread
context, since it is not necessary to do this on the cpu being isolated.

Change-Id: Ie577db4701a88849560df385869ff7cf73695a05
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-28 11:00:29 -08:00
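
The pattern the fix relies on can be sketched as follows;
take_hmp_request_task() is an assumed helper that returns the task
whose reference must be dropped:

static void clear_hmp_request_sketch(struct rq *rq)
{
        struct task_struct *p;
        unsigned long flags;

        raw_spin_lock_irqsave(&rq->lock, flags);
        p = take_hmp_request_task(rq);          /* assumed helper */
        raw_spin_unlock_irqrestore(&rq->lock, flags);

        /*
         * put_task_struct() may free the task and wake a kworker;
         * doing it here, after the rq lock is dropped, avoids the
         * spinlock lockup.
         */
        if (p)
                put_task_struct(p);
}
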
Pavankumar Kondeti
d0ff1c04e8 sched: Disable interrupts while holding related_thread_group_lock
There is a potential deadlock condition if interrupts are enabled
while holding the related_thread_group_lock. Prevent this.

----------------                              --------------------
    CPU 0                                            CPU 1
----------------                              --------------------

check_for_migration()                         cgroup_file_write(p)

check_for_freq_change()                       cgroup_attach_task(p)

send_notification()                           schedtune_attach(p)

read_lock(&related_thread_group_lock)         sched_set_group_id(p)

                                              raw_spin_lock_irqsave(
                                                &p->pi_lock, flags)

                                              write_lock_irqsave(
                                                &related_thread_group_lock)

                                              waiting on CPU#0

raw_spin_lock_irqsave(&rq->lock, flags)

raw_spin_unlock_irqrestore(&rq->lock, flags)

--> interrupt()

----> ttwu(p)

-------> waiting for p's pi_lock on CPU#1

Change-Id: I6f0f8f742d6e1b3ff735dcbeabd54ef101329cdf
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-11-28 22:15:56 +05:30
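
A sketch of the reader-side fix: taking the read lock with interrupts
disabled prevents the interrupt-driven ttwu() step of the cycle above
(a minimal illustration, not the actual patch):

static DEFINE_RWLOCK(related_thread_group_lock);

static void send_notification_sketch(void)
{
        unsigned long flags;

        /*
         * With IRQs off, an interrupt cannot run ttwu(p) inside this
         * critical section and block on p's pi_lock held by CPU 1.
         */
        read_lock_irqsave(&related_thread_group_lock, flags);
        /* ... inspect colocation groups, decide on a notification ... */
        read_unlock_irqrestore(&related_thread_group_lock, flags);
}
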
Linux Build Service Account
9aa1df0cf5 Merge "sched: Ensure proper synch between isolation, hotplug, and suspend" 2016-11-27 19:40:21 -08:00
Anil Kumar Mamidala
9879d0300b qos: Register irq notify after adding the qos request
If the affinity of the interrupt changes before the irq affinity based
qos request has been added to the list, the notifier call is triggered.
This notifier call will try to update the qos request. Accessing a qos
request which has not yet been added to the list leads to a NULL
pointer exception.

Avoid this race by registering the notifier after adding the
qos request.

Change-Id: I99869cc233573b5db10e4f3224d65c29511050ea
Signed-off-by: Anil Kumar Mamidala <amami@codeaurora.org>
2016-11-27 08:21:28 -08:00
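
The ordering fix can be sketched like this; the irq_notify field
embedded in the request is an assumption for illustration:

static int add_irq_qos_request_sketch(struct pm_qos_request *req, int irq)
{
        /*
         * Step 1: publish the request so the framework (and any
         * notifier callback) finds a fully initialized entry.
         */
        pm_qos_add_request(req, PM_QOS_CPU_DMA_LATENCY,
                           PM_QOS_DEFAULT_VALUE);

        /*
         * Step 2: only now hook affinity changes; a notification fired
         * from here on can safely dereference the request.
         */
        return irq_set_affinity_notifier(irq, &req->irq_notify); /* assumed field */
}
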
Linux Build Service Account
5b3053ec24 Merge "qos: wake up cores based on the qos updated cpu mask" 2016-11-26 21:27:49 -08:00
Linux Build Service Account
22f8318fcc Merge "audit: fix a double fetch in audit_log_single_execve_arg()" 2016-11-25 08:32:35 -08:00
Anil Kumar Mamidala
625eb19435 qos: wake up cores based on the qos updated cpu mask
If the qos value is increased only for a subset of cpus, the
aggregated qos for those cpus remains at the previous value.
This is because the qos request list is maintained per
request and not per cpu. In this case, as there is no change
in the aggregated qos value, these cpus are not woken up to
take the new qos value into effect.

So wake up cpus even if the aggregated qos value does not change
but the cpumask does.

Change-Id: If5a4a100108e85e04beb77e5249bd6c452672edf
Signed-off-by: Anil Kumar Mamidala <amami@codeaurora.org>
2016-11-24 21:42:52 -08:00
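
In code form the rule is roughly the following; wake_affected_cpus()
is a hypothetical helper standing in for whatever mechanism pokes the
cpus:

static void qos_update_sketch(const struct cpumask *old_mask,
                              const struct cpumask *new_mask,
                              s32 old_value, s32 new_value)
{
        /*
         * Wake the cpus when either the aggregated value or the mask
         * of affected cpus changed; checking only the value misses the
         * subset case described above.
         */
        if (old_value != new_value || !cpumask_equal(old_mask, new_mask))
                wake_affected_cpus(new_mask);   /* hypothetical helper */
}
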
Paul Moore
7b0a354c5e audit: fix a double fetch in audit_log_single_execve_arg()
There is a double fetch problem in audit_log_single_execve_arg()
where we first check the execve(2) arguments for any "bad" characters
which would require hex encoding and then re-fetch the arguments for
logging in the audit record[1].  Of course this leaves a window of
opportunity for an unsavory application to munge with the data.

This patch reworks things by only fetching the argument data once[2]
into a buffer where it is scanned and logged into the audit
record(s).  In addition to fixing the double fetch, this patch
improves on the original code in a few other ways: better handling
of large arguments which require encoding, stricter record length
checking, and some performance improvements (completely unverified,
but we got rid of some strlen() calls, that's got to be a good
thing).

As part of the development of this patch, I've also created a basic
regression test for the audit-testsuite, the test can be tracked on
GitHub at the following link:

 * https://github.com/linux-audit/audit-testsuite/issues/25

[1] If you pay careful attention, there is actually a triple fetch
problem due to a strnlen_user() call at the top of the function.

[2] This is a tiny white lie, we do make a call to strnlen_user()
prior to fetching the argument data.  I don't like it, but due to the
way the audit record is structured we really have no choice unless we
copy the entire argument at once (which would require a rather
wasteful allocation).  The good news is that with this patch the
kernel no longer relies on this strnlen_user() value for anything
beyond recording it in the log, we also update it with a trustworthy
value whenever possible.

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Change-Id: Ie9848961d236739df5014474f2c2a781af9fb811
Git-repo: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Git-commit: 43761473c254b45883a64441dd0bc85a42f3645c
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
2016-11-23 11:33:07 -08:00
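
The single-fetch pattern the patch adopts looks roughly like this (a
simplified sketch, not the audit code itself):

static int fetch_arg_once_sketch(const char __user *uarg, size_t len,
                                 char **out)
{
        char *buf = kmalloc(len + 1, GFP_KERNEL);

        if (!buf)
                return -ENOMEM;
        /*
         * One copy_from_user(): both the "needs hex encoding?" scan and
         * the audit record are built from this kernel-side copy, so a
         * concurrent writer in userspace cannot change what gets logged.
         */
        if (copy_from_user(buf, uarg, len)) {
                kfree(buf);
                return -EFAULT;
        }
        buf[len] = '\0';
        *out = buf;
        return 0;
}
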
Nick Desaulniers
3b5cf91f45 cgroup: prefer %pK to %p
Prevents leaking kernel pointers when using kptr_restrict.

Bug: 30149174
Change-Id: I0fa3cd8d4a0d9ea76d085bba6020f1eda073c09b
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit: 505e48f32f1321ed7cf80d49dd5f31b16da445a8
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
2016-11-18 17:08:58 -08:00
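
The change amounts to the following substitution; with kptr_restrict
set, %pK prints zeros for unprivileged readers instead of the raw
kernel address (cgrp stands for any kernel pointer being printed):

/* Before: leaks the kernel address to any reader of the log. */
pr_info("cgroup: %p\n", cgrp);

/* After: honors kptr_restrict. */
pr_info("cgroup: %pK\n", cgrp);
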
Olav Haugan
704e5bfc25 sched: Ensure proper synch between isolation, hotplug, and suspend
Isolation code needs to be synchronized with both hotplug and suspend.
Ensure this by taking the lock that is taken by both paths and by
ensuring that hotplug notifiers are processed for suspend/resume.

Change-Id: I663588cfd2f9e3972b9adc1a10887ef36cd70c57
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-18 14:04:39 -08:00
Syed Rameez Mustafa
30fc774235 sched/hmp: Enhance co-location and scheduler boost features
The recent introduction of the schedtune cgroup controller has provided
the scheduler with added flexibility in terms of some of its placement
features. In particular each cgroup under the schedtune controller can
now specify:

1) Whether it needs co-location along with other cgroups
2) Whether it is eligible for scheduler boost (sched_boost_enabled)
3) Whether the kernel can override the boost eligibility when necessary
   (sched_boost_no_override)

The scheduler now creates a reserved co-location group at boot. This
group is used to co-locate all tasks that form part of any one of the
cgroups that have co-location enabled. This reserved group can neither
be destroyed nor reused for other purposes. Furthermore, cgroups are
only allowed to indicate their co-location preference once at boot.
Further updates are disallowed.

Since we are now creating co-location groups for an extended period of
time, there are a few other factors to consider when determining the
preferred cluster for the group. We first exclude any tasks in the
group that have not been observed to be running for a significant
amount of time. Secondly we introduce the notion of group up and down
migrate tunables to allow different migration policies than individual
tasks. Lastly we break co-location if a single task in a group exceeds
up-migrate but the total load of the group does not exceed group
up-migrate.

In terms of sched_boost, the scheduler now supports multiple types of
boost. These are:

1) FULL_THROTTLE : Force up-migrate tasks belonging any cgroup that
                   has the sched_boost_enabled flag turned on. Little
                   CPUs will only be used when big CPUs can no longer
                   accommodate tasks. Also up-migrate all RT tasks.

2) CONSERVATIVE : Override the sched_boost_enabled flag for all cgroups
                  except those that have the sched_boost_no_override
                  flag set. Force up-migrate all tasks belonging to only
                  those cgroups that still remain eligible for boost.
                  RT tasks do not get force up migrated.

3) RESTRAINED : Start frequency aggregation for co-located tasks. This
                type of boost does not force up-migrate any task.

Finally the boost API removes ref-counting. This means that there can
only be a single entity using boost at any given time. If multiple
entities are managing boost, they are required to be well behaved so
that they don't interfere with one another. Even for a single client,
it is not possible to switch directly from one boost type to another.
Boost must be first turned off before switching over to a new type.

Change-Id: I8d224a70cbef162f27078b62b73acaa22670861d
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-16 17:57:56 -08:00
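
The no-ref-counting rule in the last paragraph can be summarized in a
sketch; the enum values mirror the description above, and the names
are illustrative, not the actual interface:

enum sched_boost_sketch {
        BOOST_NONE,
        BOOST_FULL_THROTTLE,    /* up-migrate eligible cgroups and RT tasks */
        BOOST_CONSERVATIVE,     /* honor sched_boost_no_override */
        BOOST_RESTRAINED,       /* frequency aggregation only, no up-migration */
};

static enum sched_boost_sketch cur_boost = BOOST_NONE;

static int set_boost_sketch(enum sched_boost_sketch type)
{
        /*
         * No ref-counting: switching directly between two active boost
         * types is rejected; boost must be turned off first.
         */
        if (cur_boost != BOOST_NONE && type != BOOST_NONE)
                return -EBUSY;
        cur_boost = type;
        return 0;
}
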
Joonwoo Park
fd5b530593 sched: revise boost logic when boost_type is SCHED_BOOST_ON_BIG
At present, HMP scheduler boost tends to pack tasks by taking into
account power cost and cstate.  This is suboptimal for performance
as it can lead to preemption and higher latency.

Revise the logic to prefer the least loaded CPU among the big cluster
CPUs when the boost type is SCHED_BOOST_ON_BIG.  The new logic still
honors the behaviour that the scheduler can place tasks on the little
CPUs when the big CPUs are all overcommitted.

Also, it was found that need_idle with boost can easily return the
previous CPU when there is no idle CPU found.  Fix this issue by
making the need_idle flag take precedence over sched_boost.

CRs-fixed: 1074879
Change-Id: I470bcd0588e038b4a540d337fe6a412f2fa74920
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-16 17:57:55 -08:00
Syed Rameez Mustafa
8b74c7eb5f sched: Remove thread group iteration from colocation
Iterating a leader task's thread group in order to add them to a
colocation group involves a complex locking chain that ends up
causing a deadlock. The deadlock is as follows when the same task
is being referenced on three different CPUs:

-----                     ------                      -----
CPU 0                     CPU 1                       CPU 2
-----                     ------                      -----
                          add_task_to_group(p)

__schedule(prev = p)      write_lock(                 ttwu(p)
                          related_thread_grp_lock)
                                                      lock(pi_lock)

idle_balance()                                        wait for
                                                      p->on_cpu
load_balance()            unable to acquire
                          p->pi_lock
send_notification()

wait for read_lock(
related_thread_grp_lock)

unable to set p->on_cpu

There are a couple of ways to resolve this deadlock in the kernel,
however, they are not trivial. For the sake of simplicity, move
the responsibility of thread group iteration back to userspace. This
would apply to both adding and removing the leader task from a
colocation group. The kernel would continue to automatically add
newly forked children of the colocated leader to the colocation
group.

This still leaves an issue with the locking order of the pi_lock and
the related_thread_group_lock. To solve all deadlocks, we need to avoid
taking the pi_lock in reset_all_task_stats() and instead rely on a more
heavy handed approach of taking all rq locks. The pi_lock was taken to
avoid a race between reset_all_task_stats() and sched_exit(). The race
can be avoided with rq locks as well.

Change-Id: I15323e3ef91401142d3841db59c18fd8fee753fd
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-16 17:57:35 -08:00
Linux Build Service Account
2ce4f26719 Merge "core_ctl: Export boost function" 2016-11-15 04:07:49 -08:00
Linux Build Service Account
30f6933a15 Merge "sched: core: Skip migrating tasks that aren't enqueued on dead_rq" 2016-11-14 21:54:02 -08:00
Olav Haugan
8bf3523cf7 core_ctl: Export boost function
Export the core control boost function to make it accessible to kernel
modules.

Change-Id: I94359afa433ad57dd5bfeae3cb78a1f196cd02fe
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-14 16:28:03 -08:00
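
"Exporting" here means adding an EXPORT_SYMBOL() line so that loadable
modules can link against the function; a sketch with an assumed name
and signature:

void core_ctl_boost_sketch(bool enable)
{
        /* ... adjust core control's min/max cpu limits for the boost ... */
}
EXPORT_SYMBOL(core_ctl_boost_sketch);
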
Vikram Mulukutla
3f11a4bc4f sched: core: Skip migrating tasks that aren't enqueued on dead_rq
During migrate_tasks(), we have to drop the dead_rq lock in
order to preserve locking order when acquiring task->pi_lock.
This may allow the task to migrate off of dead_rq. Therefore,
don't attempt to migrate such a task again from dead_rq.

Change-Id: Id31b58e231d3dcd7d32e0dc7f264595d60a7c408
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-11-11 16:09:55 -08:00
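
The recheck described above, as a loop fragment;
pick_migratable_task() and move_one_task() are assumed helpers, not
the actual patch:

for (;;) {
        struct task_struct *next = pick_migratable_task(dead_rq);

        if (!next)
                break;

        /*
         * pi_lock must be taken before rq->lock, so drop and re-take
         * the dead rq's lock around acquiring it.
         */
        raw_spin_unlock(&dead_rq->lock);
        raw_spin_lock(&next->pi_lock);
        raw_spin_lock(&dead_rq->lock);

        /*
         * The task may have migrated while the lock was dropped; if
         * so, don't try to migrate it from dead_rq again.
         */
        if (task_rq(next) != dead_rq || !task_on_rq_queued(next)) {
                raw_spin_unlock(&next->pi_lock);
                continue;
        }

        move_one_task(next);                    /* assumed helper */
        raw_spin_unlock(&next->pi_lock);
}
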
Linux Build Service Account
1787801211 Merge "timer: Don't wait for running timers when migrating during isolation" 2016-11-10 22:49:40 -08:00
Linux Build Service Account
befd242303 Merge "sched/core: Fix migrate tasks bail-out condition" 2016-11-10 22:49:39 -08:00
Linux Build Service Account
2401d64a48 Merge "core_ctl: Synchronize access to cluster cpu list" 2016-11-10 22:49:39 -08:00
Olav Haugan
45b8775b62 sched/core: Fix migrate tasks bail-out condition
The migrate_tasks() function is used by both hotplug and cpu isolation.
During hotplug all the cpus are stalled (in stop machine) while tasks
are being migrated. However, this is not the case during cpu isolation.
A task that was counted as a pinned thread might have been migrated off
the cpu. Take this into account when checking whether we have completed
moving all tasks off the runqueue.

Also ignore the warning about tasks moving off the runqueue for the
isolation use case.

Change-Id: I5c5f25eb9b1eaf0605b606a65e0ac86996fa5f27
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-10 14:19:05 -08:00
Olav Haugan
c82e2f73d1 core_ctl: Synchronize access to cluster cpu list
Cluster cpu list traversal is not properly protected against removal
of an element by a separate thread. Add proper locking to ensure an
element cannot be removed while the list is being accessed.

In addition, ensure we don't end up in a livelock, never exiting the
loop, due to hotplug continuously moving elements to the end of the
list.

Change-Id: Ie98fe48c2f4fdd0244573229b77ee9823df9e214
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-10 14:19:00 -08:00
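
A sketch of the traversal pattern under assumed names; the lock is the
same one that removal takes, and the _safe iterator tolerates the
current element being unlinked:

static DEFINE_SPINLOCK(cluster_lock);           /* assumed lock */
static LIST_HEAD(cluster_cpus);                 /* assumed list head */

struct cluster_cpu {
        struct list_head node;
        unsigned int cpu;
};

static void walk_cluster_cpus_sketch(void)
{
        struct cluster_cpu *c, *tmp;
        unsigned long flags;

        spin_lock_irqsave(&cluster_lock, flags);
        list_for_each_entry_safe(c, tmp, &cluster_cpus, node) {
                /*
                 * Safe against concurrent removal: unlinking also takes
                 * cluster_lock. Keeping the walk bounded (rather than
                 * restarting whenever the list changes) avoids the
                 * livelock where hotplug keeps moving elements to the
                 * tail.
                 */
        }
        spin_unlock_irqrestore(&cluster_lock, flags);
}
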
Abinaya P
d9a48a7cd2 Revert "input: touchscreen: Add synaptics v1 driver"
This reverts 'commit d13776d16a ("input: touchscreen: Add synaptics
v1 driver")'

Change-Id: I1c0c57de3319c59c094b9e8d9192995312192354
Signed-off-by: Abinaya P <abinayap@codeaurora.org>
2016-11-10 00:42:14 -08:00
Abinaya P
04e7c994ca Revert "input: touchscreen: synaptics v1.1"
This reverts 'commit 7112993181 ("input: touchscreen: synaptics v1.1")'
This change is not needed in the 4.4 kernel.

Change-Id: I89ab8f353bc04bc0a04d5f5a6993e8e8e5ebbd2e
Signed-off-by: Abinaya P <abinayap@codeaurora.org>
Signed-off-by: Shantanu Jain <shjain@codeaurora.org>
2016-11-10 12:40:32 +05:30
Vikram Mulukutla
4142e30898 timer: Don't wait for running timers when migrating during isolation
A CPU that is isolated needs to have its timers migrated off to
another CPU. If, while migrating timers, there is a running timer,
acquiring the timer base lock after marking the CPU as isolated
will ensure that:

1) No more timers can be queued on to the isolated CPU, and
2) A running timer will finish execution on the to-be-isolated
   CPU, and so will any just expired timers since they're all
   taken off of the CPU's tvec1 in one go while the base lock
   is held.

Therefore there is no apparent reason to wait for the expired
timers to finish execution, and isolation can proceed to migrate
non-expired timers even when the expired ones are running
concurrently.

While we're here, also add a delay to the wait-loop inside
migrate_hrtimer_list to allow for store-exclusive fairness
when run_hrtimer is attempting to grab the hrtimer base
lock.

Change-Id: Ib697476c93c60e3d213aaa8fff0a2bcc2985bfce
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-11-09 15:57:24 -08:00
Linux Build Service Account
a97a6be4de Merge "sched: Ensure watchdog is enabled before disabling" 2016-11-08 11:19:07 -08:00
Linux Build Service Account
5a1e6a8dbf Merge "sched/core: Keep rq online after cpu isolation" 2016-11-08 11:19:06 -08:00
Linux Build Service Account
b820fbcdd2 Merge "sched: Fix race condition with active balance" 2016-11-08 11:19:06 -08:00
Linux Build Service Account
a44eb7063b Merge "cgroup: Disable IRQs while holding css_set_lock" 2016-11-08 11:18:32 -08:00
Olav Haugan
af04b3a2ba sched: Ensure watchdog is enabled before disabling
There is a race between the watchdog being enabled by hotplug and core
isolation disabling the watchdog. When a CPU is hotplugged in and
the hotplug lock has been released, the watchdog thread might not
have run yet to enable the watchdog.  We have to wait for the
watchdog to be enabled before proceeding.

Change-Id: I88f73603b6d389a46f8e819d9b490091d5ba4fe9
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-07 17:51:48 -08:00
Olav Haugan
34a3cdf14e sched/core: Keep rq online after cpu isolation
To move tasks off a cpu when offlining it, the rq needs to be marked
offline in order to unthrottle tasks.  However, tasks might still run
on the CPU even after it has been isolated (per-CPU threads). Thus we
should leave the rq in the online state after tasks have been moved.

Change-Id: I61486e8648af0dbb82595fe699e1bc158e837362
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-07 17:51:42 -08:00
Olav Haugan
411a978bce sched: Fix race condition with active balance
There is a race condition between checking whether an active load
balance request has been set and clearing the request. A cpu might have
an active load balance request set and queued but not executed yet.
Before the load balance request is executed, the request flag might be
cleared by cpu isolation. Subsequently, the load balancer or the tick
might try to do another active load balance.  This can cause the same
active load balance work to be queued twice, resulting in a report of
list corruption.

Fix this by moving the clearing of the request to the stopper thread
and ensuring that load balance will not try to queue a request on an
already isolated cpu.

Change-Id: I5c900d2ee161fa692d66e3e66012398869715662
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-07 17:51:25 -08:00
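
A sketch of the queueing side of the fix; rq->active_balance and
stop_one_cpu_nowait() follow the stock load balancer, while
cpu_isolated() is assumed from the isolation patches:

/*
 * Never queue active-balance work on an isolated cpu, and set the
 * flag only once so the same work item cannot be queued twice.
 */
if (cpu_isolated(cpu_of(busiest)))
        return;

if (!busiest->active_balance) {
        busiest->active_balance = 1;
        stop_one_cpu_nowait(cpu_of(busiest),
                            active_load_balance_cpu_stop, busiest,
                            &busiest->active_balance_work);
}
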
Syed Rameez Mustafa
b9b63b0c62 sched/hmp: Fix memory leak when task fork fails
The scheduler allocates memory for the task load structures during
fork. It then relies on sched_exit() being called to free that memory.
However, if the fork itself fails at any point after the allocation,
the memory is leaked forever. Fix this memory leak by freeing the
allocated memory under error conditions.

Change-Id: I14a8290c9fcc4174ec80560e9f9d7bcdb119761f
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-07 14:46:22 -08:00
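
The shape of the fix, sketched with assumed helper names:

static int copy_process_sketch(struct task_struct *p)
{
        int ret;

        ret = init_new_task_load_sketch(p);     /* allocates load structs */
        if (ret)
                return ret;

        ret = remaining_fork_work(p);           /* anything after the alloc */
        if (ret) {
                /*
                 * Without this, a failed fork leaks the allocation:
                 * sched_exit() never runs for a task that was never
                 * fully created.
                 */
                free_task_load_sketch(p);
                return ret;
        }
        return 0;
}
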
Syed Rameez Mustafa
576259be4a sched/hmp: Use GFP_KERNEL for top task memory allocations
Task load structure allocations can consume a lot of memory as the
number of tasks begins to increase. They might also exhaust the atomic
memory pool quickly if a workload spawns lots of threads in a short
amount of time, increasing the possibility of failed allocations. Move
the call to init_new_task_load() outside atomic context and start using
GFP_KERNEL for allocations. There is no need for this allocation to
happen in atomic context.

Change-Id: I357772e10bf8958804d9cd0c78eda27139054b21
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-07 14:46:21 -08:00
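
In essence the allocation moves from atomic to sleeping context; a
sketch with an assumed bucket-count parameter:

static u32 *alloc_top_task_buckets_sketch(unsigned int nr_buckets)
{
        /*
         * Called from process context (init_new_task_load() was moved
         * out of atomic context), so a sleeping GFP_KERNEL allocation
         * is fine and does not drain the GFP_ATOMIC emergency pool.
         */
        return kcalloc(nr_buckets, sizeof(u32), GFP_KERNEL);
}
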
Syed Rameez Mustafa
ecd8f7800f sched/hmp: Use improved information for frequency notifications
Recent changes to scheduler guided frequency have started reporting
the maximum of the cpu load and the load of the top task on a CPU
to the governor. Use the same information to determine whether a
notification is necessary or not.

Change-Id: I1928c6cd0509952443a912ef54e0d72d5f75955d
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-07 14:46:20 -08:00
Syed Rameez Mustafa
54052c3658 sched/hmp: Remove capping when reporting load to the cpufreq governor
Capping load when reporting to the governor was important prior to new
scheduler guided frequency changes as intra-cluster migrations would
sometimes lead to CPU loads well in excess of 100%. With the new top
task approach however, load greater than 100% is no longer possible
except for the same conditions that were previously exempted (i.e.
inter-cluster migrations and frequency aggregation).

Change-Id: I3e4f5e39ec9ae7eeaba9a567efd245a7aec1b7ad
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-07 14:46:19 -08:00
Linux Build Service Account
6d25dab1ba Merge "sched: prevent race between disable window statistics and task grouping" 2016-11-04 22:22:09 -07:00
Linux Build Service Account
8e9e0fd780 Merge "Merge remote-tracking branch 'msm4.4/tmp-da9a92f' into msm-4.4" 2016-11-04 22:22:00 -07:00
Joonwoo Park
dfb9634d03 sched: prevent race between disable window statistics and task grouping
Changing a colocation group requires CPU busy time accounting to be
completed beforehand by calling update_task_ravg().  However, when
window statistics accounting is disabled, update_task_ravg() acts as a
nop, which results in incorrect CPU time accounting.

Disallow colocation group changes while window statistics accounting is
disabled in order to prevent a race between reset_all_window_stats()
and the colocation grouping functions.

Change-Id: I6dfa20b8d8b0ae7ccc94119bf9cf14c5e11a1cf7
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-11-04 17:25:30 -07:00
Daniel Bristot de Oliveira
916622c7d5 cgroup: Disable IRQs while holding css_set_lock
While testing the deadline scheduler + cgroup setup I hit this
warning.

[  132.612935] ------------[ cut here ]------------
[  132.612951] WARNING: CPU: 5 PID: 0 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x80
[  132.612952] Modules linked in: (a ton of modules...)
[  132.612981] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc2 #2
[  132.612981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[  132.612982]  0000000000000086 45c8bb5effdd088b ffff88013fd43da0 ffffffff813d229e
[  132.612984]  0000000000000000 0000000000000000 ffff88013fd43de0 ffffffff810a652b
[  132.612985]  00000096811387b5 0000000000000200 ffff8800bab29d80 ffff880034c54c00
[  132.612986] Call Trace:
[  132.612987]  <IRQ>  [<ffffffff813d229e>] dump_stack+0x63/0x85
[  132.612994]  [<ffffffff810a652b>] __warn+0xcb/0xf0
[  132.612997]  [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170
[  132.612999]  [<ffffffff810a665d>] warn_slowpath_null+0x1d/0x20
[  132.613000]  [<ffffffff810aba5b>] __local_bh_enable_ip+0x6b/0x80
[  132.613008]  [<ffffffff817d6c8a>] _raw_write_unlock_bh+0x1a/0x20
[  132.613010]  [<ffffffff817d6c9e>] _raw_spin_unlock_bh+0xe/0x10
[  132.613015]  [<ffffffff811388ac>] put_css_set+0x5c/0x60
[  132.613016]  [<ffffffff8113dc7f>] cgroup_free+0x7f/0xa0
[  132.613017]  [<ffffffff810a3912>] __put_task_struct+0x42/0x140
[  132.613018]  [<ffffffff810e776a>] dl_task_timer+0xca/0x250
[  132.613027]  [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170
[  132.613030]  [<ffffffff8111371e>] __hrtimer_run_queues+0xee/0x270
[  132.613031]  [<ffffffff81113ec8>] hrtimer_interrupt+0xa8/0x190
[  132.613034]  [<ffffffff81051a58>] local_apic_timer_interrupt+0x38/0x60
[  132.613035]  [<ffffffff817d9b0d>] smp_apic_timer_interrupt+0x3d/0x50
[  132.613037]  [<ffffffff817d7c5c>] apic_timer_interrupt+0x8c/0xa0
[  132.613038]  <EOI>  [<ffffffff81063466>] ? native_safe_halt+0x6/0x10
[  132.613043]  [<ffffffff81037a4e>] default_idle+0x1e/0xd0
[  132.613044]  [<ffffffff810381cf>] arch_cpu_idle+0xf/0x20
[  132.613046]  [<ffffffff810e8fda>] default_idle_call+0x2a/0x40
[  132.613047]  [<ffffffff810e92d7>] cpu_startup_entry+0x2e7/0x340
[  132.613048]  [<ffffffff81050235>] start_secondary+0x155/0x190
[  132.613049] ---[ end trace f91934d162ce9977 ]---

The warning is caused by the spin_(lock|unlock)_bh(&css_set_lock) in
interrupt context. Convert the spin_lock_bh calls to spin_lock_irq(save)
to avoid this problem - and other problems that come with sharing a
spinlock with an interrupt.

Change-Id: I2064d3c21863e50ee1a70e57f7915d04f2ba0407
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: cgroups@vger.kernel.org
Cc: stable@vger.kernel.org # 4.5+
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Git-commit: 82d6489d0fed2ec8a8c48c19e8d8a04ac8e5bb26
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[runminw@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
2016-11-03 11:57:08 -07:00
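
The conversion boils down to the following substitution, sketched with
a local lock; the IRQ-safe form can be reached from hard-irq context
such as the dl_task_timer path in the trace above:

static DEFINE_SPINLOCK(css_set_lock_sketch);

static void put_css_set_sketch(void)
{
        unsigned long flags;

        /*
         * Before: spin_lock_bh(&css_set_lock); -- triggers the warning
         * above when reached from an hrtimer interrupt.
         */
        spin_lock_irqsave(&css_set_lock_sketch, flags);
        /* ... drop the css_set reference ... */
        spin_unlock_irqrestore(&css_set_lock_sketch, flags);
}
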
Linux Build Service Account
4b37769e9e Merge "sched/hmp: Automatically add children threads to colocation group" 2016-11-02 14:41:34 -07:00