Commit graph

22582 commits

Author SHA1 Message Date
Ding Tianhong
dfb704f96c rcu: Fix soft lockup for rcu_nocb_kthread
commit bedc1969150d480c462cdac320fa944b694a7162 upstream.

Carrying out the following steps results in a softlockup in the
RCU callback-offload (rcuo) kthreads:

1. Connect to ixgbevf, and set the speed to 10Gb/s.
2. Use ifconfig to bring the nic up and down repeatedly.

[  317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[  368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
[  368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  368.106005] task: ffff88057dd8a220 ti: ffff88057dd9c000 task.ti: ffff88057dd9c000
[  368.106005] RIP: 0010:[<ffffffff81579e04>]  [<ffffffff81579e04>] fib_table_lookup+0x14/0x390
[  368.106005] RSP: 0018:ffff88061fc83ce8  EFLAGS: 00000286
[  368.106005] RAX: 0000000000000001 RBX: 00000000020155c0 RCX: 0000000000000001
[  368.106005] RDX: ffff88061fc83d50 RSI: ffff88061fc83d70 RDI: ffff880036d11a00
[  368.106005] RBP: ffff88061fc83d08 R08: 0000000000000001 R09: 0000000000000000
[  368.106005] R10: ffff880036d11a00 R11: ffffffff819e0900 R12: ffff88061fc83c58
[  368.106005] R13: ffffffff816154dd R14: ffff88061fc83d08 R15: 00000000020155c0
[  368.106005] FS:  0000000000000000(0000) GS:ffff88061fc80000(0000) knlGS:0000000000000000
[  368.106005] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  368.106005] CR2: 00007f8c2aee9c40 CR3: 000000057b222000 CR4: 00000000000407e0
[  368.106005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  368.106005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  368.106005] Stack:
[  368.106005]  00000000010000c0 ffff88057b766000 ffff8802e380b000 ffff88057af03e00
[  368.106005]  ffff88061fc83dc0 ffffffff815349a6 ffff88061fc83d40 ffffffff814ee146
[  368.106005]  ffff8802e380af00 00000000e380af00 ffffffff819e0900 020155c0010000c0
[  368.106005] Call Trace:
[  368.106005]  <IRQ>
[  368.106005]
[  368.106005]  [<ffffffff815349a6>] ip_route_input_noref+0x516/0xbd0
[  368.106005]  [<ffffffff814ee146>] ? skb_release_data+0xd6/0x110
[  368.106005]  [<ffffffff814ee20a>] ? kfree_skb+0x3a/0xa0
[  368.106005]  [<ffffffff8153698f>] ip_rcv_finish+0x29f/0x350
[  368.106005]  [<ffffffff81537034>] ip_rcv+0x234/0x380
[  368.106005]  [<ffffffff814fd656>] __netif_receive_skb_core+0x676/0x870
[  368.106005]  [<ffffffff814fd868>] __netif_receive_skb+0x18/0x60
[  368.106005]  [<ffffffff814fe4de>] process_backlog+0xae/0x180
[  368.106005]  [<ffffffff814fdcb2>] net_rx_action+0x152/0x240
[  368.106005]  [<ffffffff81077b3f>] __do_softirq+0xef/0x280
[  368.106005]  [<ffffffff8161619c>] call_softirq+0x1c/0x30
[  368.106005]  <EOI>
[  368.106005]
[  368.106005]  [<ffffffff81015d95>] do_softirq+0x65/0xa0
[  368.106005]  [<ffffffff81077174>] local_bh_enable+0x94/0xa0
[  368.106005]  [<ffffffff81114922>] rcu_nocb_kthread+0x232/0x370
[  368.106005]  [<ffffffff81098250>] ? wake_up_bit+0x30/0x30
[  368.106005]  [<ffffffff811146f0>] ? rcu_start_gp+0x40/0x40
[  368.106005]  [<ffffffff8109728f>] kthread+0xcf/0xe0
[  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140
[  368.106005]  [<ffffffff816147d8>] ret_from_fork+0x58/0x90
[  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140

==================================cut here==============================

It turns out that the rcuos callback-offload kthread is busy processing
a very large quantity of RCU callbacks, and it is not relinquishing the
CPU while doing so.  This commit therefore adds a cond_resched_rcu_qs()
within the loop to allow other tasks to run.
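
In sketch form, the fix looks like the following (a minimal sketch of
the rcuo kthread's callback loop; the real rcu_nocb_kthread() carries
more state than shown, so treat the shape as illustrative):

  /* Invoke one batch of offloaded callbacks, yielding as we go. */
  while (next) {
          struct rcu_head *rhp = next;

          next = rhp->next;
          rhp->func(rhp);         /* invoke one RCU callback */
          cond_resched_rcu_qs();  /* yield and report a quiescent state
                                   * so a large backlog cannot soft-lock
                                   * the CPU */
  }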

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[ paulmck: Substituted cond_resched_rcu_qs for cond_resched. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Dhaval Giani <dhaval.giani@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08 07:15:24 +01:00
Linux Build Service Account
b832093be4 Merge "sched: pre-allocate colocation groups" 2016-12-01 16:39:40 -08:00
Joonwoo Park
7437cd7c4b sched: pre-allocate colocation groups
At present, sched_set_group_id() dynamically allocates the structure
for the colocation group in order to assign the given task to the
group.  However, this can cause a deadlock, as the memory allocator can
wake up a task which also tries to acquire related_thread_group_lock.

Avoid such deadlocks by pre-allocating the colocation structures.  This
limits the number of colocation groups to a static maximum, but that is
fine since a large number is never expected.
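
A minimal sketch of the pre-allocation pattern (the constant, array,
and hook names here are illustrative, not the actual code):

  #include <linux/slab.h>

  #define MAX_COLOC_GROUPS 20     /* illustrative static limit */

  static struct related_thread_group *coloc_groups[MAX_COLOC_GROUPS];

  /* Allocate every group up front, in a context where the allocator
   * may safely wake tasks. */
  static int __init preallocate_coloc_groups(void)
  {
          int i;

          for (i = 0; i < MAX_COLOC_GROUPS; i++) {
                  coloc_groups[i] = kzalloc(sizeof(*coloc_groups[i]),
                                            GFP_KERNEL);
                  if (!coloc_groups[i])
                          return -ENOMEM;
          }
          return 0;
  }

  /* sched_set_group_id() can then claim a free slot while holding
   * related_thread_group_lock without ever calling the allocator. */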

Change-Id: Ifc32ab4ead63c382ae390358ed86f7cc5b6eb2dc
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-12-01 11:28:01 -08:00
Ke Wang
a8575ee86e sched: tune: Fix lacking spinlock initialization
The spinlock used by boost_groups in schedtune must be initialized.
This commit adds the missing initialization, which fixes the following
errors:

[    0.384739] c2 BUG: spinlock bad magic on CPU#2, swapper/2/0
[    0.390313] c2  lock: 0xffffffc15fe1fc80, .magic:00000000, .owner: <none>/-1, .owner_cpu: 0
[    0.398739] c2 CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.6+ #4
[    0.404816] c2 Hardware name: Spreadtrum SP9860gBoard (DT)
[    0.410462] c2 Call trace:
[    0.413159] c2 [<ffffff800808b50c>] dump_backtrace+0x0/0x210
[    0.418803] c2 [<ffffff800808b73c>] show_stack+0x20/0x28
[    0.424100] c2 [<ffffff8008433310>] dump_stack+0xa8/0xe0
[    0.429398] c2 [<ffffff8008139398>] spin_dump+0x78/0x9c
[    0.434608] c2 [<ffffff80081393ec>] spin_bug+0x30/0x3c
[    0.439644] c2 [<ffffff80081394e4>] do_raw_spin_lock+0xac/0x1b4
[    0.445639] c2 [<ffffff8008abffe4>] _raw_spin_lock_irqsave+0x58/0x68
[    0.451977] c2 [<ffffff800812a560>] schedtune_enqueue_task+0x84/0x3bc
[    0.458320] c2 [<ffffff8008111678>] enqueue_task_fair+0x438/0x208c
[    0.464487] c2 [<ffffff80080feeec>] activate_task+0x70/0xd0
[    0.470130] c2 [<ffffff80080ff4a4>] ttwu_do_activate.constprop.131+0x4c/0x98
[    0.477079] c2 [<ffffff80081005d0>] try_to_wake_up+0x254/0x54c
[    0.482899] c2 [<ffffff80081009d4>] default_wake_function+0x30/0x3c
[    0.489154] c2 [<ffffff8008122464>] autoremove_wake_function+0x3c/0x6c
[    0.495754] c2 [<ffffff8008121b70>] __wake_up_common+0x64/0xa4
[    0.501574] c2 [<ffffff8008121e9c>] __wake_up+0x48/0x60
[    0.506788] c2 [<ffffff8008150fac>] rcu_gp_kthread_wake+0x50/0x5c
[    0.512866] c2 [<ffffff8008151fec>] note_gp_changes+0xac/0xd4
[    0.518597] c2 [<ffffff8008153044>] rcu_process_callbacks+0xe8/0x93c
[    0.524940] c2 [<ffffff80080d0b84>] __do_softirq+0x24c/0x5b8
[    0.530584] c2 [<ffffff80080d1284>] irq_exit+0xc0/0xec
[    0.535623] c2 [<ffffff8008144208>] __handle_domain_irq+0x94/0xf8
[    0.541789] c2 [<ffffff8008082554>] gic_handle_irq+0x64/0xc0
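
The shape of the fix, sketched (the per-CPU variable and field names
are illustrative):

  #include <linux/spinlock.h>

  /* Initialize each per-CPU boost_groups lock at init time, before
   * schedtune_enqueue_task() can ever take it. */
  static void __init schedtune_init_locks(void)
  {
          struct boost_groups *bg;
          int cpu;

          for_each_possible_cpu(cpu) {
                  bg = &per_cpu(cpu_boost_groups, cpu);
                  raw_spin_lock_init(&bg->lock);
          }
  }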

Signed-off-by: Ke Wang <ke.wang@spreadtrum.com>
2016-12-01 15:18:44 +05:30
Joel Fernandes
789790d859 UPSTREAM: trace: Add an option for boot clock as trace clock
Unlike the monotonic clock, the boot clock as a trace clock accounts
for time spent in suspend, which is useful for tracing suspend/resume.
This uses the fast boot clock infrastructure introduced earlier.
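
The change itself is essentially one new entry in the trace_clocks
table in kernel/trace/trace.c; roughly (existing entries elided):

  static struct {
          u64 (*func)(void);
          const char *name;
          int in_ns;      /* timestamps already in nanoseconds? */
  } trace_clocks[] = {
          /* ... existing clocks such as "local" and "global" ... */
          { ktime_get_boot_fast_ns, "boot", 1 },
  };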

Bug: b/33184060

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
2016-12-01 15:18:44 +05:30
Joel Fernandes
bcddfb47bf UPSTREAM: timekeeping: Add a fast and NMI safe boot clock
This boot clock can be used as a tracing clock and will account for
suspend time.

To keep it NMI safe, since we're accessing it from tracing, we do not
use a separate timekeeper with updates to the monotonic clock and boot
offset protected by seqlocks.  This has the following minor side
effects:

(1) It is possible for a timestamp to be taken after the boot offset
is updated but before the timekeeper is updated.  If this happens, the
new boot offset is added to the old timekeeping, making the clock
appear to update slightly earlier:
   CPU 0                                        CPU 1
   timekeeping_inject_sleeptime64()
   __timekeeping_inject_sleeptime(tk, delta);
                                                timestamp();
   timekeeping_update(tk, TK_CLEAR_NTP...);

(2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
partially updated.  Since tk->offs_boot updates are rare, such torn
reads should be rare as well, and postprocessing should be able to
handle them.
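
The accessor itself is small; a sketch that mirrors the upstream
ktime_get_boot_fast_ns() (timekeeper internals elided):

  u64 notrace ktime_get_boot_fast_ns(void)
  {
          struct timekeeper *tk = &tk_core.timekeeper;

          /* offs_boot is read without seqlock protection; this is
           * what makes the clock NMI safe and also what produces the
           * side effects (1) and (2) described above. */
          return ktime_get_mono_fast_ns() + ktime_to_ns(tk->offs_boot);
  }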

Bug: b/33184060

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-12-01 15:18:44 +05:30
Viresh Kumar
d4a5b037e0 cpufreq: sched: Fix kernel crash on accessing sysfs file
If the cpufreq driver hasn't set the CPUFREQ_HAVE_GOVERNOR_PER_POLICY
flag, then the kernel will crash on accessing sysfs files for the sched
governor.

CPUFreq governors can have their governor-specific sysfs files in two
places:

A. /sys/devices/system/cpu/cpuX/cpufreq/<governor>
B. /sys/devices/system/cpu/cpufreq/<governor>

Case A is the governor-per-policy case, where the governor tunables
can be controlled separately for each policy.  Case B is for
system-wide tunable values.

The schedfreq governor implements only case A, not case B.  The sysfs
files for case B will still be present in
/sys/devices/system/cpu/cpufreq/<governor>, but accessing them will
crash the kernel, as the governor doesn't support them.

Moreover, the sched governor is quite new, will be used only on ARM
platforms, and there is no need to support case B at all.

Hence use policy->kobj instead of get_governor_parent_kobj(), so that we
always create the sysfs files in path A.
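
In sketch form (the governor-data and ktype names are illustrative
stand-ins for the schedfreq internals):

  /* Attach the governor tunables under the per-policy kobject
   * (path A) instead of get_governor_parent_kobj(), which falls
   * back to the global kobject when the driver does not set
   * CPUFREQ_HAVE_GOVERNOR_PER_POLICY. */
  ret = kobject_init_and_add(&gd->kobj, &tunables_ktype,
                             &policy->kobj, "%s", "sched");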

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-12-01 15:18:44 +05:30
Linux Build Service Account
fbfd0301be Merge "sched: Disable interrupts while holding related_thread_group_lock" 2016-11-29 07:44:06 -08:00
Pavankumar Kondeti
822561f075 sched: Fix out of bounds array access in sched_reset_all_window_stats()
A new reset reason code "FREQ_AGGREGATE_CHANGE" is added to the
reset_reason_code enum, but the corresponding string array is not
updated.  Fix this.
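
The bug is the classic enum/name-table mismatch; sketched with
illustrative identifiers (only FREQ_AGGREGATE_CHANGE comes from the
actual patch):

  enum reset_reason_code {
          WINDOW_CHANGE,
          POLICY_CHANGE,
          FREQ_AGGREGATE_CHANGE,   /* newly added reason */
  };

  static const char * const reset_reason_names[] = {
          "WINDOW_CHANGE",
          "POLICY_CHANGE",
          "FREQ_AGGREGATE_CHANGE", /* the entry the fix adds; without
                                    * it, indexing with the new code
                                    * runs past the array */
  };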

Change-Id: I2a17d95328bef91c4a5dd4dde418296efca44431
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-11-29 15:48:34 +05:30
Alex Shi
c39fa16476 Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android 2016-11-29 16:06:48 +08:00
Alex Shi
072c0f9ee4 Merge remote-tracking branch 'origin/v4.4/topic/wb-cg2' into linux-linaro-lsk-v4.4 2016-11-29 15:58:42 +08:00
Linux Build Service Account
46c5a88fdf Merge "sched/core: Do not free task while holding rq lock" 2016-11-28 23:57:56 -08:00
Linux Build Service Account
40493b8042 Merge "qos: Register irq notify after adding the qos request" 2016-11-28 23:57:32 -08:00
Tejun Heo
cd5367ae02 cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type
With major controllers - cpu, memory and io - shaping up for the
unified hierarchy, cgroup2 is about ready to be, gradually, released
into the wild.  Replace the __DEVEL__sane_behavior flag, which was
used to select the unified hierarchy, with a separate filesystem type
"cgroup2" so that the unified hierarchy can be mounted as follows.

  mount -t cgroup2 none $MOUNT_POINT

The cgroup2 fs has its own magic number - 0x63677270 ("cgrp").

v2: Assign a different magic number to cgroup2 fs.
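
For reference, the magic number lands in include/uapi/linux/magic.h as:

  #define CGROUP2_SUPER_MAGIC 0x63677270  /* "cgrp" */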

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
(cherry picked from commit 67e9c74b8a873408c27ac9a8e4c1d1c8d72c93ff)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2016-11-29 15:25:16 +08:00
Olav Haugan
e5c095a2c7 sched/core: Do not free task while holding rq lock
Clearing the hmp request can cause a task to be freed.  When a task is
freed, the free call might wake up a kworker, which can cause a
spinlock lockup (on the rq lock).  Fix this by avoiding the call to
put_task_struct() while holding the rq lock.

In addition, move the call to clear_hmp_request() out of the stopper
thread context, since it is not necessary to do this on the CPU being
isolated.
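
The shape of the fix, sketched (clear_hmp_request() returning the
detached task is an illustrative simplification):

  raw_spin_lock_irqsave(&rq->lock, flags);
  p = clear_hmp_request(rq);      /* detach, but do not free here */
  raw_spin_unlock_irqrestore(&rq->lock, flags);

  if (p)
          put_task_struct(p);     /* may wake a kworker; safe now that
                                   * rq->lock has been released */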

Change-Id: Ie577db4701a88849560df385869ff7cf73695a05
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-28 11:00:29 -08:00
Pavankumar Kondeti
d0ff1c04e8 sched: Disable interrupts while holding related_thread_group_lock
There is a potential deadlock condition if interrupts are enabled
while holding the related_thread_group_lock. Prevent this.

----------------                              --------------------
     CPU 0                                          CPU 1
----------------                              --------------------

check_for_migration()                         cgroup_file_write(p)

check_for_freq_change()                       cgroup_attach_task(p)

send_notification()                           schedtune_attach(p)

read_lock(&related_thread_group_lock)         sched_set_group_id(p)

                                              raw_spin_lock_irqsave(
                                                &p->pi_lock, flags)

                                              write_lock_irqsave(
                                                &related_thread_group_lock)

                                              waiting on CPU#0

raw_spin_lock_irqsave(&rq->lock, flags)

raw_spin_unlock_irqrestore(&rq->lock, flags)

--> interrupt()

----> ttwu(p)

-------> waiting for p's pi_lock on CPU#1
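
The fix is to keep interrupts disabled for the read-side critical
section as well, closing the window for the interrupt on CPU 0; in
sketch form:

  /* send_notification(), sketched: */
  unsigned long flags;

  read_lock_irqsave(&related_thread_group_lock, flags);
  /* walk the group and send the notification */
  read_unlock_irqrestore(&related_thread_group_lock, flags);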

Change-Id: I6f0f8f742d6e1b3ff735dcbeabd54ef101329cdf
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-11-28 22:15:56 +05:30
Alex Shi
3597171388 Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android 2016-11-28 13:24:42 +08:00
Alex Shi
edf995d84e Merge tag 'v4.4.35' into linux-linaro-lsk-v4.4
This is the 4.4.35 stable release
2016-11-28 12:00:55 +08:00
Linux Build Service Account
9aa1df0cf5 Merge "sched: Ensure proper synch between isolation, hotplug, and suspend" 2016-11-27 19:40:21 -08:00
Anil Kumar Mamidala
9879d0300b qos: Register irq notify after adding the qos request
If the affinity of the interrupt changes before the irq-affinity-based
qos request has been added to the list, the change triggers the notify
call.  This notifier call then tries to update the qos request, and
accessing a qos request that has not yet been added to the list leads
to a NULL pointer exception.

Avoid this race by registering the notifier only after adding the
qos request.
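
Sketched, the corrected ordering is simply (the request and notifier
names are illustrative stand-ins for the pm_qos helpers involved):

  pm_qos_add_request(req, PM_QOS_CPU_DMA_LATENCY, value);
  /* Only once the request is on the list is it safe for an affinity
   * change to invoke the notifier: */
  irq_set_affinity_notifier(irq, &req->irq_notify);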

Change-Id: I99869cc233573b5db10e4f3224d65c29511050ea
Signed-off-by: Anil Kumar Mamidala <amami@codeaurora.org>
2016-11-27 08:21:28 -08:00
Linux Build Service Account
5b3053ec24 Merge "qos: wake up cores based on the qos updated cpu mask" 2016-11-26 21:27:49 -08:00
Johan Hovold
469fcbcb84 PM / sleep: fix device reference leak in test_suspend
commit ceb75787bc75d0a7b88519ab8a68067ac690f55a upstream.

Make sure to drop the reference taken by class_find_device() after
opening the RTC device.

Fixes: 77437fd4e6 (pm: boot time suspend selftest)
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-26 09:54:53 +01:00
Linux Build Service Account
22f8318fcc Merge "audit: fix a double fetch in audit_log_single_execve_arg()" 2016-11-25 08:32:35 -08:00
Anil Kumar Mamidala
625eb19435 qos: wake up cores based on the qos updated cpu mask
If the qos value is increased only for a subset of CPUs, the
aggregated qos for those CPUs still reflects the previous value.  This
is because the qos request list is maintained per request and not per
CPU.  In this case, as there is no change in the aggregated qos value,
these CPUs are not woken up to take the new qos value into effect.

So wake up CPUs even when the aggregated qos value does not change
but the cpumask does.
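
In sketch form (all names here are illustrative):

  /* Kick the affected CPUs whenever the mask changes, even if the
   * aggregated value did not. */
  if (new_value != prev_value ||
      !cpumask_equal(&new_mask, &prev_mask))
          wake_up_affected_cpus(&new_mask);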

Change-Id: If5a4a100108e85e04beb77e5249bd6c452672edf
Signed-off-by: Anil Kumar Mamidala <amami@codeaurora.org>
2016-11-24 21:42:52 -08:00
Paul Moore
7b0a354c5e audit: fix a double fetch in audit_log_single_execve_arg()
There is a double fetch problem in audit_log_single_execve_arg()
where we first check the execve(2) arguments for any "bad" characters
which would require hex encoding and then re-fetch the arguments for
logging in the audit record[1].  Of course this leaves a window of
opportunity for an unsavory application to munge the data.

This patch reworks things by only fetching the argument data once[2]
into a buffer where it is scanned and logged into the audit
record(s).  In addition to fixing the double fetch, this patch
improves on the original code in a few other ways: better handling
of large arguments which require encoding, stricter record length
checking, and some performance improvements (completely unverified,
but we got rid of some strlen() calls, and that's got to be a good
thing).
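
The single-fetch pattern, sketched (buffer management and record
splitting in the real patch are elided):

  /* Copy the userspace argument once, then both scan and log from
   * the kernel-side buffer. */
  len = strnlen_user(p, MAX_ARG_STRLEN);
  if (copy_from_user(buf, p, len))
          return -EFAULT;
  encode = audit_string_contains_control(buf, len);
  /* hex-encode if needed and log from buf only; the userspace
   * pointer p is never dereferenced again */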

As part of the development of this patch, I've also created a basic
regression test for the audit-testsuite, the test can be tracked on
GitHub at the following link:

 * https://github.com/linux-audit/audit-testsuite/issues/25

[1] If you pay careful attention, there is actually a triple fetch
problem due to a strnlen_user() call at the top of the function.

[2] This is a tiny white lie, we do make a call to strnlen_user()
prior to fetching the argument data.  I don't like it, but due to the
way the audit record is structured we really have no choice unless we
copy the entire argument at once (which would require a rather
wasteful allocation).  The good news is that with this patch the
kernel no longer relies on this strnlen_user() value for anything
beyond recording it in the log, we also update it with a trustworthy
value whenever possible.

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Change-Id: Ie9848961d236739df5014474f2c2a781af9fb811
Git-repo: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
Git-commit: 43761473c254b45883a64441dd0bc85a42f3645c
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
2016-11-23 11:33:07 -08:00
Nick Desaulniers
3b5cf91f45 cgroup: prefer %pK to %p
Prevents leaking kernel pointers when using kptr_restrict.
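
An illustrative call site (not necessarily the one the patch touches):

  /* With kptr_restrict set, %pK is censored for unprivileged readers,
   * whereas %p would print the raw kernel address. */
  pr_info("css for %s: %pK\n", ss->name, css);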

Bug: 30149174
Change-Id: I0fa3cd8d4a0d9ea76d085bba6020f1eda073c09b
Git-repo: https://android.googlesource.com/kernel/msm.git
Git-commit: 505e48f32f1321ed7cf80d49dd5f31b16da445a8
Signed-off-by: Dennis Cagle <d-cagle@codeaurora.org>
2016-11-18 17:08:58 -08:00
Olav Haugan
704e5bfc25 sched: Ensure proper synch between isolation, hotplug, and suspend
Isolation code needs to be synchronized with both hotplug and suspend.
Ensure this by taking the lock that is taken by both paths and ensure
hotplug notifiers are processed for suspend/resume.

Change-Id: I663588cfd2f9e3972b9adc1a10887ef36cd70c57
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-18 14:04:39 -08:00
Syed Rameez Mustafa
30fc774235 sched/hmp: Enhance co-location and scheduler boost features
The recent introduction of the schedtune cgroup controller has provided
the scheduler with added flexibility in terms of some of its placement
features. In particular, each cgroup under the schedtune controller can
now specify:

1) Whether it needs co-location along with other cgroups
2) Whether it is eligible for scheduler boost (sched_boost_enabled)
3) Whether the kernel can override the boost eligibility when necessary
   (sched_boost_no_override)

The scheduler now creates a reserved co-location group at boot. This
group is used to co-locate all tasks that form part of any one of the
cgroups that have co-location enabled. This reserved group can neither
be destroyed nor reused for other purposes. Furthermore, cgroups are
only allowed to indicate their co-location preference once at boot.
Further updates are disallowed.

Since we are now creating co-location groups for an extended period of
time, there are a few other factors to consider when determining the
preferred cluster for the group. We first exclude any tasks in the
group that have not been observed to be running for a significant
amount of time. Secondly we introduce the notion of group up and down
migrate tunables to allow different migration policies than individual
tasks. Lastly we break co-location if a single task in a group exceeds
up-migrate but the total load of the group does not exceed group
up-migrate.

In terms of sched_boost, the scheduler now supports multiple types of
boost. These are:

1) FULL_THROTTLE : Force up-migrate tasks belonging to any cgroup that
                   has the sched_boost_enabled flag turned on. Little
                   CPUs will only be used when big CPUs can no longer
                   accommodate tasks. Also up-migrate all RT tasks.

2) CONSERVATIVE : Override the sched_boost_enabled flag for all cgroups
                  except those that have the sched_boost_no_override
                  flag set. Force up-migrate all tasks belonging to only
                  those cgroups that still remain eligible for boost.
                  RT tasks do not get force up migrated.

3) RESTRAINED : Start frequency aggregation for co-located tasks. This
                type of boost does not force up-migrate any task.

Finally the boost API removes ref-counting. This means that there can
only be a single entity using boost at any given time. If multiple
entities are managing boost, they are required to be well behaved so
that they don't interfere with one another. Even for a single client,
it is not possible to switch directly from one boost type to another.
Boost must be first turned off before switching over to a new type.
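
The three types sketch naturally as an enum (the identifier names
here are illustrative, not necessarily the kernel's):

  enum sched_boost_type {
          NO_BOOST,
          FULL_THROTTLE_BOOST,    /* force up-migrate eligible cgroups,
                                   * plus all RT tasks */
          CONSERVATIVE_BOOST,     /* honor sched_boost_no_override;
                                   * no RT up-migration */
          RESTRAINED_BOOST,       /* frequency aggregation only */
  };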

Change-Id: I8d224a70cbef162f27078b62b73acaa22670861d
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-16 17:57:56 -08:00
Joonwoo Park
fd5b530593 sched: revise boost logic when boost_type is SCHED_BOOST_ON_BIG
At present, HMP scheduler boost tends to pack tasks by taking power
cost and C-state into account.  This is suboptimal for performance
as it can lead to preemption and higher latency.

Revise the logic to prefer the least loaded CPU among the big cluster
CPUs when the boost type is SCHED_BOOST_ON_BIG.  The new logic still
honors the behaviour that the scheduler can place tasks on the little
CPUs when the big CPUs are all overcommitted.

Also, it was found that need_idle with boost can easily return the
previous CPU when there is no idle CPU found.  Fix this issue by
making the need_idle flag take precedence over sched_boost.

CRs-fixed: 1074879
Change-Id: I470bcd0588e038b4a540d337fe6a412f2fa74920
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-16 17:57:55 -08:00
Syed Rameez Mustafa
8b74c7eb5f sched: Remove thread group iteration from colocation
Iterating a leader task's thread group in order to add them to a
colocation group involves a complex locking chain that ends up
causing a deadlock. The deadlock is as follows when the same task
is being referenced on three different CPUs:

-----                     ------                      -----
CPU 0                     CPU 1                       CPU 2
-----                     ------                      -----
                          add_task_to_group(p)

__schedule(prev = p)      write_lock(                 ttwu(p)
                          related_thread_grp_lock)
                                                      lock(pi_lock)

idle_balance()                                        wait for
                                                      p->on_cpu
load_balance()            unable to acquire
                          p->pi_lock
send_notification()

wait for read_lock(
related_thread_grp_lock)

unable to set p->on_cpu

There are a couple of ways to resolve this deadlock in the kernel;
however, they are not trivial. For the sake of simplicity, move
the responsibility of thread group iteration back to userspace. This
would apply to both adding and removing the leader task from a
colocation group. The kernel would continue to automatically add
newly forked children of the colocated leader to the colocation
group.

This still leaves an issue with the locking order of the pi_lock and
the related_thread_group_lock. To solve all deadlocks, we need to avoid
taking the pi_lock in reset_all_task_stats() and instead rely on a more
heavy-handed approach of taking all rq locks. The pi_lock was taken to
avoid a race between reset_all_task_stats() and sched_exit(). The race
can be avoided with rq locks as well.

Change-Id: I15323e3ef91401142d3841db59c18fd8fee753fd
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-11-16 17:57:35 -08:00
Amit Pundir
91e63c11a5 Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android
Conflicts:
* arch/arm64/include/asm/assembler.h
    Pick changes from AOSP Change-Id: I450594dc311b09b6b832b707a9abb357608cc6e4
    ("UPSTREAM: arm64: include alternative handling in dcache_by_line_op").

* drivers/android/binder.c
    Pick changes from LTS commit 14f09e8e7c ("ANDROID: binder: Add strong ref checks"),
    instead of AOSP Change-Id: I66c15b066808f28bd27bfe50fd0e03ff45a09fca
    ("ANDROID: binder: Add strong ref checks").

* drivers/usb/gadget/function/u_ether.c
    Refactor throttling of highspeed IRQ logic in AOSP by adding
    a check for last queue request as intended by LTS commit
    660c04e8f1 ("usb: gadget: function: u_ether: don't starve tx request queue").
    Fixes AOSP Change-Id: I26515bfd9bbc8f7af38be7835692143f7093118a
    ("USB: gadget: u_ether: Fix data stall issue in RNDIS tethering mode").

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-11-15 18:33:34 +05:30
Linux Build Service Account
2ce4f26719 Merge "core_ctl: Export boost function" 2016-11-15 04:07:49 -08:00
Alex Shi
62c3330b7f Merge branch v4.4/topic/hibernate into linux-linaro-lsk-v4.4 2016-11-15 17:28:46 +08:00
Linux Build Service Account
30f6933a15 Merge "sched: core: Skip migrating tasks that aren't enqueued on dead_rq" 2016-11-14 21:54:02 -08:00
Olav Haugan
8bf3523cf7 core_ctl: Export boost function
Export core control boost function to make it accessible to kernel
modules.

Change-Id: I94359afa433ad57dd5bfeae3cb78a1f196cd02fe
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-14 16:28:03 -08:00
Alex Shi
ce11555672 Merge branch 'v4.4/topic/hibernate' into linux-linaro-lsk-v4.4
Conflicts:
	conflicts almost all come from mm-kaslr and are focused on mm:
	arch/arm64/include/asm/cpufeature.h
	arch/arm64/include/asm/pgtable.h
	arch/arm64/kernel/Makefile
	arch/arm64/kernel/cpufeature.c
	arch/arm64/kernel/head.S
	arch/arm64/kernel/suspend.c
	arch/arm64/kernel/vmlinux.lds.S
	arch/arm64/kvm/hyp.S
	arch/arm64/mm/init.c
	arch/arm64/mm/mmu.c
	arch/arm64/mm/proc-macros.S
2016-11-14 21:20:48 +08:00
Vikram Mulukutla
3f11a4bc4f sched: core: Skip migrating tasks that aren't enqueued on dead_rq
During migrate_tasks, we have to drop the dead_rq lock in
order to preserve locking order when acquiring task->pi_lock.
This may allow the task to migrate off of dead_rq. Therefore,
don't attempt to migrate such a task again from dead_rq.
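
The check, sketched (the body of the real migrate_tasks() loop has
more context around this):

  raw_spin_lock(&next->pi_lock);
  raw_spin_lock(&rq->lock);

  /* rq->lock was dropped above; the task may have migrated itself
   * off of dead_rq in the meantime -- if so, leave it alone. */
  if (task_rq(next) != rq || !task_on_rq_queued(next)) {
          raw_spin_unlock(&rq->lock);
          raw_spin_unlock(&next->pi_lock);
          continue;
  }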

Change-Id: Id31b58e231d3dcd7d32e0dc7f264595d60a7c408
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-11-11 16:09:55 -08:00
Linux Build Service Account
1787801211 Merge "timer: Don't wait for running timers when migrating during isolation" 2016-11-10 22:49:40 -08:00
Linux Build Service Account
befd242303 Merge "sched/core: Fix migrate tasks bail-out condition" 2016-11-10 22:49:39 -08:00
Linux Build Service Account
2401d64a48 Merge "core_ctl: Synchronize access to cluster cpu list" 2016-11-10 22:49:39 -08:00
Alex Shi
17d454ca33 Merge tag 'v4.4.31' into linux-linaro-lsk-v4.4
This is the 4.4.31 stable release
2016-11-11 12:01:04 +08:00
Olav Haugan
45b8775b62 sched/core: Fix migrate tasks bail-out condition
The migrate_tasks() function is used by both hotplug and cpu isolation.
During hotplug, all the cpus are stalled (in stop machine) while tasks
are being migrated.  However, this is not the case during cpu
isolation.  A task that was counted as a pinned thread might have been
migrated off the cpu.  Take this into account when checking whether we
have completed moving all tasks off the runqueue.

Also ignore the warning about tasks moving off the run-queue in the
isolation use case.

Change-Id: I5c5f25eb9b1eaf0605b606a65e0ac86996fa5f27
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-10 14:19:05 -08:00
Olav Haugan
c82e2f73d1 core_ctl: Synchronize access to cluster cpu list
Cluster cpu list traversal is not properly protected against removal
of an element by a separate thread.  Add proper locking to ensure an
element cannot be removed while the list is being accessed.

In addition, ensure we don't end up in a livelock, never exiting the
loop, due to hotplug continuously moving elements to the end of the
list.

Change-Id: Ie98fe48c2f4fdd0244573229b77ee9823df9e214
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-11-10 14:19:00 -08:00
Arnd Bergmann
603c78000f cgroup: avoid false positive gcc-6 warning
commit cfe02a8a973e7e5f66926b8ae38dfce404b19e29 upstream.

When all subsystems are disabled, gcc notices that cgroup_subsys_enabled_key
is a zero-length array and that any access to it must be out of bounds:

In file included from ../include/linux/cgroup.h:19:0,
                 from ../kernel/cgroup.c:31:
../kernel/cgroup.c: In function 'cgroup_add_cftypes':
../kernel/cgroup.c:261:53: error: array subscript is above array bounds [-Werror=array-bounds]
  return static_key_enabled(cgroup_subsys_enabled_key[ssid]);
                            ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
../include/linux/jump_label.h:271:40: note: in definition of macro 'static_key_enabled'
  static_key_count((struct static_key *)x) > 0;    \
                                        ^

We should never call the function in this particular case, so this is
not a bug. In order to silence the warning, this adds an explicit check
for the CGROUP_SUBSYS_COUNT==0 case.
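
Per the upstream commit, the check amounts to an early return
(trimmed to the relevant function):

  static bool cgroup_ssid_enabled(int ssid)
  {
          if (CGROUP_SUBSYS_COUNT == 0)
                  return false;

          return static_key_enabled(cgroup_subsys_enabled_key[ssid]);
  }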

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10 16:36:36 +01:00
Abinaya P
d9a48a7cd2 Revert "input: touchscreen: Add synaptics v1 driver"
This reverts 'commit d13776d16a ("input: touchscreen: Add synaptics
v1 driver")'.

Change-Id: I1c0c57de3319c59c094b9e8d9192995312192354
Signed-off-by: Abinaya P <abinayap@codeaurora.org>
2016-11-10 00:42:14 -08:00
Rafael J. Wysocki
3eb846e0d5 PM / sleep: Add support for read-only sysfs attributes
Some sysfs attributes in /sys/power/ should really be read-only,
so add support for that, convert those attributes to read-only
and drop the stub .store() routines from them.
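
The support is roughly a read-only counterpart to the existing
power_attr() helper (sketch; upstream the macro lives in
kernel/power/power.h):

  #define power_attr_ro(_name) \
  static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
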

Original-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit a1e9ca6967d68209c70e616a224efa89a6b86ca6)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2016-11-10 16:16:59 +08:00
James Morse
4d4fe8f2b2 PM / Hibernate: Call flush_icache_range() on pages restored in-place
Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches before the processor can fetch them as new
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Modify the read and decompress code to call
flush_icache_range() on all pages that are restored, so that the restored
in-place pages are guaranteed to be executable on these architectures.
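
In sketch form, per restored page:

  /* After copying a page back in place, clean the data cache so the
   * new bytes are visible to instruction fetch. */
  memcpy(dst, buf, PAGE_SIZE);
  flush_icache_range((unsigned long)dst,
                     (unsigned long)dst + PAGE_SIZE);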

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
[will: make clean_pages_on_* static and remove initialisers]
Signed-off-by: Will Deacon <will.deacon@arm.com>

(cherry picked from commit f6cf0545ec697ddc278b7457b7d0c0d86a2ea88e)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2016-11-10 15:55:00 +08:00
Abinaya P
04e7c994ca Revert "input: touchscreen: synaptics v1.1"
This reverts 'commit 7112993181 ("input: touchscreen: synaptics v1.1")'.
This change is not needed in the 4.4 kernel.

Change-Id: I89ab8f353bc04bc0a04d5f5a6993e8e8e5ebbd2e
Signed-off-by: Abinaya P <abinayap@codeaurora.org>
Signed-off-by: Shantanu Jain <shjain@codeaurora.org>
2016-11-10 12:40:32 +05:30
Vikram Mulukutla
4142e30898 timer: Don't wait for running timers when migrating during isolation
A CPU that is isolated needs to have its timers migrated off to
another CPU.  If there is a running timer while timers are being
migrated, acquiring the timer base lock after marking the CPU as
isolated will ensure that:

1) No more timers can be queued on to the isolated CPU, and
2) A running timer will finish execution on the to-be-isolated
   CPU, and so will any just expired timers since they're all
   taken off of the CPU's tvec1 in one go while the base lock
   is held.

Therefore there is no apparent reason to wait for the expired
timers to finish execution, and isolation can proceed to migrate
non-expired timers even when the expired ones are running
concurrently.

While we're here, also add a delay to the wait-loop inside
migrate_hrtimer_list to allow for store-exclusive fairness
when run_hrtimer is attempting to grab the hrtimer base
lock.

Change-Id: Ib697476c93c60e3d213aaa8fff0a2bcc2985bfce
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-11-09 15:57:24 -08:00
Linux Build Service Account
a97a6be4de Merge "sched: Ensure watchdog is enabled before disabling" 2016-11-08 11:19:07 -08:00