Commit graph

21845 commits

Author SHA1 Message Date
Arun Bharadwaj
9f6eb26ae8 tracing/sched: Track per-cpu rt and non-rt cpu_load.
Add a new tracepoint trace_sched_enq_deq_task to track
per-cpu rt and non-rt cpu_load during task enqueue
and dequeue.

This is useful to visualize and compare the load on
different cpus and also to understand how balanced
the load is at any point of time.

Note: We only print cpu_load[0] because we only care about
the most recent load history for tracking load balancer
effectiveness.

Change-Id: I46f0bb84e81652099ed5edf8c2686c70c8b8330c
Signed-off-by: Arun Bharadwaj <abharadw@codeaurora.org>
2016-03-23 19:58:34 -07:00
Srivatsa Vaddagiri
0ec4cf2484 sched: re-calculate a cpu's next_balance point upon sched domain changes
A cpus next_balance point could be stale when its being attached to a
sched domain hierarchy. That would lead to undesirable delay in cpu
doing a load balance and hence potentially affect scheduling
latencies for tasks. Fix that by initializing cpu's next_balance point
when its being attached to a sched domain hierarchy.

Change-Id: I855cff8da5ca28d278596c3bb0163b839d4704bc
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[rameezmustafa@codeaurora.org: Modify commit text to reflect dropped patches]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 19:58:34 -07:00
Steve Muckle
63249df6b2 sched: provide per cpu-cgroup option to notify on migrations
On systems where CPUs may run asynchronously, task migrations
between CPUs running at grossly different speeds can cause
problems.

This change provides a mechanism to notify a subsystem
in the kernel if a task in a particular cgroup migrates to a
different CPU. Other subsystems (such as cpufreq) may then
register for this notifier to take appropriate action when
such a task is migrated.

The cgroup attribute to set for this behavior is
"notify_on_migrate" .

Change-Id: Ie1868249e53ef901b89c837fdc33b0ad0c0a4590
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
[rameezmustafa@codeaurora.org: Use new cgroup APIs, fix 64-bit
			compilation issues and resolve some merge
			conflicts. Also squash "2bd8075 sched:
			remove migration notification from RT class"
			into this patch.]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: Incorporated with new __migrate_task().]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 19:58:33 -07:00
Srivatsa Vaddagiri
82f6fb4f5c sched: Fix SCHED_HRTICK bug leading to late preemption of tasks
SCHED_HRTICK feature is useful to preempt SCHED_FAIR tasks on-the-dot
(just when they would have exceeded their ideal_runtime). It makes use
of a a per-cpu hrtimer resource and hence alarming that hrtimer should
be based on total SCHED_FAIR tasks a cpu has across its various cfs_rqs,
rather than being based on number of tasks in a particular cfs_rq (as
implemented currently). As a result, with current code, its possible for
a running task (which is the sole task in its cfs_rq) to be preempted
much after its ideal_runtime has elapsed, resulting in increased latency
for tasks in other cfs_rq on same cpu.

Fix this by alarming sched hrtimer based on total number of SCHED_FAIR
tasks a CPU has across its various cfs_rqs.

Change-Id: I1f23680a64872f8ce0f451ac4bcae28e8967918f
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[rameezmustafa@codeaurora.org]: Squash "c24fb502 sched: fix reference to
				wrong cfs_rq" into this patch]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 19:58:32 -07:00
Steve Muckle
2c04026b17 kernel: reduce sleep duration in wait_task_inactive
Sleeping for an entire tick adds unnecessary latency to
hotplugging a cpu (cpu_up).

Change-Id: Iab323a79f4048bc9101ecfd368e0f275827ed4ab
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
[rameezmustafa@codeaurora.org]: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 19:58:31 -07:00
Steve Muckle
03ab01a318 sched: add sysctl for controlling task migrations on wake
The PF_WAKE_UP_IDLE per-task flag made it impossible to enable
the old behavior of SD_SHARE_PKG_RESOURCES, where every task
migrates to an idle CPU on wakeup.

The sched_wake_to_idle sysctl value, when made nonzero, will cause
all tasks to migrate to an idle CPU if one is available when the
task is woken up. This is regardless of how PF_WAKE_UP_IDLE is
configured for tasks in the system. Similar to PF_WAKE_UP_IDLE,
the SD_SHARE_PKG_RESOURCES scheduler domain flag must be enabled
for the sysctl value to have an effect.

Change-Id: I23bed846d26502c7aed600bfcf1c13053a7e5f61
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
(cherry picked from commit 9d5b38dc0025d19df5b756b16024b4269e73f282)
2016-03-23 19:58:30 -07:00
Matt Wagantall
cc06d4a91d sched/rt: Add Kconfig option to enable panicking for RT throttling
This may be useful for detecting and debugging RT throttling issues.

Change-Id: I5807a897d11997d76421c1fcaa2918aad988c6c9
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
[rameezmustafa@codeaurora.org]: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in lib/Kconfig.debug]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 19:58:29 -07:00
Matt Wagantall
841af4dbae sched/rt: print RT tasks when RT throttling is activated
Existing debug prints do not provide any clues about which tasks
may have triggered RT throttling. Print the names and PIDs of
all tasks on the throttled rt_rq to help narrow down the source
of the problem.

Change-Id: I180534c8a647254ed38e89d0c981a8f8bccd741c
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
[rameezmustafa@codeaurora.org]: Port to msm-3.18]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 19:58:28 -07:00
Steve Muckle
74b3b06c52 sched: add PF_WAKE_UP_IDLE
Certain workloads may benefit from the SD_SHARE_PKG_RESOURCES behavior
of waking their tasks up on idle CPUs. The feature has too much of a
negative impact on other workloads however to apply globally. The
PF_WAKE_UP_IDLE flag tells the scheduler to wake up tasks that have this
flag set, or tasks woken by tasks with this flag set, on an idle CPU
if one is available.

Change-Id: I20b28faf35029f9395e9d9f5ddd57ce2de795039
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict around set_wake_up_idle() in
 include/linux/sched.h]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 19:58:27 -07:00
Srivatsa Vaddagiri
8da8122d5f sched: Make the scheduler aware of C-state for cpus
C-state represents a power-state of a cpu. A cpu could have one or
more C-states associated with it. C-state transitions are based on
various factors (expected sleep time for example). "Deeper" C-states
implies longer wakeup latencies.

Scheduler needs to know wakeup latency associated with various C-states.
Having this information allows the scheduler to make better decisions
during task placement. For example:

- Prefer an idle cpu that is in the least shallow C-state
- Avoid waking up small tasks on a idle cpu unless it is in the least
  shallow C-state

This patch introduces APIs in the scheduler that can be used by the
architecture specific power-management driver to inform the scheduler
about C-states for cpus.

Change-Id: I39c5ae6dbace4f8bd96e88f75cd2d72620436dd1
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-03-23 19:58:27 -07:00
Prasad Sodagudi
805c18d71c lib: spinlock: Trigger a watchdog bite on spin_dump for rwlock
Currently dump_stack is printed once a spin_bug is detected for rwlock.
So provide an options to trigger a panic or watchdog bite for debugging
rwlock magic corruptions and lockups.

Change-Id: I20807e8eceb8b81635e58701d1f9f9bd36ab5877
[abhimany: replace msm with qcom]
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
2016-03-22 11:16:32 -07:00
Rohit Vaswani
77d758e283 lib: spinlock: Cause a watchdog bite on spin_dump
Currently we cause a BUG_ON once a spin_bug is detected, but
that causes a whole lot of processing and the other CPUs would
have proceeded to perform other actions and the state of the system
is moved by the time we can analyze it.
Provide an option to trigger  a watchdog bite instead so that we
can get the traces as close to the issue as possible.

Change-Id: Ic8d692ebd02c6940a3b4e5798463744db20b0026
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
2016-03-22 11:16:31 -07:00
Rohit Vaswani
a543061653 lib: spinlock_debug: Prevent continuous spin dump logs on panic
Once a spinlock lockup is detected on a CPU, we invoke a Kernel Panic.
During the panic handling, we might see more instances of spinlock
lockup from other CPUs. This causes the dmesg to be cluttered and makes
it cumbersome to detect what exactly happened.
Call spin_bug instead of calling spin_dump directly.

Change-Id: I57857a991345a8dac3cd952463d05129145867a8
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
2016-03-22 11:16:30 -07:00
Syed Rameez Mustafa
e2cddd1040 kernel/lib: add additional debug capabilites for data corruption
Data corruptions in the kernel often end up in system crashes that
are easier to debug closer to the time of detection. Specifically,
if we do not panic immediately after lock or list corruptions have been
detected, the problem context is lost in the ensuing system mayhem.
Add support for allowing system crash immediately after such corruptions
are detected. The CONFIG option controls the enabling/disabling of the
feature.

Change-Id: I9b2eb62da506a13007acff63e85e9515145909ff
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[abhimany: minor merge conflict resolution]
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
2016-03-22 11:16:29 -07:00
Se Wang (Patrick) Oh
dae9a397e1 kernel: fork: Call KASan alloc before release the thread info pages
the pages allocated for thread info is used for stack. KAsan marks
some stack memory region for guarding area and the bitmasks for
that region are not cleared until the pages are freed. When
CONFIG_PAGE_POISONING is enabled, as the pages still have special
bitmasks, a out of bound access KASan report arises during pages
poisoning. So mark the pages as alloc status before poisoning the
pages.
==================================================================
BUG: KASan: out of bounds on stack in memset+0x24/0x44 at addr ffffffc0b8e3f000
Write of size 4096 by task swapper/0/0
page:ffffffbacc38e760 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x4000000000000000()
page dumped because: kasan: bad access detected
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W      3.18.0-g5a4a5d5-07244-g488682c-dirty #12
Hardware name: Qualcomm Technologies, Inc. MSM 8996 v2.0 LiQUID (DT)
Call trace:
[<ffffffc00008c010>] dump_backtrace+0x0/0x250
[<ffffffc00008c270>] show_stack+0x10/0x1c
[<ffffffc001b6f9e4>] dump_stack+0x74/0xfc
[<ffffffc0002debf4>] kasan_report_error+0x2b0/0x408
[<ffffffc0002dee28>] kasan_report+0x34/0x40
[<ffffffc0002de240>] __asan_storeN+0x15c/0x168
[<ffffffc0002de47c>] memset+0x20/0x44
[<ffffffc0002d77bc>] kernel_map_pages+0x2e8/0x384
[<ffffffc000266458>] free_pages_prepare+0x340/0x3a0
[<ffffffc0002694cc>] __free_pages_ok+0x20/0x12c
[<ffffffc00026a698>] __free_pages+0x34/0x44
[<ffffffc00026abb0>] free_kmem_pages+0x68/0x80
[<ffffffc0000b0424>] free_task+0x80/0xac
[<ffffffc0000b05a8>] __put_task_struct+0x158/0x23c
[<ffffffc0000b9194>] delayed_put_task_struct+0x188/0x1cc
[<ffffffc00018586c>] rcu_process_callbacks+0x6cc/0xbb0
[<ffffffc0000bfdb0>] __do_softirq+0x368/0x750
[<ffffffc0000c0630>] irq_exit+0xd8/0x15c
[<ffffffc00016f610>] __handle_domain_irq+0x108/0x168
[<ffffffc000081af8>] gic_handle_irq+0x50/0xc0
Memory state around the buggy address:
 ffffffc0b8e3f980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffffc0b8e3fa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >ffffffc0b8e3fa80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
                                            ^
 ffffffc0b8e3fb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffffc0b8e3fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Change-Id: I90aa1c6e82a0bde58d2d5d68d84e67f932728a88
Signed-off-by: Se Wang (Patrick) Oh <sewango@codeaurora.org>
2016-03-22 11:10:44 -07:00
Andrey Ryabinin
bbf900d8b6 kernel: printk: specify alignment for struct printk_log
On architectures that have support for efficient unaligned access struct
printk_log has 4-byte alignment.  Specify alignment attribute in type
declaration.

The whole point of this patch is to fix deadlock which happening when
UBSAN detects unaligned access in printk() thus UBSAN recursively calls
printk() with logbuf_lock held by top printk() call.

CRs-Fixed: 969533
Change-Id: Iee822667b9104a69c3d717caeded0fcde3f6d560
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yury Gribov <y.gribov@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kostya Serebryany <kcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-repo: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/
Git-commit: 5c9cf8af2e77388f1da81c39237fb4f20c2f85d5
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
[satyap@codeaurora.org: trivial merge conflict resolution]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2016-03-22 11:09:56 -07:00
Tianyi Gou
5b047145cd net/ipv6/addrconf: IPv6 tethering enhancement
Added new procfs flag to toggle the automatic addition of prefix
routes on a per device basis. The new flag is accept_ra_prefix_route.
Defaults to 1 as to not break existing behavior.

CRs-Fixed: 435320
Change-Id: If25493890c7531c27f5b2c4855afebbbbf5d072a
Acked-by: Harout S. Hedeshian <harouth@qti.qualcomm.com>
Signed-off-by: Tianyi Gou <tgou@codeaurora.org>
[subashab@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
2016-03-22 11:09:54 -07:00
Mao Jinlong
f0aafc9926 rtc: alarm: Add power-on alarm feature
Android does not support powering-up the phone through alarm.
Set rtc alarm in timerfd to power-up the phone after alarm
expiration.

Change-Id: I781389c658fb00ba7f0ce089d706c10f202a7dc6
Signed-off-by: Mao Jinlong <c_jmao@codeaurora.org>
2016-03-22 11:08:58 -07:00
Mao Jinlong
439e21802e alarmtimer: add rtc irq support for alarm
Add the rtc irq support for alarmtimer to wakeup the
alarm during system suspend.

Change-Id: I41b774ed4e788359321e1c6a564551cc9cd40c8e
Signed-off-by: Xiaocheng Li <lix@codeaurora.org>
2016-03-22 11:08:57 -07:00
Matt Wagantall
6b75e9afc1 soc: qcom: rq_stats: add snapshot of run queue stats driver
Add a snapshot of the run queue stats driver as of msm-3.10 commit
4bf320bd ("Merge "ASoC: msm8952: set async flag for 8952 dailink"")

Resolve checkpatch warnings in the process, notably the replacement
of sscanf with kstrtouint.

Change-Id: I7e2f98223677e6477df114ffe770c0740ed37de9
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
2016-03-22 11:08:42 -07:00
Karthikeyan Ramasubramanian
e9deb36997 trace: ipc_logging: Use virtual counter
Using the physical counter leads to a kernel BUG_ON(). Update the
IPC Logging Driver to use virtual counter.

Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
2016-03-22 11:07:57 -07:00
Karthikeyan Ramasubramanian
42e7b9ac7c trace: Add snapshot of ipc_logging driver
This snapshot is taken as of msm-3.18 commit e70ad0cd (Promotion of
kernel.lnx.3.18-151201.)

Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
2016-03-22 11:07:56 -07:00
Stepan Moskovchenko
1ca3decb1b smp: Allow booting a specific subset of CPUs
In a heterogenous multiprocessor system, specifying the
'maxcpus' parameter on the kernel command line does not
provide sufficient control over which CPUs are brought
online at kernel boot time, since CPUs may have nonuniform
performance characteristics. Thus, we introduce a
'boot_cpus' command line argument, allowing the user to
explicitly specify the list of CPUs that shall be brought
online during kernel boot.

Change-Id: I5f119e23202660941fa7be8c4e6dd91a82365451
Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
[abhimany: resolve trivial merge conflicts]
Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
2016-03-22 11:07:48 -07:00
Murali Nalajala
b3445ac6ea lpm-levels: Do not disable non-sec interrupts in suspend
When the system suspend is happening, last core disables
the non-sec interrupts at QGIC by setting the GRPEN1_EL1_NS
to ZERO. This makes core not seen any non-sec interrupts
and would result into system do not wake up from any of
interrupts. Do not touch GRPEN1_EL1_NS register while
system is going into suspend.

Change-Id: I7d6c5047fb4743df187fe49fba18b64db3179bc9
Signed-off-by: Murali Nalajala <mnalajal@codeaurora.org>

Conflicts:
	drivers/irqchip/irq-gic-common.h
2016-03-22 11:07:22 -07:00
Mahesh Sivasubramanian
885262005f cpuidle: lpm-levels: Fixes for clockevents_notify
The use of clockevents_notify is deprcated and targeted APIs are used
instead of the clockevents_notify callbacks. Switch broadcast timer
notifications to tick_broadcast_enter and tick_broadcast_exit.

Change-Id: I3441873eb4009b105db04f4a18d28ae9ccd07e95
2016-03-22 11:07:21 -07:00
Murali Nalajala
764f9334a3 cpu_pm: Add level to the cluster pm notification
Cluster pm notifications without level information increases difficulty
and complexity for the registered drivers to figure out when the last
coherency level is going into power collapse.

Send notifications with level information that allows the registered
drivers to easily determine the cluster level that is going in/out of
power collapse.

There is an issue with this implementation. GIC driver saves and
restores the distributed registers as part of cluster notifications. On
newer platforms there are multiple cluster levels are defined (e.g l2,
cci etc). These cluster level notofications can happen independently.
On MSM platforms GIC is still active while the cluster sleeps in idle,
causing the GIC state to be overwritten with an incorrect previous state
of the interrupts. This leads to a system hang. Do not save and restore
on any L2 and higher cache coherency level sleep entry and exit.

Change-Id: I31918d6383f19e80fe3b064cfaf0b55e16b97eb6
Signed-off-by: Archana Sathyakumar <asathyak@codeaurora.org>
Signed-off-by: Murali Nalajala <mnalajal@codeaurora.org>
Signed-off-by: Mahesh Sivasubramanian <msivasub@codeaurora.org>
Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
2016-03-22 11:07:20 -07:00
Venkat Gopalakrishnan
014929f975 block/fs: keep track of the task that dirtied the page
Background writes happen in the context of a background thread.
It is very useful to identify the actual task that generated the
request instead of background task that submited the request.
Hence keep track of the task when a page gets dirtied and dump
this task info while tracing. Not all the pages in the bio are
dirtied by the same task but most likely it will be, since the
sectors accessed on the device must be adjacent.

Change-Id: I6afba85a2063dd3350a0141ba87cf8440ce9f777
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
[venkatg@codeaurora.org: Fixed trivial merge conflicts]
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
2016-03-22 11:02:00 -07:00
Mahesh Sivasubramanian
b1060581fb qos: Add support for PM_QOS_SUM type
PM_QOS_SUM is a new enum type supported in the upstream kernel. The target
qos value for PM_QOS_SUM type is updated as the sum of all the priorities
that are applicable to the current CPU.

Change-Id: I89152db4fbbf08db113b52e6c5fee4aba9b70933
Signed-off-by: Mahesh Sivasubramanian <msivasub@codeaurora.org>
2016-03-01 12:22:50 -08:00
Mahesh Sivasubramanian
394da4e230 qos: Pass the list of cpus with affected qos to notifier
Send the list of cpus whose qos has been affected along with the changed
value. Driver listening in for notifier can use this to apply the qos value
for the respective cpus.

Change-Id: I8f3c2ea624784c806c55de41cc7c7fcf8ebf02da
Signed-off-by: Mahesh Sivasubramanian <msivasub@codeaurora.org>
[mattw@codeaurora.org: resolve trivial context conflicts]
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>

Conflicts:
	kernel/power/qos.c
2016-03-01 12:22:50 -08:00
Praveen Chidambaram
e62cf98098 QoS: Enhance framework to support cpu/irq specific QoS requests
QoS request for CPU_DMA_LATENCY can be better optimized if the request
can be set only for the required cpus and not all cpus. This helps save
power on other cores, while still gauranteeing the quality of service.

Enhance the QoS constraints data structures to support target value for
each core. Requests specify if the QoS is applicable to all cores
(default) or to a selective subset of the cores or to a core(s), that the
IRQ is affine to.

QoS requests that need to track an IRQ can be set to apply only on the
cpus to which the IRQ's smp_affinity attribute is set to. The QoS
framework will automatically track IRQ migration between the cores. The
QoS is updated to be applied only to the core(s) that the IRQ has been
migrated to.

Idle and interested drivers can request a PM QoS value for a constraint
across all cpus, or a specific cpu or a set of cpus. Separate APIs have
been added to request for individual cpu or a cpumask.  The default
behaviour of PM QoS is maintained i.e, requests that do not specify a
type of the request will continue to be effected on all cores.  Requests
that want to specify an affinity of cpu(s) or an irq, can modify the PM
QoS request data structures by specifying the type of the request and
either the mask of the cpus or the IRQ number depending on the type.
Updating the request does not reset the type of the request.

The userspace sysfs interface does not support CPU/IRQ affinity.

Change-Id: I09ae85a1e8585d44440e86d63504ad734e8e3e36
Signed-off-by: Praveen Chidambaram <pchidamb@codeaurora.org>

Conflicts:
	kernel/power/qos.c
2016-03-01 12:22:50 -08:00
Praveen Chidambaram
259672e3c7 QoS: Modify data structures and function arguments for scalability.
QoS add requests uses a handle to the priority list that is used
internally to save the request, but this does not extend well. Also,
dev_pm_qos structure definition seems to use a list object directly.
The 'derivative' relationship seems to be broken.

Use pm_qos_request objects instead of passing around the protected
priority list object.

Change-Id: Ie4c9c22dd4ea13265fe01f080ba68cf77d9d484d
Signed-off-by: Praveen Chidambaram <pchidamb@codeaurora.org>
[mattw@codeaurora.org: resolve context conflicts and extend
struct modifications to additional affected users]
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>

Conflicts:
	include/linux/pm_qos.h
2016-03-01 12:22:49 -08:00
Se Wang (Patrick) Oh
05635f8cd7 trace: rtb: disable RTB in the first panic notifier
As the priority of RTB panic notifier was zero, it was
not guaranteed to disable RTB right after kernel panic.
So RTB log buffer could be flooded with some I/O operations
after panic. By setting the priority of RTB panic notifier
to the highest value, make sure RTB is disabled right after
a kernel panic.

Change-Id: If9efc2ec31efa6aa17e92b2b01e81ab4df6d1730
Signed-off-by: Se Wang (Patrick) Oh <sewango@codeaurora.org>
2016-03-01 12:22:26 -08:00
Vignesh Radhakrishnan
7cc3dac353 msm: rtb: Add timestamp to rtb logging
RTB logging currently doesn't log the time
at which the logging was done. This can be
useful to compare with dmesg during debug.
The bytes for timestamp are taken by reducing
the sentinel array size to three from eleven
thus giving the extra 8 bytes to store time.
This maintains the size of the layout at 32.

Change-Id: Ifc7e4d2e89ed14d2a97467891ebefa9515983630
Signed-off-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
2016-03-01 12:22:25 -08:00
Matt Wagantall
6a0fccbd29 trace: rtb: add msm_rtb register tracing feature snapshot
This snapshot is taken as of msm-3.10 commit:
 78c36fa0ef (Merge "msm: mdss: Prevent backlight update during
 continuous splash")

RTB support captures system events such as register writes to a
small uncached region. This is designed to aid in debugging, where
it may be useful to know the last events that occurred prior to a
device reset.

Change-Id: Idc51e618380f58a6803f40c47f2b3d29033b3196
Signed-off-by: Matt Wagantall <mattw@codeaurora.org>
[spjoshi@codeaurora.org: fix merge conflict]
Signed-off-by: Sarangdhar Joshi <spjoshi@codeaurora.org>
2016-03-01 12:22:25 -08:00
David Collins
d4b065ff47 sysctl: add boot_reason and cold_boot sysctl entries for arm64
Define boot_reason and cold_boot variables in the arm64 version
of setup.c so that arm64 targets can export the boot_reason and
cold_boot sysctl entries.

This feature is required by the qpnp-power-on driver.

Change-Id: Id2d4ff5b8caa2e6a35d4ac61e338963d602c8b84
Signed-off-by: David Collins <collinsd@codeaurora.org>
[osvaldob: resolved trival merge conflicts]
Signed-off-by: Osvaldo Banuelos <osvaldob@codeaurora.org>
2016-03-01 12:22:13 -08:00
Colin Cross
55d35d82b7 hardlockup: detect hard lockups without NMIs using secondary cpus
Emulate NMIs on systems where they are not available by using timer
interrupts on other cpus.  Each cpu will use its softlockup hrtimer
to check that the next cpu is processing hrtimer interrupts by
verifying that a counter is increasing.

This patch is useful on systems where the hardlockup detector is not
available due to a lack of NMIs, for example most ARM SoCs.
Without this patch any cpu stuck with interrupts disabled can
cause a hardware watchdog reset with no debugging information,
but with this patch the kernel can detect the lockup and panic,
which can result in useful debugging info.

Change-Id: Ia5faf50243e19c1755201212e04c8892d929785a
Signed-off-by: Colin Cross <ccross@android.com>
2016-02-16 13:54:19 -08:00
dcashman
d49d88766b FROMLIST: mm: mmap: Add new /proc tunable for mmap_base ASLR.
(cherry picked from commit https://lkml.org/lkml/2015/12/21/337)

ASLR  only uses as few as 8 bits to generate the random offset for the
mmap base address on 32 bit architectures. This value was chosen to
prevent a poorly chosen value from dividing the address space in such
a way as to prevent large allocations. This may not be an issue on all
platforms. Allow the specification of a minimum number of bits so that
platforms desiring greater ASLR protection may determine where to place
the trade-off.

Bug: 24047224
Signed-off-by: Daniel Cashman <dcashman@android.com>
Signed-off-by: Daniel Cashman <dcashman@google.com>
Change-Id: Ibf9ed3d4390e9686f5cc34f605d509a20d40e6c2
2016-02-16 13:54:14 -08:00
Amit Pundir
29a4f01daa mm: private anonymous memory build fixes for 4.4
Update vma_merge() call in private anonymous memory prctl,
introduced in AOSP commit ee8c5f78f09a
"mm: add a field to store names for private anonymous memory",
so as to align with changes from upstream commit 19a809afe2
"userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx".

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-02-16 13:54:13 -08:00
Colin Cross
586278d78b mm: add a field to store names for private anonymous memory
Userspace processes often have multiple allocators that each do
anonymous mmaps to get memory.  When examining memory usage of
individual processes or systems as a whole, it is useful to be
able to break down the various heaps that were allocated by
each layer and examine their size, RSS, and physical memory
usage.

This patch adds a user pointer to the shared union in
vm_area_struct that points to a null terminated string inside
the user process containing a name for the vma.  vmas that
point to the same address will be merged, but vmas that
point to equivalent strings at different addresses will
not be merged.

Userspace can set the name for a region of memory by calling
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name);
Setting the name to NULL clears it.

The names of named anonymous vmas are shown in /proc/pid/maps
as [anon:<name>] and in /proc/pid/smaps in a new "Name" field
that is only present for named vmas.  If the userspace pointer
is no longer valid all or part of the name will be replaced
with "<fault>".

The idea to store a userspace pointer to reduce the complexity
within mm (at the expense of the complexity of reading
/proc/pid/mem) came from Dave Hansen.  This results in no
runtime overhead in the mm subsystem other than comparing
the anon_name pointers when considering vma merging.  The pointer
is stored in a union with fieds that are only used on file-backed
mappings, so it does not increase memory usage.

Includes fix from Jed Davis <jld@mozilla.com> for typo in
prctl_set_vma_anon_name, which could attempt to set the name
across two vmas at the same time due to a typo, which might
corrupt the vma list.  Fix it to use tmp instead of end to limit
the name setting to a single vma at a time.

Change-Id: I9aa7b6b5ef536cd780599ba4e2fba8ceebe8b59f
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2016-02-16 13:54:13 -08:00
Rik van Riel
f8ade3666c add extra free kbytes tunable
Add a userspace visible knob to tell the VM to keep an extra amount
of memory free, by increasing the gap between each zone's min and
low watermarks.

This is useful for realtime applications that call system
calls and have a bound on the number of allocations that happen
in any short time period.  In this application, extra_free_kbytes
would be left at an amount equal to or larger than than the
maximum number of allocations that happen in any burst.

It may also be useful to reduce the memory use of virtual
machines (temporarily?), in a way that does not cause memory
fragmentation like ballooning does.

[ccross]
Revived for use on old kernels where no other solution exists.
The tunable will be removed on kernels that do better at avoiding
direct reclaim.

Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e
Signed-off-by: Rik van Riel<riel@redhat.com>
Signed-off-by: Colin Cross <ccross@android.com>
2016-02-16 13:54:12 -08:00
Arve Hjønnevåg
2774238025 ARM: Fix "Make low-level printk work" to use a separate config option
Change-Id: I5ca8db61b595adc642a07ea187bd41fd7636840e
Signed-off-by: Arve Hjønnevåg <arve@android.com>
2016-02-16 13:54:03 -08:00
Nishanth Menon
4e09c51018 panic: Add board ID to panic output
At times, it is necessary for boards to provide some additional information
as part of panic logs. Provide information on the board hardware as part
of panic logs.

It is safer to print this information at the very end in case something
bad happens as part of the information retrieval itself.

To use this, set global mach_panic_string to an appropriate string in the
board file.

Change-Id: Id12cdda87b0cd2940dd01d52db97e6162f671b4d
Signed-off-by: Nishanth Menon <nm@ti.com>
2016-02-16 13:54:02 -08:00
Tony Lindgren
3200304ca3 ARM: Make low-level printk work
Makes low-level printk work.

Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-02-16 13:54:01 -08:00
San Mehat
9d19f72b43 proc: smaps: Allow smaps access for CAP_SYS_RESOURCE
Signed-off-by: San Mehat <san@google.com>
2016-02-16 13:53:50 -08:00
Micha Kalfon
d4d049c55d prctl: make PR_SET_TIMERSLACK_PID pid namespace aware
Make PR_SET_TIMERSLACK_PID consider pid namespace and resolve the
target pid in the caller's namespace. Otherwise, calls from pid
namespace other than init would fail or affect the wrong task.

Change-Id: I1da15196abc4096536713ce03714e99d2e63820a
Signed-off-by: Micha Kalfon <micha@cellrox.com>
Acked-by: Oren Laadan <orenl@cellrox.com>
2016-02-16 13:53:49 -08:00
Micha Kalfon
18f42f60be prctl: fix misplaced PR_SET_TIMERSLACK_PID case
The case clause for the PR_SET_TIMERSLACK_PID option was placed inside
the an internal switch statement for PR_MCE_KILL (see commits 37a591d4
and 8ae872f1) . This commit moves it to the right place.

Change-Id: I63251669d7e2f2aa843d1b0900e7df61518c3dea
Signed-off-by: Micha Kalfon <micha@cellrox.com>
Acked-by: Oren Laadan <orenl@cellrox.com>
2016-02-16 13:53:48 -08:00
Ruchi Kandoi
2476d3c241 prctl: adds the capable(CAP_SYS_NICE) check to PR_SET_TIMERSLACK_PID.
Adds a capable() check to make sure that arbitary apps do not change the
timer slack for other apps.

Bug: 15000427
Change-Id: I558a2551a0e3579c7f7e7aae54b28aa9d982b209
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
2016-02-16 13:53:48 -08:00
Ruchi Kandoi
f2902f9065 prctl: adds PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.
Second argument is similar to PR_SET_TIMERSLACK, if non-zero then the
slack is set to that value otherwise sets it to the default for the thread.

Takes PID of the thread as the third argument.

This allows power/performance management software to set timer slack for
other threads according to its policy for the thread (such as when the
thread is designated foreground vs. background activity)

Change-Id: I744d451ff4e60dae69f38f53948ff36c51c14a3f
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
2016-02-16 13:53:47 -08:00
Amit Pundir
703920c14a cgroup: refactor allow_attach handler for 4.4
Refactor *allow_attach() handler to align it with the changes
from mainline commit 1f7dd3e5a6 "cgroup: fix handling of
multi-destination migration from subtree_control enabling".

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-02-16 13:53:46 -08:00
Dmitry Shmidt
69db8fca42 cgroup: fix cgroup_taskset_for_each call in allow_attach() for 4.1
Change-Id: I05013f6e76c30b0ece3671f9f2b4bbdc626cd35c
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2016-02-16 13:53:46 -08:00