Commit graph

563227 commits

Author SHA1 Message Date
Todd Kjos
8935b6b4d2 FIXUP: sched: Fix double-release of spinlock in move_queued_task
BUG: 29519455
Change-Id: I4d1c27a1b4bcbba03d4b175d170cfe1701a90ffd
2016-08-11 14:26:47 -07:00
Todd Kjos
740d312ce8 FIXUP: sched/fair: Fix hang during suspend in sched_group_energy
BUG: 29353986
Change-Id: I0d0d8d5c107a2e0bd219819e036091106bb40e11
2016-08-11 14:26:46 -07:00
Patrick Bellasi
5156b67204 FIXUP: sched: fix SchedFreq integration for both PELT and WALT
The current kernel allows to use either PELT or WALT to track CPUs utilizations.
One of the main differences between the two approaches is that PELT
tracks only utilization of SCHED_OTHER classes while WALT tracks all tasks
with a single signal.

The current sched_freq_tick does not make this distinction and, when WALT
is in use, we end up adding multiple time the contribution related to
the RT and DL classes. This patch fixes this issue by:

1. providing two different code paths for PELT and WALT, thus granting that
   when we switch to PELT we get the original behaviour based on the assumption
   that class aggregations is done underneath by SchedFreq.

2. avoiding the double accounting of DL and RT workloads, when WALT is in use,
   by just adding a margin to the original WALT signal when we need to check
   if the CFS capacity has to be increased.

Change-Id: I7326fd50e868e97fb5e12351917e9d2969bfdae7
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-11 14:26:45 -07:00
Todd Kjos
782c9d64a1 sched: EAS: Avoid causing spikes to max-freq unnecessarily
During scheduler tick handling, the frequency was being set to
max-freq if the current frequency is less than the current
utilization. Change to just request "right" frequency instead
of max.

BUG: 29871410
Change-Id: I6fe65b14413da44b1520ba116f72320083eb92f8
2016-08-11 14:26:45 -07:00
Patrick Bellasi
dfc1151b46 FIXUP: sched: fix set_cfs_cpu_capacity when WALT is in use
The CPU utilization reported when WALT is in use already tracks the
contributions due to RT and DL workloads. However, SchedFreq exposes
different capacity update functions, one for each class, and does classes
utilization internally at update_cpu_capacity_request() call time.

This patch ensures that when WALT is in use, the:
  cpu_sched_capacity_reqs::cfs
value is tracking just the load generated by SCHED_OTHER tasks.

Change-Id: Ibd9c9a10874a1d91f62477034548f7664e57cd6a
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-11 14:26:44 -07:00
Srinath Sridharan
519c62750e sched/walt: Accounting for number of irqs pending on each core
Schedules on a core whose irq count is less than a threshold.
Improves I/O performance of EAS.

Change-Id: I08ff7dd0d22502a0106fc636b1af2e6fe9e758b5
2016-08-11 14:26:43 -07:00
Srivatsa Vaddagiri
efb86bd08a sched: Introduce Window Assisted Load Tracking (WALT)
use a window based view of time in order to track task
demand and CPU utilization in the scheduler.

Window Assisted Load Tracking (WALT) implementation credits:
 Srivatsa Vaddagiri, Steve Muckle, Syed Rameez Mustafa, Joonwoo Park,
 Pavan Kumar Kondeti, Olav Haugan

2016-03-06: Integration with EAS/refactoring by Vikram Mulukutla
            and Todd Kjos

Change-Id: I21408236836625d4e7d7de1843d20ed5ff36c708

Includes fixes for issues:

eas/walt: Use walt_ktime_clock() instead of ktime_get_ns() to avoid a
race resulting in watchdog resets
BUG: 29353986
Change-Id: Ic1820e22a136f7c7ebd6f42e15f14d470f6bbbdb

Handle walt accounting anomoly during resume

During resume, there is a corner case where on wakeup, a task's
prev_runnable_sum can go negative. This is a workaround that
fixes the condition and warns (instead of crashing).

BUG: 29464099
Change-Id: I173e7874324b31a3584435530281708145773508

Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-11 14:26:43 -07:00
Patrick Bellasi
345be81dba sched/tune: fix PB and PC cuts indexes definition
The current definition of the Performance Boost (PB) and Performance Constraint
(PC) regions is has two main issues:
1) in the computation of the boost index we overflow the thresholds_gains
   table for boost=100
2) the two cuts had _NOT_ the same ratio

The last point means that when boost=0 we do _not_ have a "standard" EAS
behaviour, i.e. accepting all candidate which decrease energy regardless
of their impact on performances. Instead, we accept only schedule candidate
which are in the Optimal region, i.e. decrease energy while increasing
performances.

This behaviour can have a negative impact also on CPU selection policies
which tries to spread tasks to reduce latencies. Indeed, for example
we could end up rejecting a schedule candidate which want to move a task
from a congested CPU to an idle one while, specifically in the case where
the target CPU will be running on a lower OPP.

This patch fixes these two issues by properly clamping the boost value
in the appropriate range to compute the threshold indexes as well as
by using the same threshold index for both cuts.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>

sched/tune: fix update of threshold index for boost groups

When SchedTune is configured to work with CGroup mode, each time we update
the boost value of a group we do not update the threshed indexes for the
definition of the Performance Boost (PC) and Performance Constraint (PC)
region. This means that while the OPP boosting and CPU biasing selection
is working as expected, the __schedtune_accept_deltas function is always
using the initial values for these cuts.

This patch ensure that each time a new boost value is configured for a
boost group, the cuts for the PB and PC region are properly updated too.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>

sched/tune: update PC and PB cuts definition

The current definition of Performance Boost (PB) and Performance
Constraint (PC) cuts defines two "dead regions":
- up to 20% boost: we are in energy-reduction only mode, i.e.
  accept all candidate which reduce energy
- over 70% boost: we are in performance-increase only mode, i.e.
  accept only sched candidate which do not reduce performances

This patch uses a more fine grained configuration where these two "dead
regions" are reduced to: up to 10% and over 90%.
This should allow to have some boosting benefits starting from 10% boost
values as well as not being to much permissive starting from boost values
of 80%.

Suggested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>

bug: 28312446
Change-Id: Ia326c66521e38c98e7a7eddbbb7c437875efa1ba

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-11 14:26:42 -07:00
Todd Kjos
369bcbbae4 sched/fair: optimize idle cpu selection for boosted tasks
find_best_target CPU selection is biased towards lower CPU IDs. Bias
towards higher CPUs for boosted tasks. For boosted tasks unconditionally
use the idle CPU returned by find_best_target.

BUG: 29512132
Change-Id: I3d650051752163fcf3dc7909751d1fde3f9d17c0

Conflicts:
	kernel/sched/fair.c
2016-08-11 14:26:41 -07:00
Patrick Bellasi
af14760e19 FIXUP: sched/tune: fix accounting for runnable tasks
Contains:

sched/tune: fix accounting for runnable tasks (1/5)

The accounting for tasks into boost groups of different CPUs is currently
broken mainly because:
a) we do not properly track the change of boost group of a RUNNABLE task
b) there are race conditions between migration code and accounting code

This patch provides a fixes to ensure enqueue/dequeue
accounting also for throttled tasks.

Without this patch is can happen that a task is enqueued into a throttled
RQ thus not being accounted for the boosting of the corresponding RQ.
We could argue that a throttled task should not boost a CPU, however:
a) properly implementing CPU boosting considering throttled tasks will
   increase a lot the complexity of the solution
b) it's not easy to quantify the benefits introduced by such a more
   complex solution

Since task throttling requires the usage of the CFS bandwidth controller,
which is not widely used on mobile systems (at least not by Android kernels
so far), for the time being we go for the simple solution and boost also
for throttled RQs.

sched/tune: fix accounting for runnable tasks (2/5)

This patch provides the code required to enforce proper locking.
A per boost group spinlock has been added to grant atomic
accounting of tasks as well as to serialise enqueue/dequeue operations,
triggered by tasks migrations, with cgroups's attach/detach operations.

sched/tune: fix accounting for runnable tasks (3/5)

This patch adds cgroups {allow,can,cancel}_attach callbacks.

Since a task can be migrated between boost groups while it's running,
the CGroups's attach callbacks have been added to properly migrate
boost contributions of RUNNABLE tasks.

The RQ's lock is used to serialise enqueue/dequeue operations, triggered
by tasks migrations, with cgroups's attach/detach operations. While the
SchedTune's CPU lock is used to grant atrocity of the accounting within
the CPU.

NOTE: the current implementation does not allows a concurrent CPU migration
      and CGroups change.

sched/tune: fix accounting for runnable tasks (4/5)

This fixes accounting for exiting tasks by adding a dedicated call early
in the do_exit() syscall, which disables SchedTune accounting as soon as a
task is flagged PF_EXITING.

This flag is set before the multiple dequeue/enqueue dance triggered
by cgroup_exit() which is useful only to inject useless tasks movements
thus increasing possibilities for race conditions with the migration code.
The schedtune_exit_task() call does the last dequeue of a task from its
current boost group. This is a solution more aligned with what happens in
mainline kernels (>v4.4) where the exit_cgroup does not move anymore a dying
task to the root control group.

sched/tune: fix accounting for runnable tasks (5/5)

To avoid accounting issues at startup, this patch disable the SchedTune
accounting until the required data structures have been properly
initialized.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-11 14:26:41 -07:00
Patrick Bellasi
7f8f24a0ea sched/tune: use a single initialisation function
With the introduction of initialization function required to compute the
energy normalization constants from DTB at boot time, we have now a
late_initcall which is already used by SchedTune.

This patch consolidate within that function the other initialization
bits which was previously deferred to the first CGroup creation.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-10 15:21:52 -07:00
Patrick Bellasi
274bbcfbe4 sched/{fair,tune}: simplify fair.c code
The usage of conditional compiled code is discouraged in fair.c.

This patch clean up a bit fair.c by moving schedtune_{cpu.task}_boost
definitions into tune.h.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-10 15:20:38 -07:00
Patrick Bellasi
254f5095a8 FIXUP: sched/tune: fix payoff calculation for boost region
The definition of the acceptance regions as well as the translation of
these regions into a payoff value was both wrong which turned out in:
a) a wrong definition of payoff for the performance boost region
b) a correct "by chance" definition of the payoff for the performance
   constraint region (i.e. two sign errors together fixing the formula)

This patch provides a better description of the cut regions as well as
a fixed version of the payoff computations, which are now reduced to a
single formula usable for both cases.

Reported-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-10 15:18:40 -07:00
Srinath Sridharan
00aae8d5d5 sched/tune: Add support for negative boost values
Change-Id: I164ee04ba98c3a776605f18cb65ee61b3e917939

Contains also:

eas/stune: schedtune cpu boost_max must be non-negative.

This is to avoid under-accounting cpu capacity which may
cause task stacking and frequency spikes.

Change-Id: Ie1c1cbd52a6edb77b4c15a830030aa748dff6f29
2016-08-10 15:18:35 -07:00
Patrick Bellasi
6ba071d89d FIX: sched/tune: move schedtune_nornalize_energy into fair.c
The energy normalization function is required to get the proper values
for the P-E space filtering function to work.
That normalization is part of the hot wakeup path and currently implemented
with a function call.

Moving the normalization function into fair.c allows the compiler to
further optimize that code by reducing overheads in the wakeup hot path.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-10 15:17:45 -07:00
Patrick Bellasi
28e8cb961c FIX: sched/tune: update usage of boosted task utilisation on CPU selection
A boosted task needs to be scheduled on a CPU which can grant a minimum
capacity which is higher than its utilization.
However, a task can be allocated on a CPU which already provides an utilization
which is higher than the task boosted utilization itself.
Moreover, with the previous approach a task 100% boosted is not fitting any
CPU.

This patch makes use of the boosted task utilization just as a threashold
which defines the minimum capacity should be available on a CPU to host that
task.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-10 15:16:02 -07:00
Todd Kjos
c50cc2299c sched/fair: add tunable to set initial task load
The choice of initial task load upon fork has a large influence
on CPU and OPP selection when scheduler-driven DVFS is in use.
Make this tuneable by adding a new sysctl "sched_initial_task_util".

If the sched governor is not used, the default remains at SCHED_LOAD_SCALE
Otherwise, the value from the sysctl is used. This defaults to 0.

Signed-off-by: "Todd Kjos <tkjos@google.com>"
2016-08-10 15:15:55 -07:00
Juri Lelli
4a5e890ec6 sched/fair: add tunable to force selection at cpu granularity
EAS assumes that clusters with smaller capacity cores are more
energy-efficient. This may not be true on non-big-little devices,
so EAS can make incorrect cluster selections when finding a CPU
to wake. The "sched_is_big_little" hint can be used to cause a
cpu-based selection instead of cluster-based selection.

This change incorporates the addition of the sync hint enable patch

EAS did not honour synchronous wakeup hints, a new sysctl is
created to ask EAS to use this information when selecting a CPU.
The control is called "sched_sync_hint_enable".

Also contains:

EAS: sched/fair: for SMP bias toward idle core with capacity

For SMP devices, on wakeup bias towards idle cores that have capacity
vs busy devices that need a higher OPP

eas: favor idle cpus for boosted tasks

BUG: 29533997
BUG: 29512132
Change-Id: I0cc9a1b1b88fb52916f18bf2d25715bdc3634f9c
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Srinath Sridharan <srinathsr@google.com>

eas/sched/fair: Favoring busy cpus with low OPPs

BUG: 29533997
BUG: 29512132
Change-Id: I9305b3239698d64278db715a2e277ea0bb4ece79

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2016-08-10 15:15:46 -07:00
Srinath Sridharan
2e9abbc942 sched: EAS: take cstate into account when selecting idle core
Introduce a new sysctl for this option, 'sched_cstate_aware'.
When this is enabled, select_idle_sibling in CFS is modified to
choose the idle CPU in the sibling group which has the lowest
idle state index - idle state indexes are assumed to increase
as sleep depth and hence wakeup latency increase. In this way,
we attempt to minimise wakeup latency when an idle CPU is
required.

Signed-off-by: Srinath Sridharan <srinathsr@google.com>

Includes:
sched: EAS: fix select_idle_sibling

when sysctl_sched_cstate_aware is enabled, best_idle cpu will not be chosen
in the original flow because it will goto done directly

Bug: 30107557
Change-Id: Ie09c2e3960cafbb976f8d472747faefab3b4d6ac
Signed-off-by: martin_liu <martin_liu@htc.com>
2016-08-10 15:15:39 -07:00
Srinath Sridharan
d753e92e19 sched/cpufreq_sched: Consolidated update
Contains:

sched/cpufreq_sched: use shorter throttle for raising OPP

Avoid cases where a brief drop in load causes a change to a low OPP
for the full throttle period. Use a shorter throttle period for
raising OPP than for lowering OPP.

sched-freq: Fix handling of max/min frequency

This reverts commit 9726142608f5b3bf5df4280243c9d324e692a510.

Change-Id: Ia78095354f7ad9492f00deb509a2b45112361eda

sched/cpufreq: Increasing throttle_down_nsec to 50ms

Change-Id: I2d8969cf2a64fa719b9dd86f43f9dd14b1ff84fe

sched-freq: make throttle times tunable

Change-Id: I127879645367425b273441d7f0306bb15d5633cb

Signed-off-by: Srinath Sridharan <srinathsr@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
[jstultz: Fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-10 15:14:51 -07:00
Patrick Bellasi
765c2ab363 FIXUP: sched: fix build for non-SMP target
Currently the build for a single-core (e.g. user-mode) Linux is broken
and this configuration is required (at least) to run some network tests.

The main issues for the current code support on single-core systems are:
1. {se,rq}::sched_avg is not available nor maintained for !SMP systems
   This means that load and utilisation signals are NOT available in single
   core systems. All the EAS code depends on these signals.
2. sched_group_energy is also SMP dependant. Again this means that all the
   EAS setup and preparation code (energyn model initialization) has to be
   properly guarded/disabled for !SMP systems.
3. SchedFreq depends on utilization signal, which is not available on
   !SMP systems.
4. SchedTune is useless on unicore systems if SchedFreq is not available.
5. WALT machinery is not required on single-core systems.

This patch addresses all these issues by enforcing some constraints for
single-core systems:
a) WALT, SchedTune and SchedTune are now dependant on SMP
b) The default governor for !SMP systems is INTERACTIVE
c) The energy model initialisation/build functions are
d) Other minor code re-arrangements and CONFIG_SMP guarding to enable
   single core builds.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-10 15:08:01 -07:00
Patrick Bellasi
ae54c7741f DEBUG: sched/tune: add tracepoint on P-E space filtering
Change-Id: I31dfed67c0486713b88efb75df767329f2802e06
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-10 15:07:28 -07:00
Patrick Bellasi
4525aa343e DEBUG: sched/tune: add tracepoint for energy_diff() values
Change-Id: Id8fafbd85f6d81248f322e073ee790a7ceec0bf7
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-10 15:07:07 -07:00
Patrick Bellasi
962b7c10ab DEBUG: sched/tune: add tracepoint for task boost signal
Change-Id: I545d3bf5569fc41c0fa70f51dff9a19c11d532ee
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
2016-08-10 15:06:58 -07:00
Dietmar Eggemann
687fa2dcd2 arm: topology: Define TC2 energy and provide it to the scheduler
This patch is only here to be able to test provisioning of energy related
data from an arch topology shim layer to the scheduler. Since there is no
code today which deals with extracting energy related data from the dtb or
acpi, and process it in the topology shim layer, the content of the
sched_group_energy structures as well as the idle_state and capacity_state
arrays are hard-coded here.

This patch defines the sched_group_energy structure as well as the
idle_state and capacity_state array for the cluster (relates to sched
groups (sgs) in DIE sched domain level) and for the core (relates to sgs
in MC sd level) for a Cortex A7 as well as for a Cortex A15.
It further provides related implementations of the sched_domain_energy_f
functions (cpu_cluster_energy() and cpu_core_energy()).

To be able to propagate this information from the topology shim layer to
the scheduler, the elements of the arm_topology[] table have been
provisioned with the appropriate sched_domain_energy_f functions.

Change-Id: I8c014bbd04f6a1d57892be9bfa16affe07948dcf
cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
2016-08-10 15:05:04 -07:00
Joseph Lo
3a400abdc5 CHROMIUM: sched: update the average of nr_running
Doing a Exponential moving average per nr_running++/-- does not
guarantee a fixed sample rate which induces errors if there are lots of
threads being enqueued/dequeued from the rq (Linpack mt). Instead of
keeping track of the avg, the scheduler now keeps track of the integral
of nr_running and allows the readers to perform filtering on top.

Original-author: Sai Charan Gurrappadi <sgurrappadi@nvidia.com>

Change-Id: Id946654f32fa8be0eaf9d8fa7c9a8039b5ef9fab
Signed-off-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/174694
Reviewed-on: https://chromium-review.googlesource.com/272853
[jstultz: fwdported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-08-10 15:01:22 -07:00
Jeremy Compostella
edb97738f2 ANDROID: dm verity fec: pack the fec_header structure
The fec_header structure is generated build time and stored on disk.
The fec_header might be build on a 64 bits machine while it is read
per a 32 bits device or the other way around.  In such situations, the
fec_header fields are not aligned as expected by the device and it
fails to read the fec_header structure.

This patch makes the fec_header packed.

Change-Id: Idb84453e70cc11abd5ef3a0adfbb16f8b5feaf06
Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
2016-08-10 13:29:44 -07:00
Badhri Jagan Sridharan
ad2f6cf0be ANDROID: dm: android-verity: Verify header before fetching table
Move header validation logic before reading the verity_table as
an invalid header implies the table is invalid as well.

(Cherry-picked from:
https://partner-android-review.git.corp.google.com/#/c/625203)

BUG: 29940612
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: Ib34d25c0854202f3e70df0a6d0ef1d96f0250c8e
2016-08-10 13:26:25 -07:00
Badhri Jagan Sridharan
f74284f6c2 ANDROID: dm: allow adb disable-verity only in userdebug
adb disable-verity was allowed when the phone is in the
unlocked state. Since the driver is now aware of the build
variant, honor "adb disable-verity" only in userdebug
builds.

(Cherry-picked from
https://partner-android-review.git.corp.google.com/#/c/622117)

BUG: 29276559
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I7ce9f38d8c7a62361392c5a8ccebb288f8a3a2ea
2016-08-10 13:26:14 -07:00
Badhri Jagan Sridharan
58bae772a7 ANDROID: dm: mount as linear target if eng build
eng builds dont have verity enabled i.e it does even
have verity metadata appended to the parition. Therefore
add rootdev as linear device and map the entire partition
if build variant is "eng".

(Cherry-picked based on
https://partner-android-review.git.corp.google.com/#/c/618690/)

BUG: 29276559
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I8f5c2289b842b820ca04f5773525e5449bb3f355
2016-08-10 13:26:04 -07:00
Badhri Jagan Sridharan
051d4706c6 ANDROID: dm: use default verity public key
If the dm-android-verity target does not provide a default
key try using the default public key from the system keyring.
The defualt verity keyid is passed as a kernel command line
argument veritykeyid=.

The order of the dm-android-verity params have been reversed
to facilitate the change.

Old format example:
dm="system none ro,0 1 android-verity Android:#7e4333f9bba00adfe0ede979e28ed1920492b40f /dev/mmcblk0p43"

New formats supported:
dm="system none ro,0 1 android-verity /dev/mmcblk0p43 Android:#7e4333f9bba00adfe0ede979e28ed1920492b40f"

(or)

dm="system none ro,0 1 android-verity /dev/mmcblk0p43"
when veritykeyid= is set in the kernel command line.

BUG: 28384658
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I506c89b053d835ab579e703eef2bc1f8487250de
(cherry picked from commit c5c74d0327729f35b576564976885596c6d0e7fb)
2016-08-10 13:25:54 -07:00
Badhri Jagan Sridharan
9c43aca47b ANDROID: dm: fix signature verification flag
The bug was that the signature verification was only
happening when verity was disabled. It should always
happen when verity is enabled.

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I2d9354e240d36ea06fc68c2d18d8e87b823a4c2f
(cherry picked from commit 5364b5ca0b1a12a58283b51408e43fc36d4e4fe7)
2016-08-10 13:25:42 -07:00
Jeremy Compostella
a517817c17 ANDROID: dm: use name_to_dev_t
This patch makes android_verity_ctr() parse its block device string
parameter with name_to_dev_t().  It allows the use of less hardware
related block device reference like PARTUUID for instance.

Change-Id: Idb84453e70cc11abd5ef3a0adfbb16f8b5feaf07
Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
2016-08-10 13:25:28 -07:00
Badhri Jagan Sridharan
86fd82659f ANDROID: dm: rename dm-linear methods for dm-android-verity
This keeps linear_target as static variable and just exposes
the linear target methods for android-verity

Cherry-picked: https://android-review.googlesource.com/#/c/212858

Change-Id: I4a377e417b00afd9ecccdb3e605fea31a7df112e
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
(cherry picked from commit a6d1b091f40b25d97849487e29ec097bc5f568dd)
2016-08-10 13:25:16 -07:00
Badhri Jagan Sridharan
438e162621 ANDROID: dm: Minor cleanup
Compacts the linear device arguments removing the
unnecessary variables.

Bug: 27175947
Change-Id: I157170eebe3c0f89a68ae05870a1060f188d0da0
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
2016-08-10 13:25:06 -07:00
Badhri Jagan Sridharan
67ce481897 ANDROID: dm: Mounting root as linear device when verity disabled
This CL makes android-verity target to be added as linear
dm device if when bootloader is unlocked and verity is disabled.

Bug: 27175947
Change-Id: Ic41ca4b8908fb2777263799cf3a3e25934d70f18
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
2016-08-10 13:24:56 -07:00
Badhri Jagan Sridharan
f42f971b7b ANDROID: dm-android-verity: Rebase on top of 4.1
Following CLs in upstream causes minor changes to dm-android-verity target.
1. keys: change asymmetric keys to use common hash definitions
2. block: Abstract out bvec iterator
Rebase dm-android-verity on top of these changes.

Bug: 27175947

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: Icfdc3e7b3ead5de335a059cade1aca70414db415
2016-08-10 13:24:45 -07:00
Badhri Jagan Sridharan
36d01a590d ANDROID: dm: Add android verity target
This device-mapper target is virtually a VERITY target. This
target is setup by reading the metadata contents piggybacked
to the actual data blocks in the block device. The signature
of the metadata contents are verified against the key included
in the system keyring. Upon success, the underlying verity
target is setup.

BUG: 27175947

Change-Id: I7e99644a0960ac8279f02c0158ed20999510ea97
Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
2016-08-10 13:24:36 -07:00
Jeremy Compostella
26cab56d9a ANDROID: dm: fix dm_substitute_devices()
When candidate is the last parameter, candidate_end points to the '\0'
character and not the DM_FIELD_SEP character.  In such a situation, we
should not move the candidate_end pointer one character backward.

Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
2016-08-10 13:24:12 -07:00
Badhri Jagan Sridharan
5a77db7839 ANDROID: dm: Rebase on top of 4.1
1. "dm: optimize use SRCU and RCU" removes the use of dm_table_put.
2. "dm: remove request-based logic from make_request_fn wrapper" necessitates
    calling dm_setup_md_queue or else the request_queue's make_request_fn
    pointer ends being unset.

[    7.711600] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[    7.717519] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W       4.1.15-02273-gb057d16-dirty #33
[    7.726559] Hardware name: HiKey Development Board (DT)
[    7.731779] task: ffffffc005f8acc0 ti: ffffffc005f8c000 task.ti: ffffffc005f8c000
[    7.739257] PC is at 0x0
[    7.741787] LR is at generic_make_request+0x8c/0x108
....
[    9.082931] Call trace:
[    9.085372] [<          (null)>]           (null)
[    9.090074] [<ffffffc0003f4ac0>] submit_bio+0x98/0x1e0
[    9.095212] [<ffffffc0001e2618>] _submit_bh+0x120/0x1f0
[    9.096165] cfg80211: Calling CRDA to update world regulatory domain
[    9.106781] [<ffffffc0001e5450>] __bread_gfp+0x94/0x114
[    9.112004] [<ffffffc00024a748>] ext4_fill_super+0x18c/0x2d64
[    9.117750] [<ffffffc0001b275c>] mount_bdev+0x194/0x1c0
[    9.122973] [<ffffffc0002450dc>] ext4_mount+0x14/0x1c
[    9.128021] [<ffffffc0001b29a0>] mount_fs+0x3c/0x194
[    9.132985] [<ffffffc0001d059c>] vfs_kern_mount+0x4c/0x134
[    9.138467] [<ffffffc0001d2168>] do_mount+0x204/0xbbc
[    9.143514] [<ffffffc0001d2ec4>] SyS_mount+0x94/0xe8
[    9.148479] [<ffffffc000c54074>] mount_block_root+0x120/0x24c
[    9.154222] [<ffffffc000c543e8>] mount_root+0x110/0x12c
[    9.159443] [<ffffffc000c54574>] prepare_namespace+0x170/0x1b8
[    9.165273] [<ffffffc000c53d98>] kernel_init_freeable+0x23c/0x260
[    9.171365] [<ffffffc0009b1748>] kernel_init+0x10/0x118
[    9.176589] Code: bad PC value
[    9.179807] ---[ end trace 75e1bc52ba364d13 ]---

Bug: 27175947

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I952d86fd1475f0825f9be1386e3497b36127abd0
2016-08-10 13:23:56 -07:00
Will Drewry
96b0434c25 CHROMIUM: dm: boot time specification of dm=
This is a wrap-up of three patches pending upstream approval.
I'm bundling them because they are interdependent, and it'll be
easier to drop it on rebase later.

1. dm: allow a dm-fs-style device to be shared via dm-ioctl

Integrates feedback from Alisdair, Mike, and Kiyoshi.

Two main changes occur here:

- One function is added which allows for a programmatically created
mapped device to be inserted into the dm-ioctl hash table.  This binds
the device to a name and, optional, uuid which is needed by udev and
allows for userspace management of the mapped device.

- dm_table_complete() was extended to handle all of the final
functional changes required for the table to be operational once
called.

2. init: boot to device-mapper targets without an initr*

Add a dm= kernel parameter modeled after the md= parameter from
do_mounts_md.  It allows for device-mapper targets to be configured at
boot time for use early in the boot process (as the root device or
otherwise).  It also replaces /dev/XXX calls with major:minor opportunistically.

The format is dm="name uuid ro,table line 1,table line 2,...".  The
parser expects the comma to be safe to use as a newline substitute but,
otherwise, uses the normal separator of space.  Some attempt has been
made to make it forgiving of additional spaces (using skip_spaces()).

A mapped device created during boot will be assigned a minor of 0 and
may be access via /dev/dm-0.

An example dm-linear root with no uuid may look like:

root=/dev/dm-0  dm="lroot none ro, 0 4096 linear /dev/ubdb 0, 4096 4096 linear /dv/ubdc 0"

Once udev is started, /dev/dm-0 will become /dev/mapper/lroot.

Older upstream threads:
http://marc.info/?l=dm-devel&m=127429492521964&w=2
http://marc.info/?l=dm-devel&m=127429499422096&w=2
http://marc.info/?l=dm-devel&m=127429493922000&w=2

Latest upstream threads:
https://patchwork.kernel.org/patch/104859/
https://patchwork.kernel.org/patch/104860/
https://patchwork.kernel.org/patch/104861/

Bug: 27175947

Signed-off-by: Will Drewry <wad@chromium.org>

Review URL: http://codereview.chromium.org/2020011

Change-Id: I92bd53432a11241228d2e5ac89a3b20d19b05a31
2016-08-10 13:23:42 -07:00
Arnaldo Carvalho de Melo
ee247d4205 UPSTREAM: net: Fix use after free in the recvmmsg exit path
(cherry picked from commit 34b88a68f26a75e4fded796f1a49c40f82234b7d)

The syzkaller fuzzer hit the following use-after-free:

  Call Trace:
   [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295
   [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
   [<     inline     >] SYSC_recvmmsg net/socket.c:2281
   [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
   [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
  arch/x86/entry/entry_64.S:185

And, as Dmitry rightly assessed, that is because we can drop the
reference and then touch it when the underlying recvmsg calls return
some packets and then hit an error, which will make recvmmsg to set
sock->sk->sk_err, oops, fix it.

Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Fixes: a2e2725541 ("net: Introduce recvmmsg socket syscall")
http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Change-Id: I2adb0faf595b7b634d9b739dfdd1a47109e20ecb
Bug: 30515201
2016-08-07 23:22:14 -07:00
James Carr
f3d9c312b8 Implement memory_state_time, used by qcom,cpubw
New driver memory_state_time tracks time spent in different DDR
frequency and bandwidth states.

Memory drivers such as qcom,cpubw can post updated state to the driver
after registering a callback. Processed by a workqueue

Bandwidth buckets are read in from device tree in the relevant qualcomm
section, can be defined in any quantity and spacing.

The data is exposed at /sys/kernel/memory_state_time, able to be read by
the Android framework.

Functionality is behind a config option CONFIG_MEMORY_STATE_TIME

Change-Id: I4fee165571cb975fb9eacbc9aada5e6d7dd748f0
Signed-off-by: James Carr <carrja@google.com>
2016-08-03 15:45:47 -07:00
Amit Pundir
818aa36ea8 Revert "panic: Add board ID to panic output"
This reverts commit 4e09c51018.

I checked for the usage of this debug helper in AOSP common kernels as
well as vendor kernels (e.g exynos, msm, mediatek, omap, tegra, x86,
x86_64) hosted at https://android.googlesource.com/kernel/ and I found
out that other than few fairly obsolete Omap trees (for tuna & Glass)
and Exynos tree (for Manta), there is no active user of this debug
helper. So we can safely remove this helper code.

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-08-01 11:17:00 -07:00
Anson Jacob
1b06e25392 usb: gadget: f_accessory: remove duplicate endpoint alloc
usb_ep_autoconfig is called twice for allocating
bulk out endpoint.

Removed the unwanted call.

Fixes Issue: 67180

Change-Id: I03e87a86fbbbc85831ff7f0496adf038d1de2956
Signed-off-by: Anson Jacob <ansonjacob.aj@gmail.com>
2016-07-31 22:30:14 -04:00
Arend Van Spriel
8501176294 BACKPORT: brcmfmac: defer DPC processing during probe
The sdio dpc starts processing when in SDIOD_STATE_DATA. This state was
entered right after firmware download. This patch moves that transition
just before enabling sdio interrupt handling thus avoiding watchdog
expiry which would put the bus to sleep while probing.

Change-Id: I09c60752374b8145da78000935062be08c5c8a52
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2016-07-29 09:57:56 -07:00
John Stultz
5d1b9c0c01 FROMLIST: proc: Add LSM hook checks to /proc/<tid>/timerslack_ns
As requested, this patch checks the existing LSM hooks
task_getscheduler/task_setscheduler when reading or modifying
the task's timerslack value.

Previous versions added new get/settimerslack LSM hooks, but
since they checked the same PROCESS__SET/GETSCHED values as
existing hooks, it was suggested we just use the existing ones.

Cc: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
CC: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Colin Cross <ccross@android.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Elliott Hughes <enh@google.com>
Cc: James Morris <jmorris@namei.org>
Cc: Android Kernel Team <kernel-team@android.com>
Cc: linux-security-module@vger.kernel.org
Cc: selinux@tycho.nsa.gov
Signed-off-by: John Stultz <john.stultz@linaro.org>

FROMLIST URL: https://lkml.org/lkml/2016/7/21/523
Change-Id: Id157d10e2fe0b85f1be45035a6117358a42af028
(Cherry picked back to common/android-4.4)
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-07-25 14:35:11 -07:00
John Stultz
419d43edc5 FROMLIST: proc: Relax /proc/<tid>/timerslack_ns capability requirements
When an interface to allow a task to change another tasks
timerslack was first proposed, it was suggested that something
greater then CAP_SYS_NICE would be needed, as a task could be
delayed further then what normally could be done with nice
adjustments.

So CAP_SYS_PTRACE was adopted instead for what became the
/proc/<tid>/timerslack_ns interface. However, for Android (where
this feature originates), giving the system_server
CAP_SYS_PTRACE would allow it to observe and modify all tasks
memory. This is considered too high a privilege level for only
needing to change the timerslack.

After some discussion, it was realized that a CAP_SYS_NICE
process can set a task as SCHED_FIFO, so they could fork some
spinning processes and set them all SCHED_FIFO 99, in effect
delaying all other tasks for an infinite amount of time.

So as a CAP_SYS_NICE task can already cause trouble for other
tasks, using it as a required capability for accessing and
modifying /proc/<tid>/timerslack_ns seems sufficient.

Thus, this patch loosens the capability requirements to
CAP_SYS_NICE and removes CAP_SYS_PTRACE, simplifying some
of the code flow as well.

This is technically an ABI change, but as the feature just
landed in 4.6, I suspect no one is yet using it.

Cc: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
CC: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Colin Cross <ccross@android.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Elliott Hughes <enh@google.com>
Cc: Android Kernel Team <kernel-team@android.com>
Reviewed-by: Nick Kralevich <nnk@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>

FROMLIST URL: https://lkml.org/lkml/2016/7/21/522
Change-Id: Ia75481402e3948165a1b7c1551c539530cb25509
(Cherry picked against common/android-4.4)
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-07-25 14:34:56 -07:00
Jann Horn
85a09f2732 UPSTREAM: sched: panic on corrupted stack end
(cherry picked from commit 29d6455178a09e1dc340380c582b13356227e8df)

Until now, hitting this BUG_ON caused a recursive oops (because oops
handling involves do_exit(), which calls into the scheduler, which in
turn raises an oops), which caused stuff below the stack to be
overwritten until a panic happened (e.g.  via an oops in interrupt
context, caused by the overwritten CPU index in the thread_info).

Just panic directly.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ia3acb3f747f7a58ec2d071644433b0591925969f
Bug: 29444228
2016-07-25 13:03:32 -07:00
Jann Horn
ed23e11450 UPSTREAM: ecryptfs: forbid opening files without mmap handler
(cherry picked from commit 2f36db71009304b3f0b95afacd8eba1f9f046b87)

This prevents users from triggering a stack overflow through a recursive
invocation of pagefault handling that involves mapping procfs files into
virtual memory.

Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I0be77c7f8bd3046bc34cd87ef577529792d479bc
Bug: 29444228
2016-07-25 13:03:32 -07:00