This backports da8b44d5a9f8bf26da637b7336508ca534d6b319 from upstream.
This patchset introduces a /proc/<pid>/timerslack_ns interface which
would allow controlling processes to set the timerslack value on other
processes in order to save power by avoiding wakeups (something Android
currently does via out-of-tree patches).
The first patch fixes the internal timer_slack_ns usage, which was
defined as a long and therefore limited the slack range to ~4 seconds on
32bit systems. It converts it to a u64, which provides the same basically
unlimited slack (500 years) on both 32bit and 64bit machines.
The second patch introduces the /proc/<pid>/timerslack_ns interface
which allows the full 64bit slack range for a task to be read or set on
both 32bit and 64bit machines.
With these two patches, on a 32bit machine, after setting the slack on
bash to 10 seconds:
$ time sleep 1
real 0m10.747s
user 0m0.001s
sys 0m0.005s
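For reference, a controlling process would set such a slack through the new
proc file roughly as follows (an illustrative sketch, not part of the patch;
the 10 second value in nanoseconds matches the example above, error handling
is trimmed):
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Illustrative only: write a slack value (in nanoseconds) into another
 * task's /proc/<pid>/timerslack_ns, as described above. */
static int set_timerslack(pid_t pid, unsigned long long slack_ns)
{
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/timerslack_ns", (int)pid);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%llu\n", slack_ns);
	return fclose(f);
}

int main(int argc, char **argv)
{
	/* usage: ./set_slack <pid>   (sets a 10s slack, as in the example) */
	return argc > 1 ? set_timerslack(atoi(argv[1]), 10000000000ULL) : 1;
}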
The first patch is a little ugly, since I had to chase the slack delta
arguments through a number of functions converting them to u64s. Let me
know if it makes sense to break that up more or not.
Other than that things are fairly straightforward.
This patch (of 2):
The timer_slack_ns value in the task struct is currently an unsigned
long. This means that for 32bit applications, the maximum slack is just
over 4 seconds. However, on 64bit machines, it's much, much larger (~500
years).
This disparity could make application development a little confusing,
so this patch converts timer_slack_ns in the task struct (as well as
the default_slack) to a u64. This means both 32bit and 64bit systems
have the same effective internal slack range.
Now, the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specifies
the interface as an unsigned long, so we preserve that limitation on
32bit systems: SET_TIMERSLACK can only set the slack to an unsigned
long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is
actually larger than what can be stored in an unsigned long.
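For reference, the preserved prctl ABI looks like this from userspace (a
small illustrative program, not part of the patch):
#include <stdio.h>
#include <sys/prctl.h>

int main(void)
{
	/* arg2 is the slack in nanoseconds, passed as an unsigned long, so on
	 * a 32bit system this interface tops out at a little over 4 seconds. */
	if (prctl(PR_SET_TIMERSLACK, 50000UL, 0, 0, 0))
		perror("PR_SET_TIMERSLACK");

	/* Returns the current slack; reads back as ULONG_MAX on 32bit if the
	 * internal u64 value no longer fits in an unsigned long. */
	printf("timer slack: %d ns\n", prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0));
	return 0;
}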
This patch also modifies the hrtimer functions which specified the slack
delta as an unsigned long.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In upstream commit 7200135bc1
(netfilter: kill ulog targets)
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7200135bc1e6
the ipt_ULOG target was removed; the IP_NF_TARGET_ULOG Kconfig option
and the ipt_ULOG.h header file were removed along with it. As a result
we cannot enable QUOTA2_LOG, and netd complains with the error: "Unable
to open quota socket". So when we reach the quota2 limit, userspace will
not be notified of this event.
Since IP_NF_TARGET_ULOG was removed, we no longer need to depend on
"IP_NF_TARGET_ULOG=n", and for compatibility, add the ulog_packet_msg_t
related definitions copied from "ipt_ULOG.h".
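For reference, the carried-over layout is essentially the following (the
definitions actually added by this patch are authoritative; this excerpt is
only a reminder of what ipt_ULOG.h used to define):
#define ULOG_MAC_LEN	80
#define ULOG_PREFIX_LEN	32

typedef struct ulog_packet_msg {
	unsigned long mark;
	long timestamp_sec;
	long timestamp_usec;
	unsigned int hook;
	char indev_name[IFNAMSIZ];
	char outdev_name[IFNAMSIZ];
	size_t data_len;
	char prefix[ULOG_PREFIX_LEN];
	unsigned char mac_len;
	unsigned char mac[ULOG_MAC_LEN];
	unsigned char payload[0];
} ulog_packet_msg_t;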
Change-Id: I38132efaabf52bea75dfd736ce734a1b9690e87e
Reported-by: Samboo Shen <samboo.shen@spreadtrum.com>
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Backport notes: This resolves clk warnings in the designware i2c
driver on HiKey seen during suspend/resume.
Cherrypicked from: aa8e54b559479d0cb7eb632ba443b8cacd20cd4b
If a suitable prepare callback cannot be found for a given device and
its driver has no PM callbacks at all, assume that it can go direct to
complete when the system goes to sleep.
The reason for this is that there are lots of devices in a system that
do no PM at all, and there's no reason for them to prevent their
ancestors from doing direct_complete if they can support it.
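The effect in device_prepare() is roughly the following (an approximate
sketch, not the literal hunk; no_pm_callbacks here stands for the per-device
"driver has no PM callbacks" flag tracked by the upstream change):
	/* ret is what ->prepare() returned; a positive value already meant
	 * "this device may be left runtime suspended". Devices with no PM
	 * callbacks at all are now treated the same way. */
	dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
		(dev->power.no_pm_callbacks ||
		 (pm_runtime_suspended(dev) && ret > 0));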
Change-Id: Ia773afb4b266f012336b99fc8cf87453839e078b
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[jstultz: Backported to 4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Enabled UID_CPUTIME and dependent PROFILING config option.
The UID_CPUTIME (/proc/uid_cputime) interfaces provide the amount of time
a UID's processes have spent executing in user-space and kernel-space.
They are used by the batterystats service.
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(cherry pick from commit 681fef8380eb818c0b845fca5d2ab1dcbab114ee)
The stack object “ci” has a total size of 8 bytes. Its last 3 bytes
are padding bytes which are not initialized and are leaked to userland
via “copy_to_user”.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 28619695
Change-Id: I170754d659d0891c075f85211b5e3970b114f097
(cherry pick from commit 9a47e9cff994f37f7f0dbd9ae23740d0f64f9fe6)
The stack object “r1” has a total size of 32 bytes. Its fields
“event” and “val” both contain 4 bytes of padding. These 8 padding
bytes are sent to userspace without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Bug: 28980217
Change-Id: I2e4c27352894b9f1f4c808b8db3ae5f9284faec1
(cherry pick from commit e4ec8cc8039a7063e24204299b462bd1383184a5)
The stack object “r1” has a total size of 32 bytes. Its fields
“event” and “val” both contain 4 bytes of padding. These 8 padding
bytes are sent to userspace without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Bug: 28980217
Change-Id: If2bba3c9ffb4e57190583b0bb2524d3b2514b2a3
(cherry pick from commit cec8f96e49d9be372fdb0c3836dcf31ec71e457e)
The stack object “tread” has a total size of 32 bytes. Its fields
“event” and “val” both contain 4 bytes of padding. These 8 padding
bytes are sent to userspace without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Bug: 28980557
Change-Id: Ibda2d126f6d72fedf797a98796c3cde7bb03db76
(cherry pick from commit 5f8e44741f9f216e33736ea4ec65ca9ac03036e6)
The stack object “map” has a total size of 32 bytes. Its last 4
bytes are padding generated by the compiler. These padding bytes are
not initialized and are sent out via “nla_put”.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 28620102
Change-Id: Ica015c6a90d47e9188b1cd87a280ac6819dd9d09
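Each of the padding infoleak fixes above follows the same basic pattern:
zero the whole stack object before filling it in, so compiler-generated
padding can never carry stale stack contents to userspace. A generic sketch
(not any particular hunk; the struct and variable names are placeholders):
	struct some_reply r;			/* placeholder struct with padding */

	memset(&r, 0, sizeof(r));		/* clears the padding bytes too */
	r.event = event;
	r.val = val;
	if (copy_to_user(arg, &r, sizeof(r)))
		return -EFAULT;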
Remove following configs which no longer exist:
CONFIG_IP6_NF_TARGET_REJECT_SKERR
CONFIG_IP_NF_TARGET_REJECT_SKERR
CONFIG_RESOURCE_COUNTERS
CONFIG_TABLET_USB_WACOM
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
In case some sysfs nodes need to be labeled with a different label than
sysfs, userspace needs to be notified when a core is brought back online.
Signed-off-by: Thierry Strudel <tstrudel@google.com>
Bug: 29359497
Change-Id: I0395c86e01cd49c348fda8f93087d26f88557c91
(cherry pick from commit 1666984c8625b3db19a9abc298931d35ab7bc64b)
In case bind() works, but a later error forces bailing out of
probe(), work may have been queued and a timer may have been scheduled
in the error path. They must be killed. This fixes an error case related to
the double free reported in
http://www.spinics.net/lists/netdev/msg367669.html
and needs to go on top of Linus' fix to cdc-ncm.
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 28744625
Explicitly initialize the recursion level to zero at the beginning of each
I/O operation.
Bug: 28943429
Change-Id: I00c612be2b8c22dd5afb65a739551df91cb324fc
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
(cherry picked from commit 32ffb3a22d7fd269b2961323478ece92c06a8334)
A call to do_div was changed in Linux 4.5 to div64_u64 in
verity_fec_decode, which broke RS block calculation due to
incompatible semantics. This change fixes the computation.
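The incompatible semantics in question, in short (an illustrative fragment
with generic names):
	u64 a = n, b = n;	/* two copies of the same 64-bit dividend */
	u32 rem = do_div(a, d);	/* a becomes n / d; do_div() yields n % d  */
	u64 q = div64_u64(b, d);/* returns n / d; b is left untouched      */

	/* A mechanical do_div() -> div64_u64() switch therefore turns code that
	 * wanted a remainder into code that gets a quotient, which is what broke
	 * the RS block calculation here. */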
Bug: 21893453
Change-Id: Idb88b901e0209c2cccc9c0796689f780592d58f9
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
(cherry picked from commit 879aac93eebcc2862d71afa9eca3a0c0f51b3b01)
Add a release function to allow destroying the dm-verity device.
Bug: 27928374
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Change-Id: Ic0f7c17e4889c5580d70b52d9a709a37165a5747
(cherry picked from commit 0039ccf47c8f99888f7b71b2a36a68a027fbe357)
If the verity tree itself is sufficiently corrupted in addition to the data
blocks, it's possible for error correction to end up in a deep recursive
error correction loop that eventually causes a kernel panic, as follows:
[ 14.728962] [<ffffffc0008c1a14>] verity_fec_decode+0xa8/0x138
[ 14.734691] [<ffffffc0008c3ee0>] verity_verify_level+0x11c/0x180
[ 14.740681] [<ffffffc0008c482c>] verity_hash_for_block+0x88/0xe0
[ 14.746671] [<ffffffc0008c1508>] fec_decode_rsb+0x318/0x75c
[ 14.752226] [<ffffffc0008c1a14>] verity_fec_decode+0xa8/0x138
[ 14.757956] [<ffffffc0008c3ee0>] verity_verify_level+0x11c/0x180
[ 14.763944] [<ffffffc0008c482c>] verity_hash_for_block+0x88/0xe0
This change limits the recursion to a reasonable level during a single
I/O operation.
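A minimal sketch of the guard (the constant name and exact value are
approximations of the upstream change; fio->level is the per-I/O counter
that the earlier change initializes to zero):
	#define DM_VERITY_FEC_MAX_RECURSION	4

	if (fio->level >= DM_VERITY_FEC_MAX_RECURSION)
		return -EIO;			/* too deep, give up on correction */

	fio->level++;
	/* ... recurse into hash verification, which may re-enter
	 *     verity_fec_decode() ... */
	fio->level--;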
Bug: 28943429
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Change-Id: I0a7ebff331d259c59a5e03c81918cc1613c3a766
(cherry picked from commit f4b9e40597e73942d2286a73463c55f26f61bfa7)
Add:
CONFIG_SECURITY_PERF_EVENTS_RESTRICT=y
to android-base.cfg
The kernel.perf_event_paranoid sysctl is set to 3 by default.
No unprivileged use of the perf_event_open syscall will be
permitted unless it is changed.
Bug: 29054680
Change-Id: Ie7512259150e146d8e382dc64d40e8faaa438917
When kernel.perf_event_paranoid is set to 3 (or greater), disallow all
access to performance events by users without CAP_SYS_ADMIN.
Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
makes this value the default.
This is based on a similar feature in grsecurity
(CONFIG_GRKERNSEC_PERF_HARDEN). This version doesn't include making
the variable read-only. It also allows enabling further restriction
at run-time regardless of whether the default is changed.
https://lkml.org/lkml/2016/1/11/587
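The gate itself is small; an approximate sketch (the helper name and its
placement are assumptions here, not quoted from the patch):
	/* include/linux/perf_event.h */
	static inline bool perf_paranoid_any(void)
	{
		return sysctl_perf_event_paranoid > 2;
	}

	/* near the top of the perf_event_open() syscall */
	if (perf_paranoid_any() && !capable(CAP_SYS_ADMIN))
		return -EACCES;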
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Bug: 29054680
Change-Id: Iff5bff4fc1042e85866df9faa01bce8d04335ab8
perf_event_paranoid was only documented in source code and a perf error
message. Copy the documentation from the error message to
Documentation/sysctl/kernel.txt.
perf_cpu_time_max_percent was already documented but missing from the
list at the top, so add it there.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/20160119213515.GG2637@decadent.org.uk
[ Remove reference to external Documentation file, provide info inline, as before ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Bug: 29054680
Change-Id: I13e73cfb2ad761c94762d0c8196df7725abdf5c5
Compilers may engage the improbability drive when encountering shifts
by a distance that is a multiple of the size of the operand type. Since
the required bounds check is very simple here, we can get rid of all the
fuzzy masking, shifting and comparing, and use the documented bounds
directly.
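For context, this is the class of undefined behaviour being avoided (a
generic illustration, not the relocation code itself):
	/* Shifting a value by its full width (or more) is undefined in C, so the
	 * compiler may assume it cannot happen and optimize accordingly. */
	u64 mask = ~0ULL >> (64 - width);	/* undefined when width == 0 */
	if (val & ~mask)			/* ...making this range check unreliable */
		return -ERANGE;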
Change-Id: Ibc1b73f4a630bc182deb6edfa7458b5e29ba9577
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The test whether a movz instruction with a signed immediate should be
turned into a movn instruction (i.e., when the immediate is negative)
is flawed, since the value of imm is always positive. Also, the
subsequent bounds check is incorrect since the limit update never
executes, due to the fact that the imm_type comparison will always be
false for negative signed immediates.
Let's fix this by performing the sign test on sval directly, and
replacing the bounds check with a simple comparison against U16_MAX.
Change-Id: I9ad3d8bfd91e5fdc6434b1be6c3062dfec193176
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: tidied up use of sval, renamed MOVK enum value to MOVKZ]
Signed-off-by: Will Deacon <will.deacon@arm.com>
This reverts commit 97312429c2.
Drop AOSP's "armv6 dcc tty driver" in favor of upstream DCC driver for
ARMv6/v7 16c63f8ea4 (drivers: char: hvc: add arm JTAG DCC console
support) and for ARMv8 4cad4c57e0 (ARM64: TTY: hvc_dcc: Add support
for ARM64 dcc).
Change-Id: I0ca651ef2d854fff03cee070524fe1e3971b6d8f
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
This reverts commit dfc1d4be88.
Drop AOSP's "armv6 dcc tty driver" in favor of upstream DCC driver for
ARMv6/v7 16c63f8ea4 (drivers: char: hvc: add arm JTAG DCC console
support) and for ARMv8 4cad4c57e0 (ARM64: TTY: hvc_dcc: Add support
for ARM64 dcc).
Change-Id: I8110a4fd649b8ac1ec9bfac00255c1214135e4b2
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(This cherry-picks b4201cc4fc6e1c57d6d306b1f787865043d60129 upstream)
This fixes:
net/mac80211/mesh_hwmp.c:603:26: warning: ‘target_metric’ may be used uninitialized in this function
target_metric is only consumed when reply = true so no bug exists here,
but not all versions of gcc realize it. Initialize to 0 to remove the
warning.
Change-Id: I13923fda9d314f48196c29e4354133dfe01f5abd
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[jstultz: Cherry-picked to android-4.4]
Signed-off-by: John Stultz <john.stultz@linaro.org>
(cherry pick from commit 5c17c861a357e9458001f021a7afa7aab9937439)
ioctl(TIOCGETD) retrieves the line discipline id directly from the
ldisc because the line discipline id (c_line) in termios is untrustworthy;
userspace may have set termios via ioctl(TCSETS*) without actually
changing the line discipline via ioctl(TIOCSETD).
However, directly accessing the current ldisc via tty->ldisc is
unsafe; the ldisc ptr dereferenced may be stale if the line discipline
is changing via ioctl(TIOCSETD) or hangup.
Wait for the line discipline reference (just like read() or write())
to retrieve the "current" line discipline id.
Cc: <stable@vger.kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 28409131
Change-Id: I6774bd883a2e48bbe020486c72c42fb410e3f98a
This reverts commit e1b5d10389.
This patch fixed the AOSP commit ad86cc8ad6 (drivers: power:
Add watchdog timer to catch drivers which lockup during suspend.),
which we dropped in Change-Id Ic72a87432e27844155467817600adc6cf0c2209c,
so we no longer need this fix. Part of this patch was already reverted
in the above mentioned Change-Id.
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Upstream commit 8eec1020f0 (cpufreq: create cpu/cpufreq at boot time)
makes sure that the cpufreq sysfs entry gets created at boot time, and
there is no need to create/destroy it on an as-needed basis anymore.
So drop the deprecated cpufreq_{get,put}_global_kobject function calls,
which otherwise result in the following compilation errors:
drivers/cpufreq/cpufreq_interactive.c: In function 'cpufreq_governor_interactive':
drivers/cpufreq/cpufreq_interactive.c:1187:4: error: implicit declaration of function 'cpufreq_get_global_kobject' [-Werror=implicit-function-declaration]
WARN_ON(cpufreq_get_global_kobject());
^
drivers/cpufreq/cpufreq_interactive.c:1197:5: error: implicit declaration of function 'cpufreq_put_global_kobject'[-Werror=implicit-function-declaration]
cpufreq_put_global_kobject();
^
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
This reverts commit bc68f6c4ef.
This build fix broke the interactive governor at runtime with duplicate
sysfs entry warnings at boot time. We no longer need to create/destroy
this cpufreq sysfs entry at run time on an as-needed basis, thanks to
upstream commit 8eec1020f0 (cpufreq: create cpu/cpufreq at boot time)
which creates it at boot time. Hence drop this build fix.
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
In an issue very similar to 4e461c777e (xt_qtaguid: Fix panic
caused by synack processing), we were seeing panics on occasion
in testing.
In this case, it was the same issue, but caused by a different
call path, as the sk being returned from qtaguid_find_sk() was
not a full socket, resulting in the sk->sk_socket dereference failing.
This patch adds an extra check to ensure the sk being returned
is a full socket, and if not it returns NULL.
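A sketch of the extra check (approximate):
	sk = qtaguid_find_sk(skb, par);
	/* Only full sockets carry a valid sk->sk_socket; request_sock and
	 * timewait minisockets do not, so treat them as "no socket found". */
	if (sk && !sk_fullsock(sk))
		sk = NULL;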
Reported-by: Milosz Wasilewski <milosz.wasilewski@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
This change allows using the same kernel image with
different console options for uart and fiq_debugger.
If fiq_debugger.disable is set to 1/y/Y,
fiq_debugger will not be initialized.
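A sketch of the kind of switch this adds (symbol names are illustrative; a
bool parameter named "disable" in the fiq_debugger module shows up on the
command line as fiq_debugger.disable and accepts 1/y/Y):
	static bool fiq_debugger_disable;
	module_param_named(disable, fiq_debugger_disable, bool, 0444);

	static int __init fiq_debugger_init(void)
	{
		if (fiq_debugger_disable) {
			pr_info("fiq_debugger: disabled on the command line\n");
			return 0;
		}
		/* ... normal initialization ... */
		return 0;
	}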
Change-Id: I71fda54f5f863d13b1437b1f909e52dd375d002d
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
The PR_DUMPABLE flag causes the pid-related paths of the
proc file system to be owned by ROOT. The implementation
of pthread_set/getname_np, however, needs access to
/proc/<pid>/task/<tid>/comm.
If PR_DUMPABLE is false, this implementation is locked out.
This patch installs a special permission function for
the file "comm" that grants read and write access to
all threads of the same group regardless of the ownership
of the inode. For all other threads the function falls back
to the generic inode permission check.
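A sketch of that permission function (an approximate reconstruction):
	static int proc_tid_comm_permission(struct inode *inode, int mask)
	{
		bool is_same_tgroup;
		struct task_struct *task;

		task = get_proc_task(inode);
		if (!task)
			return -ESRCH;
		is_same_tgroup = same_thread_group(current, task);
		put_task_struct(task);

		if (is_same_tgroup && !(mask & MAY_EXEC)) {
			/* /proc/<pid>/task/<tid>/comm may always be read from or
			 * written to by members of the same thread group. */
			return 0;
		}

		return generic_permission(inode, mask);
	}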
Signed-off-by: Janis Danisevskis <jdanis@google.com>
When you configure (set up) a STA interface, the driver
installs a multicast filter. This is normal behavior: when
an application subscribes to a multicast address, the filter
is updated. When an Access Point interface is configured, there
is no filter installation and the "filter update" path is
disabled in the driver.
The problem happens when you switch an interface from STA
type to AP type. The filter is installed but there is no
way to update it.
Change-Id: Ied22323af831575303abd548574918baa9852dd0
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
This patch makes the energy data available via procfs. The related files
are placed in a sub-directory named 'energy' inside the
/proc/sys/kernel/sched_domain/cpuX/domainY/groupZ directory for those
cpu/domain/group tuples which have energy information.
The following example depicts the contents of
/proc/sys/kernel/sched_domain/cpu0/domain0/group[01] for a system which
has energy information attached to domain level 0.
├── cpu0
│ ├── domain0
│ │ ├── busy_factor
│ │ ├── busy_idx
│ │ ├── cache_nice_tries
│ │ ├── flags
│ │ ├── forkexec_idx
│ │ ├── group0
│ │ │ └── energy
│ │ │ ├── cap_states
│ │ │ ├── idle_states
│ │ │ ├── nr_cap_states
│ │ │ └── nr_idle_states
│ │ ├── group1
│ │ │ └── energy
│ │ │ ├── cap_states
│ │ │ ├── idle_states
│ │ │ ├── nr_cap_states
│ │ │ └── nr_idle_states
│ │ ├── idle_idx
│ │ ├── imbalance_pct
│ │ ├── max_interval
│ │ ├── max_newidle_lb_cost
│ │ ├── min_interval
│ │ ├── name
│ │ ├── newidle_idx
│ │ └── wake_idx
│ └── domain1
│ ├── busy_factor
│ ├── busy_idx
│ ├── cache_nice_tries
│ ├── flags
│ ├── forkexec_idx
│ ├── idle_idx
│ ├── imbalance_pct
│ ├── max_interval
│ ├── max_newidle_lb_cost
│ ├── min_interval
│ ├── name
│ ├── newidle_idx
│ └── wake_idx
The files 'nr_idle_states' and 'nr_cap_states' contain a scalar value,
whereas 'idle_states' contains a vector of the power consumption at each
idle state and 'cap_states' contains a vector of (compute capacity, power
consumption) tuples, one per capacity state.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Once SchedTune support is enabled and the CPU bandwidth demand of a
task is boosted, we can expect increased energy consumption which is
balanced by a corresponding increase in task performance.
However, the current implementation of the energy_diff() function
accepts all and _only_ the schedule candidates which result in a
reduced expected system energy, which works against the boosting
strategy.
This patch links the energy_diff() function with the "energy payoff"
engine provided by SchedTune. The energy variation computed by the
energy_diff() function is now filtered using the SchedTune support to
evaluate the energy payoff for a boosted task.
With this patch, the energy_diff() function reports as an "acceptable
schedule candidate" only a schedule candidate which corresponds to a
positive energy payoff.
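Conceptually, the filtering works as follows (all names in this fragment are
illustrative placeholders, not the exact kernel API):
	/* The raw energy delta for a candidate is only reported as acceptable
	 * when SchedTune computes a positive payoff for the (boosted) task. */
	nrg_delta = compute_energy_diff(eenv);			/* placeholder */
	payoff = schedtune_energy_payoff(nrg_delta,		/* placeholder */
					 eenv->cap_delta, eenv->task);
	accept_candidate = (payoff > 0);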
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
The current EAS implementation considers only energy variations, while
completely disregarding the impact on performance for the selection of
a certain schedule candidate. Moreover, it also makes its decision based
on the "absolute" value of expected energy variations.
In order to properly define a trade-off strategy between increased energy
consumption and performance benefits, it is required to compare energy
variations with performance variations.
Thus, both performance and energy metrics must be expressed in comparable
units. While the performance variations are expressed in terms of capacity
deltas, which are defined in the range [0..SCHED_LOAD_SCALE], the same
scale is not used for energy variations.
This patch introduces the function:
schedtune_normalize_energy(energy_diff)
which returns a normalized value in the same range of capacity variations,
i.e. [0..SCHED_LOAD_SCALE].
A proper set of energy normalization constants is required to provide
a fast division by a constant during the normalization of the energy_diff.
The value of these constants depends on the specific energy model and
topology of a target device.
Thus, this patch provides also the required support for the computation
at boot time of this set of variables.
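In other words (an illustrative fragment; the real code precomputes a
reciprocal of the constant so the divide is cheap, and the names below are
placeholders):
	/* Map an energy delta onto the capacity scale so that energy and
	 * capacity deltas become directly comparable. energy_range is the span
	 * of the platform's energy model (maximum minus minimum system energy),
	 * computed once at boot. */
	normalized_nrg = energy_diff * SCHED_LOAD_SCALE / energy_range;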
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
The current EAS implementation does not allow task performance to be
"boosted", for example by running a task at a higher OPP (or on a more
capable CPU), even if that could require a "reasonable" increase in
energy consumption. To define how reasonable an energy increase is with
respect to a required boost value, it is required to define and compute
a trade-off between the expected energy and performance variations.
However, the current EAS implementation considers only energy variations
while completely disregarding the impact on performance for the
selection of a certain schedule candidate.
This patch extends the eenv energy environment to keep track of both
energy and performance deltas which are implied by the activation of a
schedule candidate.
The performance variation is estimated considering the different
capacities of the CPUs on which the task could be scheduled. The idea is
that while running on a CPU with higher capacity (e.g. higher operating
point) the task could (potentially) complete faster and thus get better
performance.
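The performance delta therefore boils down to a capacity delta between the
candidate CPUs, roughly (illustrative; capacity_of() is the usual per-CPU
capacity accessor, the field names are approximate):
	/* Positive when the destination CPU offers more capacity, i.e. the task
	 * can be expected to complete faster there. */
	eenv->cap_delta = capacity_of(eenv->dst_cpu) - capacity_of(eenv->src_cpu);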
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
The task utilization signal, which is derived from PELT signals and
properly scaled to be architecture and frequency invariant, is used by
EAS as an estimation of the task requirements in terms of CPU bandwidth.
When the energy aware scheduler is in use, this signal affects the CPU
selection. Thus, a convenient way to bias that decision, which is also
minimally intrusive, is to boost the task utilization signal each time
boosting is required for a task.
This patch introduces the new function:
boosted_task_util(task)
which returns a boosted value for the utilization of the specified task.
The margin added to the original utilization is:
1. computed based on the "boosting strategy" in use
2. proportional to the boost value, defined either by the sysctl
   interface, when global boosting is in use, or by the "taskgroup" value,
   when per-task boosting is enabled.
The boosted signal is used by EAS:
a. transparently, via its integration into the task_fits() function
b. explicitly, in the energy-aware wakeup path
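A sketch of the new function (close to, but not necessarily identical to,
the actual code; schedtune_task_margin() stands in for the
strategy-dependent margin computation):
	static unsigned long boosted_task_util(struct task_struct *task)
	{
		unsigned long util = task_util(task);
		long margin = schedtune_task_margin(task);	/* boost-dependent */

		return util + margin;
	}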
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
When per-task boosting is enabled, every time a task enters/exits a CPU
its boost value could impact the currently selected OPP for that CPU.
Thus, the "aggregated" boost value for that CPU potentially needs to
be updated to match the current maximum boost value among all the tasks
currently RUNNABLE on that CPU.
This patch introduces the required support to keep track of which boost
groups are impacting a CPU. Each time a task is enqueued/dequeued to/from
a CPU, its boost group is used to increment/decrement a per-CPU counter of
RUNNABLE tasks on that CPU.
Only when the number of runnable tasks for a specific boost group
becomes 1 or 0 does the corresponding boost group change its effect on
that CPU, specifically:
a) boost_group::tasks == 1: this boost group starts to impact the CPU
b) boost_group::tasks == 0: this boost group stops impacting the CPU
In each of these two conditions the aggregation function:
sched_cpu_update(cpu)
could be required to run in order to identify the new maximum boost
value required for the CPU.
The proposed patch minimizes the number of times the aggregation
function is executed while still providing the required support to
always boost a CPU to the maximum boost value required by all its
currently RUNNABLE tasks.
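A sketch of the enqueue side of this bookkeeping (dequeue is symmetric;
helper and variable names are illustrative, sched_cpu_update() is the
aggregation function named above):
	void schedtune_enqueue_task(struct task_struct *p, int cpu)
	{
		struct boost_groups *bg = &per_cpu(cpu_boost_groups, cpu);
		int idx = task_boost_group(p);		/* illustrative helper */

		bg->group[idx].tasks++;
		if (bg->group[idx].tasks == 1)
			sched_cpu_update(cpu);	/* this group starts impacting the CPU */
	}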
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>