The callback is invoked when process is scheduled in or out.
It provides mechanism for later patches to save/store the LBR
stack. For the schedule in case, the callback is invoked at
the same place that flush branch stack callback is invoked.
So it also can replace the flush branch stack callback. To
avoid unnecessary overhead, the callback is enabled only when
there are events use the LBR stack.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: eranian@google.com
Cc: jolsa@redhat.com
Link: http://lkml.kernel.org/r/1415156173-10035-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For hardware events, the userspace page of the event gets updated in
context switches, so if we read the timestamp in the page, we get
fresh info.
For software events, this is missing currently. This patch makes the
behavior consistent.
With this patch, we can implement clock_gettime(THREAD_CPUTIME) with
PERF_COUNT_SW_DUMMY in userspace as suggested by Andy and Peter. Code
like this:
if (pc->cap_user_time) {
do {
seq = pc->lock;
barrier();
running = pc->time_running;
cyc = rdtsc();
time_mult = pc->time_mult;
time_shift = pc->time_shift;
time_offset = pc->time_offset;
barrier();
} while (pc->lock != seq);
quot = (cyc >> time_shift);
rem = cyc & ((1 << time_shift) - 1);
delta = time_offset + quot * time_mult +
((rem * time_mult) >> time_shift);
running += delta;
return running;
}
I tried it on a busy system, the userspace page updating doesn't
have noticeable overhead.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/aa2dd2e4f1e9f2225758be5ba00f14d6909a8ce1.1423180257.git.shli@fb.com
[ Improved the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
37e9562453 ("locking/rwsem: Allow conservative optimistic
spinning when readers have lock") forced the default for
optimistic spinning to be disabled if the lock owner was
nil, which makes much sense for readers. However, while
it is not our priority, we can make some optimizations
for write-mostly workloads. We can bail the spinning step
and still be conservative if there are any active tasks,
otherwise there's really no reason not to spin, as the
semaphore is most likely unlocked.
This patch recovers most of a Unixbench 'execl' benchmark
throughput by sleeping less and making better average system
usage:
before:
CPU %user %nice %system %iowait %steal %idle
all 0.60 0.00 8.02 0.00 0.00 91.38
after:
CPU %user %nice %system %iowait %steal %idle
all 1.22 0.00 70.18 0.00 0.00 28.60
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1422609267-15102-6-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When readers hold the semaphore, the ->owner is nil. As such,
and unlike mutexes, '!owner' does not necessarily imply that
the lock is free. This will cause writers to potentially spin
excessively as they've been mislead to thinking they have a
chance of acquiring the lock, instead of blocking.
This patch therefore enhances the counter check when the owner
is not set by the time we've broken out of the loop. Otherwise
we can return true as a new owner has the lock and thus we want
to continue spinning. While at it, we can make rwsem_spin_on_owner()
less ambiguos and return right away under need_resched conditions.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1422609267-15102-5-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to optimize the spinning step, we need to set the lock
owner as soon as the lock is acquired; after a successful counter
cmpxchg operation, that is. This is particularly useful as rwsems
need to set the owner to nil for readers, so there is a greater
chance of falling out of the spinning. Currently we only set the
owner much later in the game, in the more generic level -- latency
can be specially bad when waiting for a node->next pointer when
releasing the osq in up_write calls.
As such, update the owner inside rwsem_try_write_lock (when the
lock is obtained after blocking) and rwsem_try_write_lock_unqueued
(when the lock is obtained while spinning). This requires creating
a new internal rwsem.h header to share the owner related calls.
Also cleanup some headers for mutex and rwsem.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1422609267-15102-4-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The need for the smp_mb() in __rwsem_do_wake() should be
properly documented. Applies to both xadd and spinlock
variants.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1422609267-15102-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
attach_to_pi_owner() checks p->mm to prevent attaching to kthreads and
this looks doubly wrong:
1. It should actually check PF_KTHREAD, kthread can do use_mm().
2. If this task is not kthread and it is actually the lock owner we can
wrongly return -EPERM instead of -ESRCH or retry-if-EAGAIN.
And note that this wrong EPERM is the likely case unless the exiting
task is (auto)reaped quickly, we check ->mm before PF_EXITING.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Link: http://lkml.kernel.org/r/20150202140536.GA26406@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As suggested by Davidlohr, we could refactor mutex_spin_on_owner().
Currently, we split up owner_running() with mutex_spin_on_owner().
When the owner changes, we make duplicate owner checks which are not
necessary. It also makes the code a bit obscure as we are using a
second check to figure out why we broke out of the loop.
This patch modifies it such that we remove the owner_running() function
and the mutex_spin_on_owner() loop directly checks for if the owner changes,
if the owner is not running, or if we need to reschedule. If the owner
changes, we break out of the loop and return true. If the owner is not
running or if we need to reschedule, then break out of the loop and return
false.
Suggested-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: chegu_vinod@hp.com
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/r/1422914367-5574-3-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In the mutex_spin_on_owner(), we return true only if lock->owner == NULL.
This was beneficial in situations where there were multiple threads
simultaneously spinning for the mutex. If another thread got the lock
while other spinner(s) were also doing mutex_spin_on_owner(), then the
other spinners would stop spinning. This workaround helped reduce the
chance that many spinners were simultaneously spinning for the mutex
which can help reduce contention in highly contended cases.
However, recent changes were made to the optimistic spinning code such
that instead of having all spinners simultaneously spin for the mutex,
we queue the spinners with an MCS lock such that only one thread spins
for the mutex at a time. Furthermore, the OSQ optimizations ensure that
spinners in the queue will stop waiting if it needs to reschedule.
Now, we don't have to worry about multiple threads spinning on owner
at the same time, and if lock->owner is not NULL at this point, it likely
means another thread happens to obtain the lock in the fastpath. In this
case, it would make sense for the spinner to continue spinning as long
as the spinner doesn't need to schedule and the mutex owner is running.
This patch changes this so that mutex_spin_on_owner() returns true when
the lock owner changes, which means a thread will only stop spinning
if it either needs to reschedule or if the lock owner is not running.
We saw up to a 5% performance improvement in the fserver workload with
this patch.
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: chegu_vinod@hp.com
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/r/1422914367-5574-2-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 81907478c4 ("sched/fair: Avoid using uninitialized variable
in preferred_group_nid()") unconditionally initializes max_group with
NODE_MASK_NONE, this means that when !max_faults (max_group didn't get
set), we'll now continue the iteration with an empty mask.
Which in turn makes the actual body of the loop go away, so we'll just
iterate until completion; short circuit this by breaking out of the
loop as soon as this would happen.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150209113727.GS5029@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is a subtle interaction between the logic introduced in commit
e63da03639 ("sched/numa: Allow task switch if load imbalance improves"),
the way the load balancer counts the load on each NUMA node, and the way
NUMA hinting faults are done.
Specifically, the load balancer only counts currently running tasks
in the load, while NUMA hinting faults may cause tasks to stop, if
the page is locked by another task.
This could cause all of the threads of a large single instance workload,
like SPECjbb2005, to migrate to the same NUMA node. This was possible
because occasionally they all fault on the same few pages, and only one
of the threads remains runnable. That thread can move to the process's
preferred NUMA node without making the imbalance worse, because nothing
else is running at that time.
The fix is to check the direction of the net moving of load, and to
refuse a NUMA move if it would cause the system to move past the point
of balance. In an unbalanced state, only moves that bring us closer
to the balance point are allowed.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: mgorman@suse.de
Link: http://lkml.kernel.org/r/20150203165648.0e9ac692@annuminas.surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Setting the root group's cpu.rt_runtime_us to 0 is a bad thing; it
would disallow the kernel creating RT tasks.
One can of course still set it to 1, which will (likely) still wreck
your kernel, but at least make it clear that setting it to 0 is not
good.
Collect both sanity checks into the one place while we're there.
Suggested-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150209112715.GO24151@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Because task_group() uses a cache of autogroup_task_group(), whose
output depends on sched_class, switching classes can generate
problems.
In particular, when started as fair, the cache points to the
autogroup, so when switching to RT the tg_rt_schedulable() test fails
for every cpu.rt_{runtime,period}_us change because now the autogroup
has tasks and no runtime.
Furthermore, going back to the previous semantics of varying
task_group() with sched_class has the down-side that the sched_debug
output varies as well, even though the task really is in the
autogroup.
Therefore add an autogroup exception to tg_has_rt_tasks() -- such that
both (all) task_group() usages in sched/core now have one. And remove
all the remnants of the variable task_group() output.
Reported-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Stefan Bader <stefan.bader@canonical.com>
Fixes: 8323f26ce3 ("sched: Fix race in task_group()")
Link: http://lkml.kernel.org/r/20150209112237.GR5029@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It is not possible for the clockevents core to know which modes (other than
those with a corresponding feature flag) are supported by a particular
implementation. And drivers are expected to handle transition to all modes
elegantly, as ->set_mode() would be issued for them unconditionally.
Now, adding support for a new mode complicates things a bit if we want to use
the legacy ->set_mode() callback. We need to closely review all clockevents
drivers to see if they would break on addition of a new mode. And after such
reviews, it is found that we have to do non-trivial changes to most of the
drivers [1].
Introduce mode-specific set_mode_*() callbacks, some of which the drivers may or
may not implement. A missing callback would clearly convey the message that the
corresponding mode isn't supported.
A driver may still choose to keep supporting the legacy ->set_mode() callback,
but ->set_mode() wouldn't be supporting any new modes beyond RESUME. If a driver
wants to benefit from using a new mode, it would be required to migrate to
the mode specific callbacks.
The legacy ->set_mode() callback and the newly introduced mode-specific
callbacks are mutually exclusive. Only one of them should be supported by the
driver.
Sanity check is done at the time of registration to distinguish between optional
and required callbacks and to make error recovery and handling simpler. If the
legacy ->set_mode() callback is provided, all mode specific ones would be
ignored by the core but a warning is thrown if they are present.
Call sites calling ->set_mode() directly are also updated to use
__clockevents_set_mode() instead, as ->set_mode() may not be available anymore
for few drivers.
[1] https://lkml.org/lkml/2014/12/9/605
[2] https://lkml.org/lkml/2015/1/23/255
Suggested-by: Thomas Gleixner <tglx@linutronix.de> [2]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: linaro-kernel@lists.linaro.org
Cc: linaro-networking@linaro.org
Link: http://lkml.kernel.org/r/792d59a40423f0acffc9bb0bec9de1341a06fa02.1423788565.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For things like netpoll there is a need to disable an interrupt from
atomic context. Currently netpoll uses disable_irq() which will
sleep-wait on threaded handlers and thus forced_irqthreads breaks
things.
Provide disable_hardirq(), which uses synchronize_hardirq() to only wait
for active hardirq handlers; also change synchronize_hardirq() to
return the status of threaded handlers.
This will allow one to try-disable an interrupt from atomic context, or
in case of request_threaded_irq() to only wait for the hardirq part.
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Miller <davem@davemloft.net>
Cc: Eyal Perry <eyalpe@mellanox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Quentin Lambert <lambert.quentin@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Link: http://lkml.kernel.org/r/20150205130623.GH5029@twins.programming.kicks-ass.net
[ Fixed typos and such. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Additional validation of adjtimex freq values to avoid
potential multiplication overflows were added in commit
5e5aeb4367 (time: adjtimex: Validate the ADJ_FREQUENCY values)
Unfortunately the patch used LONG_MAX/MIN instead of
LLONG_MAX/MIN, which was fine on 64-bit systems, but being
much smaller on 32-bit systems caused false positives
resulting in most direct frequency adjustments to fail w/
EINVAL.
ntpd only does direct frequency adjustments at startup, so
the issue was not as easily observed there, but other time
sync applications like ptpd and chrony were more effected by
the bug.
See bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=92481https://bugzilla.redhat.com/show_bug.cgi?id=1188074
This patch changes the checks to use LLONG_MAX for
clarity, and additionally the checks are disabled
on 32-bit systems since LLONG_MAX/PPM_SCALE is always
larger then the 32-bit long freq value, so multiplication
overflows aren't possible there.
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Reported-by: George Joseph <george.joseph@fairview5.com>
Tested-by: George Joseph <george.joseph@fairview5.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # v3.19+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/r/1423553436-29747-1-git-send-email-john.stultz@linaro.org
[ Prettified the changelog and the comments a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
io_schedule() calls blk_flush_plug() which, depending on the
contents of current->plug, can initiate arbitrary blk-io requests.
Note that this contrasts with blk_schedule_flush_plug() which requires
all non-trivial work to be handed off to a separate thread.
This makes it possible for io_schedule() to recurse, and initiating
block requests could possibly call mempool_alloc() which, in times of
memory pressure, uses io_schedule().
Apart from any stack usage issues, io_schedule() will not behave
correctly when called recursively as delayacct_blkio_start() does
not allow for repeated calls.
So:
- use ->in_iowait to detect recursion. Set it earlier, and restore
it to the old value.
- move the call to "raw_rq" after the call to blk_flush_plug().
As this is some sort of per-cpu thing, we want some chance that
we are on the right CPU
- When io_schedule() is called recurively, use blk_schedule_flush_plug()
which cannot further recurse.
- as this makes io_schedule() a lot more complex and as io_schedule()
must match io_schedule_timeout(), but all the changes in io_schedule_timeout()
and make io_schedule a simple wrapper for that.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Moved the now rudimentary io_schedule() into sched.h. ]
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Link: http://lkml.kernel.org/r/20150213162600.059fffb2@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit de30ec4730 "Remove unnecessary ->wait.lock serialization when
reading completion state" was not correct, without lock/unlock the code
like stop_machine_from_inactive_cpu()
while (!completion_done())
cpu_relax();
can return before complete() finishes its spin_unlock() which writes to
this memory. And spin_unlock_wait().
While at it, change try_wait_for_completion() to use READ_ONCE().
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Added a comment with the barrier. ]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicholas Mc Guire <der.herr@hofr.at>
Cc: raghavendra.kt@linux.vnet.ibm.com
Cc: waiman.long@hp.com
Fixes: de30ec4730 ("sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state")
Link: http://lkml.kernel.org/r/20150212195913.GA30430@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the function graph tracer needs to disable preemption, it might
call preempt_schedule() after reenabling it if something triggered the
need for rescheduling in between.
Therefore we can't trace preempt_schedule() itself because we would
face a function tracing recursion otherwise as the tracer is always
called before PREEMPT_ACTIVE gets set to prevent that recursion. This is
why preempt_schedule() is tagged as "notrace".
But the same issue applies to every function called by preempt_schedule()
before PREEMPT_ACTIVE is actually set. And preempt_schedule_common() is
one such example. Unfortunately we forgot to tag it as notrace as well
and as a result we are encountering tracing recursion since it got
introduced by:
a18b5d0181 ("sched: Fix missing preemption opportunity")
Let's fix that by applying the appropriate function tag to
preempt_schedule_common().
Reported-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1424110807-15057-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A deadline task may be throttled and dequeued at the same time.
This happens, when it becomes throttled in schedule(), which
is called to go to sleep:
current->state = TASK_INTERRUPTIBLE;
schedule()
deactivate_task()
dequeue_task_dl()
update_curr_dl()
start_dl_timer()
__dequeue_task_dl()
prev->on_rq = 0;
Later the timer fires, but the task is still dequeued:
dl_task_timer()
enqueue_task_dl() /* queues on dl_rq; on_rq remains 0 */
Someone wakes it up:
try_to_wake_up()
enqueue_dl_entity()
BUG_ON(on_dl_rq())
Patch fixes this problem, it prevents queueing !on_rq tasks
on dl_rq.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Wrote comment. ]
Cc: Juri Lelli <juri.lelli@arm.com>
Fixes: 1019a359d3 ("sched/deadline: Fix stale yield state")
Link: http://lkml.kernel.org/r/1374601424090314@web4j.yandex.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Kirill reported that a dl task can be throttled and dequeued at the
same time. This happens, when it becomes throttled in schedule(),
which is called to go to sleep:
current->state = TASK_INTERRUPTIBLE;
schedule()
deactivate_task()
dequeue_task_dl()
update_curr_dl()
start_dl_timer()
__dequeue_task_dl()
prev->on_rq = 0;
This invalidates the assumption from commit 0f397f2c90 ("sched/dl:
Fix race in dl_task_timer()"):
"The only reason we don't strictly need ->pi_lock now is because
we're guaranteed to have p->state == TASK_RUNNING here and are
thus free of ttwu races".
And therefore we have to use the full task_rq_lock() here.
This further amends the fact that we forgot to update the rq lock loop
for TASK_ON_RQ_MIGRATE, from commit cca26e8009 ("sched: Teach
scheduler to understand TASK_ON_RQ_MIGRATING state").
Reported-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Link: http://lkml.kernel.org/r/20150217123139.GN5029@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There was a wee bit of confusion around the exact ordering here;
clarify things.
Reported-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20150217121258.GM5029@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With task_blocks_on_rt_mutex() returning early -EDEADLK we never
add the waiter to the waitqueue. Later, we try to remove it via
remove_waiter() and go boom in rt_mutex_top_waiter() because
rb_entry() gives a NULL pointer.
( Tested on v3.18-RT where rtmutex is used for regular mutex and I
tried to get one twice in a row. )
Not sure when this started but I guess 397335f004 ("rtmutex: Fix
deadlock detector for real") or commit 3d5c9340d1 ("rtmutex:
Handle deadlock detection smarter").
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> # for v3.16 and later kernels
Link: http://lkml.kernel.org/r/1424187823-19600-1-git-send-email-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull getname/putname updates from Al Viro:
"Rework of getname/getname_kernel/etc., mostly from Paul Moore. Gets
rid of quite a pile of kludges between namei and audit..."
* 'getname2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
audit: replace getname()/putname() hacks with reference counters
audit: fix filename matching in __audit_inode() and __audit_inode_child()
audit: enable filename recording via getname_kernel()
simpler calling conventions for filename_mountpoint()
fs: create proper filename objects using getname_kernel()
fs: rework getname_kernel to handle up to PATH_MAX sized filenames
cut down the number of do_path_lookup() callers
Pull misc VFS updates from Al Viro:
"This cycle a lot of stuff sits on topical branches, so I'll be sending
more or less one pull request per branch.
This is the first pile; more to follow in a few. In this one are
several misc commits from early in the cycle (before I went for
separate branches), plus the rework of mntput/dput ordering on umount,
switching to use of fs_pin instead of convoluted games in
namespace_unlock()"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
switch the IO-triggering parts of umount to fs_pin
new fs_pin killing logics
allow attaching fs_pin to a group not associated with some superblock
get rid of the second argument of acct_kill()
take count and rcu_head out of fs_pin
dcache: let the dentry count go down to zero without taking d_lock
pull bumping refcount into ->kill()
kill pin_put()
mode_t whack-a-mole: chelsio
file->f_path.dentry is pinned down for as long as the file is open...
get rid of lustre_dump_dentry()
gut proc_register() a bit
kill d_validate()
ncpfs: get rid of d_validate() nonsense
selinuxfs: don't open-code d_genocide()
Merge yet more updates from Andrew Morton:
- a pile of minor fs fixes and cleanups
- kexec updates
- random misc fixes in various places: vmcore, rbtree, eventfd, ipc, seccomp.
- a series of python-based kgdb helper scripts
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (58 commits)
seccomp: cap SECCOMP_RET_ERRNO data to MAX_ERRNO
samples/seccomp: improve label helper
ipc,sem: use current->state helpers
scripts/gdb: disable pagination while printing from breakpoint handler
scripts/gdb: define maintainer
scripts/gdb: convert CpuList to generator function
scripts/gdb: convert ModuleList to generator function
scripts/gdb: use a generator instead of iterator for task list
scripts/gdb: ignore byte-compiled python files
scripts/gdb: port to python3 / gdb7.7
scripts/gdb: add basic documentation
scripts/gdb: add lx-lsmod command
scripts/gdb: add class to iterate over CPU masks
scripts/gdb: add lx_current convenience function
scripts/gdb: add internal helper and convenience function for per-cpu lookup
scripts/gdb: add get_gdbserver_type helper
scripts/gdb: add internal helper and convenience function to retrieve thread_info
scripts/gdb: add is_target_arch helper
scripts/gdb: add helper and convenience function to look up tasks
scripts/gdb: add task iteration class
...
The value resulting from the SECCOMP_RET_DATA mask could exceed MAX_ERRNO
when setting errno during a SECCOMP_RET_ERRNO filter action. This makes
sure we have a reliable value being set, so that an invalid errno will not
be ignored by userspace.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This provides a reliable breakpoint target, required for automatic symbol
loading via the gdb helper command 'lx-symbols'.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simplify the code around one of the conditionals in the kexec_load syscall
routine.
The original code was confusing with a redundant check on KEXEC_ON_CRASH
and comments outside of the conditional block. This change switches the
order of the conditional check, and cleans up the comments for the
conditional. There is no functional change to the code.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Maximilian Attems <max@stro.at>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct kimage has a member destination which is used to store the real
destination address of each page when load segment from user space buffer
to kernel. But we never retrieve the value stored in kimage->destination,
so this member variable in kimage and its assignment operation are
redundent code.
I guess for_each_kimage_entry just does the work that kimage->destination
is expected to do.
So in this patch just make a cleanup to remove it.
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Call __set_current_state() instead of assigning the new state directly.
These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP environments, keeping
track of who changed the state.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 84c751bd4a ("ptrace: add ability to retrieve signals without
removing from a queue (v4)") includes <linux/compat.h> globally in
ptrace.c
This patch removes inclusion under if defined CONFIG_COMPAT.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Till now suspend-to-idle has not been able to save much more energy
than runtime PM because of timer interrupts that periodically bring
CPUs out of idle while they are waiting for a wakeup interrupt. Of
course, the timer interrupts are not wakeup ones, so the handling of
them can be deferred until a real wakeup interrupt happens, but at
the same time we don't want to mass-expire timers at that point.
The solution is to suspend the entire timekeeping when the last CPU
is entering an idle state and resume it when the first CPU goes out
of idle. That has to be done with care, though, so as to avoid
accessing suspended clocksources etc. end we need extra support
from idle drivers for that.
This series of commits adds support for quiescing timers during
suspend-to-idle and adds the requisite callbacks to intel_idle
and the ACPI cpuidle driver.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJU4PNaAAoJEILEb/54YlRxgjsP/0UbDGbltVyM8VFhsobqhOni
thKJTJsqWqYgsPfTufbOGyvP6zskbsarDlzCXoKXuHaynIqcxY8xfZvMdcQr1j0S
nhKdqv4R6qlP3w2cFxXVZwhw21X3YO1zIxpi5Do1HdVuWoOvxq8Dk4cU8MrgOJC0
6ThC9Q7klheV4tY6Narlmmf6sX5O+S/EaqnupESSG4cqxNmlPw5AguLviBaUNVAY
RSjUX8LAce05bOIGEpaFY+vUws+jlU7/T/GEajquEsGF9zalh2CsWso5nQvilxrJ
22MVqXUyHaXmTC+U7nV78qRkavR6zyr3v/JBDse56qRI1mFlmyvGh8mE5ukmpqJE
Cg5rRC68o71xlBSVGhKW3Os2ks2Nenj2NLltrTyuh43OBJ691TaLsZnKh5nYt/MW
MZdqRRjIDTMF+/P1u4wY8S63labrrmp7w4T720CgaZCLJ/9VfZQuqKXTTm2R5/II
eDhFvdYXoP2748uUOn5sOr5/o0xhnMdaxykZZxE3IkSpOpIV1Mo2HWTIyDYXlILP
0OuJUUZFZtFOjWGCPn3YgoFT94C3nlO1vkXw//44okTUiUaaOZz+VWDF4fxdVeLR
8NGTe+/QzEq+2lbs+ZWRSM1hPukOntFcwCgWXFiqh9x2F00LAw9JpkiKBujxTjUV
m2WstYaML3W7gBMyhxg0
=55Jb
-----END PGP SIGNATURE-----
Merge tag 'suspend-to-idle-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull suspend-to-idle updates from Rafael Wysocki:
"Suspend-to-idle timer quiescing support for v3.20-rc1
Until now suspend-to-idle has not been able to save much more energy
than runtime PM because of timer interrupts that periodically bring
CPUs out of idle while they are waiting for a wakeup interrupt. Of
course, the timer interrupts are not wakeup ones, so the handling of
them can be deferred until a real wakeup interrupt happens, but at the
same time we don't want to mass-expire timers at that point.
The solution is to suspend the entire timekeeping when the last CPU is
entering an idle state and resume it when the first CPU goes out of
idle. That has to be done with care, though, so as to avoid accessing
suspended clocksources etc. end we need extra support from idle
drivers for that.
This series of commits adds support for quiescing timers during
suspend-to-idle and adds the requisite callbacks to intel_idle and the
ACPI cpuidle driver"
* tag 'suspend-to-idle-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / idle: Implement ->enter_freeze callback routine
intel_idle: Add ->enter_freeze callbacks
PM / sleep: Make it possible to quiesce timers during suspend-to-idle
timekeeping: Make it safe to use the fast timekeeper while suspended
timekeeping: Pass readout base to update_fast_timekeeper()
PM / sleep: Re-implement suspend-to-idle handling
Pull irqchip updates from Ingo Molnar:
"Various irqchip driver updates, plus a genirq core update that allows
the initial spreading of irqs amonst CPUs without having to do it from
user-space"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Fix null pointer reference in irq_set_affinity_hint()
irqchip: gic: Allow interrupt level to be set for PPIs
irqchip: mips-gic: Handle pending interrupts once in __gic_irq_dispatch()
irqchip: Conexant CX92755 interrupts controller driver
irqchip: Devicetree: document Conexant Digicolor irq binding
irqchip: omap-intc: Remove unused legacy interface for omap2
irqchip: omap-intc: Fix support for dm814 and dm816
irqchip: mtk-sysirq: Get irq number from register resource size
irqchip: renesas-intc-irqpin: r8a7779 IRLM setup support
genirq: Set initial affinity in irq_set_affinity_hint()
Pull x86 perf updates from Ingo Molnar:
"This series tightens up RDPMC permissions: currently even highly
sandboxed x86 execution environments (such as seccomp) have permission
to execute RDPMC, which may leak various perf events / PMU state such
as timing information and other CPU execution details.
This 'all is allowed' RDPMC mode is still preserved as the
(non-default) /sys/devices/cpu/rdpmc=2 setting. The new default is
that RDPMC access is only allowed if a perf event is mmap-ed (which is
needed to correctly interpret RDPMC counter values in any case).
As a side effect of these changes CR4 handling is cleaned up in the
x86 code and a shadow copy of the CR4 value is added.
The extra CR4 manipulation adds ~ <50ns to the context switch cost
between rdpmc-capable and rdpmc-non-capable mms"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks
perf/x86: Only allow rdpmc if a perf_event is mapped
perf: Pass the event to arch_perf_update_userpage()
perf: Add pmu callbacks to track event mapping and unmapping
x86: Add a comment clarifying LDT context switching
x86: Store a per-cpu shadow copy of CR4
x86: Clean up cr4 manipulation
kobject_init_and_add() takes expects format string for a name, so we
better provide it in order to avoid infoleaks if modules craft their
mod->name in a special way.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Kees Cook <keescook@chromium.org>
Acked-by: Seth Jennings <sjenning@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The efficiency of suspend-to-idle depends on being able to keep CPUs
in the deepest available idle states for as much time as possible.
Ideally, they should only be brought out of idle by system wakeup
interrupts.
However, timer interrupts occurring periodically prevent that from
happening and it is not practical to chase all of the "misbehaving"
timers in a whack-a-mole fashion. A much more effective approach is
to suspend the local ticks for all CPUs and the entire timekeeping
along the lines of what is done during full suspend, which also
helps to keep suspend-to-idle and full suspend reasonably similar.
The idea is to suspend the local tick on each CPU executing
cpuidle_enter_freeze() and to make the last of them suspend the
entire timekeeping. That should prevent timer interrupts from
triggering until an IO interrupt wakes up one of the CPUs. It
needs to be done with interrupts disabled on all of the CPUs,
though, because otherwise the suspended clocksource might be
accessed by an interrupt handler which might lead to fatal
consequences.
Unfortunately, the existing ->enter callbacks provided by cpuidle
drivers generally cannot be used for implementing that, because some
of them re-enable interrupts temporarily and some idle entry methods
cause interrupts to be re-enabled automatically on exit. Also some
of these callbacks manipulate local clock event devices of the CPUs
which really shouldn't be done after suspending their ticks.
To overcome that difficulty, introduce a new cpuidle state callback,
->enter_freeze, that will be guaranteed (1) to keep interrupts
disabled all the time (and return with interrupts disabled) and (2)
not to touch the CPU timer devices. Modify cpuidle_enter_freeze() to
look for the deepest available idle state with ->enter_freeze present
and to make the CPU execute that callback with suspended tick (and the
last of the online CPUs to execute it with suspended timekeeping).
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Theoretically, ktime_get_mono_fast_ns() may be executed after
timekeeping has been suspended (or before it is resumed) which
in turn may lead to undefined behavior, for example, when the
clocksource read from timekeeping_get_ns() called by it is
not accessible at that time.
Prevent that from happening by setting up a dummy readout base for
the fast timekeeper during timekeeping_suspend() such that it will
always return the same number of cycles.
After the last timekeeping_update() in timekeeping_suspend() the
clocksource is read and the result is stored as cycles_at_suspend.
The readout base from the current timekeeper is copied onto the
dummy and the ->read pointer of the dummy is set to a routine
unconditionally returning cycles_at_suspend. Next, the dummy is
passed to update_fast_timekeeper().
Then, ktime_get_mono_fast_ns() will work until the subsequent
timekeeping_resume() and the proper readout base for the fast
timekeeper will be restored by the timekeeping_update() called
right after clearing timekeeping_suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
debugfs/kprobes/enabled doesn't work correctly on optimized kprobes.
Masami Hiramatsu has a test report on x86_64 platform:
https://lkml.org/lkml/2015/1/19/274
This patch forces it to unoptimize kprobe if kprobes_all_disarmed is set.
It also checks the flag in unregistering path for skipping unneeded
disarming process when kprobes globally disarmed.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In original code, the probed instruction doesn't get optimized after
echo 0 > /sys/kernel/debug/kprobes/enabled
echo 1 > /sys/kernel/debug/kprobes/enabled
This is because original code checks kprobes_all_disarmed in
optimize_kprobe(), but this flag is turned off after calling that
function. Therefore, optimize_kprobe() will see kprobes_all_disarmed ==
true and doesn't do the optimization.
This patch simply turns off kprobes_all_disarmed earlier to enable
optimization.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This feature let us to detect accesses out of bounds of global variables.
This will work as for globals in kernel image, so for globals in modules.
Currently this won't work for symbols in user-specified sections (e.g.
__init, __read_mostly, ...)
The idea of this is simple. Compiler increases each global variable by
redzone size and add constructors invoking __asan_register_globals()
function. Information about global variable (address, size, size with
redzone ...) passed to __asan_register_globals() so we could poison
variable's redzone.
This patch also forces module_alloc() to return 8*PAGE_SIZE aligned
address making shadow memory handling (
kasan_module_alloc()/kasan_module_free() ) more simple. Such alignment
guarantees that each shadow page backing modules address space correspond
to only one module_alloc() allocation.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>