There is a race between resume from hibernation and the asynchronous
scanning of SCSI devices and to prevent it from happening we need to
call scsi_complete_async_scans() during resume from hibernation.
In addition, if the resume from hibernation is userland-driven, it's
better to wait for all device probes in the kernel to complete before
attempting to open the resume device.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing/filters: return proper error code when writing filter file
tracing/filters: allow user input integer to be oct or hex
tracing/filters: fix NULL pointer dereference
tracing/filters: NIL-terminate user input filter
ftrace: Output REC->var instead of __entry->var for trace format
Make __stringify support variable argument macros too
tracing: fix document references
tracing: fix splice return too large
tracing: update file->f_pos when splice(2) it
tracing: allocate page when needed
tracing: disable seeking for trace_pipe_raw
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: continue lock debugging despite some taints
lockdep: warn about lockdep disabling after kernel taint
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
percpu: unbreak alpha percpu
mutex: have non-spinning mutexes on s390 by default
Impact: broaden lockdep checks
Lockdep is disabled after any kernel taints. This might be convenient
to ignore bad locking issues which sources come from outside the kernel
tree. Nevertheless, it might be a frustrating experience for the
staging developers or those who experience a warning but are focused
on another things that require lockdep.
The v2 of this patch simply don't disable anymore lockdep in case
of TAINT_CRAP and TAINT_WARN events.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: LTP <ltp-list@lists.sourceforge.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg KH <gregkh@suse.de>
LKML-Reference: <1239412638-6739-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: provide useful missing info for developers
Kernel taint can occur in several situations such as warnings,
load of prorietary or staging modules, bad page, etc...
But when such taint happens, a developer might still be working on
the kernel, expecting that lockdep is still enabled. But a taint
disables lockdep without ever warning about it.
Such a kernel behaviour doesn't really help for kernel development.
This patch adds this missing warning.
Since the taint is done most of the time after the main message that
explain the real source issue, it seems safe to warn about it inside
add_taint() so that it appears at last, without hurting the main
information.
v2: Use a generic helper to disable lockdep instead of an
open coded xchg().
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1239412638-6739-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
BLK_TC_PC events should be treated differently with BLK_TC_FS events.
Before this patch:
# echo 1 > /sys/block/sda/sda1/trace/enable
# echo pc > /sys/block/sda/sda1/trace/act_mask
# echo blk > /debugfs/tracing/current_tracer
# (generate some BLK_TC_PC events)
# cat trace
bash-2184 [000] 1774.275413: 8,7 I N [bash]
bash-2184 [000] 1774.275435: 8,7 D N [bash]
bash-2184 [000] 1774.275540: 8,7 I R [bash]
bash-2184 [000] 1774.275547: 8,7 D R [bash]
ksoftirqd/0-4 [000] 1774.275580: 8,7 C N 0 [0]
bash-2184 [000] 1774.275648: 8,7 I R [bash]
bash-2184 [000] 1774.275653: 8,7 D R [bash]
ksoftirqd/0-4 [000] 1774.275682: 8,7 C N 0 [0]
bash-2184 [000] 1774.275739: 8,7 I R [bash]
bash-2184 [000] 1774.275744: 8,7 D R [bash]
ksoftirqd/0-4 [000] 1774.275771: 8,7 C N 0 [0]
bash-2184 [000] 1774.275804: 8,7 I R [bash]
bash-2184 [000] 1774.275808: 8,7 D R [bash]
ksoftirqd/0-4 [000] 1774.275836: 8,7 C N 0 [0]
After this patch:
# cat trace
bash-2263 [000] 366.782149: 8,7 I N 0 (00 ..) [bash]
bash-2263 [000] 366.782323: 8,7 D N 0 (00 ..) [bash]
bash-2263 [000] 366.782557: 8,7 I R 8 (25 00 ..) [bash]
bash-2263 [000] 366.782560: 8,7 D R 8 (25 00 ..) [bash]
ksoftirqd/0-4 [000] 366.782582: 8,7 C N (25 00 ..) [0]
bash-2263 [000] 366.782648: 8,7 I R 8 (5a 00 3f 00) [bash]
bash-2263 [000] 366.782650: 8,7 D R 8 (5a 00 3f 00) [bash]
ksoftirqd/0-4 [000] 366.782669: 8,7 C N (5a 00 3f 00) [0]
bash-2263 [000] 366.782710: 8,7 I R 8 (5a 00 08 00) [bash]
bash-2263 [000] 366.782713: 8,7 D R 8 (5a 00 08 00) [bash]
ksoftirqd/0-4 [000] 366.782730: 8,7 C N (5a 00 08 00) [0]
bash-2263 [000] 366.783375: 8,7 I R 36 (5a 00 08 00) [bash]
bash-2263 [000] 366.783379: 8,7 D R 36 (5a 00 08 00) [bash]
ksoftirqd/0-4 [000] 366.783404: 8,7 C N (5a 00 08 00) [0]
This is what we do with PC events in user-space blktrace.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49D32387.9040106@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Not all events are pc (packet command) events. An event is a pc
event only if it has BLK_TC_PC bit set.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49D3236D.3090705@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: refactor code for future changes
Current kmemtrace.h is used both as header file of kmemtrace and kmem's
tracepoints definition.
Tracepoints' definition file may be used by other code, and should only have
definition of tracepoint.
We can separate include/trace/kmemtrace.h into 2 files:
include/linux/kmemtrace.h: header file for kmemtrace
include/trace/kmem.h: definition of kmem tracepoints
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- propagate return value of filter_add_pred() to the user
- return -ENOSPC but not -ENOMEM or -EINVAL when the filter array
is full
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <49E04CF0.3010105@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make sure messages from user space are NIL-terminated strings,
otherwise we could dump random memory while reading filter file.
Try this:
# echo 'parent_comm ==' > events/sched/sched_process_fork/filter
# cat events/sched/sched_process_fork/filter
parent_comm == �
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <49E04C32.6060508@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Several drivers use asynchronous work to do device discovery, and we
synchronize with them in the compiled-in case before we actually try to
mount root filesystems etc.
However, when compiled as modules, that synchronization is missing - the
module loading completes, but the driver hasn't actually finished
probing for devices, and that means that any user mode that expects to
use the devices after the 'insmod' is now potentially broken.
We already saw one case of a similar issue in the ACPI battery code,
where the kernel itself expected the module to be all done, and unmapped
the init memory - but the async device discovery was still running.
That got hacked around by just removing the "__init" (see commit
5d38258ec0 "ACPI battery: fix async boot
oops"), but the real fix is to just make the module loading wait for all
async work to be completed.
It will slow down module loading, but since common devices should be
built in anyway, and since the bug is really annoying and hard to handle
from user space (and caused several S3 resume regressions), the simple
fix to wait is the right one.
This fixes at least
http://bugzilla.kernel.org/show_bug.cgi?id=13063
but probably a few other bugzilla entries too (12936, for example), and
is confirmed to fix Rafael's storage driver breakage after resume bug
report (no bugzilla entry).
We should also be able to now revert that ACPI battery fix.
Reported-and-tested-by: Rafael J. Wysocki <rjw@suse.com>
Tested-by: Heinz Diehl <htd@fancy-poultry.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the get_futex_key() call were to fail, the existing code would
try and put_futex_key() prior to returning. This patch makes sure
we only put_futex_key() if get_futex_key() succeeded.
Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
LKML-Reference: <20090410165005.14342.16973.stgit@Aeon>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When moving documents to Documentation/trace/, I forgot to
grep Kconfig to find out those references.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Pekka Paalanen <pq@iki.fi>
Cc: eduard.munteanu@linux360.ro
LKML-Reference: <49DE97EF.7080208@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While trying to optimize the new lock on reiserfs to replace
the bkl, I find the lock tracing very useful though it lacks
something important for performance (and latency) instrumentation:
the time a task waits for a lock.
That's what this patch implements:
bash-4816 [000] 202.652815: lock_contended: lock_contended: &sb->s_type->i_mutex_key
bash-4816 [000] 202.652819: lock_acquired: &rq->lock (0.000 us)
<...>-4787 [000] 202.652825: lock_acquired: &rq->lock (0.000 us)
<...>-4787 [000] 202.652829: lock_acquired: &rq->lock (0.000 us)
bash-4816 [000] 202.652833: lock_acquired: &sb->s_type->i_mutex_key (16.005 us)
As shown above, the "lock acquired" field is followed by the time
it has been waiting for the lock. Usually, a lock contended entry
is followed by a near lock_acquired entry with a non-zero time waited.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1238975373-15739-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: pick up both v2.6.30-rc1 [which includes tracing/urgent fixes]
and pick up the current lineup of tracing/urgent fixes as well
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I got these from strace:
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
I wanted to splice_read 4096 bytes, but it returns 8192 or larger.
It is because the return value of tracing_buffers_splice_read()
does not include "zero out any left over data" bytes.
But tracing_buffers_read() includes these bytes, we make them
consistent.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Cleanup
These two lines:
if (unlikely(*ppos))
return -ESPIPE;
in tracing_buffers_splice_read() are not needed, VFS layer
has disabled seek(2).
We remove these two lines, and then we can update file->f_pos.
And tracing_buffers_read() updates file->f_pos, this fix
make tracing_buffers_splice_read() updates file->f_pos too.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46670.4010503@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Cleanup
Sometimes, we open trace_pipe_raw, but we don't read(2) it,
we just splice(2) it, thus, the page is not used.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D4666B.4010608@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: disable pread()
We set tracing_buffers_fops.llseek to no_llseek,
but we can still perform pread() to read this file.
That is not expected.
This fix uses nonseekable_open() to disable it.
tracing_buffers_fops.llseek is still set to no_llseek,
it mark this file is a "non-seekable device" and is used by
sys_splice(). See also do_splice() or manual of splice(2):
ERRORS
EINVAL Target file system doesn't support splicing;
neither of the descriptors refers to a pipe;
or offset given for non-seekable device.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46668.8030806@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: consolidate documents
blktrace: pass the right pointer to kfree()
tracing/syscalls: use a dedicated file header
tracing: append a comma to INIT_FTRACE_GRAPH
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: do not count frozen tasks toward load
sched: refresh MAINTAINERS entry
sched: Print sched_group::__cpu_power in sched_domain_debug
cpuacct: add per-cgroup utime/stime statistics
posixtimers, sched: Fix posix clock monotonicity
sched_rt: don't allocate cpumask in fastpath
cpuacct: make cpuacct hierarchy walk in cpuacct_charge() safe when rcupreempt is used -v2
Impact: performance regression fix for s390
The adaptive spinning mutexes will not always do what one would expect on
virtualized architectures like s390. Especially the cpu_relax() loop in
mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
away by the hypervisor.
We would end up in a cpu_relax() loop when there is no chance that the
state of the mutex changes until the target cpu has been scheduled again by
the hypervisor.
For that reason we should change the default behaviour to no-spin on s390.
We do have an instruction which allows to yield the current cpu in favour of
a different target cpu. Also we have an instruction which allows us to figure
out if the target cpu is physically backed.
However we need to do some performance tests until we can come up with
a solution that will do the right thing on s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
LKML-Reference: <20090409184834.7a0df7b2@osiris.boeblingen.de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix perf-report output for /home mounted binaries, etc.
dentry_path() only provide path-names up to the mount root, which is
unsuited for out purpose, use d_path() instead.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090409085524.601794134@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add sysctl for paranoid/relaxed perfcounters policy
Allow the use of system wide perf counters to everybody, but provide
a sysctl to disable it for the paranoid security minded.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090409085524.514046352@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: performance optimization
The mmap/comm tracking code does quite a lot of work before it discovers
there's no interest in it, avoid that by keeping a counter.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090409085524.427173196@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: micro-optimization
Under group scheduling we traverse up until we are at common siblings
to make the wakeup comparison on.
At this point however, they should have the same parent so continuing
to check up the tree is redundant.
Signed-off-by: Paul Turner <pjt@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <alpine.DEB.1.00.0904081520320.30317@kitami.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix kfree crash with non-standard act_mask string
If passing a string with leading white spaces to strstrip(),
the returned ptr != the original ptr.
This bug was introduced by me.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <49DD694C.8020902@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix build warnings and possibe compat misbehavior on IA64
Building a kernel on ia64 might trigger these ugly build warnings:
CC arch/ia64/ia32/sys_ia32.o
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
In file included from include/linux/elf.h:7,
from include/linux/module.h:14,
from include/linux/ftrace.h:8,
from include/linux/syscalls.h:68,
from arch/ia64/ia32/sys_ia32.c:18:
arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition
[...]
sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
to import the syscalls tracing prototypes.
But including ftrace.h can pull too much things for a low level file,
especially on ia64 where the ia32 private headers conflict with higher
level headers.
Now we isolate the syscall tracing headers in their own lightweight file.
Reported-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Baron <jbaron@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Davidson <md@google.com>
LKML-Reference: <20090408184058.GB6017@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: circular locking bugfix
The various implemetnations and proposed implemetnations of work_on_cpu()
are vulnerable to various deadlocks because they all used queues of some
form.
Unrelated pieces of kernel code thus gained dependencies wherein if one
work_on_cpu() caller holds a lock which some other work_on_cpu() callback
also takes, the kernel could rarely deadlock.
Fix this by creating a short-lived kernel thread for each work_on_cpu()
invokation.
This is not terribly fast, but the only current caller of work_on_cpu() is
pci_call_probe().
It would be nice to find some other way of doing the node-local
allocations in the PCI probe code so that we can zap work_on_cpu()
altogether. The code there is rather nasty. I can't think of anything
simple at this time...
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
kthreadd is the single thread which implements ths "create" request, move
sched_setscheduler/etc from create_kthread() to kthread_create() to
improve the scalability.
We should be careful with sched_setscheduler(), use _nochek helper.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Vitaliy Gusev <vgusev@openvz.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Remove the unnecessary find_task_by_pid_ns(). kthread() can just
use "current" to get the same result.
Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This fixes all the checkpatch --file complaints about kernel/ptrace.c
and also removes an unused #include. I've verified that there are no
changes to the compiled code on x86_64.
Signed-off-by: Roland McGrath <roland@redhat.com>
[ Removed the parts that just split a line - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul suggested we allow for data addresses to be recorded along with
the traditional IPs as power can provide these.
For now, only the software pagefault events provide data addresses,
but in the future power might as well for some events.
x86 doesn't seem capable of providing this atm.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.394816925@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move PERF_RECORD_TIME so that all the fixed length items come before
the variable length ones.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.307926436@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Similar to the mmap data stream, add one that tracks the task COMM field,
so that the userspace reporting knows what to call a task.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.127422406@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Push the PERF_EVENT_COUNTER_OVERFLOW bit into the misc field so that
we can have the full 32bit for PERF_RECORD_ bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130408.891867663@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Limit the size of each record to 64k (or should we count in multiples
of u64 and have a 512K limit?), this gives 16 bits or spare room in the
header, which we can use for misc bits, so as to not have to grow the
record with u64 every time we have a few bits to report.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130408.769271806@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We should not be updating ctx->time from NMI context, work around that.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130408.681326666@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
update_rlimit_cpu() tries to optimize out set_process_cpu_timer() in case
when we already have CPUCLOCK_PROF timer which should expire first. But it
uses cputime_lt() instead of cputime_gt().
Test case:
int main(void)
{
struct itimerval it = {
.it_value = { .tv_sec = 1000 },
};
assert(!setitimer(ITIMER_PROF, &it, NULL));
struct rlimit rl = {
.rlim_cur = 1,
.rlim_max = 1,
};
assert(!setrlimit(RLIMIT_CPU, &rl));
for (;;)
;
return 0;
}
Without this patch, the task is not killed as RLIMIT_CPU demands.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000610.GA10108@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
See http://bugzilla.kernel.org/show_bug.cgi?id=12911
copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
fastpath_timer_check() returns false unless we have other expired cpu timers.
Change copy_signal() to set cputime_expires.prof_exp if we have RLIMIT_CPU.
Also, set cputimer.running = 1 in that case. This is not strictly necessary,
but imho makes sense.
Reported-by: Peter Lojkin <ia6432@inbox.ru>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000607.GA10104@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas's testing caught a problem when the requeue target futex is
unowned and multiple tasks are requeued to it. This patch ensures
the FUTEX_WAITERS bit gets set if futex_requeue() will requeue one
or more tasks in addition to the one acquiring the lock.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Conflicts:
arch/powerpc/include/asm/systbl.h
arch/powerpc/include/asm/unistd.h
include/linux/init_task.h
Merge reason: the conflicts are non-trivial: PowerPC placement
of sys_perf_counter_open has to be mixed with the
new preadv/pwrite syscalls.
Signed-off-by: Ingo Molnar <mingo@elte.hu>