Commit graph

14579 commits

Author SHA1 Message Date
Paul Turner
9d85f21c94 sched: Track the runnable average on a per-task entity basis
Instead of tracking averaging the load parented by a cfs_rq, we can track
entity load directly. With the load for a given cfs_rq then being the sum
of its children.

To do this we represent the historical contribution to runnable average
within each trailing 1024us of execution as the coefficients of a
geometric series.

We can express this for a given task t as:

  runnable_sum(t) = \Sum u_i * y^i, runnable_avg_period(t) = \Sum 1024 * y^i
  load(t) = weight_t * runnable_sum(t) / runnable_avg_period(t)

Where: u_i is the usage in the last i`th 1024us period (approximately 1ms)
~ms and y is chosen such that y^k = 1/2.  We currently choose k to be 32 which
roughly translates to about a sched period.

Signed-off-by: Paul Turner <pjt@google.com>
Reviewed-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120823141506.372695337@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24 10:27:18 +02:00
Chuansheng Liu
351f181f91 timers, sched: Correct the comments for tick_sched_timer()
In the comments of function tick_sched_timer(), the sentence
"timer->base->cpu_base->lock held" is not right.

In function __run_hrtimer(), before call timer->function(),
the cpu_base->lock has been unlocked.

Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Cc: fei.li@intel.com
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1351098455.15558.1421.camel@cliu38-desktop-build
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24 10:16:51 +02:00
Daniel Vetter
6b898c07cb console: use might_sleep in console_lock
Instead of BUG_ON(in_interrupt()), since that doesn't check for all
the newfangled stuff like preempt.

Note that this is valid since the console_sem is essentially used like
a real mutex with only two twists:
- we allow trylock from hardirq context
- across suspend/resume we lock the logical console_lock, but drop the
  semaphore protecting the locking state.

Now that doesn't guarantee that no one is playing tricks in
single-thread atomic contexts at suspend/resume/boot time, but
- I couldn't find anything suspicious with some grepping,
- might_sleep shouldn't die,
- and I think the upside of catching more potential issues is worth
  the risk of getting a might_sleep backtrace that would have been
  save (and then dealing with that fallout).

Cc: Dave Airlie <airlied@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-23 20:14:55 -07:00
Linus Torvalds
e17b131583 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Most of these are uprobes race fixes from Oleg, and their preparatory
  cleanups.  (It's larger than what I'd normally send for an -rc kernel,
  but they looked significant enough to not delay them.)

  There's also an oprofile fix and an uncore PMU fix."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
  perf/x86: Disable uncore on virtualized CPUs
  oprofile, x86: Fix wrapping bug in op_x86_get_ctrl()
  ring-buffer: Check for uninitialized cpu buffer before resizing
  uprobes: Fix the racy uprobe->flags manipulation
  uprobes: Fix prepare_uprobe() race with itself
  uprobes: Introduce prepare_uprobe()
  uprobes: Fix handle_swbp() vs unregister() + register() race
  uprobes: Do not delete uprobe if uprobe_unregister() fails
  uprobes: Don't return success if alloc_uprobe() fails
  uprobes/x86: Only rep+nop can be emulated correctly
  uprobes: Simplify is_swbp_at_addr(), remove stale comments
  uprobes: Kill set_orig_insn()->is_swbp_at_addr()
  uprobes: Introduce copy_opcode(), kill read_opcode()
  uprobes: Kill set_swbp()->is_swbp_at_addr()
  uprobes: Restrict valid_vma(false) to skip VM_SHARED vmas
  uprobes: Change valid_vma() to demand VM_MAYEXEC rather than VM_EXEC
  uprobes: Change write_opcode() to use FOLL_FORCE
  uprobes: Move clear_thread_flag(TIF_UPROBE) to uprobe_notify_resume()
  uprobes: Kill UTASK_BP_HIT state
  uprobes: Fix UPROBE_SKIP_SSTEP checks in handle_swbp()
  ...
2012-10-24 04:07:51 +03:00
Paul E. McKenney
53bb857c37 rcu: Dump number of callbacks in stall warning messages
In theory, if a grace period manages to get started despite there being
no callbacks on any of the CPUs, all CPUs could go into dyntick-idle
mode, so that the grace period would never end.  This commit updates
the RCU CPU stall warning messages to detect this condition by summing
up the number of callbacks on all CPUs.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:55:27 -07:00
Paul E. McKenney
eee0588261 rcu: Add grace-period information to RCU CPU stall warnings
This commit causes the last grace period started and completed to be
printed on RCU CPU stall warning messages in order to aid diagnosis.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:55:26 -07:00
Paul E. McKenney
b637a328bd rcu: Print remote CPU's stacks in stall warnings
The RCU CPU stall warnings rely on trigger_all_cpu_backtrace() to
do NMI-based dump of the stack traces of all CPUs.  Unfortunately, a
number of architectures do not implement trigger_all_cpu_backtrace(), in
which case RCU falls back to just dumping the stack of the running CPU.
This is unhelpful in the case where the running CPU has detected that
some other CPU has stalled.

This commit therefore makes the running CPU dump the stacks of the
tasks running on the stalled CPUs.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:55:25 -07:00
Lai Jiangshan
f2ebfbc991 srcu: Export process_srcu()
Because process_srcu() will be used in DEFINE_SRCU(), which is a macro
that could be expanded pretty much anywhere, it can no longer be static.
Note that process_srcu() is still internal to srcu.h.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:54:42 -07:00
Lai Jiangshan
4e87b2d7e8 srcu: Credit Lai Jiangshan with SRCU rewrite
Lai Jiangshan rewrote SRCU, so this commit ensures that he gets his
proper share of blame^Wcredit.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:54:41 -07:00
Paul E. McKenney
340f588bba rcu: Fix precedence error in cpu_needs_another_gp()
The fix introduced by a10d206e (rcu: Fix day-one dyntick-idle
stall-warning bug) has a C-language precedence error.  It turns out
that this error is harmless in that the same result is computed for all
inputs, but the code is nevertheless a potential source of confusion.
This commit therefore introduces parentheses in order to force the
execution of the code to reflect the intent.

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:54:09 -07:00
Antti P Miettinen
3705b88db0 rcu: Add a module parameter to force use of expedited RCU primitives
There have been some embedded applications that would benefit from
use of expedited grace-period primitives.  In some ways, this is
similar to synchronize_net() doing either a normal or an expedited
grace period depending on lock state, but with control outside of
the kernel.

This commit therefore adds rcu_expedited boot and sysfs parameters
that cause the kernel to substitute expedited primitives for the
normal grace-period primitives.

[ paulmck: Add trace/event/rcu.h to kernel/srcu.c to avoid build error.
	   Get rid of infinite loop through contention path.]

Signed-off-by: Antti P Miettinen <amiettinen@nvidia.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:54:08 -07:00
Frederic Weisbecker
4d9a5d4319 rcu: Remove rcu_switch()
It's only there to call rcu_user_hooks_switch(). Let's
just call rcu_user_hooks_switch() directly, we don't need this
function in the middle.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:54:06 -07:00
Paul E. McKenney
489832609a rcu: Make rcutorture give diagnostics if CPU offline fails
This commit causes rcutorture to print the errno if cpu_down() fails
when the rcutorture "verbose" module parameter is specified.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:46:47 -07:00
Paul E. McKenney
abfd6e58ae rcu: Fix comment about _rcu_barrier()/orphanage exclusion
In the old days, _rcu_barrier() acquired ->onofflock to exclude
rcu_send_cbs_to_orphanage(), which allowed the latter to avoid memory
barriers in callback handling.  However, _rcu_barrier() recently started
doing get_online_cpus() to lock out CPU-hotplug operations entirely, which
means that the comment in rcu_send_cbs_to_orphanage() that talks about
->onofflock is now obsolete.  This commit therefore fixes the comment.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:46:47 -07:00
Daniel Vetter
daee779718 console: implement lockdep support for console_lock
Dave Airlie recently discovered a locking bug in the fbcon layer,
where a timer_del_sync (for the blinking cursor) deadlocks with the
timer itself, since both (want to) hold the console_lock:

https://lkml.org/lkml/2012/8/21/36

Unfortunately the console_lock isn't a plain mutex and hence has no
lockdep support. Which resulted in a few days wasted of tracking down
this bug (complicated by the fact that printk doesn't show anything
when the console is locked) instead of noticing the bug much earlier
with the lockdep splat.

Hence I've figured I need to fix that for the next deadlock involving
console_lock - and with kms/drm growing ever more complex locking
that'll eventually happen.

Now the console_lock has rather funky semantics, so after a quick irc
discussion with Thomas Gleixner and Dave Airlie I've quickly ditched
the original idead of switching to a real mutex (since it won't work)
and instead opted to annotate the console_lock with lockdep
information manually.

There are a few special cases:
- The console_lock state is protected by the console_sem, and usually
  grabbed/dropped at _lock/_unlock time. But the suspend/resume code
  drops the semaphore without dropping the console_lock (see
  suspend_console/resume_console). But since the same thread that did
  the suspend will do the resume, we don't need to fix up anything.

- In the printk code there's a special trylock, only used to kick off
  the logbuffer printk'ing in console_unlock. But all that happens
  while lockdep is disable (since printk does a few other evil
  tricks). So no issue there, either.

- The console_lock can also be acquired form irq context (but only
  with a trylock). lockdep already handles that.

This all leaves us with annotating the normal console_lock, _unlock
and _trylock functions.

And yes, it works - simply unloading a drm kms driver resulted in
lockdep complaining about the deadlock in fbcon_deinit:

======================================================
[ INFO: possible circular locking dependency detected ]
3.6.0-rc2+ #552 Not tainted
-------------------------------------------------------
kms-reload/3577 is trying to acquire lock:
 ((&info->queue)){+.+...}, at: [<ffffffff81058c70>] wait_on_work+0x0/0xa7

but task is already holding lock:
 (console_lock){+.+.+.}, at: [<ffffffff81264686>] bind_con_driver+0x38/0x263

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (console_lock){+.+.+.}:
       [<ffffffff81087440>] lock_acquire+0x95/0x105
       [<ffffffff81040190>] console_lock+0x59/0x5b
       [<ffffffff81209cb6>] fb_flashcursor+0x2e/0x12c
       [<ffffffff81057c3e>] process_one_work+0x1d9/0x3b4
       [<ffffffff810584a2>] worker_thread+0x1a7/0x24b
       [<ffffffff8105ca29>] kthread+0x7f/0x87
       [<ffffffff813b1204>] kernel_thread_helper+0x4/0x10

-> #0 ((&info->queue)){+.+...}:
       [<ffffffff81086cb3>] __lock_acquire+0x999/0xcf6
       [<ffffffff81087440>] lock_acquire+0x95/0x105
       [<ffffffff81058cab>] wait_on_work+0x3b/0xa7
       [<ffffffff81058dd6>] __cancel_work_timer+0xbf/0x102
       [<ffffffff81058e33>] cancel_work_sync+0xb/0xd
       [<ffffffff8120a3b3>] fbcon_deinit+0x11c/0x1dc
       [<ffffffff81264793>] bind_con_driver+0x145/0x263
       [<ffffffff81264a45>] unbind_con_driver+0x14f/0x195
       [<ffffffff8126540c>] store_bind+0x1ad/0x1c1
       [<ffffffff8127cbb7>] dev_attr_store+0x13/0x1f
       [<ffffffff8116d884>] sysfs_write_file+0xe9/0x121
       [<ffffffff811145b2>] vfs_write+0x9b/0xfd
       [<ffffffff811147b7>] sys_write+0x3e/0x6b
       [<ffffffff813b0039>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(console_lock);
                               lock((&info->queue));
                               lock(console_lock);
  lock((&info->queue));

 *** DEADLOCK ***

v2: Mark the lockdep_map static, noticed by Jani Nikula.

Cc: Dave Airlie <airlied@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-22 16:12:20 -07:00
Rafael J. Wysocki
5efbe4279f PM / QoS: Introduce request and constraint data types for PM QoS flags
Introduce struct pm_qos_flags_request and struct pm_qos_flags
representing PM QoS flags request type and PM QoS flags constraint
type, respectively.  With these definitions the data structures
will be arranged so that the list member of a struct pm_qos_flags
object will contain the head of a list of struct pm_qos_flags_request
objects representing all of the "flags" requests present for the
given device.  Then, the effective_flags member of a struct
pm_qos_flags object will contain the bitwise OR of the flags members
of all the struct pm_qos_flags_request objects in the list.

Additionally, introduce helper function pm_qos_update_flags()
allowing the caller to manage the list of struct pm_qos_flags_request
pointed to by the list member of struct pm_qos_flags.

The flags are of type s32 so that the request's "value" field
is always of the same type regardless of what kind of request it
is (latency requests already have value fields of type s32).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Jean Pihet <j-pihet@ti.com>
Acked-by: mark gross <markgross@thegnar.org>
2012-10-23 01:07:46 +02:00
Randy Dunlap
0390c88356 module_signing: fix printk format warning
Fix the warning:

  kernel/module_signing.c:195:2: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'size_t'

by using the proper 'z' modifier for printing a size_t.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-22 08:56:34 +03:00
Ingo Molnar
ef8ff74ed8 Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
Pull ftrace ring-buffer resizing fix from Steve Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-21 19:53:34 +02:00
Ingo Molnar
f38787f4f9 Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/urgent
Pull various uprobes bugfixes from Oleg Nesterov - mostly race and
failure path fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-21 18:18:17 +02:00
Ingo Molnar
0acfd009be Merge branch 'nohz/core' of git://github.com/fweisbec/linux-dynticks into timers/core
Pull uncontroversial cleanup/refactoring nohz patches from Frederic Weisbecker.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-21 18:14:02 +02:00
Tejun Heo
ead5c47371 cgroup_freezer: don't use cgroup_lock_live_group()
freezer_read/write() used cgroup_lock_live_group() to synchronize
against task migration into and out of the target cgroup.
cgroup_lock_live_group() grabs the internal cgroup lock and using it
from outside cgroup core leads to complex and fragile locking
dependency issues which are difficult to resolve.

Now that freezer_can_attach() is replaced with freezer_attach() and
update_if_frozen() updated, nothing requires excluding migration
against freezer state reads and changes.

This patch removes cgroup_lock_live_group() and the matching
cgroup_unlock() usages.  The prone-to-bitrot, already outdated and
unnecessary global lock hierarchy documentation is replaced with
documentation in local scope.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Li Zefan <lizefan@huawei.com>
2012-10-20 16:33:12 -07:00
Tejun Heo
b4d18311d3 cgroup_freezer: prepare update_if_frozen() for locking change
Locking will change such that migration can happen while
freezer_read/write() is in progress.  This means that
update_if_frozen() can no longer assume that all tasks in the cgroup
coform to the current freezer state - newly migrated tasks which
haven't finished freezer_attach() yet might be in any state.

This patch updates update_if_frozen() such that it no longer verifies
task states against freezer state.  It now simply decides whether
FREEZING stage is complete.

This removal of verification makes it meaningless to call from
freezer_change_state().  Drop it and move the fast exit test from
freezer_read() - the only left caller - to update_if_frozen().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Li Zefan <lizefan@huawei.com>
2012-10-20 16:33:08 -07:00
Tejun Heo
8755ade683 cgroup_freezer: allow moving tasks in and out of a frozen cgroup
cgroup_freezer is one of the few users of cgroup_subsys->can_attach()
and uses it to prevent tasks from being migrated into or out of a
frozen cgroup.  This makes cgroup_freezer cumbersome to use especially
when co-mounted with other controllers.

->can_attach() is problematic in general as it can make co-mounting
multiple cgroups difficult - migrating tasks may fail for reasons
completely irrelevant for other controllers.  freezer_can_attach() in
particular is more problematic because it messes with cgroup internal
locking to ensure that the state verification performed at
freezer_can_attach() stays valid until migration is complete.

This patch replaces freezer_can_attach() with freezer_attach() so that
tasks are always allowed to migrate - they are nudged into the
conforming state from freezer_attach().  This means that there can be
tasks which are being migrated which don't conform to the current
cgroup_freezer state until freezer_attach() is complete.  Under the
current locking scheme, the only such place is freezer_fork() which is
updated to handle such window.

While this patch doesn't remove the use of internal cgroup locking
from freezer_read/write() paths, it removes the requirement to keep
the freezer state constant while migrating and enables such change.

Note that this creates a userland visible behavior change - FROZEN
cgroup can no longer be used to lock migrations in and out of the
cgroup.  This behavior change is intended.  I don't think the feature
is necessary - userland should coordinate accesses to cgroup fs anyway
- and even if the feature is needed cgroup_freezer is the completely
wrong place to implement it.

Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <1350426526-14254-1-git-send-email-tj@kernel.org>
Cc: Matt Helsley <matthltc@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Li Zefan <lizefan@huawei.com>
2012-10-20 16:28:56 -07:00
Paul E. McKenney
62da192129 rcu: Accelerate callbacks for CPU initiating a grace period
Because grace-period initialization is carried out by a separate
kthread, it might happen on a different CPU than the one that
had the callback needing a grace period -- which is where the
callback acceleration needs to happen.

Fortunately, rcu_start_gp() holds the root rcu_node structure's
->lock, which prevents a new grace period from starting.  This
allows this function to safely determine that a grace period has
not yet started, which in turn allows it to fully accelerate any
callbacks that it has pending.  This commit adds this acceleration.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-20 13:47:10 -07:00
Kees Cook
31fd84b95e use clamp_t in UNAME26 fix
The min/max call needed to have explicit types on some architectures
(e.g. mn10300). Use clamp_t instead to avoid the warning:

  kernel/sys.c: In function 'override_release':
  kernel/sys.c:1287:10: warning: comparison of distinct pointer types lacks a cast [enabled by default]

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-19 18:51:17 -07:00
David Howells
caabe24057 MODSIGN: Move the magic string to the end of a module and eliminate the search
Emit the magic string that indicates a module has a signature after the
signature data instead of before it.  This allows module_sig_check() to
be made simpler and faster by the elimination of the search for the
magic string.  Instead we just need to do a single memcmp().

This works because at the end of the signature data there is the
fixed-length signature information block.  This block then falls
immediately prior to the magic number.

From the contents of the information block, it is trivial to calculate
the size of the signature data and thus the size of the actual module
data.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-19 17:30:40 -07:00
Tejun Heo
d878383211 Revert "cgroup: Remove task_lock() from cgroup_post_fork()"
This reverts commit 7e3aa30ac8.

The commit incorrectly assumed that fork path always performed
threadgroup_change_begin/end() and depended on that for
synchronization against task exit and cgroup migration paths instead
of explicitly grabbing task_lock().

threadgroup_change is not locked when forking a new process (as
opposed to a new thread in the same process) and even if it were it
wouldn't be effective as different processes use different threadgroup
locks.

Revert the incorrect optimization.

Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <20121008020000.GB2575@localhost>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: stable@vger.kernel.org
2012-10-19 14:09:35 -07:00
Tejun Heo
9bb71308b8 Revert "cgroup: Drop task_lock(parent) on cgroup_fork()"
This reverts commit 7e381b0eb1.

The commit incorrectly assumed that fork path always performed
threadgroup_change_begin/end() and depended on that for
synchronization against task exit and cgroup migration paths instead
of explicitly grabbing task_lock().

threadgroup_change is not locked when forking a new process (as
opposed to a new thread in the same process) and even if it were it
wouldn't be effective as different processes use different threadgroup
locks.

Revert the incorrect optimization.

Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <20121008020000.GB2575@localhost>
Acked-by: Li Zefan <lizefan@huawei.com>
Bitterly-Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: stable@vger.kernel.org
2012-10-19 14:08:49 -07:00
Cyrill Gorcunov
bbc2e3ef87 pidns: remove recursion from free_pid_ns()
free_pid_ns() operates in a recursive fashion:

free_pid_ns(parent)
  put_pid_ns(parent)
    kref_put(&ns->kref, free_pid_ns);
      free_pid_ns

thus if there was a huge nesting of namespaces the userspace may trigger
avalanche calling of free_pid_ns leading to kernel stack exhausting and a
panic eventually.

This patch turns the recursion into an iterative loop.

Based on a patch by Andrew Vagin.

[akpm@linux-foundation.org: export put_pid_ns() to modules]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-19 14:07:47 -07:00
Kees Cook
2702b1526c kernel/sys.c: fix stack memory content leak via UNAME26
Calling uname() with the UNAME26 personality set allows a leak of kernel
stack contents.  This fixes it by defensively calculating the length of
copy_to_user() call, making the len argument unsigned, and initializing
the stack buffer to zero (now technically unneeded, but hey, overkill).

CVE-2012-0957

Reported-by: PaX Team <pageexec@freemail.hu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-19 14:07:47 -07:00
Paul E. McKenney
85eae82a08 printk: Fix scheduling-while-atomic problem in console_cpu_notify()
The console_cpu_notify() function runs with interrupts disabled in the
CPU_DYING case.  It therefore cannot block, for example, as will happen
when it calls console_lock().  Therefore, remove the CPU_DYING leg of
the switch statement to avoid this problem.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-16 18:17:44 -07:00
Daisuke Nishimura
1f5320d597 cgroup: notify_on_release may not be triggered in some cases
notify_on_release must be triggered when the last process in a cgroup is
move to another. But if the first(and only) process in a cgroup is moved to
another, notify_on_release is not triggered.

	# mkdir /cgroup/cpu/SRC
	# mkdir /cgroup/cpu/DST
	#
	# echo 1 >/cgroup/cpu/SRC/notify_on_release
	# echo 1 >/cgroup/cpu/DST/notify_on_release
	#
	# sleep 300 &
	[1] 8629
	#
	# echo 8629 >/cgroup/cpu/SRC/tasks
	# echo 8629 >/cgroup/cpu/DST/tasks
	-> notify_on_release for /SRC must be triggered at this point,
	   but it isn't.

This is because put_css_set() is called before setting CGRP_RELEASABLE
in cgroup_task_migrate(), and is a regression introduce by the
commit:74a1166d(cgroups: make procs file writable), which was merged
into v3.0.

Cc: Ben Blum <bblum@andrew.cmu.edu>
Cc: <stable@vger.kernel.org> # v3.0.x and later
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Tejun Heo <tj@kernel.org>
2012-10-16 17:09:36 -07:00
Tejun Heo
3c426d5e11 cgroup_freezer: don't stall transition to FROZEN for PF_NOFREEZE or PF_FREEZER_SKIP tasks
cgroup_freezer doesn't transition from FREEZING to FROZEN if the
cgroup contains PF_NOFREEZE tasks or tasks sleeping with
PF_FREEZER_SKIP set.

Only kernel tasks can be non-freezable (PF_NOFREEZE) and there's
nothing cgroup_freezer or userland can do about or to it.  It's
pointless to stall the transition for PF_NOFREEZE tasks.

PF_FREEZER_SKIP indicates that the task can be skipped when
determining whether frozen state is reached.  A task with
PF_FREEZER_SKIP is guaranteed to perform try_to_freeze() after it
wakes up and can be considered frozen much like stopped or traced
tasks.  Note that a vfork parent uses PF_FREEZER_SKIP while waiting
for the child.

This updates update_if_frozen() such that it only considers freezable
tasks and treats %true freezer_should_skip() tasks as frozen.

This allows cgroups w/ kthreads and vfork parents successfully reach
FROZEN state.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
2012-10-16 15:03:14 -07:00
Tejun Heo
51f246ed95 cgroup_freezer: make it official that writes to freezer.state don't fail
try_to_freeze_cgroup() has condition checks which are intended to fail
the write operation to freezer.state if there are tasks which can't be
frozen.  The condition checks have been broken for quite some time
now.  freeze_task() returns %false if the target task can't be frozen,
so num_cant_freeze_now is never incremented.

In addition, strangely, cgroup freezing proceeds even after the write
is failed, which is rather broken.

This patch rips out the non-working code intended to fail the write to
freezer.state when the cgroup contains non-freezable tasks and makes
it official that writes to freezer.state succeed whether there are
non-freezable tasks in the cgroup or not.

This leaves is_task_frozen_enough() with only one user -
upste_if_frozen().  Collapse it into the caller.  Note that this
removes an extra call to freezing().

This doesn't cause any userland behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
2012-10-16 15:03:14 -07:00
Tejun Heo
5edee61ede cgroup: cgroup_subsys->fork() should be called after the task is added to css_set
cgroup core has a bug which violates a basic rule about event
notifications - when a new entity needs to be added, you add that to
the notification list first and then make the new entity conform to
the current state.  If done in the reverse order, an event happening
inbetween will be lost.

cgroup_subsys->fork() is invoked way before the new task is added to
the css_set.  Currently, cgroup_freezer is the only user of ->fork()
and uses it to make new tasks conform to the current state of the
freezer.  If FROZEN state is requested while fork is in progress
between cgroup_fork_callbacks() and cgroup_post_fork(), the child
could escape freezing - the cgroup isn't frozen when ->fork() is
called and the freezer couldn't see the new task on the css_set.

This patch moves cgroup_subsys->fork() invocation to
cgroup_post_fork() after the new task is added to the css_set.
cgroup_fork_callbacks() is removed.

Because now a task may be migrated during cgroup_subsys->fork(),
freezer_fork() is updated so that it adheres to the usual RCU locking
and the rather pointless comment on why locking can be different there
is removed (if it doesn't make anything simpler, why even bother?).

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@vger.kernel.org
2012-10-16 15:03:14 -07:00
Ingo Molnar
8ed92e51f9 sched: Add WAKEUP_PREEMPTION feature flag, on by default
As per the recent discussion with Mike and Linus, make it easier to
test with/without this feature. No change in default behavior.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-izoxq4haeg4mTognnDbwcevt@git.kernel.org
2012-10-16 10:05:27 +02:00
Frederic Weisbecker
94a5714020 tick: Conditionally build nohz specific code in tick handler
This optimize a bit the high res tick sched handler.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-10-15 18:51:08 +02:00
Frederic Weisbecker
9e8f559b08 tick: Consolidate tick handling for high and low res handlers
Besides unifying code, this also adds the idle check before
processing idle accounting specifics on the low res handler.
This way we also generalize this part of the nohz code for
!CONFIG_HIGH_RES_TIMERS to prepare for the adaptive tickless
features.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-10-15 18:42:25 +02:00
Frederic Weisbecker
5bb962269c tick: Consolidate timekeeping handling code
Unify the duplicated timekeeping handling code of low and high res tick
sched handlers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-10-15 18:35:11 +02:00
Linus Torvalds
d25282d1c9 Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module signing support from Rusty Russell:
 "module signing is the highlight, but it's an all-over David Howells frenzy..."

Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
  X.509: Fix indefinite length element skip error handling
  X.509: Convert some printk calls to pr_devel
  asymmetric keys: fix printk format warning
  MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
  MODSIGN: Make mrproper should remove generated files.
  MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
  MODSIGN: Use the same digest for the autogen key sig as for the module sig
  MODSIGN: Sign modules during the build process
  MODSIGN: Provide a script for generating a key ID from an X.509 cert
  MODSIGN: Implement module signature checking
  MODSIGN: Provide module signing public keys to the kernel
  MODSIGN: Automatically generate module signing keys if missing
  MODSIGN: Provide Kconfig options
  MODSIGN: Provide gitignore and make clean rules for extra files
  MODSIGN: Add FIPS policy
  module: signature checking hook
  X.509: Add a crypto key parser for binary (DER) X.509 certificates
  MPILIB: Provide a function to read raw data into an MPI
  X.509: Add an ASN.1 decoder
  X.509: Add simple ASN.1 grammar compiler
  ...
2012-10-14 13:39:34 -07:00
Linus Torvalds
6c536a17fa KGDB/KDB fixes and cleanups
Cleanups
    Clean up compile warnings in kgdboc.c and x86/kernel/kgdb.c
    Add module event hooks for simplified debugging with gdb
  Fixes
    Fix kdb to stop paging with 'q' on bta and dmesg
    Fix for data that scrolls off the vga console due to line wrapping
      when using the kdb pager
  New
    The debug core registers for kernel module events which allows a
      kernel aware gdb to automatically load symbols and break on entry
      to a kernel module
    Allow kgdboc=kdb to setup kdb on the vga console
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQeB8KAAoJEIciOldedpOjpbIP/j+LXEkzXKKfi/3m79VQ87DB
 5iUmTS84t84pomHamXX175AC0gA/2mC0FbbcHpqjlhxF4awXcviCNIiTdtSOTbbu
 G102naLHY8i77X+XbHuN2utJeaRLw8rsfMMZGmjJnjfpc4LtsaH0YTkUzbt3qvba
 N6/QvknadzIrmoCJvHipdOdsSmL0YmTS22+koG4es9B5jvOqVH/W7jZs1qRlVw96
 VxG5Psx4LPB+RI+ZwF1WwbGxbtqKGwkVvkcGG1XIW7FQojHmjw+vUERQCjoFueJ5
 NkKfus98j85/+MvSTkWx3L1K46MHMCFbtJs9RWftJ8GtoNNnm7GDxasoIG2bJKyG
 HFD3IGPuKAokE/equF3eGTRHeEM0IUGwT3EnBqdKd73zud27WsHaSqC/1CPR+74v
 ojLQ2ft1QF+pEkGrhRTdQpLyVnvEmxu8q+j9z9n/HlGEVv8kZ6LGxDPjWB+um/Yi
 Cs0XAryYrL5gE5O+Vwna61luughtIYJwR7+DeVxnQYJ43x/0MtN/SoURnwvrCTEo
 9FeoMgZm1nLh6EW29ahIT/hMu4f0sM91Kiwrmc/zEWZgoB++wo1n470qQmUUrOx4
 CPD7zdmDrf6YxDG2QTHjCtVErO4aJ5zN4Dq0+YyodV545SZVn3t4qBDTVvKhq4Y6
 NIhZAxrv5RKABwtLcP9E
 =uf0L
 -----END PGP SIGNATURE-----

Merge tag 'for_linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb

Pull KGDB/KDB fixes and cleanups from Jason Wessel:
 "Cleanups
   - Clean up compile warnings in kgdboc.c and x86/kernel/kgdb.c
   - Add module event hooks for simplified debugging with gdb
 Fixes
   - Fix kdb to stop paging with 'q' on bta and dmesg
   - Fix for data that scrolls off the vga console due to line wrapping
     when using the kdb pager
 New
   - The debug core registers for kernel module events which allows a
     kernel aware gdb to automatically load symbols and break on entry
     to a kernel module
   - Allow kgdboc=kdb to setup kdb on the vga console"

* tag 'for_linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
  tty/console: fix warnings in drivers/tty/serial/kgdboc.c
  kdb,vt_console: Fix missed data due to pager overruns
  kdb: Fix dmesg/bta scroll to quit with 'q'
  kgdboc: Accept either kbd or kdb to activate the vga + keyboard kdb shell
  kgdb,x86: fix warning about unused variable
  mips,kgdb: fix recursive page fault with CONFIG_KPROBES
  kgdb: Add module event hooks
2012-10-13 11:16:58 +09:00
Linus Torvalds
ade0899b29 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "This tree includes some late late perf items that missed the first
  round:

  tools:

   - Bash auto completion improvements, now we can auto complete the
     tools long options, tracepoint event names, etc, from Namhyung Kim.

   - Look up thread using tid instead of pid in 'perf sched'.

   - Move global variables into a perf_kvm struct, from David Ahern.

   - Hists refactorings, preparatory for improved 'diff' command, from
     Jiri Olsa.

   - Hists refactorings, preparatory for event group viewieng work, from
     Namhyung Kim.

   - Remove double negation on optional feature macro definitions, from
     Namhyung Kim.

   - Remove several cases of needless global variables, on most
     builtins.

   - misc fixes

  kernel:

   - sysfs support for IBS on AMD CPUs, from Robert Richter.

   - Support for an upcoming Intel CPU, the Xeon-Phi / Knights Corner
     HPC blade PMU, from Vince Weaver.

   - misc fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  perf: Fix perf_cgroup_switch for sw-events
  perf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu
  perf/AMD/IBS: Add sysfs support
  perf hists: Add more helpers for hist entry stat
  perf hists: Move he->stat.nr_events initialization to a template
  perf hists: Introduce struct he_stat
  perf diff: Removing the total_period argument from output code
  perf tool: Add hpp interface to enable/disable hpp column
  perf tools: Removing hists pair argument from output path
  perf hists: Separate overhead and baseline columns
  perf diff: Refactor diff displacement possition info
  perf hists: Add struct hists pointer to struct hist_entry
  perf tools: Complete tracepoint event names
  perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU
  perf evlist: Remove some unused methods
  perf evlist: Introduce add_newtp method
  perf kvm: Move global variables into a perf_kvm struct
  perf tools: Convert to BACKTRACE_SUPPORT
  perf tools: Long option completion support for each subcommands
  perf tools: Complete long option names of perf command
  ...
2012-10-13 10:20:11 +09:00
Linus Torvalds
4e21fc138b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull third pile of kernel_execve() patches from Al Viro:
 "The last bits of infrastructure for kernel_thread() et.al., with
  alpha/arm/x86 use of those.  Plus sanitizing the asm glue and
  do_notify_resume() on alpha, fixing the "disabled irq while running
  task_work stuff" breakage there.

  At that point the rest of kernel_thread/kernel_execve/sys_execve work
  can be done independently for different architectures.  The only
  pending bits that do depend on having all architectures converted are
  restrictred to fs/* and kernel/* - that'll obviously have to wait for
  the next cycle.

  I thought we'd have to wait for all of them done before we start
  eliminating the longjump-style insanity in kernel_execve(), but it
  turned out there's a very simple way to do that without flagday-style
  changes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to saner kernel_execve() semantics
  arm: switch to saner kernel_execve() semantics
  x86, um: convert to saner kernel_execve() semantics
  infrastructure for saner ret_from_kernel_thread semantics
  make sure that kernel_thread() callbacks call do_exit() themselves
  make sure that we always have a return path from kernel_execve()
  ppc: eeh_event should just use kthread_run()
  don't bother with kernel_thread/kernel_execve for launching linuxrc
  alpha: get rid of switch_stack argument of do_work_pending()
  alpha: don't bother passing switch_stack separately from regs
  alpha: take SIGPENDING/NOTIFY_RESUME loop into signal.c
  alpha: simplify TIF_NEED_RESCHED handling
2012-10-13 10:05:52 +09:00
Linus Torvalds
8418263e35 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull third pile of VFS updates from Al Viro:
 "Stuff from Jeff Layton, mostly.  Sanitizing interplay between audit
  and namei, removing a lot of insanity from audit_inode() mess and
  getting things ready for his ESTALE patchset."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  procfs: don't need a PATH_MAX allocation to hold a string representation of an int
  vfs: embed struct filename inside of names_cache allocation if possible
  audit: make audit_inode take struct filename
  vfs: make path_openat take a struct filename pointer
  vfs: turn do_path_lookup into wrapper around struct filename variant
  audit: allow audit code to satisfy getname requests from its names_list
  vfs: define struct filename and have getname() return it
  vfs: unexport getname and putname symbols
  acct: constify the name arg to acct_on
  vfs: allocate page instead of names_cache buffer in mount_block_root
  audit: overhaul __audit_inode_child to accomodate retrying
  audit: optimize audit_compare_dname_path
  audit: make audit_compare_dname_path use parent_len helper
  audit: remove dirlen argument to audit_compare_dname_path
  audit: set the name_len in audit_inode for parent lookups
  audit: add a new "type" field to audit_names struct
  audit: reverse arguments to audit_inode_child
  audit: no need to walk list in audit_inode if name is NULL
  audit: pass in dentry to audit_copy_inode wherever possible
  audit: remove unnecessary NULL ptr checks from do_path_lookup
2012-10-13 10:04:42 +09:00
Jeff Layton
adb5c2473d audit: make audit_inode take struct filename
Keep a pointer to the audit_names "slot" in struct filename.

Have all of the audit_inode callers pass a struct filename ponter to
audit_inode instead of a string pointer. If the aname field is already
populated, then we can skip walking the list altogether and just use it
directly.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 20:15:09 -04:00
Jeff Layton
669abf4e55 vfs: make path_openat take a struct filename pointer
...and fix up the callers. For do_file_open_root, just declare a
struct filename on the stack and fill out the .name field. For
do_filp_open, make it also take a struct filename pointer, and fix up its
callers to call it appropriately.

For filp_open, add a variant that takes a struct filename pointer and turn
filp_open into a wrapper around it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 20:15:09 -04:00
Jeff Layton
7ac86265dc audit: allow audit code to satisfy getname requests from its names_list
Currently, if we call getname() on a userland string more than once,
we'll get multiple copies of the string and multiple audit_names
records.

Add a function that will allow the audit_names code to satisfy getname
requests using info from the audit_names list, avoiding a new allocation
and audit_names records.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 20:15:08 -04:00
Jeff Layton
91a27b2a75 vfs: define struct filename and have getname() return it
getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.

For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.

This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.

Later, we'll add other information to the struct as it becomes
convenient.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 20:14:55 -04:00
Al Viro
a74fb73c12 infrastructure for saner ret_from_kernel_thread semantics
* allow kernel_execve() leave the actual return to userland to
caller (selected by CONFIG_GENERIC_KERNEL_EXECVE).  Callers
updated accordingly.
* architecture that does select GENERIC_KERNEL_EXECVE in its
Kconfig should have its ret_from_kernel_thread() do this:
	call schedule_tail
	call the callback left for it by copy_thread(); if it ever
returns, that's because it has just done successful kernel_execve()
	jump to return from syscall
IOW, its only difference from ret_from_fork() is that it does call the
callback.
* such an architecture should also get rid of ret_from_kernel_execve()
and __ARCH_WANT_KERNEL_EXECVE

This is the last part of infrastructure patches in that area - from
that point on work on different architectures can live independently.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-12 13:35:07 -04:00
Linus Torvalds
03d3602a83 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core update from Thomas Gleixner:
 - Bug fixes (one for a longstanding dead loop issue)
 - Rework of time related vsyscalls
 - Alarm timer updates
 - Jiffies updates to remove compile time dependencies

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Cast raw_interval to u64 to avoid shift overflow
  timers: Fix endless looping between cascade() and internal_add_timer()
  time/jiffies: bring back unconditional LATCH definition
  time: Convert x86_64 to using new update_vsyscall
  time: Only do nanosecond rounding on GENERIC_TIME_VSYSCALL_OLD systems
  time: Introduce new GENERIC_TIME_VSYSCALL
  time: Convert CONFIG_GENERIC_TIME_VSYSCALL to CONFIG_GENERIC_TIME_VSYSCALL_OLD
  time: Move update_vsyscall definitions to timekeeper_internal.h
  time: Move timekeeper structure to timekeeper_internal.h for vsyscall changes
  jiffies: Remove compile time assumptions about CLOCK_TICK_RATE
  jiffies: Kill unused TICK_USEC_TO_NSEC
  alarmtimer: Rename alarmtimer_remove to alarmtimer_dequeue
  alarmtimer: Remove unused helpers & defines
  alarmtimer: Use hrtimer per-alarm instead of per-base
  alarmtimer: Implement minimum alarm interval for allowing suspend
2012-10-12 22:17:48 +09:00