evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Shawn Bohrer	6bfa687c19	sched/rt: Remove redundant nr_cpus_allowed test In `76854c7e8f` ("sched: Use rt.nr_cpus_allowed to recover select_task_rq() cycles") an optimization was added to select_task_rq_rt() that immediately returns when p->nr_cpus_allowed == 1 at the beginning of the function. This makes the latter p->nr_cpus_allowed > 1 check redundant, which can now be removed. Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <mgalbraith@suse.de> Cc: tomk@rgmadvisors.com Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1380914693-24634-1-git-send-email-shawn.bohrer@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-06 11:28:40 +02:00
Linus Torvalds	7dee8dff47	Merge tag 'pm+acpi-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from Rafael Wysocki: - The resume part of user space driven hibernation (s2disk) is now broken after the change that moved the creation of memory bitmaps to after the freezing of tasks, because I forgot that the resume utility loaded the image before freezing tasks and needed the bitmaps for that. The fix adds special handling for that case. - One of recent commits changed the export of acpi_bus_get_device() to EXPORT_SYMBOL_GPL(), which was technically correct but broke existing binary modules using that function including one in particularly widespread use. Change it back to EXPORT_SYMBOL(). - The intel_pstate driver sometimes fails to disable turbo if its no_turbo sysfs attribute is set. Fix from Srinivas Pandruvada. - One of recent cpufreq fixes forgot to update a check in cpufreq-cpu0 which still (incorrectly) treats non-NULL as non-error. Fix from Philipp Zabel. - The SPEAr cpufreq driver uses a wrong variable type in one place preventing it from catching errors returned by one of the functions called by it. Fix from Sachin Kamat. * tag 'pm+acpi-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: Use EXPORT_SYMBOL() for acpi_bus_get_device() intel_pstate: fix no_turbo cpufreq: cpufreq-cpu0: NULL is a valid regulator, part 2 cpufreq: SPEAr: Fix incorrect variable type PM / hibernate: Fix user space driven resume regression	2013-10-04 15:03:42 -07:00
Andi Kleen	fdfbbd07e9	perf: Add generic transaction flags Add a generic qualifier for transaction events, as a new sample type that returns a flag word. This is particularly useful for qualifying aborts: to distinguish aborts which happen due to asynchronous events (like conflicts caused by another CPU) versus instructions that lead to an abort. The tuning strategies are very different for those cases, so it's important to distinguish them easily and early. Since it's inconvenient and inflexible to filter for this in the kernel we report all the events out and allow some post processing in user space. The flags are based on the Intel TSX events, but should be fairly generic and mostly applicable to other HTM architectures too. In addition to various flag words there's also reserved space to report a program-supplied abort code. For TSX this is used to distinguish specific classes of aborts, like a lock busy abort when doing lock elision. Flags: Elision and generic transactions (ELISION vs TRANSACTION) (HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION) Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC) Retryable transaction (RETRY) Conflicts with other threads (CONFLICT) Transaction write capacity overflow (CAPACITY WRITE) Transaction read capacity overflow (CAPACITY READ) Transactions implicitly aborted can also return an abort code. This can be used to signal specific events to the profiler. A common case is abort on lock busy in an RTM eliding library (code 0xff). To handle this case we include the TSX abort code. Common example aborts in TSX would be: - Data conflict with another thread on memory read. Flags: TRANSACTION|ASYNC|CONFLICT - Executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC - HLE transaction in user space is too large. Flags: ELISION|SYNC|CAPACITY-WRITE The only flag that is somewhat TSX specific is ELISION. This adds the perf core glue needed for reporting the new flag word out. v2: Add MEM/MISC v3: Move transaction to the end v4: Separate capacity-read/write and remove misc v5: Remove _SAMPLE. Move abort flags to 32bit. Rename transaction to txn Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:06:08 +02:00
Knut Petersen	723478c8a4	perf: Enforce 1 as lower limit for perf_event_max_sample_rate /proc/sys/kernel/perf_event_max_sample_rate will accept negative values as well as 0. Negative values are unreasonable, and 0 causes a divide by zero exception in perf_proc_update_handler. This patch enforces a lower limit of 1. Signed-off-by: Knut Petersen <Knut_Petersen@t-online.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5242DB0C.4070005@t-online.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:06:07 +02:00
Peter Zijlstra	9886167d20	perf: Fix perf_pmu_migrate_context While auditing the list_entry usage due to a trinity bug I found that perf_pmu_migrate_context violates the rules for perf_event::event_entry. The problem is that perf_event::event_entry is a RCU list element, and hence we must wait for a full RCU grace period before re-using the element after deletion. Therefore the usage in perf_pmu_migrate_context() which re-uses the entry immediately is broken. For now introduce another list_head into perf_event for this specific usage. This doesn't actually fix the trinity report because that never goes through this code. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-mkj72lxagw1z8fvjm648iznw@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 09:58:53 +02:00
Mike Travis	8daaa5f826	kdb: Add support for external NMI handler to call KGDB/KDB This patch adds a kgdb_nmicallin() interface that can be used by external NMI handlers to call the KGDB/KDB handler. The primary need for this is for those types of NMI interrupts where all the CPUs have already received the NMI signal. Therefore no send_IPI(NMI) is required, and in fact it will cause a 2nd unhandled NMI to occur. This generates the "Dazed and Confuzed" messages. Since all the CPUs are getting the NMI at roughly the same time, it's not guaranteed that the first CPU that hits the NMI handler will manage to enter KGDB and set the dbg_master_lock before the slaves start entering. The new argument "send_ready" was added for KGDB to signal the NMI handler to release the slave CPUs for entry into KGDB. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Jason Wessel <jason.wessel@windriver.com> Reviewed-by: Dimitri Sivanich <sivanich@sgi.com> Reviewed-by: Hedi Berriche <hedi@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/20131002151417.928886849@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-03 18:47:54 +02:00
Ingo Molnar	68e9074028	Merge branch 'clockevents/3.13' of git://git.linaro.org/people/dlezcano/linux into timers/core Pull (mostly) ARM clocksource driver updates from Daniel Lezcano: " - Soren Brinkmann added FEAT_PERCPU to a clock device when it is local per cpu. This feature prevents the clock framework from choosing a per cpu timer as a broadcast timer. This problem arose when the ARM global timer is used when switching to the broadcast timer which is the case now on Xilinx with its cpuidle driver. - Stephen Boyd extended the generic sched_clock code to support 64bit counters and removes the setup_sched_clock deprecation, as that causes lots of warnings since there's still users in the arch/arm tree. He also added the CLOCK_SOURCE_SUSPEND_NONSTOP flag on the architected timer as they continue counting during suspend. - Uwe Kleine-König added some missing __init sections and consolidated the code by moving the of_node_put call from the drivers to the function clocksource_of_init. " Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-03 07:57:02 +02:00
Ingo Molnar	19f29887a7	Merge branch 'timers/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/core Merge updated full dynticks support from Frederic Weisbecker: - support 32-bit systems (full dynticks was 64-bit only before) - support ARM Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-03 07:53:25 +02:00
Ingo Molnar	6c09f6d830	Merge tag 'v3.12-rc3' into timers/core Merge Linux 3.12-rc3 - refresh the tree with the latest fixes before merging new bits. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-03 07:52:21 +02:00
Soren Brinkmann	245a349626	tick: broadcast: Deny per-cpu clockevents from being broadcast sources On most ARM systems the per-cpu clockevents are truly per-cpu in the sense that they can't be controlled on any other CPU besides the CPU that they interrupt. If one of these clockevents were to become a broadcast source we will run into a lot of trouble because the broadcast source is enabled on the first CPU to go into deep idle (if that CPU suffers from FEAT_C3_STOP) and that could be a different CPU than what the clockevent is interrupting (or even worse the CPU that the clockevent interrupts could be offline). Theoretically it's possible to support per-cpu clockevents as the broadcast source but so far we haven't needed this and supporting it is rather complicated. Let's just deny the possibility for now until this becomes a reality (let's hope it never does!). Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Michal Simek <michal.simek@xilinx.com>	2013-10-02 11:34:06 +02:00
Ingo Molnar	0d119fb576	Merge branch 'irq/urgent-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into irq/urgent Pull a hardirq-nesting fix from Frederic Weisbecker. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-02 07:53:01 +02:00
Frederic Weisbecker	cc1f027454	irq: Optimize softirq stack selection in irq exit If irq_exit() is called on the arch's specified irq stack, it should be safe to run softirqs inline under that same irq stack as it is near empty by the time we call irq_exit(). For example if we use the same stack for both hard and soft irqs here, the worst case scenario is: hardirq -> softirq -> hardirq. But then the softirq supersedes the first hardirq as the stack user since irq_exit() is called in a mostly empty stack. So the stack merge in this case looks acceptable. Stack overruns still have a chance to happen if hardirqs have more opportunities to nest, but then it's another problem to solve. So let's adapt the irq exit's softirq stack on top of a new Kconfig symbol that can be defined when irq_exit() runs on the irq stack. That way we can spare some stack switch on irq processing and all the cache issues that come along. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-10-01 12:53:27 +02:00
Frederic Weisbecker	0bed698a33	irq: Justify the various softirq stack choices For clarity, comment the various stack choices for softirqs processing, whether we execute them from ksoftirqd or local_irq_enable() calls. Their use on irq_exit() is already commented. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-10-01 12:53:27 +02:00
Frederic Weisbecker	5d60d3e7c0	irq: Improve a bit softirq debugging do_softirq() has a debug check that verifies that it is not nesting on softirqs processing, nor miscounting the softirq part of the preempt count. But making sure that softirqs processing don't nest is actually a more generic concern that applies to any caller of __do_softirq(). Do take it one step further and generalize that debug check to any softirq processing. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-10-01 12:53:26 +02:00
Frederic Weisbecker	be6e101644	irq: Optimize call to softirq on hardirq exit Before processing softirqs on hardirq exit, we already do the check for pending softirqs while hardirqs are guaranteed to be disabled. So we can take a shortcut and safely jump to the arch specific implementation directly. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-10-01 12:53:25 +02:00
Frederic Weisbecker	7d65f4a655	irq: Consolidate do_softirq() arch overriden implementations All arch overriden implementations of do_softirq() share the following common code: disable irqs (to avoid races with the pending check), check if there are softirqs pending, then execute __do_softirq() on a specific stack. Consolidate the common parts such that archs only worry about the stack switch. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-10-01 12:53:25 +02:00
Frederic Weisbecker	ded7975475	irq: Force hardirq exit's softirq processing on its own stack The commit `facd8b80c6` ("irq: Sanitize invoke_softirq") converted irq exit calls of do_softirq() to __do_softirq() on all architectures, assuming it was only used there for its irq disablement properties. But as a side effect, the softirqs processed in the end of the hardirq are always called on the inline current stack that is used by irq_exit() instead of the softirq stack provided by the archs that override do_softirq(). The result is mostly safe if the architecture runs irq_exit() on a separate irq stack because then softirqs are processed on that same stack that is near empty at this stage (assuming hardirq aren't nesting). Otherwise irq_exit() runs in the task stack and so does the softirq too. The interrupted call stack can be randomly deep already and the softirq can dig through it even further. To add insult to the injury, this softirq can be interrupted by a new hardirq, maximizing the chances for a stack overrun as reported in powerpc for example: do_IRQ: stack overflow: 1920 CPU: 0 PID: 1602 Comm: qemu-system-ppc Not tainted 3.10.4-300.1.fc19.ppc64p7 #1 Call Trace: [c0000000050a8740] .show_stack+0x130/0x200 (unreliable) [c0000000050a8810] .dump_stack+0x28/0x3c [c0000000050a8880] .do_IRQ+0x2b8/0x2c0 [c0000000050a8930] hardware_interrupt_common+0x154/0x180 --- Exception: 501 at .cp_start_xmit+0x3a4/0x820 [8139cp] LR = .cp_start_xmit+0x390/0x820 [8139cp] [c0000000050a8d40] .dev_hard_start_xmit+0x394/0x640 [c0000000050a8e00] .sch_direct_xmit+0x110/0x260 [c0000000050a8ea0] .dev_queue_xmit+0x260/0x630 [c0000000050a8f40] .br_dev_queue_push_xmit+0xc4/0x130 [bridge] [c0000000050a8fc0] .br_dev_xmit+0x198/0x270 [bridge] [c0000000050a9070] .dev_hard_start_xmit+0x394/0x640 [c0000000050a9130] .dev_queue_xmit+0x428/0x630 [c0000000050a91d0] .ip_finish_output+0x2a4/0x550 [c0000000050a9290] .ip_local_out+0x50/0x70 [c0000000050a9310] .ip_queue_xmit+0x148/0x420 [c0000000050a93b0] 
.tcp_transmit_skb+0x4e4/0xaf0 [c0000000050a94a0] .__tcp_ack_snd_check+0x7c/0xf0 [c0000000050a9520] .tcp_rcv_established+0x1e8/0x930 [c0000000050a95f0] .tcp_v4_do_rcv+0x21c/0x570 [c0000000050a96c0] .tcp_v4_rcv+0x734/0x930 [c0000000050a97a0] .ip_local_deliver_finish+0x184/0x360 [c0000000050a9840] .ip_rcv_finish+0x148/0x400 [c0000000050a98d0] .__netif_receive_skb_core+0x4f8/0xb00 [c0000000050a99d0] .netif_receive_skb+0x44/0x110 [c0000000050a9a70] .br_handle_frame_finish+0x2bc/0x3f0 [bridge] [c0000000050a9b20] .br_nf_pre_routing_finish+0x2ac/0x420 [bridge] [c0000000050a9bd0] .br_nf_pre_routing+0x4dc/0x7d0 [bridge] [c0000000050a9c70] .nf_iterate+0x114/0x130 [c0000000050a9d30] .nf_hook_slow+0xb4/0x1e0 [c0000000050a9e00] .br_handle_frame+0x290/0x330 [bridge] [c0000000050a9ea0] .__netif_receive_skb_core+0x34c/0xb00 [c0000000050a9fa0] .netif_receive_skb+0x44/0x110 [c0000000050aa040] .napi_gro_receive+0xe8/0x120 [c0000000050aa0c0] .cp_rx_poll+0x31c/0x590 [8139cp] [c0000000050aa1d0] .net_rx_action+0x1dc/0x310 [c0000000050aa2b0] .__do_softirq+0x158/0x330 [c0000000050aa3b0] .irq_exit+0xc8/0x110 [c0000000050aa430] .do_IRQ+0xdc/0x2c0 [c0000000050aa4e0] hardware_interrupt_common+0x154/0x180 --- Exception: 501 at .bad_range+0x1c/0x110 LR = .get_page_from_freelist+0x908/0xbb0 [c0000000050aa7d0] .list_del+0x18/0x50 (unreliable) [c0000000050aa850] .get_page_from_freelist+0x908/0xbb0 [c0000000050aa9e0] .__alloc_pages_nodemask+0x21c/0xae0 [c0000000050aaba0] .alloc_pages_vma+0xd0/0x210 [c0000000050aac60] .handle_pte_fault+0x814/0xb70 [c0000000050aad50] .__get_user_pages+0x1a4/0x640 [c0000000050aae60] .get_user_pages_fast+0xec/0x160 [c0000000050aaf10] .__gfn_to_pfn_memslot+0x3b0/0x430 [kvm] [c0000000050aafd0] .kvmppc_gfn_to_pfn+0x64/0x130 [kvm] [c0000000050ab070] .kvmppc_mmu_map_page+0x94/0x530 [kvm] [c0000000050ab190] .kvmppc_handle_pagefault+0x174/0x610 [kvm] [c0000000050ab270] .kvmppc_handle_exit_pr+0x464/0x9b0 [kvm] [c0000000050ab320] kvm_start_lightweight+0x1ec/0x1fc [kvm] 
[c0000000050ab4f0] .kvmppc_vcpu_run_pr+0x168/0x3b0 [kvm] [c0000000050ab9c0] .kvmppc_vcpu_run+0xc8/0xf0 [kvm] [c0000000050aba50] .kvm_arch_vcpu_ioctl_run+0x5c/0x1a0 [kvm] [c0000000050abae0] .kvm_vcpu_ioctl+0x478/0x730 [kvm] [c0000000050abc90] .do_vfs_ioctl+0x4ec/0x7c0 [c0000000050abd80] .SyS_ioctl+0xd4/0xf0 [c0000000050abe30] syscall_exit+0x0/0x98 Since this is a regression, this patch proposes a minimalistic and low-risk solution by blindly forcing the hardirq exit processing of softirqs on the softirq stack. This way we should reduce significantly the opportunities for task stack overflow dug by softirqs. Longer term solutions may involve extending the hardirq stack coverage to irq_exit(), etc... Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: #3.9.. <stable@vger.kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@au1.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-10-01 12:39:08 +02:00
Borislav Petkov	a17bce4d1d	x86/boot: Further compress CPUs bootup message Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 [ 0.603005] .... node #1, CPUs: #8 #9 #10 #11 #12 #13 #14 #15 [ 1.200005] .... node #2, CPUs: #16 #17 #18 #19 #20 #21 #22 #23 [ 1.796005] .... node #3, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 [ 2.393005] .... node #4, CPUs: #32 #33 #34 #35 #36 #37 #38 #39 [ 2.996005] .... node #5, CPUs: #40 #41 #42 #43 #44 #45 #46 #47 [ 3.600005] .... node #6, CPUs: #48 #49 #50 #51 #52 #53 #54 #55 [ 4.202005] .... node #7, CPUs: #56 #57 #58 #59 #60 #61 #62 #63 [ 4.811005] .... node #8, CPUs: #64 #65 #66 #67 #68 #69 #70 #71 [ 5.421006] .... node #9, CPUs: #72 #73 #74 #75 #76 #77 #78 #79 [ 6.032005] .... node #10, CPUs: #80 #81 #82 #83 #84 #85 #86 #87 [ 6.648006] .... node #11, CPUs: #88 #89 #90 #91 #92 #93 #94 #95 [ 7.262005] .... node #12, CPUs: #96 #97 #98 #99 #100 #101 #102 #103 [ 7.865005] .... node #13, CPUs: #104 #105 #106 #107 #108 #109 #110 #111 [ 8.466005] .... node #14, CPUs: #112 #113 #114 #115 #116 #117 #118 #119 [ 9.073006] .... node #15, CPUs: #120 #121 #122 #123 #124 #125 #126 #127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: huawei.libin@huawei.com Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-01 10:52:30 +02:00
Oleg Nesterov	314a8ad0f1	pidns: fix free_pid() to handle the first fork failure "case 0" in free_pid() assumes that disable_pid_allocation() should clear PIDNS_HASH_ADDING before the last pid goes away. However this doesn't happen if the first fork() fails to create the child reaper which should call disable_pid_allocation(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-30 14:31:03 -07:00
Tetsuo Handa	4c1c7be95c	kernel/kmod.c: check for NULL in call_usermodehelper_exec() If /proc/sys/kernel/core_pattern contains only "\|", a NULL pointer dereference happens upon core dump because argv_split("") returns argv[0] == NULL. This bug was once fixed by commit `264b83c07a` ("usermodehelper: check subprocess_info->path != NULL") but was by error reintroduced by commit `7f57cfa4e2` ("usermodehelper: kill the sub_info->path[0] check"). This bug seems to exist since 2.6.19 (the version which core dump to pipe was added). Depending on kernel version and config, some side effect might happen immediately after this oops (e.g. kernel panic with 2.6.32-358.18.1.el6). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-30 14:31:02 -07:00
Rafael J. Wysocki	aab1728915	PM / hibernate: Fix user space driven resume regression Recent commit `8fd37a4` (PM / hibernate: Create memory bitmaps after freezing user space) broke the resume part of the user space driven hibernation (s2disk), because I forgot that the resume utility loaded the image into memory without freezing user space (it still freezes tasks after loading the image). This means that during user space driven resume we need to create the memory bitmaps at the "device open" time rather than at the "freeze tasks" time, so make that happen (that's a special case anyway, so it needs to be treated in a special way). Reported-and-tested-by: Ronald <ronald645@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-30 19:40:56 +02:00
Kevin Hilman	ff3fb25412	nohz: Drop generic vtime obsolete dependency on CONFIG_64BIT The CONFIG_64BIT requirement on vtime can finally be removed since we now depend on HAVE_VIRT_CPU_ACCOUNTING_GEN which already takes care of the arch ability to handle nsecs based cputime_t safely. Signed-off-by: Kevin Hilman <khilman@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Arm Linux <linux-arm-kernel@lists.infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2013-09-30 15:37:01 +02:00
Kevin Hilman	554b0004d0	vtime: Add HAVE_VIRT_CPU_ACCOUNTING_GEN Kconfig With VIRT_CPU_ACCOUNTING_GEN, cputime_t becomes 64-bit. In order to use that feature, arch code should be audited to ensure there are no races in concurrent read/write of cputime_t. For example, reading/writing 64-bit cputime_t on some 32-bit arches may require multiple accesses for low and high value parts, so proper locking is needed to protect against concurrent accesses. Therefore, add CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN which arches can enable after they've been audited for potential races. This option is automatically enabled on 64-bit platforms. Feature requested by Frederic Weisbecker. Signed-off-by: Kevin Hilman <khilman@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Arm Linux <linux-arm-kernel@lists.infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2013-09-30 15:35:53 +02:00
Greg Kroah-Hartman	88502b9c0a	Merge 3.12-rc3 into driver-core-next We want the driver core and sysfs fixes in here to make merges and development easier. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-29 18:29:23 -07:00
Linus Torvalds	669fc2f0c7	Merge branches 'sched-urgent-for-linus', 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler, timer and x86 fixes from Ingo Molnar: - A context tracking ARM build and functional fix - A handful of ARM clocksource/clockevent driver fixes - An AMD microcode patch level sysfs reporting fixlet * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm: Fix build error with context tracking calls * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast clocksource: of: Respect device tree node status clocksource: exynos_mct: Set IRQ affinity when the CPU goes online arm: clocksource: mvebu: Use the main timer as clock source from DT * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Fix patch level reporting for family 15h	2013-09-28 14:22:17 -07:00
Jean Delvare	3a126f85e0	kernel/params: fix handling of signed integer types Commit `6072ddc852` ("kernel: replace strict_strto() with kstrto()") broke the handling of signed integer types, fix it. Signed-off-by: Jean Delvare <khali@linux-fr.org> Reported-by: Christian Kujau <lists@nerdbynature.de> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-28 12:35:52 -07:00
Ingo Molnar	62d08aec6a	Merge branch 'context_tracking/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/urgent Pull context tracking ARM fix from Frederic Weisbecker. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-28 08:50:09 +02:00
Frederic Weisbecker	0c06a5d4b1	arm: Fix build error with context tracking calls `ad65782fba` (context_tracking: Optimize main APIs off case with static key) converted context tracking main APIs to inline function and left ARM asm callers behind. This can be easily fixed by making ARM calling the post static keys context tracking function. We just need to replicate the static key checks there. We'll remove these later when ARM will support the context tracking static keys. Reported-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Kevin Hilman <khilman@linaro.org> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Anil Kumar <anilk4.v@gmail.com> Cc: Tony Lindgren <tony@atomide.com> Cc: Benoit Cousson <b-cousson@ti.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Kevin Hilman <khilman@linaro.org>	2013-09-27 17:59:47 +02:00
Greg Kroah-Hartman	90826ca740	pmu_bus: convert bus code to use dev_groups The dev_attrs field of struct bus_type is going away soon, dev_groups should be used instead. This converts the pmu bus code to use the correct field. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-09-26 15:49:43 -07:00
Linus Torvalds	82dfaa58a7	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three small fixes" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/balancing: Fix cfs_rq->task_h_load calculation sched/balancing: Fix 'local->avg_load > busiest->avg_load' case in fix_small_imbalance() sched/balancing: Fix 'local->avg_load > sds->avg_load' case in calculate_imbalance()	2013-09-25 13:28:45 -07:00
Linus Torvalds	bdc5663fa1	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Assorted standalone fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Add model number for Avoton Silvermont perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page' perf/x86/intel/uncore: Don't use smp_processor_id() in validate_group() perf: Update ABI comment tools lib lk: Uninclude linux/magic.h in debugfs.c perf tools: Fix old GCC build error in trace-event-parse.c:parse_proc_kallsyms() perf probe: Fix finder to find lines of given function perf session: Check for SIGINT in more loops perf tools: Fix compile with libelf without get_phdrnum perf tools: Fix buildid cache handling of kallsyms with kcore perf annotate: Fix objdump line parsing offset validation perf tools: Fill in new definitions for madvise()/mmap() flags perf tools: Sharpen the libaudit dependencies test	2013-09-25 13:28:08 -07:00
Paul E. McKenney	5c173eb8bc	rcu: Consistent rcu_is_watching() naming The old rcu_is_cpu_idle() function is just __rcu_is_watching() with preemption disabled. This commit therefore renames rcu_is_cpu_idle() to rcu_is_watching(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:45:06 -07:00
Paul E. McKenney	f9ffc31ebd	rcu: Change EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL() Commit `e6b80a3b` (rcu: Detect illegal rcu dereference in extended quiescent state) exported the pre-existing rcu_is_cpu_idle() function using EXPORT_SYMBOL(). However, this is inconsistent with the remaining exports from RCU, which are all EXPORT_SYMBOL_GPL(). The current state of affairs means that a non-GPL module could use rcu_is_cpu_idle(), but in a CONFIG_TREE_PREEMPT_RCU=y kernel would be unable to invoke rcu_read_lock() and rcu_read_unlock(). This commit therefore makes rcu_is_cpu_idle()'s export be consistent with the rest of RCU, namely EXPORT_SYMBOL_GPL(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:44:56 -07:00
Paul E. McKenney	cc6783f788	rcu: Is it safe to enter an RCU read-side critical section? There is currently no way for kernel code to determine whether it is safe to enter an RCU read-side critical section, in other words, whether or not RCU is paying attention to the currently running CPU. Given the large and increasing quantity of code shared by the idle loop and non-idle code, this shortcoming is becoming increasingly painful. This commit therefore adds __rcu_is_watching(), which returns true if it is safe to enter an RCU read-side critical section on the currently running CPU. This function is quite fast, using only a __this_cpu_read(). However, the caller must disable preemption. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:44:41 -07:00
Paul E. McKenney	c337f8f58e	rcu: Throttle invoke_rcu_core() invocations due to non-lazy callbacks If a non-lazy callback arrives on a CPU that has previously gone idle with no non-lazy callbacks, invoke_rcu_core() forces the RCU core to run. However, it does not update the conditions, which could result in several closely spaced invocations of the RCU core, which in turn could result in an excessively high context-switch rate and resulting high overhead. This commit therefore updates the ->all_lazy and ->nonlazy_posted_snap fields to prevent closely spaced invocations. Reported-by: Tibor Billes <tbilles@gmx.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tibor Billes <tbilles@gmx.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:44:33 -07:00
Paul E. McKenney	c229828ca6	rcu: Throttle rcu_try_advance_all_cbs() execution The rcu_try_advance_all_cbs() function is invoked on each attempted entry to and every exit from idle. If this function determines that there are callbacks ready to invoke, the caller will invoke the RCU core, which in turn will result in a pair of context switches. If a CPU enters and exits idle extremely frequently, this can result in an excessive number of context switches and high CPU overhead. This commit therefore causes rcu_try_advance_all_cbs() to throttle itself, refusing to do work more than once per jiffy. Reported-by: Tibor Billes <tbilles@gmx.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tibor Billes <tbilles@gmx.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:44:25 -07:00
Paul E. McKenney	7a497c963e	rcu: Remove redundant code from rcu_cleanup_after_idle() The rcu_try_advance_all_cbs() function returns a bool saying whether or not there are callbacks ready to invoke, but rcu_cleanup_after_idle() rechecks this regardless. This commit therefore uses the value returned by rcu_try_advance_all_cbs() instead of making rcu_cleanup_after_idle() do this recheck. Reported-by: Tibor Billes <tbilles@gmx.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tibor Billes <tbilles@gmx.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:44:03 -07:00
Peter Zijlstra	a233f1120c	sched: Prepare for per-cpu preempt_count When using per-cpu preempt_count variables we need to save/restore the preempt_count on context switch (into per task storage; for instance the old thread_info::preempt_count variable) because of PREEMPT_ACTIVE. However, this means that on fork() the preempt_count value of the last context switch gets copied and if we had a PREEMPT_ACTIVE switch right before cloning a child task the child task will now too have PREEMPT_ACTIVE set and start its life with an extra PREEMPT_ACTIVE count. Therefore we need to make init_task_preempt_count() unconditional; this resets whatever preempt_count we inherited from our parent process. Doing so for !per-cpu implementations is harmless. For !PREEMPT_COUNT kernels we need to be careful not to start life with an increased preempt_count. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-4k0b7oy1rcdyzochwiixuwi9@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:55 +02:00
Peter Zijlstra	bdb4380658	sched: Extract the basic add/sub preempt_count modifiers Rewrite the preempt_count macros in order to extract the 3 basic preempt_count value modifiers: __preempt_count_add() __preempt_count_sub() and the new: __preempt_count_dec_and_test() And since we're at it anyway, replace the unconventional $op_preempt_count names with the more conventional preempt_count_$op. Since these basic operators are equivalent to the previous _notrace() variants, do away with the _notrace() versions. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-ewbpdbupy9xpsjhg960zwbv8@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:54 +02:00
Peter Zijlstra	0102874755	sched: Create more preempt_count accessors We need a few special preempt_count accessors: - task_preempt_count() for when we're interested in the preemption count of another (non-running) task. - init_task_preempt_count() for properly initializing the preemption count. - init_idle_preempt_count() a special case of the above for the idle threads. With these no generic code ever touches thread_info::preempt_count anymore and architectures could choose to remove it. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-jf5swrio8l78j37d06fzmo4r@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:52 +02:00
Peter Zijlstra	f27dde8dee	sched: Add NEED_RESCHED to the preempt_count In order to combine the preemption and need_resched test we need to fold the need_resched information into the preempt_count value. Since the NEED_RESCHED flag is set across CPUs this needs to be an atomic operation, however we very much want to avoid making preempt_count atomic, therefore we keep the existing TIF_NEED_RESCHED infrastructure in place but at 3 sites test it and fold its value into preempt_count; namely: - resched_task() when setting TIF_NEED_RESCHED on the current task - scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a remote task it follows it up with a reschedule IPI and we can modify the cpu local preempt_count from there. - cpu_idle_loop() for when resched_task() found tsk_is_polling(). We use an inverted bitmask to indicate need_resched so that a 0 means both need_resched and !atomic. Also remove the barrier() in preempt_enable() between preempt_enable_no_resched() and preempt_check_resched() to avoid having to reload the preemption value and allow the compiler to use the flags of the previous decrement. I couldn't come up with any sane reason for this barrier() to be there as preempt_enable_no_resched() already has a barrier() before doing the decrement. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:49 +02:00
Peter Zijlstra	4a2b4b2227	sched: Introduce preempt_count accessor functions Replace the single preempt_count() 'function' that's an lvalue with two proper functions: preempt_count() - returns the preempt_count value as rvalue preempt_count_set() - Allows setting the preempt-count value Also provide preempt_count_ptr() as a convenience wrapper to implement all modifying operations. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-orxrbycjozopqfhb4dxdkdvb@git.kernel.org [ Fixed build failure. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:32 +02:00
Peter Zijlstra	ea81174789	sched, idle: Fix the idle polling state logic Mike reported that commit `7d1a9417` ("x86: Use generic idle loop") regressed several workloads and caused excessive reschedule interrupts. The patch in question failed to notice that the x86 code had an inverted sense of the polling state versus the new generic code (x86: default polling, generic: default !polling). Fix the two prominent x86 mwait based idle drivers and introduce a few new generic polling helpers (fixing the wrong smp_mb__after_clear_bit usage). Also switch the idle routines to using tif_need_resched() which is an immediate TIF_NEED_RESCHED test as opposed to need_resched which will end up being slightly different. Reported-by: Mike Galbraith <bitbucket@online.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: lenb@kernel.org Cc: tglx@linutronix.de Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:10 +02:00
Peter Zijlstra	b021fe3e25	sched, rcu: Make RCU use resched_cpu() We're going to deprecate and remove set_need_resched() for it will do the wrong thing. Make an exception for RCU and allow it to use resched_cpu() which will do the right thing. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-2eywnacjl1nllctl1nszqa5w@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:08 +02:00
Michael S. Tsirkin	4314895165	sched: Micro-optimize by dropping unnecessary task_rq() calls We always know the rq used, let's just pass it around. This seems to cut the size of scheduler core down a tiny bit: Before: [linux]$ size kernel/sched/core.o.orig text data bss dec hex filename 62760 16130 3876 82766 1434e kernel/sched/core.o.orig After: [linux]$ size kernel/sched/core.o.patched text data bss dec hex filename 62566 16130 3876 82572 1428c kernel/sched/core.o.patched Probably speeds it up as well. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130922142054.GA11499@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:51:06 +02:00
Chuansheng Liu	e2f0b88e84	kernel/reboot.c: re-enable the function of variable reboot_default Commit `1b3a5d02ee` ("reboot: move arch/x86 reboot= handling to generic kernel") did some cleanup for reboot= command line, but it made the reboot_default inoperative. The default value of variable reboot_default should be 1, and if command line reboot= is not set, system will use the default reboot mode. [akpm@linux-foundation.org: fix comment layout] Signed-off-by: Li Fei <fei.li@intel.com> Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> Acked-by: Robin Holt <robinmholt@linux.com> Cc: <stable@vger.kernel.org> [3.11.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Konstantin Khlebnikov	8ac1c8d5de	audit: fix endless wait in audit_log_start() After commit `829199197a` ("kernel/audit.c: avoid negative sleep durations") audit emitters will block forever if userspace daemon cannot handle backlog. After the timeout the waiting loop turns into busy loop and runs until daemon dies or returns back to work. This is a minimal patch for that bug. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Richard Guy Briggs <rgb@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Chuck Anderson <chuck.anderson@oracle.com> Cc: Dan Duval <dan.duval@oracle.com> Cc: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Michal Hocko	9809b18fcf	watchdog: update watchdog_thresh properly watchdog_thresh controls how often nmi perf event counter checks per-cpu hrtimer_interrupts counter and blows up if the counter hasn't changed since the last check. The counter is updated by per-cpu watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh period which guarantees that hrtimer is scheduled 2 times per the main period. Both hrtimer and perf event are started together when the watchdog is enabled. So far so good. But... But what happens when watchdog_thresh is updated from sysctl handler? proc_dowatchdog will set a new sampling period and hrtimer callback (watchdog_timer_fn) will use the new value in the next round. The problem, however, is that nobody tells the perf event that the sampling period has changed so it is ticking with the period configured when it has been set up. This might result in an ear ripping dissonance between perf and hrtimer parts if the watchdog_thresh is increased. And even worse it might lead to KABOOM if the watchdog is configured to panic on such a spurious lockup. This patch fixes the issue by updating both nmi perf event counter and hrtimers if the threshold value has changed. The nmi one is disabled and then reinitialized from scratch. This has an unpleasant side effect that the allocation of the new event might fail theoretically so the hard lockup detector would be disabled for such cpus. On the other hand such a memory allocation failure is very unlikely because the original event is deallocated right before. It would be much nicer if we just changed perf event period but there doesn't seem to be any API to do that right now. It is also unfortunate that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we cannot use on_each_cpu() and do the same thing from the per-cpu context.
The update from the current CPU should be safe because perf_event_disable removes the event atomically before it clears the per-cpu watchdog_ev so it cannot change anything under running handler feet. The hrtimer is simply restarted (thanks to Don Zickus who has pointed this out) if it is queued because we cannot rely on it to fire and adapt to the new sampling period before a new nmi event triggers (when the threshold is decreased). [akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place] Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Fabio Estevam <festevam@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Michal Hocko	359e6fab66	watchdog: update watchdog attributes atomically proc_dowatchdog doesn't synchronize multiple callers which might lead to confusion when two parallel callers might confuse watchdog_enable_all_cpus resp watchdog_disable_all_cpus (eg watchdog gets enabled even if watchdog_thresh was set to 0 already). This patch adds a local mutex which synchronizes callers to the sysctl handler. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Li Zefan	2ff2a7d03b	cgroup: kill css_id The only user of css_id was memcg, and it has been converted to use cgroup->id, so kill css_id. Signed-off-by: Li Zefan <lizefan@huwei.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-09-23 21:44:16 -04:00
