evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Pavankumar Kondeti	a68e39b7fd	sched: break the forever prev_cpu selection preference The select_best_cpu() algorithm selects the previous CPU as the target CPU if the task did not sleep for more than 2 msec (controlled by /proc/sys/kernel/sched_select_prev_cpu_us). The complete CPU search is not done for a long time for tasks which sleep for a short duration in between the long execution slices. Enforce a 100 msec threshold since the last selection time to run the complete algorithm. CRs-Fixed: 984463 Change-Id: I329eecc6bae8f130cd5598f6cee8ca5a01391cca [joonwoop@codeaurora.org: fixed conflict in bias_to_prev_cpu() and sched.h where CONFIG_SCHED_QHMP used to be.] Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>	2016-06-23 14:03:24 -07:00
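The two thresholds in the message above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name, the argument names, and the exact boundary comparisons are hypothetical and are not taken from the kernel source; only the 2 msec and 100 msec values come from the commit message.

```python
# Hypothetical sketch of the prev_cpu bias decision; not the kernel's code.
SELECT_PREV_CPU_US = 2_000         # 2 msec short-sleep threshold from the message
FULL_SEARCH_INTERVAL_US = 100_000  # 100 msec cap from the message


def should_bias_to_prev_cpu(sleep_us, since_last_full_search_us):
    """Return True when the previous CPU may be reused without a full search."""
    # Assumed boundary: once 100 msec have passed since the last full
    # selection, always run the complete algorithm again.
    if since_last_full_search_us >= FULL_SEARCH_INTERVAL_US:
        return False
    # Otherwise keep the previous CPU only for tasks that slept briefly.
    return sleep_us <= SELECT_PREV_CPU_US
```

A task that keeps sleeping for 0.5 msec is biased to its previous CPU only until 100 msec have elapsed since the last complete search.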
Vikram Mulukutla	3026cbf1d0	sched: core: Fix possible hotplug race in set_cpus_allowed_ptr Since a CPU may go offline after cpu_active_mask is used to query active CPUs, set_cpus_allowed_ptr might inadvertently pass an invalid cpu number to move_queued_task. Fix this by ensuring that the cpumask op that uses cpu_active_mask checks the return value. CRs-Fixed: 1029014 Change-Id: Id43a629b40b72cc47773e4027d30953b3a94058d Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>	2016-06-22 14:44:57 -07:00
David Keitel	ea2143f756	sysctl: add cold_boot sysctl entry Add a cold_boot parameter which supplements the boot_reason sysctl entry with information about whether the system was booted from cold or warm state. /proc/sys/kernel/cold_boot entry is updated with 1 or 0 when system was booted from cold or warm boot state respectively. CRs-Fixed: 461256 Change-Id: I2bc5d80c8f26eb9e9dbb4b34960d991a51a224e4 Signed-off-by: David Keitel <dkeitel@codeaurora.org> [abhimany: fixup minor merge conflict and drop changes to kernel/sysctl.c and Documentation since it was brought in via snapshot commit] Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>	2016-06-21 15:13:20 -07:00
Rick Adams	44ed42824b	msm: falcon: put reason for boot in procfs from SMEM During board initialization read the shared memory item SMEM_POWER_ON_STATUS_INFO and place it in the procfs at /proc/sys/kernel/boot_reason The data item is an integer with a bit being set to identify the reason the device was powered on. The values of this data item is defined in the document Document/arm/msm/boot.txt, the following is the data in the documentation file. power_on_status values set by the PMIC for power on event: ---------------------------------------------------------- 0x01 -- keyboard power on 0x02 -- RTC alarm 0x04 -- cable power on 0x08 -- SMPL 0x10 -- Watch Dog timeout 0x20 -- USB charger 0x40 -- Wall charger 0xFF -- error reading power_on_status value This is cherrypicked from commit <372d39f87b0da75> ("put reason for boot in procfs") of 3.18 tree. Change-Id: I59e665f92e6e29f7dfef4380314f676a2d92c94b Signed-off-by: Rick Adams <rgadams@codeaurora.org> [abhimany: fix up minor merge conflicts] Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>	2016-06-21 15:13:09 -07:00
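The power_on_status bit values listed in the message above can be decoded with a small lookup. This is a minimal sketch: the bit values and their descriptions are copied from the commit message, while the function and constant names are hypothetical.

```python
# Bit values as listed in the commit message; names are illustrative only.
POWER_ON_REASONS = {
    0x01: "keyboard power on",
    0x02: "RTC alarm",
    0x04: "cable power on",
    0x08: "SMPL",
    0x10: "Watch Dog timeout",
    0x20: "USB charger",
    0x40: "Wall charger",
}
ERROR_VALUE = 0xFF  # "error reading power_on_status value"


def decode_boot_reason(value):
    """Return the list of power-on reasons whose bits are set in value."""
    if value == ERROR_VALUE:
        return ["error reading power_on_status value"]
    return [name for bit, name in POWER_ON_REASONS.items() if value & bit]
```

On a device that exposes the entry, the integer read from /proc/sys/kernel/boot_reason could be passed to `decode_boot_reason()`.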
Joonwoo Park	c876c09f58	sched: kill unnecessary divisions on fast path The max_possible_efficiency and CPU's efficiency are fixed values which are determined at cluster allocation time. Avoid division on the fast path by using a precomputed scale factor. Also update_cpu_busy_time() doesn't need to know how many full windows have elapsed. Thus replace the unneeded division with a simple comparison. Change-Id: I2be1aad3fb9b895e4f0917d05bd8eade985bbccf Suggested-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-21 15:11:21 -07:00
Joonwoo Park	47c31979a1	sched: prevent race where update CPU cycles Updating cycle counter should be serialized by holding rq lock. Add missing rq lock hold when cycle counter is updated by irq entry point. Change-Id: I92cf75d047a45ebf15a6ddeeecf8fc3823f96e5d Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-21 15:11:07 -07:00
Joonwoo Park	14ac5ed8b8	sched: fix overflow in scaled execution time calculation Task execution time in nanoseconds and CPU cycle counters are large enough to cause overflow when we multiply both. Avoid overflow by calculating frequency separately. Change-Id: I076d9ecd27cb1c1f11578f009ebe1a19c1619454 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-21 15:10:56 -07:00
Joonwoo Park	c07e88c80f	sched: remove unused parameter cpu from cpu_cycles_to_freq() The function parameter cpu isn't used anymore by cpu_cycles_to_freq(). So remove it. Change-Id: Ide19321206dacb88fedca97e1b689d740f872866 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-21 15:10:22 -07:00
Alex Shi	9b0440e3b2	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android	2016-06-21 11:22:43 +08:00
Alex Shi	46b4dd0c25	Merge branch 'v4.4/topic/coresight' into linux-linaro-lsk-v4.4	2016-06-21 11:14:16 +08:00
Mathieu Poirier	e5fd3d6e84	perf: passing struct perf_event to function setup_aux() Some information, like driver specific configuration, is found in the perf event structure. As such pass a 'struct perf_event' to function setup_aux() rather than just the CPU number so that individual drivers can make the right configuration when setting up a session. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2016-06-20 11:09:46 -06:00
Mathieu Poirier	1efb79086e	perf/core: adding PMU driver specific configuration It is entirely possible that some PMUs need specific configuration that is currently not found in the perf options before a session can be setup. It is the case for the CoreSight PMU where a sink needs to be provided. That sink doesn't fall in any of the current perf options. As such this patch adds the capability to receive driver specific configuration using the existing ioctl() mechanism. Once the configuration has been pushed down the kernel PMU callbacks are used to deal with the information sent from user space. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2016-06-20 11:09:45 -06:00
Jeff Vander Stoep	934f4983c7	FROMLIST: security,perf: Allow further restriction of perf_event_open When kernel.perf_event_open is set to 3 (or greater), disallow all access to performance events by users without CAP_SYS_ADMIN. Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that makes this value the default. This is based on a similar feature in grsecurity (CONFIG_GRKERNSEC_PERF_HARDEN). This version doesn't include making the variable read-only. It also allows enabling further restriction at run-time regardless of whether the default is changed. https://lkml.org/lkml/2016/1/11/587 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Bug: 29054680 Change-Id: Iff5bff4fc1042e85866df9faa01bce8d04335ab8	2016-06-16 13:44:10 +05:30
Subash Abhinov Kasiviswanathan	ea60e2fbe4	Revert "kernel/sysctl.c: detect overflows when converting to int" We have scripts which write to certain fields on 3.18 kernels but this seems to be failing on 4.4 kernels. An entry which we write to here is xfrm_aevent_rseqth which is u32. echo 4294967295 > /proc/sys/net/core/xfrm_aevent_rseqth Commit `230633d109` ("kernel/sysctl.c: detect overflows when converting to int") prevented writing to sysctl entries when integer overflow occurs. However, this does not apply to unsigned integers. u32 should be able to hold 4294967295 here, however it fails due to this check. static int do_proc_dointvec_conv(bool *negp, unsigned long *lvalp, if (*lvalp > (unsigned long) INT_MAX) return -EINVAL; Fix this for now by reverting this commit till a solution is finalized upstream. CRs-Fixed: 1026507 Change-Id: I4fae5f442e4cc2c2414a69e960d42c05c3062415 Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>	2016-06-15 16:16:33 -07:00
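The failure described above comes down to a range check against INT_MAX being applied to a value that is meant to be a u32. A minimal model of that check follows. `conv_with_int_check` mirrors the quoted condition. `conv_as_u32` is a hypothetical alternative shown only for contrast and is not what this commit does, since the commit simply reverts the check.

```python
INT_MAX = 2**31 - 1
UINT_MAX = 2**32 - 1
EINVAL = 22


def conv_with_int_check(value):
    """Model of the quoted check: reject anything above INT_MAX."""
    if value > INT_MAX:
        return -EINVAL
    return value


def conv_as_u32(value):
    """Hypothetical alternative (an assumption): range-check against u32."""
    if value > UINT_MAX:
        return -EINVAL
    return value
```

Writing 4294967295, the largest u32, is rejected by the first function and accepted by the second.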
Alex Shi	9ad8208bd7	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android	2016-06-14 17:08:03 +08:00
Alex Shi	c66b2190a1	Merge tag 'v4.4.13' into linux-linaro-lsk-v4.4 This is the 4.4.13 stable release	2016-06-14 17:07:59 +08:00
Joonwoo Park	8c8a1a12e8	sched: avoid potential race between governor and thermal driver It's possible thermal driver and governor notify that fmax is being changed at the same time. In such case we can potentially skip updating of CPU's capacity. Fix this by updating capacity always when limited fmax is changed by same entity. Meanwhile serialize sched_update_cpu_freq_min_max() with spinlock since this function can be called by multiple drivers at the same time. Change-Id: I3608cb09c30797bf858f434579fd07555546fb60 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-09 15:09:45 -07:00
Joonwoo Park	96818d6f1d	sched: fix potential deflated frequency estimation during IRQ handling Time between mark_start of idle task and IRQ handler entry time is CPU cycle counter stall period. Therefore it's inappropriate to include such duration as part of sample period when we do frequency estimation. Fix such suboptimality by replenishing idle task's CPU cycle counter upon IRQ entry and using irqtime as time delta. Change-Id: I274d5047a50565cfaaa2fb821ece21c8cf4c991d Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-09 15:08:01 -07:00
Joonwoo Park	6e8c9ac98d	sched: fix CPU frequency estimation while idle CPU cycle counter won't increase when CPU or cluster is idle depending on hardware. Thus using cycle counter in that period of time can result in incorrect CPU frequency estimation. Use previously calculated CPU frequency when CPU was idle. Change-Id: I732b50c974a73c08038995900e008b4e16e9437b Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-09 15:07:48 -07:00
Joonwoo Park	54c0b0001b	sched: preserve CPU cycle counter in rq Preserve cycle counter in rq in preparation for wait time accounting while CPU idle fix. Change-Id: I469263c90e12f39bb36bde5ed26298b7c1c77597 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-09 15:07:35 -07:00
Willy Tarreau	fa6d0ba12a	pipe: limit the per-user amount of pages allocated in pipes commit 759c01142a5d0f364a462346168a56de28a80f52 upstream. On not-so-small systems, it is possible for a single process to cause an OOM condition by filling large pipes with data that are never read. A typical process filling 4000 pipes with 1 MB of data will use 4 GB of memory. On small systems it may be tricky to set the pipe max size to prevent this from happening. This patch makes it possible to enforce a per-user soft limit above which new pipes will be limited to a single page, effectively limiting them to 4 kB each, as well as a hard limit above which no new pipes may be created for this user. This has the effect of protecting the system against memory abuse without hurting other users, and still allowing pipes to work correctly though with less data at once. The limits are controlled by two new sysctls: pipe-user-pages-soft, and pipe-user-pages-hard. Both may be disabled by setting them to zero. The default soft limit allows the default number of FDs per process (1024) to create pipes of the default size (64kB), thus reaching a limit of 64MB before starting to create only smaller pipes. With 256 processes limited to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB = 1084 MB of memory allocated for a user. The hard limit is disabled by default to avoid breaking existing applications that make intensive use of pipes (eg: for splicing). Reported-by: socketpair@gmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Mitigates: CVE-2013-4312 (Linux 2.0+) Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Moritz Muehlenhoff <moritz@wikimedia.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-07 18:14:35 -07:00
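The 1084 MB figure in the message above can be checked with a short calculation: the first 1024 pipes get the default 64 kB size, and every further pipe gets a single 4 kB page. The sketch below follows the message's own simplification that each file descriptor counts as one pipe. The function and parameter names are illustrative.

```python
PAGE_KB = 4            # one page per pipe once the soft limit is exceeded
DEFAULT_PIPE_KB = 64   # default pipe size
SOFT_LIMIT_PIPES = 1024  # default soft limit: 1024 pipes of the default size


def worst_case_kb(processes, fds_per_process):
    """Memory (kB) one user can pin in pipes with only the soft limit active."""
    total_pipes = processes * fds_per_process
    full_size = min(total_pipes, SOFT_LIMIT_PIPES)
    single_page = total_pipes - full_size
    return full_size * DEFAULT_PIPE_KB + single_page * PAGE_KB
```

For 256 processes with 1024 descriptors each, `worst_case_kb(256, 1024)` is 1110016 kB, which is 1084 MB.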
Oleg Nesterov	0eea2e24fc	wait/ptrace: assume __WALL if the child is traced commit bf959931ddb88c4e4366e96dd22e68fa0db9527c upstream. The following program (simplified version of generated by syzkaller) #include <pthread.h> #include <unistd.h> #include <sys/ptrace.h> #include <stdio.h> #include <signal.h> void *thread_func(void *arg) { ptrace(PTRACE_TRACEME, 0,0,0); return 0; } int main(void) { pthread_t thread; if (fork()) return 0; while (getppid() != 1) ; pthread_create(&thread, NULL, thread_func, NULL); pthread_join(thread, NULL); return 0; } creates an unreapable zombie if /sbin/init doesn't use __WALL. This is not a kernel bug, at least in a sense that everything works as expected: debugger should reap a traced sub-thread before it can reap the leader, but without __WALL/__WCLONE do_wait() ignores sub-threads. Unfortunately, it seems that /sbin/init in most (all?) distributions doesn't use it and we have to change the kernel to avoid the problem. Note also that most init's use sys_waitid() which doesn't allow __WALL, so the necessary user-space fix is not that trivial. This patch just adds the "ptrace" check into eligible_child(). To some degree this matches the "tsk->ptrace" in exit_notify(), ->exit_signal is mostly ignored when the tracee reports to debugger. Or WSTOPPED, the tracer doesn't need to set this flag to wait for the stopped tracee. This obviously means the user-visible change: __WCLONE and __WALL no longer have any meaning for debugger. And I can only hope that this won't break something, but at least strace/gdb won't suffer. We could make a more conservative change. Say, we can take __WCLONE into account, or !thread_group_leader(). But it would be nice to not complicate these historical/confusing checks. 
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: <syzkaller@googlegroups.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-07 18:14:35 -07:00
Sarangdhar Joshi	7ab05c20ad	arm64: Add support for app specific settings Add support to provide an interface that can be used from userspace to decide whether app specific settings need to be applied / cleared when particular processes are running. CRs-Fixed: 981519 997757 Change-Id: Id81f8b70de64f291a8586150f4d2c7c8f8b4420f Signed-off-by: Sarangdhar Joshi <spjoshi@codeaurora.org> [satyap@codeaurora.org: trivial merge conflict resolution and pull fixes for CR: 997757] Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>	2016-06-07 11:53:27 -07:00
Joonwoo Park	42ab5394f4	Revert "sched: warn/panic upon excessive scheduling latency" This reverts commit `8f90803a45` ("sched: warn/panic upon excessive scheduling latency") as this feature is no longer used. Change-Id: I200d0e9e8dad5047522cd02a68de25d4a70a91a4 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-03 14:48:17 -07:00
Joonwoo Park	9103cfbaa1	Revert "sched: add scheduling latency tracking procfs node" This reverts commit `b40bf941f6` ("sched: add scheduling latency tracking procfs node") as this feature is no longer used. Change-Id: I5de789b6349e6ea78ae3725af2a3ffa72b7b7f11 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-03 14:48:05 -07:00
Joonwoo Park	11ad3c4f92	sched: eliminate sched_early_detection_duration knob Kill unused scheduler knob sched_early_detection_duration. Change-Id: I36b7a10982367f9c7ab8eefcb8ef1d0f9955601d Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-03 14:47:51 -07:00
Joonwoo Park	eedf0821f6	sched: Remove the sched heavy task frequency guidance feature This has always been an unused feature given its limitation of adding phantom load to the system. Since there are no immediate plans of using this and the fact that it adds unnecessary complications to the new load fixup mechanism, remove this feature for now. It can be revisited later in light of the new mechanism. Change-Id: Ie9501a898d0f423338293a8dde6bc56f493f1e75 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-03 14:47:39 -07:00
Joonwoo Park	6b2c4343e7	sched: eliminate sched_migration_fixup knob Kill unused scheduler knob sched_migration_fixup. With this change scheduler always adjusts CPU's busy time during migration. Change-Id: I5d59e89d5cc0f2c705c40036cd7b47f5d3f89e58 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-03 14:47:25 -07:00
Joonwoo Park	dc284e65df	sched: eliminate sched_upmigrate_min_nice knob Kill unused scheduler knob sched_upmigrate_min_nice. Change-Id: I53ddfde39c78e78306bd746c1c4da9a94ec67cd8 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-03 14:46:44 -07:00
Alex Shi	58189909e6	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android	2016-06-02 12:18:57 +08:00
Alex Shi	37aa27cffb	Merge tag 'v4.4.12' into linux-linaro-lsk-v4.4 This is the 4.4.12 stable release	2016-06-02 12:18:55 +08:00
Alex Shi	e1599ccfad	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android	2016-06-02 10:25:31 +08:00
Alex Shi	2ea80ad420	Merge remote-tracking branch 'v4.4/topic/coresight' into linux-linaro-lsk-v4.4	2016-06-02 09:54:57 +08:00
Joonwoo Park	d009f9c149	sched: eliminate sched_enable_power_aware knob and parameter Kill unused scheduler knob and parameter sched_enable_power_aware. HMP scheduler always takes into account power cost for placing task. Change-Id: Ib26a21df9b903baac26c026862b0a41b4a8834f3 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-01 15:21:29 -07:00
Joonwoo Park	462213d1ac	sched: eliminate sched_freq_account_wait_time knob Kill unused scheduler knob sched_freq_account_wait_time. Change-Id: Ib74123ebd69dfa3f86cf7335099f50c12a6e93c3 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-01 15:21:18 -07:00
Joonwoo Park	5160d93b6d	sched: eliminate sched_account_wait_time knob Kill unused scheduler knob sched_account_wait_time. With this change scheduler always accounts task's wait time into demand. Change-Id: Ifa4bcb5685798f48fd020f3d0c9853220b3f5fdc Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-06-01 15:21:04 -07:00
Alexander Shishkin	84b84017f4	perf/ring_buffer: Document AUX API usage In order to ensure safe AUX buffer management, we rely on the assumption that pmu::stop() stops its ongoing AUX transaction and not just the hw. This patch documents this requirement for the perf_aux_output_{begin,end}() APIs. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1457098969-21595-4-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit af5bb4ed1254a378b6028c09e58bdcc1cd9bf5b3)	2016-06-01 15:42:47 -06:00
Alexander Shishkin	c5611c7b5f	perf/core: Free AUX pages in unmap path Now that we can ensure that when ring buffer's AUX area is on the way to getting unmapped new transactions won't start, we only need to stop all events that can potentially be writing aux data to our ring buffer. Having done that, we can safely free the AUX pages and corresponding PMU data, as this time it is guaranteed to be the last aux reference holder. This partially reverts: `57ffc5ca67` ("perf: Fix AUX buffer refcounting") ... which was made to defer deallocation that was otherwise possible from an NMI context. Now it is no longer the case; the last call to rb_free_aux() that drops the last AUX reference has to happen in perf_mmap_close() on that AUX area. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/87d1qtz23d.fsf@ashishki-desk.ger.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 95ff4ca26c492fc1ed7751f5dd7ab7674b54f4e0)	2016-06-01 15:42:47 -06:00
Alexander Shishkin	5025483e77	perf/ring_buffer: Refuse to begin AUX transaction after rb->aux_mmap_count drops When ring buffer's AUX area is unmapped and rb->aux_mmap_count drops to zero, new AUX transactions into this buffer can still be started, even though the buffer is en route to deallocation. This patch adds a check to perf_aux_output_begin() for rb->aux_mmap_count being zero, in which case there is no point starting new transactions, in other words, the ring buffers that pass a certain point in perf_mmap_close will not have their events sending new data, which clears path for freeing those buffers' pages right there and then, provided that no active transactions are holding the AUX reference. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1457098969-21595-2-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit dcb10a967ce82d5ad20570693091139ae716ff76)	2016-06-01 15:42:46 -06:00
Alexander Shishkin	e05ed32680	perf/core: Disable the event on a truncated AUX record When the PMU driver reports a truncated AUX record, it effectively means that there is no more usable room in the event's AUX buffer (even though there may still be some room, so that perf_aux_output_begin() doesn't take action). At this point the consumer still has to be woken up and the event has to be disabled, otherwise the event will just keep spinning between perf_aux_output_begin() and perf_aux_output_end() until its context gets unscheduled. Again, for cpu-wide events this means never, so once in this condition, they will be forever losing data. Fix this by disabling the event and waking up the consumer in case of a truncated AUX record. Reported-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1462886313-13660-3-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 9f448cd3cbcec8995935e60b27802ae56aac8cc0)	2016-06-01 15:29:57 -06:00
Alexander Shishkin	75663c46e8	perf/core: Don't leak event in the syscall error path In the error path, event_file not being NULL is used to determine whether the event itself still needs to be free'd, so fix it up to avoid leaking. Reported-by: Leon Yu <chianglungyu@gmail.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 130056275ade ("perf: Do not double free") Link: http://lkml.kernel.org/r/87twk06yxp.fsf@ashishki-desk.ger.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 201c2f85bd0bc13b712d9c0b3d11251b182e06ae)	2016-06-01 15:29:56 -06:00
Alexander Shishkin	072536f070	perf/core: Fix perf_sched_count derailment The error path in perf_event_open() is such that asking for a sampling event on a PMU that doesn't generate interrupts will end up in dropping the perf_sched_count even though it hasn't been incremented for this event yet. Given a sufficient amount of these calls, we'll end up disabling scheduler's jump label even though we'd still have active events in the system, thereby facilitating the arrival of the infernal regions upon us. I'm fixing this by moving account_event() inside perf_event_alloc(). Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1456917854-29427-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 927a5570855836e5d5859a80ce7e91e963545e8f)	2016-06-01 15:29:56 -06:00
Alexander Shishkin	52814abf26	perf: Synchronously free aux pages in case of allocation failure We are currently using asynchronous deallocation in the error path in AUX mmap code, which is unnecessary and also presents a problem for users that wish to probe for the biggest possible buffer size they can get: they'll get -EINVAL on all subsequent attempts to allocate a smaller buffer before the asynchronous deallocation callback frees up the pages from the previous unsuccessful attempt. Currently, gdb does that for allocating AUX buffers for Intel PT traces. More specifically, overwrite mode of AUX pmus that don't support hardware sg (some implementations of Intel PT, for instance) is limited to only one contiguous high order allocation for its buffer and there is no way of knowing its size without trying. This patch changes error path freeing to be synchronous as there won't be any contenders for the AUX pages at that point. Reported-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1453216469-9509-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 45c815f06b80031659c63d7b93e580015d6024dd)	2016-06-01 15:28:34 -06:00
Vik Heyndrickx	1df73f1884	sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems commit 20878232c52329f92423d27a60e48b6a6389e0dd upstream. Systems show a minimal load average of 0.00, 0.01, 0.05 even when they have no load at all. Uptime and /proc/loadavg on all systems with kernels released during the last five years up until kernel version 4.6-rc5, show a 5- and 15-minute minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on idle systems, but the way the kernel calculates this value prevents it from getting lower than the mentioned values. Likewise but not as obviously noticeable, a fully loaded system with no processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95 (multiplied by number of cores). Once the (old) load becomes 93 or higher, it mathematically can never get lower than 93, even when the active (load) remains 0 forever. This results in the strange 0.00, 0.01, 0.05 uptime values on idle systems. Note: 93/2048 = 0.0454..., which rounds up to 0.05. It is not correct to add a 0.5 rounding (=1024/2048) here, since the result from this function is fed back into the next iteration again, so the result of that +0.5 rounding value then gets multiplied by (2048-2037), and then rounded again, so there is a virtual "ghost" load created, next to the old and active load terms. By changing the way the internally kept value is rounded, that internal value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon increasing load, the internally kept load value is rounded up, when the load is decreasing, the load value is rounded down. The modified code was tested on nohz=off and nohz kernels. It was tested on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was tested on single, dual, and octal cores system. It was tested on virtual hosts and bare hardware. No unwanted effects have been observed, and the problems that the patch intended to fix were indeed gone. 
Tested-by: Damien Wyart <damien.wyart@free.fr> Signed-off-by: Vik Heyndrickx <vik.heyndrickx@veribox.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Doug Smythies <dsmythies@telus.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `0f004f5a69` ("sched: Cure more NO_HZ load average woes") Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-01 12:15:49 -07:00
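The stuck value of 93 described in the loadavg commit above can be reproduced with a small fixed-point model. This is a sketch modelled on the message's description. The 2048 scale and the 2037 decay factor both appear in the message. The function names and the exact form of the two rounding rules are assumptions made for illustration: add 0.5 always in the old variant, round up only when load is not decreasing in the new one.

```python
FSHIFT = 11
FIXED_1 = 1 << FSHIFT   # 2048, fixed-point representation of 1.0
EXP_15 = 2037           # 15-minute decay factor mentioned in the message


def calc_load_old(load, exp, active):
    """Old behaviour: always add 0.5 (1024/2048) before truncating."""
    load = load * exp + active * (FIXED_1 - exp)
    load += 1 << (FSHIFT - 1)
    return load >> FSHIFT


def calc_load_new(load, exp, active):
    """New behaviour: round up only when the load is not decreasing."""
    newload = load * exp + active * (FIXED_1 - exp)
    if active >= load:
        newload += FIXED_1 - 1
    return newload // FIXED_1


def decay_to_idle(calc, start, steps):
    """Apply `steps` updates with zero active load, starting from `start`."""
    load = start
    for _ in range(steps):
        load = calc(load, EXP_15, 0)
    return load
```

Starting from a load of 1.00 (2048) on an idle system, the old rule settles at 93, which displays as 0.05, while the new rule reaches 0.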
Steven Rostedt (Red Hat)	f199023137	ring-buffer: Prevent overflow of size in ring_buffer_resize() commit 59643d1535eb220668692a5359de22545af579f6 upstream. If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE then the DIV_ROUND_UP() will return zero. Here's the details: # echo 18014398509481980 > /sys/kernel/debug/tracing/buffer_size_kb tracing_entries_write() processes this and converts kb to bytes. 18014398509481980 << 10 = 18446744073709547520 and this is passed to ring_buffer_resize() as unsigned long size. size = DIV_ROUND_UP(size, BUF_PAGE_SIZE); Where DIV_ROUND_UP(a, b) is (a + b - 1)/b BUF_PAGE_SIZE is 4080 and here 18446744073709547520 + 4080 - 1 = 18446744073709551599 where 18446744073709551599 is still smaller than 2^64 2^64 - 18446744073709551599 = 17 But now 18446744073709551599 / 4080 = 4521260802379792 and size = size * 4080 = 18446744073709551360 This is checked to make sure it's still greater than 2 * 4080, which it is. Then we convert to the number of buffer pages needed. nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE) but this time size is 18446744073709551360 and 2^64 - (18446744073709551360 + 4080 - 1) = -3823 Thus it overflows and the resulting number is less than 4080, which makes 3823 / 4080 = 0 and nr_pages is set to this. As we already checked against the minimum that nr_pages may be, this causes the logic to fail as well, and we crash the kernel. There's no reason to have the two DIV_ROUND_UP() (that's just result of historical code changes), clean up the code and fix this bug. Fixes: `83f40318da` ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-01 12:15:49 -07:00
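The arithmetic in the message above can be replayed by emulating 64-bit unsigned wrap-around. The sketch below models only the two DIV_ROUND_UP() steps the message walks through. The helper names are illustrative, and `resize_one_step` is a simplified assumption about the shape of the fix (a single rounding step), not the kernel's code.

```python
U64 = 2**64
BUF_PAGE_SIZE = 4080


def div_round_up_u64(a, b):
    """DIV_ROUND_UP(a, b) = (a + b - 1) / b with 64-bit unsigned wrap-around."""
    return ((a + b - 1) % U64) // b


def resize_two_step(size_kb):
    """Replay the double rounding described in the commit message."""
    size = (size_kb << 10) % U64                      # kb to bytes
    size = div_round_up_u64(size, BUF_PAGE_SIZE)      # first DIV_ROUND_UP
    size = (size * BUF_PAGE_SIZE) % U64               # back to bytes
    nr_pages = div_round_up_u64(size, BUF_PAGE_SIZE)  # second one wraps
    return size, nr_pages


def resize_one_step(size_kb):
    """Assumed simplification of the fix: round up to pages only once."""
    size = (size_kb << 10) % U64
    return div_round_up_u64(size, BUF_PAGE_SIZE)
```

With the value from the message, the two-step version ends with zero pages, while a single rounding step yields 4521260802379792 pages.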
Steven Rostedt (Red Hat)	dfb71aefc9	ring-buffer: Use long for nr_pages to avoid overflow failures commit 9b94a8fba501f38368aef6ac1b30e7335252a220 upstream. The size variable to change the ring buffer in ftrace is a long. The nr_pages used to update the ring buffer based on the size is int. On 64 bit machines this can cause an overflow problem. For example, the following will cause the ring buffer to crash: # cd /sys/kernel/debug/tracing # echo 10 > buffer_size_kb # echo 8556384240 > buffer_size_kb Then you get the warning of: WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260 Which is: RB_WARN_ON(cpu_buffer, nr_removed); Note each ring buffer page holds 4080 bytes. This is because: 1) 10 causes the ring buffer to have 3 pages. (10kb requires 3 * 4080 pages to hold) 2) (2^31 / 2^10 + 1) * 4080 = 8556384240 The value written into buffer_size_kb is shifted by 10 and then passed to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760 3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE which is 4080. 8761737461760 / 4080 = 2147484672 4) nr_pages is subtracted from the current nr_pages (3) and we get: 2147484669. This value is saved in a signed integer nr_pages_to_update 5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int turns into the value of -2147482627 6) As the value is a negative number, in update_pages_handler() it is negated and passed to rb_remove_pages() and 2147482627 pages will be removed, which is much larger than 3 and it causes the warning because not all the pages asked to be removed were removed. Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001 Fixes: `7a8e76a382` ("tracing: unified trace buffer") Reported-by: Hao Qin <QEver.cn@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-01 12:15:49 -07:00
Peter Zijlstra	c5174678e2	perf/core: Fix perf_event_open() vs. execve() race commit 79c9ce57eb2d5f1497546a3946b4ae21b6fdc438 upstream. Jann reported that the ptrace_may_access() check in find_lively_task_by_vpid() is racy against exec(). Specifically: perf_event_open() execve() ptrace_may_access() commit_creds() ... if (get_dumpable() != SUID_DUMP_USER) perf_event_exit_task(); perf_install_in_context() would result in installing a counter across the creds boundary. Fix this by wrapping lots of perf_event_open() in cred_guard_mutex. This should be fine as perf_event_exit_task() is already called with cred_guard_mutex held, so all perf locks already nest inside it. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-01 12:15:47 -07:00
Srivatsa Vaddagiri	e6aae1c3e0	sched: Aggregate for frequency Related threads in a group could execute on different CPUs and hence present a split-demand picture to cpufreq governor. IOW the governor fails to see the net cpu demand of all related threads in a given window if the threads's execution were to be split across CPUs. That could result in sub-optimal frequency chosen in comparison to the ideal frequency at which the aggregate work (taken up by related threads) needs to be run. This patch aggregates cpu execution stats in a window for all related threads in a group. This helps present cpu busy time to governor as if all related threads were part of the same thread and thus help select the right frequency required by related threads. This aggregation is done per-cluster. Change-Id: I71e6047620066323721c6d542034ddd4b2950e7f Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: Fixed notify_migration() to hold rcu read lock as this version of Linux doesn't hold p->pi_lock when the function gets called while keeping use of rcu_access_pointer() since we never dereference return value.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-05-26 15:28:59 -07:00
Kishor PK	05bd41f94e	trace: prevent NULL pointer dereference Prevent unintended NULL pointer dereference in trace_event_perf. Change-Id: I35151c460b4350ebd414b67c655684c2019f799f Signed-off-by: Kishor PK <kpbhat@codeaurora.org> Signed-off-by: Srinivasarao P <spathi@codeaurora.org> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>	2016-05-25 14:21:54 -07:00
Aparna Das	89ee45617e	coresight: add stm logging to support optimization in trace printk The function trace_printk() performs optimization by determining if there are no format parameters in argument string and calls appropriate apis to write to ftrace buffer. Add STM logging to support this optimization in order to allow CoreSight STM tracing for optimized trace_printk path. Change-Id: I1a77291e77410c6ed99474335a6d25742c409e47 Signed-off-by: Aparna Das <adas@codeaurora.org> Signed-off-by: Pratik Patel <pratikp@codeaurora.org> Signed-off-by: Shashank Mittal <mittals@codeaurora.org>	2016-05-24 14:15:32 -07:00
