evie/android_kernel_oneplus_msm8998 - Gay Catgirls Forgejo: gay catgirls having sex

evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Srivatsa Vaddagiri	5b45dc56e5	sched: Per-cpu prefer_idle flag Remove the global sysctl_sched_prefer_idle flag and replace it with a per-cpu prefer_idle flag. The per-cpu flag is expected to same for all cpus in a cluster. It thus provides convenient means to disable packing in one cluster while allowing packing in another cluster. Change-Id: Ie4cc73bb1a55b4eac5697be38e558546161faca1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:01:26 -07:00
Olav Haugan	3f947e7ba7	sched: Add sysctl to enable power aware scheduling Add sysctl to enable energy awareness at runtime. This is useful for performance/power tuning/measurements and debugging. In addition this will match up with the Documentation/scheduler/sched-hmp.txt documentation. Change-Id: I0a9185498640d66917b38bf5d55f6c59fc60ad5c Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [rameezmustafa@codeaurora.org]: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org	2016-03-23 20:01:24 -07:00
Joonwoo Park	790d5d8a4a	sched: Prevent race conditions where upmigrate_min_nice changes When upmigrate_min_nice is changed dec_nr_big_small_task() can trigger BUG_ON(rq->nr_big_tasks < 0). This happens when there is a task which was considered as non-big task due to its nice > upmigrate_min_nice and later upmigrate_min_nice is changed to higher value so the task becomes big task. In this case runqueue still has nr_big_tasks = 0 incorrectly with current implementation. Consequently next scheduler tick sees a big task to schedule and try to decrease nr_big_tasks which is already 0. Introduce sched_upmigrate_min_nice which is updated atomically and re-count the number of big and small tasks to fix BUG_ON() triggering. Change-Id: I6f5fc62ed22bbe5c52ec71613082a6e64f406e58 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:01:22 -07:00
Olav Haugan	90dc3fa9a7	sched: Avoid frequent task migration due to EA in lb A new tunable exists that allow task migration to be throttled when the scheduler tries to do task migrations due to Energy Awareness (EA). This tunable is only taken into account when migrations occur in the tick path. Extend the usage of the tunable to take into account the load balancer (lb) path also. In addition ensure that the start of task execution on a CPU is updated correctly. If a task is preempted but still runnable on the same CPU the start of execution should not be updated. Only update the start of execution when a task wakes up after sleep or moves to a new CPU. Change-Id: I6b2a8e06d8d2df8e0f9f62b7aba3b4ee4b2c1c4d Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [rameezmustafa@codeaurora.org]: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org [joonwoop@codeaurora.org: fixed conflict in group_classify() and set_task_cpu().] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:01:21 -07:00
Srivatsa Vaddagiri	29a412dffa	sched: Avoid frequent migration of running task Power values for cpus can drop quite considerably when it goes idle. As a result, the best choice for running a single task in a cluster can vary quite rapidly. As the task keeps hopping cpus, other cpus go idle and start being seen as more favorable target for running a task, leading to task migrating almost every scheduler tick! Prevent this by keeping track of when a task started running on a cpu and allowing task migration in tick path (migration_needed()) on account of energy efficiency reasons only if the task has run sufficiently long (as determined by sysctl_sched_min_runtime variable). Note that currently sysctl_sched_min_runtime setting is considered only in scheduler_tick()->migration_needed() path and not in idle_balance() path. In other words, a task could be migrated to another cpu which did a idle_balance(). This limitation should not affect high-frequency migrations seen typically (when a single high-demand task runs on high-performance cpu). CRs-Fixed: 756570 Change-Id: I96413b7a81b623193c3bbcec6f3fa9dfec367d99 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in set_task_cpu() and __schedule().] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:01:17 -07:00
Srivatsa Vaddagiri	72b7c5d36c	sched: Provide knob to prefer mostly_idle over idle cpus sysctl_sched_prefer_idle lets the scheduler bias selection of idle cpus over mostly idle cpus for tasks. This knob could be useful to control balance between power and performance. Change-Id: Ide6eef684ef94ac8b9927f53c220ccf94976fe67 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:01:12 -07:00
Steve Muckle	588055e8c7	sched: make sched_cpu_high_irqload a runtime tunable It may be desirable to be able to alter the scehd_cpu_high_irqload setting easily, so make it a runtime tunable value. Change-Id: I832030eec2aafa101f0f435a4fd2d401d447880d Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2016-03-23 20:01:11 -07:00
Srivatsa Vaddagiri	b2e57842c0	sched: per-cpu mostly_idle threshold sched_mostly_idle_load and sched_mostly_idle_nr_run knobs help pack tasks on cpus to some extent. In some cases, it may be desirable to have different packing limits for different cpus. For example, pack to a higher limit on high-performance cpus compared to power-efficient cpus. This patch removes the global mostly_idle tunables and makes them per-cpu, thus letting task packing behavior to be controlled in a fine-grained manner. Change-Id: Ifc254cda34b928eae9d6c342ce4c0f64e531e6c2 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:59 -07:00
Srivatsa Vaddagiri	98f89f00dc	sched: update governor notification logic Make criteria for notifying governor to be per-cpu. Governor is notified of any large change in cpu's busy time statistics (rq->prev_runnable_sum) since the last reported value. Change-Id: I727354d994d909b166d093b94d3dade7c7dddc0d Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:54 -07:00
Srivatsa Vaddagiri	c12a2b5ab9	sched: Use absolute scale for notifying governor Make the tunables used for deciding the need for notification to be on absolute scale. The earlier scale (in percent terms relative to cur_freq) does not work well with available range of frequencies. For example, 100% tunable value would work well for lower range of frequencies and not for higher range. Having the tunable to be on absolute scale makes tuning more realistic. Change-Id: I35a8c4e2f2e9da57f4ca4462072276d06ad386f1 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:51 -07:00
Srivatsa Vaddagiri	3a67b4ce87	sched: window-stats: Enhance cpu busy time accounting rq->curr/prev_runnable_sum counters represent cpu demand from various tasks that have run on a cpu. Any task that runs on a cpu will have a representation in rq->curr_runnable_sum. Their partial_demand value will be included in rq->curr_runnable_sum. Since partial_demand is derived from historical load samples for a task, rq->curr_runnable_sum could represent "inflated/un-realistic" cpu usage. As an example, lets say that task with partial_demand of 10ms runs for only 1ms on a cpu. What is included in rq->curr_runnable_sum is 10ms (and not the actual execution time of 1ms). This leads to cpu busy time being reported on the upside causing frequency to stay higher than necessary. This patch fixes cpu busy accounting scheme to strictly represent actual usage. It also provides for conditional fixup of busy time upon migration and upon heavy-task wakeup. CRs-Fixed: 691443 Change-Id: Ic4092627668053934049af4dfef65d9b6b901e6b Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in init_task_load(), se.avg.decay_count has deprecated.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:00:50 -07:00
Srivatsa Vaddagiri	c9d0953c31	sched: improve logic for alerting governor Currently we send notification to governor not taking note of cpus that are synchronized with regard to their frequency. As a result, scheduler could send pointless notifications (notification spam!). Avoid this by considering synchronized cpus and alerting governor only when the highest demand of any cpu within cluster far exceeds or falls behind current frequency. Change-Id: I74908b5a212404ca56b38eb94548f9b1fbcca33d Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:48 -07:00
Srivatsa Vaddagiri	d8932ae7df	sched: window-stats: legacy mode Support legacy mode, which results in busy time being seen by governor that is close to what it would have seen via existing APIs i.e get_cpu_idle_time_us(), get_cpu_iowait_time_us() and get_cpu_idle_time_jiffy(). In particular, legacy mode means that only task execution time is counted in rq->curr_runnable_sum and rq->prev_runnable_sum. Also task migration does not result in adjustment of those counters. Change-Id: If374ccc084aa73f77374b6b3ab4cd0a4ca7b8c90 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:26 -07:00
Srivatsa Vaddagiri	e39131c3be	sched: window-stats: Code cleanup Remove code duplication associated with update of various window-stats related sysctl tunables Change-Id: I64e29ac065172464ba371a03758937999c42a71f Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:24 -07:00
Olav Haugan	8eede4a8d5	sched: Make RAVG_HIST_SIZE tunable Make RAVG_HIST_SIZE available from /proc/sys/kernel/sched_ravg_hist_size to allow tuning of the size of the history that is used in computation of task demand. CRs-fixed: 706138 Change-Id: Id54c1e4b6e974a62d787070a0af1b4e8ce3b4be6 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in sysctl.h] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:00:19 -07:00
Srivatsa Vaddagiri	4641b37da8	sched: window-stats: Allow acct_wait_time to be tuned Add sysctl interface to tune sched_acct_wait_time variable at runtime Change-Id: I38339cdb388a507019e429709a7c28e80b5b3585 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 20:00:15 -07:00
Srivatsa Vaddagiri	1ffae4dc94	sched: window-stats: Handle policy change properly sched_window_stat_policy influences task demand and thus various statistics maintained per-cpu like curr_runnable_sum. Changing policy non-atomically would lead to improper accounting. For example, when task is enqueued on a cpu's runqueue, its demand that is added to rq->cumulative_runnable_avg could be based on AVG policy and when its dequeued its demand that is removed can be based on MAX, leading to erroneous accounting. This change causes policy change to be "atomic" i.e all cpu's rq->lock are held and all task's window-stats are reset before policy is changed. Change-Id: I6a3e4fb7bc299dfc5c367693b5717a1ef518c32d CRs-Fixed: 687409 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in include/linux/sched/sysctl.h. Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 20:00:03 -07:00
Srivatsa Vaddagiri	f27b626521	sched: remove sysctl control for HMP and power-aware task placement There is no real need to control HMP and power-aware task placement at runtime after kernel has booted. Boot-time control should be sufficient. Not allowing for runtime (sysctl) support simplifies the code quite a bit. Also rename sysctl_sched_enable_hmp_task_placement to be shorter. Change-Id: I60cae51a173c6f73b79cbf90c50ddd41a27604aa Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict. p->nr_cpus_allowed == 1 has moved to core.c Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:55 -07:00
Syed Rameez Mustafa	754f666131	sched/fair: Introduce scheduler boost for low latency workloads Certain low latency bursty workloads require immediate use of highest capacity CPUs in HMP systems. Existing load tracking mechanisms may be unable to respond to the sudden surge in the system load within the latency requirements. Introduce the scheduler boost feature for such workloads. While boost is in effect the scheduler bypasses regular load based task placement and prefers highest capacity CPUs in the system for all non-small fair sched class tasks. Provide both a kernel and userspace API for software that may have apriori knowledge about the system workload. Change-Id: I783f585d1f8c97219e629d9c54f712318821922f Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in include/linux/sched/sysctl.h.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:45 -07:00
Steve Muckle	476ea8d45d	sched: notify cpufreq on over/underprovisioned CPUs After a migration occurs the source and destination CPUs may not be running at frequencies which match the new task load on those CPUs. Previously, the scheduler was notifying cpufreq anytime a task greater than a certain size migrates. This is suboptimal however since this does not take into account the CPU's current frequency and other task activity that may be present. Change-Id: I5092bda3a517e1343f97e5a455957c25ee19b549 Signed-off-by: Steve Muckle <smuckle@codeaurora.org>	2016-03-23 19:59:32 -07:00
Syed Rameez Mustafa	a536cf8ac8	sched: Introduce spill threshold tunables to manage overcommitment When the number of tasks intended for a cluster exceed the number of mostly idle CPUs in that cluster, the scheduler currently freely uses CPUs in other clusters if possible. While this is optimal for performance the power trade off can be quite significant. Introduce spill threshold tunables that govern the extent to which the scheduler should attempt to contain tasks within a cluster. Change-Id: I797e6c6b2aa0c3a376dad93758abe1d587663624 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [rameezmustafa@codeaurora.org]: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org [joonwoop@codeaurora.org: fixed conflict in nohz_kick_needed()] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:32 -07:00
Steve Muckle	f469bce8e2	sched: add migration load change notifier for frequency guidance When a task moves between CPUs in two different frequency domains the cpufreq governor may wish to immediately modify the frequency of both the source and destination CPUs of the migrating task. A tunable is provided to establish what size task is considered "significant" enough to warrant notifying cpufreq. Also fix a bug that would cause load to not be accounted properly during wakeup migrations. Change-Id: Ie8f6b1cc4d43a602840dac18590b42a81327c95a Signed-off-by: Steve Muckle <smuckle@codeaurora.org> [rameezmustafa@codeaurora.org: Add double rq locking for set_task_cpu()] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 19:59:29 -07:00
Steve Muckle	ec7d8cc076	sched: add power aware scheduling sysctl The sched_enable_power_aware sysctl will control whether or not scheduling decisions are influenced by the power consumption of individual CPUs. Change-Id: I312f892cf76a3fccc4ecc8aa6703908b205267f0 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 19:59:21 -07:00
Srivatsa Vaddagiri	7cec8569e3	sched: Basic task placement support for HMP systems HMP systems have cpus with different power and performance characteristics. Some cpus could offer better power at cost of lower performance while other cpus could offer better performance at cost of higher power. As a result, bandwidth consumed by a task to do some "fixed" amount of work could vary across cpus. Optimal task placement on HMP would involve placing a task on a cpu where it can meet its performance goals at lowest power cost. Since kernel has little to no awareness of performance goals of applications, we guestimate whether task is meeting its performance goals or not by looking at its cpu bandwidth consumption. High bandwidth consumption could imply that task's performance can improve by running on cpus with better capacity/performance-characterisitcs. This patch makes the basic changes to support HMP. It provides a configurable threshold and any task consuming bandwidth in excess of threshold will be placed on a cpu with better capacity. Change-Id: I3fd98edd430f73342fbef06411e8b2d1cf2f56fa Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org]: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict about members of p->se which are not available anymore.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:03 -07:00
Srivatsa Vaddagiri	551f83f5d6	sched: Add CONFIG_SCHED_HMP Kconfig option Add a compile-time flag to enable or disable scheduler features for HMP (heterogenous multi-processor) systems. Main feature deals with optimizing task placement for best power/performance tradeoff. Also extend features currently dependent on CONFIG_SCHED_FREQ_INPUT to be enabled for CONFIG_HMP as well. Change-Id: I03b3942709a80cc19f7b934a8089e1d84c14d72d Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org]: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor ifdefry conflict.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:59:00 -07:00
Srivatsa Vaddagiri	77fe8dd14d	sched: Introduce CONFIG_SCHED_FREQ_INPUT Introduce a compile time flag to enable scheduler guidance of frequency selection. This flag is also used to turn on or off window-based load stats feature. Having a compile time flag will let some platforms avoid any overhead that may be present with this scheduler feature. Change-Id: Id8dec9839f90dcac82f58ef7e2bd0ccd0b6bd16c Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org]: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict around sysctl_timer_migration.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:58:59 -07:00
Srivatsa Vaddagiri	a25a5c1c30	sched: window-based load stats improvements Following cleanups and improvements are made to window-based load stats feature: * Add sysctl to pick max, avg or most recent samples as task's demand. * Fix overflow possibility in calculation of sum for average policy. * Use unscaled statistics when a task is running on a CPU which is thermally throttled. Change-Id: I8293565ca0c2a785dadf8adb6c67f579a445ed29 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>	2016-03-23 19:58:58 -07:00
Srivatsa Vaddagiri	3967da2dd1	sched: Window-based load stat improvements Some tasks can have a sporadic load pattern such that they can suddenly start running for longer intervals of time after running for shorter durations. To recognize such sharp increase in tasks' demands, max between the average of 5 window load samples and the most recent sample is chosen as the task demand. Make the window size (sched_ravg_window) configurable at boot up time. To prevent users from setting inappropriate values for window size, min and max limits are defined. As 'ravg' struct tracks load for both real-time and non real-time tasks it is moved out of sched_entity struct. In order to prevent changing function signatures for move_tasks() and move_one_task() per-cpu variables are defined to track the total load moved. In case multiple tasks are selected to migrate in one load balance operation, loads > 100 could be sent through migration notifiers. Prevent this scenario by setting mnd.load to 100 in such cases. Define wrapper functions to compute cpu demands for tasks and to change rq->cumulative_runnable_avg. Change-Id: I9abfbf3b5fe23ae615a6acd3db9580cfdeb515b4 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Rohit Gupta <rohgup@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18 and squash "dcf7256 sched: window-stats: Fix overflow bug" into this patch.] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in __migrate_task().] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:58:53 -07:00
Rohit Gupta	e3fe80da05	sched: Call the notify_on_migrate notifier chain for wakeups as well Add a change to send notify_on_migrate hints on wakeups of foreground tasks from scheduler if their load is above wakeup_load_thresholds (default value is 60). These hints can be used to choose an appropriate CPU frequency corresponding to the load of the task being woken up. By default sched_wakeup_load_threshold is set to 60 and therefore wakeup hints are sent out for those tasks whose loads are higher that value. This might cause unnecessary wakeup boosts to happen when load based syncing is turned ON for cpu-boost. Disable the wake up hints by setting the sched_wakeup_load_threshold to a value higher than 100 so that wakeup boost doesnt happen unless it is explicitly turned ON from adb shell. Change-Id: Ieca413c1a8bd2b14a15a7591e8e15d22925c42ca Signed-off-by: Rohit Gupta <rohgup@codeaurora.org> [rameezmustafa@codeaurora.org: Squash "a26fcce sched: Disable wakeup hints for foreground tasks by default" into this patch and update commit text.] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>	2016-03-23 19:58:52 -07:00
Srivatsa Vaddagiri	74463329e4	sched: window-based load stats for tasks Provide a metric per task that specifies how cpu bound a task is. Task execution is monitored over several time windows and the fraction of the window for which task was found to be executing or wanting to run is recorded as task's demand. Windows over which task was sleeping are ignored. We track last 5 recent windows for every task and the maximum demand seen in any of the previous 5 windows (where task had some activity) drives freq demand for every task. A per-cpu metric (rq->cumulative_runnable_avg) is also provided which is an aggregation of cpu demand of all tasks currently enqueued on it. rq->cumulative_runnable_avg will be useful to know if cpu frequency will need to be changed to match task demand. Change-Id: Ib83207b9ba8683cd3304ee8a2290695c34f08fe2 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org]: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in ttwu_do_wakeup() to incorporate with changed trace_sched_wakeup() location.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>	2016-03-23 19:58:39 -07:00
Steve Muckle	03ab01a318	sched: add sysctl for controlling task migrations on wake The PF_WAKE_UP_IDLE per-task flag made it impossible to enable the old behavior of SD_SHARE_PKG_RESOURCES, where every task migrates to an idle CPU on wakeup. The sched_wake_to_idle sysctl value, when made nonzero, will cause all tasks to migrate to an idle CPU if one is available when the task is woken up. This is regardless of how PF_WAKE_UP_IDLE is configured for tasks in the system. Similar to PF_WAKE_UP_IDLE, the SD_SHARE_PKG_RESOURCES scheduler domain flag must be enabled for the sysctl value to have an effect. Change-Id: I23bed846d26502c7aed600bfcf1c13053a7e5f61 Signed-off-by: Steve Muckle <smuckle@codeaurora.org> (cherry picked from commit 9d5b38dc0025d19df5b756b16024b4269e73f282)	2016-03-23 19:58:30 -07:00
David Collins	d4b065ff47	sysctl: add boot_reason and cold_boot sysctl entries for arm64 Define boot_reason and cold_boot variables in the arm64 version of setup.c so that arm64 targets can export the boot_reason and cold_boot sysctl entries. This feature is required by the qpnp-power-on driver. Change-Id: Id2d4ff5b8caa2e6a35d4ac61e338963d602c8b84 Signed-off-by: David Collins <collinsd@codeaurora.org> [osvaldob: resolved trival merge conflicts] Signed-off-by: Osvaldo Banuelos <osvaldob@codeaurora.org>	2016-03-01 12:22:13 -08:00
dcashman	d49d88766b	FROMLIST: mm: mmap: Add new /proc tunable for mmap_base ASLR. (cherry picked from commit https://lkml.org/lkml/2015/12/21/337) ASLR only uses as few as 8 bits to generate the random offset for the mmap base address on 32 bit architectures. This value was chosen to prevent a poorly chosen value from dividing the address space in such a way as to prevent large allocations. This may not be an issue on all platforms. Allow the specification of a minimum number of bits so that platforms desiring greater ASLR protection may determine where to place the trade-off. Bug: 24047224 Signed-off-by: Daniel Cashman <dcashman@android.com> Signed-off-by: Daniel Cashman <dcashman@google.com> Change-Id: Ibf9ed3d4390e9686f5cc34f605d509a20d40e6c2	2016-02-16 13:54:14 -08:00
Rik van Riel	f8ade3666c	add extra free kbytes tunable Add a userspace visible knob to tell the VM to keep an extra amount of memory free, by increasing the gap between each zone's min and low watermarks. This is useful for realtime applications that call system calls and have a bound on the number of allocations that happen in any short time period. In this application, extra_free_kbytes would be left at an amount equal to or larger than than the maximum number of allocations that happen in any burst. It may also be useful to reduce the memory use of virtual machines (temporarily?), in a way that does not cause memory fragmentation like ballooning does. [ccross] Revived for use on old kernels where no other solution exists. The tunable will be removed on kernels that do better at avoiding direct reclaim. Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e Signed-off-by: Rik van Riel<riel@redhat.com> Signed-off-by: Colin Cross <ccross@android.com>	2016-02-16 13:54:12 -08:00
Don Zickus	ac1f591249	kernel/watchdog.c: add sysctl knob hardlockup_panic The only way to enable a hardlockup to panic the machine is to set 'nmi_watchdog=panic' on the kernel command line. This makes it awkward for end users and folks who want to run automate tests (like myself). Mimic the softlockup_panic knob and create a /proc/sys/kernel/hardlockup_panic knob. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-05 19:34:48 -08:00
Jiri Kosina	55537871ef	kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup In many cases of hardlockup reports, it's actually not possible to know why it triggered, because the CPU that got stuck is usually waiting on a resource (with IRQs disabled) in posession of some other CPU is holding. IOW, we are often looking at the stacktrace of the victim and not the actual offender. Introduce sysctl / cmdline parameter that makes it possible to have hardlockup detector perform all-CPU backtrace. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-05 19:34:48 -08:00
Alexei Starovoitov	1be7f75d16	bpf: enable non-root eBPF programs In order to let unprivileged users load and execute eBPF programs teach verifier to prevent pointer leaks. Verifier will prevent - any arithmetic on pointers (except R10+Imm which is used to compute stack addresses) - comparison of pointers (except if (map_value_ptr == 0) ... ) - passing pointers to helper functions - indirectly passing pointers in stack to helper functions - returning pointer from bpf program - storing pointers into ctx or maps Spill/fill of pointers into stack is allowed, but mangling of pointers stored in the stack or reading them byte by byte is not. Within bpf programs the pointers do exist, since programs need to be able to access maps, pass skb pointer to LD_ABS insns, etc but programs cannot pass such pointer values to the outside or obfuscate them. Only allow BPF_PROG_TYPE_SOCKET_FILTER unprivileged programs, so that socket filters (tcpdump), af_packet (quic acceleration) and future kcm can use it. tracing and tc cls/act program types still require root permissions, since tracing actually needs to be able to see all kernel pointers and tc is for root only. For example, the following unprivileged socket filter program is allowed: int bpf_prog1(struct __sk_buff skb) { u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol)); u64 value = bpf_map_lookup_elem(&my_map, &index); if (value) value += skb->len; return 0; } but the following program is not: int bpf_prog1(struct __sk_buff skb) { u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol)); u64 value = bpf_map_lookup_elem(&my_map, &index); if (value) value += (u64) skb; return 0; } since it would leak the kernel address into the map. Unprivileged socket filter bpf programs have access to the following helper functions: - map lookup/update/delete (but they cannot store kernel pointers into them) - get_random (it's already exposed to unprivileged user space) - get_smp_processor_id - tail_call into another socket filter program - ktime_get_ns The feature is controlled by sysctl kernel.unprivileged_bpf_disabled. This toggle defaults to off (0), but can be set true (1). Once true, bpf programs and maps cannot be accessed from unprivileged process, and the toggle cannot be set back to false. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-12 19:13:35 -07:00
Ilya Dryomov	9a5bc726d5	sysctl: fix int -> unsigned long assignments in INT_MIN case The following if (val < 0) *lvalp = (unsigned long)-val; is incorrect because the compiler is free to assume -val to be positive and use a sign-extend instruction for extending the bit pattern. This is a problem if val == INT_MIN: # echo -2147483648 >/proc/sys/dev/scsi/logging_level # cat /proc/sys/dev/scsi/logging_level -18446744071562067968 Cast to unsigned long before negation - that way we first sign-extend and then negate an unsigned, which is well defined. With this: # cat /proc/sys/dev/scsi/logging_level -2147483648 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Cc: Mikulas Patocka <mikulas@twibright.com> Cc: Robert Xiao <nneonneo@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Dave Young	2965faa5e0	kexec: split kexec_load syscall from kexec core code There are two kexec load syscalls, kexec_load another and kexec_file_load. kexec_file_load has been splited as kernel/kexec_file.c. In this patch I split kexec_load syscall code to kernel/kexec.c. And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and use kexec_file_load only, or vice verse. The original requirement is from Ted Ts'o, he want kexec kernel signature being checked with CONFIG_KEXEC_VERIFY_SIG enabled. But kexec-tools use kexec_load syscall can bypass the checking. Vivek Goyal proposed to create a common kconfig option so user can compile in only one syscall for loading kexec kernel. KEXEC/KEXEC_FILE selects KEXEC_CORE so that old config files still work. Because there's general code need CONFIG_KEXEC_CORE, so I updated all the architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects KEXEC_CORE in arch Kconfig. Also updated general kernel code with to kexec_load syscall. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Dave Young <dyoung@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Josh Boyer <jwboyer@fedoraproject.org> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Linus Torvalds	0cbee99269	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace updates from Eric Biederman: "Long ago and far away when user namespaces where young it was realized that allowing fresh mounts of proc and sysfs with only user namespace permissions could violate the basic rule that only root gets to decide if proc or sysfs should be mounted at all. Some hacks were put in place to reduce the worst of the damage could be done, and the common sense rule was adopted that fresh mounts of proc and sysfs should allow no more than bind mounts of proc and sysfs. Unfortunately that rule has not been fully enforced. There are two kinds of gaps in that enforcement. Only filesystems mounted on empty directories of proc and sysfs should be ignored but the test for empty directories was insufficient. So in my tree directories on proc, sysctl and sysfs that will always be empty are created specially. Every other technique is imperfect as an ordinary directory can have entries added even after a readdir returns and shows that the directory is empty. Special creation of directories for mount points makes the code in the kernel a smidge clearer about it's purpose. I asked container developers from the various container projects to help test this and no holes were found in the set of mount points on proc and sysfs that are created specially. This set of changes also starts enforcing the mount flags of fresh mounts of proc and sysfs are consistent with the existing mount of proc and sysfs. I expected this to be the boring part of the work but unfortunately unprivileged userspace winds up mounting fresh copies of proc and sysfs with noexec and nosuid clear when root set those flags on the previous mount of proc and sysfs. So for now only the atime, read-only and nodev attributes which userspace happens to keep consistent are enforced. Dealing with the noexec and nosuid attributes remains for another time. This set of changes also addresses an issue with how open file descriptors from /proc/<pid>/ns/* are displayed. Recently readlink of /proc/<pid>/fd has been triggering a WARN_ON that has not been meaningful since it was added (as all of the code in the kernel was converted) and is not now actively wrong. There is also a short list of issues that have not been fixed yet that I will mention briefly. It is possible to rename a directory from below to above a bind mount. At which point any directory pointers below the renamed directory can be walked up to the root directory of the filesystem. With user namespaces enabled a bind mount of the bind mount can be created allowing the user to pick a directory whose children they can rename to outside of the bind mount. This is challenging to fix and doubly so because all obvious solutions must touch code that is in the performance part of pathname resolution. As mentioned above there is also a question of how to ensure that developers by accident or with purpose do not introduce exectuable files on sysfs and proc and in doing so introduce security regressions in the current userspace that will not be immediately obvious and as such are likely to require breaking userspace in painful ways once they are recognized" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: vfs: Remove incorrect debugging WARN in prepend_path mnt: Update fs_fully_visible to test for permanently empty directories sysfs: Create mountpoints with sysfs_create_mount_point sysfs: Add support for permanently empty directories to serve as mount points. kernfs: Add support for always empty directories. proc: Allow creating permanently empty directories that serve as mount points sysctl: Allow creating permanently empty directories that serve as mountpoints. fs: Add helper functions for permanently empty directories. vfs: Ignore unlocked mounts in fs_fully_visible mnt: Modify fs_fully_visible to deal with locked ro nodev and atime mnt: Refactor the logic for mounting sysfs and proc in a user namespace	2015-07-03 15:20:57 -07:00
Eric W. Biederman	f9bd6733d3	sysctl: Allow creating permanently empty directories that serve as mountpoints. Add a magic sysctl table sysctl_mount_point that when used to create a directory forces that directory to be permanently empty. Update the code to use make_empty_dir_inode when accessing permanently empty directories. Update the code to not allow adding to permanently empty directories. Update /proc/sys/fs/binfmt_misc to be a permanently empty directory. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2015-07-01 10:36:39 -05:00
Chris Metcalf	fe4ba3c343	watchdog: add watchdog_cpumask sysctl to assist nohz Change the default behavior of watchdog so it only runs on the housekeeping cores when nohz_full is enabled at build and boot time. Allow modifying the set of cores the watchdog is currently running on with a new kernel.watchdog_cpumask sysctl. In the current system, the watchdog subsystem runs a periodic timer that schedules the watchdog kthread to run. However, nohz_full cores are designed to allow userspace application code running on those cores to have 100% access to the CPU. So the watchdog system prevents the nohz_full application code from being able to run the way it wants to, thus the motivation to suppress the watchdog on nohz_full cores, which this patchset provides by default. However, if we disable the watchdog globally, then the housekeeping cores can't benefit from the watchdog functionality. So we allow disabling it only on some cores. See Documentation/lockup-watchdogs.txt for more information. [jhubbard@nvidia.com: fix a watchdog crash in some configurations] Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-06-24 17:49:40 -07:00
Thomas Gleixner	bc7a34b8b9	timer: Reduce timer migration overhead if disabled Eric reported that the timer_migration sysctl is not really nice performance wise as it needs to check at every timer insertion whether the feature is enabled or not. Further the check does not live in the timer code, so we have an extra function call which checks an extra cache line to figure out that it is disabled. We can do better and store that information in the per cpu (hr)timer bases. I pondered to use a static key, but that's a nightmare to update from the nohz code and the timer base cache line is hot anyway when we select a timer base. The old logic enabled the timer migration unconditionally if CONFIG_NO_HZ was set even if nohz was disabled on the kernel command line. With this modification, we start off with migration disabled. The user visible sysctl is still set to enabled. If the kernel switches to NOHZ migration is enabled, if the user did not disable it via the sysctl prior to the switch. If nohz=off is on the kernel command line, migration stays disabled no matter what. Before: 47.76% hog [.] main 14.84% [kernel] [k] _raw_spin_lock_irqsave 9.55% [kernel] [k] _raw_spin_unlock_irqrestore 6.71% [kernel] [k] mod_timer 6.24% [kernel] [k] lock_timer_base.isra.38 3.76% [kernel] [k] detach_if_pending 3.71% [kernel] [k] del_timer 2.50% [kernel] [k] internal_add_timer 1.51% [kernel] [k] get_nohz_timer_target 1.28% [kernel] [k] __internal_add_timer 0.78% [kernel] [k] timerfn 0.48% [kernel] [k] wake_up_nohz_cpu After: 48.10% hog [.] main 15.25% [kernel] [k] _raw_spin_lock_irqsave 9.76% [kernel] [k] _raw_spin_unlock_irqrestore 6.50% [kernel] [k] mod_timer 6.44% [kernel] [k] lock_timer_base.isra.38 3.87% [kernel] [k] detach_if_pending 3.80% [kernel] [k] del_timer 2.67% [kernel] [k] internal_add_timer 1.33% [kernel] [k] __internal_add_timer 0.73% [kernel] [k] timerfn 0.54% [kernel] [k] wake_up_nohz_cpu Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Joonwoo Park <joonwoop@codeaurora.org> Cc: Wenbo Wang <wenbo.wang@memblaze.com> Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-06-19 15:18:28 +02:00
Heinrich Schuchardt	230633d109	kernel/sysctl.c: detect overflows when converting to int When converting unsigned long to int overflows may occur. These currently are not detected when writing to the sysctl file system. E.g. on a system where int has 32 bits and long has 64 bits echo 0x800001234 > /proc/sys/kernel/threads-max has the same effect as echo 0x1234 > /proc/sys/kernel/threads-max The patch adds the missing check in do_proc_dointvec_conv. With the patch an overflow will result in an error EINVAL when writing to the the sysctl file system. Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-17 09:04:08 -04:00
Heinrich Schuchardt	16db3d3f11	kernel/sysctl.c: threads-max observe limits Users can change the maximum number of threads by writing to /proc/sys/kernel/threads-max. With the patch the value entered is checked against the same limits that apply when fork_init is called. Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-17 09:04:07 -04:00
Eric B Munson	5bbe3547aa	mm: allow compaction of unevictable pages Currently, pages which are marked as unevictable are protected from compaction, but not from other types of migration. The POSIX real time extension explicitly states that mlock() will prevent a major page fault, but the spirit of this is that mlock() should give a process the ability to control sources of latency, including minor page faults. However, the mlock manpage only explicitly says that a locked page will not be written to swap and this can cause some confusion. The compaction code today does not give a developer who wants to avoid swap but wants to have large contiguous areas available any method to achieve this state. This patch introduces a sysctl for controlling compaction behavior with respect to the unevictable lru. Users who demand no page faults after a page is present can set compact_unevictable_allowed to 0 and users who need the large contiguous areas can enable compaction on locked memory by leaving the default value of 1. To illustrate this problem I wrote a quick test program that mmaps a large number of 1MB files filled with random data. These maps are created locked and read only. Then every other mmap is unmapped and I attempt to allocate huge pages to the static huge page pool. When the compact_unevictable_allowed sysctl is 0, I cannot allocate hugepages after fragmenting memory. When the value is set to 1, allocations succeed. Signed-off-by: Eric B Munson <emunson@akamai.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Lameter <cl@linux.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-15 16:35:17 -07:00
Linus Torvalds	1dcf58d6e6	Merge branch 'akpm' (patches from Andrew) Merge first patchbomb from Andrew Morton: - arch/sh updates - ocfs2 updates - kernel/watchdog feature - about half of mm/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits) Documentation: update arch list in the 'memtest' entry Kconfig: memtest: update number of test patterns up to 17 arm: add support for memtest arm64: add support for memtest memtest: use phys_addr_t for physical addresses mm: move memtest under mm mm, hugetlb: abort __get_user_pages if current has been oom killed mm, mempool: do not allow atomic resizing memcg: print cgroup information when system panics due to panic_on_oom mm: numa: remove migrate_ratelimited mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE mm: split ET_DYN ASLR from mmap ASLR s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE mm: expose arch_mmap_rnd when available s390: standardize mmap_rnd() usage powerpc: standardize mmap_rnd() usage mips: extract logic for mmap_rnd() arm64: standardize mmap_rnd() usage x86: standardize mmap_rnd() usage arm: factor out mmap ASLR into mmap_rnd ...	2015-04-14 16:49:17 -07:00
Ulrich Obergfell	195daf665a	watchdog: enable the new user interface of the watchdog mechanism With the current user interface of the watchdog mechanism it is only possible to disable or enable both lockup detectors at the same time. This series introduces new kernel parameters and changes the semantics of some existing kernel parameters, so that the hard lockup detector and the soft lockup detector can be disabled or enabled individually. With this series applied, the user interface is as follows. - parameters in /proc/sys/kernel . soft_watchdog This is a new parameter to control and examine the run state of the soft lockup detector. . nmi_watchdog The semantics of this parameter have changed. It can now be used to control and examine the run state of the hard lockup detector. . watchdog This parameter is still available to control the run state of both lockup detectors at the same time. If this parameter is examined, it shows the logical OR of soft_watchdog and nmi_watchdog. . watchdog_thresh The semantics of this parameter are not affected by the patch. - kernel command line parameters . nosoftlockup The semantics of this parameter have changed. It can now be used to disable the soft lockup detector at boot time. . nmi_watchdog=0 or nmi_watchdog=1 Disable or enable the hard lockup detector at boot time. The patch introduces '=1' as a new option. . nowatchdog The semantics of this parameter are not affected by the patch. It is still available to disable both lockup detectors at boot time. Also, remove the proc_dowatchdog() function which is no longer needed. [dzickus@redhat.com: wrote changelog] [dzickus@redhat.com: update documentation for kernel params and sysctl] Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-14 16:48:59 -07:00
Linus Torvalds	ca2ec32658	Merge branch 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: "Part one: - struct filename-related cleanups - saner iov_iter_init() replacements (and switching the syscalls to use of those) - ntfs switch to ->write_iter() (Anton) - aio cleanups and splitting iocb into common and async parts (Christoph) - assorted fixes (me, bfields, Andrew Elble) There's a lot more, including the completion of switchover to ->{read,write}_iter(), d_inode/d_backing_inode annotations, f_flags race fixes, etc, but that goes after #for-davem merge. David has pulled it, and once it's in I'll send the next vfs pull request" * 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (35 commits) sg_start_req(): use import_iovec() sg_start_req(): make sure that there's not too many elements in iovec blk_rq_map_user(): use import_single_range() sg_io(): use import_iovec() process_vm_access: switch to {compat_,}import_iovec() switch keyctl_instantiate_key_common() to iov_iter switch {compat_,}do_readv_writev() to {compat_,}import_iovec() aio_setup_vectored_rw(): switch to {compat_,}import_iovec() vmsplice_to_user(): switch to import_iovec() kill aio_setup_single_vector() aio: simplify arguments of aio_setup_..._rw() aio: lift iov_iter_init() into aio_setup_..._rw() lift iov_iter into {compat_,}do_readv_writev() NFS: fix BUG() crash in notify_change() with patch to chown_common() dcache: return -ESTALE not -EBUSY on distributed fs race NTFS: Version 2.1.32 - Update file write from aio_write to write_iter. VFS: Add iov_iter_fault_in_multipages_readable() drop bogus check in file_open_root() switch security_inode_getattr() to struct path * constify tomoyo_realpath_from_path() ...	2015-04-14 15:31:03 -07:00
Christoph Hellwig	e2e40f2c1e	fs: move struct kiocb to fs.h struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-25 20:28:11 -04:00

1 2 3 4 5 ...