Commit graph

22236 commits

Author SHA1 Message Date
Thomas Gleixner
b27d9147f2 locking/rtmutex: Prevent dequeue vs. unlock race
commit dbb26055defd03d59f678cb5f2c992abe05b064a upstream.

David reported a futex/rtmutex state corruption. It's caused by the
following problem:

CPU0		CPU1		CPU2

l->owner=T1
		rt_mutex_lock(l)
		lock(l->wait_lock)
		l->owner = T1 | HAS_WAITERS;
		enqueue(T2)
		boost()
		  unlock(l->wait_lock)
		schedule()

				rt_mutex_lock(l)
				lock(l->wait_lock)
				l->owner = T1 | HAS_WAITERS;
				enqueue(T3)
				boost()
				  unlock(l->wait_lock)
				schedule()
		signal(->T2)	signal(->T3)
		lock(l->wait_lock)
		dequeue(T2)
		deboost()
		  unlock(l->wait_lock)
				lock(l->wait_lock)
				dequeue(T3)
				  ===> wait list is now empty
				deboost()
				 unlock(l->wait_lock)
		lock(l->wait_lock)
		fixup_rt_mutex_waiters()
		  if (wait_list_empty(l)) {
		    owner = l->owner & ~HAS_WAITERS;
		    l->owner = owner
		     ==> l->owner = T1
		  }

				lock(l->wait_lock)
rt_mutex_unlock(l)		fixup_rt_mutex_waiters()
				  if (wait_list_empty(l)) {
				    owner = l->owner & ~HAS_WAITERS;
cmpxchg(l->owner, T1, NULL)
 ===> Success (l->owner = NULL)
				    l->owner = owner
				     ==> l->owner = T1
				  }

That means the problem is caused by fixup_rt_mutex_waiters() which does the
RMW to clear the waiters bit unconditionally when there are no waiters in
the rtmutexes rbtree.

This can be fatal: A concurrent unlock can release the rtmutex in the
fastpath because the waiters bit is not set. If the cmpxchg() gets in the
middle of the RMW operation then the previous owner, which just unlocked
the rtmutex is set as the owner again when the write takes place after the
successfull cmpxchg().

The solution is rather trivial: verify that the owner member of the rtmutex
has the waiters bit set before clearing it. This does not require a
cmpxchg() or other atomic operations because the waiters bit can only be
set and cleared with the rtmutex wait_lock held. It's also safe against the
fast path unlock attempt. The unlock attempt via cmpxchg() will either see
the bit set and take the slowpath or see the bit cleared and release it
atomically in the fastpath.

It's remarkable that the test program provided by David triggers on ARM64
and MIPS64 really quick, but it refuses to reproduce on x86-64, while the
problem exists there as well. That refusal might explain that this got not
discovered earlier despite the bug existing from day one of the rtmutex
implementation more than 10 years ago.

Thanks to David for meticulously instrumenting the code and providing the
information which allowed to decode this subtle problem.

Reported-by: David Daney <ddaney@caviumnetworks.com>
Tested-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 23f78d4a03 ("[PATCH] pi-futex: rt mutex core")
Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-15 08:49:22 -08:00
Dmitry Shmidt
b82fdb62ee Merge remote-tracking branch 'common/android-4.4' into android-4.4.y 2016-12-12 10:15:36 -08:00
Dmitry Shmidt
97b84c9650 This is the 4.4.37 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlhI+pgACgkQONu9yGCS
 aT4cfg/9EvPNVXZaO12eBDOXExA9SJKysG7kxVq2Ks6Z4hpKOkbvTv3enbUM7P37
 eN+G2v5zythE7MRR4MfF1pFcUDDLury3yAdZSfghRNpU85LwKLJmPYh74WVl9SSj
 wNoAG27gbAidVg5aYNVa3vkt+kmNkM8zNJ79Sk7mfqyGbC25GBkqQ/e2oFnGtZfN
 KbQ/J0G9hvKhFc8q6xY/Oz/PZbuf1kB7W2YuuQq1eLvNf99PErh1UAthL7C0HKDZ
 p6BMCIoE/Q3X4iXhGIpr0NJLr7/ke+fwjl3+ACwfj/xJ5vew/mj+tT8xxZvUGtaA
 jtP432UtRkoofpsgZ6pFUhRcWxErsd5ikAM7xfrzsO2vPE+XGnMODms2A0G7J8kC
 rXF4QtUR9i92HTroTG32EhIGDH1M0ZqeM5Hy+8iMsMidIuemHIORZdva/ZPSR7Lk
 7rtD2Ye2r54oDib97d8MdMIlsQxS2gUF8Yh2NxquK6UCbBdXYzgMo0XlM6sMYrur
 Nt1u9IYa4hVnSbFPh4pmm7MBwNkkyUKrI/Jc/4tr0f3yGOS/xKq3thTxsWUdxoRi
 Gs7nkJx9vPZdB0gmd4wA4Lvjk4VCtyXQI3apMsSvZpG0IOnjKQRJ+6ExN7ZC1Tgf
 WeWRfMS6SfoRLnbPFumGpjp5OJloCxvMcRQtvfiTy3eMSR++Y0c=
 =rmG9
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.37' into android-4.4.y

This is the 4.4.37 stable release

Change-Id: Ic6753a5a223abc02c4fe5205642d4f904de2e5b8
2016-12-08 15:08:27 -08:00
Juri Lelli
e487a24793 sched/walt: kill {min,max}_capacity
{min,max}_capacity are static variables that are only updated from
__update_min_max_capacity(), but not used anywhere else.

Remove them together with the function updating them. This has also
the nice side effect of fixing a LOCKDEP warning related to locking
all CPUs in update_min_max_capacity(), as reported by Ke Wang:

[    2.853595] c0 =============================================
[    2.859219] c0 [ INFO: possible recursive locking detected ]
[    2.864852] c0 4.4.6+ #5 Tainted: G        W
[    2.869604] c0 ---------------------------------------------
[    2.875230] c0 swapper/0/1 is trying to acquire lock:
[    2.880248]  (&rq->lock){-.-.-.}, at: [<ffffff80081241cc>] cpufreq_notifier_policy+0x2e8/0x37c
[    2.888815] c0
[    2.888815] c0 but task is already holding lock:
[    2.895132]  (&rq->lock){-.-.-.}, at: [<ffffff80081241cc>] cpufreq_notifier_policy+0x2e8/0x37c
[    2.903700] c0
[    2.903700] c0 other info that might help us debug this:
[    2.910710] c0  Possible unsafe locking scenario:
[    2.910710] c0
[    2.917112] c0        CPU0
[    2.919795] c0        ----
[    2.922478]   lock(&rq->lock);
[    2.925507]   lock(&rq->lock);
[    2.928536] c0
[    2.928536] c0  *** DEADLOCK ***
[    2.928536] c0
[    2.935200] c0  May be due to missing lock nesting notation
[    2.935200] c0
[    2.942471] c0 7 locks held by swapper/0/1:
[    2.946623]  #0:  (&dev->mutex){......}, at: [<ffffff800850e118>] __driver_attach+0x64/0xb8
[    2.954931]  #1:  (&dev->mutex){......}, at: [<ffffff800850e128>] __driver_attach+0x74/0xb8
[    2.963239]  #2:  (cpu_hotplug.lock){++++++}, at: [<ffffff80080cb218>] get_online_cpus+0x48/0xa8
[    2.971979]  #3:  (subsys mutex#6){+.+.+.}, at: [<ffffff800850bed4>] subsys_interface_register+0x44/0xc0
[    2.981411]  #4:  (&policy->rwsem){+.+.+.}, at: [<ffffff8008720338>] cpufreq_online+0x330/0x76c
[    2.990065]  #5:  ((cpufreq_policy_notifier_list).rwsem){.+.+..}, at: [<ffffff80080f3418>] blocking_notifier_call_chain+0x38/0xc4
[    3.001661]  #6:  (&rq->lock){-.-.-.}, at: [<ffffff80081241cc>] cpufreq_notifier_policy+0x2e8/0x37c
[    3.010661] c0
[    3.010661] c0 stack backtrace:
[    3.015514] c0 CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W 4.4.6+ #5
[    3.022864] c0 Hardware name: Spreadtrum SP9860g Board (DT)
[    3.028402] c0 Call trace:
[    3.031092] c0 [<ffffff800808b50c>] dump_backtrace+0x0/0x210
[    3.036716] c0 [<ffffff800808b73c>] show_stack+0x20/0x28
[    3.041994] c0 [<ffffff8008433310>] dump_stack+0xa8/0xe0
[    3.047273] c0 [<ffffff80081349e0>] __lock_acquire+0x1e0c/0x2218
[    3.053243] c0 [<ffffff80081353c0>] lock_acquire+0xe0/0x280
[    3.058784] c0 [<ffffff8008abfdfc>] _raw_spin_lock+0x44/0x58
[    3.064407] c0 [<ffffff80081241cc>] cpufreq_notifier_policy+0x2e8/0x37c
[    3.070983] c0 [<ffffff80080f3458>] blocking_notifier_call_chain+0x78/0xc4
[    3.077820] c0 [<ffffff8008720294>] cpufreq_online+0x28c/0x76c
[    3.083618] c0 [<ffffff80087208a4>] cpufreq_add_dev+0x98/0xdc
[    3.089331] c0 [<ffffff800850bf14>] subsys_interface_register+0x84/0xc0
[    3.095907] c0 [<ffffff800871fa0c>] cpufreq_register_driver+0x168/0x28c
[    3.102486] c0 [<ffffff80087272f8>] sprd_cpufreq_probe+0x134/0x19c
[    3.108629] c0 [<ffffff8008510768>] platform_drv_probe+0x58/0xd0
[    3.114599] c0 [<ffffff800850de2c>] driver_probe_device+0x1e8/0x470
[    3.120830] c0 [<ffffff800850e168>] __driver_attach+0xb4/0xb8
[    3.126541] c0 [<ffffff800850b750>] bus_for_each_dev+0x6c/0xac
[    3.132339] c0 [<ffffff800850d6c0>] driver_attach+0x2c/0x34
[    3.137877] c0 [<ffffff800850d234>] bus_add_driver+0x210/0x298
[    3.143676] c0 [<ffffff800850f1f4>] driver_register+0x7c/0x114
[    3.149476] c0 [<ffffff8008510654>] __platform_driver_register+0x60/0x6c
[    3.156139] c0 [<ffffff8008f49f40>] sprd_cpufreq_platdrv_init+0x18/0x20
[    3.162714] c0 [<ffffff8008082a64>] do_one_initcall+0xd0/0x1d8
[    3.168514] c0 [<ffffff8008f0bc58>] kernel_init_freeable+0x1fc/0x29c
[    3.174834] c0 [<ffffff8008ab554c>] kernel_init+0x20/0x12c
[    3.180281] c0 [<ffffff8008086290>] ret_from_fork+0x10/0x40

Reported-by: Ke Wang <ke.wang@spreadtrum.com>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
2016-12-08 16:10:13 +00:00
Ding Tianhong
dfb704f96c rcu: Fix soft lockup for rcu_nocb_kthread
commit bedc1969150d480c462cdac320fa944b694a7162 upstream.

Carrying out the following steps results in a softlockup in the
RCU callback-offload (rcuo) kthreads:

1. Connect to ixgbevf, and set the speed to 10Gb/s.
2. Use ifconfig to bring the nic up and down repeatedly.

[  317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[  368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
[  368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  368.106005] task: ffff88057dd8a220 ti: ffff88057dd9c000 task.ti: ffff88057dd9c000
[  368.106005] RIP: 0010:[<ffffffff81579e04>]  [<ffffffff81579e04>] fib_table_lookup+0x14/0x390
[  368.106005] RSP: 0018:ffff88061fc83ce8  EFLAGS: 00000286
[  368.106005] RAX: 0000000000000001 RBX: 00000000020155c0 RCX: 0000000000000001
[  368.106005] RDX: ffff88061fc83d50 RSI: ffff88061fc83d70 RDI: ffff880036d11a00
[  368.106005] RBP: ffff88061fc83d08 R08: 0000000000000001 R09: 0000000000000000
[  368.106005] R10: ffff880036d11a00 R11: ffffffff819e0900 R12: ffff88061fc83c58
[  368.106005] R13: ffffffff816154dd R14: ffff88061fc83d08 R15: 00000000020155c0
[  368.106005] FS:  0000000000000000(0000) GS:ffff88061fc80000(0000) knlGS:0000000000000000
[  368.106005] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  368.106005] CR2: 00007f8c2aee9c40 CR3: 000000057b222000 CR4: 00000000000407e0
[  368.106005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  368.106005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  368.106005] Stack:
[  368.106005]  00000000010000c0 ffff88057b766000 ffff8802e380b000 ffff88057af03e00
[  368.106005]  ffff88061fc83dc0 ffffffff815349a6 ffff88061fc83d40 ffffffff814ee146
[  368.106005]  ffff8802e380af00 00000000e380af00 ffffffff819e0900 020155c0010000c0
[  368.106005] Call Trace:
[  368.106005]  <IRQ>
[  368.106005]
[  368.106005]  [<ffffffff815349a6>] ip_route_input_noref+0x516/0xbd0
[  368.106005]  [<ffffffff814ee146>] ? skb_release_data+0xd6/0x110
[  368.106005]  [<ffffffff814ee20a>] ? kfree_skb+0x3a/0xa0
[  368.106005]  [<ffffffff8153698f>] ip_rcv_finish+0x29f/0x350
[  368.106005]  [<ffffffff81537034>] ip_rcv+0x234/0x380
[  368.106005]  [<ffffffff814fd656>] __netif_receive_skb_core+0x676/0x870
[  368.106005]  [<ffffffff814fd868>] __netif_receive_skb+0x18/0x60
[  368.106005]  [<ffffffff814fe4de>] process_backlog+0xae/0x180
[  368.106005]  [<ffffffff814fdcb2>] net_rx_action+0x152/0x240
[  368.106005]  [<ffffffff81077b3f>] __do_softirq+0xef/0x280
[  368.106005]  [<ffffffff8161619c>] call_softirq+0x1c/0x30
[  368.106005]  <EOI>
[  368.106005]
[  368.106005]  [<ffffffff81015d95>] do_softirq+0x65/0xa0
[  368.106005]  [<ffffffff81077174>] local_bh_enable+0x94/0xa0
[  368.106005]  [<ffffffff81114922>] rcu_nocb_kthread+0x232/0x370
[  368.106005]  [<ffffffff81098250>] ? wake_up_bit+0x30/0x30
[  368.106005]  [<ffffffff811146f0>] ? rcu_start_gp+0x40/0x40
[  368.106005]  [<ffffffff8109728f>] kthread+0xcf/0xe0
[  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140
[  368.106005]  [<ffffffff816147d8>] ret_from_fork+0x58/0x90
[  368.106005]  [<ffffffff810971c0>] ? kthread_create_on_node+0x140/0x140

==================================cut here==============================

It turns out that the rcuos callback-offload kthread is busy processing
a very large quantity of RCU callbacks, and it is not reliquishing the
CPU while doing so.  This commit therefore adds an cond_resched_rcu_qs()
within the loop to allow other tasks to run.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[ paulmck: Substituted cond_resched_rcu_qs for cond_resched. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Dhaval Giani <dhaval.giani@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08 07:15:24 +01:00
Dmitry Shmidt
fa40ddf73c Merge remote-tracking branch 'common/android-4.4' into android-4.4.y 2016-12-01 13:57:58 -08:00
Dmitry Shmidt
f225dbdca9 This is the 4.4.35 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYOU3vAAoJEDjbvchgkmk+rVkQAJ/SzsPfZL3j+dFrI/H+NO5b
 mPkdSqjvJ0ww+JJ0Q+YoZ4ibyv6l33vhjg7dVkf6Cr0CTSzRXkmDfZy9nOB5JwUN
 Sn/6e7cib5+gs3rFF7Jdy+VoHwrtvvcfKX5ZW1pB9qJDe+24l/B52ur2ILQHReGl
 1sD2JmPsuW1Mu2p1An5HIi15bw1ZO7t9nPwqpyIVEyGFhkuXO1SYdmcXbx3Ugw17
 5O6MBhe8Wzkdn8j4guA5FgZMjdjn74vRQyuWpmhHAtvcJbHjROi2od/mD8S/RaFK
 7XCs6qI+d62KgiAZz/u55Ev13pNDbABmVEJj7zzHEUg8Mep7m7YdCf2z5wp7aYWK
 LjxWxdvv0Geu0cj7POH8VKPnF2AtzA3B74KKj9nq0xovu6T8MThYu/+heBffx+N9
 SmH5iNufDwFP275xXGMpkWux8tSO519Fi9TjB8cKH3zsHUCQRvPRhO4OEAHXJ880
 B4YAV1nk7AqkW3PeyHoDstQH3OTV5S1esUORyYWPEF6bcskd0OlQAAY+kmtItcDd
 xRdpSglGTZ/NIX6c5pv3skdaoZawNv/fYHkRqM6v8ckMwWLFqCc2hDi3+xiIQkxg
 qhmT7MHbAbhBIYjadf3rNCF++GRN1xyGmyLVRWg89odlIoZEH7EKhtiqDuSHC336
 Cr32DpZZCwZPATlJN+ES
 =Qm+E
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.35' into android-4.4.y

This is the 4.4.35 stable release
2016-12-01 13:56:55 -08:00
Ke Wang
8e53f7c023 sched: tune: Fix lacking spinlock initialization
The spinlock used by boost_groups in sched tune must be initialized.
This commit fixes this lack and the following errors:

[    0.384739] c2 BUG: spinlock bad magic on CPU#2, swapper/2/0
[    0.390313] c2  lock: 0xffffffc15fe1fc80, .magic:00000000, .owner: <none>/-1, .owner_cpu: 0
[    0.398739] c2 CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.4.6+ #4
[    0.404816] c2 Hardware name: Spreadtrum SP9860gBoard (DT)
[    0.410462] c2 Call trace:
[    0.413159] c2 [<ffffff800808b50c>] dump_backtrace+0x0/0x210
[    0.418803] c2 [<ffffff800808b73c>] show_stack+0x20/0x28
[    0.424100] c2 [<ffffff8008433310>] dump_stack+0xa8/0xe0
[    0.429398] c2 [<ffffff8008139398>] spin_dump+0x78/0x9c
[    0.434608] c2 [<ffffff80081393ec>] spin_bug+0x30/0x3c
[    0.439644] c2 [<ffffff80081394e4>] do_raw_spin_lock+0xac/0x1b4
[    0.445639] c2 [<ffffff8008abffe4>] _raw_spin_lock_irqsave+0x58/0x68
[    0.451977] c2 [<ffffff800812a560>] schedtune_enqueue_task+0x84/0x3bc
[    0.458320] c2 [<ffffff8008111678>] enqueue_task_fair+0x438/0x208c
[    0.464487] c2 [<ffffff80080feeec>] activate_task+0x70/0xd0
[    0.470130] c2 [<ffffff80080ff4a4>] ttwu_do_activate.constprop.131+0x4c/0x98
[    0.477079] c2 [<ffffff80081005d0>] try_to_wake_up+0x254/0x54c
[    0.482899] c2 [<ffffff80081009d4>] default_wake_function+0x30/0x3c
[    0.489154] c2 [<ffffff8008122464>] autoremove_wake_function+0x3c/0x6c
[    0.495754] c2 [<ffffff8008121b70>] __wake_up_common+0x64/0xa4
[    0.501574] c2 [<ffffff8008121e9c>] __wake_up+0x48/0x60
[    0.506788] c2 [<ffffff8008150fac>] rcu_gp_kthread_wake+0x50/0x5c
[    0.512866] c2 [<ffffff8008151fec>] note_gp_changes+0xac/0xd4
[    0.518597] c2 [<ffffff8008153044>] rcu_process_callbacks+0xe8/0x93c
[    0.524940] c2 [<ffffff80080d0b84>] __do_softirq+0x24c/0x5b8
[    0.530584] c2 [<ffffff80080d1284>] irq_exit+0xc0/0xec
[    0.535623] c2 [<ffffff8008144208>] __handle_domain_irq+0x94/0xf8
[    0.541789] c2 [<ffffff8008082554>] gic_handle_irq+0x64/0xc0

Signed-off-by: Ke Wang <ke.wang@spreadtrum.com>
2016-11-29 05:10:24 +00:00
Joel Fernandes
5fe7d45f44 UPSTREAM: trace: Add an option for boot clock as trace clock
Unlike monotonic clock, boot clock as a trace clock will account for
time spent in suspend useful for tracing suspend/resume. This uses
earlier introduced infrastructure for using the fast boot clock.

Bug: b/33184060

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
2016-11-28 18:46:46 -08:00
Joel Fernandes
af80d7c3e9 UPSTREAM: timekeeping: Add a fast and NMI safe boot clock
This boot clock can be used as a tracing clock and will account for
suspend time.

To keep it NMI safe since we're accessing from tracing, we're not using a
separate timekeeper with updates to monotonic clock and boot offset
protected with seqlocks. This has the following minor side effects:

(1) Its possible that a timestamp be taken after the boot offset is updated
but before the timekeeper is updated. If this happens, the new boot offset
is added to the old timekeeping making the clock appear to update slightly
earlier:
   CPU 0                                        CPU 1
   timekeeping_inject_sleeptime64()
   __timekeeping_inject_sleeptime(tk, delta);
                                                timestamp();
   timekeeping_update(tk, TK_CLEAR_NTP...);

(2) On 32-bit systems, the 64-bit boot offset (tk->offs_boot) may be
partially updated.  Since the tk->offs_boot update is a rare event, this
should be a rare occurrence which postprocessing should be able to handle.

Bug: b/33184060

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-11-28 18:43:30 -08:00
Johan Hovold
469fcbcb84 PM / sleep: fix device reference leak in test_suspend
commit ceb75787bc75d0a7b88519ab8a68067ac690f55a upstream.

Make sure to drop the reference taken by class_find_device() after
opening the RTC device.

Fixes: 77437fd4e6 (pm: boot time suspend selftest)
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-26 09:54:53 +01:00
Dmitry Shmidt
f9c1bf3018 Merge remote-tracking branch 'common/android-4.4' into android-4.4.y 2016-11-21 10:57:49 -08:00
Viresh Kumar
6dc8c51a76 cpufreq: sched: Fix kernel crash on accessing sysfs file
If the cpufreq driver hasn't set the CPUFREQ_HAVE_GOVERNOR_PER_POLICY
flag, then the kernel will crash on accessing sysfs files for the sched
governor.

CPUFreq governors we can have the governor specific sysfs files in two
places:

A. /sys/devices/system/cpu/cpuX/cpufreq/<governor>
B. /sys/devices/system/cpu/cpufreq/<governor>

The case A. is for governor per policy case, where we can control the
governor tunables for each policy separately. The case B. is for system
wide tunable values.

The schedfreq governor only implements the case A. and not B.  The sysfs
files in case B will still be present in
/sys/devices/system/cpu/cpufreq/<governor>, but accessing them will
crash kernel as the governor doesn't support that.

Moreover the sched governor is pretty new and will be used only for the
ARM platforms and there is no need to support the case B at all.

Hence use policy->kobj instead of get_governor_parent_kobj(), so that we
always create the sysfs files in path A.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-11-16 09:46:21 -08:00
Dmitry Shmidt
324e88de4a This is the 4.4.32 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYKq+PAAoJEDjbvchgkmk+W3sQAKHJ6dI10P/sFTe4AlGoRGNr
 ZtCwGwwolBoD/NtXa2HCovc9ofIU4zWYXl5P+kbHtKV/ZB4q5+m7Q5bpWh4TQFUy
 9TKho6aywF9uXpAEV99qKYvAOIq5EgJXdgrhCRTYBBR9+uR3+B1cUJhxpyD6htw4
 H7ABpmihWjij0o9YYAin7y/O+8jeqnuNLPUoCek1Emf0cn7G5keMg8Lli0WCz7jM
 JdKOjbvaYscgvb4BqTKqtg5NneC3GoeNp43Kvz4LbmcPw1yT5N8sHswqlSio4U2U
 Sxyvtj0RxoSoAus2UR62pTGDu1TrSHxWEWpYpqa77hr1/TpBY7put1OldFmUfu1B
 voQUI05Ox74RT9pl5c8DGnXH8Zyiu6a7Fpj6EdWbWxtbIgvWCLaDHniEY1WKR6cj
 Bmil/zjGyDtzANJBasC9NJHF8yd+/vxNfn5n0eAz6Xp94MIdOGPIQle+NATG5osN
 0b/NLit64B2F6Djijkv1vV9V7x1oYqIYVG6f1BoVtRXCjhcx9PnkskXcP+1SKUhH
 xOTXLt6rGNaTj+T2/41VJUtZ6eiZj+0GZMXILu5SIEdKiRiGLfsLHX117OK3ZhYT
 PFzzzWZoC2FOL/ldp/K6ncPZV0oHn3yfQa3T97jGI1LbsYkXXyQkW5PNwqGccbUc
 xvhEAPDvBxDlfcgqWMaw
 =DC+B
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.32' into android-4.4.y

This is the 4.4.32 stable release

Change-Id: I5028402eadfcf055ac44a5e67abc6da75b2068b3
2016-11-15 17:02:38 -08:00
Steven Rostedt (Red Hat)
ab2bff9a8f UPSTREAM: ring-buffer: Prevent overflow of size in ring_buffer_resize()
(Cherry picked from commit 59643d1535eb220668692a5359de22545af579f6)

If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE
then the DIV_ROUND_UP() will return zero.

Here's the details:

  # echo 18014398509481980 > /sys/kernel/debug/tracing/buffer_size_kb

tracing_entries_write() processes this and converts kb to bytes.

 18014398509481980 << 10 = 18446744073709547520

and this is passed to ring_buffer_resize() as unsigned long size.

 size = DIV_ROUND_UP(size, BUF_PAGE_SIZE);

Where DIV_ROUND_UP(a, b) is (a + b - 1)/b

BUF_PAGE_SIZE is 4080 and here

 18446744073709547520 + 4080 - 1 = 18446744073709551599

where 18446744073709551599 is still smaller than 2^64

 2^64 - 18446744073709551599 = 17

But now 18446744073709551599 / 4080 = 4521260802379792

and size = size * 4080 = 18446744073709551360

This is checked to make sure its still greater than 2 * 4080,
which it is.

Then we convert to the number of buffer pages needed.

 nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE)

but this time size is 18446744073709551360 and

 2^64 - (18446744073709551360 + 4080 - 1) = -3823

Thus it overflows and the resulting number is less than 4080, which makes

  3823 / 4080 = 0

an nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.

There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.

Cc: stable@vger.kernel.org # 3.5+
Fixes: 83f40318da ("ring-buffer: Make removal of ring buffer pages atomic")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Change-Id: I1147672317a3ad0fc995b1f32baaa050a7976ac4
Bug: 32659848
2016-11-16 00:16:04 +00:00
Arnd Bergmann
603c78000f cgroup: avoid false positive gcc-6 warning
commit cfe02a8a973e7e5f66926b8ae38dfce404b19e29 upstream.

When all subsystems are disabled, gcc notices that cgroup_subsys_enabled_key
is a zero-length array and that any access to it must be out of bounds:

In file included from ../include/linux/cgroup.h:19:0,
                 from ../kernel/cgroup.c:31:
../kernel/cgroup.c: In function 'cgroup_add_cftypes':
../kernel/cgroup.c:261:53: error: array subscript is above array bounds [-Werror=array-bounds]
  return static_key_enabled(cgroup_subsys_enabled_key[ssid]);
                            ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
../include/linux/jump_label.h:271:40: note: in definition of macro 'static_key_enabled'
  static_key_count((struct static_key *)x) > 0;    \
                                        ^

We should never call the function in this particular case, so this is
not a bug. In order to silence the warning, this adds an explicit check
for the CGROUP_SUBSYS_COUNT==0 case.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10 16:36:36 +01:00
Dmitry Shmidt
5577070b6b This is the 4.4.30 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYF/ZyAAoJEDjbvchgkmk+6RcP+wewepsJZuBDDT0hNVglGDoQ
 GeD5/Ck6sMoRMGLSwM8IxHdC0wBml8fFj0DUX103UK+LK2CF8NSzLOgYCM4g7E9P
 E5Ye+SfSeefl47Is3hbW2YesrLjswEYht8zqytHehc7Uz6UAYwxHvAlpmbRvpI8k
 l+sbTnqovTBoXw3S/tAGpOm3hSeb7Qw1lVKlOtnYAk7e7OotqKbQjbbKKzI8Orbg
 uhs5UV2iw6AQY2mHZssilUiw5iEg/wqw+Ksjxos69thbT5oHrhAfx+grdOAWU9//
 m1YHh6ljZb/FXTrQBbt82R96HhlOKCKCj88dPfsHtIBfVOLWSacv9nc0PC8Xt3Uy
 xDq8DgP7OJRPVTNGuFFL1TuReWFtoYsDcxdQDyivn61XOPXYbKuXI4L5PJaSKQ85
 w4cbD0iOsk6MyfZR2wzj/yESzcJQz40A0sAOxGjak0ZgnGL6LrkKgcZhEBRMR6cs
 zl4TF6R2bCbSDSpnSxHVmCKU+rzENoXql6IE/9ExMNMbo1t0NDrlB+Okojj+n2oR
 wZOHxEPxAzeZVNx4pBiuVlRarSZ54TS0miLROnCRUfbwn4chDMOEbB+i5NwFzey2
 2yfqV5cNUAy0bzbz/Zw5DcF+FNkYadgPdDHZLGu0w6HJ8//V1gZKg/6Ca3Cc/iK0
 WNtHNZUOpCFLPesosNpt
 =+QqD
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.30' into android-4.4.y

This is the 4.4.30 stable release
2016-11-07 10:23:59 -08:00
Sebastian Frias
f2c4508a35 genirq/generic_chip: Add irq_unmap callback
commit ee26c013cdee0b947e29d6cadfb9ff3341c69ff9 upstream.

Without this patch irq_domain_disassociate() cannot properly release the
interrupt. In fact, irq_map_generic_chip() checks a bit on 'gc->installed'
but said bit is never cleared, only set.

Commit 088f40b7b0 ("genirq: Generic chip: Add linear irq domain support")
added irq_map_generic_chip() function and also stated "This lacks a removal
function for now".

This commit provides an implementation of an unmap function that can be
called by irq_domain_disassociate().

[ tglx: Made the function static and removed the export as we have neither
  	a prototype nor a modular user. ]

Fixes: 088f40b7b0 ("genirq: Generic chip: Add linear irq domain support")
Signed-off-by: Sebastian Frias <sf84@laposte.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mason <slash.tmp@free.fr>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/579F5C5A.2070507@laposte.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31 04:13:59 -06:00
Dmitry Shmidt
a979feb9e9 Merge remote-tracking branch 'common/android-4.4' into android-4.4.y 2016-10-27 10:17:22 -07:00
John Stultz
64f0245f31 [RFC]cgroup: Change from CAP_SYS_NICE to CAP_SYS_RESOURCE for cgroup migration permissions
Try to better match what we're pushing upstream, use CAP_SYS_RESOURCE
instead of CAP_SYS_NICE, which shoudln't affect Android as Zygote and
system_server already use CAP_SYS_RESOURCE.

Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-10-24 12:46:15 -07:00
Lianwei Wang
a3c8dc25c4 UPSTREAM: cpu/hotplug: Handle unbalanced hotplug enable/disable
(cherry picked from commit 01b41159066531cc8d664362ff0cd89dd137bbfa)

When cpu_hotplug_enable() is called unbalanced w/o a preceeding
cpu_hotplug_disable() the code emits a warning, but happily decrements the
disabled counter. This causes the next operations to malfunction.

Prevent the decrement and just emit a warning.

Signed-off-by: Lianwei Wang <lianwei.wang@gmail.com>
Cc: peterz@infradead.org
Cc: linux-pm@vger.kernel.org
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/1465541008-12476-1-git-send-email-lianwei.wang@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-22 23:54:57 -07:00
Dmitry Shmidt
59fc70469a Merge remote-tracking branch 'common/android-4.4' into android-4.4.y
Change-Id: I8c5ec371d8b612f6880b2428893bec89d7da71f6
2016-10-21 13:50:06 -07:00
Dmitry Shmidt
b8fa4a3ee5 This is the 4.4.26 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYCHnGAAoJEDjbvchgkmk+/tAP/RvQZ+Ft5ItRdlAxrXYgTFda
 PXpyUiCPC8XnrzBClXRoA+EiERIpckC5THdU3uUok4BWuALsgrhxK8YxfHXWLQWe
 2cqn19QPiFMZFz6E1sktWtwk63FH4uyfSXQ7/Roun5oQR4Ltg+Q6wRWlf33ox03t
 v8WVv880M4Q2pbp08USCDD8/EcWj3BNK2qFb05RD6pyW9jgIToSXEFoxtKsRiwDv
 QvUTXRkQOfo6Ui9zQTPzQR/ltahb6W/78fEiqu0G1p+uLkTqwusV33t4T64F+829
 s7MTMb7vnQWwuWKEcWy+CMjMwwUfrZHi6gl0PcBxHatFW/AOADixag0PYZWEku9/
 ok9Ltx63ZaSXvGy9odzIgjB/VIjdQXfPwRGOnNojJIvRDkb/zP3fCCglH9kXIwG2
 W5tXCOf43i1a6G3DyEkNEfCk66WYhuwyulGRRpo5RjsiPXnLiKbX9rqVh+WA3ucZ
 jtnZRFKGu2AsXes4tp/aip46zu73BL3wkKhI08z5ipUoV1PDEUb0fxhw0ZkEI/J7
 +tKotIQXdc190qtQNdkBwYAxujlzwhVYJAb9dwgYpWweVL5cEGIp5+sV5isJ5u4M
 0eD9pLVKsgG/BocnNve/h2heRNtTCq34vQrGEnh437NCrZbMGiRhwnI8HRCstqhr
 ECbkLL5cDnwtRiieBYSj
 =RoKA
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.26' into android-4.4.y

This is the 4.4.26 stable release
2016-10-21 13:40:45 -07:00
Guenter Roeck
7bb5218b77 cgroup: Remove leftover instances of allow_attach
Fix:

kernel/sched/tune.c:718:2: error:
	unknown field ‘allow_attach’ specified in initializer
kernel/cpuset.c:2087:2: error:
	unknown field 'allow_attach' specified in initializer

Change-Id: Ie524350ffc6158f3182d90095cca502e58b6f197
Fixes: e78f134a78 ("CHROMIUM: remove Android's cgroup generic permissions checks")
Signed-off-by: Guenter Roeck <groeck@chromium.org>
2016-10-18 12:35:03 -07:00
Dmitry Torokhov
140cab8310 CHROMIUM: cgroups: relax permissions on moving tasks between cgroups
Android expects system_server to be able to move tasks between different
cgroups/cpusets, but does not want to be running as root. Let's relax
permission check so that processes can move other tasks if they have
CAP_SYS_NICE in the affected task's user namespace.

BUG=b:31790445,chromium:647994
TEST=Boot android container, examine logcat

Change-Id: Ia919c66ab6ed6a6daf7c4cf67feb38b13b1ad09b
Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/394927
Reviewed-by: Ricky Zhou <rickyz@chromium.org>
2016-10-17 22:55:20 -07:00
Dmitry Torokhov
e78f134a78 CHROMIUM: remove Android's cgroup generic permissions checks
The implementation is utterly broken, resulting in all processes being
allows to move tasks between sets (as long as they have access to the
"tasks" attribute), and upstream is heading towards checking only
capability anyway, so let's get rid of this code.

BUG=b:31790445,chromium:647994
TEST=Boot android container, examine logcat

Change-Id: I2f780a5992c34e52a8f2d0b3557fc9d490da2779
Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/394967
Reviewed-by: Ricky Zhou <rickyz@chromium.org>
Reviewed-by: John Stultz <john.stultz@linaro.org>
2016-10-17 22:55:19 -07:00
John Stultz
78c7b55b36 timekeeping: Fix __ktime_get_fast_ns() regression
commit 58bfea9532552d422bde7afa207e1a0f08dffa7d upstream.

In commit 27727df240c7 ("Avoid taking lock in NMI path with
CONFIG_DEBUG_TIMEKEEPING"), I changed the logic to open-code
the timekeeping_get_ns() function, but I forgot to include
the unit conversion from cycles to nanoseconds, breaking the
function's output, which impacts users like perf.

This results in bogus perf timestamps like:
 swapper     0 [000]   253.427536:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426573:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426687:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426800:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426905:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427022:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427127:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427239:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427346:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427463:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   255.426572:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])

Instead of more reasonable expected timestamps like:
 swapper     0 [000]    39.953768:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.064839:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.175956:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.287103:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.398217:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.509324:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.620437:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.731546:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.842654:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.953772:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    41.064881:  111111111 cpu-clock:  ffffffff810a0de6 native_safe_halt+0x6 ([kernel.kallsyms])

Add the proper use of timekeeping_delta_to_ns() to convert
the cycle delta to nanoseconds as needed.

Thanks to Brendan and Alexei for finding this quickly after
the v4.8 release. Unfortunately the problematic commit has
landed in some -stable trees so they'll need this fix as
well.

Many apologies for this mistake. I'll be looking to add a
perf-clock sanity test to the kselftest timers tests soon.

Fixes: 27727df240c7 "timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING"
Reported-by: Brendan Gregg <bgregg@netflix.com>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Tested-and-reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1475636148-26539-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-16 17:36:14 +02:00
Christopher S. Hall
9b57d91c03 time: Add cycles to nanoseconds translation
commit 6bd58f09e1d8cc6c50a824c00bf0d617919986a1 upstream.

The timekeeping code does not currently provide a way to translate
externally provided clocksource cycles to system time. The cycle count
is always provided by the result clocksource read() method internal to
the timekeeping code. The added function timekeeping_cycles_to_ns()
calculated a nanosecond value from a cycle count that can be added to
tk_read_base.base value yielding the current system time. This allows
clocksource cycle values external to the timekeeping code to provide a
cycle count that can be transformed to system time.

Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: kevin.b.stanton@intel.com
Cc: kevin.j.clarke@intel.com
Cc: hpa@zytor.com
Cc: jeffrey.t.kirsher@intel.com
Cc: netdev@vger.kernel.org
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-16 17:36:14 +02:00
Dmitry Shmidt
14de94f03d This is the 4.4.24 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJX96H2AAoJEDjbvchgkmk+MqcQAJuhiwLmCsXRKXGujGByPi5P
 vk+mnkt8o2UpamvT4KVRnWQrJuN8EHDHg29esGXKHV9Ahdmw/UfnXvbz3P0auet9
 GvMi4rKZpL3vD/dcMxshQchRKF7SwUbNNMkyJv1WYCjLex7W1LU/NOQV5VLx21i6
 9E/R/ARrazGhrGqanfm4NhIZYOR9QAWCrsc8pbJiE2OSty1HbQCLapA8gWg2PSnz
 sTlH0BF9tJ2kKSnyYjXM1Xb1zHbZuj83qEELhSnXsGK71Sq/8jIH9a5SiUwSDtdt
 szGp+vLODPqIMYa01qyLtFA2tvkusvKDUps8vtZ5mp9t38u2R8TDA+CAbz6w19mb
 C0d9abvZ9l1pIe/96OdgZkdSmGG2DC5hxnk3eaxhsyHn6RkIXfB9igH5+Fk7r/nm
 Yq15xxOIu6DCcuoQesUcAHoIR2961kbo/ZnnUEy2hRsqvR3/21X7qW3oj68uTdnB
 QQtMc1jq32toaZFk21ojLDtxKAlVqVHuslQ0hsMMgKtADZAveWpqZj408aNlPOi8
 CFHoEAxYXCQItOhRCoQeC1mljahvhEBI9N+5Zbpf30q5imLKen9hQphytbKABEWN
 CVJ6h6YndrdnlN7cS/AQ62+SNDk4kLmeMomgXfB701WTJ1cvI6eW4q6WUSS+54DZ
 q+brnDATt0K3nUmsrpGM
 =JaSn
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.24' into android-4.4.y

This is the 4.4.24 stable release
2016-10-14 13:34:43 -07:00
Dmitry Shmidt
fed8e608c3 Merge remote-tracking branch 'common/android-4.4' into android-4.4.y
Change-Id: I203e905e0a63db40a5bb8ee85fcac1e128736331
2016-10-10 13:10:27 -07:00
Dmitry Shmidt
09f6247a9c This is the 4.4.23 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJX7iBiAAoJEDjbvchgkmk+aIQQAIAZ97gsrZInLRZaLJCMS6Me
 4zZRry3pUDtrLkBglerFiKrJTG/mFzasJxyyHuvNU++C9Nu8GdIkslnZ6/g+BO4P
 xaX4PLeM4nCq33f8R5QX5dfM8qaCwWEdD01xK17Agrfcw8nljomPu3B1o8HnaFhb
 jZbmQ9I2yIpDivNorbHZAWZWV3fmk4brDbO/X60X6k4nn42ZSp5f2M2NlcirzR9/
 to5ZVEY51nrShXCJcoaNEMMd/lxPrsv1j5rI+WYibDlOJ4RTEy/UK+yJFgZqAXL9
 mou/A9D0p0uKAKH85s/5wpjvQ/7QsFRasW1HM1nEd8B3TqS2Xi9k+nYAD6S0HvE4
 IPKwPTpV9J+7ZixWSE4lorpHKZhhla+DVP09ZEZwJQlrxs/sGPQw2EgdhY5Kid1J
 Bd7dyxIUieF5sDFJnwYnsCGdtJaSaKW/Kpscz5q70bI2h8SugZYdIBpJdHTTe5cX
 vvfy+JaChpdcTLTgWh4XvgtWsabS2W2hFH6uBIkhy9hjhiflotBEG6WFOo3vrEC/
 lTqRx9AphBb9fW8hIIKNhI8gKEsAF7xzZ7/YHounGrBXCiJTbiogyysvNHkebHfd
 LbWtwMTSYrNNsu4ixiobofGu29PEDQW/i/emkUlF9jbKIL09bSGRaGet/qQOY6EJ
 WoHyxZAZCT+3xGuvhTcj
 =5gCn
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.23' into android-4.4.y

This is the 4.4.23 stable release
2016-10-10 12:43:03 -07:00
Michal Hocko
82b7839a40 kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
commit 735f2770a770156100f534646158cb58cb8b2939 upstream.

Commit fec1d01152 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal
exit") has caused a subtle regression in nscd which uses
CLONE_CHILD_CLEARTID to clear the nscd_certainly_running flag in the
shared databases, so that the clients are notified when nscd is
restarted.  Now, when nscd uses a non-persistent database, clients that
have it mapped keep thinking the database is being updated by nscd, when
in fact nscd has created a new (anonymous) one (for non-persistent
databases it uses an unlinked file as backend).

The original proposal for the CLONE_CHILD_CLEARTID change claimed
(https://lkml.org/lkml/2006/10/25/233):

: The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
: on behalf of pthread_create() library calls.  This feature is used to
: request that the kernel clear the thread-id in user space (at an address
: provided in the syscall) when the thread disassociates itself from the
: address space, which is done in mm_release().
:
: Unfortunately, when a multi-threaded process incurs a core dump (such as
: from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
: the other threads, which then proceed to clear their user-space tids
: before synchronizing in exit_mm() with the start of core dumping.  This
: misrepresents the state of process's address space at the time of the
: SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
: problems (misleading him/her to conclude that the threads had gone away
: before the fault).
:
: The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
: core dump has been initiated.

The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
seems to have a larger scope than the original patch asked for.  It
seems that limitting the scope of the check to core dumping should work
for SIGSEGV issue describe above.

[Changelog partly based on Andreas' description]
Fixes: fec1d01152 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
Link: http://lkml.kernel.org/r/1471968749-26173-1-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: William Preston <wpreston@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Andreas Schwab <schwab@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:46 +02:00
Subash Abhinov Kasiviswanathan
70cd763eb1 sysctl: handle error writing UINT_MAX to u32 fields
commit e7d316a02f683864a12389f8808570e37fb90aa3 upstream.

We have scripts which write to certain fields on 3.18 kernels but this
seems to be failing on 4.4 kernels.  An entry which we write to here is
xfrm_aevent_rseqth which is u32.

  echo 4294967295  > /proc/sys/net/core/xfrm_aevent_rseqth

Commit 230633d109 ("kernel/sysctl.c: detect overflows when converting
to int") prevented writing to sysctl entries when integer overflow
occurs.  However, this does not apply to unsigned integers.

Heinrich suggested that we introduce a new option to handle 64 bit
limits and set min as 0 and max as UINT_MAX.  This might not work as it
leads to issues similar to __do_proc_doulongvec_minmax.  Alternatively,
we would need to change the datatype of the entry to 64 bit.

  static int __do_proc_doulongvec_minmax(void *data, struct ctl_table
  {
      i = (unsigned long *) data;   //This cast is causing to read beyond the size of data (u32)
      vleft = table->maxlen / sizeof(unsigned long); //vleft is 0 because maxlen is sizeof(u32) which is lesser than sizeof(unsigned long) on x86_64.

Introduce a new proc handler proc_douintvec.  Individual proc entries
will need to be updated to use the new handler.

[akpm@linux-foundation.org: coding-style fixes]
Fixes: 230633d109 ("kernel/sysctl.c:detect overflows when converting to int")
Link: http://lkml.kernel.org/r/1471479806-5252-1-git-send-email-subashab@codeaurora.org
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:46 +02:00
Nicolas Iooss
6b502c1d73 printk: fix parsing of "brl=" option
commit ae6c33ba6e37eea3012fe2640b22400ef3f2d0f3 upstream.

Commit bbeddf52ad ("printk: move braille console support into separate
braille.[ch] files") moved the parsing of braille-related options into
_braille_console_setup(), changing the type of variable str from char*
to char**.  In this commit, memcmp(str, "brl,", 4) was correctly updated
to memcmp(*str, "brl,", 4) but not memcmp(str, "brl=", 4).

Update the code to make "brl=" option work again and replace memcmp()
with strncmp() to make the compiler able to detect such an issue.

Fixes: bbeddf52ad ("printk: move braille console support into separate braille.[ch] files")
Link: http://lkml.kernel.org/r/20160823165700.28952-1-nicolas.iooss_linux@m4x.org
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:43 +02:00
Mark Rutland
711a78a792 perf/core: Fix pmu::filter_match for SW-led groups
commit 2c81a6477081966fe80b8c6daa68459bca896774 upstream.

The following commit:

  66eb579e66 ("perf: allow for PMU-specific event filtering")

added the pmu::filter_match() callback. This was intended to
avoid HW constraints on events from resulting in extremely
pessimistic scheduling.

However, pmu::filter_match() is only called for the leader of each event
group. When the leader is a SW event, we do not filter the groups, and
may fail at pmu::add() time, and when this happens we'll give up on
scheduling any event groups later in the list until they are rotated
ahead of the failing group.

This can result in extremely sub-optimal event scheduling behaviour,
e.g. if running the following on a big.LITTLE platform:

$ taskset -c 0 ./perf stat \
 -e 'a57{context-switches,armv8_cortex_a57/config=0x11/}' \
 -e 'a53{context-switches,armv8_cortex_a53/config=0x11/}' \
 ls

     <not counted>      context-switches                                              (0.00%)
     <not counted>      armv8_cortex_a57/config=0x11/                                 (0.00%)
                24      context-switches                                              (37.36%)
          57589154      armv8_cortex_a53/config=0x11/                                 (37.36%)

Here the 'a53' event group was always eligible to be scheduled, but
the 'a57' group never eligible to be scheduled, as the task was always
affine to a Cortex-A53 CPU. The SW (group leader) event in the 'a57'
group was eligible, but the HW event failed at pmu::add() time,
resulting in ctx_flexible_sched_in giving up on scheduling further
groups with HW events.

One way of avoiding this is to check pmu::filter_match() on siblings
as well as the group leader. If any of these fail their
pmu::filter_match() call, we must skip the entire group before
attempting to add any events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 66eb579e66 ("perf: allow for PMU-specific event filtering")
Link: http://lkml.kernel.org/r/1465917041-15339-1-git-send-email-mark.rutland@arm.com
[ Small readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:41 +02:00
Joonwoo Park
8132ffc977 cpuset: handle race between CPU hotplug and cpuset_hotplug_work
commit 28b89b9e6f7b6c8fef7b3af39828722bca20cfee upstream.

A discrepancy between cpu_online_mask and cpuset's effective_cpus
mask is inevitable during hotplug since cpuset defers updating of
effective_cpus mask using a workqueue, during which time nothing
prevents the system from more hotplug operations.  For that reason
guarantee_online_cpus() walks up the cpuset hierarchy until it finds
an intersection under the assumption that top cpuset's effective_cpus
mask intersects with cpu_online_mask even with such a race occurring.

However a sequence of CPU hotplugs can open a time window, during which
none of the effective CPUs in the top cpuset intersect with
cpu_online_mask.

For example when there are 4 possible CPUs 0-3 and only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 > cpu2/online.
   CPU hotplug notifier woke up hotplug work but not yet scheduled.
      [0,2]                     [0]

   echo 0 > cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   <snip>
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
   [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
   [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
   [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
   [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

The top cpuset's effective_cpus are guaranteed to be identical to
cpu_online_mask eventually.  Hence fall back to cpu_online_mask when
there is no intersection between top cpuset's effective_cpus and
cpu_online_mask.

Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07 15:23:40 +02:00
James Morse
116fcd882e PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
commit 924d8696751c4b9e58263bc82efdafcf875596a6 upstream.

rtree_next_node() walks the linked list of leaf nodes to find the next
block of pages in the struct memory_bitmap. If it walks off the end of
the list of nodes, it walks the list of memory zones to find the next
region of memory. If it walks off the end of the list of zones, it
returns false.

This leaves the struct bm_position's node and zone pointers pointing
at their respective struct list_heads in struct mem_zone_bm_rtree.

memory_bm_find_bit() uses struct bm_position's node and zone pointers
to avoid walking lists and trees if the next bit appears in the same
node/zone. It handles these values being stale.

Swap rtree_next_node()s 'step then test' to 'test-next then step',
this means if we reach the end of memory we return false and leave
the node and zone pointers as they were.

This fixes a panic on resume using AMD Seattle with 64K pages:
[    6.868732] Freezing user space processes ... (elapsed 0.000 seconds) done.
[    6.875753] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
[    6.896453] PM: Using 3 thread(s) for decompression.
[    6.896453] PM: Loading and decompressing image data (5339 pages)...
[    7.318890] PM: Image loading progress:   0%
[    7.323395] Unable to handle kernel paging request at virtual address 00800040
[    7.330611] pgd = ffff000008df0000
[    7.334003] [00800040] *pgd=00000083fffe0003, *pud=00000083fffe0003, *pmd=00000083fffd0003, *pte=0000000000000000
[    7.344266] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    7.349825] Modules linked in:
[    7.352871] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G        W I     4.8.0-rc1 #4737
[    7.360512] Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1002C 04/08/2016
[    7.369109] task: ffff8003c0220000 task.stack: ffff8003c0280000
[    7.375020] PC is at set_bit+0x18/0x30
[    7.378758] LR is at memory_bm_set_bit+0x24/0x30
[    7.383362] pc : [<ffff00000835bbc8>] lr : [<ffff0000080faf18>] pstate: 60000045
[    7.390743] sp : ffff8003c0283b00
[    7.473551]
[    7.475031] Process swapper/0 (pid: 1, stack limit = 0xffff8003c0280020)
[    7.481718] Stack: (0xffff8003c0283b00 to 0xffff8003c0284000)
[    7.800075] Call trace:
[    7.887097] [<ffff00000835bbc8>] set_bit+0x18/0x30
[    7.891876] [<ffff0000080fb038>] duplicate_memory_bitmap.constprop.38+0x54/0x70
[    7.899172] [<ffff0000080fcc40>] snapshot_write_next+0x22c/0x47c
[    7.905166] [<ffff0000080fe1b4>] load_image_lzo+0x754/0xa88
[    7.910725] [<ffff0000080ff0a8>] swsusp_read+0x144/0x230
[    7.916025] [<ffff0000080fa338>] load_image_and_restore+0x58/0x90
[    7.922105] [<ffff0000080fa660>] software_resume+0x2f0/0x338
[    7.927752] [<ffff000008083350>] do_one_initcall+0x38/0x11c
[    7.933314] [<ffff000008b40cc0>] kernel_init_freeable+0x14c/0x1ec
[    7.939395] [<ffff0000087ce564>] kernel_init+0x10/0xfc
[    7.944520] [<ffff000008082e90>] ret_from_fork+0x10/0x40
[    7.949820] Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)
[    7.955909] ---[ end trace 0024a5986e6ff323 ]---
[    7.960529] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Here struct mem_zone_bm_rtree's start_pfn has been returned instead of
struct rtree_node's addr as the node/zone pointers are corrupt after
we walked off the end of the lists during mark_unsafe_pages().

This behaviour was exposed by commit 6dbecfd345a6 ("PM / hibernate:
Simplify mark_unsafe_pages()"), which caused mark_unsafe_pages() to call
duplicate_memory_bitmap(), which uses memory_bm_find_bit() after walking
off the end of the memory bitmap.

Fixes: 3a20cb1779 (PM / Hibernate: Implement position keeping in radix tree)
Signed-off-by: James Morse <james.morse@arm.com>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:39 +02:00
Thomas Garnier
119c348a8e PM / hibernate: Restore processor state before using per-CPU variables
commit 62822e2ec4ad091ba31f823f577ef80db52e3c2c upstream.

Restore the processor state before calling any other functions to
ensure per-CPU variables can be used with KASLR memory randomization.

Tracing functions use per-CPU variables (GS based on x86) and one was
called just before restoring the processor state fully. It resulted
in a double fault when both the tracing & the exception handler
functions tried to use a per-CPU variable.

Fixes: bb3632c610 (PM / sleep: trace events for suspend/resume)
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Reported-by: Jiri Kosina <jikos@kernel.org>
Tested-by: Rafael J. Wysocki <rafael@kernel.org>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:39 +02:00
Steven Rostedt (Red Hat)
8b275b4522 tracing: Move mutex to protect against resetting of seq data
commit 1245800c0f96eb6ebb368593e251d66c01e61022 upstream.

The iter->seq can be reset outside the protection of the mutex. So can
reading of user data. Move the mutex up to the beginning of the function.

Fixes: d7350c3f45 ("tracing/core: make the read callbacks reentrants")
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:38 +02:00
Al Viro
369796a884 fix memory leaks in tracing_buffers_splice_read()
commit 1ae2293dd6d2f5c823cf97e60b70d03631cd622f upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:38 +02:00
Steven Rostedt
a52031beb0 Makefile: Mute warning for __builtin_return_address(>0) for tracing only
commit 377ccbb483738f84400ddf5840c7dd8825716985 upstream.

With the latest gcc compilers, they give a warning if
__builtin_return_address() parameter is greater than 0. That is because if
it is used by a function called by a top level function (or in the case of
the kernel, by assembly), it can try to access stack frames outside the
stack and crash the system.

The tracing system uses __builtin_return_address() of up to 2! But it is
well aware of the dangers that it may have, and has even added precautions
to protect against it (see the thunk code in arch/x86/entry/thunk*.S)

Linus originally added KBUILD_CFLAGS that would suppress the warning for the
entire kernel, as simply adding KBUILD_CFLAGS to the tracing directory
wouldn't work. The tracing directory plays a bit with the CFLAGS and
requires a little more logic.

This adds that special logic to only suppress the warning for the tracing
directory. If it is used anywhere else outside of tracing, the warning will
still be triggered.

Link: http://lkml.kernel.org/r/20160728223043.51996267@grimm.local.home

Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:35 +02:00
Arnd Bergmann
29bd03596e Fix build warning in kernel/cpuset.c
>           2 ../kernel/cpuset.c:2101:11: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
>           1 ../kernel/cpuset.c:2101:2: warning: initialization from incompatible pointer type
>           1 ../kernel/cpuset.c:2101:2: warning: (near initialization for 'cpuset_cgrp_subsys.fork')

This got introduced by 06ec7a1d76 ("cpuset: make sure new tasks
conform to the current config of the cpuset"). In the upstream
kernel, the function prototype was changed as of b53202e63089
("cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends").

That patch is not suitable for stable kernels, and fortunately
the warning seems harmless as the prototypes only differ in the
second argument that is unused. Adding that argument gets rid
of the warning:

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30 10:18:33 +02:00
Dmitry Shmidt
8760f8e3d9 Merge remote-tracking branch 'common/android-4.4' into android-4.4.y
Change-Id: I6c4e7f9f47392d4b334f71e2b20f2ccf33827632
2016-09-26 14:58:53 -07:00
Dmitry Shmidt
734bcf32c2 This is the 4.4.22 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJX5jR2AAoJEDjbvchgkmk+nXwQAML5WFM1xDL8frXh3vIS3RzD
 fP2YHP0Bm+xE/G9jDnlcoqJmxg4DKPUCP4T/rCZmeNRWc/RaIBX+VTyfVhN969uo
 v5f8jN6fc4TO9WMD+G++Vx3MZqupJbSAXlY2ZSUTF389lM/jHvaWj+DfA1qGLmGJ
 UbfO1jNszadZGIb8yOo/qmR+E3sSV/nT+/y7Sa2rSqkKt5+YI+z1Q1ezLo7BZ+uO
 6p968djKTXSOO7SHciddoegJ8lF2hhgY4cW95CEV+Dqu2O6AVyFyMz+ngYivEueZ
 ZwwQCaYIl+68ssAoI61VmtQHEvuaikTx5g9vjAApScWWijZU+V/M65BLAL6GAMWH
 kWOmilbtZKhyirecAxgnRIkJR8Tp0YcgUYAivsqkYqVPelcPsHvOFRfr4D6HrcBt
 wLrjaoBj+1vAjskozKJEymDNGQJ2Me/nBAWgN44MQYLRGg4kdBxNS/CGyeh8O8wO
 gEeVqa+zDOQCSeg2LJdiql3TdMQfQ+kpCsfjcrrl1oRkRX7OX130+gLuI8Tt1Fno
 6niq6w+QeAY445RSyM45vLeJ6vXB7oFadtuD4QvsB5YFr0X0P0KF3GKlHl0xiyEV
 JFpWJiXYsnOvM8entT23aeCSTlDT1p6os3jLh8p7CBn9TvP3uW2nfgG/FKvy0wGD
 7L7FKYb4Mw+YSrROfxBT
 =5+OA
 -----END PGP SIGNATURE-----

Merge tag 'v4.4.22' into android-4.4.y

This is the 4.4.22 stable release

Change-Id: Id49e3c87d2cacb2fa85d85a17226f718f4a5ac28
2016-09-26 10:37:43 -07:00
Paul Moore
494a59a683 BACKPORT: audit: consistently record PIDs with task_tgid_nr()
Unfortunately we record PIDs in audit records using a variety of
methods despite the correct way being the use of task_tgid_nr().
This patch converts all of these callers, except for the case of
AUDIT_SET in audit_receive_msg() (see the comment in the code).

Reported-by: Jeff Vander Stoep <jeffv@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

Bug: 28952093

(cherry picked from commit fa2bea2f5cca5b8d4a3e5520d2e8c0ede67ac108)
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Change-Id: If6645f9de8bc58ed9755f28dc6af5fbf08d72a00
2016-09-24 15:10:25 -07:00
Thomas Gleixner
2fbc61d977 genirq/msi: Fix broken debug output
commit 4364e1a29be16b2783c0bcbc263f61236af64281 upstream.

virq is not required to be the same for all msi descs. Use the base irq number
from the desc in the debug printk.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:46 +02:00
Arnd Bergmann
268a8bf90e kconfig: tinyconfig: provide whole choice blocks to avoid warnings
commit 236dec051078a8691950f56949612b4b74107e48 upstream.

Using "make tinyconfig" produces a couple of annoying warnings that show
up for build test machines all the time:

    .config:966:warning: override: NOHIGHMEM changes choice state
    .config:965:warning: override: SLOB changes choice state
    .config:963:warning: override: KERNEL_XZ changes choice state
    .config:962:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state
    .config:933:warning: override: SLOB changes choice state
    .config:930:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state
    .config:870:warning: override: SLOB changes choice state
    .config:868:warning: override: KERNEL_XZ changes choice state
    .config:867:warning: override: CC_OPTIMIZE_FOR_SIZE changes choice state

I've made a previous attempt at fixing them and we discussed a number of
alternatives.

I tried changing the Makefile to use "merge_config.sh -n
$(fragment-list)" but couldn't get that to work properly.

This is yet another approach, based on the observation that we do want
to see a warning for conflicting 'choice' options, and that we can
simply make them non-conflicting by listing all other options as
disabled.  This is a trivial patch that we can apply independent of
plans for other changes.

Link: http://lkml.kernel.org/r/20160829214952.1334674-2-arnd@arndb.de
Link: https://storage.kernelci.org/mainline/v4.7-rc6/x86-tinyconfig/build.log
https://patchwork.kernel.org/patch/9212749/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:42 +02:00
Balbir Singh
a37a538e82 sched/core: Fix a race between try_to_wake_up() and a woken up task
commit 135e8c9250dd5c8c9aae5984fde6f230d0cbfeaf upstream.

The origin of the issue I've seen is related to
a missing memory barrier between check for task->state and
the check for task->on_rq.

The task being woken up is already awake from a schedule()
and is doing the following:

	do {
		schedule()
		set_current_state(TASK_(UN)INTERRUPTIBLE);
	} while (!cond);

The waker, actually gets stuck doing the following in
try_to_wake_up():

	while (p->on_cpu)
		cpu_relax();

Analysis:

The instance I've seen involves the following race:

 CPU1					CPU2

 while () {
   if (cond)
     break;
   do {
     schedule();
     set_current_state(TASK_UN..)
   } while (!cond);
					wakeup_routine()
					  spin_lock_irqsave(wait_lock)
   raw_spin_lock_irqsave(wait_lock)	  wake_up_process()
 }					  try_to_wake_up()
 set_current_state(TASK_RUNNING);	  ..
 list_del(&waiter.list);

CPU2 wakes up CPU1, but before it can get the wait_lock and set
current state to TASK_RUNNING the following occurs:

 CPU3
 wakeup_routine()
 raw_spin_lock_irqsave(wait_lock)
 if (!list_empty)
   wake_up_process()
   try_to_wake_up()
   raw_spin_lock_irqsave(p->pi_lock)
   ..
   if (p->on_rq && ttwu_wakeup())
   ..
   while (p->on_cpu)
     cpu_relax()
   ..

CPU3 tries to wake up the task on CPU1 again since it finds
it on the wait_queue, CPU1 is spinning on wait_lock, but immediately
after CPU2, CPU3 got it.

CPU3 checks the state of p on CPU1, it is TASK_UNINTERRUPTIBLE and
the task is spinning on the wait_lock. Interestingly since p->on_rq
is checked under pi_lock, I've noticed that try_to_wake_up() finds
p->on_rq to be 0. This was the most confusing bit of the analysis,
but p->on_rq is changed under runqueue lock, rq_lock, the p->on_rq
check is not reliable without this fix IMHO. The race is visible
(based on the analysis) only when ttwu_queue() does a remote wakeup
via ttwu_queue_remote. In which case the p->on_rq change is not
done uder the pi_lock.

The result is that after a while the entire system locks up on
the raw_spin_irqlock_save(wait_lock) and the holder spins infintely

Reproduction of the issue:

The issue can be reproduced after a long run on my system with 80
threads and having to tweak available memory to very low and running
memory stress-ng mmapfork test. It usually takes a long time to
reproduce. I am trying to work on a test case that can reproduce
the issue faster, but thats work in progress. I am still testing the
changes on my still in a loop and the tests seem OK thus far.

Big thanks to Benjamin and Nick for helping debug this as well.
Ben helped catch the missing barrier, Nick caught every missing
bit in my theory.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[ Updated comment to clarify matching barriers. Many
  architectures do not have a full barrier in switch_to()
  so that cannot be relied upon. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicholas Piggin <nicholas.piggin@gmail.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e02cce7b-d9ca-1ad0-7a61-ea97c7582b37@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:42 +02:00
Zefan Li
06ec7a1d76 cpuset: make sure new tasks conform to the current config of the cpuset
commit 06f4e94898918bcad00cdd4d349313a439d6911e upstream.

A new task inherits cpus_allowed and mems_allowed masks from its parent,
but if someone changes cpuset's config by writing to cpuset.cpus/cpuset.mems
before this new task is inserted into the cgroup's task list, the new task
won't be updated accordingly.

Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:39 +02:00
Mateusz Guzik
d0259cc85b audit: fix exe_file access in audit_exe_compare
commit 5efc244346f9f338765da3d592f7947b0afdc4b5 upstream.

Prior to the change the function would blindly deference mm, exe_file
and exe_file->f_inode, each of which could have been NULL or freed.

Use get_task_exe_file to safely obtain stable exe_file.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24 10:07:36 +02:00