android_kernel_oneplus_msm8998/kernel
Petr Mladek b36612b863 tracing: Initialize iter->seq after zeroing in tracing_read_pipe()
[ Upstream commit d303de1fcf344ff7c15ed64c3f48a991c9958775 ]

A customer reported the following softlockup:

[899688.160002] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [test.sh:16464]
[899688.160002] CPU: 0 PID: 16464 Comm: test.sh Not tainted 4.12.14-6.23-azure #1 SLE12-SP4
[899688.160002] RIP: 0010:up_write+0x1a/0x30
[899688.160002] Kernel panic - not syncing: softlockup: hung tasks
[899688.160002] RIP: 0010:up_write+0x1a/0x30
[899688.160002] RSP: 0018:ffffa86784d4fde8 EFLAGS: 00000257 ORIG_RAX: ffffffffffffff12
[899688.160002] RAX: ffffffff970fea00 RBX: 0000000000000001 RCX: 0000000000000000
[899688.160002] RDX: ffffffff00000001 RSI: 0000000000000080 RDI: ffffffff970fea00
[899688.160002] RBP: ffffffffffffffff R08: ffffffffffffffff R09: 0000000000000000
[899688.160002] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b59014720d8
[899688.160002] R13: ffff8b59014720c0 R14: ffff8b5901471090 R15: ffff8b5901470000
[899688.160002]  tracing_read_pipe+0x336/0x3c0
[899688.160002]  __vfs_read+0x26/0x140
[899688.160002]  vfs_read+0x87/0x130
[899688.160002]  SyS_read+0x42/0x90
[899688.160002]  do_syscall_64+0x74/0x160

It caught the process in the middle of trace_access_unlock(). There is
no loop. So, it must be looping in the caller tracing_read_pipe()
via the "waitagain" label.

Crashdump analyze uncovered that iter->seq was completely zeroed
at this point, including iter->seq.seq.size. It means that
print_trace_line() was never able to print anything and
there was no forward progress.

The culprit seems to be in the code:

	/* reset all but tr, trace, and overruns */
	memset(&iter->seq, 0,
	       sizeof(struct trace_iterator) -
	       offsetof(struct trace_iterator, seq));

It was added by the commit 53d0aa7730 ("ftrace:
add logic to record overruns"). It was v2.6.27-rc1.
It was the time when iter->seq looked like:

     struct trace_seq {
	unsigned char		buffer[PAGE_SIZE];
	unsigned int		len;
     };

There was no "size" variable and zeroing was perfectly fine.

The solution is to reinitialize the structure after or without
zeroing.

Link: http://lkml.kernel.org/r/20191011142134.11997-1-pmladek@suse.com

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06 12:09:16 +01:00
..
bpf bpf: silence warning messages in core 2019-08-04 09:34:46 +02:00
configs
debug kdb: use memmove instead of overlapping memcpy 2018-12-13 09:21:29 +01:00
events perf/core: Fix creating kernel counters for PMUs that override event->cpu 2019-08-25 10:52:51 +02:00
gcov gcov: disable for COMPILE_TEST 2018-01-23 19:50:10 +01:00
irq genirq: Prevent NULL pointer dereference in resend_irqs() 2019-09-21 07:12:43 +02:00
livepatch
locking locking/lockdep: Add debug_locks check in __lock_downgrade() 2019-10-05 12:27:39 +02:00
power PM / Hibernate: Call flush_icache_range() on pages restored in-place 2019-04-03 06:23:24 +02:00
printk printk: Do not lose last line in kmsg buffer dump 2019-10-05 12:27:51 +02:00
rcu rcutorture: Fix cleanup path for invalid torture_type strings 2019-06-11 12:24:04 +02:00
sched sched/core: Fix CPU controller for !RT_GROUP_SCHED 2019-10-05 12:27:45 +02:00
time alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP 2019-10-05 12:27:53 +02:00
trace tracing: Initialize iter->seq after zeroing in tracing_read_pipe() 2019-11-06 12:09:16 +01:00
.gitignore
acct.c kernel/acct.c: fix the acct->needcheck check in check_free_space() 2018-01-10 09:27:08 +01:00
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2018-02-16 20:09:45 +01:00
audit.c audit: return on memory error to avoid null pointer dereference 2018-05-30 07:49:16 +02:00
audit.h
audit_fsnotify.c
audit_tree.c
audit_watch.c audit: fix use-after-free in audit_add_watch 2018-09-26 08:35:08 +02:00
auditfilter.c audit: fix a memory leak bug 2019-06-11 12:23:58 +02:00
auditsc.c audit: allow not equal op for audit by executable 2018-08-06 16:24:38 +02:00
backtracetest.c
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-11-21 09:27:35 +01:00
capability.c
cgroup.c cgroup: Disable IRQs while holding css_set_lock 2019-09-06 10:18:11 +02:00
cgroup_freezer.c
cgroup_pids.c
compat.c
configs.c
context_tracking.c
cpu.c cpu/speculation: Warn on unsupported mitigations= parameter 2019-07-10 09:56:36 +02:00
cpu_pm.c
cpuset.c
crash_dump.c
cred.c access: avoid the RCU grace period for the temporary subjective credentials 2019-08-04 09:35:01 +02:00
delayacct.c
dma.c
elfcore.c kernel/elfcore.c: include proper prototypes 2019-10-17 13:40:56 -07:00
exec_domain.c
exit.c kernel/exit.c: release ptraced tasks before zap_pid_ns_processes 2019-02-06 19:43:07 +01:00
extable.c
fork.c kernel/sysctl.c: do not override max_threads provided by userspace 2019-10-17 13:41:04 -07:00
freezer.c
futex.c futex: Fix futex lock the wrong page 2019-06-22 08:18:22 +02:00
futex_compat.c
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2018-01-10 09:27:10 +01:00
hung_task.c kernel/hung_task.c: break RCU locks based on jiffies 2019-02-20 10:13:14 +01:00
irq_work.c
jump_label.c
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c
kexec_core.c
kexec_file.c
kexec_internal.h
kmod.c
kprobes.c kprobes: Prohibit probing on BUG() and WARN() address 2019-10-05 12:27:50 +02:00
ksysfs.c
kthread.c kthread, tracing: Don't expose half-written comm when creating kthreads 2018-09-09 20:04:34 +02:00
latencytop.c
Makefile
membarrier.c
memremap.c mm, devm_memremap_pages: kill mapping "System RAM" support 2019-01-13 10:05:32 +01:00
module-internal.h
module.c kernel/module.c: Only return -EEXIST for modules that have finished loading 2019-08-06 18:28:25 +02:00
module_signing.c
notifier.c
nsproxy.c
padata.c padata: use smp_mb in padata_reorder to avoid orphaned padata jobs 2019-08-04 09:34:51 +02:00
panic.c panic: ensure preemption is disabled during panic() 2019-10-17 13:40:58 -07:00
params.c
pid.c pidns: disable pid allocation if pid_ns_prepare_proc() is failed in alloc_pid() 2018-04-13 19:50:03 +02:00
pid_namespace.c signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig 2019-08-04 09:34:42 +02:00
profile.c profile: hide unused functions when !CONFIG_PROC_FS 2018-02-25 11:03:44 +01:00
ptrace.c ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME 2019-07-10 09:56:42 +02:00
range.c
reboot.c
relay.c kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE 2018-05-30 07:49:00 +02:00
resource.c resource: fix integer overflow at reallocation 2018-04-24 09:32:05 +02:00
seccomp.c seccomp: Move speculation migitation control to arch code 2018-07-25 10:18:27 +02:00
signal.c kernel/signal.c: trace_signal_deliver when signal_group_exit 2019-06-11 12:24:10 +02:00
smp.c
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c
sys.c kernel/sys.c: prctl: fix false positive in validate_prctl_map() 2019-06-22 08:18:18 +02:00
sys_ni.c
sysctl.c sysctl: return -EINVAL if val violates minmax 2019-06-22 08:18:17 +02:00
sysctl_binary.c
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c tracepoint: Do not warn on ENOMEM 2018-05-16 10:06:47 +02:00
tsacct.c
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2018-01-10 09:27:10 +01:00
up.c
user-return-notifier.c
user.c
user_namespace.c userns: move user access out of the mutex 2018-09-09 20:04:35 +02:00
utsname.c
utsname_sysctl.c sys: don't hold uts_sem while accessing userspace memory 2018-09-09 20:04:35 +02:00
watchdog.c
workqueue.c workqueue: use put_device() instead of kfree() 2018-05-30 07:49:04 +02:00
workqueue_internal.h