[ Upstream commit 8799a221f5944a7d74516ecf46d58c28ec1d1f75 ]
Net stack initialization currently initializes fib-trie after the
first netdevice_notifier() call. In fact, fib_trie initialization needs
to happen before the first rtnl_register(). This does not cause any
problem today, since no devices are UP at that moment, but trying to
bring 'lo' UP at initialization would make this assumption wrong and
expose the issue.
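The corrected ordering inside ip_fib_init() looks roughly like this (a
sketch from memory of the 4.x code, not the literal hunk):

    void __init ip_fib_init(void)
    {
            fib_trie_init();        /* before any notifier/rtnl registration */

            register_pernet_subsys(&fib_net_ops);

            register_netdevice_notifier(&fib_netdev_notifier);
            register_inetaddr_notifier(&fib_inetaddr_notifier);

            rtnl_register(PF_INET, RTM_NEWROUTE, inet_rtm_newroute, NULL, NULL);
            rtnl_register(PF_INET, RTM_DELROUTE, inet_rtm_delroute, NULL, NULL);
            rtnl_register(PF_INET, RTM_GETROUTE, NULL, inet_dump_fib, NULL);
    }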
Fixes the following crash:
Call Trace:
? alternate_node_alloc+0x76/0xa0
fib_table_insert+0x1b7/0x4b0
fib_magic.isra.17+0xea/0x120
fib_add_ifaddr+0x7b/0x190
fib_netdev_event+0xc0/0x130
register_netdevice_notifier+0x1c1/0x1d0
ip_fib_init+0x72/0x85
ip_rt_init+0x187/0x1e9
ip_init+0xe/0x1a
inet_init+0x171/0x26c
? ipv4_offload_init+0x66/0x66
do_one_initcall+0x43/0x160
kernel_init_freeable+0x191/0x219
? rest_init+0x80/0x80
kernel_init+0xe/0x150
ret_from_fork+0x22/0x30
Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
CR2: 0000000000000014
Fixes: 7b1a74fdbb ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
Fixes: 7f9b80529b ("[IPV4]: fib hash|trie initialization")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6399f1fae4ec29fab5ec76070435555e256ca3a6 ]
In some cases, offset can overflow and can cause an infinite loop in
ip6_find_1stfragopt(). Make it unsigned int to prevent the overflow, and
cap it at IPV6_MAXPLEN, since packets larger than that should be invalid.
This problem has been here since before the beginning of git history.
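A sketch of what the described change implies inside ip6_find_1stfragopt()
(the original narrower type and the exact placement of the check are
assumptions, shown only to illustrate the idea):

    unsigned int offset = sizeof(struct ipv6hdr);   /* was a narrower type */

    /* ... inside the option-walking loop ... */
    if (offset > IPV6_MAXPLEN)      /* larger payloads are invalid anyway */
            return -EINVAL;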
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 63679112c536289826fec61c917621de95ba2ade ]
The ifr.ifr_name is passed around and assumed to be NULL terminated.
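The usual defensive pattern at such call sites is to force termination
right after the copy from userspace (illustrative sketch, not a specific
call site):

    struct ifreq ifr;

    if (copy_from_user(&ifr, arg, sizeof(ifr)))
            return -EFAULT;
    ifr.ifr_name[IFNAMSIZ - 1] = 0;    /* guarantee NUL termination */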
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6fb05e0dd32e566facb96ea61a48c7488daa5ac3 upstream.
Avoid a double fetch by reusing the values from the prior transfer.
Originally reported via https://bugzilla.kernel.org/show_bug.cgi?id=195559
Thanks to Pengfei Wang <wpengfeinudt@gmail.com> for reporting.
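The shape of the problem and of the fix, in generic terms (hdr, ubuf and
MAX_LEN are made-up names, not the driver's actual symbols):

    struct msg_header hdr;                          /* hypothetical */

    if (copy_from_user(&hdr, ubuf, sizeof(hdr)))    /* single fetch */
            return -EFAULT;
    if (hdr.len > MAX_LEN)
            return -EINVAL;

    /* From here on, use only the already-copied hdr.len; fetching the
     * header again (from user memory or a shared PCIe window) would let
     * it change after validation. */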
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the backport of commit 4f7b0d263833 ("drm: rcar-du: Simplify and fix
probe error handling"), which is commit 8255d26322 in this tree, the
error handling path was incorrect. This patch fixes it up.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: thongsyho <thong.ho.px@rvc.renesas.com>
Cc: Nhan Nguyen <nhan.nguyen.yb@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15d3042a937c13f5d9244241c7a9c8416ff6e82a upstream.
Make sure segno and blkoff read from raw image are valid.
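Roughly, the added sanity check for the node logs looks like this (data
logs get the same treatment; hedged sketch based on the usual f2fs
checkpoint fields):

    for (i = 0; i < NR_CURSEG_NODE_TYPE; i++) {
            if (le32_to_cpu(ckpt->cur_node_segno[i]) >= main_segs ||
                le16_to_cpu(ckpt->cur_node_blkoff[i]) >= blocks_per_seg)
                    return 1;       /* reject the corrupted image */
    }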
Cc: stable@vger.kernel.org
Signed-off-by: Jin Qian <jinqian@google.com>
[Jaegeuk Kim: adjust minor coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[AmitP: Found in Android Security bulletin for Aug'17, fixes CVE-2017-10663]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f5039ba440e499d85c29b1ddbc3cbc9dc90e44b upstream.
Since commit e8f4818895b3 ("[media] lirc: advertise
LIRC_CAN_GET_REC_RESOLUTION and improve") lircd uses the ioctl
LIRC_GET_REC_RESOLUTION to determine the shortest pulse or space that
the hardware can detect. This breaks decoding in lirc because lircd
expects the answer in microseconds, but the value is returned in
nanoseconds.
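So the ioctl handler has to scale the internal value before reporting
it, roughly (sketch; dev is the underlying rc_dev):

    case LIRC_GET_REC_RESOLUTION:
            /* rx_resolution is kept in nanoseconds internally, but the
             * lirc ABI expects microseconds */
            val = dev->rx_resolution / 1000;
            break;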
Reported-by: Derek <user.vdr@gmail.com>
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ea277194daaeaa84ce75180ec7c7a2075027a68 upstream.
Stable note for 4.4: The upstream patch patches madvise(MADV_FREE) but 4.4
does not have support for that feature. The changelog is left
as-is but the hunk related to madvise is omitted from the backport.
Nadav Amit identified a theoretical race between page reclaim and
mprotect due to TLB flushes being batched outside of the PTL being held.
He described the race as follows:
CPU0                                CPU1
----                                ----
                                    user accesses memory using RW PTE
                                    [PTE now cached in TLB]
try_to_unmap_one()
==> ptep_get_and_clear()
==> set_tlb_ubc_flush_pending()
                                    mprotect(addr, PROT_READ)
                                    ==> change_pte_range()
                                    ==> [ PTE non-present - no flush ]

                                    user writes using cached RW PTE
...

try_to_unmap_flush()
The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such
as munmap, mremap and madvise.
For some operations like mprotect, it's not necessarily a data integrity
issue but it is a correctness issue, as there is a window where an
mprotect that limits access still allows access. For munmap, it's
potentially a data integrity issue, although the window is massive: an
munmap, mmap and return to userspace must all complete between the point
where reclaim drops the PTL and the point where it flushes the TLB.
However, it's theoretically possible, so handle this issue by flushing
the mm if reclaim is potentially batching TLB flushes.
Other instances where a flush is required for a present pte should be ok,
as either the page lock is held (preventing parallel reclaim) or a page
reference count is elevated (preventing a parallel free that would lead
to corruption). In the case of page_mkclean there isn't an obvious path
that userspace could take advantage of without using the operations that
are guarded by this patch. Other users such as gup, even in a race with
reclaim, look only at PTEs. Huge page variants should be ok as they
don't race with reclaim. mincore only looks at PTEs. userfault should
also be ok: if a parallel reclaim takes place, it will either fault the
page back in or read some of the data before the flush occurs, triggering
a fault.
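The guard added on the affected paths has roughly this shape (sketch,
assuming a per-mm flag that reclaim sets while it batches flushes; not
the literal upstream hunk):

    /* called from munmap/mprotect/etc. before depending on a TLB flush */
    void flush_tlb_batched_pending(struct mm_struct *mm)
    {
            if (mm->tlb_flush_batched) {
                    flush_tlb_mm(mm);

                    /* keep the clear ordered after the flush */
                    barrier();
                    mm->tlb_flush_batched = false;
            }
    }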
Note that a variant of this patch was acked by Andy Lutomirski but this
was for the x86 parts on top of his PCID work which didn't make the 4.13
merge window as expected. His ack is dropped from this version and
there will be a follow-on patch on top of PCID that will include his
ack.
[akpm@linux-foundation.org: tweak comments]
[akpm@linux-foundation.org: fix spello]
Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.de
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fce50a2fa4e9c6e103915c351b6d4a98661341d6 upstream.
This patch fixes a NULL pointer dereference in isert_login_recv_done()
of isert_conn->cm_id due to isert_cma_handler() -> isert_connect_error()
resetting isert_conn->cm_id = NULL during a failed login attempt.
As per Sagi, we will always see the completion of all recv wrs posted
on the qp (given that we assigned a ->done handler); in this case it is
a FLUSH error completion, we just don't get to verify that because we
dereference NULL before getting there.
The issue here was the assumption that dereferencing the connection's
cm_id is always safe, which has not been true since:
commit 4a579da258
Author: Sagi Grimberg <sagig@mellanox.com>
Date: Sun Mar 29 15:52:04 2015 +0300
iser-target: Fix possible deadlock in RDMA_CM connection error
As I see it, we have a direct reference to the isert_device from
isert_conn, which gives us the one-liner fix that we actually need, like
we do in isert_rdma_read_done() and isert_rdma_write_done().
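A sketch of that one-liner in isert_login_recv_done() terms (the
surrounding declaration is illustrative):

    struct isert_conn *isert_conn = wc->qp->qp_context;
    struct ib_device *ib_dev = isert_conn->device->ib_device;  /* not cm_id->device */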
Reported-by: Andrea Righi <righi.andrea@gmail.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 105fa2f44e504c830697b0c794822112d79808dc upstream.
This patch fixes a BUG() in iscsit_close_session() that could be
triggered when iscsit_logout_post_handler() execution from within tx
thread context did not run for more than SECONDS_FOR_LOGOUT_COMP
(15 seconds), and the TCP connection had not already closed before then,
forcing tx thread context to exit automatically.
This would manifest itself during explicit logout as:
[33206.974254] 1 connection(s) still exist for iSCSI session to iqn.1993-08.org.debian:01:3f5523242179
[33206.980184] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 2100.772 msecs
[33209.078643] ------------[ cut here ]------------
[33209.078646] kernel BUG at drivers/target/iscsi/iscsi_target.c:4346!
Normally, when an explicit logout attempt fails, the tx thread context
exits and iscsit_close_connection() from rx thread context does the
extra cleanup once it detects that conn->conn_logout_remove has not been
cleared by the logout-type-specific post handlers.
To address this special case, if the logout post handler in tx thread
context detects that conn->tx_thread_active has already been cleared, it
simply returns and exits, letting the existing iscsit_close_connection()
logic from rx thread context do the failed logout cleanup.
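The early-return added to the logout post handlers is essentially
(sketch of the idea):

    /* atomically claim the tx thread; if it has already exited, bail out
     * and let iscsit_close_connection() from the rx side clean up */
    if (!cmpxchg(&conn->tx_thread_active, true, false))
            return;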
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Gary Guo <ghg@datera.io>
Tested-by: Chu Yuan Lin <cyl@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 25cdda95fda78d22d44157da15aa7ea34be3c804 upstream.
This patch fixes an OOPs originally introduced by:
commit bb048357da
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Thu Sep 5 14:54:04 2013 -0700
iscsi-target: Add sk->sk_state_change to cleanup after TCP failure
which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.
To address this issue, this patch makes the following changes.
First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.
Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running. For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().
The original NULL pointer dereference case reported by MNC is now handled
by iscsi_target_do_login() doing an iscsi_target_sk_check_close() before
transitioning to FFP to determine when the socket has already closed,
or by iscsi_target_start_negotiation() if the login needs to exchange
more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
closed. For both of these cases, the cleanup of remaining connection
resources will occur in iscsi_target_start_negotiation() from iscsi_np
process context once the failure is detected.
Finally, to handle the case where iscsi_target_sk_state_change() is
called after the initial PDU processing is complete, it now invokes
conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.
Reported-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Reported-by: Hannes Reinecke <hare@suse.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Varun Prakash <varun@chelsio.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f0dfb3d8b1120c61f6e2cc3729290db10772b2d upstream.
There is an iscsi-target/tcp login race in LOGIN_FLAGS_READY
state assignment that can result in frequent errors during
iSCSI discovery:
"iSCSI Login negotiation failed."
To address this bug, move the initial LOGIN_FLAGS_READY
assignment ahead of iscsi_target_do_login() when handling
the initial iscsi_target_start_negotiation() request PDU
during connection login.
As the iscsi_target_do_login_rx() work_struct callback
clears LOGIN_FLAGS_READ_ACTIVE after subsequent calls
to iscsi_target_do_login(), an early sk_data_ready
ahead of the first iscsi_target_do_login() expects
LOGIN_FLAGS_READY to already be set for the initial
login request PDU.
As reported by Maged, this was first observed using an
MSFT initiator running across multiple VMware host
virtual machines with iscsi-target/tcp.
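In iscsi_target_start_negotiation() terms, that means marking the
connection ready before the first iscsi_target_do_login() pass, roughly
(sketch):

    if (conn->sock) {
            struct sock *sk = conn->sock->sk;

            write_lock_bh(&sk->sk_callback_lock);
            set_bit(LOGIN_FLAGS_READY, &conn->login_flags);
            write_unlock_bh(&sk->sk_callback_lock);
    }

    ret = iscsi_target_do_login(conn, login);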
Reported-by: Maged Mokhtar <mmokhtar@binarykinetics.com>
Tested-by: Maged Mokhtar <mmokhtar@binarykinetics.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e0cf5e6c43b9e19fc0284f69e5cd2b4a47523b0 upstream.
There are three timing problems in the kthread usages of iscsi_target_mod:
- np_thread of struct iscsi_np
- rx_thread and tx_thread of struct iscsi_conn
In iscsit_close_connection(), it calls
send_sig(SIGINT, conn->tx_thread, 1);
kthread_stop(conn->tx_thread);
In conn->tx_thread, which is iscsi_target_tx_thread(), when it receives
SIGINT the kthread will exit without checking the return value of
kthread_should_stop().
So if iscsi_target_tx_thread() exits right between send_sig(SIGINT...)
and kthread_stop(...), the kthread_stop() will try to stop an already
stopped kthread.
This is invalid according to the documentation of kthread_stop().
(Fix -ECONNRESET logout handling in iscsi_target_tx_thread and
early iscsi_target_rx_thread failure case - nab)
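The general pattern for closing such a race is for the signalled kthread
to linger until kthread_stop() has actually been called (illustrative
pattern, not the exact upstream hunk):

    /* got SIGINT, but must not return before kthread_stop() is issued */
    while (!kthread_should_stop())
            msleep(100);
    return 0;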
Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49cb77e297dc611a1b795cfeb79452b3002bd331 upstream.
This patch closes a race between se_lun deletion during configfs
unlink in target_fabric_port_unlink() -> core_dev_del_lun()
-> core_tpg_remove_lun(), when transport_clear_lun_ref() blocks
waiting for percpu_ref RCU grace period to finish, but a new
NodeACL mappedlun is added before the RCU grace period has
completed.
This can happen in target_fabric_mappedlun_link() because it
only checks for se_lun->lun_se_dev, which is not cleared until
after transport_clear_lun_ref() percpu_ref RCU grace period
finishes.
This bug originally manifested as NULL pointer dereference
OOPsen in target_stat_scsi_att_intr_port_show_attr_dev() on
v4.1.y code, because it dereferences lun->lun_se_dev without
an explicit NULL pointer check.
In post v4.1 code with target-core RCU conversion, the code
in target_stat_scsi_att_intr_port_show_attr_dev() no longer
uses se_lun->lun_se_dev, but the same race still exists.
To address the bug, go ahead and set se_lun->lun_shutdown as
early as possible in core_tpg_remove_lun(), and ensure new
NodeACL mappedlun creation in target_fabric_mappedlun_link()
fails during se_lun shutdown.
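So the link side gains a check along these lines (sketch, using the
lun_shutdown flag described above):

    /* in target_fabric_mappedlun_link() */
    if (!lun->lun_se_dev || lun->lun_shutdown) {
            pr_err("Unable to create mappedlun: se_lun is shutting down\n");
            return -EINVAL;
    }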
Reported-by: James Shen <jcs@datera.io>
Cc: James Shen <jcs@datera.io>
Tested-by: James Shen <jcs@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da05d52d2f0f6bd61094a0cd045fed94bf7d673a upstream.
This patch makes sure the VPFE_CMD_S_CCDC_RAW_PARAMS ioctl no longer works
for the vpfe_capture driver, with a minimal patch suitable for backporting.
- This ioctl was never in the public API and was only defined in a kernel header.
- The function set_params constantly mixes up pointers and phys_addr_t
numbers.
- This is part of a 'VPFE_CMD_S_CCDC_RAW_PARAMS' ioctl command that is
described as an 'experimental ioctl that will change in future kernels'.
- The code to allocate the table never gets called after we copy_from_user
the user input over the kernel settings, and then compare them
for inequality.
- We then go on to use an address provided by user space as both the
__user pointer for input and pass it through phys_to_virt to come up
with a kernel pointer to copy the data to. This looks like a trivially
exploitable root hole.
For these reasons we make sure this ioctl now returns -EINVAL, and we backport
this patch as far as possible.
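For the backport, the default-ioctl handler simply rejects the command
(sketch of the minimal change):

    case VPFE_CMD_S_CCDC_RAW_PARAMS:
            ret = -EINVAL;
            v4l2_warn(&vpfe_dev->v4l2_dev,
                      "VPFE_CMD_S_CCDC_RAW_PARAMS not supported\n");
            break;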
Fixes: 5f15fbb68f ("V4L/DVB (12251): v4l: dm644x ccdc module for vpfe capture driver")
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d4514173211586c6238629b1ef1e071927735f5 upstream.
As written in the datasheet, the PCA955 can only handle low-level irqs,
not edge irqs.
Without this fix the interrupt is not usable for the pca955: the
gpio-pca953x driver already sets the irq type to low level, which is
incompatible with the edge type, so the kernel prevents the interrupt
from being used:
"irq: type mismatch, failed to map hwirq-18 for
/soc/internal-regs/gpio@18100!"
Fixes: 928413bd85 ("ARM: mvebu: Add Armada 388 General Purpose
Development Board support")
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aec51758ce10a9c847a62a48a168f8c804c6e053 upstream.
On a 32-bit platform, the value of n_blocks_count may be wrong while
the file system is being resized to a size larger than 2^32 blocks. This
may cause the superblock to be corrupted with a zero blocks count.
Fixes: 1c6bd7173d
Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fcf5ea10992fbac3c7473a1db33d56a139333cd1 upstream.
ext4_find_unwritten_pgoff() does not properly handle a situation when
starting index is in the middle of a page and blocksize < pagesize. The
following command shows the bug on filesystem with 1k blocksize:
xfs_io -f -c "falloc 0 4k" \
-c "pwrite 1k 1k" \
-c "pwrite 3k 1k" \
-c "seek -a -r 0" foo
In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048,
SEEK_DATA) will return the correct result.
Fix the problem by ignoring buffers in a page that lie before the starting offset.
Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adb1fe9ae2ee6ef6bc10f3d5a588020e7664dfa7 upstream.
Linus suggested we try to remove some of the low-hanging fruit related
to kernel address exposure in dmesg. The only leaks I see on my local
system are:
Freeing SMP alternatives memory: 32K (ffffffff9e309000 - ffffffff9e311000)
Freeing initrd memory: 10588K (ffffa0b736b42000 - ffffa0b737599000)
Freeing unused kernel memory: 3592K (ffffffff9df87000 - ffffffff9e309000)
Freeing unused kernel memory: 1352K (ffffa0b7288ae000 - ffffa0b728a00000)
Freeing unused kernel memory: 632K (ffffa0b728d62000 - ffffa0b728e00000)
Linus says:
"I suspect we should just remove [the addresses in the 'Freeing'
messages]. I'm sure they are useful in theory, but I suspect they
were more useful back when the whole "free init memory" was
originally done.
These days, if we have a use-after-free, I suspect the init-mem
situation is the easiest situation by far. Compared to all the dynamic
allocations which are much more likely to show it anyway. So having
debug output for that case is likely not all that productive."
With this patch the freeing messages now look like this:
Freeing SMP alternatives memory: 32K
Freeing initrd memory: 10588K
Freeing unused kernel memory: 3592K
Freeing unused kernel memory: 1352K
Freeing unused kernel memory: 632K
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/6836ff90c45b71d38e5d4405aec56fa9e5d1d4b2.1477405374.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 337c017ccdf2653d0040099433fc1a2b1beb5926 upstream.
WARNING: CPU: 5 PID: 1242 at kernel/rcu/tree_plugin.h:323 rcu_note_context_switch+0x207/0x6b0
CPU: 5 PID: 1242 Comm: unity-settings- Not tainted 4.13.0-rc2+ #1
RIP: 0010:rcu_note_context_switch+0x207/0x6b0
Call Trace:
__schedule+0xda/0xba0
? kvm_async_pf_task_wait+0x1b2/0x270
schedule+0x40/0x90
kvm_async_pf_task_wait+0x1cc/0x270
? prepare_to_swait+0x22/0x70
do_async_page_fault+0x77/0xb0
? do_async_page_fault+0x77/0xb0
async_page_fault+0x28/0x30
RIP: 0010:__d_lookup_rcu+0x90/0x1e0
I encountered this when trying to stress the async page fault path in an
L1 guest with L2 guests running.
Commit 9b132fbe54 (Add rcu user eqs exception hooks for async page
fault) adds rcu_irq_enter/exit() to kvm_async_pf_task_wait() to exit CPU
idle eqs when needed, to protect the code that needs to use RCU. However,
we need to call the pair even if the function calls schedule(), as seen
in the backtrace above.
This patch fixes it by informing the RCU subsystem that we exit/enter the
irq towards/away from idle for both the n.halted and !n.halted cases.
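i.e. kvm_async_pf_task_wait() brackets both the schedule() path and the
halt path with the RCU irq hooks, roughly (sketch):

    rcu_irq_exit();                 /* leaving irq-from-idle as far as RCU cares */
    if (!n.halted) {
            local_irq_enable();
            schedule();
            local_irq_disable();
    } else {
            /* cannot reschedule, so halt until the page is available */
            native_safe_halt();
            local_irq_disable();
    }
    rcu_irq_enter();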
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1cd2e34c69a2f3988786af451b6e17967c293a0 upstream.
Multiple frontend dailinks may be connected to a backend
dailink at the same time. When one of the frontend dailinks is
closed, the associated backend dailink should not be closed
if it is connected to other active frontend dailinks. This
change ensures that the backend dailink is closed only after
all connected frontend dailinks are closed.
Signed-off-by: Gopikrishnaiah Anandan <agopik@codeaurora.org>
Signed-off-by: Banajit Goswami <bgoswami@codeaurora.org>
Signed-off-by: Patrick Lai <plai@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f3c371421e601fa93b6cb7fb52da9ad59ec90b4 upstream.
Sony VAIO VPCL14M1R needs the quirk to make the speaker work properly.
Tested-by: Dmitriy <mexx400@yandex.ru>
Signed-off-by: Sergei A. Trusov <sergei.a.trusov@ya.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c0338c68706be53b3dc472e4308961c36e4ece1 upstream.
The combination of WQ_UNBOUND and max_active == 1 used to imply
ordered execution. After commit 4c16bd327c ("workqueue:
implement NUMA affinity for unbound workqueues"), this is no longer
true due to per-node worker pools.
While the right way to create an ordered workqueue is
alloc_ordered_workqueue(), the documentation has been misleading for a
long time and people do use WQ_UNBOUND and max_active == 1 for ordered
workqueues which can lead to subtle bugs which are very difficult to
trigger.
It's unlikely that we'd see noticeable performance impact by enforcing
ordering on WQ_UNBOUND / max_active == 1 workqueues. Let's
automatically set __WQ_ORDERED for those workqueues.
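For reference, both idioms then yield ordered execution ("my_wq" is just
a placeholder name):

    /* explicit, preferred way to get an ordered workqueue */
    wq = alloc_ordered_workqueue("my_wq", 0);

    /* long-documented idiom people relied on; with this patch it again
     * implies __WQ_ORDERED */
    wq = alloc_workqueue("my_wq", WQ_UNBOUND, 1);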
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Reported-by: Alexei Potashnik <alexei@purestorage.com>
Fixes: 4c16bd327c ("workqueue: implement NUMA affinity for unbound workqueues")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59a5e266c3f5c1567508888dd61a45b86daed0fa upstream.
My static checker complains that "devno" can be negative, meaning that
we read before the start of the device array. I've looked at the code,
and I think the warning is right. This comes from /proc, so it's root
only, or it would be quite a serious bug. The call tree looks like this:
proc_scsi_write() <- gets id and channel from simple_strtoul()
-> scsi_add_single_device() <- calls shost->transportt->user_scan()
-> ata_scsi_user_scan()
-> ata_find_dev()
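So ata_find_dev() needs a lower bound as well, something like (sketch;
the PMP branch gets the same treatment):

    if (likely(devno >= 0 &&
               devno < ata_link_max_devices(&ap->link)))
            return &ap->link.device[devno];
    return NULL;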
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This can cause issues with processes using the poll()
interface:
1) client sends two oneway transactions
2) the second one gets queued on async_todo
(because the server didn't handle the first one
yet)
3) server returns from poll(), picks up the
first transaction and does transaction work
4) server is done with the transaction, sends
BC_FREE_BUFFER, and the second transaction gets
moved to thread->todo
5) libbinder's handlePolledCommands() only handles
the commands in the current data buffer, so
doesn't see the new transaction
6) the server continues running and issues a new
outgoing transaction. Now, it suddenly finds
the incoming oneway transaction on its thread
todo, and returns that to userspace.
7) userspace does not expect this to happen; it
may be holding a lock while making the outgoing
transaction, and if handling the incoming
transaction requires taking the same lock,
userspace will deadlock.
By queueing the async transaction to the proc
workqueue, we make sure it's only picked up when
a thread is ready for proc work.
Bug: 38201220
Bug: 63075553
Bug: 63079216
Change-Id: I84268cc112f735d7e3173793873dfdb4b268468b
Signed-off-by: Martijn Coenen <maco@android.com>
This allows userspace to request death notifications without
having to worry about getting an immediate callback on the same
thread; one scenario where this would be problematic is if the
death recipient handler grabs a lock that was already taken
earlier (eg as part of a nested transaction).
Bug: 23525545
Test: binderLibTest.DeathNotificationThread passes
Change-Id: I955e16306fe3110dacb9a391ffff1bf869249495
Signed-off-by: Martijn Coenen <maco@android.com>
Because we're not guaranteed that subsequent calls
to poll() will have a poll_table_struct parameter
with _qproc set. When _qproc is not set, poll_wait()
is a noop, and we won't be woken up correctly.
Bug: 64552728
Change-Id: I5b904c9886b6b0994d1631a636f5c5e5f6327950
Test: binderLibTest stops hanging with new test
Signed-off-by: Martijn Coenen <maco@android.com>
This patch moves arm64's struct thread_info from the task stack into
task_struct. This protects thread_info from corruption in the case of
stack overflows, and makes its address harder to determine if stack
addresses are leaked, making a number of attacks more difficult. Precise
detection and handling of overflow is left for subsequent patches.
Largely, this involves changing code to store the task_struct in sp_el0,
and acquire the thread_info from the task struct. Core code now
implements current_thread_info(), and as noted in <linux/sched.h> this
relies on offsetof(task_struct, thread_info) == 0, enforced by core
code.
This change means that the 'tsk' register used in entry.S now points to
a task_struct, rather than a thread_info as it used to. To make this
clear, the TI_* field offsets are renamed to TSK_TI_*, with asm-offsets
appropriately updated to account for the structural change.
Userspace clobbers sp_el0, and we can no longer restore this from the
stack. Instead, the current task is cached in a per-cpu variable that we
can safely access from early assembly as interrupts are disabled (and we
are thus not preemptible).
Both secondary entry and idle are updated to stash the sp and task
pointer separately.
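The per-cpu caching of the current task mentioned above looks roughly
like this (sketch of the C side; the entry.S consumers are omitted):

    DEFINE_PER_CPU(struct task_struct *, __entry_task);

    static void entry_task_switch(struct task_struct *next)
    {
            /* let entry code restore sp_el0 after userspace clobbered it */
            __this_cpu_write(__entry_task, next);
    }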
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This is a modification of Mark Rutland's original patch. Guards to check
whether CONFIG_THREAD_INFO_IN_TASK is used have been inserted. A
get_current() for when CONFIG_THREAD_INFO_IN_TASK is not used has been
added to arch/arm64/include/asm/current.h.
Bug: 38331309
Change-Id: Ic5eae344a7c2baea0864f6ae16be1e9c60c0a74a
(cherry picked from commit c02433dd6de32f042cf3ffe476746b1115b8c096)
Signed-off-by: Zubin Mithra <zsm@google.com>
Shortly we will want to load a percpu variable in the return from
userspace path. We can save an instruction by folding the addition of
the percpu offset into the load instruction, and this patch adds a new
helper to do so.
At the same time, we clean up this_cpu_ptr for consistency. As with
{adr,ldr,str}_l, we change the template to take the destination register
first, and name this dst. Secondly, we rename the macro to adr_this_cpu,
following the scheme of adr_l, and matching the newly added
ldr_this_cpu.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: Iaaf4ea9674ab89289badee216b5305204172895e
(cherry picked from commit 1b7e2296a822dfd2349960addc42a139360ce769)
Signed-off-by: Zubin Mithra <zsm@google.com>
In the absence of CONFIG_THREAD_INFO_IN_TASK, core code maintains
thread_info::cpu, and low-level architecture code can access this to
build raw_smp_processor_id(). With CONFIG_THREAD_INFO_IN_TASK, core code
maintains task_struct::cpu, which for reasons of the header soup is not
accessible to low-level arch code.
Instead, we can maintain a percpu variable containing the cpu number.
For both the old and new implementation of raw_smp_processor_id(), we
read a sysreg into a GPR, add an offset, and load the result. As the
offset is now larger, it may not be folded into the load, but otherwise
the assembly shouldn't change much.
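Concretely, this amounts to a per-cpu integer plus a one-line
raw_smp_processor_id() (sketch):

    DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);

    #define raw_smp_processor_id() (*raw_cpu_ptr(&cpu_number))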
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: I154927b0f9fc0ebbbed88c9958408bbb19cf09de
(cherry picked from commit 57c82954e77fa12c1023e87210d2ede77aaa0058)
Signed-off-by: Zubin Mithra <zsm@google.com>
Subsequent patches will make smp_processor_id() use a percpu variable.
This will make smp_processor_id() dependent on the percpu offset, and
thus we cannot use smp_processor_id() to figure out what to initialise
the offset to.
Prepare for this by initialising the percpu offset based on
current::cpu, which will work regardless of how smp_processor_id() is
implemented. Also, make this relationship obvious by placing this code
together at the start of secondary_start_kernel().
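i.e. the secondary entry path derives the cpu number from the task and
sets its offset before anything can call smp_processor_id(), roughly
(sketch):

    /* in secondary_start_kernel(), before any smp_processor_id() user */
    unsigned int cpu = task_cpu(current);

    set_my_cpu_offset(per_cpu_offset(cpu));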
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: I43304d06602216fbb5b968ff83e0face11e238f5
(cherry picked from commit 580efaa7ccfb8c0790dce4396434f0e5ac8d86ee)
Signed-off-by: Zubin Mithra <zsm@google.com>
When returning from idle, we rely on the fact that thread_info lives at
the end of the kernel stack, and restore this by masking the saved stack
pointer. Subsequent patches will sever the relationship between the
stack and thread_info, and to cater for this we must save/restore sp_el0
explicitly, storing it in cpu_suspend_ctx.
As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
in the same way, which simplifies the code, avoiding pointer chasing on
the restore path (as we no longer need to load thread_info::cpu followed
by the relevant slot in __per_cpu_offset based on this).
This patch stashes both registers in cpu_suspend_ctx.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This is a modification of Mark Rutland's original patch. The differences
from the original patch are as follows :-
- NR_CTX_REGS is set to 13 instead of 12
- x13 and x14 are used as temporary registers to hold sp_el0 and
tpidr_el1 instead of x11 and x12.
- The values are temporarily stashed at offset 88 and 96 of
cpu_suspend_ctx instead of 80 and 88.
The original patch would not apply cleanly and these changes were made
to resolve this.
Bug: 38331309
Change-Id: I4e72aebd51e99d3767487383c14a1ba784312bf1
(cherry picked from commit 623b476fc815464a0241ea7483da7b3580b7d8ac)
Signed-off-by: Zubin Mithra <zsm@google.com>
When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed
before a task is destroyed. To account for this, the stacks are
refcounted, and when manipulating the stack of another task, it is
necessary to get/put the stack to ensure it isn't freed and/or re-used
while we do so.
This patch reworks the arm64 stack walking code to account for this.
When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no
refcounting, and this should only be a structural change that does not
affect behaviour.
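Callers that walk another task's stack then follow this pattern (sketch;
print_entry stands in for whatever callback the caller uses):

    if (!try_get_task_stack(tsk))
            return;         /* task is exiting; its stack may be gone */

    walk_stackframe(tsk, &frame, print_entry, NULL);

    put_task_stack(tsk);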
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: I89c4f53c4fea0d0be2f88221489c0c7f43366810
(cherry picked from commit 9bbd4c56b0b642f04396da378296e68096d5afca)
Signed-off-by: Zubin Mithra <zsm@google.com>
The walk_stackframe function is architecture-specific, with a varying
prototype, and common code should not use it directly. None of its
current users can be built as modules. With THREAD_INFO_IN_TASK, users
will also need to hold a stack reference before calling it.
There's no reason for it to be exported, and it's very easy to misuse,
so unexport it for now.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: Ibe0dca36cc7d35f92c6bc13b373755d82f0eb9ef
(cherry picked from commit 2020a5ae7c8c2c8504565004915017507b135c63)
Signed-off-by: Zubin Mithra <zsm@google.com>
In arm64's die and __die routines we pass around a thread_info, and
subsequently use this to determine the relevant task_struct, and the end
of the thread's stack. Subsequent patches will decouple thread_info from
the stack, and this approach will no longer work.
To figure out the end of the stack, we can use the new generic
end_of_stack() helper. As we only call __die() from die(), and die()
always deals with the current task, we can remove the parameter and have
both acquire current directly, which also makes it clear that __die
can't be called for arbitrary tasks.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: Ie1a96a0a8e244d458a7f147001b64216403e07c4
(cherry picked from commit 876e7a38e8788773aac768091aaa3b42e470c03b)
Signed-off-by: Zubin Mithra <zsm@google.com>
We define current_stack_pointer in <asm/thread_info.h>, though other
files and headers relying upon it do not have this necessary include, and
are thus fragile to changes in the header soup.
Subsequent patches will affect the header soup such that directly
including <asm/thread_info.h> may result in a circular header include in
some of these cases, so we can't simply include <asm/thread_info.h>.
Instead, factor current_stack_pointer out into its own header, and have
all existing users include this explicitly.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: I4d6bc27bef686d0dade1d6abe1ce947cf6c4dfb3
(cherry picked from commit a9ea0017ebe8889dfa136cac2aa7ae0ee6915e1f)
Signed-off-by: Zubin Mithra <zsm@google.com>
Subsequent patches will move the thread_info::{task,cpu} fields, and the
current TI_{TASK,CPU} offset definitions are not used anywhere.
This patch removes the redundant definitions.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This is a modification of Mark Rutland's original patch. Guards to
check whether CONFIG_THREAD_INFO_IN_TASK is used have been inserted.
Bug: 38331309
Change-Id: I95903e0f862fc5dcf89e51926afa22389f2f7cee
(cherry picked from commit 3fe12da4c7fa6491e0fb7c5371716ac7f8ea80a5)
Signed-off-by: Zubin Mithra <zsm@google.com>
We have a comment claiming __switch_to() cares about where cpu_context
is located relative to cpu_domain in thread_info. However arm64 has
never had a thread_info::cpu_domain field, and neither __switch_to nor
cpu_switch_to care where the cpu_context field is relative to others.
Additionally, the init_thread_info alias is never used anywhere in the
kernel, and will shortly become problematic when thread_info is moved
into task_struct.
This patch removes both.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: Ia4769ddcc6fc556e9eb6193d64fc99fe2d9e39ab
(cherry picked from commit dcbe02855f048fdf1e13ebc697e83c8d297f9f5a)
Signed-off-by: Zubin Mithra <zsm@google.com>
When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info()
macro relies on current having been defined prior to its use. However,
not all users of current_thread_info() include <asm/current.h>, and thus
current is not guaranteed to be defined.
When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that
get_current() / current are based upon current_thread_info(), and
<asm/current.h> includes <asm/thread_info.h>. Thus always including
<asm/current.h> would result in circular dependencies on some platforms.
To ensure both cases work, this patch includes <asm/current.h>, but only
when CONFIG_THREAD_INFO_IN_TASK is selected.
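The resulting guard in <linux/thread_info.h> is simply (sketch):

    #ifdef CONFIG_THREAD_INFO_IN_TASK
    #include <asm/current.h>
    #define current_thread_info() ((struct thread_info *)current)
    #endif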
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: Ia981a829798d60a54d4e3eb679d8e24b01228357
(cherry picked from commit dc3d2a679cd8631b8a570fc8ca5f4712d7d25698)
Signed-off-by: Zubin Mithra <zsm@google.com>
Since commit f56141e3e2 ("all arches, signal: move restart_block
to struct task_struct"), thread_info and restart_block have been
logically distinct, yet struct restart_block is still defined in
<linux/thread_info.h>.
At least one architecture (erroneously) uses restart_block as part of
its thread_info, and thus the definition of restart_block must come
before the include of <asm/thread_info>. Subsequent patches in this
series need to shuffle the order of includes and definitions in
<linux/thread_info.h>, and will make this ordering fragile.
This patch moves the definition of restart_block out to its own header.
This serves as generic cleanup, logically separating thread_info and
restart_block, and also makes it easier to avoid fragility.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Bug: 38331309
Change-Id: I4283c87072c092179e2b6c02cbf7248b4a1c2d22
(cherry picked from commit 53d74d056a4e306a72b8883d325b5d853c0618e6)
Signed-off-by: Zubin Mithra <zsm@google.com>
get_task_struct(tsk) no longer pins tsk->stack so all users of
to_live_kthread() should do try_get_task_stack/put_task_stack to protect
"struct kthread" which lives on kthread's stack.
TODO: Kill to_live_kthread(), perhaps we can even kill "struct kthread" too,
and rework kthread_stop(), it can use task_work_add() to sync with the exiting
kernel thread.
Message-Id: <20160629180357.GA7178@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cb9b16bbc19d4aea4507ab0552e4644c1211d130.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bug: 38331309
Change-Id: I2872658e56dcb1ab4173c490ef8f52affa54a404
(cherry picked from commit 23196f2e5f5d810578a772785807dcdc2b9fdce9)
Signed-off-by: Zubin Mithra <zsm@google.com>
There are a few places in the kernel that access stack memory
belonging to a different task. Before we can start freeing task
stacks before the task_struct is freed, we need a way for those code
paths to pin the stack.
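At this point the helpers can be introduced as no-ops; the real pinning
arrives once stacks become independently refcounted (sketch of the
initial definitions):

    static inline void *try_get_task_stack(struct task_struct *tsk)
    {
            return task_stack_page(tsk);
    }

    static inline void put_task_stack(struct task_struct *tsk) {}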
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/17a434f50ad3d77000104f21666575e10a9c1fbd.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bug: 38331309
Change-Id: I414853e9b72ecb0967d5e1cbfc77b4929bf3f4f5
(cherry picked from commit c6c314a613cd7d03fb97713e0d642b493de42e69)
Signed-off-by: Zubin Mithra <zsm@google.com>
If an arch opts in by setting CONFIG_THREAD_INFO_IN_TASK,
then thread_info is defined as a single 'u32 flags' and is the first
entry of task_struct. thread_info::task is removed (it serves no
purpose if thread_info is embedded in task_struct), and
thread_info::cpu gets its own slot in task_struct.
This is heavily based on a patch written by Linus.
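Schematically (sketch; only the fields relevant here are shown, and
their exact placement is simplified):

    struct task_struct {
    #ifdef CONFIG_THREAD_INFO_IN_TASK
            /*
             * Because of header-soup constraints on current_thread_info(),
             * this must be the first element of task_struct.
             */
            struct thread_info      thread_info;
    #endif
            volatile long           state;
            void                    *stack;
    #ifdef CONFIG_THREAD_INFO_IN_TASK
            unsigned int            cpu;    /* moved out of thread_info */
    #endif
    };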
Originally-from: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a0898196f0476195ca02713691a5037a14f2aac5.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bug: 38331309
Change-Id: I25e5a830f2ada5e74fa93661e97e5e701b1b70d2
(cherry picked from commit c65eacbe290b8141554c71b2c94489e73ade8c8d)
Signed-off-by: Zubin Mithra <zsm@google.com>
We currently show:
task: <current> ti: <current_thread_info()> task.ti: <task_thread_info(current)>"
"ti" and "task.ti" are redundant, and neither is actually what we want
to show, which is the base of the thread stack. Change the display to
show the stack pointer explicitly.
Link: http://lkml.kernel.org/r/543ac5bd66ff94000a57a02e11af7239571a3055.1468523549.git.luto@kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 38331309
Change-Id: I7d4b915d38770d0c9384695b2064e4c66b22e94e
(cherry picked from commit 8b70ca65616b3588ea1907e87f0df6d2530350df)
Signed-off-by: Zubin Mithra <zsm@google.com>
The INIT_TASK() initializer was similarly confused about the stack vs
thread_info allocation as the allocators were, and that was fixed in
commit b235beea9e99 ("Clarify naming of thread info/stack allocators").
The task ->stack pointer only incidentally ends up having the same value
as the thread_info, and in fact that will change.
So fix the initial task struct initializer to point to 'init_stack'
instead of 'init_thread_info', and make sure the ia64 definition for
that exists.
This actually makes the ia64 tsk->stack pointer be sensible for the
initial task, but not for any other task. As mentioned in commit
b235beea9e99, that whole pointer isn't actually used on ia64, since
task_stack_page() there just points to the (single) allocation.
All the other architectures seem to have copied the 'init_stack'
definition, even if it tended to be generally unused.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 38331309
Change-Id: Ia96e9225b07e38df2f4af2b9a7eb2aa972d8845a
(cherry picked from commit 7f1a00b6fcd0e3c19beba2e92d157dc0c2cf3494)
Signed-off-by: Zubin Mithra <zsm@google.com>
We've had the thread info allocated together with the thread stack for
most architectures for a long time (since the thread_info was split off
from the task struct), but that is about to change.
But the patches that move the thread info to be off-stack (and a part of
the task struct instead) made it clear how confused the allocator and
freeing functions are.
Because the common case was that we share an allocation with the thread
stack and the thread_info, the two pointers were identical. That
identity then meant that we would have things like
ti = alloc_thread_info_node(tsk, node);
...
tsk->stack = ti;
which certainly _worked_ (since stack and thread_info have the same
value), but is rather confusing: why are we assigning a thread_info to
the stack? And if we move the thread_info away, the "confusing" code
just gets to be entirely bogus.
So remove all this confusion, and make it clear that we are doing the
stack allocation by renaming and clarifying the function names to be
about the stack. The fact that the thread_info then shares the
allocation is an implementation detail, and not really about the
allocation itself.
This is a pure renaming and type fix: we pass in the same pointer, it's
just that we clarify what the pointer means.
The ia64 code that actually only has one single allocation (for all of
task_struct, thread_info and kernel thread stack) now looks a bit odd,
but since "tsk->stack" is actually not even used there, that oddity
doesn't matter. It would be a separate thing to clean that up, I
intentionally left the ia64 changes as a pure brute-force renaming and
type change.
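After the rename, the same assignment reads (sketch):

    unsigned long *stack = alloc_thread_stack_node(tsk, node);
    ...
    tsk->stack = stack;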
Acked-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 38331309
Change-Id: I870b5476fc900c9145134f9dd3ed18a32a490162
(cherry picked from commit b235beea9e996a4d36fed6cfef4801a3e7d7a9a5)
Signed-off-by: Zubin Mithra <zsm@google.com>
Otherwise, lower_fs->ioctl() fails due to inode_owner_or_capable().
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Bug: 63260873
Change-Id: I623a6c7c5f8a3cbd7ec73ef89e18ddb093c43805
Merge 4.4.80 into android-4.4
Changes in 4.4.80
af_key: Add lock to key dump
pstore: Make spinlock per zone instead of global
net: reduce skb_warn_bad_offload() noise
powerpc/pseries: Fix of_node_put() underflow during reconfig remove
crypto: authencesn - Fix digest_null crash
md/raid5: add thread_group worker async_tx_issue_pending_all
drm/vmwgfx: Fix gcc-7.1.1 warning
drm/nouveau/bar/gf100: fix access to upper half of BAR2
KVM: PPC: Book3S HV: Context-switch EBB registers properly
KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
KVM: PPC: Book3S HV: Reload HTM registers explicitly
KVM: PPC: Book3S HV: Save/restore host values of debug registers
Revert "powerpc/numa: Fix percpu allocations to be NUMA aware"
Staging: comedi: comedi_fops: Avoid orphaned proc entry
drm/rcar: Nuke preclose hook
drm: rcar-du: Perform initialization/cleanup at probe/remove time
drm: rcar-du: Simplify and fix probe error handling
perf intel-pt: Fix ip compression
perf intel-pt: Fix last_ip usage
perf intel-pt: Use FUP always when scanning for an IP
perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
xfs: don't BUG() on mixed direct and mapped I/O
nfc: fdp: fix NULL pointer dereference
net: phy: Do not perform software reset for Generic PHY
isdn: Fix a sleep-in-atomic bug
isdn/i4l: fix buffer overflow
ath10k: fix null deref on wmi-tlv when trying spectral scan
wil6210: fix deadlock when using fw_no_recovery option
mailbox: always wait in mbox_send_message for blocking Tx mode
mailbox: skip complete wait event if timer expired
mailbox: handle empty message in tx_tick
mpt3sas: Don't overreach ioc->reply_post[] during initialization
kaweth: fix firmware download
kaweth: fix oops upon failed memory allocation
sched/cgroup: Move sched_online_group() back into css_online() to fix crash
PM / Domains: defer dev_pm_domain_set() until genpd->attach_dev succeeds if present
RDMA/uverbs: Fix the check for port number
libnvdimm, btt: fix btt_rw_page not returning errors
ipmi/watchdog: fix watchdog timeout set on reboot
dentry name snapshots
v4l: s5c73m3: fix negation operator
Make file credentials available to the seqfile interfaces
/proc/iomem: only expose physical resource addresses to privileged users
vlan: Propagate MAC address to VLANs
pstore: Allow prz to control need for locking
pstore: Correctly initialize spinlock and flags
pstore: Use dynamic spinlock initializer
net: skb_needs_check() accepts CHECKSUM_NONE for tx
sched/cputime: Fix prev steal time accouting during CPU hotplug
xen/blkback: don't free be structure too early
xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
tpm: fix a kernel memory leak in tpm-sysfs.c
tpm: Replace device number bitmap with IDR
x86/mce/AMD: Make the init code more robust
r8169: add support for RTL8168 series add-on card.
ARM: dts: n900: Mark eMMC slot with no-sdio and no-sd flags
ipv6: Should use consistent conditional judgement for ip6 fragment between __ip6_append_data and ip6_finish_output
net/mlx4: Remove BUG_ON from ICM allocation routine
drm/msm: Ensure that the hardware write pointer is valid
drm/msm: Verify that MSM_SUBMIT_BO_FLAGS are set
vfio-pci: use 32-bit comparisons for register address for gcc-4.5
irqchip/keystone: Fix "scheduling while atomic" on rt
ASoC: tlv320aic3x: Mark the RESET register as volatile
spi: dw: Make debugfs name unique between instances
ASoC: nau8825: fix invalid configuration in Pre-Scalar of FLL
irqchip/mxs: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
openrisc: Add _text symbol to fix ksym build error
dmaengine: ioatdma: Add Skylake PCI Dev ID
dmaengine: ioatdma: workaround SKX ioatdma version
dmaengine: ti-dma-crossbar: Add some 'of_node_put()' in error path.
ARM64: zynqmp: Fix W=1 dtc 1.4 warnings
ARM64: zynqmp: Fix i2c node's compatible string
ARM: s3c2410_defconfig: Fix invalid values for NF_CT_PROTO_*
ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
usb: gadget: Fix copy/pasted error message
Btrfs: adjust outstanding_extents counter properly when dio write is split
tools lib traceevent: Fix prev/next_prio for deadline tasks
xfrm: Don't use sk_family for socket policy lookups
perf tools: Install tools/lib/traceevent plugins with install-bin
perf symbols: Robustify reading of build-id from sysfs
video: fbdev: cobalt_lcdfb: Handle return NULL error from devm_ioremap
vfio-pci: Handle error from pci_iomap
arm64: mm: fix show_pte KERN_CONT fallout
nvmem: imx-ocotp: Fix wrong register size
sh_eth: enable RX descriptor word 0 shift on SH7734
ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion
HID: ignore Petzl USB headlamp
scsi: fnic: Avoid sending reset to firmware when another reset is in progress
scsi: snic: Return error code on memory allocation failure
ASoC: dpcm: Avoid putting stream state to STOP when FE stream is paused
Linux 4.4.80
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>