android_kernel_oneplus_msm8998/net/tipc
Jon Paul Maloy 59e0cd110f tipc: fix socket timer deadlock
commit f1d048f24e66ba85d3dabf3d076cefa5f2b546b0 upstream.

We sometimes observe a 'deadly embrace' type deadlock occurring
between mutually connected sockets on the same node. This happens
when the one-hour peer supervision timers happen to expire
simultaneously in both sockets.

The scenario is as follows:

CPU 1:                          CPU 2:
--------                        --------
tipc_sk_timeout(sk1)            tipc_sk_timeout(sk2)
  lock(sk1.slock)                 lock(sk2.slock)
  msg_create(probe)               msg_create(probe)
  unlock(sk1.slock)               unlock(sk2.slock)
  tipc_node_xmit_skb()            tipc_node_xmit_skb()
    tipc_node_xmit()                tipc_node_xmit()
      tipc_sk_rcv(sk2)                tipc_sk_rcv(sk1)
        lock(sk2.slock)                 lock(sk1.slock)
        filter_rcv()                    filter_rcv()
          tipc_sk_proto_rcv()             tipc_sk_proto_rcv()
            msg_create(probe_rsp)           msg_create(probe_rsp)
            tipc_sk_respond()               tipc_sk_respond()
              tipc_node_xmit_skb()            tipc_node_xmit_skb()
                tipc_node_xmit()                tipc_node_xmit()
                  tipc_sk_rcv(sk1)                tipc_sk_rcv(sk2)
                    lock(sk1.slock)                 lock(sk2.slock)
                    ===> DEADLOCK                   ===> DEADLOCK

Further analysis reveals that there are three different locations in the
socket code where tipc_sk_respond() is called within the context of the
socket lock, with ensuing risk of similar deadlocks.

We now solve this by passing a buffer queue along with all upcalls where
sk_lock.slock may potentially be held. Response or rejected message
buffers are accumulated into this queue instead of being sent out
directly, and only sent once we know we are safely outside the slock
context.
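
The deferred-send idea described above can be sketched in a small user-space
model. This is an illustration only, not the kernel patch: the names
sock_model, msg_queue, sk_rcv_locked() and deliver() are hypothetical, and the
lock is modelled as a plain flag, so the sketch shows the ordering of lock and
send operations rather than real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 8

/* Hypothetical message: source socket, destination socket, probe or response. */
struct msg { int src; int dst; bool is_rsp; };

/* Hypothetical transmit queue handed to code that runs under a socket lock. */
struct msg_queue { struct msg items[QCAP]; size_t len; };

/* Hypothetical socket model; 'locked' stands in for sk_lock.slock. */
struct sock_model { bool locked; int probes_seen; int rsps_seen; };

static int locks_held;      /* locks currently held by this context */
static int max_locks_held;  /* high-water mark, used to check the invariant */

static void sk_lock(struct sock_model *sk)
{
    assert(!sk->locked);    /* taking a held lock would be the deadlock */
    sk->locked = true;
    if (++locks_held > max_locks_held)
        max_locks_held = locks_held;
}

static void sk_unlock(struct sock_model *sk)
{
    assert(sk->locked);
    sk->locked = false;
    locks_held--;
}

static void queue_push(struct msg_queue *q, struct msg m)
{
    assert(q->len < QCAP);
    q->items[q->len++] = m;
}

/* Runs with sk locked: a probe response is queued, never sent from here. */
static void sk_rcv_locked(struct sock_model *sk, struct msg m,
                          struct msg_queue *xmitq)
{
    assert(sk->locked);
    if (m.is_rsp) {
        sk->rsps_seen++;
    } else {
        struct msg rsp = { .src = m.dst, .dst = m.src, .is_rsp = true };
        sk->probes_seen++;
        queue_push(xmitq, rsp);
    }
}

/* Deliver a message; queued responses go out only after the lock is dropped. */
static void deliver(struct sock_model *socks, struct msg first)
{
    struct msg_queue xmitq = { .len = 0 };

    queue_push(&xmitq, first);
    for (size_t i = 0; i < xmitq.len; i++) {
        struct msg m = xmitq.items[i];
        struct sock_model *sk = &socks[m.dst];

        sk_lock(sk);
        sk_rcv_locked(sk, m, &xmitq);
        sk_unlock(sk);      /* the next send happens outside this lock */
    }
}
```

In this model a context never holds more than one socket lock at a time, so
the lock ordering shown in the scenario above cannot arise.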

Reported-by: GUNA <gbalasun@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-30 05:49:28 +02:00
addr.c
addr.h
bcast.c tipc: unlock in error path 2016-03-03 15:07:07 -08:00
bcast.h tipc: clean up unused code and structures 2015-10-24 06:56:47 -07:00
bearer.c tipc: clean up unused code and structures 2015-10-24 06:56:47 -07:00
bearer.h tipc: clean up unused code and structures 2015-10-24 06:56:47 -07:00
core.c tipc: make dist queue pernet 2017-04-30 05:49:27 +02:00
core.h tipc: make dist queue pernet 2017-04-30 05:49:27 +02:00
discover.c tipc: let neighbor discoverer tranmsit consumable buffers 2015-10-24 06:56:44 -07:00
discover.h
eth_media.c
ib_media.c
Kconfig
link.c tipc: move linearization of buffers to generic code 2016-09-24 10:07:35 +02:00
link.h tipc: clean up unused code and structures 2015-10-24 06:56:47 -07:00
Makefile
msg.c tipc: let broadcast packet reception use new link receive function 2015-10-24 06:56:37 -07:00
msg.h tipc: let broadcast packet reception use new link receive function 2015-10-24 06:56:37 -07:00
name_distr.c tipc: fix random link resets while adding a second bearer 2017-04-30 05:49:28 +02:00
name_distr.h
name_table.c
name_table.h
net.c tipc: create broadcast transmission link at namespace init 2015-10-24 06:56:27 -07:00
net.h
netlink.c
netlink.h
netlink_compat.c tipc: fix nl compat regression for link statistics 2016-09-15 08:27:49 +02:00
node.c tipc: correct error in node fsm 2017-04-30 05:49:27 +02:00
node.h tipc: clean up unused code and structures 2015-10-24 06:56:47 -07:00
server.c
server.h
socket.c tipc: fix socket timer deadlock 2017-04-30 05:49:28 +02:00
socket.h tipc: clean up socket layer message reception 2015-07-26 16:31:50 -07:00
subscr.c tipc: fix nullptr crash during subscription cancel 2016-09-15 08:27:44 +02:00
subscr.h
sysctl.c
udp_media.c tipc: make sure IPv6 header fits in skb headroom 2017-04-30 05:49:27 +02:00