virtio_ring currently sends the device (usually a hypervisor)
physical addresses of its I/O buffers. This is okay when DMA
addresses and physical addresses are the same thing, but this isn't
always the case. For example, this never works on Xen guests, and
it is likely to fail if a physical "virtio" device ever ends up
behind an IOMMU or swiotlb.
The immediate use case for me is to enable virtio on Xen guests.
For that to work, we need vring to support DMA address translation
as well as a corresponding change to virtio_pci or to another
driver.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 780bc7903a32edb63be138487fd981694d993610)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I0b21ea8a054df791fe114a10701a1120341806ce
This is a kludge, but no one has come up with a a better idea yet.
We'll introduce DMA API support guarded by vring_use_dma_api().
Eventually we may be able to return true on more and more systems,
and hopefully we can get rid of vring_use_dma_api() entirely some
day.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d26c96c8102549f91eb0bea6196d54711ab52176)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Icfe1fdb85df794d0238fef53123cd10d99d10aa4
[ Upstream commit c38f57da428b033f2721b611d84b1f40bde674a8 ]
If a local process has closed a connected socket and hasn't received a
RST packet yet, then the socket remains in the table until a timeout
expires.
When a vhost_vsock instance is released with the timeout still pending,
the socket is never freed because vhost_vsock has already set the
SOCK_DONE flag.
Check if the close timer is pending and let it close the socket. This
prevents the race which can leak sockets.
Reported-by: Maximilian Riemensberger <riemensberger@cadami.net>
Cc: Graham Whaley <graham.whaley@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 06ec6679fe12cacafce68ab7b509586482a2ae1b)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I6b4564aebecc3938b789e56499a3f9b8bbbeb2a1
[ Upstream commit 834e772c8db0c6a275d75315d90aba4ebbb1e249 ]
If the network stack calls .send_pkt()/.cancel_pkt() during .release(),
a struct vhost_vsock use-after-free is possible. This occurs because
.release() does not wait for other CPUs to stop using struct
vhost_vsock.
Switch to an RCU-enabled hashtable (indexed by guest CID) so that
.release() can wait for other CPUs by calling synchronize_rcu(). This
also eliminates vhost_vsock_lock acquisition in the data path so it
could have a positive effect on performance.
This is CVE-2018-14625 "kernel: use-after-free Read in vhost_transport_send_pkt".
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+bd391451452fb0b93039@syzkaller.appspotmail.com
Reported-by: syzbot+e3e074963495f92a89ed@syzkaller.appspotmail.com
Reported-by: syzbot+d5a0a170c5069658b141@syzkaller.appspotmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 569fc4ffb5de8f12fe01759f0b85098b7b9bba8e)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I87f1e0fbe3fc01ccc18924085e33220373856e29
[ Upstream commit 2d66f997f0545c8f7fc5cf0b49af1decb35170e7 ]
We don't wakeup the virtqueue if the first byte of pending iova range
is the last byte of the range we just got updated. This will lead a
virtqueue to wait for IOTLB updating forever. Fixing by correct the
check and wake up the virtqueue in this case.
Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
Reported-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a21a39a9c37b8f629633d22a29cab69bbce38261)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I2cd64a1d7e195a508a3385e3afe4ecf22632770f
[ Upstream commit 1b15ad683ab42a203f98b67045b40720e99d0e9a ]
DaeRyong Jeong reports a race between vhost_dev_cleanup() and
vhost_process_iotlb_msg():
Thread interleaving:
CPU0 (vhost_process_iotlb_msg) CPU1 (vhost_dev_cleanup)
(In the case of both VHOST_IOTLB_UPDATE and
VHOST_IOTLB_INVALIDATE)
===== =====
vhost_umem_clean(dev->iotlb);
if (!dev->iotlb) {
ret = -EFAULT;
break;
}
dev->iotlb = NULL;
The reason is we don't synchronize between them, fixing by protecting
vhost_process_iotlb_msg() with dev mutex.
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f833209e15bd6cf066e731463308f0058736a74b)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I5f372cf9e2d0cf0002ebe01d692e33c3ff52f5d2
commit 670ae9caaca467ea1bfd325cb2a5c98ba87f94ad upstream.
struct vhost_msg within struct vhost_msg_node is copied to userspace.
Unfortunately it turns out on 64 bit systems vhost_msg has padding after
type which gcc doesn't initialize, leaking 4 uninitialized bytes to
userspace.
This padding also unfortunately means 32 bit users of this interface are
broken on a 64 bit kernel which will need to be fixed separately.
Fixes: CVE-2018-1118
Cc: stable@vger.kernel.org
Reported-by: Kevin Easton <kevin@guarana.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9681c3bdb098f6c87a0422b6b63912c1b90ad197)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Ie5a29c3946792ae0f20e04015ba28c89fd90becb
[ Upstream commit d14d2b78090c7de0557362b26a4ca591aa6a9faa ]
Commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log
when IOTLB is enabled") introduced a regression. The logic was
originally:
if (vq->iotlb)
return 1;
return A && B;
After the patch the short-circuit logic for A was inverted:
if (A || vq->iotlb)
return A;
return B;
This patch fixes the regression by rewriting the checks in the obvious
way, no longer returning A when vq->iotlb is non-NULL (which is hard to
understand).
Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 72de9891b5f46f1f98e7e6243c47076a4b4daa3c)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I63938aaa9f1cf44e4eb1a18693f8c7963eff927e
[ Upstream commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ]
Vq log_base is the userspace address of bitmap which has nothing to do
with IOTLB. So it needs to be validated unconditionally otherwise we
may try use 0 as log_base which may lead to pin pages that will lead
unexpected result (e.g trigger BUG_ON() in set_bit_to_user()).
Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 992dbb1d5a653c49a7f2159f75d9cda0d8248f1f)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I64ef687ec8f55800ce300d0a0eac5473853bf6fd
[ Upstream commit aaa3149bbee9ba9b4e6f0bd6e3e7d191edeae942 ]
We try to hold TX virtqueue mutex in vhost_net_rx_peek_head_len()
after RX virtqueue mutex is held in handle_rx(). This requires an
appropriate lock nesting notation to calm down deadlock detector.
Fixes: 0308813724606 ("vhost_net: basic polling support")
Reported-by: syzbot+7f073540b1384a614e09@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1cb81756b7c3813f7d65d3c6bb1133cb53b69775)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Ia72badca8f5f92095cfad91051fe47f76e6b03b7
commit e9cb4239134c860e5f92c75bf5321bd377bb505b upstream.
We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to
hold mutexes of all virtqueues. This may confuse lockdep to report a
possible deadlock because of trying to hold locks belong to same
class. Switch to use mutex_lock_nested() to avoid false positive.
Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bd3ccdc6f922c6b7db4b7075d1b6596ddb986a98)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I20829e721706c10ef1696de0d78dd99110e40a6b
commit a72b69dc083a931422cc8a5e33841aff7d5312f2 upstream.
The vhost_vsock->guest_cid field is uninitialized when /dev/vhost-vsock
is opened until the VHOST_VSOCK_SET_GUEST_CID ioctl is called.
kvmalloc(..., GFP_KERNEL | __GFP_RETRY_MAYFAIL) does not zero memory.
All other vhost_vsock fields are initialized explicitly so just
initialize this field too.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 258d8549b55e2cd5d73d9ff3d18b39aeeab08e1f)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I230006942ceb015f6549449160205cbb501b8fa6
[ Upstream commit 8b949bef9172ca69d918e93509a4ecb03d0355e0 ]
We check tx avail through vhost_enable_notify() in the past which is
wrong since it only checks whether or not guest has filled more
available buffer since last avail idx synchronization which was just
done by vhost_vq_avail_empty() before. What we really want is checking
pending buffers in the avail ring. Fix this by calling
vhost_vq_avail_empty() instead.
This issue could be noticed by doing netperf TCP_RR benchmark as
client from guest (but not host). With this fix, TCP_RR from guest to
localhost restores from 1375.91 trans per sec to 55235.28 trans per
sec on my laptop (Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz).
Fixes: 030881372460 ("vhost_net: basic polling support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f5755c0e870056dd35c95a0b5c0a038cdb4382ee)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Ia61d229fb5c82e64fd1c30aff06a9f938d20e467
commit 499fde662f1957e3cb8d192a94a099ebe19c714b upstream.
As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():
vsock_stream_has_space
vmci_transport_stream_has_space
vmci_qpair_produce_free_space
qp_lock
qp_acquire_queue_mutex
mutex_lock
Just switch to the new wait API like we did for commit
d9dc8b0f8b4e ("net: fix sleeping for sk_wait_event()").
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: "Jorgen S. Hansen" <jhansen@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6be6e48daabc18858a0a5a9d18fbd399714fc9e4)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ie0ac0685071c2034c9565e2554ff5e41f6cfd619
[ Upstream commit 380feae0def7e6a115124a3219c3ec9b654dca32 ]
Otherwise we'll leave the packets queued until releasing vsock device.
E.g., if guest is slow to start up, resulting ETIMEDOUT on connect, guest
will get the connect requests from failed host sockets.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 98d20e5902667f9b44a75116041a630823f81e46)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I0685ec15925c662d4a51cdd24fc4f26a8c7ff723
[ Upstream commit 16320f363ae128d9b9c70e60f00f2a572f57c23d ]
To allow canceling all packets of a connection.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 482b3f92aea249b031ba0bc24df73f3c48f1f5c2)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I1a06010bbcbcf03e1557d053b773cd6a2c1c0b02
[ Upstream commit 36d277bac8080202684e67162ebb157f16631581 ]
So that we can cancel a queued pkt later if necessary.
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6f1848e778d9a9f9dd89abee53d2a688277d1784)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I29b8bacf06eaf57a91291c1d088802c0dea7fed0
commit cda8bba0f99d25d2061c531113c14fa41effc3ae upstream.
Currently, under certain circumstances vhost_init_is_le does just a part
of the initialization job, and depends on vhost_reset_is_le being called
too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
when vq->private_data is NULL. This is not only counter intuitive, but
also real a problem because it breaks vhost_net. The bug was introduced to
vhost_net with commit 2751c9882b ("vhost: cross-endian support for
legacy devices"). The symptom is corruption of the vq's used.idx field
(virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
shutdown on a vq with pending descriptors.
Let us make sure the outcome of vhost_init_is_le never depend on the state
it is actually supposed to initialize, and fix virtio_net by removing the
reset from vhost_vq_init_access.
With the above, there is no reason for vhost_reset_is_le to do just half
of the job. Let us make vhost_reset_is_le reinitialize is_le.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reported-by: Michael A. Tebolt <miket@us.ibm.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: commit 2751c9882b ("vhost: cross-endian support for legacy devices")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Michael A. Tebolt <miket@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1594edd9ea0d75ef106bffc23c2b07b509f3301c)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I5436f3410dc32fe059c904c8001fbe05feae756d
[ Upstream commit 0516ffd88fa0d006ee80389ce14a9ca5ae45e845 ]
Propagate the error when vhost_vq_init_access() fails and set
vq->private_data to NULL.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ae36f6a65af6f4eaca01cc5b68d8ecb266dbcc17)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I1e062ae003b92d000967921014a31bfc574632d8
[ Upstream commit 6c083c2b8a0a110cad936bc0a2c089f0d8115175 ]
Multi vsocks may setup the same cid at the same time.
Signed-off-by: Gao feng <omarapazanadi@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 2d5a1b31799efcd37a57fe4d2f492d8dc2a0a334)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ic2fa852a2080a4df1cb3b823c02d65a354c15b34
local_addr.svm_cid is host cid. We should check guest cid instead,
which is remote_addr.svm_cid. Otherwise we end up resetting all
connections to all guests.
Cc: stable@vger.kernel.org [4.8+]
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c4587631c7bad47c045e081d1553cd73a23be59a)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ib9bf9e1f0befa71f4ebc0ff0cdcef7115a41b110
commit f83f12d660d11718d3eed9d979ee03e83aa55544 upstream.
These fields are 64 bit, using le32_to_cpu and friends
on these will not do the right thing.
Fix this up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e80ceb2da52e0aae8e0ae9632c3abbfdd579cf61)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ieccae61162243040f9ad29dc465a50be7f45ee62
If a pending socket is marked as rejected, we will decrease the
sk_ack_backlog twice. So don't decrement it for rejected sockets
in vsock_pending_work().
Testing of the rejected socket path was done through code
modifications.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1190cfdb1a19d89561ae51cff7d9c2ead24b3ebe)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ic610a13e81f53fc570e037d0685c1263dc646e70
Remove unnecessary use of enable/disable callback notifications
and the incorrect more space available check.
The virtio_transport_tx_work handles when the TX virtqueue
has more buffers available.
Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 21bc54fc0cdc31de72b57d2b3c79cf9c2b83cf39)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Id76bb00fb4393add39d876e29b0e20617c6033e7
Stash the packet length in a local variable before handing over
ownership of the packet to virtio_transport_recv_pkt() or
virtio_transport_free_pkt().
This patch solves the use-after-free since pkt is no longer guaranteed
to be alive.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3fda5d6e580193fa005014355b3a61498f1b3ae0)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I2a6a8b2eb1b647645ff7c76a37f61dce3b0fab9f
Use kvfree() instead of open-coding it.
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b226acab2f6aaa45c2af27279b63f622b23a44bd)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ie270449a727baf74ee547137928e166e772b62ad
vringh is pulled in by caif and mic, but the other
vhost config does not need to be there.
In particular, it makes no sense to have vhost net/scsi/sock
under caif/mic.
Create a separate Kconfig file and put vringh bits there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 4d93824561057d54712066544609dfc7453b210f)
[astrachan: Backported around no mic driver on 4.4]
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I6e630228bf91a8569bc665e9ad5d399eca8f7384
vringh isn't used by vhost net or scsi - it's used
by CAIF only at the moment. Drop the dependency.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6190efb08c16dcd68c64b096a28f47ab33f017d7)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I9db24d3ced664637cffcf27fde8a1c08962bbebe
vringh isn't used by vhost net or scsi - it's used
by CAIF only at the moment. Drop the dependency.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6190efb08c16dcd68c64b096a28f47ab33f017d7)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I2373bf9a5cd21d350af3aca957df811c6aaeae63
Detect and fail early if long wrap around is triggered.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ec33d031a14b3c5dd516627139c9550350dbba3e)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Id71c1ea0355ce3e403bb5865dc3056d197fe218b
Enable virtio-vsock and vhost-vsock.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 304ba62fd4e670c1a5784585da0fac9f7309ef6c)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I0b0f08cd28a94516903dbf3452e5999375e0f85a
VM sockets vhost transport implementation. This driver runs on the
host.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 433fc58e6bf2c8bd97e57153ed28e64fd78207b8)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Id90d852ffd498a7d89075cddb6d8ed0b9af5e69f
VM sockets virtio transport implementation. This driver runs in the
guest.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0ea9e1d3a9e3ef7d2a1462d3de6b95131dc7d872)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ib12e1e4d21183ac3d917316566694758717596bd
This module contains the common code and header files for the following
virtio_transporto and vhost_vsock kernel modules.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 06a8fc78367d070720af960dcecec917d3ae5f3b)
[astrachan: Backported around stable backport 62209d1 ("vsock: split
dwork to avoid reinitializations")]
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I723c073db804663ad4bf83b657c72b16cbdb220a
The virtio transport will implement graceful shutdown and the related
SO_LINGER socket option. This requires orphaning the sock but keeping
it in the table of connections after .release().
This patch adds the vsock_remove_sock() function and leaves it up to the
transport when to remove the sock.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6773b7dc39f165bd9d824b50ac52cbb3f87d53c8)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I889cdbc0b1de8d2ff54a70ab7a6b4623edb3de06
struct vsock_transport contains function pointers called by AF_VSOCK
core code. The transport may want its own transport-specific function
pointers and they can be added after struct vsock_transport.
Allow the transport to fetch vsock_transport. It can downcast it to
access transport-specific function pointers.
The virtio transport will use this.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0b01aeb3d2fbf16787f0c9629f4ca52ae792f732)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I442706ae71dc14c70fc4033d9719134c2d034509
There are several places where the listener and pending or accept queue
child sockets are accessed at the same time. Lockdep is unhappy that
two locks from the same class are held.
Tell lockdep that it is safe and document the lock ordering.
Originally Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> sent a similar
patch asking whether this is safe. I have audited the code and also
covered the vsock_pending_work() function.
Suggested-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4192f672fae559f32d82de72a677701853cc98a7)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I0cb7ee964057e9338971e1a2043ae17557feaec7
This patch tries to implement an device IOTLB for vhost. This could be
used with userspace(qemu) implementation of DMA remapping
to emulate an IOMMU for the guest.
The idea is simple, cache the translation in a software device IOTLB
(which is implemented as an interval tree) in vhost and use vhost_net
file descriptor for reporting IOTLB miss and IOTLB
update/invalidation. When vhost meets an IOTLB miss, the fault
address, size and access can be read from the file. After userspace
finishes the translation, it writes the translated address to the
vhost_net file to update the device IOTLB.
When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq
addresses set by ioctl are treated as iova instead of virtual address and
the accessing can only be done through IOTLB instead of direct userspace
memory access. Before each round or vq processing, all vq metadata is
prefetched in device IOTLB to make sure no translation fault happens
during vq processing.
In most cases, virtqueues are contiguous even in virtual address space.
The IOTLB translation for virtqueue itself may make it a little
slower. We might add fast path cache on top of this patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
[mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ]
[mst: fix build warnings ]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[ weiyj.lk: missing unlock on error ]
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
(cherry picked from commit 6b1e6cc7855b09a0a9bfa1d9f30172ba366f161c)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + vsock adb tunnel
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I10e4e64d6bc9b36a0d9b444c2319e290921c63c6
Current pre-sorted memory region array has some limitations for future
device IOTLB conversion:
1) need extra work for adding and removing a single region, and it's
expected to be slow because of sorting or memory re-allocation.
2) need extra work of removing a large range which may intersect
several regions with different size.
3) need trick for a replacement policy like LRU
To overcome the above shortcomings, this patch convert it to interval
tree which can easily address the above issue with almost no extra
work.
The patch could be used for:
- Extend the current API and only let the userspace to send diffs of
memory table.
- Simplify Device IOTLB implementation.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a9709d6874d55130663567577a9b05c35138cc6b)
[astrachan: Backported around stable backport 711df71 ("vhost_net: stop
device during reset owner")]
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I51c7c7229908a5ce1f082a80eeda5a01e85c2234
This patch introduces vhost memory accessors which were just wrappers
for userspace address access helpers. This is a requirement for vhost
device iotlb implementation which will add iotlb translations in those
accessors.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit bfe2bc512884d0b1c5297a15350f940ca80e439b)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Ia67c171384109e646f55027a71dd8df9c6b9c61a
We don't stop rx polling socket during rx processing, this will lead
unnecessary wakeups from under layer net devices (E.g
sock_def_readable() form tun). Rx will be slowed down in this
way. This patch avoids this by stop polling socket during rx
processing. A small drawback is that this introduces some overheads in
light load case because of the extra start/stop polling, but single
netperf TCP_RR does not notice any change. In a super heavy load case,
e.g using pktgen to inject packet to guest, we get about ~8.8%
improvement on pps:
before: ~1240000 pkt/s
after: ~1350000 pkt/s
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8241a1e466cd56e6c10472cac9c1ad4e54bc65db)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Ia51c61a6c1976b6f6406342f369ffc365d964078
The vsock_transport structure is never modified, so declare it as const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 56130915bbe31656c80f7493d28536693f8de0e2)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I59ffded185fbf22aaeb39753870ef9c866ab1a2a
We use spinlock to synchronize the work list now which may cause
unnecessary contentions. So this patch switch to use llist to remove
this contention. Pktgen tests shows about 5% improvement:
Before:
~1300000 pps
After:
~1370000 pps
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 04b96e5528ca97199b429810fe963185a67dd40e)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Icf032cf2010eaedd92dc5906d454274709b4a9b4
We used to implement the work flushing through tracking queued seq,
done seq, and the number of flushing. This patch simplify this by just
implement work flushing through another kind of vhost work with
completion. This will be used by lockless enqueuing patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7235acdb1144460d9f520f0d931f3cbb79eb244c)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: Id3903050706dd734a0217c56ad8ca99b2b22471e
If skb_recv_datagram returns an skb, we should ignore the err
value returned. Otherwise, datagram receives will return EAGAIN
when they have to wait for a datagram.
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9c995cc9a206a008699da82f6cd01e9b2615649a)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: I69d487529656cb2fe8be9c2cef0db440d4db5cac
When a thread is prepared for waiting by calling prepare_to_wait, sleeping
is not allowed until either the wait has taken place or finish_wait has
been called. The existing code in af_vsock imposed unnecessary no-sleep
assumptions to a broad list of backend functions.
This patch shrinks the influence of prepare_to_wait to the area where it
is strictly needed, therefore relaxing the no-sleep restriction there.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f7f9b5e7f8eccfd68ffa7b8d74b07c478bb9e7f0)
[astrachan: Backported around stable backport 3223ea1 ("vsock: use new
wait API for vsock_stream_sendmsg()")]
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Cody Schuffelen <schuffelen@google.com>
Change-Id: Ic1d7bae4b1187ad48194bf4d0b1dd09ab0275734
This patch tries to poll for new added tx buffer or socket receive
queue for a while at the end of tx/rx processing. The maximum time
spent on polling were specified through a new kind of vring ioctl.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0308813724606549436d30efd877a80c8e00790e)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I93db6c14e39b2db04e047ee2e0dce3af5436d95e
This patch introduces a helper which will return true if we're sure
that the available ring is empty for a specific vq. When we're not
sure, e.g vq access failure, return false instead. This could be used
for busy polling code to exit the busy loop.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d4a60603fa0b42012decfa058dfa44cffde7a10c)
Bug: 121166534
Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS
Signed-off-by: Alistair Strachan <astrachan@google.com>
Change-Id: I1d4e972543470530a5bc37e19f4199b1f345e042