In selinux_ip_output() we always label packets based on the parent
socket. While this approach works in almost all cases, it doesn't
work in the case of TCP SYN-ACK packets when the correct label is not
the label of the parent socket, but rather the label of the larval
socket represented by the request_sock struct.
Unfortunately, since the request_sock isn't queued on the parent
socket until *after* the SYN-ACK packet is sent, we can't look up the
request_sock to determine the correct label for the packet; at this
point in time the best we can do is simply pass/NF_ACCEPT the packet.
It must be said that simply passing the packet without any explicit
labeling action, while far from ideal, is not terrible as the SYN-ACK
packet will inherit any IP option based labeling from the initial
connection request so the label *should* be correct and all our
access controls remain in place so we shouldn't have to worry about
information leaks.
Reported-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Tested-by: Janak Desai <Janak.Desai@gtri.gatech.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
These were implemented by Andrew Jackson and Laurence Evans but not
previously included in-tree.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The operation can now fail, so change its return type to int.
Remove the inline wrapper while we're changing the signature.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Currently a higher priority client can remove a lower priority
client's filter with equal match-expression. This might happen if (a)
the higher priority client has a double-free bug, or (b) another
client with sufficient priority replaced and then removed an equal
filter, allowing the low priority client to insert an equal filter.
In neither case does it actually make sense to carry out the removal;
we should say the filter doesn't exist, as the filter currently
present is not the one that the high-priority client is referring to.
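As an illustration only, the intended rule can be sketched in
user-space C. All names below (rm_filter, rm_filter_remove, RM_PRI_*)
are hypothetical and are not the driver's API; the sketch assumes each
stored filter records the priority it was inserted with.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical priority levels, lowest first. */
enum rm_priority {
	RM_PRI_HINT,
	RM_PRI_MANUAL,
	RM_PRI_REQUIRED,
};

/* Hypothetical stored filter: a match key plus the owner's priority. */
struct rm_filter {
	unsigned int match;
	enum rm_priority priority;
	int in_use;
};

/*
 * Remove the filter with the given match, but only if it was inserted
 * at exactly the caller's priority.  A filter with an equal match but
 * a different priority is not the caller's filter, so report -ENOENT
 * and leave it in place.
 */
static int rm_filter_remove(struct rm_filter *table, size_t n,
			    unsigned int match, enum rm_priority priority)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!table[i].in_use || table[i].match != match)
			continue;
		if (table[i].priority != priority)
			return -ENOENT;
		table[i].in_use = 0;
		return 0;
	}
	return -ENOENT;
}
```

With this rule a REQUIRED-priority caller asking to remove a match that
is currently held by a HINT-priority filter gets -ENOENT, and the
HINT filter stays installed.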
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Change all the 'stack' naming to 'auto' (or other meaningful term);
the device address list is based on more than just what the network
stack wants, and the no-match filters aren't really what the stack
wants at all.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
MAC filters inserted automatically by the driver, based on the device
address list (EF10) or no-match filters (Siena), should be overridable
at MANUAL or REQUIRED priority. Currently they themselves have
REQUIRED priority and this requires some odd special-casing.
We also can't reliably tell whether such a MAC filter has or has
not been overridden. We just remember that it is wanted by the
stack (RX_STACK flag).
Add another priority level, AUTO, between HINT and MANUAL, and
use this for the automatic filters while they have not been
overridden. Remove the RX_STACK flag. Add an RX_OVER_AUTO
flag which is set only when an AUTO filter has been overridden
(or was requested to be inserted while a higher-priority filter
existed).
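A minimal user-space sketch of the priority ordering and the
RX_OVER_AUTO bookkeeping described above. The type and function names
(ovr_slot, ovr_slot_insert, OVR_*) are hypothetical, the model tracks a
single filter slot only, and the -EPERM error code is chosen purely for
illustration.

```c
#include <errno.h>

/* Priority levels, lowest first; AUTO sits between HINT and MANUAL. */
enum ovr_priority {
	OVR_PRI_HINT,
	OVR_PRI_AUTO,
	OVR_PRI_MANUAL,
	OVR_PRI_REQUIRED,
};

#define OVR_FLAG_RX_OVER_AUTO 0x1u

/* Hypothetical model of one filter slot for a given match. */
struct ovr_slot {
	int in_use;
	enum ovr_priority priority;
	unsigned int flags;
};

static int ovr_slot_insert(struct ovr_slot *slot, enum ovr_priority pri)
{
	if (!slot->in_use) {
		slot->in_use = 1;
		slot->priority = pri;
		slot->flags = 0;
		return 0;
	}
	if (pri == OVR_PRI_AUTO && slot->priority > OVR_PRI_AUTO) {
		/* AUTO requested while a higher-priority filter exists:
		 * keep the existing filter and remember the request. */
		slot->flags |= OVR_FLAG_RX_OVER_AUTO;
		return 0;
	}
	if (pri < slot->priority)
		return -EPERM;	/* illustrative error code */
	if (slot->priority == OVR_PRI_AUTO && pri > OVR_PRI_AUTO)
		slot->flags |= OVR_FLAG_RX_OVER_AUTO;	/* AUTO overridden */
	slot->priority = pri;
	return 0;
}
```

The flag is set in exactly the two situations named above, so a reader
of the slot can tell whether an automatic filter is still wanted
underneath a MANUAL or REQUIRED one.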
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The EF10 implementation already does this, and it makes more logical
sense to group the RSS hash key and indirection table together.
Rename the operation to rx_push_rss_config.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
In case of certain hardware and firmware errors it can be useful to
have more context than just the file and line number.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The SFC9100 family has only one clock per controller, shared by all
functions. Therefore only create a clock device under the primary
function, and make all other functions refer to the primary's clock
device.
Since PTP functionality is limited to port 0 and PF 0 on the earlier
SFN[56]322F boards, and we also set the primary flag for that
function, we can make the creation of a clock device conditional only
on this flag.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The primary function of an EF10 controller will share its clock
device with other functions in the same domain (which we call
secondary functions). To this end, we need to associate functions
on the same controller.
We do not control probe order, so allow primary and secondary
functions to appear in any order. Maintain global lists of all
primary functions and of unassociated secondary functions,
and a list of secondary functions on each primary function.
Use the VPD serial number to tell whether functions are part of the
same controller. VPD will not be readable by virtual functions, so
this may need to be revisited later.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The EF10 firmware can optionally insert RX timestamps in the packet
prefix. These only include the clock minor value. We must also
enable periodic time sync events on each event queue which provide
the high bits of the clock value.
[bwh: Combined and rebased several changes.
Added the above description and some sanity checks for inline vs
separate timestamps.
Changed efx_rx_skb_attach_timestamp() to read the packet prefix
from the skb head area.]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We can potentially pull the entire packet contents into the head area
and then free the page it was in. In order to read an inline
timestamp safely, we need to copy the prefix into the head area as
well.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
I added efx_ptp_get_mode() to avoid moving the definition for
efx_ptp_data, since the current PTP mode is needed for
siena.c:siena_set_ptp_hwtstamp.
[bwh: Also move the rx_filters mask, and add kernel-doc]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The clock minor tick on the SFC9100 family is 2^-27 s, not 1 ns.
There are also various pipeline delays which we need to correct for
when interpreting timestamps.
We query the firmware for the clock format and corrections at run-time.
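For illustration, converting a minor value in units of 2^-27 s to
nanoseconds and applying a correction could look like the sketch
below. The function names are hypothetical, the correction is passed
in as a plain parameter rather than queried from firmware, and
rounding to nearest is an assumption of this sketch.

```c
#include <stdint.h>

#define TS_MINOR_SHIFT	27
#define TS_NSEC_PER_SEC	1000000000ULL

/*
 * Convert a minor value in units of 2^-27 s to nanoseconds, rounding
 * to nearest.  minor is expected to be below 2^27, so the 64-bit
 * product cannot overflow.
 */
static uint32_t ts_minor_to_ns(uint32_t minor)
{
	uint64_t scaled = (uint64_t)minor * TS_NSEC_PER_SEC;

	return (uint32_t)((scaled + (1ULL << (TS_MINOR_SHIFT - 1)))
			  >> TS_MINOR_SHIFT);
}

/*
 * Build a nanosecond timestamp from seconds (major), the minor value
 * and a signed pipeline-delay correction in nanoseconds.
 */
static int64_t ts_to_ns(uint32_t major, uint32_t minor,
			int32_t correction_ns)
{
	return (int64_t)major * (int64_t)TS_NSEC_PER_SEC +
	       ts_minor_to_ns(minor) + correction_ns;
}
```

For example, a minor value of 1 << 26 is half a second, so
ts_minor_to_ns(1u << 26) yields 500000000.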
[bwh: Combined and rebased several changes]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We'll be sharing clocks between multiple functions with their own MAC
addresses. The name field is now documented as 'A short "friendly
name" to identify the clock ...' and '... not meant to be a unique
id.' So use the name 'sfc'.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We need a dedicated channel on Siena to ensure we can match up
the separate RX and timestamp events for each PTP packet. We won't
do this for EF10 as timestamps are delivered inline.
Pass a channel index of 0 to MC_CMD_PTP_OP_ENABLE when there is no
dedicated channel.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The MC firmware will return error MC_CMD_ERR_ENOSPC if filter
insertion fails due to lack of resources. The net driver's filter
implementation for Falcon-architecture returns EBUSY. They should
behave consistently, so for EF10 change ENOSPC to EBUSY.
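The translation amounts to the following, shown as a stand-alone
sketch; the helper name is hypothetical and the real change is made
where the EF10 code handles the MCDI result.

```c
#include <errno.h>

/*
 * Map the result of a filter insertion so that "out of resources" is
 * reported as -EBUSY; every other result is passed through unchanged.
 */
static int flt_insert_rc(int rc)
{
	return rc == -ENOSPC ? -EBUSY : rc;
}
```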
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
efx_flush_all() is a really misleading name - it has nothing to do
with e.g. flushing DMA queues. Since it's called immediately after
efx_stop_port() and is highly dependent on what that does, combine
the two functions.
Update comments to explain what this is doing a little better.
Also update a related and erroneous comment in efx_start_port().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Split each of efx_mcdi_rpc, efx_mcdi_rpc_finish, and efx_mcdi_rpc_async into
a normal and a _quiet version; made the former log MCDI errors with
netif_err (and include the raw MCDI error code), and the latter never log
them at all. Changed various callers; any where some errors are expected
(but others are not) call the _quiet version and then if necessary log the
MCDI error themselves. Said logging is done by new efx_mcdi_display_error.
Callers of efx_mcdi_rpc*_quiet functions which may want to log the error
need to ensure that their outbuf is big enough to hold an MCDI error; to
this end, they now use MCDI_DECLARE_BUF_OUT_OR_ERR, which always allocates
at least 8 bytes.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We don't directly control RX ingress on Siena or any later
controllers, and so we cannot prevent packets from entering the RX
datapath while the RX queues are not set up. This results in
the hardware incrementing RX_NODESC_DROP_CNT, but it's not an
error and we should not include it in error stats.
When bringing an interface up or down, pull (or wait for) stats and
count the number of packets that were dropped while the interface was
down. Subtract this from the reported RX dropped count.
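A simplified user-space model of this accounting is sketched below.
The struct and function names are hypothetical; the sketch assumes the
raw hardware counter has already been refreshed before each state
transition.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for the no-descriptor drop counter. */
struct nd_stats {
	uint64_t hw_drops;		/* raw hardware counter */
	uint64_t drops_while_down;	/* drops seen while down */
	uint64_t snapshot;		/* raw counter at last transition */
	int is_up;
};

/* Call when the interface changes state, after refreshing hw_drops. */
static void nd_stats_transition(struct nd_stats *s, int going_up)
{
	if (going_up && !s->is_up)
		s->drops_while_down += s->hw_drops - s->snapshot;
	s->snapshot = s->hw_drops;
	s->is_up = going_up;
}

/* Reported drop count: excludes packets dropped while down. */
static uint64_t nd_stats_reported(const struct nd_stats *s)
{
	uint64_t raw = s->is_up ? s->hw_drops : s->snapshot;

	return raw - s->drops_while_down;
}
```

Only drops that occur while the interface is up contribute to the
reported value; the count stays frozen while the interface is down.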
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The addition of RX event merging support means we don't reliably
detect dropped RX events now. Currently we will only detect them if
the previous event for the RX queue had the CONT bit set.
Only accept RX completion events as merged if the
GET_CAPABILITIES_OUT_RX_BATCHING bit is set in datapath_caps (which it
won't be for the low-latency datapath) and the CONT bit is not set on
the event.
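Expressed as a predicate, the acceptance condition is roughly the
following. The function name and the bit position of the capability
flag are placeholders for this sketch, not the values from the MCDI
headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit; the real position comes from the MCDI headers. */
#define EV_CAP_RX_BATCHING	(1u << 0)

/*
 * An RX completion event may be treated as merged only if the
 * datapath advertises RX batching and the event's CONT bit is clear.
 */
static bool ev_rx_is_merged(uint32_t datapath_caps, bool cont_bit)
{
	return (datapath_caps & EV_CAP_RX_BATCHING) != 0 && !cont_bit;
}
```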
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
To run BISTs the MC goes down into a special mode where it will only
respond to MCDI from the testing PF, and TX, RX and event queues are
torn down. Other PFs get a message as it goes down to tell them it's
going down.
When the other PFs get this message, they check the soft status
register to tell when the MC has rebooted after BIST mode and they can
start recovery.
[bwh: Convert the test result to 1 or -1 as for earlier NICs]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
clk_prepare_enable() may fail, so let's check its return value and propagate it
in the case of error.
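The pattern is sketched below in user-space C. stub_clk_prepare_enable()
is a stand-in written for this example so that it compiles outside the
kernel; stub_probe() is likewise hypothetical and does not reproduce
the driver's probe function.

```c
#include <errno.h>

/* Hypothetical clock object for the sketch. */
struct stub_clk {
	int fail;	/* force the stub to fail */
	int enabled;
};

/* Stand-in for clk_prepare_enable(): 0 on success, -errno on failure. */
static int stub_clk_prepare_enable(struct stub_clk *clk)
{
	if (clk->fail)
		return -EIO;
	clk->enabled = 1;
	return 0;
}

static int stub_probe(struct stub_clk *clk)
{
	int ret;

	ret = stub_clk_prepare_enable(clk);
	if (ret)
		return ret;	/* propagate instead of ignoring */

	/* ... the rest of device setup would follow here ... */
	return 0;
}
```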
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
A guest can cause a BUG_ON() leading to a host kernel crash.
When the guest writes to the ICR to request an IPI while in x2apic
mode, the destination is read from ICR2, which is a register that the
guest can control.
kvm_irq_delivery_to_apic_fast uses the high 16 bits of ICR2 as the
cluster id. A BUG_ON is triggered, which is a protection against
accessing map->logical_map out of bounds and ensures that nothing
really unsafe occurs.
The logic in the code is correct from real HW point of view. The problem
is that KVM supports only one cluster with ID 0 in clustered mode, but
the code that has the bug does not take this into account.
Reported-by: Lars Bull <larsbull@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In kvm_lapic_sync_from_vapic and kvm_lapic_sync_to_vapic there is the
potential to corrupt kernel memory if userspace provides an address that
is at the end of a page. This patch converts those functions to use
kvm_write_guest_cached and kvm_read_guest_cached. It also checks the
vapic_address specified by userspace during ioctl processing and returns
an error to userspace if the address is not a valid GPA.
This is generally not guest triggerable, because the required write is
done by firmware that runs before the guest. Also, it only affects AMD
processors and oldish Intel that do not have the FlexPriority feature
(unless you disable FlexPriority, of course; then newer processors are
also affected).
Fixes: b93463aa59 ('KVM: Accelerated apic support')
Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Under guest controllable circumstances apic_get_tmcct will execute a
divide by zero and cause a crash. If the guest cpuid supports
tsc deadline timers and performs the following sequence of requests
the host will crash.
- Set the mode to periodic
- Set the TMICT to 0
- Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline)
- Set the TMICT to non-zero.
Then the lapic_timer.period will be 0, but the TMICT will not be. If the
guest then reads from the TMCCT then the host will perform a divide by 0.
This patch ensures that if the lapic_timer.period is 0, then the division
does not occur.
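A stand-alone sketch of the guarded calculation follows. The function
and parameter names are hypothetical; the extra checks on bus_cycle_ns
and divide_count exist only to keep the sketch safe for arbitrary
inputs and are not part of the change described here.

```c
#include <stdint.h>

/*
 * Compute the current-count value from the time remaining on the
 * timer.  If the period is zero, return 0 instead of dividing.
 */
static uint32_t tm_get_tmcct(int64_t remaining_ns, uint64_t period_ns,
			     uint32_t bus_cycle_ns, uint32_t divide_count)
{
	uint64_t ns;

	if (period_ns == 0)
		return 0;	/* the guard this change adds */
	if (bus_cycle_ns == 0 || divide_count == 0)
		return 0;	/* sketch-only sanity check */
	if (remaining_ns < 0)
		remaining_ns = 0;

	ns = (uint64_t)remaining_ns % period_ns;
	return (uint32_t)(ns / ((uint64_t)bus_cycle_ns * divide_count));
}
```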
Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In multiple functions the vcpu_id is used as an offset into a bitfield. A
malicious user could specify a vcpu_id greater than 255 in order to set or
clear bits in kernel memory. This could be used to elevate privileges in the
kernel. This patch verifies that the vcpu_id provided is less than 255.
The api documentation already specifies that the vcpu_id must be less than
max_vcpus, but this is currently not checked.
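The kind of check involved can be sketched in user-space C as below.
The names (bm_vm, bm_set_vcpu_bit, BM_MAX_VCPUS) are hypothetical, and
the limit of 255 is taken from the description above.

```c
#include <errno.h>
#include <limits.h>

#define BM_MAX_VCPUS	255
#define BM_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BM_LONGS \
	((BM_MAX_VCPUS + BM_BITS_PER_LONG - 1) / BM_BITS_PER_LONG)

/* Hypothetical per-VM state holding one bit per vcpu. */
struct bm_vm {
	unsigned long vcpu_bitmap[BM_LONGS];
};

/*
 * Set the bit for vcpu_id, rejecting ids that would index past the
 * end of the bitmap.
 */
static int bm_set_vcpu_bit(struct bm_vm *vm, unsigned int vcpu_id)
{
	if (vcpu_id >= BM_MAX_VCPUS)
		return -EINVAL;

	vm->vcpu_bitmap[vcpu_id / BM_BITS_PER_LONG] |=
		1UL << (vcpu_id % BM_BITS_PER_LONG);
	return 0;
}
```

Without the range check, a large vcpu_id would turn the bit operation
into a write outside vcpu_bitmap.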
Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a muxed i2c bus gets created the default retry count and
timeout of the muxed bus is zero. Hence it is possible that you
end up with a situation where the parent controller sets a default
retry count and timeout which gets applied and used while the muxed
bus (using the same controller) has a default retry count of zero
and a default timeout of 1s (set in i2c_add_adapter()). This can be
solved by initializing the retry count and timeout of the muxed
bus with the values used by the parent at creation time.
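In outline, the initialization copies two fields from the parent, as
in this stand-alone sketch. struct mx_adapter and mx_adapter_init()
are hypothetical stand-ins for the adapter structure and the mux
creation code.

```c
/* Hypothetical subset of an adapter's settings. */
struct mx_adapter {
	int retries;
	int timeout;
};

/* Give a newly created muxed bus the parent's retry count and timeout. */
static void mx_adapter_init(struct mx_adapter *mux,
			    const struct mx_adapter *parent)
{
	mux->retries = parent->retries;
	mux->timeout = parent->timeout;
}
```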
Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Still a slightly higher number of changes than wished, but they are all
good regression and/or device-specific fixes. Majority of commits are
for HD-audio, an HDMI ctl index fix that hits old graphics boards,
regression fixes for AD codecs and a few quirks. Other than that, two
major fixes are included: a 64bit ABI fix for compress offload, and
64bit dma_addr_t truncation fix, which had hit on PAE kernels.
Merge tag 'sound-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Still a slightly higher number of changes than wished, but they are all
good regression and/or device-specific fixes. Majority of commits are
for HD-audio, an HDMI ctl index fix that hits old graphics boards,
regression fixes for AD codecs and a few quirks.
Other than that, two major fixes are included: a 64bit ABI fix for
compress offload, and 64bit dma_addr_t truncation fix, which had hit
on PAE kernels"
* tag 'sound-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add static DAC/pin mapping for AD1986A codec
ALSA: hda - One more Dell headset detection quirk
ALSA: hda - hdmi: Fix IEC958 ctl indexes for some simple HDMI devices
ALSA: hda - Mute all aamix inputs as default
ALSA: compress: Fix 64bit ABI incompatibility
ALSA: memalloc.h - fix wrong truncation of dma_addr_t
ALSA: hda - Another Dell headset detection quirk
ALSA: hda - A Dell headset detection quirk
ALSA: hda - Remove quirk for Dell Vostro 131
ALSA: usb-audio: fix uninitialized variable compile warning
ALSA: hda - fix mic issues on Acer Aspire E-572
Introduced by 1397ed35f2
"ipv6: add flowinfo for tcp6 pkt_options for all cases"
Reported-by: kbuild test robot <fengguang.wu@intel.com>
V2: fix the title, add empty line after the declaration (Sergei Shtylyov's
feedback)
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull input fixes from Dmitry Torokhov:
"A fix for recent sysfs breakage in serio subsystem plus a fixup to
adxl34x driver"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: adxl34x - Fix bug in definition of ADXL346_2D_ORIENT
Input: serio - fix sysfs layout
There is a mistake in checking the gso_prefix mask when passing large
packets to a guest. The wrong shift is applied to the bit - the raw skb
gso type is used rather then the translated one. This leads to large packets
being handed to the guest without the GSO metadata. This patch fixes the
check.
The mistake manifested as errors whilst running Microsoft HCK large packet
offload tests between a pair of Windows 8 VMs. I have verified this patch
fixes those errors.
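The shape of the corrected check is sketched below in user-space C.
All constants and function names are hypothetical: the GX_SKB_* values
model a bit-flag style gso type as carried on the skb, and the
GX_WIRE_* values model a small enumerated type as used in the prefix
mask. They are not copied from the kernel or Xen headers.

```c
#include <stdbool.h>

/* Hypothetical skb-side gso type flags (one bit per type). */
#define GX_SKB_GSO_TCPV4	(1u << 0)
#define GX_SKB_GSO_TCPV6	(1u << 4)

/* Hypothetical translated types (small integers). */
enum {
	GX_WIRE_GSO_NONE = 0,
	GX_WIRE_GSO_TCPV4 = 1,
	GX_WIRE_GSO_TCPV6 = 2,
};

/* Translate the skb-side flags into the enumerated type. */
static unsigned int gx_translate_gso(unsigned int skb_gso_type)
{
	if (skb_gso_type & GX_SKB_GSO_TCPV4)
		return GX_WIRE_GSO_TCPV4;
	if (skb_gso_type & GX_SKB_GSO_TCPV6)
		return GX_WIRE_GSO_TCPV6;
	return GX_WIRE_GSO_NONE;
}

/*
 * Test the prefix mask using the translated type as the shift count,
 * not the raw skb flags.
 */
static bool gx_wants_gso_prefix(unsigned int prefix_mask,
				unsigned int skb_gso_type)
{
	unsigned int wire_type = gx_translate_gso(skb_gso_type);

	if (wire_type == GX_WIRE_GSO_NONE)
		return false;
	return (prefix_mask & (1u << wire_type)) != 0;
}
```

Shifting by the raw flag value instead would test a different bit of
the mask, which is how the metadata came to be omitted.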
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>