commit 0cbe40112f42cf5e008f9127f6cd5952ba3946c7 upstream.
This specifically fixes resource leaks in the registration error paths.
Device-managed resources is a bad fit for this driver as devices can be
registered from the n_nci line discipline. Firstly, a tty may not even
have a corresponding device (should it be part of a Unix98 pty)
something which would lead to a NULL-pointer dereference when
registering resources.
Secondly, if the tty has a class device, its lifetime exceeds that of
the line discipline, which means that resources would leak every time
the line discipline is closed (or if registration fails).
Currently, the devres interface was only being used to request a reset
gpio despite the fact that it was already explicitly freed in
nfcmrvl_nci_unregister_dev() (along with the private data), something
which also prevented the resource leak at close.
Note that the driver treats gpio number 0 as invalid despite it being
perfectly valid. This will be addressed in a follow-up patch.
Fixes: b2fe288eac ("NFC: nfcmrvl: free reset gpio")
Fixes: 4a2b947f56 ("NFC: nfcmrvl: add chip reset management")
Cc: Vincent Cuissard <cuissard@marvell.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15e0c59f1535926a939d1df66d6edcf997d7c1b9 upstream.
Make sure to check the tty-device pointer before trying to access the
parent device to avoid dereferencing a NULL-pointer when the tty is one
end of a Unix98 pty.
Fixes: e097dc624f ("NFC: nfcmrvl: add UART driver")
Cc: Vincent Cuissard <cuissard@marvell.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20777bc57c346b6994f465e0d8261a7fbf213a09 upstream.
Commit 7eda8b8e96 ("NFC: Use IDR library to assing NFC devices IDs")
moved device-id allocation and struct-device initialisation from
nfc_allocate_device() to nfc_register_device().
This broke just about every nfc-device-registration error path, which
continue to call nfc_free_device() that tries to put the device
reference of the now uninitialised (but zeroed) struct device:
kobject: '(null)' (ce316420): is not initialized, yet kobject_put() is being called.
The late struct-device initialisation also meant that various work
queues whose names are derived from the nfc device name were also
misnamed:
421 root 0 SW< [(null)_nci_cmd_]
422 root 0 SW< [(null)_nci_rx_w]
423 root 0 SW< [(null)_nci_tx_w]
Move the id-allocation and struct-device initialisation back to
nfc_allocate_device() and fix up the single call site which did not use
nfc_free_device() in its error path.
Fixes: 7eda8b8e96 ("NFC: Use IDR library to assing NFC devices IDs")
Cc: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bde717ab473668377fc65872398a102d40cb2d58 upstream.
The hard coded register 0x9864 and 0x9924 are invalid
for ar9300 chips.
Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf8ce1ea61b75712a154c93e40f2a5af2e4dd997 upstream.
One scenario that could lead to UAF is two threads writing
simultaneously to the "tx99" debug file. One of them would
set the "start" value to true and follow to ath9k_tx99_init().
Inside the function it would set the sc->tx99_state to true
after allocating sc->tx99skb. Then, the other thread would
execute write_file_tx99() and call ath9k_tx99_deinit().
sc->tx99_state would be freed. After that, the first thread
would continue inside ath9k_tx99_init() and call
r = ath9k_tx99_send(sc, sc->tx99_skb, &txctl);
that would make use of the freed sc->tx99_skb memory.
Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 289d72afddf83440117c35d864bf0c6309c1d011 upstream.
After the lock is dropped, it is possible that the cpufreq_dev gets
freed before we call get_level() and that can cause kernel to crash.
Drop the lock after we are done using the structure.
Fixes: 02373d7c69 ("thermal: cpu_cooling: fix lockdep problems in cpu_cooling")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a16e37726c444cbda91e73ed5f742e717bfe866f upstream.
Gcc 7.1 complains about:
drivers/media/platform/s5p-jpeg/jpeg-core.c: In function 's5p_jpeg_parse_hdr.isra.9':
drivers/media/platform/s5p-jpeg/jpeg-core.c:1207:12: warning: 'width' may be used uninitialized in this function [-Wmaybe-uninitialized]
result->w = width;
~~~~~~~~~~^~~~~~~
drivers/media/platform/s5p-jpeg/jpeg-core.c:1208:12: warning: 'height' may be used uninitialized in this function [-Wmaybe-uninitialized]
result->h = height;
~~~~~~~~~~^~~~~~~~
Indeed the code would allow it to return a random value (although
it shouldn't happen, in practice). So, explicitly set both to zero,
just in case.
Acked-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd7e31bbade02bc1e92aa00d5cf2cee2da66838a upstream.
gcc-7 suggests that an expression using a bitwise not and a bitmask
on a 'bool' variable is better written using boolean logic:
drivers/media/rc/imon.c: In function 'imon_incoming_scancode':
drivers/media/rc/imon.c:1725:22: error: '~' on a boolean expression [-Werror=bool-operation]
ictx->pad_mouse = ~(ictx->pad_mouse) & 0x1;
^
drivers/media/rc/imon.c:1725:22: note: did you mean to use logical not?
I agree.
Fixes: 21677cfc56 ("V4L/DVB: ir-core: add imon driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd664f6b3e376a8ef4990f87d08271cc2d01ba9a upstream.
I made the mistake of upgrading my desktop to the new Fedora 26 that
comes with gcc-7.1.1.
There's nothing wrong per se that I've noticed, but I now have 1500
lines of warnings, mostly from the new format-truncation warning
triggering all over the tree.
We use 'snprintf()' and friends in a lot of places, and often know that
the numbers are fairly small (ie a controller index or similar), but gcc
doesn't know that, and sees an 'int', and thinks that it could be some
huge number. And then complains when our buffers are not able to fit
the name for the ten millionth controller.
These warnings aren't necessarily bad per se, and we probably want to
look through them subsystem by subsystem, but at least during the merge
window they just mean that I can't even see if somebody is introducing
any *real* problems when I pull.
So warnings disabled for now.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 691bd4340bef49cf7e5855d06cf24444b5bf2d85 upstream.
It's easier for host applications, such as QEMU, if they can always
access guest MSR_IA32_BNDCFGS in VMCS, even though MPX is disabled in
guest cpuid.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4531662d1abf6c1f0e5c2b86ddb60e61509786c8 upstream.
Bits 11:2 must be zero and the linear addess in bits 63:12 must be
canonical. Otherwise, WRMSR(BNDCFGS) should raise #GP.
Fixes: 0dd376e709 ("KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save")
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4439af9f911ae0243ffe4e2dfc12bace49605d8b upstream.
The BNDCFGS MSR should only be exposed to the guest if the guest
supports MPX. (cf. the TSC_AUX MSR and RDTSCP.)
Fixes: 0dd376e709 ("KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save")
Change-Id: I3ad7c01bda616715137ceac878f3fa7e66b6b387
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8b6fda38f80e75afa3b125c9e7f2550b579454b upstream.
The MSR permission bitmaps are shared by all VMs. However, some VMs
may not be configured to support MPX, even when the host does. If the
host supports VMX and the guest does not, we should intercept accesses
to the BNDCFGS MSR, so that we can synthesize a #GP
fault. Furthermore, if the host does not support MPX and the
"ignore_msrs" kvm kernel parameter is set, then we should intercept
accesses to the BNDCFGS MSR, so that we can skip over the rdmsr/wrmsr
without raising a #GP fault.
Fixes: da8999d318 ("KVM: x86: Intel MPX vmx and msr handle")
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a87036add09283e6c4f4103a15c596c67b86ab86 upstream.
When eager FPU is disabled, KVM will still see the MPX bit in CPUID and
presumably the MPX vmentry and vmexit controls. However, it will not
be able to expose the MPX XSAVE features to the guest, because the guest's
accessible XSAVE features are always a subset of host_xcr0.
In this case, we should disable the MPX CPUID bit, the BNDCFGS MSR,
and the MPX vmentry and vmexit controls for nested virtualization.
It is then unnecessary to enable guest eager FPU if the guest has the
MPX CPUID bit set.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c59f29cb144a6a0dfac16ede9dc8eafc02dc56ca upstream.
The 's' flag is supposed to indicate that a softirq is running. This
can be detected by testing the preempt_count with SOFTIRQ_OFFSET.
The current code tests the preempt_count with SOFTIRQ_MASK, which
would be true even when softirqs are disabled but not serving a
softirq.
Link: http://lkml.kernel.org/r/1481300417-3564-1-git-send-email-pkondeti@codeaurora.org
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ca30331c156ca9e97643ad05dd8930b8fe78b01 upstream.
In the current code, if the user accidentally writes a bogus command to
this sysfs file, then we set the latency tolerance to an uninitialized
variable.
Fixes: 2d984ad132 (PM / QoS: Introcuce latency tolerance device PM QoS type)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea0212f40c6bc0594c8eff79266759e3ecd4bacc upstream.
The wakeirq infrastructure uses RCU to protect the list of wakeirqs. That
breaks the irq bus locking infrastructure, which is allows sleeping
functions to be called so interrupt controllers behind slow busses,
e.g. i2c, can be handled.
The wakeirq functions hold rcu_read_lock and call into irq functions, which
in case of interrupts using the irq bus locking will trigger a
might_sleep() splat.
Convert the wakeirq infrastructure to Sleepable RCU and unbreak it.
Fixes: 4990d4fe32 (PM / Wakeirq: Add automated device wake IRQ handling)
Reported-by: Brian Norris <briannorris@chromium.org>
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f32d782e31bf079f600dcec126ed117b0577e85c upstream.
The group mask is always used in intersection with the group CPUs. So,
when building the group mask, we don't have to care about CPUs that are
not part of the group.
Signed-off-by: Lauro Ramos Venancio <lvenanci@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: lwang@redhat.com
Cc: riel@redhat.com
Link: http://lkml.kernel.org/r/1492717903-5195-2-git-send-email-lvenanci@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73bb059f9b8a00c5e1bf2f7ca83138c05d05e600 upstream.
The point of sched_group_mask is to select those CPUs from
sched_group_cpus that can actually arrive at this balance domain.
The current code gets it wrong, as can be readily demonstrated with a
topology like:
node 0 1 2 3
0: 10 20 30 20
1: 20 10 20 30
2: 30 20 10 20
3: 20 30 20 10
Where (for example) domain 1 on CPU1 ends up with a mask that includes
CPU0:
[] CPU1 attaching sched-domain:
[] domain 0: span 0-2 level NUMA
[] groups: 1 (mask: 1), 2, 0
[] domain 1: span 0-3 level NUMA
[] groups: 0-2 (mask: 0-2) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)
This causes sched_balance_cpu() to compute the wrong CPU and
consequently should_we_balance() will terminate early resulting in
missed load-balance opportunities.
The fixed topology looks like:
[] CPU1 attaching sched-domain:
[] domain 0: span 0-2 level NUMA
[] groups: 1 (mask: 1), 2, 0
[] domain 1: span 0-3 level NUMA
[] groups: 0-2 (mask: 1) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)
(note: this relies on OVERLAP domains to always have children, this is
true because the regular topology domains are still here -- this is
before degenerate trimming)
Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: e3589f6c81 ("sched: Allow for overlapping sched_domain spans")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7459e1d25ffefa2b1be799477fcc1f6c62f6cec7 upstream.
Driver does not properly handle the case when signals interrupt
wait_for_completion_interruptible():
-it does not check for return value
-completion structure is allocated on stack; in case a signal interrupts
the sleep, it will go out of scope, causing the worker thread
(caam_jr_dequeue) to fail when it accesses it
wait_for_completion_interruptible() is replaced with uninterruptable
wait_for_completion().
We choose to block all signals while waiting for I/O (device executing
the split key generation job descriptor) since the alternative - in
order to have a deterministic device state - would be to flush the job
ring (aborting *all* in-progress jobs).
Fixes: 045e36780f ("crypto: caam - ahash hmac support")
Fixes: 4c1ec1f930 ("crypto: caam - refactor key_gen, sg")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b82ce24426a4071da9529d726057e4e642948667 upstream.
It has been reported that sha1-avx2 can cause page faults by reading
beyond the end of the input. This patch disables it until it can be
fixed.
Fixes: 7c1da8d0d0 ("crypto: sha - SHA1 transform x86_64 AVX2")
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1606043f214f912a52195293614935811a6e3e53 upstream.
The Atmel SHA driver was treating -EBUSY as indication of queueing
to backlog without checking that backlog is enabled for the request.
Fix it by checking request flags.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03d2c5114c95797c0aa7d9f463348b171a274fd4 upstream.
An updated patch that also handles the additional key length requirements
for the AEAD algorithms.
The max keysize is not 96. For SHA384/512 it's 128, and for the AEAD
algorithms it's longer still. Extend the max keysize for the
AEAD size for AES256 + HMAC(SHA512).
Fixes: 357fb60502 ("crypto: talitos - add sha224, sha384 and sha512 to existing AEAD algorithms")
Signed-off-by: Martin Hicks <mort@bork.org>
Acked-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37511fb5c91db93d8bd6e3f52f86e5a7ff7cfcdf upstream.
Jörn Engel noticed that the expand_upwards() function might not return
-ENOMEM in case the requested address is (unsigned long)-PAGE_SIZE and
if the architecture didn't defined TASK_SIZE as multiple of PAGE_SIZE.
Affected architectures are arm, frv, m68k, blackfin, h8300 and xtensa
which all define TASK_SIZE as 0xffffffff, but since none of those have
an upwards-growing stack we currently have no actual issue.
Nevertheless let's fix this just in case any of the architectures with
an upward-growing stack (currently parisc, metag and partly ia64) define
TASK_SIZE similar.
Link: http://lkml.kernel.org/r/20170702192452.GA11868@p100.box
Fixes: bd726c90b6b8 ("Allow stack to grow up to address space limit")
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Jörn Engel <joern@purestorage.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1bd4a792d3961a04e6154118816b00167aad91a upstream.
If a TPM2 loses power without a TPM2_Shutdown command being issued (a
"disorderly reboot"), it may lose some state that has yet to be
persisted to NVRam, and will increment the DA counter. After the DA
counter gets sufficiently large, the TPM will lock the user out.
NOTE: This only changes behavior on TPM2 devices. Since TPM1 uses sysfs,
and sysfs relies on implicit locking on chip->ops, it is not safe to
allow this code to run in TPM1, or to add sysfs support to TPM2, until
that locking is made explicit.
Signed-off-by: Josh Zimmerman <joshz@google.com>
Cc: stable@vger.kernel.org
Fixes: 74d6b3ceaa ("tpm: fix suspend/resume paths for TPM 2.0")
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f77af15165847406b15d8f70c382c4cb15846b2a upstream.
The TPM class has some common shutdown code that must be executed for
all drivers. This adds some needed functionality for that.
Signed-off-by: Josh Zimmerman <joshz@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 74d6b3ceaa ("tpm: fix suspend/resume paths for TPM 2.0")
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e26195f240d73150e8308ae42874702e3df8d2c upstream.
Add a read/write semaphore around the ops function pointers so
ops can be set to null when the driver un-registers.
Previously the tpm core expected module locking to be enough to
ensure that tpm_unregister could not be called during certain times,
however that hasn't been sufficient for a long time.
Introduce a read/write semaphore around 'ops' so the core can set
it to null when unregistering. This provides a strong fence around
the driver callbacks, guaranteeing to the driver that no callbacks
are running or will run again.
For now the ops_lock is placed very high in the call stack, it could
be pushed down and made more granular in future if necessary.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8cfffc9d4d3786d3b496a021d7224e06328bac7d upstream.
This is a hold over from before the struct device conversion.
- All prints should be using &chip->dev, which is the Linux
standard. This changes prints to use tpm0 as the device name,
not the PnP/etc ID.
- The few places involving sysfs/modules that really do need the
parent just use chip->dev.parent instead
- We no longer need to get_device(pdev) in any places since it is no
longer used by any of the code. The kref on the parent is held
by the device core during device_add and dropped in device_del
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Tested-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 796a3bae2fba6810427efdb314a1c126c9490fb3 upstream.
test_execve does rather odd mount manipulations to safely create
temporary setuid and setgid executables that aren't visible to the
rest of the system. Those executables end up in the test's cwd, but
that cwd is MNT_DETACHed.
The core namespace code considers MNT_DETACHed trees to belong to no
mount namespace at all and, in general, MNT_DETACHed trees are only
barely function. This interacted with commit 380cf5ba6b0a ("fs:
Treat foreign mounts as nosuid") to cause all MNT_DETACHed trees to
act as though they're nosuid, breaking the test.
Fix it by just not detaching the tree. It's still in a private
mount namespace and is therefore still invisible to the rest of the
system (except via /proc, and the same nosuid logic will protect all
other programs on the system from believing in test_execve's setuid
bits).
While we're at it, fix some blatant whitespace problems.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 380cf5ba6b0a ("fs: Treat foreign mounts as nosuid")
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Greg KH <greg@kroah.com>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 296990deb389c7da21c78030376ba244dc1badf5 upstream.
Andrei Vagin pointed out that time to executue propagate_umount can go
non-linear (and take a ludicrious amount of time) when the mount
propogation trees of the mounts to be unmunted by a lazy unmount
overlap.
Make the walk of the mount propagation trees nearly linear by
remembering which mounts have already been visited, allowing
subsequent walks to detect when walking a mount propgation tree or a
subtree of a mount propgation tree would be duplicate work and to skip
them entirely.
Walk the list of mounts whose propgatation trees need to be traversed
from the mount highest in the mount tree to mounts lower in the mount
tree so that odds are higher that the code will walk the largest trees
first, allowing later tree walks to be skipped entirely.
Add cleanup_umount_visitation to remover the code's memory of which
mounts have been visited.
Add the functions last_slave and skip_propagation_subtree to allow
skipping appropriate parts of the mount propagation tree without
needing to change the logic of the rest of the code.
A script to generate overlapping mount propagation trees:
$ cat runs.h
set -e
mount -t tmpfs zdtm /mnt
mkdir -p /mnt/1 /mnt/2
mount -t tmpfs zdtm /mnt/1
mount --make-shared /mnt/1
mkdir /mnt/1/1
iteration=10
if [ -n "$1" ] ; then
iteration=$1
fi
for i in $(seq $iteration); do
mount --bind /mnt/1/1 /mnt/1/1
done
mount --rbind /mnt/1 /mnt/2
TIMEFORMAT='%Rs'
nr=$(( ( 2 ** ( $iteration + 1 ) ) + 1 ))
echo -n "umount -l /mnt/1 -> $nr "
time umount -l /mnt/1
nr=$(cat /proc/self/mountinfo | grep zdtm | wc -l )
time umount -l /mnt/2
$ for i in $(seq 9 19); do echo $i; unshare -Urm bash ./run.sh $i; done
Here are the performance numbers with and without the patch:
mhash | 8192 | 8192 | 1048576 | 1048576
mounts | before | after | before | after
------------------------------------------------
1025 | 0.040s | 0.016s | 0.038s | 0.019s
2049 | 0.094s | 0.017s | 0.080s | 0.018s
4097 | 0.243s | 0.019s | 0.206s | 0.023s
8193 | 1.202s | 0.028s | 1.562s | 0.032s
16385 | 9.635s | 0.036s | 9.952s | 0.041s
32769 | 60.928s | 0.063s | 44.321s | 0.064s
65537 | | 0.097s | | 0.097s
131073 | | 0.233s | | 0.176s
262145 | | 0.653s | | 0.344s
524289 | | 2.305s | | 0.735s
1048577 | | 7.107s | | 2.603s
Andrei Vagin reports fixing the performance problem is part of the
work to fix CVE-2016-6213.
Fixes: a05964f391 ("[PATCH] shared mounts handling: umount")
Reported-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99b19d16471e9c3faa85cad38abc9cbbe04c6d55 upstream.
While investigating some poor umount performance I realized that in
the case of overlapping mount trees where some of the mounts are locked
the code has been failing to unmount all of the mounts it should
have been unmounting.
This failure to unmount all of the necessary
mounts can be reproduced with:
$ cat locked_mounts_test.sh
mount -t tmpfs test-base /mnt
mount --make-shared /mnt
mkdir -p /mnt/b
mount -t tmpfs test1 /mnt/b
mount --make-shared /mnt/b
mkdir -p /mnt/b/10
mount -t tmpfs test2 /mnt/b/10
mount --make-shared /mnt/b/10
mkdir -p /mnt/b/10/20
mount --rbind /mnt/b /mnt/b/10/20
unshare -Urm --propagation unchaged /bin/sh -c 'sleep 5; if [ $(grep test /proc/self/mountinfo | wc -l) -eq 1 ] ; then echo SUCCESS ; else echo FAILURE ; fi'
sleep 1
umount -l /mnt/b
wait %%
$ unshare -Urm ./locked_mounts_test.sh
This failure is corrected by removing the prepass that marks mounts
that may be umounted.
A first pass is added that umounts mounts if possible and if not sets
mount mark if they could be unmounted if they weren't locked and adds
them to a list to umount possibilities. This first pass reconsiders
the mounts parent if it is on the list of umount possibilities, ensuring
that information of umoutability will pass from child to mount parent.
A second pass then walks through all mounts that are umounted and processes
their children unmounting them or marking them for reparenting.
A last pass cleans up the state on the mounts that could not be umounted
and if applicable reparents them to their first parent that remained
mounted.
While a bit longer than the old code this code is much more robust
as it allows information to flow up from the leaves and down
from the trunk making the order in which mounts are encountered
in the umount propgation tree irrelevant.
Fixes: 0c56fe3142 ("mnt: Don't propagate unmounts to locked mounts")
Reviewed-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6987dc8a70976561d22450b5858fc9767788cc1c upstream.
Only read access is checked before this call.
Actually, at the moment this is not an issue, as every in-tree arch does
the same manual checks for VERIFY_READ vs VERIFY_WRITE, relying on the MMU
to tell them apart, but this wasn't the case in the past and may happen
again on some odd arch in the future.
If anyone cares about 3.7 and earlier, this is a security hole (untested)
on real 80386 CPUs.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da029c11e6b12f321f36dac8771e833b65cec962 upstream.
To avoid pathological stack usage or the need to special-case setuid
execs, just limit all arg stack usage to at most 75% of _STK_LIM (6MB).
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a73dc5370e153ac63718d850bddf0c9aa9d871e6 upstream.
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided). For s390 the
position could be 0x10000, but that is needlessly close to the NULL
address.
Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47ebb09d54856500c5a5e14824781902b3bb738e upstream.
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02445990a96e60a67526510d8b00f7e3d14101c3 upstream.
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, to match ARM.
This could be 0x8000, the standard ET_EXEC load address, but that is
needlessly close to the NULL address, and anyone running arm compat PIE
will have an MMU, so the tight mapping is not needed.
Link: http://lkml.kernel.org/r/1498251600-132458-4-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a9af90a3bcde217a1c053e135f5f43e5d5fafbd upstream.
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
4MB is chosen here mainly to have parity with x86, where this is the
traditional minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
For ARM the position could be 0x8000, the standard ET_EXEC load address,
but that is needlessly close to the NULL address, and anyone running PIE
on 32-bit ARM will have an MMU, so the tight mapping is not needed.
Link: http://lkml.kernel.org/r/1498154792-49952-2-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eab09532d40090698b05a07c1c87f39fdbc5fab5 upstream.
The ELF_ET_DYN_BASE position was originally intended to keep loaders
away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2
/bin/cat" might cause the subsequent load of /bin/cat into where the
loader had been loaded.)
With the advent of PIE (ET_DYN binaries with an INTERP Program Header),
ELF_ET_DYN_BASE continued to be used since the kernel was only looking
at ET_DYN. However, since ELF_ET_DYN_BASE is traditionally set at the
top 1/3rd of the TASK_SIZE, a substantial portion of the address space
is unused.
For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs are
loaded above the mmap region. This means they can be made to collide
(CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with
pathological stack regions.
Lowering ELF_ET_DYN_BASE solves both by moving programs below the mmap
region in all cases, and will now additionally avoid programs falling
back to the mmap region by enforcing MAP_FIXED for program loads (i.e.
if it would have collided with the stack, now it will fail to load
instead of falling back to the mmap region).
To allow for a lower ELF_ET_DYN_BASE, loaders (ET_DYN without INTERP)
are loaded into the mmap region, leaving space available for either an
ET_EXEC binary with a fixed location or PIE being loaded into mmap by
the loader. Only PIE programs are loaded offset from ELF_ET_DYN_BASE,
which means architectures can now safely lower their values without risk
of loaders colliding with their subsequently loaded programs.
For 64-bit, ELF_ET_DYN_BASE is best set to 4GB to allow runtimes to use
the entire 32-bit address space for 32-bit pointers.
Thanks to PaX Team, Daniel Micay, and Rik van Riel for inspiration and
suggestions on how to implement this solution.
Fixes: d1fd836dcf ("mm: split ET_DYN ASLR from mmap ASLR")
Link: http://lkml.kernel.org/r/20170621173201.GA114489@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d81ae05d0176da1c54aeaed697fa34be5c5575e upstream.
As of perl 5, version 26, subversion 0 (v5.26.0) some new warnings have
occurred when running checkpatch.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3544.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3885.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in
m/^(\+.*(?:do|\))){ <-- HERE / at scripts/checkpatch.pl line 4374.
It seems perfectly reasonable to do as the warning suggests and simply
escape the left brace in these three locations.
Link: http://lkml.kernel.org/r/20170607060135.17384-1-cyrilbur@gmail.com
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b17c070fb624cf10162cf92ea5e1ec25cd8ac176 upstream.
__list_lru_walk_one() acquires nlru spin lock (nlru->lock) for longer
duration if there are more number of items in the lru list. As per the
current code, it can hold the spin lock for upto maximum UINT_MAX
entries at a time. So if there are more number of items in the lru
list, then "BUG: spinlock lockup suspected" is observed in the below
path:
spin_bug+0x90
do_raw_spin_lock+0xfc
_raw_spin_lock+0x28
list_lru_add+0x28
dput+0x1c8
path_put+0x20
terminate_walk+0x3c
path_lookupat+0x100
filename_lookup+0x6c
user_path_at_empty+0x54
SyS_faccessat+0xd0
el0_svc_naked+0x24
This nlru->lock is acquired by another CPU in this path -
d_lru_shrink_move+0x34
dentry_lru_isolate_shrink+0x48
__list_lru_walk_one.isra.10+0x94
list_lru_walk_node+0x40
shrink_dcache_sb+0x60
do_remount_sb+0xbc
do_emergency_remount+0xb0
process_one_work+0x228
worker_thread+0x2e0
kthread+0xf4
ret_from_fork+0x10
Fix this lockup by reducing the number of entries to be shrinked from
the lru list to 1024 at once. Also, add cond_resched() before
processing the lru list again.
Link: http://marc.info/?t=149722864900001&r=1&w=2
Link: http://lkml.kernel.org/r/1498707575-2472-1-git-send-email-stummala@codeaurora.org
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Suggested-by: Jan Kara <jack@suse.cz>
Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Alexander Polakov <apolyakov@beget.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c80cd57c74339889a8752b20862a16c28929c3a upstream.
list_lru_count_node() iterates over all memcgs to get the total number of
entries on the node but it can race with memcg_drain_all_list_lrus(),
which migrates the entries from a dead cgroup to another. This can return
incorrect number of entries from list_lru_count_node().
Fix this by keeping track of entries per node and simply return it in
list_lru_count_node().
Link: http://lkml.kernel.org/r/1498707555-30525-1-git-send-email-stummala@codeaurora.org
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Alexander Polakov <apolyakov@beget.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0d80ddab89916273cb97114889d3f337bc370ae upstream.
core_kernel_text is used by MIPS in its function graph trace processing,
so having this method traced leads to an infinite set of recursive calls
such as:
Call Trace:
ftrace_return_to_handler+0x50/0x128
core_kernel_text+0x10/0x1b8
prepare_ftrace_return+0x6c/0x114
ftrace_graph_caller+0x20/0x44
return_to_handler+0x10/0x30
return_to_handler+0x0/0x30
return_to_handler+0x0/0x30
ftrace_ops_no_ops+0x114/0x1bc
core_kernel_text+0x10/0x1b8
core_kernel_text+0x10/0x1b8
core_kernel_text+0x10/0x1b8
ftrace_ops_no_ops+0x114/0x1bc
core_kernel_text+0x10/0x1b8
prepare_ftrace_return+0x6c/0x114
ftrace_graph_caller+0x20/0x44
(...)
Mark the function notrace to avoid it being traced.
Link: http://lkml.kernel.org/r/1498028607-6765-1-git-send-email-marcin.nowakowski@imgtec.com
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98dcea0cfd04e083ac74137ceb9a632604740e2d upstream.
liblockdep has been broken since commit 75dd602a5198 ("lockdep: Fix
lock_chain::base size"), as that adds a check that MAX_LOCK_DEPTH is
within the range of lock_chain::depth and in liblockdep it is much
too large.
That should have resulted in a compiler error, but didn't because:
- the check uses ARRAY_SIZE(), which isn't yet defined in liblockdep
so is assumed to be an (undeclared) function
- putting a function call inside a BUILD_BUG_ON() expression quietly
turns it into some nonsense involving a variable-length array
It did produce a compiler warning, but I didn't notice because
liblockdep already produces too many warnings if -Wall is enabled
(which I'll fix shortly).
Even before that commit, which reduced lock_chain::depth from 8 bits
to 6, MAX_LOCK_DEPTH was too large.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: a.p.zijlstra@chello.nl
Link: http://lkml.kernel.org/r/20170525130005.5947-3-alexander.levin@verizon.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 649aa24254e85bf6bd7807dd372d083707852b1f upstream.
This is because of commit f98db6013c55 ("sched/core: Add switch_mm_irqs_off()
and use it in the scheduler") in which switch_mm_irqs_off() is called by the
scheduler, vs switch_mm() which is used by use_mm().
This patch lets the parisc code mirror the x86 and powerpc code, ie. it
disables interrupts in switch_mm(), and optimises the scheduler case by
defining switch_mm_irqs_off().
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b0f94efd5aa8daa8a07d7601714c2573266cd4c9 upstream.
Architectures with a compat syscall table must put compat_sys_keyctl()
in it, not sys_keyctl(). The parisc architecture was not doing this;
fix it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 247462316f85a9e0479445c1a4223950b68ffac1 upstream.
When a process runs out of stack the parisc kernel wrongly faults with SIGBUS
instead of the expected SIGSEGV signal.
This example shows how the kernel faults:
do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000]
trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000
The vma->vm_end value is the first address which does not belong to the vma, so
adjust the check to include vma->vm_end to the range for which to send the
SIGSEGV signal.
This patch unbreaks building the debian libsigsegv package.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>