Generic code to provide new per cpu atomic features
this_cpu_cmpxchg
this_cpu_xchg
Fallback occurs to functions using interrupts disable/enable
to ensure correct per cpu atomicity.
Fallback to regular cmpxchg and xchg is not possible since per cpu atomic
semantics include the guarantee that the current cpus per cpu data is
accessed atomically. Use of regular cmpxchg and xchg requires the
determination of the address of the per cpu data before regular cmpxchg
or xchg which therefore cannot be atomically included in an xchg or
cmpxchg without segment override.
tj: - Relocated new ops to conform better to the general organization.
- This patch contains a trivial comment fix.
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The fix in commit #6a0cc49 requires more than three concurrent instances
of synchronize_sched_expedited() before batching is possible. This
patch uses a ticket-counter-like approach that is also not unrelated to
Lai Jiangshan's Ring RCU to allow sharing of expedited grace periods even
when there are only two concurrent instances of synchronize_sched_expedited().
This commit builds on Tejun's original posting, which may be found at
http://lkml.org/lkml/2010/11/9/204, adding memory barriers, avoiding
overflow of signed integers (other than via atomic_t), and fixing the
detection of batching.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
$ cat << EOF | gcc -Wconversion -xc -S -o/dev/null -
unsigned f(void) {return NLMSG_HDRLEN;}
EOF
<stdin>: In function 'f':
<stdin>:3:26: warning: negative integer implicitly converted to unsigned type
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds arch_remove_reservations(), which an arch can implement if it
needs to protect part of the address space from allocation.
Sometimes that can be done by just putting a region in the resource tree,
but there are cases where that doesn't work well. For example, x86 BIOS
E820 reservations are not related to devices, so they may overlap part of,
all of, or more than a device resource, so they may not end up at the
correct spot in the resource tree.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For read operation, we have to set the argument _write_ of get_user_pages
to 1 since we will write data to pages. Also, we need to SetPageDirty before
releasing these pages.
Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM / Runtime: Fix pm_runtime_suspended()
PM / Hibernate: Restore old swap signature to avoid user space breakage
PM / Hibernate: Fix PM_POST_* notification with user-space suspend
- include/linux/percpu.h: this_cpu_add_return() and friends were
located next to __this_cpu_add_return(). However, the overall
organization is to first group by preemption safeness. Relocate
this_cpu_add_return() and friends to preemption-safe area.
- arch/x86/include/asm/percpu.h: Relocate percpu_add_return_op() after
other more basic operations. Relocate [__]this_cpu_add_return_8()
so that they're first grouped by preemption safeness.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Use this_cpu operations to optimize access primitives for highmem.
The main effect is the avoidance of address calculations through the
use of a segment prefix.
V3->V4
- kmap_atomic_idx: Do not return a value.
- Use __this_cpu_dec without HIGHMEM_DEBUG
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Introduce generic support for this_cpu_add_return etc.
The fallback is to realize these operations with simpler __this_cpu_ops.
tj: - Reformatted __cpu_size_call_return2() to make it more consistent
with its neighbors.
- Dropped unnecessary temp variable ret__ from
__this_cpu_generic_add_return().
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
__get_cpu_var() can be replaced with this_cpu_read and will then use a
single read instruction with implied address calculation to access the
correct per cpu instance.
However, the address of a per cpu variable passed to __this_cpu_read()
cannot be determined (since it's an implied address conversion through
segment prefixes). Therefore apply this only to uses of __get_cpu_var
where the address of the variable is not used.
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Hugh Dickins <hughd@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Use this_cpu ops in various places to optimize per cpu data access.
Cc: Jason Baron <jbaron@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Implement blk_limits_max_hw_sectors() and make
blk_queue_max_hw_sectors() a wrapper around it.
DM needs this to avoid setting queue_limits' max_hw_sectors and
max_sectors directly. dm_set_device_limits() now leverages
blk_limits_max_hw_sectors() logic to establish the appropriate
max_hw_sectors minimum (PAGE_SIZE). Fixes issue where DM was
incorrectly setting max_sectors rather than max_hw_sectors (which
caused dm_merge_bvec()'s max_hw_sectors check to be ineffective).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
When stacking devices, a request_queue is not always available. This
forced us to have a no_cluster flag in the queue_limits that could be
used as a carrier until the request_queue had been set up for a
metadevice.
There were several problems with that approach. First of all it was up
to the stacking device to remember to set queue flag after stacking had
completed. Also, the queue flag and the queue limits had to be kept in
sync at all times. We got that wrong, which could lead to us issuing
commands that went beyond the max scatterlist limit set by the driver.
The proper fix is to avoid having two flags for tracking the same thing.
We deprecate QUEUE_FLAG_CLUSTER and use the queue limit directly in the
block layer merging functions. The queue_limit 'no_cluster' is turned
into 'cluster' to avoid double negatives and to ease stacking.
Clustering defaults to being enabled as before. The queue flag logic is
removed from the stacking function, and explicitly setting the cluster
flag is no longer necessary in DM and MD.
Reported-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
tty: add 'active' sysfs attribute to tty0 and console device
Userspace can query the actual virtual console, and the configured
console devices behind /dev/tt0 and /dev/console.
The last entry in the list of devices is the active device, analog
to the console= kernel command line option.
The attribute supports poll(), which is raised when the virtual
console is changed or /dev/console is reconfigured.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
index 0000000..b138b66
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
fanotify: fill in the metadata_len field on struct fanotify_event_metadata
fanotify: split version into version and metadata_len
fanotify: Dont try to open a file descriptor for the overflow event
fanotify: Introduce FAN_NOFD
fanotify: do not leak user reference on allocation failure
inotify: stop kernel memory leak on file creation failure
fanotify: on group destroy allow all waiters to bypass permission check
fanotify: Dont allow a mask of 0 if setting or removing a mark
fanotify: correct broken ref counting in case adding a mark failed
fanotify: if set by user unset FMODE_NONOTIFY before fsnotify_perm() is called
fanotify: remove packed from access response message
fanotify: deny permissions when no event was sent
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (28 commits)
MIPS: Add a CONFIG_FORCE_MAX_ZONEORDER Kconfig option.
MIPS: LD/SD o32 macro GAS fix update
MIPS: Alchemy: fix build with SERIAL_8250=n
MIPS: Rename mips_dma_cache_sync back to dma_cache_sync
MIPS: MT: Fix typo in comment.
SSB: Fix nvram_get on BCM47xx platform
MIPS: BCM47xx: Swap serial console if ttyS1 was specified.
MIPS: BCM47xx: Use sscanf for parsing mac address
MIPS: BCM47xx: Fill values for b43 into SSB sprom
MIPS: BCM47xx: Do not read config from CFE
MIPS: FDT size is a be32
MIPS: Fix CP0 COUNTER clockevent race
MIPS: Fix regression on BCM4710 processor detection
MIPS: JZ4740: Fix pcm device name
MIPS: Separate two consecutive loads in memset.S
MIPS: Send proper signal and siginfo on FP emulator faults.
MIPS: AR7: Fix loops per jiffies on TNETD7200 devices
MIPS: AR7: Fix double ar7_gpio_init declaration
MIPS: Rework GENERIC_HARDIRQS Kconfig.
MIPS: Alchemy: Add return value check for strict_strtoul()
...
Introduce skb_checksum_start_offset() to replace repetitive calculation.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Special care is taken inside sk_port_alloc to avoid overwriting
skc_node/skc_nulls_node. We should also avoid overwriting
skc_bind_node/skc_portaddr_node.
The patch fixes the following crash:
BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffff812ec6dd>] udp4_lib_lookup2+0xad/0x370
[<ffffffff812ecc22>] __udp4_lib_lookup+0x282/0x360
[<ffffffff812ed63e>] __udp4_lib_rcv+0x31e/0x700
[<ffffffff812bba45>] ? ip_local_deliver_finish+0x65/0x190
[<ffffffff812bbbf8>] ? ip_local_deliver+0x88/0xa0
[<ffffffff812eda35>] udp_rcv+0x15/0x20
[<ffffffff812bba45>] ip_local_deliver_finish+0x65/0x190
[<ffffffff812bbbf8>] ip_local_deliver+0x88/0xa0
[<ffffffff812bb2cd>] ip_rcv_finish+0x32d/0x6f0
[<ffffffff8128c14c>] ? netif_receive_skb+0x99c/0x11c0
[<ffffffff812bb94b>] ip_rcv+0x2bb/0x350
[<ffffffff8128c14c>] netif_receive_skb+0x99c/0x11c0
Signed-off-by: Leonard Crestez <lcrestez@ixiacom.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some windows versions have wrong RFC1323 implementations, with SYN and
SYNACKS messages containing zero tcp timestamps.
We relaxed in commit fc1ad92dfc the passive connection case
(Windows connects to a linux machine), but the reverse case (linux
connects to a Windows machine) has an analogue problem when tsvals from
windows machine are 'negative' (high order bit set) : PAWS triggers and
we drops incoming messages.
Fix this by making zero ts_recent value special, allowing frame to be
processed.
Based on a report and initial patch from Dmitiy Balakin
Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=24842
Reported-by: dmitriy.balakin@nicneiron.ru
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add dev_close_many and dev_deactivate_many to factorize another
sync-rcu operation on the netdevice unregister path.
$ modprobe dummy numdummies=10000
$ ip link set dev dummy* up
$ time rmmod dummy
Without the patch With the patch
real 0m 24.63s real 0m 5.15s
user 0m 0.00s user 0m 0.00s
sys 0m 6.05s sys 0m 5.14s
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the calcualation of the Tx hash for a given hash range into a separate
function and define the skb_tx_hash(), which calculates a Tx hash for a
[0; dev->real_num_tx_queues - 1] hash values range, using this
function (__skb_tx_hash()).
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid unbalanced IRQ wake disable, ensure that wakeups are disabled
only when wakeups have been successfully enabled.
Tested on OMAP3630SDP/ZOOM3.
Signed-off-by: Govindraj.R <govindraj.raja@ti.com>
Reported-by: Paul Walmsley <paul@pwsan.com>
Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix various typos and other errors in comments of tty_driver.h. The most
significant is the wrong name of a function for the description of
TTY_DRIVER_DYNAMIC_DEV.
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add a new notification to indicate that a received, unprotected
Deauthentication or Disassociation frame was dropped due to
management frame protection being in use. This notification is
needed to allow user space (e.g., wpa_supplicant) to implement
SA Query procedure to recover from association state mismatch
between an AP and STA.
This is needed to avoid getting stuck in non-working state when MFP
(IEEE 802.11w) is used and a protected Deauthentication or
Disassociation frame is dropped for any reason. After that, the
station would silently discard any unprotected Deauthentication or
Disassociation frame that could be indicating that the AP does not
have association for the STA (when the Reason Code would be 6 or 7).
IEEE Std 802.11w-2009, 11.13 describes this recovery mechanism.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The nvram_get function was never in the mainline kernel, it only existed in
an external OpenWrt patch. Use nvram_getenv function, which is in mainline
and use an include instead of an extra function declaration. et0macaddr
contains the mac address in text from like 00:11:22:33:44:55. We have to
parse it before adding it into macaddr.
nvram_parse_macaddr will be merged into asm/mach-bcm47xx/nvram.h through
the MIPS git tree and will be available soon. It will not build now without
nvram_parse_macaddr, but it hasn't before either.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
To: linux-mips@linux-mips.org
Cc: mb@bu3sch.de
Cc: netdev@vger.kernel.org
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Michael Buesch <mb@bu3sch.de>
Patchwork: https://patchwork.linux-mips.org/patch/1849/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* usb-next: (132 commits)
USB: uas: Use GFP_NOIO instead of GFP_KERNEL in I/O submission path
USB: uas: Ensure we only bind to a UAS interface
USB: uas: Rename sense pipe and sense urb to status pipe and status urb
USB: uas: Use kzalloc instead of kmalloc
USB: uas: Fix up the Sense IU
usb: musb: core: kill unneeded #include's
DA8xx: assign name to MUSB IRQ resource
usb: gadget: g_ncm added
usb: gadget: f_ncm.c added
usb: gadget: u_ether: prepare for NCM
usb: pch_udc: Fix setup transfers with data out
usb: pch_udc: Fix compile error, warnings and checkpatch warnings
usb: add ab8500 usb transceiver driver
USB: gadget: Implement runtime PM for MSM bus glue driver
USB: gadget: Implement runtime PM for ci13xxx gadget
USB: gadget: Add USB controller driver for MSM SoC
USB: gadget: Introduce ci13xxx_udc_driver struct
USB: gadget: Initialize ci13xxx gadget device's coherent DMA mask
USB: gadget: Fix "scheduling while atomic" bugs in ci13xxx_udc
USB: gadget: Separate out PCI bus code from ci13xxx_udc
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: define separate EVIOCGKEYCODE_V2/EVIOCSKEYCODE_V2
Input: wacom - add another Bamboo Pen ID (0xd4)
There are some situations (e.g. in __pm_generic_call()), where
pm_runtime_suspended() is used to decide whether or not to execute
a device's (system) ->suspend() callback. The callback is not
executed if pm_runtime_suspended() returns true, but it does so
for devices that don't even support runtime PM, because the
power.disable_depth device field is ignored by it. This leads to
problems (i.e. devices are not suspened when they should), so rework
pm_runtime_suspended() so that it returns false if the device's
power.disable_depth field is different from zero.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Simple sysfs emumeration of the PMUs.
Use a "event_source" bus, and add PMU devices using their name.
Each PMU device has a type attribute which contrains the value needed
for perf_event_attr::type to identify this PMU.
This is the minimal stub needed to start using this interface,
we'll consider extending the sysfs usage later.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.316982569@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Extend the perf_pmu_register() interface to allow for named and
dynamic pmu types.
Because we need to support the existing static types we cannot use
dynamic types for everything, hence provide a type argument.
If we want to enumerate the PMUs they need a name, provide one.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.259707703@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Touch devices capable of hovering, i.e., fingers detected a
distance from the surface, are not supported by the current
input MT protocol. This patch adds ABS_MT_DISTANCE, which may
be used to indicate the distance between the contact and the
surface.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
The drivers using the type B protocol all report tracking information
the same way. The contact id is semantically equivalent to
ABS_MT_SLOT, and the handling of ABS_MT_TRACKING_ID only complicates
the driver. The situation can be improved upon by providing a common
pointer emulation code, thereby removing the need for the tracking id
in the driver. This patch moves all tracking event handling over to
the input core, simplifying both the existing drivers and the ones
currently in preparation.
Acked-by: Ping Cheng <pingc@wacom.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
The MT slots devices all follow the same initialization pattern
of creating slots and hinting about buffer size. Let drivers call
an initialization function instead, and make sure it can be called
repeatedly without side effects.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
In preparation for common code to handle a larger set of MT slots
devices, move the slots handling over to a separate file.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Allow drivers or rate control algorithms to specify BlockAck session
timeout when initiating an ADDBA transaction. This is useful in cases
where maintaining persistent BA sessions does not incur any overhead.
The current timeout value of 5000 TUs is retained for all non ath9k/ath9k_htc
drivers.
Signed-off-by: Sujith Manoharan <Sujith.Manoharan@atheros.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With the upcoming hardware offload implementation,
some devices will have a different maximum duration
for the remain-on-channel command. Advertise the
maximum duration in mac80211, and make mac80211 set
it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Without this, gcc 4.5 won't compile xen-netfront and xen-blkfront, where
this is being used to specify array sizes.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Miller <davem@davemloft.net>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To implement per event type optional headers we are interested in
knowing how long the metadata structure is. This patch slits the __u32
version field into a __u8 version and a __u16 metadata_len field (with
__u8 left over). This should allow for backwards compat ABI.
Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
[rewrote descrtion and changed object sizes and ordering - eparis]
Signed-off-by: Eric Paris <eparis@redhat.com>