Commit graph

20175 commits

Author SHA1 Message Date
Lukas Czerner
53fdcf992d ext4: don't hold spinlock while calling ext4_issue_discard()
We can't hold the block group spinlock because we ext4_issue_discard()
calls wait and hence can get rescheduled.

Google-Bug-Id: 3017678

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:30:04 -04:00
Lukas Czerner
5829870982 ext4: check for negative error code from sb_issue_discard
sb_issue_discard() is returning negative error code, so check for
-EOPNOTSUPP.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:30:03 -04:00
Eric Sandeen
b443e7339a ext4: don't bump up LONG_MAX nr_to_write by a factor of 8
I'm uneasy with lots of stuff going on in ext4_da_writepages(),
but bumping nr_to_write from LLONG_MAX to -8 clearly isn't
making anything better, so avoid the multiplier in that case.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:30:03 -04:00
Eric Sandeen
659c6009ca ext4: stop looping in ext4_num_dirty_pages when max_pages reached
Today we simply break out of the inner loop when we have accumulated
max_pages; this keeps scanning forwad and doing pagevec_lookup_tag()
in the while (!done) loop, this does potentially a lot of work
with no net effect.

When we have accumulated max_pages, just clean up and return.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:30:03 -04:00
Curt Wohlgemuth
fb1813f4a8 ext4: use dedicated slab caches for group_info structures
ext4_group_info structures are currently allocated with kmalloc().
With a typical 4K block size, these are 136 bytes each -- meaning
they'll each consume a 256-byte slab object.  On a system with many
ext4 large partitions, that's a lot of wasted kernel slab space.
(E.g., a single 1TB partition will have about 8000 block groups, using
about 2MB of slab, of which nearly 1MB is wasted.)

This patch creates an array of slab pointers created as needed --
depending on the superblock block size -- and uses these slabs to
allocate the group info objects.

Google-Bug-Id: 2980809

Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:29:12 -04:00
Brian King
39e3ac2599 jbd2: Fix I/O hang in jbd2_journal_release_jbd_inode
This fixes a hang seen in jbd2_journal_release_jbd_inode
on a lot of Power 6 systems running with ext4. When we get
in the hung state, all I/O to the disk in question gets blocked
where we stay indefinitely. Looking at the task list, I can see
we are stuck in jbd2_journal_release_jbd_inode waiting on a
wake up. I added some debug code to detect this scenario and
dump additional data if we were stuck in jbd2_journal_release_jbd_inode
for longer than 30 minutes. When it hit, I was able to see that
i_flags was 0, suggesting we missed the wake up.

This patch changes i_flags to be an unsigned long, uses bit operators
to access it, and adds barriers around the accesses. Prior to applying
this patch, we were regularly hitting this hang on numerous systems
in our test environment. After applying the patch, the hangs no longer
occur.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:25:12 -04:00
Theodore Ts'o
58590b06d7 ext4: fix EOFBLOCKS_FL handling
It turns out we have several problems with how EOFBLOCKS_FL is
handled.  First of all, there was a fencepost error where we were not
clearing the EOFBLOCKS_FL when fill in the last uninitialized block,
but rather when we allocate the next block _after_ the uninitalized
block.  Secondly we were not testing to see if we needed to clear the
EOFBLOCKS_FL when writing to the file O_DIRECT or when were converting
an uninitialized block (which is the most common case).

Google-Bug-Id: 2928259

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:23:12 -04:00
Linus Torvalds
55f335a885 fasync: Fix placement of FASYNC flag comment
In commit f7347ce4ee ("fasync: re-organize fasync entry insertion to
allow it under a spinlock") Arnd took an earlier patch of mine that had
the comment about the FASYNC flag above the wrong function.

When the fasync_add_entry() function was split to introduce the new
fasync_insert_entry() helper function, the code that actually cares
about the FASYNC bit moved to that new helper.

So just move the comment to the right point.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:17:02 -07:00
Linus Torvalds
7420a8c0de Merge branch 'flock' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'flock' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  locks: turn lock_flocks into a spinlock
  fasync: re-organize fasync entry insertion to allow it under a spinlock
  locks/nfsd: allocate file lock outside of spinlock
  lockd: fix nlmsvc_notify_blocked locking
  lockd: push lock_flocks down
2010-10-27 18:13:34 -07:00
Shawn Bohrer
95aac7b1cd epoll: make epoll_wait() use the hrtimer range feature
This make epoll use hrtimers for the timeout value which prevents
epoll_wait() from timing out up to a millisecond early.

This mirrors the behavior of select() and poll().

Signed-off-by: Shawn Bohrer <shawn.bohrer@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:18 -07:00
Andrew Morton
231f3d393f select: rename estimate_accuracy() to select_estimate_accuracy()
Make it a subsystem-specific identifier because we wish to amke it
non-static in the next patch ("epoll: make epoll_wait() use the hrtimer
range feature").

Cc: Shawn Bohrer <shawn.bohrer@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:18 -07:00
Miklos Szeredi
0be8557bcd fuse: use release_pages()
Replace iterated page_cache_release() with release_pages(), which is
faster and shorter.

Needs release_pages() to be exported to modules.

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:17 -07:00
KOSAKI Motohiro
98391cf4dc exec: don't turn PF_KTHREAD off when a target command was not found
Presently do_execve() turns PF_KTHREAD off before search_binary_handler().
 THis has a theorical risk of PF_KTHREAD getting lost.  We don't have to
turn PF_KTHREAD off in the ENOEXEC case.

This patch moves this flag modification to after the finding of the
executable file.

This is only a theorical issue because kthreads do not call do_execve()
directly.  But fixing would be better.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:13 -07:00
KAMEZAWA Hiroyuki
478735e388 /proc/stat: fix scalability of irq sum of all cpu
In /proc/stat, the number of per-IRQ event is shown by making a sum each
irq's events on all cpus.  But we can make use of kstat_irqs().

kstat_irqs() do the same calculation, If !CONFIG_GENERIC_HARDIRQ,
it's not a big cost. (Both of the number of cpus and irqs are small.)

If a system is very big and CONFIG_GENERIC_HARDIRQ, it does

	for_each_irq()
		for_each_cpu()
			- look up a radix tree
			- read desc->irq_stat[cpu]
This seems not efficient. This patch adds kstat_irqs() for
CONFIG_GENRIC_HARDIRQ and change the calculation as

	for_each_irq()
		look up radix tree
		for_each_cpu()
			- read desc->irq_stat[cpu]

This reduces cost.

A test on (4096cpusp, 256 nodes, 4592 irqs) host (by Jack Steiner)

%time cat /proc/stat > /dev/null

Before Patch:	 2.459 sec
After Patch :	  .561 sec

[akpm@linux-foundation.org: unexport kstat_irqs, coding-style tweaks]
[akpm@linux-foundation.org: fix unused variable 'per_irq_sum']
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Jack Steiner <steiner@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:13 -07:00
KAMEZAWA Hiroyuki
f2c66cd8ee /proc/stat: scalability of irq num per cpu
/proc/stat shows the total number of all interrupts to each cpu.  But when
the number of IRQs are very large, it take very long time and 'cat
/proc/stat' takes more than 10 secs.  This is because sum of all irq
events are counted when /proc/stat is read.  This patch adds "sum of all
irq" counter percpu and reduce read costs.

The cost of reading /proc/stat is important because it's used by major
applications as 'top', 'ps', 'w', etc....

A test on a mechin (4096cpu, 256 nodes, 4592 irqs) shows

 %time cat /proc/stat > /dev/null
 Before Patch:  12.627 sec
 After  Patch:  2.459 sec

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Jack Steiner <steiner@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:13 -07:00
Davidlohr Bueso
19cd56c48d procfs: fix /proc/softirqs formatting
The length of the BLOCK_IPOLL string is making i's value be printed too
far to the right.  This patch fixes this and makes the output a bit
neater.

Currently:
                CPU0
      HI:          0
   TIMER:     599792
  NET_TX:          2
  NET_RX:          6
   BLOCK:      80807
BLOCK_IOPOLL:          0
 TASKLET:      20012
   SCHED:          0
 HRTIMER:         63
     RCU:     619279

With patch:
                    CPU0
          HI:          0
       TIMER:     585582
      NET_TX:          2
      NET_RX:          6
       BLOCK:      80320
BLOCK_IOPOLL:          0
     TASKLET:      19287
       SCHED:          0
     HRTIMER:         62
         RCU:     604441

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Acked-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:13 -07:00
Nikanth Karthikesan
b40d4f84be /proc/pid/smaps: export amount of anonymous memory in a mapping
Export the number of anonymous pages in a mapping via smaps.

Even the private pages in a mapping backed by a file, would be marked as
anonymous, when they are modified. Export this information to user-space via
smaps.

Exporting this count will help gdb to make a better decision on which
areas need to be dumped in its coredump; and should be useful to others
studying the memory usage of a process.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:13 -07:00
Roland McGrath
895021552d coredump: default CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
The userland ELF tools have been coping with partial-segments core files
for a few years now.  Multiple distro builds are now setting this option.
It behooves everyone who ever deals with core files to have more info
dumped in there, especially as more and more people's compilers are
producing build IDs.  Make it the default.

Anyone using older tools confused by these core files can configure this
option off, or just change /proc/PID/coredump_filter after boot.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:12 -07:00
Xiaotian Feng
1b0d300bd0 core_pattern: fix truncation by core_pattern handler with long parameters
We met a parameter truncated issue, consider following:
> echo "|/root/core_pattern_pipe_test %p /usr/libexec/blah-blah-blah \
%s %c %p %u %g 11 12345678901234567890123456789012345678 %t" > \
/proc/sys/kernel/core_pattern

This is okay because the strings is less than CORENAME_MAX_SIZE.  "cat
/proc/sys/kernel/core_pattern" shows the whole string.  but after we run
core_pattern_pipe_test in man page, we found last parameter was truncated
like below:

        argc[10]=<12807486>

The root cause is core_pattern allows % specifiers, which need to be
replaced during parse time, but the replace may expand the strings to
larger than CORENAME_MAX_SIZE.  So if the last parameter is % specifiers,
the replace code is using snprintf(out_ptr, out_end - out_ptr, ...), this
will write out of corename array.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:12 -07:00
KOSAKI Motohiro
9b1bf12d5d signals: move cred_guard_mutex from task_struct to signal_struct
Oleg Nesterov pointed out we have to prevent multiple-threads-inside-exec
itself and we can reuse ->cred_guard_mutex for it.  Yes, concurrent
execve() has no worth.

Let's move ->cred_guard_mutex from task_struct to signal_struct.  It
naturally prevent multiple-threads-inside-exec.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:12 -07:00
Ondrej Zary
e45c9effed isofs: work-around for Rock Ridge+Joliet CDs with empty ISO root directory
If a CD has both Rock Ridge and Joliet extensions and the ISO root
directory is empty, no files are visible.  Disable Rock Ridge extensions
in this case and use Joliet root directory instead.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:08 -07:00
Jan Kara
4408ea41c0 quota: Fix possible oops in __dquot_initialize()
When quotaon(8) races with __dquot_initialize() or dqget() fails because
of EIO, ENOSPC, or similar error, we could possibly dereference NULL pointer
in inode->i_dquot[cnt]. Add proper checking.

Reported-by: Dmitry Monakhov <dmonakhov@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:06 +02:00
Namhyung Kim
a4c18ad2ee ext3: Update kernel-doc comments
Update missing/broken argument descriptions and fix formatting.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:05 +02:00
Andrea Gelmini
bcf3d0bcff jbd/2: fixed typos
"wakup"

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:05 +02:00
Andrea Gelmini
58c6ed38a1 ext2: fixed typo.
"excpet"

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:05 +02:00
Namhyung Kim
db50d20b1d ext3: Fix debug messages in ext3_group_extend()
Fix a typo, break long lines and use E3FSBLK on ext3_fsblk_t.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:05 +02:00
Namhyung Kim
e4d5e3a497 jbd: Convert atomic_inc() to get_bh()
Convert atomic_inc(&bh->b_count) to get_bh(bh) for consistency.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:05 +02:00
Namhyung Kim
bfa01dfbe0 ext3: Remove misplaced BUFFER_TRACE() in ext3_truncate()
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:04 +02:00
Namhyung Kim
b8ea49fa9b jbd: Fix debug message in do_get_write_access()
'buffer_head' should be 'journal_head'.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:04 +02:00
Namhyung Kim
2a0e33889b jbd: Check return value of __getblk()
Fail journal creation if __getblk() returns NULL.  unlikely() is
added because it is called in a loop and we've been OK without
the check until now.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:04 +02:00
Namhyung Kim
81a4e320e6 ext3: Use DIV_ROUND_UP() on group desc block counting
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:04 +02:00
Namhyung Kim
4569cd1b0d ext3: Return proper error code on ext3_fill_super()
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:03 +02:00
Namhyung Kim
57e94d8647 ext3: Remove unnecessary casts on bh->b_data
bh->b_data is already a pointer to char so casts to 'char *' should
be meaningless. Remove them.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:03 +02:00
Namhyung Kim
df0d6b8ff1 ext3: Cleanup ext3_setup_super()
Fix mount-count check to emit warning only if s_max_mnt_count
is greater than 0 according to man tune2fs(8). Also removes
unnecessary casts.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:03 +02:00
Jan Kara
86f3cbec4a quota: Fix issuing of warnings from dquot_transfer
__dquot_transfer accidentally called flush_warnings for a wrong set of
dquots which could result in quota warnings being issued with a wrong
identification. Also when operation fails because of EDQUOT, there's no
need check for issuing information message about user getting below limits
(no transfer has actually happened).

Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:03 +02:00
Dmitry
9e32784b71 quota: fix dquot_disable vs dquot_transfer race v2
I've got following lockup:
dquot_disable                              dquot_transfer
                                            ->dqget()
					       sb_has_quota_active
dqopt->flags &= ~dquot_state_flag(f, cnt)      atomic_inc(dq->dq_count)
 ->drop_dquot_ref(sb, cnt);
    down_write(dqptr_sem)
    inode->i_dquot[cnt] = NULL              ->__dquot_transfer
invalidate_dquots(sb, cnt);		       down_write(&dqptr_sem)
  ->wait for dq_wait_unused		       inode->i_dquot = new_dquot
  /* wait forever */                            ^^^^New quota user^^^^^^

We cannot allow new references to dquots from inodes after drop_dquot_ref()
has removed them.  We have to recheck quota state under dqptr_sem and before
assignment, as we do it in dquot_initialize().

Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:02 +02:00
Namhyung Kim
a910eefa51 jbd: Convert bitops to buffer fns
Convert set/clear_bit(BH_JWrite, ...) to set/clear_buffer_jwrite()
for consistency.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:02 +02:00
Darrick J. Wong
dff6825e9f ext3/jbd: Avoid WARN() messages when failing to write the superblock
This fixes a WARN backtrace in mark_buffer_dirty() that occurs during unmount
when the underlying block device is removed.  This bug has been seen on System
Z when removing all paths from a multipath-backed ext3 mount; on System P when
injecting enough PCI EEH errors to make the SCSI controller go offline; and
similar warnings have been seen (and patched) with ext2/ext4.

The super block update from a previous operation has marked the buffer as in
error, and the flag has to be cleared before doing the update. Similar changes
have been made to ext4 by commit 914258bf2c.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:02 +02:00
Namhyung Kim
8117f98c05 jbd: Use offset_in_page() instead of manual calculation
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:02 +02:00
Namhyung Kim
2b23976908 jbd: Remove unnecessary goto statement
Remove goto statement which jumps to very next line. Also remove
target label because it is no longer used anywhere.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:01 +02:00
Namhyung Kim
f81e3d4564 jbd: Use printk_ratelimited() in journal_alloc_journal_head()
Use printk_ratelimited() instead of doing it manually.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:01 +02:00
Namhyung Kim
c5639bef63 jbd: Move debug message into #ifdef area
Move call to jbd_debug() into #ifdef CONFIG_JBD_DEBUG block because
'dropped' is declared there. The code could be compiled without this
change anyway, simply because jbd_debug() expands to nothing if
!CONFIG_JBD_DEBUG but IMHO it doesn't look good in general.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:01 +02:00
Namhyung Kim
26f78b7a42 ext2: fix comment on ext2_try_to_allocate()
@handle doesn't exist in ext2. Remove it.
Also, fit comment header into kernel-doc format.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-10-28 01:30:01 +02:00
Arnd Bergmann
72f98e7255 locks: turn lock_flocks into a spinlock
Nothing depends on lock_flocks using the BKL
any more, so we can do the switch over to
a private spinlock.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-27 22:07:36 +02:00
Linus Torvalds
f7347ce4ee fasync: re-organize fasync entry insertion to allow it under a spinlock
You currently cannot use "fasync_helper()" in an atomic environment to
insert a new fasync entry, because it will need to allocate the new
"struct fasync_struct".

Yet fcntl_setlease() wants to call this under lock_flocks(), which is in
the process of being converted from the BKL to a spinlock.

In order to fix this, this abstracts out the actual fasync list
insertion and the fasync allocations into functions of their own, and
teaches fs/locks.c to pre-allocate the fasync_struct entry.  That way
the actual list insertion can happen while holding the required
spinlock.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bfields@redhat.com: rebase on top of my changes to Arnd's patch]
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-27 22:06:17 +02:00
Arnd Bergmann
c5b1f0d92c locks/nfsd: allocate file lock outside of spinlock
As suggested by Christoph Hellwig, this moves allocation
of new file locks out of generic_setlease into the
callers, nfs4_open_delegation and fcntl_setlease in order
to allow GFP_KERNEL allocations when lock_flocks has
become a spinlock.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>
2010-10-27 21:41:50 +02:00
J. Bruce Fields
a282a1fa6b lockd: fix nlmsvc_notify_blocked locking
nlmsvc_notify_blocked walks the nlm_blocked list,
which requires nlm_blocked_lock.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-27 21:39:50 +02:00
Arnd Bergmann
763641d812 lockd: push lock_flocks down
lockd should use lock_flocks() instead of lock_kernel()
to lock against posix locks accessing the i_flock list.

This is a prerequisite to turning lock_flocks into a
spinlock.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>
2010-10-27 21:39:39 +02:00
Christoph Hellwig
85b8fe8cc4 hfsplus: free space correcly for files unlinked while open
hfsplus_delete_inode only truncates away all block allocations if
i_nlink is zero.  Make sure we properly drop the unlink count even
when doing the rename hack for open but unlinked files.

Signed-off-by: Christoph Hellwig <hch@tuxera.com>
2010-10-27 13:45:50 +02:00
Linus Torvalds
426e1f5cec Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
  split invalidate_inodes()
  fs: skip I_FREEING inodes in writeback_sb_inodes
  fs: fold invalidate_list into invalidate_inodes
  fs: do not drop inode_lock in dispose_list
  fs: inode split IO and LRU lists
  fs: switch bdev inode bdi's correctly
  fs: fix buffer invalidation in invalidate_list
  fsnotify: use dget_parent
  smbfs: use dget_parent
  exportfs: use dget_parent
  fs: use RCU read side protection in d_validate
  fs: clean up dentry lru modification
  fs: split __shrink_dcache_sb
  fs: improve DCACHE_REFERENCED usage
  fs: use percpu counter for nr_dentry and nr_dentry_unused
  fs: simplify __d_free
  fs: take dcache_lock inside __d_path
  fs: do not assign default i_ino in new_inode
  fs: introduce a per-cpu last_ino allocator
  new helper: ihold()
  ...
2010-10-26 17:58:44 -07:00