Commit graph

6930 commits

Author SHA1 Message Date
Nick Piggin
eedcbba5e0 bfs: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:56 -07:00
Nick Piggin
d6091b7201 hpfs: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:56 -07:00
Nick Piggin
7c0efc6277 hfsplus: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:56 -07:00
Nick Piggin
7903d9eed8 hfs: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:56 -07:00
Nick Piggin
d7777a25a0 fat: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
89e107877b fs: new cont helpers
Rework the generic block "cont" routines to handle the new aops.  Supporting
cont_prepare_write would take quite a lot of code to support, so remove it
instead (and we later convert all filesystems to use it).

write_begin gets passed AOP_FLAG_CONT_EXPAND when called from
generic_cont_expand, so filesystems can avoid the old hacks they used.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Steven Whitehouse
7765ec26ae gfs2: convert to new aops
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
d79689c703 xfs: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
bfc1af650a ext4: convert to new aops
Convert ext4 to use write_begin()/write_end() methods.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dmitriy Monakhov <dmonakhov@sw.ru>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
f4fc66a894 ext3: convert to new aops
Various fixes and improvements

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
f34fb6eccc ext2: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
6272b5a586 block_dev: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
800d15a53e implement simple fs aops
Implement new aops for some of the simpler filesystems.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
afddba49d1 fs: introduce write_begin, write_end, and perform_write aops
These are intended to replace prepare_write and commit_write with more
flexible alternatives that are also able to avoid the buffered write
deadlock problems efficiently (which prepare_write is unable to do).

[mark.fasheh@oracle.com: API design contributions, code review and fixes]
[akpm@linux-foundation.org: various fixes]
[dmonakhov@sw.ru: new aop block_write_begin fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
637aff46f9 fs: fix data-loss on error
New buffers against uptodate pages are simply be marked uptodate, while the
buffer_new bit remains set.  This causes error-case code to zero out parts of
those buffers because it thinks they contain stale data: wrong, they are
actually uptodate so this is a data loss situation.

Fix this by actually clearning buffer_new and marking the buffer dirty.  It
makes sense to always clear buffer_new before setting a buffer uptodate.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin
eb2be18931 mm: buffered write cleanup
Quite a bit of code is used in maintaining these "cached pages" that are
probably pretty unlikely to get used. It would require a narrow race where
the page is inserted concurrently while this process is allocating a page
in order to create the spare page. Then a multi-page write into an uncached
part of the file, to make use of it.

Next, the buffered write path (and others) uses its own LRU pagevec when it
should be just using the per-CPU LRU pagevec (which will cut down on both data
and code size cacheline footprint). Also, these private LRU pagevecs are
emptied after just a very short time, in contrast with the per-CPU pagevecs
that are persistent. Net result: 7.3 times fewer lru_lock acquisitions required
to add the pages to pagecache for a bulk write (in 4K chunks).

[this gets rid of some cond_resched() calls in readahead.c and mpage.c due
 to clashes in -mm. What put them there, and why? ]

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:54 -07:00
Nick Piggin
a4b0672db3 fs: fix nobh error handling
nobh mode error handling is not just pretty slack, it's wrong.

One cannot zero out the whole page to ensure new blocks are zeroed, because
it just brings the whole page "uptodate" with zeroes even if that may not
be the correct uptodate data.  Also, other parts of the page may already
contain dirty data which would get lost by zeroing it out.  Thirdly, the
writeback of zeroes to the new blocks will also erase existing blocks.  All
these conditions are pagecache and/or filesystem corruption.

The problem comes about because we didn't keep track of which buffers
actually are new or old.  However it is not enough just to keep only this
state, because at the point we start dirtying parts of the page (new
blocks, with zeroes), the handling of IO errors becomes impossible without
buffers because the page may only be partially uptodate, in which case the
page flags allone cannot capture the state of the parts of the page.

So allocate all buffers for the page upfront, but leave them unattached so
that they don't pick up any other references and can be freed when we're
done.  If the error path is hit, then zero the new buffers as the regular
buffer path does, then attach the buffers to the page so that it can
actually be written out correctly and be subject to the normal IO error
handling paths.

As an upshot, we save 1K of kernel stack on ia64 or powerpc 64K page
systems.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:54 -07:00
Dmitry Monakhov
68671f35fe mm: add end_buffer_read helper function
Move duplicated code from end_buffer_read_XXX methods to separate helper
function.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:53 -07:00
Nick Piggin
557ed1fa26 remove ZERO_PAGE
The commit b5810039a5 contains the note

  A last caveat: the ZERO_PAGE is now refcounted and managed with rmap
  (and thus mapcounted and count towards shared rss).  These writes to
  the struct page could cause excessive cacheline bouncing on big
  systems.  There are a number of ways this could be addressed if it is
  an issue.

And indeed this cacheline bouncing has shown up on large SGI systems.
There was a situation where an Altix system was essentially livelocked
tearing down ZERO_PAGE pagetables when an HPC app aborted during startup.
This situation can be avoided in userspace, but it does highlight the
potential scalability problem with refcounting ZERO_PAGE, and corner
cases where it can really hurt (we don't want the system to livelock!).

There are several broad ways to fix this problem:
1. add back some special casing to avoid refcounting ZERO_PAGE
2. per-node or per-cpu ZERO_PAGES
3. remove the ZERO_PAGE completely

I will argue for 3. The others should also fix the problem, but they
result in more complex code than does 3, with little or no real benefit
that I can see.

Why? Inserting a ZERO_PAGE for anonymous read faults appears to be a
false optimisation: if an application is performance critical, it would
not be doing many read faults of new memory, or at least it could be
expected to write to that memory soon afterwards. If cache or memory use
is critical, it should not be working with a significant number of
ZERO_PAGEs anyway (a more compact representation of zeroes should be
used).

As a sanity check -- mesuring on my desktop system, there are never many
mappings to the ZERO_PAGE (eg. 2 or 3), thus memory usage here should not
increase much without it.

When running a make -j4 kernel compile on my dual core system, there are
about 1,000 mappings to the ZERO_PAGE created per second, but about 1,000
ZERO_PAGE COW faults per second (less than 1 ZERO_PAGE mapping per second
is torn down without being COWed). So removing ZERO_PAGE will save 1,000
page faults per second when running kbuild, while keeping it only saves
less than 1 page clearing operation per second. 1 page clear is cheaper
than a thousand faults, presumably, so there isn't an obvious loss.

Neither the logical argument nor these basic tests give a guarantee of no
regressions. However, this is a reasonable opportunity to try to remove
the ZERO_PAGE from the pagefault path. If it is found to cause regressions,
we can reintroduce it and just avoid refcounting it.

The /dev/zero ZERO_PAGE usage and TLB tricks also get nuked.  I don't see
much use to them except on benchmarks.  All other users of ZERO_PAGE are
converted just to use ZERO_PAGE(0) for simplicity. We can look at
replacing them all and maybe ripping out ZERO_PAGE completely when we are
more satisfied with this solution.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus "snif" Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:53 -07:00
Fengguang Wu
f4e6b498d6 readahead: combine file_ra_state.prev_index/prev_offset into prev_pos
Combine the file_ra_state members
				unsigned long prev_index
				unsigned int prev_offset
into
				loff_t prev_pos

It is more consistent and better supports huge files.

Thanks to Peter for the nice proposal!

[akpm@linux-foundation.org: fix shift overflow]
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:52 -07:00
Jens Axboe
a39d113936 Merge branch 'barrier' into for-linus 2007-10-16 12:29:29 +02:00
Jens Axboe
992c5ddaf1 bio: make freeing of ->bi_io_vec conditional in bio_free()
The empty barrier patches do not carry data, so they have no
iovec attached.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:03:52 +02:00
Jens Axboe
2b94de552e bio: use memset() in bio_init()
Use memset() to clear the bio, instead of doing each field manually.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:03:51 +02:00
Jens Axboe
6866bef40d splice: fix double kunmap() in vmsplice copy path
The out label should not include the unmap, the only way to jump
there already has unmapped the source.

00002000
       f7c21a00 00000000 00000000 c0489036 00018e32 00000002 00000000
00001000
Call Trace:
 [<c0487dd9>] pipe_to_user+0xca/0xd3
 [<c0488233>] __splice_from_pipe+0x53/0x1bd
 [<c0454947>] ------------[ cut here ]------------
filemap_fault+0x221/0x380
 [<c0487d0f>] pipe_to_user+0x0/0xd3
 [<c0489036>] sys_vmsplice+0x3b7/0x422
 [<c045ec3f>] kernel BUG at mm/highmem.c:206!
handle_mm_fault+0x4d5/0x8eb
 [<c041ed5b>] kmap_atomic+0x1c/0x20
 [<c045d33d>] unmap_vmas+0x3d1/0x584
 [<c045f717>] free_pgtables+0x90/0xa0
 [<c041d84b>] pgd_dtor+0x0/0x1
 [<c044d665>] audit_syscall_exit+0x2aa/0x2c6
 [<c0407817>] do_syscall_trace+0x124/0x169
 [<c0404df2>] syscall_call+0x7/0xb
 =======================
Code: 2d 00 d0 5b 00 25 00 00 e0 ff 29 invalid opcode: 0000 [#1]
c2 89 d0 c1 e8 0c 8b 14 85 a0 6c 7c c0 4a 85 d2 89 14 85 a0 6c 7c c0 74 07
31 c9 4a 75 15 eb 04 <0f> 0b eb fe 31 c9 81 3d 78 38 6d c0 78 38 6d c0 0f
95 c1 b0 01
EIP: [<c045bbc3>] kunmap_high+0x51/0x8e SS:ESP 0068:f5960df0
SMP
Modules linked in: netconsole autofs4 hidp nfs lockd nfs_acl rfcomm l2cap
bluetooth sunrpc ipv6 ib_iser rdma_cm ib_cm iw_cmib_sa ib_mad ib_core
ib_addr iscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath
dm_mod video output sbs batteryac parport_pc lp parport sg i2c_piix4
i2c_core floppy cfi_probe gen_probe scb2_flash mtd chipreg tg3 e1000 button
ide_cd serio_raw cdrom aic7xxx scsi_transport_spi sd_mod scsi_mod ext3 jbd
ehci_hcd ohci_hcd uhci_hcd
CPU:    3
EIP:    0060:[<c045bbc3>]    Not tainted VLI
EFLAGS: 00010246   (2.6.23 #1)
EIP is at kunmap_high+0x51/0x8e

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 10:01:29 +02:00
Tim Shimmin
150f29ef2e [XFS] no longer using io_vnode, as was remaining from 23 cherrypick
Because we cherrypicked SGI-Modid xfs-linux-melb:xfs-kern:29675a
and it depended on the sgi mod which removed io_vnode (which was
not cherrypicked in 23) it was hand modified.
This fixes things back up (to the originial mod) now we have moved
on again.

Reviewed-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 16:20:12 +10:00
Tim Shimmin
479ba36bbb [XFS] Remove STATIC which was missing from prior manual merge
Removes STATIC on xfs_freeze function which was not manually
applied for SGI-Modid: xfs-linux-melb:xfs-kern:29504a.

Reviewed-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 15:32:57 +10:00
Tim Shimmin
cd514bdaa8 [XFS] Put back the QUEUE_ORDERED_NONE test in the barrier check.
Put back the QUEUE_ORDERED_NONE test which caused us grief in sles when it
was taken out as, IIRC, it allowed md/lvm to be thought of as supporting
barriers when they weren't in some configurations. This patch will be
reverting what went in as part of a change for the SGI-pv 964544
(SGI-Modid: xfs-linux-melb:xfs-kern:28568a).

SGI-PV: 971783
SGI-Modid: xfs-linux-melb:xfs-kern:29882a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2007-10-16 14:23:21 +10:00
Lachlan McIlroy
bebf963fec [XFS] Turn off XBF_ASYNC flag before re-reading superblock.
SGI-PV: 971603
SGI-Modid: xfs-linux-melb:xfs-kern:29871a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:22:39 +10:00
Lachlan McIlroy
e893bffd4c [XFS] avoid race in sync_inodes() that can fail to write out all dirty data
In xfs_fs_sync_super() treat a sync the same as a filesystem freeze. This
is needed to force the log to disk for inodes which are not marked dirty
in the Linux inode (the inodes are marked dirty on completion of the log
I/O) and so sync_inodes() will not flush them.

In xfs_fs_write_inode() a synchronous flush will not get an EAGAIN from
xfs_inode_flush() and if an asynchronous flush returns EAGAIN we should
pass it on to the caller. If we get an error while flushing the inode then
re-dirty it so we can try again later.

SGI-PV: 971670
SGI-Modid: xfs-linux-melb:xfs-kern:29860a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:22:28 +10:00
Lachlan McIlroy
c2cba57e83 [XFS] This fix prevents bulkstat from spinning in an infinite loop.
Here 'agino' increments through the inodes in an allocation group. At the
end of the innermost 'for' loop it will hold the value of the next inode
to look at (ie the first inode in the next cluster/chunk). Assigning
'lastino' to 'agino' resets it to the last inode in the last inode cluster
we just looked at. This causes us to look up the very same cluster and
examine all the inodes all over again, and again, and again...

We also want to set 'lastino' for the cases when we're not interested in
the inode so that the next call to bulkstat won't re-examine the same
uninteresting inodes.

SGI-PV: 971064
SGI-Modid: xfs-linux-melb:xfs-kern:29840a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:21:56 +10:00
Christoph Hellwig
3e5daf05a0 [XFS] simplify xfs_create/mknod/symlink prototype
Simplify the prototype for xfs_create/xfs_mkdir/xfs_symlink by not passing
down a bhv_vattr_t that just hogs stack space. Instead pass down the mode
in a mode_t and in case of xfs_create the rdev as a scalar type as well.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29794a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:15:32 +10:00
Christoph Hellwig
c83bfab1fa [XFS] avoid xfs_getattr in XFS_IOC_FSGETXATTR ioctl
No need to call into xfs_getattr and put a big bhv_vattr_t on the stack
just to get a little information from the XFS inode.

Add a helper called xfs_ioc_fsgetxattr instead that deals with retrieving
the information in a clean way.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29780a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:21:48 +10:00
Vlad Apostolov
859d718279 [XFS] get_bulkall() could return incorrect inode state
In the following scenario xfs_bulkstat() returns incorrect stale inode
state:

1. File_A is created and its inode synced to disk. 2. File_A is unlinked
and doesn't exist anymore. 3. Filesystem sync is invoked. 4. File_B is
created. File_B happens to reclaim File_A's inode. 5. xfs_bulkstat() is
called and detects File_B but reports the

incorrect File_A inode state.

Explanation for the incorrect inode state is that inodes are not
immediately synced on file create for performance reasons. This leaves the
on-disk inode buffer uninitialized (or with old state from a previous
generation inode) and this is what xfs_bulkstat() would report.

The patch marks the on-disk inode buffer "dirty" on unlink. When the inode
is reclaimed (by a new file create), xfs_bulkstat() would filter this
inode by the "dirty" mark. Once the inode is flushed to disk, the on-disk
buffer "dirty" mark is automatically removed and a following
xfs_bulkstat() would return the correct inode state.

Marking the on-disk inode buffer "dirty" on unlink is achieved by setting
the on-disk di_nlink field to 0. Note that the in-core di_nlink has
already been set to 0 and a corresponding transaction logged by
xfs_droplink(). This is an exception from the rule that any on-disk inode
buffer changes has to be followed by a disk write (inode flush).
Synchronizing the in-core to on-disk di_nlink values in advance (before
the actual inode flush to disk) should be fine in this case because the
inode is already unlinked and it would never change its di_nlink again for
this inode generation.

SGI-PV: 970842
SGI-Modid: xfs-linux-melb:xfs-kern:29757a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:21:15 +10:00
Christoph Hellwig
ba532a980b [XFS] Kill unused IOMAP_EOF flag
SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29705a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:20:54 +10:00
Vlad Apostolov
574342f4ad [XFS] fix when DMAPI mount option processing happens
Fix for a regression caused by a recent patch
that moved the DMAPI mount option processing inside xfs_parseargs(). The
DMAPI mount option used to be processed in the DMAPI module loaded before
xfs_parseargs() was invoked.

SGI-PV: 970451
SGI-Modid: xfs-linux-melb:xfs-kern:29683a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:20:39 +10:00
Lachlan McIlroy
5903c4956f [XFS] ensure file size is logged on synchronous writes
Synchronous writes currently log inode changes before syncing pages to
disk. Since the file size is updated on I/O completion we wont be writing
out the updated file size and if we crash the file will have the wrong
size. This change moves the logging after the syncing of the pages to
ensure we log the correct file size.

SGI-PV: 970334
SGI-Modid: xfs-linux-melb:xfs-kern:29649a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:18:38 +10:00
Christoph Hellwig
cc92e7ac8d [XFS] growlock should be a mutex
m_growlock only needs plain binary mutex semantics, so use a struct mutex
instead of a semaphore for it.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29512a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:18:09 +10:00
Christoph Hellwig
0adba5363c [XFS] replace some large xfs_log_priv.h macros by proper functions
... or in the case of XLOG_TIC_ADD_OPHDR remove a useless macro entirely.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29511a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:17:56 +10:00
Christoph Hellwig
b267ce9952 [XFS] kill struct bhv_vfs
Now that struct bhv_vfs doesn't have any members left we can kill it and
go directly from the super_block to the xfs_mount everywhere.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29509a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:17:27 +10:00
Christoph Hellwig
7439449670 [XFS] move syncing related members from struct bhv_vfs to struct xfs_mount
SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29508a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:16:35 +10:00
Christoph Hellwig
bd186aa901 [XFS] kill the vfs_flags member in struct bhv_vfs
All flags are added to xfs_mount's m_flag instead. Note that the 32bit
inode flag was duplicated in both of them, but only cleared in the mount
when it was not nessecary due to the filesystem beeing small enough. Two
flags are still required here - one to indicate the mount option setting,
and one to indicate if it applies or not.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29507a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:45:57 +10:00
Christoph Hellwig
0ce4cfd4f7 [XFS] kill the vfs_fsid and vfs_altfsid members in struct bhv_vfs
vfs_altfsid was just a pointer to mp->m_fixedfsid so we can trivially
replace it with the latter. vfs_fsid also was identical to m_fixedfsid
through rather obfuscated ways so we can kill it as well and simply its
only user.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29506a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:45:02 +10:00
Christoph Hellwig
745f691912 [XFS] call common xfs vfs-level helpers directly and remove vfs operations
Also remove the now dead behavior code.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29505a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:44:08 +10:00
Christoph Hellwig
48c872a9f3 [XFS] decontaminate vfs operations from behavior details
All vfs ops now take struct xfs_mount pointers and the behaviour related
glue is split out into methods of its own.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29504a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:43:55 +10:00
Christoph Hellwig
b09cc77109 [XFS] remove dependency of the quota module on behaviors
Mount options are now parsed by the main XFS module and rejected if quota
support is not available, and there are some new quota operation for the
quotactl syscall and calls to quote in the mount, unmount and sync
callchains.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29503a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:43:26 +10:00
Christoph Hellwig
293688ec42 [XFS] remove dependency of the dmapi module on behaviors
Mount options are now parsed by the main XFS module and rejected if dmapi
support is not available, and there is a new dm operation to send the
mount event.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29502a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:41:15 +10:00
Christoph Hellwig
f541d270db [XFS] move freeing the mount structure from xfs_mount_free into the callers
In the next patch we need to look at the mount structure until just before
it's freed, so we need to be able to free it as the very last thing in
xfs_unmount.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29501a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:40:52 +10:00
Christoph Hellwig
0a74cd1964 [XFS] kill struct bhv_vnode
Now that struct bhv_vnode is empty we can just kill it. Retain bhv_vnode_t
as a typedef for struct inode for the time being until all the fallout is
cleaned up.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29500a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:40:24 +10:00
Christoph Hellwig
2aeaa258c0 [XFS] kill the v_number member in struct bhv_vnode
It's entirely unused except for ignored arguments in the mrlock
initialization, so remove it.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29499a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:39:42 +10:00
Christoph Hellwig
1543d79c45 [XFS] move v_trace from bhv_vnode to xfs_inode
struct bhv_vnode is on it's way out, so move the trace buffer to the XFS
inode. Note that this makes the tracing macros rather misnamed, but this
kind of fallout will be fixed up incrementally later on.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29498a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:39:25 +10:00