Commit graph

13861 commits

Author SHA1 Message Date
Jeff Mahoney
3cd6dbe6fe reiserfs: cleanup path functions
This patch cleans up some redundancies in the reiserfs tree path code.

decrement_bcount() is essentially the same function as brelse(), so we use
that instead.

decrement_counters_in_path() is exactly the same function as pathrelse(), so
we kill that and use pathrelse() instead.

There's also a bit of cleanup that makes the code a bit more readable.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
fba4ebb5f0 reiserfs: factor out buffer_info initialization
This is the first in a series of patches to make balance_leaf() not
quite so insane.

This patch factors out the open coded initializations of buffer_info
structures and defines a few initializers for the 4 cases they're used.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
57fe60df62 reiserfs: add atomic addition of selinux attributes during inode creation
Some time ago, some changes were made to make security inode attributes
be atomically written during inode creation.  ReiserFS fell behind in
this area, but with the reworking of the xattr code, it's now fairly
easy to add.

The following patch adds the ability for security attributes to be added
automatically during inode creation.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
a41f1a4715 reiserfs: use generic readdir for operations across all xattrs
The current reiserfs xattr implementation open codes reiserfs_readdir
and frees the path before calling the filldir function.  Typically, the
filldir function is something that modifies the file system, such as a
chown or an inode deletion that also require reading of an inode
associated with each direntry.  Since the file system is modified, the
path retained becomes invalid for the next run.  In addition, it runs
backwards in attempt to minimize activity.

This is clearly suboptimal from a code cleanliness perspective as well
as performance-wise.

This patch implements a generic reiserfs_for_each_xattr that uses the
generic readdir and a specific filldir routine that simply populates an
array of dentries and then performs a specific operation on them.  When
all files have been operated on, it then calls the operation on the
directory itself.

The result is a noticable code reduction and better performance.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
0ab2621ebd reiserfs: journaled xattrs
Deadlocks are possible in the xattr code between the journal lock and the
xattr sems.

This patch implements journalling for xattr operations. The benefit is
twofold:
 * It gets rid of the deadlock possibility by always ensuring that xattr
   write operations are initiated inside a transaction.
 * It corrects the problem where xattr backing files aren't considered any
   differently than normal files, despite the fact they are metadata.

I discussed the added journal load with Chris Mason, and we decided that
since xattrs (versus other journal activity) is fairly rare, the introduction
of larger transactions to support journaled xattrs wouldn't be too big a deal.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
48b32a3553 reiserfs: use generic xattr handlers
Christoph Hellwig had asked me quite some time ago to port the reiserfs
xattrs to the generic xattr interface.

This patch replaces the reiserfs-specific xattr handling code with the
generic struct xattr_handler.

However, since reiserfs doesn't split the prefix and name when accessing
xattrs, it can't leverage generic_{set,get,list,remove}xattr without
needlessly reconstructing the name on the back end.

Update 7/26/07: Added missing dput() to deletion path.
Update 8/30/07: Added missing mark_inode_dirty when i_mode is used to
                represent an ACL and no previous ACL existed.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
8ecbe550a1 reiserfs: remove i_has_xattr_dir
With the changes to xattr root locking, the i_has_xattr_dir flag
is no longer needed. This patch removes it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
8b6dd72a44 reiserfs: make per-inode xattr locking more fine grained
The per-inode locking can be made more fine-grained to surround just the
interaction with the filesystem itself.  This really only applies to
protecting reads during a write, since concurrent writes are barred with
inode->i_mutex at the vfs level.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
d984561b32 reiserfs: eliminate per-super xattr lock
With the switch to using inode->i_mutex locking during lookups/creation
in the xattr root, the per-super xattr lock is no longer needed.

This patch removes it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
6c17675e1e reiserfs: simplify xattr internal file lookups/opens
The xattr file open/lookup code is needlessly complex.  We can use
vfs-level operations to perform the same work, and also simplify the
locking constraints.  The locking advantages will be exploited in future
patches.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
a72bdb1cd2 reiserfs: Clean up xattrs when REISERFS_FS_XATTR is unset
The current reiserfs xattr implementation will not clean up old xattr
files if files are deleted when REISERFS_FS_XATTR is unset.  This
results in inaccessible lost files, wasting space.

This patch compiles in basic xattr knowledge, such as how to delete them
and change ownership for quota tracking.  If the file system has never
used xattrs, then the operation is quite fast: it returns immediately
when it sees there is no .reiserfs_priv directory.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
6dfede6963 reiserfs: remove IS_PRIVATE helpers
There are a number of helper functions for marking a reiserfs inode
private that were leftover from reiserfs did its own thing wrt to
private inodes.  S_PRIVATE has been in the kernel for some time, so this
patch removes the helpers and uses IS_PRIVATE instead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
010f5a21a3 reiserfs: remove link detection code
Early in the reiserfs xattr development, there was a plan to use
hardlinks to save disk space for identical xattrs.  That code never
materialized and isn't going to, so this patch removes the detection
code.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
ec6ea56b2f reiserfs: xattr reiserfs_get_page takes offset instead of index
This patch changes reiserfs_get_page to take an offset rather than an
index since no callers calculate the index differently.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
f437c529e3 reiserfs: small variable cleanup
This patch removes the xinode and mapping variables from
reiserfs_xattr_{get,set}.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
0030b64570 reiserfs: use reiserfs_error()
This patch makes many paths that are currently using warnings to handle
the error.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
1e5e59d431 reiserfs: introduce reiserfs_error()
Although reiserfs can currently handle severe errors such as journal failure,
it cannot handle less severe errors like metadata i/o failure. The following
patch adds a reiserfs_error() function akin to the one in ext3.

Subsequent patches will use this new error handler to handle errors more
gracefully in general.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:36 -07:00
Jeff Mahoney
32e8b10629 reiserfs: rearrange journal abort
This patch kills off reiserfs_journal_abort as it is never called, and
combines __reiserfs_journal_abort_{soft,hard} into one function called
reiserfs_abort_journal, which performs the same work. It is silent
as opposed to the old version, since the message was always issued
after a regular 'abort' message.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:36 -07:00
Jeff Mahoney
c3a9c2109f reiserfs: rework reiserfs_panic
ReiserFS panics can be somewhat inconsistent.
In some cases:
 * a unique identifier may be associated with it
 * the function name may be included
 * the device may be printed separately

This patch aims to make warnings more consistent. reiserfs_warning() prints
the device name, so printing it a second time is not required. The function
name for a warning is always helpful in debugging, so it is now automatically
inserted into the output. Hans has stated that every warning should have
a unique identifier. Some cases lack them, others really shouldn't have them.
reiserfs_warning() now expects an id associated with each message. In the
rare case where one isn't needed, "" will suffice.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:36 -07:00
Jeff Mahoney
78b6513d28 reiserfs: add locking around error buffer
The formatting of the error buffer is race prone. It uses static buffers
for both formatting and output. While overwriting the error buffer
can product garbled output, overwriting the format buffer with incompatible
% directives can cause crashes.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:36 -07:00
Jeff Mahoney
cacbe3d7ad reiserfs: prepare_error_buf wrongly consumes va_arg
vsprintf will consume varargs on its own. Skipping them manually
results in garbage in the error buffer, or Oopses in the case of
pointers.

This patch removes the advancement and fixes a number of bugs where
crashes were observed as side effects of a regular error report.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:36 -07:00
Jeff Mahoney
45b03d5e8e reiserfs: rework reiserfs_warning
ReiserFS warnings can be somewhat inconsistent.
In some cases:
 * a unique identifier may be associated with it
 * the function name may be included
 * the device may be printed separately

This patch aims to make warnings more consistent. reiserfs_warning() prints
the device name, so printing it a second time is not required. The function
name for a warning is always helpful in debugging, so it is now automatically
inserted into the output. Hans has stated that every warning should have
a unique identifier. Some cases lack them, others really shouldn't have them.
reiserfs_warning() now expects an id associated with each message. In the
rare case where one isn't needed, "" will suffice.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:36 -07:00
Jeff Mahoney
1d889d9958 reiserfs: make some warnings informational
In several places, reiserfs_warning is used when there is no warning, just
a notice. This patch changes some of them to indicate that the message
is merely informational.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:35 -07:00
Jeff Mahoney
a5437152ee reiserfs: use more consistent printk formatting
The output format between a warning/error/panic/info/etc changes with
which one is used.

The following patch makes the messages more internally consistent, but also
more consistent with other Linux filesystems.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:35 -07:00
Jeff Mahoney
eba0030559 reiserfs: use buffer_info for leaf_paste_entries
This patch makes leaf_paste_entries more consistent with respect to the
other leaf operations.  Using buffer_info instead of buffer_head
directly allows us to get a superblock pointer for use in error
handling.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:35 -07:00
Jeff Mahoney
600ed41675 reiserfs: audit transaction ids to always be unsigned ints
This patch fixes up the reiserfs code such that transaction ids are
always unsigned ints.  In places they can currently be signed ints or
unsigned longs.

The former just causes an annoying clm-2200 warning and may join a
transaction when it should wait.

The latter is just for correctness since the disk format uses a 32-bit
transaction id.  There aren't any runtime problems that result from it
not wrapping at the correct location since the value is truncated
correctly even on big endian systems.  The 0 value might make it to
disk, but the mount-time checks will bump it to 10 itself.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:35 -07:00
Jeff Mahoney
702d21c6f6 reiserfs: add support for mount count incrementing
The following patch adds the fields for tracking mount counts and last
fsck timestamps to the superblock.  It also increments the mount count
on every read-write mount.

Reiserfsprogs 3.6.21 added support for these fields.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:35 -07:00
Linus Torvalds
2d25ee36c8 Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6
* 'bkl-removal' of git://git.lwn.net/linux-2.6:
  Fix a lockdep warning in fasync_helper()
  Add a missing unlock_kernel() in raw_open()
2009-03-30 11:31:47 -07:00
Linus Torvalds
b80e0d2716 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix fuse_file_lseek returning with lock held
2009-03-30 10:08:10 -07:00
Linus Torvalds
ffd1428514 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
  jfs: needs crc32_le
  jfs: Fix error handling in metapage_writepage()
  jfs: return f_fsid for statfs(2)
  jfs: remove xtLookupList()
  jfs: clean up a dangling comment
2009-03-30 10:02:36 -07:00
Dan Carpenter
5291658d87 fuse: fix fuse_file_lseek returning with lock held
This bug was found with smatch (http://repo.or.cz/w/smatch.git/).  If
we return directly the inode->i_mutex lock doesn't get released.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
2009-03-30 17:26:24 +02:00
Jonathan Corbet
4a6a449969 Fix a lockdep warning in fasync_helper()
Lockdep gripes if file->f_lock is taken in a no-IRQ situation, since that
is not always the case.  We don't really want to disable IRQs for every
acquisition of f_lock; instead, just move it outside of fasync_lock.

Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-30 08:00:24 -06:00
Tero Roponen
a4e49cb69e trivial: remove unused variable 'path' in alloc_file()
'struct path' is not used in alloc_file().

Signed-off-by: Tero Roponen <tero.roponen@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-03-30 15:22:03 +02:00
Masatake YAMATO
3e3cb64f6c trivial: fix a pdlfush -> pdflush typo in comment
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-03-30 15:22:03 +02:00
Alberto Bertogli
c7eee1b836 trivial: Fix typo in bio_split()'s documentation
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-03-30 15:22:02 +02:00
Matt LaPlante
692105b8ac trivial: fix typos/grammar errors in Kconfig texts
Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-03-30 15:22:01 +02:00
Uwe Kleine-Koenig
973c32bebf trivial: fix typo "kernal" -> "kernel"
Signed-off-by: Uwe Kleine-Koenig <Uwe.Kleine-Koenig@digi.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-03-30 15:21:57 +02:00
Rusty Russell
af76aba00f cpumask: fix seq_bitmap_*() functions.
1) seq_bitmap_list() should take a const.
2) All the seq_bitmap should use cpumask_bits().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-03-30 22:05:11 +10:30
Christoph Hellwig
27174203f5 xfs: cleanup uuid handling
The uuid table handling should not be part of a semi-generic uuid library
but in the XFS code using it, so move those bits to xfs_mount.c and
refactor the whole glob to make it a proper abstraction.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-30 10:21:31 +02:00
Andy Adamson
e354d571bb nfsd: embed nfsd4_current_state in nfsd4_compoundres
Remove the allocation of struct nfsd4_compound_state.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-29 16:20:12 -04:00
Christoph Hellwig
1a5902c5d2 xfs: remove m_attroffset
With the upcoming v3 inodes the default attroffset needs to be calculated
for each specific inode, so we can't cache it in the superblock anymore.

Also replace the assert for wrong inode sizes with a proper error check
also included in non-debug builds.  Note that the ENOSYS return for
that might seem odd, but that error is returned by xfs_mount_validate_sb
for all theoretically valid but not supported filesystem geometries.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
2009-03-29 19:26:46 +02:00
Malcolm Parsons
9da096fd13 xfs: fix various typos
Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-03-29 09:55:42 +02:00
Hisashi Hifumi
bddaafa11a xfs: pagecache usage optimization
Hi.

I introduced "is_partially_uptodate" aops for XFS.

A page can have multiple buffers and even if a page is not uptodate,
some buffers can be uptodate on pagesize != blocksize environment.

This aops checks that all buffers which correspond to a part of a file
that we want to read are uptodate. If so, we do not have to issue actual
read IO to HDD even if a page is not uptodate because the portion we
want to read are uptodate.

"block_is_partially_uptodate" function is already used by ext2/3/4.
With the following patch random read/write mixed workloads or random read
after random write workloads can be optimized and we can get performance
improvement.

I did a performance test using the sysbench.

#sysbench --num-threads=4 --max-requests=100000 --test=fileio --file-num=1 \
--file-block-size=8K --file-total-size=1G --file-test-mode=rndrw \
--file-fsync-freq=0 --file-rw-ratio=0.5 run

-2.6.29-rc6
Test execution summary:
    total time:                          123.8645s
    total number of events:              100000
    total time taken by event execution: 442.4994
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0044s
         max:                            0.3387s
         approx.  95 percentile:         0.0118s

-2.6.29-rc6-patched
Test execution summary:
    total time:                          108.0757s
    total number of events:              100000
    total time taken by event execution: 417.7505
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0042s
         max:                            0.3217s
         approx.  95 percentile:         0.0118s

arch: ia64
pagesize: 16k
blocksize: 4k

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:53:38 +02:00
Christoph Hellwig
6447c36209 xfs: remove m_litino
With the upcoming v3 inodes the inode data/attr area size needs to be
calculated for each specific inode, so we can't cache it in the superblock
anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:51:14 +02:00
Christoph Hellwig
a19d9f887d xfs: kill ino64 mount option
The ino64 mount option adds a fixed offset to 32bit inode numbers
to bring them into the 64bit range.  There's no need for this kind
of debug tool given that it's easy to produce real 64bit inode numbers
for testing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:51:08 +02:00
Christoph Hellwig
a0b0b8a5b3 xfs: kill mutex_t typedef
People continue to complain about this for weird reasons, but there's
really no point in keeping this typedef for a couple of users anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:51:00 +02:00
Hugh Dickins
7c2c7d9930 fix setuid sometimes wouldn't
check_unsafe_exec() also notes whether the fs_struct is being
shared by more threads than will get killed by the exec, and if so
sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
But /proc/<pid>/cwd and /proc/<pid>/root lookups make transient
use of get_fs_struct(), which also raises that sharing count.

This might occasionally cause a setuid program not to change euid,
in the same way as happened with files->count (check_unsafe_exec
also looks at sighand->count, but /proc doesn't raise that one).

We'd prefer exec not to unshare fs_struct: so fix this in procfs,
replacing get_fs_struct() by get_fs_path(), which does path_get
while still holding task_lock, instead of raising fs->count.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
___

 fs/proc/base.c |   50 +++++++++++++++--------------------------------
 1 file changed, 16 insertions(+), 34 deletions(-)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-28 17:30:00 -07:00
Hugh Dickins
e426b64c41 fix setuid sometimes doesn't
Joe Malicki reports that setuid sometimes doesn't: very rarely,
a setuid root program does not get root euid; and, by the way,
they have a health check running lsof every few minutes.

Right, check_unsafe_exec() notes whether the files_struct is being
shared by more threads than will get killed by the exec, and if so
sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
But /proc/<pid>/fd and /proc/<pid>/fdinfo lookups make transient
use of get_files_struct(), which also raises that sharing count.

There's a rather simple fix for this: exec's check on files->count
has been redundant ever since 2.6.1 made it unshare_files() (except
while compat_do_execve() omitted to do so) - just remove that check.

[Note to -stable: this patch will not apply before 2.6.29: earlier
releases should just remove the files->count line from unsafe_exec().]

Reported-by: Joe Malicki <jmalicki@metacarta.com>
Narrowed-down-by: Michael Itz <mitz@metacarta.com>
Tested-by: Joe Malicki <jmalicki@metacarta.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-28 17:30:00 -07:00
Hugh Dickins
53e9309e01 compat_do_execve should unshare_files
2.6.26's commit fd8328be87
"sanitize handling of shared descriptor tables in failing execve()"
moved the unshare_files() from flush_old_exec() and several binfmts
to the head of do_execve(); but forgot to make the same change to
compat_do_execve(), leaving a CLONE_FILES files_struct shared across
exec from a 32-bit process on a 64-bit kernel.

It's arguable whether the files_struct really ought to be unshared
across exec; but 2.6.1 made that so to stop the loading binary's fd
leaking into other threads, and a 32-bit process on a 64-bit kernel
ought to behave in the same way as 32 on 32 and 64 on 64.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-28 17:30:00 -07:00
Chuck Lever
3c8c45dfab NFS: Simplify logic to compare socket addresses in client.c
Callback requests from IPv4 servers are now always guaranteed to be
AF_INET, and never mapped IPv4 AF_INET6 addresses.  Both
nfs_match_client() and nfs_find_client() can now share the same
address comparison logic, so fold them together.

We can also dispense with of most of the conditional compilation
in here.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-28 16:51:04 -04:00