evie/android_kernel_oneplus_msm8998 - Gay Catgirls Forgejo: gay catgirls having sex

evie/android_kernel_oneplus_msm8998

Author	SHA1	Message	Date
Tejun Heo	839a8e8660	writeback: replace custom worker pool implementation with unbound workqueue Writeback implements its own worker pool - each bdi can be associated with a worker thread which is created and destroyed dynamically. The worker thread for the default bdi is always present and serves as the "forker" thread which forks off worker threads for other bdis. there's no reason for writeback to implement its own worker pool when using unbound workqueue instead is much simpler and more efficient. This patch replaces custom worker pool implementation in writeback with an unbound workqueue. The conversion isn't too complicated but the followings are worth mentioning. * bdi_writeback->last_active, task and wakeup_timer are removed. delayed_work ->dwork is added instead. Explicit timer handling is no longer necessary. Everything works by either queueing / modding / flushing / canceling the delayed_work item. * bdi_writeback_thread() becomes bdi_writeback_workfn() which runs off bdi_writeback->dwork. On each execution, it processes bdi->work_list and reschedules itself if there are more things to do. The function also handles low-mem condition, which used to be handled by the forker thread. If the function is running off a rescuer thread, it only writes out limited number of pages so that the rescuer can serve other bdis too. This preserves the flusher creation failure behavior of the forker thread. * INIT_LIST_HEAD(&bdi->bdi_list) is used to tell bdi_writeback_workfn() about on-going bdi unregistration so that it always drains work_list even if it's running off the rescuer. Note that the original code was broken in this regard. Under memory pressure, a bdi could finish unregistration with non-empty work_list. * The default bdi is no longer special. It now is treated the same as any other bdi and bdi_cap_flush_forker() is removed. * BDI_pending is no longer used. Removed. * Some tracepoints become non-applicable. The following TPs are removed - writeback_nothread, writeback_wake_thread, writeback_wake_forker_thread, writeback_thread_start, writeback_thread_stop. Everything, including devices coming and going away and rescuer operation under simulated memory pressure, seems to work fine in my test setup. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Jeff Moyer <jmoyer@redhat.com>	2013-04-01 19:08:06 -07:00
Jeff Layton	094f7b69ea	selinux: make security_sb_clone_mnt_opts return an error on context mismatch I had the following problem reported a while back. If you mount the same filesystem twice using NFSv4 with different contexts, then the second context= option is ignored. For instance: # mount server:/export /mnt/test1 # mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0 # ls -dZ /mnt/test1 drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test1 # ls -dZ /mnt/test2 drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test2 When we call into SELinux to set the context of a "cloned" superblock, it will currently just bail out when it notices that we're reusing an existing superblock. Since the existing superblock is already set up and presumably in use, we can't go overwriting its context with the one from the "original" sb. Because of this, the second context= option in this case cannot take effect. This patch fixes this by turning security_sb_clone_mnt_opts into an int return operation. When it finds that the "new" superblock that it has been handed is already set up, it checks to see whether the contexts on the old superblock match it. If it does, then it will just return success, otherwise it'll return -EBUSY and emit a printk to tell the admin why the second mount failed. Note that this patch may cause casualties. The NFSv4 code relies on being able to walk down to an export from the pseudoroot. If you mount filesystems that are nested within one another with different contexts, then this patch will make those mounts fail in new and "exciting" ways. For instance, suppose that /export is a separate filesystem on the server: # mount server:/ /mnt/test1 # mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0 mount.nfs: an incorrect mount option was specified ...with the printk in the ring buffer. Because we might eventually walk down to /mnt/test1/export, the mount is denied due to this patch. The second mount needs the pseudoroot superblock, but that's already present with the wrong context. OTOH, if we mount these in the reverse order, then both mounts work, because the pseudoroot superblock created when mounting /export is discarded once that mount is done. If we then however try to walk into that directory, the automount fails for the similar reasons: # cd /mnt/test1/scratch/ -bash: cd: /mnt/test1/scratch: Device or resource busy The story I've gotten from the SELinux folks that I've talked to is that this is desirable behavior. In SELinux-land, mounting the same data under different contexts is wrong -- there can be only one. Cc: Steve Dickson <steved@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-04-02 11:30:13 +11:00
Anatol Pomozov	c1681bf8a7	loop: prevent bdev freeing while device in use struct block_device lifecycle is defined by its inode (see fs/block_dev.c) - block_device allocated first time we access /dev/loopXX and deallocated on bdev_destroy_inode. When we create the device "losetup /dev/loopXX afile" we want that block_device stay alive until we destroy the loop device with "losetup -d". But because we do not hold /dev/loopXX inode its counter goes 0, and inode/bdev can be destroyed at any moment. Usually it happens at memory pressure or when user drops inode cache (like in the test below). When later in loop_clr_fd() we want to use bdev we have use-after-free error with following stack: BUG: unable to handle kernel NULL pointer dereference at 0000000000000280 bd_set_size+0x10/0xa0 loop_clr_fd+0x1f8/0x420 [loop] lo_ioctl+0x200/0x7e0 [loop] lo_compat_ioctl+0x47/0xe0 [loop] compat_blkdev_ioctl+0x341/0x1290 do_filp_open+0x42/0xa0 compat_sys_ioctl+0xc1/0xf20 do_sys_open+0x16e/0x1d0 sysenter_dispatch+0x7/0x1a To prevent use-after-free we need to grab the device in loop_set_fd() and put it later in loop_clr_fd(). The issue is reprodusible on current Linus head and v3.3. Here is the test: dd if=/dev/zero of=loop.file bs=1M count=1 while [ true ]; do losetup /dev/loop0 loop.file echo 2 > /proc/sys/vm/drop_caches losetup -d /dev/loop0 done [ Doing bdgrab/bput in loop_set_fd/loop_clr_fd is safe, because every time we call loop_set_fd() we check that loop_device->lo_state is Lo_unbound and set it to Lo_bound If somebody will try to set_fd again it will get EBUSY. And if we try to loop_clr_fd() on unbound loop device we'll get ENXIO. loop_set_fd/loop_clr_fd (and any other loop ioctl) is called under loop_device->lo_ctl_mutex. ] Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-01 15:48:47 -07:00
Greg Kroah-Hartman	0f8b1a0204	Merge v3.9-rc5 into driver-core-next We want the fixes in here. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-04-01 11:05:59 -07:00
Alexandru Gheorghiu	79b5793be4	f2fs: use kmemdup Use kmemdup instead of kzalloc and memcpy. Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com> Acked-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-31 09:12:18 +09:00
Chuck Lever	4edaa30888	NFS: Use "krb5i" to establish NFSv4 state whenever possible Currently our client uses AUTH_UNIX for state management on Kerberos NFS mounts in some cases. For example, if the first mount of a server specifies "sec=sys," the SETCLIENTID operation is performed with AUTH_UNIX. Subsequent mounts using stronger security flavors can not change the flavor used for lease establishment. This might be less security than an administrator was expecting. Dave Noveck's migration issues draft recommends the use of an integrity-protecting security flavor for the SETCLIENTID operation. Let's ignore the mount's sec= setting and use krb5i as the default security flavor for SETCLIENTID. If our client can't establish a GSS context (eg. because it doesn't have a keytab or the server doesn't support Kerberos) we fall back to using AUTH_NULL. For an operation that requires a machine credential (which never represents a particular user) AUTH_NULL is as secure as AUTH_UNIX. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:45:22 -04:00
Chuck Lever	c4eafe1135	NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC Most NFSv4 servers implement AUTH_UNIX, and administrators will prefer this over AUTH_NULL. It is harmless for our client to try this flavor in addition to the flavors mandated by RFC 3530/5661. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:45:09 -04:00
Chuck Lever	9a744ba398	NFS: Use static list of security flavors during root FH lookup recovery If the Linux NFS client receives an NFS4ERR_WRONGSEC error while trying to look up an NFS server's root file handle, it retries the lookup operation with various security flavors to see what flavor the NFS server will accept for pseudo-fs access. The list of flavors the client uses during retry consists only of flavors that are currently registered in the kernel RPC client. This list may not include any GSS pseudoflavors if auth_rpcgss.ko has not yet been loaded. Let's instead use a static list of security flavors that the NFS standard requires the server to implement (RFC 3530bis, section 3.2.1). The RPC client should now be able to load support for these dynamically; if not, they are skipped. Recovery behavior here is prescribed by RFC 3530bis, section 15.33.5: > For LOOKUPP, PUTROOTFH and PUTPUBFH, the client will be unable to > use the SECINFO operation since SECINFO requires a current > filehandle and none exist for these two [sic] operations. Therefore, > the client must iterate through the security triples available at > the client and reattempt the PUTROOTFH or PUTPUBFH operation. In > the unfortunate event none of the MANDATORY security triples are > supported by the client and server, the client SHOULD try using > others that support integrity. Failing that, the client can try > using AUTH_NONE, but because such forms lack integrity checks, > this puts the client at risk. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:44:58 -04:00
Chuck Lever	83ca7f5ab3	NFS: Avoid PUTROOTFH when managing leases Currently, the compound operation the Linux NFS client sends to the server to confirm a client ID looks like this: { SETCLIENTID_CONFIRM; PUTROOTFH; GETATTR(lease_time) } Once the lease is confirmed, it makes sense to know how long before the client will have to renew it. And, performing these operations in the same compound saves a round trip. Unfortunately, this arrangement assumes that the security flavor used for establishing a client ID can also be used to access the server's pseudo-fs. If the server requires a different security flavor to access its pseudo-fs than it allowed for the client's SETCLIENTID operation, the PUTROOTFH in this compound fails with NFS4ERR_WRONGSEC. Even though the SETCLIENTID_CONFIRM succeeded, our client's trunking detection logic interprets the failure of the compound as a failure by the server to confirm the client ID. As part of server trunking detection, the client then begins another SETCLIENTID pass with the same nfs4_client_id. This fails with NFS4ERR_CLID_INUSE because the first SETCLIENTID/SETCLIENTID_CONFIRM already succeeded in confirming that client ID -- it was the PUTROOTFH operation that caused the SETCLIENTID_CONFIRM compound to fail. To address this issue, separate the "establish client ID" step from the "accessing the server's pseudo-fs root" step. The first access of the server's pseudo-fs may require retrying the PUTROOTFH operation with different security flavors. This access is done in nfs4_proc_get_rootfh(). That leaves the matter of how to retrieve the server's lease time. nfs4_proc_fsinfo() already retrieves the lease time value, though none of its callers do anything with the retrieved value (nor do they mark the lease as "renewed"). Note that NFSv4.1 state recovery invokes nfs4_proc_get_lease_time() using the lease management security flavor. This may cause some heartburn if that security flavor isn't the same as the security flavor the server requires for accessing the pseudo-fs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:44:49 -04:00
Chuck Lever	2ed4b95b7e	NFS: Clean up nfs4_proc_get_rootfh The long lines with no vertical white space make this function difficult for humans to read. Add a proper documenting comment while we're here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:44:12 -04:00
Chuck Lever	75bc8821bd	NFS: Handle missing rpc.gssd when looking up root FH When rpc.gssd is not running, any NFS operation that needs to use a GSS security flavor of course does not work. If looking up a server's root file handle results in an NFS4ERR_WRONGSEC, nfs4_find_root_sec() is called to try a bunch of security flavors until one works or all reasonable flavors have been tried. When rpc.gssd isn't running, this loop seems to fail immediately after rpcauth_create() craps out on the first GSS flavor. When the rpcauth_create() call in nfs4_lookup_root_sec() fails because rpc.gssd is not available, nfs4_lookup_root_sec() unconditionally returns -EIO. This prevents nfs4_find_root_sec() from retrying any other flavors; it drops out of its loop and fails immediately. Having nfs4_lookup_root_sec() return -EACCES instead allows nfs4_find_root_sec() to try all flavors in its list. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:43:55 -04:00
Chuck Lever	a77c806fb9	SUNRPC: Refactor nfsd4_do_encode_secinfo() Clean up. This matches a similar API for the client side, and keeps ULP fingers out the of the GSS mech switch. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:43:33 -04:00
Chuck Lever	9568c5e9a6	SUNRPC: Introduce rpcauth_get_pseudoflavor() A SECINFO reply may contain flavors whose kernel module is not yet loaded by the client's kernel. A new RPC client API, called rpcauth_get_pseudoflavor(), is introduced to do proper checking for support of a security flavor. When this API is invoked, the RPC client now tries to load the module for each flavor first before performing the "is this supported?" check. This means if a module is available on the client, but has not been loaded yet, it will be loaded and registered automatically when the SECINFO reply is processed. The new API can take a full GSS tuple (OID, QoP, and service). Previously only the OID and service were considered. nfs_find_best_sec() is updated to verify all flavors requested in a SECINFO reply, including AUTH_NULL and AUTH_UNIX. Previously these two flavors were simply assumed to be supported without consulting the RPC client. Note that the replaced version of nfs_find_best_sec() can return RPC_AUTH_MAXFLAVOR if the server returns a recognized OID but an unsupported "service" value. nfs_find_best_sec() now returns RPC_AUTH_UNIX in this case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:43:07 -04:00
Chuck Lever	fb15b26f8b	SUNRPC: Define rpcsec_gss_info structure The NFSv4 SECINFO procedure returns a list of security flavors. Any GSS flavor also has a GSS tuple containing an OID, a quality-of- protection value, and a service value, which specifies a particular GSS pseudoflavor. For simplicity and efficiency, I'd like to return each GSS tuple from the NFSv4 SECINFO XDR decoder and pass it straight into the RPC client. Define a data structure that is visible to both the NFS client and the RPC client. Take structure and field names from the relevant standards to avoid confusion. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:42:56 -04:00
Linus Torvalds	3615db41c4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "We've had a busy two weeks of bug fixing. The biggest patches in here are some long standing early-enospc problems (Josef) and a very old race where compression and mmap combine forces to lose writes (me). I'm fairly sure the mmap bug goes all the way back to the introduction of the compression code, which is proof that fsx doesn't trigger every possible mmap corner after all. I'm sure you'll notice one of these is from this morning, it's a small and isolated use-after-free fix in our scrub error reporting. I double checked it here." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: don't drop path when printing out tree errors in scrub Btrfs: fix wrong return value of btrfs_lookup_csum() Btrfs: fix wrong reservation of csums Btrfs: fix double free in the btrfs_qgroup_account_ref() Btrfs: limit the global reserve to 512mb Btrfs: hold the ordered operations mutex when waiting on ordered extents Btrfs: fix space accounting for unlink and rename Btrfs: fix space leak when we fail to reserve metadata space Btrfs: fix EIO from btrfs send in is_extent_unchanged for punched holes Btrfs: fix race between mmap writes and compression Btrfs: fix memory leak in btrfs_create_tree() Btrfs: fix locking on ROOT_REPLACE operations in tree mod log Btrfs: fix missing qgroup reservation before fallocating Btrfs: handle a bogus chunk tree nicely Btrfs: update to use fs_state bit	2013-03-29 11:13:25 -07:00
Jan Kara	35e5cbc0af	reiserfs: Fix warning and inode leak when deleting inode with xattrs After commit `21d8a15a` (lookup_one_len: don't accept . and ..) reiserfs started failing to delete xattrs from inode. This was due to a buggy test for '.' and '..' in fill_with_dentries() which resulted in passing '.' and '..' entries to lookup_one_len() in some cases. That returned error and so we failed to iterate over all xattrs of and inode. Fix the test in fill_with_dentries() along the lines of the one in lookup_one_len(). Reported-by: Pawel Zawora <pzawora@gmail.com> CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>	2013-03-29 17:08:43 +01:00
Josef Bacik	d8fe29e9de	Btrfs: don't drop path when printing out tree errors in scrub A user reported a panic where we were panicing somewhere in tree_backref_for_extent from scrub_print_warning. He only captured the trace but looking at scrub_print_warning we drop the path right before we mess with the extent buffer to print out a bunch of stuff, which isn't right. So fix this by dropping the path after we use the eb if we need to. Thanks, Cc: stable@vger.kernel.org Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2013-03-29 10:18:59 -04:00
Linus Torvalds	97f084b8e6	sysfs fixes for 3.9-rc4 Here are two fixes for sysfs that resolve issues that have been found by the Trinity fuzz tool, causing oopses in sysfs. They both have been in linux-next for a while to ensure that they do not cause any other problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlFUdHUACgkQMUfUDdst+ykk+ACfWz6U/DW97ibFusDj+Sys1pEt essAn15ZFy/pT5myhCvxqVH0MHrIftup =BM+Q -----END PGP SIGNATURE----- Merge tag 'driver-core-3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull sysfs fixes from Greg Kroah-Hartman: "Here are two fixes for sysfs that resolve issues that have been found by the Trinity fuzz tool, causing oopses in sysfs. They both have been in linux-next for a while to ensure that they do not cause any other problems." * tag 'driver-core-3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: sysfs: handle failure path correctly for readdir() sysfs: fix race between readdir and lseek	2013-03-28 15:52:14 -07:00
Linus Torvalds	2c3de1c2d7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns fixes from Eric W Biederman: "The bulk of the changes are fixing the worst consequences of the user namespace design oversight in not considering what happens when one namespace starts off as a clone of another namespace, as happens with the mount namespace. The rest of the changes are just plain bug fixes. Many thanks to Andy Lutomirski for pointing out many of these issues." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Restrict when proc and sysfs can be mounted ipc: Restrict mounting the mqueue filesystem vfs: Carefully propogate mounts across user namespaces vfs: Add a mount flag to lock read only bind mounts userns: Don't allow creation if the user is chrooted yama: Better permission check for ptraceme pid: Handle the exit of a multi-threaded init. scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.	2013-03-28 13:43:46 -07:00
Trond Myklebust	809b426c7f	NFSv4: Fix Oopses in the fs_locations code If the server sends us a pathname with more components than the client limit of NFS4_PATHNAME_MAXCOMPONENTS, more server entries than the client limit of NFS4_FS_LOCATION_MAXSERVERS, or sends a total number of fs_locations entries than the client limit of NFS4_FS_LOCATIONS_MAXENTRIES then we will currently Oops because the limit checks are done _after_ we've decoded the data into the arrays. Reported-by: fanchaoting<fanchaoting@cn.fujitsu.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-28 16:22:17 -04:00
Trond Myklebust	91876b13b8	NFSv4: Fix another reboot recovery race If the open_context for the file is not yet fully initialised, then open recovery cannot succeed, and since nfs4_state_find_open_context returns an ENOENT, we end up treating the file as being irrecoverable. What we really want to do, is just defer the recovery until later. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-28 16:22:16 -04:00
Miao Xie	82d130ff39	Btrfs: fix wrong return value of btrfs_lookup_csum() If we don't find the expected csum item, but find a csum item which is adjacent to the specified extent, we should return -EFBIG, or we should return -ENOENT. But btrfs_lookup_csum() return -EFBIG even the csum item is not adjacent to the specified extent. Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:31 -04:00
Miao Xie	39847c4d3d	Btrfs: fix wrong reservation of csums We reserve the space for csums only when we write data into a file, in the other cases, such as tree log, log replay, we don't do reservation, so we can use the reservation of the transaction handle just for the former. And for the latter, we should use the tree's own reservation. But the function - btrfs_csum_file_blocks() didn't differentiate between these two types of the cases, fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:30 -04:00
Wang Shilong	a7975026ff	Btrfs: fix double free in the btrfs_qgroup_account_ref() The function btrfs_find_all_roots is responsible to allocate memory for 'roots' and free it if errors happen,so the caller should not free it again since the work has been done. Besides,'tmp' is allocated after the function btrfs_find_all_roots, so we can return directly if btrfs_find_all_roots() fails. Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com> Reviewed-by: Miao Xie <miaox@cn.fujitsu.com> Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:29 -04:00
Josef Bacik	fdf30d1c1b	Btrfs: limit the global reserve to 512mb A user reported a problem where he was getting early ENOSPC with hundreds of gigs of free data space and 6 gigs of free metadata space. This is because the global block reserve was taking up the entire free metadata space. This is ridiculous, we have infrastructure in place to throttle if we start using too much of the global reserve, so instead of letting it get this huge just limit it to 512mb so that users can still get work done. This allowed the user to complete his rsync without issues. Thanks Cc: stable@vger.kernel.org Reported-and-tested-by: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:29 -04:00
Josef Bacik	db1d607d3c	Btrfs: hold the ordered operations mutex when waiting on ordered extents We need to hold the ordered_operations mutex while waiting on ordered extents since we splice and run the ordered extents list. We need to make sure anybody else who wants to wait on ordered extents does actually wait for them to be completed. This will keep us from bailing out of flushing in case somebody is already waiting on ordered extents to complete. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:28 -04:00
Josef Bacik	6e137ed3f3	Btrfs: fix space accounting for unlink and rename We are way over-reserving for unlink and rename. Rename is just some random huge number and unlink accounts for tree log operations that don't actually happen during unlink, not to mention the tree log doesn't take from the trans block rsv anyway so it's completely useless. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:27 -04:00
Josef Bacik	f4881bc7a8	Btrfs: fix space leak when we fail to reserve metadata space Dave reported a warning when running xfstest 275. We have been leaking delalloc metadata space when our reservations fail. This is because we were improperly calculating how much space to free for our checksum reservations. The problem is we would sometimes free up space that had already been freed in another thread and we would end up with negative usage for the delalloc space. This patch fixes the problem by calculating how much space the other threads would have already freed, and then calculate how much space we need to free had we not done the reservation at all, and then freeing any excess space. This makes xfstests 275 no longer have leaked space. Thanks Cc: stable@vger.kernel.org Reported-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:26 -04:00
Jan Schmidt	adaa4b8e4d	Btrfs: fix EIO from btrfs send in is_extent_unchanged for punched holes When you take a snapshot, punch a hole where there has been data, then take another snapshot and try to send an incremental stream, btrfs send would give you EIO. That is because is_extent_unchanged had no support for holes being punched. With this patch, instead of returning EIO we just return 0 (== the extent is not unchanged) and we're good. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Cc: Alexander Block <ablock84@gmail.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>	2013-03-28 09:51:26 -04:00
Trond Myklebust	6e3cf24152	NFSv4: Add a mapping for NFS4ERR_FILE_OPEN in nfs4_map_errors With unlink is an asynchronous operation in the sillyrename case, it expects nfs4_async_handle_error() to map the error correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-27 12:44:40 -04:00
Jan Kara	e678a4f0f5	jbd: don't wait (forever) for stale tid caused by wraparound In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit `d9b0193` "jbd: fix fsync() tid wraparound bug" attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd_log_start_commit(). Signed-off-by: Jan Kara <jack@suse.cz>	2013-03-27 17:30:59 +01:00
Al Viro	3e84f48edf	vfs/splice: Fix missed checks in new __kernel_write() helper Commit `06ae43f34b` ("Don't bother with redoing rw_verify_area() from default_file_splice_from()") lost the checks to test existence of the write/aio_write methods. My apologies ;-/ Eventually, we want that in fs/splice.c side of things (no point repeating it for every buffer, after all), but for now this is the obvious minimal fix. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-27 09:24:02 -07:00
Eric W. Biederman	87a8ebd637	userns: Restrict when proc and sysfs can be mounted Only allow unprivileged mounts of proc and sysfs if they are already mounted when the user namespace is created. proc and sysfs are interesting because they have content that is per namespace, and so fresh mounts are needed when new namespaces are created while at the same time proc and sysfs have content that is shared between every instance. Respect the policy of who may see the shared content of proc and sysfs by only allowing new mounts if there was an existing mount at the time the user namespace was created. In practice there are only two interesting cases: proc and sysfs are mounted at their usual places, proc and sysfs are not mounted at all (some form of mount namespace jail). Cc: stable@vger.kernel.org Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-27 07:50:08 -07:00
Eric W. Biederman	132c94e31b	vfs: Carefully propogate mounts across user namespaces As a matter of policy MNT_READONLY should not be changable if the original mounter had more privileges than creator of the mount namespace. Add the flag CL_UNPRIVILEGED to note when we are copying a mount from a mount namespace that requires more privileges to a mount namespace that requires fewer privileges. When the CL_UNPRIVILEGED flag is set cause clone_mnt to set MNT_NO_REMOUNT if any of the mnt flags that should never be changed are set. This protects both mount propagation and the initial creation of a less privileged mount namespace. Cc: stable@vger.kernel.org Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-27 07:50:05 -07:00
Eric W. Biederman	90563b198e	vfs: Add a mount flag to lock read only bind mounts When a read-only bind mount is copied from mount namespace in a higher privileged user namespace to a mount namespace in a lesser privileged user namespace, it should not be possible to remove the the read-only restriction. Add a MNT_LOCK_READONLY mount flag to indicate that a mount must remain read-only. CC: stable@vger.kernel.org Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-27 07:50:04 -07:00
Eric W. Biederman	3151527ee0	userns: Don't allow creation if the user is chrooted Guarantee that the policy of which files may be access that is established by setting the root directory will not be violated by user namespaces by verifying that the root directory points to the root of the mount namespace at the time of user namespace creation. Changing the root is a privileged operation, and as a matter of policy it serves to limit unprivileged processes to files below the current root directory. For reasons of simplicity and comprehensibility the privilege to change the root directory is gated solely on the CAP_SYS_CHROOT capability in the user namespace. Therefore when creating a user namespace we must ensure that the policy of which files may be access can not be violated by changing the root directory. Anyone who runs a processes in a chroot and would like to use user namespace can setup the same view of filesystems with a mount namespace instead. With this result that this is not a practical limitation for using user namespaces. Cc: stable@vger.kernel.org Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-27 07:49:29 -07:00
Linus Torvalds	de55eb1d60	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "stable fodder; assorted deadlock fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vt: synchronize_rcu() under spinlock is not nice... Nest rename_lock inside vfsmount_lock Don't bother with redoing rw_verify_area() from default_file_splice_from()	2013-03-26 17:42:55 -07:00
Jaegeuk Kim	953a3e27e1	f2fs: fix to give correct parent inode number for roll forward When we recover fsync'ed data after power-off-recovery, we should guarantee that any parent inode number should be correct for each direct inode blocks. So, let's make the following rules. - The fsync should do checkpoint to all the inodes that were experienced hard links. - So, the only normal files can be recovered by roll-forward. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-27 09:16:25 +09:00
Jaegeuk Kim	fa37241743	f2fs: remain nat cache entries for further free nid allocation In the checkpoint flow, the f2fs investigates the total nat cache entries. Previously, if an entry has NULL_ADDR, f2fs drops the entry and adds the obsolete nid to the free nid list. However, this free nid will be reused sooner, resulting in its nat entry miss. In order to avoid this, we don't need to drop the nat cache entry at this moment. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-27 09:16:18 +09:00
Jaegeuk Kim	0ff153a2f1	f2fs: do not skip writing file meta during fsync This patch removes data_version check flow during the fsync call. The original purpose for the use of data_version was to avoid writng inode pages redundantly by the fsync calls repeatedly. However, when user can modify file meta and then call fsync, we should not skip fsync procedure. So, let's remove this condition check and hope that user triggers in right manner. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-27 09:16:16 +09:00
Jaegeuk Kim	6ead114232	f2fs: fix the recovery flow to handle errors correctly We should handle errors during the recovery flow correctly. For example, if we get -ENOMEM, we should report a mount failure instead of conducting the remained mount procedure. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-27 09:16:06 +09:00
Al Viro	7ea600b531	Nest rename_lock inside vfsmount_lock ... lest we get livelocks between path_is_under() and d_path() and friends. The thing is, wrt fairness lglocks are more similar to rwsems than to rwlocks; it is possible to have thread B spin on attempt to take lock shared while thread A is already holding it shared, if B is on lower-numbered CPU than A and there's a thread C spinning on attempt to take the same lock exclusive. As the result, we need consistent ordering between vfsmount_lock (lglock) and rename_lock (seq_lock), even though everything that takes both is going to take vfsmount_lock only shared. Spotted-by: Brad Spengler <spender@grsecurity.net> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-03-26 18:25:57 -04:00
Linus Torvalds	5d538483ea	NFS client bugfixes for Linux 3.9 - Fix an NFSv4 idmapper regression - Fix an Oops in the pNFS blocks client - Fix up various issues with pNFS layoutcommit - Ensure correct read ordering of variables in rpc_wake_up_task_queue_locked -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJRUedyAAoJEGcL54qWCgDyar0P/2pTT/yxX8ejTu5DmY7e4PYJ jhPG2AEqY/yMLn9GvB375VIs1L8tuY50+3NFhWZFjyNbEU3GV+5Y+kPpBtAgYiSI VyIXiJ/xMtXdYJMYuE/nh5jbcqJsHwGjpcIaSd5BuWzQUaoUYvLulxWd4QN8mmaT 5SuzmgV+7WIqV6RjlaYF82srcOKAjwemcrfRkCNzzJr6aT39gH2YdYFbDaTr7qhU fw0x3QlI7887vSNQcfaGbC1+jr6oe8wRCneOR0tceU/8bcj6zlUDk5HxqSOc28mA jUQieoVRggcM4s5DFpNcuwW6qCPZOmzv/OFD6oqnhyyonPOrue+7zaoujZmGNmjx dT2V/jQehanYD25WpDO8OyFXUeYE4x9bgHKsszhBTwr4x5D8ceEJ1sugcOPiTTxu tflbbuWbt+BguvXp4p8QayUj0V2cplM/nOovWyUG+BH46sz3Dtv46NOgJeO2a29g T6jayxmKCxvtPKtG0j34BzLngiKabZTSEhFms6Qarp9lwWvHWrR9KWGuDBNvy1Ts GMBN8P6Ib40yVi6Pwlj5Jpy6yLKVklHtJQpactr63AZmYrF4bBBSom+MWAh3X1iO QtF0x9Z1bBkXY2Q/u+3vWMxQtEPeW+pSiloj8aiceFAt33zKM+1bLofDhEw0s2fI wJEHYsGyGtDQINgP0v1e =OPbZ -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Fix an NFSv4 idmapper regression - Fix an Oops in the pNFS blocks client - Fix up various issues with pNFS layoutcommit - Ensure correct read ordering of variables in rpc_wake_up_task_queue_locked * tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked NFSv4.1: Add a helper pnfs_commit_and_return_layout NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn NFSv4.1: Fix a race in pNFS layoutcommit pnfs-block: removing DM device maybe cause oops when call dev_remove NFSv4: Fix the string length returned by the idmapper	2013-03-26 14:23:45 -07:00
J. Bruce Fields	64a817cfbd	nfsd4: reject "negative" acl lengths Since we only enforce an upper bound, not a lower bound, a "negative" length can get through here. The symptom seen was a warning when we attempt to a kmalloc with an excessive size. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-26 16:18:27 -04:00
Chris Mason	4adaa61102	Btrfs: fix race between mmap writes and compression Btrfs uses page_mkwrite to ensure stable pages during crc calculations and mmap workloads. We call clear_page_dirty_for_io before we do any crcs, and this forces any application with the file mapped to wait for the crc to finish before it is allowed to change the file. With compression on, the clear_page_dirty_for_io step is happening after we've compressed the pages. This means the applications might be changing the pages while we are compressing them, and some of those modifications might not hit the disk. This commit adds the clear_page_dirty_for_io before compression starts and makes sure to redirty the page if we have to fallback to uncompressed IO as well. Signed-off-by: Chris Mason <chris.mason@fusionio.com> Reported-by: Alexandre Oliva <oliva@gnu.org> cc: stable@vger.kernel.org	2013-03-26 13:19:14 -04:00
Maarten Lankhorst	3db3c62584	sysfs: use atomic_inc_unless_negative in sysfs_get_active It seems that sysfs has an interesting way of doing the same thing. This removes the cpu_relax unfortunately, but if it's really needed, it would be better to add this to include/linux/atomic.h to benefit all atomic ops users. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-03-25 10:42:36 -07:00
Linus Torvalds	844fdd9ac1	Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from J Bruce Fields: "Fixes for a couple mistakes in the new DRC code. And thanks to Kent Overstreet for noticing we've been sync'ing the wrong range on stable writes since 3.8." * 'for-3.9' of git://linux-nfs.org/~bfields/linux: nfsd: fix bad offset use nfsd: fix startup order in nfsd_reply_cache_init nfsd: only unhash DRC entries that are in the hashtable	2013-03-25 09:25:12 -07:00
Trond Myklebust	ccb46e2063	NFSv4.1: Use CLAIM_DELEG_CUR_FH opens when available Now that we do CLAIM_FH opens, we may run into situations where we get a delegation but don't have perfect knowledge of the file path. When returning the delegation, we might therefore not be able to us CLAIM_DELEGATE_CUR opens to convert the delegation into OPEN stateids and locks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-25 12:04:11 -04:00
Trond Myklebust	49f9a0fafd	NFSv4.1: Enable open-by-filehandle Sometimes, we actually _want_ to do open-by-filehandle, for instance when recovering opens after a network partition, or when called from nfs4_file_open. Enable that functionality using a new capability NFS_CAP_ATOMIC_OPEN_V1, and which is only enabled for NFSv4.1 servers that support it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-25 12:04:11 -04:00
Trond Myklebust	d9fc6619ca	NFSv4.1: Add xdr support for CLAIM_FH and CLAIM_DELEG_CUR_FH opens Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-25 12:04:11 -04:00

... 12 13 14 15 16 ...