Commit graph

Jaegeuk Kim
a1dd3c13ce f2fs: fix to recover i_size from roll-forward
If a user issues many data writes together with fsync, the last updated i_size
should be stored to the inode block consistently.

But the previous write_end just marks the inode as dirty and doesn't update
the metadata in its inode block.
After that, fsync just writes the inode block with the newly updated data
index, excluding the inode metadata updates.

So, this patch introduces a write_end that also updates the inode block when
i_size has changed.
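
A minimal userspace model of the idea (hypothetical names and structures, not
the actual f2fs code): write_end pushes the updated size into the on-disk
inode block rather than only dirtying the in-core inode, so a later fsync
cannot write a stale size.

    #include <stdio.h>

    /* Toy stand-ins for the in-core inode and its on-disk inode block. */
    struct inode { long i_size; int dirty; };
    struct inode_block { long i_size; };

    /* Old behavior: only mark dirty.  New behavior: if i_size grew, copy
     * it into the inode block too, so fsync's later write of that block
     * carries the correct size. */
    static void write_end(struct inode *inode, struct inode_block *blk,
                          long pos, long copied)
    {
            if (pos + copied > inode->i_size) {
                    inode->i_size = pos + copied;
                    blk->i_size = inode->i_size;   /* the fix */
            }
            inode->dirty = 1;
    }

    int main(void)
    {
            struct inode in = { 0, 0 };
            struct inode_block blk = { 0 };

            write_end(&in, &blk, 0, 4096);
            printf("on-disk i_size after write_end: %ld\n", blk.i_size);
            return 0;
    }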

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:16 +09:00
Gu Zheng
5ebefc5b40 f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes()
destroy_fsync_dnodes() is a simple list-cleanup function, so delete its unused
and unrelated f2fs_sb_info argument.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:15 +09:00
Jaegeuk Kim
763bfe1bc5 f2fs: remove reusing any prefree segments
This patch removes check_prefree_segments, which was initially designed to
enhance performance by narrowing the range of LBA usage across the whole block
device.

When allocating a new segment, f2fs previously tried to find a suitable
prefree segment and, if it found one, reused that segment for further data or
node block allocation.

However, I found that this was a totally wrong approach, since prefree
segments contain data or node blocks that will be used by the roll-forward
mechanism that runs after a sudden power-off.

Let's assume the following scenario.

/* write 8MB with fsync */
for (i = 0; i < 2048; i++) {
	offset = i * 4096;
	write(fd, offset, 4KB);
	fsync(fd);
}

In this case, the naive segment allocation sequence will be:
 data segment: x, x+1, x+2, x+3
 node segment: y, y+1, y+2, y+3.

But if we reuse prefree segments, the sequence can become:
 data segment: x, x+1, y, y+1
 node segment: y, y+1, y+2, y+3.
This is because y, y+1, and y+2 become prefree segments one by one and are
then reused for data allocation.

After running this workload, we should consider how to recover the latest
inode with its data.
If we reuse prefree segments such as y or y+1, we lose the old node blocks,
so f2fs cannot even start roll-forward recovery.

Therefore, I suggest that we stop reusing prefree segments.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:15 +09:00
Gu Zheng
6cc4af5606 f2fs: code cleanup and simplify in func {find/add}_gc_inode
This patch simplifies list operations in find_gc_inode and add_gc_inode.
Just simple code cleanup.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: add description]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:14 +09:00
Namjae Jeon
8736fbf003 f2fs: optimize the init_dirty_segmap function
Optimize the while-loop condition.

Since the condition is always true, the loop is actually terminated by the
following check in the code:

if (segno >= TOTAL_SEGS(sbi))
    break;

Hence we can replace the while-loop condition with while (1) instead of
checking on every iteration that segno is less than the total segment count.

Also, we do not need to call TOTAL_SEGS() every time; we can store its value
in a local variable, since it is constant.
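
A userspace sketch of the refactor (hypothetical stand-ins for the f2fs
types): the bound is read once into a local, and the loop relies on its
existing break condition.

    #include <stdio.h>

    #define TOTAL_SEGS 8

    /* Stand-in for find_next_bit(): returns the bound when nothing is found. */
    static int find_next_dirty(const int *dirty, int total, int from)
    {
            for (int i = from; i < total; i++)
                    if (dirty[i])
                            return i;
            return total;
    }

    int main(void)
    {
            int dirty[TOTAL_SEGS] = { 0, 1, 0, 1, 1, 0, 0, 1 };
            int total = TOTAL_SEGS; /* constant: read it once */
            int segno = 0;

            while (1) {             /* old condition was always true anyway */
                    segno = find_next_dirty(dirty, total, segno);
                    if (segno >= total)
                            break;  /* the loop's real exit */
                    printf("dirty segment %d\n", segno);
                    segno++;
            }
            return 0;
    }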

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:14 +09:00
Jaegeuk Kim
060dd67b3c f2fs: fix an endian conversion bug detected by sparse
This patch should fix the following bug reported by kbuild test robot.

fs/f2fs/recovery.c:233:33: sparse: incorrect type in assignment
(different base types)

sparse warnings: (new ones prefixed by >>)

>> recovery.c:233: sparse: incorrect type in assignment (different base types)
   recovery.c:233:    expected unsigned int [unsigned] [assigned] ofs_in_node
   recovery.c:233:    got restricted __le16 [assigned] [usertype] ofs_in_node
>> recovery.c:238: sparse: incorrect type in assignment (different base types)
   recovery.c:238:    expected unsigned int [unsigned] ofs_in_node
   recovery.c:238:    got restricted __le16 [assigned] [usertype] ofs_in_node
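
A userspace sketch of the kind of fix involved (assumed shape; the real code
converts a __le16 on-disk field with le16_to_cpu() before assigning it to a
host-endian unsigned int):

    #include <endian.h>
    #include <stdint.h>
    #include <stdio.h>

    /* On-disk field stored little-endian, as f2fs's __le16 is. */
    struct on_disk { uint16_t ofs_in_node; };

    int main(void)
    {
            struct on_disk d = { .ofs_in_node = htole16(7) };
            unsigned int ofs_in_node;

            /* Wrong (what sparse flagged): ofs_in_node = d.ofs_in_node; */
            ofs_in_node = le16toh(d.ofs_in_node); /* kernel: le16_to_cpu() */
            printf("ofs_in_node = %u\n", ofs_in_node);
            return 0;
    }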

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:48:13 +09:00
Jaegeuk Kim
7e586fa024 f2fs: fix crc endian conversion
While calculating the CRC for the checkpoint block, we use __u32, but when
storing the CRC value to disk, we use __le32.

Let's fix the inconsistency.
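
A userspace sketch of the convention (assumed analog; the kernel uses __le32
and cpu_to_le32()): compute the CRC in a host-endian 32-bit value and convert
to little-endian only at the point it is stored.

    #include <endian.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uint32_t crc = 0xdeadbeef;      /* CRC computed in host-endian __u32 */
            uint32_t crc_le = htole32(crc); /* kernel: cpu_to_le32(crc) */

            /* crc_le is the form written into the checkpoint block */
            printf("computed %08x, stored %08x\n",
                   (unsigned)crc, (unsigned)crc_le);
            return 0;
    }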

Reported-and-Tested-by: Oded Gabbay <ogabbay@advaoptical.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-07-02 08:47:35 +09:00
J. Bruce Fields
d08d32e6e5 nfsd4: return delegation immediately if lease fails
This case shouldn't happen--the administrator shouldn't really allow
other applications access to the export until clients have had the
chance to reclaim their state--but if it does then we should set the
"return this lease immediately" bit on the reply.  That still leaves
some small races, but it's the best the protocol allows us to do in the
case where a lease is ripped out from under us...

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:07 -04:00
J. Bruce Fields
0a262ffb75 nfsd4: do not throw away 4.1 lock state on last unlock
This reverts commit eb2099f31b "nfsd4:
release lockowners on last unlock in 4.1 case".  Trond identified
language in RFC 5661, section 8.2.4, which forbids this behavior:

	Stateids associated with byte-range locks are an exception.
	They remain valid even if a LOCKU frees all remaining locks, so
	long as the open file with which they are associated remains
	open, unless the client frees the stateids via the FREE_STATEID
	operation.

And Bakeathon 2013 testing found that a 4.1 FreeBSD client was getting an
incorrect BAD_STATEID return from a FREE_STATEID in the above situation
and then failing.

The spec language was honestly probably a mistake, but at this point, with
implementations already following it, we're probably stuck with it.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:06 -04:00
J. Bruce Fields
89f6c3362c nfsd4: delegation-based open reclaims should bypass permissions
We saw a v4.0 client's create fail as follows:

	- open create succeeds and gets a read delegation
	- client attempts to set mode on new file, gets DELAY while
	  server recalls delegation.
	- client attempts a CLAIM_DELEGATE_CUR open using the
	  delegation, gets error because of new file mode.

This probably can't happen on a recent kernel since we're no longer
giving out delegations on create opens.  Nevertheless, it's a
bug--reclaim opens should bypass permission checks.

Reported-by: Steve Dickson <steved@redhat.com>
Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:05 -04:00
J. Bruce Fields
590b743143 nfsd4: minor read_buf cleanup
The code to step to the next page seems reasonably self-contained.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:03 -04:00
J. Bruce Fields
247500820e nfsd4: fix decoding of compounds across page boundaries
A FreeBSD NFSv4.0 client was getting rare IO errors expanding a tarball.
A network trace showed the server returning BAD_XDR on the final getattr
of a getattr+write+getattr compound.  The final getattr started on a
page boundary.

I believe the Linux client ignores errors on the post-write getattr, which
is why we haven't seen this before.

Cc: stable@vger.kernel.org
Reported-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:29:40 -04:00
J. Bruce Fields
99c415156c nfsd4: clean up nfs4_open_delegation
The nfs4_open_delegation logic is unnecessarily baroque.

Also stop pretending we support write delegations in several places.

Some day we will support write delegations, but when that happens adding
back in these flag parameters will be the easy part.  For now they're
just confusing.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:07 -04:00
Steve Dickson
9a0590aec3 NFSD: Don't give out read delegations on creates
When an exclusive create is done with the mode bits
set (i.e. open(testfile, O_CREAT | O_EXCL, 0777)), this
causes an OPEN op followed by a SETATTR op.  When a
read delegation is given out on the OPEN, it causes
the SETATTR to be delayed with EAGAIN until the
delegation is recalled.

This patch makes exclusive creates give out
a write delegation (which turns into no delegation),
which allows the SETATTR to succeed seamlessly.

Signed-off-by: Steve Dickson <steved@redhat.com>
[bfields: do this for any CREATE, not just exclusive; comment]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:07 -04:00
J. Bruce Fields
57569a7070 nfsd4: allow client to send no cb_sec flavors
In testing I noticed that some of the pynfs tests forget to send any
cb_sec flavors, and we haven't necessarily errored out in that case
before.

I'll fix pynfs, but I'm also inclined to default to trying AUTH_NONE
here, in case this is something clients actually do.
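
A sketch of the fallback (hypothetical helper; the real nfsd4 decoder
differs): with zero flavors decoded, default to AUTH_NONE instead of failing
the request.

    #include <stdio.h>

    #define RPC_AUTH_NULL 0 /* AUTH_NONE */

    /* If the client sent no cb_sec flavors at all, fall back to
     * AUTH_NONE instead of failing the request. */
    static int pick_cb_flavor(const int *flavors, int nflavors)
    {
            if (nflavors == 0)
                    return RPC_AUTH_NULL;
            return flavors[0]; /* grossly simplified selection */
    }

    int main(void)
    {
            printf("flavor = %d\n", pick_cb_flavor(NULL, 0));
            return 0;
    }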

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:07 -04:00
J. Bruce Fields
b78724b705 nfsd4: fail attempts to request gss on the backchannel
We don't support gss on the backchannel.  We should state that fact up
front rather than just letting things continue and later making the
client try to figure out why the backchannel isn't working.

Trond suggested instead returning NFS4ERR_NOENT.  I think it would be
tricky for the client to distinguish between the case "I don't support
gss on the backchannel" and "I can't find that in my cache, please
create another context and try that instead", and I'd prefer something
that currently doesn't have any other meaning for this operation, hence
the (somewhat arbitrary) NFS4ERR_ENCR_ALG_UNSUPP.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields
57266a6e91 nfsd4: implement minimal SP4_MACH_CRED
Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring
the client-provided spo_must_* arrays and just enforcing credential
checks for the minimum required operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields
0dc1531aca svcrpc: store gss mech in svc_cred
Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
This will make it easier to enforce SP4_MACH_CRED, which needs to
compare the mechanism used on the exchange_id with that used on
protected operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
Eliezer Tamir
91e2fd3378 net: avoid calling sched_clock when LLS is off
Change the Low Latency Sockets code for select and poll so that,
when LLS is disabled, sched_clock() is never called.

Also, avoid sending POLL_LL to sockets if disabled.
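
A sketch of the guard (hypothetical names, not the actual net code): the
clock is only sampled when low-latency polling is enabled.

    #include <stdio.h>

    static int lls_enabled; /* stand-in for the sysctl knob */

    /* Stand-in for the expensive clock read. */
    static unsigned long long sched_clock_stub(void)
    {
            return 123456789ULL;
    }

    /* Only sample the clock when low-latency polling is on. */
    static unsigned long long ll_start_time(void)
    {
            return lls_enabled ? sched_clock_stub() : 0;
    }

    int main(void)
    {
            printf("LLS off: %llu\n", ll_start_time());
            lls_enabled = 1;
            printf("LLS on:  %llu\n", ll_start_time());
            return 0;
    }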

Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 14:06:47 -07:00
Dan Carpenter
6af8652849 ceph: tidy ceph_mdsmap_decode() a little
I introduced a new temporary variable "info" instead of
"m->m_info[mds]".  Also, I reversed the if condition and pulled
everything back one indent level.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-07-01 09:52:02 -07:00
Emil Goode
c213b50b7d ceph: improve error handling in ceph_mdsmap_decode
This patch makes the following improvements to the error handling
in the ceph_mdsmap_decode function:

- Add a NULL check for return value from kcalloc
- Make use of the variable err

Signed-off-by: Emil Goode <emilgoode@gmail.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-07-01 09:52:02 -07:00
Jim Schutt
4d1bf79aff ceph: fix up comment for ceph_count_locks() as to which lock to hold
Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-07-01 09:52:01 -07:00
Josef Bacik
a71754fc68 Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate
This has plagued us forever and I'm so over working around it.  When we
truncate down to a non-page-aligned offset we will call btrfs_truncate_page
to zero out the end of the page and write it back to disk; this keeps us from
exposing stale data if we truncate back up from that point.  The problem with
this is that it requires data space, and people don't really expect to get
ENOSPC from truncate() for this sort of thing.  This also tends to bite the
orphan cleanup code, which keeps people from mounting.  To get around this we
can just move this into btrfs_cont_expand() to make sure that, if we are
truncating up from a non-page-aligned i_size, we zero out the rest of the
page, so that we don't expose stale data.  This will give ENOSPC if you try
to truncate() up or if you try to write past the end of i_size, which is much
more reasonable.  This fixes xfstests generic/083 failing to mount because of
the orphan cleanup failing.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:33 -04:00
Josef Bacik
0b08851fda Btrfs: optimize reada_for_balance
This patch does two things.  First, we no longer explicitly read in the
blocks we're trying to read ahead.  For things like balance_level we may
never actually use the blocks, so this just adds unneeded latency; and
balance_level and split_node will both read in the blocks they care about
explicitly, so if the blocks need to be waited on it will be done there.
Second, we no longer drop the path if we do readahead; we just set the path
blocking before we call reada_for_balance() and then we're good to go.
Hopefully this will cut down on the number of re-searches.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:32 -04:00
Josef Bacik
bdf7c00e8f Btrfs: optimize read_block_for_search
This patch does two things.  First, it makes only one call to
btrfs_buffer_uptodate() with the gen specified, instead of once with 0 and
then again with the gen specified.  Second, it calls btrfs_read_buffer() on
the buffer we've found instead of dropping it and then calling
read_tree_block().  This keeps us from doing yet another radix tree lookup
for a buffer we've already found.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:32 -04:00
Josef Bacik
fdf8e2ea3c Btrfs: unlock extent range on enospc in compressed submit
A user reported a deadlock where the async submit thread was blocked on the
lock_extent() lock, and then everybody behind him was locked on the page lock
for the page he was holding.  Looking at the code I noticed we do not unlock the
extent range when we get ENOSPC and goto retry.  This is bad because we
immediately try to lock that range again to do the cow, which will cause a
deadlock.  Fix this by unlocking the range.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:31 -04:00
Wang Sheng-Hui
90b6d2830a Btrfs: fix the comment typo for btrfs_attach_transaction_barrier
The comment is for btrfs_attach_transaction_barrier, not for
btrfs_attach_transaction. Fix the typo.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Acked-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:30 -04:00
Josef Bacik
aee68ee5f5 Btrfs: fix not being able to find skinny extents during relocate
We unconditionally search for the EXTENT_ITEM_KEY for metadata during
balance, and then check the key that we found to see if it is actually a
METADATA_ITEM_KEY, but this doesn't work right because METADATA is a higher
key value, so if what we are looking for happens to be the first item in the
leaf, the search will dump us out at the previous leaf and we won't find our
item.  So instead do what we do everywhere else: search for the skinny extent
first, and if we don't find it, go back and re-search for the extent item.
This patch fixes the panic I was hitting when balancing a large file system
with skinny extents.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:30 -04:00
Josef Bacik
da61d31a78 Btrfs: cleanup backref search commit root flag stuff
Looking into this backref problem I noticed we're using a macro for what
turns out to be essentially a NULL check to see if we need to search the
commit root.  I'm killing this; let's just do what everybody else does and
check if trans == NULL.  I've also made it so we pass the path into
__resolve_indirect_refs, which will already have the search_commit_root flag
set properly, and that way we can avoid allocating another path when we have
a perfectly good one to use.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:29 -04:00
Josef Bacik
d88d46c6e0 Btrfs: free csums when we're done scrubbing an extent
A user reported scrub taking up an unreasonable amount of ram as it ran.
This is because we look up the csums for the extent we're scrubbing but
don't free them until after we're done with the scrub, which means we can
take up a whole lot of ram.  This patch fixes this by dropping the csums
once we're done with the extent we've scrubbed.  The user reported that this
fixed their problem.  Thanks,

Reported-and-tested-by: Remco Hosman <remco@hosman.xs4all.nl>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:28 -04:00
Josef Bacik
1be41b78bc Btrfs: fix transaction throttling for delayed refs
Dave has this fs_mark script that can make btrfs abort with a sufficient
amount of ram.  This is because with more ram we can keep more dirty metadata
in cache, which in a roundabout way makes for many more pending delayed refs.
What happens is we end up not throttling the transaction enough, so when we
go to commit the transaction, having completely filled the file system, we
abort() because we use all of the space in the global reserve and still have
delayed refs to run.  To fix this we need to make the delayed ref flushing
and the transaction throttling dependent upon the number of delayed refs that
we have, instead of on how much reserved space is left in the global reserve.
With this patch we not only stop aborting transactions, but we also get a
smoother run speed with fs_mark, and it makes us about 10% faster.  Thanks,

Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:28 -04:00
Josef Bacik
501407aab8 Btrfs: stop waiting on current trans if we aborted
I hit a hang when run_delayed_refs returned an error in the beginning of
btrfs_commit_transaction.  If we decide we need to commit the transaction in
btrfs_end_transaction we'll set BLOCKED and start to commit, but if we get an
error this early on we'll just exit without committing.  This is fine, except
that anybody else who tries to start a transaction will sit in
wait_current_trans(), since we're set to BLOCKED and we never set it to
something else or wake people up.  To fix this we want to check for
trans->aborted everywhere we wait for the transaction state to change, and
make btrfs_abort_transaction() wake up any waiters there may be.  All the
callers will notice that the transaction has aborted and exit out properly.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:27 -04:00
Josef Bacik
f971fe29b1 Btrfs: wake up delayed ref flushing waiters on abort
I hit a deadlock because we aborted when flushing delayed refs but didn't wake
any of the other flushers up and so everybody was just sleeping forever.  This
should fix the problem.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:26 -04:00
Jie Liu
3fb4037599 btrfs: fix the code comments for LZO compression workspace
Fix the code comments for the LZO compression workspace.
The buf item is used to store the decompressed data,
and cbuf is used to store the compressed data.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:26 -04:00
Miao Xie
5bc7247ac4 Btrfs: fix broken nocow after balance
Balance will create a reloc_root for each fs root, and it records
last_snapshot to filter shared blocks.  The side effect of
setting last_snapshot is to break the nocow attribute of files.

Since the extents are not shared by the relocation tree after the balance,
we can safely recover the old last_snapshot if no one has snapshotted the
source tree.  We fix the above problem this way.

Reported-by: Kyle Gates <kylegates@hotmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-01 08:52:25 -04:00
Ashish Sangwan
6ae06ff51e ext4: optimize starting extent in ext4_ext_rm_leaf()
Both hole punch and truncate use ext4_ext_rm_leaf() for removing
blocks.  Currently we choose the last extent as the starting
point for removing blocks:

	ex = EXT_LAST_EXTENT(eh);

This is OK for truncate, but for hole punch we can optimize the extent
selection, as the path is already initialized.  We can use this
information to select the proper starting extent.  The code change in this
patch will not affect truncate, since for truncate path[depth].p_ext will
always be NULL.
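
A sketch of the selection (hypothetical stand-ins for the extent types): hole
punch starts from the extent already located in the path, while truncate,
whose p_ext is NULL, still falls back to the last extent.

    #include <stdio.h>
    #include <stddef.h>

    struct extent { int first_block; };

    /* Hole punch arrives with the path already pointing at an extent;
     * truncate arrives with p_ext == NULL and keeps the old behavior. */
    static struct extent *pick_start(struct extent *p_ext,
                                     struct extent *last_ext)
    {
            return p_ext ? p_ext : last_ext;
    }

    int main(void)
    {
            struct extent last = { 100 }, located = { 40 };

            printf("truncate starts at %d\n",
                   pick_start(NULL, &last)->first_block);
            printf("hole punch starts at %d\n",
                   pick_start(&located, &last)->first_block);
            return 0;
    }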

Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:41 -04:00
Theodore Ts'o
41a5b91319 jbd2: invalidate handle if jbd2_journal_restart() fails
If jbd2_journal_restart() fails, the handle will have been disconnected
from the current transaction.  In this situation, the handle must not
be used for any jbd2 function other than jbd2_journal_stop().
Enforce this by treating a handle which has a NULL transaction
pointer as an aborted handle, and issue a kernel warning if
jbd2_journal_extend(), jbd2_journal_get_write_access(),
jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.

This commit also fixes a bug where jbd2_journal_stop() would trip over
a kernel jbd2 assertion check when trying to free an invalid handle.

Also move the responsibility of setting current->journal_info to
start_this_handle(), simplifying the three users of this function.
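
A userspace sketch of the rule (hypothetical struct layout, not the jbd2
code): a handle whose transaction pointer is NULL is treated as aborted, and
any call other than stop warns and bails out.

    #include <stdio.h>

    struct transaction { int dummy; };
    struct handle { struct transaction *h_transaction; int h_aborted; };

    /* A handle with no transaction behaves like an aborted handle. */
    static int is_handle_aborted(const struct handle *h)
    {
            return h->h_aborted || h->h_transaction == NULL;
    }

    /* Stand-in for jbd2_journal_get_write_access() and friends: warn and
     * bail out instead of dereferencing a NULL transaction. */
    static int get_write_access_stub(struct handle *h)
    {
            if (is_handle_aborted(h)) {
                    fprintf(stderr, "WARNING: invalidated handle used\n");
                    return -30; /* -EROFS */
            }
            return 0;
    }

    int main(void)
    {
            struct handle h = { NULL, 0 }; /* restart failed: detached */
            printf("ret = %d\n", get_write_access_stub(&h));
            return 0;
    }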

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Younger Liu <younger.liu@huawei.com>
Cc: Jan Kara <jack@suse.cz>
2013-07-01 08:12:41 -04:00
Theodore Ts'o
21ddd568c1 ext4: translate flag bits to strings in tracepoints
Translate the bitfields used in various flags argument to strings to
make the tracepoint output more human-readable.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:40 -04:00
Theodore Ts'o
cb53054118 ext4: fix up error handling for mpage_map_and_submit_extent()
The function mpage_released_unused_page() must only be called once;
otherwise the kernel will BUG() when the second call to
mpage_released_unused_page() tries to unlock the pages which had been
unlocked by the first call.

Also restructure the error handling so that we only give up on writing
the dirty pages in the case of ENOSPC where retrying the allocation
won't help.  Otherwise, a transient failure, such as a kmalloc()
failure in calling ext4_map_blocks() might cause us to give up on
those pages, leading to a scary message in /var/log/messages plus data
loss.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2013-07-01 08:12:40 -04:00
Theodore Ts'o
39c04153fd jbd2: fix theoretical race in jbd2__journal_restart
Once we decrement transaction->t_updates, if this is the last handle
holding the transaction from closing, and once we release the
t_handle_lock spinlock, it's possible for the transaction to commit
and be released.  In practice with normal kernels, this probably won't
happen, since the commit happens in a separate kernel thread and it's
unlikely this could all happen within the space of a few CPU cycles.

On the other hand, with a real-time kernel, this could potentially
happen, so save the tid found in transaction->t_tid before we release
t_handle_lock.  It would require an insane configuration, such as one
where the jbd2 thread was set to a very high real-time priority,
perhaps because a high priority real-time thread is trying to read or
write to a file system.  But some people who use real-time kernels
have been known to do insane things, including controlling
laser-wielding industrial robots.  :-)
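
A userspace analog of the fix (a pthread mutex standing in for
t_handle_lock): the tid is copied while the lock is held, because the
transaction may commit and be freed the moment the lock is dropped.

    #include <pthread.h>
    #include <stdio.h>

    struct transaction { pthread_mutex_t t_handle_lock; unsigned long t_tid; };

    /* Copy t_tid while t_handle_lock is held; after the unlock the
     * transaction may commit and be freed, so only the local copy is
     * safe to use (e.g. to wait for that tid to commit). */
    static unsigned long detach_and_get_tid(struct transaction *t)
    {
            unsigned long tid;

            pthread_mutex_lock(&t->t_handle_lock);
            tid = t->t_tid;
            pthread_mutex_unlock(&t->t_handle_lock);
            return tid; /* never touch t->t_tid again */
    }

    int main(void)
    {
            struct transaction t = { PTHREAD_MUTEX_INITIALIZER, 7 };
            printf("saved tid = %lu\n", detach_and_get_tid(&t));
            return 0;
    }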

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-07-01 08:12:40 -04:00
Lukas Czerner
e1be3a928e ext4: only zero partial blocks in ext4_zero_partial_blocks()
Currently, if we pass a range into ext4_zero_partial_blocks() which covers
an entire block, we attempt to zero it even though we should only zero the
unaligned part of the block.

Fix this by checking whether the range covers the whole block, and skip the
zeroing if so.
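
A sketch of the check (simplified arithmetic, not the ext4 code): only
unaligned head and tail portions are zeroed, and a range made up of whole
blocks is skipped entirely.

    #include <stdio.h>

    #define BLOCK_SIZE 4096L

    static void zero_partial_blocks(long start, long len)
    {
            long end = start + len;

            if (start % BLOCK_SIZE == 0 && end % BLOCK_SIZE == 0) {
                    printf("range [%ld, %ld) covers whole blocks: skip\n",
                           start, end);
                    return;
            }
            if (start % BLOCK_SIZE)
                    printf("zero unaligned head at %ld\n", start);
            if (end % BLOCK_SIZE)
                    printf("zero unaligned tail up to %ld\n", end);
    }

    int main(void)
    {
            zero_partial_blocks(0, 2 * BLOCK_SIZE); /* aligned: skipped */
            zero_partial_blocks(100, BLOCK_SIZE);   /* head and tail zeroed */
            return 0;
    }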

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:39 -04:00
Theodore Ts'o
42c832debb ext4: check error return from ext4_write_inline_data_end()
The function ext4_write_inline_data_end() can return an error.  So we
need to assign its result to a signed integer variable to check for an
error return (since copied is an unsigned int).
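
A sketch of the fix (a toy stub standing in for the ext4 function): the
return value goes into a signed int, so a negative error code is not lost in
the unsigned copied.

    #include <stdio.h>

    /* Returns bytes copied on success, a negative errno on failure. */
    static int write_inline_data_end_stub(int fail)
    {
            return fail ? -5 /* -EIO */ : 42;
    }

    int main(void)
    {
            unsigned int copied;
            int ret = write_inline_data_end_stub(1); /* signed: holds errors */

            if (ret < 0) {
                    printf("error %d detected\n", ret);
                    return 1;
            }
            copied = ret;
            printf("copied %u bytes\n", copied);
            return 0;
    }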

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <wenqing.lz@taobao.com>
Cc: stable@vger.kernel.org
2013-07-01 08:12:39 -04:00
jon ernst
353eefd338 ext4: delete unnecessary C statements
Comparing an unsigned variable with < 0 always returns false.
The err = 0 assignment is duplicated and unnecessary.
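
An illustration (generic C, not the ext4 code):

    #include <stdio.h>

    int main(void)
    {
            unsigned int count = 0;

            /* Always false: an unsigned value can never be negative,
             * so this branch is dead code (and compilers warn). */
            if (count < 0)
                    printf("never reached\n");

            printf("the comparison is always false\n");
            return 0;
    }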

[ tytso: Also cleaned up error handling in ext4_block_zero_page_range() ]

Signed-off-by: "Jon Ernst" <jonernst07@gmx.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:39 -04:00
Al Viro
64cb927371 ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
In both ext3 and ext4, htree_dirblock_to_tree() just fills the
in-core rbtree for use by call_filldir().  All updates of ->f_pos are
done by the latter; bumping it here (on error) is obviously wrong, since we
might very well have it nowhere near the block in which we found an error.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-07-01 08:12:38 -04:00
Theodore Ts'o
fe52d17cdd jbd2: move superblock checksum calculation to jbd2_write_superblock()
Some of the functions which modify the jbd2 superblock were not
updating the checksum before calling jbd2_write_superblock().  Move
the call to jbd2_superblock_csum_set() to jbd2_write_superblock(), so
that the checksum is calculated consistently.
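
A sketch of the consolidation (toy stubs, not the jbd2 code): every
superblock write funnels through one function that recomputes the checksum
first, so no caller can forget it.

    #include <stdio.h>

    struct journal_sb { int s_sequence; unsigned int s_checksum; };

    /* Toy checksum; the real code uses jbd2_superblock_csum_set(). */
    static unsigned int csum_stub(const struct journal_sb *sb)
    {
            return (unsigned int)sb->s_sequence * 2654435761u;
    }

    /* Single choke point: recompute the checksum on every write. */
    static void write_superblock_stub(struct journal_sb *sb)
    {
            sb->s_checksum = csum_stub(sb);
            printf("writing sb: seq=%d csum=%08x\n",
                   sb->s_sequence, sb->s_checksum);
    }

    int main(void)
    {
            struct journal_sb sb = { 1, 0 };

            sb.s_sequence = 2;          /* caller modifies a field...   */
            write_superblock_stub(&sb); /* ...checksum stays consistent */
            return 0;
    }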

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: stable@vger.kernel.org
2013-07-01 08:12:38 -04:00
Ashish Sangwan
aeb2817a4e ext4: pass inode pointer instead of file pointer to punch hole
No need to pass a file pointer when we can directly pass the inode pointer.

Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:38 -04:00
boxi liu
c4932dbe63 ext4: improve free space calculation for inline_data
The ext4 inline_data feature uses the inode's xattr space to store inline
data.  When we calculate the size of the inline data as an xattr, we add the
padding, but the get_max_inline_xattr_value_size() function counts the free
space without the padding.  This causes some content to be moved out to a
block even when it could be stored in the inode.
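
A sketch of the accounting mismatch (assumed 4-byte xattr value padding,
matching ext4's EXT4_XATTR_PAD; the sizes are made up): the fit check must
use the padded size, since that is what actually gets stored.

    #include <stdio.h>

    #define XATTR_PAD 4
    #define XATTR_ROUND_UP(len) (((len) + XATTR_PAD - 1) & ~(XATTR_PAD - 1))

    int main(void)
    {
            int free_space = 59; /* bytes left in the inode's xattr area */
            int data_len = 58;   /* inline data we want to keep in-inode */

            /* Raw size fits, but the padded size does not: the content
             * gets pushed out to a block. */
            printf("raw size fits:    %s\n",
                   data_len <= free_space ? "yes" : "no");
            printf("padded size fits: %s (rounded to %d)\n",
                   XATTR_ROUND_UP(data_len) <= free_space ? "yes" : "no",
                   XATTR_ROUND_UP(data_len));
            return 0;
    }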

Signed-off-by: liulei <lewis.liulei@huawei.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Tao Ma <boyu.mt@taobao.com>
2013-07-01 08:12:37 -04:00
Joe Perches
e7c96e8e47 ext4: reduce object size when !CONFIG_PRINTK
Reducing the object size by ~10% could be useful for embedded systems.

Add #ifdef CONFIG_PRINTK #else #endif blocks to hold the formats and
arguments, passing " " to the functions when !CONFIG_PRINTK, and still
verifying the format and arguments with no_printk() (see the sketch after
the measurements below).

$ size fs/ext4/built-in.o*
   text	   data	    bss	    dec	    hex	filename
 239375	    610	    888	 240873	  3ace9	fs/ext4/built-in.o.new
 264167	    738	    888	 265793	  40e41	fs/ext4/built-in.o.old

    $ grep -E "CONFIG_EXT4|CONFIG_PRINTK" .config
    # CONFIG_PRINTK is not set
    CONFIG_EXT4_FS=y
    CONFIG_EXT4_USE_FOR_EXT23=y
    CONFIG_EXT4_FS_POSIX_ACL=y
    # CONFIG_EXT4_FS_SECURITY is not set
    # CONFIG_EXT4_DEBUG is not set
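
A userspace sketch of the pattern (simplified, hypothetical ext4_msg
signature; the kernel's no_printk() is essentially an if (0) printk() that
type-checks the format and arguments while generating no code):

    #include <stdio.h>

    #define MY_CONFIG_PRINTK 0

    #if MY_CONFIG_PRINTK
    #define ext4_msg(fmt, ...) printf(fmt, ##__VA_ARGS__)
    #else
    /* format/args still type-checked, but never emitted */
    #define ext4_msg(fmt, ...) \
            do { if (0) printf(fmt, ##__VA_ARGS__); } while (0)
    #endif

    int main(void)
    {
            ext4_msg("mounted filesystem with errors=%d\n", 0);
            printf("done (message above compiled out)\n");
            return 0;
    }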

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:37 -04:00
Zheng Liu
d3922a777f ext4: improve extent cache shrink mechanism to avoid to burn CPU time
Now we maintain a proper in-order LRU list in ext4 to reclaim entries
from the extent status tree when we are under heavy memory pressure.  To
keep this order, a spin lock is used to protect the list, but this
lock burns a lot of CPU time.  We can use the following steps to trigger
the problem.

  % cd /dev/shm
  % dd if=/dev/zero of=ext4-img bs=1M count=2k
  % mkfs.ext4 ext4-img
  % mount -t ext4 -o loop ext4-img /mnt
  % cd /mnt
  % for ((i=0;i<160;i++)); do truncate -s 64g $i; done
  % for ((i=0;i<160;i++)); do cp $i /dev/null & done
  % perf record -a -g
  % perf report

This commit tries to fix this problem.  A new member called
i_touch_when is added to ext4_inode_info to record the last access
time of an inode.  Meanwhile, we no longer need to keep the LRU list
properly ordered at all times, which avoids burning CPU time.  When we
try to reclaim some entries from the extent status tree, we use
list_sort() to get a properly ordered list and then traverse it to
discard some entries.  In ext4_sb_info, s_es_last_sorted records the
last time this list was sorted.  When we traverse the list, we skip
inodes newer than this time and move them to the tail of the LRU
list.  When the head of the list is newer than s_es_last_sorted, we
sort the LRU list again.

In this commit, we break the loop if s_extent_cache_cnt == 0 because
that means that all extents in extent status tree have been reclaimed.

Meanwhile in this commit, ext4_es_{un}register_shrinker()'s prototype is
changed to save a local variable in these functions.

Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:37 -04:00
Alexey Khoroshilov
2c00ef3ee3 ext4: implement error handling of ext4_mb_new_preallocation()
If the memory allocation in ext4_mb_new_group_pa() fails,
it returns an error code; ext4_mb_new_preallocation() propagates it,
but ext4_mb_new_blocks() ignores it.

An observed result was:

- the failed allocation means ext4_mb_new_group_pa() does not update
  the ext4_allocation_context;

- ext4_mb_new_blocks() sets ext4_allocation_request->len (ar->len =
  ac->ac_b_ex.fe_len;) to the number of blocks preallocated (512) instead
  of the number of blocks requested (1);

- that activates the update cycle in ext4_splice_branch():
    for (i = 1; i < blks; i++) <-- blks is 512 instead of 1 here
      *(where->p + i) = cpu_to_le32(current_block++);

- it iterates 511 times and corrupts a chunk of memory, including the
  inode structure;

- page fault happens at EXT4_SB(inode->i_sb) in ext4_mark_inode_dirty();

- system hangs with 'scheduling while atomic' BUG.

The patch implements a check of the ext4_mb_new_preallocation() error
code and handles its failure as if ext4_mb_regular_allocator() had failed.
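
A sketch of the intended flow (toy stubs, not the ext4 code): the caller
checks the propagated error instead of consuming a stale preallocation
length.

    #include <stdio.h>

    static int new_preallocation_stub(int fail)
    {
            return fail ? -12 /* -ENOMEM */ : 0;
    }

    static int new_blocks_stub(int fail, unsigned int *len)
    {
            int err = new_preallocation_stub(fail);

            if (err) {
                    *len = 0;   /* don't report a stale preallocated length */
                    return err; /* handle like a regular allocator failure */
            }
            *len = 1;           /* blocks actually allocated */
            return 0;
    }

    int main(void)
    {
            unsigned int len;
            int err = new_blocks_stub(1, &len);

            printf("err=%d len=%u\n", err, len);
            return 0;
    }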

Found by Linux File System Verification project (linuxtesting.org).

[ Patch restructured by tytso to make the flow of control easier to follow. ]

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-01 08:12:36 -04:00