* refs/heads/tmp-b1bad9e
Linux 4.4.141
loop: remember whether sysfs_create_group() was done
RDMA/ucm: Mark UCM interface as BROKEN
PM / hibernate: Fix oops at snapshot_write()
loop: add recursion validation to LOOP_CHANGE_FD
netfilter: x_tables: initialise match/target check parameter struct
netfilter: nf_queue: augment nfqa_cfg_policy
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
x86/cpufeature: Add helper macro for mask check macros
x86/cpufeature: Make sure DISABLED/REQUIRED macros are updated
x86/cpufeature: Update cpufeaure macros
x86/cpufeature, x86/mm/pkeys: Fix broken compile-time disabling of pkeys
x86/cpu: Add detection of AMD RAS Capabilities
x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions
x86/cpufeature: Speed up cpu_feature_enabled()
x86/boot: Simplify kernel load address alignment check
x86/vdso: Use static_cpu_has()
x86/alternatives: Discard dynamic check after init
x86/alternatives: Add an auxilary section
x86/cpufeature: Get rid of the non-asm goto variant
x86/cpufeature: Replace the old static_cpu_has() with safe variant
x86/cpufeature: Carve out X86_FEATURE_*
x86/headers: Don't include asm/processor.h in asm/atomic.h
x86/fpu: Get rid of xstate_fault()
x86/fpu: Add an XSTATE_OP() macro
x86/cpu: Provide a config option to disable static_cpu_has
x86/cpufeature: Cleanup get_cpu_cap()
x86/cpufeature: Move some of the scattered feature bits to x86_capability
iw_cxgb4: correctly enforce the max reg_mr depth
tools build: fix # escaping in .cmd files for future Make
Fix up non-directory creation in SGID directories
HID: usbhid: add quirk for innomedia INNEX GENESIS/ATARI adapter
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
usb: quirks: add delay quirks for Corsair Strafe
USB: serial: mos7840: fix status-register error handling
USB: yurex: fix out-of-bounds uaccess in read handler
USB: serial: keyspan_pda: fix modem-status error handling
USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
USB: serial: ch341: fix type promotion bug in ch341_control_in()
ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
vmw_balloon: fix inflation with batching
ibmasm: don't write out of bounds in read handler
MIPS: Fix ioremap() RAM check
cpufreq: Kconfig: Remove CPU_FREQ_DEFAULT_GOV_SCHED
Change-Id: I0909a2917621f2384cdfe27078577cc2c06b9612
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltNt4MACgkQONu9yGCS
aT4Otg//e7FAfNGllvjx+53RBbpRUoa4ltdKNrdKa94ZGgVbGCdctKa9BntDkHSb
Vw6tfvdonuJSs3e9KBSt4vOiTWkJ0eOnajdRYEQUg/jtufIULWgHNEl1dk0JB2Oj
+8GAfXzlZ7NRfjEV0l0m44aU/qHaWVBBPQcmqLlxnLEr+0idWfSAGALEBnK6W+nH
5yNU8X1pxVb1qSnL2YVM03+B9cfrFlpiPv46+hrHaQ6r87e+veD6f1tE1o8BvVy6
f8CxWGvYisKJZ+OOQLH95xVahzcsGG5RKcarXzjsq30XJM1QZj8hBSWlzj0aBZmW
OAiJ2dJccZaThxBSPJWLm6jzrUpjmQOtQMRK6TnlGxhG03eA8noxffTE03RUzL7Q
jog6oxGgnrM+h08kmNHQEWP8EMgc6GTextKY2v9LQL51L+IBkvX8YOJwZS8YltOI
XcoriH/lrNq5O7gSEQ4WoZWYlDlVYNc8r5EqI8lYeeShdGJqps6/wOZa1zqBFtbE
BD0UxIDOs4zmcqPBebVUqGoPklLsGW5QfZi1dgBTiGNnopokMxia3DlPnQeq/euM
b7+DBzL0ce2EamIh///HS+HF2uAM5N7w+BdEbYpIUCoSTKB0hUuKIM+T6rgXEvzD
y0wJhH4SmjBH8w/Hc57VYVqOMAG+cUPDlhrw5XBkZ9HXy1ns1HM=
=780A
-----END PGP SIGNATURE-----
Merge 4.4.141 into android-4.4
Changes in 4.4.141
MIPS: Fix ioremap() RAM check
ibmasm: don't write out of bounds in read handler
vmw_balloon: fix inflation with batching
ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
USB: serial: ch341: fix type promotion bug in ch341_control_in()
USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
USB: serial: keyspan_pda: fix modem-status error handling
USB: yurex: fix out-of-bounds uaccess in read handler
USB: serial: mos7840: fix status-register error handling
usb: quirks: add delay quirks for Corsair Strafe
xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
HID: usbhid: add quirk for innomedia INNEX GENESIS/ATARI adapter
Fix up non-directory creation in SGID directories
tools build: fix # escaping in .cmd files for future Make
iw_cxgb4: correctly enforce the max reg_mr depth
x86/cpufeature: Move some of the scattered feature bits to x86_capability
x86/cpufeature: Cleanup get_cpu_cap()
x86/cpu: Provide a config option to disable static_cpu_has
x86/fpu: Add an XSTATE_OP() macro
x86/fpu: Get rid of xstate_fault()
x86/headers: Don't include asm/processor.h in asm/atomic.h
x86/cpufeature: Carve out X86_FEATURE_*
x86/cpufeature: Replace the old static_cpu_has() with safe variant
x86/cpufeature: Get rid of the non-asm goto variant
x86/alternatives: Add an auxilary section
x86/alternatives: Discard dynamic check after init
x86/vdso: Use static_cpu_has()
x86/boot: Simplify kernel load address alignment check
x86/cpufeature: Speed up cpu_feature_enabled()
x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions
x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
x86/cpu: Add detection of AMD RAS Capabilities
x86/cpufeature, x86/mm/pkeys: Fix broken compile-time disabling of pkeys
x86/cpufeature: Update cpufeaure macros
x86/cpufeature: Make sure DISABLED/REQUIRED macros are updated
x86/cpufeature: Add helper macro for mask check macros
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
netfilter: nf_queue: augment nfqa_cfg_policy
netfilter: x_tables: initialise match/target check parameter struct
loop: add recursion validation to LOOP_CHANGE_FD
PM / hibernate: Fix oops at snapshot_write()
RDMA/ucm: Mark UCM interface as BROKEN
loop: remember whether sysfs_create_group() was done
Linux 4.4.141
Change-Id: I777b39a0ede95b58638add97756d6beaf4a9d154
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 0fa3ecd87848c9c93c2c828ef4c3a8ca36ce46c7 upstream.
sgid directories have special semantics, making newly created files in
the directory belong to the group of the directory, and newly created
subdirectories will also become sgid. This is historically used for
group-shared directories.
But group directories writable by non-group members should not imply
that such non-group members can magically join the group, so make sure
to clear the sgid bit on non-directories for non-members (but remember
that sgid without group execute means "mandatory locking", just to
confuse things even more).
Reported-by: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* refs/heads/tmp-20ddb25
Linux 4.4.116
ftrace: Remove incorrect setting of glob search field
mn10300/misalignment: Use SIGSEGV SEGV_MAPERR to report a failed user copy
ovl: fix failure to fsync lower dir
ACPI: sbshc: remove raw pointer from printk() message
nvme: Fix managing degraded controllers
btrfs: Handle btrfs_set_extent_delalloc failure in fixup worker
pktcdvd: Fix pkt_setup_dev() error path
EDAC, octeon: Fix an uninitialized variable warning
xtensa: fix futex_atomic_cmpxchg_inatomic
alpha: fix reboot on Avanti platform
alpha: fix crash if pthread_create races with signal delivery
signal/sh: Ensure si_signo is initialized in do_divide_error
signal/openrisc: Fix do_unaligned_access to send the proper signal
Bluetooth: btusb: Restore QCA Rome suspend/resume fix with a "rewritten" version
Revert "Bluetooth: btusb: fix QCA Rome suspend/resume"
Bluetooth: btsdio: Do not bind to non-removable BCM43341
HID: quirks: Fix keyboard + touchpad on Toshiba Click Mini not working
kernel/async.c: revert "async: simplify lowest_in_progress()"
media: cxusb, dib0700: ignore XC2028_I2C_FLUSH
media: ts2020: avoid integer overflows on 32 bit machines
watchdog: imx2_wdt: restore previous timeout after suspend+resume
KVM: nVMX: Fix races when sending nested PI while dest enters/leaves L2
arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
crypto: caam - fix endless loop when DECO acquire fails
media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic
media: v4l2-compat-ioctl32.c: don't copy back the result for certain errors
media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type
media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32
media: v4l2-compat-ioctl32: Copy v4l2_window->global_alpha
media: v4l2-compat-ioctl32.c: make ctrl_is_pointer work for subdevs
media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer
media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32
media: v4l2-compat-ioctl32.c: avoid sizeof(type)
media: v4l2-compat-ioctl32.c: move 'helper' functions to __get/put_v4l2_format32
media: v4l2-compat-ioctl32.c: fix the indentation
media: v4l2-compat-ioctl32.c: add missing VIDIOC_PREPARE_BUF
vb2: V4L2_BUF_FLAG_DONE is set after DQBUF
media: v4l2-ioctl.c: don't copy back the result for -ENOTTY
nsfs: mark dentry with DCACHE_RCUACCESS
crypto: poly1305 - remove ->setkey() method
crypto: cryptd - pass through absence of ->setkey()
crypto: hash - introduce crypto_hash_alg_has_setkey()
ahci: Add Intel Cannon Lake PCH-H PCI ID
ahci: Add PCI ids for Intel Bay Trail, Cherry Trail and Apollo Lake AHCI
ahci: Annotate PCI ids for mobile Intel chipsets as such
kernfs: fix regression in kernfs_fop_write caused by wrong type
NFS: reject request for id_legacy key without auxdata
NFS: commit direct writes even if they fail partially
NFS: Add a cond_resched() to nfs_commit_release_pages()
nfs/pnfs: fix nfs_direct_req ref leak when i/o falls back to the mds
ubi: block: Fix locking for idr_alloc/idr_remove
mtd: nand: sunxi: Fix ECC strength choice
mtd: nand: Fix nand_do_read_oob() return value
mtd: nand: brcmnand: Disable prefetch by default
mtd: cfi: convert inline functions to macros
media: dvb-usb-v2: lmedm04: move ts2020 attach to dm04_lme2510_tuner
media: dvb-usb-v2: lmedm04: Improve logic checking of warm start
dccp: CVE-2017-8824: use-after-free in DCCP code
sched/rt: Up the root domain ref count when passing it around via IPIs
sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()
usb: gadget: uvc: Missing files for configfs interface
posix-timer: Properly check sigevent->sigev_notify
netfilter: nf_queue: Make the queue_handler pernet
kaiser: fix compile error without vsyscall
x86/kaiser: fix build error with KASAN && !FUNCTION_GRAPH_TRACER
dmaengine: dmatest: fix container_of member in dmatest_callback
CIFS: zero sensitive data when freeing
cifs: Fix autonegotiate security settings mismatch
cifs: Fix missing put_xid in cifs_file_strict_mmap
powerpc/pseries: include linux/types.h in asm/hvcall.h
x86/microcode: Do the family check first
x86/microcode/AMD: Do not load when running on a hypervisor
crypto: tcrypt - fix S/G table for test_aead_speed()
don't put symlink bodies in pagecache into highmem
KEYS: encrypted: fix buffer overread in valid_master_desc()
media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
vhost_net: stop device during reset owner
tcp: release sk_frag.page in tcp_disconnect
r8169: fix RTL8168EP take too long to complete driver initialization.
qlcnic: fix deadlock bug
net: igmp: add a missing rcu locking section
ip6mr: fix stale iterator
x86/asm: Fix inline asm call constraints for GCC 4.4
drm: rcar-du: Fix race condition when disabling planes at CRTC stop
drm: rcar-du: Use the VBK interrupt for vblank events
ASoC: rsnd: avoid duplicate free_irq()
ASoC: rsnd: don't call free_irq() on Parent SSI
ASoC: simple-card: Fix misleading error message
net: cdc_ncm: initialize drvflags before usage
usbip: fix 3eee23c3ec14 tcp_socket address still in the status file
usbip: vhci_hcd: clear just the USB_PORT_STAT_POWER bit
ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
powerpc/64s: Allow control of RFI flush via debugfs
powerpc/64s: Wire up cpu_show_meltdown()
powerpc/powernv: Check device-tree for RFI flush settings
powerpc/pseries: Query hypervisor for RFI flush settings
powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
powerpc/64s: Add support for RFI flush of L1-D cache
powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
powerpc/64s: Simple RFI macro conversions
powerpc/64: Add macros for annotating the destination of rfid/hrfid
powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
powerpc: Simplify module TOC handling
powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC
powerpc/64: Fix flush_(d|i)cache_range() called from modules
powerpc/bpf/jit: Disable classic BPF JIT on ppc64le
BACKPORT: xfrm: Fix return value check of copy_sec_ctx.
time: Fix ktime_get_raw() incorrect base accumulation
sched/fair: prevent possible infinite loop in sched_group_energy
UPSTREAM: MIPS: Fix build of compressed image
ANDROID: qtaguid: Fix the UAF probelm with tag_ref_tree
UPSTREAM: ANDROID: binder: remove waitqueue when thread exits.
UPSTREAM: arm64/efi: Make strnlen() available to the EFI namespace
UPSTREAM: ARM: boot: Add an implementation of strnlen for libfdt
ANDROID: MIPS: Add ranchu[32r5|32r6|64]_defconfig
FROMLIST: tty: goldfish: Enable 'earlycon' only if built-in
FROMLIST: MIPS: ranchu: Add Ranchu as a new generic-based board
FROMLIST: MIPS: Add noexec=on|off kernel parameter
FROMLIST: MIPS: CPC: Map registers using DT in mips_cpc_default_phys_base()
FROMLIST: dt-bindings: Document mti,mips-cpc binding
FROMLIST: MIPS: math-emu: Mark fall throughs in switch statements with a comment
FROMLIST: MIPS: math-emu: Avoid multiple assignment
FROMLIST: MIPS: math-emu: Avoid an assignment within if statement condition
FROMLIST: MIPS: math-emu: Declare function srl128() as static
FROMLIST: MIPS: math-emu: Avoid definition duplication for macro DPXMULT()
FROMLIST: MIPS: math-emu: Remove an unnecessary header inclusion
UPSTREAM: scripts/dtc: Update to upstream version 0931cea3ba20
UPSTREAM: scripts/dtc: dt_to_config - kernel config options for a devicetree
UPSTREAM: scripts/dtc: Update to upstream version 53bf130b1cdd
UPSTREAM: scripts/dtc: Update to upstream commit b06e55c88b9b
UPSTREAM: scripts/dtc: dtx_diff - add info to error message
UPSTREAM: dtc: create tool to diff device trees
UPSTREAM: config: android-base: disable CONFIG_NFSD and CONFIG_NFS_FS
UPSTREAM: config: android-base: add CGROUP_BPF
UPSTREAM: config: android-base: add CONFIG_MODULES option
UPSTREAM: config: android-base: add CONFIG_IKCONFIG option
UPSTREAM: config: android-base: disable CONFIG_USELIB and CONFIG_FHANDLE
UPSTREAM: config: android-base: enable hardened usercopy and kernel ASLR
UPSTREAM: config: android: enable CONFIG_SECCOMP
UPSTREAM: config: android: set SELinux as default security mode
UPSTREAM: config: android: move device mapper options to recommended
UPSTREAM: config/android: Remove CONFIG_IPV6_PRIVACY
UPSTREAM: config: add android config fragments
BACKPORT: MIPS: generic: Add a MAINTAINERS entry
BACKPORT: irqchip/irq-goldfish-pic: Add Goldfish PIC driver
UPSTREAM: dt-bindings/goldfish-pic: Add device tree binding for Goldfish PIC driver
UPSTREAM: MIPS: Allow storing pgd in C0_CONTEXT for MIPSr6
UPSTREAM: MIPS: CPS: Handle spurious VP starts more gracefully
UPSTREAM: MIPS: CPS: Handle cores not powering down more gracefully
UPSTREAM: MIPS: CPS: Prevent multi-core with dcache aliasing
UPSTREAM: MIPS: CPS: Select CONFIG_SYS_SUPPORTS_SCHED_SMT for MIPSr6
UPSTREAM: MIPS: CM: WARN on attempt to lock invalid VP, not BUG
UPSTREAM: MIPS: CM: Avoid per-core locking with CM3 & higher
UPSTREAM: MIPS: smp-cps: Avoid BUG() when offlining pre-r6 CPUs
UPSTREAM: MIPS: smp-cps: Add support for CPU hotplug of MIPSr6 processors
UPSTREAM: MIPS: generic: Bump default NR_CPUS to 16
UPSTREAM: MIPS: pm-cps: Change FSB workaround to CPU blacklist
UPSTREAM: MIPS: Fix early CM probing
UPSTREAM: MIPS: smp-cps: Stop printing EJTAG exceptions to UART
UPSTREAM: MIPS: smp-cps: Add nothreads kernel parameter
UPSTREAM: MIPS: smp-cps: Support MIPSr6 Virtual Processors
UPSTREAM: MIPS: smp-cps: Skip core setup if coherent
UPSTREAM: MIPS: smp-cps: Pull boot config retrieval out of mips_cps_boot_vpes
UPSTREAM: MIPS: smp-cps: Pull cache init into a function
UPSTREAM: MIPS: smp-cps: Ensure our VP ident calculation is correct
UPSTREAM: irqchip: mips-gic: Provide VP ID accessor
UPSTREAM: irqchip: mips-gic: Use HW IDs for VPE_OTHER_ADDR
UPSTREAM: MIPS: CM: Fix mips_cm_max_vp_width for UP kernels
UPSTREAM: MIPS: CM: Add CM GCR_BEV_BASE accessors
UPSTREAM: MIPS: CPC: Add start, stop and running CM3 CPC registers
UPSTREAM: MIPS: pm-cps: Avoid offset overflow on MIPSr6
UPSTREAM: MIPS: traps: Make sure secondary cores have a sane ebase register
UPSTREAM: MIPS: Detect MIPSr6 Virtual Processor support
UPSTREAM: Documentation: Add device tree binding for Goldfish FB driver
UPSTREAM: MIPS: math-emu: Use preferred flavor of unsigned integer declarations
UPSTREAM: MIPS: math-emu: <MADDF|MSUBF>.D: Fix accuracy (64-bit case)
UPSTREAM: MIPS: math-emu: <MADDF|MSUBF>.S: Fix accuracy (32-bit case)
UPSTREAM: MIPS: Update Goldfish RTC driver maintainer email address
UPSTREAM: MIPS: Update RINT emulation maintainer email address
UPSTREAM: MIPS: math-emu: do not use bools for arithmetic
UPSTREAM: rtc: goldfish: Add RTC driver for Android emulator
BACKPORT: dt-bindings: Add device tree binding for Goldfish RTC driver
UPSTREAM: tty: goldfish: Implement support for kernel 'earlycon' parameter
UPSTREAM: tty: goldfish: Use streaming DMA for r/w operations on Ranchu platforms
UPSTREAM: tty: goldfish: Refactor constants to better reflect their nature
UPSTREAM: MIPS: math-emu: Add FP emu debugfs stats for individual instructions
UPSTREAM: MIPS: math-emu: Add FP emu debugfs clear functionality
UPSTREAM: MIPS: math-emu: Add FP emu debugfs statistics for branches
BACKPORT: MIPS: math-emu: CLASS.D: Zero bits 32-63 of the result
BACKPORT: MIPS: math-emu: RINT.<D|S>: Fix several problems by reimplementation
UPSTREAM: MIPS: math-emu: CMP.Sxxx.<D|S>: Prevent occurrences of SIGILL crashes
UPSTREAM: MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Clean up "maddf_flags" enumeration
UPSTREAM: MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of zero inputs
UPSTREAM: MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of infinite inputs
UPSTREAM: MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix NaN propagation
UPSTREAM: tty: goldfish: Fix a parameter of a call to free_irq
UPSTREAM: MIPS: VDSO: Fix clobber lists in fallback code paths
UPSTREAM: MIPS: VDSO: Fix a mismatch between comment and preprocessor constant
UPSTREAM: MIPS: VDSO: Add implementation of gettimeofday() fallback
UPSTREAM: MIPS: VDSO: Add implementation of clock_gettime() fallback
UPSTREAM: MIPS: VDSO: Fix conversions in do_monotonic()/do_monotonic_coarse()
UPSTREAM: MIPS: unaligned: Add DSP lwx & lhx missaligned access support
UPSTREAM: MIPS: build: Fix "-modd-spreg" switch usage when compiling for mips32r6
UPSTREAM: MIPS: cmdline: Add support for 'memmap' parameter
UPSTREAM: MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately
UPSTREAM: MIPS: Support per-device DMA coherence
UPSTREAM: MIPS: dma-default: Don't check hw_coherentio if device is non-coherent
UPSTREAM: MIPS: Sanitise coherentio semantics
UPSTREAM: MIPS: CPC: Provide default mips_cpc_default_phys_base to ignore CPC
UPSTREAM: MIPS: generic: Introduce generic DT-based board support
UPSTREAM: MIPS: Support generating Flattened Image Trees (.itb)
UPSTREAM: MIPS: Allow emulation for unaligned [LS]DXC1 instructions
UPSTREAM: MIPS: math-emu: Fix BC1EQZ and BC1NEZ condition handling
UPSTREAM: MIPS: r2-on-r6-emu: Clear BLTZALL and BGEZALL debugfs counters
UPSTREAM: MIPS: r2-on-r6-emu: Fix BLEZL and BGTZL identification
UPSTREAM: MIPS: remove aliasing alignment if HW has antialising support
BACKPORT: MIPS: store the appended dtb address in a variable
UPSTREAM: MIPS: Fix FCSR Cause bit handling for correct SIGFPE issue
UPSTREAM: MIPS: kernel: Audit and remove any unnecessary uses of module.h
UPSTREAM: MIPS: c-r4k: Fix sigtramp SMP call to use kmap
UPSTREAM: MIPS: c-r4k: Fix protected_writeback_scache_line for EVA
UPSTREAM: MIPS: Spelling fix lets -> let's
UPSTREAM: MIPS: R6: Fix typo
UPSTREAM: MIPS: traps: Correct the SIGTRAP debug ABI in `do_watch' and `do_trap_or_bp'
UPSTREAM: MIPS: inst.h: Rename cbcond{0,1}_op to pop{1,3}0_op
UPSTREAM: MIPS: inst.h: Rename b{eq,ne}zcji[al]c_op to pop{6,7}6_op
UPSTREAM: MIPS: math-emu: Fix m{add,sub}.s shifts
UPSTREAM: MIPS: inst: Declare fsel_op for sel.fmt instruction
UPSTREAM: MIPS: math-emu: Fix code indentation
UPSTREAM: MIPS: math-emu: Fix bit-width in ieee754dp_{mul, maddf, msubf} comments
UPSTREAM: MIPS: math-emu: Add z argument macros
UPSTREAM: MIPS: math-emu: Unify ieee754dp_m{add,sub}f
UPSTREAM: MIPS: math-emu: Unify ieee754sp_m{add,sub}f
UPSTREAM: MIPS: math-emu: Emulate MIPSr6 sel.fmt instruction
UPSTREAM: MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
UPSTREAM: MIPS: math-emu: Always propagate sNaN payload in quieting
UPSTREAM: MIPS: Fix misspellings in comments.
UPSTREAM: MIPS: math-emu: Add IEEE Std 754-2008 NaN encoding emulation
UPSTREAM: MIPS: math-emu: Add IEEE Std 754-2008 ABS.fmt and NEG.fmt emulation
UPSTREAM: MIPS: non-exec stack & heap when non-exec PT_GNU_STACK is present
UPSTREAM: MIPS: Add IEEE Std 754 conformance mode selection
UPSTREAM: MIPS: Determine the presence of IEEE Std 754-2008 features
UPSTREAM: MIPS: Define the legacy-NaN and 2008-NaN features
UPSTREAM: MIPS: ELF: Interpret the NAN2008 file header flag
UPSTREAM: ELF: Also pass any interpreter's file header to `arch_check_elf'
UPSTREAM: MIPS: Use a union to access the ELF file header
UPSTREAM: MIPS: Fix delay slot emulation count in debugfs
BACKPORT: exit_thread: accept a task parameter to be exited
UPSTREAM: mn10300: let exit_fpu accept a task
UPSTREAM: MIPS: Use per-mm page to execute branch delay slot instructions
BACKPORT: s390: get rid of exit_thread()
BACKPORT: exit_thread: remove empty bodies
UPSTREAM: MIPS: Make flush_thread
UPSTREAM: MIPS: Properly disable FPU in start_thread()
UPSTREAM: MIPS: Select CONFIG_HANDLE_DOMAIN_IRQ and make it work.
UPSTREAM: MIPS: math-emu: Fix typo
UPSTREAM: MIPS: math-emu: dsemul: Remove an unused bit in ADDIUPC emulation
UPSTREAM: MIPS: math-emu: dsemul: Reduce `get_isa16_mode' clutter
UPSTREAM: MIPS: math-emu: dsemul: Correct description of the emulation frame
UPSTREAM: MIPS: math-emu: Correct the emulation of microMIPS ADDIUPC instruction
UPSTREAM: MIPS: math-emu: Make microMIPS branch delay slot emulation work
UPSTREAM: MIPS: math-emu: dsemul: Fix ill formatting of microMIPS part
UPSTREAM: MIPS: math-emu: Correctly handle NOP emulation
Conflicts:
drivers/irqchip/Kconfig
drivers/irqchip/Makefile
drivers/media/v4l2-core/v4l2-compat-ioctl32.c
Change-Id: I98374358ab24ce80dba3afa2f4562c71f45b7aab
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlqHLN8ACgkQONu9yGCS
aT7eyQ/+NGK3/MPgoqRtg8sEvr1CVk8VhH1BiBfiQPGXe/D4nqPrKQzQBBzsW8QX
6Z9PY7wDz9RgFkw+FoOyG0eLuYdgNYOelASdQ4kJzteVH8pB2GxxTbX0drttzV+F
liNy0w39YLYxbjR4FavOSuDekd46dNQsHBvzTawaFKh0BEtQO+1uUGMg1LjMKVPn
F9ry0mEPrOoC2+nRvU6QXIUZy6y4+Pgdda0sfGcO3yXwQev9HoW5h9qMCnGah30J
D3Glt86dtpQcuqeIaXrfX+HnkvAOxTHjP8uRn3O7A7h8+WYBWq5Xms6A7EE9duNV
0UA8OZpvq0r0YSTmBFzrDexAcf/cXW8ajd/VKseI/d53iIauLV5FUaGldLJ3IQMc
gYZ2uNxGTI4z3V+nIiVQ0NCm4kmqogVY8PvMlgUwiFVG2B088iYGZ7iTOQ9b7wBO
VgDo0ouC/yDA8Lmz/A0l3SuvkJDNIPJit5lWzqCGRjk1F8WdPpI5C3ONfp8R3Lko
sTllldOo982KW5up/fg5HfuMg1OjgXZtzO+/NlTtyTpSr9bb1OoniSROG8eEcMqO
lKI1MB8Xx/pqqW1E8OOtb7A/8JPCBFzVV9xVGKwI0uZa2XOQeAwGruOe8Ub6nEpU
8w30DlSgy8MB1BPL6UGC6k+001k8jkohdl/qjpYb6aK55CfbhlA=
=a3k5
-----END PGP SIGNATURE-----
Merge 4.4.116 into android-4.4
Changes in 4.4.116
powerpc/bpf/jit: Disable classic BPF JIT on ppc64le
powerpc/64: Fix flush_(d|i)cache_range() called from modules
powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC
powerpc: Simplify module TOC handling
powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
powerpc/64: Add macros for annotating the destination of rfid/hrfid
powerpc/64s: Simple RFI macro conversions
powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
powerpc/64s: Add support for RFI flush of L1-D cache
powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
powerpc/pseries: Query hypervisor for RFI flush settings
powerpc/powernv: Check device-tree for RFI flush settings
powerpc/64s: Wire up cpu_show_meltdown()
powerpc/64s: Allow control of RFI flush via debugfs
ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
usbip: vhci_hcd: clear just the USB_PORT_STAT_POWER bit
usbip: fix 3eee23c3ec14 tcp_socket address still in the status file
net: cdc_ncm: initialize drvflags before usage
ASoC: simple-card: Fix misleading error message
ASoC: rsnd: don't call free_irq() on Parent SSI
ASoC: rsnd: avoid duplicate free_irq()
drm: rcar-du: Use the VBK interrupt for vblank events
drm: rcar-du: Fix race condition when disabling planes at CRTC stop
x86/asm: Fix inline asm call constraints for GCC 4.4
ip6mr: fix stale iterator
net: igmp: add a missing rcu locking section
qlcnic: fix deadlock bug
r8169: fix RTL8168EP take too long to complete driver initialization.
tcp: release sk_frag.page in tcp_disconnect
vhost_net: stop device during reset owner
media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
KEYS: encrypted: fix buffer overread in valid_master_desc()
don't put symlink bodies in pagecache into highmem
crypto: tcrypt - fix S/G table for test_aead_speed()
x86/microcode/AMD: Do not load when running on a hypervisor
x86/microcode: Do the family check first
powerpc/pseries: include linux/types.h in asm/hvcall.h
cifs: Fix missing put_xid in cifs_file_strict_mmap
cifs: Fix autonegotiate security settings mismatch
CIFS: zero sensitive data when freeing
dmaengine: dmatest: fix container_of member in dmatest_callback
x86/kaiser: fix build error with KASAN && !FUNCTION_GRAPH_TRACER
kaiser: fix compile error without vsyscall
netfilter: nf_queue: Make the queue_handler pernet
posix-timer: Properly check sigevent->sigev_notify
usb: gadget: uvc: Missing files for configfs interface
sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()
sched/rt: Up the root domain ref count when passing it around via IPIs
dccp: CVE-2017-8824: use-after-free in DCCP code
media: dvb-usb-v2: lmedm04: Improve logic checking of warm start
media: dvb-usb-v2: lmedm04: move ts2020 attach to dm04_lme2510_tuner
mtd: cfi: convert inline functions to macros
mtd: nand: brcmnand: Disable prefetch by default
mtd: nand: Fix nand_do_read_oob() return value
mtd: nand: sunxi: Fix ECC strength choice
ubi: block: Fix locking for idr_alloc/idr_remove
nfs/pnfs: fix nfs_direct_req ref leak when i/o falls back to the mds
NFS: Add a cond_resched() to nfs_commit_release_pages()
NFS: commit direct writes even if they fail partially
NFS: reject request for id_legacy key without auxdata
kernfs: fix regression in kernfs_fop_write caused by wrong type
ahci: Annotate PCI ids for mobile Intel chipsets as such
ahci: Add PCI ids for Intel Bay Trail, Cherry Trail and Apollo Lake AHCI
ahci: Add Intel Cannon Lake PCH-H PCI ID
crypto: hash - introduce crypto_hash_alg_has_setkey()
crypto: cryptd - pass through absence of ->setkey()
crypto: poly1305 - remove ->setkey() method
nsfs: mark dentry with DCACHE_RCUACCESS
media: v4l2-ioctl.c: don't copy back the result for -ENOTTY
vb2: V4L2_BUF_FLAG_DONE is set after DQBUF
media: v4l2-compat-ioctl32.c: add missing VIDIOC_PREPARE_BUF
media: v4l2-compat-ioctl32.c: fix the indentation
media: v4l2-compat-ioctl32.c: move 'helper' functions to __get/put_v4l2_format32
media: v4l2-compat-ioctl32.c: avoid sizeof(type)
media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32
media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer
media: v4l2-compat-ioctl32.c: make ctrl_is_pointer work for subdevs
media: v4l2-compat-ioctl32: Copy v4l2_window->global_alpha
media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32
media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type
media: v4l2-compat-ioctl32.c: don't copy back the result for certain errors
media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic
crypto: caam - fix endless loop when DECO acquire fails
arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
KVM: nVMX: Fix races when sending nested PI while dest enters/leaves L2
watchdog: imx2_wdt: restore previous timeout after suspend+resume
media: ts2020: avoid integer overflows on 32 bit machines
media: cxusb, dib0700: ignore XC2028_I2C_FLUSH
kernel/async.c: revert "async: simplify lowest_in_progress()"
HID: quirks: Fix keyboard + touchpad on Toshiba Click Mini not working
Bluetooth: btsdio: Do not bind to non-removable BCM43341
Revert "Bluetooth: btusb: fix QCA Rome suspend/resume"
Bluetooth: btusb: Restore QCA Rome suspend/resume fix with a "rewritten" version
signal/openrisc: Fix do_unaligned_access to send the proper signal
signal/sh: Ensure si_signo is initialized in do_divide_error
alpha: fix crash if pthread_create races with signal delivery
alpha: fix reboot on Avanti platform
xtensa: fix futex_atomic_cmpxchg_inatomic
EDAC, octeon: Fix an uninitialized variable warning
pktcdvd: Fix pkt_setup_dev() error path
btrfs: Handle btrfs_set_extent_delalloc failure in fixup worker
nvme: Fix managing degraded controllers
ACPI: sbshc: remove raw pointer from printk() message
ovl: fix failure to fsync lower dir
mn10300/misalignment: Use SIGSEGV SEGV_MAPERR to report a failed user copy
ftrace: Remove incorrect setting of glob search field
Linux 4.4.116
Change-Id: Id000cb8d59b74de063902e9ad24dd07fe1b1694b
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 21fc61c73c3903c4c312d0802da01ec2b323d174 upstream.
kmap() in page_follow_link_light() needed to go - allowing to hold
an arbitrary number of kmaps for long is a great way to deadlocking
the system.
new helper (inode_nohighmem(inode)) needs to be used for pagecache
symlinks inodes; done for all in-tree cases. page_follow_link_light()
instrumented to yell about anything missed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Jin Qian <jinqian@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This allows filesystems to use their mount private data to
influence the permssions they use in setattr2. It has
been separated into a new call to avoid disrupting current
setattr users.
Change-Id: I19959038309284448f1b7f232d579674ef546385
Signed-off-by: Daniel Rosenberg <drosen@google.com>
This allows filesystems to use their mount private data to
influence the permssions they use in setattr2. It has
been separated into a new call to avoid disrupting current
setattr users.
Change-Id: I19959038309284448f1b7f232d579674ef546385
Signed-off-by: Daniel Rosenberg <drosen@google.com>
inode struct members that track cgroup writeback information
should be reinitialized when inode gets allocated from
kmem_cache. Otherwise, their values remain and get used by the
new inode.
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Fixes: d10c809552 ("writeback: implement foreign cgroup inode bdi_writeback switching")
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 3d65ae4634ed8350aee98a4e6f4e41fe40c7d282)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
commit c1892c37769cf89c7e7ba57528ae2ccb5d153c9b upstream.
file_remove_privs() is called with inode lock on file_inode(), which
proceeds to calling notify_change() on file->f_path.dentry. Which triggers
the WARN_ON_ONCE(!inode_is_locked(inode)) in addition to deadlocking later
when ovl_setattr tries to lock the underlying inode again.
Fix this mess by not mixing the layers, but doing everything on underlying
dentry/inode.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 07a2daab49c5 ("ovl: Copy up underlying inode's ->i_mode to overlay inode")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix kernel-doc warning in fs/inode.c:
../fs/inode.c:1606: warning: No description found for parameter 'inode'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On a box with a lot of ram (148gb) I can make the box softlockup after running
an fs_mark job that creates hundreds of millions of empty files. This is
because we never generate enough memory pressure to keep the number of inodes on
our unused list low, so when we go to unmount we have to evict ~100 million
inodes. This makes one processor a very unhappy person, so add a cond_resched()
in dispose_list() and if we need a resched when processing the s_inodes list do
that and run dispose_list() on what we've currently culled. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
There's a small consistency problem between the inode and writeback
naming. Writeback calls the "for IO" inode queues b_io and
b_more_io, but the inode calls these the "writeback list" or
i_wb_list. This makes it hard to an new "under writeback" list to
the inode, or call it an "under IO" list on the bdi because either
way we'll have writeback on IO and IO on writeback and it'll just be
confusing. I'm getting confused just writing this!
So, rename the inode "for IO" list variable to i_io_list so we can
add a new "writeback list" in a subsequent patch.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
The process of reducing contention on per-superblock inode lists
starts with moving the locking to match the per-superblock inode
list. This takes the global lock out of the picture and reduces the
contention problems to within a single filesystem. This doesn't get
rid of contention as the locks still have global CPU scope, but it
does isolate operations on different superblocks form each other.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
Pull more vfs updates from Al Viro:
"Assorted VFS fixes and related cleanups (IMO the most interesting in
that part are f_path-related things and Eric's descriptor-related
stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
fs-cache series, DAX patches, Jan's file_remove_suid() work"
[ I'd say this is much more than "fixes and related cleanups". The
file_table locking rule change by Eric Dumazet is a rather big and
fundamental update even if the patch isn't huge. - Linus ]
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
9p: cope with bogus responses from server in p9_client_{read,write}
p9_client_write(): avoid double p9_free_req()
9p: forgetting to cancel request on interrupted zero-copy RPC
dax: bdev_direct_access() may sleep
block: Add support for DAX reads/writes to block devices
dax: Use copy_from_iter_nocache
dax: Add block size note to documentation
fs/file.c: __fget() and dup2() atomicity rules
fs/file.c: don't acquire files->file_lock in fd_install()
fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
vfs: avoid creation of inode number 0 in get_next_ino
namei: make set_root_rcu() return void
make simple_positive() public
ufs: use dir_pages instead of ufs_dir_pages()
pagemap.h: move dir_pages() over there
remove the pointless include of lglock.h
fs: cleanup slight list_entry abuse
xfs: Correctly lock inode when removing suid and file capabilities
fs: Call security_ops->inode_killpriv on truncate
fs: Provide function telling whether file_remove_privs() will do anything
...
currently, get_next_ino() is able to create inodes with inode number = 0.
This have a bad impact in the filesystems relying in this function to generate
inode numbers.
While there is no problem at all in having inodes with number 0, userspace tools
which handle file management tasks can have problems handling these files, like
for example, the impossiblity of users to delete these files, since glibc will
ignore them. So, I believe the best way is kernel to avoid creating them.
This problem has been raised previously, but the old thread didn't have any
other update for a year+, and I've seen too many users hitting the same issue
regarding the impossibility to delete files while using filesystems relying on
this function. So, I'm starting the thread again, with the same patch
that I believe is enough to address this problem.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull cgroup writeback support from Jens Axboe:
"This is the big pull request for adding cgroup writeback support.
This code has been in development for a long time, and it has been
simmering in for-next for a good chunk of this cycle too. This is one
of those problems that has been talked about for at least half a
decade, finally there's a solution and code to go with it.
Also see last weeks writeup on LWN:
http://lwn.net/Articles/648292/"
* 'for-4.2/writeback' of git://git.kernel.dk/linux-block: (85 commits)
writeback, blkio: add documentation for cgroup writeback support
vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB
writeback: do foreign inode detection iff cgroup writeback is enabled
v9fs: fix error handling in v9fs_session_init()
bdi: fix wrong error return value in cgwb_create()
buffer: remove unusued 'ret' variable
writeback: disassociate inodes from dying bdi_writebacks
writeback: implement foreign cgroup inode bdi_writeback switching
writeback: add lockdep annotation to inode_to_wb()
writeback: use unlocked_inode_to_wb transaction in inode_congested()
writeback: implement unlocked_inode_to_wb transaction and use it for stat updates
writeback: implement [locked_]inode_to_wb_and_lock_list()
writeback: implement foreign cgroup inode detection
writeback: make writeback_control track the inode being written back
writeback: relocate wb[_try]_get(), wb_put(), inode_{attach|detach}_wb()
mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use
writeback: implement memcg writeback domain based throttling
writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes
writeback: implement memcg wb_domain
writeback: update wb_over_bg_thresh() to use wb_domain aware operations
...
Comment in include/linux/security.h says that ->inode_killpriv() should
be called when setuid bit is being removed and that similar security
labels (in fact this applies only to file capabilities) should be
removed at this time as well. However we don't call ->inode_killpriv()
when we remove suid bit on truncate.
We fix the problem by calling ->inode_need_killpriv() and subsequently
->inode_killpriv() on truncate the same way as we do it on file write.
After this patch there's only one user of should_remove_suid() - ocfs2 -
and indeed it's buggy because it doesn't call ->inode_killpriv() on
write. However fixing it is difficult because of special locking
constraints.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Provide function telling whether file_remove_privs() will do anything.
Currently we only have should_remove_suid() and that does something
slightly different.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
file_remove_suid() is a misnomer since it removes also file capabilities
stored in xattrs and sets S_NOSEC flag. Also should_remove_suid() tells
something else than whether file_remove_suid() call is necessary which
leads to bugs.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.
Fix the bug by checking actual mode bits before setting S_NOSEC.
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
For the planned cgroup writeback support, on each bdi
(backing_dev_info), each memcg will be served by a separate wb
(bdi_writeback). This patch updates bdi so that a bdi can host
multiple wbs (bdi_writebacks).
On the default hierarchy, blkcg implicitly enables memcg. This allows
using memcg's page ownership for attributing writeback IOs, and every
memcg - blkcg combination can be served by its own wb by assigning a
dedicated wb to each memcg. This means that there may be multiple
wb's of a bdi mapped to the same blkcg. As congested state is per
blkcg - bdi combination, those wb's should share the same congested
state. This is achieved by tracking congested state via
bdi_writeback_congested structs which are keyed by blkcg.
bdi->wb remains unchanged and will keep serving the root cgroup.
cgwb's (cgroup wb's) for non-root cgroups are created on-demand or
looked up while dirtying an inode according to the memcg of the page
being dirtied or current task. Each cgwb is indexed on bdi->cgwb_tree
by its memcg id. Once an inode is associated with its wb, it can be
retrieved using inode_to_wb().
Currently, none of the filesystems has FS_CGROUP_WRITEBACK and all
pages will keep being associated with bdi->wb.
v3: inode_attach_wb() in account_page_dirtied() moved inside
mapping_cap_account_dirty() block where it's known to be !NULL.
Also, an unnecessary NULL check before kfree() removed. Both
detected by the kbuild bot.
v2: Updated so that wb association is per inode and wb is per memcg
rather than blkcg.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: kbuild test robot <fengguang.wu@intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
touch_atime is not RCU-safe, and so cannot be called on an RCU walk.
However, in situations where RCU-walk makes a difference, the symlink
will likely to accessed much more often than it is useful to update
the atime.
So split out the test of "Does the atime actually need to be updated"
into atime_needs_update(), and have get_link() unlazy if it finds that
it will need to do that update.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
let "fast" symlinks store the pointer to the body into ->i_link and
use simple_follow_link for ->follow_link()
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
do_blockdev_direct_IO() increments and decrements the inode
->i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.
For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:
clat percentiles (usec):
| 1.00th=[ 33], 5.00th=[ 34], 10.00th=[ 34], 20.00th=[ 34],
| 30.00th=[ 34], 40.00th=[ 34], 50.00th=[ 35], 60.00th=[ 35],
| 70.00th=[ 35], 80.00th=[ 35], 90.00th=[ 37], 95.00th=[ 80],
| 99.00th=[ 98], 99.50th=[ 151], 99.90th=[ 155], 99.95th=[ 155],
| 99.99th=[ 165]
After:
clat percentiles (usec):
| 1.00th=[ 95], 5.00th=[ 108], 10.00th=[ 129], 20.00th=[ 149],
| 30.00th=[ 155], 40.00th=[ 161], 50.00th=[ 167], 60.00th=[ 171],
| 70.00th=[ 177], 80.00th=[ 185], 90.00th=[ 201], 95.00th=[ 270],
| 99.00th=[ 390], 99.50th=[ 398], 99.90th=[ 418], 99.95th=[ 422],
| 99.99th=[ 438]
In other setups, Robert Elliott reported seeing good performance
improvements:
https://lkml.org/lkml/2015/4/3/557
The more applications accessing the device, the worse it gets.
Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode i_dio_count for this caller.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Elliott, Robert (Server Storage) <elliott@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
these should be used on objects already in top layer
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull lazytime mount option support from Al Viro:
"Lazytime stuff from tytso"
* 'lazytime' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ext4: add optimization for the lazytime mount option
vfs: add find_inode_nowait() function
vfs: add support for a lazytime mount option
Merge third set of updates from Andrew Morton:
- the rest of MM
[ This includes getting rid of the numa hinting bits, in favor of
just generic protnone logic. Yay. - Linus ]
- core kernel
- procfs
- some of lib/ (lots of lib/ material this time)
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (104 commits)
lib/lcm.c: replace include
lib/percpu_ida.c: remove redundant includes
lib/strncpy_from_user.c: replace module.h include
lib/stmp_device.c: replace module.h include
lib/sort.c: move include inside #if 0
lib/show_mem.c: remove redundant include
lib/radix-tree.c: change to simpler include
lib/plist.c: remove redundant include
lib/nlattr.c: remove redundant include
lib/kobject_uevent.c: remove redundant include
lib/llist.c: remove redundant include
lib/md5.c: simplify include
lib/list_sort.c: rearrange includes
lib/genalloc.c: remove redundant include
lib/idr.c: remove redundant include
lib/halfmd4.c: simplify includes
lib/dynamic_queue_limits.c: simplify includes
lib/sort.c: use simpler includes
lib/interval_tree.c: simplify includes
hexdump: make it return number of bytes placed in buffer
...
Currently, the isolate callback passed to the list_lru_walk family of
functions is supposed to just delete an item from the list upon returning
LRU_REMOVED or LRU_REMOVED_RETRY, while nr_items counter is fixed by
__list_lru_walk_one after the callback returns. Since the callback is
allowed to drop the lock after removing an item (it has to return
LRU_REMOVED_RETRY then), the nr_items can be less than the actual number
of elements on the list even if we check them under the lock. This makes
it difficult to move items from one list_lru_one to another, which is
required for per-memcg list_lru reparenting - we can't just splice the
lists, we have to move entries one by one.
This patch therefore introduces helpers that must be used by callback
functions to isolate items instead of raw list_del/list_move. These are
list_lru_isolate and list_lru_isolate_move. They not only remove the
entry from the list, but also fix the nr_items counter, making sure
nr_items always reflects the actual number of elements on the list if
checked under the appropriate lock.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kmem accounting of memcg is unusable now, because it lacks slab shrinker
support. That means when we hit the limit we will get ENOMEM w/o any
chance to recover. What we should do then is to call shrink_slab, which
would reclaim old inode/dentry caches from this cgroup. This is what
this patch set is intended to do.
Basically, it does two things. First, it introduces the notion of
per-memcg slab shrinker. A shrinker that wants to reclaim objects per
cgroup should mark itself as SHRINKER_MEMCG_AWARE. Then it will be
passed the memory cgroup to scan from in shrink_control->memcg. For
such shrinkers shrink_slab iterates over the whole cgroup subtree under
the target cgroup and calls the shrinker for each kmem-active memory
cgroup.
Secondly, this patch set makes the list_lru structure per-memcg. It's
done transparently to list_lru users - everything they have to do is to
tell list_lru_init that they want memcg-aware list_lru. Then the
list_lru will automatically distribute objects among per-memcg lists
basing on which cgroup the object is accounted to. This way to make FS
shrinkers (icache, dcache) memcg-aware we only need to make them use
memcg-aware list_lru, and this is what this patch set does.
As before, this patch set only enables per-memcg kmem reclaim when the
pressure goes from memory.limit, not from memory.kmem.limit. Handling
memory.kmem.limit is going to be tricky due to GFP_NOFS allocations, and
it is still unclear whether we will have this knob in the unified
hierarchy.
This patch (of 9):
NUMA aware slab shrinkers use the list_lru structure to distribute
objects coming from different NUMA nodes to different lists. Whenever
such a shrinker needs to count or scan objects from a particular node,
it issues commands like this:
count = list_lru_count_node(lru, sc->nid);
freed = list_lru_walk_node(lru, sc->nid, isolate_func,
isolate_arg, &sc->nr_to_scan);
where sc is an instance of the shrink_control structure passed to it
from vmscan.
To simplify this, let's add special list_lru functions to be used by
shrinkers, list_lru_shrink_count() and list_lru_shrink_walk(), which
consolidate the nid and nr_to_scan arguments in the shrink_control
structure.
This will also allow us to avoid patching shrinkers that use list_lru
when we make shrink_slab() per-memcg - all we will have to do is extend
the shrink_control structure to include the target memcg and make
list_lru_shrink_{count,walk} handle this appropriately.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull backing device changes from Jens Axboe:
"This contains a cleanup of how the backing device is handled, in
preparation for a rework of the life time rules. In this part, the
most important change is to split the unrelated nommu mmap flags from
it, but also removing a backing_dev_info pointer from the
address_space (and inode), and a cleanup of other various minor bits.
Christoph did all the work here, I just fixed an oops with pages that
have a swap backing. Arnd fixed a missing export, and Oleg killed the
lustre backing_dev_info from staging. Last patch was from Al,
unexporting parts that are now no longer needed outside"
* 'for-3.20/bdi' of git://git.kernel.dk/linux-block:
Make super_blocks and sb_lock static
mtd: export new mtd_mmap_capabilities
fs: make inode_to_bdi() handle NULL inode
staging/lustre/llite: get rid of backing_dev_info
fs: remove default_backing_dev_info
fs: don't reassign dirty inodes to default_backing_dev_info
nfs: don't call bdi_unregister
ceph: remove call to bdi_unregister
fs: remove mapping->backing_dev_info
fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info
nilfs2: set up s_bdi like the generic mount_bdev code
block_dev: get bdev inode bdi directly from the block device
block_dev: only write bdev inode on close
fs: introduce f_op->mmap_capabilities for nommu mmap support
fs: kill BDI_CAP_SWAP_BACKED
fs: deduplicate noop_backing_dev_info
Merge misc updates from Andrew Morton:
"Bite-sized chunks this time, to avoid the MTA ratelimiting woes.
- fs/notify updates
- ocfs2
- some of MM"
That laconic "some MM" is mainly the removal of remap_file_pages(),
which is a big simplification of the VM, and which gets rid of a *lot*
of random cruft and special cases because we no longer support the
non-linear mappings that it used.
From a user interface perspective, nothing has changed, because the
remap_file_pages() syscall still exists, it's just done by emulating the
old behavior by creating a lot of individual small mappings instead of
one non-linear one.
The emulation is slower than the old "native" non-linear mappings, but
nobody really uses or cares about remap_file_pages(), and simplifying
the VM is a big advantage.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (78 commits)
memcg: zap memcg_slab_caches and memcg_slab_mutex
memcg: zap memcg_name argument of memcg_create_kmem_cache
memcg: zap __memcg_{charge,uncharge}_slab
mm/page_alloc.c: place zone_id check before VM_BUG_ON_PAGE check
mm: hugetlb: fix type of hugetlb_treat_as_movable variable
mm, hugetlb: remove unnecessary lower bound on sysctl handlers"?
mm: memory: merge shared-writable dirtying branches in do_wp_page()
mm: memory: remove ->vm_file check on shared writable vmas
xtensa: drop _PAGE_FILE and pte_file()-related helpers
x86: drop _PAGE_FILE and pte_file()-related helpers
unicore32: drop pte_file()-related helpers
um: drop _PAGE_FILE and pte_file()-related helpers
tile: drop pte_file()-related helpers
sparc: drop pte_file()-related helpers
sh: drop _PAGE_FILE and pte_file()-related helpers
score: drop _PAGE_FILE and pte_file()-related helpers
s390: drop pte_file()-related helpers
parisc: drop _PAGE_FILE and pte_file()-related helpers
openrisc: drop _PAGE_FILE and pte_file()-related helpers
nios2: drop _PAGE_FILE and pte_file()-related helpers
...
We don't create non-linear mappings anymore. Let's drop code which
handles them in rmap.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new function find_inode_nowait() which is an even more general
version of ilookup5_nowait(). It is designed for callers which need
very fine grained control over when the function is allowed to block
or increment the inode's reference count.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add a new mount option which enables a new "lazytime" mode. This mode
causes atime, mtime, and ctime updates to only be made to the
in-memory version of the inode. The on-disk times will only get
updated when (a) if the inode needs to be updated for some non-time
related change, (b) if userspace calls fsync(), syncfs() or sync(), or
(c) just before an undeleted inode is evicted from memory.
This is OK according to POSIX because there are no guarantees after a
crash unless userspace explicitly requests via a fsync(2) call.
For workloads which feature a large number of random write to a
preallocated file, the lazytime mount option significantly reduces
writes to the inode table. The repeated 4k writes to a single block
will result in undesirable stress on flash devices and SMR disk
drives. Even on conventional HDD's, the repeated writes to the inode
table block will trigger Adjacent Track Interference (ATI) remediation
latencies, which very negatively impact long tail latencies --- which
is a very big deal for web serving tiers (for example).
Google-Bug-Id: 18297052
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that we never use the backing_dev_info pointer in struct address_space
we can simply remove it and save 4 to 8 bytes in every inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
The current scheme of using the i_flock list is really difficult to
manage. There is also a legitimate desire for a per-inode spinlock to
manage these lists that isn't the i_lock.
Start conversion to a new scheme to eventually replace the old i_flock
list with a new "file_lock_context" object.
We start by adding a new i_flctx to struct inode. For now, it lives in
parallel with i_flock list, but will eventually replace it. The idea is
to allocate a structure to sit in that pointer and act as a locus for
all things file locking.
We allocate a file_lock_context for an inode when the first lock is
added to it, and it's only freed when the inode is freed. We use the
i_lock to protect the assignment, but afterward it should mostly be
accessed locklessly.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Pull vfs pile #2 from Al Viro:
"Next pile (and there'll be one or two more).
The large piece in this one is getting rid of /proc/*/ns/* weirdness;
among other things, it allows to (finally) make nameidata completely
opaque outside of fs/namei.c, making for easier further cleanups in
there"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coda_venus_readdir(): use file_inode()
fs/namei.c: fold link_path_walk() call into path_init()
path_init(): don't bother with LOOKUP_PARENT in argument
fs/namei.c: new helper (path_cleanup())
path_init(): store the "base" pointer to file in nameidata itself
make default ->i_fop have ->open() fail with ENXIO
make nameidata completely opaque outside of fs/namei.c
kill proc_ns completely
take the targets of /proc/*/ns/* symlinks to separate fs
bury struct proc_ns in fs/proc
copy address of proc_ns_ops into ns_common
new helpers: ns_alloc_inum/ns_free_inum
make proc_ns_operations work with struct ns_common * instead of void *
switch the rest of proc_ns_operations to working with &...->ns
netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
common object embedded into various struct ....ns
The i_mmap_mutex is a close cousin of the anon vma lock, both protecting
similar data, one for file backed pages and the other for anon memory. To
this end, this lock can also be a rwsem. In addition, there are some
important opportunities to share the lock when there are no tree
modifications.
This conversion is straightforward. For now, all users take the write
lock.
[sfr@canb.auug.org.au: update fremap.c]
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As it is, default ->i_fop has NULL ->open() (along with all other methods).
The only case where it matters is reopening (via procfs symlink) a file that
didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned
to something sane (default would fail on read/write/ioctl/etc.).
Unfortunately, such case exists - alloc_file() users, especially
anon_get_file() ones. There we have tons of opened files of very different
kinds sharing the same inode. As the result, attempt to reopen those via
procfs succeeds and you get a descriptor you can't do anything with.
Moreover, in case of sockets we set ->i_fop that will only be used
on such reopen attempts - and put a failing ->open() into it to make sure
those do not succeed.
It would be simpler to put such ->open() into default ->i_fop and leave
it unchanged both for anon inode (as we do anyway) and for socket ones. Result:
* everything going through do_dentry_open() works as it used to
* sock_no_open() kludge is gone
* attempts to reopen anon-inode files fail as they really ought to
* ditto for aio_private_file()
* ditto for perfmon - this one actually tried to imitate sock_no_open()
trick, but failed to set ->i_fop, so in the current tree reopens succeed and
yield completely useless descriptor. Intent clearly had been to fail with
-ENXIO on such reopens; now it actually does.
* everything else that used alloc_file() keeps working - it has ->i_fop
set for its inodes anyway
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
All filesystems using VFS quotas are now converted to use their private
i_dquot fields. Remove the i_dquot field from generic inode structure.
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
This patch (of 6):
The i_mmap_writable field counts existing writable mappings of an
address_space. To allow drivers to prevent new writable mappings, make
this counter signed and prevent new writable mappings if it is negative.
This is modelled after i_writecount and DENYWRITE.
This will be required by the shmem-sealing infrastructure to prevent any
new writable mappings after the WRITE seal has been set. In case there
exists a writable mapping, this operation will fail with EBUSY.
Note that we rely on the fact that iff you already own a writable mapping,
you can increase the counter without using the helpers. This is the same
that we do for i_writecount.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current "wait_on_bit" interface requires an 'action'
function to be provided which does the actual waiting.
There are over 20 such functions, many of them identical.
Most cases can be satisfied by one of just two functions, one
which uses io_schedule() and one which just uses schedule().
So:
Rename wait_on_bit and wait_on_bit_lock to
wait_on_bit_action and wait_on_bit_lock_action
to make it explicit that they need an action function.
Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
which are *not* given an action function but implicitly use
a standard one.
The decision to error-out if a signal is pending is now made
based on the 'mode' argument rather than being encoded in the action
function.
All instances of the old wait_on_bit and wait_on_bit_lock which
can use the new version have been changed accordingly and their
action functions have been discarded.
wait_on_bit{_lock} does not return any specific error code in the
event of a signal so the caller must check for non-zero and
interpolate their own error code as appropriate.
The wait_on_bit() call in __fscache_wait_on_invalidate() was
ambiguous as it specified TASK_UNINTERRUPTIBLE but used
fscache_wait_bit_interruptible as an action function.
David Howells confirms this should be uniformly
"uninterruptible"
The main remaining user of wait_on_bit{,_lock}_action is NFS
which needs to use a freezer-aware schedule() call.
A comment in fs/gfs2/glock.c notes that having multiple 'action'
functions is useful as they display differently in the 'wchan'
field of 'ps'. (and /proc/$PID/wchan).
As the new bit_wait{,_io} functions are tagged "__sched", they
will not show up at all, but something higher in the stack. So
the distinction will still be visible, only with different
function names (gds2_glock_wait versus gfs2_glock_dq_wait in the
gfs2/glock.c case).
Since first version of this patch (against 3.15) two new action
functions appeared, on in NFS and one in CIFS. CIFS also now
uses an action function that makes the same freezer aware
schedule call as NFS.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The kernel has no concept of capabilities with respect to inodes; inodes
exist independently of namespaces. For example, inode_capable(inode,
CAP_LINUX_IMMUTABLE) would be nonsense.
This patch changes inode_capable to check for uid and gid mappings and
renames it to capable_wrt_inode_uidgid, which should make it more
obvious what it does.
Fixes CVE-2014-4014.
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This typedef is unnecessary and should just be removed.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and COLLAPSE_RANGE fallocate operations, and scalability improvements
in the jbd2 layer and in xattr handling when the extended attributes
spill over into an external block.
Other than that, the usual clean ups and minor bug fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTPbD2AAoJENNvdpvBGATwDmUQANSfGYIQazB8XKKgtNTMiG/Y
Ky7n1JzN9lTX/6nMsqQnbfCweLRmxqpWUBuyKDRHUi8IG0/voXSTFsAOOgz0R15A
ERRRWkVvHixLpohuL/iBdEMFHwNZYPGr3jkm0EIgzhtXNgk5DNmiuMwvHmCY27kI
kdNZIw9fip/WRNoFLDBGnLGC37aanoHhCIbVlySy5o9LN1pkC8BgXAYV0Rk19SVd
bWCudSJEirFEqWS5H8vsBAEm/ioxTjwnNL8tX8qms6orZ6h8yMLFkHoIGWPw3Q15
a0TSUoMyav50Yr59QaDeWx9uaPQVeK41wiYFI2rZOnyG2ts0u0YXs/nLwJqTovgs
rzvbdl6cd3Nj++rPi97MTA7iXK96WQPjsDJoeeEgnB0d/qPyTk6mLKgftzLTNgSa
ZmWjrB19kr6CMbebMC4L6eqJ8Fr66pCT8c/iue8wc4MUHi7FwHKH64fqWvzp2YT/
+165dqqo2JnUv7tIp6sUi1geun+bmDHLZFXgFa7fNYFtcU3I+uY1mRr3eMVAJndA
2d6ASe/KhQbpVnjKJdQ8/b833ZS3p+zkgVPrd68bBr3t7gUmX91wk+p1ct6rUPLr
700F+q/pQWL8ap0pU9Ht/h3gEJIfmRzTwxlOeYyOwDseqKuS87PSB3BzV3dDunSU
DrPKlXwIgva7zq5/S0Vr
=4s1Z
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Major changes for 3.14 include support for the newly added ZERO_RANGE
and COLLAPSE_RANGE fallocate operations, and scalability improvements
in the jbd2 layer and in xattr handling when the extended attributes
spill over into an external block.
Other than that, the usual clean ups and minor bug fixes"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
ext4: fix premature freeing of partial clusters split across leaf blocks
ext4: remove unneeded test of ret variable
ext4: fix comment typo
ext4: make ext4_block_zero_page_range static
ext4: atomically set inode->i_flags in ext4_set_inode_flags()
ext4: optimize Hurd tests when reading/writing inodes
ext4: kill i_version support for Hurd-castrated file systems
ext4: each filesystem creates and uses its own mb_cache
fs/mbcache.c: doucple the locking of local from global data
fs/mbcache.c: change block and index hash chain to hlist_bl_node
ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
ext4: refactor ext4_fallocate code
ext4: Update inode i_size after the preallocation
ext4: fix partial cluster handling for bigalloc file systems
ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
ext4: only call sync_filesystm() when remounting read-only
fs: push sync_filesystem() down to the file system's remount_fs()
jbd2: improve error messages for inconsistent journal heads
jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
jbd2: minimize region locked by j_list_lock in journal_get_create_access()
...
Pull renameat2 system call from Miklos Szeredi:
"This adds a new syscall, renameat2(), which is the same as renameat()
but with a flags argument.
The purpose of extending rename is to add cross-rename, a symmetric
variant of rename, which exchanges the two files. This allows
interesting things, which were not possible before, for example
atomically replacing a directory tree with a symlink, etc... This
also allows overlayfs and friends to operate on whiteouts atomically.
Andy Lutomirski also suggested a "noreplace" flag, which disables the
overwriting behavior of rename.
These two flags, RENAME_EXCHANGE and RENAME_NOREPLACE are only
implemented for ext4 as an example and for testing"
* 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ext4: add cross rename support
ext4: rename: split out helper functions
ext4: rename: move EMLINK check up
ext4: rename: create ext4_renament structure for local vars
vfs: add cross-rename
vfs: lock_two_nondirectories: allow directory args
security: add flags to rename hooks
vfs: add RENAME_NOREPLACE flag
vfs: add renameat2 syscall
vfs: rename: use common code for dir and non-dir
vfs: rename: move d_move() up
vfs: add d_is_dir()
Reclaim will be leaving shadow entries in the page cache radix tree upon
evicting the real page. As those pages are found from the LRU, an
iput() can lead to the inode being freed concurrently. At this point,
reclaim must no longer install shadow pages because the inode freeing
code needs to ensure the page tree is really empty.
Add an address_space flag, AS_EXITING, that the inode freeing code sets
under the tree lock before doing the final truncate. Reclaim will check
for this flag before installing shadow pages.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lock_two_nondirectories warned if either of its args was a directory.
Instead just ignore the directory args. This is needed for locking in
cross rename.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>