From adaba23ccd7d1625942f2c27612d2b416c87e011 Mon Sep 17 00:00:00 2001
From: Andi Kleen
Date: Sat, 25 Aug 2018 06:50:15 -0700
Subject: [PATCH 1/6] x86/mm/pat: Fix L1TF stable backport for CPA

Patch for stable only to fix boot resets caused by the L1TF patches.

Stable trees reverted the following patch

  Revert "x86/mm/pat: Ensure cpa->pfn only contains page frame numbers"

  This reverts commit 87e2bd898d3a79a8c609f183180adac47879a2a4 which is
  commit edc3b9129cecd0f0857112136f5b8b1bc1d45918 upstream.

but the L1TF patch backported here

  x86/mm/pat: Make set_memory_np() L1TF safe

  commit 958f79b9ee55dfaf00c8106ed1c22a2919e0028b upstream

  set_memory_np() is used to mark kernel mappings not present, but it has
  its own open coded mechanism which does not have the L1TF protection of
  inverting the address bits.

assumed that cpa->pfn contains a PFN. With the above patch reverted
it does not, which causes the PMD to be set to an incorrect address
shifted by 12 bits, which can cause early boot reset on some
systems, like an Apollo Lake embedded system.

Convert the address to a PFN before passing it to pfn_pmd()

Thanks to Bernhard for bisecting and testing.
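The 12-bit shift described above can be illustrated with a minimal
userspace sketch. It is not kernel code: SKETCH_PAGE_SHIFT and the
sketch_* helpers are hypothetical names, and the sketch assumes 4 KiB
pages and ignores protection bits entirely.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, mirroring PAGE_SHIFT == 12 on x86. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for pfn_pmd(): turns a page frame number into
 * the physical-address bits of an entry.
 */
static uint64_t sketch_pfn_to_entry(uint64_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}

/* Buggy path: the value is a physical address but is treated as a PFN. */
static uint64_t sketch_entry_buggy(uint64_t phys_addr)
{
	return sketch_pfn_to_entry(phys_addr);
}

/* Fixed path: convert the address to a PFN first, as the patch does. */
static uint64_t sketch_entry_fixed(uint64_t phys_addr)
{
	return sketch_pfn_to_entry(phys_addr >> SKETCH_PAGE_SHIFT);
}
```

With a 2 MiB-aligned address such as 0x200000, the fixed path yields
0x200000 again, while the buggy path yields an address 4096 times too
large.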
Cc: stable@vger.kernel.org # 4.4 and 4.9
Reported-by: Bernhard Kaindl
Tested-by: Bernhard Kaindl
Signed-off-by: Andi Kleen
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/mm/pageattr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 27610c2d1821..1007fa80f5a6 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1006,7 +1006,7 @@ static int populate_pmd(struct cpa_data *cpa,
 
 		pmd = pmd_offset(pud, start);
 
-		set_pmd(pmd, pmd_mkhuge(pfn_pmd(cpa->pfn,
+		set_pmd(pmd, pmd_mkhuge(pfn_pmd(cpa->pfn >> PAGE_SHIFT,
 					canon_pgprot(pmd_pgprot))));
 
 		start += PMD_SIZE;

From f9866720724db8a163cf305fc907cdab0b38fa09 Mon Sep 17 00:00:00 2001
From: Eric Biggers
Date: Thu, 24 Aug 2017 10:50:29 -0700
Subject: [PATCH 2/6] x86/mm: Fix use-after-free of ldt_struct

commit ccd5b3235180eef3cfec337df1c8554ab151b5cc upstream.

The following commit:

  39a0526fb3f7 ("x86/mm: Factor out LDT init from context init")

renamed init_new_context() to init_new_context_ldt() and added a new
init_new_context() which calls init_new_context_ldt(). However, the
error code of init_new_context_ldt() was ignored. Consequently, if a
memory allocation in alloc_ldt_struct() failed during a fork(), the
->context.ldt of the new task remained the same as that of the old task
(due to the memcpy() in dup_mm()). ldt_struct's are not intended to be
shared, so a use-after-free occurred after one task exited.

Fix the bug by making init_new_context() pass through the error code of
init_new_context_ldt().
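The effect of the dropped error code can be sketched in userspace. This
is only an illustration under stated assumptions: the sketch_* types and
functions are hypothetical stand-ins for mm_struct, ldt_struct,
init_new_context_ldt() and init_new_context(), and the simulate_oom
flag stands in for a failing allocation in alloc_ldt_struct().

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_ldt { int nr_entries; };
struct sketch_mm { struct sketch_ldt *ldt; };

/*
 * Hypothetical stand-in for init_new_context_ldt(): gives the child its
 * own copy of the parent's LDT, or fails and leaves child->ldt untouched.
 */
static int sketch_init_ldt(struct sketch_mm *child,
			   const struct sketch_mm *parent, int simulate_oom)
{
	struct sketch_ldt *copy;

	if (!parent->ldt) {
		child->ldt = NULL;
		return 0;
	}
	if (simulate_oom)
		return -ENOMEM;

	copy = malloc(sizeof(*copy));
	if (!copy)
		return -ENOMEM;
	*copy = *parent->ldt;
	child->ldt = copy;
	return 0;
}

/* Buggy variant: the error is dropped, so the caller sees success. */
static int sketch_init_context_buggy(struct sketch_mm *child,
				     const struct sketch_mm *parent,
				     int simulate_oom)
{
	sketch_init_ldt(child, parent, simulate_oom);
	return 0;
}

/* Fixed variant: the error is passed through to the caller. */
static int sketch_init_context_fixed(struct sketch_mm *child,
				     const struct sketch_mm *parent,
				     int simulate_oom)
{
	return sketch_init_ldt(child, parent, simulate_oom);
}
```

If the child structure starts out as a byte copy of the parent, the
buggy variant reports success while both structures still point at the
same LDT, which is the sharing that later leads to the use-after-free.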
This bug was found by syzkaller, which encountered the following splat:

    BUG: KASAN: use-after-free in free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116
    Read of size 4 at addr ffff88006d2cb7c8 by task kworker/u9:0/3710

    CPU: 1 PID: 3710 Comm: kworker/u9:0 Not tainted 4.13.0-rc4-next-20170811 #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     print_address_description+0x73/0x250 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x24e/0x340 mm/kasan/report.c:409
     __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
     free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116
     free_ldt_struct arch/x86/kernel/ldt.c:173 [inline]
     destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171
     destroy_context arch/x86/include/asm/mmu_context.h:157 [inline]
     __mmdrop+0xe9/0x530 kernel/fork.c:889
     mmdrop include/linux/sched/mm.h:42 [inline]
     exec_mmap fs/exec.c:1061 [inline]
     flush_old_exec+0x173c/0x1ff0 fs/exec.c:1291
     load_elf_binary+0x81f/0x4ba0 fs/binfmt_elf.c:855
     search_binary_handler+0x142/0x6b0 fs/exec.c:1652
     exec_binprm fs/exec.c:1694 [inline]
     do_execveat_common.isra.33+0x1746/0x22e0 fs/exec.c:1816
     do_execve+0x31/0x40 fs/exec.c:1860
     call_usermodehelper_exec_async+0x457/0x8f0 kernel/umh.c:100
     ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431

    Allocated by task 3700:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
     kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3627
     kmalloc include/linux/slab.h:493 [inline]
     alloc_ldt_struct+0x52/0x140 arch/x86/kernel/ldt.c:67
     write_ldt+0x7b7/0xab0 arch/x86/kernel/ldt.c:277
     sys_modify_ldt+0x1ef/0x240 arch/x86/kernel/ldt.c:307
     entry_SYSCALL_64_fastpath+0x1f/0xbe

    Freed by task 3700:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
     __cache_free mm/slab.c:3503 [inline]
     kfree+0xca/0x250 mm/slab.c:3820
     free_ldt_struct.part.2+0xdd/0x150 arch/x86/kernel/ldt.c:121
     free_ldt_struct arch/x86/kernel/ldt.c:173 [inline]
     destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171
     destroy_context arch/x86/include/asm/mmu_context.h:157 [inline]
     __mmdrop+0xe9/0x530 kernel/fork.c:889
     mmdrop include/linux/sched/mm.h:42 [inline]
     __mmput kernel/fork.c:916 [inline]
     mmput+0x541/0x6e0 kernel/fork.c:927
     copy_process.part.36+0x22e1/0x4af0 kernel/fork.c:1931
     copy_process kernel/fork.c:1546 [inline]
     _do_fork+0x1ef/0xfb0 kernel/fork.c:2025
     SYSC_clone kernel/fork.c:2135 [inline]
     SyS_clone+0x37/0x50 kernel/fork.c:2129
     do_syscall_64+0x26c/0x8c0 arch/x86/entry/common.c:287
     return_from_SYSCALL_64+0x0/0x7a

Here is a C reproducer:

    #include <asm/ldt.h>
    #include <pthread.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <sys/syscall.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static void *fork_thread(void *_arg)
    {
        fork();
    }

    int main(void)
    {
        struct user_desc desc = { .entry_number = 8191 };

        syscall(__NR_modify_ldt, 1, &desc, sizeof(desc));

        for (;;) {
            if (fork() == 0) {
                pthread_t t;

                srand(getpid());
                pthread_create(&t, NULL, fork_thread, NULL);
                usleep(rand() % 10000);
                syscall(__NR_exit_group, 0);
            }
            wait(NULL);
        }
    }

Note: the reproducer takes advantage of the fact that alloc_ldt_struct()
may use vmalloc() to allocate a large ->entries array, and after
commit:

  5d17a73a2ebe ("vmalloc: back off when the current task is killed")

it is possible for userspace to fail a task's vmalloc() by
sending a fatal signal, e.g. via exit_group(). It would be more
difficult to reproduce this bug on kernels without that commit.

This bug only affected kernels with CONFIG_MODIFY_LDT_SYSCALL=y.
Signed-off-by: Eric Biggers
Acked-by: Dave Hansen
Cc: [v4.6+]
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Christoph Hellwig
Cc: Denys Vlasenko
Cc: Dmitry Vyukov
Cc: Linus Torvalds
Cc: Michal Hocko
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Tetsuo Handa
Cc: Thomas Gleixner
Cc: linux-mm@kvack.org
Fixes: 39a0526fb3f7 ("x86/mm: Factor out LDT init from context init")
Link: http://lkml.kernel.org/r/20170824175029.76040-1-ebiggers3@gmail.com
Signed-off-by: Ingo Molnar
Cc: Ben Hutchings
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/include/asm/mmu_context.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index effc12767cbf..d8d19fe99e45 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -109,8 +109,7 @@ static inline int init_new_context(struct task_struct *tsk,
 				   struct mm_struct *mm)
 {
 	mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id);
-	init_new_context_ldt(tsk, mm);
-	return 0;
+	return init_new_context_ldt(tsk, mm);
 }
 static inline void destroy_context(struct mm_struct *mm)
 {

From d5e678942de33a5d8545a8b7c825eb93b57be1a9 Mon Sep 17 00:00:00 2001
From: Vivek Goyal
Date: Mon, 22 Feb 2016 09:28:34 -0500
Subject: [PATCH 3/6] ovl: Ensure upper filesystem supports d_type
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

commit 45aebeaf4f67468f76bedf62923a576a519a9b68 upstream.

In some instances xfs has been created with ftype=0 and there if a file
on lower fs is removed, overlay leaves a whiteout in upper fs but that
whiteout does not get filtered out and is visible to overlayfs users.

And reason it does not get filtered out because upper filesystem does
not report file type of whiteout as DT_CHR during iterate_dir().

So it seems to be a requirement that upper filesystem support d_type for
overlayfs to work properly.
Do this check during mount and fail if d_type is not supported.

Suggested-by: Dave Chinner
Signed-off-by: Vivek Goyal
Signed-off-by: Miklos Szeredi
Signed-off-by: SZ Lin (林上智)
Signed-off-by: Greg Kroah-Hartman
---
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/readdir.c   | 37 +++++++++++++++++++++++++++++++++++++
 fs/overlayfs/super.c     | 15 +++++++++++++++
 3 files changed, 53 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index c319d5eaabcf..28316b292b8a 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -163,6 +163,7 @@ extern const struct file_operations ovl_dir_operations;
 int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list);
 void ovl_cleanup_whiteouts(struct dentry *upper, struct list_head *list);
 void ovl_cache_free(struct list_head *list);
+int ovl_check_d_type_supported(struct path *realpath);
 
 /* inode.c */
 int ovl_setattr(struct dentry *dentry, struct iattr *attr);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 299a6e1d6b77..0c59955c4653 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -43,6 +43,7 @@ struct ovl_readdir_data {
 	struct ovl_cache_entry *first_maybe_whiteout;
 	int count;
 	int err;
+	bool d_type_supported;
 };
 
 struct ovl_dir_file {
@@ -581,3 +582,39 @@ void ovl_cleanup_whiteouts(struct dentry *upper, struct list_head *list)
 	}
 	mutex_unlock(&upper->d_inode->i_mutex);
 }
+
+static int ovl_check_d_type(struct dir_context *ctx, const char *name,
+			    int namelen, loff_t offset, u64 ino,
+			    unsigned int d_type)
+{
+	struct ovl_readdir_data *rdd =
+		container_of(ctx, struct ovl_readdir_data, ctx);
+
+	/* Even if d_type is not supported, DT_DIR is returned for . and .. */
+	if (!strncmp(name, ".", namelen) || !strncmp(name, "..", namelen))
+		return 0;
+
+	if (d_type != DT_UNKNOWN)
+		rdd->d_type_supported = true;
+
+	return 0;
+}
+
+/*
+ * Returns 1 if d_type is supported, 0 not supported/unknown. Negative values
+ * if error is encountered.
+ */
+int ovl_check_d_type_supported(struct path *realpath)
+{
+	int err;
+	struct ovl_readdir_data rdd = {
+		.ctx.actor = ovl_check_d_type,
+		.d_type_supported = false,
+	};
+
+	err = ovl_dir_read(realpath, &rdd);
+	if (err)
+		return err;
+
+	return rdd.d_type_supported;
+}
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d70208c0de84..2de4e3a7d6e7 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1054,6 +1054,21 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			sb->s_flags |= MS_RDONLY;
 			ufs->workdir = NULL;
 		}
+
+		/*
+		 * Upper should support d_type, else whiteouts are visible.
+		 * Given workdir and upper are on same fs, we can do
+		 * iterate_dir() on workdir.
+		 */
+		err = ovl_check_d_type_supported(&workpath);
+		if (err < 0)
+			goto out_put_workdir;
+
+		if (!err) {
+			pr_err("overlayfs: upper fs needs to support d_type.\n");
+			err = -EINVAL;
+			goto out_put_workdir;
+		}
 	}
 
 	err = -ENOMEM;

From 0f9a6d88cd9f3b16a86639bd652202fe27096b18 Mon Sep 17 00:00:00 2001
From: Vivek Goyal
Date: Fri, 20 May 2016 09:04:26 -0400
Subject: [PATCH 4/6] ovl: Do d_type check only if work dir creation was successful
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

commit 21765194cecf2e4514ad75244df459f188140a0f upstream.

d_type check requires successful creation of workdir as iterates
through work dir and expects work dir to be present in it. If that's
not the case, this check will always return d_type not supported even
if underlying filesystem might be supporting it.

So don't do this check if work dir creation failed in previous step.
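A rough userspace analogue of this d_type probe, for illustration only:
it uses opendir()/readdir() rather than the kernel's iterate_dir(), and
sketch_dir_reports_d_type() is a hypothetical helper name. As with the
kernel check, a directory holding nothing but "." and ".." gives no
evidence either way and is reported as 0, which is why the probe only
makes sense when the directory is known to contain an entry.

```c
#define _DEFAULT_SOURCE	/* assumption: glibc, needed for d_type and DT_UNKNOWN */
#include <dirent.h>
#include <string.h>

/*
 * Returns 1 if some entry other than "." and ".." reports a known
 * d_type, 0 if none does (d_type unsupported, or nothing to inspect),
 * and -1 if the directory cannot be opened.
 */
static int sketch_dir_reports_d_type(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *ent;
	int supported = 0;

	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		/* "." and ".." may report DT_DIR even without d_type support. */
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
			continue;
		if (ent->d_type != DT_UNKNOWN) {
			supported = 1;
			break;
		}
	}

	closedir(dir);
	return supported;
}
```

Calling it on a directory that is known to be populated on the upper
filesystem gives a quick hint before attempting an overlay mount; a
result of 0 on an empty directory should not be read as "unsupported".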
Signed-off-by: Vivek Goyal
Signed-off-by: Miklos Szeredi
Signed-off-by: SZ Lin (林上智)
Signed-off-by: Greg Kroah-Hartman
---
 fs/overlayfs/super.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 2de4e3a7d6e7..fd21c5f74fba 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1058,16 +1058,19 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		/*
 		 * Upper should support d_type, else whiteouts are visible.
 		 * Given workdir and upper are on same fs, we can do
-		 * iterate_dir() on workdir.
+		 * iterate_dir() on workdir. This check requires successful
+		 * creation of workdir in previous step.
 		 */
-		err = ovl_check_d_type_supported(&workpath);
-		if (err < 0)
-			goto out_put_workdir;
+		if (ufs->workdir) {
+			err = ovl_check_d_type_supported(&workpath);
+			if (err < 0)
+				goto out_put_workdir;
 
-		if (!err) {
-			pr_err("overlayfs: upper fs needs to support d_type.\n");
-			err = -EINVAL;
-			goto out_put_workdir;
+			if (!err) {
+				pr_err("overlayfs: upper fs needs to support d_type.\n");
+				err = -EINVAL;
+				goto out_put_workdir;
+			}
 		}
 	}
 

From 7eaa995c75bd23b57163541c3285a2c984018b7e Mon Sep 17 00:00:00 2001
From: Vivek Goyal
Date: Fri, 1 Jul 2016 10:02:44 -0400
Subject: [PATCH 5/6] ovl: warn instead of error if d_type is not supported
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

commit e7c0b5991dd1be7b6f6dc2b54a15a0f47b64b007 upstream.

overlay needs underlying fs to support d_type. Recently I put in a
patch in to detect this condition and started failing mount if
underlying fs did not support d_type.

But this breaks existing configurations over kernel upgrade. Those who
are running docker (partially broken configuration) with xfs not
supporting d_type, are surprised that after kernel upgrade docker does
not run anymore.
https://github.com/docker/docker/issues/22937#issuecomment-229881315

So instead of erroring out, detect broken configuration and warn
about it. This should allow existing docker setups to continue
working after kernel upgrade.

Signed-off-by: Vivek Goyal
Signed-off-by: Miklos Szeredi
Fixes: 45aebeaf4f67 ("ovl: Ensure upper filesystem supports d_type")
Cc: 4.6
Signed-off-by: SZ Lin (林上智)
Signed-off-by: Greg Kroah-Hartman
---
 fs/overlayfs/super.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index fd21c5f74fba..0035cb80ecd1 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1066,11 +1066,13 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			if (err < 0)
 				goto out_put_workdir;
 
-			if (!err) {
-				pr_err("overlayfs: upper fs needs to support d_type.\n");
-				err = -EINVAL;
-				goto out_put_workdir;
-			}
+			/*
+			 * We allowed this configuration and don't want to
+			 * break users over kernel upgrade. So warn instead
+			 * of erroring out.
+			 */
+			if (!err)
+				pr_warn("overlayfs: upper fs needs to support d_type.\n");
 		}
 	}
 

From 577189c37a844243359afce1c3c94418259fe696 Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman
Date: Tue, 28 Aug 2018 07:23:44 +0200
Subject: [PATCH 6/6] Linux 4.4.153

---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 523b0d4354fb..208a813be615 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 4
 PATCHLEVEL = 4
-SUBLEVEL = 152
+SUBLEVEL = 153
 EXTRAVERSION =
 NAME = Blurry Fish Butt