android_kernel_oneplus_msm8998/arch/x86/kvm
Quentin Casasnovas 44dd5cec80 KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode.
commit ff30ef40deca4658e27b0c596e7baf39115e858f upstream.

I couldn't get Xen to boot a L2 HVM when it was nested under KVM - it was
getting a GP(0) on a rather unspecial vmread from Xen:

     (XEN) ----[ Xen-4.7.0-rc  x86_64  debug=n  Not tainted ]----
     (XEN) CPU:    1
     (XEN) RIP:    e008:[<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
     (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d1v0)
     (XEN) rax: ffff82d0801e6288   rbx: ffff83003ffbfb7c   rcx: fffffffffffab928
     (XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: ffff83000bdd0000
     (XEN) rbp: ffff83000bdd0000   rsp: ffff83003ffbfab0   r8:  ffff830038813910
     (XEN) r9:  ffff83003faf3958   r10: 0000000a3b9f7640   r11: ffff83003f82d418
     (XEN) r12: 0000000000000000   r13: ffff83003ffbffff   r14: 0000000000004802
     (XEN) r15: 0000000000000008   cr0: 0000000080050033   cr4: 00000000001526e0
     (XEN) cr3: 000000003fc79000   cr2: 0000000000000000
     (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
     (XEN) Xen code around <ffff82d0801e629e> (vmx_get_segment_register+0x14e/0x450):
     (XEN)  00 00 41 be 02 48 00 00 <44> 0f 78 74 24 08 0f 86 38 56 00 00 b8 08 68 00
     (XEN) Xen stack trace from rsp=ffff83003ffbfab0:

     ...

     (XEN) Xen call trace:
     (XEN)    [<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
     (XEN)    [<ffff82d0801f3695>] get_page_from_gfn_p2m+0x165/0x300
     (XEN)    [<ffff82d0801bfe32>] hvmemul_get_seg_reg+0x52/0x60
     (XEN)    [<ffff82d0801bfe93>] hvm_emulate_prepare+0x53/0x70
     (XEN)    [<ffff82d0801ccacb>] handle_mmio+0x2b/0xd0
     (XEN)    [<ffff82d0801be591>] emulate.c#_hvm_emulate_one+0x111/0x2c0
     (XEN)    [<ffff82d0801cd6a4>] handle_hvm_io_completion+0x274/0x2a0
     (XEN)    [<ffff82d0801f334a>] __get_gfn_type_access+0xfa/0x270
     (XEN)    [<ffff82d08012f3bb>] timer.c#add_entry+0x4b/0xb0
     (XEN)    [<ffff82d08012f80c>] timer.c#remove_entry+0x7c/0x90
     (XEN)    [<ffff82d0801c8433>] hvm_do_resume+0x23/0x140
     (XEN)    [<ffff82d0801e4fe7>] vmx_do_resume+0xa7/0x140
     (XEN)    [<ffff82d080164aeb>] context_switch+0x13b/0xe40
     (XEN)    [<ffff82d080128e6e>] schedule.c#schedule+0x22e/0x570
     (XEN)    [<ffff82d08012c0cc>] softirq.c#__do_softirq+0x5c/0x90
     (XEN)    [<ffff82d0801602c5>] domain.c#idle_loop+0x25/0x50
     (XEN)
     (XEN)
     (XEN) ****************************************
     (XEN) Panic on CPU 1:
     (XEN) GENERAL PROTECTION FAULT
     (XEN) [error_code=0000]
     (XEN) ****************************************

Tracing my host KVM showed it was the one injecting the GP(0) when
emulating the VMREAD and checking the destination segment permissions in
get_vmx_mem_address():

     3)               |    vmx_handle_exit() {
     3)               |      handle_vmread() {
     3)               |        nested_vmx_check_permission() {
     3)               |          vmx_get_segment() {
     3)   0.074 us    |            vmx_read_guest_seg_base();
     3)   0.065 us    |            vmx_read_guest_seg_selector();
     3)   0.066 us    |            vmx_read_guest_seg_ar();
     3)   1.636 us    |          }
     3)   0.058 us    |          vmx_get_rflags();
     3)   0.062 us    |          vmx_read_guest_seg_ar();
     3)   3.469 us    |        }
     3)               |        vmx_get_cs_db_l_bits() {
     3)   0.058 us    |          vmx_read_guest_seg_ar();
     3)   0.662 us    |        }
     3)               |        get_vmx_mem_address() {
     3)   0.068 us    |          vmx_cache_reg();
     3)               |          vmx_get_segment() {
     3)   0.074 us    |            vmx_read_guest_seg_base();
     3)   0.068 us    |            vmx_read_guest_seg_selector();
     3)   0.071 us    |            vmx_read_guest_seg_ar();
     3)   1.756 us    |          }
     3)               |          kvm_queue_exception_e() {
     3)   0.066 us    |            kvm_multiple_exception();
     3)   0.684 us    |          }
     3)   4.085 us    |        }
     3)   9.833 us    |      }
     3) + 10.366 us   |    }

Cross-checking the KVM/VMX VMREAD emulation code with the Intel Software
Developper Manual Volume 3C - "VMREAD - Read Field from Virtual-Machine
Control Structure", I found that we're enforcing that the destination
operand is NOT located in a read-only data segment or any code segment when
the L1 is in long mode - BUT that check should only happen when it is in
protected mode.

Shuffling the code a bit to make our emulation follow the specification
allows me to boot a Xen dom0 in a nested KVM and start HVM L2 guests
without problems.

Fixes: f9eb4af67c ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions")
Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Eugene Korenevsky <ekorenevsky@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-27 09:47:32 -07:00
..
assigned-dev.c KVM: x86: move kvm_set_irq_inatomic to legacy device assignment 2015-11-04 16:24:36 +01:00
assigned-dev.h KVM: x86: move device assignment out of kvm_host.h 2014-11-24 16:53:50 +01:00
cpuid.c KVM: x86: mask CPUID(0xD,0x1).EAX against host value 2016-06-01 12:15:52 -07:00
cpuid.h KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID 2015-12-22 15:29:00 +01:00
emulate.c KVM: x86: fix conversion of addresses to linear in 32-bit protected mode 2016-03-03 15:07:29 -08:00
hyperv.c kvm/x86: Hyper-V HV_X64_MSR_VP_RUNTIME support 2015-10-01 15:06:33 +02:00
hyperv.h kvm/x86: added hyper-v crash msrs into kvm hyperv context 2015-07-23 08:27:06 +02:00
i8254.c KVM: i8254: change PIT discard tick policy 2016-04-12 09:08:33 -07:00
i8254.h KVM: move iodev.h from virt/kvm/ to include/kvm 2015-03-26 21:43:12 +00:00
i8259.c KVM: x86: clean/fix memory barriers in irqchip_in_kernel 2015-07-30 16:02:56 +02:00
ioapic.c KVM: x86: fix edge EOI and IOAPIC reconfig race 2015-10-14 16:41:08 +02:00
ioapic.h KVM: x86: Add EOI exit bitmap inference 2015-10-01 15:06:28 +02:00
iommu.c KVM: count number of assigned devices 2015-07-10 13:25:26 +02:00
irq.c KVM: x86: Add support for local interrupt requests from userspace 2015-10-01 15:06:29 +02:00
irq.h KVM: x86: Add support for local interrupt requests from userspace 2015-10-01 15:06:29 +02:00
irq_comm.c KVM: x86: move kvm_set_irq_inatomic to legacy device assignment 2015-11-04 16:24:36 +01:00
Kconfig KVM: x86: select IRQ_BYPASS_MANAGER 2015-10-01 15:06:52 +02:00
kvm_cache_regs.h KVM: x86: API changes for SMM support 2015-06-04 16:01:11 +02:00
lapic.c KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc() 2015-11-10 12:06:18 +01:00
lapic.h KVM: Define a new interface kvm_intr_is_single_vcpu() 2015-10-01 15:06:49 +02:00
Makefile kvm/x86: move Hyper-V MSR's/hypercall code into hyperv.c file 2015-07-23 08:27:06 +02:00
mmu.c KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 2016-03-16 08:42:58 -07:00
mmu.h KVM: x86: merge handle_mmio_page_fault and handle_mmio_page_fault_common 2015-11-10 12:06:03 +01:00
mmu_audit.c Minor merge needed, due to function move. 2015-07-01 10:49:25 -07:00
mmutrace.h tracing: Rename ftrace_event.h to trace_events.h 2015-05-13 14:05:12 -04:00
mtrr.c KVM: MTRR: remove MSR 0x2f8 2016-06-01 12:15:52 -07:00
paging_tmpl.h KVM: x86: MMU: fix ubsan index-out-of-range warning 2016-03-03 15:07:29 -08:00
pmu.c KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch 2015-06-23 14:12:14 +02:00
pmu.h KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch 2015-06-23 14:12:14 +02:00
pmu_amd.c KVM: x86/vPMU: Fix unnecessary signed extension for AMD PERFCTRn 2015-08-11 15:19:41 +02:00
pmu_intel.c KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch 2015-06-23 14:12:14 +02:00
svm.c kvm: x86: move tracepoints outside extended quiescent state 2015-12-11 12:26:33 +01:00
trace.h KVM: x86: correctly print #AC in traces 2016-01-31 11:28:54 -08:00
tss.h
vmx.c KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode. 2016-07-27 09:47:32 -07:00
x86.c KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS 2016-06-24 10:18:18 -07:00
x86.h x86/fpu: Rename XSAVE macros 2015-09-14 12:21:46 +02:00