There is a potential race in record_steal_time() between setting
host-local vcpu->arch.st.steal.preempted to zero (i.e. clearing
KVM_VCPU_PREEMPTED) and propagating this value to the guest with
kvm_write_guest_cached(). Between those two events the guest may
still see KVM_VCPU_PREEMPTED in its copy of kvm_steal_time, set
KVM_VCPU_FLUSH_TLB and assume that hypervisor will do the right
thing. Which it won't.
Instad of copying, we should map kvm_steal_time and that will
guarantee atomicity of accesses to @preempted.
This is part of CVE-2019-3016.
Signed-off-by: Boris Ostrovsky <email address hidden>
Reviewed-by: Joao Martins <email address hidden>
Signed-off-by: Paolo Bonzini <email address hidden>
[bwh: Backported to 4.19: No tracepoint in record_steal_time().]
Signed-off-by: Ben Hutchings <email address hidden>
Signed-off-by: Sasha Levin <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
Remote TLB flush does a busy wait which is fine in bare-metal
scenario. But with-in the guest, the vcpus might have been pre-empted or
blocked. In this scenario, the initator vcpu would end up busy-waiting
for a long amount of time; it also consumes CPU unnecessarily to wake
up the target of the shootdown.
This patch set adds support for KVM's new paravirtualized TLB flush;
remote TLB flush does not wait for vcpus that are sleeping, instead
KVM will flush the TLB as soon as the vCPU starts running again.
The improvement is clearly visible when the host is overcommitted; in this
case, the PV TLB flush (in addition to avoiding the wait on the main CPU)
prevents preempted vCPUs from stealing precious execution time from the
running ones.
Testing on a Xeon Gold 6142 2.6GHz 2 sockets, 32 cores, 64 threads,
so 64 pCPUs, and each VM is 64 vCPUs.
When running on a virtual machine, IPIs are expensive when the target
CPU is sleeping. Thus, it is nice to be able to avoid them for TLB
shootdowns. KVM can just do the flush via INVVPID on the guest's behalf
the next time the CPU is scheduled.
Cc: Paolo Bonzini <email address hidden>
Cc: Radim Krčmář <email address hidden>
Cc: Peter Zijlstra <email address hidden>
Signed-off-by: Wanpeng Li <email address hidden>
[Use "&" to test the bit instead of "==". - Paolo]
Signed-off-by: Paolo Bonzini <email address hidden>
Signed-off-by: Radim Krčmář <email address hidden>
(backported from commit f38a7b75267f1fb240a8178cbcb16d66dd37aac8)
Signed-off-by: Kamal Mostafa <email address hidden>
UBUNTU: [Config] wireguard -- enable on all architectures
Signed-off-by: Andy Whitcroft <email address hidden>
Acked-by: Stefan Bader <email address hidden>
Acked-by: Thadeu Lima de Souza Cascardo <email address hidden>
Signed-off-by: Thadeu Lima de Souza Cascardo <email address hidden>