]> git.baikalelectronics.ru Git - kernel.git/commit
KVM: MMU: speedup update_permission_bitmask
authorPaolo Bonzini <pbonzini@redhat.com>
Thu, 24 Aug 2017 15:37:25 +0000 (17:37 +0200)
committerPaolo Bonzini <pbonzini@redhat.com>
Thu, 24 Aug 2017 16:09:18 +0000 (18:09 +0200)
commit4c34538956676c5f773a6b0e721b2650605608c4
tree64fa777a0543ebed4e10f0c094e12299f63111b7
parent35af0ea6b11dd77dee2a29a205df678797bee9b8
KVM: MMU: speedup update_permission_bitmask

update_permission_bitmask currently does a 128-iteration loop to,
essentially, compute a constant array.  Computing the 8 bits in parallel
reduces it to 16 iterations, and is enough to speed it up substantially
because many boolean operations in the inner loop become constants or
simplify noticeably.

Because update_permission_bitmask is actually the top item in the profile
for nested vmexits, this speeds up an L2->L1 vmexit by about ten thousand
clock cycles, or up to 30%:

                                         before     after
   cpuid                                 35173      25954
   vmcall                                35122      27079
   inl_from_pmtimer                      52635      42675
   inl_from_qemu                         53604      44599
   inl_from_kernel                       38498      30798
   outl_to_kernel                        34508      28816
   wr_tsc_adjust_msr                     34185      26818
   rd_tsc_adjust_msr                     37409      27049
   mmio-no-eventfd:pci-mem               50563      45276
   mmio-wildcard-eventfd:pci-mem         34495      30823
   mmio-datamatch-eventfd:pci-mem        35612      31071
   portio-no-eventfd:pci-io              44925      40661
   portio-wildcard-eventfd:pci-io        29708      27269
   portio-datamatch-eventfd:pci-io       31135      27164

(I wrote a small C program to compare the tables for all values of CR0.WP,
CR4.SMAP and CR4.SMEP, and they match).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/mmu.c