Hypervisor Development in Rust Part 1
This article covers the development of a minimalistic Intel VT-x research hypervisor in Rust. We will use the x86 crate and documentation, which help simplify the code.
The knowledge needed to make this hypervisor came from reading blogs and code, notably the two excellent free hypervisor development series by @daax_rynd and @Intel80x86. The primary motivation came shortly after @not_matthias released an AMD (SVM) Hypervisor in Rust and from Secret Club's excellent articles.
The majority of the hypervisor was already developed before the legendary @tandasat released Hypervisor 101 in Rust.
Virtual Machine Architecture
Virtual Machine Monitor (VMM):
A VMM acts as a host and has full control over the platform's processor(s) and other hardware. A VMM enables guest software to run directly on a logical processor by presenting it with an abstraction of a virtual processor. A VMM can retain selective control over I/O, interrupt handling, physical memory, and processor resources.
Guest Software:
Any software that runs inside a virtual machine (VM) managed by a virtual machine monitor (VMM) or hypervisor is called guest software. Each virtual machine (VM) supports an operating system (OS) stack and application software as a guest software environment. Each virtual machine runs independently of the others and has a standard interface to the physical platform's processor(s), memory, storage, graphics, and I/O. The software stack behaves as if it were on a platform with no VMM. So that the VMM can retain control over platform resources, software running in virtual machines must run with reduced privileges.
Introduction to Virtual Machine Extension (VMX) Operation
An operation that the Virtual Machine Monitor (VMM) performs to enter or leave a virtual machine execution mode is called a VMX operation. The VMX process switches between the host system's normal operating mode and the virtualized operating mode of the guest system executing within the VM. The processor's virtualization technology supports the low-level VMX operation, which enables the VMM to create and manage virtual machines.
Life Cycle of Virtual Machine Monitor (VMM) Software
The Virtual Machine Monitor (VMM) can enter and leave the execution mode of virtual machines (VMs) using low-level hardware operations called VM ENTRY and VM EXIT. Other low-level hardware operations, VMXON and VMXOFF, enable and disable VMX operation, the processor's implementation of hardware virtualization that supports VMMs, respectively. In essence, VMXON and VMXOFF allow the VMM to create and operate virtual machines, while VM ENTRY and VM EXIT enable the VMM to move between the host system and the guest system.
Credit: Intel® 64 and IA-32 Architectures Software Developer's Manual
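The life cycle above can be read as a small state machine. The following is a minimal sketch of that reading, not code from the hypervisor: the type and function names (VmxState, VmxEvent, step) are made up for illustration, and the model ignores every failure case a real processor has.

```rust
/// Processor state with respect to VMX operation (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmxState {
    /// VMX operation has not been entered, or has been left with VMXOFF.
    Off,
    /// The VMM (host) is running, after VMXON or after a VM exit.
    Host,
    /// Guest software is running, after a VM entry.
    Guest,
}

/// Events that move the processor from one state to another.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmxEvent {
    Vmxon,
    VmEntry,
    VmExit,
    Vmxoff,
}

/// Returns the next state, or `None` when the event is not valid in the current state.
fn step(state: VmxState, event: VmxEvent) -> Option<VmxState> {
    match (state, event) {
        (VmxState::Off, VmxEvent::Vmxon) => Some(VmxState::Host),
        (VmxState::Host, VmxEvent::VmEntry) => Some(VmxState::Guest),
        (VmxState::Guest, VmxEvent::VmExit) => Some(VmxState::Host),
        (VmxState::Host, VmxEvent::Vmxoff) => Some(VmxState::Off),
        _ => None,
    }
}
```

In this model a VM entry is only possible after VMXON, and VMXOFF is only possible from the host side, which matches the order in which the rest of the article sets things up.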
Virtual-Machine Control Structure (VMCS)
The Virtual Machine Monitor (VMM) controls and manages a virtual machine's execution through a virtual machine control structure (VMCS).
The VMCS contains the virtual machine's state, the settings for the virtual processor, and the mapping between virtual and physical resources.
The VMM uses a set of low-level instructions to manipulate the VMCS. The Virtual-Machine Control Structure pointer (VMCS pointer), which lets the VMM access the VMCS for a particular VM, can be read using VMPTRST and loaded using VMPTRLD. The VMM can change the virtual machine's state or obtain details about its current state using the VMREAD and VMWRITE instructions, which read values from and write values to the VMCS, respectively. When a virtual machine is terminated, or its state needs to be reset, VMCLEAR is used to clear the contents of the VMCS.
Each VMCS assigned to a physical computer's logical processors corresponds to a particular virtual machine. As a result, the VMM can oversee and administer many virtual machines on a single physical device. The VMCS and its related instructions give the VMM the control it needs to create, monitor, and manage the execution of virtual machines on logical processors.
Discovering Support for Virtual Machine Extension (VMX)
When developing a hypervisor, it is important to determine whether Intel or AMD built the CPU because each manufacturer has its own virtualization technology with distinct capabilities and instructions. Identifying the processor type and using the correct approach ensures that the hypervisor works on different systems.
The CPUID instruction can be used to determine whether Virtual Machine Extension (VMX) / Intel Virtualization Technology is supported. When CPUID is executed with the EAX register set to 1, the processor returns its feature information, including whether it supports VMX, in the EAX, EBX, ECX, and EDX registers. If the processor supports VMX, bit 5 of ECX is set to 1. If the bit is not set, the processor does not support VMX, and virtualization is unavailable.
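To make the register layout concrete, here is a minimal sketch that decodes raw CPUID register values by hand. The helper names (ecx_reports_vmx, vendor_bytes, host_is_intel_with_vmx) are assumptions made for this example; the hypervisor itself relies on the x86 crate instead. The vendor string comes from CPUID leaf 0, where its twelve bytes are stored in EBX, EDX, and ECX, in that order.

```rust
/// Bit position of the VMX feature flag in CPUID.1:ECX.
const CPUID_ECX_VMX_BIT: u32 = 5;

/// Returns true when the VMX flag is set in the ECX value returned by CPUID leaf 1.
fn ecx_reports_vmx(ecx: u32) -> bool {
    (ecx >> CPUID_ECX_VMX_BIT) & 1 == 1
}

/// Rebuilds the 12-byte vendor string from the CPUID leaf 0 registers.
/// Each register holds four bytes, least significant byte first.
fn vendor_bytes(ebx: u32, edx: u32, ecx: u32) -> [u8; 12] {
    let mut out = [0u8; 12];
    out[0..4].copy_from_slice(&ebx.to_le_bytes());
    out[4..8].copy_from_slice(&edx.to_le_bytes());
    out[8..12].copy_from_slice(&ecx.to_le_bytes());
    out
}

/// Queries the real processor. Only compiled for x86_64 targets.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)] // newer toolchains may treat `__cpuid` as a safe function
fn host_is_intel_with_vmx() -> bool {
    use core::arch::x86_64::__cpuid;
    let leaf0 = unsafe { __cpuid(0) };
    let leaf1 = unsafe { __cpuid(1) };
    &vendor_bytes(leaf0.ebx, leaf0.edx, leaf0.ecx) == b"GenuineIntel"
        && ecx_reports_vmx(leaf1.ecx)
}
```

Keeping the bit tests separate from the instruction itself means the decoding can be checked with fixed register values on any machine.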
Rust
We check whether Intel makes the CPU by examining the CPUID information using the Rust x86 crate. Specifically, we check the vendor information returned by the CPUID instruction to see if it equals "GenuineIntel". If the vendor information indicates an Intel CPU, we return an Ok result; otherwise, we return an error indicating that the hypervisor does not support the CPU.
/// Check to see if CPU is Intel ("GenuineIntel").
pub fn has_intel_cpu() -> Result<(), HypervisorError> {
    let cpuid = CpuId::new();
    if let Some(vi) = cpuid.get_vendor_info() {
        if vi.as_str() == "GenuineIntel" {
            return Ok(());
        }
    }
    Err(HypervisorError::CPUUnsupported)
}
Rust
We check whether the processor supports Virtual Machine Extension (VMX) technology by checking whether bit 5 of the ECX register is set to 1 after executing the CPUID instruction. We use the Rust x86 crate to get the CPUID information and check whether the processor has VMX support by reading the feature information. If the processor supports VMX, we return an Ok result; otherwise, we return an error indicating that VMX is not supported.
/// Check processor support for Virtual Machine Extension (VMX) technology - CPUID.1:ECX.VMX[bit 5] = 1 (Intel Manual: 24.6 Discovering Support for VMX)
pub fn has_vmx_support() -> Result<(), HypervisorError> {
    let cpuid = CpuId::new();
    if let Some(fi) = cpuid.get_feature_info() {
        if fi.has_vmx() {
            return Ok(());
        }
    }
    Err(HypervisorError::VMXUnsupported)
}
Rust
We use a custom HypervisorError enum to handle errors, which was made using the thiserror-no-std crate.
use thiserror_no_std::Error;

#[derive(Error, Debug)]
pub enum HypervisorError {
    #[error("Intel CPU not found")]
    CPUUnsupported,
    #[error("VMX is not supported")]
    VMXUnsupported,
    #[error("VMX locked off in BIOS")]
    VMXBIOSLock,
    #[error("Failed to allocate memory via PhysicalAllocator")]
    MemoryAllocationFailed(#[from] core::alloc::AllocError),
    #[error("Failed to convert from virtual address to physical address")]
    VirtualToPhysicalAddressFailed,
    #[error("Failed to execute VMXON")]
    VMXONFailed,
    #[error("Failed to execute VMXOFF")]
    VMXOFFFailed,
    #[error("Failed to execute VMCLEAR")]
    VMCLEARFailed,
    #[error("Failed to execute VMPTRLD")]
    VMPTRLDFailed,
    #[error("Failed to execute VMREAD")]
    VMREADFailed,
    #[error("Failed to execute VMWRITE")]
    VMWRITEFailed,
    #[error("Failed to execute VMLAUNCH")]
    VMLAUNCHFailed,
    #[error("Failed to execute VMRESUME")]
    VMRESUMEFailed,
    #[error("Failed to switch processor")]
    ProcessorSwitchFailed,
    #[error("Failed to access VCPU table")]
    VcpuIsNone,
}
Enabling and Entering Virtual Machine Extension (VMX) Operation
To execute virtual machines, the CPU must operate in a hardware virtualization mode, made possible by Virtual Machine Extensions (VMX). System software first sets CR4.VMXE[bit 13] to 1 to enable VMX. This bit is located in the control register CR4, which governs several of the processor's operating modes. Once the VMXE bit has been set, system software can execute the VMXON instruction to enter VMX operation.
If VMXON is executed with CR4.VMXE = 0, an invalid-opcode exception (#UD) is raised: because VMX is not enabled, the CPU does not recognize the VMXON instruction. After the processor enters VMX operation, the CR4.VMXE bit cannot be cleared. System software must therefore leave VMX operation with the VMXOFF instruction before CR4.VMXE can be cleared.
Rust
We have a function called enable_vmx_operation() that enables virtual machine extensions (VMX). We do this by setting a specific bit (bit 13) in the CR4 control register to 1. We first read the current value of CR4 using the controlregs::cr4() function, then set the appropriate bit using the set() method of the Cr4 struct, and finally write the updated value back to CR4 using the controlregs::cr4_write() function.
In addition to setting the CR4 bit, we call the set_lock_bit() function, which sets the lock bit via the IA32_FEATURE_CONTROL register, and then log a message indicating that the lock bit has been set. If everything goes well, we return a Result with an Ok value indicating success. If an error occurs, we return a Result with an Err value containing a HypervisorError.
/// Enables Virtual Machine Extensions - CR4.VMXE[bit 13] = 1 (Intel Manual: 24.7 Enabling and Entering VMX Operation)
pub fn enable_vmx_operation() -> Result<(), HypervisorError> {
    let mut cr4 = unsafe { controlregs::cr4() };
    cr4.set(controlregs::Cr4::CR4_ENABLE_VMX, true);
    unsafe { controlregs::cr4_write(cr4) };
    set_lock_bit()?;
    log::info!("[+] Lock bit set via IA32_FEATURE_CONTROL");
    Ok(())
}
The IA32_FEATURE_CONTROL MSR is a model-specific register that controls the processor's features, including VMX capability. This register is zeroed when a logical processor is reset. Bits 0, 1, and 2 are relevant to VMXON. Bit 0 is the lock bit, and it determines whether the MSR can be updated. If the lock bit is not set, VMXON execution fails. Once the lock bit is set, the MSR cannot be modified until after a power-up reset. The BIOS can use the lock bit together with bit 1, bit 2, or both to disable VMX capability.
Bit 1 enables VMXON in SMX (Safer Mode Extensions) operation. If this bit is not set, executing VMXON in SMX operation fails. Bit 2 enables VMXON outside SMX operation. Attempting to set this bit on a logical processor that does not support VMX operation causes a general-protection exception.
Both the IA32_FEATURE_CONTROL MSR and the control bit in CR4 must be set in order to activate VMX. The lock bit, together with bit 1 or bit 2, enables VMXON. Once enabled, the processor can enter VMX operation and run virtual machines using VMX instructions.
Rust
We first read the current value of the IA32_FEATURE_CONTROL MSR to see whether the lock bit is already set. If it is not set, we set the lock bit together with the VMXON_OUTSIDE_SMX bit and write the new value to the IA32_FEATURE_CONTROL MSR. If the lock bit is already set but the VMXON_OUTSIDE_SMX bit is not, we return an error indicating that the BIOS has locked the VMX feature off.
/// Check if we need to set bits in IA32_FEATURE_CONTROL (Intel Manual: 24.7 Enabling and Entering VMX Operation)
fn set_lock_bit() -> Result<(), HypervisorError> {
    const VMX_LOCK_BIT: u64 = 1 << 0;
    const VMXON_OUTSIDE_SMX: u64 = 1 << 2;
    let ia32_feature_control = unsafe { msr::rdmsr(msr::IA32_FEATURE_CONTROL) };
    if (ia32_feature_control & VMX_LOCK_BIT) == 0 {
        // The MSR is still unlocked: enable VMXON outside SMX and lock the register.
        unsafe {
            msr::wrmsr(
                msr::IA32_FEATURE_CONTROL,
                VMXON_OUTSIDE_SMX | VMX_LOCK_BIT | ia32_feature_control,
            )
        };
    } else if (ia32_feature_control & VMXON_OUTSIDE_SMX) == 0 {
        // The MSR is locked with VMXON disabled, so only firmware can change it.
        return Err(HypervisorError::VMXBIOSLock);
    }
    Ok(())
}
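The decision made in set_lock_bit() can also be separated from the privileged MSR access, which makes it easy to exercise in user mode. The sketch below is one way to do that, under the assumption that we only care about the lock bit and bit 2. The FeatureControlAction type and the plan_feature_control() function are hypothetical names for this example and are not part of the hypervisor.

```rust
/// Bit 0 of IA32_FEATURE_CONTROL: the lock bit.
const VMX_LOCK_BIT: u64 = 1 << 0;
/// Bit 2 of IA32_FEATURE_CONTROL: enables VMXON outside SMX operation.
const VMXON_OUTSIDE_SMX: u64 = 1 << 2;

/// What the caller should do for a given MSR value (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum FeatureControlAction {
    /// The MSR is unlocked: write this value to enable VMXON and lock the MSR.
    Write(u64),
    /// The MSR is locked and VMXON outside SMX is enabled: nothing to do.
    AlreadyEnabled,
    /// The MSR is locked and VMXON outside SMX is disabled: VMX is locked off.
    LockedOff,
}

/// Mirrors the branches of set_lock_bit() without touching the hardware.
fn plan_feature_control(current: u64) -> FeatureControlAction {
    if current & VMX_LOCK_BIT == 0 {
        FeatureControlAction::Write(current | VMX_LOCK_BIT | VMXON_OUTSIDE_SMX)
    } else if current & VMXON_OUTSIDE_SMX == 0 {
        FeatureControlAction::LockedOff
    } else {
        FeatureControlAction::AlreadyEnabled
    }
}
```

A freshly reset MSR value of 0 yields Write(0b101), while a value of 0b001 (locked, VMXON disabled) corresponds to the VMXBIOSLock error in the real function.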
Restrictions on VMX Operation (Adjusting Control Registers)
To ensure that Virtual Machine Extension (VMX) operation works as intended, specific bits in the control registers (CR0 and CR4) must be set or cleared to particular values. VMXON fails if any of these bits holds an unsupported value. While the processor is in VMX operation, any attempt to set one of these bits to an unsupported value causes a general-protection exception. Software should consult the VMX capability MSRs IA32_VMX_CR0_FIXED0, IA32_VMX_CR0_FIXED1, IA32_VMX_CR4_FIXED0, and IA32_VMX_CR4_FIXED1 to find out which bits in the CR0 and CR4 registers are fixed and how they must be set.
Rust
We have implemented functions that adjust the CR0 and CR4 control registers for virtualization. These functions ensure that the mandatory bits in the control registers are set and cleared appropriately to support virtualization. To achieve this, we have defined two functions: set_cr0_bits() and set_cr4_bits(). The former sets the mandatory bits in CR0 while clearing the mandatory zero bits, and the latter does the same for CR4.
To adjust CR0 and CR4, we read the values stored in the IA32_VMX_CR0_FIXED0, IA32_VMX_CR0_FIXED1, IA32_VMX_CR4_FIXED0, and IA32_VMX_CR4_FIXED1 Model-Specific Registers (MSRs) to determine which bits should be set and cleared. We then use the from_bits_truncate() function to ensure that the bit values fit within the Cr0 and Cr4 types, set the mandatory bits using the bitwise OR operator, and clear the mandatory zero bits using the bitwise AND operator. Finally, we write the resulting value back to the CR0 or CR4 register using the cr0_write() or cr4_write() functions.
We have also defined a higher-level function adjust_control_registers() that calls both set_cr0_bits() and set_cr4_bits(). This function sets and clears the mandatory bits in both CR0 and CR4 and logs a message indicating that the bits have been set/cleared.
/// Set and clear the mandatory bits in CR0 and CR4
pub fn adjust_control_registers() {
    set_cr0_bits();
    log::info!("[+] Mandatory bits in CR0 set/cleared");
    set_cr4_bits();
    log::info!("[+] Mandatory bits in CR4 set/cleared");
}

/// Set the mandatory bits in CR0 and clear bits that are mandatory zero (Intel Manual: 24.8 Restrictions on VMX Operation)
fn set_cr0_bits() {
    let ia32_vmx_cr0_fixed0 = unsafe { msr::rdmsr(msr::IA32_VMX_CR0_FIXED0) };
    let ia32_vmx_cr0_fixed1 = unsafe { msr::rdmsr(msr::IA32_VMX_CR0_FIXED1) };
    let mut cr0 = unsafe { controlregs::cr0() };
    // Bits set in FIXED0 must be 1; bits clear in FIXED1 must be 0.
    cr0 |= controlregs::Cr0::from_bits_truncate(ia32_vmx_cr0_fixed0 as usize);
    cr0 &= controlregs::Cr0::from_bits_truncate(ia32_vmx_cr0_fixed1 as usize);
    unsafe { controlregs::cr0_write(cr0) };
}

/// Set the mandatory bits in CR4 and clear bits that are mandatory zero (Intel Manual: 24.8 Restrictions on VMX Operation)
fn set_cr4_bits() {
    let ia32_vmx_cr4_fixed0 = unsafe { msr::rdmsr(msr::IA32_VMX_CR4_FIXED0) };
    let ia32_vmx_cr4_fixed1 = unsafe { msr::rdmsr(msr::IA32_VMX_CR4_FIXED1) };
    let mut cr4 = unsafe { controlregs::cr4() };
    // Bits set in FIXED0 must be 1; bits clear in FIXED1 must be 0.
    cr4 |= controlregs::Cr4::from_bits_truncate(ia32_vmx_cr4_fixed0 as usize);
    cr4 &= controlregs::Cr4::from_bits_truncate(ia32_vmx_cr4_fixed1 as usize);
    unsafe { controlregs::cr4_write(cr4) };
}
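The arithmetic behind both functions is the same, so it can be shown on plain integers. This is a minimal sketch with hypothetical helper names (apply_fixed_bits, satisfies_fixed_bits). The numbers used with it here are illustrative and were not read from real MSRs.

```rust
/// Applies the fixed-bit rules to a control register value:
/// every bit set in `fixed0` is forced to 1,
/// every bit clear in `fixed1` is forced to 0.
fn apply_fixed_bits(value: u64, fixed0: u64, fixed1: u64) -> u64 {
    (value | fixed0) & fixed1
}

/// Returns true when `value` already satisfies both constraints.
fn satisfies_fixed_bits(value: u64, fixed0: u64, fixed1: u64) -> bool {
    apply_fixed_bits(value, fixed0, fixed1) == value
}
```

For example, with fixed0 = 0x8000_0021 and fixed1 = 0xFFFF_FFFF, a register value of 0x11 becomes 0x8000_0031: the three mandatory bits are added and nothing is cleared. With fixed0 = 0x01 and fixed1 = 0x3F, a value of 0xF0 becomes 0x31, because the two top bits are mandatory zero.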
VMXON Region
Before enabling virtual machine extensions (VMX) operation, software must allocate a region of memory called the VMXON Region, which the logical processor uses to support VMX operation. The operand of the VMXON instruction is the physical address of this region.
The VMXON pointer must meet certain requirements: it must be 4-KByte aligned and must not set any bits beyond the processor's physical-address width. Software must use a separate region for each logical processor and write the VMCS revision identifier (VMCS ID) to the VMXON region before executing VMXON. Accessing or modifying the VMXON region of a logical processor between the execution of VMXON and VMXOFF may lead to unpredictable behaviour.
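The two pointer requirements can be expressed as a short check. This is only a sketch: is_valid_vmxon_pointer() is a hypothetical helper, and the physical-address width is passed in by the caller rather than queried from the processor.

```rust
/// Size of a 4-KByte page.
const PAGE_SIZE: u64 = 0x1000;

/// Checks the VMXON pointer requirements described above.
/// `phys_addr_bits` is the processor's physical-address width in bits,
/// supplied by the caller (an assumption of this sketch).
fn is_valid_vmxon_pointer(pa: u64, phys_addr_bits: u32) -> bool {
    // The address must sit on a 4-KByte boundary.
    let aligned = pa % PAGE_SIZE == 0;
    // No bit at or above the physical-address width may be set.
    let in_range = phys_addr_bits >= 64 || (pa >> phys_addr_bits) == 0;
    aligned && in_range
}
```

With a 36-bit physical-address width, 0x1000 passes, 0x1004 fails the alignment test, and 1 << 36 fails the width test.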
Rust
Fortunately for us, @not-matthias already has a kernel-alloc crate in Rust ready for community use.
The PhysicalAllocator is a custom allocator that allocates physical memory in Windows kernel mode. When you allocate memory using this allocator, it calls the MmAllocateContiguousMemorySpecifyCacheNode function to allocate contiguous physical memory. If the allocation succeeds, it returns a pointer to the allocated memory. If it fails, it returns an AllocError. When you deallocate memory using this allocator, it calls the MmFreeContiguousMemory function to free the memory that was previously allocated. The allocator implements Rust's Allocator trait (the unstable allocator_api feature), so it can back individual heap allocations such as Box<T, PhysicalAllocator>. A global allocator for ordinary String, Vec, and Box values would be registered separately through the GlobalAlloc trait.
If you want to find out more about it, please refer to alloc::GlobalAlloc or alloc::Allocator and the Rust book for global_allocator or allocator_api.
/// The physical kernel allocator structure.
pub struct PhysicalAllocator;

unsafe impl Allocator for PhysicalAllocator {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        let mut boundary: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };
        let mut lowest: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };
        let mut highest: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };
        // Accept any physical address: lowest = 0, highest = -1 (all bits set), no boundary.
        unsafe { *(boundary.QuadPart_mut()) = 0 };
        unsafe { *(lowest.QuadPart_mut()) = 0 };
        unsafe { *(highest.QuadPart_mut()) = -1 };
        let memory = unsafe {
            MmAllocateContiguousMemorySpecifyCacheNode(
                layout.size(),
                lowest,
                highest,
                boundary,
                MmCached,
                MM_ANY_NODE_OK,
            )
        } as *mut u8;
        if memory.is_null() {
            Err(AllocError)
        } else {
            let slice = unsafe { core::slice::from_raw_parts_mut(memory, layout.size()) };
            Ok(unsafe { NonNull::new_unchecked(slice) })
        }
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, _layout: Layout) {
        MmFreeContiguousMemory(ptr.cast().as_ptr());
    }
}
We define a struct called VmxonRegion, which represents a VMXON Region in memory. This region must be aligned to the page size of 4096 bytes (0x1000 in hexadecimal). The VmxonRegion structure contains two fields: revision_id and data. The revision_id is a 32-bit unsigned integer holding the VMCS revision identifier reported by the processor, and it takes up 4 bytes of the memory region. The data field is an array of 4092 bytes that contains the rest of the VMXON Region. By using the repr(C, align(4096)) attribute, we ensure that the VmxonRegion type is laid out exactly as specified, with each instance occupying 4096 bytes on a 4096-byte boundary. This ensures that the VMXON Region is aligned correctly in memory and can be used by the processor without any issues.
pub const PAGE_SIZE: usize = 0x1000;

#[repr(C, align(4096))]
pub struct VmxonRegion {
    pub revision_id: u32,
    pub data: [u8; PAGE_SIZE - 4],
}
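The layout claims can be verified by the compiler. The sketch below repeats the struct so that it is self-contained and adds two compile-time assertions plus a small runtime check. The assertions and the boxed_region_is_aligned() helper are additions made for this example and are not part of the hypervisor, which allocates the region with PhysicalAllocator rather than the default allocator.

```rust
use core::mem::{align_of, size_of};

const PAGE_SIZE: usize = 0x1000;

/// Copy of the VMXON region layout shown above.
#[repr(C, align(4096))]
struct VmxonRegion {
    revision_id: u32,
    data: [u8; PAGE_SIZE - 4],
}

// Compile-time checks: the build fails if the size or alignment ever changes.
const _: () = assert!(size_of::<VmxonRegion>() == PAGE_SIZE);
const _: () = assert!(align_of::<VmxonRegion>() == PAGE_SIZE);

/// Allocates one region on the ordinary heap and checks that its address is page aligned.
fn boxed_region_is_aligned() -> bool {
    let region = Box::new(VmxonRegion {
        revision_id: 0,
        data: [0; PAGE_SIZE - 4],
    });
    (&*region as *const VmxonRegion as usize) % PAGE_SIZE == 0
}
```

Because the alignment is part of the type, any allocator that honours the requested layout returns a suitably aligned virtual address. The physical address still has to be obtained separately.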
We define a function get_vmcs_revision_id that returns the Virtual Machine Control Structure (VMCS) revision ID. To get this revision ID, we read a Model Specific Register (MSR) using the rdmsr function, passing it the MSR identifier IA32_VMX_BASIC. We cast the returned value to a 32-bit unsigned integer and then bitwise AND it with 0x7FFF_FFFF to clear the high bit, which is reserved. The resulting value is the VMCS revision ID, which we return.
/// Get the Virtual Machine Control Structure revision identifier (VMCS revision ID) (Intel Manual: 25.11.5 VMXON Region)
pub fn get_vmcs_revision_id() -> u32 {
    unsafe { (msr::rdmsr(msr::IA32_VMX_BASIC) as u32) & 0x7FFF_FFFF }
}
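The same masking can be shown on a plain 64-bit value, without reading the MSR. In this sketch the function name revision_id_from_vmx_basic() is hypothetical, and any input values used with it are made up for illustration rather than taken from real hardware.

```rust
/// Extracts the VMCS revision identifier (bits 30:0) from an IA32_VMX_BASIC value.
fn revision_id_from_vmx_basic(vmx_basic: u64) -> u32 {
    // The cast keeps the low 32 bits; the mask then clears bit 31.
    (vmx_basic as u32) & 0x7FFF_FFFF
}
```

An input of 0x00DA_0400_0000_0004 yields 4: the upper half of the MSR, which describes other VMX capabilities, is discarded by the cast.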
To convert between virtual and physical addresses, we rely on Windows kernel functions, including the undocumented MmGetVirtualForPhysical function. Luckily for us, we can reuse the code written by @not-matthias in this amd_hypervisor since there is no crate for it currently.
We have two functions here. The first function, physical_address, takes a pointer to a u64 and converts it to a physical address of type PAddr. This function is used to convert virtual addresses to physical addresses. The second function, va_from_pa, takes a physical address and converts it to a virtual address. This is done using the undocumented Windows kernel function MmGetVirtualForPhysical.
pub fn physical_address(ptr: *const u64) -> PAddr {
    PhysicalAddress::from_va(ptr as u64).0
}

fn va_from_pa(pa: u64) -> u64 {
    let mut physical_address: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };
    unsafe { *(physical_address.QuadPart_mut()) = pa as i64 };
    unsafe { MmGetVirtualForPhysical(physical_address) as u64 }
}
The VcpuData struct represents data associated with a virtual CPU in a hypervisor. It contains a field called vmxon_region, which is a zero-initialized, naturally aligned 4-KByte region of memory, as well as a field called vmxon_region_physical_address, which is its physical address. The new() function initializes the VcpuData struct and allocates the VMXON Region in memory using a PhysicalAllocator. The init_vmxon_region() function initializes the VMXON Region with the VMCS revision ID, enables VMX operation by calling vmxon(), and returns an error if the virtual to physical address translation fails.
pub struct VcpuData {
    /// The virtual and physical address of the Vmxon naturally aligned 4-KByte region of memory
    pub vmxon_region: Box<VmxonRegion, PhysicalAllocator>,
    pub vmxon_region_physical_address: u64,
}

impl VcpuData {
    pub fn new() -> Result<Box<Self>, HypervisorError> {
        let instance = Self {
            vmxon_region: unsafe { Box::try_new_zeroed_in(PhysicalAllocator)?.assume_init() },
            vmxon_region_physical_address: 0,
        };
        let mut instance = Box::new(instance);
        log::info!("[+] init_vmxon_region");
        instance.init_vmxon_region()?;
        Ok(instance)
    }

    /// Allocate a naturally aligned 4-KByte VMXON region of memory to enable VMX operation (Intel Manual: 25.11.5 VMXON Region)
    pub fn init_vmxon_region(&mut self) -> Result<(), HypervisorError> {
        self.vmxon_region_physical_address = physical_address(self.vmxon_region.as_ref() as *const _ as _).as_u64();
        if self.vmxon_region_physical_address == 0 {
            return Err(HypervisorError::VirtualToPhysicalAddressFailed);
        }
        log::info!("[+] VMXON Region Virtual Address: {:p}", self.vmxon_region);
        log::info!("[+] VMXON Region Physical Address: 0x{:x}", self.vmxon_region_physical_address);
        self.vmxon_region.revision_id = support::get_vmcs_revision_id();
        self.vmxon_region.as_mut().revision_id.set_bit(31, false);
        support::vmxon(self.vmxon_region_physical_address)?;
        log::info!("[+] VMXON successful!");
        Ok(())
    }
}
The vmxon() function is just a wrapper around the x86 vmxon() function, which executes vmxon <addr> in assembly. Creating wrappers is not strictly necessary, but it helps with error handling.
/// Enable VMX operation.
pub fn vmxon(vmxon_pa: u64) -> Result<(), HypervisorError> {
    match unsafe { x86::bits64::vmx::vmxon(vmxon_pa) } {
        Ok(_) => Ok(()),
        Err(_) => Err(HypervisorError::VMXONFailed),
    }
}
Overall, the above initializes a memory region to enable VMX operation for a virtual CPU in a hypervisor. However, we want to do this for every logical/virtual CPU.
Processors, Cores and Logical/Virtual Processors (VCPUs)
Processor:
The primary part of a computer that performs arithmetic, logic, input/output (I/O), and control operations is the processor, often called the central processing unit (CPU). It is responsible for executing instructions and controlling the data flow within a computer system.
Cores:
A core is a physical processing unit that can execute instructions within a CPU. To work in parallel with other cores, each core typically includes its own arithmetic logic unit (ALU), register set, and cache.
Logical Processor:
A processing unit within a CPU that can execute a single thread of instructions is called a logical processor, also known as a virtual processor. Depending on the particular processor design, each physical core in current CPUs can host several logical processors.
Say we have four physical cores in our processor; this translates to four separate processing units in our CPU. Hyper-threading technology allows two threads to execute simultaneously on each core. As a result, there are eight logical processors, which the operating system sees as eight different CPUs.
General purpose registers, MSRs, VMCSs, and VMXON Regions are among the resources that each logical processor has for itself. We must ensure that the Virtual Machine Monitor (VMM) is set up to use all logical processors. This lets us make the most of our CPU's capabilities and deliver the best performance for our virtualized workloads.
Rust
We have a struct called Vcpu that represents a virtual CPU. It has two fields: index, which is an integer that represents the index of the processor, and data, which is a OnceCell that holds a boxed VcpuData instance. The new() function takes an index as an argument and creates a new Vcpu instance with that index and an uninitialized data field.
The virtualize_cpu function is responsible for initializing the virtual CPU for virtualization. It first enables the Virtual Machine Extensions (VMX), adjusts the control registers, and then initializes the VcpuData structure by calling get_or_try_init on the data field. The get_or_try_init function initializes the data field if it has not been initialized before or returns the existing value if it has.
The devirtualize_cpu() function devirtualizes the CPU using the vmxoff instruction. This instruction leaves VMX operation and returns control to the host operating system. The function returns a Result indicating whether the operation was successful, along with any associated error information. The id() function returns the index of the current virtual processor, which is useful in multi-processor systems where we need to identify which processor is executing the code.
pub struct Vcpu {
    /// The index of the processor.
    index: u32,
    data: OnceCell<Box<VcpuData>>,
}

impl Vcpu {
    pub fn new(index: u32) -> Result<Self, HypervisorError> {
        log::trace!("Creating processor {}", index);
        Ok(Self {
            index,
            data: OnceCell::new(),
        })
    }

    /// Virtualize the CPU: enable VMX, adjust the control registers, then initialize VcpuData
    pub fn virtualize_cpu(&self) -> Result<(), HypervisorError> {
        support::enable_vmx_operation()?;
        support::adjust_control_registers();
        self.data.get_or_try_init(|| VcpuData::new())?;
        Ok(())
    }

    /// Devirtualize the CPU using vmxoff
    pub fn devirtualize_cpu(&self) -> Result<(), HypervisorError> {
        support::vmxoff()?;
        Ok(())
    }

    /// Gets the index of the current logical/virtual processor
    pub fn id(&self) -> u32 {
        self.index
    }
}
The vmxoff() function is just a wrapper around the x86 vmxoff() function, which executes vmxoff in assembly.
/// Disable VMX operation.
pub fn vmxoff() -> Result<(), HypervisorError> {
    match unsafe { x86::bits64::vmx::vmxoff() } {
        Ok(_) => Ok(()),
        Err(_) => Err(HypervisorError::VMXOFFFailed),
    }
}
Once again, we can reuse the code written by @not-matthias in this amd_hypervisor since there is no crate for it currently. The module provides utilities for managing processor affinity, which is the ability to control which processor(s) a thread can execute on.
The processor_count() function returns the number of processors available on the system using the Windows kernel function KeQueryActiveProcessorCountEx.
The current_processor_index() function returns the index of the processor currently executing the calling thread using the Windows kernel function KeGetCurrentProcessorNumberEx.
The processor_number_from_index() function takes an index and returns the corresponding PROCESSOR_NUMBER structure, which identifies the processor's group and number within that group, using the Windows kernel function KeGetProcessorNumberFromIndex. If the index is out of range or the system call fails, the function returns None.
pub fn processor_count() -> u32 {
    unsafe { KeQueryActiveProcessorCountEx(ALL_PROCESSOR_GROUPS) }
}

pub fn current_processor_index() -> u32 {
    unsafe { KeGetCurrentProcessorNumberEx(core::ptr::null_mut()) }
}

/// Returns the processor number for the specified index.
fn processor_number_from_index(index: u32) -> Option<PROCESSOR_NUMBER> {
    let mut processor_number = MaybeUninit::uninit();
    let status = unsafe { KeGetProcessorNumberFromIndex(index, processor_number.as_mut_ptr()) };
    if NT_SUCCESS(status) {
        Some(unsafe { processor_number.assume_init() })
    } else {
        None
    }
}
The ProcessorExecutor struct temporarily switches execution to a specified processor until it is dropped. When an instance of ProcessorExecutor is created with a valid processor index, the switch_to_processor() function sets the affinity of the calling thread to the specified processor using the Windows kernel function KeSetSystemGroupAffinityThread and then yields execution with ZwYieldExecution. If the index is invalid or yielding execution fails, the function returns None. When the ProcessorExecutor instance is dropped, the original processor affinity is restored using the Windows kernel function KeRevertToUserGroupAffinityThread.
/// Switches execution to a specific processor until dropped.
pub struct ProcessorExecutor {
    old_affinity: MaybeUninit<GROUP_AFFINITY>,
}

impl ProcessorExecutor {
    pub fn switch_to_processor(i: u32) -> Option<Self> {
        // Valid indices run from 0 to processor_count() - 1.
        if i >= processor_count() {
            log::error!("Invalid processor index: {}", i);
            return None;
        }
        let processor_number = processor_number_from_index(i)?;
        let mut old_affinity = MaybeUninit::uninit();
        let mut affinity: GROUP_AFFINITY = unsafe { core::mem::zeroed() };
        affinity.Group = processor_number.Group;
        affinity.Mask = 1 << processor_number.Number;
        affinity.Reserved[0] = 0;
        affinity.Reserved[1] = 0;
        affinity.Reserved[2] = 0;
        log::trace!("Switching execution to processor {}", i);
        unsafe { KeSetSystemGroupAffinityThread(&mut affinity, old_affinity.as_mut_ptr()) };
        log::trace!("Yielding execution");
        if !NT_SUCCESS(unsafe { ZwYieldExecution() }) {
            return None;
        }
        Some(Self { old_affinity })
    }
}

impl Drop for ProcessorExecutor {
    fn drop(&mut self) {
        log::trace!("Switching execution back to previous processor");
        unsafe {
            KeRevertToUserGroupAffinityThread(self.old_affinity.as_mut_ptr());
        }
    }
}
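The pattern used by ProcessorExecutor, saving some state on creation and restoring it in Drop, can be tried out in user mode. The sketch below replaces the kernel calls with a stand-in type, so everything in it (FakeThread, AffinityGuard, affinity_mask) is hypothetical and only illustrates the guard idea and the single-bit mask computation.

```rust
use std::cell::Cell;

/// Builds a single-bit affinity mask for a processor number within its group.
/// Returns None when the number does not fit in a 64-bit mask.
fn affinity_mask(number: u8) -> Option<u64> {
    1u64.checked_shl(u32::from(number))
}

/// Stand-in for the calling thread's affinity (the real code asks the kernel).
struct FakeThread {
    affinity: Cell<u64>,
}

/// Pins the fake thread to one processor and restores the old mask when dropped.
struct AffinityGuard<'a> {
    thread: &'a FakeThread,
    old_affinity: u64,
}

impl<'a> AffinityGuard<'a> {
    fn switch_to(thread: &'a FakeThread, number: u8) -> Option<Self> {
        let mask = affinity_mask(number)?;
        // Remember the previous mask so that Drop can put it back.
        let old_affinity = thread.affinity.replace(mask);
        Some(Self { thread, old_affinity })
    }
}

impl Drop for AffinityGuard<'_> {
    fn drop(&mut self) {
        self.thread.affinity.set(self.old_affinity);
    }
}
```

Because the restore step lives in Drop, it also runs when the code between creating the guard and the end of the scope returns early with an error, which is useful when a per-processor operation is called with the ? operator.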
Now we have a Hypervisor
struct and a HypervisorBuilder
struct for virtualization. The HypervisorBuilder
struct has a construct()
operate that creates a brand new Hypervisor
occasion and returns it as a Outcome
. The construct()
operate checks whether or not the CPU is an Intel processor and whether or not it helps the Digital Machine Extension (VMX) expertise. If the CPU and VMX are supported, the operate creates and populates a vector (Vec
) of digital CPUs (Vcpu
), one per obtainable processor, and initializes a brand new Hypervisor
occasion with the vector of digital CPUs (Vcpu
).
The Hypervisor struct has three methods:
- The builder() function returns a new HypervisorBuilder instance.
- The virtualize() function virtualizes all of the available processors by calling ProcessorExecutor::switch_to_processor() for each processor and then calling the virtualize_cpu() method on each Vcpu instance in the processors vector.
- The devirtualize() function devirtualizes all of the available processors by calling ProcessorExecutor::switch_to_processor() for each processor and then calling the devirtualize_cpu() method on each Vcpu instance in the processors vector.
The virtualize() and devirtualize() functions use the ProcessorExecutor struct to switch execution to each processor temporarily and then switch back once the virtualization or devirtualization operation is complete.
Overall, this module provides a way to build a Hypervisor instance that covers all available processors, along with methods for virtualizing and devirtualizing them using the Vcpu struct and the ProcessorExecutor struct.
#[derive(Default)]
pub struct HypervisorBuilder;

impl HypervisorBuilder {
    pub fn build(self) -> Result<Hypervisor, HypervisorError> {
        //
        // 1) Intel Manual: 24.6 Discover Support for Virtual Machine Extension (VMX)
        //
        support::has_intel_cpu()?;
        log::info!("[+] CPU is Intel");

        support::has_vmx_support()?;
        log::info!("[+] Virtual Machine Extension (VMX) technology is supported");

        // One Vcpu per logical processor.
        let mut processors: Vec<Vcpu> = Vec::new();
        for i in 0..processor_count() {
            processors.push(Vcpu::new(i)?);
        }
        log::info!("[+] Found {} processors", processors.len());

        Ok(Hypervisor { processors })
    }
}

pub struct Hypervisor {
    processors: Vec<Vcpu>,
}

impl Hypervisor {
    pub fn builder() -> HypervisorBuilder {
        HypervisorBuilder::default()
    }

    pub fn virtualize(&mut self) -> Result<(), HypervisorError> {
        log::info!("[+] Virtualizing processors");
        for processor in self.processors.iter_mut() {
            // Pin the current thread to the target processor for this iteration.
            let Some(executor) = ProcessorExecutor::switch_to_processor(processor.id()) else {
                return Err(HypervisorError::ProcessorSwitchFailed);
            };
            processor.virtualize_cpu()?;
            // Dropping the executor restores the previous affinity.
            core::mem::drop(executor);
        }
        Ok(())
    }

    pub fn devirtualize(&mut self) -> Result<(), HypervisorError> {
        log::info!("[+] Devirtualizing processors");
        for processor in self.processors.iter_mut() {
            let Some(executor) = ProcessorExecutor::switch_to_processor(processor.id()) else {
                return Err(HypervisorError::ProcessorSwitchFailed);
            };
            processor.devirtualize_cpu()?;
            core::mem::drop(executor);
        }
        Ok(())
    }
}
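The has_intel_cpu() and has_vmx_support() helpers are not shown in this section. As a rough sketch of what such checks could look like, assuming they are CPUID-based: leaf 0 returns the vendor string in EBX, EDX and ECX, and leaf 1 reports VMX in bit 5 of ECX. The function names below (vendor_from_regs, is_intel, vmx_supported, query_cpu) are made up for the illustration and are not the hypervisor's actual API.

```rust
/// Reassemble the 12-byte vendor string from CPUID leaf 0.
/// The register order is EBX, EDX, ECX.
pub fn vendor_from_regs(ebx: u32, edx: u32, ecx: u32) -> [u8; 12] {
    let mut vendor = [0u8; 12];
    vendor[0..4].copy_from_slice(&ebx.to_le_bytes());
    vendor[4..8].copy_from_slice(&edx.to_le_bytes());
    vendor[8..12].copy_from_slice(&ecx.to_le_bytes());
    vendor
}

/// True when the vendor string identifies an Intel processor.
pub fn is_intel(vendor: &[u8; 12]) -> bool {
    vendor == b"GenuineIntel"
}

/// CPUID.1:ECX bit 5 reports VMX support.
pub fn vmx_supported(leaf1_ecx: u32) -> bool {
    leaf1_ecx & (1 << 5) != 0
}

/// Query the running CPU (x86_64 only). Returns (is_intel, has_vmx).
#[cfg(target_arch = "x86_64")]
pub fn query_cpu() -> (bool, bool) {
    use core::arch::x86_64::__cpuid;
    let leaf0 = unsafe { __cpuid(0) };
    let leaf1 = unsafe { __cpuid(1) };
    let vendor = vendor_from_regs(leaf0.ebx, leaf0.edx, leaf0.ecx);
    (is_intel(&vendor), vmx_supported(leaf1.ecx))
}
```

Keeping the decoding separate from the CPUID instruction itself means the bit logic can be tested on any machine, including one without VT-x.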
This follows a similarly neat structure to the amd_hypervisor made by @not-matthias, which will help integrate the two open-source projects if required.
We create a Windows kernel driver in Rust. When the driver is loaded, the driver_entry function is called automatically. In it, we initialize a logger and set the driver unload function to driver_unload. We then attempt to virtualize the processors by calling virtualize() and checking the result with is_none(). If virtualization fails, we return STATUS_UNSUCCESSFUL, and if it succeeds, we return STATUS_SUCCESS.
The virtualize() function is responsible for virtualizing the processors using the hypervisor module. To do this, we create a new builder using Hypervisor::builder() and attempt to build the hypervisor using hv.build(). If the build fails, we log an error message and return None. If the build succeeds, we attempt to virtualize the processors using hypervisor.virtualize(). If that fails, we log an error message and return None. If it succeeds, we log a success message, save the hypervisor in a static mutable variable called HYPERVISOR, and return Some(()).
When our driver is unloaded, the driver_unload function is called automatically. It takes the hypervisor out of HYPERVISOR and devirtualizes the processors using the hypervisor module. If devirtualization succeeds, we log a success message, and if it fails, we log the error.
static mut HYPERVISOR: Option<Hypervisor> = None;

#[no_mangle]
pub extern "system" fn driver_entry(driver: &mut DRIVER_OBJECT, _: &UNICODE_STRING) -> NTSTATUS {
    KernelLogger::init(LevelFilter::Info).expect("Failed to initialize logger");
    log::info!("Driver Entry called");

    // Register the unload routine so the driver can be stopped cleanly.
    driver.DriverUnload = Some(driver_unload);

    if virtualize().is_none() {
        log::error!("Failed to virtualize processors");
        return STATUS_UNSUCCESSFUL;
    }

    STATUS_SUCCESS
}

pub extern "system" fn driver_unload(_driver: &mut DRIVER_OBJECT) {
    log::info!("Driver unloaded successfully!");

    // Take ownership back from the static so the hypervisor is dropped here.
    if let Some(mut hypervisor) = unsafe { HYPERVISOR.take() } {
        match hypervisor.devirtualize() {
            Ok(_) => log::info!("[+] Devirtualized successfully!"),
            Err(err) => log::error!("[-] Failed to devirtualize {}", err),
        }
    }
}

fn virtualize() -> Option<()> {
    let hv = Hypervisor::builder();

    let Ok(mut hypervisor) = hv.build() else {
        log::error!("[-] Failed to build hypervisor");
        return None;
    };

    match hypervisor.virtualize() {
        Ok(_) => log::info!("[+] VMM initialized"),
        Err(err) => {
            log::error!("[-] VMM initialization failed: {}", err);
            return None;
        }
    }

    // Keep the hypervisor alive after driver_entry returns.
    unsafe { HYPERVISOR = Some(hypervisor) }

    Some(())
}
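The HYPERVISOR static follows a store-on-load, take-on-unload lifecycle. It relies on unsafe access to a static mut, and newer Rust editions lint against taking references to one. The sketch below shows the same lifecycle in user mode with std::sync::Mutex as a stand-in. This is an assumption made purely for illustration: a kernel driver has no std and would need a kernel-appropriate lock instead. The Hypervisor type here is a dummy, and store() and take() are hypothetical helper names.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the real Hypervisor type.
#[derive(Debug, PartialEq)]
pub struct Hypervisor {
    pub processors: usize,
}

// Global slot that holds the hypervisor between load and unload.
static HYPERVISOR: Mutex<Option<Hypervisor>> = Mutex::new(None);

/// Called on load: park the hypervisor so it outlives the entry function.
pub fn store(hv: Hypervisor) {
    *HYPERVISOR.lock().unwrap() = Some(hv);
}

/// Called on unload: take ownership back, leaving None behind.
pub fn take() -> Option<Hypervisor> {
    HYPERVISOR.lock().unwrap().take()
}
```

Option::take() is what makes unload idempotent: a second call finds None and does nothing.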
We can now test our code by creating a service and starting it to load our Windows kernel driver.
sc.exe create hypervisor type= kernel binPath= C:\Windows\System32\drivers\hypervisor.sys
sc.exe query hypervisor
sc.exe start hypervisor
The output is shown in WinDbg:
INFO [driver] Driver Entry called
INFO [hypervisor] [+] CPU is Intel
INFO [hypervisor] [+] Virtual Machine Extension (VMX) technology is supported
INFO [hypervisor] [+] Found 2 processors
INFO [hypervisor] [+] Virtualizing processors
INFO [hypervisor::vcpu] [+] Enabling Virtual Machine Extensions (VMX)
INFO [hypervisor::support] [+] Lock bit set via IA32_FEATURE_CONTROL
INFO [hypervisor::vcpu] [+] Adjusting Control Registers
INFO [hypervisor::support] [+] Mandatory bits in CR0 set/cleared
INFO [hypervisor::support] [+] Mandatory bits in CR4 set/cleared
INFO [hypervisor::vcpu] [+] Initializing VcpuData
INFO [hypervisor::vcpu_data] [+] init_vmxon_region
INFO [hypervisor::vcpu_data] [+] VMXON Region Virtual Address: 0xffffa3801098a000
INFO [hypervisor::vcpu_data] [+] VMXON Region Physical Addresss: 0x23ffc1000
INFO [hypervisor::vcpu_data] [+] VMXON successful!
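Two lines in the log above come down to simple bit arithmetic: setting the lock bit in IA32_FEATURE_CONTROL, and forcing the mandatory bits in CR0 and CR4. Here is a small sketch of that logic as pure functions, without the actual MSR and control register reads and writes. The function names are hypothetical and this is not the code that produced the log. The bit positions are lock at bit 0 and VMXON-outside-SMX at bit 2. For the control registers, bits set in IA32_VMX_CRx_FIXED0 must be 1 and bits clear in IA32_VMX_CRx_FIXED1 must be 0.

```rust
const FEATURE_CONTROL_LOCKED: u64 = 1 << 0;
const FEATURE_CONTROL_VMXON_OUTSIDE_SMX: u64 = 1 << 2;

/// Decide what to write to IA32_FEATURE_CONTROL, given its current value.
/// Ok(Some(v)): write v. Ok(None): already configured. Err: locked with VMX off.
pub fn plan_feature_control(current: u64) -> Result<Option<u64>, &'static str> {
    if current & FEATURE_CONTROL_LOCKED == 0 {
        // Not locked yet: enable VMXON outside SMX and lock the MSR.
        return Ok(Some(
            current | FEATURE_CONTROL_LOCKED | FEATURE_CONTROL_VMXON_OUTSIDE_SMX,
        ));
    }
    if current & FEATURE_CONTROL_VMXON_OUTSIDE_SMX == 0 {
        // Locked by firmware with VMX disabled: nothing we can do at runtime.
        return Err("VMX locked off by firmware");
    }
    Ok(None)
}

/// Apply the FIXED0 / FIXED1 constraints to a CR0 or CR4 value.
pub fn apply_fixed_bits(cr: u64, fixed0: u64, fixed1: u64) -> u64 {
    // Set every bit required to be 1, then clear every bit required to be 0.
    (cr | fixed0) & fixed1
}
```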
Congratulations! You have completed the first part of the Intel VT-x Hypervisor Development in Rust series. I hope you enjoyed it.
Credits / References / Thanks / Motivation
Thanks to @daax_rynd, @Intel80x86, @not_matthias, @standa_t, and @felix-rs / @joshuа
Full Disclaimer: Thanks to OpenAI’s ChatGPT and Grammarly for helping me write some parts of the blog. This was the first time I used ChatGPT in my blog, and it might be the last, as I believe that you learn more if you write everything yourself, even if your grammar, spelling, or descriptions are not great. I learned a lot from reading, writing, and porting code and from reading @daax_rynd’s, @Intel80x86’s and @not_matthias’s work. If you use ChatGPT, I highly recommend rewording its output as much as possible and giving credit to whoever deserves it.