Hypervisor Development in Rust Part 1

2023-04-15 07:11:19

This article covers the development of a minimalistic Intel VT-x research hypervisor in Rust. We will use the x86 crate and its documentation, which help simplify the code.

The knowledge needed to build this hypervisor came from reading blogs and code, notably the two excellent free hypervisor development series by @daax_rynd and @Intel80x86. The primary motivation came shortly after @not_matthias released an AMD (SVM) Hypervisor in Rust, and from Secret Club's excellent articles.

The majority of the hypervisor was already developed before the legendary @tandasat released Hypervisor 101 in Rust.

Virtual Machine Architecture

Virtual Machine Monitor (VMM): A VMM acts as a host and has full control of the platform's processor(s) and other hardware. A VMM presents guest software with an abstraction of a virtual processor and allows it to execute directly on a logical processor. A VMM can retain selective control of processor resources, physical memory, interrupt handling, and I/O.

Guest Software: Any software that runs inside a virtual machine (VM) managed by a virtual machine monitor (VMM) or hypervisor is called guest software. Each VM supports an operating system (OS) stack and application software as a guest software environment. Each VM runs independently of the others and uses the same interface to the physical platform's processor(s), memory, storage, graphics, and I/O. The software stack behaves as if it were running on a platform with no VMM. So that the VMM can retain control of platform resources, software running in a virtual machine must run with reduced privileges.

Introduction to Virtual Machine Extension (VMX) Operation

VMX operation is the processor mode that provides hardware support for virtualization. The VMM runs in VMX root operation and guest software runs in VMX non-root operation; the processor switches between the two through VM entries and VM exits. This low-level support in the processor is what allows the VMM to create and manage virtual machines.

Life Cycle of Virtual Machine Monitor (VMM) Software

The VMM enters and leaves guest execution through low-level hardware transitions called VM entry and VM exit. Two other instructions, VMXON and VMXOFF, enable and disable VMX operation, the processor's implementation of hardware virtualization that supports VMMs. In essence, VMXON and VMXOFF allow the VMM to create and run virtual machines, while VM entries and VM exits move the processor between the host (VMM) and the guest.

Figure: Interaction of a Virtual-Machine Monitor and Guests

Credits: Intel® 64 and IA-32 Architectures Software Developer's Manual
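The life cycle described above can be sketched as a small state machine. This is only an illustrative user-mode model, not hypervisor code: the `VmxState`, `VmxEvent`, and `step` names are made up for this example, and "root" and "non-root" are the Intel manual's terms for the VMM side and the guest side of VMX operation.

```rust
/// Simplified model of a logical processor's VMX state (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VmxState {
    /// VMX operation is disabled (before VMXON or after VMXOFF).
    Off,
    /// VMX root operation: the VMM is running.
    Root,
    /// VMX non-root operation: guest software is running.
    NonRoot,
}

/// Events that move the processor between states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VmxEvent {
    Vmxon,
    /// A VM entry, caused by VMLAUNCH or VMRESUME.
    VmEntry,
    VmExit,
    Vmxoff,
}

/// Returns the next state, or `None` if the event is not valid in `state`.
pub fn step(state: VmxState, event: VmxEvent) -> Option<VmxState> {
    use VmxEvent::*;
    use VmxState::*;
    match (state, event) {
        (Off, Vmxon) => Some(Root),
        (Root, VmEntry) => Some(NonRoot),
        (NonRoot, VmExit) => Some(Root),
        (Root, Vmxoff) => Some(Off),
        // Everything else (for example a VM entry before VMXON) is rejected.
        _ => None,
    }
}
```

Feeding the sequence VMXON, VM entry, VM exit, VMXOFF through `step` returns the model to `Off`, which mirrors the loop in the figure.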

Virtual-Machine Control Structure (VMCS)

The VMM controls and manages a virtual machine's execution through a virtual-machine control structure (VMCS).
The VMCS contains the virtual machine's state, the settings for the virtual processor, and the mapping between virtual and physical resources.

The VMM manipulates the VMCS with a set of low-level instructions. The VMCS pointer, which lets the VMM select the VMCS for a particular VM, is read with VMPTRST and loaded with VMPTRLD. The VMREAD and VMWRITE instructions read and write values in the VMCS, allowing the VMM to inspect or modify the virtual machine's state. VMCLEAR is used to clear the contents of a VMCS, for example when a virtual machine is terminated or its state must be reset.

Each VMCS in use on a physical computer's logical processors corresponds to a particular virtual machine. As a result, the VMM can supervise and manage many virtual machines on a single physical machine. The VMCS and its related instructions give the VMM the control it needs to create, monitor, and manage the execution of virtual machines on logical processors.

Discovering Support for Virtual Machine Extension (VMX)

When developing a hypervisor, it is essential to determine whether Intel or AMD built the CPU, because each vendor has its own virtualization technology with its own capabilities and instructions. Identifying the processor type and using the matching approach is what allows a hypervisor to work across different systems.

The CPUID instruction can be used to determine whether Virtual Machine Extension (VMX) / Intel Virtualization Technology is supported. When CPUID is executed with the EAX register set to 1, the processor returns feature information in the EAX, EBX, ECX, and EDX registers. If the processor supports VMX, bit 5 of ECX is set to 1. If the bit is clear, the processor does not support VMX and hardware virtualization is unavailable.
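As a rough illustration of the bit test itself, the check can be written without any crate. The function names here are invented for this sketch; `ecx_reports_vmx` is a pure function over a raw ECX value, and `cpu_reports_vmx` (compiled only on x86_64) feeds it the real CPUID leaf 1 result through the standard library intrinsic.

```rust
/// Bit position of the VMX flag in CPUID.1:ECX.
const CPUID_1_ECX_VMX_BIT: u32 = 5;

/// Returns true if the given CPUID.1:ECX value reports VMX support.
pub fn ecx_reports_vmx(ecx: u32) -> bool {
    ((ecx >> CPUID_1_ECX_VMX_BIT) & 1) == 1
}

/// Queries the real CPU. Only available when compiling for x86_64.
#[cfg(target_arch = "x86_64")]
pub fn cpu_reports_vmx() -> bool {
    // CPUID leaf 1 returns feature flags in ECX and EDX.
    #[allow(unused_unsafe)]
    let leaf1 = unsafe { core::arch::x86_64::__cpuid(1) };
    ecx_reports_vmx(leaf1.ecx)
}
```

Keeping the bit test separate from the CPUID call makes it easy to unit test on any machine, including ones without VT-x.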

Rust

We check whether Intel makes the CPU by examining the CPUID information using the Rust x86 crate. Specifically, we check whether the vendor information returned by the CPUID instruction equals "GenuineIntel". If the vendor information indicates an Intel CPU, we return an Ok result; otherwise, we return an error indicating that the hypervisor does not support the CPU.

use x86::cpuid::CpuId;

/// Check to see if CPU is Intel ("GenuineIntel").
pub fn has_intel_cpu() -> Result<(), HypervisorError> {
    let cpuid = CpuId::new();
    if let Some(vi) = cpuid.get_vendor_info() {
        if vi.as_str() == "GenuineIntel" {
            return Ok(());
        }
    }
    Err(HypervisorError::CPUUnsupported)
}

Rust

We check whether the processor supports Virtual Machine Extension (VMX) technology by testing whether bit 5 of the ECX register is set to 1 after executing CPUID. We use the Rust x86 crate to get the CPUID information and read the feature information to see whether the processor has VMX support. If the processor supports VMX, we return an Ok result; otherwise, we return an error indicating that VMX is not supported.

/// Check processor support for Virtual Machine Extension (VMX) technology - CPUID.1:ECX.VMX[bit 5] = 1 (Intel Manual: 24.6 Discovering Support for VMX)
pub fn has_vmx_support() -> Result<(), HypervisorError> {
    let cpuid = CpuId::new();
    if let Some(fi) = cpuid.get_feature_info() {
        if fi.has_vmx() {
            return Ok(());
        }
    }
    Err(HypervisorError::VMXUnsupported)
}

Rust

We use a custom HypervisorError enum to handle errors, created with the thiserror-no-std crate.

use thiserror_no_std::Error;

#[derive(Error, Debug)]
pub enum HypervisorError {
    #[error("Intel CPU not found")]
    CPUUnsupported,
    
    #[error("VMX is not supported")]
    VMXUnsupported,
    
    #[error("VMX locked off in BIOS")]
    VMXBIOSLock,
    
    #[error("Failed to allocate memory via PhysicalAllocator")]
    MemoryAllocationFailed(#[from] core::alloc::AllocError),
    
    #[error("Failed to convert from virtual address to physical address")]
    VirtualToPhysicalAddressFailed,
    
    #[error("Failed to execute VMXON")]
    VMXONFailed,
    
    #[error("Failed to execute VMXOFF")]
    VMXOFFFailed,
    
    #[error("Failed to execute VMCLEAR")]
    VMCLEARFailed,

    #[error("Failed to execute VMPTRLD")]
    VMPTRLDFailed,
    
    #[error("Failed to execute VMREAD")]
    VMREADFailed,
    
    #[error("Failed to execute VMWRITE")]
    VMWRITEFailed,
    
    #[error("Failed to execute VMLAUNCH")]
    VMLAUNCHFailed,

    #[error("Failed to execute VMRESUME")]
    VMRESUMEFailed,
    
    #[error("Failed to switch processor")]
    ProcessorSwitchFailed,
    
    #[error("Failed to access VCPU table")]
    VcpuIsNone,
}

Enabling and Entering Virtual Machine Extension (VMX) Operation

To run virtual machines, the CPU must operate in a hardware virtualization mode, which Virtual Machine Extensions (VMX) make possible. System software first sets CR4.VMXE[bit 13] to 1 to enable VMX. This bit lives in control register CR4, which governs several of the processor's operating modes. Once the VMXE bit is set, system software can execute the VMXON instruction to enter VMX operation.

If VMXON is executed while CR4.VMXE = 0, an invalid-opcode exception (#UD) is raised: because VMX is not enabled, the CPU does not recognize the VMXON instruction. After the processor enters VMX operation, the CR4.VMXE bit cannot be cleared. System software must therefore leave VMX operation with the VMXOFF instruction before CR4.VMXE can be cleared.

Rust

We have a function called enable_vmx_operation() that enables virtual machine extensions (VMX). We do this by setting a specific bit (bit 13) in the CR4 control register to 1. We first read the current value of CR4 using the controlregs::cr4() function, then set the appropriate bit using the set() method of the Cr4 struct, and finally write the updated value back to CR4 using the controlregs::cr4_write() function.

In addition to setting the CR4 bit, we call the set_lock_bit() function, which sets the lock bit in the IA32_FEATURE_CONTROL register, and we log a message indicating that the lock bit has been set. If everything goes well, we return a Result with an Ok value indicating success. If an error occurs, we return a Result with an Err value containing a HypervisorError.

/// Enables Virtual Machine Extensions - CR4.VMXE[bit 13] = 1 (Intel Manual: 24.7 Enabling and Entering VMX Operation)
pub fn enable_vmx_operation() -> Result<(), HypervisorError> {
    let mut cr4 = unsafe { controlregs::cr4() };
    cr4.set(controlregs::Cr4::CR4_ENABLE_VMX, true);
    unsafe { controlregs::cr4_write(cr4) };

    set_lock_bit()?;
    log::info!("[+] Lock bit set via IA32_FEATURE_CONTROL");

    Ok(())
}

The IA32_FEATURE_CONTROL MSR is a model-specific register that controls processor features, including VMX capability. This register is zeroed when a logical processor is reset. Bits 0 through 2 are relevant to VMXON. Bit 0 is the lock bit: if it is clear, VMXON fails with a general-protection exception, and once it is set the MSR cannot be modified until a power-up reset. The BIOS can therefore disable VMX capability by setting the lock bit while leaving bit 1, bit 2, or both clear.

  • Bit 1 enables VMXON in SMX (Safer Mode Extensions) operation. If this bit is clear, executing VMXON in SMX operation causes a general-protection exception.
  • Bit 2 enables VMXON outside SMX operation. If this bit is clear, executing VMXON outside SMX operation causes a general-protection exception. Attempting to set this bit on a logical processor that does not support VMX operation also causes a general-protection exception.

To activate VMX, both the IA32_FEATURE_CONTROL MSR and the CR4.VMXE control bit must be configured. With the lock bit set and bit 1 or bit 2 set as appropriate, the processor can enter VMX operation and run virtual machines using VMX instructions.

Rust

We first read the current value of the IA32_FEATURE_CONTROL MSR to see whether the lock bit is already set. If it is not set, we set the lock bit together with the VMXON_OUTSIDE_SMX bit and write the new value back to the IA32_FEATURE_CONTROL MSR. If the lock bit is already set but the VMXON_OUTSIDE_SMX bit is not, we return an error indicating that the BIOS has locked the VMX feature off.

/// Check if we need to set bits in IA32_FEATURE_CONTROL (Intel Manual: 24.7 Enabling and Entering VMX Operation)
fn set_lock_bit() -> Result<(), HypervisorError> {
    const VMX_LOCK_BIT: u64 = 1 << 0;
    const VMXON_OUTSIDE_SMX: u64 = 1 << 2;

    let ia32_feature_control = unsafe { msr::rdmsr(msr::IA32_FEATURE_CONTROL) };

    if (ia32_feature_control & VMX_LOCK_BIT) == 0 {
        // Unlocked: enable VMXON outside SMX and lock the MSR, keeping the other bits.
        unsafe {
            msr::wrmsr(
                msr::IA32_FEATURE_CONTROL,
                VMXON_OUTSIDE_SMX | VMX_LOCK_BIT | ia32_feature_control,
            )
        };
    } else if (ia32_feature_control & VMXON_OUTSIDE_SMX) == 0 {
        // Locked with VMXON outside SMX disabled: nothing we can do until reset.
        return Err(HypervisorError::VMXBIOSLock);
    }

    Ok(())
}
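One way to make this logic testable outside the kernel is to separate the decision from the privileged MSR access. The following is a minimal sketch under that assumption; `FeatureControlAction` and `decide` are hypothetical names that do not appear in the hypervisor itself.

```rust
const LOCK_BIT: u64 = 1 << 0;
const VMXON_OUTSIDE_SMX: u64 = 1 << 2;

/// What the caller should do for a given IA32_FEATURE_CONTROL value.
#[derive(Debug, PartialEq, Eq)]
pub enum FeatureControlAction {
    /// The MSR is unlocked: write this value to enable VMXON and lock the MSR.
    Write(u64),
    /// The MSR is locked with VMXON outside SMX enabled: nothing to do.
    AlreadyEnabled,
    /// The MSR is locked with VMXON outside SMX disabled: firmware turned VMX off.
    LockedOff,
}

/// Pure decision function over a raw IA32_FEATURE_CONTROL value.
pub fn decide(msr_value: u64) -> FeatureControlAction {
    if msr_value & LOCK_BIT == 0 {
        // Preserve whatever other bits firmware already set.
        FeatureControlAction::Write(msr_value | LOCK_BIT | VMXON_OUTSIDE_SMX)
    } else if msr_value & VMXON_OUTSIDE_SMX == 0 {
        FeatureControlAction::LockedOff
    } else {
        FeatureControlAction::AlreadyEnabled
    }
}
```

A kernel-mode caller would read the MSR, pass the value to `decide`, and only perform the write or return the BIOS-lock error based on the result.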

Restrictions on VMX Operation (Adjusting Control Registers)

To ensure that VMX operation works as intended, specific bits in the control registers CR0 and CR4 must be fixed to particular values. VMXON fails if any of these bits holds an unsupported value. While the processor is in VMX operation, any attempt to set one of these bits to an unsupported value causes a general-protection exception. Software should consult the VMX capability MSRs IA32_VMX_CR0_FIXED0, IA32_VMX_CR0_FIXED1, IA32_VMX_CR4_FIXED0, and IA32_VMX_CR4_FIXED1 to find out which bits in CR0 and CR4 are fixed and how they must be set.

Rust

We have implemented functions that adjust the CR0 and CR4 control registers for virtualization. They make sure the mandatory bits in the control registers are set and cleared correctly. There are two of them: set_cr0_bits() and set_cr4_bits(). The former sets the mandatory one bits in CR0 and clears the mandatory zero bits, and the latter does the same for CR4.

To adjust CR0 and CR4, we read the IA32_VMX_CR0_FIXED0, IA32_VMX_CR0_FIXED1, IA32_VMX_CR4_FIXED0, and IA32_VMX_CR4_FIXED1 Model-Specific Registers (MSRs) to determine which bits must be set and which must be cleared. We then use from_bits_truncate() to convert the raw values into the Cr0 and Cr4 types (dropping any bits those types do not define), set the mandatory one bits with the bitwise OR operator, and clear the mandatory zero bits with the bitwise AND operator. Finally, we write the resulting value back to CR0 or CR4 using cr0_write() or cr4_write().

We have also defined a higher-level function, adjust_control_registers(), that calls both set_cr0_bits() and set_cr4_bits(). It sets and clears the mandatory bits in both CR0 and CR4 and logs a message for each register.

/// Set and clear the mandatory bits in CR0 and CR4
pub fn adjust_control_registers() {
    set_cr0_bits();
    log::info!("[+] Mandatory bits in CR0 set/cleared");

    set_cr4_bits();
    log::info!("[+] Mandatory bits in CR4 set/cleared");
}

/// Set the mandatory bits in CR0 and clear bits that are mandatory zero (Intel Manual: 24.8 Restrictions on VMX Operation)
fn set_cr0_bits() {
    let ia32_vmx_cr0_fixed0 = unsafe { msr::rdmsr(msr::IA32_VMX_CR0_FIXED0) };
    let ia32_vmx_cr0_fixed1 = unsafe { msr::rdmsr(msr::IA32_VMX_CR0_FIXED1) };

    let mut cr0 = unsafe { controlregs::cr0() };

    cr0 |= controlregs::Cr0::from_bits_truncate(ia32_vmx_cr0_fixed0 as usize);
    cr0 &= controlregs::Cr0::from_bits_truncate(ia32_vmx_cr0_fixed1 as usize);

    unsafe { controlregs::cr0_write(cr0) };
}

/// Set the mandatory bits in CR4 and clear bits that are mandatory zero (Intel Manual: 24.8 Restrictions on VMX Operation)
fn set_cr4_bits() {
    let ia32_vmx_cr4_fixed0 = unsafe { msr::rdmsr(msr::IA32_VMX_CR4_FIXED0) };
    let ia32_vmx_cr4_fixed1 = unsafe { msr::rdmsr(msr::IA32_VMX_CR4_FIXED1) };

    let mut cr4 = unsafe { controlregs::cr4() };

    cr4 |= controlregs::Cr4::from_bits_truncate(ia32_vmx_cr4_fixed0 as usize);
    cr4 &= controlregs::Cr4::from_bits_truncate(ia32_vmx_cr4_fixed1 as usize);

    unsafe { controlregs::cr4_write(cr4) };
}
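The fixed-bit rule itself is plain bit arithmetic: a bit that is 1 in FIXED0 must be 1, and a bit that is 0 in FIXED1 must be 0. A small user-mode sketch with made-up register values shows the effect; the helper names are illustrative and operate on raw u64 values instead of the x86 crate's Cr0 and Cr4 types.

```rust
/// Forces the mandatory-one bits on and the mandatory-zero bits off.
pub fn apply_fixed_bits(value: u64, fixed0: u64, fixed1: u64) -> u64 {
    (value | fixed0) & fixed1
}

/// Returns true if `value` already satisfies both fixed-bit masks.
pub fn satisfies_fixed_bits(value: u64, fixed0: u64, fixed1: u64) -> bool {
    apply_fixed_bits(value, fixed0, fixed1) == value
}
```

For example, with fixed0 = 0b0001 and fixed1 = 0b0111, the value 0b1110 becomes 0b0111: bit 0 is forced on and bit 3 is forced off.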

VMXON Region

Before enabling virtual machine extensions (VMX) operation, software must allocate a region of memory called the VMXON region, which the logical processor uses to support VMX operation. The operand of the VMXON instruction is the physical address of this region.

The VMXON pointer must meet certain requirements: it must be 4-KByte aligned and must not set any bits beyond the processor's physical-address width. Software must use a separate region for each logical processor and must write the VMCS revision identifier (VMCS ID) to the VMXON region before executing VMXON. Accessing or modifying a logical processor's VMXON region between the execution of VMXON and VMXOFF may lead to unpredictable behaviour.
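The two pointer requirements can be expressed as a simple check. This is a sketch only: `is_valid_vmxon_pointer` is a hypothetical helper, and the physical-address width is passed in as a parameter (on real hardware it is reported by CPUID leaf 0x80000008 in EAX[7:0]).

```rust
/// Checks the alignment and range requirements for a VMXON pointer.
pub fn is_valid_vmxon_pointer(pa: u64, phys_addr_width: u32) -> bool {
    // The low 12 bits must be zero for 4-KByte alignment.
    let aligned = pa & 0xFFF == 0;
    // Bits at or above the physical-address width must be zero.
    let in_range = phys_addr_width >= 64 || (pa >> phys_addr_width) == 0;
    aligned && in_range
}
```

A check like this is cheap to run before VMXON and turns a hard-to-diagnose instruction failure into an explicit error.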

Rust

Fortunately for us, @not-matthias already has a kernel-alloc crate in Rust ready for community use.

The PhysicalAllocator is a custom allocator that allocates physical memory in Windows kernel mode. When you allocate memory with this allocator, it calls the MmAllocateContiguousMemorySpecifyCacheNode function to allocate contiguous physical memory. If the allocation succeeds, it returns a pointer to the allocated memory. If it fails, it returns an AllocError. When you deallocate memory with this allocator, it calls the MmFreeContiguousMemory function to free the previously allocated memory. Custom allocators like this can back Rust's heap-allocated data types such as String, Vec, and Box.

If you want to find out more, please refer to alloc::GlobalAlloc or alloc::Allocator and the Rust documentation for global_allocator or allocator_api.

use core::alloc::{AllocError, Allocator, Layout};
use core::ptr::NonNull;

/// The physical kernel allocator structure.
pub struct PhysicalAllocator;

unsafe impl Allocator for PhysicalAllocator {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        let mut boundary: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };
        let mut lowest: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };
        let mut highest: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };

        // Accept any physical range: lowest = 0, highest = all bits set, no boundary.
        unsafe { *(boundary.QuadPart_mut()) = 0 };
        unsafe { *(lowest.QuadPart_mut()) = 0 };
        unsafe { *(highest.QuadPart_mut()) = -1 };

        let memory = unsafe {
            MmAllocateContiguousMemorySpecifyCacheNode(
                layout.size(),
                lowest,
                highest,
                boundary,
                MmCached,
                MM_ANY_NODE_OK,
            )
        } as *mut u8;
        if memory.is_null() {
            Err(AllocError)
        } else {
            let slice = unsafe { core::slice::from_raw_parts_mut(memory, layout.size()) };
            Ok(unsafe { NonNull::new_unchecked(slice) })
        }
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, _layout: Layout) {
        MmFreeContiguousMemory(ptr.cast().as_ptr());
    }
}

We define a struct called VmxonRegion, which represents a VMXON region in memory. This region must be aligned to the page size of 4096 bytes (0x1000 in hexadecimal). The VmxonRegion structure contains two fields: revision_id and data. The revision_id is a 32-bit unsigned integer that holds the VMCS revision identifier reported by the processor, and it takes up the first 4 bytes of the region. The data field is an array of 4092 bytes that covers the rest of the VMXON region. The repr(C, align(4096)) attribute ensures that VmxonRegion is laid out exactly as declared, with each instance occupying 4096 bytes at a 4096-byte-aligned address, so the processor can use the region without any issues.

pub const PAGE_SIZE: usize = 0x1000;

#[repr(C, align(4096))]
pub struct VmxonRegion {
    pub revision_id: u32,
    pub data: [u8; PAGE_SIZE - 4],
}
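The size and alignment claims can be verified at compile time. The sketch below repeats the struct so that it stands alone and adds two constant assertions; they are an optional safety net suggested here, not part of the original code.

```rust
use core::mem::{align_of, size_of};

pub const PAGE_SIZE: usize = 0x1000;

#[repr(C, align(4096))]
pub struct VmxonRegion {
    pub revision_id: u32,
    pub data: [u8; PAGE_SIZE - 4],
}

// Compile-time checks: the build fails if the layout ever stops matching a page.
const _: () = assert!(size_of::<VmxonRegion>() == PAGE_SIZE);
const _: () = assert!(align_of::<VmxonRegion>() == PAGE_SIZE);
```

Because the assertions are evaluated by the compiler, a later change to the struct that breaks the 4-KByte layout is caught before the driver is ever loaded.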

We define a function get_vmcs_revision_id that returns the Virtual Machine Control Structure (VMCS) revision ID. To get this revision ID, we read the IA32_VMX_BASIC Model-Specific Register (MSR) using the rdmsr function. We cast the returned value to a 32-bit unsigned integer and then bitwise AND it with 0x7FFF_FFFF to clear the high bit (bit 31), which is reserved. The resulting value is the VMCS revision ID, which we return.

/// Get the Virtual Machine Control Structure revision identifier (VMCS revision ID) (Intel Manual: 25.11.5 VMXON Region)
pub fn get_vmcs_revision_id() -> u32 {
    unsafe { (msr::rdmsr(msr::IA32_VMX_BASIC) as u32) & 0x7FFF_FFFF }
}

We also need to translate between virtual and physical addresses. For the physical-to-virtual direction we can use the undocumented MmGetVirtualForPhysical function. Luckily for us, we can reuse the code written by @not-matthias in his amd_hypervisor, since there is currently no crate for it.

We have two functions here. The first, physical_address, takes a pointer to a u64 and converts it to a physical address of type PAddr; it is used to convert virtual addresses to physical addresses. The second, va_from_pa, takes a physical address and converts it to a virtual address using the undocumented Windows kernel function MmGetVirtualForPhysical.

pub fn physical_address(ptr: *const u64) -> PAddr {
    PhysicalAddress::from_va(ptr as u64).0
}

fn va_from_pa(pa: u64) -> u64 {
    let mut physical_address: PHYSICAL_ADDRESS = unsafe { core::mem::zeroed() };
    unsafe { *(physical_address.QuadPart_mut()) = pa as i64 };

    unsafe { MmGetVirtualForPhysical(physical_address) as u64 }
}

The VcpuData struct represents the data associated with a virtual CPU in the hypervisor. It contains a field called vmxon_region, which is a zero-initialized, naturally aligned 4-KByte region of memory, and a field called vmxon_region_physical_address, which is that region's physical address. The new() function creates the VcpuData struct and allocates the VMXON region using a PhysicalAllocator. The init_vmxon_region() function resolves the region's physical address (returning an error if the virtual-to-physical translation fails), writes the VMCS revision ID into the region, and enters VMX operation by calling vmxon().

pub struct VcpuData {
    /// The virtual and physical address of the VMXON naturally aligned 4-KByte region of memory
    pub vmxon_region: Box<VmxonRegion, PhysicalAllocator>,
    pub vmxon_region_physical_address: u64,
}

impl VcpuData {
    pub fn new() -> Result<Box<Self>, HypervisorError> {
        let instance = Self {
            vmxon_region: unsafe { Box::try_new_zeroed_in(PhysicalAllocator)?.assume_init() },
            vmxon_region_physical_address: 0,
        };

        let mut instance = Box::new(instance);

        log::info!("[+] init_vmxon_region");
        instance.init_vmxon_region()?;

        // Return the initialized instance (the return value was missing here).
        Ok(instance)
    }

    /// Initialize the naturally aligned 4-KByte VMXON region of memory and enable VMX operation (Intel Manual: 25.11.5 VMXON Region)
    pub fn init_vmxon_region(&mut self) -> Result<(), HypervisorError> {
        self.vmxon_region_physical_address = physical_address(self.vmxon_region.as_ref() as *const _ as _).as_u64();

        if self.vmxon_region_physical_address == 0 {
            return Err(HypervisorError::VirtualToPhysicalAddressFailed);
        }

        log::info!("[+] VMXON Region Virtual Address: {:p}", self.vmxon_region);
        log::info!("[+] VMXON Region Physical Address: 0x{:x}", self.vmxon_region_physical_address);

        // The revision ID must be written before VMXON; bit 31 must be clear.
        self.vmxon_region.revision_id = support::get_vmcs_revision_id();
        self.vmxon_region.as_mut().revision_id.set_bit(31, false);

        support::vmxon(self.vmxon_region_physical_address)?;
        log::info!("[+] VMXON successful!");

        Ok(())
    }
}

The vmxon() function is just a wrapper around the x86 crate's vmxon() function, which executes vmxon <addr> in assembly. Wrappers like this are not strictly necessary, but they help with error handling.

/// Enable VMX operation.
pub fn vmxon(vmxon_pa: u64) -> Result<(), HypervisorError> {
    match unsafe { x86::bits64::vmx::vmxon(vmxon_pa) } {
        Ok(_) => Ok(()),
        Err(_) => Err(HypervisorError::VMXONFailed),
    }
}

Overall, the code above initializes a memory region and enables VMX operation for one virtual CPU in the hypervisor. However, we want to do this for every logical/virtual CPU.

Processors, Cores and Logical/Virtual Processors (VCPUs)

Processor: The primary part of a computer that performs mathematical, logical, input/output (I/O), and control operations is the processor, often called the central processing unit (CPU). It is responsible for executing instructions and controlling the flow of data within a computer system.

Cores: A core is a physical processing unit within a CPU that can execute instructions. So that it can work in parallel with other cores, each core typically has its own arithmetic logic unit (ALU), register set, and cache.

Logical Processor: A processing unit within a CPU that can execute a single thread of instructions is called a logical processor, also known as a virtual processor. Depending on the processor design, each physical core in current CPUs can host multiple logical processors.

Say we have 4 physical cores in our processor; this means 4 separate processing units in our CPU. Hyper-threading technology allows two threads to execute simultaneously on each core. As a result, there are eight logical processors, which the operating system sees as eight different CPUs.

Each logical processor has its own general-purpose registers, MSRs, VMCSs, and VMXON region. We must make sure the Virtual Machine Monitor (VMM) is set up on all logical processors. This lets us make full use of the CPU's capabilities and get the best performance for our virtualized workloads.

Rust

We have a struct called Vcpu that represents a virtual CPU. It has two fields: index, an integer that holds the index of the processor, and data, a OnceCell that holds a boxed VcpuData instance. The new() function takes an index as an argument and creates a new Vcpu instance with that index and an uninitialized data field.

The virtualize_cpu function is responsible for initializing the virtual CPU for virtualization. It first enables Virtual Machine Extensions (VMX), adjusts the control registers, and then initializes the VcpuData structure by calling get_or_try_init on the data field. The get_or_try_init function initializes the data field if it has not been initialized before, or returns the existing value if it has.

The devirtualize_cpu() function devirtualizes the CPU using the vmxoff instruction, which leaves VMX operation and returns control to the host operating system. The function returns a Result indicating whether the operation succeeded, along with any associated error information. The id() function returns the index of the current virtual processor, which is useful on multi-processor systems where we need to identify which processor is executing the code.

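pub struct Vcpu {
    /// The index of the processor.
    index: u32,

    /// Per-processor data, initialized on first virtualization.
    data: OnceCell<Box<VcpuData>>,
}

impl Vcpu {
    pub fn new(index: u32) -> Result<Self, HypervisorError> {
        log::trace!("Creating processor {}", index);

        Ok(Self {
            index,
            data: OnceCell::new(),
        })
    }

    /// Virtualize the CPU: enable VMX, adjust control registers, initialize VcpuData
    pub fn virtualize_cpu(&self) -> Result<(), HypervisorError> {
        // Body reconstructed from the description above; the `support::` paths are assumed.
        support::enable_vmx_operation()?;
        support::adjust_control_registers();
        self.data.get_or_try_init(|| VcpuData::new())?;

        Ok(())
    }

    /// Devirtualize the CPU using vmxoff
    pub fn devirtualize_cpu(&self) -> Result<(), HypervisorError> {
        support::vmxoff()?;
        Ok(())
    }

    /// Gets the index of the current logical/virtual processor
    pub fn id(&self) -> u32 {
        self.index
    }
}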

The vmxoff() function is just a wrapper around the x86 crate's vmxoff() function, which executes vmxoff in assembly.

/// Disable VMX operation.
pub fn vmxoff() -> Result<(), HypervisorError> {
    match unsafe { x86::bits64::vmx::vmxoff() } {
        Ok(_) => Ok(()),
        Err(_) => Err(HypervisorError::VMXOFFFailed),
    }
}

Once again, we can reuse the code written by @not-matthias in his amd_hypervisor, since there is currently no crate for it. The module provides utilities for managing processor affinity, which is the ability to control which processor(s) a thread can execute on.

The processor_count() function returns the number of processors available on the system using the Windows kernel function KeQueryActiveProcessorCountEx.

The current_processor_index() function returns the index of the processor currently executing the calling thread using the Windows kernel function KeGetCurrentProcessorNumberEx.

The processor_number_from_index() function takes an index and returns the corresponding PROCESSOR_NUMBER structure, which identifies the processor's group and its number within that group, using the Windows kernel function KeGetProcessorNumberFromIndex. If the index is out of range or the system call fails, the function returns None.

pub fn processor_count() -> u32 {
    unsafe { KeQueryActiveProcessorCountEx(ALL_PROCESSOR_GROUPS) }
}

pub fn current_processor_index() -> u32 {
    unsafe { KeGetCurrentProcessorNumberEx(core::ptr::null_mut()) }
}

/// Returns the processor number for the specified index.
fn processor_number_from_index(index: u32) -> Option<PROCESSOR_NUMBER> {
    let mut processor_number = MaybeUninit::uninit();

    let status = unsafe { KeGetProcessorNumberFromIndex(index, processor_number.as_mut_ptr()) };
    if NT_SUCCESS(status) {
        Some(unsafe { processor_number.assume_init() })
    } else {
        None
    }
}

The ProcessorExecutor struct quickly switches execution to a specified processor till it’s dropped. When an occasion of ProcessorExecutor is created with a legitimate processor index, the switch_to_processor() operate units the affinity of the calling thread to the required processor and yields execution to a different thread utilizing the Home windows kernel operate KeSetSystemGroupAffinityThread. If there may be an error setting the affinity or yielding execution, the operate returns None. When the ProcessorExecutor occasion is dropped, the unique processor affinity is restored utilizing the Home windows kernel operate KeRevertToUserGroupAffinityThread.

/// Switches execution to a selected processor till dropped.
pub struct ProcessorExecutor {
    old_affinity: MaybeUninit<GROUP_AFFINITY>,
}

impl ProcessorExecutor {
    pub fn switch_to_processor(i: u32) -> Choice<Self> {
        if i > processor_count() {
            log::error!("Invalid processor index: {}", i);
            return None;
        }

        let processor_number = processor_number_from_index(i)?;

        let mut old_affinity = MaybeUninit::uninit();
        let mut affinity: GROUP_AFFINITY = unsafe { core::mem::zeroed() };

        affinity.Group = processor_number.Group;
        affinity.Masks = 1 << processor_number.Quantity;
        affinity.Reserved[0] = 0;
        affinity.Reserved[1] = 0;
        affinity.Reserved[2] = 0;

        log::trace!("Switching execution to processor {}", i);
        unsafe { KeSetSystemGroupAffinityThread(&mut affinity, old_affinity.as_mut_ptr()) };

        log::trace!("Yielding execution");
        if !NT_SUCCESS(unsafe { ZwYieldExecution() }) {
            return None;
        }

        Some(Self { old_affinity })
    }
}

impl Drop for ProcessorExecutor {
    fn drop(&mut self) {
        log::trace!("Switching execution back to previous processor");
        unsafe {
            KeRevertToUserGroupAffinityThread(self.old_affinity.as_mut_ptr());
        }
    }
}
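
The restore-on-drop behaviour can be tried outside the kernel. The following is a minimal user-mode sketch, not the driver code: MockThread, AffinityGuard, and visit_all are hypothetical names, and a Cell<u64> stands in for the thread affinity that the real ProcessorExecutor changes through KeSetSystemGroupAffinityThread. It also assumes a single processor group with at most 64 processors.

```rust
use std::cell::Cell;

/// Stand-in for the calling thread (hypothetical). The real driver asks the
/// Windows kernel to change the affinity; here it is just a bitmask in a Cell.
pub struct MockThread {
    pub affinity: Cell<u64>,
}

/// Pins `thread` to one processor until dropped, mirroring ProcessorExecutor.
pub struct AffinityGuard<'a> {
    thread: &'a MockThread,
    old_affinity: u64,
}

impl<'a> AffinityGuard<'a> {
    /// Returns None for an out-of-range index, like switch_to_processor().
    pub fn switch_to_processor(thread: &'a MockThread, index: u32, count: u32) -> Option<Self> {
        if index >= count || index >= u64::BITS {
            return None;
        }
        // One bit per processor: bit `index` selects processor `index`.
        let old_affinity = thread.affinity.replace(1u64 << index);
        Some(Self { thread, old_affinity })
    }
}

impl Drop for AffinityGuard<'_> {
    fn drop(&mut self) {
        // Restore whatever affinity the thread had before the switch.
        self.thread.affinity.set(self.old_affinity);
    }
}

/// Visits every processor in turn and records the mask seen while pinned.
pub fn visit_all(thread: &MockThread, count: u32) -> Vec<u64> {
    let mut seen = Vec::new();
    for index in 0..count {
        let Some(_guard) = AffinityGuard::switch_to_processor(thread, index, count) else {
            break;
        };
        seen.push(thread.affinity.get());
        // `_guard` goes out of scope here, so the old affinity is restored
        // before the next iteration starts.
    }
    seen
}
```

Binding the guard to a named variable such as `_guard` matters: binding it to a bare `_` would drop it immediately, and the work would run with the original affinity.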

We have a Hypervisor struct and a HypervisorBuilder struct for virtualization. The HypervisorBuilder struct has a build() function that creates a new Hypervisor instance and returns it as a Result. The build() function checks whether the CPU is an Intel processor and whether it supports the Virtual Machine Extension (VMX) technology. If both checks pass, the function creates and populates a vector (Vec) of virtual CPUs (Vcpu), one per available processor, and initializes a new Hypervisor instance with that vector.

The Hypervisor struct has three methods:

  1. The builder() function returns a new HypervisorBuilder instance.

  2. The virtualize() function virtualizes all of the available processors by calling ProcessorExecutor::switch_to_processor() for each processor and then calling the virtualize_cpu() method on each Vcpu instance in the "processors" vector.

  3. The devirtualize() function devirtualizes all of the available processors by calling ProcessorExecutor::switch_to_processor() for each processor and then calling the devirtualize_cpu() method on each Vcpu instance in the "processors" vector.

The virtualize() and devirtualize() functions use the ProcessorExecutor struct to switch execution to each processor temporarily and then switch back after the virtualization or devirtualization operation is complete.

Overall, this module provides a way to build a Hypervisor instance that covers all available processors, along with methods for virtualizing and devirtualizing those processors using the Vcpu struct and the ProcessorExecutor struct.

#[derive(Default)]
pub struct HypervisorBuilder;

impl HypervisorBuilder {
    pub fn build(self) -> Result<Hypervisor, HypervisorError> {
        //
        // 1) Intel Manual: 24.6 Discover Support for Virtual Machine Extension (VMX)
        //
        support::has_intel_cpu()?;
        log::info!("[+] CPU is Intel");

        support::has_vmx_support()?;
        log::info!("[+] Virtual Machine Extension (VMX) technology is supported");

        let mut processors: Vec<Vcpu> = Vec::new();
        
        for i in 0..processor_count() {
            processors.push(Vcpu::new(i)?);
        }
        log::info!("[+] Found {} processors", processors.len());

        Ok(Hypervisor { processors })
    }
}

pub struct Hypervisor {
    processors: Vec<Vcpu>,
}

impl Hypervisor {
    
    pub fn builder() -> HypervisorBuilder {
        HypervisorBuilder::default()
    }

    pub fn virtualize(&mut self) -> Result<(), HypervisorError> {
        log::info!("[+] Virtualizing processors");

        for processor in self.processors.iter_mut() {
            
            let Some(executor) = ProcessorExecutor::switch_to_processor(processor.id()) else {
                return Err(HypervisorError::ProcessorSwitchFailed);
            };

            processor.virtualize_cpu()?;
                
            core::mem::drop(executor);
        }
        Ok(())
    }

    pub fn devirtualize(&mut self) -> Result<(), HypervisorError> {
        log::info!("[+] Devirtualizing processors");

        for processor in self.processors.iter_mut() {
            
            let Some(executor) = ProcessorExecutor::switch_to_processor(processor.id()) else {
                return Err(HypervisorError::ProcessorSwitchFailed);
            };

            processor.devirtualize_cpu()?;
                
            core::mem::drop(executor);
        }

        Ok(())
    }
}

This follows a similarly neat structure to the amd_hypervisor made by @not-matthias, which will help combine the open-source projects if required.
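
The two checks that build() runs first both come down to values reported by the CPUID instruction: leaf 0 returns the vendor string in EBX, EDX, and ECX (in that order), and leaf 1 reports VMX support in bit 5 of ECX. The following is a minimal sketch of that decoding on plain register values, not the support module used by the hypervisor; vendor_string, is_intel, and has_vmx_bit are hypothetical names chosen for illustration.

```rust
/// CPUID.1:ECX bit 5 is set when the processor supports VMX.
pub const CPUID_1_ECX_VMX: u32 = 1 << 5;

/// Assembles the 12-byte vendor string from the CPUID leaf 0 registers.
/// The bytes are stored little-endian in EBX, then EDX, then ECX.
pub fn vendor_string(ebx: u32, edx: u32, ecx: u32) -> [u8; 12] {
    let mut out = [0u8; 12];
    out[0..4].copy_from_slice(&ebx.to_le_bytes());
    out[4..8].copy_from_slice(&edx.to_le_bytes());
    out[8..12].copy_from_slice(&ecx.to_le_bytes());
    out
}

/// Returns true when the leaf 0 registers spell "GenuineIntel".
pub fn is_intel(ebx: u32, edx: u32, ecx: u32) -> bool {
    &vendor_string(ebx, edx, ecx) == b"GenuineIntel"
}

/// Returns true when the VMX feature bit is set in the leaf 1 ECX value.
pub fn has_vmx_bit(leaf1_ecx: u32) -> bool {
    (leaf1_ecx & CPUID_1_ECX_VMX) != 0
}
```

On an x86_64 machine the register values could be read with core::arch::x86_64::__cpuid; the functions above take them as parameters so the decoding can be checked on any host.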

We create a Windows kernel driver in Rust. When the driver is loaded, the driver_entry function is called automatically; it initializes a logger and sets the driver unload function to driver_unload. It then attempts to virtualize the processors by calling virtualize(). If virtualize() returns None, driver_entry returns STATUS_UNSUCCESSFUL; otherwise, it returns STATUS_SUCCESS.

The virtualize() function is responsible for virtualizing the processors using the hypervisor module. It creates a builder using Hypervisor::builder() and attempts to build the hypervisor using hv.build(). If the build fails, it logs an error message and returns None. If the build succeeds, it calls hypervisor.virtualize(). On failure, it logs the error and returns None; on success, it logs a success message, saves the hypervisor in a static mutable variable called HYPERVISOR, and returns Some(()).

When our driver is unloaded, the driver_unload function is called automatically and devirtualizes the processors using the hypervisor module. If devirtualization succeeds, we log a success message; if it fails, we log the error.

static mut HYPERVISOR: Option<Hypervisor> = None;

#[no_mangle]
pub extern "system" fn driver_entry(driver: &mut DRIVER_OBJECT, _: &UNICODE_STRING) -> NTSTATUS {
    KernelLogger::init(LevelFilter::Info).expect("Failed to initialize logger");
    log::info!("Driver Entry called");

    driver.DriverUnload = Some(driver_unload);


    if virtualize().is_none() {
        log::error!("Failed to virtualize processors");
        return STATUS_UNSUCCESSFUL;
    }

    STATUS_SUCCESS
}


pub extern "system" fn driver_unload(_driver: &mut DRIVER_OBJECT) {
    log::info!("Driver unloaded successfully!");

    if let Some(mut hypervisor) = unsafe { HYPERVISOR.take() } {
        match hypervisor.devirtualize() {
            Ok(_) => log::info!("[+] Devirtualized successfully!"),
            Err(err) => log::error!("[-] Failed to devirtualize {}", err),
        }
    }
}

fn virtualize() -> Option<()> {

    let hv = Hypervisor::builder();

    let Ok(mut hypervisor) = hv.build() else {
        log::error!("[-] Failed to build hypervisor");
        return None;
    };

    match hypervisor.virtualize() {
        Ok(_) => log::info!("[+] VMM initialized"),
        Err(err) =>  {
            log::error!("[-] VMM initialization failed: {}", err);
            return None;
        }
    }

    unsafe { HYPERVISOR = Some(hypervisor) }

    Some(())
}

We can now test our code by creating a service and starting it to load our Windows kernel driver.

sc.exe create hypervisor type= kernel binPath= C:\Windows\System32\drivers\hypervisor.sys
sc.exe query hypervisor
sc.exe start hypervisor

The output is shown in WinDbg:

INFO  [driver] Driver Entry called
INFO  [hypervisor] [+] CPU is Intel
INFO  [hypervisor] [+] Virtual Machine Extension (VMX) technology is supported
INFO  [hypervisor] [+] Found 2 processors
INFO  [hypervisor] [+] Virtualizing processors
INFO  [hypervisor::vcpu] [+] Enabling Virtual Machine Extensions (VMX)
INFO  [hypervisor::support] [+] Lock bit set via IA32_FEATURE_CONTROL
INFO  [hypervisor::vcpu] [+] Adjusting Control Registers
INFO  [hypervisor::support] [+] Mandatory bits in CR0 set/cleared
INFO  [hypervisor::support] [+] Mandatory bits in CR4 set/cleared
INFO  [hypervisor::vcpu] [+] Initializing VcpuData
INFO  [hypervisor::vcpu_data] [+] init_vmxon_region
INFO  [hypervisor::vcpu_data] [+] VMXON Region Virtual Address: 0xffffa3801098a000
INFO  [hypervisor::vcpu_data] [+] VMXON Region Physical Addresss: 0x23ffc1000
INFO  [hypervisor::vcpu_data] [+] VMXON successful!

Congratulations! You have completed the first part of the Intel VT-x Hypervisor Development in Rust series. I hope you enjoyed it.

Credits / References / Thanks / Motivation

Thanks to @daax_rynd, @Intel80x86, @not_matthias, @standa_t, and @felix-rs / @joshuа

Full Disclaimer: Thanks to OpenAI’s ChatGPT and Grammarly for helping me write some parts of the blog. This was the first time I used ChatGPT in my blog, and it might be the last, as I believe that you learn more when you write everything yourself, even if your grammar, spelling, or descriptions are not great. I learned a lot from reading, writing, and porting code and from reading @daax_rynd’s, @Intel80x86’s and @not_matthias’s work. If you use ChatGPT, I highly recommend that you reword its output as much as possible and give credit to whoever deserves it.


