Home > Hyper-V, Virtualization > Memory Management in Hyper-V (Part-1)

Memory Management in Hyper-V (Part-1)

Hyper-V supports isolation in terms of a partition. A partition is a logical unit of isolation, supported by the hypervisor, in which operating systems execute. A hypervisor instance has to have at least one parent partition.

The virtualization stack runs in the parent partition and has direct access to the hardware devices. The parent partition then creates the child partitions which host the guest OSs. A parent partition creates child partitions using the hypercall API, which is the application programming interface exposed by Hyper-V.

A virtualized partition does not have access to the physical processor, nor does it handle its real interrupts. Instead, it has a virtual view of the processor and runs in Guest Virtual Address, which (depending on the configuration of the hypervisor) might not necessarily be the entire virtual address space. A hypervisor could choose to expose only a subset of the processors to each partition. The hypervisor handles the interrupts to the processor, and redirects them to the respective partition using a logical Synthetic Interrupt Controller (SynIC). Hyper-V can hardware accelerate the address translation between various Guest Virtual Address-spaces by using an IOMMU (I/O Memory Management Unit) which operates independent of the memory management hardware used by the CPU.

Child partitions do not have direct access to hardware resources, but instead have a virtual view of the resources, in terms of virtual devices. Any request to the virtual devices is redirected via the VMBus to the devices in the parent partition, which will manage the requests. The VMBus is a logical channel which enables inter-partition communication. The response is also redirected via the VMBus. If the devices in the parent partition are also virtual devices, it will be redirected further until it reaches the parent partition, where it will gain access to the physical devices. Parent partitions run a Virtualization Service Provider (VSP), which connects to the VMBus and handles device access requests from child partitions. Child partition virtual devices internally run a Virtualization Service Client (VSC), which redirect the request to VSPs in the parent partition via the VMBus. This entire process is transparent to the guest OS.

Memory Types

The hypervisor architecture defines three independent address spaces:

System physical addresses (SPAs) define the physical address space of the underlying hardware as seen by the CPUs. There is only one system physical address space for the entire machine.

Guest physical addresses (GPAs) define the guest’s view of physical memory. GPAs can be mapped to underlying SPAs. There is one guest physical address space per partition.

Guest virtual addresses (GVAs) are used within the guest when it enables address translation and provides a valid guest page table.

The resource management state affects, and is affected by, any hypercall that needs to allocate or free memory in the hypervisor. Whether and how a particular call interacts with the resource management state are described as part of the specification for each call. For example, the section on partition creation will specify from whose memory pool the initial resources needed to set up a partition will be deducted.

Memory Pools

For each partition, the hypervisor maintains a memory pool of RAM SPA pages. This pool is managed like a bank account. The number of pages in the pool is referred to as the balance. Pages can be deposited into or withdrawn from the pool.

When a partition makes a hypercall that requires memory, the hypervisor draws the required memory from the pool. If the balance in the pool is insufficient, the call fails. If such a hypercall is made by one guest on behalf of another guest (in another partition), the hypervisor draws the required memory from the pool of the latter partition.

Pages within a partition’s memory pool are managed by the hypervisor. These pages cannot be accessed through any partition’s GPA space. That is, in all partitions’ GPA spaces, they must be inaccessible (mapped such that no read, write or execute access is allowed). In general, the only partition that can deposit into or withdraw from a partition is that partition’s parent.

When a partition is created, the memory pool is initially empty.

The following invariants are maintained:

All memory pool pages are RAM SPA pages that were previously mapped into the GPA space of the parent partition.

All pages in a partition’s memory pool must have been explicitly deposited into that partition’s pool; that is, memory does not migrate between memory pools of different partitions.

Pages in a partition’s memory pool cannot be accessed by any guest; that is, these pages are not mapped into any partition’s GPA space with read, write, or execute privileges.

NUMA Proximity Domains

The ACPI tables include topology information for all processors and memory present in a system at system boot. They also contain information for memory that can be added while the system is running without requiring a reboot. The information identifies sets of logical processors and physical memory ranges, enabling a tight coupling between logical processors and memory ranges. Each coupling is referred to as a Proximity Domain. The latency of memory accesses by logical processors within the same proximity domain is typically shorter than the latency of accesses outside of the proximity domain. The root partition can relay this affinity information to the hypervisor by selecting memory resources or specifying the proximity domain to be used with certain hypercalls.

Source Hypervisor Functional Specification 1.0        

Advertisements
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: