What Really Happens Between our CPU and the Cloud?

Overview: This deep-dive explains how operating systems and virtualization work from the hardware level upward. It covers how a CPU executes processes, how memory is virtualized using paging, and how user mode and kernel mode enforce protection and isolation. It extends these concepts to virtualization, showing how hypervisors create virtual machines by multiplexing CPU, memory, and I/O using hardware support such as Intel VT-x and AMD-V.

Table of Contents

1. OS Recap: What is a Process?

Definition (core idea): A process is an executing instance of a program: it bundles an address space (code + data + stack), CPU context (registers, PC), OS resources (open files, sockets), and metadata (PID, owner, scheduling priority).

“Virtualization creates virtual processes inside virtualized operating systems. So understanding process structure is necessary to understand how a VM works.”

“A process is not a program. A program is a passive file on disk. A process is a running, executing, resource-owning entity created by the OS.”

CPU Reality: CPU can only execute machine instructions. Everything else (files, windows, networks) is just abstractions.

Process Equation: Process = Code + Data + Heap + Stack + CPU context + OS resources.
Executable: Machine code.
OS: Allocates memory image, creates PCB, loads program, manages execution.
Virtualization: Creates a fake machine on which a real OS creates real processes.
Understanding process internals is the foundation of hypervisors, VMs, containers, and cloud systems.

Key Structures & Mechanisms

Process Control Block (PCB): PID, registers, program counter, state, open file descriptors, memory map, parent/children, credentials.
Address Space Layout: text (code), data (heap), BSS, stack, mapped files.
States: New → Ready → Running → Waiting (I/O) → Terminated.
Context Switch: OS saves current PCB (registers, PC) and loads destination PCB. Cost: register save/restore, TLB flushs (maybe), cache effects.
Fork/Exec (UNIX): fork() duplicates process (copy-on-write optimizations), exec() replaces address space.

Analogy: The Virtual Office

Virtual Memory (The Desk): Imagine a worker (Process) with a desk that seems infinitely long and clean. They just reach for items. This is the “Virtual Address Space”.

Physical Memory (The Warehouse): In reality, items are stored in boxes in a chaotic warehouse. The worker doesn’t see this complexity.

Page Table (The Index Book): Maps “Item on Desk” to “Box #42 in Warehouse”.

TLB (Sticky Notes): Quick notes for frequently used items to avoid looking up the big index book every time.

Page Fault: If an item isn’t on the desk or nearby, work stops (Fault) until support staff brings it from long-term storage.

2. Types of Hypervisors

Type 1 Hypervisor (Bare-Metal)

Runs directly on physical hardware without an intermediate host operating system. It provides a minimal, high-performance control layer.

Core Characteristics: Installed on bare-metal, no host OS, optimized for security, scalability, and performance.
Common Implementations: VMware ESXi, Microsoft Hyper-V, KVM, Xen.
Use Cases: Enterprise data centers, Public Cloud (AWS Nitro, Azure, GCP).
Why Cloud? Near bare-metal performance, strong tenant isolation, support for hardware virtualization extensions (Intel VT-x, AMD-V).

Figure 1: Type 1 Hypervisor running directly on hardware.

Type 2 Hypervisor (Hosted)

Runs as an application on top of an existing host OS (like Windows or macOS). Relies on the host for hardware management.

Core Characteristics: Installed on top of Host OS, hardware access mediated by host, optimized for convenience.
Common Implementations: Oracle VirtualBox, VMware Workstation, Parallels Desktop.
Use Cases: Local development, learning/experimentation, application compatibility.

3. Virtual Memory & Paging

Virtual memory provides each process with a private, contiguous address space while the OS/MMU transparently maps virtual addresses to physical memory using paging structures.

Pages: Memory is divided into fixed-size blocks (usually 4KB).
Page Tables: Hierarchical data structures map virtual addresses to physical frames.
TLB (Translation Lookaside Buffer): Hardware cache for fast address translation.
Page Faults: Occur when accessing unmapped memory, handled by kernel (allocating or loading from disk).

Figure 2: Mapping Virtual to Physical Addresses.

Figure 3: Address Translation Mechanism.

Figure 4: Demand Paging Workflow.

Virtualization Considerations

Two-Level Address Translation: Guest Virtual Address (GVA) → Guest Physical Address (GPA) → Host Physical Address (HPA).
Hardware-Assisted Nested Paging: EPT (Intel) and NPT (AMD) allow the MMU to perform both translation stages in hardware, improving performance.

4. User Mode vs. Kernel Mode

Modern CPUs enforce multiple privilege levels (rings) to separate user applications from the OS kernel.

User Mode (Ring 3): Unprivileged. Restricted instruction set. No direct hardware access.
Kernel Mode (Ring 0): Privileged. Full access to hardware, memory, and system resources.

The Trap: Transitions from user to kernel mode happen via System Calls, Faults, or Interrupts. The CPU switches privilege levels, jumps to the OS handler, and then returns.

Figure 5: User Mode vs Kernel Mode Code Location.

5. Interrupt and Trap Handling in Virtualized Systems

Interrupts (Asynchronous): Hardware events (timer, network) independent of the current instruction.
Traps (Synchronous): Caused by the instruction itself (system call, page fault, divide by zero).

Figure 6: What happens upon a trap.

Figure 7: Trap Handler Execution.

Virtualization-Specific Design

Trap-and-Emulate: Privileged instructions in guest cause “VM Exits” to the hypervisor. Expensive.
Interrupt Virtualization: Hypervisor injects virtual interrupts into the Guest VM.
Paravirtualization: Guest OS is modified to use “Hypercalls” instead of sensitive instructions, reducing overhead.

Figure 8: Segmentation.

Figure 9: Use of Segmentation.

Analogy: Interrupts & Traps

Interrupt (Phone Ringing): You are working. The phone rings (external event). You pause, place a bookmark, answer, then resume.

Trap (Mistake/Help): You make a calculation error or need approval. You stop work and go to the supervisor (Kernel) to fix it.

Figure 10: Segmentation in modern systems.

6. The Computing Stack & VMM

Figure 11: The Computing Stack.

The Stack

API: Interfaces apps to libs/OS.
ABI: Separates OS layer from apps (System Calls).
ISA: Instruction Set Architecture (Hardware interface).

Figure 12: Interfaces (API, ABI, ISA).

Figure 13: Security Rings and Privilege Modes.

VMM (Virtual Machine Monitor) & Popek Goldberg Theorem

Theorem: To build a VMM efficiently via trap-and-emulate, sensitive instructions must be a subset of privileged instructions.

x86 Issue: Historically, x86 did not satisfy this (some sensitive instructions didn’t trap), making pure trap-and-emulate impossible.

Techniques to Virtualize x86

Paravirtualization (e.g., Xen): Rewrite Guest OS to use Hypercalls. High performance, requires modified OS.
Full Virtualization (e.g., VMware): Binary translation of non-virtualizable instructions. Works with unmodified OS but higher overhead.
Hardware Assisted (e.g., KVM/QEMU): CPU adds VMX mode (Root/Non-Root). Guest runs in Non-Root Ring 0. Hypervisor runs in Root Mode.

Figure 14: Hardware Assisted Virtualization (VMX).

Figure 15: Memory Virtualization techniques.

7. Popular Virtualization Technologies Comparison

Technology	Type	Primary Use	Unique Features
VMware ESXi	Type 1	Enterprise	Mature, vMotion, HA.
Microsoft Hyper-V	Type 1	Azure/Windows	Deep Windows integration.
AWS Nitro	Type 1 (Light)	AWS EC2	Offloads virtualization to hardware cards.
KVM	Type 1 (Kernel)	Linux/Cloud	Built into Linux, open-source.
Docker	Container	App Deployment	Shares host kernel, fast startup.
Kubernetes	Orchestrator	Cloud Native	Automated scaling/healing.
Firecracker	MicroVM	Serverless	Secure like VM, fast like container.