RustyHermit

A short introduction to RustyHermit

Institute for Automation of Complex Power Systems
RWTH Aachen University

Stefan Lankes

Introduction

  • Project started with HermitCore [1] for HPC

  • Combines the unikernel and multi-kernel approaches to reduce overhead

    • The same binary is able to run

      • in a VM (classical unikernel setup)

      • or bare-metal side-by-side to Linux (multi-kernel setup)

HermitCore as Unikernel
1. S. Lankes, S. Pickartz, and J. Breitbart, “HermitCore: A Unikernel for Extreme Scale Computing,” 2016, doi: 10.1145/2931088.2931093.

HermitCore Features

  • Support for dominant programming models (OpenMP)

  • Single-address space operating system

    • No TLB Shootdown

  • Runtime support

    • Full C-library support (newlib)

    • Support of pthreads and OpenMP

    • Full integration within GCC ⇒ Support of C / C++, Fortran, Go

HermitCore

  • Completely written in C ⇒ error-prone

  • Combination of different tools to manage the build process (make, cmake)

    • difficult to understand

  • Code that is (more or less) identical in kernel and user space is difficult to maintain

    • e. g. detection of CPU features

Why Rust for Kernel Development?

  • Safe memory handling through Ownership & Borrowing

  • Runtime is split into an OS-independent (libcore) and an OS-dependent (libstd) part

  • Registering a memory allocator is enough to make dynamic data structures available

    • Queues, heaps, and linked lists are part of liballoc

  • The Rust community wants to create fast and safe code

    • Support to bypass the strict rules ⇒ unsafe code

  • Already used in many kernel-related projects

    • Many projects share their code via Rust’s package manager

    • For instance, x86 specific data structures are shared in https://crates.io/crates/x86

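use core::arch::asm;

/// Write `value` to the model-specific register `msr`.
pub unsafe fn wrmsr(msr: u32, value: u64) {
    // WRMSR expects the register index in ECX and the value split across EDX:EAX.
    let low = value as u32;
    let high = (value >> 32) as u32;
    asm!("wrmsr", in("ecx") msr, in("eax") low, in("edx") high);
}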
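
The point about registering a memory allocator can be tried on an ordinary host. The following is a minimal sketch, not RustyHermit's allocator: `CountingAllocator` and `use_collections` are hypothetical names, and the allocator simply forwards to the host's `System` allocator while counting calls. A kernel would provide its own heap implementation instead and use `alloc::collections` rather than the `std` re-exports.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::collections::{BinaryHeap, LinkedList, VecDeque};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: forwards to the host allocator and counts calls.
struct CountingAllocator;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Registering the allocator is all the collections below need.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Uses a queue, a heap, and a linked list; returns one element of each.
fn use_collections() -> (Option<u32>, Option<u32>, Option<u32>) {
    let mut queue = VecDeque::new();
    queue.push_back(1u32);
    queue.push_back(2);

    let mut heap = BinaryHeap::new();
    heap.push(3u32);
    heap.push(7);
    heap.push(5);

    let mut list = LinkedList::new();
    list.push_back(10u32);
    list.push_front(9);

    // Front of the queue, maximum of the heap, front of the list.
    (queue.pop_front(), heap.pop(), list.pop_front())
}

fn main() {
    println!("{:?}", use_collections());
    println!("allocations so far: {}", ALLOCATIONS.load(Ordering::Relaxed));
}
```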

Are there disadvantages?

  • Kernel development requires Rust’s nightly compiler

  • Rust isn’t easy to learn

    • It takes time to write applications

  • In general C code should be faster

  • See Exploring Rust for Unikernel Development [1] for details

1. S. Lankes, J. Breitbart, and S. Pickartz, “Exploring Rust for Unikernel Development,” in Proceedings of the 10th Workshop on Programming Languages and Operating Systems, 2019, pp. 8–15, doi: 10.1145/3365137.3365395.

Removing POSIX-based system libraries

RustyHermit
  • Removing the dependency on the original toolchain

    • No cross-compiler building required

    • Uses Rust’s default linker

  • The kernel is still a static library

    • C-based binary interface

    • Supports Rust’s libstd
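
To illustrate what a C-based binary interface means on the Rust side, here is a minimal sketch. The function `demo_sum_bytes` is hypothetical and is not part of the RustyHermit kernel interface; it only shows that an `extern "C"` function exchanges plain C types such as raw pointers and lengths, and can be called through a C-ABI function pointer.

```rust
use std::slice;

/// Hypothetical function with the C calling convention: sums `len` bytes
/// starting at `ptr`. A C caller would see it as
/// `uint64_t demo_sum_bytes(const uint8_t *ptr, size_t len)`.
///
/// # Safety
/// `ptr` must be null or point to `len` readable bytes.
pub unsafe extern "C" fn demo_sum_bytes(ptr: *const u8, len: usize) -> u64 {
    if ptr.is_null() {
        return 0;
    }
    // SAFETY: the caller guarantees that `ptr` is valid for `len` bytes.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().map(|&b| u64::from(b)).sum()
}

fn main() {
    let data = [1u8, 2, 3, 4];
    // Call through a C-ABI function pointer, as a foreign caller would.
    let f: unsafe extern "C" fn(*const u8, usize) -> u64 = demo_sum_bytes;
    let total = unsafe { f(data.as_ptr(), data.len()) };
    println!("sum = {total}");
}
```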

Requirements

$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  • Required tools

    • The tutorial is based on QEMU

    • The kernel requires NASM for SMP

    • Windows users should take a look at Chocolatey, macOS users at brew to install QEMU

    • Ubuntu is used as the host system here

$ sudo apt-get install qemu-system-x86 nasm git

Build your first RustyHermit Application

  • Use our demo application as starting point

$ git clone git@github.com:hermitcore/rusty-demo.git
$ cd rusty-demo
$ git submodule init
$ git submodule update
  • Cargo is the Rust package manager, already installed with your toolchain

  • Cargo.toml describes your dependencies

  • hermit-sys is a helper crate to build the libOS.

[package]
name = "hello_world"
version = "0.1.0"
authors = ["Stefan Lankes <slankes@eonerc.rwth-aachen.de>"]
edition = "2021"
publish = false
license = "MIT/Apache-2.0"
readme = "README.md"
description = "Hello, RustyHermit!"

[target.'cfg(target_os = "hermit")'.dependencies]
hermit-sys = "0.2"

Key elements of HelloWorld

  • The main program is stored in src/main.rs

    • Import the helper crate hermit-sys

#[cfg(target_os = "hermit")]
use hermit_sys as _;

fn main() {
	println!("Hello World!");
}
  • rust-toolchain.toml specifies the nightly compiler to use.

[toolchain]
channel = "nightly-2022-05-15"
components = [
    "rust-src",
    "llvm-tools-preview",
    "rustfmt",
    "clippy",
]
targets = [ "x86_64-unknown-hermit" ]
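
The `#[cfg(target_os = "hermit")]` attribute in main.rs is what keeps the same source buildable for the host as well. As a small sketch of that mechanism (the function name `compiled_for` is made up for this example), the `cfg!` macro evaluates the same condition at compile time:

```rust
/// Returns a label for the operating system this binary was compiled for.
/// `cfg!` is resolved at compile time, so only one branch can ever be taken.
fn compiled_for() -> &'static str {
    if cfg!(target_os = "hermit") {
        "hermit"
    } else if cfg!(target_os = "linux") {
        "linux"
    } else {
        "other"
    }
}

fn main() {
    println!("Compiled for: {}", compiled_for());
}
```

Built with `--target x86_64-unknown-hermit` the first branch is selected; a plain `cargo run` on the host selects one of the others.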

Building the demo application

  • Build HelloWorld for RustyHermit

$ cargo build -Zbuild-std=core,alloc,std,panic_abort \
    -Zbuild-std-features=compiler-builtins-mem \
    --target x86_64-unknown-hermit
  • -Zbuild-std rebuilds libstd, and -Zbuild-std-features rebuilds the compiler builtins (e.g. memcpy)

  • To run RustyHermit in QEMU, a bootloader is required.

    • Already part of the repository

    • Build bootloader

$ cd loader
$ cargo xtask build --arch x86_64
$ cd ..

Run HelloWorld

  • Test RustyHermit in QEMU

    • Add flag -enable-kvm to accelerate virtualization

      • Requires Linux

$ qemu-system-x86_64 -smp 1 -display none -m 1G -serial stdio \
    -cpu qemu64,apic,fsgsbase,rdtscp,xsave,xsaveopt,fxsr \
    -device isa-debug-exit,iobase=0xf4,iosize=0x04 \
    -kernel loader/target/x86_64/debug/rusty-loader \
    -initrd target/x86_64-unknown-hermit/debug/hello_world
  • It should run…

[0][INFO] HermitCore is running on common system!
Hello World!
[0][INFO] Number of interrupts
[0][INFO] [0][7]: 1
[0][INFO] Shutting down system

Release Versions

  • Build a release version to optimize your code

$ cargo build -Zbuild-std=core,alloc,std,panic_abort \
    -Zbuild-std-features=compiler-builtins-mem \
    --target x86_64-unknown-hermit --release
$ cd loader
$ cargo xtask build --arch x86_64 --release
$ cd ..
$ qemu-system-x86_64 -smp 1 -display none -m 1G -serial stdio \
    -cpu qemu64,apic,fsgsbase,rdtscp,xsave,xsaveopt,fxsr \
    -device isa-debug-exit,iobase=0xf4,iosize=0x04 \
    -kernel loader/target/x86_64/release/rusty-loader \
    -initrd target/x86_64-unknown-hermit/release/hello_world
  • Code size

-rwxr-xr-x    87368 loader/target/x86_64/release/rusty-loader
-rwxr-xr-x  4747296 target/x86_64-unknown-hermit/release/hello_world
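
The debug and release invocations differ only in the profile directory of the two artifacts. The sketch below assembles the QEMU arguments from the slides for either profile. It is a hypothetical helper (`Profile`, `loader_path`, `app_path`, and `qemu_args` are names invented here), not part of the RustyHermit tooling, and it only prints the command line instead of starting QEMU.

```rust
use std::path::PathBuf;

/// Build profile; selects the `debug` or `release` output directory.
#[derive(Clone, Copy)]
enum Profile {
    Debug,
    Release,
}

impl Profile {
    fn dir(self) -> &'static str {
        match self {
            Profile::Debug => "debug",
            Profile::Release => "release",
        }
    }
}

/// Path of the bootloader, as used in the commands on the slides.
fn loader_path(profile: Profile) -> PathBuf {
    PathBuf::from("loader/target/x86_64")
        .join(profile.dir())
        .join("rusty-loader")
}

/// Path of the application image built for the hermit target.
fn app_path(profile: Profile, app: &str) -> PathBuf {
    PathBuf::from("target/x86_64-unknown-hermit")
        .join(profile.dir())
        .join(app)
}

/// Assembles the QEMU arguments shown on the slides.
fn qemu_args(profile: Profile, app: &str, cpus: u32) -> Vec<String> {
    let smp = cpus.to_string();
    let kernel = loader_path(profile).display().to_string();
    let initrd = app_path(profile, app).display().to_string();
    [
        "-smp", smp.as_str(),
        "-display", "none",
        "-m", "1G",
        "-serial", "stdio",
        "-cpu", "qemu64,apic,fsgsbase,rdtscp,xsave,xsaveopt,fxsr",
        "-device", "isa-debug-exit,iobase=0xf4,iosize=0x04",
        "-kernel", kernel.as_str(),
        "-initrd", initrd.as_str(),
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

fn main() {
    let args = qemu_args(Profile::Release, "hello_world", 1);
    println!("qemu-system-x86_64 {}", args.join(" "));
}
```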

Concurrent applications

  • Calculating pi via integration

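π = ∫₀¹ 4 / (1 + x²) dx ≈ step · Σᵢ 4 / (1 + xᵢ²), with xᵢ = (i + 0.5) · step and step = 1 / num_steps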
  • Sequential solution

    • Create a new branch to avoid unintended changes (e.g. git checkout -b pi)

let step = 1.0 / num_steps as f64;
let mut sum = 0 as f64;

for i in 0..num_steps {
	let x = (i as f64 + 0.5) * step;
	sum += 4.0 / (1.0 + x * x);
}
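
The fragment above leaves `num_steps` and the final scaling open. A complete, runnable version could look like the following sketch; the function name `pi_sequential` and the step count are choices made for this example.

```rust
/// Approximates pi by integrating 4 / (1 + x^2) over [0, 1]
/// with the midpoint rule and `num_steps` intervals.
fn pi_sequential(num_steps: u64) -> f64 {
    let step = 1.0 / num_steps as f64;
    let mut sum = 0.0;

    for i in 0..num_steps {
        // Midpoint of the i-th interval.
        let x = (i as f64 + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }

    // Multiply the summed heights by the interval width.
    sum * step
}

fn main() {
    println!("Pi: {}", pi_sequential(1_000_000));
}
```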

Naive solution

  • Rust: There is only one owner of an object

    • No races possible

    • Compiler is able to detect it

let step = 1.0 / NUM_STEPS as f64;
let mut sum = 0.0 as f64;

let threads: Vec<_> = (0..nthreads)
    .map(|tid| {
        thread::spawn(move || {
            let start = (NUM_STEPS / nthreads) * tid;
            let end = (NUM_STEPS / nthreads) * (tid + 1);

            for i in start..end {
                let x = (i as f64 + 0.5) * step;
                sum += 4.0 / (1.0 + x * x);
            }
        })
    }).collect();

Compiler error

  • The compiler is able to detect the error

error: cannot assign to immutable captured outer variable
   |
43 |   sum += 4.0 / (1.0 + x * x);
   |   ^^^^^^^^^^^^^^^^^^^^^^^^^^
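
The error reflects that several threads would have to mutate the same `sum`. If shared mutable state is really wanted, it has to be made explicit. The sketch below does so with `Arc<Mutex<f64>>`; the function name `pi_shared_sum` is invented for this example, and locking on every update is deliberately naive and slow. Accumulating a per-thread partial sum and combining the results afterwards avoids that contention.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Approximates pi with `nthreads` threads that all add to one shared sum.
/// Assumes `num_steps` is divisible by `nthreads`.
fn pi_shared_sum(num_steps: u64, nthreads: u64) -> f64 {
    let step = 1.0 / num_steps as f64;
    // Shared ownership (Arc) plus mutual exclusion (Mutex).
    let sum = Arc::new(Mutex::new(0.0_f64));

    let threads: Vec<_> = (0..nthreads)
        .map(|tid| {
            // Each thread gets its own handle to the shared sum.
            let sum = Arc::clone(&sum);
            thread::spawn(move || {
                let start = (num_steps / nthreads) * tid;
                let end = (num_steps / nthreads) * (tid + 1);

                for i in start..end {
                    let x = (i as f64 + 0.5) * step;
                    // The lock is taken for every single update.
                    *sum.lock().unwrap() += 4.0 / (1.0 + x * x);
                }
            })
        })
        .collect();

    for t in threads {
        t.join().unwrap();
    }

    let total = *sum.lock().unwrap();
    total * step
}

fn main() {
    println!("Pi: {}", pi_shared_sum(100_000, 2));
}
```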

Concurrent solution (I)

  • Using threads of Rust’s libstd

#[cfg(target_os = "hermit")]
use hermit_sys as _;

use std::thread;
use std::time::Instant;

const NUM_STEPS: u64 = 1000000;
const NTHREADS: u64 = 2;

fn main() {
	let step = 1.0 / NUM_STEPS as f64;
	let mut sum = 0.0 as f64;
	let now = Instant::now();

	let threads: Vec<_> = (0..NTHREADS)
		.map(|tid| {
			thread::spawn(move || {
				let mut partial_sum = 0 as f64;
				let start = (NUM_STEPS / NTHREADS) * tid;
				let end = (NUM_STEPS / NTHREADS) * (tid + 1);

				for i in start..end {
					let x = (i as f64 + 0.5) * step;
					partial_sum += 4.0 / (1.0 + x * x);
				}

				partial_sum
			})
		})
		.collect();

	for t in threads {
		sum += t.join().unwrap();
	}

	let duration = now.elapsed();

	println!(
		"Time to calculate (local sum): {}",
		duration.as_secs() as f64 + (duration.subsec_nanos() as f64 / 1000000000.0)
	);
	println!("Pi: {}", sum * (1.0 / NUM_STEPS as f64));
}
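
The program above assumes that NUM_STEPS is divisible by NTHREADS; otherwise the trailing steps are silently skipped. A possible variant that distributes the remainder over the first threads is sketched below. `chunk_bounds` and `pi_parallel` are names chosen for this example.

```rust
use std::thread;

/// Half-open range [start, end) of steps assigned to thread `tid`.
/// The first `num_steps % nthreads` threads receive one extra step.
fn chunk_bounds(num_steps: u64, nthreads: u64, tid: u64) -> (u64, u64) {
    let base = num_steps / nthreads;
    let rem = num_steps % nthreads;
    let start = tid * base + tid.min(rem);
    let len = base + if tid < rem { 1 } else { 0 };
    (start, start + len)
}

/// Approximates pi with per-thread partial sums that are combined via join.
fn pi_parallel(num_steps: u64, nthreads: u64) -> f64 {
    let step = 1.0 / num_steps as f64;

    let threads: Vec<_> = (0..nthreads)
        .map(|tid| {
            thread::spawn(move || {
                let (start, end) = chunk_bounds(num_steps, nthreads, tid);
                // Thread-local partial sum, returned as the thread's result.
                (start..end)
                    .map(|i| {
                        let x = (i as f64 + 0.5) * step;
                        4.0 / (1.0 + x * x)
                    })
                    .sum::<f64>()
            })
        })
        .collect();

    let sum: f64 = threads.into_iter().map(|t| t.join().unwrap()).sum();
    sum * step
}

fn main() {
    println!("Pi: {}", pi_parallel(1_000_001, 3));
}
```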

Concurrent solution (II)

  • Test it with more than one CPU

$ cargo build -Zbuild-std=core,alloc,std,panic_abort -Zbuild-std-features=compiler-builtins-mem --target x86_64-unknown-hermit --release
$ qemu-system-x86_64 -display none -m 1G -serial stdio \
    -cpu qemu64,apic,fsgsbase,rdtscp,xsave,xsaveopt,fxsr \
    -device isa-debug-exit,iobase=0xf4,iosize=0x04 \
    -kernel loader/target/x86_64/release/rusty-loader \
    -initrd target/x86_64-unknown-hermit/release/hello_world \
    -smp 2

Time to calculate (local sum): 4.154339
Pi: 3.14159265358991
[0][INFO] Number of interrupts
[0][INFO] [0][7]: 3
[0][INFO] [0][Wakeup]: 1
[0][INFO] [1][7]: 1
[0][INFO] [1][Wakeup]: 1
[0][INFO] Shutting down system

HTTPD

  • crates.io is Rust’s community registry for publishing open-source code

  • Change Cargo.toml according to tiny_http’s README and add options to enable TCP & DHCP support

[package]
...

[dependencies]
tiny_http = "0.11"

[target.'cfg(target_os = "hermit")'.dependencies.hermit-sys]
default-features = false

[features]
default = ["pci", "pci-ids", "acpi", "tcp", "dhcpv4"]
dhcpv4 = ["hermit-sys/dhcpv4"]
pci = ["hermit-sys/pci"]
pci-ids = ["hermit-sys/pci-ids"]
tcp = ["hermit-sys/tcp"]
acpi = ["hermit-sys/acpi"]
fsgsbase = ["hermit-sys/fsgsbase"]
trace = ["hermit-sys/trace"]

  • The server itself is already explained in tiny_http’s README

#[cfg(target_os = "hermit")]
use hermit_sys as _;
use tiny_http::{Response, Server};

fn main() {
	let server = Server::http("0.0.0.0:8000").unwrap();

	for request in server.incoming_requests() {
		println!(
			"received request! method: {:?}, url: {:?}, headers: {:?}",
			request.method(),
			request.url(),
			request.headers()
		);

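		let response = Response::from_string("hello world");
		// `respond` returns a Result; report I/O errors instead of ignoring them.
		if let Err(err) = request.respond(response) {
			eprintln!("failed to send response: {err}");
		}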
	}
}

Running HTTPD

  • Build httpd

$ cargo build -Zbuild-std=core,alloc,std,panic_abort \
    -Zbuild-std-features=compiler-builtins-mem \
    --target x86_64-unknown-hermit
  • Start QEMU with the emulation of the network card RTL8139

    • Forward any requests at the host port 8000 to the guest

qemu-system-x86_64 -cpu qemu64,apic,fsgsbase,rdtscp,xsave,xsaveopt,fxsr \
   -device isa-debug-exit,iobase=0xf4,iosize=0x04 -display none \
   -serial stdio -kernel loader/target/x86_64/debug/rusty-loader \
   -initrd target/x86_64-unknown-hermit/debug/hello_world \
   -netdev user,id=u1,hostfwd=tcp::8000-:8000 \
   -device rtl8139,netdev=u1 -smp 1 -m 1G
  • Send a request to the server

$ curl http://127.0.0.1:8000/hello
hello world
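
To make visible what travels over the forwarded port, the following sketch performs the same request/response exchange with nothing but `std::net`. It is not tiny_http's implementation and does not run inside RustyHermit here: `serve_once`, `fetch`, and `roundtrip` are hypothetical helpers that run a one-shot server and a client on the loopback interface of the host.

```rust
use std::io::{self, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Accepts one connection, reads the request head, answers "hello world".
fn serve_once(listener: TcpListener) -> io::Result<()> {
    let (mut stream, _peer) = listener.accept()?;

    // Read until the blank line that terminates the HTTP request head.
    let mut request = Vec::new();
    let mut buf = [0u8; 1024];
    loop {
        let n = stream.read(&mut buf)?;
        if n == 0 {
            break;
        }
        request.extend_from_slice(&buf[..n]);
        if request.windows(4).any(|w| w == b"\r\n\r\n") {
            break;
        }
    }

    let body = "hello world";
    write!(
        stream,
        "HTTP/1.1 200 OK\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    )?;
    stream.flush()
}

/// Sends a GET request for `path` and returns the raw response.
fn fetch(addr: SocketAddr, path: &str) -> io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    write!(
        stream,
        "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        path, addr
    )?;
    let mut response = String::new();
    // The server closes the connection, so reading to EOF terminates.
    stream.read_to_string(&mut response)?;
    Ok(response)
}

/// Starts the one-shot server on a free loopback port and queries it.
fn roundtrip() -> io::Result<String> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || serve_once(listener));
    let response = fetch(addr, "/hello")?;
    server.join().expect("server thread panicked")?;
    Ok(response)
}

fn main() -> io::Result<()> {
    println!("{}", roundtrip()?);
    Ok(())
}
```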

Using Virtio

  • To increase the performance, virtio can be used

    • Requires Linux as host operating system

  • A TAP device is used to forward ethernet frames from the guest to the host system

  • In addition, a bridge allows VMs to have their own IP addresses while using the same physical link

        |
        | physical link
        |
        o bridge0
       / \
      /   \
eth0 o     o tap0
           |
- - - - - - - - - - - - - - - Host / VM boundary
           |
           o device inside VM

Create bridge

  • Assumption: eth0 is connected to the physical link and maintains the IP address of the host

    • Modern Linux distributions often use a different interface name instead of eth0

  • First create a bridge and connect eth0 to it

    • Please note that these commands will drop internet connectivity of your machine.

# Create bridge
sudo ip link add bridge0 type bridge
sudo ip link set bridge0 up

# According to Arch wiki eth0 needs to be up
sudo ip link set eth0 up
sudo ip link set eth0 master bridge0

# Drop existing IP from eth0
sudo ip addr flush dev eth0

# Assign IP to bridge0
sudo ip addr add 192.168.122.10/24 brd + dev bridge0
sudo ip route add default via 192.168.122.1 dev bridge0

Create TAP device and boot HTTPD

  • Create a TAP device and assign it to the bridge

sudo ip tuntap add tap0 mode tap
sudo ip link set dev tap0 up
sudo ip link set tap0 master bridge0
  • Boot kernel

sudo qemu-system-x86_64 -enable-kvm -cpu qemu64,apic,fsgsbase,rdtscp,xsave,xsaveopt,fxsr \
   -display none -smp 1 -m 1G -serial stdio \
   -kernel loader/target/x86_64/debug/rusty-loader \
   -initrd target/x86_64-unknown-hermit/debug/httpd \
   -netdev tap,id=net0,ifname=tap0,script=no,downscript=no,vhost=on \
   -device virtio-net-pci,netdev=net0,disable-legacy=on

Test HTTPD

  • Wait for the IP configuration

Now listening on port 8000
[INFO] DHCP config acquired!
[INFO] IP address:      192.168.122.76/24
[INFO] Default gateway: 192.168.122.1
[INFO] DNS server 0:    192.168.122.1
  • Send a request to the server

$ curl http://192.168.122.76:8000/hello
Hello from RustyHermit 🦀

Work in progress

OCI

Conclusion

  • Test it!

  • Try it!

  • Have fun with system software!

License & Data Privacy

Imprint

Dr. rer. nat. Stefan Lankes

RWTH Aachen University, E.ON Energy Research Center,

Mathieustraße 10, 52074 Aachen, Germany