Say a non-OS hacker wants a unikernel. What's the sanest way to go about getting to that?
Options that come to mind are:
- build your application as a linux kernel module, load it into a normal kernel, and generally ignore the userspace that runs anyway
- take Linux and hack it down pretty aggressively plus splice your code into it
- find some github unikernel effort and go from there (which I think the OP does)
- take some other OS - freebsd? - and similarly hack out parts
Other?
I like the idea of an x64 machine running a VM connected to a network card as a generic compute resource that does whatever tasks are assigned by sending it data over the network. It's not been worth the hassle relative to a userspace daemon, but one day I may find the time, and I'd be interested in the HN perspective on where best to start the OS-level hackery.
> The Unikernel Linux (UKL) project started as an effort to exploit Linux’s configurability.. Our experience has led us to a more general goal: creating a kernel that can be configured to span the spectrum between a general-purpose operating system, amenable to a large class of applications, and a highly optimized, possibly application- and hardware-specialized, unikernel... other technologies occupying a similar space have come along, especially io_uring and eBPF. io_uring is interesting because it amortizes syscall overhead. eBPF is interesting because it’s another way to run code in kernel space (albeit for a very limited definition of “code”).
> Unikernel Linux (UKL) is a small patch to Linux and glibc which allows you to build many programs, unmodified, as unikernels. That means they are linked with the Linux kernel into a final vmlinuz and run in kernel space. You can boot these kernels on baremetal or inside a virtual machine. Almost all features and drivers in Linux are available for use by the unikernel.
For starters, assuming the Linux variant: build a statically compiled application, pack it into an initramfs as the only file there (for simplicity name it `/init`), bundle the initramfs with the kernel, boot. At that point your app should be PID 1 and the only process running (with the exception of a bunch of kernel threads), and you can do whatever you want.
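A minimal sketch of what that `/init` could be, assuming a statically linked C program (anything that doesn't need a dynamic loader works); note the initramfs also needs a /dev/console node (or an early devtmpfs mount) for printf() to reach the kernel console:

```c
/* init.c - sketch of a do-everything-yourself PID 1. Assumes it is statically
 * linked (e.g. gcc -static -o init init.c) and packed into the initramfs as
 * /init. For printf() to go anywhere, the initramfs also needs a /dev/console
 * device node, or you mount devtmpfs yourself early on. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("hello from PID %d\n", getpid());  /* should say PID 1 */

    /* ... mount filesystems, bring up networking, run your workload ... */

    for (;;)
        pause();  /* never return: if PID 1 exits, the kernel panics */
}
```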
There are projects that permit statically linking a traditional kernel with a traditional application into a unikernel. NetBSD pioneered this with their rump kernel build framework, and I believe there's at least one Linux build framework that mimics this. The build frameworks cut out the syscall layer; an application calling read(2) is basically calling the kernel's read syscall implementation directly. Often you don't need to change any application source code. The build frameworks handle configuring and building the kernel image, and statically linking the kernel image with your application binary to produce the unikernel image.
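To make the "no source changes" point concrete: the application side is just ordinary POSIX code, and under a rump-style build the read() below ends up resolving (roughly speaking) to the kernel's own implementation as a function call rather than a syscall trap. A minimal sketch, with an arbitrary example file:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Ordinary POSIX code: nothing here knows whether read() traps into a kernel
 * or, in a rump-style unikernel build, links straight against the kernel's
 * read implementation. */
int main(void)
{
    char buf[256];
    int fd = open("/etc/hostname", O_RDONLY);  /* just an example file */
    if (fd < 0) { perror("open"); return 1; }

    ssize_t n = read(fd, buf, sizeof buf - 1);
    if (n < 0) { perror("read"); close(fd); return 1; }

    buf[n] = '\0';
    printf("%s", buf);
    close(fd);
    return 0;
}
```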
I probably should have mentioned I've built unikernels with some of the tooling you've described here. It just seems very academic and edge-case compared to a single static userspace Linux binary; that technically isn't a by-the-book unikernel, but I guess my point was that it's diminishing returns beyond that.
There is a framework for OCaml for this: https://mirage.io/
So if you are interested in learning OCaml and want a unikernel, this would be a possible path to take.
OCaml is a good language but perhaps unikernel does not mean what I thought it did:
> fully-standalone, specialised unikernel that runs under a Xen or KVM hypervisor.
Or maybe xen / kvm are no longer called operating systems?
I'm interested in having my code be responsible for thread scheduling and page tables - no OS layer to syscall into - but am not as keen on DIYing the device drivers to get it talking to the rest of the world.
> I replace the [QubesOS] Linux firewall VM with a MirageOS unikernel. The resulting VM uses safe (bounds-checked, type-checked) OCaml code to process network traffic, uses less than a tenth of the memory of the default FirewallVM, boots several times faster, and should be much simpler to audit or extend.
> Or maybe xen / kvm are no longer called operating systems?
> I'm interested in having my code be responsible for thread scheduling and page tables - no OS layer to syscall into [...]
You might be confusing Xen and KVM here? Xen and KVM are rather different in this regard.
KVM runs on a full Linux kernel (as far as I know). But running your application as a unikernel on top of Xen is more comparable to the old Exokernel concept.
> - take Linux and hack it down pretty aggressively plus splice your code into it
But rather than starting with a Linux distro and hacking it down, I'd start the other way: boot the kernel directly (via a UEFI bootloader). You can embed a basic filesystem structure (/dev, /proc, /etc, etc.) in a binary blob inside the kernel file itself at build time (kind of dumb that this is required at all, but it is). The kernel itself has basically everything you'd need (for any reason you'd want a unikernel).
The problem with Unikernels is that there is no middle ground between a button-smashing user and a kernel hacker. If you open the hood, everything is part of the kernel, and most (all?) existing examples of Unikernels lack proper tracing and debugging support. It will feel like debugging an eight-bit MCU (printf() and GPIO writes) running a far larger (and more complex) code base through upward emulation.
Actually this isn't a fundamental issue with unikernels, but rather an implementation one. For instance, check out debugging in Unikraft: https://unikraft.org/docs/internals/debugging .
A couple unikernel projects that caught my eye in the past may be of interest to you. I have no experience with them, so I can't speak to their quality though.
A very basic kernel isn't that hard to make. I think currently the easiest way would be to follow this series of blog posts by Philip Oppermann: https://os.phil-opp.com/
He made a few crates which handle the boot process, paging, x86 structures and more.
It still exists: https://wiki.xenproject.org/wiki/Mini-OS . But beware that this is no more than a small reference OS, there's a massive gap between getting it to just boot and running real-world applications with it.
Nice project! I love WASM. It's designed to be sandboxed and portable from day one. I wish WASM had been invented instead of JavaScript in the 90s. WASM will eat the world.
What I hope for most is endurance. There are many programs that we are not able to run anymore; the best examples are probably older games. I hope WASM will change that. I'm a little bit nervous about the addition of new features, because simple specs have a higher chance of surviving, but the future of binaries looks exciting.
Believe it or not, back in the 90s we thought (on the whole) that web browsers were for browsing hypertext documents, not for replacing the operating system. There's a reason JS started out limited to basic scripting functionality for wiring up e.g. on-click handlers and form validation. That it grew into something else is not indicative of any design fault in JS (tho it has plenty), but of the use it was shoehorned into. The browser as delivery mechanism for the types of things you're talking about is... not what Tim Berners-Lee or even Marc Andreessen had in mind?
I have very mixed feelings about WASM. There is a large... hype-and-novelty screen held up in front of it right now.
There are many Bad Things about treating the web browser as nothing more than a viewport for whatever UI designer and SWE language-of-the-week fantasy is going around. Especially when we get into things like accessibility, screen readers, etc.
As for the people treating WASM as the universal VM system outside the browser... Yeah, been down that road 30 years ago, that's what the JVM was supposed to be? But I understand that's not "cool" now, so...
The main problem with HTML/CSS/JS is that programmers want more than these languages offer. With WASM you can pick the language (as long as it compiles to .wasm) that fits your use case best. This is the freedom most programmers want.
There will always be programmers who will draw their custom buttons (instead of modifying the DOM from WASM) and ignore accessibility. They can do this with JS as well, but most of them don't.
The original "sin" is that the browser became the delivery tool for what you're talking about. Whether it's a sin or not is of course a matter of opinion.
But it's odd that after all these years the browser killed off a big chunk of "native" apps on the desktop, while on mobile it's a whole other story.
Which makes me think the problem all along was about distribution, not technology.
I keep hoping others see this as well. Sun was so close to the right thing, but the problem is too hard to monetize and it's too vulnerable to embrace, extend, and extinguish.
Well, Sun did, I think, couple the JVM (the VM) too closely to Java (the language). And really, on purpose. WASM doesn't make that mistake at least.
But it's also missing, like, a garbage collector and other things that the JVM offered up and did really really well. People are doing dumbass stuff like running garbage collected interpreters inside WASM, inside V8 (which has its own GC) in the browser. It's like nested dolls, just pointless tossing of CPU cycles into the wastebin. Their (or their VC's) money, but jeez.
You can say "oh, that's coming" (GC extensions in WASM) but that hardly inspires confidence because it took 20 years for the JVM to reach maturity on this front. Best case scenario we'll have a decent GC story in WASM in 10.
That is always bound to happen. Even when a bytecode is designed from the ground up to support multiple languages, eventually one of them ends up winning, as it is too much mental complexity to always keep moving the platform forward with all of them in mind.
Eventually one of them emerges as the main one, and then there are all the others, not necessarily having access to everything like in the early days.
One sees this in the Amsterdam toolkit, IBM TIMI, TDF, and more recently CLR, where it seems to mean C# Language Runtime instead of the original Common Language Runtime, since the .NET Framework to .NET Core transition, and decrease of investment into VB, F# and C++/CLI development and feature parity with C#.
The thing that nags me with WASM is how so many people try to sell it, as if it was the very first of its kind.
> The thing that nags me with WASM is how so many people try to sell it, as if it was the very first of its kind.
I don't get that vibe. Just ask, how do you get to write applications with good, predictable performance, perhaps with multithreading and explicit memory management, in the browser?
It doesn't matter how much of this has existed before in some form or shape. It's about the "product" more than it is about grandiose ideas (and the product might not be completely there yet; at least it wasn't some 3 years ago).
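As a hedged sketch of that pitch: plain C with explicit memory management, compiled for the browser with a toolchain like Emscripten (the build line and export name here are just an example):

```c
#include <emscripten.h>

/* Plain C with explicit, predictable memory behaviour, exported to JS.
 * Build sketch (assuming the Emscripten toolchain is installed):
 *   emcc dot.c -O2 -o dot.js      # emits dot.js + dot.wasm
 * From JS, after the runtime loads, call Module._dot(ptrA, ptrB, n)
 * with arrays copied into the wasm heap. */
EMSCRIPTEN_KEEPALIVE
double dot(const double *a, const double *b, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}
```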
There are two separate, orthogonal channels of discussion that I think people are poking at.
1. WASM as a browser tech for delivering rich applications inside the browser. On this one I will shrug. I understand the motivation. I don't particularly like it, because my vision of the "web" is not that, but it's a lost battle and I don't have a horse in this race. It's effectively the resurrection of Java applets, but done better, and more earnestly. It's going to solve the kinds of problems you're talking about, I guess, but introduce new ones (even more inconsistency of UX, accessibility features, performance issues, etc.)
2. WASM as a general / universal runtime for server side work. On this, I see a lot of hype, thin substance, a lot of smoke but no fire, and I'm quite skeptical. It looks to me like classic "Have a Hammer, Going to Go Find Nails" syndrome. I was initially enthused about this aspect of WASM, but I was employed working with WASM for a bit and I found a lot to be skeptical about. And while I likely will be using WASM in some fashion similar to this for a project I have, I am also not convinced that WASM itself makes a lot of sense as some sort of generic answer for containerization; it looks to me like duplication of effort, claims of novelty where there is none, unhealthy cycles in the tech industry, etc.
Anyways, I think the person you're replying to, and myself, are primarily talking about #2 -- as was the original article
All those VC-powered companies selling WASM containers in Kubernetes as if application servers weren't a thing 20 years ago, or as if IBM hadn't been shipping TIMI executables for decades.
Or talking about how "safe" WASM happens to be, while there are already some USENIX papers slowly making their appearance regarding WASM-based attacks.
I naively hope the web bifurcates into sandboxed wasm apps and document content that doesn't even need js, much less wasm. I'm not sure what a middle ground would look like or why I'd want it. But the realist in me knows wasm will eat the document content too, meaning adblockers and reader view are doomed...
Maybe. As inconvenient as accessibility is, with any luck the need to make web content legible to screen readers will also keep adblockers working. Even with wasm, I don’t think the DOM is going anywhere any time soon. I haven’t seen any proposal to replace it.
You are probably right. Raster frameworks that talk straight to a GL context are out there; eframe/egui is one I've used. And yeah, accessibility is bad. Pair that with encrypted websockets and webTPM (which, if it isn't a thing yet, will be) and you won't have any control over the chain between the screen and the server.
I absolutely love this. I also hadn't seen several of the linked technologies before, so I'm bookmarking all of them, too.
Next up, I want to configure the hypervisor with a WireGuard connection (possibly through something like Tailscale to establish connections?)...
So I have WebAssembly over here on this machine, talking directly to this WebAssembly over there. Based on configuration and capabilities being passed in. Rather than based on the process opening TCP connections to random locations.
I am fairly sure someone will make some. Just like we had Lisp machines and even specific JVM CPUs.
But my prediction is that those will always stay niche, because running WASM on conventional stock hardware will always be faster in general. Mostly because WASM was designed to run fast on stock hardware, and the economics of scale for conventional general purpose processors are much better.
Compare also how the 'International Conference on Functional Programming' started out as the 'Functional Programming and Computer Architecture' conference, but then people figured out how to compile lazy functional programming languages like Haskell to run efficiently on conventional hardware.
Similarly for the Lisp and Java machines: one reason we don't see things like them anymore is that compiler technology has caught up.
Won't speak to WASM, or I'll go all "get off my lawn."
But to me the value-sell of unikernels is: 1) Perf; squeeze out some extra cycles by throwing overboard things you don't need and pulling things into "ring 0" that you do 2) Simplicity; potentially reduce complexity by ditching some of the things you don't need and 3) Security; potentially change the attack surface ... again, by....
To be clear: I don't think this is right for writing microservices and webapps like most of the people on this forum are employed doing... I think the use case is more for people building infrastructure (databases, load balancers, etc. etc.)
as micro VMs can for some (not all) tasks compete with Linux containers, but have the benefit of not exposing your Linux kernel to less trusted code
hence why e.g. some cloud-on-the-edge providers convert your Docker image to a micro VM when running it
so maybe some use can be found there
though WASM in a micro VM on the edge will probably have a hard time competing with WASM as a sandbox on the edge, as such providers probably have an easier time adding useful boundary features/integrations
Only because they are already five years late adding GC support. And even then, WASM isn't the first bytecode format supporting C and C++; there have already been a couple since the 1980s.
As "promised" years ago in Birth & Death of Javascript [0], at some point we shall get a unikernel running a safe GC-collected runtime in kernel-space, at which point we could drop virtual memory mapping support from CPUs, making them faster. While in 2014 the author predicted this will be JS with asm.js, now WASM seems like the way to go. Can't wait (haha)!
> drop virtual memory mapping support from CPUs, making them faster.
In the video, his argument was that the browsers are single-process anyway, and if everything runs in that process, we don't need that separation. However, since then, we've learned that single-process browsers are a security nightmare, so these days browsers are actually not single-process anymore to provide proper sandboxing.
But I love how close to correct that video is, and it's interesting to see in what ways it turned out to be wrong.
Defense-in-depth is always best practice in security. The more layers the attacker has to break and the harder each layer is, the better. All layers can and will be broken.
Apple has spent a long time hardening the JavaScriptCore web sandbox to run untrusted code. We’ve come a long way since JailbreakMe’s web-based jailbreak, but ultimately memory safety requires participation from all parts of the stack and JavaScriptCore and V8 are still both written in C++. You can trigger memory-safety vulnerabilities in the host VM using guest code.
wasmtime is supposedly a hardened WebAssembly runtime written in Rust, but it’s also a JIT, and I have no idea if anyone has put it through its paces security-wise yet. The idea is that WebAssembly can have JIT-like performance without JIT-like security concerns thanks to a simpler translation layer and minimal runtime.
I could see an argument for dropping some layers if the VM isolation becomes stronger.
> The more layers the attacker has to break and the harder each layer is, the better.
No it's not, when it comes to end-user app performance, experience or privacy.
Sure, by adding security we can have another reason to let developers end up with a golang app compiled to WASM, running within Electron, sandboxed through API redirection (OS + antimalware/antivirus/BPF-based EDR), and use it for, like, listening to music in a very secure way...
With all these layers happily streaming all kinds of telemetry to who knows where, with owning nothing but a bunch of numbers behind a ton of DRM layers, and with no ability to change things, to the point where we can't have an app's theme matching system colors because of crossplatform compatibility/security reasons.
I don't want to accept developers' assumption that these have to be enabled by default.
Case 2, Windows: can't even do a build of a trusted codebase under IntelliJ without antimalware adding, like, +150% to build time. While IntelliJ (or some of its extensions or plugins that creep up during development) is happily reporting that performance issue back to its masters. Ugly.
This may change if you're using a bunch of WASM sandboxes. The browser would split its memory up into multiple sandboxes with a process-like interface, but one that doesn't need virtual memory.
Amen. Single-address-space OSes would be cool for running trusted code with minimal overhead in-kernel while avoiding crashing the machine because of a bug. But I want more sandboxing, not less, when running untrusted code.
Virtual memory and paging isn't just about protection/security/process isolation. It's also about making the most effective use of physical memory -- process virtual usage can exceed process RSS and not just because of swapping -- and providing a set of abstractions for managing memory generally. The OS and the allocator are working together, with the OS having a lot of smarts on machine usage in order to make that Fairly Smart in the general case.
So I don't think there's an automatic win in terms of performance by ridding yourself of it. Especially if you're running through the (pretty slow) WASM VM layer anyways.
For general applications esp those written to a POSIX standard or making assumptions that the machine they're running on looks like a typical modern day computer? Dubious. You'd end up writing a bunch of what the VMM layer does in user code.
Also how would software memory protection (like seen in JVM, JavaScript, Python, ...) be faster than hardware MMU? Hardware simply adds more transistors that run the translation concurrently. Faults are either bugs (segfaults) or features you'd have to reimplement anyways.
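To make the overcommit point above concrete, here's a rough Linux-flavoured sketch (sizes are arbitrary) of virtual size vs resident size, one of the things you'd have to give up or reimplement without paging:

```c
#define _GNU_SOURCE   /* for MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t reserve = 8UL << 30;   /* 8 GiB of address space ... */
    size_t touch   = 64UL << 20;  /* ... of which only 64 MiB gets used */

    char *p = mmap(NULL, reserve, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    memset(p, 1, touch);  /* only these pages become resident */

    /* Compare VmSize (~8 GiB) with VmRSS (~64 MiB) in /proc/<pid>/status */
    getchar();
    return 0;
}
```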
Paging implemented naively needs a handful of extra memory accesses to fetch and decode page tables, for each application memory access, which is obviously very expensive. Which is why we have TLBs, which are (small) caches of page table data.
However, the 4kiB page size that is typically used and is baked into most software was decided on in the mid-1980s, and is tiny compared to today's memory and application working set sizes, causing TLB thrashing, often rendering the TLB solution ineffective.
Whatever overhead software memory protection would add is likely going to be small in comparison to cost of TLB thrashing. Fortunately, TLB thrashing can be reduced/avoided by switching to larger page sizes, as well as the use of sequential access rather than random access algorithms.
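For reference, one way to get larger pages on Linux today without changing an application's memory layout is transparent huge pages via madvise(); a minimal sketch, assuming a THP-enabled kernel:

```c
#define _GNU_SOURCE   /* for MAP_ANONYMOUS / MADV_HUGEPAGE on glibc */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 1UL << 30;  /* 1 GiB working set */

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Advisory: with THP enabled, this region can be backed by 2 MiB pages,
     * cutting the number of TLB entries needed by up to ~512x. */
    if (madvise(p, len, MADV_HUGEPAGE) != 0)
        perror("madvise");  /* falls back to 4 KiB pages, still correct */

    /* ... touch the memory, run the workload ... */
    return 0;
}
```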
I don’t get this. Any software implementation of virtual address space is going to need translation tables and “lookaside” caches. But now those structures are competing with real application data for L1 space and bandwidth, not to mention the integer execution units when you use them.
As I understand, the Smalltalk world put a lot of engineering effort into making the software-based model work with performance and efficiency. I don’t think the results were encouraging.
The software-implementation would not have to be a direct emulation of what the hardware does. You are working with the type-system of whatever sandboxed language you are running, and can make much more high-level decisions about what accesses would be legal or not, or how they should get translated, instead of having to go through table lookups on each and every memory access. If you trust the JIT or the compiler you can even make many of the decisions ahead of time, or hoist them outside of loops to virtually eliminate any overhead.
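A sketch of what "hoisting the check out of the loop" can look like; the region/length names here are purely illustrative:

```c
#include <stddef.h>
#include <stdint.h>

/* Naive software protection would bounds-check region[start + i] on every
 * iteration. A compiler/JIT that can reason about the loop proves the whole
 * range once, up front, and the loop body runs with no per-access check. */
int64_t sum_region(const uint8_t *region, size_t region_len,
                   size_t start, size_t count)
{
    /* single hoisted check, phrased to avoid overflow in start + count */
    if (start > region_len || count > region_len - start)
        return -1;  /* a real runtime would trap here */

    int64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += region[start + i];  /* no check inside the loop */
    return sum;
}
```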
Real answer: because the software implementation works by proving mathematically (without running the code) that it won't violate the virtual address space reserved for it by the kernel.
Then, at runtime, it does nothing at all. Which is very fast.
Paging and lookaside tables are needed for virtual->physical translation. The idea is that a pure software-based implementation wouldn't need it at all; at most it would use something segment-like (with just a base offset and segment bound), which is much easier to handle.
Then again, that's the theory; in practice there are many reasons why hardware moved from early segment-based architectures to paging, and memory isolation is only one of them.
You have lighter context switches [0] and finer-grained security domains; consider e.g. passing a pointer versus de/serialising across process boundaries. (The former benefits the latter too, since there's less of a performance cost to cutting up software into more domains.)
It probably isn’t worth digging too much into what was essentially a joke. I think the claim is that one would sufficiently trust the safety guarantees of the compiler/runtime to not need any runtime memory protection (software or hardware).
The hardware MMU does have costs: TLBs are quite small, and looking things up in a several-layer tree adds a lot of latency. If VM were fine, no one would care much about hugepages, and yet people do care about them. (Larger pages mean fewer TLB misses and fewer levels in the tree to look up when there is a miss.)
> Consequently, modern processors have extremely large and highly associative two-level TLBs per CPU — for example, Intel’s Skylake chip uses 64-entry level-1 (L1) TLBs and 12-way, 1,536-entry level-2 (L2) TLBs. These structures require almost as much area as L1 caches today, and can consume as much as 10 to 15 percent of the chip energy.
Bhattacharjee, Abhishek. "Preserving virtual memory by mitigating the address translation wall." IEEE Micro 37.5 (2017): 6-10.
The thing is, now we use and pay the price for both - memory is managed in software, and yet CPU MMU and caches have to sacrifice space on the die for complex memory mappings. Instead we could get extra transistors for better performance (or, like in Apple CPUs, dedicated instructions for GC languages).
There are no instructions for GC'd languages. That was the old Jazelle ARM extension (which could microcode some of Java's bytecode for direct execution).
The "JavaScript instruction" is FJCVTZS, which is a rounding mode matching x86 semantics, which is incidentally what JS specifies for double -> int32 conversions, and soft-coding it on top of FCVTZS is rather expensive (it requires a dozen additional instructions to fix up edge cases).
This is beneficial to JavaScript (on the order of a percentage point on some benchmark suites, though pure JavaScript crypto can get high double-digit gains), but it's also beneficial for any replication of x86 rounding on ARM, including but not limited to emulating x86 on ARM (aka Rosetta 2).
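For the curious, the edge cases being fixed up are the ECMAScript ToInt32 rules; a rough C rendition of what the extra instructions have to accomplish when all you have is a plain truncating convert:

```c
#include <math.h>
#include <stdint.h>

/* Rough sketch of ECMAScript ToInt32: NaN and infinities become 0, everything
 * else is truncated toward zero and wrapped modulo 2^32. This is the fix-up
 * work a plain FCVTZS leaves to software. */
static int32_t js_to_int32(double d)
{
    if (isnan(d) || isinf(d))
        return 0;

    d = trunc(d);                       /* round toward zero */
    double m = fmod(d, 4294967296.0);   /* wrap modulo 2^32 (fmod is exact) */
    if (m < 0)
        m += 4294967296.0;

    return (int32_t)(uint32_t)m;        /* reinterpret as signed 32-bit */
}
```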
Does anyone know of attempts to add CPU instructions that allow JITs and compilers to mitigate Spectre by using speculation-safe instructions for safety critical checks? I could imagine a "load if less than" or similar instruction, which the compiler could use to incorporate the safety check into the load instruction and avoid a separate branch that could be mispredicted. Such an instruction would be documented to have no side effects (even timing side effects) if the condition were not met.
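For comparison, the software workaround in use today (in the spirit of Linux's array_index_nospec or WebKit's index masking) replaces the branch with a data-dependent mask; a sketch only, since production versions need inline asm or compiler barriers to keep the compiler from turning the mask back into a branch:

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp the index with a data-dependent mask instead of relying on a
 * conditional branch, so a mispredicted bounds check can't steer a
 * speculative out-of-bounds load. */
static inline size_t index_nospec(size_t idx, size_t size)
{
    size_t mask = (size_t)0 - (size_t)(idx < size);  /* all-ones iff in range */
    return idx & mask;                               /* 0 when out of range */
}

uint8_t table_read(const uint8_t *table, size_t size, size_t idx)
{
    if (idx >= size)
        return 0;                           /* architectural check */
    return table[index_nospec(idx, size)];  /* safe even if mispredicted */
}
```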
If the hardware is designed to support single address space OSes, it doesn't have to be a security problem. It can help avoid Spectre-like problems because it can lower the expected overhead of permission checks so far that there is no advantage to speculating on them instead of performing them.
I think you are confusing meltdown (a speculation attack on hardware permission checks which was patched in later revisions of intel silicon and never affected other vendors) with Spectre, a general family of attacks on speculative execution, which are generally unsolved.
You could of course add dedicated hardware to lower the overhead specifically of memory access permission checks. In fact most CPUs already do, it is called an MMU.
Safe-language OSes like Theseus don't have this class of problems, by their very design. I think it's a superior architecture to current conventional OSes which rely on hardware for protection.
My understanding is that conventional OSes rely on hardware to provide kernel and userspace data isolation, while Theseus relies on Rust compiler, as in safe Rust you can't access arbitrary memory locations.
By "this class of problems" I assumed you were talking about speculation attacks. How does the rust compiler help? Sorry, I'm not going to watch a talk.
I'm sorry, I did not mean that; a misunderstanding twice on my part. I meant that you can have a SAS SPL OS and have it be safe too. The Theseus Book simply states that relying on hardware for data isolation has proven a deficient approach, given the existence of such attacks.
Some folks in this subthread would benefit from re-acquainting themselves with some old OS research. I am specifically thinking of Opal [0], which differentiates the various roles virtual memory management plays. In Opal, all tasks (processes) share a single 64-bit address space (so you can just share pointers) but hardware provides page-level protection.
Without an MMU, swapping to disk becomes a sizeable challenge. I don't think WASM (or Java, or any other kind of VM) should assume it has infinite physical resources of any kind, but am not surprised that JS folk are so far away from hardware they will sometimes forget how computers actually work...
> Without an MMU, swapping to disk becomes a sizeable challenge.
Swapping object graphs out to disk (and substituting entry points by swap-in proxies) was a thing in Smalltalk systems, and I expect Lisp machines must have had their own solutions. For that matter, 16-bit Windows could (with great difficulty) swap on an 8086, and other DOS “overlay managers” existed. Not that I like the idea, necessarily, but this one problem is not unsolvable.
Well, let's add efficiency to the mix then (I used Smalltalk and LISP machines, and neither managed RAM effectively enough, to the point where emacs was... fast! at the time).
You'd still very much need virtual memory to isolate WASM linear memories of different processes, unless you want to range check every memory access. If we're dropping linear memory and using the new age GC WASM stuff, sure.
An exploit of the runtime in such a system would of course be a disaster of the utmost proportions, and to have any chance of decent performance you'd need a very complex (read: exploitable) runtime.
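For what it's worth, the way many existing runtimes avoid a per-access range check on 64-bit hosts is itself built on virtual memory: reserve more address space than a 32-bit linear memory can ever index and let the guard region fault. A hedged sketch of the trick:

```c
#define _GNU_SOURCE   /* for MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Reserve 8 GiB: 4 GiB of 32-bit index space plus guard, all PROT_NONE. */
    size_t reserve = 8UL << 30;
    unsigned char *base = mmap(NULL, reserve, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
                               -1, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    /* Commit only the module's current linear memory. */
    size_t committed = 64UL << 20;
    if (mprotect(base, committed, PROT_READ | PROT_WRITE) != 0) {
        perror("mprotect"); return 1;
    }

    base[committed - 1] = 42;  /* in bounds: fine, no explicit check emitted */
    /* An access past `committed` lands in the PROT_NONE region and faults,
     * which the runtime turns into a trap, with no per-access range check. */
    return 0;
}
```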
I suspect the underlying assumption here is that each WASM module/program would/could likely exist in its own unikernel on the hypervisor. Which is something I guess you could do since boot and startup times could be pretty minimal. How you would share state between the two, I'm unclear on, though.
The question is.. if you have full isolation and separation of the processes etc... why are you bothering with the WASM now?
haha, today's shiny network effect attractor is tomorrow's legacy quicksand to be abstracted, emulated or deprecated. The addition and deletion of turtles will continue.
> Put some WASM in a JVM in the WASM. In an OS. In a hypervisor.
I would like to see a single address space kernel with hardware for permissions and remapping split. This would enable virtually tagged and indexed caches all the way down to the last level cache without risking aliasing. There could be special cases for a handful of permission checks using (base, size) for things like the current stack, the largest few code blocks etc., relieving the pressure on the page-based permission check hardware, which could also run in parallel with cache accesses (just pretty please don't leave observable uarch state changes behind on denied accesses). To support efficient fork(), the hardware could differentiate between local and global addresses by xor-ing or adding/subtracting the process identifier into the address if a tag bit is present in the upper address bits. This should move a lot of expensive steps off the critical path to memory without breaking anything userspace software has to do. Add a form of efficient delegation of permissions (e.g. hardware-protected capabilities) and you have the building blocks to allow very fast IPC even for large messages.
In reality, I think there is always going to be a hypervisor to separate the various workloads, and the hypervisor is likely to keep using paging, to support dynamic memory partitioning -- though perhaps with a larger page size, so as to not create too much pressure on the TLB.
The earliest implementation was the Burroughs B5500, in 1961: a bytecode OS written in a safe systems language (ESPOL, shortly thereafter replaced with NEWP), where all hardware operations are exposed via intrinsics, and it is one of the first recorded uses of explicit unsafe code blocks.
The CPUs were microcoded, so the bytecode was for all practical purposes Assembly.