Building a unikernel that runs WebAssembly – part 1 (castelli.me)
266 points by walterbell on Oct 23, 2023 | 136 comments



Yes, and also this old, now-archived project which had a similar aim: https://github.com/nebulet/nebulet


I literally came here to bring this video up.


Say a non-OS hacker wants a unikernel. What's the sanest way to go about getting to that?

Options that come to mind are:

- build your application as a Linux kernel module, load it into a normal kernel, and generally ignore the userspace that runs anyway

- take Linux and hack it down pretty aggressively plus splice your code into it

- find some github unikernel effort and go from there (which I think the OP does)

- take some other OS - freebsd? - and similarly hack out parts

Other?

I like the idea of an x64 machine running a VM connected to a network card as a generic compute resource that does whatever tasks are assigned by sending it data over the network. It hasn't been worth the hassle relative to a userspace daemon, but one day I may find the time, and I'd be interested in the HN perspective on where best to start the OS-level hackery.


Red Hat has been looking at Linux-as-unikernel since 2018, https://research.redhat.com/blog/article/unikernel-linux-ukl...

> The Unikernel Linux (UKL) project started as an effort to exploit Linux’s configurability.. Our experience has led us to a more general goal: creating a kernel that can be configured to span the spectrum between a general-purpose operating system, amenable to a large class of applications, and a highly optimized, possibly application- and hardware-specialized, unikernel... other technologies occupying a similar space have come along, especially io_uring and eBPF. io_uring is interesting because it amortizes syscall overhead. eBPF is interesting because it’s another way to run code in kernel space (albeit for a very limited definition of “code”).

Code, https://github.com/unikernelLinux/ukl

> Unikernel Linux (UKL) is a small patch to Linux and glibc which allows you to build many programs, unmodified, as unikernels. That means they are linked with the Linux kernel into a final vmlinuz and run in kernel space. You can boot these kernels on baremetal or inside a virtual machine. Almost all features and drivers in Linux are available for use by the unikernel.


For starters, assuming the Linux variant: build a statically compiled application, pack it into an initramfs as the only file there (for simplicity name it `/init`), bundle the initramfs with the kernel, and boot. At that point your app should be PID 1 and the only process running (apart from a bunch of kernel threads), and you can do whatever you want.
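To make that concrete, here's a minimal sketch of that workflow; the app.c name, kernel image path and QEMU flags are just illustrative:

    # Statically link the application so it needs nothing from a root filesystem.
    gcc -static -O2 -o init app.c

    # Pack it as the only file in an initramfs; the kernel execs /init as PID 1.
    echo init | cpio -o -H newc | gzip > initramfs.cpio.gz

    # Boot kernel + initramfs in a VM (any vmlinuz you have lying around will do).
    qemu-system-x86_64 -kernel /boot/vmlinuz -initrd initramfs.cpio.gz \
        -append "console=ttyS0 panic=1" -nographic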


This is the most realistic comment on this thread (so far).


Realistic, yes, but it's not a unikernel.

There are projects that permit statically linking a traditional kernel with a traditional application into a unikernel. NetBSD pioneered this with their rump kernel build framework, and I believe there's at least one Linux build framework that mimics this. The build frameworks cut out the syscall layer; an application calling read(2) is basically calling the kernel's read syscall implementation directly. Often you don't need to change any application source code. The build frameworks handle configuring and building the kernel image, and statically linking the kernel image with your application binary to produce the unikernel image.
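For a concrete flavor of the NetBSD rump kernel route, the rumprun workflow went roughly like this (from memory, so treat the exact command names, the hw_virtio config and the flags as assumptions and check the rumprun docs):

    # Cross-compile the unmodified application with the rumprun toolchain.
    x86_64-rumprun-netbsd-gcc -o app app.c

    # "Bake" the binary together with the rump kernel components into a bootable image.
    rumprun-bake hw_virtio app.img app

    # Boot the resulting unikernel under KVM.
    rumprun kvm -i -M 128 app.img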


I probably should have mentioned I've built unikernels with some of the tooling you've described here. It just seems very academic and edge-case compared to a single static userspace Linux binary. That technically isn't a by-the-book unikernel, but all I meant was that it's diminishing returns beyond that point.


You should also probably check out Unikraft (https://unikraft.org), which supports many languages/apps, x86/ARM64, and QEMU/Firecracker. It is also able to run an ELF built under Linux as a unikernel (see https://unikraft.org/guides/bincompat). Discord is at https://unikraft.org/discord.


There is a framework for OCaml for this: https://mirage.io/ So if you are interested in learning OCaml and want a unikernel, this would be a possible path to take.


OCaml is a good language but perhaps unikernel does not mean what I thought it did:

> fully-standalone, specialised unikernel that runs under a Xen or KVM hypervisor.

Or maybe xen / kvm are no longer called operating systems?

I'm interested in having my code be responsible for thread scheduling and page tables - no OS layer to syscall into - but am not as keen on DIYing the device drivers to get it talking to the rest of the world.


MirageOS unikernels run directly on Xen, e.g. http://roscidus.com/blog/blog/2016/01/01/a-unikernel-firewal...

> I replace the [QubesOS] Linux firewall VM with a MirageOS unikernel. The resulting VM uses safe (bounds-checked, type-checked) OCaml code to process network traffic, uses less than a tenth of the memory of the default FirewallVM, boots several times faster, and should be much simpler to audit or extend.

NanoVMs has OSS tools for golang unikernels on multiple hypervisors and cloud platforms, https://nanovms.com/dev/tutorials/running-go-unikernels
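The basic loop there is very short; a sketch (the binary name is illustrative, the tutorial above has the authoritative steps):

    # Build a Linux Go binary and boot it as a Nanos unikernel under OPS.
    GOOS=linux GOARCH=amd64 go build -o server .
    ops run server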


Nanos runs not just Go but pretty much any language you want to throw at it:

https://github.com/nanovms/ops-examples .


> I'm interested in having my code be responsible for thread scheduling and page tables

But MirageOS does exactly that, last I looked. As does RustyHermit.


> Or maybe xen / kvm are no longer called operating systems?

> I'm interested in having my code be responsible for thread scheduling and page tables - no OS layer to syscall into [...]

You might be confusing Xen and KVM here? Xen and KVM are rather different in this regard.

KVM runs on a full Linux kernel (as far as I know). But running your applications as unikernels on top of Xen is more comparable to the old exokernel concept.


There are essentially three ways to put together a unikernel:

1. Minimizing an existing general-purpose OS

2. Bypassing the OS

3. Starting from scratch

You can read more in detail about this here from Unikraft's documentation[0].

[0]: https://unikraft.org/docs/concepts/design-principles#approac...


I'd go with:

- take Linux and hack it down pretty aggressively plus splice your code into it

But rather than starting with a Linux distro and hacking it down, I'd start the other way: boot the kernel directly (via a UEFI bootloader). You can embed a basic filesystem structure (/dev, /proc, /etc, and so on) in a binary blob inside the kernel file itself at build time (kind of dumb that this is required at all, but it is). The kernel itself has basically everything you'd need (for any reason you'd want a unikernel).
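A minimal sketch of that direction using the kernel's own scripts/config helper; the initramfs path is an assumption, the symbols are the standard Kconfig options:

    # From a kernel source tree: start from the smallest config and add what the app needs.
    make tinyconfig
    ./scripts/config --enable CONFIG_64BIT \
                     --enable CONFIG_EFI \
                     --enable CONFIG_EFI_STUB \
                     --enable CONFIG_BLK_DEV_INITRD \
                     --set-str CONFIG_INITRAMFS_SOURCE "/path/to/initramfs.cpio"
    make olddefconfig && make -j"$(nproc)"
    # The resulting bzImage has the initramfs baked in and can be booted straight from UEFI firmware.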


Hack Linux all the way down until you're just left with Linux


Is there a cloud service similar to cloudflare workers designed to work with unikernels?


Anything that can run VMs on Xen should work.


This would be interesting.


The problem with unikernels is that there is no middle ground between a button-smashing user and a kernel hacker. If you open the hood, everything is part of the kernel, and most (all?) existing unikernels lack proper tracing and debugging support. It will feel like debugging an eight-bit MCU (printf() and GPIO writes) that is running a far larger (and more complex) code base through upward emulation.



Actually this isn't a fundamental issue with unikernels, but rather an implementation one. For instance, check out debugging in Unikraft: https://unikraft.org/docs/internals/debugging .


A matter of tooling, nothing related to unikernels.


A couple unikernel projects that caught my eye in the past may be of interest to you. I have no experience with them, so I can't speak to their quality though.

https://unikraft.org/

https://github.com/nanovms/nanos


A very basic kernel isn't that hard to make. I think currently the easiest way would be to follow this series of blog posts by Philipp Oppermann: https://os.phil-opp.com/

He made a few crates which handle the boot process, paging, x86 structures and more.
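The build/run loop from that series looks roughly like this; the crate and target names are the blog's, the exact output path may differ between posts:

    # One-time setup for the 'bootimage' tool the series uses.
    rustup component add rust-src llvm-tools-preview
    cargo install bootimage

    # Link the kernel with the 'bootloader' crate into a bootable disk image.
    cargo bootimage

    # Boot it in QEMU.
    qemu-system-x86_64 -drive format=raw,file=target/x86_64-blog_os/debug/bootimage-blog_os.bin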


I'm completely biased since I cut these packages but for this particular example of "run a wasm payload inside of a unikernel":

    ops pkg load eyberg/wasmedge:0.9.1 -c config.json
You could replicate this in seconds and then push that image to AWS or GCP, also in seconds.
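For anyone wondering what goes in config.json: it's just a small OPS config. A hypothetical one for a wasm payload might look like the following (the key names and the app.wasm filename are my assumptions here; the OPS docs have the real schema):

    cat > config.json <<'EOF'
    {
      "Files": ["app.wasm"],
      "Args": ["app.wasm"]
    }
    EOF
    ops pkg load eyberg/wasmedge:0.9.1 -c config.json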


NetBSD. Someone already did this hacking over 10 years ago. https://en.wikipedia.org/wiki/Rump_kernel


The Xen sources used to include a minimal unikernel written in C.


It still exists: https://wiki.xenproject.org/wiki/Mini-OS . But beware that this is no more than a small reference OS, there's a massive gap between getting it to just boot and running real-world applications with it.


If you don't mind working in OCaml, I get the impression that MirageOS is probably your best bet.

That's a lot more mature than RustyHermit, last I looked.


Nice project! I love WASM. It was designed to be sandboxed and portable from day one. I wish WASM had been invented instead of JavaScript in the 90s. WASM will eat the world.

What I hope for most is endurance. There are many programs that we are no longer able to run; the best examples are probably older games. I hope WASM will change that. I'm a little nervous about adding new features, because simple specs have a higher chance of surviving, but the future of binaries looks exciting.


Believe it or not, back in the 90s we thought (on the whole) that web browsers were for browsing hypertext documents. Not for replacing the operating system. There's a reason JS started out limited to basic scripting functionality for wiring up e.g. on-click handlers and form validation. That it grew into something else is not indicative of any design fault in JS (though it has plenty), but of the use it was shoehorned into. The browser as delivery mechanism for the types of things you're talking about is... not what Tim Berners-Lee or even Marc Andreessen had in mind?

Back then "the network is the computer" people ended up shipping thin X clients: https://en.wikipedia.org/wiki/Network_Computer in order to do richer applications.

I have very mixed feelings about WASM. There is a large... hype-and-novelty screen held up in front of it right now.

There are many Bad Things about treating the web browser as nothing more than a viewport for whatever UI-designer and SWE language-of-the-week fantasy is going around. Especially when we get into things like accessibility, screen readers, etc.

As for the people treating WASM as the universal VM system outside the browser... Yeah, been down that road 30 years ago, that's what the JVM was supposed to be? But I understand that's not "cool" now, so...

Sigh.


> Believe it or not ...

I believe and agree with most of what you wrote ;)

The main problem with HTML/CSS/JS is that programmers want more than these languages offer. With WASM you can pick whichever language (as long as it compiles to .wasm) fits your use case best. This is the freedom most programmers want.

There will always be programmers who will draw their own custom buttons (instead of modifying the DOM from WASM) and ignore accessibility. They can do this with JS as well, but most of them don't.


The original "sin" is that the browser became the delivery tool for what you're talking about. Whether it's a sin or not is of course a matter of opinion.

But it is odd that, after all these years, the browser killed off a big chunk of "native" apps on the desktop, while on mobile it's a whole other story.

Which makes me think the problem all along was about distribution, not technology.


I keep hoping others see this as well. Sun was so close to the right thing, but the problem is too hard to monetize and it's too vulnerable to embrace, extend, and extinguish.


Well, Sun did, I think, couple the JVM (the VM) too closely to Java (the language). And really, on purpose. WASM doesn't make that mistake at least.

But it's also missing, like, a garbage collector and other things that the JVM offered up and did really really well. People are doing dumbass stuff like running garbage collected interpreters inside WASM, inside V8 (which has its own GC) in the browser. It's like nested dolls, just pointless tossing of CPU cycles into the wastebin. Their (or their VC's) money, but jeez.

You can say "oh, that's coming" (GC extensions in WASM) but that hardly inspires confidence because it took 20 years for the JVM to reach maturity on this front. Best case scenario we'll have a decent GC story in WASM in 10.


That is always bound to happen: even when a bytecode is designed from the ground up to support multiple languages, eventually one of them ends up winning, as it is too much mental complexity to always keep moving the platform forward with all of them in mind.

Eventually one of them emerges as the main one, and then all the others don't necessarily have access to everything like in the early days.

One sees this with the Amsterdam Compiler Kit, IBM TIMI, TDF, and more recently the CLR, where, since the .NET Framework to .NET Core transition, it seems to mean "C# Language Runtime" instead of the original Common Language Runtime, with a decrease of investment in VB, F# and C++/CLI development and feature parity with C#.

The thing that nags me with WASM is how so many people try to sell it, as if it was the very first of its kind.


> The thing that nags me with WASM is how so many people try to sell it, as if it was the very first of its kind.

I don't get that vibe. Just ask, how do you get to write applications with good, predictable performance, perhaps with multithreading and explicit memory management, in the browser?

It doesn't matter how much of this has existed before in some form or shape. It's about the "product" more than it is about grandiose ideas (and the product might not be completely there yet; at least it wasn't some 3 years ago).


There are two separate, orthogonal channels of discussion that I think people are poking at.

1. WASM as a browser tech for delivering rich applications inside the browser. On this one I will shrug. I understand the motivation. I don't particularly like it, because my vision of the "web" is not that, but it's a lost battle and I don't have a horse in this race. It's effectively the resurrection of Java applets, but done better, and more earnestly. It's going to solve the kinds of problems you're talking about, I guess, but introduce new ones (even more inconsistency of UX, accessibility features, performance issues, etc.)

2. WASM as a general / universal runtime for server-side work. On this, I see a lot of hype and thin substance, a lot of smoke but no fire, and I'm quite skeptical. It looks to me like classic "Have a Hammer, Going to Go Find Nails" syndrome. I was initially enthused about this aspect of WASM, but I was employed working with WASM for a bit and found a lot to be skeptical about. And while I will likely be using WASM in some fashion similar to this for a project I have, I am also not convinced that WASM itself makes a lot of sense as some sort of generic answer for containerization; it looks to me like duplication of effort, claims of novelty where there is none, unhealthy cycles in the tech industry, etc.

Anyways, I think the person you're replying to, and myself, are primarily talking about #2 -- as was the original article.


All those VC-powered companies selling WASM containers in Kubernetes as if application servers weren't a thing 20 years ago, or as if IBM hadn't been shipping TIMI executables for decades.

Or talking about how "safe" WASM happens to be, while some USENIX papers are already slowly making their appearance regarding WASM-based attacks.


> Especially when we get into things like accessibility, screen readers, etc.

> the JVM was supposed to be? But I understand that's not "cool" now

Both of these criticisms in the same post?


I naively hope the web bifurcates into sandboxed wasm apps and document content that doesn't even need js, much less wasm. I'm not sure what a middle ground would look like or why I'd want it. But the realist in me knows wasm will eat the document content too, meaning adblockers and reader view are doomed...


> meaning adblockers and reader view are doomed

Maybe. As inconvenient as accessibility is, with any luck the need to make web content legible to screen readers will also keep adblockers working. Even with wasm, I don’t think the DOM is going anywhere any time soon. I haven’t seen any proposal to replace it.


You are probably right. Raster frameworks that talk straight to a GL context are out there; eframe/egui is one I've used. And yeah, accessibility is bad. Pair that with encrypted WebSockets and WebTPM (which, if it isn't a thing yet, will be) and you won't have any control over the chain between the screen and the server.


I think, JavaScript (or something similar) was required for this to work. Otherwise the ecosystem would have been infected by something like Java.


> Otherwise the ecosystem would have been infected by something like Java

As opposed to the basket of kittens known as JavaScript?


Yes.


I absolutely love this. I also hadn't seen several of the linked technologies before, so I'm bookmarking all of them, too.

Next up, I want to configure the hypervisor with a WireGuard connection (possibly through something like Tailscale to establish connections?)...

So I have WebAssembly over here on this machine, talking directly to this WebAssembly over there. Based on configuration and capabilities being passed in. Rather than based on the process opening TCP connections to random locations.



I'm late to the party but...

Has anyone contemplated running Zephyr as a unikernel? https://docs.zephyrproject.org/latest/boards/x86/acrn/doc/in...


How long till we see dedicated WASM hardware?


Pedantically - never, because in the strictest sense it is not specified enough for that.

But perhaps someone could make a "wasm-but-it's-actually-RISCV-underneath" kinda thing.


Fair, may have been a dumb question :)


I am fairly sure someone will make some. Just like we had Lisp machines and even specific JVM CPUs.

But my prediction is that those will always stay niche, because running WASM on conventional stock hardware will always be faster in general. Mostly because WASM was designed to run fast on stock hardware, and the economies of scale for conventional general-purpose processors are much better.

Compare also how the 'International Conference on Functional Programming' started out as the 'Functional Programming and Computer Architecture' conference, but then people figured out how to compile lazy functional programming languages like Haskell to run efficiently on conventional hardware.

Similarly for the Lisp and Java machines: one reason we don't see things like them anymore is that compiler technology has caught up.


What are the use cases of unikernels and WASM?


This is what the first section of the article is about.


It talks about learning and fun, but there's always a remote chance that someone could have an idea for a practical application.


Won't speak to WASM, or I'll go all "get off my lawn."

But to me the value-sell of unikernels is: 1) Perf: squeeze out some extra cycles by throwing overboard things you don't need and pulling the things you do need into "ring 0"; 2) Simplicity: potentially reduce complexity by ditching some of the things you don't need; and 3) Security: potentially change the attack surface... again, by...

To be clear: I don't think this is right for writing microservices and webapps like most of the people on this forum are employed doing... I think the use case is more for people building infrastructure (databases, load balancers, etc. etc.)


As micro-VMs can, for some (not all) tasks, compete with Linux containers but have the benefit of not exposing your Linux kernel to less-trusted code.

Hence why, e.g., some cloud-on-the-edge providers convert your Docker image to a micro-VM when running it.

So maybe some use can be found there.

Though wasm in a micro-VM on the edge will probably have a hard time competing with wasm as a sandbox on the edge, as such providers probably have an easier time adding useful boundary features/integrations.


Probably to expand where your WASM can go: in the browser, in a Docker container, and now in a lightweight OS that could go on an embedded device.


One day it can even run on SIM cards and Visa/Mastercard chips!


I see what you did there...


Yep. :)


All the different ways to reinvent the JVM.


True, but can't WebAssembly also be non-GC'd?


Only because they are already five years late adding GC support. And even then, WASM isn't the first bytecode format supporting C and C++; there have already been a couple since the 1980s.


But it's an opt-in GC? It's not accidental by any means


As "promised" years ago in Birth & Death of Javascript [0], at some point we shall get a unikernel running a safe GC-collected runtime in kernel-space, at which point we could drop virtual memory mapping support from CPUs, making them faster. While in 2014 the author predicted this will be JS with asm.js, now WASM seems like the way to go. Can't wait (haha)!

[0]: https://www.destroyallsoftware.com/talks/the-birth-and-death...


> drop virtual memory mapping support from CPUs, making them faster.

In the video, his argument was that the browsers are single-process anyway, and if everything runs in that process, we don't need that separation. However, since then, we've learned that single-process browsers are a security nightmare, so these days browsers are actually not single-process anymore to provide proper sandboxing.

But I love how close to correct that video is, and it's interesting to see in what ways it turned out to be wrong.


Defense-in-depth is always best practice in security. The more layers the attacker has to break and the harder each layer is, the better. All layers can and will be broken.

Apple has spent a long time hardening the JavaScriptCore web sandbox to run untrusted code. We’ve come a long way since JailbreakMe’s web-based jailbreak, but ultimately memory safety requires participation from all parts of the stack and JavaScriptCore and V8 are still both written in C++. You can trigger memory-safety vulnerabilities in the host VM using guest code.

wasmtime is supposedly a hardened WebAssembly runtime written in Rust, but it’s also a JIT, and I have no idea if anyone has put it through its paces security-wise yet. The idea is that WebAssembly can have JIT-like performance without JIT-like security concerns thanks to a simpler translation layer and minimal runtime.

I could see an argument for dropping some layers if the VM isolation becomes stronger.


> The more layers the attacker has to break and the harder each layer is, the better.

No its not, when it comes to end-user app performance, experience or privacy.

Sure, by adding security we can have another reason to let developers end up with a golang app compiled to wasm, running within Electron, sandboxed through API redirection (OS + antimalware/antivirus/BPF-based EDR), and use it for, like, listening to music in a very secure way...

With all these layers happily streaming all kinds of telemetry to who knows where, with owning nothing but a bunch of numbers behind a ton of DRM layers, and with no ability to change things, to the point where we can't have an app's theme matching system colors because of cross-platform compatibility/security reasons.

Case 1, firefox:

> dom.security.unexpected_system_load_telemetry_enabled

> security.app_menu.recordEventTelemetry

> security.protectionspopup.recordEventTelemetry

> security.certerrors.recordEventTelemetry

I don't want to accept developer's assumption that these have to be enabled by default.

Case 2, Windows: can't even do a build of a trusted codebase under IntelliJ without antimalware adding, like, +150% to build time. Meanwhile IntelliJ (or some of its extensions or plugins that creep in during development) is happily reporting that performance issue back to its masters. Ugly.


This may change if you're using a bunch of wasm sandboxes. The browser would split its memory up into multiple sandboxes with a process-like interface, but one that doesn't need virtual memory.


Amen. Single-address-space OSes would be cool for running trusted code with minimal overhead in-kernel while avoiding crashing the machine because of a bug. But I want more sandboxing, not less, when running untrusted code.


JavaScript/WASM evolution: designed for applications running in a browser -> writing desktop and server applications -> writing an OS or kernel

I can't put my finger on it, but somehow this looks familiar (hint: it starts with a "J", too)


WASM manages the trick with a vastly simpler specification and runtime. Not much more than a compile target for other languages.


Except it just got a garbage collector.


We're working to make sure that there will be an officially blessed subset (called a "profile") that will not require GC.


I'd be interested in knowing more about that. Is there a summary of current progress anywhere?


This is the repo for the "profiles" feature by which we will define standard subsets:

https://github.com/webassembly/profiles


It did? I thought that proposal was basically stuck; I haven't checked in a while, but I haven't heard of it moving forward either.


Chrome will be shipping it in the next version: https://chromestatus.com/feature/6062715726462976


Which makes it an even better compilation target!


Can't wait for the Wazelle processor extensions to drop.


> The JavaStation was a Network Computer (NC) developed by Sun Microsystems between 1996 and 2000, intended to run only Java applications.

https://en.wikipedia.org/wiki/JavaStation


Virtual memory and paging isn't just about protection/security/process isolation. It's also about making the most effective use of physical memory -- process virtual usage can exceed process RSS and not just because of swapping -- and providing a set of abstractions for managing memory generally. The OS and the allocator are working together, with the OS having a lot of smarts on machine usage in order to make that Fairly Smart in the general case.

So I don't think there's an automatic win in terms of performance by ridding yourself of it. Especially if you're running through the (pretty slow) WASM VM layer anyways.

For some applications (e.g. databases), running unikernel or closer to kernel and having direct access to the MMU could be a big win (e.g. see https://github.com/tuhhosg/exmap & https://github.com/viktorleis/vmcache & https://www.cs.cit.tum.de/fileadmin/w00cfj/dis/_my_direct_up...).

For general applications, especially those written to a POSIX standard or making assumptions that the machine they're running on looks like a typical modern-day computer? Dubious. You'd end up writing a bunch of what the VMM layer does in user code.


> drop virtual memory mapping support

the more I think about it, the less it makes sense:

- JS engines rely on the VMM, and wasm does too (in many ways)

- close to every non-embedded, non-trivial program I have seen is in subtle ways based on the assumption of a VMM

- some VM technology, especially around micro-VMs, uses the VMM too. And unikernels only really make sense as VMs


Also, how would software memory protection (like that seen in the JVM, JavaScript, Python, ...) be faster than a hardware MMU? Hardware simply adds more transistors that run the translation concurrently. Faults are either bugs (segfaults) or features you'd have to reimplement anyway.


Paging implemented naively needs a handful of extra memory accesses to fetch and decode page tables, for each application memory access, which is obviously very expensive. Which is why we have TLBs, which are (small) caches of page table data.

However, the 4kiB page size that is typically used and is baked into most software was decided on in the mid-1980s, and is tiny compared to today's memory and application working set sizes, causing TLB thrashing, often rendering the TLB solution ineffective.

Whatever overhead software memory protection would add is likely going to be small in comparison to cost of TLB thrashing. Fortunately, TLB thrashing can be reduced/avoided by switching to larger page sizes, as well as the use of sequential access rather than random access algorithms.

https://en.wikipedia.org/wiki/Translation_lookaside_buffer
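On Linux the page-size lever is easy to experiment with; a quick sketch using the usual sysfs/procfs knobs on x86_64:

    # See whether transparent huge pages (2 MiB on x86_64) are being used.
    cat /sys/kernel/mm/transparent_hugepage/enabled

    # Ask the kernel to back anonymous memory with huge pages wherever possible.
    echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled

    # Or reserve explicit 2 MiB huge pages for use via hugetlbfs / mmap(MAP_HUGETLB).
    echo 512 | sudo tee /proc/sys/vm/nr_hugepages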


I don’t get this. Any software implementation of virtual address space is going to need translation tables and “lookaside” caches. But now those structures are competing with real application data for L1 space and bandwidth, not to mention the integer execution units when you use them.

As I understand, the Smalltalk world put a lot of engineering effort into making the software-based model work with performance and efficiency. I don’t think the results were encouraging.


The software-implementation would not have to be a direct emulation of what the hardware does. You are working with the type-system of whatever sandboxed language you are running, and can make much more high-level decisions about what accesses would be legal or not, or how they should get translated, instead of having to go through table lookups on each and every memory access. If you trust the JIT or the compiler you can even make many of the decisions ahead of time, or hoist them outside of loops to virtually eliminate any overhead.

A lot has happened since Smalltalk.


Real answer: because the software implementation works by proving mathematically (without running the code) that the code won't violate the virtual address space reserved for it by the kernel.

Then, at runtime, it does nothing at all. Which is very fast.


Paging and lookaside tables are needed for virtual->physical translation. The idea is that a pure software-based implementation wouldn't need them at all; at most it would use something segment-like (with just a base offset and a segment bound), which is much easier to handle.

Then again, that's the theory, in practice there are many reasons why hardware moved from early segment based architectures to paging, and memory isolation is only one of them.


I guess we'd end up with hardware implementations returning to segmentation registers.


No, we never will.

Segmentation was an evil that everyone, on both the hardware and the software side, was very happy to get rid of.

Whoever reintroduces segmentation will probably be burned at the stake by computer developers in the afterlife (/j)


What makes you say that? I know Grsecurity made solid use of the segmentation registers for quite a long time.


Yet CHERI is gaining some ground.


You have lighter context switches [0] and finer-grained security domains; consider e.g. passing a pointer versus de/serialising across process boundaries. (The former benefits the latter too, since there's less of a performance cost to cutting up software into more domains.)

[0] https://www.microsoft.com/en-us/research/publication/deconst...


It probably isn’t worth digging too much into what was essentially a joke. I think the claim is that one would sufficiently trust the safety guarantees of the compiler/runtime to not need any runtime memory protection (software or hardware).

The hardware mmu does have costs: tlbs are quite small and looking things up in a several-layer tree adds a lot of latency. If vm were fine, no one would care much about hugepages, and yet people do care about them. (Larger pages means fewer tlb misses and fewer levels in the tree to look up when there is a miss)


I wouldn't call TLBs small:

> Consequently, modern processors have extremely large and highly associative two-level TLBs per CPU — for example, Intel’s Skylake chip uses 64-entry level-1 (L1) TLBs and 12-way, 1,536-entry level-2 (L2) TLBs. These structures require almost as much area as L1 caches today, and can consume as much as 10 to 15 percent of the chip energy.

Bhattacharjee, Abhishek. "Preserving virtual memory by mitigating the address translation wall." IEEE Micro 37.5 (2017): 6-10.


The thing is, now we use and pay the price for both - memory is managed in software, and yet CPU MMU and caches have to sacrifice space on the die for complex memory mappings. Instead we could get extra transistors for better performance (or, like in Apple CPUs, dedicated instructions for GC languages).


> like in Apple CPUs, dedicated instructions for GC languages

Could you expand on this?


I was trying to refer to this https://news.ycombinator.com/item?id=25233554 (https://threadreaderapp.com/thread/1331735383193903104.html) but I didn't have time to look it up, sorry.


There are no instructions for GC'd languages. That was the old Jazelle ARM extension (which could microcode some of Java's bytecode for direct execution).

The "JavaScript instruction" is FJCVTZS, which is a rounding mode matching x86 semantics, which is incidentally what JS specifies for double -> int32 conversions, and soft-coding it on top of FCVTZS is rather expensive (it requires a dozen additional instructions to fix up edge cases).

This is beneficial to javascript (on the order of a percentage point on some benchmarks suites, however pure javascript crypto can get high double digits gains), but it’s also beneficial for any replication of x86 rounding on ARM, including but not limited to emulating x86 on arm (aka Rosetta 2).


Theseus OS doesn't depend on hardware for isolation, as an example. Single address space, single privilege level, yet still safe.


Single Address Space OSs have been around forever. Turns out that memory protection is useful even if you are running memory safe code.

Also spectre.


Does anyone know of attempts to add CPU instructions that allow JITs and compilers to mitigate Spectre by using speculation-safe instructions for safety critical checks? I could imagine a "load if less than" or similar instruction, which the compiler could use to incorporate the safety check into the load instruction and avoid a separate branch that could be mispredicted. Such an instruction would be documented to have no side effects (even timing side effects) if the condition were not met.


Many CPUs already have speculation barriers. But of course they are slow.


If the hardware is designed to support single-address-space OSes, it doesn't have to be a security problem. It can help avoid Spectre-like problems because it can lower the expected overhead of permission checks so far that there is no advantage to speculating on them instead of performing them.


I think you are confusing Meltdown (a speculation attack on hardware permission checks, which was patched in later revisions of Intel silicon and never affected other vendors) with Spectre, a general family of attacks on speculative execution, which are generally unsolved.

You could of course add dedicated hardware to lower the overhead specifically of memory access permission checks. In fact most CPUs already do, it is called an MMU.


Safe-language OSes like Theseus don't have this class of problems, by their very design. I think it's a superior architecture to current conventional OSes which rely on hardware for protection.


How does Theseus prevent speculation attacks? This page [1] mentions them, but has nothing on how the software prevents them.

[1] https://www.theseus-os.com/Theseus/book/design/idea.html


My understanding is that conventional OSes rely on hardware to provide kernel and userspace data isolation, while Theseus relies on Rust compiler, as in safe Rust you can't access arbitrary memory locations.

Maybe watch the project founder's talk? https://youtu.be/n7r8zO7SodE?si=nswWcFrkTj7K1GpZ


By "this class of problems" I assumed you were talking about speculation attacks. How does the rust compiler help? Sorry, I'm not going to watch a talk.


I'm sorry, I did not mean that; a misunderstanding twice on my part. I meant that you can have a SAS/SPL OS and have it be safe too. The Theseus book simply states that relying on hardware for data isolation has proven a deficient approach, given the existence of such attacks.


Some folks in this subthread would benefit from re-acquainting themselves with some old OS research. I am specifically thinking of Opal [0], which differentiates the various roles virtual memory management plays. In Opal, all tasks (processes) share a single 64-bit address space (so you can just share pointers), but hardware provides page-level protection.

[0] https://homes.cs.washington.edu/~levy/opal/opal.html


Without an MMU, swapping to disk becomes a sizeable challenge. I don't think WASM (or Java, or any other kind of VM) should assume it has infinite physical resources of any kind, but am not surprised that JS folk are so far away from hardware they will sometimes forget how computers actually work...


> Without an MMU, swapping to disk becomes a sizeable challenge.

Swapping object graphs out to disk (and substituting entry points by swap-in proxies) was a thing in Smalltalk systems, and I expect Lisp machines must have had their own solutions. For that matter, 16-bit Windows could (with great difficulty) swap on an 8086, and other DOS “overlay managers” existed. Not that I like the idea, necessarily, but this one problem is not unsolvable.


And in all those cases they made use of MMUs to make it perform at a usable speed.

I still remember using overlays in Turbo Pascal, Turbo Basic and Clipper.

The Amiga also didn't have an MMU, and we all "enjoyed" our Guru Meditation moments.


Seeing that flashing red rectangle was quite a common sight, I might add.


Well, let's add efficiency to the mix then (I used Smalltalk and LISP machines, and neither managed RAM effectively enough, to the point where emacs was... fast! at the time).


You are correct. The main reason why MMUs exist is to fix memory fragmentation issues, not security. (Security was an afterthought bolted on later.)


If you have a GC tracking accesses (not just writes) it could also be used to move seldom referenced objects from memory to disk.


You'd still very much need virtual memory to isolate WASM linear memories of different processes, unless you want to range check every memory access. If we're dropping linear memory and using the new age GC WASM stuff, sure.

An exploit of the runtime in such a system would of course be a disaster of the utmost proportions, and to have any chance of decent performance you'd need a very complex (read: exploitable) runtime.


I suspect the underlying assumption here is that each WASM module/program would/could likely exist in its own unikernel on the hypervisor. Which is something I guess you could do since boot and startup times could be pretty minimal. How you would share state between the two, I'm unclear on, though.

The question is.. if you have full isolation and separation of the processes etc... why are you bothering with the WASM now?


> if you have full isolation and separation of the processes etc... why are you bothering with the WASM now?

WASM can help with portability.

Any sandbox layer can help with anomaly/exploit/bug detection, accelerating fixes to untrusted code, or a neighboring sandbox layer.

"Phrack: Twenty years of Escaping the Java Sandbox" (2018), https://www.exploit-db.com/papers/45517


Then we must go deeper! Put some WASM in a JVM in the WASM. In an OS. In a hypervisor.


haha, today's shiny network effect attractor is tomorrow's legacy quicksand to be abstracted, emulated or deprecated. The addition and deletion of turtles will continue.

> Put some WASM in a JVM in the WASM. In an OS. In a hypervisor.

Intel TDX comes to mind.


I would like to see a single-address-space kernel with the hardware for permissions and remapping split apart. This would enable virtually tagged and indexed caches all the way down to the last-level cache without risking aliasing.

There could be special cases for a handful of permission checks using (base, size) pairs for things like the current stack, the largest few code blocks, etc., relieving the pressure on the page-based permission-check hardware, which could also run in parallel with cache accesses (just pretty please don't leave observable uarch state changes behind on denied accesses).

To support efficient fork(), the hardware could differentiate between local and global addresses by xoring or adding/subtracting the process identifier into the address if a tag bit is present in the upper address bits. This should move a lot of expensive steps off the critical path to memory without breaking anything userspace software has to do. Add a form of efficient delegation of permissions (e.g. hardware-protected capabilities) and you have the building blocks for very fast IPC, even for large messages.


The earliest implementation (that I know of) of that idea was in 1997, with Inferno OS [0].

One more recent effort that also implements the same idea is the Phantom OS [1].

[0] https://www.vitanuova.com/inferno/

[1] http://phantomos.org/


See also Singularity from Microsoft Research, using the .NET CLR: https://en.wikipedia.org/wiki/Singularity_(operating_system)

In reality, I think there is always going to be a hypervisor to separate the various workloads, and the hypervisor is likely to keep using paging, to support dynamic memory partitioning -- though perhaps with a larger page size, so as to not create too much pressure on the TLB.


Also MirageOS, probably the most real-world used unikernel (OCaml based).


The earliest implementation was the Burroughs B5500, in 1961: a bytecode OS written in a safe systems language (ESPOL, shortly thereafter replaced with NEWP), where all hardware operations are exposed via intrinsics, and one of the first recorded uses of explicit unsafe code blocks.

The CPUs were microcoded, so the bytecode was for all practical purposes Assembly.

https://en.m.wikipedia.org/wiki/Burroughs_Large_Systems


Very interesting, congratulations.



