Compiling C to WebAssembly Without Emscripten (dassur.ma)
221 points by ingve on June 2, 2019 | 35 comments


> C without the standard library (called “libc”) is pretty rough

If you don't want to pull in musl or a traditional libc, there's a more Wasm-y solution known as the WebAssembly System Interface (WASI) [0] that delegates the libc functionality to the runtime.

A WASI Wasm module can be compiled using clang, as in the article. The only difference is to use the WASI sysroot [1].

> optimization

LTO and -O3 are great! I've also found the Twiggy [2] tool useful for more "manual" optimization.

[0] https://wasi.dev

[1] https://github.com/CraneStation/wasi-sdk/releases

[2] https://rustwasm.github.io/twiggy/


One of the things I like about Rust as an embedded developer is that they make a distinction between the core and std portions of the standard library. Core is everything that only depends on memset, memcpy, and memcmp, which means you know you get it for free on new ports.


Author here :) I am very excited about WASI, but as I mentioned in a comment below, wanted to keep it to the fundamentals so you can appreciate what WASI does for you.

And definitely agree with the shout-out to Twiggy (which I mention in the previous post in the series)!


Totally agreed.

Maybe the next step is to split it into core and std, like Rust, so that only a minimal subset needs to be used. But even without that option it's still excellent.


> -O3

If size is really important, you probably want to try -Os (optimize for size) and -Oz (try harder to optimize for size, including at the expense of CPU) as well.

> LTO

If your project gets big enough that -flto results in unacceptable link times, try -flto=thin.


I just created a code snippet in C [1] which prints hello world by using the (bare-metal) WASI interface. No headers included. The compiled wasm file works with wasmer, lucet, wasmtime and a web implementation of WASI [2]

[1] https://gist.github.com/s-macke/6dd78c78be46214d418454abb667...

[2] https://wasi.dev/polyfill/
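
For readers who can't open the gist, a WASI hello world without libc headers might look roughly like the sketch below. This is not the gist's code. The import module name `wasi_unstable` is an assumption (it was the name in use around mid-2019; later runtimes expect `wasi_snapshot_preview1`), and `wasi_ciovec` and `say` are names made up here. The two includes are compiler-provided freestanding headers, and the `#else` branch is only a native stand-in so the same file can be tried without a Wasm runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* One output buffer, laid out like WASI's ciovec: pointer, then length. */
typedef struct {
    const void *buf;
    size_t len;
} wasi_ciovec;

#ifdef __wasm__
/* Supplied by the WASI runtime. The module name is an assumption (see above). */
int32_t fd_write(int32_t fd, const wasi_ciovec *iovs,
                 size_t iovs_len, size_t *nwritten)
    __attribute__((import_module("wasi_unstable"), import_name("fd_write")));
#else
/* Native stand-in with the same shape, built on POSIX write(). */
#include <assert.h>
#include <unistd.h>
static int32_t fd_write(int32_t fd, const wasi_ciovec *iovs,
                        size_t iovs_len, size_t *nwritten) {
    assert(iovs != NULL && nwritten != NULL);
    size_t total = 0;
    for (size_t i = 0; i < iovs_len; i++) {
        ssize_t n = write(fd, iovs[i].buf, iovs[i].len);
        if (n < 0) return 1; /* nonzero means error, like WASI's errno result */
        total += (size_t)n;
    }
    *nwritten = total;
    return 0;
}
#endif

/* Writes len bytes of msg to stdout (fd 1).
   Returns the number of bytes written, or -1 on error. */
long say(const char *msg, size_t len) {
    wasi_ciovec iov = { msg, len };
    size_t written = 0;
    if (fd_write(1, &iov, 1, &written) != 0) return -1;
    return (long)written;
}

#ifdef __wasm__
/* Entry point that WASI runtimes call for command-style modules. */
void _start(void) { say("hello world\n", 12); }
#endif
```

The point of the sketch is that the whole "libc" surface for printing is a single imported function plus a two-field struct.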


I've been thinking of writing a blogpost comparing the advantages and disadvantages of emscripten, wasi, and plain llvm (which is what is discussed here), and also how those interact with web vs server. This space has definitely gotten more interesting recently!


I would love to read that post. I hope you can post it soon!


I'm all over that idea too.


Totally agreed. I hope it will include some examples, hosted on GitHub so we can try them. Looking forward to it.


Please do! Also I'd like to know the state of DOM access availability in each.


Yes please!


One of the pain points of using Wasm in the real world is the lack of decent debugging.

eg no ability to run code in a debugger, set breakpoints, etc.

That being said, it's an area being worked on.

Wasm generated by LLVM can already carry debugging info in DWARF format, stored in Wasm "custom sections" (they're a thing), e.g. .debug_info, .debug_str, etc.

So, debuggers are at least possible.

Unfortunately, the way Wasm does variables doesn't map to the way DWARF currently does them, so they can't be encoded correctly. A major problem. :(

Yury Delendik is working through a spec for fixing that (officially):

https://yurydelendik.github.io/webassembly-dwarf/

His initial implementation, a set of patches on an older LLVM so that it generates correct debug info according to the in-development spec, is here:

https://github.com/yurydelendik/llvm-project/tree/frame-poin...

Testing and feedback by a wider audience would be useful. :)

A more recent fork of LLVM (based on 8.0.1 dev ~3 days ago), with Yury's patches applied is here:

https://github.com/justinclift/llvm/commits/release_80-wasm_...

Personally, I'm still trying to get my head around generating DWARF debugging info. Hopefully work it out in a few days. :)


Does WebAssembly have a debug trap?


Not specifically:

https://webassembly.github.io/spec/core/syntax/instructions....

There is an "unreachable" instruction, which it may be possible to use with some creativity.

Haven't really thought that through, as I'm taking a different approach.

Since the Wasm VM executes instructions virtually, it should be feasible to pass the VM a list of addresses to break on.

eg have the VM listen on a socket, and pass it break point info (etc) out of band.

The main Go debugger - Delve - does this with non-Wasm targets.

As a Wasm VM executes, it just needs to check if the current instruction matches the current break point list or any other trigger conditions.

Hence my attempts to figure out debug info decoding, or at least the location in memory of variables, so they can be displayed when a breakpoint is hit.

There's not much use in a debugger that can't show the value of variables. ;)
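
The "check the current instruction against a breakpoint list" idea above can be sketched with a toy stack machine. Everything here is hypothetical: the three opcodes are not real Wasm, and `toy_vm`, `vm_run` and the `resuming` flag are names invented for illustration. A real VM would receive the breakpoint list out of band (e.g. over a socket) rather than from a static array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_PUSH, OP_ADD, OP_HALT }; /* toy opcodes, not real Wasm */

typedef struct {
    const uint8_t *code;       /* bytecode being executed */
    size_t pc;                 /* offset of the next instruction */
    int32_t stack[16];
    size_t sp;                 /* number of values on the stack */
    const size_t *breakpoints; /* code offsets to stop at */
    size_t n_breakpoints;
} toy_vm;

typedef enum { VM_HALTED, VM_BREAK } vm_stop;

static bool at_breakpoint(const toy_vm *vm) {
    for (size_t i = 0; i < vm->n_breakpoints; i++)
        if (vm->breakpoints[i] == vm->pc) return true;
    return false;
}

/* Runs until OP_HALT or until pc lands on a breakpoint. Passing
   resuming=true skips the check once, so execution can continue past
   the breakpoint it last stopped on. */
vm_stop vm_run(toy_vm *vm, bool resuming) {
    for (;;) {
        if (!resuming && at_breakpoint(vm)) return VM_BREAK;
        resuming = false;
        switch (vm->code[vm->pc]) {
        case OP_PUSH: /* operand is the next byte */
            assert(vm->sp < 16);
            vm->stack[vm->sp++] = vm->code[vm->pc + 1];
            vm->pc += 2;
            break;
        case OP_ADD: /* pop two values, push their sum */
            assert(vm->sp >= 2);
            vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
            vm->sp--;
            vm->pc += 1;
            break;
        default: /* OP_HALT or anything unknown */
            return VM_HALTED;
        }
    }
}
```

When `vm_run` returns `VM_BREAK`, the caller can inspect `pc`, `sp` and `stack` directly, which is the "registers and stack" view; showing named variables is the part that needs the debug info.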


> There's not much use in a debugger that can't show the value of variables. ;)

Well, that depends :)


Heh Heh Heh. While writing that, figured someone might have a valid use case. ;)

What sprang to mind for you?


Nothing specific, just pointing out that debug symbols are not necessary to get value from a debugger. A disassembly and registers (or stack, in this case?) view, along with a place to run debugger commands, is very useful in and of itself.


Anybody used or have opinions about AssemblyScript?

https://github.com/AssemblyScript/assemblyscript

>AssemblyScript compiles strictly typed TypeScript (basically JavaScript with types) to WebAssembly using Binaryen. It generates lean and mean WebAssembly modules while being just an npm install away.

https://dev.to/jtenner/an-assemblyscript-primer-for-typescri...

Here's a great example of a project that uses it:

https://github.com/torch2424/wasmboy

>Gameboy Emulator Library written in Web Assembly using AssemblyScript, Debugger/Shell in Preact

Here's an excellent talk about wasmboy by the author, Aaron Turner -- he's done some really outstanding work:

https://www.youtube.com/watch?v=ZlL1nduatZQ

Since you can compile AssemblyScript into JavaScript with the TypeScript compiler as well as into WebAssembly, you can compare the speed of JavaScript -vs- WebAssembly on the same source code. Aaron did some interesting benchmarks using wasmboy, in the great tradition of using GameBoy emulators to benchmark JavaScript engines:

https://medium.com/@torch2424/webassembly-is-fast-a-real-wor...


Aaron Turner posted this useful link to a good AssemblyScript example a while ago:

https://news.ycombinator.com/threads?id=torch2424

torch2424 7 months ago | on: Walt: JavaScript-like syntax for WebAssembly

I help out every once in a while on the AssemblyScript team with issues, docs, and things. I also made wasmboy, which uses AssemblyScript:

https://github.com/torch2424/wasmBoy

That being said, when people are interested in the language, we usually direct them to the "n-body" example, which kind of looks more typescript-y :) https://github.com/AssemblyScript/assemblyscript/blob/master...

Also, just to stay on topic, I think walt is awesome. Stoked to see so many projects coming up with a "Wasm for JS devs" approach/story.


I look forward to seeing it mature, given that I am a fan of TypeScript and type-safe languages.


Hello!

Aaron Turner here, thank you for all the kind words! :) Yes I did all those things, and stoked to see people excited about it!

Definitely feel free to reach out anytime about AssemblyScript or WasmBoy. Would love to chat with you / anyone interested.

We also have a slack channel you can reach out and get invited to (see the wiki sidebar): https://github.com/AssemblyScript/assemblyscript/wiki

Thanks again!


Hi Aaron! I got to the assemblyscript slack signin page here -- https://assemblyscript.slack.com/ -- but there's no obvious way to get an invitation. Where should I click or send a request to? Or if you could please send one to don@donhopkins.com, I'd appreciate that. Thank you!


This is a great article that takes you pretty far with very little. I think it's much easier to tinker and experiment when the boilerplate and tooling are reduced to a minimum -- you get a much deeper understanding of what's actually going on behind the scenes.


Yes, much easier to tinker with a more barebones system. The downside of powerful toolchains is often their complexity. So there's a difference between one being better for shipping code and one better for learning.


Nice comprehensive writeup. The next step, beyond the basic allocator provided, would be to use wasi-sdk (https://github.com/cranestation/wasi-sdk) which provides a full musl-based libc, targeting the WASI interfaces. With this, you can invoke a C program with arguments, environment variables, and filesystem access.


Agreed! WASI is the logical next step. It’s the “universal glue code” I wanted to have for the longest time.


The inNative WebAssembly Runtime ran into this problem as well, and includes wasm_malloc.c, which can be linked against your application to provide a simple malloc() implementation without having to write one yourself or depend on WASI.

https://github.com/innative-sdk/innative/wiki/Compile-C---wi...

Of course, it'll be a lot easier to simply depend on WASI instead and re-implement standard libraries on top of it.
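
For a sense of how small a "simple malloc()" can be, here is a minimal bump-allocator sketch. It is not inNative's wasm_malloc.c and not necessarily the article's allocator. `HEAP_SIZE`, `bump_malloc` and `bump_reset` are hypothetical names, and the static array stands in for the region of Wasm linear memory a real module would hand out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size heap standing in for spare linear memory. */
#define HEAP_SIZE 1024
static _Alignas(16) unsigned char heap[HEAP_SIZE];
static size_t heap_top = 0; /* offset of the first unused byte */

/* Hands out the next 8-byte-aligned chunk, or NULL when the heap is full.
   There is no per-allocation free(); memory is only reclaimed all at once. */
void *bump_malloc(size_t size) {
    assert(heap_top <= HEAP_SIZE);
    size_t start = (heap_top + 7u) & ~(size_t)7u; /* round up to 8 */
    if (size > HEAP_SIZE - start) return NULL;
    heap_top = start + size;
    return &heap[start];
}

/* Releases every allocation by rewinding to the start of the heap. */
void bump_reset(void) { heap_top = 0; }
```

The trade-off is the usual one: allocation is a few instructions, but individual blocks can never be freed, which is why a libc-backed malloc via WASI is the easier route for anything long-running.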



Author here :D WASI is great and has me all kinds of excited, but I wanted to cover the fundamentals so people can appreciate what WASI really gives you.


Off-topic: the visual aesthetic of this blog really has its own unique charm. It's truly something else and nonetheless seems to do the trick pretty well.


I consider myself aesthetically handicapped, so this comment made my day. Thank you very much.


What a nice blog post! Lots of detail, and just the right level of detail.


Though I'm certain the security concerns were integral to the design of WASM, the ability to pass a pointer from JavaScript downstream into WASM terrifies the hell out of me.

I know that theoretically every WASM module is supposed to have a fully isolated memory block, but I can't help but wonder about the day when a bug allows WASM to deliver malware payloads that read other web browser tabs. Let's hope that WASM doesn't become commonplace in ad networks.


It is a wasm pointer, not an OS pointer. You wouldn't actually get an address in OS memory, just an offset into the addressable memory within the wasm VM.

Getting isolation between tabs does seem to be an ongoing concern, but that's happening regardless of the presence of wasm: https://v8.dev/blog/spectre#site-isolation
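
To make the "wasm pointer" point concrete, here is a rough sketch of how a host could treat one: as a 32-bit offset into a byte buffer, checked against the buffer's size on every access. `linear_memory`, `mem_at` and `load_i32` are made-up names for illustration, not any engine's API, and a real engine would trap instead of returning a failure code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *bytes; /* host allocation backing the module's memory */
    uint32_t size;  /* size of that allocation in bytes */
} linear_memory;

/* Translates a wasm "pointer" (really an offset) into a host pointer.
   Returns NULL when [offset, offset + len) falls outside the memory. */
uint8_t *mem_at(const linear_memory *m, uint32_t offset, uint32_t len) {
    assert(m != NULL);
    if (offset > m->size || len > m->size - offset) return NULL;
    return m->bytes + offset;
}

/* Reads a little-endian 32-bit integer, as Wasm loads do.
   Returns 1 on success, 0 if the access would be out of bounds. */
int load_i32(const linear_memory *m, uint32_t offset, int32_t *out) {
    const uint8_t *p = mem_at(m, offset, 4);
    if (p == NULL) return 0;
    *out = (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
    return 1;
}
```

Whatever integer JavaScript passes in, the worst it can name is another byte inside the same buffer; anything outside it fails the bounds check.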



