The Wasm Component Model and idiomatic codegen
Idiomatic code generation for Go using the Wasm Component Model. Compiling for different languages always has tradeoffs, which is why using standards helps everyone.
Building and compiling a Rust FFI library so it can be executed from Go.
The Arcjet security SDK is installed by developers to help protect their applications. It includes a WebAssembly module that analyzes every request and evaluates security rules to decide whether to block or allow each request. This could be detecting bots from the HTTP headers or running email validation on user input.
To minimize latency the goal is to execute as much as possible locally within the deployment environment. However, that is not always possible and so the SDK communicates with our API running close to the developer’s application.
This could be because we need to do more analysis, such as performing a lookup for known bots in our IP database (this is too large to be included as part of the SDK). But it might also simply be because WASM isn’t supported in the deployment environment. Whether as a fallback or as part of a multi-stage analysis pipeline, our API often needs to do the same things as the SDK.
In cases where we need to perform analysis on our servers, we want to ensure we return the same result as if the analysis were performed locally. Our SDK includes WASM compiled from Rust, but our API is implemented in Go (because it’s a gRPC API and Go’s gRPC tooling is better than Rust’s). It would be a waste of time to build the same functionality in multiple languages, and using multiple tech stacks is an opportunity for bugs and subtle differences in results.
A simple example is email verification. We perform both validation (of syntax) and verification (of deliverability). We only execute the verification step if the email syntax is valid, so we always validate first. Ideally this happens within the SDK so an invalid email can return the result immediately. This is implemented through a forked (and modified, with PRs submitted upstream) version of the Rust email_address crate. However, if WASM isn’t available locally then we need to perform the same validation from our API.
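The validate-then-verify ordering described above can be sketched in Go. This is only an illustration: `CheckEmail`, `validateSyntax`, `ErrInvalidSyntax`, and the `verify` callback are hypothetical names, and the syntax check is a deliberately naive stand-in, not the rules implemented by the email_address crate.

```go
package main

import (
	"errors"
	"strings"
)

// ErrInvalidSyntax is returned when the address fails the syntax check.
// (Hypothetical name used only in this sketch.)
var ErrInvalidSyntax = errors.New("invalid email syntax")

// validateSyntax is a naive stand-in for a real parser: it only requires
// exactly one "@" with text on both sides.
func validateSyntax(email string) error {
	local, domain, found := strings.Cut(email, "@")
	if !found || local == "" || domain == "" || strings.Contains(domain, "@") {
		return ErrInvalidSyntax
	}
	return nil
}

// CheckEmail validates syntax first and runs the (potentially slow)
// deliverability check only when the syntax is valid.
func CheckEmail(email string, verify func(domain string) (bool, error)) (bool, error) {
	if err := validateSyntax(email); err != nil {
		// Short-circuit: invalid input never reaches verification.
		return false, err
	}
	_, domain, _ := strings.Cut(email, "@")
	return verify(domain)
}
```

Passing the verification step in as a function keeps the cheap syntax check usable on its own, which mirrors the idea of returning immediately for an invalid address.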
There are libraries for validating email syntax from Go, but there are always subtle differences between how the various email standards are parsed and how certain strings are handled. Is there a way to call the Rust library from our Go code?
Rust is an excellent systems language so it makes a lot of sense to be able to use Rust libraries in other languages. Rust supports a Foreign Function Interface (FFI) so you can call into C libraries as well as allow C to call into Rust libraries.
Go has something similar with cgo, but we really don’t want to do anything with C directly. Memory safety is important, particularly for a security company! This means the ideal route is to write in Rust, but compile to a format that Go can interface with. That’s what FFI is for.
There are several great guides to exposing Rust code through a C interface, including Using Unsafe for Fun and Profit, rustgo: Calling rust from go with near-zero overhead, and this GitHub repo with some helper scripts and example code.
After reading through those I ended up writing a small Rust library that served as the C interface and included my main Rust code as a dependency. This just deals with setting up the C exports and converting between types.
This is a simple example of a function that takes a C string, converts it to a Rust string, calls another Rust function which returns a Rust string, then returns a C string:
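use std::ffi::{CStr, CString};
use std::ptr;

#[no_mangle]
pub unsafe extern "C" fn arcjet_launch(input: *const libc::c_char) -> *const libc::c_char {
    // A null pointer cannot be read, so bail out early
    if input.is_null() {
        return ptr::null();
    }

    // Convert the input from a C string to a Rust string
    let input_cstr = unsafe { CStr::from_ptr(input) };
    let input = match input_cstr.to_str() {
        Ok(s) => s.to_string(),
        // Invalid UTF-8: return null rather than panic across the FFI boundary
        Err(e) => {
            eprintln!("({})", e);
            return ptr::null();
        }
    };
    // You should do some validation here

    // Call the Rust function
    let output = arcjet_launch::launch(input);

    // Convert the output from a Rust string to a C string
    let output_cstr = match CString::new(output) {
        Ok(cstring) => cstring,
        Err(e) => {
            eprintln!("({})", e);
            return ptr::null();
        }
    };

    // Ownership passes to the caller; the pointer must come back to Rust to be freed
    output_cstr.into_raw()
}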
This then needs to be processed with cbindgen to generate the correct C headers. The best way is to automate it with a build.rs script:
// Adapted from https://michael-f-bryan.github.io/rust-ffi-guide/cbindgen.html
extern crate cbindgen;

use std::env;
use std::path::PathBuf;

fn main() {
    let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap();
    let package_name = env::var("CARGO_PKG_NAME").unwrap();
    let output_file = target_dir()
        .join(format!("{}.h", package_name))
        .display()
        .to_string();

    cbindgen::Builder::new()
        .with_crate(crate_dir)
        .with_language(cbindgen::Language::C)
        .generate()
        .unwrap()
        .write_to_file(&output_file);
}

/// Find the location of the `target/` directory. Note that this may be
/// overridden by `cmake`, so we also need to check the `CARGO_TARGET_DIR`
/// variable.
fn target_dir() -> PathBuf {
    if let Ok(target) = env::var("CARGO_TARGET_DIR") {
        PathBuf::from(target)
    } else {
        PathBuf::from(env::var("CARGO_MANIFEST_DIR").unwrap()).join("target")
    }
}
The Cargo.toml file looks like this:
[package]
name = "arcjet-launch-ffi"
version = "0.0.1"
edition = "2021"

[dependencies]
arcjet-launch = { path = "../launch" }
libc = "0.2.151"

[lib]
crate-type = ["cdylib"]

[build-dependencies]
cbindgen = "0.26.0"
At this point I hit issues with cross-compiling. We develop on macOS ARM, run CI/CD on GitHub Actions (Linux x86) and deploy to AWS Graviton instances (Linux ARM). Rust supports --target, but you also have to manage the linker. This is difficult (see rust#73493 and cargo#4133), so I decided to use Zig as the linker through cargo-zigbuild.
After installing it with cargo install cargo-zigbuild, compiling and linking for different platforms is much easier:
# Linux x86
rustup target add x86_64-unknown-linux-gnu
cargo zigbuild --release --target=x86_64-unknown-linux-gnu

# Linux ARM
rustup target add aarch64-unknown-linux-gnu
cargo zigbuild --release --target=aarch64-unknown-linux-gnu

# macOS ARM
rustup target add aarch64-apple-darwin
cargo zigbuild --target=aarch64-apple-darwin
Enabling cgo requires some thought about security. We build our containers with distroless to ensure the runtime environment is locked down. However, the most minimal distroless image assumes binaries are statically linked and don’t require libc. To load a Rust module compiled as a C library you have to open the environment up slightly, which means using the base distroless image that includes libc.
Calling out to a C library from a memory-safe language somewhat defeats the purpose, so we also looked at how we could minimize any C surface area. Purego is one step towards that. We can compile the Rust code as a shared C library and then dynamically load it at runtime with Purego. This means we don’t need to use cgo, can cache the Go builds, can avoid having to set up a C compiler in CI, and can compile our production binaries with CGO_ENABLED=0.
The final step is to load the Rust library from Go. This is a dynamic library so we use Dlopen:
arcjetlib, err := purego.Dlopen(libPath, purego.RTLD_NOW|purego.RTLD_GLOBAL)
if err != nil {
	// Handle error
}
defer purego.Dlclose(arcjetlib)

// Register the function
var launch func(string) string
purego.RegisterLibFunc(&launch, arcjetlib, "arcjet_launch")
In our real code we do this when the Go process starts so that we only have to initialize the library once. Then elsewhere we call the function like any normal Go function:
result := launch("input string")
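One way to make sure the library is only initialized once is `sync.Once`. The sketch below is an assumption about how this could be structured, not our production code: `Launcher`, `LaunchFunc`, and the `load` hook are hypothetical names, and `load` stands in for the `purego.Dlopen` and `purego.RegisterLibFunc` calls so the example stays self-contained. The library handle should stay open for as long as the returned function is in use.

```go
package main

import (
	"fmt"
	"sync"
)

// LaunchFunc matches the signature registered for arcjet_launch.
type LaunchFunc func(string) string

var (
	initOnce sync.Once
	launchFn LaunchFunc
	initErr  error
)

// Launcher returns the shared function, running load at most once even
// when called from several goroutines. load is a hypothetical hook: in a
// real program it would open the library and register the function.
func Launcher(load func() (LaunchFunc, error)) (LaunchFunc, error) {
	initOnce.Do(func() {
		launchFn, initErr = load()
		if initErr != nil {
			initErr = fmt.Errorf("loading arcjet library: %w", initErr)
		}
	})
	return launchFn, initErr
}
```

Every caller then goes through `Launcher` and receives either the same function or the same initialization error.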
There is almost zero overhead involved with Go calling the Rust C library so any performance impact will likely come from your own code. Go is fast. Rust is fast. And calling Rust from Go is also fast.
Although we’re executing the same code, there are still three separate languages involved: Rust, Go, and the C interface. Each of these provides an opportunity to introduce bugs and different behavior. The build step is not straightforward and you have to compile different versions for each platform architecture.
One direction is to rewrite our Go API in Rust. Then we can just call the Rust library from within Rust itself. However, Go is the perfect language for building network APIs and the ecosystem and tooling for gRPC is particularly good.
Another direction is to call the WASM module we already compile and run in the SDK. There are several WASM runtimes for Go and we have already been exploring Wazero, a pure-Go WebAssembly runtime. This is very similar in concept to Purego, whereas something like wasmtime-go involves cgo.
In the meantime, we’re enjoying the ability to write core functionality once in Rust and then execute it in whichever environment is most convenient.
Update 2024-07-23: We decided to move to using WebAssembly server side using Wazero in Go. Read more about this approach.