A small, fast WebAssembly runtime written in Zig. Library and CLI.
Supported host targets:
aarch64-macosx86_64-linuxaarch64-linuxx86_64-windows
Most Wasm runtimes are either fast but large (wasmtime ~56MB) or small but slow (wasm3 ~0.3MB, interpreter only). zwasm targets the gap between them: ~1.2MB with ARM64 + x86_64 JIT compilation, including SIMD.
| Runtime | Binary | Memory | JIT |
|---|---|---|---|
| zwasm | 1.2MB | ~3.5MB | ARM64 + x86_64 (incl. SIMD) |
| wasmtime | 56MB | ~12MB | Cranelift |
| wasm3 | 0.3MB | ~1MB | None |
zwasm was extracted from ClojureWasm (a Zig reimplementation of Clojure) where optimizing a Wasm subsystem inside a language runtime created a "runtime within runtime" problem. Separating it produced a cleaner codebase, independent optimization, and a reusable library. ClojureWasm remains the primary consumer.
- 581+ opcodes: Full MVP + SIMD (236 + 20 relaxed) + Exception handling + Function references + GC + Threads (79 atomics)
- 4-tier execution: bytecode > predecoded IR > register IR > ARM64/x86_64 JIT (NEON/SSE SIMD)
- 100% spec conformance: 62,263/62,263 spec tests, 792/792 E2E tests, 50 real-world programs (macOS + Linux + Windows x86_64 CI)
- All Wasm 3.0 proposals: See Spec Coverage below
- Component Model: WIT parser, Canonical ABI, component linking, WASI P2 adapter
- WAT support:
zwasm file.wat, build-time optional (-Dwat=false) - WASI Preview 1 + 2: 46/46 P1 syscalls (100%), P2 via component adapter
- Threads: Shared memory, 79 atomic operations (load/store/RMW/cmpxchg), wait/notify
- Security: Deny-by-default WASI, capability flags, resource limits
- Zero dependencies: Pure Zig, no libc required
- Allocator-parameterized: Caller controls memory allocation
All ratified Wasm proposals through 3.0 are implemented.
| Spec | Proposals | Status |
|---|---|---|
| Wasm 1.0 | MVP (172 opcodes) | Complete |
| Wasm 2.0 | Sign extension, Non-trapping f->i, Bulk memory, Reference types, Multi-value, Fixed-width SIMD (236) | All complete |
| Wasm 3.0 | Memory64, Exception handling, Tail calls, Extended const, Branch hinting, Multi-memory, Relaxed SIMD (20), Function references, GC (31) | All complete |
| Phase 3 | Wide arithmetic (4), Custom page sizes | Complete |
| Phase 4 | Threads (79 atomics) | Complete |
| Layer | Component Model (WIT, Canon ABI, WASI P2) | Complete |
18/18 proposals complete. 521 unit tests, 792/792 E2E tests, 50 real-world compatibility tests.
Benchmarked on Apple M4 Pro against wasmtime 41.0.1 (Cranelift JIT). 16 of 29 benchmarks match or beat wasmtime. 25/29 within 1.5x. Memory usage 3-4x lower than wasmtime, 8-10x lower than Bun/Node.
| Benchmark | zwasm | wasmtime | Bun | Node |
|---|---|---|---|---|
| nqueens(8) | 2ms | 5ms | 14ms | 23ms |
| nbody(1M) | 22ms | 22ms | 32ms | 36ms |
| gcd(12K,67K) | 2ms | 5ms | 14ms | 23ms |
| sieve(1M) | 5ms | 7ms | 17ms | 29ms |
| tak(24,16,8) | 5ms | 9ms | 17ms | 29ms |
| fib(35) | 46ms | 51ms | 36ms | 52ms |
| st_fib2 | 900ms | 674ms | 353ms | 389ms |
Full results (29 benchmarks): bench/runtime_comparison.yaml
SIMD benchmarks (ARM64 NEON / x86_64 SSE JIT):
| Benchmark | zwasm scalar | zwasm SIMD | wasmtime SIMD | SIMD speedup |
|---|---|---|---|---|
| image_blend (128x128) | 73 ms | 16 ms | 12 ms | 4.7x |
| matrix_mul (16x16) | 10 ms | 6 ms | 8 ms | 1.6x |
| byte_search (64KB) | 52 ms | 43 ms | 5 ms | 1.2x |
matrix_mul beats wasmtime. image_blend within 1.3x of wasmtime.
Full SIMD results: bench/simd_comparison.yaml
macOS / Linux:
zig build -Doptimize=ReleaseSafe
cp zig-out/bin/zwasm ~/.local/bin/
curl -fsSL https://raw.githubusercontent.com/clojurewasm/zwasm/main/install.sh | bashWindows (PowerShell):
zig build -Doptimize=ReleaseSafe
Copy-Item zig-out\bin\zwasm.exe "$env:LOCALAPPDATA\Microsoft\WindowsApps\zwasm.exe"
irm https://raw.githubusercontent.com/clojurewasm/zwasm/main/install.ps1 | iexzwasm module.wasm # Run a WASI module (run is optional)
zwasm module.wasm -- arg1 arg2 # With arguments
zwasm module.wat # Run a WAT text module
zwasm module.wasm --invoke fib 35 # Call a specific function
zwasm run module.wasm --allow-all # Explicit run subcommand also works
zwasm inspect module.wasm # Show exports, imports, memory
zwasm validate module.wasm # Validate without running
zwasm features # List supported proposals
zwasm features --json # Machine-readable outputconst zwasm = @import("zwasm");
var module = try zwasm.WasmModule.load(allocator, wasm_bytes);
defer module.deinit();
var args = [_]u64{35};
var results = [_]u64{0};
try module.invoke("fib", &args, &results);
// results[0] == 9227465See docs/usage.md for detailed library and CLI documentation.
zwasm also exposes a C API for use from any FFI-capable language (C, Python, Rust, Go, etc.):
zig build lib # Build libzwasm (.dll/.lib, .dylib/.a, or .so/.a)#include "zwasm.h"
zwasm_module_t *mod = zwasm_module_new(wasm_bytes, len);
uint64_t results[1] = {0};
zwasm_module_invoke(mod, "f", NULL, 0, results, 1);
zwasm_module_delete(mod);See the C API chapter in the book for the full API reference.
33 numbered educational WAT files, ordered from simple to advanced:
| # | Category | Examples |
|---|---|---|
| 01-09 | Basics | hello_add, if_else, loop, factorial, fibonacci, select, collatz, stack_machine, counter |
| 10-15 | Types | i64_math, float_math, bitwise, type_convert, sign_extend, saturating_trunc |
| 16-19 | Memory | memory, data_string, grow_memory, bulk_memory |
| 20-24 | Functions | multi_return, multi_value, br_table, mutual_recursion, call_indirect |
| 25-26 | Wasm 3.0 | return_call (tail calls), extended_const |
| 27-29 | Algorithms | bubble_sort, is_prime, simd_add |
| 30-33 | WASI | wasi_hello, wasi_echo, wasi_args, wasi_write_file |
Each file includes a run command in its header comment:
zwasm examples/wat/01_hello_add.wat --invoke add 2 3 # → 5
zwasm examples/wat/05_fibonacci.wat --invoke fib 10 # → 55
zwasm examples/wat/30_wasi_hello.wat --allow-all # → Hi!5 examples showing the library API: basic, memory, inspect, host_functions, wasi.
C example using libzwasm: load a module, invoke an export, read the result.
Python ctypes example: same workflow as C, no compiled bindings needed.
Rust FFI example: same workflow as C, using extern "C" bindings.
Requires Zig 0.15.2.
zig build # Build (Debug)
zig build test # Run all tests (521 tests)
zig build c-test # Run C API tests
./zig-out/bin/zwasm run file.wasmOn Windows use zig-out\bin\zwasm.exe.
Strip features at compile time to reduce binary size:
| Flag | Description | Default |
|---|---|---|
-Djit=false |
Disable JIT compiler | true |
-Dcomponent=false |
Disable Component Model | true |
-Dwat=false |
Disable WAT parser | true |
-Dsimd=false |
Disable SIMD opcodes | true |
-Dgc=false |
Disable GC proposal | true |
-Dthreads=false |
Disable threads/atomics | true |
Minimal build (~940 KB, −24%): zig build -Doptimize=ReleaseSafe -Djit=false -Dcomponent=false -Dwat=false
.wat text .wasm binary .wasm component
| | |
v | v
WAT Parser | Component Decoder
(optional) | (WIT + Canon ABI)
| | |
+------>-----+-----<---------+
|
v
Module (decode + validate)
|
v
Predecoded IR (fixed-width, cache-friendly)
|
v
Register IR (stack elimination, peephole opts)
| \
v v
RegIR Interpreter ARM64/x86_64 JIT
(fallback) (hot functions)
Hot functions are detected via call counting and back-edge counting, then compiled to native code. Functions that use unsupported opcodes fall back to the register IR interpreter.
Small and fast. zwasm prioritizes binary size and runtime performance density (performance per byte of binary). It targets environments where size and startup time matter: embedded systems, edge computing, CLI tools, and as an embeddable library in Zig projects.
ARM64 + x86_64. Full JIT support on both architectures including SIMD. Primary optimization target is Apple Silicon; x86_64 Linux and Windows are equally supported.
Spec fidelity over expedience. Correctness comes before performance. The spec test suite runs on every change.
- Stages 0-4: Core runtime (extraction, library API, spec conformance, ARM64 JIT)
- Stage 5: JIT coverage (20/21 benchmarks within 2x of wasmtime)
- Stages 7-12: Wasm 3.0 (memory64, exception handling, wide arithmetic, custom page sizes, WAT parser)
- Stage 13: x86_64 JIT backend
- Stages 14-18: Wasm 3.0 proposals (tail calls, multi-memory, relaxed SIMD, function references, GC)
- Stage 19: Post-GC improvements (GC spec tests, WASI P1 full coverage, GC collector)
- Stage 20:
zwasm featuresCLI - Stage 21: Threads (shared memory, 79 atomic operations)
- Stage 22: Component Model (WIT, Canon ABI, WASI P2)
- Stage 23: JIT optimization (smart spill, direct call, FP cache, self-call inline)
- Stage 25: Lightweight self-call (fib now matches wasmtime)
- Stages 26-31: JIT peephole, platform verification, spec cleanup, GC benchmarks
- Stage 32: 100% spec conformance (62,263/62,263 on macOS + Linux)
- Stage 33: Fuzz testing (differential testing, extended fuzz campaign, 0 crashes)
- Stage 34: Windows x86_64 native support (build, test, JIT, C API, release artifacts)
- Stages 35-41: Production hardening (crash safety, CI/CD, docs, API stability, distribution)
- Stages 42-43: Community preparation, v1.0.0 release
- Stages 44-47: WAT parser spec parity, SIMD perf analysis, book i18n, WAT roundtrip 100%
- Reliability: Cross-platform verification (50 real-world programs), JIT correctness (OSR, back-edge, guard pages)
- Phase 8: Real-world coverage (50 programs), WAT parity 100%, 5 JIT codegen fixes
- Phase 13: SIMD JIT (ARM64 NEON 253/256 native, x86_64 SSE 244/256 native)
- Future: WASI P3/async, contiguous v128 storage
SIMD (v128) operations are JIT-compiled to native NEON (ARM64) and SSE (x86_64) instructions. 253/256 opcodes are native on ARM64, 244/256 on x86_64. Hand-written SIMD microbenchmarks show strong results: matrix_mul beats wasmtime, image_blend within 1.3x.
Current gap: Compiler-generated SIMD code (C -msimd128) shows larger gaps (13-131x vs
wasmtime for scalar, wider for SIMD). Root cause is split v128 storage — each 128-bit vector
is stored as two 64-bit halves, adding overhead on every load/store.
Planned improvement: Contiguous v128 register storage (W37) and compiler-pattern
optimization (W38). See .dev/checklist.md for details.
zwasm follows Semantic Versioning. The public API surface is defined in docs/api-boundary.md.
- Stable types and functions (WasmModule, WasmFn, etc.) won't break in minor/patch releases
- Experimental types (runtime.*, WIT) may change in minor releases
- Deprecation: At least one minor version notice before removal
- Book (English) — Getting started, architecture, embedding guide, CLI reference
- Book (日本語) — 日本語ドキュメント
- API Boundary — Stable vs experimental API surface
- CHANGELOG — Version history
See CONTRIBUTING.md for build instructions, development workflow, and CI checks.
MIT
Developed in spare time alongside a day job. If you'd like to support continued development, sponsorship is welcome via GitHub Sponsors.