Performance of WebAssembly runtimes in 2026

I wanted to know if WebAssembly runtimes are getting faster. This is a follow-up to the earlier libsodium WebAssembly benchmarks from 2019, 2021 and 2023.

The question is not “does the newest version beat native code in one microbenchmark?”, and not “which runtime has the prettiest benchmark chart?”, but something more boring and more useful:

If I take the same C crypto code, compile it to WebAssembly, and run it on the latest runtime, a runtime from one year ago, and a runtime from two years ago, are things actually improving?

So I benchmarked libsodium on WebAssembly runtimes released around June 2024, June 2025, and June 2026.

The short version:

  • WAVM and WasmEdge can be very fast, but I only got a complete 2026 WAVM run, and the latest stable WasmEdge release was unusably slow on this benchmark on my machine.
  • WAMR in AOT mode is also very fast, landing right next to WAVM and the best Wasmtime results.
  • wasm2c, Wasmer, and Wasmtime are all close enough to native to be interesting for CPU-bound crypto.
  • Wazero is slower, but stable.
  • The Node and Bun rows need a full rerun with longer benchmark loops. A smoke test showed that the short-loop run substantially under-warmed the JITs.
  • The experimental WebAssembly wide_arithmetic instructions are a big deal for crypto code when runtimes support them.

Draft note before publishing: the tables below still reflect the earlier short-loop run. I found two issues after generating them. First, Node 22.3.0 can run the failing password-hashing tests when the Wasm modules declare an explicit 64 MiB maximum linear memory. Second, Node and Bun need more in-module iterations before their optimizing compilers settle; a 1000-iteration smoke run on fast tests reduced Node 26.3.1 from about 13.0x native to 2.34x native, and Bun 1.3.14 from about 11.7x native to 2.61x native. The benchmark scripts now default to ITERATIONS=1000 and WASM_MAX_MEMORY=67108864; rerun the full matrix before treating the tables as final.

What I measured

The test program is libsodium’s benchmark suite, built from libsodium commit 8e3be8615ba6adcd7babaecf5e76f516890ba5fb.

I built one native baseline and several WebAssembly variants:

  • native x86-64, compiled with Zig using the local CPU target
  • plain WebAssembly
  • WebAssembly with lime1
  • WebAssembly with lime1 and simd128
  • WebAssembly with lime1, simd128, and wide_arithmetic

For the native reference, libsodium was built with -Dcpu=native. For wasm2c, the generated C was compiled with zig cc -O3 -march=native.

For WAMR, I used AOT mode: wamrc compiled each .wasm file to an .aot file, and iwasm ran the resulting AOT file. wamrc doesn’t accept --cpu=native, so I used --target=x86_64 --cpu=x86-64-v4 --opt-level=3, which matches the host’s available x86-64 feature level and works across the WAMR versions that could compile these modules.
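As a rough sketch of that two-step WAMR flow, the snippet below only builds the command lines; it does not run anything. The wamrc flags are the ones quoted above. The -o output option, the box_easy.wasm file name, and the absence of extra iwasm options are assumptions for illustration, not a record of the actual benchmark scripts.

```python
import shlex
from pathlib import Path

# Flags quoted in the text above.
WAMRC_FLAGS = ["--target=x86_64", "--cpu=x86-64-v4", "--opt-level=3"]


def wamr_aot_commands(wasm_path):
    """Return (compile_cmd, run_cmd) argv lists for one module.

    Assumption: wamrc takes "-o <output>" followed by the input module, and
    iwasm is given the .aot file directly with no further options.
    """
    wasm = Path(wasm_path)
    aot = wasm.with_suffix(".aot")
    compile_cmd = ["wamrc", *WAMRC_FLAGS, "-o", str(aot), str(wasm)]
    run_cmd = ["iwasm", str(aot)]
    return compile_cmd, run_cmd


# Hypothetical module name, used only to show the shape of the commands.
compile_cmd, run_cmd = wamr_aot_commands("box_easy.wasm")
print(shlex.join(compile_cmd))
print(shlex.join(run_cmd))
```

Passing the returned lists to subprocess.run would execute them; a real harness would probably also need whatever WASI options the benchmark binaries expect.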

The native command was:

zig build -Denable_benchmarks -Doptimize=ReleaseFast -Dcpu=native -Diterations=3

The WebAssembly commands were the same shape, with a wasm32-wasi target and the feature-specific CPU strings:

zig build -Denable_benchmarks -Dtarget=wasm32-wasi -Doptimize=ReleaseFast -Diterations=3
zig build -Denable_benchmarks -Dtarget=wasm32-wasi -Doptimize=ReleaseFast -Dcpu=lime1 -Diterations=3
zig build -Denable_benchmarks -Dtarget=wasm32-wasi -Doptimize=ReleaseFast -Dcpu=lime1+simd128 -Diterations=3
zig build -Denable_benchmarks -Dtarget=wasm32-wasi -Doptimize=ReleaseFast -Dcpu=lime1+simd128+wide_arithmetic -Diterations=3

The host was an AMD Ryzen AI 9 HX 470 with 12 cores and 24 threads. CPU boost was disabled and the maximum CPU frequency was 2 GHz. The OS was Linux 7.1.0-rc7, and Zig was 0.17.0-dev.948+e949341b7.

The numbers below are the geometric mean of per-benchmark slowdowns relative to the native build. Lower is better. A value of 2.0 means the WebAssembly run took twice as long as native on this machine.

I used ITERATIONS=3, so the very small libsodium tests are noisy and quantized. Rows reporting zero time were excluded from the aggregate. I did not pin benchmark processes to specific cores. This is still useful for comparing broad runtime behavior, but don’t treat the last decimal place as meaningful.
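The aggregation described above can be sketched as follows. This is a minimal illustration, not the actual benchmark script: the function name, the dictionary layout, and the sample timings are hypothetical, and it assumes a row is dropped when either the native or the WebAssembly side reports zero time.

```python
import math


def geomean_slowdown(native_times, wasm_times):
    """Geometric mean of wasm/native time ratios over comparable benchmarks.

    Rows that are missing on the wasm side, or that report zero time on
    either side, are skipped (assumed reading of the exclusion rule).
    """
    ratios = []
    for name, native in native_times.items():
        wasm = wasm_times.get(name)
        if wasm is None or native <= 0 or wasm <= 0:
            continue  # failed, missing, or quantized-to-zero row
        ratios.append(wasm / native)
    if not ratios:
        raise ValueError("no comparable benchmark rows")
    # exp(mean(log r)): a 2x slowdown and a 2x speedup cancel out,
    # which would not be true of an arithmetic mean.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical timings in nanoseconds, not taken from the real runs.
native = {"box_easy": 100, "hash": 200, "tiny": 0}
wasm = {"box_easy": 200, "hash": 800, "tiny": 3}
print(round(geomean_slowdown(native, wasm), 3))
```

Here the "tiny" row is ignored because its native time is zero, and the two remaining ratios (2 and 4) combine to their geometric mean rather than their average of 3.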

Versions

For every runtime except WAVM, I used the latest stable release available on June 23, 2026, plus a stable release from roughly one year earlier and one from roughly two years earlier.

Runtime       2024    2025    2026
Bun           1.1.16  1.2.17  1.3.14
Node          22.3.0  24.2.0  26.3.1
WAMR          2.1.0   2.3.1   2.4.4
WABT wasm2c   1.0.35  1.0.37  1.0.41
WasmEdge      0.14.0  0.14.1  0.17.0
Wasmer        4.3.2   6.0.1   7.1.0
Wasmtime      22.0.0  34.0.0  46.0.0
WAVM          n/a     n/a     nightly/2026-04-05
Wazero        1.7.3   1.9.0   1.12.0

WAVM is awkward to compare historically. The available nightlies for both the 2024 and 2025 slots resolved to the same 2022 binary, and that binary refused to run on this machine. I only kept the 2026 nightly.

WAMR 2.1.0, the selected 2024 release, installed fine, but its AOT compiler failed on these Zig-generated modules with “invalid WASM stack data type”. I kept the version in the matrix, but did not include an aggregate for it.

The 2024 and 2025 wasm2c release binaries depended on libcrypto.so.1.1, so I built those WABT versions from source and used their wasm2c tools.

Baseline WebAssembly

This is the plain WebAssembly build, without lime1, SIMD, or wide arithmetic.

Runtime    2024   2025   2026
WAVM       n/a    n/a    1.41
WAMR AOT   n/a    1.59   1.57
WasmEdge   1.66   1.98   partial
wasm2c     2.01   2.08   1.86
Wasmer     2.13   2.56   2.08
Wasmtime   2.67   2.54   2.41
Wazero     4.84   4.70   4.72
Node       8.60   8.22   7.95
Bun        27.41  26.42  8.77

There isn’t one universal trend.

Wasmtime steadily improved: 2.67x native in 2024, 2.54x in 2025, 2.41x in 2026. That’s not a revolution, but it is real progress.

Node also improved slowly, from 8.60x native to 7.95x native.

Wazero was basically flat: 4.84x, 4.70x, 4.72x. That’s not bad, but this benchmark doesn’t show a big speedup over the last two years.

WAMR in AOT mode was already fast in 2025 and slightly faster in 2026: 1.59x native, then 1.57x native. I don’t have a complete 2024 WAMR number because WAMR 2.1.0 couldn’t compile these modules.

Wasmer regressed in the 2025 release I tested, then recovered in 2026. The 2026 baseline is slightly faster than the 2024 baseline, but not by much.

wasm2c improved modestly in 2026. It remains one of the best options if ahead-of-time translation to native C is acceptable for your deployment model.

Bun is the outlier. Its 2024 and 2025 results were far behind, but the 2026 result is about three times faster than the 2025 result. It is still slower than Node on this benchmark, but the direction is excellent.

WasmEdge is complicated. The 2024 and 2025 releases were fast. The latest stable 2026 release, 0.17.0, started running the benchmark, but some tests were orders of magnitude slower on this host. I stopped that run after 15 tests and excluded it from the aggregate table. A simple clock probe worked, so this wasn’t just the old fake-clock problem. It needs separate investigation.
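The year-over-year movements discussed in this section can be recomputed from the baseline table. The sketch below does only that arithmetic; the numbers are copied from the table, the helper name is mine, and the Node and Bun rows carry the short-loop caveat from the draft note.

```python
# Baseline slowdowns vs native for (2024, 2025, 2026), copied from the
# baseline table. Runtimes without three complete values are left out.
baseline = {
    "wasm2c": (2.01, 2.08, 1.86),
    "Wasmer": (2.13, 2.56, 2.08),
    "Wasmtime": (2.67, 2.54, 2.41),
    "Wazero": (4.84, 4.70, 4.72),
    "Node": (8.60, 8.22, 7.95),
    "Bun": (27.41, 26.42, 8.77),
}


def percent_change(old, new):
    """Relative change in slowdown; negative means the newer release is faster."""
    return (new - old) / old * 100.0


for runtime, (y2024, y2025, y2026) in baseline.items():
    print(
        f"{runtime:9s}"
        f" 2024->2025 {percent_change(y2024, y2025):+6.1f}%"
        f" 2025->2026 {percent_change(y2025, y2026):+6.1f}%"
        f" 2024->2026 {percent_change(y2024, y2026):+6.1f}%"
    )
```

Because the inputs are slowdown factors rather than raw times, a change of -10% means the runtime's overhead relative to native shrank by a tenth, not that wall-clock time dropped by exactly that amount on every test.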

Best supported build by year

The baseline table is useful because it compares the same WebAssembly target everywhere.

But if you are choosing a runtime for your own deployment, you probably care about the fastest build that runtime can actually run.

So for each runtime and year, I also selected the best complete result among the supported builds: baseline, lime1, lime1+simd128, and lime1+simd128+wide_arithmetic.

Runtime    2024 best              2025 best                              2026 best
WAVM       n/a                    n/a                                    1.41 (baseline)
WAMR AOT   n/a                    1.42 (lime1+simd128)                   1.42 (lime1+simd128)
WasmEdge   1.62 (lime1+simd128)   1.64 (lime1)                           n/a
wasm2c     2.01 (baseline)        2.08 (baseline)                        1.86 (baseline)
Wasmer     2.09 (lime1)           2.49 (lime1)                           1.33 (lime1+simd128+wide_arithmetic)
Wasmtime   2.60 (lime1+simd128)   1.52 (lime1+simd128+wide_arithmetic)   1.46 (lime1+simd128+wide_arithmetic)
Wazero     4.84 (baseline)        4.64 (lime1)                           4.71 (lime1+simd128)
Node       8.60 (baseline)        7.99 (lime1)                           7.95 (baseline)
Bun        27.35 (lime1)          26.23 (lime1)                          8.77 (baseline)

Ranked by the best supported build, the complete current-year results are:

2026 rank  Runtime    Best build                      Slowdown vs native
1          Wasmer     lime1+simd128+wide_arithmetic   1.33
2          WAVM       baseline                        1.41
3          WAMR AOT   lime1+simd128                   1.42
4          Wasmtime   lime1+simd128+wide_arithmetic   1.46
5          wasm2c     baseline                        1.86
6          Wazero     lime1+simd128                   4.71
7          Node       baseline                        7.95
8          Bun        baseline                        8.77

WasmEdge 0.17.0 is not in the 2026 ranking because I did not get a complete usable run.
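The “best supported build” selection and the ranking can be sketched as a small reduction over per-build aggregates. The values below are the 2026 per-feature numbers reported in the next section; wasm2c is left out because only its baseline figure appears in this post. The data layout and function name are illustrative assumptions.

```python
BUILDS = ("baseline", "lime1", "lime1+simd128", "lime1+simd128+wide_arithmetic")

# 2026 aggregate slowdowns per build. None marks a build the runtime
# could not run (listed as "unsupported" in the per-feature table).
results_2026 = {
    "WAVM": (1.41, 1.59, 1.43, None),
    "WAMR AOT": (1.57, 1.44, 1.42, None),
    "Wasmer": (2.08, 2.02, 2.03, 1.33),
    "Wasmtime": (2.41, 2.30, 2.37, 1.46),
    "Wazero": (4.72, 4.77, 4.71, None),
    "Node": (7.95, 8.05, 8.25, None),
    "Bun": (8.77, 11.05, 9.53, None),
}


def best_build(slowdowns):
    """Return (slowdown, build) for the fastest build that produced a result."""
    complete = [(s, b) for s, b in zip(slowdowns, BUILDS) if s is not None]
    return min(complete)


# Sort runtimes by their best slowdown, lowest (fastest) first.
ranking = sorted(best_build(v) + (name,) for name, v in results_2026.items())
for rank, (slowdown, build, name) in enumerate(ranking, start=1):
    print(rank, name, build, slowdown)
```

Picking the minimum per runtime answers “what is the fastest thing this runtime can run”, which is a different question from the baseline table's “how fast is the same module everywhere”.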

CPU feature variants

The WebAssembly feature story is more interesting than the year-to-year runtime story.

For the 2026 releases, these were the aggregate slowdowns:

Runtime    baseline  lime1  lime1+simd128  lime1+simd128+wide_arithmetic
WAVM       1.41      1.59   1.43           unsupported
WAMR AOT   1.57      1.44   1.42           unsupported
Wasmer     2.08      2.02   2.03           1.33
Wasmtime   2.41      2.30   2.37           1.46
Wazero     4.72      4.77   4.71           unsupported
Node       7.95      8.05   8.25           unsupported
Bun        8.77      11.05  9.53           unsupported

WasmEdge 0.14.1, the newest complete WasmEdge run I kept, looked like this:

Runtime           baseline  lime1  lime1+simd128  lime1+simd128+wide_arithmetic
WasmEdge 0.14.1   1.98      1.64   1.72           unsupported

lime1 and simd128 alone are not magic here. Sometimes they help, sometimes they hurt, and sometimes the difference is lost in benchmark noise.

wide_arithmetic is different.

Among the stable releases that produced complete runs, only Wasmtime and Wasmer could run the full wide_arithmetic build. WAMR rejected it with “unsupported opcode 0xfc13”. But when wide_arithmetic worked, it was the biggest speedup in the whole experiment:

  • Wasmtime 46.0.0: 2.41x native without it, 1.46x native with it.
  • Wasmer 7.1.0: 2.08x native without it, 1.33x native with it.

That’s the kind of change cryptographic code wants. A lot of libsodium’s expensive operations are arithmetic-heavy. If the WebAssembly ISA can express that arithmetic directly, the runtime has much less work to rediscover what the C compiler already knew.
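To make “express that arithmetic directly” concrete, here is a small model of one such operation: the full 128-bit product of two 64-bit words, a common pattern in big-integer and field arithmetic. The first function treats the product as one operation. The second rebuilds it from 32-bit halves, which is roughly the kind of work needed when only 64-bit results are available. This is a sketch of the arithmetic in Python, not a description of how any compiler or runtime actually lowers the code.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul_wide_direct(a, b):
    """64x64 -> 128-bit product in one step, returned as (low, high) words."""
    p = a * b
    return p & MASK64, p >> 64


def mul_wide_from_halves(a, b):
    """Same product built only from 32x32 -> 64-bit multiplies and adds."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    ll = a_lo * b_lo
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi
    # Middle column: can overflow 32 bits, so its carry feeds the high word.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)
    low = (ll & MASK32) | ((mid & MASK32) << 32)
    high = (hh + (lh >> 32) + (hl >> 32) + (mid >> 32)) & MASK64
    return low, high


# Both routes agree; the second needs four multiplies plus carry handling.
x, y = 0xFEDCBA9876543210, 0x0123456789ABCDEF
assert mul_wide_from_halves(x, y) == mul_wide_direct(x, y)
print([hex(w) for w in mul_wide_direct(x, y)])
```

The decomposed version is correct but longer, and a runtime has to recognize that whole pattern to turn it back into a single host multiply.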

Failures

Most runs completed cleanly, but not all of them.

Bun 1.2.17 failed box_easy in the baseline build. Bun 1.1.16 failed pwhash_argon2i in the lime1 and lime1+simd128 builds. Node 22.3.0 failed pwhash_argon2i, pwhash_argon2id, and pwhash_scrypt in the baseline, lime1, and lime1+simd128 builds.

The Node 22.3.0 password-hashing failures were not fixed by increasing Node’s JavaScript heap or stack settings. They were fixed by giving the Wasm modules an explicit maximum linear memory. With the baseline build, a 1024-page maximum, or 64 MiB, made pwhash_argon2i, pwhash_argon2id, and pwhash_scrypt complete. pwhash_scrypt failed with 512 pages and segfaulted again at 1536 pages and above, so this appears to be a V8 memory-mode threshold rather than a simple “more memory is better” setting.
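The page counts above map to byte limits through the fixed 64 KiB WebAssembly page size. A minimal sketch of that conversion follows; the helper names are hypothetical, and the three page counts are the ones mentioned in the paragraph above.

```python
WASM_PAGE_SIZE = 64 * 1024  # bytes per WebAssembly linear-memory page


def pages_to_bytes(pages):
    """Size in bytes of a linear memory limit given in pages."""
    return pages * WASM_PAGE_SIZE


def bytes_to_pages(size):
    """Convert a byte limit to pages, rejecting values that are not page-aligned."""
    pages, remainder = divmod(size, WASM_PAGE_SIZE)
    if remainder:
        raise ValueError(f"{size} is not a multiple of {WASM_PAGE_SIZE}")
    return pages


for pages in (512, 1024, 1536):
    mib = pages_to_bytes(pages) // (1024 * 1024)
    print(f"{pages} pages = {pages_to_bytes(pages)} bytes = {mib} MiB")

# The script default mentioned in the draft note corresponds to 1024 pages.
assert bytes_to_pages(67108864) == 1024
```

So the working setting sits between a 32 MiB maximum that was too small for pwhash_scrypt and a 96 MiB maximum that crashed.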

WAMR 2.1.0, the 2024 slot, could not compile even the baseline modules in AOT mode. WAMR 2.3.1 and 2.4.4 compiled and ran the baseline, lime1, and lime1+simd128 builds, but not wide_arithmetic.

Those failures were excluded from the aggregate. So were benchmark rows with a zero reported median.

For WasmEdge 0.17.0, I did not include a 2026 aggregate. The partial run was too strange to reduce to a single number in good faith.

So, are runtimes getting faster?

Some of them are.

Wasmtime is the cleanest yes: it got faster every year in this benchmark. Not massively faster, but consistently faster.

Node is also a yes, but the slope is gentle.

Bun is a loud yes between 2025 and 2026. It still has a lot of ground to cover for this workload, but the improvement is too large to ignore.

Wazero is mostly flat.

WAMR is also mostly flat between the versions that worked here, but “flat” at about 1.4x to 1.6x native is a very good place to be.

Wasmer is mixed if you only look at the baseline, but the 2026 release supporting wide_arithmetic changes the practical answer for crypto code. With that feature enabled, it produced the fastest complete 2026 result in the whole matrix.

wasm2c remains good. If you can translate WebAssembly to C ahead of time and compile it for the host, it is hard to beat.

WAVM produced the fastest 2026 baseline number, but I don’t have a fair 2024 or 2025 comparison.

WasmEdge used to be excellent in this benchmark. I would not draw a 2026 conclusion until the 0.17.0 slowdown is understood.

Takeaways

If you run CPU-heavy cryptography in WebAssembly, runtime choice still matters a lot.

The spread between the fastest and slowest complete current results is large, about 6.6x: Wasmer with wide_arithmetic was 1.33x native, while current Bun baseline was 8.77x native.

Feature support matters too. The same runtime can move from “pretty good” to “surprisingly close to native” when the WebAssembly module can use better arithmetic instructions.

The comforting part is that the mainstream runtimes are not standing still. Wasmtime improved steadily. Bun made a huge jump. Wasmer gained a feature that matters for real crypto workloads.

The less comforting part is that WebAssembly performance is still not one thing. It depends on the runtime, the release, the enabled WebAssembly features, whether the code goes through WASI from JavaScript, and whether ahead-of-time native compilation is allowed.

So benchmark your actual workload.

But if your workload looks like libsodium, the answer in 2026 is: WebAssembly can be close to native, wide_arithmetic is worth caring about, and yes, some runtimes really are getting faster.