Benchmarks

All numbers are validation throughput in GB/s (higher is better) measured with BenchmarkDotNet. Twitter.json is a realistic, mostly-ASCII document; the *-Lipsum inputs are dense single-script text that stresses the multi-byte paths.
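For orientation, a GB/s figure is bytes validated divided by elapsed time, and each speed-up is the ratio of two throughputs. The sketch below illustrates that arithmetic only; it is not the BenchmarkDotNet harness behind the tables. It times the standard library's `System.Text.Unicode.Utf8.IsValid` (available from .NET 8) with a `Stopwatch` over a synthetic ASCII buffer. The buffer size, the repetition count, the helper name `ToGigabytesPerSecond`, and the convention 1 GB = 10^9 bytes are all assumptions made for illustration, and a single timing loop like this is much noisier than a BenchmarkDotNet run.

```csharp
using System;
using System.Diagnostics;
using System.Text.Unicode;

// Hypothetical helper: convert bytes processed and elapsed time to GB/s
// (assumes 1 GB = 1e9 bytes).
static double ToGigabytesPerSecond(long bytes, TimeSpan elapsed) =>
    bytes / elapsed.TotalSeconds / 1e9;

// Synthetic input: 16 MiB of ASCII 'a' (illustrative, not one of the benchmark files).
byte[] input = new byte[16 * 1024 * 1024];
Array.Fill(input, (byte)'a');

const int repetitions = 20;
bool allValid = true;

// Warm-up call so JIT compilation is not part of the timed region.
allValid &= Utf8.IsValid(input);

var stopwatch = Stopwatch.StartNew();
for (int i = 0; i < repetitions; i++)
{
    allValid &= Utf8.IsValid(input);
}
stopwatch.Stop();

double gbps = ToGigabytesPerSecond((long)input.Length * repetitions, stopwatch.Elapsed);
Console.WriteLine($"valid: {allValid}, throughput: {gbps:F1} GB/s");

// A speed-up is a ratio of throughputs, e.g. the Twitter.json row on Ice Lake.
double speedUp = 29.0 / 12.0;
Console.WriteLine($"speed-up: {speedUp:F1}x");
```

Numbers from a loop like this are only a sanity check; use the BenchmarkDotNet commands in this section for figures comparable to the tables.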

To reproduce on your own machine:

cd benchmark
dotnet run -c Release
# or a single input:
dotnet run -c Release --filter "*Twitter*"

x64 — Intel Ice Lake (AVX-512)

Up to 13× faster than the standard library; 2.4× on realistic Twitter data.

data set          SimdUnicode AVX-512 (GB/s)   .NET (GB/s)   speed-up
Twitter.json      29                           12            2.4×
Arabic-Lipsum     12                           2.3           5.2×
Chinese-Lipsum    12                           3.9           3.0×
Emoji-Lipsum      12                           0.9           13×
Hebrew-Lipsum     12                           2.3           5.2×
Hindi-Lipsum      12                           2.1           5.7×
Japanese-Lipsum   10                           3.5           2.9×
Korean-Lipsum     10                           1.3           7.7×
Latin-Lipsum      76                           76
Russian-Lipsum    12                           1.2           10×

On x64 SimdUnicode ships four kernels — a scalar fallback for legacy systems, SSE4.2 for older CPUs, AVX2 for current x64, and AVX-512 for the most recent processors (AMD Zen 4 or better, Intel Ice Lake, etc.).
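The selection among those four kernels can be pictured as a cascade of runtime feature checks, widest instruction set first. The following is a minimal sketch under stated assumptions, not the library's actual dispatch code: the `IsSupported` properties are real .NET APIs (`Avx512Vbmi` requires .NET 8 or later), but the helper name `SelectX64Kernel`, the exact features tested, and the returned labels are hypothetical.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper: name the widest x64 kernel the current CPU could run.
// Which features SimdUnicode really tests is an assumption here.
static string SelectX64Kernel()
{
    if (Avx512Vbmi.IsSupported) return "AVX-512"; // e.g. Ice Lake, Zen 4
    if (Avx2.IsSupported) return "AVX2";
    if (Sse42.IsSupported) return "SSE4.2";
    return "scalar"; // legacy x64; non-x64 hardware also lands here in this sketch
}

Console.WriteLine($"Selected kernel: {SelectX64Kernel()}");
```

Because the .NET JIT treats each `IsSupported` property as a constant on the running machine, a cascade like this typically costs nothing at run time; the untaken branches are eliminated.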

ARM — Apple M2 (NEON)

1.5× to 4.1× faster than the standard library.

data set          SimdUnicode (GB/s)   .NET (GB/s)   speed-up
Twitter.json      25                   14            1.8×
Arabic-Lipsum     7.4                  3.5           2.1×
Chinese-Lipsum    7.4                  4.8           1.5×
Emoji-Lipsum      7.4                  2.5           3.0×
Hebrew-Lipsum     7.4                  3.5           2.1×
Hindi-Lipsum      7.3                  3.0           2.4×
Japanese-Lipsum   7.3                  4.6           1.6×
Korean-Lipsum     7.4                  1.8           4.1×
Latin-Lipsum      87                   38            2.3×
Russian-Lipsum    7.4                  2.7           2.7×

ARM — AWS Graviton 3 (Neoverse V1)

1.2× to over 5× faster than the standard library.

data set          SimdUnicode (GB/s)   .NET (GB/s)   speed-up
Twitter.json      19                   11            1.7×
Arabic-Lipsum     5.2                  2.7           1.9×
Chinese-Lipsum    5.2                  4.5           1.2×
Emoji-Lipsum      5.2                  0.9           5.8×
Hebrew-Lipsum     5.2                  2.7           1.9×
Hindi-Lipsum      5.2                  2.4           2.2×
Japanese-Lipsum   5.2                  3.9           1.3×
Korean-Lipsum     5.2                  1.5           3.5×
Latin-Lipsum      57                   26            2.2×
Russian-Lipsum    5.2                  2.8           1.9×

ARM — Qualcomm 8cx Gen 3 (Windows Dev Kit 2023)

data set          SimdUnicode (GB/s)   .NET (GB/s)   speed-up
Twitter.json      17                   10            1.7×
Arabic-Lipsum     5.0                  2.3           2.2×
Chinese-Lipsum    5.0                  2.9           1.7×
Emoji-Lipsum      5.0                  0.9           5.5×
Hebrew-Lipsum     5.0                  2.3           2.2×
Hindi-Lipsum      5.0                  1.9           2.6×
Japanese-Lipsum   5.0                  2.7           1.9×
Korean-Lipsum     5.0                  1.5           3.3×
Latin-Lipsum      50                   20            2.5×
Russian-Lipsum    5.0                  1.2           5.2×

ARM — AWS Graviton 2 (Neoverse N1)

data set          SimdUnicode (GB/s)   .NET (GB/s)   speed-up
Twitter.json      12                   8.7           1.4×
Arabic-Lipsum     3.4                  2.0           1.7×
Chinese-Lipsum    3.4                  2.6           1.3×
Emoji-Lipsum      3.4                  0.8           4.3×
Hebrew-Lipsum     3.4                  2.0           1.7×
Hindi-Lipsum      3.4                  1.6           2.1×
Japanese-Lipsum   3.4                  2.4           1.4×
Korean-Lipsum     3.4                  1.3           2.6×
Latin-Lipsum      42                   17            2.5×
Russian-Lipsum    3.3                  0.95          3.5×

Hardware, compiler and input all affect these numbers. Treat the tables as representative, and measure on your own target hardware for decisions that matter.