SimdUnicode

A blazing-fast C# library that validates UTF-8 strings up to 13× faster than the .NET standard library, using AVX-512, AVX2, SSE4.2 and ARM NEON.

13× faster on emoji-heavy text (Ice Lake, AVX-512)
< 1 instruction per byte to validate UTF-8
4 SIMD back-ends: AVX-512, AVX2, SSE4.2, NEON
0 allocations: works directly on your buffer

Drop-in replacement

SimdUnicode provides SimdUnicode.UTF8.GetPointerToFirstInvalidByte, a faster drop-in replacement for the runtime's internal Utf8Utility.GetPointerToFirstInvalidByte. It returns a pointer to the first invalid byte, or to the end of the buffer (one past the last byte) when the input is well-formed.

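using System;
using System.IO;
using SimdUnicode;

// Requires <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the project file.
byte[] data = File.ReadAllBytes("twitter.json");

unsafe
{
    // Pin the array so the GC cannot move it while we hold a raw pointer.
    fixed (byte* p = data)
    {
        // The two out parameters mirror the runtime's signature: adjustments for
        // deriving the UTF-16 code unit count and the scalar count of the input.
        byte* invalid = UTF8.GetPointerToFirstInvalidByte(
            p, data.Length,
            out int utf16Adjustment,
            out int scalarAdjustment);

        // Well-formed input yields a pointer one past the last byte.
        bool isValid = invalid == p + data.Length;
        Console.WriteLine(isValid ? "Valid UTF-8 ✅" : $"Invalid at offset {invalid - p}");
    }
}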
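
To make the return contract concrete, here is a minimal scalar reference that follows the same "offset of the first invalid byte, or the buffer length" convention. It is a sketch for cross-checking results, not SimdUnicode's implementation. The class and method names are hypothetical, and reporting the lead byte of an ill-formed sequence (rather than the offending continuation byte) is an assumption.

```csharp
using System;

// Hypothetical reference helper; not part of SimdUnicode.
public static class ScalarUtf8Reference
{
    // Returns the index of the lead byte of the first ill-formed sequence,
    // or data.Length when the whole buffer is well-formed UTF-8.
    public static int IndexOfFirstInvalidByte(ReadOnlySpan<byte> data)
    {
        int i = 0;
        while (i < data.Length)
        {
            byte b = data[i];
            if (b < 0x80) { i++; continue; }                  // ASCII

            int length;                                       // bytes in this sequence
            byte lo = 0x80, hi = 0xBF;                        // allowed range of byte 2
            if (b >= 0xC2 && b <= 0xDF) { length = 2; }
            else if (b == 0xE0) { length = 3; lo = 0xA0; }    // rejects overlong forms
            else if (b >= 0xE1 && b <= 0xEC) { length = 3; }
            else if (b == 0xED) { length = 3; hi = 0x9F; }    // rejects surrogates
            else if (b >= 0xEE && b <= 0xEF) { length = 3; }
            else if (b == 0xF0) { length = 4; lo = 0x90; }    // rejects overlong forms
            else if (b >= 0xF1 && b <= 0xF3) { length = 4; }
            else if (b == 0xF4) { length = 4; hi = 0x8F; }    // caps at U+10FFFF
            else return i;                                    // 0x80..0xC1, 0xF5..0xFF

            if (i + length > data.Length) return i;           // truncated sequence
            if (data[i + 1] < lo || data[i + 1] > hi) return i;
            for (int k = 2; k < length; k++)
            {
                if ((data[i + k] & 0xC0) != 0x80) return i;   // not a continuation byte
            }
            i += length;
        }
        return data.Length;
    }
}
```

For example, `ScalarUtf8Reference.IndexOfFirstInvalidByte(new byte[] { 0x61, 0xC0, 0x80 })` returns 1, because 0xC0 can only begin an overlong encoding and is never valid in UTF-8.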

The right SIMD kernel is selected automatically at runtime: ARM64 NEON, AVX-512 (Zen 4 / Ice Lake), AVX2, SSE4.2, or a portable scalar fallback.
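
As a rough illustration of how this kind of runtime selection can be written in .NET, the sketch below checks the `IsSupported` properties of the hardware intrinsics classes in priority order. The `KernelSelector` class is hypothetical, and the exact instruction-set checks the library performs (for instance which AVX-512 extensions it requires) are assumptions. The AVX-512 classes require .NET 8 or later.

```csharp
using System;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

// Hypothetical illustration; SimdUnicode's real dispatch logic may differ.
public static class KernelSelector
{
    // Names the widest kernel the current CPU could run, in priority order.
    public static string DescribeBestKernel()
    {
        if (AdvSimd.Arm64.IsSupported) return "ARM64 NEON";
        // Assumption: byte-wise AVX-512 code needs at least the BW extension.
        if (Avx512BW.IsSupported) return "AVX-512";
        if (Avx2.IsSupported) return "AVX2";
        if (Sse42.IsSupported) return "SSE4.2";
        return "scalar";
    }
}
```

The JIT treats each `IsSupported` check as a constant for the machine it runs on, so a chain like this typically costs nothing once the method has been compiled.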

Less than one instruction per byte

Implements the Keiser–Lemire algorithm used by Node.js, Bun, Oracle GraalVM and the PHP interpreter.

🧭

Runtime dispatch

One call, the best available kernel. AVX-512, AVX2, SSE4.2, ARM NEON or scalar — chosen for your CPU.

🧪

Extensively tested

A large suite of correctness tests across architectures, plus reproducible BenchmarkDotNet benchmarks.

🍏

x64 & ARM

First-class support for modern Intel/AMD and Apple Silicon / Graviton processors.

How fast is it?

Throughput in GB/s on an Intel Ice Lake system (AVX-512), validating UTF-8. Higher is faster.

Dataset          SimdUnicode (GB/s)   .NET standard library (GB/s)   Speedup
Emoji-Lipsum     12                   0.9                            13×
Russian-Lipsum   12                   1.2                            10×
Korean-Lipsum    10                   1.3                            7.7×
Hindi-Lipsum     12                   2.1                            5.7×
Arabic-Lipsum    12                   2.3                            5.2×
Twitter.json     29                   12                             2.4×

On an Apple M2 (NEON), SimdUnicode is 1.5×–4× faster than the standard library. See the full set of measurements across x64 and ARM in the benchmarks.
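
For a quick sanity check of figures like these on your own machine, a throughput number can be approximated with a stopwatch loop. The sketch below times the standard library's `Utf8.IsValid` (available from .NET 8) as the baseline. The `ThroughputSketch` class is hypothetical, GB/s is assumed to mean 10^9 bytes per second, and a loop like this is far less rigorous than the project's BenchmarkDotNet benchmarks.

```csharp
using System;
using System.Diagnostics;
using System.Text.Unicode;

// Hypothetical helper for rough measurements; not part of SimdUnicode.
public static class ThroughputSketch
{
    // Returns validation throughput in GB/s (assumed: 10^9 bytes per second).
    public static double MeasureGigabytesPerSecond(byte[] data, int iterations)
    {
        Utf8.IsValid(data);                       // warm-up so JIT time is not measured

        bool allValid = true;
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            allValid &= Utf8.IsValid(data);       // baseline: .NET standard library
        }
        stopwatch.Stop();

        if (!allValid) throw new ArgumentException("Input is not valid UTF-8.");

        double totalBytes = (double)data.Length * iterations;
        return totalBytes / stopwatch.Elapsed.TotalSeconds / 1e9;
    }
}
```

Replacing the `Utf8.IsValid` call with a call into SimdUnicode on the same buffer gives a comparable number for the library. Use a buffer of at least a few hundred kilobytes and many iterations so that timer resolution does not dominate the result.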

Build it

git clone https://github.com/simdutf/SimdUnicode.git
cd SimdUnicode/src
dotnet build -c Release

Then add a project reference to src/SimdUnicode.csproj. Head to the getting started guide or dive into the API reference.

Citing

The algorithm is described in:

John Keiser, Daniel Lemire, Validating UTF-8 In Less Than One Instruction Per Byte, Software: Practice and Experience 51 (5), 2021.