aboutsummaryrefslogtreecommitdiff
path: root/converters/simdutf/pkg-descr
blob: 13499c3b1e10c8b3a704dc257c46af5e3db7ff2b (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
This library provide fast Unicode functions such as

 - ASCII, UTF-8, UTF-16LE/BE and UTF-32 validation, with and without
   error identification,
 - transcoding between each of Latin1, UTF-8, UTF-16LE/BE, and UTF-32,
   with and without validation, with and without error identification
 - From an UTF-8 string, compute the size of the Latin1/UTF-16/UTF-32
   equivalent string,
 - From an UTF-16LE/BE string, compute the size of the
   Latin1/UTF-8/UTF-32 equivalent string,
 - From an UTF-32 string, compute the size of the UTF-8 or UTF-16LE
   equivalent string,
 - UTF-8 and UTF-16LE/BE character counting.
 - UTF-16 endianness change (UTF16-LE/BE to UTF-16-BE/LE)
 - Base64 encoding and decoding

The functions are accelerated using SIMD instructions (e.g., ARM NEON,
SSE, AVX, AVX-512, etc.). When your strings contain hundreds of
characters, we can often transcode them at speeds exceeding a billion
characters per second. You should expect high speeds not only with
English strings (ASCII) but also Chinese, Japanese, Arabic, and so
forth. We handle the full character range (including, for example,
emojis).

The library compiles down to a small library of a few hundred kilobytes.
Our functions are exception-free and non allocating. We have extensive
tests and extensive benchmarks.