Unicode Transformation Format.
#include "a.h"
Functions
Return type | Function | Description |
---|---|---|
unsigned int | a_utf_encode (a_u32 val, void *buf) | encode one Unicode code point into UTF-8 |
unsigned int | a_utf_decode (void const *ptr, a_size num, a_u32 *val) | decode one Unicode code point from UTF-8 |
a_size | a_utf_length (void const *ptr, a_size num, a_size *stop) | number of Unicode code points in a UTF-8 encoded byte sequence |
a_size | a_utf_length_ (void const *ptr, a_size num) | |
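As a rough illustration of what counting code points in a UTF-8 byte sequence involves, the sketch below counts every byte that is not a continuation byte (`10XXXXXX`). The name `sketch_utf_count` is hypothetical and this is not the implementation behind `a_utf_length`; it assumes the input is already well-formed, performs no validation, and has no equivalent of the `stop` parameter.

```c
#include <stddef.h>

/* Hypothetical helper, not the library's a_utf_length.
   Counts code points by counting bytes that are NOT continuation bytes.
   Assumption: the input is well-formed UTF-8; nothing is validated. */
size_t sketch_utf_count(void const *ptr, size_t num)
{
    unsigned char const *p = (unsigned char const *)ptr;
    size_t count = 0;
    size_t i;
    for (i = 0; i < num; ++i)
    {
        /* Continuation bytes have the top two bits set to 10. */
        if ((p[i] & 0xC0u) != 0x80u) { ++count; }
    }
    return count;
}
```

Because each code point contributes exactly one non-continuation byte, a single pass over the buffer is enough when the input can be trusted.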
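Unicode Transformation Format. The table below maps each range of code points to its UTF-8 byte pattern; every X is one payload bit of the code point.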
Unicode code point range | UTF-8 byte sequence |
---|---|
U+00000000 ~ U+0000007F | 0XXXXXXX |
U+00000080 ~ U+000007FF | 110XXXXX 10XXXXXX |
U+00000800 ~ U+0000FFFF | 1110XXXX 10XXXXXX 10XXXXXX |
U+00010000 ~ U+001FFFFF | 11110XXX 10XXXXXX 10XXXXXX 10XXXXXX |
U+00200000 ~ U+03FFFFFF | 111110XX 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX |
U+04000000 ~ U+7FFFFFFF | 1111110X 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX |
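The table can be turned into code fairly directly. The following is a minimal sketch, not the library's `a_utf_encode`: the name `sketch_utf_encode`, the use of `uint32_t` in place of `a_u32`, the 6-byte buffer requirement, and the convention of returning the byte count (0 on an out-of-range value) are all assumptions made for illustration. It follows the table as given, including the 5- and 6-byte forms, which RFC 3629 no longer permits (standard UTF-8 stops at U+10FFFF and four bytes).

```c
#include <stdint.h>

/* Hypothetical helper, not the library's a_utf_encode.
   Writes the encoding of val into buf (assumed to hold at least 6 bytes)
   and returns the number of bytes written, or 0 if val > 0x7FFFFFFF. */
unsigned int sketch_utf_encode(uint32_t val, unsigned char *buf)
{
    /* Lead-byte prefixes for 1..6 byte sequences, one per table row. */
    static const unsigned char lead[7] = {0, 0x00, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC};
    unsigned int n;
    unsigned int i;

    /* Pick the sequence length from the range the value falls in. */
    if (val < 0x80u) { n = 1; }
    else if (val < 0x800u) { n = 2; }
    else if (val < 0x10000u) { n = 3; }
    else if (val < 0x200000u) { n = 4; }
    else if (val < 0x4000000u) { n = 5; }
    else if (val <= 0x7FFFFFFFu) { n = 6; }
    else { return 0; }

    /* Fill continuation bytes from the end, six payload bits each. */
    for (i = n - 1; i > 0; --i)
    {
        buf[i] = (unsigned char)(0x80u | (val & 0x3Fu));
        val >>= 6;
    }
    /* The remaining high bits go into the lead byte. */
    buf[0] = (unsigned char)(lead[n] | val);
    return n;
}
```

With this sketch, U+20AC falls in the third row and encodes to the three bytes E2 82 AC.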