unicode for Debian
==================

This package is the Debian version of unicode, a C++ library for converting
between and validating Unicode encodings.


CLI interface (package unicode-tools)
-------------------------------------

* unicode-recode

  Usage: recode <from-format> <from-file> <to-format> <to-file>
  Format:
      UTF-8       UTF-8
      UTF-16      UTF-16, native endian
      UTF-16LE    UTF-16, little endian
      UTF-16BE    UTF-16, big endian
      UTF-32      UTF-32, native endian
      UTF-32LE    UTF-32, little endian
      UTF-32BE    UTF-32, big endian
      ISO-8859-1  ISO-8859-1 (Latin-1)
      ISO-8859-15 ISO-8859-15 (Latin-9)
  Exit code: 0 if valid, 1 otherwise.

* unicode-validate

  Usage: validate <format> <file>
  Format:
      UTF-8     UTF-8
      UTF-16    UTF-16, big or little endian
      UTF-16LE  UTF-16, little endian
      UTF-16BE  UTF-16, big endian
      UTF-32    UTF-32, big or little endian
      UTF-32LE  UTF-32, little endian
      UTF-32BE  UTF-32, big endian
  Exit code: 0 if valid, 1 otherwise.


C++ interface (package libunicode-dev)
--------------------------------------

Example (up to C++17, where u8"" literals are char-based):

#include <unicode.h>
...

  std::string utf8_value {u8"äöü"};
  std::u16string utf16_value{unicode::convert<char, char16_t>(utf8_value)};

And for C++20:

  std::u8string utf8_value {u8"äöü"};
  std::u16string utf16_value{unicode::convert<char8_t, char16_t>(utf8_value)};

The encoding is deduced implicitly from the character type:
  * char, or char8_t in C++20: UTF-8
  * char16_t: UTF-16
  * char32_t: UTF-32
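
The mapping from character type to encoding can be illustrated with plain
standard C++ literals, independent of this library. The sketch below counts
the code units that "äöü" occupies in each encoding. The names code_units,
utf8_units, utf16_units and utf32_units are made up for this illustration and
are not part of the unicode API.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// "äöü" written with universal character names, so the result does not
// depend on the encoding of the source file.
constexpr std::size_t utf8_units  = code_units(u8"\u00e4\u00f6\u00fc"); // char, or char8_t in C++20
constexpr std::size_t utf16_units = code_units(u"\u00e4\u00f6\u00fc");  // char16_t
constexpr std::size_t utf32_units = code_units(U"\u00e4\u00f6\u00fc");  // char32_t

static_assert(utf8_units == 6, "two UTF-8 code units per character here");
static_assert(utf16_units == 3, "one UTF-16 code unit per character here");
static_assert(utf32_units == 3, "one UTF-32 code unit per character");
```

The same three characters therefore need a different number of elements
depending on the character type of the string that holds them.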

Explicit encoding specification is also possible:

  std::string value {"\xe4\xf6\xfc"}; // "äöü" as ISO-8859-1 bytes, independent of the source file encoding
  std::u32string utf32_value{unicode::convert<unicode::ISO_8859_1, unicode::UTF_32>(value)};
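
To show what an ISO-8859-1 to UTF-32 conversion amounts to, here is a minimal
standalone sketch. It is not this library's implementation; the function name
latin1_to_utf32 is an assumption made for this illustration. It relies on the
fact that every ISO-8859-1 byte value equals its Unicode code point.

```cpp
#include <string>

// Illustrative helper (not part of the unicode library): widen
// ISO-8859-1 bytes to UTF-32 code points one to one.
std::u32string latin1_to_utf32(const std::string& latin1)
{
    std::u32string result;
    result.reserve(latin1.size());
    for (char c : latin1) {
        // Go through unsigned char so that bytes >= 0x80 are not
        // sign-extended on platforms where char is signed.
        result.push_back(static_cast<char32_t>(static_cast<unsigned char>(c)));
    }
    return result;
}
```

This one-to-one mapping does not hold for ISO-8859-15, which replaces a few
positions (for example 0xA4 is the euro sign), so that encoding needs a lookup
for the differing bytes.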

Supported encodings are:

  * unicode::UTF_8
  * unicode::UTF_16
  * unicode::UTF_32
  * unicode::ISO_8859_1
  * unicode::ISO_8859_15

Validation can be done like this:

  bool valid{unicode::is_valid_utf<char16_t>(utf16_value)};

Or via explicit encoding specification:

  bool valid{unicode::is_valid_utf<unicode::UTF_8>(utf8_value)};
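
As a rough idea of what validity means for UTF-16, the following standalone
sketch checks surrogate pairing. It is an assumption about the kind of check
involved, not the code behind unicode::is_valid_utf, and the name
is_well_formed_utf16 is made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// Illustrative check (not the unicode library's implementation):
// a UTF-16 sequence is well-formed if every high surrogate
// (0xD800-0xDBFF) is immediately followed by a low surrogate
// (0xDC00-0xDFFF) and no low surrogate appears on its own.
bool is_well_formed_utf16(const std::u16string& s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t unit = s[i];
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            // High surrogate: the next unit must be a low surrogate.
            if (i + 1 >= s.size()) return false;
            const char16_t next = s[i + 1];
            if (next < 0xDC00 || next > 0xDFFF) return false;
            ++i; // skip the low surrogate that was just checked
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            return false; // low surrogate without a preceding high surrogate
        }
    }
    return true;
}
```

A lone 0xD83D, for instance, is rejected, while the pair 0xD83D 0xDE00 is
accepted.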


Contact
-------

Reichwein IT <mail@reichwein.it>