Module unicode
An implementation of the Erlang/OTP unicode interface.
Description
This module implements a strict subset of the Erlang/OTP unicode interface.
Data Types
chardata()
chardata() = charlist() | unicode_binary()
charlist()
charlist() = maybe_improper_list(char() | unicode_binary() | charlist(), unicode_binary() | [])
encoding()
encoding() = utf8 | latin1
latin1_chardata()
latin1_chardata() = iodata()
unicode_binary()
unicode_binary() = binary()
Function Index
characters_to_binary/1 | Convert character data to a UTF-8 binary. |
characters_to_binary/2 | Convert character data in a given encoding to a UTF-8 binary. |
characters_to_binary/3 | Convert character data in a given encoding to a binary in a given encoding. |
characters_to_list/1 | Convert UTF-8 data to a list of Unicode characters. |
characters_to_list/2 | Convert UTF-8 or Latin1 data to a list of Unicode characters. |
Function Details
characters_to_binary/1
characters_to_binary(Data::chardata() | latin1_chardata()) -> unicode_binary() | {error, list(), chardata() | latin1_chardata() | list()} | {incomplete, unicode_binary(), chardata() | latin1_chardata()}
Data: data to convert to UTF-8
returns: a UTF-8 binary, or a tuple if conversion failed.
Equivalent to characters_to_binary(Data, utf8, utf8).
Convert character data to a UTF-8 binary.
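As a minimal sketch of the one-argument form, the following hypothetical module (the names `unicode_binary_example` and `greeting/0` are illustrative, not part of this module) converts mixed character data, a list holding code points and a nested binary, into a single UTF-8 binary. The result noted in the comment is what Erlang/OTP produces for this input; it is assumed here that this implementation agrees for valid input.

```erlang
-module(unicode_binary_example).
-export([greeting/0]).

%% Hypothetical helper: builds a UTF-8 binary from mixed chardata.
%% 16#E9 is the code point of "e with acute accent"; in UTF-8 it is
%% encoded as the two bytes 16#C3, 16#A9.
greeting() ->
    %% On Erlang/OTP this yields <<"caf", 16#C3, 16#A9, " au lait">>.
    unicode:characters_to_binary([$c, $a, $f, 16#E9, <<" au lait">>]).
```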
characters_to_binary/2
characters_to_binary(Data::chardata() | latin1_chardata(), InEncoding::encoding()) -> unicode_binary() | {error, list(), chardata() | latin1_chardata() | list()} | {incomplete, unicode_binary(), chardata() | latin1_chardata()}
Data: data to convert to UTF-8
InEncoding: encoding of data
returns: a UTF-8 binary, or a tuple if conversion failed.
Equivalent to characters_to_binary(Data, InEncoding, utf8).
Convert character data in a given encoding to a UTF-8 binary.
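A short sketch of the two-argument form, assuming Latin-1 input: each Latin-1 byte is treated as one code point and re-encoded as UTF-8. The module and function names below are hypothetical.

```erlang
-module(unicode_latin1_to_utf8_example).
-export([latin1_to_utf8/1]).

%% Hypothetical helper: re-encodes Latin-1 data as UTF-8.
%% For example, the single Latin-1 byte 16#E9 becomes the two
%% UTF-8 bytes 16#C3, 16#A9 on Erlang/OTP.
latin1_to_utf8(Data) ->
    unicode:characters_to_binary(Data, latin1).
```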
characters_to_binary/3
characters_to_binary(Data::chardata() | latin1_chardata(), InEncoding::encoding(), OutEncoding::encoding()) -> unicode_binary() | {error, list(), chardata() | latin1_chardata() | list()} | {incomplete, unicode_binary(), chardata() | latin1_chardata()}
Data: data to convert
InEncoding: encoding of data
OutEncoding: output encoding
returns: an encoded binary, or a tuple if conversion failed.
Convert character data in a given encoding to a binary in a given encoding.
If conversion fails, the function returns a tuple with three elements:
The first element is error or incomplete. incomplete means the conversion failed because of an incomplete Unicode transform at the very end of the data.
The second element is what has been converted so far.
The third element is the remaining data to be converted, for debugging purposes. This remaining data can differ from what Erlang/OTP returns.
Also, Erlang/OTP’s implementation may raise badarg for parameters for which this function merely returns an error tuple.
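The three possible return shapes can be told apart with a case expression. The sketch below uses hypothetical names (`unicode_result_example`, `classify/1`) and deliberately ignores the second and third tuple elements, since the remaining data may differ from what Erlang/OTP returns.

```erlang
-module(unicode_result_example).
-export([classify/1]).

%% Hypothetical helper: maps the result of characters_to_binary/3
%% to a simpler term. The converted prefix and the remaining data
%% are ignored because their exact content is implementation-dependent.
classify(Data) ->
    case unicode:characters_to_binary(Data, utf8, utf8) of
        Bin when is_binary(Bin) ->
            {ok, Bin};
        {incomplete, _Converted, _Rest} ->
            %% Input ended in the middle of a multi-byte sequence.
            incomplete;
        {error, _Converted, _Rest} ->
            %% Input contains a byte or character that cannot be converted.
            invalid
    end.
```

On Erlang/OTP, `classify(<<"a", 16#C3>>)` falls into the incomplete branch because 16#C3 starts a two-byte sequence that is never finished, while `classify(<<"a", 16#FF>>)` falls into the error branch because 16#FF is never valid in UTF-8.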
characters_to_list/1
characters_to_list(Data::chardata() | latin1_chardata()) -> list() | {error, list(), chardata() | latin1_chardata() | list()} | {incomplete, list(), chardata() | latin1_chardata()}
Data: data to convert to Unicode
returns: a list of characters or a tuple if conversion failed.
Convert UTF-8 data to a list of Unicode characters.
If conversion fails, the function returns a tuple with three elements:
The first element is error or incomplete. incomplete means the conversion failed because of an incomplete Unicode transform at the very end of the data.
The second element is what has been converted so far.
The third element is the remaining data to be converted, for debugging purposes. This remaining data can differ from what Erlang/OTP returns.
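As an illustration of the successful case, this sketch decodes a UTF-8 binary into a list of code points. The module and function names are hypothetical; the result in the comment is the Erlang/OTP behaviour, assumed to hold here for valid input.

```erlang
-module(unicode_list_example).
-export([code_points/0]).

%% Hypothetical helper: decodes a three-byte UTF-8 binary.
%% The bytes 16#C3, 16#A9 form a single code point, 16#E9.
code_points() ->
    %% On Erlang/OTP this yields [$h, 16#E9], that is [104, 233].
    unicode:characters_to_list(<<"h", 16#C3, 16#A9>>).
```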
characters_to_list/2
characters_to_list(Data::chardata() | latin1_chardata(), Encoding::encoding()) -> list() | {error, list(), chardata() | latin1_chardata() | list()} | {incomplete, list(), chardata() | latin1_chardata()}
Data: data to convert
Encoding: encoding of data to convert
returns: a list of characters, or a tuple if conversion failed.
Convert UTF-8 or Latin1 data to a list of Unicode characters. Following Erlang/OTP, if the input encoding is latin1, this function returns an error tuple when a character greater than 255 is passed (in a list). Otherwise, it accepts any character within the Unicode range (0-0x10FFFF).
See also: characters_to_list/1.
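The effect of the input encoding on out-of-range characters can be sketched as follows. The names `unicode_encoding_example` and `euro_as/1` are hypothetical; 16#20AC (the euro sign) is used because it is above 255 and therefore not representable in Latin-1.

```erlang
-module(unicode_encoding_example).
-export([euro_as/1]).

%% Hypothetical helper: passes the code point 16#20AC in a list,
%% declaring the input encoding given by the caller.
%% With utf8 the code point is accepted and returned unchanged;
%% with latin1 an error tuple is returned because 16#20AC > 255.
euro_as(Encoding) ->
    unicode:characters_to_list([16#20AC], Encoding).
```

A caller would typically match on `{error, _, _}` rather than on the exact tuple contents, since the remaining data may differ from what Erlang/OTP returns.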