C16RTOMB(3C)            Standard C Library Functions            C16RTOMB(3C)
NAME
     c16rtomb, c32rtomb, wcrtomb, wcrtomb_l - convert wide-characters to
     character sequences
SYNOPSIS
     #include <uchar.h>
     size_t
     c16rtomb(char *restrict str, char16_t c16, mbstate_t *restrict ps);
     size_t
     c32rtomb(char *restrict str, char32_t c32, mbstate_t *restrict ps);
     #include <wchar.h>
     size_t
     wcrtomb(char *restrict str, wchar_t wc, mbstate_t *restrict ps);
     #include <wchar.h>
     #include <xlocale.h>
     size_t
     wcrtomb_l(char *restrict str, wchar_t wc, mbstate_t *restrict ps,
         locale_t loc);
DESCRIPTION
     The c16rtomb(), c32rtomb(), wcrtomb(), and wcrtomb_l() functions
     convert wide-character sequences into a series of multi-byte
     characters.  The functions work in the following formats:
     c16rtomb()
                A UTF-16 code sequence, where every code point is
                represented by one or two char16_t values.  The UTF-16
                encoding will encode certain Unicode code points as a pair
                of 16-bit code units, commonly referred to as a surrogate
                pair.
     c32rtomb()
                A UTF-32 code sequence, where every code point is
                represented by a single char32_t.  It is illegal to pass
                reserved Unicode code points.
     wcrtomb(), wcrtomb_l()
                Wide characters, being a 32-bit value where every code point
                is represented by a single wchar_t.  While the wchar_t and
                char32_t are different types, in this implementation, they
                are similar encodings.
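     The relationship between a single code point and its UTF-16 surrogate
     pair can be sketched as follows.  This is a minimal illustration, not
     part of the library: the helper names split_surrogates() and
     utf16_to_mb() are hypothetical.  The sketch assumes that c16rtomb()
     keeps the first half of a pair in ps and stores the bytes only once
     the second half arrives.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: split a Unicode code point into UTF-16 code units.
 * Returns the number of units written to out (1 or 2), or 0 if cp is a
 * surrogate value or lies beyond U+10FFFF.
 */
size_t
split_surrogates(char32_t cp, char16_t out[2])
{
        if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
                return (0);
        if (cp < 0x10000) {
                out[0] = (char16_t)cp;
                return (1);
        }
        cp -= 0x10000;
        out[0] = (char16_t)(0xd800 + (cp >> 10));       /* high surrogate */
        out[1] = (char16_t)(0xdc00 + (cp & 0x3ff));     /* low surrogate */
        return (2);
}

/*
 * Hypothetical helper: feed n UTF-16 code units to c16rtomb() one at a
 * time, sharing a single conversion state between the calls.  buf must
 * hold at least n * MB_LEN_MAX bytes.  Returns the total number of bytes
 * stored, or (size_t)-1 if any call fails.
 */
size_t
utf16_to_mb(const char16_t *units, size_t n, char *buf)
{
        mbstate_t mbs;
        size_t i, total = 0;

        (void) memset(&mbs, 0, sizeof (mbs));
        for (i = 0; i < n; i++) {
                size_t ret = c16rtomb(buf + total, units[i], &mbs);

                if (ret == (size_t)-1)
                        return (ret);
                total += ret;
        }
        return (total);
}
```

     Because both halves of a pair go through the same mbstate_t, the
     caller does not need to track whether a pair is in progress.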
     The functions all work by looking at the passed in wide-character (c16,
     c32, wc) and appending it to the current conversion state, ps.  Once a
     valid code point, based on the current locale, is found, then it will
     be converted into a series of characters that are stored in str.  Up to
     MB_CUR_MAX bytes will be stored in str.  It is the caller's
     responsibility to ensure that there is sufficient space in str.
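     One way to meet the space requirement is to convert each character
     into a scratch buffer of MB_LEN_MAX bytes and copy it out only if it
     fits.  The following is a sketch under that assumption; the helper
     name wide_to_mb() is hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert the null-terminated wide string src into
 * dst, which holds dstlen bytes.  Each character is converted into a
 * scratch buffer first, so wcrtomb() can never write past the end of dst.
 * Returns the number of bytes stored, not counting the terminating null
 * byte, or (size_t)-1 on an encoding error or if dst is too small.
 */
size_t
wide_to_mb(char *dst, size_t dstlen, const wchar_t *src)
{
        mbstate_t mbs;
        char tmp[MB_LEN_MAX];
        size_t used = 0;

        (void) memset(&mbs, 0, sizeof (mbs));
        for (;;) {
                size_t ret = wcrtomb(tmp, *src, &mbs);

                /* used never exceeds dstlen, so this cannot underflow. */
                if (ret == (size_t)-1 || ret > dstlen - used)
                        return ((size_t)-1);
                (void) memcpy(dst + used, tmp, ret);
                used += ret;
                /* The null wide-character also stored the terminator. */
                if (*src == L'\0')
                        return (used - 1);
                src++;
        }
}
```

     MB_LEN_MAX is used for the scratch buffer because it is a compile-time
     constant, whereas MB_CUR_MAX depends on the locale in effect when it
     is evaluated.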
     The functions are all influenced by the LC_CTYPE category of the
     current locale for determining what is considered a valid character.
     For example, in the C locale, only ASCII characters are recognized,
     while in a UTF-8 based locale like en_US.UTF-8, all valid Unicode code
     points are recognized and will be converted into the corresponding
     multi-byte sequence.  The wcrtomb_l() function uses the locale passed
     in loc rather than the locale of the current thread.
     The ps argument represents a multi-byte conversion state which can be
     used across multiple calls to a given function (but not mixed between
     functions).  This allows characters to be consumed from subsequent
     buffers, e.g. different values of str.  The functions may be called
     from multiple threads as long as they use unique values for ps.  If ps
     is NULL, then a function-specific buffer will be used for the
     conversion state; however, this is shared between all threads and its
     use is not recommended.
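     A caller can avoid the shared internal state by keeping one mbstate_t
     per output stream or per thread.  The following sketch illustrates
     that pattern; struct encoder and its functions are hypothetical names
     introduced only for this example.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical per-stream context that owns its own conversion state. */
struct encoder {
        mbstate_t       enc_state;
};

/* Place the state in the initial conversion state by zeroing it. */
void
encoder_init(struct encoder *enc)
{
        (void) memset(&enc->enc_state, 0, sizeof (enc->enc_state));
}

/*
 * Convert one character using only this encoder's state.  out must hold
 * at least MB_LEN_MAX bytes.  Returns whatever c32rtomb() returns.
 */
size_t
encoder_put(struct encoder *enc, char32_t c32, char *out)
{
        return (c32rtomb(out, c32, &enc->enc_state));
}

/* Returns non-zero if no partial character is pending in the state. */
int
encoder_idle(const struct encoder *enc)
{
        return (mbsinit(&enc->enc_state) != 0);
}
```

     Two threads that each own a struct encoder never pass the same ps
     value, which is the condition the text above places on concurrent
     use.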
     The functions all have a special behavior when NULL is passed for str.
     They instead will treat it as though the null wide-character was
     passed in c16, c32, or wc and an internal buffer (buf) will be used to
     write out the results of the conversion.  In other words, the functions
     would be called as:
           c16rtomb(buf, L'\0', ps)
           c32rtomb(buf, L'\0', ps)
           wcrtomb(buf, L'\0', ps)
           wcrtomb_l(buf, L'\0', ps, loc)
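     A common use of the NULL form is to return a conversion state to its
     initial state without producing output.  The following is a minimal
     sketch of that idiom; the helper name reset_state() is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper: return ps to the initial conversion state by
 * passing NULL for str.  The character argument is ignored in this form.
 * Returns 0 on success and -1 on failure.
 */
int
reset_state(mbstate_t *ps)
{
        if (c32rtomb(NULL, 0, ps) == (size_t)-1)
                return (-1);
        /* mbsinit() reports whether ps is now in the initial state. */
        return (mbsinit(ps) != 0 ? 0 : -1);
}
```

     The bytes that the call would have produced go to the internal buffer
     and are discarded, so no output buffer is needed.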
   Locale Details
     Not all locales in the system are Unicode based locales.  For example,
     ISO 8859 family locales have code points with values that do not match
     their counterparts in Unicode.  When using these functions with non-
     Unicode based locales, the code points returned will be those
     determined by the locale.  They will not be converted from the
     corresponding Unicode code point.  For example, if using the Euro sign
     in ISO 8859-15, these functions will not encode the Unicode value
     0x20ac into the ISO 8859-15 value 0xa4.
     Regardless of the locale, the characters returned will be encoded as
     though the code point were the corresponding value in Unicode.  This
     means that when using UTF-16, if the corresponding code point were in
     the range for surrogate pairs, then the c16rtomb() function will expect
     to receive that code point in that fashion.
     This behavior of the c16rtomb() and c32rtomb() functions should not be
     relied upon, is not portable, and is subject to change for non-Unicode
     locales.
RETURN VALUES
     Upon successful completion, the c16rtomb(), c32rtomb(), wcrtomb(), and
     wcrtomb_l() functions return the number of bytes stored in str.
     Otherwise, (size_t)-1 is returned to indicate an encoding error and
     errno is set.
EXAMPLES
     Example 1 Converting a UTF-32 character into a multi-byte character
     sequence.
     #include <locale.h>
     #include <limits.h>
     #include <stdlib.h>
     #include <string.h>
     #include <err.h>
     #include <stdio.h>
     #include <uchar.h>
     int
     main(void)
     {
             mbstate_t mbs;
             size_t ret;
             /*
              * MB_LEN_MAX is a constant that is large enough for any
              * locale.  MB_CUR_MAX would be evaluated here, before
              * setlocale() is called, and would be too small.
              */
             char buf[MB_LEN_MAX];
             char32_t val = 0x5149;
             /* The UTF-8 encoding of U+5149. */
             const char *uchar_exp = "\xe5\x85\x89";
             (void) memset(&mbs, 0, sizeof (mbs));
             if (setlocale(LC_CTYPE, "en_US.UTF-8") == NULL) {
                     errx(EXIT_FAILURE, "failed to set locale");
             }
             ret = c32rtomb(buf, val, &mbs);
             if (ret != strlen(uchar_exp)) {
                     errx(EXIT_FAILURE, "failed to convert string, got %zd",
                         ret);
             }
             if (strncmp(buf, uchar_exp, ret) != 0) {
                     errx(EXIT_FAILURE, "converted char32_t does not match "
                         "expected value");
             }
             return (0);
     }
ERRORS
     The c16rtomb(), c32rtomb(), wcrtomb(), and wcrtomb_l() functions will
     fail if:
     EINVAL             The conversion state in ps is invalid.
     EILSEQ             An invalid character sequence has been detected.
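     A caller will usually want to tell an encoding failure apart from a
     successful conversion and report the cause.  The following sketch
     shows one way to do so; the helper name try_encode() is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/*
 * Hypothetical helper: try to encode c32 in the current locale.  out must
 * hold at least MB_LEN_MAX bytes.  On success, returns 0 and stores the
 * number of bytes written in *lenp.  On failure, returns the errno value
 * set by c32rtomb(), for example EILSEQ, and leaves *lenp unchanged.
 */
int
try_encode(char32_t c32, char *out, size_t *lenp)
{
        mbstate_t mbs;
        size_t ret;

        (void) memset(&mbs, 0, sizeof (mbs));
        errno = 0;
        ret = c32rtomb(out, c32, &mbs);
        if (ret == (size_t)-1)
                return (errno);
        *lenp = ret;
        return (0);
}
```

     In the C locale, for instance, try_encode(0x5149, ...) is expected to
     fail with EILSEQ because the code point has no ASCII representation.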
MT-LEVEL
     The c16rtomb(), c32rtomb(), wcrtomb(), and wcrtomb_l() functions are
     MT-Safe as long as different mbstate_t structures are passed in ps.  If
     ps is NULL or different threads use the same value for ps, then the
     functions are Unsafe.
INTERFACE STABILITY
     Committed
SEE ALSO
     mbrtoc16(3C), mbrtoc32(3C), mbrtowc(3C), newlocale(3C), setlocale(3C),
     uselocale(3C), uchar.h(3HEAD), environ(7)
illumos                       December 2, 2023                       illumos