UnicodeToUtf8 (Unicode function)
Revision as of 15:49, 20 January 2011
Translate to UTF-8 (Unicode class)
This function converts a Unicode string to a UTF-8 Longstring byte stream.
The UnicodeToUtf8 function is available as of version 7.3 of the Sirius Mods.
Syntax
%utf8Stream = unicode:UnicodeToUtf8[( [InsertBOM= bool])]
Syntax terms
%utf8Stream | A String or Longstring variable to receive the method object string translated to a UTF-8 byte stream. |
---|---|
unicode | A Unicode string. |
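InsertBOM=bool | The optional (name required) InsertBOM argument is a Boolean: if its value is True, the "Byte Order Mark" (U+FEFF, encoded as X'EFBBBF') is inserted at the start of the output stream; if its value is False, the default, no Byte Order Mark is inserted. |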
Exceptions
This function can throw the following exception:
- CharacterTranslationException: If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem. See CharacterTranslationException exception class.
Usage notes
- For more information about UTF-8 conversions, see Unicode#UTF-8 and UTF-16.
- The UnicodeToUtf16 method converts a Unicode string to UTF-16.
- The Utf8ToUnicode method converts a UTF-8 Longstring byte stream to Unicode.
Examples
In the following fragment, UnicodeToUtf8 is used to show how the Unicode U+B2 character (superscript 2) is represented in UTF-8. Appending the StringToHex method is useful for viewing the hex values of characters that do not have displayable EBCDIC equivalents.
The U constant function and StringToHex function are used in the example.
  %u Unicode Initial('&#xB2;':U)
  Print %u:UnicodeToUtf8:StringToHex
The result is:
C2B2
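As a further sketch, not taken from a tested run: assuming the InsertBOM argument behaves as described under Syntax terms, the same character can be converted with a Byte Order Mark inserted at the start of the stream. Because U+FEFF is encoded as X'EFBBBF', the printed hex string would be expected to begin with EFBBBF, followed by the UTF-8 encoding of U+B2.
  %u Unicode Initial('&#xB2;':U)
  Print %u:UnicodeToUtf8(InsertBOM=True):StringToHex
Under that assumption, the expected result is:
EFBBBFC2B2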