UnicodeToUtf8 (Unicode function): Difference between revisions

m (1 revision)
m (1 revision)
Line 2: Line 2:
[[Category:Unicode methods|UnicodeToUtf8 function]]
[[Category:Unicode methods|UnicodeToUtf8 function]]
[[Category:Intrinsic methods]]
[[Category:Intrinsic methods]]
<!--DPL?? Category:Unicode methods|<var>UnicodeToUtf8</var> function: Unicode string converted to UTF-8 byte stream-->
<!--DPL?? Category:<var>Unicode</var> methods|<var>UnicodeToUtf8</var> function: <var>Unicode</var> string converted to UTF-8 byte stream-->
<!--DPL?? Category:Intrinsic methods|<var>UnicodeToUtf8</var> (Unicode function): Unicode string converted to UTF-8 byte stream-->
<!--DPL?? Category:Intrinsic methods|<var>UnicodeToUtf8</var> (<var>Unicode</var> function): <var>Unicode</var> string converted to UTF-8 byte stream-->
<!--DPL?? Category:System methods|<var>UnicodeToUtf8</var> (Unicode function): Unicode string converted to UTF-8 byte stream-->
<!--DPL?? Category:<var>System</var> methods|<var>UnicodeToUtf8</var> (<var>Unicode</var> function): <var>Unicode</var> string converted to UTF-8 byte stream-->


This function converts a Unicode string to a UTF-8
This function converts a <var>Unicode</var> string to a UTF-8
Longstring byte stream.
<var>Longstring</var> byte stream.


The <var>UnicodeToUtf8</var> function is available as of version 7.3 of the <var class=product>Sirius Mods</var>.
The <var>UnicodeToUtf8</var> function is available as of version 7.3 of the <var class=product>Sirius Mods</var>.
Line 15: Line 15:
<table class="syntaxTable">
<table class="syntaxTable">
<tr><th><i>%utf8Stream</i></th>
<tr><th><i>%utf8Stream</i></th>
<td>A String or Longstring variable to receive the method object string translated to a UTF-8 byte stream. </td></tr>
<td>A <var>String</var> or <var>Longstring</var> variable to receive the method object string translated to a UTF-8 byte stream. </td></tr>
<tr><th><i>unicode</i></th>
<tr><th><i>unicode</i></th>
<td>A Unicode string. </td></tr>
<td>A <var>Unicode</var> string. </td></tr>
<tr><th><b>InsertBOM=</b><i>bool</i></th>
<tr><th><b>InsertBOM=</b><i>bool</i></th>
<td>The optional (name required) InsertBOM argument is a Boolean: <ul> <li>If its value is <tt>True</tt>, the "Byte Order Mark" (U+FEFF, encoded as X'EFBBBF') is inserted at the start of the output stream. <li>If its value is <tt>False</tt>, the default, no Byte Order Mark is inserted. </ul></td></tr>
<td>The optional (name required) InsertBOM argument is a Boolean: <ul> <li>If its value is <tt>True</tt>, the "Byte Order Mark" (U+FEFF, encoded as X'EFBBBF') is inserted at the start of the output stream. <li>If its value is <tt>False</tt>, the default, no Byte Order Mark is inserted. </ul></td></tr>
Line 25: Line 25:
This function can throw the following exception:
This function can throw the following exception:
<dl>
<dl>
<dt>CharacterTranslationException
<dt><var>CharacterTranslationException</var>
<dd>If the method encounters a translation problem,
<dd>If the method encounters a translation problem,
properties of the exception object may indicate the location and type of problem.
properties of the exception object may indicate the location and type of problem.
Line 35: Line 35:
<li>For more information about UTF-8 conversions, see [[Unicode#UTF-8 and UTF-16]].
<li>For more information about UTF-8 conversions, see [[Unicode#UTF-8 and UTF-16]].
<li>The [[UnicodeToUtf16 (Unicode function)|UnicodeToUtf16]] method
<li>The [[UnicodeToUtf16 (Unicode function)|UnicodeToUtf16]] method
converts a Unicode string to UTF-16.
converts a <var>Unicode</var> string to UTF-16.
<li>The [[Utf8ToUnicode (String function)|Utf8ToUnicode]] method
<li>The [[Utf8ToUnicode (String function)|Utf8ToUnicode]] method
converts a UTF-8 Longstring byte stream to Unicode.
converts a UTF-8 <var>Longstring</var> byte stream to <var>Unicode</var>.
</ul>
</ul>
==Examples==
==Examples==


In the following fragment, <var>UnicodeToUtf8</var> is used to show how the
In the following fragment, <var>UnicodeToUtf8</var> is used to show how the
Unicode U+B2 character (superscript 2) is represented in UTF-8.
<var>Unicode</var> U+B2 character (superscript 2) is represented in UTF-8.
Appending the StringToHex method is useful for viewing
Appending the <var>String</var>ToHex method is useful for viewing
the hex values of characters that do not have displayable EBCDIC equivalents.
the hex values of characters that do not have displayable EBCDIC equivalents.


Line 49: Line 49:
are used in the example.
are used in the example.
<pre>
<pre>
     %u Unicode Initial('&#xB2;':U)
     %u <var>Unicode</var> Initial('&#xB2;':U)
     Print %u:<var>UnicodeToUtf8</var>:StringToHex
     Print %u:<var>UnicodeToUtf8</var>:<var>String</var>ToHex
</pre>
</pre>



Revision as of 15:49, 20 January 2011

Translate to UTF-8 (Unicode class)

This function converts a Unicode string to a UTF-8 Longstring byte stream.

The UnicodeToUtf8 function is available as of version 7.3 of the Sirius Mods.

Syntax

%utf8Stream = unicode:UnicodeToUtf8[([InsertBOM=bool])]

Syntax terms

%utf8Stream
A String or Longstring variable to receive the method object string translated to a UTF-8 byte stream.

unicode
A Unicode string.

InsertBOM=bool
The optional (name required) InsertBOM argument is a Boolean:
  • If its value is True, the "Byte Order Mark" (U+FEFF, encoded as X'EFBBBF') is inserted at the start of the output stream.
  • If its value is False, the default, no Byte Order Mark is inserted.

Exceptions

This function can throw the following exception:

CharacterTranslationException
If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem. See CharacterTranslationException exception class.

Usage notes

  • For more information about UTF-8 conversions, see Unicode#UTF-8 and UTF-16.
  • The UnicodeToUtf16 method converts a Unicode string to UTF-16.
  • The Utf8ToUnicode method converts a UTF-8 Longstring byte stream to Unicode.

Examples

In the following fragment, UnicodeToUtf8 is used to show how the Unicode U+B2 character (superscript 2) is represented in UTF-8. Appending the StringToHex method is useful for viewing the hex values of characters that do not have displayable EBCDIC equivalents.

The U constant function and StringToHex function are used in the example.

    %u Unicode Initial('&#xB2;':U)
    Print %u:UnicodeToUtf8:StringToHex

The result is:

    C2B2
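
The two bytes follow from the UTF-8 two-byte form 110xxxxx 10xxxxxx: U+B2 is binary 10110010, which is split into 00010 and 110010, giving X'C2' and X'B2'.

The InsertBOM argument described under Syntax terms can be illustrated with the same fragment. The following is a minimal sketch, not a captured run: it assumes the same %u declaration as above, and the expected result is inferred from the documented BOM encoding (X'EFBBBF') placed in front of the C2B2 bytes.

    %u Unicode Initial('&#xB2;':U)
    Print %u:UnicodeToUtf8(InsertBOM=True):StringToHex

If the Byte Order Mark is inserted as documented, the result should be:

    EFBBBFC2B2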