UnicodeToUtf8 (Unicode function): Difference between revisions

From m204wiki

Revision as of 00:55, 13 April 2011

Translate to UTF-8 (Unicode class)

UnicodeToUtf8 converts a Unicode string to a UTF-8 Longstring byte stream.

Syntax

%string = unicode:UnicodeToUtf8[( [InsertBOM= boolean])]

Syntax terms

%string     A String or Longstring variable to receive the method object string translated to a UTF-8 byte stream.
unicode     A Unicode string.
InsertBOM   The optional (NameRequired) InsertBOM argument is a Boolean:
  • If its value is True, the "Byte Order Mark" (U+FEFF, encoded as X'EFBBBF') is inserted at the start of the output stream.
  • If its value is False, the default, no Byte Order Mark is inserted.
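
The effect of InsertBOM on the output bytes can be cross-checked outside Model 204. The following is a minimal Python sketch, not SOUL code: the function name unicode_to_utf8 and its insert_bom parameter are hypothetical stand-ins chosen to mirror the method described above, and the sketch assumes that the method's output matches standard UTF-8 encoding.

```python
import codecs


def unicode_to_utf8(text: str, insert_bom: bool = False) -> bytes:
    """Hypothetical stand-in for UnicodeToUtf8 (an assumption, not the SOUL method)."""
    encoded = text.encode("utf-8")
    # codecs.BOM_UTF8 is the three bytes EF BB BF, which is U+FEFF encoded in UTF-8.
    if insert_bom:
        return codecs.BOM_UTF8 + encoded
    return encoded


# Without the BOM (the default), only the encoded characters are returned.
print(unicode_to_utf8("\u00b2").hex().upper())                   # C2B2
# With the BOM, X'EFBBBF' precedes the encoded characters.
print(unicode_to_utf8("\u00b2", insert_bom=True).hex().upper())  # EFBBBFC2B2
```

The BOM adds three bytes to the front of the stream, so a receiver that does not expect it will see them as data.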

Exceptions

UnicodeToUtf8 can throw the following exception:

CharacterTranslationException
If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.

Usage notes

  • UnicodeToUtf8 is available as of Sirius Mods Version 7.3.

Examples

  1. In the following fragment, UnicodeToUtf8 is used to show how the Unicode character U+00B2 (superscript two) is represented in UTF-8. Appending the StringToHex method is useful for viewing the hex values of characters that do not have displayable EBCDIC equivalents.

    %u unicode initial('&#xB2;':U)
    print %u:UnicodeToUtf8:stringToHex

    The result is:

    C2B2
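
    The C2B2 result follows from the two-byte UTF-8 form, which covers code points U+0080 through U+07FF. The Python sketch below (not SOUL; the helper name encode_two_byte is hypothetical) builds the two bytes by hand and compares them with Python's own encoder.

```python
def encode_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF using the two-byte UTF-8 form."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte UTF-8 covers U+0080 through U+07FF only")
    lead = 0xC0 | (code_point >> 6)     # 110xxxxx carries the top five bits
    trail = 0x80 | (code_point & 0x3F)  # 10xxxxxx carries the low six bits
    return bytes([lead, trail])


result = encode_two_byte(0xB2)
print(result.hex().upper())  # C2B2
# Cross-check against the built-in UTF-8 encoder.
assert result == "\u00b2".encode("utf-8")
```

    For U+00B2 (binary 1011 0010), the top bits 00010 give the lead byte X'C2' and the low six bits 110010 give the trail byte X'B2'.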

See also