UnicodeToUtf8 (Unicode function)
Revision as of 00:55, 13 April 2011
Translate to UTF-8 (Unicode class)
UnicodeToUtf8 converts a Unicode string to a UTF-8 Longstring byte stream.
Syntax
%string = unicode:UnicodeToUtf8[( [InsertBOM= boolean])]
Syntax terms
%string | A String or Longstring variable to receive the method object string translated to a UTF-8 byte stream. |
---|---|
unicode | A Unicode string. |
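InsertBOM | The optional (NameRequired) InsertBOM argument is a Boolean: if its value is True, the Byte Order Mark (U+FEFF, encoded as X'EFBBBF') is inserted at the start of the output stream; if its value is False, the default, no Byte Order Mark is inserted. |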
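The effect of InsertBOM can be cross-checked outside Sirius Mods. The following is a minimal Python sketch, not a Sirius Mods API: the helper name to_utf8_hex and its insert_bom parameter are hypothetical, and the sketch assumes only that Python's standard utf-8 and utf-8-sig codecs produce the byte sequences described above (no prefix, or the X'EFBBBF' prefix).

```python
def to_utf8_hex(text: str, insert_bom: bool = False) -> str:
    """Hypothetical helper: hex digits of the UTF-8 bytes for `text`.

    When insert_bom is true, the utf-8-sig codec prefixes the
    Byte Order Mark (U+FEFF, bytes EF BB BF) to the output.
    """
    codec = "utf-8-sig" if insert_bom else "utf-8"
    return text.encode(codec).hex().upper()


print(to_utf8_hex("\u00b2"))                   # C2B2
print(to_utf8_hex("\u00b2", insert_bom=True))  # EFBBBFC2B2
```

The only difference between the two outputs is the three-byte prefix, which is what a consumer of the byte stream would need to expect or strip.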
Exceptions
UnicodeToUtf8 can throw the following exception:
- CharacterTranslationException: If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.
Usage notes
- UnicodeToUtf8 is available as of Sirius Mods Version 7.3.
Examples
In the following fragment, UnicodeToUtf8 is used to show how the Unicode U+B2 character (superscript 2) is represented in UTF-8. Appending the StringToHex method is useful for viewing the hex values of characters that do not have displayable EBCDIC equivalents.
    %u unicode initial('&#xB2;':U)
    print %u:UnicodeToUtf8:stringToHex
The result is:
C2B2
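To see where C2B2 comes from, the two-byte UTF-8 bit layout can be worked through by hand. The following Python sketch is illustrative only and is not how the Sirius Mods method is implemented; the function name encode_two_byte is hypothetical, and it handles only code points in the range U+0080 through U+07FF, which is where U+B2 falls.

```python
def encode_two_byte(code_point: int) -> bytes:
    """Hypothetical helper: UTF-8 encode a code point in U+0080..U+07FF."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-byte UTF-8 range is handled here")
    lead = 0xC0 | (code_point >> 6)     # 110xxxxx carries the top 5 bits
    trail = 0x80 | (code_point & 0x3F)  # 10xxxxxx carries the low 6 bits
    return bytes([lead, trail])


result = encode_two_byte(0xB2)
print(result.hex().upper())  # C2B2
```

U+B2 is binary 10110010: the top bits 00010 go into the lead byte (X'C2') and the low bits 110010 go into the trailing byte (X'B2').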
See also
- For more information about UTF-8 conversions, see "Unicode: UTF-8 and UTF-16".
- UnicodeToUtf16 converts a Unicode string to UTF-16.
- Utf8ToUnicode converts a UTF-8 Longstring byte stream to Unicode.