Utf8ToUnicode (String function): Difference between revisions

Revision as of 15:32, 19 January 2011

Convert a UTF-8 Longstring byte stream to Unicode (String class)


This intrinsic function converts a UTF-8 Longstring byte stream to Unicode.

The Utf8ToUnicode function is available as of version 7.3 of the Sirius Mods.

Syntax

%unicode = string:Utf8ToUnicode[( [AllowUntranslatable= boolean])] Throws CharacterTranslationException

Syntax terms

%unicode
A string variable to receive the method object string translated to Unicode.

string
A String or Longstring that is presumed to contain a UTF-8 byte stream.

Exceptions

This intrinsic function can throw the following exception:

CharacterTranslationException
If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.
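
As a rough analogue only, the following Python sketch shows what "location and type of problem" can look like when a UTF-8 byte stream is malformed. It uses Python's built-in UTF-8 codec and its UnicodeDecodeError, not the Sirius Mods; the properties shown belong to the Python exception and are not a description of the CharacterTranslationException properties. The helper name describe_utf8_problem is hypothetical.

```python
# Illustrative analogue in Python; this is NOT SOUL and does not call Utf8ToUnicode.
def describe_utf8_problem(data: bytes):
    """Return None if data is valid UTF-8, else (byte offset, reason) from Python's codec."""
    try:
        data.decode("utf-8")          # strict decoding: malformed input raises an exception
    except UnicodeDecodeError as err:
        return err.start, err.reason  # where decoding failed and a short description of why
    return None

# 'E284' is the trademark sequence E2 84 A2 with its final byte missing.
truncated = describe_utf8_problem(bytes.fromhex("E284"))
complete = describe_utf8_problem(bytes.fromhex("E284A2"))
print(truncated, complete)
```

The design point carried over from the text above is that a strict conversion reports where and why it failed instead of silently substituting a character.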

Usage notes

  • See "Utf8 and Utf16" for more information about UTF-8 conversions.
  • The Utf16ToUnicode method converts a UTF-16 byte stream to Unicode.
  • The UnicodeToUtf8 method converts a Unicode string to a UTF-8 Longstring byte stream.

Examples

In the following fragment, Utf8ToUnicode converts a three-byte UTF-8 input, specified in hexadecimal, to a single Unicode character. If the Unicode character cannot be translated to a displayable EBCDIC character, the CharacterEncode option of the UnicodeToEbcdic method causes a hexadecimal character reference to be output instead. The X constant function is used to specify the input bytes.

   %u Unicode
   %u = 'E284A2':X:Utf8ToUnicode
   Print %u:UnicodeToEbcdic(CharacterEncode=true)

The result of the above fragment is the character reference for the trademark character:

   &#x2122;
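
To check the byte arithmetic behind this result outside Model 204, here is a minimal Python sketch (not SOUL, and not part of the Sirius Mods). It decodes the same three bytes by hand, following the standard three-byte UTF-8 layout 1110xxxx 10xxxxxx 10xxxxxx, and formats the code point as a hexadecimal character reference. The function name decode_three_byte_utf8 is hypothetical, and Python's built-in codec is used only as a cross-check.

```python
# Illustrative analogue in Python; Utf8ToUnicode and UnicodeToEbcdic are not called here.
def decode_three_byte_utf8(data: bytes) -> int:
    """Decode one three-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx) to a code point."""
    if len(data) != 3 or (data[0] & 0xF0) != 0xE0:
        raise ValueError("not a three-byte UTF-8 sequence")
    if (data[1] & 0xC0) != 0x80 or (data[2] & 0xC0) != 0x80:
        raise ValueError("invalid continuation byte")
    # Keep 4 payload bits from the lead byte and 6 from each continuation byte.
    return ((data[0] & 0x0F) << 12) | ((data[1] & 0x3F) << 6) | (data[2] & 0x3F)

utf8_bytes = bytes.fromhex("E284A2")             # same input as 'E284A2':X
code_point = decode_three_byte_utf8(utf8_bytes)  # 0x2122, the trademark sign
assert chr(code_point) == utf8_bytes.decode("utf-8")  # cross-check with the built-in codec
char_ref = "&#x{:X};".format(code_point)
print(char_ref)                                  # prints &#x2122;
```

The payload bits are 0010, 000100, and 100010, which concatenate to 0010 0001 0010 0010, or hexadecimal 2122.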