Utf8ToUnicode (String function): Difference between revisions


Revision as of 14:04, 19 January 2011

Convert a UTF-8 Longstring byte stream to Unicode (String class)


This intrinsic function converts a UTF-8 Longstring byte stream to Unicode.

The Utf8ToUnicode function is available as of version 7.3 of the Sirius Mods.

Syntax

%unicode = string:Utf8ToUnicode[( [AllowUntranslatable= boolean])] Throws CharacterTranslationException

Syntax terms

%unicode
A string variable to receive the method object string translated to Unicode.

string
A String or Longstring that is presumed to contain a UTF-8 byte stream.

Exceptions

This intrinsic function can throw the following exception:

CharacterTranslationException
If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.
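
As an analogy only, the following Python sketch (not SOUL code) shows how a UTF-8 decoder can report where a translation problem occurred and what kind of problem it was. The helper name `describe_utf8_problem` is hypothetical, and the properties of CharacterTranslationException itself are not listed on this page, so nothing here should be read as the SOUL API.

```python
from typing import Optional, Tuple


def describe_utf8_problem(data: bytes) -> Optional[Tuple[int, str]]:
    """Return (byte offset, reason) for the first UTF-8 decoding
    problem in data, or None if data is valid UTF-8.

    Hypothetical helper for illustration; it relies on Python's
    standard UnicodeDecodeError, not on any Model 204 facility.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte;
        # err.reason is a short description of the problem.
        return err.start, err.reason
    return None


# A valid stream reports no problem; a stray X'FF' byte does.
print(describe_utf8_problem(bytes.fromhex("E284A2")))
print(describe_utf8_problem(b"A\xff"))
```

The first call reports no problem because X'E284A2' is a well-formed three-byte sequence; the second reports a problem at the byte that follows the "A".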

Usage notes

  • The Utf8 and Utf16 page has more information about UTF-8 conversions.
  • The Utf16ToUnicode method converts a UTF-16 byte stream to Unicode.
  • The UnicodeToUtf8 method converts a Unicode string to a UTF-8 Longstring byte stream.

Examples

In the following fragment, Utf8ToUnicode converts a hexadecimal input to a single Unicode character. Because the Unicode character might translate to an EBCDIC character that cannot be displayed, the fragment specifies the CharacterEncode option of the UnicodeToEbcdic method, which causes a hexadecimal character reference to be output in that case. The X constant function supplies the hexadecimal input.

   %u Unicode
   %u = 'E284A2':X:Utf8ToUnicode
   Print %u:unicodeToEbcdic(CharacterEncode=true)

The result of the above fragment is the character reference for the trademark character:

   &#x2122;
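
The result can be cross-checked outside Model 204. The following is a minimal Python sketch, not SOUL code: it decodes the same three bytes with Python's standard UTF-8 decoder and formats the resulting code point as a hexadecimal character reference. The function name `utf8_hex_to_char_ref` is hypothetical and exists only for this illustration.

```python
def utf8_hex_to_char_ref(hex_bytes: str) -> str:
    """Decode a hex string holding one UTF-8 encoded character and
    return that character as a hexadecimal character reference.

    Hypothetical helper for illustration only.
    """
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    if len(text) != 1:
        raise ValueError("expected exactly one Unicode character")
    # ord() gives the code point; the X format writes it in uppercase hex.
    return f"&#x{ord(text):X};"


print(utf8_hex_to_char_ref("E284A2"))  # prints &#x2122;
```

The bytes X'E2', X'84', and X'A2' form a three-byte UTF-8 sequence for the code point U+2122, which matches the character reference shown above.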