Utf8ToUnicode (String function): Difference between revisions

m (1 revision)
m (remove Mods version)
 
(41 intermediate revisions by 8 users not shown)
Line 1: Line 1:
{{Template:String:Utf8ToUnicode subtitle}}
{{Template:String:Utf8ToUnicode subtitle}}
The <var>Utf8ToUnicode</var> [[Intrinsic classes|intrinsic]] function converts a UTF-8 string to <var>Unicode</var>.


This [[Intrinsic classes|intrinsic]] function converts a UTF-8 Longstring byte stream to Unicode.           
                                                                                                             
The Utf8ToUnicode function is available as of version 7.3 of the [[Sirius Mods]].                           
==Syntax==
==Syntax==
{{Template:String:Utf8ToUnicode syntax}}
{{Template:String:Utf8ToUnicode syntax}}
===Syntax terms===
===Syntax terms===
<dl>                                                                                                        
<table class="syntaxTable">
<dt>%unicode                                                                                                
<tr><th>%unicode</th>
<dd>A string variable to receive the method object string translated to Unicode.  
<td>A Unicode string variable to receive the method object <var class="term">string</var> translated to <var>Unicode</var>.</td></tr>
<dt>string                                                                                                  
 
<dd>A String or Longstring that is presumed to contain a UTF-8 byte stream.  
<tr><th>string</th>
                                                                                                             
<td>The method object <var class="term">string</var> that is presumed to contain a UTF-8 byte stream.</td></tr>
</dl>                                                                                                        
 
===Exceptions===                                                                                             
<tr><th><var>AllowUntranslatable</var></th>
                                                                                                             
<td>This argument indicates whether this function will store values into the target Unicode string if they cannot be translated to EBCDIC. This value defaults to <var>True</var>, which means that such values would be allowed. If this argument is set to <var>False</var>, a Unicode value not translatable to EBCDIC produces a [[CharacterTranslationException class|CharacterTranslationException exception]].
This [[Intrinsic classes|intrinsic]] function can throw the following exception:                            
<p>
<dl>                                                                                                        
Unless there is a compelling reason to do otherwise, it is best to use the default value of <var>True</var>.</p></td></tr>
<dt>[[CharacterTranslationException]]                                                                        
</table>
<dd>If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.  
 
</dl>                                                                                                        
==Exceptions==
==Usage notes==
 
*[[Utf8 and Utf16]] has more information about UTF-8 conversions.                                           
<var>Utf8ToUnicode</var> can throw the following exception:
*The [[Utf16ToUnicode (String function)|Utf16ToUnicode]] method converts a UTF-16 byte stream to Unicode.   
<table class="noBorder">
*The [[UnicodeToUtf8 (String function)|UnicodeToUtf8]] method converts a Unicode string to a UTF-8 Longstring byte stream.
<tr><th><var>[[CharacterTranslationException_class|CharacterTranslationException]]</var> </th>
<td>If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem. </td></tr>
</table>
 
==Examples==
==Examples==
                 
In the following fragment, <var>Utf8ToUnicode</var> converts a hexadecimal input to a single <var>Unicode</var> character. In case the <var>Unicode</var> character translates to an EBCDIC character that cannot be displayed, the <var>CharacterEncode</var> option of <var>[[UnicodeToEbcdic (Unicode function)|UnicodeToEbcdic]]</var> causes the output of a hexadecimal character reference.
In the following fragment, Utf8ToUnicode converts a hexadecimal input to a single Unicode character. In case the Unicode character translates to an EBCDIC character that cannot be displayed, the CharacterEncode option of the [[UnicodeToEbcdic (Unicode function)|UnicodeToEbcdic]] method
<p class="code">%u unicode
causes the output of a hexadecimal character reference. The ''''[[X (String function)|X]]'''' constant function is used in the example.         
%u = 'E284A2':[[X (String function)|X]]:Utf8ToUnicode
    %u Unicode                                                                           
print %u:unicodeToEbcdic(CharacterEncode=true)
    %u = 'E284A2':X:Utf8ToUnicode                                                        
</p>
    Print %u:unicodeToEbcdic(CharacterEncode=true)                                        
The result of the above fragment is the character reference for the trademark character:
                                                                                         
<p class="output">&amp;#x2122;
The result of the above fragment is the character reference for the trademark character:  
</p>
    &amp;#x2122;                                                                          
 
[[Category:Intrinsic String methods|Utf8ToUnicode function]]
==See also==
[[Category:Intrinsic methods]]
<ul>
<li>[[Unicode#UTF-8 and UTF-16|Utf8 and Utf16]] has more information about UTF-8 conversions. </li>
 
<li><var>[[UnicodeToUtf8 (Unicode function)|UnicodeToUtf8]]</var> converts a <var>Unicode</var> string to a UTF-8 <var>Longstring</var> byte stream. </li>
 
<li><var>[[Utf16ToUnicode (String function)|Utf16ToUnicode]]</var> converts a UTF-16 byte stream to <var>Unicode</var>. </li>
</ul>
 
{{Template:String:Utf8ToUnicode footer}}

Latest revision as of 19:38, 13 April 2016

Convert a UTF-8 Longstring byte stream to Unicode (String class)

The Utf8ToUnicode intrinsic function converts a UTF-8 string to Unicode.

Syntax

%unicode = string:Utf8ToUnicode[( [AllowUntranslatable= boolean])] Throws CharacterTranslationException

Syntax terms

%unicode
    A Unicode string variable to receive the method object string translated to Unicode.
string
    The method object string that is presumed to contain a UTF-8 byte stream.
AllowUntranslatable
    This argument indicates whether this function will store values into the target Unicode string if they cannot be translated to EBCDIC. This value defaults to True, which means that such values would be allowed. If this argument is set to False, a Unicode value not translatable to EBCDIC produces a CharacterTranslationException exception.

Unless there is a compelling reason to do otherwise, it is best to use the default value of True.
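
The effect of AllowUntranslatable can be illustrated outside SOUL. The following Python sketch is only an analogy, not the Model 204 implementation: the names utf8_to_unicode, allow_untranslatable, and CharacterTranslationError are hypothetical, and Python's cp037 codec stands in for whatever EBCDIC translation table the site actually uses.

```python
class CharacterTranslationError(Exception):
    """Hypothetical stand-in for SOUL's CharacterTranslationException."""

    def __init__(self, position, char):
        super().__init__(
            f"U+{ord(char):04X} at position {position} has no EBCDIC equivalent"
        )
        self.position = position
        self.char = char


def utf8_to_unicode(data, allow_untranslatable=True, ebcdic_codec="cp037"):
    """Decode UTF-8 bytes; optionally reject characters with no EBCDIC mapping.

    Assumption: cp037 approximates the EBCDIC code page in use.
    """
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    if not allow_untranslatable:
        for position, char in enumerate(text):
            try:
                char.encode(ebcdic_codec)
            except UnicodeEncodeError as exc:
                raise CharacterTranslationError(position, char) from exc
    return text


# Default behavior: the trademark character is stored even though
# cp037 cannot represent it.
tm = utf8_to_unicode(b"\xe2\x84\xa2")

# Strict behavior: the same input is rejected.
try:
    utf8_to_unicode(b"\xe2\x84\xa2", allow_untranslatable=False)
except CharacterTranslationError as err:
    print(err)
```

In this sketch the strict mode only adds a check; the decoded value is the same in both modes whenever every character is translatable.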

Exceptions

Utf8ToUnicode can throw the following exception:

CharacterTranslationException
    If the method encounters a translation problem, properties of the exception object may indicate the location and type of problem.

Examples

In the following fragment, Utf8ToUnicode converts a hexadecimal input to a single Unicode character. In case the Unicode character translates to an EBCDIC character that cannot be displayed, the CharacterEncode option of UnicodeToEbcdic causes the output of a hexadecimal character reference.

    %u unicode
    %u = 'E284A2':X:Utf8ToUnicode
    print %u:unicodeToEbcdic(CharacterEncode=true)

The result of the above fragment is the character reference for the trademark character:

&#x2122;
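
For readers who want to check the byte-level arithmetic, the same conversion can be reproduced in Python. This is a rough cross-check rather than SOUL code: utf8_hex_to_unicode and char_ref are hypothetical helper names, and char_ref always emits a reference, whereas CharacterEncode does so only for characters that cannot be translated to EBCDIC.

```python
def utf8_hex_to_unicode(hex_bytes):
    """Interpret a hex string as UTF-8 bytes and decode it to a Unicode string."""
    return bytes.fromhex(hex_bytes).decode("utf-8")


def char_ref(char):
    """Format one character as an XML hexadecimal character reference."""
    return f"&#x{ord(char):X};"


# X'E284A2' is the three-byte UTF-8 encoding of U+2122 (trademark sign).
u = utf8_hex_to_unicode("E284A2")
print(char_ref(u))  # prints &#x2122;
```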

See also