CharacterToUnicodeMap class: Difference between revisions

From m204wiki
mNo edit summary
 
(8 intermediate revisions by 3 users not shown)
Line 1: Line 1:
A <var>CharacterToUnicodeMap</var> object contains a mapping of single-byte code points to Unicode (double-byte) characters. The single-byte characters are presumably EBCDIC or ASCII, though the class does not explicitly require this &mdash; it simply maps code points in some character set to Unicode.  
A <var>CharacterToUnicodeMap</var> object contains a mapping of single-byte code points to Unicode characters. The single-byte characters are presumably EBCDIC or ASCII, though the class does not explicitly require this &mdash; it simply maps code points in some character set to Unicode.  


The <var>CharacterToUnicodeMap</var> class provides programmers with a facility for creating or copying
The <var>CharacterToUnicodeMap</var> class provides programmers with a facility for creating or copying
Line 7: Line 7:
the <var class="product">Sirius Mods</var> codepages supported at your site.
the <var class="product">Sirius Mods</var> codepages supported at your site.


The [[List of CharacterToUnicodeMap methods|"List of CharacterToUnicodeMap methods"]] shows all the class methods.
You can use the <var>String</var> class <var>[[CharacterToUnicode (String function)|CharacterToUnicode]]</var> function
 
which returns the mapped Unicode characters for its string method object.
You can use the <var>String</var> class <var>[[CharacterToUnicode (String function)|CharacterToUnicode]]</var> method
to produce the mapped Unicode characters for the string characters you specify.


The <var>CharacterToUnicodeMap</var> class is new as of version 8.0 of the <var class="product">Sirius Mods</var>.
The <var>CharacterToUnicodeMap</var> class is new as of version 8.0 of the <var class="product">Sirius Mods</var>.
Line 18: Line 16:
the ASCII '01234' equivalent hex characters to Unicode. If you  
the ASCII '01234' equivalent hex characters to Unicode. If you  
attempt to extract a translation of a code point not in the mapping, say an ASCII code point with the high order bit on like X'80', you get a <var>[[CharacterTranslationException class|CharacterTranslationException]]</var> exception:
attempt to extract a translation of a code point not in the mapping, say an ASCII code point with the high order bit on like X'80', you get a <var>[[CharacterTranslationException class|CharacterTranslationException]]</var> exception:
<p class="code">b                                                                               
<p class="code">begin
                                                                                     
%i    is float       
%l    is longstring
%u    is unicode
%tr  is object characterToUnicodeMap


for %i from 0 to 127  
%i    is float
%l    is longstring
%u    is unicode
%tr  is object characterToUnicodeMap
 
for %i from 0 to 127
   %l = %l with %i:integerToBinary(1)
   %l = %l with %i:integerToBinary(1)
   %u = %u:unicodeWith(%i:integerToBinary(2):utf16ToUnicode)
   %u = %u:unicodeWith(%i:integerToBinary(2):utf16ToUnicode)
end for                          
end for


%tr = new(in=%l, out=%u)
%tr = new(in=%l, out=%u)


printText {'3031323334':x:characterToUnicode(%tr)}  
printText {'3031323334':x:characterToUnicode(%tr)}


end </p>
end </p>
The output of the above request is:
<p class="output">01234</p>
===Further examples===
See the [[NewFromEbcdicCodepage (CharacterToUnicodeMap function)#Examples|NewFromEbcdicCodepage examples]].
==List of CharacterToUnicodeMap methods==
The [[List of CharacterToUnicodeMap methods]] shows all the class methods.
[[Category:System classes]]

Latest revision as of 19:05, 20 April 2018

A CharacterToUnicodeMap object contains a mapping of single-byte code points to Unicode characters. The single-byte characters are presumably EBCDIC or ASCII, though the class does not explicitly require this — it simply maps code points in some character set to Unicode.

The CharacterToUnicodeMap class provides programmers with a facility for creating or copying codepages that are not available in the current Sirius Mods version. Besides methods for creating and modifying your own codepage, the class also has methods for dynamically invoking one of the Sirius Mods codepages supported at your site.

You can use the String class CharacterToUnicode function, which takes a map as its argument and returns the mapped Unicode characters for its string method object.

The CharacterToUnicodeMap class is new as of version 8.0 of the Sirius Mods.

Example

The following request generates a basic ASCII to Unicode codepage, then converts the hex string X'3031323334' (the ASCII characters '01234') to Unicode. Because the map covers only code points 0 through 127, an attempt to translate a code point outside the mapping, such as X'80' (an ASCII code point with the high-order bit on), results in a CharacterTranslationException exception:

begin
%i    is float
%l    is longstring
%u    is unicode
%tr   is object characterToUnicodeMap

for %i from 0 to 127
   %l = %l with %i:integerToBinary(1)
   %u = %u:unicodeWith(%i:integerToBinary(2):utf16ToUnicode)
end for

%tr = new(in=%l, out=%u)

printText {'3031323334':x:characterToUnicode(%tr)}

end

The output of the above request is:

01234
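
The request above shows only the successful translation. As a rough illustration of the unmapped code point case, the following Python sketch models the same idea outside of SOUL: a table built from parallel "in" and "out" strings, and a lookup that fails for a byte with no entry. It is not Sirius Mods code; the names build_map, character_to_unicode, and CharacterTranslationError are hypothetical stand-ins chosen for this sketch, and the real class may behave differently in detail.

```python
class CharacterTranslationError(Exception):
    """Hypothetical stand-in for the CharacterTranslationException class."""


def build_map(in_bytes, out_chars):
    """Pair each single-byte code point with the Unicode character at the same position."""
    if len(in_bytes) != len(out_chars):
        raise ValueError("in and out must have the same length")
    return dict(zip(in_bytes, out_chars))


def character_to_unicode(data, table):
    """Translate each byte through the table; fail on a byte that has no mapping."""
    result = []
    for position, code_point in enumerate(data, start=1):
        if code_point not in table:
            raise CharacterTranslationError(
                "no mapping for X'%02X' at byte %d" % (code_point, position)
            )
        result.append(table[code_point])
    return "".join(result)


# Same table as the request above: code points 0 through 127 map to U+0000 through U+007F.
table = build_map(bytes(range(128)), "".join(chr(i) for i in range(128)))

print(character_to_unicode(bytes.fromhex("3031323334"), table))  # prints 01234

try:
    character_to_unicode(bytes.fromhex("80"), table)
except CharacterTranslationError as error:
    print("translation failed:", error)
```

In this model, X'80' fails because the table was built only for the first 128 code points, which mirrors the reason given above for the exception.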

Further examples

See the NewFromEbcdicCodepage examples.

List of CharacterToUnicodeMap methods

The List of CharacterToUnicodeMap methods shows all the class methods.