NewFromEbcdicCodepage (CharacterToUnicodeMap function)

From m204wiki

Create CharacterToUnicodeMap object from EBCDIC codepage (CharacterToUnicodeMap class)

[Introduced in Sirius Mods 8.0]


This method lets you dynamically select a particular codepage for specific data without having to change your system-wide codepage.

Syntax

%characterToUnicodeMap = [%(CharacterToUnicodeMap):]NewFromEbcdicCodepage[( [codepageName])]

Syntax terms

%characterToUnicodeMap
  A CharacterToUnicodeMap object.
[%(CharacterToUnicodeMap):]
  The optional class name in parentheses denotes a virtual constructor. See "Usage notes", below, for more information about invoking a virtual constructor.
codepageName
  A string that identifies one of the currently supported codepages at your site. You can view (and manage) these codepages with the UNICODE command.

Usage notes

  • NewFromEbcdicCodepage is a virtual constructor and as such can be called with no method object, with an explicit class name, or with an object variable, even if that object is null:

    %charToUmap = newFromEbcdicCodepage('0037EXT')
    %charToUmap = %(CharacterToUnicodeMap):newFromEbcdicCodepage('0037EXT')
    %charToUmap = %charToUmap:newFromEbcdicCodepage('0037EXT')

Examples

Simple example: translating to Unicode Euro character

The following code sequence shows the use of a Sirius Mods extended codepage, which maps the EBCDIC X'20' character (normally not translatable to Unicode) to the Unicode character U+20AC (the Euro sign):

%tr is object characterToUnicodeMap
%tr = newFromEbcdicCodepage('0285EXT')
%u = ('20':x):characterToUnicode(%tr)
Printtext {~} is {%u:UnicodeToUtf16:StringToHex}

The result is:

%u:UnicodeToUtf16:StringToHex is 20AC

Extensive example: translating EBCDIC Cyrillic values

In the following example, the contents of field CYR are Cyrillic characters, represented using single-byte EBCDIC values. This example shows how to map those field values to Unicode, and then send them as part of an XML document. The process receiving the XML document will obtain the Cyrillic characters in Unicode.

OPEN FILE QAWORK
*UPDATE
IN FILE QAWORK INITIALIZE
DEFINE FIELD CYR
DEFINE FIELD DESCR
begin
* Declarations:
%doc is object xmlDoc
%map is object characterToUnicodeMap
%rec is object xmlNode
%top is object xmlNode
%unic is unicode
* Store test data:
store record
   cyr = ('7778AF':x)
   descr = 'Cyrillic A, BE, VE'
end store
* Initialization:
%map = newFromEbcdicCodepage('1154')
%doc = new
* Create a single XML document with the data:
%top = %doc:addElement('body')
fr
   %rec = %top:addElement('record')
   %rec:addElement('description', descr)
   %unic = (cyr):characterToUnicode(%map)
   %rec:addElement('cyrillic', %unic)
end for
%doc:webSend  ;* Send the XML document to the Janus Web client
end

Notes

  • The key concept is that the EBCDIC values representing Cyrillic characters use codepage 1154, so each EBCDIC byte is converted to a Unicode character using the map created from codepage 1154.
  • In setting up the test data, we have:

    store record cyr = ('7778AF':x)

    This is a new feature in version 7.5 of Model 204, allowing a parenthesized expression after = in the Store Record and Add statements.

  • The arguments to AddElement (and to many of the XmlDoc API methods) are Unicode, so a Unicode expression such as %unic is used directly, without conversion, by AddElement. Non-Unicode arguments, such as 'body', 'record', 'description', and 'cyrillic', are converted from EBCDIC to Unicode using the standard Unicode codepage in effect. This is usually 1047 (the default), but it can be 0037 or another codepage, which can be set in CCAIN via the UNICODE command.
  • Parens are needed to apply a SOUL method to a field reference, such as in (cyr):characterToUnicode(%map).
    • Another approach would be to use a temporary %var, e.g.:

      %junk = cyr
      %unic = %junk:characterToUnicode(%map)

    • Either way, you do not need the %unic temporary %var, e.g.:

      %rec:addElement('cyrillic', (cyr):characterToUnicode(%map))

  • You might notice that semicolons can be freely used to put multiple statements on one line in SOUL, e.g., an entire if ..; ifEnd block. The same technique is used for "end of line" comments, as shown on the line invoking WebSend.
  • There are other approaches for transferring the data besides sending an XML document using Janus Web. Once the data is in a Unicode %var, you can create a UTF-16 bytestream with the UnicodeToUtf16 method, for instance.
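
The UTF-16 alternative mentioned in the last note can be sketched as follows. This is a minimal, untested fragment: it assumes codepage 1154 is available at your site, it reuses only methods already shown on this page (CharacterToUnicode, UnicodeToUtf16, StringToHex), and the variable names %map, %unic, and %utf16 are illustrative only:

    begin
    %map is object characterToUnicodeMap
    %unic is unicode
    %utf16 is longstring
    * Build the map from codepage 1154 (assumed to be available at your site):
    %map = newFromEbcdicCodepage('1154')
    * Convert the three EBCDIC test bytes to Unicode, then to a UTF-16 byte stream:
    %unic = ('7778AF':x):characterToUnicode(%map)
    %utf16 = %unic:unicodeToUtf16
    * Display the resulting bytes in hex for inspection:
    printText {~} is {%utf16:stringToHex}
    end

Here the byte stream is only printed in hex; in practice it would be passed to whatever process expects UTF-16 input.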

See also