Xml (XmlDoc function): Difference between revisions

From m204wiki
m (1 revision)
m (1 revision)
Line 1: Line 1:
{{Template:XmlDoc:Xml subtitle}}
{{Template:XmlDoc:Xml subtitle}}


This function converts an XmlDoc to its textually represented XML document
This function converts an <var>Xml</var>Doc to its textually represented XML document
(this process is called '''serialization''',
(this process is called '''serialization''',
because the text representation of a document is called the '''serial'''
because the text representation of a document is called the '''serial'''
Line 10: Line 10:
<table class="syntaxTable">
<table class="syntaxTable">
<tr><th>utf8Str</th>
<tr><th>utf8Str</th>
<td>The string serialization of the XmlDoc, encoded in UTF-8. </td></tr>
<td>The string serialization of the <var>Xml</var>Doc, encoded in UTF-8. </td></tr>
<tr><th>doc</th>
<tr><th>doc</th>
<td>XmlDoc expression, whose content is serialized. </td></tr>
<td><var>Xml</var>Doc expression, whose content is serialized. </td></tr>
<tr><th>options</th>
<tr><th>options</th>
<td>Any combination of the following options (but single occurrences only): <br> <ul> <li><b>AllowXmlDecl</b> or <b>NoXmlDecl</b>
<td>Any combination of the following options (but single occurrences only): <br> <ul> <li><b>Allow<var>Xml</var>Decl</b> or <b>No<var>Xml</var>Decl</b>
<p class="code">AllowXmlDecl, the default, produces the XML declaration (that is, <?xml&thinsp.version=...?>) if the declaration was set (see [[Version (XmlDoc property)|Version]]). NoXmlDecl omits the XML declaration. <li><b>Indent</b> <i><b>n</b></i>
<p class="code">Allow<var>Xml</var>Decl, the default, produces the XML declaration (that is, <?xml&thinsp.version=...?>) if the declaration was set (see [[Version (XmlDoc property)|Version]]). No<var>Xml</var>Decl omits the XML declaration. <li><b>Indent</b> <i><b>n</b></i>
Inserts space characters (and line-ends, as described for the next option) into the serialized string such that if the string is broken at the line-ends and displayed as a tree, the display of each lower level in the subtree is indented ''n'' spaces from the previous level's starting point.
Inserts space characters (and line-ends, as described for the next option) into the serialized string such that if the string is broken at the line-ends and displayed as a tree, the display of each lower level in the subtree is indented ''n'' spaces from the previous level's starting point.
If serialized output with an Indent value of 2 is displayed as a tree, the spacing is as in the following: <pre>    <top>      <leaf1 xx="yy">value</leaf1>      <sub>        <leaf2>value</leaf2>      </sub>    </top> </pre> <i>n</i> is a non-negative integer, and it maximum value (as of ''Sirius Mods'' version 7.0) is 254.
If serialized output with an Indent value of 2 is displayed as a tree, the spacing is as in the following: <pre>    <top>      <leaf1 xx="yy">value</leaf1>      <sub>        <leaf2>value</leaf2>      </sub>    </top> </pre> <i>n</i> is a non-negative integer, and it maximum value (as of ''Sirius Mods'' version 7.0) is 254.
Line 32: Line 32:
An Element node that has no children and no Attributes will not be serialized, unless it is the top level Element in the subtree being serialized. The serialization of a child-less and Attribute-less Element is omitted, even if the Element's serialization would contain Namespace declarations in its start tag.
An Element node that has no children and no Attributes will not be serialized, unless it is the top level Element in the subtree being serialized. The serialization of a child-less and Attribute-less Element is omitted, even if the Element's serialization would contain Namespace declarations in its start tag.
If an Element node has no Attributes, but has (only) Element children (one or more), and all of its children are Attribute-less and child-less, then that parent Element is serialized, even though its content in the serialization is empty. That parent is serialized with a start tag and an end tag (and an inserted line separator, if called for by the serializing method's parameter options).
If an Element node has no Attributes, but has (only) Element children (one or more), and all of its children are Attribute-less and child-less, then that parent Element is serialized, even though its content in the serialization is empty. That parent is serialized with a start tag and an end tag (and an inserted line separator, if called for by the serializing method's parameter options).
For example, if the Print method display of a particular XmlDoc object is the following when OmitNullElement is ''not'' specified: <pre>    <top>        <middle>          <empty/>          <p:empty2 xmlns:p="uri:stuff"/>        </middle>    </top> </pre>
For example, if the Print method display of a particular <var>Xml</var>Doc object is the following when OmitNullElement is ''not'' specified: <pre>    <top>        <middle>          <empty/>          <p:empty2 xmlns:p="uri:stuff"/>        </middle>    </top> </pre>
Here is the display of the XmlDoc with the OmitNullElement option specified: <pre>    <top>        <middle>        </middle>    </top> </pre>
Here is the display of the <var>Xml</var>Doc with the OmitNullElement option specified: <pre>    <top>        <middle>        </middle>    </top> </pre>
The OmitNullElement option is available as of ''Sirius Mods'' version 7.3. <li><b>SortCanonical</b>
The OmitNullElement option is available as of ''Sirius Mods'' version 7.3. <li><b>SortCanonical</b>
This indicates that namespace declarations (based on the prefix being declared) and attributes (based on the namespace URI followed by the local name) are serialized in sorted order. This can be useful, for instance, when using the Xml method to serialize a portion of an XML document for a signature.
This indicates that namespace declarations (based on the prefix being declared) and attributes (based on the namespace URI followed by the local name) are serialized in sorted order. This can be useful, for instance, when using the <var>Xml</var> method to serialize a portion of an XML document for a signature.
The sort order for namespace declarations and attributes is from lowest to highest, and it uses the Unicode code ordering (for example, numbers are lower than letters).
The sort order for namespace declarations and attributes is from lowest to highest, and it uses the <var>Unicode</var> code ordering (for example, numbers are lower than letters).
This option was added in ''Sirius Mods'' version 6.9 as a step towards support for canonicalization. As of version 7.0, comprehensive support for canonicalized serialization is provided by the Serial method ExclCanonical option ([[??]] refid=exclcan.). </ul></td></tr>
This option was added in ''Sirius Mods'' version 6.9 as a step towards support for canonicalization. As of version 7.0, comprehensive support for canonicalized serialization is provided by the Serial method ExclCanonical option ([[??]] refid=exclcan.). </ul></td></tr>
</p>
</p>
Line 44: Line 44:
<ul>
<ul>
<li>Options may be specified in any case, for example, you can use
<li>Options may be specified in any case, for example, you can use
either <tt>NoXmlDecl</tt> or <tt>noxmldecl</tt>, interchangeably.
either <tt>No<var>Xml</var>Decl</tt> or <tt>noxmldecl</tt>, interchangeably.
<li>The <i>XmlDoc</i> method object must be well-formed (that is,
<li>The <i><var>Xml</var>Doc</i> method object must be well-formed (that is,
it must contain an Element node).
it must contain an Element node).
For more information about well-formed documents, see [[??]] refid=welform..
For more information about well-formed documents, see [[??]] refid=welform..
<li>Since the result of the Xml function has UTF-8 encoding, you
<li>Since the result of the <var>Xml</var> function has UTF-8 encoding, you
cannot treat it as an EBCDIC string: for example, printing the string
cannot treat it as an EBCDIC string: for example, printing the string
will not produce displayable characters.
will not produce displayable characters.
The "See Also" section below mentions some methods for obtaining
The "See Also" section below mentions some methods for obtaining
an EBCDIC serialization of an XmlDoc.
an EBCDIC serialization of an <var>Xml</var>Doc.
<li>You can use the Print method
<li>You can use the Print method
(??[[Print (XmlDoc/XmlNode subroutine)|Print]])
(??[[Print (XmlDoc/XmlNode subroutine)|Print]])
Line 58: Line 58:
or to '''capture''' a displayable version of a document, but Print is
or to '''capture''' a displayable version of a document, but Print is
used to insert line breaks and optional indentation, which may not be an
used to insert line breaks and optional indentation, which may not be an
accurate serialization of an XmlDoc.
accurate serialization of an <var>Xml</var>Doc.
<li>Using one of the line-end character options (CR, LF, CRLF) produces output
<li>Using one of the line-end character options (CR, LF, CRLF) produces output
that is analogous to the BothCompact option of the Print method.
that is analogous to the BothCompact option of the Print method.
Line 73: Line 73:
does it cause resumption of readability line-ends or indents if they were
does it cause resumption of readability line-ends or indents if they were
suspended by a containing <tt>xml:space="preserve"</tt>.
suspended by a containing <tt>xml:space="preserve"</tt>.
<li>As of version 6.7, the Xml method uses the hexadecimal
<li>As of version 6.7, the <var>Xml</var> method uses the hexadecimal
character references specified in the XML Canonicalization specification
character references specified in the XML Canonicalization specification
(:hp0 color=SirLink.http://www.w3.org/TR/xml-c14n:ehp0.)  to
(:hp0 color=SirLink.http://www.w3.org/TR/xml-c14n:ehp0.)  to
Line 118: Line 118:
<ul>
<ul>
<li>The [[Janus Sockets]]R documents the HttpRequest object, whose
<li>The [[Janus Sockets]]R documents the HttpRequest object, whose
AddXml method has nearly the same options as the Xml function;
Add<var>Xml</var> method has nearly the same options as the <var>Xml</var> function;
the following fragment serializes an XmlDoc and sends it
the following fragment serializes an <var>Xml</var>Doc and sends it
as a request to a Web server.
as a request to a Web server.


Note that if you use the Xml function and $Sock_Send directly,
Note that if you use the <var>Xml</var> function and $Sock_Send directly,
instead of using an HTTP helper object, always use
instead of using an HTTP helper object, always use
the BINARY option of $Sock_Send, because the
the BINARY option of $Sock_Send, because the
result of the
result of the
Xml function is UTF-8, rather than EBCDIC.
<var>Xml</var> function is UTF-8, rather than EBCDIC.
<p class="code">%httpreq object HttpRequest
<p class="code">%httpreq object HttpRequest
%httpresp object HttpResponse
%httpresp object HttpResponse
%doc object XmlDoc
%doc object <var>Xml</var>Doc
%httpreq = New
%httpreq = New
%doc = New
%doc = New
%doc:LoadXml('<inquire><stock>IBM</stock>' With -
%doc:Load<var>Xml</var>('<inquire><stock>IBM</stock>' With -
   <dateRange/></inquire>', 'NoEmptyElt')
   <dateRange/></inquire>', 'NoEmptyElt')


%httpreq:URL = 'foo.com/bar'
%httpreq:URL = 'foo.com/bar'
%httpreq:AddXml(%doc)
%httpreq:Add<var>Xml</var>(%doc)
%httpresp = %httpreq:Post('HTTP_CLIENT')
%httpresp = %httpreq:Post('HTTP_CLIENT')
</p>
</p>
<li>The following fragment is a simple example for serializing an XmlDoc, which
<li>The following fragment is a simple example for serializing an <var>Xml</var>Doc, which
could then, for example, be sent on a transport such as MQ:
could then, for example, be sent on a transport such as MQ:
<p class="code">%s Longstring
<p class="code">%s <var>Longstring</var>
%s = %doc:Xml
%s = %doc:<var>Xml</var>
</p>
</p>
</ul>
</ul>
Line 148: Line 148:
===Request-Cancellation Errors===
===Request-Cancellation Errors===
<ul>
<ul>
<li>XmlDoc does not contain an Element.
<li><var>Xml</var>Doc does not contain an Element.
<li><i>Options</i> is invalid.
<li><i>Options</i> is invalid.
<li>Insufficient free space exists in CCATEMP.
<li>Insufficient free space exists in CCATEMP.
Line 162: Line 162:
an XML document.
an XML document.
<li>Use ??[[WebSend (XmlDoc subroutine)|WebSend]]
<li>Use ??[[WebSend (XmlDoc subroutine)|WebSend]]
to serialize an XmlDoc and send it as an HTTP response using [[Janus Web Server]].
to serialize an <var>Xml</var>Doc and send it as an HTTP response using [[Janus Web Server]].
<li>The string deserialization functions are ??[[LoadXml (XmlDoc/XmlNode function)|LoadXml]]
<li>The string deserialization functions are ??[[LoadXml (XmlDoc/XmlNode function)|LoadXml]]
and ??[[WebReceive (XmlDoc function)|WebReceive]].
and ??[[WebReceive (XmlDoc function)|WebReceive]].
</ul>
</ul>

Revision as of 17:46, 25 January 2011

Serialize XmlDoc as UTF-8 string (XmlDoc class)


This function converts an XmlDoc to its textually represented XML document (this process is called serialization, because the text representation of a document is called the serial form).

Syntax

%utf8Str = doc:Xml[([options])]

Syntax terms

utf8Str The string serialization of the XmlDoc, encoded in UTF-8.
doc XmlDoc expression, whose content is serialized.
options Any combination of the following options (but single occurrences only):
  • AllowXmlDecl or NoXmlDecl

    AllowXmlDecl, the default, produces the XML declaration (that is, <?xml version=...?>) if the declaration was set (see Version). NoXmlDecl omits the XML declaration.

  • Indent n Inserts space characters (and line-ends, as described for the next option) into the serialized string such that if the string is broken at the line-ends and displayed as a tree, each lower level in the subtree is indented n spaces from the previous level's starting point. If serialized output with an Indent value of 2 is displayed as a tree, the spacing is as follows:
         <top>
           <leaf1 xx="yy">value</leaf1>
           <sub>
             <leaf2>value</leaf2>
           </sub>
         </top>
    n is a non-negative integer, and its maximum value (as of Sirius Mods version 7.0) is 254. One of the line-end options, below, must also be specified.
  • One of the line-end options below, to provide line breaks in the output after any of the following is serialized:
    • An element start-tag, if it has any non-text node children
    • An empty element tag, or an empty element end-tag
    • A processing instruction (PI)
    • A comment
    • A text node, if it has any siblings

    CR Insert a carriage-return character as the line-end sequence in the above cases.
    LF Insert a linefeed character as the line-end sequence in the above cases.
    CRLF Insert a carriage-return character followed by a linefeed character as the line-end sequence in the above cases.


  • NoEmptyElt This indicates that an empty element is serialized with its start tag followed by an end tag. For example:
         <middleName></middleName> 
    If this option is not specified, the default is to serialize an empty element with an empty element tag:
         <middleName/> 
  • OmitNullElement

    An Element node that has no children and no Attributes will not be serialized, unless it is the top level Element in the subtree being serialized. The serialization of a child-less and Attribute-less Element is omitted, even if the Element's serialization would contain Namespace declarations in its start tag. If an Element node has no Attributes, but has (only) Element children (one or more), and all of its children are Attribute-less and child-less, then that parent Element is serialized, even though its content in the serialization is empty. That parent is serialized with a start tag and an end tag (and an inserted line separator, if called for by the serializing method's parameter options).

    For example, suppose the Print method display of a particular XmlDoc object is the following when OmitNullElement is not specified:
         <top>
            <middle>
               <empty/>
               <p:empty2 xmlns:p="uri:stuff"/>
            </middle>
         </top>
    Here is the display of the same XmlDoc with the OmitNullElement option specified:
         <top>
            <middle>
            </middle>
         </top>
    The OmitNullElement option is available as of Sirius Mods version 7.3.
  • SortCanonical This indicates that namespace declarations (based on the prefix being declared) and attributes (based on the namespace URI followed by the local name) are serialized in sorted order. This can be useful, for instance, when using the Xml method to serialize a portion of an XML document for a signature. The sort order for namespace declarations and attributes is from lowest to highest, and it uses the Unicode code ordering (for example, numbers are lower than letters). This option was added in Sirius Mods version 6.9 as a step towards support for canonicalization. As of version 7.0, comprehensive support for canonicalized serialization is provided by the ExclCanonical option of the Serial method.
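
The OmitNullElement rule described in the options above can be sketched outside of User Language. The following Python fragment is only an illustration, not the Sirius implementation: it uses the standard xml.etree.ElementTree module, and the helper names is_null and omit_null_elements are hypothetical. It assumes that "no children" means no element children and no text, and it decides which elements to drop by looking at the original tree before removing anything, so a parent whose children are all omitted is itself still serialized.

```python
import xml.etree.ElementTree as ET


def is_null(el):
    """True if el has no attributes, no element children, and no text.

    ElementTree does not store namespace declarations as attributes, which
    matches the rule that such declarations do not keep an element alive.
    """
    return not el.attrib and len(el) == 0 and not el.text


def omit_null_elements(root):
    """Drop null descendants of root; root itself is always kept."""
    # Collect the decisions on the unmodified tree first, then remove.
    doomed = [
        (parent, child)
        for parent in root.iter()
        for child in parent
        if is_null(child)
    ]
    for parent, child in doomed:
        parent.remove(child)
    return root


doc = ET.fromstring(
    '<top><middle><empty/><p:empty2 xmlns:p="uri:stuff"/></middle></top>'
)
# short_empty_elements=False writes a start tag and an end tag for <middle>.
result = ET.tostring(
    omit_null_elements(doc), encoding="unicode", short_empty_elements=False
)
print(result)
```

With the sample document from the OmitNullElement description, both empty children are dropped and middle is kept with a start tag and an end tag.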

Usage notes

  • Options may be specified in any case: for example, you can use either NoXmlDecl or noxmldecl interchangeably.
  • The XmlDoc method object must be well-formed (that is, it must contain an Element node).
  • Since the result of the Xml function has UTF-8 encoding, you cannot treat it as an EBCDIC string: for example, printing the string will not produce displayable characters. The "See Also" section below mentions some methods for obtaining an EBCDIC serialization of an XmlDoc.
  • You can use the Print method to display a document on the terminal, or to capture a displayable version of a document, but Print inserts line breaks and optional indentation, so its output may not be an accurate serialization of an XmlDoc.
  • Using one of the line-end character options (CR, LF, CRLF) produces output that is analogous to the BothCompact option of the Print method.
  • If one of the line-end (CR, LF, CRLF) options or if Indent is specified, and an element to be serialized has the xml:space="preserve" attribute, then within the serialization of that element and its descendants, no line-end (nor indentation) characters are inserted to provide readability. In addition, the xml:space="default" attribute has no effect under these options: specified by itself, it does not influence serialization, nor does it cause resumption of readability line-ends or indents if they were suspended by a containing xml:space="preserve".
  • As of version 6.7, the Xml method uses the hexadecimal character references specified in the XML Canonicalization specification (http://www.w3.org/TR/xml-c14n) to represent the following characters:
    • For Attribute nodes: tab, carriage return, and linefeed
    • For Text nodes: carriage return

    Since the character references are not subject to the standard XML whitespace normalization, a serialized document (or subtree) that is then deserialized will retain this whitespace.

    These character references are used:

    Character   Reference
    tab         &#x9;
    CR          &#xD;
    LF          &#xA;

    The EBCDIC and corresponding ASCII encodings of these characters are:

    Character   EBCDIC   ASCII
    tab         X'05'    X'09'
    CR          X'0D'    X'0D'
    LF          X'25'    X'0A'
  • As of Sirius Mods version 7.6, Attribute values are always serialized within double-quotation-mark (") delimiters, and a double-quotation mark character in an attribute value is serialized as &quot;. Prior to version 7.6, this convention was not strictly observed.
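
The character-reference conventions in the usage notes above can be illustrated with a short Python sketch. It is not the Xml method's code, and the function names are hypothetical. The handling of tab, carriage return, linefeed, and the double-quotation mark follows the notes above; the escaping of &, <, and > is taken from the XML Canonicalization specification and is an assumption about what surrounds those rules, since this page does not describe it.

```python
def escape_attribute_value(value):
    """Escape an attribute value that will be delimited by double quotes."""
    # Markup characters first, so the '&' of later references is not re-escaped.
    out = value.replace("&", "&amp;").replace("<", "&lt;").replace('"', "&quot;")
    # Tab, linefeed, and carriage return become hexadecimal character references.
    return out.replace("\t", "&#x9;").replace("\n", "&#xA;").replace("\r", "&#xD;")


def escape_text(value):
    """Escape the content of a Text node."""
    out = value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    # Only carriage return gets a character reference in Text nodes.
    return out.replace("\r", "&#xD;")


attr = escape_attribute_value('a\tb"c\r\n')
text = escape_text("x\r\ny & z")
print(attr)
print(repr(text))
```

Because the whitespace is written as character references, a parser that reads the result back does not normalize it away, which is the round-trip property the note describes.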

Examples

  • The Janus Sockets documentation describes the HttpRequest object, whose AddXml method has nearly the same options as the Xml function. The following fragment serializes an XmlDoc and sends it as a request to a Web server. Note that if you use the Xml function and $Sock_Send directly, instead of using an HTTP helper object, you should always use the BINARY option of $Sock_Send, because the result of the Xml function is UTF-8 rather than EBCDIC.

    %httpreq object HttpRequest
    %httpresp object HttpResponse
    %doc object XmlDoc
    %httpreq = New
    %doc = New
    %doc:LoadXml('<inquire><stock>IBM</stock>' With -
       '<dateRange/></inquire>', 'NoEmptyElt')

    %httpreq:URL = 'foo.com/bar'
    %httpreq:AddXml(%doc)
    %httpresp = %httpreq:Post('HTTP_CLIENT')

  • The following fragment is a simple example for serializing an XmlDoc, which could then, for example, be sent on a transport such as MQ:

    %s Longstring
    %s = %doc:Xml
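
The advice in the first example to send the result of the Xml function as binary follows from the difference between UTF-8 and EBCDIC bytes. The Python sketch below only illustrates that difference; it involves neither $Sock_Send nor the Xml method. It assumes code page 037 (Python's cp037 codec) as a representative EBCDIC encoding, which this page does not specify.

```python
serial = "<a/>"

# The same characters have different byte values in the two encodings.
utf8_bytes = serial.encode("utf-8")
ebcdic_bytes = serial.encode("cp037")  # assumption: cp037 stands in for EBCDIC

print(utf8_bytes.hex())    # UTF-8 bytes of the serialization
print(ebcdic_bytes.hex())  # EBCDIC bytes of the same text

# Treating UTF-8 bytes as EBCDIC (as printing or a translating send would)
# does not give back the original text.
misread = utf8_bytes.decode("cp037")

# Control characters from the encoding table in the usage notes.
controls_ebcdic = "\t\r\n".encode("cp037")
controls_ascii = "\t\r\n".encode("utf-8")
```

Any step that translates or interprets the UTF-8 string as EBCDIC would change its bytes, which is why the data must be passed through untranslated.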

Request-Cancellation Errors

  • XmlDoc does not contain an Element.
  • Options is invalid.
  • Insufficient free space exists in CCATEMP.


See also

  • Use Print to display an XML document for debugging.
  • Use Serial with the EBCDIC option to obtain an EBCDIC serialization of an XML document.
  • Use WebSend to serialize an XmlDoc and send it as an HTTP response using Janus Web Server.
  • The string deserialization functions are LoadXml and WebReceive.