XPath: Difference between revisions

 
(37 intermediate revisions by 2 users not shown)
Line 2: Line 2:
This article has information to help you use
This article has information to help you use
XPath arguments to various [[XmlDoc API]] methods.
XPath arguments to various [[XmlDoc API]] methods.
Most of the information is taken from the [http://www.w3.org/TR/xpath XPath 1 standard], which is the
Most of the information is taken from the [http://www.w3.org/TR/xpath XPath 1 standard], which is the authoritative reference.
authoritative reference.
   
   
References to Version 2 of the XPath standard in this manual
References to Version 2 of the XPath standard in this manual
are to the [http://www.w3.org/TR/xpath20 XPath 2 standard], which became a W3C Recommendation
are to the [http://www.w3.org/TR/xpath20 XPath 2 standard], which became a W3C Recommendation on January 23, 2007.
on January 23, 2007.
   
   
The five sections in this chapter explain, respectively, the following:
The five sections in this topic explain, respectively, the following:
<ol>
<ol>
<li>How to understand the
<li>How to understand the components of an XPath expression, which is composed of '''steps'''. </li>
components of an XPath expression, which is composed of '''steps'''.
<li>The syntax of XPath. </li>
<li>The syntax of XPath.
<li>Some subtle aspects of XPath. </li>
<li>Some subtle aspects of XPath.
<li>Specific XPath axis combinations to avoid. </li>
<li>Specific XPath axis combinations to avoid.
<li>The subset of XPath supported by the current version of the XmlDoc API. </li>
<li>The subset of XPath supported by the current version of the XMlDoc API.
</ol>
</ol>
===XPath operation===
==XPath operation==
The purpose of XPath is to select a subset of nodes from a document.
The purpose of XPath is to select a subset of nodes from a document.
This selection is done using an <b>expression</b>, as described by
This selection is done using an <b>expression</b>, as described by
Line 36: Line 34:
<p class="code">pitm[2]/partnum
<p class="code">pitm[2]/partnum
</p>
</p>
This expression contains two Steps (the slash symbol ( / ) is used to
This expression contains two Steps (the slash symbol ( '''/''' ) is used to
separate the Steps in a Location Path).
separate the Steps in a Location Path).
   
   
Line 53: Line 51:
The NodeTest is used to restrict the set of nodes.
The NodeTest is used to restrict the set of nodes.
   
   
Square brackets ('''[''' ... ''']''')
Square brackets ('''[''' ''']''')
in a Step surround another form of restriction, which is called
in a Step surround another form of restriction, which is called
a <b>Predicate</b>, given by the production ([8]) with the same name.
a <b>Predicate</b>, given by the production ([8]) with the same name.
Line 69: Line 67:
the following Predicate, if any.
the following Predicate, if any.
<li>The final filtered sets are combined (using set union), and the result
<li>The final filtered sets are combined (using set union), and the result
is the set of nodes which becomes input to the following Step.
is the set of nodes that becomes input to the following Step.
<li>The result of the final Step is the result of the Location Path.
<li>The result of the final Step is the result of the Location Path.
</ol>
</ol>
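The evaluation procedure above can be sketched in Python. This is a minimal illustration, not the XmlDoc API's implementation: the function <code>apply_step</code>, the sample document, and the <code>order</code> element are assumptions made for the example, and only the <code>child</code> axis is modeled.

```python
import xml.etree.ElementTree as ET

def apply_step(context_nodes, name, predicates=()):
    """Apply one child-axis Step to each context node and union the results."""
    result = []
    for context in context_nodes:
        # Axis: the child axis generates the children of the context node.
        candidates = list(context)
        # NodeTest: keep only elements with the requested name ('*' keeps all).
        candidates = [n for n in candidates if name == '*' or n.tag == name]
        # Predicates: each one filters the set produced by the previous one.
        for predicate in predicates:
            size = len(candidates)
            candidates = [n for pos, n in enumerate(candidates, start=1)
                          if predicate(n, pos, size)]
        # Union: add nodes not already selected via another context node.
        for n in candidates:
            if not any(n is seen for seen in result):
                result.append(n)
    return result

# Hypothetical document; only 'pitm' and 'partnum' come from the text.
order = ET.fromstring(
    "<order>"
    "<pitm><partnum>A-100</partnum></pitm>"
    "<pitm><partnum>B-200</partnum></pitm>"
    "</order>"
)

# pitm[2]/partnum, evaluated with the 'order' element as the context node.
second_items = apply_step([order], 'pitm', [lambda n, pos, size: pos == 2])
partnums = apply_step(second_items, 'partnum')
print([p.text for p in partnums])
```

Each predicate receives the node, its position, and the size of the set being filtered, which is the information that position(&thinsp;) and last(&thinsp;) expose in a real XPath predicate.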
 
====Axes====
===Axes===
The various forms of the <tt>AxisName</tt> ([6]) production generate
The various forms of the <tt>AxisName</tt> ([6]) production generate
nodes based on a context node using the simple tree relationships
nodes based on a context node using the simple tree relationships
Line 86: Line 84:
its parent, and so on, to and including the Root node of the
its parent, and so on, to and including the Root node of the
XmlDoc.
XmlDoc.
 
<dt>ancestor-or-self
<dt>ancestor-or-self
<dd>The same contents as <tt>ancestor</tt>, except that it also
<dd>The same contents as <tt>ancestor</tt>, except that it also
includes the context node.
includes the context node.
 
<dt>attribute
<dt>attribute
<dd>Contains the attributes of the context node, which must be an element.
<dd>Contains the attributes of the context node, which must be an element.
 
<dt>child
<dt>child
<dd>Contains the children of the context node.
<dd>Contains the children of the context node.
 
<dt>descendant
<dt>descendant
<dd>Contains the context node children, their
<dd>Contains the context node children, their
Line 103: Line 101:
this axis does not include any attributes (so, this is not equivalent to
this axis does not include any attributes (so, this is not equivalent to
a "sub-tree").
a "sub-tree").
 
<dt>descendant-or-self
<dt>descendant-or-self
<dd>The same contents as <tt>descendant</tt>, except that it also
<dd>The same contents as <tt>descendant</tt>, except that it also
includes the context node.
includes the context node.
Specified by <tt>//</tt>, an
Specified by <tt>'''//'''</tt>, an
abbreviation for the step consisting of <tt>descendant-or-self::node()</tt>,
abbreviation for the step consisting of <tt>descendant-or-self::node()</tt>,
this axis can be used, for example, to locate an element by its name,
this axis can be used, for example, to locate an element by its name,
Line 113: Line 111:
<p class="code">//foo
<p class="code">//foo
</p>
</p>
 
<dt>following
<dt>following
<dd>Contains all the nodes that are after the context node in document order,
<dd>Contains all the nodes that are after the context node in document order,
excluding any descendants and excluding attribute nodes.
excluding any descendants and excluding attribute nodes.
 
<dt>following-sibling
<dt>following-sibling
<dd>Contains all the siblings of the context node that are
<dd>Contains all the siblings of the context node that are
positioned after that node.
positioned after that node.
Contains no siblings if the context node is an attribute.
Contains no siblings if the context node is an attribute.
 
<dt>parent
<dt>parent
<dd>Contains the parent of the context node.
<dd>Contains the parent of the context node.
Each node has only one parent, except the root node, which has no parent.
Each node has only one parent, except the root node, which has no parent.
 
<dt>preceding-sibling
<dt>preceding-sibling
<dd>Contains all the siblings of the context node that are
<dd>Contains all the siblings of the context node that are
positioned before that node.
positioned before that node.
Contains no siblings if the context node is an attribute.
Contains no siblings if the context node is an attribute.
 
<dt>self
<dt>self
<dd>Contains the context node itself.
<dd>Contains the context node itself.
Line 144: Line 142:
declarations by using certain XmlDoc API methods, for example, the [[URI (XmlNode function)|URI]]
declarations by using certain XmlDoc API methods, for example, the [[URI (XmlNode function)|URI]]
function on an XmlNode.
function on an XmlNode.
 
<dt>preceding
<dt>preceding
<dd>Support for this axis may be added in a later version.
<dd>Support for this axis may be added in a later version.
</dl>
</dl>
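Several of the axes above can be approximated with a short Python sketch. It uses the standard library's ElementTree as a stand-in for an XmlDoc, so it covers element nodes only; the sample document and the helper function names are assumptions made for the example. ElementTree has no separate Root node above the top element, so <code>ancestor</code> here stops at the top element.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<top><a><a1/><a2/></a><b/><c><c1/></c></top>")
# ElementTree has no parent pointers, so build a child-to-parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}

def ancestor(node):
    """Parent, grandparent, and so on (element ancestors only)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found

def descendant(node):
    """Children, their children, and so on, in document order."""
    return [n for n in node.iter() if n is not node]

def following_sibling(node):
    """Siblings positioned after the node."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]

def preceding_sibling(node):
    """Siblings positioned before the node, listed here in document order."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)]

def names(nodes):
    return [n.tag for n in nodes]

a2 = doc.find("a/a2")
b = doc.find("b")
print(names(ancestor(a2)))
print(names(descendant(doc)))
print(names(following_sibling(b)))
print(names(preceding_sibling(b)))
```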
 
====NodeTests====
===NodeTests===
The various forms of the <tt>NodeTest</tt> ([7]) production filter
The various forms of the <tt>NodeTest</tt> ([7]) production filter
nodes as follows:
nodes as follows:
<dl>
<dl>
<dt>NodeType '(' ')'
<dt>NodeType '('<tt> </tt>')'
<dd>This selects any node that has the respective node type, for example
<dd>This selects any node that has the respective node type; for example,
<code>comment()</code> selects all Comment nodes in a node set.
<code>comment()</code> selects all Comment nodes in a node set.
 
<dt>'processing-instruction' '(' Lit ')'
<dt>'processing-instruction'<tt> </tt>'(' Lit ')'
<dd>This selects any Processing Instruction node if the target name is
<dd>This selects any Processing Instruction node, if the target name is
equal to the value of <tt>Lit</tt>.
equal to the value of <tt>Lit</tt>.
 
<dt>'*'  |  NCName ':' '*'  |  QName
<dt>'<tt>*</tt>'  |  NCName '<tt>:</tt>' '<tt>*</tt>'  |  QName
<dd>These forms test the '''name''' of a node, after restricting the type of node
<dd>These forms test the '''name''' of a node, after restricting the type of node
to the "principal node type" of the Axis, as follows:
to the "principal node type" of the Axis, as follows:
Line 171: Line 169:
The name tests then filter the resulting nodes as follows:
The name tests then filter the resulting nodes as follows:
<dl>
<dl>
<dt>'*'
<dt>'<tt>*</tt>'
<dd>This selects a node of the selected type regardless of the node's name.
<dd>This selects a node of the selected type regardless of the node's name.
The "selected type" is the principal node
The "selected type" is the principal node
type of the node subset selected by the preceding Axis.
type of the node subset selected by the preceding Axis.
The default Axis type is <tt>child</tt>,
The default Axis type is <tt>child</tt>, so the default node type is <tt>Element</tt>.
so the default node type is <tt>Element</tt>.
 
<dt>NCName '<tt>:</tt>' '<tt>*</tt>'
<dt>NCName ':' '*'
<dd>This selects a node of the selected type if the node has an associated
<dd>This selects a node of the selected type if the node has an associated
namespace equal to the URI associated with <tt>NCName</tt>.
namespace equal to the URI associated with <tt>NCName</tt>.
 
<dt>QName
<dt>QName
<dd>This selects a node of the selected type if the node has the same name
<dd>This selects a node of the selected type if the node has the same name
Line 190: Line 187:
discussed in
discussed in
[[XML processing in Janus SOAP#Names and namespaces|Names and namespaces]].
[[XML processing in Janus SOAP#Names and namespaces|Names and namespaces]].
 
====Predicates====
===Predicates===
Each nodeSet that is the result of NodeTest filtering is input to the
Each nodeSet that is the result of NodeTest filtering is input to the
series of Predicates in the Step.
series of Predicates in the Step.
Line 206: Line 203:
<ul>
<ul>
<li>Multiple Predicates are allowed.
<li>Multiple Predicates are allowed.
 
<li>Location paths within Predicates can themselves contain Predicates.
<li>Location paths within Predicates can themselves contain Predicates.
</ul>
</ul>
Line 222: Line 219:
including multi-step and absolute expressions.
including multi-step and absolute expressions.
   
   
You can also use a location path or the number(<!--thinsp-->) function,
You can also use a location path or the number(&thinsp;) function,
followed by a comparison
followed by a comparison
operator and literal, as a predicate of an XPath expression,
operator and literal, as a predicate of an XPath expression,
Line 229: Line 226:
For a description of the XPath predicates currently supported in the XmlDoc API,
For a description of the XPath predicates currently supported in the XmlDoc API,
see [[#Predicates supported in the current version|Predicates supported in the current version]].
see [[#Predicates supported in the current version|Predicates supported in the current version]].
 
====Functions====
===Functions===
There are many functions defined for XPath; this section merely gives
There are many functions defined for XPath; this section merely gives
a sample of some of them.
a sample of some of them.
Furthermore, as of the current version, many of these are not supported.
Furthermore, as of the current version, many of these are not supported.
See [[#XPath functions supported in the current version |XPath functions supported in the current version ]] for a list of the XPath functions currently supported.
See [[#XPath functions supported in the current version |XPath functions supported in the current version ]] for a list of the XPath functions currently supported.  All supported functions in the following list are shown as a link to the section which may contain additional notes for using the function in the XmlDoc API.
   
   
In the XmlDoc API, XPath functions are used only in predicates.
In the XmlDoc API, XPath functions are used only in predicates.
Line 240: Line 237:
Here are some XPath functions that return a '''numeric result''':
Here are some XPath functions that return a '''numeric result''':
<dl>
<dl>
<dt>last(<!--thinsp-->)
<dt>last(&thinsp;)
<dd>Returns the size (number of nodes) of the set that the predicate
<dd>Returns the size (number of nodes) of the set that the predicate
is filtering
is filtering.


<dt>position(<!--thinsp-->)
Not supported in the XmlDoc API.
<dt>[[#pos|position]](&thinsp;)
<dd>Returns the position of the node in the set that the predicate
<dd>Returns the position of the node in the set that the predicate
is filtering
is filtering
Line 262: Line 261:
<p class="note">'''Note:'''
<p class="note">'''Note:'''
If a PathExpr is parenthesized and followed by Predicates, and if
If a PathExpr is parenthesized and followed by Predicates, and if
position(<!--thinsp-->) is used in those predicates, the axis in effect is the
position(&thinsp;) is used in those predicates, the axis in effect is the
<tt>child</tt> axis. </p>
<tt>child</tt> axis. </p>
 
<dt>count(nodeSet)
<dt>count(<i>nodeSet</i>)
<dd>Returns the number of nodes in the argument nodeSet
<dd>Returns the number of nodes in the argument nodeSet
   
   
Line 274: Line 273:
This expression will select all chapters that have three or more sections.
This expression will select all chapters that have three or more sections.


<dt>number(object?)
Not supported in the XmlDoc API.
<dt>[[#num|number]](<i>object</i>?)
<dd>Returns the numeric value of the argument (after stripping leading
<dd>Returns the numeric value of the argument (after stripping leading
and trailing blanks if it is a string object).
and trailing blanks if it is a string object).
Line 283: Line 284:
The default argument is the context node (of the "containing" expression,
The default argument is the context node (of the "containing" expression,
which, for our purposes, is the context node of the predicate containing the
which, for our purposes, is the context node of the predicate containing the
number(<!--thinsp-->) function).
number(&thinsp;) function).
If the argument is a nodeSet, the value of the
If the argument is a nodeSet, the value of the
first node (in document order) of the argument nodeSet is converted to
first node (in document order) of the argument nodeSet is converted to
Line 291: Line 292:
Here are some XPath functions that return a '''string result''':
Here are some XPath functions that return a '''string result''':
<dl>
<dl>
<dt>string(object?)
<dt>string(<i>object</i>?)
<dd>
<dd>
This function, like several other XPath functions, allows different
This function, like several other XPath functions, allows different
kinds of arguments: for example, it can be used to convert a number
kinds of arguments: for example, it can be used to convert a number
to a string.
to a string.
The string(<!--thinsp-->) function is implicitly used when a comparison is made between
The string(&thinsp;) function is implicitly used when a comparison is made between
a node and a string value, for example:
a node and a string value, for example:
<p class="code">/book/chapter[@title = 'Introduction']
<p class="code">/book/chapter[@title = 'Introduction']
</p>
</p>
In this case, each node in the
In this case, each node in the
node set that is the result of the <tt>@title</tt>
node set that is the result of the <code>@title</code>
PathExpr is converted using string(<!--thinsp-->), then compared to the string
PathExpr is converted using string(&thinsp;), then compared to the string
literal <tt>Introduction</tt>.
literal <code>Introduction</code>.
   
   
The default argument
The default argument
of the string(<!--thinsp-->) function is the context node of the expression.
of the string(&thinsp;) function is the context node of the expression.
The string(<!--thinsp-->) function, when given a node set argument, uses the string value
The string(&thinsp;) function, when given a node set argument, uses the string value
of the first node, in document order, of that node set.
of the first node, in document order, of that node set.
Not supported in the XmlDoc API.


<dt>substring(string, number, number?)
<dt>substring(<i>string</i>, <i>number</i>, <i>number</i>?)
<dd>Returns the substring of the first argument, starting at the position
<dd>Returns the substring of the first argument, starting at the position
specified by the second argument, and for the number of
specified by the second argument, and for the number of
Line 318: Line 321:
As with most XPath expressions, conversions are freely done, so if the
As with most XPath expressions, conversions are freely done, so if the
first argument is not a manifest string type, it is converted to one
first argument is not a manifest string type, it is converted to one
using the string(<!--thinsp-->) function.
using the string(&thinsp;) function.
 
Not supported in the XmlDoc API.
 
</dl>
</dl>
   
   
This XPath function returns a '''boolean result''':
This XPath function returns a '''boolean result''':
<dl>
<dl>
<dt>not(boolean)
<dt>[[#not|not]](<i>boolean</i>)
<dd>Returns <code>true</code> if its argument is false, and <code>false</code>
<dd>Returns <code>true</code> if its argument is false, and <code>false</code>
otherwise.
otherwise.
Line 330: Line 336:
</dl>
</dl>


===XPath syntax===
==XPath syntax==
This section contains a version of the XPath syntax.
This section contains a version of the XPath syntax.
See
See
Line 352: Line 358:
</ul>
</ul>
   
   
For a &ldquo;cross-reference&rdquo; to the productions as contained in the XPath
For a "cross-reference" to the productions as contained in the XPath
Recommendation, see [[#XPath syntax cross-reference|XPath syntax cross-reference]].
Recommendation, see [[#XPath syntax cross-reference|XPath syntax cross-reference]].
   
   
See also [[#XPath supported in the current version|XPath supported in the current version]].
See also [[#XPath supported in the current version|XPath supported in the current version]].
   
   
<pre style="skel">
<p class="code">[A19] PathExpr  ::= LocPath
[A19] PathExpr  ::= LocPath
         | PrimaryExpr Predicate+
         | PrimaryExpr Predicate+
         | PrimaryExpr Predicate* '/' RelativeLocPath
         | PrimaryExpr Predicate* '/' RelativeLocPath
Line 372: Line 377:
  [1]  LocPath ::= RelativeLocPath
  [1]  LocPath ::= RelativeLocPath
         | AbsoluteLocPath
         | AbsoluteLocPath
  [2]  AbsoluteLocPath ::= '/' RelativeLocPath?
  [2]  AbsoluteLocPath ::= '/' RelativeLocPath?
         | '//' RelativeLocPath
         | '//' RelativeLocPath
  [3]  RelativeLocPath ::= Step ('/' Step)*
  [3]  RelativeLocPath ::= Step ('/' Step)*
         | RelativeLocPath '//' Step
         | RelativeLocPath '//' Step
Line 399: Line 406:
   
   
  [21] Expr ::= EqExpr ( ('and' | 'or') EqExpr )*
  [21] Expr ::= EqExpr ( ('and' | 'or') EqExpr )*
  [23] EqExpr ::= RelExpr ( ('=' | '!=') RelExpr )*
  [23] EqExpr ::= RelExpr ( ('=' | '!=') RelExpr )*
  [24] RelExpr ::= NumExpr
  [24] RelExpr ::= NumExpr
         ( ('<' | '>' | '<=' | '>=') NumExpr )*
         ( ('<' | '>' | '<=' | '>=') NumExpr )*
Line 406: Line 415:
         ( ('+' | '-' | '*' | 'div' | 'mod') UnaryExpr )*
         ( ('+' | '-' | '*' | 'div' | 'mod') UnaryExpr )*
   
   
  [29] Lit ::= '"' [^"]* '"'
  [29] Lit ::= '"' [^"]* '"' | "'" [^']* "'"
        | "'" [^']* "'"
  [30] Number ::= [0-9]+ ('.' [0-9]*)?
  [30] Number ::= [0-9]+ ('.' [0-9]*)?
  [35] FunctionName ::= QName - NodeType
  [35] FunctionName ::= QName - NodeType
  [36] Variable ::= '$' QName
  [36] Variable ::= '$' QName
[38] NodeType ::= 'comment'    |  'text'
        | 'processing-instruction'  | 'node'
</pre>
   
   
[38] NodeType ::= 'comment' | 'text' | 'processing-instruction'  | 'node'
</p>
===XPath syntax notes===
For information about XPath functions, see [[#Functions|Functions]].
For information about XPath functions, see [[#Functions|Functions]].
   
   
Syntax notes:
<ul>
<ul>
<li>In [A19], [2], and [3], the double slash (<tt>//</tt>) is an
<li>In [A19], [2], and [3], the double slash (<tt>//</tt>) is an
abbreviation for
abbreviation for
<pre>
<p class="code">/descendant-or-self::node()/
    /descendant-or-self::node()/
</p>
</pre>
<li>When an at sign (<tt>@</tt>) is used in a Step ([4]), it is an
<li>When an at sign (<tt>@</tt>) is used in a Step ([4]), it is an
abbreviation for
abbreviation for
<pre>
<p class="code">attribute::
    attribute::
</p>
</pre>
<li>The syntax for Step ([4]) notes that it may begin directly with a
<li>The syntax for Step ([4]) notes that it may begin directly with a
NodeTest.
NodeTest.
In that case, <tt>child::</tt> is implied before the NodeTest.
In that case, <tt>child::</tt> is implied before the NodeTest.
<li>The syntax for QName and NCName is given in [[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]].
<li>The syntax for QName and NCName is given in [[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]].
<li>The precedence of expressions from
<li>The precedence of expressions from
<tt>Expr</tt> ([21]), <tt>EqExpr</tt> ([23]),
<tt>Expr</tt> ([21]), <tt>EqExpr</tt> ([23]),
Line 449: Line 462:
The operators are all left-associative.
The operators are all left-associative.
For example,
For example,
<tt>3<!--thinsp-->><!--thinsp-->2<!--thinsp-->><!--thinsp-->1</tt> is the same as
<code>3 > 2 > 1</code> is the same as
<tt>(3<!--thinsp-->><!--thinsp-->2)<!--thinsp-->><!--thinsp-->1</tt>, which evaluates to false.
<code>(3 > 2) > 1</code>, which evaluates to false.
<li>The only forms of PrimaryExpr ([B15]) that can create node sets,
<li>The only forms of PrimaryExpr ([B15]) that can create node sets,
and so be used in a PathExpr ([A19]), are the id(<!--thinsp-->) function and
and so be used in a PathExpr ([A19]), are the <tt>id(&thinsp;)</tt> function and
parenthesized LocPath ([1]) (or parenthesized unions of them, with '|' ([C27])).
parenthesized LocPath ([1]) (or parenthesized unions of them, with <code>|</code> ([C27])).
</ul>
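The left-associativity note can be checked with a small Python sketch. The helper names <code>xpath_number</code> and <code>greater_than</code> are hypothetical; they model only the XPath 1 rule that a boolean operand of a relational operator is converted to a number (true becomes 1, false becomes 0). The steps are written out explicitly because Python's own <code>3 > 2 > 1</code> is a chained comparison and behaves differently.

```python
def xpath_number(value):
    """Convert an XPath boolean to a number: true -> 1, false -> 0."""
    return 1.0 if value else 0.0

def greater_than(left, right):
    """XPath '>' on two values, converting boolean operands to numbers."""
    if isinstance(left, bool):
        left = xpath_number(left)
    if isinstance(right, bool):
        right = xpath_number(right)
    return left > right

# 3 > 2 > 1 parses as (3 > 2) > 1 because '>' is left-associative.
inner = greater_than(3, 2)        # true
outer = greater_than(inner, 1)    # true becomes 1, and 1 > 1 is false
print(inner, outer)
```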
</ul>
See also the notes in [[#XPath supported in the current version|XPath supported in the current version]].
See also the notes in [[#XPath supported in the current version|XPath supported in the current version]].
====XPath syntax cross-reference====
===XPath syntax cross-reference===
Here is a listing of all first productions contained in various
Here is a listing of all first productions contained in various
numbered sections and unnumbered subsections from the
numbered sections and unnumbered subsections from the
Line 492: Line 507:
[39]  ExprWhitespace (last production)
[39]  ExprWhitespace (last production)
</dl>
</dl>
===Some notes on XPath usage===
==Some notes on XPath usage==
The following subsections describe some subtle issues in XPath,
The following subsections describe some subtle issues in XPath,
which the XmlDoc API implements exactly as specified in the recommendations.
which the XmlDoc API implements exactly as specified in the recommendations.
====&ldquo;//&rdquo; and &ldquo;.&rdquo;====
 
<blockquote class="note" id="entities">
<p><b>Note:</b> Predicates in XPath expressions, which are widely used, are enclosed within square bracket characters (<tt>[</tt> <tt>]</tt>). However, simply typing square bracket characters in your XPath expressions risks [[Unicode#Consistent XPath predicate errors .E2.80.94 wrong codepage?|invalid character errors]] arising from a mismatch between a 3270-keyboard codepage and the codepage set for the Online with the <var>[[UNICODE command|UNICODE]]</var> command. </p>
<p>
Model 204&nbsp;7.6 maintenance added [[XML processing in Janus SOAP#Entity references|XHTML entities]] for left and right square brackets in response to this problem. These entities are used in some of the [[#Predicates supported in the current version|predicate examples]] shown below. </p>
</blockquote>
==="//" and "."===
As mentioned in [[#Performance considerations: Document order, certain axes|Performance considerations: Document order, certain axes]], the <tt>descendant-or-self</tt>
As mentioned in [[#Performance considerations: Document order, certain axes|Performance considerations: Document order, certain axes]], the <tt>descendant-or-self</tt>
axis (commonly appearing in an XPath expression with the <tt>//</tt>
axis (commonly appearing in an XPath expression with the <tt>//</tt>
Line 509: Line 532:
<li>An XPath expression that begins with <tt>//</tt> is an absolute
<li>An XPath expression that begins with <tt>//</tt> is an absolute
expression; if you want to use it at the start of a relative XPath
expression; if you want to use it at the start of a relative XPath
expression, make use of the <tt>.</tt> step:
expression, make use of the <code>.</code> step:
<pre>
<p class="code">%chapter = %doc:SelectSingleNode('/book/chapter[title="Aerobics"]')
    %chapter = %doc:SelectSingleNode('/book/chapter[title="Aerobics"]')
%sections = %chapter('.//section')
    %sections = %chapter('.//section')
</p>
</pre>
</ul>
</ul>
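The effect of starting a relative expression with <code>.//</code> can be reproduced with Python's standard ElementTree module, used here only as a stand-in for the XmlDoc API. The sample document is an assumption made for the example; the title <code>Aerobics</code> and the element names <code>book</code>, <code>chapter</code>, and <code>section</code> come from the text above.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book>"
    "<chapter><title>Aerobics</title>"
    "<section>Warm-up</section>"
    "<part><section>Cool-down</section></part>"
    "</chapter>"
    "<chapter><title>Yoga</title><section>Breathing</section></chapter>"
    "</book>"
)

# Counterpart of /book/chapter[title="Aerobics"], relative to the book element.
chapter = book.find("chapter[title='Aerobics']")

# './/section' searches only below the selected chapter.
in_chapter = [s.text for s in chapter.findall(".//section")]
# The same search from the book element sees every chapter's sections.
in_book = [s.text for s in book.findall(".//section")]
print(in_chapter)
print(in_book)
```

The relative search returns only the two sections under the selected chapter, at any depth, while the search from the top of the document also returns the section in the other chapter.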
====Attributes: not children, and excluded from most axes====
===Attributes: not children, and excluded from most axes===
One subtle point to observe is that attributes are not children of
One subtle point to observe is that attributes are not children of
their parents!
their parents!
As stated in the XPath Recommendation:
As stated in the XPath Recommendation:
<ul>
<ul class="nobul">
<li>2.2 Axes
<li>2.2 Axes
<li>. . .
<li>. . .
<ul>
<ul>
<li>the descendant axis contains the descendants of the context node; a descendant
<li style="list-style-type:square;">the descendant axis contains the descendants of the context node; a descendant
is a child or a child of a child and so on; thus the descendant axis never
is a child or a child of a child and so on; thus the descendant axis never
contains attribute or namespace nodes
contains attribute or namespace nodes
Line 544: Line 567:
the default axis, an Attribute node will never be the result of a step
the default axis, an Attribute node will never be the result of a step
which does not explicitly include one of the above (abbreviated or unabbreviated) axes.
which does not explicitly include one of the above (abbreviated or unabbreviated) axes.
====Order of nodes: node sets versus nodelists====
===Order of nodes: node sets versus nodelists===
The order of nodes in an XML document is the order in which the
The order of nodes in an XML document is the order in which the
node (or its start-tag, in the case of an Element node) first occurs in
node (or its start-tag, in the case of an Element node) first occurs in
Line 557: Line 581:
In XPath, however, document order is not always used.
In XPath, however, document order is not always used.
In an XPath expression step, the axis implies an order.
In an XPath expression step, the axis implies an order.
This order is important for the <tt>position()</tt> and
This order is important for the <tt>position(&thinsp;)</tt> and
<tt>last()</tt> predicate functions, which
<tt>last(&thinsp;)</tt> predicate functions, which
filter a node based on its order in a node set.
filter a node based on its order in a node set.
This order is the same as document order, except for
This order is the same as document order, except for
Line 570: Line 594:
   
   
For example, consider the following document:
For example, consider the following document:
<pre>
<p class="code"><top>
    <top>
  <a/>
      <a/>
  <b/>
      <b/>
  <c/>
      <c/>
  <d/>
      <d/>
  <e/>
      <e/>
</top>
    </top>
</p>
</pre>
   
   
Using XPath to select the first of the &ldquo;following&rdquo; siblings
Using XPath to select the first of the "following" siblings
(<tt>/*/c/following-sibling::*[1]</tt>)
(<code>/*/c/following-sibling::*[1]</code>)
yields the equivalent of the first element in document order: <tt>d</tt>.
yields the equivalent of the first element in document order: <code>d</code>.
However, selecting the first of the preceding siblings
However, selecting the first of the preceding siblings
(<tt>/*/c/preceding-sibling::*[1]</tt>)
(<code>/*/c/preceding-sibling::*[1]</code>)
yields the first element in ''reverse'' document order: <tt>b</tt>.
yields the first element in ''reverse'' document order: <code>b</code>.
   
   
This reverse ordering
This reverse ordering
is apparent in some contexts but not in others.
is apparent in some contexts but not in others.
For example, the XPath expression in the following statement is used
For example, the XPath expression in the following statement is used
against the document above (call it <tt>%doc</tt>) to select
against the document above (call it <code>%doc</code>) to select
nodes into an XmlNodelist:
nodes into an XmlNodelist:
<pre>
<p class="code">%nodelist = %doc:SelectNodes('/*/c/preceding-sibling::*')
    %nodelist = %doc:SelectNodes('/*/c/preceding-sibling::*')
</p>
</pre>
   
   
The XmlDoc API (re)arranges the two found nodes in <tt>%nodelist</tt> into document
The XmlDoc API (re)arranges the two found nodes in <code>%nodelist</code> into document
order: <tt>a</tt> and <tt>b</tt>, in that order, and
order: <code>a</code> and <code>b</code>, in that order, and
the following statement selects <tt>b</tt>, the second of the nodes:
the following statement selects <code>b</code>, the second of the nodes:
<pre>
<p class="code">Print %nodelist:Item(2):LocalName
    Print %nodelist:Item(2):LocalName
</p>
</pre>
   
   
Yet, if you use the following statement (instead of the previous two)
Yet, if you use the following statement (instead of the previous two)
in an attempt to directly select
in an attempt to directly select
the &ldquo;second&rdquo; preceding sibling, the result is node <tt>a</tt>:
the "second" preceding sibling, the result is node <code>a</code>:
<pre>
<p class="code">Print %doc:LocalName('/*/c/preceding-sibling::*[2]')
    Print %doc:LocalName('/*/c/preceding-sibling::*[2]')
</p>
</pre>
   
   
The Item method, above, selects the second node in the set of
The Item method, above, selects the second node in the set of
document-ordered nodes in the XmlNodelist.
document-ordered nodes in the XmlNodelist.
The position(<!--thinsp-->) function above (the <tt>2</tt> within the brackets)
The position(&thinsp;) function above (the <code>2</code> within the brackets)
selects the second node in the
selects the second node in the
set of reverse-document-ordered nodes passed from the preceding-sibling axis
set of reverse-document-ordered nodes passed from the preceding-sibling axis
after filtering by the <code>*</code> NodeTest.
after filtering by the <code>*</code> NodeTest.
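The difference between axis order and nodelist order can be modeled in a few lines of Python. This sketch does not call the XmlDoc API; it slices the list of children by hand, and the helper name <code>position</code> is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

top = ET.fromstring("<top><a/><b/><c/><d/><e/></top>")
children = list(top)
c_index = [n.tag for n in children].index("c")

# Axis order: following-sibling runs forward, preceding-sibling runs in reverse.
following = children[c_index + 1:]
preceding_axis_order = children[:c_index][::-1]
# A nodelist holds the same preceding siblings rearranged into document order.
preceding_document_order = children[:c_index]

def position(nodes, n):
    """Return the tag of the node at 1-based position n."""
    return nodes[n - 1].tag

print(position(following, 1))                 # following-sibling::*[1]
print(position(preceding_axis_order, 1))      # preceding-sibling::*[1]
print(position(preceding_axis_order, 2))      # preceding-sibling::*[2]
print(position(preceding_document_order, 2))  # Item(2) of the nodelist
```

The four lines printed are <code>d</code>, <code>b</code>, <code>a</code>, and <code>b</code>, matching the results described above.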
===Performance considerations: Document order, certain axes===
 
==Performance considerations: Document order, certain axes==
This section discusses the performance implications of evaluating certain
This section discusses the performance implications of evaluating certain
XPath expressions.
XPath expressions.
Line 641: Line 662:
of methods like SelectNodes and UnionSelected in the XmlNodelist class
of methods like SelectNodes and UnionSelected in the XmlNodelist class
designate a set of nodes.
designate a set of nodes.
In addition to these &ldquo;set-valued&rdquo; methods, XPath expressions can be
In addition to these "set-valued" methods, XPath expressions can be
used in many XmlDoc API methods
used in many XmlDoc API methods
(SelectSingleNode, Value, DeleteSubtree, QName, and more)
(<var>SelectSingleNode</var>, <var>Value</var>, <var>DeleteSubtree</var>, <var>QName</var>, and more) to operate on a single node that satisfies the expression.
to operate on a single node that satisfies the expression.
For a simple XPath expression (described above), the "single node" methods
For a simple XPath expression (described above), the &ldquo;single node&ldquo; methods
may be able to determine the desired node by scanning
may be able to determine the desired node by scanning
fewer nodes, giving better performance than the set-valued methods for the
fewer nodes, giving better performance than the set-valued methods for the
Line 658: Line 678:
   
   
In other words, given an XPath expression <i>expr</i> that uses any
In other words, given an XPath expression <i>expr</i> that uses any
of the axis cases described below in "[[#The extra-processing expressions|The extra-processing expressions]]"
of the axis cases described below in [[#The extra-processing expressions|The extra-processing expressions]] and given any
and given any
single-node selection method <i>XMeth</i>, this expression:
single-node selection method <i>XMeth</i>, this expression:
<!--?? %obj:<i><b>XMeth</b></i>(<i><b>expr</b></i>) -->
<!--?? %obj:<i><b>XMeth</b></i>(<i><b>expr</b></i>) -->
Line 674: Line 693:
   
   
For example, consider the following document:
For example, consider the following document:
<pre>
<p class="code"><top>
    <top>
  <a>
      <a>
      <b x="1"/>
        <b x="1"/>
  </a>
      </a>
  <b x="2"/>
      <b x="2"/>
</top>
    </top>
</p>
</pre>
   
   
When, say, <tt>Value('/*/*/b/@x')</tt> is evaluated, the document search
When, say, <code>Value('/*/*/b/@x')</code> is evaluated, the document search
ends when the first match is found (and the Value method returns <tt>1</tt>).
ends when the first match is found (and the Value method returns <code>1</code>).
   
   
But when <tt>Value('//b/@x')</tt> is evaluated, the document search first
But when <code>Value('//b/@x')</code> is evaluated, the document search first
finds the match <tt>x=2</tt>, then it continues searching the entire
finds the match <code>x=2</code>, then it continues searching the entire
document for all matches, to ensure that the match which is lowest in
document for all matches, to ensure that the match which is lowest in
document order (<tt>x=1</tt>) is the result.
document order (<code>x=1</code>) is the result.
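The difference between the two searches can be sketched in Python, using the standard <code>xml.etree.ElementTree</code> module as a stand-in for an XmlDoc. The helper names are hypothetical, and the out-of-order discovery is simulated by sorting; the sketch only assumes, as described above, that the internal algorithm may find matches in an order other than document order.

```python
import xml.etree.ElementTree as ET

DOC = """<top>
  <a>
    <b x="1"/>
  </a>
  <b x="2"/>
</top>"""

root = ET.fromstring(DOC)

# Document-order position of every element (iter() walks in document order).
doc_pos = {elem: i for i, elem in enumerate(root.iter())}


def first_child_path_match(top):
    """Analogue of /*/*/b: candidates arrive in document order, so the
    search can stop at the first match."""
    for child in top:
        for grandchild in child:
            if grandchild.tag == "b":
                return grandchild
    return None


def earliest_match(matches_in_any_order):
    """Analogue of //b when matches may arrive out of order: every match
    must be inspected, and the one earliest in document order is kept."""
    return min(matches_in_any_order, key=doc_pos.__getitem__, default=None)


# Simulated out-of-order discovery: the x="2" element is found first.
found = sorted(root.iter("b"), key=lambda e: e.get("x"), reverse=True)

print(first_child_path_match(root).get("x"))  # 1
print(earliest_match(found).get("x"))         # 1
```

Both searches return the same node; the second one simply cannot return it until every candidate has been seen.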
   
   
The performance implications of the expressions that involve extra processing
The performance implications of the expressions that involve extra processing
Line 697: Line 715:
may be selected in an order (due to the selection algorithm) that differs
may be selected in an order (due to the selection algorithm) that differs
from document order.
from document order.
====The extra-processing expressions====
   
   
===The extra-processing expressions===
Extra processing can occur in the following cases:
Extra processing can occur in the following cases:
<ol>
<ol>
Line 732: Line 750:
</ul>
</ul>
</ol>
</ol>
 
<div id="ordScan"></div>
<div id="ordScan"></div>
In addition to the cost of the actual XPath search performed with the above
In addition to the cost of the actual XPath search performed with the above
expressions,
expressions, they can incur an additional "one-time" cost for XPath evaluation.
they can incur an additional "one-time" cost for XPath evaluation.
If the document has been modified in such a way that the internal order of
If the document has been modified in such a way that the internal order of
the nodes cannot be guaranteed to be the same as document order (this will
the nodes cannot be guaranteed to be the same as document order (this will
always happen with any of the XML <tt>Insert..Before</tt> methods, and
always happen with any of the XML <var>Insert..Before</var> methods, and
usually will happen
usually will happen
with any of the <tt>Add..</tt> methods), then the entire document
with any of the <var>Add..</var> methods), then the entire document
(not only the subtree
(not only the subtree
being searched) must be scanned so that the order is adjusted.
being searched) must be scanned so that the order is adjusted.
Line 747: Line 764:
a full scan.  This adjustment "fixes up" the <var>XmlDoc</var> and it will remain "fixed up" for subsequent XPath searches, until such time as the <var>XmlDoc</var> is subsequently updated in such a way that the internal order does not guarantee document order.
a full scan.  This adjustment "fixes up" the <var>XmlDoc</var> and it will remain "fixed up" for subsequent XPath searches, until such time as the <var>XmlDoc</var> is subsequently updated in such a way that the internal order does not guarantee document order.
   
   
'''Note:'''
<blockquote class="note">
<p>'''Note:''' </p>
<ol>
<ol>
<li>One important exception to the above rules is
<li>One important exception to the above rules is
Line 754: Line 772:
by the <tt>child</tt> axis without any predicate.
by the <tt>child</tt> axis without any predicate.
An example of the usual way to specify this is:
An example of the usual way to specify this is:
<pre>
<p class="code">//chapter
    //chapter
</p>
</pre>
In this case, the internal node selection algorithm operates in document
In this case, the internal node selection algorithm operates in document
order, and no extra processing is incurred.
order, and no extra processing is incurred.
<p>
Even with this special case, it is better to avoid the <tt>descendant-or-self</tt>
Even with this special case, it is better to avoid the <tt>descendant-or-self</tt>
step (specified explicitly or by using <tt>//</tt>) if your
step (specified explicitly or by using <code>//</code>) if your
document structure lends itself to explicitly specifying the
document structure lends itself to explicitly specifying the
&ldquo;intermediate&rdquo; elements with &ldquo;<tt>*</tt>&rdquo;
"intermediate" elements with <code>*</code> (or even better, with their names)
(or even better, with their names)
that should be matched. </p>
that should be matched.
<li>The considerations described in this section only apply to the &ldquo;outer&rdquo; XPath
<li>The considerations described in this section only apply to the "outer" XPath
expression; they do not apply to any expression within a predicate.
expression; they do not apply to any expression within a predicate.
Although it is still better, for the sake of efficiency, to prune the search
Although it is still better, for the sake of efficiency, to prune the search
by explicitly
by explicitly
specifying &ldquo;intermediate&rdquo; elements rather than using <tt>//</tt>,
specifying "intermediate" elements rather than using <code>//</code>,
there is no efficiency concern due to the internal order of node selection
there is no efficiency concern due to the internal order of node selection
with an XPath predicate such as the following:
with an XPath predicate such as the following:
<pre>
<p class="code">Print %d:Value('/book/chapter' With '[.//credit/details/@auth="Dave"]')
    Print %d:Value('/book/chapter' With -
</p>
      '[.//credit/details/@auth="Dave"]')
</pre>
<li>'''In conclusion''', except when you '''must''' use
<li>'''In conclusion''', except when you '''must''' use
the &ldquo;<tt>//chapter</tt>&rdquo; exception discussed in Note 1, above,
the "<tt>//chapter</tt>" exception discussed in Note 1, above,
''avoid these extra-processing axes and axis combinations''
''avoid these extra-processing axes and axis combinations''
(especially in outer XPath expressions) if your documents
(especially in outer XPath expressions) if your documents
are relatively large and performance is a consideration.
are relatively large and performance is a consideration.
</ol>
</ol>
 
</blockquote>
===XPath supported in the current version===
==XPath supported in the current version==
This section contains a condensed excerpt of the XPath syntax, showing
This section contains a condensed excerpt of the XPath syntax, showing
only those parts of XPath used in the current version.
only those parts of XPath used in the current version.
Line 856: Line 873:
</pre>
</pre>
   
   
====Explanatory notes====
===Explanatory notes===
The following notes are intended only to explain the above syntax; they
The following notes are intended only to explain the above syntax; they
do not present any limitations on XPath support in the XmlDoc API:
do not present any limitations on XPath support in the XmlDoc API:
Line 863: Line 879:
<li>When the at sign (<tt>@</tt>) is used in a Step ([4]), it is an
<li>When the at sign (<tt>@</tt>) is used in a Step ([4]), it is an
abbreviation for
abbreviation for
<pre>
<p class="code">attribute::
    attribute::
</p>
</pre>
<li>The syntax for Step ([4]) notes that it may begin directly with a
<li>The syntax for Step ([4]) notes that it may begin directly with a
NodeTest.
NodeTest.
In that case, <tt>child::</tt> is implied before the NodeTest.
In that case, <tt>child::</tt> is implied before the NodeTest.
<li>The syntax for QName is given in
<li>The syntax for QName is given in
[[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]].
[[XML processing in Janus SOAP#Name and namespace syntax|Name and namespace syntax]].
<li>A node is selected by a PositiveInteger predicate if
<li>A node is selected by a PositiveInteger predicate if
the position of the node, in the set which the predicate
the position of the node, in the set which the predicate
is filtering, is equal to that PositiveInteger.
is filtering, is equal to that PositiveInteger.
<li>A node is selected by an
<li>A node is selected by an
Existence test if the result of the <tt>PathExpr</tt>, using that node as
Existence test if the result of the <tt>PathExpr</tt>, using that node as
the context node, is non-empty.
the context node, is non-empty.
<li>A node is selected by a
<li>A node is selected by a
Comparison if any node in the result of the <tt>PathExpr</tt>, using
Comparison if any node in the result of the <tt>PathExpr</tt>, using
Line 882: Line 902:
holds the specified relationship to the <tt>Lit</tt>.
holds the specified relationship to the <tt>Lit</tt>.
</ul>
</ul>
====Restrictions and limitations====
   
   
The following notes
===<b id="xpathRestr"></b>Restrictions and limitations===
concern limitations on XPath support in the XmlDoc API:
The following notes concern limitations on XPath support in the XmlDoc API:
<ul>
<ul>
<li>One way to summarize the XPath productions that are ''not'' supported
<li>One way to summarize the XPath productions that are ''not'' supported
Line 891: Line 910:
is to list the XPath operators that
is to list the XPath operators that
are ''not'' supported, as shown in the following table:
are ''not'' supported, as shown in the following table:
<pre>
Unsupported operators     Meaning
<table class="thJustBold">
+ - * div mod         arithmetic
<tr class="head"><th>Unsupported operators</th><th>Meaning</th><th>Comments</th></tr>
|                         union
<tr><th>&nbsp;&nbsp; + &nbsp;&nbsp;- &nbsp;&nbsp; *&nbsp;&nbsp; div&nbsp;&nbsp; mod</th><td>arithmetic</td><td></td></tr>
</pre>
<tr><th>&nbsp;&nbsp; | </th><td> union </td><td>The <var>UnionSelected</var> method can be used to form the [[UnionSelected (XmlNodelist function)#exmp1|union of two nodesets]].</td></tr>
'''Note:'''
</table>
As of <var class="product">Sirius Mods</var> version 7.2, parentheses are allowed for grouping
<p class="note">'''Note:'''
Parentheses are allowed for grouping
within Boolean expressions within predicates, but this is the only place
within Boolean expressions within predicates, but this is the only place
they are supported.
they are supported. </p>
<li>As of the current version, XPath function
<li>As of the current version, XPath function
support is limited enough that it can be shown in the syntax above.
support is limited enough that it can be shown in the syntax above.
Line 905: Line 927:
supported functions and for any differences between the
supported functions and for any differences between the
XmlDoc API implementation and the XPath 1 definition of a function.
XmlDoc API implementation and the XPath 1 definition of a function.
<li>The Boolean operators (<tt>and</tt>, <tt>or</tt>) and
<li>The Boolean operators (<tt>and</tt>, <tt>or</tt>) and
relational operators (<tt>= != < <= > >=</tt>) are supported
relational operators (<tt>= != < <= > >=</tt>) are supported
only in (some) predicates (see [[#Comparison tests in predicates|Comparison tests in predicates]]).
only in (some) predicates (see [[#Comparison tests in predicates|Comparison tests in predicates]]).
<li>XPath variables ($<var_name>) are not supported.
<li>The numeric constant <tt>+/- infinity</tt> is ''not'' supported.
<li>XPath variables ($<i>var_name</i>) are not supported.
<li>The numeric constants <tt>+/- infinity</tt> are ''not'' supported.
<li>The size of an XPath expression is limited to approximately
<li>The size of an XPath expression is limited to approximately
26 steps, if each has an NCName NodeTest.
26 steps, if each has an NCName NodeTest.
<li>In the XmlDoc API, a numeric value (either a literal or a node value)
<li>In the XmlDoc API, a numeric value (either a literal or a node value)
may be of any form available in User Language.
may be of any form available in SOUL.
In particular, &ldquo;E-format&rdquo; literals, such as <tt>1.003E-5</tt>
In particular, "E-format" literals, such as <code>1.003E-5</code>
(even though they are not very common in XML documents) may be specified.
(even though they are not very common in XML documents) may be specified.
The same form of numbers is available in XPath 2.
The same form of numbers is available in XPath 2.
XPath 1 only allows decimal numbers; it does not allow E-format literals nor
XPath 1 only allows decimal numbers; it does not allow E-format literals nor node values.
node values.
<li>The precision used in the XmlDoc API XPath support is that provided by
<li>The precision used in the XmlDoc API XPath support is that provided by SOUL &mdash; namely, 15 decimal digits.
User Language &mdash; namely, 15 decimal digits.
<li>If the XPath support in the XmlDoc API attempts to convert a long string
<li>If the XPath support in the XmlDoc API attempts to convert a long string
(that is, longer than 255 bytes) or a number whose absolute value is
(that is, longer than 255 bytes) or a number whose absolute value is
beyond the capabilities of <var class="product">User Language</var> (maximum absolute value approximately
beyond the capabilities of <var class="product">SOUL</var> (maximum absolute value approximately 7.237E75), the request is cancelled.
7.237E75), the request is cancelled.
</ul>
</ul>
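The difference between the XPath 1 numeric form and the E-format form mentioned above can be sketched in Python. The regular expression follows the XPath 1 rule for strings accepted by <code>number()</code> (optional whitespace, optional minus sign, then digits with an optional fraction). Python's <code>float</code> is used only as a stand-in for an E-format parser; it does not reproduce the SOUL precision or magnitude limits, and the helper name is hypothetical.

```python
import re

# XPath 1: optional whitespace, optional "-", then Digits ("." Digits?)? or "." Digits.
XPATH1_NUMBER = re.compile(r"[ \t\r\n]*-?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")


def is_xpath1_number(text):
    """Return True if text is a number in the XPath 1 decimal form."""
    return XPATH1_NUMBER.fullmatch(text) is not None


for literal in ("200", " -12.5 ", "1.003E-5"):
    print(repr(literal), is_xpath1_number(literal), float(literal))
```

The first two literals are valid in both forms; <code>1.003E-5</code> is rejected by the XPath 1 pattern but is a valid E-format number.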
====Predicates supported in the current version====
 
===Predicates supported in the current version===
The XmlDoc API supports these predicates:
The XmlDoc API supports these predicates:
<ul>
<ul>
<li>Two types of <i><b>Location-Path-expression predicates</b></i>:
<li>Two types of <b>Location-Path-expression predicates</b>:
<ul>
<ul>
<li>A Location Path (that is, production [1], LocPath in [[#XPath supported in the current version|XPath supported in the current version]])
<li>A Location Path (that is, production [1], LocPath in [[#XPath supported in the current version|XPath supported in the current version]]) used as an existence test
used as an existence test
<p>
<p></p>
If the nodeSet that results from the Location Path is non-empty,
If the nodeSet that results from the Location Path is non-empty,
the predicate evaluates as true.
the predicate evaluates as true.
The '''usual''' (but not only) purpose of this predicate is to select
The '''usual''' (but not only) purpose of this predicate is to select
a node if it has at least one attribute or element with a given name.
a node if it has at least one attribute or element with a given name. </p>
<p>
For example, the following expression selects (as the XPath argument to the <var>[[SelectNodes (XmlDoc/XmlNode function)|SelectNodes]]</var> method, for example) all <code>contact</code>
children of <code>cust</code> elements, if the
<code>cust</code> element has an <code>invoice</code> child element and
<code>contact</code> has a <code>fax</code> child element:
</p>
<p class="code" id="brackXmp">'/active/cust&amp;lsqb;invoice&amp;rsqb;/contact&amp;lsqb;fax&amp;rsqb;'<b>:u</b> </p>
<blockquote class="note">
<p><b>Note:</b> This example substitutes the <code>&lsqb;</code> and <code>&rsqb;</code> entities for left and right square-bracket characters from the keyboard, for the reason explained at the [[#entities|beginning of this section]]. Notice also the accompanying use of the <var>[[U (String function)|U]]</var> method. The value of the expression is: </p>
<p class="code">/active/cust[invoice]/contact[fax]
</p>
<p>
Most of the remaining predicate examples below omit the entities in order to more clearly display the XPath grammar. </p>
</blockquote>
</li>
   
   
For example, the following expression selects all <tt>contact</tt>
children of <tt>cust</tt> elements, if the
<tt>cust</tt> element has an <tt>invoice</tt> child element and
<tt>contact</tt> has a <tt>fax</tt> child element:
<pre>
    /active/cust[invoice]/contact[fax]
</pre>
<li>A Location Path expression with a comparison operator and literal
<li>A Location Path expression with a comparison operator and literal
<p></p>
<p>
For example, <tt>@price > 200</tt>
For example, <code>@price > 200</code>
selects a node if the numeric value of the node's <tt>price</tt>
selects a node if the numeric value of the node's <code>price</code>
Attribute is greater than 200.
Attribute is greater than 200. </p>
<p>
See [[#Comparison tests in predicates|Comparison tests in predicates]] for further discussion.
See [[#Comparison tests in predicates|Comparison tests in predicates]] for further discussion. </p></li>
</ul>
</ul>
   
   
<li>These types of <i><b>function predicates</b></i>:
<li>These types of <b>function predicates</b>:
<ul>
<ul>
<li>A &ldquo;simple&rdquo; position test using a numeric literal <i>n</i>
<li>A "simple" position test using a numeric literal <i>n</i>
<p></p>
<p>
This test is
This test is equivalent to the implicit use of the position(&thinsp;) function in the
equivalent to the implicit use of the position(<!--thinsp-->) function in the
predicate term <code>position()=<i>n</i></code>. </p>
predicate term <tt>position()=</tt><i>n</i>.
<p>
For example: </p>
For example:
<p class="code">/book/chapter[2]/section[9]/paragraph[3]
<pre>
</p></li>
    /book/chapter[2]/section[9]/paragraph[3]
</pre>
<li>The number(<!--thinsp-->) function with a location path argument, followed by
a comparison to a numeric literal
<p></p>
For example, <tt>number(@size) > 30</tt>
selects a node if the numeric value of the node's <tt>size</tt>
Attribute is greater than 30.
   
   
<li>The number(&thinsp;) function with a location path argument, followed by a comparison to a numeric literal
<p>
For example, <code>number(@size) > 30</code>
selects a node if the numeric value of the node's <code>size</code>
Attribute is greater than 30. </p>
<p>
This predicate differs from the similar Location Path example above
This predicate differs from the similar Location Path example above
(<tt>@price > 200</tt>)
(<code>@price > 200</code>) primarily in that it allows the Attribute value to be non-numeric.
primarily in that it allows the Attribute value to be non-numeric.
The previous Location Path example cancels the request if a numeric comparison is performed
The previous Location Path
with a <code>price</code> Attribute whose value is non-numeric. </p>
example cancels the request if a numeric comparison is performed
<p>
with a <tt>price</tt> Attribute whose value is non-numeric.
See [[#Comparison tests in predicates|Comparison tests in predicates]] for further discussion. </p></li>
 
<li>The position(&thinsp;) function, followed by a comparison operator,
followed by an integer literal, which may be negative or zero.
<p>
For example: </p>
<p class="code">/book/chapter[position()>1]/section[2]
</p></li>
   
   
See [[#Comparison tests in predicates|Comparison tests in predicates]] for further discussion.
<li>The not(&thinsp;) function, which returns the opposite boolean value
<li>The position(<!--thinsp-->) function, followed by a comparison operator,
followed by an integer literal, which may be negative or zero
<p></p>
For example:
<pre>
    /book/chapter[position(<!--thinsp-->)>1]/section[2]
</pre>
<li>The not(<!--thinsp-->) function, which returns the opposite boolean value
of its boolean argument.
of its boolean argument.
<p></p>
<p>
For example:
For example: </p>
<pre>
<p class="code">/book/chapter[2]/section[9][not(paragraph[3])]
    /book/chapter[2]/section[9][not(paragraph[3])]
</p></li>
</pre>
</ul>
</ul>
<li><i><b>Nested predicates</b></i>.
<p></p>
<li><b>Nested predicates</b>.
For example, this statement
<p>
selects Chapters whose first Section has a Racy attribute:
For example, this statement selects each <code>Chapter</code> whose first <code>Section</code> has a <code>Racy</code> attribute: </p>
<pre>
<p class="code">%lis = %bk:SelectNodes('Chapter&lsqb;Section&lsqb;1 and @Racy&rsqb;&rsqb;'<b>:u</b>)
    %lis = %bk:SelectNodes('Chapter[Section[1 and @Racy]]')
</p>
</pre>
<blockquote>
<li><i><b>Multiple predicates in a single step</b></i>.
<p><b>Note:</b> As in an [[#brackXmp|earlier example]], the <code>&lsqb;</code> and <code>&rsqb;</code> entities in this example require Model&nbsp;204 7.6 or higher.  The value of the XPath argument is: </p>
<p></p>
<p class="code">Chapter[Section[1 and @Racy]] </p>
</blockquote>
</li>
<li><b>Multiple predicates in a single step</b>.
<p>
For example, using the <tt>position()</tt> function to filter based
For example, using the <tt>position()</tt> function to filter based
on the position of nodes from the preceding predicate, rather than
on the position of nodes from the preceding predicate, rather than
from the step's NodeTest:
from the step's NodeTest: </p>
<pre>
<p class="code">/book/chapter[author="Alex"] [2]
    /book/chapter[author="Alex"] [2]
</p>
</pre>
<p>
The preceding two-predicate step
The preceding two-predicate step selects the second <code>chapter</code> child that is authored by <code>Alex</code>, while the following expression
selects the second <tt>chapter</tt> child that is authored by Alex,
selects the second <code>chapter</code> child of the <code>book</code>,
while the following expression
if its author is <code>Alex</code>: </p>
selects the second <tt>chapter</tt> child of the <tt>book</tt>,
<p class="code">/book/chapter[author="Alex" and 2]
if its author is Alex:
</p>
<pre>
<p>
    /book/chapter[author="Alex" and 2]
Parentheses for grouping in Boolean expressions are supported.
</pre>
For example: </p>
<p class="code">chapter[@type="methods" and
  (@class="Stringlist" or @class="Daemon")]
</p></li>
   
   
Parentheses for grouping in Boolean expressions are supported as of
<li><b>Combination predicates</b>.
<var class="product">Sirius Mods</var> version 7.2.
<p>
For example:
<pre>
    chapter[@type="methods" and
      (@class="Stringlist" or @class="Daemon")]
</pre>
:note
Prior to version 7.2, you could only simulate this parenthetical grouping
by using a technique like the following:
<pre>
    chapter[@type="methods"]
      [@class="Stringlist" or @class="Daemon"]
</pre>
And some Boolean parenthesized expressions could not use
this technique, for example:
<pre>
    [@a="w" or @a="x" and (@a="y" or @a="z")]
</pre>
<li><i><b>Combination predicates</b></i>.
<p></p>
Predicates may combine any of these supported
Predicates may combine any of these supported
functions and supported Location Path expressions
functions and supported Location Path expressions
using the <tt>and</tt> and <tt>or</tt> Boolean operators.
using the <tt>and</tt> and <tt>or</tt> Boolean operators. </p>
<p>
For example:
For example: </p>
<pre>
<p class="code">/active/cust[invoice and position()>1]
    /active/cust[invoice and position()>1]
</p></li>
</pre>
</ul>
</ul>
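The difference between two predicates in one step and a single combined predicate can be sketched in plain Python. The chapter data and both helper names are hypothetical; a list of dictionaries stands in for the <code>chapter</code> children of <code>book</code> in document order.

```python
# Hypothetical chapter children of book, in document order.
chapters = [
    {"title": "One", "author": "Alex"},
    {"title": "Two", "author": "Sam"},
    {"title": "Three", "author": "Alex"},
]


def filter_then_position(nodes, author, n):
    """Analogue of chapter[author="..."][n]: the position test counts only
    the nodes that passed the first predicate."""
    kept = [c for c in nodes if c["author"] == author]
    return kept[n - 1:n]


def filter_and_position(nodes, author, n):
    """Analogue of chapter[author="..." and n]: the position test counts
    every chapter child, and both conditions must hold for the same node."""
    return [
        c for pos, c in enumerate(nodes, start=1)
        if c["author"] == author and pos == n
    ]


print([c["title"] for c in filter_then_position(chapters, "Alex", 2)])  # ['Three']
print([c["title"] for c in filter_and_position(chapters, "Alex", 2)])   # []
```

With this data the second chapter is not by Alex, so the combined predicate selects nothing, while the two-predicate step selects the third chapter.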


====XPath functions supported in the current version====
===XPath functions supported in the current version===
The following XPath functions are supported in the current version,
The following XPath functions are supported in the current version,
and "[[#Functions|Functions]]" gives their XPath 1 definitions.
and [[#Functions|Functions]] gives their XPath 1 definitions.
Any differences between the XPath 1 definition and the XmlDoc API implementation
Any differences between the XPath 1 definition and the XmlDoc API implementation
are shown below.
are shown below.
'''Note:'''
<p class="note">'''Note:'''
In discussing XPath functions, the name of the function followed
In discussing XPath functions, the name of the function followed
by an empty pair of parentheses (for example, <tt>number()</tt>)
by an empty pair of parentheses (for example, <code>number()</code>)
is sometimes used to name the function, whether or not
is sometimes used to name the function, whether or not
the particular function being discussed takes arguments.
the particular function being discussed takes arguments. </p>
<ul>
<ul>
<li>position(<!--thinsp-->), XPath
<li id="pos">position(&thinsp;)
<li>not(<i>bool</i>)
<p>Performs as specified in the XPath standard.</p>
   
   
<li id="not">not(<i>bool</i>)
<p>
The function argument is a Boolean expression, and the function result
The function argument is a Boolean expression, and the function result
is <tt>true</tt> if the value of the argument is <tt>false</tt>,
is <code>true</code> if the value of the argument is <code>false</code>,
and it is <tt>false</tt> otherwise.
and it is <code>false</code> otherwise. </p>
<p>
Notes:
Notes: </p>
<ul>
<ul>
<li><p>Performs as specified in the XPath standard.
<li>The result of the <tt>not()</tt> function applied to a comparison
<li>The result of the <tt>not()</tt> function applied to a comparison
expression is different than the result of the same expression with
expression is different than the result of the same expression with
Line 1,078: Line 1,099:
For example,
For example,
this statement selects children that have the value of the status
this statement selects children that have the value of the status
attribute equal to &ldquo;pending&rdquo;:
attribute equal to "pending":
<pre>
<p class="code">%lis = %nod:SelectNodes('*[@status="pending"]')
    %lis = %nod:SelectNodes('*[@status="pending"]')
</p>
</pre>
   
   
This statement selects children that have the value of the status
This statement selects children that have the value of the status
attribute equal to something other than &ldquo;pending&rdquo;:
attribute equal to something other than &ldquo;pending&rdquo;:
<pre>
<p class="code">%lis = %nod:SelectNodes('*[@status!="pending"]')
    %lis = %nod:SelectNodes('*[@status!="pending"]')
</p>
</pre>
   
   
This statement selects children that have the value of the status
This statement selects children that have the value of the status
Line 1,093: Line 1,112:
or that have no status attribute:
or that have no status attribute:
   
   
<pre>
<p class="code">%lis = %nod:SelectNodes('*[not(@status="pending")]')
    %lis = %nod:SelectNodes('*[not(@status="pending")]')
</p>
</pre>
</ul>
</ul>
<li>number(<i>nodeset?</i>)
   
   
The XmlDoc API number(<!--thinsp-->) function differs from the XPath 1 definition as follows:
<li id="num">number(<i>nodeset?</i>)
<p>
The XmlDoc API number(&thinsp;) function differs from the XPath 1 definition as follows:</p>
<ol>
<ol>
<li>XPath 1 allows a variety of argument types (for example, a string literal);
<li>XPath 1 allows a variety of argument types (for example, a string literal);
the XmlDoc API allows only a nodeSet argument.
the XmlDoc API allows only a nodeSet argument.
<li>In XPath 1, if a nodeSet argument to number(<!--thinsp-->)
<li>In XPath 1, if a nodeSet argument to number(&thinsp;)
contains more than one node, the first node
contains more than one node, the first node
(in document order) is converted to a number and returned.
(in document order) is converted to a number and returned.
<br>
<p>
In the XmlDoc API, if the argument result contains more than one node, the
In the XmlDoc API, if the argument result contains more than one node, the
request is cancelled, which is consistent with the XPath 2 standard.
request is cancelled, which is consistent with the XPath 2 standard. </p>
<li>The definition of a numeric value for number(<!--thinsp-->) (after
<li>The definition of a numeric value for number(&thinsp;) (after
stripping leading and trailing whitespace) is the same
stripping leading and trailing whitespace) is the same
as the NumericLit production ([30]) in [[#XPath supported in the current version|XPath supported in the current version]].
as the NumericLit production ([30]) in [[#XPath supported in the current version|XPath supported in the current version]].
This is consistent with the XPath 2 standard, and is
This is consistent with the XPath 2 standard, and is
an extension of the XPath 1 definition of number(<!--thinsp-->), which only accepts
an extension of the XPath 1 definition of number(&thinsp;), which only accepts
numbers of the form DecimalNumber ([30d]).
numbers of the form DecimalNumber ([30d]).
</ol>
</ol>
</ul>
</ul>
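The three <code>status</code> predicates discussed under <code>not()</code> above can be compared with a plain-Python sketch. The child data is hypothetical; a dictionary without a <code>status</code> key stands in for an element that has no <code>status</code> attribute.

```python
# Hypothetical children; "c" has no status attribute at all.
children = [
    {"name": "a", "status": "pending"},
    {"name": "b", "status": "done"},
    {"name": "c"},
]

# Analogue of *[@status="pending"]: the attribute exists and equals "pending".
equal = [c["name"] for c in children if c.get("status") == "pending"]

# Analogue of *[@status!="pending"]: the attribute must exist for the
# comparison to be true of some node.
not_equal = [
    c["name"] for c in children
    if "status" in c and c["status"] != "pending"
]

# Analogue of *[not(@status="pending")]: also true when there is no attribute.
negated = [c["name"] for c in children if not (c.get("status") == "pending")]

print(equal, not_equal, negated)
```

Only the <code>not()</code> form selects the child that lacks the attribute.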
====Comparison tests in predicates====
 
===Comparison tests in predicates===
In the XPath standard (XPath 1 and XPath 2), either operand
In the XPath standard (XPath 1 and XPath 2), either operand
in a comparison test in a predicate may be any form of XPath expression.
in a comparison test in a predicate may be any form of XPath expression.
The predicate evaluates as true if the comparison is true of at least one
The predicate evaluates as true if the comparison is true of at least one
node in the resulting nodeSet,
node in the resulting nodeSet,
and typically the
and typically the purpose of the predicate is to select the nodes
purpose of the predicate is to select the nodes
for which the comparison is true.
for which the comparison is true.
   
   
For example, the
For example, the following expression selects all <code>item</code> children
following expression selects all <tt>item</tt> children
of <code>order</code> elements that have a <code>price</code> attribute
of <tt>order</tt> elements that have a <tt>price</tt> attribute
whose value is greater than 9.99:
whose value is greater than 9.99:
<pre>
<p class="code">order/item[@price > 9.99]
    order/item[@price > 9.99]
</p>
</pre>
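The selection made by <code>order/item[@price > 9.99]</code> can be sketched in Python, using the standard <code>xml.etree.ElementTree</code> module as a stand-in for an XmlDoc. The document content and the helper name are hypothetical, and the comparison is written out by hand rather than passed as an XPath predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical order element used as the context node.
order = ET.fromstring(
    '<order>'
    '<item price="4.50"/>'
    '<item price="12.00"/>'
    '<item/>'
    '<item price="9.99"/>'
    '</order>'
)


def items_priced_above(order_elem, limit):
    """Select item children whose price attribute exists and exceeds limit."""
    selected = []
    for item in order_elem.findall("item"):
        price = item.get("price")
        # An item with no price attribute yields an empty node set, so the
        # comparison is false for it.
        if price is not None and float(price) > limit:
            selected.append(item)
    return selected


print([item.get("price") for item in items_priced_above(order, 9.99)])  # ['12.00']
```

In this sketch a non-numeric <code>price</code> would raise <code>ValueError</code> from <code>float</code>, which is loosely comparable to the request cancellation described for numeric comparisons against non-numeric values.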
XmlDoc API predicate comparisons differ from XPath 1 comparisons:
XmlDoc API predicate comparisons differ from XPath 1 comparisons:
<ul>
<ul>
Line 1,138: Line 1,157:
in the XPath standard.
in the XPath standard.
As is explained in the syntax discussion below, you can use only a Location
As is explained in the syntax discussion below, you can use only a Location
Path or the number(<!--thinsp-->) function,
Path or the number(&thinsp;) function,
followed by a comparison operator, followed by a literal.
followed by a comparison operator, followed by a literal.
<li>The XmlDoc API uses XPath 2 comparisons.
<li>The XmlDoc API uses XPath 2 comparisons.
The XPath 1 standard does not provide for exception conditions
The XPath 1 standard does not provide for exception conditions
and it does not provide for ordered string comparisons.
and it does not provide for ordered string comparisons.
This is also true for Microsoft .NET, which follows the XPath 1 standard.
This is also true for Microsoft .NET, which follows the XPath 1 standard.
<p>
The XmlDoc API follows the XPath 2 standard by providing for exceptions
The XmlDoc API follows the XPath 2 standard by providing for exceptions
(implemented as request cancellation conditions) and providing
(implemented as request cancellation conditions) and providing
for ordered string comparisons.
for ordered string comparisons. </p>
</ul>
</ul>
   
   
As of the current version, the only forms of comparisons are these:
As of the current version, the only forms of comparisons are these four:
<p class="syntax">position() <span class="term">relOp integer</span>
   
   
    position() <i><b>relOp integer</b></i>
<span class="term">LocPath relOp "stringLiteral"</span>
   
   
    <i><b>LocPath relOp "stringLiteral"</b></i>
<span class="term">LocPath relOp numericLiteral</span>
    <i><b>LocPath relOp numericLiteral</b></i>
    number(<i><b>[LocPath]</b></i>) <i><b>relOp numericLiteral</b></i>
   
   
number(<span class="term">[LocPath]</span>) <span class="term">relOp numericLiteral</span>
</p>
   
   
Where:
Where:
<dl>
<dl>
<dt>position(<!--thinsp-->)
<dt>position(&thinsp;)
<dd>This function is discussed in [[#Predicates supported in the current version|Predicates supported in the current version]].
<dd>This function is discussed in [[#Predicates supported in the current version|Predicates supported in the current version]].
<dt><i><b>relOp</b></i>
<dt><i>relOp</i>
<dd>One of the comparisons: <tt>=</tt>,  <tt>!=</tt>,  <tt><</tt>,
<dd>One of the comparisons: <tt>=</tt>,  <tt>!=</tt>,  <tt><</tt>,
<tt><=</tt>,  <tt>></tt>,  <tt>>=</tt>
<tt><=</tt>,  <tt>></tt>,  <tt>>=</tt>
<dt><i><b>integer</b></i>
<dd>An integer, whose precision is limited to 15 decimal digits (as in User Language).
<dt><i>integer</i>
<dt><i><b>LocPath</b></i>
<dd>An integer, whose precision is limited to 15 decimal digits (as in SOUL).
<dt><i>LocPath</i>
<dd>An XPath location expression
<dd>An XPath location expression
(that is, production [1], LocPath in [[#XPath supported in the current version|XPath supported in the current version]]).
(that is, production [1], LocPath, in [[#XPath supported in the current version|XPath supported in the current version]]).
As of the current version, such an expression still has the limitation that it
As of the current version, such an expression still has the limitation that it
may not contain a predicate.
may not contain a predicate.
<dt><i><b>"stringLiteral"</b></i>
<dt><i>"stringLiteral"</i>
<dd>A quoted string literal value, which must not exceed 255 bytes.
<dd>A quoted string literal value, which must not exceed 255 bytes.
<p>
The XmlDoc API has always supported ordered string comparisons,
The XmlDoc API has always supported ordered string comparisons,
but the XPath 1 standard does not.
but the XPath 1 standard does not.
For more information about these comparisons,
For more information about these comparisons,
see [[#Ordered string comparisons|Ordered string comparisons]].
see [[#Ordered string comparisons|Ordered string comparisons]]. </p>
<dt><i><b>numericLiteral</b></i>
<dt><i>numericLiteral</i>
<dd>A numeric literal value,
<dd>A numeric literal value,
whose precision is limited to 15 decimal digits (as in User Language).
whose precision is limited to 15 decimal digits (as in SOUL).
For additional format and size limitations, see "[[#Restrictions and limitations|Restrictions and limitations]]".
For additional format and size limitations, see [[#Restrictions and limitations|Restrictions and limitations]].
<p>
For more information about support for comparisons of a Location Path
For more information about support for comparisons of a Location Path
to a numeric literal, see [[#Direct numeric comparison|Direct numeric comparison]].
to a numeric literal, see [[#Direct numeric comparison|Direct numeric comparison]].</p>
<p>
Numeric literals in predicate comparisons are supported in the XmlDoc API
Numeric literals in predicate comparisons are supported in the XmlDoc API.</p>
as of <var class="product">Sirius Mods</var> version 7.0.
<dt>number([<i><b>LocPath</b></i>])
<dd>The number(<!--thinsp-->) function with an optional Location Path argument.
   
   
A comparison using the number(<!--thinsp-->) function is very similar to
<dt>number([<i>LocPath</i>])
<dd>The number(&thinsp;) function with an optional Location Path argument.
<p>
A comparison using the number(&thinsp;) function is very similar to
comparison of a Location Path to a numeric literal ([[#Direct numeric comparison|Direct numeric comparison]]).
comparison of a Location Path to a numeric literal ([[#Direct numeric comparison|Direct numeric comparison]]).
Comparing the result of number(<!--thinsp-->) to a literal gives a result
Comparing the result of number(&thinsp;) to a literal gives a result
according to their relative values and to the comparison operator.
according to their relative values and to the comparison operator.
For example, <tt>shirt[number(@size) > 30]</tt>
For example, <code>shirt[number(@size) > 30]</code>
selects nodes that have a size greater than 30.
selects nodes that have a size greater than 30. </p>
<p>
The significant difference between using a Location Path and using a
The significant difference between using a Location Path and using a
number(<!--thinsp-->) function in a numeric comparison is that the request is
number(&thinsp;) function in a numeric comparison is that the request is
cancelled in the former case if a node in the comparison is non-numeric.
cancelled in the former case if a node in the comparison is non-numeric.
This difference is discussed briefly below and in greater detail
This difference is discussed briefly below and in greater detail
in "[[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]".
in [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]. </p>
<p>
These are the effects of the function's ''LocPath'' argument
These are the effects of the function's ''LocPath'' argument
(they are consistent with the XPath 2 standard):
(they are consistent with the XPath 2 standard): </p>
<ul>
<ul>
<li>If the result of the ''LocPath'' argument is a single node,
<li>If the result of the ''LocPath'' argument is a single node, number(&thinsp;)
number(<!--thinsp-->)
converts the value of the node, after stripping leading and trailing whitespace,
converts the value of the node, after stripping leading and trailing whitespace,
to a number, or to the special value <tt>NaN</tt>
to a number, or to the special value <tt>NaN</tt>
(&ldquo;Not a Number&rdquo;) if the stripped value of the node is not numeric.
("Not a Number") if the stripped value of the node is not numeric.
<li>If the result of the argument has more than one node, the request
<li>If the result of the argument has more than one node, the request
is cancelled.
is cancelled.
For further details, see [[#number(<!--thinsp-->) comparisons that cause request cancellation|number(<!--thinsp-->) comparisons that cause request cancellation]].
For further details, see [[#number(&thinsp;) comparisons that cause request cancellation|number(&thinsp;) comparisons that cause request cancellation]].
<li>If the result of the ''LocPath'' argument is an empty nodeSet, the result of the
<li>If the result of the ''LocPath'' argument is an empty nodeSet, the result of the
number(<!--thinsp-->) function is <tt>NaN</tt>.
number(&thinsp;) function is <tt>NaN</tt>.
<p>
See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]] for examples.
See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]] for examples. </p>
'''Note:'''
<p class="note">'''Note:'''
In XPath 1 (which has no exception conditions),
In XPath 1 (which has no exception conditions),
if there is more than one node in the nodeSet argument,
if there is more than one node in the nodeSet argument,
the value of the first node (in document order) is used.
the value of the first node (in document order) is used. </p>
<li>If you omit the ''LocPath'' argument,
<li>If you omit the ''LocPath'' argument,
the default argument &ldquo;.&rdquo; (the context node) is used; that is,
the default argument "." (the context node) is used; that is,
the node that is being filtered by the predicate gets
the node that is being filtered by the predicate gets
converted to a numeric value.
converted to a numeric value.
<p>
For example: in the following XPath expression, the number(<!--thinsp-->)
For example: in the following XPath expression, the number(&thinsp;)
function converts the value of the <tt>size</tt> Attribute
function converts the value of the <code>size</code> Attribute
to a number:
to a number:</p>
<pre>
<p class="code">/*/shirt/@size[number() > 10]
    /*/shirt/@size[number() > 10]
</p>
</pre>
</ul>
</ul>
   
   
If the number(<!--thinsp-->) result that is
If the number(&thinsp;) result that is
compared to a literal is <tt>NaN</tt>, the comparison is always false (or,
compared to a literal is <tt>NaN</tt>, the comparison is always false (or,
in the case of the <tt>!=</tt> operator, is always true).
in the case of the <tt>!=</tt> operator, is always true).
This is important to note, because it means number(<!--thinsp-->) can be used to
This is important to note, because it means number(&thinsp;) can be used to
avoid the request cancellation to which numeric comparisons are subject
avoid the request cancellation to which numeric comparisons are subject
if the nodes evaluated by a predicate may be non-numeric.
if the nodes evaluated by a predicate may be non-numeric.
For further discussion, see [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]].
For further discussion, see [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]].
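The number(&thinsp;) rules above can be sketched in Python. This is only an illustrative model, not the XmlDoc API implementation: the names <code>number</code>, <code>number_compare</code>, and <code>RequestCancelled</code> are hypothetical, node-sets are modelled as lists of node string values, and the accepted numeric format is assumed to be the plain XPath 1 form (the API may apply further format and size limits).

```python
import math
import operator
import re

# Assumed numeric format: optional minus sign, digits with an optional fraction.
_NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")

_REL_OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


class RequestCancelled(Exception):
    """Hypothetical stand-in for a request cancellation."""


def number(node_values):
    """Model number(LocPath); node_values holds one string per selected node."""
    if len(node_values) > 1:
        raise RequestCancelled("number() argument has more than one node")
    if not node_values:
        return math.nan                    # empty nodeSet gives NaN
    text = node_values[0].strip()          # strip leading/trailing whitespace
    return float(text) if _NUMBER.fullmatch(text) else math.nan


def number_compare(node_values, rel_op, literal):
    """Model number(LocPath) relOp numericLiteral."""
    return _REL_OPS[rel_op](number(node_values), literal)


print(number_compare([" 32 "], ">", 30))   # True
print(number_compare(["M"], "<", 40))      # False: NaN fails every ordered test
print(number_compare(["M"], "!=", 40))     # True: NaN is "not equal" to anything
print(number_compare([], "!=", 1))         # True: empty nodeSet also gives NaN
```

Because a NaN operand makes every comparison false except <code>!=</code>, the model never raises an error for non-numeric data. The only cancellation it reproduces is the one for an argument with more than one node.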
   
   
If you are using number(<!--thinsp-->) to avoid request cancellation for a numeric
If you are using number(&thinsp;) to avoid request cancellation for a numeric
comparison because the nodes evaluated by a predicate may be non-numeric,
comparison because the nodes evaluated by a predicate may be non-numeric,
and you are using the &ldquo;not equals&rdquo; comparison (<tt>!=</tt>),
and you are using the "not equals" comparison (<tt>!=</tt>),
remember that an empty nodeSet argument will give a true comparison result
remember that an empty nodeSet argument will give a true comparison result
(as a consequence of the rule for comparing <tt>NaN</tt>).
(as a consequence of the rule for comparing <tt>NaN</tt>).
You can filter out the nodes included by empty nodeSet comparisons
You can filter out the nodes included by empty nodeSet comparisons
by expanding the locationPath expression from <tt>number(LocPath)</tt> to
by expanding the Location Path expression from <code>number(<i>LocPath</i>)</code> to
<tt>LocPath <b>and</b> number(LocPath)</tt>, as described in
<code><i>LocPath</i></code> <b>and</b> <code>number(<i>LocPath</i>)</code>, as described in
"[[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]]".
[[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]].
   
   
The number(<!--thinsp-->) function is supported as of <var class="product">Sirius Mods</var> version 7.0.
<p class="note">'''Note:'''
'''Note:'''
In the XmlDoc API,
In the XmlDoc API,
the number(<!--thinsp-->) function must be immediately followed by a comparison
the number(&thinsp;) function must be immediately followed by a comparison
operator and a numeric literal.
operator and a numeric literal.
This limitation is not required by XPath 1 or XPath 2.
This limitation is not required by XPath 1 or XPath 2. </p>
</dl>
</dl>
=====Ordered string comparisons=====
If an XPath expression contains a locationPath subexpression and
====Ordered string comparisons====
If an XPath expression contains a Location Path subexpression and
a quoted string with an ordered comparison (that is,
a quoted string with an ordered comparison (that is,
a comparison other than &ldquo;=&rdquo; and &ldquo;!=&rdquo;), the result is based
a comparison other than "=" and "!="), the result is based
on a byte-by-byte ordered comparison between each item of
on a byte-by-byte ordered comparison between each item of
the nodeSet result of the subexpression and the literal string value.
the nodeSet result of the subexpression and the literal string value.
   
   
Consider the following example:
Consider the following example:
<pre>
<p class="code">%nlis = %nod:SelectNodes('order[@date>"2007-01-01"]')
    %nlis = %nod:SelectNodes('order[@date>"2007-01-01"]')
</p>
</pre>
If the value of the <code>date</code> Attribute node of an <code>order</code>
If the value of the <tt>date</tt> Attribute node of an <tt>order</tt>
Element child is,
Element child is,
for example, "2007-05-17", that order Element node will be included in
for example, <code>2007-05-17</code>, that order Element node will be included in
the result.
the result.
   
   
Line 1,287: Line 1,311:
Note, however, that
Note, however, that
most practical ordered comparisons involve numeric values, which are
most practical ordered comparisons involve numeric values, which are
supported in <var class="product">Sirius Mods</var> version 7.0.
supported.
   
   
In XPath 1, any ordered comparison is done by first converting each
In XPath 1, any ordered comparison is done by first converting each
Line 1,295: Line 1,319:
Therefore, the XPath 1 result of the above example would always be
Therefore, the XPath 1 result of the above example would always be
empty, because the literal is a non-numeric value.
empty, because the literal is a non-numeric value.
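The contrast between the two behaviours can be sketched in Python. This is a rough model under stated assumptions: the helper names are hypothetical, the byte-by-byte comparison is modelled on UTF-8 bytes (the encoding actually used is not stated here), and the XPath 1 side simply converts both operands to numbers, with NaN for non-numeric text.

```python
import math


def xmldoc_greater(node_value, literal):
    """Ordered string comparison as described above: byte-by-byte ordering."""
    # Assumption: UTF-8 bytes are compared.
    return node_value.encode("utf-8") > literal.encode("utf-8")


def xpath1_greater(node_value, literal):
    """XPath 1 style: convert both operands to numbers before comparing."""
    def to_number(text):
        try:
            return float(text.strip())
        except ValueError:
            return math.nan
    return to_number(node_value) > to_number(literal)


print(xmldoc_greater("2007-05-17", "2007-01-01"))   # True
print(xmldoc_greater("2006-12-31", "2007-01-01"))   # False
print(xpath1_greater("2007-05-17", "2007-01-01"))   # False: both sides are NaN
```

Dates in <code>yyyy-mm-dd</code> form sort correctly as strings, which is why the byte-wise comparison gives a useful answer for the <code>order[@date>"2007-01-01"]</code> example.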
=====Direct numeric comparison=====
====Direct numeric comparison====
If an XPath expression contains a Location Path subexpression
If an XPath expression contains a Location Path subexpression
compared to a literal numeric value, the result is true if any node in
compared to a literal numeric value, the result is true if any node in
the subexpression result, converted to a numeric value, has the
the subexpression result, converted to a numeric value, has the
specified relationship (&ldquo;=&rdquo;, &ldquo;<&rdquo;, etc.) to the literal
specified relationship ("=", "<", etc.) to the literal value.
value.
   
   
In the following example, an <tt>order</tt> child is in the result if
In the following example, an <code>order</code> child is in the result if
it has an <tt>item</tt> child whose <tt>price</tt> Attribute node is greater
it has an <code>item</code> child whose <code>price</code> Attribute node is greater
than 99.99:
than 99.99:
<pre>
<p class="code">%nlis = %nod:SelectNodes('order[item/@price > 99.99]')
    %nlis = %nod:SelectNodes('order[item/@price > 99.99]')
</p>
</pre>
   
   
For more examples,
For more examples, see [[#xpncmp|Successful direct numeric comparisons]] below.
see "[[#xpncmp|Successful direct numeric comparisons]]" below.
   
   
If any node value used in the direct numeric comparison is non-numeric,
If any node value used in the direct numeric comparison is non-numeric,
the request is cancelled.
the request is cancelled.
For examples, see "[[#xpncan|Direct numeric comparisons that cause request cancellation]]"
For examples, see [[#xpncan|Direct numeric comparisons that cause request cancellation]]
below.
below.
   
   
Numeric comparisons are supported as of <var class="product">Sirius Mods</var> version 7.0.
The discussion that follows makes references to
the following <code>Clothes</code> document:
<p class="code"><Clothes>
  <shirt size="32"  type="dress"  sku="100"/>
  <shirt size="33"  type="sport"  sku="101"/>
  <shirt size="M"   type="sport"  sku="102"/>
  <shirt size="34"  type="frilly" sku="103"/>
</Clothes>
</p>
   
   
The discussion that follows makes references to
the following &ldquo;Clothes&rdquo; document:
<pre>
    <Clothes>
      <shirt size="32"  type="dress"  sku="100"/>
      <shirt size="33"  type="sport"  sku="101"/>
      <shirt size="M"  type="sport"  sku="102"/>
      <shirt size="34" type="frilly" sku="103"/>
    </Clothes>
</pre>
<ol>
<ol>
<div id="xpncmp"></div>
<div id="xpncmp"></div>
Line 1,337: Line 1,357:
the comparison is true if the numeric value, after stripping
the comparison is true if the numeric value, after stripping
leading and trailing whitespace, of any of the nodes
leading and trailing whitespace, of any of the nodes
in the locationPath result has the specified relationship
in the Location Path result has the specified relationship
to the numeric literal.
to the numeric literal.
If none of the nodes has the relationship (which includes the
If none of the nodes has the relationship (which includes the
case that the Location Path result is empty), the result of the
case that the Location Path result is empty), the result of the
comparison is false.
comparison is false.
<p>
For example,
For example,
using the Clothes document described above,
using the <code>Clothes</code> document described above,
the following statement prints <tt>sku="100"</tt>.
the following statement prints <code>sku="100"</code>. </p>
<pre>
<p class="code">%doc:Print('/*/shirt[@size<040]/@sku')
    %doc:Print('/*/shirt[@size<040]/@sku')
</p>
</pre>
<p>
Note that the ''numeric'' value of the node and the
Note that the ''numeric'' value of the node and the
''numeric'' value of the literal are compared,
''numeric'' value of the literal are compared,
so the leading zero in <tt>040</tt> here is ignored.
so the leading zero in <code>040</code> here is ignored.
An equivalent comparison could be <tt>@size<040.00</tt>, etc.
An equivalent comparison could be <code>@size<040.00</code>, etc.
The following statement prints <tt>sku="101"</tt>:
The following statement prints <code>sku="101"</code>: </p>
<pre>
<p class="code">%doc:Print('/*/shirt[@size<40 and @type="sport"]/@sku')
    %doc:Print('/*/shirt[@size<40 and @type="sport"]/@sku')
</p>
</pre>
<p>
The following statement prints <code>sku="103"</code>;
the comparison of the <code>size</code> Attribute is processed
for only one Element, which has a numeric <code>size</code>.
As discussed below, this request would fail if the order
of the attribute subexpressions were reversed. </p>
<p class="code">%doc:Print('/*/shirt[@type="frilly" and @size<40]/@sku')
</p>
   
   
The following statement prints <tt>sku="103"</tt>;
the comparison of the <tt>size</tt> Attribute is processed
for only one Element, which has a numeric <tt>size</tt>.
As discussed below, this request would fail if the order
of the attribute subexpressions were reversed.
<pre>
    %doc:Print('/*/shirt[@type="frilly" and @size<40]/@sku')
</pre>
<div id="xpncan"></div>
<div id="xpncan"></div>
<li>Direct numeric comparisons that cause request cancellation
<li>Direct numeric comparisons that cause request cancellation
Line 1,373: Line 1,391:
a value that is non-numeric after stripping leading and trailing
a value that is non-numeric after stripping leading and trailing
whitespace, the request is cancelled.
whitespace, the request is cancelled.
<p>
For example,
For example, using the <code>Clothes</code> document described above,
using the Clothes document described above,
the following statement causes the request to be cancelled when the size
the following statement causes the request to be cancelled when the size
attribute ("M") of the second sport shirt is compared to the number 40
attribute (<code>M</code>) of the second sport shirt is compared to the number 40
(note the difference between this XPath expression and the last one
(note the difference between this XPath expression and the last one
in "[[#xpncmp|Successful direct numeric comparisons]]" above):
in [[#xpncmp|Successful direct numeric comparisons]] above): </p>
<pre>
<p class="code">%doc:Print('/*/shirt[@size<40 and @type="frilly"]/@sku')
    %doc:Print('/*/shirt[@size<40 and @type="frilly"]/@sku')
</p>
</pre>
<p>
The following statement causes the request to be cancelled (at the same Element),
The following statement causes the request to be cancelled (at the same Element),
because SelectNodes continues after the first selection, unlike the
because SelectNodes continues after the first selection, unlike the
Print example with the same XPath expression
Print example with the same XPath expression
in "[[#xpncmp|Successful direct numeric comparisons]]" above:
in [[#xpncmp|Successful direct numeric comparisons]] above: </p>
<pre>
<p class="code">%sh = %doc:SelectNodes('/*/shirt[@size<40 and @type="sport"]/@sku')
    %sh = %doc:SelectNodes( -
</p>
      '/*/shirt[@size<40 and @type="sport"]/@sku')
<p>
</pre>
If you want the request cancellation to be avoided in statements like these,
If you want the request cancellation to be avoided in statements like these,
consider using the number(<!--thinsp-->) function (see [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]).
consider using the number(&thinsp;) function (see [[#number(LocPath) comparisons for non-numeric data|number(LocPath) comparisons for non-numeric data]]).</p>
</ol>
</ol>
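The behaviour described in the two list items above can be imitated with a small Python model of the <code>Clothes</code> document. It is a sketch under assumptions, not the XmlDoc API: <code>RequestCancelled</code>, <code>direct_lt</code>, and <code>select_skus</code> are hypothetical names, a cancellation is modelled as an exception, and Python's left-to-right <code>and</code> stands in for the order in which the predicate subexpressions are evaluated.

```python
import xml.etree.ElementTree as ET

CLOTHES = """<Clothes>
  <shirt size="32" type="dress"  sku="100"/>
  <shirt size="33" type="sport"  sku="101"/>
  <shirt size="M"  type="sport"  sku="102"/>
  <shirt size="34" type="frilly" sku="103"/>
</Clothes>"""


class RequestCancelled(Exception):
    """Hypothetical stand-in for a request cancellation."""


def direct_lt(value, literal):
    """Direct numeric comparison: a non-numeric node value cancels the request."""
    try:
        return float(value.strip()) < literal
    except ValueError:
        raise RequestCancelled(f"non-numeric value {value!r}") from None


def select_skus(predicate, first_only=False):
    """Collect sku values of matching shirts; first_only imitates Print."""
    skus = []
    for shirt in ET.fromstring(CLOTHES).findall("shirt"):
        if predicate(shirt):
            skus.append(shirt.get("sku"))
            if first_only:
                break
    return skus


def size_then_sport(s):    # [@size<40 and @type="sport"]
    return direct_lt(s.get("size"), 40) and s.get("type") == "sport"


def type_then_size(s):     # [@type="frilly" and @size<40]
    return s.get("type") == "frilly" and direct_lt(s.get("size"), 40)


print(select_skus(size_then_sport, first_only=True))   # ['101']
print(select_skus(type_then_size))                     # ['103']
try:
    select_skus(size_then_sport)    # continues past sku 101 and reaches size "M"
except RequestCancelled as err:
    print("cancelled:", err)
```

In this model, stopping at the first match never reaches the <code>M</code> shirt, and testing <code>@type</code> first means the size comparison only runs for the frilly shirt.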
 
=====number(LocPath) comparisons for non-numeric data=====
====number(LocPath) comparisons for non-numeric data====
The number(<!--thinsp-->) function can often be used to avoid request cancellation
The number(&thinsp;) function can often be used to avoid request cancellation
due to the presence of non-numeric data in a direct numeric comparison.
due to the presence of non-numeric data in a direct numeric comparison.
   
   
For example,
For example, both statements from
both statements from
[[#xpncan|Direct numeric comparisons that cause request cancellation]]
"[[#xpncan|Direct numeric comparisons that cause request cancellation]]"
above can avoid request cancellation
above
can avoid request cancellation
(''when used with the particular document discussed in that section'')
(''when used with the particular document discussed in that section'')
if the locationPath <tt>@size</tt> is &ldquo;converted&rdquo;
if the Location Path <code>@size</code> is "converted"
using the number(<!--thinsp-->) function.
using the number(&thinsp;) function.
   
   
The following statement prints <tt>sku="103"</tt>, even though the <tt>size</tt>
The following statement prints <code>sku="103"</code>, even though the <code>size</code>
Attribute equal to <tt>"M"</tt> is processed before the selected Element:
Attribute equal to <code>M</code> is processed before the selected Element:
<pre>
<p class="code">%doc:Print('/*/shirt[number(@size)<40 and @type="frilly"]/@sku')
    %doc:Print('/*/shirt[number(@size)<40 and -
</p>
      @type="frilly"]/@sku')
</pre>
   
   
Similarly, the following statement succeeds even though the <tt>size</tt>
Similarly, the following statement succeeds even though the <code>size</code>
Attribute equal to <tt>"M"</tt> is processed (and not selected):
Attribute equal to <code>M</code> is processed (and not selected):
<pre>
<p class="code">%sh = %doc:SelectNodes('/*/shirt[number(@size)<40 and @type="sport"]/@sku')
    %sh = %doc:SelectNodes( -
</p>
      '/*/shirt[number(@size)<40 and @type="sport"]/@sku')
</pre>
   
   
Note that comparisons with the number(<!--thinsp-->) function are always
Comparisons with the number(&thinsp;) function are always false for a non-numeric node value, unless the comparison is <code>!=</code>.
false for a non-numeric node value, unless the comparison is
The following statements both print <code>None found</code>:
"!=".
<p class="code">Print %doc:ValueDefault( -
The following statements both print <tt>None found</tt>:
    '/*/shirt[number(@size) < 40 and @sku="102"]/@size', 'None found')
<pre>
    Print %doc:ValueDefault( -
        '/*/shirt[number(@size) < 40 and @sku="102"]/@size', -
        'None found')
   
   
    Print %doc:ValueDefault( -
Print %doc:ValueDefault( -
        '/*/shirt[number(@size) >= 40 and @sku="102"]/@size', -
    '/*/shirt[number(@size) >= 40 and @sku="102"]/@size', 'None found')
        'None found')
</p>
</pre>
   
   
The following, however, prints <tt>M</tt>, the size of the shirt with
The following, however, prints <code>M</code>, the size of the shirt with
this SKU; its size is neither less than 40 nor greater than or
this SKU. Its size is neither less than 40 nor greater than or
equal to 40, so it compares as not equal to 40 (or to any other number):
equal to 40, so it compares as not equal to 40 (or to any other number):
<pre>
<p class="code">Print %doc:ValueDefault( -
    Print %doc:ValueDefault( -
    '/*/shirt[number(@size) != 40 and @sku="102"]/@size', 'None found')
        '/*/shirt[number(@size) != 40 and @sku="102"]/@size', -
</p>
        'None found')
</pre>
<blockquote class="note">
'''Note:'''
<p>'''Note:'''
Before substituting the number(<!--thinsp-->) function into a direct numeric comparison,
Before substituting the number(&thinsp;) function into a direct numeric comparison,
you should be aware of two differences
you should be aware of two differences
between direct numeric comparison and the use of number(<!--thinsp-->):
between direct numeric comparison and the use of number(&thinsp;): </p>
<ul>
<ul>
<li>Although number(<!--thinsp-->) can be used to prevent non-numeric nodes from causing
<li>Although number(&thinsp;) can be used to prevent non-numeric nodes from causing
request cancellation, a different number(<!--thinsp-->) condition can cause cancellation:
request cancellation, a different number(&thinsp;) condition can cause cancellation:
if the argument nodeSet contains more than one node.
if the argument nodeSet contains more than one node.
See [[#number(<!--thinsp-->) comparisons that cause request cancellation|number(<!--thinsp-->) comparisons that cause request cancellation]]
See [[#number(&thinsp;) comparisons that cause request cancellation|number(&thinsp;) comparisons that cause request cancellation]]. </li>
<li>The result of the &ldquo;!=&rdquo; comparison is <tt>true</tt> if the value
of number(<!--thinsp-->)'s argument nodeSet is empty.
<li>The result of the <code>!=</code> comparison is <code>true</code> if the value
See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]], which follows.
of the number(&thinsp;) argument nodeSet is empty.
See [[#number(LocPath) != n, LocPath result is empty node-set|number(LocPath) != n, LocPath result is empty node-set]], which follows. </li>
</ul>
</ul>
=====number(LocPath) != n, LocPath result is empty node-set=====
</blockquote>
When a predicate contains the number(<!--thinsp-->) function followed by
the &ldquo;!=&rdquo; comparison, if the nodeSet result is empty, the result of
====number(LocPath) != n, LocPath result is empty node-set====
When a predicate contains the number(&thinsp;) function followed by
the <code>!=</code> comparison, if the nodeSet result is empty, the result of
the comparison is true; if any other comparison operator is used, the result is
the comparison is true; if any other comparison operator is used, the result is
false.
false.
   
   
For example, consider this document:
For example, consider this document:
<pre>
<p class="code"><t>
    <t>
  <w a="1"/>
      <w a="1"/>
  <x a="PI"/>
      <x a="PI"/>
  <y a="e" b="1"/>
      <y a="e" b="1"/>
  <z a="e" b="2"/>
      <z a="e" b="2"/>
</t>
    </t>
</p>
</pre>
   
   
If you are using a numeric comparison to search for Attribute <tt>a</tt>,
If you are using a numeric comparison to search for Attribute <code>a</code>,
you should use the number(<!--thinsp-->) function to avoid request cancellation,
you should use the number(&thinsp;) function to avoid request cancellation,
because <tt>a</tt> has non-numeric values.
because <code>a</code> has non-numeric values.
The following statement sets the result nodelist to the
The following statement sets the result nodelist to the
Element <tt>w</tt>:
Element <code>w</code>:
<pre>
<p class="code">%nlis = %doc:SelectNodes('/t/*[number(@a) = 1]')
    %nlis = %doc:SelectNodes('/t/*[number(@a) = 1]')
</p>
</pre>
   
   
The following statement sets the result nodelist to the
The following statement sets the result nodelist to the
Elements <tt>x</tt>, <tt>y</tt>, and <tt>z</tt>:
Elements <code>x</code>, <code>y</code>, and <code>z</code>:
<pre>
<p class="code">%nlis = %doc:SelectNodes('/t/*[number(@a) != 1]')
    %nlis = %doc:SelectNodes('/t/*[number(@a) != 1]')
</p>
</pre>
   
   
The <tt>b</tt> Attribute, however, does not have any non-numeric
The <code>b</code> Attribute, however, does not have any non-numeric
values, so it can be used without number(<!--thinsp-->).
values, so it can be used without number(&thinsp;).
Each of the following two statements sets the result nodelist to the
Each of the following two statements sets the result nodelist to the
Element <tt>y</tt>:
Element <code>y</code>:
<pre>
<p class="code">%nlis = %doc:SelectNodes('/t/*[@b = 1]')
    %nlis = %doc:SelectNodes('/t/*[@b = 1]')
%nlis = %doc:SelectNodes('/t/*[number(@b) = 1]')
    %nlis = %doc:SelectNodes('/t/*[number(@b) = 1]')
</p>
</pre>
   
   
However, the following two statements differ in their result:
However, the following two statements differ in their result:
<pre>
<p class="code">%nlis = %doc:SelectNodes('/t/*[@b != 1]')
    %nlis = %doc:SelectNodes('/t/*[@b != 1]')
%nlis = %doc:SelectNodes('/t/*[number(@b) != 1]')
    %nlis = %doc:SelectNodes('/t/*[number(@b) != 1]')
</p>
</pre>
The first sets the result nodelist to the
The first sets the result nodelist to the
Element <tt>z</tt>, while the second includes
Element <code>z</code>, while the second includes
Elements <tt>w</tt> and <tt>x</tt> as well as the Element <tt>z</tt>.
Elements <code>w</code> and <code>x</code> as well as the Element <code>z</code>.
Since they do not contain the <tt>b</tt>
Since they do not contain the <code>b</code>
Attribute, the result of <tt>number(@b)!=1</tt> at elements <tt>w</tt>
Attribute, the result of <code>number(@b)!=1</code> at elements <code>w</code>
and <tt>x</tt> is true.
and <code>x</code> is <code>true</code>.
If you want to make number(&thinsp;) similar to a direct comparison in this
respect, you "and" the Location Path argument with the
number(&thinsp;) factor in the predicate.
So, for example, the following sets the result nodelist to the
Element <code>z</code>, just like the direct comparison approach:
<p class="code">%nlis = %doc:SelectNodes('/t/*[@b and number(@b) != 1]')
</p>
<p class="note">'''Note:'''
The other way in which number(&thinsp;) differs from direct comparison
is described in [[#number(&thinsp;) comparisons that cause request cancellation|number(&thinsp;) comparisons that cause request cancellation]].</p>
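The selections in this section can be checked against a small Python model of the example document. It is only a sketch: <code>number</code> and <code>select</code> are hypothetical helpers, a missing attribute stands in for an empty node-set, and Python's <code>float</code> conversion is assumed to be close enough to the API's numeric conversion for these values.

```python
import math
import xml.etree.ElementTree as ET

DOC = '<t><w a="1"/><x a="PI"/><y a="e" b="1"/><z a="e" b="2"/></t>'


def number(value):
    """Model number(@attr): NaN for a missing or non-numeric attribute."""
    if value is None:                  # empty node-set
        return math.nan
    try:
        return float(value.strip())
    except ValueError:
        return math.nan


def select(predicate):
    """Return the names of the children of <t> that satisfy the predicate."""
    return [el.tag for el in ET.fromstring(DOC) if predicate(el)]


# /t/*[number(@a) = 1]
print(select(lambda el: number(el.get("a")) == 1))                      # ['w']
# /t/*[number(@a) != 1]
print(select(lambda el: number(el.get("a")) != 1))                      # ['x', 'y', 'z']
# /t/*[number(@b) != 1]
print(select(lambda el: number(el.get("b")) != 1))                      # ['w', 'x', 'z']
# /t/*[@b and number(@b) != 1]
print(select(lambda el: el.get("b") is not None
             and number(el.get("b")) != 1))                             # ['z']
```

The last predicate first tests that the attribute exists, which removes the elements that matched only because their empty node-set converted to NaN.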
   
   
If you want to make number(<!--thinsp-->) similar to a direct comparison in this
====number( ) comparisons that cause request cancellation====
respect, you &ldquo;and&rdquo; the locationPath argument with the
When a predicate contains the number(&thinsp;) function,
number(<!--thinsp-->) factor in the predicate.
So, for example,
the following sets the result nodelist to the
Element <tt>z</tt>, just like the direct comparison approach:
<pre>
    %nlis = %doc:SelectNodes('/t/*[@b and number(@b) != 1]')
</pre>
'''Note:'''
The other way in which number(<!--thinsp-->) differs from direct comparison
is described in [[#number(<!--thinsp-->) comparisons that cause request cancellation|number(<!--thinsp-->) comparisons that cause request cancellation]].
=====number(<!--thinsp-->) comparisons that cause request cancellation=====
When a predicate contains the number(<!--thinsp-->) function,
the request is cancelled if the value of
the request is cancelled if the value of
the nodeSet argument to the number(<!--thinsp-->) function has more than one node.
the nodeSet argument to the number(&thinsp;) function has more than one node.
   
   
For example, consider this document:
For example, consider this document:
<pre>
<p class="code"><t>
    <t>
  <x a="1" b="2"/>
      <x a="1" b="2"/>
  <y b="pi" a="3.14159265"/>
      <y b="pi" a="3.14159265"/>
</t>
    </t>
</p>
</pre>
   
   
If you are searching for all Elements that have any Attribute
If you are searching for all Elements that have any Attribute
greater than 1, you can use the locationPath <tt>@*</tt> as
greater than 1, you can use the Location Path <code>@*</code> as
a wildcard comparison for any Attribute.
a wildcard comparison for any Attribute.
However, you cannot use direct comparison, because some of the
However, you cannot use direct comparison, because some of the
attributes are non-numeric.
attributes are non-numeric.
   
   
So, you might try to use <tt>number(@*)</tt>, as in the
So, you might try to use <code>number(@*)</code>, as in the
following example:
following example:
<pre>
<p class="code">%nlis = %doc:SelectNodes('/t/*[number(@*) > 1]')
    %nlis = %doc:SelectNodes('/t/*[number(@*) > 1]')
</p>
</pre>
   
   
However, this will cause a request cancellation, because the value of
<code>@*</code> contains more than one node.
In such situations,
you must decide which node is to be converted
to a number for the comparison.
In this case, you probably want to use:
<p class="code">%nlis = %doc:SelectNodes( -
  '/t/*[number(@a) > 1 or number(@b) > 1]')
</p>
This will set the result nodelist to Elements <code>x</code>
and <code>y</code>.
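
To make the behavior concrete, here is a minimal Python sketch that emulates the two predicates against the sample document. It is not SOUL and does not use the XmlDoc API: the helpers <code>number_of</code> and <code>attr</code>, and the use of a Python exception to stand in for request cancellation, are assumptions made for illustration only. Python's <code>float</code> also accepts a few forms that XPath does not.

```python
import math
import xml.etree.ElementTree as ET

doc = ET.fromstring('<t><x a="1" b="2"/><y b="pi" a="3.14159265"/></t>')

def number_of(values):
    """Emulate number( ) on a node set given as a list of attribute values."""
    if len(values) > 1:
        # Stand-in for the request cancellation described above
        raise ValueError("number( ) argument has more than one node")
    if not values:
        return math.nan              # empty node set
    try:
        return float(values[0])
    except ValueError:
        return math.nan              # non-numeric value such as "pi"

def attr(element, name):
    """Node set for @name: zero or one attribute value."""
    return [element.attrib[name]] if name in element.attrib else []

# /t/*[number(@*) > 1] : @* holds two nodes for each element, so this fails
try:
    [el.tag for el in doc if number_of(list(el.attrib.values())) > 1]
    wildcard_failed = False
except ValueError:
    wildcard_failed = True

# /t/*[number(@a) > 1 or number(@b) > 1]
selected = [el.tag for el in doc
            if number_of(attr(el, "a")) > 1 or number_of(attr(el, "b")) > 1]
print(wildcard_failed, selected)
```

In this sketch the wildcard form fails on the first Element, while the explicit form selects <code>x</code> (because of <code>b="2"</code>) and <code>y</code> (because of <code>a="3.14159265"</code>); the non-numeric <code>b="pi"</code> simply compares as false.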
 
[[Category: Janus SOAP]]
[[Category:Overviews]]

Latest revision as of 14:20, 29 May 2019

This article has information to help you use XPath arguments to various XmlDoc API methods. Most of the information is taken from the XPath 1 standard, which is the authoritative reference.

References to Version 2 of the XPath standard in this manual are to the XPath 2 standard, which became a W3C Recommendation on January 23, 2007.

The five sections in this topic explain, respectively, the following:

  1. How to understand the components of an XPath expression, which is composed of steps.
  2. The syntax of XPath.
  3. Some subtle aspects of XPath.
  4. Specific XPath axis combinations to avoid.
  5. The subset of XPath supported by the current version of the XmlDoc API.

XPath operation

The purpose of XPath is to select a subset of nodes from a document. This selection is done using an expression, as described by the PathExpr production (these syntax productions for XPath are shown in XPath syntax). The simple form (that is, without parentheses) of a PathExpr expression is called a Location Path (the LocPath production ([1]) in the syntax).

A Location Path consists of a series of Steps (Step ([4]) production). Each Step operates by taking an input set of nodes from the preceding step, and creating an output set of nodes. The output of the last step is the set of nodes selected by the XPath expression.

An example XPath expression is:

pitm[2]/partnum

This expression contains two Steps (the slash symbol ( / ) is used to separate the Steps in a Location Path).

Often a Step will start with an element name, which selects all the child elements with that name. In the above example, partnum children of pitm elements are selected. These child relationships are one kind of relationship between the input to a Step and the first part of the algorithm; the kinds of relationships, or Axes, are shown in the AxisName ([6]) production.

The element names in the above example are a form of NodeTest, described by the NodeTest ([7]) production. The NodeTest is used to restrict the set of nodes.

Square brackets ([ ]) in a Step surround another form of restriction, which is called a Predicate, given by the production ([8]) with the same name. A Predicate is a much more open-ended type of restriction, allowing various functions and operations, including Booleans.

The operation of a Step is as follows:

  1. A Step consists of an Axis, NodeTest, and zero or more Predicates.
  2. The input to a Step is a set of context nodes.
  3. The Axis produces sets of nodes, one set for each context node.
  4. Each of these sets is filtered by the NodeTest.
  5. Each of the resulting sets is filtered by the first Predicate.
  6. Each of the sets which are output by a Predicate is filtered by the following Predicate, if any.
  7. The final filtered sets are combined (using set union), and the result is the set of nodes that becomes input to the following Step.
  8. The result of the final Step is the result of the Location Path.
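
The eight points above can be sketched in a few lines of Python. This is only an illustration of the algorithm, not the XmlDoc API implementation: the helper names (child_axis, name_test, position_predicate, evaluate_step) and the sample order document are assumptions made for this example, and only the child axis, a name NodeTest, and a numeric Predicate are modeled.

```python
import xml.etree.ElementTree as ET

def child_axis(node):
    """Axis: the candidate nodes for one context node (here, its element children)."""
    return list(node)

def name_test(name):
    """NodeTest: keep only the nodes with the given element name."""
    return lambda nodes: [n for n in nodes if n.tag == name]

def position_predicate(k):
    """Numeric Predicate: keep the k-th node (1-based) of the set being filtered."""
    return lambda nodes: nodes[k - 1:k]

def evaluate_step(context_nodes, axis, node_test, predicates=()):
    result = []
    for ctx in context_nodes:              # the Axis yields one set per context node
        nodes = node_test(axis(ctx))       # each set is filtered by the NodeTest
        for pred in predicates:            # then by each Predicate in turn
            nodes = pred(nodes)
        for n in nodes:                    # set union of the filtered sets
            if not any(n is seen for seen in result):
                result.append(n)
    return result

doc = ET.fromstring(
    "<order>"
    "<pitm><partnum>A1</partnum></pitm>"
    "<pitm><partnum>B2</partnum></pitm>"
    "</order>"
)

# pitm[2]/partnum, evaluated with the order element as the initial context node
step1 = evaluate_step([doc], child_axis, name_test("pitm"), [position_predicate(2)])
step2 = evaluate_step(step1, child_axis, name_test("partnum"))
selected = [n.text for n in step2]
print(selected)   # ['B2']
```

The output of the first Step (the second pitm element) becomes the set of context nodes for the second Step, whose output is the result of the whole Location Path.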

Axes

The various forms of the AxisName ([6]) production generate nodes based on a context node using the simple tree relationships described by the name. For example, attribute:: (abbreviated as @, the at sign) generates the set of all attributes of a context node.

The XmlDoc API supports the following axes (be sure to also read Performance considerations: Document order, certain axes):

ancestor
Contains the parent of the context node, its parent, and so on, to and including the Root node of the XmlDoc.
ancestor-or-self
The same contents as ancestor, except that it also includes the context node.
attribute
Contains the attributes of the context node, which must be an element.
child
Contains the children of the context node.
descendant
Contains the context node's children, their children, and so on. Since the children of a node do not include its attributes, this axis does not include any attributes (so, this is not equivalent to a "sub-tree").
descendant-or-self
The same contents as descendant, except that it also includes the context node. Specified by //, an abbreviation for the step consisting of descendant-or-self::node(), this axis can be used, for example, to locate an element by its name, if the "path" to it is not known:

//foo

following
Contains all the nodes that are after the context node in document order, excluding any descendants and excluding attribute nodes.
following-sibling
Contains all the siblings of the context node that are positioned after that node. Contains no siblings if the context node is an attribute.
parent
Contains the parent of the context node. Each node has only one parent, except the root node, which has no parent.
preceding-sibling
Contains all the siblings of the context node that are positioned before that node. Contains no siblings if the context node is an attribute.
self
Contains the context node itself.

The following XPath axes are not supported in the current version:

namespace
In keeping with the XPath 2 recommendation, Sirius does not plan to support this axis at any time. You can obtain the information provided by namespace declarations by using certain XmlDoc API methods, for example, the URI function on an XmlNode.
preceding
Support for this axis may be added in a later version.

NodeTests

The various forms of the NodeTest ([7]) production filter nodes as follows:

NodeType '(' ')'
This selects any node that has the respective node type; for example, comment() selects all Comment nodes in a node set.
'processing-instruction' '(' Lit ')'
This selects any Processing Instruction node, if the target name is equal to the value of Lit.
'*' | NCName ':' '*' | QName
These forms test the name of a node, after restricting the type of node to the "principal node type" of the Axis, as follows:
  • Name tests in the attribute:: Axis restrict to Attribute nodes.
  • Name tests in any other Axis restrict to Element nodes.

The name tests then filter the resulting nodes as follows:

'*'
This selects a node of the selected type regardless of the node's name. The "selected type" is the principal node type of the node subset selected by the preceding Axis. The default Axis type is child, so the default node type is Element.
NCName ':' '*'
This selects a node of the selected type if the node has an associated namespace equal to the URI associated with NCName.
QName
This selects a node of the selected type if the node has the same name as QName.

Namespace URI associations and QName equality are discussed in Names and namespaces.

Predicates

Each nodeSet that is the result of NodeTest filtering is input to the series of Predicates in the Step. Each Predicate's result sets are passed to the following one, and a union of the results of the last Predicate (or the NodeTest, if there are no Predicates) forms the result of the Step.

There are a variety of Predicates, and except for a numeric Predicate, a Predicate selects a node if the value of the Predicate, converted to a Boolean, is true.

Within a single step:

  • Multiple Predicates are allowed.
  • Location paths within Predicates can themselves contain Predicates.

Common forms of predicates use Location Path expressions, XPath functions, or a combination of these. For example, the following path selects all contact children of the second cust element that have a fax child element:

/active/cust[position()=2]/contact[fax]

Location Paths (described in XPath operation) in a predicate can be any supported Location Path, including multi-step and absolute expressions.

You can also use a location path or the number( ) function, followed by a comparison operator and literal, as a predicate of an XPath expression, as described in Comparison tests in predicates.

For a description of the XPath predicates currently supported in the XmlDoc API, see Predicates supported in the current version.

Functions

There are many functions defined for XPath; this section merely gives a sample of some of them. Furthermore, as of the current version, many of these are not supported. See XPath functions supported in the current version for a list of the XPath functions currently supported. All supported functions in the following list are shown as a link to the section which may contain additional notes for using the function in the XmlDoc API.

In the XmlDoc API, XPath functions are used only in predicates.

Here are some XPath functions that return a numeric result:

last( )
Returns the size (number of nodes) of the set that the predicate is filtering. Not supported in the XmlDoc API.
position( )
Returns the position of the node in the set that the predicate is filtering. The position in the set is dependent on the Axis in effect for the Predicate. The following axes use the reverse of document order to arrange the nodes in a node set:
  • ancestor
  • ancestor-or-self
  • preceding
  • preceding-sibling

All other axes use document order to arrange the nodes in a node set.

Note: If a PathExpr is parenthesized and followed by Predicates, and if position( ) is used in those predicates, the axis in effect is the child axis.

count(nodeSet)
Returns the number of nodes in the argument nodeSet. Notice that this function uses a nodeSet type argument. As in this example, a LocPath can be passed:

/book/chapter[count(section) >= 3]

This expression will select all chapters that have three or more sections.

Not supported in the XmlDoc API.

number(object?)
Returns the numeric value of the argument (after stripping leading and trailing blanks if it is a string object). If the argument cannot be converted to a number, it returns the special value "NaN" ("Not A Number"), which is not equal to any other object (including another expression whose result is NaN). The default argument is the context node (of the "containing" expression, which, for our purposes, is the context node of the predicate containing the number( ) function). If the argument is a nodeSet, the value of the first node (in document order) of the argument nodeSet is converted to a numeric value as above. If the argument is the empty nodeSet, NaN is returned.
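
As a rough illustration of these conversion rules, the following Python sketch mimics number( ) for string arguments under the XPath 1 definition. The function name xpath_number and the regular expression are assumptions for this example, not part of the XmlDoc API, and the sketch does not model the E-format values that the XmlDoc API also accepts.

```python
import math
import re

# XPath 1 Number syntax, with optional surrounding whitespace and a leading minus sign
_NUMBER = re.compile(r"[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")

def xpath_number(text):
    """Return the numeric value of a string, or NaN if it is not a number."""
    if text is not None and _NUMBER.fullmatch(text):
        return float(text)
    return math.nan   # also used here for an empty node set (text is None)

pi_value = xpath_number("  3.14159265 ")
not_a_number = xpath_number("pi")

print(pi_value > 1)                   # True
print(not_a_number > 1)               # False: ordering comparisons with NaN are false
print(not_a_number == not_a_number)   # False: NaN is not equal to anything, itself included
print(not_a_number != 1)              # True
```

The last line shows why a != comparison can succeed for a non-numeric value: NaN is unequal to every number.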

Here are some XPath functions that return a string result:

string(object?)
This function, like several other XPath functions, allows different kinds of arguments: for example, it can be used to convert a number to a string. The string( ) function is implicitly used when a comparison is made between a node and a string value, for example:

/book/chapter[@title = 'Introduction']

In this case, each node in the node set that is the result of the @title PathExpr is converted using string( ), then compared to the string literal Introduction.

The default argument of the string( ) function is the context node of the expression. The string( ) function, when given a node set argument, uses the string value of the first node, in document order, of that node set.

Not supported in the XmlDoc API.

substring(string, number, number?)
Returns the substring of the first argument, starting at the position specified by the second argument, and for the number of characters specified by the third argument (or the remainder of the string, if the third argument is omitted). As with most XPath expressions, conversions are freely done, so if the first argument is not a manifest string type, it is converted to one using the string( ) function. Not supported in the XmlDoc API.

This XPath function returns a boolean result:

not(boolean)
Returns true if its argument is false, and false otherwise. For example, a predicate containing not(paragraph[3]) selects a node if that node does not have a third paragraph child.

XPath syntax

This section contains a version of the XPath syntax. See XML syntax for an explanation of the syntax conventions. The syntax below has been changed from that in the XPath Recommendation in these ways:

  • Names of some non-terminals have been changed (for example, “Lit” rather than “Literal”).
  • Some productions have been collapsed. This introduces superficial ambiguity that is dealt with as needed, for example, showing the precedence of operators.
  • A few of the productions have been moved, to illustrate that the PathExpr is the syntax goal. It is the form of expression that selects a set of nodes, which is the purpose of XPath in the XmlDoc API. In other places in this manual, an “XPath expression” means a PathExpr.

For a "cross-reference" to the productions as contained in the XPath Recommendation, see XPath syntax cross-reference.

See also XPath supported in the current version.

 [A19] PathExpr ::= LocPath
          | PrimaryExpr Predicate+
          | PrimaryExpr Predicate* '/' RelativeLocPath
          | PrimaryExpr Predicate* '//' RelativeLocPath
 
 [B15] PrimaryExpr ::= Variable | Lit | Number | FunctionCall
          | '(' UnaryExpr ')' | '(' Expr ')'
 
 [C27] UnaryExpr ::= PathExpr ('|' PathExpr)*
          | '-' UnaryExpr
          | PrimaryExpr
 
 [1]  LocPath ::= RelativeLocPath | AbsoluteLocPath
 
 [2]  AbsoluteLocPath ::= '/' RelativeLocPath?
          | '//' RelativeLocPath
 
 [3]  RelativeLocPath ::= Step ('/' Step)*
          | RelativeLocPath '//' Step
 
 [4]  Step ::= '.'     /* self::node()  */
          | '..'        /* parent::node() */
          | (AxisName '::' | '@')? NodeTest Predicate*
 
 [6]  AxisName ::= 'ancestor'
         | 'ancestor-or-self'    | 'attribute'
         | 'child'               | 'descendant'
         | 'descendant-or-self'  | 'following'
         | 'following-sibling'   | 'namespace'
         | 'parent'              | 'preceding'
         | 'preceding-sibling'   | 'self'
 
 [7]  NodeTest ::= '*' | NCName ':' '*' | QName
          | NodeType '(' ')'
          | 'processing-instruction' '(' Lit ')'
 
 [8]  Predicate ::= '[' Expr ']'
 
 [16] FunctionCall ::= FunctionName '(' ( Expr ( ',' Expr )*)? ')'
 
 [21] Expr ::= EqExpr ( ('and' | 'or') EqExpr )*
 
 [23] EqExpr ::= RelExpr ( ('=' | '!=') RelExpr )*
 
 [24] RelExpr ::= NumExpr ( ('<' | '>' | '<=' | '>=') NumExpr )*
 
 [25] NumExpr ::= UnaryExpr ( ('+' | '-' | '*' | 'div' | 'mod') UnaryExpr )*
 
 [29] Lit ::= '"' [^"]* '"' | "'" [^']* "'"
 
 [30] Number ::= [0-9]+ ('.' [0-9]*)?
 
 [35] FunctionName ::= QName - NodeType
 
 [36] Variable ::= '$' QName
 
 [38] NodeType ::= 'comment' | 'text' | 'processing-instruction' | 'node'

XPath syntax notes

For information about XPath functions, see Functions.

  • In [A19], [2], and [3], the double slash (//) is an abbreviation for

    /descendant-or-self::node()/

  • When an at sign (@) is used in a Step ([4]), it is an abbreviation for

    attribute::

  • The syntax for Step ([4]) notes that it may begin directly with a NodeTest. In that case, child:: is implied before the NodeTest.
  • The syntax for QName and NCName are given in Name and namespace syntax.
  • The precedence of expressions from Expr ([21]), EqExpr ([23]), RelExpr ([24]), and NumExpr ([25]) is as follows (lowest precedence first):
    1. or (Short-circuit evaluation)
    2. and (Short-circuit evaluation)
    3. =, != (On node sets geared to one operand singleton)
    4. <=, <, >=, > (Node sets geared to one operand singleton; no string ordering)
    5. +, -
    6. *, div, mod

    The operators are all left-associative. For example, 3 > 2 > 1 is the same as (3 > 2) > 1, which evaluates to false.

  • The only forms of PrimaryExpr ([B15]) that can create node sets, and so be used in a PathExpr ([A19]), are the id( ) function and parenthesized LocPath ([1]) (or parenthesized unions of them, with | ([C27])).
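
The left-associativity example in the precedence note above depends on XPath converting the Boolean result of the first comparison to a number (true becomes 1) before the second comparison is made. A small Python sketch of that rule follows; the helper xpath_gt is hypothetical and models only numeric and Boolean operands.

```python
def xpath_gt(left, right):
    """XPath-style '>' for numbers and Booleans: both operands are converted to numbers."""
    return float(left) > float(right)

# 3 > 2 > 1 is parsed as (3 > 2) > 1
inner = xpath_gt(3, 2)        # True, which converts to 1
result = xpath_gt(inner, 1)   # 1 > 1
print(result)                 # False
```

Python's own chained comparison 3 > 2 > 1 means (3 > 2) and (2 > 1), which is true, so the helper is needed to reproduce the XPath reading.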

See also the notes in XPath supported in the current version.

XPath syntax cross-reference

Here is a listing of all first productions contained in various numbered sections and unnumbered subsections from the XPath Recommendation. This may be helpful if you want to cross-reference the productions shown in XPath syntax with those in the XPath Recommendation.

Section in XPath Recommendation    Production number and name
2 Location Paths                   [1] LocationPath
2.1 Location Steps                 [4] Step
2.2 Axes                           [6] AxisName
2.3 Node Tests                     [7] NodeTest
2.4 Predicates                     [8] Predicate
2.5 Abbreviated Syntax             [10] AbbreviatedAbsoluteLocationPath
3.1 Basics                         [14] Expr
3.2 Function Calls                 [16] FunctionCall
3.3 NodeSets                       [18] UnionExpr
3.5 Numbers                        [25] AdditiveExpr
3.7 Lexical Structure              [28] ExprToken
. . .                              [39] ExprWhitespace (last production)

Some notes on XPath usage

The following subsections describe some subtle issues in XPath, which the XmlDoc API implements exactly as specified in the recommendations.

Note: Predicates in XPath expressions, which are widely used, are enclosed within square bracket characters ([ ]). However, simply typing square bracket characters in your XPath expressions risks invalid character errors arising from a mismatch between a 3270-keyboard codepage and the codepage set for the Online with the UNICODE command.

Model 204 7.6 maintenance added XHTML entities for left and right square brackets in response to this problem. These entities are used in some of the predicate examples shown below.

"//" and "."

As mentioned in Performance considerations: Document order, certain axes, the descendant-or-self axis (commonly appearing in an XPath expression with the // abbreviation) should generally be avoided, due to possibly incurring extra CPU and DKRD overhead. In addition, if you have verified the performance questions, be sure that you understand the meaning of //. For example:

  • The expression //chapter[1] is not the first chapter in the document; it is the first chapter child of each element in the document.
  • An XPath expression that begins with // is an absolute expression; if you want to use it at the start of a relative XPath expression, make use of the . step:

    %chapter = %doc:SelectSingleNode('/book/chapter[title="Aerobics"]')
    %sections = %chapter:SelectNodes('.//section')
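
The first point can be checked with any XPath implementation. The sketch below uses Python's standard xml.etree.ElementTree module, which supports only a small XPath subset and is not the XmlDoc API; the sample book document and its id attributes are invented for the illustration. ElementTree does not accept an absolute path on an element, so the expression is written with a leading . step.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book>"
    "<part id='p1'><chapter id='c1'/><chapter id='c2'/></part>"
    "<part id='p2'><chapter id='c3'/><chapter id='c4'/></part>"
    "</book>"
)

# .//chapter[1] : the first chapter child of each element that has chapter children
first_per_parent = [c.get("id") for c in book.findall(".//chapter[1]")]
print(first_per_parent)    # ['c1', 'c3']

# The first chapter in the whole document is a different question
first_in_document = next(book.iter("chapter")).get("id")
print(first_in_document)   # c1
```

Two chapters are selected, one per part, which is why //chapter[1] should not be read as "the first chapter in the document".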

Attributes: not children, and excluded from most axes

One subtle point to observe is that attributes are not children of their parents! As stated in the XPath Recommendation:

  • 2.2 Axes
  • . . .
    • the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes

Also, Attributes cannot be parents, and they are explicitly excluded from some axes.

The only axes which may include Attribute nodes are:

  • ancestor-or-self
  • attribute (abbreviated @)
  • descendant-or-self (abbreviated //)
  • self

Note that the child axis is not in the above list, and so it will never include an Attribute node; thus, since child is the default axis, an Attribute node will never be the result of a step which does not explicitly include one of the above (abbreviated or unabbreviated) axes.

Order of nodes: node sets versus nodelists

The order of nodes in an XML document is the order in which the node (or its start-tag, in the case of an Element node) first occurs in the serial form of the document. Thus, in document order, the Root node is first, an Element node occurs before its Attribute nodes and Namespace declarations, which appear before the children of the Element, and so on.

For the sake of simplicity and to be consistent with XSLT, the order of nodes in an XmlNodelist in the XmlDoc API is also document order.

In XPath, however, document order is not always used. In an XPath expression step, the axis implies an order. This order is important for the position( ) and last( ) predicate functions, which filter a node based on its order in a node set. This order is the same as document order, except for the following axes (for which the order is the reverse of document order):

  • ancestor
  • ancestor-or-self
  • preceding (not supported by the XmlDoc API)
  • preceding-sibling

For example, consider the following document:

<top>
  <a/>
  <b/>
  <c/>
  <d/>
  <e/>
</top>

Using XPath to select the first of the "following" siblings (/*/c/following-sibling::*[1]) yields the equivalent of the first element in document order: d. However, selecting the first of the preceding siblings (/*/c/preceding-sibling::*[1]) yields the first element in reverse document order: b.

This reverse ordering is apparent in some contexts but not in others. For example, the XPath expression in the following statement is used against the document above (call it %doc) to select nodes into an XmlNodelist:

%nodelist = %doc:SelectNodes('/*/c/preceding-sibling::*')

The XmlDoc API (re)arranges the two found nodes in %nodelist into document order: a and b, in that order, and the following statement selects b, the second of the nodes:

Print %nodelist:Item(2):LocalName

Yet, if you use the following statement (instead of the previous two) in an attempt to directly select the "second" preceding sibling, the result is node a:

Print %doc:LocalName('/*/c/preceding-sibling::*[2]')

The Item method, above, selects the second node in the set of document-ordered nodes in the XmlNodelist. The position( ) function above (the 2 within the brackets) selects the second node in the set of reverse-document-ordered nodes passed from the preceding-sibling axis after filtering by the * NodeTest.
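
The difference between axis order and nodelist order can be mimicked with plain Python lists. This is a sketch for illustration only: it models the sibling axes by slicing the list of children of a top element with children a through e, and the variable names are assumptions, not XmlDoc API names.

```python
import xml.etree.ElementTree as ET

top = ET.fromstring("<top><a/><b/><c/><d/><e/></top>")
names = [el.tag for el in top]      # children in document order
i = names.index("c")

# Axis order, as seen by position( ): reverse document order for preceding-sibling
preceding_axis = list(reversed(names[:i]))   # ['b', 'a']
following_axis = names[i + 1:]               # ['d', 'e']

# Nodelist order: always document order, whatever the axis was
nodelist = sorted(preceding_axis, key=names.index)   # ['a', 'b']

print(following_axis[0])   # d, like following-sibling::*[1]
print(preceding_axis[0])   # b, like preceding-sibling::*[1]
print(preceding_axis[1])   # a, like preceding-sibling::*[2]
print(nodelist[1])         # b, like %nodelist:Item(2)
```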

Performance considerations: Document order, certain axes

This section discusses the performance implications of evaluating certain XPath expressions. The expressions of concern have a common characteristic — they are not simple XPath expressions.

Simple XPath expressions, which have no special performance considerations, are any of these:

  • One or more steps containing only child, attribute, or self axes.
  • A parent axis used in the first step, after which may be one or more steps containing only child, attribute, or self axes.
  • A following-sibling axis used alone.

The rest of this section considers XPath expressions that are not simple and therefore might have negative performance implications. If your use of XPath is confined to the simple expressions defined above, the following discussion is not your concern.

The XPath expression arguments of methods like SelectNodes and UnionSelected in the XmlNodelist class designate a set of nodes. In addition to these "set-valued" methods, XPath expressions can be used in many XmlDoc API methods (SelectSingleNode, Value, DeleteSubtree, QName, and more) to operate on a single node that satisfies the expression. For a simple XPath expression (described above), the "single node" methods may be able to determine the desired node by scanning fewer nodes, giving better performance than the set-valued methods for the same expression.

The single-node XPath selection by the XmlDoc API returns the first node in document order, but with non-simple XPath expressions, this is not the same as the first node found by the XPath internal selection algorithm, which may visit nodes in a different order. In those cases, an entire subtree is examined to determine the first node, in document order, that the XPath expression selects.

In other words, given an XPath expression expr that uses any of the axis cases described below in The extra-processing expressions and given any single-node selection method XMeth, this expression:

%obj:XMeth(expr)

scans as many nodes as:

%obj:SelectNodes(expr):Item(1):XMeth

For all other XPath expressions, the number of nodes scanned by the first of these two approaches may be significantly lower, because the first node internally selected will also be the first node in document order.

For example, consider the following document:

<top> <a> </a> </top>

When, say, Value('/*/*/b/@x') is evaluated, the document search ends when the first match is found (and the Value method returns 1).

But when Value('//b/@x') is evaluated, the document search first finds the match x=2, then it continues searching the entire document for all matches, to ensure that the match which is lowest in document order (x=1) is the result.

The performance implications of the expressions that involve extra processing apply to the set-valued methods as well. The set methods must produce their results in document order, but the nodes selected during XPath evaluation may be selected in an order (due to the selection algorithm) that differs from document order.

The extra-processing expressions

Extra processing can occur in the following cases:

  1. The presence of the preceding-sibling axis
  2. The presence of the ancestor axis
  3. The presence of the ancestor-or-self axis
  4. The presence of the following axis, if it is not the first step in the expression
  5. The presence of any of these axis-combinations:
    • One of the following axes:
      • descendant
      • descendant-or-self
      • following
      • parent, if it is not the first step in the expression

      followed, in a subsequent step, by any of these axes:

      • parent
      • child
      • following-sibling
      • descendant
      • descendant-or-self
    • The parent axis, if it is not the first step in the expression, followed, in a subsequent step, by the
      • attribute axis.

In addition to the cost of the actual XPath search performed with the above expressions, they can incur an additional "one-time" cost for XPath evaluation. If the document has been modified in such a way that the internal order of the nodes cannot be guaranteed to be the same as document order (this will always happen with any of the XML Insert..Before methods, and usually will happen with any of the Add.. methods), then the entire document (not only the subtree being searched) must be scanned so that the order is adjusted. This does not involve any internal movement of the nodes, but it does require a full scan. This adjustment "fixes up" the XmlDoc and it will remain "fixed up" for subsequent XPath searches, until such time as the XmlDoc is subsequently updated in such a way that the internal order does not guarantee document order.

Note:

  1. One important exception to the above rules is the descendant-or-self::node() step followed immediately by the child axis without any predicate. An example of the usual way to specify this is:

    //chapter

    In this case, the internal node selection algorithm operates in document order, and no extra processing is incurred.

    Even with this special case, it is better to avoid the descendant-or-self step (specified explicitly or by using //) if your document structure lends itself to explicitly specifying the "intermediate" elements with * (or even better, with their names) that should be matched.

  2. The considerations described in this section only apply to the "outer" XPath expression; they do not apply to any expression within a predicate. Although it is still better, for the sake of efficiency, to prune the search by explicitly specifying "intermediate" elements rather than using //, there is no efficiency concern due to the internal order of node selection with an XPath predicate such as the following:

    Print %d:Value('/book/chapter' With '[.//credit/details/@auth="Dave"]')

  3. In conclusion, except when you must use the "//chapter" exception discussed in Note 1, above, avoid these extra-processing axes and axis combinations (especially in outer XPath expressions) if your documents are relatively large and performance is a consideration.

XPath supported in the current version

This section contains a condensed excerpt of the XPath syntax, showing only those parts of XPath used in the current version. It also explains any differences in the result of XPath expressions in the XmlDoc API versus that specified in the XPath 1 standard.

See also:

In most cases, the syntax below is a subset of the XPath 1 standard; the syntax of NumericLit ([30]), however, is an extension of XPath 1, whose comparable production (Number, [30]) is limited to the production below of DecimalNumber ([30d]).

 [1]  LocPath ::= RelativeLocPath
         | '/' RelativeLocPath?
 
 [3]  RelativeLocPath ::= Step ('/' Step)*
 
 [4]  Step ::= '.'     /* self::node()  */
         | '..'        /* parent::node() */
         | (AxisName '::' | '@')? NodeTest Predicate*
 
 [6]  AxisName ::= 'ancestor'
        | 'ancestor-or-self'    | 'attribute'
        | 'child'               | 'descendant'
        | 'descendant-or-self'  | 'following'
        | 'following-sibling'   | 'parent'
        | 'preceding-sibling'   | 'self'
 
 [7]  NodeTest ::=  '*'  |  NCName ':' '*'  |  QName
         | 'node()' | 'comment()'  | 'text()'
         | 'processing-instruction(' StringLit? ')'
 
 [8]  Predicate ::= '[' PredExpr ']' | 'not' '(' PredExpr ')'
 
      PredExpr ::= 'position()' CmpOp PositiveInteger
         | PositiveInteger  /* “simple position test” */
         | PathExpr    /* “Existence test” */
         | Comparison
         | PredExpr ('and' | 'or') PredExpr
         | '(' PredExpr ')'
 
      Comparison ::= PathExpr CmpOp StringLit
         | PathExpr CmpOp NumericLit
         | 'number' '(' PathExpr? ')' CmpOp NumericLit
 
      CmpOp ::= '=' | '!=' | '<' | '>' | '<=' | '>='
 
 [29] StringLit ::= '"' [^"]* '"' | "'" [^']* "'"
 
 [30] NumericLit ::= DecimalNumber ('E' ('+' | '-')? NonNegInteger)?
 
 [30d] DecimalNumber ::= ('+' | '-')? NonNegInteger Fraction?
         | ('+' | '-')? NonNegInteger '.'
         | ('+' | '-')? Fraction
 
 [30f] Fraction ::= '.' NonNegInteger
 
 [30p] PositiveInteger ::= [1-9] [0-9]*
 
 [30p] NonNegInteger ::= [0-9]+
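
The NumericLit productions above can be transcribed into a regular expression, which may help when checking whether a literal is acceptable. The Python sketch below is a direct reading of productions [30] through [30f] as printed (including the upper-case E only); the names NUMERIC_LIT and is_numeric_lit are made up for this example, and the actual XmlDoc API parser may differ in details.

```python
import re

NUMERIC_LIT = re.compile(
    r"[+-]?"                       # optional sign
    r"(?:[0-9]+(?:\.[0-9]+)?"      # NonNegInteger Fraction?
    r"|[0-9]+\."                   # NonNegInteger '.'
    r"|\.[0-9]+)"                  # Fraction
    r"(?:E[+-]?[0-9]+)?"           # optional exponent: 'E' sign? NonNegInteger
)

def is_numeric_lit(text):
    """True if the whole string matches the NumericLit production."""
    return NUMERIC_LIT.fullmatch(text) is not None

for literal in ["1.003E-5", "+.5", "5.", "42", ".", "E5"]:
    print(literal, is_numeric_lit(literal))
```

Under this reading the first four literals are accepted and the last two are rejected, since a lone period or a bare exponent contains no DecimalNumber.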

Explanatory notes

The following notes are intended only to explain the above syntax; they do not present any limitations on XPath support in the XmlDoc API:

  • When the at sign (@) is used in a Step ([4]), it is an abbreviation for

    attribute::

  • The syntax for Step ([4]) notes that it may begin directly with a NodeTest. In that case, child:: is implied before the NodeTest.
  • The syntax for QName is given in Name and namespace syntax.
  • A node is selected by a PositiveInteger predicate if the position of the node, in the set which the predicate is filtering, is equal to that PositiveInteger.
  • A node is selected by an Existence test if the result of the PathExpr, using that node as the context node, is non-empty.
  • A node is selected by a Comparison if any node in the result of the PathExpr, using that node as the context node, holds the specified relationship to the Lit.

Restrictions and limitations

The following notes concern limitations on XPath support in the XmlDoc API:

  • One way to summarize the XPath productions that are not supported in the current version is to list the XPath operators that are not supported, as shown in the following table:

    Unsupported operators    Meaning       Comments
    +  -  *  div  mod        arithmetic
    |                        union         The UnionSelected method can be used to form the union of two nodesets.

    Note: Parentheses are allowed for grouping within Boolean expressions within predicates, but this is the only place they are supported.

  • As of the current version, XPath function support is limited enough that it can be shown in the syntax above. However, for clarity, see XPath functions supported in the current version for a list of supported functions and for any differences between the XmlDoc API implementation and the XPath 1 definition of a function.
  • The Boolean operators (and, or) and relational operators (= != < <= > >=) are supported only in (some) predicates (see Comparison tests in predicates).
  • XPath variables ($var_name) are not supported.
  • The numeric constants +/- infinity are not supported.
  • The size of an XPath expression is limited to approximately 26 steps, if each has an NCName NodeTest.
  • In the XmlDoc API, a numeric value (either a literal or a node value) may be of any form available in SOUL. In particular, "E-format" literals, such as 1.003E-5, may be specified, even though they are not very common in XML documents. The same forms of numbers are available in XPath 2. XPath 1 allows only decimal numbers; it does not allow E-format in either literals or node values.
  • The precision used in the XmlDoc API XPath support is that provided by SOUL — namely, 15 decimal digits.
  • If the XPath support in the XmlDoc API attempts to convert a long string (that is, longer than 255 bytes) or a number whose absolute value is beyond the capabilities of SOUL (maximum absolute value approximately 7.237E75), the request is cancelled.

Predicates supported in the current version

The XmlDoc API supports these predicates:

  • Two types of Location-Path-expression predicates:
    • A Location Path (that is, production [1], LocPath in XPath supported in the current version) used as an existence test

      If the nodeSet that results from the Location Path is non-empty, the predicate evaluates as true. The usual (but not only) purpose of this predicate is to select a node if it has at least one attribute or element with a given name.

      For example, the following expression selects (as the XPath argument to the SelectNodes method, for example) all contact children of cust elements, if the cust element has an invoice child element and contact has a fax child element:

      '/active/cust&lsqb;invoice&rsqb;/contact&lsqb;fax&rsqb;':u

      Note: This example substitutes the &amp;lsqb; and &amp;rsqb; entities for left and right square-bracket characters from the keyboard, for the reason explained at the beginning of this section. Notice also the accompanying use of the U method. The value of the expression is:

      /active/cust[invoice]/contact[fax]

      Most of the remaining predicate examples below omit the entities in order to more clearly display the XPath grammar.

    • A Location Path expression with a comparison operator and literal

      For example, @price > 200 selects a node if the numeric value of the node's price Attribute is greater than 200.

      See Comparison tests in predicates for further discussion.

  • These types of function predicates:
    • A "simple" position test using a numeric literal n

      This test is equivalent to the implicit use of the position( ) function in the predicate term position()=n.

      For example:

      /book/chapter[2]/section[9]/paragraph[3]

    • The number( ) function with a location path argument, followed by a comparison to a numeric literal

      For example, number(@size) > 30 selects a node if the numeric value of the node's size Attribute is greater than 30.

      This predicate differs from the similar Location Path example above (@price > 200) primarily in that it allows the Attribute value to be non-numeric. The previous Location Path example cancels the request if a numeric comparison is performed with a price Attribute whose value is non-numeric.

      See Comparison tests in predicates for further discussion.

    • The position( ) function, followed by a comparison operator, followed by an integer literal, which may be negative or zero.

      For example:

      /book/chapter[position()>1]/section[2]

    • The not( ) function, which returns the opposite boolean value of its boolean argument.

      For example:

      /book/chapter[2]/section[9][not(paragraph[3])]

  • Nested predicates.

    For example, this statement selects each Chapter whose first Section has a Racy attribute:

    %lis = %bk:SelectNodes('Chapter&lsqb;Section&lsqb;1 and @Racy&rsqb;&rsqb;':u)

    Note: As in an earlier example, the &amp;lsqb; and &amp;rsqb; entities in this example require Model 204 7.6 or higher. The value of the XPath argument is:

    Chapter[Section[1 and @Racy]]

  • Multiple predicates in a single step.

    For example, a second predicate can apply a position test to the nodes that passed the preceding predicate, rather than to all the nodes selected by the step's NodeTest:

    /book/chapter[author="Alex"] [2]

    The preceding two-predicate step selects the second chapter child that is authored by Alex, while the following expression selects the second chapter child of the book, if its author is Alex:

    /book/chapter[author="Alex" and 2]

    Parentheses for grouping in Boolean expressions are supported. For example:

    chapter[@type="methods" and (@class="Stringlist" or @class="Daemon")]

  • Combination predicates.

    Predicates may combine any of these supported functions and supported Location Path expressions using the Boolean operators "and" and "or".

    For example:

    /active/cust[invoice and position()>1]
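The difference between a step with two predicates and a single combined predicate, described under "Multiple predicates in a single step" above, can be modelled outside the XmlDoc API. The following Python sketch is only an illustration: the chapter data and the function names are hypothetical, and plain lists stand in for nodesets.

```python
# Hypothetical data: (title, author) pairs for the chapter children, in document order.
chapters = [
    ("Intro", "Kim"),
    ("Axes", "Alex"),
    ("Steps", "Lee"),
    ("Predicates", "Alex"),
]

def two_predicates(nodes):
    """Model of chapter[author="Alex"][2]."""
    # The first predicate filters the nodes selected by the NodeTest.
    by_alex = [n for n in nodes if n[1] == "Alex"]
    # The second predicate counts positions within the filtered set.
    return [n for pos, n in enumerate(by_alex, start=1) if pos == 2]

def one_predicate(nodes):
    """Model of chapter[author="Alex" and 2]."""
    # Both tests apply to the same set, so position is counted among all chapters.
    return [n for pos, n in enumerate(nodes, start=1) if n[1] == "Alex" and pos == 2]

print(two_predicates(chapters))  # [('Predicates', 'Alex')]
print(one_predicate(chapters))   # [('Axes', 'Alex')]
```

In this model, the two-predicate form returns the second chapter written by Alex, while the combined form returns the second chapter only because Alex happens to be its author.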

XPath functions supported in the current version

The following XPath functions are supported in the current version, and Functions gives their XPath 1 definitions. Any differences between the XPath 1 definition and the XmlDoc API implementation are shown below.

Note: In discussing XPath functions, the name of the function followed by an empty pair of parentheses (for example, number()) is sometimes used to name the function, whether or not the particular function being discussed takes arguments.

  • position( )

    Performs as specified in the XPath standard.

  • not(bool)

    The function argument is a Boolean expression, and the function result is true if the value of the argument is false, and it is false otherwise.

    Notes:

    • Performs as specified in the XPath standard.

    • The result of the not() function applied to a comparison expression is different than the result of the same expression with the complementary comparison. For example, this statement selects children that have the value of the status attribute equal to "pending":

      %lis = %nod:SelectNodes('*[@status="pending"]')

      This statement selects children that have the value of the status attribute equal to something other than "pending":

      %lis = %nod:SelectNodes('*[@status!="pending"]')

      This statement selects children that have the value of the status attribute equal to something other than "pending" or that have no status attribute:

      %lis = %nod:SelectNodes('*[not(@status="pending")]')

  • number(nodeset?)

    The XmlDoc API number( ) function differs from the XPath 1 definition as follows:

    1. XPath 1 allows a variety of argument types (for example, a string literal); the XmlDoc API allows only a nodeSet argument.
    2. In XPath 1, if a nodeSet argument to number( ) contains more than one node, the first node (in document order) is converted to a number and returned.

      In the XmlDoc API, if the argument result contains more than one node, the request is cancelled, which is consistent with the XPath 2 standard.

    3. The definition of a numeric value for number( ) (after stripping leading and trailing whitespace) is the same as the NumericLit production ([30]) in XPath supported in the current version. This is consistent with the XPath 2 standard, and is an extension of the XPath 1 definition of number( ), which only accepts numbers of the form DecimalNumber ([30d]).
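The three status examples given for the not( ) function above differ only in how a missing attribute is treated. The following Python sketch models that rule with hypothetical data: each child is a name paired with a dictionary of its attributes, and a comparison is true if it holds for at least one node in the (possibly empty) nodeset selected by @status.

```python
# Hypothetical children: (element name, attributes) in document order.
children = [
    ("a", {"status": "pending"}),
    ("b", {"status": "shipped"}),
    ("c", {}),  # no status attribute
]

def status_nodes(attrs):
    """Nodeset selected by @status: a list holding zero or one attribute values."""
    return [attrs["status"]] if "status" in attrs else []

def select(predicate):
    """Names of the children for which the predicate is true."""
    return [name for name, attrs in children if predicate(attrs)]

# *[@status="pending"]: some status node equals "pending".
equal = select(lambda a: any(v == "pending" for v in status_nodes(a)))
# *[@status!="pending"]: some status node differs from "pending".
not_equal = select(lambda a: any(v != "pending" for v in status_nodes(a)))
# *[not(@status="pending")]: no status node equals "pending".
negated = select(lambda a: not any(v == "pending" for v in status_nodes(a)))

print(equal)      # ['a']
print(not_equal)  # ['b']
print(negated)    # ['b', 'c']
```

Only the not( ) form selects the child without a status attribute, because an empty nodeset makes the inner comparison false and not( ) then inverts it.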

Comparison tests in predicates

In the XPath standard (XPath 1 and XPath 2), either operand in a comparison test in a predicate may be any form of XPath expression. The predicate evaluates as true if the comparison is true of at least one node in the resulting nodeSet, and typically the purpose of the predicate is to select the nodes for which the comparison is true.

For example, the following expression selects all item children of order elements that have a price attribute whose value is greater than 9.99:

order/item[@price > 9.99]

XmlDoc API predicate comparisons differ from XPath 1 comparisons:

  • In XmlDoc API predicates, comparison operands are more restricted than in the XPath standard. As is explained in the syntax discussion below, you can use only a Location Path or the number( ) function, followed by a comparison operator, followed by a literal.
  • The XmlDoc API uses XPath 2 comparisons. The XPath 1 standard does not provide for exception conditions or for ordered string comparisons. The same is true of Microsoft .NET, which follows the XPath 1 standard.

    The XmlDoc API follows the XPath 2 standard by providing for exceptions (implemented as request cancellation conditions) and providing for ordered string comparisons.

As of the current version, the only forms of comparisons are these four:

position() relOp integer
LocPath relOp "stringLiteral"
LocPath relOp numericLiteral
number([LocPath]) relOp numericLiteral

Where:

position( )
This function is discussed in Predicates supported in the current version.
relOp
One of the comparisons: =, !=, <, <=, >, >=
integer
An integer, whose precision is limited to 15 decimal digits (as in SOUL).
LocPath
An XPath location expression (that is, production [1], LocPath, in XPath supported in the current version). As of the current version, such an expression still has the limitation that it may not contain a predicate.
"stringLiteral"
A quoted string literal value, which must not exceed 255 bytes.

The XmlDoc API has always supported ordered string comparisons, but the XPath 1 standard does not. For more information about these comparisons, see Ordered string comparisons.

numericLiteral
A numeric literal value, whose precision is limited to 15 decimal digits (as in SOUL). For additional format and size limitations, see Restrictions and limitations.

For more information about support for comparisons of a Location Path to a numeric literal, see Direct numeric comparison.

Numeric literals in predicate comparisons are supported in the XmlDoc API.

number([LocPath])
The number( ) function with an optional Location Path argument.

A comparison using the number( ) function is very similar to comparison of a Location Path to a numeric literal (Direct numeric comparison). Comparing the result of number( ) to a literal gives a result according to their relative values and to the comparison operator. For example, shirt[number(@size) > 30] selects nodes that have a size greater than 30.

The significant difference between using a Location Path and using a number( ) function in a numeric comparison is that the request is cancelled in the former case if a node in the comparison is non-numeric. This difference is discussed briefly below and in greater detail in number(LocPath) comparisons for non-numeric data.

These are the effects of the function's LocPath argument (they are consistent with the XPath 2 standard):

  • If the result of the LocPath argument is a single node, number( ) converts the value of the node, after stripping leading and trailing whitespace, to a number, or to the special value NaN ("Not a Number") if the stripped value of the node is not numeric.
  • If the result of the argument has more than one node, the request is cancelled. For further details, see number( ) comparisons that cause request cancellation.

    Note: In XPath 1 (which has no exception conditions), if there is more than one node in the nodeSet argument, the value of the first node (in document order) is used.

  • If the result of the LocPath argument is the empty nodeSet, the result of the number( ) function is NaN.

    See number(LocPath) != n, LocPath result is empty node-set for examples.

  • If you omit the LocPath argument, the default argument "." (the context node) is used; that is, the node that is being filtered by the predicate gets converted to a numeric value.

    For example, in the following XPath expression, the number( ) function converts the value of the size Attribute to a number:

    /*/shirt/@size[number() > 10]

If the number( ) result that is compared to a literal is NaN, the comparison is always false (or, in the case of the != operator, is always true). This is important to note, because it means number( ) can be used to avoid the request cancellation to which numeric comparisons are subject if the nodes evaluated by a predicate may be non-numeric. For further discussion, see number(LocPath) comparisons for non-numeric data.

If you are using number( ) to avoid request cancellation for a numeric comparison because the nodes evaluated by a predicate may be non-numeric, and you are using the "not equals" comparison (!=), remember that an empty nodeSet argument will give a true comparison result (as a consequence of the rule for comparing NaN). You can filter out the nodes included by empty nodeSet comparisons by expanding the Location Path expression from number(LocPath) to LocPath and number(LocPath), as described in number(LocPath) != n, LocPath result is empty node-set.

Note: In the XmlDoc API, the number( ) function must be immediately followed by a comparison operator and a numeric literal. This limitation is not required by XPath 1 or XPath 2.
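The rules above for the number( ) argument and for comparisons involving NaN can be summarized in a small model. The following Python sketch is not the XmlDoc API: RequestCancelled is a hypothetical stand-in for request cancellation, a nodeset is modelled as a list of node string values, and the regular expression is only an assumed approximation of the NumericLit production.

```python
import math
import operator
import re

class RequestCancelled(Exception):
    """Hypothetical stand-in for cancellation of the request."""

# Assumed approximation of NumericLit: optional sign, decimal digits, optional exponent.
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")

def number(nodeset):
    """Model of number(LocPath); nodeset is a list of node string values."""
    if len(nodeset) > 1:
        raise RequestCancelled("number() argument has more than one node")
    if not nodeset:
        return math.nan                      # empty nodeset gives NaN
    value = nodeset[0].strip()               # strip leading and trailing whitespace
    return float(value) if _NUMERIC.fullmatch(value) else math.nan

REL_OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}

def compare(nodeset, rel_op, literal):
    """Model of number(LocPath) relOp numericLiteral."""
    return REL_OPS[rel_op](number(nodeset), literal)

print(compare([" 32 "], ">", 30))  # True
print(compare(["M"], "<", 40))     # False: NaN compares false
print(compare(["M"], "!=", 40))    # True: NaN is not equal to any number
print(compare([], "!=", 40))       # True: empty nodeset gives NaN
```

Python's floating-point NaN follows the same comparison rule described above (every comparison is false except !=), which is why no special case is needed in compare.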

Ordered string comparisons

If an XPath expression contains a Location Path subexpression and a quoted string with an ordered comparison (that is, a comparison other than "=" and "!="), the result is based on a byte-by-byte ordered comparison between each item of the nodeSet result of the subexpression and the literal string value.

Consider the following example:

%nlis = %nod:SelectNodes('order[@date>"2007-01-01"]')

If the value of the date Attribute node of an order Element child is, for example, 2007-05-17, that order Element node will be included in the result.

This behavior has always been available in the XmlDoc API: if a comparison literal is bracketed in double or single quotation marks, a string comparison is performed, whether or not the literal has a numeric format (this is consistent with XPath 2 string comparisons). Note, however, that most practical ordered comparisons involve numeric values, which are supported.

In XPath 1, any ordered comparison is done by first converting each operand to a numeric value and then performing the comparison, and the result of any ordered comparison of a non-numeric node value is false. Therefore, the XPath 1 result of the above example would always be empty, because the literal is a non-numeric value.
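The contrast between the two treatments of an ordered comparison with a quoted literal can be sketched as follows. This Python model rests on assumptions: it compares ASCII-encoded bytes as a stand-in for the byte-by-byte comparison (the character encoding actually used on a given platform may differ), and it approximates the XPath 1 string-to-number conversion with float().

```python
import math

def to_number(text):
    """Approximate XPath 1 conversion: non-numeric strings become NaN."""
    try:
        return float(text)
    except ValueError:
        return math.nan

def string_gt(node_value, literal):
    """Byte-by-byte ordered comparison, as described for quoted literals."""
    return node_value.encode("ascii") > literal.encode("ascii")

def xpath1_gt(node_value, literal):
    """XPath 1 style: convert both operands to numbers, then compare."""
    return to_number(node_value) > to_number(literal)

# order[@date>"2007-01-01"] for a date attribute of 2007-05-17
print(string_gt("2007-05-17", "2007-01-01"))  # True
print(xpath1_gt("2007-05-17", "2007-01-01"))  # False: both operands become NaN

# A quoted literal forces a string comparison even if it looks numeric.
print(string_gt("9", "10"))  # True in this model: "9" sorts after "1"
print(xpath1_gt("9", "10"))  # False: 9 is not greater than 10
```

The last pair shows why an unquoted numeric literal is usually the better choice when the values being compared are numbers.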

Direct numeric comparison

If an XPath expression contains a Location Path subexpression compared to a literal numeric value, the result is true if any node in the subexpression result, converted to a numeric value, has the specified relationship ("=", "<", etc.) to the literal value.

In the following example, an order child is in the result if it has an item child whose price Attribute node is greater than 99.99:

%nlis = %nod:SelectNodes('order[item/@price > 99.99]')

For more examples, see Successful direct numeric comparisons below.

If any node value used in the direct numeric comparison is non-numeric, the request is cancelled. For examples, see Direct numeric comparisons that cause request cancellation below.

The discussion that follows makes references to the following Clothes document:

<Clothes>
  <shirt size="32" type="dress" sku="100"/>
  <shirt size="33" type="sport" sku="101"/>
  <shirt size="M" type="sport" sku="102"/>
  <shirt size="34" type="frilly" sku="103"/>
</Clothes>

  1. Successful direct numeric comparisons

    If a predicate contains a comparison of a Location Path to a numeric literal, the comparison is true if the numeric value, after stripping leading and trailing whitespace, of any of the nodes in the Location Path result has the specified relationship to the numeric literal. If none of the nodes has the relationship (which includes the case that the Location Path result is empty), the result of the comparison is false.

    For example, using the Clothes document described above, the following statement prints sku="100".

    %doc:Print('/*/shirt[@size<040]/@sku')

    Note that the numeric value of the node and the numeric value of the literal are compared, so the leading zero in 040 here is ignored. An equivalent comparison could be @size<040.00, etc. The following statement prints sku="101":

    %doc:Print('/*/shirt[@size<40 and @type="sport"]/@sku')

    The following statement prints sku="103"; the comparison of the size Attribute is processed for only one Element, which has a numeric size. As discussed below, this request would fail if the order of the attribute subexpressions were reversed.

    %doc:Print('/*/shirt[@type="frilly" and @size<40]/@sku')

  2. Direct numeric comparisons that cause request cancellation

    If any of the nodes used in a direct numeric comparison has a value that is non-numeric after stripping leading and trailing whitespace, the request is cancelled.

    For example, using the Clothes document described above, the following statement causes the request to be cancelled when the size attribute (M) of the second sport shirt is compared to the number 40 (note the difference between this XPath expression and the last one in Successful direct numeric comparisons above):

    %doc:Print('/*/shirt[@size<40 and @type="frilly"]/@sku')

    The following statement causes the request to be cancelled (at the same Element), because SelectNodes continues after the first selection, unlike the Print example with the same XPath expression in Successful direct numeric comparisons above:

    %sh = %doc:SelectNodes('/*/shirt[@size<40 and @type="sport"]/@sku')

    If you want the request cancellation to be avoided in statements like these, consider using the number( ) function (see number(LocPath) comparisons for non-numeric data).
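The successful and cancelled cases above depend on two things: the order of the subexpressions joined by "and", and whether evaluation stops at the first selected node. The following Python sketch models both for the Clothes document. It is an illustration under assumptions: RequestCancelled, first_sku (modelling the Print examples), and all_skus (modelling SelectNodes) are hypothetical names, and Python's left-to-right "and" is assumed to mirror the evaluation order implied by the examples.

```python
class RequestCancelled(Exception):
    """Hypothetical stand-in for cancellation of the request."""

# The shirt elements of the Clothes document, in document order.
shirts = [
    {"size": "32", "type": "dress", "sku": "100"},
    {"size": "33", "type": "sport", "sku": "101"},
    {"size": "M", "type": "sport", "sku": "102"},
    {"size": "34", "type": "frilly", "sku": "103"},
]

def direct_lt(value, literal):
    """Model of a direct numeric comparison: non-numeric data cancels."""
    try:
        return float(value.strip()) < literal
    except ValueError:
        raise RequestCancelled(f"non-numeric value {value!r}") from None

def first_sku(predicate):
    """Like the Print examples: stop at the first selected shirt."""
    for shirt in shirts:
        if predicate(shirt):
            return shirt["sku"]
    return None

def all_skus(predicate):
    """Like SelectNodes: evaluate the predicate for every shirt."""
    return [shirt["sku"] for shirt in shirts if predicate(shirt)]

# shirt[@type="frilly" and @size<40]: size is tested only for the frilly shirt.
print(first_sku(lambda s: s["type"] == "frilly" and direct_lt(s["size"], 40)))  # 103

# shirt[@size<40 and @type="frilly"]: size M is tested before type, so this cancels.
try:
    first_sku(lambda s: direct_lt(s["size"], 40) and s["type"] == "frilly")
except RequestCancelled as exc:
    print("cancelled:", exc)
```

In the same model, first_sku with the sport predicate returns 101 before the M size is reached, while all_skus with that predicate raises RequestCancelled because every shirt is evaluated.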

number(LocPath) comparisons for non-numeric data

The number( ) function can often be used to avoid request cancellation due to the presence of non-numeric data in a direct numeric comparison.

For example, both statements from Direct numeric comparisons that cause request cancellation above can avoid request cancellation (when used with the particular document discussed in that section) if the Location Path @size is "converted" using the number( ) function.

The following statement prints sku="103", even though the size Attribute equal to M is processed before the selected Element:

%doc:Print('/*/shirt[number(@size)<40 and @type="frilly"]/@sku')

Similarly, the following statement succeeds even though the size Attribute equal to M is processed (and not selected):

%sh = %doc:SelectNodes('/*/shirt[number(@size)<40 and @type="sport"]/@sku')

Comparisons with the number( ) function are always false for a non-numeric node value, unless the comparison is !=. The following statements both print None found:

Print %doc:ValueDefault( -
   '/*/shirt[number(@size) < 40 and @sku="102"]/@size', 'None found')
Print %doc:ValueDefault( -
   '/*/shirt[number(@size) >= 40 and @sku="102"]/@size', 'None found')

The following, however, prints M, the size of shirt with this SKU. Its size is not less than 40 nor greater than or equal to 40, but it is not equal to 40 (nor any other number):

Print %doc:ValueDefault( -
   '/*/shirt[number(@size) != 40 and @sku="102"]/@size', 'None found')

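Note: Before substituting the number( ) function into a direct numeric comparison, you should be aware of two differences between direct numeric comparison and the use of number( ). They are described in the two sections that follow: number(LocPath) != n, LocPath result is empty node-set, and number( ) comparisons that cause request cancellation.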

number(LocPath) != n, LocPath result is empty node-set

When a predicate contains the number( ) function followed by the != comparison, if the nodeSet result is empty, the result of the comparison is true; if any other comparison operator is used, the result is false.

For example, consider this document:

<t>
  <w a="1"/>
  <x a="PI"/>
  <y a="e" b="1"/>
  <z a="e" b="2"/>
</t>

If you are using a numeric comparison to search for Attribute a, you should use the number( ) function to avoid request cancellation, because a has non-numeric values. The following statement sets the result nodelist to the Element w:

%nlis = %doc:SelectNodes('/t/*[number(@a) = 1]')

The following statement sets the result nodelist to the Elements x, y, and z:

%nlis = %doc:SelectNodes('/t/*[number(@a) != 1]')

The b Attribute, however, does not have any non-numeric values, so it can be used without number( ). Each of the following two statements sets the result nodelist to the Element y:

%nlis = %doc:SelectNodes('/t/*[@b = 1]')
%nlis = %doc:SelectNodes('/t/*[number(@b) = 1]')

However, the following two statements differ in their result:

%nlis = %doc:SelectNodes('/t/*[@b != 1]')
%nlis = %doc:SelectNodes('/t/*[number(@b) != 1]')

The first sets the result nodelist to the Element z, while the second includes Elements w and x as well as the Element z. Since they do not contain the b Attribute, the result of number(@b)!=1 at elements w and x is true.

If you want to make number( ) similar to a direct comparison in this respect, you "and" the Location Path argument with the number( ) factor in the predicate. So, for example, the following sets the result nodelist to the Element z, just like the direct comparison approach:

%nlis = %doc:SelectNodes('/t/*[@b and number(@b) != 1]')
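The three != variants for the b Attribute can be checked against a small model of the document above. In this Python sketch, the dictionary layout and the helper names are hypothetical, float() approximates numeric conversion, and the multiple-node case of number( ) is left out because an attribute test selects at most one node here.

```python
import math

# The children of t, in document order, each with its attributes.
elements = {
    "w": {"a": "1"},
    "x": {"a": "PI"},
    "y": {"a": "e", "b": "1"},
    "z": {"a": "e", "b": "2"},
}

def attr(attrs, name):
    """Nodeset selected by @name: a list holding zero or one values."""
    return [attrs[name]] if name in attrs else []

def number(nodeset):
    """Model of number( ): NaN for an empty nodeset or a non-numeric value."""
    if not nodeset:
        return math.nan
    try:
        return float(nodeset[0].strip())
    except ValueError:
        return math.nan

def direct_ne(nodeset, literal):
    """Direct comparison: true if any node differs (b values here are all numeric)."""
    return any(float(v) != literal for v in nodeset)

def select(predicate):
    return [name for name, attrs in elements.items() if predicate(attrs)]

print(select(lambda e: direct_ne(attr(e, "b"), 1)))                     # ['z']
print(select(lambda e: number(attr(e, "b")) != 1))                      # ['w', 'x', 'z']
print(select(lambda e: bool(attr(e, "b")) and number(attr(e, "b")) != 1))  # ['z']
```

The existence test in the last predicate removes w and x, whose empty nodesets would otherwise convert to NaN and satisfy !=.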

Note: The other way in which number( ) differs from direct comparison is described in number( ) comparisons that cause request cancellation.

number( ) comparisons that cause request cancellation

When a predicate contains the number( ) function, the request is cancelled if the value of the nodeSet argument to the number( ) function has more than one node.

For example, consider this document:

<t>
  <x a="1" b="2"/>
  <y b="pi" a="3.14159265"/>
</t>

If you are searching for all Elements that have any Attribute greater than 1, you can use the Location Path @* as a wildcard comparison for any Attribute. However, you cannot use direct comparison, because some of the attributes are non-numeric.

So, you might try to use number(@*), as in the following example:

%nlis = %doc:SelectNodes('/t/*[number(@*) > 1]')

However, this will cause a request cancellation, because the value of @* contains more than one node. In such situations, you must decide which node is to be converted to a number for the comparison. In this case, you probably want to use:

%nlis = %doc:SelectNodes( -
   '/t/*[number(@a) > 1 or number(@b) > 1]')

This will set the result nodelist to Elements x and y.