1 Introduction
2 Concepts
2.1 Initiating a Transformation
2.2 Nodes
2.3 Context
2.4 Precedence Categories
2.5 Match Patterns
2.6 Errors
3 Stylesheet Structure
3.1 STX Namespace
3.2 Transform Element
3.3 Grouping of Templates
3.4 Stylesheet Inclusion
4 Generating Output
4.1 Transformation Options
4.2 Namespace Aliasing
4.3 Templates
4.4 Procedures
4.5 Parameters
4.6 Copying the Current Node
4.7 Processing Nested Events
4.8 Processing Attributes
4.9 Processing Siblings
4.10 Running Overridden Templates
4.11 Processing Text
4.12 Outputting Strings
4.13 Outputting Elements and Attributes
4.14 Outputting Other Nodes
4.15 Conditions
4.16 Loops
4.17 Multiple Input Documents
4.18 Multiple Output Documents
4.19 Buffers
4.20 Messages
5 Data Types
5.1 Atomic Types
5.2 Sequences
5.3 Type Conversions
5.4 Tree Fragments
6 Expressions
6.1 Variables
6.2 Literals
6.3 Parenthesized Expressions
6.4 Functions
6.4.1 Sequence Functions
6.4.2 Node Functions
6.4.3 Boolean Functions
6.4.4 String Functions
6.4.5 Numerical Functions
6.4.6 Other Functions
6.5 Data Accessors
6.6 Sequence Expressions
6.7 Arithmetic Expressions
6.8 Comparison Expressions
6.9 Logical Expressions
7 Extensions
A References
B Element Syntax Summary
C STXPath Grammar
D Acknowledgments (Non-Normative)
E Draft Change History since WD 1 November (Non-Normative)
This document defines the syntax and semantics of the STX transformation language. Transformation rules in STX are expressed as well-formed XML documents. These documents, called stylesheets, may include both elements that are defined by STX (STX declarations and instructions) and other elements (literals). STX-defined elements are identified by a specific XML namespace, which is referred to in this specification as the STX namespace. This document uses a prefix of 'stx' as a shortcut for referring to elements from the STX namespace.
An STX transformation describes rules for transforming one or more source event streams into one or more result event streams. The transformation has a streaming character; this means that it does not need to build a tree representing the source documents in memory. Result events are generated as soon as source events appear and are processed.
The transformation is achieved by associating events with templates. A template pattern is matched against events and their context. The best matching template is then instantiated to create a part of the result stream. A template is always instantiated with respect to a current context, a set of additional information maintained during the transformation. In constructing the result stream, events from the source stream can be filtered and arbitrary events can be added. Events can also be reordered using working storage.
On the surface, the syntax of STX is similar to the syntax of [XSLT]. STX also employs a compact expression language embedded in certain attributes. This expression language, called STXPath, is syntactically similar to [XPath]. This should allow XSLT users to easily adapt to STX syntax.
The software responsible for running an STX transformation is referred to as an STX processor. An STX processor transforms one or more source XML documents according to rules given in an STX stylesheet and generates one or more result XML documents.
The source documents are supplied in the form of streams of [SAX2] events. These streams are referred to as the source streams. The stream whose events are currently processed is referred to as the current source stream. The current source stream at the time when the transformation is initiated is referred to as the principal source stream.
A possibly empty set of external values for stylesheet parameters is supplied. These values are available for use within expressions in the stylesheet.
No tree representation of the source document is constructed. However, when processing each event, a limited amount of contextual information is available from the system.
Data arriving with an event can form one or more objects called nodes. Pair events for the document and elements form one node only; all node data is passed with the starting event. The data of attributes passed with a startElement() event forms separate nodes. Sequential characters() and ignorableWhitespace() events are combined into a single text node.
The stylesheet is a well-formed XML document that may be precompiled to some kind of executable representation that can be reused to perform multiple transformations. The stylesheet can consist of several stylesheet modules contained in different files. One of these modules is the principal stylesheet module. The complete stylesheet is assembled by finding the stylesheet modules referenced directly or indirectly from the principal stylesheet module using the stx:include declaration.
The output of the transformation consists of one or more sequences of SAX2 events. These sequences of events are referred to as result streams. The stream to which events are currently emitted is referred to as the current result stream. The current result stream at the time when the transformation is initiated is referred to as the principal result stream.
Each incoming event can cause an invocation of one or more rules within the stylesheet by means of a match pattern. The actions such a rule may perform include emitting SAX2 events to result streams, saving working data to working storage, accessing data written to working storage by previously executed rules, and invoking other rules.
Note:
The source or result streams are abstract constructs that function as input or output channels for STX transformations. Each source or result stream is identified with a URI. This URI must not be confused with the URI of a physical document that may be parsed to generate the source stream or a document the result stream may be serialized to. Instead, the stream is associated with a resolver (typically, a SAX2 driver for source streams or a SAX2 handler for result streams) that maps the abstract stream to a particular physical resource.
This document does not specify interfaces for initiating an STX transformation. Instead, these interfaces are implementation-dependent. This section describes the minimum amount of information that must be supplied to execute a transformation:
Note:
Some portions of this information can be passed explicitly through an implementation interface while other portions can be built-in for particular implementations. For example, an implementation can have standard resolvers for certain URI schemes (file, http). Thus, streams identified with these URIs may not require explicit definitions.
The data arriving with an event forms zero or more entities called nodes. Pair events refer to a single node whose data is passed with the starting event. The attribute data and namespace data arriving with a startElement() event form separate attribute and namespace nodes. Consecutive events of the same type (characters, ignorableWhitespace) are aggregated and treated as a single event, and thus form a single node only.
There are eight types of nodes recognized in STX:
root node - Passed with a startDocument() event; this node has no properties.
element node - Passed with a startElement() event. The node properties consist of the element-related data (local name, prefix, qualified name, namespace URI).
attribute node - Passed with a startElement() event. The node properties consist of the data related to a particular attribute (local name, prefix, qualified name, namespace URI, value).
text node - Passed with a characters() or ignorableWhitespace() event. The node properties consist of character data.
CDATA node - Passed with a characters() or ignorableWhitespace() event within startCDATA() and endCDATA() lexical events. The node properties consist of character data.
processing instruction node - Passed with a processingInstruction() event. The node properties consist of target and character data.
comment node - Passed with a comment() event. The node properties consist of character data.
namespace node - Passed with a startElement() event. An element has a namespace node for each namespace prefix that is in scope for this element. The namespace node properties consist of the prefix and the namespace URI.
There is contextual information available at each point during processing. It includes the data arriving with the current event and other data related to the state of processing. The contextual information at any particular instant during processing is called the current context. The context information consists of the following parts:
current node data - The node which is the subject of the current event is called the current node. It is always given and there is no way to change the current node using stylesheet rules. The information available for the current node depends on the node type; see [SAX2] definition for details. For example, qualified name, local name, prefix, namespace URI and attributes (qualified name, local name, prefix, namespace URI, and value for each) are available for elements.
ancestor stack - For the current node, all ancestor nodes with all properties are stored in the ancestor stack.
next node data - The processing of the current node is delayed so that the next node data is available. The lookahead information can be used to access the first text child of an element, provided it is the very first child of the element.
position within siblings - Information about the position relative to other siblings is kept. The position is available for the current node and all its ancestors.
A position number is available for all node kind tests such as node(), text(), cdata(), processing-instruction(), comment(). For elements, the position is available for all qualified names or names containing the * shortcut: pre:lname, lname, pre:*, *:lname, *. For processing instructions, the position is also available for each target. The position of attribute nodes is undefined.
Each incoming event can invoke a template within the stylesheet by means of precedence categories and a match pattern (see 2.5 Match Patterns). The template that is used to process the current node is called the current template. Templates can be separated into groups (see 3.3 Grouping of Templates). Top-level templates are considered to be members of the default group. The group containing the current template is referred to as the current group.
Templates are associated with the precedence categories according to their visibility from the current group or from another explicitly specified group. The visibility is defined using the visibility attribute of each template (see 4.3 Templates).
There are two precedence categories (listed with decreasing precedence):
templates from the same group and global or public templates (visibility='global'|'public') from child groups
global templates (visibility='global')
The first precedence category is searched for the best matching template by means of a match pattern (see 2.5 Match Patterns). If there is no matching template in this precedence category, the second category is searched.
The match pattern specifies a set of conditions on the current context. If the current context satisfies the conditions, the current node matches the pattern; otherwise, the current node does not match the pattern. The syntax for patterns is a subset of the pattern syntax for XSLT (see [XSLT], 5.2). In particular, patterns take the form of location paths that meet certain restrictions.
Here are some examples of patterns:
item - matches any 'item' element from the namespace used for unprefixed STXPath path patterns (defined with the 'default-stxpath-namespace' option; no namespace by default)
list/item - matches any 'item' element with a 'list' parent, where both elements are from the namespace used for unprefixed STXPath path patterns
chapter//list/item - matches any 'item' element with a 'list' parent and a 'chapter' ancestor, where all three elements are from the namespace used for unprefixed STXPath path patterns
/root/list/* - matches any element with a 'list' parent and a 'root' grandparent which is the document element, where both 'root' and 'list' elements are from the namespace used for unprefixed STXPath path patterns
pre:list[@id=5]/pre:item - matches any 'item' element with a 'list' parent having an 'id' attribute with a value of 5, where both elements are from the namespace which is bound to the 'pre' prefix in the stylesheet for this rule
*[position()=1] - matches any element that is the first element child of its parent
node() - matches any child node
text() - matches any text node (including CDATA text nodes)
cdata() - matches any CDATA text node
processing-instruction() - matches any processing instruction
A match pattern is a set of location path patterns separated with |. A location path pattern is a location path whose steps all use only the child, descendant, and attribute axes. Patterns may use the / operator as well as the // operator. Only abbreviated syntax is allowed. Up to one predicate is allowed in each step. Predicate expressions are STXPath expressions (see 6 Expressions).
Predicate expressions are evaluated using the current context. If the result is a number, the result will be converted to true if the number is equal to the context position and will be converted to false otherwise. Thus a location path p[3] is equivalent to p[position()=3]. Otherwise the result will be converted to a boolean using the type conversion rules described in 5.3 Type Conversions.
If the result of evaluating and converting the predicate expression is false, the current template doesn't match the current node.
If there is no matching template available, a default rule is applied. One of three default rules, specified in the pass-through attribute of stx:options, can be used: 'none' (to skip the current node), 'all' (to pass through the current node), and 'text' (to pass through the current node only if it is a text node).
The default rule can be set from the stylesheet (see 4.1 Transformation Options). This feature makes it possible to copy documents with only a few changes, or to select just a few items from a document in a straightforward way. The default behavior is to ignore all non-matching events (value 'none').
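As an illustration of the default rules, the following sketch should copy a document unchanged except for 'note' elements, which are dropped together with their content. The element name 'note' is hypothetical, and the namespace URI bound to the stx prefix is an assumption, since section 3.1 is not reproduced in this part.

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Default rule: pass through every node for which no template matches. -->
  <stx:options pass-through="all"/>

  <!-- 'note' is a hypothetical element name. The template emits nothing and
       contains no stx:process-children, so the element and its children
       are skipped. -->
  <stx:template match="note"/>

</stx:transform>
```

With pass-through="none" (the default), the same stylesheet would produce no output at all, because no template emits anything.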
It is possible that the current context matches more than one rule within a precedence category. In that case, the template rule to be used is determined according to the same rules as in XSLT (see [XSLT], 5.5). All rules have a computed priority value. The computed priority can be overridden with a 'priority' attribute value (see 4.3 Templates).
If the pattern contains multiple alternatives separated with |, then it is treated equivalently to a set of template rules, one for each alternative.
If the pattern has the form of a qualified name or has the form either of processing-instruction(target) or cdata(), then the priority is 0.
If the pattern has the form pre:* or *:lname, then the priority is -0.25.
If the pattern consists of just a node test other than cdata(), then the priority is -0.5.
Otherwise, the priority is 0.5.
The rule with the highest priority is used. If there is more than one matching template rule with the highest priority, an STX processor must choose the rule that occurs last in the stylesheet.
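The priority rules above can be sketched with three templates that all match an 'item' element inside a 'list' element. The element names and the literal result elements are hypothetical, and the namespace URI is an assumption, since section 3.1 is not reproduced in this part.

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Just a node test: computed priority of minus 0.5. -->
  <stx:template match="node()">
    <stx:process-children/>
  </stx:template>

  <!-- A qualified name: computed priority of 0. -->
  <stx:template match="item">
    <entry><stx:process-children/></entry>
  </stx:template>

  <!-- Two steps, so none of the special cases applies: priority of 0.5. -->
  <stx:template match="list/item">
    <li><stx:process-children/></li>
  </stx:template>

</stx:transform>
```

All three templates belong to the default group and therefore to the same precedence category. For an 'item' with a 'list' parent, the third template should be chosen; for any other 'item', the second one. Adding priority="1" to the second template would make it win in both cases.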
All errors that can occur during an STX transformation belong to one of the following categories:
This specification doesn't define how to issue a warning or an error. Implementations are free to use either the standard output or the standard error output, or any convenient handler.
<!-- Category: root -->
<stx:transform
  version = number>
  <!-- Content: top-level-elements -->
</stx:transform>
Stylesheets are required to use the root element stx:transform.
The version attribute contains a version number to distinguish language versions; this attribute is mandatory and its value must be '1.0' for this version of the language.
The stx:transform element can contain the following children from the STX namespace. These elements are called top-level elements:
All top-level elements except stx:options may occur multiple times. stx:options and stx:namespace-alias elements are allowed as top-level elements only.
Templates can be organized into groups using the stx:group element. Groups of templates play a role in template matching (precedence categories are defined in terms of groups) and determine the scoping of variables.
Each stylesheet has a virtual default group (represented by stx:transform) that is considered to be the parent of top-level groups. Explicit groups are not mandatory; many transformations can be done without grouping templates. On the other hand, separating templates into groups makes it possible to define more precise transformation rules and to run complex transformations more safely, especially on well-known, regular input data.
<!-- Category: top-level or group -->
<stx:group
  name = qname>
  <!-- Content: group-elements -->
</stx:group>
This element must be a child of either the stx:transform or the stx:group element. The optional name attribute contains a qualified name that must be unique in the stylesheet. The name can be referenced by the group attribute of any of the stx:process-children, stx:process-attributes, stx:process-self, stx:process-siblings, stx:process-document, or stx:process-buffer instructions. In this case, the referenced group is used instead of the current group for matching. It is not possible to reference the default group.
It is a recoverable error if a stylesheet contains more than one group with the same name. The processor can recover from this error by choosing the group which is the last in the document order.
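A minimal sketch of a named group, assuming hypothetical element names ('table', 'row', 'cell') and an assumed namespace URI (section 3.1 is not reproduced in this part). The templates for 'row' and 'cell' are private to the group, so they should only be considered once the 'table' template has switched matching to that group.

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Keep text content for which no template matches. -->
  <stx:options pass-through="text"/>

  <!-- Member of the default group: hands the children of 'table'
       over to the templates of the named group. -->
  <stx:template match="table">
    <table>
      <stx:process-children group="table-rules"/>
    </table>
  </stx:template>

  <!-- 'table-rules' is a hypothetical group name. -->
  <stx:group name="table-rules">
    <stx:template match="row">
      <tr><stx:process-children/></tr>
    </stx:template>
    <stx:template match="cell">
      <td><stx:process-children/></td>
    </stx:template>
  </stx:group>

</stx:transform>
```

Inside the group, stx:process-children without a group attribute keeps using the current group, so 'cell' elements within a 'row' are matched by the second group template.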
An STX stylesheet may include another STX stylesheet using the stx:include element.
<!-- Category: top-level or group -->
<stx:include
  href = uri-reference/>
This declaration is used to insert additional stylesheet modules into the principal stylesheet module. Circular inclusion is prohibited.
This element must be top-level or a child of the stx:group element. stx:include is replaced with the content of the stx:transform element of the included stylesheet, with three exceptions: stx:namespace-alias and stx:param (stylesheet parameters) in the included stylesheet are always inserted as top-level elements (even when including into a group), and stx:options of the included stylesheet is ignored. Top-level variables and top-level templates from the included stylesheet are treated as group variables and templates when including into a group. There is no difference between templates from the principal stylesheet and included templates in terms of matching precedence.
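The two places where stx:include may appear can be sketched as follows. The file names and the group name are hypothetical, and the namespace URI is an assumption, since section 3.1 is not reproduced in this part.

```xml
<?xml version="1.0"?>
<!-- Principal stylesheet module. The namespace URI below is an assumption. -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Top-level inclusion: the templates of common.stx (hypothetical file)
       become members of the default group. -->
  <stx:include href="common.stx"/>

  <stx:group name="tables">
    <!-- Inclusion into a group: top-level templates and variables of
         table-rules.stx (hypothetical file) become members of this group;
         its stx:options element is ignored. -->
    <stx:include href="table-rules.stx"/>
  </stx:group>

</stx:transform>
```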
STX templates are called sequentially rather than from other templates. Pair events for the document and elements match only one template, which is broken into two parts; the first part is executed when the start event appears and the second one at the end event. The two parts are separated by the stx:process-children element.
<!-- Category: top-level -->
<stx:options
  pass-through = "none"|"all"|"text"
  recognize-cdata = "yes"|"no"
  default-stxpath-namespace = uri-reference
  strip-space = "yes"|"no"
  output-encoding = string/>
Global properties of a transformation can be specified using the stx:options element.
pass-through - This optional attribute specifies the default rule for events for which no matching template is found. These events are either ignored ("none", the default) or passed to the output without modification ("all"). For "text", only text nodes are passed through to the output.
recognize-cdata - This optional attribute specifies whether CDATA boundaries are recognized during the transformation. If so, every CDATA section forms a single node and the node kind test cdata() can be used in STXPath patterns. Otherwise (recognize-cdata="no"), CDATA boundaries are ignored and all consecutive character data forms a single text node; thus the cdata() kind test never matches in STXPath patterns. The default value is "yes".
default-stxpath-namespace - This optional attribute specifies a namespace used for unprefixed STXPath paths and patterns. No namespace is used by default.
strip-space - This optional attribute specifies whether whitespace text nodes are stripped from the input data stream. Whitespace text nodes are text nodes containing nothing but the following characters: #x20, #x9, #xD or #xA. The default value is "no".
output-encoding - This optional attribute specifies the preferred output encoding of the resulting byte stream. The value of this attribute should be treated case-insensitively; the value must contain only printable ASCII characters (#x21 - #x7E); the value must be a charset registered with the Internet Assigned Numbers Authority [IANA].
If the attribute is not present, the output encoding is UTF-8. A compliant STX processor is not required to support any particular encoding other than UTF-8.
<!-- Category: top-level -->
<stx:namespace-alias
  source-prefix = ncname|"#default"
  result-prefix = ncname|"#default"/>
Namespaces from the input stream can be mapped to other namespaces in the result stream using the stx:namespace-alias element. Both attributes are mandatory and can contain either a prefix bound to the namespace to be used or the "#default" keyword for the default namespace.
<!-- Category: top-level or group -->
<stx:template
  match = pattern
  priority = number
  visibility = "private"|"public"|"global"
  new-scope = "yes"|"no">
  <!-- Content: template -->
</stx:template>
Rules to process input events are written in templates. The stx:template element must be a child of either the stx:transform or the stx:group element. Templates are matched against events by means of precedence categories and the pattern in the mandatory match attribute. The optional priority attribute can contain a priority value used for matching (see 2.5 Match Patterns).
The optional visibility attribute specifies whether the template is visible from other groups (and thus can match the next event). Private templates are visible in their group only, public templates are visible from parent groups, and global templates are visible from any group. The default value is "private".
The optional new-scope attribute specifies whether the template creates new instances of group variables. The default value is "no". A new set of group variables is created for each instantiated template with new-scope="yes". These variables shadow their former values and exist as long as the template is being processed.
The content of templates may include both STX instructions and literal elements. Literal elements are simply copied to the output.
A text template is defined as the content of certain elements (stx:attribute, stx:variable, stx:param, stx:assign, stx:with-param, stx:cdata, stx:processing-instruction, stx:comment, stx:message). It is a part of a template that generates nothing but character events to the current output stream. An STX processor is required to issue a run-time recoverable error if another type of event is emitted. The processor is allowed to recover from this error by ignoring the non-character event.
<!-- Category: top-level or group -->
<stx:procedure
  visibility = "private"|"public"|"global"
  new-scope = "yes"|"no"
  name = qname>
  <!-- Content: template -->
</stx:procedure>
Procedures are sub-templates that can be called by name (with the stx:call-procedure element). The optional visibility and new-scope attributes have the same meaning as for templates. Only visible procedures can be called by name; new-scope must be set to "yes" to create new copies of group variables. It is a static non-recoverable error if a stylesheet contains more than one visible procedure with the same name.
The content of procedures may be the same as the content of templates.
stx:call-procedure
<!-- Category: template -->
<stx:call-procedure
  name = qname
  group = qname>
  <!-- Content: stx:with-param* -->
</stx:call-procedure>
The stx:call-procedure element makes it possible to invoke procedures by their names. The name attribute is mandatory. The optional group attribute makes it possible to use the specified group instead of the current group to call a procedure from.
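A minimal sketch of a procedure and its invocation, assuming a hypothetical procedure name ('separator'), hypothetical element names, and an assumed namespace URI (section 3.1 is not reproduced in this part).

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- A private procedure in the default group; it emits one literal element. -->
  <stx:procedure name="separator">
    <hr/>
  </stx:procedure>

  <!-- The template is in the same group, so the procedure is visible to it. -->
  <stx:template match="chapter">
    <stx:call-procedure name="separator"/>
    <section>
      <stx:process-children/>
    </section>
  </stx:template>

</stx:transform>
```

Because the procedure is invoked by name, it runs regardless of which node is current; the current node stays the 'chapter' element while the procedure body is processed.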
Values can be passed to stylesheets or to their templates and procedures as parameters. Stylesheet parameters behave in the same way as variables of the default group. Template/procedure parameters behave in the same way as local variables; thus they are only visible within the template or procedure they are passed to. There are two elements available to work with parameters:
stx:with-param
<!-- Category: process-xxx, call-procedure -->
<stx:with-param
  name = qname
  select = expression>
  <!-- Content: text template -->
</stx:with-param>
Parameters are passed to templates or procedures using the stx:with-param element. The required name attribute specifies the name of the parameter. The value of the parameter is either the result returned by an expression located in the optional select attribute or the content of this element.
The stx:with-param instruction is allowed as a child of the elements stx:process-children, stx:process-attributes, stx:process-self, stx:process-siblings, stx:process-document, stx:process-buffer, or stx:call-procedure, and must not have any of these elements in its content.
<!-- Category: top-level or template -->
<stx:param
  name = qname
  select = expression
  required = "yes" | "no">
  <!-- Content: text template -->
</stx:param>
The stx:param element is allowed as a top-level element (indicating a stylesheet parameter as a child of stx:transform) and in templates or procedures (as a child of stx:template or stx:procedure). The required name attribute specifies the name of the parameter. The optional select attribute or the content of this element specifies a default value, which is used when there is no value specified using the select attribute or the content of the appropriate stx:with-param element.
Stylesheet parameters are statically initialized while parsing the stylesheet; only the static context information is available during the initialization. Template/procedure parameters are initialized at run-time. Since there is no current source stream available during the static initialization, it is a recoverable error if a stylesheet (top-level) parameter has an stx:process-children, stx:process-attributes, stx:process-self, or stx:process-siblings instruction in its content. A processor may recover from this error by ignoring such an instruction.
The optional required attribute may be used to indicate that a parameter is mandatory. The default value is "no", indicating that the parameter is optional. If the value of the required attribute is "yes", the stx:param element must be empty, and must have no select attribute. It is a dynamic non-recoverable error if the caller doesn't supply a value with stx:with-param for a required parameter.
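The interplay of stx:with-param and stx:param can be sketched as follows. The element names, the parameter name 'marker', and the namespace URI are assumptions; stx:value-of is the string output instruction of section 4.12, which is not reproduced in this part, so its use here is also an assumption.

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Keep text content for which no template matches. -->
  <stx:options pass-through="text"/>

  <!-- Passes a value to whichever templates process the children. -->
  <stx:template match="list">
    <stx:process-children>
      <stx:with-param name="marker" select="'*'"/>
    </stx:process-children>
  </stx:template>

  <!-- Declares the parameter with a default value, which is used when
       the template is instantiated without a matching stx:with-param. -->
  <stx:template match="item">
    <stx:param name="marker" select="'-'"/>
    <line>
      <!-- stx:value-of is assumed from section 4.12. -->
      <stx:value-of select="$marker"/>
      <stx:process-children/>
    </line>
  </stx:template>

</stx:transform>
```

Declaring the parameter with required="yes" instead would require the stx:param element to be empty, and processing an 'item' without a supplied value would then be a dynamic non-recoverable error.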
<!-- Category: template -->
<stx:copy
  attributes = pattern>
  <!-- Content: template -->
</stx:copy>
The stx:copy element is used to copy the current node to the output. The optional attributes attribute contains a pattern. The attributes of the current node that match the pattern are copied to the output. If the attributes attribute isn't present, no attributes are copied with the current node. Thus, attributes="@*" copies all attributes, attributes="@foo|@bar" copies the foo and bar attributes only, attributes="@*[not(name()='foo')]" copies all but the foo attribute, and attributes="@*[false()]" doesn't copy any attributes, as if the attributes attribute were missing altogether.
If the stx:copy instruction applies to a node other than an element, the attributes attribute is ignored.
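A short sketch of stx:copy with an attribute filter, assuming a hypothetical element name ('item'), hypothetical attribute names, and an assumed namespace URI (section 3.1 is not reproduced in this part).

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Keep text content for which no template matches. -->
  <stx:options pass-through="text"/>

  <!-- Copies each 'item' element with its 'id' and 'class' attributes only;
       the content of stx:copy is instantiated between the start and the
       end of the copied element. -->
  <stx:template match="item">
    <stx:copy attributes="@id|@class">
      <stx:process-children/>
    </stx:copy>
  </stx:template>

</stx:transform>
```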
<!-- Category: template -->
<stx:process-children
  group = qname>
  <!-- Content: stx:with-param* -->
</stx:process-children>
The instruction stx:process-children suspends the processing of the current template in order to process the children of the current node. Using SAX2 terms: this instruction splits a template into two parts such that a SAX2 startElement event causes the execution of the first part and the corresponding SAX2 endElement event causes the execution of the second part.
At most one stx:process-children instruction may be executed during the processing of a template. Moreover, it is a non-recoverable error if stx:process-children is encountered after an stx:process-self or an stx:process-siblings instruction.
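The splitting of a template into two parts can be sketched as follows. The element names are hypothetical, and the namespace URI is an assumption, since section 3.1 is not reproduced in this part.

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Keep text content for which no template matches. -->
  <stx:options pass-through="text"/>

  <stx:template match="chapter">
    <!-- First part: executed when the start of 'chapter' arrives. -->
    <section>
      <heading>Chapter</heading>
      <!-- The template is suspended here while the children are processed. -->
      <stx:process-children/>
      <!-- Second part: executed when the end of 'chapter' arrives. -->
      <footer/>
    </section>
  </stx:template>

</stx:transform>
```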
Note:
If a template doesn't contain any stx:process-children instruction, the children of this element will be skipped. The default rule (<stx:options pass-through = "none"|"all"|"text">) applies only to nodes that will be processed and for which no matching template has been found.
Note:
If the current node is neither an element node nor the document root, then stx:process-children simply does nothing.
The optional group attribute makes it possible to use the specified group instead of the current group as the base for matching (see 2.4 Precedence Categories). It is a recoverable error if the group of the specified name is not available. An STX processor can recover from this error by using the current group.
<!-- Category: template -->
<stx:process-attributes
  group = qname>
  <!-- Content: stx:with-param* -->
</stx:process-attributes>
This instruction is used to apply templates to the attributes of an element node.
The optional group attribute makes it possible to use the specified group instead of the current group as the base for matching (see 2.4 Precedence Categories). It is a recoverable error if the group of the specified name is not available. An STX processor can recover from this error by using the current group.
<!-- Category: template -->
<stx:process-siblings
  while = pattern
  until = pattern
  group = qname>
  <!-- Content: stx:with-param* -->
</stx:process-siblings>
The instruction stx:process-siblings suspends the processing of the current template and processes the following siblings of the context node.
Note:
If the context node is an attribute node or the document root, stx:process-siblings does nothing.
The optional while attribute takes a pattern and causes the processing of the siblings as long as they match the specified pattern. The first non-matching node will stop this stx:process-siblings. The while attribute defaults to node().
The optional until attribute takes a pattern and causes the processing of all following siblings until a node matching the pattern is encountered. This node won't be processed by this stx:process-siblings. The until attribute defaults to node()[false()].
If both while and until attributes have been specified then both conditions have to be met. For example <stx:process-siblings while="foo" until="foo"/> doesn't process any siblings. Variable bindings used within the patterns will be interpreted with regard to the current context. That means changed group variables affect the evaluation, whereas new instances of group variables or local variables are not visible.
Note:
Whitespace text nodes not stripped from the document must be considered in the patterns, particularly when using the while attribute. A typical attribute specification would be while="foo | text()" which processes all following foo elements and potential text nodes between these foo elements.
The optional group attribute makes it possible to use the specified group instead of the current group as the base for matching (see 2.4 Precedence Categories). It is a recoverable error if the group of the specified name is not available. An STX processor can recover from this error by using the current group.
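A typical use of stx:process-siblings is to group a node with the siblings that follow it. The sketch below wraps each 'term' element and the 'definition' elements directly following it in a new 'entry' element. The element names are hypothetical, and the namespace URI is an assumption, since section 3.1 is not reproduced in this part.

```xml
<?xml version="1.0"?>
<!-- The namespace URI below is an assumption (section 3.1 is not shown here). -->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0">

  <!-- Keep text content for which no template matches. -->
  <stx:options pass-through="text"/>

  <stx:template match="glossary">
    <stx:copy>
      <stx:process-children/>
    </stx:copy>
  </stx:template>

  <stx:template match="term">
    <entry>
      <!-- stx:process-children comes first; it must not follow
           stx:process-siblings in the same template. -->
      <stx:copy>
        <stx:process-children/>
      </stx:copy>
      <!-- Processes the following siblings as long as they are
           'definition' elements or text nodes between them. -->
      <stx:process-siblings while="definition | text()"/>
    </entry>
  </stx:template>

  <stx:template match="definition">
    <stx:copy>
      <stx:process-children/>
    </stx:copy>
  </stx:template>

</stx:transform>
```

The next 'term' element does not match the while pattern, so it should end the sibling processing and be handled by a new instantiation of the 'term' template.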
An stx:process-siblings instruction encountered during the processing of the siblings of a node does not affect the while and until conditions of the previous stx:process-siblings. In other words: nested stx:process-siblings instructions process at most the siblings chosen in the preceding stx:process-siblings. That means stx:process-siblings also returns if there are no more siblings in the input available or a preceding stx:process-siblings terminates.
Though multiple stx:process-siblings instructions may appear within the same template, it is a non-recoverable error if an stx:process-children or stx:process-self instruction is encountered after stx:process-siblings.
<!-- Category: template --> <stx:process-self group = qname> <!-- Content: stx:with-param* --> </stx:process-self>
This instruction is used to process the current node using the template that would have been chosen if the current template wasn't present in the stylesheet. At most one stx:process-self instruction may be executed during the processing of a template.
Moreover, it is a non-recoverable error if an stx:process-self instruction is encountered after an stx:process-children or an stx:process-siblings instruction in a template.
The optional group attribute makes it possible to use the specified group instead of the current group as the base for matching (see 2.4 Precedence Categories). It is a recoverable error if the group of the specified name is not available. An STX processor can recover from this error by using the current group.
Note:
Specifying a different group results in choosing the best matching template in this group, whereas specifying the same group chooses the next best matching template. The latter is also different from specifying no group attribute at all in case the base group for matching is the parent group of the current group.
<!-- Category: template --> <stx:replace select = expression> <!-- Content: stx:pattern+ --> </stx:replace>
<stx:pattern value = expression case = "sensitive"|"insensitive"> <!-- Content: template --> </stx:pattern>
This instruction processes a string in a similar way as stx:template processes nodes. The mandatory select attribute of stx:replace selects a string to process by evaluating the expression and converting it to a string. The mandatory value attribute of stx:pattern takes a regular expression by evaluating the expression in the value attribute and converting it to a string, which describes a substring to look for. The optional case attribute determines whether the regular expression is case-sensitive (value "sensitive") or not (value "insensitive"). The default is "sensitive".
The stx:replace instruction looks for the pattern among the value attributes of all stx:pattern elements that matches first in the string selected by the select attribute. The substring before the matched substring will be output, and the matched substring itself will be replaced by the contents of the stx:pattern element. Afterwards this stx:replace instruction will continue by processing the substring after the matched substring. If no pattern matches then the remaining string will be emitted as a text node to the result stream. A pattern must match at least one character.
If two or more patterns match at the same position, the pattern that matches the longest character sequence is used. If two or more patterns still meet this condition, the first one is used.
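The selection rules of stx:replace (earliest match, then longest match, then first pattern) can be sketched in Python. This is a rough model under stated assumptions: Python's re module stands in for the regular expression dialect of the specification, the content of each stx:pattern is reduced to a fixed replacement string, and the function name stx_replace is hypothetical.

```python
import re

def stx_replace(text, patterns):
    """patterns: list of (regex, replacement) pairs in stylesheet order."""
    out = []
    pos = 0
    while pos < len(text):
        best = None
        for regex, replacement in patterns:
            # A pattern must match at least one character, so empty matches are skipped.
            m = next((m for m in re.compile(regex).finditer(text, pos)
                      if m.end() > m.start()), None)
            if m is None:
                continue
            # Earlier start wins, then the longer match; the strict comparison
            # below keeps the first pattern when both are equal.
            key = (m.start(), -(m.end() - m.start()))
            if best is None or key < best[0]:
                best = (key, m, replacement)
        if best is None:
            break                          # no pattern matches the remaining string
        _, m, replacement = best
        out.append(text[pos:m.start()])    # text before the match is output unchanged
        out.append(replacement)            # the match is replaced
        pos = m.end()                      # continue after the matched substring
    out.append(text[pos:])                 # the remaining string is emitted as text
    return "".join(out)

print(stx_replace("a<b&c", [("<", "&lt;"), ("&", "&amp;")]))  # a&lt;b&amp;c
print(stx_replace("foobar", [("foo", "X"), ("foobar", "Y")]))  # Y
```

In the second call both patterns match at position 0, so the longer match ("foobar") is chosen.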
<!-- Category: template --> <stx:value-of select = expression/>
This instruction emits characters to the result stream. The mandatory select attribute contains an STXPath expression which is evaluated and converted to a string. This element is always empty.
<!-- Category: template --> <stx:text markup = "error"|"ignore"|"serialize"> <!-- Content: template --> </stx:text>
This instruction emits literal character data to the result stream. The content is neither normalized nor stripped should it contain whitespace characters only.
The optional markup attribute determines how non-text nodes in the content of stx:text should be handled: "error" causes the processor to raise a run-time recoverable error for such nodes, "ignore" ignores any markup by emitting only the string value of the contents to the result stream, and "serialize" emits any markup serialized as text. The default value is "error". The processor may recover from an error raised because markup is set to "error" by ignoring this attempt.
Note:
The string created by markup="serialize" may vary in different STX implementations, because some of the lexical representation is not relevant for the information coded in XML. For example, every STX implementation may choose its own order for serializing attributes.
<!-- Category: template --> <stx:cdata> <!-- Content: text template --> </stx:cdata>
This instruction emits literal data as a CDATA section to the result stream. The content is neither normalized nor stripped should it contain whitespace characters only.
<!-- Category: template --> <stx:element name = {qname} namespace = {uri-reference}> <!-- Content: template --> </stx:element>
This instruction is used to generate an element. It has the same meaning as in [XSLT].
<!-- Category: template --> <stx:start-element name = {qname} namespace = {uri-reference}/>
<!-- Category: template --> <stx:end-element name = {qname} namespace = {uri-reference}/>
There are separate instructions available to output an element start tag and an element end tag. The name attribute is required for both instructions. Both elements must be empty.
A compliant STX processor is required to produce well-formed XML output. An attempt to create an end-tag without a matching start-tag must be reported as a non-recoverable error by the STX processor.
<!-- Category: template --> <stx:attribute name = {qname} namespace = {uri-reference} select = expression> <!-- Content: text template --> </stx:attribute>
This instruction is used to generate an attribute. It has the same meaning as in [XSLT]. Alternatively, the value of the generated attribute may be specified in the optional select attribute. It is a recoverable error if this instruction has a select attribute and is not empty. A processor can recover from this error by ignoring the content of stx:attribute.
stx:attribute must follow an element-starting instruction (stx:element, stx:start-element, stx:copy, or a literal element), and no other output-generating instructions are allowed between the element-starting instruction and stx:attribute. It is a recoverable error if there is no immediately preceding element-starting instruction. A processor can recover from this error by ignoring the stx:attribute instruction.
<!-- Category: template --> <stx:processing-instruction name = {ncname}> <!-- Content: text template --> </stx:processing-instruction>
This instruction is used to generate a processing instruction. It has the same meaning as in [XSLT].
<!-- Category: template --> <stx:comment> <!-- Content: text template --> </stx:comment>
This instruction is used to generate a comment. It has the same meaning as in [XSLT].
<!-- Category: template --> <stx:if test = expression> <!-- Content: template --> </stx:if>
The mandatory test attribute contains an STXPath expression evaluating to boolean. The content template is instantiated if and only if the test attribute has evaluated to true.
<!-- Category: template --> <stx:else> <!-- Content: template --> </stx:else>
This instruction must follow immediately after stx:if; a non-recoverable error must be reported otherwise. The content template is instantiated if and only if the test attribute of the preceding stx:if instruction has evaluated to false.
<!-- Category: template --> <stx:choose> <stx:when test = expression> <!-- Content: template --> </stx:when>+ <stx:otherwise> <!-- Content: template --> </stx:otherwise>? </stx:choose>
This instruction has the same meaning as in [XSLT].
<!-- Category: template --> <stx:for-each select = expression> <!-- Content: template --> </stx:for-each>
The stx:for-each instruction contains a template that is instantiated for each item of the sequence specified in the select attribute.
<!-- Category: template --> <stx:process-document href = expression base = {uri-reference}|"#input"|"#stylesheet" group = qname> <!-- Content: stx:with-param* --> </stx:process-document>
A stylesheet can process further source streams in addition to the one supplied when the transformation is invoked (the principal source stream). The current source stream can be changed with the stx:process-document instruction. When this instruction is instantiated, the expression in the mandatory href attribute is evaluated, each item in the resulting sequence is converted sequentially to a string (a URI), and its value is used to identify and to process a new current source stream. Then, the execution of the template containing the stx:process-document instruction continues with the original source stream.
If a URI is a relative URI then the base URI will be derived from the type of the item in the sequence that represents this URI. In case this item is a node then its base URI will be used, otherwise the base URI of the stylesheet will be used. Alternatively, the optional base attribute can be used to specify explicitly which base URI should be used. Its value must be either an absolute URI, the string "#input" in which case the base URI of the current input stream will be used, or the string "#stylesheet" in which case the base URI of the principal stylesheet will be used.
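The choice of base URI described above can be illustrated with a small Python sketch. It is a simplification, not a normative algorithm: the function name resolve_href and its parameters are hypothetical, a single stylesheet_base stands for both "the stylesheet" and "the principal stylesheet", and the standard urllib.parse.urljoin is used for the actual resolution.

```python
from urllib.parse import urljoin

def resolve_href(href, *, stylesheet_base, input_base, node_base=None, base=None):
    """Resolve one href item of stx:process-document against the chosen base URI.

    node_base is the base URI of the item if the item is a node, else None.
    base models the optional base attribute.
    """
    if base == "#input":
        chosen = input_base          # base URI of the current input stream
    elif base == "#stylesheet":
        chosen = stylesheet_base     # base URI of the (principal) stylesheet
    elif base is not None:
        chosen = base                # an explicit absolute URI
    elif node_base is not None:
        chosen = node_base           # the item is a node: use its base URI
    else:
        chosen = stylesheet_base     # any other item: use the stylesheet's base URI
    return urljoin(chosen, href)     # absolute hrefs are returned unchanged

print(resolve_href("data.xml",
                   stylesheet_base="http://example.org/stx/main.stx",
                   input_base="http://example.org/in/doc.xml"))
```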
The optional group attribute makes it possible to use the specified group instead of the current group as the base for matching (see 2.4 Precedence Categories). It is a recoverable error if the group of the specified name is not available. An STX processor can recover from this error by using the current group.
Note:
When processing a new document, the ancestor stack of the original document is not available for matching and navigation. Each new document has an ancestor stack of its own.
<!-- Category: template --> <stx:result-document href = expression> <!-- Content: template --> </stx:result-document>
A stylesheet can produce further result streams in addition to the principal result stream. The current result stream can be changed with the stx:result-document instruction. Events generated as the result of executing instructions contained within the stx:result-document element are emitted to a new current result stream identified by the URI that results from evaluating the expression in the href attribute and converting its value to a string. Then, the execution of instructions after the end of the stx:result-document element continues to emit events into the original result stream.
A sequence of events can be stored into an object called a buffer. The stored events can be emitted and processed later, in the same way as events emitted from a source stream. The events are emitted from a buffer in the same order as they were stored. In other words, buffers are temporary stores of the 'first in first out' type. The events stored in a buffer must represent a well-formed external general parsed entity (the restriction on a single root node is relaxed).
A buffer must be declared before it can be used. The same rules as for group variables (see 6.1 Variables) apply to the visibility of buffers, their shadowing, and the creation of new instances for new-scope templates (see 4.3 Templates).
<!-- Category: top-level or group --> <stx:buffer name = qname/>
The stx:buffer declaration must be either a top-level element or a child of the stx:group element. The mandatory name attribute contains a qualified name identifying the declared buffer.
<!-- Category: template --> <stx:result-buffer name = qname clear = "yes"|"no"> <!-- Content: template --> </stx:result-buffer>
The stx:result-buffer instruction directs events emitted by its content into the buffer specified with the mandatory name attribute rather than to the current result stream. The buffer must be declared with stx:buffer before it can be employed in stx:result-buffer.
If the buffer specified with the name attribute already contains a sequence of events, the new sequence of events is normally appended after the last event in the previously stored sequence. If the stx:result-buffer element has the optional clear attribute with the value "yes", the previously stored events are removed from the buffer before the new sequence of events is stored. The clear attribute defaults to "no".
Note:
To clear a buffer without storing a new sequence of events, use the stx:result-buffer instruction with no content:
<stx:result-buffer name="my-buffer" clear="yes"/>
<!-- Category: template --> <stx:process-buffer name = qname group = qname> <!-- Content: stx:with-param* --> </stx:process-buffer>
The stx:process-buffer instruction emits events stored in the buffer specified by the mandatory name attribute. The events are processed in the same way as events supplied by source streams. When the very last event from the buffer has been processed, processing in the current template continues with the instruction, declaration or literal following the stx:process-buffer instruction.
The optional group attribute makes it possible to use the specified group instead of the current group as the base for matching (see 2.4 Precedence Categories). It is a recoverable error if the group of the specified name is not available. An STX processor can recover from this error by using the current group.
The processing of events from a buffer doesn't mean the emptying of this buffer. Once a sequence of events is stored in the buffer, it can be processed repeatedly.
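The append, clear, and repeated-processing behaviour of buffers can be summarised in a few lines of Python. This is only an illustrative model: the class Buffer and its method names are hypothetical, and events are reduced to plain strings.

```python
class Buffer:
    """Toy model of an STX buffer: a first-in-first-out store of events."""

    def __init__(self):
        self.events = []

    def result_buffer(self, new_events, clear=False):
        # Models stx:result-buffer: clear="yes" empties the buffer first,
        # otherwise the new events are appended after the stored ones.
        if clear:
            self.events.clear()
        self.events.extend(new_events)

    def process_buffer(self):
        # Models stx:process-buffer: events are emitted in storage order
        # and remain in the buffer afterwards.
        return list(self.events)

buf = Buffer()
buf.result_buffer(["<a>", "</a>"])
buf.result_buffer(["<b/>"])
print(buf.process_buffer())   # ['<a>', '</a>', '<b/>']
print(buf.process_buffer())   # same events again; processing does not empty the buffer
buf.result_buffer([], clear=True)
print(buf.process_buffer())   # []
```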
Note:
A buffer is not treated as a new document, but rather as if events emitted from the buffer originate from the current source stream. The ancestor stack of the current source stream remains available for matching and navigation when processing nodes from the buffer.
<!-- Category: template --> <stx:message> <!-- Content: text template --> </stx:message>
The stx:message instruction generates a separate result stream whose handling is implementation dependent. It can be directed to a log, or to a special message resolver, etc. However, all instructions of the content of the stx:message element must be processed even if the message stream is ignored.
There are four atomic data types in STX:
string
number
boolean
node
There are eight types of node recognized in STXPath (see 2.2 Nodes). For every type of node, there is a way of determining a string-value. Since descendants are not available at the time of processing, string-values for some types of nodes are different from XPath string-values.
root nodes - there is no string value defined for root nodes, a recoverable error is reported. An STX processor is allowed to recover from this error by returning the empty string.
element nodes - if the very first child of an element happens to be a text node, the string-value of the element is the string-value of this text node. Otherwise, the string-value of the element is the empty string.
attribute nodes - the string-value of an attribute is the normalized value of this attribute
text nodes - the string-value of a text node is the character data of this node
cdata nodes - the string-value of a cdata node is the character data of this node
processing instruction nodes - the string-value of a processing instruction node is the part of the processing instruction following the target and any whitespace, not including the terminating ?>
comment nodes - the string-value of a comment is the content of this comment, not including the opening <!-- or the closing -->
namespace nodes - the string-value of a namespace node is the namespace URI
STXPath expressions (see 6 Expressions) always return a sequence. A sequence is an ordered collection of zero or more items. Unlike common lists, sequences are "flat"; sequences may not contain other sequences. Sequences may contain duplicate items. An item must be of one of the atomic types: string, number, boolean, or node.
A sequence with zero items is called an empty sequence. A sequence with exactly one item is called a singleton sequence. There is no distinction between an item and a singleton sequence containing this item; an item is equivalent to a singleton sequence containing this item and vice versa. A sequence has no identity. Equality comparison of sequences is performed only by comparing items of the sequences.
Certain operators, functions, and syntactic constructs expect a value of a particular type to be supplied: this type is referred to as a required type. In such an event, a general sequence is converted to the required type according to the conversion rules.
The empty sequence is converted to required types as defined in the following table:
required type | empty sequence |
---|---|
boolean | false |
string | empty string |
number | NaN |
node | FATAL ERROR |
A singleton sequence is converted to a required type according to the type of the only item in the sequence:
item type | to boolean | to string | to number | to node |
---|---|---|---|---|
boolean | - | false is converted to 'false', true is converted to 'true' | false is converted to 0, true is converted to 1 | FATAL ERROR |
string | 'false', '0', and the empty string are converted to false; other strings are converted to true | - | a string that consists of optional whitespace followed by an optional minus sign followed by a numeric literal (see 6.2 Literals) followed by whitespace is converted to the number that is nearest to the mathematical value represented by the string; any other string is converted to NaN. | FATAL ERROR |
number | 0, +0, -0, NaN are converted to false; other numbers are converted to true | NaN is converted to 'NaN', +0 and -0 are converted to '0', positive infinity is converted to 'Infinity', negative infinity is converted to '-Infinity'. Other numbers are represented in decimal form as numeric literal (see 6.2 Literals) with no leading zeros (apart possibly from the one required digit immediately before the decimal point), preceded by a minus sign (-) if the number is negative. | - | FATAL ERROR |
node | a node is converted to true | a node is converted to its string value (see 6 Expressions) | a node is converted to its string value (see 6 Expressions); then the rules to convert strings to numbers are applied to convert the string value to a number | - |
A sequence containing more than one item is converted according to its very first item; all other items are ignored. The same conversion rules as for singleton sequences are applied (see the table above).
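The conversion rules can be made concrete with a Python sketch. It rests on assumptions that are not part of this specification: sequences are modelled as Python lists, booleans, strings and numbers as bool, str and float, and nodes by a hypothetical Node class that only carries a string-value.

```python
import math
import re
from decimal import Decimal

class Node:
    """Hypothetical stand-in for an STX node; only its string-value matters here."""
    def __init__(self, string_value):
        self.string_value = string_value

# Optional whitespace, optional minus sign, numeric literal, trailing whitespace.
_NUMBER = re.compile(r"[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?[ \t\r\n]*")

def to_boolean(seq):
    if not seq:
        return False                       # empty sequence
    item = seq[0]                          # only the first item is considered
    if isinstance(item, bool):
        return item
    if isinstance(item, str):
        return item not in ("false", "0", "")
    if isinstance(item, (int, float)):
        return not (item == 0 or item != item)   # zero and NaN are false
    return True                            # a node is converted to true

def to_number(seq):
    if not seq:
        return math.nan
    item = seq[0]
    if isinstance(item, bool):
        return 1.0 if item else 0.0
    if isinstance(item, (int, float)):
        return float(item)
    text = item.string_value if isinstance(item, Node) else item
    return float(text) if _NUMBER.fullmatch(text) else math.nan

def to_string(seq):
    if not seq:
        return ""
    item = seq[0]
    if isinstance(item, bool):
        return "true" if item else "false"
    if isinstance(item, str):
        return item
    if isinstance(item, Node):
        return item.string_value
    x = float(item)
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    if x == int(x):
        return str(int(x))                 # also maps +0 and -0 to '0'
    return format(Decimal(repr(x)), "f")   # decimal form without an exponent

def to_node(seq):
    if seq and isinstance(seq[0], Node):
        return seq[0]
    raise TypeError("FATAL ERROR: value cannot be converted to a node")

print(to_boolean(["0"]), to_string([-0.0]), to_number([" -12.5 "]))  # False 0 -12.5
```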
STX uses an expression language of its own called STXPath. At first sight, STXPath is very similar to [XPath]. Syntactically, STXPath is close to a subset of [XPath2]. However, since STX has a different notion of context, the meaning of some expressions may differ between STXPath and XPath. Consider the following example:
In XPath, the expression /node1/node2 returns a node-set containing all node2 elements whose parent node1 is the document element. In STXPath, by contrast, the same expression returns only a single node from this node-set: the one which is an ancestor of the current node.
Expressions are used in STX in predicates of match patterns, to specify conditions for different ways of processing of the current node, to generate text to be inserted to the output stream, or to access data from the ancestor stack.
Each expression has its static context - the information that is available during static analysis of the expression, prior to its evaluation. The static context includes in-scope namespaces, default namespace for element names, and in-scope variables. The information that is available at the time when the expression is evaluated is the current context as defined in 2.3 Context.
Basic primitives of STXPath include:
variables (6.1 Variables)
literals (6.2 Literals)
parenthesized expressions (6.3 Parenthesized Expressions)
functions (6.4 Functions)
Expressions always evaluate to a sequence. See the EBNF production for expression in C STXPath Grammar for the details.
STX variables are scoped statically according to the literal structure of stylesheets. The grouping of templates is used to make the sharing of other than global variables possible.
There are two types of variables:
group variables - stx:variable is a child of either stx:transform or stx:group. Top-level variables are considered to be members of the top-most default group that exists for each stylesheet.
local variables - declared within templates.
A group variable is visible for the group where the variable is declared, for all descendant groups and for all templates belonging to these groups. A local variable is visible for all following siblings of the variable declaration and their descendants. Group variables may be shadowed (another variable with the same name is visible) by descendant group variables and by local variables. It is a non-recoverable error to redeclare a variable with the same name in the same group or template.
Variables always contain a sequence. The STX instructions stx:variable and stx:assign are used to evaluate an expression and store its value into a variable.
Since variables are re-assignable, each variable must be declared using the stx:variable element before it is used (assigned, referenced). Group variables are statically initialized while parsing the stylesheet; only the static context information is available during the initialization. Local variables are initialized at run-time. A variable declared with no value is initialized with the empty sequence.
<!-- Category: top-level or group or template --> <stx:variable name = qname select = expression keep-value = "yes"|"no"> <!-- Content: text template --> </stx:variable>
This instruction is used to declare and initialize a variable. The mandatory name attribute contains the name of the variable. An expression in the select attribute is evaluated and the variable is initialized with its result. The select attribute is optional; if it is missing, the variable is initialized with the string resulting from the content of the stx:variable element. If the content is empty (the stx:variable element has no children), the variable is initialized with the empty sequence.
It is a recoverable error if the element stx:variable declaring a group variable contains an stx:process-children, stx:process-self, stx:process-siblings, or stx:process-attributes instruction in its content. A processor may recover from this error by ignoring such an instruction.
The optional keep-value attribute specifies whether a new instance of the variable created by instantiating a template having its new-scope attribute set to "yes" is initialized with the value of the shadowed variable (yes) or not (no). This attribute is allowed only for group variables. The default value is no. If there is no shadowed variable yet, the keep-value attribute is ignored.
<!-- Category: top-level or group or template --> <stx:assign name = qname select = expression> <!-- Content: text template --> </stx:assign>
This instruction is used to assign a new value to a previously declared variable. The mandatory name attribute contains the name of the variable. The expression in the optional select attribute is evaluated and its result is assigned to the variable. If the select attribute is missing, the string resulting from the content of the stx:assign element is assigned to the variable. If the content is empty, the empty sequence is assigned to the variable.
A literal is a direct syntactic representation of an atomic value. STXPath supports two kinds of literals: string literals and numeric literals.
The value of a string literal is a singleton sequence containing an item whose atomic type is string and whose value is the string denoted by the characters between the delimiting quotation marks.
The value of a numeric literal is a singleton sequence containing an item whose type is number and whose value is obtained by parsing the numeric literal according to the rules for string to numbers conversion (see 5.3 Type Conversions).
NumericLiteral ::= IntegerLiteral | DecimalLiteral | DoubleLiteral
IntegerLiteral ::= Digits
DecimalLiteral ::= ('.' Digits) | (Digits '.' [0-9]*)
DoubleLiteral ::= (('.' Digits) | (Digits ('.' [0-9]*)?)) ([e] | [E]) ([+] | [-])? Digits
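The productions above translate directly into regular expressions. The following Python sketch assumes that Digits means one or more of [0-9] and that the exponent sign is an optional + or -; the function name classify is hypothetical.

```python
import re

DIGITS = r"[0-9]+"   # assumption: Digits is one or more decimal digits
INTEGER = re.compile(DIGITS)
DECIMAL = re.compile(rf"(\.{DIGITS})|({DIGITS}\.[0-9]*)")
DOUBLE = re.compile(rf"((\.{DIGITS})|({DIGITS}(\.[0-9]*)?))[eE][+-]?{DIGITS}")

def classify(token):
    """Return the name of the production that matches the whole token, or None."""
    for name, pattern in (("IntegerLiteral", INTEGER),
                          ("DecimalLiteral", DECIMAL),
                          ("DoubleLiteral", DOUBLE)):
        if pattern.fullmatch(token):
            return name
    return None

for token in ("42", "4.", ".5", "1.5E-3", "1e"):
    print(token, classify(token))
```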
Parentheses may be used to enforce a particular evaluation order in expressions that contain multiple operators.
Parentheses are also used as delimiters in constructing a sequence, as described in 6.6 Sequence Expressions.
A function call consists of a function name followed by a parenthesized list of zero or more expressions. The expressions inside the parentheses provide the arguments of the function call. The number of arguments must be equal to the number of function parameters; otherwise a static non-recoverable error is raised.
A function call is evaluated as follows:
Each argument expression is evaluated, producing an argument value (sequence).
If the corresponding function parameter has a required type, the argument value is converted to this type.
The function is executed using the converted argument values. The result is a value of the function's declared return type.
The following list of STXPath functions is categorized by the required types of the primary arguments:
The empty() function returns true if the argument is the empty sequence; otherwise it returns false.
The item-at() function returns the item from the first argument sequence at the position given by the second argument. The index number is rounded to the nearest integer if necessary. If the sequence is the empty sequence, this function returns the empty sequence. If the value of index is greater than the number of items in the sequence, or is less than or equal to zero, then the function reports a non-recoverable error.
The sublist() function returns the contiguous sequence of items from the first argument (source sequence) beginning at the position specified by the second argument (index) and continuing for the number of items indicated by the third argument (length). If length is not specified, then the sublist identifies items to the end of the source sequence. The index and length numbers are rounded to the nearest integers if necessary. If the source sequence is the empty sequence, this function returns the empty sequence. If the value of index is greater than the number of items in the sequence, or is less than or equal to zero, then the function reports a non-recoverable error. The length can be greater than the number of items in the source sequence following the beginning position, in which case the sublist identifies items to the end of the source sequence.
The count() function returns the number of items in the sequence.
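The index handling of item-at() and sublist() can be sketched in Python. Two points are assumptions rather than statements of this specification: rounding to the nearest integer is modelled as floor(x + 0.5), and the non-recoverable error is modelled as an IndexError. Sequences are represented as Python lists, with the empty list standing for the empty sequence.

```python
import math

def _round(x):
    # Assumption: "nearest integer" with halves rounded up.
    return math.floor(x + 0.5)

def _check_index(seq, index):
    i = _round(index)
    if i <= 0 or i > len(seq):
        raise IndexError("non-recoverable error: index out of range")
    return i

def item_at(seq, index):
    if not seq:
        return []                       # empty sequence in, empty sequence out
    i = _check_index(seq, index)
    return [seq[i - 1]]                 # positions are 1-based

def sublist(seq, index, length=None):
    if not seq:
        return []
    i = _check_index(seq, index)
    if length is None:
        return seq[i - 1:]              # to the end of the source sequence
    n = max(0, _round(length))
    return seq[i - 1:i - 1 + n]         # a too-large length is simply truncated

letters = ["a", "b", "c", "d", "e"]
print(item_at(letters, 2))        # ['b']
print(sublist(letters, 2, 3))     # ['b', 'c', 'd']
print(sublist(letters, 4, 10))    # ['d', 'e']
```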
The name() function returns a string containing a qualified name representing the expanded-name of the node in the argument. For nodes with no name defined (root, text, CDATA text, comment), this function returns the empty string. For processing-instructions, this function returns their target.
The namespace-uri() function returns the namespace URI of the expanded-name of the node in the argument. For nodes with no namespace defined (root, text, CDATA text, processing instruction, comment, namespace), this function returns the empty string.
The local-name() function returns the local part of the expanded-name of the node in the argument. For nodes with no local name defined (root, text, CDATA text, comment), this function returns the empty string. For processing-instructions, this function returns their target.
The prefix() function returns the prefix of the expanded-name of the node in the argument. For nodes with no prefix defined (root, text, CDATA text, processing instruction, comment, namespace), this function returns the empty string.
The position() function returns a number equal to the position of the current node relative to other siblings, see 2.3 Context for details of position() semantics.
The get-node() function returns the node which is in the ancestor stack at the level given by the argument. The level number is rounded to the nearest integer if necessary. For example, get-node(0) returns the root of the document, get-node(1) returns the document element. get-node(level()) returns the current node. If there is no node at the requested level in the ancestor stack, the function returns the empty sequence.
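The relationship between get-node(), level(), and the ancestor stack can be pictured with a short Python model. The list representation of the stack, the string node names, and the floor(x + 0.5) rounding are assumptions made only for this illustration.

```python
import math

def get_node(ancestor_stack, level):
    """Return a singleton list with the node at `level`, or [] if there is none."""
    n = math.floor(level + 0.5)           # assumption: halves are rounded up
    if 0 <= n < len(ancestor_stack):
        return [ancestor_stack[n]]
    return []                             # the empty sequence

def level(ancestor_stack):
    # The current node is the last entry; the root sits at level 0.
    return len(ancestor_stack) - 1

stack = ["#root", "doc", "chapter"]       # root, document element, current node
print(get_node(stack, 0))                 # ['#root']
print(get_node(stack, level(stack)))      # ['chapter']
print(get_node(stack, 5))                 # []
```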
The has-child-nodes() function returns true if and only if the current node is the document node or an element node and has child nodes (it is not empty). It returns false otherwise.
The true() function always returns true.
The false() function always returns false.
The not() function reduces its parameter to an effective boolean value using the same rules that are used for the operands of logical expressions (see 6.9 Logical Expressions). It then returns true if the effective boolean value of its parameter is false, and false if the effective boolean value of its parameter is true.
The starts-with() function returns true if the first argument string starts with the second argument string, otherwise it returns false. If the value of any argument is the empty sequence, the function returns the empty sequence.
The contains() function returns true if the first argument string contains the second argument string, otherwise it returns false. If the value of any argument is the empty sequence, the function returns the empty sequence.
The substring() function returns the substring of the first argument string that starts at the offset given by the second argument and has the length given by the third argument; if the third argument is omitted, it returns all characters from the offset to the end of the string. The offset and length numbers are rounded to the nearest integer if necessary. The offset of the first character is 1. If the value of any argument is the empty sequence, the function returns the empty sequence.
The substring-before() function returns the part of the first argument string from the beginning of the string up to (but not including) the first occurrence of the second argument string. The empty string is returned if the first argument string does not contain the second argument string. If the value of any argument is the empty sequence, the function returns the empty sequence.
The substring-after() function returns the part of the first argument string from the end of the first occurrence of the second argument string to the end of the (first) string. The empty string is returned if the first argument string does not contain the second argument string. If the value of any argument is the empty sequence, the function returns the empty sequence.
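The three substring functions can be modelled in Python as follows. The sketch makes several assumptions: the empty sequence is represented by None, rounding is floor(x + 0.5), and offsets below 1 are handled by the XPath 1.0 position rule (a character at position p is kept if offset <= p < offset + length), which this specification does not spell out.

```python
import math

_OMITTED = object()   # marks an omitted third argument, distinct from None

def _round(x):
    return math.floor(x + 0.5)

def substring(s, offset, length=_OMITTED):
    if s is None or offset is None or length is None:
        return None                                # empty sequence propagates
    start = _round(offset)
    end = len(s) + 1 if length is _OMITTED else start + _round(length)
    first, last = max(start, 1), min(end, len(s) + 1)
    return s[first - 1:last - 1] if last > first else ""

def substring_before(s, sub):
    if s is None or sub is None:
        return None
    i = s.find(sub)                                # first occurrence
    return s[:i] if i >= 0 else ""

def substring_after(s, sub):
    if s is None or sub is None:
        return None
    i = s.find(sub)
    return s[i + len(sub):] if i >= 0 else ""

print(substring("12345", 2, 3))              # 234
print(substring_before("1999/04/01", "/"))   # 1999
print(substring_after("1999/04/01", "/"))    # 04/01
```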
The string-length() function returns the number of characters in a string. If the value of the argument is the empty sequence, the function returns the empty sequence.
The normalize-space() function returns the argument string after leading and trailing whitespace is stripped and consecutive whitespace characters are replaced with a single space. If the value of the argument is the empty sequence, the function returns the empty sequence.
The translate() function returns the first argument string with occurrences of characters in the second argument string replaced by the corresponding characters from the third argument string. If there is a character in the second argument string with no character at a corresponding position in the third argument string (because the second argument string is longer than the third argument string), then occurrences of that character in the first argument string are removed. If a character occurs more than once in the second argument string, then the first occurrence determines the replacement character. If the third argument string is longer than the second argument string, then excess characters are ignored. If the value of any argument is the empty sequence, the function returns the empty sequence.
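The character mapping performed by translate() can be written out in a few lines of Python. In this sketch the empty sequence is represented by None, which is an assumption of the model rather than part of STXPath.

```python
def translate(s, from_chars, to_chars):
    if s is None or from_chars is None or to_chars is None:
        return None                              # empty sequence propagates
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch not in mapping:                    # the first occurrence decides
            # No counterpart in to_chars means the character is removed.
            mapping[ch] = to_chars[i] if i < len(to_chars) else ""
    # Characters that are not listed in from_chars are copied unchanged;
    # surplus characters in to_chars are never consulted.
    return "".join(mapping.get(c, c) for c in s)

print(translate("bar", "abc", "ABC"))        # BAr
print(translate("--aaa--", "abc-", "ABC"))   # AAA
```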
The concat() function returns the concatenation of its arguments. If the value of any argument is the empty sequence, the function returns the empty sequence.
The replace() function returns the first argument string with parts that match a regular expression given in the second argument string replaced with the third argument string. The regular expression semantics defined in XML Schema Part 2: Datatypes ([XSD2]), Appendix F are used.
The optional fourth argument is a string consisting of character flags to be used by the match. If a character is present then that flag is true. The flags are:
g - global replace
All occurrences of the regular expression in the string are replaced. If this character is not present, then only the first occurrence of the regular expression is replaced.
i - case insensitive
The regular expression is treated as case insensitive. If this character is not present, then the regular expression is case sensitive.
If the value of any argument is the empty sequence, the function returns the empty sequence.
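The effect of the two flags can be sketched with Python's re module. This is an approximation under stated assumptions: Python regular expressions are not identical to the [XSD2] Appendix F dialect, the replacement string is treated literally (the text above does not say whether group references are supported), and None stands in for the empty sequence.

```python
import re
from typing import Optional


def replace(source: Optional[str],
            pattern: Optional[str],
            replacement: Optional[str],
            flags: Optional[str] = "") -> Optional[str]:
    """Model of replace() with the 'g' and 'i' flags."""
    if source is None or pattern is None or replacement is None or flags is None:
        return None
    re_flags = re.IGNORECASE if "i" in flags else 0
    # In re.sub, count=0 replaces every match; count=1 only the first.
    count = 0 if "g" in flags else 1
    # A callable replacement keeps Python from interpreting backslashes.
    return re.sub(pattern, lambda match: replacement, source,
                  count=count, flags=re_flags)
```

In this model replace("banana", "a", "o") gives "bonana", while adding the "g" flag gives "bonono".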
The match() function returns a sequence of integers that identifies the location within the first argument string that is matched by the regular expression given in the second argument string. If there is no substring of the first string that matches the regular expression, the empty sequence is returned. Otherwise, a sequence of two integers is returned: the first integer is the position of the start of the substring and the second integer is the length of the substring that matches. The regular expression semantics defined in XML Schema Part 2: Datatypes ([XSD2]), Appendix F are used.
The optional third argument is a string consisting of character flags to be used by the match. If a character is present then that flag is true. The flags are:
g - global match
All occurrences of the regular expression in the string are matched. If this character is not present, then only the first occurrence of the regular expression is matched.
i - case insensitive
The regular expression is treated as case insensitive. If this character is not present, then the regular expression is case sensitive.
If the value of any argument is the empty sequence, the function returns the empty sequence.
The floor() function returns the largest number that is not greater than the argument and that is an integer. If the value of the argument is the empty sequence, the function returns the empty sequence.
The ceiling() function returns the smallest number that is not less than the argument and that is an integer. If the value of the argument is the empty sequence, the function returns the empty sequence.
The round() function returns the number that is closest to the argument and that is an integer. If there are two such numbers, then the greater one is returned. If the argument is NaN, then NaN is returned. If the value of the argument is the empty sequence, the function returns the empty sequence.
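The tie-breaking rule of round() differs from the round-half-to-even behaviour found in some languages (Python's built-in round(2.5) returns 2, for example). The following Python sketch models the rule stated above; the name stx_round, the use of None for the empty sequence, and the pass-through of infinities are assumptions of the example.

```python
import math
from typing import Optional


def stx_round(value: Optional[float]) -> Optional[float]:
    """Model of round(): ties go to the greater integer."""
    if value is None:
        return None  # empty sequence in, empty sequence out
    if math.isnan(value) or math.isinf(value):
        return value  # NaN stays NaN; infinity handling is an assumption
    lower = math.floor(value)
    # Exactly halfway (or closer to the upper neighbour) rounds up.
    return float(lower + 1) if value - lower >= 0.5 else float(lower)
```

Here stx_round(2.5) gives 3.0 and stx_round(-2.5) gives -2.0, since -2 is the greater of the two nearest integers.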
The sum() function returns the sum, for each item in the argument sequence, of the result of converting the item to a number. If the value of the argument is the empty sequence, the function returns the empty sequence.
The string() function returns the result of converting the argument to a string. See 5.3 Type Conversions for details.
The number() function returns the result of converting the argument to a number.
The boolean() function returns the result of converting the argument to a boolean.
The level() function returns the level of the argument node in the ancestor stack. level() and level(.) return the level of the current node. level(/) returns 0. If the value of the argument is the empty sequence, the function returns the empty sequence.
The only data available when processing the current node is the data related to the current node, the data related to the next node, and the data related to nodes in the ancestor stack. Location paths called data accessors are used to access this data. Axes in data accessors are limited to:
parent and ancestor axes in relative location paths
child and descendant axes (abbreviated syntax only) in absolute location paths
attribute axis (abbreviated syntax only)
text() node test (child axis) for the current node
Predicates are not allowed in data accessors.
A data accessor always returns a sequence (often a singleton). These sequences are very limited; they can contain nothing but nodes stored in the ancestor stack (the current node and its attributes, ancestor elements and their attributes) and the next node (only if the next node happens to be a text node, accessed with text()). Resulting sequences can be either passed to functions operating on sequences or converted to string, number or boolean.
Here are some examples of data accessors:
. - returns the current node
text() - returns the first text child of the current node provided it is the very first child of the current node. Otherwise, it returns the empty sequence.
parent::* - returns the parent node of the current node
ancestor::* - returns a sequence whose items are all ancestors of the current node
@foo - returns the foo attribute of the current node
ancestor::*/@bar - returns a sequence of bar attributes of ancestors of the current node
/aaa/bbb - returns a bbb element from the ancestor stack which is a child of the aaa element, which is the root element of the ancestor stack (and hence the root element of the input document)
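To make the examples concrete, the ancestor stack can be pictured as a simple list of element records. The Python sketch below is a simplified model, not a description of any STX processor: the Element class, the helper names, and the choice of document order for ancestor::* are all assumptions, and the document root node and text() are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


# Stack while processing <ccc> in <aaa id="1"><bbb bar="x"><ccc foo="y" bar="z">
stack = [Element("aaa", {"id": "1"}),
         Element("bbb", {"bar": "x"}),
         Element("ccc", {"foo": "y", "bar": "z"})]


def current(stack: List[Element]) -> List[Element]:      # .
    return stack[-1:]


def parent(stack: List[Element]) -> List[Element]:       # parent::*
    return stack[-2:-1]


def ancestors(stack: List[Element]) -> List[Element]:    # ancestor::*
    return stack[:-1]


def attribute(nodes: List[Element], name: str) -> List[str]:  # @name
    return [n.attributes[name] for n in nodes if name in n.attributes]


def absolute(stack: List[Element], *names: str) -> List[Element]:  # /aaa/bbb
    """Follow child steps from the root along the ancestor stack."""
    if len(names) > len(stack):
        return []
    if all(e.name == n for e, n in zip(stack, names)):
        return [stack[len(names) - 1]]
    return []
```

With this stack, attribute(ancestors(stack), "bar") models ancestor::*/@bar and returns a one-item sequence, because only bbb among the ancestors carries a bar attribute.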
STXPath supports operators to construct and combine sequences. One way to construct a sequence is using a parenthesized expression (6.3 Parenthesized Expressions), which consists of zero or more expressions separated with the comma operator and delimited with parentheses. The parenthesized expression is evaluated by evaluating each of its constituent expressions and concatenating the resulting sequences, in order, into a single result sequence.
Here are some examples of expressions that construct sequences:
This expression is a sequence of five integers:
(10, 1, 2, 3, 4)
This expression constructs one sequence from the sequences 10, (1, 2), the empty sequence (), and (3, 4):
(10, (1, 2), (), (3, 4))
It evaluates to the sequence (10, 1, 2, 3, 4).
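The concatenation rule means that sequences never nest. A minimal Python model, in which a list stands for a sequence and the name make_sequence is an assumption of the example, could look like this:

```python
def make_sequence(*operands):
    """Model of the comma operator: concatenate operands into one flat list."""
    result = []
    for operand in operands:
        if isinstance(operand, list):
            result.extend(operand)  # a sequence contributes its items
        else:
            result.append(operand)  # a single item contributes itself
    return result
```

Evaluating make_sequence(10, make_sequence(1, 2), make_sequence(), make_sequence(3, 4)) produces the flat list [10, 1, 2, 3, 4], mirroring the example above.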
STXPath provides arithmetic operators for addition, subtraction, multiplication, division, and modulus, in their usual binary and unary forms. The binary subtraction operator must be preceded by white space in order to distinguish it from a hyphen, which is a valid name character.
An arithmetic expression is evaluated by applying the following rules:
If either operand is the empty sequence, the result of the operation is the empty sequence.
Operands other than empty sequences are converted (5.3 Type Conversions) to numbers before the expression is evaluated. If the conversion fails (returns NaN), a non-recoverable error is reported.
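The two rules can be sketched in Python as follows. The sketch is hedged in several ways: None stands for the empty sequence, operands are assumed to be single atomic values, the to_number helper is only a stand-in for the normative rules in 5.3 Type Conversions, and only three operators are modelled.

```python
import math
import re
from typing import Optional, Union

Atomic = Union[str, float, int, bool]
_DECIMAL = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")


class NonRecoverableError(Exception):
    """Stands in for the non-recoverable error required by the text."""


def to_number(value: Atomic) -> float:
    # Assumed conversion; the normative rules are in 5.3 Type Conversions.
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    return float(value) if _DECIMAL.fullmatch(value) else math.nan


def arithmetic(left: Optional[Atomic], op: str,
               right: Optional[Atomic]) -> Optional[float]:
    if left is None or right is None:
        return None  # rule 1: empty sequence in, empty sequence out
    a, b = to_number(left), to_number(right)
    if math.isnan(a) or math.isnan(b):
        raise NonRecoverableError("operand is not a number")  # rule 2
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    raise ValueError("operator not modelled in this sketch: " + op)
```

In this model arithmetic("3", "*", 2) evaluates to 6.0, whereas arithmetic("abc", "+", 1) raises the error.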
Comparison expressions allow two values to be compared. STXPath provides the following general comparison operators: =, !=, <, <=, >, >=. The result of a comparison is always true or false (a singleton sequence containing one boolean item).
The result of a comparison of sequences is defined by applying the following rules, in order:
If either operand is the empty sequence, the result is false.
The comparison A operator B is true for sequences A and B if the comparison a operator b is true for some item a in A and some item b in B.
Otherwise, A operator B is false.
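These rules give comparisons an existential meaning: one matching pair of items is enough. The Python sketch below models this with lists as sequences; item comparison is delegated to Python's own operators as a placeholder, which is an assumption of the example rather than the STXPath item rules.

```python
import operator

_OPERATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def general_compare(left: list, op: str, right: list) -> bool:
    """True if some pair (a, b) drawn from left and right satisfies op."""
    if not left or not right:
        return False  # an empty operand always yields false
    compare = _OPERATORS[op]
    return any(compare(a, b) for a in left for b in right)
```

One consequence worth noting is that general_compare([1, 2], "=", [1, 2]) and general_compare([1, 2], "!=", [1, 2]) are both true in this model, because each operator finds at least one satisfying pair.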
The result of a comparison of items is defined by applying the following rules. The rules defined in 5.3 Type Conversions apply for conversions:
If both items to be compared are nodes, then the comparison will be true if and only if the result of performing the comparison on the string-values of the two nodes is true.
If one item to be compared is a node and the other is a number, then the comparison will be true if and only if the result of performing the comparison on the number and on the result of converting the string-value of that node to a number is true.
If one item to be compared is a node and the other is a string, then the comparison will be true if and only if the result of performing the comparison on the string-value of the node and the other string is true.
If one item to be compared is a node and the other is a boolean, then the comparison will be true if and only if the result of performing the comparison of true and the boolean value is true.
When neither item to be compared is a node and the operator is = or !=, then the items are compared by converting them to a common type as follows and then comparing them. If at least one item to be compared is a boolean, then each item to be compared is converted to a boolean. Otherwise, if at least one item to be compared is a number, then each item to be compared is converted to a number. Otherwise, both items to be compared are converted to strings.
When neither item to be compared is a node and the operator is <=, <, >= or >, then the items are compared by converting both items to numbers and comparing the numbers.
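For items that are not nodes, the choice of a common type for = can be sketched in Python. The conversion helpers below are assumptions standing in for 5.3 Type Conversions (a non-empty string and a non-zero, non-NaN number are taken to convert to true), and the function names are invented for the example.

```python
import math
import re
from typing import Union

Atomic = Union[str, float, int, bool]
_DECIMAL = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")


def to_boolean(value: Atomic) -> bool:
    # Assumed conversion rules; see 5.3 Type Conversions for the real ones.
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0 and not math.isnan(value)
    return len(value) > 0


def to_number(value: Atomic) -> float:
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    return float(value) if _DECIMAL.fullmatch(value) else math.nan


def items_equal(a: Atomic, b: Atomic) -> bool:
    """Model of '=' for two non-node items."""
    if isinstance(a, bool) or isinstance(b, bool):
        return to_boolean(a) == to_boolean(b)   # boolean takes priority
    if isinstance(a, (int, float)) or isinstance(b, (int, float)):
        return to_number(a) == to_number(b)     # then number
    return str(a) == str(b)                     # otherwise string
```

Under these assumptions items_equal(1, "1.0") is true because both sides are compared as numbers, while items_equal("1", "1.0") is false because both sides are compared as strings.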
STXPath provides two common logical operators: and and or. The value of a logical expression is always one of the boolean values true or false (a singleton sequence containing a boolean item).
A logical expression is evaluated by reducing each of its operands to an effective boolean value by applying the following rules, in order:
If the operand is the empty sequence, its effective boolean value is false.
If the operand is a singleton sequence containing a boolean item, the item serves as the effective boolean value.
If the operand is a sequence that contains at least one node, its effective boolean value is true.
In any other case, operands are converted to boolean (see 5.3 Type Conversions) to get effective boolean values.
An AND expression returns true if the effective boolean values of both of its operands are true; otherwise it returns false.
An OR expression returns false if the effective boolean values of both of its operands are false; otherwise it returns true.
In addition to logical expressions, STXPath provides a function named not() that takes a general sequence as parameter and returns a boolean value.
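The ordered rules for the effective boolean value can be modelled in Python as shown below. This is a sketch under assumptions: lists stand for sequences, the Node class is a placeholder for any node, and the final fallback handles only the first item using conversion rules that are assumed rather than taken from 5.3 Type Conversions.

```python
import math


class Node:
    """Placeholder for any node held in a sequence."""


def effective_boolean_value(operand: list) -> bool:
    if not operand:
        return False                               # rule 1: empty sequence
    if len(operand) == 1 and isinstance(operand[0], bool):
        return operand[0]                          # rule 2: singleton boolean
    if any(isinstance(item, Node) for item in operand):
        return True                                # rule 3: contains a node
    # Rule 4 (assumed conversion, first item only).
    item = operand[0]
    if isinstance(item, (int, float)):
        return item != 0 and not math.isnan(item)
    return len(str(item)) > 0


def and_expr(left: list, right: list) -> bool:
    return effective_boolean_value(left) and effective_boolean_value(right)


def or_expr(left: list, right: list) -> bool:
    return effective_boolean_value(left) or effective_boolean_value(right)
```

For example, or_expr([], [Node()]) is true in this model: the left operand is empty and so false, but the right operand contains a node.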
Plain list only so far:
stx:transform stx:options stx:include stx:namespace-alias stx:template stx:procedure stx:group stx:call-procedure stx:copy stx:process-children stx:process-attributes stx:process-siblings stx:process-self stx:value-of stx:text stx:cdata stx:element stx:start-element stx:end-element stx:processing-instruction stx:comment stx:attribute stx:if stx:else stx:choose stx:when stx:otherwise stx:variable stx:assign stx:with-param stx:param stx:for-each stx:process-document stx:result-document stx:buffer stx:process-buffer stx:result-buffer stx:replace stx:pattern
The following is a complete grammar for STXPath in EBNF notation.
[1] pattern ::= PathPattern ('|' PathPattern)?
[2] expression ::= Expr
[3] PathPattern ::= AbsolutePattern | RelativePattern
[4] AbsolutePattern ::= '/' RelativePattern?
[5] RelativePattern ::= Step (('/' RelativePattern) | ('//' RelativePattern))?
[6] Step ::= NodeTest Predicate?
[7] NodeTest ::= NameTest | KindTest
[8] Predicate ::= '[' Expr ']'
[9] NameTest ::= NodeNameTest | AttributeNameTest
[10] NodeNameTest ::= QName | NCName ':' '*' | '*' | '*' ':' NCName
[11] AttributeNameTest ::= '@' QName | '@' NCName ':' '*' | '@' '*' | '@' '*' ':' NCName
[12] KindTest ::= AnyKindTest | CommentTest | ProcessingInstructionTest | TextTest | CDATATest
[13] AnyKindTest ::= 'node()'
[14] CommentTest ::= 'comment()'
[15] ProcessingInstructionTest ::= 'processing-instruction(' StringLiteral? ')'
[16] TextTest ::= 'text()'
[17] CDATATest ::= 'cdata()'
[18] Expr ::= OrExpr
[19] OrExpr ::= AndExpr | OrExpr 'or' AndExpr
[20] AndExpr ::= GeneralComp | AndExpr 'and' GeneralComp
[21] GeneralComp ::= AdditiveExpr | GeneralComp CompOp AdditiveExpr
[22] AdditiveExpr ::= MultiplicativeExpr | AdditiveExpr ('+' | '-') MultiplicativeExpr
[23] MultiplicativeExpr ::= UnaryExpr | MultiplicativeExpr ('*' | 'div' | 'mod') UnaryExpr
[24] UnaryExpr ::= ('-' | '+')? BasicExpr
[25] BasicExpr ::= DataAccessor | ParenthesizedExpr | Literal
[26] ParenthesizedExpr ::= '(' ExprSequence? ')'
[27] ExprSequence ::= Expr (',' Expr)*
[28] Literal ::= NumericLiteral | StringLiteral
[29] DataAccessor ::= NodeAccessor | NodeAccessor '/' PropertyAccessor | PropertyAccessor
[30] NodeAccessor ::= PathAccessor | Variable | FunctionCall
[31] FunctionCall ::= QName '(' ExprSequence? ')'
[32] PathAccessor ::= ('/' | '//')? RelativeAccessor
[33] RelativeAccessor ::= RelativeAccessor ('/' | '//') AccessorStep | AccessorStep
[34] AccessorStep ::= AccessorAxis? NodeNameTest | '.' | '..'
[35] PropertyAccessor ::= TextTest | AttributeNameTest | NamespaceNameTest
[36] AccessorAxis ::= 'parent::' | 'ancestor::'
[37] NamespaceNameTest ::= 'namespace::' NCName | 'namespace::' '*'
[38] CompOp ::= '=' | '!=' | '<' | '<=' | '>' | '>='
[39] NumericLiteral ::= IntegerLiteral | DecimalLiteral | DoubleLiteral
[40] IntegerLiteral ::= Digits
[41] DecimalLiteral ::= ('.' Digits) | (Digits '.' [0-9]*)
[42] DoubleLiteral ::= (('.' Digits) | (Digits ('.' [0-9]*)?)) ([e] | [E]) ([+] | [-])? Digits
[43] StringLiteral ::= (["][^"]*["]) | (['][^']*['])
[44] Variable ::= '$' QName
[45] Digits ::= [0-9]+
In addition, the non-terminals QName and NCName used above are defined in [XML Names].
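As an illustration of how the literal productions fit together, the numeric literal rules [39] to [45] can be transcribed into regular expressions. The Python sketch below is an informal aid, not a conforming tokenizer; it assumes that the exponent sign in production [42] is an optional '+' or '-', and the function name classify is invented for the example.

```python
import re
from typing import Optional

DIGITS = r"[0-9]+"                                    # [45]
INTEGER = DIGITS                                      # [40]
DECIMAL = rf"(?:\.{DIGITS}|{DIGITS}\.[0-9]*)"         # [41]
DOUBLE = (rf"(?:\.{DIGITS}|{DIGITS}(?:\.[0-9]*)?)"    # [42] mantissa
          rf"[eE][+-]?{DIGITS}")                      # [42] exponent


def classify(text: str) -> Optional[str]:
    """Return the name of the production that matches the whole text."""
    if re.fullmatch(INTEGER, text):
        return "IntegerLiteral"
    if re.fullmatch(DECIMAL, text):
        return "DecimalLiteral"
    if re.fullmatch(DOUBLE, text):
        return "DoubleLiteral"
    return None  # not a NumericLiteral
```

Note that a leading sign is not part of any numeric literal; in the grammar it is supplied by UnaryExpr (production [24]), so classify("-1") returns None here.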
The following people have contributed to this specification by sending valuable comments to the stx@gingerall.cz mailing list:
Barrie Slaymaker
Miguel Branco
Eric van der Vlist
Richard R. McKinley
Jan Poslušný
Gunnlaugur Thor Briem
Robert Koberg
Michael Brennan
2002-12-08 : CN : Added STXPath grammar section. Changed grammar examples to prodrecap elements.
2002-12-10 : PC : 'mode' replaced with 'group' attribute in stx:process-xxx. 'name' attribute added to stx:group. stx:with-param allowed in stx:process-xxx. stx:param allowed in templates
2002-12-11 : OB : Added stx:process-siblings, rephrased description of stx:process-children and stx:process-self
2002-12-16 : PC : Removed joining of output text events. 'group' attribute added to stx:procedure. Changed definition of text template; TTs are run-time checkable now. Added stx:message.
2002-12-19 : OB : Added 'markup' attribute to stx:text. Incorporated section 'Processing Text' including stx:replace and stx:pattern.
2002-12-25 : OB : Added 'required' attribute to stx:param; Forbid stx:process-... in stx:with-param, stylesheet parameters, and group variables.
2002-12-30 : OB : Added 'base' attribute to stx:process-document; Allow attribute value templates (AVT) for stx:processing-instruction/@name, stx:process-document/@href, and stx:result-document/@href
2003-01-06 : PC : TT definition moved to Templates. Added Errors section, all errors categorized. Minor changes in 'Parameters'.
2003-01-08 : PC : Namespace nodes added. Naming changes: start/end-element, @new-scope, @pass-through. Added notes about stacks for process-buffer/document. Conflicts of group names classified as recoverable errors. Missing group name classified as recoverable error.
2003-01-09 : OB : Revised STXPath grammar; added links from within the specification to non-terminals. Changed type of stx:process-document/@href and stx:result-document/@href to expression. Clarified base URI for stx:process-document.
2003-01-10 : PC : Element string-value changed to text().