Copyright © 2002-2003 authors and contributors. All rights reserved.
STX is an XML-based language for transforming XML documents into other XML documents without building a tree in memory. An STX processor transforms one or more source streams of SAX2 events according to rules given in an XML document called an STX stylesheet and generates one or more result SAX2 streams. Each incoming event invokes one or more rules, which can, for example, emit events to one of the result streams or access working storage.
This document is a working draft of the STX transformation language specification.
1 Introduction
2 Concepts
2.1 Initiating a Transformation
2.2 Nodes
2.3 Context
2.4 Precedence Categories
2.5 Expressions
2.6 Match Patterns
2.7 Attribute Value Templates
2.8 Whitespace Stripping
2.9 Errors
3 Stylesheet Structure
3.1 STX Namespace
3.2 Transform Element
3.3 Grouping of Templates
3.4 Stylesheet Inclusion
4 Generating Output
4.1 Namespace Aliasing
4.2 Templates
4.3 Procedures
4.4 Parameters
4.5 Copying the Current Node
4.6 Processing Nested Events
4.7 Processing Attributes
4.8 Processing Siblings
4.9 Running Overridden Templates
4.10 Processing Text
4.11 Outputting Strings
4.12 Outputting Elements and Attributes
4.13 Outputting Other Nodes
4.14 Conditions
4.15 Loops
4.16 Multiple Input Documents
4.17 Multiple Output Documents
4.18 Buffers
4.19 Messages
5 Data Types
5.1 Atomic Types
5.2 Sequences
5.3 Type Conversions
6 STXPath
6.1 Variables
6.2 Literals
6.3 Parenthesized Expressions
6.4 Functions
6.4.1 Sequence Functions
6.4.2 Node Functions
6.4.3 Boolean Functions
6.4.4 String Functions
6.4.5 Numerical Functions
6.4.6 Aggregate Functions
6.4.7 Conversion Functions
6.5 Data Accessors
6.6 Sequence Expressions
6.7 Arithmetic Expressions
6.8 Comparison Expressions
6.9 Logical Expressions
7 Extensions
A References
A.1 Normative References
A.2 Other References
B Element Syntax Summary
C STXPath Grammar
D Acknowledgments (Non-Normative)
E Draft Change History since WD 14 January (Non-Normative)
This document defines the syntax and semantics of the STX transformation language. Transformation rules in STX are expressed as well-formed XML documents. These documents, called stylesheets, may include both elements that are defined by STX (STX declarations and instructions) and other elements (literals). STX-defined elements are identified by a specific XML namespace, which is referred to in this specification as the STX namespace. This document uses the 'stx' prefix as a shortcut for referring to elements from the STX namespace.
An STX transformation describes rules for transforming one or more source event streams into one or more result event streams. The transformation has a streaming character; this means that it does not need to build a tree representing the source documents in memory. Result events are generated as soon as source events appear and are processed.
The transformation is achieved by associating events with templates. A template pattern is matched against events and their context. The best matching template is then instantiated to create a part of the result stream. A template is always instantiated with respect to the current context, a set of additional information maintained during the transformation. In constructing the result stream, events from the source stream can be filtered and arbitrary events can be added. Events can also be reordered using working storage.
On the surface, the syntax of STX is similar to the syntax of [XSLT]. STX also employs a compact expression language embedded in certain attributes. This expression language, called STXPath, is syntactically similar to [XPath]. This should allow XSLT users to easily adapt to STX syntax.
The software responsible for running an STX transformation is referred to as an STX processor. An STX processor transforms one or more source XML documents according to rules given in an STX stylesheet and generates one or more result XML documents.
The source documents are supplied in the form of streams of [SAX2] events. These streams are referred to as the source streams. The stream whose events are currently processed is referred to as the current source stream. The current source stream at the time when the transformation is initiated is referred to as the principal source stream.
A possibly empty set of external values for stylesheet parameters is supplied. These values are available for use within expressions in the stylesheet.
No tree representation of the source document is constructed. However, when processing each event, a limited amount of contextual information is available from the system.
Data arriving with an event can form one or more objects called nodes. Pair events for the document and elements form one node only; all node data is passed with the starting event. The data of attributes passed with a startElement() event form separate nodes. Sequential characters() and ignorableWhitespace() events will be combined into a single text node.
The stylesheet is a well-formed XML document that may be precompiled to some kind of executable representation that can be reused to perform multiple transformations. The stylesheet can consist of several stylesheet modules contained in different files. One of these modules is the principal stylesheet module. The complete stylesheet is assembled by finding the stylesheet modules referenced directly or indirectly from the principal stylesheet module using the stx:include declaration.
The output of the transformation consists of one or more sequences of SAX2 events. These sequences of events are referred to as result streams. The stream to which events are currently emitted is referred to as the current result stream. The current result stream at the time when the transformation is initiated is referred to as the principal result stream.
Each incoming event can cause an invocation of one or more rules within the stylesheet by means of a match pattern. The actions such a rule may perform include emitting SAX2 events to result streams, saving working data to working storage, accessing data written to working storage by previously executed rules, and invoking other rules.
Note:
The source and result streams are abstract constructs that function as input or output channels for STX transformations. Each source or result stream is identified by a URI. This URI must not be confused with the URI of a physical document that may be parsed to generate the source stream, or with the URI of a document to which the result stream may be serialized. Stream URIs are passed to a resolver that maps abstract streams to physical resources.
This document does not specify interfaces for initiating an STX transformation. Instead, these interfaces are implementation dependent. This section describes the minimum amount of information that must be supplied to execute a transformation:
An identification of the stylesheet module that is to act as the principal stylesheet module for the transformation.
A possibly empty set of values for stylesheet parameters (name-value pairs). External parameter values are matched against global stylesheet parameters.
An identification of the stream that is to act as the principal source stream.
An identification of the stream that is to act as the principal result stream.
The data arriving with an event forms zero or more entities called nodes. Pair events refer to a single node whose data is passed with the starting event. The attribute data arriving with a startElement() event form separate attribute nodes. Consecutive events of the same type (characters(), ignorableWhitespace()) are aggregated and treated as a single event, and thus form a single node only.
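The aggregation of events into nodes can be sketched with Python's standard xml.sax module, which is a binding of SAX2. This is only an illustration under assumptions: NodeCollector, collect_nodes, and the tuple representation of nodes are hypothetical names that do not appear in this specification, and comment and CDATA nodes are left out because they are reported through a separate lexical handler.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class NodeCollector(ContentHandler):
    """Hypothetical helper: turns SAX2 events into a flat list of nodes."""

    def __init__(self):
        super().__init__()
        self.nodes = []
        self._text = []  # pending chunks of character data

    def _flush_text(self):
        # Consecutive character events end here and become ONE text node.
        if self._text:
            self.nodes.append(("text", "".join(self._text)))
            self._text = []

    def startDocument(self):
        self.nodes.append(("root", None))

    def endDocument(self):
        self._flush_text()

    def startElement(self, name, attrs):
        self._flush_text()
        # The pair startElement/endElement forms a single element node;
        # its data arrives with the starting event.
        self.nodes.append(("element", name))
        # Each attribute forms a separate attribute node.
        for attr_name in attrs.getNames():
            self.nodes.append(("attribute", (attr_name, attrs.getValue(attr_name))))

    def endElement(self, name):
        self._flush_text()

    def characters(self, content):
        self._text.append(content)

    def ignorableWhitespace(self, whitespace):
        self._text.append(whitespace)

    def processingInstruction(self, target, data):
        self._flush_text()
        self.nodes.append(("processing-instruction", (target, data)))


def collect_nodes(xml_text):
    handler = NodeCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.nodes
```

The text between two element boundaries ends up as a single text node even when the parser reports it in several characters() calls, for example around an entity reference.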
There are seven types of nodes recognized in STX:
root node - Passed with a startDocument() event; this node has no properties.
element node - Passed with a startElement() event. The node properties consist of the element related data (local name, prefix, qualified name, namespace URI).
attribute node - Passed with a startElement() event. The node properties consist of the data related to a particular attribute (local name, prefix, qualified name, namespace URI, value).
text node - Passed with a characters() or ignorableWhitespace() event. The node properties consist of character data.
CDATA node - Passed with a characters() or ignorableWhitespace() event within startCDATA() and endCDATA() lexical events. The node properties consist of character data.
processing instruction node - Passed with a processingInstruction() event. The node properties consist of target and character data.
comment node - Passed with a comment() event. The node properties consist of character data.
There is contextual information available at each point during processing. It includes the data arriving with the current event and other data related to the state of processing. The contextual information at any particular instant during processing is called the current context. The context information consists of the following parts:
current node data - The node which is the subject of the current event is called the current node. The information available for the current node depends on the node type; see [SAX2] definition for details. For example, qualified name, local name, prefix, namespace URI, attributes (qualified name, local name, prefix, namespace URI, and value for each), and the string-value are available for elements.
ancestor stack - For the current node, all ancestor nodes with all properties are stored in the ancestor stack.
position within siblings - Information about the position relative to other siblings is kept. The position is available for the current node and all its ancestors.
A position number is available for all node kind tests such as node(), text(), cdata(), processing-instruction(), comment(). For elements, the position is available for all qualified names or names containing the * shortcut: pre:lname, lname, pre:*, *:lname, *. For processing instructions, the position is also available for each target. The position of attribute nodes is undefined.
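One way to picture the bookkeeping for element positions is a set of counters per parent, one counter for each name test under which a child element can be counted. The sketch below is an assumption about a possible implementation, not a requirement of this specification: SiblingCounter and name_tests are hypothetical names, and prefixes stand in for namespace URIs to keep the example short.

```python
from collections import defaultdict


def name_tests(prefix, local_name):
    """Name tests under which an element with this name is counted."""
    tests = ["node()", "*", f"*:{local_name}"]
    if prefix:
        tests += [f"{prefix}:{local_name}", f"{prefix}:*"]
    else:
        tests.append(local_name)
    return tests


class SiblingCounter:
    """Hypothetical helper: one instance per parent element."""

    def __init__(self):
        self.counts = defaultdict(int)

    def enter_element(self, prefix, local_name):
        # Increment every applicable counter and report the new positions.
        positions = {}
        for test in name_tests(prefix, local_name):
            self.counts[test] += 1
            positions[test] = self.counts[test]
        return positions
```

Because only counters are kept, the position of the current node is available without storing any preceding sibling.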
Each incoming event can invoke a template within the stylesheet by means of precedence categories and a match pattern (see 2.6 Match Patterns). The template that is used to process the current node is called the current template. Templates can be separated into groups (see 3.3 Grouping of Templates). Top-level templates are considered to be members of the default group. The group containing the current template is referred to as the current group.
Templates are classed into the precedence categories according to their visibility from a base group. The base group can be either the current group or the group explicitly specified in the group attribute of process-xxx statements. The visibility is defined using the visibility and public attributes for each template (see 4.2 Templates).
There are three precedence categories (listed with decreasing precedence):
templates from the base group and public templates (public="yes") from child groups
group and global templates (visibility="group"|"global") from all ancestor groups
all global templates (visibility="global")
The first precedence category is searched for the best matching template by means of a match pattern (see 2.6 Match Patterns). If there is no matching template in the first precedence category, the second category is searched. If neither the first nor the second category contains a matching template, the third category is searched.
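The search order can be sketched as follows. The Group and Template classes, the function names, and the matches callback are hypothetical and exist only for this illustration; the classification itself follows the three categories listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    name: str
    parent: Optional["Group"] = None  # None for the default group


@dataclass
class Template:
    label: str
    group: Group
    visibility: str = "local"  # "local" | "group" | "global"
    public: bool = False


def ancestors(group):
    result = []
    while group.parent is not None:
        group = group.parent
        result.append(group)
    return result


def precedence_category(template, base):
    """Return 1, 2 or 3, or None if the template is not visible from base."""
    if template.group is base:
        return 1
    if template.public and template.group.parent is base:
        return 1
    if template.visibility in ("group", "global") and any(
        template.group is g for g in ancestors(base)
    ):
        return 2
    if template.visibility == "global":
        return 3
    return None


def find_candidates(templates, base, matches):
    """Search the categories in order; stop at the first one with a match."""
    for category in (1, 2, 3):
        found = [
            t for t in templates
            if precedence_category(t, base) == category and matches(t)
        ]
        if found:
            return category, found
    return None, []
```

A later category is consulted only when every earlier category is empty of matching templates, so a matching template in the base group always wins over a global template elsewhere.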
STX uses an expression language called STXPath that is defined later in this specification (see 6 STXPath).
Expressions are used in STX in match patterns, to specify conditions for different ways of processing of the current node, to generate text to be inserted to the output stream, or to access data from the ancestor stack.
An STXPath expression may occur as the value of certain attributes on STX elements, and also within curly braces in attribute value templates (see 2.7 Attribute Value Templates). The context within a stylesheet where an STXPath expression appears may specify the required data type; the type of value that the expression is expected to return.
It is a static non-recoverable error if the value of an expression attribute does not match the STXPath production expressions. It is a dynamic non-recoverable error if any STXPath expression is evaluated and raises a dynamic error, or when it raises an error when converting to the required data type.
The attribute default-stxpath-namespace of the stx:transform element (see 3.2 Transform Element) may be used to define the namespace that will be used for an unprefixed name used as a node name test within a step of an STXPath expression or pattern. The value of the attribute is the namespace URI to be used.
This default namespace URI applies only to elements; it does not apply to attributes. In the absence of this attribute, an unqualified node name test matches an element whose namespace URI is null: the default namespace (as defined by an xmlns="some-uri" declaration) is not used.
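The resolution of a name test to a namespace URI and a local name can be sketched as below. The function resolve_name_test and its parameters are hypothetical; the namespaces argument is assumed to hold only the prefix bindings in scope in the stylesheet.

```python
def resolve_name_test(qname, namespaces, default_stxpath_namespace=None,
                      is_attribute=False):
    """Return (namespace URI or None, local name) for a name test."""
    if ":" in qname:
        prefix, local_name = qname.split(":", 1)
        return namespaces[prefix], local_name
    if is_attribute:
        # The default never applies to attributes.
        return None, qname
    # Unprefixed element test: default-stxpath-namespace if given, else null.
    # An xmlns="..." declaration in the stylesheet plays no role here.
    return default_stxpath_namespace, qname
```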
A match pattern specifies a set of conditions on the current context. If the current context satisfies the conditions, the current node matches the pattern; otherwise the current node does not match the pattern. The syntax for STX patterns is a subset of the syntax for STXPath expressions. In particular, patterns are in the form of location paths that meet certain restrictions.
Here are some examples of patterns:
item - matches any 'item' element from the namespace used for unprefixed STXPath path patterns (defined with the 'default-stxpath-namespace' attribute, no namespace by default)
list/item - matches any 'item' element with a 'list' parent, where both elements are from the namespace used for unprefixed STXPath path patterns
chapter//list/item - matches any 'item' element with a 'list' parent and a 'chapter' ancestor, where all three elements are from the namespace used for unprefixed STXPath path patterns
/root/list/* - matches any element with a 'list' parent and a 'root' grandparent which is the document element, where both 'root' and 'list' elements are from the namespace used for unprefixed STXPath path patterns
pre:list[@id=5]/pre:item - matches any 'item' element with a 'list' parent having an 'id' attribute with a value of 5, where both elements are from the namespace which is bound to the 'pre' prefix in the stylesheet for this rule
*[sf:position()=1] - matches any element that is the first element child of its parent
node() - matches any child node
text() - matches any text node (including CDATA text nodes)
cdata() - matches any CDATA text node
processing-instruction() - matches any processing instruction
A match pattern is a set of location path patterns separated with |. A location path pattern is a location path whose steps all use only the child, descendant, and attribute axes. Patterns may use the / operator as well as the // operator. Only abbreviated syntax is allowed.
Up to one predicate is allowed in each step. Predicate expressions are STXPath expressions (see 2.5 Expressions).
Predicate expressions are evaluated using the current context. If the result is a number, the result will be converted to true if the number is equal to the context position and will be converted to false otherwise. Thus a location path p[3] is equivalent to p[sf:position()=3]. Otherwise the result will be converted to a boolean using the type conversion rules described in 5.3 Type Conversions.
If the result of evaluating and converting the predicate expression is false, the current template doesn't match the current node.
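The numeric case of predicate evaluation can be sketched as follows. The function predicate_holds is hypothetical, and Python's built-in bool() is used only as a stand-in for the conversion rules of 5.3 Type Conversions, which may differ in detail.

```python
def predicate_holds(value, context_position):
    """Convert a predicate result to a boolean."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # p[3] behaves like p[sf:position()=3]
        return value == context_position
    # Stand-in for the general conversion to boolean (assumption).
    return bool(value)
```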
If there is no matching template available, a default rule is applied. One of three default rules, specified in the pass-through attribute of stx:transform or stx:group, can be used: "none" (to skip the current node), "all" (to pass through the current node), and "text" (to pass through the current node only if it is a text node). The default rule can be set for the stylesheet (see 3.2 Transform Element) or for a group (see 3.3 Grouping of Templates). This feature makes it easy to copy documents with only a few changes or to select just a few items from a document. The default behavior on the stylesheet level is to ignore all non-matching events (value "none"). Groups inherit the pass-through behavior from their parent group when it is not specified explicitly.
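The three default rules amount to a small decision function. The sketch below uses a hypothetical function name and node type labels; whether CDATA nodes count as text nodes for the "text" rule is not settled by the paragraph above, so the example treats only the plain text node type.

```python
def passes_through(node_type, pass_through="none"):
    """Decide whether an unmatched node is copied to the result stream."""
    if pass_through == "all":
        return True
    if pass_through == "text":
        return node_type == "text"
    if pass_through == "none":
        return False
    raise ValueError(f"invalid pass-through value: {pass_through!r}")
```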
It is possible that the current context matches more than one rule within a precedence category. In that case, the template rule to be used is determined according to the same rules as in XSLT (see [XSLT], 5.5). All rules have a computed priority value. The computed priority can be overridden with a 'priority' attribute value (see 4.2 Templates).
If the pattern contains multiple alternatives separated with |, then it is treated equivalently to a set of template rules, one for each alternative.
If the pattern has the form of a qualified name or has the form either of processing-instruction(target) or cdata(), then the priority is 0.
If the pattern has the form pre:* or *:lname, then the priority is -0.25.
If the pattern consists of just a node test other than cdata(), then the priority is -0.5.
Otherwise, the priority is 0.5.
The rule with the highest priority is used. If there is more than one matching template rule with the highest priority, an STX processor must report a recoverable error. The processor is allowed to recover from this error by selecting the rule that occurs last in the stylesheet.
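The priority rules above can be sketched with a few regular expressions. Everything here is a simplification under assumptions: the function names are hypothetical, names are restricted to an ASCII-oriented approximation of NCName, and alternatives are split on every | without regard to string literals inside predicates.

```python
import re

NCNAME = r"[A-Za-z_][\w.\-]*"  # rough approximation of an XML NCName
QNAME = re.compile(rf"(?:{NCNAME}:)?{NCNAME}")
PI_WITH_TARGET = re.compile(r"processing-instruction\(\s*[^)\s]+\s*\)")
NS_WILDCARD = re.compile(rf"{NCNAME}:\*|\*:{NCNAME}")
BARE_NODE_TESTS = {"*", "node()", "text()", "comment()", "processing-instruction()"}


def default_priority(pattern):
    """Computed priority of a single pattern alternative."""
    p = pattern.strip()
    if p == "cdata()" or QNAME.fullmatch(p) or PI_WITH_TARGET.fullmatch(p):
        return 0.0
    if NS_WILDCARD.fullmatch(p):
        return -0.25
    if p in BARE_NODE_TESTS:
        return -0.5
    return 0.5


def expand_alternatives(pattern):
    """Treat 'a | b' as two rules, each with its own priority."""
    return [(alt.strip(), default_priority(alt)) for alt in pattern.split("|")]


def pick_rule(matching_rules):
    """matching_rules: (label, priority) pairs in stylesheet order.

    Returns the chosen label and a flag telling whether the recoverable
    conflict error occurred (recovery: the last rule in the stylesheet).
    """
    best = max(priority for _, priority in matching_rules)
    winners = [label for label, priority in matching_rules if priority == best]
    return winners[-1], len(winners) > 1
```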
In an attribute that is designated as an attribute value template an STXPath expression can be used by surrounding the expression with curly braces ({}).
An attribute value template consists of an alternating sequence of fixed parts and variable parts. A variable part consists of an STXPath expression enclosed in curly braces ({}). A fixed part may contain any characters, except that a left curly brace must be written as {{ and a right curly brace must be written as }}.
The result of evaluating an attribute value template is obtained by concatenating the expansions of the fixed and variable parts. The expansion of a fixed part is obtained by replacing any double curly braces ({{ or }}) by the corresponding single curly brace. The expansion of a variable part is obtained by evaluating the enclosed STXPath expression and converting the resulting value to a string.
If a left curly brace appears in an attribute value template without a matching right curly brace, or if a right curly brace occurs in an attribute value template outside an expression without being followed by a second right curly brace, a processor must signal a non-recoverable error.
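The expansion rules can be sketched as a small scanner. The function expand_avt and its evaluate callback are hypothetical; the callback stands in for an STXPath evaluator, and the sketch assumes that the expression itself contains no right curly brace.

```python
def expand_avt(template, evaluate):
    """Expand an attribute value template.

    evaluate: callable taking the expression text and returning its value.
    Raises ValueError for the two brace errors described above.
    """
    out = []
    i, n = 0, len(template)
    while i < n:
        ch = template[i]
        if ch == "{":
            if template.startswith("{{", i):
                out.append("{")  # escaped left brace in a fixed part
                i += 2
                continue
            end = template.find("}", i + 1)
            if end == -1:
                raise ValueError("left curly brace without matching right curly brace")
            out.append(str(evaluate(template[i + 1:end])))  # variable part
            i = end + 1
        elif ch == "}":
            if template.startswith("}}", i):
                out.append("}")  # escaped right brace in a fixed part
                i += 2
                continue
            raise ValueError("unescaped right curly brace outside an expression")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With a callback that maps the expression @id to the number 5, the template item-{@id}-{{x}} expands to item-5-{x}.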
The source streams and stylesheet modules may contain whitespace nodes (text nodes consisting solely of whitespace characters: #x20, #x9, #xD or #xA). Such whitespace nodes may be removed according to the following rules. This process is referred to as whitespace stripping.
Whitespace nodes are stripped from source streams if the strip-space attribute of stx:transform (see 3.2 Transform Element) or stx:group (see 3.3 Grouping of Templates) is set to "yes". Otherwise they are preserved and treated as any other text nodes.
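A whitespace node test restricted to exactly the four characters named above can be sketched as follows. The function names and the (type, value) tuple representation of nodes are hypothetical.

```python
# Only #x20, #x9, #xD and #xA count as whitespace here. Python's
# str.isspace() accepts further characters (for example U+00A0),
# so it is deliberately not used.
XML_WHITESPACE = frozenset("\x20\x09\x0d\x0a")


def is_whitespace_node(text):
    return len(text) > 0 and all(ch in XML_WHITESPACE for ch in text)


def strip_source_whitespace(nodes, strip_space="no"):
    """nodes: iterable of (node_type, value) tuples."""
    if strip_space != "yes":
        return list(nodes)
    return [
        node for node in nodes
        if not (node[0] == "text" and is_whitespace_node(node[1]))
    ]
```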
For stylesheets, whitespace text nodes are preserved only if an ancestor element of this text node has an xml:space attribute with a value of "preserve", and no closer ancestor element has xml:space with a value of "default". All other whitespace nodes are removed from the stylesheet.
The STX elements stx:text and stx:cdata have a default xml:space attribute with a value of "preserve" which may be overridden in the stylesheet.
xml:space attributes on literal result elements will not be stripped from these elements.
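The rule for whitespace nodes in stylesheets looks for the closest ancestor that carries an xml:space value. A minimal sketch, assuming the ancestors of the text node are given as (element name, xml:space value or None) pairs from the outermost element inward; the function names and the use of the stx prefix in element names are illustrative choices only.

```python
DEFAULT_PRESERVING = {"stx:text", "stx:cdata"}


def effective_xml_space(element_name, xml_space):
    """xml:space value of one element, including the built-in default."""
    if xml_space is not None:
        return xml_space
    return "preserve" if element_name in DEFAULT_PRESERVING else None


def keeps_whitespace(ancestors):
    """True if a whitespace text node below these ancestors is preserved."""
    for name, xml_space in reversed(ancestors):  # closest ancestor first
        value = effective_xml_space(name, xml_space)
        if value == "preserve":
            return True
        if value == "default":
            return False
    return False  # no ancestor says "preserve": the node is removed
```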
All errors that can occur during an STX transformation belong to one of the following categories:
warnings - The processor may issue a warning; the transformation must not be stopped.
recoverable errors - The processor may either issue an error and stop the transformation, or it can recover from the error in the way defined in this specification.
non-recoverable (fatal) errors - The processor must exit the transformation and issue an error message.
This specification doesn't define how to issue a warning or an error. Implementations are free to use either the standard output or the standard error output, or any convenient error handler.
The STX namespace has the URI http://stx.sourceforge.net/2002/ns.
The STX function namespace has the URI http://stx.sourceforge.net/2003/functions.
These two namespaces are recognized as reserved namespaces in STX stylesheets, and may be used only for purposes specified in this document.
stx:transform
<!-- Category: root -->
<stx:transform
  version = number
  pass-through = "none"|"all"|"text"
  recognize-cdata = "yes"|"no"
  default-stxpath-namespace = uri-reference
  strip-space = "yes"|"no"
  output-encoding = string>
  <!-- Content: top-level-elements -->
</stx:transform>
Stylesheets are required to use the root element stx:transform.
The version attribute contains a version number to distinguish language versions; this attribute is mandatory and its value must be "1.0" for this version of STX.
The other attributes make it possible to set global properties of the transformation. Some of these properties (pass-through, recognize-cdata, strip-space) can also be set on the group level.
pass-through - This optional attribute specifies a default rule for treating events for which no matching template is found. These events are either ignored ("none", default) or passed to the output without modification ("all"). For "text", only text nodes are passed through to the output.
recognize-cdata - This optional attribute specifies whether CDATA boundaries are recognized during the transformation. If so, every CDATA section forms a single node and a node kind test cdata() can be used in STXPath patterns. Otherwise (recognize-cdata="no"), CDATA boundaries will be ignored and all consecutive character data form a single text node, thus the cdata() kind test never matches in STXPath patterns. The default value is "yes".
default-stxpath-namespace - This optional attribute specifies a namespace used for unprefixed name tests in STXPath expressions and patterns. See 2.5 Expressions for details. No namespace is used by default.
strip-space - This optional attribute specifies whether whitespace text nodes are stripped from source data streams. See 2.8 Whitespace Stripping for details. The default value is "no".
output-encoding - This optional attribute specifies the preferred output encoding of the resulting byte stream. The value of this attribute should be treated case-insensitively; the value must contain only printable ASCII characters (#x21 - #x7E); the value must be a charset registered with the Internet Assigned Numbers Authority (see [IANA Character Sets]).
If the attribute is not present, the output encoding is UTF-8. A compliant STX processor is not required to support any particular encoding other than UTF-8.
The stx:transform element can contain the following children from the STX namespace. These elements are called top-level elements:
stx:include
stx:variable
stx:param
stx:buffer
stx:namespace-alias
stx:group
stx:template
stx:procedure
All top-level elements may occur multiple times.
The stx:namespace-alias element is allowed as a top-level element only.
Templates can be organized into groups using the stx:group element. Groups of templates play a role in template matching (precedence categories are defined for groups) and determine the scoping of variables.
Each stylesheet has a virtual default group (represented by stx:transform) that is considered to be the parent of top-level groups. Explicit groups are not mandatory; many transformations can be done without grouping templates. On the other hand, separating templates into groups makes it possible to define more precise transformation rules and to run complex transformations more safely, especially on well-known, regular input data.
stx:group
<!-- Category: top-level or group -->
<stx:group
  name = qname
  pass-through = "none"|"all"|"text"|"inherit"
  recognize-cdata = "yes"|"no"|"inherit"
  strip-space = "yes"|"no"|"inherit">
  <!-- Content: group-elements -->
</stx:group>
This element must be a child of either the stx:transform or the stx:group element. The optional name attribute contains a qualified name that must be unique in the stylesheet. The name can be referenced by the group attribute of any of the stx:process-children, stx:process-attributes, stx:process-self, stx:process-siblings, stx:process-document or stx:process-buffer instructions. In this case, the referenced group is used instead of the current group for matching. It is not possible to reference the default group.
It is a non-recoverable error if a stylesheet contains more than one group of the same name.
The attributes pass-through, recognize-cdata, and strip-space are optional, and they set transformation properties specific for this group. Their meaning is exactly the same as the meaning of the global properties of the same name specified on the stx:transform element (see 3.2 Transform Element), with the exception that for all of these attributes the additional value "inherit" may be used. This value is also the default value and specifies that the value of the same property of the nearest ancestor group should be used. In other words, the value of the property stems from the nearest ancestor group that has the corresponding attribute set to a value distinct from "inherit", or from the default value of the stx:transform element, if no such attribute was specified.
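The resolution of an inherited property can be sketched as a walk outward through the enclosing groups. The function effective_property and the dictionary representation of attributes are hypothetical; the fallback values are the stx:transform defaults stated in 3.2 Transform Element.

```python
TRANSFORM_DEFAULTS = {
    "pass-through": "none",
    "recognize-cdata": "yes",
    "strip-space": "no",
}


def effective_property(name, group_chain, transform_attrs):
    """Resolve a group property.

    group_chain: attribute dicts of the groups, starting with the group in
    question and moving outward to the outermost explicit group.
    transform_attrs: attribute dict of the stx:transform element.
    """
    for attrs in group_chain:
        value = attrs.get(name, "inherit")  # a missing attribute means "inherit"
        if value != "inherit":
            return value
    return transform_attrs.get(name, TRANSFORM_DEFAULTS[name])
```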
At each point of the processing, the properties of the base group apply.
Note:
The last sentence means that, for a visible template from a different group, the properties of its parent group take effect only once the template is instantiated, and thus not during the matching process.
An STX stylesheet may include another STX stylesheet using the stx:include element.
stx:include
<!-- Category: top-level or group --> <stx:include href = uri-reference/>
This declaration is used to insert additional stylesheet modules into the principal stylesheet module. Circular inclusion is prohibited.
This element must be top-level or a child of the stx:group element. stx:include is replaced with the stx:transform element of the included stylesheet whereupon the included stx:transform then becomes an stx:group element. There are two exceptions: top-level stx:namespace-alias and stx:param (stylesheet parameters) from the included stylesheet are always inserted as top-level elements. The resulting stylesheet must meet the criteria for being a valid STX stylesheet (for example concerning unique groups and parameters).
The rules for the attributes of the included stx:transform element are as follows:
The version and output-encoding attributes won't affect the including stylesheet. However, the included stylesheet must be valid, that is, its version attribute must be "1.0".
The default-stxpath-namespace attribute will be used for the included stylesheet. The default-stxpath-namespace attribute of the stx:transform element of the principal stylesheet module never affects included stylesheet modules.
The pass-through, recognize-cdata, and strip-space attributes become attributes of the new stx:group element. A missing attribute becomes an attribute on the stx:group element with the default value defined for stx:transform, i.e. it can never have the value "inherit".
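The conversion of an included stx:transform element into an stx:group element can be sketched with Python's standard xml.etree.ElementTree module. The function included_as_group is hypothetical, and the sketch covers only the attribute rules and the two top-level exceptions; resolving href, detecting circular inclusion, and carrying over default-stxpath-namespace are not shown.

```python
import xml.etree.ElementTree as ET

STX_NS = "http://stx.sourceforge.net/2002/ns"
GROUP_DEFAULTS = {"pass-through": "none", "recognize-cdata": "yes", "strip-space": "no"}
HOISTED = {f"{{{STX_NS}}}param", f"{{{STX_NS}}}namespace-alias"}


def included_as_group(transform):
    """Return (group element, elements to insert at the top level)."""
    if transform.tag != f"{{{STX_NS}}}transform" or transform.get("version") != "1.0":
        raise ValueError("included module is not a valid STX 1.0 stylesheet")
    group = ET.Element(f"{{{STX_NS}}}group")
    # Missing attributes get the stx:transform default, never "inherit";
    # version and output-encoding are intentionally not copied.
    for name, default in GROUP_DEFAULTS.items():
        group.set(name, transform.get(name, default))
    hoisted = []
    for child in transform:
        if child.tag in HOISTED:
            hoisted.append(child)  # stays a top-level element
        else:
            group.append(child)
    return group, hoisted
```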
There is no difference between templates from the principal stylesheet module and included templates in terms of matching precedence.
STX templates are called sequentially for each incoming node rather than from other templates. Pair events for the document and elements match only one template, which is broken into two parts; the first part is executed when the start event appears and the second one at the end event. The two parts are separated by the stx:process-children element.
stx:namespace-alias
<!-- Category: top-level -->
<stx:namespace-alias
  source-prefix = ncname|"#default"
  result-prefix = ncname|"#default"/>
Namespaces from source streams can be mapped to different namespaces in result streams using the stx:namespace-alias element. Both attributes are mandatory and can contain either a prefix bound to the namespace to be used or the "#default" keyword for the default namespace.
stx:template
<!-- Category: top-level or group -->
<stx:template
  match = pattern
  priority = number
  visibility = "local"|"group"|"global"
  public = "yes"|"no"
  new-scope = "yes"|"no">
  <!-- Content: template -->
</stx:template>
Rules to process input events are written in templates. The stx:template element must be a child of either the stx:transform or the stx:group element.
Templates are matched to events by means of precedence categories and a pattern in the mandatory match attribute. The optional priority attribute can contain an explicit priority value used for matching (see 2.6 Match Patterns).
Two optional attributes, visibility and public, control whether the template is visible from other groups (and thus can match the next event) or not. See 2.4 Precedence Categories for the meaning of the two attributes. The default value of the visibility attribute is "local". The default value of the public attribute is "yes" for top-level templates and "no" for group templates. Whether a top-level template is public or not is important only when the stylesheet is included into another stylesheet, because every top-level template then becomes a group template (see 3.4 Stylesheet Inclusion).
The optional new-scope attribute specifies whether the template creates new instances of group variables. The default value is "no". A new set of group variables is created for each instantiated template with new-scope="yes". These variables shadow their former values and exist as long as the template is being processed.
The content of templates may include both STX instructions and declarations, and literal elements. Literal elements are simply copied to the output.
A text template is defined as the content of some elements (stx:attribute, stx:variable, stx:param, stx:assign, stx:with-param, stx:cdata, stx:processing-instruction, stx:comment, stx:message). This is a part of a template that generates nothing but character events to the current output stream. An STX processor is required to issue a run-time recoverable error if another type of event is emitted. The processor is allowed to recover from this error by ignoring the non-character event.
stx:procedure
<!-- Category: top-level or group -->
<stx:procedure
  visibility = "local"|"group"|"global"
  public = "yes"|"no"
  new-scope = "yes"|"no"
  name = qname>
  <!-- Content: template -->
</stx:procedure>
Procedures are sub-templates that can be called by name (with the stx:call-procedure instruction). The optional visibility, public, and new-scope attributes have the same meaning and default values as for templates. Only visible procedures can be called by name; new-scope must be set to "yes" to create new copies of group variables. It is a static non-recoverable error if a stylesheet contains more than one visible procedure with the same name within the same precedence category.
The content of procedures may be the same as the content of templates.
stx:call-procedure
<!-- Category: template -->
<stx:call-procedure
  name = qname
  group = qname>
  <!-- Content: stx:with-param* -->
</stx:call-procedure>
The stx:call-procedure element makes it possible to invoke a procedure by its name. The name attribute is mandatory. The optional group attribute makes it possible to use the specified group instead of the current group as the base group for calling the procedure.
The target procedure will be determined according to the precedence categories described in 2.4 Precedence Categories. If the first category doesn't contain a procedure with the requested name, then the second category will be searched. If neither the first nor the second category contains such a procedure, the third category is searched. It is a static non-recoverable error if none of the three precedence categories contains the requested procedure.
Values can be passed to stylesheets or to their templates and procedures as parameters. Parameters are variables (see 6.1 Variables) with the additional property that their value can be set by the caller of the stylesheet, the template, or the procedure. Stylesheet parameters behave in the same way as variables of the default group. Template/procedure parameters behave in the same way as local variables; thus they are only visible within the template or procedure they are passed to. There are two elements available to work with parameters:
stx:with-param
<!-- Category: process-xxx, call-procedure --> <stx:with-param name = qname select = expression> <!-- Content: text template --> </stx:with-param>
Parameters are passed to templates or procedures using the
stx:with-param
element. The required name
attribute specifies the name of the parameter. The value of the
parameter is either the result returned by the
expression located in the optional
select
attribute or the content of this element if the
select
attribute is missing. If neither the
select
attribute nor the content is present the parameter
value is the empty string.
The stx:with-param
instruction is allowed as a child of
the elements
stx:process-children
, stx:process-attributes
,
stx:process-self
, stx:process-siblings
,
stx:process-document
, stx:process-buffer
, or
stx:call-procedure
, and must not have any of
these elements in its content.
stx:param
<!-- Category: top-level or template --> <stx:param name = qname select = expression required = "yes" | "no"> <!-- Content: text template --> </stx:param>
The stx:param
element is allowed as a top-level
element (indicating a stylesheet parameter as a child of
stx:transform
) and in templates or procedures
(as a child of stx:template
or
stx:procedure
). The required name
attribute
specifies the name of the parameter. The optional select
attribute or the content of this element specifies a default value,
which is both evaluated and used only when there is no value
specified using the
select
attribute or the content of the appropriate
stx:with-param
element. Should both the select
attribute and the content be missing, the parameter defaults to the
empty string.
Stylesheet parameters are statically initialized while parsing
the stylesheet; only the static context information is
available during the initialization. Template/procedure parameters
are initialized at run-time. Since there is no current
source stream available during the static initialization,
it is a recoverable error if a stylesheet
(top-level) parameter has an stx:process-children
,
stx:process-attributes
, stx:process-self
, or
stx:process-siblings
instruction in its content. A processor
may recover from this error by ignoring such an instruction.
The optional required
attribute may be used to indicate
that a parameter is mandatory. The default value is
"no", indicating that the parameter is optional. If the
value of the required
attribute is "yes",
the stx:param
element must be empty, and must have no
select
attribute. It is a dynamic non-recoverable error if
the caller doesn't supply a value with stx:with-param
for a required parameter.
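As an illustration, the sketch below combines a stylesheet parameter, a required template parameter, and an optional template parameter with a default value. All parameter and element names (lang, level, prefix, section, title, heading) are hypothetical; the stx prefix is assumed to be bound to the STX namespace.

```xml
<!-- Stylesheet parameter: the default applies when the caller of the stylesheet sets no value -->
<stx:param name="lang" select="'en'"/>

<stx:template match="title">
  <!-- Required parameter: the element is empty and has no select attribute -->
  <stx:param name="level" required="yes"/>
  <!-- Optional parameter: the default is evaluated only if no value is passed -->
  <stx:param name="prefix" select="'Section '"/>
  <heading>
    <stx:value-of select="$prefix"/>
    <stx:value-of select="$level"/>
  </heading>
</stx:template>

<stx:template match="section">
  <!-- The value for level is passed to the templates processing the children -->
  <stx:process-children>
    <stx:with-param name="level" select="1"/>
  </stx:process-children>
</stx:template>
```

If the title template were instantiated without a value for level, the processor would have to report a dynamic non-recoverable error.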
stx:copy
<!-- Category: template --> <stx:copy attributes = pattern> <!-- Content: template --> </stx:copy>
The stx:copy
element is used to copy the current node to the
output. The optional attributes
attribute contains a
pattern. Those attributes of the current node
that match the pattern are copied to the output. If the
attributes
attribute isn't present no attributes are copied
with the current node.
Thus, attributes="@*"
copies all attributes,
attributes="@foo|@bar"
copies the foo
and bar
attributes only,
attributes="@*[not(name()='foo')]"
copies all
but the foo
attribute, and
attributes="@*[false()]"
doesn't copy any
attributes, as if the attributes
attribute were missing
altogether.
If the stx:copy
instruction applies to a node other than
an element, the attributes
attribute is ignored.
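A common use of this instruction is a template that copies every node it matches. The fragment below is a sketch of such a rule, assuming the stx prefix is bound to the STX namespace.

```xml
<!-- Copies each matched node; for element nodes all attributes are copied too -->
<stx:template match="node()">
  <stx:copy attributes="@*">
    <!-- The children are processed between the start and the end of the copied element -->
    <stx:process-children/>
  </stx:copy>
</stx:template>
```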
stx:process-children
<!-- Category: template --> <stx:process-children group = qname> <!-- Content: stx:with-param* --> </stx:process-children>
The instruction stx:process-children
suspends the
processing of the current template by processing the children
of the current node. Using SAX2 terms: this instruction splits a
template into two parts such that a SAX2 startElement
event causes the execution of the first part and the
corresponding SAX2 endElement
event causes the
execution of the second part.
At most one stx:process-children
instruction may be executed during the processing of a template. Moreover, it
is a non-recoverable error if stx:process-children
is
encountered after an stx:process-self
instruction or an
stx:process-siblings
instruction.
Note:
If a template doesn't contain any stx:process-children
instruction, the children of this element will be skipped. The
default rule applies only to nodes that are processed and no matching
template is found.
Note:
If the current node is neither an element node nor the document root
then the stx:process-children
instruction simply does
nothing.
The optional group
attribute makes it possible to use the
specified group instead of the current group as the base for
matching (see 2.4 Precedence Categories). It is
a recoverable error if the group of the specified name is not
available. An STX processor can recover from this error by using the
current group.
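The splitting of a template into two parts can be sketched as follows. The element names chapter, div, and hr are hypothetical, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="chapter">
  <!-- First part: executed on the startElement event of chapter -->
  <div class="chapter">
    <stx:process-children/>
    <!-- Second part: executed on the corresponding endElement event -->
    <hr/>
  </div>
</stx:template>
```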
stx:process-attributes
<!-- Category: template --> <stx:process-attributes group = qname> <!-- Content: stx:with-param* --> </stx:process-attributes>
This instruction is used to apply templates to attributes of an element node.
The optional group
attribute makes it possible to use the
specified group instead of the current group as the base for
matching (see 2.4 Precedence Categories). It is
a recoverable error if the group of the specified name is not
available. An STX processor can recover from this error by using the
current group.
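A minimal sketch of attribute processing follows; it assumes a hypothetical person element with an id attribute and turns that attribute into a child element. The stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="person">
  <person>
    <!-- Apply templates to the attributes of the current element -->
    <stx:process-attributes/>
    <stx:process-children/>
  </person>
</stx:template>

<!-- Hypothetical rule: emit each id attribute as an id element -->
<stx:template match="@id">
  <id><stx:value-of select="."/></id>
</stx:template>
```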
stx:process-siblings
<!-- Category: template --> <stx:process-siblings while = pattern until = pattern group = qname> <!-- Content: stx:with-param* --> </stx:process-siblings>
The stx:process-siblings
instruction suspends the
processing of the current template and processes the following
siblings of the current node. The processing can be terminated
by one of while
or until
conditions,
or because of the end of the parent element or the current
buffer (see stx:process-buffer
).
Note:
If the current node is an attribute node or the document root
node the stx:process-siblings
instruction does nothing.
The optional while
attribute contains a
pattern. The next siblings are processed
as long as they match the specified pattern. The first non-matching
node stops the processing; this node is not processed by this
stx:process-siblings
instruction. The while
attribute defaults to node()
.
The optional until
attribute contains a
pattern. The next siblings are processed
until a node matching the pattern is encountered; this node is not
processed by this stx:process-siblings
instruction.
The until
attribute defaults to
node()[false()]
.
If both while
and until
attributes have been
specified then both conditions have to be met. For example
<stx:process-siblings while="foo" until="foo"/>
doesn't process any siblings. Variable bindings used within the
patterns will be interpreted with regard to the current context.
That means changed group variables affect the evaluation, whereas
new instances of group variables or local variables are not
visible.
Note:
Whitespace text nodes not stripped from the document must be
considered in the patterns, particularly when using the
while
attribute. A typical attribute specification
would be while="foo | text()"
which processes
all following foo
elements and potential text
nodes between these foo
elements.
The optional group
attribute makes it possible to use the
specified group instead of the current group as the base for
matching (see 2.4 Precedence Categories). It is
a recoverable error if the group of the specified name is not
available. An STX processor can recover from this error by using the
current group.
An stx:process-siblings
instruction encountered during
the processing of siblings does not affect the while
and
until
conditions of the previous
stx:process-siblings
. In other words: nested
stx:process-siblings
instructions process at most the
siblings chosen in the preceding stx:process-siblings
.
That means
stx:process-siblings
also returns if there are no more
siblings in the input available or a preceding
stx:process-siblings
terminates.
Though multiple stx:process-siblings
instructions may
appear within the same template, it is a non-recoverable error
if an stx:process-children
or stx:process-self
instruction is encountered after stx:process-siblings
.
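A typical application of stx:process-siblings is turning a flat sequence into a nested one. The sketch below assumes a hypothetical input in which each heading element is followed by sibling elements that belong to it; the element names heading and section are illustrative, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="heading">
  <section>
    <stx:copy>
      <stx:process-children/>
    </stx:copy>
    <!-- Process the following siblings up to, but not including, the next heading -->
    <stx:process-siblings until="heading"/>
  </section>
</stx:template>
```

The heading that stops the processing is not consumed by this instruction, so it is processed afterwards in the usual way and opens the next section.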
stx:process-self
<!-- Category: template --> <stx:process-self group = qname> <!-- Content: stx:with-param* --> </stx:process-self>
This instruction is used to process the current node using the
template that would have been chosen if the current template wasn't
present in the stylesheet. The current template won't be
instantiated again for this node, even in a chain of calls to
stx:process-self
. At most one
stx:process-self
instruction may be executed during the
processing of a template.
Moreover it is a non-recoverable error if an stx:process-self
instruction is encountered after an stx:process-children
or
an stx:process-siblings
instruction in a template.
The optional group
attribute makes it possible to use the
specified group instead of the current group as the base for
matching (see 2.4 Precedence Categories). It is
a recoverable error if the group of the specified name is not
available. An STX processor can recover from this error by using the
current group.
Note:
If no group
attribute has been specified then
the current group will be used for choosing the next best
matching template. This is also true if the current group
has been automatically entered via a public template.
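The following sketch shows one template delegating to another one for the same node. It assumes that the template with the more specific pattern is selected first; the element names para and important and the attribute role are hypothetical, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<!-- General rule: copy para elements -->
<stx:template match="para">
  <stx:copy attributes="@*">
    <stx:process-children/>
  </stx:copy>
</stx:template>

<!-- More specific rule: wrap the node, then hand it over to the rule above -->
<stx:template match="para[@role='important']">
  <important>
    <stx:process-self/>
  </important>
</stx:template>
```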
stx:process-text
<!-- Category: template --> <stx:process-text select = expression> <!-- Content: stx:pattern+ --> </stx:process-text>
stx:pattern
<stx:pattern regexp = expression case = "sensitive"|"insensitive"> <!-- Content: template --> </stx:pattern>
This instruction processes a string in a similar way as
stx:template
processes nodes. The mandatory
select
attribute of stx:process-text
selects a
string to process by evaluating the
expression and converting it to a
string. The mandatory regexp
attribute of
stx:pattern
takes a regular expression by evaluating
the expression in the regexp
attribute and converting it to a string, which describes
a substring to look for. The optional case
attribute
determines whether the regular expression is case-sensitive
(value "sensitive") or not (value
"insensitive"). The default is
"sensitive".
The stx:process-text
instruction looks for the pattern
among the regexp
attributes of all stx:pattern
elements that matches first in the string selected by the
select
attribute. The substring before the matched
substring will be output, and the matched substring itself will be
replaced by the contents of the stx:pattern
element.
Afterwards this stx:process-text
instruction will continue
by processing the substring after the matched substring. If no
pattern matches then the remaining string will be emitted as a
text node to the result stream. A pattern must match at least one
character.
If two or more patterns match at the same position, the pattern that matches the longest character sequence is used. If two or more patterns still meet this condition, the first one is used.
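The fragment below is a sketch of text processing with two patterns. Because the regexp attribute holds an expression, the regular expressions are written as string literals. The regular-expression dialect is an assumption (character classes and the + quantifier are presumed to be supported), the result elements number and todo are hypothetical, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="text()">
  <stx:process-text select=".">
    <!-- Replace each run of digits with a placeholder element -->
    <stx:pattern regexp="'[0-9]+'">
      <number/>
    </stx:pattern>
    <!-- Match the word TODO regardless of case -->
    <stx:pattern regexp="'TODO'" case="insensitive">
      <todo/>
    </stx:pattern>
  </stx:process-text>
</stx:template>
```

Text between the matches, and any remaining text after the last match, is emitted unchanged.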
stx:value-of
<!-- Category: template --> <stx:value-of select = expression/>
This instruction emits characters to the result stream. The mandatory
select
attribute contains an STXPath
expression which is evaluated and
converted to a string. This element is always empty.
stx:text
<!-- Category: template --> <stx:text markup = "error"|"ignore"|"serialize"> <!-- Content: template --> </stx:text>
This instruction emits literal character data to the result stream.
The optional markup
attribute determines how non-text
nodes in the content of stx:text
should be handled:
"error" causes the processor to raise a run-time
recoverable error for such nodes, "ignore" ignores
any markup by emitting only the string value of the contents to
the result stream, "serialize" emits any markup
serialized as text. The default value is "error".
The processor may recover from an error raised when
markup
is set to "error" by ignoring this
attempt.
Note:
The string created by markup="serialize"
may
vary in different STX implementations, because some of the
lexical representation is not relevant for the information coded
in XML. For example every STX implementation may choose its own
order for serializing attributes.
The stx:text
element has an implicit xml:space
attribute with the default value "preserve". Thus
the content is normally neither normalized nor stripped should it
contain whitespace characters only.
stx:cdata
<!-- Category: template --> <stx:cdata> <!-- Content: text template --> </stx:cdata>
This instruction emits literal data as a CDATA section to the result stream.
The stx:cdata
element has an implicit xml:space
attribute with the default value "preserve". Thus
the content is normally neither normalized nor stripped should it
contain whitespace characters only.
stx:element
<!-- Category: template --> <stx:element name = {qname} namespace = {uri-reference}> <!-- Content: template --> </stx:element>
This instruction is used to generate an element. It has the same meaning as in [XSLT].
stx:start-element
<!-- Category: template --> <stx:start-element name = {qname} namespace = {uri-reference}/>
stx:end-element
<!-- Category: template --> <stx:end-element name = {qname} namespace = {uri-reference}/>
There are separate instructions available to output an element start
tag and an element end tag. The name
attribute is required
for both instructions. Both elements must be empty.
A compliant STX processor is required to produce well-formed XML output. An attempt to create an end-tag without a matching start-tag must be reported as a non-recoverable error by the STX processor.
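Separate start and end instructions are useful when an element has to be opened in one template and closed in another. The sketch below assumes a hypothetical input that marks page boundaries with empty page-start and page-end elements; the stx prefix is assumed to be bound to the STX namespace.

```xml
<!-- Open a page element when the start marker is encountered -->
<stx:template match="page-start">
  <stx:start-element name="page"/>
</stx:template>

<!-- Close the page element when the end marker is encountered -->
<stx:template match="page-end">
  <stx:end-element name="page"/>
</stx:template>
```

If the input contained a page-end marker without a preceding page-start marker, the processor would have to report a non-recoverable error.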
stx:attribute
<!-- Category: template --> <stx:attribute name = {qname} namespace = {uri-reference} select = expression> <!-- Content: text template --> </stx:attribute>
This instruction is used to generate an attribute. It has the same
meaning as in [XSLT]. Alternatively, the value of the
generated attribute may be specified in the optional
select
attribute. It is a recoverable error if this
instruction has a select
attribute and is not
empty. A processor can recover from this error by ignoring the
content of stx:attribute
.
stx:attribute
must follow an element-starting instruction
(stx:element
, stx:start-element
,
stx:copy
, or a literal element) and no other
output-generating instructions are allowed between the
element-starting instruction and stx:attribute
.
It is a recoverable error if there is no immediate
element-starting instruction before. A processor can recover from
this error by ignoring the stx:attribute
instruction.
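A minimal sketch of both ways to specify the attribute value follows. The element names link and a and the attribute names target, href, and class are hypothetical; the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="link">
  <a>
    <!-- Value taken from the select attribute -->
    <stx:attribute name="href" select="@target"/>
    <!-- Value taken from the content -->
    <stx:attribute name="class">external</stx:attribute>
    <stx:process-children/>
  </a>
</stx:template>
```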
stx:processing-instruction
<!-- Category: template --> <stx:processing-instruction name = {ncname}> <!-- Content: text template --> </stx:processing-instruction>
This instruction is used to generate a processing instruction. It has the same meaning as in [XSLT].
stx:comment
<!-- Category: template --> <stx:comment> <!-- Content: text template --> </stx:comment>
This instruction is used to generate a comment. It has the same meaning as in [XSLT].
stx:if
<!-- Category: template --> <stx:if test = expression> <!-- Content: template --> </stx:if>
The mandatory test
attribute contains an STXPath
expression evaluating to boolean.
The content template is instantiated if and only
if the test
attribute has evaluated to true.
stx:else
<!-- Category: template --> <stx:else> <!-- Content: template --> </stx:else>
This instruction must follow immediately after stx:if
; a
non-recoverable error must be reported otherwise. The content
template is instantiated if and only if the test
attribute
of the preceding stx:if
instruction has evaluated to
false.
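The pairing of stx:if and stx:else can be sketched as follows. The element names task, done, and pending and the attribute status are hypothetical; the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="task">
  <stx:if test="@status = 'done'">
    <done/>
  </stx:if>
  <!-- stx:else must directly follow the stx:if it belongs to -->
  <stx:else>
    <pending/>
  </stx:else>
</stx:template>
```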
stx:choose
<!-- Category: template --> <stx:choose> <stx:when test = expression> <!-- Content: template --> </stx:when>+ <stx:otherwise> <!-- Content: template --> </stx:otherwise>? </stx:choose>
The same meaning as in [XSLT].
stx:for-each-item
<!-- Category: template --> <stx:for-each-item name = qname select = expression> <!-- Content: template --> </stx:for-each-item>
The stx:for-each-item
instruction contains a template that
is instantiated for each item of the sequence specified by the
mandatory select
attribute.
The mandatory name
attribute specifies the name of a local
variable that is declared automatically for each item, and that
contains the current item.
Neither the current node (accessed with .
) nor
sf:position() change inside stx:for-each-item
.
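A sketch of iterating over a sequence follows. It assumes that a sequence can be written with the parenthesized, comma-separated syntax of 6.6 Sequence Expressions; the element names row and cell and the variable name label are hypothetical, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="row">
  <!-- The local variable label holds the current item in each iteration -->
  <stx:for-each-item name="label" select="('a', 'b', 'c')">
    <cell><stx:value-of select="$label"/></cell>
  </stx:for-each-item>
</stx:template>
```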
stx:while
<!-- Category: template --> <stx:while test = expression> <!-- Content: template --> </stx:while>
The mandatory test
attribute contains an STXPath
expression evaluating to boolean.
The content of the stx:while
element is instantiated
repeatedly as long as the test
attribute evaluates to
true.
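Since the test is re-evaluated before each iteration, the loop body normally has to change a variable used in the test. A minimal sketch, assuming a hypothetical spacer element and a local counter named i (the stx prefix is assumed to be bound to the STX namespace):

```xml
<stx:template match="spacer">
  <stx:variable name="i" select="0"/>
  <!-- The content is instantiated while the counter is below 3 -->
  <stx:while test="$i &lt; 3">
    <br/>
    <stx:assign name="i" select="$i + 1"/>
  </stx:while>
</stx:template>
```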
stx:process-document
<!-- Category: template --> <stx:process-document href = expression base = {uri-reference}|"#input"|"#stylesheet" group = qname> <!-- Content: stx:with-param* --> </stx:process-document>
A stylesheet can process further source streams in addition to the one
supplied when the transformation is invoked (the principal source
stream). The current source stream can be changed with the
stx:process-document
instruction. When this instruction is
instantiated the expression in the
mandatory href
attribute will be evaluated, each
item in the resulting sequence will be converted sequentially to
a string (a URI), and its value will be used to identify and to
process a new current source stream. Then, the execution of the
template containing the stx:process-document
instruction
continues with the original source stream.
If a URI is a relative URI then the base URI will be derived
from the type of the item in the sequence that represents this
URI. In case this item is a node then its base URI will be used,
otherwise the base URI of the stylesheet will be used.
Alternatively, the optional base
attribute can be used
to specify explicitly which base URI should be used. Its value
must be either an absolute URI, the string
"#input" in which case the base URI of the
current input stream will be used, or the string
"#stylesheet" in which case the base URI of the
principal stylesheet will be used.
The optional group
attribute makes it possible to use the
specified group instead of the current group as the base for
matching (see 2.4 Precedence Categories). It is
a recoverable error if the group of the specified name is not
available. An STX processor can recover from this error by using the
current group.
Note:
When processing a new document, the ancestor stack of the original document is not available for matching and navigation. Each new document has an ancestor stack of its own.
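The sketch below processes an external document referenced from the input. The element name include and its href attribute are hypothetical; the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="include">
  <!-- Relative URIs are resolved against the base URI of the current input stream -->
  <stx:process-document href="@href" base="#input"/>
</stx:template>
```

After the referenced document has been processed, the template continues with the original source stream.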
stx:result-document
<!-- Category: template --> <stx:result-document href = expression encoding = string> <!-- Content: template --> </stx:result-document>
A stylesheet can produce further result streams in addition to the
principal result stream. The current result stream can be changed
with the stx:result-document
instruction. Events generated
as the result of executing instructions contained within the
stx:result-document
element are emitted to a new current
result stream identified with the URI which is the result of
evaluating the expression in the
required
href
attribute and converting its value to a string.
Then, the execution of instructions after the end of the
stx:result-document
element continues to emit events into
the original result stream.
The optional encoding
attribute can be used to specify
a preferred output encoding for the new result stream. If this
attribute is not present the encoding of the principal result
stream will be used.
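A minimal sketch of writing part of the input to a separate result stream follows. It assumes a hypothetical chapter element whose file attribute holds the target URI; the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="chapter">
  <!-- Events emitted by the content go to the stream identified by the file attribute -->
  <stx:result-document href="@file" encoding="UTF-8">
    <stx:copy attributes="@*">
      <stx:process-children/>
    </stx:copy>
  </stx:result-document>
  <!-- Events emitted from here on go to the original result stream again -->
</stx:template>
```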
A sequence of events can be stored into an object called a buffer. The stored events can be emitted and processed later, in the same way as events emitted from a source stream. The events are emitted from a buffer in the same order as they were stored in. In other words, the buffers are temporary storages of the 'first in first out' type. The events stored in a buffer must represent a well-formed external general parsed entity (the restriction on a single root node is relaxed).
There are two types of buffers:
group buffers - stx:buffer
is a child
of either stx:transform
or stx:group
. Top-level
buffers are considered members of the top-most default group
that exists for each stylesheet.
local buffers - Declared within templates.
A buffer must be declared before it can be used. The same rules as for variables (see 6.1 Variables) apply to the visibility of buffers, their shadowing, and the creation of new instances for new-scope templates (see 4.2 Templates).
stx:buffer
<!-- Category: top-level, group or template--> <stx:buffer name = qname> <!-- Content: template --> </stx:buffer>
The stx:buffer
element declares a buffer. The mandatory
name
attribute contains a qualified name identifying the
declared buffer. The buffer is initialized with events generated as
a result of the evaluation of the content of the stx:buffer
.
If the content is empty (stx:buffer
element has no children)
the buffer is empty.
For group buffers, the content of stx:buffer
element is
evaluated statically. It is a recoverable error if the element
stx:buffer
declaring a group buffer contains
an stx:process-children
, stx:process-self
,
stx:process-siblings
, stx:process-attributes
,
stx:process-document
, stx:process-buffer
, or
stx:call-procedure
instruction in its content. A
processor may recover from this error by ignoring such an
instruction.
stx:result-buffer
<!-- Category: template --> <stx:result-buffer name = qname clear = "yes"|"no"> <!-- Content: template --> </stx:result-buffer>
The stx:result-buffer
instruction directs events emitted by
its content into the buffer specified with the mandatory name
attribute rather than to the current result stream. The buffer must be
declared with stx:buffer
before it can be employed in
stx:result-buffer
.
If the buffer specified with the name
attribute already
contains a sequence of events, the new sequence of events is normally appended
after the last event in the previously stored sequence. If
the stx:result-buffer
element has the optional
clear
attribute with the value of "yes", the
previously stored events are removed from the buffer before the new
sequence of events is stored in. The clear
attribute
defaults to "no".
Note:
To clear a buffer without storing a new sequence of events, use the
stx:result-buffer
instruction with no content:
<stx:result-buffer name="my-buffer" clear="yes"/>
The events stored in a buffer become available to a subsequent
stx:process-buffer
only after the
stx:result-buffer
instruction has terminated. Until then,
the previous contents remain accessible. Thus, to process a buffer
and store the result in the same buffer again, use
<stx:result-buffer name="b" clear="yes">
<stx:process-buffer name="b"/> </stx:result-buffer>
It is a non-recoverable error if this instruction is executed for a buffer that already acts as a (current or suspended) result buffer.
stx:process-buffer
<!-- Category: template --> <stx:process-buffer name = qname group = qname> <!-- Content: stx:with-param* --> </stx:process-buffer>
The stx:process-buffer
instruction emits the events
currently stored in the buffer specified by the mandatory
name
attribute to the STX processor. The events are
processed in the same way as events supplied by source streams. When
the very last event from the buffer is processed, the processing in the
current template continues with the instruction, declaration, or literal
following the stx:process-buffer
instruction.
Note:
Changes to the contents of a buffer that is currently processed
won't affect this processing. The stx:process-buffer
instruction creates an internal copy of the contained events
and emits them afterwards.
The optional group
attribute makes it possible to use the
specified group instead of the current group as the base for
matching (see 2.4 Precedence Categories). It is
a recoverable error if the group of the specified name is not
available. An STX processor can recover from this error by using the
current group.
The processing of events from a buffer doesn't mean the emptying of this buffer. Once a sequence of events is stored in the buffer, it can be processed repeatedly.
Note:
A buffer is not treated as a new document, but rather as if events emitted from the buffer originate from the current source stream. The ancestor stack of the current source stream remains available for matching and navigation when processing nodes from the buffer.
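The interplay of the three buffer elements can be sketched as follows. The example collects hypothetical note elements in a group buffer and emits them later at a hypothetical notes-placeholder element. The buffer name notes, the group name emit, and all element names are illustrative, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<!-- Group buffer of the default group, initially empty -->
<stx:buffer name="notes"/>

<stx:template match="note">
  <!-- Append a copy of this note to the buffer instead of the result stream -->
  <stx:result-buffer name="notes">
    <stx:copy attributes="@*">
      <stx:process-children group="emit"/>
    </stx:copy>
  </stx:result-buffer>
</stx:template>

<stx:template match="notes-placeholder">
  <collected>
    <!-- Replay the stored events; they are matched against the templates of group emit -->
    <stx:process-buffer name="notes" group="emit"/>
  </collected>
</stx:template>

<stx:group name="emit">
  <stx:template match="node()">
    <stx:copy attributes="@*">
      <stx:process-children/>
    </stx:copy>
  </stx:template>
</stx:group>
```

The separate group is a design choice of this sketch: if the buffered note elements were matched by the first template again, they would be stored in the buffer a second time rather than emitted.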
stx:message
<!-- Category: template --> <stx:message> <!-- Content: template --> </stx:message>
The stx:message
instruction generates a separate result
stream whose handling is implementation dependent. It can be
directed to a log, or to a special message resolver, etc. However,
all instructions of the content of the stx:message
element
must be processed even if the message stream is ignored.
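A minimal sketch of emitting a message follows; the element name deprecated is hypothetical, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="deprecated">
  <!-- The content is evaluated even if the processor discards the message stream -->
  <stx:message>Deprecated element encountered: <stx:value-of select="name()"/></stx:message>
</stx:template>
```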
There are four atomic data types in STX:
string
number
boolean
node
There are seven types of node recognized in STXPath (see 2.2 Nodes). For every type of node, there is a way of determining the string-value. Since descendants are not available at the time of processing, string-values for some types of nodes are different from XPath string-values.
root nodes - there is no string value defined for root nodes, a recoverable error is reported. An STX processor is allowed to recover from this error by returning the empty string.
element nodes - if the very first child of an element happens to be a text node, the string-value of the element is the string-value of this text node. Otherwise, the string-value of the element is the empty string.
attribute nodes - the string-value of an attribute is the normalized value of this attribute
text nodes - the string-value of a text node is the character data of this node
cdata nodes - the string-value of a cdata node is the character data of this node
processing instruction nodes - the string-value of a
processing instruction node is the part of the processing
instruction following the target and any whitespace not including
the terminating ?>
comment nodes - the string-value of a comment is the content
of this comment not including the opening <!--
or
the closing -->
STXPath expressions (see 6 STXPath) always return a sequence. A sequence is an ordered collection of zero or more items. Unlike common lists, sequences are "flat"; sequences may not contain other sequences. Sequences may contain duplicate items. An item must be of one of the atomic types: string, number, boolean, or node.
A sequence with zero items is called an empty sequence. A sequence with exactly one item is called a singleton sequence. There is no distinction between an item and a singleton sequence containing this item; an item is equivalent to a singleton sequence containing this item and vice versa. A sequence has no identity. Equality comparison of sequences is performed only by comparing items of the sequences.
Certain operators, functions, and syntactic constructs expect a value of a particular type to be supplied: this type is referred to as a required type. In such an event, a general sequence is converted to the required type according to the conversion rules.
The empty sequence is converted to required types as defined in the following table:
required type | result |
---|---|
boolean | false |
string | empty string |
number | NaN |
node | NON-RECOVERABLE ERROR |
A singleton sequence is converted to a required type according to the type of the only item in the sequence. An attempt to convert boolean, string, or number to node causes a non-recoverable error.
item type | boolean required | string required | number required |
---|---|---|---|
boolean | | false is converted to 'false', true is converted to 'true' | false is converted to 0, true is converted to 1 |
string | the empty string is converted to false, other strings are converted to true | | a string that consists of optional whitespace followed by an optional minus sign followed by a numeric literal (see 6.2 Literals) followed by whitespace is converted to the number that is nearest to the mathematical value represented by the string; any other string is converted to NaN. |
number | 0, +0, -0, NaN are converted to false, other numbers are converted to true | NaN is converted to 'NaN', +0 and -0 are converted to '0', positive infinity is converted to 'Infinity', negative infinity is converted to '-Infinity'. Other numbers are represented in decimal form as numeric literal (see 6.2 Literals) with no leading zeros (apart possibly from the one required digit immediately before the decimal point), preceded by a minus sign (-) if the number is negative. | |
node | a node is converted to true | a node is converted to its string value (see 2.5 Expressions) | a node is converted to its string value (see 2.5 Expressions); then the rules to convert strings to numbers are applied to convert the string value to a number |
A sequence containing more than one item is converted according to its very first item; all other items are ignored. The same conversion rules as for singleton sequences are applied (see the table above).
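The conversion rules can be illustrated with a few tests. The sketch assumes the parenthesized sequence syntax of 6.6 Sequence Expressions and assumes that arithmetic operators require operands of type number; the element names sample, never, and always are hypothetical, and the stx prefix is assumed to be bound to the STX namespace.

```xml
<stx:template match="sample">
  <!-- Empty sequence, boolean required: false, so the content is skipped -->
  <stx:if test="()"><never/></stx:if>
  <!-- Non-empty string, boolean required: true, whatever the string says -->
  <stx:if test="'false'"><always/></stx:if>
  <!-- Several items: only the first item, the number 0, is considered: false -->
  <stx:if test="(0, 1, 2)"><never/></stx:if>
  <!-- String, number required: ' 12 ' is converted to 12 before the addition -->
  <stx:value-of select="' 12 ' + 1"/>
</stx:template>
```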
STXPath is an expression language for STX which is very similar to [XPath] at first sight. Syntactically, STXPath is close to an [XPath2] subset. However, since STX has a different notion of context, the meaning of some expressions may be different in STXPath and in XPath. Consider the following example:
In XPath, the expression /node1/node2
returns a node-set
containing all node2
elements, whose parent node1
is
the document element. In STXPath, by contrast, the same expression
returns only a single node from this node-set; the one which is an
ancestor of the current node.
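The difference can be made concrete with a small source document. The element leaf is hypothetical and is only used here to mark the current node.

```xml
<!-- Hypothetical source document -->
<node1>
  <node2>first</node2>
  <node2>
    <!-- Assume leaf is the current node when /node1/node2 is evaluated -->
    <leaf/>
  </node2>
</node1>
<!-- XPath:   /node1/node2 selects both node2 elements -->
<!-- STXPath: /node1/node2 yields only the second node2, the ancestor of leaf -->
```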
Each expression has its static context - the information that is available during static analysis of the expression, prior to its evaluation. The static context includes in-scope namespaces, default namespace for element names, default function namespace, and in-scope variables. The information that is available at the time when the expression is evaluated is the current context as defined in 2.3 Context.
Basic primitives of STXPath include:
variables (6.1 Variables)
literals (6.2 Literals)
parenthesized expressions (6.3 Parenthesized Expressions)
functions (6.4 Functions)
Expressions always evaluate to a sequence. See the EBNF production for expression in C STXPath Grammar for the details.
STX variables are scoped statically according to the literal structure of stylesheets. The grouping of templates is used to make the sharing of other than global variables possible.
There are two types of variables:
group variables - stx:variable
is a child
of either stx:transform
or stx:group
. Top-level
variables are considered to be members of the top-most default
group that exists for each stylesheet.
local variables - Declared within templates.
A group variable is visible for the group where the variable is declared, for all descendant groups and for all templates belonging to these groups. A local variable is visible for all following siblings of the variable declaration and their descendants. Group variables may be shadowed (another variable with the same name is visible) by descendant group variables and by local variables. It is a non-recoverable error to redeclare a variable with the same name in the same group or template.
Variables always contain a sequence. STX instructions stx:variable and stx:assign are used to evaluate an expression and store its value into a variable.
Since variables are re-assignable, each variable must be declared using the stx:variable element before it is used (assigned or referenced). Group variables are statically initialized while parsing the stylesheet; only the static context information is available during the initialization of group variables. Local variables are initialized at run-time. A variable declared with no value is initialized with the singleton sequence containing the empty string.
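The scoping, shadowing and initialization rules described so far can be modelled as a chain of scopes. The following Python sketch is illustrative only and not part of STX: the class Scope and its methods are hypothetical names, a sequence is represented as a Python tuple, and the visibility of local variables to following siblings only is not modelled.

```python
class Scope:
    """One group or template; `parent` is the enclosing group (hypothetical model)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.variables = {}

    def declare(self, name, value=("",)):
        # Redeclaring a name in the same group or template is a non-recoverable error.
        if name in self.variables:
            raise ValueError(f"variable '{name}' is already declared in this scope")
        # With no value given, the variable holds a singleton sequence
        # containing the empty string.
        self.variables[name] = tuple(value)

    def _owner(self, name):
        # The innermost declaration shadows declarations in ancestor groups.
        scope = self
        while scope is not None:
            if name in scope.variables:
                return scope
            scope = scope.parent
        raise KeyError(f"variable '{name}' has not been declared")

    def lookup(self, name):
        return self._owner(name).variables[name]

    def assign(self, name, value):
        # Assignment requires a previous declaration and changes the visible instance.
        self._owner(name).variables[name] = tuple(value)


root = Scope()                  # the top-most default group
root.declare("count", (0,))
group = Scope(root)             # a descendant group
group.declare("count", (10,))   # shadows the variable of the outer group
template = Scope(group)         # a template belonging to the descendant group
template.declare("tmp")         # declared with no value
template.assign("count", (11,)) # changes the shadowing variable, not the outer one
```

In this model an assignment from within the template affects the nearest visible declaration, so the variable of the top-most group keeps its original value.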
stx:variable
<!-- Category: top-level or group or template -->
<stx:variable
  name = qname
  select = expression
  keep-value = "yes"|"no">
  <!-- Content: text template -->
</stx:variable>
This instruction is used to declare and initialize a variable. The mandatory name attribute contains the name of the variable. An expression in the select attribute is evaluated and the variable is initialized with its result. The select attribute is optional; a variable is initialized with the string resulting from the content of the stx:variable element if the select attribute is missing. If the content is empty (the stx:variable element has no children) the variable is initialized with the empty string.
Note:
Thus, <stx:variable name="var"/> is equivalent to <stx:variable name="var" select="''"/>.
It is a recoverable error if the element stx:variable declaring a group variable contains an stx:process-children, stx:process-self, stx:process-siblings, stx:process-attributes, stx:process-document, stx:process-buffer, or stx:call-procedure instruction in its content. A processor may recover from this error by ignoring such an instruction.
The optional keep-value attribute specifies whether a new instance of the variable created by instantiating a template having its new-scope attribute set to "yes" is initialized with the value of the shadowed variable (yes) or not (no). This attribute is allowed only for group variables. The default value is no. If there is no shadowed variable yet, the keep-value attribute is ignored.
stx:assign
<!-- Category: top-level or group or template -->
<stx:assign
  name = qname
  select = expression>
  <!-- Content: text template -->
</stx:assign>
This instruction is used to assign a new value to a previously declared variable. The mandatory name attribute contains the name of the variable. The expression in the optional select attribute is evaluated and its result is assigned to the variable. The string resulting from the content of the stx:assign element is assigned to the variable if the select attribute is missing. If the content is empty, the empty string is assigned to the variable.
Note:
Thus, <stx:assign name="var"/> is equal to <stx:assign name="var" select="''"/>.
A literal is a direct syntactic representation of an atomic value. STXPath supports two kinds of literals: string literals and numeric literals.
The value of a string literal is a singleton sequence containing an item whose atomic type is string and whose value is the string denoted by the characters between the delimiting quotation marks.
The value of a numeric literal is a singleton sequence containing an item whose type is number and whose value is obtained by parsing the numeric literal according to the rules for string-to-number conversion (see 5.3 Type Conversions).
NumericLiteral ::= IntegerLiteral | DecimalLiteral | DoubleLiteral
IntegerLiteral ::= Digits
DecimalLiteral ::= ('.' Digits) | (Digits '.' [0-9]*)
DoubleLiteral ::= (('.' Digits) | (Digits ('.' [0-9]*)?)) ([e] | [E]) ([+] | [-])? Digits
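As a non-normative illustration, the numeric literal productions can be transcribed into regular expressions. The Python sketch below assumes that the exponent sign is an optional '+' or '-', as in XPath 2.0; the function literal_kind and the constant names are hypothetical.

```python
import re

# Direct transcriptions of the productions; Digits ::= [0-9]+
DIGITS = r"[0-9]+"
INTEGER = DIGITS
DECIMAL = rf"(?:\.{DIGITS}|{DIGITS}\.[0-9]*)"
# Assumption: the exponent sign is an optional '+' or '-'.
DOUBLE = rf"(?:\.{DIGITS}|{DIGITS}(?:\.[0-9]*)?)[eE][+-]?{DIGITS}"


def literal_kind(text):
    """Return the name of the production matching all of `text`, or None."""
    for kind, pattern in (("integer", INTEGER),
                          ("decimal", DECIMAL),
                          ("double", DOUBLE)):
        if re.fullmatch(pattern, text):
            return kind
    return None
```

Note that a leading minus sign is not part of a numeric literal; it is the unary operator of an arithmetic expression, so "-1" is not matched by any of these productions.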
Parentheses may be used to enforce a particular evaluation order in expressions that contain multiple operators.
Parentheses are also used as delimiters in constructing a sequence, as described in 6.6 Sequence Expressions.
A function call consists of a function qualified name followed by a parenthesized list of zero or more expressions. The expressions inside the parentheses provide the arguments of the function call. The number of arguments must be equal to the number of function parameters; otherwise a static non-recoverable error is raised.
A function call is evaluated as follows:
Each argument expression is evaluated, producing an argument value (sequence).
If the corresponding function parameter has a required type, the argument value is converted to this type.
The function is executed using the converted argument values. The result is a value of the function's declared return type.
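The three evaluation steps can be sketched as follows. This Python fragment is an illustration under stated assumptions, not a description of any STX processor: call_function, the signature list of converters and the evaluate callback are hypothetical, and the argument count is checked at call time here although the specification treats a mismatch as a static error.

```python
def call_function(signature, implementation, argument_exprs, evaluate):
    """Evaluate a function call.

    signature      -- one entry per parameter: a converter callable, or None
                      if the parameter has no required type (hypothetical model)
    implementation -- the callable that executes the function
    argument_exprs -- the argument expressions, in order
    evaluate       -- callable that evaluates one expression to a sequence
    """
    # The number of arguments must equal the number of parameters.
    if len(argument_exprs) != len(signature):
        raise TypeError("wrong number of arguments")
    # Step 1: evaluate each argument expression to a sequence.
    values = [tuple(evaluate(expr)) for expr in argument_exprs]
    # Step 2: convert each value to the required type of its parameter, if any.
    converted = [convert(value) if convert is not None else value
                 for convert, value in zip(signature, values)]
    # Step 3: execute the function with the converted argument values.
    return implementation(*converted)


# A count-like function with one untyped parameter; expressions are modelled
# as already-evaluated tuples, so `evaluate` simply returns its argument.
result = call_function([None], lambda seq: (len(seq),), [(1, 2, 3)], lambda e: e)
```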
STXPath function names are contained in the reserved namespace http://stx.sourceforge.net/2003/functions. The sf: prefix is used to refer to this namespace in this document. The default function namespace is assigned to this reserved namespace in STX. Thus, the functions namespace does not need to be declared in STX stylesheets and STXPath functions can be invoked without any namespace prefix.
Some STXPath functions have the same definitions as their counterparts (functions with the same local name) in XPath 2.0. These functions are not re-defined in this section. Instead, original definitions in [Functions and Operators] are referenced. Other STXPath functions are either different from their XPath 2.0 counterparts or have no such counterparts; these functions are defined in this section.
All errors raised when evaluating STXPath functions are non-recoverable errors (see 2.9 Errors).
Indicates whether or not the provided sequence is empty.
See the definition in [Functions and Operators].
Indicates whether or not the provided sequence is not empty.
See the definition in [Functions and Operators].
Returns the item at given index.
See the definition in [Functions and Operators].
Returns a sequence of integer numbers, each of which is the index of a member of the specified sequence that is equal to the item that is the value of the second argument.
See the definition in [Functions and Operators].
Returns the subsequence of a given sequence identified by location.
See the definition in [Functions and Operators].
Inserts an item or sequence of items into a specified position of a sequence.
See the definition in [Functions and Operators].
Removes an item from a specified position of a sequence.
See the definition in [Functions and Operators].
Returns the name of the current node or the specified node.
See the definition in [Functions and Operators].
Returns the namespace URI for the QName of the argument node or the current node if the argument is omitted.
See the definition in [Functions and Operators].
Returns the local name of the current node or the specified node.
See the definition in [Functions and Operators].
The sf:position function normally returns a number equal to the position of the current node among its siblings; see 2.3 Context for the details of sf:position() semantics.
The sf:has-child-nodes function returns true if and only if the current node is the document node or an element node and has child nodes (it is not empty). It returns false otherwise.
The sf:node-kind function returns a string value representing the node's kind: either "document", "element", "attribute", "text", "cdata", "processing-instruction", or "comment".
Returns the names of the in-scope namespaces for the given element.
See the definition in [Functions and Operators].
Returns the namespace URI of one of the in-scope namespaces for the given element, identified by its namespace prefix.
See the definition in [Functions and Operators].
Returns true or false depending on whether the language of the current node, as defined using the xml:lang attribute, is the same as, or a sub-language of, the language specified by the argument.
See the definition in [Functions and Operators].
Returns the boolean value TRUE.
See the definition in [Functions and Operators].
Returns the boolean value FALSE.
See the definition in [Functions and Operators].
Inverts the boolean value of the argument.
See the definition in [Functions and Operators].
Concatenates two or more character strings.
See the definition in [Functions and Operators].
Accepts a sequence of strings and returns the strings concatenated together with an optional separator.
See the definition in [Functions and Operators].
Indicates whether the value of one string begins with the characters of the value of another string.
See the definition in [Functions and Operators].
Indicates whether the value of one string ends with the characters of the value of another string.
See the definition in [Functions and Operators].
Indicates whether the value of one string contains the characters of the value of another string.
See the definition in [Functions and Operators].
Returns a string located at a specified place in the value of a string.
See the definition in [Functions and Operators].
Returns the characters of one string that precede in that string the characters in the value of another string.
See the definition in [Functions and Operators].
Returns the characters of one string that succeed in that string the characters in the value of another string.
See the definition in [Functions and Operators].
Returns the length of the argument.
See the definition in [Functions and Operators].
Returns the whitespace-normalized value of the argument.
See the definition in [Functions and Operators].
Returns the normalized value of the first argument in the normalization form specified by the second argument.
See the definition in [Functions and Operators].
Returns the upper-cased value of the argument.
See the definition in [Functions and Operators].
Returns the lower-cased value of the argument.
See the definition in [Functions and Operators].
Returns the first argument string with occurrences of characters in the second argument replaced by the character at the corresponding position in the third string.
See the definition in [Functions and Operators].
Returns a string composed of as many copies of its first argument as specified in its second argument.
See the definition in [Functions and Operators].
Returns a boolean value that indicates whether the value of the first argument is matched by the regular expression that is the value of the second argument.
See the definition in [Functions and Operators].
Returns the value of the first argument with every substring matched by the regular expression that is the value of the second argument replaced by the replacement string that is the value of the third argument.
See the definition in [Functions and Operators].
Returns a sequence of zero or more strings whose values are substrings of the value of the first argument separated by substrings that match the regular expression that is the value of the second argument.
See the definition in [Functions and Operators].
Returns the string representing a URI value with certain characters escaped.
See the definition in [Functions and Operators].
Returns the largest integer less than or equal to the argument.
See the definition in [Functions and Operators].
Returns the smallest integer greater than or equal to the argument.
See the definition in [Functions and Operators].
Rounds to the nearest integer.
See the definition in [Functions and Operators].
Returns the number of items in the sequence.
See the definition in [Functions and Operators].
The sf:sum function returns the sum, for each item in the argument sequence, of the result of converting the item to a number. If the value of the argument is the empty sequence, the function returns the empty sequence. If an item can't be converted to a number, then an error is raised.
The sf:avg function returns the average of all items in the argument sequence converted to numbers. If the argument sequence is the empty sequence, the empty sequence is returned. If an item can't be converted to a number, then an error is raised.
The sf:max function converts all items of the argument sequence to numbers and returns the item whose value is greater than or equal to the value of every other item in the argument sequence. If there are two or more such items, then the specific item whose value is returned is implementation-dependent. If the argument sequence is the empty sequence, the empty sequence is returned. If an item can't be converted to a number, then an error is raised.
The sf:min function converts all items of the argument sequence to numbers and returns the item whose value is less than or equal to the value of every other item in the argument sequence. If there are two or more such items, then the specific item whose value is returned is implementation-dependent. If the argument sequence is the empty sequence, the empty sequence is returned. If an item can't be converted to a number, then an error is raised.
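The common behaviour of the aggregate functions (empty sequence in, empty sequence out; an error if an item is not convertible) can be sketched in Python. This is an illustration only: sequences are modelled as tuples, and require_number is an assumed stand-in for the conversion rules of 5.3 Type Conversions, which are not reproduced here.

```python
def require_number(item):
    """Assumed stand-in for the number conversion of 5.3 Type Conversions.

    float() raises ValueError for strings such as "abc", which models the
    error raised when an item cannot be converted.
    """
    return float(item)


def sf_sum(seq):
    numbers = [require_number(item) for item in seq]
    return (sum(numbers),) if numbers else ()


def sf_avg(seq):
    numbers = [require_number(item) for item in seq]
    return (sum(numbers) / len(numbers),) if numbers else ()


def sf_max(seq):
    numbers = [require_number(item) for item in seq]
    return (max(numbers),) if numbers else ()


def sf_min(seq):
    numbers = [require_number(item) for item in seq]
    return (min(numbers),) if numbers else ()
```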
The sf:string function returns the result of converting the argument to a string. See 5.3 Type Conversions for details.
The sf:number function returns the result of converting the argument to a number. See 5.3 Type Conversions for details.
The sf:boolean function returns the result of converting the argument to a boolean. See 5.3 Type Conversions for details.
The only data available when processing the current node is the data related to the current node itself, the data related to nodes on the ancestor stack, and data stored in variables. Location paths called data accessors are special expressions used to access this data.
A location path always operates on the ancestor stack and evaluates to a sequence of nodes from this stack. A path beginning with / or // is called an absolute location path, starting at the document root; otherwise it is called a relative location path, starting at the current node, which is always the topmost node of the ancestor stack.
A location path consists of a series of one or more steps, separated by "/" or "//". This sequence of steps is evaluated from left to right. A path S1/S2 is evaluated as follows: S1 evaluates to a sequence of nodes from the ancestor stack. Each of these nodes acts as a base node for the following step S2.
If the step S2 is a NodeNameTest or a KindTest then the result is the node on the ancestor stack following the base node, provided it matches this step. In other words: such a step selects the child of the base node (or the empty sequence).
If the step S2 is ".." then the result is the node on the ancestor stack preceding the base node, or in other words: the parent node, provided the base node is not the root node (otherwise the empty sequence).
A path S1//S2 is evaluated by evaluating the sub-expression /S2 on a sequence of base nodes which is the concatenation of the nodes from evaluating S1 and all of the descendant nodes of S1.
The result of evaluating the path is the concatenation of all resulting nodes into a sequence, sorted in document order without duplicate nodes.
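The evaluation rules above can be illustrated with a small Python model of the ancestor stack. The sketch rests on several assumptions that are not stated in this section: the stack is a list of node names with the document node at position 0 and the current node last, every other node is an element, nodes are identified by their position on the stack, and only name tests, "*", "node()" and ".." are supported (no attributes and no other kind tests). The function name evaluate_path is hypothetical.

```python
import re


def evaluate_path(path, stack):
    """Return the stack positions selected by `path`, in document order."""
    tokens = re.findall(r"//|/|[^/]+", path)
    if tokens and tokens[0] in ("/", "//"):
        bases = {0}                    # absolute path: start at the document root
    else:
        bases = {len(stack) - 1}       # relative path: start at the current node
        tokens.insert(0, "/")
    # Process (separator, step) pairs from left to right.
    for separator, step in zip(tokens[0::2], tokens[1::2]):
        if separator == "//":
            # Base nodes are the selected nodes plus all of their
            # descendants that are on the stack.
            bases = {j for i in bases for j in range(i, len(stack))}
        selected = set()
        for i in bases:
            if step == "..":
                if i > 0:              # the root has no parent
                    selected.add(i - 1)
            elif i + 1 < len(stack) and step in ("*", "node()", stack[i + 1]):
                selected.add(i + 1)    # the node following the base node
        bases = selected
    # Document order, no duplicates.
    return sorted(bases)


# Ancestor stack while processing the innermost foo element.
stack = ["#document", "aaa", "foo", "bbb", "foo"]
parent = evaluate_path("..", stack)         # position of the bbb element
all_foo = evaluate_path("//foo", stack)     # positions of both foo elements
```

Under this model a relative name test such as foo always yields the empty sequence, because the current node is the topmost node of the stack and has no following node.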
Besides location paths, variables and function calls may also evaluate to a sequence of nodes. Such an expression is called a NodeAccessor.
A node accessor may be optionally followed by a last step which accesses the attributes for each of the nodes selected by the node accessor.
Note:
Compared to full XPath, the location paths in STXPath allow only abbreviated axes. Moreover, variables and function calls cannot be used as the first step of a location path, unless the next step accesses only attributes. A location path can only select nodes from the ancestor stack.
Predicates are not allowed in data accessors.
The sub-expression "." cannot be used as a step within paths. This expression returns the current node (the topmost node from the ancestor stack).
Here are some examples of data accessors:
.. - returns the parent node of the current node
//foo - returns a sequence whose items are all foo elements on the ancestor stack
@foo - returns the foo attribute of the current node
../../@bar - returns the bar attribute of the grandparent of the current node
/aaa/bbb - returns a bbb element from the ancestor stack which is a child of the aaa element, which in turn is the root element of the ancestor stack (and hence the root element of the input document)
/*//node() - returns all nodes from the ancestor stack except for the first
STXPath supports operators to construct and combine sequences. One way to construct a sequence is using a parenthesized expression (6.3 Parenthesized Expressions), which consists of zero or more expressions separated with the comma operator and delimited with parentheses. The parenthesized expression is evaluated by evaluating each of its constituent expressions and concatenating the resulting sequences, in order, into a single result sequence.
Here are some examples of expressions that construct sequences:
This expression is a sequence of five integers:
(10, 1, 2, 3, 4)
This expression constructs one sequence from the sequences 10, (1, 2), the empty sequence (), and (3, 4):
(10, (1, 2), (), (3, 4))
It evaluates to the sequence (10, 1, 2, 3, 4).
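The flattening behaviour of the comma operator can be mimicked in Python, with tuples standing in for sequences. The helper name sequence is hypothetical and the sketch is purely illustrative.

```python
def sequence(*operands):
    """Concatenate the operand sequences, in order, into one flat sequence."""
    result = []
    for operand in operands:
        if isinstance(operand, tuple):
            result.extend(operand)   # a sequence contributes its items
        else:
            result.append(operand)   # a single item is a singleton sequence
    return tuple(result)


flat = sequence(10, sequence(1, 2), sequence(), sequence(3, 4))
```

Because every call returns a flat tuple, nested calls never produce nested sequences, which matches the rule that sequences do not contain other sequences.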
STXPath provides arithmetic operators for addition, subtraction, multiplication, division, and modulus, in their usual binary and unary forms. The binary subtraction operator must be preceded by whitespace in order to distinguish it from a hyphen, which is a valid name character.
An arithmetic expression is evaluated by applying the following rules:
If either operand is the empty sequence, the result of the operation is the empty sequence.
Operands other than empty sequences are converted (5.3 Type Conversions) to numbers before the expression is evaluated. If the conversion fails (returns NaN), the expression evaluates to NaN.
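The two rules can be sketched in Python as follows. The sketch is illustrative and makes assumptions beyond this section: operands are singleton tuples or the empty tuple, number_or_nan is a stand-in for the conversion of 5.3 Type Conversions, and only addition, subtraction and multiplication are shown.

```python
import math
import operator

OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def number_or_nan(item):
    """Assumed stand-in for 5.3 Type Conversions: a failed conversion yields NaN."""
    try:
        return float(item)
    except (TypeError, ValueError):
        return math.nan


def arithmetic(op, left, right):
    # Rule 1: an empty operand makes the whole result the empty sequence.
    if not left or not right:
        return ()
    # Rule 2: convert both operands to numbers; NaN propagates through
    # the operation, so a failed conversion yields NaN.
    a, b = number_or_nan(left[0]), number_or_nan(right[0])
    return (OPERATORS[op](a, b),)
```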
Comparison expressions allow two values to be compared. STXPath provides the following general comparison operators: =, !=, <, <=, >, >=. The result of a comparison is always true or false (a singleton sequence containing one boolean item).
The result of a comparison of sequences is defined by applying the following rules, in order:
If either operand is the empty sequence, the result is false.
The comparison A operator B is true for sequences A and B if the comparison a operator b is true for some item a in A and some item b in B.
Otherwise, A operator B is false.
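The sequence-level rules amount to an existential test over all pairs of items. The Python sketch below is illustrative: tuples stand in for sequences, general_compare is a hypothetical name, and the items are assumed to be already comparable, so the item-level conversions described next are left out.

```python
import operator

COMPARISONS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def general_compare(op, left, right):
    """True if `a op b` holds for some item a in left and some item b in right."""
    compare = COMPARISONS[op]
    # any() over an empty set of pairs is False, which covers the rule
    # for empty operands.
    return any(compare(a, b) for a in left for b in right)
```

One consequence of the existential semantics is that both (1, 2) = (1) and (1, 2) != (1) are true.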
The result of a comparison of items is defined by applying the following rules. The rules defined in 5.3 Type Conversions apply for conversions:
If both items to be compared are nodes, then the comparison will be true if and only if the result of performing the comparison on the string-values of the two nodes is true.
If one item to be compared is a node and the other is a number, then the comparison will be true if and only if the result of performing the comparison on the number and on the result of converting the string-value of that node to a number is true.
If one item to be compared is a node and the other is a string, then the comparison will be true if and only if the result of performing the comparison on the string-value of the node and the other string is true.
If one item to be compared is a node and the other is a boolean, then the comparison will be true if and only if the result of performing the comparison of true and the boolean value is true.
When neither item to be compared is a node and the operator is = or !=, then the items are compared by converting them to a common type as follows and then comparing them. If at least one item to be compared is a boolean, then each item to be compared is converted to a boolean. Otherwise, if at least one item to be compared is a number, then each item to be compared is converted to a number. Otherwise, both items to be compared are converted to strings.
When neither item to be compared is a node and the operator is <=, <, >= or >, then the items are compared by converting both items to numbers and comparing the numbers.
STXPath provides two common logical operators: and and or. The value of a logical expression is always one of the boolean values true or false (a singleton sequence containing a boolean item).
Logical expressions are evaluated by reducing each of their operands to an effective boolean value by applying the following rules, in order:
If the operand is the empty sequence, its effective boolean value is false.
If the operand is a singleton sequence containing a boolean item, the item serves as the effective boolean value.
If the operand is a sequence that contains at least one node, its effective boolean value is true.
In any other case, operands are converted to boolean (see 5.3 Type Conversions) to get effective boolean values.
An AND expression returns true if the effective boolean values of both of its operands are true; otherwise it returns false.
An OR expression returns false if the effective boolean values of both of its operands are false; otherwise it returns true.
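The ordered rules for the effective boolean value, and the two operators built on them, can be sketched in Python. This is an illustration only: tuples stand in for sequences, the Node class is a placeholder for any node item, and the fallback conversion of 5.3 Type Conversions is passed in as the convert argument because its rules are not reproduced in this section.

```python
class Node:
    """Placeholder for any node item (hypothetical)."""


def effective_boolean_value(seq, convert):
    # Rule 1: the empty sequence is false.
    if len(seq) == 0:
        return False
    # Rule 2: a singleton boolean serves as the value itself.
    if len(seq) == 1 and isinstance(seq[0], bool):
        return seq[0]
    # Rule 3: a sequence containing at least one node is true.
    if any(isinstance(item, Node) for item in seq):
        return True
    # Rule 4: any other operand is converted to boolean (5.3 Type Conversions).
    return convert(seq)


def logical_and(left, right, convert):
    return (effective_boolean_value(left, convert)
            and effective_boolean_value(right, convert),)


def logical_or(left, right, convert):
    return (effective_boolean_value(left, convert)
            or effective_boolean_value(right, convert),)


# Assumption for demonstration only: convert by the truthiness of the first item.
assumed_convert = lambda seq: bool(seq[0])
value = logical_or((), (Node(),), assumed_convert)
```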
In addition to logical expressions, STXPath provides a function named not() that takes a general sequence as its parameter and returns a boolean value.
Plain list only so far:
stx:transform stx:include stx:namespace-alias stx:template stx:procedure stx:group stx:call-procedure stx:copy stx:process-children stx:process-attributes stx:process-siblings stx:process-self stx:value-of stx:text stx:cdata stx:element stx:start-element stx:end-element stx:processing-instruction stx:comment stx:attribute stx:if stx:else stx:choose stx:when stx:otherwise stx:variable stx:assign stx:with-param stx:param stx:for-each-item stx:while stx:process-document stx:result-document stx:buffer stx:process-buffer stx:result-buffer stx:process-siblings stx:process-text stx:pattern
The following is a complete grammar for STXPath in EBNF notation.
[1] pattern ::= PathPattern ('|' PathPattern)?
[2] expression ::= Expr
[3] PathPattern ::= AbsolutePattern | RelativePattern
[4] AbsolutePattern ::= '/' RelativePattern?
[5] RelativePattern ::= Step (('/' RelativePattern) | ('//' RelativePattern))?
[6] Step ::= NodeTest Predicate?
[7] NodeTest ::= NameTest | KindTest
[8] Predicate ::= '[' Expr ']'
[9] NameTest ::= NodeNameTest | AttributeNameTest
[10] NodeNameTest ::= QName | NCName ':' '*' | '*' | '*' ':' NCName
[11] AttributeNameTest ::= '@' QName | '@' NCName ':' '*' | '@' '*' | '@' '*' ':' NCName
[12] KindTest ::= AnyKindTest | CommentTest | ProcessingInstructionTest | TextTest | CDATATest
[13] AnyKindTest ::= 'node()'
[14] CommentTest ::= 'comment()'
[15] ProcessingInstructionTest ::= 'processing-instruction(' StringLiteral? ')'
[16] TextTest ::= 'text()'
[17] CDATATest ::= 'cdata()'
[18] Expr ::= OrExpr
[19] OrExpr ::= AndExpr | OrExpr 'or' AndExpr
[20] AndExpr ::= GeneralComp | AndExpr 'and' GeneralComp
[21] GeneralComp ::= AdditiveExpr | GeneralComp CompOp AdditiveExpr
[22] AdditiveExpr ::= MultiplicativeExpr | AdditiveExpr ('+' | '-') MultiplicativeExpr
[23] MultiplicativeExpr ::= UnaryExpr | MultiplicativeExpr ('*' | 'div' | 'mod') UnaryExpr
[24] UnaryExpr ::= ('-' | '+')? BasicExpr
[25] BasicExpr ::= DataAccessor | ParenthesizedExpr | Literal | '.'
[26] ParenthesizedExpr ::= '(' ExprSequence? ')'
[27] ExprSequence ::= Expr (',' Expr)*
[28] Literal ::= NumericLiteral | StringLiteral
[29] DataAccessor ::= NodeAccessor | NodeAccessor '/' AttributeNameTest | AttributeNameTest
[30] NodeAccessor ::= PathAccessor | Variable | FunctionCall
[31] FunctionCall ::= QName '(' ExprSequence? ')'
[32] PathAccessor ::= ('/' | '//')? RelativeAccessor
[33] RelativeAccessor ::= RelativeAccessor ('/' | '//') AccessorStep | AccessorStep
[34] AccessorStep ::= NodeNameTest | KindTest | '..'
[35] CompOp ::= '=' | '!=' | '<' | '<=' | '>' | '>='
[36] NumericLiteral ::= IntegerLiteral | DecimalLiteral | DoubleLiteral
[37] IntegerLiteral ::= Digits
[38] DecimalLiteral ::= ('.' Digits) | (Digits '.' [0-9]*)
[39] DoubleLiteral ::= (('.' Digits) | (Digits ('.' [0-9]*)?)) ([e] | [E]) ([+] | [-])? Digits
[40] StringLiteral ::= (["][^"]*["]) | (['][^']*['])
[41] Variable ::= '$' QName
[42] Digits ::= [0-9]+
In addition, the following non-terminals are defined in [XML Names]:
The following people have contributed to this specification by sending valuable comments to the stx@gingerall.cz mailing list:
Barrie Slaymaker
Miguel Branco
Gunnlaugur Thor Briem
Eric van der Vlist
Richard R. McKinley
Jan Poslušný
Kip Hampton
Robert Koberg
Michael Brennan
Tolja Zubow
Ijon Tichy
2003-01-21 : PC : Editorial changes. Tree fragments removed.
2003-02-04 : PC : Added Attribute Value Templates section. Expressions section split to Expressions and STXPath. Treatment of NCNames in expressions/patterns defined. Added Whitespace stripping section.
2003-02-05 : OB : Grammar update: removed axes completely, removed text() as node property, moved '.' to BasicExpr. Revised "Data Accessors" section. Changed contents of stx:message to template. Added detailed description for calling procedures. Slightly changed whitespace stripping rules (use xml:space). Renamed stx:replace to stx:process-text and stx:pattern's attribute value to regexp.
2003-02-10 : PC : Added local buffers. Added content to stx:buffer.
2003-02-21 : PC : Removed namespace nodes (remaining references). References split to normative and other. Variables and parameters default to the empty strings (instead of the empty sequences).
2003-03-10 : PC : Changed Function section - QNames, references to XPath2 Functions and Operators.
2003-03-15 : OB : Added encoding attribute to stx:result-document. Added stx:while. Clarified some aspects for the processing of buffers.
2003-03-18 : OB : Altered the note for stx:process-self.
2003-03-19 : PC : Fixes in Functions, added sf:current(). Changed definition of sf:position(). Added clarification to stx:for-each.
2003-04-01 : PC : Options moved to stx:transform/stx:group.
2003-04-24 : PC : Three precedence categories identified via visibility and public attributes. Minor stylistic changes related to URIs and resolvers. stx:for-each changed to stx:for-each-item. Removed sf:current(). Changes in '.' and sf:position().
2003-04-29 : OB : Changed including mechanism: the included stx:transform now becomes an stx:group. The public attribute of top-level templates defaults to "yes". Fixed string to boolean conversion.
2003-05-02 : PC : New functions: string-join, normalize-unicode, upper-case, lower-case, string-pad, escape-uri, lang, index-of, exists, insert, remove. Added brief characterizations of functions defined in FaO. Stylistic changes (Chapters 3 and 4).