<xsl:template match="person">
Person <xsl:value-of select="position( )"/>,
<xsl:value-of select="name"/>
</xsl:template>
Each XPath function returns one of these four types:
- Boolean
- Number
- Node-set
- String
There are no void functions in XPath. However, XPath is not nearly
as strongly typed as languages like Java or even C. You can often use
any of these types as a function argument regardless of which type
the function expects, and the processor will convert it as best it
can. For example, if you insert a Boolean where a string is expected,
then the processor will substitute one of the two strings "true" and
"false" for the Boolean. The one
exception is functions that expect to receive node-sets as arguments.
XPath cannot convert strings, Booleans, or numbers to node-sets.
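As a minimal sketch of these implicit conversions, the following
stylesheet passes Booleans and numbers to functions that expect
strings. It can be applied to any input document. The true( )
function is XPath's Boolean constant; string-length( ) and
starts-with( ) are string functions described later in this chapter.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- string-length() expects a string, so the Boolean true
         is first converted to the string "true": prints 4 -->
    <xsl:value-of select="string-length(true())"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Both numbers are converted to strings ("12345" and "12")
         before the comparison is made: prints true -->
    <xsl:value-of select="starts-with(12345, 12)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The exception: count() requires a node-set, and a string
         cannot be converted to one, so this line would be an error:
         <xsl:value-of select="count('person')"/>
    -->
  </xsl:template>
</xsl:stylesheet>
```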
In addition to the functions defined in XPath and discussed in this
chapter, most uses of XPath, such as XSLT and XPointer, define many
more functions that are useful in their particular context. You use
these extra functions just like the built-in functions when
you're using those applications. XSLT even lets you
write extension functions in Java and other languages that can do
almost anything, for example, making SQL queries against a remote
database server and returning the result of the query as a node-set.
9.7.2. String Functions
XPath includes functions for basic
string operations such as finding the length of a string or changing
letters from upper- to lowercase. It doesn't have
the full power of the string libraries in Python or Perl--for
instance, there's no regular expression
support--but it's sufficient for many simple
manipulations you need for XSLT or XPointer.
The string( ) function converts an argument of any
type to a string in a reasonable fashion. Booleans are converted to
the string "true" or the string "false". Node-sets are converted to
the string value of the first node in the set, in document order; an
empty node-set becomes the empty string. This is the same value
calculated by the xsl:value-of element. That is, the string value of
an element is the complete text of the element after all entity
references are resolved and tags, comments, and processing
instructions have been stripped out. Numbers are converted to strings
in the format used by most programming languages, such as "1987",
"299792500", or "2.71828".
TIP:
In XSLT, the xsl:decimal-format element and the format-number( )
function provide more precise control over number formatting, so you
can insert separators between groups, change the decimal separator,
use non-European digits, and make similar adjustments.
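The conversions performed by string( ) can be sketched with a short
stylesheet. The input document shown in the comment is hypothetical
and is not one of this chapter's examples; the b element inside the
first name is there only to show that tags are stripped.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input document:
       <people>
         <person><name>Alan <b>Turing</b></name></person>
         <person><name>Richard Feynman</name></person>
       </people>
  -->
  <xsl:template match="/">
    <!-- Boolean to string: the comparison is true, so this is "true" -->
    <xsl:value-of select="string(1 = 1)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Number to string: "2.71828" -->
    <xsl:value-of select="string(2.71828)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Node-set to string: //name selects two elements, but only the
         first in document order is used, with its tags stripped:
         "Alan Turing" -->
    <xsl:value-of select="string(//name)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because xsl:value-of already converts its result to a string, the
explicit string( ) calls here are only for illustration.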
The normal use of most of the rest of the string
functions is to manipulate or address the text content of XML
elements or attributes. For instance, if date
attributes were given in the format MM/DD/YYYY,
then the string functions would allow you to target the month, day,
and year separately.
The starts-with( ) function takes two string arguments. It returns
true if the first argument starts with the second argument. For
example, starts-with('Richard', 'Ric') is true but
starts-with('Richard', 'Rick') is false. There is no corresponding
ends-with( ) function.
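One way to approximate the missing ends-with( ) test is to extract
the tail of the string and compare it with the expected suffix. This
is a sketch rather than a built-in facility: it relies on the
substring( ) and string-length( ) functions described later in this
section, and the variable names are hypothetical.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Illustrative values; the variable names are hypothetical -->
    <xsl:variable name="text" select="'Richard'"/>
    <xsl:variable name="suffix" select="'ard'"/>

    <!-- Take the last string-length($suffix) characters of $text
         (here, from position 7 - 3 + 1 = 5 onward, which is "ard")
         and compare them with $suffix: prints true -->
    <xsl:value-of select="substring($text,
        string-length($text) - string-length($suffix) + 1) = $suffix"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

If the suffix is longer than the text, the start position is zero or
negative, substring( ) returns the whole text, and the comparison is
false, as you would expect.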
The contains( ) function also takes two string arguments. However,
it returns true if the first argument contains the second argument
(that is, if the second argument is a substring of the first
argument), regardless of position. For example,
contains('Richard', 'ar') is true but
contains('Richard', 'art') is false.
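These tests are most often used inside predicates. The following
sketch selects person elements by name; the input document in the
comment is hypothetical. Both functions are case-sensitive.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input document:
       <people>
         <person><name>Richard</name></person>
         <person><name>Alan</name></person>
         <person><name>Charles</name></person>
       </people>
  -->
  <xsl:template match="/">
    <!-- Selects only Richard: the name must begin with "Ric" -->
    <xsl:for-each select="//person[starts-with(name, 'Ric')]">
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- Selects Richard and Charles: "ar" may appear anywhere -->
    <xsl:for-each select="//person[contains(name, 'ar')]">
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```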
The substring-before( ) function takes two string arguments and
returns the substring of the first argument that precedes the initial
appearance of the second argument. If the second string doesn't
appear in the first string, then substring-before( ) returns the
empty string. For example, substring-before('MM/DD/YYYY', '/') is
MM. The substring-after( ) function also takes two string arguments
but returns the substring of the first argument that follows the
initial appearance of the second argument. If the second string
doesn't appear in the first string, then substring-after( ) returns
the empty string. For example, substring-after('MM/DD/YYYY', '/') is
DD/YYYY. The two functions can be nested:
substring-before(substring-after('MM/DD/YYYY', '/'), '/') is DD, and
substring-after(substring-after('MM/DD/YYYY', '/'), '/') is YYYY.
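Applied to an actual date, the same calls can rearrange the parts.
The following sketch rewrites a MM/DD/YYYY value as YYYY-MM-DD; the
date value and the variable names $date and $rest are illustrative
assumptions.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Illustrative value in MM/DD/YYYY form -->
    <xsl:variable name="date" select="'12/31/1999'"/>
    <!-- Everything after the first slash: "31/1999" -->
    <xsl:variable name="rest" select="substring-after($date, '/')"/>

    <!-- Year: the text after the second slash, "1999" -->
    <xsl:value-of select="substring-after($rest, '/')"/>
    <xsl:text>-</xsl:text>
    <!-- Month: the text before the first slash, "12" -->
    <xsl:value-of select="substring-before($date, '/')"/>
    <xsl:text>-</xsl:text>
    <!-- Day: the text between the two slashes, "31" -->
    <xsl:value-of select="substring-before($rest, '/')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The output is 1999-12-31. Storing the intermediate result in $rest
avoids repeating the inner substring-after( ) call.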
If you know the position of the substring you want, then you can use
the substring( ) function instead. This takes three arguments: the
string from which the substring will be copied, the position in the
string from which to start extracting, and the number of characters
to copy to the substring. Positions are counted from 1, not 0. The
third argument may be omitted, in which case the substring contains
all characters from the specified start position to the end of the
string. For example, substring('MM/DD/YYYY', 1, 2) is MM;
substring('MM/DD/YYYY', 4, 2) is DD; and substring('MM/DD/YYYY', 7)
is YYYY.
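A sketch of the positional approach on a concrete date follows. The
date value and the variable name $date are illustrative assumptions.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Illustrative value in zero-padded MM/DD/YYYY form -->
    <xsl:variable name="date" select="'12/31/1999'"/>

    <!-- Characters 1 and 2: "12" -->
    <xsl:value-of select="substring($date, 1, 2)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Characters 4 and 5: "31" -->
    <xsl:value-of select="substring($date, 4, 2)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Character 7 through the end of the string: "1999" -->
    <xsl:value-of select="substring($date, 7)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This works only when every field has a fixed width. With an unpadded
value such as '1/5/2001', substring($date, 1, 2) would be 1/ rather
than 1, so substring-before( ) and substring-after( ) are the safer
choice when the widths can vary.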
The string-length( ) function returns a number giving the length of
its argument's string value, or of the context node's string value if
no argument is included. In Example 9-1,
string-length(//name[position( )=1]) is 29. If that seems long to
you, remember that all whitespace characters are included in the
count. If it seems short to you, remember that markup characters are
not included in the count.
Theoretically, you could use these functions to trim and
normalize whitespace in element content. However, since this would be
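relatively complex and is such a common need, XPath provides the
normalize-space( ) function to do this. It strips leading and
trailing whitespace and collapses each run of internal whitespace to
a single space. For instance, in Example 9-1 the value of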
string(//name[position( )=1]) is:
Alan
Turing
This contains a lot of extra whitespace that was inserted purely to
make the XML document neater. However,
normalize-space(string(//name[position( )=1])) is
the much more reasonable:
Alan Turing
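The effect on the length of a string can be sketched as follows. The
value of the hypothetical variable $raw is an assumption that stands
in for indented element content, not the actual text of Example 9-1;
each &#10; is a line break.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Line break, two spaces, "Alan", line break, two spaces,
         "Turing", line break -->
    <xsl:variable name="raw"
                  select="'&#10;  Alan&#10;  Turing&#10;'"/>

    <!-- Every whitespace character counts: 1+2+4+1+2+6+1 = 17 -->
    <xsl:value-of select="string-length($raw)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- After normalization only "Alan Turing" remains: 11 -->
    <xsl:value-of select="string-length(normalize-space($raw))"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The normalized string itself -->
    <xsl:value-of select="normalize-space($raw)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```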
Although a more powerful string-manipulation library would be useful,
XSLT is really designed for transforming the element structure of an
XML document. It's not meant to have the more
general power of a language like Perl, which can handle arbitrarily
complicated and varying string formats.