

XML in a Nutshell

11.7. Ranges

A range is the span of parsed character data between two points. It may or may not represent a well-formed chunk of XML. For example, a range can include an element's start-tag but not its end-tag. This makes ranges suitable for uses such as representing the text a user selected with the mouse. Ranges are created with four functions XPointer adds to XPath:

  • range( )

  • range-to( )

  • range-inside( )

  • string-range( )

11.7.1. The range( ) function

The range( ) function takes as an argument an XPath expression that returns a location set. For each location in this set, the range( ) function returns a range exactly covering that location; that is, the start-point of the range is the point immediately before the location, and the end-point of the range is the point immediately after the location. If the location is an element node, then the range begins right before the element's start-tag and finishes right after the element's end-tag. For example, consider this XPointer:

xpointer(range(//title))

When applied to Example 11-1, it selects a range exactly covering the single title element. If there were more than one title element in the document then it would return one range for each such title element. If there were no title elements in the document, then it wouldn't return any ranges.

Now consider this XPointer:

xpointer(range(/novel/*))

If applied to Example 11-1, it returns three ranges, one covering each of the three child elements of the novel root element.

11.7.2. The range-inside( ) function

The range-inside( ) function takes as an argument an XPath expression that returns a location set. For each location in this set, it returns a range exactly covering the contents of that location. For anything except an element node this will be the same as the range returned by range( ). For an element node, this range includes everything inside the element, but not the element's start-tag or end-tag. For example, when applied to Example 11-1, xpointer(range-inside(//title)) returns a range covering The Wonderful Wizard of Oz but not <title>The Wonderful Wizard of Oz</title>. For a comment, processing instruction, attribute, text, or namespace node, this range covers the string value of that node. For a range, this range is the range itself. For a point, this range begins and ends with that point.
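
The difference between range( ) and range-inside( ) can be illustrated with a rough Python sketch. It models points as character offsets into a serialized document, which is a simplification of the XPointer point model, and the DOC string is only an assumed stand-in for Example 11-1. The helper names element_range and element_range_inside are hypothetical and handle only attribute-free tags.

```python
# Assumed stand-in for Example 11-1; the real listing may differ.
DOC = (
    "<novel>"
    "<title>The Wonderful Wizard of Oz</title>"
    "<author>L. Frank Baum</author>"
    "<year>1900</year>"
    "</novel>"
)

def element_range(doc, name):
    """Offsets covering the first <name> element, tags included (like range( ))."""
    start = doc.index(f"<{name}>")
    end_tag = f"</{name}>"
    end = doc.index(end_tag, start) + len(end_tag)
    return start, end

def element_range_inside(doc, name):
    """Offsets covering only the element's content (like range-inside( ))."""
    start_tag = f"<{name}>"
    start = doc.index(start_tag) + len(start_tag)
    end = doc.index(f"</{name}>", start)
    return start, end

outer = element_range(DOC, "title")
inner = element_range_inside(DOC, "title")
print(DOC[outer[0]:outer[1]])  # <title>The Wonderful Wizard of Oz</title>
print(DOC[inner[0]:inner[1]])  # The Wonderful Wizard of Oz
```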

11.7.3. The range-to( ) function

The range-to( ) function is evaluated with respect to a context location. Its argument is an expression that should return exactly one location when evaluated against that context. For each context location, the function returns one range that begins at the start-point of the context location and ends at the end-point of the location the argument identifies. If the context contains multiple locations, then range-to( ) returns multiple ranges.

TIP: This function is underspecified in the XPointer candidate recommendation. In particular, it is not clear what should happen if the argument contains more or fewer than one location.

For instance, suppose you want to produce a single range that covers everything between <title> and </year> in Example 11-1. This XPointer does that by starting with the start-point of the title element and continuing to the end-point of the year element:

xpointer(//title/range-to(../year))

Ranges do not necessarily have to cover well-formed fragments of XML. For instance, the start-tag of an element can be included but the end-tag left out. This XPointer selects <title>The Wonderful Wizard of Oz:

xpointer(//title/range-to(text( )))

It starts at the start-point of the title element, but it finishes at the end-point of the title element's text node child, thereby omitting the end-tag.
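
As a rough illustration of how a range-to( ) range is assembled from two points, the following Python sketch again treats points as character offsets in a serialized document. The DOC string is an assumed stand-in for Example 11-1, and start_point_of and end_point_of are hypothetical helpers, not part of any XPointer library.

```python
# Assumed stand-in for Example 11-1; the real listing may differ.
DOC = (
    "<novel>"
    "<title>The Wonderful Wizard of Oz</title>"
    "<author>L. Frank Baum</author>"
    "<year>1900</year>"
    "</novel>"
)

def start_point_of(doc, name):
    """Offset immediately before the start-tag of the first <name> element."""
    return doc.index(f"<{name}>")

def end_point_of(doc, name):
    """Offset immediately after the end-tag of the first <name> element."""
    end_tag = f"</{name}>"
    return doc.index(end_tag) + len(end_tag)

# From the start of title to the end of year: a well-formed span.
whole = DOC[start_point_of(DOC, "title"):end_point_of(DOC, "year")]

# From the start of title to the end of its text child: the end-tag is left out.
text_end = DOC.index("</title>")
partial = DOC[start_point_of(DOC, "title"):text_end]

print(whole)
print(partial)  # <title>The Wonderful Wizard of Oz
```

The second slice is not well-formed XML on its own, which is exactly the kind of span a range is allowed to cover.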

11.7.4. The string-range( ) function

The string-range( ) function is unusual. Rather than operating on a location set including various tags, comments, processing instructions, and so forth, it operates on the text of a document after all markup has been stripped from it. Tags play no part in the match; only the character data is searched.

The string-range( ) function takes as arguments an XPath expression identifying locations and a substring to try to match against the XPath string value of each of those locations. It returns one range for each match, exactly covering the matched string. Matches are case sensitive. For example, this XPointer produces ranges for all occurrences of the word "Wizard" in title elements in the document:

xpointer(string-range(//title, "Wizard"))

If there are multiple matches, then multiple ranges are returned. For example, this XPointer returns two ranges when applied to Example 11-1, one covering the W in "Wonderful" and one covering the W in "Wizard":

xpointer(string-range(//title, "W"))

TIP: This function is also underspecified in the XPointer candidate recommendation. In particular, it is not clear what happens when there are overlapping matches.
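
The matching behavior can be sketched in Python by searching a string value for every case-sensitive occurrence of the substring. The function name string_range_matches is hypothetical, and because the specification leaves overlapping matches open, this sketch simply assumes that the search resumes after each match.

```python
def string_range_matches(text, needle):
    """Return (start, end) offsets of every case-sensitive match of needle."""
    if not needle:
        raise ValueError("this sketch does not handle an empty search string")
    ranges = []
    pos = text.find(needle)
    while pos != -1:
        ranges.append((pos, pos + len(needle)))
        # Assumption: resume after the match, so overlapping matches are skipped.
        pos = text.find(needle, pos + len(needle))
    return ranges

title = "The Wonderful Wizard of Oz"  # string value of the title element
print(string_range_matches(title, "Wizard"))  # [(14, 20)]
print(string_range_matches(title, "W"))       # [(4, 5), (14, 15)]
print(string_range_matches(title, "wizard"))  # [] because matching is case sensitive
```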

You can also pass an offset and a length to the function so that each range starts a given number of characters from the beginning of the match and continues for a given number of characters. Position 1 is the point immediately before the first character of the matched string. For example, this XPointer selects the four characters immediately after the word "Wizard" in title elements:

xpointer(string-range(//title, "Wizard", 7, 4))

Nonpositive offsets count backwards in the document from the beginning of the match. For example, this XPointer selects the four characters immediately before the word "Wizard" in title elements:

xpointer(string-range(//title, "Wizard", -3, 4))

If the offset or length causes the range to fall outside the document, then no range is returned.
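
The offset arithmetic is easy to get wrong by one, so here is a small Python sketch of it. It assumes the text being searched is just the string value of the title element, so "outside the document" is approximated as "outside that string". The name offset_range is hypothetical.

```python
def offset_range(text, match_start, offset, length):
    """Return the substring selected by a 1-based offset and a length.

    match_start is the 0-based index where the match begins in text.
    Returns None when the requested range falls outside text.
    """
    # Offset 1 is the point just before the first character of the match.
    start = match_start + offset - 1
    end = start + length
    if length < 0 or start < 0 or end > len(text):
        return None
    return text[start:end]

title = "The Wonderful Wizard of Oz"
wizard = title.index("Wizard")  # 14

print(repr(offset_range(title, wizard, 7, 4)))   # ' of '
print(repr(offset_range(title, wizard, -3, 4)))  # 'ful '
print(offset_range(title, wizard, 7, 40))        # None
```

With the default offset of 1 and a length equal to the match length, the function returns the match itself.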

Since string ranges can begin and end at pretty much any character in the text content of a document, they're the way to indicate points that don't fall on node boundaries. Simply create a string range that either begins or ends at the position you want to point to, and then use start-point( ) or end-point( ) on that range. For example, this XPointer returns the point immediately before the word "Wizard" in the title element in Example 11-1:

xpointer(start-point(string-range(//title, "Wizard")))

11.7.7. origin( )

The origin( ) function is useful when the document has been loaded from an out-of-line link. It refers to the node from which the user is initiating traversal, even if that is not the node that defines the link. For example, consider an extended link like this one. It has many novel elements, each of which is a locator that shares the same label:

<series xlink:type="extended" xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- locator elements -->
  <novel xlink:type="locator" xlink:label="oz"
         xlink:href="ftp://archive.org/pub/etext/etext93/wizoz10.txt">
    <title>The Wonderful Wizard of Oz</title>
    <year>1900</year>
  </novel>
  <novel xlink:type="locator" xlink:label="oz"
         xlink:href="ftp://archive.org/pub/etext/etext93/ozland10.txt">
    <title>The Marvelous Land of Oz</title>
    <year>1904</year>
  </novel>
  <novel xlink:type="locator" xlink:label="oz"
         xlink:href="ftp://archive.org/pub/etext/etext93/wizoz10.txt">
    <title>Ozma of Oz</title>
    <year>1907</year>
  </novel>
  <!-- many more novel elements... -->

  <sequel xlink:type="locator" xlink:label="next"
        xlink:href="#xpointer(origin( )/following-sibling::novel[1])" />
  <next xlink:type="arc" xlink:from="oz" xlink:to="next" />

</series>

The sequel element uses an XPointer and the origin( ) function to define a locator that points to the following novel in the series. If the user is reading The Wonderful Wizard of Oz, then the sequel element locates The Marvelous Land of Oz. If the user is reading The Marvelous Land of Oz, then that same sequel element locates Ozma of Oz, and so on. The next element defines links from each novel (since they all share the label oz) to its sequel. The ending resource changes from one novel to the next.
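
To make the moving target concrete, this Python sketch emulates origin( )/following-sibling::novel[1] with the standard xml.etree.ElementTree module. ElementTree has no XPointer support and no following-sibling axis, so the hypothetical next_novel helper walks the parent's child list instead. The SERIES string is a simplified version of the link above with the XLink attributes removed.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the series element; XLink attributes omitted for brevity.
SERIES = """<series>
  <novel><title>The Wonderful Wizard of Oz</title><year>1900</year></novel>
  <novel><title>The Marvelous Land of Oz</title><year>1904</year></novel>
  <novel><title>Ozma of Oz</title><year>1907</year></novel>
</series>"""

def next_novel(series, origin):
    """Emulate origin( )/following-sibling::novel[1] for a child of series."""
    children = list(series)
    # Element objects compare by identity, so index() finds this exact child.
    for sibling in children[children.index(origin) + 1:]:
        if sibling.tag == "novel":
            return sibling
    return None

series = ET.fromstring(SERIES)
novels = series.findall("novel")
for origin in novels:
    sequel = next_novel(series, origin)
    target = sequel.findtext("title") if sequel is not None else None
    print(origin.findtext("title"), "->", target)
```

The same function yields a different result for each starting novel, just as the single sequel locator resolves to a different ending resource depending on where traversal begins.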




Copyright © 2002 O'Reilly & Associates. All rights reserved.