The most efficient grouping technique we used earlier combined the XSLT key() and generate-id() functions. We create a key that indexes the nodes we want to group (in this case, the <address> elements, keyed by the value of their <state> child), then compare each address we find to the first node returned by the key() function. Here's how we would like to define the key:
<xsl:key name="states"
match="document(/report/po/@filename)/purchase-order/customer/address"
use="state"/>
Unfortunately, the match attribute of the <xsl:key> element must be a pattern, and a pattern can't call the document() function; the only functions a pattern may begin with are id() and key(). Maybe we could try creating a variable that contains all the nodes we want to use, then use that node-set to create the key:
<xsl:variable name="addresses"
select="document(/report/po/@filename)/purchase-order/customer/address"/>
<xsl:key name="states" match="$addresses" use="state"/>
This doesn't work either; you can't use a variable in the match attribute of <xsl:key>. Our hopes for a quick solution to this problem are fading. To make matters worse, the XPath axes won't help, either. Using the preceding:: axis to see whether a previous purchase order came from the current state also doesn't work. Consider this example:
<xsl:if test="not(preceding::address[state=$state])">
When we were working with a single document, the preceding:: axis gave us useful information. Because the nodes we're working with are now in separate documents, the various axes defined in XPath won't help. When we ask for nodes on the preceding:: axis, we get nodes from the current document only. We're going to have to roll up our sleeves and do this the hard way.
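The same per-document limit applies to keys, which is why a legal key definition would not rescue us either. The sketch below is only an illustration, not part of the solution developed here: it reuses the key name states from the attempt above, replaces the match pattern with the plain element name address, and looks up the literal value 'MA' purely as an example.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- A legal key: the pattern names only the element, with no
       document() call and no variable reference -->
  <xsl:key name="states" match="address" use="state"/>

  <xsl:template match="/">
    <!-- Visit the root node of each purchase order in turn -->
    <xsl:for-each select="document(/report/po/@filename)">
      <!-- The context node is now inside one purchase order, so
           key() returns nodes from that document only -->
      <xsl:value-of select="count(key('states', 'MA'))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each call to key() reports only the matching addresses in the purchase order that contains the context node. There is no "first address for this state across all documents" to compare against, so the key()/generate-id() technique has nothing to work with.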
Now that we're resigned to grouping nodes with brute force, we'll try to make the process as efficient as possible. For performance reasons, we want to avoid having to call the document() function any more than we have to. This won't be pretty, but here's our approach:
- Use the document() function to retrieve the values of all of the <state> elements. To keep things simple, we'll write these values out to a string, separating them with spaces. We'll also use the <xsl:sort> element to sort the <state> elements; that will save us some time later.
- Take our string of sorted, space-separated state names (to be precise, they're the values of all the <state> elements) and remove the duplicates. Because things are sorted, I only have to compare two adjacent values. We'll use recursion to handle this.
- For each item in our string of sorted, space-separated, unique state names, use the document() function to see which purchase orders match the current state.
This certainly isn't efficient; for each unique state, we'll have to call the document() function once for every filename attribute. In other words, if we had 500 purchase orders from 50 unique states, we would have to open each of those 500 documents 51 times (once to collect the <state> values, then once for each of the 50 states), invoking the document() function 25,500 times (500 × 51). It's not pretty, but it works.
Retrieving the values of all <state> elements is relatively straightforward. We'll use the technique of creating a variable whose value contains output from an <xsl:for-each> element:
<xsl:variable name="list-of-states">
  <xsl:for-each
    select="document(/report/po/@filename)/purchase-order/customer/address/state">
    <xsl:sort select="document('')/*/states:name[@abbrev=current()]"/>
    <xsl:value-of select="."/><xsl:text> </xsl:text>
  </xsl:for-each>
</xsl:variable>
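This code produces the string "ME MA MA WI" for our current set of purchase orders. The sort key is the content of the matching states:name element in the stylesheet (the full state name), which is why ME (Maine) comes before MA (Massachusetts). Our next step removes any duplicate values from the list. We'll do this with recursion, using the following algorithm: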
- Call our recursive template with two arguments: the list of states and the name of the last state we found. The first time we invoke this template, the name of the last state will be blank.
- Break the list of states into two parts: the first state in the list, followed by the remaining states in the list.
- If the list of states is empty, exit.
  If the first state in the list is different from the last state we found, output the first state and invoke the template on the remaining states in the list.
  If the first state in the list is the same as the last state we found, simply invoke the template on the remaining states in the list.
Again, we use our technique of calling this template inside an <xsl:variable> element to save the list of unique states for later. Here is the <xsl:variable> element, along with the recursive template that removes duplicate state names from the string:
<xsl:variable name="list-of-unique-states">
  <xsl:call-template name="remove-duplicates">
    <xsl:with-param name="list-of-states" select="$list-of-states"/>
    <xsl:with-param name="last-state" select="''"/>
  </xsl:call-template>
</xsl:variable>

<xsl:template name="remove-duplicates">
  <xsl:param name="list-of-states"/>
  <xsl:param name="last-state" select="''"/>
  <!-- Split the list at the first space -->
  <xsl:variable name="next-state"
    select="substring-before($list-of-states, ' ')"/>
  <xsl:variable name="remaining-states"
    select="substring-after($list-of-states, ' ')"/>
  <xsl:choose>
    <xsl:when test="not(string-length(normalize-space($list-of-states)))">
      <!-- If the list of states is empty, do nothing -->
    </xsl:when>
    <xsl:when test="not($last-state=$next-state)">
      <!-- A new state: output it, then process the rest of the list -->
      <xsl:value-of select="$next-state"/>
      <xsl:text> </xsl:text>
      <xsl:call-template name="remove-duplicates">
        <xsl:with-param name="list-of-states" select="$remaining-states"/>
        <xsl:with-param name="last-state" select="$next-state"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- A duplicate of the last state: skip it and process the rest -->
      <xsl:call-template name="remove-duplicates">
        <xsl:with-param name="list-of-states" select="$remaining-states"/>
        <xsl:with-param name="last-state" select="$next-state"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
At this point, we have a variable named list-of-unique-states that contains the value ME MA WI. Now all we have to do is get each value and output all the purchase orders from each state. We'll use recursion yet again to make this happen. We'll pass our list of unique states to our recursive template, which does the following:
- Breaks the string into two parts: the first state in the list and the remaining states.
- Outputs a heading for the first state in the list.
- Invokes the document() function against each purchase order. If a given purchase order is from the first state in the list, uses <xsl:apply-templates> to transform it.
- Invokes the template again for the remaining states. If no states remain (the value of normalize-space($remaining-states) is an empty string), we're done.
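The steps above can be sketched as a named template. This is a minimal sketch under several assumptions rather than a definitive implementation: the template name process-states is hypothetical, the plain-text heading is a placeholder for whatever markup the report actually uses, and the template is assumed to be called from the template that matches the root of the report document, so that /report/po/@filename resolves against that document. It also assumes the list it receives is not empty.

```xml
<xsl:template name="process-states">
  <xsl:param name="list-of-states"/>
  <!-- Split the list into the first state and the remaining states -->
  <xsl:variable name="next-state"
    select="substring-before($list-of-states, ' ')"/>
  <xsl:variable name="remaining-states"
    select="substring-after($list-of-states, ' ')"/>

  <!-- Heading for the first state in the list (placeholder format) -->
  <xsl:text>Purchase orders from </xsl:text>
  <xsl:value-of select="$next-state"/>
  <xsl:text>&#10;</xsl:text>

  <!-- Open each purchase order; transform only those from this state -->
  <xsl:for-each select="/report/po/@filename">
    <xsl:apply-templates
      select="document(.)/purchase-order[customer/address/state=$next-state]"/>
  </xsl:for-each>

  <!-- Recurse only while states remain in the list -->
  <xsl:if test="string-length(normalize-space($remaining-states))">
    <xsl:call-template name="process-states">
      <xsl:with-param name="list-of-states" select="$remaining-states"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>

<!-- Invoked with the list built earlier -->
<xsl:call-template name="process-states">
  <xsl:with-param name="list-of-states" select="$list-of-unique-states"/>
</xsl:call-template>
```

Because <xsl:call-template> does not change the context node, every recursive call still sees the report document when it evaluates /report/po/@filename. A variant, which is not the approach described here, would evaluate document() once in a top-level variable such as orders (a hypothetical name) with select="document(/report/po/@filename)/purchase-order" and then apply templates to $orders[customer/address/state=$next-state]. Unlike the match attribute of <xsl:key>, a select expression may reference a variable.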