
6.5. Common Patterns

After this overview of the syntax used by patterns, let's look at some common patterns that you may use (or adapt) in your schemas, or simply study as examples.

6.5.1. String Datatypes

Regular expressions treat information in its textual form. This makes them an excellent mechanism for constraining strings.

6.5.1.3. URIs

We have seen that xs:anyURI doesn't "absolutize" relative URIs, so it may be wise to require absolute URIs, which are easier to process. Some applications may also want to limit the accepted URI schemes. Both constraints can easily be expressed by a set of patterns such as:

<xs:simpleType name="httpURI">
  <xs:restriction base="xs:anyURI">
    <xs:pattern value="http://.*"/>
  </xs:restriction>
</xs:simpleType>
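
When several schemes are acceptable, one possible approach is to give one pattern per scheme, since pattern facets listed in the same restriction are treated as alternatives. The following is only a sketch: the type name webURI and the choice of schemes are assumptions made for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type name; the list of schemes is an assumption. -->
  <xs:simpleType name="webURI">
    <xs:restriction base="xs:anyURI">
      <!-- A value is valid if it matches at least one of these patterns. -->
      <xs:pattern value="http://.*"/>
      <xs:pattern value="https://.*"/>
      <xs:pattern value="ftp://.*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

Because each pattern starts with a scheme, relative URIs such as "images/logo.png" are rejected as a side effect.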

6.5.2. Numeric and Float Types

While numeric types aren't strictly text, patterns can still be used appropriately to constrain their lexical form.

6.5.2.1. Leading zeros

Getting rid of leading zeros is quite simple but requires some precautions if we want to keep the optional sign and the number "0" itself. This can be done using patterns such as:

<xs:simpleType name="noLeadingZeros">
  <xs:restriction base="xs:integer">
    <xs:pattern value="[+-]?([1-9][0-9]*|0)"/>
  </xs:restriction>
</xs:simpleType>

Note that in this pattern, we chose to redefine all the lexical rules that apply to an integer. The pattern would yield the same lexical space whether it is applied to an xs:token or to an xs:integer. We could also have relied on our knowledge of the base datatype and written:

<xs:simpleType name="noLeadingZeros">
  <xs:restriction base="xs:integer">
    <xs:pattern value="[+-]?([^0+-].*|0)"/>
  </xs:restriction>
</xs:simpleType>

Relying on the base datatype in this manner can produce simpler patterns, but they can also be more difficult to interpret, since we have to combine the lexical rules of the base datatype with the rules expressed by the pattern to understand the result. Here, for instance, the negated character class must exclude the signs as well as "0": since the leading sign is optional, a bare [^0] could itself match the sign and would let "+012" through.
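
To make the effect of the explicit pattern [+-]?([1-9][0-9]*|0) more concrete, the sketch below wraps it in a standalone schema and lists, in comments, a few values it should accept or reject. The element name quantity is hypothetical and only serves to show the type in use.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Same type as above, repeated so that the schema stands alone. -->
  <xs:simpleType name="noLeadingZeros">
    <xs:restriction base="xs:integer">
      <xs:pattern value="[+-]?([1-9][0-9]*|0)"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element using the type.
       Accepted: 0, 7, +7, -12, 1000
       Rejected: 007, 00, +01, -012 -->
  <xs:element name="quantity" type="noLeadingZeros"/>

</xs:schema>
```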

6.5.3. Datetimes

Dates and times have complex lexical representations. Patterns can give developers extra control over how they are used.

6.5.3.1. Time zones

The time zone support of W3C XML Schema is quite controversial, and additional constraints are needed to avoid comparison problems. These patterns can be kept relatively simple: the syntax of the datetime is already checked by the schema validator, so only simple additional checks need to be added. Applications that require their datetimes to specify a time zone may use the following type, which checks that the time part ends with a "Z" or contains a sign:

<xs:simpleType name="dateTimeWithTimezone">
  <xs:restriction base="xs:dateTime">
    <xs:pattern value=".+T.+(Z|[+-].+)"/>
  </xs:restriction>
</xs:simpleType>

Still simpler, applications that want to make sure that none of their datetimes specify a time zone may just check that the time part doesn't contain the characters "+", "-", or "Z":

<xs:simpleType name="dateTimeWithoutTimezone">
  <xs:restriction base="xs:dateTime">
    <xs:pattern value=".+T[^Z+-]+"/>
  </xs:restriction>
</xs:simpleType>
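
As an illustration of how these two types might be used side by side, here is a small self-contained sketch. The element names (event, start, localOpening) are hypothetical; the comments list sample values that the patterns should accept or reject.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Same two types as above, repeated so that the schema stands alone. -->
  <xs:simpleType name="dateTimeWithTimezone">
    <xs:restriction base="xs:dateTime">
      <xs:pattern value=".+T.+(Z|[+-].+)"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="dateTimeWithoutTimezone">
    <xs:restriction base="xs:dateTime">
      <xs:pattern value=".+T[^Z+-]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element mixing both kinds of datetime. -->
  <xs:element name="event">
    <xs:complexType>
      <xs:sequence>
        <!-- Accepted: 2002-10-10T12:00:00Z, 2002-10-10T12:00:00+02:00
             Rejected: 2002-10-10T12:00:00 -->
        <xs:element name="start" type="dateTimeWithTimezone"/>
        <!-- Accepted: 2002-10-10T09:00:00
             Rejected: 2002-10-10T09:00:00Z, 2002-10-10T09:00:00-05:00 -->
        <xs:element name="localOpening" type="dateTimeWithoutTimezone"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```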

In these two datatypes, we used the separator "T". This is convenient, since after this delimiter a sign can appear only in the time zone. The delimiter is missing if we want to constrain dates instead of datetimes, but in that case we can detect time zones by their ":" (or their "Z") instead:

<xs:simpleType name="dateWithTimezone">
  <xs:restriction base="xs:date">
    <xs:pattern value=".+[:Z].*"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="dateWithoutTimezone">
  <xs:restriction base="xs:date">
    <xs:pattern value="[^:Z]*"/>
  </xs:restriction>
</xs:simpleType>

Applications may also simply impose a set of time zones to use:

<xs:simpleType name="dateTimeInMyTimezones">
  <xs:restriction base="xs:dateTime">
    <xs:pattern value=".+\+02:00"/>
    <xs:pattern value=".+\+01:00"/>
    <xs:pattern value=".+\+00:00"/>
    <xs:pattern value=".+Z"/>
    <xs:pattern value=".+-04:00"/>
  </xs:restriction>
</xs:simpleType>
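
Since the five patterns differ only in their suffix, the same constraint can presumably be written as a single pattern with an alternation. The following is a sketch of an equivalent formulation; the type name dateTimeInMyTimezonesCompact is hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical compact variant of dateTimeInMyTimezones. -->
  <xs:simpleType name="dateTimeInMyTimezonesCompact">
    <xs:restriction base="xs:dateTime">
      <!-- Z, or +00:00 to +02:00 in whole hours, or -04:00 -->
      <xs:pattern value=".+(Z|\+0[0-2]:00|-04:00)"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

Which form to prefer is largely a matter of taste: one pattern per time zone is longer, but easier to read and to extend.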

We promised earlier to look at xs:duration and see how we can define two datatypes that have a complete sort order. The first datatype consists of durations expressed only in years and months, and the second of durations expressed only in days, hours, minutes, and seconds. The test for the first type can be based on the presence of a "D" (for day) or a "T" (the time delimiter): if neither of those characters is detected, the value uses only year and month parts. The test for the second type cannot be based on the absence of "Y" and "M", since there is also an "M" in the time part. Instead, we can test that, after an optional minus sign, the first field is either the day part or the "T" delimiter:

<xs:simpleType name="YMduration">
  <xs:restriction base="xs:duration">
    <xs:pattern value="[^TD]+"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="DHMSduration">
  <xs:restriction base="xs:duration">
    <xs:pattern value="-?P((\d+D)|T).*"/>
  </xs:restriction>
</xs:simpleType>
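
The sketch below shows how the two duration types might be put to use, with sample values listed in comments. The element names subscriptionLength and timeout are hypothetical and only serve as illustrations.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Same two types as above, repeated so that the schema stands alone. -->
  <xs:simpleType name="YMduration">
    <xs:restriction base="xs:duration">
      <xs:pattern value="[^TD]+"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="DHMSduration">
    <xs:restriction base="xs:duration">
      <xs:pattern value="-?P((\d+D)|T).*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element: years and months only.
       Accepted: P1Y, P1Y6M, -P18M
       Rejected: P1D, PT12H, P1Y2M3D -->
  <xs:element name="subscriptionLength" type="YMduration"/>

  <!-- Hypothetical element: days, hours, minutes, and seconds only.
       Accepted: P3D, P1DT12H, PT90M, -PT5S
       Rejected: P1Y, P6M, P1Y3D -->
  <xs:element name="timeout" type="DHMSduration"/>

</xs:schema>
```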



Copyright © 2002 O'Reilly & Associates. All rights reserved.