6.2. The Simplest Possible Patterns
In their simplest form, patterns may be used as enumerations applied to the
lexical space rather than to the value space. If, for instance, we have a byte
value that can only take the values "1", "5", or "15", the classical way to
define such a datatype is to use the xs:enumeration facet:
<xs:simpleType name="myByte">
  <xs:restriction base="xs:byte">
    <xs:enumeration value="1"/>
    <xs:enumeration value="5"/>
    <xs:enumeration value="15"/>
  </xs:restriction>
</xs:simpleType>
This is the "normal" way of defining this datatype when both the lexical space
and the value space of xs:byte suit our needs. Because enumerations are compared
in the value space, this definition accepts instance documents with values such
as "1", "5", and "15", but also "01" or "0000005". One of the particularities of
xs:pattern is that it is the only facet that constrains the lexical space rather
than the value space. If we have an application that is disturbed by leading
zeros, we can use patterns instead of enumerations to define our datatype:
<xs:simpleType name="myByte">
  <xs:restriction base="xs:byte">
    <xs:pattern value="1"/>
    <xs:pattern value="5"/>
    <xs:pattern value="15"/>
  </xs:restriction>
</xs:simpleType>
This new datatype is still derived from xs:byte and has the semantics of a byte,
but its lexical space is now constrained to accept only "1", "5", and "15",
leaving out any variation that has the same value but a different lexical
representation.
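The difference between the two definitions can be illustrated with a small instance fragment. This is only a sketch: the element names "values" and "value" are hypothetical, and we assume that "value" has been declared with the type myByte.

```xml
<!-- Hypothetical instance fragment: assumes an element "value" of type myByte -->
<values>
  <value>15</value>   <!-- accepted by both the enumeration and the pattern definitions -->
  <value>015</value>  <!-- accepted with xs:enumeration (value 15), rejected with xs:pattern -->
  <value>+5</value>   <!-- accepted with xs:enumeration (value 5), rejected with xs:pattern -->
  <value>7</value>    <!-- rejected by both: 7 is not one of the allowed values -->
</values>
```

The forms "015" and "+5" are valid lexical representations of an xs:byte, which is why only the pattern-based definition excludes them.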
TIP:
The fact that a pattern must match the entire lexical form is an important
difference from Perl regular expressions, on which W3C XML Schema patterns are
built. A Perl expression such as /15/ matches any string containing "15", while
the W3C XML Schema pattern matches only the string equal to "15". The Perl
expression equivalent to this pattern is thus /^15$/.
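Going the other way, a pattern that behaves roughly like the Perl expression /15/ has to allow for the surrounding characters explicitly. The sketch below relies on the metacharacters "." and "*", whose meaning is covered later in this chapter. The type name containsFifteen and the choice of xs:string as the base type are assumptions made for this illustration.

```xml
<!-- Hypothetical type: accepts any single-line string that contains "15" -->
<xs:simpleType name="containsFifteen">
  <xs:restriction base="xs:string">
    <!-- ".*" on each side lets any run of characters surround "15" -->
    <xs:pattern value=".*15.*"/>
  </xs:restriction>
</xs:simpleType>
```

With this type, "15", "a15b", and "2150" are all accepted. The equivalence with Perl is only approximate, since "." does not match a line break.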
The myByte example above has been carefully chosen to avoid using any of the
metacharacters used within patterns, which are: ".", "\", "?", "*", "+", "{",
"}", "(", ")", "[", and "]". The vertical bar "|", which separates alternatives
within a pattern, has to be treated in the same way. We will see the meaning of
these characters later in this chapter; for the moment, we just need to know
that each of these characters needs to be "escaped" by a leading "\" to be used
as a literal. For instance, to define a similar datatype for a decimal whose
lexical space is limited to "1" and "1.5", we write:
<xs:simpleType name="myDecimal">
  <xs:restriction base="xs:decimal">
    <xs:pattern value="1"/>
    <xs:pattern value="1\.5"/>
  </xs:restriction>
</xs:simpleType>
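To see why the escape matters, here is a sketch of the same datatype with the dot left unescaped. The name myLooseDecimal is made up for this illustration and is not part of the original example.

```xml
<!-- Hypothetical counter-example: the "." is NOT escaped -->
<xs:simpleType name="myLooseDecimal">
  <xs:restriction base="xs:decimal">
    <xs:pattern value="1"/>
    <!-- An unescaped "." matches any single character between "1" and "5" -->
    <xs:pattern value="1.5"/>
  </xs:restriction>
</xs:simpleType>
```

This version still accepts "1" and "1.5", but it also lets through lexical forms such as "105" or "115", which are valid decimals and match "1", any character, "5".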
A common source of errors is to escape "normal" characters, which should not be
escaped: we will see later that a leading "\" changes their meaning (for
instance, "\p" does not match the character "p" but introduces a Unicode
category escape such as "\p{P}", which matches any Unicode punctuation
character).
Copyright © 2002 O'Reilly & Associates. All rights reserved.