4.5. Date and Time Datatypes
The datatypes covered in this section are shown in Figure 4-4.
Figure 4-4. Date and time datatypes
4.5.1. The Realm of ISO 8601
The W3C Recommendation,
"XML Schema Part 2: Datatypes,"
provides new confirmation of how difficult it is to fix time.
The support for date and time datatypes relies entirely on a subset
of the ISO 8601 standard, which is the only format supported by W3C
XML Schema. The purpose of ISO 8601 is to eliminate the risk of
confusion between the various date and time formats used in different
countries. In other words, W3C XML Schema does not support these
local date and time formats, and mandates ISO 8601 for
any datatype that has the semantics of a date or time. While this is a
good thing for interchange formats, it is more questionable when
XML is used to define user interfaces, since we will see that ISO
8601 is not very user friendly. Formats that use the names of the
months or a different order of year, month, and day are not the
only victims of this decision: ISO 8601 also imposes the
Gregorian (Christian) calendar to the exclusion of calendars used by
other cultures or religions.
ISO 8601 describes several formats to define dates, times, periods,
and recurring dates, with different levels of precision and
indeterminacy. After many discussions, W3C XML Schema selected a
subset of these formats and created a primitive datatype for each
format that is supported.
The indeterminacy allowed in some of these formats adds a lot of
difficulty, especially when comparisons or arithmetic are involved.
For instance, it is possible to define a point in time without
specifying the time zone, which is then
considered undetermined. This undetermined time zone is identical all
over the document (and between the schema and the instance documents)
and it's not an issue to compare two datetimes
without a time zone. The problem arises when you need to compare two
points in time, one with a time zone and the other without. The
result of this comparison will be undetermined if these values are
too close, since one of them may be between -13 hours and +12 hours
of Coordinated Universal Time (UTC).
Thus, the support of these datetime datatypes introduces a notion of
"partial order relation."
Another caveat with ISO 8601 is that time zones are only supported
through the time difference from UTC, which ignores the notion of
daylight saving time. For instance, if an application working in
British Summer Time (BST) wants to specify
the time zone (and we have seen that this is necessary to be
able to compare datetimes), the application needs to know whether a
date is in summer (the time zone is then one hour ahead of UTC) or in
winter (the time zone is then UTC). ISO 8601 ignores the
"named time zones" that account for daylight
saving time (such as PST, BST, or WET) and that we use in our day-to-day
life; omitting the time zone can be seen as a somewhat dangerous
shortcut to specify that a datetime is in your
"local time," whatever it is.
4.5.2. Datatypes
- Point in time: xs:dateTime
-
The xs:dateTime datatype defines a
"specific instant of time." This is
a subset of what ISO 8601 calls a "moment of
time." Its lexical value follows the format
"CCYY-MM-DDThh:mm:ss," in which all
the fields must be present and may optionally be preceded by a sign
and leading figures, if needed, and followed by fractional digits for
the seconds and a time zone. The time zone may be specified using the
letter "Z," which identifies UTC,
or by the difference of time with UTC.
TIP:
The value space of xs:dateTime is considered to be
the moment of time itself. The time zone that defines the value (when
there is one) is considered meaningless, which is a problem for some
applications that complain that even though
2002-01-18T12:00:00+00:00 and
2002-01-18T11:00:00-01:00 refer to the same
"moment of time," they carry
different time zone information, which should make its way into the
value space.
Valid values for xs:dateTime include:
2001-10-26T21:32:52
2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
-2001-10-26T21:32:52
2001-10-26T21:32:52.12679
The following values are invalid:
2001-10-26 (all the parts must be specified)
2001-10-26T21:32 (all the parts must be specified)
2001-10-26T25:32:52+02:00 (the hours part (25) is out of range)
01-10-26T21:32 (all the parts must be specified)
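The lexical rules illustrated by these examples can be approximated with a regular expression plus a few range checks. The following Python sketch is only an illustration: the function name is hypothetical, and it deliberately ignores finer points such as the number of days in each month or the special value 24:00:00.

```python
import re

# Simplified pattern: optional sign, at least four year digits, mandatory
# date and time parts, optional fractional seconds, optional time zone.
DATETIME_RE = re.compile(
    r"-?(\d{4,})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"(Z|[+-]\d{2}:\d{2})?"
)


def is_valid_datetime_lexical(text: str) -> bool:
    """Rough check of the xs:dateTime lexical form (hypothetical helper)."""
    match = DATETIME_RE.fullmatch(text)
    if match is None:
        return False
    month, day, hour, minute, second = (int(match.group(i)) for i in range(2, 7))
    # Range checks only; the day is not checked against the month length.
    return (
        1 <= month <= 12
        and 1 <= day <= 31
        and hour <= 23
        and minute <= 59
        and second <= 59
    )


for sample in ("2001-10-26T21:32:52", "2001-10-26T21:32", "2001-10-26T25:32:52+02:00"):
    print(sample, is_valid_datetime_lexical(sample))
```

A real schema processor performs stricter checks; the point here is only that every field is mandatory and that the sign, fraction, and time zone are optional.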
Among the valid examples given above, three map to the same value in
the value space:
2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
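One way to convince yourself that these three lexical forms denote the same instant is to normalize them to UTC. The sketch below uses Python's standard datetime module; the helper name to_utc is an assumption made for the illustration, and it handles only the simple forms shown here (no negative years, for instance).

```python
from datetime import datetime, timezone


def to_utc(lexical: str) -> datetime:
    """Parse a zoned xs:dateTime-like string and normalize it to UTC."""
    # Older Python versions do not accept "Z" in fromisoformat(),
    # so map it to the equivalent numeric offset first.
    if lexical.endswith("Z"):
        lexical = lexical[:-1] + "+00:00"
    return datetime.fromisoformat(lexical).astimezone(timezone.utc)


values = [
    "2001-10-26T21:32:52+02:00",
    "2001-10-26T19:32:52Z",
    "2001-10-26T19:32:52+00:00",
]
instants = {to_utc(v) for v in values}
print(len(instants))  # a single distinct instant
```

Note that the original offsets are lost once the values are normalized, which is precisely the behavior that some applications object to.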
The first valid example (2001-10-26T21:32:52), which
doesn't include a time zone specification, is
considered to have an indeterminate value between
2001-10-26T21:32:52-14:00 and
2001-10-26T21:32:52+14:00. Because of
daylight saving time, the range of offsets in use is subject to
national regulations and
may change. The range was between -13:00 and +12:00 when the
Recommendation was published, but the Working Group has kept a margin
to accommodate possible changes in the regulations.
Despite the indeterminacy of the time zone when none is specified,
the W3C XML Schema Recommendation considers that the values of
datetimes without time zones implicitly refer to the same
undetermined time zone and can be compared with each other. While this
is fine for "local" applications
that operate in a single time zone, it is a source of potential
confusion and errors for worldwide applications, or even for
applications that calculate a duration between moments that fall on
either side of a daylight saving transition within a single time zone.
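The resulting partial order can be sketched in a few lines of Python. This is a minimal illustration, not the normative algorithm: the function names are hypothetical, and the only rule taken from the text is that a value without a time zone may lie anywhere between -14:00 and +14:00.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

MAX_OFFSET = timedelta(hours=14)  # margin described in the text


def utc_bounds(value: datetime) -> Tuple[datetime, datetime]:
    """Earliest and latest UTC instants that the value may denote."""
    if value.tzinfo is not None:
        utc = value.astimezone(timezone.utc).replace(tzinfo=None)
        return utc, utc
    # No time zone: the offset may be anything from +14:00 to -14:00.
    return value - MAX_OFFSET, value + MAX_OFFSET


def compare_instants(a: datetime, b: datetime) -> Optional[int]:
    """Return -1, 0, or 1, or None when the order is indeterminate."""
    if (a.tzinfo is None) == (b.tzinfo is None):
        # Both zoned, or both in the same undetermined zone: total order.
        return (a > b) - (a < b)
    a_low, a_high = utc_bounds(a)
    b_low, b_high = utc_bounds(b)
    if a_high < b_low:
        return -1
    if a_low > b_high:
        return 1
    return None


zoned = datetime(2001, 10, 26, 21, 32, 52, tzinfo=timezone.utc)
print(compare_instants(zoned, datetime(2001, 10, 26, 21, 32, 52)))  # None
print(compare_instants(zoned, datetime(2001, 10, 28, 0, 0, 0)))     # -1
```

Two values that are more than 14 hours apart can always be ordered; closer values can only be ordered when both carry a time zone or neither does.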
- Periods of time: xs:date, xs:gYearMonth and xs:gYear.
-
The lexical space of the xs:date datatype is identical to the date part
of xs:dateTime. Like xs:dateTime,
it accepts an optional time zone, which should always be specified in
order to compare two dates without ambiguity. As defined by W3C XML
Schema, a date is a period of one day in its time zone,
"independent of how many hours this day
has." The consequence of this definition is that two
dates defined in a different time zone cannot be equal except if they
designate the same interval (2001-10-26+12:00 and
2001-10-25-12:00, for instance). Another
consequence is that, like with xs:dateTime, the
order relation between a date with a time zone and a date without a
time zone is partial.
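To see why 2001-10-26+12:00 and 2001-10-25-12:00 are equal, it helps to treat each date as the interval between two UTC instants. The following Python sketch does exactly that; the function date_interval and its whole-hour offset parameter are assumptions made for the illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def date_interval(year: int, month: int, day: int,
                  offset_hours: int) -> Tuple[datetime, datetime]:
    """Return the UTC start and end of a calendar day in a given zone."""
    zone = timezone(timedelta(hours=offset_hours))
    start = datetime(year, month, day, tzinfo=zone)
    end = start + timedelta(days=1)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


first = date_interval(2001, 10, 26, +12)   # 2001-10-26+12:00
second = date_interval(2001, 10, 25, -12)  # 2001-10-25-12:00
print(first == second)
print(first[0].isoformat())
```

Both dates start at noon UTC on October 25th and end 24 hours later, so they designate the same interval even though their lexical forms differ.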
Valid values for xs:date include:
2001-10-26
2001-10-26+02:00
2001-10-26Z
2001-10-26+00:00
-2001-10-26
-20000-04-01
The following values are invalid:
2001-10 (all the parts must be specified)
2001-10-32 (the days part (32) is out of range)
2001-13-26+02:00 (the month part (13) is out of range)
01-10-26 (the century part is missing)
xs:date represents a day identified by a
Gregorian calendar date (and could have been called
"gYearMonthDay").
xs:gYearMonth ("g"
for Gregorian) is a Gregorian calendar month and xs:gYear is a Gregorian calendar year. These three
datatypes are fixed periods of time and optional time zones may be
specified for each of them. The only differences between them really
are their length (1 day, 1 month, and 1 year) and their format (i.e.,
their lexical spaces).
The format of xs:gYearMonth is the format of xs:date without the day part. Valid values for xs:gYearMonth include:
2001-10
2001-10+02:00
2001-10Z
2001-10+00:00
-2001-10
-20000-04
The following values are invalid:
2001 (the month part is missing)
2001-13 (the month part is out of range)
2001-13-26+02:00 (the month part is out of range)
01-10 (the century part is missing)
The format of xs:gYear is the format of xs:gYearMonth without the month part. Valid values for xs:gYear include:
2001
2001+02:00
2001Z
2001+00:00
-2001
-20000
The following values are invalid:
01 (the century part is missing)
2001-13 (the month part is out of range)
This support of time periods is very restrictive: these periods can
only match the Gregorian calendar day, month, or year, and cannot
have an arbitrary length or start time.
- Recurring point in time: xs:time
-
The lexical
space of xs:time is identical to the time part
of xs:dateTime. Semantically, xs:time represents a point in time that recurs every
day; the meaning of 01:20:15 is
"the point in time recurring each day at 01:20:15
am." Like xs:date and xs:dateTime, xs:time accepts an
optional time zone definition. The same issue arises when comparing
times with and without time zones.
NOTE:
Although 01:20:15 is commonly used to represent a
duration of 1 hour, 20 minutes, and 15 seconds, a different format
has been chosen to represent durations.
Valid values for xs:time include:
21:32:52
21:32:52+02:00
19:32:52Z
19:32:52+00:00
21:32:52.12679
The following values are invalid:
21:32 (all the parts must be specified)
25:25:10 (the hour part is out of range)
-10:00:00 (the hour part is out of range)
1:20:10 (all the digits must be supplied)
This support of a recurring point in time is also very limited: the
recurrence period must be a Gregorian calendar day and cannot be
arbitrary.
- Recurring period of time: xs:gDay, xs:gMonth, and xs:gMonthDay.
-
We have already seen points in time and periods, as well as recurring
points in time. The picture wouldn't be complete without a
description of recurring periods. W3C XML Schema supports three
predefined recurring periods corresponding to Gregorian calendar
months (recurring every year) and days (recurring each month or
year). The support of recurring periods is restricted both in terms
of recurrence (the recurrence period can only be a Gregorian calendar
year or month) and period (the start time can only be a Gregorian
calendar day or month, and the duration can only be a Gregorian
calendar day or month).
xs:gDay is a period of a Gregorian calendar day
recurring each Gregorian calendar month. The lexical representation
of xs:gDay is ---DD with an
optional time zone specification. Valid values for xs:gDay include:
---01
---01Z
---01+02:00
---01-04:00
---15
---31
The following values are invalid:
--30- (the format must be "---DD")
---35 (the day is out of range)
---5 (all the digits must be supplied)
15 (missing the leading "---")
The rules of arithmetic between dates and durations apply in this
case, and days are "pinned" in the
range for each month. In our example, ---31, the
selected dates will be January 31st, February 28th (or 29th), March
31st, April 30th, etc.
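The pinning rule is easy to reproduce. The Python sketch below, which assumes a hypothetical helper named gday_occurrences, lists the dates selected by a day number such as 31 for every month of a given year.

```python
import calendar
from datetime import date
from typing import List


def gday_occurrences(day: int, year: int) -> List[date]:
    """Dates matched by a recurring day number, pinned to each month's length."""
    occurrences = []
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]
        # Pin the day to the last day of shorter months.
        occurrences.append(date(year, month, min(day, last_day)))
    return occurrences


for occurrence in gday_occurrences(31, 2001)[:4]:
    print(occurrence.isoformat())
```

For 2001 the first four results are January 31st, February 28th, March 31st, and April 30th; in a leap year such as 2004 the second result becomes February 29th.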
xs:gMonthDay is a period of a Gregorian calendar day
recurring each Gregorian calendar year. The lexical representation of
xs:gMonthDay is --MM-DD with an
optional time zone specification. Valid values for xs:gMonthDay include:
--05-01
--11-01Z
--11-01+02:00
--11-01-04:00
--11-15
--02-29
The following values are invalid:
-01-30- (the format must be --MM-DD)
--01-35 (the day part is out of range)
--1-5 (all the digits must be supplied)
01-15 (the leading -- is missing)
xs:gMonth is a period of a Gregorian calendar
month recurring each Gregorian calendar year. The lexical
representation of xs:gMonth defined in the
Recommendation is --MM-- with an optional time zone
specification. The W3C XML Schema Working Group has acknowledged
that this was an error and that the format --MM
defined by ISO 8601 should be used instead. It has not yet been decided
whether the format described in the Recommendation will be forbidden
or only deprecated, but it is advisable to use the format
--MM (assuming that the tools you are using
already support it). Valid values for xs:gMonth
include:
--05
--11Z
--11+02:00
--11-04:00
--02
The following values are invalid:
-01- (the format must be --MM)
--13 (the month is out of range)
--1 (both digits must be provided)
01 (the leading -- is missing)
- xs:duration
-
Naive programmers who think that the concept of duration is simple should
read the Recommendation, which states: "xs:duration is defined as a six-dimensional
space!" Mathematicians would object that this is not
absolutely true since most of the axes of these dimensions are
parallel, but the fact is that when these programmers say that a
development will last one month and 3 days, they define a duration
that lasts between 31 and 34 days. The attempt of W3C XML
Schema to deal with these issues on top of ISO 8601 has introduced a
degree of indeterminacy in the comparisons between durations.
The lexical space of xs:duration is the format
defined by ISO 8601 under the form PnYnMnDTnHnMnS,
in which the capital letters are delimiters that can be omitted when
the corresponding member is not used. An important difference from
the format used for xs:dateTime is that none of these
members is mandatory and none of them is restricted to a range.
This gives flexibility to choose the units that will be used and to
combine several of them: for instance,
P1Y2MT123S (1 year, 2 months, and 123 seconds).
This flexibility has a price; such a duration is not completely
defined: a year may have 365 or 366 days, and a period of two months
lasts between 59 and 62 days. Durations cannot always be compared and
the order between durations is partial. We will see, in the next
chapter, that user-defined datatypes can be derived from xs:duration, which can restrict the components used to
express durations and ensure that these indeterminacies do not
occur.
Since the value of a duration is fixed as soon as you give it a
starting point, the schema Working Group has identified four
datetimes:
1696-09-01T00:00:00Z
1697-02-01T00:00:00Z
1903-03-01T00:00:00Z
1903-07-01T00:00:00Z
These cause the greatest deviations when durations mixing day, month,
and other components are added. The Working Group has determined that
the comparison of durations is undefined if--and only
if--the result of the comparison is different when each of these
dates is used as a starting point.
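This rule can be sketched in Python by adding each duration to the four reference datetimes and checking whether the comparisons agree. The Duration tuple and function names below are hypothetical, hours and minutes are assumed to be folded into the seconds field, and the addition is a simplified version of the date arithmetic (months first, with the day pinned, then days and seconds).

```python
import calendar
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Duration(NamedTuple):
    """Simplified duration: hours and minutes are folded into seconds."""
    years: int = 0
    months: int = 0
    days: int = 0
    seconds: float = 0.0


# The four starting points listed in the text (treated as UTC).
REFERENCE_STARTS = [
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
]


def add_duration(start: datetime, d: Duration) -> datetime:
    """Add years and months first (pinning the day), then days and seconds."""
    total_months = start.month - 1 + d.years * 12 + d.months
    year = start.year + total_months // 12
    month = total_months % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    shifted = start.replace(year=year, month=month, day=day)
    return shifted + timedelta(days=d.days, seconds=d.seconds)


def compare_durations(a: Duration, b: Duration) -> Optional[int]:
    """Return -1, 0, or 1, or None when the starting points disagree."""
    results = set()
    for start in REFERENCE_STARTS:
        end_a, end_b = add_duration(start, a), add_duration(start, b)
        results.add((end_a > end_b) - (end_a < end_b))
    return results.pop() if len(results) == 1 else None


print(compare_durations(Duration(months=1), Duration(days=30)))  # None
print(compare_durations(Duration(months=1), Duration(days=27)))  # 1
```

One month is always longer than 27 days and shorter than 32 days, but it cannot be ordered against 30 days because the answer depends on the starting point.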
Valid values for xs:duration include:
PT1004199059S
PT130S
PT2M10S
P1DT2S
-P1Y
P1Y2M3DT5H20M30.123S
The following values are invalid:
1Y (the leading P is missing)
P1S (the T separator is missing)
P-1Y (all parts must be positive)
P1M2Y (the parts order is significant and Y must precede M)
P1Y-1M (all parts must be positive)
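The lexical rules behind these examples fit in a single regular expression. The Python sketch below is an approximation written for this illustration (the names are hypothetical): it checks the order of the components, the placement of the T separator, and the position of the sign, but says nothing about the value of the duration.

```python
import re

# Optional leading sign, mandatory "P", components in fixed order.
# The lookaheads require at least one component, and at least one
# time component when "T" is present.
DURATION_RE = re.compile(
    r"-?P(?=\d|T\d)(\d+Y)?(\d+M)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)


def is_valid_duration_lexical(text: str) -> bool:
    """Rough check of the xs:duration lexical form (hypothetical helper)."""
    return DURATION_RE.fullmatch(text) is not None


for sample in ("P1Y2M3DT5H20M30.123S", "P1S", "P1M2Y"):
    print(sample, is_valid_duration_lexical(sample))
```

Because the sign is only accepted before the P, values such as P-1Y and P1Y-1M are rejected, while -P1Y is accepted.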
Copyright © 2002 O'Reilly & Associates. All rights reserved.