3.7. Parsing Dates and Times from Strings

Problem

You read in a date or time specification in an arbitrary format but need to convert that string into distinct year, month, etc. values.

Solution

If your date is already numeric, or in a rigid and easily parsed format, use a regular expression (and possibly a hash mapping month names to numbers) to extract individual day, month, and year values, and then use the standard Time::Local module's timelocal and timegm functions to turn that into an Epoch seconds value.

use Time::Local;
# $date is "1998-06-03" (YYYY-MM-DD form).
($yyyy, $mm, $dd) = ($date =~ /(\d+)-(\d+)-(\d+)/;
# calculate epoch seconds at midnight on that day in this timezone
$epoch_seconds = timelocal(0, 0, 0, $dd, $mm, $yyyy);

For a more flexible solution, use the ParseDate function provided by the CPAN module Date::Manip, and then use UnixDate to extract the individual values.

use Date::Manip qw(ParseDate UnixDate);
$date = ParseDate($STRING);
if (!$date) {
    # bad date
} else {
    @VALUES = UnixDate($date, @FORMATS);
}

Discussion

The flexible ParseDate function accepts many formats. It even converts strings like "today", "2 weeks ago Friday", and "2nd Sunday in 1996", and understands the date and time format used in mail and news headers. It returns the decoded date in its own format: a string of the form "YYYYMMDDHH:MM:SS". You can compare two such strings to compare the dates they represent, but arithmetic is difficult. For this reason, we use the UnixDate function to extract the year, month, and day values in a preferred format.

UnixDate takes a date in the string form returned by ParseDate and a list of formats. It applies each format to the string and returns the result. A format is a string describing one or more elements of the date and time and the way that the elements are to be formatted. For example, %Y is the format for the year in four-digit form. Here's an example:

use Date::Manip qw(ParseDate UnixDate);

while (<>) {
    $date = ParseDate($_);
    if (!$date) {
        warn "Bad date string: $_\n";
        next;
    } else {
        ($year, $month, $day) = UnixDate($date, "%Y", "%m", "%d");
        print "Date was $month/$day/$year\n";
    }
}

See Also

The documentation for the CPAN module Date::Manip; we use this in Recipe 3.11