2.1.3. Discussion
This problem gets to the heart of what we mean by a number. Even
things that sound simple, like "integer", make you
think hard about what you will accept; for example, "Is a leading +
for positive numbers optional, mandatory, or forbidden?" The many
ways that floating-point numbers can be represented could overheat
your brain.
Decide what you will and will not accept. Then, construct a regular
expression to match those things alone. Here are some precooked
solutions (the Cookbook's equivalent of just-add-water meals) for
most common cases:
warn "has nondigits" if /\D/;
warn "not a natural number" unless /^\d+$/; # rejects -3
warn "not an integer" unless /^-?\d+$/; # rejects +3
warn "not an integer" unless /^[+-]?\d+$/;
warn "not a decimal number" unless /^-?\d+\.?\d*$/; # rejects .2
warn "not a decimal number" unless /^-?(?:\d+(?:\.\d*)?|\.\d+)$/;
warn "not a C float"
    unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
These lines do not catch the IEEE notations of "Infinity" and "NaN",
but unless you're worried that IEEE committee members will stop by
your workplace and beat you over the head with copies of the relevant
standards documents, you can probably forget about these strange
forms.
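If you do need those forms, one possible approach is to wrap the C float pattern in a subroutine and add a second test for the IEEE spellings. The following is only a sketch: the helper names is_c_float and is_c_float_or_ieee are hypothetical, and the set of accepted spellings ("Infinity", "Inf", and "NaN", case-insensitive, with an optional sign) is an assumption rather than anything mandated by a standard.

```perl
use strict;
use warnings;

# Hypothetical helper: the C float pattern shown earlier, wrapped in a sub.
# Returns 1 if the whole string matches, 0 otherwise.
sub is_c_float {
    my ($s) = @_;
    return $s =~ /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/ ? 1 : 0;
}

# Hypothetical helper: also accept "Infinity", "Inf", and "NaN".
# Which spellings to allow is an assumption; adjust to your own data.
sub is_c_float_or_ieee {
    my ($s) = @_;
    return 1 if is_c_float($s);
    return $s =~ /^[+-]?(?:inf(?:inity)?|nan)$/i ? 1 : 0;
}

# Show how each sample string fares under both tests.
for my $s ("3", "-3.", ".5e-10", "1e", "Infinity", "-inf", "NaN", "abc") {
    printf "%-12s c_float=%d with_ieee=%d\n",
        "'$s'", is_c_float($s), is_c_float_or_ieee($s);
}
```

Keeping the IEEE test in a separate pattern leaves the original regular expression untouched, so you can drop the extra check if those forms never appear in your input.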
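The CPAN module Regexp::Common provides canned patterns that test
whether a string looks like a number. It exports a hash called %RE
that you index by the kind of pattern you want. Anchor the pattern as
needed; otherwise, it matches anywhere in the string:
use Regexp::Common;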
$string = "Gandalf departed from the Havens in 3021 TA.";
print "Is an integer\n" if $string =~ / ^ $RE{num}{int} $ /x;
print "Contains the integer $1\n" if $string =~ / ( $RE{num}{int} ) /x;
The following examples are other patterns that the module can use to
match numbers:
$RE{num}{int}{-sep=>',?'} # match 1234567 or 1,234,567
$RE{num}{int}{-sep=>'.'}{-group=>4} # match 1.2345.6789
$RE{num}{int}{-base => 8} # match 014 but not 99
$RE{num}{int}{-sep=>','}{-group=>3} # match 1,234,594
$RE{num}{int}{-sep=>',?'}{-group=>3} # match 1,234 or 1234
$RE{num}{real} # match 123.456 or -0.123456
$RE{num}{roman} # match xvii or MCMXCVIII
$RE{num}{square} # match 9 or 256 or 12321
Some of these patterns, such as square, were not available in early
module versions. General documentation for the module can be found in
the Regexp::Common manpage, but more detailed documentation for just
the numeric patterns is in the Regexp::Common::number manpage.
Some techniques for identifying numbers don't involve regular
expressions. Instead, these techniques use functions from system
libraries or Perl to determine whether a string contains an
acceptable number. Of course, these functions limit you to the
definition of "number" offered by your libraries and Perl.
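As a rough illustration of the non-regex approach, the sketch below uses two functions that ship with Perl: looks_like_number from Scalar::Util, which asks Perl itself whether it would treat a string as a number, and strtod from POSIX, which hands the string to the C library. The wrapper name parse_number is hypothetical, and the decision to trim surrounding whitespace before parsing is an assumption about what you want to accept.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
use POSIX qw(strtod);

# Hypothetical wrapper around POSIX::strtod.
# Returns the numeric value if the whole string parses as a number,
# or undef (in scalar context) if it does not.
sub parse_number {
    my ($str) = @_;
    $str =~ s/^\s+//;             # assumption: ignore leading whitespace
    $str =~ s/\s+$//;             # assumption: ignore trailing whitespace
    return if $str eq '';
    # In list context strtod returns the value and the count of
    # characters it could not parse.
    my ($num, $unparsed) = strtod($str);
    return if $unparsed != 0;
    return $num;
}

for my $s ("42", "1e3", "12abc", "abc") {
    my $n = parse_number($s);
    printf "%-8s looks_like_number=%s strtod=%s\n",
        "'$s'",
        (looks_like_number($s) ? "yes" : "no"),
        (defined $n ? $n : "undef");
}
```

Because strtod comes from the C library, what it accepts (including the decimal separator under some locales) may differ from what looks_like_number accepts, so it is worth testing both against your own data.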