6.19. Matching a Valid Mail Address

Problem

You want to find a pattern that will verify the validity of a supplied mail address.

Solution

There isn't one. You cannot do real-time validation of mail addresses. You must pick from a number of compromises.

Discussion
The common patterns that people try to use for this are all quite incorrect. As an example, the address

RFC-822 documents have a formal specification for what constitutes a syntactically valid mail address. However, complete processing requires recursive parsing of nested comments, something that a single regular expression cannot do. If you first strip off legal comments:

    1 while $addr =~ s/\([^()]*\)//g;

you could then in theory use the 6598-byte pattern given on the last page of Mastering Regular Expressions to test for RFC conformance, but that's still not good enough, for three reasons.
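To see why the comment-stripping loop needs to repeat, here is a minimal sketch wrapped in a helper. The name strip_comments and the sample address are assumptions made for illustration, not part of the recipe. Each pass removes only comments that contain no inner parentheses, so nested comments disappear from the inside out. The sketch knows nothing about quoted strings or backslash-escaped parentheses, so it is not a full RFC-822 comment parser.

```perl
use strict;
use warnings;

# strip_comments is a hypothetical helper name used only for this sketch.
sub strip_comments {
    my ($addr) = @_;
    # One pass deletes every parenthesized comment that has no nested
    # parentheses inside it. s///g returns the number of deletions, so
    # the loop stops on the first pass that deletes nothing.
    1 while $addr =~ s/\([^()]*\)//g;
    return $addr;
}

# Pass 1 removes "(the boss)"; pass 2 removes what is left of the outer comment.
print strip_comments('fred(Fred (the boss) Flintstone)@example.com'), "\n";
# prints fred@example.com
```

A single s///g without the loop would leave the outer comment behind, because its pattern refuses to match across the inner parentheses.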
First, not all RFC-valid addresses are deliverable. For example,
Second, some RFC-invalid addresses, in practice, are perfectly deliverable. For example, a lone
Third, and most important, just because the address happens to be both valid and deliverable doesn't mean that it's the right one.

The script at http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz makes a valiant (albeit provably imperfect) attempt at doing this incorrectly. It jumps through many hoops, including the RFC-822 regular expression from Mastering Regular Expressions, DNS MX record look-up, and stop lists for naughty words and famous people. But this is still a very weak approach.

Our best advice for verifying a person's mail address is to have them enter their address twice, just as you would when changing a password. This usually weeds out typos. If both versions match, send mail to that address with a personal message such as:

    Dear someuser@host.com,

    Please confirm the mail address you gave us Wed May 6 09:38:41 MDT 1998
    by replying to this message. Include the string "Rumpelstiltskin" in
    that reply, but spelled in reverse; that is, start with "Nik...".
    Once this is done, your confirmed address will be entered into our
    records.

If you get the message back and they've followed your directions, you can be reasonably assured that it's real.

A related strategy that's less open to forgery is to give them a PIN (personal identification number). Record the address and PIN (preferably a random one) for later processing. In the mail you send, ask them to include the PIN in their reply. If the mail bounces, or if a vacation script quotes your message back, the PIN will be there anyway. So ask them to mail back the PIN slightly altered, such as with the characters reversed, one added to or subtracted from each digit, etc.

See Also

The "Matching an Email Address" section of Chapter 7 of Mastering Regular Expressions; Recipe 18.9
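The double-entry check and the PIN round trip described in the Discussion could be sketched roughly as follows. Every subroutine name here (entries_match, make_pin, expected_reply, reply_confirms) is hypothetical, and the choice of a reversed six-digit PIN is just one of the alterations the Discussion suggests. Sending and receiving the mail itself is left out.

```perl
use strict;
use warnings;

# All helper names below are assumptions made for this sketch.

# Both typed copies of the address must be identical before any mail is sent.
sub entries_match {
    my ($first, $second) = @_;
    return $first eq $second;
}

# Build a random PIN of $digits decimal digits. Palindromes are rejected,
# because reversing one would give back the PIN that was mailed out.
# rand() is fine for a sketch but is not a cryptographically strong source.
sub make_pin {
    my ($digits) = @_;
    die "need at least two digits\n" if $digits < 2;
    my $pin;
    do {
        $pin = join '', map { int(rand(10)) } 1 .. $digits;
    } while ($pin eq scalar reverse $pin);
    return $pin;
}

# The reply we ask for: the PIN with its characters reversed.
sub expected_reply {
    my ($pin) = @_;
    return scalar reverse $pin;
}

# True only if the reply body contains the reversed PIN as a standalone
# run of digits. A bounce or vacation message that merely quotes the
# original PIN does not qualify.
sub reply_confirms {
    my ($pin, $body) = @_;
    my $want = expected_reply($pin);
    return $body =~ /(?<!\d)\Q$want\E(?!\d)/ ? 1 : 0;
}

my $pin = make_pin(6);
print "PIN to mail out:     $pin\n";
print "Reply must contain:  ", expected_reply($pin), "\n";
```

Record the address together with the PIN when the mail goes out, and call reply_confirms on each incoming reply before entering the address into your records.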