6.12. Honoring Locale Settings in Regular Expressions

Problem
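You want to translate case when in a different locale, or you want to make \w match letters with diacritics, such as the ö in König.

For example, let's say you're given half a gigabyte of text written in German and told to index it. You want to extract words (with \w+) and convert them to lowercase (with lc or \L).

Solution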
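Perl's regular-expression and text-manipulation routines have hooks to the POSIX locale setting. If you use the use locale pragma, accented characters are taken care of, assuming a reasonable LC_CTYPE specification and system support for the same.

    use locale;

Discussion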
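As a rough illustration of the pragma in use, the following sketch extracts words with \w+ and lowercases them with lc while use locale is in effect. The helper names first_available_locale and lowercase_words are hypothetical, and the candidate locale spellings are assumptions; run locale -a to see what your system actually offers. Because locale names differ between systems, the sketch tries several spellings and stays in the current locale if none is accepted.

```perl
#!/usr/bin/perl
# locale_words - sketch: pick a usable locale, then extract and lowercase words
use strict;
use warnings;
use locale;                          # make \w and lc() consult LC_CTYPE
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: try each candidate name and return the first one
# that setlocale() accepts, or undef if none of them is accepted.
sub first_available_locale {
    my @candidates = @_;
    for my $name (@candidates) {
        return $name if defined setlocale(LC_CTYPE, $name);
    }
    return;
}

# Hypothetical helper: return every \w+ run in $text, lowercased.
sub lowercase_words {
    my ($text) = @_;
    my @words;
    while ($text =~ /(\w+)/g) {
        push @words, lc $1;
    }
    return @words;
}

# These spellings are assumptions; locale names are not standardized.
my @german = qw(de_DE.ISO8859-1 de_DE.ISO_8859-1 de_DE.iso88591 de_DE);
my $chosen = first_available_locale(@german);
if (defined $chosen) {
    print "Using locale $chosen\n";
} else {
    warn "No German locale found; staying in the current locale\n";
}

my $text  = "Andreas K\xF6nig";      # \xF6 is o-umlaut in ISO 8859-1
my @words = lowercase_words($text);

# How many words come back depends on whether the locale in effect
# classifies \xF6 as a letter.
print scalar(@words), " words found\n";
```

Trying a list of names rather than hard-coding one is a design choice for this sketch only: it keeps the script running on systems that spell the German locale differently, at the cost of silently producing different word boundaries when no suitable locale is installed.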
By default, \w+ and case-mapping functions operate only on upper- and lowercase ASCII letters, digits, and underscores. The use locale directive lets you redefine what a "word character" means.

In Example 6.10 you can see the difference in output between having selected the English ("en") locale and the German ("de") one.

Example 6.10: localeg

    #!/usr/bin/perl -w
    # localeg - demonstrate locale effects

    use locale;
    use POSIX 'locale_h';

    $name = "andreas k\xF6nig";
    @locale{qw(German English)} = qw(de_DE.ISO_8859-1 us-ascii);

    # extract and capitalize the words under the English locale
    setlocale(LC_CTYPE, $locale{English})
        or die "Invalid locale $locale{English}";
    @english_names = ();
    while ($name =~ /\b(\w+)\b/g) {
        push(@english_names, ucfirst($1));
    }

    # repeat the same extraction under the German locale
    setlocale(LC_CTYPE, $locale{German})
        or die "Invalid locale $locale{German}";
    @german_names = ();
    while ($name =~ /\b(\w+)\b/g) {
        push(@german_names, ucfirst($1));
    }

    print "English names: @english_names\n";
    print "German names:  @german_names\n";

This approach relies on POSIX locale support, which your system may or may not provide. Even if your system does claim to provide POSIX locale support, the standard does not specify the locale names. As you can tell, portability of this approach is not assured.

See Also
The treatment of

Copyright © 2001 O'Reilly & Associates. All rights reserved.