Using Named Unicode Characters (Perl Cookbook, 2nd Edition)

1.5.3. Discussion

The use charnames pragma lets you use symbolic names for Unicode characters. These are compile-time constants that you access with the \N{CHARSPEC} double-quoted string sequence. Several subpragmas are supported. The :full subpragma grants access to the full range of character names, but you have to write them out in full, exactly as they occur in the Unicode character database, including the loud, all-capitals notation. The :short subpragma gives convenient shortcuts of the form SCRIPT:NAME, such as greek:Delta. Any import without a colon tag is taken to be a script name, so that names from those scripts can be written as bare, case-sensitive shortcuts such as Sigma and sigma.

use charnames ':full';
print "\N{GREEK CAPITAL LETTER DELTA} is called delta.\n";

Δ is called delta.

use charnames ':short';
print "\N{greek:Delta} is an upper-case delta.\n";

Δ is an upper-case delta.

use charnames qw(cyrillic greek);
print "\N{Sigma} and \N{sigma} are Greek sigmas.\n";
print "\N{Be} and \N{be} are Cyrillic bes.\n";

Σ and σ are Greek sigmas.
Б and б are Cyrillic bes.

Two functions, charnames::viacode and charnames::vianame, can translate between numeric code points and the long names. The Unicode documents use the notation U+XXXX to indicate the Unicode character whose code point is XXXX, so we'll use that here in our output.

use charnames qw(:full);
for my $code (0xC4, 0x394) {
    printf "Character U+%04X (%s) is named %s\n",
        $code, chr($code), charnames::viacode($code);
}

Character U+00C4 (Ä) is named LATIN CAPITAL LETTER A WITH DIAERESIS
Character U+0394 (Δ) is named GREEK CAPITAL LETTER DELTA
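The same function can label every character in a string, which is handy when you're not sure what a piece of text really contains. What follows is only a sketch: describe_chars is a name we made up for illustration, not something charnames provides, and it assumes that charnames::viacode returns undef for a code point that has no name.

```perl
use strict;
use warnings;
use charnames ':full';

# Hypothetical helper (not part of charnames): return one
# "U+XXXX NAME" line for each character of $string.
sub describe_chars {
    my ($string) = @_;
    my @lines;
    for my $char (split //, $string) {
        my $code = ord $char;
        # Assumption: viacode yields undef when the code point has no name
        my $name = charnames::viacode($code);
        $name = '(no name)' unless defined $name;
        push @lines, sprintf "U+%04X %s", $code, $name;
    }
    return @lines;
}

# Only ASCII is printed here, so no output layer is needed
print "$_\n" for describe_chars("\N{GREEK CAPITAL LETTER DELTA}x");
```

Because the helper prints names rather than the characters themselves, its output stays plain ASCII no matter what the input string holds.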

use charnames qw(:full);
my $name = "MUSIC SHARP SIGN";
my $code = charnames::vianame($name);
printf "%s is character U+%04X (%s)\n",
    $name, $code, chr($code);

MUSIC SHARP SIGN is character U+266F (♯)
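A name that comes from user input or a data file might not exist, so it's worth checking the result before formatting it. Here is a minimal sketch, assuming that charnames::vianame returns undef for a name it does not recognize; name_to_uplus is a hypothetical helper of our own, and the second name in the loop is deliberately bogus.

```perl
use strict;
use warnings;
use charnames ':full';

# Hypothetical helper (not part of charnames): map a full character
# name to its "U+XXXX" notation, or return undef for an unknown name.
sub name_to_uplus {
    my ($name) = @_;
    my $code = charnames::vianame($name);
    return undef unless defined $code;
    return sprintf "U+%04X", $code;
}

for my $name ("GREEK CAPITAL LETTER DELTA", "NO SUCH CHARACTER NAME") {
    my $uplus = name_to_uplus($name);
    printf "%s => %s\n", $name, defined $uplus ? $uplus : "(unknown)";
}
```

For a valid full name the two functions are inverses of each other, so charnames::vianame(charnames::viacode(0x394)) gives back 0x394.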

Here's how to find the path to Perl's copy of the Unicode character database:

% perl -MConfig -le 'print "$Config{privlib}/unicore/NamesList.txt"'
/usr/local/lib/perl5/5.8.1/unicore/NamesList.txt

Read this file to learn the character names available to you.
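Rather than paging through the whole file, you can search it for names that match a pattern. This sketch rests on two assumptions: that each character entry in the file begins with a hexadecimal code point, a tab, and then the name, and that the file is present at all, since your installation may not include it. The find_names function is a hypothetical helper written for this example.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: read NamesList-style lines from $fh and return
# "XXXX NAME" strings for the character names that match $pattern.
sub find_names {
    my ($fh, $pattern) = @_;
    my @found;
    while (my $line = <$fh>) {
        # Assumed format: hex code point, tab, character name.
        # Lines that start any other way (comments, aliases) are skipped.
        next unless $line =~ /^([0-9A-F]{4,6})\t([^\t\n]+)/;
        my ($hex, $name) = ($1, $2);
        push @found, "$hex $name" if $name =~ $pattern;
    }
    return @found;
}

my $path = "$Config{privlib}/unicore/NamesList.txt";
if (open my $fh, '<', $path) {
    print "$_\n" for find_names($fh, qr/\bSHARP\b/);
    close $fh;
} else {
    # The file may be missing from this installation
    warn "Cannot open $path: $!\n";
}
```

Taking a filehandle instead of a path keeps the helper easy to try on a few sample lines, for example through an in-memory filehandle opened on a string.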
