

2.3 Strings

Strings are sequences of characters (like hello). Each character is an 8-bit value from the entire 256-character set (there's nothing special about the NUL character, as there is in some languages).

The shortest possible string has no characters. The longest string fills all of your available memory (although you wouldn't be able to do much with that). This is in accordance with the principle of "no built-in limits" that Perl follows at every opportunity. Typical strings are printable sequences of letters and digits and punctuation in the ASCII 32 to ASCII 126 range. However, the ability to have any character from 0 to 255 in a string means you can create, scan, and manipulate raw binary data as strings - something with which most other utilities would have great difficulty. (For example, you can patch your operating system by reading it into a Perl string, making the change, and writing the result back out.)
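As a minimal sketch of the binary-data point, the following builds a four-character string containing a NUL and a high-bit value, then counts and lists its characters. The variable names ($raw, $len, @codes) are made up for illustration, and variables themselves are introduced properly later in the chapter.

```perl
use strict;
use warnings;

# A hypothetical four-character string: 'A', NUL, character 255, 'B'.
my $raw = "A\0\xffB";

# The NUL does not end the string; length counts all four characters.
my $len = length($raw);

# Split into single characters and take the numeric value of each one.
my @codes = map { ord($_) } split //, $raw;

print "length: $len\n";     # prints "length: 4"
print "codes: @codes\n";    # prints "codes: 65 0 255 66"
```

Nothing here treats the NUL or the 255 specially, which is what makes it practical to hold raw binary data in an ordinary string.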

Like numbers, strings have a literal representation (the way you represent the string in a Perl program). Literal strings come in two different flavors: single-quoted strings and double-quoted strings.[5] Another form that looks rather like these two is the back-quoted string (`like this`). This isn't so much a literal string as a way to run external commands and get back their output. This is covered in Chapter 14, Process Management.

[5] There are also the here strings, similar to the shell's here documents. They are explained in Chapter 19, CGI Programming. See also Chapter 2 of Programming Perl, and perldata(1).

2.3.1 Single-Quoted Strings

A single-quoted string is a sequence of characters enclosed in single quotes. The single quotes are not part of the string itself; they're just there to let Perl identify the beginning and the ending of the string. Any character between the quote marks (including newline characters, if the string continues onto successive lines) is legal inside a string. Two exceptions: to get a single quote into a single-quoted string, precede it by a backslash. And to get a backslash into a single-quoted string, precede the backslash by a backslash. In other pictures:

'hello'     # five characters: h, e, l, l, o
'don\'t'    # five characters: d, o, n, single-quote, t
''          # the null string (no characters)
'silly\\me' # silly, followed by backslash, followed by me
'hello\n'   # hello followed by backslash followed by n
'hello
there'      # hello, newline, there (11 characters total)

Note that the \n within a single-quoted string is not interpreted as a newline, but as the two characters backslash and n. (Only when the backslash is followed by another backslash or a single quote does it have special meaning.)
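One way to check these character counts for yourself is to ask Perl for the length of each literal. This is only a sketch: the variable names are invented for illustration, and length is a built-in function that returns the number of characters in a string.

```perl
use strict;
use warnings;

# length() reports how many characters each literal actually holds.
my $plain   = length('hello');       # 5
my $quote   = length('don\'t');      # 5: the backslash is not stored
my $slash   = length('silly\\me');   # 8: only one backslash is stored
my $literal = length('hello\n');     # 7: backslash and n are both stored

print "$plain $quote $slash $literal\n";   # prints "5 5 8 7"
```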

2.3.2 Double-Quoted Strings

A double-quoted string acts a lot like a C string. Once again, it's a sequence of characters, although this time enclosed in double quotes. But now the backslash takes on its full power to specify certain control characters, or even any character at all through octal and hex representations. Here are some double-quoted strings:

"hello world\n"  # hello world, and a newline
"new \177"       # new, space, and the delete character (octal 177)
"coke\tsprite"   # a coke, a tab, and a sprite
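To see the difference the double quotes make, here is a small sketch that compares an octal escape with a hex escape and measures the same text in both quoting styles. The variable names are hypothetical.

```perl
use strict;
use warnings;

# Octal 177 and hex 7f name the same character (decimal 127, delete).
my $same = ("\177" eq "\x7f") ? 1 : 0;

# "\t" is a single tab character, so this string holds 4 + 1 + 6 = 11 characters.
my $len = length("coke\tsprite");

# In single quotes, the backslash and the t are stored as two characters.
my $len_single = length('coke\tsprite');

print "$same $len $len_single\n";   # prints "1 11 12"
```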

The backslash can precede many different characters to mean different things (typically called a backslash escape). The complete list of double-quoted string escapes is given in Table 2.1.


Table 2.1: Double-Quoted String Representations

Construct   Meaning
---------   -------
\n          Newline
\r          Return
\t          Tab
\f          Formfeed
\b          Backspace
\a          Bell
\e          Escape
\007        Any octal ASCII value (here, 007 = bell)
\x7f        Any hex ASCII value (here, 7f = delete)
\cC         Any "control" character (here, CTRL-C)
\\          Backslash
\"          Double quote
\l          Lowercase next letter
\L          Lowercase all following letters until \E
\u          Uppercase next letter
\U          Uppercase all following letters until \E
\Q          Backslash-quote all nonalphanumerics until \E
\E          Terminate \L, \U, or \Q
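The case-shifting and quoting escapes in the table are easier to grasp with a few concrete strings. The following is a sketch with made-up variable names; each comment shows the value the string ends up holding.

```perl
use strict;
use warnings;

my $upper  = "\Uhello\E world";   # "HELLO world"
my $first  = "\uhello";           # "Hello"
my $lower  = "\LHELLO\E THERE";   # "hello THERE"
my $lfirst = "\lHELLO";           # "hELLO"
my $quoted = "\Qa.b*c\E";         # a, backslash, dot, b, backslash, star, c
my $ctrl   = ord("\cC");          # 3, the numeric value of CTRL-C

print "$upper\n$first\n$lower\n$lfirst\n$quoted\n$ctrl\n";
```

Note that \u and \l affect only the single letter that follows, while \U, \L, and \Q stay in effect until a \E or the end of the string.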

Another feature of double-quoted strings is that they are variable interpolated, meaning that scalar and array variables within the strings are replaced with their current values when the strings are used. We haven't formally been introduced to what a variable looks like yet (except in the stroll), so I'll get back to this later.