1.9. Special Characters and Quoting
The characters <, >,
|, and &
are four examples of special characters that have
particular meanings to the shell. The wildcards we saw
earlier in this chapter (*, ?,
and [...])
are also special characters.
Table 1-6
lists the characters that are special within shell command lines,
along with their meanings. Other characters have special meanings in specific situations,
such as the regular expressions and
string-handling operators we'll see in
Chapter 3 and
Chapter 4.
Table 1-6. Special characters
Character | Meaning | See chapter |
~ | Home directory | 1 |
` | Command substitution (archaic) | 4 |
# | Comment | 4 |
$ | Variable expression | 3 |
& | Background job | 1 |
* | String wildcard | 1 |
( | Start subshell | 8 |
) | End subshell | 8 |
\ | Quote next character | 1 |
| | Pipe | 1 |
[ | Start character-set wildcard | 1 |
] | End character-set wildcard | 1 |
{ | Start code block | 7 |
} | End code block | 7 |
; | Shell command separator | 3 |
' | Strong quote | 1 |
" | Weak quote | 1 |
< | Input redirect | 1 |
> | Output redirect | 1 |
/ | Pathname directory separator | 1 |
? | Single-character wildcard | 1 |
% | Job name/number identifier | 8 |
1.9.1. Quoting
Sometimes you will want to use special characters literally, i.e.,
without their special meanings.
This is called quoting. If you surround a string of
characters with single quotes, you strip all characters within
the quotes of any special meaning they might have.
The most obvious situation where you might need to quote a string
is with the print command, which just takes its arguments
and prints them to the standard output. What is the point of
this? As you will see in later chapters, the shell does quite
a bit of processing on command lines -- most of which involves
some of the special characters listed in
Table 1-6.
print
is a way of making the result of that processing available on
the standard output.
But what if we wanted to print the string,
2 * 3 > 5 is a valid
inequality? Suppose you typed this:
print 2 * 3 > 5 is a valid inequality.
You would get your shell prompt back, as if nothing happened!
But then there would be a new
file, with the name 5, containing "2", the names of all
files in your current directory, and then the string 3 is a valid
inequality. Make sure you understand why.
However, if you type:
print '2 * 3 > 5 is a valid inequality.'
the result is the string, taken literally. You needn't quote
the entire line, just the portion containing special characters
(or characters you think might be special, if you just
want to be sure):
print '2 * 3 > 5' is a valid inequality.
This has exactly the same result.
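To see both behaviors side by side without risking files in a real directory, here is a minimal sketch. It assumes a POSIX-style shell and substitutes a small helper, say (a hypothetical stand-in built on printf), for the Korn shell's print, so that it also runs outside ksh. The scratch directory and the filenames alpha and beta are likewise made up for illustration.

```shell
# say: hypothetical stand-in for ksh's print; joins its arguments with spaces.
say() { printf '%s\n' "$*"; }

# Scratch directory, so the unquoted command cannot clobber real files.
scratch=$(mktemp -d)
(
  cd "$scratch" || exit 1
  touch alpha beta
  # Unquoted: * expands to filenames, and > 5 redirects output to a file named 5.
  say 2 * 3 > 5 is a valid inequality.
)
unquoted_result=$(cat "$scratch/5")

# Single-quoted: every character is literal; nothing is expanded or redirected.
quoted_result=$(say '2 * 3 > 5 is a valid inequality.')

say "file 5 contains: $unquoted_result"
say "quoted output:   $quoted_result"
rm -rf "$scratch"
```

The unquoted command prints nothing to the screen; its output only becomes visible once we read the file 5 back.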
Notice that
Table 1-6
lists double quotes (") as weak quotes.
A string in double quotes is subjected to some of the steps
the shell takes to process command lines, but not all.
(In other words, it treats only some special characters
as special.) You'll
see in later chapters why double quotes are sometimes
preferable; Chapter 7 contains the most comprehensive explanation
of the shell's rules for quoting and other aspects of command-line processing.
For now, though, you should stick to single quotes.
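As a small preview of the difference, the sketch below assumes a hypothetical variable named fruit and uses printf in place of print so that it runs in any POSIX-style shell. The $ is one of the characters that stays special inside double quotes, while * does not.

```shell
fruit=apple   # hypothetical variable, for illustration only

# Weak quotes: $fruit is replaced by its value, but * stays a literal asterisk.
weak="$fruit costs * dollars"

# Strong quotes: nothing is special, so $fruit is left exactly as typed.
strong='$fruit costs * dollars'

printf '%s\n' "$weak" "$strong"
```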
1.9.2. Backslash-Escaping
Another way to change the meaning of a character is to precede
it with a backslash (\). This is called backslash-escaping
the character. In most cases, when you backslash-escape
a character, you quote it. For example:
print 2 \* 3 \> 5 is a valid inequality.
produces the same results as if you surrounded the string
with single quotes. To use a literal backslash, just
surround it with quotes ('\')
or, even better, backslash-escape
it (\\).
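Here is a minimal sketch that checks these claims. It assumes a POSIX-style shell and uses say, a hypothetical helper built on printf, as a stand-in for print.

```shell
# say: hypothetical stand-in for ksh's print; joins its arguments with spaces.
say() { printf '%s\n' "$*"; }

# Backslash-escaping each special character...
escaped=$(say 2 \* 3 \> 5 is a valid inequality.)
# ...gives the same string as single-quoting the whole thing.
quoted=$(say '2 * 3 > 5 is a valid inequality.')

# A literal backslash, written both ways.
quoted_backslash=$(say '\')
escaped_backslash=$(say \\)

say "$escaped"
say "$quoted_backslash $escaped_backslash"
```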
Here is a more practical example of quoting special characters.
A few Unix commands take arguments that often include wildcard
characters, which need to be escaped so the shell doesn't
process them first.
The most common such command is
find, which searches for files throughout entire directory
trees.
To use find, you supply the root of the tree you want to
search and arguments that
describe the characteristics of the file(s) you want to find.
For example, the command
find . -name string -print
searches the directory
tree whose root is your current directory for files whose names
match the string, and prints their names. (Other arguments allow you to search
by the file's size, owner, permissions, date of last access, etc.)
You can use wildcards in the string, but you must quote them,
so that the find command itself can match them against names
of files in each directory it searches. The command
find . -name '*.c' will
match all files whose names end in .c anywhere in
your current directory, subdirectories, sub-subdirectories, etc.
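The sketch below shows what goes wrong if you forget the quotes. The directory tree and filenames are hypothetical, created in a scratch directory just for this illustration.

```shell
# Hypothetical scratch tree, for illustration only.
tree=$(mktemp -d)
mkdir -p "$tree/src/lib"
touch "$tree/main.c" "$tree/src/util.c" "$tree/src/lib/io.c" "$tree/notes.txt"

# Quoted: find receives the pattern *.c and matches it in every directory.
quoted_matches=$(cd "$tree" && find . -name '*.c' -print | LC_ALL=C sort)

# Unquoted: the shell expands *.c to main.c first, so find only looks for that name.
unquoted_matches=$(cd "$tree" && find . -name *.c -print | LC_ALL=C sort)

printf 'quoted:\n%s\nunquoted:\n%s\n' "$quoted_matches" "$unquoted_matches"
rm -rf "$tree"
```

With the quotes, find reports all three .c files; without them, it reports only ./main.c, because the shell has already replaced the pattern with the one matching name in the top-level directory.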
1.9.3. Quoting Quotation Marks
You can also use a backslash to include double quotes within
a string. For example:
print \"2 \* 3 \> 5\" is a valid inequality.
produces the following output:
"2 * 3 > 5" is a valid inequality.
Within a double-quoted string, * and > lose their special meanings, so here only the embedded double quotes need to be escaped:
$ print "\"2 * 3 > 5\" is a valid inequality."
"2 * 3 > 5" is a valid inequality.
However, this won't work with single quotes inside
single-quoted strings, where the backslash itself is taken literally.
For example,
print 'Bob\'s hair is brown' will not
give you Bob's hair is brown. You can get around this
limitation in various ways. First, try eliminating the quotes:
print Bob\'s hair is brown
If no other characters are special (as is the case here),
this works. Otherwise, you can use the following command:
print 'Bob'\''s hair is brown'
That is, '\'' (i.e., single quote, backslash, single quote,
single quote) acts like a single quote within a quoted
string. Why? The first ' in
'\''
ends the quoted string we started
with 'Bob,
the \' inserts a literal single quote,
and the next '
starts another quoted string that ends with the word
"brown".
If you understand this,
you will have no trouble resolving the other bewildering
issues that arise from the shell's often cryptic syntax.
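The workarounds can be compared directly. This is a minimal sketch assuming a POSIX-style shell, with say as a hypothetical printf-based stand-in for print. The double-quoted form is included as a third option, since a single quote has no special meaning inside double quotes.

```shell
# say: hypothetical stand-in for ksh's print; joins its arguments with spaces.
say() { printf '%s\n' "$*"; }

# No quotes at all: escape only the single quote.
no_quotes=$(say Bob\'s hair is brown)

# Close the quoted string, add an escaped single quote, reopen the quoted string.
spliced=$(say 'Bob'\''s hair is brown')

# Double quotes: the single quote is an ordinary character here.
double_quoted=$(say "Bob's hair is brown")

say "$spliced"
```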
A somewhat more legible mechanism, specific to ksh93,
is available for cases where you need to quote single quotes.
This is the shell's extended quoting mechanism: $'...'.
This is known in ksh documentation as
ANSI C quoting, since the rules closely resemble those of
ANSI/ISO Standard C. The full details are provided in Chapter 7.
Here is how to use ANSI C quoting for the previous example:
$ print $'Bob\'s hair is brown'
Bob's hair is brown
1.9.4. Continuing Lines
A related issue is how to continue
the text of a command beyond a single line on your terminal or workstation
window. The answer is conceptually simple: just quote the
ENTER key. After all, ENTER is really just another character.
You can do this in two ways: by ending a line with a backslash
or by not closing a quote mark (i.e., by including ENTER in a quoted
string). If you use the backslash, there must be nothing
between it and the end of the line -- not even spaces or TABs.
Whether you use a backslash or a single quote, you are telling
the shell to ignore the special meaning of the ENTER character.
After you press ENTER, the shell understands that you haven't
finished your command line (since you haven't typed a
"real" ENTER), so it responds with a secondary
prompt, which is > by default, and waits for you to
finish the line. You can continue a line as many times as you wish.
For example, if you want the shell to print the first sentence
of Thomas Hardy's The Return of the Native, you can type this:
$ print A Saturday afternoon in November was approaching the \
> time of twilight, and the vast tract of unenclosed wild known \
> as Egdon Heath embrowned itself moment by moment.
Or you can do it this way:
$ print 'A Saturday afternoon in November was approaching the
> time of twilight, and the vast tract of unenclosed wild known
> as Egdon Heath embrowned itself moment by moment.'
There is a difference between the two methods.
The first prints the sentence as one long line.
The second preserves the embedded newlines. Try both, and you'll see
the difference.
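The same two techniques work inside a script file, where no secondary prompt appears. The sketch below uses a shortened version of the sentence and, as an assumption for portability, a printf-based helper named say in place of print.

```shell
# say: hypothetical stand-in for ksh's print; joins its arguments with spaces.
say() { printf '%s\n' "$*"; }

# Backslash at end of line: the newline is removed and the words form one line.
joined=$(say A Saturday afternoon in November was approaching the \
time of twilight)

# Newline inside single quotes: it is kept as part of the string.
kept=$(say 'A Saturday afternoon in November was approaching the
time of twilight')

say "$joined"
say "$kept"
```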
1.9.5. Control Keys
Control keys -- those that
you type by holding down the CONTROL (or CTRL) key and hitting
another key -- are another type of special character. These normally
don't print anything on your screen, but the operating system
interprets a few of them as special commands. You already know
one of them:
ENTER is actually the same as CTRL-M (try it and see).
You have probably also used the BACKSPACE or DEL key to erase
typos on your command line.
Actually, many control keys have functions that don't really
concern you -- yet you should know about them for future reference
and in case you type them by accident.
Perhaps the most difficult thing about control keys is that they
can differ from system
to system. The usual arrangement is shown in
Table 1-7,
which
lists the control keys that all major modern versions of Unix support.
Note that CTRL-\ and CTRL-| (control-backslash and control-pipe)
are the same character notated two
different ways; the same is true of DEL and CTRL-?.
You can use the stty(1) command to find out what your settings
are and change them if you wish; see Chapter 8 for details.
On modern Unix systems (including GNU/Linux), use stty -a to
see your control-key settings:
$ stty -a
speed 38400 baud; rows 24; columns 80; line = 0;
intr = ^C; quit = ^\; erase = ^H; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W;
lnext = ^V; flush = ^O; min = 1; time = 0;
...
The ^X notation stands for
CTRL-X.
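If you only care about one or two settings, you can filter the output. The following is a minimal sketch, not a standard tool: the function name stty_setting is made up, and it assumes text in the "name = value;" layout shown above. It is tried here on a sample line copied from that output; on a real terminal you could pass "$(stty -a)" instead.

```shell
# stty_setting (hypothetical helper): print the value of one setting
# from text laid out as "name = value;" pairs, as in the stty -a output above.
stty_setting() {
  printf '%s\n' "$1" | tr ';' '\n' | sed -n "s/^ *$2 = //p"
}

# Sample line copied from the output above.
sample='intr = ^C; quit = ^\; erase = ^H; kill = ^U; eof = ^D; eol = <undef>;'

erase_key=$(stty_setting "$sample" erase)
intr_key=$(stty_setting "$sample" intr)
printf 'erase is %s, intr is %s\n' "$erase_key" "$intr_key"
```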
Table 1-7. Control keys
Control key | stty name | Function description |
CTRL-C | intr | Stop current command. |
CTRL-D | eof | End of input. |
CTRL-\ or CTRL-| | quit | Stop current command, if CTRL-C doesn't work. |
CTRL-S | stop | Halt output to screen. |
CTRL-Q | start | Restart output to screen. |
BACKSPACE or CTRL-H | erase | Erase last character. This is the most common setting. |
DEL or CTRL-? | erase | Erase last character. This is a common alternative setting for the erase character. |
CTRL-U | kill | Erase entire command line. |
CTRL-Z | susp | Suspend current command (see Chapter 8). |
CTRL-R | rprnt | Reprint the characters entered so far. |
The control key you will probably use most often is CTRL-C, sometimes
called the interrupt key. This stops -- or tries to stop -- the
command that is currently running. You will want to use this when
you enter a command and find that it's taking too long, when you gave it
the wrong arguments by mistake, when you change your mind about wanting to
run it, and so on.
Sometimes CTRL-C doesn't work; in that case, if you
really want to stop a job, try CTRL-\. But don't just type
CTRL-\; always try CTRL-C first!
Chapter 8 explains why in
detail. For now, suffice it to say that CTRL-C gives the running job
more of a chance to clean up before exiting, so that files and
other resources are not left in funny states.
We've already seen an example of CTRL-D.
When you are running a command that accepts standard input from
your keyboard, CTRL-D (as the first character on the line) tells the process that your input
is finished -- as if the process were reading a file and it reached the
end of the file.
mail is a utility in which this happens often.
When you are typing in a message, you end by
typing CTRL-D. This tells mail that your message is complete
and ready to be sent. Most utilities that accept standard
input understand CTRL-D as the end-of-input character, though many such
programs accept commands like q, quit, exit, etc.
The shell itself understands CTRL-D as the end-of-input character:
as we saw earlier in this chapter, you can normally end a login session
by typing CTRL-D at the shell prompt. You are just telling the shell
that its command input is finished.
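The same end-of-input condition arises without a keyboard whenever a command's input runs out. In the sketch below (an illustration using the standard wc utility, which this section does not otherwise discuss), wc reads from a pipe and stops when the data ends, just as it would if you typed the lines yourself and then pressed CTRL-D.

```shell
# wc -l counts lines until it sees end of input; here the pipe supplies it.
# tr strips the padding that some versions of wc put around the number.
line_count=$(printf 'first line\nsecond line\n' | wc -l | tr -d ' ')

printf 'wc read %s lines before end of input\n' "$line_count"
```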
CTRL-S and CTRL-Q are called flow-control characters.
They represent
an antiquated way of stopping and restarting the flow of output from
one device to another (e.g., from the computer to your terminal)
that was useful when the speed of such output was low.
They are rather obsolete in these days of high-speed local networks
and dialup lines,
where CTRL-S and CTRL-Q are basically a nuisance.
The only thing you really need to know about them is that if your screen
output becomes "stuck," then you may have hit CTRL-S by accident.
Type CTRL-Q to restart the output; any keys you may have hit in
between will then take effect.
The final group of control characters gives you rudimentary ways to
edit your command line.
BACKSPACE or CTRL-H
acts as a backspace key (in fact,
some systems use the DEL or CTRL-? keys as "erase" instead of
BACKSPACE);
CTRL-U erases the entire line and lets you start over.
Again, most of these are outmoded.
Instead of using these, go
to the next chapter and read about the Korn shell's editing
modes, which are among its most exciting features.
Copyright © 2003 O'Reilly & Associates. All rights reserved.