5.2 for

The most obvious enhancement we could make to the previous script is the ability to report on multiple files instead of just one. Tests like -a and -d only take single arguments, so we need a way of calling the code once for each file given on the command line.

The way to do this (indeed, the way to do many things with the Korn shell) is with a looping construct. The simplest and most widely applicable of the shell's looping constructs is the for loop. We'll use for to enhance fileinfo soon.

The for loop allows you to repeat a section of code a fixed number of times. During each time through the code (known as an iteration), a special variable called a loop variable is set to a different value; this way each iteration can do something slightly different.

The for loop is somewhat, but not entirely, similar to its counterparts in conventional languages like C and Pascal. The chief difference is that the shell's for loop doesn't let you specify a number of times to iterate or a range of values over which to iterate; instead, it only lets you give a fixed list of values. In other words, you can't do anything like this Pascal-type code, which executes statements 10 times:

    for x := 1 to 10 do
    begin
        statements...
    end

(You need the while construct, which we'll see soon, to construct this type of loop. You also need the ability to do integer arithmetic, which we will see in Chapter 6, Command-line Options and Typed Variables.)

However, the for loop is ideal for working with arguments on the command line and with sets of files (e.g., all files in a given directory). We'll look at an example of each of these. But first, we'll show the syntax for the for construct:
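
    for name [in list]
    do
        statements that can use $name...
    done

The list is a list of names. (If in list is omitted, the list defaults to "$@", i.e., the quoted list of command-line arguments.)

Task 5.2
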
The command finger(1) can be used (among other things) to find the names of users logged into a remote system; the command finger @systemname does this. Its output depends on the version of UNIX, but it looks something like this:

    [motet.early.com]
    Trying 127.146.63.17...
    -User-   -Full name-           -What-  Idle TTY -Console Location-
    hildy    Hildegard von Bingen  ksh     2d5h p1  jem.cal (Telnet)
    mikes    Michael Schultheiss   csh     1:21 r4  ncd2.cal (X display 0)
    orlando  Orlando di Lasso      csh       28 r7  maccala (Telnet)
    marin    Marin Marais          mush    1:02 pb  mussell.cal (Telnet)
    johnd    John Dowland          tcsh      17 p0  nugget.west.nobis. (X Window)

In this output, motet.early.com is the full network name of the remote machine.

Assume the systems in your network are called fred, bob, dave, and pete. Then the following code would do the trick:

    for sys in fred bob dave pete
    do
        finger @$sys
        print
    done

This works no matter which of the systems you are currently logged into. It prints output for each machine similar to the above, with blank lines in between.

A slightly better solution would be to store the names of the systems in an environment variable. This way, if systems are added to your network and you need a list of their names in more than one script, you need to change them in only one place. If a variable's value is several words separated by blanks (or TABs), for will treat it as a list of words.

Here is the improved solution. First, put lines in your .profile or environment file that define the variable SYSNAMES and make it an environment variable:

    SYSNAMES="fred bob dave pete"
    export SYSNAMES

Then, the script can look like this:

    for sys in $SYSNAMES
    do
        finger @$sys
        print
    done

The foregoing illustrated a simple use of for, but it's much more common to use for to iterate through a list of command-line arguments. To show this, we can enhance the fileinfo script above to accept multiple arguments. First, we write a bit of "wrapper" code that does the iteration:

    for filename in "$@" ; do
        finfo "$filename"
        print
    done

Next, we make the original script into a function called finfo: [11]

    function finfo {
        if [[ ! -a $1 ]]; then
            print "file $1 does not exist."
            return 1
        fi
        ...
    }

The complete script consists of the above function and the for loop code. The function definition should go first: besides being good programming style, this ensures that the shell has already read the definition by the time the loop calls the function.
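
As a minimal sketch of that layout, the following self-contained script defines a function first and then loops over a list of arguments. It is an illustration only: the function describe is a hypothetical, much-reduced stand-in for finfo, the set -- line fakes a command line so that the sketch runs by itself, and POSIX constructs (printf, [ ... ], name() { ... }) are used in place of print, [[ ... ]], and function so that it behaves the same in ksh and in other POSIX shells.

```shell
# Hypothetical, much-reduced stand-in for the finfo function.
describe() {
    if [ ! -e "$1" ]; then
        printf 'file %s does not exist.\n' "$1"
        return 1
    fi
    if [ -d "$1" ]; then
        printf '%s is a directory.\n' "$1"
    elif [ -f "$1" ]; then
        printf '%s is a regular file.\n' "$1"
    else
        printf '%s is some other kind of file.\n' "$1"
    fi
}

# Fake a command line so the sketch runs on its own; in a real script
# the positional parameters come from the user.
set -- / /dev/null

# The wrapper loop: one call to the function per argument.
for filename in "$@"; do
    describe "$filename"
    printf '\n'              # blank line between reports
done
```

Quoting both "$@" and "$filename" keeps an argument that contains blanks together as a single word on its way into the function.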
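The fileinfo script works as follows: in the for statement, "$@" is the list of all command-line arguments. For each argument, the body of the loop runs with filename set to that argument, so finfo is called once per file with that name as its first argument ($1). The print after the call to finfo simply prints a blank line between the reports for each file.
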
Given a directory with the same files as the previous example, typing fileinfo * would produce the following output:

    bob is a regular file.
    you own the file.
    you have read permission on the file.
    you have write permission on the file.
    you have execute permission on the file.

    custom.tbl is a regular file.
    you own the file.
    you have read permission on the file.
    you have write permission on the file.

    exp is a directory that you may search.
    you own the file.
    you have read permission on the file.
    you have write permission on the file.

    lpst is a regular file.
    you do not own the file.
    you have read permission on the file.

Here is a programming task that exploits the other major use of for.

Task 5.3

DOS filenames have the format FILENAME.EXT. FILENAME can be up to eight characters long; EXT is an extension that can be up to three characters. The dot is required even if the extension is null; letters are all uppercase. We want to do the following:

1. Translate letters from uppercase to lowercase.
2. If the extension is null, remove the dot.

The first tool we will need for this job is the UNIX tr(1) utility, which translates characters on a one-to-one basis. Given the arguments charset1 and charset2, it will translate characters in the standard input that are members of charset1 into corresponding characters in charset2. The two sets are ranges of characters enclosed in square brackets ([] in standard regular-expression form in the manner of grep, awk, ed, etc.). More to the point, tr '[A-Z]' '[a-z]' takes its standard input, converts uppercase letters to lowercase, and writes the converted text to the standard output. (The quotes keep the shell from treating the brackets as filename wildcards.)

That takes care of the first step in the translation process. We can use a Korn shell string operator to handle the second. Here is the code for a script we'll call dosmv:

    for filename in ${1:+$1/}* ; do
        newfilename=$(print "$filename" | tr '[A-Z]' '[a-z]')
        newfilename=${newfilename%.}
        print "$filename -> $newfilename"
        mv "$filename" "$newfilename"
    done

The * in the for construct is not the same as $*. It's a wildcard, i.e., all files in a directory.

This script accepts a directory name as argument, the default being the current directory. The expression ${1:+$1/} evaluates to the argument ($1) with a slash appended if the argument is supplied, or the null string if it isn't supplied. So the entire expression ${1:+$1/}* evaluates to all files in the given directory, or all files in the current directory if no argument is given.

Therefore, filename takes on the value of each filename in the list. filename gets translated into newfilename in two steps. (We could have done it in one, but readability would have suffered.) The first step uses tr in a pipeline within a command substitution construct. Our old friend print makes the value of filename the standard input to tr. tr's output becomes the value of the command substitution expression, which is assigned to newfilename. Thus, if $filename were DOSFILE.TXT, newfilename would become dosfile.txt.
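
To see what the pieces of this loop do without renaming anything, they can be tried in isolation. The sketch below is an illustration only: the helper names prefix_for and unix_name are hypothetical, and printf is used in place of print so that the same lines run in ksh and in other POSIX shells.

```shell
# ${1:+$1/} yields "dir/" when an argument is given and nothing otherwise.
prefix_for() {
    printf '%s\n' "${1:+$1/}"
}

# The two translation steps from the loop body, applied to a single name.
unix_name() {
    name=$(printf '%s\n' "$1" | tr '[A-Z]' '[a-z]')   # step 1: lowercase
    name=${name%.}                                     # step 2: drop a trailing dot
    printf '%s\n' "$name"
}

printf '[%s]\n' "$(prefix_for)"         # prints []
printf '[%s]\n' "$(prefix_for docs)"    # prints [docs/]
unix_name 'DOSFILE.TXT'                 # prints dosfile.txt
unix_name 'README.'                     # prints readme
```

A name that is already lowercase and has no trailing dot comes back unchanged from both steps.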
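
The second step uses one of the shell's pattern-matching operators, the one that deletes the shortest match it finds at the end of the string. The pattern here is a single dot (.), so ${newfilename%.} deletes a trailing dot if there is one and leaves the name unchanged otherwise.
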
The last statement in the for loop body does the file renaming with the standard UNIX mv(1) command. Before that, a print command simply informs the user of what's happening.

There is one little problem with the solution above: if there are any files in the given directory that aren't DOS files (in particular, if there are files whose names don't contain uppercase letters and don't contain a dot), then the conversion will do nothing to those filenames and mv will be called with two identical arguments. mv will complain with the message: mv: filename and filename are identical. We can solve this problem by letting grep determine whether each file has a DOS filename or not. The grep regular expression:

    [^a-z]\{1,8\}\.[^a-z]\{0,3\}

is adequate (for these purposes) for matching DOS-format filenames. [13] The character class [^a-z] means "any character except a lowercase letter." [14] So the entire regular expression means: "Between 1 and 8 characters that are not lowercase letters, followed by a dot, followed by 0 to 3 characters that are not lowercase letters."
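
The pattern can be tried on a few sample names before it goes into a script. The sketch below is illustrative only: the helper name matches_dos and the sample names are hypothetical, LC_ALL=C is an added assumption that pins down the meaning of the a-z range, and printf stands in for print so that the lines also run in other POSIX shells.

```shell
dos_regexp='[^a-z]\{1,8\}\.[^a-z]\{0,3\}'

# Print only those of the given names that the pattern matches.
# LC_ALL=C (an assumption of this sketch) makes a-z mean exactly
# the 26 ASCII lowercase letters.
matches_dos() {
    printf '%s\n' "$@" | LC_ALL=C grep "$dos_regexp"
}

matches_dos DOSFILE.TXT README. notes lower.txt file1.txt
# Prints:
#   DOSFILE.TXT
#   README.
#   file1.txt
```

The name file1.txt is reported as well because the pattern is not anchored: the fragment "1." alone satisfies it. That is one reason the expression is adequate rather than exact.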

When grep runs, it normally prints all of the lines in its standard input that match the pattern you give it as argument. But we only need it to test whether or not the pattern is matched. Luckily, grep's exit status is "well-behaved": it's 0 if there is a match in the input, 1 if not. Therefore, we can use the exit status to test for a match. We also need to discard grep's output; to do this, we redirect it to the special file /dev/null, which is colloquially known as the "bit bucket." [15] Any output directed to /dev/null effectively disappears. Thus, the command line:

    print "$filename" | grep '[^a-z]\{1,8\}\.[^a-z]\{0,3\}' > /dev/null

prints nothing and returns exit status 0 if the filename is in DOS format, 1 if not. Now we can modify our dosmv script to incorporate this code:

    dos_regexp='[^a-z]\{1,8\}\.[^a-z]\{0,3\}'
    for filename in ${1:+$1/}* ; do
        if print "$filename" | grep "$dos_regexp" > /dev/null; then
            newfilename=$(print "$filename" | tr '[A-Z]' '[a-z]')
            newfilename=${newfilename%.}
            print "$filename -> $newfilename"
            mv "$filename" "$newfilename"
        fi
    done

For readability reasons, we use the variable dos_regexp to hold the DOS filename-matching regular expression.

If you are familiar with an operating system other than DOS and UNIX, you may want to test your script-writing prowess at this point by writing a script that translates filenames from that system's format into UNIX format. Use the above script as a guideline.

In particular, if you know DEC's VAX/VMS operating system, here's a programming challenge:
The first of these is a relatively straightforward modification of dosmv . Number 2 is difficult; here's a strategy hint:
Once you have completed No. 2, you can do No. 3 by adding a single line of code to your script; see if you can figure out how.