16.6. Counting Lines, Words, and Characters: wc

The wc (word count) command counts the number of lines, words, and characters in the files you specify. (Like most Unix utilities, wc reads from its standard input if you don't specify a filename.) For example, the file letter has 120 lines, 734 words, and 4,297 characters:

    % wc letter
    120 734 4297 letter

You can restrict what is counted by specifying the options -l (count lines only), -w (count words only), and -c (count characters only). For example, you can count the number of lines in a file:

    % wc -l letter
    120 letter

or you can count the number of files in a directory:

    % cd man_pages
    % ls | wc -w
    233

The first example uses a file as input; the second example pipes the output of an ls command to the input of wc. (Be aware that the -a option (Section 8.9) makes ls list dot files. If your ls command is aliased (Section 29.2) to include -a or other options that add words to the normal output -- such as the line total nnn from ls -l -- then you may not get the results you want.)

The following command will tell you how many more words are in new.file than in old.file:

    % expr `wc -w < new.file` - `wc -w < old.file`

Many shells have built-in arithmetic commands and don't really need expr; however, expr works in all shells.

NOTE: In a programming application, you'll usually want wc to read the input files by using a < character, as shown earlier.
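As a minimal sketch of the redirection form recommended here, the fragment below wraps the expr command in two small functions. The names word_count and word_diff and the temporary demo files are assumptions made for illustration; they are not part of the original text.

```shell
# Hypothetical helper: print the number of words in one file.
word_count() {
    # Reading from standard input keeps the filename out of wc's output.
    wc -w < "$1"
}

# Hypothetical helper: print how many more words file $1 has than file $2.
word_diff() {
    # The backquotes are left unquoted on purpose, so the shell drops any
    # spaces that wc puts around the numbers before expr sees them.
    expr `word_count "$1"` - `word_count "$2"`
}

# Demo with throwaway files (the contents are made up for this example).
tmp=`mktemp -d`
printf 'one two three four five\n' > "$tmp/new.file"
printf 'one two three\n' > "$tmp/old.file"
word_diff "$tmp/new.file" "$tmp/old.file"    # prints 2
rm -r "$tmp"
```

One thing to keep in mind if you test the exit status of word_diff: expr returns a nonzero status when its result is 0, so two files with equal counts do not look like "success" to the shell.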
If instead you say:

    % expr `wc -w new.file` - `wc -w old.file`

wc prints each filename after its count, so the filenames end up among the arguments to expr, which then fails with a syntax error.

Taking this concept a step further, here's a simple shell script to calculate the difference in word count between two files:

    count_1=`wc -w < "$1"`                  # number of words in file 1
    count_2=`wc -w < "$2"`                  # number of words in file 2
    diff_12=`expr $count_1 - $count_2`      # difference in word count

    # if $diff_12 is negative, reverse order and don't show the minus sign:
    case "$diff_12" in
    -*) echo "$2 has `expr $diff_12 : '-\(.*\)'` more words than $1" ;;
    *)  echo "$1 has $diff_12 more words than $2" ;;
    esac

If this script were called count.it, then you could invoke it like this:

    % count.it draft.2 draft.1
    draft.1 has 23 more words than draft.2

You could modify this script to count lines or characters.

NOTE: With many versions of wc, unless the counts are very large, the output will have leading spaces. This can cause trouble in scripts if you aren't careful. For instance, in the previous script, the command:

    echo "$1 has $count_1 words"

prints those leading spaces between "has" and the count, because the double quotes preserve them.

Finally, two notes about file size:
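Returning to the leading-spaces note above: one way to sidestep the problem is to let the shell split the wc output into words before you use it. The sketch below is only an illustration; the function names trimmed_word_count and report, and the demo file, are assumptions that do not come from the original script.

```shell
# Hypothetical helper: print the word count of a file with padding removed.
trimmed_word_count() {
    count=`wc -w < "$1"`
    # Unquoted on purpose: the shell splits $count on whitespace, so echo
    # receives only the digits and any leading spaces are dropped.
    echo $count
}

# Hypothetical helper: print a one-line report for a file.
report() {
    n=`trimmed_word_count "$1"`
    echo "$1 has $n words"
}

# Demo with a throwaway file (the contents are made up for this example).
tmp=`mktemp -d`
printf 'alpha beta gamma\n' > "$tmp/draft.1"
(cd "$tmp" && report draft.1)    # prints: draft.1 has 3 words
rm -r "$tmp"
```

Because the count is cleaned up once, inside trimmed_word_count, the message itself can stay safely inside double quotes.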
--JP, DG, and SP

Copyright © 2003 O'Reilly & Associates. All rights reserved.