9.6. Be an Expert on find Search Operators

find is admittedly tricky. Once you get a handle on its abilities, you'll learn to appreciate its power. But before thinking about anything remotely tricky, let's look at a simple find command:

    % find . -name "*.c" -print

The . tells find to start its search in the current directory and to descend into all of its subdirectories. The -name "*.c" tells find to match files whose names end in .c. The -print operator tells find what to do with each match: print its pathname on standard output.

All find commands, no matter how complicated, are really just variations on this one. You can specify many different names, look for old files, and so on; no matter how complex, you're really only specifying a starting point, some search parameters, and what to do with the files (or directories or links or ...) you find.

The key to using find in a more sophisticated way is realizing that search parameters are really "logical expressions" that find evaluates. That is, find looks at each file in turn, evaluates the expression against it, and takes the specified action (such as printing the file's name) only if the expression is true.
So, -name "*.c" is really a logical expression that evaluates to true if the file's name ends in .c. Once you've gotten used to thinking this way, it's easy to use the AND, OR, NOT, and grouping operators.

So let's think about a more complicated find command. Let's look for files that end in .o or .tmp AND that were last accessed more than five days ago, AND let's print their pathnames. We want an expression that evaluates true for files whose names match either *.o OR *.tmp:

    -name "*.o" -o -name "*.tmp"

If either condition is true, we want to check the access time. So we put the previous expression within parentheses (quoted with backslashes so the shell doesn't treat the parentheses as subshell operators). We also add an -atime operator:

    -atime +5 \( -name "*.o" -o -name "*.tmp" \)

The parentheses force find to evaluate what's inside as a unit. The expression is true if "the access time is more than five days ago and (either the name ends with .o or the name ends with .tmp)." If you didn't use parentheses, the expression would mean something different:

    -atime +5 -name "*.o" -o -name "*.tmp"        Wrong!

When find sees two operators next to each other with no -o between them, that means AND, and AND binds more tightly than -o. So the "wrong" expression is true if "either (the access time is more than five days ago and the name ends with .o) or the name ends with .tmp." This incorrect expression would be true for any name ending with .tmp, no matter how recently the file was accessed; the -atime doesn't apply to it. (There's nothing really "wrong" or illegal in this second expression, except that it's not what we want. find will accept the expression and do what we asked; it just won't do what we want.)

The following command, which is what we want, lists files in the current directory and subdirectories that match our criteria:

    % find . -atime +5 \( -name "*.o" -o -name "*.tmp" \) -print

What if we wanted to list all files that do not match these criteria? All we want is the logical inverse of this expression.
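The difference between the grouped and ungrouped expressions is easy to check in a scratch directory. What follows is only a minimal sketch under some assumptions: the filenames are made up for the demo, mktemp -d is assumed to be available (it is common but not universal), and touch -a -t is used to backdate the access time of two of the files to 1 January 2000.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
# (mktemp -d is an assumption; any empty scratch directory would do.)
demo=$(mktemp -d)
cd "$demo" || exit 1

# Hypothetical demo files: two stale, two fresh.
touch old.o old.tmp new.o new.tmp
touch -a -t 200001010000 old.o old.tmp   # backdate access time only

# Grouped: accessed more than 5 days ago AND (name is *.o OR *.tmp).
grouped=$(find . -atime +5 \( -name "*.o" -o -name "*.tmp" \) -print | LC_ALL=C sort)

# Ungrouped: (old AND *.o) OR (*.tmp); -atime no longer applies to *.tmp.
# No explicit -print here: an explicit one would be ANDed only with the
# -name "*.tmp" branch, so we rely on find's default -print, which
# applies to the whole expression.
ungrouped=$(find . -atime +5 -name "*.o" -o -name "*.tmp" | LC_ALL=C sort)

printf 'grouped:\n%s\n\n' "$grouped"
printf 'ungrouped:\n%s\n' "$ungrouped"

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

The grouped command should list only ./old.o and ./old.tmp, while the ungrouped one should also list ./new.tmp, even though that file was accessed moments ago.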
The NOT operator is an exclamation point (!). Like the parentheses, ! can be special to the shell (it is the history character in csh-style shells), so it's safest to escape it with a backslash to keep the shell from interpreting it before find can get to it. The ! operator applies to the expression on its right. Since we want it to apply to the entire expression, and not just the -atime operator, we'll have to group everything from -atime to "*.tmp" within another set of parentheses:

    % find . \! \( -atime +5 \( -name "*.o" -o -name "*.tmp" \) \) -print

For that matter, even -print is an expression; it always evaluates to true. So are -exec and -ok; they evaluate to true when the command they execute returns a zero status. (There are a few situations in which this can be used to good effect.)

But before you try anything too complicated, you need to realize one thing. find isn't as sophisticated as you might like it to be. You can't squeeze all the spaces out of expressions, as if it were a real programming language. You need spaces before and after operators like !, (, ), and {}, in addition to spaces before and after every other operator. Therefore, a command line like the following won't work:

    % find . \!\(-atime +5 \(-name "*.o" -o -name "*.tmp"\)\) -print

A true power user will realize that find is relying on the shell to separate the command line into meaningful chunks, or tokens. And the shell, in turn, is assuming that tokens are separated by spaces. When the shell gives find a chunk of characters like *.tmp)) (without the double quotes or backslashes, which the shell has already removed), find gets confused; it thinks you're talking about a weird filename pattern that includes a couple of parentheses.

Once you start thinking about expressions, find's syntax ceases to be obscure; in some ways, it's even elegant. It certainly allows you to say what you need to say with reasonable efficiency.

--ML and JP

Copyright © 2003 O'Reilly & Associates. All rights reserved.