17.19 Finding Files (Much) Faster with a find Database
If you use find (17.2) to search for files, you know that it can take a long time to run, especially when there are lots of directories to search. Here are some ideas for speeding up your find searches.
If your system has "fast find" or GNU locate (17.18), that's probably all you need. Either one lets you search a list of all pathnames on the system.
Even if you have fast find or locate, it still might not do what you need. For example, those utilities search only for pathnames. To find files by the owner's name, the number of links, the size, and so on, you have to use "slow" find. In that case, or when you don't have fast find or locate, you may want to set up your own version.
The basic fast find has two parts. One part is a command, a shell script named /usr/lib/find/updatedb , that builds a database of the files on your system - if your system has it, take a look to see a fancy way to build the database. The other part is the find command itself - it searches the database for pathnames that match the name (regular expression) you type.
To make your own fast find:
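As a minimal sketch of the database-building half, assuming the list is kept in $HOME/.fastfind (the filename the later examples in this article use), something like the function below would do. The names FASTFIND_DB and build_fastfind are made up for illustration, and the sed step that strips the leading ./ is just one possible choice.

```shell
# Sketch: build a pathname database for one directory tree.
# FASTFIND_DB and build_fastfind are illustrative names, not standard ones.
FASTFIND_DB=${FASTFIND_DB:-$HOME/.fastfind}

build_fastfind() {
    # $1 = top of the tree to index (defaults to the home directory)
    top=${1:-$HOME}
    # Write to a temporary file first so a search never reads a half-built list,
    # then move it into place only if the listing succeeded.
    ( cd "$top" && find . -print | sed 's@^\./@@' ) > "$FASTFIND_DB.new" &&
        mv "$FASTFIND_DB.new" "$FASTFIND_DB"
}
```

Running build_fastfind from cron or at every night or so should keep the list reasonably fresh; how often is worthwhile depends on how quickly your files change.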
To search the database, type:
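For instance, assuming the database is a plain list with one pathname per line in $HOME/.fastfind, the search command could be little more than a wrapper around grep. The sketch below uses the name ffind, as the rest of this article does; FASTFIND_DB is a made-up variable for illustration. With it defined, typing ffind 'report.*\.txt$' would list every stored pathname that ends with a matching name.

```shell
# Sketch: search the pathname database with an extended regular expression.
# FASTFIND_DB is an illustrative variable name, not a standard one.
FASTFIND_DB=${FASTFIND_DB:-$HOME/.fastfind}

ffind() {
    # $1 = regular expression to match against each stored line;
    # "--" keeps a pattern that starts with "-" from being read as an option.
    grep -E -- "$1" "$FASTFIND_DB"
}
```

Because the search reads one flat file instead of walking the directory tree, it should finish quickly even on a large tree. The price is that the answers are only as current as the last rebuild.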
You can do much more. I'll get you started. If you have room to store more information than just pathnames, you can feed your find output to a command like ls -l or sls (16.29). For example, if you do a lot of work with links (18.3), you might want to keep the files' i-numbers (1.22) as well as their names. You'd build your database with a command like the one below. Use xargs (9.21) or something like it (9.20).
    cd
    find . -print | xargs ls -id > $HOME/.fastfind
Or, if your version of find has the handy -ls operator, use the next command. Watch out for really large i-numbers; they might shift the columns and make cut (35.14) give wrong output.
    cd
    find . -ls | cut -c1-7,67- > $HOME/.fastfind
The exact column numbers will depend on your system. Then, your ffind script could search for files by i-number. For instance, if you had a file with i-number 1234 and you wanted to find all its links:
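As a rough sketch, assuming each database line starts with optional blanks, then the i-number, then a space (the layout ls -id typically produces), the search could look like the function below; you would type inum_links 1234. The names inum_links and FASTFIND_DB are hypothetical, and the pattern would need adjusting if your columns are laid out differently.

```shell
# Sketch: list every database line whose i-number field is exactly $1.
# inum_links and FASTFIND_DB are illustrative names, not standard ones.
FASTFIND_DB=${FASTFIND_DB:-$HOME/.fastfind}

inum_links() {
    # "^ *" allows for right-aligned i-numbers; the trailing space in the
    # pattern ends the match at the end of the i-number field.
    grep -E -- "^ *$1 " "$FASTFIND_DB"
}
```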
(The space at the end prevents matches with i-numbers like 12345.) You could also search by pathname.
Article 16.21 shows another find database setup, a list of directories or files with the same names.
With some information about UNIX shell programming and utilities like awk (33.11), the techniques in this article should let you build and search a sophisticated file database - and get information much faster than with plain old find.