If you have multiple CPUs, there is potential for
considerable speedup by compiling multiple source
files in parallel. Each file is independent of
the next, so creating multiple object files
simultaneously gets more work done in the same
amount of wall-clock time.
The changes are relatively straightforward:
fire off each compilation pipeline into the background,
and then add a wait command
before doing the final link step:
# initialize option-related variables
do_link=true
debug=""
link_libs=""
clib="-lc"
exefile=""
objfiles=""
# initialize pipeline components
compile=" | ccom"
assemble=" | as"
optimize=""
# process command-line options
# process command-line options
while getopts "cgl:[lib]o:[outfile]O files ..." opt; do
    case $opt in
        c ) do_link=false ;;
        g ) debug="-g" ;;
        l ) link_libs+=" -l $OPTARG" ;;
        o ) exefile="-o $OPTARG" ;;
        O ) optimize=" | optimize" ;;
    esac
done
shift $(($OPTIND - 1))
# process the input files
for filename in "$@"; do
    case $filename in
        *.c )
            objname=${filename%.c}.o ;;
        *.s )
            objname=${filename%.s}.o
            compile="" ;;
        *.o )
            objname=$filename    # just link it directly with the rest
            compile=""
            assemble="" ;;
        * )
            print -u2 "error: $filename is not a source or object file."
            exit 1 ;;
    esac
    # run a pipeline for each input file; parallelize by backgrounding
    eval cat \$filename $compile $assemble $optimize \> \$objname &
    objfiles+=" $objname"
    # restore the pipeline stages for the next file
    compile=" | ccom"
    assemble=" | as"
done
wait    # wait for all compiles to finish before linking
if [[ $do_link == true ]]; then
    ld $exefile $objfiles $link_libs $clib
fi
This is a straightforward example of parallelization,
with the only "gotcha" being to make sure that all the compilations
are done before doing the final link step.
Indeed, many versions of make have a
"run this many jobs in parallel" flag, precisely to obtain
the speedup from simultaneous compilation of independent files.
But all of life is not so simple; sometimes just
firing more jobs off into the background won't do the trick.
For example, consider multiple changes to the same database:
the database software (or something, somewhere) has to ensure
that two different processes aren't trying to update the
same record at the same time.
Things get even more involved when working at a lower level,
with multiple threads of control within a single process
(something not visible at the shell level, thankfully).
Such problems, known as concurrency control issues, become
much more difficult as the complexity of the application
increases. Complex concurrent programs often have much more
code for handling the special cases than for the actual job
the program is supposed to do!