Thompson's rule for first-time telescope makers: "It is faster to make a four-inch mirror, then a six-inch mirror, than to make a six-inch mirror."
- Programming Pearls, Communications of the ACM, Sept. 1985
Scripting is almost always a more pleasant and productive alternative to using a systems programming language. Scripting languages aren't designed to do everything,[1] however, and there comes a time when you need to dig down to C/C++ for speed, fine-grained data structures, type safety, and access to existing libraries. The ability of languages such as Perl, Visual Basic, Python, and Tcl to integrate well with C accords them the status of serious development languages, in contrast to awk and early versions of BASIC, which were seldom used for production applications.
In this chapter, we will examine what it takes to cement Perl and C code together and then study two tool sets that do a remarkable job of performing this binding for us. The first is a pair of tools called h2xs and xsubpp, packaged with the Perl distribution. For brevity, we will refer to this pair as XS,[2] because it involves an intermediate language of the same name. The other tool is SWIG (Simplified Wrapper and Interface Generator), written by Dave Beazley at the University of Utah.
We'll cover an often-used subset of these tools' capabilities and learn that a lot can be achieved without having to know anything at all about the internal Perl API. But a number of powerful features will have to wait until the section "Meaty Extensions" in Chapter 20, Perl Internals.
This chapter requires you to have the following handy: the C::Scan and Data::Flow modules, both required by h2xs and available from CPAN, and the gd library for creating GIF files, downloadable from www.boutell.com.
Figure 18.1 shows a file called testmatrix.pl making a call to an underlying Matrix library written in C. To bind the two sets of code together, we need to have some glue code, indicated by the dark gray boxes.
XS and SWIG both create this glue code in two files - a Perl module and a C wrapper file - and address the following issues:
C header files (such as Matrix.h) contain data structure declarations, preprocessor macros, publicly accessible variables, and function prototypes - essentially, the interface for a C library. You are typically not interested in making everything available to a Perl script; there's nothing worse than attempting C programming in Perl. In most cases, it suffices to export a subset of public functions, and some constants (which are available as initialized variables, #defines, or enums). We refer to them collectively as the public interface and extract them into a public header file.
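To make the idea of a public interface concrete, here is a minimal sketch of what such a trimmed-down header might contain. Every name in it (MATRIX_MAX_DIM, matrix_status, matrix_epsilon, and the three functions) is hypothetical and chosen only for illustration; the real Matrix.h is not reproduced here. The header part and a stub implementation are combined in one listing so that the sketch compiles on its own.

```c
/* ---- Hypothetical public header: only what a script needs to see ---- */

#define MATRIX_MAX_DIM 64               /* constant exposed as a #define */

enum matrix_status {                    /* constants exposed as an enum */
    MATRIX_OK      = 0,
    MATRIX_BAD_DIM = 1
};

extern const double matrix_epsilon;     /* constant exposed as an initialized variable */

/* A small subset of functions, all with simple scalar signatures. */
int matrix_check_dims(int rows, int cols);
int matrix_cell_count(int rows, int cols);
int matrix_nearly_equal(double a, double b);

/* ---- Stub implementation (assumed, for illustration only) ---- */

const double matrix_epsilon = 1e-9;

/* Returns MATRIX_OK when both dimensions lie in 1..MATRIX_MAX_DIM. */
int matrix_check_dims(int rows, int cols)
{
    if (rows < 1 || cols < 1 || rows > MATRIX_MAX_DIM || cols > MATRIX_MAX_DIM)
        return MATRIX_BAD_DIM;
    return MATRIX_OK;
}

/* Returns rows * cols, or -1 when the dimensions are invalid. */
int matrix_cell_count(int rows, int cols)
{
    if (matrix_check_dims(rows, cols) != MATRIX_OK)
        return -1;
    return rows * cols;
}

/* Returns 1 when a and b differ by less than matrix_epsilon, else 0. */
int matrix_nearly_equal(double a, double b)
{
    double diff = (a > b) ? (a - b) : (b - a);
    return diff < matrix_epsilon;
}
```

Note what is left out: no struct layouts, no internal helper macros, no functions that hand back raw buffers. The less a script has to know about C-level details, the less glue code either tool has to produce.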
Figure 18.2 shows how the Matrix library's header file is used as input for the two sets of tools.
The public header file may contain complex C declarations. SWIG expects you, the extension developer, to boil the interface down to a still simpler form and express it in its interface definition language. Fortunately, this language is close enough to ANSI C and simple C++ that a large number of header files don't need any translation at all. From the interface description, SWIG generates the glue code; in the Matrix case, it will be Matrix.pm and Matrix_wrap.c. If your system supports dynamic linking (shared libraries on Unix, and DLLs on Windows), and if the Perl executable has been built to use it, all that is left to be done is to convert the glue code and your C library into a dynamic library. If dynamic linking is not an option, then a new Perl executable is generated by statically linking the Perl archive library (libperl.a on Unix or perl.lib on Microsoft Windows) with the pieces of code mentioned above.
h2xs and xsubpp take a slightly different approach. h2xs understands C header files (but not C++) and converts all constants and function prototypes to a metalanguage called XS. But a function declaration may still be too complex for scripting purposes, so this approach expects you to twiddle with the .xs file produced by h2xs and take the necessary steps to simplify the interface. Of course, the hand conversion is unnecessary if the interface is already simple enough. The XS language is a mixture of C and funny keywords and provides directives for you to override the glue code produced by xsubpp.
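As an illustration of what "too complex for scripting purposes" can mean, consider a function that returns its results through pointer arguments. The sketch below is plain C, not XS or SWIG syntax, and the Matrix struct and all function names are hypothetical. It shows one way to simplify such a declaration by hand: wrap it in functions that each return a single plain value, which either tool can then expose without special treatment.

```c
#include <stddef.h>

/* Hypothetical library type; the real Matrix.h is not shown in the text. */
typedef struct {
    int     rows;
    int     cols;
    double *data;    /* row-major storage, unused in this sketch */
} Matrix;

/* Original-style call: results come back through pointer arguments,
 * which is awkward to express from a script. Returns 0 on success. */
int matrix_get_dims(const Matrix *m, int *rows, int *cols)
{
    if (m == NULL || rows == NULL || cols == NULL)
        return -1;
    *rows = m->rows;
    *cols = m->cols;
    return 0;
}

/* Simplified wrappers: one plain return value each; -1 signals an error. */
int matrix_rows(const Matrix *m)
{
    int r, c;
    return (matrix_get_dims(m, &r, &c) == 0) ? r : -1;
}

int matrix_cols(const Matrix *m)
{
    int r, c;
    return (matrix_get_dims(m, &r, &c) == 0) ? c : -1;
}
```

Wrappers like these are only one option. Both tools also offer their own mechanisms for dealing with awkward argument types, so treat this as a fallback that works regardless of which tool you pick.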
Incidentally, the code generated by both tools is quite similar, and it is perfectly acceptable to have some extensions built using the XS approach and some using SWIG. Which brings us to the question: which one should you use?
Differences in SWIG's and XS's features spring from differences in their design goals. SWIG is designed to help create a scripting language wrapper over a C library and supports Python, Tcl, and Guile in addition to Perl. In contrast, XS is designed only for Perl and allows for a number of Perlisms that SWIG cannot easily generalize to the other languages.
I prefer SWIG to the XS approach because it feels a lot cleaner, is far less internals-oriented than XS is, and supports multiple languages. In addition, it has excellent support for data structures (not just functions), whereas XS supports only functions. I build C++ and Java applications for a living, so my focus is typically more on the application than on the scripting frontend - I leave the choice of scripting language to the user. Your mileage may vary.
You'll find that all modules in the Perl distribution and on CPAN are currently written by using XS. The chief reason is that XS comes bundled with Perl. Besides, it has supported powerful features such as typemaps since its inception, whereas SWIG has been beefed up only recently. If you have to understand or modify any of the CPAN modules, you have to know XS.
Both tools provide significant degrees of freedom to compensate for most deficiencies, so my advice is to pick one and go with it.