Binding Parameters to Statements (Programming the Perl DBI)

5.3. Binding Parameters to Statements

One topic we have mentioned in our discussion of the preparation of statement handles is bind values. You may also have come across the phrases placeholders , parameters , and binding . What are these things?

A bind value is a value that can be bound to a placeholder declared within an SQL statement. This is similar to creating an on-the-fly SQL statement such as:

$sth = $dbh->prepare( "
            SELECT name, location
            FROM megaliths
            WHERE name = " . $dbh->quote( $siteName ) . "
        " );

but instead of interpolating the generated value into the SQL statement, you specify a placeholder and then bind the generated value to that. For example:

$sth = $dbh->prepare( "
            SELECT name, location
            FROM megaliths
            WHERE name = ?
        " );
$sth->bind_param( 1, $siteName );

The bind_ param( ) method is the call that actually associates the supplied value with the given placeholder. The underlying database will correctly parse the placeholder and reserve a space for it, which is "filled in" when bind_ param( ) is called. It is important to remember that bind_ param( ) must be called before execute( ); otherwise, the missing value will not have been filled in and the statement execution will fail.

It's equally simple to specify multiple bind values within one statement, since bind_ param( ) takes the index, starting from 1, of the parameter to bind the given value to. For example:

$sth = $dbh->prepare( "
            SELECT name, location
            FROM megaliths
            WHERE name = ?
            AND mapref = ?
            AND type LIKE ?
        " );
$sth->bind_param( 1, "Avebury" );
$sth->bind_param( 2, $mapreference );
$sth->bind_param( 3, "%Stone Circle%" );

You may have noticed that we haven't called the quote( ) method on the values. Bind values are passed to the database separately from the SQL statement,[50] so there's no need to "wrap up" the value in SQL quoting rules.

[50]This is not strictly true, since some drivers emulate placeholders by doing a textual replacement of the placeholders with bind values before passing the SQL to the database. Such drivers use Perl's internal information to guess whether each value needs quoting or not. Refer to the driver documentation for more information.

Some database drivers can accept placeholders in the form of :1, :2, and so on, or even :name or :somevalue, but this is not guaranteed to be portable between databases. The only guaranteed portable placeholder form is a single question mark, ?. Of course, if the underlying database in question doesn't support binding, the driver may fail to parse the statement completely.

5.3.1. Bind Values Versus Interpolated Statements

So, why use bind values? What's the real differences between these and interpolated on-the-fly SQL statements?

On the face of it, there's no obvious difference. Interpolated statement creation uses Perl's string-handling functionality to create a complete SQL statement to send to the database. The bind values are sent to the database after the SQL statement, but just before it's executed. In both cases, the same result is achieved.

The actual difference lies in the way that databases handle bind values, assuming that they do. For example, most large database systems feature a data structure known as the "Shared SQL Cache," into which SQL statements are stored along with additional related information such as a query execution plan .

The general idea here is that if the statement already exists within the Shared SQL Cache, the database doesn't need to reprocess that statement before returning a handle to the statement. It can simply reuse the information stored in the cache. This process can increase performance quite dramatically in cases where the same SQL is executed over and over again.[51]

[51]I've known a case where the database spent over a minute just trying to work out a "good enough" query execution plan for a complex SQL query. In cases like this, reuse of the processed statement handle makes for a very large improvement in performance.

For example, say we wished to fetch the general information for 100 megalithic sites, using the name as the search field. We can write the following SQL to do so:

SELECT name, location, mapref
FROM megaliths
WHERE name = <search_term>

By using interpolated SQL, we would actually issue 100 different SQL statements to the database. Even though they are almost identical, they are different enough for the database to re-parse the statement and not use the cached information. By using a bind value, the same piece of SQL and the same "execution plan" will be reused over and over again, even though a different bind value is supplied for each query.

Therefore, for databases that support it, using bind values with prepared statement handles can quite dramatically increase the performance of your applications and the efficiency of your database. This is especially significant when trying to insert many records.

That said, there are good reasons to use interpolated SQL statements instead of bind values. One of these reasons could be simply that your database doesn't support bind values! A more complex reason is that your database may have restrictive rules about what parts of an SQL statement may use placeholders.

In the examples listed above, we've illustrated the use of bind values to supply conditions for the query. For the sake of badness, say we wanted to iterate through a list of database tables and return the row count from each one. The following piece of code illustrates the idea using an interpolated SQL statement:

foreach $tableName ( qw( megaliths, media, site_types ) ) {
    $sth = $dbh->prepare( "
                SELECT count(*)
                FROM $tableName
              " );
    $sth->execute(  );
    my $count = $sth->fetchrow_array(  );
    print "Table $tableName has $count rows\n";
}

By using an interpolated statement, this code would actually execute correctly and produce the desired results, albeit at the cost of parsing and executing four different SQL statements within the database. We could rewrite the code to use bind values, which would be more efficient (theoretically):

$sth = $dbh->prepare( "
                SELECT count(*)
                FROM ?
              " );
    $sth->bind_param( 1, $tableName );
    ...

On most databases, this statement would actually fail to parse at the prepare( ) call, because placeholders can generally be used only for literal values. This is because the database needs enough information to create the query execution plan, and it can't do that with incomplete information (e.g., if it doesn't know the name of the table).

Additionally, the following code will fail, since you are binding more than just literal values:

$sth = $dbh->prepare( "
                SELECT count(*)
                FROM megaliths
                ?
    " );
    $sth->bind_param( 1, "WHERE name = 'Avebury'" );
    ...

Of course, your driver might just support this sort of thing, but don't rely on it working on other database systems!

5.3.2. Bind Values and Data Typing

Perl is a loosely typed language, in that you have strings and you have numbers. Numbers can be strings and strings can, on occasion, be numbers. You can perform arithmetic on strings. It can all be very confusing for us, so you can imagine how the driver feels when confronted with bind values.

To help the driver work out what sort of data is being supplied in a bind value, you can supply an additional argument that specifies the datatype. For example, the following code will bind the appropriately typed bind values to the statement for execution in the database:

use DBI qw(:sql_types);

$sth = $dbh->prepare( "
            SELECT meg.name, meg.location, st.site_type, meg.mapref
            FROM megaliths meg, site_types st
            WHERE name = ?
            AND id = ?
            AND mapref = ?
            AND meg.site_type_id = st.id
          " );
### No need for a datatype for this value. It's a string.
$sth->bind_param( 1, "Avebury" );

### This one is obviously a number, so no type again
$sth->bind_param( 2, 21 );

### However, this one is a string but looks like a number
$sth->bind_param( 3, 123500, { TYPE => SQL_VARCHAR } );

### Alternative shorthand form of the previous statement
$sth->bind_param( 3, 123500, SQL_VARCHAR );

### All placeholders now have values bound, so we can execute
$sth->execute(  );

The use DBI qw(:sql_types); statement asks for the standard SQL types to be imported as names, actually subroutines, that return the corresponding standard SQL integer type value. SQL_VARCHAR, for example, returns 12. If you don't want to import the SQL type names, you can add a DBI:: prefix, so that SQL_VARCHAR would be DBI::SQL_VARCHAR. However, that's not recommended because you lose the significant benefits of compile-time checking by use strict;.

If a type is specified, the driver should take that as a strong hint about what to do. But it is just a hint. Some drivers don't pay any attention to the specified type. Of those that do, most only use it to differentiate between strings, numbers, and LONG/LOB types. This is a relatively new area for the DBI and drivers, and one that's advancing slowly.

In general, databases tend to support a far wider range of datatypes than numbers and strings. Date types are very common and have widely varying formats. The DBI currently copes with these quite happily, by ducking the issue and expecting you to supply strings containing the data, formatted in the form expected by the database for the appropriate datatype.[52]

[52]Future versions may acquire ODBC-style escape functions.

5.3.3. Binding Input and Output Parameters

There is a counterpart method to the bind_ param( ) method called bind_ param_inout( ) , which can be used to sneakily return values from the statement. Typically, this is only useful with stored procedures that take input parameters and return values. Furthermore, few databases, and even fewer drivers, support this functionality, so beware.

bind_ param_inout( ) behaves in a similar way to bind_ param( ), but uses a reference to a bind value instead of the value itself. This allows the bind value to be updated with the return value from the statement.

An additional argument stating the maximum length of the value to be returned must also be specified. If the returned value exceeds this value, the execute( ) call will fail. Therefore, if you aren't sure how large the return value might be, you should be pessimistic and supply a large value for this parameter. The only cost of doing so is using more memory than you need to.

A final, optional, argument that can be supplied is that of the datatype of the bind value. This behavior is identical to datatype specification in bind_ param( ) . See the previous section for more details on how to supply values for this argument.

An Oracle-specific example showing how bind_ param_inout( ) works revolves around the following stored procedure, which returns the nearest integer values to a given input value:

-- Example stored procedure written in Oracle PL/SQL
PROCEDURE ceiling_floor (value IN NUMBER, c OUT NUMBER, f OUT NUMBER) IS
BEGIN
    c := CEIL(value);
    f := FLOOR(value);
END;

The DBI code to get these return values out of this procedure can be written as follows:

### The variables to be populated as return values...
my $ceiling;
my $floor;

$sth = $dbh->prepare( "BEGIN ceiling_floor( ?, ?, ? ); END;" );
$sth->bind_param( 1, 42.3 );
$sth->bind_param_inout( 2, \$ceiling, 50 );
$sth->bind_param_inout( 3, \$floor, 50 );
$sth->execute(  );

print "Stored procedure returned $ceiling, $floor\n";

You can use both bind_ param( ) and bind_ param_inout( ) on the same statement handle quite happily. Of course, if you use bind_ param( ) when a return value is expected, that return value will be lost.

There is one quite subtle difference between bind_ param( ) and bind_ param_inout( ) that's worth pointing out. When you call bind_ param( ), the bind value you supply is copied and can't be changed without calling bind_ param( ) again. However, when you call bind_ param_inout( ), it is the reference that's copied. The actual value that the reference points to is not read until execute( ) is called.

5.3.4. Binding Values Without bind_ param( )

Calling bind_ param( ) for each placeholder can be rather long-winded and tedious when you have many placeholders, so the DBI provides a simpler way to do it via the execute( ) method. When you call execute( ), you can simply give it a list of values, and execute( ) will call bind_ param( ) on each one for you.

Furthermore, the do( ) method described above, and the selectrow_array( ) and selectall_arrayref( ) methods, which we'll discuss shortly, all call execute( ) one way or another, and also accept a list of bind values.

The following code illustrates passing a bind value to the execute( ) method:

$sth = $dbh->prepare( "
            SELECT name, location, mapref
            FROM megaliths
            WHERE name = ? OR description LIKE ?
          " );
$sth->execute( "Avebury", "%largest stone circle%" );
...

When specifying bind values in this manner, explicit data typing of the supplied values is not possible. In some cases, the underlying driver will correctly guess the type, but in most cases all values are passed to the database as SQL_VARCHAR values. However, if you have previously called bind_ param( ) or bind_ param_inout( ) for some or all placeholders with an explicitly specified datatype, that datatype will be used instead. For example:

$sth->prepare( "
            SELECT name, location, mapref
            FROM megaliths
            WHERE id = ?
          " );
$sth->bind_param( 1, 42, SQL_INTEGER );
$sth->execute( 123456 );
...

will result in the value of 123456 being supplied to the database as the bind value with the type of SQL_INTEGER instead of SQL_VARCHAR.