6.3. Transactions, Locking, and IsolationThe final topic in this chapter deals with the important (and hair-raising!) topic of transaction handling. Transaction handling is a feature of the more powerful database systems in which SQL statements can be grouped into logical chunks. Each chunk is known as a transaction, and the operations it performs are guaranteed to be atomic for the purposes of recovery. According to the ANSI/ISO SQL standard, a transaction begins with the first executable SQL statement and ends when it is explicitly committed or rolled back. The process of committing data writes it into the database tables and make it visible to other concurrent users. Rolling back discards any changes that have been made to any tables since the beginning of the current transaction. The standard example to explain transactions is a bank transfer in which a customer transfers $1000 from one bank account to another. The bank transfer consists of three distinct stages:
When viewed as three separate stages, the possibility of disaster is quite obvious. Suppose there's a power outage between stages 1 and 2. The hapless customer is now $1000 poorer, as the money has neither reached the target account nor been logged in the transfer journal. The bank is now $1000 richer.[62]
Of course, if the power outage occurred between stages 2 and 3, the customer would have the correct amount of money in the correct accounts, but the bank would have no record of the transactions. This would lead to all sorts of book-keeping problems. The answer is to treat the three separate stages as one logical unit, or transaction. Thus, when stage 1 starts executing, the transaction is started automatically. The same transaction continues until stage 3 is completed, after which point the transaction can be terminated with all the changes either being committed to the database or rolled back and discarded. Therefore, if a power outage happens at any point during the transaction, the entire transaction can be automatically rolled back when the database restarts and no permanent changes to the data will have been made. A transaction is an all-or-nothing situation. Either it all works, or it all fails -- which is great news for our luckless bank customer. Transactions are also sometimes described as having A.C.I.D. properties:
Implementing ACID transaction handling within a database requires the use of a journal log file, along with some sophisticated techniques and much careful coding. That's why it's rare to find ACID transactions supported on free databases (with the notable exception of PostgreSQL), and why it carries a performance penalty when it is supported. On the upside, full transaction handling brings with it far greater safety from power failures, client failures, database failures, and other popular forms of disaster. Simple explicit locking mechanisms do not afford the same level of safety and recoverability, as we shall discuss later. Since not all database systems support transaction processing, you may not have the luxury of being able to roll back from inadvertent data corruptions or be saved from power outage. But if your database does support transactions, the DBI makes it easy to manage them in a portable manner. 6.3.1. Automatic Transaction HandlingThe ISO standard for SQL defines a particular model for transactions. It says that a database connection is always in a transaction. Each and every transaction will end with either a commit or a rollback, and every new transaction will begin with the next statement executed. Most systems also define an auto-commit mechanism that acts as if commit() is automatically called after each statement. The DBI standard tries to find a way to let all drivers for all databases appear to offer the same facilities, as much as possible. It does this by relying on the fact that there's little practical difference between a database that supports transactions but has auto-commit enabled, versus a database that doesn't support transactions at all. The DBI standard also tries to ensure that an application written to require transactions can't accidentally be run against a database that doesn't support them. It does this by treating an attempt to disable auto-commit as a fatal error for such a database. Given that the ability to enable and disable auto-commit is important, the DBI defines a database handle attribute called AutoCommit that regulates whether or not DBI should appear to automatically force a data commit after every statement. For example, if you issue a statement like $dbh->do() that deletes some data within your database, and AutoCommit is set to a true value, you cannot roll back your change even if the database supports transactions. The DBI defaults to having AutoCommit enabled, making this potentially dangerous behavior automatic unless explicitly disabled. This is due to the precedent set by ODBC and JDBC. It was probably a mistake for the DBI to put standards compliance above safety in this case. A future version may issue a warning if AutoCommit is not specified as an attribute to DBI->connect(), so it's worth getting used to adding it now. The behavior of changing this attribute depends on which type of transaction handling your database supports. There are three possibilities:
Therefore, despite its database independence, the DBI offers both simple automatic transaction committing and powerful manual transaction processing modes. 6.3.2. Forcing a CommitThe DBI defines a method called commit() for explicitly committing any uncommitted data within the current transaction. This method is executed against a valid database handle: $dbh->commit(); If commit() is invoked while AutoCommit is enabled, a warning will be displayed similar to: commit ineffective with AutoCommit which merely tells you that the database changes have already been committed. This warning will also be displayed when commit() is invoked against a database that has no transaction support because, by definition, AutoCommit will be enabled. 6.3.3. Rolling Back ChangesThe corollary operation to committing data to the database is to roll it back. The DBI defines a method called rollback(), which can be used to roll back the most recent uncommitted database changes. Like commit(), the rollback() method is executed against a database handle: $dbh->rollback(); Similarly, if rollback() is invoked while AutoCommit is enabled, a warning will be displayed similar to: rollback ineffective with AutoCommit signifying that the database changes have already been committed. This warning will also be displayed when rollback() is invoked against a database that has no transaction support because, by definition, AutoCommit will be enabled. 6.3.4. Disconnecting, One Way or AnotherThe transaction effect of explicitly disconnecting from a database while AutoCommit is disabled is, sadly, undefined. Some database systems, such as Oracle and Ingres, will automatically commit any outstanding changes. However, other database systems, such as Informix, will roll back any outstanding changes. Because of this, applications not using AutoCommit should always explicitly call commit() or rollback() before calling disconnect() . So what happens if you don't explicitly call disconnect(), or don't have the chance to because the program exits after a die? Well, because DBI handles are object references, we can be sure that Perl itself will call the DESTROY method for us on each handle if the program exits, the handle goes out of scope, or the only copy of a handle is overwritten by another value. The actual implementation of the DESTROY method is in the hands of the driver author. If the database handle is still connected then it should automatically call rollback() (unless AutoCommit is enabled) before calling disconnect(). Calling rollback() in DESTROY is critical. If the driver doesn't, then a program aborting due to a die part way though a transaction may actually "accidentally" commit the incomplete transaction! Fortunately, all the drivers that we're aware of that support transactions do the right thing. As an extra sanity check, if you disconnect from a database while you still have active statement handles, you will get a warning. We discussed active statement handles and related topics in Chapter 5, "Interacting with the Database ". 6.3.5. Combining Automatic Error Handling with TransactionsTransactions, as you've probably realized by now, are closely related to error handling. This is especially true when you have to clean up after an error by putting everything in the database back to the way it was before the transaction started. Chapter 4, "Programming with the DBI ", we discussed error handling in some detail and sang the praises of using the RaiseError attribute for automatic error detection. Imagine combining the automatic error detection of the DBI's RaiseError attribute and the error trapping of Perl's eval { ... } and the error handling properties of transactions. The result is a simple yet powerful way to write robust applications in Perl. There is a fairly common structure to these kind of applications, so to help us discuss the issues, we've included the following example. This outline example processes CSV files containing sales data from one country, it fetches currency exchange rate information from a web site and adds that to the data, it then performs a series of inserts, selects, updates and more inserts of the data to update the database. That processing is repeated for a series of countries. Here's the code: ### Connect to the database with transactions and error handing enabled my $dbh = DBI->connect( "dbi:Oracle:archaeo", "username", "password" , { AutoCommit => 0, RaiseError => 1, } ); ### Keep a count of failures. Used for program exit status my @failed; foreach my $country_code ( qw(US CA GB IE FR) ) { print "Processing $country_code\n"; ### Do all the work for one country inside an eval eval { ### Read, parse and sanity check the data file (e.g., using DBD::CSV) my $data = load_sales_data_file( "$country_file.csv" ); ### Add data from the Web (e.g., using the LWP modules) add_exchange_rates( $data, $country_code, "http://exchange-rate-service.com" ); ### Perform database loading steps (e.g., using DBD::Oracle) insert_sales_data( $dbh, $data ); update_country_summary_data( $dbh, $data ); insert_processed_files( $dbh, $country_code ); ### Everything done okay for this file, so commit the database changes $dbh->commit(); }; ### If something went wrong... if ($@) { ### Tell the user that something went wrong, and what went wrong warn "Unable to process $country_code: $@\n"; ### Undo any database changes made before the error occured $dbh->rollback(); ### Keep track of failures push @failed, $country_code; } } $dbh->disconnect(); ### Exit with useful status value for caller exit @failed ? 1 : 0; In the following list, we make some observations about how and why the code is structured the way it is and discuss a few related issues:
Copyright © 2001 O'Reilly & Associates. All rights reserved. |
|