Published on the O'Reilly Network
(http://www.oreillynet.com/) http://www.onlamp.com/pub/a/onlamp/2001/05/25/postgresql_mvcc.html See this if you're having trouble printing code examples PostgreSQL's Multi-Version Concurrency Controlby Joseph Mitchell05/25/2001 Anyone who has worked in a large multi-user database environment can relate to the frustration that "locks" can cause by making users wait. Whether the database system is using table-level, page-level, column-level, or row-level locking, the same annoying problem persists: Readers ( If you use PostgreSQL, you know that "no-locking" is already a reality. In PostgreSQL, readers never wait for writers, and writers never wait for readers. Before anyone objects to the claim that there is "no-locking" in PostgreSQL, let me explain PostgreSQL's advanced technique called Multi-Version Concurrency Control (MVCC). While most other database systems use locks to maintain concurrency control and data consistency, PostgreSQL uses a multi-version model. Think of a version as a data snapshot at a distinct point in time. When users query a table, the current version of the data appears. If they run the same query again on the table, a new version appears if any data has changed. Data changes occur in a database through A simple example of selecting data from one table shows the difference between traditional row-level locking and PostgreSQL's MVCC.
This statement reads data from a table called In PostgreSQL, however, users can always view the How MVCC works in PostgreSQLLet's look deeper into how MVCC works in PostgreSQL to allow "no-locking." Each row in PostgreSQL has two transaction IDs: a creation transaction ID for the transaction that created the row, and an expiration transaction ID for the transaction that expired the row. When an So now that you know how PostgreSQL creates versions of the data, you might be wondering how it knows which version to display. It's quite simple. At the start of a query, PostgreSQL records two things: 1) the current transaction ID, and 2) all in-process transaction IDs. When a query is issued, PostgreSQL displays all the row versions that match the following criteria:
The power of MVCC is in keeping track of transaction IDs to determine the version of the data, and thereby avoid having to issue any locks. It's very logical and efficient. New users to PostgreSQL will be pleased with the performance improvements of MVCC over row-level locking, especially those running in a large multi-user environment. Additionally, MVCC offers another advantage: hot backups. MVCC allows PostgreSQL to make a full database backup while the database is live. For many database systems, users need to shutdown the database or lock all tables to get a consistent snapshot. Not so with PostgreSQL. It simply takes a snapshot of the entire database at a point in time and dumps the output even while data is being inserted, updated or deleted. In conclusion, MVCC ensures that readers never wait for writers and writers never wait for readers. If you don't believe that PostgreSQL's multi-version model is better than row-level locking, I challenge you to try out PostgreSQL for yourself. Joseph Mitchell is a knowledge engineer for Great Bridge LLC. Return to Related Articles from the O'Reilly Network .
oreillynet.com Copyright © 2003 O'Reilly & Associates, Inc. |
|