Sybase iAnywhere SQL AAnywhere Mobile and Embedded Database

I'd rather play golf

Thoughts on data management, autonomic computing, and self-managing database systems.

header image

Why snapshot isolation is so useful

May 19th, 2009 · 4 Comments

Many commercial and open-source database management systems, including Microsoft SQL Server, Oracle, MySQL (with InnoDB or Falcon storage engines), PostgreSQL, Firebird, H2, Interbase, Sybase IQ, and SQL Anywhere support multi-version concurrency control, abbreviated as MVCC and often referred to as snapshot isolation.

Why is support for snapshot isolation so important? Well, snapshot isolation provides another widget in the DBA’s toolkit, providing reasonable semantics that avoids various types of update anomalies, while not incurring the considerable overhead and contention of serializable query execution strategies.

The term serializability characterizes execution schedules where interleaved interaction of two or more database applications occur as if those transactions were executed serially (one following another). Prior to the idea of snapshot isolation, documented for the first time in reference [1] and subsequently implemented in Oracle, the way to achieve serializable transaction execution was through strict two-phase locking. However, the vast majority of database applications cannot tolerate the lack of concurrency that serializable execution entails; almost always, application developers are willing to tradeoff serializable semantics for improved concurrency (ie. weaker transaction isolation) and snapshot isolation is an important type of weaker concurrency control.

The ANSI/ISO SQL Standard defines isolation levels in terms of anomalies that may be avoided. They are (SQL:2008, Section 4.35.4, pp. 124-5):

The isolation level specifies the kind of phenomena that can occur during the execution of concurrent SQL transactions. The following phenomena are possible:

  1. P1 (“Dirty read”): SQL-transaction T1 modifies a row. SQL-transaction T2 then reads that row before T1 performs a COMMIT. If T1 then performs a ROLLBACK, T2 will have read a row that was never committed and that may thus be considered to have never existed.
  2. P2 (“Non-repeatable read”): SQL-transaction T1 reads a row. SQL-transaction T2 then modifies or deletes that row and performs a COMMIT. If T1 then attempts to reread the row, it may receive the modified value or discover that the row has been deleted.
  3. P3 (“Phantom”): SQL-transaction T1 reads the set of rows N that satisfy some search condition. SQL transaction T2 then executes SQL statements that generate one or more rows that satisfy the search condition used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same search condition, it obtains a different collection of rows.

It is well-known [1] that the above definitions are lacking in describing all of the anomalies that may occur at isolation levels lower than SERIALIZABLE; the paper by Berenson et al. [1] is highly recommended for DBAs and application programmers alike to help understand the anomalous behaviour that can be encountered at lower isolation levels. However, a common characteristic that ANSI isolation levels 1 through 3 incur is that a writer will block a reader; as with many other DBMS implementations, SQL Anywhere’s concurrency control is based on locking and write locks cause blocking, except for read transactions at isolation level 0 (READ UNCOMMITTED) which offers no correctness guarantees. Moreover, writers always block other writers: SQL Anywhere does not permit “dirty writes” – termed P0 in reference [1] – at any isolation level, due to the ROLLBACK and recovery issues that entail if dirty writes are permitted.

The SQL standard does not specify how these P1, P2, and P3 anomalies are to be avoided; every database management system is free to implement its own solutions. For example, beginning with SQL Anywhere Version 10, SQL Anywhere utilizes intent locks with update-able cursors to prevent concurrent updates; intent locks permit read transactions to process the row, at least until the row is actually modified and the intent lock is upgraded to a write lock.

In addition to expanding on the ANSI isolation levels, reference [1] defined snapshot isolation: the basic idea is for each transaction to “see” a consistent snapshot of the database as of transaction start, and that this snapshot remains unaffected by other concurrent update transactions. Because of some quirks of terminology, many believe that snapshot isolation offers serializable semantics. However, snapshot isolation is not serializable [3] and some researchers have over the years proposed modifications to snapshot isolation so that it does offer serializability [4].

Snapshot isolation as proposed in [1] and implemented in Oracle is based on first-committer-wins; that is, if two transactions modify the same row, which is permitted in this scheme, the first transaction to COMMIT “wins”, and other transactions in conflict will be unable to COMMIT, and must ROLLBACK. In contrast, snapshot isolation in SQL Anywhere is based on first-writer-wins, which (still) forces writers to block writers. This has the advantage of simplifying an application’s COMMIT logic, but the disadvantage of being subject to greater risk of deadlock. However, SQL Anywhere’s snapshot isolation retains the benefits of writers not blocking readers. This permits an application to “see” a consistent state of the database since the start of the transaction, making it straightforward, for example, for a read-only transaction to analyze an entire database without regard to updates made by concurrent transactions. This is a very powerful benefit of snapshot isolation.

Of course, snapshot isolation doesn’t come for free. It is necessary for the database system to construct archive copies of changed data in anticipation of new snapshot transactions. With SQL Anywhere, copies of snapshot rows are managed automatically, written to the temp file (which grows on demand) as necessary. However, though the management impact is near zero, query performance can suffer as snapshot rows may need to be fetched individually from the snapshot row store in the temp file, based on the snapshot semantics of the transaction. The degree of performance degradation depends entirely on the application and its workload, and will be worse with update-intensive workloads. Nonetheless, those same update-intensive workloads may not be perform well with traditional ANSI isolation levels based on locking, because of the lock contention that may be incurred, and the greater potential for deadlock. Hence careful capacity planning should be undertaken prior to deploying such an application in a production setting.

NB. Links to papers are to freely available, public preprint versions.

[1] Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil, and Patrick O’Neil (June 1995). A Critique of ANSI SQL Isolation Levels. Proceedings of the 1995 ACM SIGMOD Conference, San Jose, California, pp. 1-10. Also available as Microsoft Research Technical Report MSR-TR-95-51.

[2] Atul Adya, Barbara Liskov, and Patrick O’Neil (March 2000). Generalized Isolation Level Definitions. In Proceedings of the 2000 IEEE International Conference on Data Engineering, San Diego, California, pp. 67-78.

[3] Alan Fekete, Elizabeth O’Neil, and Patrick O’Neil (September 2004). A Read-only Transaction Anomaly Under Snapshot Isolation. ACM SIGMOD Record 33(3), pp. 12-14.

[4] Alan Fekete, Dimitrios Liarokapis, Elizabeth O’Neil, Patrick O’Neil, and Dennis Shasha (June 2005). Making snapshot isolation serializable. ACM Transactions on Database Systems 30(2), pp. 492-528.

Tags: Oracle · SQL Anywhere · SQL Standard

4 responses so far ↓

  • 1 Glenn Paulley // May 24, 2009 at 9:34 pm

    An anonymous viewer asked me to the describe details behind the “first-updater-wins” semantics implemented in SQL Anywhere. The “first-commiter-wins” semantics outlined in reference [1] is necessary because in [1] independent transaction can modify the same row in the database, without locking conflicts, and, moreover, “see” their own updates. The problem, however, should this occur is which row version to commit; hence “first commiter wins”. The second transaction that attempt to commit will be unable to, and must (eventually) ROLLBACK.

    In a nutshell, with SQL Anywhere the server continues to write-lock rows whenever they are modified, by snapshot transactions or those executing under ANSI isolation levels. In this case, only one transaction can update a row at a time, and other write transactions will block. However, read transactions will not block; they continue to “see” the row as of the time of transaction start (or the start of the statement, if statement-level snapshot isolation is being used).

  • 2 Glenn Paulley // May 26, 2009 at 7:31 pm

    So, your implementation is a pessimistic version of first-committer-wins, reducing concurrency in the system since a write trans that locks a data item will force all other trans that also want to write that data item to block. A related Q: how do you know ahead of time that a trans is a write trans?

    I wouldn’t characterize our implementation of snapshot isolation as a pessimistic version of first-committer-wins. First-committer-wins means that with two or more transactions that attempt to modify or delete the same row, only one (the first) will succeed – the others must issue a ROLLBACK. There is no free lunch. With SQL Anywhere, attempts to update the same row by two different transactions are handled by the server’s existing locking mechanisms (intent and write locks); precisely what semantics are seen by the conflicting transactions depends on the state of the changes. Here’s an excerpt from the SQL Anywhere help:

    With snapshot isolation, an update conflict can occur when a transaction sees an old version of a row and tries to update or delete it. When this happens, the server gives an error when it detects the conflict. For a committed change, this is when the update or delete is attempted. For an uncommitted change, the update or delete blocks and the server returns the error when the change commits.

    Update conflicts cannot occur when using readonly-statement-snapshot because updatable statements run at a non-snapshot isolation, and always see the most recent version of the database. Therefore, the readonly-statement-snapshot isolation level has many of the benefits of snapshot isolation, without requiring large changes to an application originally designed to run at another isolation level. When using the readonly-statement-snapshot isolation level:

    • Read locks are never acquired for read-only statements;
    • Read-only statements always see a committed state of the database

    Note that the “blocking” referred to above can be controlled through connection options so that the application can get specific concurrency behaviour. For example, a transaction may set the server’s BLOCKING_TIMEOUT option to 0.5 seconds, so that the transaction can wait up to that point, and then the UPDATE statement will fail with an error.

    As to your question concerning knowledge about whether or not a transaction is an update transaction: it really doesn’t matter. Here’s an excerpt from the SQL Anywhere 11.0.1 help:

    Note that for the purposes of snapshot isolation, a transaction does not begin with a BEGIN TRANSACTION statement. Rather, it begins with the first read, insert, update, or delete within the transaction, depending on the snapshot isolation level being used for the transaction.

  • 3 Glenn Paulley // Jun 1, 2009 at 6:49 pm

    More comment from an anonymous reader:

    Unlike optimistic concurrency controls where you take action only when there is a conflict, locking is pessimistic since you expect conflicts to happen and you use locks to guard against conflicts. So, I would characterize your implementation as pessimistic.

    Would your implementation result in different behavior from the optimistic (standard) implementation of first-committer-wins (FCW) rule? Consider two write transactions, one long-running and one short, where the long-running trans acquires a write lock first. In an optimistic (standard) implementation of FCW, the long-runner will be aborted when it tries to commit if the short trans will have committed first.
    In your system, the long runner will block out the short trans and, therefore, the short trans will not be able to acquire a write lock until after the long runner commits?

    Different, yes. With SQL Anywhere, an attempt to acquire a write-lock at the time of the UPDATE statement may result in an error (a snapshot update conflict) or in blocking, depending on the precise sequence of operations intermixed with other transactions.

    In your example above, after the long-running transactions acquires a write lock on the row, the short-running transaction will receive an update conflict error on its attempt to modify the same row. Hence the error logic in the application will be triggered at the time of the UPDATE, rather than at COMMIT.

  • 4 Customizing SQL requests with hints // Jun 13, 2009 at 10:52 am

    [...] SQL isolation levels than SERIALIZABLE have obvious benefits to improving concurrency, but at the risk of incurring anomalies during query execution because of concurrent updates, and interactions between [...]