7. Transactions

This chapter builds an understanding of transactions by explaining the guarantees they provide.

ACID

Atomicity - The ability to abort a transaction on error and discard all of its writes.

Consistency - Certain statements about your data (invariants) must always be true.

Isolation - Concurrently executing transactions are isolated from each other.

Durability - Once committed, data is not lost.

These guarantees matter most when we update multiple objects in one transaction. An example is storing a user's emails and a separate unread count (denormalized for faster reads).

In an RDBMS, everything between a BEGIN TRANSACTION and a COMMIT is considered one transaction. Many non-relational databases don't have such a way of grouping operations.
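As a minimal sketch of such grouping, using Python's built-in sqlite3 module (the schema mirrors the email/unread-count example above and is purely illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (user_id INTEGER, subject TEXT, read INTEGER)")
conn.execute("CREATE TABLE unread_counts (user_id INTEGER PRIMARY KEY, count INTEGER)")
conn.execute("INSERT INTO unread_counts VALUES (1, 0)")
conn.commit()

try:
    # sqlite3 implicitly opens a transaction before the first write.
    conn.execute("INSERT INTO emails VALUES (1, 'hello', 0)")
    conn.execute("UPDATE unread_counts SET count = count + 1 WHERE user_id = 1")
    conn.commit()    # both writes become visible together
except Exception:
    conn.rollback()  # atomicity: on error, neither write survives
    raise
```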

Maintaining atomicity and isolation for single-object writes is essential, and almost all storage engines provide it. Such features are sometimes marketed as lightweight transactions, but they are not ACID in the full sense.

Concurrency issues (race conditions) arise when two transactions manipulate the same data. Serializable isolation (the I in ACID) is a strong guarantee, but it comes at a performance cost, so databases commonly provide weaker isolation levels instead. We will look at these levels and their impact on applications.

Read committed

This isolation level guarantees the following two properties:

  • When reading from the database, you will only see data that has been committed (no dirty reads).
  • When writing to the database, you will only overwrite data that has been committed (no dirty writes).

Dirty writes are prevented with locks: a transaction holds a write lock on the objects being written and releases it only on commit or abort. For dirty reads, the database keeps both the old committed value and the value being written by the in-progress transaction, and serves the old value to other readers; using read locks instead would hurt performance.
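As a toy illustration of that old-value/new-value bookkeeping (no real engine is this simple):

```python
class ReadCommittedObject:
    """Toy sketch: serve the last committed value to readers while a
    write is in flight, so nobody ever observes a dirty (uncommitted) value."""

    def __init__(self, value):
        self.committed = value
        self.uncommitted = None   # value held by an in-progress writer, if any

    def read(self):
        return self.committed     # readers never see uncommitted data

    def write(self, value):
        self.uncommitted = value  # buffered until commit

    def commit(self):
        self.committed, self.uncommitted = self.uncommitted, None

    def abort(self):
        self.uncommitted = None   # the dirty value is simply discarded
```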

This isolation level can still produce temporary inconsistencies. The anomaly of observing different parts of the database at different points in time is called read skew, an example of a nonrepeatable read. It manifests, for instance, during long-running backups and analytic integrity checks, which may see some data in its old state and some in its new state.

Snapshot isolation

Snapshot isolation is the most common solution to this problem: each transaction reads from a consistent snapshot of the database. Because various in-progress transactions may need to see the state of the database at different points in time, the database keeps several committed versions of each object. This technique is known as multi-version concurrency control (MVCC).
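A toy sketch of the version-visibility idea follows; real engines (e.g. PostgreSQL) track considerably more state than this:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: object
    created_by: int  # id of the transaction that wrote this version

class MVCCObject:
    """Toy multi-version object: writes append versions, reads pick the
    newest version that was committed before the reader's snapshot."""

    def __init__(self):
        self.versions: list[Version] = []

    def write(self, txid: int, value: object) -> None:
        self.versions.append(Version(value, txid))

    def read(self, snapshot_txid: int, committed: set[int]) -> Optional[object]:
        # Assumes txids are handed out in increasing order of transaction
        # start, so "created_by < snapshot_txid" means "started before us".
        for v in reversed(self.versions):
            if v.created_by in committed and v.created_by < snapshot_txid:
                return v.value
        return None
```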

So far we have looked at reads happening concurrently with writes. Next we look at writes happening concurrently. Take incrementing a counter: it involves reading the current value, calculating the new value, and writing back the result. If two transactions do this concurrently, one increment can silently overwrite the other, a lost update.
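The lost-update-prone pattern looks like this, assuming a DB-API connection `conn` (e.g. from psycopg2) and a hypothetical counters(id, value) table:

```python
def increment_counter_racy(conn, counter_id):
    # Two transactions can both read the same old value, both compute
    # old + 1, and then one of the two updates is lost.
    cur = conn.cursor()
    cur.execute("SELECT value FROM counters WHERE id = %s", (counter_id,))
    (value,) = cur.fetchone()
    cur.execute("UPDATE counters SET value = %s WHERE id = %s",
                (value + 1, counter_id))
    conn.commit()
```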

Atomic writes

Many databases provide atomic update operations, which remove the need to implement read-modify-write cycles in application code. They are the best solution in most cases, but ORM frameworks generally do not leverage this feature, making it easy to write racy read-modify-write code by accident.
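With the same hypothetical table, the atomic version pushes the read-modify-write into the database:

```python
def increment_counter_atomic(conn, counter_id):
    # The engine serializes concurrent increments internally; no value is lost.
    conn.cursor().execute(
        "UPDATE counters SET value = value + 1 WHERE id = %s", (counter_id,))
    conn.commit()
```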

Locks

Another option is explicit locking: many databases offer SELECT ... FOR UPDATE to lock the rows a transaction is about to modify.
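A sketch of the explicitly locked variant (PostgreSQL-flavored SQL, same hypothetical counters table):

```python
def increment_counter_locked(conn, counter_id):
    # FOR UPDATE locks the row, so a concurrent read-modify-write on the
    # same counter blocks until this transaction commits.
    cur = conn.cursor()
    cur.execute("SELECT value FROM counters WHERE id = %s FOR UPDATE",
                (counter_id,))
    (value,) = cur.fetchone()
    cur.execute("UPDATE counters SET value = %s WHERE id = %s",
                (value + 1, counter_id))
    conn.commit()
```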

Alternatively, compare-and-set can be used: update the value to C (the new value) only if the current value is still B (the value previously read), and retry otherwise.
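A compare-and-set sketch under the same assumptions; `rowcount` tells us whether the conditional update matched:

```python
def increment_counter_cas(conn, counter_id, max_retries=10):
    for _ in range(max_retries):
        cur = conn.cursor()
        cur.execute("SELECT value FROM counters WHERE id = %s", (counter_id,))
        (old,) = cur.fetchone()
        # The UPDATE applies only if the value is still the one we read.
        cur.execute(
            "UPDATE counters SET value = %s WHERE id = %s AND value = %s",
            (old + 1, counter_id, old))
        conn.commit()
        if cur.rowcount == 1:
            return            # our update won
        # rowcount == 0: someone got there first; loop and retry
    raise RuntimeError("CAS increment failed after retries")
```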

In a replicated database, locks and compare-and-set do not work, because there is no single up-to-date copy of the data across replicas; commutative atomic operations, however, remain a good fit.

Write Skew

Then there is the problem of write skew, which is easier to understand with an example:

  • A hospital must have at least one doctor on call at any time.
  • A doctor may take leave as long as at least one colleague remains on call.
  • If two doctors apply for leave at exactly the same moment, both checks pass and nobody is left on call.

Atomic single-object writes do not help here, but SELECT FOR UPDATE on the rows being checked works well. It fails, however, when the rows being checked and the rows being written are different, and especially when we check for the absence of rows and then write: there is nothing to lock. This effect, where a write in one transaction changes the result of a search query in another, is called a phantom.
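Here is a sketch of the write-skew-prone check-then-act under snapshot isolation, with a hypothetical doctors(doctor_id, shift_id, on_call) table and a DB-API connection:

```python
def request_leave(conn, shift_id, doctor_id):
    cur = conn.cursor()
    # Both doctors' transactions run this check against the same
    # snapshot and both see two doctors on call.
    cur.execute(
        "SELECT count(*) FROM doctors WHERE on_call AND shift_id = %s",
        (shift_id,))
    (currently_on_call,) = cur.fetchone()
    if currently_on_call >= 2:
        # Each transaction updates a *different* row, so no write
        # conflict is detected, yet the invariant (>= 1 on call) breaks.
        cur.execute(
            "UPDATE doctors SET on_call = false "
            "WHERE doctor_id = %s AND shift_id = %s",
            (doctor_id, shift_id))
    conn.commit()
```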

Materializing conflicts

This involves materializing the set of rows that transactions might conflict over and storing it separately. In a meeting-room booking scenario, it means pre-creating a row for every combination of room and time slot; a booking transaction then locks the relevant slot rows with SELECT FOR UPDATE before inserting into the actual bookings table. Because this is ugly and error-prone, a serializable isolation level is much preferable in most cases.
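A sketch of that pattern, with hypothetical room_time_slots and bookings tables and PostgreSQL-flavored SQL:

```python
def book_room(conn, room_id, slot, user_id):
    cur = conn.cursor()
    # Lock the pre-created (room, slot) row; concurrent bookings for the
    # same slot now block here instead of racing past an absence check.
    cur.execute(
        "SELECT 1 FROM room_time_slots "
        "WHERE room_id = %s AND slot = %s FOR UPDATE",
        (room_id, slot))
    cur.execute(
        "SELECT 1 FROM bookings WHERE room_id = %s AND slot = %s",
        (room_id, slot))
    if cur.fetchone() is not None:
        conn.rollback()
        raise RuntimeError("slot already booked")
    cur.execute(
        "INSERT INTO bookings (room_id, slot, user_id) VALUES (%s, %s, %s)",
        (room_id, slot, user_id))
    conn.commit()
```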

Most databases that provide serializability today use one of three techniques:

Actual Serial Execution

The simplest way to achieve serializability is to execute transactions one at a time, in serial order, on a single thread. However, throughput is limited to that of a single CPU core, and this approach is not suited to interactive-style transactions, where the database would sit idle waiting for the application's next statement.

Instead, the application must submit the whole transaction ahead of time as a stored procedure, so it can run to completion without waiting on the network. Serial execution has the following constraints (a sketch of such an executor follows the list):

  • Every transaction must be small and fast
  • The active dataset should fit in memory; disk access would stall the single thread
  • Write throughput must be low enough for a single CPU core to handle
  • Throughput can be increased by partitioning the data
  • Cross-partition transactions are possible, but only to a limited degree
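A minimal sketch of serial execution, where whole transactions are submitted as callables (the moral equivalent of stored procedures) to a single worker thread:

```python
import queue
import threading
from concurrent.futures import Future

class SerialExecutor:
    """Toy serial executor: one worker thread runs submitted transactions
    one after another, so execution is serializable by construction."""

    def __init__(self, db):
        self.db = db
        self.jobs: queue.Queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            txn, done = self.jobs.get()
            try:
                done.set_result(txn(self.db))  # runs alone, start to finish
            except Exception as exc:
                done.set_exception(exc)

    def submit(self, txn) -> Future:
        done: Future = Future()
        self.jobs.put((txn, done))
        return done

# Usage: no interactive round trips; the whole transaction is one callable.
# result = executor.submit(lambda db: transfer(db, "a", "b", 10)).result()
```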

Two-phase locking (2PL)

2PL makes the lock requirements much stronger. Several transactions are allowed to concurrently read, as long as nobody is writing. But as soon as anyone wants to write an object, exclusive access is required.

There are two lock modes. A shared lock is acquired by readers, and many transactions can hold it simultaneously. An exclusive lock is acquired by a writer, and while it is held no other lock, shared or exclusive, can exist on the object. Locks are held until the end of the transaction, which is what makes the protocol two-phase. The main disadvantage is the performance hit from all this locking and the resulting loss of concurrency.
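A toy sketch of the two lock modes (a standard readers-writer lock; real 2PL additionally holds these until transaction end and handles deadlocks):

```python
import threading

class SharedExclusiveLock:
    """Any number of readers may hold the shared lock; the exclusive
    (writer) lock requires that no other lock of either kind is held."""

    def __init__(self):
        self.cond = threading.Condition()
        self.readers = 0
        self.writer = False

    def acquire_shared(self):
        with self.cond:
            while self.writer:                  # block while a writer holds it
                self.cond.wait()
            self.readers += 1

    def release_shared(self):
        with self.cond:
            self.readers -= 1
            self.cond.notify_all()

    def acquire_exclusive(self):
        with self.cond:
            while self.writer or self.readers:  # need sole access
                self.cond.wait()
            self.writer = True

    def release_exclusive(self):
        with self.cond:
            self.writer = False
            self.cond.notify_all()
```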

Serializable Snapshot Isolation (SSI)

So far we have seen serializable isolation achieved only by trading away performance or scalability, while weak isolation levels perform well but cannot prevent all race conditions. Serializable snapshot isolation (SSI) offers full serializability with only a minor performance penalty.

2PL is a pessimistic concurrency control mechanism, while SSI is optimistic: it allows transactions to proceed, hoping everything will turn out all right. When a transaction wants to commit, the database checks whether isolation was violated; if not, the transaction commits, otherwise it is aborted and must be retried.
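A toy sketch of commit-time validation in this optimistic spirit; this is closer to classic optimistic concurrency control than to SSI's precise read/write dependency tracking:

```python
class OptimisticStore:
    """Each transaction records the versions it read; at commit, it is
    aborted if any object it read has changed in the meantime."""

    def __init__(self):
        self.data: dict[str, object] = {}
        self.version: dict[str, int] = {}

    def begin(self) -> dict:
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        txn["reads"][key] = self.version.get(key, 0)  # remember what we saw
        return self.data.get(key)

    def write(self, txn, key, value):
        txn["writes"][key] = value  # buffered until commit

    def commit(self, txn):
        # Validate: abort if anything we read was modified since we read it.
        for key, seen in txn["reads"].items():
            if self.version.get(key, 0) != seen:
                raise RuntimeError("serialization conflict, retry transaction")
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] = self.version.get(key, 0) + 1
```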

SSI is based on snapshot isolation, i.e. all reads within a transaction are made from a consistent snapshot of the database, with an added algorithm for detecting serialization conflicts among writes. The rate of aborts significantly affects the overall performance of SSI, but it is less sensitive to slow transactions than two-phase locking or serial execution.