Software Development

A beginner’s guide to read and write skew phenomena

Introduction

In my article about ACID and database transactions, I introduced the three phenomena described by the SQL standard:

  • dirty read
  • non-repeatable read
  • phantom read

While these are good to differentiate the four isolation levels (Read Uncommitted, Read Committed, Repeatable Read and Serializable), in reality, there are more phenomena to take into consideration as well. The 1995 paper (A Critique of ANSI SQL Isolation Levels) introduces the other phenomena that are omitted from the standard specification.

In my High-Performance Java Persistence book, I decided to insist on the Transaction chapter as it is very important for both data access effectiveness and efficiency.

Domain model

For the following examples, I’m going to use the following two entities:

readwriteskew

In our fictional application, when the Post title is changed, the author must be recorded in the associated PostDetails record. If read and write skew are not prevented, this domain model constraint can be compromised, as you will see in the following test cases.

Read skew

The following test simulates how a read skew can happen:

doInConnection(aliceConnection -> {
    prepareConnection(aliceConnection);
    String title = selectStringColumn(
        aliceConnection, 
        selectPostTitleSql()
    );
    executeSync(() -> {
        doInConnection(bobConnection -> {
            prepareConnection(bobConnection);
            try {
                update(
                    bobConnection, 
                    updatePostTitleParamSql(), 
                    new Object[]{"Bob"}
                );
                update(
                    bobConnection, 
                    updatePostDetailsAuthorParamSql(), 
                    new Object[]{"Bob"}
                );
            } catch (Exception e) {
                LOGGER.info("Exception thrown", e);
                preventedByLocking.set(true);
            }
        });
    });
    String createdBy = selectStringColumn(
        aliceConnection, 
        selectPostDetailsAuthorSql()
    );
});
  • Alice selects a Post title
  • Bob sneaks in and updates the Post and the PostDetails entities
  • Alice thread is resumed and she selects the PostDetails record

If read skew is permitted, Alice sees Bob’s update and she can assume that the previous Post version (that she read at the beginning of her transaction) was issued by Bob (which might not be accurate).

Running this test on the four most common relation database systems gives the following results:

Database isolation levelRead skew
Oracle Read CommittedYes
Oracle SerializableNo
SQL Server Read UncommittedYes
SQL Server Read CommittedYes
SQL Server Read Committed Snapshot IsolationYes
SQL Server Repeatable ReadNo
SQL Server SerializableNo
SQL Server Snapshot IsolationNo
PostgreSQL Read UncommittedYes
PostgreSQL Read CommittedYes
PostgreSQL Repeatable ReadNo
PostgreSQL SerializableNo
MySQL Read UncommittedYes
MySQL Read CommittedYes
MySQL Repeatable ReadNo
MySQL SerializableNo

Write skew

To emulate write skew, you need to execute the following test case:

doInConnection(aliceConnection -> {
    prepareConnection(aliceConnection);
    String title = selectStringColumn(
        aliceConnection, 
        selectPostTitleSql()
    );
    String createdBy = selectStringColumn(
        aliceConnection, 
        selectPostDetailsAuthorSql()
    );
    executeSync(() -> {
        doInConnection(bobConnection -> {
            prepareConnection(bobConnection);
            try {
                String bobTitle = selectStringColumn(
                    bobConnection, 
                    selectPostTitleSql()
                );
                String bonCreatedBy = selectStringColumn(
                    bobConnection, 
                    selectPostDetailsAuthorSql()
                );
                update(
                    bobConnection, 
                    updatePostTitleParamSql(), 
                    new Object[]{"Bob"}
                );
            } catch (Exception e) {
                LOGGER.info("Exception thrown", e);
                preventedByLocking.set(true);
            }
        });
    });
    update(
        aliceConnection, 
        updatePostDetailsAuthorParamSql(), 
        new Object[]{"Alice"}
    );
});
  • Alice selects the Post title and the author from the PostDetails record
  • Bob also selects the Post title and the associated author, but he decides to update the title only
  • Alice thinks of updating the PostDetails record without changing the Post title

If write skew is permitted, Alice and Bob disjoint writes will be executed without being blocked by the constraint that governs both records.

Running this test on the four most common relation database systems gives the following results:

Database isolation levelWrite skew
Oracle Read CommittedYes
Oracle SerializableYes
SQL Server Read UncommittedYes
SQL Server Read CommittedYes
SQL Server Read Committed Snapshot IsolationYes
SQL Server Repeatable ReadNo
SQL Server SerializableNo
SQL Server Snapshot IsolationYes
PostgreSQL Read UncommittedYes
PostgreSQL Read CommittedYes
PostgreSQL Repeatable ReadYes
PostgreSQL SerializableNo
MySQL Read UncommittedYes
MySQL Read CommittedYes
MySQL Repeatable ReadYes
MySQL SerializableNo

 

  • Write skew is prevalent among Multi-Version Concurrency Control mechanisms and Oracle cannot prevent it even when claiming to be using Serializable, which in fact is just a Snapshot Isolation level instead.
  • SQL Server default locking-based isolation levels can prevent write skews when using Repeatable Read and Serializable. Neither one of the snapshot isolation levels (MVCC-based) can prevent/detect it instead.
  • PostgreSQL prevents it using its more advanced Serializable Snapshot Isolation level
  • MySQL employs shared locks when using Serializable, so the write skew can be prevented even if InnoDB is also MVCC-based

If you are interested in this topic, then you might as well enjoy my High-Performance Java Persistence book that I’m writing.

Vlad Mihalcea

Vlad Mihalcea is a software architect passionate about software integration, high scalability and concurrency challenges.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button