Errata in Hekaton MVCC paper
Hekaton MVCC Paper - High-Performance Concurrency Control Mechanisms for Main-Memory Databases - contains a publication error. After reviewing the paper, I confirmed the error with one of the authors. This blog post explains the mistake, the implications and the fix. I blogged about the background story and my discovery in another post.
Error
Following the conventions of the paper, here is a state diagram:
- At time 60, we observe our initial state, where the value of Larry is 170. This is a committed row.
- At time 75, a transaction is started and is in
Active
state. It wants to update the value to 150, so it appended row2 - At time 80, another new transaction, Tx80 is started, and it wants to read the value of Larry. Both Tx75 and Tx80 are in
Active
state.
What is the value Tx80 going to read?
Tx80 can’t see row2 because of Table 1 rules. If Tx75 is in Active
state, only Tx75 can see row2. This makes sense because row2 is fresh, and Tx75 might drop the changes later. We don’t want any other transactions to see this row.
But can Tx80 see row1? From the paper’s conventions, Tx75 is TE
and Tx80 is T
, and row1 is V
. The rules from the paper contradict that since Tx75 is in Active
state:
This is a typo, and Tx80 should be able to see row1 since it is a committed row. Also, Tx75 should not be able to see row1 anymore; instead, only row2, which it is updating.
Implications
- Committed rows can become invisible for new transactions
- Uncommitted rows can become invisible to the very transactions which are updating them
Fix
The fix is simple, the entire column should read:
V is visible only if TE is not T
The usual rule about the time ranges still apply. By doing this:
- Tx80 can now see row1 (but not row2)
- Tx75 can now see row2 (but not row1)