WiredTiger Transaction and Correctness Proof Design

Introduction

This article examines the transaction layer of WiredTiger, a high-performance, open-source storage engine used in many distributed systems. It focuses on the tsTxn mechanism introduced in version 3.0, which is designed to support distributed transactions based on hybrid logical clocks and gives the engine an efficient way to manage concurrent transactions.

WiredTiger Transaction Strategy

WiredTiger resolves conflicts optimistically: it checks for them without blocking. Instead of the traditional first-commit-wins strategy, it uses first-update-wins, so a conflict is detected when the second transaction attempts its update rather than when it tries to commit. The losing transaction fails early instead of doing work that will be discarded, which favors throughput in distributed systems where concurrent transactions are common.
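The first-update-wins rule can be illustrated with a minimal sketch. This is not WiredTiger code: the class `FirstUpdateWins`, its methods, and the `WriteConflict` exception are hypothetical names, and the sketch only covers the case where the first updater is still uncommitted. It ignores snapshots entirely.

```python
class WriteConflict(Exception):
    """Raised at update time when another transaction updated the key first."""


class FirstUpdateWins:
    """Toy conflict detector (assumption: one uncommitted writer per key)."""

    def __init__(self):
        # key -> id of the uncommitted transaction that updated it
        self._owner = {}

    def update(self, txn_id, key):
        owner = self._owner.get(key)
        if owner is not None and owner != txn_id:
            # Non-blocking: the later updater fails immediately instead of waiting.
            raise WriteConflict(f"txn {txn_id} lost key {key!r} to txn {owner}")
        self._owner[key] = txn_id

    def finish(self, txn_id):
        """Commit or roll back: release every key this transaction updated."""
        self._owner = {k: t for k, t in self._owner.items() if t != txn_id}
```

Under first-commit-wins, both updates would be accepted and the loser would only learn about the conflict at commit. Here the second `update` call on the same key raises straight away.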

Transaction Process

In WiredTiger, each transaction is associated with a TxnId that identifies the transaction and the data it wrote. A transaction's writes are stored in the B-tree, a self-balancing search tree that keeps its keys in sorted order. Each key on a B-tree page links to its multi-version list, a list of (TxnId, Value) tuples with one entry per version of that key.
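The per-key version list described above can be sketched as follows. This is an illustrative model, not the engine's data structure: the `Page` class and the `is_visible` callback are assumptions made for the example, and the list is kept newest first to match the descending order mentioned in the caption of FIG. 5.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy B-tree page: each key maps to a list of (txn_id, value) tuples."""

    # key -> [(txn_id, value), ...], newest version first
    updates: dict = field(default_factory=dict)

    def write(self, key, txn_id, value):
        # New versions are prepended, so the head is always the newest.
        self.updates.setdefault(key, []).insert(0, (txn_id, value))

    def read(self, key, is_visible):
        # Walk from newest to oldest and return the first visible version.
        for txn_id, value in self.updates.get(key, []):
            if is_visible(txn_id):
                return value
        return None
```

A reader passes in its own visibility rule, so two transactions can read the same key and see different versions.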

Conflict Resolution

To resolve conflicts, WiredTiger maintains a global list of active (uncommitted) transactions. When a new transaction starts, it copies this list into its local view, which makes later conflict checks efficient because they only consult the local copy. WiredTiger employs a one-way collision check strategy, which is discussed later in this article.
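A minimal sketch of the global list and the per-transaction copy is shown below. It rests on simplifying assumptions that are not taken from the article: ids are handed out in begin order, and the names `TxnManager`, `Txn`, `can_see`, and `conflicts` are hypothetical.

```python
class Txn:
    def __init__(self, txn_id, snapshot):
        self.id = txn_id
        # Copy of the global active list taken when this transaction began.
        self.snapshot = snapshot

    def can_see(self, writer_id):
        if writer_id == self.id:
            return True                      # own writes are always visible
        if writer_id > self.id:
            return False                     # writer began after this transaction
        return writer_id not in self.snapshot  # was it committed at begin time?


class TxnManager:
    def __init__(self):
        self._next_id = 1
        self.active = []  # global list of uncommitted transaction ids

    def begin(self):
        txn = Txn(self._next_id, frozenset(self.active))
        self._next_id += 1
        self.active.append(txn.id)
        return txn

    def commit(self, txn):
        self.active.remove(txn.id)


def conflicts(txn, newest_writer_id):
    """An update conflicts when the key's newest version is invisible to txn."""
    return not txn.can_see(newest_writer_id)
```

Because the check uses only the local copy, a writer that commits after the snapshot was taken still counts as concurrent, and an update on top of its version is rejected.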

Correctness Proof

We will prove the correctness of WiredTiger’s conflict resolution strategy in the second chapter of this article. The proof will demonstrate that the strategy ensures snapshot isolation, which is a fundamental property of transactional systems.

tsTxn Design

This chapter describes the tsTxn mechanism added in WiredTiger version 3.0. The mechanism maintains a wall-clock time order for transactions, which is essential for distributed systems built on hybrid logical clocks.

PBMA Strategy

We will introduce the Post-Begin-Monotonic-Allocation (PBMA) strategy, which ensures that transactions are assigned a commit timestamp in a monotonic order. This strategy guarantees that transactions are committed in the same order as their commit timestamps, ensuring data consistency and snapshot isolation.
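The allocation half of this idea can be sketched in a few lines. The sketch reads PBMA literally, as "the commit timestamp is allocated after the transaction has begun, from a source that only moves forward". That reading, and the names `TimestampAllocator`, `TsTxn`, `begin`, and `assign_commit_timestamp`, are assumptions made for illustration. The sketch does not attempt to reproduce the correctness argument.

```python
import threading


class TimestampAllocator:
    """Hands out strictly increasing timestamps (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def allocate(self):
        with self._lock:
            self._last += 1
            return self._last


class TsTxn:
    def __init__(self, txn_id):
        self.id = txn_id
        self.commit_timestamp = None  # unset at begin, filled in later


def begin(txn_id):
    return TsTxn(txn_id)


def assign_commit_timestamp(txn, allocator):
    # Post-begin: the transaction already exists when the timestamp is drawn.
    if txn.commit_timestamp is not None:
        raise ValueError("commit timestamp already assigned")
    txn.commit_timestamp = allocator.allocate()
    return txn.commit_timestamp
```

Note that the order of timestamps follows the order of allocation, not the order in which the transactions began.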

Conclusion

This article has explored the transactional aspects of WiredTiger, including its optimistic, first-update-wins conflict resolution strategy and the tsTxn mechanism. It has also introduced the PBMA strategy, which ensures that commit timestamps are assigned in monotonic order. Together these mechanisms give distributed systems built on hybrid logical clocks a robust and efficient way to manage concurrent transactions.

References

  • [1] RocksDB’s optimistic transaction suffers from mutex contention
  • [2] WiredTiger documentation: Architecture
  • [3] WiredTiger documentation: Transactions

Code Snippets

  • Line 18 in FIG. 3: if (txn1.start_time < txn2.start_time) { ... }
  • Line 15 in FIG. 3: if (txn2.start_time < txn1.start_time) { ... }
  • Line 10 in FIG. 7 and 9: commit_timestamp = ...

Figures

  • FIG. 1: Engine maintains a global list of transactions (uncommitted)
  • FIG. 2: Data structures needed for the discussion
  • FIG. 3: Basic process of the core WiredTiger transaction
  • FIG. 4: Two possible scenarios for conflict resolution
  • FIG. 5: UpdateList naturally arranged in descending txnId
  • FIG. 6: CommitTimestamp fields added to the transaction structure
  • FIG. 7: PBMA strategy ensures commit timestamp order
  • FIG. 8: Monotonic assignment of commit timestamps
  • FIG. 9: PBMA strategy ensures commit timestamp order