<aside> đź’¬ This document is open for commenting.
</aside>
A transaction in union-db
is a sequence of operations (steps) which are performed over the Sharded State. Each transaction step can only access (read or mutate) a single data entry stored in one of Sharded Collections of the database. In union-db
transactions are expressed with Coroutines. Transactions allow executing complex atomic operations over the sharded data, but they are not synchronous. Atomicity is eventual - if a transaction executes successfully and the user sees it’s success status, some of the data, that should’ve been updated by this transaction, may appear stale for some amount of time. But it is guaranteed that eventually this data will catchup and reflect the changes. Transactions are the “magic” of union-db
- every other algorithm or data structure in this documentation are designed only to make transactions work the way they should.
You can think of transactions as of stored procedures in classic SQL databases - you define such a procedure and then trigger it from the outside. But because these procedures are defined using fully featured programming language (Rust), instead of SQL, they are much more powerfull and could encapsulate a lot of your business logic inside.
Technically, each transaction is just a piece of state, which travels across shards and executes various code snippets on data, stored on these shards. On highly distributed databases, each transaction step may be executed on a different shard. Each such process of transitioning a transaction from one shard to another during the execution is called rebasing
.
sequenceDiagram
participant Client
participant S1 as Shard 1
participant S2 as Shard 2
participant S3 as Shard 3
participant S4 as Shard 4
Client ->> S1: Invoke transaction
S1 ->> S1: Perform txn step 1
S1 ->> S2: Rebase txn
S2 ->> S2: Perform txn step 2
S2 ->> S3: Rebase txn
S3 ->> S3: Perform txn step 3
S3 ->> S4: Rebase txn
S4 ->> S4: Perform txn step 4
par
S4 ->> S3: Notify txn executed
S4 ->> S2: Notify txn executed
S4 ->> S1: Notify txn executed + result
end
S1 ->> Client: Notify txn executed + result
Consider this example:
db! {
balances: Principal => u64,
// () key type represents a collection with only one entry
// Rust's type system allows that
total_supply: () => u64,
}
This database represents a scheme required for a simple infinite token - it stores an infinite list of accounts and their balances. Let’s implement a typical set of transactions for such a token:
#[CandidType]
struct InsufficientFunds;
#[txn]
fn burn(account: Principal, qty: u64) -> Result<(), InsufficientFunds> {
let balance: u64 = (yield db.balances.get(&account)).unwrap_or_default();
if balance < qty {
return Err(InsufficientFunds);
}
yield db.balances.insert(account, balance - qty);
let total_supply: u64 = (yield db.total_supply.read()).unwrap();
yield db.total_supply.write(total_supply_opt - qty);
Ok(())
}
#[txn]
fn mint(account: Principal, qty: u64) -> Result<(), ()> {
let balance: u64 = (yield db.balances.get(&account)).unwrap_or_default();
yield db.balances.insert(account, balance + qty);
let total_supply: u64 = (yield db.total_supply.read()).unwrap_or_default();
yield db.total_supply.write(total_supply + qty);
Ok(())
}
#[txn]
fn transfer(from: Principal, to: Principal, qty: u64) -> Result<(), InsufficientFunds> {
let from_balance: u64 = (yield db.balances.get(&from)).unwrap_or_default();
if from_balance < qty {
return Err(InsufficientFunds);
}
yield db.balances.insert(from, from_balance - qty);
let to_balance: u64 = (yield db.balances.get(&to)).unwrap_or_default();
yield db.balances.insert(to, to_balance);
Ok(())
}
#[txn]
fn balance_of(account: Principal) -> Result<u64, ()> {
let balance: u64 = (yield db.balances.fetch(&account)).unwrap_or_default();
Ok(balance)
}
#[txn]
fn total_supply() -> Result<u64, ()> {
let total_supply: u64 = (yield db.total_supply.read_caching(ONE_HOUR)).unwrap_or_default();
Ok(total_supply)
}
As you might notice, there is a custom yield
syntax in this code. Each #[txn]
function is just a Coroutine with some additional logic around it.
Also, a transaction should always have a return type of format Result<T, E> where T: CandidType, E: CandidType
. Developers should use this Result
enum to signal the success status of the transaction, so the undelying system could react to this value and either commit or revert changes, depending on the Result
(more - below).
A transaction can be executed from any shard - even if a shard does not contain the key that the transaction wants to access. If this happens, the transaction will be rebased
(see Collections & Routing) to the right shard on the next suspension point. All collection methods are implemented in such a way, so they don’t perform the requested operation immediately. Instead, they return a Coroutine, that has to be yield
ed in order to be processed. This yield
keyword will suspend the execution of the current transaction, execute the coroutine and resume the transaction, possibly rebased
to a different shard. When the transaction is executed, the result of the transaction will get passed back to a shard, where the transaction originally started.
graph LR
Start(("`Start`"))
Stop(("`Stop`"))
Start --> Exec1["`Start txn execution`"]
Sus{"`Suspension
point
reached?`"}
Exec1 --> Reach
Sus -- No --> Ret(["`Return statement reached`"])
Ret --> Pass["`Pass the result back to the originating shard`"]
Pass --> Stop
Sus -- No --> Reach["`Reach a suspension point or a return statement`"]
Reach --> Sus
Sus -- Yes --> Tele{"`
Yielded coroutine
requires rebasing
to another shard?
`"}
Tele -- Yes --> Reroute["`Rebase the txn to the right shard`"]
Reroute --> Exec2
Tele -- No --> Exec2["`Execute the task and submit the result back to the txn`"]
Exec2 --> Reach
When a transaction accesses a collection by some key, even in a non-mutating way, the entry stored by this key may get locked
. When an entry is locked
by some transaction, it can’t be mutated by any other transaction, until this transaction finishes its execution. Other transactions, which would try to concurrently access the same entry in a locking
way, will get suspended and put in the transaction queue
, until the current transaction executes. It is still possible to access such a value in a non-locking
way. Read more in Collections & Routing section.
This means, that once a transaction reaches its terminal state (once it return
s a value, instead of yield
ing it), the Dispatcher Service will send a notification to each shard participated in the execution with a command to either commit changes (if the Result
of the transaction is Ok
) or to revert them (if the Result
of the transaction is Err
). Refer to the sequence diagram at the top of this document for visualisation.
When a transaction step is executed, the result of this execution and the next step are passed to the next shard (if needed). This process is implemented via a custom Transport Layer, that has a number of features, which help with fail safety. You can read more about it by following the link, but as a top-level overview:
notify
call and expects to receive an ack
flag back within certain period of time;