<aside> 💬 This document is open for commenting.
</aside>
<aside>
📖 This document contains many code snippets, and reading them is mandatory: the code and the comments inside it carry practical information that, combined with the theory, makes union-db a lot easier to understand.
</aside>
Transactions in union-db allow you to mutate data distributed across multiple shards in such a way that either all operations in a transaction execute completely or, if one of them fails, all changes previously made by this transaction are reverted. In other words, transactions let you write atomic distributed logic.
Distributed transactions in union-db are based on the Saga pattern. In short, this means that shards interact with each other as if they were separate, independent microservices. As part of a saga, each shard executes local transactions, which operate on its local state, and then signals other microservices to execute their part of the transaction. A distinctive trait of this pattern is that each microservice is responsible for reverting its previously made changes if the saga fails to execute. When that happens, the microservice whose local transaction failed is responsible for letting the other participants know that they have to revert the changes made by this saga. This means that each local transaction may have a reversed twin, which is called a compensation in saga terminology.
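The compensation mechanism described above can be sketched in plain Rust. This is a minimal, synchronous, single-process model and not union-db’s API: Step, Shards, execute_with_compensation and the helper functions are hypothetical names introduced only for illustration, and the two array slots stand in for the local state of two shards.

```rust
// Hypothetical model: none of these names come from union-db.

/// Stand-in for the local state of two shards (one balance each).
type Shards = [i64; 2];

/// One saga step: a local transaction plus the compensation that undoes it.
struct Step {
    name: &'static str,
    action: fn(&mut Shards) -> Result<(), String>,
    compensation: fn(&mut Shards),
}

/// Runs the steps in order. If a step fails, the compensations of all
/// previously completed steps are run in reverse order.
fn execute_with_compensation(state: &mut Shards, steps: &[Step]) -> Result<(), String> {
    for (i, step) in steps.iter().enumerate() {
        if let Err(e) = (step.action)(state) {
            for done in steps[..i].iter().rev() {
                (done.compensation)(state);
            }
            return Err(format!("step '{}' failed: {}", step.name, e));
        }
    }
    Ok(())
}

fn debit_shard_0(s: &mut Shards) -> Result<(), String> {
    if s[0] < 10 {
        return Err("insufficient funds".to_string());
    }
    s[0] -= 10;
    Ok(())
}

fn refund_shard_0(s: &mut Shards) {
    s[0] += 10;
}

fn credit_shard_1(s: &mut Shards) -> Result<(), String> {
    s[1] += 10;
    Ok(())
}

fn uncredit_shard_1(s: &mut Shards) {
    s[1] -= 10;
}

fn always_fails(_: &mut Shards) -> Result<(), String> {
    Err("shard unavailable".to_string())
}

fn nothing_to_undo(_: &mut Shards) {}

/// A two-step saga: move 10 units from shard 0 to shard 1.
fn transfer_steps() -> Vec<Step> {
    vec![
        Step { name: "debit", action: debit_shard_0, compensation: refund_shard_0 },
        Step { name: "credit", action: credit_shard_1, compensation: uncredit_shard_1 },
    ]
}
```

Starting from balances [100, 0], running transfer_steps() leaves [90, 10]. If a third step using always_fails is appended and the saga is run again, both compensations fire in reverse order and the balances stay at [90, 10].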
<aside>
📖 The terms microservice, shard and canister are interchangeable in this document.
</aside>
The union-db implementation of sagas differs slightly from what is described in books and articles. The literature distinguishes two kinds of sagas: choreography-based and orchestration-based. But because all shards (microservices) in union-db share the same binary, we are able to make our sagas into something in between: they are orchestration-based and choreography-based at the same time. Each shard is able to initiate a transaction and orchestrate it, defining which functionality will be executed and which will not. But once the saga is initiated, each shard executing the transaction is responsible for triggering the execution of the next transaction on some other shard, basically mimicking the event exchange of choreography-based sagas. This solution takes the best of both worlds: each saga can be defined in its own separate file, which simplifies development, and there is no single orchestrator responsible for the execution, which would otherwise be a bottleneck. That is good for the reliability of our sagas.
Sagas are inherently ACD, not ACID. They do not provide isolation for transactions by default: each concurrent transaction sees the changes made to data by other transactions. Throughout this document, we’ll see that this is actually not a big issue, even for financial applications. Sagas are flexible enough that you can always either make up for the missing isolation by adding and reacting to different states of your data, or even implement a 2PC layer with locks and complete isolation on top of them.
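One way to picture "adding and reacting to different states of your data" is a record that tracks reserved funds separately from spendable ones. The sketch below is an assumption-laden illustration, not union-db code: Account, reserve, commit and release are hypothetical names, and the model is a single in-memory struct rather than a sharded document.

```rust
// Hypothetical model: Account and its methods are not part of union-db.

#[derive(Debug, PartialEq)]
struct Account {
    /// Funds any transaction may spend.
    available: u64,
    /// Funds reserved by in-flight sagas; other transactions must not touch them.
    pending: u64,
}

impl Account {
    /// First local transaction of a transfer saga: reserve the funds.
    /// Concurrent sagas only look at `available`, so they cannot spend
    /// money that is already reserved.
    fn reserve(&mut self, amount: u64) -> Result<(), String> {
        if self.available < amount {
            return Err("insufficient available funds".to_string());
        }
        self.available -= amount;
        self.pending += amount;
        Ok(())
    }

    /// Final local transaction: the saga succeeded, the funds leave for good.
    /// Precondition: `amount` was reserved earlier.
    fn commit(&mut self, amount: u64) {
        self.pending -= amount;
    }

    /// Compensation: the saga failed, the funds become spendable again.
    /// Precondition: `amount` was reserved earlier.
    fn release(&mut self, amount: u64) {
        self.pending -= amount;
        self.available += amount;
    }
}
```

With an account holding 100 available units, reserving 70 makes a concurrent attempt to reserve 50 fail, even though the first saga has not finished yet. Releasing the 70 restores the original balance, which is the role a compensation plays.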
But first, let’s start with the basics. In union-db each saga consists of transactions, which are defined by the following trait:
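#[async_trait]
// Sized is needed because run() takes and returns Self by value
trait Transaction: Sized {
    // consumes the transaction; Some((txn, shard)) means "continue on that shard"
    async fn run(self) -> Option<(Self, Principal)>;
}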
<aside>
📖 We’re using async_trait here to indicate that a transaction body can be asynchronous.
</aside>
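To make the meaning of the return value concrete, here is a small synchronous model of the same idea. Everything in it is a stand-in chosen for illustration: ShardId replaces Principal, Shard replaces a canister’s state, and SyncTransaction, SetValue and run_to_completion are hypothetical names that do not exist in union-db.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins, not union-db types.
type ShardId = usize;

struct Shard {
    /// Largest key this shard is responsible for; None means "no upper bound".
    max_key: Option<&'static str>,
    data: BTreeMap<String, String>,
}

/// Synchronous model of a transaction: it runs on one shard and may ask
/// to be re-run on another one by returning itself and that shard's id.
trait SyncTransaction: Sized {
    fn run(self, shard_id: ShardId, shard: &mut Shard) -> Option<(Self, ShardId)>;
}

struct SetValue {
    key: String,
    value: String,
}

impl SyncTransaction for SetValue {
    fn run(self, shard_id: ShardId, shard: &mut Shard) -> Option<(Self, ShardId)> {
        if let Some(max) = shard.max_key {
            if self.key.as_str() > max {
                // wrong shard: hand the same transaction over to the next one
                return Some((self, shard_id + 1));
            }
        }
        // right shard: apply the change and finish
        shard.data.insert(self.key, self.value);
        None
    }
}

/// Keeps re-running the transaction wherever it asks to go until it
/// returns None. Returns the number of shards visited.
fn run_to_completion<T: SyncTransaction>(mut txn: T, shards: &mut [Shard], start: ShardId) -> usize {
    let mut current = start;
    let mut visited = 0;
    loop {
        visited += 1;
        match txn.run(current, &mut shards[current]) {
            Some((next_txn, next_shard)) => {
                txn = next_txn;
                current = next_shard;
            }
            None => return visited,
        }
    }
}
```

In this model the last shard is assumed to have max_key set to None, so a transaction always finds a shard that accepts it. A key that sorts after the first shard’s max_key is redirected once and lands on the second shard.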
So a transaction is something that can be run, consuming itself, and that optionally returns itself together with the canister id of another shard. When the result is Some(...), the execution of the transaction repeats on the shard identified by the returned Principal. To make this easier to understand, let’s look at the simplest possible example: changing the name of the application. Our database has the following schema (just to keep it in front of our eyes; the database itself is schemaless, as you might remember):
struct Database {
    app_name: String,
}
And the saga to change the name is defined like this:
let state = State::new();
...
// the entry point: a regular update function, so clients can call it as usual
#[update]
fn set_app_name(app_new_name: String) {
    // construct the key "app_name"
    let app_name_key = CompositeKey::from_segments([str_key("app_name")]);
    // create the transaction object
    let txn = SetAppName { new_app_name: app_new_name, app_name_key };
    // let the dispatcher know which saga we want it to execute
    Dispatcher::run_saga(txn);
}
// our first and only transaction in this saga
#[derive(Serialize, Deserialize)]
struct SetAppName {
    new_app_name: String,
    app_name_key: CompositeKey,
}
#[async_trait]
impl Transaction for SetAppName {
    async fn run(self) -> Option<(Self, Principal)> {
        // check the bounds of the current shard
        // this check can't fail today, because our state is very small,
        // but it is good to be ready for future changes
        // if we're on the wrong shard, redirect this same transaction to the next shard
        if let Err(shard_id) = state.check_bounds(&self.app_name_key) {
            return Some((self, shard_id));
        }
        // this code only executes if we're on the right shard
        // encode the new app_name value as bytes and construct a Document::Leaf from it
        let value = Document::Leaf(encode(&self.new_app_name));
        // insert the value under the key
        state.insert(self.app_name_key, value);
        // return None to indicate that the end of the saga has been reached
        None
    }
}
This example is a little bit redundant, since there is no need for a whole database if the only thing you store is the name of the app itself. But it demonstrates the basics of transactions in union-db very well. First of all, we have the Dispatcher, a singleton service whose job is to make sure that all transactions are executed completely. Its workflow is very simple:
when the run_saga method is executed, it invokes the run() method of the submitted transaction;