Transactions encapsulate a number of related actions on a database. They serve two main purposes:
There are two layers of transactions in DRAGOS: user transactions and database transactions. Only user transactions are accessed directly by the user or application; each maps to one or more database transactions. This layered architecture allows the addition of sophisticated features like distributed transactions (spanning more than one GraphPool, possibly running on different machines on a network) in the future.
To provide a central point for configuring and accessing transaction managers, a single TransactionManagerFactory runs as a service in the DRAGOS kernel. The current implementation is rather simple, supporting only
a hard-coded default TransactionManager implementation, 1:1 mappings between
TransactionManagers and GraphPools (mostly because there are no TransactionManagers
supporting distributed transactions yet), and little configurability.
But it can easily be extended with new functionality once the need arises.
Do not let the name "factory" confuse you: once created for any given DataSourceURL, the same TransactionManager will be returned for every further call with the same argument. This caching not only increases performance, but also makes the TransactionListener mechanism much more useful and frees the application from keeping a reference to the current transaction manager (because it can be retrieved again at any time).
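The caching behaviour described above can be illustrated with a minimal sketch. The classes `CachingFactorySketch` and `SketchTransactionManager` are hypothetical stand-ins invented for this illustration, and a plain `String` takes the place of a DataSourceURL; none of this is the actual DRAGOS API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a TransactionManager; not the DRAGOS interface.
class SketchTransactionManager {
    final String dataSourceUrl;

    SketchTransactionManager(String dataSourceUrl) {
        this.dataSourceUrl = dataSourceUrl;
    }
}

// Hypothetical factory that hands out exactly one manager per data source URL.
class CachingFactorySketch {
    private static final CachingFactorySketch INSTANCE = new CachingFactorySketch();

    private final Map<String, SketchTransactionManager> cache = new ConcurrentHashMap<>();

    static CachingFactorySketch getInstance() {
        return INSTANCE;
    }

    // Despite the name, this only creates a manager on the first call per URL;
    // later calls with an equal URL return the cached instance.
    SketchTransactionManager create(String dataSourceUrl) {
        return cache.computeIfAbsent(dataSourceUrl, SketchTransactionManager::new);
    }
}
```

Because every call with the same argument yields the same object, a listener registered on the returned manager stays attached no matter which part of the application retrieves the manager later.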
TransactionManager is the central interface in the transaction architecture. Every instance is associated with at least one GraphPool:
Suppose we have three GraphPools: GP_1, GP_2 and GP_3. We want distributed transactions for GP_1 and GP_2, but not for GP_3, which belongs to another project running on the same DRAGOS kernel. Then we would have two TransactionManager instances: TM_A, which handles GP_1 and GP_2, and TM_B, which is associated with GP_3:
TM_A [GP_1, GP_2] TM_B [GP_3]
If you call TM_A to start a user transaction, it will start two database transactions, one each in GP_1 and GP_2. If you made the same call on TM_B, it would start only one database transaction, in GP_3.
The TransactionManagerFactory API contains a method to retrieve the TransactionManager instance responsible for a certain GraphPool. So if you wanted to perform an operation on GP_2 in our example, you would call TransactionManagerFactory.getInstance().create(DataSourceURL) with the DataSourceURL of GP_2 as the parameter, and it would return TM_A. Then you can use TM_A to start a user transaction and perform the desired operations.
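The relationship between GraphPools, transaction managers and the two transaction layers in this example can be modelled with a small toy sketch. `PoolManagerSketch` and `PoolRegistrySketch` are hypothetical names made up for this illustration, and pools are identified by plain strings; the real lookup goes through TransactionManagerFactory and a DataSourceURL, whose exact signatures are documented in the API documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a TransactionManager bound to one or more GraphPools.
class PoolManagerSketch {
    final String name;
    final List<String> pools;

    PoolManagerSketch(String name, String... pools) {
        this.name = name;
        this.pools = Arrays.asList(pools);
    }

    // Starting one user transaction opens one database transaction per pool.
    List<String> beginUserTransaction() {
        List<String> databaseTransactions = new ArrayList<>();
        for (String pool : pools) {
            databaseTransactions.add("db-tx in " + pool);
        }
        return databaseTransactions;
    }
}

// Hypothetical registry playing the role of the factory lookup.
class PoolRegistrySketch {
    private final Map<String, PoolManagerSketch> byPool = new HashMap<>();

    // Every pool handled by the manager resolves to that same manager.
    void register(PoolManagerSketch manager) {
        for (String pool : manager.pools) {
            byPool.put(pool, manager);
        }
    }

    PoolManagerSketch managerFor(String pool) {
        return byPool.get(pool);
    }
}
```

With TM_A registered for GP_1 and GP_2 and TM_B registered for GP_3, looking up GP_2 yields TM_A, and a user transaction started on it produces two database transactions, while one started on TM_B produces a single one.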
There is a simple rule in DRAGOS: Everything happens inside a transaction. This makes event handling and ensuring schema consistency much easier. Operations that may seem atomic might consist of any number of single steps depending on the PEGS implementation. Wrapping everything in a transaction ensures that you do not end up with a corrupted database if one of these steps fails. Even read-only operations may actually write to the database, e.g. if the value of a dynamic attribute is recalculated and cached.
Please refer to the example above as well as the API documentation for information on how to use transactions in your application.
DRAGOS aims to be user friendly, so if you do not want to deal with transactions, there is no need to - just activate dragos-ext-autocommit! This extension, implemented using the Wrapper mechanism (see "Guide to Wrappers" for more information), intercepts every method call, checks whether a transaction is currently active, and starts a new one if necessary. Before returning from the method call, the transaction is committed - but only if it was started automatically by this same method. What does this mean in practice? You can still start and commit transactions manually when you want to, without dragos-ext-autocommit getting in your way. But if you call any method without starting a transaction first, it will transparently encapsulate that call in a transaction, thus satisfying the requirement that everything happens inside the scope of a transaction.
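The interception logic can be sketched with a standard Java dynamic proxy. This is only an illustration of the idea, not how dragos-ext-autocommit or the Wrapper mechanism is actually implemented: `TxContextSketch`, `GraphOpsSketch` and `AutoCommitSketch` are hypothetical names, and rolling back when the wrapped call throws is an assumption that the text above does not specify.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical minimal transaction context; counts commits for demonstration.
class TxContextSketch {
    boolean active;
    int commits;

    void begin() { active = true; }
    void commit() { active = false; commits++; }
    void rollback() { active = false; }
}

// Hypothetical graph interface whose calls should be auto-committed.
interface GraphOpsSketch {
    String readNode(String id);
}

class AutoCommitSketch {
    // Wraps target so that every call runs inside a transaction.
    static <T> T wrap(T target, Class<T> iface, TxContextSketch tx) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Only a transaction started by this very call is committed here.
            boolean startedHere = !tx.active;
            if (startedHere) {
                tx.begin();
            }
            try {
                Object result = method.invoke(target, args);
                if (startedHere) {
                    tx.commit();
                }
                return result;
            } catch (InvocationTargetException e) {
                // Assumption: an automatically started transaction is rolled back on failure.
                if (startedHere) {
                    tx.rollback();
                }
                throw e.getCause();
            }
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler));
    }
}
```

A call made while no transaction is active is wrapped in its own transaction and committed on return; a call made inside a manually started transaction leaves that transaction open for the caller to commit.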
This section deals with some of the finer points of transaction state changes and the events generated by those.
Generally, the order is pretty simple:
1. BEFORE_XY event is fired
2. State changes to XY'ED
3. AFTER_XY event is fired

However, two of the possible operations on transactions have a cascading effect, which complicates matters slightly:
- commit() on a top-level transaction changes the state of all descendants from PRE_COMMIT to COMMITED
- rollback() on a transaction anywhere in the hierarchy changes the state of all descendants to ROLLED_BACK
These operations are atomic, affecting several transactions at once, which we obviously cannot (and do not even want to) replicate in the generated events. We want the events to be fired in a sensible order, which in this case means the reverse order of creation, so that the events for any nested transactions are fired before their parent's event.
But since we also want to display the updated status to the outside world as soon as possible, and especially when the event generation described above begins, we first update the status field of all affected transactions before firing the first event. Thus, even during processing of the first AFTER_COMMIT or AFTER_ROLLBACK event (the one generated by the "youngest" transaction), getState() on its ancestors and other transactions in the hierarchy will return the correct value.
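The two-phase approach described above (update every state first, then fire events from the youngest transaction to the oldest) can be sketched as follows. `TxNodeSketch` and `CascadeSketch` are hypothetical names used only for this illustration, events are recorded as strings instead of being delivered to listeners, and the state names are taken from the text above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical transaction record with a name and a state.
class TxNodeSketch {
    final String name;
    String state = "PRE_COMMIT";

    TxNodeSketch(String name) {
        this.name = name;
    }
}

class CascadeSketch {
    // Expects the affected transactions in creation order, top-level first.
    // Returns the fired events in the order in which they were fired.
    static List<String> commitAll(List<TxNodeSketch> inCreationOrder) {
        TxNodeSketch top = inCreationOrder.get(0);

        // Phase 1: update the state of every affected transaction up front.
        for (TxNodeSketch tx : inCreationOrder) {
            tx.state = "COMMITED";
        }

        // Phase 2: fire events in reverse creation order, youngest first.
        List<TxNodeSketch> youngestFirst = new ArrayList<>(inCreationOrder);
        Collections.reverse(youngestFirst);
        List<String> events = new ArrayList<>();
        for (TxNodeSketch tx : youngestFirst) {
            // A listener asking for the top-level state already sees the final value.
            events.add("AFTER_COMMIT " + tx.name + " top=" + top.state);
        }
        return events;
    }
}
```

For a top-level transaction with a child and a grandchild, the first recorded event belongs to the grandchild, and it already reports the top-level transaction as committed.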
The details of implementing transaction support vary widely with the database used (if any). Usually, most of the transaction code will be handled in the implementation of i3.dragos.core.services.datasources.DataSource and i3.dragos.core.services.datasources.DataSourceTransaction.
If you are using a database with a JDBC back-end, consider using dragos-db-jdbc, which takes care of the data source implementation, so you only have to deal with the actual graph model implementation.
The user should be able to operate on graph data in a natural and intuitive way.
This means that we did not want to include a Transaction parameter in each method call;
instead, we associated exactly one Transaction with each thread.
However, we also wanted to have as much flexibility as possible in the
implementation of various parts of the system, especially allowing
for distribution. This means we do not have the same 1:1 mapping for
threads and DataSourceTransactions. Instead, it is up to the implementation
to ask the Transaction for the associated DataSourceTransaction, and
execute its commands in that context accordingly.
This affects not only the graph model implementation, but also the
data source implementation. You have to be prepared to handle
multiple parallel transactions in a single thread. The details are
back-end specific; for a JDBC-compliant database, you would create
a separate Connection for each top-level DataSourceTransaction,
share the connection for all nested transactions, and close
that connection as soon as the top-level transaction is
committed or rolled back.
A few words about nested transactions: During design and specification of the DRAGOS transaction services, we examined a number of popular RDBMS and ODBMS. It turned out that they either provided no support for nested transactions at all, or that these were simulated using so-called checkpoints or savepoints in a single (top-level) transaction. This allows the rollback of nested transactions even if they are already in PRECOMMIT state, a useful feature which we thus decided to make a requirement of the DRAGOS transaction specification. If you have to implement a data source for a DBMS that does not support this, you can either try to simulate that behaviour, or decide not to support nested transactions at all. The latter must be documented and specified in the DataSourceMetaData, and any attempt to create a nested transaction at runtime must result in a TransactionException.
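For a JDBC back-end, the savepoint approach mentioned above can be sketched with the standard `java.sql.Savepoint` API. The class `SavepointNestedTxSketch` is a hypothetical illustration and not part of DRAGOS or dragos-db-jdbc; it assumes the connection belongs to the enclosing top-level transaction, has auto-commit disabled, and that the driver supports savepoints.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Savepoint;

// Hypothetical nested transaction simulated with a savepoint on the
// connection of its top-level transaction.
class SavepointNestedTxSketch {
    private final Connection connection;
    private final Savepoint savepoint;

    // Marks the start of the nested transaction on the shared connection.
    SavepointNestedTxSketch(Connection connection) {
        Savepoint created;
        try {
            created = connection.setSavepoint();
        } catch (SQLException e) {
            throw new IllegalStateException("could not start nested transaction", e);
        }
        this.connection = connection;
        this.savepoint = created;
    }

    // Undoes all work done since the savepoint was set. The enclosing
    // top-level transaction stays open and can still commit or roll back.
    void rollback() {
        try {
            connection.rollback(savepoint);
        } catch (SQLException e) {
            throw new IllegalStateException("could not roll back nested transaction", e);
        }
    }
}
```

Since nothing is made permanent before the top-level transaction commits, a nested transaction modelled this way can be rolled back at any point up to then, which matches the requirement stated above.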
What you have to keep in mind is that everything happens inside a transaction, so you should enforce this in your implementation, throwing a GrasGXLException if no transaction is active when a method is called.