Introduction

Since DRAGOS deals with large and complex data structures in a possibly distributed context, data consistency is a primary concern. This document describes the relevant aspects of data consistency and how DRAGOS addresses them in its design and implementation.

Stale references

After deleting a GraphEntity or undeclaring a GraphEntityClass or Attribute, an application may still hold one or more references to it through the corresponding Java object(s). This is especially true for distributed applications.

Obviously, it does not make sense to perform "writing" operations on a deleted entity, e.g. creating nodes inside a graph that no longer exists. That is why every such method first verifies that the object it is called on still exists, and throws a runtime exception if it does not. This is necessary to avoid cluttering the database with inaccessible and inconsistent data, and it helps detect usage errors in the application.
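
The guard described above can be sketched as follows. This is a minimal illustration, not DRAGOS code: SketchGraph, SketchStaleWriteException and ensureExists are hypothetical names, and the deletion flag stands in for whatever lookup a real implementation performs against its data source.

```java
// Hypothetical stand-ins; none of these names are taken from the DRAGOS API.
class SketchStaleWriteException extends RuntimeException {
    SketchStaleWriteException(String message) {
        super(message);
    }
}

class SketchGraph {
    private final String name;
    private boolean deleted = false;
    private int nodeCount = 0;

    SketchGraph(String name) {
        this.name = name;
    }

    // Simulates deletion of the graph while the Java object stays reachable.
    void delete() {
        deleted = true;
    }

    // "Writing" operation: the existence check always runs before any change.
    int createNode() {
        ensureExists();
        nodeCount++;
        return nodeCount;
    }

    private void ensureExists() {
        if (deleted) {
            throw new SketchStaleWriteException(
                    "graph '" + name + "' no longer exists");
        }
    }
}
```

Because the check runs before any state is modified, a failed call leaves no partial data behind.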

A disadvantage of these checks is that they add overhead to every method call. A correct application does not need these checks, and even in the distributed case the deletion of objects can be easily detected and reacted to through the event system. This is why we made the checks optional for "reading" operations (those that do not modify the graph or schema data).

You can toggle these checks through the forceExistenceCheckOnRead method of GraphPool. If they are deactivated, the result of a method call on a stale reference is undefined: the implementation may throw an exception, return cached data, or even return random garbage. An implementation may also choose to perform existence checks even when it is not forced to do so. If, however, you enforce the checks by setting the flag to true, all implementations must perform existence checks on every method and throw an exception if the object no longer exists. This gives deterministic, fail-fast behaviour and should be the mode of operation during development and testing of the application, and even during regular use if performance is of lesser concern.
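
The effect of the flag on a "reading" operation might look like the sketch below. Only the name forceExistenceCheckOnRead comes from the text; its boolean parameter, the classes SketchGraphPool and SketchPoolNode, and the use of IllegalStateException are assumptions made for illustration.

```java
// Hypothetical stand-in for GraphPool; only the flag name comes from the text.
class SketchGraphPool {
    private boolean existenceCheckOnReadForced = false;

    // Assumed signature: a single boolean parameter.
    void forceExistenceCheckOnRead(boolean force) {
        existenceCheckOnReadForced = force;
    }

    boolean isExistenceCheckOnReadForced() {
        return existenceCheckOnReadForced;
    }
}

class SketchPoolNode {
    private final SketchGraphPool pool;
    private final String cachedLabel;
    private boolean deleted = false;

    SketchPoolNode(SketchGraphPool pool, String label) {
        this.pool = pool;
        this.cachedLabel = label;
    }

    // Simulates deletion of the node while the Java object stays reachable.
    void delete() {
        deleted = true;
    }

    // "Reading" operation: checked only while the pool forces the check.
    String getLabel() {
        if (pool.isExistenceCheckOnReadForced() && deleted) {
            throw new IllegalStateException("node no longer exists");
        }
        // Unchecked path: this sketch happens to return cached data. A real
        // implementation may behave differently on a stale reference.
        return cachedLabel;
    }
}
```

With the flag off, this particular sketch silently returns the cached label of a deleted node; with the flag on, the same call fails immediately, which is the fail-fast behaviour recommended for development and testing.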

Two methods are guaranteed to work even on stale references: getDataSourceURL() and getInternalIdentifier(). These methods are also excluded from the forced existence checks. The main reason for this is deletion events: the source of such an event would be useless if you could not identify it or compare it to other objects. Several other methods, namely toString(), hashCode() and equals(Object), are also expected to work even after deletion. If this poses a problem for your implementation, consider using the Proxies from the i3.dragos.gm.core.proxies package.
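
One way to meet these expectations is to base identity purely on values captured when the reference is created, so that no lookup in the data source is needed after deletion. The sketch below assumes a String URL and a long identifier; the actual return types of getDataSourceURL() and getInternalIdentifier() are not stated here, and SketchEntityRef is a hypothetical class.

```java
import java.util.Objects;

// Hypothetical reference object; the field types are assumptions.
final class SketchEntityRef {
    private final String dataSourceURL;
    private final long internalIdentifier;
    private boolean deleted = false;

    SketchEntityRef(String dataSourceURL, long internalIdentifier) {
        this.dataSourceURL = dataSourceURL;
        this.internalIdentifier = internalIdentifier;
    }

    // Simulates deletion of the underlying entity.
    void markDeleted() {
        deleted = true;
    }

    // The methods below never consult the deleted flag or the data source,
    // so they keep working on a stale reference.
    String getDataSourceURL() {
        return dataSourceURL;
    }

    long getInternalIdentifier() {
        return internalIdentifier;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SketchEntityRef)) {
            return false;
        }
        SketchEntityRef that = (SketchEntityRef) other;
        return internalIdentifier == that.internalIdentifier
                && dataSourceURL.equals(that.dataSourceURL);
    }

    @Override
    public int hashCode() {
        return Objects.hash(dataSourceURL, internalIdentifier);
    }

    @Override
    public String toString() {
        return "SketchEntityRef[" + dataSourceURL + "#" + internalIdentifier + "]";
    }
}
```

A handler for a deletion event could then still compare the event source against references it holds, or use it as a key in a hash-based collection.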

Graph and schema data inconsistencies

In DRAGOS, we distinguish between two types of inconsistencies: those that may never occur, and those that may occur temporarily during the course of a transaction.

Some inconsistencies have to be prevented to allow useful operation on the data at all. They are prevented by design, either through checks that throw an EntityInUseException if the object in question is still referenced somewhere, or through cascading deletion of dependent objects. See Schema.undeclareGraphEntityClass(GraphEntityClass) for an example of both techniques. If a graph model implementation or the application detects a violation of these invariants (e.g. through assertions), the violation should be treated as an internal error, usually by raising an exception, because the code is probably buggy and the database needs to be repaired manually.
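
The two techniques can be illustrated with a toy schema. This is only a sketch under assumptions: SketchSchema and SketchInUseException are hypothetical stand-ins, and which dependents the real Schema.undeclareGraphEntityClass rejects and which it cascades to is not specified in this document. Here, a node class that is still the target of an edge class is rejected, while its instances are removed together with the class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for EntityInUseException.
class SketchInUseException extends RuntimeException {
    SketchInUseException(String message) {
        super(message);
    }
}

class SketchSchema {
    // Node class name -> names of its instances.
    private final Map<String, List<String>> instances = new HashMap<>();
    // Edge class name -> name of the node class it targets.
    private final Map<String, String> edgeTargets = new HashMap<>();

    void declareNodeClass(String name) {
        instances.put(name, new ArrayList<>());
    }

    void declareEdgeClass(String name, String targetNodeClass) {
        edgeTargets.put(name, targetNodeClass);
    }

    void undeclareEdgeClass(String name) {
        edgeTargets.remove(name);
    }

    void createNode(String nodeClass, String nodeName) {
        List<String> nodes = instances.get(nodeClass);
        if (nodes == null) {
            throw new IllegalArgumentException("undeclared class: " + nodeClass);
        }
        nodes.add(nodeName);
    }

    boolean isDeclared(String nodeClass) {
        return instances.containsKey(nodeClass);
    }

    int nodeCount() {
        int count = 0;
        for (List<String> nodes : instances.values()) {
            count += nodes.size();
        }
        return count;
    }

    void undeclareNodeClass(String name) {
        // Technique 1: refuse while the class is still referenced in the schema.
        for (Map.Entry<String, String> edge : edgeTargets.entrySet()) {
            if (edge.getValue().equals(name)) {
                throw new SketchInUseException(
                        name + " is still the target of " + edge.getKey());
            }
        }
        // Technique 2: cascade; removing the class also drops its instances.
        instances.remove(name);
    }
}
```

The check runs before anything is removed, so a rejected call leaves both the schema and the instances untouched.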

However, it makes sense to allow some violations of the schema's restrictions temporarily. An obvious example is minimum cardinalities, since you cannot create multiple entities at once. A core graph model implementation must be prepared for these cases and return correct data even in the presence of such inconsistencies. The expected behaviour in these cases is (or at least should be) described in the core GM API doc. An application should also be aware of them, especially during event handling with EventCouplingMode.IMMEDIATE. These inconsistencies must be corrected before the transaction can be committed; this is enforced through checks that are performed automatically upon commit. If a commit is attempted while one or more inconsistencies still exist, the commit call fails with an exception.
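
The following sketch shows the idea of a commit-time check for a minimum cardinality. All names (SketchTransaction, SketchCommitException, MIN_OUTGOING_EDGES) and the rule "every node needs at least one outgoing edge" are made up for illustration; the actual checks and exception type used by DRAGOS are not specified here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical exception type for a rejected commit.
class SketchCommitException extends RuntimeException {
    SketchCommitException(String message) {
        super(message);
    }
}

class SketchTransaction {
    // Hypothetical schema restriction: minimum cardinality of outgoing edges.
    private static final int MIN_OUTGOING_EDGES = 1;

    // Node id -> number of outgoing edges created so far.
    private final Map<String, Integer> outgoingEdges = new HashMap<>();
    private boolean committed = false;

    // A freshly created node has no edges yet, so the minimum cardinality
    // is violated until an edge is added. This is tolerated until commit.
    void createNode(String id) {
        outgoingEdges.put(id, 0);
    }

    void createEdge(String sourceId) {
        if (!outgoingEdges.containsKey(sourceId)) {
            throw new IllegalArgumentException("unknown node: " + sourceId);
        }
        outgoingEdges.merge(sourceId, 1, Integer::sum);
    }

    // Collects the ids of all nodes that still violate the restriction.
    List<String> findViolations() {
        List<String> violations = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : outgoingEdges.entrySet()) {
            if (entry.getValue() < MIN_OUTGOING_EDGES) {
                violations.add(entry.getKey());
            }
        }
        return violations;
    }

    // The check runs automatically as part of commit.
    void commit() {
        List<String> violations = findViolations();
        if (!violations.isEmpty()) {
            throw new SketchCommitException(
                    "unresolved inconsistencies: " + violations);
        }
        committed = true;
    }

    boolean isCommitted() {
        return committed;
    }
}
```

Between createNode and createEdge the data is inconsistent but still readable, which mirrors the state an event handler may observe with immediate coupling.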

The following table gives a mostly complete overview of the possible inconsistencies and the class each one belongs to. Keep in mind that "references" here means cases like "the (deleted) node class is declared as target type of an edge class", not the stale references discussed above!

Problem                                                     Temporarily allowed     Never allowed
                                                            during a Transaction
----------------------------------------------------------  ----------------------  -------------
Dangling Edges, RelationEnds                                X
Violation of Edge or RelationEnd cardinalities              X
Edges or RelationEnds pointing to deleted entities                                  X
Entities without associated GraphEntityClass                                        X
Instances of undeclared Attributes                                                  X
References to undeclared GraphEntityClasses inside Schema                           X
  (including dangling EdgeClasses, RelationEndClasses)
Inheritance cycles in the schema                                                    X