4.2 Loading and Saving Data
CloudTran stores data from the in-memory grid to persistent stores.
At start of day, CloudTran can also load the data from the persistent stores into the in-memory data grid (IMDG).
The load and store mechanisms are controlled by generate-time and run-time configuration properties, as explained in this chapter.
4.2.1 Class Names
CloudTran creates a series of classes for each entity to handle the various aspects of entity persistence.
This section describes the classes and the naming scheme, using an entity named 'Customer' as an example.
- <<entity>> - e.g. Customer - the entity object that gives an application-programmer-friendly view of the customer information.
This means that if related objects are referenced, they will be accessible by relation methods.
For example, calling 'getCustorders()' will return Custorder entities into the virtual machine's address space,
regardless of where the information actually resides.
The Customer class implements IEntity.
- <<entity>>Data - e.g. CustomerData - the data object that is stored in the in-memory data grid.
Related objects are referenced by foreign key.
Following database conventions, foreign keys for one-to-one and one-to-many relations are put on one end of the relation.
(And in this version, CloudTran does not support many-to-many relations - you must simulate them.)
For example, consider the Customer to Custorder relation, which is one-to-many - each customer can have many orders.
In this case, the CustomerData object has no foreign keys for the CustorderData object;
rather, the foreign key is on the CustorderData object, which has getCustomerFK() for the Customer foreign key.
When the relations are between parent-child objects within an entity group,
CloudTran knows that the related objects will be local - i.e. in the same data space - so there are no cross-node calls to populate the relation in an entity.
For all other relations, in general the requested objects can be spread across multiple nodes, so
CloudTran must do a map-reduce search of all the spaces/nodes that can hold the target objects.
- <<entity>>Storer - e.g. CustomerStorer - manages the store process.
By default, data objects implement ICloudTranPersistable, which is an indicator that the
record must be persisted.
If a data object implements ICloudTranPersistable,
it implements the getConfigData() method.
In the current CloudTran implementation, getConfigData() returns a string,
which is the fully-qualified class name for the Storer object.
In other words, this is the link between data in the IMDG and
information in the transaction buffer; a small benefit of this approach is that
the storer objects do not need to be resident in the IMDG.
Storer objects must implement the ICloudTranStorer interface; the business end of
this is the create/update/delete group of methods, which
must perform the action at the database.
For default databases, this is done via JDBC,
and the Storer object is generated.
- <<entity>>Loader - e.g. CustomerLoader - retrieves the entities from the store at start of day.
At start of day (i.e. when the grid is first powered up),
CloudTran normally loads the in-memory data grid from the persistent store.
The load process is started off by the com.cloudtran.persist.JdbcLoaderFramework class,
which calls the per-entity loader class (like CustomerLoader) method 'bulkLoad'
passing an ICloudTranInjector object. The loader class loads data records into
an array and then calls the injector object to inject them into the correct space.
It is possible to disable loading entirely by setting the
ct.persist.loadAtStartOfDay configuration parameter
to false.
It is also possible to use your own loader framework rather than JdbcLoaderFramework.
To do this, specify the fully-qualified class name in the
ct.persist.overallLoaderClass
configuration parameter.
The class you specify here must implement the
ICloudTranOverallLoader interface.
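The link from a data object to its storer, as described above, can be sketched as follows. This is a minimal illustration and not CloudTran's implementation: SketchPersistable and SketchStorer are hypothetical stand-ins for ICloudTranPersistable and ICloudTranStorer (whose real signatures are not shown in this section), and the storer only records calls instead of issuing JDBC statements.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: SketchPersistable and SketchStorer are hypothetical
// stand-ins, NOT the real ICloudTranPersistable / ICloudTranStorer signatures.
class StorerLinkSketch {

    /** Stand-in for ICloudTranPersistable: the data object names its storer class. */
    interface SketchPersistable {
        String getConfigData();
    }

    /** Stand-in for ICloudTranStorer: the create/update/delete group of methods. */
    interface SketchStorer {
        void create(SketchPersistable data);
        void update(SketchPersistable data);
        void delete(SketchPersistable data);
    }

    /** Plays the role of CustomerData, the object held in the data grid. */
    static class CustomerData implements SketchPersistable {
        final long id;
        CustomerData(long id) { this.id = id; }
        @Override public String getConfigData() { return CustomerStorer.class.getName(); }
        @Override public String toString() { return "CustomerData#" + id; }
    }

    /** Plays the role of CustomerStorer; it logs each call rather than touching a database. */
    static class CustomerStorer implements SketchStorer {
        static final List<String> LOG = new ArrayList<>();
        @Override public void create(SketchPersistable data) { LOG.add("create " + data); }
        @Override public void update(SketchPersistable data) { LOG.add("update " + data); }
        @Override public void delete(SketchPersistable data) { LOG.add("delete " + data); }
    }

    /** Resolves the storer from the class name carried by the data object. */
    static SketchStorer storerFor(SketchPersistable data) {
        try {
            Class<?> storerClass = Class.forName(data.getConfigData());
            return (SketchStorer) storerClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create storer for " + data, e);
        }
    }

    public static void main(String[] args) {
        CustomerData customer = new CustomerData(42);
        SketchStorer storer = storerFor(customer);
        storer.create(customer);
        storer.update(customer);
        System.out.println(CustomerStorer.LOG);
    }
}
```

Because the data object carries only a class name, the storer is instantiated where the store action runs, which is consistent with the point that storer objects do not need to be resident in the IMDG.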
4.2.2 Use Cases for Data
CloudTran is a general-purpose platform for supporting a wide range of applications and the integration between them.
This means that there are different ways that data objects are used - just as in Enterprise Java and SOA.
This section gives an overview of the supported use cases for data, and how they are declared in CloudTran.
The use cases are described after the following summary of their implications:
| Use Case | Load From DB? | Store To DB? | In Data Grid? | Summary |
| --- | --- | --- | --- | --- |
| Write-Through | Y | Y | Y | Entities with System of Record in the data grid, managed persistence to persistent store. |
| Direct-To-Store | N | Y | N | Data goes direct to persistent store without being cached in the data grid. Good for transaction audit trails. |
| Temporary | N | N | Y | Temporary entities that have no representation in persistent store, but can have relationships to persistent entities. |
| Externally-managed Beans | N | N | Y | Beans that are managed externally and occasionally loaded into the CloudTran Grid. |
| Message Beans | N | N | Y | Java beans (not entities) that transfer information into and out of the data grid. |
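The summary table can also be read as a small lookup structure. The sketch below encodes its three Y/N columns in a Java enum; the enum and its method names are hypothetical and are not part of the CloudTran API. It also models the stated behaviour that ct.persist.loadAtStartOfDay affects loading only, never storing.

```java
// Hypothetical encoding of the use-case table; not a CloudTran type.
enum DataUseCase {
    //                      loadFromDb, storeToDb, inDataGrid
    WRITE_THROUGH           (true,  true,  true),
    DIRECT_TO_STORE         (false, true,  false),
    TEMPORARY               (false, false, true),
    EXTERNALLY_MANAGED_BEAN (false, false, true),
    MESSAGE_BEAN            (false, false, true);

    final boolean loadFromDb;
    final boolean storeToDb;
    final boolean inDataGrid;

    DataUseCase(boolean loadFromDb, boolean storeToDb, boolean inDataGrid) {
        this.loadFromDb = loadFromDb;
        this.storeToDb = storeToDb;
        this.inDataGrid = inDataGrid;
    }

    /** Loading happens only for load-from-DB use cases and only if the property allows it. */
    boolean loadsAtStartOfDay(boolean loadAtStartOfDayProperty) {
        return loadFromDb && loadAtStartOfDayProperty;
    }

    /** Storing does not depend on the load property at all. */
    boolean storesChanges() {
        return storeToDb;
    }

    public static void main(String[] args) {
        for (DataUseCase useCase : values()) {
            System.out.printf("%-24s load=%b store=%b grid=%b%n",
                    useCase, useCase.loadFromDb, useCase.storeToDb, useCase.inDataGrid);
        }
    }
}
```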
- Write-Through
These are the entities that enterprise Java developers are familiar with.
They are accessed via the ORM. They exist in the in-memory grid, so read requests are fulfilled from there.
Any changes are automatically persisted to the persistent store (e.g. a database).
In other words, they are similar to write-through objects in a cache - they are written into memory and then sent to the persistent store.
However, compared to entities in Hibernate and EJB, in CloudTran the system of record is in the data grid.
Changes made via the ORM are reflected transactionally in the data grid and then propagated to the persistent store as part of the transaction management.
Normally write-through records are loaded into memory at start of day: the grid is made into a replica of the persistent store.
However, this action can be turned off by the ct.persist.loadAtStartOfDay
configuration property. (This property only turns off load; changes are always stored to persistent store.)
- Direct-To-Store Entities
These are entities that can be written by the application developer via the ORM - they have an application-developer view - but they are not stored in the data grid.
When these objects are written and the transaction is persisted, they are written directly to the database, without touching the grid.
These objects are very useful in micropayments or similar high-volume transactional systems where the impact of the transaction is important
but the details are not.
If these objects will rarely be looked at again and are not essential to the operation of the grid,
it doesn't make sense to keep them in the grid.
This can make a significant saving.
For example, if CloudTran writes an average of 2,000 micropayment objects per second, that is 7.2 million per hour, or about 63 billion per year.
Because these objects do not reside in the grid, they can only be written by the ORM.
You cannot read or update them, or search for them.
- Temporary Entities
Temporary entities are a scratchpad for entities that may relate to other entities.
When created by the application, they will be created in the data grid.
These entities can then be updated or deleted.
No temporary entities are loaded at start of day.
State changes to temporary entities will not be propagated to the persistent stores.
- Externally-managed Beans
An application will often need information that it does not create or update.
This information is usually changed infrequently and referenced by any part of the application.
For example, the rate of sales tax (or VAT, in Europe) changes rarely, perhaps once every few years.
Other information, for example risk data or the base rate, may change every day or every hour ... but not thousands of times per second.
This type of information is modelled as JavaBeans rather than entities; in other words, it is not part of the data domain for the application.
CloudTran currently does not provide facilities to read this information from the database.
It can be read by an external program or this application. The data is distributed to all
spaces in partitions of a PU using broadcast proxies from the CtSpaces class.
- Message Beans
Message beans are modeled as simple Java beans rather than as entities, because they do not represent the state of the application -
they are an indirect request for interaction with an external system.
The recommended approach to processing messages is as follows:
- Inbound Messages
Inbound messages, represented by a bean instance, are placed into the appropriate space by an external system -
for example, the Mule integration with GigaSpaces.
This represents a request for processing by the application.
- Outbound Messages
Outbound messages are saved, so they are temporarily written to the space.
This is normally the last action of a transactional sequence; when it commits,
the message is committed to the space, which fires the action of an external system to read the bean.
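The outbound-message pattern described above (write the bean as the last action of a transaction, and let the commit trigger the external reader) can be sketched with a toy in-memory stand-in for a space. ToySpace and PaymentNotice are hypothetical names used only for illustration; this is not the GigaSpaces or CloudTran API, and the real notification mechanism is not shown in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model only: ToySpace is a hypothetical stand-in for a space,
// not a GigaSpaces or CloudTran class.
class OutboundMessageSketch {

    /** A plain message bean, not an entity. */
    static class PaymentNotice {
        final String account;
        final long cents;
        PaymentNotice(String account, long cents) { this.account = account; this.cents = cents; }
    }

    /** Beans written in a transaction become visible, and listeners fire, only on commit. */
    static class ToySpace {
        private final List<Object> pending = new ArrayList<>();
        private final List<Object> committed = new ArrayList<>();
        private final List<Consumer<Object>> listeners = new ArrayList<>();

        void onCommitted(Consumer<Object> listener) { listeners.add(listener); }

        void write(Object bean) { pending.add(bean); }

        void commit() {
            List<Object> batch = new ArrayList<>(pending);
            pending.clear();
            committed.addAll(batch);
            for (Object bean : batch) {
                for (Consumer<Object> listener : listeners) {
                    listener.accept(bean); // plays the role of the external system reading the bean
                }
            }
        }

        void rollback() { pending.clear(); }

        int committedCount() { return committed.size(); }
    }

    public static void main(String[] args) {
        ToySpace space = new ToySpace();
        space.onCommitted(bean -> System.out.println("external system picked up " + bean.getClass().getSimpleName()));

        // ... transactional work on entities would happen here ...
        space.write(new PaymentNotice("ACC-1", 250)); // last action of the sequence
        space.commit();                               // the listener fires only now
    }
}
```

In this model a rolled-back transaction never exposes the message, so the external system reacts only to committed work.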
4.2.3 Controlling The Loading Process and Plugins
The data loading process, and the controls you can apply to it, are as follows:
- When the master coordinator node is elected as primary, it starts the data loading process.
You can tell CloudTran to skip data loading entirely by setting
ct.persist.loadAtStartOfDay = false.
This goes across all persistent stores (including any custom plugins you have created).
Use this for testing (where you don't want to load the data store).
In this case, you may also want to delete all information in the databases - to give you a clean slate for your tests.
CloudTran generates helper methods for JDBC-backed entities on the loader classes (e.g. CustomerLoader.java).
The simplest approach is to call clearEntityGroupSpaceTables(),
which is present on all master entities like Customer - and deletes all rows from database tables for entities in this entity group.
You can also do this on individual entities using the deleteAll() method: this deletes all rows from one table.
You have to call these methods when your application is started: they are not called as a result of setting ct.persist.loadAtStartOfDay = false.
- If 'ct.persist.loadAtStartOfDay' is not false, then the loader framework is started.
The loader framework is defined by the 'ct.persist.overallLoaderClass' runtime configuration parameter,
which specifies the fully-qualified class name of the loader framework.
The default is to do a purely JDBC load, which is done through the class "com.cloudtran.persist.JdbcLoaderFramework".
If you have non-JDBC stores that you wish to load, you will have to define your own loader class and
declare it by setting the config property ct.persist.overallLoaderClass to "my.own.loader.framework" or something similar.
This class must implement 'com.cloudtran.persist.ICloudTranOverallLoader' - which has a single method 'load'
with an injector parameter for putting information into the spaces.
If you have both custom and JDBC data stores, you will need to call the JdbcLoaderFramework to load the databases into memory.
This is probably best done in a parallel thread to the non-JDBC loading to shorten the overall load time.
- We should mention in passing that it is possible to opt out of persistence (load or store) for a particular entity.
For example, you may want to do this with messages that should not be persisted to a store,
or if the entities you are working with are purely in-memory - in an analytics application for example.
You can do this by setting the entity's persistenceStyle attribute to 'memoryOnly'.
- A custom-persisted entity is specified by setting its persistenceStyle attribute to 'custom'.
In that case, you must create a loader class, which is generated in the '_CtFramework' project -
e.g. if your application is called 'MyApp', then the framework project is named 'MyApp_CtFramework'.
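The loading steps above can be sketched as the following control flow. This is a hedged illustration: SketchInjector and SketchOverallLoader are hypothetical stand-ins for ICloudTranInjector and ICloudTranOverallLoader, JdbcLoaderStandIn stands in for com.cloudtran.persist.JdbcLoaderFramework, and the record data is invented. Only the two property names come from the text; treating any value other than "false" (case-insensitively) as "load" is an assumption.

```java
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: the interfaces below are hypothetical stand-ins, not CloudTran types.
class LoaderFlowSketch {

    interface SketchInjector { void inject(List<?> records); }

    interface SketchOverallLoader { void load(SketchInjector injector); }

    /** Counts injected records instead of placing them into spaces. */
    static class CountingInjector implements SketchInjector {
        final AtomicInteger count = new AtomicInteger();
        @Override public void inject(List<?> records) { count.addAndGet(records.size()); }
    }

    /** Stand-in for the default JDBC loader framework. */
    static class JdbcLoaderStandIn implements SketchOverallLoader {
        @Override public void load(SketchInjector injector) {
            injector.inject(List.of("customer-1", "customer-2"));
        }
    }

    /** Custom framework: runs the JDBC load and a non-JDBC load in parallel threads. */
    static class CombinedLoader implements SketchOverallLoader {
        @Override public void load(SketchInjector injector) {
            ExecutorService pool = Executors.newFixedThreadPool(2);
            try {
                Future<?> jdbc = pool.submit(() -> new JdbcLoaderStandIn().load(injector));
                Future<?> custom = pool.submit(() -> injector.inject(List.of("document-1")));
                jdbc.get();   // wait for both loads before declaring the grid loaded
                custom.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            } finally {
                pool.shutdown();
            }
        }
    }

    /** Returns the number of records injected at start of day. */
    static int startOfDay(Properties config) {
        CountingInjector injector = new CountingInjector();
        if ("false".equalsIgnoreCase(config.getProperty("ct.persist.loadAtStartOfDay"))) {
            return 0; // loading skipped for all stores
        }
        String loaderClass = config.getProperty(
                "ct.persist.overallLoaderClass", JdbcLoaderStandIn.class.getName());
        try {
            SketchOverallLoader loader = (SketchOverallLoader)
                    Class.forName(loaderClass).getDeclaredConstructor().newInstance();
            loader.load(injector);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot start loader " + loaderClass, e);
        }
        return injector.count.get();
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("ct.persist.overallLoaderClass", CombinedLoader.class.getName());
        System.out.println("records loaded: " + startOfDay(config));
    }
}
```

The custom loader waits for both threads before returning, so the overall load time is roughly that of the slower store rather than the sum of the two.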