7.2 Local Store
An important part of the Replicator is the LocalStore. This is effectively a back-up service that records the transactional information to a data store
before the information is sent across the wire.
The LocalStore's purpose is to provide high-performance backed-up storage for the Replicator.
It stores the outbound transactional information, keeping a record of the transactions that have been sent. It also records the transactions that
have been acknowledged.
The Replicator calls the LocalStore at various points in the replication cycle, using the LocalStore API described in more detail in section 7.2.2.
A typical sequence of calls the Replicator may make to the LocalStore is shown below.
7.2.1 The Local Store
1. Start of Day - LocalStore.resync()
Resets the LocalStore ready for a new set of transactions.
2. Store transactional information - LocalStore.storeOutbound()
This stores the transactional information on the LocalStore's back-up medium.
3. Release transactional information - LocalStore.releaseOutbound()
This notifies the LocalStore that the transaction has been committed in the remote grid. The LocalStore can now release any locks on the transaction.
Assuming there is no problem, the Replicator will loop through items 2 and 3, storing and releasing transactions. Note: items 2 and 3 are only synchronous for a given
transaction; they are run on different threads.
4. Link fails - some connection issue causes the link to go down
This is an unexpected event and needs to be resolved. The local system will still be committing, so the LocalStore needs to continue storing.
5. The link is restored
The problem with the link is resolved, but the remote system is now out of sync with the local system.
6. The local Replicator resyncs with the LocalStore
The resync determines which transactions have not made it across the wire. The LocalStore returns the first transaction sequence number to be resent
and the number of following transactions that need to be resent.
7. Retrieve transactional information
For each transaction that has not been replicated, retrieve the transactional information from the LocalStore and resend it.
8. Continue - replication can continue as normal, storing and releasing transactional information as before.
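The cycle above can be sketched with an in-memory stand-in for the LocalStore. The method names (resync, storeOutbound, releaseOutbound) come from this section; the signatures, the resend helpers and the in-memory storage are assumptions for illustration only, not the CloudTran API.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory sketch of the replication cycle described above.
public class LocalStoreCycleSketch {

    static class InMemoryLocalStore {
        private final List<byte[]> stored = new ArrayList<>(); // index == seq num
        private long highestReleased = -1;                     // highest acknowledged

        // 1. Start of day: reset ready for a new set of transactions.
        void resync() {
            stored.clear();
            highestReleased = -1;
        }

        // 2. Store transactional information before it goes across the wire.
        long storeOutbound(byte[] txData) {
            stored.add(txData);
            return stored.size() - 1; // this transaction's sequence number
        }

        // 3. The remote grid has committed up to txSeqNum; release it.
        boolean releaseOutbound(long txSeqNum) {
            highestReleased = Math.max(highestReleased, txSeqNum);
            return true;
        }

        // 6. After a link failure: the first sequence number to resend and
        //    the number of following transactions that need to be resent.
        long firstToResend() { return highestReleased + 1; }
        long resendCount()   { return stored.size() - firstToResend(); }
    }

    // Store five transactions; the link fails after seq num 2 is acknowledged.
    static long[] demo() {
        InMemoryLocalStore store = new InMemoryLocalStore();
        store.resync();
        for (int i = 0; i < 5; i++) store.storeOutbound(new byte[] { (byte) i });
        store.releaseOutbound(2);
        return new long[] { store.firstToResend(), store.resendCount() };
    }
}
```

On resync after the simulated failure, the store reports that replication must restart from sequence number 3 and that two transactions (3 and 4) need to be resent.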
The LocalStore is supported by the LocalStore API, which has the following methods:
7.2.2 LocalStore API
This method is called to indicate that the given transaction sequence number has been replicated. The LocalStore implementation can then release that transaction
sequence number from the data store. This method is not necessarily called for every transaction sequence number; the given transaction sequence number is the highest
that has been replicated.
boolean releaseOutbound( long txSeqNum );
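A consequence of this contract is that an implementation must release every pending transaction up to and including the given sequence number, not just a single entry. A minimal sketch, assuming a simple sorted-map store (the storage model is illustrative, not from the CloudTran sources):

```java
import java.util.TreeMap;

// Sketch of the releaseOutbound contract: the sequence number passed in is
// the highest replicated, so the implementation releases every pending
// transaction up to and including it.
public class ReleaseOutboundSketch {
    private final TreeMap<Long, byte[]> pending = new TreeMap<>();

    public void storePending(long txSeqNum, byte[] data) {
        pending.put(txSeqNum, data);
    }

    public boolean releaseOutbound(long txSeqNum) {
        // headMap(txSeqNum, true) is a live view of all entries <= txSeqNum;
        // clearing it releases them all in one step.
        pending.headMap(txSeqNum, true).clear();
        return true;
    }

    public int pendingCount() { return pending.size(); }
}
```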
Within CloudTran there are two implementations of the LocalStore service:
7.2.3 LocalStore Implementations
- The CacheLocalStoreImpl.
- The SSDLocalStoreImpl.
The CacheLocalStoreImpl backs up the transactions to a cache. In the absence of an Elastic Cache this is really only useful for testing purposes.
The SSDLocalStoreImpl uses Solid State Devices (SSDs) as the back-up mechanism. The SSD local store is the recommended implementation of the LocalStore.
A further recommendation, which the SSDLocalStoreImpl supports, is to use two SSDs per Replicator. This means there is no single point of failure for the
LocalStore, and it provides the best combination of resilience and speed.
In essence the SSD LocalStore serialises the transactional information to a set number of files.
Once serialised the data can be transmitted across the wire and replicated on the remote grid.
As the data is replicated, an acknowledgement of the replicated transactions is returned. This acknowledgement information is also recorded on the SSD.
So, should the link fail, the local replication service can interrogate the SSD to determine the transactions that need to be re-sent to the remote grid.
If the write of the transaction information to the SSD fails, a TransactionExceptionManagerTooBusy exception is thrown.
The LocalStore implementation is defined by the ct.replicator.localStore.class configuration property, which is set to the fully qualified class name of the
implementation: for the cache-based store this is the class name of CacheLocalStoreImpl, and for the SSD that of SSDLocalStoreImpl.
A bespoke implementation can be written, provided it implements the LocalStore interface and the ct.replicator.localStore.class property is set to the bespoke class.
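As a sketch, a bespoke implementation might look like the skeleton below. Only releaseOutbound's signature appears in this chapter; the other method names and signatures are assumptions based on the replication cycle described earlier, and the class name is hypothetical.

```java
// Hypothetical skeleton of a bespoke LocalStore implementation.
// The resync/storeOutbound signatures are assumptions for illustration.
public class NoOpLocalStore /* implements LocalStore */ {

    // Reset the store ready for a new set of transactions.
    public void resync() { }

    // Persist the outbound transactional information before it is sent.
    public void storeOutbound(long txSeqNum, byte[] txData) { }

    // The given (highest replicated) sequence number can be released.
    public boolean releaseOutbound(long txSeqNum) {
        return true;
    }
}
```

Such a class would then be selected by setting ct.replicator.localStore.class to its fully qualified name.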
7.2.4 SSDLocalStore Configuration
There are a number of configuration properties that can be used for the SSD LocalStore.
These are shown below along with their defaults:
The block size that writes are rounded up to.
Every write to the transaction log files starts on this boundary;
any extra bytes at the end of a transaction-list write are ignored.
The default is 4096, which is good for most SSDs and hard disks.
Some SSDs may have a larger native block size.
ct.replicator.ssd.blockSize = 4096
This gives the directory location of the files that the replicator uses.
By default, this is 'replicator/directory1', which is a path below the current directory of the CloudTran isolator application.
The default is appropriate for development. In production, the directory should be specified.
Although this is an "SSD" property, there is no check that the location is actually on an SSD,
so it can be an ordinary directory.
The directories can optionally end in '/' or '\'. On Windows, drives, such as "C:", are allowed.
If replication is enabled, you must specify a valid directory name for this property.
For initialization (ct.replicator.ssd.init=true),
the directory will be created if it doesn't already exist.
In production, this directory should ideally be on an SSD for maximum performance.
Performance is severely degraded on Linux if this directory is on the system drive.
ct.replicator.ssd.directory1 = replicator/directory1
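The rounding behaviour of 'ct.replicator.ssd.blockSize' amounts to rounding each write length up to the next block boundary. A sketch of that calculation (an illustrative helper, not part of the CloudTran API):

```java
// Round a write length up to the next 'ct.replicator.ssd.blockSize'
// boundary, so that every write to the transaction log files starts
// on a block boundary.
public class BlockRounding {
    public static long roundUpToBlock(long bytes, long blockSize) {
        return ((bytes + blockSize - 1) / blockSize) * blockSize;
    }
}
```

With the default block size of 4096, a 1-byte write occupies one block and a 5000-byte write occupies two.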
It is highly recommended that you use a second directory on a separate disk from 'directory1' in production and performance testing;
this is specified in 'ct.replicator.ssd.directory2'. In development, CloudTran will work with a single directory.
See 'ct.replicator.ssd.directory1' for more information.
ct.replicator.ssd.directory2 = replicator/directory2
This property is the base filename of all the store files.
'_' plus a number (1, 2, etc.) plus the file extension is added to this base to give the full file name.
ct.replicator.ssd.fileBasename = transactionData
This property sets the file extension for the store files.
The default is 'rep', for replicator. Avoid standard names like 'log', which may get deleted by mistake.
ct.replicator.ssd.fileExtension = rep
This sets the amount of storage allocated for the storage of the replicated packets, in MB.
This is per directory (i.e. per drive, normally).
The production default is 100,000 or 100GB; the debug default is 40MB.
ct.replicator.ssd.totalSizeMB = 100000/40
This is the size in MB of each file on the disk.
This is used in conjunction with 'ct.replicator.ssd.totalSizeMB' to determine the number of files.
The production default is 50MB; the debug default size is 4MB.
When combined with the default 'totalSizeMB' value, this gives 2,000 files in production and 10 files in debug.
ct.replicator.ssd.fileSizeMB = 50/4
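The file count is simply totalSizeMB divided by fileSizeMB. A quick check of the defaults quoted above:

```java
// Number of store files per directory: totalSizeMB / fileSizeMB.
// 100,000 / 50 = 2,000 files in production; 40 / 4 = 10 in debug.
public class FileCount {
    public static long fileCount(long totalSizeMB, long fileSizeMB) {
        return totalSizeMB / fileSizeMB;
    }
}
```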
If the 'init' property is set to 'true', the SSDs will be initialised for use by CloudTran - which will then stop.
To run normally, and make use of the replicator information after a cluster reboot, this property must be false.
To prevent accidental deletion of the replicator information, when this property is true,
there must be no files in the directories specified in the ct.replicator.ssd.directory1/2 properties.
In the preferred configuration, with two nodes providing the replicator service,
the 'init' run must be done on both nodes.
Best practice is to omit this property from the config.properties file in production:
specify it as a system property ('-Dct.replicator.ssd.init=true' on the command line) for an initialization run.
ct.replicator.ssd.init = false
This is the number of threads to use for each SSD drive when backing up for the replicator.
ct.replicator.ssd.threadCount = 2