CloudTran Home

<< Back Contents  >  3.  Product Concepts and Architecture Forward >>

3.5 Isolator Performance and Repartitioning

   This section describes implementation details. You do not need to read it if you just want to use CloudTran.   
The CloudTran Isolator guarantees the isolation of transactions during persistence - relaying the transaction from the data grid to the database.

This section describes

  • how the isolator in this version of CloudTran works
  • options for future scalability.
  • repartitioning

 3.5.1  How The Isolator Works: Primary Key Generation
 3.5.2  How The Isolator Works: Transaction Isolation
 3.5.3  Isolator Performance
 3.5.4  Scaling the Isolator
 3.5.5  Repartitioning and the Isolator

3.5.1  How The Isolator Works: Primary Key Generation
The Isolator has two jobs: transaction isolation; and primary key generation.

Let's quickly deal with primary key generation. Each entity type can establish an entity primary key generator, which can generate unique Integer or Long primary keys.

This is normally initiated automatically, for example by the TopLink Grid JPA support, loading the highest primary key into the Isolator's cache..

The main consideration for this discussion is that key generation is, or can be made, a low-volume activity - and therefore is much less important to application performance than the transaction isolation aspect of the isolator.

When the client inserts a new object/row, it allocates the entity primary key. It first consults a local entity cache. If there are no keys available, the client requests another batch of entity primary keys from the isolator. The default batch size is 1,000 by default, adjustable via the ct.client.primaryKeyPreallocationSize configuration property.

The isolator records the maximum primary key allocated in a per-entity entry in the cache. We recommend that there is more than one isolator instance so that this cache is backed up and can survive a failover.

3.5.2  How The Isolator Works: Transaction Isolation
The main role of the Isolator is to order transactions on their way to the database (or other persistence mechanism).

The issue of isolation arises because CloudTran allows the client application to continue before the data is persisted at the database. [For good reason: it improves response time and lets the application continue to provide service if the database is unreachable.] If transaction 1 commits, and it includes a change to Customer #15, without isolation it would be possible for a later transaction 2, which also changes Customer #15, to do the in-memory transaction and commit its own data to the database before transaction 1 gets there.

This can happen for example because of Java GCs - GCs of 40ms are common, and complete transactions can be completed in under 10ms on Gigabit networks. It can also happen if the database gets slow: there are conditions where one database transaction can take many seconds in extreme situations, which can delay other database transactions from one node while letting database transactions from other nodes persist at full speed.

The manager uses the isolator by asking the question "OkToPersist" of each transaction. To do this, it identifies each object/row in a transaction by its class and primary key. The isolator checks whether any of these objects are "in play" - currently being persisted by another transaction. Normally none are, and the transaction is OkToPersist; the manager immediately puts on the persist queue for the target database. As far as the isolator is concerned, the objects in the transaction are now "in play".

If a transaction is not OkToPersist, the manager must wait to persist it until any "in play" objects are persisted. The manager that persists a transactions tells the isolator, which removes the "in play" statuses. If this process releases the last waiting object for a waiting transaction, it can now be released; the isolator notifies the manager for that transaction that it is OkToPersist.

3.5.3  Isolator Performance
The isolation algorithm in the isolator is highly optimised:
  • No cache writes. The isolator has no backups for "in play" state; this is explained below.
  • Class types are send by Id. The isolator maintains a map of the class types to a short Id number. This, and the use of POF, reduce the volume of data sent to the isolator; if a numeric primary key is also used for an entity, the OkToPersist data for the entity will be less that 20 bytes.
  • Aggregation. Requests to the isolator are aggregated into a single EntryProcessor invoke. Coupled with the previous point about class type Ids, over 50 entity requests will usually fit into a single small Ethernet packet.
The key to the fast operation in the isolator is that it uses thousands of HashMaps. Our experience is that small HashMaps are extremely efficient - much more so than large HashMaps, for example. In addition, there are other optimizations that ensure highly multi-threaded operation is efficient in terms of synchronization and accessing the HashMap.
3.5.4  Scaling the Isolator
In the current version of CloudTran, there is on primary Isolator.

The isolator can be scaled by having each isolator handle a subset of the target databases. This works if there are a number of distinct databases. Alternatively, if the database is sharded, one isolator can handle each shard.

Support for multiple isolators will be provided when application performance exceeds the performance of a single isolator machine. With current CPU technology, this limit is already at millions of rows per second. With the evolution of CPUs to include more cores, this is likely to grow by 10X every few years.

3.5.5  Repartitioning and the Isolator
We mentioned above that there is no "permanent" state held at the isolator: if it crashes, then all transaction isolation state is lost.

After the primary isolator crashes, a magic "Primary Isolator" entry in the isolator cache will move to another machine. This machine will become the primary.

Of course, a non-primary isolator may crash, or another isolator come online, causing a repartition. If this causes the "Primary Isolator" entry to move to another node, the Primary Isolator replaces the value with one that is homed on its own storage.

In other words, the primary isolator is fixed until it fails.

Once that happens, the new primary isolator declares itself and immediately asks all the managers for the objects in their in-play transactions - ones that have committed but not yest persisted. The primary isolator sorts the returned list of transactions into the order they were registered at the previous isolator and then replays all the OkToPersist requests. This will leave most transactions OkToPersist, but some will be left waiting - not OkToPersist. The managers refresh their transaction persistence statuses, because some statuses may have changed while the repartitioning is in process.

Once the manager receives its new list of status from the isolator, it continues normal operation - the isolator's state is now correct and up to date.

Although this scheme is not particularly efficient in recreation, the isolator is so fast that it still relatively fast. The benefit is in not keeping cache state and have it backed up for each set of rows, which would be impossibly slow.

Copyright (c) 2008-2013 CloudTran Inc.