
4.2 Problems

CloudTran focuses on two goals: (a) to keep its services running (even in unusual situations); (b) to commit transactions correctly once it has promised to.

To achieve these goals, CloudTran by design limits the resources it uses, particularly threads, and has mechanisms to back off if it becomes too busy.

This section describes how the application developer handles these situations.

 4.2.1  Exceptions
 4.2.2  Timeouts
 4.2.3  Database Problems

4.2.1  Exceptions
Exceptions in CloudTran fall into two categories: retriable and non-retriable. They subclass the abstract classes TransactionExceptionRetriable and TransactionExceptionNonRetriable respectively.

TransactionExceptionRetriable indicates a transient error has occurred, so retrying the transaction may succeed. Of the subclasses, TransactionExceptionManagerTimeout, TransactionExceptionManagerTooBusy and TransactionExceptionThreadError often indicate overload at one or more managers.

TransactionExceptionReconfiguring occurs when the CloudTran-specific caches are repartitioning, which means that a manager or isolator has left or joined the cluster.

We recommend that, after a TransactionExceptionRetriable, the client application wait 1-2 seconds and retry the transaction once. Retrying more than once is probably not worthwhile, because CloudTran already retries internally where possible; a second TransactionExceptionRetriable therefore indicates a condition that is likely to take some time to resolve.
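
The wait-and-retry-once recommendation can be sketched as a small helper. This is only an illustration under stated assumptions: the two exception classes below are hypothetical stand-ins for the CloudTran base classes (declared locally, and assumed to be unchecked, so that the example compiles on its own), and RetryOnce and runWithOneRetry are names invented for this sketch rather than part of the CloudTran API.

```java
import java.util.function.Supplier;

class RetryOnce {
    // Hypothetical stand-ins for the CloudTran base exception classes.
    // In a real application these would come from the CloudTran library.
    static class TransactionExceptionRetriable extends RuntimeException {
        TransactionExceptionRetriable(String message) { super(message); }
    }

    static class TransactionExceptionNonRetriable extends RuntimeException {
        TransactionExceptionNonRetriable(String message) { super(message); }
    }

    // Runs the transaction; on a retriable failure, waits and retries exactly once.
    // A second failure, or any non-retriable failure, propagates to the caller.
    static <T> T runWithOneRetry(Supplier<T> transaction, long backoffMillis) {
        try {
            return transaction.get();
        } catch (TransactionExceptionRetriable first) {
            try {
                Thread.sleep(backoffMillis);
            } catch (InterruptedException interrupted) {
                // Preserve the interrupt flag and give up without retrying.
                Thread.currentThread().interrupt();
                throw first;
            }
            return transaction.get();
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated transaction: fails with a retriable error on the first attempt only.
        String result = runWithOneRetry(() -> {
            if (++attempts[0] == 1) {
                throw new TransactionExceptionRetriable("manager too busy");
            }
            return "committed";
        }, 1500);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

The supplier should wrap the whole transaction (start, work, commit) so that the retry begins from a fresh transaction. The 1500 ms back-off is simply a value inside the recommended 1-2 second range.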


4.2.2  Timeouts
The transaction manager allows transactions to live for a certain amount of time. After that, if the transaction has not been committed or aborted, the transaction manager rolls it back. This is done so that a client that crashes does not permanently hold transactional locks in the cache.

You can change the timeout with ct.txb.defaultTimeoutSeconds.


4.2.3  Database Problems
If one or more of the databases become unreachable, transactions that cannot be written to the database are buffered in the manager. The transactions are distributed across all the managers.

Normally, database problems are not visible to the application: by the time a database problem is detected, the commit() call has already returned successfully to the user.

CloudTran logs an error in the manager and then continually retries the write to the database. This does not affect the validity of the transaction as far as the caches are concerned: the transaction has already been committed to the caches before it is persisted.

There is a limit to the number of transactions that one manager can buffer, defined by ct.txb.maxTransactions. Once this limit is exceeded, the transaction manager returns a TransactionExceptionTxbTooBusy in response to the transaction start() request.

Copyright (c) 2008-2013 CloudTran Inc.