3.4 Warehouse - ORM Modelling and Coding
All too often, development tools don't give you rich enough examples to help in real-world applications.
The Warehouse example tries to address this problem: it is a real application, albeit a small one,
demonstrating how to create an application using the CloudTran ORM.
The warehouse example is at
eclipse/plugins/com.cloudtran.builder_x.y.z/jeewiz/examples/CloudTran/Warehouse
The Eclipse workspace is the workspace directory below 'Warehouse'.
There are eighteen entities in Warehouse and a few use cases, plotted in enough detail to give you an idea of how to architect your own.
The focus of the example is on modelling and application design, leading into the code needed to implement the design.
Instructions on setting up MySQL and the database can be found in Chapter 2.
3.4.1 The Scenario
The basic idea is of a warehousing company that buys wholesale and sells retail. It has several warehouses scattered around the country. The company sells products, and each variety of a product (different sizes, colours, etc.) is referred to as an item. The full list of items for sale is called the catalogue. A quantity of an item in a warehouse is called the stock of the item. Items can be supplied by one or more suppliers. The supplier's version of the item is called the Supplier Item.
To get more stock, a warehouse creates purchase orders, one for each supplier. Each purchase order has a list of purchase order lines, and each line requests a quantity of a supplier item. When the items arrive, they are booked into the warehouse using a Goods In Line, which updates the stock levels.
A customer creates a customer order, which has a number of order lines. Each line on the order requests a certain quantity of a particular item. If the items are available in the customer's local warehouse, the goods will be picked that night for the following day, and a delivery will be made to the customer. If there is not enough stock in the local warehouse, rather than having another warehouse deliver it, an inter-warehouse transfer order is raised to send goods to the local warehouse, from where they get subsequently picked and sent.
3.4.2 Data Design
I started off with a relational entity model. Because the in-memory data grid functions across multiple computers and multiple Java virtual machines, full object linking is not supported at this level, and data objects link via relational-style primary keys. The CloudTran Transaction API takes object links from your client and translates them to relational links in the grid. For this reason, relational-style data models work better than class diagrams, and inheritance in the data objects should be avoided if using a beta CloudTran release.
It's necessary to decide how to logically apportion the workload and the data. The workload is undertaken in processing units, each processing unit having one or more primary partitions. These partitions are where the work is done, and they can be run in different virtual machines. There are also backup partitions, which only come into play when the primary partitions fail. In GigaSpaces the data and the workload are both split into the same partitions, and what we are trying to achieve is that most of the processing is done on data held in the local partition.
So if my customers are held in a processing unit and I want to process all the customer payments I have received that day, it would be a good idea if the update takes place in the same processing unit. That way I don't have to transfer my customers to another processing unit, and possibly another virtual machine, for processing and then save them back into their own area again. We want to minimise the amount of data transfer going on. So we make sure that Customer, CustPayment and the method processCustPayment are all held in the same processing unit.
But what if the processing unit has multiple partitions? In this case there will be a split across multiple processing units, as a fraction of the customer records will be held in each of the primary partitions. This will also be true of the payments. As each processing unit can run in its own virtual machine, to run processing locally we need to ensure that the payments for customer A are in the same primary partition as the customer A record itself; otherwise we have to transfer either the customer record or the payment record across virtual machines. To do this we use entity groupings.
An entity grouping has a master entity, in this case customer. As all customer payments are related to a customer, we direct the system to store each customer payment record in the same primary partition as its related customer record. To be a candidate for an entity group master, an entity must be reachable through a to-one relationship from every subsidiary entity, though not necessarily directly. It is also necessary to know the master at the time the subsidiary is created. Both these conditions are true for the customer - customer payment relationship. I can extend that: I am going to want to process customer orders together with customers, and customer order lines too, and both of these can be subsidiary entities to the customer master.
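The effect of an entity grouping can be sketched in plain Java. This is only an illustration of the idea, not CloudTran or GigaSpaces code: the class names, the `partitionFor` helper and the hash-modulo scheme are all assumptions made for the example. The point it shows is that a subsidiary record is placed using its master's key, so a customer and its payments always land in the same primary partition.

```java
// Hypothetical stand-in types; these are not the CloudTran-generated entities.
class EntityGroupRoutingSketch {

    /** Maps a master entity's oid to a primary partition index (assumed hash-modulo scheme). */
    static int partitionFor(long masterOid, int partitionCount) {
        return Math.floorMod(Long.hashCode(masterOid), partitionCount);
    }

    /** A subsidiary record that carries the oid of its entity group master. */
    static final class CustPayment {
        final long oid;
        final long customerOid;
        final long amount;

        CustPayment(long oid, long customerOid, long amount) {
            this.oid = oid;
            this.customerOid = customerOid;
            this.amount = amount;
        }

        /** Routes on the master's key, not on the payment's own oid. */
        int partition(int partitionCount) {
            return partitionFor(customerOid, partitionCount);
        }
    }

    public static void main(String[] args) {
        int partitions = 4;
        long customerA = 1001L;
        CustPayment[] payments = {
            new CustPayment(1L, customerA, 5000L),
            new CustPayment(2L, customerA, 1250L)
        };
        System.out.println("customer A -> partition " + partitionFor(customerA, partitions));
        for (CustPayment p : payments) {
            System.out.println("payment " + p.oid + " -> partition " + p.partition(partitions));
        }
    }
}
```

This is also why the master must be known when the subsidiary is created: without the master's key there is nothing to route on.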
I can do something similar for suppliers, creating a supplier entity group, and a warehouse entity group for stock processing and goods in. I could choose to process deliveries along with a customer or along with a warehouse, as they seem equally relevant. For now I will lump deliveries in with customers, but if, when I write the delivery processing routines, I find that I'm pulling across more warehouse information to the customer processing unit than I'm accessing customer information, I can change my mind later. There are other issues that help you make this decision and I'll come back to it when talking about processing.
Product and Item aren't really related to anything else, in that there is no real advantage in having them associated with say a warehouse. It is technically possible to have a product entity group with item as the subsidiary, but in reality the product record will rarely be used during the processing and so clustering on product is not worth the overhead. These are effectively each in their own entity group as their own masters.
We can still put Product and Item in the same processing unit as each other. We don't need to make too many concessions to evening out workload at this point, as we can adjust the number of partitions we want to use when specifying the deployment. So I decided to have three main processing units: customer based, supplier based, and the rest (warehouse based, product and item).
Now I need to work out the attributes on the various entities, and logical names for the various relationship ends, as these will form the accessor names in our objects. I also have to decide on the primary key for each entity, and I will take the lazy way out, allowing CloudTran to automatically create an artificial numeric key, called oid, on every entity. I will then have the choice of putting in the number myself, if it is to be meaningful, or allowing the framework to auto-populate it for me.
If you want CloudTran to create a lightweight data item, you can model a java bean instead of an entity. This is especially of use when you intend to bypass the transaction API and handle the nitty-gritty processing yourself. I haven't made use of it in Warehouse.
3.4.3 Process Design
Different analysts think in different ways, and although I tend to do data first, it is equally valid to consider processing first. The two aspects of design have to meet in the middle. Let's consider some of the use cases we might want to handle.
- Maintain the Warehouse List
- Update Catalogue (Item List and Pricing)
- Create a Purchase Order
- Create a Customer Order
- Create a Customer Delivery
- Accept a Goods In
- Pay Suppliers
- Update Supplier Item List
- Update Supplier List (Updates, Additions, Suspensions, Deletions)
- Customer Payment Received
- Stocktaking
A real world system would have many more use cases and many more entities too. But a few use cases should give you an idea of how we might want to structure some of the processing. It is worth talking about the first three use cases on this list as that should illustrate some basic considerations.
3.4.4 Basic CRUD Processes
CloudTran creates a framework for handling transactions in the grid.
Entities have two aspects, a client entity that the application programmer uses and a mirror in the data grid.
In many ways you can think of the data grid record as your database and the client entity as the java object you manipulate.
You can create it as a client object and subsequently save it to the grid, or you can take it from the grid by finding it,
then possibly update it and save it back to the grid.
All basic stuff.
The client entity instance has the methods to handle finding, saving and deleting.
To create a new warehouse,
I can write a simple Java object
and include the relevant jars that give me access to the client version of the warehouse object - unsurprisingly called Warehouse -
and call the save method.
Warehouse w = new Warehouse();
w.setOid(1);
w.setWarehouseName("London01");
w.setAddress( "31 Rosebay Lane\nLondon" );
w.setPostCode("SW5 6TT");
w.setPhone("020 7947653");
w.save();
If I want to save something slightly bigger, I can create a data tree. Let's say I want to create a new product and several items associated with it. I can create a product and two items, link the items to the product and save the product. The framework will automatically save the items too.
Product p = new Product();
p.setBrandName("ACME");
p.setProductName("Widget01");
Item i = new Item();
i.setOid(100001L);
i.setDescription("Singleton Pack / Standard");
i.setWeight("100g");
i.setItemSize("1");
i.setPackSize("0.2 x 0.2 x 0.5");
i.setBasePrice(50L);
i.setOnPallet(false);
p.addProdItem(i);
i = new Item();
i.setOid(100002L);
i.setDescription("100 Pack / Standard");
i.setWeight("10kg");
i.setItemSize("100");
i.setPackSize(null);
i.setBasePrice(2000L);
i.setOnPallet(true);
p.addProdItem(i);
p.save();
But note that I link the items to the product and not the other way around.
It is not enough to code i.setProdItem(p).
The product is what I am saving and everything else has to hang off that as an object tree.
For a fuller example look at the WHPopulate.java code, from which the above snippets were taken.
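One way to picture why the direction of the link matters is the following stand-alone sketch. The `Product` and `Item` classes here are simplified stand-ins written for this example, not the generated CloudTran entities, and the `save` method is an assumed model of the behaviour described above: it starts at the object being saved and walks down through its children, so an item that only points up at the product is never reached.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration; not the CloudTran-generated classes.
class ObjectTreeSaveSketch {

    static final class Item {
        final long oid;
        Product product; // back-reference only; save() below never follows it

        Item(long oid) {
            this.oid = oid;
        }
    }

    static final class Product {
        final String name;
        final List<Item> items = new ArrayList<>();

        Product(String name) {
            this.name = name;
        }

        /** Links the item into this product's tree and sets the back-reference. */
        void addProdItem(Item item) {
            items.add(item);
            item.product = this;
        }

        /** Returns a label for everything reached by walking down from this product. */
        List<String> save() {
            List<String> saved = new ArrayList<>();
            saved.add("Product:" + name);
            for (Item item : items) {
                saved.add("Item:" + item.oid);
            }
            return saved;
        }
    }

    public static void main(String[] args) {
        Product linked = new Product("Widget01");
        linked.addProdItem(new Item(100001L));
        linked.addProdItem(new Item(100002L));
        System.out.println(linked.save());   // product plus both items

        Product unlinked = new Product("Widget02");
        Item orphan = new Item(100003L);
        orphan.product = unlinked;           // only the child side is set
        System.out.println(unlinked.save()); // the product alone
    }
}
```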
3.4.5 Explicit Transactions
When processing data as we have just seen, the transaction is implied. Either the product and items will all be successfully saved, or none of them will. To handle multiple calls in a single transaction I need to create the transaction explicitly and either commit or abort it when I'm done.
This part of the API is held in the CTUtils object and a static call to startTransaction() will return a DistributedTxAttributes object. This can be passed into the save, delete, find and readById methods on an entity.
If we create a Purchase Order to get items supplied to a warehouse, we also want to increment the attribute "quantityOnOrder" on the relevant stock record. We want this to happen as part of a transaction. We could create a large tree with purchase order, supplier, PO lines, supplier items, items, stock and warehouse, but this would be massive overkill.
The WHPO.java client code uses this mechanism.
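// Start an explicit transaction; dta is passed to every call that must take part in it.
DistributedTxAttributes dta = CTUtils.startTransaction();
// Build the purchase order header; w and sup are the warehouse and supplier found earlier.
po = new PurchaseOrder();
po.setOrderDate(new java.sql.Date(new java.util.Date().getTime()));
po.setPaymentType("with order");
po.setPoWh(w);
po.setPoSupp(sup);
// Add one line ordering 10 of item 100001 from this supplier.
pol = new PurchaseOrderLine();
pol.setPolSi(findSI(sup, 100001L));
pol.setQuantity(10);
po.addPoPol(pol);
// Update the stock record and save the order inside the same transaction.
updateStock(w, 100001L, 10, dta);
po.save(dta);
CTUtils.commit(dta);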
where the updateStock method ultimately calls stock.save(dta).
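The commit-or-abort shape of an explicit transaction can be sketched without the grid. Everything in this block is a hypothetical stand-in: `Store` plays the part of the grid and `Tx` the part of the transaction handle, and neither reflects how CloudTran implements transactions internally. It only illustrates the contract described above, namely that writes made under one transaction become visible together on commit, or not at all after an abort.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; not the CTUtils / DistributedTxAttributes API.
class ExplicitTxSketch {

    /** Stand-in for the grid: holds committed values by key. */
    static final class Store {
        final Map<String, Integer> committed = new HashMap<>();
    }

    /** Stand-in for a transaction handle: buffers writes until commit. */
    static final class Tx {
        private final Map<String, Integer> pending = new HashMap<>();

        void save(String key, int value) {
            pending.put(key, value);
        }

        /** Applies every buffered write to the store in one step. */
        void commit(Store store) {
            store.committed.putAll(pending);
            pending.clear();
        }

        /** Discards every buffered write. */
        void abort() {
            pending.clear();
        }
    }

    public static void main(String[] args) {
        Store grid = new Store();
        Tx tx = new Tx();
        try {
            tx.save("PurchaseOrderLine:100001.quantity", 10);
            tx.save("Stock:100001.quantityOnOrder", 10);
            tx.commit(grid);   // both writes become visible together
        } catch (RuntimeException e) {
            tx.abort();        // neither write is applied
            throw e;
        }
        System.out.println(grid.committed);
    }
}
```

In client code the same try/commit, catch/abort structure would wrap the calls that take the transaction object.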
There is another coding issue here to be aware of. The core of the Warehouse application was written pre beta test, before the ability to search by template on relations was implemented. So the only way to find a stock record was to find the relevant item and loop through the related stock records until reaching the one with the relevant warehouse. Of course this approach still works, but it is slower and not scalable.
The two examples above just work as a direct call from the client, and don't need any extra design to achieve that level of processing. But work done in the client code itself can only be run on one processor. More typically, you will want to handle business-related processing scalably, in the grid.
3.4.6 Scalable Processing
Rather than create a purchase order as a one-off, we might want to run a batch job every night, checking which stock levels are low and creating purchase orders to top up the stock levels. Each stock record has a trigger level and a full stock level, and if the quantity drops below the trigger, we look for the cheapest supplier and order up to the full stock. To do this scalably, we create a Service to run on one of the processing units. We could run this operation on either the supplier PU or the warehouse PU. Considerations as to which one to pick include how much data is being pulled across from the other processing unit, how much capacity you are likely to have, whether the code is common to other business processes, etc. In this case I decided to split the processing, with most of it done on the WarehousePU and only the final purchase order creation done on the SupplierPU.
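The reorder rule itself is small enough to sketch on its own. The classes and field names below are hypothetical stand-ins chosen for this example (the real entities are generated from the Warehouse model), and the sketch ignores any quantity already on order, since the rule as stated above does not mention it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the Stock and Supplier Item entities.
class ReorderSketch {

    static final class Stock {
        final long itemOid;
        final int quantity;
        final int triggerLevel;
        final int fullStockLevel;

        Stock(long itemOid, int quantity, int triggerLevel, int fullStockLevel) {
            this.itemOid = itemOid;
            this.quantity = quantity;
            this.triggerLevel = triggerLevel;
            this.fullStockLevel = fullStockLevel;
        }
    }

    static final class SupplierItem {
        final long supplierOid;
        final long itemOid;
        final long price;

        SupplierItem(long supplierOid, long itemOid, long price) {
            this.supplierOid = supplierOid;
            this.itemOid = itemOid;
            this.price = price;
        }
    }

    /** Zero unless the quantity is below the trigger; otherwise the top-up to full stock. */
    static int orderQuantity(Stock s) {
        return s.quantity < s.triggerLevel ? s.fullStockLevel - s.quantity : 0;
    }

    /** Cheapest offer for the given item, or null if no supplier offers it. */
    static SupplierItem cheapest(List<SupplierItem> offers, long itemOid) {
        SupplierItem best = null;
        for (SupplierItem offer : offers) {
            if (offer.itemOid == itemOid && (best == null || offer.price < best.price)) {
                best = offer;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Stock stock = new Stock(100001L, 3, 5, 20);
        List<SupplierItem> offers = Arrays.asList(
            new SupplierItem(1L, 100001L, 60L),
            new SupplierItem(2L, 100001L, 45L));
        int qty = orderQuantity(stock);
        if (qty > 0) {
            SupplierItem best = cheapest(offers, stock.itemOid);
            System.out.println("order " + qty + " from supplier " + best.supplierOid);
        }
    }
}
```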
There are three basic choices for handling methods in services: broadcast, routed and random. Broadcast says run this method on every primary partition of the processing unit. Routed runs the method on the primary partition associated with a particular instance of a master entity. A random method is run on a single primary partition chosen at random. In the case of the batch job, we create a broadcast method on warehouse that runs per warehouse and cycles through the warehouses local to the partition. The best supplier is also found on the same warehouse partition, and a purchase order line is created, batched by supplier. At the end, a routed message containing all the purchase order lines for each supplier is sent to the SupplierPU, where the method creates the actual purchase orders locally.
See the WHBatchPO.java example for details. Of course in reality there would also have to be some e-mailing (or printing and mailing) routine for the purchase orders, which is not part of the example.
There is another example of a scalable processing routine in creating customer orders. Optionally the customer may have to be created, and once again stock values decremented, and possibly even an emergency purchase order created on the fly. See WHCO.java.
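The three dispatch choices, and the batching of purchase order lines by supplier, can be pictured with a stand-alone sketch. None of the names below come from the CloudTran or GigaSpaces API; the partition numbering and the hash-modulo routing are assumptions made only to show which partitions each style of call would reach.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical illustration; not the CloudTran service API.
class ServiceDispatchSketch {

    /** Broadcast: every primary partition runs the method. */
    static List<Integer> broadcastTargets(int partitionCount) {
        List<Integer> targets = new ArrayList<>();
        for (int p = 0; p < partitionCount; p++) {
            targets.add(p);
        }
        return targets;
    }

    /** Routed: the one partition that holds the given master entity (assumed hash-modulo). */
    static int routedTarget(long masterOid, int partitionCount) {
        return Math.floorMod(Long.hashCode(masterOid), partitionCount);
    }

    /** Random: a single primary partition chosen at random. */
    static int randomTarget(int partitionCount, Random rng) {
        return rng.nextInt(partitionCount);
    }

    static final class PoLine {
        final long supplierOid;
        final long itemOid;
        final int quantity;

        PoLine(long supplierOid, long itemOid, int quantity) {
            this.supplierOid = supplierOid;
            this.itemOid = itemOid;
            this.quantity = quantity;
        }
    }

    /** Groups lines by supplier so that each supplier receives one routed message. */
    static Map<Long, List<PoLine>> batchBySupplier(List<PoLine> lines) {
        Map<Long, List<PoLine>> batches = new TreeMap<>();
        for (PoLine line : lines) {
            batches.computeIfAbsent(line.supplierOid, k -> new ArrayList<>()).add(line);
        }
        return batches;
    }

    public static void main(String[] args) {
        int partitions = 3;
        System.out.println("broadcast -> partitions " + broadcastTargets(partitions));

        List<PoLine> lines = new ArrayList<>();
        lines.add(new PoLine(7L, 100001L, 10));
        lines.add(new PoLine(7L, 100002L, 5));
        lines.add(new PoLine(9L, 100001L, 2));
        for (Map.Entry<Long, List<PoLine>> e : batchBySupplier(lines).entrySet()) {
            System.out.println("supplier " + e.getKey() + ": " + e.getValue().size()
                + " line(s) -> partition " + routedTarget(e.getKey(), partitions));
        }
        System.out.println("random -> partition " + randomTarget(partitions, new Random()));
    }
}
```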
3.4.7 Deployment
The default deployment options are set up for trial on a single machine with a single primary and backup partition per space. The datasource is set up to access a MySQL database locally. Both of these can be changed for your machine. The default SQL generated is also for MySQL and can be found at ${workspace}\Warehouse\warehouse.sql. If you use it to create the schema on any other database, you may need to make the appropriate changes to datatypes.