com.atlassian.crowd.util.persistence.hibernate.batch
Class BatchProcessorImpl

java.lang.Object
  extended by com.atlassian.crowd.util.persistence.hibernate.batch.BatchProcessorImpl
All Implemented Interfaces:
BatchProcessor

public class BatchProcessorImpl
extends java.lang.Object
implements BatchProcessor

Thread-safe batch processor. Currently capable of performing "find", "saveOrUpdate" and "replicate" on a collection of entities. This processor is essentially a heavyweight generic DAO for batching inserts or updates to NamedEntities and for performing finds on DirectoryEntities. A NamedEntity is just an Object with a getName() method suitable for logging the names of objects that could not be processed.

It is assumed that all data cleansing (e.g. converting to lower case, assigning the directoryID) is performed before calling methods on the BatchProcessor.

The batchSize defaults to 20 and can be set manually (via Spring, for example) using the appropriate setter method. The batchSize should match the hibernate.jdbc.batch_size property defined in the Hibernate configuration.

You can use the BatchProcessor as a singleton. There is no need to wrap the BatchProcessor in a transaction or a Hibernate session, as transactions and Hibernate sessions are managed internally by this class.

Each batch operation is first divided into smaller sets of at most batchSize entities. Each batchSet is added via a batched JDBC call (using Hibernate's batching mechanism). If there is an error in processing the batch, the batched JDBC call is rolled back and the entities in the batchSet are processed individually. This mechanism ensures very fast (JDBC-batched) inserts and updates and follows them up with a fail-over retry for the failing batches.

NOTE 1: do not use this if your database is not transactional, *stab* MySQL ISAM.

NOTE 2: if you are experiencing problems with the BatchProcessor, set the log level on this class to DEBUG via log4j.properties:
log4j.logger.com.atlassian.crowd.util.persistence.hibernate.BatchProcessor=DEBUG

NOTE 3: it is quite easy to generify this batch processor if you would like batch processing outside Crowd.
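
The divide-then-retry mechanism described above can be illustrated with a minimal, Hibernate-free sketch. It is not the actual Crowd implementation: the class name BatchFallbackSketch, the partition and process helpers, and the two Consumer callbacks (standing in for the JDBC-batched call and the per-entity retry) are all assumptions made for illustration, and transaction handling is reduced to a comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not part of the Crowd API.
class BatchFallbackSketch {

    /** Splits the items into consecutive chunks of at most batchSize elements. */
    static <E> List<List<E>> partition(Collection<E> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<E> all = new ArrayList<>(items);
        List<List<E>> chunks = new ArrayList<>();
        for (int i = 0; i < all.size(); i += batchSize) {
            chunks.add(new ArrayList<>(all.subList(i, Math.min(i + batchSize, all.size()))));
        }
        return chunks;
    }

    /**
     * Tries each chunk as a whole; if the chunk fails, retries its members
     * one by one. Returns the items that also failed individually.
     */
    static <E> List<E> process(Collection<E> items, int batchSize,
                               Consumer<List<E>> batchOp, Consumer<E> singleOp) {
        List<E> failed = new ArrayList<>();
        for (List<E> chunk : partition(items, batchSize)) {
            try {
                batchOp.accept(chunk); // stands in for one transactional, JDBC-batched call
            } catch (RuntimeException batchError) {
                // A real implementation would roll back the batch transaction here.
                for (E item : chunk) {
                    try {
                        singleOp.accept(item); // individual fail-over retry
                    } catch (RuntimeException itemError) {
                        failed.add(item);
                    }
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Assumed rule for the demo: names that are not lower case are rejected.
        Consumer<String> single = name -> {
            if (!name.equals(name.toLowerCase())) {
                throw new IllegalArgumentException("not cleansed: " + name);
            }
        };
        Consumer<List<String>> batch = chunk -> chunk.forEach(single);

        List<String> names = Arrays.asList("alice", "bob", "Carol", "dave", "erin");
        System.out.println(process(names, 2, batch, single));
    }
}
```

In this sketch only the chunk containing the rejected name falls back to one-by-one processing; the other chunks go through the batched path untouched.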

Author:
Shihab Hamid

Constructor Summary
BatchProcessorImpl(org.hibernate.SessionFactory sessionFactory)
           
 
Method Summary
<E extends DirectoryEntity>
java.util.Collection<E>
find(long directoryID, java.util.Collection<java.lang.String> names, java.lang.Class<E> persistentClass)
          Returns a collection of entities that match the names provided.
<E extends java.io.Serializable>
BatchResult<E>
merge(java.util.Collection<E> objects)
          Merge (almost SaveOrUpdate) a set of entities using Hibernate/JDBC batching.
<E extends DirectoryEntity>
java.util.Collection<E>
processBatchFind(org.hibernate.Session session, long directoryID, java.util.Collection<java.lang.String> names, java.lang.Class<E> persistentClass)
           
<E extends java.io.Serializable>
BatchResult<E>
replicate(java.util.Collection<E> objects, org.hibernate.ReplicationMode replicationMode)
          Replicate a set of entities using Hibernate/JDBC batching.
<E extends java.io.Serializable>
BatchResult<E>
saveOrUpdate(java.util.Collection<E> objects)
          Merge (almost SaveOrUpdate) a set of entities using Hibernate/JDBC batching.
 void setBatchSize(int batchSize)
          The batchSize value should be the same as the hibernate.jdbc.batch_size Hibernate property.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BatchProcessorImpl

public BatchProcessorImpl(org.hibernate.SessionFactory sessionFactory)
Method Detail

replicate

public <E extends java.io.Serializable> BatchResult<E> replicate(java.util.Collection<E> objects,
                                                                 org.hibernate.ReplicationMode replicationMode)
Description copied from interface: BatchProcessor
Replicate a set of entities using Hibernate/JDBC batching. If some entities fail replication, the rest can still succeed.

Specified by:
replicate in interface BatchProcessor
Parameters:
objects - entities to replicate.
replicationMode - how to perform the replication, eg. ReplicationMode.OVERWRITE.
Returns:
a BatchResult indicating the total number of entities for which processing was attempted and the collection of entities that failed to be processed.

merge

public <E extends java.io.Serializable> BatchResult<E> merge(java.util.Collection<E> objects)
Description copied from interface: BatchProcessor
Merge (almost SaveOrUpdate) a set of entities using Hibernate/JDBC batching. If some entities fail saving/updating, the rest can still succeed. We need to 'merge' rather than 'saveOrUpdate' because 'saveOrUpdate' sets the identifier on an object, and a subsequent operation on the object then results in a StaleStateException. Further reading: http://www.jroller.com/hasant/entry/hibernate_saveorupdate_trap_for_web

Use this method ONLY if the objects have Hibernate (Hi-Lo) generated identifiers, i.e. a Long id field.

NOTE: this method can handle TransactionGroups.

Specified by:
merge in interface BatchProcessor
Parameters:
objects - entities to save or update. These objects can be wrapped in TransactionGroups to ensure that a group of objects is inserted together.
Returns:
a BatchResult indicating the total number of entities for which processing was attempted and the collection of entities that failed to be processed. For a TransactionGroup, only the primary object will be reported in the collection of failed entities.

saveOrUpdate

public <E extends java.io.Serializable> BatchResult<E> saveOrUpdate(java.util.Collection<E> objects)
Description copied from interface: BatchProcessor
Merge (almost SaveOrUpdate) a set of entities using Hibernate/JDBC batching. If some entities fail saving/updating, the rest can still succeed.

Use this method ONLY if the objects DO NOT have Hibernate (Hi-Lo) generated identifiers, i.e. the POJO does not have a Long id field.

NOTE: this method cannot handle TransactionGroups.

WARNING: this is currently NOT used in Crowd.

Specified by:
saveOrUpdate in interface BatchProcessor
Parameters:
objects - entities to save or update. These objects can be wrapped in TransactionGroups to ensure that a group of objects is inserted together.
Returns:
a BatchResult indicating the total number of entities for which processing was attempted and the collection of entities that failed to be processed. For a TransactionGroup, only the primary object will be reported in the collection of failed entities.

find

public <E extends DirectoryEntity> java.util.Collection<E> find(long directoryID,
                                                                java.util.Collection<java.lang.String> names,
                                                                java.lang.Class<E> persistentClass)
Returns a collection of entities that match the names provided. Any names that cannot be matched to persistent entities are not present in the resultant collection. Internally, this performs a query of the form:

SELECT * FROM entityTable WHERE entityName IN (...)

This is batched such that the size of the IN clause is at most the batchSize.
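
As a rough illustration of how the IN clause can be kept within batchSize, the sketch below builds one parameterised query string per chunk of names. It is an assumption about the general technique, not the code this class runs: InClauseBatchSketch and buildQueries are hypothetical names, entityTable and entityName are the placeholder identifiers from the description above, and the directoryID restriction is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of the Crowd API.
class InClauseBatchSketch {

    /** Builds one query per chunk so that no IN clause exceeds batchSize placeholders. */
    static List<String> buildQueries(Collection<String> names, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> all = new ArrayList<>(names);
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < all.size(); i += batchSize) {
            int chunkSize = Math.min(batchSize, all.size() - i);
            // One "?" placeholder per name in this chunk.
            String placeholders = String.join(", ", Collections.nCopies(chunkSize, "?"));
            queries.add("SELECT * FROM entityTable WHERE entityName IN (" + placeholders + ")");
        }
        return queries;
    }
}
```

With five names and a batchSize of 2, this yields three queries whose IN clauses hold two, two and one placeholders; the results of the individual queries would then be combined into a single collection.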

Specified by:
find in interface BatchProcessor
Parameters:
directoryID - directory ID of the entities to return.
names - collection of entity names. This, along with the directoryID, should form the primary key of the entity.
persistentClass - the persistent class to lookup. This must be a Hibernate-mapped DirectoryEntity.
Returns:
a collection of the DirectoryEntities that exist matching any of the supplied names.

processBatchFind

public <E extends DirectoryEntity> java.util.Collection<E> processBatchFind(org.hibernate.Session session,
                                                                            long directoryID,
                                                                            java.util.Collection<java.lang.String> names,
                                                                            java.lang.Class<E> persistentClass)

setBatchSize

public void setBatchSize(int batchSize)
The batchSize value should be the same as the hibernate.jdbc.batch_size Hibernate property.

Parameters:
batchSize - batch size used to group batches.
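
One way to keep the two values aligned is to derive the batch size from the same properties that configure Hibernate. The following is a sketch under that assumption: BatchSizeConfigSketch and resolveBatchSize are hypothetical helpers, and the fallback of 20 mirrors the default stated in the class description.

```java
import java.util.Properties;

// Hypothetical illustration only; not part of the Crowd API.
class BatchSizeConfigSketch {

    static final String BATCH_SIZE_PROPERTY = "hibernate.jdbc.batch_size";
    static final int DEFAULT_BATCH_SIZE = 20; // default stated in the class description

    /** Reads hibernate.jdbc.batch_size, falling back to the documented default when unset. */
    static int resolveBatchSize(Properties hibernateProperties) {
        String raw = hibernateProperties.getProperty(BATCH_SIZE_PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_BATCH_SIZE;
        }
        int size = Integer.parseInt(raw.trim());
        if (size <= 0) {
            throw new IllegalArgumentException(BATCH_SIZE_PROPERTY + " must be positive: " + size);
        }
        return size;
    }
}
```

The resolved value could then be passed to setBatchSize(int) when the processor is wired up.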


Copyright © 2009 Atlassian Pty Ltd. All Rights Reserved.