Using Infinispan as a persistency solution
Cross-posted from https://vaadin.com/blog/-/blogs/using-infinispan-as-a-persistency-solution. Thanks Fredrik and Matti for your permission!
Various RDBMSs are the de facto standard for persistency. They are such a safe bet for architects that I dare say they are used in too many places nowadays. To fight against this, I have recently been exploring alternative persistency options, like graph databases. This time I played with Infinispan.
In case you are not familiar with Infinispan, or distributed key/value data stores in general, you could think of it as a HashMap on steroids. Most importantly, the map is shared among all your cluster nodes. With clustering you can gain huge capacity, blazing fast access and redundancy, depending on how you configure it. There are several products that compete with Infinispan, like Ehcache and Hazelcast from the open source world and Oracle Coherence from the commercial side.
Actually, Infinispan is a technology that you might have used without noticing it at all. For example, the high availability features of Wildfly rely heavily on Infinispan caches. It is also often used as a second level cache for ORM libraries. But it can also be used directly as a persistency library.
Why would you consider it as your persistency solution:
- It is a lightning fast in-memory data storage
- The stored value can be any serializable object, no complex mapping libraries needed
- It is built from the ground up for a clustered environment, so your data is safer and faster to access, and horizontal scaling is very easy
- It has multiple optional cache store alternatives, for writing the state to e.g. disk for cluster wide reboots
- Not all data needs to be stored forever, and Infinispan has sophisticated built-in eviction rules
- Possibility to use transactional access for ACID changes
Sounds pretty amazing, doesn’t it? And it sure is for certain use cases, but all technologies have their weaknesses and so do key/value data stores. Compared to RDBMSs, the largest drawback is with relations to other entities. You’ll have to come up with a strategy for how to store references to other entities, and searching based on properties of related entities must also be tackled. If you end up pondering these questions, be sure to check if Hibernate OGM could help you.
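One possible strategy for references, sketched here with plain java.util.Map instances standing in for two caches: store the key of the related entity instead of the object itself, and maintain a small secondary index by hand for the reverse lookup. The Customer and Order classes and all method names are made up for this illustration (Java 8 syntax); this is not an Infinispan API.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReferenceSketch {

    // Hypothetical entities, invented for this example.
    static class Customer implements Serializable {
        final String id;
        final String name;
        Customer(String id, String name) { this.id = id; this.name = name; }
    }

    static class Order implements Serializable {
        final String id;
        final String customerId; // the reference is stored as a key, not as an object
        Order(String id, String customerId) { this.id = id; this.customerId = customerId; }
    }

    // Plain maps stand in for caches here (assumption: one "cache" per entity type).
    final Map<String, Customer> customers = new HashMap<>();
    final Map<String, Order> orders = new HashMap<>();
    // Hand-maintained secondary index: customer id -> ids of that customer's orders.
    final Map<String, List<String>> orderIdsByCustomer = new HashMap<>();

    void saveCustomer(Customer c) {
        customers.put(c.id, c);
    }

    void saveOrder(Order o) {
        orders.put(o.id, o);
        orderIdsByCustomer.computeIfAbsent(o.customerId, k -> new ArrayList<>()).add(o.id);
    }

    // Following a reference is a second lookup by key.
    Customer customerOf(Order o) {
        return customers.get(o.customerId);
    }

    // The reverse direction goes through the index instead of scanning all orders.
    List<Order> ordersOf(String customerId) {
        List<Order> result = new ArrayList<>();
        for (String orderId : orderIdsByCustomer.getOrDefault(customerId, Collections.<String>emptyList())) {
            result.add(orders.get(orderId));
        }
        return result;
    }
}
```

The price of this approach is that the index must be kept in sync by your own code on every write and delete, which is exactly the kind of work an RDBMS or a mapping layer would otherwise do for you.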
Also, analyzing the data can be considered simpler, or at least more familiar, with traditional SQL queries. Especially if you end up having a lot of data, distributed on multiple nodes, you’ll have to learn the basics of the MapReduce programming model to do any non-trivial queries.
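To give a feel for the MapReduce model, here is a minimal sketch in plain Java (Java 8 syntax) that counts entries per city. Each list passed to mapPhase plays the role of the data held by one node; the class and method names are invented for the illustration, and this is not Infinispan's own MapReduce API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapReduceSketch {

    // Map phase: would run on each node against its local entries only.
    // It emits a count of 1 per city and folds the counts into a partial result.
    static Map<String, Integer> mapPhase(List<String> localCities) {
        Map<String, Integer> partial = new HashMap<>();
        for (String city : localCities) {
            partial.merge(city, 1, Integer::sum);
        }
        return partial;
    }

    // Reduce phase: combines the partial results collected from all nodes.
    static Map<String, Integer> reducePhase(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return total;
    }
}
```

The point of the split is that only the small partial results travel over the network, while the bulk of the data stays on the node that owns it. In SQL the same question would be a single GROUP BY query.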
Using Infinispan in a web application
Although Infinispan is not tied to Wildfly, I decided to base my experiments on Wildfly. The Infinispan version bundled with the server is available to web applications if you explicitly request it. The easiest way to do this is to add the following MANIFEST.MF entry to your war file. If you don’t want to clutter your project with extra files, just add the entry using a small war plugin config.
Dependencies: org.infinispan export
Naturally you’ll still want to add an Infinispan dependency to your application, but you can set its scope to provided. Be sure to use the same version as the one provided by your server. In Wildfly 8, the Infinispan version is 6.0.2. In a Maven project, add this kind of dependency declaration:
<dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-core</artifactId>
    <version>6.0.2.Final</version>
    <!-- Provided as we use the Infinispan provided by Wildfly -->
    <scope>provided</scope>
</dependency>
Before accessing Infinispan “caches”, you need to configure them. There are both programmatic and XML configurations available. With Wildfly, it is most natural to configure the Infinispan data store right in the server config. The “right” config file depends on how you are launching your Wildfly server. If you are testing clustering locally, you probably want to add something like this into your domain.xml, under the <subsystem xmlns="urn:jboss:domain:infinispan:2.0"> section.
<cache-container name="myCache" default-cache="cachedb">
    <transport lock-timeout="60000"/>
    <replicated-cache name="cachedb" batching="true" mode="SYNC"/>
</cache-container>
Note that with this config, the data is only stored within the memory of cluster nodes. To learn how to tweak cache settings or to set up disk “backup”, refer to the extensive Infinispan documentation.
To remove all Infinispan references from the UI code, I created an EJB that does all the data access. There I inject the CacheContainer provided by Wildfly and fetch the default cache in an init method.
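// Cache container configured in the Wildfly Infinispan subsystem
@Resource(lookup = "java:jboss/infinispan/container/myCache")
CacheContainer cc;

Map<String, MyEntity> cache;

@PostConstruct
void init() {
    // Fetch the container's default cache ("cachedb" in the config above)
    this.cache = cc.getCache();
}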
In case you are wondering: yes, the Map is the very familiar java.util.Map interface, and the rest of the implementation is trivial for any Java developer. Infinispan caches implement the basic Map interface, but in case you need some more advanced features, you can also use the Cache or AdvancedCache types.
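As a rough idea of what “the rest of the implementation” can look like, here is a minimal sketch of the data access methods written purely against the map interface. The MyEntity fields and all method names are assumptions made for this example, and a ConcurrentHashMap stands in for the injected cache so that the code runs without a server. Since Infinispan's Cache also extends ConcurrentMap, atomic operations such as putIfAbsent should be usable the same way.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MyEntityService {

    // Hypothetical POJO; the fields are invented for this example.
    static class MyEntity implements Serializable {
        private final String id;
        private final String name;
        MyEntity(String id, String name) { this.id = id; this.name = name; }
        String getId() { return id; }
        String getName() { return name; }
    }

    private final ConcurrentMap<String, MyEntity> cache;

    MyEntityService(ConcurrentMap<String, MyEntity> cache) {
        this.cache = cache;
    }

    // Stand-in for the injected cache, useful for trying things out locally.
    static MyEntityService inMemory() {
        return new MyEntityService(new ConcurrentHashMap<String, MyEntity>());
    }

    // Insert or overwrite by key.
    void save(MyEntity entity) {
        cache.put(entity.getId(), entity);
    }

    // Atomic "insert only if the key is free"; returns true if the entity was stored.
    boolean createIfMissing(MyEntity entity) {
        return cache.putIfAbsent(entity.getId(), entity) == null;
    }

    MyEntity find(String id) {
        return cache.get(id);
    }

    void delete(String id) {
        cache.remove(id);
    }

    // Copy of all values currently visible in the map.
    List<MyEntity> findAll() {
        return new ArrayList<>(cache.values());
    }
}
```

Keeping the entity immutable and calling save again for every change is a deliberate choice in this sketch: other cluster nodes only learn about a change when the value is put into the cache again. Bulk operations like values() may behave differently on a clustered cache than on a local map, so check the Infinispan documentation before relying on findAll with large data sets.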
The MyEntity in the previous code snippet is just a very simple POJO I created for the example. With Vaadin CDI, I can then inject the EJB into my UI class and do pretty much anything with it. The actual Vaadin code has no special tricks, just normal CDI-spiced Vaadin code.
Based on this exercise, would I use Infinispan directly for persistency in my next project? Probably not, but for certain apps, without hesitation. I can also imagine certain hybrid models where some of the data is only in an Infinispan cache and some in a traditional RDBMS, naturally behind an ORM, taking the best of both worlds.
We’ll also be using Infinispan in our upcoming joint webinar with Arun Gupta from Red Hat on September 8th, 2014. There we’ll show you a simple Vaadin application and how easy it can be to cluster it using Wildfly.
Reference: Using Infinispan as a persistency solution from our JCG partner Arun Gupta at the Miles to go 2.0 … blog.