A few weeks ago I attended a ThoughtWorks Technology Radar seminar. I worked at ThoughtWorks for years and think that if anyone knows what is trending up and down in software development, these guys do. At number 17 in Techniques, with a rising arrow, is what they called Thoughtful Caching. At drinks with Scott Shaw, I asked him what it meant.
The trend is a move away from reactive caching to a new style. By reactive I mean that you find out your system doesn’t perform or scale after you have built it and it is already in production. Lots of Ehcache users come to it that way. The move away from that is a trend I am very happy to see.
Deliberate Caching
The new technique is:
- proactive
- planned
- implemented before the system goes live
- deliberate
- is more than turning on caching in your framework and hoping for the best – this is the Thoughtful part
- uses an understanding of the load characteristics and data access patterns
We kicked around a few names for this and came up with Deliberate Caching to sum all of this up.
The work we are doing on standardising caching for Java and JVM-based languages, JSR107, will only aid this transition. It will be included in Java EE 7 which, even for those who have lost interest in following EE specifically, will still send a signal that this is an architectural decision which should be made deliberately.
Why has it taken this long?
So, why has it taken until 10 years after Ehcache, Memcached and plenty of others came along for this “new” trend to emerge? I think there are a few reasons.
Some people think caching is dirty
I have met plenty of developers who think that caching is dirty, and that caching is cheating. They think it indicates some architectural design failure that is best solved some other way.
One of the causes of this is that many early and open source caches (including Ehcache) placed limits on the data safety that could be achieved. So the usual situation was that the data in the cache might be correct but was not guaranteed to be. Complicated discussions with Business Analysts were required to find out whether this was acceptable and how stale data was allowed to be. This has been overcome by the emergence of enterprise caches, such as Enterprise Ehcache, so named because they are feature rich and contain extensive data safety options, including in Ehcache’s case: weak consistency, eventual consistency, strong consistency, explicit locking, local and XA transactions, and atomic operations. So you can use caching even in situations where the data has to be right.
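To make the "atomic operations" point concrete, here is a minimal sketch of why they matter when cached data has to be right. It uses java.util.concurrent.ConcurrentHashMap as a stand-in and is not Ehcache's own API; the class StockCache and its methods loadIfAbsent, reserveOne and level are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Stand-in for a cache offering atomic operations (assumption: not Ehcache's API). */
final class StockCache {
    private final ConcurrentMap<String, Integer> stock = new ConcurrentHashMap<>();

    /** Caches the level read from the store only if no other thread cached one first. */
    int loadIfAbsent(String sku, int levelFromStore) {
        Integer existing = stock.putIfAbsent(sku, levelFromStore);
        return existing != null ? existing : levelFromStore;
    }

    /**
     * Reserves one unit with a compare-and-swap loop: the write succeeds only
     * if the cached value is still the one we read, so no decrement is lost.
     */
    boolean reserveOne(String sku) {
        while (true) {
            Integer current = stock.get(sku);
            if (current == null || current <= 0) {
                return false; // nothing cached, or nothing left to reserve
            }
            if (stock.replace(sku, current, current - 1)) {
                return true;  // nobody changed the value in between
            }
            // Another thread won the race; re-read and try again.
        }
    }

    /** Current cached level, or 0 if the item is not cached. */
    int level(String sku) {
        return stock.getOrDefault(sku, 0);
    }

    public static void main(String[] args) {
        StockCache cache = new StockCache();
        cache.loadIfAbsent("sku-1", 2); // hypothetical item with two units
        System.out.println(cache.reserveOne("sku-1"));
        System.out.println(cache.level("sku-1"));
    }
}
```

Without the conditional replace, two threads could both read the same level and both write the same decremented value, which is exactly the kind of "might be correct" data that gave caching a bad name.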
Following the lead of the giant dotcoms
The other thing that has happened is the rise of the giant dotcoms. It cannot have escaped anyone’s notice that they all use tons of caching, and that they won’t work if the caching layer is down. So much so that if you are building a big dotcom app, it is clear that you need to build a caching layer in.
Early Performance Optimisation is seen as an anti-pattern
Under Agile we focus on the simplest thing that can possibly work. Requirements are expected to keep changing. Any punts you take on future requirements may turn out to be wrong and your effort wasted. You only add things once it is clear they are needed. Performance and scalability tend to get done this way as well. Following this model you find out about the requirement after you put the app in production and it fails. This same way of thinking causes monolithic systems with single data stores to be built which later turn out to need expensive re-architecting.
I think we need to look at this as Capacity Planning. If we get estimated numbers at the start of the project for number of users, required response times, data volumes, access patterns and so on, then we can capacity plan the architecture as well as the hardware. And in that architecture planning we can plan to use caching. Because caching affects how the system is architected and what the hardware requirements are, it makes sense to do it then.
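As a rough illustration of that kind of planning, the sketch below turns a few up-front estimates into a required cache hit ratio and the load left on the data store. It assumes a deliberately simple model in which mean response time is hit * cacheMs + (1 - hit) * storeMs; the class CachePlan, its method names and all the numbers in main are hypothetical, not figures from a real system.

```java
/** Back-of-the-envelope capacity plan for a cache in front of a slower data store. */
final class CachePlan {

    /**
     * Minimum hit ratio (0.0 to 1.0) so that the mean response time
     * hit * cacheMs + (1 - hit) * storeMs does not exceed targetMs.
     */
    static double requiredHitRatio(double cacheMs, double storeMs, double targetMs) {
        if (cacheMs >= storeMs) {
            throw new IllegalArgumentException("cache must be faster than the store");
        }
        if (targetMs < cacheMs) {
            throw new IllegalArgumentException("target is faster than the cache itself");
        }
        // Solve the mean response time equation for the hit ratio.
        double ratio = (storeMs - targetMs) / (storeMs - cacheMs);
        // A target slower than the store needs no cache hits at all.
        return Math.max(0.0, Math.min(1.0, ratio));
    }

    /** Requests per second that still reach the data store at a given hit ratio. */
    static double storeLoad(double peakRequestsPerSecond, double hitRatio) {
        return peakRequestsPerSecond * (1.0 - hitRatio);
    }

    public static void main(String[] args) {
        // Hypothetical estimates gathered at the start of a project.
        double cacheMs = 1.0;
        double storeMs = 51.0;
        double targetMs = 11.0;
        double peakRps = 2000.0;

        double hit = requiredHitRatio(cacheMs, storeMs, targetMs);
        System.out.printf("required hit ratio: %.2f%n", hit);
        System.out.printf("store load at peak: %.0f requests/s%n", storeLoad(peakRps, hit));
    }
}
```

Whether the access patterns can actually deliver the required hit ratio is then a question you can put to the data before go-live, rather than discovering the answer in production.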
Reference: Introducing Deliberate Caching from our JCG partner Greg Luck at Greg Luck’s Blog.