When and how to use a ThreadLocal
As our readers might already have guessed, I deal with memory leaks on a daily basis. A particular type of the OutOfMemoryError messages has recently started catching my attention – the issues triggered by misused ThreadLocals have become more and more frequent. Looking at the causes for such leakages, I am starting to believe that more than half of those are caused by developers who either have no clue what they are doing or who are trying to apply a solution to the problems which it is not meant to solve.
Instead of grinding my teeth, I decided to open up the topic by publishing two articles, first of which you are currently reading. In the post I explain the motivation behind ThreadLocal usage. In the second post currently in progress I will open up the ThreadLocal bonnet and look at the implementation.
Let us start with an imaginary scenario in which ThreadLocal usage is indeed reasonable. For this, say hello to our hypothetical developer, named Tim. Tim is developing a webapp, in which there is a lot of localized content. For example a user from California would expect to be greeted with date formatted using a familiar MM/dd/yy pattern, one from Estonia on the other hand would like to see a date formatted according to dd.MM.yyyy. So Tim starts writing code like this:
public String formatCurrentDate() { DateFormat df = new SimpleDateFormat("MM/dd/yy"); return df.format(new Date()); } public String formatFirstOfJanyary1970() { DateFormat df = new SimpleDateFormat("MM/dd/yy"); return df.format(new Date(0)); }
After a while, Tim finds this to be boring and against good practices – the application code is polluted with such initializations. So he makes a seemingly reasonable move by extracting the DateFormat to an instance variable. After making the move, his code now looks like the following:
private DateFormat df = new SimpleDateFormat("MM/dd/yy"); public String formatCurrentDate() { return df.format(new Date()); } public String formatFirstOfJanyary1970() { return df.format(new Date(0)); }
Happy with the refactoring results, Tim tosses an imaginary high five to himself, pushes the change to the repository and walks home. Few days later the users start complaining – some of them seem to get completely garbled strings instead of the former nicely formatted dates.
Investigating the issue Tim discovers that the DateFormat implementation is not thread safe. Meaning that in the scenario above, if two threads simultaneously use the formatCurrentDate() and formatFirstOfJanyary1970() methods, there is a chance that the state gets mangled and displayed result could be messed up. So Tim fixes the issue by limiting the access to the methods to make sure one thread at a time is entering at the formatting functionality. Now his code looks like the following:
private DateFormat df = new SimpleDateFormat("MM/dd/yy"); public synchronized String formatCurrentDate() { return df.format(new Date()); } public synchronized String formatFirstOfJanyary1970() { return df.format(new Date(0)); }
After giving himself another virtual high five, Tim commits the change and goes to a long-overdue vacation. Only to start receiving phone calls next day complaining that the throughput of the application has dramatically fallen. Digging into the issue he finds out that synchronizing the access has created an unexpected bottleneck in the application. Instead of entering the formatting sections as they pleased, threads now have to wait behind one another.
Reading further about the issue Tim discovers a different type of variables called ThreadLocal. These variables differ from their normal counterparts in that each thread that accesses one (via ThreadLocal’s get or set method) has its own, independently initialized copy of the variable. Happy with the newly discovered concept, Tim once again rewrites the code:
public static ThreadLocal df = new ThreadLocal() { protected DateFormat initialValue() { return new SimpleDateFormat("MM/dd/yy"); } }; public String formatCurrentDate() { return df.get().format(new Date()); } public String formatFirstOfJanyary1970() { return df.get().format(new Date(0)); }
Going through a process like this, Tim has through painful lessons learned a powerful concept. Applied like in the last example, the result serves as a good example about the benefits.
But the newly-found concept is a dangerous one. If Tim had used one of the application classes instead of the JDK bundled DateFormat classes loaded by the bootstrap classloader, we are already in the danger zone. Just forgetting to remove it after the task at hand is completed, a copy of that Object will remain with the Thread, which tends to belong to a thread pool. Since lifespan of the pooled Thread surpasses that of the application, it will prevent the object and thus a ClassLoader being responsible for loading the application from being garbage collected. And we have created a leak, which has a chance to surface in a good old java.lang.OutOfMemoryError: PermGen space form
Another way to start abusing the concept is via using the ThreadLocal as a hack for getting a global context within your application. Going down this rabbit hole is a sure way to mangle your application code with all kind of unimaginary dependencies coupling your whole code base into an unmaintainable mess.