Most efficient way to increment a Map value in Java – Only search the key once
This question may be considered too basic, but is frequently asked in the forums. In this post, I will discuss one way that only searches the key in Map ONCE.
Let’s first look at an example. Say I’m creating a string frequency list, using a Map, where each key is a String that is being counted and the value is an Integer that’s incremented each time a String is added. A straightforward way of achieving it is
int count = map.containsKey(string) ? map.get(string) : 0; map.put(string, count + 1);
This piece of code runs quite slowly because it contains three potentially expensive operations on a map, namely containsKey(), get(), and [put()](http://docs.oracle.com/javase/7/docs/api/java/util/Map.html#put(K, V)). Each requires to search the key in the map. Now let’s refactor the code for better performance.
Integer vs MutableInteger vs AtomicInteger
One important reason that we have to invoke three expensive operations is using Integer for counting. In Java, Integer is immutable. It prevents us modifing the integer value after construction. Therefore, to increment a counter, we have to first get the integer from the map, then create another new integer by adding one, and put it back to the map.
To make the counter mutable, there are several ways. One is to simply create your own MutableInteger, like what I showed below.
public class MutableInteger { private int val; public MutableInteger(int val) { this.val = val; } public int get() { return val; } public void set(int val) { this.val = val; } }
Another way might be using AtomicInteger in Java, which is used in applications such as atomically incremented counters. But the main choice for AtomicInteger is if you want to achieve thread safety with the operations on the integer. Therefore it cannot be used as a replacement for an Integer. Based on this, if thread-safety is not a strong consideration of your project, I won’t recommend using AtomicInteger.
Search the key only once
After using the MutableInteger, we can change the above code to
if (map.containsKey(string)) { MutableInteger count = map.get(string); count.set(count.get() + 1); } else { map.put(string, new MutableInteger(1)); }
or
MutableInteger count = map.get(string); if (count != null) { count.set(count.get() + 1); } else { map.put(string, new MutableInteger(1)); }
In the worst case, when the key has not been seen before, the code will search the key twice: one for retrieving, one for setting. It is much better than the previous one. But we should NOT be satisfied right now and stop. If you checked the [Map.put()](http://docs.oracle.com/javase/7/docs/api/java/util/Map.html#put(K, V)) method in Java document, you will find that this method will return the previous value associated with key
. This means we can merge the retrieving and setting into one. However, you may wonder: if we don’t retrieve the counter first, how we can set the new counter? Now we are finally touching the most tricky part in this post: we can simplify put a zero-frequency counter!
public int incrementCount(K key, int count) { MutableInteger tmpCount = new MutableInteger(0); MutableInteger oldCount = map.put(key, tmpCount); if (oldCount != null) { count += oldCount.get(); } tmpCount.set(count); return count; }
Yet another Counter
It looks like putting all necessary operations into a class will be helpful for the future use. Therefore I create a class called Counter and make it public available. The Counter defines a collection that counts the number of times an object appears in the collection. Suppose you have a Counter that contains {a, a, b, c}
. Calling getCount() on “a” would return 2, while calling keySet() would return {a, b, c}
. This class works like a Map, but with different methods for easily getting/setting/incrementing counts for objects and computing various functions with the counts. The Counter constructor and addAll() method can be used to copy another Counter’s contents over. The Counter class is modified according IntCounter and AbstractMapBag.
Some highlighted operations on Counter include
- incrementCount() and decrementCount(): Adds/subtracts the given count to the current count for the given key. If the key hasn’t been seen before, it is assumed to have count 0, and thus the increment method will set its count to the given amount. The decrement will set its count to -1.
- getCount(): returns the current count for the given key, which is 0 if it hasn’t been seen before.
- keysAt(), keysAbove(), and keysBelow(): returns the set of keys whose counts are at, above, or below the given threshold. This set may have 0 elements but will not be null.
- argmin() and argmax(): finds and returns the key in this Counter with the smallest/largest count. If there are several min/max counts, random value is returned. Returns null if this Counter is empty.
int count = map.containsKey(string) ? map.get(string) : 0;
map.put(string, count + 1);
For a HashMap the amortized time complexity of containKey, get and put is O(1). Pretty inexpensive I would have though.
I agree with you. It’s inexpensive when using HashMap, but may be expensive when using TreeMap though.
Or, one could use Trove library for primitive collections. It has TObjectIntHashMap that contains a handy .adjustOrPutValue(key, adjust_amount, put_value)
Of course Trove library or Apache common library provides similar class/function, but that is not the point of this post. :)
Or, one could use Guava’s Multiset:
multiset.add(string);
multiset.add(string);
int count = multiset.count(string); // count == 2
Of course Guava library provides similar class/function, but that is not the point of this post. :)
This is a clever usage of put’s return value. My only cautions are that: 1) This assumes there is no cost to putting a new value at the key. When handing maps on disk this is not the case and so the clever code must only be used for maps in memory. 2) Too much clever code raises maintenance costs.
Agree. It is safer to use third-party reliable libraries. This post is only for eduaation use.
Interesting and clever usage of map api, but I’m a little bit curious about the extra effort for create a new instance of MutableInteger each time the value has to be increased.
Do you have any numbers?
good question! i never thought about it before. and unfortunately, i dont have any numbers right now.