A Simple Map Iterator Performance Test
There are many dimensions of Java Map performance to measure, but a critical one is simple single-threaded scanning. Here is some simple test code for Iterators and the Java 8 Map.forEach()
along with some graphical results.
1. Performance Testing is Difficult
Performance testing is a notoriously difficult endeavor, and precise repeatable testing requires a framework like the Java Microbenchmarking Harness to filter out many noise sources and provide statistics such as the confidence interval. However, the results from the simple test code here are relatively repeatable and correlate well with JMH tests the author has done on the JDK Maps and AirConcurrentMap (at boilerbay.com and github boilerbay/airconcurrentmap). There are also more extensive JMH tests there for streams and so on.
JMH is not complex, but it requires switching gears from an IDE to the maven model, learning the set of control annotations, and dealing with yet another directory tree full of cruft and possibly a new IDE project. Also, we want automatically to span a spectrum of Map sizes and a set of Maps in one run: this requires ‘parameterization’ annotations in JMH. The output data format of this test can easily be adjusted for the needs of the desired graphing tool.
1.1 Issues Causing Noise
Many issues must be addressed to get a good test:
- The inner loop must not have significant overhead. This is solved by having it simply watch changes in a shared static volatile int variable incremented by a ‘heartbeat’ thread;
- Garbage collection may jump in. This is helped by distributing GC overhead round-robin over the tests for different Maps. We also avoid creating any unreachable Objects. Use -Xloggc:<file> in the ORACLE JVM to watch GC;
- Code may be optimized at any time by the JVM during the run. We use warmup runs of each test to allow this to happen. It is hard to know when this will happen exactly, but it is a matter of seconds. This can be minimized with longer warmups. The typical JVM flag ‘-XX:+PrintCompilation’ shows the progress of optimization. Also see the ORACLE article ;
- The JVM will grow erratically while the Maps grow, so the test right after a growth must warmup;
- Different Maps use different amounts of memory, so even forking a JVM for each test would depend on JVM growth characteristics. We keep all Maps in one JVM;
- Different Maps must have exactly the same contents. We use synchronized random number generators seeded by Map size;
- Large Maps need more time to iterate than smaller of course. The heartbeat allows more runs of the smaller Maps so that each test takes a constant time;
- Maps must not share contents so earlier tests do not bring data into the CPU caches that is available for later tests;
- The ‘Just In Time’ compiler must not optimize away the loop. We print the sum of the contents so the loop has a tangible result. If the loop counter were to be used instead, the JIT can recognize the pattern and change the loop to increment by a number greater than one, thus skipping many iterations! Yes this happens! (In AirConcurrentMap at least).
2. Results
These precautions help us come to some tentative but repeatable conclusions. The results show:
- forEach() is much faster than Iterators;
- ConcurrentSkipListMap iterates fastest below about 30K Entries, and AirConcurrentMap is fastest above;
- ConcurrentSkipListMap forEach() is fastest below about 100 Entries, and AirConcurrentMap is fastest above.
3. Code Details
The test code uses a Test
class wrapper to contain a particular Map along with the test progress state such as time, map size, loops, and a running total. This organization hopefully makes the main() more readable.
The Test class implements BiConsumer
so that it can be used in the forEach(this)
invocation. It simply overrides accept(Integer, Integer)
. We also test forEach((k, v) -> total += v)
(the author has tried it and the results are approximately the same). In either case, a local variable cannot be used as the total accumulator, because local variables must be ‘effectively final’ to an inner class or lambda, so we need an instance variable in a containing scope like the one in class Test
anyway. A reduction using a stream will work too, but we are interested in Iterators here. (See https://github.com/boilerbay/airconcurrentmap for relevant stream tests in /jmh).
The heartbeat is a static int
incremented once per second by a separate Thread. The heartbeat must be volatile so that its changes propagate between Threads – otherwise the program never terminates.
The USE_ITERATION
and USE_LAMBDA
flags are static final
, so the javac actually evaluates the code it affects ahead-of-time, removing dead code. This is defined to be the necessary behavior, thus there is no overhead there. You have to recompile to do the different kinds of test of course, but just don’t change them to be non-static or non-final!
The inner loop is not slowed by the dynamic method invocation of test.testIterator()
because methods of length 35 bytes or less are always inlined. Also, there is no polymorphism here – the Test
class has no extensions, so there is no vtable for dispatching, and there are no parameters to push on the stack.
The tests are relatively repeatable and show the overall pattern in the diagrams.
4. Code
public class JavaCodeGeeksIteratorArticle { static volatile int heartBeat = 0; // otherwise use forEach() static final boolean USE_ITERATION = false; static final boolean USE_LAMBDA = true; public static void main(String... args) { System.out.println((USE_ITERATION ? "Iterator" : USE_LAMBDA ? " forEach(lambda)" : "ForEach()") + " performance test"); Test tests[] = { new Test(new HashMap<Integer, Integer>()), new Test(new TreeMap<Integer, Integer>()), new Test(new ConcurrentHashMap<Integer, Integer>()), new Test(new ConcurrentSkipListMap<Integer, Integer>()), new Test(new AirConcurrentMap<Integer, Integer>()) }; int sizes[] = new int[] { 1, 3, 10, 30, 100, 300, 1000, 3000, 10_000, 30_000, 100_000, 300_000, 1000_000, 3_000_000, 10_000_000 }; // Just increment heartBeat every so often. It is volatile. // Reading it is very fast. new Thread(new Runnable() { public void run() { while (true) { heartBeat++; try { Thread.sleep(100); } catch (InterruptedException e) { } } } }).start(); for (int i = 0; i < sizes.length; i++) { for (Test test : tests) test.fillTo(sizes[i]); // warmup for (Test test : tests) { int nextHeartBeat = heartBeat + 20; while (heartBeat < nextHeartBeat) if (USE_ITERATION) test.testIterator(); else if (USE_LAMBDA) test.testForEachLambda(); else test.testForEach(); } for (Test test : tests) { test.time = 0; test.loops = 0; long t0 = System.nanoTime(); int nextHeartBeat = heartBeat + 30; while (heartBeat < nextHeartBeat) { if (USE_ITERATION) test.testIterator(); else if (USE_LAMBDA) test.testForEachLambda(); else test.testForEach(); } long t1 = System.nanoTime(); test.time += (t1 - t0); } for (Test test : tests) test.printResult(); // System.out.println("---------------"); } } } class Test implements BiConsumer<Integer, Integer> { // The total provides a tangible result to prevent optimizing-out long total = 0; long time = 0; int size = 0; long loops = 1; Map<Integer, Integer> map; Test(Map<Integer, Integer> map) { this.map = map; } void fillTo(int newSize) { Random random = new Random(size); while (size < newSize) { Integer n = new Integer(random.nextInt()); map.put(n, n); size++; } } void testIterator() { for (Integer v : map.values()) { total += v.intValue(); } loops++; } // This has the same effect and is 'terser' void testForEachLambda() { map.forEach((k, v) -> total += v); loops++; } void testForEach() { map.forEach(this); loops++; } // Implement BiConsumer for forEach() @Override public void accept(Integer k, Integer v) { total += k.intValue(); } void printResult() { double seconds = time / 1e9; System.out.printf("%22s size=% 9d entries/s(K)=% 11.3f total=%d\n", map.getClass().getSimpleName(), size, size * loops / seconds / 1e3, total); } }
5. Result Data
The data for the charts is below. This can be imported for example into Excel and sorted on the Map class and size to make graphs manually. You need to cut-and-paste the data, then use text-to-columns. Line charts can be used, manually selecting series data one-by-one for each Map class. One can also use ggplot in the ‘R’ statistics language by changing the data output format. Ignore the total: this only provides a tangible output to avoid optimizing out the loop.
JavaCodeGeeksIteratorArticle Iterator performance test HashMap size= 1 entries/s= 31290.754K total=-289049400605539520 TreeMap size= 1 entries/s= 50210.333K total=-331631386373881504 ConcurrentHashMap size= 1 entries/s= 15881.356K total=-91464608232057952 ConcurrentSkipListMap size= 1 entries/s= 42187.535K total=-254234286353018080 AirConcurrentMap size= 1 entries/s= 25577.125K total=-149805405784208032 --------------- HashMap size= 3 entries/s= 62664.626K total=-484691989675689270 TreeMap size= 3 entries/s= 66908.091K total=-550745245141063704 ConcurrentHashMap size= 3 entries/s= 38018.996K total=-211326922860746827 ConcurrentSkipListMap size= 3 entries/s= 71265.063K total=-488278692474832005 AirConcurrentMap size= 3 entries/s= 48540.146K total=-302163579336545082 --------------- HashMap size= 10 entries/s= 86701.181K total=-832795481348512598 TreeMap size= 10 entries/s= 87832.407K total=-916137658370092344 ConcurrentHashMap size= 10 entries/s= 73069.458K total=-502840045890573499 ConcurrentSkipListMap size= 10 entries/s= 96150.046K total=-880874881700377401 AirConcurrentMap size= 10 entries/s= 72001.056K total=-591224549191451578 --------------- HashMap size= 30 entries/s= 89419.363K total=-832238604657166224 TreeMap size= 30 entries/s= 92397.645K total=-915559545928720920 ConcurrentHashMap size= 30 entries/s= 71702.258K total=-502393457157098057 ConcurrentSkipListMap size= 30 entries/s= 103387.524K total=-880213560719307683 AirConcurrentMap size= 30 entries/s= 80807.271K total=-590716865563759348 --------------- HashMap size= 100 entries/s= 90540.307K total=-845663104709828079 TreeMap size= 100 entries/s= 96479.776K total=-930300717715488858 ConcurrentHashMap size= 100 entries/s= 69055.433K total=-512752383626539310 ConcurrentSkipListMap size= 100 entries/s= 111208.365K total=-897628029635465726 AirConcurrentMap size= 100 entries/s= 89071.481K total=-604453907970584431 --------------- HashMap size= 300 entries/s= 94846.852K total=-860347586557330269 TreeMap size= 300 entries/s= 94506.995K total=-944821983122037883 ConcurrentHashMap size= 300 entries/s= 65857.587K total=-522786710141245214 ConcurrentSkipListMap size= 300 entries/s= 112398.344K total=-915694233970137252 AirConcurrentMap size= 300 entries/s= 92168.052K total=-618862268508225735 --------------- HashMap size= 1000 entries/s= 74493.997K total=-852961390806337305 TreeMap size= 1000 entries/s= 80026.348K total=-937067262358689631 ConcurrentHashMap size= 1000 entries/s= 38450.309K total=-519070765727723030 ConcurrentSkipListMap size= 1000 entries/s= 112085.413K total=-904572392174355552 AirConcurrentMap size= 1000 entries/s= 89022.852K total=-609589031935380951 --------------- HashMap size= 3000 entries/s= 57470.417K total=-847037038798366713 TreeMap size= 3000 entries/s= 65963.172K total=-930168324830420495 ConcurrentHashMap size= 3000 entries/s= 41073.089K total=-514814334734654678 ConcurrentSkipListMap size= 3000 entries/s= 109217.866K total=-892841347702876464 AirConcurrentMap size= 3000 entries/s= 89175.845K total=-600361285087522951 --------------- HashMap size= 10000 entries/s= 46254.558K total=-846309210319384299 TreeMap size= 10000 entries/s= 49044.408K total=-929403633977808893 ConcurrentHashMap size= 10000 entries/s= 36385.473K total=-514246034592481772 ConcurrentSkipListMap size= 10000 entries/s= 99442.425K total=-891342788950136070 AirConcurrentMap size= 10000 entries/s= 85447.544K total=-599022098209904335 --------------- HashMap size= 30000 entries/s= 43723.556K total=-848942181585517584 TreeMap size= 30000 entries/s= 45253.915K total=-932143497084889069 ConcurrentHashMap size= 30000 entries/s= 32665.051K total=-516220137420939070 ConcurrentSkipListMap size= 30000 entries/s= 83393.494K total=-896231928805619954 AirConcurrentMap size= 30000 entries/s= 80619.262K total=-603845100973163554 --------------- HashMap size= 100000 entries/s= 40028.088K total=-849706795555554639 TreeMap size= 100000 entries/s= 41755.506K total=-932944794183923404 ConcurrentHashMap size= 100000 entries/s= 26064.027K total=-516724530444651670 ConcurrentSkipListMap size= 100000 entries/s= 46619.667K total=-897138307784594414 AirConcurrentMap size= 100000 entries/s= 75034.058K total=-605290263409285564 --------------- HashMap size= 300000 entries/s= 28271.140K total=-850157323063101369 TreeMap size= 300000 entries/s= 23442.635K total=-933312552546033574 ConcurrentHashMap size= 300000 entries/s= 22886.588K total=-517086645455936620 ConcurrentSkipListMap size= 300000 entries/s= 26852.530K total=-897567202447311134 AirConcurrentMap size= 300000 entries/s= 43406.800K total=-605991920028554584 --------------- HashMap size= 1000000 entries/s= 20762.874K total=-850266118577777400 TreeMap size= 1000000 entries/s= 21465.730K total=-933426629396373490 ConcurrentHashMap size= 1000000 entries/s= 17617.501K total=-517179596963620996 ConcurrentSkipListMap size= 1000000 entries/s= 17753.452K total=-897660153954995510 AirConcurrentMap size= 1000000 entries/s= 24726.115K total=-606121840885886155 --------------- HashMap size= 3000000 entries/s= 20859.307K total=-850290350569160265 TreeMap size= 3000000 entries/s= 17078.422K total=-933446707332090721 ConcurrentHashMap size= 3000000 entries/s= 19987.888K total=-517202444269781983 ConcurrentSkipListMap size= 3000000 entries/s= 23990.479K total=-897687847659433070 AirConcurrentMap size= 3000000 entries/s= 30472.006K total=-606157150359044044 --------------- HashMap size= 10000000 entries/s= 18594.336K total=-850335966429011695 TreeMap size= 10000000 entries/s= 14332.011K total=-933483200019971865 ConcurrentHashMap size= 10000000 entries/s= 17038.665K total=-517248060129633413 ConcurrentSkipListMap size= 10000000 entries/s= 18600.417K total=-897733463519284500 AirConcurrentMap size= 10000000 entries/s= 39037.289K total=-606248382078746904 --------------- ForEach() performance test HashMap size= 1 entries/s= 60469.332K total=-429010055848020608 TreeMap size= 1 entries/s= 162720.446K total=-1192873323002853184 ConcurrentHashMap size= 1 entries/s= 39683.288K total=-238381128098095008 ConcurrentSkipListMap size= 1 entries/s= 125216.579K total=-742139402594604448 AirConcurrentMap size= 1 entries/s= 40199.780K total=-223453462147226592 --------------- HashMap size= 3 entries/s= 154792.076K total=-907351702836558858 TreeMap size= 3 entries/s= 209809.380K total=-1866688472587176884 ConcurrentHashMap size= 3 entries/s= 100788.166K total=-558021636792810008 ConcurrentSkipListMap size= 3 entries/s= 202179.445K total=-1380005440196857173 AirConcurrentMap size= 3 entries/s= 99654.211K total=-536118642462089017 --------------- HashMap size= 10 entries/s= 297297.392K total=-2080956150412338142 TreeMap size= 10 entries/s= 228262.234K total=-2782965758065995472 ConcurrentHashMap size= 10 entries/s= 189641.651K total=-1313404093180619596 ConcurrentSkipListMap size= 10 entries/s= 304427.266K total=-2602702896415550485 AirConcurrentMap size= 10 entries/s= 179131.091K total=-1257952993772621945 --------------- HashMap size= 30 entries/s= 305634.315K total=-2079066757218703314 TreeMap size= 30 entries/s= 224584.862K total=-2781566150975000263 ConcurrentHashMap size= 30 entries/s= 174917.912K total=-1312321443993669200 ConcurrentSkipListMap size= 30 entries/s= 354581.676K total=-2600502266614288915 AirConcurrentMap size= 30 entries/s= 254150.967K total=-1256373143380530839 --------------- HashMap size= 100 entries/s= 295763.376K total=-2123996537948400465 TreeMap size= 100 entries/s= 225375.420K total=-2816005249627104526 ConcurrentHashMap size= 100 entries/s= 168596.386K total=-1337959745143386554 ConcurrentSkipListMap size= 100 entries/s= 366342.358K total=-2656351051389612808 AirConcurrentMap size= 100 entries/s= 336305.468K total=-1307783105877232718 --------------- HashMap size= 300 entries/s= 335940.642K total=-2176036047793281607 TreeMap size= 300 entries/s= 213798.860K total=-2849343024468137449 ConcurrentHashMap size= 300 entries/s= 196158.746K total=-1368245793652746886 ConcurrentSkipListMap size= 300 entries/s= 348762.198K total=-2711376732587358984 AirConcurrentMap size= 300 entries/s= 435378.640K total=-1375390632080685627 --------------- HashMap size= 1000 entries/s= 312285.030K total=-2145888100702813327 TreeMap size= 1000 entries/s= 157007.240K total=-2834013084542617045 ConcurrentHashMap size= 1000 entries/s= 178919.552K total=-1350929801563960438 ConcurrentSkipListMap size= 1000 entries/s= 253518.140K total=-2687136797657993712 AirConcurrentMap size= 1000 entries/s= 449868.858K total=-1331959489090096491 --------------- HashMap size= 3000 entries/s= 269579.050K total=-2118053337323592431 TreeMap size= 3000 entries/s= 129271.886K total=-2820412724828116661 ConcurrentHashMap size= 3000 entries/s= 95870.060K total=-1340856888537750166 ConcurrentSkipListMap size= 3000 entries/s= 203849.473K total=-2666107139705649888 AirConcurrentMap size= 3000 entries/s= 443682.852K total=-1286068695711899563 --------------- HashMap size= 10000 entries/s= 100969.734K total=-2116523986910706773 TreeMap size= 10000 entries/s= 81240.085K total=-2819237854014952091 ConcurrentHashMap size= 10000 entries/s= 74229.656K total=-1339726640597926192 ConcurrentSkipListMap size= 10000 entries/s= 147733.052K total=-2664068913299591178 AirConcurrentMap size= 10000 entries/s= 399269.901K total=-1279844336850624703 --------------- HashMap size= 30000 entries/s= 77830.586K total=-2121068856388826590 TreeMap size= 30000 entries/s= 73801.711K total=-2823183557187296024 ConcurrentHashMap size= 30000 entries/s= 63503.394K total=-1343522194696030345 ConcurrentSkipListMap size= 30000 entries/s= 148907.153K total=-2671403338078408622 AirConcurrentMap size= 30000 entries/s= 450044.679K total=-1305454406448994036 --------------- HashMap size= 100000 entries/s= 57919.591K total=-2122150626578319295 TreeMap size= 100000 entries/s= 39468.243K total=-2823932886520250879 ConcurrentHashMap size= 100000 entries/s= 30273.873K total=-1344103393021081000 ConcurrentSkipListMap size= 100000 entries/s= 48766.187K total=-2672332261897079327 AirConcurrentMap size= 100000 entries/s= 307460.503K total=-1311189966514089586 --------------- HashMap size= 300000 entries/s= 42127.664K total=-2122825947560403955 TreeMap size= 300000 entries/s= 26508.090K total=-2824353316156729769 ConcurrentHashMap size= 300000 entries/s= 27206.845K total=-1344518179306734670 ConcurrentSkipListMap size= 300000 entries/s= 19172.093K total=-2672628537815403377 AirConcurrentMap size= 300000 entries/s= 152485.548K total=-1313414387297697136 --------------- HashMap size= 1000000 entries/s= 40607.148K total=-2123037200986959355 TreeMap size= 1000000 entries/s= 25159.911K total=-2824487462082592448 ConcurrentHashMap size= 1000000 entries/s= 30257.523K total=-1344677675643783997 ConcurrentSkipListMap size= 1000000 entries/s= 37742.151K total=-2672825003502099899 AirConcurrentMap size= 1000000 entries/s= 148954.277K total=-1314169618297632691 --------------- HashMap size= 3000000 entries/s= 43941.038K total=-2123087741997557902 TreeMap size= 3000000 entries/s= 18891.646K total=-2824508924703531557 ConcurrentHashMap size= 3000000 entries/s= 32007.894K total=-1344715062144774703 ConcurrentSkipListMap size= 3000000 entries/s= 21722.543K total=-2672849927836093703 AirConcurrentMap size= 3000000 entries/s= 126084.363K total=-1314312240875486125 --------------- HashMap size= 10000000 entries/s= 35724.715K total=-2123169850545290476 TreeMap size= 10000000 entries/s= 12693.940K total=-2824540855805427558 ConcurrentHashMap size= 10000000 entries/s= 27254.306K total=-1344783485934551848 ConcurrentSkipListMap size= 10000000 entries/s= 13572.712K total=-2672886420523974847 AirConcurrentMap size= 10000000 entries/s= 92726.047K total=-1314522073830802703 ---------------
6. Summary
It is possible to get repeatable, meaningful custom Map performance tests using simple code with certain precautions. This code shows some of those issues. The code is easily adaptable to other kinds of Map tests and data output formats.
Hello sir,
this is really very nice post about map iterator and you explained very well in detail.
keep posting for us.
thank you so much
Thanks very much! Your comment floats my boat.
I do an awful lot of Map performance testing, nowadays using the Java Microbenchmarking Harness, which is excellent. I’m publishing various articles on Map performance right now covering serial and parallel streams. I think the graphical approach over Map sizes is vital, and basically nobody else uses it, so you can see why their results are all different. It’s way more work, though. Everyone has an opinion and makes wild guesses.
I’ll keep going!