Clear details on Java collection ‘Clear()’ API
Many of us are familiar with the clear() API in the Java collections framework. In this post, let’s discuss the purpose of the clear() API, the performance impact of using it, and what happens inside the JVM when it is invoked.
What does clear() API do?
The clear() API is declared in the Java Collection interface. It’s implemented by the concrete classes that implement the Collection interface, such as ArrayList, TreeSet, and Stack. When this method is invoked, it removes all the elements that are present in the data structure.
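As a quick illustration of that behavior, here is a minimal sketch that fills three different collection types, clears them, and reports the resulting size. The class name ‘ClearBasicsSketch’ and the helper ‘sizeAfterClear’ are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Stack;
import java.util.TreeSet;

// Hypothetical demo class, not part of the original programs in this post.
class ClearBasicsSketch {

    // Adds three values to the given collection, clears it, and returns its size.
    static int sizeAfterClear(Collection<Integer> collection) {
        collection.addAll(Arrays.asList(1, 2, 3));
        collection.clear();
        return collection.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterClear(new ArrayList<>())); // prints 0
        System.out.println(sizeAfterClear(new TreeSet<>()));   // prints 0
        System.out.println(sizeAfterClear(new Stack<>()));     // prints 0
    }
}
```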
How does ArrayList’s clear() method work in Java?
In this post, let’s focus on the ArrayList’s implementation of the clear() method. The implementations in other data structures are quite similar. Internally, ‘ArrayList’ has an Object array, i.e., ‘Object[]’, as a member variable. When you add records to the ‘ArrayList’, they are added to this ‘Object[]’. When you invoke the ‘clear()’ API on the ‘ArrayList’, all the objects (i.e., the contents) of this ‘Object[]’ are removed. Let’s say we created an ‘ArrayList’ and added the integers from 0 up to 1,000,000 (1 million) to it. When the ‘clear()’ method is invoked on it, all the 1 million integers are removed from the underlying ‘Object[]’. However, the now-empty ‘Object[]’, still sized to hold 1 million entries, continues to remain, consuming memory unnecessarily.
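To make the mechanism concrete, here is a deliberately simplified stand-in for ‘ArrayList’. It is a sketch under assumptions, not the JDK class: the name ‘BackingArraySketch’, the initial capacity of 4, and the ‘capacity()’ accessor are all invented for illustration. It shows that clearing drops the element references while the array keeps its length.

```java
import java.util.Arrays;

// Hypothetical, simplified model of an array-backed list; NOT the JDK ArrayList.
class BackingArraySketch {
    private Object[] elementData = new Object[4]; // initial capacity chosen arbitrarily
    private int size = 0;

    void add(Object element) {
        if (size == elementData.length) {
            // Grow by roughly 50% when the array is full (illustrative growth rule).
            elementData = Arrays.copyOf(elementData, size + (size >> 1) + 1);
        }
        elementData[size++] = element;
    }

    void clear() {
        // Drop the references to the elements, but keep the array itself.
        Arrays.fill(elementData, 0, size, null);
        size = 0;
    }

    int size() {
        return size;
    }

    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        BackingArraySketch list = new BackingArraySketch();
        for (int i = 0; i < 1_000; i++) {
            list.add(Long.valueOf(i));
        }
        int capacityBefore = list.capacity();
        list.clear();
        // The list is empty, yet the backing array is as long as before.
        System.out.println("size=" + list.size()
                + ", capacity unchanged=" + (list.capacity() == capacityBefore));
    }
}
```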
Creating ArrayList example
It’s always easier to learn with an example. Let’s explore the ‘clear()’ API functionality with this simple example:
01: public class ClearNoDemo {
02:
03:     private static ArrayList<Long> myList = new ArrayList<>();
04:
05:     public static void main(String[] args) throws Exception {
06:
07:         for (int counter = 0; counter < 1_000_000; ++counter) {
08:
09:             myList.add(Long.valueOf(counter));
10:         }
11:
12:         System.out.println("All records added!");
13:
14:         Thread.sleep(100000); // sleep for 100 seconds
15:     }
16: }
Here are the operations we are performing in this ‘ClearNoDemo’ class (the ‘java.util.ArrayList’ import is omitted for brevity):
a. In line #03, we are creating a ‘myList’ object whose type is ‘ArrayList’.
b. In lines #07 – #10, we are adding 1 million ‘Long’ wrapper objects (0 to 999,999) to this ‘myList’.
c. In line #14, we are putting the thread to sleep for 100 seconds (100,000 milliseconds), to give us time to capture the heap dump for our discussion.
We ran this program and captured a heap dump using the open source yCrash script while the program was sleeping in line #14. We captured the heap dump so that we can study how objects are stored in memory. A heap dump is basically a binary file that contains information such as which objects reside in memory, what their sizes are, who is referencing them, and what values they hold. Since a heap dump is a binary file in an unreadable format, we analyzed it using the heap dump analysis tool HeapHero. The report generated by the tool can be found here. Below is the Dominator Tree section from the report, which displays the largest objects in the application:
Fig: ‘ArrayList’ without invoking ‘clear()’ API (heap report by HeapHero)
You can notice that our ‘myList’ object is reported as the largest object, because we created 1 million ‘Long’ objects and stored them in it. You can also notice that the ‘myList’ object has a child object ‘elementData’ whose type is ‘Object[]’. This is the actual ‘Object[]’ in which the 1 million records are stored, and it accounts for 27.5 MB of memory. This analysis confirms that the objects we add are stored in the internal ‘Object[]’.
List#clear() API example
Next, we created a slightly modified version of the above program, in which we invoke the ‘clear()’ API on the ‘ArrayList’.
01: public class ClearDemo {
02:
03:     private static ArrayList<Long> myList = new ArrayList<>();
04:
05:     public static void main(String[] args) throws Exception {
06:
07:         for (int counter = 0; counter < 1_000_000; ++counter) {
08:
09:             myList.add(Long.valueOf(counter));
10:         }
11:
12:         long startTime = System.currentTimeMillis();
13:         myList.clear();
14:         System.out.println("Execution Time: " + (System.currentTimeMillis() - startTime));
15:
16:         Thread.sleep(100000); // sleep for 100 seconds
17:     }
18: }
Here are the operations we are performing in this ‘ClearDemo’ class:
a. In line #03, we are creating a ‘myList’ object whose type is ‘ArrayList’.
b. In lines #07 – #10, we are adding 1 million ‘Long’ wrapper objects (0 to 999,999) to this ‘myList’.
c. In line #13, we are removing the objects from ‘myList’ using the ‘clear()’ API.
d. In line #16, we are putting the thread to sleep for 100 seconds (100,000 milliseconds), to give us time to capture the heap dump for our discussion.
When you invoke the ‘clear()’ API, all the 1 million ‘Long’ objects that were stored in the ‘Object[]’ are dereferenced, so they become eligible for garbage collection and are removed from memory. However, the ‘Object[]’ itself continues to remain in memory. To confirm this theory, we ran the above program and captured a heap dump using the open source yCrash script while the program was sleeping in line #16. We analyzed the heap dump using the heap dump analysis tool HeapHero. The report generated by the tool can be found here. Below is the Dominator Tree section from the report, which displays the largest objects in the application:
Fig: ‘ArrayList’ after invoking ‘clear()’ API (heap report by HeapHero)
You can notice that our ‘myList’ object is still reported as the largest object, and that it has a child object ‘elementData’ whose type is ‘Object[]’. However, this ‘Object[]’ now has 0 entries (i.e., no elements in it), even though its array size is still 1 million+. Because this empty array of 1 million+ slots is still present, it occupies 4.64 MB of memory. This analysis confirms that even though the objects are removed by invoking the ‘clear()’ API, the underlying ‘Object[]’ with its 1 million+ size continues to exist, consuming memory unnecessarily.
Note: Refer to the ‘Memory Impact’ section below to learn what kind of performance impact your application will experience when invoking ‘clear()’ API.
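For readers curious why the empty array has “1 million+” slots rather than exactly 1 million, here is a rough back-of-the-envelope sketch. It assumes that a list created with ‘new ArrayList<>()’ starts at a capacity of 10 on the first add and grows by 50% whenever it is full, and that each slot is a 4-byte compressed reference on a 64-bit HotSpot JVM. These are assumptions about the JVM used, not something stated in the heap report, and the class name ‘CapacityEstimateSketch’ is invented for the example.

```java
// Hypothetical estimate of an ArrayList's backing-array capacity; a sketch, not a measurement.
class CapacityEstimateSketch {

    // Capacity of the backing array after adding `elements` (>= 1) items one at a time
    // to a list created with the no-argument constructor.
    static int capacityAfterAdds(int elements) {
        int capacity = 10; // assumed default capacity allocated on the first add
        while (capacity < elements) {
            capacity += capacity >> 1; // assumed growth rule: old capacity + 50%
        }
        return capacity;
    }

    public static void main(String[] args) {
        int capacity = capacityAfterAdds(1_000_000);
        long referenceBytes = capacity * 4L; // assumes 4-byte compressed references
        // prints "1215487 slots, about 4861948 bytes of references"
        System.out.println(capacity + " slots, about " + referenceBytes + " bytes of references");
    }
}
```

Under those assumptions the array would hold 1,215,487 slots, or roughly 4.64 MB when the byte count is divided by 1024 × 1024, which would be consistent with the figure shown in the report.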
Assigning List to null example
To make our study more interesting, we created another slightly modified version of the above program, in which we assign ‘null’ to ‘myList’ instead of invoking the ‘clear()’ API to remove the objects from the ‘ArrayList’.
01: public class ClearNullDemo {
02:
03:     private static ArrayList<Long> myList = new ArrayList<>();
04:
05:     public static void main(String[] args) throws Exception {
06:
07:         for (int counter = 0; counter < 1_000_000; ++counter) {
08:
09:             myList.add(Long.valueOf(counter));
10:         }
11:
12:         long startTime = System.currentTimeMillis();
13:         myList = null;
14:         System.out.println("Execution Time: " + (System.currentTimeMillis() - startTime));
15:
16:         Thread.sleep(100000); // sleep for 100 seconds
17:     }
18: }
Here are the operations we are performing in this ‘ClearNullDemo’ class:
a. In line #03, we are creating a ‘myList’ object whose type is ‘ArrayList’.
b. In lines #07 – #10, we are adding 1 million ‘Long’ wrapper objects (0 to 999,999) to this ‘myList’.
c. In line #13, we are assigning ‘null’ to the list instead of using the ‘clear()’ API.
d. In line #16, we are putting the thread to sleep for 100 seconds (100,000 milliseconds), to give us time to capture the heap dump for our discussion.
When you assign ‘null’ to ‘myList’, the ‘ArrayList’ and its underlying ‘Object[]’ become eligible for garbage collection. Once the garbage collector has run, they no longer exist in memory. To confirm this theory, we ran the above program and captured a heap dump using the open source yCrash script while the program was sleeping in line #16.
We analyzed the heap dump using the heap dump analysis tool HeapHero. The report generated by the tool can be found here. In the Dominator Tree section of the report, which displays the largest objects in the application, you can notice that our ‘myList’ object is not even present in the list (as it got garbage collected from memory). This is in total contrast to the earlier two example programs.
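If you would like to observe this without a heap dump, the following sketch tracks the list through a ‘WeakReference’ and checks whether it has been collected after the variable is set to ‘null’. The class and method names are invented for this example, and the result is not guaranteed: ‘System.gc()’ is only a request, so the JVM is free to collect the list later or not at all during the run.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the outcome depends on the JVM and its GC settings.
class NullForGcSketch {

    // Returns true if the list was observed as collected after its only
    // strong reference was dropped; false if the JVM has not collected it yet.
    static boolean collectedAfterNull() {
        List<Long> myList = new ArrayList<>();
        for (long value = 0; value < 100_000; value++) {
            myList.add(value);
        }
        // A weak reference does not keep the list alive on its own.
        WeakReference<List<Long>> tracker = new WeakReference<>(myList);

        myList = null; // drop the only strong reference to the list
        System.gc();   // a request only; the JVM may ignore it

        return tracker.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("Collected after null assignment: " + collectedAfterNull());
    }
}
```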
Memory Impact
Fig: Memory occupied by ArrayList
The above chart shows the memory occupied by the ‘ArrayList’.
a. When the ‘ArrayList’ was populated with 1 million ‘Long’ records, it occupied 27.5 MB.
b. After the ‘clear()’ API was invoked, it continued to occupy 4.64 MB, because the underlying empty ‘Object[]’ remains in memory.
c. On the other hand, when ‘null’ was assigned to it, the ‘ArrayList’ got garbage collected and didn’t occupy any memory.
Thus, from the memory perspective, it’s a prudent decision to assign ‘null’ to the ‘ArrayList’ variable instead of invoking the ‘clear()’ API.
Processing Time Impact
01: public void clear() {
02:     modCount++;
03:     final Object[] es = elementData;
04:     for (int to = size, i = size = 0; i < to; i++)
05:         es[i] = null;
06: }
Fig: Java source code of ArrayList#clear()
Above is the source code of the ‘clear()’ method from the JDK. From the source code (i.e., lines #4 and #5), you can notice that this method loops through all the elements in the underlying ‘Object[]’ and sets each of them to ‘null’. The work grows with the number of elements, so it becomes time consuming on a collection that has a lot of elements, like our example of 1 million elements. In such circumstances, assigning ‘null’ to the ‘ArrayList’ variable would be more performant, since it is a single reference assignment.
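To get a rough feel for how the cost of ‘clear()’ grows with the number of elements, you could time it with a sketch like the one below. This is not a rigorous benchmark (there is no JIT warm-up and only a single run per size), the class and method names are invented for the example, and the numbers will vary from machine to machine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical timing sketch; treat the printed numbers as indicative only.
class ClearTimingSketch {

    // Builds a list holding the values 0 .. elements-1.
    static List<Long> filledList(int elements) {
        List<Long> list = new ArrayList<>(elements);
        for (long value = 0; value < elements; value++) {
            list.add(value);
        }
        return list;
    }

    // Nanoseconds spent inside clear() for a list with the given number of elements.
    static long nanosToClear(int elements) {
        List<Long> list = filledList(elements);
        long start = System.nanoTime();
        list.clear();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (int elements : new int[] {1_000, 100_000, 1_000_000}) {
            System.out.println(elements + " elements: " + nanosToClear(elements) + " ns");
        }
    }
}
```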
When to use Collection#clear() API?
This raises the question of whether we should ever invoke the ‘clear()’ API, given its memory and processing impact. Although I would vote for avoiding it, there are scenarios in which the clear() API has its place:
a. Passing the reference around: If the same Collection object is shared with other parts of the code (for example, through a common field), then assigning ‘null’ to it will result in the famous ‘NullPointerException’ 😊 when that code tries to use it. To avoid that exception, you may use the ‘clear()’ API.
b. Collection size is small: If you are creating only a few collection instances and their size is very small (say, only 10 or 20 elements), then invoking the ‘clear()’ API or assigning ‘null’ might not make much difference.
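The shared-reference scenario can be sketched as follows. The ‘ListHolder’ class is a made-up stand-in for “other parts of the code” that keep their own reference to the list. The sketch shows that ‘clear()’ is visible through every reference to the list, whereas assigning ‘null’ changes only the one variable. It also calls ‘trimToSize()’, a standard ‘ArrayList’ method that trims the capacity to the current size, as one possible way to release the spare capacity when the list has to stay alive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class illustrating a list shared between two holders.
class SharedReferenceSketch {

    // Made-up consumer that keeps its own reference to the list it is given.
    static final class ListHolder {
        private final List<Long> items;

        ListHolder(List<Long> items) {
            this.items = items;
        }

        int visibleItems() {
            return items.size();
        }
    }

    public static void main(String[] args) {
        ArrayList<Long> myList = new ArrayList<>();
        myList.add(1L);
        myList.add(2L);
        ListHolder holder = new ListHolder(myList);

        myList.clear();      // empties the single list object both references point to
        myList.trimToSize(); // optional: trims the backing array to the current size
        System.out.println(holder.visibleItems()); // prints 0

        myList.add(3L);
        myList = null;       // only this variable changes; the holder still has the list
        System.out.println(holder.visibleItems()); // prints 1
    }
}
```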
Conclusion
I hope this post helped you learn about the clear() API and its performance impact in detail.
Published on Java Code Geeks with permission by Ram Lakshmanan, partner at our JCG program. See the original article here: Clear details on Java collection ‘Clear()’ API. Opinions expressed by Java Code Geeks contributors are their own.