Java 8 Streams API: Grouping and Partitioning a Stream
This post shows how you can use the Collectors available in the Streams API to group elements of a stream with groupingBy and partition elements of a stream with partitioningBy.
Consider a stream of Employee objects, each with a name, city and number of sales, as shown in the table below:

+----------+------------+-----------------+
| Name     | City       | Number of Sales |
+----------+------------+-----------------+
| Alice    | London     | 200             |
| Bob      | London     | 150             |
| Charles  | New York   | 160             |
| Dorothy  | Hong Kong  | 190             |
+----------+------------+-----------------+
Grouping
Let’s start by grouping employees by city using imperative style (pre-lambda) Java:

    Map<String, List<Employee>> result = new HashMap<>();
    for (Employee e : employees) {
        String city = e.getCity();
        List<Employee> empsInCity = result.get(city);
        if (empsInCity == null) {
            empsInCity = new ArrayList<>();
            result.put(city, empsInCity);
        }
        empsInCity.add(e);
    }
You’re probably familiar with writing code like this, and as you can see, it’s a lot of code for such a simple task!
In Java 8, you can do the same thing with a single statement using a groupingBy collector, like this:

    // Assumes: import static java.util.stream.Collectors.groupingBy;
    Map<String, List<Employee>> employeesByCity =
        employees.stream().collect(groupingBy(Employee::getCity));
This results in the following map:
    {New York=[Charles], Hong Kong=[Dorothy], London=[Alice, Bob]}
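If you want to run the grouping yourself, here is a minimal sketch. The Employee class, the sampleEmployees helper and the class name GroupByCityDemo are assumptions made for illustration; the post itself only shows the getCity and getNumSales getters and prints employees by name.

```java
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

class GroupByCityDemo {

    // Hypothetical Employee class; only the getters are implied by the post.
    static class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getName() { return name; }
        String getCity() { return city; }
        int getNumSales() { return numSales; }

        // Print employees by name, as in the maps shown in the post.
        @Override
        public String toString() { return name; }
    }

    // The sample data from the table at the top of the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Alice", "London", 200),
                new Employee("Bob", "London", 150),
                new Employee("Charles", "New York", 160),
                new Employee("Dorothy", "Hong Kong", 190));
    }

    // Group employees by city; each value is the list of employees in that city.
    static Map<String, List<Employee>> byCity(List<Employee> employees) {
        return employees.stream().collect(groupingBy(Employee::getCity));
    }

    public static void main(String[] args) {
        // The iteration order of the returned map is not guaranteed.
        System.out.println(byCity(sampleEmployees()));
    }
}
```

Within each group, a sequential stream keeps the employees in the order in which they appear in the source list.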
It’s also possible to count the number of employees in each city, by passing a counting collector to the groupingBy collector. The second collector performs a further reduction operation on all the elements in the stream classified into the same group.

    Map<String, Long> numEmployeesByCity =
        employees.stream().collect(groupingBy(Employee::getCity, counting()));
The result is the following map:
    {New York=1, Hong Kong=1, London=2}
Just as an aside, this is equivalent to the following SQL statement:
    select city, count(*) from Employee group by city
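The counting example can be tried out with a sketch along these lines. As before, the Employee class, the sampleEmployees helper and the class name CountByCityDemo are hypothetical and exist only to make the snippet self-contained.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

class CountByCityDemo {

    // Hypothetical Employee class, reduced to what this example needs.
    static class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getCity() { return city; }
        int getNumSales() { return numSales; }
    }

    // The sample data from the table at the top of the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Alice", "London", 200),
                new Employee("Bob", "London", 150),
                new Employee("Charles", "New York", 160),
                new Employee("Dorothy", "Hong Kong", 190));
    }

    // Classify by city, then reduce each group to its size with counting().
    static Map<String, Long> countByCity(List<Employee> employees) {
        return employees.stream().collect(groupingBy(Employee::getCity, counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByCity(sampleEmployees()));
    }
}
```

Note that counting() produces Long values, which is why the map is declared as Map<String, Long> rather than Map<String, Integer>.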
Another example is calculating the average number of sales in each city, which can be done using the averagingInt collector in conjunction with the groupingBy collector:

    Map<String, Double> avgSalesByCity =
        employees.stream().collect(groupingBy(Employee::getCity,
                                   averagingInt(Employee::getNumSales)));
The result is the following map:
    {New York=160.0, Hong Kong=190.0, London=175.0}
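Here is a runnable sketch of the averaging example, again built on a hypothetical Employee class and sampleEmployees helper (neither is spelled out in the post). For London it averages Alice's 200 and Bob's 150 sales.

```java
import static java.util.stream.Collectors.averagingInt;
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

class AverageSalesByCityDemo {

    // Hypothetical Employee class, reduced to what this example needs.
    static class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getCity() { return city; }
        int getNumSales() { return numSales; }
    }

    // The sample data from the table at the top of the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Alice", "London", 200),
                new Employee("Bob", "London", 150),
                new Employee("Charles", "New York", 160),
                new Employee("Dorothy", "Hong Kong", 190));
    }

    // Classify by city, then average the int-valued sales within each group.
    static Map<String, Double> avgSalesByCity(List<Employee> employees) {
        return employees.stream()
                .collect(groupingBy(Employee::getCity, averagingInt(Employee::getNumSales)));
    }

    public static void main(String[] args) {
        System.out.println(avgSalesByCity(sampleEmployees()));
    }
}
```

Although the sales figures are ints, averagingInt always yields a Double, so the average is not truncated.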
Partitioning
Partitioning is a special kind of grouping, in which the resultant map has just two keys, true and false, with one group for each (either group may be empty). For instance, if you want to find out who your best employees are, you can partition them into those who made more than N sales and those who didn’t, using the partitioningBy collector:

    Map<Boolean, List<Employee>> partitioned =
        employees.stream().collect(partitioningBy(e -> e.getNumSales() > 150));
This will produce the following result:
    {false=[Bob], true=[Alice, Charles, Dorothy]}
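A self-contained sketch of the partitioning example follows. The Employee class, the sampleEmployees helper and the threshold parameter (standing in for N) are assumptions added for illustration.

```java
import static java.util.stream.Collectors.partitioningBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

class PartitionBySalesDemo {

    // Hypothetical Employee class; toString prints the name, as in the post's output.
    static class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getCity() { return city; }
        int getNumSales() { return numSales; }

        @Override
        public String toString() { return name; }
    }

    // The sample data from the table at the top of the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Alice", "London", 200),
                new Employee("Bob", "London", 150),
                new Employee("Charles", "New York", 160),
                new Employee("Dorothy", "Hong Kong", 190));
    }

    // true  -> employees with strictly more than `threshold` sales
    // false -> everyone else
    static Map<Boolean, List<Employee>> partitionBySales(List<Employee> employees, int threshold) {
        return employees.stream()
                .collect(partitioningBy(e -> e.getNumSales() > threshold));
    }

    public static void main(String[] args) {
        System.out.println(partitionBySales(sampleEmployees(), 150));
    }
}
```

Because the comparison is strict, Bob (exactly 150 sales) lands in the false group. If nobody passes the test, for example with a threshold of 1000, the true key still maps to an empty list rather than null.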
You can also combine partitioning and grouping by passing a groupingBy collector to the partitioningBy collector. For example, you could count the number of employees in each city within each partition:

    Map<Boolean, Map<String, Long>> result =
        employees.stream().collect(partitioningBy(e -> e.getNumSales() > 150,
                                   groupingBy(Employee::getCity, counting())));
This will produce a two-level Map:
    {false={London=1}, true={New York=1, Hong Kong=1, London=1}}
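To experiment with the combined collector, you could use a sketch like the one below. The Employee class, the sampleEmployees helper and the threshold parameter are hypothetical; only the collector chain comes from the post.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.partitioningBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

class PartitionThenGroupDemo {

    // Hypothetical Employee class, reduced to what this example needs.
    static class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getCity() { return city; }
        int getNumSales() { return numSales; }
    }

    // The sample data from the table at the top of the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Alice", "London", 200),
                new Employee("Bob", "London", 150),
                new Employee("Charles", "New York", 160),
                new Employee("Dorothy", "Hong Kong", 190));
    }

    // Outer level: partition on sales > threshold.
    // Inner level: within each partition, count employees per city.
    static Map<Boolean, Map<String, Long>> countByCityWithinPartition(
            List<Employee> employees, int threshold) {
        return employees.stream()
                .collect(partitioningBy(e -> e.getNumSales() > threshold,
                        groupingBy(Employee::getCity, counting())));
    }

    public static void main(String[] args) {
        System.out.println(countByCityWithinPartition(sampleEmployees(), 150));
    }
}
```

London appears in both partitions because Alice (200 sales) and Bob (150 sales) fall on opposite sides of the threshold.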
Reference: Java 8 Streams API: Grouping and Partitioning a Stream from our JCG partner Fahd Shariff at the fahd.blog blog.