Java 8 Streams API: Grouping and Partitioning a Stream

Fahd Shariff | November 30th, 2015 | Last Updated: November 30th, 2015


This post shows how you can use the Collectors available in the Streams API to group elements of a stream with groupingBy and partition elements of a stream with partitioningBy.

Consider a stream of Employee objects, each with a name, city and number of sales, as shown in the table below:

+----------+------------+-----------------+
| Name     | City       | Number of Sales |
+----------+------------+-----------------+
| Alice    | London     | 200             |
| Bob      | London     | 150             |
| Charles  | New York   | 160             |
| Dorothy  | Hong Kong  | 190             |
+----------+------------+-----------------+

Grouping

Let’s start by grouping employees by city using imperative-style (pre-lambda) Java:

Map<String, List<Employee>> result = new HashMap<>();
for (Employee e : employees) {
  String city = e.getCity();
  List<Employee> empsInCity = result.get(city);
  if (empsInCity == null) {
    empsInCity = new ArrayList<>();
    result.put(city, empsInCity);
  }
  empsInCity.add(e);
}

You’re probably familiar with writing code like this, and as you can see, it’s a lot of code for such a simple task!

In Java 8, you can do the same thing with a single statement using a groupingBy collector (the snippets in this post assume static imports of the java.util.stream.Collectors methods), like this:

Map<String, List<Employee>> employeesByCity =
  employees.stream().collect(groupingBy(Employee::getCity));

This results in the following map:

{New York=[Charles], Hong Kong=[Dorothy], London=[Alice, Bob]}
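For readers who want to run the grouping themselves, here is a minimal, self-contained sketch. The Employee class is an assumption, since the post does not show its definition; the getter names match the method references used above, and toString returns the name so that the printed map resembles the one shown. The helper names sampleEmployees and byCity are hypothetical.

```java
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class GroupByCityExample {

  // Hypothetical Employee class; the post does not show its definition.
  static class Employee {
    private final String name;
    private final String city;
    private final int numSales;

    Employee(String name, String city, int numSales) {
      this.name = name;
      this.city = city;
      this.numSales = numSales;
    }

    String getName() { return name; }
    String getCity() { return city; }
    int getNumSales() { return numSales; }

    @Override
    public String toString() { return name; }
  }

  // The four employees from the table at the top of the post.
  static List<Employee> sampleEmployees() {
    return Arrays.asList(
        new Employee("Alice", "London", 200),
        new Employee("Bob", "London", 150),
        new Employee("Charles", "New York", 160),
        new Employee("Dorothy", "Hong Kong", 190));
  }

  // Groups employees by city; each value keeps the encounter order of the stream.
  static Map<String, List<Employee>> byCity(List<Employee> employees) {
    return employees.stream().collect(groupingBy(Employee::getCity));
  }

  public static void main(String[] args) {
    // The order of the keys in the printed map is not guaranteed.
    System.out.println(byCity(sampleEmployees()));
  }
}
```

Note that groupingBy makes no promise about the type or key order of the returned map, so the cities may print in a different order than shown here.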

It’s also possible to count the number of employees in each city, by passing a counting collector to the groupingBy collector. The second collector performs a further reduction operation on all the elements in the stream classified into the same group.

Map<String, Long> numEmployeesByCity =
  employees.stream().collect(groupingBy(Employee::getCity, counting()));

The result is the following map:

{New York=1, Hong Kong=1, London=2}

Just as an aside, this is equivalent to the following SQL statement:

select city, count(*) from Employee group by city
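The counting example can be tried out with a sketch along these lines. The pared-down Employee class and the helper names sampleEmployees and countByCity are assumptions made for illustration; only the city is needed for the count.

```java
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class CountByCityExample {

  // Hypothetical minimal Employee; only the city matters for counting.
  static class Employee {
    private final String name;
    private final String city;

    Employee(String name, String city) {
      this.name = name;
      this.city = city;
    }

    String getCity() { return city; }
  }

  // Names and cities taken from the table at the top of the post.
  static List<Employee> sampleEmployees() {
    return Arrays.asList(
        new Employee("Alice", "London"),
        new Employee("Bob", "London"),
        new Employee("Charles", "New York"),
        new Employee("Dorothy", "Hong Kong"));
  }

  // Classifies by city, then reduces each group to its size with counting().
  static Map<String, Long> countByCity(List<Employee> employees) {
    return employees.stream().collect(groupingBy(Employee::getCity, counting()));
  }

  public static void main(String[] args) {
    // Prints one "city: count" line per group, in no guaranteed order.
    countByCity(sampleEmployees())
        .forEach((city, count) -> System.out.println(city + ": " + count));
  }
}
```

The counting collector produces Long values, which is why the result type is Map<String, Long> rather than Map<String, Integer>.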

Another example is calculating the average number of sales in each city, which can be done using the averagingInt collector in conjunction with the groupingBy collector:

Map<String, Double> avgSalesByCity =
  employees.stream().collect(groupingBy(Employee::getCity,
                               averagingInt(Employee::getNumSales)));

The result is the following map:

{New York=160.0, Hong Kong=190.0, London=175.0}
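As a runnable sketch of the averaging example, under the same assumptions as before (a hypothetical Employee class with getCity and getNumSales, and illustrative helper names):

```java
import static java.util.stream.Collectors.averagingInt;
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class AverageSalesByCityExample {

  // Hypothetical minimal Employee; the post does not show its definition.
  static class Employee {
    private final String city;
    private final int numSales;

    Employee(String city, int numSales) {
      this.city = city;
      this.numSales = numSales;
    }

    String getCity() { return city; }
    int getNumSales() { return numSales; }
  }

  // Cities and sales figures taken from the table at the top of the post.
  static List<Employee> sampleEmployees() {
    return Arrays.asList(
        new Employee("London", 200),
        new Employee("London", 150),
        new Employee("New York", 160),
        new Employee("Hong Kong", 190));
  }

  // Classifies by city, then averages the int sales figure within each group.
  static Map<String, Double> avgSalesByCity(List<Employee> employees) {
    return employees.stream()
        .collect(groupingBy(Employee::getCity, averagingInt(Employee::getNumSales)));
  }

  public static void main(String[] args) {
    // London averages (200 + 150) / 2 = 175.0; the other cities have one employee each.
    System.out.println(avgSalesByCity(sampleEmployees()));
  }
}
```

Although the input values are ints, averagingInt returns a Double, so the London average comes out as 175.0 rather than being truncated.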

Partitioning

Partitioning is a special kind of grouping, in which the resultant map contains two groups, one keyed by true and one keyed by false (either of which may be empty). For instance, if you want to find out who your best employees are, you can partition them into those who made more than a given number of sales (150 in this example) and those who didn’t, using the partitioningBy collector:

Map<Boolean, List<Employee>> partitioned =
  employees.stream().collect(partitioningBy(e -> e.getNumSales() > 150));

This will produce the following result:

{false=[Bob], true=[Alice, Charles, Dorothy]}
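The partitioning step can be sketched as a small runnable program. As before, the Employee class and the helper names (sampleEmployees, partitionBySales) are assumptions for illustration; the threshold is passed in as a parameter so that other cut-offs can be tried.

```java
import static java.util.stream.Collectors.partitioningBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class PartitionBySalesExample {

  // Hypothetical Employee class; toString returns the name for readable output.
  static class Employee {
    private final String name;
    private final int numSales;

    Employee(String name, int numSales) {
      this.name = name;
      this.numSales = numSales;
    }

    int getNumSales() { return numSales; }

    @Override
    public String toString() { return name; }
  }

  // Names and sales figures taken from the table at the top of the post.
  static List<Employee> sampleEmployees() {
    return Arrays.asList(
        new Employee("Alice", 200),
        new Employee("Bob", 150),
        new Employee("Charles", 160),
        new Employee("Dorothy", 190));
  }

  // Splits employees into those with strictly more than `threshold` sales (true)
  // and everyone else (false).
  static Map<Boolean, List<Employee>> partitionBySales(List<Employee> employees, int threshold) {
    return employees.stream().collect(partitioningBy(e -> e.getNumSales() > threshold));
  }

  public static void main(String[] args) {
    // Bob has exactly 150 sales, so he lands in the false group.
    System.out.println(partitionBySales(sampleEmployees(), 150));

    // With a threshold nobody exceeds, the true key is still present with an empty list.
    System.out.println(partitionBySales(sampleEmployees(), 1000));
  }
}
```

Because the predicate uses a strict greater-than comparison, an employee sitting exactly on the threshold falls into the false group.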

You can also combine partitioning and grouping by passing a groupingBy collector to the partitioningBy collector. For example, you could count the number of employees in each city within each partition:

Map<Boolean, Map<String, Long>> result =
  employees.stream().collect(partitioningBy(e -> e.getNumSales() > 150,
                               groupingBy(Employee::getCity, counting())));

This will produce a two-level Map: