Java 8: Group by with collections
In my continued reading of Venkat Subramaniam’s ‘Functional Programming in Java‘ I’ve reached the part of the book where the Stream#collect function is introduced.
We want to take a collection of people, group them by age and return a map of (age -> people’s names) for which this comes in handy.
To refresh, this is what the Person class looks like:
static class Person { private String name; private int age; Person(String name, int age) { this.name = name; this.age = age; } @Override public String toString() { return String.format("Person{name='%s', age=%d}", name, age); } }
And we can write the following code in Java 8 to get a map of people’s names grouped by age:
Stream<Person> people = Stream.of(new Person("Paul", 24), new Person("Mark", 30), new Person("Will", 28)); Map<Integer, List<String>> peopleByAge = people .collect(groupingBy(p -> p.age, mapping((Person p) -> p.name, toList()))); System.out.println(peopleByAge);
{24=[Paul], 28=[Will], 30=[Mark]}
We’re running the ‘collect’ function over the collection, grouping by the ‘age’ property as we go and grouping the names of people rather than the people themselves.
This is a little bit different to what you’d do in Ruby where there’s a ‘group_by’ function which you can call on a collection:
> people = [ {:name => "Paul", :age => 24}, {:name => "Mark", :age => 30}, {:name => "Will", :age => 28}] > people.group_by { |p| p[:age] } => {24=>[{:name=>"Paul", :age=>24}], 30=>[{:name=>"Mark", :age=>30}], 28=>[{:name=>"Will", :age=>28}]}
This gives us back lists of people grouped by age but we need to apply an additional ‘map’ operation to change that to be a list of names instead:
> people.group_by { |p| p[:age] }.map { |k,v| [k, v.map { |person| person[:name] } ] } => [[24, ["Paul"]], [30, ["Mark"]], [28, ["Will"]]]
At this stage we’ve got an array of (age, names) pairs but luckily Ruby 2.1.0 has a function ‘to_h’ which we can call to get back to a hash again:
> people.group_by { |p| p[:age] }.map { |k,v| [k, v.map { |person| person[:name] } ] }.to_h => {24=>["Paul"], 30=>["Mark"], 28=>["Will"]}
If we want to follow the Java approach of grouping by a property while running a reduce over the collection we’d have something like the following:
> people.reduce({}) { |acc, item| acc[item[:age]] ||=[]; acc[item[:age]] << item[:name]; acc } => {24=>["Paul"], 30=>["Mark"], 28=>["Will"]}
If we’re using Clojure then we might end up with something like this instead:
(def people [{:name "Paul", :age 24} {:name "Mark", :age 30} {:name "Will", :age 28}]) > (reduce (fn [acc [k v]] (assoc-in acc [k] (map :name v))) {} (group-by :age people)) {28 ("Will"), 30 ("Mark"), 24 ("Paul")}
I thought the Java version looked a bit weird to begin with but it’s actually not too bad having worked through the problem in a couple of other languages.
It’d be good to know whether there’s a better way of doing this the Ruby/Clojure way though!
Here is a Groovy version:
@groovy.transform.ToString(includeNames=true)
class Person {
String name
int age
}
def persons = [
new Person(name:’Paul’, age:24),
new Person(name:’Mark’, age:30),
new Person(name:’Bob’, age:24),
new Person(name:’Will’, age:28),
]
persons.groupBy([{it.age}]).collectEntries {k,v -> [k,v.name]}
// Result: [24:[Paul, Bob], 30:[Mark], 28:[Will]]