Pass Streams Instead of Lists
Opening disclaimer: this isn’t always a good idea. I’ll present the idea, along with some of the reasons why it’s a good idea, but then I’ll talk about some instances where it’s not so great.
Being Lazy
As you may know, I’ve been dabbling in Python nearly as much as I’ve been working with Java. One thing that I’ve liked about Python as soon as I found out about it is generators. They allow for lazy operations on collections, so you can pass iterators/generators around until you actually need the final result of the operations, without affecting the original collection (under most circumstances; you’re not likely to affect it accidentally).
I really enjoy the power of this idea. The laziness means practically no work is done until the results are needed, and no memory is wasted storing intermediate collections.
Being Lazy in Java
Java has iterators too, but not generators. It does have something that works fairly similarly when it comes to lazy operations on collections: Streams. While not quite as versatile as generators in Python, Streams can largely be used the same way.
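To make the laziness concrete, here is a minimal sketch (the class name LazyStreamDemo and the visited list are made up for illustration). It records which elements the map() step touches, showing that nothing runs until a terminal operation is called and that the source list is left alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Records every element the map step actually touches (illustration only).
    static final List<Integer> visited = new ArrayList<>();

    // Builds a pipeline but runs nothing: map() is an intermediate operation.
    static Stream<Integer> doubled(List<Integer> source) {
        return source.stream().map(i -> {
            visited.add(i);
            return i * 2;
        });
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3);

        Stream<Integer> pipeline = doubled(source);
        System.out.println("visited before: " + visited); // visited before: []

        // collect() is a terminal operation, so the mapping happens here.
        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println("result: " + result);          // result: [2, 4, 6]
        System.out.println("visited after: " + visited);  // visited after: [1, 2, 3]
        System.out.println("source: " + source);          // source: [1, 2, 3]
    }
}
```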
Passing Streams Around
There are a lot of cases where you should return Streams instead of the resulting Lists (or other collections). This buys you flexibility on top of the laziness benefits mentioned above.
If the receiver of the returned object wants to collect() it into something other than the List you had planned on returning, or they want to reduce() it in a way you never expected, you can give them a Stream and have nothing to worry about. They can then get what they need with a Stream method call or two.
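As a rough sketch of that flexibility (upperCaseNames is a hypothetical producer method, not something from a real library), two callers take the same returned Stream and finish it in different ways:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ReturnStreamDemo {
    // Hypothetical producer: returns a Stream so each caller picks the final shape.
    static Stream<String> upperCaseNames(List<String> names) {
        return names.stream().map(String::toUpperCase);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("bob", "alice", "bob");

        // One caller wants a sorted set without duplicates instead of a List.
        Set<String> unique = upperCaseNames(names)
                .collect(Collectors.toCollection(TreeSet::new));
        System.out.println(unique);      // [ALICE, BOB]

        // Another caller reduces to a single value and never builds a collection.
        int totalLength = upperCaseNames(names)
                .map(String::length)
                .reduce(0, Integer::sum);
        System.out.println(totalLength); // 11
    }
}
```

Note that each caller asks the producer for its own Stream; neither one reuses the other's.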
What Sucks About This
There is a problem that can be difficult to deal with when it comes to Streams being passed around like they’re collections: they’re one-time-use-only. This means that if a function such as the one below wants to use a Stream instead of a List, it can’t do it easily, since it needs to do two separate things with the List.
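// Requires java.util.List and java.util.stream.Collectors.
public static List<Integer> normalize(List<Integer> input) {
    // First pass: sum the values.
    int total = input.stream()
            .mapToInt(i -> i)
            .sum();

    // Second pass: express each value as a percentage of the total.
    return input.stream()
            .map(i -> i * 100 / total)
            .collect(Collectors.toList());
}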
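To see the single-use restriction in action, here is a sketch of what happens if the method simply swaps the List for a Stream and touches it twice (naiveNormalize and failsOnReuse are names invented for this illustration). The second use of the same Stream throws an IllegalStateException.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SingleUseDemo {
    // Naive port of normalize(): tries to use the same Stream for both passes.
    static List<Integer> naiveNormalize(Stream<Integer> input) {
        // First pass: after this line, input has already been operated upon.
        int total = input.mapToInt(i -> i).sum();

        // Second pass on the same Stream object: throws IllegalStateException.
        return input.map(i -> i * 100 / total)
                .collect(Collectors.toList());
    }

    // Returns true if reusing the Stream failed as described above.
    static boolean failsOnReuse() {
        try {
            naiveNormalize(Stream.of(1, 2, 3));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnReuse()); // true
    }
}
```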
In order to take in a Stream instead, you need to collect() it, then run the two operations on it.
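// Requires java.util.List, java.util.stream.Collectors and java.util.stream.Stream.
public static Stream<Integer> normalize(Stream<Integer> input) {
    // Materialize the Stream once so it can be traversed twice.
    List<Integer> inputList = input.collect(Collectors.toList());

    int total = inputList.stream()
            .mapToInt(i -> i)
            .sum();

    // Still lazy from here on: the caller decides how to finish the Stream.
    return inputList.stream()
            .map(i -> i * 100 / total);
}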
This slightly defeats the purpose of passing the Streams around. It’s not horrible, since we’re trying to use a “final” result of the Stream. Except that it’s not a final result. It’s an intermediate result that is used to calculate the next Stream output. It creates an intermediate collection which wastes memory.
There are ways around this, akin to how this “article” solves it, but they’re either complicated to implement or prone to user errors. I guess it’s kind of okay to just use the second method I showed you, since it’s still likely a pretty good performance boost over how the first one did it, but it just bugs me.
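One commonly seen workaround, which is not necessarily the one the linked article uses, is to accept a Supplier of Streams rather than a Stream. The sketch below assumes every call to get() hands back a fresh Stream over the same data. That assumption is exactly where the user-error risk lies: a supplier that returns the same Stream object twice fails at runtime, and the upstream pipeline is executed once per pass.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SupplierNormalizeDemo {
    // Takes a factory for Streams; each get() is assumed to return a fresh Stream.
    static Stream<Integer> normalize(Supplier<Stream<Integer>> input) {
        // First pass uses one Stream to compute the total.
        int total = input.get().mapToInt(i -> i).sum();

        // Second pass uses a second, independent Stream; no intermediate List is built here.
        return input.get().map(i -> i * 100 / total);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 1, 2);

        // values::stream creates a new Stream on every call, which satisfies the assumption.
        List<Integer> result = normalize(values::stream)
                .collect(Collectors.toList());
        System.out.println(result); // [25, 25, 50]
    }
}
```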
Interesting (But Probably A Little Silly) Alternative
If you’re familiar with my posts, you may feel like this article is against an article I had written a while back about transforming collections using decorators. Technically, this post does think of that as a rather naive idea, especially since the idea was inspired by Streams. But there is one major benefit to the decorator idea over the Streams idea presented in this article: you can iterate over the decorated collections over and over again. It’s probably not as efficient as Streams, especially since I’m not sure how to parallelize it, but it certainly has reusability going for it.
There’s a chance I’ll look into the idea again and see if I can figure out a better way to do it, but I doubt it.
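For readers who haven’t seen the decorator idea, here is a rough sketch of what such a wrapper might look like. MappedIterable is a hypothetical class written for this illustration, not the code from the earlier article. It applies the mapping lazily on each pass and can be iterated as many times as you like, because every call to iterator() starts over on the source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical decorator: a lazily mapped view that can be iterated repeatedly.
class MappedIterable<T, R> implements Iterable<R> {
    private final Iterable<T> source;
    private final Function<? super T, ? extends R> mapper;

    MappedIterable(Iterable<T> source, Function<? super T, ? extends R> mapper) {
        this.source = source;
        this.mapper = mapper;
    }

    @Override
    public Iterator<R> iterator() {
        // A fresh iterator over the source on every call makes the view reusable.
        Iterator<T> inner = source.iterator();
        return new Iterator<R>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            // The mapping is applied only when an element is requested.
            @Override public R next() { return mapper.apply(inner.next()); }
        };
    }

    // Small helper for the demo: copies any Iterable into a List.
    static <E> List<E> toList(Iterable<E> items) {
        List<E> out = new ArrayList<>();
        for (E item : items) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3);
        Iterable<Integer> doubled = new MappedIterable<Integer, Integer>(source, i -> i * 2);

        System.out.println(toList(doubled)); // [2, 4, 6]
        System.out.println(toList(doubled)); // [2, 4, 6] (second pass works, unlike a Stream)
    }
}
```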
Outro
So, that’s my idea. You can take it or leave it. I’m not sure how often this can be useful in typical projects, but I think I’m going to give it a try in my current and future projects. Thanks for reading. If you’ve got an opinion about this, comment below and let me know.
Reference: Pass Streams Instead of Lists from our JCG partner Jacob Zimmerman at the Programming Ideas With Jake blog.
I’m really uncomfortable with this. Giving a “one-time-use-only” object as a parameter will decrease the readability of the code and I think it will increase complexity.
It’s like coding in C++ and messing with a given reference. Also I think it’s a bad practice to modify objects given as parameters (when they’re not Entities nor DTOs).
The only use-cases I think about may be utility methods manipulating streams to prevent duplication or reduce complexity.
I can understand your concerns. There are plenty of cases where this is a bad idea. But this idea is designed around short-lived Streams. You get the Stream, pass through some transformative functions which return the new altered Stream (only terminal operations on Streams actually change them. All the other functions create new Stream objects that wrap the previous one), and then you get the final result, whether it’s a reduced value or a new collection. You never pass the same Stream object to more than one method. Either the method returns a new Stream to continue using or it…
Thanks for this explanation, it’s clearer for me now.
I would like to see some more code if possible ^^
Handy post, helped me to get a better understanding of streams.
What a garbage article, I must say.