Three-State Booleans in Java
Every now and then, I miss SQL’s three-valued BOOLEAN semantics in Java. In SQL, we have:
- TRUE
- FALSE
- UNKNOWN (also known as NULL)
Every now and then, I find myself in a situation where I wish I could also express these UNKNOWN or UNINITIALISED semantics in Java, when plain true and false aren’t enough.
Implementing a ResultSetIterator
For instance, when implementing a ResultSetIterator for jOOλ, a simple library modelling SQL streams for Java 8:

```java
SQL.stream(stmt, Unchecked.function(r ->
    new SQLGoodies.Schema(
        r.getString("FIELD_1"),
        r.getBoolean("FIELD_2")
    )
))
.forEach(System.out::println);
```
In order to implement a Java 8 Stream, we need to construct an Iterator, which we can then pass to the new Spliterators.spliteratorUnknownSize() method:

```java
StreamSupport.stream(
    Spliterators.spliteratorUnknownSize(iterator, 0),
    false
);
```
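To make the snippet above concrete, here is a minimal, self-contained sketch that wraps an ordinary in-memory Iterator instead of a JDBC-backed one. The class name IteratorStreams and its stream() helper are hypothetical names chosen for this illustration; they are not part of jOOλ.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStreams {

    // Wraps any Iterator in a sequential Stream. The characteristics
    // argument is 0 because we claim nothing about the source
    // (not SIZED, not ORDERED, and so on).
    static <T> Stream<T> stream(Iterator<T> iterator) {
        return StreamSupport.stream(
            Spliterators.spliteratorUnknownSize(iterator, 0),
            false // sequential, not parallel
        );
    }

    public static void main(String[] args) {
        Iterator<String> iterator = Arrays.asList("a", "b", "c").iterator();

        List<String> result = stream(iterator)
            .map(String::toUpperCase)
            .collect(Collectors.toList());

        System.out.println(result); // prints [A, B, C]
    }
}
```

The stream pulls elements lazily through hasNext() and next(), which is why the behaviour of those two methods matters so much for a ResultSet-backed iterator.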
Another example for this can be seen here on Stack Overflow.
When implementing the Iterator interface, we must implement hasNext() and next(). Note that with Java 8, remove() now has a default implementation, so we don’t need to implement it any longer.
While most of the time, a call to next() is preceded by a call to hasNext() exactly once, nothing in the Iterator contract requires this. It is perfectly fine to write:

```java
if (it.hasNext()) {
    // Some stuff

    // Double-check again to be sure
    if (it.hasNext() && it.hasNext()) {

        // Yes, we're paranoid
        if (it.hasNext())
            it.next();
    }
}
```
How do we translate the Iterator calls to backing calls on the JDBC ResultSet? We need to call ResultSet.next().
We could make the following translation:
- Iterator.hasNext() == !ResultSet.isLast()
- Iterator.next() == ResultSet.next()
But that translation is:
- Expensive
- Not dealing correctly with empty ResultSets
- Not implemented in all JDBC drivers (Support for the isLast method is optional for ResultSets with a result set type of TYPE_FORWARD_ONLY)
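The empty-ResultSet problem from the list above can be sketched without a database. FakeCursor below is a hypothetical in-memory stand-in that only mimics the next() and isLast() behaviour described in the ResultSet Javadoc (isLast() is true only while the cursor sits on the last row); NaiveTranslation is likewise an illustrative name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a forward-only ResultSet cursor
class FakeCursor {
    private final List<String> rows;
    private int position = -1; // the cursor starts before the first row

    FakeCursor(List<String> rows) {
        this.rows = rows;
    }

    // Moves the cursor forward and reports whether it is now on a row
    boolean next() {
        if (position < rows.size()) position++;
        return position < rows.size();
    }

    // True only while the cursor is on the last row; never true when empty
    boolean isLast() {
        return !rows.isEmpty() && position == rows.size() - 1;
    }
}

class NaiveTranslation {

    // The translation under discussion: hasNext() == !isLast()
    static boolean hasNext(FakeCursor cursor) {
        return !cursor.isLast();
    }

    public static void main(String[] args) {
        FakeCursor oneRow = new FakeCursor(Arrays.asList("a"));
        System.out.println(hasNext(oneRow)); // true: correct, one row to go
        System.out.println(oneRow.next());   // true
        System.out.println(hasNext(oneRow)); // false: correct, on the last row

        FakeCursor empty = new FakeCursor(Collections.emptyList());
        System.out.println(hasNext(empty));  // true: wrong, there are no rows
        System.out.println(empty.next());    // false: the real answer
    }
}
```

With an empty source, the cursor is never "on the last row", so the naive hasNext() keeps answering true even though next() has nothing to return.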
So, we’ll have to maintain a flag, internally, that tells us:
- If we had already called ResultSet.next()
- What the result of that call was
Instead of creating a second variable, why not just use a three-valued java.lang.Boolean? Here’s a possible implementation from jOOλ:

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class ResultSetIterator<T> implements Iterator<T> {

    final Supplier<? extends ResultSet>  supplier;
    final Function<ResultSet, T>         rowFunction;
    final Consumer<? super SQLException> translator;

    /**
     * Whether the underlying {@link ResultSet} has
     * a next row. This boolean has three states:
     * <ul>
     * <li>null:  it's not known whether there
     *            is a next row</li>
     * <li>true:  there is a next row, and it
     *            has been pre-fetched</li>
     * <li>false: there aren't any next rows</li>
     * </ul>
     */
    Boolean hasNext;
    ResultSet rs;

    ResultSetIterator(
        Supplier<? extends ResultSet> supplier,
        Function<ResultSet, T> rowFunction,
        Consumer<? super SQLException> translator
    ) {
        this.supplier = supplier;
        this.rowFunction = rowFunction;
        this.translator = translator;
    }

    // Lazily obtain the ResultSet on first access
    private ResultSet rs() {
        return (rs == null)
             ? (rs = supplier.get())
             : rs;
    }

    @Override
    public boolean hasNext() {
        try {
            // Advance the cursor only if the state is unknown
            if (hasNext == null) {
                hasNext = rs().next();
            }

            return hasNext;
        }
        catch (SQLException e) {
            // The translator may throw its own exception;
            // otherwise fall back to an unchecked one
            translator.accept(e);
            throw new IllegalStateException(e);
        }
    }

    @Override
    public T next() {
        try {
            // hasNext() was not called first, so move the cursor
            // here (the result of that call is not checked)
            if (hasNext == null) {
                rs().next();
            }

            return rowFunction.apply(rs());
        }
        catch (SQLException e) {
            translator.accept(e);
            throw new IllegalStateException(e);
        }
        finally {
            // The current row is consumed: back to "unknown"
            hasNext = null;
        }
    }
}
```
As you can see, the hasNext() method caches the three-valued hasNext state only if it was null before. This means that only the first of several hasNext() calls advances the cursor; repeated calls have no further effect until you call next(), which resets the cached hasNext state.
Both hasNext() and next() advance the ResultSet cursor if needed.
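The caching behaviour can be checked without JDBC. The following is a sketch under stated assumptions: CountingCursor is a hypothetical in-memory stand-in for a ResultSet that counts how often the cursor moves, and ThreeStateIterator mirrors the three-state logic of the listing above. One deliberate difference, which is an addition of this sketch and not part of the original listing: next() throws NoSuchElementException when there are no more rows.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a ResultSet that counts cursor movements
class CountingCursor {
    private final List<String> rows;
    private int position = -1; // before the first row
    int advances = 0;

    CountingCursor(List<String> rows) {
        this.rows = rows;
    }

    boolean next() {
        advances++;
        if (position < rows.size()) position++;
        return position < rows.size();
    }

    String current() {
        return rows.get(position);
    }
}

class ThreeStateIterator implements Iterator<String> {
    private final CountingCursor cursor;

    // null: unknown, TRUE: a row has been pre-fetched, FALSE: no more rows
    private Boolean hasNext;

    ThreeStateIterator(CountingCursor cursor) {
        this.cursor = cursor;
    }

    @Override
    public boolean hasNext() {
        if (hasNext == null) {
            hasNext = cursor.next();
        }
        return hasNext;
    }

    @Override
    public String next() {
        // Assumption of this sketch: fail fast when exhausted
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasNext = null; // the pre-fetched row is consumed
        return cursor.current();
    }
}

class ThreeStateDemo {
    public static void main(String[] args) {
        CountingCursor cursor = new CountingCursor(Arrays.asList("a", "b"));
        Iterator<String> it = new ThreeStateIterator(cursor);

        it.hasNext();
        it.hasNext();
        it.hasNext();
        System.out.println(cursor.advances); // prints 1

        System.out.println(it.next());       // prints a
        System.out.println(it.next());       // prints b (no hasNext() needed)
        System.out.println(it.hasNext());    // prints false
        System.out.println(cursor.advances); // prints 3
    }
}
```

Because the FALSE state is never reset in this variant, the cursor is not touched again once it has reported the end of the data.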
Readability?
Some of you may argue that this doesn’t help readability. You’d rather introduce a second variable, like:

```java
boolean hasNext;
boolean hasHasNextBeenCalled;
```
The trouble with this is that you’re still implementing three-valued boolean state, but distributed across two variables, which are very hard to name in a way that is truly more readable than the actual java.lang.Boolean solution. Besides, two boolean variables actually have four possible states, so there is a slight increase in the risk of bugs.
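To spell out the four combinations, here is a small illustrative sketch. The class TwoFlagStates and its describe() method are hypothetical; they only label each combination of the two flags.

```java
class TwoFlagStates {

    // Labels each combination of the two flags from the snippet above
    static String describe(boolean hasHasNextBeenCalled, boolean hasNext) {
        if (!hasHasNextBeenCalled) {
            // hasNext carries no meaning yet, but it can still be true
            return hasNext
                 ? "INVALID: claims a row without having looked"
                 : "UNKNOWN";
        }
        return hasNext ? "HAS_NEXT" : "NO_MORE_ROWS";
    }

    public static void main(String[] args) {
        boolean[] values = { false, true };

        for (boolean called : values) {
            for (boolean hasNext : values) {
                System.out.println(
                    called + ", " + hasNext + " -> " + describe(called, hasNext));
            }
        }
    }
}
```

Three of the four combinations map onto null, true and false. The fourth has no sensible meaning, and nothing in the type system prevents it from occurring.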
Every rule has its exception. Using null for the above semantics is a very good exception to the null-is-bad hysteria that has been going on ever since the introduction of Option / Optional…
In other words: Which approach is best? There’s no TRUE or FALSE answer, only UNKNOWN!
Be careful with this
However, as we’ve discussed in a previous blog post, you should avoid returning null from API methods if possible. In this case, using null explicitly as a means to model state is fine because this model is encapsulated in our ResultSetIterator. But try to avoid leaking such state to the outside of your API.
Reference: Three-State Booleans in Java from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog.
How about using an enum instead?
enum HasMoreRows { UNKNOWN, NO_MORE_ROWS, HAS_MORE_ROWS }
This increases readability, avoids the use of null, and avoids boxing/unboxing hasNext.
Yes, it would work just the same. While I don’t think that unboxing is an issue (all Booleans are cached), an enum might make things a bit more readable, although also a bit more verbose.
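For comparison, here is a rough sketch of what the enum-based variant suggested above might look like. It reuses the HasMoreRows enum from the comment; EnumStateIterator and its in-memory list of rows (a stand-in for a ResultSet) are hypothetical names for this illustration, and throwing NoSuchElementException is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

enum HasMoreRows { UNKNOWN, NO_MORE_ROWS, HAS_MORE_ROWS }

class EnumStateIterator implements Iterator<String> {
    private final List<String> rows; // hypothetical stand-in for a ResultSet
    private int position = -1;       // the cursor starts before the first row
    private HasMoreRows state = HasMoreRows.UNKNOWN;

    EnumStateIterator(List<String> rows) {
        this.rows = rows;
    }

    // Plays the role of ResultSet.next()
    private boolean advance() {
        if (position < rows.size()) position++;
        return position < rows.size();
    }

    @Override
    public boolean hasNext() {
        if (state == HasMoreRows.UNKNOWN) {
            state = advance()
                  ? HasMoreRows.HAS_MORE_ROWS
                  : HasMoreRows.NO_MORE_ROWS;
        }
        return state == HasMoreRows.HAS_MORE_ROWS;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        state = HasMoreRows.UNKNOWN; // the pre-fetched row is consumed
        return rows.get(position);
    }
}

class EnumStateDemo {
    public static void main(String[] args) {
        Iterator<String> it = new EnumStateIterator(Arrays.asList("x", "y"));
        while (it.hasNext()) {
            System.out.println(it.next()); // prints x, then y
        }
    }
}
```

The logic is the same as with java.lang.Boolean; the state names are more explicit, at the cost of a few extra lines.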