Three-State Booleans in Java
Every now and then, I miss SQL’s three-valued BOOLEAN semantics in Java. In SQL, we have:
- TRUE
- FALSE
- UNKNOWN (also known as NULL)
Every now and then, I find myself in a situation where I wish I could also express these UNKNOWN or UNINITIALISED semantics in Java, when plain true and false aren’t enough.
Implementing a ResultSetIterator
For instance, when implementing a ResultSetIterator for jOOλ, a simple library modelling SQL streams for Java 8:

    SQL.stream(stmt, Unchecked.function(r ->
        new SQLGoodies.Schema(
            r.getString("FIELD_1"),
            r.getBoolean("FIELD_2")
        )
    ))
    .forEach(System.out::println);
In order to implement a Java 8 Stream, we need to construct an Iterator, which we can then pass to the new Spliterators.spliteratorUnknownSize() method:

    StreamSupport.stream(
        Spliterators.spliteratorUnknownSize(iterator, 0),
        false
    );
Another example for this can be seen here on Stack Overflow.
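To make the Iterator-to-Stream step concrete, here is a minimal sketch that does not need a database. The IteratorStreams class and its stream() method are hypothetical names used only for illustration, and passing Spliterator.ORDERED instead of 0 is an assumption on my part (an Iterator over rows has a defined encounter order).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper, not part of jOOλ or the JDK.
final class IteratorStreams {

    // Wraps any Iterator in a sequential Stream. The number of elements
    // is not known up front, as is the case for a JDBC ResultSet.
    static <T> Stream<T> stream(Iterator<T> iterator) {
        return StreamSupport.stream(
            Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED),
            false // sequential, not parallel
        );
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c");

        List<String> upper = stream(rows.iterator())
            .map(String::toUpperCase)
            .collect(Collectors.toList());

        System.out.println(upper); // prints [A, B, C]
    }
}
```

The stream only ever talks to hasNext() and next(), so everything interesting happens inside the Iterator implementation.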
When implementing the Iterator interface, we must implement hasNext() and next(). Note that with Java 8, remove() now has a default implementation, so we don’t need to implement it any longer.
While most of the time, a call to next() is preceded by a call to hasNext() exactly once, nothing in the Iterator contract requires this. It is perfectly fine to write:

    if (it.hasNext()) {
        // Some stuff

        // Double-check again to be sure
        if (it.hasNext() && it.hasNext()) {

            // Yes, we're paranoid
            if (it.hasNext())
                it.next();
        }
    }
How do we translate the Iterator calls to backing calls on the JDBC ResultSet? We need to call ResultSet.next().
We could make the following translation:
Iterator.hasNext() == !ResultSet.isLast()
Iterator.next() == ResultSet.next()
But that translation is:
- Expensive
- Not dealing correctly with empty ResultSets
- Not implemented in all JDBC drivers (Support for the isLast method is optional for ResultSets with a result set type of TYPE_FORWARD_ONLY)
So, we’ll have to maintain a flag, internally, that tells us:
- If we had already called ResultSet.next()
- What the result of that call was
Instead of creating a second variable, why not just use a three-valued java.lang.Boolean? Here’s a possible implementation from jOOλ:

    class ResultSetIterator<T> implements Iterator<T> {

        final Supplier<? extends ResultSet>  supplier;
        final Function<ResultSet, T>         rowFunction;
        final Consumer<? super SQLException> translator;

        /**
         * Whether the underlying {@link ResultSet} has
         * a next row. This boolean has three states:
         * <ul>
         * <li>null:  it's not known whether there
         *            is a next row</li>
         * <li>true:  there is a next row, and it
         *            has been pre-fetched</li>
         * <li>false: there aren't any next rows</li>
         * </ul>
         */
        Boolean hasNext;
        ResultSet rs;

        ResultSetIterator(
            Supplier<? extends ResultSet> supplier,
            Function<ResultSet, T> rowFunction,
            Consumer<? super SQLException> translator
        ) {
            this.supplier = supplier;
            this.rowFunction = rowFunction;
            this.translator = translator;
        }

        private ResultSet rs() {
            return (rs == null)
                 ? (rs = supplier.get())
                 : rs;
        }

        @Override
        public boolean hasNext() {
            try {
                if (hasNext == null) {
                    hasNext = rs().next();
                }

                return hasNext;
            }
            catch (SQLException e) {
                translator.accept(e);
                throw new IllegalStateException(e);
            }
        }

        @Override
        public T next() {
            try {
                if (hasNext == null) {
                    rs().next();
                }

                return rowFunction.apply(rs());
            }
            catch (SQLException e) {
                translator.accept(e);
                throw new IllegalStateException(e);
            }
            finally {
                hasNext = null;
            }
        }
    }
As you can see, the hasNext() method locally caches the hasNext three-valued boolean state only if it was null before. This means that calling hasNext() several times will have no effect until you call next(), which resets the hasNext cached state.
Both hasNext() and next() advance the ResultSet cursor if needed.
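One thing worth noting: as far as I can tell from the listing above, next() discards the result of ResultSet.next() when it advances the cursor itself, and it does not check a cached false. The following sketch shows one possible stricter variant that throws NoSuchElementException once the rows are exhausted. To keep it runnable without a JDBC driver, it uses a hypothetical RowCursor interface as a stand-in for ResultSet, and a hypothetical ListCursor as test data. None of these names come from jOOλ.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a forward-only cursor such as a JDBC ResultSet.
interface RowCursor<T> {
    boolean advance(); // plays the role of ResultSet.next()
    T current();       // reads the row the cursor is positioned on
}

// Same three-state idea as above, but next() honours a "false" state.
final class StrictCursorIterator<T> implements Iterator<T> {
    private final RowCursor<T> cursor;

    // null: unknown, true: a row has been pre-fetched, false: exhausted
    private Boolean hasNext;

    StrictCursorIterator(RowCursor<T> cursor) {
        this.cursor = cursor;
    }

    @Override
    public boolean hasNext() {
        if (hasNext == null) {
            hasNext = cursor.advance(); // advance at most once per row
        }
        return hasNext;
    }

    @Override
    public T next() {
        // Delegating to hasNext() advances the cursor only if needed
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasNext = null; // the pre-fetched row is consumed, state is unknown again
        return cursor.current();
    }
}

// Hypothetical in-memory cursor that counts how often it was advanced.
final class ListCursor<T> implements RowCursor<T> {
    private final List<T> rows;
    private int index = -1;
    int advanceCalls;

    ListCursor(List<T> rows) {
        this.rows = rows;
    }

    @Override
    public boolean advance() {
        advanceCalls++;
        index++;
        return index < rows.size();
    }

    @Override
    public T current() {
        return rows.get(index);
    }
}
```

With this variant, the false state is sticky: once the cursor reports that there are no more rows, it is never advanced again.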
Readability?
Some of you may argue that this doesn’t help readability. You might introduce a second variable instead:

    boolean hasNext;
    boolean hasHasNextBeenCalled;
The trouble with this is that you’re still implementing three-valued boolean state, only distributed across two variables, which are very hard to name in a way that is truly more readable than the actual java.lang.Boolean solution. Besides, two boolean variables have four possible combinations of values, one more than we need, so there is a slight increase in the risk of bugs.
Every rule has its exception. Using null for the above semantics is a very good exception to the null-is-bad hysteria that has been going on ever since the introduction of Option / Optional…
In other words: Which approach is best? There’s no TRUE or FALSE answer, only UNKNOWN!
Be careful with this
However, as we’ve discussed in a previous blog post, you should avoid returning null from API methods if possible. In this case, using null explicitly as a means to model state is fine because this model is encapsulated in our ResultSetIterator. But try to avoid leaking such state to the outside of your API.
Reference: Three-State Booleans in Java from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog.
How about using an enum instead?
    enum HasMoreRows { UNKNOWN, NO_MORE_ROWS, HAS_MORE_ROWS }

This increases readability, avoids the use of null, and avoids boxing/unboxing hasNext.
Yes, it would work just the same. While I don’t think that unboxing is an issue (all Booleans are cached), an enum might make things a bit more readable, although also a bit more verbose.
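For comparison, here is a rough sketch of what the enum variant might look like. It reuses the HasMoreRows enum suggested above. Everything else is an assumption made for illustration: the EnumStateIterator class is hypothetical, and a BooleanSupplier plus a Supplier stand in for ResultSet.next() and the row function so the example runs without a database.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// The enum suggested in the comment above
enum HasMoreRows { UNKNOWN, NO_MORE_ROWS, HAS_MORE_ROWS }

// Hypothetical iterator that tracks its state with the enum instead of a Boolean.
final class EnumStateIterator<T> implements Iterator<T> {
    private final BooleanSupplier advance; // stands in for ResultSet.next()
    private final Supplier<T> current;     // stands in for the row function

    private HasMoreRows state = HasMoreRows.UNKNOWN;

    EnumStateIterator(BooleanSupplier advance, Supplier<T> current) {
        this.advance = advance;
        this.current = current;
    }

    @Override
    public boolean hasNext() {
        if (state == HasMoreRows.UNKNOWN) {
            // Advance the cursor once and remember the outcome
            state = advance.getAsBoolean()
                  ? HasMoreRows.HAS_MORE_ROWS
                  : HasMoreRows.NO_MORE_ROWS;
        }
        return state == HasMoreRows.HAS_MORE_ROWS;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        state = HasMoreRows.UNKNOWN; // the pre-fetched row is consumed
        return current.get();
    }
}
```

The logic is the same as with the three-valued Boolean. The state names are spelled out at the cost of a few more lines.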