Enterprise Java

The dark side of Hibernate AUTO flush

Introduction

Now that I described the the basics of JPA and Hibernate flush strategies, I can continue unraveling the surprising behavior of Hibernate’s AUTO flush mode.

Not all queries trigger a Session flush

Many would assume that Hibernate always flushes the Session before any executing query. While this might have been a more intuitive approach, and probably closer to the JPA’s AUTO FlushModeType, Hibernate tries to optimize that. If the current executed query is not going to hit the pending SQL INSERT/UPDATE/DELETE statements then the flush is not strictly required.

As stated in the reference documentation, the AUTO flush strategy may sometimes synchronize the current persistence context prior to a query execution. It would have been more intuitive if the framework authors had chosen to name it FlushMode.SOMETIMES.

JPQL/HQL and SQL

Like many other ORM solutions, Hibernate offers a limited Entity querying language (JPQL/HQL) that’s very much based on SQL-92 syntax.

The entity query language is translated to SQL by the current database dialect and so it must offer the same functionality across different database products. Since most database systems are SQL-92 complaint, the Entity Query Language is an abstraction of the most common database querying syntax.

While you can use the Entity Query Language in many use cases (selecting Entities and even projections), there are times when its limited capabilities are no match for an advanced querying request. Whenever we want to make use of some specific querying techniques, such as:

we have no other option, but to run native SQL queries.

Hibernate is a persistence framework. Hibernate was never meant to replace SQL. If some query is better expressed in a native query, then it’s not worth sacrificing application performance on the altar of database portability.

AUTO flush and HQL/JPQL

First we are going to test how the AUTO flush mode behaves when an HQL query is about to be executed. For this we define the following unrelated entities:

flushautoentities

The test will execute the following actions:

  • A Person is going to be persisted.
  • Selecting User(s) should not trigger a the flush.
  • Querying for Person, the AUTO flush should trigger the entity state transition synchronization (A person INSERT should be executed prior to executing the select query).
Product product = new Product();
session.persist(product);
assertEquals(0L,  session.createQuery("select count(id) from User").uniqueResult());
assertEquals(product.getId(), session.createQuery("select p.id from Product p").uniqueResult());

Giving the following SQL output:

[main]: o.h.e.i.AbstractSaveEventListener - Generated identifier: f76f61e2-f3e3-4ea4-8f44-82e9804ceed0, using strategy: org.hibernate.id.UUIDGenerator
Query:{[select count(user0_.id) as col_0_0_ from user user0_][]} 
Query:{[insert into product (color, id) values (?, ?)][12,f76f61e2-f3e3-4ea4-8f44-82e9804ceed0]} 
Query:{[select product0_.id as col_0_0_ from product product0_][]}

As you can see, the User select hasn’t triggered the Session flush. This is because Hibernate inspects the current query space against the pending table statements. If the current executing query doesn’t overlap with the unflushed table statements, the a flush can be safely ignored.

HQL can detect the Product flush even for:

  • Sub-selects
    session.persist(product);
    assertEquals(0L,  session.createQuery(
        "select count(*) " +
        "from User u " +
        "where u.favoriteColor in (select distinct(p.color) from Product p)").uniqueResult());

    Resulting in a proper flush call:

    Query:{[insert into product (color, id) values (?, ?)][Blue,2d9d1b4f-eaee-45f1-a480-120eb66da9e8]} 
    Query:{[select count(*) as col_0_0_ from user user0_ where user0_.favoriteColor in (select distinct product1_.color from product product1_)][]}
  • Or theta-style joins
    session.persist(product);
    assertEquals(0L,  session.createQuery(
        "select count(*) " +
        "from User u, Product p " +
        "where u.favoriteColor = p.color").uniqueResult());

    Triggering the expected flush :

    Query:{[insert into product (color, id) values (?, ?)][Blue,4af0b843-da3f-4b38-aa42-1e590db186a9]} 
    Query:{[select count(*) as col_0_0_ from user user0_ cross join product product1_ where user0_.favoriteColor=product1_.color][]}

The reason why it works is because Entity Queries are parsed and translated to SQL queries. Hibernate cannot reference a non existing table, therefore it always knows the database tables an HQL/JPQL query will hit.

So Hibernate is only aware of those tables we explicitly referenced in our HQL query. If the current pending DML statements imply database triggers or database level cascading, Hibernate won’t be aware of those. So even for HQL, the AUTO flush mode can cause consistency issues.

AUTO flush and native SQL queries

When it comes to native SQL queries, things are getting much more complicated. Hibernate cannot parse SQL queries, because it only supports a limited database query syntax. Many database systems offer proprietary features that are beyond Hibernate Entity Query capabilities.

Querying the Person table, with a native SQL query is not going to trigger the flush, causing an inconsistency issue:

Product product = new Product();
session.persist(product);
assertNull(session.createSQLQuery("select id from product").uniqueResult());
DEBUG [main]: o.h.e.i.AbstractSaveEventListener - Generated identifier: 718b84d8-9270-48f3-86ff-0b8da7f9af7c, using strategy: org.hibernate.id.UUIDGenerator
Query:{[select id from product][]} 
Query:{[insert into product (color, id) values (?, ?)][12,718b84d8-9270-48f3-86ff-0b8da7f9af7c]}

The newly persisted Product was only inserted during transaction commit, because the native SQL query didn’t triggered the flush. This is major consistency problem, one that’s hard to debug or even foreseen by many developers. That’s one more reason for always inspecting auto-generated SQL statements.

The same behaviour is observed even for named native queries:

@NamedNativeQueries(
    @NamedNativeQuery(name = "product_ids", query = "select id from product")
)
assertNull(session.getNamedQuery("product_ids").uniqueResult());

So even if the SQL query is pre-loaded, Hibernate won’t extract the associated query space for matching it against the pending DML statements.

Overruling the current flush strategy

Even if the current Session defines a default flush strategy, you can always override it on a query basis.

Query flush mode

The ALWAYS mode is going to flush the persistence context before any query execution (HQL or SQL). This time, Hibernate applies no optimization and all pending entity state transitions are going to be synchronized with the current database transaction.

assertEquals(product.getId(), session.createSQLQuery("select id from product").setFlushMode(FlushMode.ALWAYS).uniqueResult());

Instructing Hibernate which tables should be syncronized

You could also add a synchronization rule on your current executing SQL query. Hibernate will then know what database tables need to be syncronzied prior to executing the query. This is also useful for second level caching as well.

assertEquals(product.getId(), session.createSQLQuery("select id from product").addSynchronizedEntityClass(Product.class).uniqueResult());

Conclusion

The AUTO flush mode is tricky and fixing consistency issues on a query basis is a maintainer’s nightmare. If you decide to add a database trigger, you’ll have to check all Hibernate queries to make sure they won’t end up running against stale data.

My suggestion is to use the ALWAYS flush mode, even if Hibernate authors warned us that:

this strategy is almost always unnecessary and inefficient.

Inconsistency is much more of an issue that some occasional premature flushes. While mixing DML operations and queries may cause unnecessary flushing this situation is not that difficult to mitigate. During a session transaction, it’s best to execute queries at the beginning (when no pending entity state transitions are to be synchronized) and towards the end of the transaction (when the current persistence context is going to be flushed anyway).

The entity state transition operations should be pushed towards the end of the transaction, trying to avoid interleaving them with query operations (therefore preventing a premature flush trigger).

Reference: The dark side of Hibernate AUTO flush from our JCG partner Vlad Mihalcea at the Vlad Mihalcea’s Blog blog.

Vlad Mihalcea

Vlad Mihalcea is a software architect passionate about software integration, high scalability and concurrency challenges.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Martin Vanek
Martin Vanek
10 years ago

It sounds like you want buy new shoes because your one shoe lace is untied.

Whenever you are trying anythig else then COMMIT flushing, you are probably going to shoot yourself in the leg.

Vlad Mihalcea
10 years ago

So you say the COMMIT flush type is the only viable one. Then what happens with the pending changes when you want to run an HQL/SQL query.

Are you going to run it against the current database values, that hadn’t yet been synchronized with the current Session entity state changes?

I don;t think it’s worth compromising consistency. After all there’s a very good reason why we have a C in ACID.

Back to top button