Hibernate Facts: The importance of fetch strategy
When it comes to working with an ORM tool, everybody acknowledges the importance of database design and Entity-to-Table mapping. These aspects get a lot of attention, while things like the fetching strategy might simply be put off.
In my opinion, the entity fetching strategy should never be separated from the entity mapping design, since a poorly designed one can hurt the overall application performance.
Before Hibernate and JPA became so popular, a great deal of effort was put into designing each query, because you had to explicitly specify all the joins you needed and all the columns you were interested in. And if that was not enough, the DBA would optimize the slow-running queries.
With JPA, the JPA-QL or HQL queries fetch Entities along with some of their associated relationships. This eases development: we no longer choose every table column by hand, and joins or additional queries are sometimes generated automatically to serve our needs.
This is a double-edged sword. On one hand you can deliver features faster, but if your automatically generated SQL queries are not efficient, your application's overall performance might suffer significantly.
So what is the entity fetching strategy, anyway?
When JPA loads an entity, it also loads all of its EAGER or “join fetch” associations. As long as the persistence context is open, navigating the LAZY associations fetches those as well, through additional queries.
By default, the JPA @ManyToOne and @OneToOne associations are fetched EAGERly, while the @OneToMany and @ManyToMany relationships are considered LAZY. This is the default strategy, and Hibernate doesn’t magically optimize your object retrieval; it only does what it is instructed to do.
While small projects don’t require thorough entity fetching planning, medium to large applications should never ignore it.
Planning your fetching strategy from the very beginning, and adjusting it throughout the development cycle, isn’t a “premature optimization”; it’s just a natural part of any ORM design.
The default fetch strategy is the one you define through the JPA mapping, while the manual join fetching is what you specify in JPA-QL queries.
The best advice I can give you is to favor the manual fetching strategy (defined in JPA-QL queries using the fetch operator). While it makes sense to always fetch some @ManyToOne or @OneToOne associations eagerly, most of them aren’t needed by every fetching operation.
For child associations it’s always safer to mark them LAZY and only “join fetch” them when needed, because they can easily generate large SQL result sets, with unneeded joins.
Having most of the associations defined as LAZY requires us to use the “join fetch” JPA-QL operator and retrieve only the associations we need to fulfill a given request. If you forget to “join fetch” properly, the Persistence Context will run queries on your behalf while you navigate the lazy associations, and that might generate “N+1” problems, or additional SQL queries for data that could have been retrieved with a simple join in the first place.
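To get a feel for how a child join inflates the result set, here is a minimal plain-Java sketch. It involves no JPA or database at all: the class, the method, and the numbers (100 products, 5 images each) are assumptions made up for illustration, and the nested loop only mimics the row shape of an inner join between a parent and a child table.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Plain-Java model of how joining a parent table with a child table
 * multiplies the rows in the result set. No JPA or database is involved;
 * every name here is hypothetical.
 */
final class JoinResultSizeSketch {

    /** Builds the rows a "product JOIN image" query would return: one row per (product, image) pair. */
    static List<String> joinProductsWithImages(int productCount, int imagesPerProduct) {
        List<String> rows = new ArrayList<>();
        for (int p = 1; p <= productCount; p++) {
            for (int i = 1; i <= imagesPerProduct; i++) {
                // The product columns are repeated on every image row.
                rows.add("product-" + p + " | image-" + i);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed numbers, for illustration only.
        int products = 100;
        int imagesPerProduct = 5;
        System.out.println("product rows alone: " + products);
        System.out.println("rows after joining images: "
                + joinProductsWithImages(products, imagesPerProduct).size());
    }
}
```

Every product column travels once per image row, which is why joining a collection that a view never displays is wasted work.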
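The “N+1” effect can be sketched without any ORM. The model below is plain Java, not Hibernate: FakeDatabase and its methods are hypothetical stand-ins that only count the statements they are asked to run, and it assumes every product points to a different importer (so nothing is served from the Persistence Context).

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of the N+1 query problem; all names are hypothetical. */
final class NPlusOneSketch {

    /** Stand-in for a database; it only counts the statements it is asked to run. */
    static final class FakeDatabase {
        int queryCount = 0;

        /** One statement: select all products, importers not loaded. */
        List<Integer> selectProducts(int n) {
            queryCount++;
            List<Integer> ids = new ArrayList<>();
            for (int id = 1; id <= n; id++) {
                ids.add(id);
            }
            return ids;
        }

        /** One statement per call: what navigating a lazy association triggers. */
        String selectImporterOf(int productId) {
            queryCount++;
            return "importer-of-" + productId;
        }

        /** One statement: products joined with their importers. */
        List<String> selectProductsJoinImporter(int n) {
            queryCount++;
            List<String> rows = new ArrayList<>();
            for (int id = 1; id <= n; id++) {
                rows.add(id + " | importer-of-" + id);
            }
            return rows;
        }
    }

    /** Loads the products, then touches each lazy importer: 1 + N statements. */
    static int queriesWithLazyNavigation(int productCount) {
        FakeDatabase db = new FakeDatabase();
        for (int id : db.selectProducts(productCount)) {
            db.selectImporterOf(id);
        }
        return db.queryCount;
    }

    /** Loads products and importers together: a single statement. */
    static int queriesWithJoinFetch(int productCount) {
        FakeDatabase db = new FakeDatabase();
        db.selectProductsJoinImporter(productCount);
        return db.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy navigation: " + queriesWithLazyNavigation(10) + " statements");
        System.out.println("join fetch: " + queriesWithJoinFetch(10) + " statements");
    }
}
```

The statement count grows with the number of products in the first variant and stays constant in the second, which is the whole argument for join fetching what a request actually needs.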
For a concrete example, let’s start from the following diagram:
The Product entity associations are mapped as:
@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "company_id", nullable = false)
private Company company;

@OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL, mappedBy = "product", optional = false)
private WarehouseProductInfo warehouseProductInfo;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "importer_id")
private Importer importer;

@OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, mappedBy = "product", orphanRemoval = true)
@OrderBy("index")
private Set<Image> images = new LinkedHashSet<Image>();
Most of the associations are marked as LAZY, because there is no need to fetch all of them every time we load a Product. The warehouse is only needed when displaying the stock information. The Importer is used in certain displays only, and we will fetch it when necessary. The images are lazy since not all views require displaying those images.
Only the company is fetched eagerly, because all our views need it, and in our application a Product must always be considered in the context of a given Company.
It’s a good practice to set the default fetch strategy explicitly (it makes the code more self-descriptive) even if @ManyToOne uses the EAGER fetch option by default.
Use case 1: Loading a product by id generates the following SQL:
SELECT product0_.id AS id1_7_1_,
       product0_.code AS code2_7_1_,
       product0_.company_id AS company_4_7_1_,
       product0_.importer_id AS importer5_7_1_,
       product0_.name AS name3_7_1_,
       company1_.id AS id1_1_0_,
       company1_.name AS name2_1_0_
FROM product product0_
INNER JOIN company company1_ ON product0_.company_id = company1_.id
WHERE product0_.id = ?
Every time we load through the entity manager the default fetching strategy comes into play, meaning the Company gets fetched along with the Product we are selecting.
Use case 2: Selecting the Product through a JPA-QL query (bypassing the Persistence Context first-level cache):
entityManager.createQuery(
        "select p " +
        "from Product p " +
        "where p.id = :productId", Product.class)
    .setParameter("productId", productId)
    .getSingleResult();
This executes the following SQL query:
SELECT product0_.id AS id1_7_,
       product0_.code AS code2_7_,
       product0_.company_id AS company_4_7_,
       product0_.importer_id AS importer5_7_,
       product0_.name AS name3_7_
FROM product product0_
WHERE product0_.id = ?
So using JPA-QL overrides the default fetching strategy, but it still leaves us vulnerable if we want to navigate the lazy associations. If the Persistence Context is closed, we get a LazyInitializationException when accessing the lazy relationships; if it is still open, navigating them generates additional select queries, which might affect the application performance.
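Both outcomes can be illustrated with a small plain-Java model. This is not how Hibernate implements its proxies: FakeContext and LazyRef are hypothetical names invented for the sketch, and a JDK IllegalStateException stands in for Hibernate's LazyInitializationException.

```java
import java.util.function.Supplier;

/** Plain-Java model of navigating a lazy association; all names are hypothetical. */
final class LazyLoadingSketch {

    /** Stand-in for a persistence context that can be open or closed. */
    static final class FakeContext {
        boolean open = true;
        int extraSelects = 0;
    }

    /** Lazy reference: loads its target on first access, and only while the context is open. */
    static final class LazyRef<T> {
        private final FakeContext context;
        private final Supplier<T> loader;
        private T value;
        private boolean loaded = false;

        LazyRef(FakeContext context, Supplier<T> loader) {
            this.context = context;
            this.loader = loader;
        }

        T get() {
            if (!loaded) {
                if (!context.open) {
                    // Hibernate throws LazyInitializationException in this situation;
                    // a JDK exception stands in for it here.
                    throw new IllegalStateException("no open persistence context");
                }
                context.extraSelects++; // the additional select
                value = loader.get();
                loaded = true;
            }
            return value;
        }
    }

    /** Navigates the lazy reference twice while the context is open; returns the extra selects issued. */
    static int extraSelectsWhileOpen() {
        FakeContext context = new FakeContext();
        LazyRef<String> importer = new LazyRef<>(context, () -> "importer");
        importer.get();
        importer.get(); // already loaded, no second select
        return context.extraSelects;
    }

    /** Navigates the lazy reference after the context was closed; returns true if the access failed. */
    static boolean failsAfterClose() {
        FakeContext context = new FakeContext();
        LazyRef<String> importer = new LazyRef<>(context, () -> "importer");
        context.open = false;
        try {
            importer.get();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("extra selects while open: " + extraSelectsWhileOpen());
        System.out.println("fails after close: " + failsAfterClose());
    }
}
```

Neither outcome is visible in the query that loaded the entity, which is why lazy navigation problems tend to show up far from the code that caused them.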
Use case 3: Selecting a list of Products with their associated warehouse and importer associations:
entityManager.createQuery(
        "select p " +
        "from Product p " +
        "inner join fetch p.warehouseProductInfo " +
        "inner join fetch p.importer", Product.class)
    .getResultList();
This generates the following SQL:
SELECT product0_.id AS id1_7_0_,
       warehousep1_.id AS id1_11_1_,
       importer2_.id AS id1_3_2_,
       product0_.code AS code2_7_0_,
       product0_.company_id AS company_4_7_0_,
       product0_.importer_id AS importer5_7_0_,
       product0_.name AS name3_7_0_,
       warehousep1_.quantity AS quantity2_11_1_,
       importer2_.name AS name2_3_2_
FROM product product0_
INNER JOIN warehouseproductinfo warehousep1_ ON product0_.id = warehousep1_.id
INNER JOIN importer importer2_ ON product0_.importer_id = importer2_.id
Here you can see that the JPA-QL explicit fetch strategy overrides the default strategy. Because we haven’t specified the “join fetch” with Company, the EAGER association is ignored.
Use case 4: Selecting a list of Images while explicitly join fetching the Product overrides the default strategy too, even though the selected entity is not the one whose strategy we are overriding:
entityManager.createQuery(
        "select i " +
        "from Image i " +
        "inner join fetch i.product p " +
        "where p.id = :productId", Image.class)
    .setParameter("productId", productId)
    .getResultList();
This generates the following SQL:
SELECT image0_.id AS id1_2_0_,
       product1_.id AS id1_7_1_,
       image0_.index AS index2_2_0_,
       image0_.name AS name3_2_0_,
       image0_.product_id AS product_4_2_0_,
       product1_.code AS code2_7_1_,
       product1_.company_id AS company_4_7_1_,
       product1_.importer_id AS importer5_7_1_,
       product1_.name AS name3_7_1_
FROM image image0_
INNER JOIN product product1_ ON image0_.product_id = product1_.id
WHERE product1_.id = ?
There is one more thing I have to add, and it’s about the @OneToOne relationship for warehouseProductInfo. For optional @OneToOne associations, the LAZY attribute is ignored, since Hibernate must know whether it has to populate your Entity with null or with a proxy. In our example, it makes sense to make it mandatory, since every product is located in a warehouse anyway. In other cases you can simply make the association unidirectional, and keep only the side controlling the link (the one where the foreign key resides).
- Code available on GitHub.