How we accidentally doubled our JDBC traffic with Hibernate
This post describes my recent experience with Hibernate ORM. Over the years the framework has never ceased to amaze me with the kinds of performance problems you can end up with while using it. I thought I had seen it all, but this time I was caught off guard yet again.
The foundation for the problem was laid while we were troubleshooting a completely unrelated library that was causing problems on one of our production servers. To gather more insight into the original issue, we increased the verbosity in our logback configuration.
So all it took to create the effect described below was a mere log verbosity change for the org.* package from ERROR to WARN. After carrying out the necessary tests, we harvested the information from the logs and forgot to change the log level back.
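For context, the change itself was as trivial as it sounds. Here is a minimal sketch of that kind of tweak, assuming a plain logback.xml; the appender, file name and pattern are illustrative, not our actual configuration:

    <configuration>
      <appender name="FILE" class="ch.qos.logback.core.FileAppender">
        <file>app.log</file>
        <encoder>
          <pattern>%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
        </encoder>
      </appender>

      <!-- was: <logger name="org" level="ERROR"/> -->
      <logger name="org" level="WARN"/>

      <root level="INFO">
        <appender-ref ref="FILE"/>
      </root>
    </configuration>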
The problem started to reveal itself the next day, shortly after the log configuration had propagated to production. All of a sudden our monitoring started reporting problems left and right. As we eat our own dog food and monitor our own services with the Plumbr Performance Monitoring solution, we were greeted with information that certain services published to end users were facing latency-related issues.
Checking the exposed information, it became clear that everything seemed to be impacted. Instead of the more typical situation where poorly-performing code is localized within a single service or module, this time the entire JVM seemed to misbehave. In addition, for 99% of the transactions the latency was only barely affected, but the worst-case latency of almost every service had gone through the roof.
Luckily for us, the root cause of the problem was staring right at us. The list of root causes detected by Plumbr exposed the culprit down to a single line in the source code. What we were facing were tens of thousands of SHOW WARNINGS statements being executed over JDBC.
In addition, Plumbr root cause detection also exposed the call stacks through which the calls were made:
com.mysql.jdbc.StatementImpl.executeQuery():1500
com.mysql.jdbc.SQLError.convertShowWarningsToSQLWarnings():714
com.mysql.jdbc.SQLError.convertShowWarningsToSQLWarnings():666
com.mysql.jdbc.StatementImpl.getWarnings():2299
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.getWarnings():N/A
org.hibernate.engine.jdbc.spi.SqlExceptionHelper.handleAndClearWarnings():320
org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logAndClearWarnings():273
org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.close():529
... cut for brevity ...
org.hibernate.jpa.spi.AbstractEntityManagerImpl.merge():1196
sun.reflect.GeneratedMethodAccessor.invoke():N/A
sun.reflect.DelegatingMethodAccessorImpl.invoke():43
java.lang.reflect.Method.invoke():606
org.springframework.orm.jpa.SharedEntityManagerCreator$SharedEntityManagerInvocationHandler.invoke():293
com.sun.proxy.$Proxy.merge():N/A
eu.plumbr.portal.service.ServiceDao.save():21
Having the call stack led us directly to Hibernate. Apparently Hibernate contains the following code in the SqlExceptionHelper.handleAndClearWarnings() method:
public void handleAndClearWarnings(Statement statement, WarningHandler handler) {
    // See HHH-9174. Statement#getWarnings can be an expensive call for many JDBC libs. Don't do it unless
    // the log level would actually allow a warning to be logged.
    if (LOG.isEnabled(Level.WARN)) {
        try {
            walkWarnings( statement.getWarnings(), handler );
        }
        catch (SQLException sqlException) {
            // cut for brevity
        }
    }
    // cut for brevity
}
And here we had the culprit, staring us right in the face. The change to the log configuration had enabled WARN-level logging for the Hibernate modules. This in turn caused each and every access to the database to execute a duplicate "SHOW WARNINGS" query, effectively doubling the number of JDBC calls to the database.
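To make the cost concrete, here is a minimal, self-contained sketch of what effectively happens for every statement once WARN is enabled: on cleanup Hibernate calls Statement#getWarnings(), and, as the call stack above shows, with MySQL Connector/J that call issues an extra SHOW WARNINGS round trip. The JDBC URL, credentials and query below are made up for illustration; this is plain JDBC, not Hibernate's actual code:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLWarning;

    public class WarningsCostDemo {
        public static void main(String[] args) throws Exception {
            // Hypothetical connection details, for illustration only
            try (Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/demo", "user", "secret");
                 PreparedStatement ps = con.prepareStatement("SELECT 1")) {

                // First round trip: the query we actually wanted to run
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // consume the result
                    }
                }

                // What Hibernate does on close when WARN-level logging is enabled;
                // on MySQL Connector/J this triggers a second round trip (SHOW WARNINGS)
                for (SQLWarning w = ps.getWarnings(); w != null; w = w.getNextWarning()) {
                    System.out.println(w.getMessage());
                }
            }
        }
    }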
According to Google, the issue is quite widespread. Is this a reasonable design decision by the Hibernate authors? On one hand, the functionality is genuinely useful: if your JDBC access has generated any warnings, you might wish to have this information exposed. And at first glance the current implementation is perfectly logical: to configure whether you want to see warnings, you configure the logging framework.
What does not seem to make sense is the resulting tight coupling between completely unrelated aspects of your system: logging and database performance. Configuring logging doubles the number of database calls? A completely unexpected butterfly effect in its full glory. Good luck troubleshooting your increased database load without the right tools. This time we patched the issue in 20 minutes and moved on with our work, but I can only imagine how long the troubleshooting would have taken had there been no proper monitoring in place.
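For what it is worth, one low-risk way to patch this kind of issue (not necessarily the exact change we shipped) is to pin the offending logger back down while keeping the rest of the org.* hierarchy at WARN. This assumes Hibernate's usual convention of naming loggers after the class:

    <!-- keep org.* at WARN, but stop the warning-walking in Hibernate -->
    <logger name="org.hibernate.engine.jdbc.spi.SqlExceptionHelper" level="ERROR"/>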
Reference: How we accidentally doubled our JDBC traffic with Hibernate from our JCG partner Nikita Salnikov-Tarnovski at the Plumbr Blog.