Caching Strategy Reminder for Maven-Based Docker Builds
My local development feedback loop between code change and runnable container was annoyingly long on a Maven-based project I was recently working on. I wanted to speed things up.
The scenario was something like this:
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- …
I didn’t really enjoy the “maven downloads the world” steps, and wanted to minimize the number of times it needed to run.
Let’s follow along as I make my situation a little better. For illustration, we’ll start off with this generic archetype-created skeleton project:
App.java:

    package com.keyholesoftware.blog;

    public class App {
        public static void main( String[] args ) {
            System.out.println( "Hello World!" );
        }
    }

AppTest.java:

    package com.keyholesoftware.blog;

    import junit.framework.*;

    public class AppTest extends TestCase {
        public void testApp() {
            assertTrue( true );
        }
    }

Dockerfile:

    FROM maven:3.2.5-jdk-8u40
    RUN mkdir --parents /usr/src/app
    WORKDIR /usr/src/app
    ADD . /usr/src/app
    RUN mvn verify

pom.xml:

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.keyholesoftware.blog</groupId>
      <artifactId>khs-docker-caching-blog</artifactId>
      <version>1.0-SNAPSHOT</version>
      <dependencies>
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>3.8.1</version>
          <scope>test</scope>
        </dependency>
      </dependencies>
    </project>
Things aren’t that bad when I am building back-to-back, e.g.

    $ docker build .
    ...
    $ docker build .
    ...

Notice that the second build is fast as everything is cached up. But what about when we do something like this:
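
    $ docker build .
    ...
    $ touch src/main/java/com/keyholesoftware/blog/App.java
    ...
    $ docker build .
    ...

Notice that the second build is unnecessarily slowed down by the redownload portion: the changed source invalidates the cached ADD layer, so the mvn verify step after it runs again from scratch and fetches every dependency a second time. (Newer Docker releases ignore file modification times when checking the cache, so there you need a real content change rather than a bare touch to see this.)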
I sat around and despaired for a while until I remembered the tricks I’ve seen with selective caching:

    FROM maven:3.2.5-jdk-8u40
    RUN mkdir --parents /usr/src/app
    WORKDIR /usr/src/app

    # selectively add the POM file
    ADD pom.xml /usr/src/app/

    # get all the downloads out of the way
    RUN mvn verify clean --fail-never

    ADD . /usr/src/app
    RUN mvn verify

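To see why adding the POM on its own helps, here is a rough sketch that imitates the cache check with plain checksums. It is only an illustration: Docker's real cache lookup is more involved, and the key_for helper and the throwaway project it builds are made up for this example.

```shell
#!/bin/sh
set -eu

# Build a tiny throwaway project in a temp directory (hypothetical layout).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/main/java"
echo '<project/>' > "$work/pom.xml"
echo 'class App {}' > "$work/src/main/java/App.java"

# Hypothetical stand-in for a layer cache key:
# a checksum over the contents of every file under the given paths.
key_for() {
  (cd "$work" && find "$@" -type f | sort | xargs cat | cksum)
}

pom_before=$(key_for pom.xml)
all_before=$(key_for .)

# Simulate a source change that leaves the POM alone.
echo '// edited' >> "$work/src/main/java/App.java"

pom_after=$(key_for pom.xml)
all_after=$(key_for .)

# An unchanged key means the step could be served from cache.
[ "$pom_before" = "$pom_after" ] && pom_layer=cached || pom_layer=rebuilt
[ "$all_before" = "$all_after" ] && src_layer=cached || src_layer=rebuilt

echo "ADD pom.xml -> $pom_layer"
echo "ADD .       -> $src_layer"
```

The POM-only step keeps its key across source edits, so the download-heavy RUN that follows it can stay cached, while everything from the full ADD onward is redone.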
Let’s try that sequence again.

    $ docker build .
    ...
    $ touch src/main/java/com/keyholesoftware/blog/App.java
    ...
    $ docker build .
    ...

Getting better, but there were still a few downloads going on during the second build. They are related to the Surefire test plugin, which chooses its test provider dynamically, so the provider was not fetched during the POM-only step (presumably because there were no tests to run yet). This process actually helps us find downloads that are chosen dynamically so we can lock them down. In this case, we lock down our Surefire provider.

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.keyholesoftware.blog</groupId>
      <artifactId>khs-docker-caching-blog</artifactId>
      <version>1.0-SNAPSHOT</version>
      <dependencies>
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>3.8.1</version>
          <scope>test</scope>
        </dependency>
      </dependencies>
      <properties>
        <surefire.version>2.8.1</surefire.version>
      </properties>
      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <version>${surefire.version}</version>
            <!-- lock down our surefire provider -->
            <dependencies>
              <dependency>
                <groupId>org.apache.maven.surefire</groupId>
                <artifactId>surefire-junit3</artifactId>
                <version>${surefire.version}</version>
              </dependency>
            </dependencies>
          </plugin>
        </plugins>
      </build>
    </project>

Let’s try that sequence again.

    $ docker build .
    ...
    $ touch src/main/java/com/keyholesoftware/blog/App.java
    ...
    $ docker build .
    ...

So now, unless we change the POM, we don’t have to redownload anything. Nice.
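One way to check this rather than eyeballing the output is to save the build log and count the download lines. The helper below is a small sketch, not part of the original project: count_downloads and build.log are hypothetical names, and it assumes Maven announces each fetch with a line containing "Downloading".

```shell
#!/bin/sh

# count_downloads LOGFILE
# Print how many lines of a saved build log mention a Maven download.
# Assumption: each fetch shows up as a line containing "Downloading".
count_downloads() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c 'Downloading' "$1" || true
}

# Usage (needs Docker, so it is left commented out here):
#   docker build . 2>&1 | tee build.log
#   count_downloads build.log
```

After a source-only change, the count for the second build should drop to zero if every download really has been moved into the cached POM-only step.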
Now the scenario is something like this:
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven compiles my project
- docker run
- …
Notice the “maven downloads the world” step only happens once (unless I actually change the POM, of course).
Final Thoughts
There might be better ways to handle some of this (e.g. the dependency:resolve and dependency:resolve-plugins goals, though they don’t seem to work as thoroughly, and probably something with fig), but I mainly wanted to highlight a possible use of selective adding/caching.
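For comparison, the dependency-plugin route might look roughly like the sketch below, which writes out an alternative Dockerfile. The Dockerfile.resolve file name is made up for illustration, and as noted, these goals may not pre-fetch everything that mvn verify later needs.

```shell
#!/bin/sh
set -eu

# Hypothetical file name, kept separate so the main Dockerfile is untouched.
out=${1:-Dockerfile.resolve}

cat > "$out" <<'EOF'
FROM maven:3.2.5-jdk-8u40
RUN mkdir --parents /usr/src/app
WORKDIR /usr/src/app

# selectively add the POM file
ADD pom.xml /usr/src/app/

# ask the dependency plugin to fetch dependencies and plugins up front
RUN mvn dependency:resolve dependency:resolve-plugins

ADD . /usr/src/app
RUN mvn verify
EOF

echo "wrote $out"
```

Building with docker build -f Dockerfile.resolve . and then repeating the touch-and-rebuild sequence should show how many downloads this variant still leaves for the final step.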
Other Notes:
- For you Ruby+Rakefile, Python+requirements.txt, Node+package.json, Go+GoDeps.json etc. folks — Maven doesn’t have an explicit ‘install dependencies’ step. See Introduction to the Build Lifecycle if you’re bored.
- For you Gradle folks, I haven’t used Gradle much. What are your thoughts?
- The source code for this post is at: https://github.com/in-the-keyhole/khs-docker-caching-blog
Thanks for reading!
Reference: Caching Strategy Reminder for Maven-Based Docker Builds from our JCG partner Luke Patterson at the Keyhole Software blog.