5 Refactoring Principles by Example
This post features five (mostly well-known) refactoring principles applied when refactoring real open-source code (Gradle Modules Plugin).
Context
When I worked on separate compilation of module-info.java
for Gradle Modules Plugin (PR #73), I noticed potential for some refactoring. As a result, I filed issue #79 and later resolved it with PR #88 (not merged yet), where I refactored the code.
As it turned out, the refactoring was much wider than I initially thought. Here, I present parts of this PR as examples of the refactoring principles that I applied there.
Refactoring Principles
Note: the list presented here is by no means comprehensive, and the principles aren’t original (I present them in my own voice and according to my own understanding, though). As I see it, the greatest value of this post is in the real-life examples that accompany the principles.
The five principles presented here are:
- Hide “how” with “what”
- Aim for consistency
- Avoid deep nesting
- Separate concerns (= Single Responsibility Principle)
- Avoid duplication wisely (= Don’t Repeat Yourself)
1. Hide “How” With “What”
This principle is just a part/rephrasing of the clean code principle, as formulated by Robert Martin.
To me, hiding “how” with “what” means extracting classes and methods whenever:
- I can identify a distinct, non-trivial function performed by some piece of code, and
- I can hide this non-triviality behind a method with a meaningful name.
Example 1: updateRelativePath
Here’s a snippet from RunTaskMutator
before the refactoring:
1 2 3 4 5 | mainDistribution.contents(copySpec -> copySpec.filesMatching(patchModuleExtension.getJars(), action -> { RelativePath relativePath = action.getRelativePath().getParent().getParent() .append( true , "patchlibs" , action.getName()); action.setRelativePath(relativePath); })); |
and here’s the snippet after the refactoring:
1 2 3 | mainDistribution.contents( copySpec -> copySpec.filesMatching(patchModuleExtension.getJars(), this ::updateRelativePath) ); |
To sum up, we:
- hid how to update the relative path
- with what we do (= the fact that we update it).
Thanks to such refactoring, it’s much easier to grasp what happens to mainDistribution
.
For reference, the content of updateRelativePath
is available here.
Example 2: buildAddReadsStream
& buildAddOpensStream
This is how a part of the TestTask
class looked before the refactoring:
1 2 3 4 5 6 7 8 9 | TestEngine.select(project).ifPresent(testEngine -> { args.addAll(List.of( "--add-reads" , moduleName + "=" + testEngine.moduleName)); Set<File> testDirs = testSourceSet.getOutput().getClassesDirs().getFiles(); getPackages(testDirs).forEach(p -> { args.add( "--add-opens" ); args.add(String.format( "%s/%s=%s" , moduleName, p, testEngine.addOpens)); }); }); |
and here’s how it looks afterwards:
1 2 3 4 | TestEngine.select(project).ifPresent(testEngine -> Stream.concat( buildAddReadsStream(testEngine), buildAddOpensStream(testEngine) ).forEach(jvmArgs::add)); |
Again, we:
- hid how the values of
--add-reads
and--add-opens
options are specified - with what we do (= the fact that we specify them).
For reference, the contents of buildAddReadsStream
and buildAddOpensStream
are available here.
2. Aim for Consistency
This is very general, but I mean any kind of reasonable consistency that we can get.
For example, Donald Raab‘s blog post about symmetry is a great example of striving for consistency. Needless to say, I agree with his conclusion wholeheartedly:
A large system with good symmetry becomes easier to understand, because you can detect and expect recurring patterns.
Donald Raab, Symmetric Sympathy
In the case of Gradle Modules Plugin, this boiled down primarily to extracting AbstractModulePluginTask
base class and unifying the task finding & configuration dispatching procedure.
For example, JavadocTask
and TestTask
before the refactoring were:
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 | public class JavadocTask { public void configureJavaDoc(Project project) { Javadoc javadoc = (Javadoc) project.getTasks().findByName(JavaPlugin.JAVADOC_TASK_NAME); if (javadoc != null ) { // ... } } } public class TestTask { public void configureTestJava(Project project, String moduleName) { Test testJava = (Test) project.getTasks().findByName(JavaPlugin.TEST_TASK_NAME); // ... (no null check) } } |
and afterwards, they are:
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 | public class JavadocTask extends AbstractModulePluginTask { public void configureJavaDoc() { helper().findTask(JavaPlugin.JAVADOC_TASK_NAME, Javadoc. class ) .ifPresent( this ::configureJavaDoc); } private void configureJavaDoc(Javadoc javadoc) { /* ... */ } } public class TestTask extends AbstractModulePluginTask { public void configureTestJava() { helper().findTask(JavaPlugin.TEST_TASK_NAME, Test. class ) .ifPresent( this ::configureTestJava); } private void configureTestJava(Test testJava) { /* ... */ } } |
For reference: JavaDocTask
diff and TestTask
diff.
3. Avoid Deep Nesting
This is rather obvious, I guess. For me, deep nesting of control structures is extremely hard to read and grasp.
As a consequence, I refactored the following getPackages
method:
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 | private static Set<String> getPackages(Collection<File> dirs) { Set<String> packages = new TreeSet<>(); for (File dir : dirs) { if (dir.isDirectory()) { Path dirPath = dir.toPath(); try (Stream<Path> entries = Files.walk(dirPath)) { entries.forEach(entry -> { if (entry.toFile().isFile()) { String path = entry.toString(); if (isValidClassFileReference(path)) { Path relPath = dirPath.relativize(entry.getParent()); packages.add(relPath.toString().replace(File.separatorChar, '.' )); } } }); } catch (IOException e) { throw new GradleException( "Failed to scan " + dir, e); } } } return packages; } |
like below:
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 | private static Set<String> getPackages(Collection<File> dirs) { return dirs.stream() .map(File::toPath) .filter(Files::isDirectory) .flatMap(TestTask::buildRelativePathStream) .map(relPath -> relPath.toString().replace(File.separatorChar, '.' )) .collect(Collectors.toCollection(TreeSet:: new )); } private static Stream<Path> buildRelativePathStream(Path dir) { try { return Files.walk(dir) .filter(Files::isRegularFile) .filter(path -> isValidClassFileReference(path.toString())) .map(path -> dir.relativize(path.getParent())); } catch (IOException e) { throw new GradleException( "Failed to scan " + dir, e); } } |
Full diff available here.
4. Separate Concerns
SRP (Single Responsibility Principle) is a well-known software design principle. Here, we can see its application in extracting StartScriptsMutator
from RunTaskMutator
.
Before:
1 2 3 4 5 6 7 8 9 | public class RunTaskMutator { // common fields public void configureRun() { /* ... */ } public void updateStartScriptsTask(String taskStartScriptsName) { /* ... */ } // 12 other methods (incl. 2 common methods) } |
After:
01 02 03 04 05 06 07 08 09 10 11 12 13 | public class RunTaskMutator extends AbstractExecutionMutator { public void configureRun() { /* ... */ } // 2 other methods } public class StartScriptsMutator extends AbstractExecutionMutator { public void updateStartScriptsTask(String taskStartScriptsName) { /* ... */ } // 8 other methods } |
Thanks to extracting StartScriptsMutator
, it’s much easier to comprehend the scopes of:
- configuring the
run
task per se, - configuring the related
startScripts
task.
For reference: the commit with the above extraction.
5. Avoid Duplication Wisely
DRY (Don’t Repeat Yourself) is another well-known software development principle. However, in my experience, this principle is sometimes taken too far, which results in code that isn’t duplicated but is also far too complex.
In other words, we should deduplicate only when the cost-gain ratio is positive:
- cost: refactoring time, resulting complexity, etc.
- gain: no duplication (or more strictly: single source of truth).
One such example from Gradle Modules Plugin (where the cost-gain ratio is close to zero but still positive, in my opinion) is the introduction of PatchModuleResolver
.
Below, there’s a code snippet before refactoring that consists of:
- A
PatchModuleExtension.configure
method. - A place where it’s used (
TestTask
). - A place where it can’t be used (
RunTaskMutator
). - Another place where it can’t be used (
JavadocTask
).
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | // 1. PatchModuleExtension public List<String> configure(FileCollection classpath) { List<String> args = new ArrayList<>(); config.forEach(patch -> { String[] split = patch.split( "=" ); String asPath = classpath.filter(jar -> jar.getName().endsWith(split[ 1 ])).getAsPath(); if (asPath.length() > 0 ) { args.add( "--patch-module" ); args.add(split[ 0 ] + "=" + asPath); } } ); return args; } // 2. TestTask args.addAll(patchModuleExtension.configure(testJava.getClasspath())); // 3. RunTaskMutator patchModuleExtension.getConfig().forEach(patch -> { String[] split = patch.split( "=" ); jvmArgs.add( "--patch-module" ); jvmArgs.add(split[ 0 ] + "=" + PATCH_LIBS_PLACEHOLDER + "/" + split[ 1 ]); } ); // 4. JavadocTask patchModuleExtension.getConfig().forEach(patch -> { String[] split = patch.split( "=" ); String asPath = javadoc.getClasspath().filter(jar -> jar.getName().endsWith(split[ 1 ])).getAsPath(); if (asPath != null && asPath.length() > 0 ) { options.addStringOption( "-patch-module" , split[ 0 ] + "=" + asPath); } } ); |
After introducing PatchModuleResolver
, the code looks as follows:
01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 | // 1. PatchModuleExtension public PatchModuleResolver resolve(FileCollection classpath) { return resolve(jarName -> classpath.filter(jar -> jar.getName().endsWith(jarName)).getAsPath()); } public PatchModuleResolver resolve(UnaryOperator<String> jarNameResolver) { return new PatchModuleResolver( this , jarNameResolver); } // 2. TestTask patchModuleExtension.resolve(testJava.getClasspath()).toArgumentStream().forEach(jvmArgs::add); // 3. RunTaskMutator patchModuleExtension.resolve(jarName -> PATCH_LIBS_PLACEHOLDER + "/" + jarName).toArgumentStream().forEach(jvmArgs::add); // 4. JavadocTask patchModuleExtension.resolve(javadoc.getClasspath()).toValueStream() .forEach(value -> options.addStringOption( "-patch-module" , value)); |
Thanks to refactoring, now there’s only one place (PatchModuleResolver
) where we split the config
entries of the PatchModuleExtension
class.
For reference: diffs 1, 2, 3, 4.
Summary
In this post, I’ve presented the following five refactoring principles:
- Hide “how” with “what”
- Aim for consistency
- Avoid deep nesting
- Separate concerns
- Avoid duplication wisely
Each principle was accompanied by a real-life example, which — hopefully — showed how adhering to the principle resulted in neat code.
Published on Java Code Geeks with permission by Tomasz Linkowski, partner at our JCG program. See the original article here: 5 Refactoring Principles by Example Opinions expressed by Java Code Geeks contributors are their own. |
Thanks for the reference to the symmetry blog post. I’ll read this very soon.
I agree that often, DRY is taken too far. I have seen many systems with factored out technical frameworks or libraries that end up overly complex and difficult to maintain. DDD and Bounded contexts proposes an interesting answer to this. We should avoid duplication within a bounded context. If too much duplication is appearing between 2 bounded contexts, then we can extract a supportive bounded context and reuse it in both. The topic is wide and a lot can be said though…
Interesting post!
Thanks for your comment!
Yes, I’m learning DDD right now, and I really like the ideas. I think that’s where I got this “single source of truth” concept. If you have two bounded contexts, it means you have two sources of truth. Of course, if you realize that some parts actually correspond to a _single_ source of truth, it’s like you say – you can treat them as a supporting domain.