Core Java

5 Refactoring Principles by Example

This post features five (mostly well-known) refactoring principles applied when refactoring real open-source code (Gradle Modules Plugin).

Context

When I worked on separate compilation of module-info.java for Gradle Modules Plugin (PR #73), I noticed potential for some refactoring. As a result, I filed issue #79 and later resolved it with PR #88 (not merged yet), where I refactored the code.

As it turned out, the refactoring was much wider than I initially thought. Here, I present parts of this PR as examples of the refactoring principles that I applied there.

Refactoring Principles

Note: the list presented here is by no means comprehensive, and the principles aren’t original (I present them in my own voice and according to my own understanding, though). As I see it, the greatest value of this post is in the real-life examples that accompany the principles.

The five principles presented here are:

  1. Hide “how” with “what”
  2. Aim for consistency
  3. Avoid deep nesting
  4. Separate concerns (= Single Responsibility Principle)
  5. Avoid duplication wisely (= Don’t Repeat Yourself)

1. Hide “How” With “What”

This principle is just a part/rephrasing of the clean code principle, as formulated by Robert Martin.

To me, hiding “how” with “what” means extracting classes and methods whenever:

  • I can identify a distinct, non-trivial function performed by some piece of code, and
  • I can hide this non-triviality behind a method with a meaningful name.

Example 1: updateRelativePath

Here’s a snippet from RunTaskMutator before the refactoring:

1
2
3
4
5
mainDistribution.contents(copySpec -> copySpec.filesMatching(patchModuleExtension.getJars(), action -> {
  RelativePath relativePath = action.getRelativePath().getParent().getParent()
      .append(true, "patchlibs", action.getName());
  action.setRelativePath(relativePath);
}));

and here’s the snippet after the refactoring:

1
2
3
mainDistribution.contents(
    copySpec -> copySpec.filesMatching(patchModuleExtension.getJars(), this::updateRelativePath)
);

To sum up, we:

  • hid how to update the relative path
  • with what we do (= the fact that we update it).

Thanks to such refactoring, it’s much easier to grasp what happens to mainDistribution.

For reference, the content of updateRelativePath is available here.

Example 2: buildAddReadsStream & buildAddOpensStream

This is how a part of the TestTask class looked before the refactoring:

1
2
3
4
5
6
7
8
9
TestEngine.select(project).ifPresent(testEngine -> {
  args.addAll(List.of("--add-reads", moduleName + "=" + testEngine.moduleName));
 
  Set<File> testDirs = testSourceSet.getOutput().getClassesDirs().getFiles();
  getPackages(testDirs).forEach(p -> {
    args.add("--add-opens");
    args.add(String.format("%s/%s=%s", moduleName, p, testEngine.addOpens));
  });
});

and here’s how it looks afterwards:

1
2
3
4
TestEngine.select(project).ifPresent(testEngine -> Stream.concat(
    buildAddReadsStream(testEngine),
    buildAddOpensStream(testEngine)
).forEach(jvmArgs::add));

Again, we:

  • hid how the values of --add-reads and --add-opens options are specified
  • with what we do (= the fact that we specify them).

For reference, the contents of buildAddReadsStream and buildAddOpensStream are available here.

2. Aim for Consistency

This is very general, but I mean any kind of reasonable consistency that we can get.

For example, Donald Raab‘s blog post about symmetry is a great example of striving for consistency. Needless to say, I agree with his conclusion wholeheartedly:

A large system with good symmetry becomes easier to understand, because you can detect and expect recurring patterns.

Donald Raab, Symmetric Sympathy

In the case of Gradle Modules Plugin, this boiled down primarily to extracting AbstractModulePluginTask base class and unifying the task finding & configuration dispatching procedure.

For example, JavadocTask and TestTask before the refactoring were:

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
public class JavadocTask {
  public void configureJavaDoc(Project project) {
    Javadoc javadoc = (Javadoc) project.getTasks().findByName(JavaPlugin.JAVADOC_TASK_NAME);
    if (javadoc != null) {
      // ...
    }
  }
}
 
public class TestTask {
  public void configureTestJava(Project project, String moduleName) {
    Test testJava = (Test) project.getTasks().findByName(JavaPlugin.TEST_TASK_NAME);
    // ... (no null check)
  }
}

and afterwards, they are:

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
public class JavadocTask extends AbstractModulePluginTask {
  public void configureJavaDoc() {
    helper().findTask(JavaPlugin.JAVADOC_TASK_NAME, Javadoc.class)
        .ifPresent(this::configureJavaDoc);
  }
 
  private void configureJavaDoc(Javadoc javadoc) { /* ... */ }
}
 
public class TestTask extends AbstractModulePluginTask {
  public void configureTestJava() {
    helper().findTask(JavaPlugin.TEST_TASK_NAME, Test.class)
        .ifPresent(this::configureTestJava);
  }
 
  private void configureTestJava(Test testJava) { /* ... */ }
}

For reference: JavaDocTask diff and TestTask diff.

3. Avoid Deep Nesting

This is rather obvious, I guess. For me, deep nesting of control structures is extremely hard to read and grasp.

As a consequence, I refactored the following getPackages method:

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
private static Set<String> getPackages(Collection<File> dirs) {
  Set<String> packages = new TreeSet<>();
  for (File dir : dirs) {
    if (dir.isDirectory()) {
      Path dirPath = dir.toPath();
      try (Stream<Path> entries = Files.walk(dirPath)) {
        entries.forEach(entry -> {
          if (entry.toFile().isFile()) {
            String path = entry.toString();
            if (isValidClassFileReference(path)) {
              Path relPath = dirPath.relativize(entry.getParent());
              packages.add(relPath.toString().replace(File.separatorChar, '.'));
            }
          }
        });
      } catch (IOException e) {
        throw new GradleException("Failed to scan " + dir, e);
      }
    }
  }
  return packages;
}

like below:

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
private static Set<String> getPackages(Collection<File> dirs) {
  return dirs.stream()
      .map(File::toPath)
      .filter(Files::isDirectory)
      .flatMap(TestTask::buildRelativePathStream)
      .map(relPath -> relPath.toString().replace(File.separatorChar, '.'))
      .collect(Collectors.toCollection(TreeSet::new));
}
 
private static Stream<Path> buildRelativePathStream(Path dir) {
  try {
    return Files.walk(dir)
        .filter(Files::isRegularFile)
        .filter(path -> isValidClassFileReference(path.toString()))
        .map(path -> dir.relativize(path.getParent()));
  } catch (IOException e) {
    throw new GradleException("Failed to scan " + dir, e);
  }
}

Full diff available here.

4. Separate Concerns

SRP (Single Responsibility Principle) is a well-known software design principle. Here, we can see its application in extracting StartScriptsMutator from RunTaskMutator.

Before:

1
2
3
4
5
6
7
8
9
public class RunTaskMutator {
  // common fields
 
  public void configureRun() { /* ... */ }
 
  public void updateStartScriptsTask(String taskStartScriptsName) { /* ... */ }
 
  // 12 other methods (incl. 2 common methods)
}

After:

01
02
03
04
05
06
07
08
09
10
11
12
13
public class RunTaskMutator extends AbstractExecutionMutator {
 
  public void configureRun() { /* ... */ }
   
  // 2 other methods
}
 
public class StartScriptsMutator extends AbstractExecutionMutator {
 
  public void updateStartScriptsTask(String taskStartScriptsName) { /* ... */ }
 
  // 8 other methods
}

Thanks to extracting StartScriptsMutator, it’s much easier to comprehend the scopes of:

For reference: the commit with the above extraction.

5. Avoid Duplication Wisely

DRY (Don’t Repeat Yourself) is another well-known software development principle. However, in my experience, this principle is sometimes taken too far, which results in code that isn’t duplicated but is also far too complex.

In other words, we should deduplicate only when the cost-gain ratio is positive:

  • cost: refactoring time, resulting complexity, etc.
  • gain: no duplication (or more strictly: single source of truth).

One such example from Gradle Modules Plugin (where the cost-gain ratio is close to zero but still positive, in my opinion) is the introduction of PatchModuleResolver.

Below, there’s a code snippet before refactoring that consists of:

  1. A PatchModuleExtension.configure method.
  2. A place where it’s used (TestTask).
  3. A place where it can’t be used (RunTaskMutator).
  4. Another place where it can’t be used (JavadocTask).
01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// 1. PatchModuleExtension
public List<String> configure(FileCollection classpath) {
  List<String> args = new ArrayList<>();
 
  config.forEach(patch -> {
        String[] split = patch.split("=");
 
        String asPath = classpath.filter(jar -> jar.getName().endsWith(split[1])).getAsPath();
 
        if (asPath.length() > 0) {
          args.add("--patch-module");
          args.add(split[0] + "=" + asPath);
        }
      }
  );
 
  return args;
}
 
// 2. TestTask
args.addAll(patchModuleExtension.configure(testJava.getClasspath()));
 
// 3. RunTaskMutator
patchModuleExtension.getConfig().forEach(patch -> {
      String[] split = patch.split("=");
      jvmArgs.add("--patch-module");
      jvmArgs.add(split[0] + "=" + PATCH_LIBS_PLACEHOLDER + "/" + split[1]);
    }
);
 
// 4. JavadocTask
patchModuleExtension.getConfig().forEach(patch -> {
      String[] split = patch.split("=");
 
      String asPath = javadoc.getClasspath().filter(jar -> jar.getName().endsWith(split[1])).getAsPath();
 
      if (asPath != null && asPath.length() > 0) {
        options.addStringOption("-patch-module", split[0] + "=" + asPath);
      }
    }
);

After introducing PatchModuleResolver, the code looks as follows:

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
// 1. PatchModuleExtension
public PatchModuleResolver resolve(FileCollection classpath) {
  return resolve(jarName -> classpath.filter(jar -> jar.getName().endsWith(jarName)).getAsPath());
}
 
public PatchModuleResolver resolve(UnaryOperator<String> jarNameResolver) {
  return new PatchModuleResolver(this, jarNameResolver);
}
 
// 2. TestTask
patchModuleExtension.resolve(testJava.getClasspath()).toArgumentStream().forEach(jvmArgs::add);
 
// 3. RunTaskMutator
patchModuleExtension.resolve(jarName -> PATCH_LIBS_PLACEHOLDER + "/" + jarName).toArgumentStream().forEach(jvmArgs::add);
 
// 4. JavadocTask
patchModuleExtension.resolve(javadoc.getClasspath()).toValueStream()
    .forEach(value -> options.addStringOption("-patch-module", value));

Thanks to refactoring, now there’s only one place (PatchModuleResolver) where we split the config entries of the PatchModuleExtension class.

For reference: diffs 1, 2, 3, 4.

Summary

In this post, I’ve presented the following five refactoring principles:

  1. Hide “how” with “what”
  2. Aim for consistency
  3. Avoid deep nesting
  4. Separate concerns
  5. Avoid duplication wisely

Each principle was accompanied by a real-life example, which — hopefully — showed how adhering to the principle resulted in neat code.

Published on Java Code Geeks with permission by Tomasz Linkowski, partner at our JCG program. See the original article here: 5 Refactoring Principles by Example

Opinions expressed by Java Code Geeks contributors are their own.

Tomasz Linkowski

Tomasz is an enthusiast of clean coding and software craftsmanship, who is particularly interested in problem domain modeling & API design
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Philippe Bourgau
5 years ago

Thanks for the reference to the symmetry blog post. I’ll read this very soon.

I agree that often, DRY is taken too far. I have seen many systems with factored out technical frameworks or libraries that end up overly complex and difficult to maintain. DDD and Bounded contexts proposes an interesting answer to this. We should avoid duplication within a bounded context. If too much duplication is appearing between 2 bounded contexts, then we can extract a supportive bounded context and reuse it in both. The topic is wide and a lot can be said though…

Interesting post!

Back to top button