Gang of Four Patterns With Type-Classes and Implicits in Scala
Type-classes, as they're known within the Scala language, have a wonderful place in library development. They make code open to extension, reduce verbosity and simplify APIs. I've yet to find many patterns in other languages which do the same. A close second, depending on your point of view, would be either generators or decorators in Python (where the latter is just function composition in disguise). This post is highly opinionated; if, after reading, you disagree or feel I missed a point, say so in the comments section.
For those that don't know what a type-class is, the internet is full of smart people describing what they are. In fact, there's a good paper and presentation by Scala's creator Martin Odersky that you could reference (here and here). If none of those suit your needs, the epic Typeclassopedia, which describes the scalaz library's use of type-classes, is enough to give me a headache. For the impatient, it's really simple to get up and running with these things without needing a PhD. All you need to do is see them in action.
In the next few posts I’m going to go over three Gang of Four design patterns and how you can apply them with type-classes:
- Bridge pattern
- Adapter pattern (also known as Wrapper or Translator patterns)
- Mediator pattern
I hope this will give future library authors who are coming from another language a sense of the what, when and why of using these, but I must point out that, as with all things, an overabundance is a sure sign you're doing it wrong. The adapter pattern is the most widely recognized use case for type-classes in the Scala community, so instead of sticking with something the community is largely aware of, I'll start with something less well known. Why? I've recently been using the bridge pattern in a tiny library at my current employer, Novus Partners. If you already know the bridge pattern, you can skip the next part and head straight for the type-class section. If not, keep reading.
Bridge Pattern
The bridge pattern was developed to solve the issue of multiple independent concepts needing to coexist without causing a combinatorial explosion of types. In OO-heavy languages (Java, C#, C++) where people have a tendency to couple ideas via inheritance instead of object composition this pattern appears regularly in refactoring excursions. In essence, objects are paired together at run time and loosely coupled via common interfaces at compile time, often with one concept’s interface passed through to the constructor of the other concept’s concrete implementation.
If you've never heard of composition over inheritance, this pattern is it. Not surprisingly, practitioners and proponents of OO languages feel it not only extends but embodies the best principles of good design. I, myself, will not discount its virtues. Like any design pattern, it solves a real problem in a simple and elegant manner. In FP languages, or hybrid FP-OO languages like Scala, higher-order functions tend to be used in place of one-off abstract interfaces (since functions provide a reasonable and well-understood interface in their own right).
So what’s it look like? Consider the following classes:
class EncryptedFileWriter(cypher: String, key: String) extends FileWriter{
  def write(file: File, content: String) = open(file){
    encrypt(content)
  }
  //more
}

class CompactJsonFileWriter extends FileWriter{
  def write(file: File, content: String) = open(file){
    validate(content)
    compact(content)
  }
  //more
}
Clearly, these two classes perform two different types of writes to a file. One translates content to an encrypted format while the other translates JSON to a compact representation. However, what if I wanted to write to a database? In order to be able to write either encrypted or JSON formats, I'd have to create two more classes specific to the database I was using. See where this is going? If I have M formats and N content destinations, I'd have to create M × N classes.
The bridge pattern says that, in order to minimize the number of classes, I should decouple the concepts of write destination and format into distinct hierarchies which vary independently. Accordingly, in the spirit of good OO design, one concept would be passed to the other either via the constructor or, using higher-order functions, as another argument. In this contrived example we could expect a refactoring to yield:
class FileWriter(file: File) extends Writer{
  def write(content: String, convert: String => String = identity) = open{
    convert(content)
  }
  //more
}

class JsonConvert extends (String => String){
  def apply(input: String): String = //more
}
whereby we’ve taken the additional liberty of making formatting optional. Now Writer just means something that writes and formatting is a String => String operation. Neither needs to care or worry about the implementation of the other at all.
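To see why the refactoring pays off, here is a minimal, self-contained sketch of how the decoupled pieces might compose. The bodies are stand-ins I've filled in so the snippet runs on its own (a println instead of real file IO, a reversed string instead of real encryption); they are not the post's actual implementation.

import java.io.File

trait Writer{
  def write(content: String, convert: String => String = identity): Unit
}

class FileWriter(file: File) extends Writer{
  // stand-in body: a real implementation would open the file and write to it
  def write(content: String, convert: String => String): Unit =
    println(s"writing to ${file.getName}: ${convert(content)}")
}

val compactJson: String => String = _.filterNot(_.isWhitespace) // stand-in for real JSON compaction
val encrypt: String => String     = _.reverse                   // stand-in for real encryption

val writer: Writer = new FileWriter(new File("out.txt"))
writer.write("""{ "active" : true }""", compactJson)  // formatted write
writer.write("secret", compactJson andThen encrypt)   // formats compose for free
writer.write("plain text")                            // no formatting at all

Because formats are just String => String functions, adding a new one never touches the writer hierarchy, and existing formats compose with andThen at the call site.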
Bridge Pattern in Type-Classes
I’d like to think that there are two requirements to using the bridge pattern in practice with type-classes:
- There exist two complementary concepts which need to vary independently
- There is an explicit dependency on something which is implicitly understood but could be expressed through a type signature
Case in point: each database I've used (Postgres, HSQLDB, MS SQL, etc.) implements not only a different version of the ANSI SQL standard (sometimes only 90% of it) but also generally contains its own specific extensions which are non-portable to other databases. When writing queries, I've rarely seen someone codify this explicit relationship in their class or function definitions, even though it's present within the query strings themselves. It's usually understood implicitly by the team that there is a binding to a particular type of database, and this knowledge is passed around by word of mouth, comments in the code, or assumed by virtue of the tech stack in use. To take this even further, there is also a hard dependency on how the JDBC API is used in the context of the type of database. The JDBC documentation clearly warns about things like creating a PreparedStatement or retrieving generated keys: not all JDBC drivers support all operations and not all databases support the same API hooks.
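As a small illustration of that hidden binding (the table and column names here are invented), even something as mundane as paging syntax differs between engines, so the query strings themselves already commit you to a dialect:

// Both strings ask the same question, but neither is portable to the other engine.
val newestUsersMySql = "SELECT * FROM users ORDER BY created_at DESC LIMIT 10" // MySQL / Postgres style
val newestUsersMsSql = "SELECT TOP 10 * FROM users ORDER BY created_at DESC"   // MS SQL Server style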
Taken together (or even separately) we've satisfied the second condition. To satisfy the first, we'll need to talk about connection pooling. There are several fantastic connection pools available on the JVM (BoneCP, C3P0, JDBC-Pool and DBCP, to name a few, although I'd love to hear about more in active development). Each of these libraries is sufficiently different to warrant its own instance of a connection handler, behavior/performance logging and initialization mechanism. To support several pools and databases within the same library, two different concepts (pooling and JDBC API calls) are going to need to coexist in harmony.
An Example Use Case
Let's assume we want a very basic interface to perform the four types of DML query (otherwise known as CRUD ops). Let's go with something like:
import java.sql.ResultSet

trait Query{
  def insert(query: String, args: AnyRef*): List[Int]
  def delete(query: String, args: AnyRef*): Int
  def update(query: String, args: AnyRef*): Int
  def select(query: String, args: AnyRef*): ResultSet
}
This type of interface exposes a straightforward, uniform and understandable API. Even out of context there is very little confusion about what is supposed to happen. That said, if we developed against it without modification, Query would best be expressed as an abstract class with two constructor arguments: an interface for connection pooling and one for DB-specific logic. In short, it would look very Java-like. This is where we turn to type-classes.
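For contrast, here is a rough sketch of that Java-like, constructor-injected shape; the ConnectionPool and SqlDialect names are mine for illustration, not something from the post:

import java.sql.{Connection, ResultSet}

trait ConnectionPool{
  def withConnection[A](f: Connection => A): A
}

trait SqlDialect{
  def select(query: String, args: AnyRef*)(connection: Connection): ResultSet
  // insert/update/delete elided for brevity
}

// every concrete Query has to thread both dependencies through its constructor
abstract class ConstructorInjectedQuery(pool: ConnectionPool, dialect: SqlDialect){
  def select(query: String, args: AnyRef*): ResultSet =
    pool.withConnection(connection => dialect.select(query, args: _*)(connection))
}

Every new pool or database means wiring yet another constructor, which is exactly the coupling the type-class version below avoids.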
As I outlined above, any queries we write carry an implicit assumption about which database they will be run against. Why not explicitly code that into the type signature? And if we'd like to use type-classes to simplify the API, I declare the type-class should follow this rule: a type-class should be referentially transparent, describing how and nothing more. Thus, in choosing between making the pool a type-class or the statement executor a type-class, I think the choice makes itself: neither. Both are side-effecting operations. However, if we were so inclined, the latter could be rewritten to return an IO monad and thus remain "pure."
First we’ll pull all connection pool handling into the Query interface itself:
import java.sql.Connection

trait Query[DB]{
  def insert(query: String, args: AnyRef*)(implicit con: StatementConstructor[DB]) = pool{
    con.insert(query, args: _*)
  }
  def delete(query: String, args: AnyRef*)(implicit con: StatementConstructor[DB]) = pool{
    con.delete(query, args: _*)
  }
  def update(query: String, args: AnyRef*)(implicit con: StatementConstructor[DB]) = pool{
    con.update(query, args: _*)
  }
  def select(query: String, args: AnyRef*)(implicit con: StatementConstructor[DB]) = pool{
    con.select(query, args: _*)
  }

  //left abstract: each connection pool library supplies its own implementation
  protected def pool[A](f: Connection => A): A
}
taking advantage of Scala’s traits to define everything up to but not including the implementation specific to each pool library. Then we’ll define database statement construction in a type-class passed through implicit scope resolution:
import java.sql.{Connection, ResultSet}

trait StatementConstructor[DB]{
  def insert(query: String, args: AnyRef*)(connection: Connection): List[Int]
  def delete(query: String, args: AnyRef*)(connection: Connection): Int
  def update(query: String, args: AnyRef*)(connection: Connection): Int
  def select(query: String, args: AnyRef*)(connection: Connection): ResultSet
}
allowing both to vary independently of one another while still yielding an API that looks exactly the same to the consumer as we first described.
Notice that the complementary nature of these two trait signatures describes a system which is free and open to extension by anyone over the given operations. The database-specific logic for handling PreparedStatements is completely transparent to the consumer. The consumer only needs to know which database to code against (expressed via the type signature) and how to initialize a concrete implementation of Query.
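The post leaves the concrete pieces to the reader, so what follows is only one possible shape, sketched under assumptions: a phantom MySQL marker with its StatementConstructor instance (only select is fleshed out and the JDBC handling is deliberately simplified), plus a DBCP-backed Query that assumes commons-dbcp 1.x on the classpath and a JDBC properties file on disk.

import java.sql.{Connection, ResultSet}

sealed trait MySQL //phantom type: exists only to name the dialect in type signatures

object MySQL{
  //the type-class instance describing how statements are built against MySQL
  implicit val mySqlStatements: StatementConstructor[MySQL] = new StatementConstructor[MySQL]{
    def select(query: String, args: AnyRef*)(connection: Connection): ResultSet = {
      val statement = connection.prepareStatement(query)
      args.zipWithIndex.foreach{ case (arg, i) => statement.setObject(i + 1, arg) }
      statement.executeQuery()
    }
    def insert(query: String, args: AnyRef*)(connection: Connection): List[Int] = ??? //elided
    def delete(query: String, args: AnyRef*)(connection: Connection): Int = ???       //elided
    def update(query: String, args: AnyRef*)(connection: Connection): Int = ???       //elided
  }
}

//the pool-specific half of the bridge: only `pool` knows anything about DBCP
class DBCPBackedQuery[DB](configFile: String) extends Query[DB]{
  private val properties = new java.util.Properties
  properties.load(new java.io.FileInputStream(configFile))

  private val dataSource: javax.sql.DataSource =
    org.apache.commons.dbcp.BasicDataSourceFactory.createDataSource(properties)

  protected def pool[A](f: Connection => A): A = {
    val connection = dataSource.getConnection()
    try f(connection) finally connection.close()
  }
}

Because the instance lives in MySQL's companion object, it sits in the implicit scope of StatementConstructor[MySQL] and is found automatically whenever a Query[MySQL] method is called.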
In theory, code using this system would look like:
val queryPool = new DBCPBackedQuery[MySQL]("myConfigFile")

def activeUsers(query: Query[MySQL]) = {
  val resultSet = query.select("SELECT * FROM active_users")
  //more
}
which I think any future maintainer would find extremely easy to understand. Likewise, any code reviewer would have a guide to how the queries should be written by simply checking the type signature. A win all around.
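As a final hedged illustration (Postgres here is just another hypothetical phantom marker alongside MySQL above), the compiler enforces both halves of that guide:

sealed trait Postgres //hypothetical second dialect marker

// 1. Handing the wrong backend to activeUsers (defined above) is a type error:
//      activeUsers(new DBCPBackedQuery[Postgres]("myConfigFile"))
//    fails to compile, since a Query[Postgres] is not a Query[MySQL].
//
// 2. A Postgres query can't even be written until someone provides an
//    implicit StatementConstructor[Postgres]:
//      def recentOrders(query: Query[Postgres]) =
//        query.select("SELECT * FROM orders ORDER BY created_at DESC LIMIT 50")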
Conclusion
The bridge pattern solves a fundamental software design problem in OO and OO-FP hybrid languages. When developing in Scala with type-classes, we can avoid explicitly parameterizing constructor definitions by deferring the relationship to the function call site via an implicit. This not only simplifies APIs but opens libraries to extension without burdening contributors with a combinatorial cost of extension.
I hope the example was illustrative enough to highlight some of the advantages both of this design pattern in Scala and of the flexibility of type-classes in general. In the next post I'll go over implicit type-classes using the adapter pattern, showing how they can be used for ad-hoc polymorphism and for limiting class functionality to specific types, much akin to C++'s enable_if template construct. Don't worry, there won't be any C++, just Scala.
Reference: Gang of Four Patterns With Type-Classes and Implicits in Scala from our JCG partner Owein Reese at the Statically Typed blog.