Top 100 Most Popular Scala Libraries – Based on 10,000 GitHub Projects
As Scala developers working in a language and ecosystem that’s rapidly growing and evolving, we’re faced with a constant dilemma whenever we write new code – go with that hot new Scala framework that everyone’s talking about, or stick with a Java library we know and trust?
When we began building Takipi, we wanted to know which frameworks developers use most today, so we could better optimize it for them. Since a large share of Scala applications are commercial or closed-source, it can be hard to tell how many projects actually put a given library to use.
We decided to take a data-driven approach by analyzing what Scala developers are actually using on the world’s largest open project repository – GitHub. With a wide variety of projects ranging from small to very large, GH provides us with an extensive dataset, one which is also highly up-to-date.
Much like with the results we saw in Java, there were some pretty big surprises. As both Java and Scala run on the JVM, it was interesting to notice similarities between the frameworks used, and also some stark differences. Overall, only 42 libraries appear in both the top 100 Java and Scala libraries, which reaffirms that Scala isn’t just a different language but also has its own universe of tools and libraries.
The Approach
To generate our dataset we queried 10,000 Scala projects, with a bias towards the ones most favorited by the community, since that is a strong indicator of their relative importance.
We searched for dependencies in sbt and Maven, which the vast majority of Scala projects on GH use as their build tools. For sbt we analyzed build.sbt, project/Build.scala and any .scala files that extend them. For Maven projects we scanned the dependencies in pom.xml.
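As a rough illustration of the kind of scan described above, here is a minimal sketch of pulling (group, artifact) pairs out of sbt and Maven build files. It is not the tooling used for this analysis; the object and method names (DependencyScan, fromSbt, fromPom) are hypothetical, and the sbt side only catches dependencies written as plain string literals.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Hypothetical helper, for illustration only.
object DependencyScan {
  // Matches literal sbt declarations: "group" % "artifact" % "version"
  // (or %% for cross-built artifacts). Dependencies built from vals
  // or helper methods are NOT detected by this simple pattern.
  private val SbtDep =
    """"([^"]+)"\s*%%?\s*"([^"]+)"\s*%\s*"([^"]+)"""".r

  /** (group, artifact) pairs found in the text of an sbt build file. */
  def fromSbt(text: String): List[(String, String)] =
    SbtDep.findAllMatchIn(text).map(m => (m.group(1), m.group(2))).toList

  /** (groupId, artifactId) pairs for every <dependency> in a pom.xml.
    * Assumes each <dependency> declares both elements; entries under
    * dependencyManagement or plugins are included as well. */
  def fromPom(xml: String): List[(String, String)] = {
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
      .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    val deps = doc.getElementsByTagName("dependency")
    (0 until deps.getLength).toList.map { i =>
      val dep = deps.item(i).asInstanceOf[Element]
      def text(tag: String): String =
        dep.getElementsByTagName(tag).item(0).getTextContent.trim
      (text("groupId"), text("artifactId"))
    }
  }
}
```

Counting how many projects mention each pair then gives a popularity ranking. A regex is enough for a rough count like this, but it will miss dependencies that a build assembles programmatically.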
We then analyzed and grouped the results into categories. The results were interesting to say the least –
The Results
TDD is big in Scala. JUnit, the classic Java testing framework, is the most popular library with 2513 projects using it. ScalaTest comes in a close second with 2197 entries. TestNG, which is fairly popular in Java (ranked 14th in the Java top 100), isn’t in the top 100 libraries for Scala.
SPECS2, the framework for writing software specifications, is being used by 1331 projects. SPECS V1, which was deprecated in early 2011, still has 312 projects using it.
A new generation of frameworks. Using Scala is not just about the language, but also about a new generation of frameworks. The Play Framework for building web apps is crushing it when it comes to Scala developers, with 18% of the projects using it. The Akka framework is also doing very well with 776 entries (ranked 9th). Lift, another well-known framework for building Scala web applications, is only used by 124 projects, which came as something of a surprise to us.
Some frameworks originally built for Java, however, are seeing much greater use in Scala. The lightweight web server Jetty, which was used by 100 projects in Java, has about 4.5X that number of projects using it in Scala, with 447 entries (17th).
Where’s the Java old guard? This comes in contrast to some of Java’s most venerated libraries and frameworks, which are seeing considerably less use in Scala.
- Spring for example, which places 15(!) libraries in the top 100 Java libraries, isn’t on the Scala top 100 board.
- Apache Commons is also seeing much-reduced usage. commons-io and commons-lang, which are both in the top 10 Java libraries, are at #24 and #39 respectively in the Scala top 100.
- Google’s Guava libraries, which are at #8 in the Java top 100, are also further down the Scala list, coming in at #24 with fewer than half as many projects using them as in Java.
Logging. Ceki’s SLF4J is leading the pack –
- SLF4J and Logback seem to be the de facto logging solution for Scala, and are used in 16% and 14% of the projects respectively.
- log4j, which has 891 project entries in Java, sees less usage in Scala with only 332 project entries (3%).
- commons-logging is behind the pack with 105 project entries – that’s less than a third of the number of projects using it in Java.
SQL. Big surprises on the Scala DB front –
- H2 is the most common SQL DB with 552 projects using it – that’s more than 4X the usage we saw for it in Java.
- MySQL comes in with 387 entries, which is actually more than the 255 entries we saw with Java.
- PostgreSQL is also up there on the board with 332 entries which is almost 3X more entries than the 121 in Java.
NoSQL sees less traction than in Java. It’s also worth noting that Hadoop, which is seeing a good amount of usage in Java, isn’t on the Scala top 100 board. The only NoSQL DB on the Scala list is MongoDB with 97 entries.
Android. While Scala is very much a server-side language aimed at building scalable server applications, we still saw some presence for Android development with 82 projects using the sbt-android-plugin.
Surprised by some of the results? We know we were with some of them. Take a look at the full list of the top 100 Scala libraries on GitHub below, and let us know what you think in the comments section. We’d love to hear your thoughts and questions.
I see that in your results, for some entries you have both libX and libX_2.10 (or the like). These are actually dependencies on the same libX library; the suffix is just a naming convention used to differentiate binary-incompatible versions of the same library.
So, for example, for scalatest you should count: scalatest (2197), scalatest_2.9.0 (137), scalatest_2.9.1 (112) and scalatest_2.10 (616). That makes 3062 in total, much more than JUnit (2513).
Cheers,
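The aggregation suggested in the comment above can be sketched in a few lines. In sbt, declaring a dependency with %% appends the Scala binary version to the artifact name, which is where names like scalatest_2.10 come from. The sketch below strips such a suffix and sums the counts per base library; the object and method names (ArtifactCounts, baseName, merge) are hypothetical, and the suffix pattern is an assumption that only covers plain 2.x and 2.x.y suffixes.

```scala
// Hypothetical helper, for illustration only.
object ArtifactCounts {
  // Assumed suffix shape: "_2.10", "_2.9.1" and similar at the end of the name.
  private val ScalaSuffix = """_2\.\d+(\.\d+)?$""".r

  /** Artifact name with any trailing Scala version suffix removed. */
  def baseName(artifact: String): String =
    ScalaSuffix.replaceFirstIn(artifact, "")

  /** Sums the counts of all entries that share the same base name. */
  def merge(counts: Seq[(String, Int)]): Map[String, Int] =
    counts
      .groupBy { case (name, _) => baseName(name) }
      .map { case (name, entries) => name -> entries.map(_._2).sum }
}

object ArtifactCountsDemo {
  def main(args: Array[String]): Unit = {
    // Figures quoted in the comment above.
    val merged = ArtifactCounts.merge(Seq(
      "scalatest"       -> 2197,
      "scalatest_2.9.0" -> 137,
      "scalatest_2.9.1" -> 112,
      "scalatest_2.10"  -> 616,
      "junit"           -> 2513
    ))
    println(merged("scalatest")) // 3062
    println(merged("junit"))     // 2513
  }
}
```

Note that summing entries this way counts a project twice if it lists both a suffixed and an unsuffixed form of the same library, so the merged figure is an upper bound on the number of distinct projects.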