LINQ and Java
LINQ has been a quite successful, but also controversial, addition to the .NET ecosystem. Many people are looking for a comparable solution in the Java world. To better understand what a comparable solution could be, let’s have a look at the main problem that LINQ solves:
Query languages are often declarative programming languages with many keywords. They offer few control-flow elements, yet they are highly descriptive. The most popular query language is SQL, the ISO/IEC standardised Structured Query Language, mostly used for relational databases.
Declarative programming means that programmers do not explicitly spell out their algorithms. Instead, they describe the result they would like to obtain, leaving the choice of algorithm to the implementing system. Some databases have become very good at interpreting large SQL statements, applying SQL transformation rules based on language syntax and metadata. An interesting read is Tom Kyte’s “Metadata Matters”, hinting at the incredible effort that has been put into Oracle’s Cost-Based Optimiser. Similar papers can be found for SQL Server, DB2 and other leading RDBMS.
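As a loose analogy in Java itself, the following sketch contrasts an imperative loop with a declarative-looking stream pipeline. The class and method names are hypothetical and not taken from any library. Note that a stream pipeline is only declarative in style: unlike a SQL optimiser, it does not consult metadata or cost estimates to pick an execution plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, for illustration only.
final class DeclarativeSketch {

    // Imperative: the loop spells out how to iterate, test and accumulate.
    static List<String> longNamesImperative(List<String> names, int minLength) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() >= minLength) {
                result.add(name);
            }
        }
        return result;
    }

    // Declarative style: the pipeline only states which elements are wanted.
    static List<String> longNamesDeclarative(List<String> names, int minLength) {
        return names.stream()
                    .filter(name -> name.length() >= minLength)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Chai", "Chang", "Aniseed Syrup");
        System.out.println(longNamesImperative(names, 5));
        System.out.println(longNamesDeclarative(names, 5));
    }
}
```

Both methods return the same list; only the second leaves the iteration strategy to the library.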
LINQ-to-SQL is not SQL
LINQ is an entirely different query language that allows embedding declarative programming aspects into .NET languages, such as C# or VB.NET. The nice part of LINQ is the fact that a C# compiler can compile something that looks like SQL in the middle of C# statements. In a way, LINQ is to .NET what SQL is to PL/SQL or PL/pgSQL, or what jOOQ is to Java (see my previous article about PL/Java). But unlike PL/SQL, which embeds the actual SQL language, LINQ-to-SQL does not aim to model SQL itself within .NET. It is a higher-level abstraction that keeps the door open for unifying querying against various heterogeneous data stores in a single language. This unification will create an impedance mismatch similar to the one ORM created before, maybe an even bigger one. While similar languages can be transformed into each other to a certain extent, it can become quite difficult for an advanced SQL developer to predict what actual SQL code will be generated from even very simple LINQ statements.
LINQ Examples
This becomes clearer when looking at some examples given by the LINQ-to-SQL documentation. For example, the Count() aggregate function:
System.Int32 notDiscontinuedCount =
    (from prod in db.Products
     where !prod.Discontinued
     select prod)
    .Count();

Console.WriteLine(notDiscontinuedCount);
In the above example, it is not immediately clear if the .Count() function is transformed into a SQL count(*) aggregate function within the parenthesised query (then why not put it into the projection?), or if it will be applied only after executing the query, in the application memory. The latter would be prohibitive if a large number of records needed to be transferred from the database to memory. Depending on the transaction model, they would even need to be read-locked!
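The two possible readings can be sketched in Java as follows. This is only an illustration under assumptions: the Product class, the table name and the column names are hypothetical, and the SQL string is what one would hope to see sent to the database, not what LINQ-to-SQL is documented to generate.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class, for illustration only.
final class CountSketch {

    // Hypothetical row type; the field names mirror the LINQ example.
    static final class Product {
        final String name;
        final boolean discontinued;

        Product(String name, boolean discontinued) {
            this.name = name;
            this.discontinued = discontinued;
        }
    }

    // Reading 1: the database computes the count and returns a single row.
    // Table and column names are assumptions.
    static final String COUNT_IN_DATABASE =
        "SELECT COUNT(*) FROM products WHERE discontinued = 0";

    // Reading 2: every row has already been transferred to the application
    // and is filtered and counted there.
    static long countInMemory(List<Product> allProducts) {
        return allProducts.stream()
                          .filter(p -> !p.discontinued)
                          .count();
    }

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
            new Product("Chai", false),
            new Product("Chang", false),
            new Product("Mishi Kobe Niku", true));
        System.out.println(countInMemory(products)); // prints 2
    }
}
```

The result is the same either way; the difference lies in how many rows have to cross the wire before the count is known.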
Another example is given here where grouping is explained:
var prodCountQuery =
    from prod in db.Products
    group prod by prod.CategoryID into grouping
    where grouping.Count() >= 10
    select new
    {
        grouping.Key,
        ProductCount = grouping.Count()
    };
In this case, LINQ models its language aspects entirely differently from SQL. The above LINQ where clause obviously corresponds to a SQL HAVING clause. The into grouping clause is an alias for what will be a grouped tuple, which is quite a nice idea. This does not directly map to SQL, though, and must be used by LINQ internally, to produce typed output. What’s awesome, of course, are the statically typed projections that can be reused afterwards, directly in C#!
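For comparison, here is a minimal sketch of the same "group, then filter the groups" idea using the Java 8 Stream API, entirely in memory. The class and method names are hypothetical, and the input is reduced to a plain list of category IDs to keep the example short. The filter applied to the grouped result plays the role of HAVING.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example class, for illustration only.
final class HavingSketch {

    // Groups the category IDs, counts the members of each group, and keeps
    // only the groups with at least minCount members.
    static Map<Integer, Long> categoriesWithAtLeast(List<Integer> categoryIds, long minCount) {
        // GROUP BY category_id with COUNT(*)
        Map<Integer, Long> counts = categoryIds.stream()
            .collect(Collectors.groupingBy(id -> id, Collectors.counting()));

        // HAVING COUNT(*) >= minCount
        return counts.entrySet().stream()
            .filter(e -> e.getValue() >= minCount)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        List<Integer> categoryIds = Arrays.asList(1, 1, 1, 2, 2, 3);
        System.out.println(categoriesWithAtLeast(categoryIds, 2));
    }
}
```

Here the two steps are visibly separate, which makes it explicit that the filter runs after grouping, something the LINQ where keyword leaves implicit.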
Let’s look at another grouping example:
var priceQuery =
    from prod in db.Products
    group prod by prod.CategoryID into grouping
    select new
    {
        grouping.Key,
        TotalPrice = grouping.Sum(p => p.UnitPrice)
    };
In this example, C#’s functional aspects are embedded into LINQ’s Sum(p => p.UnitPrice) aggregate expression. TotalPrice = ... is just simple column aliasing. The above leaves me with lots of open questions. How can I control which parts are really going to be translated to SQL, and which parts will execute in my application, after a SQL query returns a partial result set? How can I predict whether a lambda expression is suitable for a LINQ aggregate function, and when it will cause a huge amount of data to be loaded into memory for in-memory aggregation? And also: Will the compiler warn me that it couldn’t figure out how to generate a C#/SQL algorithm mix? Or will this simply fail at runtime?
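To illustrate what in-memory aggregation with a lambda looks like on the Java side, here is a rough sketch with the Java 8 Stream API. The Product class and its fields are hypothetical stand-ins for the LINQ example's entity. In this form there is no ambiguity: all products must already be loaded before the sum per category is computed.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example class, for illustration only.
final class SumSketch {

    // Hypothetical row type; the field names mirror the LINQ example.
    static final class Product {
        final int categoryId;
        final BigDecimal unitPrice;

        Product(int categoryId, BigDecimal unitPrice) {
            this.categoryId = categoryId;
            this.unitPrice = unitPrice;
        }
    }

    // Groups products by category and sums the unit prices of each group.
    static Map<Integer, BigDecimal> totalPricePerCategory(List<Product> products) {
        return products.stream().collect(Collectors.groupingBy(
            p -> p.categoryId,
            Collectors.reducing(BigDecimal.ZERO, p -> p.unitPrice, BigDecimal::add)));
    }

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
            new Product(1, new BigDecimal("18.00")),
            new Product(1, new BigDecimal("19.00")),
            new Product(2, new BigDecimal("10.00")));
        System.out.println(totalPricePerCategory(products));
    }
}
```

The equivalent SQL would push the work to the database with a SUM aggregate and a GROUP BY clause, returning one row per category instead of one row per product.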
To LINQ or not to LINQ
Don’t get me wrong. Whenever I look inside the LINQ manuals for some inspiration, I have a deep urge to try it in a project. It looks awesome, and well-designed. There are also lots of interesting LINQ questions on Stack Overflow. I wouldn’t mind having LINQ in Java, but I want to remind readers that LINQ is NOT SQL. If you want to stay in control of your SQL, LINQ or LINQesque APIs may be a bad choice for two reasons:
- Some SQL mechanisms cannot be expressed in LINQ. Just as with JPA, you may need to resort to plain SQL.
- Some LINQ mechanisms cannot be expressed in SQL. Just as with JPA, you may suffer from severe performance issues, and will thus resort again to plain SQL.
Beware of the above when choosing LINQ, or a “Java implementation” thereof! You may be better off using SQL (i.e. JDBC, jOOQ, or MyBatis) for data fetching and Java APIs (e.g. Java 8’s Stream API) for in-memory post-processing.
LINQ-like libraries modelling SQL in Java, Scala
- jOOQ: http://www.jooq.org
- Sqltyped: https://github.com/jonifreeman/sqltyped
LINQ-like libraries abstracting SQL syntax and data stores in Java, Scala
- Quaere: http://quaere.codehaus.org
- JaQu: http://www.h2database.com/html/jaqu.html
- Linq4j: https://github.com/julianhyde/linq4j
- Slick: http://slick.typesafe.com/
There is also a framework called Querydsl, which provides LINQ-like features for JPA, SQL, MongoDB and other persistence technologies.
Querydsl can be used from Java, Scala, Groovy and other JVM languages.
http://www.querydsl.com/
Fishing traffic again, Timo? :-)
What do you think, quid pro quo?
Hi …
Excuse me, I have a question, but I don’t know where the right place to ask it is.
I want to know when System.currentTimeMillis() in Java will become zero. I mean, when will it start again?
I wouldn’t leave jOOQ out of a list of alternatives, and I have no problem mentioning it.