Monday, May 29, 2006

Language Wars Redux - The Imminent Approach of LINQ

On the plane trip back from JavaOne I sat across the aisle from a Microsoft C# compiler developer. We had a lively conversation the entire trip. This ultimately prompted me to look into something called LINQ.

Language Wars Redux

Well, here we are in the Java community having gone through many months of absorbing the new language features in Java 5. We've been challenged and then excited by the phenomenon of the dynamic scripting languages (and their frameworks) such as Ruby on Rails and even JavaScript in the context of AJAX. At JavaOne we were given a preview of Dolphin, where modifications to the JVM will make it easier to integrate dynamic scripting languages. A strong case was made for potentially integrating XML as a first-class type in the Java language. We were wowed by the Groovy scripting language demonstration and excited by the prospect of the ranks of Java platform developers swelling as Project Semplice BASIC offers to give VB6 developers a home running code on the JVM.

These have been some exciting and heady times for the Java language and perhaps more especially the JVM platform. (At JavaOne it was clear that Sun has finally come around to fully embracing the notion of the JVM as a universal platform that should do a good job of accommodating other languages besides Java - the marketing position that Microsoft has held toward the .NET platform from day one.)

Alas, there may be some storm clouds appearing on the horizon.

If you haven't heard of it yet, the day will come when we've all heard of Microsoft's LINQ feature for the .NET platform. .NET Language Integrated Query (LINQ) is more fundamentally groundbreaking in language design than, say, a feature such as generics. And its impact on programmers and how they achieve their end goals will be more dramatic than such phenomena as Hibernate or Ruby on Rails and the revolution of thinking those entailed (ORM/EJB3 db-independent persistence and ActiveRecord dynamic types synthesized on the fly).

There has arisen a mode of thinking in Java land that by embracing dynamic scripting languages we can essentially address shortcomings in Java or bolster Java with exciting new capabilities. The scripting language Groovy is perhaps the ultimate expression to date of this line of thinking. It is an appealing notion - why add new features to Java, such as an intrinsic XML type, when Groovy already has a great markup language feature that makes working in XML very groovy indeed? Attention could instead be focused on making the integration of Groovy more seamless with Java and the JVM - and even making it the flagship scripting language such that it is there out of the box when the JDK/JRE are downloaded (with no additional installation steps required to make Groovy accessible). Given that Groovy scripts can be compiled into .class files, it would seem to be the natural vehicle for introducing neat new language capabilities, as it is not limited to pure scripting the way a BeanShell-style approach is.

The new challenge: When Microsoft declares C# 3.0 to be production ready, it will have features that essentially leapfrog the likes of: EJB3 ORM and its portable query language; many of the features entailed by the dynamic scripting languages - such as the Ruby on Rails ActiveRecord concept; and it will have a very facile ability to work intrinsically with XML. The LINQ feature will essentially unify the representation of query access to data across the domains of relational databases, XML DOM, and in-memory collections/object graphs. It will be a language-intrinsic capability for universal query. A day will come when it will even be used in everyday programming situations - say, query all the fields of a visual form beginning with some name prefix string, and then perform a common operation on all those widgets that the query selected. One of Anders Hejlsberg's favorite demos is illustrating how LINQ can be used to query (and filter with clauses) type system reflection information. Thus LINQ is not just for relational database access, nor is it just a replacement for XPath, nor is it just an in-memory database query language. It embraces all of those as it is general purpose - and even facilitates selecting and migrating data between these various domains, i.e., select a data subset from a database while populating it into an XML DOM with trivial effort on the part of the programmer.
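To make the reflection demo concrete from the Java side, here is a minimal sketch of that kind of query written as a hand-rolled loop, which is roughly what it takes without a language-level query feature. The class name ReflectionQuery and the helper staticMethodNames are hypothetical names made up for this illustration; only the java.lang.reflect and java.util calls are standard API. The comments mark the parts of the loop that a query expression would express as clauses.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReflectionQuery {

    // Hypothetical helper: names of the public static methods of `type`
    // whose name starts with `prefix`, de-duplicated and sorted.
    static List<String> staticMethodNames(Class<?> type, String prefix) {
        List<String> names = new ArrayList<String>();
        for (Method m : type.getMethods()) {                // the "from" part
            if (Modifier.isStatic(m.getModifiers())          // the "where" part
                    && m.getName().startsWith(prefix)
                    && !names.contains(m.getName())) {       // skip overloads already seen
                names.add(m.getName());                      // the "select" part
            }
        }
        Collections.sort(names);                             // the "order by" part
        return names;
    }

    public static void main(String[] args) {
        // Query the reflection data of java.lang.String for static methods
        // whose names begin with "value".
        System.out.println(staticMethodNames(String.class, "value"));
    }
}
```

The loop works, but the filtering, projection and ordering are tangled together in imperative code, and none of it carries over to a database table or an XML document. That gap is what a unified query syntax is meant to close.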

The common wisdom preached by the Ruby advocates and exemplified by its neatest accomplishments, such as ActiveRecord, is that such things can only be accomplished by the dynamic, loosely typed scripting languages. However, the LINQ feature in C# 3.0 will turn this thinking completely on its head. It will bring into question whether we really have to throw strong typing away in order to do these neat new things. LINQ creates tuples or anonymous types but it is able to retain type checking. The new feature of implicitly typed local variable declarations makes it possible to hold a reference to these tuples in order to access and manipulate them. Unlike the EJB3 query language, a LINQ query expression is type checked by the compiler. The end result is the simple style of the dynamic scripting languages but with the full rigor of a traditional strongly typed language. Alas, with C# 3.0 the use cases for these dynamic scripting languages just shrank a good deal. We're basically left with just the use cases of BeanShell scripting. (Well, there are the first-class regular expression features of Groovy as well.)
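The difference between a query the compiler can check and one it cannot can be sketched in plain Java. This is only an analogy under stated assumptions: the Customer class and the helpers readByName and readTyped are hypothetical, and looking up a getter by a property name held in a string stands in for naming a property inside a query string.

```java
import java.lang.reflect.Method;

class TypedVersusStringAccess {

    // Hypothetical domain class used only for this illustration.
    public static class Customer {
        private final String city;
        public Customer(String city) { this.city = city; }
        public String getCity() { return city; }
    }

    // String-based access: the compiler cannot check `property`,
    // so a misspelled name is discovered only at run time.
    static Object readByName(Object target, String property) {
        String getter = "get" + Character.toUpperCase(property.charAt(0))
                + property.substring(1);
        try {
            Method m = target.getClass().getMethod(getter);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("unknown property: " + property, e);
        }
    }

    // Compiler-checked access: misspelling getCity() here would not compile.
    static String readTyped(Customer c) {
        return c.getCity();
    }

    public static void main(String[] args) {
        Customer c = new Customer("Seattle");
        System.out.println(readTyped(c));
        System.out.println(readByName(c, "city"));
        try {
            readByName(c, "ctiy"); // the typo compiles without complaint
        } catch (IllegalArgumentException e) {
            System.out.println("failed at run time: " + e.getMessage());
        }
    }
}
```

A query embedded in a string behaves like readByName: mistakes surface when the code runs. A query expression checked by the compiler behaves like readTyped: mistakes surface when the code is built.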

The LINQ feature draws on language enhancements that were added in C# 2.0 and of course various new additions in C# 3.0. To fully understand LINQ, start by reading the spec doc for C# 2.0 and then move on to the spec doc for C# 3.0.

In the ongoing saga of the Language Wars, Microsoft's new LINQ feature looks poised to kick butt and take names. For in the meantime, over in the Java community, EJB3 persistence and its portable query language syntax will be regarded as the height of Java technology for query. There will still just be XPath for XML, and nothing at all for in-memory object graphs. Tuples? Forget it - we will have to bail out to a dynamic scripting language for that kind of thing.

If you came back from JavaOne all pumped up about the state of Java and the JVM as a platform, you've now got yet another anxiety imminently approaching.