QCon SF 2008: Radical Simplification through Polyglot and Poly-paradigm Programming
InfoQ has posted the video of my talk at last year’s QCon San Francisco on Radical Simplification through Polyglot and Poly-paradigm Programming. I make the case that relying on just one programming language or one modularity paradigm (e.g., object-oriented programming, functional programming, etc.) is insufficient for most applications that we’re building today. That includes everything from embedded systems and games up to complex Internet and enterprise applications.
I’m giving an updated version of this talk at the Strange Loop Conference, October 22-23, in St. Louis. I hope to see you there.
Tighter Ruby Methods with Functional-style Pattern Matching, Using the Case Gem
Ruby doesn’t have overloaded methods, i.e., methods with the same name but different signatures, considering the argument lists and return values. This would be challenging to support in a dynamic language with very flexible options for method argument handling.
You can “simulate” overloading by parsing the argument list and taking different paths of execution based on the structure you find. This post discusses how pattern matching, a hallmark of functional programming, gives you powerful options.
First, let’s look at a typical example that handles the arguments in an ad hoc fashion. Consider the following Person class. You can pass three arguments to the initializer, the first_name, the last_name, and the age. Or, you can pass a hash using the keys :first_name, :last_name, and :age.
require "rubygems"
require "spec"

class Person
  attr_reader :first_name, :last_name, :age

  def initialize *args
    arg = args[0]
    if arg.kind_of? Hash  # 1
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    else
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    end
  end
end
describe "Person#initialize" do
  it "should accept a hash with key-value pairs for the attributes" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end

  it "should accept first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end
end
The condition on the # 1 comment line checks to see if the first argument is a Hash. If so, the attribute values are extracted from it. Otherwise, it is assumed that three arguments were specified in a particular order. They are passed to #initialize in a three-element array. The two rspec examples exercise these behaviors. For simplicity, we ignore some more general cases, as well as error handling.
Another approach that is more flexible is to use duck typing instead. For example, we could replace the line with the # 1 comment with this line:

  if arg.respond_to? :has_key?

There aren’t many objects that respond to #has_key?, so we’re highly confident that we can use [symbol] to extract the values from the hash.
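To see why the duck-typed test is more flexible, consider this sketch. The Settings class here is hypothetical, invented just for illustration: it isn’t a Hash, but it responds to #has_key? and #[], so the respond_to? check accepts it while a kind_of?(Hash) check would not.

```ruby
# Hypothetical hash-like class: not a Hash, but it quacks like one.
class Settings
  def initialize(pairs); @pairs = pairs; end
  def has_key?(key); @pairs.has_key?(key); end
  def [](key); @pairs[key]; end
end

# The duck-typed version of the "# 1" check.
def extract_first_name arg
  if arg.respond_to? :has_key?
    arg[:first_name]
  else
    arg  # assume a plain string was passed
  end
end

extract_first_name(Settings.new(:first_name => "Dean"))  # => "Dean"
extract_first_name("Dean")                               # => "Dean"
```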
This implementation is fairly straightforward. You’ve probably written code like this yourself. However, it could get complicated for more involved cases.
Pattern Matching, a Functional Programming Approach
Most programming languages today have switch or case statements of some sort, and most have support for regular expression matching. However, in functional programming languages, pattern matching is so important and pervasive that these languages offer very powerful and convenient support for it.

Fortunately, we can get powerful pattern matching, typical of functional languages, in Ruby using the Case gem that is part of MenTaLguY’s Omnibus Concurrency library. Omnibus provides support for the hot Actor model of concurrency, which Erlang has made famous. However, it would be a shame to restrict the use of the Case gem to parsing Actor messages. It’s much more general purpose than that.
Let’s rework our example using the Case gem.
require "rubygems"
require "spec"
require "case"

class Person
  attr_reader :first_name, :last_name, :age

  def initialize *args
    case args
    when Case[Hash]  # 1
      arg = args[0]
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    else
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    end
  end
end
describe "Person#initialize" do
  it "should accept first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end

  it "should accept a hash with :first_name => fn, :last_name => ln, and :age => age" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end
end
We require the case gem, which puts the #=== method on steroids. In the when statement in #initialize, the expression when Case[Hash] matches a one-element array where the element is a Hash. We extract the key-value pairs as before. The else clause assumes we have an array for the arguments.
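If you want a feel for what Case[...] is doing under the hood, here is a stdlib-only approximation (a sketch of the idea, not the gem’s actual implementation): a pattern object defines #===, so it can appear in a when clause, and an array matches if the sizes agree and each element matches its pattern via #===.

```ruby
# A stdlib-only sketch of the idea behind Case[...]: the pattern object
# defines #===, so `when` can use it; an array matches if the sizes
# agree and each element matches its pattern via #===.
class ArrayCase
  def initialize(*patterns); @patterns = patterns; end

  def ===(args)
    args.kind_of?(Array) &&
      args.size == @patterns.size &&
      @patterns.zip(args).all? { |pattern, arg| pattern === arg }
  end
end

ArrayCase.new(Hash) === [{:first_name => "Dean"}]                     # => true
ArrayCase.new(String, String, Integer) === ["Dean", "Wampler", 39]    # => true
ArrayCase.new(String, String, Integer) === ["Dean", "Wampler", "39"]  # => false
```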
So far this isn’t very impressive, since all we did was reproduce the original behavior. Let’s extend the example to really exploit some of the neat features of the Case gem’s pattern matching. First, let’s narrow the allowed array values.
require "rubygems"
require "spec"
require "case"

class Person
  attr_reader :first_name, :last_name, :age

  def initialize *args
    case args
    when Case[Hash]  # 1
      arg = args[0]
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    when Case[String, String, Integer]
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    else
      raise "Invalid arguments: #{args}"
    end
  end
end
describe "Person#initialize" do
  it "should accept first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end

  it "should accept a hash with :first_name => fn, :last_name => ln, and :age => age" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end

  it "should not accept an array unless it is a [String, String, Integer]" do
    lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
  end
end
The new expression when Case[String, String, Integer] only matches a three-element array where the first two arguments are strings and the third argument is an integer, which are the types we want. If you use an array with a different number of arguments, or the arguments have different types, this when clause won’t match. Instead, you’ll get the default else clause, which raises an exception. We added another rspec example to test this condition, where the user’s age was specified as a string instead of as an integer. Of course, you could decide to attempt a conversion of this argument, to make your code more “forgiving” of user mistakes.
Similarly, what happens if the method supports default values for some of the parameters? As written, we can’t support that option, but let’s look at a slight variation of Person#initialize, where a hash of values is not supported, to see what would happen.
require "rubygems"
require "spec"
require "case"

class Person
  attr_reader :first_name, :last_name, :age

  def initialize first_name = "Bob", last_name = "Martin", age = 29
    case [first_name, last_name, age]
    when Case[String, String, Integer]
      @first_name = first_name
      @last_name  = last_name
      @age        = age
    else
      raise "Invalid arguments: #{first_name}, #{last_name}, #{age}"
    end
  end
end
def check person, expected_fn, expected_ln, expected_age
  person.first_name.should == expected_fn
  person.last_name.should == expected_ln
  person.age.should == expected_age
end

describe "Person#initialize" do
  it "should require first name (string), last name (string), and age (integer) arguments" do
    person = Person.new "Dean", "Wampler", 39
    check person, "Dean", "Wampler", 39
  end

  it "should accept the defaults for all parameters" do
    person = Person.new
    check person, "Bob", "Martin", 29
  end

  it "should accept the defaults for the last name and age parameters" do
    person = Person.new "Dean"
    check person, "Dean", "Martin", 29
  end

  it "should accept the defaults for the age parameter" do
    person = Person.new "Dean", "Wampler"
    check person, "Dean", "Wampler", 29
  end

  it "should not accept the first name as a symbol" do
    lambda { person = Person.new :Dean, "Wampler", 39 }.should raise_error(Exception)
  end

  it "should not accept the last name as a symbol" do
    lambda { person = Person.new "Dean", :Wampler, 39 }.should raise_error(Exception)
  end

  it "should not accept the age as a string" do
    lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
  end
end
We match on all three arguments as an array, asserting they are of the correct types. As you might expect, #initialize always gets three parameters passed to it, including when default values are used.

Let’s return to our original example, where the object can be constructed with a hash or a list of arguments. There are two more things (at least …) that we can do. First, we’re not yet validating the types of the values in the hash. Second, we can use the Case gem to impose constraints on the values, such as requiring non-empty name strings and a positive age.
require "rubygems"
require "spec"
require "case"

class Person
  attr_reader :first_name, :last_name, :age

  def initialize *args
    case args
    when Case[Hash]
      arg = args[0]
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    when Case[String, String, Integer]
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    else
      raise "Invalid arguments: #{args}"
    end
    validate_name @first_name, "first_name"
    validate_name @last_name, "last_name"
    validate_age
  end

  protected

  def validate_name name, field_name
    case name
    when Case::All[String, Case.guard {|s| s.length > 0 }]
    else
      raise "Invalid #{field_name}: #{name}"
    end
  end

  def validate_age
    case @age
    when Case::All[Integer, Case.guard {|n| n > 0 }]
    else
      raise "Invalid age: #{@age}"
    end
  end
end
describe "Person#initialize" do
  it "should accept first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end

  it "should accept a hash with :first_name => fn, :last_name => ln, and :age => age" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean"
    person.last_name.should == "Wampler"
    person.age.should == 39
  end

  it "should not accept an array unless it is a [String, String, Integer]" do
    lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
  end

  it "should not accept a first name that is a zero-length string" do
    lambda { person = Person.new "", "Wampler", 39 }.should raise_error(Exception)
  end

  it "should not accept a first name that is not a string" do
    lambda { person = Person.new :Dean, "Wampler", 39 }.should raise_error(Exception)
  end

  it "should not accept a last name that is a zero-length string" do
    lambda { person = Person.new "Dean", "", 39 }.should raise_error(Exception)
  end

  it "should not accept a last name that is not a string" do
    lambda { person = Person.new "Dean", :Wampler, 39 }.should raise_error(Exception)
  end

  it "should not accept an age that is less than or equal to zero" do
    lambda { person = Person.new "Dean", "Wampler", -1 }.should raise_error(Exception)
    lambda { person = Person.new "Dean", "Wampler", 0 }.should raise_error(Exception)
  end

  it "should not accept an age that is not an integer" do
    lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
  end
end
We have added validate_name and validate_age methods that are invoked at the end of #initialize. In validate_name, the one when clause requires “all” the conditions to be true: that the name is a string and that it has a non-zero length. Similarly, validate_age has a when clause that requires age to be a positive integer.
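As an aside, plain Ruby (1.9+) can express simple guards without the gem, because Proc implements #===, invoking the proc on the candidate value. This sketch checks a single combined condition, whereas Case::All composes several patterns:

```ruby
# Proc#=== (Ruby 1.9+) calls the proc, so a lambda acts as a guard
# in a case/when expression.
positive_int = lambda { |n| n.kind_of?(Integer) && n > 0 }

def classify value, guard
  case value
  when guard then "valid"
  else "invalid"
  end
end

classify(39, positive_int)    # => "valid"
classify(0, positive_int)     # => "invalid"
classify("39", positive_int)  # => "invalid"
```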
Final Thoughts
So, how valuable is this? The code is certainly longer, but it specifies and enforces expected behavior more precisely. The rspec examples verify the enforcement. It smells a little of static typing, which is good or bad, depending on your point of view. ;)
Personally, I think the conditional checks are a good way to add robustness in small ways to libraries that will grow and evolve for a long time. The checks document the required behavior for code readers, like new team members, but of course, they should really get that information from the tests. ;) (However, it would be nice to extract the information into the RDoc.)
For small, short-lived projects, I might not worry about the conditional checks as much (but how many times have those “short-lived projects” refused to die?).
You can read more about Omnibus and Case in this InfoQ interview with MenTaLguY. I didn’t discuss using the Actor model of concurrency, for which these gems were designed. For an example of Actors using Omnibus, see my Better Ruby through Functional Programming presentation or the Confreaks video of an earlier version of the presentation I gave at last year’s RubyConf.
Adopting New JVM Languages in the Enterprise (Update)
(Updated to add Groovy, which I should have mentioned the first time. Also mentioned Django under Python.)
This is an exciting time to be a Java programmer. The pace of innovation for the Java language is slowing down, in part due to concerns that the language is growing too big and in part due to economic difficulties at Sun, which means there are fewer developers assigned to Java. However, the real crown jewel of the Java ecosystem, the JVM, has become an attractive platform for new languages. These languages give us exciting new opportunities for growth, while preserving our prior investment in code and deployment infrastructure.
This post emphasizes practical issues of evaluating and picking new JVM languages for an established Java-based enterprise.
The Interwebs are full of technical comparisons between Java and the different languages, e.g., why language X fixes Java’s perceived issue Y. I won’t rehash those arguments here, but I will describe some language features, as needed.
A similar “polyglot” trend is happening on the .NET platform.
The New JVM Languages
I’ll limit my discussion to these representative (and best known) alternative languages for the JVM.
- JRuby – Ruby running on the JVM.
- Scala – A hybrid object-oriented and functional language that runs on .NET as well as the JVM. (Disclaimer: I’m co-writing a book on Scala for O’Reilly.)
- Clojure – A Lisp dialect.
I picked these languages because they seem to be the most likely candidates for most enterprises considering a new JVM language, although some of the languages listed below could make that claim.
There are other deserving languages besides these three, but I don’t have the time to do them justice. Hopefully, you can generalize the subsequent discussion for these other languages.
- Groovy – A dynamically-typed language designed specifically for interoperability with Java. It will appeal to teams that want a dynamically-typed language that is closer to Java than Ruby. With Grails, you have a combination that’s comparable to Ruby on Rails.
- Jython – The first non-Java language ported to the JVM, started by Jim Hugunin in 1997. Most of my remarks about JRuby are applicable to Jython. Django is the Python analog of Rails. If your Java shop already has a lot of Python, consider Jython.
- Fan – A hybrid object-oriented and functional language that runs on .NET, too. It has a lot of similarities to Scala, like a scripting-language feel.
- Ioke – (pronounced “eye-oh-key”) An innovative language developed by Ola Bini and inspired by Io and Lisp. This is the newest language discussed here. Hence, it has a small following, but a lot of potential. The Io/Lisp-flavored syntax will be more challenging to average Java developers than Scala, JRuby, Jython, Fan, and JavaScript.
- JavaScript, e.g., Rhino – Much maligned and misunderstood (e.g., due to buggy and inconsistent browser implementations), JavaScript continues to gain converts as an alternative scripting language for Java applications. It is the default scripting language supported by the JDK 6 scripting interface.
- Fortress – A language designed as a replacement for high-performance FORTRAN for industrial and academic “number crunching”. This one will interest scientists and engineers…
Note: Like a lot of people, I use the term scripting language to refer to languages with a lightweight syntax, usually dynamically typed. The name reflects their convenience for “scripting”, but that quality is sometimes seen as pejorative; they aren’t seen as “serious” languages. I reject this view.
To learn more about what people are doing on the JVM today (with some guest .NET presentations), a good place to start is the recent JVM Language Summit.
Criteria For Evaluating New JVM Languages
I’ll frame the discussion around a few criteria you should consider when evaluating language choices. I’ll then discuss how each of the languages address those criteria. Since we’re restricting ourselves to JVM languages, I assume that each language compiles to valid byte code, so code in the new language and code written in Java can call each other, at least at some level. The “some level” part will be one criterion. Substitute X for the language you are considering.
- Interoperability: How easily can X code invoke Java code and vice versa? Specifically:
  - Create objects (i.e., call new Foo(...)).
  - Call methods on an object.
  - Call static methods on a class.
  - Extend a class.
  - Implement an interface.
- Object Model: How different is the object model of X compared to Java’s object model? (This is somewhat tied to the previous point.)
- New “Ideas”: Does X support newer programming trends:
  - Functional Programming.
  - Metaprogramming.
  - Easier approaches to writing robust concurrent applications.
  - Easier support for processing XML, SQL queries, etc.
  - Support for internal DSL creation.
  - Easier presentation-tier development of web and thick-client UI’s.
- Stability: How stable is the language, in terms of:
  - Lack of bugs.
  - Stability of the language’s syntax, semantics, and library API’s. (All the languages can call Java API’s.)
- Performance: How does code written in X perform?
- Adoption: Is X easy to learn and use?
- Tool Support: What about editors, IDE’s, code coverage, etc.
- Deployment: How are apps and libraries written in X deployed? Do I have to modify my existing infrastructure, management, etc.?
The Interoperability point affects ease of adoption and use with a legacy Java code base. The Object Model and Adoption points address the barrier to adoption from the learning point of view. The New “Ideas” point asks what each language brings to development that is not available in Java (or poorly supported) and is seen as valuable to the developer. Finally, Stability, Performance, and Deployment address very practical issues that a candidate production language must address.
Comparing the Languages
JRuby
JRuby is the most popular alternative JVM language, driven largely by interest in Ruby and Ruby on Rails.
Interoperability
Ruby’s object model is a little different than Java’s, but JRuby provides straightforward coding idioms that make it easy to call Java from JRuby. Calling JRuby from Java requires the JSR 223 scripting interface or a similar approach, unless JRuby is used to compile the Ruby code to byte code first. In that case, shortcuts are possible, which are well documented.
Object Model
Ruby’s object model is a little different from Java’s. Ruby supports mixin-style modules, which behave like interfaces with implementations. So, the Ruby object model needs to be learned, but it is straightforward for the Java developer.
New Ideas
JRuby brings closures to the JVM, a much desired feature that probably won’t be added in the forthcoming Java 7. Using closures, Ruby supports a number of functional-style iterative operations, like mapping, filtering, and reducing/folding. However, Ruby does not fully support functional programming.
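For example, the classic map/filter/reduce trio looks like this in plain Ruby, with each block acting as a closure:

```ruby
names = ["dean", "wampler"]

names.map    { |s| s.capitalize }            # => ["Dean", "Wampler"]
names.select { |s| s.length > 4 }            # => ["wampler"]
names.inject(0) { |sum, s| sum + s.length }  # => 11  (folding/reducing)
```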
Ruby uses dynamic-typing instead of static-typing, which it exploits to provide extensive and powerful metaprogramming facilities.
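As a small, contrived taste of that metaprogramming, define_method generates methods at runtime; the Config class and its setting macro here are invented purely for illustration:

```ruby
# Generate reader and writer methods at runtime with define_method.
class Config
  def self.setting name, default
    define_method(name) { (@values ||= {}).fetch(name, default) }
    define_method("#{name}=") { |v| (@values ||= {})[name] = v }
  end

  setting :host, "localhost"
  setting :port, 8080
end

c = Config.new
c.host        # => "localhost" (the default)
c.port = 9090
c.port        # => 9090
```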
Ruby doesn’t offer any specific enhancements over Java for safe, robust concurrent programming.
Ruby API’s make XML processing and database access relatively easy. Ruby on Rails is legendary for improving the productivity of web developers and similar benefits are available for thick-client developers using other libraries.
Ruby is also one of the best languages for defining “internal” DSL’s, which are used to great effect in Rails (e.g., ActiveRecord).
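Here is a tiny, hypothetical internal DSL in that spirit (not ActiveRecord itself), built on instance_eval so the block’s bare method calls resolve against a builder object:

```ruby
# A tiny, hypothetical internal DSL for describing a person.
class PersonBuilder
  attr_reader :attributes
  def initialize; @attributes = {}; end
  def first_name(v); @attributes[:first_name] = v; end
  def last_name(v);  @attributes[:last_name]  = v; end
  def age(v);        @attributes[:age]        = v; end
end

def person &block
  builder = PersonBuilder.new
  builder.instance_eval(&block)  # run the block "inside" the builder
  builder.attributes
end

person do
  first_name "Dean"
  last_name  "Wampler"
  age        39
end
# => {:first_name => "Dean", :last_name => "Wampler", :age => 39}
```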
Stability
JRuby and Ruby are very stable and are widely used in production.
The Ruby syntax and API are undergoing some significant changes in the current 1.9.X release, but migration is not a major challenge.
Performance
JRuby is believed to be the best performing Ruby platform. While it is a topic of hot debate, Ruby and most dynamically-typed languages have higher runtime overhead compared to statically-typed languages. Also, the JVM has some known performance issues for dynamically-typed languages, some of which will be fixed in JDK 7.
As always, enterprises should profile code written in their languages of choice to pick the best one for each particular task.
Adoption
Ruby is very easy to learn, although effective use of advanced techniques like metaprogramming requires some time to master. JRuby-specific idioms are also easy to master and are well documented.
Tool Support
Ruby is experiencing tremendous growth in tool support. IDE support still lags support for Java, but IntelliJ, NetBeans, and Eclipse are working on Ruby support. JRuby users can exploit many Java tools.
Code analysis tools and testing tools (TDD and BDD styles) are now better than Java’s.
Deployment
JRuby applications, even Ruby on Rails applications, can be deployed as jars or wars, requiring no modifications to an existing Java-based infrastructure. Teams use this approach to minimize the “friction” of adopting Ruby, while also getting the performance benefits of the JVM.
Because JRuby code is byte code at runtime, it can be managed with JMX, etc.
Scala
Scala is a statically-typed language that supports an improved object model (with a full mixin mechanism called traits, similar to Ruby modules) and full support for functional programming, following a design goal of Scala’s inventor, Martin Odersky, that these two paradigms can be integrated, despite some surface incompatibilities. Odersky was involved in the design of Java generics (through earlier research languages) and he wrote the original version of the current javac. The name is a contraction of “scalable language”, but the first “a” is pronounced like “ah”, not long as in the word “hay”.
The syntax looks like a cross between Ruby (method definitions start with the def keyword) and Java (e.g., curly braces). Type inferencing and other syntactic conventions significantly reduce the “clutter”, such as the number of explicit type declarations (“annotations”) compared to Java. Scala syntax is very succinct, sometimes even more so than Ruby! For more on Scala, see also my previous blog postings, part 1, part 2, part 3, and this related post on traits vs. aspects.
Interoperability
Scala has the most seamless interoperability with Java of any of the languages discussed here. This is due in part to Scala’s static typing and “closed” classes (as opposed to Ruby’s “open” classes). It is trivial to import and use Java classes, implement interfaces, etc.
Direct API calls from Java to Scala are also supported. The developer needs to know how the names of Scala methods are encoded in byte code. For example, Scala methods can have “operator” names, like “+”. In the byte code, that name will be “$plus”.
Object Model
Scala’s object model extends Java’s model with traits, which support flexible mixin composition. Traits behave like interfaces with implementations. The Scala object model provides other sophisticated features for building “scalable applications”.
New Ideas
Scala brings full support for functional programming to the JVM, including first-class functions and closures. Other aspects of functional programming, like immutable variables and side-effect-free functions, are encouraged by the language but not mandated, as Scala is not a pure functional language. (Functional programming is a very effective strategy for writing thread-safe programs, etc.) Scala’s Actor library is a port of Erlang’s Actor library, a message-based concurrency approach.
In my view, the Actor model is the best general-purpose approach to concurrency. There are times when multi-threaded code is needed for performance, but not for most concurrent applications. (Note: there are Actor libraries for Java, e.g., Kilim.)
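The actor idea itself is language-agnostic. To stay consistent with the Ruby examples earlier, here is a bare-bones sketch using a Thread with a Queue as its mailbox; real actor libraries (Scala’s, Erlang’s, Kilim) add pattern-matched receives, supervision, and much more:

```ruby
require "thread"

# A bare-bones actor: a thread that owns a mailbox and processes one
# message at a time, so its internal state never needs locking.
class TinyActor
  def initialize &handler
    @mailbox = Queue.new
    @thread = Thread.new do
      while (msg = @mailbox.pop) != :stop
        handler.call(msg)
      end
    end
  end

  def send_message msg; @mailbox << msg; end
  def stop; @mailbox << :stop; @thread.join; end
end

results = Queue.new
doubler = TinyActor.new { |n| results << n * 2 }
[1, 2, 3].each { |n| doubler.send_message n }
doubler.stop
3.times.map { results.pop }  # => [2, 4, 6]
```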
Scala has very good support for building internal DSL’s, although it is not quite as good as Ruby’s features for this purpose. It has a combinator parser library that makes external DSL creation comparatively easy. Scala also offers some innovative API’s for XML processing and Swing development.
Stability
Scala is over 5 years old and it is very stable. The API and syntax continue to evolve, but no major, disruptive changes are expected. In fact, the structure of the language is such that almost all changes occur in libraries, not the language grammar.
There are some well-known production deployments, such as back-end services at Twitter.
Performance
Scala provides comparable performance to Java, since it is very close “structurally” to Java code at the byte-code level, given the static typing and “closed” classes. Hence, Scala can exploit JVM optimizations that aren’t available to dynamically-typed languages.
However, Scala will also benefit from planned improvements to support dynamically-typed languages, such as tail-call optimizations (which Scala currently does in the compiler). Hence, Scala probably has marginally better performance than JRuby, in general. If true, Scala may be more appealing than JRuby as a general-purpose, systems language, where performance is critical.
Adoption
Scala is harder to learn and master than JRuby, because it is a more comprehensive language. It not only supports a sophisticated object model, but it also supports functional programming, type inferencing, etc. In my view, the extra effort will be rewarded with higher productivity. Also, because it is closer to Java than JRuby and Clojure, new users will be able to start using it quickly as a “better object-oriented Java”, while they continue to learn the more advanced features, like functional programming, that will accelerate their productivity over the long term.
Tool Support
Scala support in IDE’s still lags support for Java, but it is improving. IntelliJ, NetBeans, and Eclipse now support Scala with plugins. Maven and ant are widely used as the build tool for Scala applications. Several excellent TDD and BDD libraries are available.
Deployment
Scala applications are packaged and deployed just like Java applications, since Scala files are compiled to class files. A Scala runtime jar is also required.
Clojure
Of the three new JVM languages discussed here, Clojure is the least like Java, due to its Lisp syntax and innovative “programming model”. Yet it is also the most innovative and exciting new JVM language for many people. Clojure interoperates with Java code, but it emphasizes functional programming. Unlike the other languages, Clojure does not support object-oriented programming. Instead, it relies on mechanisms like multi-methods and macros to address design problems for which OOP is often used.
One exciting innovation in Clojure is support for software transactional memory, which uses a database-style transactional approach to concurrent modifications of in-memory, mutable state. STM is somewhat controversial. You can google for arguments about its practicality, etc. However, Clojure’s implementation appears to be successful.
Clojure also has other innovative ways of supporting “principled” modification of mutable data, while encouraging the use of immutable data. These features with STM are the basis of Clojure’s approach to robust concurrency.
Finally, Clojure implements several optimizations in the compiler that are important for functional programming, such as optimizing tail call recursion.
Disclaimer: I know less about Clojure than JRuby and Scala. While I have endeavored to get the facts right, there may be errors in the following analysis. Feedback is welcome.
Interoperability
Despite the Lisp syntax and functional-programming emphasis, Clojure interoperates with Java. Calling Java from Clojure uses direct API calls, as for JRuby and Scala. Calling Clojure from Java is more involved. You have to create Java proxies on the Clojure side to generate the byte code needed on the Java side. The idioms for doing this are straightforward, however.
Object Model
Clojure is not an object-oriented language. However, in order to interoperate with Java code, Clojure supports implementing interfaces and instantiating Java objects. Otherwise, Clojure is a significant departure for developers well versed in object-oriented programming but with little functional programming experience.
New Ideas
Clojure brings to the JVM full support for functional programming and popular Lisp concepts like macros, multi-methods, and powerful metaprogramming. It has innovative approaches to safe concurrency, including “principled” mechanisms for supporting mutable state, as discussed previously.
Clojure’s succinct syntax and built-in libraries make processing XML succinct and efficient. DSL creation is also supported using Lisp mechanisms, like macros.
Stability
Clojure is the newest of the three languages profiled here. Hence, it may be the most subject to change. However, given the nature of Lisps, it is more likely that changes will occur in libraries than the language itself. Stability in terms of bugs does not appear to be an issue.
Clojure also has the fewest known production deployments of the three languages. However, industry adoption is expected to happen rapidly.
Performance
Clojure supports type “hints” to assist in optimizing performance. The preliminary discussions I have seen suggest that Clojure offers very good performance.
Adoption
Clojure is more of a departure from Java than is Scala. It will require a motivated team that likes Lisp ;) However, such a team may learn Clojure faster than Scala, since Clojure is a simpler language, e.g., because it doesn’t have its own object model. Also, Lisps are well known for being simple languages, where the real learning comes from understanding how to use them effectively!
However, in my view, as for Scala, the extra learning effort will be rewarded with higher productivity.
Tool Support
As a new language, tool support is limited. Most Clojure developers use Emacs with its excellent Lisp support. Many Java tools can be used with Clojure.
Deployment
Clojure deployment appears to be as straightforward as for the other languages. A Clojure runtime jar is required.
Comparisons
Briefly, let’s review the points and compare the three languages.
Interoperability
All three languages make calling Java code straightforward. Scala interoperates most seamlessly. Scala code is easiest to invoke from Java code, using direct API calls, as long as you know how Scala encodes method names that have “illegal” characters (according to the JVM spec.). Calling JRuby and Clojure code from Java is more involved.
Therefore, if you expect to continue writing Java code that needs to make frequent API calls to the code in the new language, Scala will be a better choice.
Object Model
Scala is closest to Java’s object model. Ruby’s object model is superficially similar to Scala’s, but the dynamic nature of Ruby brings significant differences. Both extend Java’s object model with mixin composition through traits (Scala) or modules (Ruby), that act like interfaces with implementations.
Clojure is quite different, with an emphasis on functional programming and no direct support for object-oriented programming.
New Ideas
JRuby brings the productivity and power of a dynamically-typed language to the JVM, along with the drawbacks. It also brings some functional idioms.
Scala and Clojure bring full support for functional programming. Scala provides a complete Actor model of concurrency (as a library). Clojure brings software transactional memory and other innovations for writing robust concurrent applications. JRuby and Ruby don’t add anything specific for concurrency.
JRuby, like Ruby, is exceptionally good for writing internal DSLs. Scala is also very good, and Clojure benefits from Lisp’s support for DSL creation.
Stability
All the language implementations are of high quality. Scala is the most mature, but JRuby has the widest adoption in production.
Performance
Performance should be comparable for all, but JRuby and Clojure have to deal with some inefficiencies inherent to running dynamic languages on the JVM. Your mileage may vary, so please run realistic profiling experiments on sample implementations that are representative of your needs. Avoid “premature optimization” when choosing a new language. Often, team productivity and “time to market” are more important than raw performance.
Adoption
JRuby is the easiest of the three languages to learn and adopt if you already have some Ruby or Ruby on Rails code in your environment.
Scala has the lowest barrier to adoption because it is the language that most resembles Java “philosophically” (static typing, emphasis on object-oriented programming, etc.). Adopters can start with Scala as a “better Java” and gradually learn the advanced features (mixin composition with traits and functional programming). Scala will appeal the most to teams that prefer statically-typed languages, yet want some of the benefits of dynamically-typed languages, like a succinct syntax.
However, Scala is the most complex of the three languages, while Clojure requires the biggest conceptual leap from Java.
Clojure will appeal to teams willing to explore more radical departures from what they are doing now, with potentially great payoffs!
Deployment
Deployment is easy with all three languages. Scala is most like Java, since you normally compile to class files (there is a limited interpreter mode). JRuby and Clojure code can be interpreted at runtime or compiled.
Summary and Conclusions
All three choices (or comparable substitutions from the list of other languages) will provide a Java team with a more modern language, yet fully leverage the existing investment in Java. Scala is the easiest incremental change. JRuby brings the vibrant Ruby world to the JVM. Clojure offers the most innovative departures from Java.
Video of my RubyConf talk, "Better Ruby through Functional Programming" 67
Confreaks has started posting the videos from RubyConf. Here’s mine on Better Ruby through Functional Programming.
Please ignore the occasional Ruby (and Scala) bugs…
A Scala-style "with" Construct for Ruby 111
Scala has a “mixin” construct called traits, which are roughly analogous to Ruby modules. They allow you to create reusable, modular bits of state and behavior and use them to compose classes and other traits or modules.
The syntax for using Scala traits is quite elegant. It’s straightforward to implement the same syntax in Ruby and doing so has a few useful advantages.
For example, here is Scala code that uses a trait to trace calls to a Worker.work method.
// run with "scala example.scala"
class Worker {
def work() = "work"
}
trait WorkerTracer extends Worker {
override def work() = "Before, " + super.work() + ", After"
}
val worker = new Worker with WorkerTracer
println(worker.work()) // => Before, work, After
Note that WorkerTracer extends Worker so it can override the work method. Since Scala is statically typed, you can’t just define an override method and call super unless the compiler knows there really is a “super” method!
Here’s a Ruby equivalent.
# run with "ruby example.rb"
module WorkerTracer
def work; "Before, #{super}, After"; end
end
class Worker
def work; "work"; end
end
class TracedWorker < Worker
include WorkerTracer
end
worker = TracedWorker.new
puts worker.work # => Before, work, After
Note that we have to create a subclass, which isn’t required for the Scala case (but can be done when desired).
If you know that you will always want to trace calls to work in the Ruby case, you might be tempted to dispense with the subclass and just add include WorkerTracer in Worker. Unfortunately, this won’t work. Due to the way that Ruby resolves methods, the version of work in the module will not be found before the version defined in Worker itself. Hence the subclass seems to be the only option.
However, we can work around this using metaprogramming. We can use WorkerTracer#append_features(...). What goes in the argument list? If we pass Worker, then all instances of Worker will be affected, but actually we’ll still have the problem with the method resolution rules.
If we just want to affect one object and work around the method resolution rules, then we need to pass the singleton class (or eigenclass or metaclass ...) for the object, which you can get with the following expression.
metaclass = class << worker; self; end
So, to encapsulate all this and to get back to the original goal of implementing with-style semantics, here is an implementation that adds a with method to Object, wrapped in an rspec example.
# run with "spec ruby_with_spec.rb"
require 'rubygems'
require 'spec'
# Warning, monkeypatching Object, especially with a name
# that might be commonly used is fraught with peril!!
class Object
def with *modules
metaclass = class << self; self; end
modules.flatten.each do |m|
m.send :append_features, metaclass
end
self
end
end
module WorkerTracer
def work; "Before, #{super}, After"; end
end
module WorkerTracer1
def work; "Before1, #{super}, After1"; end
end
class Worker
def work; "work"; end
end
describe "Object#with" do
it "should make no changes to an object if no modules are specified" do
worker = Worker.new.with
worker.work.should == "work"
end
it "should override any methods with a module's methods of the same name" do
worker = Worker.new.with WorkerTracer
worker.work.should == "Before, work, After"
end
it "should stack overrides for multiple modules" do
worker = Worker.new.with(WorkerTracer).with(WorkerTracer1)
worker.work.should == "Before1, Before, work, After, After1"
end
it "should stack overrides for a list of modules" do
worker = Worker.new.with WorkerTracer, WorkerTracer1
worker.work.should == "Before1, Before, work, After, After1"
end
it "should stack overrides for an array of modules" do
worker = Worker.new.with [WorkerTracer, WorkerTracer1]
worker.work.should == "Before1, Before, work, After, After1"
end
end
You should carefully consider the warning about monkeypatching Object! Also, note that Module#append_features is actually private, so I had to use m.send :append_features, ... instead.
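As a standalone check of both points (the names here are mine, echoing the example above): the public call fails with NoMethodError, while send reaches the private method and splices the module in ahead of the object’s class, without affecting other instances:

```ruby
module WorkerTracer
  def work; "Before, #{super}, After"; end
end

class Worker
  def work; "work"; end
end

worker    = Worker.new
metaclass = class << worker; self; end

begin
  WorkerTracer.append_features(metaclass)  # private method; raises
rescue NoMethodError
  puts "append_features is private"
end

WorkerTracer.send :append_features, metaclass
puts worker.work      # => "Before, work, After"
puts Worker.new.work  # => "work" -- other instances are unaffected
```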
The syntax is reasonably intuitive and it eliminates the need for an explicit subclass. You can pass a single module, or a list or array of them. Because with returns the object, you can also chain with calls.
A final note: many developers steer clear of metaprogramming and reflection features in their languages, out of fear. While prudence is definitely wise, the power of these tools can dramatically accelerate your productivity. Metaprogramming is just programming. Every developer should master it.
The Liskov Substitution Principle for "Duck-Typed" Languages 109
OCP and LSP together tell us how to organize similar vs. variant behaviors. I blogged the other day about OCP in the context of languages with open classes (i.e., dynamically-typed languages). Let’s look at the Liskov Substitution Principle (LSP).
The Liskov Substitution Principle was coined by Barbara Liskov in Data Abstraction and Hierarchy (1987).
If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T.
I’ve always liked the elegant simplicity, yet power, of LSP. In less formal terms, it says that if a client (program) expects objects of one type to behave in a certain way, then it’s only okay to substitute objects of another type if the same expectations are satisfied.
This is our best definition of inheritance. The well-known is-a relationship between types is not precise enough. Rather, the relationship has to be behaves-as-a, which unfortunately is more of a mouthful. Note that is-a focuses on the structural relationship, while behaves-as-a focuses on the behavioral relationship. A very useful, pre-TDD design technique called Design by Contract emerges out of LSP, but that’s another topic.
Note that there is a slight assumption that I made in the previous paragraph. I said that LSP defines inheritance. Why inheritance specifically and not substitutability, in general? Well, inheritance has been the main vehicle for substitutability for most OO languages, especially the statically-typed ones.
For example, a Java application might use a simple tracing abstraction like this.
public interface Tracer {
void trace(String message);
}
Clients might use this to trace method calls to a log. Only classes that implement the Tracer interface can be given to these clients. For example,
public class TracerClient {
private Tracer tracer;
public TracerClient(Tracer tracer) {
this.tracer = tracer;
}
public void doWork() {
tracer.trace("in doWork():");
// ...
}
}
However, Duck Typing is another form of substitutability that is commonly seen in dynamically-typed languages, like Ruby and Python.
If it walks like a duck and quacks like a duck, it must be a duck.
Informally, duck typing says that a client can use any object you give it as long as the object implements the methods the client wants to invoke on it. Put another way, the object must respond to the messages the client wants to send to it.
The object appears to be a “duck” as far as the client is concerned.
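For instance, consider this tiny, hypothetical sketch (not from the original post): two unrelated classes both “quack,” so either can be handed to the same client code:

```ruby
class Duck
  def quack; "Quack!"; end
end

class RobotDuck  # no inheritance relationship to Duck at all
  def quack; "Beep... quack."; end
end

def make_it_quack(quacker)
  # No type check; we only care that the object responds to :quack.
  quacker.quack
end

puts make_it_quack(Duck.new)       # => Quack!
puts make_it_quack(RobotDuck.new)  # => Beep... quack.
```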
In our example, clients only care about the trace(message) method being supported. So, we might do the following in Ruby.
class TracerClient
def initialize tracer
@tracer = tracer
end
def do_work
@tracer.trace "in do_work:"
# ...
end
end
class MyTracer
def trace message
p message
end
end
client = TracerClient.new(MyTracer.new)
No “interface” is necessary. I just need to pass an object to TracerClient.initialize that responds to the trace message. Here, I defined a class for the purpose. You could also add the trace method to another type or object.
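To illustrate that last option, here is a sketch (my own variation on the example) that defines trace as a singleton method on one particular object, rather than in a class:

```ruby
class TracerClient
  def initialize tracer
    @tracer = tracer
  end
  def do_work
    @tracer.trace "in do_work:"
  end
end

logger = Object.new
def logger.trace(message)  # a singleton method on this one object only
  "singleton tracer: #{message}"
end

puts TracerClient.new(logger).do_work  # => singleton tracer: in do_work:
```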
So, LSP is still essential, in the generic sense of valid substitutability, but it doesn’t have to be inheritance based.
Is Duck Typing good or bad? It largely comes down to your view about dynamically-typed vs. statically-typed languages. I don’t want to get into that debate here! However, I’ll make a few remarks.
On the negative side, without a Tracer abstraction, you have to rely on appropriate naming of objects to convey what they do (but you should be doing that anyway). Also, it’s harder to find all the “tracing-behaving” objects in the system.
On the other hand, the client really doesn’t care about a “Tracer” type, only a single method. So, we’ve decoupled “client” and “server” just a bit more. This decoupling is more evident when using closures to express behavior, e.g., for Enumerable methods. In our case, we could write the following.
class TracerClient2
def initialize &tracer
@tracer = tracer
end
def do_work
@tracer.call "in do_work:"
# ...
end
end
client = TracerClient2.new {|message| p "block tracer: #{message}"}
For comparison, consider how we might approach substitutability in Scala. As a statically-typed language, Scala doesn’t support duck typing per se, but it does support a very similar mechanism called structural types.
Essentially, structural types let us declare that a method parameter must support one or more methods, without having to say it supports a full interface. Loosely speaking, it’s like using an anonymous interface.
In our Java example, when we declare a tracer object in our client, we would be able to declare that it supports trace, without having to specify that it implements a full interface.
To be explicit, recall our Java constructor for TracerClient.
public class TracerClient {
public TracerClient(Tracer tracer) { ... }
// ...
}
In Scala, a complete example would be the following.
class ScalaTracerClient(val tracer: { def trace(message:String) }) {
def doWork() = { tracer.trace("doWork") }
}
class ScalaTracer() {
def trace(message: String) = { println("Scala: "+message) }
}
object TestScalaTracerClient {
def main() {
val client = new ScalaTracerClient(new ScalaTracer())
client.doWork();
}
}
TestScalaTracerClient.main()
Recall from my previous blogs on Scala that the argument list to the class name is the constructor argument list. The constructor takes a tracer argument whose “type” (after the ‘:’) is { def trace(message:String) }. That is, all we require of tracer is that it support the trace method.
So, we get duck type-like behavior, but statically type checked. We’ll get a compile error, rather than a run-time error, if someone passes an object to the client that doesn’t respond to trace.
To conclude, LSP can be reworded very slightly.
If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is substitutable for T.
I replaced “a subtype of” with “substitutable for”.
An important point is that the idea of a “contract” between the types and their clients is still important, even in a language with duck-typing or structural typing. However, languages with these features give us more ways to extend our system, while still supporting LSP.
The Open-Closed Principle for Languages with Open Classes 130
We’ve been having a discussion inside Object Mentor World Design Headquarters about the meaning of the OCP for dynamic languages, like Ruby, with open classes.
For example, in Ruby it’s normal to define a class or module, e.g.,
# foo.rb
class Foo
def method1 *args
...
end
end
and later re-open the class and add (or redefine) methods,
# foo2.rb
class Foo
def method2 *args
...
end
end
Users of Foo see all the methods, as if Foo had one definition.
foo = Foo.new
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
Do open classes violate the Open-Closed Principle? Bertrand Meyer articulated OCP. Here is his definition1.
Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.
He elaborated on it here.
... This is the open-closed principle, which in my opinion is one of the central innovations of object technology: the ability to use a software component as it is, while retaining the possibility of adding to it later through inheritance. Unlike the records or structures of other approaches, a class of object technology is both closed and open: closed because we can start using it for other components (its clients); open because we can at any time add new properties without invalidating its existing clients.
Tell Less, Say More: The Power of Implicitness
So, if one client require’s only foo.rb and only uses method1, that client doesn’t care what foo2.rb does. However, if the client also require’s foo2.rb, perhaps indirectly through another require, problems will ensue unless the client is unaffected by what foo2.rb does. This looks a lot like the way “good” inheritance should behave.
So, the answer is no, we aren’t violating OCP, as long as we extend a re-opened class following the same rules we would use when inheriting from it.
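To sketch the distinction (the file names here are hypothetical), an extension that only adds behavior keeps the contract intact, while silently redefining an existing method would break it for existing clients:

```ruby
# foo.rb
class Foo
  def method1 *args
    args.size
  end
end

# foo2.rb: a "good" extension; it only adds behavior
class Foo
  def method2 *args
    "extended"
  end
end

# A "bad" extension would redefine method1 and change its contract:
# class Foo
#   def method1 *args
#     raise "surprise!"  # existing clients of method1 now break
#   end
# end

foo = Foo.new
puts foo.method1(:a, :b)  # => 2, unchanged for existing clients
puts foo.method2          # => extended
```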
If we use inheritance instead:
# foo.rb
class Foo
def method1 *args
...
end
end
...
class DerivedFoo < Foo
def method2 *args
...
end
end
...
foo = DerivedFoo.new # Instantiate different class...
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
One notable, and important, difference is that we have to instantiate a different class. While you can often just use inheritance, and maybe you should prefer it, inheritance only works if you have full control over what types get instantiated and it’s easy to change which types you use. Of course, inheritance is also the best approach when you need all behavioral variants simultaneously, i.e., each variant in one or more objects.
Sometimes you want to affect the behavior of all instances transparently, without changing the types that are instantiated. A slightly better example, logging method calls, illustrates the point. Here we use the “famous” alias_method in Ruby.
# foo.rb
class Foo
def method1 *args
...
end
end
# logging_foo.rb
class Foo
alias_method :old_method1, :method1
def method1 *args
p "Inside method1(#{args.inspect})"
old_method1 *args
end
end
...
foo = Foo.new
foo.method1 :arg1, :arg2
The new Foo#method1 behaves like a subclass override, with extended behavior that still obeys the Liskov Substitution Principle (LSP).
So, I think the OCP can be reworded slightly.
Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source modification.
We should not re-open the original source, but adding functionality through a separate source file is okay.
Actually, I prefer a slightly different wording.
Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source and contract modification.
The extra and contract is redundant with LSP. I don’t think this kind of redundancy is necessarily bad. ;) The contract is the set of behavioral expectations between the “entity” and its client(s). Just as it is bad to break the contract with inheritance, it is also bad to break it through open classes.
OCP and LSP together are our most important design principles for effective organization of similar vs. variant behaviors. Inheritance is one way we do this. Open classes provide another way. Aspects provide a third way and are subject to the same design issues.
1 Meyer, Bertrand (1988). Object-Oriented Software Construction. Prentice Hall. ISBN 0136290493.
Baubles in Orbit 50
I have put together a nice little demonstration of the Bauble concept. You may recall that I first wrote about it here. Baubles are a simple component scheme for Ruby, good for when you want a component, but don’t need something as heavy as a gem.
orbit.zip contains all the files for this demonstration. I suggest you download and unpack it.
First you need to install the Bauble gem. Don’t worry, it won’t hurt anything. Just say gem install orbit/bauble/Bauble-0.1.gem (You’ll probably have to do it with sudo.) That should install Bauble. From now on you only need to say require 'bauble' in your ruby scripts that make use of it.
cd orbit/MultipleBodyOrbit/lib
jruby multiple_body_orbit.rb
A Swing window should pop up and you should be able to watch an orbital simulation. Every run shows a different random scenario, so you can kill a lot of time by watching worlds in collision.
The thing to note, if you are a ruby programmer, is the use of the term Bauble::use(-some_directory-). If you look in the multiple_body_orbit.rb file you’ll see I use two Baubles: the Physics bauble does the raw calculation for all the gravity, forces, collisions, etc., and the cellular_automaton bauble provides a very simple Swing framework for drawing dots on a screen. (Yes, this is jruby).
If you look in either of the two Baubles, you’ll see that the require statements within them do not know (or care) about the directory they live in. There is none of that horrible __FILE__ nonsense that pollutes so many ruby scripts. This is because the Bauble::use function puts the directory path in the LOAD_PATH so that subsequent require statements can simply eliminate the directory spec.
Take a look at the Bauble source code. It’s no great shakes.
Also take a look at the two baubles. They show a pretty nice way to decouple business rules from the GUI. You might recognize the MVP pattern. The multiple_body_orbit.rb file contains the presenter. Clearly the Physics module is the model. And the cellular_automaton module is the view. (There is no controller, because there is no input.)
The Ascendency of Dynamic X vs. Static X, where X = ... 23
I noticed a curious symmetry the other day. For several values of X, a dynamic approach has been gaining traction over a static approach, in some cases for several years.
X = Languages
The Ascendency of Dynamic Languages vs. Static Languages
This one is pretty obvious. It’s hard not to notice the resurgent interest in dynamically-typed languages, like Ruby, Python, Erlang, and even stalwarts like Lisp and Smalltalk.
There is a healthy debate about the relative merits of dynamic vs. static typing, but the “hotness” factor is undeniable.
X = Correctness Analysis
The Ascendency of Dynamic Correctness Analysis vs. Static Correctness Analysis
Analysis of code to prove correctness has been a research topic for years and the tools have become pretty good. If you’re in the Java world, tools like PMD and FindBugs find a lot of real and potential issues.
One thing none of these tools have ever been able to do is to analyze conformance of your code to your project’s requirements. I suppose you could probably build such tools using the same analysis techniques, but the cost would be prohibitive for individual projects.
However, while analyzing the code statically is very hard, watching what the code actually does at runtime is more tractable and cost-effective, using automated tests.
Test-driving code results in a suite of unit, feature, and acceptance tests that do a good enough job, for most applications, of finding logic and requirements bugs. The way test-first development improves the design helps ensure correctness in the first place.
It’s worth emphasizing that automated tests exercise the code using representative data sets and scenarios, so they don’t constitute a proof of correctness. However, they are good enough for most applications.
X = Optimization
The Ascendency of Dynamic Optimization vs. Static Optimization
Perhaps the least well known of these X’s is optimization. Mature compilers like gcc have sophisticated optimizations based on static analysis of code (you can see where this is going…).
On the other hand, the javac compiler does not do a lot of optimizations. Rather, the JVM does.
The JVM watches the code execute and it performs optimizations the compiler could never do, like speculatively inlining polymorphic method calls, based on which types are actually having their methods invoked. The JVM puts in low-overhead guards to confirm that its assumptions are valid for each invocation. If not, the JVM de-optimizes the code.
The JVM can do this optimization because it sees how the code is really used at runtime, while the compiler has no idea when it looks at the code.
Just as for correctness analysis, static optimizations can only go so far. Dynamic optimizations simply bypass a lot of the difficulty and often yield better results.
Steve Yegge provided a nice overview recently of JVM optimizations, as part of a larger discussion on dynamic languages.
There are other dynamic vs. static things I could cite (think networking), but I’ll leave it at these three, for now.
Bauble, Bauble... 52
In Ruby, I hate require statements that look like this:
require File.dirname(__FILE__)+"/myComponent/component.rb"
So I decided to do something about it.
This all started when my son, Micah, told me about his Limelight project. Limelight is a jruby/swing GUI framework. If you want to build a fancy GUI in Ruby, consider this tool.
I have neither the time nor inclination to write a framework like this; but my curiosity was piqued. So in order to see what it was like to do Swing in JRuby I spent a few hours cobbling together an implementation of Langton’s Ant. This turned out to be quite simple.
The result, however, was a mess. There was swing code mixed up with “ant” code, in the classic GUI/Business-rule goulash that we “clean-coders” hate so much. Despite the fact that this was throw-away code, I could not leave it in that state – the moral outrage was just too great. So I spent some more time separating the program into two modules.
The first module knew all about Langton’s ant, but nothing about Swing. The second module was a tiny framework for implementing cellular automata in Swing. (Here are all the files).
I was quite happy with the separation, but did not like the horrible require statements that I had to use. The cellular_automaton component had two classes, in two separate files. In order to get the require right, I had to either use absolute directory paths, or the horrible File.dirname(__FILE__)... structure.
What I wanted was for cellular_automaton to behave like a gem. But I didn’t want to make it into a gem. Gems are kind of “heavy” for a dumb little thing like “cellular_automaton”.
So I created a module named “Bauble” which gave me some gem-like behaviors. Here it is:
module Bauble
def self.use(bauble)
bauble_name = File.basename(bauble)
ensure_in_path "#{bauble}/lib"
require bauble_name
end
def self.ensure_in_path(path)
$LOAD_PATH << path unless $LOAD_PATH.include? path
end
end
This is no great shakes, but it solved my problem. Now, in my Langton’s Ant program all I need to do is this:
require 'bauble'
Bauble.use('../cellular_automaton')
All the ugly requires are gone.
I’m thinking about turning Bauble into a rubyforge project, and making a publicly available gem out of it in order to give folks a standard way to avoid those horrible __FILE__ requires. I think there are several other utilities that could be placed in Bauble, such as require_relative, etc.
Anyway, what do you think?