First Pass Completed: Rough Draft TDD Demonstration Videos
As promised, I’ve made a complete sweep through a series of videos. You can find all of them: here.
These videos include several warts and false starts. Depending on the interest and feedback, I’ll redo these at some point in the future.
Question: Should I repeat this series in C#, or some other language? Some people have expressed interest in this. It’s probably 15 hours of work in C#, so I’d need to know it was worth the effort. What’s your opinion on that?
- Getting Started
- Adding Basic Operators
- Removing Duplication
- Extracting to Strategy
- Removing Duplication via Refactoring or Removing Duplication via Tdd using Mockito
- Introducing an Abstract Factory
- Adding a Sum operator
- Adding Prime Factors Operator
- Composing Operators and Programming the Calculator
- Using FitNesse to Program the Calculator
I’ve already received several comments both here on the blog as well as with the videos. I’ll keep track of those comments and incorporate the ones that fit for me.
Each video has a link on its page to download it. However, to download a video, you will have to create an account and log in. So here are the links; these won’t work without first creating an account (I’ll update the original blog post with these as well):
- Getting Started Download
- Adding Basic Ops Download
- Removing Duplication Download
- Extracting to Strategy
- Removing Dups/Refactoring Download
- Removing Dups/Tdd Download
- Abstract Factory Download
- Sum Operator Download
- Prime Factors Download
- Composing Math Operators Download
- Using FitNesse Download
FitNesse Tutorials
Here is another tutorial for FitNesse: http://schuchert.wikispaces.com/FitNesse.Tutorials.2.
This tutorial and the first now fit together and form one ongoing example. If you work through the first three tutorials at: http://schuchert.wikispaces.com/FitNesse.Tutorials, you’ll have practiced:
- Using Decision Tables
- Using Query Tables
- Refactoring within FitNesse
- Using SetUp and TearDown pages
- Understanding inheritance of SetUp and TearDown pages
- Basic test organization under a suite
- Switching into unit testing from acceptance testing
There’s more to go, but that’s a good start to get you cracking at the fundamentals of FitNesse.
As a bonus, there’s a demonstration of some code in Java that produces query results in a snap. The source code is on github: http://github.com/schuchert/queryresultbuilder/tree/master.
Here is one such example taken from that tutorial:
public List<Object> query() {
    List<Program> programs = CreateSeasonPassFor.getSeasonPassManager()
            .toDoListContentsFor(programId);
    QueryResultBuilder builder = new QueryResultBuilder(Program.class);
    builder.register("timeSlot", new TimeSlotPropertyHandler());
    QueryResult result = builder.build(programs);
    return result.render();
}
Hope this is useful!
Adopting New JVM Languages in the Enterprise (Update)
(Updated to add Groovy, which I should have mentioned the first time. Also mentioned Django under Python.)
This is an exciting time to be a Java programmer. The pace of innovation for the Java language is slowing down, in part due to concerns that the language is growing too big and in part due to economic difficulties at Sun, which means there are fewer developers assigned to Java. However, the real crown jewel of the Java ecosystem, the JVM, has become an attractive platform for new languages. These languages give us exciting new opportunities for growth, while preserving our prior investment in code and deployment infrastructure.
This post emphasizes practical issues of evaluating and picking new JVM languages for an established Java-based enterprise.
The Interwebs are full of technical comparisons between Java and the different languages, e.g., why language X fixes Java’s perceived issue Y. I won’t rehash those arguments here, but I will describe some language features, as needed.
A similar “polyglot” trend is happening on the .NET platform.
The New JVM Languages
I’ll limit my discussion to these representative (and best known) alternative languages for the JVM.
- JRuby – Ruby running on the JVM.
- Scala – A hybrid object-oriented and functional language that runs on .NET as well as the JVM. (Disclaimer: I’m co-writing a book on Scala for O’Reilly.)
- Clojure – A Lisp dialect.
I picked these languages because they seem to be the most likely candidates for most enterprises considering a new JVM language, although some of the languages listed below could make that claim.
There are other deserving languages besides these three, but I don’t have the time to do them justice. Hopefully, you can generalize the subsequent discussion for these other languages.
- Groovy – A dynamically-typed language designed specifically for interoperability with Java. It will appeal to teams that want a dynamically-typed language that is closer to Java than Ruby. With Grails, you have a combination that’s comparable to Ruby on Rails.
- Jython – The first non-Java language ported to the JVM, started by Jim Hugunin in 1997. Most of my remarks about JRuby are applicable to Jython. Django is the Python analog of Rails. If your Java shop already has a lot of Python, consider Jython.
- Fan – A hybrid object-oriented and functional language that runs on .NET, too. It has a lot of similarities to Scala, like a scripting-language feel.
- Ioke – (pronounced “eye-oh-key”) An innovative language developed by Ola Bini and inspired by Io and Lisp. This is the newest language discussed here. Hence, it has a small following, but a lot of potential. The Io/Lisp-flavored syntax will be more challenging to average Java developers than Scala, JRuby, Jython, Fan, and JavaScript.
- JavaScript, e.g., Rhino – Much maligned and misunderstood (e.g., due to buggy and inconsistent browser implementations), JavaScript continues to gain converts as an alternative scripting language for Java applications. It is the default scripting language supported by the JDK 6 scripting interface.
- Fortress – A language designed as a replacement for high-performance FORTRAN for industrial and academic “number crunching”. This one will interest scientists and engineers…
Note: Like a lot of people, I use the term scripting language to refer to languages with a lightweight syntax, usually dynamically typed. The name reflects their convenience for “scripting”, but that quality is sometimes seen as pejorative; they aren’t seen as “serious” languages. I reject this view.
To learn more about what people are doing on the JVM today (with some guest .NET presentations), a good place to start is the recent JVM Language Summit.
Criteria For Evaluating New JVM Languages
I’ll frame the discussion around a few criteria you should consider when evaluating language choices. I’ll then discuss how each of the languages addresses those criteria. Since we’re restricting ourselves to JVM languages, I assume that each language compiles to valid byte code, so code in the new language and code written in Java can call each other, at least at some level. The “some level” part will be one criterion. Substitute X for the language you are considering.
- Interoperability: How easily can X code invoke Java code and vice versa? Specifically:
  - Create objects (i.e., call new Foo(...)).
  - Call methods on an object.
  - Call static methods on a class.
  - Extend a class.
  - Implement an interface.
- Object Model: How different is the object model of X compared to Java’s object model? (This is somewhat tied to the previous point.)
- New “Ideas”: Does X support newer programming trends:
  - Functional Programming.
  - Metaprogramming.
  - Easier approaches to writing robust concurrent applications.
  - Easier support for processing XML, SQL queries, etc.
  - Support internal DSL creation.
  - Easier presentation-tier development of web and thick-client UI’s.
- Stability: How stable is the language, in terms of:
  - Lack of Bugs.
  - Stability of the language’s syntax, semantics, and library API’s. (All the languages can call Java API’s.)
- Performance: How does code written in X perform?
- Adoption: Is X easy to learn and use?
- Tool Support: What about editors, IDE’s, code coverage, etc.
- Deployment: How are apps and libraries written in X deployed?
  - Do I have to modify my existing infrastructure, management, etc.?
The Interoperability point affects ease of adoption and use with a legacy Java code base. The Object Model and Adoption points address the barrier to adoption from the learning point of view. The New “Ideas” point asks what each language brings to development that is not available in Java (or poorly supported) and is seen as valuable to the developer. Finally, Stability, Performance, and Deployment address very practical issues that a candidate production language must address.
Comparing the Languages
JRuby
JRuby is the most popular alternative JVM language, driven largely by interest in Ruby and Ruby on Rails.
Interoperability
Ruby’s object model is a little different than Java’s, but JRuby provides straightforward coding idioms that make it easy to call Java from JRuby. Calling JRuby from Java requires the JSR 223 scripting interface or a similar approach, unless JRuby is used to compile the Ruby code to byte code first. In that case, shortcuts are possible, which are well documented.
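As a rough illustration of the JSR 223 route, here is a minimal sketch using the JDK's javax.script API. It assumes the JRuby jar is on the classpath and registers its engine under the name "jruby"; the class and method names are made up for this example. When no such engine is registered, getEngineByName returns null, so the sketch reports that instead of failing.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper for illustration; not part of JRuby or the JDK.
final class ScriptingSketch {

    /** Evaluates a script with the named JSR 223 engine, or explains why it could not. */
    static String evalOrExplain(String engineName, String script) {
        // Returns null when no engine with that name is on the classpath.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return "no script engine registered as '" + engineName + "'";
        }
        try {
            return String.valueOf(engine.eval(script));
        } catch (ScriptException e) {
            return "script failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumption: JRuby registers its engine under the name "jruby".
        System.out.println(evalOrExplain("jruby", "[1, 2, 3].map { |n| n * 2 }.inspect"));
    }
}
```

The Java side only ever sees ScriptEngine, which is why this route works without compiling the Ruby code first.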
Object Model
Ruby’s object model is a little different than Java’s. Ruby supports mixin-style modules, which behave like interfaces with implementations. So, the Ruby object model needs to be learned, but it is straightforward for the Java developer.
New Ideas
JRuby brings closures to the JVM, a much desired feature that probably won’t be added in the forthcoming Java 7. Using closures, Ruby supports a number of functional-style iterative operations, like mapping, filtering, and reducing/folding. However, Ruby does not fully support functional programming.
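For readers who think in Java, the mapping, filtering, and reducing operations mentioned above look roughly like the following. This is only a sketch and it relies on lambdas and streams, which Java gained in Java 8, after this post was written; the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only; requires Java 8 or later.
final class ClosureStyleSketch {

    /** Filters the even numbers, maps each to its square, and reduces to a sum. */
    static int sumOfSquaresOfEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // filtering
                .map(n -> n * n)           // mapping
                .reduce(0, Integer::sum);  // reducing (folding)
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresOfEvens(Arrays.asList(1, 2, 3, 4)));
    }
}
```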
Ruby uses dynamic-typing instead of static-typing, which it exploits to provide extensive and powerful metaprogramming facilities.
Ruby doesn’t offer any specific enhancements over Java for safe, robust concurrent programming.
Ruby API’s make XML processing and database access relatively easy. Ruby on Rails is legendary for improving the productivity of web developers and similar benefits are available for thick-client developers using other libraries.
Ruby is also one of the best languages for defining “internal” DSL’s, which are used to great effect in Rails (e.g., ActiveRecord).
Stability
JRuby and Ruby are very stable and are widely used in production. JRuby is believed to be the best performing Ruby platform.
The Ruby syntax and API are undergoing some significant changes in the current 1.9.X release, but migration is not a major challenge.
Performance
JRuby is believed to be the best performing Ruby platform. While it is a topic of hot debate, Ruby and most dynamically-typed languages have higher runtime overhead compared to statically-typed languages. Also, the JVM has some known performance issues for dynamically-typed languages, some of which will be fixed in JDK 7.
As always, enterprises should profile code written in their languages of choice to pick the best one for each particular task.
Adoption
Ruby is very easy to learn, although effective use of advanced techniques like metaprogramming requires some time to master. JRuby-specific idioms are also easy to master and are well documented.
Tool Support
Ruby is experiencing tremendous growth in tool support. IDE support still lags support for Java, but IntelliJ, NetBeans, and Eclipse are working on Ruby support. JRuby users can exploit many Java tools.
Code analysis tools and testing tools (TDD and BDD styles) are now better than Java’s.
Deployment
JRuby applications, even Ruby on Rails applications, can be deployed as jars or wars, requiring no modifications to an existing Java-based infrastructure. Teams use this approach to minimize the “friction” of adopting Ruby, while also getting the performance benefits of the JVM.
Because JRuby code is byte code at runtime, it can be managed with JMX, etc.
Scala
Scala is a statically-typed language that supports an improved object model (with a full mixin mechanism called traits; similar to Ruby modules) and full support for functional programming, following a design goal of the inventor of Scala, Martin Odersky, that these two paradigms can be integrated, despite some surface incompatibilities. Odersky was involved in the design of Java generics (through earlier research languages) and he wrote the original version of the current javac. The name is a contraction of “scalable language”, but the first “a” is pronounced like “ah”, not long as in the word “hay”.
The syntax looks like a cross between Ruby (method definitions start with the def keyword) and Java (e.g., curly braces). Type inferencing and other syntactic conventions significantly reduce the “clutter”, such as the number of explicit type declarations (“annotations”) compared to Java. Scala syntax is very succinct, sometimes even more so than Ruby! For more on Scala, see also my previous blog postings, part 1, part 2, part 3, and this related post on traits vs. aspects.
Interoperability
Scala has the most seamless interoperability with Java of any of the languages discussed here. This is due in part to Scala’s static typing and “closed” classes (as opposed to Ruby’s “open” classes). It is trivial to import and use Java classes, implement interfaces, etc.
Direct API calls from Java to Scala are also supported. The developer needs to know how the names of Scala methods are encoded in byte code. For example, Scala methods can have “operator” names, like ”+”. In the byte code, that name will be ”$plus”.
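To make the name encoding concrete, here is a small sketch of what the Java side of such a call looks like. The Money class below is a hypothetical Java stand-in for a compiled Scala class that defines a "+" method; no Scala is actually compiled here. Because "$" is legal in Java identifiers, the encoded name can be called directly.

```java
/** Hypothetical stand-in for a compiled Scala class that defines a "+" method. */
final class Money {
    final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    // What Scala source would write as "+" shows up under the encoded name "$plus".
    Money $plus(Money other) {
        return new Money(cents + other.cents);
    }
}

final class CallEncodedNameSketch {

    static long totalCents() {
        Money a = new Money(150);
        Money b = new Money(275);
        // Java has no operator overloading, so the caller uses the encoded method name.
        return a.$plus(b).cents;
    }

    public static void main(String[] args) {
        System.out.println(totalCents());
    }
}
```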
Object Model
Scala’s object model extends Java’s model with traits, which support flexible mixin composition. Traits behave like interfaces with implementations. The Scala object model provides other sophisticated features for building “scalable applications”.
New Ideas
Scala brings full support for functional programming to the JVM, including first-class functions and closures. Other aspects of functional programming, like immutable variables and side-effect free functions, are encouraged by the language, but not mandated, as Scala is not a pure functional language. (Functional programming is a very effective strategy for writing thread-safe programs, etc.) Scala’s Actor library is a port of Erlang’s Actor library, a message-based concurrency approach.
In my view, the Actor model is the best general-purpose approach to concurrency. There are times when multi-threaded code is needed for performance, but not for most concurrent applications. (Note: there are Actor libraries for Java, e.g., Kilim.)
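For readers who have not met the idea, the core of message-based concurrency can be sketched in plain Java: one thread owns some state and only changes it in response to messages taken from a mailbox. This is a toy built on java.util.concurrent, not Scala's Actor library or Kilim, and every name in it is invented for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy actor: all state changes happen on one thread, driven by mailbox messages.
final class SummingActor {
    private static final Object STOP = new Object();   // sentinel message
    private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
    private long total;                                 // written only by the actor thread
    private final Thread worker = new Thread(this::processMailbox);

    void start() {
        worker.start();
    }

    /** Senders never touch the total; they only enqueue a message. */
    void send(int amount) {
        mailbox.add(amount);
    }

    /** Asks the actor to finish, waits for it, and returns the final total. */
    long stopAndGetTotal() {
        mailbox.add(STOP);
        try {
            worker.join();   // join() makes the actor thread's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the actor", e);
        }
        return total;
    }

    private void processMailbox() {
        try {
            while (true) {
                Object message = mailbox.take();
                if (message == STOP) {
                    return;
                }
                total += (Integer) message;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because only the actor thread mutates the total, no locks are needed around it; the queue is the single synchronization point.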
Scala has very good support for building internal DSL’s, although it is not quite as good as Ruby’s features for this purpose. It has a combinator parser library that makes external DSL creation comparatively easy. Scala also offers some innovative API’s for XML processing and Swing development.
Stability
Scala is over 5 years old and it is very stable. The API and syntax continue to evolve, but no major, disruptive changes are expected. In fact, the structure of the language is such that almost all changes occur in libraries, not the language grammar.
There are some well-known production deployments, such as back-end services at Twitter.
Performance
Scala provides comparable performance to Java, since it is very close “structurally” to Java code at the byte-code level, given the static typing and “closed” classes. Hence, Scala can exploit JVM optimizations that aren’t available to dynamically-typed languages.
However, Scala will also benefit from planned improvements to support dynamically-typed languages, such as tail-call optimizations (which Scala currently does in the compiler). Hence, Scala probably has marginally better performance than JRuby, in general. If true, Scala may be more appealing than JRuby as a general-purpose, systems language, where performance is critical.
Adoption
Scala is harder to learn and master than JRuby, because it is a more comprehensive language. It not only supports a sophisticated object model, but it also supports functional programming, type inferencing, etc. In my view, the extra effort will be rewarded with higher productivity. Also, because it is closer to Java than JRuby and Clojure, new users will be able to start using it quickly as a “better object-oriented Java”, while they continue to learn the more advanced features, like functional programming, that will accelerate their productivity over the long term.
Tool Support
Scala support in IDE’s still lags support for Java, but it is improving. IntelliJ, NetBeans, and Eclipse now support Scala with plugins. Maven and Ant are widely used as build tools for Scala applications. Several excellent TDD and BDD libraries are available.
Deployment
Scala applications are packaged and deployed just like Java applications, since Scala files are compiled to class files. A Scala runtime jar is also required.
Clojure
Of the three new JVM languages discussed here, Clojure is the least like Java, due to its Lisp syntax and innovative “programming model”. Yet it is also the most innovative and exciting new JVM language for many people. Clojure interoperates with Java code, but it emphasizes functional programming. Unlike the other languages, Clojure does not support object-oriented programming. Instead, it relies on mechanisms like multi-methods and macros to address design problems for which OOP is often used.
One exciting innovation in Clojure is support for software transactional memory, which uses a database-style transactional approach to concurrent modifications of in-memory, mutable state. STM is somewhat controversial. You can google for arguments about its practicality, etc. However, Clojure’s implementation appears to be successful.
Clojure also has other innovative ways of supporting “principled” modification of mutable data, while encouraging the use of immutable data. These features with STM are the basis of Clojure’s approach to robust concurrency.
Finally, Clojure implements several optimizations in the compiler that are important for functional programming, such as optimizing tail call recursion.
Disclaimer: I know less about Clojure than JRuby and Scala. While I have endeavored to get the facts right, there may be errors in the following analysis. Feedback is welcome.
Interoperability
Despite the Lisp syntax and functional-programming emphasis, Clojure interoperates with Java. Calling Java from Clojure uses direct API calls, as for JRuby and Scala. Calling Clojure from Java is more involved. You have to create Java proxies on the Clojure side to generate the byte code needed on the Java side. The idioms for doing this are straightforward, however.
Object Model
Clojure is not an object-oriented language. However, in order to interoperate with Java code, Clojure supports implementing interfaces and instantiating Java objects. Otherwise, Clojure offers a significant departure for developers well versed in object-oriented programming, but with little functional programming experience.
New Ideas
Clojure brings to the JVM full support for functional programming and popular Lisp concepts like macros, multi-methods, and powerful metaprogramming. It has innovative approaches to safe concurrency, including “principled” mechanisms for supporting mutable state, as discussed previously.
Clojure’s succinct syntax and built-in libraries make processing XML succinct and efficient. DSL creation is also supported using Lisp mechanisms, like macros.
Stability
Clojure is the newest of the three languages profiled here. Hence, it may be the most subject to change. However, given the nature of Lisps, it is more likely that changes will occur in libraries than the language itself. Stability in terms of bugs does not appear to be an issue.
Clojure also has the fewest known production deployments of the three languages. However, industry adoption is expected to happen rapidly.
Performance
Clojure supports type “hints” to assist in optimizing performance. The preliminary discussions I have seen suggest that Clojure offers very good performance.
Adoption
Clojure is more of a departure from Java than is Scala. It will require a motivated team that likes Lisp ;) However, such a team may learn Clojure faster than Scala, since Clojure is a simpler language, e.g., because it doesn’t have its own object model. Also, Lisps are well known for being simple languages, where the real learning comes in understanding how to use it effectively!
However, in my view, as for Scala, the extra learning effort will be rewarded with higher productivity.
Tool Support
As a new language, tool support is limited. Most Clojure developers use Emacs with its excellent Lisp support. Many Java tools can be used with Clojure.
Deployment
Clojure deployment appears to be as straightforward as for the other languages. A Clojure runtime jar is required.
Comparisons
Briefly, let’s review the points and compare the three languages.
Interoperability
All three languages make calling Java code straightforward. Scala interoperates most seamlessly. Scala code is easiest to invoke from Java code, using direct API calls, as long as you know how Scala encodes method names that have “illegal” characters (according to the JVM spec.). Calling JRuby and Clojure code from Java is more involved.
Therefore, if you expect to continue writing Java code that needs to make frequent API calls to the code in the new language, Scala will be a better choice.
Object Model
Scala is closest to Java’s object model. Ruby’s object model is superficially similar to Scala’s, but the dynamic nature of Ruby brings significant differences. Both extend Java’s object model with mixin composition through traits (Scala) or modules (Ruby), that act like interfaces with implementations.
Clojure is quite different, with an emphasis on functional programming and no direct support for object-oriented programming.
New Ideas
JRuby brings the productivity and power of a dynamically-typed language to the JVM, along with the drawbacks. It also brings some functional idioms.
Scala and Clojure bring full support for functional programming. Scala provides a complete Actor model of concurrency (as a library). Clojure brings software transactional memory and other innovations for writing robust concurrent applications. JRuby and Ruby don’t add anything specific for concurrency.
JRuby, like Ruby, is exceptionally good for writing internal DSL’s. Scala is also very good and Clojure benefits from Lisp’s support for DSL creation.
Stability
All the language implementations are of high quality. Scala is the most mature, but JRuby has the widest adoption in production.
Performance
Performance should be comparable for all, but JRuby and Clojure have to deal with some inefficiencies inherent to running dynamic languages on the JVM. Your mileage may vary, so please run realistic profiling experiments on sample implementations that are representative of your needs. Avoid “premature optimization” when choosing a new language. Often, team productivity and “time to market” are more important than raw performance.
Adoption
JRuby is the easiest of the three languages to learn and adopt if you already have some Ruby or Ruby on Rails code in your environment.
Scala has the lowest barrier to adoption because it is the language that most resembles Java “philosophically” (static typing, emphasis on object-oriented programming, etc.). Adopters can start with Scala as a “better Java” and gradually learn the advanced features (mixin composition with traits and functional programming). Scala will appeal the most to teams that prefer statically-typed languages, yet want some of the benefits of dynamically-typed languages, like a succinct syntax.
However, Scala is the most complex of the three languages, while Clojure requires the biggest conceptual leap from Java.
Clojure will appeal to teams willing to explore more radical departures from what they are doing now, with potentially great payoffs!
Deployment
Deployment is easy with all three languages. Scala is most like Java, since you normally compile to class files (there is a limited interpreter mode). JRuby and Clojure code can be interpreted at runtime or compiled.
Summary and Conclusions
All three choices (or comparable substitutions from the list of other languages) will provide a Java team with a more modern language, yet fully leverage the existing investment in Java. Scala is the easiest incremental change. JRuby brings the vibrant Ruby world to the JVM. Clojure offers the most innovative departures from Java.
I'm glad that static typing is there to help...
The Background
A colleague was using FitNesse to create a general fixture for setting values in various objects rendered from a DTD. Of course you can write one per top level object, but given the number of eventual end-points, this would require a bit too much manual coding.
This sounds like a candidate for reflection, correct? Yep, but rather than do that manually, using the Jakarta Commons BeanUtils makes sense – it’s a pretty handy library to be familiar with if you’re ever doing reflective programming with attributes.
package com.objectmentor.arraycopyexample;

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ArrayPropertySetterTest {
    @Test
    public void assertCanAssignToArrayFieldFromArrayOfObject() {
        Object[] arrayOfBars = createArrayOfBars();
        Foo foo = new Foo();
        ArrayPropertySetter.assignToArrayFieldFromObjectArray(foo, "bars", arrayOfBars);
        assertEquals(3, foo.getBars().length);
    }

    private Object[] createArrayOfBars() {
        Object[] objectArray = new Object[3];
        for (int i = 0; i < objectArray.length; ++i)
            objectArray[i] = new Bar();
        return objectArray;
    }
}
For completeness, you’ll need to see the Foo and Bar classes:
Bar
package com.objectmentor.arraycopyexample;

public class Bar {
}
Foo
package com.objectmentor.arraycopyexample;

public class Foo {
    Bar[] bars;

    public Bar[] getBars() {
        return bars;
    }

    public void setBars(Bar[] bars) {
        this.bars = bars;
    }
}
So an instance of a Foo holds on to an array of Bar objects; and the Foo class has the standard java-bean-esque setters and getters.
With this description of how to set an array field on a Java bean, let’s get this to actually work.
First question, how do you deal with arrays in Java? Sounds trivial, right? If you don’t mind a little pain, it’s not that bad… By dealing, I mean what happens when someone has given you an array created as follows:
Object[] arrayOfObject = new Object[3];
Note that this is very different from this:
Object[] arrayOfBars = new Bar[3];
The runtime type of these two results is different. One is an array of Object; the other is an array of Bar.
This will not work:
Bar[] arrayOfBar = (Bar[]) arrayOfObject;
This compiles, but it throws a ClassCastException at runtime. You cannot simply take something allocated as an array of objects and cast it to an array of a specific type. NO, you have to do something more like the following:
Array.newInstance(typeYouWantAnArrayOf, sizeOfArray);
That’s not too bad, right? You can then either use another method on the Array class to set the values, or you can cast the result to an appropriate array.
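A small self-contained sketch of both points, using String in place of Bar so it runs on its own. The class and method names are made up for the example; Array.newInstance and Array.set are the real java.lang.reflect.Array methods.

```java
import java.lang.reflect.Array;

final class ArrayRuntimeTypeSketch {

    /** Shows that an array allocated as Object[] cannot be cast to a more specific array type. */
    static boolean castOfPlainObjectArrayFails() {
        Object[] arrayOfObject = new Object[3];
        try {
            String[] strings = (String[]) arrayOfObject;   // compiles, fails at runtime
            return strings == null;                        // never reached
        } catch (ClassCastException expected) {
            return true;
        }
    }

    /** Creates an array whose runtime type really is String[]. */
    static String[] newTypedArray() {
        Object created = Array.newInstance(String.class, 2);
        Array.set(created, 0, "first");        // option 1: set values through the Array class
        String[] typed = (String[]) created;   // option 2: cast; legal because the runtime type is String[]
        typed[1] = "second";
        return typed;
    }

    public static void main(String[] args) {
        System.out.println(castOfPlainObjectArrayFails());
        System.out.println(newTypedArray().getClass().getComponentType());
    }
}
```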
That’s enough information to write a generic method to copy from an array of Object to an array of a subtype of Object:
public static Object[] copyToArrayOfType(Class destinationType, Object[] fromArray) {
    Object[] result = (Object[]) Array.newInstance(destinationType, fromArray.length);
    for (int i = 0; i < fromArray.length; ++i)
        result[i] = fromArray[i];
    return result;
}
This is a bit unruly because the caller still needs to cast the result:
Object[] arrayOfObject = new Object[] { new Foo(), new Foo(), new Foo() };
Foo[] arrayOfFoo = (Foo[]) copyToArrayOfType(Foo.class, arrayOfObject);
We can get rid of this cast if we use generics:
public static <T> T[] copyToArrayOfType(Class<T> destinationType, Object[] fromArray) {
    T[] result = (T[]) Array.newInstance(destinationType, fromArray.length);
    for (int i = 0; i < fromArray.length; ++i)
        result[i] = (T) fromArray[i];
    return result;
}
This doesn’t quite work because of type erasure, so to get this to “compile cleanly – no warnings”, you’ll need to add the following line above the method:
@SuppressWarnings("unchecked")
That’s just me telling the compiler I really think I know what I’m doing.
With this change, you can now write the following:
Object[] arrayOfObject = new Object[] { new Foo(), new Foo(), new Foo() };
Foo[] arrayOfFoo = copyToArrayOfType(Foo.class, arrayOfObject);
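One caveat is worth a sketch. The (T) cast on each element is erased at compile time, so an element of the wrong type is only caught by the array's own store check, which surfaces as an ArrayStoreException. A variation that checks each element explicitly with Class.cast is shown below; this is a possible alternative, not the version from the post, and the enclosing class name is invented.

```java
import java.lang.reflect.Array;

final class CheckedArrayCopySketch {

    /** Copies fromArray into a new array of destinationType, checking every element. */
    @SuppressWarnings("unchecked")
    static <T> T[] copyToArrayOfType(Class<T> destinationType, Object[] fromArray) {
        T[] result = (T[]) Array.newInstance(destinationType, fromArray.length);
        for (int i = 0; i < fromArray.length; ++i) {
            // Class.cast throws ClassCastException for an element of the wrong type.
            result[i] = destinationType.cast(fromArray[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        Integer[] numbers = copyToArrayOfType(Integer.class, new Object[] { 1, 2, 3 });
        System.out.println(numbers.length);
    }
}
```

Either way the bad element is rejected at runtime; the difference is only in which exception reports it.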
The original problem was to take an Object[] and set it into a destination object’s attribute. Now that we can create an array with the correct type, what next?
There are several things still remaining:
- Given the name of the property, determine its underlying array type.
- Create the array (above)
- Assign the value to the underlying field
- Do some sufficient hand-waving to handle exceptions
Here are solutions for each of those things:
Determine underlying type
PropertyDescriptor pd = PropertyUtils.getPropertyDescriptor(destObject, fieldName);
Class<?> destType = pd.getPropertyType().getComponentType();
Create the array
Object[] destArray = copyToArrayOfType(destType, fromArray);
Assign the value
PropertyUtils.setSimpleProperty(destObject, fieldName, destArray);
Here’s all of that put together and simply capturing all of the checked exceptions (that’s a whole other can of worms):
public static void assignToArrayFieldFromObjectArray(Object destObject, String fieldName, Object[] fromArray) {
    try {
        PropertyDescriptor pd = PropertyUtils.getPropertyDescriptor(destObject, fieldName);
        Class<?> destType = pd.getPropertyType().getComponentType();
        Object[] destArray = copyToArrayOfType(destType, fromArray);
        PropertyUtils.setSimpleProperty(destObject, fieldName, destArray);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
That’s all it takes to copy an array and then set the value in the field of a destination object.
Simple, right?
Sometimes static (and strong) typing can get in the way. This is one of those cases. Luckily, you can write this one and use it all over. Maybe it’s a part of the BeanUtils that I was unable to track down (probably).
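If pulling in BeanUtils is not an option, the same three steps can be approximated with nothing but java.lang.reflect. The sketch below is a JDK-only stand-in, not the BeanUtils approach itself: it finds the setter by naming convention rather than through a property descriptor, and Widget and WidgetHolder are hypothetical beans that mirror Bar and Foo.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Method;

final class ReflectiveArraySetterSketch {

    /** Hypothetical bean types for illustration, mirroring Bar and Foo. */
    public static class Widget { }

    public static class WidgetHolder {
        private Widget[] widgets;
        public Widget[] getWidgets() { return widgets; }
        public void setWidgets(Widget[] widgets) { this.widgets = widgets; }
    }

    static void assignToArrayProperty(Object bean, String property, Object[] fromArray) {
        String setterName = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        try {
            for (Method m : bean.getClass().getMethods()) {
                Class<?>[] params = m.getParameterTypes();
                if (m.getName().equals(setterName) && params.length == 1 && params[0].isArray()) {
                    // Step 1: determine the underlying array type from the setter's parameter.
                    Class<?> componentType = params[0].getComponentType();
                    // Step 2: create an array of that type and copy the elements across.
                    Object typed = Array.newInstance(componentType, fromArray.length);
                    System.arraycopy(fromArray, 0, typed, 0, fromArray.length);
                    // Step 3: assign through the setter.
                    m.invoke(bean, typed);
                    return;
                }
            }
        } catch (ReflectiveOperationException e) {
            // Same hand-waving as the post: wrap the checked exceptions.
            throw new RuntimeException(e);
        }
        throw new IllegalArgumentException("no array setter for property: " + property);
    }
}
```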
The Liskov Substitution Principle for "Duck-Typed" Languages
OCP and LSP together tell us how to organize similar vs. variant behaviors. I blogged the other day about OCP in the context of languages with open classes (i.e., dynamically-typed languages). Let’s look at the Liskov Substitution Principle (LSP).
The Liskov Substitution Principle was coined by Barbara Liskov in Data Abstraction and Hierarchy (1987).
If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T.
I’ve always liked the elegant simplicity, yet power, of LSP. In less formal terms, it says that if a client (program) expects objects of one type to behave in a certain way, then it’s only okay to substitute objects of another type if the same expectations are satisfied.
This is our best definition of inheritance. The well-known is-a relationship between types is not precise enough. Rather, the relationship has to be behaves-as-a, which unfortunately is more of a mouthful. Note that is-a focuses on the structural relationship, while behaves-as-a focuses on the behavioral relationship. A very useful, pre-TDD design technique called Design by Contract emerges out of LSP, but that’s another topic.
Note that there is a slight assumption that I made in the previous paragraph. I said that LSP defines inheritance. Why inheritance specifically and not substitutability, in general? Well, inheritance has been the main vehicle for substitutability for most OO languages, especially the statically-typed ones.
For example, a Java application might use a simple tracing abstraction like this.
public interface Tracer {
    void trace(String message);
}
Clients might use this to trace method calls to a log. Only classes that implement the Tracer interface can be given to these clients. For example,
public class TracerClient {
    private Tracer tracer;
    public TracerClient(Tracer tracer) {
        this.tracer = tracer;
    }
    public void doWork() {
        tracer.trace("in doWork():");
        // ...
    }
}
However, Duck Typing is another form of substitutability that is commonly seen in dynamically-typed languages, like Ruby and Python.
If it walks like a duck and quacks like a duck, it must be a duck.
Informally, duck typing says that a client can use any object you give it as long as the object implements the methods the client wants to invoke on it. Put another way, the object must respond to the messages the client wants to send to it.
The object appears to be a “duck” as far as the client is concerned.
In our example, clients only care about the trace(message) method being supported. So, we might do the following in Ruby.
class TracerClient
  def initialize tracer
    @tracer = tracer
  end
  def do_work
    @tracer.trace "in do_work:"
    # ...
  end
end
class MyTracer
  def trace message
    p message
  end
end
client = TracerClient.new(MyTracer.new)
No “interface” is necessary. I just need to pass an object to TracerClient.initialize that responds to the trace message. Here, I defined a class for the purpose. You could also add the trace method to another type or object.
So, LSP is still essential, in the generic sense of valid substitutability, but it doesn’t have to be inheritance based.
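To make “valid substitutability” a little more concrete, here is a rough Ruby sketch. The class names and the honors_tracer_contract? helper are made up for illustration; they are not from any library. The idea is that a duck-typed substitute has to do more than define trace: it also has to accept what clients actually send. PickyTracer responds to trace but strengthens the precondition, so it is not a safe substitute.

```ruby
# Illustrative only: these classes and the contract check are assumptions,
# not part of the original example.
class ListTracer
  attr_reader :messages

  def initialize
    @messages = []
  end

  def trace(message)
    @messages << message
  end
end

class NullTracer
  def trace(message)
    # Deliberately ignores the message.
  end
end

class PickyTracer
  # Responds to trace, but rejects messages that clients legitimately send.
  def trace(message)
    raise ArgumentError, "message too long" if message.length >= 5
  end
end

# A tracer "behaves as a" tracer if it responds to trace and accepts
# the kind of message TracerClient#do_work sends.
def honors_tracer_contract?(tracer)
  return false unless tracer.respond_to?(:trace)
  tracer.trace("in do_work:")
  true
rescue StandardError
  false
end

[ListTracer.new, NullTracer.new, PickyTracer.new, Object.new].each do |candidate|
  puts "#{candidate.class}: #{honors_tracer_contract?(candidate)}"
end
```

A check like this is closer to a unit test of the contract than to a type declaration, which is about the best a duck-typed language can offer here.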
Is Duck Typing good or bad? It largely comes down to your view about dynamically-typed vs. statically-typed languages. I don’t want to get into that debate here! However, I’ll make a few remarks.
On the negative side, without a Tracer abstraction, you have to rely on appropriate naming of objects to convey what they do (but you should be doing that anyway). Also, it’s harder to find all the “tracing-behaving” objects in the system.
On the other hand, the client really doesn’t care about a “Tracer” type, only a single method. So, we’ve decoupled “client” and “server” just a bit more. This decoupling is more evident when using closures to express behavior, e.g., for Enumerable methods. In our case, we could write the following.
class TracerClient2
  def initialize &tracer
    @tracer = tracer
  end
  def do_work
    @tracer.call "in do_work:"
    # ...
  end
end
client = TracerClient2.new {|message| p "block tracer: #{message}"}
For comparison, consider how we might approach substitutability in Scala. As a statically-typed language, Scala doesn’t support duck typing per se, but it does support a very similar mechanism called structural types.
Essentially, structural types let us declare that a method parameter must support one or more methods, without having to say it supports a full interface. Loosely speaking, it’s like using an anonymous interface.
In our Java example, when we declare a tracer object in our client, we would be able to declare that it supports trace, without having to specify that it implements a full interface.
To be explicit, recall our Java constructor for TracerClient.
public class TracerClient {
    public TracerClient(Tracer tracer) { ... }
    // ...
}
In Scala, a complete example would be the following.
class ScalaTracerClient(val tracer: { def trace(message:String) }) {
  def doWork() = { tracer.trace("doWork") }
}
class ScalaTracer() {
  def trace(message: String) = { println("Scala: "+message) }
}
object TestScalaTracerClient {
  def main() {
    val client = new ScalaTracerClient(new ScalaTracer())
    client.doWork();
  }
}
TestScalaTracerClient.main()
Recall from my previous blog posts on Scala that the argument list after the class name declares the constructor arguments. The constructor takes a tracer argument whose “type” (after the ’:’) is { def trace(message:String) }. That is, all we require of tracer is that it support the trace method.
So, we get duck type-like behavior, but statically type checked. We’ll get a compile error, rather than a run-time error, if someone passes an object to the client that doesn’t respond to trace.
To conclude, LSP can be reworded very slightly.
If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is substitutable for T.
I replaced a subtype of with substitutable for.
An important point is that the idea of a “contract” between the types and their clients is still important, even in a language with duck-typing or structural typing. However, languages with these features give us more ways to extend our system, while still supporting LSP.
The Seductions of Scala, Part III - Concurrent Programming
This is my third and last blog entry on The Seductions of Scala, where we’ll look at concurrency using Actors and draw some final conclusions.
Writing Robust, Concurrent Programs with Scala
The most commonly used model of concurrency in imperative languages (and databases) uses shared, mutable state with access synchronization. (Recall that synchronization isn’t necessary for reading immutable objects.)
However, it’s widely known that this kind of concurrency programming is very difficult to do properly and few programmers are skilled enough to write such programs.
Because pure functional languages have no side effects and no shared, mutable state, there is nothing to synchronize. This is the main reason for the recent resurgence of interest in functional programming, as a potential solution to the so-called multicore problem.
Instead, several functional languages, in particular Erlang and Scala, use the Actor model of concurrency, where autonomous “objects” run in separate processes or threads and pass messages back and forth to communicate. The simplicity of the Actor model makes it far easier to create robust programs. Erlang processes are so lightweight that it is common for server-side applications to have thousands of communicating processes.
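For readers who think in Ruby, here is a minimal sketch of the message-passing idea using a plain thread and a Queue as the mailbox. This is not an actor library; the EchoActor class and its tell and stop methods are my own illustrative names, and I’m assuming a reasonably recent Ruby where Queue is built in. The point is only that the actor’s state is touched by one thread, and everyone else communicates by sending messages.

```ruby
# A toy "actor": one thread that owns its state and reads from a mailbox.
# Hypothetical names throughout; a sketch, not a real actor implementation.
class EchoActor
  def initialize
    @mailbox = Queue.new
    @received = []                  # touched only by the actor's own thread
    @thread = Thread.new { run }
  end

  # Asynchronous send; returns immediately.
  def tell(message)
    @mailbox << message
    self
  end

  # Ask the actor to stop, wait for it, and return what it received in order.
  def stop
    @mailbox << :exit
    @thread.join
    @received
  end

  private

  def run
    loop do
      message = @mailbox.pop        # blocks until a message arrives
      break if message == :exit
      @received << message
    end
  end
end

actor = EchoActor.new
actor.tell("hello").tell("world!")
p actor.stop
```

Unlike Erlang processes, each of these toy actors costs a full thread, so this approach would not scale to thousands of them.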
Actors in Scala
Let’s finish our survey of Scala with an example using Scala’s Actors library.
Here’s a simple Actor that just counts to 10, printing each number, one per second.
import scala.actors._
object CountingActor extends Actor {
  def act() {
    for (i <- 1 to 10) {
      println("Number: "+i)
      Thread.sleep(1000)
    }
  }
}
CountingActor.start()
The last line starts the actor, which implicitly invokes the act method. This actor does not respond to any messages from other actors.
Here is an actor that responds to messages, echoing the message it receives.
import scala.actors.Actor._
val echoActor = actor {
  while (true) {
    receive {
      case msg => println("received: "+msg)
    }
  }
}
echoActor ! "hello"
echoActor ! "world!"
In this case, we do the equivalent of a Java “static import” of the methods on Actor, e.g., actor. Also, we don’t actually need a special class; we can just create an object with the desired behavior. This object has an infinite loop that effectively blocks while waiting for an incoming message. The receive method gets a block that is a match statement, which matches on anything received and prints it out.
Messages are sent using the target_actor ! message syntax.
As a final example, let’s do something non-trivial; a contrived network node monitor.
import scala.actors._
import scala.actors.Actor._
import java.net.InetAddress
import java.io.IOException
case class NodeStatusRequest(address: InetAddress, respondTo: Actor)
sealed abstract class NodeStatus
case class Available(address: InetAddress) extends NodeStatus
case class Unresponsive(address: InetAddress, reason: Option[String]) extends NodeStatus
object NetworkMonitor extends Actor {
  def act() {
    loop {
      react { // Like receive, but uses thread pooling for efficiency.
        case NodeStatusRequest(address, actor) =>
          actor ! checkNodeStatus(address)
        case "EXIT" => exit()
      }
    }
  }
  val timeoutInMillis = 1000;
  def checkNodeStatus(address: InetAddress) = {
    try {
      if (address.isReachable(timeoutInMillis))
        Available(address)
      else
        Unresponsive(address, None)
    } catch {
      case ex: IOException =>
        Unresponsive(address, Some("IOException thrown: "+ex.getMessage()))
    }
  }
}
// Try it out:
val allOnes = Array(1, 1, 1, 1).map(_.toByte)
NetworkMonitor.start()
NetworkMonitor ! NodeStatusRequest(InetAddress.getByName("www.scala-lang.org"), self)
NetworkMonitor ! NodeStatusRequest(InetAddress.getByAddress("localhost", allOnes), self)
NetworkMonitor ! NodeStatusRequest(InetAddress.getByName("objectmentor.com"), self)
NetworkMonitor ! "EXIT"
self ! "No one expects the Spanish Inquisition!!"
def handleNodeStatusResponse(response: NodeStatus) = response match {
  // Sealed classes help here
  case Available(address) =>
    println("Node "+address+" is alive.")
  case Unresponsive(address, None) =>
    println("Node "+address+" is unavailable. Reason: <unknown>")
  case Unresponsive(address, Some(reason)) =>
    println("Node "+address+" is unavailable. Reason: "+reason)
}
for (i <- 1 to 4) self.receive { // Sealed classes don't help here
  case (response: NodeStatus) => handleNodeStatusResponse(response)
  case unexpected => println("Unexpected response: "+unexpected)
}
We begin by importing the Actor classes, the methods on Actor, like actor, and a few Java classes we need.
Next we define a sealed abstract base class. The sealed keyword tells the compiler that the only subclasses will be defined in this file. This is useful for the case statements that use them. The compiler will know that it doesn’t have to worry about potential cases that aren’t covered, if new NodeStatus subclasses are created. Otherwise, we would have to add a default case clause (e.g., case _ => ...) to prevent warnings (and possible errors!) about not matching an input. Sealed class hierarchies are a useful feature for robustness (but watch for potential Open/Closed Principle violations!).
The sealed class hierarchy encapsulates all the possible node status values (somewhat contrived for the example). The node is either Available or Unresponsive. If Unresponsive, an optional reason message is returned.
Note that we only get the benefit of sealed classes here because we match on them in the handleNodeStatusResponse method, which requires a response argument of type NodeStatus. In contrast, the receive method effectively takes an Any argument, so sealed classes don’t help on the line with the comment “Sealed classes don’t help here”. In that case, we really need a default, the case unexpected => ... clause. (I added the message self ! "No one expects the Spanish Inquisition!!" to test this default handler.)
In the first draft of this blog post, I didn’t know these details about sealed classes. I used a simpler implementation that couldn’t benefit from sealed classes. Thanks to the first commenter, LaLit Pant, who corrected my mistake!
The NetworkMonitor loops, waiting for a NodeStatusRequest or the special string “EXIT”, which tells it to quit. Note that the actor sending the request passes itself, so the monitor can reply to it.
The checkNodeStatus method attempts to contact the node, with a 1 second timeout. It returns an appropriate NodeStatus.
Then we try it out with three addresses. Note that we pass self as the requesting actor. This is an Actor wrapping the current thread, imported from Actor. It is analogous to Java’s Thread.currentThread().
Curiously enough, when I run this code, I get the following results.
Unexpected response: No one expects the Spanish Inquisition!!
Node www.scala-lang.org/128.178.154.102 is unavailable. Reason: <unknown>
Node localhost/1.1.1.1 is unavailable. Reason: <unknown>
Node objectmentor.com/206.191.6.12 is alive.
The message about the Spanish Inquisition was sent last, but processed first, probably because self sent it to itself.
I’m not sure why www.scala-lang.org couldn’t be reached. A longer timeout didn’t help. According to the Javadocs for InetAddress.isReachable(), it uses ICMP ECHO REQUESTs if the privilege can be obtained; otherwise it tries to establish a TCP connection on port 7 (Echo) of the destination host. Perhaps neither is supported on the scala-lang.org site.
Conclusions
Here are some concluding observations about Scala vis-à-vis Java and other options.
A Better Java
Ignoring the functional programming aspects for a moment, I think Scala improves on Java in a number of very useful ways, including:
- A more succinct syntax. There’s far less boilerplate, like for fields and their accessors. Type inference and optional semicolons, curly braces, etc. also reduce “noise”.
- A true mixin model. The addition of traits solves the problem of not having a good DRY way to mix in additional functionality declared by Java interfaces.
- More flexible method names and invocation syntax. Java took away operator overloading; Scala gives it back, as well as other benefits of using non-alphanumeric characters in method names. (Ruby programmers enjoy writing list.empty?, for example.)
- Tuples. A personal favorite, I’ve always wanted the ability to return multiple values from a method, without having to create an ad hoc class to hold the values.
- Better separation of mutable vs. immutable objects. While Java provides some ability to make objects final, Scala makes the distinction between mutability and immutability more explicit and encourages the latter as a more robust programming style.
- First-class functions and closures. Okay, these last two points are really about FP, but they sure help in OO code, too!
- Better mechanisms for avoiding null’s. The Option type makes code more robust than allowing null values.
- Interoperability with Java libraries. Scala compiles to byte code so adding Scala code to existing Java applications is about as seamless as possible.
So, even if you don’t believe in FP, you will gain a lot just by using Scala as a better Java.
Functional Programming
But, you shouldn’t ignore the benefits of FP!
- Better robustness. Not only for concurrent programs, but using immutable objects (a.k.a. value objects) reduces the potential for bugs.
- A workable concurrency model. I use the term workable because so few developers can write robust concurrent code using the synchronization-on-shared-state model. Even for those of you who can, why bother when Actors are so much easier?
- Reduced code complexity. Functional code tends to be very succinct. I can’t overestimate the importance of rooting out all accidental complexity in your code base. Excess complexity is one of the most pervasive detriments to productivity and morale that I see in my clients’ code bases!
- First-class functions and closures. Composition and succinct code are much easier with first-class functions.
- Pattern matching. FP-style pattern matching makes “routing” of messages and delegation much easier.
Of course, you can mimic some of these features in pure Java and I encourage you to do so if you aren’t using Scala.
Static vs. Dynamic Typing
The debate on the relative merits of static vs. dynamic typing is outside our scope, but I will make a few personal observations.
I’ve been a dedicated Rubyist for a while. It is hard to deny the way that dynamic typing simplifies code and as I said in the previous section, I take code complexity very seriously.
Scala’s type system and type inference go a long way towards providing the benefits of static typing with the cleaner syntax of dynamic typing, but Scala doesn’t eliminate the extra complexity of static typing.
Recall my Observer example from the first blog post, where I used traits to implement it.
trait Observer[S] {
  def receiveUpdate(subject: S);
}
trait Subject[S] {
  this: S =>
  private var observers: List[Observer[S]] = Nil
  def addObserver(observer: Observer[S]) = observers = observer :: observers
  def notifyObservers() = observers.foreach(_.receiveUpdate(this))
}
In Ruby, we might implement it this way.
module Subject
  def add_observer(observer)
    @observers ||= []
    @observers << observer # append, rather than replace with new array
  end
  def notify_observers
    @observers.each {|o| o.receive_update(self)} if @observers
  end
end
There is no need for an Observer module. As long as every observer responds to the receive_update “message”, we’re fine.
I commented the line where I append to the existing @observers array, rather than build a new one, which would be the FP and Scala way. Appending to the existing array would be more typical of Ruby code, but this implementation is not as thread safe as an FP-style approach.
The trailing if expression in notify_observers means that nothing is done if @observers is still nil, i.e., it was never initialized in add_observer.
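For comparison, here is a sketch of what the FP-style variant mentioned above might look like in Ruby. The ImmutableSubject module and the Counter and Recorder classes are hypothetical names for illustration. Each call to add_observer builds a new frozen array, so a thread that is already iterating in notify_observers keeps a stable snapshot. Note the hedge: the read-then-assign in add_observer is still not atomic, so this is safer than appending in place, not fully thread safe.

```ruby
# FP-flavored variant (illustrative): never mutate the observer list in place.
module ImmutableSubject
  def add_observer(observer)
    # Build a new frozen array instead of appending to the old one.
    @observers = ((@observers || []) + [observer]).freeze
  end

  def notify_observers
    (@observers || []).each { |o| o.receive_update(self) }
  end
end

class Counter
  include ImmutableSubject
  attr_reader :count

  def initialize
    @count = 0
  end

  def increment
    @count += 1
    notify_observers
  end
end

class Recorder
  attr_reader :seen

  def initialize
    @seen = []
  end

  def receive_update(subject)
    @seen << subject.count
  end
end

counter = Counter.new
recorder = Recorder.new
counter.add_observer(recorder)
2.times { counter.increment }
p recorder.seen
```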
So, which is better? The amount of code is not that different, but it took me significantly longer to write the Scala version. In part, this was due to my novice chops, but the main reason it took me so long was that I had to solve a design issue resulting from the static typing. I had to learn about the typed self construct used in the first line of the Subject trait. This was the only way to allow the Observer.receiveUpdate method to accept an argument of type S, rather than of type Subject[S]. It was worth it to me to achieve the “cleaner” API.
Okay, perhaps I’ll know this next time and spend about the same amount of time implementing a Ruby vs. Scala version of something. However, I think it’s notable that sometimes static typing can get in the way of your intentions and goal of achieving clarity. (At other times, the types add useful documentation.) I know this isn’t the only valid argument you can make, one way or the other, but it’s one reason that dynamic languages are so appealing.
Poly-paradigm Languages vs. Mixing Several Languages
So, you’re convinced that you should use FP sometimes and OOP sometimes. Should you pick a poly-paradigm language, like Scala? Or, should you combine several languages, each of which implements one paradigm?
A potential downside of Scala is that supporting different modularity paradigms, like OOP and FP, increases the complexity in the language. I think Odersky and company have done a superb job combining FP and OOP in Scala, but if you compare Scala FP code to Haskell or Erlang FP code, the latter tend to be more succinct and often easier to understand (once you learn the syntax).
Indeed, Scala will not be easy for developers to master. It will be a powerful tool for professionals. As a consultant, I work with developers with a range of skills. I would not expect some of them to prosper with Scala. Should that rule out the language? NO. Rather it would be better to “liberate” the better developers with a more powerful tool.
So, if your application needs OOP and FP concepts interspersed, consider Scala. If your application needs discrete services, some of which are predominantly OOP and others of which are predominantly FP, then consider Scala or Java for the OOP parts and Erlang or another FP language for the FP parts.
Also, Erlang’s Actor model is more mature than Scala’s, so Erlang might have an edge for a highly-concurrent server application.
Of course, you should do your own analysis…
Final Thoughts
Java the language has had a great ride. It was a godsend to us beleaguered C++ programmers in the mid ‘90’s. However, compared to Scala, Java now feels obsolete. The JVM is another story. It is arguably the best VM available.
I hope Scala replaces Java as the main JVM language for projects that prefer statically-typed languages. Fans of dynamically-typed languages might prefer JRuby, Groovy, or Jython. It’s hard to argue with all the OOP and FP goodness that Scala provides. You will learn a lot about good language and application design by learning Scala. It will certainly be a prominent tool in my toolkit from now on.
The Seductions of Scala, Part II - Functional Programming
A Functional Programming Language for the JVM
In my last blog post, I discussed Scala’s support for OOP and general improvements compared to Java. In this post, which I’m posting from Agile 2008, I discuss Scala’s support for functional programming (FP) and why it should be of interest to OO developers.
A Brief Overview of Functional Programming
You might ask, don’t most programming languages have functions? FP uses the term in the mathematical sense of the word. I hate to bring up bad memories, but you might recall from your school days that when you solved a function like y = sin(x) for y, given a value of x, you could input the same value of x an arbitrary number of times and you would get the same value of y. This means that sin(x) has no side effects. In other words, unlike our imperative OO or procedural code, no global or object state gets changed. All the work that a mathematical function does has to be returned in the result.
Similarly, the idea of a variable is a little different than what we’re used to in imperative code. While the value of y will vary with the value of x, once you have fixed x, you have also fixed y. The implication for FP is that “variables” are immutable; once assigned, they cannot be changed. I’ll call such immutable variables value objects.
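To make the distinction concrete, here is a small Ruby sketch. The hypotenuse function and the RunningTotal class are made-up examples. The first is a function in the mathematical sense; the second changes hidden state on every call, so the same input gives different results. The frozen array at the end stands in for a value object.

```ruby
# Pure: the result depends only on the arguments, like y = sin(x).
def hypotenuse(a, b)
  Math.sqrt(a**2 + b**2)
end

# Impure: each call changes hidden state, so equal inputs give different results.
class RunningTotal
  def initialize
    @total = 0
  end

  def add(amount)
    @total += amount
  end
end

total = RunningTotal.new
puts hypotenuse(3, 4) == hypotenuse(3, 4)   # same input, same output
puts total.add(2) == total.add(2)           # same input, different outputs

# A frozen object behaves like a value object: attempts to change it raise.
origin = [0, 0].freeze
begin
  origin << 1
rescue StandardError => e
  puts "cannot modify a frozen array (#{e.class})"
end
```

As in Java, immutability in Ruby is a matter of discipline; freeze is opt-in.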
Now, it would actually be hard for a “pure” FP language to have no side effects, ever. I/O would be rather difficult, for example, since the state of the input or output stream changes with each operation. So, in practice, all “pure” FP languages provide some mechanisms for breaking the rules in a controlled way.
Functions are first-class objects in FP. You can create named or anonymous functions (e.g., closures or blocks), assign them to variables, pass them as arguments to other functions, etc. Java doesn’t support this. You have to create objects that wrap the methods you want to invoke.
Functional programs tend to be much more declarative in nature than imperative programs. This is perhaps more obvious in pure FP languages, like Erlang and Haskell, than it is in Scala.
For example, the definition of Fibonacci numbers is the following.
F(n) = F(n-1) + F(n-2) where F(1)=1 and F(2)=1
And here is a complete implementation in Haskell.
module Main where
-- Function f returns the n'th Fibonacci number.
-- It uses binary recursion.
f n | n <= 2 = 1
    | n > 2  = f (n-1) + f (n-2)
Without understanding the intricacies of Haskell syntax, you can see that the code closely matches the “specification” above it. The f n | ... syntax defines the function f taking an argument n and the two cases of n values are shown on separate lines, where one case is for n <= 2 and the other case is for n > 2.
The code uses the recursive relationship between different values of the function and the special-case values when n = 1 and n = 2. The Haskell runtime does the rest of the work.
It’s interesting that most domain-specific languages are also declarative in nature. Think of how JMock, EasyMock or Rails’ ActiveRecord code look. The code is more succinct and it lets the “system” do most of the heavy lifting.
Functional Programming’s Benefits for You
Value Objects and Side-Effect Free Functions
It’s the immutable variables and side-effect free functions that help solve the multicore problem. Synchronized access to shared state is not required if there is no state to manage. This makes robust concurrent programs far easier to write.
I’ll discuss concurrency in Scala in my third post. For now, let’s discuss other ways that FP in Scala helps to improve code, concurrent or not.
Value objects are beneficial because you can pass one around without worrying that someone will change it in a way that breaks other users of the object. Value objects aren’t unique to FP, of course. They have been promoted in Domain Driven Design (DDD), for example.
Similarly, side-effect free functions are safer to use. There is less risk that a caller will change some state inappropriately. The caller doesn’t have to worry as much about calling a function. There are fewer surprises and everything of “consequence” that the function does is returned to the caller. It’s easier to keep to the Single Responsibility Principle when writing side-effect free functions.
Of course, you can write side-effect free methods and immutable variables in Java code, but it’s mostly a matter of discipline; the language doesn’t give you any enforcement mechanisms.
Scala gives you a helpful enforcement mechanism: the ability to declare variables as val’s (i.e., “values”) vs. var’s (i.e., “variables”, um… back to the imperative programming sense of the word…). In fact, val is the default, where neither is required by the language. Also, the Scala library contains both immutable and mutable collections and it “encourages” you to use the immutable collections.
However, because Scala combines both OOP and FP, it doesn’t force FP purity. The upside is that you get to use the approach that best fits the problem you’re trying to solve. It’s interesting that some of the Scala library classes expose FP-style interfaces, immutability and side-effect free functions, while using more traditional imperative code to implement them!
Closures and First-Class Functions
True to its functional side, Scala gives you true closures and first-class functions. If you’re a Groovy or Ruby programmer, you’re used to the following kind of code.
class ExpensiveResource {
  def open(worker: () => Unit) = {
    try {
      println("Doing expensive initialization")
      worker()
    } finally {
      close()
    }
  }
  def close() = {
    println("Doing expensive cleanup")
  }
}
// Example use:
try {
  (new ExpensiveResource()) open { () =>     // 1
    println("Using Resource")                // 2
    throw new Exception("Thrown exception")  // 3
  }                                          // 4
} catch {
  case ex: Throwable => println("Exception caught: "+ex)
}
Running this code will yield:
Doing expensive initialization
Using Resource
Doing expensive cleanup
Exception caught: java.lang.Exception: Thrown exception
The ExpensiveResource.open method invokes the user-specified worker function. The syntax worker: () => Unit defines the worker parameter as a function that takes no arguments and returns nothing (recall that Unit is the equivalent of void).
ExpensiveResource.open handles the details of initializing the resource, invoking the worker, and doing the necessary cleanup.
The example marked with the comment // 1 creates a new ExpensiveResource, then calls open, passing it an anonymous function, called a function literal in Scala terminology. The function literal is of the form (arg_list) => function body or () => println(...) ..., in our case.
A special syntax trick is used on this line; if a method takes one argument, you can change expressions of the form object.method(arg) to object method {arg}. This syntax is supported to allow user-defined methods to read like control structures (think for statements – see the next section). If you’re familiar with Ruby, the four commented lines read a lot like Ruby syntax for passing blocks to methods.
Idioms like this are very important. A library writer can encapsulate all complex, error-prone logic and allow the user to specify only the unique work required in a given situation. For example, how many times have you written code that opened an I/O stream or a database connection, used it, then cleaned up? How many times did you get the idiom wrong, especially the proper cleanup when an exception is thrown? First-class functions allow writers of I/O, database and other resource libraries to do the correct implementation once, eliminating user error and duplication. Here’s a rhetorical question I always ask myself:
How can I make it impossible for the user of this API to fail?
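Since I compared the Scala syntax to Ruby blocks, here is roughly how the same idiom might look in Ruby. This is my own sketch, not a port of any library. To keep it easy to check, the resource records its steps in a log array (an assumption for this example) rather than printing as it goes.

```ruby
# Ruby sketch of the same idiom: the resource owns setup and cleanup,
# the caller supplies only the work, as a block.
class ExpensiveResource
  attr_reader :log

  def initialize
    @log = []
  end

  def open
    @log << "Doing expensive initialization"
    yield self
  ensure
    close          # runs whether or not the block raises
  end

  def close
    @log << "Doing expensive cleanup"
  end
end

resource = ExpensiveResource.new
begin
  resource.open do |r|
    r.log << "Using Resource"
    raise "Thrown exception"
  end
rescue StandardError => ex
  resource.log << "Exception caught: #{ex.message}"
end
puts resource.log
```

The log shows the cleanup step recorded before the exception reaches the caller, which is the same ordering as in the Scala output above.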
Iterations
Iteration through collections, Lists in particular, is even more common in FP than in imperative languages. Hence, iteration is highly evolved. Consider this example:
object RequireWordsStartingWithPrefix {
  def main(args: Array[String]) = {
    val prefix = args(0)
    for {
      i <- 1 to (args.length - 1) // no semicolon
      if args(i).startsWith(prefix)
    } println("args("+i+"): "+args(i))
  }
}
Compiling this code with scalac and then running it on the command line with the command
scala RequireWordsStartingWithPrefix xx xy1 xx1 yy1 xx2 xy2
produces the result
args(2): xx1
args(5): xx2
The for loop binds the loop variable i to each argument index in turn, but runs the body only when the if condition is true. Instead of curly braces, the for loop argument list could also be parenthesized, but then each line as shown would have to be separated by a semicolon, like we’re used to seeing with Java for loops.
We can have an arbitrary number of assignments and conditionals. In fact, it’s quite common to filter lists:
object RequireWordsStartingWithPrefix2 {
  def main(args: Array[String]) = {
    val prefix = args(0)
    args.slice(1, args.length)
      .filter((arg: String) => arg.startsWith(prefix))
      .foreach((arg: String) => println("arg: "+arg))
  }
}
This version yields the same result. In this case, the args array is sliced (lopping off the search prefix), the resulting array is filtered using a function literal and the filtered array is iterated over to print out the matching arguments, again using a function literal. This version of the algorithm should look familiar to Ruby programmers.
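For the Ruby programmers, the equivalent might look something like this sketch. The words_with_prefix method name is mine; the first element is treated as the prefix and the rest as the words to filter.

```ruby
# Ruby counterpart of the slice/filter/foreach pipeline (illustrative).
def words_with_prefix(args)
  prefix, *words = args                      # "slice off" the search prefix
  words.select { |word| word.start_with?(prefix) }
end

words_with_prefix(%w[xx xy1 xx1 yy1 xx2 xy2]).each { |word| puts "arg: #{word}" }
```

With the same arguments as the Scala run, this selects xx1 and xx2.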
Rolling Your Own Function Objects
Scala still has to support the constraints of the JVM. As a comment to the first blog post said, the Scala compiler wraps closures and “bare” functions in Function objects. You can also make other objects behave like functions. If your object implements the apply method, that method will be invoked if you put parentheses with a matching argument list on the object, as in the following example.
class HelloFunction {
  def apply() = "hello"
  def apply(name: String) = "hello "+name
}
val hello = new HelloFunction
println(hello())       // => "hello"
println(hello("Dean")) // => "hello Dean"
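Ruby has a similar convention, sketched below with my own version of the class: an object that defines call can be invoked with the .() shorthand. Ruby has no method overloading, so a default argument stands in for the two apply variants.

```ruby
# Ruby analogue (illustrative): objects with a call method act like functions.
class HelloFunction
  def call(name = nil)
    name ? "hello #{name}" : "hello"
  end
end

hello = HelloFunction.new
puts hello.()          # shorthand for hello.call
puts hello.("Dean")    # shorthand for hello.call("Dean")
```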
Option, None, Some…
Null pointer exceptions suck. You can still get them in Scala code, because Scala runs on the JVM and interoperates with Java libraries, but Scala offers a better way.
Typically, a reference might be null when there is nothing appropriate to assign to it. Following the conventions in some FP languages, Scala has an Option type with two subtypes, Some, which wraps a value, and None, which is used instead of null. The following example, which also demonstrates Scala’s Map support, shows these types in action.
val hotLangs = Map(
  "Scala" -> "Rocks",
  "Haskell" -> "Ethereal",
  "Java" -> null)
println(hotLangs.get("Scala")) // => Some(Rocks)
println(hotLangs.get("Java"))  // => Some(null)
println(hotLangs.get("C++"))   // => None
Note that Map.get returns its values wrapped in Option objects, as shown by the println statements.
By the way, those -> aren’t special operators; they’re methods. As with ::, valid method names aren’t limited to alphanumerics, _, and $.
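To see why Some(null) and None are worth telling apart, compare a plain Ruby Hash, where hash[key] returns nil both for a missing key and for a key that maps to nil. The lookup helper below is hypothetical; it mimics Some and None with tagged arrays, using Hash#key? to separate the two cases.

```ruby
# Illustrative only: mimic Some/None with tagged arrays.
def lookup(hash, key)
  hash.key?(key) ? [:some, hash[key]] : [:none]
end

hot_langs = { "Scala" => "Rocks", "Haskell" => "Ethereal", "Java" => nil }

p hot_langs["Java"]            # nil: the key is present, the value is nil
p hot_langs["C++"]             # nil: the key is missing; looks identical
p lookup(hot_langs, "Scala")   # [:some, "Rocks"]
p lookup(hot_langs, "Java")    # [:some, nil]
p lookup(hot_langs, "C++")     # [:none]
```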
Pattern Matching
The last FP feature I’ll discuss in this post is pattern matching, which is exploited more fully in FP languages than in imperative languages.
Using our previous definition of hotLangs, here’s how you might use matching.
def show(key: String) = {
  val value: Option[String] = hotLangs.get(key)
  value match {
    case Some(x) => x
    case None => "No hotness found"
  }
}
println(show("Scala")) // => "Rocks"
println(show("Java"))  // => "null"
println(show("C++"))   // => "No hotness found"
The first case statement, case Some(x) => x, says “if the value I’m matching against is a Some that could be constructed with the Some[+A](x: A) constructor (where A is String here), then return the x, the thing the Some contains.” Okay, there’s a lot going on here, so more background information is in order.
In Scala, like Ruby and other languages, the last value computed in a function is returned by it. Also, almost everything returns a value, including match statements, so when the Some(x) => x case is chosen, x is returned by the match and hence by the function.
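Some is a generic class, declared as Some[+A], and hotLangs.get returns an Option[String], so the match here is against Some[String]. The + in the +A declaration marks the type parameter as covariant: a Some[String] can be used wherever a Some[Any] is expected. It is loosely analogous to Java’s extends in a wildcard, i.e., <? extends String>, except that it is written once where the class is declared rather than at each use. Capiche?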
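Here is a minimal sketch of what the + buys you. Box and CoBox are hypothetical classes invented for this illustration; only Option and Some come from the Scala library.

```scala
val some: Some[String] = Some("Rocks")
val any: Option[Any] = some // allowed because Option and Some are covariant

class Box[A](val value: A)    // hypothetical invariant container
class CoBox[+A](val value: A) // hypothetical covariant container

val co: CoBox[Any] = new CoBox[String]("Rocks") // compiles, thanks to +A
// val bad: Box[Any] = new Box[String]("Rocks") // would not compile: Box is invariant

println(any)      // => Some(Rocks)
println(co.value) // => Rocks
```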
Idioms like case Some(x) => x are called extractors in Scala and are used a lot, both in Scala and in FP in general. Here’s another example using Lists and our friend ::, the “cons” operator.
def countScalas(list: List[String]): Int = {
  list match {
    case "Scala" :: tail => countScalas(tail) + 1
    case _ :: tail => countScalas(tail)
    case Nil => 0
  }
}
val langs = List("Scala", "Java", "C++", "Scala", "Python", "Ruby")
val count = countScalas(langs)
println(count) // => 2
We’re counting the number of occurrences of “Scala” in a list of strings, using matching and recursion and no explicit iteration. A pattern of the form head :: tail applied to a non-empty list binds the first element to the head variable and the rest of the list to the tail variable. In our case, the first case statement looks for the particular case where the head equals "Scala". The second case matches any non-empty list; only the empty list (Nil) is left for the third. Since cases are tried in order and the first match wins, the first case will always pick out the List("Scala", ...) case first. Note that in the second case, we don’t actually care about the value of the head, so we use the placeholder _. Both the first and second cases call countScalas recursively.
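One caveat: in countScalas above, the first case adds 1 after the recursive call returns, so the recursion is not in tail position and a very long list could exhaust the stack. A possible variant, sketched here with a hypothetical inner loop function and an accumulator, lets the compiler turn the recursion into a loop:

```scala
import scala.annotation.tailrec

def countScalas(list: List[String]): Int = {
  // @tailrec makes the compiler reject the method if it is not tail recursive.
  @tailrec
  def loop(rest: List[String], acc: Int): Int = rest match {
    case "Scala" :: tail => loop(tail, acc + 1)
    case _ :: tail       => loop(tail, acc)
    case Nil             => acc
  }
  loop(list, 0)
}

val langs = List("Scala", "Java", "C++", "Scala", "Python", "Ruby")
println(countScalas(langs)) // => 2
```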
Pattern matching like this is powerful, yet succinct and elegant. We’ll see more examples of matching in the next blog post on concurrency using message passing.
Recap of Scala’s Functional Programming
I’ve just touched the tip of the iceberg concerning functional programming (and I hope I got all the details right!). Hopefully, you can begin to see why we’ve overlooked FP for too long!
In my last post, I’ll wrap up with a look at Scala’s approach to concurrency, the Actor model of message passing.
The Seductions of Scala, Part I 185
(Update 12/23/2008: Thanks to Apostolos Syropoulos for pointing out an earlier reference for the concept of “traits”).
Because of all the recent hoo-ha about functional programming (e.g., as a “cure” for the multicore problem), I decided to cast aside my dysfunctional ways and learn one of the FP languages. The question was, which one?
My distinguished colleague, Michael Feathers, has been on a Haskell binge of late. Haskell is a pure functional language and is probably most interesting as the “flagship language” for academic exploration, rather than production use. (That was not meant as flame bait…) It’s hard to overestimate the influence Haskell has had on language design, including Java generics, .NET LINQ and F#, etc.
However, I decided to learn Scala first, because it is a JVM language that combines object-oriented and functional programming in one language. At ~13 years of age, Java is a bit dated. Scala has the potential of replacing Java as the principal language of the JVM, an extraordinary piece of engineering that is arguably now more valuable than the language itself. (Note: there is also a .NET version of Scala under development.)
Here are some of my observations, divided over three blog posts.
First, a few disclaimers. I am a Scala novice, so any flaws in my analysis reflect on me, not Scala! Also, this is by no means an exhaustive analysis of the pros and cons of Scala vs. other options. Start with the Scala website for more complete information.
A Better OOP Language
Scala works seamlessly with Java. You can invoke Java APIs, extend Java classes and implement Java interfaces. You can even invoke Scala code from Java, once you understand how certain “Scala-isms” are translated to Java constructs (javap is your friend). Scala syntax is more succinct and removes a lot of tedious boilerplate from Java code.
For example, the following Person class in Java:
class Person {
    private String firstName;
    private String lastName;
    private int age;
    public Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }
    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getFirstName() { return this.firstName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getLastName() { return this.lastName; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return this.age; }
}
can be written in Scala thusly:
class Person(var firstName: String, var lastName: String, var age: Int)
Yes, that’s it. The constructor is the argument list to the class, where each parameter is declared as a variable (var keyword). Scala automatically generates the equivalent of getter and setter methods, meaning they look like Ruby-style attribute accessors; the getter is foo instead of getFoo and the setter is foo = instead of setFoo. Actually, the setter function is really foo_=, but Scala lets you use the foo = sugar.
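A short sketch of the generated accessors in use (the sample values are made up):

```scala
class Person(var firstName: String, var lastName: String, var age: Int)

val p = new Person("Ada", "Lovelace", 36)
println(p.firstName) // getter call => Ada

p.age = 37           // sugar for the generated setter
p.age_=(38)          // the same setter, called by its real name
println(p.age)       // => 38
```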
Lots of other well-designed conventions allow the language to define almost everything as a method, yet support forms of syntactic sugar like the illusion of operator overloading, Ruby-like DSLs, etc.
You also get fewer semicolons, no requirements tying package and class definitions to the file system structure, type inference, multi-valued returns (tuples), and a better type and generics model.
One of the biggest deficiencies of Java is the lack of a complete mixin model. Mixins are small, focused (think Single Responsibility Principle ...) bits of state and behavior that can be added to classes (or objects) to extend them as needed. In a language like C++, you can use multiple inheritance for mixins. Because Java only supports single inheritance and interfaces, which can’t have any state and behavior, implementing a mixin-based design has always required various hacks. Aspect-Oriented Programming is also one partial solution to this problem.
The most exciting OOP enhancement Scala brings is its support for Traits, a concept first described here and more recently discussed here. Traits support Mixins (and other design techniques) through composition rather than inheritance. You could think of traits as interfaces with implementations. They work a lot like Ruby modules.
Here is an example of the Observer Pattern written as traits, where they are used to monitor changes to a bank account balance. First, here are reusable Subject and Observer traits.
trait Observer[S] {
  def receiveUpdate(subject: S);
}
trait Subject[S] {
  this: S =>
  private var observers: List[Observer[S]] = Nil
  def addObserver(observer: Observer[S]) = observers = observer :: observers
  def notifyObservers() = observers.foreach(_.receiveUpdate(this))
}
In Scala, generics are declared with square brackets, [...], rather than angle brackets, <...>. Method definitions begin with the def keyword. The Observer trait defines one abstract method, which is called by the Subject to notify the observer of changes. The Subject is passed to the Observer.
The Observer trait looks exactly like a Java interface. In fact, that’s how traits are represented in Java byte code. If the trait has state and behavior, like Subject, the byte code representation involves additional elements.
The Subject trait is more complex. The strange line, this: S =>, is called a self type declaration. It tells the compiler that whenever this is referenced in the trait, treat its type as S, rather than Subject[S]. Without this declaration, the call to receiveUpdate in the notifyObservers method would not compile, because it would attempt to pass a Subject[S] object, rather than an S object. The self type declaration solves this problem.
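To isolate the self type idea from the observer machinery, here is a minimal sketch. Chainable, Counter, and the me method are hypothetical names used only for this illustration.

```scala
trait Chainable[S] {
  this: S =>         // self type: within this trait, `this` is also an S
  def me: S = this   // type-checks only because of the self type declaration
}

class Counter extends Chainable[Counter] {
  var n = 0
  def inc(): Counter = { n += 1; this }
}

val c = new Counter
println(c.me.inc().me.inc().n) // => 2
```

Without the this: S => line, def me: S = this would be rejected, because the compiler would only know that this is a Chainable[S].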
The next line creates a private list of observers, initialized to Nil, which is an empty list. Variable declarations are name: type. Why didn’t they follow Java conventions, i.e., type name? Because this syntax makes the code easier to parse when type inference is used, meaning where the explicit :type is omitted and inferred.
In fact, I’m using type inference for all the method declarations, because the compiler can figure out what each method returns, in my examples. In this case, they all return type Unit, the equivalent of Java’s void. (The name Unit is a common term in functional languages.)
The third line defines a method for adding a new observer to the list. Notice that concrete method definitions are of the form
def methodName(parameter: type, ...) = {
  method body
}
In this case, because there is only one line, I dispensed with the {...}. The equals sign before the body emphasizes the functional nature of Scala: a method body is an expression that yields a value, and methods can be treated as function objects, too. We’ll revisit this in a moment and in the next post.
The method body prepends the new observer object to the existing list. Actually, a new list is created. The :: operator, called “cons”, binds to the right. This “operator” is really a method call, which could actually be written like this, observers.::(observer).
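A quick sketch of both points, using made-up list contents: the operator form and the method form build equal lists, and the original list is left untouched.

```scala
val rest = List("Java", "C++")

val viaOperator = "Scala" :: rest    // binds to the right: a method call on rest
val viaMethod   = rest.::("Scala")   // the same call in method notation

println(viaOperator == viaMethod) // => true
println(rest)                     // => List(Java, C++), unchanged

val nums = 1 :: 2 :: 3 :: Nil     // parsed as 1 :: (2 :: (3 :: Nil))
println(nums)                     // => List(1, 2, 3)
```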
Our final method in Subject is notifyObservers. It iterates through observers and invokes the block observer.receiveUpdate(this) on each observer. The _ evaluates to the current observer reference. For comparison, in Ruby, you would define this method like so:
def notifyObservers()
  @observers.each { |o| o.receiveUpdate(self) }
end
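In Scala terms, the _ placeholder is shorthand for a function literal with one named parameter. A small sketch, with a made-up list:

```scala
val langs = List("Scala", "Java", "Ruby")

val withPlaceholder = langs.map(_.length)            // _ stands for each element
val withNamedParam  = langs.map(lang => lang.length) // the equivalent long form

println(withPlaceholder == withNamedParam) // => true
println(withPlaceholder)                   // => List(5, 4, 4)
```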
Okay, let’s look at how you would actually use these traits. First, our “plain-old Scala object” (POSO) Account.
class Account(initialBalance: Double) {
  private var currentBalance = initialBalance
  def balance = currentBalance
  def deposit(amount: Double) = currentBalance += amount
  def withdraw(amount: Double) = currentBalance -= amount
}
Hopefully, this is self-explanatory, except for two things. First, recall that the whole class declaration is actually the constructor, which is why we have an initialBalance: Double parameter on Account. This looks strange to the Java-trained eye, but it actually works well and is another example of Scala’s economy. (You can define multiple constructors, but I won’t go into that here…).
Second, note that I omitted the parentheses when I defined the balance “getter” method. This supports the uniform access principle. Clients will simply call myAccount.balance, without parentheses, and I could redefine balance to be a var or val and the client code would not have to change!
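As a sketch of the uniform access principle, here are two hypothetical versions of an account class, one exposing balance as a method and one as a val. The client expression is identical for both.

```scala
// Version 1: balance is computed by a parameterless method.
class AccountV1(initialBalance: Double) {
  private var currentBalance = initialBalance
  def balance = currentBalance
}

// Version 2: balance is stored directly in a field.
class AccountV2(initialBalance: Double) {
  val balance = initialBalance
}

// Client code cannot tell the difference.
println(new AccountV1(10.0).balance) // => 10.0
println(new AccountV2(10.0).balance) // => 10.0
```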
Next, a subclass that supports observation.
class ObservedAccount(initialBalance: Double) extends Account(initialBalance) with Subject[Account] {
  override def deposit(amount: Double) = {
    super.deposit(amount)
    notifyObservers()
  }
  override def withdraw(amount: Double) = {
    super.withdraw(amount)
    notifyObservers()
  }
}
The with keyword is how a trait is used, much the way that you implement an interface in Java, but now you don’t have to implement the interface’s methods. We’ve already done that.
Note that the expression, ObservedAccount(initialBalance: Double) extends Account(initialBalance), not only defines the (single) inheritance relationship, it also functions as the constructor’s call to super(initialBalance), so that Account is properly initialized.
Next, we have to override the deposit and withdraw methods, calling the parent methods and then invoking notifyObservers. Anytime you override a concrete method, Scala requires the override keyword. This tells you unambiguously that you are overriding a method, and the Scala compiler reports an error if you aren’t actually overriding a method, e.g., because of a typo. Hence, the keyword is much more reliable (and hence useful…) than Java’s optional @Override annotation.
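A minimal sketch of the rule, using hypothetical Base and Derived classes. The two commented-out lines show the mistakes the compiler would reject.

```scala
class Base {
  def describe: String = "base"
}

class Derived extends Base {
  override def describe: String = "derived " + super.describe
  // def describe: String = "oops"          // rejected: the override modifier is missing
  // override def descirbe: String = "oops" // rejected: the misspelled name overrides nothing
}

println(new Derived().describe) // => derived base
```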
Finally, here is an Observer that prints to stdout when the balance changes.
class AccountReporter extends Observer[Account] {
  def receiveUpdate(account: Account) =
    println("Observed balance change: "+account.balance)
}
Rather than use with, I just extend the Observer trait, because I don’t have another parent class.
Here’s some code to test what we’ve done.
def changingBalance(account: Account) = {
  println("==== Starting balance: " + account.balance)
  println("Depositing $10.0")
  account.deposit(10.0)
  println("new balance: " + account.balance)
  println("Withdrawing $5.60")
  account.withdraw(5.6)
  println("new balance: " + account.balance)
}
var a = new Account(0.0)
changingBalance(a)
var oa = new ObservedAccount(0.0)
changingBalance(oa)
oa.addObserver(new AccountReporter)
changingBalance(oa)
Which prints out:
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 4.4
Depositing $10.0
Observed balance change: 14.4
new balance: 14.4
Withdrawing $5.60
Observed balance change: 8.8
new balance: 8.8
Note that notifications appear only in the last run, because the AccountReporter was added just before the third call to changingBalance.
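If you want to check this behavior rather than read it off the console, one option is an observer that records what it sees. The sketch below repeats trimmed-down versions of the traits and classes so that it runs on its own; RecordingObserver is a hypothetical name, and the explicit Unit return types are an addition for clarity.

```scala
import scala.collection.mutable.ListBuffer

trait Observer[S] { def receiveUpdate(subject: S): Unit }

trait Subject[S] {
  this: S =>
  private var observers: List[Observer[S]] = Nil
  def addObserver(observer: Observer[S]): Unit = observers = observer :: observers
  def notifyObservers(): Unit = observers.foreach(_.receiveUpdate(this))
}

class Account(initialBalance: Double) {
  private var currentBalance = initialBalance
  def balance = currentBalance
  def deposit(amount: Double): Unit = currentBalance += amount
}

class ObservedAccount(initialBalance: Double)
    extends Account(initialBalance) with Subject[Account] {
  override def deposit(amount: Double): Unit = {
    super.deposit(amount)
    notifyObservers()
  }
}

// Hypothetical observer that records balances instead of printing them.
class RecordingObserver extends Observer[Account] {
  val seen = ListBuffer[Double]()
  def receiveUpdate(account: Account): Unit = seen += account.balance
}

val account = new ObservedAccount(0.0)
account.deposit(10.0)          // no observer registered yet, so nothing is recorded
val recorder = new RecordingObserver
account.addObserver(recorder)
account.deposit(5.0)           // recorded
println(recorder.seen)         // => ListBuffer(15.0)
```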
Download Scala and try it out. Put all this code in one observer.scala file, for example, and run the command:
scala observer.scala
But Wait, There’s More!
In the next post, I’ll look at Scala’s support for Functional Programming and why OO programmers should find it interesting. In the third post, I’ll look at the specific case of concurrent programming in Scala and make some concluding observations of the pros and cons of Scala.
For now, here are some references for more information.
- The Scala website, for downloads, documentation, mailing lists, etc.
- Ted Neward’s excellent multipart introduction to Scala at developerWorks.
- The forthcoming Programming in Scala book.
Always close() in a finally block 55
Here’s one for my fellow Java programmers, but it’s really generally applicable.
When you call close() on I/O streams, readers, writers, network sockets, database connections, etc., it’s easy to forget the most appropriate idiom. I just spent a few hours fixing some examples of misuse in otherwise very good Java code.
What’s wrong with the following code?
public void writeContentToFile(String content, String fileName) throws Exception {
    File output = new File(fileName);
    OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream(output), "UTF-8");
    writer.write(content);
    writer.close();
}
It doesn’t look all that bad. It tells its story. It’s easy to understand.
However, from time to time you won’t reach the last line, which closes the writer. File and network I/O errors are common. For example, what if you can’t actually write to the location specified by fileName? So, we have to be more defensive. We want to be sure we always clean up.
The correct idiom is to use a try … finally … block.
public void writeContentToFile(String content, String fileName) throws Exception {
    File output = new File(fileName);
    OutputStreamWriter writer = null;
    try {
        writer = new OutputStreamWriter(new FileOutputStream(output), "UTF-8");
        writer.write(content);
    } finally {
        // Runs whether or not the write succeeded.
        if (writer != null)
            writer.close();
    }
}
Now, no matter what happens, the writer will be closed, if it’s not null, even if writing the output was unsuccessful.
Note that we don’t necessarily need a catch block, because in this case we’re willing to let any Exceptions propagate up the stack (notice the throws clause). A lot of developers don’t realize that there are times when you need a try block, but not necessarily a catch block. This is one of those times.
So, anytime you need to clean up or otherwise release resources, use a finally block to ensure that the clean up happens, no matter what.
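The same idiom carries over to Scala, the language of the other posts on this page. The sketch below wraps the try … finally block in a small helper so callers can’t forget it. FakeWriter and withWriter are hypothetical names invented for this illustration; FakeWriter merely stands in for a real writer so that the example runs without touching the file system.

```scala
// Hypothetical stand-in for a real writer; it fails on empty content.
class FakeWriter {
  var closed = false
  def write(content: String): Unit =
    if (content.isEmpty) throw new java.io.IOException("nothing to write")
  def close(): Unit = closed = true
}

// Hypothetical helper: runs body, then always closes the writer.
def withWriter[A](writer: FakeWriter)(body: FakeWriter => A): A =
  try body(writer) finally writer.close()

val good = new FakeWriter
withWriter(good)(_.write("hello"))
println(good.closed) // => true

val bad = new FakeWriter
try withWriter(bad)(_.write(""))
catch { case e: java.io.IOException => println("write failed: " + e.getMessage) }
println(bad.closed)  // => true, closed even though write threw
```

As in the Java version, the helper itself has no catch block; the exception still propagates to the caller after the writer is closed.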
Writing Java Aspects ... with JRuby and Aquarium! 25
Aquarium V0.4.0, my AOP library for Ruby, now supports JRuby. Not only do the regular “pure Ruby” Aquarium specs run reliably under JRuby (V1.1RC2), but you can now write aspects for Java types with Aquarium!
There are some important limitations, though. Cartographers of old would mark dangerous or unknown territory on their maps with hic sunt dracones (“here be dragons”), a reference to the old practice of adorning maps with serpents around the edges.
This is true of Aquarium + Java types in JRuby, too, at least for now.
Aquarium uses Ruby’s metaprogramming API extensively and the JRuby team has done some pretty sophisticated work to integrate Java types with Ruby. Hence, it’s not too surprising there are some gotchas. Hopefully, workarounds will be possible for all of them.
The details are discussed on the JRuby page, the README on the Aquarium site, and of course the “specs” in the distribution’s jruby/spec directory. I’ll summarize them here, after discussing the pros and cons of Aquarium vs. the venerable AspectJ and showing you an example of using Aquarium for Java.
Briefly, Aquarium’s advantages over AspectJ are these:
- You can add and remove advice dynamically at runtime. You can’t remove AspectJ advice.
- You can advise JDK types easily with Aquarium. AspectJ won’t do this by default, but this is really more of a legacy licensing issue than a real technical limitation.
- You can advise individual objects, not just types.
Aquarium’s disadvantages compared to AspectJ include:
- Aquarium will be slower than using AspectJ (although this has not been studied in depth yet).
- Aquarium’s pointcut language is not as full-featured as AspectJ’s.
- There are the bugs and limitations I mentioned above in this initial V0.4.0 release, which I’ll elaborate on shortly.
Here is an example of adding tracing calls to a method doIt in all classes that implement the Java interface com.foo.Work.
Aspect.new :before, :calls_to => [:doIt, :do_it], :in_types_and_descendents => Java::com.foo.Work do |jp, obj, *args|
  log "Entering: #{jp.target_type.name}##{jp.method_name}: object = #{obj}, args = #{args.inspect}"
end
There are two important points to notice in this example:
- You can choose to refer to the method as do_it (Ruby style) or doIt, but these variants are effectively treated as separate methods; advice on one will not affect invocations of the other. So, if you want to be sure to catch all invocations, use both forms. There is a bug (18326) that happens in certain conditions if you use just the Java naming convention.
- If the type is an interface, you must use :types_and_descendents (or one of the supported variants on the word types ...). Since interfaces don’t have method implementations, you will match no join points unless you use the _and_descendents clause. (By default, Aquarium warns you when no join points are matched by an aspect.) However, there is a bug (18325) with this approach if Java types are subtyped in Ruby.
Limitations and Bugs
Okay, here’s the “fine print”...
In this (V0.4.0) release, there are some important limitations.
- Aquarium advice on a method in a Java type will only be invoked when the method is called directly from Ruby.
- To have the advice invoked when the method is called from either Java or Ruby, it is necessary to create a Ruby subclass of the Java type and override the method(s) you want to advise. These overrides can just call super. Note that it will also be necessary for instances of this Ruby type to be used throughout the application, in both the Java and Ruby code. So, you’ll have to instantiate the object in your Ruby code.
Yeah, this isn’t so great, but if you’re motivated… ;)
There are also a few outstanding Aquarium bugs (which could actually be JRuby bugs or quirks of the Aquarium-JRuby “interaction”; I’m not yet sure which).
- Bug #18325: If you have Ruby subclasses of Java types and you advise a Java method in the hierarchy using :types_and_descendents => MyJavaBaseClassOrInterface and you call unadvise on the aspect, the advice “infrastructure” is not correctly removed from the Ruby types. Workaround: Either don’t “unadvise” such Ruby types or only advise methods in such Ruby types where the method is explicitly overridden in the Ruby class. (The spec and the Rubyforge bug report provide examples.)
- Bug #18326: Normally, you can use either Java- or Ruby-style method names (e.g., doSomething vs. do_something), for Java types. However, if you write an aspect using the Java-style for a method name and a Ruby subclass of the Java type where the method is actually defined (i.e., the Ruby class doesn’t override the method), Aquarium acts like the JoinPoint is advised, but the advice is never actually called. Workaround: Use the Ruby-style name in this scenario.
So, there is still some work to do, but it’s promising that you can use an aspect framework in one language with another. A primary goal of Aquarium is to make it easy to write simple aspects. My hope is that people who might find AspectJ daunting will still give Aquarium a try.