Video of my RubyConf talk, "Better Ruby through Functional Programming"

Posted by Dean Wampler Thu, 27 Nov 2008 22:09:00 GMT

Confreaks has started posting the videos from RubyConf. Here’s mine on Better Ruby through Functional Programming.

Please ignore the occasional Ruby (and Scala) bugs…

A Scala-style "with" Construct for Ruby

Posted by Dean Wampler Tue, 30 Sep 2008 03:41:00 GMT

Scala has a “mixin” construct called traits, which are roughly analogous to Ruby modules. They allow you to create reusable, modular bits of state and behavior and use them to compose classes and other traits or modules.

The syntax for using Scala traits is quite elegant. It’s straightforward to implement the same syntax in Ruby and doing so has a few useful advantages.

For example, here is a Scala program that uses a trait to trace calls to a Worker.work method.

    
// run with "scala example.scala" 

class Worker {
    def work() = "work" 
}

trait WorkerTracer extends Worker {
    override def work() = "Before, " + super.work() + ", After" 
}

val worker = new Worker with WorkerTracer

println(worker.work())        // => Before, work, After
    

Note that WorkerTracer extends Worker so it can override the work method. Since Scala is statically typed, you can’t just define an override method and call super unless the compiler knows there really is a “super” method!

Here’s a Ruby equivalent.

    
# run with "ruby example.rb" 

module WorkerTracer
    def work; "Before, #{super}, After"; end
end

class Worker 
    def work; "work"; end
end

class TracedWorker < Worker 
  include WorkerTracer
end

worker = TracedWorker.new

puts worker.work          # => Before, work, After
    

Note that we have to create a subclass, which isn’t required for the Scala case (but can be done when desired).

If you know that you will always want to trace calls to work in the Ruby case, you might be tempted to dispense with the subclass and just add include WorkerTracer in Worker. Unfortunately, this won’t work. Due to the way that Ruby resolves methods, the version of work in the module will not be found before the version defined in Worker itself. Hence the subclass seems to be the only option.
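
The lookup order is easy to confirm. In this sketch, Worker includes WorkerTracer, yet the module's work is never called, because Ruby searches the class itself before its included modules:

```ruby
# Demonstrates why `include WorkerTracer` inside Worker doesn't help:
# an included module sits *behind* the class in the ancestor chain,
# so the class's own definition of work always wins.
module WorkerTracer
  def work; "Before, #{super}, After"; end
end

class Worker
  include WorkerTracer
  def work; "work"; end
end

p Worker.ancestors.first(2)  # => [Worker, WorkerTracer]
p Worker.new.work            # => "work" -- the module's work is shadowed
```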

However, we can work around this using metaprogramming, specifically WorkerTracer.append_features(...). What goes in the argument list? If we pass Worker, then all instances of Worker will be affected, but we'll still have the same problem with the method resolution rules.

If we just want to affect one object and work around the method resolution rules, then we need to pass the singleton class (also called the eigenclass or metaclass) for the object, which you can get with the following expression.

    
metaclass = class << worker; self; end
    

So, to encapsulate all this and to get back to the original goal of implementing with-style semantics, here is an implementation that adds a with method to Object, wrapped in an RSpec example.

    
# run with "spec ruby_with_spec.rb" 

require 'rubygems'
require 'spec'

# Warning, monkeypatching Object, especially with a name
# that might be commonly used is fraught with peril!!

class Object
  def with *modules
    metaclass = class << self; self; end
    modules.flatten.each do |m|
      m.send :append_features, metaclass
    end
    self
  end
end

module WorkerTracer
    def work; "Before, #{super}, After"; end
end

module WorkerTracer1
    def work; "Before1, #{super}, After1"; end
end

class Worker 
    def work; "work"; end
end

describe "Object#with" do
  it "should make no changes to an object if no modules are specified" do
    worker = Worker.new.with
    worker.work.should == "work" 
  end

  it "should override any methods with a module's methods of the same name" do
    worker = Worker.new.with WorkerTracer
    worker.work.should == "Before, work, After" 
  end

  it "should stack overrides for multiple modules" do
    worker = Worker.new.with(WorkerTracer).with(WorkerTracer1)
    worker.work.should == "Before1, Before, work, After, After1" 
  end

  it "should stack overrides for a list of modules" do
    worker = Worker.new.with WorkerTracer, WorkerTracer1
    worker.work.should == "Before1, Before, work, After, After1" 
  end

  it "should stack overrides for an array of modules" do
    worker = Worker.new.with [WorkerTracer, WorkerTracer1]
    worker.work.should == "Before1, Before, work, After, After1" 
  end
end
    

You should carefully consider the warning about monkeypatching Object! Also, note that Module#append_features is actually private, so I had to use m.send :append_features, ... instead.

The syntax is reasonably intuitive and it eliminates the need for an explicit subclass. You can pass a single module, or a list or array of them. Because with returns the object, you can also chain with calls.

A final note: many developers steer clear of the metaprogramming and reflection features in their languages, out of fear. While prudence is definitely wise, the power of these tools can dramatically accelerate your productivity. Metaprogramming is just programming. Every developer should master it.

Traits vs. Aspects in Scala

Posted by Dean Wampler Sun, 28 Sep 2008 03:33:00 GMT

Scala traits provide a mixin composition mechanism that has been missing in Java. Roughly speaking, you can think of traits as analogous to Java interfaces, but with implementations.

Aspects, e.g., those written in AspectJ, are another mechanism for mixin composition in Java. How do aspects and traits compare?

Let’s look at an example trait first, then re-implement the same behavior using an AspectJ aspect, and finally compare the two approaches.

Observing with Traits

In a previous post on Scala, I gave an example of the Observer Pattern implemented using a trait. Chris Shorrock and James Iry provided improved versions in the comments. I’ll use James’ example here.

To keep things as simple as possible, let’s observe a simple Counter, which increments an internal count variable by the number input to an add method.

    
package example

class Counter {
    var count = 0
    def add(i: Int) = count += i
}
    

The count field is actually public, but I will only write to it through add.

Here is James’ Subject trait that implements the Observer Pattern.

    
package example

trait Subject {
  type Observer = { def receiveUpdate(subject:Any) }

  private var observers = List[Observer]()
  def addObserver(observer:Observer) = observers ::= observer
  def notifyObservers = observers foreach (_.receiveUpdate(this))
}
    

Effectively, this says that we can use any object as an Observer as long as it matches the structural type { def receiveUpdate(subject:Any) }. Think of structural types as anonymous interfaces. Here, a valid observer is one that has a receiveUpdate method taking an argument of Any type.

The rest of the trait manages a list of observers and defines a notifyObservers method. The expression observers ::= observer uses the List :: (“cons”) operator to prepend an item to the list. (Note, I am using the default immutable List, so a new copy is created every time.)

The notifyObservers method iterates through the observers, calling receiveUpdate on each one. The _ is a placeholder that gets replaced with each observer during the iteration.

Finally, here is a specs file that exercises the code.

    
package example

import org.specs._

object CounterObserverSpec extends Specification {
    "A Counter Observer" should {
        "observe counter increments" in {
            class CounterObserver {
                var updates = 0
                def receiveUpdate(subject:Any) = updates += 1
            }
            class WatchedCounter extends Counter with Subject {
                override def add(i: Int) = { 
                    super.add(i)
                    notifyObservers
                }
            }
            var watchedCounter = new WatchedCounter
            var counterObserver = new CounterObserver
            watchedCounter.addObserver(counterObserver)
            for (i <- 1 to 3) watchedCounter.add(i)
            counterObserver.updates must_== 3
            watchedCounter.count must_== 6
        }
    }
}
    

The specs library is a BDD tool inspired by rspec in Rubyland.

I won’t discuss all the specs-specific details here, but hopefully you’ll get the general idea of what it’s doing.

Inside the "observe counter increments" in {...}, I start by declaring two classes, CounterObserver and WatchedCounter. CounterObserver satisfies our required structural type, i.e., it provides a receiveUpdate method.

WatchedCounter subclasses Counter and mixes in the Subject trait. It overrides the add method, where it calls Counter’s add first, then notifies the observers. No parentheses are used in the invocation of notifyObservers because the method was not defined to take any!

Next, I create an instance of each class, add the observer to the WatchedCounter, and make 3 calls to watchedCounter.add.

Finally, I use the “actual must_== expected” idiom to test the results. The observer should have seen 3 updates, while the counter should have a total of 6.

The following simple bash shell script will build and run the code.

    
SCALA_HOME=...
SCALA_SPECS_HOME=...
CP=$SCALA_HOME/lib/scala-library.jar:$SCALA_SPECS_HOME/specs-1.3.1.jar:bin
rm -rf bin
mkdir -p bin
scalac -d bin -cp $CP src/example/*.scala
scala -cp $CP example.CounterObserverSpec
    

Note that I put all the sources in a src/example directory. Also, I’m using v1.3.1 of specs, as well as v2.7.1 of Scala. You should get the following output.

    
Specification "CounterObserverSpec" 
  A Counter Observer should
  + observe counter increments

Total for specification "CounterObserverSpec":
Finished in 0 second, 60 ms
1 example, 2 assertions, 0 failure, 0 error
    

Observing with Aspects

Because Scala compiles to Java byte code, I can use AspectJ to advise Scala code! For this to work, you have to be aware of how Scala represents its concepts in byte code. For example, object declarations, e.g., object Foo {...}, become static final classes. Also, method names like + become $plus in byte code.

However, most Scala type, method, and variable names can be used as is in AspectJ. This is true for my example.

Here is an aspect that observes calls to Counter.add.

    
package example

public aspect CounterObserver {
    after(Object counter, int value): 
        call(void *.add(int)) && target(counter) && args(value) {

        RecordedObservations.record("adding "+value);
    }
}
    

You can read this aspect as follows: after calling Counter.add (keeping track of the Counter object that was called and the value passed to the method), call the static method record on RecordedObservations.

I’m using a separate Scala object, RecordedObservations:

    
package example

object RecordedObservations {
    private var messages = List[String]()
    def record(message: String):Unit = messages ::= message
    def count() = messages.length
    def reset():Unit = messages = Nil
}
    

Recall that this is effectively a static final Java class. I need this separate object, rather than keeping information in the aspect itself, because of the simple-minded way I’m building the code. ;) However, it’s generally a good idea with aspects to delegate most of the work to Java or Scala code anyway.

Now, the “spec” file is:

    
package example

import org.specs._

object CounterObserverSpec extends Specification {
    "A Counter Observer" should {
        "observe counter increments" in {
            RecordedObservations.reset()
            var counter = new Counter
            for (i <- 1 to 3) counter.add(i)
            RecordedObservations.count() must_== 3
            counter.count must_== 6
        }
    }
}
    

This time, I don’t need two extra classes for mixing in a trait or defining an observer. Also, I call RecordedObservations.count to verify that record was called 3 times.

The build script is also slightly different to add the AspectJ compilation.

    
SCALA_HOME=...
SCALA_SPECS_HOME=...
ASPECTJ_HOME=...
CP=$SCALA_HOME/lib/scala-library.jar:$SCALA_SPECS_HOME/specs-1.3.1.jar:$ASPECTJ_HOME/lib/aspectjrt.jar:bin
rm -rf bin app.jar
mkdir -p bin
scalac -d bin -cp $CP src/example/*.scala 
ajc -1.5 -outjar app.jar -cp $CP -inpath bin src/example/CounterObserver.aj
aj -cp $ASPECTJ_HOME/lib/aspectjweaver.jar:app.jar:$CP example.CounterObserverSpec
    

The ajc command not only compiles the aspect, but also weaves it into the compiled Scala classes in the bin directory. (Actually, it only affects the Counter class.) Then it writes all the woven and unmodified class files to app.jar, which is used to execute the test. Note that for production use, you might prefer load-time weaving.

The output is the same as before (except for the milliseconds), so I won’t show it here.

Comparing Traits with Aspects

So far, both approaches are equally viable. The traits approach obviously doesn’t require a separate language and corresponding tool set.

However, traits have one important limitation with respect to aspects. Aspects let you define pointcuts that are queries over all possible points where new behavior or modifications might be desired. These points are called join points in aspect terminology. The aspect I showed above has a simple pointcut that selects one join point: calls to the Counter.add method.

However, what if I wanted to observe all state changes in all classes in a package? Defining traits for each case would be tedious and error prone, since it would be easy to overlook some cases. With an aspect framework like AspectJ, I can implement observation at all the points I care about in a modular way.

Aspect frameworks support this by providing wildcard mechanisms. I won’t go into the details here, but the * in the previous aspect is an example, matching any type. Also, one of the most powerful techniques for writing robust aspects is to use pointcuts that reference only annotations, a form of abstraction. As a final example, if I add an annotation Adder to Counter.add,

    
package example

class Counter {
    var count = 0
    @Adder def add(i: Int) = count += i
}
    

Then I can rewrite the aspect as follows.

    
package example

public aspect CounterObserver {
    after(Object counter, int value): 
        call(@Adder void *.*(int)) && target(counter) && args(value) {

        RecordedObservations.record("adding "+value);
    }
}
    

Now, there are no type and method names in the pointcut. Any instance method on any visible type that takes one int (or Scala Int) argument and is annotated with Adder will get matched.

Note: Scala requires that you create any custom annotations as normal Java annotations. Also, if you intend to use them with aspects, use the runtime retention policy, which will be necessary if you use load-time weaving.

Conclusion

If you need to mix in behavior in a specific, relatively-localized set of classes, Scala traits are probably all you need and you don’t need another language. If you need more “pervasive” modifications (e.g., tracing, policy enforcement, security), consider using aspects.

Acknowledgements

Thanks to Ramnivas Laddad, whose forthcoming 2nd Edition of AspectJ in Action got me thinking about this topic.

The Liskov Substitution Principle for "Duck-Typed" Languages

Posted by Dean Wampler Sun, 07 Sep 2008 04:48:00 GMT

OCP and LSP together tell us how to organize similar vs. variant behaviors. I blogged the other day about OCP in the context of languages with open classes (i.e., dynamically-typed languages). Let’s look at the Liskov Substitution Principle (LSP).

The Liskov Substitution Principle was coined by Barbara Liskov in Data Abstraction and Hierarchy (1987).

If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T.

I’ve always liked the elegant simplicity, yet power, of LSP. In less formal terms, it says that if a client (program) expects objects of one type to behave in a certain way, then it’s only okay to substitute objects of another type if the same expectations are satisfied.

This is our best definition of inheritance. The well-known is-a relationship between types is not precise enough. Rather, the relationship has to be behaves-as-a, which unfortunately is more of a mouthful. Note that is-a focuses on the structural relationship, while behaves-as-a focuses on the behavioral relationship. A very useful, pre-TDD design technique called Design by Contract emerges out of LSP, but that’s another topic.

Note that there is a slight assumption that I made in the previous paragraph. I said that LSP defines inheritance. Why inheritance specifically and not substitutability, in general? Well, inheritance has been the main vehicle for substitutability for most OO languages, especially the statically-typed ones.

For example, a Java application might use a simple tracing abstraction like this.

    
public interface Tracer {
    void trace(String message);
}
    

Clients might use this to trace method calls to a log. Only classes that implement the Tracer interface can be given to these clients. For example,

    
public class TracerClient {
    private Tracer tracer;

    public TracerClient(Tracer tracer) {
        this.tracer = tracer;
    }

    public void doWork() {
        tracer.trace("in doWork():");
        // ...
    }
}
    

However, Duck Typing is another form of substitutability that is commonly seen in dynamically-typed languages, like Ruby and Python.

If it walks like a duck and quacks like a duck, it must be a duck.

Informally, duck typing says that a client can use any object you give it as long as the object implements the methods the client wants to invoke on it. Put another way, the object must respond to the messages the client wants to send to it.

The object appears to be a “duck” as far as the client is concerned.

In our example, clients only care about the trace(message) method being supported. So, we might do the following in Ruby.

    
class TracerClient 
  def initialize tracer 
    @tracer = tracer
  end

  def do_work
    @tracer.trace "in do_work:" 
    # ... 
  end
end

class MyTracer
  def trace message
    p message
  end
end

client = TracerClient.new(MyTracer.new)
    

No “interface” is necessary. I just need to pass an object to TracerClient.initialize that responds to the trace message. Here, I defined a class for the purpose. You could also add the trace method to another type or object.
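
For instance, using a singleton method on a plain object (a sketch; the collector object here is made up for illustration):

```ruby
class TracerClient
  def initialize(tracer)
    @tracer = tracer
  end

  def do_work
    @tracer.trace "in do_work:"
    # ...
  end
end

# Any object will do, as long as it responds to trace. Here we add a
# singleton method to a plain Object on the fly.
collector = Object.new
def collector.trace(message)
  (@messages ||= []) << message
end

TracerClient.new(collector).do_work
p collector.instance_variable_get(:@messages)  # => ["in do_work:"]
```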

So, LSP is still essential, in the generic sense of valid substitutability, but it doesn’t have to be inheritance based.

Is Duck Typing good or bad? It largely comes down to your view about dynamically-typed vs. statically-typed languages. I don’t want to get into that debate here! However, I’ll make a few remarks.

On the negative side, without a Tracer abstraction, you have to rely on appropriate naming of objects to convey what they do (but you should be doing that anyway). Also, it’s harder to find all the “tracing-behaving” objects in the system.

On the other hand, the client really doesn’t care about a “Tracer” type, only a single method. So, we’ve decoupled “client” and “server” just a bit more. This decoupling is more evident when using closures to express behavior, e.g., for Enumerable methods. In our case, we could write the following.

    
class TracerClient2 
  def initialize &tracer 
    @tracer = tracer
  end

  def do_work 
    @tracer.call "in do_work:" 
    # ... 
  end
end

client = TracerClient2.new {|message| p "block tracer: #{message}"}
    

For comparison, consider how we might approach substitutability in Scala. As a statically-typed language, Scala doesn’t support duck typing per se, but it does support a very similar mechanism called structural types.

Essentially, structural types let us declare that a method parameter must support one or more methods, without having to say it supports a full interface. Loosely speaking, it’s like using an anonymous interface.

In our Java example, when we declare a tracer object in our client, we would declare only that it supports trace, without having to specify that it implements a full interface.

To be explicit, recall our Java constructor for TracerClient.

    
public class TracerClient {
    public TracerClient(Tracer tracer) { ... }
    // ...
}
    

In Scala, a complete example would be the following.

    
class ScalaTracerClient(val tracer: { def trace(message:String) }) {
    def doWork() = { tracer.trace("doWork") }
}

class ScalaTracer() {
    def trace(message: String) = { println("Scala: "+message) }
}

object TestScalaTracerClient {
    def main() {
        val client = new ScalaTracerClient(new ScalaTracer())
        client.doWork();
    }
}
TestScalaTracerClient.main()
    

Recall from my previous blogs on Scala that the argument list after the class name defines the constructor arguments. The constructor takes a tracer argument whose “type” (after the ’:’) is { def trace(message:String) }. That is, all we require of tracer is that it support the trace method.

So, we get duck type-like behavior, but statically type checked. We’ll get a compile error, rather than a run-time error, if someone passes an object to the client that doesn’t respond to trace.

To conclude, LSP can be reworded very slightly.

If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is substitutable for T.

I replaced a subtype of with substitutable for.

An important point is that the idea of a “contract” between the types and their clients is still important, even in a language with duck-typing or structural typing. However, languages with these features give us more ways to extend our system, while still supporting LSP.

The Open-Closed Principle for Languages with Open Classes

Posted by Dean Wampler Fri, 05 Sep 2008 02:42:00 GMT

We’ve been having a discussion inside Object Mentor World Design Headquarters about the meaning of the OCP for dynamic languages, like Ruby, with open classes.

For example, in Ruby it’s normal to define a class or module, e.g.,

    
# foo.rb
class Foo
    def method1 *args
        ...
    end
end
    

and later re-open the class and add (or redefine) methods,

    
# foo2.rb
class Foo
    def method2 *args
        ...
    end
end
    

Users of Foo see all the methods, as if Foo had one definition.

    
foo = Foo.new
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
    

Do open classes violate the Open-Closed Principle? Bertrand Meyer articulated OCP. Here is his definition1.

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.

He elaborated on it here.

... This is the open-closed principle, which in my opinion is one of the central innovations of object technology: the ability to use a software component as it is, while retaining the possibility of adding to it later through inheritance. Unlike the records or structures of other approaches, a class of object technology is both closed and open: closed because we can start using it for other components (its clients); open because we can at any time add new properties without invalidating its existing clients.

So, if one client require’s only foo.rb and only uses method1, that client doesn’t care what foo2.rb does. However, if the client also require’s foo2.rb, perhaps indirectly through another require, problems will ensue unless the client is unaffected by what foo2.rb does. This looks a lot like the way “good” inheritance should behave.

So, the answer is no, we aren’t violating OCP, as long as we extend a re-opened class following the same rules we would use when inheriting from it.

If we use inheritance instead:

    
# foo.rb
class Foo
    def method1 *args
        ...
    end
end
...
class DerivedFoo < Foo
    def method2 *args
        ...
    end
end
...
foo = DerivedFoo.new    # Instantiate different class...
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
    

One notable difference is that we have to instantiate a different class, and that is important. While you can often just use inheritance, and maybe you should prefer it, inheritance only works if you have full control over which types get instantiated and it’s easy to change the types you use. Of course, inheritance is also the best approach when you need all behavioral variants simultaneously, i.e., each variant in one or more objects.

Sometimes you want to affect the behavior of all instances transparently, without changing the types that are instantiated. A slightly better example, logging method calls, illustrates the point. Here we use the “famous” alias_method in Ruby.

    
# foo.rb
class Foo
    def method1 *args
        ...
    end
end
# logging_foo.rb
class Foo
    alias_method :old_method1, :method1
    def method1 *args
        p "Inside method1(#{args.inspect})" 
        old_method1 *args
    end
end
...
foo = Foo.new
foo.method1 :arg1, :arg2
    

The new Foo#method1 behaves like a subclass override, with extended behavior that still obeys the Liskov Substitution Principle (LSP).

So, I think the OCP can be reworded slightly.

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source modification.

We should not re-open the original source, but adding functionality through a separate source file is okay.

Actually, I prefer a slightly different wording.

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source and contract modification.

The extra and contract is redundant with LSP. I don’t think this kind of redundancy is necessarily bad. ;) The contract is the set of behavioral expectations between the “entity” and its client(s). Just as it is bad to break the contract with inheritance, it is also bad to break it through open classes.

OCP and LSP together are our most important design principles for effective organization of similar vs. variant behaviors. Inheritance is one way we do this. Open classes provide another way. Aspects provide a third way and are subject to the same design issues.

1 Meyer, Bertrand (1988). Object-Oriented Software Construction. Prentice Hall. ISBN 0136290493.

Baubles in Orbit

Posted by Uncle Bob Tue, 19 Aug 2008 06:29:32 GMT

I have put together a nice little demonstration of the Bauble concept. You may recall that I first wrote about it here. Baubles are a simple component scheme for Ruby, good for when you want a component, but don’t need something as heavy as a gem.

orbit.zip contains all the files for this demonstration. I suggest you download and unpack it.

First you need to install the Bauble gem. Don’t worry, it won’t hurt anything. Just say gem install orbit/bauble/Bauble-0.1.gem (You’ll probably have to do it with sudo.) That should install Bauble. From now on you only need to say require 'bauble' in your ruby scripts that make use of it.

Now you should be able to run the orbital simulator. Just type:

    
cd orbit/MultipleBodyOrbit/lib
jruby multiple_body_orbit.rb
    

A Swing window should pop up and you should be able to watch an orbital simulation. Every run shows a different random scenario, so you can kill a lot of time by watching worlds in collision.

The thing to note, if you are a Ruby programmer, is the use of Bauble::use(-some_directory-). If you look in the multiple_body_orbit.rb file, you’ll see that I use two Baubles: the Physics bauble does the raw calculation for all the gravity, forces, collisions, etc., and the cellular_automaton bauble provides a very simple Swing framework for drawing dots on a screen. (Yes, this is JRuby.)

If you look in either of the two Baubles, you’ll see that the require statements within them do not know (or care) about the directory they live in. There is none of that horrible __FILE__ nonsense that pollutes so many ruby scripts. This is because the Bauble::use function puts the directory path in the LOAD_PATH so that subsequent require statements can simply eliminate the directory spec.
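
The core idea can be sketched roughly like this (a hypothetical reconstruction, not the actual Bauble source, which may differ in details):

```ruby
# Sketch of the idea behind Bauble::use: prepend the bauble's
# directory to $LOAD_PATH so its internal requires can omit
# directory prefixes (and __FILE__ gymnastics).
module Bauble
  def self.use(bauble_dir)
    dir = File.expand_path(bauble_dir)
    $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)
  end
end

Bauble.use('orbit/physics/lib')
# Now `require 'physics'` would search orbit/physics/lib first.
```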

Take a look at the Bauble source code. It’s no great shakes.

Also take a look at the two baubles. They show a pretty nice way to decouple business rules from the GUI. You might recognize the MVP pattern. The multiple_body_orbit.rb file contains the presenter. Clearly the Physics module is the model, and the cellular_automaton module is the view. (There is no controller, because there is no input.)

The Seductions of Scala, Part III - Concurrent Programming

Posted by Dean Wampler Thu, 14 Aug 2008 21:00:00 GMT

This is my third and last blog entry on The Seductions of Scala, where we’ll look at concurrency using Actors and draw some final conclusions.

Writing Robust, Concurrent Programs with Scala

The most commonly used model of concurrency in imperative languages (and databases) uses shared, mutable state with access synchronization. (Recall that synchronization isn’t necessary for reading immutable objects.)

However, it’s widely known that this kind of concurrency programming is very difficult to do properly and few programmers are skilled enough to write such programs.

Because pure functional languages have no side effects and no shared, mutable state, there is nothing to synchronize. This is the main reason for the recent resurgence of interest in functional programming, as a potential solution to the so-called multicore problem.

Instead, some functional languages, in particular Erlang and Scala, use the Actor model of concurrency, where autonomous “objects” run in separate processes or threads and pass messages back and forth to communicate. The simplicity of the Actor model makes it far easier to create robust programs. Erlang processes are so lightweight that it is common for server-side applications to have thousands of communicating processes.
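
Before turning to Scala's library, the essence of the model can be crudely mimicked in Ruby: an "actor" is a thread that owns a queue as its mailbox and processes one message at a time, so no state is shared between threads. This is only a sketch, nothing like a production actor library:

```ruby
# A crude Ruby analogue of an actor: a thread draining a mailbox
# queue, one message at a time. Queue is thread-safe, so senders and
# the actor never touch shared mutable state directly.
class MiniActor
  def initialize(&handler)
    @mailbox = Queue.new
    @thread = Thread.new do
      loop do
        msg = @mailbox.pop          # blocks until a message arrives
        break if msg == :exit
        handler.call(msg)
      end
    end
  end

  def send_message(msg)             # roughly Scala's `actor ! msg`
    @mailbox << msg
  end

  def join; @thread.join; end
end

echo = MiniActor.new { |msg| puts "received: #{msg}" }
echo.send_message "hello"
echo.send_message "world!"
echo.send_message :exit
echo.join
```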

Actors in Scala

Let’s finish our survey of Scala with an example using Scala’s Actors library.

Here’s a simple Actor that just counts to 10, printing each number, one per second.

    
import scala.actors._
object CountingActor extends Actor { 
    def act() { 
        for (i <- 1 to 10) { 
            println("Number: "+i)
            Thread.sleep(1000) 
        } 
    } 
} 

CountingActor.start()
    

The last line starts the actor, which implicitly invokes the act method. This actor does not respond to any messages from other actors.

Here is an actor that responds to messages, echoing the message it receives.

    
import scala.actors.Actor._ 
val echoActor = actor {
    while (true) {
        receive {
            case msg => println("received: "+msg)
        }
    }
}
echoActor ! "hello" 
echoActor ! "world!" 
    

In this case, we do the equivalent of a Java “static import” of the methods on Actor, e.g., actor. Also, we don’t actually need a special class; we can just create an object with the desired behavior. This object has an infinite loop that effectively blocks while waiting for an incoming message. The receive method takes a block that is a match statement, which matches on anything received and prints it out.

Messages are sent using the target_actor ! message syntax.

As a final example, let’s do something non-trivial: a contrived network node monitor.

    
import scala.actors._
import scala.actors.Actor._
import java.net.InetAddress 
import java.io.IOException

case class NodeStatusRequest(address: InetAddress, respondTo: Actor) 

sealed abstract class NodeStatus
case class Available(address: InetAddress) extends NodeStatus
case class Unresponsive(address: InetAddress, reason: Option[String]) extends NodeStatus

object NetworkMonitor extends Actor {
    def act() {
        loop {
            react {  // Like receive, but uses thread polling for efficiency.
                case NodeStatusRequest(address, actor) => 
                    actor ! checkNodeStatus(address)
                case "EXIT" => exit()
            }
        }
    }
    val timeoutInMillis = 1000
    def checkNodeStatus(address: InetAddress) = {
        try {
            if (address.isReachable(timeoutInMillis)) 
                Available(address)
            else
                Unresponsive(address, None)
        } catch {
            case ex: IOException => 
                Unresponsive(address, Some("IOException thrown: "+ex.getMessage()))
        }
    }
}

// Try it out:

val allOnes = Array(1, 1, 1, 1).map(_.toByte)
NetworkMonitor.start()
NetworkMonitor ! NodeStatusRequest(InetAddress.getByName("www.scala-lang.org"), self)
NetworkMonitor ! NodeStatusRequest(InetAddress.getByAddress("localhost", allOnes), self)
NetworkMonitor ! NodeStatusRequest(InetAddress.getByName("objectmentor.com"), self)
NetworkMonitor ! "EXIT" 
self ! "No one expects the Spanish Inquisition!!" 

def handleNodeStatusResponse(response: NodeStatus) = response match {
    // Sealed classes help here
    case Available(address) => 
        println("Node "+address+" is alive.")
    case Unresponsive(address, None) => 
        println("Node "+address+" is unavailable. Reason: <unknown>")
    case Unresponsive(address, Some(reason)) => 
        println("Node "+address+" is unavailable. Reason: "+reason)
}

for (i <- 1 to 4) self.receive {   // Sealed classes don't help here
    case (response: NodeStatus) => handleNodeStatusResponse(response)
    case unexpected => println("Unexpected response: "+unexpected)
}
    

We begin by importing the Actor classes, the methods on Actor, like actor, and a few Java classes we need.

Next we define a NodeStatusRequest message type and a sealed abstract base class, NodeStatus. The sealed keyword tells the compiler that the only subclasses will be defined in this file. This is useful for the case statements that use them: since no new NodeStatus subclasses can appear elsewhere, the compiler can verify that our case clauses cover all the possibilities. Otherwise, we would have to add a default case clause (e.g., case _ => ...) to prevent warnings (and possible errors!) about unmatched inputs. Sealed class hierarchies are a useful feature for robustness (but watch for potential Open/Closed Principle violations!).
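
Here is a minimal sketch of the idea, using a made-up Light hierarchy rather than the NodeStatus classes above. Because the hierarchy is sealed, the match needs no default clause and the compiler can check it for exhaustiveness.

```scala
// Hypothetical sealed hierarchy; only these subclasses can exist.
sealed abstract class Light
case object Red extends Light
case object Green extends Light

def describe(light: Light): String = light match {
  case Red   => "stop"
  case Green => "go"
  // No case _ needed: the compiler knows Red and Green are the only cases.
}

println(describe(Red)) // => stop
```

If a third subclass were added to the file without a matching clause, the compiler would warn about the non-exhaustive match.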

The sealed class hierarchy encapsulates all the possible node status values (somewhat contrived for the example). The node is either Available or Unresponsive. If Unresponsive, an optional reason message is returned.

Note that we only get the benefit of sealed classes here because we match on them in the handleNodeStatusResponse method, which requires a response argument of type NodeStatus. In contrast, the receive method effectively takes an Any argument, so sealed classes don’t help on the line with the comment “Sealed classes don’t help here”. In that case, we really need a default, the case unexpected => ... clause. (I added the message self ! "No one expects the Spanish Inquisition!!" to test this default handler.)

In the first draft of this blog post, I didn’t know these details about sealed classes. I used a simpler implementation that couldn’t benefit from sealed classes. Thanks to the first commenter, LaLit Pant, who corrected my mistake!

The NetworkMonitor loops, waiting for a NodeStatusRequest or the special string “EXIT”, which tells it to quit. Note that the actor sending the request passes itself, so the monitor can reply to it.

The checkNodeStatus method attempts to contact the node, with a one-second timeout. It returns an appropriate NodeStatus.

Then we try it out with three addresses. Note that we pass self as the requesting actor. This is an Actor wrapping the current thread, imported from Actor. It is analogous to Java’s Thread.currentThread().

Curiously enough, when I run this code, I get the following results.

    
Unexpected response: No one expects the Spanish Inquisition!!
Node www.scala-lang.org/128.178.154.102 is unavailable. Reason: <unknown>
Node localhost/1.1.1.1 is unavailable. Reason: <unknown>
Node objectmentor.com/206.191.6.12 is alive.
    

The message about the Spanish Inquisition was sent last, but processed first, probably because self sent it to itself.

I’m not sure why www.scala-lang.org couldn’t be reached. A longer timeout didn’t help. According to the Javadocs for InetAddress.isReachable, it uses ICMP ECHO REQUESTs if the privilege can be obtained, otherwise it tries to establish a TCP connection on port 7 (Echo) of the destination host. Perhaps neither is supported on the scala-lang.org site.

Conclusions

Here are some concluding observations about Scala vis-à-vis Java and other options.

A Better Java

Ignoring the functional programming aspects for a moment, I think Scala improves on Java in a number of very useful ways, including:

  1. A more succinct syntax. There’s far less boilerplate, like for fields and their accessors. Type inference and optional semicolons, curly braces, etc. also reduce “noise”.
  2. A true mixin model. The addition of traits solves the problem of not having a good DRY way to mix in additional functionality declared by Java interfaces.
  3. More flexible method names and invocation syntax. Java took away operator overloading; Scala gives it back, as well as other benefits of using non-alphanumeric characters in method names. (Ruby programmers enjoy writing list.empty?, for example.)
  4. Tuples. A personal favorite, I’ve always wanted the ability to return multiple values from a method, without having to create an ad hoc class to hold the values.
  5. Better separation of mutable vs. immutable objects. While Java provides some ability to make objects final, Scala makes the distinction between mutability and immutability more explicit and encourages the latter as a more robust programming style.
  6. First-class functions and closures. Okay, these last two points are really about FP, but they sure help in OO code, too!
  7. Better mechanisms for avoiding null’s. The Option type makes code more robust than allowing null values.
  8. Interoperability with Java libraries. Scala compiles to byte code so adding Scala code to existing Java applications is about as seamless as possible.
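
Two of these points can be sketched in a few lines: tuples for returning multiple values (point 4) and Option instead of null (point 7). The divmod function here is a hypothetical helper, not from the post.

```scala
// Returns both the quotient and remainder as a tuple, wrapped in an
// Option so that division by zero yields None rather than an exception.
def divmod(a: Int, b: Int): Option[(Int, Int)] =
  if (b == 0) None else Some((a / b, a % b))

println(divmod(7, 2)) // => Some((3,1))
println(divmod(7, 0)) // => None
```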

So, even if you don’t believe in FP, you will gain a lot just by using Scala as a better Java.

Functional Programming

But, you shouldn’t ignore the benefits of FP!

  1. Better robustness. Not only for concurrent programs, but using immutable objects (a.k.a. value objects) reduces the potential for bugs.
  2. A workable concurrency model. I use the term workable because so few developers can write robust concurrent code using the synchronization on shared state model. Even for those of you who can, why bother when Actors are so much easier??
  3. Reduced code complexity. Functional code tends to be very succinct. I can’t overstate the importance of rooting out all accidental complexity in your code base. Excess complexity is one of the most pervasive detriments to productivity and morale that I see in my clients’ code bases!
  4. First-class functions and closures. Composition and succinct code are much easier with first-class functions.
  5. Pattern matching. FP-style pattern matching makes “routing” of messages and delegation much easier.
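
As a small sketch of the first-class functions point, two functions can be composed directly with andThen (the names here are illustrative):

```scala
// Two simple function values, composed into a third.
val double = (x: Int) => x * 2
val incr   = (x: Int) => x + 1
val both   = double andThen incr   // applies double, then incr

println(both(5)) // => 11
```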

Of course, you can mimic some of these features in pure Java and I encourage you to do so if you aren’t using Scala.

Static vs. Dynamic Typing

The debate on the relative merits of static vs. dynamic typing is outside our scope, but I will make a few personal observations.

I’ve been a dedicated Rubyist for a while. It is hard to deny the way that dynamic typing simplifies code and as I said in the previous section, I take code complexity very seriously.

Scala’s type system and type inference go a long way towards providing the benefits of static typing with the cleaner syntax of dynamic typing, but Scala doesn’t eliminate the extra complexity of static typing.

Recall my Observer example from the first blog post, where I used traits to implement it.

    
trait Observer[S] {
    def receiveUpdate(subject: S);
}

trait Subject[S] { 
    this: S =>
    private var observers: List[Observer[S]] = Nil
    def addObserver(observer: Observer[S]) = observers = observer :: observers

    def notifyObservers() = observers.foreach(_.receiveUpdate(this))
}
    
In Ruby, we might implement it this way.
    
module Subject
    def add_observer(observer)
        @observers ||= []
        @observers << observer  # append, rather than replace with new array
    end

    def notify_observers
        @observers.each {|o| o.receive_update(self)} if @observers
    end
end
    

There is no need for an Observer module. As long as every observer responds to the receive_update “message”, we’re fine.

I commented the line where I append to the existing @observers array, rather than build a new one, which would be the FP and Scala way. Appending to the existing array would be more typical of Ruby code, but this implementation is not as thread safe as an FP-style approach.

The trailing if expression in notify_observers means that nothing is done if @observers is still nil, i.e., it was never initialized in add_observer.

So, which is better? The amount of code is not that different, but it took me significantly longer to write the Scala version. In part, this was due to my novice chops, but the main reason it took me so long was that I had to solve a design issue resulting from the static typing. I had to learn about the typed self construct used in the first line of the Subject trait. This was the only way to allow the Observer.receiveUpdate method to accept an argument of type S, rather than of type Subject[S]. It was worth it to me to achieve the “cleaner” API.

Okay, perhaps I’ll know this next time and spend about the same amount of time implementing a Ruby vs. Scala version of something. However, I think it’s notable that sometimes static typing can get in the way of your intentions and goal of achieving clarity. (At other times, the types add useful documentation.) I know this isn’t the only valid argument you can make, one way or the other, but it’s one reason that dynamic languages are so appealing.

Poly-paradigm Languages vs. Mixing Several Languages

So, you’re convinced that you should use FP sometimes and OOP sometimes. Should you pick a poly-paradigm language, like Scala? Or, should you combine several languages, each of which implements one paradigm?

A potential downside of Scala is that supporting different modularity paradigms, like OOP and FP, increases the complexity in the language. I think Odersky and company have done a superb job combining FP and OOP in Scala, but if you compare Scala FP code to Haskell or Erlang FP code, the latter tend to be more succinct and often easier to understand (once you learn the syntax).

Indeed, Scala will not be easy for developers to master. It will be a powerful tool for professionals. As a consultant, I work with developers with a range of skills. I would not expect some of them to prosper with Scala. Should that rule out the language? NO. Rather it would be better to “liberate” the better developers with a more powerful tool.

So, if your application needs OOP and FP concepts interspersed, consider Scala. If your application needs discrete services, some of which are predominantly OOP and others of which are predominantly FP, then consider Scala or Java for the OOP parts and Erlang or another FP language for the FP parts.

Also, Erlang’s Actor model is more mature than Scala’s, so Erlang might have an edge for a highly-concurrent server application.

Of course, you should do your own analysis…

Final Thoughts

Java the language has had a great ride. It was a godsend to us beleaguered C++ programmers in the mid-’90s. However, compared to Scala, Java now feels obsolete. The JVM is another story. It is arguably the best VM available.

I hope Scala replaces Java as the main JVM language for projects that prefer statically-typed languages. Fans of dynamically-typed languages might prefer JRuby, Groovy, or Jython. It’s hard to argue with all the OOP and FP goodness that Scala provides. You will learn a lot about good language and application design by learning Scala. It will certainly be a prominent tool in my toolkit from now on.

The Seductions of Scala, Part II - Functional Programming 195

Posted by Dean Wampler Wed, 06 Aug 2008 01:32:00 GMT

A Functional Programming Language for the JVM

In my last blog post, I discussed Scala’s support for OOP and general improvements compared to Java. In this post, which I’m posting from Agile 2008, I discuss Scala’s support for functional programming (FP) and why it should be of interest to OO developers.

A Brief Overview of Functional Programming

You might ask, don’t most programming languages have functions? FP uses the term in the mathematical sense of the word. I hate to bring up bad memories, but you might recall from your school days that when you solved a function like

    
y = sin(x)
    

for y, given a value of x, you could input the same value of x an arbitrary number of times and you would get the same value of y. This means that sin(x) has no side effects. In other words, unlike our imperative OO or procedural code, no global or object state gets changed. All the work that a mathematical function does has to be returned in the result.

Similarly, the idea of a variable is a little different than what we’re used to in imperative code. While the value of y will vary with the value of x, once you have fixed x, you have also fixed y. The implication for FP is that “variables” are immutable; once assigned, they cannot be changed. I’ll call such immutable variables value objects.
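
A minimal sketch of this distinction, using Scala’s val and var keywords (discussed further below):

```scala
val x = 42        // an immutable "value object": fixed once assigned
// x = 43         // would not compile: "reassignment to val"

var y = 42        // a var, by contrast, is mutable imperative state
y = 43

println(x + " " + y) // => 42 43
```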

Now, it would actually be hard for a “pure” FP language to have no side effects, ever. I/O would be rather difficult, for example, since the state of the input or output stream changes with each operation. So, in practice, all “pure” FP languages provide some mechanisms for breaking the rules in a controlled way.

Functions are first-class objects in FP. You can create named or anonymous functions (e.g., closures or blocks), assign them to variables, pass them as arguments to other functions, etc. Java doesn’t support this. You have to create objects that wrap the methods you want to invoke.
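
For example, here is a small sketch of functions as values in Scala; square and applyTwice are made-up names for illustration.

```scala
// A function assigned to a variable...
val square = (n: Int) => n * n

// ...and a function that takes another function as an argument.
def applyTwice(f: Int => Int, n: Int): Int = f(f(n))

println(applyTwice(square, 2)) // => 16
```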

Functional programs tend to be much more declarative in nature than imperative programs. This is perhaps more obvious in pure FP languages, like Erlang and Haskell, than it is in Scala.

For example, the definition of Fibonacci numbers is the following.

    
F(n) = F(n-1) + F(n-2) where F(1)=1 and F(2)=1
    

And here is a complete implementation in Haskell.

    
module Main where 
-- Function f returns the n'th Fibonacci number. 
-- It uses binary recursion. 
f n | n <= 2 = 1 
    | n >  2 = f (n-1) + f (n-2) 
    

Without understanding the intricacies of Haskell syntax, you can see that the code closely matches the “specification” above it. The f n | ... syntax defines the function f taking an argument n; the two cases for n are shown on separate lines, one for n <= 2 and the other for n > 2.

The code uses the recursive relationship between different values of the function and the special-case values when n = 1 and n = 2. The Haskell runtime does the rest of the work.
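
For comparison, here is a sketch of the same definition in Scala, using pattern matching (covered later in this post) instead of Haskell’s guards:

```scala
// The n'th Fibonacci number, by binary recursion.
def fib(n: Int): Int = n match {
  case 1 | 2 => 1                       // the special-case values
  case _     => fib(n - 1) + fib(n - 2) // the recursive relationship
}

println(fib(10)) // => 55
```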

It’s interesting that most domain-specific languages are also declarative in nature. Think of how JMock, EasyMock or Rails’ ActiveRecord code look. The code is more succinct and it lets the “system” do most of the heavy lifting.

Functional Programming’s Benefits for You

Value Objects and Side-Effect Free Functions

It’s the immutable variables and side-effect free functions that help solve the multicore problem. Synchronized access to shared state is not required if there is no state to manage. This makes robust concurrent programs far easier to write.

I’ll discuss concurrency in Scala in my third post. For now, let’s discuss other ways that FP in Scala helps to improve code, concurrent or not.

Value objects are beneficial because you can pass one around without worrying that someone will change it in a way that breaks other users of the object. Value objects aren’t unique to FP, of course. They have been promoted in Domain Driven Design (DDD), for example.

Similarly, side-effect free functions are safer to use. There is less risk that a caller will change some state inappropriately. The caller doesn’t have to worry as much about calling a function. There are fewer surprises and everything of “consequence” that the function does is returned to the caller. It’s easier to keep to the Single Responsibility Principle when writing side-effect free functions.

Of course, you can write side-effect free methods and immutable variables in Java code, but it’s mostly a matter of discipline; the language doesn’t give you any enforcement mechanisms.

Scala gives you a helpful enforcement mechanism: the ability to declare variables as val’s (i.e., “values”) vs. var’s (i.e., “variables”, um… back to the imperative programming sense of the word…). In fact, val is the default wherever the language requires neither keyword. Also, the Scala library contains both immutable and mutable collections and it “encourages” you to use the immutable collections.
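
The immutable collections work the same way as val’s: operations never modify the original collection, they return a new one. A small sketch:

```scala
val xs = List(1, 2, 3)
val ys = 0 :: xs          // builds a new list; xs is untouched

println(xs) // => List(1, 2, 3)
println(ys) // => List(0, 1, 2, 3)
```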

However, because Scala combines both OOP and FP, it doesn’t force FP purity. The upside is that you get to use the approach that best fits the problem you’re trying to solve. It’s interesting that some of the Scala library classes expose FP-style interfaces, immutability and side-effect free functions, while using more traditional imperative code to implement them!

Closures and First-Class Functions

True to its functional side, Scala gives you true closures and first-class functions. If you’re a Groovy or Ruby programmer, you’re used to the following kind of code.

    
class ExpensiveResource {
    def open(worker: () => Unit) = {
        try {
            println("Doing expensive initialization")
            worker()
        } finally {
            close()
        }
    }
    def close() = {
        println("Doing expensive cleanup")
    }
}
// Example use:
try {
    (new ExpensiveResource()) open { () =>        // 1
        println("Using Resource")                 // 2
        throw new Exception("Thrown exception")   // 3
    }                                             // 4
} catch {
    case ex: Throwable => println("Exception caught: "+ex)
}
    

Running this code will yield:

    
Doing expensive initialization
Using Resource
Doing expensive cleanup
Exception caught: java.lang.Exception: Thrown exception
    

The ExpensiveResource.open method invokes the user-specified worker function. The syntax worker: () => Unit defines the worker parameter as a function that takes no arguments and returns nothing (recall that Unit is the equivalent of void).

ExpensiveResource.open handles the details of initializing the resource, invoking the worker, and doing the necessary cleanup.

The example marked with the comment // 1 creates a new ExpensiveResource, then calls open, passing it an anonymous function, called a function literal in Scala terminology. The function literal is of the form (arg_list) => function_body, or () => println(...) ... in our case.

A special syntax trick is used on this line; if a method takes one argument, you can change expressions of the form object.method(arg) to object method {arg}. This syntax is supported to allow user-defined methods to read like control structures (think for statements – see the next section). If you’re familiar with Ruby, the four commented lines read a lot like Ruby syntax for passing blocks to methods.
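
Here is a sketch of that trick in isolation, using a hypothetical Greeter object rather than ExpensiveResource:

```scala
object Greeter {
  def greet(name: String) = "hello, " + name
}

println(Greeter.greet("Dean"))     // the ordinary method-call form
println(Greeter greet { "Dean" })  // the same call, reading like a control structure
```

Both lines print hello, Dean; the second form only works because greet takes exactly one argument.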

Idioms like this are very important. A library writer can encapsulate all complex, error-prone logic and allow the user to specify only the unique work required in a given situation. For example, how many times have you written code that opened an I/O stream or a database connection, used it, then cleaned up? How many times did you get the idiom wrong, especially the proper cleanup when an exception is thrown? First-class functions allow writers of I/O, database and other resource libraries to implement the idiom correctly once, eliminating user error and duplication. Here’s a rhetorical question I always ask myself:

How can I make it impossible for the user of this API to fail?

Iterations

Iteration through collections, Lists in particular, is even more common in FP than in imperative languages. Hence, iteration is highly evolved. Consider this example:

    
object RequireWordsStartingWithPrefix {
    def main(args: Array[String]) = {
        val prefix = args(0)
        for {
            i <- 1 to (args.length - 1)   // no semicolon
            if args(i).startsWith(prefix)
        } println("args("+i+"): "+args(i))
    }
}
    

Compiling this code with scalac and then running it on the command line with the command

    
scala RequireWordsStartingWithPrefix xx xy1 xx1 yy1 xx2 xy2
    

produces the result

    
args(2): xx1
args(5): xx2
    

The for loop assigns the loop variable i to each index in turn, but only executes the body if the if condition is true. Instead of curly braces, the for loop argument list could also be parenthesized, but then each line as shown would have to be separated by a semicolon, like we’re used to seeing with Java for loops.
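
A closely related form, sketched here although the example above doesn’t use it: adding yield turns the loop into an expression that builds a new collection from the values passing the guard.

```scala
// Collect the even numbers from 1 to 10 into a new collection.
val evens = for (i <- 1 to 10 if i % 2 == 0) yield i

println(evens.toList) // => List(2, 4, 6, 8, 10)
```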

We can have an arbitrary number of assignments and conditionals. In fact, it’s quite common to filter lists:

    
object RequireWordsStartingWithPrefix2 {
    def main(args: Array[String]) = {
        val prefix = args(0)
        args.slice(1, args.length)
            .filter((arg: String) => arg.startsWith(prefix))
            .foreach((arg: String) => println("arg: "+arg))
    }
}
    

This version yields the same result. In this case, the args array is sliced (lopping off the search prefix), the resulting array is filtered using a function literal and the filtered array is iterated over to print out the matching arguments, again using a function literal. This version of the algorithm should look familiar to Ruby programmers.

Rolling Your Own Function Objects

Scala still has to support the constraints of the JVM. As a comment to the first blog post said, the Scala compiler wraps closures and “bare” functions in Function objects. You can also make other objects behave like functions. If your object implements the apply method, that method will be invoked when you put parentheses with a matching argument list on the object, as in the following example.

    
class HelloFunction {
    def apply() = "hello" 
    def apply(name: String) = "hello "+name
}
val hello = new HelloFunction
println(hello())        // => "hello" 
println(hello("Dean"))  // => "hello Dean" 
    

Option, None, Some…

Null pointer exceptions suck. You can still get them in Scala code, because Scala runs on the JVM and interoperates with Java libraries, but Scala offers a better way.

Typically, a reference might be null when there is nothing appropriate to assign to it. Following the conventions in some FP languages, Scala has an Option type with two subtypes, Some, which wraps a value, and None, which is used instead of null. The following example, which also demonstrates Scala’s Map support, shows these types in action.

    
val hotLangs = Map(
    "Scala" -> "Rocks", 
    "Haskell" -> "Ethereal", 
    "Java" -> null)
println(hotLangs.get("Scala"))          // => Some(Rocks)
println(hotLangs.get("Java"))           // => Some(null)
println(hotLangs.get("C++"))            // => None
    

Note that Map stores values in Option objects, as shown by the println statements.

By the way, -> isn’t a special operator; it’s a method. As with ::, this shows that valid method names aren’t limited to alphanumerics, _, and $.
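
A quick sketch of what -> actually does: it simply builds a key-value pair (a two-element tuple), which Map then consumes.

```scala
val pair = "Scala" -> "Rocks"   // just a method call producing a tuple

println(pair._1) // => Scala
println(pair._2) // => Rocks
```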

Pattern Matching

The last FP feature I’ll discuss in this post is pattern matching, which is exploited more fully in FP languages than in imperative languages.

Using our previous definition of hotLangs, here’s how you might use matching.

    
def show(key: String) = {
    val value: Option[String] = hotLangs.get(key)
    value match {
        case Some(x) => x
        case None => "No hotness found" 
    }
}
println(show("Scala"))  // => "Rocks" 
println(show("Java"))   // => "null" 
println(show("C++"))    // => "No hotness found" 
    

The first case statement, case Some(x) => x, says “if the value I’m matching against is a Some[String], then return x, the thing the Some contains.” Okay, there’s a lot going on here, so more background information is in order.

In Scala, like Ruby and other languages, the last value computed in a function is returned by it. Also, almost everything returns a value, including match statements, so when the Some(x) => x case is chosen, x is returned by the match and hence by the function.

Some is a generic class, declared as Some[+A], and the show function returns a String, so the match here is against Some[String]. The + in the declaration makes Some covariant in its type parameter, analogous to Java’s <? extends A>. Capiche?

Idioms like case Some(x) => x are called extractors in Scala and are used a lot in Scala, as well as in FP, in general. Here’s another example using Lists and our friend ::, the “cons” operator.

    
def countScalas(list: List[String]): Int = {
    list match {
        case "Scala" :: tail => countScalas(tail) + 1
        case _ :: tail       => countScalas(tail)
        case Nil             => 0
    }
}
val langs = List("Scala", "Java", "C++", "Scala", "Python", "Ruby")
val count = countScalas(langs)
println(count)    // => 2
    

We’re counting the number of occurrences of “Scala” in a list of strings, using matching and recursion and no explicit iteration. An expression of the form head :: tail applied to a list binds the first element to the head variable and the rest of the list to the tail variable. In our case, the first case statement looks for the particular case where the head equals "Scala". The second case matches all lists, except for the empty list (Nil). Since matches are eager, the first case will always pick out the List("Scala", ...) case first. Note that in the second case, we don’t actually care about the value, so we use the placeholder _. Both the first and second cases call countScalas recursively.

Pattern matching like this is powerful, yet succinct and elegant. We’ll see more examples of matching in the next blog post on concurrency using message passing.

Recap of Scala’s Functional Programming

I’ve just touched the tip of the iceberg concerning functional programming (and I hope I got all the details right!). Hopefully, you can begin to see why we’ve overlooked FP for too long!

In my last post, I’ll wrap up with a look at Scala’s approach to concurrency, the Actor model of message passing.

The Seductions of Scala, Part I 185

Posted by Dean Wampler Sun, 03 Aug 2008 20:30:00 GMT

(Update 12/23/2008: Thanks to Apostolos Syropoulos for pointing out an earlier reference for the concept of “traits”).

Because of all the recent hoo-ha about functional programming (e.g., as a “cure” for the multicore problem), I decided to cast aside my dysfunctional ways and learn one of the FP languages. The question was, which one?

My distinguished colleague, Michael Feathers, has been on a Haskell binge of late. Haskell is a pure functional language and is probably most interesting as the “flagship language” for academic exploration, rather than production use. (That was not meant as flame bait…) It’s hard to overestimate the influence Haskell has had on language design, including Java generics, .NET LINQ and F#, etc.

However, I decided to learn Scala first, because it is a JVM language that combines object-oriented and functional programming in one language. At ~13 years of age, Java is a bit dated. Scala has the potential of replacing Java as the principal language of the JVM, an extraordinary piece of engineering that is arguably now more valuable than the language itself. (Note: there is also a .NET version of Scala under development.)

Here are some of my observations, divided over three blog posts.

First, a few disclaimers. I am a Scala novice, so any flaws in my analysis reflect on me, not Scala! Also, this is by no means an exhaustive analysis of the pros and cons of Scala vs. other options. Start with the Scala website for more complete information.

A Better OOP Language

Scala works seamlessly with Java. You can invoke Java APIs, extend Java classes and implement Java interfaces. You can even invoke Scala code from Java, once you understand how certain “Scala-isms” are translated to Java constructs (javap is your friend). Scala syntax is more succinct and removes a lot of tedious boilerplate from Java code.

For example, the following Person class in Java:

    
class Person {
    private String firstName;
    private String lastName;
    private int    age;

    public Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName  = lastName;
        this.age       = age;
    }

    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getFirstName() { return this.firstName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getLastName() { return this.lastName; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return this.age; }
}
    

can be written in Scala thusly:

    
class Person(var firstName: String, var lastName: String, var age: Int)
    

Yes, that’s it. The constructor is the argument list to the class, where each parameter is declared as a variable (var keyword). It automatically generates the equivalent of getter and setter methods, meaning they look like Ruby-style attribute accessors; the getter is foo instead of getFoo and the setter is foo = instead of setFoo. Actually, the setter function is really foo_=, but Scala lets you use the foo = sugar.
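
A small sketch of those generated accessors in action (the particular field values are made up):

```scala
class Person(var firstName: String, var lastName: String, var age: Int)

val p = new Person("Dean", "Wampler", 39)
p.age = 40            // sugar for p.age_=(40)

println(p.firstName)  // => Dean
println(p.age)        // => 40
```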

Lots of other well designed conventions allow the language to define almost everything as a method, yet support forms of syntactic sugar like the illusion of operator overloading, Ruby-like DSL’s, etc.

You also get fewer semicolons, no requirements tying package and class definitions to the file system structure, type inference, multi-valued returns (tuples), and a better type and generics model.

One of the biggest deficiencies of Java is the lack of a complete mixin model. Mixins are small, focused (think Single Responsibility Principle ...) bits of state and behavior that can be added to classes (or objects) to extend them as needed. In a language like C++, you can use multiple inheritance for mixins. Because Java only supports single inheritance and interfaces, which can’t have any state and behavior, implementing a mixin-based design has always required various hacks. Aspect-Oriented Programming is also one partial solution to this problem.

The most exciting OOP enhancement Scala brings is its support for Traits, a concept first described here and more recently discussed here. Traits support Mixins (and other design techniques) through composition rather than inheritance. You could think of traits as interfaces with implementations. They work a lot like Ruby modules.

Here is an example of the Observer Pattern written as traits, where they are used to monitor changes to a bank account balance. First, here are reusable Subject and Observer traits.

    
trait Observer[S] {
    def receiveUpdate(subject: S);
}

trait Subject[S] { 
    this: S =>
    private var observers: List[Observer[S]] = Nil
    def addObserver(observer: Observer[S]) = observers = observer :: observers

    def notifyObservers() = observers.foreach(_.receiveUpdate(this))
}
    

In Scala, generics are declared with square brackets, [...], rather than angled brackets, <...>. Method definitions begin with the def keyword. The Observer trait defines one abstract method, which is called by the Subject to notify the observer of changes. The Subject is passed to the Observer.

This trait looks exactly like a Java interface. In fact, that’s how traits are represented in Java byte code. If the trait has state and behavior, like Subject, the byte code representation involves additional elements.

The Subject trait is more complex. The strange line, this: S => , is called a self type declaration. It tells the compiler that whenever this is referenced in the trait, treat its type as S, rather than Subject[S]. Without this declaration, the call to receiveUpdate in the notifyObservers method would not compile, because it would attempt to pass a Subject[S] object, rather than an S object. The self type declaration solves this problem.

The next line creates a private list of observers, initialized to Nil, an empty list. Variable declarations have the form name: type. Why didn’t they follow the Java convention, i.e., type name? Because this syntax makes the code easier to parse when type inference is used, that is, when the explicit : type is omitted and the compiler infers it.

In my examples, I’m using type inference for all the method declarations, because the compiler can figure out what each method returns. In this case, they all return type Unit, the equivalent of Java’s void. (The name Unit is a common term in functional languages.)
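As a tiny illustration of inferred versus explicit return types (the method names here are mine, not part of the Observer example):

```scala
// Return type inferred by the compiler (Int, from the body):
def twice(i: Int) = i * 2

// Equivalent, with the return type written out explicitly:
def twiceExplicit(i: Int): Int = i * 2

// A method executed only for its side effect returns Unit, Scala's
// equivalent of Java's void:
def report(msg: String): Unit = println(msg)
```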

The third line defines a method for adding a new observer to the list. Notice that concrete method definitions are of the form

    
def methodName(parameter: type, ...) = {
    method body
}  
    

In this case, because there is only one line, I dispensed with the {...}. The equals sign before the body emphasizes the functional nature of Scala: a method body is an expression whose value the method returns. We’ll revisit this in a moment and in the next post.

The method body prepends the new observer object to the existing list. Actually, a new list is created. The :: operator, called “cons”, binds to the right. This “operator” is really a method call, which could actually be written like this, observers.::(observer).
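A quick sketch of that equivalence (the value names are illustrative):

```scala
val rest  = List(2, 3)
val list1 = 1 :: rest      // operator notation; :: binds to the right
val list2 = rest.::(1)     // the same call in explicit method notation
// Both are List(1, 2, 3); rest is untouched, since Scala Lists are immutable.
```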

Our final method in Subject is notifyObservers. It iterates through observers and invokes the function literal _.receiveUpdate(this) on each one. The _ evaluates to the current observer reference. For comparison, in Ruby, you would define this method like so:

    
def notifyObservers() 
    @observers.each { |o| o.receiveUpdate(self) }
end
    

Okay, let’s look at how you would actually use these traits. First, our “plain-old Scala object” (POSO) Account.

    
class Account(initialBalance: Double) {
    private var currentBalance = initialBalance
    def balance = currentBalance
    def deposit(amount: Double)  = currentBalance += amount
    def withdraw(amount: Double) = currentBalance -= amount
}
    

Hopefully, this is self-explanatory, except for two things. First, recall that the body of the class declaration is actually the primary constructor, which is why we have an initialBalance: Double parameter on Account. This looks strange to the Java-trained eye, but it actually works well and is another example of Scala’s economy. (You can define multiple constructors, but I won’t go into that here…)
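For the curious, a minimal sketch of an auxiliary constructor, using a hypothetical AccountAux class so as not to redefine Account:

```scala
class AccountAux(initialBalance: Double) {
  def balance = initialBalance
  // An auxiliary constructor; its first action must be a call to the
  // primary constructor (or to another auxiliary constructor).
  def this() = this(0.0)
}
```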

Second, note that I omitted the parentheses when I defined the balance “getter” method. This supports the uniform access principle: clients simply call myAccount.balance, without parentheses, and I could redefine balance as a var or a val and the client code would not have to change!
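A sketch of what that flexibility buys you (the V1/V2 class names are mine):

```scala
class AccountV1(initialBalance: Double) {
  private var currentBalance = initialBalance
  def balance = currentBalance        // computed on each access
}

class AccountV2(initialBalance: Double) {
  val balance = initialBalance        // a stored value instead
}

// Client code reads identically in both cases:
val b1 = new AccountV1(100.0).balance
val b2 = new AccountV2(100.0).balance
```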

Next, a subclass that supports observation.

    
class ObservedAccount(initialBalance: Double) extends Account(initialBalance) with Subject[Account] {
    override def deposit(amount: Double) = {
        super.deposit(amount)
        notifyObservers()
    }
    override def withdraw(amount: Double) = {
        super.withdraw(amount)
        notifyObservers()
    }
}
    

The with keyword is how a trait is used, much the way that you implement an interface in Java, but now you don’t have to implement the interface’s methods. We’ve already done that.

Note that the clause extends Account(initialBalance) not only defines the (single) inheritance relationship, it also serves as the constructor’s call to super(initialBalance), so that Account is properly initialized.

Next, we have to override the deposit and withdraw methods, calling the parent methods and then invoking notifyObservers. Anytime you override a concrete method, Scala requires the override keyword. It tells you unambiguously that you are overriding a method, and the compiler reports an error if you aren’t actually overriding one, e.g., because of a typo. This makes the keyword much more reliable (and therefore more useful…) than Java’s @Override annotation.
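A small sketch of that compiler check (Base and Traced are illustrative names):

```scala
class Base { def work() = "work" }

class Traced extends Base {
  override def work() = "Before, " + super.work()  // fine: really overrides
}

// The following would be rejected at compile time, because the typo means
// nothing is actually being overridden:
//
// class Broken extends Base {
//   override def wrok() = "oops"   // error: method wrok overrides nothing
// }
```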

Finally, here is an Observer that prints to stdout when the balance changes.

    
class AccountReporter extends Observer[Account] {
    def receiveUpdate(account: Account) =
        println("Observed balance change: "+account.balance)
}
    

Rather than use with, I just extend the Observer trait, because I don’t have another parent class.

Here’s some code to test what we’ve done.

    
def changingBalance(account: Account) = {
    println("==== Starting balance: " + account.balance)
    println("Depositing $10.0")
    account.deposit(10.0)
    println("new balance: " + account.balance)
    println("Withdrawing $5.60")
    account.withdraw(5.6)
    println("new balance: " + account.balance)
}

var a = new Account(0.0)
changingBalance(a)

var oa = new ObservedAccount(0.0)
changingBalance(oa)
oa.addObserver(new AccountReporter)
changingBalance(oa)
    

Which prints out:

    
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 4.4
Depositing $10.0
Observed balance change: 14.4
new balance: 14.4
Withdrawing $5.60
Observed balance change: 8.8
new balance: 8.8
    

Note that we only observe the transactions in the final changingBalance call, after the observer has been registered.

Download Scala and try it out. Put all this code in one observer.scala file, for example, and run the command:

    
scala observer.scala
    

But Wait, There’s More!

In the next post, I’ll look at Scala’s support for Functional Programming and why OO programmers should find it interesting. In the third post, I’ll look at the specific case of concurrent programming in Scala and make some concluding observations of the pros and cons of Scala.

For now, here are some references for more information.

  • The Scala website, for downloads, documentation, mailing lists, etc.
  • Ted Neward’s excellent multipart introduction to Scala at developerWorks.
  • The forthcoming Programming in Scala book.

Always close() in a finally block 56

Posted by Dean Wampler Thu, 31 Jul 2008 05:12:00 GMT

Here’s one for my fellow Java programmers, but it’s really generally applicable.

When you call close() on I/O streams, readers, writers, network sockets, database connections, etc., it’s easy to forget the most appropriate idiom. I just spent a few hours fixing some examples of misuse in otherwise very good Java code.

What’s wrong with the following code?

    
public void writeContentToFile(String content, String fileName) throws Exception {
    File output = new File(fileName);
    OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream(output), "UTF-8");
    writer.write(content);
    writer.close();
}
    

It doesn’t look all that bad. It tells its story. It’s easy to understand.

However, from time to time you won’t reach the last line, which closes the writer. File and network I/O errors are common. For example, what if you can’t actually write to the location specified by fileName? So, we have to be more defensive. We want to be sure we always clean up.

The correct idiom is to use a try … finally … block.

    
public void writeContentToFile(String content, String fileName) throws Exception {
    File output = new File(fileName);
    OutputStreamWriter writer = null;
    try {
        writer = new OutputStreamWriter(new FileOutputStream(output), "UTF-8");
        writer.write(content);
    } finally {
        if (writer != null)
            writer.close();
    }
}
    

Now, no matter what happens, the writer will be closed, if it’s not null, even if writing the output was unsuccessful.

Note that we don’t necessarily need a catch block, because in this case we’re willing to let any Exceptions propagate up the stack (notice the throws clause). A lot of developers don’t realize that there are times when you need a try block, but not necessarily a catch block. This is one of those times.

So, anytime you need to clean up or otherwise release resources, use a finally block to ensure that the clean up happens, no matter what.
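The same idiom translates directly to the Scala explored in the previous posts; a minimal sketch (the method name mirrors the Java version above):

```scala
import java.io.{File, FileOutputStream, OutputStreamWriter}

def writeContentToFile(content: String, fileName: String): Unit = {
  // Creating the writer before the try means there is nothing to close
  // if the constructor itself throws.
  val writer = new OutputStreamWriter(new FileOutputStream(new File(fileName)), "UTF-8")
  try {
    writer.write(content)
  } finally {
    writer.close()   // runs whether or not the write succeeded
  }
}
```

Because the writer is created before entering the try block, no null check is needed in the finally clause.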
