Tighter Ruby Methods with Functional-style Pattern Matching, Using the Case Gem
Ruby doesn’t have overloaded methods, which are methods with the same name, but different signatures when you consider the argument lists and return values. This would be somewhat challenging to support in a dynamic language with very flexible options for method argument handling.
You can “simulate” overloading by parsing the argument list and taking different paths of execution based on the structure you find. This post discusses how pattern matching, a hallmark of functional programming, gives you powerful options.
First, let’s look at a typical example that handles the arguments in an ad hoc fashion. Consider the following Person class. You can pass three arguments to the initializer, the first_name, the last_name, and the age. Or, you can pass a hash using the keys :first_name, :last_name, and :age.
require "rubygems"
require "spec"
class Person
attr_reader :first_name, :last_name, :age
def initialize *args
arg = args[0]
if arg.kind_of? Hash # 1
@first_name = arg[:first_name]
@last_name = arg[:last_name]
@age = arg[:age]
else
@first_name = args[0]
@last_name = args[1]
@age = args[2]
end
end
end
describe "Person#initialize" do
it "should accept a hash with key-value pairs for the attributes" do
person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
it "should accept a first name, last name, and age arguments" do
person = Person.new "Dean", "Wampler", 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
end
The condition on the # 1 comment line checks to see if the first argument is a Hash. If so, the attribute values are extracted from it. Otherwise, it is assumed that three arguments were specified in a particular order. They are passed to #initialize in a three-element array. The two rspec examples exercise these behaviors. For simplicity, we ignore some more general cases, as well as error handling.
Another, more flexible approach is to use duck typing instead. For example, we could replace the line with the # 1 comment with this line:
if arg.respond_to? :has_key?
There aren’t many objects that respond to #has_key?, so we’re highly confident that we can use [symbol] to extract the values from the hash.
This implementation is fairly straightforward. You’ve probably written code like this yourself. However, it could get complicated for more involved cases.
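As a minimal sketch of the duck-typed variant described above, here is the same initializer without the rspec scaffolding. The class name DuckTypedPerson is hypothetical, chosen only to keep this example separate from the Person class in the post.

```ruby
# Hypothetical variant of Person that checks behavior (#has_key?)
# rather than the class of the first argument.
class DuckTypedPerson
  attr_reader :first_name, :last_name, :age

  def initialize(*args)
    arg = args[0]
    if arg.respond_to?(:has_key?)
      # Anything hash-like is accepted, not only Hash instances.
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    else
      # Otherwise assume three positional arguments in a fixed order.
      @first_name, @last_name, @age = args
    end
  end
end

from_hash = DuckTypedPerson.new(:first_name => "Dean", :last_name => "Wampler", :age => 39)
from_list = DuckTypedPerson.new("Dean", "Wampler", 39)
```

Both objects end up with the same attribute values, which is the behavior the two rspec examples above check.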
Pattern Matching, a Functional Programming Approach
Most programming languages today have switch or case statements of some sort, and most have support for regular expression matching. However, in functional programming languages, pattern matching is so important and pervasive that these languages offer very powerful and convenient support for it.
Fortunately, we can get powerful pattern matching, typical of functional languages, in Ruby using the Case gem that is part of MenTaLguY’s Omnibus Concurrency library. Omnibus provides support for the hot Actor model of concurrency, which Erlang has made famous. However, it would be a shame to restrict the use of the Case gem to parsing Actor messages. It’s much more general purpose than that.
Let’s rework our example using the Case gem.
require "rubygems"
require "spec"
require "case"
class Person
attr_reader :first_name, :last_name, :age
def initialize *args
case args
when Case[Hash] # 1
arg = args[0]
@first_name = arg[:first_name]
@last_name = arg[:last_name]
@age = arg[:age]
else
@first_name = args[0]
@last_name = args[1]
@age = args[2]
end
end
end
describe "Person#initialize" do
it "should accept a first name, last name, and age arguments" do
person = Person.new "Dean", "Wampler", 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
it "should accept a hash with :first_name => fn, :last_name => ln, and :age => age" do
person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
end
We require the case gem, which puts the #=== method on steroids. In the when statement in #initialize, the expression when Case[Hash] matches on a one-element array where the element is a Hash. We extract the key-value pairs as before. The else clause assumes we have an array for the arguments.
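To see why a when clause can match on structure at all, recall that Ruby's case statement calls #=== on each when expression. The sketch below uses a hypothetical ArrayPattern class as a stand-in for the gem's Case[...]. It is not the gem's implementation, only an illustration of the idea in plain Ruby.

```ruby
# Hypothetical stand-in for Case[...]; NOT the Case gem's actual code.
class ArrayPattern
  def self.[](*patterns)
    new(patterns)
  end

  def initialize(patterns)
    @patterns = patterns
  end

  # case/when evaluates `pattern === value` for each when clause.
  # We match arrays of the same length whose elements each satisfy
  # the corresponding element pattern (again via #===).
  def ===(value)
    value.is_a?(Array) &&
      value.size == @patterns.size &&
      @patterns.zip(value).all? { |pattern, element| pattern === element }
  end
end

def describe_args(*args)
  case args
  when ArrayPattern[Hash]                    then :hash
  when ArrayPattern[String, String, Integer] then :positional
  else                                            :unknown
  end
end
```

Because Class#=== is an is-a check, passing classes such as Hash or String as element patterns gives type matching for free.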
So far, this isn’t very impressive, but all we did was reproduce the original behavior. Let’s extend the example to really exploit some of the neat features of the Case gem’s pattern matching. First, let’s narrow the allowed array values.
require "rubygems"
require "spec"
require "case"
class Person
attr_reader :first_name, :last_name, :age
def initialize *args
case args
when Case[Hash] # 1
arg = args[0]
@first_name = arg[:first_name]
@last_name = arg[:last_name]
@age = arg[:age]
when Case[String, String, Integer]
@first_name = args[0]
@last_name = args[1]
@age = args[2]
else
raise "Invalid arguments: #{args}"
end
end
end
describe "Person#initialize" do
it "should accept a first name, last name, and age arguments" do
person = Person.new "Dean", "Wampler", 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
it "should accept a hash with :first_name => fn, :last_name => ln, and :age => age" do
person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
it "should not accept an array unless it is a [String, String, Integer]" do
lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
end
end
The new expression when Case[String, String, Integer] only matches a three-element array where the first two arguments are strings and the third argument is an integer, which are the types we want. If you use an array with a different number of arguments or the arguments have different types, this when clause won’t match. Instead, you’ll get the default else clause, which raises an exception. We added another rspec example to test this condition, where the user’s age was specified as a string instead of as an integer. Of course, you could decide to attempt a conversion of this argument, to make your code more “forgiving” of user mistakes.
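As a rough illustration of the “forgiving” option, the sketch below tries to convert the age before giving up. The helper name coerce_age is hypothetical, and Kernel#Integer is only one plausible conversion rule. It is stricter than String#to_i, which would silently turn "old" into 0.

```ruby
# Hypothetical helper: accept an Integer or a String that parses as one.
def coerce_age(age)
  Integer(age)
rescue ArgumentError, TypeError
  # Integer() raises ArgumentError for "old" and TypeError for nil.
  raise ArgumentError, "Invalid age: #{age.inspect}"
end

coerce_age(39)    # => 39
coerce_age("39")  # => 39
```

The converted value could then be fed to the Case[String, String, Integer] match unchanged.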
Similarly, what happens if the method supports default values for some of the parameters? As written, we can’t support that option, but let’s look at a slight variation of Person#initialize, where a hash of values is not supported, to see what would happen.
require "rubygems"
require "spec"
require "case"
class Person
attr_reader :first_name, :last_name, :age
def initialize first_name = "Bob", last_name = "Martin", age = 29
case [first_name, last_name, age]
when Case[String, String, Integer]
@first_name = first_name
@last_name = last_name
@age = age
else
raise "Invalid arguments: #{first_name}, #{last_name}, #{age}"
end
end
end
def check person, expected_fn, expected_ln, expected_age
person.first_name.should == expected_fn
person.last_name.should == expected_ln
person.age.should == expected_age
end
describe "Person#initialize" do
it "should require a first name (string), last name (string), and age (integer) arguments" do
person = Person.new "Dean", "Wampler", 39
check person, "Dean", "Wampler", 39
end
it "should accept the defaults for all parameters" do
person = Person.new
check person, "Bob", "Martin", 29
end
it "should accept the defaults for the last name and age parameters" do
person = Person.new "Dean"
check person, "Dean", "Martin", 29
end
it "should accept the defaults for the age parameter" do
person = Person.new "Dean", "Wampler"
check person, "Dean", "Wampler", 29
end
it "should not accept the first name as a symbol" do
lambda { person = Person.new :Dean, "Wampler", 39 }.should raise_error(Exception)
end
it "should not accept the last name as a symbol" do
lambda { person = Person.new "Dean", :Wampler, 39 }.should raise_error(Exception)
end
it "should not accept the age as a string" do
lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
end
end
We match on all three arguments as an array, asserting they are of the correct type. As you might expect, #initialize always gets three parameters passed to it, including when default values are used.
Let’s return to our original example, where the object can be constructed with a hash or a list of arguments. There are two more things (at least …) that we can do. First, we’re not yet validating the types of the values in the hash. Second, we can use the Case gem to impose constraints on the values, such as requiring non-empty name strings and a positive age.
require "rubygems"
require "spec"
require "case"
class Person
attr_reader :first_name, :last_name, :age
def initialize *args
case args
when Case[Hash]
arg = args[0]
@first_name = arg[:first_name]
@last_name = arg[:last_name]
@age = arg[:age]
when Case[String, String, Integer]
@first_name = args[0]
@last_name = args[1]
@age = args[2]
else
raise "Invalid arguments: #{args}"
end
validate_name @first_name, "first_name"
validate_name @last_name, "last_name"
validate_age
end
protected
def validate_name name, field_name
case name
when Case::All[String, Case.guard {|s| s.length > 0 }]
else
raise "Invalid #{field_name}: #{name}"
end
end
def validate_age
case @age
when Case::All[Integer, Case.guard {|n| n > 0 }]
else
raise "Invalid age: #{@age}"
end
end
end
describe "Person#initialize" do
it "should accept a first name, last name, and age arguments" do
person = Person.new "Dean", "Wampler", 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
it "should accept a hash with :first_name => fn, :last_name => ln, and :age => age" do
person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
person.first_name.should == "Dean"
person.last_name.should == "Wampler"
person.age.should == 39
end
it "should not accept an array unless it is a [String, String, Integer]" do
lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
end
it "should not accept a first name that is a zero-length string" do
lambda { person = Person.new "", "Wampler", 39 }.should raise_error(Exception)
end
it "should not accept a first name that is not a string" do
lambda { person = Person.new :Dean, "Wampler", 39 }.should raise_error(Exception)
end
it "should not accept a last name that is a zero-length string" do
lambda { person = Person.new "Dean", "", 39 }.should raise_error(Exception)
end
it "should not accept a last name that is not a string" do
lambda { person = Person.new "Dean", :Wampler, 39 }.should raise_error(Exception)
end
it "should not accept an age that is less than or equal to zero" do
lambda { person = Person.new "Dean", "Wampler", -1 }.should raise_error(Exception)
lambda { person = Person.new "Dean", "Wampler", 0 }.should raise_error(Exception)
end
it "should not accept an age that is not an integer" do
lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
end
end
We have added validate_name and validate_age methods that are invoked at the end of #initialize. In validate_name, the one when clause requires “all” the conditions to be true: that the name is a string and that it has a non-zero length. Similarly, validate_age has a when clause that requires age to be a positive integer.
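The “all conditions must hold” idea can also be sketched in plain Ruby, since a lambda responds to #=== by calling itself. The AllOf class and the NON_EMPTY and POSITIVE constants below are hypothetical stand-ins for Case::All and Case.guard, not the gem's code.

```ruby
# Hypothetical combinator: matches only if every sub-pattern matches.
class AllOf
  def self.[](*patterns)
    new(patterns)
  end

  def initialize(patterns)
    @patterns = patterns
  end

  # all? stops at the first failing pattern, so a guard such as
  # POSITIVE never runs against a value that failed the type check.
  def ===(value)
    @patterns.all? { |pattern| pattern === value }
  end
end

NON_EMPTY = lambda { |s| s.length > 0 }  # Proc#=== invokes the lambda
POSITIVE  = lambda { |n| n > 0 }

def valid_name?(name)
  case name
  when AllOf[String, NON_EMPTY] then true
  else false
  end
end

def valid_age?(age)
  case age
  when AllOf[Integer, POSITIVE] then true
  else false
  end
end
```

Putting the type check first matters here: "39" > 0 would raise an error, but the Integer pattern rejects the string before the guard is evaluated.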
Final Thoughts
So, how valuable is this? The code is certainly longer, but it specifies and enforces expected behavior more precisely. The rspec examples verify the enforcement. It smells a little of static typing, which is good or bad, depending on your point of view. ;)
Personally, I think the conditional checks are a good way to add robustness in small ways to libraries that will grow and evolve for a long time. The checks document the required behavior for code readers, like new team members, but of course, they should really get that information from the tests. ;) (However, it would be nice to extract the information into the rdocs.)
For small, short-lived projects, I might not worry about the conditional checks as much (but how many times have those “short-lived projects” refused to die?).
You can read more about Omnibus and Case in this InfoQ interview with MenTaLguY. I didn’t discuss using the Actor model of concurrency, for which these gems were designed. For an example of Actors using Omnibus, see my Better Ruby through Functional Programming presentation or the Confreaks video of an earlier version of the presentation I gave at last year’s RubyConf.
The Seductions of Scala, Part II - Functional Programming
A Functional Programming Language for the JVM
In my last blog post, I discussed Scala’s support for OOP and general improvements compared to Java. In this post, which I’m posting from Agile 2008, I discuss Scala’s support for functional programming (FP) and why it should be of interest to OO developers.
A Brief Overview of Functional Programming
You might ask, don’t most programming languages have functions? FP uses the term in the mathematical sense of the word. I hate to bring up bad memories, but you might recall from your school days that when you solved a function like y = sin(x) for y, given a value of x, you could input the same value of x an arbitrary number of times and you would get the same value of y. This means that sin(x) has no side effects. In other words, unlike our imperative OO or procedural code, no global or object state gets changed. All the work that a mathematical function does has to be returned in the result.
Similarly, the idea of a variable is a little different than what we’re used to in imperative code. While the value of y will vary with the value of x, once you have fixed x, you have also fixed y. The implication for FP is that “variables” are immutable; once assigned, they cannot be changed. I’ll call such immutable variables value objects.
Now, it would actually be hard for a “pure” FP language to have no side effects, ever. I/O would be rather difficult, for example, since the state of the input or output stream changes with each operation. So, in practice, all “pure” FP languages provide some mechanisms for breaking the rules in a controlled way.
Functions are first-class objects in FP. You can create named or anonymous functions (e.g., closures or blocks), assign them to variables, pass them as arguments to other functions, etc. Java doesn’t support this. You have to create objects that wrap the methods you want to invoke.
Functional programs tend to be much more declarative in nature than imperative programs. This is perhaps more obvious in pure FP languages, like Erlang and Haskell, than it is in Scala.
For example, the definition of Fibonacci numbers is the following.
F(n) = F(n-1) + F(n-2) where F(1)=1 and F(2)=1
And here is a complete implementation in Haskell.
module Main where
-- Function f returns the n'th Fibonacci number.
-- It uses binary recursion.
f n | n <= 2 = 1
    | n > 2 = f (n-1) + f (n-2)
Without understanding the intricacies of Haskell syntax, you can see that the code closely matches the “specification” above it. The f n | ... syntax defines the function f taking an argument n, and the two cases of n values are shown on separate lines, where one case is for n <= 2 and the other case is for n > 2.
The code uses the recursive relationship between different values of the function and the special-case values when n = 1 and n = 2. The Haskell runtime does the rest of the work.
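For readers who know Ruby better than Haskell, the same definition can be sketched as a one-line method. This is only a transliteration of the two cases above, and like the Haskell version it uses naive binary recursion, so it is slow for large n.

```ruby
# F(n) = F(n-1) + F(n-2), with F(1) = F(2) = 1
def fib(n)
  n <= 2 ? 1 : fib(n - 1) + fib(n - 2)
end

fib(10)  # => 55
```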
It’s interesting that most domain-specific languages are also declarative in nature. Think of how JMock, EasyMock or Rails’ ActiveRecord code look. The code is more succinct and it lets the “system” do most of the heavy lifting.
Functional Programming’s Benefits for You
Value Objects and Side-Effect Free Functions
It’s the immutable variables and side-effect free functions that help solve the multicore problem. Synchronized access to shared state is not required if there is no state to manage. This makes robust concurrent programs far easier to write.
I’ll discuss concurrency in Scala in my third post. For now, let’s discuss other ways that FP in Scala helps to improve code, concurrent or not.
Value objects are beneficial because you can pass one around without worrying that someone will change it in a way that breaks other users of the object. Value objects aren’t unique to FP, of course. They have been promoted in Domain Driven Design (DDD), for example.
Similarly, side-effect free functions are safer to use. There is less risk that a caller will change some state inappropriately. The caller doesn’t have to worry as much about calling a function. There are fewer surprises and everything of “consequence” that the function does is returned to the caller. It’s easier to keep to the Single Responsibility Principle when writing side-effect free functions.
Of course, you can write side-effect free methods and immutable variables in Java code, but it’s mostly a matter of discipline; the language doesn’t give you any enforcement mechanisms.
Scala gives you a helpful enforcement mechanism: the ability to declare variables as val’s (i.e., “values”) vs. var’s (i.e., “variables”, um… back to the imperative programming sense of the word…). In fact, val is the default, where neither is required by the language. Also, the Scala library contains both immutable and mutable collections and it “encourages” you to use the immutable collections.
However, because Scala combines both OOP and FP, it doesn’t force FP purity. The upside is that you get to use the approach that best fits the problem you’re trying to solve. It’s interesting that some of the Scala library classes expose FP-style interfaces, immutability and side-effect free functions, while using more traditional imperative code to implement them!
Closures and First-Class Functions
True to its functional side, Scala gives you true closures and first-class functions. If you’re a Groovy or Ruby programmer, you’re used to the following kind of code.
class ExpensiveResource {
def open(worker: () => Unit) = {
try {
println("Doing expensive initialization")
worker()
} finally {
close()
}
}
def close() = {
println("Doing expensive cleanup")
}
}
// Example use:
try {
(new ExpensiveResource()) open { () => // 1
println("Using Resource") // 2
throw new Exception("Thrown exception") // 3
} // 4
} catch {
case ex: Throwable => println("Exception caught: "+ex)
}
Running this code will yield:
Doing expensive initialization
Using Resource
Doing expensive cleanup
Exception caught: java.lang.Exception: Thrown exception
The ExpensiveResource.open method invokes the user-specified worker function. The syntax worker: () => Unit defines the worker parameter as a function that takes no arguments and returns nothing (recall that Unit is the equivalent of void).
ExpensiveResource.open handles the details of initializing the resource, invoking the worker, and doing the necessary cleanup.
The example marked with the comment // 1 creates a new ExpensiveResource, then calls open, passing it an anonymous function, called a function literal in Scala terminology. The function literal is of the form (arg_list) => function body or () => println(...) ..., in our case.
A special syntax trick is used on this line; if a method takes one argument, you can change expressions of the form object.method(arg) to object method {arg}. This syntax is supported to allow user-defined methods to read like control structures (think for statements – see the next section). If you’re familiar with Ruby, the four commented lines read a lot like Ruby syntax for passing blocks to methods.
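For comparison, here is a sketch of how the same resource idiom might look in Ruby, with a block playing the role of the function literal. To keep the example easy to check, messages are collected in a log array (an assumption of this sketch) instead of being printed.

```ruby
# Ruby analogue of the Scala ExpensiveResource; the log array is a
# hypothetical addition so the sequence of events can be inspected.
class ExpensiveResource
  attr_reader :log

  def initialize
    @log = []
  end

  def open
    @log << "Doing expensive initialization"
    yield self   # run the caller's block
  ensure
    close        # always runs, even if the block raises
  end

  def close
    @log << "Doing expensive cleanup"
  end
end

resource = ExpensiveResource.new
begin
  resource.open do |r|
    r.log << "Using Resource"
    raise "Thrown exception"
  end
rescue => ex
  resource.log << "Exception caught: #{ex.message}"
end
```

After this runs, resource.log holds the four messages in the same order as the Scala output: initialization, use, cleanup, then the caught exception.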
Idioms like this are very important. A library writer can encapsulate all complex, error-prone logic and allow the user to specify only the unique work required in a given situation. For example, how many times have you written code that opened an I/O stream or a database connection, used it, then cleaned up? How many times did you get the idiom wrong, especially the proper cleanup when an exception is thrown? First-class functions allow writers of I/O, database and other resource libraries to do the correct implementation once, eliminating user error and duplication. Here’s a rhetorical question I always ask myself:
How can I make it impossible for the user of this API to fail?
Iterations
Iteration through collections, Lists in particular, is even more common in FP than in imperative languages. Hence, iteration is highly evolved. Consider this example:
object RequireWordsStartingWithPrefix {
def main(args: Array[String]) = {
val prefix = args(0)
for {
i <- 1 to (args.length - 1) // no semicolon
if args(i).startsWith(prefix)
} println("args("+i+"): "+args(i))
}
}
Compiling this code with scalac and then running it on the command line with the command
scala RequireWordsStartingWithPrefix xx xy1 xx1 yy1 xx2 xy2
produces the result
args(2): xx1
args(5): xx2
The for loop binds the loop variable i to each argument index in turn, but only if the if condition is true. Instead of curly braces, the for loop argument list could also be parenthesized, but then each line as shown would have to be separated by a semicolon, like we’re used to seeing with Java for loops.
We can have an arbitrary number of assignments and conditionals. In fact, it’s quite common to filter lists:
object RequireWordsStartingWithPrefix2 {
def main(args: Array[String]) = {
val prefix = args(0)
args.slice(1, args.length)
.filter((arg: String) => arg.startsWith(prefix))
.foreach((arg: String) => println("arg: "+arg))
}
}
This version yields the same result. In this case, the args array is sliced (lopping off the search prefix), the resulting array is filtered using a function literal and the filtered array is iterated over to print out the matching arguments, again using a function literal. This version of the algorithm should look familiar to Ruby programmers.
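To make the Ruby comparison concrete, here is a sketch of the same slice-then-filter pipeline. The method name words_with_prefix is hypothetical, and it returns the matches rather than printing them so the result is easy to inspect.

```ruby
# The first element is the prefix; the rest are candidate words.
def words_with_prefix(args)
  prefix = args[0]
  args.drop(1).select { |arg| arg.start_with?(prefix) }
end

words_with_prefix(%w[xx xy1 xx1 yy1 xx2 xy2])  # => ["xx1", "xx2"]
```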
Rolling Your Own Function Objects
Scala still has to support the constraints of the JVM. As a comment to the first blog post said, the Scala compiler wraps closures and “bare” functions in Function objects. You can also make other objects behave like functions. If your object implements the apply method, that method will be invoked if you put parentheses with a matching argument list on the object, as in the following example.
class HelloFunction {
def apply() = "hello"
def apply(name: String) = "hello "+name
}
val hello = new HelloFunction
println(hello()) // => "hello"
println(hello("Dean")) // => "hello Dean"
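Ruby has a loosely similar convention, sketched below: an object that defines call can be invoked with the .() shorthand. Ruby has no method overloading, so this hypothetical HelloFunction uses a default argument where the Scala version defines two apply methods.

```ruby
class HelloFunction
  # One method with an optional argument stands in for the two
  # overloaded apply methods of the Scala example.
  def call(name = nil)
    name ? "hello #{name}" : "hello"
  end
end

hello = HelloFunction.new
hello.()         # => "hello"
hello.("Dean")   # => "hello Dean"
```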
Option, None, Some…
Null pointer exceptions suck. You can still get them in Scala code, because Scala runs on the JVM and interoperates with Java libraries, but Scala offers a better way.
Typically, a reference might be null when there is nothing appropriate to assign to it. Following the conventions in some FP languages, Scala has an Option type with two subtypes, Some, which wraps a value, and None, which is used instead of null. The following example, which also demonstrates Scala’s Map support, shows these types in action.
val hotLangs = Map(
"Scala" -> "Rocks",
"Haskell" -> "Ethereal",
"Java" -> null)
println(hotLangs.get("Scala")) // => Some(Rocks)
println(hotLangs.get("Java")) // => Some(null)
println(hotLangs.get("C++")) // => None
Note that Map.get returns its results wrapped in Option objects, as shown by the println statements.
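The distinction between Some(null) and None, a key that is present with a null value versus a key that is absent, has a rough Ruby counterpart. Hash#[] returns nil in both situations, while Hash#fetch and Hash#key? tell them apart. The sketch below is only an analogy, not an Option type; the constant and method names are made up for illustration.

```ruby
HOT_LANGS = {
  "Scala"   => "Rocks",
  "Haskell" => "Ethereal",
  "Java"    => nil
}

# fetch runs the block only when the key is missing, so a stored nil
# is returned as-is instead of being confused with "not found".
def hotness(key)
  HOT_LANGS.fetch(key) { "No hotness found" }
end

hotness("Scala")  # => "Rocks"
hotness("Java")   # => nil
hotness("C++")    # => "No hotness found"
```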
By the way, those -> aren’t special operators; they’re methods. As with ::, valid method names aren’t limited to alphanumerics, _, and $.
Pattern Matching
The last FP feature I’ll discuss in this post is pattern matching, which is exploited more fully in FP languages than in imperative languages.
Using our previous definition of hotLangs, here’s how you might use matching.
def show(key: String) = {
val value: Option[String] = hotLangs.get(key)
value match {
case Some(x) => x
case None => "No hotness found"
}
}
println(show("Scala")) // => "Rocks"
println(show("Java")) // => "null"
println(show("C++")) // => "No hotness found"
The first case statement, case Some(x) => x, says “if the value I’m matching against is a Some that could be constructed with the Some[+String](x: A) constructor, then return the x, the thing the Some contains.” Okay, there’s a lot going on here, so more background information is in order.
In Scala, like Ruby and other languages, the last value computed in a function is returned by it. Also, almost everything returns a value, including match statements, so when the Some(x) => x case is chosen, x is returned by the match and hence by the function.
Some is a generic class and the show function returns a String, so the match is to Some[+String]. The + in the +String expression is analogous to Java’s extends, i.e., <? extends String>. Capiche?
Idioms like case Some(x) => x are called extractors in Scala and are used a lot in Scala, as well as in FP, in general. Here’s another example using Lists and our friend ::, the “cons” operator.
def countScalas(list: List[String]): Int = {
list match {
case "Scala" :: tail => countScalas(tail) + 1
case _ :: tail => countScalas(tail)
case Nil => 0
}
}
val langs = List("Scala", "Java", "C++", "Scala", "Python", "Ruby")
val count = countScalas(langs)
println(count) // => 2
We’re counting the number of occurrences of “Scala” in a list of strings, using matching and recursion and no explicit iteration. An expression of the form head :: tail applied to a list returns the first element set as the head variable and the rest of the list set as the tail variable. In our case, the first case statement looks for the particular case where the head equals “Scala”. The second case matches all lists, except for the empty list (Nil). Since matches are eager, the first case will always pick out the List("Scala", ...) case first. Note that in the second case, we don’t actually care about the value, so we use the placeholder _. Both the first and second cases call countScalas recursively.
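Newer versions of Ruby can express the same head-and-tail recursion with the built-in case/in syntax. The sketch below assumes Ruby 2.7 or later (where case/in first appeared, with an experimental warning in 2.7); the method name count_scalas is a direct translation of the Scala one.

```ruby
def count_scalas(list)
  case list
  in ["Scala", *tail] then count_scalas(tail) + 1  # head is "Scala"
  in [_, *tail]       then count_scalas(tail)      # any other head
  in []               then 0                       # empty list
  end
end

langs = ["Scala", "Java", "C++", "Scala", "Python", "Ruby"]
count_scalas(langs)  # => 2
```

As in the Scala version, the clauses are tried in order, so the more specific "Scala" pattern has to come before the catch-all.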
Pattern matching like this is powerful, yet succinct and elegant. We’ll see more examples of matching in the next blog post on concurrency using message passing.
Recap of Scala’s Functional Programming
I’ve just touched the tip of the iceberg concerning functional programming (and I hope I got all the details right!). Hopefully, you can begin to see why we’ve overlooked FP for too long!
In my third and final post, I’ll wrap up with a look at Scala’s approach to concurrency, the Actor model of message passing.
The Seductions of Scala, Part I
(Update 12/23/2008: Thanks to Apostolos Syropoulos for pointing out an earlier reference for the concept of “traits”).
Because of all the recent hoo-ha about functional programming (e.g., as a “cure” for the multicore problem), I decided to cast aside my dysfunctional ways and learn one of the FP languages. The question was, which one?
My distinguished colleague, Michael Feathers, has been on a Haskell binge of late. Haskell is a pure functional language and is probably most interesting as the “flagship language” for academic exploration, rather than production use. (That was not meant as flame bait…) It’s hard to overestimate the influence Haskell has had on language design, including Java generics, .NET LINQ and F#, etc.
However, I decided to learn Scala first, because it is a JVM language that combines object-oriented and functional programming in one language. At ~13 years of age, Java is a bit dated. Scala has the potential of replacing Java as the principal language of the JVM, an extraordinary piece of engineering that is arguably now more valuable than the language itself. (Note: there is also a .NET version of Scala under development.)
Here are some of my observations, divided over three blog posts.
First, a few disclaimers. I am a Scala novice, so any flaws in my analysis reflect on me, not Scala! Also, this is by no means an exhaustive analysis of the pros and cons of Scala vs. other options. Start with the Scala website for more complete information.
A Better OOP Language
Scala works seamlessly with Java. You can invoke Java APIs, extend Java classes and implement Java interfaces. You can even invoke Scala code from Java, once you understand how certain “Scala-isms” are translated to Java constructs (javap is your friend). Scala syntax is more succinct and removes a lot of tedious boilerplate from Java code.
For example, the following Person class in Java:
class Person {
    private String firstName;
    private String lastName;
    private int age;

    public Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getFirstName() { return this.firstName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getLastName() { return this.lastName; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return this.age; }
}
can be written in Scala thusly:
class Person(var firstName: String, var lastName: String, var age: Int)
Yes, that’s it. The constructor is the argument list to the class, where each parameter is declared as a variable (var keyword). It automatically generates the equivalent of getter and setter methods, meaning they look like Ruby-style attribute accessors; the getter is foo instead of getFoo and the setter is foo = instead of setFoo. Actually, the setter function is really foo_=, but Scala lets you use the foo = sugar.
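Since the accessors are compared to Ruby's, here is a sketch of the Ruby side of that comparison. As with Scala's foo_=, the Ruby setter is really a method named age=, and person.age = 40 is sugar for calling it.

```ruby
class Person
  # Generates first_name / first_name=, last_name / last_name=, age / age=
  attr_accessor :first_name, :last_name, :age

  def initialize(first_name, last_name, age)
    @first_name = first_name
    @last_name  = last_name
    @age        = age
  end
end

person = Person.new("Dean", "Wampler", 39)
person.age = 40                 # sugar for the age= method
person.public_send(:age=, 41)   # the same setter, called by name
```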
Lots of other well designed conventions allow the language to define almost everything as a method, yet support forms of syntactic sugar like the illusion of operator overloading, Ruby-like DSL’s, etc.
You also get fewer semicolons, no requirements tying package and class definitions to the file system structure, type inference, multi-valued returns (tuples), and a better type and generics model.
One of the biggest deficiencies of Java is the lack of a complete mixin model. Mixins are small, focused (think Single Responsibility Principle ...) bits of state and behavior that can be added to classes (or objects) to extend them as needed. In a language like C++, you can use multiple inheritance for mixins. Because Java only supports single inheritance and interfaces, which can’t have any state and behavior, implementing a mixin-based design has always required various hacks. Aspect-Oriented Programming is also one partial solution to this problem.
The most exciting OOP enhancement Scala brings is its support for Traits, a concept first described here and more recently discussed here. Traits support Mixins (and other design techniques) through composition rather than inheritance. You could think of traits as interfaces with implementations. They work a lot like Ruby modules.
Here is an example of the Observer Pattern written as traits, where they are used to monitor changes to a bank account balance. First, here are reusable Subject and Observer traits.
trait Observer[S] {
def receiveUpdate(subject: S);
}
trait Subject[S] {
this: S =>
private var observers: List[Observer[S]] = Nil
def addObserver(observer: Observer[S]) = observers = observer :: observers
def notifyObservers() = observers.foreach(_.receiveUpdate(this))
}
In Scala, generics are declared with square brackets, [...]
, rather than angled brackets, <...>
. Method definitions begin with the def
keyword. The Observer
trait defines one abstract method, which is called by the Subject
to notify the observer of changes. The Subject
is passed to the Observer
.
This trait looks exactly like a Java interface. In fact, that’s how traits are represented in Java byte code. If the trait has state and behavior, like Subject
, the byte code representation involves additional elements.
The Subject
trait is more complex. The strange line, this: S =>
, is called a self type declaration. It tells the compiler that whenever this
is referenced in the trait, treat its type as S
, rather than Subject[S]
. Without this declaration, the call to receiveUpdate
in the notifyObservers
method would not compile, because it would attempt to pass a Subject[S]
object, rather than a S
object. The self type declaration solves this problem.
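To see the self type in isolation, here is a minimal sketch. The names Describable, Counter, and describeWith are hypothetical and are not part of the Observer example; the only point is that `this` can be passed where an S is expected.

```scala
// Hypothetical names, for illustration only.
trait Describable[S] {
  this: S =>   // self type: inside this trait, `this` is treated as an S

  // Passing `this` to a function that expects an S compiles only because
  // of the self type above. Without it, `this` would be a Describable[S].
  def describeWith(f: S => String): String = f(this)
}

class Counter(val count: Int) extends Describable[Counter]

object SelfTypeDemo {
  def main(args: Array[String]): Unit = {
    val counter = new Counter(3)
    println(counter.describeWith(c => "count = " + c.count))  // prints "count = 3"
  }
}
```

The self type also restricts who may mix the trait in: a class can only extend Describable[Counter] if it is itself a Counter.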
The next line creates a private list of observers, initialized to Nil
, which is an empty list. Variable declarations are name: type
. Why didn’t they follow Java conventions, i.e., type name
? Because this syntax makes the code easier to parse when type inference is used, meaning where the explicit :type
is omitted and inferred.
In fact, I’m relying on type inference for the return types of all the methods in my examples, because the compiler can figure out what each one returns. In this case, they all return type Unit
, the equivalent of Java’s void
. (The name Unit
is a common term in functional languages.)
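Here is a small sketch of return type inference. The Tally class is a hypothetical stand-in, not part of the Observer example. An assignment has type Unit, so a method whose body is an assignment is inferred to return Unit.

```scala
// Hypothetical class, for illustration only.
class Tally {
  private var total = 0

  // No declared return type: the compiler infers Unit from the assignment.
  def add(n: Int) = total += n

  // The same method with the return type written out.
  def addExplicit(n: Int): Unit = total += n

  // Here the compiler infers Int from the body.
  def current = total
}

object InferenceDemo {
  def main(args: Array[String]): Unit = {
    val tally = new Tally
    tally.add(2)
    tally.addExplicit(3)
    val nothingUseful: Unit = tally.add(0)  // compiles because add returns Unit
    println(tally.current)                  // prints 5
  }
}
```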
The third line defines a method for adding a new observer to the list. Notice that concrete method definitions are of the form
def methodName(parameter: type, ...) = {
method body
}
In this case, because there is only one line, I dispensed with the {...}
. The equals sign before the body emphasizes the functional nature of Scala: a method body is an expression that yields a value, and methods can be converted into function objects. We’ll revisit this in a moment and in the next post.
The method body prepends the new observer object to the existing list. Actually, a new list is created. The ::
operator, called “cons”, binds to the right. This “operator” is really a method call, which could actually be written like this, observers.::(observer)
.
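A quick sketch of both spellings of cons, using a throwaway list of integers rather than observers:

```scala
object ConsDemo {
  val rest = List(2, 3)

  val viaOperator = 1 :: rest          // operator form; the method belongs to the right operand
  val viaMethod   = rest.::(1)         // the same call in ordinary method syntax
  val chained     = 1 :: 2 :: 3 :: Nil // parsed as 1 :: (2 :: (3 :: Nil))

  def main(args: Array[String]): Unit = {
    println(viaOperator)               // prints List(1, 2, 3)
    println(viaOperator == viaMethod)  // prints true
    println(rest)                      // prints List(2, 3); the original list is unchanged
  }
}
```

Because prepending builds a new list and leaves the old one alone, the observers field has to be a var that is reassigned on each call to addObserver.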
Our final method in Subject
is notifyObservers
. It iterates through observers and invokes the block observer.receiveUpdate(this)
on each observer. The _
evaluates to the current observer reference. For comparison, in Ruby, you would define this method like so:
def notifyObservers()
@observers.each { |o| o.receiveUpdate(self) }
end
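Staying in Scala, the underscore is just shorthand for a function literal with a named parameter, much like the |o| in the Ruby block. A minimal sketch with a hypothetical list of strings:

```scala
object PlaceholderDemo {
  val words = List("trait", "mixin")

  val withPlaceholder = words.map(_.toUpperCase)       // `_` stands for each element in turn
  val withNamedParam  = words.map(w => w.toUpperCase)  // the same function, parameter named

  def main(args: Array[String]): Unit = {
    println(withPlaceholder)                    // prints List(TRAIT, MIXIN)
    println(withPlaceholder == withNamedParam)  // prints true
  }
}
```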
Okay, let’s look at how you would actually use these traits. First, our “plain-old Scala object” (POSO) Account
.
class Account(initialBalance: Double) {
private var currentBalance = initialBalance
def balance = currentBalance
def deposit(amount: Double) = currentBalance += amount
def withdraw(amount: Double) = currentBalance -= amount
}
Hopefully, this is self-explanatory, except for two things. First, recall that the whole class body is actually the primary constructor, which is why we have an initialBalance: Double
parameter on Account
. This looks strange to the Java-trained eye, but it actually works well and is another example of Scala’s economy. (You can define multiple constructors, but I won’t go into that here…).
Second, note that I omitted the parentheses when I defined the balance
“getter” method. This supports the uniform access principle. Clients will simply call myAccount.balance
, without parentheses and I could redefine balance
to be a var
or val
and the client code would not have to change!
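As a sketch of what that buys you, here are two hypothetical variants of a read-only account, one exposing balance as a method and one as a val. The class names are mine, for illustration only. The client expression is character-for-character the same for both.

```scala
// Hypothetical variants, for illustration only.
class AccountWithDef(initialBalance: Double) {
  private var currentBalance = initialBalance
  def balance = currentBalance          // balance as a parameterless method
}

class AccountWithVal(val balance: Double) // balance as an immutable field

object UniformAccessDemo {
  def main(args: Array[String]): Unit = {
    // Clients write `.balance` either way, with no parentheses.
    println(new AccountWithDef(10.0).balance)  // prints 10.0
    println(new AccountWithVal(10.0).balance)  // prints 10.0
  }
}
```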
Next, a subclass that supports observation.
class ObservedAccount(initialBalance: Double) extends Account(initialBalance) with Subject[Account] {
override def deposit(amount: Double) = {
super.deposit(amount)
notifyObservers()
}
override def withdraw(amount: Double) = {
super.withdraw(amount)
notifyObservers()
}
}
The with
keyword is how a trait is used, much the way that you implement
an interface in Java, but now you don’t have to implement the interface’s methods. We’ve already done that.
Note that the expression, ObservedAccount(initialBalance: Double) extends Account(initialBalance)
, not only defines the (single) inheritance relationship, it also functions as the constructor’s call to super(initialBalance)
, so that Account
is properly initialized.
Next, we have to override the deposit
and withdraw
methods, calling the parent methods and then invoking notifyObservers
. Anytime you override a concrete method, Scala requires the override
keyword. This tells the reader unambiguously that a method is being overridden, and the Scala compiler reports an error if you aren’t actually overriding anything, e.g., because of a typo. Because the keyword is mandatory, it is much more reliable (and hence more useful…) than Java’s optional @Override
annotation.
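A minimal sketch of the rule, using hypothetical Base and Derived classes. The two commented-out lines show the mistakes the compiler rejects; I have left them as comments so the example still compiles.

```scala
// Hypothetical classes, for illustration only.
class Base {
  def describe: String = "base"
}

class Derived extends Base {
  // Overriding a concrete method: the override keyword is mandatory.
  override def describe: String = "derived, then " + super.describe

  // override def descrbe: String = "typo"  // rejected: there is no descrbe to override
  // def describe: String = "oops"          // rejected: the override keyword is missing
}

object OverrideDemo {
  def main(args: Array[String]): Unit = {
    println(new Derived().describe)  // prints "derived, then base"
  }
}
```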
Finally, here is an Observer
that prints to stdout when the balance changes.
class AccountReporter extends Observer[Account] {
def receiveUpdate(account: Account) =
println("Observed balance change: "+account.balance)
}
Rather than use with
, I just extend the Observer
trait, because I don’t have another parent class.
Here’s some code to test what we’ve done.
def changingBalance(account: Account) = {
println("==== Starting balance: " + account.balance)
println("Depositing $10.0")
account.deposit(10.0)
println("new balance: " + account.balance)
println("Withdrawing $5.60")
account.withdraw(5.6)
println("new balance: " + account.balance)
}
var a = new Account(0.0)
changingBalance(a)
var oa = new ObservedAccount(0.0)
changingBalance(oa)
oa.addObserver(new AccountReporter)
changingBalance(oa)
Which prints out:
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 4.4
Depositing $10.0
Observed balance change: 14.4
new balance: 14.4
Withdrawing $5.60
Observed balance change: 8.8
new balance: 8.8
Note that we only observe the balance changes in the last run, because the observer was added only after the second call to changingBalance.
Download Scala and try it out. Put all this code in one observer.scala
file, for example, and run the command:
scala observer.scala
But Wait, There’s More!
In the next post, I’ll look at Scala’s support for Functional Programming and why OO programmers should find it interesting. In the third post, I’ll look at the specific case of concurrent programming in Scala and make some concluding observations of the pros and cons of Scala.
For now, here are some references for more information.
- The Scala website, for downloads, documentation, mailing lists, etc.
- Ted Neward’s excellent multipart introduction to Scala at developerWorks.
- The forthcoming Programming in Scala book.
Observations on Test-Driving User Interfaces
Test driving user interface development has always been a challenge. Recently, I’ve worked with two projects where most of the work has been on the user-interface components.
The first project is using Adobe Flex to create a rich interface. The team decided to adopt FunFX for acceptance testing. You write your tests in Ruby, typically using Test::Unit or RSpec.
FunFX places some constraints on your Flex application. You have to define the GUI objects in MXML, the XML-based file format for Flex applications, rather than ActionScript, and you need to add ids to all elements you want to reference.[1]
These are reasonable constraints and the first one, in fact, promotes better quality. The MXML format is more succinct (despite the XML “noise”) and more declarative than ActionScript code. This is almost always true of UI code in most languages (with notable exceptions…). Declarative code tends to be of higher quality than imperative code because less code means fewer bugs and less to maintain, and because it frees the implementor of the declarative “language” to pick the best implementation strategies, optimizations, etc. This characteristic is typical of Functional Languages and well-designed Domain Specific Languages, as well.
I don’t think you can overestimate the benefit of writing less code. I see too many teams whose problems would diminish considerably if they just got rid of duplication and learned to be concise.
The second project is a wiki-based application written in Java. To make deployment as simple as possible, the implementors avoided the Servlet API (no need to install Tomcat, etc.) and rolled their own web server and page rendering components. (I’m not sure I would have made these decisions myself, but I don’t think they are bad, either…)
The rendering components are object-oriented and use a number of design patterns, such as page factories with builder objects that reflect the “widgets” in the UI, HTML tags, etc. This approach makes the UI very testable with JUnit and FitNesse. In fact, the development process was a model of test-driven development.
However, the final result is flawed! It is much too difficult to change the look and feel of the application, which is essential for most UI’s, especially web UI’s. The project made the wrong tradeoffs; the design choices met the requirements of TDD very well, but they made maintenance and enhancement expensive and tedious. The application is now several years old and it has become dated, because of the expense of “refreshing” the look and feel.
What should have been done? These days, most dynamic web UI’s are built with templating engines, of which there are many in the most common programming languages. Pages defined in a templating engine are very declarative, except for the special tags where behavior is inserted. The pages are easy to change. It is mostly obvious where a particular visual element is generated, since most of the “tags” in the template look exactly like the tags in the rendered page. “Declarative” templates, like good DSL’s, can be read, understood, and even edited by the stakeholders, in this case the graphical designers.
But how do you test these page templates? When test-driving UI’s it is important to decide what to test and what not to test. The general rule for TDD is to test anything that can break. The corollary, especially relevant for UI’s, is don’t test anything when you don’t care if it changes.
It is usually the dynamic behavior of the UI that can break and should be tested. Templating engines provide special tags for inserting dynamic behavior in the underlying language (Java, Ruby, etc.). This is what you should test. It is usually best to keep the scripts in these tags as small as possible; the scripts just delegate to code, which can be test-driven in the usual way.
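As a rough sketch of that delegation, suppose a template tag only needs to show how many revisions a wiki page has. The PageView object and its revisionLabel method are hypothetical names of mine, not code from either project, and I use Scala here only for brevity. The tag would contain nothing but a call to the helper, and the helper is tested directly, with no HTML in sight.

```scala
// Hypothetical helper that a template tag would delegate to.
object PageView {
  // The only logic that can break: singular versus plural wording.
  def revisionLabel(count: Int): String =
    if (count == 1) "1 revision" else s"$count revisions"
}

object PageViewDemo {
  def main(args: Array[String]): Unit = {
    // These checks survive any change to the markup around the label.
    assert(PageView.revisionLabel(1) == "1 revision")
    assert(PageView.revisionLabel(3) == "3 revisions")
    println(PageView.revisionLabel(3))  // prints "3 revisions"
  }
}
```

The checks say nothing about tags, CSS classes, or layout, so a designer can restyle the page without breaking them.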
I see too many UI tests that compare long strings of HTML. These tests break whenever someone makes a minor look and feel or other inconsequential change. Part of the art of UI TDD is knowing how to test just what can break and nothing more. In the second project, incidental changes to the UI break tests that should be agnostic to such changes.
To conclude, keep your UI’s as declarative as you can. Only test the “declarations” (e.g., templates) in areas where they might break, meaning if it changes, it’s a bug. You’ll get the full benefits of TDD and the freedom to change the UI easily and frequently, as needed.
1 Disclaimer: my information on FunFX is second hand, so I might not have the details exactly correct; see the FunFX documentation for details.