Bowling Game Kata in Ruby

Posted by Brett Schuchert Thu, 01 Oct 2009 18:37:00 GMT

I have not really used Ruby much. I’ve written a few tutorials, messed around with RSpec and Test::Unit and even Rails a bit, but I really don’t know Ruby that well. I get Ruby (the MOP, instances, blocks, open classes, ...) but there’s a difference between understanding that stuff and using it day-to-day.

Last night we had a Dojo in Oklahoma City and I wanted to get refreshed with RSpec, so I decided to jump in and do the bowling game kata. I did not follow Uncle Bob's lead exactly. For one, I went ahead and stored two throws for each frame. While what he ends up with is a bit shorter, it bothers me a little bit. I've also seen people surprised by how Bob stores his scores, so in a sense it violates the law of least astonishment.

That’s neither here nor there; I got it working just fine – though handling strikes was a bit more difficult because I decided to store two rolls instead of one (so clearly, there’s no best answer, just ones that suck for different reasons).

After my first go-around, I had a single spec with examples all under a single describe (what’s that term they’d use for what the describe expression creates?). I added several examples for partial scores, to make sure I was handling incomplete games correctly. I restructured those a bit and tried to make the names a bit clearer; I’m not sure if I was successful.

In my original version I started with a frame in the score method as a local variable, but it quickly got converted to an index, and the index was mostly passed around after that. The approach was very C-esque. I didn’t like that index all over the place, so I tried to remove it by refactoring. It took several false starts before I bit the bullet and simply duplicated each of the methods, one at a time, using parallel development. The old versions used an index; the new ones used a 1-based frame number. After I got that working with frames, I removed most of the methods using an index, except for a few.
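To make that concrete, here is a tiny sketch (mine, not the kata code that follows) of the two ideas: a strike still occupies two stored throws, and a 1-based frame number maps straight onto that storage.

rolls = [10, 0, 5, 2]          # frame 1: a strike padded with a 0; frame 2: an open 5, 2

def first_throw_index(frame)   # frame is 1-based, two throws per frame
  (frame - 1) * 2
end

rolls[first_throw_index(1)]    # => 10
rolls[first_throw_index(2)]    # => 5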

What follows is the spec file and the Ruby class. If you read the names of some of the examples, you might think I used to bowl in a league; I did. My average was a paltry 158, my best game ever a 248. Best split I ever picked up? 4, 6, 7, 10.

Comments welcome.

bowling_score_card_spec.rb

require 'bowling_score_card'

describe BowlingScoreCard do
    before(:each) do
        @bowling_game_scorer = BowlingScoreCard.new
    end

    def roll value
        @bowling_game_scorer.roll value
    end

    def roll_many count, value
        count.times { roll value }
    end

    def score_should_be value
        @bowling_game_scorer.score.should == value
    end

    it 'should score 0' do
        score_should_be 0
    end

    describe 'Scores for Complete Games' do
        it 'should score 0 for an all gutter game' do
            roll_many 20, 0
            score_should_be 0
        end

        it 'should show 20 for an all 1 game' do
            roll_many 20, 1
            score_should_be 20
        end

        it 'should score game with single spare correctly' do
            roll_many 3, 5
            roll_many 17, 0
            score_should_be 20
        end

        it 'should score game with single strike correctly' do
            roll 10
            roll 5
            roll 2
            roll_many 17, 0
            score_should_be 24
        end

        it 'should score a dutch-200, spare-strike, correctly' do
            10.times do
                roll_many 2, 5
                roll 10
            end
            roll_many 2, 5

            score_should_be 200
        end

        it 'should score a dutch-200, strike-spare, correctly' do
            10.times do
                roll 10
                roll_many 2, 5
            end
            roll 10
            score_should_be 200
        end

        it "should score all 5's game as 150" do
            roll_many 21, 5
            score_should_be 150
        end

        it 'should score a perfect game correctly' do
            roll_many 12, 10
            score_should_be 300
        end

        it 'should not count a 0, 10 roll as a strike' do
            roll 0
            roll 10
            roll_many 18, 1
            score_should_be 29
        end

    end

    describe 'Scoring for open games' do
        it 'should score just an open frame' do
            roll 4
            roll 3
            score_should_be 7
        end

        it 'should score just a spare' do
            roll_many 2, 5
            score_should_be 10
        end

        it 'should score partial game with spare and following frame only' do
            roll_many 3, 5
            score_should_be 20
        end

        it 'should score an opening turkey correctly' do
            roll_many 3, 10
            score_should_be 60
        end
    end

    describe 'Scoring open game starting with a strike' do
        before(:each) do
            roll 10
        end
        it 'should score partial game with only strike' do
            score_should_be 10
        end

        it 'should score partial game with strike and half-open frame' do
            roll 4
            score_should_be 18
        end

        it 'should score partial game with strike and open frame' do
            roll 3
            roll 6
            score_should_be 28
        end

        it 'should score partial game with strike and spare' do
            roll 3
            roll 7
            score_should_be 30
        end
    end

    describe 'Open game starting with two Strikes' do
        before(:each) do
            roll 10
            roll 10
        end

        it 'should have a score of 30' do
            score_should_be 30
        end

        it 'should score correctly with following non-mark' do
            roll 4
            score_should_be 42
        end

        it 'should score correctly with third frame open' do
            roll 4
            roll 3
            score_should_be 48
        end
    end
end

bowling_score_card.rb

class BowlingScoreCard
    def initialize
        @rolls = []
    end

    def roll value
        @rolls << value
        @rolls << 0 if first_throw_is_strike?
    end

    def score
        (1..10).inject(0) { |score, frame| score + score_for(frame) }
    end

    def score_for frame
        return strike_score_for frame if is_strike_at? frame
        return spare_score_for frame if is_spare_at? frame
        open_score_for frame
    end

    def first_throw_is_strike?
        is_first_throw_in_frame? && @rolls.last == 10
    end

    def is_first_throw_in_frame?
        @rolls.length.odd?
    end

    def open_score_for frame
        first_throw_for(frame) + second_throw_for(frame)
    end

    def spare_score_for frame
        open_score_for(frame) + first_throw_for(next_frame(frame))
    end

    def strike_score_for frame
        score = open_score_for(frame) + open_score_for(next_frame(frame))
        if is_strike_at? next_frame(frame)
            score += first_throw_for(next_frame(next_frame(frame)))
        end
        score
    end

    def next_frame frame
        frame + 1
    end

    def is_spare_at? frame
        (open_score_for frame) == 10 && is_strike_at?(frame) == false
    end

    def is_strike_at? frame
        first_throw_for(frame) == 10
    end

    def first_throw_for frame
        score_at_throw(index_for(frame))
    end

    def second_throw_for frame
        score_at_throw(index_for(frame) + 1)
    end

    def index_for frame
        (frame - 1) * 2
    end

    def score_at_throw index
        @rolls.length > index ? @rolls[index] : 0
    end
end
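If you want to poke at the class from irb, here is a quick usage sketch (assuming bowling_score_card.rb is on the load path):

require 'bowling_score_card'

card = BowlingScoreCard.new
12.times { card.roll 10 }   # a perfect game: ten strikes plus two bonus rolls
puts card.score             # => 300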

Notes from the OkC Dojo 2009-09-30

Posted by Brett Schuchert Thu, 01 Oct 2009 03:57:00 GMT

Tonight we had a small group of die-hard practitioners working with Ruby and RSpec. We intended to use the Randori style, but it was a small enough group that we were a bit more informal than that.

We tried the Shunting Yard Algorithm again and it worked out fairly well. The level of experience in Ruby was low to moderate (which is why we wanted to get people a chance to practice it) and the RSpec experience was generally low (again, great reason to give it a try).

We had several interesting (at least to me) side discussions on things such as:
  • Forth
  • Operator precedence
  • Operator associativity
  • L-Values and R-Values
  • Directed Acyclic Graphs
  • In-fix, pre-fix, post-fix binary tree traversal
  • Abstract Syntax Trees (AST)
  • The list goes on, I’m a big-time extrovert, so I told Chad to occasionally tell me to shut the heck up
The Shunting Yard Algorithm is a means of translating an in-fix expression into a post-fix expression (a.k.a. Reverse Polish Notation – used by the best calculators in the world, HP [I also prefer vi, FYI!]). For example:
  • 1 + 3 becomes 1 3 +
  • a = b = 17 becomes a b 17 = =
  • 2 + 3 * 5 becomes 2 3 5 * +
  • 2 * 3 + 5 becomes 2 3 * 5 +

One typical approach to this problem is to build an AST from the in-fix representation and then traverse the AST recursively using a post-fix traversal.

What I like about the Shunting Yard Algorithm is that it takes a traditionally recursive algorithm (DAG traversal, where a binary tree is a degenerate DAG) and writes it iteratively using its own stack (a local or instance variable) for storage, versus using the program stack to store activation records (OK, stack frames). Essentially, the local stack is used for pending work.

This is one of those things I think is a useful skill to learn: writing traditionally recursive algorithms using a stack-based approach. This allows you to step through something (think iteration) versus having to do things whole-hog (recursively, with a block (lambda) passed in). In fact, I bought a used algorithms book 20 years ago because it had a section on this subject. And looking over my left shoulder, I just saw that book. Nice.

To illustrate, here’s the AST for the first example:

Since the group had not done a lot with recursive algorithms (at least not recently), we discussed a shorthand way to remember the various traversal algorithms using three letters: L, R, P

  • L -> Go Left
  • R -> Go Right
  • P -> Print (or process)
Each of the three traditional traversal algorithms (for a binary tree) uses just these three letters. And the way to remember each is to put the ‘P’ where the name suggests. For example:
  • in-fix, in -> in between -> L P R
  • pre-fix, pre, before -> P L R
  • post-fix, post, after -> L R P
Then, given the graph above, you can traverse it as follows:
  • in-fix: Go left, you hit the 1, it’s a leaf node so print it, go up to the +, print it, go to the right, print the 3; you end up with 1 + 3
  • post-fix: Go left, you hit the 1, it’s a leaf node, print it, go back to the +, since this is post-fix, don’t print yet, go to the right, you get the 3, it’s a leaf node, print it, then finally print the +, giving: 1 3 +
  • pre-fix: start at + and print it, then go left, it’s a leaf node, print it, go right, it’s a leaf node, print it, so you get: + 1 3 – which looks like a function call (think operator+(1, 3))

It’s not quite this simple – we actually looked at larger examples – but this gets the essence across. And to move from a tree to a DAG, simply iterate over all children, printing before or after the complete iteration; in-fix doesn’t make as much sense in a general DAG. We also discussed tracking the visited nodes if you’ve got a cyclic graph versus an acyclic one.
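If it helps to see the mnemonic in code, here is a minimal sketch (mine, with a hypothetical Node struct – not something we wrote at the dojo):

Node = Struct.new(:value, :left, :right)

def post_fix(node, &process)          # L R P: left, right, then process
  return if node.nil?
  post_fix(node.left, &process)
  post_fix(node.right, &process)
  process.call(node.value)
end

tree = Node.new('+', Node.new(1), Node.new(3))
post_fix(tree) { |v| print v, ' ' }   # prints: 1 3 +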

After we got a multi-operator expression with same-precedence operators working, e.g., 1 + 3 – 2, which results in: 1 3 + 2 -, we moved on to handling different operator precedence.
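For reference, here is a rough sketch of the core idea – my own simplification, handling only left-associative binary operators and no parentheses, not the code we wrote at the dojo:

PRECEDENCE = { '+' => 1, '-' => 1, '*' => 2, '/' => 2 }

def to_postfix(tokens)
  output, operators = [], []
  tokens.each do |token|
    if PRECEDENCE.key?(token)
      # Pop pending operators of equal or higher precedence before pushing.
      while !operators.empty? && PRECEDENCE[operators.last] >= PRECEDENCE[token]
        output << operators.pop
      end
      operators << token
    else
      output << token                 # operands go straight to the output
    end
  end
  (output + operators.reverse).join(' ')
end

to_postfix(%w[2 + 3 * 5])   # => "2 3 5 * +"
to_postfix(%w[1 + 3 - 2])   # => "1 3 + 2 -"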

Around this time, there was some skepticism that post-fix could represent the same expression as in-fix. This is normal, if you have not seen these kinds of representations. And let’s be frank, how often do most of us deal with these kinds of things? Not often.

Also, there was another question: WHY?

In a nutshell, with a post-fix notation you do not need parentheses. As soon as an operator is encountered, you can immediately process it rather than waiting for later tokens (no look-ahead required). This is also what let HP develop a calculator in 1967 (or ‘68) that was < 50 pounds and around USD $5,000 and could add, subtract, multiply and divide, which was huge at the time (with a stack size of 3 – later models went to a stack size of 4, giving us the x, y, z and t registers).
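To make that concrete, here is a tiny, hypothetical post-fix evaluator (again my own sketch, not dojo code): each operator is applied the moment it is seen, using a stack of values.

def eval_postfix(tokens)
  stack = []
  tokens.each do |token|
    if %w[+ - * /].include?(token)
      right, left = stack.pop, stack.pop
      stack.push(left.send(token, right))   # apply the operator immediately
    else
      stack.push(token.to_i)
    end
  end
  stack.pop
end

eval_postfix(%w[2 3 5 * +])   # => 17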

During this rat-hole, we discussed associativity. For example, a = b = c is really (a = (b = c))

That’s because the assignment operator is right-associative. This led into r-values and l-values.

Anyway, we’re going to meet again next week. Because we (read this as me) were not disciplined in following the Randori style, these side discussions led to taking a long time to fix a problem. We should have “hit the reset button” sooner, so next time around we’re going to add a bit more structure to see what happens:
  • The driver finishes by writing a new failing test.
  • The driver commits the code with the newest failing test (we’ll be using git).
  • Change drivers and give him/her some time-box (5 – 10 minutes).
  • If, at the end of the current time-box, the current driver has all tests passing, go back to the first bullet in this list.
  • If, at the end, the same test that was failing is still failing, give them a bit more time (first time only).
  • However, if any other tests are failing, then we revert back to the last check-in and switch drivers.

Here’s an approximation of these rules using yuml.me:

And here’s the syntax to create that diagram:
(start)->(Create New Failing Test)->(Commit Work)->(Change Drivers)
(Change Drivers)->(Driver Working)
(Driver Working)-><d1>[tick]->(Driver Working)
<d1>[alarm]->(Check Results)->([All Tests Passing])->(Create New Failing Test)
(Check Results)->([Driver Broke Stuff])->(git reset --hard)->(Change Drivers)
(Check Results)->([First Time Only and Still Only Newest Test Failing])->(Give Driver A Touch More Time)->(Check Results)

Note that this is not a strict activity diagram – the feature is still in beta, and creating the diagram as I did made the results a bit more readable. Even so, I like this tool, so I wanted to throw another example in there (and try out a diagram type I have not used before – at least not with this tool; I’ve created too many activity diagrams). If you’d like to see an accurate activity diagram, post a comment and I’ll draw one in Visio and post it.

Anyway, we’re going to try to move to a weekly informal practice session with either bi-weekly or monthly “formal” meetings. We’ll keep switching out the language and the tools. I’m even tempted to do design sessions – NO CODING?! What?! Why not. Some people still work that way, so it’s good to be able to work in different modes.

If you’re in Oklahoma City, hope to see you. If not, and I’m in your town, I’d be interested in dropping into your dojos!

Tighter Ruby Methods with Functional-style Pattern Matching, Using the Case Gem

Posted by Dean Wampler Tue, 17 Mar 2009 00:59:00 GMT

Ruby doesn’t have overloaded methods, which are methods with the same name, but different signatures when you consider the argument lists and return values. This would be somewhat challenging to support in a dynamic language with very flexible options for method argument handling.

You can “simulate” overloading by parsing the argument list and taking different paths of execution based on the structure you find. This post discusses how pattern matching, a hallmark of functional programming, gives you powerful options.

First, let’s look at a typical example that handles the arguments in an ad hoc fashion. Consider the following Person class. You can pass three arguments to the initializer, the first_name, the last_name, and the age. Or, you can pass a hash using the keys :first_name, :last_name, and :age.


require "rubygems" 
require "spec" 

class Person
  attr_reader :first_name, :last_name, :age
  def initialize *args
    arg = args[0]
    if arg.kind_of? Hash       # 1
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    else
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    end
  end
end

describe "Person#initialize" do 
  it "should accept a hash with key-value pairs for the attributes" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
  it "should accept a first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
end

The condition on the # 1 comment line checks to see if the first argument is a Hash. If so, the attribute’s values are extracted from it. Otherwise, it is assumed that three arguments were specified in a particular order. They are passed to #initialize in a three-element array. The two rspec examples exercise these behaviors. For simplicity, we ignore some more general cases, as well as error handling.

Another approach that is more flexible is to use duck typing, instead. For example, we could replace the line with the # 1 comment with this line:


if arg.respond_to? :has_key?

There aren’t many objects that respond to #has_key?, so we’re highly confident that we can use [symbol] to extract the values from the hash.
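A quick irb check (mine, not from the original code) shows why this test cleanly separates the two calling styles:

{ :age => 39 }.respond_to?(:has_key?)           # => true
["Dean", "Wampler", 39].respond_to?(:has_key?)  # => false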

This implementation is fairly straightforward. You’ve probably written code like this yourself. However, it could get complicated for more involved cases.

Pattern Matching, a Functional Programming Approach

Most programming languages today have switch or case statements of some sort and most have support for regular expression matching. However, in functional programming languages, pattern matching is so important and pervasive that these languages offer very powerful and convenient support for pattern matching.

Fortunately, we can get powerful pattern matching, typical of functional languages, in Ruby using the Case gem that is part of MenTaLguY’s Omnibus Concurrency library. Omnibus provides support for the hot Actor model of concurrency, which Erlang has made famous. However, it would be a shame to restrict the use of the Case gem to parsing Actor messages. It’s much more general purpose than that.

Let’s rework our example using the Case gem.


require "rubygems" 
require "spec" 
require "case" 

class Person
  attr_reader :first_name, :last_name, :age
  def initialize *args
    case args
    when Case[Hash]       # 1
      arg = args[0]
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    else
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    end
  end
end

describe "Person#initialize" do 
  it "should accept a first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
  it "should accept a has with :first_name => fn, :last_name => ln, and :age => age" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
end

We require the case gem, which puts the #=== method on steroids. In the when statement in #initialize, the expression when Case[Hash] matches on a one-element array where the element is a Hash. We extract the key-value pairs as before. The else clause assumes we have an array for the arguments.

So far this isn’t very impressive – all we did was reproduce the original behavior. Let’s extend the example to really exploit some of the neat features of the Case gem’s pattern matching. First, let’s narrow the allowed array values.


require "rubygems" 
require "spec" 
require "case" 

class Person
  attr_reader :first_name, :last_name, :age
  def initialize *args
    case args
    when Case[Hash]       # 1
      arg = args[0]
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    when Case[String, String, Integer]
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    else
      raise "Invalid arguments: #{args}" 
    end
  end
end

describe "Person#initialize" do 
  it "should accept a first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
  it "should accept a has with :first_name => fn, :last_name => ln, and :age => age" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
  it "should not accept an array unless it is a [String, String, Integer]" do
    lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
  end
end

The new expression when Case[String, String, Integer] only matches a three-element array where the first two arguments are strings and the third argument is an integer, which are the types we want. If you use an array with a different number of arguments, or the arguments have different types, this when clause won’t match. Instead, you’ll get the default else clause, which raises an exception. We added another rspec example to test this condition, where the user’s age was specified as a string instead of as an integer. Of course, you could decide to attempt a conversion of this argument, to make your code more “forgiving” of user mistakes.
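To see these matchers on their own, outside of Person, here is a minimal, hypothetical sketch using only the Case[...] forms shown above (shape_of is just my own helper name):

require "rubygems" 
require "case" 

def shape_of(args)
  case args
  when Case[Hash]                    then :one_hash
  when Case[String, String, Integer] then :name_name_age
  else :something_else
  end
end

shape_of([{ :age => 39 }])           # => :one_hash
shape_of(["Dean", "Wampler", 39])    # => :name_name_age
shape_of(["Dean", "Wampler", "39"])  # => :something_else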

Similarly, what happens if the method supports default values for some of the parameters? As written, we can’t support that option, but let’s look at a slight variation of Person#initialize, where a hash of values is not supported, to see what would happen.


require "rubygems" 
require "spec" 
require "case" 

class Person
  attr_reader :first_name, :last_name, :age
  def initialize first_name = "Bob", last_name = "Martin", age = 29
    case [first_name, last_name, age]
    when Case[String, String, Integer]
      @first_name = first_name
      @last_name  = last_name
      @age        = age
    else
      raise "Invalid arguments: #{first_name}, #{last_name}, #{age}" 
    end
  end
end

def check person, expected_fn, expected_ln, expected_age
  person.first_name.should == expected_fn
  person.last_name.should  == expected_ln
  person.age.should        == expected_age
end

describe "Person#initialize" do 
  it "should require a first name (string), last name (string), and age (integer) arguments" do
    person = Person.new "Dean", "Wampler", 39
    check person, "Dean", "Wampler", 39
  end
  it "should accept the defaults for all parameters" do
    person = Person.new
    check person, "Bob", "Martin", 29
  end
  it "should accept the defaults for the last name and age parameters" do
    person = Person.new "Dean" 
    check person, "Dean", "Martin", 29
  end
  it "should accept the defaults for the age parameter" do
    person = Person.new "Dean", "Wampler" 
    check person, "Dean", "Wampler", 29
  end
  it "should not accept the first name as a symbol" do
    lambda { person = Person.new :Dean, "Wampler", "39" }.should raise_error(Exception)
  end
  it "should not accept the last name as a symbol" do
  end
  it "should not accept the age as a string" do
    lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
  end
end

We match on all three arguments as an array, asserting they are of the correct type. As you might expect, #initialize always gets three parameters passed to it, including when default values are used.

Let’s return to our original example, where the object can be constructed with a hash or a list of arguments. There are two more things (at least …) that we can do. First, we’re not yet validating the types of the values in the hash. Second, we can use the Case gem to impose constraints on the values, such as requiring non-empty name strings and a positive age.


require "rubygems" 
require "spec" 
require "case" 

class Person
  attr_reader :first_name, :last_name, :age
  def initialize *args
    case args
    when Case[Hash]
      arg = args[0]
      @first_name = arg[:first_name]
      @last_name  = arg[:last_name]
      @age        = arg[:age]
    when Case[String, String, Integer]
      @first_name = args[0]
      @last_name  = args[1]
      @age        = args[2]
    else
      raise "Invalid arguments: #{args}" 
    end
    validate_name @first_name, "first_name" 
    validate_name @last_name, "last_name" 
    validate_age
  end

  protected

  def validate_name name, field_name
    case name
    when Case::All[String, Case.guard {|s| s.length > 0 }]
    else
      raise "Invalid #{field_name}: #{first_name}" 
    end
  end

  def validate_age
    case @age
    when Case::All[Integer, Case.guard {|n| n > 0 }]
    else
      raise "Invalid age: #{@age}" 
    end
  end
end

describe "Person#initialize" do 
  it "should accept a first name, last name, and age arguments" do
    person = Person.new "Dean", "Wampler", 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
  it "should accept a has with :first_name => fn, :last_name => ln, and :age => age" do
    person = Person.new :first_name => "Dean", :last_name => "Wampler", :age => 39
    person.first_name.should == "Dean" 
    person.last_name.should  == "Wampler" 
    person.age.should        == 39
  end
  it "should not accept an array unless it is a [String, String, Integer]" do
    lambda { person = Person.new "Dean", "Wampler", "39" }.should raise_error(Exception)
  end
  it "should not accept a first name that is a zero-length string" do
    lambda { person = Person.new "", "Wampler", 39 }.should raise_error(Exception)
  end    
  it "should not accept a first name that is not a string" do
    lambda { person = Person.new :Dean, "Wampler", 39 }.should raise_error(Exception)
  end    
  it "should not accept a last name that is a zero-length string" do
    lambda { person = Person.new "Dean", "", 39 }.should raise_error(Exception)
  end    
  it "should not accept a last name that is not a string" do
    lambda { person = Person.new :Dean, :Wampler, 39 }.should raise_error(Exception)
  end    
  it "should not accept an age that is less than or equal to zero" do
    lambda { person = Person.new "Dean", "Wampler", -1 }.should raise_error(Exception)
    lambda { person = Person.new "Dean", "Wampler", 0 }.should raise_error(Exception)
  end    
  it "should not accept an age that is not an integer" do
    lambda { person = Person.new :Dean, :Wampler, "39" }.should raise_error(Exception)
  end    
end

We have added validate_name and validate_age methods that are invoked at the end of #initialize. In validate_name, the one when clause requires “all” the conditions to be true, that the name is a string and that it has a non-zero length. Similarly, validate_age has a when clause that requires age to be a positive integer.
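The guard forms also work outside of a class; here is a small, hypothetical stand-alone check using the same Case::All and Case.guard constructs (positive_age? is my own helper name):

require "rubygems" 
require "case" 

def positive_age?(age)
  case age
  when Case::All[Integer, Case.guard {|n| n > 0 }] then true
  else false
  end
end

positive_age?(39)    # => true
positive_age?(0)     # => false
positive_age?("39")  # => false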

Final Thoughts

So, how valuable is this? The code is certainly longer, but it specifies and enforces expected behavior more precisely. The rspec examples verify the enforcement. It smells a little of static typing, which is good or bad, depending on your point of view. ;)

Personally, I think the conditional checks are a good way to add robustness in small ways to libraries that will grow and evolve for a long time. The checks document the required behavior for code readers, like new team members, but of course, they should really get that information from the tests. ;) (However, it would be nice to extract the information into the rdocs.)

For small, short-lived projects, I might not worry about the conditional checks as much (but how many times have those “short-lived projects” refused to die?).

You can read more about Omnibus and Case in this InfoQ interview with MenTaLguY. I didn’t discuss using the Actor model of concurrency, for which these gems were designed. For an example of Actors using Omnibus, see my Better Ruby through Functional Programming presentation or the Confreaks video of an earlier version of the presentation I gave at last year’s RubyConf.

Video of my RubyConf talk, "Better Ruby through Functional Programming"

Posted by Dean Wampler Thu, 27 Nov 2008 22:09:00 GMT

Confreaks has started posting the videos from RubyConf. Here’s mine on Better Ruby through Functional Programming.

Please ignore the occasional Ruby (and Scala) bugs…

A Scala-style "with" Construct for Ruby 111

Posted by Dean Wampler Tue, 30 Sep 2008 03:41:00 GMT

Scala has a “mixin” construct called traits, which are roughly analogous to Ruby modules. They allow you to create reusable, modular bits of state and behavior and use them to compose classes and other traits or modules.

The syntax for using Scala traits is quite elegant. It’s straightforward to implement the same syntax in Ruby and doing so has a few useful advantages.

For example, here is a Scala example that uses a trait to trace calls to a Worker.work method.

    
// run with "scala example.scala" 

class Worker {
    def work() = "work" 
}

trait WorkerTracer extends Worker {
    override def work() = "Before, " + super.work() + ", After" 
}

val worker = new Worker with WorkerTracer

println(worker.work())        // => Before, work, After
    

Note that WorkerTracer extends Worker so it can override the work method. Since Scala is statically typed, you can’t just define an override method and call super unless the compiler knows there really is a “super” method!

Here’s a Ruby equivalent.

    
# run with "ruby example.rb" 

module WorkerTracer
    def work; "Before, #{super}, After"; end
end

class Worker 
    def work; "work"; end
end

class TracedWorker < Worker 
  include WorkerTracer
end

worker = TracedWorker.new

puts worker.work          # => Before, work, After
    

Note that we have to create a subclass, which isn’t required for the Scala case (but can be done when desired).

If you know that you will always want to trace calls to work in the Ruby case, you might be tempted to dispense with the subclass and just add include WorkerTracer in Worker. Unfortunately, this won’t work. Due to the way that Ruby resolves methods, the version of work in the module will not be found before the version defined in Worker itself. Hence the subclass seems to be the only option.
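A small sketch (mine, not from the example above) shows why: the including class comes before its included modules in the ancestor chain, so Worker#work shadows WorkerTracer#work and super is never involved.

module WorkerTracer
    def work; "Before, #{super}, After"; end
end

class Worker
  include WorkerTracer
  def work; "work"; end      # this definition wins over the module's
end

Worker.ancestors.first(2)    # => [Worker, WorkerTracer]
Worker.new.work              # => "work"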

However, we can work around this using metaprogramming. We can use WorkerTracer#append_features(...). What goes in the argument list? If we pass Worker, then all instances of Worker will be affected, but actually we’ll still have the problem with the method resolution rules.

If we just want to affect one object and work around the method resolution rules, then we need to pass the singleton class (or eigenclass or metaclass ...) for the object, which you can get with the following expression.

    
metaclass = class << worker; self; end
    

So, to encapsulate all this and to get back to the original goal of implementing with-style semantics, here is an implementation that adds a with method to Object, wrapped in an rspec example.

    
# run with "spec ruby_with_spec.rb" 

require 'rubygems'
require 'spec'

# Warning, monkeypatching Object, especially with a name
# that might be commonly used is fraught with peril!!

class Object
  def with *modules
    metaclass = class << self; self; end
    modules.flatten.each do |m|
      m.send :append_features, metaclass
    end
    self
  end
end

module WorkerTracer
    def work; "Before, #{super}, After"; end
end

module WorkerTracer1
    def work; "Before1, #{super}, After1"; end
end

class Worker 
    def work; "work"; end
end

describe "Object#with" do
  it "should make no changes to an object if no modules are specified" do
    worker = Worker.new.with
    worker.work.should == "work" 
  end

  it "should override any methods with a module's methods of the same name" do
    worker = Worker.new.with WorkerTracer
    worker.work.should == "Before, work, After" 
  end

  it "should stack overrides for multiple modules" do
    worker = Worker.new.with(WorkerTracer).with(WorkerTracer1)
    worker.work.should == "Before1, Before, work, After, After1" 
  end

  it "should stack overrides for a list of modules" do
    worker = Worker.new.with WorkerTracer, WorkerTracer1
    worker.work.should == "Before1, Before, work, After, After1" 
  end

  it "should stack overrides for an array of modules" do
    worker = Worker.new.with [WorkerTracer, WorkerTracer1]
    worker.work.should == "Before1, Before, work, After, After1" 
  end
end
    

You should carefully consider the warning about monkeypatching Object! Also, note that Module.append_features is actually private, so I had to use m.send :append_features, ... instead.

The syntax is reasonably intuitive and it eliminates the need for an explicit subclass. You can pass a single module, or a list or array of them. Because with returns the object, you can also chain with calls.

A final note: many developers steer clear of metaprogramming and reflection features in their languages, out of fear. While prudence is definitely wise, the power of these tools can dramatically accelerate your productivity. Metaprogramming is just programming. Every developer should master it.

The Liskov Substitution Principle for "Duck-Typed" Languages

Posted by Dean Wampler Sun, 07 Sep 2008 04:48:00 GMT

OCP and LSP together tell us how to organize similar vs. variant behaviors. I blogged the other day about OCP in the context of languages with open classes (i.e., dynamically-typed languages). Let’s look at the Liskov Substitution Principle (LSP).

The Liskov Substitution Principle was coined by Barbara Liskov in Data Abstraction and Hierarchy (1987).

If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T.

I’ve always liked the elegant simplicity, yet power, of LSP. In less formal terms, it says that if a client (program) expects objects of one type to behave in a certain way, then it’s only okay to substitute objects of another type if the same expectations are satisfied.

This is our best definition of inheritance. The well-known is-a relationship between types is not precise enough. Rather, the relationship has to be behaves-as-a, which unfortunately is more of a mouthful. Note that is-a focuses on the structural relationship, while behaves-as-a focuses on the behavioral relationship. A very useful, pre-TDD design technique called Design by Contract emerges out of LSP, but that’s another topic.

Note that there is a slight assumption that I made in the previous paragraph. I said that LSP defines inheritance. Why inheritance specifically and not substitutability, in general? Well, inheritance has been the main vehicle for substitutability for most OO languages, especially the statically-typed ones.

For example, a Java application might use a simple tracing abstraction like this.

    
public interface Tracer {
    void trace(String message);
}
    

Clients might use this to trace methods calls to a log. Only classes that implement the Tracer interface can be given to these clients. For example,

    
public class TracerClient {
    private Tracer tracer;

    public TracerClient(Tracer tracer) {
        this.tracer = tracer;
    }

    public void doWork() {
        tracer.trace("in doWork():");
        // ...
    }
}
    

However, Duck Typing is another form of substitutability that is commonly seen in dynamically-typed languages, like Ruby and Python.

If it walks like a duck and quacks like a duck, it must be a duck.

Informally, duck typing says that a client can use any object you give it as long as the object implements the methods the client wants to invoke on it. Put another way, the object must respond to the messages the client wants to send to it.

The object appears to be a “duck” as far as the client is concerned.

In our example, clients only care about the trace(message) method being supported. So, we might do the following in Ruby.

    
class TracerClient 
  def initialize tracer 
    @tracer = tracer
  end

  def do_work
    @tracer.trace "in do_work:" 
    # ... 
  end
end

class MyTracer
  def trace message
    p message
  end
end

client = TracerClient.new(MyTracer.new)
    

No “interface” is necessary. I just need to pass an object to TracerClient#initialize that responds to the trace message. Here, I defined a class for the purpose. You could also add the trace method to another type or object.
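For example – a hypothetical variation on the code above – trace can be defined on just one object instead of on a class:

duck_tracer = Object.new
def duck_tracer.trace(message)
  p "singleton tracer: #{message}"
end

TracerClient.new(duck_tracer).do_work   # prints "singleton tracer: in do_work:"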

So, LSP is still essential, in the generic sense of valid substitutability, but it doesn’t have to be inheritance based.

Is Duck Typing good or bad? It largely comes down to your view about dynamically-typed vs. statically-typed languages. I don’t want to get into that debate here! However, I’ll make a few remarks.

On the negative side, without a Tracer abstraction, you have to rely on appropriate naming of objects to convey what they do (but you should be doing that anyway). Also, it’s harder to find all the “tracing-behaving” objects in the system.

On the other hand, the client really doesn’t care about a “Tracer” type, only a single method. So, we’ve decoupled “client” and “server” just a bit more. This decoupling is more evident when using closures to express behavior, e.g., for Enumerable methods. In our case, we could write the following.

    
class TracerClient2 
  def initialize &tracer 
    @tracer = tracer
  end

  def do_work 
    @tracer.call "in do_work:" 
    # ... 
  end
end

client = TracerClient2.new {|message| p "block tracer: #{message}"}
    

For comparison, consider how we might approach substitutability in Scala. As a statically-typed language, Scala doesn’t support duck typing per se, but it does support a very similar mechanism called structural types.

Essentially, structural types let us declare that a method parameter must support one or more methods, without having to say it supports a full interface. Loosely speaking, it’s like using an anonymous interface.

In our Java example, when we declare a tracer object in our client, we would be able to declare that it supports trace, without having to specify that it implements a full interface.

To be explicit, recall our Java constructor for TracerClient.

    
public class TracerClient {
    public TracerClient(Tracer tracer) { ... }
    // ...
}
    
In Scala, a complete example would be the following.
    
class ScalaTracerClient(val tracer: { def trace(message:String) }) {
    def doWork() = { tracer.trace("doWork") }
}

class ScalaTracer() {
    def trace(message: String) = { println("Scala: "+message) }
}

object TestScalaTracerClient {
    def main() {
        val client = new ScalaTracerClient(new ScalaTracer())
        client.doWork();
    }
}
TestScalaTracerClient.main()
    

Recall from my previous blogs on Scala, the argument list to the class name is the constructor arguments. The constructor takes a tracer argument whose “type” (after the ’:’) is { def trace(message:String) }. That is, all we require of tracer is that it support the trace method.

So, we get duck type-like behavior, but statically type checked. We’ll get a compile error, rather than a run-time error, if someone passes an object to the client that doesn’t respond to trace.

To conclude, LSP can be reworded very slightly.

If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is substitutable for T.

I replaced “a subtype of” with “substitutable for.”

An important point is that the idea of a “contract” between the types and their clients is still important, even in a language with duck-typing or structural typing. However, languages with these features give us more ways to extend our system, while still supporting LSP.

The Open-Closed Principle for Languages with Open Classes

Posted by Dean Wampler Fri, 05 Sep 2008 02:42:00 GMT

We’ve been having a discussion inside Object Mentor World Design Headquarters about the meaning of the OCP for dynamic languages, like Ruby, with open classes.

For example, in Ruby it’s normal to define a class or module, e.g.,

    
# foo.rb
class Foo
    def method1 *args
        ...
    end
end
    

and later re-open the class and add (or redefine) methods,

    
# foo2.rb
class Foo
    def method2 *args
        ...
    end
end
    

Users of Foo see all the methods, as if Foo had one definition.

    
foo = Foo.new
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
    

Do open classes violate the Open-Closed Principle? Bertrand Meyer articulated OCP. Here is his definition.1

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.

He elaborated on it here.

... This is the open-closed principle, which in my opinion is one of the central innovations of object technology: the ability to use a software component as it is, while retaining the possibility of adding to it later through inheritance. Unlike the records or structures of other approaches, a class of object technology is both closed and open: closed because we can start using it for other components (its clients); open because we can at any time add new properties without invalidating its existing clients.

(From Tell Less, Say More: The Power of Implicitness.)

So, if one client require’s only foo.rb and only uses method1, that client doesn’t care what foo2.rb does. However, if the client also require’s foo2.rb, perhaps indirectly through another require, problems will ensue unless the client is unaffected by what foo2.rb does. This looks a lot like the way “good” inheritance should behave.

So, the answer is no, we aren’t violating OCP, as long as we extend a re-opened class following the same rules we would use when inheriting from it.

If we use inheritance instead:

    
# foo.rb
class Foo
    def method1 *args
        ...
    end
end
...
class DerivedFoo < Foo
    def method2 *args
        ...
    end
end
...
foo = DerivedFoo.new    # Instantiate different class...
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
    

One notable difference is that we have to instantiate a different class. This is an important difference. While you can often just use inheritance, and maybe you should prefer it, inheritance only works if you have full control over what types get instantiated and it’s easy to change which types you use. Of course, inheritance is also the best approach when you need all behavioral variants simultaneously, i.e., each variant in one or more objects.

Sometimes you want to affect the behavior of all instances transparently, without changing the types that are instantiated. A slightly better example, logging method calls, illustrates the point. Here we use the “famous” alias_method in Ruby.

    
# foo.rb
class Foo
    def method1 *args
        ...
    end
end
# logging_foo.rb
class Foo
    alias_method :old_method1, :method1
    def method1 *args
        p "Inside method1(#{args.inspect})" 
        old_method1 *args
    end
end
...
foo = Foo.new
foo.method1 :arg1, :arg2
    

Foo#method1 behaves like a subclass override, with extended behavior that still obeys the Liskov Substitution Principle (LSP).

So, I think the OCP can be reworded slightly.

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source modification.

We should not re-open the original source, but adding functionality through a separate source file is okay.

Actually, I prefer a slightly different wording.

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source and contract modification.

The extra “and contract” is redundant with LSP. I don’t think this kind of redundancy is necessarily bad. ;) The contract is the set of behavioral expectations between the “entity” and its client(s). Just as it is bad to break the contract with inheritance, it is also bad to break it through open classes.

OCP and LSP together are our most important design principles for effective organization of similar vs. variant behaviors. Inheritance is one way we do this. Open classes provide another way. Aspects provide a third way and are subject to the same design issues.

1 Meyer, Bertrand (1988). Object-Oriented Software Construction. Prentice Hall. ISBN 0136290493.

The Seductions of Scala, Part III - Concurrent Programming

Posted by Dean Wampler Thu, 14 Aug 2008 21:00:00 GMT

This is my third and last blog entry on The Seductions of Scala, where we’ll look at concurrency using Actors and draw some final conclusions.

Writing Robust, Concurrent Programs with Scala

The most commonly used model of concurrency in imperative languages (and databases) uses shared, mutable state with access synchronization. (Recall that synchronization isn’t necessary for reading immutable objects.)

However, it’s widely known that this kind of concurrency programming is very difficult to do properly and few programmers are skilled enough to write such programs.

Because pure functional languages have no side effects and no shared, mutable state, there is nothing to synchronize. This is the main reason for the resurgent interest in functional programming recently, as a potential solution to the so-called multicore problem.

Instead, most functional languages, Erlang and Scala in particular, use the Actor model of concurrency, where autonomous “objects” run in separate processes or threads and pass messages back and forth to communicate. The simplicity of the Actor model makes it far easier to create robust programs. Erlang processes are so lightweight that it is common for server-side applications to have thousands of communicating processes.

Actors in Scala

Let’s finish our survey of Scala with an example using Scala’s Actors library.

Here’s a simple Actor that just counts to 10, printing each number, one per second.

    
import scala.actors._
object CountingActor extends Actor { 
    def act() { 
        for (i <- 1 to 10) { 
            println("Number: "+i)
            Thread.sleep(1000) 
        } 
    } 
} 

CountingActor.start()
    

The last line starts the actor, which implicitly invokes the act method. This actor does not respond to any messages from other actors.

Here is an actor that responds to messages, echoing the message it receives.

    
import scala.actors.Actor._ 
val echoActor = actor {
    while (true) {
        receive {
            case msg => println("received: "+msg)
        }
    }
}
echoActor ! "hello" 
echoActor ! "world!" 
    

In this case, we do the equivalent of a Java “static import” of the methods on Actor, e.g., actor. Also, we don’t actually need a special class; we can just create an object with the desired behavior. This object has an infinite loop that effectively blocks while waiting for an incoming message. The receive method gets a block that is a match statement, which matches on anything received and prints it out.

Messages are sent using the target_actor ! message syntax.

As a final example, let’s do something non-trivial: a contrived network node monitor.

    
import scala.actors._
import scala.actors.Actor._
import java.net.InetAddress 
import java.io.IOException

case class NodeStatusRequest(address: InetAddress, respondTo: Actor) 

sealed abstract class NodeStatus
case class Available(address: InetAddress) extends NodeStatus
case class Unresponsive(address: InetAddress, reason: Option[String]) extends NodeStatus

object NetworkMonitor extends Actor {
    def act() {
        loop {
            react {  // Like receive, but uses thread polling for efficiency.
                case NodeStatusRequest(address, actor) => 
                    actor ! checkNodeStatus(address)
                case "EXIT" => exit()
            }
        }
    }
    val timeoutInMillis = 1000;
    def checkNodeStatus(address: InetAddress) = {
        try {
            if (address.isReachable(timeoutInMillis)) 
                Available(address)
            else
                Unresponsive(address, None)
        } catch {
            case ex: IOException => 
                Unresponsive(address, Some("IOException thrown: "+ex.getMessage()))
        }
    }
}

// Try it out:

val allOnes = Array(1, 1, 1, 1).map(_.toByte)
NetworkMonitor.start()
NetworkMonitor ! NodeStatusRequest(InetAddress.getByName("www.scala-lang.org"), self)
NetworkMonitor ! NodeStatusRequest(InetAddress.getByAddress("localhost", allOnes), self)
NetworkMonitor ! NodeStatusRequest(InetAddress.getByName("objectmentor.com"), self)
NetworkMonitor ! "EXIT" 
self ! "No one expects the Spanish Inquisition!!" 

def handleNodeStatusResponse(response: NodeStatus) = response match {
    // Sealed classes help here
    case Available(address) => 
        println("Node "+address+" is alive.")
    case Unresponsive(address, None) => 
        println("Node "+address+" is unavailable. Reason: <unknown>")
    case Unresponsive(address, Some(reason)) => 
        println("Node "+address+" is unavailable. Reason: "+reason)
}

for (i <- 1 to 4) self.receive {   // Sealed classes don't help here
    case (response: NodeStatus) => handleNodeStatusResponse(response)
    case unexpected => println("Unexpected response: "+unexpected)
}
    

We begin by importing the Actor classes, the methods on Actor, like actor, and a few Java classes we need.

Next we define a sealed abstract base class. The sealed keyword tells the compiler that the only subclasses will be defined in this file. This is useful for the case statements that use them. The compiler will know that it doesn’t have to worry about potential cases that aren’t covered, if new NodeStatus subclasses are created. Otherwise, we would have to add a default case clause (e.g., case _ => ...) to prevent warnings (and possible errors!) about not matching an input. Sealed class hierarchies are a useful feature for robustness (but watch for potential Open/Closed Principle violations!).

The sealed class hierarchy encapsulates all the possible node status values (somewhat contrived for the example). The node is either Available or Unresponsive. If Unresponsive, an optional reason message is returned.

Note that we only get the benefit of sealed classes here because we match on them in the handleNodeStatusResponse message, which requires a response argument of type NodeStatus. In contrast, the receive method effectively takes an Any argument, so sealed classes don’t help on the line with the comment “Sealed classes don’t help here”. In that case, we really need a default, the case unexpected => ... clause. (I added the message self ! "No one expects the Spanish Inquisition!!" to test this default handler.)

In the first draft of this blog post, I didn’t know these details about sealed classes. I used a simpler implementation that couldn’t benefit from sealed classes. Thanks to the first commenter, LaLit Pant, who corrected my mistake!

The NetworkMonitor loops, waiting for a NodeStatusRequest or the special string “EXIT”, which tells it to quit. Note that the actor sending the request passes itself, so the monitor can reply to it.

The checkNodeStatus method attempts to contact the node, with a 1-second timeout. It returns an appropriate NodeStatus.

Then we try it out with three addresses. Note that we pass self as the requesting actor. This is an Actor wrapping the current thread, imported from Actor. It is analogous to Java’s Thread.currentThread().

Curiously enough, when I run this code, I get the following results.

    
Unexpected response: No one expects the Spanish Inquisition!!
Node www.scala-lang.org/128.178.154.102 is unavailable. Reason: <unknown>
Node localhost/1.1.1.1 is unavailable. Reason: <unknown>
Node objectmentor.com/206.191.6.12 is alive.
    

The message about the Spanish Inquisition was sent last, but processed first, probably because self sent it to itself.

I’m not sure why www.scala-lang.org couldn’t be reached. A longer timeout didn’t help. According to the Javadocs for InetAddress.isReachable, it uses ICMP ECHO REQUESTs if the privilege can be obtained, otherwise it tries to establish a TCP connection on port 7 (Echo) of the destination host. Perhaps neither is supported on the scala-lang.org site.

Conclusions

Here are some concluding observations about Scala vis-à-vis Java and other options.

A Better Java

Ignoring the functional programming aspects for a moment, I think Scala improves on Java in a number of very useful ways, including:

  1. A more succinct syntax. There’s far less boilerplate, like for fields and their accessors. Type inference and optional semicolons, curly braces, etc. also reduce “noise”.
  2. A true mixin model. The addition of traits solves the problem of not having a good DRY way to mix in additional functionality declared by Java interfaces.
  3. More flexible method names and invocation syntax. Java took away operator overloading; Scala gives it back, as well as other benefits of using non-alphanumeric characters in method names. (Ruby programmers enjoy writing list.empty?, for example.)
  4. Tuples. A personal favorite, I’ve always wanted the ability to return multiple values from a method, without having to create an ad hoc class to hold the values.
  5. Better separation of mutable vs. immutable objects. While Java provides some ability to make objects final, Scala makes the distinction between mutability and immutability more explicit and encourages the latter as a more robust programming style.
  6. First-class functions and closures. Okay, these last two points are really about FP, but they sure help in OO code, too!
  7. Better mechanisms for avoiding null’s. The Option type makes code more robust than allowing null values.
  8. Interoperability with Java libraries. Scala compiles to byte code so adding Scala code to existing Java applications is about as seamless as possible.

So, even if you don’t believe in FP, you will gain a lot just by using Scala as a better Java.

Functional Programming

But, you shouldn’t ignore the benefits of FP!

  1. Better robustness. Not only for concurrent programs, but using immutable objects (a.k.a. value objects) reduces the potential for bugs.
  2. A workable concurrency model. I use the term workable because so few developers can write robust concurrent code using the synchronization on shared state model. Even for those of you who can, why bother when Actors are so much easier??
  3. Reduced code complexity. Functional code tends to be very succinct. I can’t overstate the importance of rooting out all accidental complexity in your code base. Excess complexity is one of the most pervasive detriments to productivity and morale that I see in my clients’ code bases!
  4. First-class functions and closures. Composition and succinct code are much easier with first-class functions.
  5. Pattern matching. FP-style pattern matching makes “routing” of messages and delegation much easier.

Of course, you can mimic some of these features in pure Java and I encourage you to do so if you aren’t using Scala.

Static vs. Dynamic Typing

The debate on the relative merits of static vs. dynamic typing is outside our scope, but I will make a few personal observations.

I’ve been a dedicated Rubyist for a while. It is hard to deny the way that dynamic typing simplifies code, and as I said in the previous section, I take code complexity very seriously.

Scala’s type system and type inference go a long way towards providing the benefits of static typing with the cleaner syntax of dynamic typing, but Scala doesn’t eliminate the extra complexity of static typing.

Recall my Observer example from the first blog post, where I used traits to implement it.

    
trait Observer[S] {
    def receiveUpdate(subject: S);
}

trait Subject[S] { 
    this: S =>
    private var observers: List[Observer[S]] = Nil
    def addObserver(observer: Observer[S]) = observers = observer :: observers

    def notifyObservers() = observers.foreach(_.receiveUpdate(this))
}
    
In Ruby, we might implement it this way.
    
module Subject
  def add_observer(observer)
    @observers ||= []
    @observers << observer  # append, rather than replace with new array
  end

  def notify_observers
    @observers.each {|o| o.receive_update(self)} if @observers
  end
end
    

There is no need for an Observer module. As long as every observer responds to the receive_update “message”, we’re fine.

I commented the line where I append to the existing @observers array, rather than build a new one, which would be the FP and Scala way. Appending to the existing array would be more typical of Ruby code, but this implementation is not as thread safe as an FP-style approach.

The trailing if expression in notify_observers means that nothing is done if @observers is still nil, i.e., it was never initialized in add_observer.

So, which is better? The amount of code is not that different, but it took me significantly longer to write the Scala version. In part, this was due to my novice chops, but mostly it was because I had to solve a design issue resulting from the static typing. I had to learn about the typed self construct (the this: S => line) at the top of the Subject trait. This was the only way to allow the Observer.receiveUpdate method to accept an argument of type S, rather than of type Subject[S]. It was worth it to me to achieve the “cleaner” API.
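
To see what the typed self buys you, here is a hypothetical pair of classes that use the two traits above (the Counter names are made up purely for illustration). Because of the self type, receiveUpdate is handed a Counter, so the observer can call any Counter method directly:

    
// Assumes the Observer[S] and Subject[S] traits shown above are in scope.
class Counter extends Subject[Counter] {
    private var count = 0
    def increment() = {
        count += 1
        notifyObservers()
    }
    def current = count
}

class CounterObserver extends Observer[Counter] {
    def receiveUpdate(counter: Counter) =
        println("count is now " + counter.current)
}

object ObserverDemo {
    def main(args: Array[String]) {
        val counter = new Counter
        counter.addObserver(new CounterObserver)
        counter.increment()   // prints "count is now 1"
    }
}
    

Without the this: S => line, notifyObservers could only hand a Subject[Counter] to its observers, which is exactly the design issue described above.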

Okay, perhaps I’ll know this next time and spend about the same amount of time implementing a Ruby vs. Scala version of something. However, I think it’s notable that sometimes static typing can get in the way of your intentions and goal of achieving clarity. (At other times, the types add useful documentation.) I know this isn’t the only valid argument you can make, one way or the other, but it’s one reason that dynamic languages are so appealing.

Poly-paradigm Languages vs. Mixing Several Languages

So, you’re convinced that you should use FP sometimes and OOP sometimes. Should you pick a poly-paradigm language, like Scala? Or, should you combine several languages, each of which implements one paradigm?

A potential downside of Scala is that supporting different modularity paradigms, like OOP and FP, increases the complexity in the language. I think Odersky and company have done a superb job combining FP and OOP in Scala, but if you compare Scala FP code to Haskell or Erlang FP code, the latter tends to be more succinct and often easier to understand (once you learn the syntax).

Indeed, Scala will not be easy for developers to master. It will be a powerful tool for professionals. As a consultant, I work with developers with a range of skills. I would not expect some of them to prosper with Scala. Should that rule out the language? NO. Rather, it would be better to “liberate” the better developers with a more powerful tool.

So, if your application needs OOP and FP concepts interspersed, consider Scala. If your application needs discrete services, some of which are predominantly OOP and others of which are predominantly FP, then consider Scala or Java for the OOP parts and Erlang or another FP language for the FP parts.

Also, Erlang’s Actor model is more mature than Scala’s, so Erlang might have an edge for a highly-concurrent server application.

Of course, you should do your own analysis…

Final Thoughts

Java the language has had a great ride. It was a godsend to us beleaguered C++ programmers in the mid-’90s. However, compared to Scala, Java now feels obsolete. The JVM is another story. It is arguably the best VM available.

I hope Scala replaces Java as the main JVM language for projects that prefer statically-typed languages. Fans of dynamically-typed languages might prefer JRuby, Groovy, or Jython. It’s hard to argue with all the OOP and FP goodness that Scala provides. You will learn a lot about good language and application design by learning Scala. It will certainly be a prominent tool in my toolkit from now on.

Unit Testing C and C++ ... with Ruby and RSpec! 110

Posted by Dean Wampler Tue, 05 Feb 2008 04:08:00 GMT

If you’re writing C/C++ code, it’s natural to write your unit tests in the same language (or use C++ for your C test code). All the well-known unit testing tools take this approach.

I think we can agree that neither language offers the best developer productivity among all the language choices out there. Most of us use either language because of perceived performance requirements, institutional and industry tradition, etc.

There’s growing interest, however, in mixing languages, tools, and paradigms to get the best tool for a particular job. <shameless-plug>I’m giving a talk March 7th at SD West on this very topic, called Polyglot and Poly-Paradigm Programming </shameless-plug>.

So, why not use a more productive language for your C or C++ unit tests? You have more freedom in your development chores than what’s required for production. Why not use Ruby’s RSpec, a Behavior-Driven Development tool for acceptance and unit testing? Or, you could use Ruby’s version of JUnit, called Test::Unit. The hard part is integrating Ruby and C/C++. If you’ve been looking for an excuse to bring Ruby (or Tcl or Python or Java or…) into your C/C++ environment, starting with development tasks is usually the path of least resistance.

I did some experimenting over the last few days to integrate RSpec using SWIG (Simplified Wrapper and Interface Generator), a tool for bridging libraries written in C and C++ to other languages, like Ruby. The Ruby section of the SWIG manual was very helpful.

My Proof-of-Concept Code

Here is a zip file of my experiment: rspec_for_cpp.zip

This is far from a complete and working solution, but I think it shows promise. See the Current Limitations section below for details.

Unzip the file into a directory. I’ll assume you named it rspec_for_cpp. You need to have gmake, gcc, SWIG and Ruby installed, along with the RSpec “gem”. Right now, it only builds on OS X and Linux (at least the configurations on my machines running those OS’s – see the discussion below). To run the build, use the following commands:

    
        $ cd rspec_for_cpp/cpp
        $ make 
    

You should see it finish with the lines

    
        ( cd ../spec; spec *_spec.rb )
        .........

        Finished in 0.0***** seconds

        9 examples, 0 failures
    

Congratulations, you’ve just tested some C and C++ code with RSpec! (Or, if you didn’t succeed, see the notes in the Makefile and the discussion below.)

The Details

I’ll briefly walk you through the files in the zip and the key steps required to make it all work.

cpp/cexample.h

Here is a simple C header file.

    
        /* cexample.h */
        #ifndef CEXAMPLE_H
        #define CEXAMPLE_H
        #ifdef __cplusplus
         extern "C" {
        #endif
        char* returnString(char* input);
        double returnDouble(int input);
        void  doNothing();

        #ifdef __cplusplus
         }
        #endif
        #endif
    

Of course, in a pure C shop, you won’t need the #ifdef __cplusplus stuff. I found this was essential in my experiment when I mixed C and C++, as you might expect.

cpp/cexample.c

Here is the corresponding C source file.

    
        /* cexample.c */
        #include "cexample.h"   /* keep the definitions in sync with the declarations */

        char* returnString(char* input) {
            return input;
        }

        double returnDouble(int input) {
            return (double) input;
        }

        void  doNothing() {}
    

cpp/CppExample.h

Here is a C++ header file.

    
        #ifndef CPPEXAMPLE_H
        #define CPPEXAMPLE_H

        #include <string>

        class CppExample 
        {
        public:
            CppExample();
            CppExample(const CppExample& foo);
            CppExample(const char* title, int flag);
            virtual ~CppExample();

            const char* title() const;
            void        title(const char* title);
            int         flag() const;
            void        flag(int value);

            static int countOfCppExamples();
        private:
            std::string _title;
            int         _flag;
        };

        #endif
    

cpp/CppExample.cpp

Here is the corresponding C++ source file.

    
        #include "CppExample.h" 

        CppExample::CppExample() : _title(""), _flag(0) {}
        CppExample::CppExample(const CppExample& foo): _title(foo._title), _flag(foo._flag) {}
        CppExample::CppExample(const char* title, int flag) : _title(title), _flag(flag) {}
        CppExample::~CppExample() {}

        const char* CppExample::title() const { return _title.c_str(); }
        void        CppExample::title(const char* name) { _title = name; }

        int  CppExample::flag() const { return _flag; }
        void CppExample::flag(int value) { _flag = value; }

        int CppExample::countOfCppExamples() { return 1; }
    

cpp/example.i

Typically in SWIG, you specify a .i file to the swig command to define the module that wraps the classes and global functions, which classes and functions to expose to the target language (usually all in our case), and other assorted customization options, which are discussed in the SWIG manual. I’ll show the swig command in a minute. For now, note that I’m going to generate an example_wrap.cpp file that will function as the bridge between the languages.

Here’s my example.i, where I named the module example.

    
        %module example
        %{
            #include "cexample.h" 
            #include "CppExample.h"    
        %}
        %include "cexample.h" 
        %include "CppExample.h" 
    

It looks odd to have header files appear twice. The code inside the %{...%} block (with a ’#’ before each include) consists of standard C and C++ statements that will be inserted verbatim into the generated “wrapper” file, example_wrap.cpp, so that file will compile when it references anything declared in the header files. The second case, with a ’%’ before the include statements1, tells SWIG to make all the declarations in those header files available to the target language. (You can be more selective, if you prefer…)

Following Ruby conventions, the Ruby plugin for SWIG automatically names the module with an upper case first letter (Example), but you use require 'example' to include it, as we’ll see shortly.

Building

See the cpp/Makefile for the gory details. In a nutshell, you run the swig command like this.

    
        swig -c++ -ruby -Wall -o example_wrap.cpp example.i
    

Next, you create a dynamically-linked library, as appropriate for your platform, so the Ruby interpreter can load the module dynamically when required. The Makefile can do this for Linux and OS X platforms. See the Ruby section of the SWIG manual for Windows specifics.

If you test-drive your code, which tends to drive you towards minimally-coupled “modules”, then you can keep your libraries and build times small, which will make the build and test cycle very fast!

spec/cexample_spec.rb and spec/cppexample_spec.rb

Finally, here are the RSpec files that exercise the C and C++ code. (Disclaimer: these aren’t the best spec files I’ve ever written. For one thing, they don’t exercise all the CppExample methods! So sue me… :)

    
        require File.dirname(__FILE__) + '/spec_helper'
        require 'example'

        describe "Example (C functions)" do
          it "should be a constant on Module" do
            Module.constants.should include('Example')
          end
          it "should have the methods defined in the C header file" do
            Example.methods.should include('returnString')
            Example.methods.should include('returnDouble')
            Example.methods.should include('doNothing')
          end
        end

        describe Example, ".returnString" do
          it "should return the input char * string as a Ruby string unchanged" do
            Example.returnString("bar!").should == "bar!" 
          end  
        end

        describe Example, ".returnDouble" do
          it "should return the input integer as a double" do
            Example.returnDouble(10).should == 10.0
          end
        end

        describe Example, ".doNothing" do
          it "should exist, but do nothing" do
            lambda { Example.doNothing }.should_not raise_error
          end
        end
    

and

    
    require File.dirname(__FILE__) + '/spec_helper'
    require 'example'

    describe Example::CppExample do
      it "should be a constant on module Example" do
        Example.constants.should include('CppExample')
      end
    end

    describe Example::CppExample, ".new" do
      it "should create a new object of type CppExample" do
        example = Example::CppExample.new("example1", 1)
        example.title.should == "example1" 
        example.flag.should  == 1
      end
    end

    describe Example::CppExample, "#title(value)" do
      it "should set the title" do
        example = Example::CppExample.new("example1", 1)
        example.title("title2")
        example.title.should == "title2" 
      end
    end

    describe Example::CppExample, "#flag(value)" do
      it "should set the flag" do
        example = Example::CppExample.new("example1", 1)
        example.flag(2)
        example.flag.should == 2
      end
    end
    

If you love RSpec like I do, this is a very compelling thing to see!

And now for the small print:

Current Limitations

As I said, this is just an experiment at this point. Volunteers to make this battle-ready would be most welcome!

General

The Example Makefile

It Must Be Hand Edited for Each New or Renamed Source File

You’ve probably already solved this problem for your own make files. Just merge in the example Makefile to pick up the SWIG- and RSpec-related targets and rules.

It Only Knows How to Build Shared Libraries for Mac OS X and Linux (and Not Very Well)

Some definitions are probably unique to my OS X and Linux machines. Windows is not supported at all. However, this is also easy to rectify. Start with the notes in the Makefile itself.

The module.i File Must Be Hand Edited for Each File Change

Since the format is simple, a make task could fill a template file with the changed list of source files during the build.

Better Automation

It should be straightforward to provide scripts, IDE/Editor shortcuts, etc. that automate some of the tasks of adding new methods and classes to your C and C++ code when you introduce them first in your “spec” files. (The true TDD way, of course.)

Specific Issues for C Code Testing

I don’t know of any C-specific issues beyond the general ones above, so unit testing with Ruby is most viable today for C code. Although I haven’t experimented extensively, C functions and variables are easily mapped by SWIG to a Ruby module. The Ruby section of the SWIG manual discusses this mapping in some detail.

Specific Issues for C++ Code Testing

More work will be required to make this viable. It’s important to note that SWIG cannot handle all C++ constructs (although there are workarounds for most issues, if you’re committed to this approach…). For example, namespaces, nested classes, and some template and method-overloading scenarios are not supported. The SWIG manual has details.

Also, during my experiment, SWIG didn’t seem to map const std::string& objects in C++ method signatures to Ruby strings, as I would have expected (char* worked fine).

Is It a Viable Approach?

Once the General issues listed above are handled, I think this approach would work very well for C code. For C++ code, there are more issues that need to be addressed, and programmers who are committed to this strategy will need to tolerate some issues (or just use C++-language tools for some scenarios).

Conclusions: Making It Development-Team Ready

I’d like to see this approach pushed to its logical limit. I think it has the potential to really improve the productivity of C and C++ developers and the quality of their test coverage, by leveraging the productivity and power of dynamically-typed languages like Ruby. If you prefer, you could use Tcl, Python, even Java instead.

License

This code is completely open and free to use. Of course, use it at your own risk; I offer it without warranty, etc., etc. When I polish it to the point of making it an “official” project, I will probably release it under the Apache license.

1 I spent a lot of time debugging problems because I had a ’#’ where I should have had a ’%’! Caveat emptor!

ANN: Talks at the Chicago Ruby Users Group (Oct. 1, Dec. 3) 15

Posted by Dean Wampler Sat, 29 Sep 2007 14:52:00 GMT

I’m speaking on Aquarium this Monday night (Oct. 1st). Details are here. David Chelimsky will also be talking about new developments in RSpec and RBehave.

At the Dec. 3 Chirb meeting, the two of us are reprising our Agile 2007 tutorial Ruby’s Secret Sauce: Metaprogramming.

Please join us!
