The Seductions of Scala, Part II - Functional Programming

Posted by Dean Wampler Wed, 06 Aug 2008 01:32:00 GMT

A Functional Programming Language for the JVM

In my last blog post, I discussed Scala’s support for OOP and general improvements compared to Java. In this post, which I’m posting from Agile 2008, I discuss Scala’s support for functional programming (FP) and why it should be of interest to OO developers.

A Brief Overview of Functional Programming

You might ask, don’t most programming languages have functions? FP uses the term in the mathematical sense of the word. I hate to bring up bad memories, but you might recall from your school days that when you solved a function like

    
y = sin(x)
    

for y, given a value of x, you could input the same value of x an arbitrary number of times and you would get the same value of y. This means that sin(x) has no side effects. In other words, unlike our imperative OO or procedural code, no global or object state gets changed. All the work that a mathematical function does has to be returned in the result.

Similarly, the idea of a variable is a little different than what we’re used to in imperative code. While the value of y will vary with the value of x, once you have fixed x, you have also fixed y. The implication for FP is that “variables” are immutable; once assigned, they cannot be changed. I’ll call such immutable variables value objects.
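
For instance, here is how both ideas look in Scala (a trivial sketch; the names are mine):

    
// A side-effect free function: same input, same output, no state touched.
def areaOfCircle(radius: Double): Double = math.Pi * radius * radius

val r = 2.0                 // an immutable "value"; r cannot be reassigned
val area = areaOfCircle(r)  // always the same result for the same r
    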

Now, it would actually be hard for a “pure” FP language to have no side effects, ever. I/O would be rather difficult, for example, since the state of the input or output stream changes with each operation. So, in practice, all “pure” FP languages provide some mechanisms for breaking the rules in a controlled way.

Functions are first-class objects in FP. You can create named or anonymous functions (e.g., closures or blocks), assign them to variables, pass them as arguments to other functions, etc. Java doesn’t support this. You have to create objects that wrap the methods you want to invoke.
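
Scala, by contrast, supports all of this directly, as this small sketch shows:

    
val double = (i: Int) => i * 2             // an anonymous function bound to a val
def applyToThree(f: Int => Int) = f(3)     // a method taking a function argument

println(applyToThree(double))              // => 6
println(applyToThree((i: Int) => i + 10))  // => 13
    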

Functional programs tend to be much more declarative in nature than imperative programs. This is perhaps more obvious in more purely functional languages, like Erlang and Haskell, than it is in Scala.

For example, the definition of Fibonacci numbers is the following.

    
F(n) = F(n-1) + F(n-2) where F(1)=1 and F(2)=1
    

And here is a complete implementation in Haskell.

    
module Main where
-- Function f returns the n'th Fibonacci number.
-- It uses binary recursion.
f n | n <= 2 = 1
    | n >  2 = f (n-1) + f (n-2)

-- A main so the module is genuinely complete; prints f 10, i.e., 55.
main = print (f 10)
    

Without understanding the intricacies of Haskell syntax, you can see that the code closely matches the “specification” above it. The f n | ... syntax defines the function f taking an argument n, and the two cases of n values are shown on separate lines: one case for n <= 2 and the other for n > 2.

The code uses the recursive relationship between different values of the function and the special-case values when n = 1 and n = 2. The Haskell runtime does the rest of the work.

It’s interesting that most domain-specific languages are also declarative in nature. Think of how JMock, EasyMock, or Rails’ ActiveRecord code looks. The code is more succinct and it lets the “system” do most of the heavy lifting.

Functional Programming’s Benefits for You

Value Objects and Side-Effect Free Functions

It’s the immutable variables and side-effect free functions that help solve the multicore problem. Synchronized access to shared state is not required if there is no state to manage. This makes robust concurrent programs far easier to write.

I’ll discuss concurrency in Scala in my third post. For now, let’s discuss other ways that FP in Scala helps to improve code, concurrent or not.

Value objects are beneficial because you can pass one around without worrying that someone will change it in a way that breaks other users of the object. Value objects aren’t unique to FP, of course. They have been promoted in Domain Driven Design (DDD), for example.

Similarly, side-effect free functions are safer to use. There is less risk that a caller will change some state inappropriately. The caller doesn’t have to worry as much about calling a function. There are fewer surprises and everything of “consequence” that the function does is returned to the caller. It’s easier to keep to the Single Responsibility Principle when writing side-effect free functions.

Of course, you can write side-effect free methods and effectively immutable variables in Java code, but it’s mostly a matter of discipline; the language doesn’t give you any enforcement mechanisms.

Scala gives you a helpful enforcement mechanism: the ability to declare variables as vals (i.e., “values”) vs. vars (i.e., “variables”, um… back to the imperative programming sense of the word…). In fact, where the language requires neither keyword, val is the default. Also, the Scala library contains both immutable and mutable collections, and it “encourages” you to use the immutable collections.
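
For example (a trivial sketch):

    
var mutableCount = 0
mutableCount = 1   // fine; a var can be reassigned

val total = 10
// total = 20      // won't compile; a val cannot be reassigned
    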

However, because Scala combines both OOP and FP, it doesn’t force FP purity. The upside is that you get to use the approach that best fits the problem you’re trying to solve. It’s interesting that some of the Scala library classes expose FP-style interfaces, immutability and side-effect free functions, while using more traditional imperative code to implement them!

Closures and First-Class Functions

True to its functional side, Scala gives you true closures and first-class functions. If you’re a Groovy or Ruby programmer, you’re used to the following kind of code.

    
class ExpensiveResource {
    def open(worker: () => Unit) = {
        try {
            println("Doing expensive initialization")
            worker()
        } finally {
            close()
        }
    }
    def close() = {
        println("Doing expensive cleanup")
    }
}
// Example use:
try {
    (new ExpensiveResource()) open { () =>        // 1
        println("Using Resource")                 // 2
        throw new Exception("Thrown exception")   // 3
    }                                             // 4
} catch {
    case ex: Throwable => println("Exception caught: "+ex)
}
    

Running this code will yield:

    
Doing expensive initialization
Using Resource
Doing expensive cleanup
Exception caught: java.lang.Exception: Thrown exception
    

The ExpensiveResource.open method invokes the user-specified worker function. The syntax worker: () => Unit defines the worker parameter as a function that takes no arguments and returns nothing (recall that Unit is the equivalent of void).

ExpensiveResource.open handles the details of initializing the resource, invoking the worker, and doing the necessary cleanup.

The example marked with the comment // 1 creates a new ExpensiveResource, then calls open, passing it an anonymous function, called a function literal in Scala terminology. The function literal is of the form (arg_list) => function_body, or () => println(...) ... in our case.

A special syntax trick is used on this line; if a method takes one argument, you can change expressions of the form object.method(arg) to object method {arg}. This syntax is supported to allow user-defined methods to read like control structures (think for statements – see the next section). If you’re familiar with Ruby, the four commented lines read a lot like Ruby syntax for passing blocks to methods.

Idioms like this are very important. A library writer can encapsulate all complex, error-prone logic and allow the user to specify only the unique work required in a given situation. For example, how many times have you written code that opened an I/O stream or a database connection, used it, then cleaned up? How many times did you get the idiom wrong, especially the proper cleanup when an exception is thrown? First-class functions allow writers of I/O, database and other resource libraries to do the correct implementation once, eliminating user error and duplication. Here’s a rhetorical question I always ask myself:

How can I make it impossible for the user of this API to fail?
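
In that spirit, here is a sketch of a loan-pattern helper for file I/O; withSource is my own name, not a standard library method:

    
import scala.io.Source

// The helper owns the open/use/close idiom; callers supply only the unique work.
def withSource[T](fileName: String)(worker: Source => T): T = {
    val source = Source.fromFile(fileName)
    try {
        worker(source)
    } finally {
        source.close()   // cleanup happens even if worker throws
    }
}

// Example use; the caller can't forget the cleanup:
// withSource("data.txt") { source => source.getLines().foreach(println) }
    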

Iterations

Iteration through collections, Lists in particular, is even more common in FP than in imperative languages. Hence, iteration is highly evolved. Consider this example:

    
object RequireWordsStartingWithPrefix {
    def main(args: Array[String]) = {
        val prefix = args(0)
        for {
            i <- 1 to (args.length - 1)   // no semicolon
            if args(i).startsWith(prefix)
        } println("args("+i+"): "+args(i))
    }
}
    

Compiling this code with scalac and then running it on the command line with the command

    
scala RequireWordsStartingWithPrefix xx xy1 xx1 yy1 xx2 xy2
    

produces the result

    
args(2): xx1
args(5): xx2
    

The for expression binds the loop variable i to each index in turn, but executes the body only when the if condition is true. Instead of curly braces, the for expression’s generator list could also be parenthesized, but then each line as shown would have to be separated by a semicolon, like we’re used to seeing with Java for loops.
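
For comparison, here is the parenthesized form of the same loop:

    
for (i <- 1 to (args.length - 1); if args(i).startsWith(prefix))
    println("args("+i+"): "+args(i))
    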

We can have an arbitrary number of assignments and conditionals. In fact, it’s quite common to filter lists:

    
object RequireWordsStartingWithPrefix2 {
    def main(args: Array[String]) = {
        val prefix = args(0)
        args.slice(1, args.length)
            .filter((arg: String) => arg.startsWith(prefix))
            .foreach((arg: String) => println("arg: "+arg))
    }
}
    

This version yields the same result. In this case, the args array is sliced (lopping off the search prefix), the resulting array is filtered using a function literal, and the filtered array is iterated over to print out the matching arguments, again using a function literal. This version of the algorithm should look familiar to Ruby programmers.

Rolling Your Own Function Objects

Scala still has to support the constraints of the JVM. As a comment to the first blog post said, the Scala compiler wraps closures and “bare” functions in Function objects. You can also make other objects behave like functions. If your object implements the apply method, that method will be invoked when you put parentheses with a matching argument list after the object reference, as in the following example.

    
class HelloFunction {
    def apply() = "hello" 
    def apply(name: String) = "hello "+name
}
val hello = new HelloFunction
println(hello())        // => "hello" 
println(hello("Dean"))  // => "hello Dean" 
    

Option, None, Some…

Null pointer exceptions suck. You can still get them in Scala code, because Scala runs on the JVM and interoperates with Java libraries, but Scala offers a better way.

Typically, a reference might be null when there is nothing appropriate to assign to it. Following the conventions in some FP languages, Scala has an Option type with two subtypes, Some, which wraps a value, and None, which is used instead of null. The following example, which also demonstrates Scala’s Map support, shows these types in action.

    
val hotLangs = Map(
    "Scala" -> "Rocks", 
    "Haskell" -> "Ethereal", 
    "Java" -> null)
println(hotLangs.get("Scala"))          // => Some(Rocks)
println(hotLangs.get("Java"))           // => Some(null)
println(hotLangs.get("C++"))            // => None
    

Note that Map stores values in Option objects, as shown by the println statements.

By the way, those -> aren’t special operators; they’re methods. As with ::, valid method names aren’t limited to alphanumerics, _, and $.
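
You can exploit this in your own classes; here is a made-up Money class whose + is just a method:

    
class Money(val amount: Double) {
    def +(other: Money) = new Money(amount + other.amount)
}

val total = new Money(3.5) + new Money(1.5)  // sugar for new Money(3.5).+(new Money(1.5))
println(total.amount)                        // => 5.0
    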

Pattern Matching

The last FP feature I’ll discuss in this post is pattern matching, which is exploited more fully in FP languages than in imperative languages.

Using our previous definition of hotLangs, here’s how you might use matching.

    
def show(key: String) = {
    val value: Option[String] = hotLangs.get(key)
    value match {
        case Some(x) => x
        case None => "No hotness found" 
    }
}
println(show("Scala"))  // => "Rocks" 
println(show("Java"))   // => "null" 
println(show("C++"))    // => "No hotness found" 
    

The first case statement, case Some(x) => x, says “if the value I’m matching against is a Some that wraps some value x (Some is declared roughly as case class Some[+A](x: A)), then return x, the thing the Some contains.” Okay, there’s a lot going on here, so more background information is in order.

In Scala, like Ruby and other languages, the last value computed in a function is returned by it. Also, almost everything returns a value, including match statements, so when the Some(x) => x case is chosen, x is returned by the match and hence by the function.
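
For example, reusing hotLangs, the result of a match can be bound directly to a val:

    
val hotness = hotLangs.get("Scala") match {
    case Some(x) => x
    case None => "No hotness found" 
}
println(hotness)   // => "Rocks"
    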

Some is a generic class and the show function returns a String, so the match here is against Some[String]. The + in the Some[+A] declaration is analogous to Java’s extends, i.e., <? extends A>; it makes Some covariant in its type parameter. Capiche?
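
Here is a tiny sketch of what that covariance buys you:

    
// Because Option is declared with a covariant parameter (Option[+A]),
// an Option[String] can be used wherever an Option[AnyRef] is expected.
val specific: Option[String] = Some("Rocks")
val general: Option[AnyRef] = specific   // compiles, thanks to the +
    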

Patterns like Some(x) rely on what Scala calls extractors, which are used a lot in Scala, as well as in FP in general. Here’s another example using Lists and our friend ::, the “cons” operator.

    
def countScalas(list: List[String]): Int = {
    list match {
        case "Scala" :: tail => countScalas(tail) + 1
        case _ :: tail       => countScalas(tail)
        case Nil             => 0
    }
}
val langs = List("Scala", "Java", "C++", "Scala", "Python", "Ruby")
val count = countScalas(langs)
println(count)    // => 2
    

We’re counting the number of occurrences of “Scala” in a list of strings, using matching and recursion and no explicit iteration. A pattern of the form head :: tail matches a non-empty list, binding head to the first element and tail to the rest of the list. In our case, the first case statement handles the particular case where the head equals “Scala”. The second case matches all other non-empty lists; since cases are tried in order, the first case will always pick out the List("Scala", ...) case first. Note that in the second case, we don’t actually care about the value of the head, so we use the placeholder _. Both the first and second cases call countScalas recursively, while the third case, Nil, matches the empty list and terminates the recursion.

Pattern matching like this is powerful, yet succinct and elegant. We’ll see more examples of matching in the next blog post on concurrency using message passing.

Recap of Scala’s Functional Programming

I’ve just touched the tip of the iceberg concerning functional programming (and I hope I got all the details right!). Hopefully, you can begin to see why we’ve overlooked FP for too long!

In my last post, I’ll wrap up with a look at Scala’s approach to concurrency, the Actor model of message passing.

The Seductions of Scala, Part I

Posted by Dean Wampler Sun, 03 Aug 2008 20:30:00 GMT

(Update 12/23/2008: Thanks to Apostolos Syropoulos for pointing out an earlier reference for the concept of “traits”).

Because of all the recent hoo-ha about functional programming (e.g., as a “cure” for the multicore problem), I decided to cast aside my dysfunctional ways and learn one of the FP languages. The question was, which one?

My distinguished colleague, Michael Feathers, has been on a Haskell binge of late. Haskell is a pure functional language and is probably most interesting as the “flagship language” for academic exploration, rather than production use. (That was not meant as flame bait…) It’s hard to overestimate the influence Haskell has had on language design, including Java generics, .NET LINQ and F#, etc.

However, I decided to learn Scala first, because it is a JVM language that combines object-oriented and functional programming in one language. At ~13 years of age, Java is a bit dated. Scala has the potential of replacing Java as the principal language of the JVM, an extraordinary piece of engineering that is arguably now more valuable than the language itself. (Note: there is also a .NET version of Scala under development.)

Here are some of my observations, divided over three blog posts.

First, a few disclaimers. I am a Scala novice, so any flaws in my analysis reflect on me, not Scala! Also, this is by no means an exhaustive analysis of the pros and cons of Scala vs. other options. Start with the Scala website for more complete information.

A Better OOP Language

Scala works seamlessly with Java. You can invoke Java APIs, extend Java classes and implement Java interfaces. You can even invoke Scala code from Java, once you understand how certain “Scala-isms” are translated to Java constructs (javap is your friend). Scala syntax is more succinct and removes a lot of tedious boilerplate from Java code.

For example, the following Person class in Java:

    
class Person {
    private String firstName;
    private String lastName;
    private int    age;

    public Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName  = lastName;
        this.age       = age;
    }

    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getFirstName() { return this.firstName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getLastName() { return this.lastName; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return this.age; }
}
    

can be written in Scala thusly:

    
class Person(var firstName: String, var lastName: String, var age: Int)
    

Yes, that’s it. The constructor is the argument list to the class, where each parameter is declared as a variable (var keyword). It automatically generates the equivalent of getter and setter methods, meaning they look like Ruby-style attribute accessors; the getter is foo instead of getFoo and the setter is foo = instead of setFoo. Actually, the setter function is really foo_=, but Scala lets you use the foo = sugar.
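
Here is what that looks like to a client of the class (a small sketch):

    
val person = new Person("Dean", "Wampler", 29)
println(person.firstName)  // "getter": reads like a field access
person.age = 30            // "setter" sugar; actually calls person.age_=(30)
println(person.age)        // => 30
    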

Lots of other well-designed conventions allow the language to define almost everything as a method, yet support forms of syntactic sugar like the illusion of operator overloading, Ruby-like DSLs, etc.

You also get fewer semicolons, no requirements tying package and class definitions to the file system structure, type inference, multi-valued returns (tuples), and a better type and generics model.
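
Multi-valued returns, for instance, are just tuples; divmod here is an illustrative sketch:

    
def divmod(numerator: Int, denominator: Int) =
    (numerator / denominator, numerator % denominator)  // inferred return type: (Int, Int)

val (quotient, remainder) = divmod(7, 2)
println(quotient + " remainder " + remainder)           // => 3 remainder 1
    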

One of the biggest deficiencies of Java is the lack of a complete mixin model. Mixins are small, focused (think Single Responsibility Principle ...) bits of state and behavior that can be added to classes (or objects) to extend them as needed. In a language like C++, you can use multiple inheritance for mixins. Because Java only supports single inheritance and interfaces, which can’t have any state or behavior, implementing a mixin-based design has always required various hacks. Aspect-Oriented Programming is also one partial solution to this problem.

The most exciting OOP enhancement Scala brings is its support for Traits, a concept first described here and more recently discussed here. Traits support Mixins (and other design techniques) through composition rather than inheritance. You could think of traits as interfaces with implementations. They work a lot like Ruby modules.

Here is an example of the Observer Pattern written as traits, where they are used to monitor changes to a bank account balance. First, here are reusable Subject and Observer traits.

    
trait Observer[S] {
    def receiveUpdate(subject: S)
}

trait Subject[S] { 
    this: S =>
    private var observers: List[Observer[S]] = Nil
    def addObserver(observer: Observer[S]) = observers = observer :: observers

    def notifyObservers() = observers.foreach(_.receiveUpdate(this))
}
    

In Scala, generics are declared with square brackets, [...], rather than angled brackets, <...>. Method definitions begin with the def keyword. The Observer trait defines one abstract method, which is called by the Subject to notify the observer of changes. The Subject is passed to the Observer.

This trait looks exactly like a Java interface. In fact, that’s how traits are represented in Java byte code. If the trait has state and behavior, like Subject, the byte code representation involves additional elements.

The Subject trait is more complex. The strange line, this: S => , is called a self type declaration. It tells the compiler that whenever this is referenced in the trait, treat its type as S, rather than Subject[S]. Without this declaration, the call to receiveUpdate in the notifyObservers method would not compile, because it would attempt to pass a Subject[S] object, rather than a S object. The self type declaration solves this problem.

The next line creates a private list of observers, initialized to Nil, which is an empty list. Variable declarations are name: type. Why didn’t they follow Java conventions, i.e., type name? Because this syntax makes the code easier to parse when type inference is used, meaning where the explicit :type is omitted and inferred.

In fact, I’m using type inference for all the method declarations, because the compiler can figure out what each method returns, in my examples. In this case, they all return type Unit, the equivalent of Java’s void. (The name Unit is a common term in functional languages.)

The third line defines a method for adding a new observer to the list. Notice that concrete method definitions are of the form

    
def methodName(parameter: type, ...) = {
    method body
}  
    

In this case, because there is only one line, I dispensed with the {...}. The equals sign before the body emphasizes the functional nature of Scala: every method body is an expression whose value is returned. We’ll revisit this in a moment and in the next post.

The method body prepends the new observer object to the existing list. Actually, a new list is created. The :: operator, called “cons”, binds to the right; because the method name ends in a colon, it is invoked on its right-hand operand, so the expression could actually be written observers.::(observer).
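
To make that concrete:

    
val list1 = 1 :: 2 :: Nil    // right-associative: 1 :: (2 :: Nil)
val list2 = Nil.::(2).::(1)  // the same list, via explicit method calls
println(list1 == list2)      // => true
    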

Our final method in Subject is notifyObservers. It iterates through observers and invokes the block observer.receiveUpdate(this) on each observer. The _ evaluates to the current observer reference. For comparison, in Ruby, you would define this method like so:

    
def notifyObservers() 
    @observers.each { |o| o.receiveUpdate(self) }
end
    

Okay, let’s look at how you would actually use these traits. First, our “plain-old Scala object” (POSO) Account.

    
class Account(initialBalance: Double) {
    private var currentBalance = initialBalance
    def balance = currentBalance
    def deposit(amount: Double)  = currentBalance += amount
    def withdraw(amount: Double) = currentBalance -= amount
}
    

Hopefully, this is self explanatory, except for two things. First, recall that the whole class declaration is actually the constructor, which is why we have an initialBalance: Double parameter on Account. This looks strange to the Java-trained eye, but it actually works well and is another example of Scala’s economy. (You can define multiple constructors, but I won’t go into that here…).

Second, note that I omitted the parentheses when I defined the balance “getter” method. This supports the uniform access principle. Clients will simply call myAccount.balance, without parentheses and I could redefine balance to be a var or val and the client code would not have to change!

Next, a subclass that supports observation.

    
class ObservedAccount(initialBalance: Double) extends Account(initialBalance) with Subject[Account] {
    override def deposit(amount: Double) = {
        super.deposit(amount)
        notifyObservers()
    }
    override def withdraw(amount: Double) = {
        super.withdraw(amount)
        notifyObservers()
    }
}
    

The with keyword is how a trait is used, much the way that you implement an interface in Java, but now you don’t have to implement the interface’s methods. We’ve already done that.

Note that the expression, ObservedAccount(initialBalance: Double) extends Account(initialBalance), not only defines the (single) inheritance relationship, it also functions as the constructor’s call to super(initialBalance), so that Account is properly initialized.

Next, we have to override the deposit and withdraw methods, calling the parent methods and then invoking notifyObservers. Anytime you override a concrete method, Scala requires the override keyword. This tells you unambiguously that you are overriding a method, and the Scala compiler reports an error if you aren’t actually overriding one, e.g., because of a typo. Hence, the keyword is much more reliable (and useful…) than Java’s @Override annotation.

Finally, here is an Observer that prints to stdout when the balance changes.

    
class AccountReporter extends Observer[Account] {
    def receiveUpdate(account: Account) =
        println("Observed balance change: "+account.balance)
}
    

Rather than use with, I just extend the Observer trait, because I don’t have another parent class.

Here’s some code to test what we’ve done.

    
def changingBalance(account: Account) = {
    println("==== Starting balance: " + account.balance)
    println("Depositing $10.0")
    account.deposit(10.0)
    println("new balance: " + account.balance)
    println("Withdrawing $5.60")
    account.withdraw(5.6)
    println("new balance: " + account.balance)
}

var a = new Account(0.0)
changingBalance(a)

var oa = new ObservedAccount(0.0)
changingBalance(oa)
oa.addObserver(new AccountReporter)
changingBalance(oa)
    

Which prints out:

    
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 0.0
Depositing $10.0
new balance: 10.0
Withdrawing $5.60
new balance: 4.4
==== Starting balance: 4.4
Depositing $10.0
Observed balance change: 14.4
new balance: 14.4
Withdrawing $5.60
Observed balance change: 8.8
new balance: 8.8
    

Note that we only observe the transactions in the final changingBalance call, after the observer has been added.

Download Scala and try it out. Put all this code in one observer.scala file, for example, and run the command:

    
scala observer.scala
    

But Wait, There’s More!

In the next post, I’ll look at Scala’s support for Functional Programming and why OO programmers should find it interesting. In the third post, I’ll look at the specific case of concurrent programming in Scala and make some concluding observations of the pros and cons of Scala.

For now, here are some references for more information.

  • The Scala website, for downloads, documentation, mailing lists, etc.
  • Ted Neward’s excellent multipart introduction to Scala at developerWorks.
  • The forthcoming Programming in Scala book.

Always close() in a finally block

Posted by Dean Wampler Thu, 31 Jul 2008 05:12:00 GMT

Here’s one for my fellow Java programmers, but it’s really generally applicable.

When you call close() on I/O streams, readers, writers, network sockets, database connections, etc., it’s easy to forget the most appropriate idiom. I just spent a few hours fixing some examples of misuse in otherwise very good Java code.

What’s wrong with the following code?

    
public void writeContentToFile(String content, String fileName) throws Exception {
    File output = new File(fileName);
    OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream(output), "UTF-8");
    writer.write(content);
    writer.close();
}
    

It doesn’t look all that bad. It tells its story. It’s easy to understand.

However, it’s quite likely that you won’t get to the last line, which closes the writer, from time to time. File and network I/O errors are common. For example, what if you can’t actually write to the location specified by fileName? So, we have to be more defensive. We want to be sure we always clean up.

The correct idiom is to use a try … finally … block.

    
public void writeContentToFile(String content, String fileName) throws Exception {
    File output = new File(fileName);
    OutputStreamWriter writer = null;
    try {
        writer = new OutputStreamWriter(new FileOutputStream(output), "UTF-8");
        writer.write(content);
    } finally {
        if (writer != null)
            writer.close();
    }
}
    

Now, no matter what happens, the writer will be closed, if it’s not null, even if writing the output was unsuccessful.

Note that we don’t necessarily need a catch block, because in this case we’re willing to let any Exceptions propagate up the stack (notice the throws clause). A lot of developers don’t realize that there are times when you need a try block, but not necessarily a catch block. This is one of those times.

So, anytime you need to clean up or otherwise release resources, use a finally block to ensure that the clean up happens, no matter what.

Tag: How did I get Started in Software Development

Posted by Uncle Bob Tue, 29 Jul 2008 10:09:40 GMT

Micah tagged me with this “chain-blog”. I’ve enjoyed reading other people’s stories. You can read them too by just following the chain back to the start. (It’s a shame there’s no good way to do the forward links!)

Here’s my story.

How old were you when you started programming?

6th grade. That would have made me 11, so 1963. My mother bought me a little plastic computer named Digi-Comp I. This device contained 3 flip-flops and 6 “AND” gates that could be interconnected to create simple finite state automata. I played with it for weeks. I ordered the companion manual, “How to write programs for Digi Comp I”, which was a simple tutorial in boolean algebra. I inhaled it.

My freshman year in high school, the math department was considering purchasing a simple electronic educational computer. It was called an ECP-18. It had 1024 15-bit words of drum memory. It had the coolest front panel. You programmed it in machine language by toggling in the instructions.

I learned to program it by listening to the salesperson as he entered the diagnostic programs. He would mumble under his breath as he toggled them in. He’d punch in an octal 15 and mutter “store”, or an octal 12 and mutter “load”. Following the op-code he’d enter the memory address he was loading or storing. Fortunately I knew octal from my experience with Digi-Comp I, so I could follow along. After a while I started entering my own programs. Just simple things to compute 2x+4 or something like that. Fortunately for me, I always put the constant zero at the end of each of my programs because I used it to clear the main register (the accumulator). I didn’t know it, but zero was the op-code for halt.

What was the first real program you wrote?

Mr. Patterson’s Computerized Gate. This was a simple little finite state automaton that I designed for the Digi-Comp I. Mr. Patterson was a wise old man, and people would line up to talk with him. His gate would admit only one person at a time. When it detected that Mr. Patterson was free, it would open. When a new petitioner sat down, the gate would close again.

Is that a “real” program? Were any of the ECP-18 snippets I put together “real”? I once saw a demo of someone typing BASIC into a GE timesharing computer. I didn’t know BASIC, but I inferred the structure and started writing programs in it. I was never able to execute any of them. My father bought me books on Fortran, Cobol, PL/1. I inhaled them all. I wrote lots of programs in those languages, but I had no computers to execute them on.

Probably the first “real” programs I wrote were for an Olivetti/Underwood Programma 101. It was a programmable calculator the size of a microwave oven. My father took me to a science teacher conference. They had one on display. They let me play with it. I wrote programs to solve the Pythagorean theorem, etc.

What languages have you used since you started programming?

Egad! Fortran, Cobol, PL/1. BAL. PDP8 assembler, PDP11 assembler, 8080 assembler, Varian 620 assembler, GE Datanet 30 assembler, 6502 assembler, 68000 assembler, 8086 assembler, SNOBOL, LOGO, Smalltalk, Prolog, C, C++, Java, C#, Ruby, Forth, Postscript, Flex, etc. etc. etc. etc….

What was your first professional programming gig?

At the ripe old age of 16, I got a very temporary job writing Honeywell 200 assembler (which is a lot like 1401 assembler) for an actuarial firm named A.S.C. Tabulating.

If there is one thing you learned along the way that you would tell new developers, what would it be?

Being a programmer is like being a doctor or a lawyer. You must never stop learning. Just as doctors will read medical journals, and lawyers keep up with legal decisions, programmers must keep up with new languages, operating systems, frameworks, etc. Learn, learn, learn.

What’s the most fun you’ve ever had programming?

I can’t rank them. There are too many to count. I wrote a Lunar Lander game in Logo. I wrote a multi-tasking nucleus for an 8080 in C. Any time I went into a store with a C64 on display I’d type in a quick program to print / and \ randomly on the screen. I wrote an 8080 program in binary to control a set of relays to play “Mary had a little lamb”. The fun never stops!!!

Up Next

Brett Schuchert, Dave Nicolette, Martin Fowler, (Pragmatic) Dave Thomas, (OTI) Dave Thomas, Grady Booch, Bob Weissman

The Ascendency of Dynamic X vs. Static X, where X = ...

Posted by Dean Wampler Sun, 27 Jul 2008 02:48:00 GMT

I noticed a curious symmetry the other day. For several values of X, a dynamic approach has been gaining traction over a static approach, in some cases for several years.

X = Languages

The Ascendency of Dynamic Languages vs. Static Languages

This one is pretty obvious. It’s hard not to notice the resurgent interest in dynamically-typed languages, like Ruby, Python, Erlang, and even stalwarts like Lisp and Smalltalk.

There is a healthy debate about the relative merits of dynamic vs. static typing, but the “hotness” factor is undeniable.

X = Correctness Analysis

The Ascendency of Dynamic Correctness Analysis vs. Static Correctness Analysis

Analysis of code to prove correctness has been a research topic for years and the tools have become pretty good. If you’re in the Java world, tools like PMD and FindBugs find a lot of real and potential issues.

One thing none of these tools has ever been able to do is analyze the conformance of your code to your project’s requirements. I suppose you could probably build such tools using the same analysis techniques, but the cost would be prohibitive for individual projects.

However, while analyzing the code statically is very hard, watching what the code actually does at runtime is more tractable and cost-effective, using automated tests.

Test-driving code results in a suite of unit, feature, and acceptance tests that do a good enough job, for most applications, of finding logic and requirements bugs. The way test-first development improves the design helps ensure correctness in the first place.

It’s worth emphasizing that automated tests exercise the code using representative data sets and scenarios, so they don’t constitute a proof of correctness. However, they are good enough for most applications.

X = Optimization

The Ascendency of Dynamic Optimization vs. Static Optimization

Perhaps the least well known of these X’s is optimization. Mature compilers like gcc have sophisticated optimizations based on static analysis of code (you can see where this is going…).

On the other hand, the javac compiler does not do a lot of optimizations. Rather, the JVM does.

The JVM watches the code execute and it performs optimizations the compiler could never do, like speculatively inlining polymorphic method calls, based on which types are actually having their methods invoked. The JVM puts in low-overhead guards to confirm that its assumptions are valid for each invocation. If not, the JVM de-optimizes the code.

The JVM can do this optimization because it sees how the code is really used at runtime, while the compiler has no idea when it looks at the code.

Just as for correctness analysis, static optimizations can only go so far. Dynamic optimizations simply bypass a lot of the difficulty and often yield better results.

Steve Yegge provided a nice overview recently of JVM optimizations, as part of a larger discussion on dynamic languages.


There are other dynamic vs. static things I could cite (think networking), but I’ll leave it at these three, for now.

TDD is how I do it, not what I do

Posted by Brett Schuchert Mon, 21 Jul 2008 19:32:00 GMT

“Do not seek to follow in the footsteps of the men of old; seek what they sought.” ~Basho

That quote resonates with me. I happened across it a few days after co-teaching an “advanced TDD” course with Uncle Bob. One of the recurring themes during the week was that TDD is a “how”, not a “what”. It’s important to remember that TDD is not the goal; the results of successfully applying TDD are.

What are those results?

  • You could end up writing less code to accomplish the same thing
  • You might write better code that is less-coupled and more malleable
  • The code tends to be testable because, well, it IS tested
  • The coverage of your tests will be such that making significant changes will not be too risky
  • The number of defects should be quite low
  • The tests serve as excellent examples of how to use the various classes in your solution
  • Less “just in case” code gets written; such code typically doesn’t work in the cases it targeted anyway

Right now I do not know of a better way to accomplish all of these results more effectively than practicing TDD. Even so, this does not elevate TDD from a “how” to a “what.” TDD remains a technique to accomplish things I value. It is not a self-justifying practice. If someone asks me “why do we do it this way?”, saying something like “we practice TDD” or “well, you don’t understand TDD” is not a good answer.

We had an interesting result during that class. One group was practicing Bob’s three rules of TDD (paraphrased):
  • Write no production code without failing tests
  • Write only enough test code so that it fails (not compiling is failing)
  • Write only enough production code to get your tests to pass.
But they ended up with a bit of a mess. Following the three rules wasn’t enough. These rules are guiding principles, but they mean nothing if you forget about clean code, refactoring, and basic design principles (here are a few):
  • S.O.L.I.D.
  • F.I.R.S.T.
  • Separation of Concerns (Square Law of Computation)
  • Coupling/Cohesion
  • Protected Variation
  • ...

TDD is a means to an end; it is the end we care about. What is that end? Software that has few defects and is easy to change. Tests give us that. Not testing generally does not give us that. And testing in a common “QA over the wall” configuration typically does not cut it.

Since I do not know how to produce those results as easily in any other way, TDD becomes the de facto means of implementation for me. That doesn’t mean I should turn a blind eye to new ways of doing things. In lieu of any such information, however, I’ll pick TDD as a starting point. This is still a “how” and not a “what”.

It turns out that for me there are several tangible benefits I’ve personally experienced from practicing TDD:
  • Increased confidence in the code I produce (even more than when I was simply test infected)
  • Less worrying about one-off conditions and edge cases. I’ll get to them, and as I think about them, they become tests
  • Fun

Fun?

Yes I wrote fun. There are several aspects of this:
  • I seem to produce demonstrable benefits sooner
  • I actually do more analysis throughout
  • I get to do more OO programming

Demonstrable Benefits Sooner

Since I focus on one test at a time, I frequently get back to running tests. I’m able to see results sooner. Sure, those results are sometimes disjoint and piecemeal, but over time they organically grow into something useful. I really enjoy teaching a class and moving from a trivial test to a suite of tests that together produce what students can see is a tangible implementation of something complex.

More Analysis

Analysis means to break into constituent parts. When I practice TDD, I think about some end (say a user story or a scenario), then I think about a small part of that overall work and tackle it. In the act of getting to a test, I’m doing enough analysis to figure out at least some of the parts of what I’m trying to do. I’m breaking my goal into parts; that’s a pretty good demonstration of analysis.

More OO

I like polymorphism. I like lots of shallow, but broad hierarchies. I prefer delegation to inheritance. But often, the things I’m writing don’t need a lot of this – or so it might seem.

When I try to create a good unit test, much of what I’m doing is trying to figure out how the effect I’m shooting for can be isolated to make the test fast, independent, reliable… To do so, I make heavy use of test doubles. Sometimes I hand-roll them, sometimes I use mocking libraries. I’ve even used AOP frameworks, but not nearly as extensively.

Doing all of this allows me to use polymorphism more often. And that’s fun.

Conclusion

Am I wasting time writing all of these tests? Is my enjoyment of my work an indication that I might be wasting the time of my product owner?

Those are good questions. And these are things you might want to ask yourself.

Personally, I’m pretty sure I’m not wasting anyone’s time for several reasons:
  • The product owner is keeping me focused on things that add value
  • Short iterations keep me reined in
  • I’m only doing as much as necessary to get the stories for an iteration implemented
  • The tests I’m writing stay passing, run quickly and over time remain (become) maintainable

Even so, since TDD is a how and not a what, I still need to keep asking myself if the work I’m doing is moving us towards a working solution that will be maintainable during its lifetime.

I think it is. What about you?

It's all in how you approach it

Posted by Brett Schuchert Mon, 21 Jul 2008 18:15:00 GMT

I was painting a bedroom over the last week. Unfortunately, it was well populated with furniture, a wall-mounted TV that needed lowering, clutter, the usual stuff. Given the time I had available, I didn’t think I’d be able to finish the whole bedroom before having to travel again.

I decided to tackle the wall with the wall-mounted TV first, so I moved the furniture to make enough room, taped just that wall (but not the ceiling since I was planning on painting it) and then proceeded to apply two coats of primer and two coats of the real paint. I subsequently moved around to an alcove and another wall and the part of the ceiling I could reach without having to rent scaffolding.

I managed to get two walls done and everything moved back into place before I left for another business trip. My wife is happy because the bedroom looks better. I did no damage and made noticeable progress. I still have some Painting to do (the capital P is to indicate it will be a Pain). I eventually have to move the bed, rent scaffolding, and so on. That’s probably more in the future than I’d prefer, but I’ll do it when I know I have the time and space to do it.

Contrast this to when we bought the house back in March. I entered an empty house. I managed to get two bedrooms painted (ceilings included) and the “grand” room with 14’ vaulted ceilings. I only nearly killed myself once – don’t lean a ladder against a wall with its legs on plastic – and it was much easier to move around. I had a clean slate.

Sometimes you’ve got to get done what you can get done to make some progress. When I was younger, my desire to finish the entire bedroom might have stopped me from making any progress. Sure, the bedroom is now half old paint and half new paint, but according to my wife it looks better – and she’s the product owner! I can probably do one more wall without having to do major lifting and when I’m finally ready to rent the scaffolding, I won’t have as much to do. I can break down the bed, rent the scaffolding and then in one day I might be able to finish the remainder of the work. (Well probably 2 days because I’ll end up wanting to apply 2 coats to the ceiling and I’ll need to wait 8 hours).

Painting is just like software development.

Bauble, Bauble...

Posted by Uncle Bob Sun, 20 Jul 2008 20:42:29 GMT

In Ruby, I hate require statements that look like this:

require File.dirname(__FILE__)+"/myComponent/component.rb"

So I decided to do something about it.

This all started when my son, Micah, told me about his Limelight project. Limelight is a JRuby/Swing GUI framework. If you want to build a fancy GUI in Ruby, consider this tool.

I have neither the time nor inclination to write a framework like this; but my curiosity was piqued. So in order to see what it was like to do Swing in JRuby I spent a few hours cobbling together an implementation of Langton’s Ant. This turned out to be quite simple.

The result, however, was a mess. There was swing code mixed up with “ant” code, in the classic GUI/Business-rule goulash that we “clean-coders” hate so much. Despite the fact that this was throw-away code, I could not leave it in that state – the moral outrage was just too great. So I spent some more time separating the program into two modules.

The first module knew all about Langton’s ant, but nothing about Swing. The second module was a tiny framework for implementing cellular automata in Swing. (Here are all the files).

I was quite happy with the separation, but did not like the horrible require statements that I had to use. The cellular_automaton component had two classes, in two separate files. In order to get the require right, I had to either use absolute directory paths, or the horrible File.dirname(__FILE__)... structure.

What I wanted was for cellular_automaton to behave like a gem. But I didn’t want to make it into a gem. Gems are kind of “heavy” for a dumb little thing like cellular_automaton.

So I created a module named “Bauble” which gave me some gem-like behaviors. Here it is:


module Bauble
  def self.use(bauble)
    bauble_name = File.basename(bauble)
    ensure_in_path "#{bauble}/lib" 
    require bauble_name
  end

  def self.ensure_in_path(path)
    $LOAD_PATH << path unless $LOAD_PATH.include? path
  end
end

This is no great shakes, but it solved my problem. Now, in my Langton’s Ant program, all I need to do is this:


require 'bauble'
Bauble.use('../cellular_automaton')

All the ugly requires are gone.

I’m thinking about turning Bauble into a rubyforge project, and making a publicly available gem out of it in order to give folks a standard way to avoid those horrible __FILE__ requires. I think there are several other utilities that could be placed in Bauble such as require_relative etc.

Anyway, what do you think?

I love the 90's: The Fusion Episode

Posted by Brett Schuchert Wed, 02 Jul 2008 20:55:00 GMT

A few weeks back I was working with a team on the East Coast. They wanted to develop a simulator to assist in testing other software components. The system to be simulated is well described in a specification, using diagrams close to the sequence diagrams described in the UML.

In fact, these diagrams were of a variety I’d call “system” sequence diagrams. They described the interaction between outside entities (actors – in this case another system) and the system to be simulated.

This brought me back to 1993, when I was introduced to The Fusion Method by Coleman et al. Before that I had read Booch (1 and 2) and Rumbaugh (OMT), and I honestly didn’t follow much of their material – I had book knowledge but I really didn’t practice it. I always thought that Booch was especially strong in design ideas and notation but weak in analysis. I thought the opposite of Rumbaugh, so the two together + Jacobson with Use Cases and Business Modeling really formed a great team in terms of covering the kinds of things you need to cover in a thorough software process (UP + UML).

But before all that was Fusion.

Several colleagues and I really grokked Fusion. It started with system sequence diagrams showing interactions much like the specification I mentioned above. It also described a difference between analysis and design (and if Uncle Bob reads this, he’ll probably have some strong words about so-called object-oriented analysis; well, this was 15 years ago… though I still think there is some value to be found there). Anyway, this is mostly about system sequence diagrams, so I won’t say much more about that excellent process.

Each system sequence diagram represented a scenario. To represent many scenarios, Fusion offered a BNF-based syntax. (I understand that this was also because, for political reasons within HP, they were not allowed to include a state model, but I don’t know if that is true.) For several years I practiced Fusion, and I still often revert back to it if I’m not trying to do anything in particular.

Spending a little time up front thinking a little about the logical interaction between the system and its various actors helps me get a big picture view of the system boundary and its general flow. I have also found it helps others as well, but your mileage may vary.

So when I viewed their protocol specification, it really brought back some good memories. And in fact, that’s how we decided to look at the problem.

(What follows is highly-idealized)

We reviewed their specification and decided we’d try to work through the initialization sequence and then work through one sequence that involved “completing a simple job.” I need to keep this high level to keep the identity of the company a secret.

There was prior work and we kept that in mind, but really we started from scratch. In our very first attempt, there had been some work done along the lines of using the Command pattern, so we started there. Of course, once we did our first command, we backed off and went with a more basic design that seemed to fit the complexity a bit better (starting with the Command pattern at the beginning is an example of solution-problemming, to use a Weinberg term – and one of the reasons I’m sometimes skeptical when people start talking in patterns).

We continued working from the request coming into the system and working its way through the system. Along the way, we wrote unit tests, driven by our end goal of trying to complete a simple job and guided by the single responsibility principle. As we thought about the system, there were several logical steps:
  • Receive a message from the outside as some array of bytes
  • Determine the “command” represented by the bytes
  • Process the parameters within the command
  • Issue a message to the simulator
  • Create a logical response
  • Format the logical response into the underlying protocol
  • Send the response back
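
Those steps might be sketched like this (all names are invented for illustration; the real system was Java talking to C++ over JNI):

    
trait Command { def execute(): Response }    // one parsed request
trait Response { def toBytes: Array[Byte] }  // one logical reply

class Dispatcher(parse: Array[Byte] => Command, send: Array[Byte] => Unit) {
    def handle(message: Array[Byte]): Unit = {
        val command = parse(message)      // determine the command and its parameters
        val response = command.execute()  // do the work; build the logical response
        send(response.toBytes)            // format the response and send it back
    }
}
    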

At the time, they were considering using JNI, so we spent just over a day validating that we could communicate bi-directionally, maintaining a single process space.

Along the way we moved from using hand-rolled test doubles to using JMock 2 to create mock objects. I mentioned this to a friend of mine, who lamented that there are several issues with using a mock-based approach:
  • It is easy to end up with a bunch of tested objects but no fully-connected system
  • Sharing setup between various mocks is difficult and often not done so there’s a lot of violation of DRY
  • You have to learn a new syntax

We accepted learning a new syntax because it was deemed less painful than maintaining the existing hand-rolled test doubles (though there are several reasonable solutions for that; ask if you want to know what they are). There is the issue of sharing setup on mocks, but we did not have enough work yet to really notice that as a problem. However, they were at least aware of it, and we briefly discussed how to share common expectation-setting (it’s well supported).

Finally, there’s the issue of not having a fully connected system. We knew this was an issue so we started by writing an integration test using JUnit. We needed to design a system that:
  • Could be tested up to but excluding the JNI stuff
  • Could be configured to stub out JNI or use real JNI
  • Was easily configurable
  • Was automatically configured by C++ (since it was a C++ process that was started to get the whole system in place)

We designed that (15 minute white-board session), coded it and ended up with a few integration tests. Along the way, we built a simple factory for creating the system fully connected. That factory was used both in tests as well as by the JNI-based classes to make sure that we had a fully-connected systems when it was finally started by C++.

Near the end, we decided we wanted to demonstrate asynchronous computation, which we did using tests. I stumbled a bit but we got it done in a few hours. We demonstrated that the system receiving messages from the outside world basically queued up requests rather than making the sender wait synchronously (we demonstrated this indirectly – that might be a later blog post – let me know if you’re interested).

By the way, that was the first week. These guys were good and I had a great time.

There was still a little work to be done on the C++ side and I only had a week, so I asked them to keep me posted. The following Tuesday they had the first end-to-end interaction, system initialization.

By Wednesday (so 3 business days later), they had a complete demonstration of end-to-end interaction with a single, simple job finishing. Not long after that they demonstrated several simple jobs finishing. The next thing on their list? Completing more complex jobs, system configuration, etc.

However, it all goes back to having a well-defined protocol. After we had one system interaction described end-to-end, doing the next thing was easier:
  • Select a system interaction
  • List all of the steps it needs to accomplish (some requests required a response, some did not)
  • Write unit tests for each “arm” of the interaction
So they had a very natural way to form the backlog:
  • Select a set of end-to-end interactions that add value to the user of the system
They also had an easy way to create a sprint backlog:
  • For each system-level interaction, enumerate all of its steps and then add implementing those steps as individual backlog items

Now some of those individual steps will end up being small (less than an hour) but some will be quite large when they start working with variable parameters and commands that need to operate at a higher priority.

But they are well on their way and I was reminded of just how much I really enjoyed using Fusion.

Contracts and Integration Tests for Component Interfaces

Posted by Dean Wampler Mon, 30 Jun 2008 02:54:00 GMT

I am mentoring a team that is transitioning to XP, the first team in a planned, corporate-wide transition. Recently we ran into miscommunication problems about an interface we are providing to another team.

The problems didn’t surface until a “big-bang” integration right before a major release, when it was too late to fix the problem. The feature was backed out of the release, as a result.

There are several lessons to take away from this experience and a few techniques for preventing these problems in the first place.

End-to-end automated integration tests are a well-established way of catching these problems early on. The team I’m mentoring has set up its own continuous-integration (CI) server and the team is getting pretty good at writing acceptance tests using FitNesse. However, these tests only cover the components provided by the team, not the true end-to-end user stories. So, they are imperfect as both acceptance tests and integration tests. Our longer-term goal is to automate true end-to-end acceptance and integration tests, across all components and services.

In this particular case, the other team is following a waterfall-style of development, with big design up front. Therefore, my team needed to give them an interface to design against, before we were ready to actually implement the service.

There are a couple of problems with this approach. First, the two teams should really “pair” to work out the interface and behavior across their components. As I said, we’re just starting to go Agile, but my goal is to have virtual feature teams, where members of the required component teams come together as needed to implement features. This would help prevent the miscommunication of one team defining an interface and sharing it with another team through documentation, etc. Getting people to communicate face-to-face and to write code together would minimize miscommunication.

Second, defining a service interface without the implementation is risky, because it’s very likely you will miss important details. The best way to work out the details of the interface is to test drive it in some way.

This suggests another technique I want to introduce to the team. When defining an interface for external consumption, don’t just deliver the “static” interface (source files, documentation, etc.), also deliver working Mock Objects that the other team can test against. You should develop these mocks as you test drive the interface, even if you aren’t yet working on the full implementation (for schedule or other reasons).

The mocks encapsulate and enforce the behavioral contract of the interface. Design by Contract is a very effective way of thinking about interface design and implementing automated enforcement of it. Test-driven development mostly serves the same practical function, but thinking in “contractual” terms brings clarity to tests that is often missing in many of the tests I see.
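
For example, a delivered mock for a hypothetical service interface might enforce its contract like this sketch (AccountService and its rules are invented for illustration):

    
trait AccountService {
    def withdraw(accountId: String, amount: Double): Double  // returns the new balance
}

// The mock the consuming team tests against; it enforces the contract.
class MockAccountService extends AccountService {
    private var balance = 100.0
    def withdraw(accountId: String, amount: Double): Double = {
        require(amount > 0.0, "precondition: amount must be positive")
        require(amount <= balance, "precondition: sufficient funds")
        balance -= amount
        balance  // postcondition: the result is the reduced balance
    }
}
    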

Many developers already use mocks for components that don’t exist yet and find that the mocks help them design the interfaces to those components, even while the mocks are being used to test clients of the components.

Of course, there is no guarantee that the mocks faithfully represent the actual behavior, but they will minimize surprises. Whether you have mocks or not, there is no substitute for running automated integration tests on real components as soon as possible.
