The Open-Closed Principle for Languages with Open Classes

Posted by Dean Wampler Fri, 05 Sep 2008 02:42:00 GMT

We’ve been having a discussion inside Object Mentor World Design Headquarters about the meaning of the OCP for dynamic languages, like Ruby, with open classes.

For example, in Ruby it’s normal to define a class or module, e.g.,

    
# foo.rb
class Foo
    def method1 *args
        # ...
    end
end
    

and later re-open the class and add (or redefine) methods,

    
# foo2.rb
class Foo
    def method2 *args
        # ...
    end
end
    

Users of Foo see all the methods, as if Foo had one definition.

    
foo = Foo.new
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
    

Do open classes violate the Open-Closed Principle? Bertrand Meyer articulated the OCP. Here is his definition.¹

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.

He elaborated on it here.

... This is the open-closed principle, which in my opinion is one of the central innovations of object technology: the ability to use a software component as it is, while retaining the possibility of adding to it later through inheritance. Unlike the records or structures of other approaches, a class of object technology is both closed and open: closed because we can start using it for other components (its clients); open because we can at any time add new properties without invalidating its existing clients.

(From “Tell Less, Say More: The Power of Implicitness.”)

So, if one client requires only foo.rb and uses only method1, that client doesn’t care what foo2.rb does. However, if the client also requires foo2.rb, perhaps indirectly through another require, problems will ensue unless the client is unaffected by what foo2.rb does. This looks a lot like the way “good” inheritance should behave.

So, the answer is no, we aren’t violating OCP, as long as we extend a re-opened class following the same rules we would use when inheriting from it.

If we use inheritance instead:

    
# foo.rb
class Foo
    def method1 *args
        # ...
    end
end
# ...
class DerivedFoo < Foo
    def method2 *args
        # ...
    end
end
# ...
foo = DerivedFoo.new    # Instantiate the derived class...
foo.method1 :arg1, :arg2
foo.method2 :arg1, :arg2
    

One notable difference is that we have to instantiate a different class, and that is an important difference. While you can often just use inheritance, and maybe you should prefer it, inheritance only works if you have full control over what types get instantiated and it’s easy to change which types you use. Of course, inheritance is also the best approach when you need all behavioral variants simultaneously, i.e., each variant in one or more objects.

Sometimes you want to affect the behavior of all instances transparently, without changing the types that are instantiated. A slightly better example, logging method calls, illustrates the point. Here we use the “famous” alias_method in Ruby.

    
# foo.rb
class Foo
    def method1 *args
        # ...
    end
end

# logging_foo.rb
class Foo
    alias_method :old_method1, :method1
    def method1 *args
        p "Inside method1(#{args.inspect})"
        old_method1 *args
    end
end
# ...
foo = Foo.new
foo.method1 :arg1, :arg2
    

Foo#method1 behaves like a subclass override, with extended behavior that still obeys the Liskov Substitution Principle (LSP).

So, I think the OCP can be reworded slightly.

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source modification.

We should not re-open the original source, but adding functionality through a separate source file is okay.

Actually, I prefer a slightly different wording.

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for source and contract modification.

The extra “and contract” is redundant with the LSP. I don’t think this kind of redundancy is necessarily bad. ;) The contract is the set of behavioral expectations between the “entity” and its client(s). Just as it is bad to break the contract with inheritance, it is also bad to break it through open classes.
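
For illustration, here is a minimal sketch of a re-opening that breaks the contract, assuming for the example that clients of Foo rely on method1 accepting arguments:

    
# bad_foo.rb
# Re-opening Foo and redefining method1 so that it rejects arguments
# breaks the implicit contract with existing clients, exactly as a
# contract-violating subclass override would.
class Foo
    def method1 *args
        raise ArgumentError, "method1 no longer takes arguments" unless args.empty?
        # ...
    end
end
    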

OCP and LSP together are our most important design principles for effective organization of similar vs. variant behaviors. Inheritance is one way we do this. Open classes provide another way. Aspects provide a third way and are subject to the same design issues.

¹ Meyer, Bertrand (1988). Object-Oriented Software Construction. Prentice Hall. ISBN 0136290493.

The Ascendency of Dynamic X vs. Static X, where X = ...

Posted by Dean Wampler Sun, 27 Jul 2008 02:48:00 GMT

I noticed a curious symmetry the other day. For several values of X, a dynamic approach has been gaining traction over a static approach, in some cases for several years.

X = Languages

The Ascendency of Dynamic Languages vs. Static Languages

This one is pretty obvious. It’s hard not to notice the resurgent interest in dynamically-typed languages, like Ruby, Python, Erlang, and even stalwarts like Lisp and Smalltalk.

There is a healthy debate about the relative merits of dynamic vs. static typing, but the “hotness” factor is undeniable.

X = Correctness Analysis

The Ascendency of Dynamic Correctness Analysis vs. Static Correctness Analysis

Analysis of code to prove correctness has been a research topic for years and the tools have become pretty good. If you’re in the Java world, tools like PMD and FindBugs find a lot of real and potential issues.

One thing none of these tools has ever been able to do is analyze the conformance of your code to your project’s requirements. I suppose you could build such tools using the same analysis techniques, but the cost would be prohibitive for individual projects.

However, while analyzing the code statically is very hard, watching what the code actually does at runtime, using automated tests, is more tractable and cost-effective.

Test-driving code results in a suite of unit, feature, and acceptance tests that do a good enough job, for most applications, of finding logic and requirements bugs. The way test-first development improves the design helps ensure correctness in the first place.

It’s worth emphasizing that automated tests exercise the code using representative data sets and scenarios, so they don’t constitute a proof of correctness. However, they are good enough for most applications.
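
As a minimal sketch of the kind of executable check meant here (the Account class and its overdraft rule are invented for this example):

    
require 'test/unit'

# Hypothetical class under test: a tiny account with an overdraft rule.
class Account
    attr_reader :balance
    def initialize balance
        @balance = balance
    end
    def withdraw amount
        raise ArgumentError, "insufficient funds" if amount > @balance
        @balance -= amount
    end
end

class AccountTest < Test::Unit::TestCase
    # A requirements-level check no static analyzer could infer:
    # an overdraft must be rejected and must leave the balance unchanged.
    def test_withdraw_rejects_overdraft
        account = Account.new 100
        assert_raise(ArgumentError) { account.withdraw 150 }
        assert_equal 100, account.balance
    end
end
    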

X = Optimization

The Ascendency of Dynamic Optimization vs. Static Optimization

Perhaps the least well known of these X’s is optimization. Mature compilers like gcc have sophisticated optimizations based on static analysis of code (you can see where this is going…).

On the other hand, the javac compiler does not do a lot of optimizations. Rather, the JVM does.

The JVM watches the code execute and performs optimizations the compiler could never do, like speculatively inlining polymorphic method calls based on which receiver types actually show up at runtime. The JVM puts in low-overhead guards to confirm that its assumptions remain valid for each invocation. If not, the JVM de-optimizes the code.

The JVM can do this optimization because it sees how the code is really used at runtime, while a static compiler can only guess when it looks at the source.
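
To make the idea concrete, here is a toy sketch, in Ruby, of the guard-and-fall-back pattern. This is only a caricature of what the JIT emits as machine code, not a description of any real JVM, and the Circle example is invented:

    
# Toy illustration: what a JIT effectively produces for a call site
# that has only ever seen Circle receivers.
class Circle
    attr_reader :radius
    def initialize radius
        @radius = radius
    end
    def area
        3.14159 * radius * radius
    end
end

def speculative_area shape
    if shape.instance_of? Circle               # low-overhead guard
        3.14159 * shape.radius * shape.radius  # "inlined" body of Circle#area
    else
        shape.area                             # de-optimize: normal dynamic dispatch
    end
end
    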

Just as for correctness analysis, static optimizations can only go so far. Dynamic optimizations simply bypass a lot of the difficulty and often yield better results.

Steve Yegge provided a nice overview recently of JVM optimizations, as part of a larger discussion on dynamic languages.


There are other dynamic vs. static things I could cite (think networking), but I’ll leave it at these three, for now.

What We Can Learn from the Ojibwe Language

Posted by Dean Wampler Sat, 03 May 2008 14:48:57 GMT

Ojibwe (sometimes spelled Ojibwa; the last syllable is pronounced “way”) is one of the few Native American languages that isn’t immediately threatened with extinction. It is spoken by about 10,000 people around the Great Lakes region. Brothers David and Anton Treuer are helping to keep it alive, as they discussed in a recent Fresh Air interview.

Ojibwe is a language that is optimized for an aboriginal people whose lives and livelihoods depend on an intimate awareness of their environment, especially the weather and water conditions. They have many nouns and verbs for fine gradations of rain, snow, ice conditions, the way water waves look and sound, etc. You would want this clarity of detail if you ventured out on a lake every day to fish for dinner.

In the past, speaking languages like Ojibwe was actively suppressed by the government, in an attempt to assimilate Native Americans. Today, the threat of extinction is more from the sheer ubiquity of English. I think there is another force at play, too. People living a modern, so-called “developed” lifestyle just don’t need to be so aware of their environment anymore. In fact, most of us are pretty “tone deaf” to the nuances of weather and water, which is sad in a way. We just don’t perceive the need for the richness of an Ojibwe to communicate what’s important to us, like sports trivia and fashion tips.

So, what does Ojibwe have to do with programming languages? Our language choices inform the way we frame problem solving and design. I was reminded of this recently while reading Ted Neward’s series of articles on Scala. Scala is a JVM language that provides first-class support for functional programming and object-oriented design refinements like traits, which provide mixin behavior.

While you can write Java-like code in Scala, Neward demonstrates how exploiting Scala features can result in very different code for many problems. The Scala examples are simpler, but sometimes that simplicity only becomes apparent after you grasp the underlying design principle in use, like closures or functional idioms.

One of the best pieces of advice in The Pragmatic Programmer is to learn a new language every year. You should pick a language that is very different from what you know already, not one that is fundamentally similar. Even if you won’t use that language in your professional work, understanding its principles, patterns, and idioms will inform your work in whatever languages you actually use.

For example, there is a lot of fretting these days about concurrent programming, given the rise of multi-core CPUs and multiprocessor computers. We know how to write concurrent programs in our most popular imperative languages, like Java and C++, but that knowledge is somewhat specialized and not widely known in the community. This is the main reason that functional programming is suddenly interesting again. It is inherently easier to write concurrent applications using side-effect-free code. I expect that we will fail to meet the concurrency challenge if we rely exclusively on the mechanisms in our imperative languages.

So, you could adopt a functional language for all or part of your concurrent application. Or, if you can’t use Scala (or Haskell or Erlang or …), you could at least apply functional idioms, like side-effect-free functions and avoidance of mutable objects, in your current imperative language, as sketched below. However, even that won’t be an option unless you understand those principles in the first place.
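
As a small sketch with invented names, here is the same transformation written first with in-place mutation and then in a side-effect-free style that is safer to run concurrently:

    
prices = [10.0, 25.0, 40.0]

# Imperative style: mutates shared state in place, which is risky if
# another thread reads or writes `discounted` concurrently.
discounted = []
prices.each { |p| discounted << p * 0.9 }

# Functional style: no mutation of existing data; just build and
# return a new, frozen value.
DISCOUNT = 0.9
discounted = prices.map { |p| p * DISCOUNT }.freeze
    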

Learning a new language is more than learning a new vocabulary. It’s even more than learning new design techniques. It’s also learning to see common things from a fresh perspective, with greater clarity.

Strongly Typed Languages Considered Dangerous

Posted by Brett Schuchert Thu, 23 Aug 2007 23:37:00 GMT

Have you ever heard of covariance? Method parameters or return types are said to be covariant if, as you work your way down an inheritance hierarchy, an overriding method may use subtypes of the formal parameter and return types declared in the superclass’s version of the method. (The classic example is an overriding clone-style method that returns the subclass type rather than the base type.)

Oh, and contravariance is just the opposite.

What?! Why should you care? Answer: you shouldn’t. Yes, C++ and Java both support covariant return types, but so what? Have you ever used them? OK, I have, but then I also used and liked C++ for about 7 years, over 10 years ago. We all learn to move on.

Have you ever noticed how something meant to help often (typically?) turns out to do exactly the opposite? Even worse, it directly supports or enables another unfortunate behavior. This is just Weinberg’s first rule of problem solving:

Every solution introduces new problems

Here’s an example I’m guilty of. Several years ago I was working on a project where we had written many unit tests. We had not followed the FIRST principles, specifically R-Reliability. Our tests were hugely affected by the environment. We had copies of real databases that would get overwritten without warning. We’d have problems with MQ timeouts at certain times of the day, or sometimes for days. And our tests would fail.

We would generally have “events” that would cause tests to fail for some time. We were using continuous integration, and so to “fix” the problem, I created a class we called “TimeBomb” – turns out it was the right name for the wrong reason.

You’d put a TimeBomb in a test and then comment out the test. The TimeBomb would be given a date. Until that date, the test would “pass” – or rather, be ignored. At some point in the future, the TimeBomb would expire and the tests would start failing again.
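
The original class isn’t shown here, but a rough Ruby reconstruction of the idea might look like this (the project’s actual code and names surely differed):

    
require 'date'

# A reminder that silently "passes" until its expiration date, then
# starts failing so the disabled test cannot be forgotten forever.
class TimeBomb
    def initialize expiration_date
        @expiration_date = expiration_date
    end

    def check!
        if Date.today >= @expiration_date
            raise "TimeBomb expired on #{@expiration_date}: fix the disabled test!"
        end
    end
end

# Usage inside a test whose real assertions are commented out:
# TimeBomb.new(Date.new(2008, 1, 1)).check!
    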

I was so proud of it, I even took the time to write something up about it here. I had nothing but the best intentions. I wanted an active mechanism to remind us of work that needed to be done. We had so many todo’s and warnings that anything short of an active reminder would simply be missed. I also wanted CI to “work.” But what eventually happened was that as our tests kept failing, we’d simply keep moving the TimeBomb date out.

I wrote something that enabled our project to collect more design debt. As I said, my intentions were noble. But the road to Hell is paved with good intentions. Luckily I got the name right. The thing that was really blowing up, however, was the project itself.

What has all of this got to do with “strongly typed languages”?

If you’re working with a strongly typed language, you will have discussions about covariance (well, I do anyway). You’ll discuss the merits of multiple inheritance (it really is a necessary feature – let the flames rise, I’m already in Hell from my good intentions). You’ll also discuss templates/generics. The list goes on and on.

If you’re working with a dynamically typed language (the first one I used professionally was Smalltalk, but I also used Self, Objective-C, which lets you choose, and several other languages whose names I do not recall), covariance is not even an issue. The same can be said of generics/templates, and even multiple inheritance is less of an issue. In a strongly typed language, things like multiple inheritance are necessary if you want your type system to be complete. (If you don’t believe me, read a copy of Object-Oriented Software Construction, 2nd ed. by Bertrand Meyer – an excellent read.)

Ever created a mock object in a typed language? You either need an interface or at least a non-final class. What about Ruby or Smalltalk? Nope. Neither language cares about a class’s interface until the code executes. And neither language cares how a class is able to respond to a particular message, just that it does. It’s even possible in both languages to NOT have a method and still work, if you mess with the metaobject protocol (method_missing in Ruby, doesNotUnderstand: in Smalltalk).
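
For instance, a hand-rolled Ruby test double needs no interface, no non-final class, and no mocking framework. This sketch, with invented names, shows a plain duck-typed fake and a method_missing-based stub:

    
# A test double: any object that responds to #charge will do, because
# Ruby only checks for the method at the moment the message is sent.
class FakeGateway
    attr_reader :charged
    def charge amount
        @charged = amount
        :ok
    end
end

# With method_missing, an object can respond without defining the
# method at all.
class RecordingStub
    def method_missing name, *args
        p "received #{name}(#{args.inspect})"
        :ok
    end
    def respond_to_missing? name, include_private = false
        true
    end
end

checkout = lambda { |gateway, amount| gateway.charge amount }
checkout.call FakeGateway.new, 100    # => :ok
checkout.call RecordingStub.new, 100  # prints "received charge([100])"
    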

OK, but still, does this make typed languages bad?

Let’s go back to that issue of enabling bad things.

Virtual machines, like the JVM and CLR, have made amazing strides in the past few years. Reflection keeps getting faster. Just-in-time compilers keep getting smarter. The cost of intrinsic locks has come down, and Java 5’s libraries support non-blocking, safe, multi-threaded updates; modern processors support such operations using an optimistic approach. Heck, Java 6 even does some cool stuff with local object references to significantly improve Java’s memory usage. Java runs pretty fast.

Dynamic languages, generally, are not there yet. I hope they get there, but they simply are not there. But so what?! If your system runs “fast enough” then it’s fast enough. People used Smalltalk for real applications years ago – and still do to some extent. Ruby is certainly fast enough for a large class of problems.

How is that? These languages force you to write well. If you do not, then you will write code that is not “fast enough.” I’ve seen very poorly performing Smalltalk solutions, but it was never because of Smalltalk; it was because of poor programming practices. Are there things that won’t currently perform fast enough in dynamically typed languages? Yes. Are most applications like that? Probably not.

You can’t get away with as much in a dynamically typed language. That sounds ironic. On the one hand you have amazing power with dynamically typed languages. Of course Peter Parker learned that “With great power comes great responsibility.” This is just as true with Ruby and other dynamically typed languages.

You do not have as much to guide you in a dynamically typed language. Badly written, poorly organized code in a typed language is hard to follow, but it’s possible. In a language like Ruby or Smalltalk it’s still possible, but it’s a lot harder. Such poor approaches will typically fail sooner. And that’s GOOD! You’ve wasted less money because you failed sooner. If you’ve got a crappy design, you’re going to fail. The issue is how much time/money you will spend to get there.

Another thing that strongly typed languages offer is a false sense of security because of compiler errors. I have heard many people deride the need for unit tests because the compiler “will catch it.”

This misses a significant point: unit tests serve as a specification of behavior, not just a validation of the code as written.

You cannot get away with as much in a dynamically typed language. Or, put another way, a dynamically typed language does not enable you to be as sloppy. It’s just a fact. You can still accomplish a lot in a dynamically typed language; you just have to do it well.

Does this mean that dynamically typed languages are harder to work in? Maybe. But if you follow TDD, then the language is less important.

Do we need fast, typed languages? Clearly we do. There are applications where having such languages is necessary. Device drivers are not written in Ruby (maybe they are, I’m just trying to be more balanced).

However, how many of you were around when games were written in assembly? C was not fast enough. Then C was fast enough but C++ was still a question mark. C++ and C became mainstream but Java was just too slow. Some games are getting written in Java now. Not many, but it’s certainly possible. It is also possible to use multiple languages and put the time-critical stuff, which is generally a small part of your system, in one language and use another language for the rest of the system.

Back in the very early ’90s I worked just a little bit with Self. At that time, its just-in-time compiler would create machine code from byte code for methods that got executed. It went one step further, however. Not only did it JIT-compile a method for an object, it would actually do so for combinations of receiver and parameter types.

There’s a name for this. In general it is called multi-dispatch. (Some languages support double dispatch; the visitor pattern is a mechanism for turning standard polymorphism into a weak form of double dispatch, and Smalltalk used a similar approach for handling numerical calculations. See the sketch below.)
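
As a refresher on the pattern just mentioned, here is a minimal Ruby sketch of double dispatch via the visitor pattern (the shape and visitor names are invented):

    
# Double dispatch: the behavior chosen depends on two runtime types,
# through two successive polymorphic calls.
class Circle
    def accept visitor
        visitor.visit_circle self    # first dispatch: on the shape
    end
end

class Square
    def accept visitor
        visitor.visit_square self
    end
end

class AreaReporter
    def visit_circle circle          # second dispatch: on the visitor
        p "computing a circle's area"
    end
    def visit_square square
        p "computing a square's area"
    end
end

[Circle.new, Square.new].each { |shape| shape.accept AreaReporter.new }
    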

Self was doing this not to support multi-dispatch but to improve performance. That means a given method could have multiple compiled forms, one for each commonly seen combination of receiver and parameter types. Yes, it used more memory. But given what modern compilers do with code optimization, this kind of technique could have huge benefits for the performance of dynamically typed languages. It is just one way a dynamic language can speed things up. There are others, and they are happening NOW.

I’m hoping that in the next few years dynamic languages will get more credit for what they have to offer. I believe they are the future. I still primarily develop in Java but that’s just because I’m waiting for the dust to settle a little bit and for a clear dynamic language to start to assert itself. I like Ruby (though its support for block closures is, IMO, weak). I’m not convinced Ruby is the next big thing. I’m working with it a little bit just in case, however.

What are you currently doing that enables yourself or your co-workers to maintain the status quo?

Protecting Developers from Powerful Languages

Posted by Dean Wampler Tue, 16 Jan 2007 03:57:00 GMT

Microsoft’s forthcoming C# version 3 has some innovative features, as described in this blog. I give the C# team credit for pushing the boundaries of C#, in part because they have forced the Java community to follow suit. ;)

A common tension in many development shops is how far to trust the developers with languages and tools that are perceived to be “advanced”. It’s tempting to limit developers to “safe” languages and maybe not all the features of those languages. This can be misguided.

Java is usually considered safe, but Java Generics are suspect. Strong typing is safe, but dynamic typing isn’t controlled enough. Closures and continuations sound too advanced and technical to be trusted in the hands of “our team”.

To be fair, larger organizations have more at stake and caution is prudent. Regrettably, it is also true that many people in our profession are … hmm … not that well qualified.

However, I find that I’m far more productive and less likely to make mistakes using Ruby iterators with closures than writing more verbose and inelegant Java.

I used to be a strong believer in static typing, but it has become a distraction: I have to worry about the types of method parameters and return values, rather than just worrying about the values themselves. I realized that, on average in a typical section of code, the actual type of a variable is unimportant. The variable is just a “handle” being passed around. The name is always important, as it is a form of documentation. There are places where the type is important, of course: where the variable is actually read or written in some way.

Finally, static typing offers less security than first appears. At best, it only confirms that variables of particular types are used consistently. Your unit tests also do this. However, static typing can’t confirm that the usage of an API is correct. This is analogous to checking the syntax but not the semantics of a program. In fact, only unit tests (or alternatives, like RSpec) are effective at testing both.
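
To make the syntax-versus-semantics point concrete, here is a minimal sketch (the parse_duration method is invented). A type checker could confirm that a String goes in and an Integer comes out, but only a test confirms the value is right:

    
require 'test/unit'

# Hypothetical API: converts "MM:SS" into a number of seconds. Static
# types could verify String-in/Integer-out, but not the arithmetic.
def parse_duration text
    minutes, seconds = text.split(':').map { |s| Integer(s) }
    minutes * 60 + seconds
end

class ParseDurationTest < Test::Unit::TestCase
    def test_converts_minutes_and_seconds
        assert_equal 90, parse_duration("1:30")  # a semantic check, not a type check
    end
end
    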

So, it’s prudent to be cautious about newer languages and features, but make sure the decisions you make about them are backed up by careful evaluation, and don’t forget to train your team appropriately!