Liskov Substitution Principle and the Ruby Core Libraries

Posted by Dean Wampler Sat, 17 Feb 2007 20:20:00 GMT

There is a spirited discussion happening now on the ruby-talk list called (sic).

In the core Ruby classes, the Kernel module, which is the root of everything, even Object, defines a method called dup, for duplicating objects. (There is also a clone method with slightly different behavior that I won’t discuss here.)

The problem is that some derived core classes throw an exception when dup is called.

Specifically, as the ruby-talk discussion title says, it’s the immutable classes (NilClass, FalseClass, TrueClass, Fixnum, and Symbol) that do this. Consider, for example, the following irb session:
irb 1:0> 5.respond_to? :dup
=> true
irb 2:0> 5.dup
TypeError: can't dup Fixnum
        from (irb):1:in `dup'
        from (irb):1
irb 3:0> 
If you don’t know Ruby, the first line asks the Fixnum object 5 if it responds to the method dup (with the name expressed as a symbol, hence the “:”). The answer is true, because this method is defined by the module Kernel, which is included by the top-level class Object, an ancestor of Fixnum.

However, when you actually call dup on 5, it raises TypeError, as shown.

So, this looks like a classic Liskov Substitution Principle violation. The term for this code smell is Refused Bequest (e.g., see ) and it’s typically fixed with the refactoring .
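
To make the shape of the smell concrete without depending on any particular Ruby version's core classes, here is a minimal sketch. Value, ImmutableValue, and safe_copy are hypothetical names invented for illustration; ImmutableValue mimics the behavior described above by inheriting dup and then refusing it.

```ruby
# Hypothetical classes, not part of Ruby's core library.
class Value
  attr_reader :label

  def initialize(label)
    @label = label
  end
end

# Inherits dup from Object (via Kernel) but refuses to honor it,
# mimicking what the irb session above shows for Fixnum.
class ImmutableValue < Value
  def dup
    raise TypeError, "can't dup #{self.class}"
  end
end

# A client written against the base class's apparent contract.
def safe_copy(obj)
  obj.respond_to?(:dup) ? obj.dup : obj
end

plain  = Value.new("notes")
frozen = ImmutableValue.new("id-1")

copy = safe_copy(plain)   # a distinct object with the same label

begin
  safe_copy(frozen)       # respond_to? said yes, yet the call fails
rescue TypeError => e
  refusal = e.message
end
```

The client did everything duck typing asks of it, and it still had to discover the refusal at call time.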

The email thread is about a proposal to change the library in one of several possible ways. One possibility is to remove dup from the immutable classes. This would eliminate the unexpected behavior in the example above, since 5.respond_to?(:dup) would return false, but it would still be an LSP violation; specifically, it would still have the Refused Bequest smell.
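
A rough sketch of what removing the method looks like, again with hypothetical classes rather than the real core ones. Here undef_method hides the inherited dup, so respond_to? tells the truth, but code that assumes every Value can be duplicated still fails.

```ruby
# Hypothetical classes for illustration only.
class Value
  def initialize(n)
    @n = n
  end
end

class ImmutableValue < Value
  # Hide the inherited method so respond_to? reports false.
  undef_method :dup
end

plain  = Value.new(1)
frozen = ImmutableValue.new(2)

plain.respond_to?(:dup)   # true
frozen.respond_to?(:dup)  # false

# Code written against the base class still breaks on the subclass:
begin
  [plain, frozen].map(&:dup)
rescue NoMethodError => e
  failure = e.class
end
```

The surprise moves from "it said yes and then raised" to "a subclass is missing something its parent has", which is the same bequest being refused in a more honest way.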

One scenario where the current behavior causes problems is doing a deep copy of an arbitrary object graph. For immutable objects, you would normally just want dup to return the same object. It’s immutable, right? Well, not exactly, since you can re-open classes and even objects to add and remove methods in Ruby (there are some limitations for the immutables…). So, if you thought you actually duplicated something and started messing with its methods, you would be surprised to find the original was “also” modified.
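
As an illustration of the deep-copy scenario, here is a minimal sketch of the kind of workaround people end up writing. The deep_copy helper is hypothetical; it only walks Arrays and Hashes, does not handle cycles, and falls back to returning the original object wherever dup raises TypeError as in the session above.

```ruby
# Hypothetical helper: recursively copies Arrays and Hashes and
# shallow-copies everything else with dup.
def deep_copy(obj)
  case obj
  when Array
    obj.map { |element| deep_copy(element) }
  when Hash
    obj.each_with_object({}) do |(key, value), result|
      result[deep_copy(key)] = deep_copy(value)
    end
  else
    begin
      obj.dup
    rescue TypeError
      # Where dup refuses (the immutable classes discussed above),
      # reuse the original object instead.
      obj
    end
  end
end

graph = { name: "report", tags: ["a", 1, nil, :sym] }
copy  = deep_copy(graph)
```

The rescue clause is the price of the current design: every general-purpose copier has to know that some objects advertise dup without supporting it.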

So, how serious is this LSP issue (one of several)? When I pointed out the problem in the discussion, one respondent, Robert Dober, said the following (edited slightly):

I would say that LSP does not apply here simply because in Ruby we do not have that kind of contract. In order to apply LSP we need to say at a point we have an object of class Base, for example. (let the gods forgive me that I use Java)

void aMethod(final Base b){
   ....
}
and we expect this to work whenever we call aMethod with an object that is a Base. Anyway the compiler would not really allow otherwise.
SubClass sc;  // subclassing Base of course
aMethod( sc ); // this is expected to work (from the type POV).

Such things just do not exist in Ruby, I believe that Ruby has explained something to me:

  • OO Languages are Class oriented languages
  • Dynamic Languages are Object oriented languages.

Replace Class with Type and you see what I mean.

This is all very much IMHO of course but I feel that the Ruby community has made me evolve a lot away from “Class oriented”.

He’s wrong that the compiler protects you in Java; you can still throw exceptions, etc. The JDK Collection classes have Refused Bequests. Besides that, however, he makes some interesting points.

As a long-time Java programmer, I’m instinctively uncomfortable with LSP violations. Yet, the Ruby API is very nice to work with, so maybe a little LSP violation isn’t so bad?

As Robert says, we approach our designs differently in dynamic vs. static languages. In Ruby, you almost never use the is_a? and kind_of? methods to check for type. Instead, you follow the duck typing philosophy (“If it acts like a duck, it must be a duck”); you rely on respond_to? to decide if an object does what you want.
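
A small sketch of that style, for readers who haven't seen it. The total_length method and the WordStream class are hypothetical; the point is only that the method checks for the behavior it needs (each) rather than for a class.

```ruby
# Hypothetical helper: accepts anything that can be iterated with each.
def total_length(source)
  unless source.respond_to?(:each)
    raise ArgumentError, "expected something that responds to each"
  end

  total = 0
  source.each { |item| total += item.to_s.length }
  total
end

# A hypothetical class that is not an Array and does not inherit from
# one; it merely "quacks" like a collection by providing each.
class WordStream
  def initialize(text)
    @words = text.split
  end

  def each(&block)
    @words.each(&block)
  end
end

from_array  = total_length(["ab", "cde"])
from_stream = total_length(WordStream.new("ab cde"))
```

This only works if respond_to? can be trusted, which is exactly what the dup case undermines.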

In the case of dup for the immutable classes, I would prefer that they not implement the method, rather than throw an exception. However, that would still violate LSP.

So, can we still satisfy LSP and also have rich base classes and modules?

There are many examples of traits that one object might or should support, but not another. (Those of you Java programmers might ask yourself why all objects support toString, for example. Why not also toXML...?)

Coming from an AOP background, I would rather see an architecture where dup is added only to those classes and modules that can support it. It shouldn’t be part of the standard “signature” of Kernel, but it should be present when code actually needs it.

In fact, Ruby makes this sort of AOP easy to implement. Maybe Kernel, Module, and Object should be refactored into smaller pieces and programmers should declaratively mix in the traits they need. Imagine something like the following:
irb 1:0> my_obj.respond_to? :dup
=> false
irb 2:0> include DupableTrait
irb 3:0> my_obj.respond_to? :dup
=> true
irb 4:0> def dup_if_possible items
irb 5:1>  items.map {|item| item.respond_to?(:dup) ? item.dup : item}
irb 6:1> end
...
In other words, Kernel no longer “exposes the dup abstraction”, by default, but the DupableTrait module “magically” adds dup to all the classes that can support it. This way, we preserve LSP, streamline the core classes and modules (SRP and ISP anyone?), yet we have the flexibility we need, on demand.
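
Since Object already carries dup in real Ruby, the idea can only be approximated today. The following is a sketch under stated assumptions: BareObject is a hypothetical root that hides dup (standing in for a slimmed-down Kernel), DupableTrait is a hypothetical module with a simple shallow copy for plain objects, and Point and Token are invented example classes. Unlike the imagined session, classes here opt in to the trait explicitly.

```ruby
# Hypothetical root for this sketch: it hides dup, standing in for a
# Kernel that no longer exposes the method by default.
class BareObject
  undef_method :dup
end

# Hypothetical trait: a shallow copy for plain objects, built by
# copying instance variables onto a freshly allocated instance.
module DupableTrait
  def dup
    copy = self.class.allocate
    instance_variables.each do |name|
      copy.instance_variable_set(name, instance_variable_get(name))
    end
    copy
  end
end

# A class that can sensibly be duplicated opts in.
class Point < BareObject
  include DupableTrait
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

# A class that should not be duplicated simply never gets the method.
class Token < BareObject
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

def dup_if_possible(items)
  items.map { |item| item.respond_to?(:dup) ? item.dup : item }
end

point  = Point.new(1, 2)
token  = Token.new(:eof)
copies = dup_if_possible([point, token])
```

With this arrangement respond_to?(:dup) is an honest answer for every object, so dup_if_possible needs no rescue clause.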

Comments

  1. Michael Feathers about 5 hours later:

    One odd thing about LSP: it is usually stated in terms of ‘types’ but people take that to mean ‘classes’, and generally that works well in statically typed languages.

    The thing in dynamically typed languages is that we can have a notion of type which is independent of class. I can’t recall the example, but I remember hearing that most Smalltalk implementations have some things that look like glaring LSP violations if you are looking at classes rather than some general notion of type.

    From what I’ve read, Tom Love, back in the 80s, came up with an interesting notion for dynamically typed systems that seems to do what LSP does for us. He said that in a system, every method name should mean the same thing to every caller: if you have a ‘draw’ method you have to decide whether it draws a gun out of a holster or draws a representation on some medium. ‘Draw’ should mean the same thing to everybody.

    I think the key thing is that in statically typed languages, the hierarchy bears the burden of communicating substitutability information. In dynamically typed languages you can get substitution easily and that pushes people toward looking beyond the hierarchy.

    I don’t know. Maybe Ruby modules make a strict mapping of type to class possible.

  2. Dean Wampler 2 days later:

    I googled for references to Tom Love’s writings and I discovered that he wrote Object Lessons: Lessons Learned in Object-Oriented Development Projects, which I read years ago. (It was published in 1993.)

    Pulling it out (and blowing the dust off…), the index took me to this section on testing (page 193). He says,

    With an object-oriented language, one must verify that:
    • all inherited methods used by the class are correct
    • all arguments that are subclasses of the specified argument type are correct
    • all methods by the same name perform the same logical operation
    • the documentation is accurate…


    Flipping through the book, I found this gem on pg. 231. It’s for decision makers and it’s presented in a question and answer format.

    How should I choose between static and dynamic object-oriented programming languages?

    It is really quite easy. Choose a dynamic object-oriented language unless you are absolutely confident that you can write the detailed functional specification for the system you are designing and that this specification will not change for three years.

    We invented software because it took so long to rewire computers. Dynamic object-oriented languages make it easy to rewire computers. Static object-oriented languages use lots of solder. Resoldering is expensive and error prone.

    Most applications and systems are in a high degree of flux. Choose a tool that accommodates rather than hinders this change.

    To be fair, our sophisticated IDEs and their refactoring tools help this problem. Still, I wish I had paid attention to this advice at that time. It was about the time I started a UI project where I chose C++ over a few dynamic-language options. The project might have succeeded without the encumbrance…

  3. Michael Feathers 2 days later:

    Great quotes!

  4. Sebastian Kübeck 4 days later:

    I’m not a Ruby expert but no matter what Robert Dober said, you have a “proof of concept” with deep copying of your object graph (which is done quite frequently in software) that the design violates LSP. I think it would be much easier to implement dup properly on immutable objects than having people write workarounds and spending time explaining to them why it has to be this way ;-).

  5. over 3 years later:

    I prefer the old computer nerd mantra about the Liskov Substitution Principle: “If it looks like a duck, and quacks like a duck, but needs batteries, you probably have the wrong abstraction”
