Observations on TDD in C++ (long)
I spent all of June mentoring teams on TDD in C++ with some Java. While C++ was my language of choice through most of the 90’s, I think far too many teams are using it today when there are better options for their particular needs.
During the month, I took notes on all the ways that C++ development is less productive than development in languages like Java, particularly if you try to practice TDD. I’m not trying to start a language flame war. There are times when C++ is the appropriate tool, as we’ll see.
Most of the points below have been discussed before, but it is useful to list them in one place and to highlight a few particular observations.
Based on my observations last month, as well as previous experience, I’ve come to the conclusion that TDD in C++ is about an order of magnitude slower than TDD in Java. Mostly, this is due to poor or non-existent tool support for automated refactorings, no error detection as you type, and the requirement to compile and link an executable test.
So, here is my list of impediments that I encountered last month. I’ll mostly use Java as the comparison language, but the arguments are more or less the same for C# and the popular dynamic languages, like Ruby, Python, and Smalltalk. Note that the dynamic languages tend to have less complete tool support, but they make up for it in other ways (off-topic for this blog).
Getting Started
There is more setup effort involved in configuring your build environment to use your chosen unit testing framework (e.g., CppUnit) and to create small executables, each running a single test or a few tests. Creating many small test executables, rather than one big one (e.g., a variant of the actual application), is important for minimizing the TDD cycle.
Fortunately, this setup is a one-time “charge”. The harder part, if you have legacy code, is refactoring it to break hard dependencies so you can write unit tests. This is true for legacy code in any language, of course.
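To make the setup concrete, here is a minimal sketch of one such small, focused test executable. It deliberately uses plain assertions rather than CppUnit, and the Stack class and its header are hypothetical:

#include <cassert>
#include "stack.h" // hypothetical header for the class under test

static void test_push_increases_size() {
    Stack s;
    s.push(42);
    assert(s.size() == 1);
}

static void test_pop_returns_last_pushed_value() {
    Stack s;
    s.push(42);
    assert(s.pop() == 42);
}

int main() {
    // Each small, focused executable like this becomes one fast step in the TDD cycle.
    test_push_increases_size();
    test_pop_returns_last_pushed_value();
    return 0;
}

The build script compiles, links, and runs a handful of these binaries on every change, which keeps the compile-link-run overhead as small as C++ allows.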
Complex Syntax
C++ has a very complex syntax. This makes it hard to parse, limiting the capabilities of automated tools and slowing build times (more below).
The syntax also makes it harder to program in the language and not just for novices. Even for experts, the visual noise of pointer and reference syntax obscures the story the code is trying to tell. That is, C++ code is inherently less clean than code in most other languages in widespread use.
Also, the need for the developer to remember whether each variable is a pointer, a reference, or a “value”, and how to manage its life-cycle, requires mental effort that could be applied to the logic of the code instead.
Obsolete Tool Support
No editor or IDE supports non-trivial, automated refactorings. (Some do simple refactorings like “rename”.) This means you have to resort to tedious, slow, and error-prone manual refactorings. Extract Method is made worse by the fact that you usually have to edit two files, an implementation and a header file.
There are no widely-used tools that provide on-the-fly parsing and error indications. This alone increases the time between typing an error and learning about it by an order of magnitude. Since a build is usually required, you tend to type a lot between builds, thereby learning about many errors at once. Working through them takes time. (There may be some commercial tools with limited support for on-the-fly parsing, but they are not widely used.)
Similarly, none of the common development tools support incremental loading of object code that could be used for faster unit testing and hence a faster TDD cycle. Most teams just build executables. Even when they structure the build process to generate small, focused executables for unit tests, the TDD cycle times remain much longer than for Java.
Finally, while there is at least one mocking framework available for C++, it is much harder to use than comparable frameworks in newer languages.
Manual Memory Management
We all know that manual memory management leads to time spent finding and fixing memory errors and leaks. Avoiding these problems in the first place also consumes a lot of thought and design effort. In Java, you just spend far less time thinking about “who owns this object and is therefore responsible for managing its life-cycle”.
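A minimal sketch of the kind of ownership question this paragraph refers to; the names are hypothetical, and the point is only the mental bookkeeping that a raw pointer demands:

class Widget { };

Widget* createWidget() { return new Widget; } // the caller receives ownership of the result
void process(const Widget&) { }               // borrows the object; does not take ownership

void client() {
    Widget* w = createWidget();
    process(*w);
    delete w; // forget this and you leak; delete it twice and you corrupt the heap
}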
Dependency Management
Intelligent handling of include directives is entirely up to the developer. We have all used the following “guard” idiom:
#ifndef MY_CLASS_H
#define MY_CLASS_H
...
#endif
Unfortunately, this isn’t good enough. The file will still get opened and read in its entirety every time it is included. You could also put the guard directives around the include statement:
#ifndef MY_CLASS_H
#include "myclass.h"
#endif
This is tedious and few people do it, but it does avoid the wasted file I/O.
Finally, too few people simply declare a required class with no body:
class MyClass;
This is sufficient when one header references another class only as a pointer or reference. In our experience with clients, we have often seen build times improve significantly when teams cleaned up their header file usage and dependencies, in general. Still, why is all this necessary in the 21st century?
This problem is made worse by the unfortunate inclusion of private and protected declarations in the same header file included by clients of the class. This creates phantom dependencies from the clients to class details that they can’t access directly.
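One common mitigation, at the cost of yet more boilerplate, is the "pimpl" idiom, which moves the private details out of the public header entirely. A rough sketch, with hypothetical names:

// myclass.h -- all that clients ever see
class MyClassImpl; // the private details are only forward-declared here

class MyClass {
public:
    MyClass();
    ~MyClass();
    void doWork();
private:
    MyClassImpl* impl_; // defined in myclass.cpp, so changes to it never force clients to rebuild
};

The private members move into MyClassImpl in myclass.cpp, so a change to them touches only that one translation unit.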
Other Debugging Issues
Limited or non-existent context information when an exception is thrown makes the origin of the exception harder to find. To fill the gap, you tend to spend more time adding this information manually through logging statements in catch blocks, etc.
The std::exception class doesn’t appear to have a std::string or const char* argument in a constructor for a message. You could just throw a string, but that precludes using an exception class with a meaningful name.
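The usual workaround, sketched here with a hypothetical exception name, is to derive a meaningfully named exception from std::runtime_error, which does accept a message string:

#include <stdexcept>
#include <string>

// A named exception that carries a message, by deriving from std::runtime_error.
class ConfigurationError : public std::runtime_error {
public:
    explicit ConfigurationError(const std::string& message)
        : std::runtime_error(message) {}
};

// Usage: throw ConfigurationError("missing required setting: timeout");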
Compiler error messages are hard to read and often misleading. In part this is due to the complexity of the syntax and the parsing problem mentioned previously. Errors involving template usage are particularly hard to debug.
Reflection and Metaprogramming
Many of the productivity gains from using dynamic languages and (to a lesser extent) Java and C# are due to their reflection and metaprogramming facilities. C++ relies more on template metaprogramming, rather than APIs or other built-in language features that are easier to use and more full-featured. Preprocessor hacks are also used frequently. Better reflection and metaprogramming support would permit more robust proxy or aspect solutions to be used. (However, to be fair, sometimes a preprocessor hack has the virtue of being “the simplest thing that could possibly work.”)
Library Issues
Speaking of std::string and char*, it is hard to avoid writing two versions of methods, one which takes const std::string& arguments and one which takes const char* arguments. It doesn’t matter that one method can usually delegate to the other one; this is wasted effort.
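A minimal sketch of the duplication, using a hypothetical Logger class; the char* overload is pure delegation, but it still has to be declared, written, and maintained:

#include <string>

class Logger {
public:
    void log(const std::string& message) { /* write the message somewhere */ }
    void log(const char* message) { log(std::string(message)); } // pure boilerplate delegation
};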
Discussion
So, C++ makes it hard for me to work the way that I want to work today, which is test-driven, creating clean code that works. That’s why I rarely choose it for a project.
To be fair, there are legitimate reasons for almost all of the perceived “deficiencies” listed above. C++ emphasizes performance and backwards compatibility with C over all other considerations. Those priorities, however, come at the expense of other interests, like effective TDD.
It is a good thing that we have languages designed with performance as the top goal, because there are circumstances where performance is the number one requirement. However, most teams that use C++ as their primary language are making an optimal choice for, say, 10% of their code and a suboptimal one for the other 90%. Your numbers will vary; I picked 10% vs. 90% because performance bottlenecks are usually localized, and they should be found by actual measurements, not guesses!
Workarounds
If it’s true that TDD is an order of magnitude slower for C++, then what do we do? No doubt really good C++ developers have optimized their processes as best they can, but in the end, you will just have to live with longer TDD cycles. Instead of write just enough of a test to fail, make it pass, refactor, the cycle will be more like write a complete test, write the implementation, build it, fix the compilation errors, run it, fix the logic errors until the test passes, and then refactor.
A Real Resolution?
You could consider switching to the D language, which is link compatible with C and appears to avoid many of the problems described above.
There is another way out of the dilemma of needing optimal performance some of the time and optimal productivity the rest of the time: use more than one language. I’ll discuss this idea in my next blog.
Are "else" blocks the root of all evil?
So, I’m pair programming C++ code with a client today and he makes an observation that makes me pause.
The well-structured, open-source code I’ve looked at typically has very few else blocks. You might see a conditional test with a return statement if the conditional evaluates to true, but not many if/else blocks.
(I’m quoting from memory…) Now, this may seem crazy at first, but one of the principles we teach at Object Mentor is the Single Responsibility Principle, which states that a class should have only one reason to change. This principle also applies to methods. More loosely defined, a class or method should do only one thing.
So, if a method has an if/else block, is it doing two (or more) things and therefore violating the SRP?
Okay, so this is a bit too restrictive (and the title was a bit of an attention grabber… ;). We’re not talking about something really evil, like premature optimization, after all!
However, look at your own if/else blocks and ask yourself if maybe your code would express its intent better if you refactored it to eliminate some of those else blocks.
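Here is a hedged, hypothetical C++ before-and-after sketch of the kind of refactoring he had in mind, replacing the else blocks with early returns:

#include <string>

// Before: the reader has to track every branch of the if/else ladder.
std::string gradeWithElse(int score) {
    std::string grade;
    if (score >= 90) {
        grade = "A";
    } else {
        if (score >= 80) {
            grade = "B";
        } else {
            grade = "C";
        }
    }
    return grade;
}

// After: each condition returns immediately, and no else blocks remain.
std::string gradeWithoutElse(int score) {
    if (score >= 90) return "A";
    if (score >= 80) return "B";
    return "C";
}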
So, is there something to this idea?
What I've Learned from Master Chef Rino Baglio
If you want to learn Craftsmanship, you would be hard pressed to find a better mentor than Rino Baglio, the Executive Master Chef at Pazzaluna, in St. Paul, Minnesota. I had a chance to catch up with Rino this past weekend when I was in Minneapolis to teach a tutorial on Aspect-Oriented Design at ICSE 2007.
I have known Rino for over a decade, starting when Ann and I were loyal patrons of Il Bacio in Redmond, Washington, the restaurant he owned and operated with his wife Patsy until a few years ago. Through his cooking classes and many conversations about food and the restaurant business, I learned a lot about what it really means to be a chef and the long mentoring process that true chefs go through.
In the U.S., we think that passing a two-year culinary program qualifies you to be a chef. In Italy, an aspiring chef apprentices to a master at the age of 13 or so and spends the next 20-odd years mastering the craft before deserving the designation of “chef”. You can spend 7 years just working through the stations in a restaurant, cold dishes, salads, sauces, etc., just to become a “cook”.
Here are some of the characteristics of a true craftsman.
A craftsman is widely recognized by peers
Rino recently won an international competition in Italy, one of many times he’s been recognized nationally and internationally.
A craftsman is passionate about the craft
Rino says that if you are passionate about food, you will work on the presentation of even humble dishes. Pasta, as well as lobster, deserves an attractive presentation.
A craftsman delivers value to the customer while meeting business objectives
Rino keeps the kitchen lean and efficient. He keeps costs low by relying on high-quality ingredients, keeping waste to a minimum, and constantly improving the skills of his staff, all without ever compromising quality. In the year Rino has been at Pazzaluna, costs have dropped, while business and profits have increased.
A craftsman knows that quality is the number one priority
Rino knows that cutting quality today means less business tomorrow. He keeps quality high by keeping morale high. Morale is high because his staff is constantly learning new and better recipes. Also, as you watch him interact with his staff, you can see that he treats all of them, from his sous chefs to the dishwashers, with dignity and respect, while always holding them to high standards.
A craftsman never stops learning
You would think that he knows it all, by now. Yet, he has never forgotten a lesson his own mentor taught him, “you can learn something from even the worst cook, because he always knows something you don’t!” How many gurus do you know that think they have nothing left to learn?
What does all this have to do with software? Pretty much everything. Like cuisine, clean code is part art, part science. Clean code is created by passionate craftsmen who are fanatical and fastidious about every detail. Clean code is the product of years of accumulated experience. The decisions a master makes moment by moment, whether test-driving the next feature or fighting a fire, reflect the wisdom and breadth of knowledge that produce high-quality results quickly and efficiently. Finally, a master leads by example, bringing the rest of the team up to his or her standards.
So, if you’re young and ambitious, latch onto the mentors around you. If you can’t find any, find another job. (Your organization is doomed anyway; so you might as well move on now.) If you’re older and wiser, seek out the promising junior people, teach them what you know, and learn from them as well! Oh, and if you want to taste real Italian food, make a pilgrimage to St. Paul. Tell Rino I sent you.
100% Code Coverage?
Should you strive for 100% code coverage from your unit tests? It’s probably not mandatory, and if your goal is 100% coverage, you’ll focus on that number rather than on writing the best tests for the behavior of your code.
That said, here are some thoughts on why I still like to get close to 100%.
- I’m anal retentive. I don’t like those little red bits in my coverage report. (Okay, that’s not a good reason…)
- Every time I run the coverage report, I have to inspect all the uninteresting cases to find the interesting cases I should cover.
- The tests are the specification and documentation of the code, so if something nontrivial but unexpected happens, there should still be a test to “document” the behavior, even if the test is hard to write.
- Maybe those places without coverage are telling me to fix the design.
I was thinking about this last point the other day when considering a bit of Java code that does a downcast (assume that’s a good idea, for the sake of argument…), wrapped in a try/catch block for the potential ClassCastException:
public void handleEvent (Event event) throws ApplicationException {
  try {
    SpecialEvent specialEvent = (SpecialEvent) event;
    doSomethingSpecial (specialEvent);
  } catch (ClassCastException cce) {
    throw new ApplicationException(cce);
  }
}
To get 100% coverage, you would have to write a test that inputs an object of a different subtype of Event to trigger coverage of the catch block. As we all know, these sorts of error-handling code blocks are typically the hardest to cover and ones we’re most likely to ignore. (When was the last time you saw a ClassCastException anyway?)
So my thought was this: we want 100% of the production code to be developed with TDD, so what if we made 100% coverage a similar goal? How would that change our designs? We might decide that since we have to write a test to cover this error-handling scenario, maybe we should rethink the scenario itself. Is it necessary? Could we eliminate the catch block with a better overall design, in this case, making sure that we test all callers and ensure that they obey the method’s ‘contract’? Should we just let the ClassCastException fly out of the function and let a higher-level catch block handle it? After all, catching and rethrowing a different exception is slightly smelly and the code would be cleaner without the try/catch block. (For completeness, a good use of exception wrapping is to avoid namespace pollution. We might not want application layer A to know anything about layer C’s exception types, so we wrap a C exception in an A exception, which gets passed through layer B…)
100% coverage is often impossible or impractical, because of language or tool oddities. Still, if you give in early, you’re overlooking some potential benefits.
AOP and Dynamic Languages: Contradiction in Terms or Match Made in Heaven?
Consider this quote from Dave Thomas (of the Pragmatic Programmers) on AOP (aspect-oriented programming, a.k.a. aspect-oriented software development - AOSD):
Once you have decent reflection and metaclasses, AOP is just part of the language.
People who work with dynamic languages don't see any need for AOP-specific facilities in their language. They don't necessarily dispute the value of AOP for Java, where metaprogramming facilities are weaker, but for them, AOP is just a constrained form of metaprogramming. Are they right?
It's easy to see why people feel this way when you consider that most of the applications of AOP have been to solve obvious "cross-cutting concerns" (CCCs) like object-relational mapping, transactions, security, etc. In other words, AOP looks like just one of many tools in your toolbox to solve a particular group of problems.
I'll use Ruby as my example dynamic language, since Ruby is the example I know best. It's interesting to look at Ruby on Rails source code, where you find a lot of "AOP-like" code that addresses the CCCs I just mentioned (and more). This is easy enough to do using Ruby's metaprogramming tools, even though tooling that supports AOP semantics would probably make this code easier to write and maintain.
This is going to be a long blog entry already, so I won't cite detailed examples here, but consider how Rails uses method_missing to effectively "introduce" new methods into classes and modules. For example, in ActiveRecord, the many possible find methods and attribute read/write methods are "implemented" this way.
By the way, another excellent Ruby framework, RSpec, used method_missing for similar purposes, but recently refactored its implementation and public API to avoid method_missing, because having multiple frameworks attempt to use the same "hook" proved very fragile!
Also in Rails, method "aliasing" is done approximately 175 times, often to wrap ("advise") methods with new behaviors.
Still, is there really a need for AOP tooling in dynamic languages? First, consider that in the early days of OOP, some of us "faked" OOP using whatever constructs our languages provided. I wrote plenty of C code that used structs as objects and function pointers to simulate method overloading and overriding. However, few people would argue today that such an approach is "good enough". If we're thinking in objects, it sure helps to have a language that matches those semantics.
Similarly, it's true that you can implement AOP using sufficiently powerful metaprogramming facilities, but it's a lot harder than having native AOP semantics in your language (or at least a close approximation thereof in libraries and their DSLs).
Before proceeding, let me remind you what AOP is for in the first place. AOP is essentially a new approach to modularization that complements other approaches, like objects. It tries to solve a group of problems that other modularity approaches can't handle, namely the fine-grained interaction of multiple domain models that is needed to implement the required functionality.
Take the classic example of security management. Presumably, you have one strategy and implementation for handling authentication and authorization. This is one domain and your application's "core business logic" is another domain.
In a non-AOP system, it is necessary to insert duplicate or nearly duplicate code throughout the system to invoke the security subsystem. This duplication violates DRY; it clutters the logic of the code where it is inserted; and it is difficult to test, maintain, replace with a new implementation, etc.
Now you may say that you handle this through a Spring XML configuration file or an EJB deployment configuration file, for example. Congratulations, you are using an AOP or AOP-like system!
What AOP seeks to do is to allow you to specify that repeated behavior in one place, in a modular way.
There are four pieces required for an AOP system:
1. Interception
You need to be able to "intercept" execution points in the program. We call these join points in the AOP jargon and sets of them that the aspect writer wants to work with at once are called pointcuts (yes, no whitespace). At each join point, advice is the executable code that an aspect invokes either before, after or both before and after ("around") the join point.
Note that the most powerful AOP language, AspectJ, lets you advise join points like instance variable reads and writes, class initialization, instance creation, etc. The easiest join points to advise are method calls, and many AOP systems limit themselves to this capability.
2. Introduction
Introduction is the ability to add new state and behavior to an existing class, object, etc. For example, if you want to use the Observer pattern with a particular class, you could use an aspect to introduce the logic to maintain the list of observers and to notify them when state changes occur.
3. Inspection
We need to be able to find the join points of interest, either through static or runtime analysis, preferably both! You would also like to specify certain conditions of interest, which I'll discuss shortly.
4. Modularization
If we can't package all this into a "module", then we don't have a new modularization scheme. Note that a part of this modularization is the ability to somehow specify in one place the behavior I want and have it affect the entire system. Hence, AOP is a modularity system with nonlocal effects.
Okay. How does pure Ruby stack up against these requirements? If you're a Java programmer, the idea of Interception and Introduction, where you add new state and behavior to a class, may seem radical. In languages with "open classes" like Ruby, it is trivial and common to reopen a class (or Module) and insert new attributes (state) and methods (behavior). You can even change previously defined methods. Hence, Interception and Introduction are trivial in Ruby.
This is why Ruby programmers assume that AOP is nothing special, but what they are missing is the complete picture for Inspection and Modularization, even though both are partially covered.
There is a rich reflection API for finding classes and objects. You can write straightforward code that searches for classes that "respond to" a particular method, for example. What you can't do easily is query based on state. For example, in AspectJ, you can say, "I want to advise method call X.m when it is called in the context flow ('cflow') of method call Y.m2 somewhere up the stack..." Yes, you can figure out how to do this in Ruby, but it's hard. So, we're back to the argument I made earlier that you would really like your language to match the semantics of your ideas.
For modularization, yes, you can put all the aspect-like code in a Module or Class. The hard part is encapsulating any complicated "pointcut" metaprogramming in one place, should you want to use it again later. That is, once you figure out how to do the cflow pointcuts using metaprogramming, you'll want that tricky bit of code in a library somewhere.
At this point, you might be saying to yourself, "Okay, so it might be nice to have some AOP stuff in Ruby, but the Rails guys seem to be doing okay without it. Is it really worth the trouble having AOP in the language?" Only if AOP is applicable to more than the limited set of problems described previously.
Future Applications of AOP??
Here's what I've been thinking about lately. Ruby is a wonderful language for creating mini-DSLs. The ActiveRecord DSL is a good example. It provides relational semantics, while the library minimizes the coding required by the developer. (AR reads the database schema and builds an in-memory representation of the records as objects.)
Similarly, there is a lot of emphasis these days on development that centers around the domain or features of the project. Recall that I said that AOP is about modularizing the intersection of multiple domains (and recall my previous blog on the AOSD 2007 Conference where Gerald Sussman remarked that successful systems have more than one organizing principle).
I think we'll see AOP become the underlying implementation of powerful DSLs that allow programmers who are not AOP-literate express cross-cutting concerns in domain-specific and intuitive languages. AOP will do the heavy lifting behind the scenes to make the fine-grained interactions work. I really don't expect a majority of developers to become AOP literate any time soon. In my experience, too many so-called developers don't get objects. They'll never master aspects!
Shameless Plug
If you would like to hear more of my ideas about AOP in Ruby and aspect-oriented design (AOD), please come to my talk at SD West, this Friday at 3:30. I'm also giving a full-day tutorial on AOD in Ruby and Java/AspectJ at the ICSE 2007 conference in Minneapolis, May 26th.
AOSD 2007 Conference
Last week, I attended the Aspect-Oriented Software Development 2007 Conference in Vancouver, BC, where I gave a tutorial on aspect-oriented design and presented a paper in the Industry Track, also about design.
AOSD in general, and this conference in particular, are still mostly academic affairs, with some notable industry traction, especially in the Java world. It is also a technology that needs to break through to the next level of innovation and applicability, in my opinion.
The Industry Track had a number of interesting papers, including, for example, a paper that describes how aspects are used in the innovative Terracotta JVM clustering tool. Also, the last keynote by Adrian Colyer of Interface 21 recounted his personal involvement in the AspectJ and Spring communities, as well as the impact that aspects are having at major Interface 21 clients. It’s worth noting that many of the important Java middleware systems, e.g., Spring, JBoss, and Weblogic have embraced aspects to one degree or another.
AOSD tools like AspectJ and Spring AOP (AO Programming) solve obvious “cross-cutting concerns” (CCCs) like object-relational mapping, transactions, security, etc. in Java. However, I feel that AOSD needs some breakthrough innovations to move beyond its current role as a useful niche technology to a more central role in the software development process. I get the impression that industry interest in AOP has reached a plateau.
The academic community is doing some interesting work on the fundamentals of AOSD theory (type theory, modeling, categorizing types of aspects, etc.), on “early aspects” (e.g., cross-cutting requirements), and on tooling. There is also a lot of minor iterating around the edges, but that’s the nature of academic research (speaking as one who has been there…). Most of the research work is a long way from practical applicability.
However, I’m seeing too much emphasis on extending the work of AspectJ-like approaches in statically-typed languages, rather than innovating in new areas like new applications of aspects and the nature of AOSD in dynamic languages.
My recent work with Ruby has made me think about these two topics lately. I’ll blog about these topics later, but for now, I’ll just say that I anticipate a fruitful growth area for AOSD will be to facilitate the implementation of powerful DSLs (Domain Specific Languages), a popular topic in the Ruby community.
Here are some other observations from the conference.
All other engineering disciplines recognize cross-cutting concerns
Gregor Kiczales (the father of AspectJ and one of the fathers of AOSD) made this remark in a panel discussion on “early aspects”. He cited the examples of electrical engineers considering systemic issues in circuit design, like capacitance, current leakage, etc. and mechanical engineers who evaluate the stresses and strains of the entire structure they are designing (buildings, brake assemblies in cars, etc.).
Gregor also remarked that CCCs that are evident in the requirements may disappear in the implementation and vice-versa.
Gerald Sussman keynote
In the first keynote, Gerald Sussman argued that robust systems are adaptable for uses that were not anticipated by their designers. These systems often have multiple organizational ideas and a “component” structure that promotes “combinatorial behavior”.
He doesn’t like formalisms such as Dijkstra’s A Discipline of Programming. Provable correctness and rigor aren’t very compatible with rich programs. Sussman prefers a more “exploratory” model, very much analogous to Test Driven Development (TDD), and what he calls Paranoid Programming, which he defined as “I won’t be at fault if it fails.”
More on AOSD
I will blog further about aspect-oriented design and aspects in dynamic languages.
Liskov Substitution Principle and the Ruby Core Libraries
There is a spirited discussion happening now on the ruby-talk list called Oppinions on RCR for dup on immutable classes (sic).
In the core Ruby classes, the Kernel module, which is the root of everything, even Object, defines a method called dup, for duplicating objects. (There is also a clone method with slightly different behavior that I won’t discuss here.)
The problem is that some derived core classes throw an exception when dup is called. It is the immutable classes (NilClass, FalseClass, TrueClass, Fixnum, and Symbol) that do this. Consider, for example, the following irb session:
irb 1:0> 5.respond_to? :dup
=> true
irb 2:0> 5.dup
TypeError: can't dup Fixnum
        from (irb):1:in `dup'
        from (irb):1
irb 3:0>

If you don’t know Ruby, the first line asks the Fixnum object 5 if it responds to the method dup (with the name expressed as a symbol, hence the ":"). The answer is true, because this method is defined by the module Kernel, which is included by the top-level class Object, an ancestor of Fixnum.
However, when you actually call dup on 5, it raises TypeError, as shown.
So, this looks like a classic Liskov Substitution Principle violation. The term for this code smell is Refused Bequest (e.g., see here) and it’s typically fixed with the refactoring Replace Inheritance with Delegation.
The email thread is about a proposal to change the library in one of several possible ways. One possibility is to remove dup from the immutable classes. This would eliminate the unexpected behavior in the example above, since 5.respond_to?(:dup) would return false, but it would still be an LSP violation; specifically, it would still have the Refused Bequest smell.
One scenario where the current behavior causes problems is doing a deep copy of an arbitrary object graph. For immutable objects, you would normally just want dup to return the same object. It’s immutable, right? Well, not exactly, since you can re-open classes and even objects to add and remove methods in Ruby (there are some limitations for the immutables…). So, if you thought you actually duplicated something and started messing with its methods, you would be surprised to find the original was “also” modified.
So, how serious is this LSP issue (one of several)? When I pointed out the problem in the discussion, one respondent, Robert Dober, said the following (edited slightly):
I would say that LSP does not apply here simply because in Ruby we do not have that kind of contract. In order to apply LSP we need to say at a point we have an object of class Base, for example. (let the gods forgive me that I use Java)

void aMethod(final Base b){ .... }

and we expect this to work whenever we call aMethod with an object that is a Base. Anyway the compiler would not really allow otherwise.

SubClass sc; // subclassing Base of course
aMethod( sc ); // this is expected to work (from the type POV).
Such things just do not exist in Ruby. I believe that Ruby has explained something to me:
- OO Languages are Class oriented languages
- Dynamic Languages are Object oriented languages.
Replace Class with Type and you see what I mean.
This is all very much IMHO of course but I feel that the Ruby community has made me evolve a lot away from “Class oriented”.
He’s wrong that the compiler protects you in Java; you can still throw exceptions, etc. The JDK Collection classes have Refused Bequests. Besides that, however, he makes some interesting points.
As a long-time Java programmer, I’m instinctively uncomfortable with LSP violations. Yet, the Ruby API is very nice to work with, so maybe a little LSP violation isn’t so bad?
As Robert says, we approach our designs differently in dynamic vs. static languages. In Ruby, you almost never use the is_a? and kind_of? methods to check for type. Instead, you follow the duck typing philosophy (“If it acts like a duck, it must be a duck”); you rely on respond_to? to decide if an object does what you want.
In the case of dup for the immutable classes, I would prefer that they not implement the method, rather than throw an exception. However, that would still violate LSP.
So, can we still satisfy LSP and also have rich base classes and modules?
There are many examples of traits that one object might or should support, but not another. (Those of you Java programmers might ask yourself why all objects support toString, for example. Why not also toXML...?)
Coming from an AOP background, I would rather see an architecture where dup is added only to those classes and modules that can support it. It shouldn’t be part of the standard “signature” of Kernel, but it should be present when code actually needs it.
Kernel, Module, and Object should be refactored into smaller pieces and programmers should declaratively mix in the traits they need. Imagine something like the following:
irb 1:0> my_obj.respond_to? :dup
=> false
irb 2:0> include 'DupableTrait'
irb 2:0> my_obj.respond_to? :dup
=> true
irb 4:0> def dup_if_possible items
irb 5:1>   items.map {|item| item.respond_to?(:dup) ? item.dup : item}
irb 6:1> end
...

In other words, Kernel no longer “exposes the dup abstraction”, by default, but the DupableTrait module “magically” adds dup to all the classes that can support it. This way, we preserve LSP, streamline the core classes and modules (SRP and ISP anyone?), yet we have the flexibility we need, on demand.
Phantom (Menace) Dependencies
We’ve been working on new course materials lately. The other day we were discussing the bad dependencies in code that can occur when the Interface Segregation Principle (ISP) is violated.
We started calling these dependencies phantom dependencies. Here’s why…
In the simplest example of an ISP problem, several client components depend on a “server” component, but each uses completely independent facilities provided by that component. There are no direct or indirect connections between clients. So, none of the clients knows anything about the other clients, nor cares to, yet each client has an implicit dependency on the other clients.
Why? Because if a change in one client forces a change in the server component, the other clients are affected. At the very least, a rebuild may be required.
The solution is to hide the server component behind segregated interfaces, one tailored for each client, and have the server component implement those interfaces. Even better, if the features really are independent, then segregate the server component, too!
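Here is a minimal C++ sketch of the segregated-interface fix, with hypothetical names. Each client depends only on the narrow interface it actually uses, so a change driven by one client no longer ripples to the others:

#include <string>

// One narrow interface per client.
class ReportSource {
public:
    virtual ~ReportSource() {}
    virtual std::string fetchReport() = 0;
};

class AuditLog {
public:
    virtual ~AuditLog() {}
    virtual void record(const std::string& entry) = 0;
};

// The single "server" component implements both, but each client sees only its own interface.
class Server : public ReportSource, public AuditLog {
public:
    virtual std::string fetchReport() { return "..."; }
    virtual void record(const std::string&) { /* append to the audit trail */ }
};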
Anyway, we sometimes use the term back-channel dependency for this situation. However, while thinking about it, I remembered an article that I had read a few days before on the phantom pain phenomenon that amputees sometimes experience, where they feel pain in limbs that are no longer there. The ISP case is analogous; there are no traversable (“real”) connections from one client component to another, yet each “feels” the presence of the others. Hence, the term phantom dependency.
Protecting Developers from Powerful Languages
Microsoft’s forthcoming C# version 3 has some innovative features, as described in this blog. I give the C# team credit for pushing the boundaries of C#, in part because they have forced the Java community to follow suit. ;)
A common tension in many development shops is how far to trust the developers with languages and tools that are perceived to be “advanced”. It’s tempting to limit developers to “safe” languages and maybe not all the features of those languages. This can be misguided.
Java is usually considered safe, but Java Generics are suspect. Strong typing is safe, but dynamic typing isn’t controlled enough. Closures and continuations sound too advanced and technical to be trusted in the hands of “our team”.
To be fair, larger organizations have more at stake and caution is prudent. Regrettably, it is also true that many people in our profession are … hmm … not that well qualified.
However, I find that I’m far more productive and less likely to make mistakes using Ruby iterators with closures than writing more verbose and inelegant Java.
I used to be a strong believer in static typing, but it has become a distraction, as I have to worry more about the types of method parameters and return values, rather than just worrying about the values themselves. I realized that, on average in a typical section of code, the actual type of a variable is unimportant. The variable is just a “handle” being passed around. The name is always important, as it is a form of documentation. There are places where the type is important, of course, when the variable is read or written in some way.
Finally, static typing offers less security than it first appears to. At best, it only confirms that variables of particular types are used consistently. Your unit tests also do this. However, static typing can’t confirm that the usage of the API is correct. This is analogous to checking the syntax but not the semantics of the program. In fact, only unit tests (or alternatives, like RSpec) are effective at testing both.
So, it’s prudent to be cautious about newer languages and features, but make sure the decisions you make about them are backed by careful evaluation, and don’t forget to train your team appropriately!