TDD on Three Index Cards
I had the opportunity to talk to a fellow who missed part of a class on TDD. I told him that I could give him a 15-minute overview and fit all the essentials of TDD on three index cards.
Yes, I know that volumes have been written about TDD and BDD, and that it’s a large topic with many, many branches of application, but I didn’t have time for that. I had time for three index cards. I figure that an index card is a token, that it represents a conversation, and that one can always dig deeper later.
They looked more or less like this:
Card 1: Uncle Bob’s Three Laws (Object Mentor)
- Write no production code except to pass a failing test.
- Write only enough of a test to demonstrate a failure.
- Write only enough production code to pass the test.
Card 2: FIRST Principles (Brett and Tim at Object Mentor)
- Fast: Mind-numbingly fast, as in hundreds or thousands per second.
- Isolated: The test isolates a fault clearly.
- Repeatable: I can run it repeatedly and it will pass or fail the same way each time.
- Self-verifying: The test is unambiguously pass-fail.
- Timely: Produced in lockstep with tiny code changes.
Card 3: Flow (using the famous three-node circle diagram) – origin unknown.
- Red: test fails
- Green: test passes
- Refactor: clean code and tests
Sure, there is plenty more, but I don’t know how I could have provided significantly less. As it is, I’m pretty happy with the exercise. Now I am wondering if I couldn’t produce most of the important information I wish to convey as a series of index cards. Would that be cool or what?
Unit Testing C and C++ ... with Ruby and RSpec!
If you’re writing C/C++ code, it’s natural to write your unit tests in the same language (or use C++ for your C test code). All the well-known unit testing tools take this approach.
I think we can agree that neither language offers the best developer productivity among all the language choices out there. Most of us use either language because of perceived performance requirements, institutional and industry tradition, etc.
There’s growing interest, however, in mixing languages, tools, and paradigms to get the best tool for a particular job. <shameless-plug>I’m giving a talk March 7th at SD West on this very topic, called Polyglot and Poly-Paradigm Programming </shameless-plug>.
So, why not use a more productive language for your C or C++ unit tests? You have more freedom in your development chores than in what’s required for production. Why not use Ruby’s RSpec, a Behavior-Driven Development tool for acceptance and unit testing? Or you could use Ruby’s version of JUnit, called Test::Unit. The hard part is integrating Ruby and C/C++. If you’ve been looking for an excuse to bring Ruby (or Tcl or Python or Java or…) into your C/C++ environment, starting with development tasks is usually the path of least resistance.
I did some experimenting over the last few days to integrate RSpec using SWIG (Simplified Wrapper and Interface Generator), a tool for bridging libraries written in C and C++ to other languages, like Ruby. The Ruby section of the SWIG manual was very helpful.
My Proof-of-Concept Code
Here is a zip file of my experiment: rspec_for_cpp.zip
This is far from a complete and working solution, but I think it shows promise. See the Current Limitations section below for details.
Unzip the file into a directory; I’ll assume you named it rspec_for_cpp. You need to have gmake, gcc, SWIG, and Ruby installed, along with the RSpec “gem”. Right now, it only builds on OS X and Linux (at least the configurations on my machines running those OS’s – see the discussion below). To run the build, use the following commands:
$ cd rspec_for_cpp/cpp
$ make
You should see it finish with the lines
( cd ../spec; spec *_spec.rb )
.........
Finished in 0.0***** seconds
9 examples, 0 failures
Congratulations, you’ve just tested some C and C++ code with RSpec! (Or, if you didn’t succeed, see the notes in the Makefile and the discussion below.)
The Details
I’ll briefly walk you through the files in the zip and the key steps required to make it all work.
cexample.h
Here is a simple C header file.
/* cexample.h */
#ifndef CEXAMPLE_H
#define CEXAMPLE_H
#ifdef __cplusplus
extern "C" {
#endif
char* returnString(char* input);
double returnDouble(int input);
void doNothing();
#ifdef __cplusplus
}
#endif
#endif
Of course, in a pure C shop, you won’t need the #ifdef __cplusplus stuff. I found this was essential in my experiment when I mixed C and C++, as you might expect.
cpp/cexample.c
Here is the corresponding C source file.
/* cexample.c */
#include "cexample.h"
char* returnString(char* input) {
return input;
}
double returnDouble(int input) {
return (double) input;
}
void doNothing() {}
cpp/CppExample.h
Here is a C++ header file.
#ifndef CPPEXAMPLE_H
#define CPPEXAMPLE_H
#include <string>
class CppExample
{
public:
  CppExample();
  CppExample(const CppExample& foo);
  CppExample(const char* title, int flag);
  virtual ~CppExample();
  const char* title() const;
  void title(const char* title);
  int flag() const;
  void flag(int value);
  static int countOfCppExamples();
private:
  std::string _title;
  int _flag;
};
#endif
cpp/CppExample.cpp
Here is the corresponding C++ source file.
#include "CppExample.h"
CppExample::CppExample() : _title(""), _flag(0) {}
CppExample::CppExample(const CppExample& foo) : _title(foo._title), _flag(foo._flag) {}
CppExample::CppExample(const char* title, int flag) : _title(title), _flag(flag) {}
CppExample::~CppExample() {}
const char* CppExample::title() const { return _title.c_str(); }
void CppExample::title(const char* name) { _title = name; }
int CppExample::flag() const { return _flag; }
void CppExample::flag(int value) { _flag = value; }
int CppExample::countOfCppExamples() { return 1; }
cpp/example.i
Typically in SWIG, you specify a .i file to the swig command to define the module that wraps the classes and global functions, which classes and functions to expose to the target language (usually all of them, in our case), and other assorted customization options, which are discussed in the SWIG manual. I’ll show the swig command in a minute. For now, note that I’m going to generate an example_wrap.cpp file that will function as the bridge between the languages.
Here’s my example.i, where I named the module example.
%module example
%{
#include "cexample.h"
#include "CppExample.h"
%}
%include "cexample.h"
%include "CppExample.h"
It looks odd to have the header files appear twice. The code inside the %{...%} block (with a ’#’ before each include) consists of standard C and C++ statements that will be inserted verbatim into the generated “wrapper” file, example_wrap.cpp, so that file will compile when it references anything declared in the header files. The second case, with a ’%’ before the include statements [1], tells SWIG to make all the declarations in those header files available to the target language. (You can be more selective, if you prefer…)
Following Ruby conventions, the Ruby plugin for SWIG automatically names the module with an upper-case first letter (Example), but you use require 'example' to include it, as we’ll see shortly.
Building
See the cpp/Makefile for the gory details. In a nutshell, you run the swig command like this:
swig -c++ -ruby -Wall -o example_wrap.cpp example.i
Next, you create a dynamically-linked library, as appropriate for your platform, so the Ruby interpreter can load the module dynamically when required. The Makefile can do this for Linux and OS X platforms. See the Ruby section of the SWIG manual for Windows specifics.
If you test-drive your code, which tends to drive you towards minimally-coupled “modules”, then you can keep your libraries and build times small, which will make the build and test cycle very fast!
spec/cexample_spec.rb and spec/cppexample_spec.rb
Finally, here are the RSpec files that exercise the C and C++ code. (Disclaimer: these aren’t the best spec files I’ve ever written. For one thing, they don’t exercise all the CppExample methods! So sue me… :)
require File.dirname(__FILE__) + '/spec_helper'
require 'example'
describe "Example (C functions)" do
it "should be a constant on Module" do
Module.constants.should include('Example')
end
it "should have the methods defined in the C header file" do
Example.methods.should include('returnString')
Example.methods.should include('returnDouble')
Example.methods.should include('doNothing')
end
end
describe Example, ".returnString" do
it "should return the input char * string as a Ruby string unchanged" do
Example.returnString("bar!").should == "bar!"
end
end
describe Example, ".returnDouble" do
it "should return the input integer as a double" do
Example.returnDouble(10).should == 10.0
end
end
describe Example, ".doNothing" do
it "should exist, but do nothing" do
lambda { Example.doNothing }.should_not raise_error
end
end
and
require File.dirname(__FILE__) + '/spec_helper'
require 'example'
describe Example::CppExample do
  it "should be a constant on module Example" do
    Example.constants.should include('CppExample')
  end
end
describe Example::CppExample, ".new" do
  it "should create a new object of type CppExample" do
    example = Example::CppExample.new("example1", 1)
    example.title.should == "example1"
    example.flag.should == 1
  end
end
describe Example::CppExample, "#title(value)" do
  it "should set the title" do
    example = Example::CppExample.new("example1", 1)
    example.title("title2")
    example.title.should == "title2"
  end
end
describe Example::CppExample, "#flag(value)" do
  it "should set the flag" do
    example = Example::CppExample.new("example1", 1)
    example.flag(2)
    example.flag.should == 2
  end
end
If you love RSpec like I do, this is a very compelling thing to see!
And now for the small print:
Current Limitations
As I said, this is just an experiment at this point. Volunteers to make this battle-ready would be most welcome!
General
The Example Makefile
It Must Be Hand Edited for Each New or Renamed Source File
You’ve probably already solved this problem for your own make files. Just merge in the example Makefile to pick up the SWIG- and RSpec-related targets and rules.
It Only Knows How to Build Shared Libraries for Mac OS X and Linux (and Not Very Well)
Some definitions are probably unique to my OS X and Linux machines. Windows is not supported at all. However, this is also easy to rectify. Start with the notes in the Makefile itself.
The module.i File Must Be Hand Edited for Each File Change
Since the format is simple, a make task could fill a template file with the changed list of source files during the build.
Better Automation
It should be straightforward to provide scripts, IDE/Editor shortcuts, etc. that automate some of the tasks of adding new methods and classes to your C and C++ code when you introduce them first in your “spec” files. (The true TDD way, of course.)
Specific Issues for C Code Testing
I don’t know of any C-specific issues beyond the general ones above, so unit testing with Ruby is most viable today for C code. Although I haven’t experimented extensively, C functions and variables are easily mapped by SWIG to a Ruby module. The Ruby section of the SWIG manual discusses this mapping in some detail.
Specific Issues for C++ Code Testing
More work will be required to make this viable. It’s important to note that SWIG cannot handle all C++ constructs (although there are workarounds for most issues, if you’re committed to this approach…). For example, namespaces, nested classes, and some template and method-overloading scenarios are not supported. The SWIG manual has details.
Also, during my experiment, SWIG didn’t seem to map const std::string& objects in C++ method signatures to Ruby strings, as I would have expected (char* worked fine).
Is It a Viable Approach?
Once the General issues listed above are handled, I think this approach would work very well for C code. For C++ code, there are more issues to address, and programmers who are committed to this strategy will need to tolerate some rough edges (or just use C++-language tools for some scenarios).
Conclusions: Making It Development-Team Ready
I’d like to see this approach pushed to its logical limit. I think it has the potential to really improve the productivity of C and C++ developers and the quality of their test coverage, by leveraging the productivity and power of dynamically-typed languages like Ruby. If you prefer, you could use Tcl, Python, even Java instead.
License
This code is completely open and free to use. Of course, use it at your own risk; I offer it without warranty, etc., etc. When I polish it to the point of making it an “official” project, I will probably release it under the Apache license.
[1] I spent a lot of time debugging problems because I had a ’#’ where I should have had a ’%’! Caveat emptor!
TDD for AspectJ Aspects
There was a query on the TDD mailing list about how to test drive aspects. Here is an edited version of my reply to that list.
Just as for regular classes, TDD can drive aspects to a better design.
Assume that I’m testing a logging aspect that logs when certain methods are called. Here’s the JUnit 4 test:
package logging;
import static org.junit.Assert.*;
import org.junit.Test;
import app.TestApp;
public class LoggerTest {
  @Test
  public void FakeLoggerShouldBeCalledForAllMethodsOnTestClasses() {
    String message = "hello!";
    new TestApp().doFirst(message);
    assertTrue(FakeLogger.messageReceived().contains(message));

    String message2 = "World!";
    new TestApp().doSecond(message, message2);
    assertTrue(FakeLogger.messageReceived().contains(message));
    assertTrue(FakeLogger.messageReceived().contains(message2));
  }
}
Already, you might guess that FakeLogger will be a test-only version of something, in this case, my logging aspect. Similarly, TestApp is a simple class that I’m using only for testing. You might choose to use one or more production classes, though.
package app;
@Watchable
public class TestApp {
  public void doFirst(String message) {}
  public void doSecond(String message1, String message2) {}
}
and @Watchable is a marker annotation that allows me to define pointcuts in my logger aspect without fragile coupling to concrete names, etc. You could also use an interface.
package app;
public @interface Watchable {}
I made up @Watchable as a way of marking classes where the public methods might be of “interest” to particular observers of some kind. It’s analogous to the EJB 3 annotations that mark classes as “persistable” without implying too many details of what that might mean.
Now, the actual logging is divided into an abstract base aspect and a test-only concrete sub-aspect:
package logging;
import org.aspectj.lang.JoinPoint;
import app.Watchable;
abstract public aspect AbstractLogger {
  // Limit the scope to the packages and types you care about.
  public abstract pointcut scope();

  // Define how messages are actually logged.
  public abstract void logMessage(String message);

  // Notice the coupling is to the @Watchable abstraction.
  pointcut watch(Object object):
    scope() && call(* (@Watchable *).*(..)) && target(object);

  before(Object watchable): watch(watchable) {
    logMessage(makeLogMessage(thisJoinPoint));
  }

  public static String makeLogMessage(JoinPoint joinPoint) {
    StringBuffer buff = new StringBuffer();
    buff.append(joinPoint.toString()).append(", args = ");
    for (Object arg: joinPoint.getArgs())
      buff.append(arg.toString()).append(", ");
    return buff.toString();
  }
}
and
package logging;
public aspect FakeLogger extends AbstractLogger {
  // Only match on calls from the unit tests.
  public pointcut scope(): within(logging.*Test);

  public void logMessage(String message) {
    lastMessage = message;
  }

  static String lastMessage = null;

  public static String messageReceived() {
    return lastMessage;
  }
}
Pointcuts in aspects are like most other dependencies: best avoided ;) ... or at least minimized and based on abstractions, just like associations and inheritance relationships.
So, my test “pressure” drove the design in terms of where I needed abstraction in the Logger aspect: (i) how a message is actually logged and (ii) what classes get “advised” with logging behavior.
Just as for TDD of regular classes, the design ends up with minimized dependencies and flexibility (abstraction) where it’s most useful.
I can now implement the real, concrete logger, which will also be a sub-aspect of AbstractLogger. It will define the scope() pointcut to cover a larger section of the system, and it will send the message to the real logging subsystem.
Why you have time for TDD (but may not know it yet...)
Note: Updated 9/30/2007 to improve the graphs and to clarify the content.
A common objection to TDD is this: “We don’t have time to write so many tests. We don’t even have enough time to write features!”
Here’s why people who say this probably already have enough time in the (real) schedule; they just don’t know it yet.
Let’s start with an idealized Scrum-style “burn-down chart” for a fictional project run in a “traditional” way (even though traditional projects don’t use burn-down charts…).
We have time increasing on the x axis and the number of “features” remaining to implement on the y axis (it could also be hours or “story points” remaining). During a project, a nice feature of burn-down charts is that you can extend the line to see where it intersects the x axis, which is a rough indicator of when you’ll actually finish.
The optimistic planners for our fictional project plan to give the software to QA near the end of the project. They expect QA to find nothing serious, so the release will occur soon thereafter on date T0.
Of course, it never works out that way:
The red line is the actual effort for our fictional project. It’s quite natural for the planned list of features to change as the team reacts to market changes, etc. This is why the line goes up sometimes (in “good” projects, too!). Since this is a “traditional” project, I’m assuming that there are no automated tests that actually prove that a given feature is really done. We’re effectively running “open loop”, without the feedback of tests.
Inevitably, the project goes over budget and the planned QA drop comes late. Then things get ugly. Without our automated unit tests, there are lots of little bugs in the code. Without our automated integration tests, there are problems when the subsystems are run together. Without our acceptance tests, the implemented features don’t quite match the actual requirements for them.
Hence, a chaotic, end-of-project “birthing” period ensues, where QA reports a list of big and small problems, followed by a frantic effort (usually involving weekends…) by the developers to address the problems, followed by another QA drop, followed by…, and so forth.
Finally, out of exhaustion and because everyone else is angry at the painful schedule slip, the team declares “victory” and ships it, at time T1.
We’ve all lived through projects like this one.
Now, if you remember your calculus classes (sorry to bring up painful memories), you will recall that the area under the curve is the total quantity of whatever the curve represents. So, the actual total feature work required for our project corresponds to the area under the red line, while the planned work corresponds to the area under the black line. So, we really did have more time than we originally thought.
Now consider a Test-Driven Development (TDD) project [1]:
Here, the blue line is similar to the red line, at least early in the project. Now we have frequent “milestones” where we verify the state of the project with the three kinds of automated tests mentioned above. Each milestone is the end of an iteration (usually 1-4 weeks apart). Not shown are the 5-minute TDD cycles and the feedback from the continuous integration process that does our builds and runs all our tests after every block of commits to version control (many times a day).
The graph suggests that the total amount of effort will be higher than the expected effort without tests, which may be true [2]. However, because of the constant feedback during the whole life of the project, we really know where we actually are at any time. By measuring our progress in this way, we will know early whether or not we can meet the target date with the planned feature set. With early warnings, we can adjust accordingly, either dropping features or moving the target date, with relatively little pain. Whereas, without this feedback, we really don’t know what’s done until something, e.g., the QA process, gives us that feedback. Hence, at time T0, just before the big QA drop, the traditional project has little certainty about what features are really completed.
So, we’ll experience less of the traditional end-of-project chaos, because there will be fewer surprises. Without the feedback from automated tests, QA finds lots of problems, causing the chaotic and painful end-of-project experience. Finding and trying to fix major problems late in the game can even kill a project.
So, TDD converts that unknown schedule time at the end into known time early in the project. You really do have time for automated tests and your tests will make your projects more predictable and less painful at the end.
Note: I appreciate the early comments and questions that helped me clarify this post.
[1] As one commenter remarked, this post doesn’t actually make the case for TDD itself vs. alternative “test-heavy” strategies, but I think it’s pretty clear that TDD is the best of the known test-heavy strategies, as argued elsewhere.
[2] There is some evidence that TDD and pair programming lead to smaller applications, because they help avoid unnecessary features. Also, they provide constant feedback to the team, including the stakeholders, on what the feature set should really be and which features are most important to complete first.
Not A Task, But An Approach
Transitions are tough. It seems that lately I’ve been getting a lot of contact from frustrated people who don’t really have a good handle on the “drive” part of Test Driven Development. A question heard frequently is: “I’ve almost completed the coding, can you help me write the TDD?”
It seems like Test Driven Development is taken backward, that the developers are driven to write tests. The practitioner winces, realizing that he again faces The Great Misunderstanding of TDD.
TDD stands for Test-Driven Development, which is not as clear as TFD (Test-First Development). If the consultant would strive to always say the word “first” in association with testing, most people would more surely grasp the idea. In fact, I’ve begun an experiment in which I will not say the word “test” without the word “first” in close proximity. I’ll let you know how that works out for me.
If the tests are providing nothing more than reassurance on the tail end of a coding phase, then the tests aren’t driving the development. They are like riders instead of drivers. Test-Ridden Development (TRD) [1] would be a better term for such a plan. Even though it is better to have those tail-end tests than to have no automated testing, it misses the point and could not reasonably be called TDD.
An old mantra for TDD and BDD is “it’s not about testing”. The term BDD was invented largely to get the idea of “testing” out of the way. People tend to associate “test” as a release-preparation activity rather than an active approach to programming. BDD alleviates some of that cognitive dissonance.
In TDD, tests come first. Each unit test is written as it is needed by the programmer. Tests are just-in-time and are active in shaping the code. Acceptance Tests likewise tend to precede programming by some short span of time. [2]
Through months of repetition I have developed the mantra: TDD isn’t a task. It is not something you do. It is an approach. It is how you write your programs.
I wonder if we shouldn’t resurrect the term Test-First Programming or Test-First Development for simple evocative power. Admittedly there are some who would see that as a phase ordering, but maybe enough people would get the right idea.
Brett Schuchert (with some trivial aid from your humble blogger) has worked up an acronym to help socialize the basic concepts that are somehow being lost in translation to the corporate workplace.
The teaser: Fast, Isolated, Repeatable, Self-validating, and Timely.
As a reader of this blog, you are probably very familiar with all of the terminology and concepts behind TDD. I beg of you, socialize the idea that testing comes first and drives the shape of the code. If we can just get this one simple idea spread into programming dens across our small spheres of influence, then we will have won a very great victory over Test-Ridden Development.
“And there was much rejoicing.”
[1] Jeff Langr will refer to this TRD concept as “Test-After-Development”, which he follows with a chuckle and a twinkle, “which is a TAD too late.”
[2] Of course, one still needs QC testing as well; however, TDD is about driving development, not testing its quality after the fact.
Which came First?
- CCCCDPIPE
- Coupling (low)
- Cohesion (high)
- Creator
- Controller
- Don’t talk to strangers (mentioned above and replaced with Protected Variation)
- Polymorphism
- Indirection
- Pure Fabrication
- Expert
- CCCC (4 c’s, foresees)
- D (the)
- PIPE (pipe)
So who foresees the pipe? The Psychic Plumber.
The Psychic Plumber??? I know, it’s awful. However, I heard it once in something like 1999 and I’ve never forgotten it.
That leads me to some other oldies but goodies: SOLID, INVEST, SMART and a relative new-comer: FIRST. While these are actually acronyms (not just abbreviations but real, dictionary-defined acronyms), they are also mnemonics.
You might be thinking otherwise. Typically what people call acronyms are actually just abbreviations. And in any case, they tend to obfuscate rather than elucidate. However, if you’ll lower your guard for just a few more pages, you might find some of these helpful.
Your software should be SOLID:
- Single Responsibility
- Open/Closed Principle
- Liskov Substitution Principle
- Interface Segregation
- Dependency Inversion Principle (not to be confused with Dependency Injection)
I think we should change the spelling to SOLIDD and tack on “Demeter – the Law Of”. But that’s just me. Of course, if we do this, then it is no longer technically an acronym. That’s OK, because my preference is for mnemonics, not acronyms.
When you’re working on your user stories, make sure to INVEST in them:
- Independent
- Negotiable
- Valuable
- Estimable
- Small
- Testable
And your goals should be SMART:
- Specific
- Measurable
- Achievable
- Relevant
- Time-boxed
Finally, your unit tests should be FIRST:
- Fast – tests should run fast. We should be able to run all of the tests in seconds or minutes. Running the tests should never feel like a burden. If a developer ever hesitates to execute the tests because of time, then the tests take too long.
- Isolated – A test is a sealed environment. Tests should not depend on the results of other tests, nor should they depend on external resources such as databases.
- Repeatable – when a test fails, it should fail because the production code is broken, not because some external dependency failed (e.g. database unavailable, network problems, etc.)
- Self-Validating – Manual interpretation of results does not scale. A test should itself verify that it passed or failed. Going one step further, a test should report nothing but success or failure.
- Timely – tests should be written concurrently with (and preferably before) the production code.
So where does this acronym come from? A while back, a colleague of mine, Tim Ottinger, and I were working on some course notes. I had a list of four out of five of these ideas. We were working on the characteristics of a good unit test. At one point, Tim said to me “Add a T.”
I can be pretty dense fairly often. I didn’t even understand what he was telling me to do. He had to repeat himself a few times. I understood the words, but not the meaning (luckily that doesn’t happen to other people or we’d have problems writing software). Anyway, I finally typed a “T”. And then I asked him “Why?” I didn’t see the word. Apparently you don’t want me on your unscramble team either.
Well eventually he led me to see the word FIRST and it just seemed to fit (not sure if that pun was intended).
Of course, you add all of these together and what do you get? The best I can come up with is: SFP-IS. I was hoping I could come up with a Roman numeral or something, because then I could say developers should always wear SPF IS – which is true because we stay out of the sun and burn easily. Unfortunately that did not work. If you look at your phone, you can convert this to the number: 73747
If there are any numerologists out there, maybe you can make some sense of it.
In any case, consider remembering some of these mnemonics. If you actually do more than remember them and start practicing them, I believe you’ll become a better developer.
Observations on TDD in C++ (long)
I spent all of June mentoring teams on TDD in C++ with some Java. While C++ was my language of choice through most of the 90’s, I think far too many teams are using it today when there are better options for their particular needs.
During the month, I took notes on all the ways that C++ development is less productive than development in languages like Java, particularly if you try to practice TDD. I’m not trying to start a language flame war. There are times when C++ is the appropriate tool, as we’ll see.
Most of the points below have been discussed before, but it is useful to list them in one place and to highlight a few particular observations.
Based on my observations last month, as well as previous experience, I’ve come to the conclusion that TDD in C++ is about an order of magnitude slower than TDD in Java. Mostly, this is due to poor or non-existent tool support for automated refactorings, no error detection as you type, and the requirement to compile and link an executable test.
So, here is my list of impediments that I encountered last month. I’ll mostly use Java as the comparison language, but the arguments are more or less the same for C# and the popular dynamic languages, like Ruby, Python, and Smalltalk. Note that the dynamic languages tend to have less complete tool support, but they make up for it in other ways (off-topic for this blog).
Getting Started
There is more setup effort involved in configuring your build environment to use your chosen unit testing framework (e.g., CppUnit) and to create small executables, one each for a single test or a few tests. Creating many small test executables, rather than one big test program (e.g., a variant of the actual application), is important for minimizing the TDD cycle.
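As a rough sketch of what one of those small, focused test executables might look like (this example is mine, not from any particular project; it exercises std::vector just to stay self-contained):
#include <vector>
#include <cppunit/TestFixture.h>
#include <cppunit/extensions/HelperMacros.h>
#include <cppunit/extensions/TestFactoryRegistry.h>
#include <cppunit/ui/text/TestRunner.h>

// A tiny, focused fixture; in real code this would exercise one of your own
// classes rather than std::vector.
class VectorTest : public CppUnit::TestFixture {
  CPPUNIT_TEST_SUITE(VectorTest);
  CPPUNIT_TEST(pushBackIncreasesSize);
  CPPUNIT_TEST_SUITE_END();
public:
  void pushBackIncreasesSize() {
    std::vector<int> v;
    v.push_back(42);
    CPPUNIT_ASSERT_EQUAL((std::vector<int>::size_type) 1, v.size());
  }
};
CPPUNIT_TEST_SUITE_REGISTRATION(VectorTest);

// Each small test program gets its own main(), so the build produces many
// fast, focused executables rather than one monolithic test application.
int main() {
  CppUnit::TextUi::TestRunner runner;
  runner.addTest(CppUnit::TestFactoryRegistry::getRegistry().makeTest());
  return runner.run() ? 0 : 1;
}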
Fortunately, this setup is a one-time “charge”. The harder part, if you have legacy code, is refactoring it to break hard dependencies so you can write unit tests. This is true for legacy code in any language, of course.
Complex Syntax
C++ has a very complex syntax. This makes it hard to parse, limiting the capabilities of automated tools and slowing build times (more below).
The syntax also makes it harder to program in the language and not just for novices. Even for experts, the visual noise of pointer and reference syntax obscures the story the code is trying to tell. That is, C++ code is inherently less clean than code in most other languages in widespread use.
Also, the need for the developer to remember whether each variable is a pointer, a reference, or a “value”, and how to manage its life-cycle, requires mental effort that could be applied to the logic of the code instead.
Obsolete Tool Support
No editor or IDE supports non-trivial, automated refactorings. (Some do simple refactorings like “rename”.) This means you have to resort to tedious, slow, and error-prone manual refactorings. Extract Method is made worse by the fact that you usually have to edit two files, an implementation and a header file.
There are no widely-used tools that provide on-the-fly parsing and error indications. This alone increases the time between typing an error and learning about it by an order of magnitude. Since a build is usually required, you tend to type a lot between builds, thereby learning about many errors at once. Working through them takes time. (There may be some commercial tools with limited support for on-the-fly parsing, but they are not widely used.)
Similarly, none of the common development tools support incremental loading of object code that could be used for faster unit testing and hence a faster TDD cycle. Most teams just build executables. Even when they structure the build process to generate small, focused executables for unit tests, the TDD cycle times remain much longer than for Java.
Finally, while there is at least one mocking framework available for C++, it is much harder to use than comparable frameworks in newer languages.
Manual Memory Management
We all know that manual memory management leads to time spent finding and fixing memory errors and leaks. Avoiding these problems in the first place also consumes a lot of thought and design effort. In Java, you just spend far less time thinking about “who owns this object and is therefore responsible for managing its life-cycle”.
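To make the ownership question concrete, here is a minimal sketch (hypothetical code, and assuming Boost is available) contrasting manual life-cycle management with a reference-counted smart pointer:
#include <boost/shared_ptr.hpp>

class Connection { /* ... */ };

void manualOwnership() {
  Connection* connection = new Connection();
  // ... who deletes this, and on which code path? Every early return,
  // exception, and caller/callee handoff has to answer that question.
  delete connection;
}

void sharedOwnership() {
  boost::shared_ptr<Connection> connection(new Connection());
  // The last owner to go out of scope deletes the object, so the
  // life-cycle question largely disappears from the design discussion.
}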
Dependency Management
Intelligent handling of include directives is entirely up to the developer. We have all used the following “guard” idiom:
#ifndef MY_CLASS_H
#define MY_CLASS_H
...
#endif
Unfortunately, this isn’t good enough. The file will still get opened and read in its entirety every time it is included. You could also put the guard directives around the include statement:
#ifndef MY_CLASS_H
#include "myclass.h"
#endif
This is tedious and few people do it, but it does avoid the wasted file I/O.
Finally, too few people simply declare a required class with no body:
class MyClass;
This is sufficient when one header references another class as a pointer or reference. In our experience with clients, we have often seen build times improve significantly when teams cleaned up their header file usage and dependencies, in general. Still, why is all this necessary in the 21st century?
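For example, a header that holds only a pointer or reference can get by with the bare declaration (a hypothetical header, just to illustrate the idiom):
// widget.h -- hypothetical example
#ifndef WIDGET_H
#define WIDGET_H

class MyClass;  // forward declaration; no need to include "myclass.h" here

class Widget {
public:
  void attach(MyClass& observer);   // a reference: the declaration is enough
private:
  MyClass* _observer;               // a pointer: the declaration is enough
};

#endif
// Only widget.cpp, which calls through _observer, needs #include "myclass.h".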
This problem is made worse by the unfortunate inclusion of private and protected declarations in the same header file included by clients of the class. This creates phantom dependencies from the clients to class details that they can’t access directly.
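One common way to keep those details out of the header entirely is the “pimpl” idiom, sketched here as a hypothetical variation on the earlier example (not something the teams in question were necessarily doing):
// CppExample_pimpl.h -- hypothetical variation
#ifndef CPPEXAMPLE_PIMPL_H
#define CPPEXAMPLE_PIMPL_H

class CppExample {
public:
  CppExample(const char* title, int flag);
  ~CppExample();
  const char* title() const;
  // (copy construction and assignment omitted for brevity)
private:
  struct Impl;   // defined only in the .cpp file
  Impl* _impl;   // clients never see the private members, so changing
                 // them does not force clients to recompile
};

#endif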
Other Debugging Issues
Limited or non-existent context information when an exception is thrown makes the origin of the exception harder to find. To fill the gap, you tend to spend more time adding this information manually through logging statements in catch blocks, etc.
The std::exception class doesn’t appear to have a std::string or const char* argument in a constructor for a message. You could just throw a string, but that precludes using an exception class with a meaningful name.
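That said, the standard std::runtime_error and std::logic_error classes do take a message string, so one option (a sketch, not necessarily the right design for every case) is to derive a meaningfully named exception from one of them:
#include <stdexcept>
#include <string>

// A named exception type that still carries a message, because
// std::runtime_error accepts a std::string in its constructor.
class ConfigurationError : public std::runtime_error {
public:
  explicit ConfigurationError(const std::string& message)
    : std::runtime_error(message) {}
};

// Usage: throw ConfigurationError("missing 'timeout' setting");
// A higher-level catch block can call e.what() to retrieve the message.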
Compiler error messages are hard to read and often misleading. In part this is due to the complexity of the syntax and the parsing problem mentioned previously. Errors involving template usage are particularly hard to debug.
Reflection and Metaprogramming
Many of the productivity gains from using dynamic languages and (to a lesser extent) Java and C# are due to their reflection and metaprogramming facilities. C++ relies more on template metaprogramming, rather than APIs or other built-in language features that are easier to use and more full-featured. Preprocessor hacks are also used frequently. Better reflection and metaprogramming support would permit more robust proxy or aspect solutions to be used. (However, to be fair, sometimes a preprocessor hack has the virtue of being “the simplest thing that could possibly work.”)
Library Issues
Speaking of std::string and char*, it is hard to avoid writing two versions of methods, one which takes const std::string& arguments and one which takes const char* arguments. It doesn’t matter that one method can usually delegate to the other one; this is wasted effort.
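A minimal sketch of that duplication (a hypothetical class, just for illustration):
#include <string>

class Notifier {
public:
  // The std::string& version does the real work...
  void notify(const std::string& message) { _lastMessage = message; }

  // ...and the const char* version exists only to delegate to it.
  void notify(const char* message) { notify(std::string(message)); }

private:
  std::string _lastMessage;
};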
Discussion
So, C++ makes it hard for me to work the way that I want to work today, which is test-driven, creating clean code that works. That’s why I rarely choose it for a project.
However, to be fair, there are legitimate reasons for almost all of the perceived “deficiencies” listed above. C++ emphasizes performance and backwards-compatibility with C over all other considerations, and those priorities come at the expense of other interests, like effective TDD.
It is a good thing that we have languages that were designed with performance as the top design goal, because there are circumstances where performance is the number one requirement. However, most teams that use C++ as their primary language are making an optimal choice for, say, 10% of their code, but a suboptimal one for the other 90%. Your numbers will vary; I picked 10% vs. 90% based on the fact that performance bottlenecks are usually localized, and they should be found by actual measurements, not guesses!
Workarounds
If it’s true that TDD is an order of magnitude slower in C++, then what do we do? No doubt really good C++ developers have optimized their processes as best they can, but in the end, you will just have to live with longer TDD cycles. Instead of “write just enough test to fail, make it pass, refactor,” it will be more like “write a complete test, write the implementation, build it, fix the compilation errors, run it, fix the logic errors to make the test pass, and then refactor.”
A Real Resolution?
You could consider switching to the D language, which is link compatible with C and appears to avoid many of the problems described above.
There is another way out of the dilemma of needing optimal performance some of the time and optimal productivity the rest of the time; use more than one language. I’ll discuss this idea in my next blog.
100% Code Coverage?
Should you strive for 100% code coverage from your unit tests? It’s probably not mandatory, and if your goal is 100% coverage, you’ll focus on that goal rather than on writing the best tests for the behavior of your code.
That said, here are some thoughts on why I still like to get close to 100%.
- I’m anal retentive. I don’t like those little red bits in my coverage report. (Okay, that’s not a good reason…)
- Every time I run the coverage report, I have to inspect all the uninteresting cases to find the interesting cases I should cover.
- The tests are the specification and documentation of the code, so if something nontrivial but unexpected happens, there should still be a test to “document” the behavior, even if the test is hard to write.
- Maybe those places without coverage are telling me to fix the design.
I was thinking about this last point the other day when considering a bit of Java code that does a downcast (assume that’s a good idea, for the sake of argument…), wrapped in a try/catch block for the potential ClassCastException:
public void handleEvent(Event event) throws ApplicationException {
  try {
    SpecialEvent specialEvent = (SpecialEvent) event;
    doSomethingSpecial(specialEvent);
  } catch (ClassCastException cce) {
    throw new ApplicationException(cce);
  }
}
To get 100% coverage, you would have to write a test that inputs an object of a different subtype of Event to trigger coverage of the catch block. As we all know, these sorts of error-handling code blocks are typically the hardest to cover and the ones we’re most likely to ignore. (When was the last time you saw a ClassCastException anyway?)
So my thought was this: we want 100% of the production code to be developed with TDD, so what if we made 100% coverage a similar goal? How would that change our designs? We might decide that since we have to write a test to cover this error-handling scenario, maybe we should rethink the scenario itself. Is it necessary? Could we eliminate the catch block with a better overall design, in this case, making sure that we test all callers and ensure that they obey the method’s ‘contract’? Should we just let the ClassCastException fly out of the function and let a higher-level catch block handle it? After all, catching and rethrowing a different exception is slightly smelly, and the code would be cleaner without the try/catch block. (For completeness, a good use of exception wrapping is to avoid namespace pollution. We might not want application layer A to know anything about layer C’s exception types, so we wrap a C exception in an A exception, which gets passed through layer B…)
100% coverage is often impossible or impractical, because of language or tool oddities. Still, if you give in early, you’re overlooking some potential benefits.