Specs vs. Tests
There’s something to this BDD kool-aid that people have been drinking lately…
As part of the Rails project I’ve been working on for the last few weeks, I’ve been using RSpec. RSpec is a unit testing tool similar in spirit to JUnit or Test::Unit. However, RSpec uses an alternative syntax that reads more like a specification than like a test. Let me show you what I mean.
In Java, using JUnit, we might write the following unit test:

```java
import junit.framework.TestCase;

public class BowlingGameTest extends TestCase {
  private Game g;

  // Runs before every test method, so each test starts with a fresh Game.
  protected void setUp() throws Exception {
    g = new Game();
  }

  // Helper: roll the same number of pins n times.
  private void rollMany(int n, int pins) {
    for (int i = 0; i < n; i++) {
      g.roll(pins);
    }
  }

  public void testGutterGame() throws Exception {
    rollMany(20, 0);
    assertEquals(0, g.score());
    assertTrue(g.isComplete());
  }

  public void testAllOnes() throws Exception {
    rollMany(20, 1);
    assertEquals(20, g.score());
    assertTrue(g.isComplete());
  }
}
```
This is pretty typical for a Java unit test. The setUp function builds the Game object, and then the various test functions make sure that it works in each different scenario. In Ruby, however, this might be expressed using RSpec as:
```ruby
require 'rubygems'
require_gem "rspec"
require 'game'

context "When a gutter game is rolled" do
  # The only place in this context where the game's state changes.
  setup do
    @g = Game.new
    20.times { @g.roll 0 }
  end

  specify "score should be zero" do
    @g.score.should == 0
  end

  specify "game should be complete" do
    @g.complete?.should_be true
  end
end

context "When all ones are rolled" do
  setup do
    @g = Game.new
    20.times { @g.roll 1 }
  end

  specify "score should be 20" do
    @g.score.should == 20
  end

  specify "game should be complete" do
    @g.complete?.should_be true
  end
end
```
At first blush the difference seems small. Indeed, the RSpec code might seem too verbose and fine-grained. At least that was my first impression when I first saw RSpec. However, having used it now for several months I have a different reaction.
First, let’s look at the semantic differences. In JUnit you have TestCase derivatives and test functions. Each TestCase derivative has a setUp and tearDown function, and a suite of test functions. In RSpec you have what appears to be an extra layer. You have the test script, which is composed of context blocks. The contexts have setup, teardown, and specify blocks.
At first you might think that the RSpec context block corresponds to the Java TestCase derivative, since they are semantically equivalent. However, Java throws something of a curve at us by only allowing one public class per file. So from an organizational point of view there is a stronger equivalence between the TestCase derivative and the whole RSpec test script.
This might seem petty. After all, I can write Java code that is semantically equivalent to the RSpec code simply by creating two TestCase derivatives in two different files. But separating those two test cases into two different files makes a big difference to me. It breaks apart things that otherwise want to stay together.
Now it’s true that I could keep the TestCase derivatives in the same file by making them package scope, and manually put them into a public TestSuite class. But who wants to do that? After all, my IDE is nice enough to find and execute all the public TestCase derivatives, which completely eliminates the need for me to build suites, at least at first.
Note: The JDave tool provides BDD syntax for Java.
Again, this might seem petty; and if that were the only benefit to the RSpec syntax I would agree. But it’s not the only benefit.
Strange though it may seem, the next benefit is the strings that describe the context and specify blocks. At first I thought these strings were just noise, like the strings in the JUnit assert functions. I seldom, if ever, use the JUnit assert strings, so why would I use the context and specify strings? But over the last few weeks I have come to find that, unlike the JUnit assert strings, the RSpec strings put a subtle force on me to create better test designs.
Stable State: An Emergent Rule.
When a spec fails, the message that gets printed is the concatenation of the context string and the specify string. For example: 'When a gutter game is rolled game should be complete' FAILED. If you word the context and specify strings properly, these error messages make nice sentences. Since, in TDD, we almost always start out with our tests failing, I see these error messages a lot. So there is a pressure on me to word them well.
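The concatenation is simple enough to sketch. The helper below is hypothetical: it is not RSpec’s own code, and the name failure_message is made up for illustration. It only shows how the two strings join into the line quoted above.

```ruby
# Hypothetical helper, not part of RSpec: joins a context string and a
# specify string into the kind of failure line quoted above.
def failure_message(context_name, spec_name)
  "'#{context_name} #{spec_name}' FAILED"
end

puts failure_message("When a gutter game is rolled", "game should be complete")
```

Feed it a context string like "Gutter game" and a specify string like "test 2" and the result tells you nothing, which is where the pressure to word them well comes from.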
But by wording them well, I am constrained to obey a rule that JUnit never put pressure on me to obey. Indeed, I didn’t know it was a rule until I started using RSpec. I call this rule Stable State. It is:
Tests don’t change the state.
In other words, the functions that make assertions about the state of the system do not also change the state of the system. The state of the system is set up once in the setUp function, and then only interrogated by the test functions.
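Stripped of any framework, the rule might look like the sketch below. The Game class here is an assumption made for illustration: the post never shows its implementation, and this version simply sums pins and ignores spares and strikes. The gutter_game helper is likewise a made-up name standing in for a setup block.

```ruby
# Hypothetical, simplified Game: sums the pins and treats 20 rolls as a
# complete game. Spares and strikes are deliberately ignored.
class Game
  def initialize
    @rolls = []
  end

  def roll(pins)
    @rolls << pins
  end

  def score
    @rolls.sum
  end

  def complete?
    @rolls.size == 20
  end
end

# The one place where state changes (the role of the setup block).
def gutter_game
  game = Game.new
  20.times { game.roll 0 }
  game
end

# The checks only interrogate the state; none of them calls roll.
game = gutter_game
puts "score should be zero: #{game.score == 0}"
puts "game should be complete: #{game.complete?}"
```

Because neither check calls roll, they can run in any order, or be split into separate specify blocks, without affecting each other.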
If you look carefully at the specification of the Bowling Game you will see that the state of the Game is changed only by the setup block within the context blocks. The specify blocks simply interrogate and verify state. This is in stark contrast to the JUnit tests in which the test methods both change and verify the state of the Game.
If you don’t follow this rule it is hard to get the strings on the context and specify blocks to create error messages that read well. On the other hand, if you make sure that the specify blocks don’t change the state, then you can find simple sentences that describe each context and specify block. And so the subtle pressure of the strings has a significant impact on the structure of the tests.
I can’t claim to have discovered the pressure of these strings. Indeed, Dan North’s original article on the topic is captivating. However, I felt the pressure and came to the same conclusion he did well before I read his article, simply by using a tool inspired by his work.
The benefit of Stable State is that for each set of assertions there is one, and only one, place where the state of the system is changed. Moreover, the three-level structure provides natural places for groups of states, states, and asserts.
The demise of the One Assert rule.
There have been other rules like this before. One that circulated a few years back was:
One assert per test.
I never bought into this rule, and I still don’t. It seems arbitrary and inefficient. Why should I put each assert statement into its own test method when I can just as well put the assert statements into a single test method? Why should I prefer this:

```java
public void testGutterGameScoreIsZero() throws Exception {
  rollMany(20, 0);
  assertEquals(0, g.score());
}

public void testGutterGameIsComplete() throws Exception {
  rollMany(20, 0);
  assertTrue(g.isComplete());
}
```

over this:

```java
public void testGutterGame() throws Exception {
  rollMany(20, 0);
  assertEquals(0, g.score());
  assertTrue(g.isComplete());
}
```
I think the authors of the One Assert rule were trying to achieve the benefits of Stable State, but missed the mark. It’s as though they could smell the rule out there, but couldn’t quite pinpoint it.
The State Machine metaphor
When you follow the Stable State rule, your specifications (tests) become a description of a Finite State Machine. Each context block describes how to drive the system under test (SUT) to a given state, and then the specify blocks describe the attributes of that state.
Dan North calls this the Given-When-Then metaphor. Consider the following triplet:
Given a Bowling Game: When 20 gutter balls are rolled, Then the score should be zero and the game should be complete.
This triplet corresponds nicely to a row in a state transition table. Consider, for example, the subway turnstile state machine:
| Current State | Event | New State |
|---|---|---|
| Locked | coin | Unlocked |
| Unlocked | pass | Locked |
| Locked | pass | Alarm |
| Unlocked | coin | Unlocked |
We can read this as follows:
GIVEN we are in the Locked state, WHEN we get a coin event, THEN we should be in the Unlocked state.
GIVEN we are in the Unlocked state, WHEN we get a pass event, THEN we should be in the Locked state.
And so on for the remaining rows.
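To make the correspondence concrete, here is a small sketch that holds the turnstile table as data and reads one row at a time. The names TURNSTILE and next_state are assumptions chosen for illustration, not anything from RSpec or from the post.

```ruby
# The turnstile table as data: [current state, event] => new state.
TURNSTILE = {
  [:locked,   :coin] => :unlocked,
  [:unlocked, :pass] => :locked,
  [:locked,   :pass] => :alarm,
  [:unlocked, :coin] => :unlocked
}.freeze

# GIVEN a state, WHEN an event arrives, THEN return the new state.
# Hash#fetch raises KeyError for a pair the table does not list.
def next_state(state, event)
  TURNSTILE.fetch([state, event])
end

puts next_state(:locked, :coin)    # GIVEN Locked, WHEN coin, THEN Unlocked
puts next_state(:unlocked, :pass)  # GIVEN Unlocked, WHEN pass, THEN Locked
```

Each key plays the role of the Given and When, and each value is the Then.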
Describing a system as a finite state machine has certain benefits.
- We can enumerate the states and the events, and then make sure that every combination of state and event is handled properly.
- We can formalize the behavior of the system into a well-known tabular format that can be read and interpreted by machines.
  - I am, of course, thinking about FitNesse.
- There are well-known mechanisms for implementing finite state machines.
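The first benefit can be sketched in a few lines. The code below is an illustration only: ROWS and unhandled_pairs are hypothetical names, and the rows are copied from the turnstile table. It enumerates every state and event that appears in the table and lists the combinations that have no row.

```ruby
# The four rows of the turnstile table: [current state, event, new state].
ROWS = [
  [:locked,   :coin, :unlocked],
  [:unlocked, :pass, :locked],
  [:locked,   :pass, :alarm],
  [:unlocked, :coin, :unlocked]
].freeze

# Returns every [state, event] combination that has no row in the table.
def unhandled_pairs(rows)
  # Every state and event mentioned anywhere in the table.
  states  = rows.flat_map { |from, _event, to| [from, to] }.uniq
  events  = rows.map { |_from, event, _to| event }.uniq
  # The combinations the table actually covers.
  handled = rows.map { |from, event, _to| [from, event] }
  states.product(events) - handled
end

unhandled_pairs(ROWS).each do |state, event|
  puts "no row for #{state} + #{event}"
end
```

Run against the four rows above, this reports the two Alarm combinations, since the table as given never says what happens once the alarm is sounding.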
The point is that organizing the system description in terms of a finite state machine can have a profound impact on the system design and implementation.
The Butterfly Effect.
I find it remarkable that two dumb annoying little strings put a subtle pressure on me to adjust the style of my tests. That change in style eventually caused me to see the design and implementation of the system I was writing in a very new and interesting light.
I really like the Stable State idea, and I think you are probably right that One Assert Per Test really was about this but did not quite get there. However, if you are using BDD and following Stable State, does that not more or less make One Assert Per Test a given as well? In your example above you have several specify blocks, each with a single something.should specification. If they were all lumped together in one specify, writing the string for that would again be impossible. And I think TDD can work the same way, which is why I have always tried to abide by One Assert Per Test. I recently wrote about this in One Assert Per Test should come natural.
One nit I have with BDD style is the fact that the test comment string so closely reflects the code in the case. It feels like duplication.
When you’ve worked to be able to say @g.score.should == 0, it feels weird to have to write “score should be zero.” Granted, you write the string before the code, but it still looks odd after the fact. Makes you wonder whether a framework written in fluent style could generate the string.
Michael: I do not know about RSpec (have just tried it once), but Specter (a similar framework for .Net) does that. Here is an example (Specter specs are written in Boo):
When this is run in a test runner the output is something like:
Chris – that looks pretty interesting.
Michael – while the duplication that you are describing is easy to end up with, there really is plenty of flexibility in what you write. You could, for example, say:
or
One thing we’re working on is some means of nesting contexts and/or specifications. So you could do something like this:
Not saying these are “right”, just that they become possible. Note how the last example begins to feel more like a description of behaviour – not just because of the word, but because of the nesting structure. An x should do y under conditions a,b,c and d.
Food for thought.
More thoughts – the process through which specs evolve is going to have an impact on what goes in the name and what goes in the code. “should score 0” might have come from a customer who said “this is what the score should be for an all gutter game”. That name then serves as definition up front, documentation later.
Recently I started using RSpec for the first time. Being used to Assert.*, I’ve always been sceptical about the “pseudo-natural-language-style” expressions. I still don’t see that much of a difference (from a programmer’s point of view) whether I write “Assert.Equals(expected, actual)” or “actual.should == expected”.
But what I absolutely love about RSpec is the way it makes me think about what I want to code. In the “normal” TDD way I’ve always been more focused on the design of the class currently under test, making me lose focus on what I really wanted to do. BDD forces me to focus more on application behaviour and helps me stay on track. And just like you, I think this is mainly because of the context/specification style of writing tests.
With TDD the test method names often technically describe the class under test and how it is to be used. In BDD style the test methods (=specifications) describe what this part of my code is good for after all. The technical stuff then moves to the body of the specification.
I’ve now even started to write my NUnit tests in BDD style, which works pretty well. Each TestFixture is a context (defined in SetUp) and each Test is a specification. Of course this doesn’t give me the nice failure messages that RSpec produces, but it seems to work too. I think David Chelimsky somewhere said something like “BDD is doing TDD the right way.”
Regarding the “Stable State” thing – I try to follow this rule, but sometimes it just doesn’t seem to fit and I break it. Maybe a sign, that I haven’t fully adopted the BDD-style yet.
Hmmmmmmmmmmm…
I’m with you on the “One Assert rule”. Recently, I’ve been working on some code which is state-based, so I wind up with:
where it’s necessary to have changing state, because the interesting bit is that there can be many SKIP situations interleaved, BUT we have to remember where we were before the SKIP, to wind up in the right state at the end.
So, the question is, in Specs-land would this have to become three complete Contexts? That seems unfortunate. Maybe there is a better way to do it?
State is a slippery thing. Uncle Bob’s bowling game, as presented, is stateful, but imagine a different problem: you need to create a command object which accepts an array of throws and returns the score. The object doesn’t have mutable state, so technically no spec would alter state, but without a need for setup you could easily end up without all of the contexts that Bob found.
Seems that the benefit that BDD provides in this context lessens to the degree that you move toward less stateful objects, and that seems to happen among people who do a lot of interaction-style TDD.
I think the intent behind the “One Assert Per Test” rule was to get you thinking of “TestCases” as “fixtures” instead - wherein setUp() puts the system or a slice thereof in a specific state, and the testX() methods each contain one assertion about the state of the system that should hold if the system is working correctly. So in the first code listing under “The Demise of the ‘One-Assert’ Rule”, you’d factor out the multiple rollMany(20,0) calls into a setUp() - just like the RSpec specification for the gutter game does. It may very well be that the “tests” for a particular class, then, get spread out across many fixtures, instead of feeling like you have to place all tests for a given class in the same TestCase derivative. The fixture names corresponding to system states, not classes. At least that’s how I’ve read it.
It’s interesting to see that RSpec facilitates thinking about aspects of the system being developed in terms of those system states, and not so much a one-spec-per-class mindset.
Michael: RSpec includes a great mocking and stubbing framework (derived from SchMock, but now more closely resembling Mocha).
Myself, and many of the others working with RSpec hardly use state-based specifications at all. I work almost exclusively with mocks in RSpec in most of the projects I use it.
Uncle Bob: In an interview I did with you for the (now defunct) objectmonkey.com site, didn’t you tell me that you didn’t care if TDD was like formal specification? Have you changed your mind since then?
After reading this article, I’m wondering…. Could we use jRuby to write RSpec tests against java code? Damn Uncle Bob, I’m going to lose another weekend because of you. ;-)
rspec in jRuby… now that’s an interesting thought…
Jason, You’ll have to refresh my memory about that interview and the context of it. I’ve been making the “formal document” argument for at least five years.
Alas, I don’t have the original manuscript to hand, but from memory I think I asked you if TDD was formal specification by the back door, and I distinctly recall you saying you didn’t care if it was. The title of the interview was “Getting Sh*t Done”, if that helps establish the context :-)
That’s not to say I don’t totally agree with what you’re saying now. I think any movement towards higher integrity specs – executable specs – is progress. It all sounds very familiar to me – I think I’ve been doing BDD right from the get go since I started doing TDD – so I’m bound to draw the comparison now.
Courtesy of the WaybackWhen web cache (interview from 2003, I think):
ObjectMonkey: Here’s a hot potato for you – is Test-driven Development really Formal Methods in disguise?
Uncle Bob: Test-driven development is the most profound and auspicious thing to happen to the software industry since I’ve been a programmer. I think it’s even more important than OO.
ObjectMonkey: I’m inclined to agree.
Uncle Bob: Nothing has had such a profound effect upon the way I write code than TDD. Nothing. When I write code now, I run tests every few minutes. My stuff is always working. I never have windows all over my screen with modules torn apart, hoping I can one day piece them back together again. Every minute or two I run tests, and get my stuff working. I don’t use debuggers anymore. Debuggers are a drug. You get addicted to them. They drag you down a rat hole. You spin and spin, trying to set your breakpoints, trying to follow the logic, trying to figure out what the hell is going on. With TDD, that all but goes away. I haven’t used a debugger in anger in over three years. And I chide anyone I see who is using one. So I don’t care whether there is a link between TDD and FM. TDD is a great boon to me, and to software in general.