Twitter Does Not Allow For Nuance

Posted by Brett Schuchert Mon, 13 Apr 2009 03:26:00 GMT

If you have something deep to say, 140 characters is not going to cut it very often. And sometimes, when it does, it’s almost too opaque to be grasped by anybody who doesn’t already grok it.

Here’s one example:

Polymorphism – Same request different response.

That’s the essence of polymorphism, which can help with the SRP and the OCP. That one happens to fit into a single bullet of < 140 characters. However, there’s a lot there. In fact, while that could be the definition, it has many ramifications and the context matters. So this probably falls in the opaque category.
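To unpack that one-liner a little, here is a minimal sketch in Python. The `Shape`, `Circle`, and `Square` names are purely illustrative assumptions and do not come from the tweet. The caller makes the same request (`describe()`) of every object, and each object answers in its own way.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical base type; every name in this sketch is illustrative."""

    @abstractmethod
    def describe(self) -> str:
        ...


class Circle(Shape):
    def describe(self) -> str:
        return "I am round"


class Square(Shape):
    def describe(self) -> str:
        return "I have four equal sides"


def ask(shapes):
    # The caller sends the same request to every object and does not
    # need to know which concrete type is answering.
    return [shape.describe() for shape in shapes]


responses = ask([Circle(), Square()])
```

Adding a new kind of shape would not require touching `ask`, which is roughly where the OCP connection comes in.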

A few days ago I made a pithy statement on twitter:

Test is Definition (TDD), therefore code w/o Test, not defined. Therefore, it is broken (or never wrong, take your pick).

I wanted to make that fit in a single tweet. I forgot to put in to whom I was replying (and now I cannot remember[sorry]). Anyway, I got the following replies:

@dws – Test is a form of definition. It’s a very good form, but it’s not the only one.

@jamesmarcusbach – If test is definition, then * must be the same as +, because 2+2=4 & 2 * 2=4. No, a test is just an event.

@ecomba I love this statement!

I wanted to reply with a little more length, so I figured this was a good place to do it.

To @ecomba, thank you. I’m assuming your reply was in response to that statement, but if not, then thanks anyway!-)

To @jamesmarcusbach – I do not agree. When I assess “Test is Definition (TDD)”, that suggests to me that there are many, many unit tests (and even acceptance tests, load tests, smoke tests, manual tests, exploratory tests, debugging, ...). The union of all of those tests forms the definition. You’ve picked one example and erroneously extrapolated from it to discount a tweet, and I don’t think you did it. (And to be clear, I strongly prefer certain forms of tests over others.)

I also do not agree with your interpretation of testing as an event. At the very least it is a process. Even more so, it is a continuous process that is only done when the project is done, which is when the customer stops paying. So I think you and I are using the same 4 letters (test) in very different ways. I suspect, however, that I’ve committed the same error in interpreting your use of the word event as you’ve interpreted my use of the word test.

And finally, I don’t agree with your example for many reasons, two of which are:

- I mentioned TDD. I don’t check the compiler very often, so I won’t be testing 2+2 or 2*2.
- However, if I were, I would not pick only your two examples. I’d have many, trying to capture all of the equivalence classes.
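A rough sketch of that second point, in Python. The partition into equivalence classes below is an assumption made for illustration, not a claim about how such a suite should really be designed. The single example 2, 2 cannot tell + from *, but a handful of representatives from other classes can.

```python
import operator

# Representative inputs, roughly one per equivalence class (an assumed,
# illustrative partition; a real suite would derive classes from the domain).
CASES = [
    (2, 2, 4),    # the single example from the tweet: + and * agree here
    (0, 5, 5),    # zero operand
    (1, 7, 8),    # operand of one
    (-3, 3, 0),   # mixed signs
    (10, 4, 14),  # ordinary positive operands
]


def failing_cases(candidate):
    """Return the cases where candidate(a, b) differs from the expected sum."""
    return [(a, b, want) for a, b, want in CASES if candidate(a, b) != want]


add_failures = failing_cases(operator.add)  # addition satisfies every case
mul_failures = failing_cases(operator.mul)  # multiplication passes only 2, 2
```

None of this proves that a candidate is addition; it only shows that a wider sample separates two operations that one example cannot.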

To @dws – Sure, test is a form of definition. And I also agree that it is not the only form. I never said it was the only form (I think that came from you). As for not including the word form in my tweet, I did include TDD. Does that not invoke a large context, part of which is that TDD can possibly be a form of definition? Of course, saying Test is Definition is actually a metaphor, right?

OK, having replied with > 140 characters, I’m going to restate the tweet. Since the restatement is longer than 140 characters it will have the luxury of being wrong in many more ways than the original.

Test is one form of Definition (TDD). If you do not have any other form [the context of the original tweet I believe, to which I was responding but mistakenly forgot to include the @..] (e.g., some requirements specification or a verbal agreement with some sales person), then the tests are one good definition of what is/is not correct. If we go with the definition of our system in terms of the tests, then where there are no tests, there is no definition and therefore any behavior is OK. Sure you can argue, the system should not crash when a user enters a character into a field that expects numbers, but really, if that behavior is not defined, saying “it should not do that because of common sense” really is saying “Well I assumed it would not do that, you violated my assumption therefore I will prove you wrong in a battle royal.” Even more, it’s great because when it happens, the user will be so mad that s/he will make sure to let you know your definition of the system is incorrect. You can respond by writing another test to improve the fidelity of your understanding of the system.

This last part is really a tip of the hat to Jerry Weinberg who said (and I paraphrase probably incorrectly):

If there are no requirements, any solution will do.

Of course, he was probably referring to Alice in Wonderland…

There’s a lot more to this subject. For example, I don’t believe in proving systems correct. Why? Even if you’ve proven that your system conforms 100% to your formal specification, there’s no reasonable way to prove:

Your formal description is complete (yes you can for simple data structures, BFD, don’t care about formally proving a simple data structure).
For any complex system described in terms of a formal language, the original inception of the system was in natural language. Prove the transformation, and then prove the natural language specification is correct/complete.
I’m a big fan of Gödel’s incompleteness theorem. It is related to this at least loosely because.

By logical extension, you can pick up from this that I’m not a big fan of using formal languages like UML to build my system “from the diagrams.”

So in conclusion I like my original statement. I understand that it is not literally true or “The Truth.” It’s a way of thinking about things. I think the feedback was valuable because communication is ambiguous.

I could have been more clear. Maybe I should not have tried to even express the idea on Twitter. By throwing something out there, it gets refined through feedback and there’s a better understanding to be had. At some point we can create some pithy statement that has all of the meaning and none of the meaning at the same time. When we’ve done that, we get to start all over again.

Maybe the statement is simply wrong. At the very least it has an agenda. Question is, does it help, hinder or simply represent a single drop in an infinite bucket?

Flame on. I deserve it.

Posted in Schuchert's Scattered Synapses
Tags Gödel, incompleteness, s, TDD, test, theorem, twitter

Comments

James Bach about 3 hours later:
Hi Brett,

It is possible that I’ve misunderstood you. I recognize that. I’ll go ahead and critique what I think you are saying, and if it turns out I misconstrued your meaning, we’ll have a big laugh on me.

Anyway…

Are you familiar with the halting problem? Alan Turing proved that there is no way to calculate whether or when a program will halt. (http://en.wikipedia.org/wiki/Halting_problem) I presume you are wise in the ways of software engineering, so at this point you must be saying to yourself “Of course I am familiar with the halting problem! It’s BASIC COMPUTER SCIENCE!”

I want you to feel that feeling for a moment. Then breathe and let it out. Relax.

Okay, now imagine what I’m feeling when your opinion about testing goes against the most basic tenet of testing theory, which is related to the halting problem: complete testing is impossible. This is easy to demonstrate and has been demonstrated in many texts at many times, so I won’t bother to do so, in this comment.

When you conflate “definition” with “test” you seem to be eliminating the incompleteness problem from your test process. You have, actually, DEFINED it away, by declaring, in effect, that your tests are complete when you say they are complete, and being complete they comprise a perfect definition of the intended product. Green light: good. Red light: Bad.

That is not a testing process, that is a deciding process. What worries me is that it’s a deciding process that rules out deliberation based on an understanding of the evidence gathered through testing. There is instead a presumption of PERFECT evidence.

In so doing you are playing Descartes to my Hume. Aristotle to my Bacon. You are confusing a demonstration with an experiment, and neglecting the last 400 years of development of fallibilist science (the view of science that says we can never know any natural fact for sure, because the very next experiment may come out differently). Scientists know that experiments are not and can never be proofs.

A definition can be timeless and complete (within its scope). A test cannot be. We need definitions to be different than tests (with one pathological exception*) because definitions establish a (potentially infinite) region of acceptable or unacceptable behavior, whereas a test can only be a data point (or a set of points). No finite collection of points is equivalent to an infinite collection, obviously.

The interaction between thinking about definitions (specifications, qualities, or capabilities) thinking about risks, and thinking about powerful (though limited) tests, gives you a great chance at producing a fabulous product.

When you fall asleep to the very limited nature of tests, you sow a field of folly, from which you will reap richly surprise after surprise, such as “Oh! I thought my product worked, but there’s a surprising bug in it! And another one! But my product works BY DEFINITION! O Woe that the ‘definition’ of my product was only as strong as the few tests I happened to think of.”

You may experience such surprises even without falling asleep, but falling asleep will make it worse.

* The pathological exception mentioned above is this: if your testing is also the ONLY use of the product (such as in a one-time data conversion project), then you can call your tests the definition of success.

Dave Smith (@dws) about 3 hours later:

“Test is Definition (TDD), therefore code w/o Test, not defined. “

Having missed the tweet to which you were responding, and having only your tweet for context, I don’t think it’s much of a stretch to read that as implying that Test is the only form of definition. Thanks for clarifying that this wasn’t your intent.

As to whether TDD (I assume you mean Test Driven Design/Development) can form the basis of definitions for a system, I think the answer is Yes, with the very big caveat that most artifacts from TDD are inaccessible to non-developers. If you encode that agreement with the sales person in xUnit tests and code, it’s very unlikely that the sales person can validate definitions without seeing a running system.

If you do TDD from the top-down starting with acceptance tests encoded in something like FIT, you stand a chance of getting early feedback on whether non-development stakeholders agree with the definitions. At least I hope that’s the case—I’ve yet to see it work well in practice.
Keith Braithwaite about 4 hours later:

@James touches on a point that I’m finding more and more interesting. I started my programming career in the world of ∀’s and ∃’s and that does work, although, I have learned, at far too high a cost for most settings. The way is open there to certainty, but few are willing to pay the price.

On the other hand, how often do we need to be certain, really? I think that systems programmers have a strong need for certainty, as they do not know in which regions of its configuration space their code will be used. Application programmers, though, often do have a very good idea of those regions. And application programmers are the majority of programmers.

If I know, for a simple example, that a certain integer value d is a day number within a year and I know the calendar convention used means that d ∈ [1, 365] then I think I’m at liberty to write code without worrying too much what it will do in the case d = 1024. (If the callers are not under my control then I probably need a guard of some sort, but then I’m done.) It’s tractable to produce 367 test cases, but probably unnecessary, if for further example I know that what I’m doing with that day number is working out what quarter it is in.
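As a minimal sketch of the day-number example (in Python, and assuming a 365-day year with calendar quarters ending on days 90, 181, 273 and 365, which the comment does not actually specify), the guard and a few boundary representatives per equivalence class might look like this. The names `quarter_of`, `QUARTER_ENDS` and `EXAMPLES` are hypothetical.

```python
# Last day number of each quarter in a 365-day year. This is an assumption:
# the comment does not say which calendar convention is in use.
QUARTER_ENDS = (90, 181, 273, 365)


def quarter_of(day: int) -> int:
    """Map a day number in [1, 365] to its quarter, 1 through 4."""
    if not 1 <= day <= 365:
        # The guard for callers that are not under our control.
        raise ValueError(f"day number out of range: {day}")
    for quarter, last_day in enumerate(QUARTER_ENDS, start=1):
        if day <= last_day:
            return quarter
    raise AssertionError("unreachable: day was already range-checked")


# Boundary representatives of each equivalence class, not 367 separate cases.
EXAMPLES = {1: 1, 90: 1, 91: 2, 181: 2, 182: 3, 273: 3, 274: 4, 365: 4}
results = {day: quarter_of(day) for day in EXAMPLES}
```

Eight examples plus the guard stand in for the whole range here; whether that is enough evidence depends on how much the quarter calculation matters.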

I’m taking the view now that I construct tests from representative examples of required behaviour organised by equivalence classes in the problem space as modelled by my technology. I interpret test results as (Bayesian) evidence for correctness. Evidence, not proof. And the more strongly I need evidence the more intensively I mine for examples, up to a threshold where it becomes more economical to use other methods. This is in line with the risk-based validation approach that’s increasingly popular in the life-critical world.
Phil Booth about 9 hours later:

@James Bach: So what are you putting forward as a better definition than the tests? Formal specification languages such as Z? UML diagrams? Natural language in the form of lengthy specification and design documents? Can you argue that any of those will be a complete definition in the way that you say tests are not?

Also, tests have the very important property of being executable. This leads to all of the whizz-bang benefits of continuous integration, post-commit hooks, automated builds and flashing red lights that go off when someone makes changes that break our definition of the software, i.e. the tests. I’d like to see you do that with a UML diagram.

And so what if the tests aren’t “complete” anyway? Is it not okay for them to be just “good enough”? Then when a bug is found, new tests get written, the bug gets fixed and our definition of the system becomes a little bit better than it was before.
Greg about 10 hours later:

I agree with Phil. UML and the tools surrounding it (think Rational Rose) have always headed towards code generation, and the glorious idea of round-trip engineering. However, they never quite worked well enough (at least for me). So I think TDD, JUnit and TestNG really turned a corner and made specs much more concrete, much faster and with bigger ROI than UML.

Last week I had a meeting with my users. They need a new feature, and put together a slide show with precise numbers and text on what they need their system to do. I have translated it into testable code. I showed them the test code, so they could realize it’s not that hard to read. I have already figured out what can and cannot work with the current system, thereby shedding light on what changes the system needs to meet their needs.

Another nice side effect: I happen to know some of their other needs and what holes exist in their scenario, i.e. I know what questions to ask them for clarification. All thanks to runnable specs.

Regarding Grade A proofs and Turing’s thesis, I think the analogous equivalent to solving a non-linear, differential equation will do nicely: just use a numerical solving mechanism that gets epsilon into an acceptable range and forgo the closed-form solution that may not even exist.
Keith Braithwaite about 11 hours later:

Dave Smith wrote

If you do TDD from the top-down starting with acceptance tests encoded in something like FIT, you stand a chance of getting early feedback on whether non-development stakeholders agree with the definitions. At least I hope that’s the case—I’ve yet to see it work well in practice.

I think I have seen it work well in practice, as described in this presentation [pdf]. And see this other one [pdf] for part of why I think it works so well.
Brett L. Schuchert about 13 hours later:

Well I went back several days and I cannot find the original tweet to which I was responding. In retrospect, however, that’s really not relevant.

I now understand where James is coming from. And in fact, my original tweet exposes an underlying belief on my part. I am not interested in proving anything correct, nor am I concerned too much with the halting problem. I understand its relevance to theory. However, I’m worried about trying to find something of value to an end user. I know my end user is going to change his/her mind because we all do. So I want some kind of living description of the system. As the system grows, I want that description to age well. Executable descriptions grow better. I see the practice of creating large, fast, automated testing suites (with different kinds of tests) as an effective way to address change.

HOWEVER, I get that making a blanket statement without context can certainly hit a chord with people. I also get that many words have loaded meaning and by making such a blanket statement I’ve sprung into that trap.

So James Bach is right.

And, as is often the case, in my making a blanket statement that is incorrect without context, I’ve learned quite a bit.

In my original response, I was thinking in terms of specification. That in an absence of any specification, automated tests are good enough and in fact much better than most other attempts at specification. I don’t think that conflicts with anything James said. (If it does, I’ll learn more still.)

Back to that fundamental belief and its relationship to James’s response. Of all of the philosophies I’ve studied (the standard set in basic philosophy classes), I really preferred Hume. I took away from Hume the idea that you cannot prove A causes B even if you have a large number of experiments showing such a relationship. That B follows A time after time does not prove that for all time B will follow A.

To me, I took away that proving anything is a faulty notion. It hit a chord with me because when I took discrete mathematics, I was taught techniques for proving a program is “correct.” That part of discrete mathematics was, to me, silly. A simple proof was crazy long and there was still the semantic gap between natural language and formal language. With no way to verify that translation, the proof was/is useless.

I generally disagreed with most of what Descartes had to say. Even the iconic “Cogito, ergo sum” – I think, therefore I am, is suspect. How about: “Something thinks, therefore something exists” – that’s about as much as I can agree to. And don’t get me started on his “proof” of the existence of God. It was basically: we can conceive of a perfect being, therefore it must exist (highly summarized). What utter hogwash. That two people can fundamentally agree on the definition of a single word, let alone one so loaded as “perfect” is insanity. So if we cannot agree on the full meaning of a single word, then how do we communicate at all? Lots of redundancy, and really we don’t exactly communicate very often.

This original interaction is a great example of that. I used a loaded word, in an unspecified context, and it stirred up some stuff.

Of late I’ve been wanting to add the word “automated” before the word “test” every time I use it because while I think it, other people do not.

Now I wonder if I should follow the BDD crowd and stop using the word test because of its loaded meaning.

So what am I left with? Automated System Specification? ASS. I like ASS!-)
Michael Feathers about 17 hours later:

I notice this trend: it seems that when we notice that something is true or false in general with a specific context, it takes us years or decades to notice that by being more specific or constraining the context, we can have the thing that we want. When I look back at the most important things which have happened over the past 10 or 15 years, most of them have had that quality. Our search for absolutes blinds us.
James Bach about 18 hours later:

Hey Phil,

Definition comes PRIOR to the test. It’s the thought that shapes the test and makes it meaningful. When we test, we are sampling from a very large space of data and behavior. To say that a test DEFINES the product is to deny that we are sampling and to conflate the entire product with the sample.

If someone played with a product of yours for a few minutes and it crashed on them, does that mean your product can’t work? Will it always crash? On every machine? Wouldn’t you cry foul if a reviewer immediately stopped using your product and declared it a piece of junk? Wouldn’t you consider it a mistake for them to DEFINE your product entirely by its apparent behavior on one machine at one time with one set of inputs?

One way you know that definition and testing are different is by noticing what meaning you make from testing. When you execute your tests with one set of inputs at one time and you get one thing you expect as output, you make a leap of inference that the product will do something LIKE THAT on other computers with other inputs. The definition of a product is whatever model or thought (even in your head) allows you to say “this product will perform in a manner LIKE what I see here in my tests.” Definitions are generalizations. Tests are specific. The definition is your idea of what the product should do. A test is a demonstration of what the product just did in a specific case. The tests, you hope, are covering that idea well enough to support the inference that your product is good enough to be used by people who are doing something other than your tests.

If you fail to appreciate the leap of inference you must make between your observations (tests) and your model of your ideal product (definition) so as to decide that the actual product is good enough, then I fear for your users. It is deeply irresponsible to simply assume that your tests are EXACTLY EQUAL TO the usage of the product, instead of being a shadow or representation thereof.

Hence 2+2=4 does not define addition. Neither would a million more similar tests taken together. David Hume proved that there is no logical basis for that sort of induction hundreds of years ago.
Michael Bolton about 18 hours later:

@Brett if that behavior is not defined, saying “it should not do that because of common sense” really is saying “Well I assumed it would not do that, you violated my assumption therefore I will prove you wrong in a battle royal.”

Or we’ll just negotiate, which would be the Agile way to handle it, would it not? You know, responding to change, vs. following a plan? Customer collaboration vs. negotiated contracts?

Even more, it’s great because when it happens, the user will be so mad that s/he will make sure to let you know your definition of the system is incorrect. You can respond by writing another test to improved the fidelity of your understanding of the system…This last part is really a tip of the hat to Jerry Weinberg who said (and I paraphrase probably incorrectly):

If there are no requirements, any solution will do.

Actually, what Jerry says is, “if you don’t care about quality, you can meet any other requirement” (and several variations on it, all named as Zeroth Laws). But I don’t think that Jerry would be behind the idea that TDD tests were the requirements; rather, he’d point out in a big hurry that at best, they might be requirements documents, and that the map is not the territory.

He’d also, I believe, question the approach of setting things up such that your understanding of what you’re writing is abysmal, but improves when the customer gets mad. Hardly an issue, since that’s not what you do anyway—is it?

@Phil Booth: @James Bach: So what are you putting forward as a better definition than the tests? Formal specification languages such as Z? UML diagrams? Natural language in the form of lengthy specification and design documents? Can you argue that any of those will be a complete definition in the way that you say tests are not?

I suspect you don’t know James very well. :) A better definition than “tests” is “a multitude of models and representations, some of which might be tests”.

You might want to have a look at 50 Deployments a Day and the Perpetual Beta (and the subsequent comments on it, discussed in later blog posts) to see how “the whizz-bang benefits of continuous integration, post-commit hooks, automated builds and flashing red lights” can turn narcotic. Continuous deployment works great for IMVU and ensures there are no problems because of a tautology: if a problem isn’t revealed by their automated tests, it’s not a problem. Yet I’d argue that the complaints from the customers about missing credits, hacked accounts, and non-functioning aspects of the product represent problems; that they claim 30,000,000 registered users but only 60,000 people online at any given time at least suggests some kind of a problem.

And so what if the tests aren’t “complete” anyway? Is it not okay for them to be just “good enough”?

It’s entirely okay for TDD tests not to be “complete” and to be “good enough”; that’s reality. Like all tests, TDD tests are designed to reveal information that informs a decision about the product. Just as Michael Feathers suggests above, the tests pass or fail with respect to a specific context. Can you imagine a context in which someone might use your product in a way that is not covered by your tests? Can you imagine a circumstance in which a description might be less expensive and more valuable than an automated test? Of course you can.

—-Michael B.
Brett L. Schuchert about 21 hours later:
@James Bach wrote:

Definition comes PRIOR to the test. It’s the thought that shapes the test and makes it meaningful. When we test, we are sampling from a very large space of data and behavior.

As I read that I’m reminded of something I always ask when teaching TDD. I ask “Is Testing about guaranteeing quality?” or some variation on that idea. I try to determine if people think of test/Test/testing as a way to guarantee quality, prove correctness, ... I want to get a reading on what the group thinks. (Answer, none of the above.)

After typical hemming and hawing, I bring up the idea of sampling and risk. Individual tests attempt to determine the likelihood that some small part of the system is going to work as desired. It does not guarantee it, but it certainly reduces the risk.

Then there’s the whole argument of test focus. If I need to show a -> d, then if I can show a -> b and b -> c and c -> d in separate tests, then I can probably assume a -> d is also OK. This particular issue can be hard for developers to accept. If you don’t accept that, then you end up with large tests. Large tests, those that cover many different checks/assertions, tend to be more fragile and therefore often cost much more than they are worth.
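A minimal sketch of that decomposition, in Python. The three-step pricing pipeline and all of its function names are hypothetical, invented only to give a, b, c and d something concrete to stand for.

```python
# Hypothetical three-step pipeline standing in for a -> b -> c -> d.
def parse(text: str) -> int:
    """a -> b: turn raw text into a price in cents."""
    return int(text.strip())


def apply_discount(cents: int) -> int:
    """b -> c: take 10% off, using integer arithmetic."""
    return cents * 90 // 100


def render(cents: int) -> str:
    """c -> d: format cents as a dollar label."""
    return f"${cents // 100}.{cents % 100:02d}"


def price_label(text: str) -> str:
    """a -> d: nothing more than the three steps composed."""
    return render(apply_discount(parse(text)))


# Small, focused checks, one per step, instead of one large end-to-end test.
step_checks = [
    parse(" 1000 ") == 1000,
    apply_discount(1000) == 900,
    render(900) == "$9.00",
]
```

Each check can fail for only one reason, which is what keeps it cheap. The inference that `price_label` also works is still an inference, which ties back to the sampling point above.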
So when I put the two ideas together:
- Test is Definition
- Test is Sampling
Both of which I’ve said, that clearly means that I make heavy use of an implied context to interpret the particular form of the word test I am using.