On INEFFECTIVENESS OF TESTING vs EFFECTIVENESS OF INSPECTIONS

Phil Armour made this post against an old thread, and since it talks about cognitive factors in software, a topic that interests me, I thought I'd start a new thread in reply to it.

I have a suspicion this (i.e. studies showing that inspections are more effective than testing) may be both highly situational and a somewhat invalid conclusion, even based on the original data.

I'm a psychologist by background, and I know that clear and pronounced results in any kind of behavioural study are quite unusual. Even so, there have been enough studies with clear results to show that inspections are both a very effective way of removing defects from software and much more effective than testing. I have never seen a study that shows testing to be more effective than inspections, and I know from personal experience that inspections are very effective. So unless you can present new evidence, your opinion is not significant.

(b) The effectiveness of testing is closely related to the setup (and inspection) of the test cases and expected results. Someone (Pressman? Myers?) stated that the act of creating a good test is more effective at detecting defects than running it. So the two mechanisms are quite closely coupled.

Agreed! Inspection of test cases is a good practice.
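
To make (b) concrete, here is a minimal sketch in Python (pytest-style asserts; the leap-year function and its cases are my invention, not drawn from any study). The point is Myers': merely writing down the expected results forces the boundary questions that an inspection of the test cases would also raise.

    # A leap-year checker and its test; hypothetical example, not from the thread.
    def leap_year(year: int) -> bool:
        """Gregorian leap-year rule -- the unit under test."""
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def test_leap_year():
        # Enumerating expected results raises the same boundary questions an
        # inspection of the test cases would ask: what about century years?
        assert leap_year(1996) is True    # ordinary leap year
        assert leap_year(1900) is False   # century year: NOT a leap year
        assert leap_year(2000) is True    # the 400-year exception
        assert leap_year(2003) is False   # ordinary common year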

(c) Testing is (IMHO) the only aspect of software development that truly acknowledges that the real job is to find out things we don't know. And particularly the things we don't know we don't know (what I call "Second Order Ignorance"). It seems to me that all other aspects of development embody a tacit assumption that our job is to translate what we know (the application of our "Zeroth Order Ignorance"). In no other part of software development is the exposure of ignorance truly acceptable. Can you, for instance, imagine a manager saying "Bob, you've done a great job on the design of this system--look at all the things that don't work!"

Testing for things you don't know about is an impressive epistemological breakthrough. </sarcasm>

If your point is that formulating tests helps in discovery, then I would agree, but the discussion is about relative effectiveness. Is testing more effective than other practices, such as domain modelling? If you have any evidence that testing is more effective, I would be interested in seeing it. Otherwise I would anticipate that domain modelling (for example) is a far more effective means of discovery.

The term "Paradigm Shift" was popularized by Thomas Kuhn, but it's a well known and easily provable aspect of cognition. I wholeheartedly agree with Mr. Anderson that one is coming in software. The problem with such revolutions is always (a) it's really hard to see what it is because it isn't here yet (b) it's even harder to see what will happen as a result of it because of (a).

I don't have Kuhn's book to hand, but I have read it more than once, and as I recall he says we can't predict paradigm shifts (although we can recognize one when it's under way). It's a variant on the future knowledge problem.

Acquiring new ways of understanding is an interesting subject, and while Kuhn does not cover it, it happens to people all the time, as individuals rather than collectively (the collective was Kuhn's concern). I recall when it happened to me over inspections. I had never used them previously and was sceptical of their value, to say the least. After doing a couple, the light bulb went on: I saw that they worked, and in ways I hadn't anticipated. For example, having to articulate why they designed or coded something a particular way often makes people realize it is in fact wrong.

The Agile movement in general has made some inroads in a couple of these areas, most specifically in the sociology of development, which has been astonishingly lacking in almost every development trend until very recently.

Certain parts of the Agile movement are notable for their capacity to ignore what has been well understood for a long time, including human and social factors. Agile simply takes an existing tendency in software development, namely ignoring what has been learned previously, and pushes it to the 'extreme'.

I don't claim to be immune to this tendency; I recall being embarrassed to discover I was completely ignorant of studies going back 30 years showing the effectiveness of inspections.

More fundamentally, Kuhn's point was that new paradigms replace existing paradigms when (and only when) they are clearly epistemologically and empirically superior. On the empirical side, I have yet to see *any* data that shows the superiority of XP, for example. On the epistemological side, XP in particular doesn't have enough intellectual coherence even to apply this criterion; it's just a collection of practices, with no underlying rationale that I can determine. Nor is there any empirical justification for XP's practices being superior to other practices, inspections and domain modelling for example.

So is Agile a paradigm? I would conclude (certainly in its XP form) it's a rejection of the RUP/BDUF paradigm (industrial-scale software development) and a reversion to an earlier paradigm: software development as a small-scale craft activity. There is nothing wrong with this if RUP/BDUF has failed (I tend to agree that it has) and there is no acceptable replacement paradigm (something I would at least debate).

Finally, while in the FDD community we are primarily concerned with practising good software development, we could certainly put forward both a good empirical case for FDD's practices and a good intellectual rationalization for FDD as a whole, and it's probably something we should do.

phil


The Psych Aspect

Considering the psych aspect of this only (I replied on the other thread as to the meat of the content), I wonder if something like the availability heuristic is in play here. There's the obvious logic error of "well, we only test and don't inspect, and our testing finds lots of defects - therefore testing is effective at finding defects." But there's the more subtle, yet just as lethal, availability-heuristic kind of logic error.

From leshan at slashdot: "Example: The Wigetmobile is the best selling car in America because it's super-cheap and super-reliable, according to statistics. Your uncle says he drove his into a tree and it nearly killed him, so you don't buy it, because his vivid description of his near-death incident (probably on account of his own stupidity) "outweighs" statistical evidence that the product is good."
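
A back-of-the-envelope sketch in Python (invented numbers, purely for illustration) of the gap being described: the anecdote is a single observation, so the rational update from it is tiny, yet the felt update is huge.

    # Invented fleet statistics, for illustration only.
    reported_failures = 120
    fleet_size = 100_000

    base_rate = reported_failures / fleet_size
    print(f"statistical failure rate:    {base_rate:.4%}")      # 0.1200%

    # The uncle's crash adds exactly one observation to the sample.
    updated_rate = (reported_failures + 1) / (fleet_size + 1)
    print(f"rate including the anecdote: {updated_rate:.4%}")   # 0.1210%

    # The rational shift is about 0.001 percentage points; the availability
    # heuristic is what makes the one vivid case feel decisive instead.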


Jeff

On XP and heuristics

jeff

You have a point that XP developers, and software developers in general, are overly reliant on heuristics, both first and second hand. Others might call this 'superstitious behavior'.

In terms of the reliance you should place on knowledge, heuristics are the kind you should rely on least. Repeatable empirical studies, and theoretical predictions supported by experimental evidence, are much more important. But reliance on heuristics is characteristic of the 'return to craft' dimension of XP and its rejection of the (at least partially failed) 'scientific' (read: theoretical) approach of BDUF/RUP.

phil


I Never Said XP

You have a point that XP developers, and software developers in general, are overly reliant on heuristics, both first and second hand.

My comment (just above yours) never once mentioned XP. I did not have it or any method in mind. Just people and common thinking traits.

Jeff

Resistance to Objectivity

It is a fair point that some XPers are very resistant to the idea of transparent, objective measurement and the consequent scientific results that can come from it.

One passionate XPer wrote to me while my book was in draft to point out that "velocity" in XP is a private metric kept only by the team; it would never be revealed outside the immediate programming group.

My reply to that was, "well maybe it's time for a change".

Most books on objective management techniques - teachings on Six Sigma come to mind - warn that most intuitive management ideas turn out to be wrong. The only way to understand whether you are doing the best thing, or even a better one, is to measure it and study the results.
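
For what it's worth, the measurement being described needn't be elaborate. Here is a minimal sketch in Python of one standard yardstick, defect removal efficiency per activity; the counts are invented for illustration, not taken from any study.

    # Invented defect counts. Defect removal efficiency (DRE) per activity is
    # the fraction of defects still present that the activity actually found.
    defects_found = {
        "design inspection": 45,
        "code inspection":   60,
        "testing":           30,
    }
    escaped_to_production = 15  # defects discovered only after release

    remaining = sum(defects_found.values()) + escaped_to_production  # 150 total
    for activity, found in defects_found.items():
        print(f"{activity:>17}: removed {found} of {remaining} ({found / remaining:.0%})")
        remaining -= found

With numbers like these on the table, "inspections vs. testing" stops being a war of vivid anecdotes and becomes an empirical question.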

David
--
David J. Anderson
author of "Agile Management for Software Engineering"
http://www.agilemanagement.net/