Wednesday, 13 February 2002

Finding Faults with Software Testing

There are many universal statements that a tester will come across in the course of their career: some true, such as "Testing can only prove the presence of faults and not their absence", and some false, such as "Developers must not test their own code".

Such statements may well be paraphrased quotes; the original source may not be known or presented, and the quote may be taken out of its original context. But each is one common form of a phrase that exists in the industry's shared consciousness. Each of these statements has an effect on the person hearing it or thinking about it. And each of the Development Lifecycle personnel, no matter what their role is, will re-interpret the phrases based on their own experience.

The benefit of such statements is that they provide something to think around. The danger is that the brain loves to simplify things, identify patterns, reduce and compartmentalise information and can therefore build a less than optimal reasoning chain.

Here is a sequence of derivations:
  1. Testing can only prove the presence of faults and not their absence.
(therefore)
  2. The objective of testing is to find faults.
(therefore)
  3. A good test is one that has a high chance of finding a fault.
(therefore)
  4. A successful test is one that finds a fault.
(therefore)
  5. An unsuccessful test is one that does not find a fault.

Each statement above is more focused than the one before, and as the focus narrows it changes the thinking that is done, and can be done, around it. Exploring that thinking is what the rest of this article does.

Testing cannot prove the absence of faults

  1. Testing can only prove the presence of faults
  2. Testing cannot prove the absence of faults.

A practicing tester might well argue about statement 2 as it may run contrary to their experience and the aims of testing as defined by their organisation. One of the aims of their testing process is to validate the requirements, in effect, to prove the absence of faults. If a test runs to completion without a fault being identified, and on repeated runs (when it is run in exactly the same way) does not identify a fault, hasn't it just identified the absence of faults under the circumstances specified by the test?

Possibly, but you can't prove it. The test did not identify a fault, but that doesn't mean a fault did not occur at a lower level of abstraction than the one at which the test was described. Perhaps the results were merely coincidentally correct.
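As a minimal sketch of coincidental correctness, consider the following Python. The function `square` and the helper `run_test` are hypothetical names invented for this illustration; the implementation is deliberately faulty, yet one of the two tests still passes.

```python
def square(x):
    # Deliberately faulty implementation: it doubles instead of squaring.
    return x * 2


def run_test(func, given, expected):
    """Return True when the test 'passes', i.e. it reveals no fault."""
    return func(given) == expected


# Passes, although the implementation is wrong: 2 * 2 happens to equal 2 ** 2.
passed_with_2 = run_test(square, 2, 4)

# A different input value reveals the fault: 3 * 2 is not 3 ** 2.
passed_with_3 = run_test(square, 3, 9)

print(passed_with_2, passed_with_3)
```

Run it as often as you like and the first test will never find the fault. It only tells us about the input value 2.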

Question: If it can be coincidentally correct then how can I ever be sure that the program works?
Answer: You can't; ours is a risk-taking business.

We could accept the objection and re-phrase the statement as:

"Testing can demonstrate the absence of faults at a specific level of abstraction for specific input values over a specific path for the specific items checked"

Caveats:
  1. the test must have been executed correctly,
  2. the test must correctly model the expected outcomes.

If the user only ever wants to do tasks that exactly repeat the test then there is a fairly low risk of failure in the live environment. How many users do the same action over and over again with the same data that was used during testing?

We know that testing cannot prove the absence of faults because complete testing is impossible; the combinations of path and input are too vast.
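To get a feel for the numbers, here is a rough back-of-envelope calculation. The figures are assumptions chosen purely for illustration, not measurements: a function that takes two 32-bit integer inputs, and a test harness that can run one billion tests per second. Paths through the code are ignored entirely, so the real picture is worse.

```python
BITS_PER_INPUT = 32                # assumed width of each input
INPUT_COUNT = 2                    # assumed number of inputs
TESTS_PER_SECOND = 1_000_000_000   # assumed (generous) execution rate

# Every distinct pair of input values is one combination to test.
combinations = 2 ** (BITS_PER_INPUT * INPUT_COUNT)

seconds = combinations / TESTS_PER_SECOND
years = seconds / (60 * 60 * 24 * 365)

print(f"{combinations} combinations, roughly {years:.0f} years to run them all")
```

Under these assumptions, exhaustively testing one small function would take several centuries.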

Testing can result in overconfidence in the stability of a product if the universal statement is not understood and accepted.

Testing can only prove the presence of faults

This derivation brings with it a number of presuppositions, a few obvious ones being that the person reading the statement knows what testing is and what a fault is.

Testing is not an easy thing to define. There are no satisfactory definitions of testing that are universal to every project. Testers on the same project will have different definitions of testing; testers on the same team will have different definitions of testing. This statement is unlikely to be perceived in the same way by any other member of the development process.

So what is testing?

We can agree on the limits of testing but we may not agree what testing is. Is testing the activity done by a tester? Yes, but not only; developers do unit testing. Is a review testing? After all, reviews find faults; does that mean that testing encompasses reviews or that testing is not the only thing that proves the presence of faults?

Perhaps 'testing' is too specific to be part of a universal statement for the purposes of effective analysis. Testing is a quality control process, as are reviews, and walkthroughs. Perhaps the statement should be re-phrased as:

"Quality control processes can only prove the presence of faults and not their absence."

In the mind of the reader this may have the effect of stressing the need for many different quality control processes throughout the life cycle, as no single quality control process is going to prove to us that the system has no faults. Perhaps 'the need for quality' becomes the development focus, and testing and reviews become thought of as tools to that end. The benefit of a tool is that it can be wielded by anyone.

Testing is whatever the organisation that you work in says it is. Testing is an environmentally defined mechanism for providing feedback on perceived quality. The 'How' of testing is much more important than the 'What'.

So what is a fault?

A fault is obviously something that is not correct. Therefore testing must obviously know, or believe that it knows, what is supposed to happen in advance of running a test. Testing (like other quality control processes) does this by constructing a model of the system. The model is then used to predict the actions that will occur under specific circumstances. The correctness of a situation is assessed in terms of the things that conflict between the model and the reality (the implementation under test). When the model and the implementation do not match then there is a fault.

The fault may well be in the model and not in the implementation. Quality Control processes must themselves be subject to processes of quality control.
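Here is a small sketch of a fault that lives in the model rather than in the implementation. All of the names are hypothetical and the leap year rule is just a convenient example: the implementation applies the full Gregorian rule, while the tester's model uses an oversimplified "divisible by 4" rule.

```python
def is_leap_year(year):
    # Implementation under test: the full Gregorian rule (correct).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def model_is_leap_year(year):
    # The tester's model of expected behaviour: oversimplified, so faulty.
    return year % 4 == 0


def mismatches(years):
    """Return the years where model and implementation disagree."""
    return [y for y in years if is_leap_year(y) != model_is_leap_year(y)]


reported = mismatches([1900, 1996, 2000, 2002])
print(reported)
```

The year 1900 is reported as a fault, and it is one, but investigation shows that it is the model that needs fixing.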

Wherever it is, the fault will be investigated and possibly fixed. And for the rest of the life of the implementation, testing will make sure that if the fault comes back then testing will find it. That's the purpose of regression testing, or "re-running the same damn test time and time again" as it is known in the trade.

A good test is one that has a high chance of finding a fault

How do we ensure that the tests created are 'Good' tests, those with a high chance of finding faults?

The process of test modeling can identify incongruities and omissions in the source documents. Some of these could have made their way into the final deliverable as faults, so we can find faults before we execute any tests.

Test derivation is the process of applying strategies to models. Different models have different strategies that are applicable.

Most testers are familiar with flow models, the most basic form being a directed graph. This can be used to model the sequence of actions in a program with decision points and iterations. There are numerous strategies that can be applied: statement coverage, branch coverage, condition coverage, domain testing. (Binder, Beizer [1][2])
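To illustrate how the choice of strategy changes the tests derived from the same flow model, here is a minimal sketch. The function `apply_discount` and the helper `reveals_fault` are hypothetical, written only for this example. A single test with `is_member=True` executes every statement, so it satisfies statement coverage, but branch coverage also demands a test where the decision is false.

```python
def apply_discount(price, is_member):
    discount = None
    if is_member:
        discount = 0.1
    # Fault: when is_member is False, discount is still None here.
    return price * (1 - discount)


def reveals_fault(price, is_member):
    """Return True if running the test exposes the fault as a TypeError."""
    try:
        apply_discount(price, is_member)
    except TypeError:
        return True
    return False


# Statement coverage: this one test executes every line, and passes.
statement_test_found_fault = reveals_fault(100, True)

# Branch coverage adds the 'false' outcome of the decision, which fails.
branch_test_found_fault = reveals_fault(100, False)

print(statement_test_found_fault, branch_test_found_fault)
```

In this sketch only the test added by the branch coverage strategy reveals the fault; the statement coverage test alone would have left it for the customer to find.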

A 'Good' test is one that can be justified in terms of one of the strategies as it applies to one of the models. A test is a way of exploring a model.

But how can we assume that this strategy has a high chance of finding a fault?

Strategies are typically constructed by analysing the faults that are discovered outside of testing, i.e. faults that testing did not pick up because the strategies in use did not result in tests that found them. Other people have done much of the work for us, and we have a set of very effective strategies from which testers can pick and choose, depending on the information they have for constructing the models, the tools they use, the timescales they face, and the degree and rate of change in the source information.

Successful and unsuccessful tests

A successful test is one that finds a fault.
(therefore)
An unsuccessful test is one that does not find a fault.

This is not typically the way that testers describe their tests. A test 'passes' when it has not revealed a fault and a test 'fails' when it has revealed a fault.

It would be too easy to use the above statements as justification for removing the tests that did not find a fault from the test pack, or possibly the strategy that created that test from the process.

The test was only unsuccessful this time. The test should only exist because the strategy used to create it has been effective in finding faults before. Software changes affect the outcome of tests. Tests that previously revealed faults suddenly don't - this is good, a fix has been successful. Tests that did not find faults suddenly reveal faults - this is not so good, the fix caused an unexpected side effect.

Remember that testing can never prove the absence of faults, and that if testing is not finding any faults then your customer probably will. The results of testing should never be taken to mean that the product is fault free.

If testing is not finding any faults then it may mean that testing has done the best that it can and has to wait for the live fault reports to come back before the testing process can be improved further. I've never worked in an environment where that statement is true. Testing is an infinite task, but it has typically done what it can given the current time, budget and staffing constraints - as has every other part of the development process.

However, if the testers are running many 'unsuccessful' tests then the strategies used to construct the tests should be re-evaluated. Perhaps the strategies used are not effective; perhaps the combination of models led to the construction of redundant tests. If the strategies and models used are changed then the testing will change as a result.

Focus your mind

Statements like "The objective of testing is to find faults" have psychological effects on testers. They focus the mind. They make the tester avoid the easy road and encourage them to explore the dark passages hidden within the flawed system. They encourage the tester to examine test results carefully so that fewer faults slip through. Prioritisation and risk assessment become vital parts of the testing function.

Hopefully they make the tester more responsible about their testing, hopefully they make the tester want to improve their testing, and hopefully they make the tester eager to learn about strategies, models and techniques which have been effective in the past.

What happens, then, if the phrase is changed to: "The objective of a quality control process is to find faults"? This is a more universal statement that applies to every part of the software development lifecycle. Perhaps the software development lifecycle can become more aware of the faults that are being built into the system and carried forward from one process to the next. Quality control is the responsibility of everyone on the project.

Shape your process

By taking a single statement and distorting it slightly, by simulating a human communication process from mind to mind, we have explored some of the thinking processes that these simple statements can trigger, and will continue to trigger. For as you think about the statements we have presented, there are other conclusions, other derivations, and other meanings that have not been mentioned here, ones more relevant to you and your organisation. As you explore those, the ones that stem from your own thoughts are the ones that will shape your testing processes for the better.

This essay originally appeared on CompendiumDev.co.uk on 13th February 2002 as 'journal notes'. This was in the days before I had a 'blog' set up and still had a 'web site'. It seems more like a blog post than an essay, so I've moved it over to EvilTester.com.
