Wednesday, 13 February 2002

Finding Faults with Software Testing

There are many universal statements that a tester will come across in the course of their career. Some are true: "Testing can only prove the presence of faults and not their absence". Some are false: "Developers must not test their own code".

Such statements may well be paraphrased quotes; the original source may not be known or presented, and the quote may have been taken out of its original context. But they become a common form of the phrase that exists in the industry's shared consciousness. Each of these statements has an effect on the person hearing it or thinking about it. And each member of the development lifecycle, no matter what their role, will re-interpret the phrases based on their own experience.

The benefit of such statements is that they provide something to think around. The danger is that the brain loves to simplify, to identify patterns, to reduce and compartmentalise information, and can therefore build a less than optimal reasoning chain.

Here is a sequence of derivations:
  1. Testing can only prove the presence of faults and not their absence.
(therefore)
  2. The objective of testing is to find faults.
(therefore)
  3. A good test is one that has a high chance of finding a fault.
(therefore)
  4. A successful test is one that finds a fault.
(therefore)
  5. An unsuccessful test is one that does not find a fault.
Each of the above statements is more focused than the one before it, and as the focus changes, so does the thinking that is done, and can be done, around it. For the rest of this article that is what we shall do.

Testing cannot prove the absence of faults

  1. Testing can only prove the presence of faults.
  2. Testing cannot prove the absence of faults.
A practising tester might well argue with statement 2, as it may run contrary to their experience and to the aims of testing as defined by their organisation. One of the aims of their testing process is to validate the requirements: in effect, to prove the absence of faults. If a test runs to completion without a fault being identified, and on repeated runs (executed in exactly the same way) does not identify a fault, hasn't it just identified the absence of faults under the circumstances specified by the test?

Possibly, but you can't prove it. The test did not identify a fault, but that does not mean a fault did not occur at a lower level of abstraction than the one the test was described in. Perhaps the results were merely coincidentally correct.

Question: If the results can be coincidentally correct, then how can I ever be sure that the program works?
Answer: You can't; ours is a risk-taking business.
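
To make 'coincidentally correct' concrete, here is a minimal sketch (the function and values are invented for illustration): a faulty implementation that nevertheless gives the right answer for the specific input the test happens to use.

    # A deliberately faulty implementation: it adds where it should multiply.
    def square(x):
        return x + x   # fault: should be x * x

    # Test derived from the model "the square of 2 is 4".
    assert square(2) == 4   # passes: 2 + 2 happens to equal 2 * 2

    # The same model applied to a different input exposes the fault.
    assert square(3) == 9   # fails: 3 + 3 is not 9

The first check passed, but the pass proved nothing about the absence of the fault; it merely failed to reveal it.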

We could accept the objection and re-phrase the statement as:

"Testing can demonstrate the absence of faults at a specific level of abstraction for specific input values over a specific path for the specific items checked"

Caveats:
  1. the test must have been executed correctly,
  2. the test must correctly model the expected outcomes.
If the user only ever wants to do tasks which exactly repeat the test then there is a fairly low risk of failure in the live environment. How many users do the same action over and over again with the same data that was used during testing?

We know that testing cannot prove the absence of faults because complete testing is impossible; the combinations of path and input are too vast.
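
As a rough sketch of just how vast, assuming an invented (and optimistic) rate of a million tests per second, exhaustively testing even a function of two 32-bit integer inputs is hopeless:

    inputs_per_parameter = 2 ** 32            # one 32-bit integer parameter
    combinations = inputs_per_parameter ** 2  # two parameters, every pairing

    tests_per_second = 1_000_000              # invented, optimistic throughput
    seconds_per_year = 60 * 60 * 24 * 365
    years = combinations // (tests_per_second * seconds_per_year)

    print(combinations)   # 18446744073709551616 input combinations
    print(years)          # about 584942 years to run them all

And that counts only the input combinations, before the paths through the code are considered at all.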

Testing can result in overconfidence in the stability of a product if the universal statement is not understood and accepted.

Testing can only prove the presence of faults

This derivation brings with it a number of presuppositions, a few obvious ones being: that the person reading the statement knows what testing is, and what a fault is.

Testing is not an easy thing to define. There are no satisfactory definitions of testing that are universal to every project. Testers on the same project will have different definitions of testing; testers on the same team will have different definitions of testing. This statement is unlikely to be perceived in the same way by any two members of the development process.

So what is testing?

We can agree on the limits of testing, but we may not agree on what testing is. Is testing the activity done by a tester? Yes, but not only: developers do unit testing too. Is a review testing? After all, reviews find faults; does that mean that testing encompasses reviews, or that testing is not the only thing that proves the presence of faults?

Perhaps 'testing' is too specific to be part of a universal statement for the purposes of effective analysis. Testing is a quality control process, as are reviews, and walkthroughs. Perhaps the statement should be re-phrased as:

"Quality control processes can only prove the presence of faults and not their absence."

In the mind of the reader this may have the effect of stressing the need for many different quality control processes throughout the life cycle, since no single quality control process is going to prove to us that the system has no faults. Perhaps 'the need for quality' becomes the development focus, and testing and reviews become thought of as tools to that end. The benefit of a tool is that it can be wielded by anyone.

Testing is whatever the organisation that you work in says it is. Testing is an environmentally defined mechanism for providing feedback on perceived quality. The 'How' of testing is much more important than the 'What'.

So what is a fault?

A fault is obviously something that is not correct. Therefore testing must know, or believe that it knows, what is supposed to happen in advance of running a test. Testing (and other quality control processes) does this by constructing a model of the system. The model is then used to predict the actions that will occur under specific circumstances. The correctness of a situation is assessed in terms of the conflicts between the model and the reality (the implementation under test). When the model and the implementation do not match, there is a fault.
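
As a minimal sketch of that idea (the discount rule and both functions are invented): the model predicts what should happen, the implementation under test shows what does happen, and a mismatch between the two is a fault, which may sit on either side.

    # The model: our prediction of the behaviour, e.g. "orders of 100 or more
    # get a 10% discount". (Rule and values invented for illustration.)
    def predicted_discount(order_total):
        return 0.10 if order_total >= 100 else 0.0

    # The implementation under test, with a boundary fault (> instead of >=).
    def implemented_discount(order_total):
        return 0.10 if order_total > 100 else 0.0

    # A test: check the implementation against the model for chosen inputs.
    for order_total in (99, 100, 101):
        expected = predicted_discount(order_total)
        actual = implemented_discount(order_total)
        if expected != actual:
            print("Mismatch at", order_total, ": model says", expected,
                  "implementation says", actual)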

The fault may well be in the model and not in the implementation. Quality Control processes must themselves be subject to processes of quality control.

Wherever it is, the fault will be investigated and possibly fixed. And for the rest of the life of the implementation, testing will make sure that if the fault comes back then testing will find it. That's the purpose of regression testing, or "re-running the same damn test time and time again" as it is known in the trade.

A good test is one that has a high chance of finding a fault

How do we ensure that the tests created are 'Good' tests, those with a high chance of finding faults?

The process of test modelling can identify incongruities and omissions in the source documents. Some of these would have made their way into the final deliverable as faults, so we can find faults before we even execute the tests.

Test derivation is the process of applying strategies to models. Different models have different strategies that are applicable.

Most testers are familiar with flow models, the most basic form being a directed graph. This can be used to model the sequence of actions in a program with decision points and iterations. There are numerous strategies that can be applied: statement coverage, branch coverage, condition coverage, domain testing. (Binder, Beizer [1][2])

A 'Good' test is one that can be justified in terms of one of the strategies as it applies to one of the models. A test is a way of exploring a model.
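
A minimal sketch of one strategy applied to one model (the function and boundary are invented): the flow model of the function below is a directed graph with a single decision node, and the branch coverage strategy says that every edge out of that node must be taken by at least one test, which justifies the two cases that follow.

    # Implementation under test: one decision node, two outgoing branches.
    def classify_age(age):
        if age >= 18:          # decision node in the flow graph
            return "adult"     # branch taken when the condition is true
        return "minor"         # branch taken when the condition is false

    # Branch coverage strategy: at least one test per branch.
    assert classify_age(18) == "adult"   # true branch (and the boundary value)
    assert classify_age(17) == "minor"   # false branch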

But how can we assume that this strategy has a high chance of finding a fault?

Strategies are typically constructed by analysing the faults that are discovered outside of testing, i.e. faults which testing did not manage to pick up because the strategies in use did not result in tests which found them. Other people have done much of this work for us, and we have a set of very effective strategies from which testers can pick and choose, depending on the information they have with which to construct the models, the tools they use, the timescales they face, and the degree and rate of change which the source information undergoes.

Successful and unsuccessful tests

A successful test is one that finds a fault.
(therefore)
An unsuccessful test is one that does not find a fault.

This is not typically the way that testers describe their tests. A test 'passes' when it has not revealed a fault and a test 'fails' when it has revealed a fault.

It would be too easy to use the above statements as justification for removing the tests that did not find a fault from the test pack, or possibly the strategy that created that test from the process.

The test was only unsuccessful this time. The test should only exist because the strategy used to create it has been effective in finding faults before. Software changes affect the outcome of tests. Tests that previously revealed faults suddenly don't - this is good, a fix has been successful. Tests that did not find faults suddenly reveal faults - this is not so good, the fix caused an unexpected side effect.

Remember that testing can never prove the absence of faults, and that if testing is not finding any faults then your customer probably will. The results of testing should never be taken to mean that the product is fault free.

If testing is not finding any faults then it may mean that testing has done the best that it can and has to wait for the live fault reports to come back before the testing process can be improved further. I've never worked in an environment where that statement is true: testing is an infinite task, but it has typically done what it can given the current time, budget and staffing constraints - as has every other part of the development process.

However, if the testers are running many 'unsuccessful' tests then the strategies used to construct the tests should be re-evaluated. Perhaps the strategies used are not effective; perhaps the combination of models led to the construction of redundant tests. If the strategies and models used are changed, then the testing will change as a result.

Focus your mind

Statements like "The objective of testing is to find finding faults" have psychological effects on testers. They focus the mind. They make the tester avoid the easy road and encourage them to explore the dark passages hidden within the flawed system. They encourage the tester to examine test results carefully so that less faults slip through. Prioritisation and risk assessment become vital parts of the testing function.

Hopefully such statements make the tester more responsible about their testing, hopefully they make the tester want to improve their testing, and hopefully they make the tester eager to learn about the strategies, models and techniques which have been effective in the past.

What happens then if the phrase is changed to: "The objective of a quality control process is to find faults"? This is a more universal statement, one that applies to every part of the software development lifecycle. Perhaps the software development lifecycle can become more aware of the faults that are being built into the system and carried forward from one process to the next. Quality control is the responsibility of everyone on the project.

Shape your process

By taking a single statement and distorting it slightly, by simulating a human communication process from mind to mind, we have explored some of the thinking processes that these simple statements can trigger, and will continue to trigger. For as you think about the statements presented here, there are other conclusions, other derivations, and other meanings that have not been mentioned. Ones more relevant to you and your organisation. As you explore those, the ones that stem from your own thoughts, they are the ones that will shape your testing processes for the better.

This essay originally appeared on CompendiumDev.co.uk on 13th February 2002 as 'journal notes'. This was in the days before I had a 'blog' setup and still had a 'web site'. It seems more like a blog post than an essay, so I've moved it over to EvilTester.com.

Friday, 8 February 2002

Test Conditions

Test Conditions are statements of compliance which testing will demonstrate to be either true or false. These are (in effect) test requirements.

Conditions serve different purposes. Some conditions will act as the audit reason for a particular test case, e.g. "The user must be able to create a flight". The tester will create a test which creates a flight; obviously there are more attributes to the case than that - what type of flight, what type of user, fully booked, partially booked, etc. These attributes are other types of condition.

Some conditions are used to define a test's attributes or preconditions, e.g. create a flight of type local, create a flight of type international.

Or are they....

This may be modelling that has not gone far enough.

The initial condition 'create a flight' is valid. When we only have test conditions as our modelling tool then we have to represent this as a condition. But it is also a program function (create flight), or an object method, or an entity event, or a business process. Consequently we should really have a model and a derivation strategy that says "there must be at least one test for each entity event" or "there must be at least one test for each object method". In this case it is obvious that "one test" will not cover the condition, but with a rich model - an object or entity model - we have a list of properties or attributes, and these will have scoping variants (i.e. attribute flightType: international, local).

Basically, we use these context-rich models to give us the combination information that we require to construct test cases. Without this approach we will never know whether we have a valid or complete set of condition combinations.

Hierarchical models are appropriate for test grouping, i.e. tests related to business processes, program modules, program functions, etc. There is no reason why a test cannot be in more than one test grouping.

Hierarchical models are appropriate in derivation for hierarchical structures. (It is possible to list entities, attributes and events as hierarchical structures, but this hides valid combination options and is a mix of ELH and ER; we should really have a relationship section on the model.)

e.g.


  • Entity: Flight
    • Attribute: Type
    • Attribute: Start Airport
    • Event: Create
    • Event: Takeoff
    • Event: Land
    • Event: Delete

Models for test derivation should be rich. This allows a derivation strategy to be created which can be used to gauge the completeness of the test products and the validity of the test products.

With rich models, test conditions become requirements which are used to check the completeness of the test derivation approach, rather than the audit reason - unless, of course, there is no way to make the construction of test cases automatic, with implicit cross-referencing of test conditions. That is, we don't have to state 'Create a flight' as a test condition, because there is an entity event on Flight called Create which we know we have to be able to test, and it will apply to a variety of attributes on that entity. Without this rich modelling, and without an implicit (or strategy driven) approach to testing, a vast number of test conditions have to be created and maintained, and no guarantee of combination thoroughness can be achieved.
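
Here is a minimal sketch of that implicit, strategy-driven derivation, using the Flight entity from the earlier list (the attribute values and the code structure are invented): the rich model is held as data, and the strategy "at least one test for each entity event, for each combination of attribute variants" generates the conditions instead of a tester writing and maintaining each one by hand.

    from itertools import product

    # A small but rich entity model: attributes with their scoping variants,
    # plus the events that can happen to the entity.
    flight_model = {
        "entity": "Flight",
        "attributes": {
            "Type": ["international", "local"],
            "Start Airport": ["LHR", "EDI"],   # invented example values
        },
        "events": ["Create", "Takeoff", "Land", "Delete"],
    }

    # Derivation strategy: one condition per event per attribute combination.
    def derive_conditions(model):
        names = list(model["attributes"])
        variant_lists = [model["attributes"][name] for name in names]
        for event in model["events"]:
            for combo in product(*variant_lists):
                details = ", ".join(f"{n}={v}" for n, v in zip(names, combo))
                yield f"{event} {model['entity']} with {details}"

    for condition in derive_conditions(flight_model):
        print(condition)   # 4 events x 2 x 2 variants = 16 conditions

Change the model and the derived conditions change with it, which is the kind of combination thoroughness a hand-maintained list of conditions cannot guarantee.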

Thursday, 7 February 2002

Process Improvement

The simplest way to improve a process is to analyse the errors that it lets slip through.

For every error not found by one of the previous quality control processes, ask: is there a strategy that could have been applied to one of the existing models that would have created a test case capable of identifying the error?

If the answer is yes then the error may well have slipped through because timescales or staffing levels forced your hand and you simply didn't have the time to apply that strategy to that model, or because the risk of not applying it was deemed low.

If the answer is no then we have to identify a model and a strategy that could have found it.

In both cases, assess the cost and time impact of adopting that model and strategy. The development process is one of trade-offs and compromises.

Wednesday, 6 February 2002

Error Guessing

Error Guessing is described in 'Testing Computer Software' by Cem Kaner [1]:
"For reasons that you can't logically describe, you may suspect that a certain class of tests will crash the program. Trust your judgment and include the test."
This quote suggests to me that there is an informal model in the tester's head, and that the subconscious is applying a strategy to that model which the tester is unaware of. The tester is only aware of the subconscious flagging the results of that check to the conscious mind as a nagging doubt.

If you do engage in error guessing then you should be aware that:

  • you have a model and an applicable strategy in your head that you are not using formally on the project, and of which you may not even be aware;
  • if your strategy does work, then you should try to quantify it so that you can use it consistently;
  • if it doesn't work, then you should possibly change the model and strategy in your head.


[1] Testing Computer Software, Cem Kaner, Jack Falk, Hung Quoc Nguyen, 2nd Edition 1993, International Thomson Computer Press

Tuesday, 5 February 2002

Testing

Testing is exploration. In a mature testing organisation the expedition is well planned and staffed with seasoned explorers. The planning will be done around a number of maps of the territory to be explored. Some maps will show different levels of detail - to show all the detail on one map would confuse the issue, so one map will identify areas of population, one will provide information on seasonal rainfall statistics, and so on. Maps are very important. The explorer plans different routes through the maps to match the aims of the expedition; perhaps they are trying to unearth hidden temples, and consequently will pick routes which take them through areas which are sparsely populated now but in the past were densely populated. Effective exploration requires an understanding of the terrain to be explored.

Errors cannot be found without a model. Quality Control cannot be conducted without a model.

I have heard it said that "some testers never model" and "reviews are not conducted against a model". These statements are false. In the absence of a defined and identifiable model, there will be an informal model, a model of understanding in the tester's head.

A model is our understanding and knowledge of the system. The level of testing that can be done with no understanding and no knowledge of the system is zero.

Try this. Take a program that you know nothing about. Make sure that the program presents all its information in a way that you cannot understand: if you don't know Japanese, then test a Japanese program. If the information presented to you is obscure enough then you will find it impossible to build a model of it, and then you have no way to assess the correctness of any action. Remember that if you even understand the name of the program, or its main purpose, then that is information that you will have assimilated into a model and will use during testing.

Reviews cannot be conducted without a model of whatever the thing being reviewed is supposed to represent. Review models are different from testing models.

A review will be conducted against a number of models:


  • The model of a well-formed document (Does it have a title page? Are the pages numbered?)
  • The syntax of the actual text
  • The semantic model, in each reviewer's head, of the items to be presented in which they have a vested interest.

There are at least as many informal models as there are people.

Modelling is a fundamental task in testing.

Quality Control is essentially the checking of a model against an implementation of that model.

A test is a specific situation with a predefined set of things to check against the model. The differences are errors, either in the model or the implementation.

Monday, 4 February 2002

Testing and Modelling

In order to test a model we have to have some way of recognising the success or failure of a test; our test must have a goal, it must have a reason for existing. That reason is inherent in the model from which the test is derived. This means that it is difficult to use a model to test itself. In software development this is typically not a problem: we rarely use the source code as the only model when testing the source code. We typically derive tests from a design model, a requirements model, a specific testing model, or even a model derived from the source code - essentially any model which has the level of detail required for our testing.

Modelling is as fundamental an activity of testing as it is of development.

Test strategies are applied to models. This is for the purposes of test derivation, the measurement of derivation and execution coverage, domain analysis, risk analysis - the list includes almost every task that testers do.

Strategies typically evolve and are identified by thinking about errors that slipped through and identifying a strategy that could have found them.

Sunday, 3 February 2002

Checking the model

Do we want to produce programs that are fault free, or do we want to produce programs that allow people to do what they want to do in a quality manner? Ideally I'd prefer both, but humans make mistakes, so we should aim for a product that functions in a quality manner. When we achieve that, we do it by assuring ourselves of the quality of the models in the software development process as they are produced, and before they are fed into a downstream process that will build new models using the information they contain.

How do we assure ourselves that a model is a quality model? We have to know what we want from the model, control the production process as much as possible, and check the model when it is done and periodically throughout its production.

This is the fundamental assertion of the T.O.T.E model in psychology [1]. The T.O.T.E model is a sequence of steps: Test, Operate, Test, Exit. With a goal in mind we Test to see if the goal has been achieved; if not, then we Operate to change something which will hopefully bring us closer to our goal; we Test again, and when the test is satisfied we Exit the process, as our goal has been achieved.

The T.O.T.E model describes a process of refinement.
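
A minimal sketch of the sequence as a loop, with an invented goal and operation, just to show the shape:

    target = 10   # the goal: the draft value should reach the target
    draft = 0

    def goal_met(value):          # Test: has the goal been achieved?
        return value >= target

    def refine(value):            # Operate: change something to get closer
        return value + 3

    while not goal_met(draft):    # Test
        draft = refine(draft)     # Operate (then Test again on the next pass)

    print("Exit: goal achieved with", draft)   # Exit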

[1] Modeling with NLP, Robert Dilts, 1998, Meta Publications

Saturday, 2 February 2002

Modelling the development process

Software development tries to create a product. It does this by engaging in a number of processes (requirements, design, coding, etc.). Each of these processes will create a model that will either be the final product or an input to a follow-on process. The best example of this is one of the popular life cycle models: the waterfall or the V model.

There are constraints on the construction of a product: we need it in X days, it must cost no more than Y thousand, and we only have Z staff members. The skill of developing software is in the effective application of the strategies that have been learned, bearing in mind the constraints involved.

Modelling is a fundamental activity of the software development process.

To take the main processes from a minimal software life cycle:
  • Requirements
    • Requirements are modelled, possibly as text.
  • Design
    • The popular UML provides a range of diagrams: Class, Object, Component, Deployment, Use Case, Sequence, Collaboration, Statechart, Activity.
  • Coding
    • The program is modelled in code. We have a choice of language to model the system in, be it C++, Smalltalk or assembly. The code is a model. When we execute the system we have to have special programs, e.g. a source code debugger, to map that executable system back into the code model. (It is possible to formalise the design models above so that they are equivalent to a code model.)
  • Testing
    • Testing can use many of the development models and will apply strategies such as: loop once, loop twice, cover every statement, cover every predicate condition, cover every exception.
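
A minimal sketch of the 'loop once, loop twice' style of strategy applied to a code model (the function and data are invented): choose inputs so the loop body runs zero times, once, and more than once.

    # Implementation under test: a single loop over a list of prices.
    def order_total(prices):
        total = 0
        for price in prices:   # the loop the strategy targets
            total += price
        return total

    # Loop coverage strategy: zero, one, and more than one iteration.
    assert order_total([]) == 0        # loop body never runs
    assert order_total([5]) == 5       # loop runs once
    assert order_total([5, 7]) == 12   # loop runs more than once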

Each of the models produced in the development process is a refinement of a previous model, even if the previous model was never formally documented. A requirement is a refinement of the dreams and aims associated with the picture in the head of the person specifying the requirement.

Friday, 1 February 2002

On Modelling

"A map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness…" A. Korzybski, Science & Sanity, 4th Ed. 1958, pp. 58-60 (quoted from: R Bandler & J Grinder, Patterns of the hypnotic Techniques of Milton H. Erickson, M.D. Vol 1, 1975, pp 181)

Human beings model the world. We learn by constructing models and then subjecting those models to tests in order to determine their validity. Human beings do this all the time. By the time we reach adulthood we have constructed so many models that we are probably unaware of most of them, and may even be unaware of the modelling process itself.

When we are faced with something new which we don't understand, a new-fangled tap in a washroom for example, we will use the models that we have already developed for similar situations: all our tap models. If they fail to make the tap work then we will use other models associated with the washroom. Having encountered hand driers with proximity sensors, we may wave our hands around to get the tap working. If we have been on a train or a plane then we may have encountered taps that work through foot pedals, and we would start to look for those. We have models for turning things on outside the washroom using buttons or levers, and we would start using those.

We use models and strategies all the time. We are experts at constructing models and strategies to apply to those models. We interact with the world through those models.

The software development process constructs software that works more often than it fails. This is in spite of us not doing what we are told is best practice, or even what we believe we should do: we don't spend enough time on requirements, we don't stabilise requirements, we don't design, we don't unit test, we don't document, we don't review. We often get away with it because modelling is a natural talent and software development is a process of modelling.

Unfortunately we also learn from experience, and if our experience of success includes not doing all these best-practice processes, then that leads us to not do these processes again.

We have to understand what we do, and what each step is for, in order to ascertain whether we can miss them out in the situation we are in.

Modelling explicitly, and understanding our models, allows us to be pragmatic.