Thursday, 15 August 2002

T.O.T.E - Test, Operate, Test, Exit for Software Testing

Wherein the TOTE (Test Operate Test Exit) model is used to explore the nature of feedback and abstraction of test phases

In 1960, George Miller proposed a model of goal-driven behavior which he titled T.O.T.E (Test, Operate, Test, Exit). This essay maps the T.O.T.E model to Software Testing.

Fig 1: The TOTE Model as a Graph

My understanding of TOTE is very simple. For this model to be valid we have to accept that the stimulus behind behavior is the achievement of a Goal. To achieve the Goal, we have to define it thoroughly enough to recognise when it has been achieved. Then, as we move towards the Goal (operate), we can assess (test) whether it has been achieved and, if so, Exit.
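The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tote`, its parameters and the `max_operations` guard are hypothetical and are not part of Miller's model.

```python
from typing import Callable, Tuple, TypeVar

State = TypeVar("State")


def tote(
    state: State,
    goal_achieved: Callable[[State], bool],
    operate: Callable[[State], State],
    max_operations: int = 1000,
) -> Tuple[State, int]:
    """Run one TOTE unit: test, operate until the test passes, then exit.

    All names here are illustrative assumptions, not Miller's terminology.
    """
    operations = 0
    while not goal_achieved(state):          # Test
        if operations >= max_operations:     # guard against an unreachable goal
            raise RuntimeError("goal not achieved within max_operations")
        state = operate(state)               # Operate
        operations += 1
    return state, operations                 # Exit


# Hypothetical goal: five features built, one feature per operation.
final_state, operations_needed = tote(
    0, lambda built: built >= 5, lambda built: built + 1
)
print(final_state, operations_needed)  # prints "5 5"
```

The goal has to be expressed as a check that can be run (`goal_achieved`), which is the point of the paragraph above: without a testable definition of the Goal there is no way to know when to Exit.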

George Miller intended this to be a model of behavior, not a model of IT. Is it possible that I am trying to forcefully overlay one model on top of another just because it has the word “test” in it, not once but twice?

Possibly. I do that: I like to test and try things out, and when they work I justify it with the term interdisciplinary research, or some such nonsense. But this does seem to me to be a valid operation to try. And if it fails my reading test then I will learn something about the difference between the two models. I will learn something that makes one model unique and then I’ll exit stage right, just a little wiser, pursuing some other idea.

Software Development can be viewed as a single TOTE. We first of all decide that we want a system to allow us to do things, so we list those things so that we can assess how close we are to having the system we want (test). We build the system (operate), and if the system allows us to do the things we want (test) then we call it complete and release it (exit).

Software development has a slightly more detailed life cycle than that, and as we look at it in more detail we see that the software development cycle is actually a sequence of interdependent TOTE feedback chunks.

Fig 2: System Development TOTE Sequence

In fact every single aspect of software development, because humans do it and it is therefore subject to behavioral analysis, can be viewed using the TOTE model. The process of constructing a test is itself subject to TOTE; the process of writing every script step is subject to TOTE.

Facetiously, this model may help to explain why companies are so reticent about doing System testing:
  • We specify the system (test)
  • We build the system (operate)
  • We unit test the system (test)
  • Why can’t we then ship it? Surely adding independent system testing to the mix can be perceived as OTT.
Well, no. System testing executes the system from a different level than unit testing. Unit testing often works from within and system testing from without. Unit testing can start before the system has been put together; system testing cannot. Unit testing must know how the system has been constructed; system testing doesn’t. The two have different goals and therefore different sets of operations and tests.

User acceptance testing waits for the development testing (unit and system testing) to be complete. Its goal is to check that the system fits into the users’ business process.

An IT development model that has unit testing, followed by system testing, followed by user testing isn’t OTTT; from a distance it is simply OT. We, as IT personnel, have simply taken the T chunk and split it into UT, ST, and UAT to make the most of the parallelism that is available with multiple levels of validation conducted by different people.

Fig 3: The Parallelism of the development cycle

Conclusion:

TOTE is just a model. But Quality Software Development is a goal that most development teams set out to achieve, and they cannot achieve it without knowing what they mean by quality and without checking that they have achieved it.


An Aside:
       Some Variants of TOTE used in IT
BAD (malformed paths)
  • OE (oh) - no testing involved, code it then ship it
  • OTEA (oh dear) - operate, test, then exit anyway
GOOD (well formed paths)
  • TE (tche!) - no funding, project canned and deemed unfeasible
  • (TOT)*E - extreme programming: write the test, write the system, run the test
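Since the GOOD variants are already written like regular expressions, the list can be checked mechanically. A small sketch, assuming each path is encoded as a string of letters (T = test, O = operate, E = exit, A = "anyway"); the helper name `is_well_formed` is hypothetical.

```python
import re

# Assumed encoding: T = test, O = operate, E = exit, A = "anyway".
# The pattern combines the two GOOD paths from the list: TE and (TOT)*E.
WELL_FORMED_PATH = re.compile(r"TE|(TOT)*E")


def is_well_formed(path: str) -> bool:
    """Return True if the whole path matches one of the GOOD patterns."""
    return WELL_FORMED_PATH.fullmatch(path) is not None


for path in ["TE", "TOTE", "TOTTOTE", "OE", "OTEA"]:
    print(path, is_well_formed(path))
```

Taken literally, `(TOT)*E` also accepts a bare `E` (zero repetitions), so a stricter reading would use `(TOT)+E`.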

Monday, 1 April 2002

Popular testing phrases #57 - Testing should start early



This is a fantastic phrase, it has been popular and it has worked; testing now starts earlier and testing has a higher profile than ever before, but despite all that, it isn't what we meant or even what we really needed.

Back in the depths of despair that was the Software Crisis, testers managed to bellow a rallying cry: "Testing should start early!" Testers could be found picketing management and development meetings with 'V-model' placards and T-shirts emblazoned with 'cost of defect' bar charts.

And the message got through. Testing started earlier, testing was funded, fewer bugs made it into production, testing became a career and all was well with the software development cycle. Eventually the software crisis disappeared. And yet, years later, we still produce software with defects and we still hear "Testing must start earlier".

So the question is why? What do you mean? Earlier than what? What kind of testing?

If testing doesn't start early enough then obviously it won't be as effective as it could be, and therefore should start earlier, but this is true of every process in the software development cycle. Every process needs to start early enough to be effective and needs to be considered important enough to get funding.

But once we start early enough to be effective and in control of the process, do we get the results that we thought we were going to get? Something must be missing if we still get defects in the system even though we started testing early.

If we examine what happens when we do start testing early we can see that:


  1. Testers get more time with specification documentation and can produce test condition models that help them identify incongruent documentation.
  2. Testers get the time to build flow and process models that help them identify errors of omission, e.g. "what happens when we press 1 on screen 3?"


But ultimately, most of the defects that testers are going to find still come from the execution of tests, and the only way that you can move that process forward is to deliver the system earlier and change the development process, e.g. prototyping, extreme programming, iterative development.

What we are really seeing is that when we start testing early, we increase the amount of review work that is done earlier in the project. Testers are essentially finding defects that are found through review work.

So what is the difference between reviewing and testing? Well, quite a lot; the testing process itself is subject to the review process. A better question might be: what are the similarities between reviewing and testing?


  1. Both can help identify defects associated with requirements and documentation.
  2. Both start early in the life cycle.
  3. Both continue throughout the life cycle.
  4. Both are forms of quality control.

It is quality control that should start early, and it is quality control that will help detect bugs throughout the life cycle of the project. To do that successfully we have to implement many forms of quality control effectively throughout the project.

The current popular quality control technique is testing. Why? It isn't because testing is cheap. In some development environments, relying on testing means that only the testing team tests, with minimal input from development and management. Of course it costs more money that way, a lot more (compared to a review programme, testing is really expensive), but it is someone else's responsibility, and by the time testing finds the problems it can be argued that it is too late and too expensive to change the offending documentation.

Let us hope you never have to work in an environment like that, but as long as software development stresses testing rather than the gamut of quality control techniques, mass responsibility and involvement, we will have to face the consequences of our lazy partitioning.

Be careful what you wish for.


This essay originally appeared on CompendiumDev.co.uk on 1st April 2002 as an 'essay'. This was in the days before I had a 'blog' setup and still had a 'web site'. It seems more like a blog post than an essay so I've moved it over to EvilTester.com.

Monday, 11 March 2002

An Actor's Life for me

…Meyerhold's bio-mechanical actor said, "I make these movements because I know that when I make them what I want to do can most easily and directly be done."  [1]

All quotations in this article are from "The Actor's Ways and Means", Michael Redgrave, William Heinemann Ltd, 1953 

More for amusement than a desire to educate myself further in the ways of testing, I picked up a copy of Michael Redgrave's lectures on the actor's craft "The Actor's Ways and Means". I was attracted, in much the same way that I am attracted to black and white films on TV, by the prospect of concise, focused, genteel and erudite communication. And also by the entertainingly quaint photographs of an earnest and experienced Stagecraft Exponent; all greasepaint, wild hair and madness as Lear; splendidly posed, seated on a high backed chair, and with teacup held perfectly as a well to do gentleman should; and again, all greasepaint, wild hair and evil lidded stare as Shylock.

It was for amusement that I read this small tome. But as I started to read, I found parallels to the software testing and development world. This could be seen as a mild form of obsession, but I was waiting to go to a testing meeting and I had an hour to spend in idle pursuits, so naturally testing was still just below my conscious focus.

Admittedly when I had bought the book I thought "perhaps testers need to learn the skills of acting in order to put themselves in the situation of users when testing the software". Perhaps they do, perhaps they don't. Perhaps they don't need anything so formal as acting skills, developing other thought processes or taking on other beliefs; perhaps they just need to understand and think about the aims of that user. Perhaps these are the same things.

But I had forgotten all that when I came to read the book, which after all deals with the actor's craft and the skills required to handle the acting life. However, I quickly realised that the book might bear relevance to my own testing vocation when I came upon Michael Redgrave's descriptions of critics.

The conditions in which critics work are harder than is generally supposed. I could wish they could be given more time, more space, less to do and more room in which to do it. I could also wish that the written history of our theatre were not a history of first nights. [1]

I could not help but empathise. I too could wish that the critics of the software development world, and I refer to testers here, were given more time, more space, less to do and more acknowledgement of the hard work that is done. It often seems that software development is a history of first nights because systems are so very quickly passed to the repertory players in the maintenance department.

After more than a few years in theatre, Michael Redgrave was aware that however harsh his critics' words might be, they would critique out of a love of good theatre. Testers too do it out of a preference for good development practices and to have produced for the user a product which will match the user's needs.

Testers, like critics, can "occasionally be severe … savage... loving and yet frivolous and spiteful".
Testers can be all these things but we try hard to be diplomatic, for unlike the relationship between critic and actor, the tester plays on the same stage as the developers and is judged as harshly by the audience of management and users as the development team players.

Michael Redgrave's book is intended to be read by aspiring, flowering and seasoned actors so is a practical treatise on his own theories of acting and those of Stanislavski and Meyerhold, with a few anecdotes thrown in for the luvvies in the gallery.


Over the years Michael Redgrave had recourse to converse with many of his acting chums on the theory and practice of acting but held true to his own belief that actors were born, not made, yet they could better their exposition of their craft through the use of theory and method.

No doubt this was a contention that many of his colleagues could not agree with. "No No Michael", they must have emphasised with a sigh, "these theories and methods are of no use, we are artists born to either greatness or mediocrity, we live for the play and play to live, we have a prime and we will reach it naturally over time and experience, it takes 20 years to make an actor dear boy."

Perhaps some people are born with the base requisite attitudes to be a tester, but these attitudes can also be learned. The skills of testing are certainly learned, often by experience in the flaming inferno of development hell.

The theories and methods of testing have been outlined in numerous tomes and it is up to the individual tester to educate themselves in the numerous pages of testing experience.

Anyone who has learned some of these skills and methods will have experienced the pain of trying to implement them in the real world and while some of the testing tomes cover this aspect of applying the theory, none do so in quite the same way as Michael Redgrave did in the following paragraph.

Theory and method…are of immense value to the actor who can translate them into practice. They are also, there can be no doubt, toxic to the actor who cannot. They should be labeled 'as prescribed by the physician.' But at their worst they are never as poisonous as convention. The conventional actor suffers a growing paralysis for which there is, after a time, no known cure. [1]

There are certainly times when we have unsuccessfully tried to apply a technique to a situation where it was not suited and it failed. There are certainly times when we tried to apply a technique or theory when we did not understand it and the application cost us more than if we had not. But we should certainly never fail to continue to learn our testing methods and theories and also learn from applying them. There is a skill in the application of this knowledge and this too has to be learned. We should never be put off from this application after a few failures.

And we should never be put off from using or learning a technique because we can't see how it might apply. The subtleties of the technique may not become apparent until a degree of mastery has been achieved, and then we are faced with problems of a different nature.

When the actor has gained some mastery of the essential qualifications, his difficulties have only just begun. He must in the first place strengthen his mastery of all these things so that in each of them he can feel a great reserve of power and then he must, by his intelligence or taste, know how not to use these powers to the full. He must know in fact what to leave out. [1]

There are many levels of maturity of development process and there is no single fit testing method. There is no book that you can take off the shelf and guarantee that the application of the methods and theories in that book will result in the most effective testing practice for that organisation. There are basic skills which we have to learn, but more importantly we have to learn when to apply them to construct the most efficacious test strategy.

And sometimes the most efficacious test strategy is not going to be as formalistic and structured as we would wish.

The basis of all acting is undoubtedly instinctive, but that does not mean that a great deal of this is not susceptible to some kind of analysis, or that method may not make it more than it mars. [1]

All of us have had the experience of using a computer program, trying something and having said program crash. Testers appear more prone to this phenomenon than other users, even in their non-professional use of computers.

The power of improvisation is something which is very much underrated in our professional theatre, where it is regarded as something a little bit amateurish or childish. It is not childish but rather child-like, and it is the faculty which an actor has to be like a child in his naiveté which helps him to avoid becoming merely a routine performer. It is paradoxical that in our commercial theatre, with its long runs, in the performance of which the actors are most in need of a stimulus to keep their imagination fresh and child-like, we should neglect the opportunity of exercising those qualities. Of course every really imaginative actor and actress has the power to improvise. We show it at rehearsals and also when things go wrong during the performance… [1]

Witness the rise of exploratory testing, best fit testing and agile methods. These are not simply a throwback to unstructured and unmanaged testing. These require highly skilled testers with a full knowledge of techniques and methods and a highly honed sixth sense of quality.

Testers are skilled in the art of usage improvisation. When a fault occurs, testers enter a phase of isolation where they focus in on the fault and try to narrow the range of circumstances in which it happens in order to communicate more effectively the nature of the defect to the developers. This is a process of improvisation, and it is improvisation based on a pre-existing model of an ideal system.

There are no doubt people of theatrical genius who can completely improvise their way through 90 minutes without it proving detrimental to the cast, crew or audience, but most actors will be better served by thinking about what they are going to do beforehand while being prepared to improvise when required.

The actor has to build a model of the character and the play in which they find themselves playing in order to effectively present that role to the audience so that the performance is a "great moment... when his playing achieves 'rhythm' … has 'flight', or 'leaves the ground'... But these occasions mostly happen when the preliminary work has been deeply felt and composed with at least some conscious care." [1]

Michael Redgrave has some notes on how this is done.

I imagine that the mind process of the actor at this stage are similar to the work of a detective. He does not set about like a police inspector, simply to gather every bit of evidence for its own sake as a matter of routine, but like Sherlock Holmes, or the detective Maigret…he shifts the available evidence around in his mind rather as one might shift the pieces of a jig-saw puzzle until by some instinct he finds himself in possession of a psychological clue or characteristic which will suddenly illuminate the whole character for him, and help him find the truth.[1]

Similarly, preparation is vital to the testing process. Thinking is vital to the testing process. Questioning is vital to the testing process, just as the actor questions the motivation of his character, not to berate the writer of the play, but to understand the words he is saying and to make those words live. If words contribute only 7% of the meaning of communication [footnote1] and physiology and intonation contribute far more, then the test scripts that we have are merely words. The thinking behind those scripts, the models, the conditions, the aims and hunches, these are the valuable aspects of testing and these are what we get from preparation.

[The Stanislavski method] remains largely a matter of instinct though the germ of the Stanislavski method is to help the actor discover the creative mood, to clear the decks for action. [1]

From the brief summary of Stanislavski's method in this book I wonder, could the method be of use in inspiring the tester, to help them think and to identify their aims and purpose?

On Offered circumstances…
When I have directed a play I have sometimes asked an actor to ask himself why he thinks the author has written his particular part in the play. Absurd as it may seem, it is often very difficult to persuade actors to consider this point seriously…It is not so much a question of the actor 'knowing his place' in drama but of knowing his value. [1]

Testers too should ask themselves. Why am I on this project? What do the customers want? What do the developers want? What does the manager want? What do I want? Which of those wants are valid? Which of those wants conflict? Which of those wants are unreasonable?

The final phase in the preparatory work on a role was to find what has variously been translated as the seed, the grain, the kernel or the core of the character, to which all the previous considerations are preparatory. This is followed by the 'aim' of the character and when the actor is conscious of this, all else is forgotten, for then the actor, as the character, can answer the question, 'What do I want and why?' [1]

And should we be too wary of constructing such models lest we get them wrong?

The truth for him that is. It does not, of course, follow that he will always be right…And what do we mean by 'right' in this case? We mean right in the sense that it fits both the circumstances offered by the author - the scene of the crime - and what we might call the personality and motives of the criminal- that is, the character. [1]

We may well get them wrong. False Positives are a hazard of the trade. Extraneous tests as a result of incorrect domain analysis and data partitioning are par for the course. But by not preparing correctly we run the risk of putting in a performance which does not fly and in which the audience do not believe.

There are techniques for helping us get our models right. Reviews are very important. Communication with our other testers and development team members is essential. The test process can not be isolated. Actors don't prepare and practice individually with the script and then come together for opening night, if they did then every play would be a disaster and theatre would become financially impossible to produce. Such a thing should never be allowed to happen in the world of software development, just imagine how much it would cost the business community if most IT projects failed. The economy could be ruined.

But we should not judge our performance solely by the reaction of the audience, we must have our own integrity and faith in our own craft. If we pander too much to the audience then we risk the play becoming a pantomime.

[The audience] can betray [an actor] into seeking easy ways to please, repeating blandishments which he knows have been previously successful. It can force him to seek to dominate their mood, as he is frequently obliged to do, with force or tricks which are alien to the part he is playing. It can, in short, make him a flatterer or a fighting madman. 'To please the ears of the groundlings' has become 'To play to the gallery.' [1]


…though the audience may bring a stronger pressure on him than either his author, his producer or his own artistic conscience, it should never force him to be faithless to these three. Perhaps what I am really trying to say is that he must find his own artistic conscience. A faith in what he thinks and hopes he can do. [1]


Everyone involved in the development process has their own aims and requirements of the process. They will not necessarily want to know those of the testers. They will however want to see that where there is an overlap between the aims and processes of the testers and their own aims and processes that they are not detrimentally affected. The conducting of the testing process can be a process of education for everybody in the development team. Not least of all the testers. The comedian can't know how well a joke works until he has said it out loud and been witness to the response.

It is a fallacy to suppose that even an average audience, if there is such a thing, comes to a play with an open mind. It comes bristling with a variety of prejudices. It is one of the functions of the drama to allay these prejudices and to leave the audience with a more open mind and heart…[the actor] must challenge and at the same time embrace his audience… [1]

And so we reach an end. To sum up, how does one become a good actor, or a tester?

Well, just as there are similarities in method and philosophy, there is also a similarity at the fundamental level.

To act well and to act well repeatedly has to become an obsession. [1]

In order to become good.

To become better than good.

In order to excel.

You have to care.


Be interested in the subject, study the subject, do your best to reinterpret and reinvent the subject.

Good actors are not good actors because they mimic the performances of other earlier great actors. They are good actors because they have worked hard to become good actors with their own styles and methods, they take inspiration from the world around them and bring it to the performance at hand.

When I said…that the desire to act must become an obsession, I did not mean that it should become a total obsession. Quite simply it means that if you are going to produce the best work of which you are capable, that work will, sooner or later, take first place in your mind. [1]
 
 
References:
[1] "The Actor's Ways and Means", Michael Redgrave, William Heinemann Ltd, 1953
 
 
Footnotes:
[footnote1] a seemingly random figure which I remember from somewhere, but I can't remember exactly where.

This essay originally appeared on CompendiumDev.co.uk on 4th January 2002 as 'journal notes'. This was in the days before I had a 'blog' setup and still had a 'web site'. It seems more like a blog post than an essay so I've moved it over to EvilTester.com.

Wednesday, 13 February 2002

Finding Faults with Software Testing

There are many universal statements that a tester will come across in the course of their career, some true, such as "Testing can only prove the presence of faults and not their absence", and some false, such as "Developers must not test their own code".

Such statements may well be paraphrased quotes, the original source may not be known or presented, and the quote may be taken out of its original context. But they will be one common form of the phrase that exists in the industry's shared consciousness. Each of these statements has an effect on the person hearing them or thinking about them. And each of the Development Lifecycle personnel, no matter what their role is, will re-interpret the phrases based on their own experience.

The benefit of such statements is that they provide something to think around. The danger is that the brain loves to simplify things, identify patterns, reduce and compartmentalise information and can therefore build a less than optimal reasoning chain.

Here is a sequence of derivations:
  1. Testing can only prove the presence of faults and not their absence.
(therefore)
  2. The objective of testing is to find faults.
(therefore)
  3. A good test is one that has a high chance of finding a fault.
(therefore)
  4. A successful test is one that finds a fault.
(therefore)
  5. An unsuccessful test is one that does not find a fault.
Each statement is more focused than the last, and as the focus narrows it changes the thinking that is done, and can be done, around it. For the rest of the article, thinking around these statements is what we shall do.

Testing can not prove the absence of faults

  1. Testing can only prove the presence of faults
  2. Testing can not prove the absence of faults.
A practicing tester might well argue about statement 2 as it may run contrary to their experience and the aims of testing as defined by their organisation. One of the aims of their testing process is to validate the requirements, in effect, to prove the absence of faults. If a test runs to completion without a fault being identified, and on repeated runs (when it is run in exactly the same way) does not identify a fault, hasn't it just identified the absence of faults under the circumstances specified by the test?

Possibly, but you can't prove it. The test did not identify a fault, but that doesn't mean that a fault did not occur at a lower level of abstraction than the one in which the test was described. Perhaps the results were merely coincidentally correct.

Question: If it can be coincidentally correct then how can I ever be sure that the program works?
Answer: You can't; ours is a risk-taking business.
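A small illustration of coincidental correctness may help here. The function below is a made-up example with a deliberately planted fault; the name `double` and the chosen input are assumptions for the sketch only.

```python
def double(x: int) -> int:
    """Intended to return 2 * x, but contains a deliberate fault."""
    return x * x  # planted fault: squares instead of doubling


def test_double_of_two() -> bool:
    """The only test in the suite: checks a single input value."""
    return double(2) == 4


print(test_double_of_two())  # prints "True": the test passes
print(double(3) == 6)        # prints "False": the fault was there all along
```

The test passes on every repeated run because 2 * 2 and 2 + 2 happen to agree. It demonstrates correct output for that one input and nothing more.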

We could accept the objection and re-phrase the statement as:

"Testing can demonstrate the absence of faults at a specific level of abstraction for specific input values over a specific path for the specific items checked"

Caveats:
  1. the test must have been executed correctly,
  2. the test correctly models the expected outcomes.
If the user only ever wants to do tasks which exactly repeat the test then there is a fairly low risk of failure in the live environment. How many users do the same action over and over again with the same data that was used during testing?

We know that testing cannot prove the absence of faults because complete testing is impossible; the combinations of path and input are too vast.
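To put a rough number on "too vast", here is a back-of-envelope calculation. The scenario is an assumption chosen for illustration: a function that takes two 32-bit integers, tested exhaustively at a (generous) one billion tests per second, ignoring paths entirely.

```python
BITS_PER_INPUT = 32
NUMBER_OF_INPUTS = 2

# Every combination of two 32-bit values: (2^32)^2 = 2^64.
combinations = (2 ** BITS_PER_INPUT) ** NUMBER_OF_INPUTS

TESTS_PER_SECOND = 1_000_000_000        # assumed execution rate
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000

years = combinations / TESTS_PER_SECOND / SECONDS_PER_YEAR

print(combinations)   # prints 18446744073709551616
print(round(years))   # prints 585
```

Roughly 585 years for the input values of one small function, before any combination of paths or state is considered.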

Testing can result in overconfidence in the stability of a product if the universal statement is not understood and accepted.

Testing can only prove the presence of faults

This derivation brings with it a number of presuppositions, a few obvious ones being: the person reading the statement knows what testing is, and what a fault is.

Testing is not an easy thing to define. There are no satisfactory definitions of testing that are universal to every project. Testers on the same project will have different definitions of testing; testers on the same team will have different definitions of testing. This statement is unlikely to be perceived in the same way by any other member of the development process.

So what is testing?

We can agree on the limits of testing but we may not agree what testing is. Is testing the activity done by a tester? Yes, but not only; developers do unit testing. Is a review testing? After all, reviews find faults; does that mean that testing encompasses reviews or that testing is not the only thing that proves the presence of faults?

Perhaps 'testing' is too specific to be part of a universal statement for the purposes of effective analysis. Testing is a quality control process, as are reviews, and walkthroughs. Perhaps the statement should be re-phrased as:

"Quality control processes can only prove the presence of faults and not their absence."

In the mind of the reader this may have the effect of stressing the need for many different quality control processes throughout the life cycle, as no single quality control process is going to prove to us that the system has no faults. Perhaps 'the need for quality' becomes the development focus and testing and reviews become thought of as tools to that end. The benefit of a tool is that it can be wielded by anyone.

Testing is whatever the organisation that you work in says it is. Testing is an environmentally defined mechanism for providing feedback on perceived quality. The 'How' of testing is much more important than the 'What'.

So what is a fault?

A fault is obviously something that is not correct. Therefore testing must obviously know, or believe that it knows, what is supposed to happen in advance of running a test. Testing (and other quality control processes) do this by constructing a model of the system. The model is then used to predict the actions that will occur under specific circumstances. The correctness of a situation is assessed in terms of the things that conflict between the model and the reality (the implementation under test). When the model and the implementation do not match then there is a fault.

The fault may well be in the model and not in the implementation. Quality Control processes must themselves be subject to processes of quality control.
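
The comparison of a model against an implementation can be sketched in a few lines of Python. This is only an illustration: the fare rule, the function names `model_fare` and `implementation_fare`, and the deliberate defect are all hypothetical and are not taken from any real system.

```python
# Hypothetical example: all names and the fare rule are invented for illustration.

def model_fare(flight_type: str, base: int) -> int:
    """The tester's model: what we believe should happen."""
    surcharge = 50 if flight_type == "international" else 0
    return base + surcharge


def implementation_fare(flight_type: str, base: int) -> int:
    """The implementation under test, with a deliberate defect for local flights."""
    surcharge = 50  # defect: the surcharge is applied to every flight type
    return base + surcharge


def find_mismatches(situations):
    """Return every situation where the model and the implementation disagree."""
    mismatches = []
    for flight_type, base in situations:
        expected = model_fare(flight_type, base)
        actual = implementation_fare(flight_type, base)
        if expected != actual:
            mismatches.append((flight_type, base, expected, actual))
    return mismatches


situations = [("local", 100), ("international", 100)]
faults = find_mismatches(situations)
```

A mismatch only says that the two disagree. Deciding whether the model or the implementation is wrong still takes investigation.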

Wherever it is, the fault will be investigated and possibly fixed. And for the rest of the life of the implementation, testing will make sure that if the fault comes back then testing will find it. That's the purpose of regression testing, or "re-running the same damn test time and time again" as it is known in the trade.

A good test is one that has a high chance of finding a fault

How do we ensure that the tests created are 'Good' tests, those with a high chance of finding faults?

The process of test modelling can identify incongruities and omissions in the source documents. Some of these could have made their way into the final deliverable as faults, so modelling lets us find faults before we execute any tests.

Test derivation is the process of applying strategies to models. Different models have different strategies that are applicable.

Most testers are familiar with flow models, the most basic form being a directed graph. This can be used to model the sequence of actions in a program with decision points and iterations. There are numerous strategies that can be applied: statement coverage, branch coverage, condition coverage, domain testing. (Binder, Beizer [1][2])

A 'Good' test is one that can be justified in terms of one of the strategies as it applies to one of the models. A test is a way of exploring a model.
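
As a rough sketch of applying a strategy to a flow model, the Python below holds a directed graph as a dictionary and reports the branches that a set of tests never exercises. The node names and the single test path are hypothetical, and the coverage measure is a simplified edge coverage rather than any tool's definition.

```python
# Hypothetical flow model: node names are invented for illustration.
flow_model = {
    "start": ["check_seats"],
    "check_seats": ["book", "reject"],   # decision point
    "book": ["end"],
    "reject": ["end"],
    "end": [],
}


def all_edges(graph):
    """Every branch (edge) in the directed graph."""
    return {(src, dst) for src, targets in graph.items() for dst in targets}


def edges_on_path(path):
    """The edges exercised by one test, given as a sequence of nodes."""
    return set(zip(path, path[1:]))


def uncovered_edges(graph, test_paths):
    """Branches in the model that no test path exercises."""
    covered = set()
    for path in test_paths:
        covered |= edges_on_path(path)
    return all_edges(graph) - covered


tests = [["start", "check_seats", "book", "end"]]
missing = uncovered_edges(flow_model, tests)
```

Each uncovered edge is a justification for a new test: the strategy, applied to the model, says that path has not yet been explored.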

But how can we assume that this strategy has a high chance of finding a fault?

Strategies are typically constructed by analysing the faults that are discovered outside of testing i.e. faults which testing did not manage to pick up because the strategies in use did not result in tests which found those faults. Other people have done much of the work for us and we have a set of very effective strategies which testers can pick and choose depending on the information which they have to construct the models, the tools they use, the timescales they face, the degree and rate of change which the source information faces.

Successful and unsuccessful tests

A successful test is one that finds a fault.
                                (therefore)
An unsuccessful test is one that does not find a fault.

This is not typically the way that testers describe their tests. A test 'passes' when it has not revealed a fault and a test 'fails' when it has revealed a fault.

It would be too easy to use the above statements as justification for removing the tests that did not find a fault from the test pack, or possibly the strategy that created that test from the process.

The test was only unsuccessful this time. The test should only exist because the strategy used to create it has been effective in finding faults before. Software changes affect the outcome of tests. Tests that previously revealed faults suddenly don't - this is good, a fix has been successful. Tests that did not find faults suddenly reveal faults - this is not so good, the fix caused an unexpected side effect.
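
One way to see these changes is to compare two runs of the same test pack. The sketch below is an assumption about how results might be recorded (a dictionary of test name to "revealed a fault"); the test names and results are hypothetical.

```python
# Hypothetical test names and results, invented for illustration.
# True means the test revealed a fault on that run.

def classify_changes(previous, current):
    """Compare two runs of the same test pack and group tests by what changed."""
    changes = {"fix_confirmed": [], "new_fault": [], "unchanged": []}
    for name in sorted(previous.keys() & current.keys()):
        before, after = previous[name], current[name]
        if before and not after:
            changes["fix_confirmed"].append(name)   # fault no longer revealed
        elif after and not before:
            changes["new_fault"].append(name)       # possible side effect of a change
        else:
            changes["unchanged"].append(name)
    return changes


run_1 = {"create_flight": True, "delete_flight": False, "land_flight": False}
run_2 = {"create_flight": False, "delete_flight": True, "land_flight": False}
report = classify_changes(run_1, run_2)
```

The 'unchanged' group is not a list of tests to delete. Those tests were only unsuccessful on these two runs.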

Remember, testing can never prove the absence of faults and that if testing is not finding any faults then your customer probably will. The results of testing should never be taken to mean that the product is fault free.

If testing is not finding any faults then it may mean that testing has done the best that it can and has to wait for the live fault reports to come back before the testing process can be improved further. I've never worked in an environment where that statement is true. Testing is an infinite task, but it has typically done what it can given the current time, budget and staffing constraints - as has every other part of the development process.

However if the testers are running many 'unsuccessful' tests then the strategies used to construct the tests should be re-evaluated. Perhaps the strategies used are not effective, perhaps the combination of models led to the construction of redundant tests. If the strategies and models used are changed then the testing will change as a result.

Focus your mind

Statements like "The objective of testing is to find faults" have psychological effects on testers. They focus the mind. They make the tester avoid the easy road and encourage them to explore the dark passages hidden within the flawed system. They encourage the tester to examine test results carefully so that fewer faults slip through. Prioritisation and risk assessment become vital parts of the testing function.

Hopefully they make the tester more responsible about their testing, hopefully they make the tester want to improve their testing, and hopefully they make the tester eager to learn about strategies, models and techniques which have been effective in the past.

What happens then if the phrase is changed to: "The objective of a quality control process is to find faults"? This is a more universal statement that applies to every part of the software development lifecycle. Perhaps the software development lifecycle can become more aware of the faults that are being built into the system and carried forward from one process to the next. Quality control is the responsibility of everyone on the project.

Shape your process

By taking a single statement and distorting it slightly, by simulating a human communication process from mind to mind, we have explored some of the thinking processes that these simple statements can trigger. And will continue to trigger. For as you think about the statements we have presented, there are other conclusions, other derivations, and other meanings that have not been mentioned here, ones more relevant to you and your organisation. As you explore those, the ones that stem from your own thoughts, they are the ones that will shape your testing processes for the better.

This essay originally appeared on CompendiumDev.co.uk on 13th February 2002 as 'journal notes'. This was in the days before I had a 'blog' set up and still had a 'web site'. It seems more like a blog post than an essay so I've moved it over to EvilTester.com.

Friday, 8 February 2002

Test Conditions

Test Conditions are statements of compliance which testing will demonstrate to be either true or false. These are (in effect) test requirements.

Conditions serve different purposes. Some conditions will act as the audit reason for a particular test case e.g. 'The user must be able to create a flight'. The tester will create a test which creates a flight. Obviously there are more attributes to this case than this - what type of flight, what type of user, fully booked, partially booked, etc. These attributes are other types of conditions.

Some conditions are used to define a test’s attributes or preconditions. E.g. create flight of type local, create flight of type international.

Or are they....

This may be modelling that has not gone far enough.

The initial condition 'create a flight' is valid. When we only have test conditions as our modelling tool then we have to represent this as a condition. It is also a program function - create flight, or an object method, or an entity event, or a business process. Consequently we should really have a model and a derivation strategy that says “there must be at least one test for each entity event” or “there must be at least one test for each object method”. In this case it is obvious that “one test” will not cover the condition, but with a rich model, such as an object or entity model, we have a list of properties or attributes, and these will have scoping variants (e.g. attribute flightType - international, local).

Basically, we use these context rich models to give us the combination information that we require to construct test cases. Without this approach we will never know if we have a valid or complete set of condition combinations.

Hierarchical models are appropriate for test grouping i.e. tests related to business processes, program modules, program functions etc. There is no reason why a test cannot be in more than one test grouping.

Hierarchical models are appropriate in derivation for hierarchical structures. (It is possible to list entities, attributes and events as hierarchical structures, but this hides valid combination options and is a mix of ELH and ER; we should really have a relationship section on the model.)

e.g.


  • Entity: Flight
    • Attribute: Type
    • Attribute: Start Airport
    • Event: Create
    • Event: Takeoff
    • Event: Land
    • Event: Delete

Models for test derivation should be rich. This allows a derivation strategy to be created which can be used to gauge the completeness of the test products and the validity of the test products.
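
A derivation strategy over a rich model can be sketched as follows. The entity, attributes and events follow the Flight example above, and the two flight types come from the text; the 'Start Airport' values and the function name `derive_tests` are hypothetical. The strategy shown, one test per entity event for every combination of attribute values, is just one possible reading of "at least one test for each entity event".

```python
from itertools import product

# The entity, attributes and events follow the Flight example above.
# The attribute values for "Start Airport" are hypothetical.
flight_model = {
    "entity": "Flight",
    "attributes": {
        "Type": ["local", "international"],
        "Start Airport": ["GLA", "LHR"],
    },
    "events": ["Create", "Takeoff", "Land", "Delete"],
}


def derive_tests(model):
    """Strategy: one test per entity event for every combination of attribute values."""
    names = list(model["attributes"])
    value_lists = [model["attributes"][name] for name in names]
    tests = []
    for event in model["events"]:
        for combination in product(*value_lists):
            tests.append({"event": event, **dict(zip(names, combination))})
    return tests


derived = derive_tests(flight_model)
```

Because the tests are generated from the model, completeness can be judged against the model: four events and four attribute combinations must give sixteen tests, and any shortfall is visible.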

With rich models, test conditions become requirements which are used to check the completeness of the test derivation approach rather than the audit reason for each test - unless, of course, there is no way to make the construction of test cases automatic with the implicit cross referencing of test conditions. That is, we don’t have to state ‘Create a flight’ as a test condition because there is an entity event on flight called create which we know we have to be able to test, and it will apply to a variety of attributes on that entity. Without this rich modelling, and without an implicit (or strategy driven) approach to testing, a vast number of test conditions have to be created and maintained and no guarantee of combination thoroughness can be achieved.

Thursday, 7 February 2002

Process Improvement

The simplest way to improve a process is to analyse the errors that it lets slip through.

For every error not found by one of the previous quality control processes, ask: is there a strategy that could have been applied to one of the existing models that would have created a test case that could have identified the error?

If the answer is yes then it may well have slipped through because timescales or staffing levels forced your hand and you simply didn't have the time to apply that strategy to that model, or the risk of not applying it was deemed low.

If the answer is no then we have to identify a model and a strategy that could have found it.

In both cases assess the cost and time impact of adopting that model and strategy. The development process is one of trade offs and compromises.
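
The question above amounts to a small triage routine. The sketch below is one hypothetical way to record it; the class `EscapedError`, its fields, and the example errors are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EscapedError:
    """Hypothetical record of an error that earlier quality control missed."""
    description: str
    # An existing model and strategy that could have caught it, or None.
    existing_strategy: Optional[str] = None


def triage(error: EscapedError) -> str:
    """Apply the question from the text to one escaped error."""
    if error.existing_strategy is not None:
        # Yes: it probably slipped through because of time, staffing or a risk call.
        return "review why the existing strategy was not applied"
    # No: a new model and strategy has to be identified.
    return "identify a model and strategy that could have found it"


errors = [
    EscapedError("fare wrong for local flights", existing_strategy="branch coverage on fare flow"),
    EscapedError("timezone shift on landing time"),
]
actions = [triage(e) for e in errors]
```

Either outcome still needs the cost and time assessment described above before anything is adopted.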

Wednesday, 6 February 2002

Error Guessing

Error Guessing is described in 'Testing Computer Software' by Cem Kaner [1]:
"For reasons that you can't logically describe, you may suspect that a certain class of tests will crash the program. Trust your judgment and include the test."
This quote suggests to me that there is an informal model in the tester's head and that the subconscious is applying a strategy to the model which the tester is unaware of. The tester is only aware of the subconscious flagging the results of that check to the conscious as a nagging doubt.

If you do engage in error guessing then you should be aware that:

  • You have a model and applicable strategy in your head that you are not using on the project, or possibly are not even aware of.
  • If your strategy does work then you should try to quantify it so that you can use it consistently.
  • If it doesn't work then you should possibly change the model and strategy in your head.


[1] Testing Computer Software, Cem Kaner, Jack Falk, Hung Quoc Nguyen, 2nd Edition 1993, International Thomson Computer Press

Tuesday, 5 February 2002

Testing

Testing is exploration. In a mature testing organisation the expedition is well planned, and staffed with seasoned explorers. The planning will be done around a number of maps of the territory to be explored. Some maps will show different levels of detail - to show all the detail on one map would confuse the issue, so one map will identify areas of population, one will provide information on seasonal rainfall statistics etc. Maps are very important. The explorer plans different routes through the maps to match the aims of the expedition. Perhaps they are trying to unearth hidden temples and consequently will pick routes which take them through areas which are sparsely populated now, but in the past were densely populated. Effective exploration requires an understanding of the terrain to be explored.

Errors cannot be found without a model. Quality Control cannot be conducted without a model.

I have heard it said that "some testers never model" and "reviews are not conducted against a model". These statements are false. In the absence of a defined and identifiable model, there will be an informal model, a model of understanding in the tester's head.

A model is our understanding and knowledge of the system. The level of testing that can be done with no understanding and no knowledge of the system is zero.

Try this. Take a program whose purpose you don't know. Make sure that the program presents all its information in a way that you cannot understand. If you don't know Japanese then test a Japanese program. If the information presented to you is obscure enough then you will find it impossible to build a model of it and then you have no way to assess the correctness of any action. Remember that if you even understand the name of the program or its main purpose then that is information that you will have assimilated into a model and will use during testing.

Reviews cannot be conducted without a model of whatever the thing being reviewed is supposed to represent. Review models are different from testing models.

A review will be conducted against a number of models:


  • The model of a well-formed document. (does it have a title page? Are the pages numbered?)
  • The syntax of the actual text
  • The semantic model in the reviewer's head of the items to be presented which they have a vested interest in.

There are at least as many informal models as there are people.

Modelling is a fundamental task in testing.

Quality Control is essentially the checking of a model against an implementation of that model.

A test is a specific situation with a predefined set of things to check against the model. The differences are errors, either in the model or the implementation.

Monday, 4 February 2002

Testing and Modelling

In order to test a model we have to have some way of recognising the success or failure of a test. Our test must have a goal; it must have a reason for existing. That reason is inherent in the model from which the test is derived. This means that it is difficult to use a model to test itself. In software development this is typically not a problem, as we rarely use the source code as the only model when testing the source code. We typically derive tests from a design model, a requirements model or a specific testing model, or even a model which may be derived from the source code: essentially any model which has the level of detail required for our testing.

Modelling is as fundamental an activity of testing as it is of development.

Test Strategies are applied to models. This is for the purposes of test derivation, derivation and execution coverage measurement, domain analysis, risk analysis, the list includes almost every task that testers do.

Strategies typically evolve and are identified by thinking about errors that slipped through and identifying a strategy that could have found them.

Sunday, 3 February 2002

Checking the model

Do we want to produce programs that are fault free or do we want to produce programs that allow people to do what they want to do in a quality manner? Ideally I'd prefer both, but humans make mistakes so we should aim for a product that functions in a quality manner. When we achieve that we do it by assuring ourselves of the quality of the models in the software development process as they are produced, and before they are fed into a downstream process that will build new models using the information contained in those models.

How do we assure ourselves that the model is a quality model? We have to know what we want from a model, control the production process as much as possible, check the model when it is done and periodically throughout its production.

This is the fundamental assertion of the T.O.T.E model in psychology [1]. The T.O.T.E model is a sequence of steps: Test, Operate, Test, Exit. With a goal in mind we Test to see if the goal has been achieved; if not then we Operate to change something which will hopefully bring us closer to our goal, we Test again, and if the test is satisfied we Exit that process as our goal has been achieved.

The T.O.T.E model describes a process of refinement.
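
The Test, Operate, Test, Exit sequence can be read as a simple loop. The Python below is a minimal sketch of that reading, not a claim about how the model is formally defined; the function name `tote`, the `max_operations` guard and the review-comment example are all assumptions added for illustration.

```python
def tote(goal_met, operate, state, max_operations=100):
    """Test, Operate, Test, Exit: operate on the state until the goal test passes.

    max_operations is an assumption added here so the sketch always terminates.
    """
    operations = 0
    while not goal_met(state):            # Test
        if operations >= max_operations:
            raise RuntimeError("goal not reached within max_operations")
        state = operate(state)            # Operate
        operations += 1
    return state, operations              # Exit


# Hypothetical goal: refine a draft model until it has no open review comments.
final_state, steps = tote(
    goal_met=lambda open_comments: open_comments == 0,
    operate=lambda open_comments: open_comments - 1,
    state=3,
)
```

The loop can only exit cleanly if `goal_met` can actually be evaluated, which is the point made above: we have to know what we want from a model before we can check it.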

[1] Modeling with NLP, Robert Dilts, 1998, Meta Publications

Saturday, 2 February 2002

Modelling the development process

Software development tries to create a product. It does this by engaging in a number of processes (requirements, design, coding, etc.). Each of these processes will create a model that will either be the final product or an input to a follow-on process. The best example of this is one of the popular life cycle models: waterfall or V model.

There are constraints on the construction of a product; we need it in X days, it must only cost Y thousand, and you only have Z staff members. The skill of developing software is in the effective application of the strategies that have been learned bearing in mind the constraints involved.

Modelling is a fundamental activity of the software development process.

To take the main processes from a minimal software life cycle:
  • Requirements
    • Requirements are modelled, possibly as text.
  • Design
    • The popular UML provides a range of diagrams: Class, Object, Component, Deployment, Use Case, Sequence, Collaboration, Statechart, Activity
  • Coding
    • The program is modelled in Code. We have the choice of language to model the system in be it C++, Smalltalk or assembly. The code is a model. When we execute the system we have to have special programs to map that executable system back into the code model. e.g. a source code debugger. (It is possible to formalise the design models above so that they are equivalent to a code model.)
  • Testing
    •  Testing can use many of the development models and will apply strategies such as: loop once, loop twice, cover every statement, cover every predicate condition, cover every exception.

Each of the models produced in the development process is a refinement of a previous model, even if the previous model was never formally documented. A requirement is a refinement of the dreams and aims associated with the picture in the head of the person specifying the requirement.

Friday, 1 February 2002

On Modelling

"A map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness…" A. Korzybski, Science & Sanity, 4th Ed. 1958, pp. 58-60 (quoted from: R Bandler & J Grinder, Patterns of the hypnotic Techniques of Milton H. Erickson, M.D. Vol 1, 1975, pp 181)

Human beings model the world. We learn by constructing models and then subjecting those models to tests in order to determine their validity. Human beings do this all the time. By the time a human reaches adulthood it has constructed so many models that the person is probably unaware of all those models, and may even be unaware of the modelling process.

When we are faced with something new which we don't understand, a newfangled tap in the washroom for example, we will use the models that we have already developed which are similar to that situation: all our tap models. If they fail to make the tap work then we will use other models associated with the washroom. Having encountered hand driers with proximity sensors we may wave our hands around to get the tap working. If we have been on a train or a plane then we may have encountered taps that work through foot pedals and we would start to look for those. We have models for turning things on outside the washroom using buttons or levers and we would start using those.

We use models and strategies all the time. We are experts at constructing models and strategies to apply to those models. We interact with the world through those models.

The software development process constructs software that works more often than it fails. This is in spite of us not doing what we are told is best practice, or even what we believe we should do; we don't spend enough time on requirements, we don't stabilise requirements, we don't design, we don't unit test, we don't document, we don't review. We often get away with it because modelling is a natural talent and software development is a process of modelling.

Unfortunately we also learn from experience, and if our experience of success includes not doing all these best practice processes then that leads us to not do these processes again.

We have to understand what we do and what each step is for, in order to ascertain if we can miss them out in the situation we are in.

Modelling explicitly, and understanding our models, allows us to be pragmatic.

Wednesday, 16 January 2002

An exploration of, and notes on, the process of Test Scripting

Software Test Scripting

This essay explores test scripting in terms of software development as the two processes are very similar and share many of the same techniques and pitfalls. It is primarily aimed at manual test script construction because automated test script construction is software development.

An exploration of, and notes on, the test scripting process:

Introduction

This text will explore the process of test scripting. It will do this through analogy to the development process and will try to show some of the things that testers can learn by studying development methodologies and best practices. The essay is not a complete description of test scripting; it is an introduction to test scripting and to the study of development techniques. But if you only take one thing from this essay, take this:
Test script development involves the same processes and techniques used when constructing software programs, any experience that testers have had in the past when scripting is an experience which has been shared by the development teams when developing. Testers and developers are more alike than different. Testers should study developers and their techniques; developers should study testers and their techniques. We are all software engineers, we just have different areas of specialisation, but it is foolish to specialise without knowing the general techniques of your discipline.

Definitions

A test script is the executable form of a test. It defines the set of actions to carry out in order to conduct a test and it defines the expected outcomes and results that are used to identify any deviance in the actual behaviour of the program from the logical behaviour in the script (errors during the course of that test). In essence it is a program written for a human computer (tester) to execute.
Testing uses a lot of terminology. In this text I will use the following definitions:
  • Test case:
    • a logical description of a test. It details the purpose of the test and the derivation audit trail.
  • Test Script:
    • the physical, executable, description of the test case.
  • Automated test script
    • a program that implements a test.
The development life cycle has a number of processes and tasks that the development community is involved in:
  • Requirements
  • Design
  • Coding
  • Testing
Testers are familiar with each of these stages in the context of system development and its relationship to the construction of tests. However a Test Script is a program and as such it has a life cycle that parallels that of the system development in microcosm.

The development life Cycle

Requirements

Fortunately for testers, tests are derived before building scripts. The test description itself should contain the requirements for the test script. (The process of test case construction and the corresponding requirements analysis techniques are outside the scope of this text.)

Design

Test Script design involves the construction of an executable model which represents the usage of a system. It is an executable model because the model contains enough information to allow the tester to work through the model and, at any point, know unambiguously what they can do next.

Executable Models

Executable models use 3 main constructs:
  • Sequence, one action after another.
  • Selection, a choice between one or more actions
  • Iteration, a repeated sequence or selection
Sequence:
  • The model consists of three main stages done one after the other: Initialize, Body, and Terminate.
Selection:
  • The model consists of a selection between ‘Action 1’ or ‘Action 2’ or ‘Action 3’
Iteration:
  • The model will iterate while condition C1 is satisfied.
Or, representing the above diagram as a graph:
image of a graph
The graph provides us with the following two Meta paths:
  1. Initialize [Body (Action 1 | Action 2 | Action 3)]* Body Terminate
  2. Initialize Body Terminate
A test script is an interesting executable model as it only embodies the sequence construct. This leads to the familiar situation where testers write numerous scripts around the same area of the program, each script differing slightly from the one before:
  • Script 1: Initialize, Action 2, Action 1, Terminate
  • Script 2: Initialize, Action 1, Action 2, Action 2, Terminate
  • Script 3: Initialize, Terminate
It is obvious that this situation occurs because each test script is an instantiation of one of the script model’s Meta paths, each script is a single sensitised model path.
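As a hedged illustration of a script being an instantiation of a Meta path, the Python below checks the three scripts listed above against a simplified form of the model. It assumes, as those scripts do, that 'Body' is a structural node and never appears as a script step; the function name `is_instance_of_model` is hypothetical.

```python
# Assumption: as in the three scripts above, "Body" is treated as a structural
# node of the model and does not appear as a step in a script.
ACTIONS = {"Action 1", "Action 2", "Action 3"}


def is_instance_of_model(script):
    """True if the script is Initialize, then zero or more actions, then Terminate."""
    if len(script) < 2:
        return False
    if script[0] != "Initialize" or script[-1] != "Terminate":
        return False
    return all(step in ACTIONS for step in script[1:-1])


scripts = {
    "Script 1": ["Initialize", "Action 2", "Action 1", "Terminate"],
    "Script 2": ["Initialize", "Action 1", "Action 2", "Action 2", "Terminate"],
    "Script 3": ["Initialize", "Terminate"],
}
results = {name: is_instance_of_model(steps) for name, steps in scripts.items()}
```

The model contains the selection and the iteration; each script is a flat list with both already resolved.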
Test Scripts avoid the concepts of selection and non-deterministic iteration because each test script should be run in exactly the same way each time it is run in order to aid repeatability. The tester is given no choice when executing a test script but to follow it exactly and consistently. This allows errors, once identified, to be demonstrated repeatedly and aids the correction of errors because the exact circumstances surrounding the execution of that test were specified.
Computer Programs do not avoid selection and iteration therefore the scope of a computer program is larger than a single script and a number of scripts will be required to cover the scope of the selections and iterations modelled in the program. A computer program does not represent the instantiations of the model paths; a computer program provides an alternative executable model.

Iterations in Test Scripts

Having pointed out that test scripts do not use iteration constructs, it is worth realising that in the real world testers do write test scripts using iteration constructs, and then examining why.
One of the software development tenets is the avoidance of repeated code. This aids maintainability, can often aid readability and allows re-use, which increases the speed of construction of similar procedures.
Iteration constructs are used when constructing test scripts for the same reasons.
It is perhaps unfortunate that the test tools which testers use, particularly when constructing manual scripts, do not make re-use or iteration simple. This leads to a more informal implementation than would be found in program code:
  1. repeat steps (3-12) 4 times, but this time enter the details for Joe Bloggs, Mary Smith, John Bland and Michael No-one.
  2. Press the enter key 6 times.
Example 1 above uses iteration to avoid repeating the same set of steps in the script. However, the same steps (3-12) will typically be repeated in other scripts so re-use isn’t facilitated. This is primarily because most tools which testers use for manual testing will not support sub-scripts or procedures.
When testers do use loops it is obvious from the Meta model path description of the test script what they are doing. The tester will have identified a particular instance of the Meta model path and in order to increase maintainability of the script the tester finds it more appropriate to use a higher order description as the actual test script itself.
However, the test script must not implement a non-deterministic loop or an ambiguous selection condition, otherwise the test script will not be implementing an instantiation of a Meta path; it will be implementing an actual Meta path.
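A deterministic loop of this kind can always be unrolled into the flat sequence that the tester actually follows. The sketch below shows that unrolling in Python; the four names come from the example above, while the step wording and the function name `expand_repeat` are hypothetical.

```python
# Hypothetical step text; the four names come from the example above.
def expand_repeat(step_templates, names):
    """Unroll a fixed-count loop into the flat sequence a tester would follow."""
    flat = []
    for name in names:
        for template in step_templates:
            flat.append(template.format(name=name))
    return flat


templates = [
    "Open the new customer form",
    "Enter the details for {name}",
    "Save the record",
]
names = ["Joe Bloggs", "Mary Smith", "John Bland", "Michael No-one"]
steps = expand_repeat(templates, names)
```

Because the list of names and the repeat count are fixed in advance, the unrolled script is the same on every run, so the loop is only a shorthand and not a Meta path.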

Use of the Design

A design that represents the flow of control, from which scripts can select particular paths, serves a number of purposes:
  • Oracle: expected results can be predicted.
  • Coverage: can be assessed.
If no design is produced then testers will have to assess coverage by examining the discrete set of tests and identify missing paths or actions not taken. Typically missing paths won’t be noticed until the tests have been executed a number of times and the tester has built up a model in their head and realises that they have never executed Action 3.
  • Automatic Transformation from Design to Script.
This obviously depends upon the design technique used and the availability of tool support. Testing models, particularly for scripting, can use models that do allow automatic code generation: Jackson, Flow Charts, State Transition diagrams. Typically tools exist to draw the models, and tools exist which can take the models and produce entire programs. Test scripting requires tools that can produce programs of the chosen path through the model. Testing may have to design and build their own tool to support this.
In practice the relationship between the design model and the test scripts involves less interpretation than the relationship between a software design model and the software program, therefore the test script design model must be maintained before any maintenance is carried out on the derived test scripts.
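Returning to the coverage purpose listed above: with a design in hand, the missing action can be found by a simple set difference instead of waiting for the tester to notice it. This Python sketch reuses the action names and the three example scripts from earlier; the function name `actions_never_executed` is hypothetical.

```python
# Action names follow the model above; the scripts repeat the earlier examples.
MODEL_ACTIONS = {"Action 1", "Action 2", "Action 3"}

scripts = [
    ["Initialize", "Action 2", "Action 1", "Terminate"],
    ["Initialize", "Action 1", "Action 2", "Action 2", "Terminate"],
    ["Initialize", "Terminate"],
]


def actions_never_executed(model_actions, scripts):
    """Actions present in the design that no script ever performs."""
    executed = {step for script in scripts for step in script}
    return model_actions - executed


gaps = actions_never_executed(MODEL_ACTIONS, scripts)
```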

Coding

The coding of a test script refers to the writing of a test script.
Each test script should follow the path identified from the design and as such should be fairly easy to construct if a design has been produced.
Test Scripts are typically represented by a series of steps, each step being given an id or sequence number, an action and a result.
Some test scripts will be represented with columns for pass/fail attributes during execution. This is not actually an attribute of the test script but is an attribute of the specific execution instantiation of a test script, but given the crude nature of the tool support in testing it is often easier to add the column to the script.
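The distinction between a script and one execution of it can be made explicit in a data structure. The following is a minimal sketch under assumed names (`Step`, `ManualScript`, `Execution`) and with a hypothetical two-step login script; it is not the layout of any particular test tool.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Step:
    step_id: int
    action: str
    expected_result: str


@dataclass(frozen=True)
class ManualScript:
    """The script itself: fixed steps, no pass/fail information."""
    name: str
    steps: Tuple[Step, ...]


@dataclass
class Execution:
    """One run of a script. Pass/fail belongs to the run, not to the script."""
    script: ManualScript
    outcomes: Dict[int, bool] = field(default_factory=dict)

    def record(self, step_id: int, passed: bool) -> None:
        self.outcomes[step_id] = passed

    def passed(self) -> bool:
        # A run passes only when every step has been recorded as passed.
        return all(self.outcomes.get(s.step_id, False) for s in self.script.steps)


# Hypothetical two-step script, executed once.
script = ManualScript("login", (
    Step(1, "Enter a valid user name and password", "The fields accept the input"),
    Step(2, "Press the login button", "The home page is displayed"),
))
run = Execution(script)
run.record(1, True)
run.record(2, False)
```

Keeping the outcomes on the execution means the same script can be run many times, each with its own results, without the script itself changing.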

Artfully Vague


Example: a script to save the current file in a word processor which has been saved before:
Step 1
  • Action - Click on the file option of the main menu bar.
  • Result - A drop down menu appears, one of the options is ‘Save’
Step 2
  • Action - Click on the ‘Save’ option of the drop down menu.
  • Result - The disk whirs and the date/time stamp of the file in explorer matches the time that the file was saved.

Test Scripts are often artfully vague. As can be seen from the example above. The script is written in English and a number of presuppositions are embodied, i.e. that the tester knows:
  • What a main menu bar is,
  • What a drop down menu is,
  • That ‘click’ means to maneuver the mouse pointer over the text and then click,
  • What explorer means,
  • How to check the date/time stamp in explorer.
When writing a test script the writer must take into account the level of knowledge that the tester (the person executing the test) will have. If there is not enough information then the tester may not be able to run the test, or they may run the test but, having misunderstood some of the actions, have actually run an entirely different test which may lead to a false positive or false negative result being reported.
There are also time pressures that affect the writing of scripts. Unlike a computer program a script can be poorly written but can still execute provided the human computer has the correct knowledge.
The above script could be written as: “save the file using the file menu and then check the date”. This is possible when the person executing the script is the same person writing it. It doesn’t aid re-use, repeatability or maintainability but it can still be executed. A computer program cannot do this. A computer program can skimp on the documentation and the computer can still execute the program, but the computer has no more information than that presented in the program, and the instructions presented to the computer cannot be artfully vague.

Other concerns

Development is rightly concerned about maintenance and ensuring that their code makes sense now and will make sense 18 months in the future when they have to update it.
Testing should have it easier, as the transition from design model to test script should be an automated process. Typically, though, testers don’t have an automated mechanism for doing this and end up doing the translation manually.
It can be a lot of work to document each individual test script to a precise level.
Testers probably write as much source as the development teams and yet have fewer tools to support the development and maintenance process.

Testing

Testers are aware of the importance of testing software. They should also be aware of the importance of testing their testware.
The process of constructing tests and executing them should give testers an appreciation of the difficulties of program construction. Defects slip into test scripts as often as they slip into programs. This should make testers more sympathetic towards the trials and tribulations of their development peers, but for some reason some testers still look at systems with disgust and wonder what on earth the developer could have been thinking to allow such an obvious defect to slip through. As a corollary, developers know how hard it is to program and know how easy it is to slip defects into systems, but are often scornful towards the tester who has a bug in their test script.
We can blame these attitudes on human nature but they are also symptomatic of a competitive environment where the test team and the development team feel that they are in opposition with one another. Both sides have the same problems and each can try to learn from the other.
Testing a test script can be tricky:
  • Test scripts are often constructed when there is no system available. At this time the quality control techniques are involved in checking the design model used and double-checking the mapping of the script back to the model.
  • Test scripts are constructed to test a new version of the system but the only system that is available is the old buggy version. Again the script must be validated against the design model, but it may also be possible to execute portions of the script against the old version of the system.
  • Testing the script with the desired version of the system is the most important of the situations, but it does not refer to the execution of the script. It refers to the validation of the design model against the system, in essence testing the Meta paths identified by the model.
Testers are often under time pressure, and when under time pressure errors can creep in far more quickly. This is why the design model must be as accurate and thorough as possible.

The challenge of expected results

There is an interesting challenge set by the writing of test scripts, that of expected results.
A model of system usage upon which the construction of a test script is built may not always model the steps which have to be taken in order to check the pass/fail of an expected result.
The system usage model may tell the user how to create a customer but it may not tell the user how to check that a customer has been added to the system correctly. This may have to be done via an SQL query on a database. But this information must be present in the test script in order to execute the script and determine its pass/fail status.
This suggests that the construction of test scripts is not done through only one model. There is at least one other model available which describes the conditions of the test and how to validate the successful implementation of those conditions.
The challenge to the tester is in the integration of these two models, the condition model and the system usage model, into a single test script.
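A minimal sketch of what that integration might look like when automated, assuming an in-memory SQLite database stands in for the system under test. The table name, column names and function names are all hypothetical. The first function plays the part of the system usage model (how to create a customer); the second plays the part of the condition model (how to check that the customer was added).

```python
import sqlite3


def create_customer(connection, name, currency):
    """System usage step: create a customer (stand-in for the real system)."""
    connection.execute(
        "INSERT INTO customer (name, currency) VALUES (?, ?)",
        (name, currency),
    )
    connection.commit()


def customer_exists(connection, name, currency):
    """Condition validation step: check the database with an SQL query."""
    row = connection.execute(
        "SELECT COUNT(*) FROM customer WHERE name = ? AND currency = ?",
        (name, currency),
    ).fetchone()
    return row[0] == 1


def run_script():
    """A single script that needs both models in order to report pass/fail."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE customer (name TEXT, currency TEXT)")
    create_customer(connection, "A. Tester", "USD")
    passed = customer_exists(connection, "A. Tester", "USD")
    connection.close()
    return passed
```

Neither function alone gives a pass/fail result; the script only becomes executable once the usage step and the validation step are placed side by side.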
Current testing best practice involves the construction of test conditions; these are typically developed from an analysis of specification documentation. (A thorough discussion of test conditions is beyond the scope of this text.)
There would appear to be no standard usage of test conditions:
  1. Test conditions are used to define the domain scope of tests: customer type = “Male”, currency = “USD”
  2. Test conditions are used to document ‘things’ that testing must concentrate on: “The system must allow the creation of customers”, “The system must allow the deletion of customers from the system when they have no active transactions”.
No doubt there are other uses of test conditions that I have not been exposed to.
Current testing tools, if they support test conditions at all, tend to model test conditions as a textual description. These textual descriptions are then cross-referenced to test cases in order to determine coverage of the fundamental features of the system analysed from the specification documents. There is no support in the tools to link the conditions to the scripts or script models. There is also no support for the modeling of the steps taken to validate those conditions that require validation i.e. the 2nd usage given above.
The test script construction process is analogous to the inversion, correspondence and structure clash processes presented long ago by Jackson Structured Programming and, more typically, to the informal mapping from design and specification to program code that developers do on a routine basis.
There are techniques to be learnt from these processes and tool support is required.
In the absence of tool support the software testers must document these models as effectively as possible.
This has the effect of expanding our modeling of test conditions to include instructions on how to validate that those conditions have been satisfied.
These conditions then have to be cross-referenced to the elements of the script design model at the points where those conditions would be satisfied. The conditions also have to be cross-referenced to the test cases so that the correct condition validation instructions are performed in the correct test scripts.
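In the absence of tool support, even a small data structure can hold this cross-referencing. The sketch below is one possible shape, not a recommendation of a particular tool; every identifier and every piece of wording in it is invented for illustration.

```python
# Hypothetical condition model: each condition carries its own
# validation instructions (identifiers and wording are invented).
CONDITIONS = {
    "C1": {
        "text": "The system must allow the creation of customers",
        "validate": "Query the customer table for the new customer",
    },
    "C2": {
        "text": "The system must allow the deletion of customers "
                "with no active transactions",
        "validate": "Query the customer table and confirm the row is gone",
    },
}

# Hypothetical cross-reference from test cases to the conditions they satisfy.
TEST_CASES = {
    "TC1": ["C1"],
    "TC2": ["C1"],
}


def validation_steps(test_case_id):
    """Return the validation instructions a script for this test case needs."""
    return [CONDITIONS[c]["validate"] for c in TEST_CASES[test_case_id]]


def uncovered_conditions():
    """Return the conditions that no test case is cross-referenced to."""
    covered = {c for ids in TEST_CASES.values() for c in ids}
    return sorted(set(CONDITIONS) - covered)
```

With the data above, `uncovered_conditions()` reports that C2 has no test case, and `validation_steps("TC1")` returns the instruction that has to appear in any script built for TC1.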

Conclusion

Test scripting is a time consuming, error prone and difficult process.
There is much that the tester learns from experience. Many of these experiences are shared by, and have already been documented by, the software development teams.
Models are important in testing. They form the basis for all aspects of the testing process. It should be appreciated that the models used to derive tests are different from those used to construct test scripts and that in order to construct scripts effectively the overlap and intersection of these models must be identified and controlled.
This is a small section listing recommended reading for some of the development activities discussed in this text.

The Pragmatic Programmer, Andrew Hunt and David Thomas, Addison-Wesley, 2000
  • This is a set of examples, discussions and stories which illustrate the problems and best practice solutions associated with the development process.
Software Requirements & Specifications, Michael Jackson, Addison-Wesley, 1995
  • This is another set of small essays each of which provides insight and triggers contemplation of the various aspects of software development. The discussions of problem frames are particularly relevant.
Any programming manual for any programming language
  • It is important to attempt to learn a programming language, even at a rudimentary level, in order to appreciate the difficulties of software construction and the knowledge that is ready to be assimilated into your testing.

Tuesday, 15 January 2002

An exploration of, and notes on, model path analysis for Testing

Path Analysis for Software Testing

TLDR; This essay explores the use of graph models in testing and the practice of structural path derivation using 3 coverage concerns: node coverage, link coverage and loop coverage. Predicate coverage is considered but is not covered in detail.

Introduction

This text will focus on the structural aspects of model path analysis.
This allows us to learn the basic techniques behind test path derivation. I suspect that most testers will use these techniques intuitively.
Whenever a process is conducted intuitively it always helps to examine it more formally for a number of reasons:
  • To allow it to be taught more effectively,
  • To explore our understanding of it,
  • To apply it consistently.
This text is primarily concerned with an exploration of the understanding of structural path analysis.

The example model

I will use the following model to explore the mechanisms behind path analysis:
graph image
This is undoubtedly a very basic model representation. No tester would ever want to work with a model like the above as it lacks a great deal of semantic information.
Node 4 is either a predicate node or initiates the links to 7, 5 and 3 in parallel. For the purposes of this exploration there are no parallel flows so any node with more than one exit point is a predicate node.
The model does not tell us the predicate conditions under which each link from the predicate node is taken, so we work on the assumption that any exit link is equally likely. This makes the loops in the model non-deterministic and limits the strategies which we can apply to the derivation of loop Meta paths.
From this we can see that iterative determinism is not structurally represented but is provided by the semantic information that the model embodies.
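As a rough sketch, the model can be held as a table of nodes and their exit links, and the predicate nodes picked out by counting exits. The links below are my reconstruction from the paths and loops described in this text, since the graph image is not reproduced; treat them as an assumption rather than the definitive model.

```python
# Exit links for each node, reconstructed from the paths and loops
# described in the text (an assumption; the graph image is not available).
LINKS = {
    1: [2, 3],
    2: [],
    3: [4],
    4: [3, 5, 7],
    5: [3, 6],
    6: [3, 10],
    7: [6],
    10: [],
}


def predicate_nodes(links):
    """With no parallel flows, any node with more than one exit is a predicate."""
    return sorted(node for node, exits in links.items() if len(exits) > 1)
```

Note that nothing in this structure says when each exit is taken, which is exactly the semantic information the model lacks.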

Basic Testing approach for Graph Models

Testing books typically present the reader with the following forms of coverage for flow models:
  • Node coverage
  • Link coverage
  • Loop coverage
Node coverage is achieved when the paths identified hit every node in the graph.
Link coverage is achieved when the paths identified traverse every link in the graph.
Loop coverage is achieved when the numerous paths identified explore the interaction between the sub-paths within a loop. This is a fairly vague description of loop coverage as loop coverage itself is a heuristic technique and will be explored in more detail later in the text.
Pursuing each of these types of coverage leads to path descriptions. Some of the path descriptions are ready to be sensitised and thereby map directly on to a test case. Others are Meta paths and numerous paths can typically be derived from them.
Examples:
  • 1 2 is ready to be sensitised.
  • 1 [2 [1 3 4 ]*1 7 6 ]*2 10 is a Meta path and a number of paths can be derived from it:
    • 1 3 4 7 6 10
    • 1 3 4 3 4 7 6 3 4 7 6 10
    • etc…
Path 1 2 could be considered a meta path that can only have one path derived from it. Thinking of it in this way may help understanding or the construction of support tools.
Testers who do not do any formal modeling will do this type of path and Meta path identification automatically as it is essentially a form of pattern identification and humans are very good at identifying patterns. We may however miss certain paths or path intricacies and this can lead to defects slipping through the testing process that could have been found by tests within the potential scope of the testing strategies.
The next step, having identified the Meta paths, is to apply strategies to them to derive paths:

Meta Path                  => Strategy                => Path
1 [2 [1 3 4 ]*1 7 6 ]*2 10 => apply ‘once’ to loops   => 1 3 4 7 6 10

The strategy-derived paths are then sensitised and tests are identified. For a more detailed treatment of this see Black Box Testing by Boris Beizer.
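The derivation of a path from a Meta path can be sketched in a few lines of Python. This is a simplification under stated assumptions: the Meta path 1 [2 [1 3 4 ]*1 7 6 ]*2 10 is written as a nested list in which every inner list is a loop body, and one repeat count is applied uniformly to every loop. It therefore cannot produce paths where different passes of the same loop behave differently.

```python
# The Meta path 1 [2 [1 3 4 ]*1 7 6 ]*2 10 as a nested list.
# Each inner list is a loop body; plain integers are nodes.
META_PATH = [1, [[3, 4], 7, 6], 10]


def expand(meta_path, repeats):
    """Derive a concrete path by running every loop body 'repeats' times."""
    path = []
    for item in meta_path:
        if isinstance(item, list):
            for _ in range(repeats):
                path.extend(expand(item, repeats))
        else:
            path.append(item)
    return path
```

Here `expand(META_PATH, 1)` applies ‘once’ to the loops and gives 1 3 4 7 6 10, matching the row above; `expand(META_PATH, 2)` gives one of the many longer paths the Meta path stands for.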
Node Coverage can be achieved with the following paths through the model:
  1. - 1 2
  2. - 1 3 4 7 6 10
  3. - 1 3 4 5 6 10
Link coverage can be achieved with the addition of the path:
  1. - 1 3 4 3 4 5 3 4 5 6 10
We could actually remove node coverage path 3 (- 1 3 4 5 6 10) and replace it with link coverage path 1 leaving us with three paths to derive tests from instead of four but still achieving both node and link coverage.
  1. - 1 2
  2. - 1 3 4 7 6 10
  3. - 1 3 4 3 4 5 3 4 5 6 10
We will see later that path 3 is actually a derived form of a Loop Meta path. Since at this point we are simply identifying paths for link coverage we will defer the identification of loop paths until later in the text.
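The claim that three paths do the work of four is easy to check mechanically. The sketch below compares the two sets of paths with each other rather than with the graph, since the graph image is not reproduced here; the function and variable names are mine.

```python
def nodes_visited(paths):
    """Return the set of nodes hit by at least one path."""
    return {node for path in paths for node in path}


def links_visited(paths):
    """Return the set of (from, to) links traversed by at least one path."""
    return {link for path in paths for link in zip(path, path[1:])}


# The three node coverage paths plus the added link coverage path.
FOUR_PATHS = [
    [1, 2],
    [1, 3, 4, 7, 6, 10],
    [1, 3, 4, 5, 6, 10],
    [1, 3, 4, 3, 4, 5, 3, 4, 5, 6, 10],
]

# The reduced set, with node coverage path 3 removed.
THREE_PATHS = [
    [1, 2],
    [1, 3, 4, 7, 6, 10],
    [1, 3, 4, 3, 4, 5, 3, 4, 5, 6, 10],
]
```

Both sets visit the same nodes and the same links, so dropping the shorter path loses nothing at this level of coverage. To measure coverage against the model itself, the two functions would be compared with the model's full node and link lists.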

Predicate Coverage

Link coverage introduces us to the issue of predicate coverage although the minimal nature of the model makes discussion of this type of coverage more difficult. The following discussion will not fully describe predicate coverage.
The information in the above model obscures the actual conditions associated with the predicate nodes. We do not know the conditions that differentiate the link 4-3 from the links 4-5 and 4-7.
graph diagram
The link 4-3 may be conditional on the evaluation of a compound predicate e.g. when A OR (B AND C). This requires more tests to cover than a simpler predicate e.g. when A, although the path description will be the same for each of the predicate coverage tests e.g.:
  • 3 4 3 4 5
    • [3 4] (When A) 3 4 5
    • [3 4] (When B AND C) 3 4 5
    • etc. (see below)
We should expand A OR (B AND C) into a truth table to ensure that the link path is not taken under the wrong conditions and also to ensure that we identify all coverage conditions.

Path ID  A  B  C  Path
1        T  T  T  3 4 3 4 5
2        T  T  F  3 4 3 4 5
3        T  F  T  3 4 3 4 5
4        T  F  F  3 4 3 4 5
5        F  T  T  3 4 3 4 5
6        F  T  F  3 4 5
7        F  F  T  3 4 5
8        F  F  F  3 4 5

In order to achieve paths 1-5 there has to be a second set of ABC sensitisation values which prevent the loop from being executed again. This is an implementation issue associated with loop test paths.
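The truth table can be generated rather than written by hand, which helps when the predicate is larger. A minimal sketch, assuming the example predicate A OR (B AND C) guards the link 4-3; the function names are hypothetical.

```python
from itertools import product


def takes_loop_link(a, b, c):
    """The example compound predicate guarding link 4-3."""
    return a or (b and c)


def truth_table():
    """Return one row per combination: (path id, A, B, C, path)."""
    rows = []
    combinations = product([True, False], repeat=3)
    for path_id, (a, b, c) in enumerate(combinations, start=1):
        path = "3 4 3 4 5" if takes_loop_link(a, b, c) else "3 4 5"
        rows.append((path_id, a, b, c, path))
    return rows
```

The rows come out in the same order as the table above. Like the table, the sketch only covers the first evaluation of the predicate; the second set of values that stops the loop is not modelled.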
Predicate Coverage is essentially a path sensitisation issue, but it highlights the fact that a strategy aimed at achieving link coverage will provide a weak assurance of the system under test. Of course testers would never construct tests purely on the basis of a link coverage strategy.
Path sensitisation is an important issue that is not covered in this text. (See Beizer)

Loop Coverage

graph image
Re-reading the above model we can see it models a sequence of nested loops. The loops are represented by the links from loop exit node 4 to node 3, from 5 to 3 and from 6 to 3.
From our experience as a tester we suspect that achieving link coverage is not enough to assure us that the model has been implemented correctly. Most defect taxonomies will list a variety of defects associated with loops: infinite loops, incorrect exit criteria, etc. (see Testing Computer Software by Kaner et al).
Beizer describes loop testing as a heuristic technique and provides the following coverage strategies:
  • Bypass
  • Once
  • Twice
  • Typical
  • Max
  • Max + 1
  • Max - 1
  • Min
  • Min - 1
  • Null
  • Negative
Some of these strategies are not applicable for all loops and may lead to the construction of test cases that cannot be executed; this is fine, at least no potential paths have been missed. The scope for testing provided by the model may not represent the scope of testing provided by the system. We will only be able to identify these semantically incorrect paths when we sensitise the paths. Knowledge of these basic strategies allows the tester to be more confident that they are doing the best job that they can.
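As a sketch of how some strategies turn out not to be executable, the numeric strategies can be turned into iteration counts for a loop with known bounds. The numeric reading of each name is my assumption for illustration, not Beizer's definition, and Bypass, Null and Negative are left out because this text treats them as sensitisation questions.

```python
def loop_test_counts(minimum, maximum, typical):
    """Map the numeric loop strategies to iteration counts (assumed reading)."""
    return {
        "once": 1,
        "twice": 2,
        "typical": typical,
        "max": maximum,
        "max + 1": maximum + 1,
        "max - 1": maximum - 1,
        "min": minimum,
        "min - 1": minimum - 1,
    }


def split_by_executability(counts, minimum, maximum):
    """Separate counts the loop can legally perform from those it cannot."""
    legal = {n: c for n, c in counts.items() if minimum <= c <= maximum}
    illegal = {n: c for n, c in counts.items() if n not in legal}
    return legal, illegal
```

For a loop bounded at 1 to 5 passes, "max + 1" and "min - 1" land in the illegal group. Those are the cases that may prove impossible to execute, and also the ones most likely to expose incorrect exit criteria if the system lets them through.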
When analysing loops I generally work with paths using the forms below rather than attempt to follow the graphical representation above.
  1. 1 2
  2. 1 [2 [1 3 4 ]*1 7 6 ]*2 10
  3. 1 [3 [2 [1 3 4 ]*1 5 ]*2 6 ]*3 10
In the path forms above I have assigned each loop a number that represents the sequence of the loop exit node. E.g. loop exit node 4 linking to node 3 is the first loop in the model.
In the model above, the numbering of the loops is less important as each loop links to the same node i.e. node 3 is the loop entry node for all three loops. This statement is only true while the model is represented using paths 2 and 3 above as these only provide link coverage. The numbering of the loops becomes far more important when we consider the representation below, which allows us to go beyond link coverage.
The link coverage paths can be derived directly from the above Meta paths:
  • 1 [2 [1 3 4 ]*1 7 6 ]*2 10
    • => 1 3 4 7 6 10
  • 1 [3 [2 [1 3 4 ]*1 5 ]*2 6 ]*3 10
    • => 1 3 4 3 4 5 3 4 5 6 10
If we look at one path (out of many) that we might expect to use during testing:
  • 1 3 4 7 6 3 4 5 6 10
Then we can see that we cannot derive that path from the representations above. We need a Meta path that combines them:
  • 1 [3 [2 ( [1 3 4 ]*1 7 )|( [2 [1 3 4 ]*1 5 ]*2)] 6 ]*3 10
This combined Meta path can be identified without first constructing an example that disproves the completeness of the link path descriptions. Both of the original Meta paths have common elements:
  • [2 [1 3 4 ]*1…6 ]*2 10
and this suggests that a more complete representation of the Meta path exists.
This form of representation helps to emphasise the different levels of model complexity that have to be considered for each of the coverage concerns.
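One way to check whether a path can be derived from a Meta path is to write the Meta path as a regular expression over the node sequence. The sketch below does this for the two link coverage Meta paths and for a combined form. The combined expression is my reading of the combined Meta path, so treat it as an assumption; `+` is used for every loop because the loops in this model run one or more times.

```python
import re

# Each Meta path written as a regular expression. '+' stands for the
# post test loops of the model (one or more passes).
META_PATH_2 = r"1 (?:(?:3 4 )+7 6 )+10"
META_PATH_3 = r"1 (?:(?:(?:3 4 )+5 )+6 )+10"

# Assumed reading of the combined Meta path: inside the outer loop, take
# either the 7 branch or the looping 5 branch before reaching node 6.
COMBINED = r"1 (?:(?:(?:3 4 )+7 |(?:(?:3 4 )+5 )+)6 )+10"


def derivable(meta_path, path):
    """True if the node sequence can be derived from the Meta path."""
    text = " ".join(str(node) for node in path)
    return re.fullmatch(meta_path, text) is not None
```

The path 1 3 4 7 6 3 4 5 6 10 is rejected by both link coverage expressions and accepted by the combined one, while the two link coverage paths are each accepted by their own Meta path.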
Beizer’s heuristic strategies are applicable when there is more semantic information in the model than we have. Because our model has the minimal amount of information and all the loops are post test loops, we are left with the following path construction strategies:
  • Once
  • Many

Beizer’s strategies considered

It is interesting to look at Beizer’s heuristic strategies from a purely structural point of view.

Once

A path that exercises the loop once would actually be derived during node coverage path analysis, as the loop exit node link is not considered.
  • 1 3 4 5 6 10 exercises each loop once.
    • (23/11/2001) Note: I’m assuming that these are post test loops (Beizer pg71).

Many: Twice, Typical, Max, Max +1, Min, Min -1

I am considering twice, typical, max, max+1, min and min - 1 as instances of the ‘many’ strategy because structurally the loops are non-deterministic and we have no way of knowing how often the loop can be executed.
As a consequence they are all sensitised instances of the paths identified by the ‘many’ strategy.

Bypass, Null, Negative

In order to effect a bypass strategy, structurally, there has to be a structural representation in the model. There is an example of this in the model in the sense that the sub path 4 7 6 represents a bypass of loop exit node 5.
But then this isn’t typically what is meant by a bypass strategy. The bypass strategy is a sensitisation approach to a path that does have a loop representation in it. The bypass strategy refers to a path such as 1 [3 4]* 5 6 10 and asks, is there a way to construct a case which sensitises that path representation in such a way that the path traversed is 1 5 6 10.
This is a sensitisation issue and is a mechanism for forcing testers to think outside the box, the box being the model that we are working with. This strategy encourages us to think: is there any way that I can make it do what it is not supposed to do? It is very definitely a defect detection strategy.
In effect, this sensitisation strategy, if it can be used to produce a case, either points out that our model is wrong and would require an extra link from 1 to 5 (see below), or points out that the system under test incorrectly implemented our model.
graph image
This strategy is interesting because the implications of this strategy do not just apply to loops; they apply to any part of the model. Is there any way to bypass node 7 when moving from node 4 to 6? If there is it implies the existence of a defect in our model or the system.
This strategy is the perfect example of testing thinking, thinking outside the box or constructing structurally ill formed paths. But in order to think outside the box we must know what the box is, hence my laborious presentation of the structural aspects of path analysis.
The skill of testing is to approach the construction of ill formed paths appropriately. But it is entirely possible that these are ill formed paths only because the model is not rich enough to test from. If we had a supplementary model that provided semantic information about the predicate nodes, or the data that is processed, then the combination of these models could result in the construction of structurally ill formed paths that are semantically well formed.
This strategy also highlights the assumptions used when constructing the model. The assumptions are that each link has an equal probability of occurring and that loops are not intrinsic to the model, rather that loops are represented by the links between nodes. Loops are in effect constructed from GOTOs rather than ‘do while’ loops. As a consequence each of my loops is represented by [3 4]* where the * means ‘1 or more times’. If the loops were intrinsic to the model then we could argue for the same graph, and the same representation [3 4]*, but in this case the * would mean ‘0 or more times’. This thinking is unlikely to occur in practice, as most people will adopt a GOTO perspective when constructing models; this is the most natural way to think about a loop (in a graph) and may be why GOTOs were around before structured loop constructs.
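The difference between the two readings can be shown with regular expressions, where `+` means ‘1 or more times’ and `*` means ‘0 or more times’. This is a sketch only, using the Meta path 1 [3 4]* 5 6 10 from the bypass discussion; the names are hypothetical.

```python
import re

# GOTO-style loop built from links: the body 3 4 runs 1 or more times.
POST_TEST = r"1 (?:3 4 )+5 6 10"
# Loop intrinsic to the model: the body 3 4 runs 0 or more times.
PRE_TEST = r"1 (?:3 4 )*5 6 10"


def well_formed(meta_path, path):
    """True if the path is structurally well formed for the Meta path."""
    text = " ".join(str(node) for node in path)
    return re.fullmatch(meta_path, text) is not None
```

Under the first reading the bypass path 1 5 6 10 is structurally ill formed, so sensitising it would point to a defect in the model or the system. Under the second reading the same path is an ordinary member of the Meta path and needs no special strategy.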
Constructing the graphs and models is outside the scope of this set of notes.

Conclusion

This text has introduced path analysis from a purely structural perspective as this allows a clearer examination of some of the thinking processes and strategies involved in path analysis.
In the real world testing is unlikely ever to be conducted like this, as our tests will never be purely structurally based. We have to sensitise paths in order to derive tests. Path sensitisation requires a more semantically rich model or supplementary models that provide the information required to process the model semantically, e.g. a set of test conditions, domain analysis, requirements analysis.
The techniques presented here are worth thinking about. They are equally easily mapped on to test script production with the test cases themselves providing the sensitisation information. Using the information presented in this text we can assess the level of coverage achieved on the script model by the test cases compared to the potential coverage.