Tenet: To build it, you have to break it

This post is the fifth in a seven-part series covering my seven tenets of software testing.

Let’s say that you are a modern, test-driven developer. You run your tests and the tests all pass. Great, your code must be bug free, let’s ship! Umm, not quite. Are you sitting down? Good, I need to tell you something. Your software has bugs.

It doesn’t matter if you are a graduate fresh out of college, Don Box or Anders Hejlsberg. If you are writing a program that does anything remotely useful, it will have bugs. In Code Complete, Steve McConnell presents some statistics of exactly how many bugs you should expect to find.

Setting the benchmark is the CMM poster child, the NASA team that writes the software for the space shuttle. The NASA team has achieved the impressive statistic of zero bugs for every 500,000 lines of released code.

For all the criticism about buggy software that Microsoft have received over the years, they do a pretty good job with 1 defect per 2000 lines of released code. By comparison, the rest of the software industry achieves between 15 and 50 errors per 1000 lines of released code.1

It is important to note that these statistics are for bugs in released code, i.e. after testing has been completed. Even the impressive NASA numbers don’t mean that there aren’t any bugs in their code, especially before it is released. A much higher number of bugs will have been found and resolved before the code went out the door. So, if your code is, say, 10,000 lines long, you should expect, at a minimum, to have between 150 and 500 defects.

So, if the bugs are there, how do I find them?

Good testers will generally (sometimes subconsciously) use a technique known as error guessing. Error guessing is all about trying to throw something at the application that the developers haven’t thought of, otherwise known as a negative test.

Negative tests try to come up with permutations of data that the application has not been designed to handle. For example, an Int32 in .NET can handle numbers from -2,147,483,648 to 2,147,483,647. What is the behaviour of an application when an integer is set to 2,147,483,647 and then 1 is added to it?
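To make the boundary question concrete, here is a minimal sketch of that negative test. Python integers never overflow, so the sketch simulates a 32-bit signed integer with `ctypes.c_int32`; the helper names `wrapping_add` and `checked_add` are hypothetical, made up for this illustration. In C#, which of the two behaviours you get depends on whether the addition runs in a checked or an unchecked context.

```python
import ctypes

INT32_MAX = 2_147_483_647
INT32_MIN = -2_147_483_648


def wrapping_add(a: int, b: int) -> int:
    """Add like unchecked 32-bit arithmetic: the result silently wraps around."""
    return ctypes.c_int32(a + b).value


def checked_add(a: int, b: int) -> int:
    """Add like checked 32-bit arithmetic: refuse results outside the range."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a 32-bit signed integer")
    return result


def test_add_past_upper_boundary() -> None:
    # Negative test: deliberately step one past the largest legal value.
    # Unchecked arithmetic wraps to the most negative value...
    assert wrapping_add(INT32_MAX, 1) == INT32_MIN
    # ...while checked arithmetic must report the problem instead.
    try:
        checked_add(INT32_MAX, 1)
    except OverflowError:
        pass
    else:
        raise AssertionError("expected an OverflowError")


test_add_past_upper_boundary()
```

Either outcome may be acceptable. The point of the test is to find out which one the application actually exhibits before a user does.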

Negative tests are effective at finding bugs because they do things that the developer may have never considered when they were coding the application. They also represent the types of things that real users may do to a system, sometimes bringing it to its knees. Ideally we don’t want our users to do that on a regular basis, or they won’t be users for long. We need to find the bugs that we know are there before our end users do. The best way to find the bugs is to do our damnedest to try to break the application, in parallel with construction, starting the day that the compiler produces some output.

Breaking the application as it is being built is important because the longer a bug sits undiscovered, the more it will cost to remove. You want to find those bugs as early as possible, when they are the cheapest to fix.

The best analogy to this technique is the development of a Formula One engine. Whilst the exact techniques are closely guarded secrets, the engine developers will probably push the engine and its components to the absolute limit, identify the cause of failure, resolve the problem and then repeat the process. The alternative is to destroy engines race after race as the limits of the engine are discovered.

I’m sure Mark Webber doesn’t expect to have to be an engine test guinea pig during a race. Similarly, your users shouldn’t be expected to find your bugs for you either.

References

1 Steve McConnell 1993, Code Complete, Microsoft Press, pg. 612-613.

Tenet: Base your decisions on data and metrics, not intuition and opinion

If you have only walked in the dark, you will have never known the clarity that light brings. – me

This post is the fourth in a seven-part series covering my seven tenets of software testing.

I was once giving a presentation to the CEO of the company that I worked for about the current state of play within our organisation. I was Development Manager at the time, so testing was not my primary focus. During the presentation, I couldn’t resist including a couple of testing-related slides. The first slide showed an example defect trend graph, which I used to illustrate the sort of information that should be generated by the Test Manager to assist with day-to-day decisions. The second slide was the same graph with the data removed so that only the two axes remained, illustrating the lack of information available when there aren’t any testers logging issues.

Steve McConnell used a brilliant analogy in Code Complete1, where he compares testing to a bathroom scale when you are trying to lose weight. Steve (or should that be Mr. McConnell) states that the scale does not help you lose weight at all. The scale is merely an indicator of your progress towards your goal.

To my way of thinking, to extend Steve’s analogy, a test team is more like a weight loss clinic. The statistics and metrics that they produce are like the weekly weigh-in and blood test results that tell the real story of how you are progressing.

Government health warning: Metrics can be addictive

I don’t smoke, but testing metrics are like a cigarette habit: once you are used to having them, it is almost impossible to give them up. You may be able to go for short, painful stints without them, but you know it is a case of when they will be back, not if.

Metrics can provide insights and answers to curly questions such as: When will the product ship? The simple answer is to average the number of bugs fixed per day, and divide the total number of active bugs by that average. That is approximately how many days until you reach zero bugs. So, if you are fixing 5 bugs per day and you have 200 active bugs, the earliest that you will ship is in 40 working days’ time. If you want to ship sooner, you will need to stop adding features and focus on fixing more bugs.

The same information can be used in reverse to calculate a maximum allowable bug count. Say you only have 40 working days until your desired ship date, and you are fixing 5 bugs per day as in the previous example. If your active bug count is over 200 today, you will probably miss your target. This number continuously decreases, so in 2 weeks’ time, with 30 working days to go, your bug count should be at the 150 mark if you are going to hit your ship date.
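The arithmetic above is simple enough to sketch in a few lines. The function names here are hypothetical, and the model assumes a constant fix rate with no new bugs being logged, which is why the result is only ever the earliest possible date.

```python
import math


def days_until_zero_bugs(active_bugs: int, fixed_per_day: float) -> int:
    """Earliest number of working days until the active bug count reaches zero."""
    if fixed_per_day <= 0:
        raise ValueError("fixed_per_day must be positive")
    return math.ceil(active_bugs / fixed_per_day)


def max_allowable_bugs(working_days_left: int, fixed_per_day: float) -> int:
    """Largest active bug count that can still be cleared by the ship date."""
    if working_days_left < 0:
        raise ValueError("working_days_left must not be negative")
    return math.floor(working_days_left * fixed_per_day)


# The figures used in the text: 5 fixes per day and 200 active bugs.
print(days_until_zero_bugs(200, 5))  # 40 working days
print(max_allowable_bugs(40, 5))     # 200 bugs with 40 days to go
print(max_allowable_bugs(30, 5))     # 150 bugs with 30 days to go
```

In practice the fix rate would come from the defect tracking system, averaged over a recent window, so the estimate moves as the team's real throughput changes.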

Interpreting the results sometimes makes you feel more like a statistician than a test manager. But trust me, it is well worth the effort.

References

1 Steve McConnell 1993, Code Complete, Microsoft Press.

Tenet: Test the product continuously as you build it.

This post is the third in a seven-part series covering my seven tenets of software testing.

To start off, I’ll give credit where credit is due. I first came across this tenet in Microsoft Secrets1 some years ago. Whilst the book is starting to show its age these days, it is full of great little gems of information, and in the past I have made Chapter 5: “Developing and shipping products” required reading for members of my team.

The key idea behind the tenet is that testing starts the day development starts. This is a conscious move away from the waterfall approach where testers don’t get to start their testing until the developers have hit code complete. Starting testing so late in the process creates a situation where the true state of the product only becomes visible in the last third of the project or so. I don’t know about you, but if something is going off the rails, I want to know about it as soon as possible, so I can take some corrective actions before things get really ugly, and expensive to fix.

There are several techniques that can be utilised to start testing earlier in the process, adding significant value to the project.

Buddy Testing
In a perfect world where budgetary constraints don’t exist (like, say, for the testers of the computers on Star Trek), a testing “buddy” is assigned to each and every developer. At the end of every day, the developer submits their code and hands over a private release to their testing buddy. The buddy tests the newly crafted code in its semi-finished state and provides immediate feedback to the developer, so that any issues can be rectified before the code is integrated into the main build.

This practice is apparently in wide use throughout Microsoft, and I am led to believe that the ratio of developers to testers in times past was approximately 1:1. However, the ratio may be more or less these days depending on the product, the quality bar and the amount of automation that is being used.

Well I don’t work for Microsoft, and this ain’t Star Trek, so how can the rest of us utilise this technique?

There are a couple of ways that this approach can be adopted in the absence of unlimited testing resources. Firstly, a tester may be allocated to a number of developers, say, an entire feature team, and they test the feature as it develops, instead of only Joe’s code.

In the complete absence of testers, a developer could pair up with another developer who has sufficient objectivity and emotional detachment from the code that they are testing. (Typically this would need to be a developer working on a completely different feature.) To encourage the buddy testing practice, issues found as part of a private release won’t be entered into the defect tracking system, allowing the developer to resolve the issues as quickly as possible.

Test Driven Development (TDD) and developer unit testing
In the last couple of years, TDD and the NUnit-style test harnesses have changed the unit testing landscape. NUnit formalises and automates the unit testing techniques that the better developers were already using in times past. This style of testing is a great technique to improve the quality of the code, and definitely should be utilised in one form or another.

The challenge, however, is the developer’s emotional attachment to their code, particularly when it comes to performing negative (destructive) tests. As a development / testing professional I can only say that NUnit-based unit tests are a great thing, but they are no replacement for someone who has no emotional attachment to the code pounding away at it. This becomes particularly important as API-only testing becomes less and less effective (finding fewer and fewer bugs) as the product matures.
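As an illustration of the difference between the tests a developer naturally writes and the destructive ones, here is a minimal sketch using Python’s `unittest`, which follows the same xUnit style as NUnit. The function under test, `parse_quantity`, and its 1 to 999 range are hypothetical, invented purely for the example.

```python
import unittest


def parse_quantity(text: str) -> int:
    """Hypothetical function under test: parse an order quantity from 1 to 999."""
    value = int(text.strip())  # raises ValueError for "", "abc" and "1.5"
    if not 1 <= value <= 999:
        raise ValueError(f"quantity out of range: {value}")
    return value


class ParseQuantityTests(unittest.TestCase):
    def test_accepts_valid_quantities(self):
        # Positive tests: the cases the developer designed for.
        self.assertEqual(parse_quantity("1"), 1)
        self.assertEqual(parse_quantity(" 42 "), 42)
        self.assertEqual(parse_quantity("999"), 999)

    def test_rejects_bad_input(self):
        # Negative tests: inputs the code was never meant to receive.
        for bad in ["", "   ", "abc", "1.5", "0", "-1", "1000", "9" * 40]:
            with self.subTest(bad=bad):
                with self.assertRaises(ValueError):
                    parse_quantity(bad)
```

Saved to a file, these run with `python -m unittest`. Even so, the list of bad inputs is only as good as the imagination of whoever wrote it. A detached tester might try `"1_0"`, which Python’s `int()` happily accepts as 10, and that is exactly the kind of case an author tends to miss.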

Daily build and smoke tests
Discussed in the previous tenet, the daily build and smoke test is a key foundation process for a development team that is serious about producing quality code. This practice should always be implemented if at all possible.

Pre-Checkin tests
Whilst the daily build and smoke test is great at identifying when something is broken, the technique has a fundamental flaw: the smoke test will not prevent the breakage from occurring in the first place. If a developer performs a smoke test after they do a local build, but before they submit their code, then the problem may be caught before the main branch is broken. The challenge with pre-checkin tests is that they can significantly increase the amount of time that a developer will spend submitting their code. You can expect a lot of resistance from developers to this type of process, especially if they are used to working on a small team and just checking in to VSS whenever they like. If your developers are used to following the kind of controlled check-in process that becomes necessary on larger projects, this should be easier to implement.
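One way to lower that resistance is to make the pre-checkin step a single command. The following is only a sketch: `first_failing_step` is a hypothetical helper, and the two placeholder steps simply print a message, standing in for whatever your real local build and smoke test commands are.

```python
import subprocess
import sys
from typing import Optional, Sequence


def first_failing_step(steps: Sequence[Sequence[str]]) -> Optional[Sequence[str]]:
    """Run each command in order; return the first one that fails, or None."""
    for command in steps:
        completed = subprocess.run(command)
        if completed.returncode != 0:
            return command  # stop here, later steps are not run
    return None


# Placeholder steps: replace these with your real build and smoke test commands.
PRE_CHECKIN_STEPS = [
    [sys.executable, "-c", "print('local build ok')"],
    [sys.executable, "-c", "print('smoke test ok')"],
]

if __name__ == "__main__":
    failed = first_failing_step(PRE_CHECKIN_STEPS)
    if failed is not None:
        sys.exit(f"Do not check in: step failed: {' '.join(failed)}")
    print("Build and smoke test passed; safe to check in.")
```

The script stops at the first failure so the developer gets feedback as quickly as possible, which matters when the whole objection to the process is the time it adds.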

Performance testing and application profiling
Application performance is almost always an issue, and the judicious use of a profiler early on can help identify issues that may come home to roost later. Even just stepping through your code in the debugger can provide some valuable insight into where time is being spent, although this becomes harder and harder the larger your code base becomes.
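As a rough idea of how little effort an early profiling pass can take, here is a sketch using Python’s built-in `cProfile` and `pstats` modules. The workload functions `slow_sum_of_squares` and `build_report` and the `profile_call` helper are hypothetical stand-ins for your own code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Deliberately naive loop that stands in for a hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def build_report(n: int) -> int:
    """Calls the hot spot repeatedly, like a feature built on top of it."""
    return sum(slow_sum_of_squares(n) for _ in range(20))


def profile_call(func, *args) -> str:
    """Profile one call and return the report text, sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)  # at most ten entries
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_call(build_report, 10_000))
```

Sorting by cumulative time shows which call chains the time flows through, which is usually a more useful first view than a list of the individually slowest functions.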

Code Reviews
Code reviews are another technique that can significantly improve the quality of software that is being developed. Code reviews can vary from a quick informal review to a full-blown inspection. The costs and results will vary along with the formality, but at least some form of review should be scheduled during the development process.

Overall there are a number of different techniques that can be applied to an application as it is being built, and judicious use of the resources that are available can improve the quality of software, from the start of the development cycle.

References
1 Michael A. Cusumano and Richard W. Selby 1995, Microsoft Secrets, Harper Collins, pg. 294.

Seven tenets of software testing

In both this MSDN Magazine article and this episode of The .NET Show, Don Box introduced four fundamental tenets for developing service-based or connected systems.

  • Boundaries are explicit
  • Services are autonomous
  • Services share schema and contract, not class
  • Service compatibility is determined based on policy

That inspired me to develop my own list of guiding principles that apply to software testing. These tenets document some key lessons learned over the years working as a Test Manager, Senior Consultant and Development Manager for various software development shops.

  • You can’t test everything so you have to focus on what is important.
  • If you are going to run a test more than once, it should be automated.
  • Test the product continuously as you build it.
  • Base your decisions on data and metrics, not intuition and opinion.
  • To build it, you have to break it.
  • Apart from Test-Driven Development, a developer should never test their own software.
  • A test is successful when the software under test fails.

In a series of future posts I will be expanding on the tenets, explaining them in detail and providing links to reference materials, which will hopefully give you something helpful to use on your projects.