If software quality includes a goal like “delivering what we promised on time and on budget”—and it should—then testing is essential to meeting that goal.
There are at least three criteria for measuring software quality:
- Reliable delivery process: Can we deliver applications when we said we would?
- Low maintenance costs: Can our organization live with those applications?
- Fitness to purpose: Do the applications do their job?
Deciding whether an application, or an enhancement to an application, is “ready for release” typically involves assessing whether two criteria are met:
- Functionality implemented: When all the functionality has been implemented (or some reasonably close approximation to “all”)
- Bug count: When the known bug count is zero (or the risk associated with the remaining bugs is low)
We need both measures because, despite our best intentions, adding or changing functionality—writing code—generates bugs (though there are coping mechanisms for slowing their growth). Like the tchotchkes in your Aunt Edna’s living room, bugs inevitably increase over the life of the project.
Which is why we have testing. First, of course, testing is how we discover bugs. But it’s also how we determine if those bugs have been remediated—when we perform the test again (a regression test) and get the “right” result, we declare the bug has been fixed and remove the bug from the bug count.
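As a minimal sketch of that re-test cycle, assume a hypothetical bug report (BUG-101, an invented identifier) saying that a discount larger than the price produced a negative total. The function name, the bug number, and the values are all illustrative; the point is only that the test that exposed the bug is kept and rerun to confirm the fix.

```python
def apply_discount(price: float, discount: float) -> float:
    """Hypothetical function under test: subtract a discount from a price.

    The assumed original bug let the result go negative; the fix clamps it at zero.
    """
    return max(price - discount, 0.0)


def test_bug_101_discount_larger_than_price():
    # This input reproduced the (invented) bug BUG-101; rerunning it is the regression test.
    assert apply_discount(10.0, 25.0) == 0.0


def test_ordinary_discount_still_works():
    # Guards against the fix breaking behavior that already worked.
    assert apply_discount(10.0, 2.5) == 7.5


if __name__ == "__main__":
    test_bug_101_discount_larger_than_price()
    test_ordinary_discount_still_works()
    print("BUG-101 regression tests passed")
```

Once both tests pass, the bug comes off the bug count, and the tests stay in the suite to catch the bug if it ever returns.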
Our schedules and cost estimates are thrown into chaos by either of two things: an unexpected increase in the time to remediate bugs (which shows up as the downward trend in the bug count flattening out instead of moving toward zero) or a sharp increase in the bug count.
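Both warning signs can be read off a series of open-bug counts. The sketch below is one hedged way to do that: the function name, the three-period window, and the 10% threshold for "flattening" are assumptions chosen for illustration, not established cutoffs.

```python
from typing import List


def remediation_trend(open_bugs: List[int], window: int = 3) -> str:
    """Classify the recent trend in a series of open-bug counts.

    Compares the latest count with the count `window` periods earlier.
    The labels and the 10% threshold are illustrative assumptions.
    """
    if len(open_bugs) <= window:
        return "insufficient data"
    earlier, latest = open_bugs[-window - 1], open_bugs[-1]
    if latest == 0:
        return "cleared"
    change = latest - earlier
    if change > 0:
        return "rising"
    # A drop of no more than 10% of the earlier count is treated as flat.
    if -change <= 0.10 * earlier:
        return "flattening"
    return "falling"


if __name__ == "__main__":
    print(remediation_trend([40, 32, 25, 19, 14]))      # healthy decline
    print(remediation_trend([40, 30, 24, 23, 23, 23]))  # stalled remediation
    print(remediation_trend([20, 22, 30, 41]))          # bug count climbing
```

A "flattening" or "rising" result is the signal to revisit the schedule before the delivery date does it for you.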
Predicting Delivery
At least three levels of testing are involved in determining the bug count and time to remediation: unit testing by developers to find bugs as they are created; exploratory and other kinds of testing by QA to find bugs after code has been checked in; and regression testing to report on both whether a bug has been remediated and whether existing code has remained bug free.
Testing is not, of course, a complete solution—but it’s what we’ve got. Testing doesn’t, for example, find all the bugs (though testing provides the inputs for estimating the number of unknown bugs). And the cost/time to remediate a bug isn’t a constant.
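On the point about estimating unknown bugs: one approach sometimes used is a capture-recapture calculation (the Lincoln-Petersen estimator, borrowed from wildlife surveys). It is swapped in here purely as an illustration, not as the method this article has in mind, and it assumes two testers or test passes working independently on the same code. The function names and bug identifiers below are hypothetical.

```python
from typing import Set


def estimate_total_bugs(found_by_a: Set[str], found_by_b: Set[str]) -> float:
    """Lincoln-Petersen estimate of total bugs from two independent test passes.

    total ~= (bugs found by A) * (bugs found by B) / (bugs found by both)
    """
    overlap = len(found_by_a & found_by_b)
    if overlap == 0:
        raise ValueError("no overlap between the two passes; estimate is undefined")
    return len(found_by_a) * len(found_by_b) / overlap


def estimate_unknown_bugs(found_by_a: Set[str], found_by_b: Set[str]) -> float:
    """Estimated total minus the bugs already known from either pass."""
    known = len(found_by_a | found_by_b)
    return max(estimate_total_bugs(found_by_a, found_by_b) - known, 0.0)


if __name__ == "__main__":
    # Hypothetical bug identifiers reported by two independent passes.
    pass_a = {"B1", "B2", "B3", "B4", "B5", "B6"}
    pass_b = {"B4", "B5", "B6", "B7", "B8"}
    print(estimate_total_bugs(pass_a, pass_b))    # 6 * 5 / 3 = 10.0
    print(estimate_unknown_bugs(pass_a, pass_b))  # 10.0 - 8 known = 2.0
```

The estimate is only as good as the independence assumption, so it is better treated as a rough input to planning than as a count.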
The time to remediate a bug depends heavily on when the bug is found: The cost of remediating a bug found long after it was written can be exponentially higher than the cost of remediating a bug found soon after it was written. The primary reason for this is that our software tends to rely on interacting with other software. When a bug is caught late in development, we not only have to fix the bug but, potentially, “adjust” all the software that interacts with the remediated code.
The problem is compounded because we aren’t very good at predicting the scope of a change in our software. We see that most frequently in changes made after our applications are released: We make what seems like a benign change to a production system and then start getting calls from people we’ve never heard from before, complaining that their application has stopped working. The equivalent effect in development comes when we remediate a bug: We often generate a bunch of new bugs.
And that ignores the impact that increased bug counts and longer remediation times have on our other metric: Delivering functionality. Developers involved in bug remediation are not delivering functionality. When the trend in remediating bugs flattens, you’ll usually find that the trend in delivering functionality has also dropped off.
Increases in the bug count, lengthening remediation time, and slowdowns in delivering functionality all reduce the likelihood that we will deliver on time. Our process becomes less reliable.
Fixing the Problem: Test Early, Test Often
Fortunately, the solution is well known: Apply a software testing methodology that begins with unit testing, continues through integration testing and various specialized tests (e.g., load tests), and finishes with user acceptance testing and smoke tests.
Regression testing is critical here. First, regression testing flags when a bug is fixed. Second, frequent and comprehensive regression testing identifies newly generated bugs early and, as a result, reduces the time to remediate those bugs.
Automated testing contributes to achieving the goals of regression testing. Automated tests:
- Don’t tie up our most limited and expensive resources (developers and testers)
- Can be run at our convenience (either on a schedule or on an event—relevant code being checked in, for example)
- Can automatically report our bug count and time to remediate.
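The last point can be sketched in a few lines. The example below assumes each failing regression check stands for one open bug; the check functions and the report format are hypothetical, and a real project would normally get the same numbers from its test framework or CI tool rather than a hand-rolled runner.

```python
from typing import Callable, Dict, List


def check_fixed_bug() -> None:
    # Stands in for a regression test whose bug has been remediated.
    assert sum([1, 2, 3]) == 6


def check_open_bug() -> None:
    # Deliberately failing: stands in for a bug that has not been remediated yet.
    assert "abc".upper() == "abc"


def run_regression_checks(checks: List[Callable[[], None]]) -> Dict[str, object]:
    """Run every check and report how many fail (treated here as the open-bug count)."""
    failing = []
    for check in checks:
        try:
            check()
        except AssertionError:
            failing.append(check.__name__)
    return {"checks_run": len(checks), "open_bugs": len(failing), "failing": failing}


if __name__ == "__main__":
    report = run_regression_checks([check_fixed_bug, check_open_bug])
    print(report)
```

Run on a schedule or on every check-in, a report like this gives the bug count over time; recording when each check first fails and first passes again would give the time to remediate.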
Many other kinds of tests can be automated (it can be argued that the only useful load tests are automated tests). Exploratory testing can’t be automated but, when exploratory testing finds a bug, it can lead to an automated regression test that will track the bug’s progress toward remediation.
It would be lovely if that meant we won’t find bugs long after they were written. The reality is that we will still find bugs late and those bugs will still be more expensive to remediate than the ones we found soon after they were written. But the number of late bugs will be sharply reduced so that their impact on cost/schedule is … well, if not zero, at least “manageable.”
Effectively, you get a more reliable process by stopping Aunt Edna from buying her tchotchkes at the start of the process, instead of waiting until they turn up unexpectedly in the living room (where they’re far harder to get rid of). You’ll be happier, your management and users will both be happier and since, in this analogy, Aunt Edna represents bugs in our software … well, we never really liked her anyway.