Life Imitates Code. Essays by Ka Wai Cheung from around the web.

Testing ● April 2015

Where automated testing should (and shouldn't) fit in your strategy

Having worked with automated testing in various forms for a few years now, I’ve come to a few conclusions about where it fits in the mindset of a software developer—and about how it can become a crutch in programmer behavior if we fit it improperly.

Testing vs. checking

For a while now, something’s felt strange to me about the way that automated tests are often marketed to the development community. For one, I think the term “test” itself is quite misleading. It conjures up this idea that we’ve found a viable replacement for manual testing; that, if all tests pass, we are guaranteed bug-free software. But automated tests aren’t really testing a system at large, so much as they are checking specific behavior within a system.

For example, suppose I want to test out the security around our billing page. I could write a couple of integration tests, by using a tool like Selenium, to confirm the following assumptions:

These tests confirm that what we expect to happen actually happens, but they don’t confirm that what we don’t expect to happen (because we aren’t testing for it) actually won’t happen. For instance, neither test would prove that going to the billing page doesn’t, say, inadvertently kick off a process that deletes your user account, nor would it rule out any of the countless other things that could go wrong. We aren’t testing that a system actually works; we are merely checking that specific assumptions we made aren’t broken.
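To make the point concrete, here is a minimal sketch in Python. It uses a plain in-memory stand-in rather than Selenium, and everything in it is hypothetical: the `FakeApp` class, the two assumptions being checked (an owner can view billing, a non-owner is rejected), and the deliberately planted bug. It is only meant to show how both checks can pass while something unexpected still goes wrong.

```python
class FakeApp:
    """Hypothetical in-memory stand-in for the application under test."""

    def __init__(self):
        self.accounts = {
            "owner": {"role": "owner"},
            "guest": {"role": "member"},
        }

    def view_billing(self, username):
        """Return an HTTP-like status code for a billing page request."""
        account = self.accounts.get(username)
        if account is None or account["role"] != "owner":
            return 403
        # Deliberately planted bug: viewing the page also deletes the account.
        del self.accounts[username]
        return 200


def check_owner_can_view_billing(app):
    # Assumed check 1: the account owner gets the page.
    return app.view_billing("owner") == 200


def check_non_owner_is_rejected(app):
    # Assumed check 2: anyone else is turned away.
    return app.view_billing("guest") == 403


app = FakeApp()
results = [check_owner_can_view_billing(app), check_non_owner_is_rejected(app)]

print(results)                  # [True, True]
print("owner" in app.accounts)  # False
```

Both checks report success, yet the owner account has been deleted. Nothing in either check asked about that, so nothing caught it.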

The same is true for unit tests. While they test very narrow and well-defined functionality, they aren’t going to test for, as Uncle Bob states, “the stuff out at the boundaries of the system.” Also, with unit tests, there is often talk of “100% code coverage.” James O. Coplien argues that it’s pragmatically impossible to achieve if we define this as:

…having examined all possible combinations of all possible paths through all methods of a class, having reproduced every possible configuration of data bits accessible to those methods, at every machine language instruction along the paths of execution.

If we truly achieved this, he argues, we’d have on the order of trillions of scenarios to test for even modest software.
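A rough back-of-the-envelope sketch shows how quickly the numbers grow. It rests on a simplifying assumption that is mine, not Coplien’s: the code contains some number of independent two-way branches, and we count only execution paths, ignoring the data configurations his definition also includes. The function name `path_count` is hypothetical.

```python
def path_count(independent_branches: int) -> int:
    """Distinct paths through code with n independent two-way branches."""
    return 2 ** independent_branches


for n in (10, 20, 40):
    print(n, path_count(n))

# 10 1024
# 20 1048576
# 40 1099511627776
```

Under that assumption, forty independent branches already yield over a trillion paths, before accounting for any variation in the data flowing through them.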

So, where does this get us? I often visualize automated tests like building a frame (the tests we write) around a constantly morphing structure (the true behavior we want). We can nail lots of posts together to build a frame that begins to resemble the behavior we want, but we’ll never quite get there.

Automated tests provide basic boundaries for bug-free code, but are by no means an end-all solution.

So, what’s a better way to describe these things? The term “automated checks” feels much more appropriate to me, and others agree. As James Bach writes, there is a big difference between what humans can do and what automated tools can do.

In the Rapid Software Testing methodology, we distinguish between aspects of the testing process that machines can do versus those that only skilled humans can do. We have done this linguistically by adapting the ordinary English word “checking” to refer to what tools can do.

With that said, Bach comes up with a differentiation between software testing and software checking.

Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc. … Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

Automated tests are really automated checks. And, checking is just one small part of a complete testing strategy. It’s one kind of insurance policy amongst a whole suite of policies designed to make bugs more unlikely (but, certainly, not impossible) to introduce into a system.

Maintaining your own human vigilance against bugs

So, what’s the big deal about calling a “check” a “test”? When we start becoming overly confident in our ability to write bug-free code simply based on a successful series of automated tests, the returns begin to diminish rapidly. Our minds start to let go of some of the natural deliberateness we may have once put into code prior to the feeble safety net of automated tests. At its worst, it means we become narrowly focused developers, using one relatively brittle measure of success as false justification that our testing is now complete.

If we were to truly test every possible scenario in even a modest system, there would quickly be orders of magnitude more tests to write than would be practical. We are better off relying on a series of other strategies, in addition to automated checks, to cover all the other permutations.

The TDD debate and the real question we should be answering

Is test-driven development (er, check-driven development) a good approach to writing code? If you read the tech pundits today, most everyone has a very strong opinion on the matter. But, I think we’re asking the wrong question. Asking if you should be employing TDD is like asking a basketball coach whether they should play a 2-1-2 zone, 2-3 zone, or man-to-man defense. They can all work and they can all not work. I’ve seen quality software written both with and without a TDD approach.

The real question we should ask is: how do we pragmatically make the introduction of bugs as improbable as possible (while weighing all the other important matters of software, like new features, deadlines, etc.)? To me, it starts with the right programmer mentality.

We should be talking just as much about these other things, because they are just as important, if not more important, than automated testing:

To be clear, automated checks have a very important place in building sound software. But we need to be careful with our expectations. Placing too much value in them might give us a false sense of security, making us even more prone to introducing bugs. We need to make sure that our long-tested human processes still play a majority role in our overall testing strategy.

Originally published Apr 28, 2015 at DoneDone.