Selective Unit Testing – Costs and Benefits

I’ve been writing unit tests regularly for 2-3 years now, and doing full-blown test-driven development (TDD) full time for about the last year. Throughout this whole time, I keep noticing the same thing over and over:

  • For certain types of code, unit testing works brilliantly, flows naturally, and significantly enhances the quality of the resulting code.
  • But for other types of code, writing unit tests consumes a huge amount of effort, doesn’t meaningfully aid design or reduce defects at all, and makes the codebase harder to work with by being a barrier to refactoring or enhancement.

I guess that shouldn’t surprise anyone, because well-respected techniques in all fields – e.g., techniques for winning debates, for dieting, for making money – tend to be strong in some scenarios but weaker in others. At one point I thought this observation was somehow controversial, but every other developer with whom I’ve discussed it already considered it self-evident that unit testing is sometimes very effective and sometimes just isn’t.

So why am I writing this? Two reasons:

  1. Because I think we can go further and understand the underlying forces that make unit testing worthwhile (or not) for any given unit of code.
  2. Because a minority of developers still believe that they should aim for 100% unit test coverage, and that if they don’t follow the test-first TDD process, then they’ve failed as professionals. I’m not satisfied with that view.

How much does that code benefit from unit testing?

I could list a dozen great benefits that come from unit testing, but the list really boils down to two things: Unit tests help you to design some code while you’re writing it, and also help to verify that your implementation actually does what you intended it to do.

That sounds great – and often is – but it’s still legitimate to question the whole idea. Consider: Why do you actually want a secondary system to help design or verify your code? Doesn’t your source code itself express the design and behaviour of your solution? If unit tests are a repetition of the same design, in what sense do they demonstrate the correctness of that design? What about DRY?

In my experience, if your code is not obvious at a single glance – so working out its exact behaviour would take time and careful thought – then additional design and verification assistance (e.g., through unit testing) is essential to be sure that all your cases are handled properly. For example, if you’re coding a system of business rules or parsing a complex hierarchical string format, there will be too many possible code paths to check at a glance. In scenarios like these, unit tests are extremely helpful and valuable.
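
To make that concrete, here’s a minimal hypothetical sketch (the rule and every name in it are invented for illustration, not taken from any real system): a self-contained business rule with enough branches that you can’t confirm every case at a glance, and whose unit tests are just input/output examples.

```csharp
using System;
using System.Diagnostics;

// Invented example: a self-contained pricing rule with several code paths.
// Too many branches to verify by inspection, so unit tests earn their keep.
public static class ShippingCalculator
{
    public static decimal GetShippingCost(decimal orderTotal, bool isPriority)
    {
        if (orderTotal < 0)
            throw new ArgumentOutOfRangeException(nameof(orderTotal));
        if (orderTotal >= 100m)
            return isPriority ? 5m : 0m;  // large orders ship free (or cheap)
        return isPriority ? 15m : 7.5m;
    }
}

// The tests are plain input/output examples - no mocking required.
public static class ShippingCalculatorTests
{
    public static void Run()
    {
        Debug.Assert(ShippingCalculator.GetShippingCost(150m, isPriority: false) == 0m);
        Debug.Assert(ShippingCalculator.GetShippingCost(150m, isPriority: true) == 5m);
        Debug.Assert(ShippingCalculator.GetShippingCost(50m, isPriority: true) == 15m);
    }
}
```

Because the code has no dependencies, each test exercises one branch directly, and the tests keep working no matter how the surrounding application is refactored.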

Conversely, if your code is basically obvious – so at a glance you can see exactly what it does – then additional design and verification (e.g., through unit testing) yields extremely minimal benefit, if any. For example, if you’re writing a method that gets the current date and the amount of free disk space, and then passes them both to a logging service, the source code listing says everything you need to say about that design. What would a unit test add here, given that you’d be mocking out the clock and disk space provider anyway?
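
For contrast, here’s a hypothetical sketch of the method just described (IClock, IDiskInfo, and ILogger are invented abstractions, not real framework types):

```csharp
using System;

public interface IClock { DateTime UtcNow { get; } }
public interface IDiskInfo { long GetFreeBytes(); }
public interface ILogger { void Log(DateTime when, long freeBytes); }

public class HealthReporter
{
    private readonly IClock clock;
    private readonly IDiskInfo disk;
    private readonly ILogger logger;

    public HealthReporter(IClock clock, IDiskInfo disk, ILogger logger)
    {
        this.clock = clock;
        this.disk = disk;
        this.logger = logger;
    }

    // The listing *is* the design: fetch two values, hand them to the logger.
    public void ReportDiskSpace()
    {
        logger.Log(clock.UtcNow, disk.GetFreeBytes());
    }
}
```

A unit test here would mock all three dependencies and then assert that Log was called with the mocked values – in effect restating the method body line for line.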

In summary, I’m arguing that the benefit of unit testing is correlated with the non-obviousness of the code under test.

How much does it cost to unit test that code?

A few obvious costs spring to mind:

  • The time spent actually writing unit tests in the first place
  • The time spent fixing and updating unit tests, either because you’ve deliberately refactored interfaces between code units or the responsibilities distributed among them, or because tests broke unexpectedly when you made other changes
  • The tendency – either by you or your colleagues – to avoid improving and refactoring application code out of fear that it may break a load of unit tests and hence incur extra work

As many have written, you can reduce the cost of maintaining unit tests by following certain best practices. After doing that, the remaining cost may be tiny, or it may still be significant.

In my experience, the remaining total cost of unit testing a certain code unit is very closely correlated with its number of dependencies on other code units. Why might that be?

Writing tests in the first place: If a method has no dependencies and merely acts as a simple function of a single parameter, unit tests are just a list of examples of input points mapping to output points. But if it takes four parameters and reads or writes five other services (abstract or otherwise) through class properties, you’ve got a lot of mocking to do and API usages to figure out. But this is a trivial cost compared to…

Maintenance: It’s well established that the more direct dependencies a code unit has, the more frequently it gets forced to change. (In fact, this is basically how “instability” is defined by standard code metrics tools.) You can easily see why: On any given day, each of those dependencies has some probability of changing its API or behaviour, forcing you to update your code and its unit tests.

Note that these issues apply equally even if you’re using an IoC container and coding purely to interfaces.

In summary, I’m arguing that the cost of unit testing is correlated with the number of dependencies (concrete or interface) that a code unit has.

Visualising the Costs and Benefits

OK, let’s put those two ideas on a single diagram:

[Diagram: the benefit of unit testing (vertical axis: how non-obvious the code is) plotted against its cost (horizontal axis: number of dependencies), dividing code into four quadrants]

This deliberately simplistic diagram illustrates four broad categories of code:

  • Complex code with few dependencies (top left). Typically this means self-contained algorithms for business rules or for things like sorting or parsing data. This cost-benefit argument goes strongly in favour of unit testing this code, because it’s cheap to do and highly beneficial.
  • Trivial code with many dependencies (bottom right). I’ve labelled this quadrant “coordinators”, because these code units tend to glue together and orchestrate interactions between other code units. This cost-benefit argument is in favour of not unit testing this code: it’s expensive to do and yields little practical benefit. Your time is finite; spend it more effectively elsewhere.
  • Complex code with many dependencies (top right). This code is very expensive to write with unit tests, but too risky to write without. Usually you can sidestep this dilemma by decomposing the code into two parts: the complex logic (algorithm) and the bit that interacts with many dependencies (coordinator).
  • Trivial code with few dependencies (bottom left). We needn’t worry about this code. In cost-benefit terms, it doesn’t matter whether you unit test it or not.
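
As a sketch of that decomposition (the CSV importer and all names here are invented for illustration), the branching logic becomes a pure function with no dependencies, and what remains is a coordinator with dependencies but no logic:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Algorithm (top-left quadrant): nontrivial, zero dependencies, cheap to unit test.
public static class CsvLineParser
{
    public static string[] Parse(string line)
    {
        return string.IsNullOrEmpty(line)
            ? Array.Empty<string>()
            : line.Split(',').Select(field => field.Trim()).ToArray();
    }
}

// Coordinator (bottom-right quadrant): trivial, many dependencies.
public interface IFileReader { IEnumerable<string> ReadLines(string path); }
public interface IRecordStore { void Save(string[] fields); }

public class ImportService
{
    private readonly IFileReader reader;
    private readonly IRecordStore store;

    public ImportService(IFileReader reader, IRecordStore store)
    {
        this.reader = reader;
        this.store = store;
    }

    // Pure orchestration: read, parse, save. Nothing here is hard to see at a glance.
    public void Import(string path)
    {
        foreach (var line in reader.ReadLines(path))
            store.Save(CsvLineParser.Parse(line));
    }
}
```

The parser can now be unit tested with plain input/output examples, while the coordinator is simple enough to leave to integration tests.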

Let’s get practical. What about my ASP.NET MVC web application?

In ASP.NET MVC, the most general-purpose place to put your application logic is in your controllers. Unfortunately, if you keep dumping stuff there, they’ll become unwieldy – amassing complex logic but being very expensive to unit test because of all the dependencies and overlapping concerns. This is known as the fat controller anti-pattern.

To avoid this, you can factor out independent bits of application logic into service classes and business logic into domain model classes. You can also split out cross-cutting concerns into ASP.NET MVC filters, custom model binders, and custom action results.

The more you do this, the better, clearer, and simpler your controllers become. Ultimately, the better you structure your controllers, the more they end up being trivial coordinators that manage interactions between other code units while having very little or no logic of their own. In other words, the better you structure your controllers, the more they move down towards the bottom-right corner of the preceding diagram, and the less it makes sense to unit test them.

My controllers aim to be just a meeting place for all the different APIs of my many services. This controller code is trivially readable, and links together multiple dependencies. In cost-benefit terms, I find I’m more productive not unit testing these and instead using the time saved to keep refactoring and writing integration tests.

But, um, surely we don’t want less automated testing?

In case anyone is misinterpreting me, I’m not saying you shouldn’t do unit testing or TDD. What I am saying is:

  • I personally find I can deliver more business value per hour worked over the long term by using TDD only on the kinds of code for which it’s strong. This means nontrivial code with few dependencies (algorithms or self-contained business logic).
  • I sometimes deliberately decompose code into algorithms and coordinators, so that the former can be most clearly unit tested, and the latter most clearly expressed as C# without unit tests. The most obvious example is factoring application logic out of ASP.NET MVC controllers.
  • I’m increasingly becoming aware of the practical business value achieved through integration testing. For a web application, that usually means using some kind of browser automation tool such as Selenium RC or WatiN. This doesn’t replace unit testing, but I’d rather spend an hour writing integration tests to prove the whole system works together in some scenario, than spend that hour writing unit tests associated with trivial code whose behaviour I can know at a glance and which is likely to change each time some underlying API changes anyway.

This is just a description of my experience so far. It’s OK if yours is different.

Footnote: Source Code as Design

To expand on the question, “Doesn’t your source code itself already express the design and behaviour of the solution?”, consider the point made by Jack Reeves in his now-much-cited 1992 article for C++ Journal entitled What is Software Design?

The final goal of any engineering activity is some type of [design] documentation … After reviewing the software development life cycle as I understood it, I concluded that the only software documentation that actually seems to satisfy the criteria of an engineering design is the source code listings.

His argument is that our source code is not the software itself (for the actual software is an executable binary file of some sort); our source code is the design for that software. A programming language succeeds or fails to the extent that it lets us succinctly and accurately describe our intended software design to the compiler. So, reader, can and should unit tests replace source code as the truest expression of our designs?


35 Responses to Selective Unit Testing – Costs and Benefits

  1. Good stuff again, Steve. Your diagram of test effort to test value really clarifies your point.

    I agree that testing coordinators/thin controllers often provides little business value. They don’t usually do much beyond calling other code that can itself be under test. A coordinator/thin controller test often just confirms that the code could be called using mocks and interaction testing.

  2. We’ve seen this too, and started doing more BDD-style specs in the form of integration tests, doing end-to-end, scenario-centric tests.

  3. Testing shows its greatest benefit when it enables your team to effectively and quickly enact change in your system. The impact or number of dependencies is never important until a dependency breaks (aka changes). Placing tests around a piece of code allows you to change the implementation with less fear.

    The most productive time to create tests is while you are focusing on the implementation, hence test-driven development. Assuring the initial implementation works correctly is a one-time benefit; the validation of that behavior as the codebase grows is recurring.

    As an aside, if your Cost per Test is high enough to preclude tests around Trivial Code it may be time to reevaluate your testing framework.

  4. Pingback: Reflective Perspective - Chris Alcock » The Morning Brew #470

  5. Very interesting reflection, one that I’ve had too.
    This article clarifies what wasn’t clear to me before.
    Thank you Steve!

  6. I don’t think throwing the target of 100% coverage out is a great idea. As most people will point out, 100% coverage doesn’t mean you’ve tested every scenario, just that you’ve hit every line of code. This is where you can weigh up cost-benefit.

    Most TDD literature mentions you can work in baby steps or giant leaps. In the areas of low benefit-cost you can make massive leaps (integration tests on coordinators). In the sensitive areas you take baby steps, testing as many scenarios as you can think of (algorithms).

    This way you still approach 100% coverage whilst minimising the cost incurred.

    Finally, you can only know how much you should be testing once you have enough experience in it to know when you’re testing too much. It probably takes a year of TDD to get to that stage.

  7. Excellent article. You’ve expressed very clearly something that I’ve been gradually concluding from my 1st year of doing TDD. I think the point about testing “Coordinator” code using integration tests rather than unit tests is particularly valid.

  8. In my experience, writing executable specifications (tests) forces you to carefully consider your design. They are also a good indicator of flaws in your design and whether you are breaking fundamental OO principles. Even with co-ordinator classes it is important to consider how your object interacts with other objects and the roles and responsibilities of the collaborators.

    BDD allows you to focus on specifying higher-level behaviour rather than discrete units of code. I have found this style of testing produces higher-value tests that are less fragile and easier to change than standard unit tests. I do agree that poorly-written or misguided tests are a hindrance to refactoring and maintainability. Code is indeed a specification of how your system functions, but tests will tell you more about the intended behaviour.

  9. Steve

    Garry – I guess we’d agree that automated tests are valuable, and a pragmatic mix of unit and integration tests goes a long way to demonstrate robustness and correct behaviour. However, I do find aiming for 100% code coverage to be harmful and undesirable, because it ignores cost-benefit considerations and shifts the focus away from writing excellent software and onto an artificial proxy measurement instead. As you say, code coverage does not equal scenario coverage, and not all scenarios are equally important or frequent anyway. To deliver the best business value per hour worked, maybe something like 75% coverage (an arbitrary guess) would be a healthier level.

    Tim -
    > Even with co-ordinator classes it is important to consider
    > how your object interacts with other objects and the roles
    > and responsibilities of the collaborators
    I totally agree, but TDD-style unit testing is a terribly ineffective way to do that. The actual implementation of simple coordinator classes in C#/Java/whatever is far simpler and more readable than the same design expressed as unit tests. This is hardly surprising considering that C#/Java/etc is the result of 40+ years of research into designing the most efficient and expressive syntaxes for such implementations. But then if you’re mostly focusing on higher-level BDD integration tests, you probably don’t get stuck in such unproductive minutiae.

  10. Good article. For me, the main key is that the parts of your system that would benefit from being tested (algorithms / business rules) need to be isolated from the stuff that is hard to test. In other words, extract things from the top right to the top left section of your diagram.

  11. bk

    Great article. Agreed wholeheartedly.
    Unfortunately, management is seizing on agile and unit testing practices as a silver bullet to solve all their problems. It doesn’t help that a number of consultants brought in told them it can be done.
    Read my lips: “Code coverage”.
    If a project has high enough code coverage, the implication is that it can be targeted for outsourcing….

  12. R.Angers

    Good article, however one must consider this: unit testing badly designed code is not only time-consuming, but one is often forced to build “fragile tests”. If the team respects basic principles like SRP (the single responsibility principle), one should not have any difficulty attaining 100% test coverage, and every test will be simple to write and understand.

    On the other hand, if you try to “unit test” a 50-line function that has five if’s and hence 32 (2^5) code paths, that test fixture will almost always be broken.

  13. This is one of the best texts on TDD that I’ve seen. I really liked some of the points you made against doing TDD in some cases; this matches my experience. No extremes, which I think are the problem in all cases.

  14. Dan

    This is a well-written article. But for someone learning the ASP.NET MVC framework I have a few questions which I hope someone can answer. First, I am a bit confused on how you can decompose the complex code with many dependencies into the two parts: algorithm and coordinator. What does the coordinator look like? Could you point me to some example code? Is it nothing more than calls to dependencies’ functions? What if your algorithm needs to call a dependency? Also, could you point me to some example code which uses service classes?

    Thank You for your help.

  15. Eric

    OT: Glad to hear that you’re writing (or updating) an MVC v2 book for Apress! The first book was very well written and enjoyable to read.

  16. Steve,

    I totally agree with you (for now ;) ). After writing tons of unit tests for my ASP.NET MVC controllers I was wondering if it’s really necessary to do that.

    But as a consequence of that point of view, which I share with you, we have to say that one main argument for using MVC instead of WebForms disappears: testability. Having stupid controllers isn’t much different from having some “coordinating methods” in a WebForms code-behind class.

  17. Interesting article. I agree with you 100%, but if anyone were to ever ask me how much code coverage they would shoot for, I’d say 100%.

    Why? Well, if they have to ask, that means that they don’t have a clear idea of what code is useful to test. And if I say “You don’t *really* have to test *everything*. Just test what it makes sense to test.”, chances are good they’ll never learn because they’re going to decide not to test code at the drop of a hat.

    That said, this is a good article for someone who wants to move from the “testing newb” phase to the “intermediate tester” phase.

  18. ulu

    I think the idea that “you don’t need to write tests for trivial code” is misleading. Well, it can be applied if you write your tests *after* the production code. On the other hand, if you do TDD, how do you know that your code is trivial before writing a test — the code doesn’t exist yet! Even if your current task is simple enough that you are sure the code is trivial, surely it’s going to change when you add new requirements or edge cases.

    Being lazy myself, I too often think that I don’t need to write tests for the code I’m going to write, only to find myself debugging it an hour later.

    There’s one category of code that’s hard to test with no apparent benefit: code that is declarative in nature. One well-known example is UI, but this is true also for coordination code on your diagram (such as IoC container setup), since it is mostly declarative (and often can be represented as XML). “Dumb” controllers can also be viewed as declarative code, but the moment you add the first “if” they become subject to testing.

  19. Steve

    Ulu – it sounds like we’re basically agreeing. There are categories of code for which TDD’s return on investment is negative, i.e., coordinator classes and near-declarative code. This is the stuff from the bottom half of the diagram in this post.

    You rightly question whether you can anticipate whether some code will be trivial before you write it. Generally I find that a project’s architecture puts code of specific types in specific places, so UI stuff will be in one place, business algorithms in another, configuration in another, etc. You know when you’re about to write an MVC controller, and you know that if it starts to build up a complex algorithm then you need to refactor it. It’s not that hard to anticipate what you’re doing over the next hour or to adapt your strategy as things evolve. In fact you might say that as a programmer, that’s your job.

  20. ulu

    Steve,

    So, do you ever put View-related logic in controllers? Like, if (User.IsAuthenticated) {show the Logout button}? Or, if (input.IsValid) {saveChanges; show OkThanks}? If yes, do you ever test it?

    In fact, most of my methods are 1-3 lines indeed, and they do look trivial (although each of these 1-3 lines would usually call another 3-liner etc), so I’m always tempted to drop my unit testing and resort to integration tests. So, while in theory I’m sort of defending writing as many unit tests as you can, in practice I tend to live with integration tests (and get punished when an integration test fails and I don’t know what’s wrong).

  21. Steve

    Ulu, yes, that’s exactly the sort of logic that would often appear in my controller action methods. These will go through lots of different types of testing, including UI automation testing, manual exploratory testing, performance testing, and acceptance testing.

    For most of the time I’ve used ASP.NET MVC, I’ve been doing TDD and would always have unit tested these action methods too. However, as I argue in this blog post, I now regard it as most important to factor the complexity out of the controllers. If you do that, the remaining controllers are rather trivial, and a unit test is very unlikely to help you design them or to detect or avoid bugs in them, because the sort of bugs you’ll get in practice are to do with interactions with other components, view rendering problems, JavaScript issues, things like that – which a unit test by definition wouldn’t address anyway. So, I’ve found TDD on these action methods to be ultimately counterproductive because the benefit is so minimal and yet there’s still the maintenance cost. I still use TDD on other parts of the code and think it can be brilliantly effective there.

  22. Pingback: Behavior Driven Development (BDD) with SpecFlow and ASP.NET MVC « Steve Sanderson’s blog

  23. Interesting article, although on balance I still believe that working towards all code flowing from a test is a good aim. I find that thinking about trivial code (that an ArgumentNullException is thrown or a property is set on construction) helps me focus my mind on the next steps. It also encourages me to think of the ‘what if’ scenarios.

    Too often I find that bugs are found in trivial code (e.g. collections not being initialized) and that a quick fix without tests breaks more things.

    That said, code coverage is not an effective metric, as all your code could be hit by tests that are completely ineffective and test absolutely nothing.

  24. Thanks Steve. Great stuff. I certainly agree with your points. One more thing I would like to add: some developers tend to write unit tests around methods that have already been tested by the framework (e.g. the .NET Framework), which is a waste of time and a repetition of work. Also, some developers like to show off their unit testing skills by writing overly complicated unit tests, which doesn’t add value if it’s not for the correct reason.

  25. Pingback: Trust is good, control is better – Effective Unit Testing in .Net - db@net blog site

  26. David

    Excellent article, it nicely sums up how I feel about TDD in general. I personally feel TDD has become another buzzword that is being applied in earnest without proper consideration of whether it is appropriate or useful to the scenario at hand.

    I firmly believe in using the right tool for the right job and unit testing is sometimes a large amount of effort for little return, particularly when it comes to UI testing.

  27. Pingback: Behavior Driven Development (BDD) with Cucumber and ASP.NET MVC

  28. I agree that the cost of unit testing “appears” to follow your chart, but I am very hesitant to accept the conclusion. One item that is not mentioned (or perhaps I missed it) is the risk that a change to code that is not formally tested becomes a breaking change in consuming code (especially when designing shared components).

    As a result, I strive for 100% coverage of *ALL* externally visible behaviour (note that this is different than 100% of lines of code).

    Using an ASP.NET MVC controller as an example, I would expect that the controller has a well-defined set of characteristics that can be consumed by multiple views, or even used with different models. If you consider a developer who (unknown to you) has written a view against your controller (following the documented behaviour / API), then the controller MUST be tested to make sure that no changes are made to the behaviour or API that will impact this consumer.

  29. Maybe you could edit the webpage name title
    Selective Unit Testing – Costs and Benefits «
    Steve Sanderson’s blog to something more specific for the blog posts you create. I loved the writing all the same.

  30. Pingback: Selective Unit Testing – Costs and Benefits « leosjor

  31. Hi Steve, I absolutely agree with you about: “Unit tests help you to design some code while you’re writing it, and also help to verify that your implementation actually does what you intended it to do”

    You raised a question: “Why do you actually want a secondary system to help design or verify your code” and I think I could answer this.

    Working in a team of about 20 developers of different LEVELS, I mostly want to write unit tests for one reason: “Protect the intention of the code and make sure any changes (from others) don’t break its intention”. That “secondary system” could help me avoid fixing bugs caused by the changes from others, which always happen.
    What you write is perfectly correct if you work alone in your personal project or you work in a team with developers who all care about code quality. Otherwise, without unit tests, you could end up with a mess.

  32. Marc

    Hi Steve,
    I totally agree on your point regarding integration tests. For applications that deal with shifting data back and forth and do not have many complex algorithms, integration tests provide the best value.
    As modern web apps become more and more rich UIs that rely on RESTful services to obtain their data, integration testing of those services isn’t that hard any more and can be done using a unit test runner utilising an HTTP client in the test code.
    The rich UI (let it be JavaScript) can then be tested on its own with standalone mock data.

  33. Great post! Agree almost entirely.

  34. Keno Mullings

    Thanks Steve.

    I’m new to TDD and BDD. However, the more I get into BDD, the less I see the need for TDD. I can’t seem to connect the two. These posts helped me understand BDD (http://msdn.microsoft.com/en-us/magazine/gg490346.aspx and http://blog.stevensanderson.com/2010/03/03/behavior-driven-development-bdd-with-specflow-and-aspnet-mvc/) but writing .feature tests and traditional unit tests seem like a duplicate effort…thoughts?

    Thanks.

  35. svend tang

    hi

    Great text….
    Really nice to read the thoughts from someone who isn’t a purist. This is a must-read for my developer team, with its feeling of “if we don’t do 100% coverage, why bother?”

    This, coupled with the BDD mindset about tests, really nails it in our organisation. Concentrate on behaviour tests and don’t get all worked up about not having 100% test coverage.

    Thanks
    It’s given me great leverage to get the process started in my team.