Levels of automated testing within a single application

We need a common language for the different types of automated testing.  We’re partially there, but the term “unit test” is still very confusing.  Here, I’ll lay out the different types of automated tests I find helpful within a single application:

  • Unit testing – testing a single class or possibly a small group of collaborating classes.  A unit test absolutely does not call out of process, and it is the fastest-running of all automated tests.  Running 1000 unit tests in 3 or 4 seconds is common.
  • Full system tests – tests that run through the UI against the fully integrated application, including the database.  They may or may not use real system dependencies such as external web services.  (These are the slowest of all tests.)
  • Integration testing.  Here, there are some categories.
    • Data access tests.  Used to test repositories, data access classes, etc.  These tests validate the translation from entities to data.  These tests run all SQL and test the structure of the database schema as well.  A real database must be involved.
    • General scenario testing.  Any time it’s appropriate to pull a section of the application in and run a lot of classes together, this is an integration test.  It involves several parts of the system, not just one.  It can run fast if completely in process, or it can be slow if it requires an out-of-process call such as leveraging the file system.
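
As a rough illustration of the data access tests described above, here is a minimal sketch in Python.  Everything in it is an assumption made for the example: the `Customer` entity, the `CustomerRepository` class, and the `customer` table are hypothetical, and an in-memory `sqlite3` database stands in for the real database engine your application would use.  The test saves an entity, loads it back, and compares the two, which exercises both the SQL and the schema.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """Hypothetical entity used only for this example."""
    id: int
    name: str


class CustomerRepository:
    """Hypothetical data access class: translates Customer entities to rows and back."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def save(self, customer: Customer) -> None:
        self._connection.execute(
            "INSERT INTO customer (id, name) VALUES (?, ?)",
            (customer.id, customer.name),
        )

    def find_by_id(self, customer_id: int) -> Optional[Customer]:
        row = self._connection.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None


def test_customer_round_trip() -> None:
    # Assumption: sqlite3 stands in for the real database engine.
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    repository = CustomerRepository(connection)

    repository.save(Customer(id=1, name="Ada"))

    # The entity that comes back must equal the one that went in.
    assert repository.find_by_id(1) == Customer(id=1, name="Ada")
    assert repository.find_by_id(2) is None
    connection.close()
```

In a real project the same test would run against the developer’s own instance of the production database engine, since that is the only way to catch engine-specific SQL and schema problems.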

This is not an exhaustive list, but it includes most of the automated testing on a typical enterprise application.  Feel free to comment with any type I may have left out.

Scott Bellware reasoned that the database needs to be left out for unit testing.  I completely agree.  Unit testing, by common definition, excludes external dependencies.  It’s not a unit test if we reach out and touch things.  When you have the right number of unit tests, you can’t afford to take more than a few milliseconds to run each one.  For example, I’ve worked on a smart client system with 80,000 lines of code, 1300 unit tests, and another 700 integration tests.  You need your unit tests to run very quickly.  Otherwise, you won’t run them very often.

That said, this doesn’t mean that the database should be ignored when testing a system.  There are plenty of ways the database can cause a bug in the system: SQL, stored procedures, triggers (shudder), views, etc.  I insist on writing an automated integration test for every database operation.  How else can we verify that the database operation works correctly?  We can’t.  It is important, however, for communication’s sake, to understand that these database-inclusive tests are integration tests, as are any tests that exercise an external dependency.

Automated testing with the database REQUIRES the following:

  1. Every developer has a dedicated instance of the database that can be dropped and created at will.
  2. Tests must be responsible for their own data setup.  An empty database should be all that is required to run the test.  The test must be responsible for adding data for the appropriate scenario before testing the scenario.
  3. You will want to generalize test data setup because it isn’t feasible to expect EVERY test to set up all the data.  A general data set that establishes a baseline of data is very useful and can be invoked with a data helper class.  Then each test can just add the specific data necessary for its test case.
  4. Data setup, database creation, etc. should be automated.  If it’s manual, it costs more, and you won’t run the tests as often.
  5. Database schema must be in source control with the code.  Without that, you never know what the correct version of the schema is.
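
The requirements above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the `SCHEMA` string stands in for a schema script kept in source control, `DataHelper`, `create_empty_database`, and `total_invoiced` are hypothetical names, and an in-memory `sqlite3` database stands in for each developer’s dedicated database instance.

```python
import sqlite3

# Assumption: in a real project this would be read from a schema script
# that lives in source control next to the code (requirement 5).
SCHEMA = """
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE invoice (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    total REAL NOT NULL
);
"""


def create_empty_database() -> sqlite3.Connection:
    """Create a fresh, empty database from the schema (requirements 1 and 4)."""
    connection = sqlite3.connect(":memory:")
    connection.execute("PRAGMA foreign_keys = ON")
    connection.executescript(SCHEMA)
    return connection


class DataHelper:
    """Hypothetical helper that loads the shared baseline data (requirement 3)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def load_baseline(self) -> None:
        self._connection.executemany(
            "INSERT INTO customer (id, name) VALUES (?, ?)",
            [(1, "Acme"), (2, "Globex")],
        )


def total_invoiced(connection: sqlite3.Connection, customer_id: int) -> float:
    """The database operation under test (hypothetical)."""
    row = connection.execute(
        "SELECT COALESCE(SUM(total), 0) FROM invoice WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


def test_total_invoiced_sums_only_that_customer() -> None:
    connection = create_empty_database()
    DataHelper(connection).load_baseline()
    # The test adds only the data specific to its own scenario (requirement 2).
    connection.executemany(
        "INSERT INTO invoice (customer_id, total) VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )

    assert total_invoiced(connection, 1) == 150.0
    assert total_invoiced(connection, 2) == 75.0
    connection.close()
```

Because every test starts from an empty database and builds what it needs, the tests can run in any order and on any developer’s machine without manual preparation.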

Another of Scott’s points: “As a side effect of doing the necessary dependency injection, you often get a cleaner and more explicit separation of concerns – which makes software easier to change and maintain.”

He’s right.  If you can’t unit test your domain classes because everything you do with them requires a real database to be online, you have an indication that you aren’t separating concerns.  Data access should be independent of domain object behavior in most cases.  I should be able to verify that a Customer object can Sort() itself without invoking a database query, but if constructing a Customer initiates a database call, my domain model is then materially coupled to the database and needs to be separated.
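
To make the separation concrete, here is a small Python sketch.  The names are all hypothetical: a `Customer` that sorts its own orders, a `CustomerRepository` interface, and an in-memory fake that a unit test can inject in place of real data access.  It is one possible shape for the idea, not a prescribed design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Order:
    number: int
    total: float


@dataclass
class Customer:
    """Domain object: constructing it or sorting it never touches a database."""
    name: str
    orders: List[Order] = field(default_factory=list)

    def sort(self) -> None:
        # Pure in-memory behavior: largest order first.
        self.orders.sort(key=lambda order: order.total, reverse=True)


class CustomerRepository(Protocol):
    """Data access lives behind this interface, apart from domain behavior."""

    def find_by_name(self, name: str) -> Customer: ...


class InMemoryCustomerRepository:
    """Fake repository injected by unit tests instead of the database-backed one."""

    def __init__(self, customers: List[Customer]) -> None:
        self._customers: Dict[str, Customer] = {c.name: c for c in customers}

    def find_by_name(self, name: str) -> Customer:
        return self._customers[name]


def largest_order_number(repository: CustomerRepository, name: str) -> int:
    """Application code receives its repository as a dependency."""
    customer = repository.find_by_name(name)
    customer.sort()
    return customer.orders[0].number


def test_customer_sorts_orders_by_total() -> None:
    customer = Customer("Acme", [Order(1, 10.0), Order(2, 30.0), Order(3, 20.0)])
    customer.sort()
    assert [order.number for order in customer.orders] == [2, 3, 1]


def test_largest_order_uses_injected_repository() -> None:
    repository = InMemoryCustomerRepository(
        [Customer("Acme", [Order(1, 10.0), Order(2, 30.0)])]
    )
    assert largest_order_number(repository, "Acme") == 2
```

Both tests run entirely in process, so they stay within the millisecond budget that unit tests need.  The database-backed repository would get its own integration tests.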

Jeremy Miller is of the same mind in his comment: “Referential integrity, non null checks, and sundry other data constraints.  All good things.  All a pain in the ass when your unit test only needs a single property set on the InvoiceItem class.”

To help clear up some confusion with the term “unit test”, I propose a simple constraint in our dialog:  If the test calls out-of-process, it is then disqualified from “unit test” status and falls into “integration test”.  Feel free to argue in the comments. 🙂