Unit Testing for Diogenes
=========================

Unit testing is the process of writing lots of little pieces of test code
which each test some part of the functionality of a software system. By
having a comprehensive set of test cases which are satisfied by the
production code, we can have real confidence in our software, instead of
just "hoping" it works.

What is a unit test?
--------------------

The general structure of all unit tests is as follows:

* Set up pre-conditions (the state of the system prior to the execution of
  the code under test)
* Run the code to be tested, giving it all applicable arguments
* Validate that the code ran correctly by examining the state of the system
  after the code under test has run, and ensuring that it has done
  what it should have done.

The pre-conditions are normally either set at the beginning of the test
method, or in a general method called setUp() (see below for the structure
of test code).

Running the code itself normally means calling the function to be tested,
constructing an object, or, for Web applications, running the top-level web
script (using require()).

Validating the post-run state of the system is done by examining the
database, system files, and script output, and using various assert methods
to report the success or failure of each test to the user.
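
Put together, the three steps might look like this in a single test method.
This is a minimal sketch only: the Page class, its methods, and the expected
output are invented for illustration, and any constructor boilerplate the
test framework needs is omitted.

```php
<?php
// Hypothetical example only: Page, setTitle() and title() are not
// real Diogenes classes; they stand in for whatever code is under test.
class PageTitleTest extends PHPUnit_TestCase
{
    function testTitleIsEscaped()
    {
        // 1. Set up pre-conditions
        $page = new Page();
        $page->setTitle('A & B');

        // 2. Run the code under test
        $html = $page->title();

        // 3. Validate the post-run state
        $this->assertEquals('A &amp; B', $html);
    }
}
?>
```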

Testing Web applications such as Diogenes is relatively easy, especially the
user interface side of things.

Unlike traditional GUI applications, every state change in a web application
is defined by the database and filesystem, the user's session, and the
values passed from the web browser through $_REQUEST and $_SERVER. Since
all of these are relatively easy to control, setting the pre-run state of
the application is quite simple. There are also relatively few discrete
states that your web application can be in, because all control of the
system has to pass through the constrained interfaces above.

Testing post-conditions is simple as well. If you want to ensure that a
certain thing is part of the post-run display, you can just interrogate the
HTML output from the test run, which is all text and can be tested with
assertRegExp() and assertSubStr(). Again, the only state available is in
the user's session, the database, and the filesystem, all of which are easy
to interrogate.
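
As a sketch of that idea, you can capture the output of a top-level web
script with PHP's output buffering and then check it with assertRegExp().
The script name, the request values, and the pattern below are all invented
for illustration; this method is assumed to live in a PHPUnit_TestCase
subclass.

```php
<?php
// Hypothetical test method; 'index.php' and the pattern are placeholders.
function testFrontPageShowsLoginForm()
{
    // Pre-conditions: a bare request, no interesting state
    $_REQUEST = array();

    // Run the code under test: the top-level web script
    ob_start();
    require 'index.php';
    $output = ob_get_contents();
    ob_end_clean();

    // Post-condition: the login form appears in the HTML output
    $this->assertRegExp('/<form[^>]*login/i', $output);
}
?>
```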

Test cases and test methods
---------------------------

(This is where things get a bit hairy -- hold on, and perhaps read it twice)

A test case is a collection of tests which are related in some way. Each
test case is represented in our testing framework by a class subclassed from
PHPUnit_TestCase. Each test is a method on the test case whose name starts
with 'test'. You should name the tests appropriately from there.

Test cases can have a couple of special methods defined on them, called
setUp() and tearDown(). The first is called before each of the test
methods, and the second is called after each test method has run.

The setUp() and tearDown() methods are the primary reason for grouping test
methods together. Methods should, as much as possible, be grouped into test
cases with common setUp() and tearDown() requirements.
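
Sketched out, a test case with shared fixtures might look like this. The
database helper and the table are invented; the point is simply that setUp()
and tearDown() bracket every test method in the class.

```php
<?php
// Hypothetical test case: open_test_database() and the entries
// table are placeholders for whatever fixture the tests share.
class BlogEntryTest extends PHPUnit_TestCase
{
    var $db;

    // Called before EACH test method below
    function setUp()
    {
        $this->db = open_test_database();
        $this->db->query("INSERT INTO entries (title) VALUES ('first')");
    }

    // Called after EACH test method below
    function tearDown()
    {
        $this->db->query("DELETE FROM entries");
    }

    function testEntryIsCounted()
    {
        $this->assertEquals(1, $this->db->count('entries'));
    }
}
?>
```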

So how do I write a Unit Test?
------------------------------

Put together a snippet of code which sets up the state of the application,
then run the code to be tested, and check the result. This snippet of code
should be placed in a test case in a method named test[Something](). Each
test method takes no arguments and returns nothing.

Have a look at the existing test code in the testing/ directory of the
diogenes source distribution for examples of how to write test code.

When you create a new test case, you need to tell the test runner to run
the new test code. There is an array in the alltests script which defines
the files and test cases to be run. Each key in the array specifies a
filename to be read (without the .php extension), while the value
associated with the key is an array of class names in that file.
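
As a sketch, an entry for a hypothetical BlogEntryTest.php file containing
two test case classes might look like this. The variable name is an
assumption; check the alltests script itself for the real one.

```php
// Hypothetical fragment of the array in the alltests script.
// Key:   filename without the .php extension.
// Value: array of test case classes defined in that file.
$tests = array(
    'BlogEntryTest' => array('BlogEntryTest', 'BlogArchiveTest'),
);
```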

What should I test?
-------------------

The naive answer would be "everything". However, that is impractical.
There are just too many things that could be tested for a policy of "test
everything" to allow any actual code to be written.

It helps to think of tests as pulls of the handle on a poker machine. Each
pull costs you something (some time to write it). You "win" when a test
that you expected to pass fails, or when a test you expected to fail passes.
You lose when a test gives you no additional useful feedback. You want to
maximise your winnings.

So, write tests that demonstrate something new and different about the
operation of the system. Before you write any production code, write a test
which defines how you want the system to act in order to pass the test. Run
the test suite and verify that the test fails. Now, modify the production
code just enough so the test passes. If you want the system to do something
that can't be expressed in one test, write multiple tests, each one
interspersed with some production development to satisfy *just* *that* new
test. This is, in my experience, the best way to ensure that you have good
test coverage, whilst minimising the production of tests which add no value
to the system.

How do I retro-fit unit testing onto an existing codebase?
----------------------------------------------------------

Diogenes already has a significant amount of code written, which would take
hundreds of tests to cover completely. There is little point in going back
and writing tests for all of this functionality. It appears to work well
enough, so we should just leave it as-is.

However, from now on, every time you want to make some modification (whether
it be a refactoring, a bug fix, or a feature addition), write one or more
test cases which demonstrate your desired result:

Refactoring: Write tests surrounding the functionality you intend to
        refactor. Show that the tests accurately represent the desired
        functionality of the system by ensuring they all run properly. Then
        perform the refactoring, ensuring you haven't broken anything by
        making sure the tests all still run properly.

Bug fix: Write one or more tests which show the bug in action -- in other
        words, tests which hit the bug and produce erroneous results. Then
        modify the system so that the tests pass. You can be confident that
        you've fixed the bug, because you have concrete feedback in the form
        of your test suite that the bug no longer exists.
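
For instance, a regression test for an invented bug (an empty title
producing broken HTML) might look like this. The class and the behaviour
are hypothetical; the shape of the test is the point.

```php
<?php
// Hypothetical regression test: before the fix, an empty title
// produced a PHP warning and malformed HTML; afterwards it should
// simply render nothing. This test fails until the bug is fixed.
function testEmptyTitleRendersNothing()
{
    $page = new Page();
    $page->setTitle('');
    $this->assertEquals('', $page->title());
}
?>
```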

Feature addition: Put together some tests which invoke the feature you want
        to add, and test that the feature is working as it should.
        Naturally, these tests will fail at first, because the feature
        doesn't exist. But you then modify the production code to make the
        feature work, and you stop when your tests all pass.

Over time, as old code gets modified, the percentage of code covered by the
tests will increase, and eventually we will have a comprehensively tested
piece of software.

During modifications, if you manage to break something accidentally, write a
test to show the breakage and fix it from there. If you broke it once,
there's a good chance it'll break again when someone else modifies it, and
there should be a test to immediately warn the programmer that they've
broken something.

How do I run the unit tests?
----------------------------

The primary script that executes all of the unit tests is the 'alltests'
script in the testing/ directory of the distribution. However, this script
prints one line for every test that passes or fails.

To help highlight the test failures, there is a 'run' script, which filters
out all of the passes and only shows you the failures. Very useful.

So, your regular test run will be done with ./run, but if you want to see a
thousand passes, run ./alltests. Both scripts should be run from the
testing/ directory; running them from elsewhere isn't likely to give good
results.
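
A typical session, assuming you start at the top of the diogenes source
tree, would then be:

```sh
cd testing
./run         # quiet run: shows only the failures
./alltests    # verbose run: one line per test, passes and all
```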