Canary Tests
Canary tests are minimal tests that quickly and automatically verify that everything you depend on is ready. You run canary tests before other time-consuming tests, and before wasting time investigating your code when the other tests are red. If a canary test fails, you know you have to fix something in the environment first.
The idea of a canary test is different from canary deployment. In a canary deployment, you deploy to a small fraction of your users to check that everything is fine before rolling out to more users.
Save time by checking what should always be OK
Canary tests check for the obvious and frequent sources of issues, such as:
- Network connectivity: firewall rules OK, ports open, proxy working, NAT, ping below an acceptable threshold
- Databases and middleware are up
- Disk quota for logs is not almost full
- Every needed login and password is valid
- Installed software is available in the right version: DLLs installed, registry set up, environment variables set, user directories all exist, framework and OS versions fit, timezone and locale are as expected
- Reference data integrity and consistency (dates, valuations…) are OK
- Database schema and audit of applied scripts are as expected
- Licences are not expired (there is usually a way to check that automatically)
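A few of the checks in this list can be sketched in plain Java. The class name EnvironmentCanary, its method names and the thresholds passed by the caller are hypothetical, chosen only for illustration; a real canary would cover whichever items matter in your environment.

```java
import java.io.File;
import java.time.ZoneId;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of any library.
class EnvironmentCanary {

    // True when the partition holding `dir` has at least `minFreeRatio` of its space usable.
    static boolean hasEnoughDiskSpace(File dir, double minFreeRatio) {
        long total = dir.getTotalSpace(); // 0 when the path does not name a partition
        if (total == 0) {
            return false;
        }
        return (double) dir.getUsableSpace() / total >= minFreeRatio;
    }

    // Returns one message per failed expectation; an empty list means the environment looks ready.
    static List<String> check(File logDir, double minFreeRatio,
                              ZoneId expectedZone, Locale expectedLocale) {
        List<String> failures = new ArrayList<>();
        if (!hasEnoughDiskSpace(logDir, minFreeRatio)) {
            failures.add("Not enough free disk space under " + logDir);
        }
        if (!ZoneId.systemDefault().equals(expectedZone)) {
            failures.add("Time zone is " + ZoneId.systemDefault() + ", expected " + expectedZone);
        }
        if (!Locale.getDefault().equals(expectedLocale)) {
            failures.add("Locale is " + Locale.getDefault() + ", expected " + expectedLocale);
        }
        return failures;
    }
}
```

The check returns a list of failures instead of stopping at the first one, so a single run reports everything that is wrong with the environment.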
Canary tests should run regularly, ideally before any expensive tests like end-to-end tests. Of course you also want to run them whenever there is trouble somewhere, before wasting time on manual investigations in your code when the expected environment is not fully available.
Even at the code level, a canary test can be just a trivial test that verifies that the testing framework works correctly, as mentioned by Marcus on his blog:
assertTrue(true)
Don’t forget to verify that your tests can fail too!
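As a rough illustration of checking both directions, the sketch below uses a hand-rolled assertTrue as a stand-in for the framework's own assertion. The class and method names are hypothetical; with a real framework you would call its assertion instead.

```java
// Hypothetical stand-in for a test framework, used only to illustrate the idea.
class FrameworkCanary {

    // Minimal replacement for the framework's assertTrue.
    static void assertTrue(boolean condition) {
        if (!condition) {
            throw new AssertionError("expected true but was false");
        }
    }

    // The trivial canary: a true assertion must not raise anything.
    static boolean passingAssertionPasses() {
        try {
            assertTrue(true);
            return true;
        } catch (AssertionError unexpected) {
            return false;
        }
    }

    // The other direction: a false assertion must actually raise a failure.
    static boolean failingAssertionFails() {
        try {
            assertTrue(false);
            return false; // the assertion silently passed, so red tests would go unnoticed
        } catch (AssertionError expected) {
            return true;
        }
    }
}
```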
Simple and low-maintenance
The canary test tools should not assume much about the application. They must be independent of new developments to be as stable as possible. They should require little to no maintenance.
One way to do that in practice is to simply scan configuration files for every URL and ping each one against a predefined time threshold. Any log path mentioned in the configuration files can be scanned and checked for the required write permissions and available disk space. Any login and password can be checked too, even though this may be more complicated.
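The URL scan could look like the following sketch. It assumes the configuration is a Java properties file, and it uses a TCP connection attempt with a timeout as a stand-in for a ping. The class name ConfigScanCanary, its methods and the default ports (80, or 443 for https) are assumptions made for the example. Values that do not parse as a URL with a host, such as plain paths or JDBC-style strings, are simply skipped.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of any library.
class ConfigScanCanary {

    // Collects every property value that parses as a URI with a host part.
    static List<URI> findEndpoints(Properties config) {
        List<URI> endpoints = new ArrayList<>();
        for (String key : config.stringPropertyNames()) {
            String value = config.getProperty(key).trim();
            try {
                URI uri = URI.create(value);
                if (uri.getHost() != null) {
                    endpoints.add(uri);
                }
            } catch (IllegalArgumentException notAUri) {
                // Not a URL: ignore this value.
            }
        }
        return endpoints;
    }

    // Uses the explicit port when present, otherwise an assumed default for the scheme.
    static int portOf(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // True when a TCP connection to the endpoint succeeds within the time threshold.
    static boolean isReachable(URI uri, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(uri.getHost(), portOf(uri)), timeoutMillis);
            return true;
        } catch (IOException unreachable) {
            return false;
        }
    }
}
```

Because the scan only reads whatever is in the configuration, a newly added endpoint is checked without any change to the canary itself.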
Canary tests are documentation too
Writing canary tests may require explicit declarations of expectations, e.g. an annotation AssumedPermission('777') to declare the permissions required on the files referenced in the configuration files. Alternatively you may rely on a Convention Over Configuration principle. For example every log.*.path variable is assumed to be a log path to check against some predefined expectations like being writable and being within the disk quota.
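The convention-based variant might be sketched as follows, assuming the configuration is loaded as Java properties and that each matching value names a directory. The class name LogPathConvention and the exact regular expression are assumptions for the example. Only the "writable" expectation is checked here.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class LogPathConvention {

    // The log.*.path convention, written as a regular expression.
    static final Pattern LOG_PATH_KEY = Pattern.compile("log\\..+\\.path");

    // Maps every key following the convention to whether its directory exists and is writable.
    static Map<String, Boolean> checkLogPaths(Properties config) {
        Map<String, Boolean> results = new TreeMap<>();
        for (String key : config.stringPropertyNames()) {
            if (LOG_PATH_KEY.matcher(key).matches()) {
                Path dir = Paths.get(config.getProperty(key));
                results.put(key, Files.isDirectory(dir) && Files.isWritable(dir));
            }
        }
        return results;
    }
}
```

With this approach nothing has to be declared per variable: any new key that follows the naming convention is picked up automatically.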
When you add canary tests, this automation itself is a form of documentation that makes assumptions more explicit.
You could export a report of every canary test that has been run in a readable form that can become part of your Living Documentation.
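A minimal sketch of such an export, assuming the results are available as a map from a check description to its outcome. The class name CanaryReport and the plain-text layout are made up for the example.

```java
import java.util.Map;

// Hypothetical report renderer, not part of any library.
class CanaryReport {

    // Renders one line per check, in the iteration order of the given map.
    static String render(Map<String, Boolean> results) {
        StringBuilder report = new StringBuilder("Canary test report\n");
        for (Map.Entry<String, Boolean> entry : results.entrySet()) {
            report.append(entry.getValue() ? "[OK]   " : "[FAIL] ")
                  .append(entry.getKey())
                  .append('\n');
        }
        return report.toString();
    }
}
```

Using descriptions that read as statements, such as "database is up", lets the report double as a list of the assumptions the system makes about its environment.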