Wednesday, January 13, 2010

Continuous integration and long-running system tests



Say you are working in a software company that uses system tests as part of the production process.
Ideally you would treat system tests like unit tests, e.g. run all of them on each build and have the report emailed to you.

The problem with that is that system tests can take a lot of time to run. For example, we have only 50 tests and they take about 15 minutes; when you have hundreds of tests it can take several hours to run them all.

Even if you use a grid to run your tests, it only eases the pain: the first machine you add to the grid cuts the time by 50%, but the effect of each new machine shrinks dramatically the more machines you already have.
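For intuition, here is a tiny illustration in Python with made-up numbers, assuming perfect parallelism (the best case):

# Best case: a suite that takes TOTAL minutes on one machine takes
# TOTAL / n minutes on n machines, so each extra machine saves less.
TOTAL = 150.0  # illustrative: hundreds of tests at roughly our 15 min per 50 tests

prev = TOTAL
for n in range(2, 7):
    now = TOTAL / n
    print("machine #%d: %.0f -> %.0f min (saved %.0f)" % (n, prev, now, prev - now))
    prev = now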

So we conclude that it is not practical to expect all the tests to run for each build.

This is how we at Dalet solved this problem:

First, each build produces two zip files as artifacts:
  • product.zip and
  • product-system-tests.zip
At the end of the build, these two files are saved in a directory named after the SVN revision of the build.
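Here is a minimal sketch of what the publish step could look like, assuming a shared build directory; the paths and names (BUILD_ROOT, /builds) are illustrative, not our actual layout:

# Copy the two build artifacts into a per-revision directory.
# BUILD_ROOT and the directory layout are assumptions for this sketch.
import shutil
from pathlib import Path

BUILD_ROOT = Path("/builds")
ARTIFACTS = ["product.zip", "product-system-tests.zip"]

def publish(revision, workdir):
    dest = BUILD_ROOT / str(revision)   # one directory per SVN revision
    dest.mkdir(parents=True, exist_ok=True)
    for name in ARTIFACTS:
        shutil.copy2(Path(workdir) / name, dest / name)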

Second, the automatic test process is triggered after each build, but only one test process can run at a time. The first thing the test runner does is find the latest revision that was built but not yet tested (a simple scan of the directories generated by the first stage).
Once it finds such a revision, it extracts the product, extracts the system tests, and runs them; the output of this run is a test report.
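The runner's scan-and-run loop could look roughly like this, under the same assumed layout (a revision counts as tested once a report file exists in its directory); names like report.xml and run_tests.sh are placeholders:

# Find the newest revision directory without a report, extract its
# two zips, and run the suite. All file names here are assumptions.
import subprocess
import zipfile
from pathlib import Path

BUILD_ROOT = Path("/builds")

def latest_untested():
    revisions = sorted((d for d in BUILD_ROOT.iterdir() if d.is_dir()),
                       key=lambda d: int(d.name), reverse=True)
    for rev in revisions:
        if not (rev / "report.xml").exists():   # no report yet -> not tested
            return rev
    return None

def run_once():
    rev = latest_untested()
    if rev is None:
        return
    workdir = rev / "work"
    for name in ("product.zip", "product-system-tests.zip"):
        with zipfile.ZipFile(rev / name) as z:
            z.extractall(workdir)
    # The suite writes its report back into the revision directory.
    subprocess.call(["sh", str(workdir / "run_tests.sh"),
                     "--report", str(rev / "report.xml")])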

Now you could say this is not good enough, and I totally agree with you.
The problem is that every test run covers multiple commits, and you do not really know which one of them broke the test.

So our next step was to write a small web service to display the report nicely and to compare two test runs against each other.
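At its core, the comparison view is just a diff between two reports. A rough sketch, assuming each report is a simple mapping from test name to "pass"/"fail" (the report format here is assumed for illustration):

# Compare two runs: which tests started failing, which started passing.
def compare(old, new):
    return {
        "newly_failing": sorted(t for t in new
                                if new[t] == "fail" and old.get(t) == "pass"),
        "newly_passing": sorted(t for t in new
                                if new[t] == "pass" and old.get(t) == "fail"),
    }

# Example (hypothetical test names):
print(compare({"ingest": "pass", "export": "pass"},
              {"ingest": "pass", "export": "fail"}))
# {'newly_failing': ['export'], 'newly_passing': []}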


Once you see those results, the problem becomes much simpler: you realize that only a small part of the tests fail in each run, and that you only have to find the revision that broke those failing tests.
This can be done by running just the failing tests against the revisions that were skipped (something like a binary search over the revisions). Once you do that, you have the changeset that broke the test, and the fix becomes very simple.
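That search can be written down in a few lines. In this sketch, test_passes_at is an assumed helper that extracts a given revision's build and runs a single test against it:

# Binary search between the last good and first bad tested revisions,
# re-running only the failing test, to find the breaking changeset.
def find_breaking_revision(good_rev, bad_rev, test, test_passes_at):
    """Return the first revision in (good_rev, bad_rev] where the test fails."""
    while bad_rev - good_rev > 1:
        mid = (good_rev + bad_rev) // 2
        if test_passes_at(mid, test):
            good_rev = mid          # the break happened after mid
        else:
            bad_rev = mid           # the break happened at or before mid
    return bad_rev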

Another step you can take is to collect statistics about which files caused which tests to fail. This can help you understand the dependencies in your code and lets you automatically choose a small subset of the tests to run after a given changeset.
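A minimal sketch of that idea, with made-up file and test names: count how often a test failed after a given file changed, then use the counts to suggest tests for a new changeset:

# file -> test -> number of times the test failed after that file changed
from collections import defaultdict

failure_counts = defaultdict(lambda: defaultdict(int))

def record(changed_files, failed_tests):
    for f in changed_files:
        for t in failed_tests:
            failure_counts[f][t] += 1

def suggest_tests(changed_files, per_file=3):
    suggested = set()
    for f in changed_files:
        ranked = sorted(failure_counts[f].items(), key=lambda kv: -kv[1])
        suggested.update(t for t, _ in ranked[:per_file])
    return suggested

record(["ingest/api.py"], ["test_ingest_flow", "test_metadata"])
print(suggest_tests(["ingest/api.py"]))   # {'test_ingest_flow', 'test_metadata'}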

Nice, huh?

Barak.
