Mutation Testing with Pitest

Author

Philipp Czora

Mutation testing

One technique that can help us improve the quality of our tests is mutation testing. With mutation testing, each test is executed not just once per run of the test suite, but multiple times. Before each test run, the code under test is modified, or mutated. If the test fails after a change is made, it has proven its robustness for this case. If it remains green, the test does not cover the failure case caused by the change.

In effect, mutation testing shows for each unit test which changes it did not detect. These changes are called mutations. Each mutation that causes a test to fail is referred to as killed, since it did not survive the test. Survivors are mutations that are not detected by any test: despite their presence, the tests stay green. Mutations that survive do not necessarily indicate a problem, but they can very well hint at one.

Running the test suite many times and mutating the code under test between runs would involve so much effort that doing it manually is not reasonable. Mutation testing is therefore only practical with tools that automate the process. One such tool for JVM-based programming languages is Pitest. In this article, we explain how to use this tool and examine how useful it is.

Example

We will use an illustrative example to demonstrate the usefulness of mutation tests as well as how to set up Pitest with Maven.

We assume that the following classes for performing simple calculations on integers are already present in a Maven project:

Since test-driven development was used, the following unit tests already exist for this class:

These tests provide 100 percent method coverage and will also all be green when executed. But how robust are they against mutations? In order to find this out, we will add the Pitest Maven plugin to the project. To do this, we add the following to the pom.xml:

After we have added the plugin to the project, it can be invoked as follows:

mvn clean install org.pitest:pitest-maven:mutationCoverage

It will take a moment for the project to be built and for the mutation tests to run. The Pitest plugin creates a folder named pit-reports in the project's target directory. This folder contains the results of the mutation test run. If we open the HTML file in its subdirectory and follow the links all the way to the tested class, all surviving and killed mutations are displayed:

As can be seen in the screenshot, there were six mutations, of which two survived the test suite. Both are mutations of mathematical operators: in one, a minus was replaced with a plus; in the other, an asterisk (multiplication operator) was replaced with a slash (division operator).

The class under test appears to be error-free. However, the result of the mutation test still provides meaningful information: the class could contain errors, and our test suite would still be green. In this example, the tests would not fail if an addition were performed instead of a subtraction, or a division instead of a multiplication. Of course, for illustrative purposes, the test data has been chosen specifically to result in surviving mutations. If we change the unit tests as follows, the Pitest result will be different:
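
To illustrate the effect, here is a minimal sketch in plain Java. The methods subtract and multiply, the chosen test data and the hand-written mutants are assumptions made for illustration; Pitest generates its mutants automatically by modifying bytecode rather than source code. With weak test data (subtracting 0, multiplying by 1), original and mutant return the same value, so the mutation survives. With data such as 5 - 3 and 4 * 3, the results differ and the mutation is killed.

```java
import java.util.function.IntBinaryOperator;

class SurvivingMutantsSketch {

    // Hypothetical code under test (names are assumptions for illustration).
    public static int subtract(int a, int b) { return a - b; }
    public static int multiply(int a, int b) { return a * b; }

    // Hand-written stand-ins for the two mutations described in the text.
    public static int subtractMutant(int a, int b) { return a + b; } // '-' replaced with '+'
    public static int multiplyMutant(int a, int b) { return a / b; } // '*' replaced with '/'

    // A "test" here simply checks that an operation returns the expected value.
    public static boolean passes(IntBinaryOperator op, int a, int b, int expected) {
        return op.applyAsInt(a, b) == expected;
    }

    public static void main(String[] args) {
        // Weak test data: the mutants pass as well, so they survive.
        System.out.println(passes(SurvivingMutantsSketch::subtractMutant, 5, 0, 5)); // true
        System.out.println(passes(SurvivingMutantsSketch::multiplyMutant, 4, 1, 4)); // true

        // Stronger test data: the mutants fail, so they are killed.
        System.out.println(passes(SurvivingMutantsSketch::subtractMutant, 5, 3, 2));  // false
        System.out.println(passes(SurvivingMutantsSketch::multiplyMutant, 4, 3, 12)); // false

        // The unmutated code passes in both cases.
        System.out.println(passes(SurvivingMutantsSketch::subtract, 5, 3, 2));  // true
        System.out.println(passes(SurvivingMutantsSketch::multiply, 4, 3, 12)); // true
    }
}
```

The general lesson is to avoid test data for which different operators happen to yield the same result, such as the neutral elements 0 and 1.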

As before, the tests are green; however, all mutations are now killed off:

Mutators

Both surviving mutations from the example were created by replacing a mathematical operator with another one: subtraction became addition, and multiplication became division. The rules used for these replacements are contained in so-called mutators. Pitest itself provides a number of mutators, such as the MATH mutator at work above. This mutator not only replaces subtraction and multiplication operators, but also the following:

Addition (+) is replaced with subtraction (-)
Division (/) is replaced with multiplication (*)
Modulus (%) is replaced with multiplication (*)
Bitwise AND (&) is replaced with bitwise OR (|)
Bitwise OR (|) is replaced with bitwise AND (&)
Bitwise XOR (^) is replaced with bitwise AND (&)
Left shift (<<) is replaced with signed right shift (>>)
Signed right shift (>>) is replaced with left shift (<<)
Unsigned right shift (>>>) is replaced with left shift (<<)

Other mutators replace each increment with a decrement, or cause each if condition to always be true (or false). A complete list of the mutators included with Pitest can be found in its documentation.
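
What such mutations do to a piece of code can be sketched by hand. The method countPositives and both mutants below are hypothetical and written purely for illustration; Pitest produces comparable changes automatically. The sketch also shows that an empty input cannot tell any of the three variants apart, whereas a mixed input kills both mutants.

```java
class MutatorSketch {

    // Hypothetical code under test: counts the strictly positive values.
    public static int countPositives(int[] values) {
        int count = 0;
        for (int value : values) {
            if (value > 0) {
                count++;
            }
        }
        return count;
    }

    // Hand-written mutant: the increment is replaced with a decrement.
    public static int incrementMutant(int[] values) {
        int count = 0;
        for (int value : values) {
            if (value > 0) {
                count--;
            }
        }
        return count;
    }

    // Hand-written mutant: the if condition behaves as if it were always true.
    public static int alwaysTrueMutant(int[] values) {
        int count = 0;
        for (int value : values) {
            count++; // condition removed
        }
        return count;
    }

    public static void main(String[] args) {
        int[] empty = {};
        int[] mixed = {1, -2, 3};

        // Empty input: all three variants return 0, so both mutants would survive.
        System.out.println(countPositives(empty) + " " + incrementMutant(empty) + " " + alwaysTrueMutant(empty));

        // Mixed input: prints "2 -2 3", so a test expecting 2 kills both mutants.
        System.out.println(countPositives(mixed) + " " + incrementMutant(mixed) + " " + alwaysTrueMutant(mixed));
    }
}
```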

The combination of mutators that Pitest runs out of the box makes it possible to identify problems and ambiguities within the test suite without requiring a lot of the developer's time. Like most good things, however, mutation tests also have their price.

Disadvantages of mutation tests

With Pitest, the developer does not need to write mutation tests. They do, however, have to be executed at some point. In the vast majority of cases, executing the mutation tests takes considerably longer than simply executing the unit tests. This is because one or more mutations are generated for each section of code covered by a unit test, and tests have to be run against each of these mutations. Even for projects with just a few thousand lines of code but high test coverage, a mutation test run can take a few minutes. And this is in spite of the optimizations that Pitest performs: for a given mutation, not all tests are run, but only those that have a chance of detecting it. As soon as a test detects a mutation, no more tests are run for it.
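
The second optimization can be pictured with a small simulation. This is a simplified sketch and not Pitest's actual implementation: the class MutationRunSketch, its Outcome type and the three tests are assumptions made for illustration. Each mutant is run only against the tests that cover it, and the loop stops at the first failing test.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

class MutationRunSketch {

    /** Outcome for one mutant: was it killed, and how many tests had to run? */
    static final class Outcome {
        final boolean killed;
        final int testsRun;

        Outcome(boolean killed, int testsRun) {
            this.killed = killed;
            this.testsRun = testsRun;
        }
    }

    /**
     * Runs the covering tests against one mutant and stops at the first
     * failing test, since a single failure is enough to kill the mutant.
     */
    public static Outcome run(IntBinaryOperator mutant,
                              List<Predicate<IntBinaryOperator>> coveringTests) {
        int testsRun = 0;
        for (Predicate<IntBinaryOperator> test : coveringTests) {
            testsRun++;
            if (!test.test(mutant)) {
                return new Outcome(true, testsRun); // killed, remaining tests are skipped
            }
        }
        return new Outcome(false, testsRun); // all tests stayed green: survivor
    }

    public static void main(String[] args) {
        // Three hypothetical tests covering a subtract(a, b) operation.
        List<Predicate<IntBinaryOperator>> tests = List.of(
                op -> op.applyAsInt(5, 0) == 5,  // cannot tell '-' from '+'
                op -> op.applyAsInt(5, 3) == 2,  // detects the mutant
                op -> op.applyAsInt(9, 4) == 5); // never reached for this mutant

        IntBinaryOperator mutant = (a, b) -> a + b; // '-' replaced with '+'
        Outcome outcome = run(mutant, tests);

        // Prints: killed=true, testsRun=2
        System.out.println("killed=" + outcome.killed + ", testsRun=" + outcome.testsRun);
    }
}
```

Even with this early exit, the total effort grows with the number of mutants multiplied by the number of covering tests, which is why the run takes noticeably longer than the plain unit tests.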

In combination with the Maven SCM plugin, Pitest can also be configured so that only newly added code is mutation tested. As part of a CI pipeline, for example, it is possible to always run mutation tests in the nightly build, but only on newly added code. At the end of this pipeline, there could be a SonarQube instance into which the Pitest results are imported. We will explain how this works in a follow-up to this blog post.

As is the case with high test coverage, it is possible for a software developer to get carried away by a high mutation detection rate and overengineer the unit tests. And even though it is certainly a good thing if no mutations survive the tests, it must be decided in each individual case which survival rates make sense, and which mutations may in good conscience be allowed to survive. Pitest currently does not have the option of ignoring certain survivors. If one intentionally decides not to kill a mutation, it will therefore continue to appear in the Pitest report.

Conclusion

Mutation tests can help with detecting and improving weak unit tests with a minimum of effort. This contributes to improved software quality. They are no cure-all, however. The introduction of mutation tests will not suddenly cause the quality of a product to shoot up. Mutation tests do, however, provide a practical way to increase developers' confidence in their own unit tests, much like unit tests can increase confidence in the code. Whether and to what extent mutation tests are used must be decided on a case-by-case basis. There is simply no yellow brick road for this.

