A Groovy DSL for the Creation of Test Data using JPA

Author

Daniel Behrwind

Requirements

What would a solution look like that a database-shy Java developer would approve of?
1. It must be possible to call the solution seamlessly from the Java test code.
2. Test data should be reusable and definable on a modular basis.
3. The solution should work on a completely object-oriented basis.
4. Saved entities should be made accessible to the calling code.
5. It should be possible to define the test data easily and in a readable form.

The Grails Fixtures plugin largely fulfils these requirements by means of a domain-specific language (DSL). However, it is closely tied to the Grails framework. The following discussion therefore shows how the requirements can be met with little effort in a Groovy DSL that can be used directly in conventional Java projects. The complete code is available at https://github.com/triologygmbh/test-data-loader.

Groovy to the Rescue

With its dynamic nature and plenty of syntactic sugar, the JVM language Groovy provides ideal conditions for defining a DSL of one's own in a straightforward way. More detailed information on Groovy is available at http://www.groovy-lang.org/documentation.html. The language features used in the solution are briefly introduced where they first appear.

The solution

The biggest challenge when developing the envisaged DSL is finding a syntax for the definition of the test data that is as simple as possible, and then converting this definition into actual JPA entities. To make this a little clearer, here is an example: the goal is a definition that instantiates a JPA entity User, initialises its fields, saves the entity in the database, and makes it available to the test code under the name “Peter”.
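A definition of this kind might look roughly like the last statement in the following sketch. The User class, the entitiesByName map and the create helper shown here are simplified stand-ins, assumed only so that the snippet runs on its own; they are not the code from the repository, and nothing is persisted.

```groovy
import groovy.transform.Field

// Simplified stand-in for a JPA entity (a real one would carry @Entity, @Id, ...).
class User {
    Long id
    String firstName
    String lastName
}

// Stand-in registry, assumed for this sketch only.
@Field Map<String, Object> entitiesByName = [:]

// Stand-in for the DSL entry point: instantiate, initialise, register by name.
def create(Class entityClass, String name, Closure definition) {
    def entity = entityClass.newInstance()
    entity.with(definition)         // run the definition against the new entity
    entitiesByName[name] = entity   // make it retrievable under the given name
    entity
}

// The actual test data definition.
create User, 'Peter', {
    id = 123L
    firstName = 'Peter'
    lastName = 'Pan'
}

assert entitiesByName['Peter'].lastName == 'Pan'
```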

Reading in and running Groovy scripts

Before we look at how the definition becomes an initialised entity, let us first discuss solutions for the remaining requirements.
Groovy is compiled into Java bytecode. For this reason, Groovy code can be called directly from Java code as though it were written in Java. This means that we get seamless integration with Java for free.
It is also possible to use Groovy as a scripting language. Script files can be read in and run programmatically with standard Groovy tools with relatively little effort. This option is ideal for our purposes: we can move our test data definitions into .groovy files and load the required files depending on the test case.
For this purpose, we introduce the EntityBuilder class, which accepts a file name and creates the entities that are defined in the file. The complete code is available in the GitHub repository mentioned above.

buildEntities accepts the file name of an entity definition script. The definition is read in and converted into an executable script instance. In the process, a class is created at runtime that provides a run method containing the content of the script. We can now call this run method like any other method and thereby run the script. It is important to remember that Groovy can resolve references that are unresolved in the script at runtime. The Script subclass used here, DelegatingScript, allows a delegate to be set against which such unresolved references are resolved. At this point, the EntityBuilder sets itself as the delegate of the script. The reason for this will become clear later.
With the EntityBuilder in place, we can define test data on a modular basis in arbitrary script files and load it as required. Assuming that the entities defined in the scripts are actually instantiated and initialised, the question arises of how the data gets into the database.
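The loading step described above can be sketched with Groovy's standard GroovyShell and DelegatingScript classes. This is a minimal sketch, not the buildEntities method from the repository: the Recorder class and its create method are hypothetical, and the definition is taken from a string instead of a file to keep the example self-contained.

```groovy
import org.codehaus.groovy.control.CompilerConfiguration

// Hypothetical delegate: calls that the script cannot resolve itself end up here.
class Recorder {
    List<String> created = []

    void create(String name) {
        created << name
    }
}

// In the article the text comes from a .groovy file; a string is used here instead.
String definition = '''
    create 'Peter'
    create 'Wendy'
'''

// Compile the text into a script class whose base class is DelegatingScript.
def config = new CompilerConfiguration()
config.scriptBaseClass = DelegatingScript.name
def shell = new GroovyShell(new Binding(), config)
def script = (DelegatingScript) shell.parse(definition)

// Unresolved references in the script are now looked up on the delegate.
def recorder = new Recorder()
script.setDelegate(recorder)
script.run()

assert recorder.created == ['Peter', 'Wendy']
```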

The Glue Code

To decouple persisting the entities from their creation, the EntityBuilder provides the option of registering EntityCreatedListeners. That way, the glue code in the TestDataLoader class (see repo) can use a listener that takes care of saving the data. The TestDataLoader expects a fully initialised JPA EntityManager as a constructor parameter. Its loadTestData method can then be called by the client, passing entity definition files. After initialising the listener, it forwards the definitions to the EntityBuilder. The EntityBuilder in turn creates the defined entities and passes them through the listener interface to the EntityPersister for saving.
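The decoupling via listeners can be illustrated as follows. All names and signatures in this sketch (the onEntityCreated method, RecordingPersister, NotifyingBuilder) are assumptions made for illustration; the actual interfaces are in the repository. The stand-in persister only records entities where a real one would presumably call EntityManager.persist.

```groovy
// Hypothetical listener contract, modelled on the EntityCreatedListener described above.
interface EntityCreatedListener {
    void onEntityCreated(String name, Object entity)
}

// Stand-in for the persisting listener; a real one would hand the entity to a JPA EntityManager.
class RecordingPersister implements EntityCreatedListener {
    List<Object> persisted = []

    void onEntityCreated(String name, Object entity) {
        persisted << entity
    }
}

// Placeholder entity for the sketch.
class Department {
    String name
}

// Minimal builder that knows about listeners but not about persistence.
class NotifyingBuilder {
    private final List<EntityCreatedListener> listeners = []

    void addEntityCreatedListener(EntityCreatedListener listener) {
        listeners << listener
    }

    def createEntity(Class entityClass, String name) {
        def entity = entityClass.newInstance()
        listeners.each { it.onEntityCreated(name, entity) }
        entity
    }
}

def persister = new RecordingPersister()
def builder = new NotifyingBuilder()
builder.addEntityCreatedListener(persister)

def created = builder.createEntity(Department, 'lostBoys')
assert persister.persisted.size() == 1
assert persister.persisted[0].is(created)
```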

Definition of the actual DSL

We are now able to define entities in arbitrary scripts, to read in the definitions and to save the created entities. During the actual creation of the entities, several Groovy features come into play. When calling methods, Groovy allows the parentheses around the arguments to be omitted. Additionally, curly braces can be used to define closures, i.e. runnable sections of code similar to Java 8 lambdas, which can be stored in variables and passed around like ordinary objects. Closures can then be run at any location.
With that in mind, it becomes clear that an expression of the form create User, 'Peter', { ... } is nothing more than a call to the static method create in the EntityBuilder (made available via a static import) with three parameters: a class, a name and a closure. In Groovy, a class can be referenced simply by its name, so the expression User is equivalent to User.class.
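The following sketch illustrates these language features in isolation. The describe helper is hypothetical; it merely has the same parameter shape (class, name, closure) as the create method discussed here.

```groovy
class User {}

// Hypothetical helper that reports what it was given.
def describe(Class type, String name, Closure body) {
    "${type.simpleName}/${name}/${body.call()}".toString()
}

// A bare class name is a class literal: User and User.class are the same object.
assert User == User.class

// A closure is an ordinary object: it can be stored in a variable and run later.
Closure later = { 'ran' }
assert later.call() == 'ran'

// The call with parentheses ...
String withParens = describe(User, 'Peter', later)

// ... and the same kind of call without them (the result goes into an undeclared script variable).
withoutParens = describe User, 'Peter', { 'ran' }

assert withParens == 'User/Peter/ran'
assert withoutParens == withParens
```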
Let us now take a look at what happens when we call create in the DSL.

We can see that the static create method just delegates the call to createEntity of the EntityBuilder’s singleton instance. Here, a new instance of the passed entity class is created and registered under the passed name in the java.util.Map entitiesByName. Please note: entitiesByName[entityName] = entity is equivalent to entitiesByName.put(entityName, entity). Once the new entity has been initialised with data in executeEntityDataDefinition, the registered listeners are notified.

The actual magic takes place in the two lines of the executeEntityDataDefinition method. It takes the newly instantiated entity as parameter as well as the closure originating from the script in which the data for the entity are defined. To understand what happens, we have to do a little more research, however. Let us take another look at the closure which is passed to the create method as the last parameter in the script.

It appears as though values are assigned to variables. These variables are not declared, however, which means that an additional Groovy feature takes effect in this case. The expression myObject.someProperty = 'value' is equivalent to calling a setter: myObject.setSomeProperty('value'). This means that two setters are called in the closure, the only question being: on what? Since the fields “coincidentally” correspond to the properties of the User entity to be created, wouldn’t it be useful if they were called directly on the entity? This is exactly what the two lines in executeEntityDataDefinition achieve. Similar to the way it handles scripts, Groovy is able to resolve method calls within a closure dynamically at runtime, and a delegate can be defined for a closure as well. The rehydrate method called in executeEntityDataDefinition creates a copy of the closure and sets its first parameter as the delegate; in our case, this is the previously instantiated entity. If the closure is now run with entityDataDefinition.call(), the setters in the example are actually called on the User entity, so that it is initialised with the corresponding data.
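Put together, the flow from create through createEntity to executeEntityDataDefinition might look roughly as follows. This is a reduced sketch based on the description above, not the repository code: listeners are left out, and MiniEntityBuilder and getEntityByName are names assumed for illustration.

```groovy
class User {
    String firstName
    String lastName
}

class MiniEntityBuilder {
    private static final MiniEntityBuilder INSTANCE = new MiniEntityBuilder()
    private final Map<String, Object> entitiesByName = [:]

    // Entry point used by the DSL; simply delegates to the singleton instance.
    static <T> T create(Class<T> entityClass, String entityName, Closure entityData) {
        INSTANCE.createEntity(entityClass, entityName, entityData)
    }

    // Hypothetical accessor so the calling code can fetch an entity by name.
    static Object getEntityByName(String name) {
        INSTANCE.entitiesByName[name]
    }

    private <T> T createEntity(Class<T> entityClass, String entityName, Closure entityData) {
        T entity = entityClass.newInstance()
        entitiesByName[entityName] = entity   // same as entitiesByName.put(entityName, entity)
        executeEntityDataDefinition(entityData, entity)
        entity
    }

    private void executeEntityDataDefinition(Closure entityData, Object entity) {
        // Copy the closure with the entity as its delegate (first argument), then run it.
        Closure rehydrated = entityData.rehydrate(entity, this, this)
        rehydrated.call()
    }
}

MiniEntityBuilder.create User, 'Peter', {
    firstName = 'Peter'   // ends up as setFirstName('Peter') on the new User
    lastName = 'Pan'
}

def peter = MiniEntityBuilder.getEntityByName('Peter')
assert peter.firstName == 'Peter'
```

In this sketch the builder instance becomes the owner of the rehydrated closure. The builder has no firstName or lastName property, so the assignments fall through to the delegate, which is the entity.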
Time to put what we have achieved so far to the test before we refine the DSL further. Let us start by defining a user.

The Demo.java class demonstrates the way in which the TestDataLoader can be used from the Java code, and that the created entities are actually persisted. The DSL snippets come from the testData.groovy file (see repo).

Nested Entities

So far, so good: we have completed the round trip from the DSL to the database to the test code. It remains to be seen whether a more complex data model can be handled, i.e. how we deal with references between entities. Let us assume that a user can be assigned to a department, so that the user gains a @ManyToOne relationship to the department. In the DSL, we are able to simply nest the creation of entities:
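A nested definition might look like the statement at the end of this sketch. Department, User and the create helper are simplified stand-ins assumed for illustration (no JPA annotations, no registry, no persistence); the helper uses Groovy's with method rather than the EntityBuilder.

```groovy
// Simplified stand-ins for two entities with a many-to-one relationship.
class Department {
    String name
}

class User {
    String firstName
    Department department
}

// Minimal stand-in for the DSL's create method.
def create(Class entityClass, String name, Closure data) {
    def entity = entityClass.newInstance()
    entity.with(data)   // property assignments in the closure go to the new entity
    entity
}

// The department is created inline, inside the definition of the user.
peter = create User, 'Peter', {
    firstName = 'Peter'
    department = create Department, 'lostBoys', {
        name = 'The Lost Boys'
    }
}

assert peter.department.name == 'The Lost Boys'
```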

What happens, however, if a second user belongs to the same department? Creating the department with the create method twice is obviously not an option. We therefore need a way to reference previously created entities from the DSL. Since we are using normal Groovy code, it should be possible to store an entity created using the create method in a variable. With a little support from the EntityBuilder, however, it is easier:

What is happening here? How can the assignment department = lostBoys work without lostBoys having been initialised? Several Groovy features come together here. lostBoys is initially an identifier which cannot be resolved. In this case, Groovy calls a getter for a property with the name of the identifier, similar to the setter call when values are assigned to undeclared variables. In the same way, the expression myObject.someProperty is equivalent to the expression myObject.getSomeProperty(). The question is again what the getter is called on. It cannot be the delegate of the closure: that is a User instance, which certainly does not offer a getLostBoys() method. At this point, the delegate of the script described above comes into play. We recall that the EntityBuilder sets itself as the delegate before the script is run. The EntityBuilder does not have a getLostBoys() method either, but it implements the propertyMissing method, which Groovy calls when a non-existent property is accessed. The name of the missing property is passed as an argument. This way, we are able to retrieve and return the previously created department with the name “lostBoys” so that it is set as a value in the script:
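The name lookup via propertyMissing can be sketched as follows. Note the assumption made to keep the example in a single file: here the builder is reached as the owner of the rehydrated closure instead of as the delegate of a script, but the propertyMissing mechanism is the same. LookupBuilder and its members are illustrative names, not the repository code.

```groovy
class Department {
    String name
}

class User {
    String firstName
    Department department
}

class LookupBuilder {
    private final Map<String, Object> entitiesByName = [:]

    def create(Class entityClass, String entityName, Closure entityData) {
        def entity = entityClass.newInstance()
        entitiesByName[entityName] = entity
        entityData.rehydrate(entity, this, this).call()
        entity
    }

    // Groovy calls this when an unknown property such as lostBoys is read on the builder.
    def propertyMissing(String name) {
        if (entitiesByName.containsKey(name)) {
            return entitiesByName[name]
        }
        throw new MissingPropertyException(name, LookupBuilder)
    }
}

def builder = new LookupBuilder()

builder.create User, 'Peter', {
    firstName = 'Peter'
    department = create Department, 'lostBoys', {
        name = 'The Lost Boys'
    }
}

// lostBoys is not a variable; it is resolved by name through propertyMissing.
builder.create User, 'Tootles', {
    firstName = 'Tootles'
    department = lostBoys
}

assert builder.Peter.department.is(builder.Tootles.department)
```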

We can even set Peter as the head of the department while we create the department and assign it to him:

Code Completion

This actually means we have everything that we need. But things could be a little more convenient. So far, we are largely on our own when defining the actual entity data. Within the DSL, there is no way of finding out which properties an entity has and what their types are. This means that code completion by the IDE is not available either.

We are able to tell the IDE what the delegate of the closure will be, however. All of the information is contained in the call to the static create method of the EntityBuilder: the delegate of the closure is always an instance of the class passed in the same call. Let us therefore supplement the create method with two annotations to make this information known:
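One way to express this with Groovy's standard annotations is @DelegatesTo.Target on the class parameter combined with @DelegatesTo on the closure parameter. The sketch below assumes that this is the pair of annotations meant here; the AnnotatedBuilder class and its simplified body are illustrative only.

```groovy
class User {
    String firstName
    String lastName
}

class AnnotatedBuilder {
    // @DelegatesTo.Target marks the parameter that determines the delegate type;
    // genericTypeIndex = 0 selects T from Class<T> as the type of the delegate.
    static <T> T create(
            @DelegatesTo.Target Class<T> entityClass,
            String entityName,   // unused in this sketch
            @DelegatesTo(strategy = Closure.DELEGATE_FIRST, genericTypeIndex = 0) Closure entityData) {
        T entity = entityClass.newInstance()
        entity.with(entityData)   // runs the closure with the entity as delegate (delegate first)
        entity
    }
}

def peter = AnnotatedBuilder.create(User, 'Peter') {
    firstName = 'Peter'
    lastName = 'Pan'
}

assert peter.lastName == 'Pan'
```

The annotations do not change the runtime behaviour; they only document the delegate type for tools such as the IDE and the static type checker.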

As a result, the IDE (in this case, IntelliJ IDEA) knows that, in our example, calls within the closure are delegated to a User instance, and therefore offers the properties of User for completion.

Summary

Finished! Our DSL enables test data to be defined so that it is easily readable, modular and separate from the actual test code. Hierarchies can be nested and object-oriented instead of being mapped using foreign keys. In this context, we do not have to leave the world of Java, or to make any conceptual break towards the relational model of the database. Since we are using Groovy code within the DSL, we can enjoy all of the freedoms that a programming language is able to offer. One conceivable scenario, for instance, is generating large volumes of data via loops.
This article does not address the question of how the database should be cleaned after a test case. In the examples demonstrated (Demo.java), we make it easy for ourselves and roll back the transaction after every test. For a genuine integration test, this is not an option. In “real” projects, we have so far emptied all of the tables with a database script and TRUNCATE TABLE. With large schemas, however, this has had a considerable impact on the run time of the tests. It would be interesting to see whether it would be more efficient to clean the database programmatically. Using a stack, it should be possible to delete the created entities in reverse order of their creation.
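The stack idea could be sketched as follows. This is an untested thought experiment in code form, not part of the repository: CreatedEntityStack is a hypothetical helper, and the remove closure stands in for whatever deletes an entity (for example a call to EntityManager.remove). Whether reverse creation order satisfies all foreign key constraints of a given schema would still have to be verified.

```groovy
// Hypothetical cleanup helper that remembers entities in creation order.
class CreatedEntityStack {
    private final Deque<Object> created = new ArrayDeque<>()

    // Would be called from an entity-created listener.
    void onEntityCreated(Object entity) {
        created.push(entity)
    }

    // Hands the entities to the remove action, last created first.
    void removeAll(Closure remove) {
        while (!created.isEmpty()) {
            remove.call(created.pop())
        }
    }
}

def stack = new CreatedEntityStack()
['first', 'second', 'third'].each { stack.onEntityCreated(it) }

def removed = []
stack.removeAll { removed << it }

assert removed == ['third', 'second', 'first']
```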

And another small drawback: the DSL could be made even prettier with a fluent API, for example:
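A fluent variant built from the three calls create, named and with might look like the last statement of this sketch. The FluentDefinition class is an assumption made for illustration, and the sketch covers only the runtime side; it does not solve the code completion problem.

```groovy
class User {
    String firstName
    String lastName
}

// Hypothetical intermediate object that collects the parts of a definition.
class FluentDefinition {
    static final Map<String, Object> ENTITIES = [:]

    private final Class entityClass
    private String entityName

    FluentDefinition(Class entityClass) {
        this.entityClass = entityClass
    }

    FluentDefinition named(String entityName) {
        this.entityName = entityName
        this
    }

    // Takes precedence over Groovy's built-in with(Closure) for this class.
    def with(Closure entityData) {
        def entity = entityClass.newInstance()
        ENTITIES[entityName] = entity
        Closure definition = entityData.rehydrate(entity, this, this)
        definition.resolveStrategy = Closure.DELEGATE_FIRST
        definition.call()
        entity
    }
}

def create(Class entityClass) {
    new FluentDefinition(entityClass)
}

// Read by Groovy as create(User).named('Peter').with({ ... })
create User named 'Peter' with {
    firstName = 'Peter'
    lastName = 'Pan'
}

assert FluentDefinition.ENTITIES['Peter'].lastName == 'Pan'
```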

Since, in this case, the definition is distributed across three method calls (create, named and with), I have not found any way of making the delegate of the closure known using annotations. This is because the class to which calls are delegated is passed to a different method than the closure itself. For the benefit of code completion, I have therefore decided on a somewhat less attractive DSL syntax. Can anyone think of how both can be achieved? Pull requests with improvements are particularly welcome.

About the Author

Daniel Behrwind

Software Development

As a passionate software developer and clean code advocate, he is fascinated by creative solutions for complex problems which appear so obvious that they may have simply emerged all on their own.

github.com/Behrwind


