BDD (Behavior-driven development) is a software development process focusing on active collaboration, which illustrates and automates the requirements using key examples of the problem domain. In BDD, the formalized examples use a natural language-based DSL driven by the Given/When/Then keywords.
At the same time, property-based testing (PBT) uses abstract formulas to declare expectations for the output values given some constraints on the input. The PBT tools try to disprove that the application fulfills these requirements by taking samples from the valid input value space.
Experience shows that to understand and properly implement the requirements, the team has to see them as a set of abstract rules, and collecting key examples helps a lot with that.
In this article, with the related open-source SpecFlow plugin, I would like to invite you to an experiment: using property-based testing in a BDD project for automating and formalizing business rules.
Thanks to Ciaran McNulty, Konstantin Kudryashov and all other participants of Cukenhagen 2016 (the Cucumber contributors’ workshop in Copenhagen) for pointing me to this topic and giving valuable feedback.
You can find the slides of my related UCAAT conference presentation here.
To get started with this experiment, let’s look at a BDD scenario that is formulated with Given/When/Then. For the sake of simplicity, let’s take the usual calculator example.
As you can see, this scenario is about the addition feature of the calculator and we have used the 1 + 2 = 3 example to illustrate this particular operation.
The most common question people have when seeing such an example is: how many examples should we collect in the BDD collaboration process? Somehow it feels that one example is not enough. This question usually originates from the desire to have “full coverage” of our specification. With that coverage we could ensure that the application works in all circumstances.
The bad news is that there is no magical formula for the number of necessary examples that would give you any kind of guarantee of perfect quality. For that you would need an infinite number of examples (or even more ;-)). There are some heuristics in testing for the number of test cases needed for a particular problem. They usually use the decision points of the algorithm to cluster the input values into equivalence groups. You need to take at least one representative from each group and maybe check the border cases as well. (The resulting number grows exponentially with the number of input values and decision points, so you will quite quickly need a very large number of examples.)
In the case of addition, hopefully the final implementation in the application will be something like “return a + b”. There are no decision points. We could probably find a few edge cases near the Int32 limits, but if we have a really simple calculator, which does not let you enter more than 10 digits, even these are not relevant.
If you talk about your calculator to your friends or children, I bet that sooner or later you will get the question of whether it supports negative numbers or even zero. For us humans, these seem to be interesting questions that help us better understand how the calculator works. So for the addition we could probably list the following examples:
- 1 + 2 = 3
- 5 + (-7) = -2
- 2 + 0 = 2
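To show how little automation these three examples need, here is a minimal sketch in Python. It is not the SpecFlow automation of this article; the `add` function is a hypothetical stand-in for the calculator under test, and `check_examples` is a helper name I made up for illustration.

```python
def add(a: int, b: int) -> int:
    """Hypothetical stand-in for the calculator's addition."""
    return a + b


# The three illustrative examples from the list above: (a, b, expected sum).
EXAMPLES = [(1, 2, 3), (5, -7, -2), (2, 0, 2)]


def check_examples(add_fn):
    """Return the examples for which add_fn gives a wrong result."""
    return [
        (a, b, expected)
        for a, b, expected in EXAMPLES
        if add_fn(a, b) != expected
    ]


print("failing examples:", check_examples(add))
```

Note that the examples are chosen to explain the behavior (positive, negative, zero), not to cover the input space.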
This small example illustrates how we use examples in BDD. During the specification workshops when we collect these examples, our primary goal is not to achieve some kind of full-coverage, but to collect enough examples so that we can understand the requirements. (Later on, we can add more tests for better “coverage” of course, even with the same automation infrastructure we introduce for the BDD scenarios.)
BDD is for understanding and validating business requirements through illustrative examples.
What is property-based testing?
In contrast to this, property-based testing (PBT) focuses on a different aspect of the quality. We could say that…
PBT is for verifying the implementation through checking statements about the output for many different possible inputs. (Based on http://blog.jessitron.com/2013/04/property-based-testing-what-is-it.html).
These statements are the “properties” that we use for testing. To make this clearer, let’s look at addition as an example. In PBT, we could say we would like to verify that if I add any numbers “a” and “b”, then the result should be the sum of these numbers. This is not really a good test, because to verify the result we would need to implement the addition again. Instead, we should look for some properties of addition that we could verify for many input numbers. It is time to find your school math books. (Google is your friend anyway.) It turns out that addition is usually characterized by four properties. If an operation fulfills all four of them for any input numbers, we can treat it as a proper addition. These properties are the following:
- Commutative property: a + b = b + a
- Associative property: (a + b) + c = a + (b + c)
- Identity property: a + 0 = a
- Distributive property: a * (b + c) = a*b + a*c
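Before reaching for a tool, the four properties can be checked with a hand-rolled loop. The following is a minimal sketch in Python, not FsCheck and not the code of this article; the names `PROPERTIES` and `check_properties`, the number of runs and the fixed seed are all assumptions made for illustration.

```python
import random

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def add(a, b):
    """Hypothetical stand-in for the calculator's addition."""
    return a + b


# Each property takes the addition function f and three sampled numbers.
PROPERTIES = {
    "commutative": lambda f, a, b, c: f(a, b) == f(b, a),
    "associative": lambda f, a, b, c: f(f(a, b), c) == f(a, f(b, c)),
    "identity": lambda f, a, b, c: f(a, 0) == a,
    "distributive": lambda f, a, b, c: a * f(b, c) == f(a * b, a * c),
}


def check_properties(add_fn, runs=200, seed=0):
    """Sample random inputs; return {property name: first counterexample}."""
    rng = random.Random(seed)
    falsified = {}
    for _ in range(runs):
        a, b, c = (rng.randint(INT32_MIN, INT32_MAX) for _ in range(3))
        for name, prop in PROPERTIES.items():
            if name not in falsified and not prop(add_fn, a, b, c):
                falsified[name] = (a, b, c)
    return falsified


print(check_properties(add))  # an empty dict means nothing was falsified
```

An empty result does not prove the implementation correct; it only says that no sampled input falsified a property. Passing subtraction as `add_fn`, for instance, falsifies the commutative and associative properties while the other two still hold.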
So we would need to check all these for as many input numbers as we can. There are PBT tools that can help us with that! The majority of these tools are derived in some way from the Haskell QuickCheck tool. There are ports to nearly every programming language; check out Wikipedia for a detailed list. For .NET I will use FsCheck now.
In FsCheck, the validation of some of the addition properties looks like this:
As you can see, we basically had to express the property as a lambda expression and pass it in to the FsCheck API. FsCheck picks random input numbers in order to falsify the expression. If there is an error, it even tries to narrow the input space to find the point where the property evaluates to false. So if I make a funny (and wrong) implementation of the addition like
FsCheck will report back that the problem is near 38. This narrowing is called shrinking, by the way.
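To give a feeling for how shrinking can work, here is a minimal sketch in Python. It is a simplified greedy strategy, not FsCheck’s actual algorithm, and `buggy_add` is a hypothetical wrong implementation that I invented so that 38 is the smallest failing input; it is not the implementation used in this article.

```python
import random

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def buggy_add(a, b):
    """Hypothetical wrong addition: off by one when a is larger than 37."""
    return a + b + 1 if a > 37 else a + b


def shrink_candidates(n):
    """Simpler values to try instead of n, most aggressive first."""
    if n == 0:
        return []
    half = n // 2 if n > 0 else -(-n // 2)  # halve towards zero
    one_closer = n - 1 if n > 0 else n + 1  # one step towards zero
    seen, result = set(), []
    for candidate in (0, half, one_closer):
        if candidate != n and candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result


def shrink(value, fails):
    """Greedily replace a failing value with a simpler one that still fails."""
    while True:
        for candidate in shrink_candidates(value):
            if fails(candidate):
                value = candidate
                break
        else:
            return value  # no simpler failing value was found


def find_counterexample(prop, runs=100, seed=0):
    """Sample random inputs; shrink and return the first one that falsifies prop."""
    rng = random.Random(seed)
    for _ in range(runs):
        a = rng.randint(INT32_MIN, INT32_MAX)
        if not prop(a):
            return shrink(a, lambda x: not prop(x))
    return None


def identity_holds(a):
    """Identity property: a + 0 = a."""
    return buggy_add(a, 0) == a


print(find_counterexample(identity_holds))  # prints 38
```

The random sample that first fails is usually a huge number that tells you little; the shrunk value points much closer to where the bug actually is.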
Now you know the very basic concepts of PBT. (I can recommend the presentation of Scott Wlaschin for a more detailed intro.) Let’s see how this is related to BDD.
Rules and examples
When collecting examples in BDD to illustrate a feature, we use some kind of structured conversation (like example mapping). We first review the business rules (sometimes called acceptance criteria) of the feature and collect the examples for these rules. At the end of the day, we need both the abstract rules and the concrete illustrative examples to get the full picture of the requirements.
We can formalize and automatically verify the examples, but the rules usually remain as free text in the header of the feature file. This is probably good enough in most cases, but in some situations we might want to verify the rules automatically as well, with as many input combinations as possible. A security-related rule in your system might be a good example of that.
If you think about PBT as we have discussed it before, you can see that this is exactly what it provides. The key difference is that in the case of PBT we have used lambdas and source code for defining the properties, and not the ubiquitous language we use for our scenarios. What would such a property look like in our language? Let’s take a simple one, the “Identity” property:
Wow! That fits pretty well. Let’s make it run!
To run this experiment, I have created a small SpecFlow plugin that integrates FsCheck with SpecFlow. You can find it on GitHub or get it from NuGet: SpecFlow.FsCheck. (You can also find this calculator example in the GitHub repo.)
The idea is that we develop the application test-driven based on the selected examples. This means that the step definitions and the application have already been developed for our 3 examples, such as 1 + 2 = 3.
Now, as we are in the verification phase of the project, where we would like to verify this feature with “full coverage”, we move on with PBT and automate our property-based scenarios.
With the plugin, you basically have to declare the constraints of your input variables (e.g. that “any number” should be any Int32), and the plugin will convert the scenario into a property function and let FsCheck evaluate it for many input values.
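Conceptually, this conversion can be pictured with the following minimal sketch in Python. It is not the plugin’s code and does not use the SpecFlow or FsCheck APIs; the `Calculator` class, the step wording in the comments and the `run_property_scenario` helper are all hypothetical names made up for illustration.

```python
import random

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


class Calculator:
    """Hypothetical application under test."""

    def __init__(self):
        self.entered = []
        self.result = None

    def enter(self, number):
        self.entered.append(number)

    def add(self):
        self.result = sum(self.entered)


def identity_scenario(any_number, calculator_cls=Calculator):
    """The scenario as a property function of its single input variable."""
    calc = calculator_cls()
    calc.enter(any_number)             # Given I have entered <any number>
    calc.enter(0)                      # And I have entered 0
    calc.add()                         # When I add them
    return calc.result == any_number   # Then the result is <the first number>


def run_property_scenario(scenario, runs=100, seed=0):
    """Run the scenario for many sampled inputs; return a falsifying input or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = rng.randint(INT32_MIN, INT32_MAX)  # "any number" = any Int32
        if not scenario(value):
            return value
    return None


print(run_property_scenario(identity_scenario))  # None: no input falsified it
```

The steps themselves stay ordinary step definitions. Only the source of the input value changes: a generator driven by the declared constraint supplies it instead of a literal from the scenario text.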
For this you need to make a binding class deriving from the ConstraintBase base class and define the input variables through step argument transformations. Look at this.
As you can see, we converted the “any number” into an FsCheck input variable (named “any”) without any special constraints. The “first number” is not an input variable, just a reference to another one, so we declared it with the “AsFormula” method.
It is important to tag these scenarios with @propertyBased, because this tag triggers the plugin.
If we run this property-based scenario for the wrong addition implementation, we get the expected error. FsCheck again was able to shrink down the input space to find 38.
More real-life examples
Of course it is very rare that you have to implement an addition function or check mathematical definitions of business rules in your projects. But you might well need to verify conditions (or cross-cutting concerns) for many or all input combinations. Here are some more real-life looking examples that you can use for inspiration. (Parameters are marked with * for better readability.)
The good news is that you can apply this concept even to some small aspects of your project: you can have hundreds of normal scenarios and just drop in one property-based one. All the others will work as before.
You need to have super-fast test automation though. FsCheck will repeat your scenarios many (even >100) times. So this is probably not ideal for UI automation.
Next steps
This is still an experiment, and I have not used it in real projects yet, but I hope I will have a chance to try it out soon. If you think this might be interesting for you, please let me know. Let me make it even more concrete: I am happy to offer 4 hours of free online consulting to a project that wants to try this out. Drop me a mail if you are interested; if more than one project applies, I will pick one somehow.