
Solving: How to test APIs with Gherkin? (#GivenWhenThenWithStyle – Challenge 21)

by Gáspár on March 22, 2021

Challenge 21 in the #GivenWhenThenWithStyle series by Gojko Adzic asked how we should use Gherkin to specify API behavior. So here are your responses and my thoughts.

I think this was an open challenge: there is no clear “this is the way to go” option among those proposed, and this was also visible in the responses. This is especially true if we remove Option 4 from the basket – more about this later.

Having no clear winner means that we should instead think about the goals of our own project and see when and how a particular option would be helpful. So let me go through each of the options, share what the responders said and describe some situations where I have seen these approaches working.

The cuckoo’s egg – Option 4: Describe only behaviour

Let’s start with Option 4, because that one was a cuckoo’s egg: those scenarios were not described in an API-specific way at all.

It has been mentioned so many times that scenarios should be business-readable and focus on the problem and not on the solution, and that therefore technical terms should be avoided.

But when we say business-readable, we should not only keep business analysts or domain experts in mind. There are some problems (like public APIs) where the “business” consists of developers or system integrators – so technical people. The statement is still true: focus on the problem and exclude the solution, but the “problem” might consist of details about the HTTP status codes and the JSON message format. Still, solution details, like the details of the web framework we use and its configuration, should not be part of the scenarios.

In the Formulation book, we call these parts of the living documentation “targeted documentation”. (The target group is not necessarily always developers. In one of my engineering-related projects, we had to make scenarios for the engineers with calculation details.) So we can say, this is targeted documentation for developers/integrators. 

But let’s get back to Option 4, which basically stated that we don’t need targeted scenarios for this group. Sounds odd, but this might be absolutely valid.

Just imagine the case: there is a clearly specified system. All important business functionalities are covered by examples expressed in the form of Gherkin scenarios. So you know what the system is able to do.

The application has a REST API interface that comes with documentation of the endpoints and a description of their parameters. There are tools like OpenAPI/Swagger that can help you define your API and even generate reference documentation to present it. These documentation tools typically allow you to add a single “all-in” example request as well. (Note that this is not behavior, but syntax and semantics.)

Now the question should be: If we have scenarios for the business behavior and an API reference, do we need additional examples that show how the API works? 

There is no generic answer for this. Maybe we do, maybe we don’t. But this is a very important question to start with. Almost half of the responders said that they would not create targeted documentation for this particular case.

I have seen examples of both approaches. In cases where the particular API solves a small, specific problem, additional examples might not be necessary. If you try to post a message to LinkedIn, it either posts it or fails. The flow can be well described with the single example that the API reference documentation includes anyway.

But on the other hand, where the API behaves quite differently depending on the context, additional scenarios might be helpful. Imagine that posting to LinkedIn worked in a way that when you try to post to a moderated group instead of your own feed, then, depending on your role in that group, you might receive a special pending-approval result. This would probably need further examples to demonstrate it and show what you can do with that result (e.g. use the returned ID to query the final outcome).
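To illustrate, such a targeted scenario could look like the sketch below (the step wording, the group name and the response details are made up for illustration and do not reflect the real LinkedIn API):

  Scenario: Posting to a moderated group results in a pending-approval response
    Given I am a regular member of the moderated group "BDD Practitioners"
    When I post a new share to the group feed via the API
    Then the response status should be 202 (Accepted)
    And the response should contain a post ID with the status "PENDING_APPROVAL"
    And I should be able to query the final outcome later using that post ID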

Option 1: Include full JSON/XML message

From now on, let’s imagine we have considered whether we need targeted documentation for the APIs and have decided that we do. So let’s focus on how this can be achieved. (For all the remaining stats I have removed those who voted for Option 4.)

More than a third of the responders voted for Option 1. This is not a surprise, because we have all seen that having one well-defined key example can help a lot in understanding how the API works. All public APIs have at least one example. You can call this a “happy path”, but it is usually more of an “all-in” example that tries to show a little bit of everything. The sample JSON we copied into this option in the challenge was this example from the LinkedIn API.
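As a sketch of this style, such a scenario could look like the one below (the endpoint, fields and values are simplified and illustrative; they are not the exact LinkedIn payload used in the challenge):

  Scenario: Creating a share via the API
    When I send a POST request to "/v2/shares" with the following body:
      """
      {
        "owner": "urn:li:person:12345",
        "text": { "text": "Check out our new integration!" },
        "distribution": { "linkedInDistributionTarget": {} }
      }
      """
    Then the response status should be 201
    And the response should contain the ID of the created share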

There is no real problem with this as long as there is only one example like this. But if you have plenty of these examples, it is easy to see that this will quickly become a maintenance burden. This is especially true if the API belongs to a system that is in constant improvement. The LinkedIn API does not change very often, but the API of the application you are currently developing is probably not so stable. 

So, these scenarios may be useful for documentation, but they are not good functional tests. As tests, I think they rather belong to the “smoke test” category. This is fine; however, you don’t need hundreds of smoke detectors to know if there is a fire in your room.

Option 2: Show important fields only, as request body snippets in the target format

The second option attempts to address the maintenance problems of including the full message in a very simple way: let’s include just parts of the message.

This has not received many votes though, probably because people sensed its brittleness. XML and JSON are hierarchical formats, so it is not easy to show just a part of them. It can also happen that the important bits of the JSON/XML are in two different sections. You can introduce placeholders (e.g. “[…]”) that tell the reader and the automation framework where the hidden parts belong, but this can become pretty tricky for complex messages.
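For illustration, an Option 2 style version of the earlier share example could look like the sketch below, where the “[…]” placeholders mark the hidden parts that the automation would fill in with defaults (the step wording and fields are illustrative):

  Scenario: Creating a share with a specific text
    When I send a POST request to "/v2/shares" with a body containing:
      """
      {
        […],
        "text": { "text": "Check out our new integration!" },
        […]
      }
      """
    Then the response status should be 201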

I think this is an approach that sometimes might work (and then we are lucky), sometimes not.

I work on a product that synchronizes scenarios to Azure DevOps. The synchronization tool operates on feature files, but in most cases the only important thing is the scenario it syncs, not the feature file header or whether there are other scenarios in the file. So for this case I have a step like “Given the following scenario” instead of “Given the following feature file”. In the case of a feature file, such a “snippet” is easy to implement and understand.
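A sketch of how such a snippet step could be used is shown below (the step wording and the verification steps are illustrative, not the exact product syntax):

  Given the following scenario
    """
    Scenario: Add two numbers
      Given I have entered 50 into the calculator
      And I have entered 70 into the calculator
      When I press add
      Then the result should be 120
    """
  When the scenario is synchronized to Azure DevOps
  Then a test case should be created with the scenario steps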

Option 3: Set important fields with pointers

Basically, Option 3 represents a kind of “smart solution” to overcome the maintenance problems of including the full message, and the responders seemed to like it: half of them voted for this.

It is important to highlight here that this is just one smart solution. There may be others that need to be developed for your own context. As these smart solutions are technology-specific and not domain-specific, you might even find existing tools that solve this problem. Ken Pugh shared a post about Karate used in this way.

I remember that when I formulated the scenarios for my Gherkin editor in Visual Studio, I had to provide a representation of the feature file loaded in the editor: the text and the cursor (caret) position. For this purpose I “invented” a simple syntax, using doc-strings with a special “{caret}” marker to show where the cursor is currently positioned (see an example of it here).
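A sketch of this syntax could look like the following (the feature file content and the step wording are illustrative):

  Given the following feature file is open in the editor
    """
    Feature: Calculator
      Scenario: Add two numbers
        Given I have entered 50 into the {caret}
    """
  When I invoke auto-completion
  Then the completion list should contain the matching step texts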

Whatever smart solution you may apply, you should always make sure that:

  • the provided syntax is easy to understand for your target audience (test whether it is!)
  • it allows you to focus on the essential part of the scenario
  • you will be able to use it for all similar scenarios and you don’t have to introduce other styles
  • the full message (or text, or Excel file, or whatever you described) can be easily composed from the scenario when the reader/tester/auditor needs the exact message that was used.

I think the first three points are self-explanatory, but the last one might require some more hints.

How to get the full message?

I have used the particular pointer solution from Option 3 in the same product that I mentioned earlier. There the users need to specify the configuration as a JSON file (the users are developers here). In the documentation I often had to refer to the different configuration settings, but including a sample config file snippet every time was not convenient. So I started to refer to the different configuration settings in a some/setting/here style, which – together with general knowledge about the config files – is enough for readers to make the necessary settings (see an example in the documentation here).

Once this format was in more general use, I realized that it would be practical to use it in the scenarios as well, to highlight which particular setting in the configuration has to be set in order to get the expected outcome, like:
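The example below is a minimal sketch of such a step, assuming a table of setting pointers and values (the concrete setting path, the step wording and the expected outcome are illustrative, not the exact product syntax):

  Given the synchronization is configured with the following settings
    | setting                            | value |
    | synchronization/automation/enabled | true  |
  When the scenario is synchronized to Azure DevOps
  Then the created test case should be marked as automated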

The step definition behind this step is implemented in a way that, once it has composed the full configuration from the pointers, it also prints it out to the test output. So whenever I need the full configuration file that was used in a particular scenario, I just need to open the test output and check the value there, as you can see in the picture below.

This way I get the full details whenever necessary, but without the problems of maintaining the incidental details in the feature file. This is one of the benefits you can get from a truly living documentation. And it is pretty easy to implement. 

Wrap up

I have reviewed all the options for the API testing challenge. I highlighted that although the scenarios should generally be readable for all stakeholders, there might be a need for creating so-called targeted documentation – a set of scenarios which are meaningful for a specific target audience only. The rules remain the same: focus on the problem and hide solution details; however, you might want to include more technical terms as well, if the target audience is technical. The targeted documentation should obviously be only a small subset of your full living documentation.

When discussing the concrete options, we raised the question of whether we need such targeted documentation for APIs at all. It can happen that with the business-readable scenarios and the API reference documentation the integrators can achieve their goals. Many projects simply don’t need scenarios for this.

If they do, however, there are several options you can consider. You can include the full details of a particular message, which works well as a generic example, but this will most likely serve as a smoke test and its maintenance costs are pretty high. So you’d better have only a few of them, if they are needed at all.

It is better to hide the incidental details from the scenario and focus on the relevant (essential) details only. If you are lucky, you can just include snippets of the message, but you might need to develop or use some smart solution to keep the scenario readable while reducing the maintenance costs at the same time.

Regardless of what solution you use to hide the incidental details, there might be a need to see the exact full message that was used. For this, it is easiest to implement the testing solution in a way that, once it has composed the message from the details provided, it also prints it out to the test output, so it will become part of your test execution report – your living documentation.
