
Codit Blog

Posted on Thursday, September 28, 2017 12:46 PM

by Stijn Moreels

“Property-Based Testing”, ever heard of it? It’s a very popular topic in the functional community. Its counterpart is the well-known “Example-Based Testing”. People think in examples, which is why that style is so popular, in the non-functional community as well. It is the way we write, and have always written, tests:
“Given a static, constant example; the output should be this”
But is that the best we can do?

Introduction

Property-Based Testing is about generalizing the input so we can make statements about the output; without specifying exactly what the input or output should be, only what they should look like.

I’m not going to give you a full introduction, because there are already so many good resources about this topic (in different languages, too).

But what I will do is give you an introduction to FsCheck in a C# environment. FsCheck is written in F# but has a C#-friendly API. I’m going to use the FsCheck.Xunit package for this blog post.

FsCheck

For a full introduction to FsCheck itself, I highly recommend the FsCheck documentation, which explains the framework well. Although it gives a good idea of how the framework is built, I find it hard to find examples of how it can be used concretely; especially if you’re using the xUnit variant.

Several blog posts use F# to write properties with FsCheck, but posts using C# are rather rare…

Fact to Property

Let’s start from the xUnit example they present in their documentation:
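The original code screenshot isn’t available here, but a minimal sketch of such an Example-Based test could look like this (the `Calculator.Add` method stands in for whatever SUT the documentation uses; it is a hypothetical name):

```csharp
using Xunit;

public class CalculatorTests
{
    [Fact]
    public void Add_Is_Commutative()
    {
        // A static, constant example: hard-coded inputs, one expected output.
        Assert.Equal(Calculator.Add(2, 1), Calculator.Add(1, 2));
    }
}
```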

If you know xUnit, you know that ‘Fact’ is how xUnit marks methods as test methods, and that the static ‘Assert’ class is used to assert on the result.

Now, I’ll give you the same example but written with some FsCheck properties:
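A sketch of the property-based version, assuming FsCheck.Xunit’s `[Property]` attribute and FsCheck’s `ToProperty()` extension on booleans (`Calculator.Add` is again a hypothetical SUT):

```csharp
using FsCheck;
using FsCheck.Xunit;

public class CalculatorProperties
{
    [Property]
    public Property Add_Is_Commutative(int x, int y)
    {
        // FsCheck generates the (x, y) pairs; by default 100 of them.
        return (Calculator.Add(x, y) == Calculator.Add(y, x)).ToProperty();
    }
}
```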

What are the differences?

  • The ‘Fact’ attribute is changed to the ‘Property’ attribute
  • The return type is ‘Property’ instead of ‘void’
  • The ‘Assert’ class isn’t used; instead the condition is returned and transformed into a ‘Property’ by the ‘ToProperty()’ call
  • The inputs of the method under test aren’t hard-coded anymore

This last difference is probably the most important one.
I highly recommend you read the resources if you haven’t heard about PBT, because I won’t list all the benefits of Property-Based Testing here. I hope you see that with this approach I can’t maliciously fake the actual implementation anymore, while in the first example I could have done so.

We’ve added two parameters to the test method, which FsCheck will fill in for us with random values. These will include negative, zero and positive values, all in the range of the Int32 data type; all valid integers, so to speak. By default, FsCheck runs 100 tests with random values for the inputs of the test.

FsCheck has several extension methods on boolean values, like the one above. Let’s look at some more.

Conditional & Lazy Properties

Sometimes, we want to restrict the input to make sure we always end up with the same output. A simple example is mathematical division: you can’t divide by zero, so to get a meaningful result we must make sure that the divisor isn’t zero.
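A sketch of what such a Conditional Property could look like, assuming FsCheck’s `When()` extension on a boolean-returning delegate (the `Divide` method is a hypothetical SUT):

```csharp
using System;
using FsCheck;
using FsCheck.Xunit;

public class DivisionProperties
{
    [Property]
    public Property Divide_Is_The_Inverse_Of_Multiply(int x, int y)
    {
        // The delegate delays evaluation until FsCheck has checked the condition.
        Func<bool> divide = () => Divide(x * y, y) == x;
        return divide.When(y != 0);
    }

    private static int Divide(int a, int b) => a / b;
}
```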

What’s different?

  • We added the ‘When()’ call to specify that we can’t divide by zero (this makes sure we don’t have to call ‘ToProperty()’ again)
  • We extracted the method, which we wanted to test, in its own delegate. Note that FsCheck has extension methods on any delegate that returns a boolean.

That is a good example of a Conditional Property; but why do we need to extract the call to ‘Divide’? Because otherwise FsCheck would evaluate it immediately (even with ‘y’ being zero), which would result in a ‘DivideByZeroException’, and FsCheck treats any thrown exception as a test failure. That’s why.

By extracting this, we’re telling FsCheck that we’re only interested in the result IF the condition holds. In our case: ‘y’ must not be zero.
That’s convenient!

With this simple example, we’ve shown how to express conditions in our properties to make sure we always stay within a given set of inputs, and how to create Lazy Properties, which only evaluate the test body if the condition we’ve set holds. This is also useful when the actual test takes some time and we don’t want to waste it evaluating a result that isn’t of interest to us.

Exception Properties

In functional programming, I try not to use exceptions in my code; but in imperative languages they are the way to express that something went wrong. We also write tests that trigger the exceptions we throw, by giving invalid inputs.

The xUnit package also has methods for this on the Assert class, called “Throws”, “ThrowsAny”, … How can we express this in FsCheck?

The documentation says that this isn’t officially supported in C# (you can see it in the lower-case method name), but writing it this way works.
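A hedged sketch of such an Exception Property, using the lower-case `Prop.throws` that FsCheck exposes from its F# core (the `Lazy<T>` wrapper keeps the exception from firing before FsCheck can observe it):

```csharp
using System;
using FsCheck;
using FsCheck.Xunit;

public class DivisionExceptionProperties
{
    [Property]
    public Property Dividing_By_Zero_Throws(int x)
    {
        // Property holds when evaluating the lazy value throws the exception.
        return Prop.throws<DivideByZeroException, int>(
            new Lazy<int>(() => x / 0));
    }
}
```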

Observed Properties

The closest alternative to this feature in xUnit is the ‘userMessage’ you can pass to the ‘Assert’ methods: we send a string along with the assert so we can later see which assertion has failed.

FsCheck takes this a step further.

Trivial Properties

FsCheck has a way to count the cases for which a condition is met. In our previous example, can we count how many of the generated values are negative?
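A sketch using FsCheck’s `Trivial()` extension, which counts the test cases for which the given condition holds:

```csharp
using FsCheck;
using FsCheck.Xunit;

public class TrivialProperties
{
    [Property]
    public Property Add_Is_Commutative(int x, int y)
    {
        // Counts (as "trivial") the runs in which x is negative.
        return (x + y == y + x).Trivial(x < 0);
    }
}
```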

In our test output, we can see that the positive and negative values are almost split in half:

Ok, passed 100 tests (47% trivial).

Try to run them again and see how this test output changes.

Classified Properties

Sometimes, we want to check more than one condition on our input, and perhaps add a custom message for each category of input. To my mind, this is the closest thing to the ‘userMessage’ of ‘Assert’.

FsCheck has a way to express this by classifying properties.
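A sketch with the `Classify()` extension; each condition gets its own label, and the labels are combined in the output when several conditions hold at once:

```csharp
using FsCheck;
using FsCheck.Xunit;

public class ClassifiedProperties
{
    [Property]
    public Property Add_Is_Commutative(int x, int y)
    {
        return (x + y == y + x)
            .Classify(x + y < 1000, "Smaller than '1000'")
            .Classify(x + y > 10, "Bigger than '10'");
    }
}
```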

In our output, we’re now seeing:

Ok, passed 100 tests.
63% Smaller than '1000'.
37% Smaller than '1000', Bigger than '10'.

See how the categories can also be combined, and are shown to the user in a friendly way.

Collected Properties

We’ve seen some examples how we can express some categories for our test inputs by specifying conditions on them and giving them a name. But sometimes we’re just interested in the actual input value itself and how it changes during the test run.
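For that, FsCheck has the `Collect()` extension; a sketch:

```csharp
using FsCheck;
using FsCheck.Xunit;

public class CollectedProperties
{
    [Property]
    public Property Add_Is_Commutative(int x, int y)
    {
        // Reports each generated sum, with the percentage of runs it occurred in.
        return (x + y == y + x).Collect("Values together: " + (x + y));
    }
}
```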

This will result in this test output:

Ok, passed 100 tests.
8% "Values together: 0".
5% "Values together: 8".
5% "Values together: 1".
4% "Values together: 3".
4% "Values together: -12".
3% "Values together: 38".
3% "Values together: 2".
3% "Values together: -4".
3% "Values together: -14".
3% "Values together: -1".
2% "Values together: 9".
2% "Values together: 7".
2% "Values together: 5".
2% "Values together: 32".
2% "Values together: 21".
...

This way, we can clearly see how the test inputs change over time.

Combined Observation Properties

As a final observation property, we can also combine several of the previous observed properties into one property that combines all the results:
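A sketch combining `Collect()` and `Classify()` on the same property:

```csharp
using FsCheck;
using FsCheck.Xunit;

public class CombinedObservationProperties
{
    [Property]
    public Property Add_Is_Commutative(int x, int y)
    {
        return (x + y == y + x)
            .Collect("Values together: " + (x + y))
            .Classify(x + y < 1000, "Smaller than '1000'")
            .Classify(x + y > 10, "Bigger than '10'");
    }
}
```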

This will result in this test output:

Ok, passed 100 tests.
7% "Values together: 3", Smaller than '1000'.
5% "Values together: 2", Smaller than '1000'.
5% "Values together: 0", Smaller than '1000'.
4% "Values together: 13", Smaller than '1000'.
4% "Values together: 1", Smaller than '1000'.
3% "Values together: -8", Smaller than '1000'.
3% "Values together: -4", Smaller than '1000'.
3% "Values together: -15", Smaller than '1000'.
3% "Values together: -12", Smaller than '1000'.
2% "Values together: 9", Smaller than '1000'.
2% "Values together: 8", Smaller than '1000'.
2% "Values together: 7", Smaller than '1000'.
2% "Values together: 27", Smaller than '1000', Bigger than '10'.
2% "Values together: 22", Smaller than '1000'.
2% "Values together: 1", Smaller than '1000', Bigger than '10'.
2% "Values together: -56", Smaller than '1000'.
2% "Values together: -3", Smaller than '1000'.
2% "Values together: -11", Smaller than '1000'.
2% "Values together: -10", Smaller than '1000'.
...

Combined Properties

The previous properties all had one thing in common: they test a single condition. What if we want to test multiple conditions? And how do we then distinguish each property from the other?

Sometimes, we want to test two conditions, combining them in an ‘AND’ expression. FsCheck has an extension method for this too:
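A sketch using `And()` together with `Label()`, and FsCheck’s `NonNegativeInt` wrapper type (the square-root property is my own illustrative example, not from the original post):

```csharp
using System;
using FsCheck;
using FsCheck.Xunit;

public class CombinedProperties
{
    [Property]
    public Property Square_Root_Properties(NonNegativeInt x)
    {
        double root = Math.Sqrt(x.Get);

        // Each labeled sub-property pops up by name when it fails.
        Property rootIsNonNegative =
            (root >= 0).ToProperty().Label("Root is non-negative");
        Property squaringRestoresInput =
            (Math.Abs(root * root - x.Get) < 0.0001).ToProperty()
                .Label("Root squared is the original");

        return rootIsNonNegative.And(squaringRestoresInput);
    }
}
```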

We can also add a ‘Label’, which is the same as the ‘userMessage’ in xUnit’s ‘Assert’ class: it pops up when the condition isn’t met.

By using this approach, we always know which property has failed.

Note that I now use a ‘NonNegativeInt’ type instead of a regular int. This type is part of the FsCheck framework and makes sure I always get a non-negative integer without stating it in a Conditional Property. As you have seen, FsCheck will try any value that is valid for the type; so if I instead added a condition to my property stating that I want a non-negative integer, roughly half of the generated values would be discarded. By using ‘NonNegativeInt’ I’m sure that I still get my 100 test runs without skewing the input so much that hardly any runs remain.

Of course, we can also combine our properties in an ‘OR’ expression with the ‘Or()’ extension method.

Types

We’ve already seen an example with the previous properties where I used the ‘NonNegativeInt’ type. FsCheck has several types that you can use to restrict the input. Some interesting ones:

  • PositiveInt represents an integer bigger than zero
  • NonZeroInt represents an integer which isn’t zero
  • NonNegativeInt represents an integer which isn’t below zero
  • IntWithMinMax represents an integer that can contain the int.Min and int.Max values
  • NonNull<T> wraps a type to prevent null being generated
  • NonEmptyArray<T> represents an array which isn’t empty
  • NonEmptySet<T> represents a set which isn’t empty
  • NonEmptyString represents a string which isn’t empty
  • StringNoNulls represents a string without null characters (‘\000’)
  • NormalFloat represents a float which isn’t infinite or NaN
  • Interval represents an integer interval
  • IPv4Address represents an IPv4 address
  • IPv6Address represents an IPv6 address

And many more... Don’t hesitate to come up with your own generic types and contribute them to FsCheck!

We can also have FsCheck generate our own domain models, both valid and invalid ones; but that would lead us to another topic: generators.

Conclusion

In this post, we’ve seen how Property-Based Testing isn’t just a functional concept but an idea we can use in any language. FsCheck is inspired by Haskell’s QuickCheck; there’s also ScalaCheck for Scala, JavaQuickCheck for Java, ClojureCheck for Clojure, JSVerify for JavaScript, … and many more.

I’m not saying that you should abandon all your Example-Based Tests. As I stated at the beginning of this post: people think in examples. So I think the combination of Example-Based Tests and Property-Based Tests is the sweet spot. With examples we show the next person concrete ways to use our API, and with properties we verify that the implementation is the right one, tested across its boundary conditions.

Thanks for reading!

Categories: Technology
Tags: Code Quality
written by: Stijn Moreels

Posted on Monday, September 18, 2017 12:31 PM

by Stijn Moreels

How can Functional Programming help us to ignore even more in our tests?

Introduction

In this entry of the Test Infected series, I will show you how we can increase the Test Ignorance of our tests by applying functional approaches to our imperative code.
If you don’t quite understand what I mean by “ignorance”, I recommend my previous post on the topic. In this post, we will go on a journey of increasing the Code’s Intent by increasing its Ignorance in a functional way.

Functional Ignorance

Fixture

The fixture phase of your test can become very large, as several previous posts have already shown.
How can functional programming help?
Well, assume you want to set up an object with some properties. You would:

  • Declare a new variable
  • Initialize the variable with a newly created instance of the type of the variable
  • Assign the needed properties to setup the fixture

Note that in our test we’re mostly interested in the last item; so how can we make sure that part is the most visible?

The following example shows what I mean:
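The original code isn’t shown here, but the situation could be sketched like this (the `Message` type, its properties and the `_context` field are all hypothetical names):

```csharp
[Fact]
public void Message_With_Subject_Is_Archived()
{
    // The subject is all this test cares about,
    // yet it drowns in the rest of the setup.
    var message = new Message
    {
        Subject = "Some subject",
        Body = "Some body",
        Priority = MessagePriority.Normal,
        CreatedOn = DateTimeOffset.UtcNow
    };
    _context.Messages.Add(message);
    _context.SaveChanges();

    // ... exercise and verify ...
}
```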

We would like to test something involving the subject property of the message, but note that this is not the first thing that catches your eye (especially if we use the object-initializer syntax). We must also insert the message into some context.

We could, of course, extract the creation functionality with a Parameterized Creation Method and extract the insertion functionality that accepts a message instance.

But note that we do not use the message elsewhere in the test. We could extract the whole functionality and just accept the subject name, but then we would need an explicit method name to make clear that we insert a message into the context AND assign the given subject name to that inserted message. What if we want to test something else? Yet another explicit method?

What I sometimes do is extract only the assigning functionality like this:
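A sketch of that extraction (hypothetical names again): the test passes in only the assignment it cares about, as a function.

```csharp
[Fact]
public void Message_With_Subject_Is_Archived()
{
    // Only the relevant assignment is visible in the test.
    InsertMessage(m => m.Subject = "Some subject");

    // ... exercise and verify ...
}

private void InsertMessage(Action<Message> assignProperty)
{
    // Whatever is needed to create an otherwise-ignored, valid message.
    Message message = CreateIgnoredMessage();
    assignProperty(message);

    _context.Messages.Add(message);
    _context.SaveChanges();
}
```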

We don’t use the name of the method to state our intentions; we use our code.

In the extracted method, we can do whatever necessary to create an ignored message. If we do need another way to create a message initially, we can always create a new method that only inserts the incoming message and call this from our functional method.

It would be nice if we had immutable values and could use something like F# "Copy-And-Replace Expressions".

Exercise

Often, when you test the code branches of an external SUT endpoint, the creation of the SUT doesn’t change; only the info you send to the endpoint does. Since the SUT creation does not change across several tests, we could say that it is less important to the test case than the changing values.

When you come across such a scenario, you can use the approach I describe here.

The idea is to split the exercise logic from the SUT creation. If you have different endpoints you want to test for the same SUT fixture, you can even extend this approach by letting the client code decide what endpoint to call.

The following example shows two test cases where the SUT creation is the same:
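A sketch of the duplication (the `OrderService` SUT and its helpers are hypothetical):

```csharp
[Fact]
public void Service_Accepts_Valid_Order()
{
    // (1) Create SUT - identical in both tests.
    var service = new OrderService(new InMemoryRepository());

    // (2) Exercise SUT - only this part differs.
    Assert.True(service.Place(CreateValidOrder()));
}

[Fact]
public void Service_Rejects_Empty_Order()
{
    var service = new OrderService(new InMemoryRepository());

    Assert.False(service.Place(CreateEmptyOrder()));
}
```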

Note that both follow the same pattern: (1) create the SUT, (2) exercise the SUT. Compare this with the following code, where the SUT is exercised differently.

We ignore the unnecessary info by Functional Thinking:
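The extraction could be sketched like this: the client code passes in only the exercise, as a function on the SUT (hypothetical names):

```csharp
private static void ExerciseService(Action<OrderService> exercise)
{
    // The unchanging SUT creation lives in one place.
    var service = new OrderService(new InMemoryRepository());
    exercise(service);
}

[Fact]
public void Service_Accepts_Valid_Order()
{
    ExerciseService(service =>
        Assert.True(service.Place(CreateValidOrder())));
}
```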

We can extend this idea by letting the client choose the return value. This is rather useful if we want to test the SUT with the same Fixture but with different member calls:
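A sketch with a generic return value, so the caller decides which member to call and what comes back:

```csharp
private static TResult ExerciseService<TResult>(
    Func<OrderService, TResult> exercise)
{
    var service = new OrderService(new InMemoryRepository());
    return exercise(service);
}

[Fact]
public void Service_Accepts_Valid_Order()
{
    bool accepted = ExerciseService(service => service.Place(CreateValidOrder()));
    Assert.True(accepted);
}
```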

I use this approach in almost every Class Test I write. The idea is simple: encapsulate what varies. Only now we think in functions rather than objects; and functions can be treated as objects!

Verification

The last topic I will discuss in a Functional approach is the Result Verification phase of the Four-Phase Test.

When I apply techniques in this phase, I always come back to the same principle and ask myself the same question: “What is really important?” What interests me the most?

In the Result Verification phase, this is the Assertion itself. WHAT do you assert in the test to make it a Self-Evaluating Test? What makes the test succeed or fail?
That’s what’s important; all the other clutter should be removed.

A good example (I think) is when I needed to write some assertion code to Spy on a datastore. When the SUT was exercised, I needed to check whether there was any change in the database and whether it corresponded with my expectations.
Of course, I needed some logic to call the datastore, retrieve the entities, assert on them, and Tear Down some datastore-related items. But the test only cares whether the update happened or not.

As you can see, the assertion itself is baked into the called method, and we must rename the method to a more declarative name for the test reader to know what we’re asserting on.

Now, as you can see in the next example, I extracted the assertion, so the test itself can state what the assertion should be.
Also note that when I extract this part, I can reuse this Higher-Order Function in any test that needs to verify the datastore, which is exactly what I did:
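A sketch of such a Higher-Order Function (all names are hypothetical): the test supplies only the assertion, while the helper handles connecting, querying and tearing down.

```csharp
private static void AssertOnDatastore(Action<IReadOnlyList<Order>> assertion)
{
    using (var connection = OpenTestConnection())
    {
        try
        {
            // Retrieve the entities and hand them to the test's assertion.
            assertion(LoadOrders(connection));
        }
        finally
        {
            // Tear Down the datastore-related items.
            CleanUpTestRecords(connection);
        }
    }
}

[Fact]
public void Approving_An_Order_Updates_The_Datastore()
{
    ExerciseService(service => service.Approve(KnownOrderId));

    AssertOnDatastore(orders =>
        Assert.Contains(orders, o => o.Status == OrderStatus.Approved));
}
```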

Conclusion

Test Ignorance can be interpreted in many ways; this post explored some basic concepts of how Functional Programming can help us write more Declarative Tests. By extracting not only hard-coded values but also hard-coded functions, we can build complex behavior by composing smaller functions.

Functional Programming isn’t fully mainstream (yet), but by introducing functional concepts into imperative languages, such as lambda functions, pattern matching, inline functions, pipelines, higher-order functions, … we can maybe convince the imperative programmer to at least try the functional way of thinking.

Categories: Technology
Tags: Code Quality
written by: Stijn Moreels

Posted on Monday, July 24, 2017 8:24 AM

by Stijn Moreels

Introduction

In this part of the Test Infected series, I will talk about the ignorance of tests and how we can achieve more ignorance.

The short answer: if it doesn’t contribute to the test, hide/remove it!

The reason I wrote this post is because I see many tests with an overload of information embedded, and with some more background information people may increase the ignorance of their tests.

Ignorance

Test-Driven Discovery

Sometimes people write tests just because they are obliged to. Only when someone is looking over their shoulder do they write tests; in any other circumstances they don’t. It’s also striking to see people abandon their practices (TDD, Merciless Refactoring, …) the moment there is a crisis, a need for a quick change, or anything else that causes stress.

The next time you’re in such a situation, observe yourself and evaluate how you react. If you don’t stick with your practices in those situations, do you really trust your practices at all? If you don’t use your practices in stressful situations, abandon them, because they aren’t working for you (yet).
This could be a learning moment for the next time.

Now, that was a long intro to come to this point: what happens after we have written our tests (in a Test-First/Test-Last approach)?

Test Maintenance

In my opinion, the one thing Test Ignorance is really about is Test Maintenance. When there are changes to the SUT (System Under Test), how much of the production code and how much of the test code do you have to change?

When you (over)use Mock Objects (and Test Doubles in general), you can get in situations that Gerard Meszaros calls Overspecified Software. The tight-coupling between the tests and the production code is causing this Smell.

But that’s not actually the topic I want to talk about (at least not directly). What I do want to talk about are all those tests with so much information in them that every method/class/… is Obscured.

People read books about patterns, principles, practices… and try to apply them to their Production Code, but forget their Test Code.

Test Code should be as clear as the Production Code.

If a method in production has 20 lines of code and people keep losing time reading and rereading it (how many times do you reread a method before you refactor?), you refactor it into smaller parts to improve usability, readability, intent…

You do this practice in your production code; so, why wouldn’t you do this in your test code?

I believe one of the reasons people sometimes abandon their tests is that they think they get paid for production code (and not that they lack discipline). It’s as simple as that. But remember that you get paid for stable, maintainable, high-quality software, and you simply can’t deliver that without tests that are easy to maintain.

"Ignorance is Bliss" Patterns

“I know this steak doesn't exist. I know that when I put it in my mouth, the Matrix is telling my brain that it is juicy and delicious. After nine years, you know what I realize? Ignorance is bliss.”

- Cypher (The Matrix)

Now that you understand that your test code is just as important as your production code, we can start by defining the Ignorance in our tests.

There are several Test Patterns in the literature that support this Test Ignorance, so I’ll give you just the concepts and some quick examples.

This section is about readability and how we can improve this.

Unnecessary Fixture

The one place where you could start, and where a Test Smell is most obvious, is the Fixture Setup. Not only can this section be enormous (I’ve seen gigantic fixtures) and hard to grasp, it is also hard to change and thus to maintain.

Look at the following example. We need to set up a “valid” customer before we can insert it into the Repository. In this test, do I really need to know all the different items that make a customer invalid? Do we need all of them? Maybe it’s just the id that’s missing (which could be autogenerated), or maybe the address doesn’t exist, …

Only show what I need to make the test pass.

We can change the example with a Parameterized Creation Method, as an example of the One Bad Attribute pattern. In the future, we could also parameterize the other properties if we want to test functionality that depends on them. If that isn’t the case, we can leave those initializations inside the Creation Method for the customer instead of polluting the test with unnecessary information.
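A sketch of the One Bad Attribute refactoring (the `Customer` type, `_repository` and the exception are hypothetical names):

```csharp
[Fact]
public void Customer_Without_Id_Is_Not_Inserted()
{
    // Only the one invalid attribute is visible in the test.
    Customer customer = CreateCustomerWith(id: null);

    Assert.Throws<InvalidCustomerException>(
        () => _repository.Insert(customer));
}

private static Customer CreateCustomerWith(string id)
{
    // Everything else is a valid default the test doesn't care about.
    return new Customer
    {
        Id = id,
        Name = "Ignored name",
        Address = "Ignored address"
    };
}
```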

Now, if we want to act as fully Test Infected people, we can also Test-Drive these Creation Methods. The next time you analyze the code coverage of your code, include the test projects and write tests for these methods too! This will increase your Defect Localization: if there’s a problem with your fixture rather than with the test that uses it, you will see this in your failed tests and know the problem lies with the Fixture and not the test itself.

Also note that this newly created method is only accessible within this test class. If we want to write tests with the same Fixture elsewhere, we can extract this logic into its own class.

Either way, we have made our intentions clear to the test reader. I always try to ask the test the following question: “Do you really care about knowing this?”

Again, the test can be made clearer if we pass in the argument that makes the customer invalid, so we know why the customer isn’t inserted. If we moved the “id” somewhere else, we wouldn’t know the cause and would make the test more Obscure.

I see some reasons why a Test Fixture can be big:

  • The Fixture has a lot of “setup code” in place because the SUT is doing too much. Because the SUT takes all these steps, we must build our Fixture with a lot of info and behavior; otherwise, the SUT will consider the Fixture invalid.
  • The Fixture is the smallest possible for exercising the SUT, and the SUT is a Complete Abstraction, but it nonetheless needs some lines of setup before the Fixture is valid.
  • The Fixture contains some unnecessary information that doesn’t contribute to the result of the test but is embedded in the test anyway.

So, there are many different reasons why a Fixture can be big, and the solution is the same for all of them: make the Fixture as small as possible, and only add information that is relevant to the result of the test. Contribute or get out.

Now, if you move ALL the Fixture code somewhere else (extracting too much), you also have a problem. Test readers will see Magic Fixtures in place that act as Mystery Guests, which can result in Fragile Tests.

Obscured by Eagerness

Sometimes, I encounter tests that are “Obscured by Eagerness”. A test can be obscure for different reasons: because we want to assert too much in a single test, because we want to set up too much in a single run, or because we combine tests in a single run by exercising multiple actions on the SUT.

To summarize:

  • Eager Assertion: asserting on too much state and/or behavior in a single run.
  • Eager Fixture: setting up unnecessary fixture (see previous section).
  • Eager Exercises: exercising multiple actions on the SUT to combine tests.

I’ve seen people defend tests with more than 20 assert statements because they still tested a “single unit” of outcome. Sometimes functionality looks like it needs 20 or more assert statements, but instead of writing them you should ask yourself: what are you actually trying to test?

By explicitly asking yourself this question, you often come up with surprising results.

Because the assert phase of the test (in a Four-Phase Test) is what determines whether the test fails or succeeds, I always try to write this phase first. It forces you to think about what you’re trying to test rather than what Fixture you need to set up. By writing this phase first, you write your test from bottom to top and define only what you really need, just as when writing tests for your production code.

The previous snippet is a perfect example of how we can abuse the assert phase. By placing so many asserts in a single spot, we obscure what we’re really trying to test. We need to test whether the message is serialized correctly; so instead of manually getting each element, why not assert on the whole XML?

We create an expected xml string and verify if this is the same as the actual serialized xml string.
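A sketch of that single whole-document assertion (the `Message` type and `Serialize` helper are hypothetical):

```csharp
[Fact]
public void Serializes_Message_To_Xml()
{
    var message = new Message { Subject = "Some subject" };

    string actual = Serialize(message);

    // One assert on the whole document instead of
    // twenty asserts on the individual elements.
    const string expected =
        "<message><subject>Some subject</subject></message>";
    Assert.Equal(expected, actual);
}
```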

Conclusion

Writing tests should be taken as seriously as writing production code; only then can we have maintainable software solutions where developers are eager to run tests instead of ignoring them.

The next time you write a test, think firmly about the test: what should I know, what do I find important to exercise the SUT, what do I expect… This way you can determine which items are important and which aren’t.

I sometimes "pretend" to be the test case:

“Do I care how this Fixture is set up?”
“Must I know exactly how to assert all these items?”
“Have I any interest of how a ‘valid’ object looks like?”
“What do I really want to test and what information only pollutes this?”
“Do I care that these actions must be executed before the exercise of the test?”

Tests only need to know what they need to exercise the SUT, nothing more, but equally important: nothing less!

Categories: Technology
Tags: Code Quality
written by: Stijn Moreels

Posted on Wednesday, July 12, 2017 9:45 AM

by Stijn Moreels

In this part of the Test Infected series, I will talk about how code is hard to test – both in a Test-First mindset and without.

Hard-to-Test Code

By “hard”, I mean anything that is uneasy, sloppy, frustrating or annoying; anything that makes you sigh. All of that can be categorized as “hard”.

TDD, or Test-Driven Development, is a lot more than just writing tests first; it’s also about Designing Software. You think about a lot of things while writing tests, and all those things together guard you from writing Hard-to-Test Code.

When I write tests, I want an easy way to exercise and verify the SUT (System Under Test). The less code I need to write to do that, the better. The clearer the test intent, the better. The easier the test, the better.

Obscured Fixture Setup

What do I mean by a Fixture in this paragraph? Anything that you need to do so you can exercise the SUT. It could mean initializing the SUT with some valid arguments, inserting some Dummy Data into a datastore, or calling some methods of the SUT...

According to Kent Beck: anything that “Sets the Table” for the SUT.

This section is about the maintainability of the Test Fixture and how we can improve this.

Discovery

With a Complex Fixture Setup, I mean that I must write a lot of code to “set this table”. I must admit that I am quick to label a fixture “complex”, but I guess that’s a good thing.

Look at the following snippet. It’s a good thing that we “spy” on the Client instead of actually sending a mail, but also note that your eyes are drawn to the strings in the Attachments and Header of the valid message instead of to the actual testing and verifying.

I don’t like complex, big, or hard-to-understand Fixtures. I want a clear view of what is tested and how. Nothing more, nothing less. Of course, I don’t know whether you find this complex; maybe you don’t (because you wrote it). I just don’t like big methods, I guess.

We have 16 lines: 3 comments, 2 blank lines, 2 braces, and 7 lines of Fixture Setup.

Causes

You could think of many causes to having a Complex Fixture Setup.

  • You could have a Tightly-Coupled system which forces you to create all those extra objects and initialize them with the right values.
  • Your test includes information which doesn’t really matter in the context of the test, thereby introducing a Polluted Test. This can happen in a copy-paste programming environment, where you just copy the Fixture of another test.
  • It could also happen when not enough research was done into Faking the Fixture, which could have avoided the unnecessary setup code.

Impact

Now, we have a Complex Fixture – so what?

Let’s look at the impact a Complex Fixture can have on your development. What if I want to test some unhappy paths for the Client? What if we want to test the creation of the Message with Constructor Tests? What if we want to test with a single Attachment instead of two…

All those tests would require a similar Fixture Setup.

If you have Cut-and-Paste developers on your team, you will end up with a lot of Test Duplication, which again results in a high Test Maintenance Cost.
Besides the duplication, it isn’t clear to me that we are testing a “valid message”. Does it have to do with the header value? With the attachments? With something else? …

What do I have to do to create a valid mail message? Does this message require attachments? Without a clear Fixture Setup, we don’t have a clear Test Overview.

Possible Solution

The first thing you should do is eliminate all the unnecessary information from your test. If you don’t use it or need it, don’t show it.
I only want to see what’s really needed for the test to pass.

If, after that, you still have a big Fixture to set up, place it in Parameterized Creation Methods so you only pass the values that are valuable for the test. This way you resolve the duplication across tests.

Also make sure that you don’t have any duplication in your Implicit Fixture (typically in some kind of “setup” method or constructor), for example when setting up a Datastore.

Missing Object Seam Enabling Point

What is an Object Seam? Michael C. Feathers introduced this and called it: “A place where you can alter behavior without editing that place”.

Every Seam must have an Enabling Point, where the decision for one behavior or the other can be made. In object-oriented languages, we can exploit this by introducing Test Doubles, with which we implement another version of the dependency or other object we need to exercise the SUT.

Discovery

Not having an Enabling Point for the SUT makes our code Hard-to-Test. This can happen in many situations – especially when a design decision has been made that everything else must make room for. (This sounds a bit like a Golden Hammer Smell to me.)

Please don’t look at the names, it’s only an example.

The following example shows how the Message Service doesn’t contain any Enabling Point. We are bound to the file system if we want to test the Message Service. A possible solution is to introduce a Fake Datastore (probably in-memory) and inject it into the Message Service.
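The original code screenshot isn’t reproduced here, so here is a hedged reconstruction of the idea – the interface and class names are my own assumptions. The constructor parameter is the Enabling Point: any implementation of the datastore contract can be injected, including an in-memory fake for tests.

```csharp
using System.Collections.Generic;

public interface IDataStore
{
    void Write(string message);
}

// Production implementation, bound to the file system.
public class FileBasedDataStore : IDataStore
{
    public void Write(string message)
        => System.IO.File.AppendAllText("messages.txt", message);
}

// Enabling Point: the constructor accepts any IDataStore.
public class MessageService
{
    private readonly IDataStore _store;
    public MessageService(IDataStore store) => _store = store;

    public void Send(string message) => _store.Write(message);
}

// Fake Datastore: a full in-memory implementation for tests,
// which also gives us something to assert on.
public class InMemoryDataStore : IDataStore
{
    public List<string> Written { get; } = new List<string>();
    public void Write(string message) => Written.Add(message);
}
```

With the fake in place, the assertion targets the Message Service’s behavior instead of the File Based Data Store’s.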

Also note that, without such a seam, we can’t even write a valid assertion. Yes, we could check whether the File Based Data Store has written something to disk. But I hope you can see that this isn’t the right way to assert on the Message Service.

We would have to write code that asserts the functionality of the File Based Data Store, not that of the Message Service.

Ok, that was a very “simple” example of how a SUT can miss an Enabling Point. Let’s make it a tiny bit harder.

The File Sender always uses XML serialization/deserialization when writing a file. We must always use the Message Helper (what kind of name is that?) if we want to write a file.

These Utility Classes are the result of thinking in a Procedural way, not in an OOP way. Because all these static classes are used in this class, we have no choice but to also test them if we want to test the File Sender. If we Unit Test this class, we aren’t testing a “unit”; we are testing three (!) classes.

Whenever I see the name “helper” appear, I immediately think there is room for improvement in the design. For a start, please rename “helper” to a more appropriate name. Everything could be a helper for another class, but that doesn’t mean we must name all our classes with the suffix “helper”.

Try to move those functionalities in the appropriate classes. Our Message could for example have a property called IsEmpty instead of outsourcing this functionality in a different class.

Functional languages have the same need to inject Seams. Look at the following example of the same functionality:

If we want to change the Message Helper or the serialization, we must inject functions into our “Write to File” function. In this case, our Stub is just another function.
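The original example is in F#, but the same function-injection seam can be sketched in C# with delegates (the parameter names and the “Write to File” shape are my assumptions, not the original code):

```csharp
using System;

public static class FileWriter
{
    // Each collaborator is injected as a function, so a test can
    // pass stub functions instead of the real implementations.
    public static void WriteToFile(
        string message,
        Func<string, bool> isEmpty,     // stand-in for the Message Helper
        Func<string, string> serialize, // stand-in for the XML serialization
        Action<string> write)           // stand-in for the file system
    {
        if (isEmpty(message)) return;
        write(serialize(message));
    }
}
```

A test then injects `s => false`, a trivial serializer, and an `Action<string>` that captures the output – the Stub really is just another function.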

Again, don’t look at the names, or functionality – it’s just to make a point on the Enabling Point of our Object Seams.

Causes

You could think of many causes of a Missing Enabling Point:

  • If the Dependencies are implemented right inside the SUT – which indicates Tight-Coupling (again) – we cannot implement our own version of the dependency.
  • Utility Classes are a result of Procedural Thinking in an OOP environment (yes, maybe there are some exceptions – maybe), which results in a Static Dependency within the SUT. We cannot alter the behavior of these static classes in our test.
  • The SUT may do a lot of work inside the constructor, which always needs to run if we want to exercise the SUT – thereby limiting our capabilities of placing an Enabling Point.
  • Having a chain of method calls could also indicate this Tight-Coupling, only in a more subtle way. If we have a single dependency but we have to call three methods deep before we have the right info, we have the same problem. It violates the "Don't Talk to Strangers" design principle.

Impact

By constantly missing an Enabling Point for your Seam, you are making a design that isn’t reusable for other purposes.
Sometimes the reason behind the absence of Enabling Points lies in the way the responsibilities are spread across the classes, not wrapped in the right classes.

Maybe I’m a bit allergic to Utility Classes.

Possible Solution

Placing an Enabling Point in our SUT – that should be our solution. We need some kind of Observation Point we can use to alter behavior, so we can verify the outcome.

Note that the observation can be direct or indirect, just like the control point (or Enabling Point) of our SUT. Use different kinds of Test Doubles to achieve these goals.

We use a Test Stub to control the Indirect Inputs of the SUT; we use a Mock Object for verification of the Indirect Outputs.

Classes which keep information or behavior private can perhaps expose that information or behavior to a subclass. We could create a Test-Specific Subclass which we can use to exercise the SUT instead of the real one.

But be careful that you don’t override any behavior you are testing. That would lead to False Positive test cases and would introduce paths in your software that are never exercised in a test environment.
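A minimal sketch of such a Test-Specific Subclass (class and member names are my own assumptions): only the clock is overridden, while the behavior under test stays untouched.

```csharp
using System;

public class ExpiryService
{
    // The only member the subclass may override: a seam, not behavior.
    protected virtual DateTime Now => DateTime.UtcNow;

    // The behavior under test; it must NOT be overridden.
    public bool IsExpired(DateTime deadline) => Now > deadline;
}

// Test-Specific Subclass: exposes a fixed clock to the test.
public class TestableExpiryService : ExpiryService
{
    public DateTime FixedNow { get; set; }
    protected override DateTime Now => FixedNow;
}
```

The test can now pin the clock and exercise `IsExpired` deterministically instead of depending on the real system time.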

In Functional languages, everything is a function; so we could introduce a Stub Function for our injection of data, and a Mock and/or Stub Function for our response and verification… so that we have an Enabling Point for our SUT.

Over-Specified Software by Mocking

I already said it in previous posts and sections: you must be careful about Mocking and what you try to Mock. In another post, I mentioned the Mock Object as Test Double to assert on indirect outputs of the SUT. This can be useful if you can’t verify any other outside observable behavior or state of the SUT.

By using this Mock Object we can in fact verify the change of the SUT.

Discovery

Yes, we can verify the change of the SUT; but do we also have a maintainable Mock Object?
If we need to change some signature the Mock Object depends on, we need to propagate that change throughout all the Mock Objects and their direct assertions to complete the change in signature.
If we mock out everything of the SUT and thereby Tightly-Couple our Test Double to the SUT, we have Over-Specified Software by Mocking too much.

Causes

Multiple situations can cause a SUT being Tight-Coupled to the DOC (Depend-On Component, in this case the Mock Object).

  • Writing Tests After implementation can cause this situation. If you have developed a Hard-to-Test SUT, you may have encountered a SUT that can only be exercised/tested (and so, verified) by injecting a Mock Object and asserting on the indirect outputs of the SUT.
    Sometimes the assert-phase of these tests isn’t the whole picture we want to test but only a fragment, a side-effect. By relying on this extra side-effect, we have made our test code Tightly-Coupled to our SUT.
  • Testing Unnecessary Side-Effects can also cause Over-Specified Software. If we assert on things we don't necessarily need in our test, or things that do not add any extra certainty to our test case, we should remove those assertions. Asserting on “extra” items doesn’t result in more robust software but rather in Over-Specified Software.

Impact

So, let’s say you’re in such a situation; what’s the impact on your development work?
Just like any software that is Tightly-Coupled, you have the cost of maintenance. If your software is tested in such a way that the slightest change to the SUT that doesn’t alter any of its behavior results in a failed test, you could state that you have Over-Specified Software.
Any change you make becomes a hard one, which means developers will make fewer changes. Fewer changes/refactorings/cleanups… will result in lower quality software.

Possible Solution

People tend to forget the Dummy Object when they are writing tests. The purpose of the Dummy Object is to fulfil the SUT’s needs without actually doing anything for it. Passing “null” or empty strings are good examples, or objects that are empty (and maybe throw exceptions when they are called, to ensure that they aren’t called during the exercise of the SUT).
Not everything needs to be a Mock Object. And just to be clear: a Mock isn’t a Stub!

You’ll be amazed how much unnecessary data you wrote in your tests once you start removing all those Mock Objects and replacing them with lighter objects like Dummy Objects.
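As an illustration (interface and class names are my own assumptions), a Dummy can satisfy a constructor signature while guaranteeing – by throwing – that it is never actually used during the exercise of the SUT:

```csharp
using System;

public interface ILogger { void Log(string message); }

// Dummy Object: fulfils the signature, but throws if it is ever called,
// proving that the exercised path doesn't touch the logger at all.
public class DummyLogger : ILogger
{
    public void Log(string message)
        => throw new InvalidOperationException("Dummy was not supposed to be called");
}

public class Calculator
{
    private readonly ILogger _logger;
    public Calculator(ILogger logger) => _logger = logger;

    public int Add(int x, int y) => x + y; // never touches the logger
}
```

No mocking framework, no expectations to maintain – the test reader immediately sees the logger is irrelevant to this test.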

Yes, Mock Objects are absolutely necessary for a complete developer toolset for testing; yes, sometimes Mock Objects are the only possible solution to verify indirect outcomes of the SUT; yes, sometimes we need to assert on indirect output calls directly…
But certainly not always. Try using another Test Double first before reaching for a Mock Object, just like you’d use an Inline Fixture first before moving to a Shared Fixture.

Besides changing your Test Double, you could also look at WHAT you want to test and come up with some refactorings in your SUT to verify the right state or behavior. The best route to code that is easy to test is writing your tests first, which immediately results in testable code.

Asynchronous Code

A short word about Asynchronous Code, because the topic is too big for such a small section.

The problem with async code is that we don't always have the same context in which we can assert on the right outcome. Sometimes we use Polling to get the work done, for example. This will (of course) result in Slow Tests, but sometimes we don't have control over the running process.

In the book xUnit Test Patterns we've seen that we can use a Humble Object which extracts the async code, so we can make sync calls in our test. In my previous post, I talked about a Spy that used a Wait Handle to block the thread until the test can succeed; this can also be a solution (if it's implemented right: timeout, ...).
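A minimal sketch of such a Spy with a Wait Handle could look like this (the class and member names are my own assumptions; note the timeout, so the test fails instead of blocking forever):

```csharp
using System.Threading;

public class SpyMessageHandler
{
    private readonly ManualResetEventSlim _received = new ManualResetEventSlim(false);

    public string LastMessage { get; private set; }

    // Called by the SUT, possibly on another thread.
    public void Handle(string message)
    {
        LastMessage = message;
        _received.Set();
    }

    // The test blocks here; returns false on timeout instead of hanging.
    public bool WaitForMessage(int timeoutMs) => _received.Wait(timeoutMs);
}
```

The test's assert-phase then becomes `Assert.True(spy.WaitForMessage(5000))` followed by assertions on `LastMessage`, hiding the blocking mechanism behind the Spy.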

The xUnit framework written for .NET (not the xUnit family! .NET xUnit != xUnit Family) has support for async test methods, which makes sure that we can assert in the right task-context.

Conclusion

So many classes are already in play that are difficult to test; that’s why my motto is to not make this pile of classes any bigger, and to write easy-to-read/easy-to-maintain code in a Test-Driven way, every day. Because every day can be that day where you must change a feature, add/modify/remove functionality, or anything else that includes change.

Tests are there to help, not to slow you down. In fact, by writing tests you work more productively, more efficiently, more safely, more robustly, …

So, don’t write any Hard-to-Test code, but write code that grows incrementally from your tests.

Categories: Technology
Tags: Code Quality
written by: Stijn Moreels

Posted on Friday, July 7, 2017 12:00 PM

Stijn Moreels by Stijn Moreels

In this part of the Test Infected series, I will talk about the Test Doubles. These elements are used as a “stand-in” for the components our SUT (System Under Test) depends on. These Test Doubles can be DOCs (Depend-On Components), but also other elements we need to inject to exercise the SUT.

Introduction

This is probably the first post in the Test Infected series. The term “test infected” was first used by Erich Gamma and Kent Beck in their article.

“We have been amazed at how much more fun programming is and how much more aggressive we are willing to be and how much less stress we feel when we are supported by tests.”

The term "Test-Driven Development" was something I heard in my first steps of programming. But it was when reading different books about the topic that I really understood what they meant.

The Clean Coder by Robert C. Martin talks about the courage and level of certainty that Test-Driven Development brings. Test-Driven Development: By Example by Kent Beck taught me the mentality behind the practice, and Gerard Meszaros with his book xUnit Test Patterns showed me the several practices that not only improved my daily development, but also my Test-First mindset. All these people have inspired me to learn more about Test-Driven Development and the Test-First Mindset. To see relationships between different visions and to combine them the way I see it; that's the purpose of my Test Infected series.

In this part of the Test Infected series, I will talk about the Test Doubles. These elements are used as a “stand-in” for the components our SUT (System Under Test) depends on. These Test Doubles can be DOCs (Depend-On Components), but also other elements we need to inject to exercise the SUT.

I find it not only interesting to examine the theoretical concept of a Test Double, but also how we can use it in our programming.

Types

No, a Stub isn’t a Mock; no, a Dummy isn’t a Fake. There are differences in the way we test our code. Some Test Doubles control direct inputs, others verify indirect outputs. Each type has a clear boundary and reason to use.

But be careful: overuse of these Test Doubles leads to Over-Specified Software, in which the test is Tightly-Coupled to the Fixture Setup, which results in more refactoring work for your tests (sometimes more than for the production code itself).

Test Code must be as clear, simple and maintainable… as Production Code – maybe even more.

Dummy Object

We use a Dummy Object if we want to inject some information that will never be used. null (C#), None (Python) … are good examples; but even “ignored data” (string) are valid Dummy Objects. If we’re talking about actual valid objects, we could throw exceptions when the methods of that object are called. This way we make sure that the object isn’t used.

We introduce these kinds of objects because the signature of the object to test requires some information. But if this information is not of interest to the test, we can introduce a Dummy Object so that the test reader only sees the relevant test information.

We must introduce custom Dummy Objects if the SUT doesn’t allow us to send null / None.

Test Stub

In the literature, I found two different types of Test Stubs. The first returns or exposes some data which can be used to validate the actual outcome of the System Under Test (SUT); this is called a Responder. The second throws exceptions when the SUT interacts with the Stub (by calling methods, requesting data…) so that the Unhappy Path is being tested; this is called a Saboteur.

But I encountered a possible third type which I sometimes use in test cases. I like to call it a Sink, but it’s actually just a Null Object. This type of Stub would just act as a “sink”, which means that the Stub isn’t doing anything with the given data. You could use a Sink in situations where you must inject a “valid” object, but the test case doesn’t really care about what’s happening outside the SUT (in what cases does it?).

By introducing such an Anonymous Object, you let the test reader know that the object you send to the SUT is not of any value for the test.

This kind of “stubbing” can also be accomplished by introducing a new Subclass with Test Specifics and override with empty, expected or invalid implementations to test all paths of the SUT.

The following example shows how the Basket gets calculated with a valid (Anonymous) and an invalid (Saboteur) product. I like to call valid items “filled” or “anonymous” to reference the fact that I don’t care what they contain or how they are structured. You can use “saboteur” to indicate that you have an “invalid” product in place that throws exceptions when it gets called.
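The original code image isn’t reproduced here, so this is a hedged reconstruction of the idea – all names, prices, and the “skip corrupt products” behavior are my own assumptions:

```csharp
using System;
using System.Collections.Generic;

public interface IProduct { decimal Price { get; } }

// Anonymous (valid) product: the test doesn't care about its contents.
public class AnonymousProduct : IProduct
{
    public decimal Price => 10m;
}

// Saboteur: throws when touched, exercising the unhappy path.
public class SaboteurProduct : IProduct
{
    public decimal Price => throw new InvalidOperationException("corrupt product");
}

public class Basket
{
    private readonly IEnumerable<IProduct> _products;
    public Basket(IEnumerable<IProduct> products) => _products = products;

    // Skips corrupt products instead of aborting the whole calculation.
    public decimal CalculateTotal()
    {
        decimal total = 0m;
        foreach (var p in _products)
        {
            try { total += p.Price; }
            catch (InvalidOperationException) { /* ignore corrupt products */ }
        }
        return total;
    }
}
```

The test arranges a few anonymous products plus one saboteur and asserts that the total only reflects the valid ones.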

I don’t know why, but sometimes – especially in this case, where you have several valid items and a single invalid one – the setup reminds me of a pattern called Poison Pill. This is used in situations where you want to stop an execution task from running by placing a “poison pill” in the flow.

This type of Stub isn’t a Dummy Object, because its methods, properties, etc. are actually called. Also note that there’s a difference between a Saboteur and a Dummy Object which throws exceptions when called: the Saboteur is used to test all the paths of the SUT, whereas the Dummy Object guards against calls that aren’t supposed to happen (which result in a test failure).

You would be amazed what a Stub can do in your design. Some developers even use this stub later as the actual production implementation. This is for me the ultimate example of Incremental Design. You start by writing your tests and incrementally start writing classes that are dependencies of your SUT. These classes will eventually evolve into actual production code.

Now, here is an example of a Stub. The beauty of Functional Programming, is that we can use Object Expressions. This means we can inline our Stub in our test.

Java also has a feature to define inline (anonymous) classes and override only the methods you exercise during the test run.

 

  • To decrease Test Duplication, we can define Pseudo Objects for our Stubs. This means we define a default implementation that throws exceptions for any called member (like a Dummy, for example). This allows us to override only those members we are interested in for our Stub.
  • During my first experience with Kent Beck's Test-Driven Development, I came across the Self-Shunt idea. This can actually be any Test Double, but I use it most of the time as a Stub. Here, we use the Test Class itself as the Test Double. Because we don't create an extra class and we specify the return value explicitly, we have very clear Code Intent. Note that I only use this practice if the Test Double can't be reused somewhere else. Sometimes your Test Double starts as a Self-Shunt but can grow into a full-blown Stub.
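A minimal Self-Shunt sketch (all names are my own assumptions; in a real suite the exercising method would carry xUnit's [Fact] attribute):

```csharp
public interface IPriceProvider { decimal GetPrice(string productId); }

public class PriceCalculator
{
    private readonly IPriceProvider _prices;
    public PriceCalculator(IPriceProvider prices) => _prices = prices;

    public decimal TotalFor(string productId, int quantity)
        => _prices.GetPrice(productId) * quantity;
}

// Self-Shunt: the test class itself implements the dependency it stubs,
// so the stubbed return value sits right next to the exercise of the SUT.
public class PriceCalculatorSelfShuntTest : IPriceProvider
{
    public decimal GetPrice(string productId) => 5m; // explicit stubbed value

    public decimal Exercise()
        => new PriceCalculator(this).TotalFor("any-product", 3);
}
```

Passing `this` into the SUT is what makes the Code Intent so clear: the reader sees the stubbed price and the exercised behavior in a single class.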

Test Spy

Ok, Stubs are cool – very cool. The Saboteur is especially useful to test the unhappy/rainy paths throughout the SUT. But there’s a downside to a pure stub: we cannot test the Indirect Output of our SUT.

That’s where the Test Spy comes in. With this Test Double, we can capture the output calls (the Indirect Output) of our SUT for later verification in our test.

Most of the time, this is interesting if the SUT doesn’t return anything useful that we can use to verify whether the test was successful. We could just write the ACT statement without an ASSERT statement, and the test would automatically fail if any exception is thrown during the exercise of the SUT.

But that’s not a very explicit assertion AND (more importantly), if there are any changes to the SUT, we cannot fully verify that the change in behavior doesn’t break our software.

When developing a logging framework (for example), you will have a lot of these situations, because a log function doesn’t return anything (the log framework I came across didn’t). So, if we only get a void (in C#), how can we verify that our log message is written correctly?
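A sketch of a Test Spy for exactly that situation (the interface and class names are my assumptions): the spy records the indirect output of the void-returning call so the test can assert on it afterwards.

```csharp
using System.Collections.Generic;

public interface ILogSink { void Write(string line); }

// Test Spy: captures the indirect output for later verification.
public class SpyLogSink : ILogSink
{
    public List<string> Lines { get; } = new List<string>();
    public void Write(string line) => Lines.Add(line);
}

public class Logger
{
    private readonly ILogSink _sink;
    public Logger(ILogSink sink) => _sink = sink;

    // Returns void, so only the indirect output can be verified.
    public void Info(string message) => _sink.Write("INFO: " + message);
}
```

The assert-phase inspects `spy.Lines` instead of a (non-existent) return value.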

When working in an asynchronous environment, Test Spies can also be useful. Testing asynchronous code will always need some blocking system in place if we want to test Indirect Outputs – so a Test Spy is the ideal solution.

By hiding the blocking mechanism, we have a clear test: the test reader knows exactly what the purpose of the test is, what the SUT should do to make the test pass, and what the DOC (Depend-On Component) does in the background to make the right verification in our assert-phase.

All of this makes sure that we have a true test positive.

The time-out is (of course) context-specific – try to limit it to the very minimum; 5 seconds is a very long time for a single unit test to pass, but not for an integration test.

Mock Object

If we want to test Indirect Outputs right away and not at the end of the test run (like a Test Spy uses “is called” in the assert-phase); we can use a Mock Object.

There’s a subtle difference between a Mock Object and a Test Spy. A Spy will capture its observations so that they can be verified later (in the assert-phase), while a Mock will make the test fail the moment it encounters something that was not expected.

Of course, combinations can be made, but there’s something that I would like to warn you about the Mock Object. Actually, two somethings.

1) One must be careful what he/she mocks and what he/she exercises in the SUT. If we mock too much, or mock the wrong parts, how do we verify that our SUT will survive in the “real world”, where there aren’t any Mock Objects that return just the data the SUT expects?

2) One must be careful not to use the Mock Object for all his/her tests. That would result in Tight-Coupling between the test cases and the SUT. Especially when one uses mocking frameworks, they can be overused. Try to imagine that something must change in your SUT, and count how many Mock Objects you would have to change in order to get that change into your SUT.

Tests are there to help us, not to frustrate us.

These two items can also be applied to Test Stubs, for example if we specify too much information in our Indirect Input. The difference with a Mock is that we also validate the Indirect Output immediately, not in our assert-phase. Tight-Coupling and overusing any pattern is a bad practice in my opinion. So, always start with the smallest: can you use a Dummy? Then a Stub? Maybe we can just Spy that? Ok, now we can use a Mock.

Look at the following example: we have a function that transforms a given input to an output, but only after it asserted on the expected input. This is a good example of how we assert directly and give expected output for our SUT.
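The original code isn’t reproduced here, so here is a hedged reconstruction (interface and class names are my assumptions): the Mock verifies its expected input the moment it is called, not in the assert-phase, and only then returns the canned output.

```csharp
using System;

public interface ITransformer { string Transform(string input); }

// Hand-rolled Mock Object: fails fast on unexpected input.
public class MockTransformer : ITransformer
{
    private readonly string _expectedInput;
    private readonly string _cannedOutput;

    public MockTransformer(string expectedInput, string cannedOutput)
    {
        _expectedInput = expectedInput;
        _cannedOutput = cannedOutput;
    }

    public string Transform(string input)
    {
        // Verification happens here, inside the call itself.
        if (input != _expectedInput)
            throw new InvalidOperationException(
                $"Expected '{_expectedInput}' but got '{input}'");
        return _cannedOutput;
    }
}
```

Compare this to a Spy, which would silently record `input` and leave the comparison to the test's assert-phase.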

Fake Object

The last Test Double I would like to discuss is the Fake Object. This Test Double doesn’t always need to be configured. A “fake” is actually a full-implementation object that implements the functionality in such a way that the test can use it during the test run.

A perfect example is the in-memory datastore. We implement all of the datastore operations in memory, so we don’t need a fully configured datastore in our tests.
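A minimal sketch of such a Fake Object (the contract and its operations are my own assumptions): a complete in-memory implementation that any test can use unchanged, with no verification logic of its own.

```csharp
using System.Collections.Generic;

public interface IMessageStore
{
    void Save(int id, string message);
    string Load(int id);
    bool Delete(int id);
}

// Fake Object: a full implementation of the contract, entirely in memory.
public class InMemoryMessageStore : IMessageStore
{
    private readonly Dictionary<int, string> _data = new Dictionary<int, string>();

    public void Save(int id, string message) => _data[id] = message;

    public string Load(int id) => _data.TryGetValue(id, out var m) ? m : null;

    public bool Delete(int id) => _data.Remove(id);
}
```

Unlike a Stub or Mock, nothing here is configured per test; the fake simply behaves like a real (but fast) datastore.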

Yes, of course you must test the datastore connection, but with a Fake Object in place you can limit the tests that connect to the “real” database to a minimum and run all the other tests with a “fake”.

Your first reaction for external components should be to check if you can fake the whole external connection. Tests that use in-memory storage rather than the file system, datastore, network connectivity… will run a lot faster – and therefore will be run a lot more by the developers.

This type of Test Double is different from the others in that there is no verification in place. This type of object “just” replaces the whole implementation the SUT depends on.

Conclusion

Honestly, I think the reason why I wrote this blog post is that I always hear people talk about “mocks” instead of using the right words. Like Martin Fowler says in his blog post: “Mocks Aren’t Stubs”.

I know that in different environments people use different terms. A Pragmatic Programmer will use other words for some Test Doubles (or the same ones) than someone from the Extreme Programming background. But it is better to state what you mean with the right terminology than to call everything a “mock”.

What I also wanted to show is that a Test Double isn’t “bound” to Object-Oriented or Functional Programming. A Test Double is a concept (coined by Gerard Meszaros) that alters the behavior of your SUT in such a way that the test can verify the expected outcome.

It’s a concept, and concepts can be used everywhere.

Categories: Technology
written by: Stijn Moreels