wiki

Codit Wiki

Loading information... Please wait.

Codit Blog

Posted on Thursday, June 22, 2017 2:41 PM

Toon Vanhoutte by Toon Vanhoutte

In this blog post, I'll explain in depth the routing slip pattern and how you can leverage it within enterprise integration scenarios. As always I'll have a look at the benefits, but also the pitfalls will get some well-deserved attention.

The Pattern

Introduction

A routing slip is a configuration that specifies a sequence of processing steps (services). This routing slip must be attached to the message to be processed. Each service (processing step) is designed to receive the message, perform its functionality (based on the configuration) and invoke the next service. In that way, a message gets processed sequentially by multiple services, without the need of a coordinating component. The schema below is taken from Enterprise Integration Patterns.

Some examples of this pattern are:

Routing Slip

Routing slips can be configured in any language, JSON or XML are quite popular. An example of a simple routing slip can be found below. The header contains the name of the routing slip and a counter that carries the current step number. Each service is represented by a routing step. A step has its own name to identify the service to be invoked and has a specific key-value configuration pairs.

Remark that this is just one way to represent a routing slip. Feel free to add your personal flavor…

Assign Routing Slip

There are multiple ways to assign a routing slip to a message. Let's have a look:

  • External: the source system already attaches the routing slip to the message
  • Static: when a message is received, a fixed routing slip is attached to it
  • Dynamic: when a message is received, a routing slip is attached, based on some business logic
  • Scheduled: the integration layer has routing slips scheduled that contain also a command to retrieve a message

Service

A service is considered as a "step" within your routing slip. When defining a service, you need to design it to be generic. The executed logic within the service must be based on the configuration, if any is required. Ensure your service has a single responsibility and there's a clear boundary of its scope.

A service must consist of three steps:

  • Receive the message
  • Process the message, based on the routing slip configuration
  • Invoke the next service, based on the routing slip configuration

There are multiple ways to invoke services:

  • Synchronous: the next service is invoked without any persistence in between (e.g. in memory). This has the advantage that it will perform faster.
  • Asynchronous: the next service is invoked with persistence in between (e.g. a queue). This has the advantage that reliability increases, but performance degrades.

Think on the desired way to invoke services. If required, a combination of sync and async can be supported.

Advantages

Encourages reuse

Integrations are composed of reusable and configurable building blocks. The routing slip pattern forces you to analyze, develop and operate in a streamlined manner. Reuse is heavily encouraged on different levels: the way analysis is performed, how patterns are implemented, the way releases are rolled out and how operational tasks are performed. One unified way of working, built on reusability.

Configuration based

Your integration is completely driven by the assigned routing slip. There are no hard-coded links between components. This allows you to change its behavior without the need of a re-deployment. This configuration also serves as a great source of documentation, as it explains exactly what message exchanges are running on your middleware and what they exactly do.

Faster release cycles

Once you have set up a solid routing slip framework, you can increase your release cadence. By leveraging your catalogue of reusable services, you heavily benefit from previous development efforts. The focus is only on the specifics of a new message exchange, which are mostly data bound (e.g. mapping). There's also a tremendous increase of agility, when it comes to small changes. Just update the routing slip configuration and it has an immediate effect on your production workload.

Technology independent

A routing slip is agnostic to the underlying technology stack. The way the routing slip is interpreted, is of course specific to the technology used. This introduces ways to have a unified integration solution, even if it is composed of several different technologies. It enables also cross technology message exchanges. As an example, you can have an order that is received via an AS2 Logic App, being transformed and sent to an on premise BizTalk Server that inserts it into the mainframe, all governed by a single routing slip config.

Provides visibility

A routing slip can introduce more visibility into the message exchanges, for sure from an operational perspective. If a message encounters an issue, operations personnel can immediately consult the routing slip to see where the message comes from, what steps are already executed and where it is heading to. This visibility can be improved, by updating the routing slip with some extra historical information, such as the service start and end time. Why even not including an URL in the routing slip that points to a wiki page or knowledge base about that interface type?

Pitfalls

Not enough reusability

Not every integration project is well-suited to use the routing slip pattern. During analysis phase, it's important to identity the integration needs and to see if there are a lot of similarities between all message exchanges. When a high level of reusability is detected, the routing slip pattern might be a good fit. If all integrations are too heterogenous, you'll introduce more overhead than benefits.

Too complex logic

A common pitfall is adding too much complexity into the routing slip. Try to stick as much as possible to a sequential series of steps (services) that are executed. Some conditional decision logic inside a routing slip might be acceptable, but define clear boundaries for such logic. Do not start writing you own workflow engine, with its own workflow language. Keep the routing slip logic clean and simple, to stick to the purpose of a routing slip.

Limited control

In case of maintenance of the surrounding systems, you often need to stop a message flow. Let's take the scenario where you face the following requirement: "Do not send orders to SAP for the coming 2 hours". One option is to stop a message exchange at its source, e.g. stop receiving messages from an SFTP server. In case this is not accepted, as these orders are also sent to other systems that should not be impacted, things get more complicated. You can stop the generic service that sends a message to SAP, but then you also stop sending other message types… Think about this upfront!

Hard deployments

A very common pain-point of a high level of reuse, is the impact of upgrading a generic service that is used all over the place. There are different ways to reduce the risks of such upgrades, of which automated system testing is an important one. Within the routing slip, you can specify explicitly the version of a service you want to invoke. In that way, you can upgrade services gradually to the latest version, without the risk of a big bang deploy. Define a clear upgrade policy, to avoid that too many different versions of a service are running side-by-side.

Monitoring

A message exchange is spread across multiple loosely coupled service instances, which could impose a monitoring challenge. Many technologies offer great monitoring insights for a single service instance, but lack an overall view across multiple service instances. Introducing a correlation ID into your routing slip, can highly improve the monitoring experience. This ID can be generated the moment you initialize a routing slip.

Conclusion

Routing slips are a very powerful mechanism to deliver unified and robust integrations in a fast way. The main key take-aways of this blog are:

  • Analyze in depth if can benefit from the routing slip pattern
  • Limit the complexity that the routing slip resolves
  • Have explicit versioning of services inside the routing slip
  • Include a unique correlation ID into the routing slip
  • Add historical data to the routing slip

Hope this was a useful read!
Toon

 

Categories: Architecture
Tags: Design
written by: Toon Vanhoutte

Posted on Wednesday, February 1, 2017 3:00 PM

Toon Vanhoutte by Toon Vanhoutte

In many integration projects, the importance of idempotent receivers is overlooked. This blog post summarizes what idempotent receivers are, why we need them and how we can achieve it.

What?

Let's first have a closer look at the definition of idempotence, according to Wikipedia. "Idempotence is the property of certain operations in mathematics and computer science, that can be applied multiple times without changing the result beyond the initial application." The meaning of this definition is explained as: "a function is idempotent if, whenever it is applied twice to any value, it gives the same result as if it were applied once; i.e., ƒ(ƒ(x)) ≡ ƒ(x)".

If we apply this on integration, it means that a system is idempotent when it can process a specific message multiple times, while still retaining the same end-result. As a real-life example, an ERP system is idempotent if only one sales order is created, even if the CreateSalesOrder command message was submitted multiple times by the integration layer.

Why?

Often, customers request the integration layer to perform duplicate detection, so the receiving systems should not be idempotent. This statement is only partially true. Duplicate detection on the middleware layer can discard messages that are received more than once. However, even in case a message is only received once by the middleware, it may still end-up multiple times in the receiving system. Below you can find two examples of such edge cases.

Web service communication

Nowadays, integration leverages more and more the power of API's. API's are built on top of the HTTP protocol, which can cause issues due to its nature. Let's consider the following situations:

  1. In this case, all is fine. The service processed the request successfully and the client is aware of this.

  2. Here there is also no problem. The service failed processing the request and the client knows about it. The client will retry. Eventually the service will only process the message once.

  3. This is a dangerous situation in which client and service are misaligned on the status. The service successfully processed the message, however the HTTP 200 response never reached the client. The client times out and will retry. In this case the message is processed twice by the server, so idempotence might be needed.

Asynchronous queueing

In case a message queueing system is used, idempotency is required if the queue supports guaranteed at-least-once delivery. Let's take Azure Service Bus queues as an example. Service Bus queues support the PeekLock mode. When you peek a message from the queue, it becomes invisible for other receivers during a specific time window. You can explicitly remove the message from the queue, by executing a Complete command.

In the example below, the client peaks the message from the queue and sends it to the service. Server side processing goes fine and the client receives the confirmation from the service. However, the client is not able to complete the message because of an application crash or a network interference. In this case, the message will become visible again on the queue and will be presented a second time towards the service. As a consequence, idempotence might be required.

How?

The above scenarios showcase that duplicate data entries can be avoided most of the time, however in specific edge cases a message might be processed twice. Within the business context of your project, you need to determine if this is an issue. If 1 out of 1000 emails is sent twice, this is probably not a problem. If, however 1 out of 1000 sales orders are created twice, this can have a huge business impact. The problem can be resolved by implementing exactly-once delivery or by introducing idempotent receivers.

Exactly-once delivery

The options to achieve exactly-once delivery on a protocol level are rather limited. Exactly-once delivery is very difficult to achieve between systems of different technologies. Attempts to provide an interoperable exactly-once protocol, such as SOAP WS-ReliableMessaging, ended up very complex and often not interoperable in practice. In case the integration remains within the same technology stack, some alternative protocols can be considered. On a Windows platform, Microsoft Distributed Transaction Coordinator can ensure exactly-once delivery (or maybe better exactly-once processing). The BizTalk Server SQL adapter and the NServiceBus MSMQ and SQL transport are examples that leverage this transactional message processing.

On the application level, the integration layer could be made responsible to check first against the service if the message was already processed. If this turns out to be true, the message can be discarded; otherwise the message must be delivered to the target system. Be aware that this results in chatty integrations, which may influence performance in case of a high message throughput.

Idempotent receiver

Idempotence can be established within the message itself. A classic example to illustrate this is a financial transaction. A non-idempotent message contains a command to increase your bank balance with € 100. If this message gets processed twice, it's positive for you, but the bank won't like it. It's better to create a command message that states that the resulting bank balance must be € 12100.  This example clearly solves idempotence, but is not built for concurrent transactions.

An idempotent message is not always an option. In such cases the receiving application must take responsibility to ensure idempotence. This can be done by maintaining a list of message id's that have been processed already. If a message arrives with an id that is already on the list, it gets discarded. When it's not possible to have a message id within the message, you can keep a list of processed hash values. Another way of achieving idempotence is to set unique constraints on the Id of the data entity. Instead of pro-actively checking if a message was already processed, you can just give it a try and handle the unique constraint exception.

Lately, I see more and more SaaS providers that publish idempotent upsert services, which I can only encourage! The term upsert means that the service itself can determine whether it needs to perform an insert or an update of the data entity. Preferably, this upsert is performed based on a functional id (e.g. customer number) and not based on an internal GUID, as otherwise you do not benefit from the performance gains.

Conclusion

For each integration you set up, it's important to think about idempotence. That's why Codit has it by default on its Integration Analysis Checklist. Investigate the probability of duplicate data entries, based on the used protocols. If your business case needs to avoid this at all time, check whether the integration layer takes responsibility on this matter or if the receiving system provides idempotent service endpoints. The latter is mostly the best performing choice.

Do you have other ways to deal with this? Do not hesitate to share your experience via the comments section!

Thanks for reading!

Toon

Categories: Architecture
Tags: Design
written by: Toon Vanhoutte

Posted on Monday, November 21, 2016 4:11 PM

Stijn Moreels by Stijn Moreels

What drives software design? Several mechanisms are introduced over the years that explains what different approaches you can use or combine. Following article talks about just another technique: Responsible-Driven Design, and how I use it in my daily development work.

1    INTRODUCTION

“You cannot escape the responsibility of tomorrow by evading it today”
-Abraham Lincoln

Have you taken responsibility in your work lately? I find it frustrating that some people may not take responsibility in their software design and just writing software until it “works”. We all need to take responsibility in our daily work; software design/development work is no exception.

What drives software design? Several mechanisms are introduced over the years that explains what different approaches you can use or combine. Test-Driven Development talks about writing tests first before writing production code, Data-Driven Development talks about defining processing strategies in function of your data, Domain-Driven Design talks about solving a domain problem by using the vocabulary of the Ubiquitous Language with a high-level abstraction, Behavior-Driven Design is in short extension of Test-Driven Development with concepts of Domain-Driven Design…

Following article talks about just another technique: Responsible-Driven Design, and how I use it in my daily development work.

Rebecca Wirfs-Brock (founder of RDD) talks about some constructs and stereotypes that defines this approach and helps you build a system that takes responsibility at first place. I’m not going to give you a full description about her technique, though I’m going to give you a very quick view of how this can be used to think about your daily designing moments; at high-level and low-level software design.

A lot of programmers don’t think that they should be involved in designing the software, “That’s work for the Application Architect”. Their so wrong. This article is for developers, but also project managers and architects.

"Software Development is all design"
—Eric Evans

Your goal should not be to find THE solution at the beginning of your design process; it should be your goal to have a flexible design that you can refactor constantly to have THE solution at the end of your design process.

“Good programmers know they rarely write good code the first time”
—Martin Fowler

"The problem with software projects isn't change, per se, because change is going to happen; the problem, rather, is the inability to cope with change when it comes."
—Kent Beck

2    DEFINING RESPONSIBILITIES

“Understanding responsibilities is key to good object-oriented design”
—Martin Fowler

Defining responsibilities is crucial to have a good software design. Use Cases are a good starting point for defining responsibilities. These cases state some information of "What if… Then… and How" chains. Though, it isn't the task of use cases to define coordination or control of the software or the design; these tasks you must define yourself.
 
Rebecca describes some Roles to describe different types of responsible implementations to help you define software elements. A “Role” is some collection of related tasks that can be bundled to a single responsibility.

  • Service providers: designed to do things
  • Interfaces: translate requests and convert from one level of abstraction to another
  • Information holders: designed to know things
  • Controllers: designed to direct activities
  • Coordinators: designed to delegate work
  • Structurers: manage object relations or organize large numbers of similar objects

Some basic principles of responsibilities are the following: doing, knowing and deciding. Some element does something, another knows something and another decides what's next or what must be done.
This can help in the defining process of responsibilities. Mixing more than one of these principles in a single element is not a good sign of your design.

When structuring the requirements, and use cases, try to find the work that must be done, the information that must be known, the coordination/control activities, possible solutions to structure these elements…

It’s a good practice to define these responsibilities in the Software Design Document for each significant element. Rebecca talks about CRC Cards which list all the possible information one element knows, what work it must do…

3    ASSIGNING RESPONSIBILITIES 

Data-Driven Design talks about a centralized controlled system in its approach to have application logic in one place; this get quick complex though. Responsible-Driven Design is all about delegated control to assign responsibilities to elements and therefore each element can be reused very quickly because it only fulfills its own responsibilities and not ones from some other elements in the design. Distributing to much responsibilities can lead to weaker objects and collaboration/communication of objects.

When assigning responsibilities, there’s a lot of Principles/Patterns that helps me to reflect constantly on my design. Here are some I think about daily:

Keep information in one place:     
“Single Point of Truth Principle”: principle that states that each piece of information is stored exactly once.

Keep a responsibility small:     
“Law of Demeter - Principle of the least knowledge”: each element should have a limited piece of knowledge stored itself and have about other elements (see also Information Hiding and Information Expert).

Wrap related operations:     
“Whole Value Object” (for example): wrap related operations/information in an object on its own and give it a descriptive name.

Only use what you need:    
“Interface Segregation Principle”: each interface should only implement method which it needs.

Aligned responsibility:    
“Single Responsible Principle (a class should only have one reason to change)”: each part of your software should have a single responsibility and this responsibility should entirely be wrapped in this part.
 
And so, so many more…

4    CLASS RESPONSIBILITIES

When defining classes, the Single Responsibility Principle comes in mind. Each class should do only one predefined task/responsibility. A trick I use to keep me always aware of this responsibility, is to write above each class in comments what that class should do and only do.
 
If you find yourself writing words like: "and", "or", "but", "except"… you're probably trying to do more than just one thing. It's also very useful when adding new code to check if it's still in the class its responsibility; if not, rethink your design to find the exact spot to where to put your new code. Also, try to speak in one sentence about responsibility.
 
Classes which names contains: "manager", "info", "process"… is also an indication that you're doing something more than it should in your class. It could also mean that you named the class with such a general name because otherwise it will not state the class its responsibility.

In that way, you have a "nice" Anti-Pattern in place which I like to call: Responsibility-Hiding Anti-Pattern. Hiding of responsibilities should (of course) be eliminated, it only obscures the design and is a weak excuse for a bad design. A class named “ElementProcessor” for example is a perfect example; “What’s happening in the Process-part?”. It feels the class has some black magic in it and you can call the class to say: “Do the magic!”. Use strong descriptions when defining responsibilities for your class, if you can't do that, refactor your class until you can.

One of the best reasons to create classes is Information Hiding. If a subset of methods in a class uses a subset of information. Then that part could/must be refactored into a new class. This way we have not only Hide the information, we have refactored the data and logic so it’s on the same Rate of Change.

Can you spot following misplaced responsibility?
 


Is it the job of the File Attachment Uploader class to know what extension the Attachment should need? What if I need a FTP Attachment Uploader?

For people who wants more read: assigning to many responsibilities is actually related to the Anti-Patterns from Brown: Software Development Anti-Pattern: Blob/God-Class Anti-Pattern which talks about a single class that contains a bunch information, has some logic exposed… a class that contains SO MANY RESPONSIBILITIES; and the Software Architecture Anti-Pattern: Swiss Army Knife Anti-Pattern which talks about a complex interface (so multiple implementations) that has multiple RESPONSIBILITIES and is used in to many software problems (as a Swiss Army Knife) as solution.

5    FUNCTION RESPONSIBILITIES

I told you that I get very deep, now I'm going to talk about just a function. What of responsibility has a function for example. The name of a function should always be a verb for a start. Something to do.

Naming is key in good communication; which follows also another Principle: Principle of Least Surprise. Only to look at some name you should have a clue what’s been done in that function/class/package… so you don’t get surprised.

A function should (definitely) only do one thing. Robert C. Martin talks about different ways to spot if a function does more than one thing: if you can extract a part of the function and give a meaningful name that doesn’t restate the original function name; you’re doing more than one thing; and so, have multiple responsibilities.

 Looking back at the previous example (after refactoring of the “GetExtension” method). Does “Upload Attachment” one thing? No, it gets first the exact path from attachment related information.


Of course, this example and still be refactored, and please do. Also, note that the extracted function called “GetAttachmentLocation” and not “GetLocation”. Because we added the “Attachment” part. The function logically gets an attachment as argument from the “UploadAttachment” function.

Try to always use descriptive names in your functions and say what you do in the function, not HOW you do it. Name your function after its responsibility. If we named the function “GetLocation” there would be not logically explanation why we would send an Attachment with it because it isn’t in the function its responsibility.


After again a refactoring session, we could see that there’s maybe a missing concept. Why doesn’t have the Attachment a location as information? Also, I don’t like to see if a function has one or multiple parameters AND a return value. It violates the Query-Command Separation Principle.


Note that we also extracted the function with the File Stream related information. We did this so each function is on the same level of abstraction. Each function should do as the name says it does, no surprises. And every time you go to the next implementation, you should get deeper, more concrete and less abstract. This way each function exists on one and only one layer of abstraction. “UploadAttachment” is on a higher layer of abstraction that “AssignAttachmentLocation” and “SaveAttachmentToFileSystem”.

It’s the responsibility of the “UploadAttachment” function to delegate to the two functions and not to try do something concrete itself.

Now, we could still refactor this because there isn’t any exception handling functionality for example. Just like refactoring is an iterative process, so is software designing. Please constantly refactor your code to a better approach.

I rename my classes/functions/… daily for example. I then look at it from a distance and think: “does this really explains what I want to say?”, “Is there any better approach?”, “Is it the responsibility of this variable to know this?”, “Does this function only its responsibility?” …

6    DOCUMENTATION RESPONSIBILITIES

One of the reasons documentation exist, is to define responsibilities. Many developers don’t see the purpose of defining a software design documentation, because it's extra work and gets out of date fast. But also, because it explains something they already know (or think they know). In this document, you describe what each element/package's responsibility is, why you named something like that…
 
Most of the time, I define for each layer in my design (see Layered Architecture) a separated block of documentation. Each title starts with the same explanation: Purpose. What's the reason this package exists? After that a Description section is placed to describe some common information and responsibility the layer has. Next, is the UML Schema/Interaction Diagram section where schemas are placed to have some technical description how object are connected (UML) and collaborate (Interaction) with each other.

Eric Evans state that we should name our layers not only by their technical term like: Infrastructure Layer, Domain Layer, Application Layer, Presentation Layer… but also by their Domain Name. What Role plays this layer in your software design?

So, to summarize:

  • Title
  • Purpose
  • Description
  • UML Schema
  • Interaction Diagram

Just like writing Tests make you think about dependencies and help you to rework your design to a better approach; helps writing high-level software documentation me to get the purpose of each layer in the design. While typing purposes/descriptions/responsibilities… in my documentation, I somethings stops and thinks: “Didn’t I just wrote a piece of code that doesn’t fall in this layer responsibility?”.

Interaction Diagrams are less popular than UML but describes the actual flow your design is following. You quick find a spot in your diagrams where there are just too many arrows. This also helps you think about coordination and control of your design. Please do not underestimate this kind of approach, it helped me to get the design in its whole form.

If your team plans a weekly code freeze, this could be an ideally time to update the documentation and schema’s. This not only keeps track of your changes and responsibilities, it also helps to introduce the design to new members of the team.

This kind of approach helps me to move elements through layers to find a right home.

7    CONCLUSION

When writing every word of code, think about what you're doing. It is the right place to put this here? Is it my job to know that kind of information? Do I have the responsibility to do this? Why does this element have to make that decision? …

Every word, every line, every class, every package… has a responsibility and a purpose to exist. If you can't say way in a strong description why this class knows something, why a function has an argument, why a package is named that way… that you should probably put on your refactoring-head.
 
@Codit we aren’t satisfied just because our code works. The first obvious step in programming is that your code works; only then the real work begins…

Think about what you're doing and take responsibility.

Categories: Architecture
Tags: Design
written by: Stijn Moreels

Posted on Tuesday, June 6, 2017 5:00 PM

Stijn Moreels by Stijn Moreels

Several things became clear to me when studying CI. One of things is that everything is based on the principle of automation. The moment when you start thinking about “I can’t automate this”: that’s the moment when you should ask yourself if that is really the case.

Introduction

Before I read the book about Continuous Integration by Paul Duvall, Stephen M. Matyas III, and Andrew Glover, I thought that CI meant that we just create a deployment pipeline in which we can easily/automate deployment of our software. That and the fact that developers integrate continuously with each other.

I’m not saying that it’s a wrong definition, I’m saying that it might be too narrow for what it really is.

Thank you, Paul, Stephen, and Andrew, for the inspiring book and the motivation to write this post.

Automation

Several things became clear to me when studying CI. One of these things is that everything is based on the principle of automation. The moment when you start thinking about “I can’t automate this” that’s the moment when you should ask yourself if that is really the case.

CI is all about automation. We automate the Compilation (different environments, configurations), Testing (unit, component, integration, acceptance…), Inspection (coverage, complexity, technical debt, maintainability index…), Documentation (code architecture, user manual, schemas…).

We automate the build that runs all these steps, we automate the feedback we get from it, …

You can almost automate everything. Things that you can’t automate are for example Manual Testing. The reason is that the definition of manual testing is that you let a human test your software. You let the human decide what to test. You can in fact automate the environment in which this human must test the software, but not the testing itself (otherwise it wouldn’t be called “manual” testing).

That’s what most intrigued me when studying CI - the automation. It makes you think of all those manual steps you must take to get your work done. All those tiny little steps that by itself aren’t meaning much but are a big waste if you see them all together.

If you always must build your software locally before committing, could we than just place the commit commands at the end of our build script?

Building

It’s kind of funny when people talk about “building” software. When some people say: “I can’t build the software anymore”; don’t always mean “build”; they mean “compile”. In the context of Continuous Integration, the “compile” step is only the first step of the pipeline but it’s sometimes the most important step to people. Many think of it as:

“If it compiles == it works”

When you check out some code and the Build fails (build, not compilation); that could mean several things: failed Unit Tests, missing Code Coverage, maximum Cyclometric Complexity, … but also a compilation failure.

In the next paragraphs, when I talk about a “build” I’m talking in the context of CI and don’t mean “compile”.

Continuous Building Software

Is your build automated?
Are your builds under 10 minutes?
Are you placing the tasks that will most likely to fail at the beginning of your build?
How often do you run your integration builds? Daily? Weekly? At every change (continuously)?

  • Every developer should have the ability to run (on demand) a Private Build on his or her machine.
  • Ever project should have the ability to run (on demand, polled, event-driven) an Integration Build that include slower tasks (integration/component tests, performance/load tests…),
  • Every project should have the ability to run (on demand, scheduled) a Release Build to create deployable software (typically at the end of the iteration), but must include the acceptance tests.

There are tremendous build script tools available to automate these kinds of things. NAnt, Psake, FAKE, Cake… are a few (I use FAKE).

Continuous Preventing Development/Testing

Are your tests automated?
Are you writing a test for every defect?
How many asserts per test? Limit to one?
Do you categorize your tests?

“Drive to fix the defect and prevent from reoccurring”

Many other posts discus the Test-First and Test-Driven mindset and the reason behind that; so, I will not discuss this here. What I will discuss is the reaction people have on a failing test from your build.

A failed build should trigger a “Stop the presses” event within the team. Everyone should be concerned about the failure and should help each other to make the build succeed again as quickly as possible. Fixing a failed build should be the responsible of the team and not (only) the person that broke the build.

But what do you do when the build failed? What reaction should you have?

First, write a test that exposes the defect by writing a test that passes. When that new test passes, you have proven the defect and can start fixing it. Note that we don’t write a failed test!

There are three reasons why you should write a test that passes for a defect (we’re using Test-Driven Development, right?):

  1. It’s difficult to write a failing test that uses the assertion correctly because the assertion may not be added when the test doesn’t fail anymore which means you don’t have a test that passes but a test that’s just not failing.
  2. You’re guessing what the fix should alter in behavior == assumption.
  3. If you have to fix the code being tested, you have a failing test that works but one that doesn’t verify the behavioral change.

To end the part of testing, let me be clear on some points that many developers fail to grasp: the different software tests. I have encountered several definitions of the tests so I merge them here for you. I think the most important part is that you test all these kind of aspects and not the part if you choose to call your Acceptance Tests, or Functional Tests:

  • Unit Tests: testing the smallest possible “units” of code with no external dependencies (including file system, database…), written by programmers - for programmers, specify the software at the lowest level…
    Michael Feathers has some Unit Test Rulz that specify whether a test can be seen as a Unit Test.
  • Component Tests encapsulate business rules (could include external dependencies), …
  • Integration Tests don’t encapsulate business rules (could include external dependencies), tests how components work together, Plumbing Tests, testing architectural structure, …
  • Acceptance Tests (or Functional Tests) written by business people, define the definition of “done”, purpose to give clarity, communication, and precision, test the software as the client expects it, (Given > When > Then structure), …
  • System Tests test the entire system, could sometimes overlap with the acceptance tests, test the system in a developer perspective…

Continuous Inspection

Can you show the current amount of code complexity?
Performing automated design reviews?
Monitoring code duplication?
Current code coverage?
Produce inspection reports?

It wouldn’t surprise you that Code Inspection is maybe not the most “sexy” part of software development (is Code Testing sexy?). But nonetheless it’s a very important part of the build.

Try to ask some projects what their current Code Coverage is, Maintainability Index? Technical Debt? Duplication? Complexity?...

Al those elements are so easily automated but so little teams adopt this mindset of Continuous Inspection. These elements are a certain starting point:

Continuous Deployment

Can you rollback a release?
Are you labelling your builds?
Deploy software with a single command?
Deploy with different environments (configuration)?
How do you handle fixes after deployment?

At the end of the pipeline (in a Release Build), you could trigger the deployment of the project. Yes, you should include the Acceptance Tests in here because this is the last step before the actual deployment.

The deployment itself should be done with one “Push on the Button”; as simple as that. In Agile projects, the deployment of the software is already done at the very beginning of the project. This means that the software is placed at the known deployment target as quickly as possible.

That way the team get as quickly as possible feedback of how the software act in “the real world”.

Continuous Feedback

When you deploy, build, test, … something, wouldn’t you want to know as quickly as possible what happened? I certainly do.

One of the first things I always do when starting a project is checking if I (and the team) gets the right notifications. As a developer, I want to know as quickly as possible when a build succeeds/failed. As an architect, you want to know what the current documentation of the code base is and what the code looks like in schemas, as project manager you may want to know if the acceptance tests where succeeded so the clients get what he/she wants…

Each function has its own responsibilities and its own reason to want feedback on things. You should be able to give them this feedback!

I use Catlight for my build feedback, work item tracking, release status... This tool will maybe in the future support pull request notifications too.

Some development teams have an actual big colorful lamp that indicate the current build status. Red = Failed, Green = Successful and Yellow = Investigating. Some lamps go more lighter/darker red if the build states in a "failed" state for too long.

Conclusion

Don’t call this a full-CI summary because it is certainly not. See this as a quick introduction of how CI can be implemented in a software project with the high-level actions in place and what you can improve in your project automation process. My motto is that anything can be improved and so, be more automated.

I would also suggest you read the book I talked about and/or check the site of Thought Works for more information on the recent developments in the CI community.

Start integrating your software to develop software with lesser risk and higher quality. Make it as automated that you just must “Push the Button”The Integrate Button.

Categories: Technology
written by: Stijn Moreels

Posted on Thursday, June 1, 2017 3:30 PM

Stijn Moreels by Stijn Moreels

One of the mind-blowing development techniques that radically changed the programming world is the Test-Driven Development approach introduced by Kent Beck.

But... writing tests before we start coding? Who will do that?

Introduction

One of the mind-blowing development techniques that radically changed the programming world; is the Test-Driven Development.

Writing tests before we start coding? Who will do that?

I must admit that I personally wasn’t really convinced by the idea; maybe because I didn’t quite understand the reason we should write our tests first and the way we should do it. Can you have a bad Software Design with TDD? Can you break your Architecture with TDD? Yes! TDD is a discipline that you should be following. And like any discipline you must hold to a certain number of requirements. By the end of the day, it’s YOUR task to follow this simple mindset.

In this article, I will talk about the Mindset introduced by Kent Beck when writing in a Test-Driven Development environment.

Too many developers don’t see the added-value about this technique and/or don’t believe it works.

TDD works!

"Testing is not the point; the point is about Responsibility"
-Kent Beck

Benefits

Because so many of us don’t see the benefits of TDD, I thought it would make sense to specify them for you. Robert C. Martin has inspired me with this list of benefits.

Increased Certainty

One of the benefits is that you’re certain that it works. Users have more responsibility in a Test-Driven team; because they will write the Acceptance Tests (with help of course), and they will define what the system must do. By doing so, you’re certain that what you write is what the customer wants.

The amount of uncertainty that builds up by writing code that isn’t exactly what the customer wants is called: The Uncertainty Principle. We must always eliminate this uncertainty.

By writing tests first; you can tell your manager and customer: “Yes, it will work; yes, it’s what you want”.

Defect Reduction

Before I write in a Test-First mindset; I always thought that my code was full of bugs and doesn’t handle unexpected behavior.
Maybe it was because I’m very certain of myself; but also, because I wrote the tests after the code and was testing what a just wrote; not what I want to test.

This increases the Fake Coverage of your code.

Increased Courage

So many developers are “afraid” to change something in their code base. They are afraid to break something. Why are they afraid? Because they don’t have tests!

“When programmers lose the fear of cleaning; they clean”
- Robert C. Martin

A professional developer doesn’t allow that his/her code rots; so, you must refactor with courage.

In-Sync Documentation

Tests are the lowest form of documentation of your code base; always 100% in sync with the current implementation in the production code.

Simple Design

TDD is an analysis/design technique and not necessary a development technique. Tests force you to think about good design. Certainly, if you write them BEFORE you write the actual implementation. If you do so, you’re writing them in offence and not in defense (when you’re writing them afterwards).

Test-First also helps you think about the Simplest thing that could possibly work which automatically helps you to write simple structured designed code.

Test-First Mindset

When you’re introduced into the Test-First methodology, people often get Test Infected. The amount of stress that’s taking from you is remarkable. You refactor more aggressively your code without any fear that you might break something.

Test-Driven Development is based on this very simple idea to first write your test, and only then write your production code. People underestimate the part “first write your test”. When you writing your tests, you’re solving more problems than you think.

Where should I place this code? Who’s responsible for this logic? What names should I use for my methods, classes…? What result must a get from this? What isn’t valid data? How will my class interface look like? …

After trying to use TDD in my daily practice, if found myself always asking the same questio:

“I would like to have a … with … and …”

Such a simple idea changed my vision so radically about development and I’m convinced that by using this technique, you’re writing simpler code because you always think about:

“What’s the simplest thing that could make this test work”

If you find that you can implement something that isn’t the right implementation, write another test to expose this behavior you want to implement.

TDD is - in a way - a physiological methodology. What they say is true: you DO get addicted to that nice green bar that indicate that you’re tests all pass. You want that color as green as possible, you want it always green, you want it run as fast as possible so you can quickly see it’s green…

To be a Green-Bar-Addict is a nice thing.

Kent Beck Test-Patterns

It felt a little weird to just state all the patterns Kent Beck introduced. Maybe you should just read the book Test-Driven Development by Example; he’s a very nice writer and I learned a lot from the examples, patterns and ideas.

What I will do, is give you some basic patterns I will use later in the example and some patterns that we very eye-opening for me the first time.

Fake It

When Kent talked about “What’s the simplest thing that could work”, I was thinking about my implementation but what he meant was “What’s the simplest thing that could work for this test”.

If you’re testing that 2 x 3 is 6 than when you implement it, you should Fake It and just return the 6.

Very weird at first, especially because the whole Fake It approach is based on duplication; the root of all software evil. Maybe that’s the reason experienced software engineers are having problems with this approach.

But it’s a very powerful approach. Using this technique, you can quickly get the bar green (testing bar). And the quicker you get that bar green, the better. And if that means you must fake something; then you should do that.

Triangulation

This technique I found very interesting. This approach really drives the abstraction of your design. When you find yourself not knowing what to do next, or how you should go further with your refactoring; write another test to support new knowledge of the system and the start of new refactorings in your design.

Especially when you’re unsure what to do next. 

If you’re testing that 2 x 3 is 6 than in a Triangulation approach you will first return 6 and only change that if you’re testing again but then for 2 x 2 is 4.

Obvious Implementation

Of course: when the implementation is so simple, so obvious, … Than you could always implement it directly after your test. But remember that this approach is only the second option after Fake It and Triangulation.

When you find yourself taking steps that are too big, you can always take smaller steps.

If you’re testing that 2 x 3 is 6, in an Obvious Implementation approach you will just write 2 x 3 right away.

By Example

I thought it would be useful to show you some example of the TDD workflow. Since everyone is so stoked about test-driving Fibonacci I thought it would be fun to test-drive another integer sequence.

Let’s test-drive the Factorial Sequence!

What happens when we factorial 4 for example? 4! = 4 x 3 x 2 x 1 = 24

Test

But let’s start with something super simple:

Always start with the same sentence: “I would like to have a… “. I would like to have a method called Factorial which I could use to send an integer with that will calculate the factorial integer for me.

Now we have created a test before anything about factorial is implemented.

Compile

Now we have the test, let’s start now by making our code compile again.

Let’s test this:

Hooray! We have a failed test == progress!

Implement

First Steps

What’s the simplest thing that we could write in order that this test will run?

Hooray! Our test passed, we can go home, right?

A Bit Harder

What’s next? Let’s check. What happens if we would test for another value?

I know, I know. Duplication, duplication, duplication. But were testing now right, not yet in the last step of the TDD mantra.

What is the simplest we could change to make this test pass?

Yes, I’m feeling good right now. A nice green bar.

One Step Before Generalize

Let’s add just another test, a bit harder this time. But these duplication is starting to irritate me; you know the mantra: One-Two-Three-Refactor? This is the third time, so let’s start refactoring!

Ok, what’s the simplest thing?

Generalize

Ok, we could add if/else-statements all day long, but I think it’s time to some generalization. Look at what we’ve now been implementing. We write 24 but do we mean 24?

Remembering Factorial, we mean something else:

All still works, yeah. Now, we don’t actually mean 4 by 4 do we. We actually mean the original number:

And we don’t actually mean 3, 2, and 1 by 3, 2 and 1, we actually mean the original number each time mins one. So, actually the Factorial of the 3! could you say, no?

Let’s try:

Wow, still works.Wait, isn’t that if-statement redundant? 2 x 2! == 2 right?

Exploration

Now, the factorial of 0, is also 1. We haven’t tested that haven’t we? We have found a boundary condition!

This will result in a endless loop because we will try to factorial an negative number; and since factorial only happens with positieve numbers (because the formula with negative integers will result in a division by zero and so, blocking us for calculating a factorial value for these negative integers).

Again, simplest thing that could work?

Now, the last step of TDD is always Remove Duplication which in this case is the 1 that’s used two times. Let’s take care of that:

Hmm, someone may have noticed something. We could actually remove the other if-statement with checking for 1 if we adapt the check of 0. This will return 1 for us in the recursive call:

By doing this, we also have ruled out all the other negative numbers passed inside the method.

Conclusion

Why oh why are people so skeptic about Test-Driven Development. If you look at the way you use it in your daily practice, you find yourself writing simpler and more robust code.

TDD is actually a Design Methodology and not a Development Methodology. The way you think about the design, the names, the structure… all that is part of the design process of your project. The tests that you have is the added value of this approach and makes sure that you can refactor safely and are always certain of your software.

Start trying today in your daily practice so you stop thinking about: How will you implement it? but rather:

How will you test it?

 

Categories: Technology
written by: Stijn Moreels