Codit Blog

Posted on Monday, August 21, 2017 10:47 AM

by Tom Kerkhove

Azure Event Grid is here. In this first article we'll have a look at what it is, dive into the details and discuss the new scenarios it enables.

Last week Microsoft announced Azure Event Grid (Preview), an event-driven service that allows you to stitch together all your components and design event-driven architectures.

Next to the built-in support for several Azure services, you can also provide your own custom topics and custom webhooks that fit your needs.

By using a combination of filters and multicasting, you can create a flexible event routing mechanism, for example sending event A to a single handler while event B is multicast to multiple handlers. Read more about this here.

Azure resources can act as Event Publishers where they send a variety of events to Event Grid. By using Event Subscriptions you can then subscribe to those events and send them to an Event Handler.

The main scenarios for Azure Event Grid are serverless architectures, automation for IT/operations and integration:

  • Serverless Architectures - Trigger a Logic App when a new blob is uploaded
  • Operations - Listen & react on what happens in your subscription by subscribing to Azure Subscription changes
  • Integration - Extend existing workflows by triggering a Logic App once there is a new record in your database
  • Custom - Create your own by using application topics (aka custom topics)

The pricing for Azure Event Grid is fairly simple: you pay $0.60 per million operations, and the first 100k operations per month are free. Operations are defined as event ingress, advanced matches, delivery attempts, and management calls. Currently you only pay $0.30 per million since the service is in public preview; more information is on the pricing page.
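
As a quick worked example: a workload generating 5 million operations in a month gets the first 100k for free and pays $0.60 per million for the remaining 4.9 million, roughly $2.94 at the regular rate (about $1.47 at the preview rate).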

Basically you can see Azure Event Grid as an extension service that allows you to integrate Azure Services with each other more closely while you also have the flexibility to plug in your own custom topics.

Let's have a closer look at what it has to offer.

Diving into Azure Event Grid

Event Handling at Scale

Azure Event Grid is designed as a highly scalable eventing backplane that comes with some serious performance targets:

  • Guaranteed sub-second end-to-end latency (99th percentile)
  • 99.99% availability
  • 10 million events per second, per region
  • 100 million subscriptions per region
  • 50 ms publisher latency for batches of 1M

These are very big numbers, which also indirectly impact the way we design our custom event handlers: they will need to be scalable, protect themselves from being overwhelmed, and come with a throttling mechanism.

But then again, designing for the cloud typically means that each component should be highly scalable & resilient so this should not be an exception.

Durable Message Delivery

Every event will be pushed to the required Event Handler based on the configured routing. For this, Azure Event Grid provides durable message delivery with at-least-once semantics.

By using retries with exponential backoff, Event Grid keeps on sending events to the Event Handler until it acknowledges the request with either an HTTP 200 OK or HTTP 202 Accepted.

The Event Handler needs to be capable of processing the event in less than one minute, otherwise Event Grid considers the delivery failed and will retry it. This means that all Event Handlers should be idempotent to avoid creating invalid state in your system.

However, if your Event Handler is unable to process the event in time and Event Grid has been retrying for up to 24 hours (2 hours in public preview), it will expire the event and stop retrying.

In summary, Event Grid guarantees at-least-once delivery for all your events, but as the owner of an Event Handler you are still in charge of processing each event in time. This also means that handlers should be able to preserve performance when dealing with load spikes.
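
To make this concrete, here is a minimal sketch of an idempotent Event Handler that acknowledges deliveries quickly, assuming an ASP.NET Core Web API host and a hypothetical IProcessedEventStore used for de-duplication (neither is prescribed by Event Grid itself):

    using System;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Mvc;

    // Minimal shape of an Event Grid event; only the fields used here are included.
    public class EventGridEvent
    {
        public string Id { get; set; }
        public string EventType { get; set; }
        public string Subject { get; set; }
        public DateTime EventTime { get; set; }
        public object Data { get; set; }
    }

    // Hypothetical store that remembers which event ids were already handled.
    public interface IProcessedEventStore
    {
        Task<bool> ExistsAsync(string eventId);
        Task MarkProcessedAsync(string eventId);
    }

    [ApiController]
    [Route("api/events")]
    public class EventGridHandlerController : ControllerBase
    {
        private readonly IProcessedEventStore _processedEvents;

        public EventGridHandlerController(IProcessedEventStore processedEvents)
        {
            _processedEvents = processedEvents;
        }

        [HttpPost]
        public async Task<IActionResult> Handle([FromBody] EventGridEvent[] events)
        {
            foreach (var gridEvent in events)
            {
                // At-least-once delivery: the same event id can arrive more than once,
                // so skip anything we have already processed.
                if (await _processedEvents.ExistsAsync(gridEvent.Id))
                {
                    continue;
                }

                // Keep the actual work short (or offload it) so we can acknowledge
                // well within the one-minute window.
                await ProcessAsync(gridEvent);
                await _processedEvents.MarkProcessedAsync(gridEvent.Id);
            }

            // HTTP 200 or 202 tells Event Grid the delivery succeeded.
            return Accepted();
        }

        private Task ProcessAsync(EventGridEvent gridEvent) => Task.CompletedTask; // placeholder
    }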

It is also interesting to see what really happens with expired events. Do they really just go away, or will there be a fallback event stream to which they are forwarded for later processing? In general I think expiration of events will work, but in certain scenarios a fallback event stream would be a valuable asset for mission-critical event-driven flows.

You can read more on durable message delivery here.

How about security?

Azure Event Grid offers a variety of security controls on all levels:

  • Managing security on the Event Grid resource itself is done with Role-based Access Control (RBAC). It allows you to grant granular permissions to the right people. It's a good practice to apply the least-privilege principle, but that is applicable to all Azure resources. More information here.
  • Webhook Validation - Each newly registered webhook needs to be validated by Azure Event Grid first. This is to prove that you have ownership over the endpoint. The service will send a validation token to the webhook, which the webhook implementer needs to send back as a validation. It's important to note that only HTTPS webhooks are supported. More information here.
  • Event Subscriptions use Role-based Access Control (RBAC) on the Event Grid resource: the person creating a new subscription needs the Microsoft.EventGrid/EventSubscriptions/Write permission.
  • Publishers need to use SAS Tokens or key authentication when they want to publish an event to a topic. SAS tokens allow you to scope the access you grant to a certain resource in Event Grid for a certain amount of time. This is similar to the approach Azure Storage & Azure Service Bus use.
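
To illustrate the key-based approach, here is a minimal sketch that publishes a custom event to a custom topic with an HttpClient, assuming Json.NET for serialization; the event type, payload and topic name are made up for the example:

    using System;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;
    using Newtonsoft.Json;

    public static class EventGridPublisher
    {
        // topicEndpoint is the topic's endpoint URL as shown in the portal.
        public static async Task PublishOrderCreatedAsync(string topicEndpoint, string topicKey)
        {
            // A custom topic expects an array of events in the Event Grid event schema.
            var events = new[]
            {
                new
                {
                    id = Guid.NewGuid().ToString(),
                    eventType = "Codit.Demo.Orders.OrderCreated", // made-up event type
                    subject = "orders/1234",
                    eventTime = DateTime.UtcNow,
                    data = new { OrderId = 1234, Amount = 42.50 },
                    dataVersion = "1.0"
                }
            };

            using (var client = new HttpClient())
            {
                // Key authentication is passed through the aeg-sas-key header.
                client.DefaultRequestHeaders.Add("aeg-sas-key", topicKey);

                var content = new StringContent(JsonConvert.SerializeObject(events), Encoding.UTF8, "application/json");
                var response = await client.PostAsync(topicEndpoint, content);
                response.EnsureSuccessStatusCode();
            }
        }
    }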

The current security model looks fine to me, although it would be nice to have a concept of SAS tokens with a stored access policy, similar to Azure Storage. This would allow us to issue tokens for a certain entity while still having the capability to revoke access when we need to, e.g. when a token is compromised.

An alternative to SAS stored access policies would be the ability to create multiple authorization rules, similar to Azure Service Bus, where we can use the key approach for authentication while still having more granular control over who uses which key, and being able to revoke it for one publisher only instead of revoking it for all publishers.

You can read more on security & authentication here.

Imagine the possibilities

Integration with other Azure services

As of today only a few Azure services integrate with Azure Event Grid, but a lot more are coming.

Here are a couple of them that I would love to use:

  • Use API Management as a public-facing endpoint where all events are transformed and sent over to Azure Event Grid. This would allow us to use API Management as a webhook proxy between the 3rd party and Azure Event Grid. More on this later in the post
  • Streamlined event processing for Application Insights custom events where it acts as an Event Publisher. By doing this we can push them to our data store so that we can use it in our Power BI reporting, instead of having to export all telemetry and setting up a processing pipeline for that, as described here
  • Real-time auditing & change notifications for Azure Key Vault
    • Publish events when a new version of a Key or Secret was added to notify dependent processes about this so they can fetch the latest version
    • Real-time auditing by subscribing to changes on the access policies
  • Sending events when alerts in Azure Monitor are triggered would be very useful. In the past I've written about how using webhooks for processing alerts, instead of emails, is more interesting because you can trigger an automation workflow such as Logic Apps. If an alert were to send an event to Azure Event Grid, we could take it a step further and create dedicated handlers per alert or alert group. You can already achieve this today with Logic Apps & Service Bus Topics, but with Event Grid this comes out of the box and makes it easier to create certain routings
  • Trigger an Azure Data Factory pipeline when an event occurs, e.g. when a blob is added to an Azure Storage container
  • Send an event when Azure Traffic Manager detects a probe that is unhealthy

New way of handling webhook events?

When we want to allow 3rd parties to send notifications to a webhook, we need to provide a public endpoint they can call. Typically, these endpoints just take the event and queue it for later processing, allowing the 3rd party to move on while we handle the event at our own pace.

The "problem" here is that we still need to host an API middleware somewhere; be it an Azure Function, Web App, Api App, etc; that just handles this message. Even if you use Azure API Management, you still need to have the middleware running behind the API Management proxy since you can't push directly to a topic.

Wouldn't it be nice if we could get rid of that host and let API Management push the requests directly to Azure Event Grid, so that it can fan out all the external notifications to the required processors?

That said, this assumes that you don't do any validation or other business logic before the webhook middleware pushes to the topic for processing. If you need this capability, you will have to stick with hosting your own middleware I'm afraid.

Unified integration between APIs

Currently, when you use webhooks inside your infrastructure, Event Publishers often call the webhooks directly, creating a spaghetti infrastructure. This is hard to manage, since each Event Publisher needs to contain the routing logic in its own component.

By routing all the events through Azure Event Grid we can use it as an event broker, or routing hub if you will, thus decoupling the Event Publishers from the corresponding Event Handlers.

By doing this we can easily change the way we route events to new Event Handlers by simply changing the routing, not the routing logic in the Event Publishers.

Depending on the monitoring Azure Event Grid will provide, it could also offer a more generic way to monitor all event handling, instead of relying on the monitoring of each individual component. More on this in my next blog.

You can of course also use Azure Service Bus Topics, but it all depends on the load you are expecting. As always, pick the technology that best fits the scenario.

Summary

Azure Event Grid is a unique service that has been added to Microsoft Azure and brings a lot to the table. It promises big performance targets and will enable new scenarios, certainly in the serverless landscape.

I'm curious to see how the service will evolve and what publishers & handlers will be coming soon. Personally, I think it's a big announcement and I will give some more thought to how we can use it when building platforms on Microsoft Azure.

Want to learn more yourself? Here's a good Cloud Cover episode that will give you a high-level overview of Azure Event Grid or read about the concepts of Event Grid. Tip: Follow our Azure Event Grid webinar on Tuesday 19 December to learn the ins and outs!

What features would you like to see being added to the service? In what scenarios do you see Event Grid as a good fit? Feel free to mention them in the comments!

Thanks for reading,

Tom Kerkhove.

Posted on Wednesday, August 16, 2017 11:05 AM

by Toon Vanhoutte

Accidental removal of important Azure resources is something to be avoided at all times, certainly if it occurs with stateful Azure services that host very important information. Common examples of such Azure services are Azure Storage Accounts and Azure Key Vaults. This blog post describes three steps to minimize the risk of such unwanted removals; additional measures might be needed, depending on your use case.

Prevent accidental removal

Role Based Access Control

Azure services offer very fine-grained access control configuration. You should take advantage of it in order to govern access to your Azure environment. Resources in your production environment should only be deployed by automated processes (e.g. VSTS). Assign the least required privileges to the developers, operators and administrators within your subscription. Read-only access in production should be the default. In case of incidents, you can always temporarily elevate permissions if needed. Read more about this subject here.

Naming Conventions

It's a good practice to include the environment in the name of your Azure resources. Via this simple naming convention, you create awareness of which environment one is working in. In combination with this naming convention, it's also advised to use separate subscriptions for production and non-production resources.

Resource Locks

An additional measure you can take is applying resource locks to important resources. A resource lock supports two different lock levels:

  • CanNotDelete means authorized users can still read and modify a resource, but they can't delete the resource.
  • ReadOnly means authorized users can read a resource, but they can't delete or update the resource. Applying this lock is similar to restricting all authorized users to the permissions granted by the Reader role.

These locks can only be created or removed by a Resource Owner or a User Access Administrator. This prevents, for example, Contributors from deleting the resource by accident.

Locks can be deployed through PowerShell, Azure CLI, REST API and ARM templates. This is an example of how to deploy a lock through an ARM template:
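
A minimal sketch of such a template, applying a CanNotDelete lock to an existing Storage Account whose name is passed in as a parameter (the lock name and API version are illustrative):

    {
      "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "storageAccountName": { "type": "string" }
      },
      "resources": [
        {
          "type": "Microsoft.Storage/storageAccounts/providers/locks",
          "apiVersion": "2016-09-01",
          "name": "[concat(parameters('storageAccountName'), '/Microsoft.Authorization/storageDoNotDelete')]",
          "properties": {
            "level": "CanNotDelete",
            "notes": "Prevents accidental deletion of this storage account."
          }
        }
      ]
    }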

Be aware that locks can also be configured on subscription and resource group level. When you apply a lock at a parent scope, all resources within that scope inherit the same lock. Even resources you add later inherit the lock from the parent.

Conclusion

The Microsoft Azure platform gives you all the access control tools you need, but it's your (or your service provider's) responsibility to use them the right way. One should be very cautious when giving / getting full control on a production environment, because we're all human and we all make mistakes.

Categories: Azure
Tags: Logic Apps
written by: Toon Vanhoutte

Posted on Tuesday, August 8, 2017 12:35 PM

by Toon Vanhoutte

Should we create a record or update it? What constraints define if a record already exists? These are typical questions we need to ask ourselves when analyzing a new interface. This blog post focuses on how we can deal with such update / insert (upsert) decisions in Logic Apps. Three popular Logic Apps connectors are investigated: the Common Data Service, File System and SQL Server connector.

Common Data Service Connector

This blog post by Tim Dutcher, Solutions Architect, was the trigger for writing about this subject. It describes a way to determine whether a record already exists in CDS, by using the "Get Record" action and deciding based on the returned HTTP code. I like the approach, but it has the downside that it's not 100% bulletproof: an HTTP code different from 200 doesn't always mean you received a 404 Not Found.

My suggested approach is to use the "Get List of Records" action, while defining an ODATA filter query (e.g. FullName eq '@{triggerBody()['AcountName']}'). In the condition, check if the result array of the query is empty or not: @equals(empty(body('Get_list_of_records')['value']), false). Based on the outcome of the condition, update or create a record.

File System Connector

The "Create file" action has no option to overwrite a file if it exists already. In such a scenario, the exception "A file with name 'Test.txt' already exists in the path 'Out' your file system, use the update operation if you want to replace it" is thrown.

To overcome this, we can use a similar approach as described above. Because the "List files in folder" action does not offer a filter option, we need to do this with an explicit filter action. Afterwards, we can check again if the resulting array is empty or not: @equals(empty(body('Filter_array')), false). Based on the outcome of the condition, update or create the file.

You can also achieve this in a quick and dirty way. It's not bulletproof and not clean, but perfect in case you want to create fast demos or test cases. The idea is to first try the "Create file" action and configure the next "Update file" action to run only if the previous action failed. Use it at your own risk :-)

SQL Server Connector

A similar approach with the "Get rows" action could also do the job here. However, if you manage the SQL database yourself, I suggest creating a stored procedure. This stored procedure can take care of the IF-ELSE decision server-side, which makes the operation idempotent.
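
A sketch of such a stored procedure in T-SQL; the dbo.Customers table, its columns and the functional key CustomerNumber are made up for the example:

    -- dbo.Customers and its columns are made up for this example.
    CREATE PROCEDURE dbo.UpsertCustomer
        @CustomerNumber NVARCHAR(50),
        @Name           NVARCHAR(200)
    AS
    BEGIN
        SET NOCOUNT ON;

        -- The IF-ELSE decision lives server-side, so calling it twice has the same effect as calling it once.
        IF EXISTS (SELECT 1 FROM dbo.Customers WHERE CustomerNumber = @CustomerNumber)
            UPDATE dbo.Customers
            SET    Name = @Name
            WHERE  CustomerNumber = @CustomerNumber;
        ELSE
            INSERT INTO dbo.Customers (CustomerNumber, Name)
            VALUES (@CustomerNumber, @Name);
    END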

This results in an easier, cheaper and less chatty solution.

Conclusion

Create/update decisions are closely related to idempotent receivers. Truly idempotent endpoints deal with this logic server-side. Unfortunately, there are not many of those endpoints out there. If you manage the endpoints yourself, you are in charge of making them idempotent!

In case the Logic App needs to make the IF-ELSE decision, you get chattier integrations. To avoid reconfiguring such decisions over and over again, it's advised to make a generic Logic App that does it for you and consume it as a nested workflow. I would love to see this out-of-the-box in many connectors.

Thanks for reading!
Toon

Categories: Azure
Tags: Logic Apps
written by: Toon Vanhoutte

Posted on Friday, July 28, 2017 4:59 PM

by Toon Vanhoutte

Recently I was asked for an intervention on an IoT project. Sensor data was sent via Azure IoT Hub towards Azure Stream Analytics. Events detected in Azure Stream Analytics resulted in a Service Bus message that had to be handled by a Logic App. However, the Logic App failed to parse the JSON message taken from the Service Bus queue. Let's have a more detailed look at the issue!

Explanation

This diagram reflects the IoT scenario:

We encountered an issue in Logic Apps the moment we tried to parse the message into a JSON object. After some investigation, we realized that the Service Bus message wasn't actually pure JSON: the payload was preceded by string serialization "overhead".

Thanks to our favourite search engine, we came across this blog that nicely explains the cause of the issue. The problem is situated at the sender of the message. The issue is caused because the BrokeredMessage is created with a string object:
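
Sketched with the classic Microsoft.ServiceBus.Messaging SDK (connection string, queue name and payload are illustrative), the sender side looks roughly like this:

    using Microsoft.ServiceBus.Messaging;

    var queueClient = QueueClient.CreateFromConnectionString("<service-bus-connection-string>", "sensorevents");

    // Passing a string: the body is serialized with the DataContractSerializer,
    // which adds framing around the JSON payload.
    var json = "{\"SensorId\":\"sensor-01\",\"Temperature\":21.5}";
    await queueClient.SendAsync(new BrokeredMessage(json));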

If you control the sender side code, you can resolve the problem by passing a stream object instead.
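
A sketch of the stream-based alternative, under the same SDK assumption:

    using System.IO;
    using System.Text;

    // Wrapping the payload in a stream bypasses the DataContractSerializer,
    // so the queue receives the raw JSON bytes.
    var body = new MemoryStream(Encoding.UTF8.GetBytes(json));
    await queueClient.SendAsync(new BrokeredMessage(body, true)); // true: the message disposes the stream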

Solution

Unfortunately we cannot change the way Azure Stream Analytics behaves, so we need to deal with it at the receiver side. I've found several blogs and forum answers suggesting to clean up the "serialization garbage" with an Azure Function. Although this is a valuable solution, I tend to avoid additional components when they are not really needed: introducing Azure Functions comes with additional cost, storage, deployment complexity, maintenance, etc.

As this is actually pure string manipulation, I had a look at the available string functions in the Logic Apps Workflow Definition Language. The following expression removes the unwanted "serialization overhead":
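
One way to do this (a sketch, not necessarily the exact expression from the original post) is to take the substring between the first '{' and the last '}' of the decoded Service Bus content and turn it into JSON; the expression below is split across lines for readability:

    @json(
      substring(
        base64ToString(triggerBody()?['ContentData']),
        indexOf(base64ToString(triggerBody()?['ContentData']), '{'),
        add(1, sub(
          lastIndexOf(base64ToString(triggerBody()?['ContentData']), '}'),
          indexOf(base64ToString(triggerBody()?['ContentData']), '{')
        ))
      )
    )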

If you use this in combination with the Parse JSON action, you have a user-friendly way to extract data from the JSON in the next steps of the Logic App. In the sample below, I just used the Terminate action for testing purposes. You can now easily use SensorId, Temperature, etc.

Conclusion

It's a pity that Azure Stream Analytics doesn't behave as expected when sending messages to Azure Service Bus. Luckily, we were able to fix it easily in the Logic App. Before reaching out to Azure Functions as an extensibility option, it's advised to have an in-depth look at the available Logic Apps Workflow Definition Language functions.

Hope this was a timesaver!
Cheers,
Toon

Categories: Azure
written by: Toon Vanhoutte

Posted on Monday, July 24, 2017 8:24 AM

by Stijn Moreels

Introduction

In this part of the Test Infected series, I will talk about the ignorance of tests and how we can achieve more ignorance.

The short answer: if it doesn’t contribute to the test, hide/remove it!

The reason I wrote this post is because I see many tests with an overload of information embedded, and with some more background information people may increase the ignorance of their tests.

Ignorance

Test-Driven Discovery

Sometimes people write tests just because they are obliged to do so. Only when someone is looking over their shoulder do they write tests; in any other circumstance they don't. It's also crazy to see people abandon their practices (TDD, Merciless Refactoring, …) the moment there is a crisis, a need for a quick change, or something else that causes stress.

The next time you're in such a situation, look at yourself and evaluate how you react. If you don't stick with your practices in those situations, do you really trust your practices at all? If you don't use your practices in stressful situations, abandon them, because they don't work for you (yet).
This could be a learning moment for the next time.

Now, that was a long intro to come to this point: what happens after we have written our tests (in a Test-First/Test-Last approach)?

Test Maintenance

In my opinion, the one thing Test Ignorance is about is Test Maintenance. When there are changes to the SUT (System Under Test), how much do you have to change in the production code and how much in the test code?

When you (over)use Mock Objects (and Test Doubles in general), you can get in situations that Gerard Meszaros calls Overspecified Software. The tight-coupling between the tests and the production code is causing this Smell.

But that’s not actually the topic I want to talk about (at least not directly). What I do want to talk about are all those tests with so much information in them that every method/class/… is Obscured.

People read books about patterns, principles, practices… and try to apply them to their Production Code, but forget their Test Code.

Test Code should be as clear as the Production Code.

If a method in production has 20 lines of code and people always lose time reading and rereading it (how many times do you reread a method before you refactor?), you refactor it into smaller parts to improve usability, readability, intent…

You do this practice in your production code; so, why wouldn’t you do this in your test code?

I believe one of the reasons people sometimes abandon their tests, is because people think they get paid for production code (and not because of their lack of discipline). It’s as simple as that. But remember that you get paid for stable, maintainable, high quality software and you can’t simply deliver that without tests that are easily maintainable.

"Ignorance is Bliss" Patterns

“I know this steak doesn't exist. I know that when I put it in my mouth, the Matrix is telling my brain that it is juicy and delicious. After nine years, you know what I realize? Ignorance is bliss.”

- Cypher (The Matrix)

Now that you understand that your test code is just as important as your production code, we can start by defining our Ignorance in our tests.

There are several Test Patterns in literature that support this Test Ignorance, so I’ll give you just the concepts and some quick examples.

This section is about readability and how we can improve this.

Unnecessary Fixture

The one place where you could start, and where a Test Smell is most obvious, is the Fixture Setup. Not only can this section be enormous (I've seen gigantic fixtures) and hard to grasp, such fixtures are also hard to change and therefore hard to maintain.

Look at the following example. We need to set up a "valid" customer before we can insert it into the Repository. In this test, do I really need to know all the different items that make the customer invalid? Do we need all of them? Maybe it's just the id that's missing, but that could be autogenerated, or maybe the address doesn't exist, …

Only show what I need to make the test pass.

We can rework the example with a Parameterized Creation Method, as an example of the One Bad Attribute test pattern. In the future, we could also parameterize the other properties if we want to test some functionality that depends on that information. If that isn't the case, we can leave these initializations inside the Creation Method for the customer instead of polluting the test with this unnecessary information.
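
A sketch of what that could look like, with made-up Customer and CustomerRepository types and an xUnit-style test:

    [Fact]
    public void Customer_IsNotInserted_WhenIdIsMissing()
    {
        // One Bad Attribute: only the attribute that makes the customer invalid is visible in the test.
        Customer customer = CreateCustomerWith(id: null);

        bool inserted = new CustomerRepository().TryInsert(customer);

        Assert.False(inserted);
    }

    // Parameterized Creation Method: the caller controls the single interesting attribute,
    // every other property gets an anonymous but valid value.
    // Customer and CustomerRepository are made-up domain types.
    private static Customer CreateCustomerWith(string id)
    {
        return new Customer
        {
            Id = id,
            Name = "Valid customer name",
            Address = "Some existing address"
        };
    }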

Now, if we want to act as fully Test Infected people, we can also Test-Drive these Creation Methods. Next time you analyze the code coverage of your code, include the test projects and also write tests for these methods! This will increase your Defect Localization: if there's a problem with the Fixture rather than with the test that uses it, you will see this in the failed tests and know that the problem lies with the Fixture and not with the test itself.

Also note that this newly created method is only accessible within this test class. If we want to write tests with the same Fixture, we can extract this logic in its own class.

Either way, we have made our intentions clear to the test reader. I always try to ask the test the following question: "Do you really care if you know this?".

Again, the test can be made clearer if we pass in the argument that makes the customer invalid, so we know the cause of why the customer isn't inserted. If we move the "id" somewhere else, we won't know what causes the failure, which would make the test more Obscure.

I see some reasons why a Test Fixture can be big:

  • The Fixture has a lot of “setup-code” in place because the SUT is doing too much. Because the SUT is doing all these steps, we must build our Fixture with a lot of info and behavior. Otherwise, the SUT will see the Fixture as invalid.
  • The Fixture is the smallest possible for exercising the SUT, and the SUT is a Complete Abstraction, but it nonetheless needs a Fixture that takes quite a few lines to set up before it is valid.
  • The Fixture contains some unnecessary information that doesn't contribute to the result of the test but is embedded in the test anyway.

So, there are a lot of different possibilities why a Fixture can be big, and the solution is the same for all these situations: make the Fixture as small as possible and only add information to the test that is relevant to the result of the test. Contribute or get out.

Now, if you move ALL the Fixture code somewhere else (so extracting too much), you also have a problem. Test readers will now see some Magic Fixtures in place that act as Mystery Guests which can result in Fragile Tests.

Obscured by Eagerness

Sometimes, I encounter tests that are “Obscured by Eagerness”. A Test can be “obscure” for different reasons. One can be because we want to assert too much in a single test, another can be because we want to “set-up” too much in a single run, and yet another can be because we combine tests in a single test run by exercising multiple actions on the SUT.

To summarize:

  • Eager Assertion: assert on too much state and/or behavior in a single run.
  • Eager Fixture: set up unnecessary fixture (see previous section).
  • Eager Exercises: exercise multiple actions on the SUT to combine tests.

I’ve seen people defend tests with more than 20 assert statements because they still tested a “single unit” outcome. Sometimes you have functionality that looks like you have to write 20 assert statements or more, but instead of writing those statements you should ask yourself: What are you trying to test?

By explicitly asking yourself this question, you often come up with surprising results.

Because the assert phase of the test (in a Four-Phase Test) is important to verify the outcome of the test (failed or succeeded), I always try to write this phase first. It forces you to think about what you are trying to test and not about what you need to set up as Fixture. By writing this phase first, you write your test from bottom to top and only define what you really need. This way, just as in your production code, you only write what you need.

The previous snippet is a perfect example of how we can abuse the Assert phase. By placing so many asserts in a single spot, we obscure what we are really trying to test. We need to test whether the message is serialized correctly; so instead of manually asserting each element, why not assert on the whole XML?

We create an expected XML string and verify whether it is the same as the actual serialized XML string.
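
For example, in a sketch with a made-up Order message and serializer:

    [Fact]
    public void Order_IsSerialized_ToExpectedXml()
    {
        // Order and OrderSerializer are made-up types for the example.
        var order = new Order { Id = 1234, Amount = 42.50m };

        string actualXml = OrderSerializer.Serialize(order);

        // A single assert on the whole document instead of twenty asserts on individual elements.
        const string expectedXml = "<Order><Id>1234</Id><Amount>42.50</Amount></Order>";
        Assert.Equal(expectedXml, actualXml);
    }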

Conclusion

Writing tests should be taken as seriously as writing production code; only then can we have maintainable software solutions where developers are eager to run tests instead of ignoring them.

The next time you write a test, think firmly about the test: what should I know, what do I find important to exercise the SUT, what do I expect? This way you can determine which items are important and which aren't.

I sometimes "pretend" to be the test case:

“Do I care how this Fixture is set up?”
“Must I know exactly how to assert all these items?”
“Have I any interest of how a ‘valid’ object looks like?”
“What do I really want to test and what information only pollutes this?”
“Do I care that these actions must be executed before the exercise of the test?”

Tests only need to know what they need to exercise the SUT, nothing more, but equally important: nothing less!

Categories: Technology
Tags: Code Quality
written by: Stijn Moreels