
Codit Blog

Posted on Friday, May 18, 2018 1:59 PM

by Tom Kerkhove

GDPR mandates that you make data available to users on their request. In this post I show you how you can use Azure Serverless to achieve this with very little effort.

GDPR is around the corner, and it mandates that every company serving European customers complies with a lot of additional rules, such as being transparent about what data is stored, how it is processed, and more.

The most interesting requirement is that users must be able to request all the data a company stores about them and have it made available to them. A great example is how Google allows you to select the data you want and download it - try it here.

Inspired by this, I decided to build a similar flow running on Azure to show how easy it is to achieve.

Consolidating user data with Azure Serverless

In this sample, I'm using a fictitious company called Themis Inc., which provides a web application where users can sign up, create a profile and do awesome things. That application is powered by a big data set of survey information, which is processed to determine whether the company can deliver targeted ads to specific users.

Unfortunately, this means that the company is storing Personally Identifiable Information (PII): the user profile and the survey results for that user. Both of these data sets need to be consolidated and provided as a download to the user.

For the sake of this sample, we are using the StackExchange data set, and the web app simply allows me to request all my stored information.

This is a perfect fit for Azure Serverless, where we will combine Azure Data Factory, the unsung serverless hero, with Azure Logic Apps, Azure Event Grid and Azure Data Lake Analytics.

How it all fits together

If we look at the consolidation process, it actually consists of three steps:

  1. Triggering the data consolidation and sending an email to the customer that we are working on it
  2. Consolidating, compressing and making the data available for download
  3. Sending an email to the customer with a link to the data

Here is an overview of how all the pieces fit together:

Azure Logic Apps is a great way to orchestrate the steps that make up your application. Because of this, we are using a Logic App that is in charge of handling the new data consolidation requests submitted by customers in the web app. It will trigger the Data Factory pipeline that is in charge of preparing all the data. After that, it will get basic profile information about the user by calling the Users API and send out an email that the process has started.

The core of this flow is managed by an Azure Data Factory pipeline, which is great for orchestrating one or more data operations that represent a business process. In our case, it will get all the user information from our Azure SQL DB and retrieve all data related to that specific user from our big data set stored on Azure Data Lake Store. Both data sets are moved to a container in Azure Blob Storage and compressed, after which a new Azure Event Grid event is published with a link to the data.

To consolidate all the user information from our big data set we are using U-SQL, because it allows me to write a very small script and submit it, after which Azure Data Lake Analytics runs it and looks through the data. This is where Data Lake Analytics shines: you don't need to be a big data expert to use it, as it does all the heavy lifting for you by determining how to execute the script, how to scale it, and so on.

Last but not least, a second Logic App subscribes to our custom Event Grid topic and sends out emails to customers with a link to their data.

By using Azure Event Grid topics, we remove the pipeline's responsibility of knowing who should act on its outcome and of triggering them. It also makes our architecture flexible by providing extension points that other processes can use to integrate with the flow in case we need to make it more complex.
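To make the hand-off from the pipeline to the second Logic App concrete, here is a minimal C# sketch of publishing such a custom Event Grid event over plain HTTPS. The topic endpoint, key, event type and payload shape are assumptions for this sample; only the event envelope (id, eventType, subject, eventTime, data, dataVersion) and the aeg-sas-key header follow the Event Grid custom topic contract.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;

public static class DataReadyEventPublisher
{
    // Placeholder values - use your own custom topic endpoint and access key.
    private const string TopicEndpoint = "https://<your-topic>.westeurope-1.eventgrid.azure.net/api/events";
    private const string TopicKey = "<topic-access-key>";

    public static async Task PublishAsync(string userId, Uri downloadLink)
    {
        // Event Grid expects an array of events that follow its event schema.
        var events = new[]
        {
            new
            {
                id = Guid.NewGuid().ToString(),
                eventType = "Themis.UserData.Consolidated",  // illustrative event type
                subject = $"users/{userId}",
                eventTime = DateTime.UtcNow,
                data = new { userId, downloadLink },          // assumed payload shape
                dataVersion = "1.0"
            }
        };

        using (var client = new HttpClient())
        {
            // Custom topics are authenticated with the 'aeg-sas-key' header.
            client.DefaultRequestHeaders.Add("aeg-sas-key", TopicKey);

            var payload = new StringContent(JsonConvert.SerializeObject(events), Encoding.UTF8, "application/json");
            var response = await client.PostAsync(TopicEndpoint, payload);
            response.EnsureSuccessStatusCode();
        }
    }
}

The subscribing Logic App only needs to know the event type and the shape of the data payload, not who published it, which is exactly the decoupling described above.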

This is not the end

Users can now download their stored data, great! But there is more...

Use an API Gateway

The URLs that are currently exposed by our Logic Apps & Data Factory pipelines are generated by Azure and are tightly coupled to those resources.

As the cloud is constantly changing, this can become a problem when you decide to use another service, or when somebody simply deletes a resource and you need to recreate it, which gives it a new URL. Azure API Management is a great service for this: it basically shields the backend process from the consumer and acts as an API gateway. This means that if your backend changes, you don't need to update all your consumers; you simply update the gateway instead.

Azure Data Factory pipelines can be triggered via HTTP calls through its REST API - great! The downside is that this API is secured via Azure AD, which brings some overhead in certain scenarios. Using Azure API Management, you can shield this from your consumers by using an API key and leave the AD authentication to the API gateway.
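To give an idea of the consumer experience, here is a minimal C# sketch of calling such an API Management front-end. The gateway URL and the operation path are hypothetical; the Ocp-Apim-Subscription-Key header is API Management's default subscription key header, and the AD authentication towards Data Factory would be handled by a policy on the gateway.

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class ConsolidationClient
{
    public static async Task RequestConsolidationAsync(string userId)
    {
        using (var client = new HttpClient())
        {
            // The consumer only needs a subscription key; the gateway takes care of
            // authenticating against the Data Factory REST API behind the scenes.
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<your-subscription-key>");

            // Hypothetical operation exposed by the API gateway that triggers the pipeline.
            var url = $"https://themis-api.azure-api.net/privacy/users/{userId}/consolidate";
            var response = await client.PostAsync(url, new StringContent(string.Empty, Encoding.UTF8, "application/json"));
            response.EnsureSuccessStatusCode();
        }
    }
}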

User Deletion

GDPR also mandates that every platform gives users the capability to have all their data deleted on request. A similar approach can be used to achieve this, or the current process can even be refactored so that certain components, such as the Logic Apps, are reused.

Conclusion

Azure Serverless is a great way to focus on what we need to achieve without worrying about the underlying infrastructure. Another big benefit is that we only pay for what we use. Given that this flow will be used very sporadically, this is perfect: we don't want to set up and maintain infrastructure for something that will only be used once a month.

Azure Event Grid makes it easy to decouple the processes in this flow and provides extension points where they are needed.

Personally, I am a fan of Azure Data Factory because it makes it so easy for me as a developer to automate data processes, and it comes with the complete package - code & visual editor, built-in monitoring, and so on.

Last but not least, this is a wonderful example of how you can combine Azure Logic Apps & Azure Data Factory to build automated workflows. While at first they can seem like competitors, they are actually a perfect match - one focuses on application orchestration while the other handles data orchestration. You can read more about this here.

Want to see this in action? Attend my "Next Generation of Data Integration with Azure Data Factory" talk at Intelligent Cloud Conference on the 29th of May.

In a later post, we will go more into detail on how we can use these components to build this automated flow. Curious to see the details already? Everything will be available on GitHub.

Thanks for reading,

Tom.

Posted on Thursday, May 17, 2018 12:00 AM

by Frederik Gheysels

This article will guide you through the process of exposing the debug-information for your project using a symbol server on VSTS with private build agents.

We'll focus on exposing the symbols using IIS while pointing out some caveats along the way.

Introduction

I believe that we've all experienced the situation where you're debugging an application and would like to step into the code of a dependent assembly that has been written by you or another team in your company but is not part of the current code repository.
Exposing the debug-information of that assembly via a symbol server allows you to do that.

While setting up a symbol-server and indexing pdb files was quite a hassle 10 years ago, it currently is a piece of cake when you use VSTS for your automated builds.

Build definition

To enable the possibility of stepping into the code of your project, the debug-symbols of that assembly must be exposed to the public.

This is done by adding the Index sources & Publish symbols task to your VSTS build definition:
This task will in fact add some extra information to the pdb files that are created during the build process. 

Additional information, such as where the source files can be found and what version of the sources was used during the build, will be added to the pdb files.

After that, the pdb files will be published via a Symbol Server.

Once this task has been added, it still needs some simple configuration:

Since VSTS is now also a symbol server, the easiest way to publish your symbols is to select Symbol Server in this account/collection.  
When this option is selected, you should be good to go and don't have to worry about the remainder of this article.

However, since some projects are configured with private build agents, I want to explore the File share Symbol Server type in this article.

Select File Share as the Symbol Server type and specify the path to the location where the debug-symbols must be stored.

See the image below for an example:

When selecting this option, you'll publish the symbols to a file share, which means that you'll need access to the build server. This implies that a private build agent must be used that runs on a server under your (or your organization's) control.
Note that the path must be a UNC path and may not end with a backslash, otherwise the task will fail.
This means that the folder that will ultimately contain the symbol files must be shared and needs the correct permissions.
Make sure that the user under which the build runs has sufficient rights to write and modify files on that share. Granting the VSTS_AgentService group or the Network Service group Modify rights on that directory should suffice.

At this point, you can trigger the build and verify if the Index sources & Publish symbols task succeeded.

If it succeeded, you should see that some directories are created in the location where the symbols should be stored and you should find pdb files inside those directories.

If nothing has been added to the folder, you should inspect the logs and see what went wrong.

Maybe no *.pdb files have been found, possibly because the path to the build-output folder is incorrect.

It's also possible that *.pdb files have been found but cannot be indexed. This is common when publishing symbols for projects that target .NET Core or .NET Standard. In those cases, you might find a warning in the log of the Index & Publish task that looks like this:
Skipping: somefile.pdb because it is a Portable PDB

It seems that the Index sources & Publish symbols task does not support Portable pdb files. To expose debug information for these assemblies, SourceLink  must be used, but this is beyond the scope of this article.

There is a quick workaround however: change the build settings of your project and specify that the debug information should not be portable but must be Full or Pdb only. This can be specified in the Advanced Build settings of your project in Visual Studio.

This workaround enables the symbols to be indexed, but using them while debugging will only be possible on Windows, which somewhat defeats the purpose of having a .NET Core assembly.

Exposing the debug symbols via HTTP

Now that the debug symbols are there, they should be exposed so that users of your assembly / package can make use of them.

One way to do this, is serving the symbols via a webserver.

To do this, install and configure IIS on the server where your build agent runs.

Create a Virtual Directory

This step is fairly simple: In IIS Manager, just create a virtual directory for the folder that contains the debug symbols:

Configure MIME type for the pdb files

IIS will refuse to serve files with an unknown MIME type. Therefore, you'll have to specify the MIME type for the *.pdb files. If you fail to do so, IIS will return an HTTP 404 status code (Not Found) when a pdb file is requested.

To configure the MIME type for *.pdb files, open IIS Manager, open the MIME Types section and add a new MIME type for the .pdb extension (a generic type such as application/octet-stream is commonly used):

Authentication

Depending on who should have access to the debug-symbols, the correct authentication method has to be setup.

If anyone may download the debug symbols, then IIS must be configured to use Anonymous Authentication.

To enable Anonymous Authentication, open the Authentication pane in IIS and enable Anonymous Authentication. If the Anonymous Authentication option is not listed, use Turn Windows features on or off to enable it.

Having access to the debugging information does not imply that everybody also has access to the source code, as we'll see later in the article.

Configure Visual Studio to access the symbol server

Now that the debug information is available, the only thing left to do is enable Visual Studio to use those symbols.

To do this, open the Debug Options in Visual Studio and check the Enable source server support option in the General section.
You might also want to uncheck the Enable Just My Code option, to avoid having to initiate the loading of the symbol files manually via the Modules window in Visual Studio:
Next to that, Visual Studio also needs to know where the exposed symbols can be found. This is done by adding the URL that exposes your symbols as a symbol location in Visual Studio:

Now, everything should be in place to be able to debug through the source of an external library, as we'll see in the next section.

In Action

When everything is set up correctly, you should now be able to step through the code of an external library.

As an example, I have a little program called AgeCalculator that uses a simple NuGet package AgeUtils.Lib for which I have exposed its symbols:

While debugging the program, you can see in the Modules window of Visual Studio that symbols for the external dll AgeUtils.Lib have been loaded.  This means that Visual Studio has found the pdb file that matches the version of the AgeUtils.Lib assembly that is currently in use.

When a line of code is encountered where functionality from the NuGet package is called, you can just step into it.
As can be seen in the Output Window, Visual Studio attempts to download the correct version of the Age.cs source code file from the source-repository. 

The debugger knows how this file is named, which version is required and where it can be found since all information is present in the pdb file that it has downloaded from the symbol server!

When the debugger attempts to retrieve the correct code-file, you'll need to enter some credentials.  Once this is done, the source-file is downloaded and you'll be able to step through it:

Now, you'll be able to find out why that external library isn't working as expected! :)

Happy debugging!

Frederik
Categories: Technology
Tags: Debugging
written by: Frederik Gheysels

Posted on Friday, April 13, 2018 12:25 PM

Tom Kerkhove by Tom Kerkhove

Azure API Management released a new version that changes the OpenAPI interpretation. This article dives into the potential impact on the consumer experience of your APIs.

Providing clean and well-documented APIs is a must. This allows your consumers to know what capabilities you provide, what they are for and what to expect.

This is where the OpenAPI specification, aka Swagger, comes in: it standardizes how APIs are described across the industry, regardless of what technology is underneath.

Recently, the Azure API Management team started releasing a new version of the product with some new features and some important changes in how they interpret the OpenAPI specification while importing/exporting them.

Before we dive into the changes to the OpenAPI interpretation, I'd like to highlight that they've also added the capability to display the id of a specific operation. In the past, you had to use the old Publisher portal for this, but now you can find it via API > Operation > Frontend.

Next to that, as of last Sunday, the old Publisher portal should be fully gone, except for the analytics part.

OpenAPI Interpretation

The latest version also changes the way OpenAPI specifications are interpreted; they are now handled fully per operation, as defined by the OpenAPI spec.

Here are the changes in a nutshell:

  • Id of the operation - the operation id is based on operation.operationId; otherwise it is generated, similar to get-foo
  • Name of the operation - the display name is based on operation.summary; otherwise it uses operation.operationId. If that is not specified either, it will generate a name similar to Get - /foo
  • Description of the operation - the description is based on operation.description

I like this change because it makes sense; however, it can be a breaking change for your API documentation, depending on how you achieved it in the past.

The reason for this is that before rolling out this change the interpretation was different:

  • Id of the operation was a generated id
  • Name of the operation was based on operation.operationId
  • Description of the operation was based on operation.description and fell back on operation.summary

How I did it in the past

For all the projects I work on, I use Swashbuckle because it's very easy to set up and use, and it ties into the standard XML documentation.

Here is an example of the documentation I provide for my health endpoint for Sello, which I use for demos.

As you can see, everything is right there: via the operation I specify what it is called, give a brief summary of what it does and describe what my consumers can expect as responses.
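As a rough sketch of that older style - assuming an ASP.NET Web API controller with classic Swashbuckle reading the XML comments; the route and response codes are illustrative - the XML summary carried the full description of what the operation does:

using System.Web.Http;

public class HealthController : ApiController
{
    /// <summary>
    ///     Provides an indication of the health of the API and its dependencies.
    /// </summary>
    /// <response code="200">API is healthy</response>
    /// <response code="503">API is unhealthy or in a degraded state</response>
    [HttpGet]
    [Route("api/v1/health")]
    public IHttpActionResult Get()
    {
        // Illustrative only - a real implementation would check its dependencies.
        return Ok();
    }
}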

The OpenAPI specification that is generated will look like this:

Once this was imported into Azure API Management, the developer experience was similar to this:

However, this approach is no longer what I'd like to offer to my consumers, because if you import it after the new version it looks like this:

How I'm doing it today

Aligning with the latest interpretation was fairly easy, to be honest: instead of providing a description of what the operation does via summary, I started using remarks instead.

Next to that, I'm now using summary to give the operation a friendly name and assign a better operationId via SwaggerOperation.

This is how it looks in code:
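Below is a minimal sketch of what that can look like, again assuming classic Swashbuckle for ASP.NET Web API with its annotations; the Health_Get operation id and the route are illustrative. The summary now carries the friendly name, the remarks carry the description and SwaggerOperation assigns the operation id:

using System.Web.Http;
using Swashbuckle.Swagger.Annotations;

public class HealthController : ApiController
{
    /// <summary>
    ///     Get Health
    /// </summary>
    /// <remarks>
    ///     Provides an indication of the health of the API and its dependencies.
    /// </remarks>
    /// <response code="200">API is healthy</response>
    /// <response code="503">API is unhealthy or in a degraded state</response>
    [HttpGet]
    [Route("api/v1/health")]
    [SwaggerOperation("Health_Get")]
    public IHttpActionResult Get()
    {
        // Illustrative only - a real implementation would check its dependencies.
        return Ok();
    }
}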

The new OpenAPI specification is compatible with the recent changes and will look like this:

Once this is imported the developer experience is maintained and looks similar to this:

When you go to the details of the new operation in the Azure portal, you will see that all our information is successfully imported:

Conclusion

Azure API Management rolled out a change to the OpenAPI interpretation to provide more flexibility so you can define the operation id to use and align with the general specification.

This change is great, but it might have an impact on your current API documentation, similar to what I've experienced. With the above changes, you are good to go and your consumers will not even notice it.

Thanks for reading,

Tom.

Categories: API Management, Azure
written by: Tom Kerkhove

Posted on Tuesday, April 10, 2018 2:48 PM

by Sagar Sharma

Calling on-premise hosted web services from Logic Apps is super easy now. Use the on-premise data gateway and a custom connector to achieve this integration.

In this article, I will show you how to connect from Logic Apps to on-premise hosted HTTP endpoints that have no public access. We will do this by using the Logic Apps on-premise data gateway and a Logic Apps custom connector, a feature that was recently made available by the Logic Apps product team. If you have never used the on-premise data gateway before, please read my previous blog post “Installing and Configuring on-premise data gateway for Logic Apps”, which contains a detailed explanation of the Logic App on-premise data gateway.

Part 1: Deploying a webservice on a local machine

You are already familiar with this part:

  • Open Visual Studio. Create a new project > Web > ASP.NET Web Application

  • I am doing this in the most classical way: Empty template, no authentication, then Add > New Item > Web Service (ASMX). You can do it in your preferred way: REST, MVC Web API, WCF web service, etc.
  • Write a web method. Again, I am doing it in the easiest way, for example a "HelloWorld" method with one parameter (see the sketch after this list):

  • Build the web application and deploy it to local IIS.
  • Browse the website and save the full WSDL. You will need this WSDL file in part 2.
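For reference, here is a minimal sketch of such an ASMX web method; the class name, namespace and greeting text are just placeholders:

using System.Web.Services;

namespace OnPremiseDemo
{
    [WebService(Namespace = "http://tempuri.org/")]
    [WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
    public class HelloService : WebService
    {
        // Simple web method with one parameter, as described above.
        [WebMethod]
        public string HelloWorld(string name)
        {
            return string.Format("Hello {0}, greetings from the on-premise web service!", name);
        }
    }
}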

Part 2: Creating a custom connector for the webservice

  • Log on to your Azure subscription where you have the on-premise data gateway registered. Create a resource of type “Logic Apps Custom Connector”.
  • Open the custom connector and click on Edit. Choose SOAP as the API endpoint and SOAP to REST as the call mode, then browse to upload the WSDL file of your on-premise web service.
     

 Please note that if you are trying to access a REST/Swagger/OpenAPI web service, you will need to choose REST as the API endpoint.

  • Don’t forget to select “Connect via on premise data gateway”

     
  • Click on Continue to go from the General tab to the Security tab

     
  • Again, click on Continue, as we use no authentication for this demo
  • In the Definition tab, fill in a summary and description. Keep the default values for the rest of the configuration and click on “Update connector” at the top right of the screen.
     
  • The custom connector is ready to use now.

Part 3: Integrating with an on-premise webservice from Logic Apps

  • Create a new Logic App. Start with a Recurrence trigger.
  • Add an action and search for your custom connector:
     
  • Choose your web method as the action. Then choose the on-premise gateway you want to use to connect to your on-premise web service and click on Create
     
  • Enter the value for the name parameter; your final Logic App should look like the following:
     
  • Click on Save and then run it. Within a moment, you should see the response from your on-premise web service based on your input parameter
     

Thanks for reading! I hope you've found this article useful. If you have any questions about this or are looking for additional information, feel free to comment below.

Want to discover more? Check out these sites:

Posted on Monday, March 26, 2018 3:18 PM

by Sagar Sharma

If you want to connect to your on-premise data sources from Azure hosted Logic Apps, then you can use an on-premise data gateway. Let's see how to install and configure it.

Logic Apps is a new-generation integration platform available in Azure. Being a serverless technology, there is no upfront hardware or licensing cost, which leads to a faster time to market. Because of all these features, Logic Apps is picking up pace in the integration world.

Because Logic Apps are hosted in the cloud, it's not straightforward to access data sources hosted in an on-premise network from Logic Apps. To overcome that limitation, Microsoft introduced the on-premise data gateway.

The gateway acts as a bridge that provides quick data transfer and encryption between on-premises data sources and your Logic Apps. All traffic originates as secure outbound traffic from the gateway agent to Logic Apps, through Azure Service Bus Relay in the background.

Currently, the gateway supports connections to the following data sources hosted on-premises:

  • BizTalk Server 2016
  • PostgreSQL
  • DB2
  • SAP Application Server
  • File System
  • SharePoint
  • Informix
  • SQL Server
  • MQ
  • Teradata
  • Oracle Database
  • SAP Message Server

 

Part 1: How does the Logic App on-premise data gateway work?

  1. The gateway cloud service creates a query, along with the encrypted credentials for the data source, and sends the query to the queue for the gateway to process.
  2. The gateway cloud service analyzes the query and pushes the request to the Azure Service Bus.
  3. The on-premises data gateway polls the Azure Service Bus for pending requests.
  4. The gateway gets the query, decrypts the credentials, and connects to the data source with those credentials.
  5. The gateway sends the query to the data source for execution.
  6. The results are sent from the data source, back to the gateway, and then to the gateway cloud service. The gateway cloud service then uses the results.

Part 2: How to install the on-premises data gateway?

To install the on-premise data gateway, go through the following steps:

  1. Download and run the gateway installer on a local computer. Link: http://go.microsoft.com/fwlink/?LinkID=820931&clcid=0x409

  2. Review and accept the terms of use and privacy statement. Specify the path on your local computer where you want to install the gateway.



  3. When prompted, sign in with your Azure work or school account, not a Microsoft account.



  4. Now register your installed gateway with the gateway cloud service. Choose "Register a new gateway on this computer". Provide a name for your gateway installation. Create a recovery key, then confirm your recovery key.


    In order to achieve high availability, you can also configure the gateway in cluster mode. For that, select “Add to an existing gateway cluster”. 

    To change the default region for the gateway cloud service and Azure Service Bus used by your gateway installation, choose “Change Region”. For example, you might select the same region as your logic app, or select the region closest to your on-premises data source so you can reduce latency. Your gateway resource and logic app can have different locations.

  5. Click on Configure and your gateway installation should be ready. Now we need to register this on-premise installation in Azure. To do so, log on to your Azure subscription. Make sure you use the Azure subscription which is associated with your work/school tenant. Create a new resource of type “On-premises data gateway”



  6. Enter a name for your gateway. Choose a subscription and resource group. Make sure you choose the same location as you selected during the gateway installation; after choosing that location, you should be able to see the name of your gateway installation



  7. Click on Create, and within a moment you will be able to use the on-premise gateway in your Logic App.

  8. You will be able to choose on-premise gateway installation to access on-premise hosted data sources in supported connectors.
    For example:
    • File System
    • SQL Server

Some important things to keep in mind

  • The on-premise data gateway is firewall friendly. There are no inbound connections to the gateway from the Logic Apps. The gateway always uses outbound connections.
  • The Logic App on-premise data gateway also supports high availability via cluster configuration. You can have more than one gateway installation and configure them in cluster mode.
  • When you install the gateway on one machine, it can connect to all hosts within that network, so there is no need to install a gateway on each data source machine; one per network is enough.
  • Install the on-premises data gateway only on a local computer. You can't install the gateway on a domain controller.
  • Don't install the gateway on a computer that turns off, goes to sleep, or doesn't connect to the Internet because the gateway can't run under those circumstances. Also, the gateway performance might suffer over a wireless network.
  • During installation, you must sign in with a work or school account that's managed by Azure Active Directory (Azure AD), not a Microsoft account.

You can find all official limitations around logic apps at https://docs.microsoft.com/en-us/azure/logic-apps/logic-apps-limits-and-config

Configure a firewall or proxy

  • The gateway creates an outbound connection to Azure Service Bus Relay. To provide proxy information for your gateway, see Configure proxy settings.
  • To check whether your firewall, or proxy, might block connections, confirm whether your machine can actually connect to the internet and the Azure Service Bus. From a PowerShell prompt, run this command:

 Test-NetConnection -ComputerName watchdog.servicebus.windows.net -Port 9350

    • This command only tests network connectivity and connectivity to the Azure Service Bus. So, the command doesn't have anything to do with the gateway or the gateway cloud service that encrypts and stores your credentials and gateway details.
    • Also, this command is only available on Windows Server 2012 R2 or later, and Windows 8.1 or later. On earlier OS versions, you can use Telnet to test connectivity. Learn more about Azure Service Bus and hybrid solutions.
    • If TcpTestSucceeded is not set to True, you might be blocked by a firewall. If you want to be comprehensive, substitute the ComputerName and Port values with the values listed under Configure ports in this article.
  • The firewall might also block connections that the Azure Service Bus makes to the Azure datacenters. If this scenario happens, approve (unblock) all the IP addresses for those datacenters in your region. For those IP addresses, get the Azure IP addresses list here.

Configure ports

  • The gateway creates an outbound connection to Azure Service Bus and communicates on outbound ports: TCP 443 (default), 5671, 5672, 9350 through 9354. The gateway doesn't require inbound ports.

Domain names | Outbound ports | Description
*.analysis.windows.net | 443 | HTTPS
*.login.windows.net | 443 | HTTPS
*.servicebus.windows.net | 5671-5672 | Advanced Message Queuing Protocol (AMQP)
*.servicebus.windows.net | 443, 9350-9354 | Listeners on Service Bus Relay over TCP (requires 443 for Access Control token acquisition)
*.frontend.clouddatahub.net | 443 | HTTPS
*.core.windows.net | 443 | HTTPS
login.microsoftonline.com | 443 | HTTPS
*.msftncsi.com | 443 | Used to test internet connectivity when the gateway is unreachable by the Power BI service.

  • If you must approve IP addresses instead of the domains, you can download and use the Microsoft Azure Datacenter IP ranges list. In some cases, the Azure Service Bus connections are made with IP addresses rather than fully qualified domain names.

Want to read more about on-premise data gateways?

Check out the sites below:

Thanks for reading!

P.S.: In the last couple of months, I have worked extensively with Logic Apps and the on-premise data gateway, so feel free to contact me if you have any questions.

Categories: Azure
Tags: Azure, Logic Apps
written by: Sagar Sharma