Codit Blog

Posted on Friday, May 5, 2017 4:56 PM

by Massimo Crippa

The Analytics module in Azure API Management provides insights about the health and usage levels of your APIs, to identify key trends that impact the business. Analytics also provides a number of filtering and sorting options, to better understand who is using what, but what if I want more? For example, how about drill down reports or getting mobile access?

I am a big fan of Power BI, so let's combine the power of Azure Functions and the simplicity of the APIM REST APIs to flow the analytics data to Power BI.

The picture below displays my scenario: Azure Functions connect and combine APIs from different Azure services (AAD, APIM, Storage) to create a smooth and lightweight integration.

It's a serverless architecture, which means that we don't have to worry about the infrastructure and can focus on the business logic, with rapid iterations and a faster time to market.

The APIM analytics (aggregated data) can be read by calling the report REST API. This information can then be written to Azure Tables and automatically synchronized with Power BI. 



The Azure function:

  1. Is triggered via HTTP POST. It accepts a body parameter with the report name (byApi, byGeo, byOperation, byProduct, bySubscription, byUser) and the day to export.

  2. Calls the AAD token endpoint using the resource owner password flow to get the access token to authorize the ARM call.

  3. Calls the APIM REST API ($filter=timestamp%20ge%20datetime'2017-02-15T00:00:00'%20and%20timestamp%20le%20datetime'2017-02-16T00:00:00')

  4. Iterates through the JTokens in the response body to build an IEnumerable<DynamicTableEntity> collection that is passed to CloudTable.ExecuteBatch to persist the data in Azure storage.
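
Steps 2 to 4 can be sketched offline as follows. This is a minimal illustration in Python rather than the C# of the original function, and it only builds the request pieces without calling any service. The token form fields follow the standard OAuth 2.0 resource owner password grant plus the Azure AD `resource` parameter; the function names, the partition and row key scheme and the sample row are assumptions made for the example.

```python
from datetime import date, datetime, timedelta
from urllib.parse import quote, urlencode


def build_token_form(client_id, username, password, resource):
    """Form body for an OAuth 2.0 resource owner password grant (step 2)."""
    return urlencode({
        "grant_type": "password",
        "client_id": client_id,
        "username": username,
        "password": password,
        "resource": resource,  # the API the token is requested for
    })


def build_report_filter(day):
    """OData $filter covering one full day, encoded like the example in step 3."""
    start = datetime(day.year, day.month, day.day)
    end = start + timedelta(days=1)
    fmt = "%Y-%m-%dT%H:%M:%S"
    raw = (
        f"timestamp ge datetime'{start.strftime(fmt)}' and "
        f"timestamp le datetime'{end.strftime(fmt)}'"
    )
    # Keep quotes and colons readable; spaces become %20.
    return quote(raw, safe="':")


def to_entities(report_name, day, rows):
    """Map each report row (a dict) to a table entity without knowing its fields (step 4)."""
    entities = []
    for index, row in enumerate(rows):
        entity = dict(row)  # copy every field, so new fields need no code change
        entity["PartitionKey"] = report_name  # assumed key scheme
        entity["RowKey"] = f"{day.isoformat()}-{index:05d}"  # assumed key scheme
        entities.append(entity)
    return entities


if __name__ == "__main__":
    day = date(2017, 2, 15)
    print(build_report_filter(day))
    print(to_entities("byApi", day, [{"name": "Echo API", "callCountTotal": 42}]))
```

Copying every field of a row into the entity, instead of mapping named properties, is what lets this kind of code survive new aggregations or extra fields in the report response.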

Because I am using a second function to extract and load (to Azure storage) additional APIM tables (e.g. apis, products, users, etc.), I found this article on reusing code in different Azure Functions very useful.

I created a logic app to trigger the functions multiple times, one per report to be exported. The code can support any new aggregation or additional fields added in the future without any modification.
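
As a rough sketch of what the Logic App sends, the loop below builds one request body per report. The JSON property names `report` and `day` are hypothetical; all that is known is that the body carries the report name and the day to export.

```python
import json
from datetime import date

# Report names accepted by the function.
REPORTS = ["byApi", "byGeo", "byOperation", "byProduct", "bySubscription", "byUser"]


def build_payloads(day, reports=REPORTS):
    """One JSON body per report; the property names are illustrative assumptions."""
    return [json.dumps({"report": name, "day": day.isoformat()}) for name in reports]


if __name__ == "__main__":
    for body in build_payloads(date(2017, 2, 15)):
        print(body)
```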

Power BI

Using Power BI Desktop I put together some visualizations and pushed them to the Power BI service. The report dataset is synced with the Azure tables once a day, which is configurable. Below you can see screens from my mobile phone (left) and the desktop experience (right).


Even though the same result can be achieved using other Azure services like WebJobs or Data Factory, Azure Functions provide multiple benefits, such as a simple programming model, the abstraction of servers and the possibility to use a simple editor to build, test and monitor the code without leaving your browser. That's a perfect fit for quick development cycles and faster adaptation that gains business advantage, isn't it?



Categories: API Management
written by: Massimo Crippa

Posted on Saturday, May 14, 2016 1:54 PM

by Joachim De Roissart, Michel Pauwels, Pieter Vandenheede and Robert Maes

The Codit recap of the third and last day at Integrate 2016.

Day 3 is already here and looking at the agenda, the main focus of today is IoT and Event Hubs.

Unfortunately we underestimated traffic today and spent quite a long time getting to the ExCeL, missing the first one and a half sessions. Luckily we were able to ask around and follow Twitter while stuck in traffic.

Azure IaaS for BizTalkers

Stephen W. Thomas outlined what is possible nowadays with Azure IaaS (Infrastructure as a Service). The offering is getting bigger every day, and having lots of choice means that the right choice tends to be hard to make. With 58 types of virtual machines, one wonders which is the right one. Luckily Stephen was able to guide us through the differences, identifying which ones were the better choice for optimal performance, depending on your MSDN access level. The choices you make have an immediate effect on the monthly cost, so be aware!

Unified tracking on premise and in the cloud

M.R. Ashwin Prabhu was the next presenter. He took off with a very nice demo, showing how to bring on-premises BAM tracking data into a Power BI dashboard, something a lot of people might have already thought about but never got around to doing. We all know the BAM portal in BizTalk Server has not been updated since 2004 and urgently needs an update. Customers are spoiled nowadays and, with reason, many of them don't like the looks of the BAM portal and are discouraged from using it.

Moving from BAM data to Logic Apps tracked data is not far-fetched, so another demo followed demonstrating just this. Minutes later he combined BAM data and Logic Apps tracked data into one solution, with a very refreshing tracking portal in Power BI, using the following architecture:

(Image by Olli Jääskeläinen)

IoT Cases

IoT is not the next big thing, it already is now! 

Steef-Jan Wiggers explained how, a few years ago, he managed a solution to process sensor data from windmills. Nowadays Azure components like Event Hubs (the unsung hero of the day), Azure Stream Analytics, Azure Storage, Azure Machine Learning and Power BI can make anything happen!

Kent Weare proved this in his presentation. He explained how his company implemented a hybrid architecture to move sensor data to Azure instead of on-premises historians. Historians are capable of handling huge amounts of data and events, something which can also be done with Azure Event Hubs and Stream Analytics. Handling the output of this data was managed by Logic Apps, Azure Service Bus and partly by an on-premises BizTalk Server. The decision to move to Azure was mostly based on the options around scalability and disaster recovery.
In short, a very nice presentation, making Azure services like Event Hubs, Stream Analytics, Service Bus, Logic Apps and BizTalk Server click together!

After the break Mikael Håkansson continued on the IoT train. He showed his experience and expertise with two more demos. First he did some live programming and debugging on stage, demonstrating how easy it can be to read sensor data from IoT devices, namely the Raspberry Pi 3. He also talked about managing devices, such as deploying applications and updating firmware and software.
In a last demo he showed us how he made an access control device using a camera with facial recognition software and

The last sessions

Next up was Tomasso Groenendijk, talking about his beloved Azure API Management. He showed off the capabilities, especially around security (delegation and SSL), custom domain names and such. 

A three-day conference takes its toll: very interesting sessions, early mornings, long days and short nights. Just as our attention span was being pushed to its limits (nothing to do with the speakers, of course), Nick Hauenstein managed to spice things up, almost as much as the food we tasted the last couple of days ;-)

He managed to give us the most animated session of the conference without a doubt. Demonstrated by this picture alone:

(Picture by BizTalk360)

A lightning-fast and interactive session, built around his fictional company that makes bobbleheads. He demonstrated a setup of Azure Logic Apps and Service Bus, acting and behaving like a BizTalk Server solution: correlation, long-running processes, most of the things we know from BizTalk Server.

Really nice demo and session, worth watching the video for!!

Because we needed to close up our commercial stand and had a strict deadline for our Eurostar departure with heavy traffic in London, we unfortunately missed Tom Canter's session. We'll make sure to watch that video when the recordings are released in a few weeks.

The conclusion

Integrate 2016 was packed with great sessions, enchanting speakers and great people from all over the world. We had a really great time meeting people from different countries and companies. The mood was relaxed, even when talking to competitors! Props to the organizers of the event and the people working to make things go smoothly at the Platinum Suites of the ExCeL London. It was great seeing all of these people again and hearing about everyone's progress since last year's BizTalk Summit 2015.

We take with us the realisation that Azure is hotter than ever. Microsoft is still picking up its pace and new features get added weekly. Integration has never been so complex, but nowadays Microsoft has one answer, or even more, to each question. With such a diverse toolset, both on premises and in the cloud, it does remain hard to keep ahead. A lot of time is spent on keeping up to date with Microsoft's ever-changing landscape.

We hope you liked our recaps of Integrate 2016; we sure spent a lot of time on them. Let us know what you liked or missed, and we'll make sure to keep it in mind next year.

Thank you for reading!

Brecht, Robert, Joachim, Michel and Pieter




Posted on Thursday, September 10, 2015 8:18 PM

by Sam Vanhoutte

On September 10 and 11, Codit was present at the first Cortana Analytics Workshop in Redmond, WA. This blog post is a report for the community on the content we collected and our impressions of the Cortana Analytics offering. "From data to decisions to action" is the tagline for this event.

Once again, beautiful weather welcomed us in Redmond, when we arrived for the first Cortana Analytics workshop.  A lot of people from all over the world were joining this event that was highly anticipated in the Microsoft data community.  

As Codit is betting big on new scenarios such as SaaS and hybrid integration, API management and IoT, we understand that the real value for our customers will be gained through (Big) Data intelligence.  Next to a lot of new faces, there were also quite a few integration companies attending the workshop, such as our partner Axon Olympus, our Colombian friends from IT Sinergy and Chris Kabat from SPR.

I will start with some impressions and opinions from our side.  After that, you can find more information on some of the sessions we attended.

Key takeaways

At first sight, the Cortana Analytics suite combines a lot of the data services that already exist on Azure.

  • Information management: Azure Data Catalog, Azure Event Hubs, Azure Data Factory
  • Big data stores: Azure Data Lake, Azure SQL Data Warehouse
  • Machine Learning and Analytics: Azure ML, Azure Stream Analytics, Azure HDInsight
  • Interaction: Power BI (dashboarding), Cortana (personal assistant), Perceptual Intelligence (Face, Vision, Speech, Text)

So far, no specific pricing information has been announced.  The only statement in this regard was "One simple subscription model".

There are a lot of choices to implement stuff on Azure Big data.  And new stuff gets added frequently.  It will become very important to make the right choices for the right solution.  

  • Will we store our data in blob storage or in HDFS (Data Lake)?
  • Will we leverage Spark, Storm or the easy Stream Analytics?
  • How will we query data?

Keynote session

The keynote session, in a packed McKinley room, was both entertaining and informative.  The event was kicked off by Corporate Vice President Joseph Sirosh, who positioned Microsoft's Big Data and Analytics offering and Cortana Analytics.  The suite should give people the answers to typical questions such as 'What happened?', 'Why did it happen?', 'What will happen?' and 'What should I do?'.  To answer those questions, the Cortana Analytics suite provides the tools to access data, analyze it and take decisions.

Every conference has a diagram that returns in every single session.  For Cortana Analytics it is the following one, which shows all services that are part of the platform (apologies for the phone picture quality).

The top differentiators for Cortana Analytics Suite

  • Agility (Volume, Variety, Velocity)
  • Platform (storage, compute, real time ingest, connectivity to all your data, information management, big data environments, advanced analytics, visualization, IDEs)
  • Assistive intelligence (Cortana!)
  • Apps (Vertical toolkits, hosted APIs, ecosystem)
  • Features: Elasticity, Fully managed, Economic, Open Source (R & Python)
  • Facilitators
  • Secure, compliant, auditable
  • One simple subscription model
  • Future proof, with research of MSR & Bing

Cortana Analytics should be agile, simple and beautiful.

A firm statement was that "if you can cook by following a recipe, you should be able to use Cortana Analytics".  While I don't think that means we can go and fire all of our data scientists, I really believe that the technology to perform data analytics becomes more easily available and understandable for traditional developers and architects.

This is achieved through the following things.

  • Fully managed cloud services
  • A fully integrated platform
  • Very simple to purchase
  • Productize, simplify and democratize
  • Partner ecosystem


A lot of the services were demonstrated, using some interesting and well-known examples.  Especially the demo and the underlying architecture was very interesting.  That application was only possible through the tremendous scalability of Azure, and the intelligent combination of the right services on the Azure platform.  

Demystifying Cortana Analytics

Jason Wilcox, Director of Engineering, was up next.  During a long intro on data analytics, he described the 'process' for data analytics as follows:

  1. Find out what data sets are available
  2. Gain access to the data
  3. Shape the data
  4. Run first experiment
  5. Repeat steps 1, 2, 3 and 4 until you get it right
  6. Find the insight
  7. Operationalize & take action

In his talk, Jason explained the following things about Cortana Analytics:

  • It is a set of fully managed services (true PaaS!)
  • It works with all types of data (structured and unstructured) at any scale.  Azure Data Lake is a native HDFS (Hadoop Distributed File System) for the cloud.  It is integrated with HDInsight, Hortonworks and Cloudera (and more services to come).  It is accessible from all HDFS compatible projects, such as Spark, Storm, Flume, Sqoop, Kafka, R, etc.  And it is fully built on open standards!
  • Operationalize the data through Azure Data Catalog (publish data assets), which will be integrated in Cortana Analytics
  • Cortana Analytics is open to embrace and extend, and allows customers to use best-of-breed tools.
  • It's an end-to-end solution from data ingestion to presentation.


Real time data processing.  How do I choose the right solution?

Two Europeans from the Data Insight Global Practice (Simon Lidberg, SE and Benjamin Wright-Jones, UK) were giving an overview of the 3 major streaming analysis services, available on Azure: Azure Stream Analytics, Apache Storm and Apache Spark.  

Azure Stream Analytics

Azure Stream Analytics is a known service for us and we've been using it for more than a year now.  We also have some blog posts and talks about it.  Simon was giving a high level overview of the service.

  • Fully managed service: No hardware deployment
  • Scalable: Dynamically scalable
  • Easy development: SQL Language
  • Built-in monitoring: View system performance through Azure portal

The typical Twitter sentiment demo was shown afterwards.  In my opinion, Azure Stream Analytics is indeed extremely easy to get started and to build quick win scenarios on Azure for telemetry ingestion, alerting and out of the box reporting.  

Storm on HDInsight

Storm is a streaming framework available on HDInsight.  A quick overview of HDInsight was given, positioning things like Map/Reduce (Batch), Pig (Script), Hive (SQL), HBase (NoSQL) and Storm (Streaming).

This is Apache Storm

  • Real time processing
  • Open Source
  • Visual Studio integration (C# and Java)
  • Available on HDInsight

Spark on HDInsight

Spark is extremely fast (3x faster than Hadoop in 2014).  Spark also unifies and combines Batch processing, Real Time processing, Stream Analytics, Machine Learning and Interactive SQL. 

Spark is integrated very well with the Azure platform.  There is an out of the box PowerBI connector and there is also support for direct integration with Azure Event Hubs.  

The differentiators for Spark were described as follows:

  • Enterprise Ready (fully managed service)
  • Streaming Capabilities (first class connector for Event Hubs)
  • Data Insight 
  • Flexibility and choice

Spark vs Storm comparison

Spark differs in a number of ways:

  • Workload: Spark implements a method for batching incoming updates vs individual events (Storm)
  • Latency: Seconds latency (Spark) vs Sub-second latency (Storm)
  • Fault tolerance: Exactly once (Spark) vs At least once (Storm)
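
The workload difference can be illustrated with a small, framework-free sketch. It does not use Spark or Storm; the event tuples, the one-second interval and the function names are assumptions chosen only to contrast per-event handling with interval-based batching.

```python
def process_per_event(events, handler):
    """Storm-like: hand each event to the handler as soon as it arrives."""
    return [handler([event]) for event in events]


def process_micro_batches(events, handler, interval_seconds=1.0):
    """Spark-like: group events by arrival interval, then handle each group once."""
    batches = {}
    for timestamp, value in events:
        window = int(timestamp // interval_seconds)
        batches.setdefault(window, []).append((timestamp, value))
    # dicts keep insertion order, so batches come out in arrival order
    return [handler(batch) for batch in batches.values()]


def total(batch):
    """Example handler: sum the values in a batch of (timestamp, value) events."""
    return sum(value for _, value in batch)


if __name__ == "__main__":
    events = [(0.1, 5), (0.4, 7), (1.2, 1), (2.9, 3)]
    print(process_per_event(events, total))      # one result per event
    print(process_micro_batches(events, total))  # one result per interval
```

Batching trades latency for throughput: a result only appears once its interval closes, which fits the seconds versus sub-second latency noted above.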

When to use what?

My advice would be to opt for Stream Analytics for quick wins and straightforward scenarios.  For more complex and specialized scenarios, Storm might be a better solution.  "It depends" would be the typical answer to the above question.

The comparison table for the three technologies covered the following criteria, where a '*' typically means "with some limitations":

  • Multi tenant service
  • Deployment model
  • Deployment complexity
  • Open Source Support
  • Languages: .NET, Java, Python (Storm); SparkSQL, Scala, Python, Java… (Spark)
  • Power BI Integration: Yes, native (Stream Analytics); Yes, native (Spark)

Overview of the Cortana Analytics Big Data stack (pt2)

In this session, four technologies were demonstrated and presented by four different speakers.  A very good session to get insight into the broad ecosystem of HDInsight-related services.

We were shown Hadoop (Hive for querying), Storm (for streaming), HBase (NoSQL) and the new Big Data applications that will become available on the new Azure portal soon.

A nice demo, leveraging Hadoop HBase is the Tweet Sentiment demo:

Real-World Data Collection for Cortana Analytics

This was a very interesting real world scenario that was presented on a real world IoT project (soil moisture).  It's always good to hear people talk from the field.  People who have had issues and solved them.  People who have the true experience.  This was one of those sessions.  It's hard to give a good summary, so I thought to just write down some of the findings that I took away.

  • You need to get the data right!
  • Data scientists should know about the meaning of the data and sit down with the right functional people.
  • An IoT solution should be built in such a way that it supports changing sensors, data types and versions over time.

Data Warehousing in Cortana Analytics

In this session, we got a good overview of the highly scalable offering of Azure SQL Data Warehousing, by Matt Usher. The characteristics of Azure SQL DW are:

  • You can start small, but scale huge
  • Designed for on-demand scale (<1 minute for resizing!)
  • Massive parallel processing
  • Petabyte scale
  • Combine relational and non-relational data (it is PolyBase with HDInsight!)
  • It is integrated with Azure ML, Power BI and Azure Data Factory
  • There is SQL Server compatibility (UDFs, table partitioning, collations, indices and columnstore support)


Closing keynote: the new Power BI

James Phillips, CVP Microsoft

Power BI is obviously the flagship visualization tool that gets a lot of attention.  While it has shortcomings for a lot of scenarios, it's indeed an awesome tool that allows you to build reports very fast.  In this session, we got an overview of the new enhancements of Power BI and some insight into what's coming next.

While most of these features were known, it was good to get an overview and recap of them:

  • Power BI content packs
  • Custom Power BI visuals
  • Natural language query
  • On premise data connectivity

Some things I did not know yet

  • There is R support in Power BI Desktop (plotting graphs and generating word clouds)
  • When you add devToolsEnabled=true in the url, there are custom dev tools available in Power BI
  • Cortana can be integrated with Power BI.  


This was it for the first day.  You can expect another blog post on Day 2 and more on Machine Learning from my colleague Filip.

Cheers! Sam


Categories: Azure
Tags: Power BI
written by: Sam Vanhoutte

Posted on Tuesday, June 23, 2015 12:31 PM

by Massimo Crippa

It's crazy to see that the Power BI APIs are documented and managed on Apiary and not on Azure API management, isn't it?
Let’s see how to configure APIM so you can try out all of the Power BI APIs without writing a single line of code using the Microsoft Service.

Moving to Azure API Management is more than setting up the documentation and the interactive console with a different look and feel. It gives us the possibility to take advantage of capabilities like throttling, usage analytics, caching and many more.

Here is the four-step procedure I followed for this exercise:

  • Virtualization layer definition
  • Configure the Authorization Service (Azure Active Directory)
  • Configure Azure API Management to register the Authorization Service
  • Dev portal customization (optional)

Power BI API calls are made on behalf of an authenticated user by sending to the resource service an authorization token acquired through Azure AD.
The diagram below shows the OAuth 2.0 Authorization Code Grant flow.
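
The two requests in that flow can be sketched as follows. The code only builds the authorization URL and the token request body of an Authorization Code Grant and sends nothing. All endpoint URLs, IDs and the resource value in the example are placeholders, not the values used in this setup; the `resource` parameter is the Azure AD way of naming the API the token is for.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorize_url(authorize_endpoint, client_id, redirect_uri, resource, state):
    """Leg 1: URL the user agent is sent to in order to sign in and consent."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "resource": resource,
        "state": state,  # opaque value echoed back to protect against CSRF
    })
    return f"{authorize_endpoint}?{query}"


def build_token_request(code, client_id, client_secret, redirect_uri, resource):
    """Leg 2: form body that exchanges the authorization code for an access token."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
        "resource": resource,
    })


if __name__ == "__main__":
    # Placeholder values for illustration only.
    url = build_authorize_url(
        "https://login.example.com/authorize",
        "my-client-id",
        "https://portal.example.com/callback",
        "https://resource.example.com/api",
        "xyz123",
    )
    print(url)
    print(parse_qs(urlsplit(url).query)["response_type"])
```

The redirect_uri sent in both legs has to match the Reply URL registered for the application, otherwise the authorization server rejects the request.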

Virtualization layer

First thing to do is to create the API Façade on Azure API Management defining the set of the operations that will be proxied to the Power BI web service (

Since Power BI APIs don't expose a swagger metadata endpoint, I manually created the API, added the operations, descriptions, parameters and representations (you can find the documentation here).

Then I defined my “Microsoft PowerBI APIs” product, made it visible to the Guest group, enabled the Subscription Required option and added the API to the product. With this configuration the API is visible to everyone, so you can freely access the documentation, while a subscription key is required to try out the API using the built-in console.

The Power BI APIs require an authentication token, so if we try to call the underlying service at this point of the procedure we receive a 403 Forbidden response.

Authorization Service

In order to provide a secure sign-in and authenticate our service calls with the Power BI APIs, we need to register our APIM built-in console in Azure AD as a Web Application. To complete this step you first need to sign up for the Power BI service and an Azure AD with at least one organization user.

Here you can find the register procedure. 

The main parameters to be set up are:

  • APP ID URI : Unique web application identifier. (e.g.
  • Reply URL : This is the redirect_uri for the auth code. The value to configure in this field will be provided by the API Management "Register Authorization Server" procedure (next section).
  • Add the delegated permissions that will be added as claims in the authorization token.

Once the AD application is created you will get the ClientId and the ClientSecret values.

Note that the AD web application has been created on the Codit Office 365 Azure AD, so our setup will be valid only for our corporate Power BI tenant. The Apiary setup is different: I imagine that the Apiary web application is enabled by default in every Office 365 Azure AD.


Register the Authorization Service

In this step we register the Authorization Service in Azure API Management and then set up our Power BI façade API to use the registered Authorization Service.

Here you can find the step-by-step procedure.

The main parameters to be set up are:

  • Authorization endpoint URL and Token Endpoint URL. 
  • The Resource we want to access on behalf of the authenticated user. 
  • ClientId and ClientSecret. Specify the values we got from Azure AD.

Authorization endpoint URL and Token endpoint URL. Go to Azure AD, select the application section and click the Endpoint button to access the endpoint details.

The Resource service (Power BI) parameter must be specified as a body parameter using application/x-www-form-urlencoded format. This is the value of the resource parameter

Common errors with wrong resource parameter are:

  • An error has occurred while authorizing access via PowerBI: invalid_resource AADSTS50001: Resource identifier is not provided. Trace ID: {Guid} Correlation ID: {Guid} Timestamp: {TimeStamp}
  • An error has occurred while authorizing access via PowerBI: invalid_request AADSTS90002: No service namespace named '{resourcename}' was found in the data store. Trace ID: {Guid} Correlation ID: {Guid} Timestamp: {timestamp}
  • An error has occurred while authorizing access via {APIM Authorization Server Name}

The "Register Authorization Server" procedure generates a redirect_uri that must be used to update the "Reply URL" value in the AD Web Application we set up in the previous step. If not, you'll get this error at the first tryout :

  • AADSTS50011: The reply address '{redirect_uri}' does not match the reply addresses configured for the application: {Guid}.

The last step is to configure our Power BI façade API to use the OAuth 2.0 authorization mechanism. Open the API, click on the Security tab and then check the OAuth 2.0 box and select the registered Authorization Server.

Try it out

The API Management built-in console allows us to quickly test the API. 

Choose an API operation and then select "Authorization Code" from the authorization dropdown to open the sign-in page provided by Azure Active Directory. Once you have signed in, the HTTP request will be updated with the Bearer token (the token value is not readable in the UI).

Now you can specify the desired values for the additional parameters, and click Send to test the API.

If you want to inspect the Bearer token, use Fiddler to intercept the Azure AD reply and then analyze the token content with all its properties and claims.

Developer Portal

As the Developer portal is a completely customizable CMS, you can set the look and feel to follow your branding strategy and add all the content you need to help drive API adoption.

In this case, I created a custom page ( dedicated to the Power BI APIs to collect some resources to facilitate the API intake like MSDN references and code samples.  


As you can see, you can very easily use Azure API Management to connect to Power BI APIs on behalf of an authenticated user.

Moving to API Management is not only like coming back home; it also lets us take advantage of APIM capabilities such as usage analytics, which give insight into the health and usage levels of the API and help identify key trends that impact the business.


Massimo Crippa

Categories: Azure
written by: Massimo Crippa

Posted on Tuesday, June 16, 2015 10:30 AM

by Tom Kerkhove

On June 11th, the second edition of ITPROceed was organised. In this blog post you will find a recap of the sessions that we followed that day.

ITPROceed took place at Utopolis in Mechelen. A nice venue! Parking in front of the building and a working wifi connection immediately set the pace for a professional conference. There were 20 sessions in total (5 tracks).

Keynote: Microsoft Platform State-of-the-union: Today and Beyond By Scott Klein

Scott Klein kicked off the day with the keynote and gave us a nice state-of-the-union.
Scott once again explained the many advantages of using the cloud and compared it with the old days.

A big advantage is the faster delivery of features, where in the past it could take 3 to 5 years to deliver a feature through service packs or new versions of software.

Why Microsoft puts Cloud first:

  • Speed
  • Agility
  • Proven
  • Feedback

Next to that Scott showed us several new features, SQL Server 2016, Azure SQL Data Warehouse, PolyBase, Data Lake, Windows 10,...

The world is changing for IT Professionals. This means lots of changes but also a lot of opportunities!

Are you rolling your environments lights-out and hands-free? by Nick Trogh

Nick gave us a lot of demos and showed us how we could spin up new application environments in a reliable and repeatable process. In this session we looked into tools such as Docker, Chef and Puppet and how you can leverage them in your day-to-day activities.

Demystifying PowerBI

Speaker - Koen Verbeeck

Koen gave us a short BI history lesson whereafter he illustrated what each tool is capable of, when you should use it and where you can use it. To wrap up he showed the audience what Power BI v2 (Preview) looks like and how easy it is to use. Great introduction session to (Power) BI!

If you're interested in Power BI v2, you can start here

Data Platform & Internet of Things: Optimal Azure Database Development by Karel Coenye

In this session Karel told us about several techniques to optimize databases in Azure, to get the most out of them and reduce cost.
With Azure SQL databases you need to think outside the box and optimise with the following principle:

Cross premise connectivity with Microsoft Azure & Windows Server by Rasmus Hald

Running everything in the cloud in the year 2015 is very optimistic. Often several systems are still running on premises and have not yet been migrated to the cloud. Network connectivity between the cloud and on premises is necessary!

Within Codit we already have experience with Azure networking, so it was very nice to follow the session and get more tips and tricks from the field.

Rasmus covered four topics:

1. How Windows Server & Microsoft Azure can be used to extend your existing datacenter.
2. How to use existing 3rd party firewalls to connect to Azure.
3. The new Azure ExpressRoute offering.
4. Windows Azure Pack.

Big Data & Data Lake

 Speaker - Scott Klein

SQL Scott was back for more, this time on BIG data! He kicked off with an introduction on how much data was processed to calculate Neil Armstrong's trip to the moon and how the amounts of data have evolved since.

Bigger amounts of data mean we need to be able to store them as well and as efficiently as possible, but also be able to work with that data. In some cases plain old databases are not good enough anymore. That's where Hadoop comes in.

The Hadoop ecosystem allows us to store big amounts of data across data nodes and to process that data with technologies such as Pig, Hive, Sqoop and others.

As we started storing more and more data, Microsoft started delivering Hadoop clusters in Azure, called HDInsight, which is based on the Hortonworks Data Platform. If you want to know more about HDInsight, have a look here.

Processing big data obviously requires the big data itself. During his talk Scott also covered Azure Data Lake, which was announced at //BUILD/ this year. There are several scenarios where you can benefit from Azure Data Lake: it's built to store your data in its raw format without any limitation, whether in size, type or anything else.

In the slide below you can see how you can use Azure Data Lake in an IoT scenario.

Just to clarify: Data Lake is not something Microsoft invented. It's a well-known concept in the data space that is more or less the opposite of Data Warehousing. If you want to learn more about Data Lake or its relation to Data Warehousing, read Martin Fowler's view on it.

Scott wrapped up his session with a demo on how you can spin up an Azure HDInsight cluster. After that he used that cluster to run a Hive query on your big files stored in an Azure Storage account as blobs.

Great introduction session to big data on Azure.

Securing sensitive data with Azure Key Vault by Tom Kerkhove

Speaker - Tom Kerkhove

In the closing session Tom introduced us to the concepts of Microsoft Azure Key Vault, which allows us to securely store keys, credentials and other secrets in the cloud.

Why you should use Azure Key Vault:

  • Store sensitive data in hardware security modules (HSM)
  • Gives back control to the customer
    • Full control over lifecycle and audit logs
    • Management of keys
  • Removes responsibility from developers
    • Secure storage for passwords, keys and certificates
    • Protect production data
Categories: Community
written by: Tom Kerkhove