Codit Blog

Posted on Monday, January 4, 2016 2:00 PM

by Glenn Colpaert

With the release of Cumulative Update 2 for BizTalk Server 2013 R2, Microsoft adds support for the SAP .NET Connector (NCo).

Why this change?

In August 2012 the SAP Integration and Certification Center informed SAP partners and ISVs that the classic RFC Library would no longer be supported after March 31, 2016, and that businesses should immediately start the transition to the new SAP NetWeaver Library.

This announcement also impacts Microsoft BizTalk Server environments. Microsoft therefore released hotfix packages that allow the Microsoft BizTalk Server SAP Adapter to use the new SAP NetWeaver Library via the SAP Connector for .NET.

Whitepaper

We recently released a whitepaper on the Codit website that describes the impact of migrating from the classic RFC Library to the new ‘SAP Connector for .NET’ libraries and the steps required to do so.

The whitepaper is available for download via /about-us/download-center/8279/end-of-sap-adapter-support-upgrade/.

Update for BizTalk 2013 R2

With the release of Cumulative Update 2 (CU2), Microsoft adds support for the .NET Connector in BizTalk Server 2013 R2.

The Microsoft BizTalk Server SAP Adapter has been re-engineered to support both the classic RFC SDK and the .NET Connector (NCo) through an option called ‘ConnectorType’ in the SAP binding. All of these changes are non-breaking for existing BizTalk Server installations.

You can download CU2 via https://support.microsoft.com/en-us/kb/3119352

More Information?

Would you like more information or advice on how to migrate to the new SAP Connector for .NET? Do not hesitate to contact me!

Cheers,

Glenn

Categories: BizTalk
written by: Glenn Colpaert

Posted on Wednesday, December 30, 2015 2:00 PM

by Tom Kerkhove

In my previous blog post I introduced a blog series in which we will analyse StackExchange data using Microsoft Azure Data Lake Store & Analytics.

Today I'll illustrate where we can download the StackExchange sample data & how we can upload and store it in the Data Lake Store by using PowerShell.

There are several options for data storage in Azure, each with a specific goal. For data analytics - especially with Azure Data Lake Analytics - Azure Data Lake Store is the de facto choice.

The StackExchange data is made available on Archive.org as zip-files. We will use an Azure VM to download it from the website, unzip every folder and upload it to the Store. Let us start!

Why do we use Azure Data Lake Store over Azure Blob Storage?

Before we start, you might ask why we are using Azure Data Lake Store instead of Azure Blob Storage.

The reason is very simple - We are planning to store a decent amount of data and perform analytics on it with Azure Data Lake Analytics.

While Azure Blob Storage can be used with Azure Data Lake Analytics, it is recommended to use Azure Data Lake Store instead. The service is built for running analytical workloads on top of it and is designed to scale along with its load.

Azure Data Lake Store also offers unlimited storage, with no limits at the file or account level, while this isn't the case for Azure Storage.

However - Storing all your data in Azure Blob Storage will be a lot cheaper than storing it in Azure Data Lake, even when you are using Read-Access Geographically Redundant Storage (RA-GRS).

These are only some of the differences; the two services also differ in areas such as access control and encryption.

To summarize - There is no silver bullet. It depends on your scenario and on how much data you want to store. My suggestion: if you'll be doing big data processing in Azure, use Azure Data Lake Store!

If for some reason you decide that the store you've picked doesn't fit your needs, you can still move your data with tools like Azure Data Factory or PowerShell.

Note - During the public preview, Azure Data Lake Store is cheaper, but keep in mind that preview pricing is 50% of the GA pricing.

Preparing our environment

For this phase we'll need to provision two resources: A new Azure Data Lake Store account & an Azure VM in the same region.

But do I really need an Azure VM?

Benefits of using an Azure VM

It is also possible to do everything locally, but I personally recommend using a VM because we can more easily let it run overnight and it will be faster.

It allows us to download a file of 28 GB inside the Azure datacenter, unzip 250+ folders overnight and upload 150 GB to the Store. This means that we will only pay for 28 GB of ingress instead of 150 GB; however, you do need to take the cost of the VM into account.

You will only benefit from this if the resources are allocated within the same region; otherwise Azure will charge you for 150 GB of egress & ingress.

Provisioning a new Data Lake Store

To provision a Data Lake Store resource, browse to the Azure portal and click on 'New > Data + Storage > Data Lake Store (Preview)'.

Give it a self-describing name, assign a resource group and location, and click 'Create'.
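
If you prefer scripting over clicking through the portal, the account can also be provisioned with the AzureRm cmdlets. The snippet below is only a minimal sketch; the resource group name and location are assumptions that you should replace with your own values.

# Create a resource group to hold the account (skip this if you already have one)
New-AzureRmResourceGroup -Name "stackexchange-demo" -Location "East US 2"

# Provision the Data Lake Store account in that resource group
New-AzureRmDataLakeStoreAccount -ResourceGroupName "stackexchange-demo" -Name "codito" -Location "East US 2"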

After a couple of minutes, the Store will be created and you should see something similar to this.

As you can see it includes monitoring of the total storage utilization, has a special ADL URI that points to your account and has a Data Explorer. The latter allows you to browse through the data stored in your account.

At the end of this article you should be able to navigate through all the contents of the data dump.

Provisioning & configuring a VM

Last but not least, we'll provision a new Azure VM in which we will download, unzip & upload all the data.

In the Azure Portal, click 'New > Compute' and select a Windows Server template of your choice. Here I'm using the 'Windows Server 2012 R2 Datacenter' template.

Assign a decent host name, user name & solid password and click 'Create'.

We will also add an additional data disk to the VM on which we will store the unzipped data as the default disk is too small.

To do so, navigate to the VM we've just provisioned and open the 'Settings' blade.

Select 'Disks', click on 'Attach New' and give it a decent name. 

We don't need to increase the default value as 1024 GB is more than enough.

Once the disk is added it will show up in the overview. Here you can see my stackexchange-data.vhd data disk.

Now that the disk is added we can connect to the machine and prepare it by formatting the disk and giving it a decent name.
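
If you'd rather script that preparation step than use Disk Management, something along these lines should do the trick on Windows Server 2012 R2. This is only a sketch; the disk number and volume label are assumptions, so check the output of Get-Disk first.

# List the disks and find the raw data disk we just attached
Get-Disk

# Initialize the new disk (assumed to be disk 2 here), create one partition spanning the disk and format it
Initialize-Disk -Number 2 -PartitionStyle GPT
New-Partition -DiskNumber 2 -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "stackexchange-data" -Confirm:$false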

Now that we have a Data Lake Store account and a VM, we are ready to work with the data set.

Retrieving the StackExchange data

StackExchange has made some of their data available on Archive.org, allowing you to download data about all of their websites.

The website provides several options for downloading everything, ranging from a torrent to individual zips to one large zip.

I personally downloaded everything in one zip and two additional files - Sites.xml & SitesList.xml.

As you can see, I've stored all the information on the new data disk that we added to the VM.

Extracting the data

Time to unzip the large file into individual zip files per website; to do so you can use a tool such as 7-Zip.

Once it's done, it should look similar to this.

Next up - unzipping all the individual websites. It is recommended to select all the zip-files and unzip them at once.
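
If you'd rather script the extraction than click through the 7-Zip GUI, a small loop can extract every archive into its own folder. This is only a sketch; the 7-Zip install path and the F:\ locations are assumptions based on my setup.

# Path to the 7-Zip command-line executable (assumption: default install location)
$sevenZip = "C:\Program Files\7-Zip\7z.exe"

# Extract every downloaded archive into a folder with the same name on the data disk
Get-ChildItem "F:\2015-August-Stackexchange\*" -Include *.7z, *.zip | ForEach-Object {
    $destination = Join-Path "F:\2015-August-Stackexchange\Extracted" $_.BaseName
    & $sevenZip x $_.FullName "-o$destination" -y
}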

Grab a couple of coffees because it will take a while.

You should end up with around 150 GB of data, excluding the zip-files.

So what kind of data do we have?!

Looking at the data

Now that we have unwrapped all the data we can have a look at what data is included in the data dump.

As mentioned before, the zip contains a folder for each StackExchange website, including all the meta-websites.
Each folder contains all the relevant data for that specific website, ranging from users & posts to comments, votes and beyond.

Here is all the data that is included for coffee-stackexchange-com in this example:

+ coffee-stackexchange-com
    - Badges.xml
    - Comments.xml
    - PostHistory.xml
    - PostLinks.xml
    - Posts.xml
    - Tags.xml
    - Users.xml
    - Votes.xml

However, there is one exception - since StackOverflow is so popular, there is a lot more data and thus bigger files. Because of this, each of its files is split out into a dedicated folder.

Here is an overview of how the data is structured:

+ stackapps-com
    - Badges.xml
    - ...
    - Votes.xml
+ stackoverflow-com-badges
    - Badges.xml
+ stackoverflow-com-...
+ stackoverflow-com-votes
    - Votes.xml
+ startups-stackexchange-com
    - Badges.xml
    - ...
    - Votes.xml

With that structure in mind, let's have a look at how we can upload the data to Azure.

Uploading to Azure with PowerShell

In order to upload all the data, it is a good idea to automate the process. Luckily, Azure provides a lot of PowerShell cmdlets that allow you to do just that.

For our scenario I've created a script called ImportStackExchangeToAzureDataLakeStore.ps1 that will loop over all the extracted folders & upload all its files to a new directory in Azure Data Lake Store.

Although it's a simple script, I'll walk you through some of the interesting commands it uses.

In order to interact with Azure Data Lake Store from within PowerShell we need to use the Azure Resource Manager (AzureRm) cmdlets.

To do so we first need to authenticate, assign the subscription we want to use and register the Data Lake Store provider.

# Log in to your Azure account
Login-AzureRmAccount

# Select a subscription 
Set-AzureRmContext -SubscriptionId $SubscriptionId

# Register for Azure Data Lake Store
Register-AzureRmResourceProvider -ProviderNamespace "Microsoft.DataLakeStore" 

With the Test-AzureRmDataLakeStoreItem command we can check whether a specific path, i.e. a folder or file, already exists in the account.

$FolderExists = Test-AzureRmDataLakeStoreItem -AccountName $DataLakeStoreAccountName -Path $DataLakeStoreRootLocation

If the specified path does not exist, we can create it in the store with the New-AzureRmDataLakeStoreItem command.

New-AzureRmDataLakeStoreItem -AccountName $DataLakeStoreAccountName -Folder $DestinationFolder

In our scenario we combine these two commands to check if the folder per website, i.e. coffee-stackexchange-com, already exists. If this is not the case, we will create it before we start uploading the *.xml-files to it.

Uploading is just as easy: call Import-AzureRmDataLakeStoreItem with the local path to the file and the destination where it should be stored.

Import-AzureRmDataLakeStoreItem -AccountName $DataLakeStoreAccountName -Path $FullFile -Destination $FullDestination
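
Putting those commands together, the core of the loop looks roughly like the sketch below. It is a simplified version that assumes the variables from the invocation further down ($DumpLocation, $DataLakeStoreAccountName, $DataLakeStoreRootLocation) are already in place.

# Loop over every extracted website folder and upload its XML files to the store
foreach ($folder in Get-ChildItem $DumpLocation -Directory) {
    $destinationFolder = "$DataLakeStoreRootLocation/$($folder.Name)"

    # Create the destination folder in the store if it doesn't exist yet
    $folderExists = Test-AzureRmDataLakeStoreItem -AccountName $DataLakeStoreAccountName -Path $destinationFolder
    if (-not $folderExists) {
        New-AzureRmDataLakeStoreItem -AccountName $DataLakeStoreAccountName -Folder $destinationFolder
    }

    # Upload every *.xml file in the folder
    foreach ($file in Get-ChildItem $folder.FullName -Filter *.xml) {
        Import-AzureRmDataLakeStoreItem -AccountName $DataLakeStoreAccountName -Path $file.FullName -Destination "$destinationFolder/$($file.Name)"
    }
}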

That's it, that's how easy it is to interact with Azure Data Lake Store from PowerShell!

To start it, we simply call the function and pass in some metadata: the subscription we want to use, the name of the Data Lake Store account, where we want to upload the data in the store, and where our extracted data is located.

C:\Demos > Import-StackExchangeToAzureDataLakeStore -DataLakeStoreAccountName 'codito' -DataLakeStoreRootLocation '/stackexchange-august-2015' -DumpLocation 'F:\2015-August-Stackexchange\' -SubscriptionId '<sub-id>'

While it's running, you should see it go through all the folders and upload the files to Azure Data Lake Store.

Once the script is done we can browse through all our data in the Azure portal by using the Data Explorer.

Alternatively, you could also upload everything to Azure Blob Storage with ImportStackExchangeToAzureBlobStorage.ps1.

Conclusion

We've seen how we can provision an Azure Data Lake Store account and how we can use infrastructure in Azure to download, unzip and upload the StackExchange data to it. We've also had a look at how the dump is structured and what data it contains.

I've made my scripts available on GitHub so you can test them out yourself!
Don't forget to turn off your VM afterwards...

In the next blog post we will see how we can aggregate all the Users.xml data into one CSV file with Azure Data Lake Analytics by writing a U-SQL script. This will allow us to analyze the data later on, before we visualize it.

If you have any questions or suggestions, feel free to write a comment below.

Thanks for your time,

Tom.

Categories: Azure
Tags: Data Lake
written by: Tom Kerkhove

Posted on Wednesday, December 30, 2015 2:00 PM

by Tom Kerkhove

As of Wednesday the 28th of October, Azure Data Lake Store & Analytics are in public preview, allowing you to try them out yourself. You won't have to worry about any clusters, which lets us focus on our business logic!

To celebrate this, I'm writing a series that will take you through the process of storing the data in Data Lake Store, processing it with Data Lake Analytics and visualizing the gained knowledge in Power BI.

 

I will break up the series into four major parts:

  1. Storing the data in Azure Data Lake Store or Azure Storage
  2. Aggregating the data with Azure Data Lake Analytics
  3. Analyzing the data with Azure Data Lake Analytics
  4. Visualizing the data with Power BI

During this series we will use open-source data from StackExchange.
This allows us to deal with real-world data and the difficulties it might cause.

In my next post I'll walk you through the steps to upload the data and how we can do this in a cost-efficient way.

Thanks for reading,

Tom Kerkhove.

Categories: Azure
written by: Tom Kerkhove

Posted on Monday, December 21, 2015 9:30 AM

by Massimo Crippa

A successful API is characterized by well-defined communication of its contents and formats, clear documentation and transparent terms of usage.
The developer portal is where all this information is collected; it is a key element in ensuring good API adoption, as well as in managing the interaction with developers and giving them insight into their API consumption.
In this post we will see 10 important things to keep in mind when setting up an effective developer portal, along with some tips about Azure API Management.

#1 – Explain your offer

Keep it simple and explain the value of your APIs in a clear and compelling way. The home page of the dev portal should provide a clear answer to “what are you offering”, without visitors having to struggle to figure out what the API does. Another question to be answered is what can be built with the API; hence, also consider creating a section dedicated to use cases.

APIM tip: APIM comes with a full CMS with a high degree of customization. Plan your dev-portal concept and leverage the CMS by creating additional pages, sections and custom assets, and by introducing different style layers.

#2 - Developer On-boarding

The on-boarding procedure through which a developer gains access to the API must be intuitive and easy. Developers want to be productive quickly and be able to try out your API in minutes. In order to speed up the on-boarding process, enable different identity providers like Google, Microsoft and Twitter.

APIM tip: If feasible, consider adding the possibility to try out your APIs, or part of them, without registration. This is a nice-to-have; I don't want to register and get a key for every API I want to try.

#3 – Getting started

Again, the developer portal is all about enabling developers to hit the ground running on day one and be productive immediately.

It’s mandatory to provide a tutorial that guides users through the authentication process and shows how to perform API calls via the SDK or directly from code. Those tutorials should not be limited to simple calls and might cover complete use cases, like “user registration” or how an “eCommerce transaction” works. Keep in mind the type of audience and their preferences.

APIM tip: Create a dedicated page/section and link to this section directly from the home page.

#4 – Add discoverable metadata endpoint(s)

Nowadays there are different formats to describe an API. Swagger (reborn as OADF with the Open API Initiative) is the de facto standard, but others like API Blueprint (Apiary) and RAML are emerging.

Publishing metadata endpoints in different formats can ease the intake of your API. Consider using the Swagger JSON as input for the apitransformer tool to generate API Blueprint and RAML definitions. The downside is that you have to maintain those different outputs.

Description languages are great integration enablers, but don't forget that the most effective documentation is the one written by humans.

APIM tip: APIM exposes the Swagger and WADL definitions in a secure way via the admin portal. The disadvantage is that we cannot virtualize them to make them discoverable. Consider downloading the Swagger JSON file and re-publishing it in the dev portal.

#5 - Explain headers

HTTP headers are used for multiple purposes, such as describing the media type of the exchanged entity, managing versioning or, for example, acting as a transaction parameter. There are headers that apply only to requests and others only to responses, headers that have generic applicability and headers that are bound to the resource we’re requesting, standard headers and custom headers.

In this jungle it’s easy to lose your way.

It’s important to describe all the headers and their applicability: which are mandatory and which are optional, the values we can expect in the response and how to use them. Use a standardized approach and don't document the same thing multiple times.

APIM tip: Create a custom section dedicated to the headers and customize the operation template to link to the headers' documentation, so the extensive description is easy to reach.

#6 – Explain authorization and authentication

There are several ways to implement authentication and authorization in a RESTful context. OpenID Connect, basic authentication, shared access keys, certificates and asymmetric keys are some examples.

Some of those are implemented using headers, others with different mechanisms. No matter which type of AuthN/AuthZ you adopt, you should explain in detail the model you are using and how to obtain the access credentials.

APIM tip: It's recommended to have a dedicated section about authentication and authorization.

#7 – Status codes, example requests and responses

Proper documentation not only helps with the API intake, it's also a key factor in retaining developers. Pay attention to enriching the API reference with request and response documentation for all the functionality you provide.

APIM tip: APIM provides out of the box the possibility to add different status codes with different representations. It’s a matter of configuring them and keeping them in sync with your API versions. All this information is displayed on the “Operation details” page.

#8 – SDKs, code samples

The aim is to give developers the tools to easily connect to your APIs. Code samples and SDKs are the painkillers of every API intake, especially when your API deals with complex data structures and binary objects.

On the other hand, building an effective SDK and libraries in multiple languages may require a huge investment, especially for a public API that also targets mobile devices.

APIM tip: APIM provides code samples to directly invoke the API in different languages. Use the “code samples” template to customize the way the code is generated. Publish all your libraries and SDKs on GitHub. If you’re targeting .NET shops, AutoRest is a good choice for generating client SDKs.

#9 – Support page

The goal is to provide a great developer experience, preventing loss of productivity and customer frustration. Configure a central point for your support service so problems can be addressed right away. Support can be delivered over the phone, via live chat, a ticketing system, social networks and many other channels.

APIM tip: The out-of-the-box issue management can be combined with custom pages to organize other channels (email, social networks, etc.).

#10 – Communication and Change log

Attract, engage and retain external developers, consumers and the community as the business matures by establishing effective and open communication through the API developer portal. Provide a tool to discuss your product (and your documentation), give feedback and submit feature requests. Create a section to collect the latest news about your API and related articles and, last but not least, to publish the change log.

APIM tip: A blog is a great way to deepen the connection with the developer community. Consider activating the blog feature and enabling the RSS feed for automatic syndication. Put your documentation on GitHub to facilitate external contributions.

Conclusion

It's all about providing a great developer experience. Give your API a compelling front door and boost it to success!

Cheers,

Massimo

Categories: Azure
written by: Massimo Crippa

Posted on Thursday, December 10, 2015 5:00 PM

Operations Management Suite was announced at Ignite 2015. This blog post demonstrates how you can use Microsoft Operations Management Suite and Azure Automation to monitor your BizTalk Server environment and alert you about problems.

What is Microsoft OMS?

Operations Management Suite is a unified IT management solution that was released in 2015. It's simple to set up, always up to date, and connects to your on-premises datacenters and cloud environments. You get a single, powerful, integrated portal that provides instant access to critical information. You can collect massive amounts of machine data, then analyze and search across all your workloads and servers - no matter where they are.

OMS is not a replacement for the System Center products; in fact, OMS can extend your existing System Center investments.

At the time of writing, OMS contains four main services:

  • Log Analytics: Real-time operational intelligence. Deliver unified management across your datacenters and public clouds. Collect, store and analyze log data from virtually any source and turn it into real-time operational intelligence.
  • Automation: Simplified cloud management with process automation. Create, monitor, manage and deploy resources in your hybrid cloud environments while reducing errors and boosting efficiency to help lower your operational costs.
  • Availability: Fully integrated availability solution including rapid disaster recovery. Protect your data using capabilities only possible from the cloud. Enable backup and integrated recovery for all your servers and critical applications, to prepare you in the event of a disaster.
  • Security: Centralized control of server security. Identify missing system updates and malware status. Collect security related events and perform forensic, audit and breach analysis. Glean machine data from all your servers, no matter where they are, and receive deep analytics to react fast to issues.

History

Operations Management Suite is not a brand new product; Microsoft started this service under the name System Center Advisor. System Center Advisor paved the road for Azure Operational Insights. This year, Azure Operational Insights gained several features and finally became Microsoft Operations Management Suite. The former Intelligence Packs are now named Solutions in the portal.

Pricing

If you dig into the details you will notice there is a free tier available, and this tier will also remain free in the future!
It includes an upload limit of 500 MB a day and 7 days of data retention.

Microsoft recently released an OMS calculator to budget the OMS licenses and commitment needed to use OMS. It also compares the two purchasing options: as an add-on to System Center or as a standalone service. You can find it here: http://omscalculator.azurewebsites.net

Operations Console

Operations Management Suite can be accessed through the mobile app or the web portal.

I'm not going to dig into the available solutions; there is already a lot of content out there covering these features.

Monitoring BizTalk receive locations

Installation

Installation? No need to install any infrastructure! OMS onboarding is a simple process that can be completed within 5 minutes.
Just go to www.microsoft.com/OMS and follow the onboarding steps:

Leverage OMS LOG search and find critical events

After installation you can leverage the Log Search function to find critical events.

Query to find receive locations that were shut down: Type=Event (EventLevelName=error) (EventID=5649)

 

Using Azure Automation and the OMS search API to send an email alert for a specific event

On GitHub, a PowerShell module for Azure Automation has been created that helps you execute queries against OMS. Import the PowerShell module and you can start creating runbooks that connect to the OMS Search API.
The output is a set of objects that can be used to build the content of an email notification with Azure Automation.
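
A runbook that ties this together could look roughly like the sketch below. Note that this is only illustrative: the query cmdlet (Invoke-OMSSearchQuery) and the asset names are assumptions, so check the module's documentation on GitHub for the exact cmdlet and parameter names it exposes.

# Illustrative runbook sketch - cmdlet and asset names are placeholders
# Grab the credentials and OMS connection details stored as Azure Automation assets (assumed names)
$smtpCredential = Get-AutomationPSCredential -Name "SmtpCredential"
$omsConnection = Get-AutomationConnection -Name "OMSConnection"

# Query OMS for receive locations that were shut down (hypothetical cmdlet name)
$results = Invoke-OMSSearchQuery -OMSConnection $omsConnection -Query "Type=Event (EventLevelName=error) (EventID=5649)"

# Send a notification mail when the query returns results
if ($results)
{
    Send-MailMessage -To "biztalk-admins@contoso.com" -From "oms-alerts@contoso.com" `
        -Subject "BizTalk receive location shut down" `
        -Body ($results | Out-String) `
        -SmtpServer "smtp.contoso.com" -Credential $smtpCredential -UseSsl
}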

At the moment of writing, OMS Alerting has been released to public preview; this provides out-of-the-box alerting based on queries. Find out more at:

http://blogs.technet.com/b/momteam/archive/2015/12/02/announcing-the-oms-alerting-public-preview.aspx

This is only a small example of the power of OMS; be sure to check out the other available features!