In modern application development, enterprise applications live both on-premises and in the cloud, and companies want to integrate beyond their firewall, typically with SaaS-based applications or APIs exposed by third parties.
Besides integrating with different services or applications, many companies also want to expose data in a simple, fast and secure way through API-first development. Performance and a great experience are key to the success of your APIs. This is where Azure Search comes in...
For a data-sharing platform that we are currently building, we have a large number of files stored in Blob Storage. These files are ingested into the platform through different endpoints and protocols (HTTPS, FTP, AS2, ...). When a file is ingested, different types of metadata are extracted from its content and attached to the file before it is stored in Blob Storage. These metadata values are then made available for querying through APIs.
Our first implementation of the API queried Blob Storage directly to find the files matching the metadata filters provided in the API calls. We started noticing the limits of this implementation because of the large number of blobs in our storage container. The cause is the limited query capabilities of Blob Storage: we had to list the blobs and then filter them on their metadata inside our own implementation.
To optimize our searches and performance, we quickly introduced Azure Search into our implementation. To get things straight: Azure Search does not execute queries across all the blobs in Azure Blob Storage. Instead, it indexes the blobs in the storage account and puts a search layer on top of that index.
Azure Search is a search-as-a-service cloud solution that gives developers APIs and tools for adding a rich search experience over their content in web, mobile, and enterprise applications, all without managing infrastructure or needing to become search experts.
To use Azure Search there are a couple of steps you need to take. First of all, you need to provision the Azure Search service. The service is the scope of your capacity, billing and authentication, and it is fully managed through the Azure Portal or through the Management API.
Once your Azure Search service is provisioned, you need to define one or more indexes that are linked to your search service. An index is a searchable collection of documents, and it contains multiple fields that you can query.
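To make the idea of an index with queryable fields more concrete, here is a minimal sketch of an index definition in the JSON shape used by the Azure Search REST API. The index name `files-index` and the metadata fields `sender`, `receiver` and `documentType` are hypothetical examples, not the fields of our platform, and the attribute choices are assumptions you would tune for your own data.

```python
import json


def build_index_definition(index_name, metadata_fields):
    """Build an index definition as a plain dict (JSON-serializable)."""
    fields = [
        # An index needs one key field of type Edm.String that uniquely
        # identifies each document.
        {"name": "id", "type": "Edm.String", "key": True, "searchable": False},
    ]
    for name in metadata_fields:
        fields.append({
            "name": name,
            "type": "Edm.String",
            "searchable": True,   # usable in full-text queries
            "filterable": True,   # usable in filter expressions
            "sortable": True,
        })
    return {"name": index_name, "fields": fields}


# Hypothetical metadata fields, for illustration only.
index_definition = build_index_definition(
    "files-index", ["sender", "receiver", "documentType"]
)
print(json.dumps(index_definition, indent=2))
```

The resulting document is what you would send to the service when creating the index, whether through the portal, an SDK or the REST API.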
Once your index is created, you define an indexer that connects a data source (in our case Blob Storage) to the index. The indexer can run on a schedule, for example once every hour or several times an hour...
When everything is configured and up and running, you can start querying your indexed data through the Azure Search service.
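In Azure Search the schedule lives on an indexer, which ties a data source to a target index and expresses its interval as an ISO 8601 duration such as `PT1H`. The sketch below builds such a definition. The names `blob-indexer`, `blob-datasource` and `files-index` are hypothetical, and the range check reflects the documented limits of five minutes to one day as we understand them.

```python
def schedule_interval(minutes):
    """Convert a number of minutes into an ISO 8601 duration string."""
    # Assumption: the service accepts intervals from 5 minutes up to 1 day.
    if not 5 <= minutes <= 24 * 60:
        raise ValueError("interval must be between 5 minutes and 1 day")
    if minutes == 24 * 60:
        return "P1D"
    hours, rest = divmod(minutes, 60)
    duration = "PT"
    if hours:
        duration += f"{hours}H"
    if rest:
        duration += f"{rest}M"
    return duration


def build_indexer_definition(name, data_source, target_index, every_minutes):
    """Build an indexer definition as a plain dict (JSON-serializable)."""
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": target_index,
        "schedule": {"interval": schedule_interval(every_minutes)},
    }


# Hypothetical names; runs once every hour.
indexer = build_indexer_definition(
    "blob-indexer", "blob-datasource", "files-index", every_minutes=60
)
print(indexer["schedule"])
```

Running the indexer four times an hour would simply mean passing `every_minutes=15`, which yields the interval `PT15M`.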
Fitting it all together
Below you can find a simplified example of our initial implementation. You can see that we were querying Blob Storage directly and needed to fetch the attributes of every single blob and match them against the search criteria.
This is what our current high-level implementation looks like. We use the Azure Search engine with both queries and filters to immediately find what we need.
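As an illustration of that list-then-filter pattern, here is a minimal sketch that uses an in-memory list as a stand-in for the container listing. The `Blob` class, the `find_blobs` function and the metadata keys are hypothetical and are not our production code. In a real implementation every item would come from a call to the storage service, which is exactly what makes this approach slow.

```python
from dataclasses import dataclass, field


@dataclass
class Blob:
    """Stand-in for a blob as returned by a container listing."""
    name: str
    metadata: dict = field(default_factory=dict)


def find_blobs(blobs, criteria):
    """Return the names of blobs whose metadata matches every criterion."""
    matches = []
    for blob in blobs:  # every blob in the container is visited
        if all(blob.metadata.get(key) == value
               for key, value in criteria.items()):
            matches.append(blob.name)
    return matches


# Hypothetical container content, for illustration only.
container = [
    Blob("a.xml", {"sender": "contoso", "documentType": "invoice"}),
    Blob("b.xml", {"sender": "fabrikam", "documentType": "invoice"}),
    Blob("c.xml", {"sender": "contoso", "documentType": "order"}),
]
result = find_blobs(container, {"sender": "contoso", "documentType": "invoice"})
print(result)
```

The cost of each search grows with the total number of blobs in the container, regardless of how few of them match.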
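As a rough sketch of how a query and a filter combine, the snippet below builds the body of a search request with a search text and an OData filter expression. The field names and values are hypothetical, and the helpers `odata_equals` and `build_search_request` are illustrative names of our own, not part of any SDK.

```python
def odata_equals(field_name, value):
    """Build an OData equality clause, doubling single quotes in the value."""
    escaped = value.replace("'", "''")
    return f"{field_name} eq '{escaped}'"


def build_search_request(search_text, filters, top=50):
    """Build the JSON body of a search request as a plain dict."""
    request = {"search": search_text or "*", "top": top}
    if filters:
        # Sort the clauses so the generated expression is deterministic.
        clauses = [odata_equals(name, value)
                   for name, value in sorted(filters.items())]
        request["filter"] = " and ".join(clauses)
    return request


# Hypothetical metadata values, for illustration only.
request = build_search_request(
    "invoice", {"sender": "contoso", "receiver": "o'neil"}
)
print(request["filter"])
```

The search text is matched against the searchable fields, while the filter narrows the result set to exact metadata values, so the service only returns the documents we are interested in.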
Azure Search immediately gave us the performance we needed and was fairly easy to set up and use.
We struggled a bit to find our way around some of the limitations of the basic Azure Search query options, but quickly came to the conclusion that the Lucene query syntax provides the rich query capabilities we needed to search on the metadata.
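To give an idea of what the Lucene syntax adds, here is a minimal sketch that builds a fielded query with an optional prefix wildcard and selects the full query parser through `queryType`. The field names and values are hypothetical, the helper names are our own, and the sketch assumes search terms without whitespace.

```python
import re

# Characters with a special meaning in the Lucene query syntax.
_LUCENE_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_lucene(term):
    """Prefix every special character in a term with a backslash."""
    return _LUCENE_SPECIAL.sub(r"\\\1", term)


def fielded_term(field_name, term, prefix=False):
    """Build a clause such as sender:contoso or sender:cont* (prefix match)."""
    clause = f"{field_name}:{escape_lucene(term)}"
    return clause + "*" if prefix else clause


def build_lucene_request(clauses):
    """Combine clauses with AND and request the full Lucene query parser."""
    return {"search": " AND ".join(clauses), "queryType": "full"}


# Hypothetical metadata values, for illustration only.
lucene_request = build_lucene_request([
    fielded_term("sender", "cont", prefix=True),
    fielded_term("reference", "INV-2017/01"),
])
print(lucene_request["search"])
```

Escaping matters here because metadata values such as references often contain characters like `-` or `/`, which the Lucene parser would otherwise interpret as operators.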