LangChain: Creating a Summary App

In the third part of this blog series on LangChain, we will create an app that analyses the user’s question and performs actions based on this analysis. Specifically, this app will either conduct a cognitive search or summarize a specific file. We will also demonstrate how to upload a file to blob storage, chunk it, and upload the chunks to your index. This post assumes that you already know how to create a basic API in Python.

Requirements: Deployed Azure LLM, Deployed Embedding AI, Azure Search Service, Azure Blob Container.

Uploading and chunking a file

The first step is to understand how to chunk and upload a file. Chunking refers to the process of dividing the file into blocks of text. The cognitive search function will scan through these blocks and return those it deems relevant to the given question.

Our first task is to create an asynchronous function that uploads files. This function requires two arguments: the content (the actual file) and the blob_name (the file name). Begin by establishing a connection to the blob and uploading the file as usual.

This means a blob container must already be in place, which is why it is listed in the requirements of this guide.
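
As a minimal sketch of such an upload function (not the original listing): the third parameter, container_client, is an assumption added here so the function can be exercised without Azure credentials. It is expected to behave like the async ContainerClient from azure.storage.blob.aio, whose upload_blob method takes a name and the data. FakeContainerClient is a hypothetical in-memory stand-in.

```python
import asyncio


async def upload_file(content: bytes, blob_name: str, container_client) -> None:
    """Upload `content` to the container under the name `blob_name`.

    `container_client` is an assumed parameter: any object with an async
    `upload_blob(name=..., data=..., overwrite=...)` method, such as
    azure.storage.blob.aio.ContainerClient.
    """
    # overwrite=True replaces an existing blob that has the same name.
    await container_client.upload_blob(name=blob_name, data=content, overwrite=True)


class FakeContainerClient:
    """Hypothetical in-memory stand-in, used only to run the sketch."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    async def upload_blob(self, name: str, data: bytes, overwrite: bool = False) -> None:
        if name in self.blobs and not overwrite:
            raise ValueError(f"blob {name!r} already exists")
        self.blobs[name] = data


fake_client = FakeContainerClient()
asyncio.run(upload_file(b"hello", "notes.txt", fake_client))
```

In a real deployment you would pass a client created for your own storage account and container instead of the fake.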

Next, prepare the objects needed to chunk the data and upload it. As discussed in the previous blog post, this step requires an embeddings object and a text splitter for dividing the data. When creating the text splitter, specify the chunk size and the overlap. These values determine how big the information blocks are and how much text adjacent blocks share. You may need to experiment with these values based on the files you upload. Note that there are different ways to split up documents: this one splits by chunk size, while others split by sentence or by headers.
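
To make the effect of chunk size and overlap concrete, here is a small pure-Python illustration. It is a simplification, not LangChain's splitter (which also tries to break on separators such as paragraphs), and the function name split_text is made up for this example.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut `text` into pieces of at most `chunk_size` characters.

    Each piece starts `chunk_size - chunk_overlap` characters after the
    previous one, so neighbouring pieces share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

A larger overlap makes it less likely that a sentence is cut off from its context, at the cost of storing and embedding more text.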

Now, load the file again from your blob container to ensure it’s in the correct format and split it using the text splitter’s split_documents method. Store the result in a new variable. The file content has now been divided into parts.

Finally, create the vector store and add the split document to it. The chunked documents are now uploaded to your index and can be queried by asking regular questions.
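
The load, split and store steps can be sketched as one small function. This is an illustration under assumptions rather than the original code: the three arguments are duck-typed and only need the methods load, split_documents and add_documents, which are the method names LangChain's document loaders, text splitters and vector stores expose. The Fake classes are hypothetical stand-ins so the sketch runs without any Azure resources.

```python
def index_file(loader, splitter, vector_store) -> int:
    """Load a file, split it into chunks and add the chunks to the index.

    Returns the number of chunks that were added.
    """
    documents = loader.load()                     # file -> list of documents
    chunks = splitter.split_documents(documents)  # documents -> smaller chunks
    vector_store.add_documents(chunks)            # embed and store the chunks
    return len(chunks)


# Hypothetical in-memory stand-ins, only here to make the sketch runnable.
class FakeLoader:
    def load(self) -> list[str]:
        return ["first half. second half."]


class FakeSplitter:
    def split_documents(self, documents: list[str]) -> list[str]:
        # Split on full stops; a real splitter uses chunk size and overlap.
        return [
            part.strip() + "."
            for doc in documents
            for part in doc.split(".")
            if part.strip()
        ]


class FakeVectorStore:
    def __init__(self) -> None:
        self.stored: list[str] = []

    def add_documents(self, documents: list[str]) -> None:
        self.stored.extend(documents)


store = FakeVectorStore()
count = index_file(FakeLoader(), FakeSplitter(), store)
```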

Conversing with your data

This application is built on the premise that a user enters a question in the front-end interface, and that this question is passed to the backend along with the chat history. This process, although seemingly straightforward, presents a few challenges. For instance, a single question could potentially yield multiple results. Determining the user’s intent, and from it the appropriate action to take in the code, is therefore crucial. One simple yet effective solution to this problem is to ask the AI.

After creating the LLM object, you can ask the AI to ascertain the intention behind the question. The process is very similar to the one outlined in the first guide: you create a prompt template and a chain, taking care to word the prompt precisely. In this prompt, we instruct the AI to return the word ‘file’ if it detects a file name in the question, and ‘other’ otherwise. You can refine this further by asking the AI for a more specific intention if required, but avoid overloading a single prompt template. These AI queries run quickly, so you need not worry about noticeably slowing down the application. Both the intent analyzer and the extraction of the file name (see below) take only a fraction of a second, which is negligible compared to the complete process: summaries of larger files can take 15 to 20 minutes.
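
A rough sketch of the intent check follows. The prompt wording is an assumption, not the prompt used in the original app, and ask_llm is a hypothetical callable that sends a prompt to the model and returns its text reply (in the app this role is played by the prompt template and chain). Anything other than a clean ‘file’ answer is treated as ‘other’, so an unexpected reply falls back to the regular search.

```python
from typing import Callable

# Assumed wording; adjust it to your own model and use case.
INTENT_PROMPT = (
    "You classify user questions.\n"
    "Reply with the single word 'file' if the question mentions a file name, "
    "otherwise reply with the single word 'other'.\n\n"
    "Question: {question}\n"
    "Answer:"
)


def detect_intent(question: str, ask_llm: Callable[[str], str]) -> str:
    """Return 'file' or 'other' based on the model's reply."""
    raw = ask_llm(INTENT_PROMPT.format(question=question))
    # Models often add whitespace, quotes or a trailing full stop.
    answer = raw.strip().strip("'\".").lower()
    return "file" if answer == "file" else "other"


# Usage with a stand-in model that always answers " File."
print(detect_intent("Summarize report.pdf", lambda prompt: " File.\n"))
# file
```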

Once you have determined the exact intent of the question, you can control the flow of your code accordingly. If it’s a typical question, you can redirect it to a standard cognitive search as seen in the previous guide. However, if the user requests a summary of a specific file, you will need to extract the file name from the question. Please note that this assumes the user correctly inputs the file name. Again, extracting the file name involves a simple query to the AI.

After this, load the file and pass it to the function that creates the summary.
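
The control flow described above could look roughly like this. Every callable passed in (ask_llm, search, load_file, summarize) is a hypothetical placeholder for the corresponding part of your own backend, and the prompt text is an assumption.

```python
from typing import Callable

# Assumed wording for the file name extraction prompt.
FILE_NAME_PROMPT = (
    "Extract the file name from the question below. "
    "Reply with the file name only.\n\n"
    "Question: {question}\n"
    "File name:"
)


def answer_question(
    question: str,
    intent: str,
    ask_llm: Callable[[str], str],
    search: Callable[[str], str],
    load_file: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Route a question to the cognitive search or to the summarizer."""
    if intent != "file":
        # Typical question: answer it with the regular search.
        return search(question)

    # Summary request: ask the model which file the user means.
    reply = ask_llm(FILE_NAME_PROMPT.format(question=question))
    file_name = reply.strip().strip("'\"")
    text = load_file(file_name)
    return summarize(text)
```

Passing the collaborators in as arguments keeps the routing logic easy to test with simple stand-ins before it is wired to real services.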

To create a summary of the desired file, we need to follow a series of steps:

  1. First, we need two separate chains. The “Map Chain” is applied to each chunk of the document and produces a summary of that chunk. The “Reduce Chain” merges a set of summaries into a single summary.
  2. Next, we use the Reduce Chain to create a “Combine Documents Chain”. This chain is used to combine the summaries into one document.
  3. Following that, we use the Combine Documents Chain to create a “Reduce Documents Chain”. This chain is used to further reduce the combined document into a shorter summary, which matters when the chunk summaries together are still too long for a single prompt.
  4. Then, we create a “Map Reduce Chain” that uses both the Map Chain and the Reduce Documents Chain. This chain will summarize the chunks of the document and then reduce those summaries into a final summary.
  5. Once the chains are created, we use a text splitter to split the document that we want to summarize into chunks. These chunks are then processed by the Map Reduce Chain.
  6. Finally, we run the Map Reduce Chain on the split document. The chain will process each chunk of the document and gradually reduce them into a final summary.

This process might seem complicated at first, but it essentially involves creating a series of “chains”, each performing a specific task in the summarization process. The final result is a concise summary of the original document.
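
To show the idea behind these chains without any library, here is a plain Python sketch of map-reduce summarization. It is not LangChain's implementation: summarize is a hypothetical callable that asks the model to summarize a piece of text, and max_chars is a crude stand-in for the model's token limit.

```python
from typing import Callable


def group_by_size(texts: list[str], max_chars: int) -> list[list[str]]:
    """Pack consecutive texts into groups of at most `max_chars` characters."""
    groups: list[list[str]] = []
    current: list[str] = []
    size = 0
    for text in texts:
        if current and size + len(text) > max_chars:
            groups.append(current)
            current, size = [], 0
        current.append(text)
        size += len(text)
    if current:
        groups.append(current)
    return groups


def map_reduce_summary(
    chunks: list[str],
    summarize: Callable[[str], str],
    max_chars: int = 4000,
) -> str:
    """Summarize every chunk, then merge the summaries into one."""
    if not chunks:
        return ""

    # Map step: one summary per chunk.
    summaries = [summarize(chunk) for chunk in chunks]

    # Collapse step: while the summaries do not fit into one prompt,
    # summarize them group by group.
    while len(summaries) > 1 and sum(len(s) for s in summaries) > max_chars:
        groups = group_by_size(summaries, max_chars)
        collapsed = [summarize("\n\n".join(group)) for group in groups]
        if len(collapsed) >= len(summaries):
            break  # no progress, avoid looping forever
        summaries = collapsed

    # Final reduce step: merge what is left into the final summary.
    return summarize("\n\n".join(summaries))
```

The number of model calls grows with the number of chunks, which is the main reason summaries of large files take much longer than a search.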

One important consideration is the time it takes to generate summaries. While a cognitive search or checking for the intent of a question is relatively quick, creating summaries can be time-consuming. For instance, a text containing about 50,000 words may take about five minutes to summarize, while larger texts may take significantly longer. The largest text I tested was 277,000 words and took about 15 to 20 minutes. The results are good, though, if the prompt is written correctly. I know every detail of those 277,000 words, and only rarely did small mistakes sneak into the summary.

In conclusion, this guide presents a comprehensive approach to building a summarizing application utilizing Python and Azure services. The application’s functionality ranges from uploading and chunking files to conducting cognitive searches and summarizing specific files.

The guide provides a step-by-step process of how to establish a connection with the blob, upload the file, prepare the objects required for chunking the data, and upload it. It also explores how to determine the user’s intent, extract the file name from the question, and generate a summary for the chosen document.

One key takeaway is that while some operations such as cognitive searches or checking for the intent of a question are relatively quick, creating summaries for larger files can be time-consuming. However, the quality of the output justifies the time spent.

As AI continues to evolve, applications such as the summarizing app discussed here will continue to simplify data analysis and information retrieval, making it easier and faster for users to access and understand large volumes of data.

Thanks for reading!
