Design to benefit from resubmit!
The resubmit functionality creates a new instance of the Logic App by firing the same trigger message as the one that started the original, failed run. Depending on the type of trigger, this may or may not be helpful. However, you can design to benefit from resubmit. Let's have a closer look!
Consider a Logic App that kicks off on a configured time interval. It iterates through all files in the folder, parses them into XML, executes a transformation, sends them to the output folder and eventually deletes them. If something goes wrong, resubmit will not help you. The two main reasons are:
- The trigger message does not contain the data you act upon, so resubmitting does not guarantee that you re-process the same message.
- One Logic App run handles multiple messages, so it's not possible to resubmit just one of them.
Let's adjust this into a more convenient design. Here we use the file trigger, which ensures that one Logic App run handles only one message. The trigger also contains the payload of the message, so a resubmit guarantees that the same data will be reprocessed. Now we can fully benefit from the resubmit function.
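To make the difference between the two designs concrete, here is a small Python model of the reasoning. It is not Logic Apps code: the names `Run`, `process_polled_folder` and `process_file_trigger` are hypothetical, and uppercasing the content simply stands in for the parse and transform steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Run:
    """A stored run: a resubmit replays exactly this trigger payload."""
    trigger_payload: str


def process_polled_folder(folder: List[str]) -> List[str]:
    # Timer-triggered design: the trigger carries no data, so every run
    # (including a resubmitted one) reads whatever is in the folder now.
    return [content.upper() for content in folder]


def process_file_trigger(run: Run) -> str:
    # File-triggered design: the run works on the payload stored with it.
    return run.trigger_payload.upper()


folder = ["order-1"]
original_timer_result = process_polled_folder(folder)
original_file_run = Run(trigger_payload=folder[0])

# Time passes: the first file is gone and a different one has arrived.
folder = ["order-2"]

resubmitted_timer_result = process_polled_folder(folder)
resubmitted_file_result = process_file_trigger(original_file_run)

print(original_timer_result, resubmitted_timer_result)
print(resubmitted_file_result)
```

In this model the resubmitted timer run ends up processing a different file than the original run, while the resubmitted file-trigger run processes the original payload again.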
We can further improve this Logic App. In case the last delete action fails, we can still resubmit the message. However, this will result in the message being written twice to the output folder, which is not desired. To avoid this, let's split this Logic App in two...
The first Logic App picks up the file, hands it over to the second Logic App and deletes it from the input folder. The second Logic App takes care of the message processing: flat file parsing, transformation and writing the file to the output. Note that the Request / Response actions are placed at the very beginning of the second Logic App, which means that the processing logic is called in an asynchronous fashion (fire and forget) from the perspective of the consuming Logic App.
With such a design, the message can already be deleted from the input folder, even if the creation of the output file fails. Via the resubmit of the second Logic App, you are still able to recover from the failure. Remember: design to benefit from resubmit!
Think about data retention!
If your error handling strategy is built on top of the resubmit function, you need to consider how long your Logic App run history remains available, because a run can no longer be resubmitted once its history is gone. According to the documentation, the Logic App storage retention is 90 days. That seems more than sufficient for most integration scenarios!
HTTP Request / Response
Consider the following Logic App. It starts with a Request / Response pair, followed by additional processing logic that is simulated by a Delay action.
What happens if we resubmit this? As there is no real client application, the Response action is skipped with the following message: "The execution of template action 'Response' is skipped: the client application is not waiting for a response from service." Cool! The engine does not fail on this and nicely skips the unnecessary step. However, as a consequence, the processing logic that follows is also skipped, which is not our intention.
This issue can be tackled by diving into the code view. Navigate to the Delay action and add the Skipped status in its runAfter section. This means that the Delay action will be executed whenever the preceding Response action succeeded (normal behavior) or was skipped (resubmit behavior).
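As a rough illustration, the sketch below writes the relevant part of the workflow definition as a Python dict and adds a deliberately simplified model of how `runAfter` is evaluated. The action definitions are trimmed, the 30-second interval is an assumed value, and `should_run` is a hypothetical helper, not part of any Logic Apps SDK.

```python
from typing import Dict

# Trimmed fragment of a workflow definition, written as a Python dict.
actions: Dict[str, dict] = {
    "Response": {
        "type": "Response",
        "inputs": {"statusCode": 200},
        "runAfter": {},
    },
    "Delay": {
        "type": "Wait",
        "inputs": {"interval": {"count": 30, "unit": "Second"}},
        # The default generated by the designer is {"Response": ["Succeeded"]}.
        "runAfter": {"Response": ["Succeeded", "Skipped"]},
    },
}


def should_run(action: dict, statuses: Dict[str, str]) -> bool:
    """Simplified model: every predecessor must end in an allowed status."""
    return all(
        statuses.get(predecessor) in allowed
        for predecessor, allowed in action["runAfter"].items()
    )


normal_run = {"Response": "Succeeded"}
resubmitted_run = {"Response": "Skipped"}
default_delay = dict(actions["Delay"], runAfter={"Response": ["Succeeded"]})

print(should_run(actions["Delay"], normal_run))       # patched action, normal run
print(should_run(actions["Delay"], resubmitted_run))  # patched action, resubmit
print(should_run(default_delay, resubmitted_run))     # default action, resubmit
```

Only the last case, the default `runAfter` combined with a skipped Response, leaves the Delay action out, which matches the behavior described above.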
Service Bus PeekLock - Complete
Consider the following Logic App. It starts with a Service Bus PeekLock - Complete combination, a best practice to avoid message loss, followed by additional processing logic that is simulated by a Delay action.
What happens if we resubmit this? As the message was already completed in the queue by the original run, we get an exception: "Failed to complete the message with the lock token 'baf877b2-d46f-4fae-8267-02903d9a9642'. The lock on the message has been lost". This exception causes the Logic App to fail completely, so the resubmit is not valuable.
As a workaround, you can update the Delay action within the code view and add the Failed status in its runAfter section, next to the existing Succeeded status. This is not the prettiest solution, but I couldn't find a better alternative. Please share your ideas below if you have identified a better approach.
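The change itself is small. The sketch below applies it to a trimmed definition held in a Python dict. The action name `Complete_message` and the helper `allow_status` are hypothetical, and the Service Bus action is reduced to the fields needed here.

```python
import copy
import json
from typing import Dict


def allow_status(actions: Dict[str, dict], action_name: str,
                 predecessor: str, status: str) -> Dict[str, dict]:
    """Return a copy in which `action_name` also runs when `predecessor` ends in `status`."""
    patched = copy.deepcopy(actions)
    allowed = patched[action_name]["runAfter"].setdefault(predecessor, ["Succeeded"])
    if status not in allowed:
        allowed.append(status)
    return patched


# Trimmed fragment of the workflow definition (action names are assumptions).
actions = {
    "Complete_message": {"type": "ApiConnection", "runAfter": {}},
    "Delay": {
        "type": "Wait",
        "inputs": {"interval": {"count": 30, "unit": "Second"}},
        "runAfter": {"Complete_message": ["Succeeded"]},
    },
}

patched = allow_status(actions, "Delay", "Complete_message", "Failed")
print(json.dumps(patched["Delay"]["runAfter"], indent=2))
```

Keep in mind that with this setting the processing logic also runs when the Complete action fails for any other reason, which is part of why the workaround is not pretty.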
Are singleton Logic Apps respected?
Logic Apps provide the ability to have singleton workflows. This can be done by adding "operationOptions" : "SingleInstance" to the polling trigger. Triggers are then skipped as long as a Logic App instance is still active. Read more on creating a Logic Apps Singleton instance.
I was curious to see whether resubmit respects the singleton instance of a Logic App. Therefore, I created a Logic App with a 30-second delay. While a Logic App instance was active, I triggered a resubmit. This resulted in two active instances at the same time.
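For reference, a polling trigger with this option could look roughly like the fragment below, written as a Python dict so it can be printed as JSON. The trigger name, the one-minute recurrence and the `is_singleton` helper are assumptions made for illustration.

```python
import json

# Trimmed "triggers" section of a workflow definition (values are assumptions).
workflow_triggers = {
    "Recurrence": {
        "type": "Recurrence",
        "recurrence": {"frequency": "Minute", "interval": 1},
        "operationOptions": "SingleInstance",
    }
}


def is_singleton(trigger: dict) -> bool:
    # Hypothetical check: does the trigger carry the SingleInstance option?
    return trigger.get("operationOptions") == "SingleInstance"


print(json.dumps(workflow_triggers, indent=2))
print(is_singleton(workflow_triggers["Recurrence"]))
```

The option lives on the trigger, so it only governs runs that the trigger itself starts.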
It's important to be aware of the fact that a resubmit might violate your singleton logic!
Against what version is the Logic App resubmitted?
What happens in case you have a failed instance of Logic App v1 and you afterwards apply changes to the workflow definition, which results in v2? Will the resubmit of the failed v1 instance fire a v1 or a v2 instance?
The answer is v2. So remember that resubmitting is always performed against the latest deployed version of your Logic App. Increasing the "contentVersion" explicitly for every modification does not alter this behavior.
Feedback to the product team
The resubmit feature is very powerful from a runtime perspective, but it can be improved from an operational point of view. It would be nice to have visibility on the resubmit, so that you know as an operator that specific failed workflows can be ignored, because they were resubmitted already. This could be achieved by adding an additional workflow status: Resubmitted. An alternative is to allow an operator to query for failed Logic App runs that have not been resubmitted yet. Do you like this suggestion? Please vote for it!
At the moment, Logic Apps does not support a resume function. It would be nice if you could explicitly add a persistence point to the Logic App, so that in case of a failure you can resume from the last point of persistence. Let the product team know if you're interested in this feature. You can work around this limitation by splitting your Logic App into multiple smaller workflows. This more or less introduces a resume function, because you can resubmit each small workflow separately. As a drawback, the monitoring experience becomes more painful.
Nice to see that the "singleton" feature has been introduced! Why not take it one level further and make the number of allowed concurrent instances configurable? In this way, we can easily "throttle" the Logic Apps by configuration in case the backend system cannot handle a high load. Vote here if you like this suggestion!
Hope this write-up gave you some new insights into Logic Apps resubmit capabilities!