In today’s complex cloud environments, tracking the health of your applications and their dependencies is more than just monitoring raw metrics—it’s about understanding true business impact. If your payment system stalls or your APIs slow down, customers feel it instantly.
Traditional monitoring with dashboards, metrics, and alerts gives us critical data, but not always context. What we really need is a way to understand whether a workload as a whole is healthy, available, and resilient against failures.
Enter Azure Monitor Health Models (preview), a powerful feature that enhances monitoring with meaningful health signals and visualization.
TL;DR
Azure Monitor Health Models (preview) let you model an entire workload as entities and signals, roll individual health states up into a single answer to “Is my app healthy?”, and alert on business impact instead of on dozens of raw metric thresholds.

What are Azure Monitor Health Models?
Azure Monitor Health Models allow you to define and track the health of your workloads through an intuitive modeling experience. They extend traditional monitoring by adding business context, establishing a health baseline that reflects what matters most to your business.
Instead of looking at each resource in isolation, health models show how components interact and roll up into the overall health of the system.
Azure Monitor Health Models are built on top of the following key concepts:
- Signals: Metrics or log queries that collectively determine whether a component is Healthy, Degraded, or Unhealthy. You can use built-in recommended signals or define custom ones.
- Entities: Components of your workload (VMs, Event Hubs, APIs, Functions, etc.) modeled visually in a graph.
- Propagation: If one component fails (e.g., an Event Hub under pressure), the failure cascades upstream, showing its effect on dependent components and rolling up into an aggregated health state for your entire workload (a toy sketch of this roll-up follows the list).
- Health State: Instead of dozens of noisy alerts, you get a single contextual answer: “Is my app healthy?”
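To make propagation a little more concrete, here is a deliberately simplified toy sketch of a worst-of-children roll-up in Python. It is not how Azure Monitor computes health internally, and the real model lets you tune how a child’s state impacts its parent; the entity names below are invented.

```python
from dataclasses import dataclass, field

# Health states ranked from best to worst so they can be compared.
RANK = {"Healthy": 0, "Degraded": 1, "Unhealthy": 2}

@dataclass
class Entity:
    name: str
    state: str = "Healthy"              # state produced by this entity's own signals
    children: list = field(default_factory=list)

    def rolled_up_state(self) -> str:
        # Worst-of roll-up: an entity is only as healthy as its own signals
        # and the rolled-up state of everything it depends on.
        states = [self.state] + [child.rolled_up_state() for child in self.children]
        return max(states, key=RANK.get)

# "Live data feed" depends on API Management, which depends on an Event Hub.
event_hub = Entity("eventhub-ingest", state="Unhealthy")
apim = Entity("apim-shared", children=[event_hub])
root = Entity("live-data-feed", children=[apim])

print(root.rolled_up_state())  # -> "Unhealthy": the failure cascades up to the root
```

In the actual service you model this graph visually in the Designer, as we will see later in this post.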
Why Do Health Models Matter?
Health models matter because they reduce alert fatigue by surfacing meaningful, business-impacting issues and accelerate troubleshooting by mapping failures to customer-facing features.
They also enable automation by plugging into the Azure Monitor alerting system, allowing health-driven actions. Even more importantly, they align monitoring with business context by shifting the focus from raw technical metrics to the health of the critical user journeys that truly reflect the end-customer experience.
Use case
Moving from monitoring to resilience requires several steps, and it all starts by defining the critical flows—the key user journeys such as login, account setup, or checkout—that your business depends on.
By identifying these flows upfront, you can create a clear view of which parts of the application are most important to protect, ensuring that resilience efforts focus where failures would have the greatest impact on customers and business outcomes.
ℹ️ In our fictional case, the organization identified the “live data feed” as business-critical and described the health of this function with these words: “The user can access the live data feed in real time, without interruptions or significant delays, and the information displayed is accurate and up to date.”
Building and Configuring Your Health Model
The next step is to take the definition of a healthy flow and break it down into the technical components and dependencies that support it—such as databases, APIs, messaging services, or third-party systems—so you can clearly see how these elements interact, and then capture them in a health model that reflects how their state directly impacts the users.
Start with a Service Group
Health models are built upon service groups, which group resources that work together in your workload, even across resource groups and subscriptions. In our example we create a service group called “MapBookApp_DataIngestion” for the data ingestion flow.

Now that we have a brand-new service group, we can add all the resources that contribute to the health of the “live data feed” critical flow. This functionality relies on components spread across two resource groups (frontend and data ingestion), along with a shared API Management instance from a centralized hub.
By clicking Add members, you can search for the relevant components and filter by properties such as subscription, resource type, and more. As shown in the picture below, some resources—such as Application Insights and the Storage Account—are required for the MapBookApp to function, but since they are not critical to the “live data feed” flow we want to monitor, we won’t include them in the service group.
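Before adding members in the portal, it can help to take a quick inventory of which resources actually sit behind the flow. The optional sketch below uses the azure-identity and azure-mgmt-resource packages; the subscription ID, resource group names, and resource types are illustrative placeholders, and the service group membership itself is still configured in the portal as described above.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

subscription_id = "<subscription-id>"  # placeholder
client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)

# Resource types we consider relevant to the "live data feed" flow (illustrative).
relevant_types = {
    "Microsoft.EventHub/namespaces",
    "Microsoft.StreamAnalytics/streamingjobs",
    "Microsoft.ApiManagement/service",
    "Microsoft.Web/sites",
}

# Walk the candidate resource groups and list what could become a service group member.
for rg in ("rg-frontend", "rg-dataingestion"):
    for resource in client.resources.list_by_resource_group(rg):
        if resource.type in relevant_types:
            print(f"{rg:20} {resource.type:45} {resource.name}")
```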

Create a Health Model
Now, let’s create an Azure Health Model resource named hm-mapbookapp-dataingestion and:
- a) configure it to use a system-assigned identity. Note that the identity used by the health model needs at least Reader permission to fetch the resources contained in the service group and all related Azure Monitor resources, such as the Log Analytics workspace and the Azure Monitor workspace.
- b) enable auto discovery and associate the “MapBookApp_DataIngestion” service group with the health model. Azure automatically picks up the resources included in the service group as health model entities, and the model stays up to date as resources are added or removed from the service group.
- c) by clicking Add recommended signals, the health model applies Microsoft’s suggested signals to determine the health of the service. This is especially useful when using a service for the first time, as it helps you quickly understand which metrics matter. However, in real-world scenarios every application is unique, so it’s best to define your own signals and thresholds to reflect the specific requirements of your workload (the sketch after this list shows one way to look at recent metric values before settling on a threshold).
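On point (c), before overriding a recommended threshold it is worth looking at how the metric has actually behaved recently. The sketch below uses the azure-monitor-query package to pull the APIM Capacity metric over the last day; the resource ID is a placeholder and metric availability depends on your API Management tier, but the same approach works for any platform metric you plan to turn into a signal.

```python
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient, MetricAggregationType

client = MetricsQueryClient(DefaultAzureCredential())

# Full resource ID of the shared API Management instance (placeholder).
apim_id = (
    "/subscriptions/<subscription-id>/resourceGroups/rg-hub"
    "/providers/Microsoft.ApiManagement/service/apim-shared"
)

response = client.query_resource(
    apim_id,
    metric_names=["Capacity"],
    timespan=timedelta(days=1),
    granularity=timedelta(minutes=15),
    aggregations=[MetricAggregationType.AVERAGE, MetricAggregationType.MAXIMUM],
)

# Print the 15-minute average/maximum so we can judge a sensible threshold.
for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(point.timestamp, point.average, point.maximum)
```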

Configure your Health Model
Once the model is created, go to the Discovery menu to verify that the health model can access the resources contained in the service group and add the recommended signals. Grant at least the Reader permission on the resources and resource groups involved.
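As a sketch of that permission step, here is one way to grant the built-in Reader role to the health model’s system-assigned identity with the azure-mgmt-authorization package. The scope and principal ID are placeholders, and model fields can differ slightly between SDK versions; assigning the role through the portal or the Azure CLI works just as well.

```python
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

subscription_id = "<subscription-id>"  # placeholder
client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)

# Scope on which the health model needs read access, e.g. the workload resource group.
scope = f"/subscriptions/{subscription_id}/resourceGroups/rg-dataingestion"

# Well-known role definition ID of the built-in Reader role.
reader_role_id = (
    f"/subscriptions/{subscription_id}/providers/Microsoft.Authorization/"
    "roleDefinitions/acdd72a7-3385-48ef-bd42-f606fba81ae7"
)

client.role_assignments.create(
    scope=scope,
    role_assignment_name=str(uuid.uuid4()),
    parameters=RoleAssignmentCreateParameters(
        role_definition_id=reader_role_id,
        principal_id="<health-model-principal-id>",  # object ID of the system-assigned identity
        principal_type="ServicePrincipal",           # managed identities are service principals
    ),
)
```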

Go to the Entity menu to see the results of the discovery along with the recommended signals that are automatically created. In our use case, alongside the list of monitored components, you can observe the following:
- a) The type of signals associated with each resource. For example, here you can see that only metric signals are contributing to the health of the resource.
- b) The overall health state of the resource, derived from its signals. Some resources may appear as Unknown because, at that point in time, the metric data was not yet populated.
- c) The automatically discovered metrics provide a good starting point, but they don’t always align perfectly with the workload. You can edit them to fine-tune values such as thresholds, aggregation windows, and other parameters, or remove the irrelevant ones and replace them with signals that better reflect your specific scenario.
- d) Out-of-the-box metrics aren’t always sufficient to represent the true health of a component. When needed, we can define custom Log Analytics queries to capture more meaningful signals.

In the MapBookApp use case, we lowered the APIM capacity thresholds and added a Log Analytics query that calculates the failure percentage based on logic tailored to this use case.
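The exact query from the screenshot is not reproduced here, but the sketch below shows the kind of failure-percentage query we mean, tested locally with the azure-monitor-query package before wiring it into a signal. It assumes the APIM gateway sends diagnostic logs to the workspace as ApiManagementGatewayLogs; adjust the table and column names to whatever your diagnostic settings actually produce.

```python
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

# Failure percentage in 5-minute bins, based on APIM gateway diagnostic logs.
QUERY = """
ApiManagementGatewayLogs
| where TimeGenerated > ago(1h)
| summarize total = count(), failed = countif(ResponseCode >= 500)
    by bin(TimeGenerated, 5m)
| extend failurePercentage = round(100.0 * failed / total, 2)
| project TimeGenerated, failurePercentage
"""

response = client.query_workspace(
    workspace_id="<log-analytics-workspace-id>",  # placeholder
    query=QUERY,
    timespan=timedelta(hours=1),
)

# Assumes the query succeeds; check response.status before relying on the rows.
for table in response.tables:
    for row in table.rows:
        print(row)
```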

Configure the dependency hierarchy
Let’s now move to the Designer, a very intuitive tool to visually edit entities and create a map that defines how health signals propagate up to the root entity and how they contribute to the overall health of your flow.
When you open the Designer for the first time, all components are directly linked to the root element in a flat, one-level hierarchy. From there, you can reorganize and refine the structure to better reflect how your application actually works.
In our case, we used the visual editor to:
- a) Implement a three-level hierarchy, grouping related components under intermediate nodes rather than connecting everything directly to the root.
- b) Adjust how severity propagates, so that the impact of an unhealthy Stream Analytics job is reflected differently by its parent node, giving a more accurate representation of its importance to the overall flow.
- c) Add alerts to the root node, configured to trigger whenever the overall flow moves into a Degraded or Unhealthy state.

Visualize health intuitively
Azure Health Model provides two main ways to see health: a Graph view, which offers a real-time visual of workloads and dependencies with color-coded states, and a Timeline view, which shows historical health for trend analysis and root-cause exploration.
We waited a few minutes, grabbed a small coffee, and let the data flow through our application, giving the health model time to collect signals and reflect the state of the system.
The Graph view clearly showed how the signals bubbled up to indicate a healthy flow 💪. But a little later, the overall health turned red, signaling that something was wrong as one of the underlying components had entered an unhealthy state.

Clicking on any entity in the graph reveals its current health state along with everything you need to analyze the model—signals and their values, health history over time, active alerts, and more.
In this example, however, I chose to switch to the Timeline view to analyze how the anomaly evolved over time. In the screenshot below, we can observe the following:
- a) The unhealthy state originates from one or more API Management signals.
- b) All API Management dependencies remain green.
- c) For about 10 minutes, the failure percentage was around 80%, after which it gradually decreased until the component returned to a healthy state.
- d) Everything is now back to normal, with the failure percentage stabilized at 4.95%.

Alert on Health, Not Just Metrics
Now that our critical flow is mapped to a Health Model—and we’ve defined what healthy means for us and how different signals roll up into an overall state—we can generate alerts based on workload-wide health instead of triggering a multitude of alerts on raw metric thresholds.
For the sake of simplicity, we created a single alert at the root level to capture both the Degraded and Unhealthy states. However, you can also configure multiple alerts at different levels of your hierarchy — for example, to catch degraded states that bubble up with lower impact.

Finally, the image below shows the alert generated by the transient unhealthy state we experienced because of the high number of failed requests. This alert is ready to be picked up and dispatched to the notification channel of your choice.
As a best practice, I recommend not attaching an action group directly to an alert rule; instead, use Azure Monitor alert processing rules to subscribe to the alert and handle its dispatch.
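For completeness, below is a rough sketch of that pattern with the azure-mgmt-alertsmanagement package: a processing rule that adds an existing action group to every alert fired within a given scope. Treat it as an assumption-heavy starting point rather than a recipe; operation and property names vary between SDK versions (older releases expose action_rules instead of alert_processing_rules), and every resource name and ID below is made up.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.alertsmanagement import AlertsManagementClient

subscription_id = "<subscription-id>"  # placeholder
client = AlertsManagementClient(DefaultAzureCredential(), subscription_id)

client.alert_processing_rules.create_or_update(
    resource_group_name="rg-monitoring",
    alert_processing_rule_name="apr-hm-mapbookapp-notify",
    alert_processing_rule={
        "location": "Global",  # alert processing rules are global resources
        "properties": {
            # Scope: alerts fired by resources in this resource group are picked up.
            "scopes": [
                f"/subscriptions/{subscription_id}/resourceGroups/rg-dataingestion"
            ],
            # Action: attach the on-call action group, decoupled from the alert rule.
            "actions": [
                {
                    "actionType": "AddActionGroups",
                    "actionGroupIds": [
                        f"/subscriptions/{subscription_id}/resourceGroups/rg-monitoring"
                        "/providers/microsoft.insights/actionGroups/ag-oncall"
                    ],
                }
            ],
            "enabled": True,
        },
    },
)
```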

🏁Result: All the individual metrics contributed to a single health state (Healthy, Degraded, or Unhealthy). This approach is not only clean but also drastically reduces alert noise, while keeping the focus on business-level issues within the familiar Azure Monitor experience.
🤯 Note that you can even create health models that group other health models—giving you a powerful and flexible way to manage the health and availability of complex cloud applications. Want to know “Is our application available to users?” The answer is right there, at your fingertips. 🤯

Conclusion
Azure Monitor Health Models is a great new feature that bridges the gap between raw telemetry and business context, giving you a clear, aggregated view of your application’s health and reflecting what really matters: your customers’ experience.
Personally, I’m really impressed by how intuitive, effective, and easy to use Health Models are—it’s a feature that adds real value from day one.
What's next?
Exploring how far Health Models can go beyond the portal—and how they can be rolled out at enterprise scale—is on my radar. Stay tuned!
Are you also mind-blown by this new service? I’d love to hear how you’re planning to use Health Models in your workloads.