Microsoft recently announced a new set of preview features, including new VM metrics for Azure Monitor. One important change improves the VM availability metric by adding context to the information it delivers. Knowing whether an issue was caused by Azure or by your application will help you prioritize debugging and support calls. The update adds labels to the metric, such as platform, customer, and unknown. You can filter your dashboard views on these labels for a quick view of where to focus a root-cause analysis, especially if the problem caused an outage.
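To make the filtering idea concrete, here is a minimal sketch in Python of grouping availability data points by their context label. It does not call Azure Monitor; the AvailabilitySample record, its field names, and the sample values are hypothetical stand-ins for whatever your own metric export contains, and only the three label names come from the announcement.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilitySample:
    """Hypothetical record for one VM availability data point."""
    vm_name: str
    available: bool
    context: str  # assumed label values: "platform", "customer", "unknown"


def outages_by_context(samples):
    """Count unavailable data points per context label."""
    return dict(Counter(s.context for s in samples if not s.available))


def vms_with_platform_outages(samples):
    """Names of VMs with at least one outage attributed to the platform."""
    return sorted({s.vm_name for s in samples
                   if not s.available and s.context == "platform"})


# Illustrative data only; not real telemetry.
SAMPLES = [
    AvailabilitySample("web-01", False, "platform"),
    AvailabilitySample("web-01", False, "platform"),
    AvailabilitySample("web-02", False, "customer"),
    AvailabilitySample("db-01", False, "unknown"),
    AvailabilitySample("db-01", True, "unknown"),
    AvailabilitySample("api-01", False, "platform"),
]

if __name__ == "__main__":
    print(outages_by_context(SAMPLES))
    print(vms_with_platform_outages(SAMPLES))
```

In this sketch, platform-labeled outages are the candidates for a support call, while customer-labeled ones point back at your own code or configuration.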
Another new option allows you to combine Project Flash’s Event Grid support with Azure Monitor. Events can now be sent through Event Grid to Azure Monitor, bringing distributed systems events into the same metric framework used to watch core systems. You no longer need to write event-handling code for Project Flash events; instead you can rely on Azure Monitor’s existing alert features, including its support for SMS and push alerts. This approach feeds in events from multiple sources and adds a real-time alert option.
Microsoft also wants Project Flash to give you information about the underlying Azure infrastructure and platform. For example, one planned feature will give details of issues with rack-level networking hardware, as well as predictive failure alerts so you can move operations to another region in advance of planned or unplanned data center maintenance. The intent is for a mix of Project Flash events and scheduled event notifications to give you enough warning to alert users and migrate workloads, and to show when services have recovered and are ready for use. Scheduled events are intended to give up to 15 minutes’ warning, which should be enough time to stand up a backup instance of an application and begin rerouting traffic.
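As a rough illustration of working with that warning window, the following Python sketch takes an already-fetched scheduled events document and reports how much time remains before each disruptive event. The field names (Events, EventId, EventType, NotBefore) and the date format reflect my understanding of the Azure metadata service's scheduled events payload and should be checked against the current documentation. The choice of which event types count as disruptive, the function name, and the sample document are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: event types worth failing over for. Adjust to your workload.
DISRUPTIVE_TYPES = {"Reboot", "Redeploy", "Preempt", "Terminate"}


def pending_disruptions(document, now):
    """Return (event_id, event_type, time_remaining) tuples, soonest first.

    `document` is the parsed JSON payload; `now` is a timezone-aware datetime.
    An empty NotBefore is treated as "no warning time left".
    """
    results = []
    for event in document.get("Events", []):
        if event.get("EventType") not in DISRUPTIVE_TYPES:
            continue
        not_before = event.get("NotBefore")
        if not_before:
            # Dates are assumed to be RFC 1123 style, e.g. "... 18:29:47 GMT".
            remaining = max(parsedate_to_datetime(not_before) - now, timedelta(0))
        else:
            remaining = timedelta(0)
        results.append((event["EventId"], event["EventType"], remaining))
    return sorted(results, key=lambda item: item[2])


# Illustrative payload only; IDs and times are made up.
SAMPLE_DOCUMENT = {
    "DocumentIncarnation": 1,
    "Events": [
        {"EventId": "evt-1", "EventType": "Reboot",
         "NotBefore": "Mon, 19 Sep 2016 18:29:47 GMT"},
        {"EventId": "evt-2", "EventType": "Freeze",
         "NotBefore": "Mon, 19 Sep 2016 18:25:00 GMT"},
        {"EventId": "evt-3", "EventType": "Terminate", "NotBefore": ""},
    ],
}

if __name__ == "__main__":
    now = datetime(2016, 9, 19, 18, 19, 47, tzinfo=timezone.utc)
    for event_id, event_type, remaining in pending_disruptions(SAMPLE_DOCUMENT, now):
        print(event_id, event_type, remaining)
```

A real service would poll for this document on a short interval and start its failover as soon as a disruptive event appears, rather than waiting for the remaining time to run down.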