Investigation views

Azure Application Insights

Enabled Azure customers to solve application issues 4X faster, leading to 6X usage growth over 24 months.

Designed insights-tailored, curated APM experiences that streamlined detect/triage/diagnose journeys and reduced reliance on complex log analytics.

Worked with PMs, developers, and user researchers to create an innovative, cross-referenced UX that became an essential tool for application triage workflows in Azure.

Client

Microsoft Corporation, Azure, Application Insights

Role

Senior User Experience Designer

Location

On-site, remote

Scope

To help application developers, DevOps professionals, and Site Reliability Engineers quickly triage issues with their applications.

Challenge

The investigation views in Azure’s Application Insights suite were created to assist app developers with application performance monitoring (APM) by enabling fast diagnosis and problem-solving for their live apps.

The legacy version of these products was not resonating with customers: the UX was unintuitive, crucial information was hidden below the fold, data was missing, and there was no easy way to surface the most critical issues.

Action

Working closely with a Senior PM and subject matter expert, I explored an approach that combined three key application monitoring categories: Availability, Performance, and Failure signals. I proposed a column dedicated to highlighting critical issues. User research showed that customers liked the basic concept but wanted richer, more specific information from each category.

That feedback led to a new round of wireframe concepts focused on a flexible template, customizable for each category. I joined another round of customer calls to validate which features resonated and to identify blind spots.

The Performance design iterations included a method to pin Operations from the lower table to the main chart above, to enable easy comparison. We tried a “Details at a glance” section instead of the insights cards, but after user feedback we concluded that cards with critical issues were preferable.

Similarly, with Failures, the PM and I experimented with surfacing failed requests and exceptions in the right column, but in the end I recommended staying consistent with the card layout from Performance insights. I also pivoted away from a horizontal tab system that tested poorly. I reduced the top charts from three to two, and later introduced a configurable second chart, to preserve space for the bottom table.

The final design for Performance included a more detailed table for Operations data, an Operations health chart with a flexible secondary chart, a histogram showing the slowest operation events, and a section devoted to “insights”, including cards that spotlight the top three critical issues for topics relevant to the selected operation.

The UX is also cross-referential: interacting with one quadrant of the screen automatically updates and refines the data shown in the other quadrants, narrowing the view for greater precision.
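As a purely illustrative sketch (not the product’s actual implementation), the hypothetical SharedFilter model below shows one way a cross-referential layout like this can behave: each quadrant both publishes and subscribes to a shared filter, so a selection in one panel narrows the data shown in the others.

```typescript
// Hypothetical sketch of cross-referential filtering between UX quadrants.
// Names (SharedFilter, Filter, refine) are illustrative, not from the product.

type Filter = { operation?: string; timeRange?: string };
type Listener = (filter: Filter) => void;

class SharedFilter {
  private filter: Filter = {};
  private listeners: Listener[] = [];

  // Each quadrant registers to be notified when the shared filter changes.
  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  // A selection in any quadrant refines the shared filter,
  // which in turn narrows the data shown in every other quadrant.
  refine(partial: Filter): void {
    this.filter = { ...this.filter, ...partial };
    this.listeners.forEach((l) => l(this.filter));
  }
}

// Example: selecting an operation in the lower table rescopes the chart and cards.
const shared = new SharedFilter();
shared.subscribe((f) => console.log("Chart now scoped to:", f));
shared.subscribe((f) => console.log("Insight cards now scoped to:", f));
shared.refine({ operation: "GET /checkout" });
```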

The final Failures design uses the same basic template but with a twist: unlike Performance, it features three fixed cards in the “Insights” section, since that data is always the same, elegantly using the space that a histogram would otherwise have occupied.

The final Availability design also uses this template, swapping Availability Tests for Operations. It includes an accordion control that lets users see all the regions where the selected test is deployed, and the ability to contextually edit tests as needed.

Result