Intelligent Incident Management and AI Operational Support for Modern Systems

COMPANY

Devz

ROLES

Product Design, UX / UI Design

IMPACTS

30+ Customers

BACKGROUND

As modern systems become increasingly complex, engineering teams face growing challenges in managing and resolving operational incidents quickly and efficiently.

Traditional incident management workflows often rely on manual triaging, fragmented communication, and multiple monitoring tools. As a result, teams spend significant time identifying root causes, coordinating responses, and documenting incidents, which can slow recovery and increase operational costs.

Sentry was designed to streamline incident management through an AI-powered AIOps platform. With an intelligent AI agent acting as a custom incident engineer, the platform can automatically triage alerts, assist with remediation actions, and support root cause analysis, enabling teams to respond faster and improve overall operational efficiency.

PROCESS OVERVIEW
PROBLEM

As infrastructure systems become more complex, engineering teams face pressure to detect and resolve incidents quickly while maintaining service reliability.

However, many existing incident management tools still fall short in several key areas:

Manual Incident Triage

Engineers must manually review alerts, which is time-consuming and risks delayed responses.

Fragmented Workflows

Switching between monitoring, messaging, and documentation tools slows coordination and reduces clarity.

Remediation Gaps

Tools detect incidents but rarely guide corrective actions, prolonging resolution and increasing workload.

Lack of Context

Teams struggle to gather necessary insights for root cause analysis, making troubleshooting inefficient.

Reactive Management

Solutions focus on responding to alerts rather than anticipating issues, limiting proactive prevention.

Excessive Alerts

Teams are overwhelmed by high volumes of notifications, making it hard to focus on critical issues.

RESEARCH

Goals

Enhance incident response with a user-centered design that leverages Sentry’s AI engineer, Devi, to streamline triage, remediation, and root cause analysis.

The design focuses on creating a clear, intuitive interface that empowers teams to respond to alerts efficiently while minimizing manual effort. With Devi integrated, the platform guides users through incident resolution, provides actionable insights, and centralizes contextual information for root cause analysis. The experience aims to strengthen system reliability, reduce operational overhead, and deliver measurable improvements in overall incident management efficiency.

Competitive Analysis

We conducted a competitive analysis comparing leading incident management and AIOps platforms with our own product, Sentry. The analysis reviewed key capabilities such as alerting, automation, remediation support, and root cause analysis to better understand how different platforms support incident response workflows. These insights helped identify gaps, strengths, and opportunities that informed design decisions and improvements for Sentry’s AI-driven incident management.

PERSONA
IDEATION

Journey Map

The Sentry user journey illustrates the end-to-end workflow of incident management, from receiving alerts to preventing future issues. It highlights common pain points, user actions, and how Sentry’s AI-driven features streamline triage, remediation, root cause analysis, and reporting—turning a traditionally reactive process into a more proactive, efficient experience.

Site Map

The Sentry site map illustrates the platform’s hierarchical structure, showing how users navigate between core sections such as account management, support plans, incidents, the NOC hub, AI assistant, and calendar. It highlights the organization of dashboards, analytics, team configurations, and AI-driven features, ensuring a clear and intuitive flow for monitoring, responding to, and preventing incidents efficiently.

Lo-Fi Wireframes

After defining the main user task and flow, we created a first set of lo-fi wireframes and ran preliminary tests with actual users to gather initial feedback.

DESIGN & PROTOTYPE

Design Iterations

During the interaction design phase, we encountered several challenges. To address them, we conducted multiple moderated user testing sessions and went through several rounds of iteration based on the feedback gathered. Insights from focus groups and stakeholder meetings helped us better understand user needs and refine our designs accordingly. Below are some of the key design improvements we made.

Could Be Improved

  • Limited operational visibility: it was difficult for users to quickly assess system health or ongoing incidents

  • Incidents were not prioritized

  • Did not show relationships between services and incidents

  • Static information layout and underutilized AI capabilities

New Design Based On Feedback

  • A node-based service map provides a clearer view of system health and relationships between components

  • The My Incidents panel now surfaces alerts directly on the homepage for faster access and response

  • Key metrics such as Incidents, AI Activity, and Prevention Insights are grouped together to provide a quick operational overview

  • Integrated AI insights and a more actionable homepage experience

Hi-Fidelity Prototypes

Homepage

A visual monitoring dashboard designed to help teams quickly understand system health and track incidents within a single operational view.

The homepage provides an interactive overview of service activity and system status through a visual network map and real-time incident panels. Users can quickly monitor ongoing incidents, review alerts, and navigate related services without leaving the dashboard. The panels can slide within the interface for quick access while maintaining visibility of the system map, and the listings can also be expanded into a full-screen view for more detailed investigation. This flexible layout improves situational awareness for operations teams, helping them identify issues faster and manage incidents more efficiently.

Support Plan, Service & Incident

Centralize Your Support, Track Issues with Confidence

The dashboard offers a variety of customizable widgets, allowing users to tailor their workspace to their specific roles and needs. This flexibility ensures that team members at all levels can access the information most relevant to them.

My Support Plan

The My Support Plan interface provides a clear overview of active support plans while offering insight into overall system health. Users can explore available services, monitor ongoing issues, and toggle between a list view highlighting key service metrics and a service map view that visualizes system relationships and dependencies.

Services

The Service interface helps users monitor the overall health and performance of their services. The Service Management tab displays key metrics, including total incidents and counts by status—Open, Acknowledged, Blocked, Escalated, and Resolved—enabling teams to quickly assess priorities and respond efficiently. Users can also switch to the Service Maps tab to visualize system relationships and dependencies, or the Link Management tab to manage connections between services, providing a comprehensive and organized view of service operations.
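As a rough illustration of the per-status counts shown in the Service Management tab, the sketch below tallies incidents by status. The status names come from the description above; the function name `status_summary` and the input shape (a plain list of status strings) are assumptions made for this example, not Sentry's actual data model.

```python
from collections import Counter

# Status names as listed for the Service Management tab.
STATUSES = ("Open", "Acknowledged", "Blocked", "Escalated", "Resolved")


def status_summary(incident_statuses):
    """Count incidents per known status and add a total.

    `incident_statuses` is a hypothetical input: one status string per incident.
    Statuses outside STATUSES are ignored and not included in the total.
    """
    counts = Counter(incident_statuses)
    summary = {status: counts.get(status, 0) for status in STATUSES}
    summary["Total"] = sum(summary.values())
    return summary


if __name__ == "__main__":
    print(status_summary(["Open", "Open", "Resolved", "Escalated"]))
```

Statuses with no incidents still appear with a count of zero, so a dashboard tile for each status always has a value to render.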

Incidents

The Incident Listing View provides a clear and organized overview of all incidents, allowing users to quickly scan and manage ongoing issues. Each incident displays key details such as status, priority, affected services, and timestamps, and users can customize which details appear based on their needs, making it easy to identify critical problems at a glance. Users can sort, filter, and search incidents to focus on the most urgent items, enabling faster response times and better tracking of system reliability across all services.

Incident Details

See the Full Picture, Resolve Incidents Faster

The Incident Details page provides a comprehensive view of each incident, bringing together all the information needed to investigate and resolve issues efficiently. Users can review and edit key incident details along with related sections.

Add Incident

Incident Details

Recommended Solutions

The Recommended Solutions section helps teams quickly identify the best course of action during an incident. Sentry automatically recommends solutions by analyzing incident details, solution metadata, and past applications to similar incidents, with a clear rationale provided for transparency. Devi, Sentry’s AI assistant, can then apply the recommended solution automatically. If the resolution fails or key information is missing, the incident is escalated to the human incident team, with the areas needing attention highlighted. By combining AI-driven automation with human validation, teams can resolve issues faster and focus on more complex work.
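The recommend, apply, and escalate flow described above could be sketched roughly as follows. This is a minimal illustration, not how Sentry or Devi actually works: the tag-overlap scoring, the 0.6/0.4 weights, the `min_score` threshold, and every name here (`Solution`, `recommend`, `resolve`, `apply_solution`) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    name: str
    tags: set                # hypothetical solution metadata
    past_successes: int = 0  # past applications that resolved a similar incident
    past_attempts: int = 0


@dataclass
class Recommendation:
    solution: Solution
    score: float
    rationale: str           # shown to the user for transparency


def jaccard(a, b):
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(incident_tags, solutions):
    """Rank solutions by tag overlap and past success rate (assumed weights)."""
    ranked = []
    for s in solutions:
        overlap = jaccard(incident_tags, s.tags)
        # With no history, fall back to a neutral 0.5 success rate.
        success = s.past_successes / s.past_attempts if s.past_attempts else 0.5
        score = 0.6 * overlap + 0.4 * success
        rationale = f"tag overlap {overlap:.2f}, past success rate {success:.2f}"
        ranked.append(Recommendation(s, score, rationale))
    ranked.sort(key=lambda r: r.score, reverse=True)
    return ranked


def resolve(incident_tags, solutions, apply_solution, min_score=0.5):
    """Apply the top recommendation, or escalate to the human incident team."""
    if not incident_tags:
        return ("escalated", "key incident information is missing")
    ranked = recommend(incident_tags, solutions)
    if not ranked or ranked[0].score < min_score:
        return ("escalated", "no sufficiently confident recommendation")
    best = ranked[0]
    if apply_solution(best.solution):
        return ("resolved", best.rationale)
    return ("escalated", f"applying '{best.solution.name}' did not resolve the incident")
```

Here `apply_solution` stands in for whatever automation performs the fix; it is passed in as a callable so the escalation logic can be exercised without touching real systems.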

Analytics

Turning Complex Metrics Into Clear Actionable Insights

The Analytics section turns complex incident data into actionable insights. Users can track volume, unresolved issues, MTTA/MTTR, alerting apps, and other key metrics through customizable graphs. This flexible view helps teams spot patterns, optimize workflows, and make faster, informed decisions.
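For readers unfamiliar with the metrics, MTTA and MTTR are commonly defined as the mean time from an incident opening to its acknowledgement and to its resolution, respectively. The sketch below computes both under that common definition; the `Incident` fields and function names are assumptions for illustration, and Sentry's own calculation may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    # Hypothetical fields; a real incident record would carry more detail.
    opened_at: datetime
    acknowledged_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None


def mean_minutes(durations):
    """Average a sequence of timedeltas in minutes, or None if it is empty."""
    durations = list(durations)
    if not durations:
        return None
    total_seconds = sum(d.total_seconds() for d in durations)
    return total_seconds / len(durations) / 60


def mtta_minutes(incidents):
    """Mean time to acknowledge, over incidents that have been acknowledged."""
    return mean_minutes(
        i.acknowledged_at - i.opened_at
        for i in incidents
        if i.acknowledged_at is not None
    )


def mttr_minutes(incidents):
    """Mean time to resolve, over incidents that have been resolved."""
    return mean_minutes(
        i.resolved_at - i.opened_at
        for i in incidents
        if i.resolved_at is not None
    )
```

Incidents that are still unacknowledged or unresolved are excluded from the respective average rather than counted as zero, which would otherwise make the metrics look better than they are.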

Solutions

Actionable Intelligence, Instantly

The solution empowers teams to monitor services, track incidents, and resolve issues faster. With real-time insights, clear metrics, and streamlined workflows, it transforms complex operational data into actionable intelligence. Past incident solutions can be reused for similar future issues, helping teams respond faster and prevent repeats.

Create Solution

Solution Details

Incident Retro

Smarter, Clearer Insights After Every Incident

The Incident Retro feature provides a structured view of past incidents, allowing teams to review timelines, analyze root causes, and evaluate the response process. Powered by AI, Sentry automatically compiles incident data, communication history, and remediation actions into a clear summary, helping teams identify patterns, improve workflows, and prevent similar issues while keeping a record for reporting and improvement.

Create Retro

Retro Details

On-Call Schedule

Full Visibility Into Current and Upcoming On-Call Coverage

The On-Call Schedules section provides a clear view of current and upcoming on-call responsibilities, along with a calendar view of the project’s schedule. Users can quickly see who is on duty, when shifts change, and plan accordingly. By centralizing this information, teams can reduce confusion, ensure coverage, and respond to incidents more efficiently.

View & Edit Schedules

Team Updates

Stay Connected & Informed, Updates in One Place

The Team Updates section brings all team communications into one place, including Announcements, Jira Updates, Knowledge Sharing, Attention Requests, and Questions. By centralizing these updates, teams can quickly catch up on what matters, share insights, and respond efficiently. This streamlined approach keeps everyone informed, improves collaboration, and ensures important information is always visible.

NOC Hub

Centralized Visibility for Smooth & Efficient Operations

The NOC Hub gives teams a centralized view of system health, combining incidents, service statuses, alerts, and updates in one place. Users can monitor multiple services, prioritize critical issues, and coordinate responses efficiently. By keeping key information accessible, the NOC Hub helps teams respond faster and maintain smooth operations.

Ongoing INC

INC Details

Devi AI Assistant

AI-Powered Engineer for Teams, Reports, and Incidents

The Devi AI Assistant acts as a smart guide for operations teams, streamlining onboarding, triaging incidents, generating reports, and answering questions in real time. By providing instant guidance and structured support, Devi helps new team members get up to speed quickly, ensures incidents are prioritized effectively, and simplifies access to key information. This AI-powered assistant reduces manual effort, accelerates response times, and keeps teams informed and efficient.

Mobile App

Incident Management, Anytime, Anywhere

The Mobile App extends Sentry’s capabilities to teams on the go, providing real-time access to incidents, service statuses, and team updates directly from a smartphone. Users can view and acknowledge incidents, track on-call schedules, and receive notifications instantly, ensuring critical issues are never missed. Designed for quick interactions and easy navigation, the app empowers teams to stay connected, respond faster, and maintain operational efficiency no matter where they are.
