Integration Guide

PagerDuty Integration - Automated Incident Response

Enterprise-grade uptime for distributed systems

Last updated: Oct 14, 2024 • v3.2.1 • Maintenance Window Aware

1. Initial Setup & Service Configuration

Map your SentinelPulse monitoring endpoints to PagerDuty services using our native webhook bridge.

Begin by generating an integration key in your PagerDuty account under Integrations > PagerDuty Inbound Integration. Paste the key into the SentinelPulse dashboard under Settings > Alerting > Third-Party Connectors. Ensure your monitoring scope targets the us-east-1 and eu-west-2 regional probes for accurate latency baselines.

Once authenticated, SentinelPulse will automatically sync your existing uptime checks. Use the configuration matrix below to define which check groups trigger immediate alerts versus scheduled digest reports.

⚡

Webhook Routing

Send POST requests to https://events.pagerduty.com/v2/enqueue with a JSON body containing the routing_key, event_action, and dedup_key fields and a payload object.

🔄

Status Sync

Bi-directional state mapping keeps SentinelPulse check statuses (UP, DEGRADED, DOWN) in step with PagerDuty event actions (trigger, acknowledge, resolve): a DOWN check triggers an incident, and a return to UP resolves it.

🛡️

Maintenance Windows

Alerts are suppressed automatically during scheduled downtime. SentinelPulse syncs with the PagerDuty maintenance API to prevent false-positive paging.
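The suppression rule can be sketched as a simple interval check. This is a minimal illustration, not SentinelPulse's implementation: the function names, the (start, end) tuple shape for a window, and the choice to treat the end time as exclusive are all assumptions made for the example.

```python
from datetime import datetime, timezone


def in_maintenance(now, windows):
    """Return True if `now` falls inside any (start, end) window.

    The window shape and the exclusive end bound are assumptions.
    """
    return any(start <= now < end for start, end in windows)


def should_page(status, now, windows):
    """Page only for DOWN checks that are outside every maintenance window."""
    return status == "DOWN" and not in_maintenance(now, windows)


# Hypothetical window: 02:00 to 04:00 UTC on Oct 14, 2024.
windows = [
    (
        datetime(2024, 10, 14, 2, 0, tzinfo=timezone.utc),
        datetime(2024, 10, 14, 4, 0, tzinfo=timezone.utc),
    )
]
during = datetime(2024, 10, 14, 3, 0, tzinfo=timezone.utc)
after = datetime(2024, 10, 14, 4, 0, tzinfo=timezone.utc)
```

With these values, a DOWN status at `during` would be suppressed, while the same status at `after` would page.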

{
  "event_action": "trigger",
  "routing_key": "a1b2c3d4e5f6g7h8i9j0",
  "dedup_key": "sentinelpulse-check-8842-api-gateway",
  "payload": {
    "summary": "API Gateway Latency > 2000ms",
    "severity": "critical",
    "source": "sentinelpulse-probe-na-east-1",
    "custom_details": {
      "check_id": "chk_99281",
      "uptime_url": "https://status.sentinelpulse.com/incidents/8842"
    }
  }
}
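A trigger event like the one above could be built and sent from a script as follows. This is a sketch using only the Python standard library; the helper names build_trigger_event and send_event are illustrative and not part of any SentinelPulse or PagerDuty SDK. The severity values are the four accepted by the PagerDuty Events API v2.

```python
import json
import urllib.request

EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, dedup_key, summary, source,
                        severity="critical", custom_details=None):
    """Return an Events API v2 trigger event as a plain dict."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    event = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": dedup_key,
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }
    if custom_details:
        event["payload"]["custom_details"] = custom_details
    return event


def send_event(event, url=EVENTS_URL, timeout=10):
    """POST the event as JSON and return the decoded response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Build only; calling send_event(event) performs the network request.
event = build_trigger_event(
    routing_key="a1b2c3d4e5f6g7h8i9j0",  # placeholder key from the sample
    dedup_key="sentinelpulse-check-8842-api-gateway",
    summary="API Gateway Latency > 2000ms",
    source="sentinelpulse-probe-na-east-1",
    custom_details={"check_id": "chk_99281"},
)
```

Reusing the same dedup_key for repeated triggers lets PagerDuty group them into one incident instead of opening a new one each time.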

2. Escalation Policies & On-Call Routing

Configure multi-tier escalation paths that adapt to incident severity and response SLAs.

SentinelPulse evaluates your monitoring thresholds and routes alerts through PagerDuty escalation policies. For critical-severity events, such as complete endpoint failure or a 5xx error rate above 15 percent, the system immediately pages the primary on-call engineer via push notification and SMS. If the incident is still unresolved after 15 minutes, the alert escalates to the secondary responder and posts an alert to the Slack incident-response channel.

SLA Target: 99.99% Uptime

Warning-severity events, such as latency degradation or a certificate expiring in fewer than 14 days, route to the engineering digest queue without immediate paging. You can customize escalation timers, repeat notifications, and auto-acknowledge rules directly from the integration dashboard. All escalation events are logged with ISO 8601 timestamps for post-incident review.
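The severity and routing rules described in this section can be sketched as two small functions. The thresholds (15 percent, 14 days, 15 minutes) come from the text above; the CheckResult fields and the target names such as "primary-on-call" are hypothetical labels chosen for the example, not SentinelPulse identifiers.

```python
from dataclasses import dataclass
from typing import Optional

ERROR_RATE_CRITICAL = 0.15   # 5xx error rate above 15 percent
CERT_WARNING_DAYS = 14       # certificate expiring in fewer than 14 days
ESCALATION_MINUTES = 15      # escalate if still unresolved after 15 minutes


@dataclass
class CheckResult:
    """Hypothetical snapshot of one monitoring check."""
    endpoint_down: bool = False
    error_rate_5xx: float = 0.0
    latency_degraded: bool = False
    cert_days_remaining: Optional[int] = None


def classify(result):
    """Map a check result to 'critical', 'warning', or 'ok'."""
    if result.endpoint_down or result.error_rate_5xx > ERROR_RATE_CRITICAL:
        return "critical"
    cert_expiring = (
        result.cert_days_remaining is not None
        and result.cert_days_remaining < CERT_WARNING_DAYS
    )
    if result.latency_degraded or cert_expiring:
        return "warning"
    return "ok"


def route(severity, minutes_unresolved=0):
    """Return the notification targets for a severity and elapsed time."""
    if severity == "critical":
        targets = ["primary-on-call"]
        if minutes_unresolved >= ESCALATION_MINUTES:
            targets += ["secondary-on-call", "slack-incident-response"]
        return targets
    if severity == "warning":
        return ["engineering-digest"]
    return []
```

For example, `route(classify(CheckResult(error_rate_5xx=0.2)), minutes_unresolved=20)` would include all three critical targets, while a check with only `cert_days_remaining=10` would go to the digest queue.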

Avg. Alert Delivery: 14s

First-Contact Resolution: 98.7%

On-Call Coverage: 24/7

Max Escalation Tiers

3. Automated Resolution & Post-Incident Sync

Streamline recovery workflows with automatic incident closure and status page synchronization.

When SentinelPulse detects successful endpoint recovery across three consecutive probes, it automatically pushes a resolve event to PagerDuty. This event closes the active incident, updates the on-call schedule, and triggers generation of a post-incident report.
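The three-consecutive-probes rule amounts to a streak counter that resets on any failure. The class below is a minimal sketch of that idea; RecoveryTracker and its method names are hypothetical and do not describe SentinelPulse internals.

```python
class RecoveryTracker:
    """Track consecutive successful probes for one open incident."""

    def __init__(self, required_successes=3):
        self.required_successes = required_successes
        self.streak = 0
        self.incident_open = False

    def open_incident(self):
        """Mark the incident as open and start counting from zero."""
        self.incident_open = True
        self.streak = 0

    def record_probe(self, success):
        """Record one probe; return True only when recovery is confirmed."""
        if not self.incident_open:
            return False
        # Any failed probe resets the streak.
        self.streak = self.streak + 1 if success else 0
        if self.streak >= self.required_successes:
            self.incident_open = False
            self.streak = 0
            return True
        return False


tracker = RecoveryTracker()
tracker.open_incident()
probes = [True, True, False, True, True, True, True]
results = [tracker.record_probe(p) for p in probes]
```

In this run the failed third probe resets the streak, so recovery is confirmed on the sixth probe, and the seventh is ignored because the incident is already closed.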

The resolution payload includes mean time to recovery, total downtime duration, and user impact data. You can configure the status page to update automatically as an incident moves from resolving to operational, without manual intervention. All resolution data feeds directly into your quarterly reliability reviews and SLA reporting dashboards.

{
  "event_action": "resolve",
  "routing_key": "a1b2c3d4e5f6g7h8i9j0",
  "dedup_key": "sentinelpulse-check-8842-api-gateway",
  "payload": {
    "summary": "API Gateway Latency Normalized",
    "severity": "critical",
    "source": "sentinelpulse-probe-na-east-1",
    "custom_details": {
      "check_id": "chk_99281",
      "mttr_seconds": 412,
      "downtime_minutes": 6.8,
      "resolution_note": "Auto-resolved after 3 consecutive successful probes"
    }
  }
}
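A resolve event can be assembled from the outage start and recovery timestamps. The sketch below is illustrative only: build_resolve_event is a hypothetical helper, and the timestamps are made-up ISO 8601 values. PagerDuty's Events API v2 matches a resolve event to its open incident by dedup_key, so the example reuses the key sent with the original trigger.

```python
from datetime import datetime


def build_resolve_event(routing_key, dedup_key, check_id, source,
                        down_at, recovered_at):
    """Return a resolve event with downtime computed from two timestamps."""
    downtime_seconds = (recovered_at - down_at).total_seconds()
    if downtime_seconds < 0:
        raise ValueError("recovered_at must not precede down_at")
    return {
        "routing_key": routing_key,
        "event_action": "resolve",
        "dedup_key": dedup_key,  # must match the key used in the trigger
        "payload": {
            "summary": "API Gateway Latency Normalized",
            "severity": "critical",
            "source": source,
            "custom_details": {
                "check_id": check_id,
                "downtime_minutes": round(downtime_seconds / 60, 1),
            },
        },
    }


# Hypothetical ISO 8601 timestamps for the outage start and recovery.
down_at = datetime.fromisoformat("2024-10-14T09:00:00+00:00")
recovered_at = datetime.fromisoformat("2024-10-14T09:06:48+00:00")

resolve_event = build_resolve_event(
    routing_key="a1b2c3d4e5f6g7h8i9j0",  # placeholder key from the sample
    dedup_key="sentinelpulse-check-8842-api-gateway",
    check_id="chk_99281",
    source="sentinelpulse-probe-na-east-1",
    down_at=down_at,
    recovered_at=recovered_at,
)
```

Computing the duration from timestamps at send time keeps the reported downtime consistent with the logged events.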